
John Baillieul

Tariq Samad
Editors-in-Chief

Encyclopedia of Systems and Control


With 416 Figures and 21 Tables


Editors

John Baillieul
Mechanical Engineering
Electrical and Computer Engineering
Boston University
Boston, MA, USA

Tariq Samad
Honeywell Automation and Control Solutions
Golden Valley, MN, USA

ISBN 978-1-4471-5057-2
ISBN 978-1-4471-5058-9 (eBook)
ISBN 978-1-4471-5059-6 (print and electronic bundle)
DOI 10.1007/978-1-4471-5058-9

Library of Congress Control Number: 2015941487

Springer London Heidelberg New York Dordrecht


© Springer-Verlag London 2015
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole
or part of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way,
and transmission or information storage and retrieval, electronic adaptation, computer software,
or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are
exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in
this book are believed to be true and accurate at the date of publication. Neither the publisher
nor the authors or the editors give a warranty, express or implied, with respect to the material
contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper

Springer-Verlag London is part of Springer Science+Business Media (www.springer.com)


Preface

The history of Automatic Control is both ancient and modern. If we adopt
the broad view that an automatic control system is any mechanism by which
an input action and output action are dynamically coupled, then the origins
of this encyclopedia’s subject matter may be traced back more than 2,000
years to the era of primitive time-keeping and the clepsydra water clock
perfected by Ctesibius of Alexandria. In more recent history, frequently cited
examples of feedback control include the automatically refilling reservoirs of
flush toilets (perfected in the late nineteenth century) and the celebrated fly-
ball steam-flow governor described in J.C. Maxwell’s 1868 Royal Society of
London paper—“On Governors.”
Although it is useful to keep the technologies of antiquity in mind, the
history of systems and control as covered in the pages of this encyclopedia
begins in the twentieth century. The history was profoundly influenced by
work of Nyquist, Black, Bode, and others who were developing amplifier
theory in response to the need to transmit wireline signals over long distances.
This research provided major conceptual advances in feedback and stability
that proved to be of interest in the theory of servomechanisms that was being
developed at the same time. Driven by the need for fast and accurate control of
weapons systems during World War II, automatic control developed quickly
as a recognizable discipline.
While the developments of the first half of the twentieth century are an
important backdrop for the Encyclopedia of Systems and Control, most of the
topics directly treat developments from 1948 to the present. The year 1948
was auspicious for systems and control—and indeed for all the information
sciences. Norbert Wiener’s book Cybernetics was published by Wiley, the
transistor was invented (and given its name), and Shannon’s seminal paper
“A Mathematical Theory of Communication” was published in the Bell
System Technical Journal. In the years that followed, important ideas of
Shannon, Wiener, Von Neumann, Turing, and many others changed the way
people thought about the basic concepts of control systems. The theoretical
advances have propelled industrial and societal impact as well (and vice
versa). Today, advanced control is a crucial enabling technology in domains
as numerous and diverse as aerospace, automotive, and marine vehicles; the
process industries and manufacturing; electric power systems; homes and
buildings; robotics; communication networks; economics and finance; and
biology and biomedical devices.


It is this incredible broadening of the scope of the field that has motivated
the editors to assemble the entries that follow. This encyclopedia aims to help
students, researchers, and practitioners learn the basic elements of a vast array
of topics that are now considered part of systems and control. The goal is to
provide entry-level access to subject matter together with cross-references to
related topics and pointers to original research and source material.
Entries in the encyclopedia are organized alphabetically by title, and
extensive links to related entries are included to facilitate topical reading—
these links are listed in “Cross-References” sections within entries. All
cross-referenced entries are indicated by a preceding symbol. In the electronic
version of the encyclopedia, these entries are hyperlinked for ease of access.
The creation of the Encyclopedia of Systems and Control has been a major
undertaking that has unfolded over a 3-year period. We owe an enormous debt
to major intellectual leaders in the field who agreed to serve as topical section
editors. They have ensured the value of the opus by recruiting leading experts
in each of the covered topics and carefully reviewing drafts. It has been a
pleasure also to work with Oliver Jackson and Andrew Spencer of Springer,
who have been unfailingly accommodating and responsive over this time.
As we reflect back over the course of this project, we are reminded of
how it began. Gary Balas, one of the world’s experts in robust control and
aerospace applications, came to one of us after a meeting with Oliver at the
Springer booth at a conference and suggested this encyclopedia—but was
adamant that he wasn’t the right person to lead it. The two of us took the
initiative (ultimately getting Gary to agree to be the section editor for the
aerospace control entries). Gary died last year after a courageous fight with
cancer. Our sense of accomplishment is infused with sadness at the loss of a
close friend and colleague.
We hope readers find this encyclopedia a useful and valuable compendium
and we welcome your feedback.

Boston, USA John Baillieul
Minneapolis, USA Tariq Samad
May 2015
Section Editors

Linear Systems Theory (Time-Domain)


Panos J. Antsaklis Department of Electrical Engineering, University of
Notre Dame, Notre Dame, IN, USA

Aerospace Applications
Gary Balas Deceased. Formerly at Aerospace Engineering and Mechanics
Department, University of Minnesota, Minneapolis, MN, USA

Game Theory
Tamer Başar Coordinated Science Laboratory, University of Illinois,
Urbana, IL, USA

Economic and Financial Systems


Alain Bensoussan Naveen Jindal School of Management, University of
Texas at Dallas, Richardson, TX, USA

Geometric Optimal Control


Anthony Bloch Department of Mathematics, The University of Michigan,
Ann Arbor, MI, USA

Classical Optimal Control


Michael Cantoni Department of Electrical & Electronic Engineering,
The University of Melbourne, Parkville, VIC, Australia

Discrete-Event Systems
Christos G. Cassandras Division of Systems Engineering, Center for
Information and Systems Engineering, Boston University, Brookline, MA,
USA

Electric Energy Systems


Joe H. Chow Department of Electrical and Computer Systems Engineering,
Rensselaer Polytechnic Institute, Troy, NY, USA

Control of Networked Systems


Jorge Cortés Department of Mechanical and Aerospace Engineering, Uni-
versity of California, San Diego, La Jolla, CA, USA


Estimation and Filtering


Frederick E. Daum Raytheon Company, Woburn, MA, USA

Control of Process Systems


Sebastian Engell Fakultät Bio- und Chemieingenieurwesen, Technische
Universität Dortmund, Dortmund, Germany

Automotive and Road Transportation


Luigi Glielmo Facoltà di Ingegneria dell’Università del Sannio in
Benevento, Benevento, Italy

Stochastic Control
Lei Guo Academy of Mathematics and Systems Science, Chinese Academy
of Sciences (CAS), Beijing, China

Nonlinear Control
Alberto Isidori Department of Computer and System Sciences “A. Ruberti”,
University of Rome “La Sapienza”, Rome, Italy

Biosystems and Control


Mustafa Khammash Department of Biosystems Science and Engineering,
Swiss Federal Institute of Technology at Zurich (ETHZ), Basel, Switzerland

Distributed Parameter Systems


Miroslav Krstic Department of Mechanical and Aerospace Engineering,
University of California, San Diego, La Jolla, CA, USA

Hybrid Systems
Françoise Lamnabhi-Lagarrigue Laboratoire des Signaux et Systèmes –
CNRS, European Embedded Control Institute, Gif-sur-Yvette, France

Identification and Modeling


Lennart Ljung Division of Automatic Control, Department of Electrical
Engineering, Linköping University, Linköping, Sweden
Model-Predictive Control
David Mayne Department of Electrical and Electronic Engineering,
Imperial College London, London, UK

Adaptive Control
Richard Hume Middleton School of Electrical Engineering and Computer
Science, The University of Newcastle, Callaghan, NSW, Australia

Intelligent Control
Thomas Parisini Department of Electrical and Electronic Engineering,
Imperial College London, London, UK

Control of Marine Vessels


Kristin Y. Pettersen Department of Engineering Cybernetics, Norwegian
University of Science and Technology, Trondheim, Norway

Frequency-Domain Control
Michael Sebek Department of Control Engineering, Faculty of Electrical
Engineering, Czech Technical University in Prague, Prague 6, Czech
Republic

Robotics
Bruno Siciliano Dipartimento di Informatica e Sistemistica, Università
degli Studi di Napoli Federico II, Napoli, Italy

Other Applications of Advanced Control


Toshiharu Sugie Department of Systems Science, Graduate School of
Informatics, Kyoto University, Uji, Kyoto, Japan

Complex Systems with Uncertainty


Roberto Tempo CNR-IEIIT, Politecnico di Torino, Torino, Italy

Control of Manufacturing Systems


Dawn Tilbury Department of Mechanical Engineering, University of
Michigan, Ann Arbor, MI, USA

Computer-Aided Control Systems Design


Andreas Varga Institute of System Dynamics and Control, German
Aerospace Center, DLR Oberpfaffenhofen, Wessling, Germany

Information-Based Control
Wing-Shing Wong Department of Information Engineering, The Chinese
University of Hong Kong, Hong Kong, China

Robust Control
Kemin Zhou Department of Electrical and Computer Engineering,
Louisiana State University, Baton Rouge, LA, USA
Contributors

Teodoro Alamo Departamento de Ingeniería de Sistemas y Automática,
Escuela Superior de Ingeniería, Universidad de Sevilla, Sevilla, Spain
Frank Allgöwer Institute for Systems Theory and Automatic Control,
University of Stuttgart, Stuttgart, Germany
Tansu Alpcan Department of Electrical and Electronic Engineering, The
University of Melbourne, Melbourne, Australia
Eitan Altman INRIA, Sophia-Antipolis, France
Sean B. Andersson Mechanical Engineering and Division of Systems
Engineering, Boston University, Boston, MA, USA
David Angeli Department of Electrical and Electronic Engineering, Imperial
College London, London, UK
Dipartimento di Ingegneria dell’Informazione, University of Florence, Italy
Finn Ankersen European Space Agency, Noordwijk, The Netherlands
Anuradha M. Annaswamy Active-adaptive Control Laboratory, Department
of Mechanical Engineering, Massachusetts Institute of Technology,
Cambridge, MA, USA
Gianluca Antonelli University of Cassino and Southern Lazio, Cassino,
Italy
Panos J. Antsaklis Department of Electrical Engineering, University of
Notre Dame, Notre Dame, IN, USA
Pierre Apkarian DCSD, ONERA – The French Aerospace Lab, Toulouse,
France
A. Astolfi Department of Electrical and Electronic Engineering, Imperial
College London, London, UK
Dipartimento di Ingegneria Civile e Ingegneria Informatica, Università di
Roma Tor Vergata, Roma, Italy
Karl Åström Department of Automatic Control, Lund University, Lund,
Sweden

Thomas A. Badgwell ExxonMobil Research & Engineering, Annandale,
NJ, USA
John Baillieul Mechanical Engineering, Electrical and Computer Engineer-
ing, Boston University, Boston, MA, USA
Gary Balas Deceased. Formerly at Aerospace Engineering and Mechanics
Department, University of Minnesota, Minneapolis, MN, USA
Yaakov Bar-Shalom University of Connecticut, Storrs, CT, USA
B. Ross Barmish University of Wisconsin, Madison, WI, USA
Tamer Başar Coordinated Science Laboratory, University of Illinois,
Urbana, IL, USA
Georges Bastin Department of Mathematical Engineering, University
Catholique de Louvain, Louvain-La-Neuve, Belgium
Karine Beauchard CNRS, CMLS, Ecole Polytechnique, Palaiseau, France
Nikolaos Bekiaris-Liberis Department of Mechanical and Aerospace
Engineering, University of California, San Diego, La Jolla, CA, USA
Christine M. Belcastro NASA Langley Research Center, Hampton, VA,
USA
Alberto Bemporad IMT Institute for Advanced Studies Lucca, Lucca, Italy
Peter Benner Max Planck Institute for Dynamics of Complex Technical
Systems, Magdeburg, Germany
Pierre Bernhard INRIA-Sophia Antipolis Méditerranée, Sophia Antipolis,
France
Tomasz R. Bielecki Department of Applied Mathematics, Illinois Institute
of Technology, Chicago, IL, USA
Mogens Blanke Department of Electrical Engineering, Automation and
Control Group, Technical University of Denmark (DTU), Lyngby, Denmark
Centre for Autonomous Marine Operations and Systems (AMOS),
Norwegian University of Science and Technology, Trondheim, Norway
Anthony Bloch Department of Mathematics, The University of Michigan,
Ann Arbor, MI, USA
Bernard Bonnard Institute of Mathematics, University of Burgundy, Dijon,
France
Dominique Bonvin Laboratoire d’Automatique, École Polytechnique
Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
Ugo Boscain CNRS CMAP, École Polytechnique, Palaiseau, France
Team GECO INRIA Saclay, Palaiseau, France
James E. Braun Purdue University, West Lafayette, IN, USA

Roger Brockett Harvard University, Cambridge, MA, USA


Linda Bushnell Department of Electrical Engineering, University of
Washington, Seattle, WA, USA
Fabrizio Caccavale School of Engineering, Università degli Studi della
Basilicata, Potenza, Italy
Abel Cadenillas University of Alberta, Edmonton, AB, Canada
Peter E. Caines McGill University, Montreal, QC, Canada
Andrea Caiti DII – Department of Information Engineering & Centro
“E. Piaggio”, ISME – Interuniversity Research Centre on Integrated Systems
for the Marine Environment, University of Pisa, Pisa, Italy
Octavia Camps Electrical and Computer Engineering Department,
Northeastern University, Boston, MA, USA
Mark Cannon Department of Engineering Science, University of Oxford,
Oxford, UK
Michael Cantoni Department of Electrical & Electronic Engineering,
The University of Melbourne, Parkville, VIC, Australia
Xi-Ren Cao Department of Finance and Department of Automation,
Shanghai Jiao Tong University, Shanghai, China
Institute of Advanced Study, Hong Kong University of Science and
Technology, Hong Kong, China
Daniele Carnevale Dipartimento di Ing. Civile ed Ing. Informatica,
Università di Roma “Tor Vergata”, Roma, Italy
Giuseppe Casalino University of Genoa, Genoa, Italy
Francesco Casella Politecnico di Milano, Milan, Italy
Christos G. Cassandras Division of Systems Engineering, Center for
Information and Systems Engineering, Boston University, Brookline, MA,
USA
David A. Castañón Boston University, Boston, MA, USA
Eduardo Cerpa Departamento de Matemática, Universidad Técnica Fed-
erico Santa María, Valparaiso, Chile
François Chaumette Inria, Rennes, France
Ben M. Chen Department of Electrical and Computer Engineering, National
University of Singapore, Singapore, Singapore
Jie Chen City University of Hong Kong, Hong Kong, China
Tongwen Chen Department of Electrical and Computer Engineering,
University of Alberta, Edmonton, AB, Canada
Benoit Chevalier-Roignant Oliver Wyman, Munich, Germany
Hsiao-Dong Chiang School of Electrical and Computer Engineering,
Cornell University, Ithaca, NY, USA
Stefano Chiaverini Dipartimento di Ingegneria Elettrica e dell’Informazione
“Maurizio Scarano”, Università degli Studi di Cassino e del Lazio
Meridionale, Cassino (FR), Italy
Luigi Chisci Dipartimento di Ingegneria dell’Informazione, Università di
Firenze, Firenze, Italy
Alessandro Chiuso Department of Information Engineering, University of
Padova, Padova, Italy
Joe H. Chow Department of Electrical and Computer Systems Engineering,
Rensselaer Polytechnic Institute, Troy, NY, USA
Monique Chyba University of Hawaii-Manoa, Manoa, HI, USA
Martin Corless School of Aeronautics & Astronautics, Purdue University,
West Lafayette, IN, USA
Jean-Michel Coron Laboratoire Jacques-Louis Lions, University Pierre et
Marie Curie, Paris, France
Jorge Cortés Department of Mechanical and Aerospace Engineering,
University of California, San Diego, La Jolla, CA, USA
Fabrizio Dabbene CNR-IEIIT, Politecnico di Torino, Torino, Italy
Mark L. Darby CMiD Solutions, Houston, TX, USA
Frederick E. Daum Raytheon Company, Woburn, MA, USA
David Martin De Diego Instituto de Ciencias Matemáticas (CSIC-UAM-
UC3M-UCM), Madrid, Spain
Alessandro De Luca Sapienza Università di Roma, Roma, Italy
Cesar de Prada Departamento de Ingeniería de Sistemas y Automática,
University of Valladolid, Valladolid, Spain
Enrique Del Castillo Department of Industrial and Manufacturing
Engineering, The Pennsylvania State University, University Park, PA, USA
Luigi del Re Johannes Kepler Universität, Linz, Austria
Domitilla Del Vecchio Department of Mechanical Engineering, Mas-
sachusetts Institute of Technology, Cambridge, MA, USA
Moritz Diehl Department of Microsystems Engineering (IMTEK),
University of Freiburg, Freiburg, Germany
ESAT-STADIUS/OPTEC, KU Leuven, Leuven-Heverlee, Belgium
Steven X. Ding University of Duisburg-Essen, Duisburg, Germany
Ian Dobson Iowa State University, Ames, IA, USA
Alejandro D. Domínguez-García University of Illinois at
Urbana-Champaign, Urbana-Champaign, IL, USA
Sebastian Dormido Departamento de Informatica y Automatica, UNED,
Madrid, Spain
Tyrone Duncan Department of Mathematics, University of Kansas,
Lawrence, KS, USA
Alexander Efremov Moscow Aviation Institute, Moscow, Russia
Magnus Egerstedt Georgia Institute of Technology, Atlanta, GA, USA
Naomi Ehrich Leonard Department of Mechanical and Aerospace
Engineering, Princeton University, Princeton, NJ, USA
Abbas Emami-Naeini Stanford University, Stanford, CA, USA
Sebastian Engell Fakultät Bio- und Chemieingenieurwesen, Technische
Universität Dortmund, Dortmund, Germany
Dale Enns Honeywell International Inc., Minneapolis, MN, USA
Kaan Erkorkmaz Department of Mechanical & Mechatronics Engineering,
University of Waterloo, Waterloo, ON, Canada
Fabio Fagnani Dipartimento di Scienze Matematiche ‘G.L. Lagrange’,
Politecnico di Torino, Torino, Italy
Maurizio Falcone Dipartimento di Matematica, SAPIENZA – Università di
Roma, Rome, Italy
Paolo Falcone Department of Signals and Systems, Mechatronics Group,
Chalmers University of Technology, Göteborg, Sweden
Alfonso Farina Selex ES, Roma, Italy
Heike Faßbender Institut Computational Mathematics, Technische
Universität Braunschweig, Braunschweig, Germany
Augusto Ferrante Dipartimento di Ingegneria dell’Informazione, Univer-
sità di Padova, Padova, Italy
Thor I. Fossen Department of Engineering Cybernetics, Centre for
Autonomous Marine Operations and Systems, Norwegian University of
Science and Technology, Trondheim, Norway
Bruce A. Francis Department of Electrical and Computer Engineering,
University of Toronto, Toronto, ON, Canada
Emilio Frazzoli Massachusetts Institute of Technology, Cambridge, MA,
USA
Georg Frey Saarland University, Saarbrücken, Germany
Minyue Fu School of Electrical Engineering and Computer Science,
University of Newcastle, Callaghan, NSW, Australia
Sergio Galeani Dipartimento di Ingegneria Civile e Ingegneria Informatica,
Università di Roma “Tor Vergata”, Roma, Italy
Mario Garcia-Sanz Case Western Reserve University, Cleveland, OH, USA
Janos Gertler George Mason University, Fairfax, VA, USA
Alessandro Giua DIEE, University of Cagliari, Cagliari, Italy
LSIS, Aix-en-Provence, France
S. Torkel Glad Department of Electrical Engineering, Linköping University,
Linköping, Sweden
Keith Glover Department of Engineering, University of Cambridge,
Cambridge, UK
Ambarish Goswami Honda Research Institute, Mountain View, CA, USA
Pulkit Grover Carnegie Mellon University, Pittsburgh, PA, USA
Lars Grüne Mathematical Institute, University of Bayreuth, Bayreuth,
Germany
Vijay Gupta Department of Electrical Engineering, University of Notre
Dame, Notre Dame, IN, USA
Fredrik Gustafsson Division of Automatic Control, Department of
Electrical Engineering, Linköping University, Linköping, Sweden
Christoforos N. Hadjicostis University of Cyprus, Nicosia, Cyprus
Tore Hägglund Lund University, Lund, Sweden
Christine Haissig Honeywell International Inc., Minneapolis, MN, USA
Bruce Hajek University of Illinois, Urbana, IL, USA
Alain Haurie ORDECSYS and University of Geneva, Switzerland
GERAD-HEC Montréal PQ, Canada
W.P.M.H. Heemels Department of Mechanical Engineering, Eindhoven
University of Technology, Eindhoven, The Netherlands
Didier Henrion LAAS-CNRS, University of Toulouse, Toulouse, France
Faculty of Electrical Engineering, Czech Technical University in Prague,
Prague, Czech Republic
João P. Hespanha Center for Control, Dynamical Systems and Computation,
University of California, Santa Barbara, CA, USA
Håkan Hjalmarsson School of Electrical Engineering, ACCESS Linnaeus
Center, KTH Royal Institute of Technology, Stockholm, Sweden
Ying Hu IRMAR, Université Rennes 1, Rennes Cedex, France
Luigi Iannelli Università degli Studi del Sannio, Benevento, Italy
Pablo A. Iglesias Electrical & Computer Engineering, The Johns Hopkins
University, Baltimore, MD, USA
Petros A. Ioannou University of Southern California, Los Angeles, CA,
USA
Hideaki Ishii Tokyo Institute of Technology, Yokohama, Japan
Alberto Isidori Department of Computer and System Sciences “A. Ruberti”,
University of Rome “La Sapienza”, Rome, Italy
Tetsuya Iwasaki Department of Mechanical & Aerospace Engineering,
University of California, Los Angeles, CA, USA
Ali Jadbabaie University of Pennsylvania, Philadelphia, PA, USA
Monique Jeanblanc Laboratoire Analyse et Probabilités, IBGBI, Université
d’Evry Val d’Essonne, Evry Cedex, France
Karl H. Johansson ACCESS Linnaeus Center, Royal Institute of
Technology, Stockholm, Sweden
Ramesh Johari Stanford University, Stanford, CA, USA
Mihailo R. Jovanović Department of Electrical and Computer Engineering,
University of Minnesota, Minneapolis, MN, USA
Hans-Michael Kaltenbach ETH Zürich, Basel, Switzerland
Christoph Kawan Courant Institute of Mathematical Sciences, New York
University, New York, USA
Matthias Kawski School of Mathematical and Statistical Sciences, Arizona
State University, Tempe, AZ, USA
Hassan K. Khalil Department of Electrical and Computer Engineering,
Michigan State University, East Lansing, MI, USA
Mustafa Khammash Department of Biosystems Science and Engineering,
Swiss Federal Institute of Technology at Zurich (ETHZ), Basel, Switzerland
Rudibert King Technische Universität Berlin, Berlin, Germany
Basil Kouvaritakis Department of Engineering Science, University of
Oxford, Oxford, UK
A.J. Krener Department of Applied Mathematics, Naval Postgraduate
School, Monterey, CA, USA
Miroslav Krstic Department of Mechanical and Aerospace Engineering,
University of California, San Diego, La Jolla, CA, USA
Vladimír Kučera Faculty of Electrical Engineering, Czech Technical
University of Prague, Prague, Czech Republic
Stéphane Lafortune Department of Electrical Engineering and Computer
Science, University of Michigan, Ann Arbor, MI, USA
Frank L. Lewis Arlington Research Institute, University of Texas,
Fort Worth, TX, USA
Daniel Limon Departamento de Ingeniería de Sistemas y Automática,
Escuela Superior de Ingeniería, Universidad de Sevilla, Sevilla, Spain
Kang-Zhi Liu Department of Electrical and Electronic Engineering, Chiba
University, Chiba, Japan
Lennart Ljung Division of Automatic Control, Department of Electrical
Engineering, Linköping University, Linköping, Sweden
Marco Lovera Politecnico di Milano, Milan, Italy
J. Lygeros Automatic Control Laboratory, Swiss Federal Institute of
Technology Zurich (ETHZ), Zurich, Switzerland
Kevin M. Lynch Mechanical Engineering Department, Northwestern
University, Evanston, IL, USA
Ronald Mahler Eagan, MN, USA
Lorenzo Marconi CASY-DEI, University of Bologna, Bologna, Italy
Sonia Martínez Department of Mechanical and Aerospace Engineering,
University of California, La Jolla, San Diego, CA, USA
Wolfgang Mauntz Fakultät Bio- und Chemieingenieurwesen, Technische
Universität Dortmund, Dortmund, Germany
Volker Mehrmann Institut für Mathematik MA 4-5, Technische Universität
Berlin, Berlin, Germany
Claudio Melchiorri Dipartimento di Ingegneria dell’Energia Elettrica e
dell’Informazione, Alma Mater Studiorum Università di Bologna, Bologna,
Italy
Mehran Mesbahi University of Washington, Seattle, WA, USA
Alexandre R. Mesquita Department of Electronics Engineering, Federal
University of Minas Gerais, Belo Horizonte, Brazil
Thomas Meurer Faculty of Engineering, Christian-Albrechts-University
Kiel, Kiel, Germany
Wim Michiels KU Leuven, Leuven (Heverlee), Belgium
Richard Hume Middleton School of Electrical Engineering and Computer
Science, The University of Newcastle, Callaghan, NSW, Australia
S.O. Reza Moheimani School of Electrical Engineering & Computer
Science, The University of Newcastle, Callaghan, NSW, Australia
Julian Morris School of Chemical Engineering and Advanced Materials,
Centre for Process Analytics and Control Technology, Newcastle University,
Newcastle Upon Tyne, UK
James Moyne Mechanical Engineering Department, University of
Michigan, Ann Arbor, MI, USA

Curtis P. Mracek Raytheon Missile Systems, Waltham, MA, USA


Richard M. Murray Control and Dynamical Systems, Caltech, Pasadena,
CA, USA
Hideo Nagai Osaka University, Osaka, Japan
Girish N. Nair Department of Electrical & Electronic Engineering,
University of Melbourne, Melbourne, VIC, Australia
Angelia Nedić Industrial and Enterprise Systems Engineering, University of
Illinois, Urbana, IL, USA
Arye Nehorai Preston M. Green Department of Electrical and Systems
Engineering, Washington University in St. Louis, St. Louis, MO,
USA
Dragan Nesic Department of Electrical and Electronic Engineering, The
University of Melbourne, Melbourne, VIC, Australia
Brett Ninness School of Electrical and Computer Engineering, University
of Newcastle, Newcastle, Australia
Hidekazu Nishimura Graduate School of System Design and Management,
Keio University, Yokohama, Japan
Dominikus Noll Institut de Mathématiques, Université de Toulouse,
Toulouse, France
Lorenzo Ntogramatzidis Department of Mathematics and Statistics, Curtin
University, Perth, WA, Australia
Giuseppe Oriolo Sapienza Università di Roma, Roma, Italy
Richard W. Osborne University of Connecticut, Storrs, CT, USA
Martin Otter Institute of System Dynamics and Control, German
Aerospace Center (DLR), Wessling, Germany
David H. Owens University of Sheffield, Sheffield, UK
Hitay Özbay Department of Electrical and Electronics Engineering, Bilkent
University, Ankara, Turkey
Asuman Ozdaglar Laboratory for Information and Decision Systems,
Department of Electrical Engineering and Computer Science, Massachusetts
Institute of Technology, Cambridge, MA, USA
Andrew Packard Mechanical Engineering Department, University of
California, Berkeley, CA, USA
Fernando Paganini Universidad ORT Uruguay, Montevideo, Uruguay
Gabriele Pannocchia University of Pisa, Pisa, Italy
Lucy Y. Pao University of Colorado, Boulder, CO, USA
George J. Pappas Department of Electrical and Systems Engineering,
University of Pennsylvania, Philadelphia, PA, USA
Frank C. Park Robotics Laboratory, Seoul National University, Seoul,
Korea
Bozenna Pasik-Duncan Department of Mathematics, University of Kansas,
Lawrence, KS, USA
Ron J. Patton School of Engineering, University of Hull, Hull, UK
Marco Pavone Stanford University, Stanford, CA, USA
Shige Peng Shandong University, Jinan, Shandong Province, China
Tristan Perez Electrical Engineering & Computer Science, Queensland
University of Technology, Brisbane, QLD, Australia
Ian R. Petersen School of Engineering and Information Technology,
University of New South Wales, the Australian Defence Force Academy,
Canberra, Australia
Kristin Y. Pettersen Department of Engineering Cybernetics, Norwegian
University of Science and Technology, Trondheim, Norway
Benedetto Piccoli Mathematical Sciences and Center for Computational and
Integrative Biology, Rutgers University, Camden, NJ, USA
Rik Pintelon Department ELEC, Vrije Universiteit Brussel, Brussels,
Belgium
Boris Polyak Institute of Control Science, Moscow, Russia
Romain Postoyan Université de Lorraine, CRAN, France
CNRS, CRAN, France
J. David Powell Stanford University, Stanford, CA, USA
Laurent Praly MINES ParisTech, PSL Research University, CAS,
Fontainebleau, France
Domenico Prattichizzo University of Siena, Siena, Italy
James A. Primbs University of Texas at Dallas, Richardson, TX, USA
S. Joe Qin University of Southern California, Los Angeles, CA, USA
Li Qiu Hong Kong University of Science and Technology, Hong Kong SAR,
China
Rajesh Rajamani Department of Mechanical Engineering, University of
Minnesota, Twin Cities, Minneapolis, MN, USA
Saša Raković Oxford University, Oxford, UK
James B. Rawlings University of Wisconsin, Madison, WI, USA
Jean-Pierre Raymond Institut de Mathématiques, Université Paul Sabatier
Toulouse III & CNRS, Toulouse Cedex, France
Wei Ren Department of Electrical Engineering, University of California,
Riverside, CA, USA
Spyros Reveliotis School of Industrial & Systems Engineering, Georgia
Institute of Technology, Atlanta, GA, USA
Giorgio Rizzoni Department of Mechanical and Aerospace Engineering,
Center for Automotive Research, The Ohio State University, Columbus, OH,
USA
L.C.G. Rogers University of Cambridge, Cambridge, UK
Pierre Rouchon Centre Automatique et Systèmes, Mines ParisTech, Paris
Cedex 06, France
Ricardo G. Sanfelice Department of Computer Engineering, University of
California at Santa Cruz, Santa Cruz, CA, USA
Heinz Schättler Washington University, St. Louis, MO, USA
Thomas B. Schön Department of Information Technology, Uppsala
University, Uppsala, Sweden
Johan Schoukens Department ELEC, Vrije Universiteit Brussel, Brussels,
Belgium
Michael Sebek Department of Control Engineering, Faculty of Electrical
Engineering, Czech Technical University in Prague, Prague 6, Czech
Republic
Peter Seiler Aerospace Engineering and Mechanics Department, University
of Minnesota, Minneapolis, MN, USA
Suresh P. Sethi Jindal School of Management, The University of Texas at
Dallas, Richardson, TX, USA
Sirish L. Shah Department of Chemical and Materials Engineering,
University of Alberta Edmonton, Edmonton, AB, Canada
Jeff S. Shamma School of Electrical and Computer Engineering, Georgia
Institute of Technology, Atlanta, GA, USA
Pavel Shcherbakov Institute of Control Science, Moscow, Russia
Jianjun Shi Georgia Institute of Technology, Atlanta, GA, USA
Manuel Silva Instituto de Investigación en Ingeniería de Aragón (I3A),
Universidad de Zaragoza, Zaragoza, Spain
Vasile Sima Advanced Research, National Institute for Research &
Development in Informatics, Bucharest, Romania
Robert E. Skelton University of California, San Diego, CA, USA
Sigurd Skogestad Department of Chemical Engineering, Norwegian
University of Science and Technology (NTNU), Trondheim, Norway
Eduardo D. Sontag Rutgers University, New Brunswick, NJ, USA
Asgeir J. Sørensen Department of Marine Technology, Centre for
Autonomous Marine Operations and Systems (AMOS), Norwegian
University of Science and Technology, NTNU, Trondheim, Norway
Mark W. Spong Erik Jonsson School of Engineering and Computer
Science, The University of Texas at Dallas, Richardson, TX, USA
R. Srikant Department of Electrical and Computer Engineering and the
Coordinated Science Lab, University of Illinois at Urbana-Champaign,
Champaign, IL, USA
Jörg Stelling ETH Zürich, Basel, Switzerland
Jing Sun University of Michigan, Ann Arbor, MI, USA
Krzysztof Szajowski Faculty of Fundamental Problems of Technology,
Institute of Mathematics and Computer Science, Wroclaw University of
Technology, Wroclaw, Poland
Mario Sznaier Electrical and Computer Engineering Department,
Northeastern University, Boston, MA, USA
Paulo Tabuada Department of Electrical Engineering, University of
California, Los Angeles, CA, USA
Satoshi Tadokoro Tohoku University, Sendai, Japan
Gongguo Tang Department of Electrical Engineering & Computer Science,
Colorado School of Mines, Golden, CO, USA
Shanjian Tang Fudan University, Shanghai, China
Andrew R. Teel Electrical and Computer Engineering Department,
University of California, Santa Barbara, CA, USA
Roberto Tempo CNR-IEIIT, Politecnico di Torino, Torino, Italy
Onur Toker Fatih University, Istanbul, Turkey
H.L. Trentelman Johann Bernoulli Institute for Mathematics and Computer
Science, University of Groningen, Groningen, AV, The Netherlands
Jorge Otávio Trierweiler Group of Intensification, Modelling, Simulation,
Control and Optimization of Processes (GIMSCOP), Department of
Chemical Engineering, Federal University of Rio Grande do Sul (UFRGS),
Porto Alegre, RS, Brazil
Lenos Trigeorgis University of Cyprus, Nicosia, Cyprus
Eric Tseng Ford Motor Company, Dearborn, MI, USA
Kyriakos G. Vamvoudakis Center for Control, Dynamical Systems and
Computation (CCDC), University of California, Santa Barbara, CA, USA
Paul Van Dooren ICTEAM: Department of Mathematical Engineering,
Catholic University of Louvain, Louvain-la-Neuve, Belgium
O. Arda Vanli Department of Industrial and Manufacturing Engineering,
High Performance Materials Institute Florida A&M University and Florida
State University, Tallahassee, FL, USA
Andreas Varga Institute of System Dynamics and Control, German
Aerospace Center, DLR Oberpfaffenhofen, Wessling, Germany
Michel Verhaegen Delft Center for Systems and Control, Delft University,
Delft, The Netherlands
Mathukumalli Vidyasagar University of Texas at Dallas, Richardson, TX,
USA
Luigi Villani Dipartimento di Ingegneria Elettrica e Tecnologie
dell'Informazione, Università degli Studi di Napoli Federico II, Napoli, Italy
Richard B. Vinter Imperial College, London, UK
Antonio Visioli Dipartimento di Ingegneria Meccanica e Industriale,
University of Brescia, Brescia, Italy
Vijay Vittal Arizona State University, Tempe, AZ, USA
Costas Vournas School of Electrical and Computer Engineering, National
Technical University of Athens, Zografou, Greece
Steffen Waldherr Institute for Automation Engineering,
Otto-von-Guericke-Universität Magdeburg, Magdeburg, Germany
Fred Wang University of Tennessee, Knoxville, TN, USA
Yorai Wardi School of Electrical and Computer Engineering, Georgia
Institute of Technology, Atlanta, GA, USA
John M. Wassick The Dow Chemical Company, Midland, MI, USA
Wing-Shing Wong Department of Information Engineering, The Chinese
University of Hong Kong, Hong Kong, China
W.M. Wonham Department of Electrical & Computer Engineering,
University of Toronto, Toronto, ON, Canada
Yutaka Yamamoto Department of Applied Analysis and Complex
Dynamical Systems, Graduate School of Informatics, Kyoto University,
Kyoto, Japan
Hong Ye The MathWorks, Inc., Natick, MA, USA
George Yin Department of Mathematics, Wayne State University, Detroit,
MI, USA
Serdar Yüksel Department of Mathematics and Statistics, Queen’s
University, Kingston, ON, Canada
Michael M. Zavlanos Department of Mechanical Engineering and
Materials Science, Duke University, Durham, NC, USA
Qing Zhang Department of Mathematics, The University of Georgia,
Athens, GA, USA
Qinghua Zhang Inria, Campus de Beaulieu, Rennes Cedex, France
A

Active Power Control of Wind Power Plants for Grid Integration

Lucy Y. Pao
University of Colorado, Boulder, CO, USA

J. Baillieul, T. Samad (eds.), Encyclopedia of Systems and Control, DOI 10.1007/978-1-4471-5058-9,
© Springer-Verlag London 2015

Abstract

Increasing penetrations of intermittent renewable energy sources, such as wind, on the utility grid have led to concerns over the reliability of the grid. One approach for improving grid reliability with increasing wind penetrations is to actively control the real power output of wind turbines and wind power plants. Providing a full range of responses requires derating wind power plants so that there is headroom to both increase and decrease power to provide grid balancing services and stabilizing responses. Initial results indicate that wind turbines may be able to provide primary frequency control and frequency regulation services more rapidly than conventional power plants.

Keywords

Active power control; Automatic generation control; Frequency regulation; Grid balancing; Grid integration; Primary frequency control; Wind energy

Balancing Electrical Generation and Load on the Grid

Wind penetration levels across the world have increased dramatically, with installed capacity growing at a mean annual rate of 25 % over the last decade (Gsanger and Pitteloud 2013). Some nations in Western Europe, particularly Denmark, Portugal, Spain, and Germany, have seen wind provide more than 16 % of their annual electrical energy needs (Wiser and Bolinger 2013). To maintain grid frequency at its nominal value, the electrical generation must equal the electrical load on the grid. This balancing has historically been left up to conventional utilities with synchronous generators, which can vary their active power output by simply varying their fuel input. Grid frequency control is performed across a number of regimes and time scales, with both manual and automatic control commands. Further details can be found in Rebours et al. (2007) and Ela et al. (2011).

Wind turbines and wind power plants are now being recognized as having the potential to meet demanding grid stabilizing requirements set by transmission system operators (Aho et al. 2013a,b; Buckspan et al. 2012; Ela et al. 2011; Miller et al. 2011). Recent grid code requirements have spurred the development of wind turbine active power control (APC) systems, which allow wind turbines to participate in grid frequency regulation and provide stabilizing responses to
sudden changes in grid frequency. The ability of wind turbines to provide APC services also allows them to follow forecast-based power production schedules.

For a wind turbine to fully participate in grid frequency control, it must be derated (to $P_{derated}$) with respect to the maximum power ($P_{max}$) that can be generated given the available wind, allowing for both increases and decreases in power, if necessary. Wind turbines can derate their power output by pitching their blades to shed aerodynamic power or reducing their generator torque in order to operate at higher-than-optimal rotor speeds. Wind turbines can then respond at different time scales to provide more or less power through pitch control (which can provide a power response within seconds) and generator torque control (which can provide a power response within milliseconds).

Active Power Control of Wind Power Plants for Grid Integration, Fig. 1 Simulation results showing the capability of wind power plants to provide APC services on a small-scale grid model. The total grid size is 3 GW, and a frequency event is induced due to the sudden active power imbalance when 5 % of generation is taken offline at time = 200 s. Each wind power plant is derated to 90 % of its rated capacity. The system response with all conventional generation (no wind) is compared to the cases when there are wind power plants on the grid at 10 % penetration (i) with a baseline control system (wind baseline) where wind does not provide APC services and (ii) with an APC system (wind APC) that uses a 3 % droop curve where either 50 % or 100 % of the wind power plants provide PFC

Wind Turbine Inertial and Primary Frequency Control

Inertial and primary frequency control is generally considered to be the first 5–10 s after a frequency event occurs. In this regime, the governors of capable utilities actuate, allowing for a temporary increase or decrease in the utilities' power outputs. The primary frequency control (PFC) response provided by conventional synchronous generators can be characterized by a droop curve, which relates fluctuations in grid frequency to a change in power from the utility. For example, a 3 % droop curve means that a 3 % change in grid frequency yields a 100 % change in commanded power.

Although modern wind turbines do not inherently provide inertial or primary frequency control responses because their power electronics impart a buffer between their generators and the grid, such responses can be produced through careful design of the wind turbine control systems. While the physical properties of a conventional synchronous generator yield a static droop characteristic, a wind turbine can be controlled to provide a primary frequency response via either a static or a time-varying droop curve. A time-varying droop curve can be designed to be more aggressive when the magnitude of the rate of change of frequency of the grid is larger.

Figure 1 shows a simulation of a grid response under different scenarios when 5 % of the generating capacity suddenly goes offline. When the wind power plant (10 % of the generation on the grid) is operating with its normal baseline control system that does not provide APC services, the frequency response is worse than the no-wind scenario, due to the reduced amount of conventional generation in the wind-baseline scenario that can provide power control services. However, compared to both the no-wind and wind-baseline cases, using PFC with a droop curve results in the frequency decline being arrested at a minimum (nadir) frequency $f_{nadir}$ that is closer to the nominal $f_{nom} = 60$ Hz frequency level; further, the steady-state frequency $f_{ss}$ after the PFC response is also closer to $f_{nom}$. It is important to prevent the difference $f_{nom} - f_{nadir}$
from exceeding a threshold that can lead to underfrequency load shedding (UFLS) or rolling blackouts. The particular threshold varies across utility grids, but the largest such threshold in North America is 1.5 Hz.

Stability issues arising from the altered control algorithms must be analyzed (Buckspan et al. 2013). The trade-offs between aggressive primary frequency control and resulting structural loads also need to be evaluated carefully. Initial research shows that potential grid support can be achieved while not causing any increases in structural loading and hence fatigue damage and operations and maintenance costs (Buckspan et al. 2012).

Wind Turbine Automatic Generation Control

Secondary frequency control, also known as automatic generation control (AGC), occurs on a slower time scale than PFC. AGC commands can be generated from highly damped proportional integral (PI) controllers or logic controllers to regulate grid frequency and are used to control the power output of participating power plants. In many geographical regions, frequency regulation services are compensated through a competitive market, where power plants that provide faster and more accurate AGC command tracking are paid more.

An active power control system that combines both primary and secondary/AGC frequency control capabilities has recently been detailed in Aho et al. (2013a). Figure 2 presents initial experimental field test results of this active power controller, in response to prerecorded frequency events, showing how responsive wind turbines can be to both manual derating commands as well as rapidly changing automatic primary frequency control commands generated via a droop curve. Overall, results indicate that wind turbines can respond more rapidly than conventional power plants. However, increasing the power control and regulation performance of a wind turbine should be carefully considered due to a number of complicating factors, including coupling with existing control loops, a desire to limit actuator usage and structural loading, and wind variability.

Active Power Control of Wind Power Plants

A wind power plant, often referred to as a wind farm, consists of many wind turbines. In wind power plants, wake effects can reduce generation in downstream turbines to less than 60 % of the lead turbine (Barthelmie et al. 2009; Porté-Agel et al. 2013). There are many emerging areas of active research, including the modeling of wakes and wake effects and how these models can then be used to coordinate the control of individual turbines so that the overall wind power plant can reliably track the desired power reference command. A wind farm controller can be interconnected with the utility grid, transmission system operator (TSO), and individual turbines as shown in Fig. 3. By properly accounting for the wakes, wind farm controllers can allocate appropriate power reference commands to the individual wind turbines. Individual turbine generator torque and blade pitch controllers, as discussed earlier, can be designed so that each turbine follows the power reference command issued by the wind farm controller. Methods for intelligent, distributed control of entire wind farms to rapidly respond to grid frequency disturbances could significantly reduce frequency deviations and improve recovery speed to such disturbances.

Combining Techniques with Other Approaches for Balancing the Grid

Ultimately, active power control of wind turbines and wind power plants should be combined with both demand-side management and storage to provide a more comprehensive solution that enables balancing electrical generation and electrical load with large penetrations of wind energy on the grid. Demand-side management (Callaway and Hiskens 2011; Kowli and Meyn 2011; Palensky and Dietrich 2011)
Active Power Control of Wind Power Plants for Grid Integration, Fig. 2 The frequency data input and power that is commanded and generated during a field test with a 550 kW research wind turbine at the US National Renewable Energy Laboratory (NREL). The frequency data was recorded on the Electric Reliability Council of Texas (ERCOT) interconnection (data courtesy of Vahan Gevorgian, NREL). The upper plot shows the grid frequency, which is passed through a 5 % droop curve with a deadband to generate a power command. The high-frequency fluctuations in the generated power would be smoothed when aggregating the power output of an entire wind power plant

Active Power Control of Wind Power Plants for Grid Integration, Fig. 3 Schematic showing the communication and coupling between the wind farm control system, individual wind turbines, utility grid, and the grid operator. The wind farm controller uses measurements of the utility grid frequency and automatic generation control power command signals from the grid operator to determine a power reference for each turbine in the wind farm
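The droop-with-deadband mapping described in the Fig. 2 caption can be sketched in code. This is a minimal illustration under stated assumptions, not the NREL controller: the function name, the per-unit droop convention (a droop of 0.03 maps a 3 % frequency change to a 100 % power change), and all numbers are hypothetical.

```python
def droop_power_command(f_grid, p_sched, p_rated,
                        droop=0.03, deadband=0.0, f_nom=60.0):
    """Map a grid-frequency deviation to an active-power command.

    Low frequency commands more power, high frequency less, in
    proportion to the deviation (relative to f_nom) divided by the
    droop (relative to p_rated).  Deviations inside the deadband
    leave the scheduled (derated) power p_sched unchanged.
    """
    df = f_grid - f_nom
    if abs(df) <= deadband:
        return p_sched
    # Measure the deviation from the edge of the deadband.
    df -= deadband if df > 0 else -deadband
    delta_p = -(df / f_nom) / droop * p_rated
    return p_sched + delta_p

# Example: 3 % droop, schedule derated to 90 % of a 100 MW plant.
# A 0.06 Hz underfrequency event (0.1 % of 60 Hz) commands roughly
# 3.33 MW of additional power.
cmd = droop_power_command(59.94, p_sched=90.0, p_rated=100.0)
```

The deadband keeps the turbine from chasing normal small frequency fluctuations, which is consistent with the smoothing behavior noted in the caption.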
aims to alter the demand in order to mitigate peak electrical loads and hence to maintain sufficient control authority among generating units. As more effective and economical energy storage solutions (Pickard and Abbott 2012) at the power plant scale are developed, wind (and solar) energy can then be stored when wind (and solar) energy availability is not well matched with electrical demand. Advances in wind forecasting (Giebel et al. 2011) will also improve wind power forecasts to facilitate more accurate scheduling of larger amounts of wind power on the grid.

Cross-References

Control of Fluids and Fluid-Structure Interactions
Control Structure Selection
Coordination of Distributed Energy Resources for Provision of Ancillary Services: Architectures and Algorithms
Electric Energy Transfer and Control via Power Electronics
Networked Control Systems: Architecture and Stability Issues
Power System Voltage Stability
Small Signal Stability in Electric Power Systems

Recommended Reading

A recent comprehensive report on active power control that covers topics ranging from control design to power system engineering to economics can be found in Ela et al. (2014) and the references therein.

Bibliography

Aho J, Pao L, Buckspan A, Fleming P (2013a) An active power control system for wind turbines capable of primary and secondary frequency control for supporting grid reliability. In: Proceedings of the AIAA aerospace sciences meeting, Grapevine, Jan 2013
Aho J, Buckspan A, Dunne F, Pao LY (2013b) Controlling wind energy for utility grid reliability. ASME Dyn Syst Control Mag 1(3):4–12
Barthelmie RJ, Hansen K, Frandsen ST, Rathmann O, Schepers JG, Schlez W, Phillips J, Rados K, Zervos A, Politis ES, Chaviaropoulos PK (2009) Modelling and measuring flow and wind turbine wakes in large wind farms offshore. Wind Energy 12:431–444
Buckspan A, Aho J, Fleming P, Jeong Y, Pao L (2012) Combining droop curve concepts with control systems for wind turbine active power control. In: Proceedings of the IEEE symposium power electronics and machines in wind applications, Denver, July 2012
Buckspan A, Pao L, Aho J, Fleming P (2013) Stability analysis of a wind turbine active power control system. In: Proceedings of the American control conference, Washington, DC, June 2013, pp 1420–1425
Callaway DS, Hiskens IA (2011) Achieving controllability of electric loads. Proc IEEE 99(1):184–199
Ela E, Milligan M, Kirby B (2011) Operating reserves and variable generation. Technical report, National Renewable Energy Laboratory, NREL/TP-5500-51928
Ela E, Gevorgian V, Fleming P, Zhang YC, Singh M, Muljadi E, Scholbrock A, Aho J, Buckspan A, Pao L, Singhvi V, Tuohy A, Pourbeik P, Brooks D, Bhatt N (2014) Active power controls from wind power: bridging the gaps. Technical report, National Renewable Energy Laboratory, NREL/TP-5D00-60574, Jan 2014
Giebel G, Brownsword R, Kariniotakis G, Denhard M, Draxl C (2011) The state-of-the-art in short-term prediction of wind power: a literature overview. Technical report, ANEMOS.plus/SafeWind, Jan 2011
Gsanger S, Pitteloud J-D (2013) World wind energy report 2012. The World Wind Energy Association, May 2013
Kowli AS, Meyn SP (2011) Supporting wind generation deployment with demand response. In: Proceedings of the IEEE power and energy society general meeting, Detroit, July 2011
Miller N, Clark K, Shao M (2011) Frequency responsive wind plant controls: impacts on grid performance. In: Proceedings of the IEEE power and energy society general meeting, Detroit, July 2011
Palensky P, Dietrich D (2011) Demand-side management: demand response, intelligent energy systems, and smart loads. IEEE Trans Ind Inform 7(3):381–388
Pickard WF, Abbott D (eds) (2012) The intermittency challenge: massive energy storage in a sustainable future. Proc IEEE 100(2):317–321. Special issue
Porté-Agel F, Wu Y-T, Chen C-H (2013) A numerical study of the effects of wind direction on turbine wakes and power losses in a large wind farm. Energies 6:5297–5313
Rebours Y, Kirschen D, Trotignon M, Rossignol S (2007) A survey of frequency and voltage control ancillary services – part I: technical features. IEEE Trans Power Syst 22(1):350–357
Wiser R, Bolinger M (2013) 2012 Wind technologies market report. Lawrence Berkeley National Laboratory Report, Aug 2013
Adaptive Control of Linear Time-Invariant Systems

Petros A. Ioannou
University of Southern California, Los Angeles, CA, USA

Abstract

Adaptive control of linear time-invariant (LTI) systems deals with the control of LTI systems whose parameters are constant but otherwise completely unknown. In some cases, large norm bounds as to where the unknown parameters are located in the parameter space are also assumed to be known. In general, adaptive control deals with LTI plants which cannot be controlled with fixed gain controllers, i.e., nonadaptive control methods, and whose parameters, even though assumed constant for design and analysis purposes, may change over time in an unpredictable manner. Most of the adaptive control approaches for LTI systems use the so-called certainty equivalence principle, where a control law motivated from the known-parameter case is combined with an adaptive law for estimating online the unknown parameters. The control law could be associated with different control objectives and the adaptive law with different parameter estimation techniques. These combinations give rise to a wide class of adaptive control schemes. The two popular control objectives that led to a wide range of adaptive control schemes include model reference adaptive control (MRAC) and adaptive pole placement control (APPC). In MRAC, the control objective is for the plant output to track the output of a reference model, designed to represent the desired properties of the plant, for any reference input signal. APPC is more general and is based on control laws whose objective is to set the poles of the closed loop at desired locations chosen based on performance requirements. Another class of adaptive controllers for LTI systems that involves ideas from MRAC and APPC is based on multiple models, search methods, and switching logic. In this class of schemes, the unknown parameter space is partitioned into smaller subsets. For each subset, a parameter estimator or a stabilizing controller is designed, or a combination of the two. The problem then is to identify which subset in the parameter space the unknown plant model belongs to and/or which controller is a stabilizing one and meets the control objective. A switching logic is designed based on different considerations to identify the most appropriate plant model or controller from the list of candidate plant models and/or controllers. In this entry, we briefly describe the above approaches to adaptive control for LTI systems.

Keywords

Adaptive pole placement control; Direct MRAC; Indirect MRAC; LTI systems; Model reference adaptive control; Robust adaptive control

Model Reference Adaptive Control

In model reference control (MRC), the desired plant behavior is described by a reference model which is simply an LTI system with a transfer function $W_m(s)$ and is driven by a reference input. The controller transfer function $C(s, \theta^*)$, where $\theta^*$ is a vector with the coefficients of $C(s)$, is then developed so that the closed-loop plant has a transfer function equal to $W_m(s)$. This transfer function matching guarantees that the plant will match the reference model response for any reference input signal. In this case the plant transfer function $G_p(s, \theta_p^*)$, where $\theta_p^*$ is a vector with all the coefficients of $G_p(s)$, together with the controller transfer function $C(s, \theta^*)$ should lead to a closed-loop transfer function from the reference input $r$ to the plant output $y_p$ that is equal to $W_m(s)$, i.e.,

$$\frac{y_p(s)}{r(s)} = W_m(s) = \frac{y_m(s)}{r(s)}, \qquad (1)$$
where $y_m$ is the output of the reference model. For this transfer matching to be possible, $G_p(s)$ and $W_m(s)$ have to satisfy certain assumptions. These assumptions enable the calculation of the controller parameter vector $\theta^*$ as

$$\theta^* = F(\theta_p^*), \qquad (2)$$

where $F$ is a function of the plant parameters $\theta_p^*$, to satisfy the matching equation (1). The function in (2) has a special form in the case of MRC that allows the design of both direct and indirect MRAC. For more general classes of controller structures, this is not possible in general, as the function $F$ is nonlinear. This transfer function matching guarantees that the tracking error $e_1 = y_p - y_m$ converges to zero for any given reference input signal $r$. If the plant parameter vector $\theta_p^*$ is known, then the controller parameters $\theta^*$ can be calculated using (2), and the controller $C(s, \theta^*)$ can be implemented. We are considering the case where $\theta_p^*$ is unknown. In this case, the use of the certainty equivalence (CE) approach (Astrom and Wittenmark 1995; Egardt 1979; Ioannou and Fidan 2006; Ioannou and Kokotovic 1983; Ioannou and Sun 1996; Landau 1979; Landau et al. 1998; Morse 1996; Narendra and Annaswamy 1989; Narendra and Balakrishnan 1997; Sastry and Bodson 1989; Stefanovic and Safonov 2011; Tao 2003), where the unknown parameters are replaced with their estimates, leads to the adaptive control scheme referred to as indirect MRAC, shown in Fig. 1a. The unknown plant parameter vector $\theta_p^*$ is estimated at each time $t$, denoted by $\theta_p(t)$, using an online parameter estimator referred to as an adaptive law. The plant parameter estimate $\theta_p(t)$ at each time $t$ is then used to calculate the controller parameter vector $\theta(t) = F(\theta_p(t))$ used in the controller $C(s, \theta)$. This class of MRAC is called indirect MRAC, because the controller parameters are not updated directly, but calculated at each time $t$ using the estimated plant parameters. Another way of designing MRAC schemes is to parameterize the plant transfer function in terms of the desired controller parameter vector $\theta^*$. This is possible in the MRC case, because the structure of the MRC law is such that we can use (2) to write

$$\theta_p^* = F^{-1}(\theta^*), \qquad (3)$$

where $F^{-1}$ is the inverse of the mapping $F(\cdot)$, and then express $G_p(s, \theta_p^*) = G_p(s, F^{-1}(\theta^*)) = \bar{G}_p(s, \theta^*)$. The adaptive law for estimating $\theta^*$ online can now be developed by using $y_p = \bar{G}_p(s, \theta^*)u_p$ to obtain a parametric model that is appropriate for estimating the controller vector $\theta^*$ as the unknown parameter vector. The MRAC can then be developed using the CE approach as shown in Fig. 1b. In this case, the controller parameter $\theta(t)$ is updated directly without any intermediate calculations, and for this reason, the scheme is called direct MRAC.

Adaptive Control of Linear Time-Invariant Systems, Fig. 1 Structure of (a) indirect MRAC, (b) direct MRAC

The division of MRAC into indirect and direct is, in general, unique to MRC structures, and it is possible due to the fact that the inverse maps in (2) and (3) exist, which is a direct consequence of the control objective and the assumptions the plant and reference model are required to satisfy for the control law to exist. These assumptions are summarized below:

Plant Assumptions: $G_p(s)$ is minimum phase, i.e., has stable zeros; its relative degree, $n^* =$
number of poles $-$ number of zeros, is known; and an upper bound $n$ on its order is also known. In addition, the sign of its high-frequency gain is known, even though this assumption can be relaxed with additional complexity.

Reference Model Assumptions: $W_m(s)$ has stable poles and zeros, its relative degree is equal to $n^*$, that of the plant, and its order is equal to or less than the one assumed for the plant, i.e., $n$.

The above assumptions are also used to meet the control objective in the case of known parameters, and therefore the minimum phase and relative degree assumptions are characteristics of the control objective and do not arise because of adaptive control considerations. The relative degree matching is used to avoid the need to differentiate signals in the control law. The minimum phase assumption comes from the fact that the only way for the control law to force the closed-loop plant transfer function to be equal to that of the reference model is to cancel the zeros of the plant using feedback and replace them with those of the reference model using a feedforward term. Such zero-pole cancelations are possible if the zeros are stable, i.e., the plant is minimum phase; otherwise stability cannot be guaranteed for nonzero initial conditions and/or inexact cancelations.

The design of MRAC in Fig. 1 has additional variations depending on how the adaptive law is designed. If the reference model is chosen to be strictly positive real (SPR), which limits its transfer function and that of the plant to have relative degree 1, the derivation of the adaptive law and stability analysis is fairly straightforward, and for this reason, this class of MRAC schemes attracted a lot of interest. As the relative degree changes to 2, the design becomes more complex, as in order to use the SPR property, the CE control law has to be modified by adding an extra nonlinear term. The stability analysis remains simple, as a single Lyapunov function can be used to establish stability. As the relative degree increases further, the design complexity increases by requiring the addition of more nonlinear terms in the CE control law (Ioannou and Fidan 2006; Ioannou and Sun 1996). The simplicity of using a single Lyapunov function analysis for stability remains, however. This approach covers both direct and indirect MRAC and leads to adaptive laws which contain no normalization signals (Ioannou and Fidan 2006; Ioannou and Sun 1996). A more straightforward design approach is based on the CE principle, which separates the control design from the parameter estimation part and leads to a much wider class of MRAC, which can be direct or indirect. In this case, the adaptive laws need to be normalized for stability, and the analysis is far more complicated than the approach based on SPR with no normalization. An example of such a direct MRAC scheme for the case of known sign of the high-frequency gain, which is assumed to be positive for both plant and reference model, is listed below:

Control law:

$$u_p = \theta_1^T(t)\frac{\alpha(s)}{\Lambda(s)}u_p + \theta_2^T(t)\frac{\alpha(s)}{\Lambda(s)}y_p + \theta_3(t)y_p + c_0(t)r = \theta^T(t)\omega, \qquad (4)$$

where $\alpha(s) \triangleq \alpha_{n-2}(s) = [s^{n-2}, s^{n-3}, \ldots, s, 1]^T$ for $n \geq 2$, $\alpha(s) \triangleq 0$ for $n = 1$, and $\Lambda(s)$ is a monic polynomial with stable roots and degree $n - 1$ having the numerator of $W_m(s)$ as a factor.

Adaptive law:

$$\dot{\theta} = -\Gamma \varepsilon \phi, \qquad (5)$$

where $\Gamma$ is a positive definite matrix referred to as the adaptive gain, $\varepsilon = \frac{e_1 - \hat{e}_1}{m_s^2}$ is the normalized estimation error with $\hat{e}_1 = u_f - \theta^T \phi$ and $m_s^2 = 1 + \phi^T \phi + u_f^2$, $\phi = W_m(s)\omega$, and $u_f = W_m(s)u_p$.

The stability properties of the above direct MRAC scheme, which are typical for all classes of MRAC, are the following (Ioannou and Fidan 2006; Ioannou and Sun 1996): (i) all signals in the closed-loop plant are bounded, and the tracking error $e_1$ converges to zero asymptotically; and (ii) if the plant transfer function contains no zero-pole cancelations and $r$ is sufficiently rich of order $2n$, i.e., it contains at least $n$ distinct frequencies, then the parameter error $|\tilde{\theta}| = |\theta - \theta^*|$ and the tracking error $e_1$ converge to zero exponentially fast.
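For intuition, the relative-degree-1, Lyapunov/SPR-based case discussed above can be illustrated on a first-order plant. The sketch below is a standard textbook-style example under stated assumptions, not a scheme from this entry: the plant, reference model, gains, and reference signal are all illustrative, and the matching function (2) here reduces to $k_x^* = (a + a_m)/b$, $k_r^* = b_m/b$.

```python
import math

def simulate_direct_mrac(a=1.0, b=2.0, am=4.0, bm=4.0,
                         gamma=2.0, dt=1e-3, t_end=50.0):
    """Direct MRAC for the scalar plant y' = a*y + b*u, with a, b
    unknown to the controller (only sign(b) > 0 assumed), reference
    model ym' = -am*ym + bm*r, control u = -kx*y + kr*r, and the
    Lyapunov-based adaptive laws kx' = gamma*e1*y, kr' = -gamma*e1*r,
    where e1 = y - ym.  The matching gains are kx* = (a + am)/b
    and kr* = bm/b.
    """
    y = ym = kx = kr = 0.0
    e1_log = []
    for i in range(int(t_end / dt)):
        t = i * dt
        r = math.sin(0.5 * t) + math.sin(2.0 * t)  # sufficiently rich
        e1 = y - ym
        u = -kx * y + kr * r
        # Forward-Euler integration of plant, model, and adaptive laws.
        y_next = y + dt * (a * y + b * u)
        ym_next = ym + dt * (-am * ym + bm * r)
        kx += dt * gamma * e1 * y
        kr += dt * (-gamma * e1 * r)
        y, ym = y_next, ym_next
        e1_log.append(e1)
    return e1_log, kx, kr

e1_log, kx, kr = simulate_direct_mrac()
# The tracking error shrinks as (kx, kr) adapt toward (2.5, 2.0).
```

With the Lyapunov function $V = \frac{1}{2}e_1^2 + \frac{b}{2\gamma}(\tilde{k}_x^2 + \tilde{k}_r^2)$, these update laws give $\dot{V} = -a_m e_1^2 \leq 0$, which is the single-Lyapunov-function argument mentioned above.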
Adaptive Control of Linear Time-Invariant Systems 9

Adaptive Pole Placement Control

Let us consider the SISO LTI plant:

y_p = G_p(s) u_p,   G_p(s) = Z_p(s)/R_p(s),   (6)

where G_p(s) is proper and R_p(s) is a monic polynomial. The control objective is to choose the plant input u_p so that the closed-loop poles are assigned to those of a given monic Hurwitz polynomial A*(s), and y_p is required to follow a certain class of reference signals y_m assumed to satisfy Q_m(s) y_m = 0, where Q_m(s) is known as the internal model of y_m and is designed to have all roots in Re{s} ≤ 0 with no repeated roots on the jω-axis. The polynomial A*(s), referred to as the desired closed-loop characteristic polynomial, is chosen based on the closed-loop performance requirements. To meet the control objective, we make the following assumptions about the plant:

P1. G_p(s) is strictly proper with known degree, R_p(s) is a monic polynomial whose degree n is known, and Q_m(s)Z_p(s) and R_p(s) are coprime.

Assumption P1 allows Z_p and R_p to be non-Hurwitz, in contrast to the MRAC case where Z_p is required to be Hurwitz.

The design of the APPC scheme is based on the CE principle. The plant parameters are estimated at each time t and used to calculate the controller parameters that meet the control objective for the estimated plant as follows. Using (6), the plant equation can be expressed in a form convenient for parameter estimation via the model (Goodwin and Sin 1984; Ioannou and Fidan 2006; Ioannou and Sun 1996):

z = θ_p*ᵀ φ,

where z = (s^n/Λ_p(s)) y_p, θ_p* = [b*ᵀ, a*ᵀ]ᵀ, φ = [(α_{n−1}ᵀ(s)/Λ_p(s)) u_p, −(α_{n−1}ᵀ(s)/Λ_p(s)) y_p]ᵀ, α_{n−1}(s) = [s^{n−1}, …, s, 1]ᵀ, a* = [a_{n−1}, …, a_0]ᵀ, b* = [b_{n−1}, …, b_0]ᵀ, and Λ_p(s) is a Hurwitz monic design polynomial. As an example of a parameter estimation algorithm, we consider the gradient algorithm

θ̇_p = Γεφ,   ε = (z − θ_pᵀφ)/m_s^2,   m_s^2 = 1 + φᵀφ,   (7)

where Γ = Γᵀ > 0 is the adaptive gain and θ_p = [b̂_{n−1}, …, b̂_0, â_{n−1}, …, â_0]ᵀ are the estimated plant parameters, which can be used to form the estimated plant polynomials R̂_p(s, t) = s^n + â_{n−1}(t)s^{n−1} + … + â_1(t)s + â_0(t) and Ẑ_p(s, t) = b̂_{n−1}(t)s^{n−1} + … + b̂_1(t)s + b̂_0(t) of R_p(s) and Z_p(s), respectively, at each time t. The adaptive control law is given as

u_p = (Λ(s) − L̂(s, t)Q_m(s)) (1/Λ(s)) u_p − P̂(s, t) (1/Λ(s)) (y_p − y_m),   (8)

where L̂(s, t) and P̂(s, t) are obtained by solving the polynomial equation L̂(s, t) · Q_m(s) · R̂_p(s, t) + P̂(s, t) · Ẑ_p(s, t) = A*(s) at each time t. The operation X(s, t) · Y(s, t) denotes a multiplication of polynomials where s is simply treated as a variable. The existence and uniqueness of L̂(s, t) and P̂(s, t) is guaranteed provided R̂_p(s, t) · Q_m(s) and Ẑ_p(s, t) are coprime at each frozen time t. The adaptive laws that generate the coefficients of R̂_p(s, t) and Ẑ_p(s, t) cannot guarantee this property, which means that at certain points in time the solution L̂(s, t), P̂(s, t) may not exist. This problem is known as the stabilizability problem in indirect APPC, and further modifications are needed in order to handle it (Goodwin and Sin 1984; Ioannou and Fidan 2006; Ioannou and Sun 1996). Assuming that the stabilizability condition holds at each time t, it can be shown (Goodwin and Sin 1984; Ioannou and Fidan 2006; Ioannou and Sun 1996) that all signals are bounded and the tracking error converges to zero with time. Other indirect adaptive pole placement control schemes include adaptive linear quadratic control (Ioannou and Fidan 2006; Ioannou and Sun 1996). In principle, any nonadaptive control scheme can be made adaptive by replacing the unknown parameters with their estimates in the calculation of the controller parameters. The design of direct APPC schemes is not possible in general as the map
between the plant and controller parameters is nonlinear, and the plant parameters cannot be expressed as a convenient function of the controller parameters. This prevents parametrization of the plant transfer function with respect to the controller parameters as done in the case of MRC. In special cases where such parametrization is possible, such as in MRAC, which can be viewed as a special case of APPC, the design of direct APPC is possible. Chapters on ▶ Adaptive Control, Overview, ▶ Robust Adaptive Control, and ▶ History of Adaptive Control provide additional information regarding MRAC and APPC.

Search Methods, Multiple Models, and Switching Schemes

One of the drawbacks of APPC is the stabilizability condition, which requires the estimated plant at each time t to satisfy the detectability and stabilizability condition that is necessary for the controller parameters to exist. Since the adaptive law cannot guarantee such a property, an approach emerged that involves the pre-calculation of a set of controllers based on the partitioning of the plant parameter space. The problem then becomes one of identifying which one of the controllers is the most appropriate one. The switching to the "best" possible controller could be based on some logic that is driven by some cost index, multiple estimation models, and other techniques (Fekri et al. 2007; Hespanha et al. 2003; Kuipers and Ioannou 2010; Morse 1996; Narendra and Balakrishnan 1997; Stefanovic and Safonov 2011). One of the drawbacks of this approach is that it is difficult, if at all possible, to find a finite set of stabilizing controllers that cover the whole unknown parameter space, especially for high-order plants. If found, its dimension may be so large as to make it impractical. Another drawback, present in all adaptive schemes, is that in the absence of persistently exciting signals, which guarantee that the input/output data have sufficient information about the unknown plant parameters, there is no guarantee that the controller the scheme converged to is indeed a stabilizing one. In other words, if switching is disengaged or the adaptive law is switched off, there is no guarantee that a small disturbance will not drive the corresponding LTI scheme unstable. Nevertheless, these techniques allow the incorporation of well-established robust control techniques in designing a priori the set of controller candidates. The problem is that if the plant parameters change in a way not accounted for a priori, no controller from the set may be stabilizing, leading to an unstable system.

Robust Adaptive Control

The MRAC and APPC schemes presented above are designed for LTI systems. Due to the adaptive law, the closed-loop system is no longer LTI but nonlinear and time varying. It has been shown using simple examples that the pure integral action of the adaptive law could cause parameter drift in the presence of small disturbances and/or unmodeled dynamics (Ioannou and Fidan 2006; Ioannou and Kokotovic 1983; Ioannou and Sun 1996), which could then excite the unmodeled dynamics and lead to instability. Modifications to counteract these possible instabilities led to the field of robust adaptive control, whose focus was to modify the adaptive law in order to guarantee robustness with respect to disturbances, unmodeled dynamics, time-varying parameters, classes of nonlinearities, etc., by using techniques such as normalizing signals, projection, and fixed and switching sigma modification.

Cross-References

▶ Adaptive Control, Overview
▶ History of Adaptive Control
▶ Model Reference Adaptive Control
▶ Robust Adaptive Control
▶ Switching Adaptive Control

Bibliography

Astrom K, Wittenmark B (1995) Adaptive control. Addison-Wesley, Reading
Egardt B (1979) Stability of adaptive controllers. Springer, New York
Fekri S, Athans M, Pascoal A (2007) Robust multiple model adaptive control (RMMAC): a case study. Int J Adapt Control Signal Process 21(1):1–30
Goodwin G, Sin K (1984) Adaptive filtering prediction and control. Prentice-Hall, Englewood Cliffs
Hespanha JP, Liberzon D, Morse A (2003) Hysteresis-based switching algorithms for supervisory control of uncertain systems. Automatica 39(2):263–272
Ioannou P, Fidan B (2006) Adaptive control tutorial. SIAM, Philadelphia
Ioannou P, Kokotovic P (1983) Adaptive systems with reduced models. Springer, Berlin/New York
Ioannou P, Sun J (1996) Robust adaptive control. Prentice-Hall, Upper Saddle River
Kuipers M, Ioannou P (2010) Multiple model adaptive control with mixing. IEEE Trans Autom Control 55(8):1822–1836
Landau Y (1979) Adaptive control: the model reference approach. Marcel Dekker, New York
Landau I, Lozano R, M'Saad M (1998) Adaptive control. Springer, New York
Morse A (1996) Supervisory control of families of linear set-point controllers part I: exact matching. IEEE Trans Autom Control 41(10):1413–1431
Narendra K, Annaswamy A (1989) Stable adaptive systems. Prentice Hall, Englewood Cliffs
Narendra K, Balakrishnan J (1997) Adaptive control using multiple models. IEEE Trans Autom Control 42(2):171–187
Sastry S, Bodson M (1989) Adaptive control: stability, convergence and robustness. Prentice Hall, Englewood Cliffs
Stefanovic M, Safonov M (2011) Safe adaptive control: data-driven stability analysis and robust synthesis. Lecture notes in control and information sciences, vol 405. Springer, Berlin
Tao G (2003) Adaptive control design and analysis. Wiley-Interscience, Hoboken

Adaptive Control, Overview

Richard Hume Middleton
School of Electrical Engineering and Computer Science, The University of Newcastle, Callaghan, NSW, Australia

Abstract

Adaptive control describes a range of techniques for altering control behavior using measured signals to achieve high control performance under uncertainty. The theory and practice of adaptive control has matured in many areas. This entry gives an overview of adaptive control with pointers to more detailed specific topics.

Keywords

Adaptive control; Estimation

Introduction

What Is Adaptive Control

Feedback control has a long history of using sensing, decision, and actuation elements to achieve an overall goal. The general structure of a control system may be illustrated in Fig. 1. It has long been known that high fidelity control relies on knowledge of the system to be controlled. For example, in most cases, knowledge of the plant gain and/or time constants (represented by θ_p in Fig. 1) is important in feedback control design. In addition, disturbance characteristics (e.g., the frequency of a sinusoidal disturbance), θ_d in Fig. 1, are important in feedback compensator design. Many control design and synthesis techniques are model based, using prior knowledge of both model structure and parameters. In other cases, a fixed controller structure is used, and the controller parameters, θ_C in Fig. 1, are tuned empirically during control system commissioning.

Adaptive Control, Overview, Fig. 1 General control and adaptive control diagram (plant G(θ_p) with actuation u(t) and measurements y(t); disturbance d(t) generated by D(θ_d); measurement noise n(t); controller K(θ_C) driven by reference signals r(t))
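Obtaining this kind of plant knowledge from measured data is the estimation half of adaptive control. As a hedged sketch (not part of this entry), a textbook recursive least-squares (RLS) identifier for a first-order sampled plant y[k+1] = a·y[k] + b·u[k] can be written as follows; the plant numbers and gains are invented for the demo.

```python
# Recursive least-squares (RLS) estimation of the parameters of a first-order
# plant y[k+1] = a*y[k] + b*u[k] from input-output data alone -- the kind of
# plant knowledge (gain, time constant) that model-based design relies on.
import random

random.seed(0)
a_true, b_true = 0.9, 0.5             # true parameters, unknown to the estimator
theta = [0.0, 0.0]                    # estimates of [a, b]
P = [[1000.0, 0.0], [0.0, 1000.0]]    # covariance of the estimate

y = 0.0
for k in range(200):
    u = random.uniform(-1.0, 1.0)     # exciting input
    y_next = a_true * y + b_true * u  # measured output at the next sample
    phi = [y, u]                      # regressor
    # gain: K = P*phi / (1 + phi'*P*phi)
    Pphi = [P[0][0]*phi[0] + P[0][1]*phi[1], P[1][0]*phi[0] + P[1][1]*phi[1]]
    denom = 1.0 + phi[0]*Pphi[0] + phi[1]*Pphi[1]
    K = [Pphi[0]/denom, Pphi[1]/denom]
    # parameter update driven by the prediction error
    err = y_next - (theta[0]*phi[0] + theta[1]*phi[1])
    theta = [theta[0] + K[0]*err, theta[1] + K[1]*err]
    # covariance downdate: P <- P - K*(phi'*P)
    phiP = [phi[0]*P[0][0] + phi[1]*P[1][0], phi[0]*P[0][1] + phi[1]*P[1][1]]
    P = [[P[0][0] - K[0]*phiP[0], P[0][1] - K[0]*phiP[1]],
         [P[1][0] - K[1]*phiP[0], P[1][1] - K[1]*phiP[1]]]
    y = y_next

print(theta)  # converges to the true values [0.9, 0.5]
```

An indirect adaptive controller would redesign its gains at each step from such estimates; a direct scheme would adapt the controller parameters themselves.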
However, if the plant parameters vary widely with time or have large uncertainties, these approaches may be inadequate for high-performance control. There are two main ways of approaching high-performance control with unknown plant and disturbance characteristics:
1. Robust control (▶ Optimization Based Robust Control), wherein a controller is designed to perform adequately despite the uncertainties. Variable structure control may have very high levels of robustness in some cases and therefore is a special class of robust nonlinear control.
2. Adaptive control, where the controller learns and adjusts its strategy based on measured data. This frequently takes the form where the controller parameters, θ_C, are time-varying functions that depend on the available data (y(t), u(t), and r(t)). Adaptive control has close links to intelligent control (including neural control (▶ Neural Control and Approximate Dynamic Programming), where specific types of learning are considered) and also to stochastic adaptive control (▶ Stochastic Adaptive Control).

Robust control is most useful when there are large unmodeled dynamics (i.e., structural uncertainties), relatively high levels of noise, or rapid and unpredictable parameter changes. Conversely, for slow or largely predictable parameter variations, with relatively well-known model structure and limited noise levels, adaptive control may provide a very useful tool for high-performance control (Åström and Wittenmark 2008).

Varieties of Adaptive Control

One practical variant of adaptive control is controller auto-tuning (▶ Autotuning). Auto-tuning is particularly useful for PID and similar controllers and involves a specific phase of signal injection, followed by analysis, PID gain computation, and implementation. These techniques are an important aid to commissioning and maintenance of distributed control systems.

There are also large classes of adaptive controllers that continuously monitor the plant input-output signals to adjust the control strategy. These adjustments are often parametrized by a relatively small number of coefficients, θ_C. These include schemes where the controller parameters are directly adjusted using measurable data (also referred to as "implicit," since there is no explicit plant model generated). Early examples of this often included model reference adaptive control (▶ Model Reference Adaptive Control). Other schemes (Middleton et al. 1988) explicitly estimate a plant model, θ_P; the online control design is then performed from this estimate, and the adaptation of the controller parameters θ_C is therefore indirect. This then led on to a range of other adaptive control techniques applicable to linear systems (▶ Adaptive Control of Linear Time-Invariant Systems).

There have been significant questions concerning the sensitivity of some adaptive control algorithms to unmodeled dynamics, time-varying systems, and noise (Ioannou and Kokotovic 1984; Rohrs et al. 1985). This prompted a very active period of research to analyze and redesign adaptive control to provide suitable robustness (▶ Robust Adaptive Control) (e.g., Anderson et al. 1986; Ioannou and Sun 2012) and parameter tracking for time-varying systems (e.g., Kreisselmeier 1986; Middleton and Goodwin 1988). Work in this area further spread to nonparametric methods, such as switching, or supervisory, adaptive control (▶ Switching Adaptive Control) (e.g., Fu and Barmish 1986; Morse et al. 1992). In addition, there has been a great deal of work on the more difficult problem of adaptive control for nonlinear systems (▶ Nonlinear Adaptive Control).

A further adaptive control technique is extremum seeking control (▶ Extremum Seeking Control). In extremum seeking (or self-optimizing) control, the desired reference for the system is unknown; instead, we wish to maximize (or minimize) some variable in the system (Ariyur and Krstic 2003). These techniques have quite distinct modes of operation that have proven important in a range of applications.
A final control algorithm that has nonparametric features is iterative learning control (▶ Iterative Learning Control) (Amann et al. 1996; Moore 1993). This control scheme considers a system with a highly structured, namely repetitive finite-run, control problem. In this case, by taking a nonparametric approach of utilizing information from previous run(s), in many cases near-perfect asymptotic tracking can be achieved.

Adaptive control has a rich history (▶ History of Adaptive Control) and has been established as an important tool for some classes of control problems.

Cross-References

▶ Adaptive Control of Linear Time-Invariant Systems
▶ Autotuning
▶ Extremum Seeking Control
▶ History of Adaptive Control
▶ Iterative Learning Control
▶ Model Reference Adaptive Control
▶ Neural Control and Approximate Dynamic Programming
▶ Nonlinear Adaptive Control
▶ Optimization Based Robust Control
▶ Robust Adaptive Control
▶ Stochastic Adaptive Control
▶ Switching Adaptive Control

Bibliography

Amann N, Owens DH, Rogers E (1996) Iterative learning control using optimal feedback and feedforward actions. Int J Control 65(2):277–293
Anderson BDO, Bitmead RR, Johnson CR, Kokotovic PV, Kosut RL, Mareels IMY, Praly L, Riedle BD (1986) Stability of adaptive systems: passivity and averaging analysis. MIT, Cambridge
Ariyur KB, Krstic M (2003) Real-time optimization by extremum-seeking control. Wiley, New Jersey
Åström KJ, Wittenmark B (2008) Adaptive control. Courier Dover Publications, Mineola
Fu M, Barmish BR (1986) Adaptive stabilization of linear systems via switching control. IEEE Trans Autom Control 31(12):1097–1103
Ioannou PA, Kokotovic PV (1984) Instability analysis and improvement of robustness of adaptive control. Automatica 20(5):583–594
Ioannou PA, Sun J (2012) Robust adaptive control. Dover Publications, Mineola/New York
Kreisselmeier G (1986) Adaptive control of a class of slowly time-varying plants. Syst Control Lett 8(2):97–103
Middleton RH, Goodwin GC (1988) Adaptive control of time-varying linear systems. IEEE Trans Autom Control 33(2):150–155
Middleton RH, Goodwin GC, Hill DJ, Mayne DQ (1988) Design issues in adaptive control. IEEE Trans Autom Control 33(1):50–58
Moore KL (1993) Iterative learning control for deterministic systems. Advances in industrial control series. Springer, London/New York
Morse AS, Mayne DQ, Goodwin GC (1992) Applications of hysteresis switching in parameter adaptive control. IEEE Trans Autom Control 37(9):1343–1354
Rohrs C, Valavani L, Athans M, Stein G (1985) Robustness of continuous-time adaptive control algorithms in the presence of unmodeled dynamics. IEEE Trans Autom Control 30(9):881–889

Adaptive Cruise Control

Rajesh Rajamani
Department of Mechanical Engineering, University of Minnesota, Twin Cities, Minneapolis, MN, USA

Abstract

This chapter discusses advanced cruise control automotive technologies, including adaptive cruise control (ACC), in which spacing control, speed control, and a number of transitional maneuvers must be performed. The ACC system must satisfy difficult performance requirements of vehicle stability and string stability. The technical challenges involved and the control design techniques utilized in ACC system design are presented.

Keywords

Collision avoidance; String stability; Traffic stability; Vehicle following
Introduction

Adaptive cruise control (ACC) is an extension of cruise control. An ACC vehicle includes a radar, a lidar, or other sensor that measures the distance to any preceding vehicle in the same lane on the highway. In the absence of preceding vehicles, the speed of the car is controlled to a driver-desired value. In the presence of a preceding vehicle, the controller determines whether the vehicle should switch from speed control to spacing control. In spacing control, the distance to the preceding car is controlled to a desired value.

A different form of advanced cruise control is a forward collision avoidance (FCA) system. An FCA system uses a distance sensor to determine if the vehicle is approaching a car ahead too quickly and will automatically apply brakes to minimize the chances of a forward collision. For the 2013 model year, 29 % of vehicles have forward collision warning as an available option and 12 % include autonomous braking for a full FCA system. Examples of models in which an FCA system is standard are the Mercedes Benz G-class and the Volvo S-60, S-80, XC-60, and XC-70.

It should be noted that an FCA system does not involve steady-state vehicle following. An ACC system, on the other hand, involves control of speed and spacing to desired steady-state values. ACC systems have been in the market in Japan since 1995, in Europe since 1998, and in the US since 2000. An ACC system provides enhanced driver comfort and convenience by allowing extended operation of the cruise control option even in the presence of other traffic.

Controller Architecture

The ACC system has two modes of steady-state operation: speed control and vehicle following (i.e., spacing control). Speed control is traditional cruise control and is a well-established technology. A proportional-integral controller based on feedback of vehicle speed (calculated from rotational wheel speeds) is used in cruise control (Rajamani 2012).

Adaptive Cruise Control, Fig. 1 Structure of longitudinal control system (the upper controller sends the desired acceleration to the lower controller and receives fault messages; the lower controller generates the actuator inputs)

Controller design for vehicle following is the primary topic of discussion in the sections titled "Vehicle Following Requirements" and "String Stability Analysis" in this chapter. Transitional maneuvers and transitional control algorithms are discussed in the section titled "Transitional Maneuvers" in this chapter.

The longitudinal control system architecture for an ACC vehicle is typically designed to be hierarchical, with an upper-level controller and a lower-level controller, as shown in Fig. 1. The upper-level controller determines the desired acceleration for the vehicle. The lower-level controller determines the throttle and/or brake commands required to track the desired acceleration. Vehicle dynamic models, engine maps, and nonlinear control synthesis techniques are used in the design of the lower controller (Rajamani 2012). This chapter will focus only on the design of the upper controller, also known as the ACC controller.

As far as the upper-level controller is concerned, the plant model for control design is

ẍ_i = u   (1)

where the subscript i denotes the i-th car in a string of consecutive ACC cars. The acceleration of the car is thus assumed to be the control input. However, due to the finite bandwidth associated
Adaptive Cruise Control, Fig. 2 String of adaptive cruise control vehicles (vehicles at positions x_{i+1}, x_i, and x_{i−1})

with the lower level controller, each car is actually expected to track its desired acceleration imperfectly. The objective of the upper-level controller design is therefore stated as that of meeting required performance specifications robustly in the presence of a first-order lag in the lower-level controller performance:

ẍ_i = (1/(τs + 1)) ẍ_{i,des} = (1/(τs + 1)) u_i.   (2)

Equation (1) is thus assumed to be the nominal plant model, while the performance specifications have to be met even if the actual plant model were given by Eq. (2). The lag τ typically has a value between 0.2 and 0.5 s (Rajamani 2012).

Vehicle Following Requirements

In the vehicle following mode of operation, the ACC vehicle maintains a desired spacing from the preceding vehicle. The two important performance specifications that the vehicle following control system must satisfy are individual vehicle stability and string stability.

(a) Individual vehicle stability
Consider a string of vehicles on the highway using a longitudinal control system for vehicle following, as shown in Fig. 2. Let x_i be the location of the i-th vehicle measured from an inertial reference. The spacing error for the i-th vehicle (the ACC vehicle under consideration) is then defined as

δ_i = x_i − x_{i−1} + L_des.   (3)

Here, L_des is the desired spacing and includes the preceding vehicle length ℓ_{i−1}. L_des could be chosen as a function of variables such as the vehicle speed ẋ_i. The ACC control law is said to provide individual vehicle stability if the spacing error of the ACC vehicle converges to zero when the preceding vehicle is operating at constant speed:

ẍ_{i−1} → 0  ⇒  δ_i → 0.   (4)

(b) String stability
The spacing error is expected to be non-zero during acceleration or deceleration of the preceding vehicle. It is important then to describe how the spacing error would propagate from vehicle to vehicle in a string of ACC vehicles during acceleration. The string stability of a string of ACC vehicles refers to a property in which spacing errors are guaranteed not to amplify as they propagate towards the tail of the string (Swaroop and Hedrick 1996).

String Stability Analysis

In this section, mathematical conditions that ensure string stability are provided.

Let δ_i and δ_{i−1} be the spacing errors of consecutive ACC vehicles in a string. Let Ĥ(s) be the transfer function relating these errors:

Ĥ(s) = (δ̂_i/δ̂_{i−1})(s).   (5)

The following two conditions can be used to determine if the system is string stable:
(a) The transfer function Ĥ(s) should satisfy

‖Ĥ(s)‖_∞ ≤ 1.   (6)
(b) The impulse response function h(t) corresponding to Ĥ(s) should not change sign (Swaroop and Hedrick 1996), i.e.,

h(t) > 0  ∀ t ≥ 0.   (7)

The reasons for these two requirements to be satisfied are described in Rajamani (2012). Roughly speaking, Eq. (6) ensures that ‖δ_i‖_2 ≤ ‖δ_{i−1}‖_2, which means that the energy in the spacing error signal decreases as the spacing error propagates towards the tail of the string. Equation (7) ensures that the steady-state spacing errors of the vehicles in the string have the same sign. This is important because a positive spacing error implies that a vehicle is closer than desired, while a negative spacing error implies that it is further apart than desired. If the steady-state value of δ_i is positive while that of δ_{i−1} is negative, then this might be dangerous due to the vehicle being closer, even though in terms of magnitude δ_i might be smaller than δ_{i−1}. If conditions (6) and (7) are both satisfied, then ‖δ_i‖_∞ ≤ ‖δ_{i−1}‖_∞ (Rajamani 2012).

Constant Inter-vehicle Spacing

The ACC system only utilizes on-board sensors like radar and does not depend on inter-vehicle communication from other vehicles. Hence the only variables available as feedback for the upper controller are inter-vehicle spacing, relative velocity, and the ACC vehicle's own velocity.

Under the constant spacing policy, the spacing error of the i-th vehicle was defined in Eq. (3). If the acceleration of the vehicle can be instantaneously controlled, then it can be shown that a linear control system of the type

ẍ_i = −k_p δ_i − k_v δ̇_i   (8)

results in the following closed-loop transfer function between consecutive spacing errors:

Ĥ(s) = (δ̂_i/δ̂_{i−1})(s) = (k_p + k_v s)/(s^2 + k_v s + k_p).   (9)

Equation (9) describes the propagation of spacing errors along the vehicle string. All positive values of k_p and k_v guarantee individual vehicle stability. However, it can be shown that there are no positive values of k_p and k_v for which the magnitude of Ĥ(s) can be guaranteed to be less than unity at all frequencies. The details of this proof are available in Rajamani (2012). Thus, the constant spacing policy will always be string unstable.

Constant Time-Gap Spacing

Since the constant spacing policy is unsuitable for autonomous control, a better spacing policy that can ensure both individual vehicle stability and string stability must be used. The constant time-gap (CTG) spacing policy is such a spacing policy. In the CTG spacing policy, the desired inter-vehicle spacing is not constant but varies with velocity. The spacing error is defined as

δ_i = x_i − x_{i−1} + L_des + h ẋ_i.   (10)

The parameter h is referred to as the time gap. The following controller based on the CTG spacing policy can be used to regulate the spacing error at zero (Swaroop et al. 1994):

ẍ_{i,des} = −(1/h)(ẋ_i − ẋ_{i−1} + λ δ_i).   (11)

With this control law, it can be shown that the spacing errors of successive vehicles δ_i and δ_{i−1} are independent of each other:

δ̇_i = −λ δ_i.   (12)

Thus, δ_i is independent of δ_{i−1} and is expected to converge to zero as long as λ > 0. However, this result is only true if any desired acceleration can be instantaneously obtained by the vehicle, i.e., if τ = 0. In the presence of the lower controller and actuator dynamics given by Eq. (2), it can be shown that the dynamic relation between δ_i and δ_{i−1} in the transfer function domain is
Ĥ(s) = (s + λ)/(hτ s^3 + h s^2 + (1 + λh)s + λ).   (13)

The string stability of this system can be analyzed by checking if the magnitude of the above transfer function is always less than or equal to 1. It can be shown that this is the case at all frequencies if and only if (Rajamani 2012)

h ≥ 2τ.   (14)

Further, if Eq. (14) is satisfied, then it is also guaranteed that one can find a value of λ such that Eq. (7) is satisfied. Thus the condition (14) is necessary (Swaroop and Hedrick 1996) for string stability. Since the typical value of τ is of the order of 0.5 s, Eq. (14) implies that ACC vehicles must maintain at least a 1-s time gap between vehicles for string stability.

Transitional Maneuvers

While under speed control, an ACC vehicle might suddenly encounter a new vehicle in its lane (either due to a lane change or due to a slower moving preceding vehicle). The ACC vehicle must then decide whether to continue to operate under the speed control mode, transition to the vehicle following mode, or initiate hard braking. If a transition to vehicle following is required, a transitional trajectory that will bring the ACC vehicle to its steady-state following distance needs to be designed. Similarly, a decision on the mode of operation and design of a transitional trajectory are required when an ACC vehicle loses its target. The regular CTG control law cannot directly be used to follow a newly encountered vehicle; see Rajamani (2012) for illustrative examples.

When a new target vehicle is encountered by the ACC vehicle, a "range – range rate" diagram can be used (Fancher and Bareket 1994) to decide if
(a) The vehicle should use speed control.
(b) The vehicle should use spacing control (with a defined transition trajectory in which desired spacing varies slowly with time).
(c) The vehicle should brake as hard as possible in order to avoid a crash.
The maximum allowable values for acceleration and deceleration need to be taken into account in making these decisions.

For the range – range rate (R − Ṙ) diagram, define range R and range rate Ṙ as

R = x_{i−1} − x_i   (15)

Ṙ = ẋ_{i−1} − ẋ_i = V_{i−1} − V_i   (16)

where x_{i−1}, x_i, V_{i−1}, and V_i are the inertial positions and velocities of the preceding vehicle and the ACC vehicle, respectively.

Adaptive Cruise Control, Fig. 3 Range vs. range-rate diagram (axes R vs. dR/dt; region 1: velocity control; region 2: headway control toward the desired spacing R_H, entered via the switching line for starting headway control; region 3: too close, crash)

A typical R − Ṙ diagram is shown in Fig. 3 (Fancher and Bareket 1994). Depending on the
Adaptive Cruise Control, Fig. 4 Switching line for spacing control (trajectory A–B–C in the R − Ṙ plane, approaching the desired final distance R_final)

measured real-time values of R and Ṙ and the R − Ṙ diagram in Fig. 3, the ACC system determines the mode of longitudinal control. For instance, in region 1, the vehicle continues to operate under speed control. In region 2, the vehicle operates under spacing control. In region 3, the vehicle decelerates at the maximum allowed deceleration so as to try and avoid a crash.

The switching line from speed to spacing control is given by

R = T Ṙ + R_final   (17)

where T is the slope of the switching line. When a slower vehicle is encountered at a distance larger than the desired final distance R_final, the switching line shown in Fig. 4 can be used to determine when and whether the vehicle should switch to spacing control. If the distance R is greater than that given by the line, speed control should be used.

The overall strategy (shown by trajectory ABC) is to first reduce the gap at constant Ṙ and then follow the desired spacing given by the switching line of Eq. (17).

The control law during spacing control on this transitional trajectory is as follows. Depending on the value of Ṙ, determine R from Eq. (17). Then use R as the desired inter-vehicle spacing in the PD control law

ẍ_des = −k_p(x_i − R) − k_d(ẋ_i − Ṙ).   (18)

The trajectory of the ACC vehicle during constant deceleration is a parabola on the R − Ṙ diagram (Rajamani 2012). The switching line should be such that travel along the line is comfortable and does not constitute high deceleration. The deceleration during coasting (zero throttle and zero braking) can be used to determine the slope of the switching line (Rajamani 2012). Note that string stability is not a concern during transitional maneuvers (Rajamani 2012).

Traffic Stability

In addition to individual vehicle stability and string stability, another type of stability analysis that has received significant interest in the ACC literature is traffic flow stability. Traffic flow stability refers to the stable evolution of traffic velocity and traffic density on a highway section, for given inflow and outflow conditions. One well-known result in the literature is that traffic flow is defined to be stable if ∂q/∂ρ is positive, i.e., as the density ρ of traffic increases, the traffic flow rate q must increase (Swaroop and Rajagopal 1999). If this condition is not satisfied, the highway section would be unable to accommodate any constant inflow of vehicles from an oncoming ramp. The steady-state traffic flow on the highway section would come to a stop if the ramp inflow did not stop (Swaroop and Rajagopal 1999).

It has been shown that the constant time-gap spacing policy used in ACC systems has a negative q − ρ slope and thus does not lead to traffic flow stability (Swaroop and Rajagopal 1999). It has also been shown that it is possible to design other spacing policies (in which the desired spacing between vehicles is a nonlinear function of speed, instead of being proportional to speed) that can provide stable traffic flow (Santhanakrishnan and Rajamani 2003).

The importance of traffic flow stability has not been fully understood by the research community. Traffic flow stability is likely to become important when the number of ACC vehicles
Adaptive Cruise Control 19

on the highway increase and their penetration percentage into vehicles on the road becomes significant.
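The negative q–ρ slope of the constant time-gap policy can be checked with a short numeric sketch. The vehicle length L and time gap h below are illustrative values, not taken from the cited papers:

```python
# Steady-state flow under a constant time-gap spacing policy:
# each vehicle occupies its length L plus the gap h*v, so the
# density is rho = 1/(L + h*v) and the flow is q = rho*v.
L = 5.0   # vehicle length [m] -- illustrative value
h = 1.5   # constant time gap [s] -- illustrative value

def flow(rho):
    """Flow rate q [veh/s] at density rho [veh/m] for this policy."""
    v = (1.0 / rho - L) / h      # speed consistent with the spacing policy
    return rho * v               # algebraically, q = (1 - rho*L)/h

rho_low, rho_high = 0.02, 0.05   # veh/m (i.e., 50 m and 20 m per vehicle)
slope = (flow(rho_high) - flow(rho_low)) / (rho_high - rho_low)
print(slope)                     # equals -L/h: flow falls as density rises
```

Since q = (1 − ρL)/h here, the slope ∂q/∂ρ = −L/h is negative for any positive L and h, consistent with the instability claim for the constant time-gap policy.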
Recent Automotive Market Developments

The latest versions of ACC systems on the market have been enhanced with collision warning, integrated brake support, and stop-and-go operation functionality.

The collision warning feature uses the same radar as the ACC system to detect moving vehicles ahead and determine whether driver intervention is required. In this case, visual and audio warnings are provided to alert the driver, and the brakes are pre-charged to allow quick deceleration. On Ford's ACC-equipped vehicles, the brakes are also automatically applied when the driver lifts the foot off the accelerator pedal in a detected collision warning scenario.

When enabled with stop-and-go functionality, the ACC system can also operate at low vehicle speeds in heavy traffic. The vehicle can be automatically brought to a complete stop when needed and restarted automatically. Stop-and-go is an expensive option and requires the use of multiple radar sensors on each car. For instance, the BMW ACC system uses two short-range and one long-range radar sensor for stop-and-go operation.

The 2013 versions of ACC on the Cadillac ATS and on the Mercedes Distronic systems are also being integrated with camera-based lateral lane position measurement systems. On the Mercedes Distronic systems, a camera steering assist system provides automatic steering, while on the Cadillac ATS, a camera-based system provides lane departure warnings.

Future Directions

Current ACC systems use only on-board sensors and do not use wireless communication with other vehicles. There is a likelihood of evolution of current systems into co-operative adaptive cruise control (CACC) systems which utilize wireless communication with other vehicles and highway infrastructure. This evolution could be facilitated by the dedicated short-range communications (DSRC) capability being developed by government agencies in the US, Europe, and Japan. In the US, DSRC is being developed with a primary goal of enabling communication between vehicles and with infrastructure to reduce collisions and support other safety applications. In CACC, wireless communication could provide acceleration signals from several preceding downstream vehicles. These signals could be used in better spacing policies and control algorithms to improve safety, ensure string stability, and improve traffic flow.

Cross-References

▶ Lane Keeping
▶ Vehicle Dynamics Control
▶ Vehicular Chains

Bibliography

Fancher P, Bareket Z (1994) Evaluating headway control using range versus range-rate relationships. Veh Syst Dyn 23(8):575–596
Rajamani R (2012) Vehicle dynamics and control, 2nd edn. Springer, New York. ISBN 978-1461414322
Santhanakrishnan K, Rajamani R (2003) On spacing policies for highway vehicle automation. IEEE Trans Intell Transp Syst 4(4):198–204
Swaroop D, Hedrick JK (1996) String stability of interconnected dynamic systems. IEEE Trans Autom Control 41(3):349–357
Swaroop D, Rajagopal KR (1999) Intelligent cruise control systems and traffic flow stability. Transp Res C Emerg Technol 7(6):329–352
Swaroop D, Hedrick JK, Chien CC, Ioannou P (1994) A comparison of spacing and headway control laws for automatically controlled vehicles. Veh Syst Dyn 23(8):597–625
Advanced Manipulation for Underwater Sampling

Giuseppe Casalino
University of Genoa, Genoa, Italy

Abstract

This entry deals with the kinematic self-coordination aspects to be managed by the parts of underwater floating manipulators, whenever employed for sample collection at the seafloor. Kinematic self-coordination is here intended as the autonomous ability exhibited by the system of specifying in closed loop the most appropriate reference velocities for its main constitutive parts (i.e., the supporting vehicle and the arm) in order to execute the sample collection within both safety and best operability conditions for the system, while also guaranteeing the needed "execution agility" in performing the task, particularly useful in case of repeated underwater collections. To this end, the devising and employment of a unifying control framework capable of guaranteeing the above properties will be outlined.

Such a framework is however intended to only represent the so-called Kinematic Control Layer (KCL) overlaying a Dynamic Control Layer (DCL), where the overall system dynamic and hydrodynamic effects are suitably accounted for, to the benefit of closed-loop tracking of the reference system velocities. Since the DCL design is carried out in a way which is substantially independent from the system mission(s), it will not constitute a specific topic of this entry, even if some orienting references about it will be provided.

At this entry's end, as a follow-up of the resulting structural invariance of the devised KCL framework, future challenges addressing much wider and more complex underwater applications will be foreseen, beyond the here-considered sample collection one.

Keywords

Kinematic control law (KCL); Manipulator; Motion priorities

Introduction

An automated system for underwater sampling is here intended to be an autonomous underwater floating manipulator (see Fig. 1) capable of collecting samples corresponding to an a priori assigned template. The snapshots of Fig. 1 outline the most recent realization of a system of this kind (completed in 2012 within the EU-funded project TRIDENT; Sanz et al. 2012) when in operation, which is characterized by a vehicle and an endowed 7-dof arm exhibiting comparable masses and inertia, thus resulting in a potentially faster and more agile design than the very few similar previous realizations.

Its general operational mode consists in exploring an assigned area of the seafloor, while executing a collection each time a feature corresponding to the assigned template is recognized (by the vehicle, endowed with a stereovision system) as a sample to be collected.

Thus the autonomous functionalities to be exhibited are the following (to be sequenced as they are listed, on an event-driven basis): (1) explore an assigned seabed area while visually performing model-based sample recognitions, (2) suspend the exploration and grasp a recognized sample, (3) deposit the sample inside an endowed container, and (4) then restart exploring till the next recognized sample.

Functionalities (1) and (4), since they do not require the arm usage, naturally reenter within the topics of navigation, patrolling, visual mapping, etc., which are typical of traditional AUVs and consequently will not be discussed here. Only functionality (2) will be discussed, since it is most distinctive of the considered system (often termed an I-AUV, with "I" for "Intervention") and because functionality (3) can be established along the same lines of (2) as a particular simpler case.
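The event-driven sequencing of functionalities (1)–(4) can be sketched as a small state machine. The state and event names below are illustrative stand-ins, not taken from the TRIDENT software:

```python
# A hypothetical event-driven sequencer for the four functionalities;
# state and event names are illustrative, not from the TRIDENT software.
EXPLORE, GRASP, DEPOSIT = "explore", "grasp", "deposit"

def next_state(state, event):
    """Advance the mission when its triggering event occurs."""
    if state == EXPLORE and event == "sample_recognized":
        return GRASP      # (2) suspend exploration and grasp the sample
    if state == GRASP and event == "sample_grasped":
        return DEPOSIT    # (3) deposit the sample inside the container
    if state == DEPOSIT and event == "sample_stored":
        return EXPLORE    # (4) restart exploring for the next sample
    return state          # irrelevant events leave the mission state unchanged

state = EXPLORE           # (1) explore while running model-based recognition
for event in ["sample_recognized", "sample_grasped", "sample_stored"]:
    state = next_state(state, event)
print(state)              # a full collection cycle returns to "explore"
```

Each full recognition–grasp–deposit cycle returns the system to exploration, matching the repeated-collection mode described above.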
[Advanced Manipulation for Underwater Sampling, Fig. 1 Snapshots showing the underwater floating manipulator TRIDENT when autonomously picking an identified object]

By then focusing on functionality (2), we must note how the sample grasping ultimate objective, which translates into a specific position/attitude to be reached by the end-effector, must however be achieved within the preliminary fulfillment of also other objectives, each one reflecting the need of guaranteeing the system operating within both its safety and best operability conditions. For instance, the arm's joint limits must be respected and the arm singular postures avoided. Moreover, since the sample position is estimated via the vehicle with a stereo camera, the sample must stay grossly centered inside its visual cone, since otherwise the visual feedback would be lost and the sample search would need to start again. Also, the sample must stay within suitable horizontal and vertical distance limits from the camera frame, in order for the vision algorithm to be well performing. And furthermore, in these conditions the vehicle should be maintained with an approximately horizontal attitude, for energy savings.

With the exception of the objective of making the end-effector position/attitude reach the grasping position, which is clearly an equality condition, its related safety/enabling objectives are instead represented by a set of inequality conditions (involving various system variables) whose achievement (accordingly with their safety/enabling role) must therefore deserve the highest priority.

System motions guaranteeing such prioritized objective achievements should moreover allow for a concurrent management of them (i.e., avoiding a sequential motion management whenever possible), which means requiring each objective to progress toward its achievement by, at each time instant, only exploiting the residual system mobility allowed by the current progresses of its higher priority objectives. Since the available system mobility will progressively increase during time, accordingly with the progressive achievement of all inequality objectives, this will guarantee the grasping objective to be also completed, by eventually progressing within adequate system safety and best operability conditions. In this way the system will also exhibit the necessary "agility" in executing its maneuvers, in a way faster than in case they were executed on a sequential motion basis.

The devising of an effective way to incorporate all the inequality and equality objectives within a uniform and computationally efficient task-priority-based algorithmic framework for underwater floating manipulators has been the
result of the developments outlined in the next section.

The developed framework however solely represents the so-called Kinematic Control Layer (KCL) of the overall control architecture, that is, the one in charge of the closed-loop real-time control generating the system velocity vector y as a reference signal, to be in turn concurrently tracked, via the action of the arm joint torques and vehicle thrusters, by an adequate underlying Dynamic Control Layer (DCL), where the overall dynamic and hydrodynamic effects are kept into account to the benefit of such velocity tracking. Since the DCL can actually be designed in a way substantially independent from the system mission(s), it will not constitute a specific topic of this entry. Its detailed dynamic-hydrodynamic model-based structuring, also including a stability analysis, can be found in Casalino (2011), together with a more detailed description of the upper-lying KCL, while more general references on underwater dynamic control aspects can be found, for instance, in Antonelli (2006).

Task-Priority-Based Control of Floating Manipulators

The above-outlined typical set of objectives (of inequality and/or equality types) to be achieved within a sampling mission are here formalized. Then some helpful generalizing definitions are given, prior to presenting the related unifying task-priority-based algorithmic framework to be used.

Inequality and Equality Objectives
One of the objectives, of inequality type, related to both arm safety and its operability is that of maintaining each joint within corresponding minimum and maximum limits, that is,

    q_im < q_i < q_iM ;   i = 1, 2, …, 7

Moreover, in order to have the arm operating with dexterity, its manipulability measure μ (Nakamura 1991; Yoshikawa 1985) must ultimately stay above a minimum threshold value, thus also requiring the achievement of the inequality type objective

    μ > μ_m

While the above objectives arise from inherently scalar variables, other objectives instead arise as conditions to be achieved within the Cartesian space, where each one of them can be conveniently expressed in terms of the modulus associated to a corresponding Cartesian vector variable.

To be more specific, let us, for instance, refer to the need of avoiding occlusions between the sample and the stereo camera, which might occasionally occur due to the arm link motions. Then such a need can, for instance, be translated into the ultimate achievement of the following set of inequalities, for suitably chosen values of the boundaries

    ‖l‖ > l_m ;   ‖τ‖ > τ_m ;   ‖ξ‖ < ξ_M

where l is the vector lying on the vehicle x-y plane, joining the arm elbow with the line parallel to the vehicle z-axis and passing through the camera frame origin, as sketched in Fig. 2a. Moreover, ξ is the misalignment vector formed by vector τ, also lying on the vehicle x-y plane, joining the lines parallel to the vehicle z-axis and, respectively, passing through the elbow and the end-effector origin.

As for the vehicle, it must keep the object of interest grossly centered in the camera frame (see Fig. 2b), thus meaning that the modulus of the orientation error θ, formed by the unit vector n_p of vector p from the sample to the camera frame and the unit vector k_c of the z-axis of the camera frame itself, must ultimately satisfy the inequality

    ‖θ‖ < θ_M

Furthermore, the camera must also be closer than a given horizontal distance d_M to the vertical line passing through the sample, and it must lie between a maximum and minimum height with respect to the sample itself, thus implying the achievement of the following inequalities (Fig. 2c, d):
    ‖d‖ < d_M ;   h_m < ‖h‖ < h_M

[Advanced Manipulation for Underwater Sampling, Fig. 2 Vectors allowing for the definition of some inequality objectives in the Cartesian space: (a) camera occlusion, (b) camera centering, (c) camera distance, (d) camera height]

Also, since the vehicle should exhibit an almost horizontal attitude, this further requires the achievement of the following additional inequality:

    ‖φ‖ < φ_M

with φ the misalignment vector formed by the absolute vertical unit vector k_o with the vehicle z-axis one k_v.

And finally the end-effector must eventually reach the sample, for then picking it. Thus the following objectives, now of equality type, must also be ultimately achieved, where r is the position error and ϑ the orientation one of the end-effector frame with respect to the sample frame

    ‖r‖ = 0 ;   ‖ϑ‖ = 0

As already repeatedly remarked, the achievement of the above inequality objectives (since related to the system safety and/or its best operability) must globally deserve a priority higher than the last equality.

Basic Definitions
The following definitions only regard a generic vector s ∈ R³ characterizing a corresponding generic objective defined in the Cartesian space
(for instance, with the exclusion of the joint and manipulability limits, all the other above-reported objectives). In this case the vector is termed the error vector of the objective, and it is assumed measured with components on the vehicle frame. Then its modulus

    σ ≝ ‖s‖

is termed the error, while its unit vector

    n ≝ s/σ ;   σ ≠ 0

is accordingly denoted as the unit error vector. Then the following differential Jacobian relationship can always be evaluated for each of them:

    ṡ = H y

where y ∈ R^N (N = 7 + 6 for the system of Fig. 1) is the stacked vector composed of the joint velocity vector q̇ ∈ R⁷, plus the stacked vector v ∈ R⁶ of the absolute vehicle velocities (linear and angular) with components on the vehicle frame, and with ṡ clearly representing the time derivative of vector s itself, as seen from the vehicle frame and with components on it (see Casalino (2011) for details on the real-time evaluation of the Jacobian matrices H).

Obviously, for the time derivative σ̇ of the error, the following differential relationship also holds

    σ̇ = nᵀ H y

Further, to each error variable σ, a so-called error reference rate is real-time assigned of the form

    σ̄̇ = −(σ − σᵒ) α(σ)

where for equality objectives σᵒ is the target value and α(σ) ≡ 1, while for inequality ones, σᵒ is the threshold value and α(σ) is a left-cutting or right-cutting (in correspondence of σᵒ) smooth sigmoidal activation function, depending on whether the objective is to force σ to be below or above σᵒ, respectively.

In case σ̄̇ could be exactly assigned to its corresponding error rate σ̇, it would consequently smoothly drive σ toward the achievement of its associated objective. Note however that for inequality objectives, it would necessarily impose σ̇ = 0 in correspondence of a point located inside the interval of validity of the inequality objective itself, while instead such an error rate zeroing effect should be relaxed, for allowing the helpful subsequent system mobility increase, which allows for further progress toward other lower priority control objectives. Such a relaxation aspect will be dealt with soon.

Furthermore, in correspondence of a reference error rate σ̄̇, the so-called reference error vector rate can also be defined as

    s̄̇ ≝ n σ̄̇

that for equality objectives requiring the zeroing of their error σ simply becomes

    s̄̇ = −s

whose evaluation, since not requiring its unit vector n, will be useful for managing equality objectives.

Finally note that for each objective not defined in the Cartesian space (like, for instance, the above joint limits and manipulability), the corresponding scalar error variable, its rate, and its reference error rate can instead be managed directly, since obviously they do not require any preliminary scalar reduction process.

Managing the Higher Priority Inequality Objectives
A prioritized list of the various scalar inequality objectives, to be concurrently and progressively achieved, is suitably established in a descending priority order.

Then, by starting to consider the highest priority one, we have that the linear manifold of the system velocity vector y (i.e., the arm joint velocity vector q̇ stacked with the vector v of the vehicle linear and angular velocities) capable of driving toward its achievement results at each
time instant as the set of solutions of the following minimization problem with scalar argument, with row vector G₁ ≝ α₁ n₁ᵀ H₁ and scalar α₁ the same activation function embedded within the reference error rate σ̄̇₁

    S₁ ≝ argmin_y ‖σ̄̇₁ − G₁ y‖²  ⟺  y = G₁# σ̄̇₁ + (I − G₁# G₁) z₁ ≝ ρ₁ + Q₁ z₁ ;   ∀z₁     (1)

The above minimization, whose solution manifold appears at the right (also expressed in a concise notation with an obvious correspondence of terms), parameterized by the arbitrary vector z₁, has to be assumed executed without extracting the common factor α₁, that is, by evaluating the pseudo-inverse matrix G₁# via the regularized form

    G₁# = (α₁² n₁ᵀ H₁ H₁ᵀ n₁ + p₁)⁻¹ α₁ H₁ᵀ n₁

with p₁ a suitably chosen bell-shaped, finite-support, and centered-on-zero regularizing function of the norm of the row vector G₁.

In the above solution manifold, when α₁ = 1 (i.e., when the first inequality is still far from being achieved), the second arbitrary term Q₁z₁ is orthogonal to the first, thus having no influence on the generated σ̇₁ = σ̄̇₁ and consequently suitable to be used for also progressing toward the achievement of other lower priority objectives, without perturbing the current progressive achievement of the first one. Note however that, since in this condition the span of the second term results one dimension less than the whole system velocity space y ∈ R^N, this implies that the lower priority objectives can be progressed by only acting within a one-dimension-reduced system velocity subspace.

When α₁ = 0 (i.e., when the first inequality is achieved), since G₁# = 0 (as granted by the regularization) and consequently y = z₁, the lower priority objectives can instead be progressed by now exploiting the whole system velocity space.

When instead α₁ is within its transition zone with 0 < α₁ < 1 (i.e., when the first inequality is near to being achieved), since the two terms of the solution manifold now become only approximately orthogonal, this can make the usage of the second term for managing lower priority tasks possibly counteract the first, currently acting in favor of the highest priority one, but in any case without any possibility of making the primary error variable σ₁ get out of its enlarged boundaries (i.e., the ones inclusive of the transition zone), thus meaning that once the primary variable σ₁ has entered within such larger boundaries, it will definitely never get out of them.

With the above considerations in mind, managing the remaining priority-descending sequence of inequality objectives can then be done by applying the same philosophy to each of them and within the mobility space left free by its preceding ones, that is, as the result of the following sequence of nested minimization problems:

    Sᵢ ≝ argmin_{y ∈ Sᵢ₋₁} ‖σ̄̇ᵢ − Gᵢ y‖² ;   i = 1, 2, …, k

with Gᵢ ≝ αᵢ nᵢᵀ Hᵢ and with k indexing the lowest priority inequality objective, and where the highest priority objective has also been included for the sake of completeness (upon letting S₀ ≝ R^N). In this way the procedure guarantees the concurrent prioritized convergence (actually occurring as a sort of "domino effect" scattering along the prioritized objective list) toward the ultimate fulfillment of all inequality objectives, each one within its enlarged bounds at worst and with no possibility of getting out of them, once reached.

Further, a simple algebra allows translating the above sequence of k nested minimizations into the following algorithmic structure, with initialization ρ₀ = 0, Q₀ = I (see Casalino et al. 2012a, b for more details):

    Ĝᵢ ≝ Gᵢ Qᵢ₋₁

    Tᵢ = (I − Qᵢ₋₁ Ĝᵢ# Gᵢ)
    ρᵢ = Tᵢ ρᵢ₋₁ + Qᵢ₋₁ Ĝᵢ# σ̄̇ᵢ

    Qᵢ = Qᵢ₋₁ (I − Ĝᵢ# Ĝᵢ)

ending with the last, k-th, iteration with the solution manifold

    y = ρₖ + Qₖ zₖ ;   ∀zₖ

where the residual arbitrariness space Qₖzₖ has then to be used for managing the remaining equality objectives, as hereafter indicated.

Managing the Lower Priority Equality Objectives and Subsystem Motion Priorities
For managing the lower priority equality objectives when these require the zeroing of their associated error σᵢ (as, for instance, for the end-effector sample reaching task), the following sequence of nested minimization problems has to be instead considered (with initialization ρₖ, Qₖ):

    Sᵢ ≝ argmin_{y ∈ Sᵢ₋₁} ‖s̄̇ᵢ − Hᵢ y‖² ;   i = (k+1), …, m

with m indexing the last priority equality objective, and where the whole reference error vector rates s̄̇ᵢ and the associated whole error vectors sᵢ now have to be used, since for αᵢ ≡ 1 (as it is for any equality objective) the otherwise needed evaluation of the unit vectors nᵢ (which become ill defined for the relevant error σᵢ approaching zero) would most probably provoke unwanted chattering phenomena around σᵢ = 0, while instead the above avoids such risk (since s̄̇ᵢ and sᵢ can be evaluated without requiring nᵢ), even if at the cost of requiring, for each equality objective, three degrees of mobility instead of a sole one, as it instead is for each inequality objective. However, note how the algorithmic translation of the above procedure remains structurally the same as the one for the inequality objectives (obviously with s̄̇ᵢ in place of σ̄̇ᵢ, Hᵢ in place of Gᵢ, and with initialization ρₖ, Qₖ), thus ending in correspondence of the m-th, last, equality objective with the solution manifold

    y = ρₘ + Qₘ zₘ ;   ∀zₘ

where the still possibly existing residual arbitrariness space Qₘzₘ can be further used for assigning motion priorities between the arm and the vehicle, for instance, via the following additional least-priority ending task

    y = argmin_{y ∈ Sₘ} ‖v‖² ≝ ρₘ₊₁

whose solution ρₘ₊₁ (with no more arbitrariness required) finally assures (while respecting all previous priorities) a motion minimality of the vehicle, thus implicitly assigning to the arm a greater mobility, which in turn allows the exploitation of its generally higher motion precision, especially during the ultimate convergence toward the final grasping.

Implementations

The recently realized TRIDENT system of Fig. 1, embedding the above-introduced task-priority-based control architecture, has been operating at sea in 2012 (Port Soller Harbor, Mallorca, Spain). A detailed presentation of the preliminary performed simulations, then followed by pool experiments, and finally by field trials executed within a true underwater sea environment can be found in Simetti et al. (2013). The related EU-funded TRIDENT project (Sanz et al. 2012) is the first one where agile manipulation could be effectively achieved by part of an underwater floating manipulator, not only as the consequence of the comparable masses and inertia exhibited by the vehicle and arm, but mainly due to the adopted unified task-priority-based control framework. Capabilities for autonomous underwater floating manipulation were however already achieved for the first time in 2009 at the University of Hawaii, within the SAUVIM project (Marani et al. 2009, 2014; Yuh et al. 1998), even if without effective agility (the related system was in fact a 6-t vehicle endowed with a less than 35 kg arm).
Future Directions

The presented task-priority-based KCL structure is invariant with the addition, deletion, and substitution (even on-the-fly) of the various objectives, as well as invariant to changes in their priority ordering, thus constituting an invariant core potentially capable of supporting intervention tasks beyond the sole sample collection ones. On this basis, more complex systems and operational cases, such as, for instance, multi-arm systems and/or even cooperating ones, can be foreseen to be developed along the lines established by the roadmap of Fig. 3 (with case 0 the current development state).

The future availability of agile floating single-arm or multi-arm manipulators, also implementing cooperative interventions in force of a unified control and coordination structure (to this aim purposely extended), might in fact pave the way toward the realization of underwater hard-work robotized places, where different intervention agents might individually or cooperatively perform different object manipulation and transportation activities, also including assembly ones, thus far beyond the here-considered case of sample collection. Such scenarios deserve the attention not only of the science community, when needing to execute underwater works (excavation, coring, instrument handling, etc., other than sample collection) at increasing depths, but obviously also of the offshore industry.

Moreover, by exploiting the current and future developments on underwater exploration and survey missions performed by normal (i.e., nonmanipulative) AUVs, a possible work scenario might also include the presence of the latter, for accomplishing different service activities supporting the intervention ones: for instance, acting as relays with the surface; performing informative activities (for instance, the delivery of the area model built during a previous survey phase, or the delivery of the intervention mission, both downloaded when at the surface and then transferred to the intervention agents upon docking); or even, when hovering on the work area (for instance, close to a well-recognized feature), behaving as a local reference system for the self-localization of the operative agents via twin USBL devices.

[Advanced Manipulation for Underwater Sampling, Fig. 3 A sketch of the foreseen roadmap for future development of marine intervention robotics]

Cross-References

▶ Control of Networks of Underwater Vehicles
▶ Control of Ship Roll Motion
▶ Dynamic Positioning Control Systems for Ships and Underwater Vehicles
▶ Mathematical Models of Marine Vehicle-Manipulator Systems
▶ Mathematical Models of Ships and Underwater Vehicles
▶ Motion Planning for Marine Control Systems
▶ Redundant Robots
▶ Robot Grasp Control
▶ Robot Teleoperation
▶ Underactuated Marine Control Systems
▶ Underactuated Robots

Bibliography

Antonelli G (2006) Underwater robotics. Springer tracts in advanced robotics. Springer, New York
Casalino G (2011) Trident overall system modeling, including all needed variables for reactive coordination. Technical report ISME-2011. Available at http://www.grasal.dist.unige.it/files/89
Casalino G, Zereik E, Simetti E, Torelli S, Sperindè A, Turetta A (2012a) Agility for underwater floating manipulation task and subsystem priority based control strategies. In: International conference on intelligent robots and systems (IROS 2012), Vilamoura-Algarve
Casalino G, Zereik E, Simetti E, Torelli S, Sperindè A, Turetta A (2012b) A task and subsystem priority based control strategy for underwater floating manipulators. In: IFAC workshop on navigation, guidance and control of underwater vehicles (NGCUV 2012), Porto
Marani G, Choi SK, Yuh J (2009) Underwater autonomous manipulation for intervention missions AUVs. Ocean Eng 36(1):15–23
Marani G, Yuh J (2014) Introduction to autonomous manipulation – case study with an underwater robot, SAUVIM. Springer tracts in advanced robotics 102. Springer, pp 1–156
Nakamura Y (1991) Advanced robotics: redundancy and optimization. Addison Wesley, Reading
Sanz P, Ridao P, Oliver G, Casalino G, Insurralde C, Silvestre C, Melchiorri M, Turetta A (2012) TRIDENT: recent improvements about autonomous underwater intervention missions. In: IFAC workshop on navigation, guidance and control of underwater vehicles (NGCUV 2012), Porto
Simetti E, Casalino G, Torelli S, Sperinde A, Turetta A (2013) Experimental results on task priority and dynamic programming based approach to underwater floating manipulation. In: OCEANS 2013, Bergen, June 2013
Yoshikawa T (1985) Manipulability of robotic mechanisms. Int J Robot Res 4(1):3–9
Yuh J, Choi SK, Ikehara C, Kim GH, McMurty G, Ghasemi-Nejhad M, Sarkar N, Sugihara K (1998) Design of a semi-autonomous underwater vehicle for intervention missions (SAUVIM). In: Proceedings of the 1998 international symposium on underwater technology, Tokyo, Apr 1998

Air Traffic Management Modernization: Promise and Challenges

Christine Haissig
Honeywell International Inc., Minneapolis, MN, USA

The copyright holder of this entry is © Honeywell International Inc.

Synonyms

ATM Modernization

Abstract

This entry provides a broad overview of how air traffic for commercial air travel is scheduled and managed throughout the world. The major causes of delays and congestion are described, which include tight scheduling, safety restrictions, infrastructure limitations, and major disturbances. The technical and financial challenges to air traffic management are outlined, along with some of the promising developments for future modernization.

Keywords

Air traffic management; Air traffic control; Airport capacity; Airspace management; Flight safety

Introduction: How Does Air Traffic Management Work?

This entry focuses on air traffic management for commercial air travel, the passenger- and cargo-carrying operations with which most of us are familiar. This is the air travel with a pressing need for modernization to address current and future congestion. Passenger and cargo traffic is projected to double over the next 20 years, with growth rates of 3–4 % annually in developed markets such as the USA and Europe and growth rates of 6 % and more in developing markets such as Asia Pacific and the Middle East.

In most of the world, air travel is a distributed, market-driven system. Airlines schedule flights based on when people want to fly and when it is optimal to transport cargo. Most passenger flights are scheduled during the day; most package carrier flights are overnight. Some airports limit how many flights can be scheduled by having a slot system; others do not. This decentralized schedule of flights to and from airports around the world is controlled by a network of air navigation service providers (ANSPs) staffed with air traffic controllers, who ensure that aircraft are separated safely.
The International Civil Aviation Organization (ICAO) has divided the world's airspace into flight information regions. Each region has a country that controls the airspace, and the ANSP for each country can be a government department, state-owned company, or private organization. For example, in the United States, the ANSP is the Federal Aviation Administration (FAA), which is a government department. The Canadian ANSP is NAV CANADA, which is a private company.

Each country is different in terms of the services provided by the ANSP, how the ANSP operates, and the tools available to the controllers. In the USA and Europe, the airspace is divided into sectors and areas around airports. An air traffic control center is responsible for traffic flow within its sector, and rules and procedures are in place to cover transfer of control between sectors. The areas around busy airports are usually handled by a terminal radar approach control. The air traffic control tower personnel handle departing aircraft, landing aircraft, and the movement of aircraft on the airport surface.

Air traffic controllers in developed air travel markets like the USA and Europe have tools that help them with the business of controlling and separating aircraft. Tower controllers operating at airports can see aircraft directly through windows or on computer screens through surveillance technology such as radar and Automatic Dependent Surveillance-Broadcast (ADS-B). Tower controllers may have additional tools to help detect and prevent potential collisions on the airport surface. En route controllers can see aircraft on computer screens and may have additional tools to help detect potential losses of separation between aircraft. Controllers can communicate with aircraft via radio, and some have datalink communication available such as Controller-Pilot Datalink Communications (CPDLC).

Flight crews have tools to help with navigating and flying the airplane. Autopilots and autothrottles off-load the pilot from having to continuously control the aircraft; instead the pilot can specify the speed, altitude, and heading, and the autopilot and autothrottle will maintain those commands. Flight management systems (FMS) assist in flight planning in addition to providing lateral and vertical control of the airplane. Many aircraft have special safety systems such as the Traffic Alert and Collision Avoidance System, which alerts the flight crew to potential collisions with other airborne aircraft, and the Terrain Awareness and Warning System (TAWS), which alerts the flight crew to potential flight into terrain.

Causes of Congestion and Delays

Congestion and delays have multiple causes. These include tight scheduling; safety limitations on how quickly aircraft can take off and land and how closely they can fly; infrastructure limitations such as the number of runways at an airport and the airway structure; and disturbances such as weather and unscheduled maintenance.

Tight Scheduling
Tight scheduling is a major contributor to congestion and delays. The hub and spoke system that many major airlines operate with to minimize connection times means that aircraft arrive and depart in multiple banks during the day. During the arrival and departure banks, airports are very busy. As mentioned previously, passengers have preferred times to travel, which also increases demand at certain times. At airports that do not limit flight schedules by using slot scheduling, the number of flights scheduled can actually exceed the departure and arrival capacity of the airport even in best-case conditions. One of the reasons that airlines are asked to report on-time statistics is to make the published airline schedules more reflective of the average time from departure to arrival, not the best-case time.

Aircraft themselves are also tightly scheduled. Aircraft are an expensive capital asset. Since customers are very sensitive to ticket prices, airlines need to have their aircraft flying as many hours as possible per day. Airlines also limit the number of spare aircraft and flight crews available to fill in when operations are disrupted to control costs.
Safety Restrictions
Safety restrictions contribute to congestion. There is a limit to how quickly aircraft can take off from and land on a runway. Sometimes runways are used for both departing and arriving aircraft; at other times a runway may be used for departures only or arrivals only. Either way, the rule that controllers follow for safety is that only one aircraft can occupy the runway at one time. Thus, a landing aircraft must turn off of the runway before another aircraft can take off or land. This limitation and other limitations, like the ability of controllers to manage the arrival and departure aircraft, propagate backwards from the airport. Aircraft need to be spaced in an orderly flow and separated no closer than what can be supported by airport arrival rates. The backward propagation can go all the way to the departure airports and cause aircraft to be held on the ground as a means to regulate the traffic flow into a congested airport or through a congested air traffic sector.

There is a limit on how close aircraft can fly. Aircraft produce a wake that can be dangerous for other aircraft that are following too closely behind. Pilots are aware of this limitation and space safely when doing visual separation. Rules that controllers apply for separation take into account wake turbulence limitations, surveillance limitations, and limitations on how well aircraft can navigate and conform to the required speed, altitude, and heading.

The human is a safety limitation. Controllers and pilots are human. Being human, they have excellent reasoning capability. However, they are limited as to the number of tasks they can perform and are subject to fatigue. The rules and procedures in place to manage and fly aircraft take into account human limitations.

Infrastructure Limitations
Infrastructure limitations contribute to congestion and delays. Airport capacity is one infrastructure limitation. The number of runways combined with the available aircraft gates and capacity to process passengers through the terminal limit the airport capacity.

The airspace itself is a limitation. The airspace where controllers provide separation services is divided into an orderly structure of airways. The airways are like one-way, one-lane roads in the sky. They are stacked at different altitudes, which are usually separated by either 1,000 ft or 2,000 ft. The width of the airways depends on how well aircraft can navigate. In the US domestic airspace, where there are regular navigation aids and direct surveillance of aircraft, the airways have a plus or minus 4 NM width. Over the ocean, airways may need to be separated laterally by as much as 120 NM since there are fewer navigation aids and aircraft are not under direct control but separated procedurally. The limited number of airways that the airspace can support limits available capacity.

The airways themselves have capacity limitations just as traditional roads do. There are special challenges for airways since aircraft need a minimum separation distance, aircraft cannot slow down to a stop, and airways do not allow passing. So, although it may look like there is a lot of space in which aircraft can fly, there are actually a limited number of routes between a city pair or over oceanic airspace.

The radio that pilots and controllers use to communicate is another infrastructure limitation. At busy airports, there is significant radio congestion, and pilots may need to wait to get an instruction or response from a controller.

Disturbances
Weather is a significant disturbance in air traffic management. Weather acts negatively in many ways. Wet or icy pavement affects the braking ability of aircraft so they cannot vacate a runway as quickly as in dry conditions. Low cloud ceilings mean that all approaches must be instrument approaches rather than visual approaches, which also reduces runway arrival rates. Snow must be cleared from runways, closing them for some period of time. High winds can mean that certain approaches cannot be used because they are not safe. In extreme weather, an airport may need to close. Weather can block certain airways from use, requiring rerouting of aircraft. Rerouting increases demand on nearby airways, which may
or may not have the required additional capacity, so the rerouting cascades on both sides of the weather.

Why Is Air Traffic Management Modernization So Hard?

Air traffic management modernization is difficult for financial and technical reasons. The air traffic management system operates around the clock. It cannot be taken down for a significant period of time without a major effect on commerce and the economy.

Financing is a significant challenge for air traffic management modernization. Governments worldwide are facing budgetary challenges, and improvements to air travel are one of many competing financial interests. Local airport authorities have similar challenges in raising money for airport improvements. Airlines have competitive limitations on how much ticket prices can rise and therefore need to see a payback on investment in aircraft upgrades that can be as short as 2 years.

Another financial challenge is that the entity that needs to pay for the majority of an improvement may not be the entity that gets the majority of the benefit, at least near term. One example of this is the installation of ADS-B transmitters on aircraft. Buying and installing an ADS-B transmitter costs the aircraft owner money. It benefits the ANSPs, who can receive the transmissions and have them augment or replace expensive radar surveillance, but only if a large number of aircraft are equipped. Eventually the ANSP benefit will be seen by the aircraft operator through lower operating costs, but it takes time. This is one reason that ADS-B transmitter equipage was mandated in the USA, Europe, and other parts of the world rather than letting market forces drive equipage.

All entities, whether governmental or private, need some sort of business case to justify investment, where it can be shown that the benefit of the improvement outweighs the cost. The same system complexity that makes congestion and delays in one region propagate throughout the system makes it a challenge to accurately estimate benefits. It is complicated to understand if an improvement in one part of the system will really help or just shift where the congestion points are. Decisions need to be made on what improvements are the best to invest in. For government entities, societal benefits can be as important as financial payback, and someone needs to decide whose interests are more important. For example, the people living around an airport might want longer arrival paths at night to minimize noise, while air travelers and the airline want the airline to fly the most direct route into an airport. A combination of subject matter expertise and simulation can provide a starting point to estimate benefit, but often only operational deployment will provide realistic estimates.

It is a long process to develop new technologies and operational procedures even when the benefit is clear and financing is available. The typical development steps include describing the operational concept; developing new controller procedures, pilot procedures, or phraseology if needed; performing a safety and performance analysis to determine high-level requirements; performing simulations that at some point may include controllers or pilots; designing and building equipment that can include software, hardware, or both; and field testing or flight testing the new equipment. Typically, new ground tools are field tested in a shadow mode, where controllers can use the tool in a mock situation driven by real data before the tool is made fully operational. Flight testing is performed on aircraft that are flying with experimental certificates so that equipment can be tested and demonstrated prior to formal certification.

Avionics need to be certified before operational use to meet the rules established to ensure that a high safety standard is applied to air travel. To support certification, standards are developed. Frequently the standards are developed through international cooperation and through consensus decision-making that includes many different organizations such as ANSPs, airlines, aircraft manufacturers, avionics suppliers, pilot associations, controller associations, and more. This is a slow process but an important one, since it
reduces development risk for avionics suppliers and helps ensure that equipment can be used worldwide.

Once new avionics or ground tools are available, it takes time for them to be deployed. For example, aircraft fleets are upgraded as aircraft come in for major maintenance rather than pulling them out of scheduled service. Flight crews need to be trained on new equipment before it can be used, and training takes time. Ground tools are typically deployed site by site, and the controllers also require training on new equipment and new procedures.

Promise for the Future

Despite the challenges and complexity of air traffic management, there is a path forward for significant improvement in both developed and developing air travel markets. Developing air travel markets in countries like China and India can improve air traffic management using procedures, tools, and technology that are already used in developed markets such as the USA and Europe. Emerging markets like China are willing to make significant investments in improving air traffic management by building new airports, expanding existing airports, changing controller procedures, and investing in controller tools. In developed markets, new procedures, tools, and technologies will need to be implemented. In some regions, mandates and financial incentives may play a part in enabling infrastructure and equipment changes that are not driven by the marketplace.

The USA and Europe are both supporting significant research, development, and implementation programs to support air traffic management modernization. In the USA, the FAA has a program known as NextGen, the Next Generation Air Transportation System. In Europe, the European Commission oversees a program known as SESAR, the Single European Sky Air Traffic Management Research, which is a joint effort between the European Union, EUROCONTROL, and industry partners. Both programs have substantial support and financing. Each program has organized its efforts differently, but there are many similarities in the operational objectives and improvements being developed.

Airport capacity problems are being addressed in multiple ways. Controllers are being provided with advanced surface movement guidance and control systems that combine radar surveillance, ADS-B surveillance, and sensors installed at the airport with value-added tools to assist with traffic control and alert controllers to potential collisions. Datalink communications between controllers and pilots will reduce radio-frequency congestion, reduce communication errors, and enable more complex communication. The USA and Europe have plans to develop a modernized datalink communication infrastructure between controllers and pilots that would include information like departure clearances and the taxiway route clearance. Aircraft on arrival to an airport will be controlled more precisely by equipping aircraft with capabilities such as the ability to fly to a required time of arrival and the ability to space with respect to another aircraft.

Domestic airspace congestion is being addressed in Europe by moving towards a single European sky where the ANSPs for the individual nations coordinate activities and airspace is structured not as 27 national regions but operated as larger blocks. Similar efforts are underway in the USA to improve the cooperation and coordination between the individual airspace sectors. In some countries, large blocks of airspace are reserved for special use by the military. In those countries, efforts are in place to have dynamic special use airspace that is reserved on an as-needed basis but otherwise available for civil use.

Oceanic airspace congestion is being addressed by leveraging the improved navigation performance of aircraft. Some route structures are available only to aircraft that can fly to a required navigation performance. These route structures have less required lateral separation, and thus more routes can be flown in the same airspace. Pilot tools that leverage ADS-B are allowing aircraft to make flight level changes with reduced separation and in the future
are expected to allow pilots to do additional maneuvering that is restricted today, such as passing slower aircraft.

Weather cannot be controlled, but efforts are underway to do better prediction and provide more accurate and timely information to pilots, controllers, and aircraft dispatchers at airlines. On-board radars that pilots use to divert around weather are adding more sophisticated processing algorithms to better differentiate hazardous weather. Future flight management systems will have the capability to include additional weather information. Datalinks between the air and the ground or between aircraft may be updated to include information from the on-board radar systems, allowing aircraft to act as local weather sensors. Improved weather information for pilots, controllers, and dispatchers improves flight planning and minimizes the necessary size of deviations around hazardous weather while retaining safety.

Weather is also addressed by providing aircraft and airports with equipment to improve airport access in reduced visibility. Ground-based augmentation systems installed at airports provide aircraft with the capability to do precision-based navigation for approaches to airports with low weather ceilings. Other technologies like enhanced vision and synthetic vision, which can be part of a combined vision system, provide the capability to land in poor visibility.

Summary

Air traffic management is a complex and interesting problem. The expected increase in air travel worldwide is driving a need for improvements to the existing system so that more passengers can be handled while at the same time reducing congestion and delays. Significant research and development efforts are underway worldwide to develop safe and effective solutions that include controller tools, pilot tools, aircraft avionics, infrastructure improvements, and new procedures. Despite the technical and financial challenges, many promising technologies and new procedures will be implemented in the near, mid-, and far term to support air traffic management modernization worldwide.

Cross-References

▶ Aircraft Flight Control
▶ Pilot-Vehicle System Modeling

Bibliography

Collinson R (2011) Introduction to avionics systems, 3rd edn. Springer, Dordrecht
http://www.faa.gov/nextgen/
http://www.sesarju.eu/
Nolan M (2011) Fundamentals of air traffic control, 5th edn. Cengage Learning, Clifton Park

Aircraft Flight Control

Dale Enns
Honeywell International Inc., Minneapolis, MN, USA

Abstract

Aircraft flight control is concerned with using the control surfaces to change aerodynamic moments, to change attitude angles of the aircraft relative to the air flow, and ultimately change the aerodynamic forces to allow the aircraft to achieve the desired maneuver or steady condition. Control laws create the commanded control surface positions based on pilot and sensor inputs. Traditional control laws employ proportional and integral compensation with scheduled gains, limiting elements, and cross feeds between coupled feedback loops. Dynamic inversion is an approach to develop control laws that systematically addresses the equivalent of gain schedules and the multivariable cross feeds, can incorporate constrained optimization for the limiting elements, and maintains the use of proportional and integral compensation to achieve the benefits of feedback.
Keywords

Control allocation; Control surfaces; Dynamic inversion; Proportional and integral control; Rigid body equations of motion; Zero dynamics

Introduction

Flying is made possible by flight control, and this applies to birds and the Wright Flyer, as well as modern flight vehicles. In addition to balancing lift and weight forces, successful flight also requires a balance of moments or torques about the mass center. Control is a means to adjust these moments to stay in equilibrium and to perform maneuvers. While birds use their feathers and the Wright Flyer warped its wings, modern flight vehicles utilize hinged control surfaces to adjust the moments. The control action can be open or closed loop, where closed loop refers to a feedback loop consisting of sensors, computer, and actuation. A direct connection between the cockpit pilot controls and the control surfaces without a feedback loop is open loop control. The computer in the feedback loop implements a control law (computer program). The development of the control law is discussed in this entry.

Although the following discussion is applicable to a wide range of flight vehicles including gliders, unmanned aerial vehicles, lifting bodies, missiles, rockets, helicopters, and satellites, the focus of this entry will be on fixed wing commercial and military aircraft with human pilots.

Aircraft Flight Control, Fig. 1 Flight control variables

Flight

Aircraft are maneuvered by changing the forces acting on the mass center, e.g., a steady level turn requires a steady force towards the direction of turn. The force is the aerodynamic lift force (L), and it is banked or rotated into the direction of the turn. The direction can be adjusted with the bank angle (φ), and for a given airspeed (V) and air density (ρ), the magnitude of the force can be adjusted with the angle-of-attack (α). This is called bank-to-turn. Aircraft, e.g., missiles, can also skid-to-turn, where the aerodynamic side force (Y) is adjusted with the sideslip angle (β), but this entry will focus on bank-to-turn.

Equations of motion (Enns et al. 1996; Stevens and Lewis 1992) can be used to relate the time rates of change of φ, α, and β to roll (p), pitch (q), and yaw (r) rate. See Fig. 1. Approximate relations (for near steady level flight with no wind) are
φ̇ = p

α̇ = q + (L − mg)/(mV)

β̇ = r + Y/(mV)

where m is the aircraft mass and g is the gravitational acceleration. In straight and level flight conditions L = mg and Y = 0, so we think of these equations as kinematic equations where the rates of change of the angles φ, α, and β are the angular velocities p, q, and r.

Three moments, called roll, pitch, and yaw, for angular motion to move the right wing up or down, nose up or down, and nose right or left, respectively, create the angular accelerations to change p, q, and r, respectively. The equations are Newton's 2nd law for rotational motion. The moments (about the mass center) are dominated by aerodynamic contributions and depend on ρ, V, α, β, p, q, r, and the control surfaces. The control surfaces are aileron (δa), elevator (δe), and rudder (δr), and they are arranged to contribute primarily roll, pitch, and yaw moments, respectively.

The control surfaces (δa, δe, δr) contribute to angular accelerations, which are integrated to obtain the angular rates (p, q, r). The integral of angular rates contributes to the attitude angles (φ, α, β). The direction and magnitude of aerodynamic forces can be adjusted with the attitude angles. The forces create the maneuvers or steady conditions for operation of the aircraft.

Pure Roll Axis Example
Consider just the roll motion. The differential equation (Newton's 2nd law for the roll degree-of-freedom) for this dynamical system is

ṗ = Lp p + Lδa δa

where Lp is the stability derivative and Lδa is the control derivative, both of which can be regarded as constants for a given airspeed and air density.

Pitch Axis or Short Period Example
Consider just the pitch and heave motion. The differential equations (Newton's 2nd law for the pitch and heave degrees-of-freedom) for this dynamical system are

q̇ = Mα α + Mq q + Mδe δe

α̇ = Zα α + q + Zδe δe

where Mα, Mq, Zα are stability derivatives and Mδe is the control derivative, all of which can be regarded as constants for a given airspeed and air density.

Although Zα < 0 and Mq < 0 are stabilizing, Mα > 0 makes the short period motion inherently unstable. In fact, the short period motion of the Wright Flyer was unstable. Some modern aircraft are also unstable.

Lateral-Directional Axes Example
Consider just the roll, yaw, and side motion with four state variables (φ, p, r, β) and two inputs (δa, δr). We will use the standard state space equations with matrices A, B, C for this example. The short period equations apply for yaw and side motion (or dutch roll motion) with appropriate replacements, e.g., q with r, α with β, M with N. We add the term (g/V)φ to the β̇ equation. We include the kinematic equation φ̇ = p and add the term Lβ β to the ṗ equation. The dutch roll, like the short period, can be unstable if Nβ < 0, e.g., airplanes without a vertical tail.

There is coupling between the motions associated with stability derivatives Lr, Lβ, Np and control derivatives Lδr and Nδa. This is a fourth order multivariable coupled system where δa, δr are the inputs, and we can consider (p, r) or (φ, β) as the outputs.

Control

The control objectives are to provide stability, disturbance rejection, desensitization, and satisfactory steady state and transient response to commands. Specifications and guidelines for these objectives are assessed quantitatively with frequency, time, and covariance analyses and simulations.
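The short period instability noted above can be checked numerically by examining the eigenvalues of the unforced dynamics in the state (q, α). The stability-derivative values below are hypothetical illustrations, not values from the text; they satisfy Mq < 0 and Zα < 0 (stabilizing) with Mα > 0 (destabilizing):

```python
import numpy as np

# Unforced short period model, state x = [q, alpha]:
#   qdot     = Mq*q + Ma*alpha
#   alphadot = q + Za*alpha
Mq, Ma, Za = -1.0, 2.0, -0.8   # hypothetical stability derivatives

A = np.array([[Mq, Ma],
              [1.0, Za]])
poles = np.linalg.eigvals(A)

# Mα > 0 pulls one eigenvalue into the right half plane (a saddle),
# so the open loop short period motion is unstable.
print(bool((poles.real > 0).any()))  # prints: True
```

With Mα ≤ Mq·Zα the determinant of A becomes nonnegative and both eigenvalues can sit in the left half plane, matching the text's statement that Mα > 0 is what makes the motion inherently unstable.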
Aircraft Flight Control, Fig. 2 Closed loop feedback system and desired dynamics

Integrator with P + I Control
The system to be controlled is the integrator for y in Fig. 2, and the output of the integrator (y) is the controlled variable. The proportional gain (Kb > 0) is a frequency and sets the bandwidth or crossover frequency of the feedback loop. The value of Kb will be between 1 and 10 rad/s in most aircraft applications. Integral action can be included with the gain fi > 0, with a value between 0 and 1.5 in most applications. The value of the command gain fc > 0 is set to achieve a desired closed loop response from the command yc to the output y. Values of fi = 0.25 and fc = 0.5 are typical. In realistic applications, there is a limit that applies at the input to the integrator. In these cases, we are obligated to include an anti-integral windup gain fa > 0 (typical value of 2) to prevent continued integration beyond the limit. The input to the limiter (ẏdes) is called the desired rate of change of the controlled variable (Enns et al. 1996).

The closed loop transfer function is

y/yc = Kb (fc s + fi Kb) / (s² + Kb s + fi Kb²)

and the pilot produces the commands (yc) with cockpit inceptors, e.g., sticks, pedals.

The control system robustness can be adjusted with the choices made for y, Kb, fi, and fc. These desired dynamics are utilized in all of the examples to follow. In the following, we use dynamic inversion (Enns et al. 1996; Wacker et al. 2001) to algebraically manipulate the equations of motion into the equivalent of the integrator for y in Fig. 2.

Pure Roll Motion Example
With algebraic manipulations called dynamic inversion we can use the pure integrator results in the previous section for the pure roll motion example. For the controlled variable y = p, given a measurement of the state x = p and values for Lp and Lδa, we simply solve for the input (u = δa) that gives the desired rate of change of the output ẏdes = ṗdes. The solution is

δa = Lδa⁻¹ (ṗdes − Lp p)

Since Lδa and Lp vary with air density and airspeed, we are motivated to schedule these portions of the control law accordingly.

Short Period Example
Similar algebraic manipulations use the general state space notation

ẋ = Ax + Bu
y = Cx

We want to solve for u to achieve a desired rate of change of y, so we start with

ẏ = CAx + CBu

If we can invert CB, i.e., it is not zero for the short period case, we solve for u with

u = (CB)⁻¹ (ẏdes − CAx)

Implementation requires a measurement of the state x and models for the matrices CA and CB.
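A minimal sketch of how the P + I desired dynamics of Fig. 2 combine with dynamic inversion: the loop below wraps the desired dynamics ẏdes = Kb(fc·yc − y + fi·z), ż = Kb(yc − y) around the inverted pure roll model. The gains fi = 0.25 and fc = 0.5 are the typical values quoted in the text, Kb = 2 rad/s is one choice from the text's 1–10 rad/s range, and the roll derivatives Lp, Lδa are hypothetical (the limiter and anti-windup gain fa are omitted):

```python
Kb, fi, fc = 2.0, 0.25, 0.5   # desired-dynamics gains (Kb in rad/s)
Lp, Lda = -2.0, 20.0          # hypothetical roll derivatives
dt = 0.001                    # Euler integration step (s)

p, z, yc = 0.0, 0.0, 1.0      # roll rate, integrator state, roll rate command
for _ in range(int(8.0 / dt)):
    pdot_des = Kb * (fc * yc - p + fi * z)  # desired rate of change of p
    z += dt * Kb * (yc - p)                 # integral action
    da = (pdot_des - Lp * p) / Lda          # dynamic inversion: solve for aileron
    p += dt * (Lp * p + Lda * da)           # roll dynamics: pdot = Lp*p + Lda*da

print(abs(p - yc) < 0.01)  # prints: True (p settles at the commanded value)
```

Because the inversion is exact, p obeys the desired dynamics regardless of Lp and Lδa; with these gains the closed-loop transfer function's zero at −fi·Kb/fc cancels one of the two poles at −1 rad/s, leaving a clean first-order response to the step command.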
The closed loop poles include the open loop zeros of the transfer function y(s)/u(s) (zero dynamics) in addition to the roots of the desired dynamics characteristic equation. Closed loop stability requires stable zero dynamics. The zero dynamics have an impact on control system robustness and can influence the precise choice of y.

When y = q, the control law includes the following dynamic inversion equation

δe = Mδe⁻¹ (q̇des − Mq q − Mα α)

and the open loop zero is Zα − Zδe Mδe⁻¹ Mα, which in almost every case of interest is a negative number.

Note that there are no restrictions on the open loop poles. This control law is effective and practical in stabilization of an aircraft with an open loop unstable short period mode.

Since Mδe, Mq, and Mα vary with air density and airspeed, we are motivated to schedule these portions of the control law accordingly.

When y = α, the zero dynamics are not suitable as closed loop poles. In this case, the pitch rate controller described above is the inner loop, and we apply dynamic inversion a second time as an outer loop (Enns and Keviczky 2006), where we approximate the angle-of-attack dynamics with the simplification that pitch rate has reached steady state, i.e., q̇ = 0, and regard pitch rate as the input (u = q) and angle-of-attack as the controlled variable (y = α). The approximate equation of motion is

α̇ = Zα α + q − Zδe Mδe⁻¹ (Mα α + Mq q)
  = (Zα − Zδe Mδe⁻¹ Mα) α + (1 − Zδe Mδe⁻¹ Mq) q

This equation is inverted to give

qc = (1 − Zδe Mδe⁻¹ Mq)⁻¹ (α̇des − (Zα − Zδe Mδe⁻¹ Mα) α)

qc obtained from this equation is passed to the inner loop as a command, i.e., yc of the inner loop.

Lateral-Directional Example
If we choose the two angular rates as the controlled variables (p, r), then the zero dynamics are favorable. We use the same proportional plus integral desired dynamics in Fig. 2, but there are two signals represented by each wire (one associated with p and the other r).

The same state space equations are used for the dynamic inversion step, but now CA and CB are 2 × 4 and 2 × 2 matrices, respectively, instead of scalars. The superscript in u = (CB)⁻¹(ẏdes − CAx) now means matrix inverse instead of reciprocal. The zero dynamics are assessed with the transmission zeros of the matrix transfer function (p, r)/(δa, δr).

In the practical case where the aileron and rudder are limited, it is possible to place a higher priority on solving one equation vs. another if the equations are coupled, by proper allocation of the commands to the control surfaces, which is called control allocation (Enns 1998). In these cases, we use a constrained optimization approach

min ‖CBu − (ẏdes − CAx)‖  subject to  umin ≤ u ≤ umax

instead of the matrix inverse followed by a limiter. In cases where there are redundant controls, i.e., the matrix CB has more columns than rows, we introduce a preferred solution up and solve a different constrained optimization problem

min ‖u − up‖  subject to  CBu + CAx = ẏdes

to find the solution that solves the equations that is closest to the preferred solution. We utilize weighted norms to accomplish the desired priority.

An outer loop to control the attitude angles (φ, β) can be obtained with an approach analogous to the one used in the previous section.

Nonlinear Example
Dynamic inversion can be used directly with the nonlinear equations of motion (Enns et al. 1996; Wacker et al. 2001). General equations of motion, e.g., 6 degree-of-freedom rigid body, can be expressed with ẋ = f(x, u), and the controlled
variable is given by y = h(x). With the chain rule of calculus we obtain

ẏ = (∂h/∂x)(x) f(x, u)

and for a given ẏ = ẏ_des and (measured) x we can solve this equation for u either directly or approximately. In practice, the first order Taylor series approximation is effective

ẏ ≅ a(x, u_0) + b(x, u_0)(u − u_0)

where u_0 is typically the past value of u in a discrete implementation. As in the previous example, Fig. 2 can be used to obtain ẏ_des. The terms a(x, u_0) − b(x, u_0) u_0 and b(x, u_0) are analogous to the terms CAx and the matrix CB, respectively. Control allocation can be utilized in the same way as discussed above. The zero dynamics are evaluated with transmission zeros at the intended operating points. Outer loops can be employed in the same manner as discussed in the previous section.

The control law with this approach utilizes the equations of motion, which can include table lookup for aerodynamics, propulsion, mass properties, and reference geometry as appropriate. The raw aircraft data or an approximation to the data takes the place of gain schedules with this approach.

Summary and Future Directions

Flight control is concerned with tracking commands for angular rates. The commands may come directly from the pilot or indirectly from the pilot through an outer loop, where the pilot directly commands the outer loop. Feedback control enables stabilization of aircraft that are inherently unstable and provides disturbance rejection and insensitive closed-loop response in the face of uncertain or varying vehicle dynamics. Proportional and integral control provide these benefits of feedback. The aircraft dynamics are significantly different for low altitude and high speed compared to high altitude and low speed, and so portions of the control law are scheduled. Aircraft do exhibit coupling between axes, and so multivariable feedback loop approaches are effective. Nonlinearities in the form of limits (noninvertible) and nonlinear expressions, e.g., trigonometric, polynomial, and table look-up (invertible), are present in flight control development. The dynamic inversion approach has been shown to include the traditional feedback control principles; it systematically develops the equivalent of the gain schedules, applies to multivariable systems and to invertible nonlinearities, and can be used to avoid issues with noninvertible nonlinearities to the extent that it is physically possible.

Future developments will include adaptation, reconfiguration, estimation, and nonlinear analyses. Adaptive control concepts will continue to mature and become integrated with approaches such as dynamic inversion to deal with unstructured or nonparameterized uncertainty or variations in the aircraft dynamics. Parameterized uncertainty will be incorporated with near real time reconfiguration of the aircraft model used as part of the control law, e.g., reallocation of control surfaces after an actuation failure. State variables used as measurements in the control law will be estimated as well as directly measured in nominal and sensor failure cases. Advances in nonlinear dynamical systems analyses will create improved intuition, understanding, and guidelines for control law development.

Cross-References

▶ PID Control
▶ Satellite Control
▶ Tactical Missile Autopilots

Bibliography

Enns DF (1998) Control allocation approaches. In: Proceedings of the AIAA guidance, navigation, and control conference, Boston
Enns DF, Keviczky T (2006) Dynamic inversion based flight control for autonomous RMAX helicopter. In: Proceedings of the 2006 American control conference, Minneapolis, 14–16 June 2006
Enns DF et al (1996) Application of multivariable control theory to aircraft flight control laws, final report: multivariable control design guidelines. Technical report WL-TR-96-3099, Flight Dynamics Directorate, Wright-Patterson Air Force Base, OH 45433-7562, USA
Stevens BL, Lewis FL (1992) Aircraft control and simulation. Wiley, New York
Wacker R, Enns DF, Bugajski DJ, Munday S, Merkle S (2001) X-38 application of dynamic inversion flight control. In: Proceedings of the 24th annual AAS guidance and control conference, Breckenridge, 31 Jan–4 Feb 2001

Applications of Discrete-Event Systems

Spyros Reveliotis
School of Industrial & Systems Engineering, Georgia Institute of Technology, Atlanta, GA, USA

Abstract

This entry provides an overview of the problems addressed by discrete-event systems (DES) theory, with an emphasis on their connection to various application contexts. The primary intentions are to reveal the caliber and the strengths of this theory and to direct the interested reader, through the listed citations, to the corresponding literature. The concluding part of the entry also identifies some remaining challenges and further opportunities for the area.

Keywords

Applications; Discrete-event systems

Introduction

Discrete-event systems (DES) theory (▶ Models for Discrete Event Systems: An Overview) (Cassandras and Lafortune 2008) emerged in the late 1970s/early 1980s from the effort of the controls community to address the control needs of applications concerning some complex production and service operations, like those taking place in manufacturing and other workflow systems, telecommunication and data-processing systems, and transportation systems. These operations were seeking the ability to support higher levels of efficiency and productivity and more demanding notions of quality of product and service. At the same time, the thriving computing technologies of the era, and in particular the emergence of the microprocessor, were cultivating, and to a significant extent supporting, visions of ever-increasing automation and autonomy for the aforementioned operations. The DES community set out to provide a systematic and rigorous understanding of the dynamics that drive the aforementioned operations and their complexity, and to develop a control paradigm that would define and enforce the target behaviors for those environments in an effective and robust manner.

In order to address the aforementioned objectives, the controls community had to extend its methodological base, borrowing concepts, models, and tools from other disciplines. Among these disciplines, the following two played a particularly central role in the development of DES theory: (i) Theoretical Computer Science (TCS) and (ii) Operations Research (OR). As a new research area, DES thrived on the analytical strength and the synergies that resulted from the rigorous integration of the modeling frameworks that were borrowed from TCS and OR. Furthermore, the DES community substantially extended those borrowed frameworks, bringing into them many of its control-theoretic perspectives and concepts.

In general, DES-based approaches are characterized by (i) their emphasis on a rigorous and formal representation of the investigated systems and the underlying dynamics; (ii) a double focus on time-related aspects and metrics that define traditional/standard notions of performance for the considered systems, but also on a more behaviorally oriented analysis that is necessary for ensuring fundamental notions of "correctness," "stability," and "safety" of the system operation,
especially in the context of the aspired levels of autonomy; (iii) the interplay between the two lines of analysis mentioned in item (ii) above and the further connection of this analysis to structural attributes of the underlying system; and (iv) an effort to complement the analytical characterizations and developments with design procedures and tools that will provide solutions provably consistent with the posed specifications and effectively implementable within the time and other resource constraints imposed by the "real-time" nature of the target applications.

The rest of this entry overviews the current achievements of DES theory with respect to (w.r.t.) the different classes of problems that have been addressed by it and highlights the potential that is defined by these achievements for a range of motivating applications. On the other hand, the constricted nature of this entry does not allow an expansive treatment of the aforementioned themes. Hence, the provided coverage is further supported and supplemented by an extensive list of references that will connect the interested reader to the relevant literature.

A Tour of DES Problems and Applications

DES-Based Behavioral Modeling, Analysis, and Control

The basic characterization of behavior in the DES-theoretic framework is through the various event sequences that can be generated by the underlying system. Collectively, these sequences are known as the (formal) language generated by the plant system, and the primary intention is to restrict the plant behavior within a subset of the generated event strings. The investigation of this problem is further facilitated by the introduction of certain mechanisms that act as formal representations of the studied systems, in the sense that they generate the same strings of events (i.e., the same formal language). Since these models are concerned with the representation of the event sequences that are generated by DES, and not with the exact timing of these events, they are frequently characterized as untimed DES models. In the practical applications of DES theory, the most popular such models are the Finite State Automaton (FSA) (Cassandras and Lafortune 2008; Hopcroft and Ullman 1979; ▶ Supervisory Control of Discrete-Event Systems; ▶ Diagnosis of Discrete Event Systems) and the Petri net (PN) (Cassandras and Lafortune 2008; Murata 1989; ▶ Modeling, Analysis, and Control with Petri Nets).

In the context of DES applications, these modeling frameworks have been used to provide succinct characterizations of the underlying event-driven dynamics and to design controllers, in the form of supervisors, that will restrict these dynamics so that they abide by safety, consistency, fairness, and other similar considerations (▶ Supervisory Control of Discrete-Event Systems). As a more concrete example, in the context of contemporary manufacturing, DES-based behavioral control – frequently referred to as supervisory control (SC) – has been promoted as a systematic methodology for the synthesis and verification of the control logic that is necessary for the support of the so-called SCADA (Supervisory Control and Data Acquisition) function. This control function is typically implemented through the Programmable Logic Controllers (PLCs) that have been employed in contemporary manufacturing shop-floors, and DES SC theory can support it (i) by providing more rigor and specificity to the models that are employed for the underlying plant behavior and the imposed specifications and (ii) by offering the ability to synthesize control policies that are provably correct by construction. Some example works that have pursued the application of DES SC along these lines can be found in Balemi et al. (1993), Brandin (1996), Park et al. (1999), Chandra et al. (2003), Endsley et al. (2006), and Andersson et al. (2010).

On the other hand, the aforementioned activity has also defined a further need for pertinent interfaces that will translate (a) the plant structure and the target behavior to the necessary DES-theoretic models and (b) the obtained policies to PLC executables. This need has led to a line of research, in terms of representational models and
computational tools, that is complementary to the core DES developments described in the previous paragraphs. Indicatively, we mention the development of GRAFCET (David and Alla 1992) and of the sequential function charts (SFCs) (Lewis 1998) from the earlier times, while some more recent endeavors along these lines are reported in Wightkin et al. (2011) and Alenljung et al. (2012) and the references cited therein.

Besides its employment in the manufacturing domain, DES SC theory has also been considered for the coordination of the communicating processes that take place in various embedded systems (Feng et al. 2007); the systematic validation of the embedded software that is employed in various control applications, ranging from power systems and nuclear plants to aircraft and automotive electronics (Li and Kumar 2012); the synthesis of the control logic in the electronic switches that are utilized in telecom and data networks; and the modeling, analysis, and control of the operations that take place in health-care systems (Sampath et al. 2008). Wassyng et al. (2011) gives a very interesting account of the gains, but also the extensive challenges, experienced by a team of researchers who have tried to apply formal methods, similar to those that have been promoted by the behavioral DES theory, to the development and certification of the software that manages some safety-critical operations for Canadian nuclear plants.

Apart from control, untimed DES models have also been employed for the diagnosis of critical events, like certain failures, that cannot be observed explicitly but whose occurrence can be inferred from some resultant behavioral patterns (Sampath et al. 1996; ▶ Diagnosis of Discrete Event Systems). More recently, the relevant methodology has been extended with prognostic capability (Kumar and Takai 2010), while an interesting variation of it addresses the "dual" problem that concerns the design of systems where certain events or behavioral patterns must remain undetectable by an external observer who has only partial observation of the system behavior; this last requirement has been formally characterized by the notion of "opacity" in the relevant literature, and it finds application in the design and operation of secure systems (Dubreil et al. 2010; Saboori and Hadjicostis 2012, 2014).

Dealing with the Underlying Computational Complexity

As revealed from the discussion of the previous paragraphs, many of the applications of DES SC theory concern the integration and coordination of behavior that is generated by a number of interacting components. In these cases, the formal models that are necessary for the description of the underlying plant behavior may grow in size very fast, and the algorithms that are involved in the behavioral analysis and control synthesis may become practically intractable. Nevertheless, the rigorous methodological base that underlies DES theory also provides a framework for addressing these computational challenges in an effective and structured manner.

More specifically, DES SC theory provides conditions under which the control specifications can be decomposed to the constituent plant components while maintaining the integrity and correctness of the overall plant behavior (▶ Supervisory Control of Discrete-Event Systems; Wonham 2006). The aforementioned works of Brandin (1996) and Endsley et al. (2006) provide some concrete examples for the application of modular control synthesis. On the other hand, there are fundamental problems addressed by SC theory and practice that require a holistic view of the underlying plant and its operation, and thus, they are not amenable to modular solutions. For such cases, DES SC theory can still provide effective solutions through (i) the identification of special plant structure, of practical relevance, for which the target supervisors are implementable in a computationally efficient manner and (ii) the development of structured approaches that can systematically trade off the original specifications for computational tractability.

A particular application that has benefited from, and, at the same time, has significantly promoted this last capability of DES SC theory, is that concerning the deadlock-free operation of many systems where a set of processes that execute concurrently and in a staged manner are
competing, at each of their processing stages, for the allocation of a finite set of reusable resources. In DES theory, this problem is known as the liveness-enforcing supervision of sequential resource allocation systems (RAS) (Reveliotis 2005), and it underlies the operation of many contemporary applications: from the resource allocation taking place in contemporary manufacturing shop floors (Ezpeleta et al. 1995; Reveliotis and Ferreira 1996; Jeng et al. 2002), to the traveling and/or work-space negotiation in robotic systems (Reveliotis and Roszkowska 2011), automated railway (Giua et al. 2006), and other guidepath-based traffic systems (Reveliotis 2000); to Internet-based workflow management systems like those envisioned for e-commerce and certain banking and insurance claim processing applications (Van der Aalst 1997); and to the allocation of the semaphores that control the accessibility of shared resources by concurrently executing threads in parallel computer programs (Liao et al. 2013). A systematic introduction to the DES-based modeling of RAS and their liveness-enforcing supervision is provided in Reveliotis (2005) and Zhou and Fanti (2004), while some more recent developments in the area are epitomized in Reveliotis (2007), Li et al. (2008), and Reveliotis and Nazeem (2013).

Closing the above discussion on the ability of DES theory to address effectively the complexity that underlies the DES SC problem, we should point out that the same merits of the theory have also enabled the effective management of the complexity that underlies problems related to the performance modeling and control of the various DES applications. We shall return to this capability in the next section that discusses the achievements of DES theory in this domain.

DES Performance Control and the Interplay Among Structure, Behavior, and Performance

DES theory is also interested in the performance modeling, analysis, and control of its target applications w.r.t. time-related aspects like throughput, resource utilization, experienced latencies, and congestion patterns. To support this type of analysis, the untimed DES behavioral models are extended to their timed versions. This extension takes place by endowing the original untimed models with additional attributes that characterize the experienced delays between the activation of an event and its execution (provided that it is not preempted by some other conflicting event). Timed models are further classified by the extent and the nature of the randomness that is captured by them. A basic such categorization is between deterministic models, where the aforementioned delays take fixed values for every event, and stochastic models, which admit more general distributions. From an application standpoint, timed DES models connect DES theory to the multitude of applications that have been addressed by Dynamic Programming, Stochastic Control, and scheduling theory (Bertsekas 1995; Meyn 2008; Pinedo 2002). Also, in their most general definition, stochastic DES models provide the theoretical foundation of discrete-event simulation (Banks et al. 2009).

Similar to the case of behavioral DES theory, a practical concern that challenges the application of timed DES models for performance modeling, analysis, and control is the very large size of these models, even for fairly small systems. DES theory has tried to circumvent these computational challenges through the development of methodology that enables the assessment of the system performance, over a set of possible configurations, from the observation of its behavior and the resultant performance at a single configuration. The required observations can be obtained through simulation, and in many cases, they can be collected from a single realization – or sample path – of the observed behavior; but then, the considered methods can also be applied on the actual system, and thus, they become a tool for real-time optimization, adaptation, and learning. Collectively, the aforementioned methods define a "sensitivity"-based approach to DES performance modeling, analysis, and control (Cassandras and Lafortune 2008; ▶ Perturbation Analysis of Discrete Event Systems). Historically, DES sensitivity analysis originated in the early 1980s in an effort to address the performance
analysis and optimization of queueing systems w.r.t. certain structural parameters like the arrival and processing rates (Ho and Cao 1991). But the current theory addresses more general stochastic DES models that bring it closer to broader endeavors to support incremental optimization, approximation, and learning in the context of stochastic optimal control (Cao 2007). Some particular applications of DES sensitivity analysis for the performance optimization of production, telecom, and computing systems can be found in Cassandras and Strickland (1988), Cassandras (1994), Panayiotou and Cassandras (1999), Homem-de-Mello et al. (1999), Fu and Xie (2002), and Santoso et al. (2005).

Another interesting development in time-based DES theory is the theory of (max,+) algebra (Baccelli et al. 1992). In its practical applications, this theory addresses the timed dynamics of systems that involve the synchronization of a number of concurrently executing processes with no conflicts among them, and it provides important structural results on the factors that determine the behavior of these systems in terms of the occurrence rates of various critical events and the experienced latencies among them. Motivational applications of (max,+) algebra can be traced in the design and control of telecommunication and data networks, manufacturing, and railway systems, and more recently the theory has found considerable practical application in the computation of repetitive/cyclical schedules that seek to optimize the throughput rate of automated robotic cells and of the cluster tools that are used in semiconductor manufacturing (Kim and Lee 2012; Lee 2008; Park et al. 1999).

Both sensitivity-based methods and the theory of (max,+) algebra that were discussed in the previous paragraphs are enabled by the explicit, formal modeling of the DES structure and behavior in the pursued performance analysis and control. This integrative modeling capability that is supported by DES theory also enables a profound analysis of the impact of the imposed behavioral-control policies upon the system performance and, thus, the pursuance of a more integrative approach to the synthesis of the behavioral and the performance-oriented control policies that are necessary for any particular DES instantiation. This is a rather novel topic in the relevant DES literature, and some recent works in this direction can be found in Cao (2005), Li and Reveliotis (2013), Markovski and Su (2013), and David-Henriet et al. (2013).

The Roles of Abstraction and Fluidification

The notions of "abstraction" and "fluidification" play a significant role in mastering the complexity that arises in many DES applications. Furthermore, both of these concepts have an important role in defining the essence and the boundaries of DES-based modeling.

In general systems theory, abstraction can be broadly defined as the effort to develop simplified models for the considered dynamics that retain, however, adequate information to resolve the posed questions in an effective manner. In DES theory, abstraction has been pursued w.r.t. the modeling of both the timed and untimed behaviors, giving rise to hierarchical structures and models. A theory for hierarchical SC is presented in Wonham (2006), while some applications of hierarchical SC in the manufacturing domain are presented in Hill et al. (2010) and Schmidt (2012). In general, hierarchical SC relies on a "spatial" decomposition that tries to localize/encapsulate the plant behavior into a number of modules that interact through the communication structure that is defined by the hierarchy. On the other hand, when it comes to timed DES behavior and models, a popular approach seeks to define a hierarchical structure for the underlying decision-making process by taking advantage of the different time scales that correspond to the occurrence of the various event types. Some particular works that formalize and systematize this idea in the application context of production systems can be found in Gershwin (1994) and Sethi and Zhang (1994) and the references cited therein.

In fact, the DES models that have been employed in many application areas can be perceived themselves as abstractions of dynamics of a more continuous, time-driven nature, where the underlying plant undergoes some fundamental
(possibly structural) transition upon the occurrence of certain events that are defined either endogenously or exogenously w.r.t. these dynamics. The combined consideration of the discrete-event dynamics that are generated in the manner described above, with the continuous, time-driven dynamics that characterize the modalities of the underlying plant, has led to the extension of the original DES theory to the so-called hybrid systems theory. Hybrid systems theory is itself very rich, and it is covered in another section of this encyclopedia (see also ▶ Discrete Event Systems and Hybrid Systems, Connections Between). From an application standpoint, it increases substantially the relevance of the DES modeling framework and brings this framework to some new and exciting applications. Some of the most prominent applications concern the coordination of autonomous vehicles and robotic systems, and a nice anthology of works concerning the application of hybrid systems theory in this particular application area can be found in the IEEE Robotics and Automation magazine of September 2011. These works also reveal the strong affinity that exists between hybrid systems theory and the DES modeling paradigm. Along similar lines, hybrid systems theory also underlies the endeavors for the development of the Automated Highway Systems that have been explored for the support of the future urban traffic needs (Horowitz and Varaiya 2000). Finally, hybrid systems theory and its DES component have been explored more recently as potential tools for the formal modeling and analysis of the molecular dynamics that are studied by systems biology (Curry 2012).

Fluidification, on the other hand, is the effort to represent, as continuous flows, dynamics that are essentially of discrete-event type, in order to alleviate the computational challenges that typically result from discreteness and its combinatorial nature. The resulting models serve as approximations of the original dynamics; frequently they have the formal structure of hybrid systems, and they define a basis for developing "relaxations" for the originally addressed problems. Usually, their justification is of an ad hoc nature, and the quality of the established approximations is empirically assessed on the basis of the delivered results (by comparing these results to some "baseline" performance). There are, however, a number of cases where the relaxed fluid model has been shown to retain important behavioral attributes of its original counterpart (Dai 1995). Furthermore, some recent works have investigated more analytically the impact of the approximation that is introduced by these models on the quality of the delivered results (Wardi and Cassandras 2013). Some more works regarding the application of fluidification in the DES-theoretic modeling frameworks, and the potential advantages that it brings in various application contexts, can be found in Srikant (2004), Meyn (2008), David and Alla (2005), and Cassandras and Yao (2013).

Summary and Future Directions

The discussion of the previous section has revealed the extensive application range and potential of DES theory and its ability to provide structured and rigorous solutions to complex and sometimes ill-defined problems. On the other hand, the same discussion has revealed the challenges that underlie many of the DES applications. The complexity that arises from the intricate and integrating nature of most DES models is perhaps the most prominent of these challenges. This complexity manifests itself in the involved computations, but also in the need for further infrastructure, in terms of modeling interfaces and computational tools, that will render DES theory more accessible to the practitioner.

The DES community is aware of this need, and the last few years have seen the development of a number of computational platforms that seek to implement and leverage the existing theory by connecting it to various application settings; indicatively, we mention DESUMA (Ricker et al. 2006), SUPREMICA (Akesson et al. 2006), and TCT (Feng and Wonham 2006), which support DES behavioral modeling, analysis, and control along the lines of DES SC theory, while the website entitled "The Petri Nets World" has an extensive database of tools that support modeling and
Applications of Discrete-Event Systems 45

analysis through untimed and timed variations of the Petri net model. Model checking tools, like SMV and NuSpin, that are used for verification purposes are also important enablers for the practical application of DES theory, and, of course, there are a number of programming languages and platforms, like Arena, AutoMod, and Simio, that support discrete-event simulation. However, with the exception of the discrete-event-simulation software, which is a pretty mature industry, the rest of the aforementioned endeavors currently evolve primarily within the academic and the broader research community. Hence, a remaining challenge for the DES community is the strengthening and expansion of the aforementioned computational platforms to robust and user-friendly computational tools. The availability of such industrial-strength computational tools, combined with the development of a body of control engineers well-trained in DES theory, will be catalytic for bringing all the developments that were described in the earlier parts of this document even closer to the industrial practice.

Cross-References

Diagnosis of Discrete Event Systems
Discrete Event Systems and Hybrid Systems, Connections Between
Modeling, Analysis, and Control with Petri Nets
Models for Discrete Event Systems: An Overview
Perturbation Analysis of Discrete Event Systems
Supervisory Control of Discrete-Event Systems

Bibliography

Akesson K, Fabian M, Flordal H, Malik R (2006) SUPREMICA – an integrated environment for verification, synthesis and simulation of discrete event systems. In: Proceedings of the 8th international workshop on discrete event systems, Ann Arbor. IEEE, pp 384–385
Alenljung T, Lennartson B, Hosseini MN (2012) Sensor graphs for discrete event modeling applied to formal verification of PLCs. IEEE Trans Control Syst Technol 20:1506–1521
Andersson K, Richardsson J, Lennartson B, Fabian M (2010) Coordination of operations by relation extraction for manufacturing cell controllers. IEEE Trans Control Syst Technol 18:414–429
Baccelli F, Cohen G, Olsder GJ, Quadrat JP (1992) Synchronization and linearity: an algebra for discrete event systems. Wiley, New York
Balemi S, Hoffmann GJ, Wong-Toi PG, Franklin GJ (1993) Supervisory control of a rapid thermal multiprocessor. IEEE Trans Autom Control 38:1040–1059
Banks J, Carson II JS, Nelson BL, Nicol DM (2009) Discrete-event system simulation, 5th edn. Prentice Hall, Upper Saddle River
Bertsekas DP (1995) Dynamic programming and optimal control, vols 1, 2. Athena Scientific, Belmont
Brandin B (1996) The real-time supervisory control of an experimental manufacturing cell. IEEE Trans Robot Autom 12:1–14
Cao X-R (2005) Basic ideas for event-based optimization of Markov systems. Discret Event Syst Theory Appl 15:169–197
Cao X-R (2007) Stochastic learning and optimization: a sensitivity approach. Springer, New York
Cassandras CG (1994) Perturbation analysis and "rapid learning" in the control of manufacturing systems. In: Leondes CT (ed) Dynamics of discrete event systems, vol 51. Academic, Boston, pp 243–284
Cassandras CG, Lafortune S (2008) Introduction to discrete event systems, 2nd edn. Springer, New York
Cassandras CG, Strickland SG (1988) Perturbation analytic methodologies for design and optimization of communication networks. IEEE J Sel Areas Commun 6:158–171
Cassandras CG, Yao C (2013) Hybrid models for the control and optimization of manufacturing systems. In: Campos J, Seatzu C, Xie X (eds) Formal methods in manufacturing. CRC/Taylor and Francis, Boca Raton
Chandra V, Huang Z, Kumar R (2003) Automated control synthesis for an assembly line using discrete event system theory. IEEE Trans Syst Man Cybern Part C 33:284–289
Curry JER (2012) Some perspectives and challenges in the (discrete) control of cellular systems. In: Proceedings of the WODES 2012, Guadalajara. IFAC, pp 1–3
Dai JG (1995) On positive Harris recurrence of multiclass queueing networks: a unified approach via fluid limit models. Ann Appl Probab 5:49–77
David R, Alla H (1992) Petri nets and Grafcet: tools for modelling discrete event systems. Prentice-Hall, Upper Saddle River
David R, Alla H (2005) Discrete, continuous and hybrid Petri nets. Springer, Berlin
David-Henriet X, Hardouin L, Raisch J, Cottenceau B (2013) Optimal control for timed event graphs under partial synchronization. In: Proceedings of the 52nd IEEE conference on decision and control, Florence. IEEE
Dubreil J, Darondeau P, Marchand H (2010) Supervisory control for opacity. IEEE Trans Autom Control 55:1089–1100
Endsley EW, Almeida EE, Tilbury DM (2006) Modular finite state machines: development and application to reconfigurable manufacturing cell controller generation. Control Eng Pract 14:1127–1142
Ezpeleta J, Colom JM, Martinez J (1995) A Petri net based deadlock prevention policy for flexible manufacturing systems. IEEE Trans Robot Autom 11:173–184
Feng L, Wonham WM (2006) TCT: a computation tool for supervisory control synthesis. In: Proceedings of the 8th international workshop on discrete event systems, Ann Arbor. IEEE, pp 388–389
Feng L, Wonham WM, Thiagarajan PS (2007) Designing communicating transaction processes by supervisory control theory. Formal Methods Syst Design 30:117–141
Fu M, Xie X (2002) Derivative estimation for buffer capacity of continuous transfer lines subject to operation-dependent failures. Discret Event Syst Theory Appl 12:447–469
Gershwin SB (1994) Manufacturing systems engineering. Prentice Hall, Englewood Cliffs
Giua A, Fanti MP, Seatzu C (2006) Monitor design for colored Petri nets: an application to deadlock prevention in railway networks. Control Eng Pract 10:1231–1247
Hill RC, Cury JER, de Queiroz MH, Tilbury DM, Lafortune S (2010) Multi-level hierarchical interface-based supervisory control. Automatica 46:1152–1164
Ho YC, Cao X-R (1991) Perturbation analysis of discrete event systems. Kluwer Academic, Boston
Homem-de-Mello T, Shapiro A, Spearman ML (1999) Finding optimal material release times using simulation-based optimization. Manage Sci 45:86–102
Hopcroft JE, Ullman JD (1979) Introduction to automata theory, languages and computation. Addison-Wesley, Reading
Horowitz R, Varaiya P (2000) Control design of automated highway system. Proc IEEE 88:913–925
Jeng M, Xie X, Peng MY (2002) Process nets with resources for manufacturing modeling and their analysis. IEEE Trans Robot Autom 18:875–889
Kim J-H, Lee T-E (2012) Feedback control design for cluster tools with wafer residency time constraints. In: IEEE conference on systems, man and cybernetics, Seoul. IEEE, pp 3063–3068
Kumar R, Takai S (2010) Decentralized prognosis of failures in discrete event systems. IEEE Trans Autom Control 55:48–59
Lee T-E (2008) A review of cluster tool scheduling and control for semiconductor manufacturing. In: Proceedings of the winter simulation conference, Miami. INFORMS, pp 1–6
Lewis RW (1998) Programming industrial control systems using IEC 1131-3. Technical report, The Institution of Electrical Engineers
Li M, Kumar R (2012) Model-based automatic test generation for Simulink/Stateflow using extended finite automaton. In: Proceedings of the CASE, Seoul. IEEE
Li R, Reveliotis S (2013) Performance optimization for a class of generalized stochastic Petri nets. In: Proceedings of the 52nd IEEE conference on decision and control, Florence. IEEE
Li Z, Zhou M, Wu N (2008) A survey and comparison of Petri net-based deadlock prevention policies for flexible manufacturing systems. IEEE Trans Syst Man Cybern Part C 38:173–188
Liao H, Wang Y, Cho HK, Stanley J, Kelly T, Lafortune S, Mahlke S, Reveliotis S (2013) Concurrency bugs in multithreaded software: modeling and analysis using Petri nets. Discret Event Syst Theory Appl 23:157–195
Markovski J, Su R (2013) Towards optimal supervisory controller synthesis of stochastic nondeterministic discrete event systems. In: Proceedings of the 52nd IEEE conference on decision and control, Florence. IEEE
Meyn S (2008) Control techniques for complex networks. Cambridge University Press, Cambridge
Murata T (1989) Petri nets: properties, analysis and applications. Proc IEEE 77:541–580
Panayiotou CG, Cassandras CG (1999) Optimization of kanban-based manufacturing systems. Automatica 35:1521–1533
Park E, Tilbury DM, Khargonekar PP (1999) Modular logic controllers for machining systems: formal representations and performance analysis using Petri nets. IEEE Trans Robot Autom 15:1046–1061
Pinedo M (2002) Scheduling. Prentice Hall, Upper Saddle River
Reveliotis SA (2000) Conflict resolution in AGV systems. IIE Trans 32(7):647–659
Reveliotis SA (2005) Real-time management of resource allocation systems: a discrete event systems approach. Springer, New York
Reveliotis SA (2007) Algebraic deadlock avoidance policies for sequential resource allocation systems. In: Lahmar M (ed) Facility logistics: approaches and solutions to next generation challenges. Auerbach Publications, Boca Raton, pp 235–289
Reveliotis SA, Ferreira PM (1996) Deadlock avoidance policies for automated manufacturing cells. IEEE Trans Robot Autom 12:845–857
Reveliotis S, Nazeem A (2013) Deadlock avoidance policies for automated manufacturing systems using finite state automata. In: Campos J, Seatzu C, Xie X (eds) Formal methods in manufacturing. CRC/Taylor and Francis
Reveliotis S, Roszkowska E (2011) Conflict resolution in free-ranging multi-vehicle systems: a resource allocation paradigm. IEEE Trans Robot 27:283–296
Ricker L, Lafortune S, Genc S (2006) DESUMA: a tool integrating GIDDES and UMDES. In: Proceedings of the 8th international workshop on discrete event systems, Ann Arbor. IEEE, pp 392–393
Saboori A, Hadjicostis CN (2012) Opacity-enforcing supervisory strategies via state estimator constructions. IEEE Trans Autom Control 57:1155–1165
Saboori A, Hadjicostis CN (2014) Current-state opacity formulations in probabilistic finite automata. IEEE Trans Autom Control 59:120–133
Sampath M, Sengupta R, Lafortune S, Sinnamohideen K, Teneketzis D (1996) Failure diagnosis using discrete
event models. IEEE Trans Control Syst Technol 4:105–124
Sampath R, Darabi H, Buy U, Liu J (2008) Control reconfiguration of discrete event systems with dynamic control specifications. IEEE Trans Autom Sci Eng 5:84–100
Santoso T, Ahmed S, Goetschalckx M, Shapiro A (2005) A stochastic programming approach for supply chain network design under uncertainty. Eur J Oper Res 167:96–115
Schmidt K (2012) Computation of supervisors for reconfigurable machine tools. In: Proceedings of the WODES 2012, Guadalajara. IFAC, pp 227–232
Sethi SP, Zhang Q (1994) Hierarchical decision making in stochastic manufacturing systems. Birkhäuser, Boston
Srikant R (2004) The mathematics of internet congestion control. Birkhäuser, Boston
Van der Aalst W (1997) Verification of workflow nets. In: Azema P, Balbo G (eds) Lecture notes in computer science, vol 1248. Springer, New York, pp 407–426
Wardi Y, Cassandras CG (2013) Approximate IPA: trading unbiasedness for simplicity. In: Proceedings of the 52nd IEEE conference on decision and control, Florence. IEEE
Wassyng A, Lawford M, Maibaum T (2011) Software certification experience in the Canadian nuclear industry: lessons for the future. In: EMSOFT'11, Taipei
Wightkin N, Buy U, Darabi H (2011) Formal modeling of sequential function charts with time Petri nets. IEEE Trans Control Syst Technol 19:455–464
Wonham WM (2006) Supervisory control of discrete event systems. Technical report ECE 1636F/1637S 2006-07, Electrical & Computer Eng., University of Toronto
Zhou M, Fanti MP (eds) (2004) Deadlock resolution in computer-integrated systems. Marcel Dekker, Singapore

ATM Modernization

Air Traffic Management Modernization: Promise and Challenges

Auctions

Bruce Hajek
University of Illinois, Urbana, IL, USA

Abstract

Auctions are procedures for selling one or more items to one or more bidders. Auctions induce games among the bidders, so notions of equilibrium from game theory can be applied to auctions. Auction theory aims to characterize and compare the equilibrium outcomes for different types of auctions. Combinatorial auctions arise when multiple related items are sold simultaneously.

Keywords

Auction; Combinatorial auction; Game theory

Introduction

Three commonly used types of auctions for the sale of a single item are the following:
• First price auction: Each bidder submits a bid, one of the bidders submitting the maximum bid wins, and the payment for the item is the maximum bid. (In this context "wins" means receives the item, no matter what the payment.)
• Second price auction or Vickrey auction: Each bidder submits a bid, one of the bidders submitting the maximum bid wins, and the payment for the item is the second highest bid.
• English auction: The price for the item increases continuously or in some small increments, and bidders drop out at some points in time. Once all but one of the bidders has dropped out, the remaining bidder wins and the payment is the price at which the last of the other bidders dropped out.

A key goal of the theory of auctions is to predict how the bidders will bid, and to predict the resulting outcomes of the auction: which bidder is the winner and what is the payment. For example, a seller may be interested in the expected payment (seller revenue). A seller may have the option to choose one auction format over another and be interested in revenue comparisons. Another item of interest is efficiency or social welfare. For sale of a single item, the outcome is efficient if the item is sold to the bidder with the highest value for the item. The book of V. Krishna (2002) provides an excellent introduction to the theory of auctions.
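The payment rules of the two sealed-bid formats can be made concrete in a short sketch (the function name and the tie-breaking rule are illustrative choices, not part of the article):

```python
def sealed_bid_outcome(bids, pricing="first"):
    """Return (winner index, payment) for a sealed-bid auction.

    pricing="first": the winner pays his own (maximum) bid.
    pricing="second": the winner pays the second-highest bid (Vickrey).
    One of the bidders submitting the maximum bid wins; ties are broken
    here in favor of the lowest index.
    """
    winner = max(range(len(bids)), key=lambda i: bids[i])
    if pricing == "first":
        payment = bids[winner]
    else:
        # highest bid among the *other* bidders
        payment = max(b for i, b in enumerate(bids) if i != winner)
    return winner, payment

bids = [3.0, 7.5, 6.0]
print(sealed_bid_outcome(bids, "first"))   # bidder 1 wins and pays 7.5
print(sealed_bid_outcome(bids, "second"))  # bidder 1 wins and pays 6.0
```

Note that under second pricing the payment does not depend on the winner's own bid, which is the mechanism behind the weakly dominant truthful-bidding strategy discussed below.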
Auctions Versus Seller Mechanisms

An important class of mechanisms within the theory of mechanism design are seller mechanisms, which implement the sale of one or more items to one or more bidders. Some authors would consider all such mechanisms to be auctions, but the definition of auctions is often more narrowly interpreted, with auctions being the subclass of seller mechanisms which do not depend on the fine details of the set of bidders. The rules of the three types of auction mentioned above do not depend on fine details of the bidders, such as the number of bidders or statistical information about how valuable the item is to particular bidders. In contrast, designing a procedure to sell an item to a known set of bidders under specific statistical assumptions about the bidders' preferences in order to maximize the expected revenue (as in Myerson (1981)) would be considered a problem of mechanism design, which is outside the more narrowly defined scope of auctions. The narrower definition of auctions was championed by R. Wilson (1987). An article on Mechanism Design appears in this encyclopedia.

Equilibrium Strategies in Auctions

An auction induces a noncooperative game among the bidders, and a commonly used predictor of the outcome of the auction is an equilibrium of the game, such as a Nash or Bayes-Nash equilibrium. For a risk neutral bidder i with value x_i for the item, if the bidder wins and the payment is M_i, the payoff of the bidder is x_i − M_i. If the bidder does not win, the payoff of the bidder is zero. If, instead, the bidder is risk averse with risk aversion measured by an increasing utility function u_i, the payoff of the bidder would be u_i(x_i − M_i) if the bidder wins and u_i(0) if the bidder does not win.

The second price auction format is characterized by simplicity of the bidding strategies. If bidder i knows the value x_i of the item to himself, then for the second price auction format, a weakly dominant strategy for the bidder is to truthfully report x_i as his bid for the item. Indeed, if y_i is the highest bid of the other bidders, the payoff of bidder i is u_i(x_i − y_i) if he wins and u_i(0) if he does not win. Thus, bidder i would prefer to win whenever u_i(x_i − y_i) > u_i(0) and not win whenever u_i(x_i − y_i) < u_i(0). That is precisely what happens if bidder i bids x_i, no matter what the bids of the other bidders are. That is, bidding x_i is a weakly dominant strategy for bidder i.

Nash equilibrium can be found for the other types of auctions under a model with incomplete information, in which the type of each bidder i is equal to the value of the object to the bidder and is modeled as a random variable X_i with a density function f_i supported by some interval [a_i, b_i]. A simple case is that the bidders are all risk neutral, the densities are all equal to some fixed density f, and the X_i's are mutually independent. The English auction in this context is equivalent to the second price auction: in an English auction, dropping out when the price reaches his true value is a weakly dominant strategy for a bidder, and for the weakly dominant strategy equilibrium, the outcome of the auction is the same as for the second price auction. For the first price auction in this symmetric case, there exists a symmetric Bayesian equilibrium. It corresponds to all bidders using the bidding function β (so the bid of bidder i is β(X_i)), where β is given by β(x) = E[Y_1 | Y_1 ≤ x], with Y_1 denoting the highest value among the other bidders. The expected revenue to the seller in this case is E[Y_1 | Y_1 < X_1], which is the same as the expected revenue for the second price auction and the English auction.

Equilibrium for Auctions with Interdependent Valuations

Seminal work of Milgrom and Weber (1982) addresses the performance of the above three auction formats in case the bidders do not know the value of the item, but each bidder i has a private signal X_i about the value V_i of the item to bidder i. The values and signals (X_1, …, X_n, V_1, …, V_n) can be interdependent. Under the assumption of invariance of the joint distribution of (X_1, …, X_n, V_1, …, V_n) under permutation of the bidders and a strong form of positive correlation of the random
variables (X_1, …, X_n, V_1, …, V_n) (see Milgrom and Weber 1982 or Krishna 2002 for details), a symmetric Bayes-Nash equilibrium is identified for each of the three auction formats mentioned above, and the expected revenues for the three auction formats are shown to satisfy the ordering R(first price) ≤ R(second price) ≤ R(English). A significant extension of the theory of Milgrom and Weber due to DeMarzo et al. (2005) is the theory of security-bid auctions, in which bidders compete to buy an asset and the final payment is determined by a contract involving the value of the asset as revealed after the auction.

Combinatorial Auctions

Combinatorial auctions implement the simultaneous sale of multiple items. A simple version is the simultaneous ascending price auction with activity constraints (Cramton 2006; Milgrom 2004). Such an auction procedure was originally proposed by Preston McAfee, Paul Milgrom, and Robert Wilson for the US FCC wireless spectrum auction in 1994 and has been used for the vast majority of spectrum auctions worldwide since then (Cramton 2013). The auction proceeds in rounds. In each round a minimum price is set for each item, with the minimum prices for the initial round being reserve prices set by the seller. A given bidder may place a bid on an item in a given round such that the bid is greater than or equal to the minimum price for the item. If one or more bidders bid on an item in a round, a provisional winner of the item is selected from among the bidders with the highest bid for the item in the round, with the new provisional price being the highest bid. The minimum price for the item is then increased 10 % (or some other small percentage) above the new provisional price. Once there is a round with no bids, the set of provisional winners is identified. Often constraints are placed on the bidders in the form of activity rules. An activity rule requires a bidder to maintain a history of bidding in order to continue bidding, so as to prevent bidders from not bidding in early rounds and bidding aggressively in later rounds. The motivation for activity rules is to promote price discovery to help bidders select the packages (or bundles) of items most suitable for them to buy. A key issue is that complementarities may exist among the items for a given bidder. Complementarity means that a bidder may place a significantly higher value on a bundle of items than the sum of the values the bidder would place on the items individually. Complementarities lead to the exposure problem, which occurs when a bidder wins only a subset of the items of a desired bundle at a price significantly higher than the value of that subset to the bidder. For example, a customer might place a high value on a particular pair of shoes, but little value on a single shoe alone.

A variation of simultaneous ascending price auctions for combinatorial auctions is auctions with package bidding (see, e.g., Ausubel and Milgrom 2002; Cramton 2013). A bidder will either win a package of items he bid for or no items, thereby eliminating the exposure problem. For example, in simultaneous clock auctions with package bidding, the price for each item increases according to a fixed schedule (the clock), and bidders report the packages of items they would prefer to purchase for the given prices. The price for a given item stops increasing when the number of bidders for that item drops to zero or one, and the clock phase of the auction is complete when the number of bidders for every item is zero or one. Following the clock phase, bidders can submit additional bids for packages of items. With the inputs from bidders acquired during the clock phase and the supplemental bid phase, the auctioneer then runs a winner determination algorithm to select a set of bids for non-overlapping packages that maximizes the sum of the bids. This winner determination problem is NP-hard, but is computationally feasible using integer programming or dynamic programming methods for moderate numbers of items (perhaps up to 30). In addition, the vector of payments charged to the winners is determined by a two-step process. First, the (generalized) Vickrey price for each bidder is determined, which is defined to be the minimum the bidder would have had to bid in order to be a winner. Secondly, the vector of Vickrey prices is projected onto the core of the reported prices. The second step ensures that no coalition
consisting of a set of bidders and the seller can achieve a higher sum of payoffs (calculated using the bids received) for some different selection of winners than the coalition received under the outcome of the auction. While this is a promising family of auctions, the projection to the core introduces some incentive for bidders to deviate from truthful reporting, and much remains to be understood about such auctions.

Summary and Future Directions

Auction theory provides a good understanding of the outcomes of the standard auctions for the sale of a single item. Recently emerging auctions, such as those for the generation and consumption of electrical power and for the selection of online advertisements, are challenging to analyze and comprise a direction for future research. Much remains to be understood in the theory of combinatorial auctions, such as the degree of incentive compatibility offered by core projecting auctions.

Cross-References

Game Theory: Historical Overview

Bibliography

Ausubel LM, Milgrom PR (2002) Ascending auctions with package bidding. BE J Theor Econ 1(1):Article 1
Cramton P (2006) Simultaneous ascending auctions. In: Cramton P, Shoham Y, Steinberg R (eds) Combinatorial auctions, chapter 4. MIT, Cambridge, pp 99–114
Cramton P (2013) Spectrum auction design. Rev Ind Organ 42(4):161–190
DeMarzo PM, Kremer I, Skrzypacz A (2005) Bidding with securities: auctions and security design. Am Econ Rev 95(4):936–959
Krishna V (2002) Auction theory. Academic, San Diego
Milgrom PR (2004) Putting auction theory to work. Cambridge University Press, Cambridge/New York
Milgrom PR, Weber RJ (1982) A theory of auctions and competitive bidding. Econometrica 50(5):1089–1122
Myerson R (1981) Optimal auction design. Math Oper Res 6(1):58–73
Wilson R (1987) Game theoretic analysis of trading processes. In: Bewley T (ed) Advances in economic theory. Cambridge University Press, Cambridge/New York

Autotuning

Tore Hägglund
Lund University, Lund, Sweden

Abstract

Autotuning, or automatic tuning, means that the controller is tuned automatically. Autotuning is normally applied to PID controllers, but the technique can also be used to initialize more advanced controllers. The main approaches to autotuning are based on step response analysis or on frequency response analysis obtained using relay feedback. Autotuning has been well received in industry, and today most distributed control systems have some kind of autotuning technique.

Keywords

Automatic tuning; Gain scheduling; PID control; Process control; Proportional-integral-derivative control; Relay feedback

Background

In the late 1970s and early 1980s, there was a quite rapid change of controller implementation in process control. The analog controllers were replaced by computer-based controllers and distributed control systems. The functionality of the new controllers was often more or less a copy of the old analog equipment, but new functions that utilized the computer implementation were gradually introduced. One of the first functions of this type was autotuning. Autotuning is a method to tune controllers, normally PID controllers, automatically.

What Is Autotuning?

A PID controller in its basic form has the structure
u(t) = K ( e(t) + (1/T_i) ∫₀ᵗ e(τ) dτ + T_d de(t)/dt )

where u is the controller output and e = y_sp − y is the control error, where y_sp is the setpoint and y is the process output. There are three parameters in the controller, gain K, integral time T_i, and derivative time T_d. These parameters have to be set by the user. Their values depend on the process dynamics and the specifications of the control loop.

A process control plant may have thousands of control loops, which means that maintaining high-performance controller tuning can be very time consuming. This was the main reason why procedures for automatic tuning were installed so rapidly in the computer-based controllers.

When a controller is to be tuned, the following steps are normally performed by the user:
1. To determine the process dynamics, a minor disturbance is injected by changing the control signal.
2. By studying the response in the process output, the process dynamics can be determined, i.e., a process model is derived.
3. The controller parameters are finally determined based on the process model and the specifications.

Autotuning means simply that these three steps are performed automatically. Instead of having a human perform these tasks, they are performed automatically on demand from the user. Ideally, the autotuning should be fully automatic, which means that no information about the process dynamics is required from the user.

Automatic tuning can be performed in many ways. The process disturbance can take different forms, e.g., the form of step changes or some kind of oscillatory excitation. The model obtained can be more or less accurate. There are also many ways to tune the controller based on the process model.

Here, we will discuss two main approaches for autotuning, namely, those that are based on step response analysis and those that are based on frequency response analysis.

Methods Based on Step Response Analysis

Most methods for automatic tuning of PID controllers are based on step response analysis. When the operator wishes to tune the controller, an open-loop step response experiment is performed. A process model is then obtained from the step response, and controller parameters are determined. This is usually done using simple formulas or look-up tables.

The most common process model used for PID controller tuning based on step response experiments is the first-order plus dead-time model

G(s) = (K_p / (1 + sT)) e^(−sL)

where K_p is the static gain, T is the apparent time constant, and L is the apparent dead time. These three parameters can be obtained from a step response experiment according to Fig. 1. Static gain K_p is given by the ratio between the steady-state change in process output and the magnitude of the control signal step, K_p = Δy/Δu. Dead time L is determined as the time elapsed from the step change to the intersection of the largest slope of the process output with the level of the process output before the step change. Finally, time constant T is the time when the process output has reached 63 % of its final value, minus L.

Autotuning, Fig. 1 Determination of K_p, L, and T from a step response experiment
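The graphical rules just described are easy to mechanize. The following sketch (the function name and the synthetic test process are illustrative assumptions, not from the article) recovers K_p, L, and T from a sampled step response:

```python
import math

def estimate_fopdt(t, y, du):
    """Estimate (Kp, L, T) from a step response per the graphical rules:
    Kp = dy/du; L from the intersection of the steepest tangent with the
    initial level; T = time to reach 63 % of the final change, minus L."""
    y0, y_inf = y[0], y[-1]
    Kp = (y_inf - y0) / du
    # steepest slope via finite differences
    slopes = [(y[i + 1] - y[i]) / (t[i + 1] - t[i]) for i in range(len(y) - 1)]
    k = max(range(len(slopes)), key=lambda i: slopes[i])
    # tangent through (t[k], y[k]) crosses the initial level y0 at t = L
    L = t[k] - (y[k] - y0) / slopes[k]
    # time at which the output has covered 63 % of its final change
    target = y0 + 0.63 * (y_inf - y0)
    t63 = next(ti for ti, yi in zip(t, y) if yi >= target)
    T = t63 - L
    return Kp, L, T

# synthetic first-order-plus-dead-time response with Kp=2, L=1, T=5, du=0.5
du, Kp_true, L_true, T_true = 0.5, 2.0, 1.0, 5.0
t = [0.01 * i for i in range(6000)]
y = [Kp_true * du * (1 - math.exp(-(ti - L_true) / T_true)) if ti > L_true
     else 0.0 for ti in t]
Kp, L, T = estimate_fopdt(t, y, du)
print(round(Kp, 2), round(L, 2), round(T, 2))  # close to 2.0, 1.0, 5.0
```

The recovered parameters would then be fed into tuning formulas or look-up tables, for instance the classical Ziegler–Nichols step-response rules.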
The greatest difficulty in carrying out tuning automatically is in selecting the amplitude of the step. The user naturally wants the disturbance to be as small as possible so that the process is not disturbed more than necessary. On the other hand, it is easier to determine the process model if the disturbance is large. The result of this dilemma is usually that the user has to decide how large the step in the control signal should be. Another problem is to determine when the step response has reached its final value.

Methods Based on Frequency Response Analysis

Frequency-domain characteristics of the process can be obtained by adding sinusoids to the control signal, but without knowing the frequency response of the process, the interesting frequency range and acceptable amplitudes are not known. A relevant frequency response can, however, be obtained automatically from experiments with relay feedback according to Fig. 2. Notice that there is a switch that selects either relay feedback or ordinary PID feedback. When it is desired to tune the system, the PID function is disconnected and the system is connected to relay feedback control. Relay feedback control is the same as on/off control, but where the on and off levels are carefully chosen and not 0 and 100 %. The relay feedback makes the control loop oscillate. The period and the amplitude of the oscillation are determined when steady-state oscillation is obtained. This gives the ultimate period and the ultimate gain. The parameters of a PID controller can then be determined from these values. The PID controller is then automatically switched in again, and the control is executed with the new PID parameters.

For large classes of processes, relay feedback gives an oscillation at a frequency close to the ultimate frequency ω_u, as shown in Fig. 3, where the control signal is a square wave and the process output is close to a sinusoid. The gain of the transfer function at this frequency is also easy to obtain from amplitude measurements.

Autotuning, Fig. 2 The relay autotuner. In the tuning mode the process is connected to relay feedback

Describing function analysis can be used to determine the process characteristics. The describing function of a relay with hysteresis is

N(a) = (4d/(πa)) ( √(1 − (ε/a)²) − i ε/a )

where d is the relay amplitude, ε the relay hysteresis, and a the amplitude of the input signal. The negative inverse of this describing function is a straight line parallel to the real axis; see Fig. 4. The oscillation corresponds to the point where the negative inverse describing function crosses the Nyquist curve of the process, i.e., where

G(iω) = −1/N(a)

Since N(a) is known, G(iω) can be determined from the amplitude a and the frequency ω of the oscillation.

Notice that the relay experiment is easily automated. There is often an initialization phase where the noise level in the process output is determined during a short period of time. The noise level is used to determine the relay hysteresis and a desired oscillation amplitude in the process output. After this initialization phase, the relay function is introduced. Since the amplitude of the oscillation is proportional to the relay output, it is easy to control it by adjusting the relay output.

Different Adaptive Techniques

In the late 1970s, at the same time as autotuning procedures were developed and implemented in industrial controllers, there was a large academic interest in adaptive control. These two concepts
Autotuning, Fig. 3 Process output y and control signal u during relay feedback
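Figure 3 shows the signals during such an experiment. As a rough illustration, the experiment can be simulated; in the following sketch the process model, relay amplitude, and integration step are illustrative assumptions, an ideal relay without hysteresis is used, and the ultimate gain is obtained from the describing-function relation K_u ≈ 4d/(πa):

```python
import math

# Relay feedback on a simulated first-order-plus-dead-time process
# G(s) = Kp e^(-sL) / (1 + sT) -- an illustrative example process.
Kp, T, L = 1.0, 5.0, 1.0            # process gain, time constant, dead time
dt, d = 0.001, 1.0                  # Euler step, relay amplitude
buf = [0.0] * int(round(L / dt))    # dead-time buffer of past inputs
y, u = 0.0, d
switch_times, ys = [], []

for k in range(100_000):
    t = k * dt
    u_new = -d if y > 0 else d      # ideal relay around setpoint 0
    if u_new != u:                  # relay switched this step
        switch_times.append(t)
        u = u_new
    buf.append(u)
    y += dt * (Kp * buf.pop(0) - y) / T   # first-order lag, Euler step
    ys.append(y)

# steady-state oscillation: two relay switches per period
Tu = 2 * (switch_times[-1] - switch_times[-2])
a = max(abs(v) for v in ys[-int(Tu / dt):])   # oscillation amplitude
Ku = 4 * d / (math.pi * a)          # ultimate gain via describing function
print(round(Tu, 2), round(a, 3), round(Ku, 2))
```

For this process the experiment gives T_u ≈ 3.7 s and K_u ≈ 7, while the exact ultimate gain is about 8.5; the discrepancy reflects the describing-function approximation, since the relay oscillation is not exactly sinusoidal. The estimated K_u and T_u would then feed a tuning rule such as Ziegler–Nichols.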
dynamics. Therefore, they are both called adap-


Im
tive techniques.
There is a third adaptive technique, namely,
gain scheduling. Gain scheduling is a system
where controller parameters are changed
Re depending on measured operating conditions.
The scheduling variable can, for instance, be
1 the measurement signal, controller output, or an

N(a) external signal. For historical reasons the word
G (iω) gain scheduling is used even if other parameters
like integral time or derivative time are changed.
Gain scheduling is a very effective way of
Autotuning, Fig. 4 The negative inverse describing controlling systems whose dynamics change
function of a relay with hysteresis 1=N.a/ and a Nyquist with the operating conditions. Automatic tuning
curve G.i !/ has made it possible to generate gain schedules
automatically.
are often mixed up with each other. Autotun- Although research on adaptive techniques has
ing is sometimes called tuning on demand. An almost exclusively focused on adaptation, ex-
identification experiment is performed, controller perience has shown that autotuning and gain
parameters are determined, and the controller scheduling have much wider industrial applica-
is then run with fixed parameters. An adaptive bility. Figure 5 illustrates the appropriate use of
controller is, however, a controller where the con- the different techniques.
troller parameters are adjusted online based on Controller performance is the first issue to
information from routine data. Automatic tuning consider. If requirements are modest, a controller
and adaptive control have, however, one thing in with constant parameters and conservative tuning
common, namely, that they are methods to adapt can be used. Other solutions should be considered
the controller parameters to the actual process when higher performance is required.
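The relay experiment described above is straightforward to prototype. The sketch below is an illustrative assumption rather than any product's implementation: it closes the loop with a hysteresis relay around a simple third-order test process G(s) = 1/(s+1)³, reads off the ultimate period T_u and oscillation amplitude a, estimates the ultimate gain from the describing-function relation K_u ≈ 4d/(πa), and applies the classical Ziegler–Nichols rules.

```python
import numpy as np

# Relay-feedback experiment on an assumed test process G(s) = 1/(s+1)^3,
# simulated by Euler integration of three cascaded first-order lags.
d, hys = 1.0, 0.01            # relay amplitude and hysteresis
dt, T = 0.001, 60.0
x = np.zeros(3)               # process states; the output is y = x[2]
u = d
switch_times, y_log = [], []
for k in range(int(T / dt)):
    y = x[2]
    if y > hys and u > 0:     # relay switches high -> low
        u = -d
    elif y < -hys and u < 0:  # relay switches low -> high
        u = d
        switch_times.append(k * dt)
    x[0] += dt * (-x[0] + u)
    x[1] += dt * (-x[1] + x[0])
    x[2] += dt * (-x[2] + x[1])
    y_log.append(y)
Tu = switch_times[-1] - switch_times[-2]     # ultimate period (steady cycle)
last = y_log[-int(Tu / dt):]
a = 0.5 * (max(last) - min(last))            # oscillation amplitude
Ku = 4.0 * d / (np.pi * a)                   # describing-function estimate
Kp, Ti, Td = 0.6 * Ku, 0.5 * Tu, 0.125 * Tu  # Ziegler-Nichols PID values
```

For this particular test process the exact values are T_u = 2π/√3 ≈ 3.6 s and K_u = 8; the relay hysteresis makes the measured estimates slightly approximate, which is also the case in practice.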
Autotuning, Fig. 5 When to use different adaptive techniques: constant but unknown dynamics – auto-tuning, giving constant controller parameters; predictable changes in dynamics – auto-tuning with gain scheduling, handling predictable parameter changes; unpredictable changes in dynamics – auto-tuning with adaptation, handling unpredictable parameter changes

If the process dynamics are constant, a controller with constant parameters should be used. The parameters of the controller can be obtained by autotuning.

If the process dynamics or the character of the disturbances are changing, it is useful to compensate for these changes by changing the controller. If the variations can be predicted from measured signals, gain scheduling should be used, since it is simpler and gives superior and more robust performance than continuous adaptation. Typical examples are variations caused by nonlinearities in the control loop. Autotuning can be used to build up the gain schedules automatically.

There are also cases where the variations in process dynamics are not predictable. Typical examples are changes due to unmeasurable variations in raw material, wear, fouling, etc. These variations cannot be handled by gain scheduling but must be dealt with by adaptation. An autotuning procedure is often used to initialize the adaptive controller; it is then sometimes called pre-tuning or initial tuning.

To summarize, autotuning is a key component in all adaptive techniques and a prerequisite for their use in practice.

Industrial Products

Commercial PID controllers with adaptive techniques have been available since the late 1970s, both in single-station controllers and in distributed control systems.

Two important, but distinct, applications of PID autotuners are temperature controllers and process controllers. Temperature controllers are primarily designed for temperature control, whereas process controllers are supposed to work for a wide range of control loops in the process industry, such as flow, pressure, level, temperature, and concentration control loops. Automatic tuning is easier to implement in temperature controllers, since most temperature control loops have several common features. This is the main reason why automatic tuning was introduced more rapidly in these controllers. Since the processes that are controlled with process controllers may have large differences in their dynamics, tuning becomes more difficult compared to the pure temperature control loops.

Automatic tuning can also be performed by external devices which are connected to the control loop during the tuning phase. Since these devices are supposed to work together with controllers from different manufacturers, they must be provided with quite a lot of information about the controller structure and parameterization in order to provide appropriate controller parameters. Such information includes signal ranges, controller structure (series or parallel form), sampling rate, filter time constants, and units of the different controller parameters (gain or proportional band, minutes or seconds, time or repeats/time).

Summary and Future Directions

Most of the autotuning methods that are available in industrial products today were developed about 30 years ago, when computer-based controllers started to appear. These autotuners are often based on simple models and simple tuning rules. With the computer power available today, and the increased knowledge about PID controller design, there is a potential for improving the autotuners, and more efficient autotuners will probably appear in industrial products quite soon.

Cross-References

► Adaptive Control, Overview
► PID Control

Bibliography

Åström KJ, Hägglund T (1995) PID controllers: theory, design, and tuning. ISA – The Instrumentation, Systems, and Automation Society, Research Triangle Park
Åström KJ, Hägglund T (2005) Advanced PID control. ISA – The Instrumentation, Systems, and Automation Society, Research Triangle Park
Vilanova R, Visioli A (eds) (2012) PID control in the third millennium. Springer, Dordrecht
Visioli A (2006) Practical PID control. Springer, London
Yu C-C (2006) Autotuning of PID controllers – a relay feedback approach. Springer, London

Averaging Algorithms and Consensus

Wei Ren
Department of Electrical Engineering, University of California, Riverside, CA, USA

Abstract

In this article, we overview averaging algorithms and consensus in the context of distributed coordination and control of networked systems. The two subjects are closely related but not identical. Distributed consensus means that a team of agents reaches an agreement on certain variables of interest by interacting with their neighbors. Distributed averaging aims at computing the average of certain variables of interest among multiple agents by local communication. Hence averaging can be treated as a special case of consensus – average consensus. For distributed consensus, we introduce distributed algorithms for agents with single-integrator, general linear, and nonlinear dynamics. For distributed averaging, we introduce static and dynamic averaging algorithms. The former is useful for computing the average of initial conditions (or constant signals), while the latter is useful for computing the average of time-varying signals. Future research directions are also discussed.

Keywords

Cooperative control; Coordination; Distributed control; Multi-agent systems; Networked systems

Introduction

In the area of control of networked systems, low cost, high adaptivity and scalability, great robustness, and easy maintenance are critical factors.
To achieve these factors, distributed coordination and control algorithms that rely on only local interaction between neighboring agents to achieve collective group behavior are more favorable than centralized ones. In this article, we overview averaging algorithms and consensus in the context of distributed coordination and control of networked systems.

Distributed consensus means that a team of agents reaches an agreement on certain variables of interest by interacting with their neighbors. A consensus algorithm is an update law that drives the variables of interest of all agents in the network to converge to a common value (Jadbabaie et al. 2003; Olfati-Saber et al. 2007; Ren and Beard 2008). Examples of the variables of interest include a local representation of the center and shape of a formation, the rendezvous time, the length of a perimeter being monitored, the direction of motion for a multi-vehicle swarm, and the probability that a target has been identified. Consensus algorithms have applications in rendezvous, formation control, flocking, attitude alignment, and sensor networks (Bai et al. 2011a; Bullo et al. 2009; Mesbahi and Egerstedt 2010; Qu 2009; Ren and Cao 2011). Distributed averaging algorithms aim at computing the average of certain variables of interest among multiple agents by local communication. Distributed averaging finds applications in distributed computing, distributed signal processing, and distributed optimization (Tsitsiklis et al. 1986). The variables of interest are hence dependent on the application (e.g., a sensor measurement or a network quantity). Consensus and averaging algorithms are closely connected and yet nonidentical. When all agents are able to compute the average, they essentially reach a consensus, the so-called average consensus. On the other hand, when the agents reach a consensus, the consensus value might or might not be the average value.

Graph Theory Notations. Suppose that there are n agents in a network. A network topology (equivalently, graph) G consisting of a node set V = {1, …, n} and an edge set E ⊆ V × V will be used to model interaction (communication or sensing) between the n agents. An edge (i, j) in a directed graph denotes that agent j can obtain information from agent i, but not necessarily vice versa. In contrast, an edge (i, j) in an undirected graph denotes that agents i and j can obtain information from each other. Agent j is an (in-)neighbor of agent i if (j, i) ∈ E. Let N_i denote the neighbor set of agent i. We assume that i ∈ N_i. A directed path is a sequence of edges in a directed graph of the form (i₁, i₂), (i₂, i₃), …, where i_j ∈ V. An undirected path in an undirected graph is defined analogously. A directed graph is strongly connected if there is a directed path from every agent to every other agent. An undirected graph is connected if there is an undirected path between every pair of distinct agents. A directed graph has a directed spanning tree if there exists at least one agent that has directed paths to all other agents. For example, Fig. 1 shows a directed graph that has a directed spanning tree but is not strongly connected. The adjacency matrix A = [a_ij] ∈ ℝ^{n×n} associated with G is defined such that a_ij (the weight of edge (j, i)) is positive if agent j is a neighbor of agent i, while a_ij = 0 otherwise. The (nonsymmetric) Laplacian matrix (Agaev and Chebotarev 2005) L = [ℓ_ij] ∈ ℝ^{n×n} associated with A and hence G is defined as ℓ_ii = Σ_{j≠i} a_ij and ℓ_ij = −a_ij for all i ≠ j. For an undirected graph, we assume that a_ij = a_ji. A graph is balanced if for every agent the total edge weight of its incoming links equals the total edge weight of its outgoing links (Σ_{j=1}^n a_ij = Σ_{j=1}^n a_ji for all i).

Averaging Algorithms and Consensus, Fig. 1 A directed graph that characterizes the interaction among five agents, where A_i, i = 1, …, 5, denotes agent i. An arrow from agent j to agent i indicates that agent i receives information from agent j. The directed graph has a directed spanning tree but is not strongly connected. Here both agents 1 and 2 have directed paths to all other agents
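The spanning-tree condition can be checked directly from the spectrum of L. A minimal sketch, assuming a hypothetical five-agent edge set consistent with the description of Fig. 1 (agents 1 and 2 can reach all others; the graph is not strongly connected) — the exact edges and weights of the figure are not reproduced here:

```python
import numpy as np

# Hypothetical digraph in the spirit of Fig. 1: agents 1 and 2 listen to
# each other, and information flows 1 -> 3 -> 4 and 2 -> 5 (0-indexed below).
n = 5
A = np.zeros((n, n))            # a_ij > 0 iff agent j is a neighbor of agent i
A[1, 0] = A[0, 1] = 1.0         # 1 <-> 2
A[2, 0] = 1.0                   # 1 -> 3
A[3, 2] = 1.0                   # 3 -> 4
A[4, 1] = 1.0                   # 2 -> 5
L = np.diag(A.sum(axis=1)) - A  # nonsymmetric Laplacian: l_ii = sum_j a_ij
eigs = np.linalg.eigvals(L)
# A simple zero eigenvalue certifies a directed spanning tree.
assert np.sum(np.abs(eigs) < 1e-9) == 1
assert np.allclose(L @ np.ones(n), 0)  # right eigenvector of all ones
```

For this graph the eigenvalues come out as {0, 2, 1, 1, 1}: one simple zero, all others with positive real parts, matching statement (iii) below.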
Consensus

Consensus has a long history in management science, statistical physics, and distributed computing and has found renewed interest in distributed control. While in the area of distributed control of networked systems the term consensus initially referred more or less exclusively to a continuous-time version of a distributed linear averaging algorithm, the term has since been broadened to a great extent. Problems related to consensus include synchronization, agreement, and rendezvous. The study of consensus can be categorized in various manners. For example, in terms of the final consensus value, the agents could reach a consensus on the average, a weighted average, the maximum value, the minimum value, a general function of their initial conditions, or even a (changing) state that serves as a reference. A consensus algorithm could be linear or nonlinear. Consensus algorithms can be designed for agents with linear or nonlinear dynamics. As the agent dynamics become more complicated, so do the algorithm design and analysis. Numerous issues are also involved in consensus, such as network topologies (fixed vs. switching, deterministic vs. random, directed vs. undirected, asynchronous vs. synchronous), time delay, quantization, optimality, sampling effects, and convergence speed. For example, in real applications, due to nonuniform communication/sensing ranges or the limited field of view of sensors, the network topology could be directed rather than undirected. Also, due to unreliable communication/sensing and limited communication/sensing ranges, the network topology could be switching rather than fixed.

Consensus for Agents with Single-Integrator Dynamics

We start with a fundamental consensus algorithm for agents with single-integrator dynamics. The results in this section follow from Jadbabaie et al. (2003), Olfati-Saber et al. (2007), Ren and Beard (2008), Moreau (2005), and Agaev and Chebotarev (2000). Consider agents with single-integrator dynamics

    ẋ_i(t) = u_i(t),   i = 1, …, n,   (1)

where x_i is the state and u_i is the control input. A common consensus algorithm for (1) is

    u_i(t) = Σ_{j ∈ N_i(t)} a_ij(t) [x_j(t) − x_i(t)],   (2)

where N_i(t) is the neighbor set of agent i at time t and a_ij(t) is the (i, j) entry of the adjacency matrix A of the graph G at time t. A consequence of (2) is that the state x_i(t) of agent i is driven toward the states of its neighbors or, equivalently, toward the weighted average of its neighbors' states. The closed-loop system of (1) using (2) can be written in matrix form as

    ẋ(t) = −L(t) x(t),   (3)

where x is the column stack vector of all x_i and L is the Laplacian matrix. Consensus is reached if, for all initial states, the agents' states eventually become identical. That is, for all x_i(0), ‖x_i(t) − x_j(t)‖ approaches zero eventually.

The properties of the Laplacian matrix L play an important role in the analysis of the closed-loop system (3). When the graph G (and hence the associated Laplacian matrix L) is fixed, (3) can be analyzed by studying the eigenvalues and eigenvectors of L. Due to its special structure, for any graph G, the associated Laplacian matrix L has at least one zero eigenvalue with an associated right eigenvector 1 (the column vector of all ones), and all other eigenvalues have positive real parts. To ensure consensus, it is equivalent to ensure that L has a simple zero eigenvalue. It can be shown that the following three statements are equivalent: (i) the agents reach a consensus exponentially for arbitrary initial states; (ii) the graph G has a directed spanning tree; and (iii) the Laplacian matrix L has a simple zero eigenvalue with an associated right eigenvector 1, and all other eigenvalues have positive real parts. When consensus is reached, the final consensus value is a weighted average of the initial states of those agents that have directed paths to all other agents (see Fig. 2 for an illustration).
Averaging Algorithms and Consensus, Fig. 2 Consensus for five agents using the algorithm (2) for (1). Here the graph G is given by Fig. 1. The initial states are chosen as x_i(0) = 2i, where i = 1, …, 5. Consensus is reached as G has a directed spanning tree. The final consensus value is a weighted average of the initial states of agents 1 and 2

When the graph G(t) is switching at time instants t₀, t₁, …, the solution to the closed-loop system (3) is given by x(t) = Φ(t, 0) x(0), where Φ(t, 0) is the transition matrix corresponding to −L(t). Consensus is reached if Φ(t, 0) eventually converges to a matrix with identical rows. Here Φ(t, 0) = Φ(t, t_k) Φ(t_k, t_{k−1}) ⋯ Φ(t₁, 0), where Φ(t_k, t_{k−1}) is the transition matrix corresponding to −L(t) on the time interval [t_{k−1}, t_k]. It turns out that each transition matrix is a row-stochastic matrix with positive diagonal entries. A square matrix is row stochastic if all its entries are nonnegative and all of its row sums are one. The consensus convergence can then be analyzed by studying the product of row-stochastic matrices. Another analysis technique is a Lyapunov approach (e.g., with max_i x_i − min_i x_i as a Lyapunov function). It can be shown that the agents' states reach a consensus if there exists an infinite sequence of contiguous, uniformly bounded time intervals with the property that, across each such interval, the union of the graphs G(t) has a directed spanning tree. That is, across each such interval, there exists at least one agent that can directly or indirectly influence all other agents. It is also possible to achieve certain additional features by designing nonlinear consensus algorithms of the form u_i(t) = Σ_{j ∈ N_i(t)} a_ij(t) φ[x_j(t) − x_i(t)], where φ(·) is a nonlinear function satisfying certain properties; one example is a continuous nondecreasing odd function. For instance, a saturation-type function could be introduced to account for actuator saturation, and a signum-type function could be introduced to achieve finite-time convergence.

As shown above, for single-integrator dynamics, the consensus convergence is determined entirely by the network topologies. The primary reason is that the single-integrator dynamics are internally stable. However, when more complicated agent dynamics are involved, the consensus algorithm design and analysis become more involved as well. On the one hand, whether the graph is undirected (respectively, switching) or not has a significant influence on the complexity of the consensus analysis. On the other hand, not only the network topology but also the agent dynamics themselves and the parameters in the consensus algorithm play important roles. Next we introduce consensus for agents with general linear and nonlinear dynamics.

Consensus for Agents with General Linear Dynamics

In some circumstances, it is relevant to deal with agents with general linear dynamics, which can also be regarded as linearized models of certain nonlinear dynamics. The results in this section follow from Li et al. (2010). Consider agents with general linear dynamics

    ẋ_i = A x_i + B u_i,   y_i = C x_i,   (4)
where x_i ∈ ℝ^m, u_i ∈ ℝ^p, and y_i ∈ ℝ^q are, respectively, the state, the control input, and the output of agent i, and A, B, C are constant matrices with compatible dimensions.

When each agent has access to the relative states between itself and its neighbors, a distributed static consensus algorithm is designed for (4) as

    u_i = cK Σ_{j ∈ N_i} a_ij (x_i − x_j),   (5)

where c > 0 is a coupling gain, K ∈ ℝ^{p×m} is the feedback gain matrix, and N_i and a_ij are defined as in (2). It can be shown that if the graph G has a directed spanning tree, consensus is reached using (5) for (4) if and only if all the matrices A + c λ_i(L) BK, where λ_i(L) ≠ 0, are Hurwitz. Here λ_i(L) denotes the i-th eigenvalue of the Laplacian matrix L. A necessary condition for reaching a consensus is that the pair (A, B) is stabilizable. The consensus algorithm (5) can be designed via two steps:
(a) Solve the linear matrix inequality AP + PAᵀ − 2BBᵀ < 0 to get a positive-definite solution P. Then let the feedback gain matrix K = −Bᵀ P⁻¹.
(b) Select the coupling strength c larger than the threshold value 1 / min_{λ_i(L) ≠ 0} Re[λ_i(L)], where Re(·) denotes the real part.
Note that here the threshold value depends on the eigenvalues of the Laplacian matrix, which is in some sense global information. To overcome this limitation, it is possible to introduce adaptive gains in the algorithm design. The gains could then be updated dynamically using local information.

When the relative states between each agent and its neighbors are not available, one is motivated to make use of the output information and employ an observer-based design to estimate the relative states. An observer-type consensus algorithm is designed for (4) as

    v̇_i = (A + BF) v_i + cL Σ_{j ∈ N_i} a_ij [C(v_i − v_j) − (y_i − y_j)],
    u_i = F v_i,   i = 1, …, n,   (6)

where v_i ∈ ℝ^m are the observer states, F ∈ ℝ^{p×m} and L ∈ ℝ^{m×q} are feedback gain matrices, and c > 0 is a coupling gain. The algorithm (6) uses not only the relative outputs between each agent and its neighbors but also its own and its neighbors' observer states. While relative outputs could be obtained through local measurements, the neighbors' observer states can only be obtained via communication. It can be shown that if the graph G has a directed spanning tree, consensus is reached using (6) for (4) if the matrices A + BF and A + c λ_i(L) LC, where λ_i(L) ≠ 0, are Hurwitz. The observer-type consensus algorithm (6) can be seen as an extension of the single-system observer design to multi-agent systems. The separation principle of traditional observer design still holds in the multi-agent setting, in the sense that the feedback gain matrices F and L can be designed separately.

Consensus for Agents with Nonlinear Dynamics

In multi-agent applications, agents usually represent physical vehicles with special dynamics, especially nonlinear dynamics for the most part. Examples include Lagrangian systems for robotic manipulators and autonomous robots, nonholonomic systems for unicycles, attitude dynamics for rigid bodies, and general nonlinear systems. Similar to the consensus algorithms for linear multi-agent systems, the consensus algorithms used for these nonlinear agents are often designed based on state differences between each agent and its neighbors. But due to the inherent nonlinearity, the problem is more complicated, and additional terms might be required in the algorithm design. The main techniques used in the consensus analysis for nonlinear multi-agent systems are often Lyapunov-based techniques (Lyapunov functions, passivity theory, nonlinear contraction analysis, and potential functions).
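The two-step design of the static algorithm (5) can be sketched numerically. In the sketch below, step (a) is handled through a standard continuous-time algebraic Riccati equation — an assumption of this sketch, chosen as one convenient way to produce a solution of the inequality (the double-integrator model and the three-agent path graph are likewise illustrative, not taken from the text):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative agent model: a double integrator (assumed for this sketch).
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
# Step (a): solving A^T X + X A - X B B^T X + I = 0 and taking P = X^{-1}
# yields A P + P A^T - 2 B B^T < 0, with feedback gain K = -B^T P^{-1} = -B^T X.
X = solve_continuous_are(A, B, np.eye(2), np.eye(1))
K = -B.T @ X
# Directed path graph 1 -> 2 -> 3, which has a directed spanning tree.
Lap = np.array([[0.0, 0.0, 0.0],
                [-1.0, 1.0, 0.0],
                [0.0, -1.0, 1.0]])
nonzero = [lam for lam in np.linalg.eigvals(Lap) if abs(lam) > 1e-9]
# Step (b): pick c above the threshold 1 / min Re(lambda_i(L)).
c = 1.5 / min(lam.real for lam in nonzero)
for lam in nonzero:
    closed = A + c * lam * (B @ K)
    assert np.linalg.eigvals(closed).real.max() < 0  # Hurwitz, as required
```

The loop at the end checks exactly the condition stated above: every matrix A + cλ_i(L)BK associated with a nonzero Laplacian eigenvalue is Hurwitz.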
Early results on consensus for agents with nonlinear dynamics primarily focus on undirected graphs, exploiting the symmetry to facilitate the construction of Lyapunov function candidates. Unfortunately, the extension from an undirected graph to a directed one is nontrivial. For example, the directed graph does not preserve the passivity properties in general. Moreover, the directed graph could cause difficulties in the design of (positive-definite) Lyapunov functions. One approach is to integrate the nonnegative left eigenvector of the Laplacian matrix associated with the zero eigenvalue into the Lyapunov function, which is valid for strongly connected graphs and has been applied in some problems. Another approach is based on sliding-mode control. The idea is to design a sliding surface for reaching a consensus. Taking multiple Lagrangian systems as an example, the agent dynamics are represented by

    M_i(q_i) q̈_i + C_i(q_i, q̇_i) q̇_i + g_i(q_i) = τ_i,   i = 1, …, n,   (7)

where q_i ∈ ℝ^p is the vector of generalized coordinates, M_i(q_i) ∈ ℝ^{p×p} is the symmetric positive-definite inertia matrix, C_i(q_i, q̇_i) q̇_i ∈ ℝ^p is the vector of Coriolis and centrifugal torques, g_i(q_i) ∈ ℝ^p is the vector of gravitational torques, and τ_i ∈ ℝ^p is the vector of control torques on the i-th agent. The sliding surface can be designed as

    s_i = q̇_i − q̇_{ri} = q̇_i + α Σ_{j ∈ N_i} a_ij (q_i − q_j),   (8)

where α is a positive scalar. Note that when s_i = 0, (8) is actually the closed-loop system of a consensus algorithm for single integrators. Then, if the control torque τ_i can be designed using only local information from neighbors to drive s_i to zero, consensus will be reached, as s_i can be treated as a vanishing disturbance to a system that reaches consensus exponentially.

It is generally very challenging to deal with general directed or switching graphs for agents with dynamics more complicated than single-integrator dynamics. In some cases, the challenge can be overcome by introducing and updating additional auxiliary variables (often observer-based algorithms) and exchanging these variables between neighbors (see, e.g., (6)). In the algorithm design, the agents might use not only relative physical states between neighbors but also local auxiliary variables from the neighbors. While relative physical states could be obtained through sensing, the exchange of auxiliary variables can only be achieved by communication. Hence such generalization is obtained at the price of increased communication between neighboring agents. Unlike some other algorithms, it is then generally impossible to implement the algorithm relying on purely relative sensing between neighbors without the need for communication.

Averaging Algorithms

Existing distributed averaging algorithms are primarily static averaging algorithms based on linear local-average iterations or gossip iterations. These algorithms are capable of computing the average of the initial conditions of all agents (or constant signals) in a network. In particular, the linear local-average-iteration algorithms are usually synchronous, where at each iteration each agent repeatedly updates its state to be the average of those of its neighbors. The gossip algorithms are asynchronous, where at each iteration a random pair of agents is selected to exchange their states and update them to be the average of the two. Dynamic averaging algorithms are of significance when there exist time-varying signals. The objective is then to compute the average of these time-varying signals in a distributed manner.

Static Averaging

Take a linear local-average-iteration algorithm as an example. The results in this section follow from Tsitsiklis et al. (1986), Jadbabaie et al. (2003), and Olfati-Saber et al. (2007). Let x_i be the information state of agent i. A linear local-average-iteration-type algorithm has the form

    x_i[k+1] = Σ_{j ∈ N_i[k]} a_ij[k] x_j[k],   i = 1, …, n,   (9)

where k denotes a communication event, N_i[k] denotes the neighbor set of agent i, and a_ij[k] is the (i, j) entry of the adjacency matrix A of the graph G that represents the communication
Averaging Algorithms and Consensus 61

Signal r1(t) Signal r2(t) Signal r3(t)

A
A1 A2 A3

Signal r8(t) 1 8 Signal r4(t)


A8 Average: 8 i=1 ri(t) A4

A7 A6 A5

Signal r7(t) Signal r6(t) Signal r5(t)

Averaging Algorithms and Consensus, Fig. 3 Illustra- varying) signal associated with agent i . Each agent needs
tion of distributed averaging of multiple (time-varying) to compute the average of all agents’ signals but can
signals. Here Ai denotes agent i and ri .t / denotes a (time- communicate with only its neighbors

topology at time k, with the additional assump- can be analyzed by studying the eigenvalues
tion that A is row stochastic and ai i Œk > 0 for and eigenvectors of the row-stochastic matrix A.
all i D 1; : : : ; n. Intuitively, the information state Because all diagonal entries of A are positive,
of each agent is updated as the weighted average Gershgorin’s disc theorem implies that all eigen-
of its current state and the current states of its values of A are either within the open unit disk or
neighbors at each iteration. Note that an agent at one. When the graph G is strongly connected,
maintains its current state if it does not exchange the Perron-Frobenius theorem implies that A has
information with other agents at that event in- a simple eigenvalue at one with an associated
stant. In fact, a discretized version of the closed- right eigenvector 1 and an associated positive left
loop system of (1) using (2) (with a sufficiently eigenvector. Hence when G is strongly connected,
small sampling period) takes in the form of (9). it turns out that limk!1 Ak D 1 T , where  T is a
The objective here is for all agents to compute the positive left eigenvector of A associated with the
average of their initial states by communicating eigenvalue one and satisfies  T 1 D 1. Note that
with only their neighbors. That is, each xi Œk xŒk D Ak xŒ0. Hence, each agent’s state xi Œk
P
approaches n1 nj D1 xj Œ0 eventually. To compute approaches  T xŒ0 eventually. If it can be further
the average of multiple constant signals ci , we ensured that  D n1 1, then averaging is achieved.
could simply set xi Œ0 D ci . The algorithm (9) It can be shown that the agents’ states converge to
can be written in matrix form as xŒk C 1 D the average of their initial values if and only if the
AŒkxŒk, where x is a column stack vector of all directed graph G is both strongly connected and
xi and AŒk D Œaij Œk is a row-stochastic matrix. balanced or the undirected graph G is connected.
When the graph G (and hence the matrix A) When the graph is switching, the convergence of
is fixed, the convergence of the algorithm (9) the algorithm (9) can be analyzed by studying the
62 Averaging Algorithms and Consensus

product of row-stochastic matrices. Such analysis where ˛ is a positive scalar, Ni denotes the
is closely related to Markov chains. It can be neighbor set of agent i , sgn./ is the signum func-
shown that the agents’ states converge to the tion defined componentwise, i is the internal
average of their initial values if the directed state of the estimator with i .0/ D 0, and xi
graph G is balanced at each communication event is the estimate of the average rN .t/. Due to the
and strongly connected in a joint manner or the existence of the discontinuous signum function,
undirected graph G is jointly connected. the solution of (10) is understood in the Filippov
sense (Cortes 2008).
The idea behind the algorithm (10) is as
Dynamic Averaging
In a more general setting, there exist $n$ time-varying signals, $r_i(t)$, $i = 1, \ldots, n$, which could be an external signal or an output from a dynamical system. Here $r_i(t)$ is available to only agent $i$, and each agent can exchange information with only its neighbors. Each agent maintains a local estimate, denoted by $x_i(t)$, of the average of all the signals, $\bar r(t) = \frac{1}{n}\sum_{k=1}^n r_k(t)$. The objective is to design a distributed algorithm for agent $i$ based on $r_i(t)$ and $x_j(t)$, $j \in N_i(t)$, such that all agents will finally track the average that changes over time; that is, $\|x_i(t) - \bar r(t)\|$, $i = 1, \ldots, n$, approaches zero eventually. Such a dynamic averaging idea finds applications in distributed sensor fusion with time-varying measurements (Bai et al. 2011b; Spanos and Murray 2005) and distributed estimation and tracking (Yang et al. 2008).

Figure 3 illustrates the dynamic averaging idea. If there exists a central station that can always access the signals of all agents, then it is trivial to compute the average. Unfortunately, in a distributed context, where there does not exist a central station and each agent can only communicate with its local neighbors, it is challenging for each agent to compute the average that changes over time. While each agent could compute the average of its own and local neighbors' signals, this will not be the average of all signals.

When the signal $r_i(t)$ can be arbitrary but its derivative exists and is bounded almost everywhere, a distributed nonlinear nonsmooth algorithm is designed in Chen et al. (2012) as

$$\dot\zeta_i(t) = \alpha \sum_{j \in N_i} \mathrm{sgn}[x_j(t) - x_i(t)], \qquad x_i(t) = \zeta_i(t) + r_i(t), \quad i = 1, \ldots, n. \qquad (10)$$

The main properties of algorithm (10) are as follows. First, (10) is designed to ensure that $\sum_{i=1}^n \dot x_i(t) = \sum_{i=1}^n \dot r_i(t)$ holds for all time. Note that $\sum_{i=1}^n x_i(t) = \sum_{i=1}^n \zeta_i(t) + \sum_{i=1}^n r_i(t)$. When the graph $G$ is undirected and $\zeta_i(0) = 0$, it follows that $\sum_{i=1}^n \zeta_i(t) = \sum_{i=1}^n \zeta_i(0) + \alpha \sum_{i=1}^n \sum_{j \in N_i} \int_0^t \mathrm{sgn}[x_j(\tau) - x_i(\tau)]\,d\tau = 0$. As a result, $\sum_{i=1}^n x_i(t) = \sum_{i=1}^n r_i(t)$ holds for all time. Second, when $G$ is connected, if the algorithm (10) guarantees that all estimates $x_i$ approach the same value in finite time, then it can be guaranteed that each estimate approaches the average of all signals in finite time.

Summary and Future Research Directions

Averaging algorithms and consensus play an important role in distributed control of networked systems. While there is significant progress in this direction, there are still numerous open problems. For example, it is challenging to achieve averaging when the graph is not balanced. It is generally not clear how to deal with a general directed or switching graph for nonlinear agents or nonlinear algorithms when the algorithms are based on only interagent physical state coupling, without the need for communicating additional auxiliary variables between neighbors. The study of consensus for multiple underactuated agents remains a challenge. Furthermore, when the agents' dynamics are heterogeneous, it is challenging to design consensus algorithms. In addition, in the existing study it is often assumed that the agents are cooperative; when there exist faulty or malicious agents, the problem becomes more involved.
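The behavior of algorithm (10) is easy to reproduce numerically. The sketch below (an illustration written for this entry, not taken from the cited references) integrates (10) with a forward-Euler scheme on a hypothetical 4-agent undirected ring graph; the gain $\alpha$, the step size, and the reference signals $r_i$ are illustrative assumptions.

```python
import numpy as np

# Forward-Euler simulation of the distributed average-tracking
# algorithm (10) on a hypothetical 4-agent undirected ring graph.
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
n, alpha, dt, T = 4, 5.0, 1e-3, 2.0

def r(t):
    # time-varying reference signals with bounded derivatives
    return np.array([np.sin(t), np.cos(t), 0.5 * t, 1.0])

zeta = np.zeros(n)              # internal states, zeta_i(0) = 0
t = 0.0
while t < T:
    x = zeta + r(t)             # local estimates x_i = zeta_i + r_i
    dzeta = np.array([alpha * sum(np.sign(x[j] - x[i]) for j in neighbors[i])
                      for i in range(n)])
    zeta += dt * dzeta          # Euler step of (10)
    t += dt

x = zeta + r(t)
err = np.max(np.abs(x - np.mean(r(t))))
print(err)                      # small: each x_i tracks the time-varying average
```

Because the sgn coupling is antisymmetric over an undirected graph, each Euler step preserves $\sum_i x_i(t) = \sum_i r_i(t)$ exactly, and for $\alpha$ large relative to the signal derivatives the estimates settle into a small chattering band around the time-varying average $\bar r(t)$; the band shrinks with the step size.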
Averaging Algorithms and Consensus 63

Cross-References

Distributed Optimization
Dynamic Graphs, Connectivity of
Flocking in Networked Systems
Graphs for Modeling Networked Interactions
Networked Systems
Oscillator Synchronization
Vehicular Chains

Bibliography

Agaev R, Chebotarev P (2000) The matrix of maximum out forests of a digraph and its applications. Autom Remote Control 61(9):1424–1450
Agaev R, Chebotarev P (2005) On the spectra of nonsymmetric Laplacian matrices. Linear Algebra Appl 399:157–178
Bai H, Arcak M, Wen J (2011a) Cooperative control design: a systematic, passivity-based approach. Springer, New York
Bai H, Freeman RA, Lynch KM (2011b) Distributed Kalman filtering using the internal model average consensus estimator. In: Proceedings of the American control conference, San Francisco, pp 1500–1505
Bullo F, Cortes J, Martinez S (2009) Distributed control of robotic networks. Princeton University Press, Princeton
Chen F, Cao Y, Ren W (2012) Distributed average tracking of multiple time-varying reference signals with bounded derivatives. IEEE Trans Autom Control 57(12):3169–3174
Cortes J (2008) Discontinuous dynamical systems. IEEE Control Syst Mag 28(3):36–73
Jadbabaie A, Lin J, Morse AS (2003) Coordination of groups of mobile autonomous agents using nearest neighbor rules. IEEE Trans Autom Control 48(6):988–1001
Li Z, Duan Z, Chen G, Huang L (2010) Consensus of multiagent systems and synchronization of complex networks: a unified viewpoint. IEEE Trans Circuits Syst I Regul Pap 57(1):213–224
Mesbahi M, Egerstedt M (2010) Graph theoretic methods for multiagent networks. Princeton University Press, Princeton
Moreau L (2005) Stability of multi-agent systems with time-dependent communication links. IEEE Trans Autom Control 50(2):169–182
Olfati-Saber R, Fax JA, Murray RM (2007) Consensus and cooperation in networked multi-agent systems. Proc IEEE 95(1):215–233
Qu Z (2009) Cooperative control of dynamical systems: applications to autonomous vehicles. Springer, London
Ren W, Beard RW (2008) Distributed consensus in multi-vehicle cooperative control. Springer, London
Ren W, Cao Y (2011) Distributed coordination of multi-agent networks. Springer, London
Spanos DP, Murray RM (2005) Distributed sensor fusion using dynamic consensus. In: Proceedings of the IFAC world congress, Prague
Tsitsiklis JN, Bertsekas DP, Athans M (1986) Distributed asynchronous deterministic and stochastic gradient optimization algorithms. IEEE Trans Autom Control 31(9):803–812
Yang P, Freeman RA, Lynch KM (2008) Multi-agent coordination by decentralized estimation and control. IEEE Trans Autom Control 53(11):2480–2496
Backward Stochastic Differential Equations and Related Control Problems

Shige Peng
Shandong University, Jinan, Shandong Province, China

Synonyms

BSDE

Abstract

A conditional expectation of the form $Y_t = E[\xi + \int_t^T f_s\,ds \mid \mathcal F_t]$ is regarded as a simple and typical example of backward stochastic differential equation (abbreviated by BSDE). BSDEs are widely applied to formulate and solve problems related to stochastic optimal control, stochastic games, and stochastic valuation.

Keywords

Brownian motion; Feynman-Kac formula; Lipschitz condition; Optimal stopping

Definition

A typical real valued backward stochastic differential equation defined on a time interval $[0, T]$ and driven by a $d$-dimensional Brownian motion $B$ is

$$dY_t = -f(t, Y_t, Z_t)\,dt + Z_t\,dB_t, \qquad Y_T = \xi,$$

or its integral form

$$Y_t = \xi + \int_t^T f(s, \omega, Y_s, Z_s)\,ds - \int_t^T Z_s\,dB_s, \qquad (1)$$

where $\xi$ is a given random variable depending on the (canonical) Brownian path $B_t(\omega) = \omega(t)$ on $[0, T]$, and $f(t, \omega, y, z)$ is a given function of the time $t$, the Brownian path $\omega$ on $[0, t]$, and the pair of variables $(y, z) \in \mathbb R^m \times \mathbb R^{m\times d}$. A solution of this BSDE is a pair of stochastic processes $(Y_t, Z_t)$ satisfying the above equation on $[0, T]$ together with the following constraint: for each $t$, the value of $Y_t(\omega)$, $Z_t(\omega)$ depends only on the Brownian path $\omega$ on $[0, t]$. Notice that, because of this constraint, the extra freedom $Z_t$ is needed. For simplicity we set $d = m = 1$.

Often square-integrable conditions for $\xi$ and $f$ and a Lipschitz condition for $f$ with respect to $(y, z)$ are assumed, under which there exists a unique square-integrable solution $(Y_t, Z_t)$ on $[0, T]$ (existence and uniqueness theorem of BSDE). We can also consider a multidimensional process $Y$ and/or a multidimensional Brownian motion $B$, $L^p$-integrable conditions ($p \ge 1$) for $\xi$ and $f$, as well as local Lipschitz conditions of $f$ with respect to $(y, z)$. If $Y_t$ is real valued, we often call the equation a real valued BSDE.

J. Baillieul, T. Samad (eds.), Encyclopedia of Systems and Control, DOI 10.1007/978-1-4471-5058-9, © Springer-Verlag London 2015
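To make the definition concrete, the following sketch (an illustration added here, not part of the original article) solves a simple real valued BSDE numerically: the Brownian motion is approximated by a binomial random walk, and $(Y_t, Z_t)$ is computed by backward induction, with the conditional expectation replaced by the average over the two successor nodes and $Z_t \approx E[Y_{t+\Delta t}\,\Delta B_t]/\Delta t$. The linear generator $f(t, y, z) = -(ry + \theta z)$ and the terminal condition $\xi = \max(B_T, 0)$ are illustrative assumptions for which the exact value $Y_0 = e^{-rT} E[\max(G - \theta T, 0)]$, $G \sim N(0, T)$, is available for comparison.

```python
import math

# Binomial-tree backward induction for the BSDE
#   Y_t = xi + int_t^T f(s, Y_s, Z_s) ds - int_t^T Z_s dB_s
# with f(t, y, z) = -(r*y + theta*z) and xi = max(B_T, 0).
# All parameter values are illustrative assumptions.
T, N, r, theta = 1.0, 1000, 0.05, 0.2
dt = T / N
sdt = math.sqrt(dt)
phi = lambda b: max(b, 0.0)

# Terminal layer: B_T takes the values (2k - N) * sdt, k = 0, ..., N.
Y = [phi((2 * k - N) * sdt) for k in range(N + 1)]
for n in range(N, 0, -1):                  # step back from layer n to n-1
    Ynew = []
    for k in range(n):
        Ey = 0.5 * (Y[k + 1] + Y[k])       # E[Y_{t+dt} | F_t]
        Z = (Y[k + 1] - Y[k]) / (2.0 * sdt)   # E[Y_{t+dt} dB_t] / dt
        Ynew.append(Ey - dt * (r * Ey + theta * Z))  # Y_t = E[Y_{t+dt}] + f dt
    Y = Ynew

print(Y[0])
```

With $r = 0.05$, $\theta = 0.2$, $T = 1$, the exact value is $Y_0 \approx 0.2919$, which the recursion reproduces to within about $10^{-2}$; replacing the linear generator by a nonlinear $g(t, y, z)$ in the same recursion illustrates the general scheme.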
66 Backward Stochastic Differential Equations and Related Control Problems
We compare this BSDE with the classical stochastic differential equation (SDE):

$$dX_s = \sigma(X_s)\,dB_s + b(X_s)\,ds$$

with given initial condition $X_s|_{s=0} = x \in \mathbb R^n$. Its integral form is

$$X_t(\omega) = x + \int_0^t \sigma(X_s(\omega))\,dB_s(\omega) + \int_0^t b(X_s(\omega))\,ds. \qquad (2)$$

The linear backward stochastic differential equation was first introduced (Bismut 1973) in stochastic optimal control problems to solve the adjoint equation in the stochastic maximum principle of Pontryagin's type. The above existence and uniqueness theorem was obtained by Pardoux and Peng (1990). In the research domain of economics, this type of one-dimensional BSDE was also independently derived by Duffie and Epstein (1992). The comparison theorem of BSDE was obtained in Peng (1992) and improved in El Karoui et al. (1997a). The nonlinear Feynman-Kac formula was obtained in Peng (1991, 1992) and improved in Pardoux and Peng (1992). BSDE is applied as a nonlinear Black-Scholes option pricing formula in finance; this formulation was given in El Karoui et al. (1997b). We refer to the recent survey in Peng (2010) for more details.

Hedging and Risk Measuring in Finance

Let us consider the following hedging problem in a financial market with a typical model of continuous time asset price: the basic securities consist of two assets, a riskless one called bond, and a risky security called stock. Their prices are governed by $dP_t^0 = P_t^0 r\,dt$, for the bond, and

$$dP_t = P_t[b\,dt + \sigma\,dB_t], \quad \text{for the stock}.$$

Here we only consider the situation where the volatility rate $\sigma > 0$. The case of multidimensional stocks with degenerate volatility matrix $\sigma$ can be treated by constrained BSDE. Assume that a small investor, whose investment behavior cannot affect market prices, invests at time $t \in [0, T]$ the amount $\pi_t$ of his or her wealth $Y_t$ in the security and $\pi_t^0$ in the bond; thus $Y_t = \pi_t^0 + \pi_t$. If the investment strategy is self-financing, then we have $dY_t = \pi_t^0\,dP_t^0/P_t^0 + \pi_t\,dP_t/P_t$, thus

$$dY_t = (rY_t + \sigma\theta\pi_t)\,dt + \sigma\pi_t\,dB_t,$$

where $\theta = \sigma^{-1}(b - r)$. A strategy $(Y_t, \pi_t)_{t\in[0,T]}$ is said to be feasible if $Y_t \ge 0$, $t \in [0, T]$. A European path-dependent contingent claim $\xi$ settled at time $T$ is a given nonnegative function of the path, $\xi = \xi((P_t)_{t\in[0,T]})$. A feasible strategy $(Y, \pi)$ is called a hedging strategy against a contingent claim $\xi$ at the maturity $T$ if it satisfies

$$dY_t = (rY_t + \sigma\theta\pi_t)\,dt + \sigma\pi_t\,dB_t, \qquad Y_T = \xi.$$

This problem can be regarded as finding a stochastic control $\pi$ and an initial condition $Y_0$ such that the final state replicates the contingent claim $\xi$, i.e., $Y_T = \xi$. This type of replication is also called "exact controllability" in terms of stochastic control (see Peng 2005 for more general results).

Observe that $(Y, \pi)$ is the solution of the above BSDE. It is called a superhedging strategy if there exists an increasing process $K_t$, often called an accumulated consumption process, such that

$$dY_t = (rY_t + \sigma\theta\pi_t)\,dt + \sigma\pi_t\,dB_t - dK_t, \qquad Y_T = \xi.$$

This type of strategy is often applied in a constrained market in which a certain constraint $(Y_t, \pi_t) \in \Gamma$ is imposed. In fact a real market has many frictions and constraints. An example is the common case where the interest rate $R$ for borrowing money is higher than the bond rate $r$. The above equation for the hedging strategy becomes

$$dY_t = [rY_t + \sigma\theta\pi_t - (R - r)(\pi_t - Y_t)^+]\,dt + \sigma\pi_t\,dB_t, \qquad Y_T = \xi,$$
where $[\alpha]^+ = \max\{\alpha, 0\}$. A short selling constraint $\pi_t \ge 0$ is also a typical requirement in markets. The method of constrained BSDE can be applied to this type of problems. BSDE theory provides powerful tools for the robust pricing and risk measures of contingent claims (see El Karoui et al. 1997a). For the dynamic risk measure under Brownian filtration, see Rosazza Gianin (2006), Peng (2004), Barrieu and El Karoui (2005), Hu et al. (2005), and Delbaen et al. (2010).

Comparison Theorem

The comparison theorem, for a real valued BSDE, tells us that, if $(Y_t, Z_t)$ and $(\bar Y_t, \bar Z_t)$ are two solutions of BSDE (1) with terminal conditions $Y_T = \xi$, $\bar Y_T = \bar\xi$ such that $\xi(\omega) \ge \bar\xi(\omega)$, $\omega \in \Omega$, then one has $Y_t \ge \bar Y_t$. This theorem holds if $f$ and $\xi$, $\bar\xi$ satisfy the abovementioned $L^2$-integrability condition and $f$ is a Lipschitz function in $(y, z)$. This theorem plays the same important role as the maximum principle in PDE theory. The theorem also has several very interesting generalizations (see Buckdahn et al. 2000).

Stochastic Optimization and Two-Person Zero-Sum Stochastic Games

An important point of view is to regard an expectation value as a solution of a special type of BSDE. Consider an optimal control problem

$$\min_u J(u): \quad J(u) = E\left[\int_0^T l(X_s, u_s)\,ds + h(X_T)\right].$$

Here the state process $X$ is controlled by the control process $u_t$, which is valued in a (compact) control domain $U$, through the following $d$-dimensional SDE

$$dX_s = b(X_s, u_s)\,ds + \sigma(X_s)\,dB_s$$

defined on a Wiener probability space $(\Omega, \mathcal F, P)$ with the Brownian motion $B_t(\omega) = \omega(t)$, which is the canonical process. Here we only discuss the case $\sigma \equiv I_d$ for simplicity. Observe that in fact the expected value $J(u)$ is $Y_0^u = E[Y_0^u]$, where $Y_t^u$ solves the BSDE

$$Y_t^u = h(X_T) + \int_t^T l(X_s, u_s)\,ds - \int_t^T Z_s^u\,dB_s.$$

By the Girsanov transformation, under the probability measure $\tilde P$ defined by

$$\left.\frac{d\tilde P}{dP}\right|_T = \exp\left(-\int_0^T b(X_s, u_s)\,dB_s - \frac{1}{2}\int_0^T |b(X_s, u_s)|^2\,ds\right),$$

$X_t$ is a Brownian motion, and the above BSDE is changed to

$$Y_t^u = h(X_T) + \int_t^T \left[l(X_s, u_s) + \langle Z_s^u, b(X_s, u_s)\rangle\right]ds - \int_t^T Z_s^u\,dX_s,$$

where $\langle\cdot,\cdot\rangle$ is the Euclidean scalar product in $\mathbb R^d$. Notice that $P$ and $\tilde P$ are absolutely continuous with respect to each other. Compare this BSDE with the following one:

$$\hat Y_t = h(X_T) + \int_t^T H(X_s, \hat Z_s)\,ds - \int_t^T \hat Z_s\,dX_s, \qquad (3)$$

where $H(x, z) := \inf_{u\in U}\{l(x, u) + \langle z, b(x, u)\rangle\}$. It is a direct consequence of the comparison theorem of BSDE that $\hat Y_0 \le Y_0^u = J(u)$, for any admissible control $u_t$. Moreover, one can find a feedback control $\hat u$ such that $\hat Y_0 = J(\hat u)$.

The above BSDE method has been introduced to solve the following two-person zero-sum game (Hamadène and Lepeltier 1995):

$$\max_v \min_u J(u, v), \qquad J(u, v) = E\left[\int_0^T l(X_s, u_s, v_s)\,ds + h(X_T)\right]$$

with

$$dX_s = b(X_s, u_s, v_s)\,ds + dB_s,$$
where $(u_s, v_s)$ is formulated as above with compact control domains $u_s \in U$ and $v_s \in V$. In this case the equilibrium of the game exists if the following Isaacs condition is satisfied:

$$H(x, z) := \max_{v\in V}\inf_{u\in U}\{l(x, u, v) + \langle z, b(x, u, v)\rangle\} = \inf_{u\in U}\max_{v\in V}\{l(x, u, v) + \langle z, b(x, u, v)\rangle\},$$

and the equilibrium is also obtained through the BSDE (3) defined above.

Nonlinear Feynman-Kac Formula

A very interesting situation is when $f = g(X_t, y, z)$ and $Y_T = \varphi(X_T)$ in BSDE (1). In this case we have the following relation, called the "nonlinear Feynman-Kac formula":

$$Y_t = u(t, X_t), \qquad Z_t = \sigma^T(X_t)\nabla u(t, X_t),$$

where $u = u(t, x)$ is the solution of the following quasilinear parabolic PDE:

$$\partial_t u + Lu + g(x, u, \sigma^T\nabla u) = 0, \qquad (4)$$
$$u(x, T) = \varphi(x), \qquad (5)$$

where $L$ is the following, possibly degenerate, elliptic operator:

$$L\varphi(x) = \frac{1}{2}\sum_{i,j=1}^d a_{ij}(x)\,\partial^2_{x_i x_j}\varphi(x) + \sum_{i=1}^d b_i(x)\,\partial_{x_i}\varphi(x), \qquad a(x) = \sigma(x)\sigma^T(x).$$

The nonlinear Feynman-Kac formula can be used to solve a nonlinear PDE of the form (4)–(5) by a BSDE (1) coupled with an SDE (2).

A general principle is: once we solve a BSDE driven by a Markov process $X$ for which the terminal condition $Y_T$ at time $T$ depends only on $X_T$ and the generator $f(t, \omega, y, z)$ also depends on the state $X_t$ at each time $t$, then the corresponding solution of the BSDE is also state dependent, namely, $Y_t = u(t, X_t)$, where $u$ is the solution of the corresponding quasilinear PDE. Once $Y_T$ and $g$ are path functions of $X$, then the solution of the BSDE also becomes path dependent. In this sense, we can say that the PDE is in fact a "state-dependent BSDE," and BSDE gives us a new generalization of "path-dependent PDE" of parabolic and/or elliptic types. This principle was illustrated in Peng (2010) for both quasilinear and fully nonlinear situations.

Observe that BSDE (1) and forward SDE (2) are only partially coupled. A fully coupled system of SDE and BSDE is called a forward-backward stochastic differential equation (FBSDE). It has the following form:

$$dX_t = b(t, X_t, Y_t, Z_t)\,dt + \sigma(t, X_t, Y_t, Z_t)\,dB_t, \qquad X_0 = x \in \mathbb R^n,$$
$$dY_t = -f(t, X_t, Y_t, Z_t)\,dt + Z_t\,dB_t, \qquad Y_T = \varphi(X_T).$$

In general the Lipschitz assumptions for $b$, $\sigma$, $f$, and $\varphi$ with respect to $(x, y, z)$ are not enough. Ma et al. (1994) have proposed a four-step scheme method of FBSDE for the nondegenerate Markovian case with $\sigma$ independent of $Z$. For the case $\dim(x) = \dim(y) = n$, Hu and Peng (1995) proposed a new type of monotonicity condition; this method does not need to assume the coefficients to be deterministic. Peng and Wu (1999) have weakened the monotonicity condition. Observe that in the case where $b = \nabla_y H(x, y, z)$, $\sigma = \nabla_z H(x, y, z)$, and $f = \nabla_x H(x, y, z)$, for a given real valued function $H$ convex in $x$ and concave in $(y, z)$, the above FBSDE is called the stochastic Hamilton equation associated to a stochastic optimal control problem. We also refer to the book of Ma and Yong (1999) for a systematic exposition on this subject. For time-symmetric forward-backward stochastic differential equations and their relation with stochastic optimality, see Peng and Shi (2003) and Han et al. (2010).

Reflected BSDE and Optimal Stopping

If $(Y, Z)$ solves the BSDE

$$dY_s = -g(s, Y_s, Z_s)\,ds + Z_s\,dB_s - dK_s, \qquad Y_T = \xi, \qquad (6)$$
where $K$ is a càdlàg and increasing process with $K_0 = 0$ and $K_t \in L^2_P(\mathcal F_t)$, then $Y$, or $(Y, Z, K)$, is called a supersolution of the BSDE, or a $g$-supersolution. This notion is often used for constrained BSDEs. A typical situation is as follows: for a given continuous adapted process $(L_t)_{t\in[0,T]}$, find a smallest $g$-supersolution $(Y, Z, K)$ such that $Y_t \ge L_t$. This problem was initiated in El Karoui et al. (1997b). It is proved that this problem is equivalent to finding a triple $(Y, Z, K)$ satisfying (6) and the following reflecting condition of Skorohod type:

$$Y_s \ge L_s, \qquad \int_0^T (Y_s - L_s)\,dK_s = 0. \qquad (7)$$

In fact $\tau := \inf\{t \in [0, T]: K_t > 0\}$ is the optimal stopping time associated to this BSDE. A well-known example is the pricing of American options.

Moreover, a new type of nonlinear Feynman-Kac formula was introduced: if all coefficients are given as in the formulation of the above nonlinear Feynman-Kac formula and $L_s = \Phi(X_s)$, where $\Phi$ satisfies the same condition as $\varphi$, then we have $Y_s = u(s, X_s)$, where $u = u(t, x)$ is the solution of the following variational inequality:

$$\min\{\partial_t u + Lu + g(x, u, \sigma^T Du),\; u - \Phi\} = 0, \quad (t, x) \in [0, T] \times \mathbb R^n, \qquad (8)$$

with terminal condition $u|_{t=T} = \varphi$. They also demonstrated that this reflected BSDE is a powerful tool to deal with contingent claims of American type in a financial market with constraints.

BSDE reflected within two barriers, a lower one $L$ and an upper one $U$, was first investigated by Cvitanic and Karatzas (1996), where a type of nonlinear Dynkin game was formulated for a two-player model with zero-sum utility in which each player chooses his own optimal exit time. Stochastic optimal switching problems can also be solved by new types of oblique-reflected BSDEs.

A more general case of constrained BSDE is to find the smallest $g$-supersolution $(Y, Z, K)$ with constraint $(Y_t, Z_t) \in \Gamma_t$, where $\Gamma_t$ is a given subset for each $t \in [0, T]$ (El Karoui and Quenez 1995; Cvitanic and Karatzas 1993; El Karoui et al. 1997a), as for the problem of superhedging in a market with convex constrained portfolios (Cvitanic et al. 1998). The case with an arbitrary closed constraint was proved in Peng (1999).

Backward Stochastic Semigroup and g-Expectations

Let $\mathcal E^g_{t,T}[\xi] = Y_t$, where $Y$ is the solution of BSDE (1). Then $(\mathcal E^g_{t,T}[\cdot])_{0\le t\le T<\infty}$ has the (backward) semigroup property (Peng 1997)

$$\mathcal E^g_{s,t}\big[\mathcal E^g_{t,T}[\xi]\big] = \mathcal E^g_{s,T}[\xi], \qquad \mathcal E^g_{T,T}[\xi] = \xi, \quad 0 \le s \le t \le T.$$

For a real valued BSDE, by the comparison theorem, the semigroup is monotone: $\mathcal E^g_{t,T}[\xi] \ge \mathcal E^g_{t,T}[\bar\xi]$ if $\xi \ge \bar\xi$. If moreover $g|_{z=0} = 0$, then the semigroup is constant preserving: $\mathcal E^g_{t,T}[c] \equiv c$. Thus the semigroup forms in fact a nonlinear expectation, called $g$-expectation (since this nonlinear expectation is totally determined by the generator $g$).

This notion allows us to establish a nonlinear $g$-martingale theory, e.g., a $g$-supermartingale decomposition theorem. Peng (1999) claims that, if $Y$ is a square-integrable càdlàg $g$-supermartingale, then it has the unique decomposition: there exists a unique predictable, increasing, and càdlàg process $A$ such that $Y$ solves

$$-dY_t = g(t, Y_t, Z_t)\,dt + dA_t - Z_t\,dB_t.$$

A theoretically challenging and practically important problem is as follows: given an abstract family of expectations $(\mathcal E_{s,t}[\cdot])_{s\le t}$ satisfying the same backward semigroup properties as those of the $g$-expectation, can we find a function $g$ such that $\mathcal E_{s,t} \equiv \mathcal E^g_{s,t}$? Coquet et al. (2002) proved that if $\mathcal E$ is dominated by the $g_\mu$-expectation with $g_\mu(z) = \mu|z|$ for a large enough constant $\mu > 0$, then there exists a unique function $g = g(t, \omega, z)$ satisfying the $\mu$-Lipschitz condition such that $(\mathcal E_{s,t}[\cdot])_{s\le t}$ is in fact a $g$-expectation. For a concave dynamic expectation, with an assumption much weaker than
the above domination condition, we can still find a function $g = g(t, z)$ with possibly singular values (Delbaen et al. 2010). For the case without the assumption of constant preservation, see Peng (2005). In practice, the above criterion is very useful to test whether a dynamic pricing mechanism of contingent contracts can be represented through a concrete function $g$.

A seriously challenging problem in stochastic control theory is as follows: the theory is based on a given probability space $(\Omega, \mathcal F, P)$, but in most practical situations this is far from being true. In many risky situations, it is necessary to consider the uncertainty of the probability measures themselves, e.g., $\{P_\theta\}_{\theta\in\Theta}$, namely, the well-known Knightian uncertainty (Knight 1921). A new framework of $G$-expectation space $(\Omega, \mathcal H, \hat{\mathbb E})$ and the corresponding random and stochastic analysis (Itô's analysis) has been introduced (see Peng 2007, 2010 and Soner et al. 2012) to replace the probability framework $(\Omega, \mathcal F, P)$. The $g$-expectation is a special and typical case in this new theory.

Cross-References

Numerical Methods for Continuous-Time Stochastic Control Problems
Risk-Sensitive Stochastic Control
Stochastic Dynamic Programming
Stochastic Linear-Quadratic Control
Stochastic Maximum Principle

Recommended Reading

BSDE theory applied to maximization problems of stochastic control can be found in the book of Yong and Zhou (1999); stochastic control problems in finance in El Karoui et al. (1997a); and optimal stopping and reflected BSDE in El Karoui et al. (1997b). Maximization under Knightian uncertainty using nonlinear expectation can be found in Chen and Epstein (2002) and in the survey paper Peng (2010).

Bibliography

Barrieu P, El Karoui N (2005) Inf-convolution of risk measures and optimal risk transfer. Financ Stoch 9:269–298
Bismut JM (1973) Conjugate convex functions in optimal stochastic control. J Math Anal Appl 44:384–404
Buckdahn R, Quincampoix M, Rascanu A (2000) Viability property for a backward stochastic differential equation and applications to partial differential equations. Probab Theory Relat Fields 116(4):485–504
Chen Z, Epstein L (2002) Ambiguity, risk and asset returns in continuous time. Econometrica 70(4):1403–1443
Coquet F, Hu Y, Memin J, Peng S (2002) Filtration consistent nonlinear expectations and related g-expectations. Probab Theory Relat Fields 123:1–27
Cvitanic J, Karatzas I (1993) Hedging contingent claims with constrained portfolios. Ann Probab 3(4):652–681
Cvitanic J, Karatzas I (1996) Backward stochastic differential equations with reflection and Dynkin games. Ann Probab 24(4):2024–2056
Cvitanic J, Karatzas I, Soner M (1998) Backward stochastic differential equations with constraints on the gains-process. Ann Probab 26(4):1522–1551
Delbaen F, Rosazza Gianin E, Peng S (2010) Representation of the penalty term of dynamic concave utilities. Finance Stoch 14:449–472
Duffie D, Epstein L (1992) Stochastic differential utility (Appendix C with Costis Skiadas). Econometrica 60(2):353–394
El Karoui N, Quenez M-C (1995) Dynamic programming and pricing of contingent claims in an incomplete market. SIAM Control Optim 33(1):29–66
El Karoui N, Peng S, Quenez M-C (1997a) Backward stochastic differential equations in finance. Math Financ 7(1):1–71
El Karoui N, Kapoudjian C, Pardoux E, Peng S, Quenez M-C (1997b) Reflected solutions of backward SDE and related obstacle problems for PDEs. Ann Probab 25(2):702–737
Hamadène S, Lepeltier JP (1995) Zero-sum stochastic differential games and backward equations. Syst Control Lett 24(4):259–263
Han Y, Peng S, Wu Z (2010) Maximum principle for backward doubly stochastic control systems with applications. SIAM J Control 48(7):4224–4241
Hu Y, Peng S (1995) Solution of forward-backward stochastic differential equations. Probab Theory Relat Fields 103(2):273–283
Hu Y, Imkeller P, Müller M (2005) Utility maximization in incomplete markets. Ann Appl Probab 15(3):1691–1712
Knight F (1921) Risk, uncertainty and profit. Houghton Mifflin Company, Boston. (Dover, 2006)
Ma J, Yong J (1999) Forward-backward stochastic differential equations and their applications. Lecture notes in mathematics, vol 1702. Springer, Berlin/New York
Basic Numerical Methods and Software for Computer Aided Control Systems Design 71
Ma J, Protter P, Yong J (1994) Solving forward-backward stochastic differential equations explicitly - a four step scheme. Probab Theory Relat Fields 98:339–359
Pardoux E, Peng S (1990) Adapted solution of a backward stochastic differential equation. Syst Control Lett 14(1):55–61
Pardoux E, Peng S (1992) Backward stochastic differential equations and quasilinear parabolic partial differential equations. In: Stochastic partial differential equations and their applications, Proceedings of the IFIP. Lecture notes in CIS, vol 176. Springer, pp 200–217
Peng S (1991) Probabilistic interpretation for systems of quasilinear parabolic partial differential equations. Stochastics 37:61–74
Peng S (1992) A generalized dynamic programming principle and Hamilton-Jacobi-Bellman equation. Stochastics 38:119–134
Peng S (1994) Backward stochastic differential equation and exact controllability of stochastic control systems. Prog Nat Sci 4(3):274–284
Peng S (1997) BSDE and stochastic optimizations. In: Yan J, Peng S, Fang S, Wu LM (eds) Topics in stochastic analysis. Lecture notes of Xiangfan summer school, chap 2. Science Publication (in Chinese, 1995)
Peng S (1999) Monotonic limit theorem of BSDE and nonlinear decomposition theorem of Doob-Meyer's type. Probab Theory Relat Fields 113(4):473–499
Peng S (2004) Nonlinear expectation, nonlinear evaluations and risk measures. In: Back K, Bielecki TR, Hipp C, Peng S, Schachermayer W (eds) Stochastic methods in finance lectures, C.I.M.E.-E.M.S. Summer School held in Bressanone/Brixen, LNM vol 1856. Springer, pp 143–217. (Edit. M. Frittelli and W. Runggaldier)
Peng S (2005) Dynamically consistent nonlinear evaluations and expectations. arXiv:math.PR/0501415 v1
Peng S (2007) G-expectation, G-Brownian motion and related stochastic calculus of Itô's type. In: Benth et al. (eds) Stochastic analysis and applications, The Abel Symposium 2005, Abel Symposia. Springer, pp 541–567
Peng S (2010) Backward stochastic differential equation, nonlinear expectation and their applications. In: Proceedings of the international congress of mathematicians, Hyderabad
Peng S, Shi Y (2003) A type of time-symmetric forward-backward stochastic differential equations. C R Math Acad Sci Paris 336:773–778
Peng S, Wu Z (1999) Fully coupled forward-backward stochastic differential equations and applications to optimal control. SIAM J Control Optim 37(3):825–843
Rosazza Gianin E (2006) Risk measures via g-expectations. Insur Math Econ 39:19–34
Soner M, Touzi N, Zhang J (2012) Wellposedness of second order backward SDEs. Probab Theory Relat Fields 153(1–2):149–190
Yong J, Zhou X (1999) Stochastic control. Applications of mathematics, vol 43. Springer, New York

Basic Numerical Methods and Software for Computer Aided Control Systems Design

Volker Mehrmann
Institut für Mathematik MA 4-5, Technische Universität Berlin, Berlin, Germany

Paul Van Dooren
ICTEAM: Department of Mathematical Engineering, Catholic University of Louvain, Louvain-la-Neuve, Belgium

Abstract

Basic principles for the development of computational methods for the analysis and design of linear time-invariant systems are discussed. These have been used in the design of the subroutine library SLICOT. The principles are illustrated on the basis of a method to check the controllability of a linear system.

Keywords

Accuracy; Basic numerical methods; Benchmarking; Controllability; Documentation and implementation standards; Efficiency; Software design

Introduction

Basic numerical methods for the analysis and design of dynamical systems are at the heart of most techniques in systems and control theory that are used to describe, control, or optimize industrial and economical processes. There are many methods available for all the different tasks in systems and control, but even though most of these methods are based on sound theoretical principles, many of them still fail when applied to real-life problems. The reasons for this may be quite diverse, such as the fact that the system dimensions are very large, that the underlying problem is very sensitive to small changes in the data, or that the method lacks numerical
robustness when implemented in a finite precision environment.

To overcome such failures, major efforts have been made in the last few decades to develop robust, well-implemented, and standardized software packages for computer-aided control systems design (Grübel 1983; Nag Slicot 1990; Wieslander 1977). Following the standards of modern software design, such packages should consist of numerically robust routines with known performance in terms of reliability and efficiency that can be used to form the basis of more complex control methods. Also, to avoid duplication and to achieve efficiency and portability to different computational environments, it is essential to make maximal use of the established standard packages that are available for numerical computations, e.g., the Basic Linear Algebra Subroutines (BLAS) (Dongarra et al. 1990) or the Linear Algebra Package (LAPACK) (Anderson et al. 1992). On the basis of such standard packages, the next layer of more complex control methods can then be built in a robust way.

In the late 1980s, a working group was created in Europe to coordinate efforts and to integrate and extend the earlier software developments in systems and control. Thanks to the support of the European Union, this eventually led to the development of the Subroutine Library in Control Theory (SLICOT) (Benner et al. 1999; SLICOT 2012). This library contains most of the basic computational methods for control systems design of linear time-invariant control systems. An important feature of this and similar subroutine libraries is that the development of further higher-level methods is not restricted by specific requirements of the languages or data structures used, and that the routines can be easily incorporated within other, more user-friendly software systems (Gomez et al. 1997; MATLAB 2013). Usually, this low-level reusability can only be achieved by using a general-purpose programming language like C or Fortran.

We cannot present all the features of the SLICOT library here. Instead, we discuss its general philosophy in section "The Control Subroutine Library SLICOT" and illustrate these concepts in section "An Illustration" using one specific task, namely, checking the controllability of a system. We refer to SLICOT (2012) for more details on SLICOT and to Varga (2004) for a general discussion on numerical software for systems and control.

The Control Subroutine Library SLICOT

When designing a subroutine library of basic algorithms, one should make sure that it satisfies certain basic requirements and that it follows a strict standardization in implementation and documentation. It should also contain standardized test sets that can be used for benchmarking, and it should provide means for maintenance and portability to new computing environments. The subroutine library SLICOT was designed to satisfy the following basic recommendations that are typically expected in this context (Benner et al. 1999).

Robustness: A subroutine must either return reliable results or it must return an error or warning indicator if the problem has not been well posed, if the problem does not fall in the class to which the algorithm is applicable, or if the problem is too ill-conditioned to be solved in a particular computing environment.

Numerical stability and accuracy: Subroutines are supposed to return results that are as good as can be expected when working at a given precision. They also should provide an option to return a parameter estimating the accuracy actually achieved.

Efficiency: An algorithm should never be chosen for its speed if it fails to meet the usual standards of robustness, numerical stability, and accuracy, as described above. Efficiency must be evaluated, e.g., in terms of the number of floating-point operations, the memory requirements, or the number and cost of iterations to be performed.

Modern computer architectures: The requirements of modern computer architectures must be taken into account, such as shared
or distributed memory parallel processors, which are the standard environments of today. The differences in the various architectures may imply different choices of algorithms.

Comprehensive functional coverage: The routines of the library should solve computational problems relevant to control systems and should cover a comprehensive set of routines to make the library functional for a wide range of users. The SLICOT library covers most of the numerical linear algebra methods needed in systems analysis and synthesis problems for standard and generalized state space models, such as Lyapunov, Sylvester, and Riccati equation solvers, transfer matrix factorizations, similarity and equivalence transformations, structure-exploiting algorithms, and condition number estimators.

The implementation of subroutines for a library should be highly standardized, and it should be accompanied by a well-written online documentation as well as a user manual (see, e.g., the standards Denham and Benson 1981; Working Group Software 1996) which is compatible with that of the LAPACK library (Anderson et al. 1992). Although such highly restrictive standards often put a heavy burden on the programmer, it has been observed that it has a high importance

benchmark collections have been developed (see, e.g., Benner et al. 1997; Frederick 1998, or http://www.slicot.org/index.php?site=benchmarks).

Maintenance, Open Access, and Archives

It is a major challenge to maintain a well-developed library accessible and usable over time when computer architectures and operating systems are changing rapidly, while keeping the library open for access to the user community. This usually requires financial resources that either have to be provided by public funding or by licensing the commercial use.

In the SLICOT library, this challenge has been addressed by the formation of the Niconet Association (http://www.niconet-ev.info/en/), which provides the current versions of the codes and all the documentation. Those of Release 4.5 are available under the GNU General Public License or from the archives of http://www.slicot.org/.

An Illustration

To give an illustration of the development of a basic control system routine, we consider the specific problem of checking controllability
for the reusability of software and it also has of a linear time-invariant control system. A
turned out to be a very valuable tool in teaching linear time-invariant control problem has the
students how to implement algorithms in the form
context of their studies.

Benchmarking dx
In the validation of numerical software, it is D Ax C Bu; t 2 Œt0 ; 1/ (1)
dt
extremely important to be able to test the cor-
rectness of the implementation as well as the
performance of the method, which is one of the Here x denotes the state and u the input function,
major steps in the construction of a software and the system matrices are typically of the form
library. To achieve this, one needs a standard- A 2 Rn;n , B 2 Rn;m .
ized set of benchmark examples that allows an One of the most important topics in control
evaluation of a method with respect to correct- is the question whether by an appropriate choice
ness, accuracy, and efficiency and to analyze of input function u.t/ we can control the system
the behavior of the method in extreme situa- from an arbitrary state to the null state. This prop-
tions, i.e., on problems where the limit of the erty, called controllability, can be characterized
possible accuracy is reached. In the context of by one of the following equivalent conditions (see
basic systems and control methods, several such Paige 1981).
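One of these conditions — the Kalman rank test, condition (ii) of the theorem that follows — is easy to state as code. The sketch below is a didactic illustration, not a SLICOT routine, and all names in it are ours; it checks the rank of [B, AB, …, A^{n−1}B] in exact rational arithmetic, which sidesteps for the moment the roundoff difficulties discussed after the theorem:

```python
from fractions import Fraction

def rank_exact(M):
    """Rank over the rationals via Gauss-Jordan elimination (exact arithmetic)."""
    M = [[Fraction(x) for x in row] for row in M]
    rank, rows = 0, len(M)
    for c in range(len(M[0]) if M else 0):
        piv = next((r for r in range(rank, rows) if M[r][c] != 0), None)
        if piv is None:
            continue
        M[rank], M[piv] = M[piv], M[rank]
        for r in range(rows):
            if r != rank and M[r][c] != 0:
                f = M[r][c] / M[rank][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[rank])]
        rank += 1
    return rank

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def controllable(A, B):
    """Kalman test: rank [B, AB, ..., A^(n-1) B] == n."""
    n, K, blk = len(A), [row[:] for row in B], B
    for _ in range(n - 1):
        blk = matmul(A, blk)
        K = [kr + br for kr, br in zip(K, blk)]   # append the next block of columns
    return rank_exact(K) == n

A1, B1 = [[0, 1], [0, 0]], [[0], [1]]   # double integrator: controllable
A2, B2 = [[1, 0], [0, 2]], [[1], [0]]   # second mode is never excited by B
print(controllable(A1, B1), controllable(A2, B2))   # True False
```

In floating-point arithmetic, the same construction is precisely what the discussion after the theorem warns against: the columns of A^k B rapidly become numerically dependent, which is why robust software relies on the orthogonal reductions presented next.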
Theorem 1 The following are equivalent:

(i) System (1) is controllable.
(ii) Rank [B, AB, A²B, …, A^{n−1}B] = n.
(iii) Rank [B, A − λI] = n for all λ ∈ C.
(iv) There exists F such that A and A + BF have no common eigenvalues.

The conditions of Theorem 1 are nice for theoretical purposes, but none of them is really adequate for the implementation of an algorithm that satisfies the requirements described in the previous section. Condition (ii) creates difficulties because the controllability matrix K = [B, AB, A²B, …, A^{n−1}B] will be highly corrupted by roundoff errors. Condition (iii) simply cannot be checked in finite time. It is sufficient to check this condition only for the eigenvalues of A, but this is extremely expensive. And finally, condition (iv) will almost always give disjoint spectra between A and A + BF, since the computation of eigenvalues is sensitive to roundoff.

To devise numerical procedures, one often resorts to the computation of canonical or condensed forms of the underlying system. To obtain such a form, one employs controllability-preserving linear transformations x ↦ Px, u ↦ Qu with nonsingular matrices P ∈ R^{n×n}, Q ∈ R^{m×m}. The canonical form under these transformations is the Luenberger form (see Luenberger 1967). This form allows one to check controllability using the above criterion (iii) by simple inspection of the condensed matrices. This is ideal from a theoretical point of view but is very sensitive to small perturbations in the data, in particular because the transformation matrices may have arbitrarily large norm, which may lead to large errors.

For the implementation as robust numerical software, one uses instead transformations with real orthogonal matrices P, Q that can be implemented in a backward stable manner, i.e., the resulting backward error is bounded by a small constant times the unit roundoff u of the finite precision arithmetic, and employs for reliable rank determinations the well-known singular value decomposition (SVD) (see, e.g., Golub and Van Loan 1996).

Theorem 2 (Singular value decomposition) Given A ∈ R^{n×m}, there exist orthogonal matrices U, V with U ∈ R^{n×n}, V ∈ R^{m×m}, such that A = U Σ V^T and Σ ∈ R^{n×m} is quasi-diagonal, i.e.,

    Σ = [ Σ_r  0 ]      where  Σ_r = diag(σ_1, …, σ_r),
        [ 0    0 ]

and the nonzero singular values σ_i are ordered as σ_1 ≥ σ_2 ≥ … ≥ σ_r > 0.

The SVD presents the best way to determine (numerical) ranks of matrices in finite precision arithmetic, by counting the number of singular values satisfying σ_j ≥ u·σ_1 and by setting those for which σ_j < u·σ_1 equal to zero. The computational method for the SVD is well established and analyzed, and it has been implemented in the LAPACK routine SGESVD (see http://www.netlib.org/lapack/). A faster but less reliable alternative to compute the numerical rank of a matrix A is its QR factorization with pivoting (see, e.g., Golub and Van Loan 1996).

Theorem 3 (QRE decomposition) Given A ∈ R^{n×m}, there exist an orthogonal matrix Q ∈ R^{n×n} and a permutation E ∈ R^{m×m}, such that A = Q R E^T and R ∈ R^{n×m} is trapezoidal, i.e.,

    R = [ r_11  …  r_1l  …  r_1m ]
        [          ⋱    ⋮     ⋮  ]
        [          r_ll  …  r_lm ]
        [ 0                   0  ]

and the nonzero diagonal entries r_ii are ordered as r_11 ≥ … ≥ r_ll > 0.

The (numerical) rank in this case is again obtained by counting the diagonal elements r_ii ≥ u·r_11.

One can use such orthogonal transformations to construct the controllability staircase form (see Van Dooren 1981).

Theorem 4 (Staircase form) Given matrices A ∈ R^{n×n}, B ∈ R^{n×m}, there exist orthogonal matrices P, Q with P ∈ R^{n×n}, Q ∈ R^{m×m}, so that
    PAP^T = [ A_11   …        …           A_{1,r−1}    A_{1,r}   ]  n_1
            [ A_21   ⋱                    ⋮            ⋮         ]  n_2
            [        ⋱        ⋱           ⋮            ⋮         ]  ⋮
            [                 A_{r−1,r−2} A_{r−1,r−1}  A_{r−1,r} ]  n_{r−1}
            [ 0      …        0           0            A_{rr}    ]  n_r
              n_1    …        n_{r−2}     n_{r−1}      n_r
                                                                        (2)
    PBQ = [ B_1  0 ]  n_1
          [ 0    0 ]  n_2
          [ ⋮    ⋮ ]  ⋮
          [ 0    0 ]  n_r
            n_1  m−n_1

where n_1 ≥ n_2 ≥ … ≥ n_{r−1} ≥ n_r ≥ 0, n_{r−1} > 0, and A_{i,i−1} = [Σ_{i,i−1}  0], with nonsingular blocks Σ_{i,i−1} ∈ R^{n_i×n_i} and B_1 ∈ R^{n_1×n_1}.

Notice that, when using the reduced pair in condition (iii) of Theorem 1, the controllability condition is just n_r = 0, which is simply checked by inspection. A numerically stable algorithm to compute the staircase form of Theorem 4 is given below. It is based on the use of the singular value decomposition, but one could also have used instead the QR decomposition with column pivoting.

Staircase Algorithm
Input: A ∈ R^{n×n}, B ∈ R^{n×m}
Output: PAP^T, PBQ in the form (2), P, Q orthogonal

Step 0: Perform an SVD

    B = U_B [ Σ_B  0 ] V_B^T
            [ 0    0 ]

with nonsingular and diagonal Σ_B ∈ R^{n_1×n_1}. Set P := U_B^T, Q := V_B, so that

    A := U_B^T A U_B = [ A_11  A_12 ],      B := U_B^T B V_B = [ Σ_B  0 ],
                       [ A_21  A_22 ]                          [ 0    0 ]

with A_11 of size n_1 × n_1.

Step 1: Perform an SVD

    A_21 = U_21 [ Σ_21  0 ] V_21^T
                [ 0     0 ]

with nonsingular and diagonal Σ_21 ∈ R^{n_2×n_2}. Set

    P_2 := [ V_21^T  0      ],      P := P_2 P,
           [ 0       U_21^T ]

so that

    A := P_2 A P_2^T =: [ A_11  A_12  A_13 ],      B := P_2 B =: [ B_1  0 ],
                        [ A_21  A_22  A_23 ]                     [ 0    0 ]
                        [ 0     A_32  A_33 ]                     [ 0    0 ]

where A_21 = [Σ_21  0], and B_1 := V_21^T Σ_B.

Step 2:
i := 3
DO WHILE (n_{i−1} > 0 AND A_{i,i−1} ≠ 0)
    Perform an SVD of

        A_{i,i−1} = U_{i,i−1} [ Σ_{i,i−1}  0 ] V_{i,i−1}^T
                              [ 0          0 ]

    with Σ_{i,i−1} ∈ R^{n_i×n_i} nonsingular and diagonal. Set

        P_i := diag( I_{n_1}, …, I_{n_{i−2}}, V_{i,i−1}^T, U_{i,i−1}^T ),      P := P_i P,

    so that

        A := P_i A P_i^T =: [ A_11  …          …        A_{1,i+1}   ]
                            [ A_21  ⋱                   A_{2,i+1}   ]
                            [       ⋱          ⋱        ⋮           ]
                            [       A_{i,i−1}  A_{i,i}  ⋮           ]
                            [ 0     …   A_{i+1,i}  A_{i+1,i+1}      ]

    where A_{i,i−1} = [Σ_{i,i−1}  0].
    i := i + 1
END
r := i

It is clear that this algorithm will stop with n_i = 0 or A_{i,i−1} = 0. In every step, the remaining block shrinks at least by one row/column, as long as Rank A_{i,i−1} ≥ 1, so that the algorithm stops after at most n − 1 steps. It has been shown in Van Dooren (1981) that system (1) is controllable if and only if in the staircase form of (A, B) one has n_r = 0.
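The reduction above can be mimicked in a few dozen lines of Python. The following sketch is an illustration under simplifying assumptions, not the SLICOT implementation, and all names in it are ours: it performs the same sequence of orthogonal row compressions but replaces the SVD-based rank decisions with a tolerance-based modified Gram-Schmidt, which is adequate only for small, well-scaled examples. It returns the staircase block sizes n_1, n_2, …; the pair (A, B) is controllable exactly when they sum to n:

```python
def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def _orth_complete(Z, tol):
    """Return (U, r): U is an orthonormal basis of R^p (list of columns) whose
    first r columns span the numerical column space of the p x k matrix Z."""
    p = len(Z)
    zcols = [[Z[i][j] for i in range(p)] for j in range(len(Z[0]))]
    units = [[float(i == k) for i in range(p)] for k in range(p)]
    basis, r = [], 0
    for idx, v in enumerate(zcols + units):
        w = list(v)
        for q in basis:                       # modified Gram-Schmidt sweep
            c = _dot(q, w)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        nrm = _dot(w, w) ** 0.5
        if nrm > tol:
            basis.append([wi / nrm for wi in w])
        if idx == len(zcols) - 1:
            r = len(basis)                    # rank estimate of Z
    return basis, r

def staircase_sizes(A, B, tol=1e-8):
    """Block sizes n_1, n_2, ... of the controllability staircase form of (A, B);
    the pair is controllable iff the sizes sum to n (i.e. n_r = 0)."""
    n = len(A)
    A = [row[:] for row in A]
    sizes, s = [], 0
    Z = [row[:] for row in B]                 # Step 0 compresses the rows of B
    while True:
        U, r = _orth_complete(Z, tol)         # U acts on the trailing n - s states
        if r == 0:
            break                             # nonzero uncontrollable part: n_r > 0
        m = n - s
        for j in range(n):                    # A[s:, :] <- U^T A[s:, :]
            seg = [A[s + i][j] for i in range(m)]
            for i in range(m):
                A[s + i][j] = _dot(U[i], seg)
        for i in range(n):                    # A[:, s:] <- A[:, s:] U
            seg = [A[i][s + k] for k in range(m)]
            for k in range(m):
                A[i][s + k] = _dot(U[k], seg)
        sizes.append(r)
        p, s = s, s + r
        if s == n:
            break                             # fully controllable
        Z = [[A[i][j] for j in range(p, s)] for i in range(s, n)]  # next A_{i,i-1}
    return sizes

# Chain of integrators: controllable, one state captured per stage.
A1, B1 = [[0, 1, 0], [0, 0, 1], [0, 0, 0]], [[0], [0], [1]]
# Diagonal system whose third mode is never excited by B.
A2, B2 = [[1, 0, 0], [0, 2, 0], [0, 0, 3]], [[1], [1], [0]]
print(staircase_sizes(A1, B1), staircase_sizes(A2, B2))   # [1, 1, 1] [1, 1]
```

For production use, backward-stable Householder reflections or SVDs — as in the SLICOT routine discussed below — should replace the Gram-Schmidt compression used here for the rank decisions.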
It should be noted that the updating transformations P_i of this algorithm will affect previously created “stairs”, so that the blocks denoted as Σ_{i,i−1} will not be diagonal anymore, but their singular values are unchanged. This is critical in the decision about the controllability of the pair (A, B), since it depends on the numerical rank of the submatrices A_{i,i−1} and B (see Demmel and Kågström 1993). Based on this and a detailed error and perturbation analysis, the Staircase Algorithm has been implemented in the SLICOT routine AB01OD, and it uses in the worst case O(n⁴) flops (a “flop” is an elementary floating-point operation +, −, ∗, or /). For efficiency reasons, the SLICOT routine AB01OD does not use SVDs for rank decisions, but QR decompositions with column pivoting. When applying the corresponding orthogonal transformations to the system without accumulating them, the complexity can be reduced to O(n³) flops. The routine has been provided with error bounds, condition estimates, and warning strategies.

Summary and Future Directions

We have presented the SLICOT library and the basic principles for the design of such basic subroutine libraries. To illustrate these principles, we have presented the development of a method for checking controllability of a linear time-invariant control system. But the SLICOT library contains much more than that. It essentially covers most of the problems listed in the selected reprint volume (Patel et al. 1994). This volume contained in 1994 the state of the art in numerical methods for systems and control, but the field has strongly evolved since then. Examples of areas that were not in this volume but that are included in SLICOT are periodic systems, differential algebraic equations, and model reduction. Areas which still need new results and software are the control of large-scale systems, obtained either from discretizations of partial differential equations or from the interconnection of a large number of interacting systems. But it is unclear for the moment which will be the methods of choice for such problems. We still need to understand the numerical challenges in such areas before we can propose numerically reliable software for these problems: the area is still quite open for new developments.

Cross-References

• Computer-Aided Control Systems Design: Introduction and Historical Overview
• Interactive Environments and Software Tools for CACSD

Bibliography

Anderson E, Bai Z, Bischof C, Demmel J, Dongarra J, Du Croz J, Greenbaum A, Hammarling S, McKenney A, Ostrouchov S, Sorensen D (1995) LAPACK users’ guide, 2nd edn. SIAM, Philadelphia. http://www.netlib.org/lapack/
Benner P, Laub AJ, Mehrmann V (1997) Benchmarks for the numerical solution of algebraic Riccati equations. Control Syst Mag 17:18–28
Benner P, Mehrmann V, Sima V, Van Huffel S, Varga A (1999) SLICOT – a subroutine library in systems and control theory. Appl Comput Control Signals Circuits 1:499–532
Demmel JW, Kågström B (1993) The generalized Schur decomposition of an arbitrary pencil A − λB: robust software with error bounds and applications. Part I: theory and algorithms. ACM Trans Math Softw 19:160–174
Denham MJ, Benson CJ (1981) Implementation and documentation standards for the software library in control engineering (SLICE). Technical report 81/3, Kingston Polytechnic, Control Systems Research Group, Kingston
Dongarra JJ, Du Croz J, Duff IS, Hammarling S (1990) A set of level 3 basic linear algebra subprograms. ACM Trans Math Softw 16:1–17
Frederick DK (1988) Benchmark problems for computer aided control system design. In: Proceedings of the 4th IFAC symposium on computer-aided control systems design, Beijing, pp 1–6
Golub GH, Van Loan CF (1996) Matrix computations, 3rd edn. The Johns Hopkins University Press, Baltimore
Gomez C, Bunks C, Chancelier J-P, Delebecque F (1997) Integrated scientific computing with Scilab. Birkhäuser, Boston. https://www.scilab.org/
Grübel G (1983) Die regelungstechnische Programmbibliothek RASP. Regelungstechnik 31:75–81
Luenberger DG (1967) Canonical forms for linear multivariable systems. IEEE Trans Autom Control 12(3):290–293
Paige CC (1981) Properties of numerical algorithms related to computing controllability. IEEE Trans Autom Control AC-26:130–138
Patel R, Laub A, Van Dooren P (eds) (1994) Numerical linear algebra techniques for systems and control. IEEE, Piscataway
The Control and Systems Library SLICOT (2012) The NICONET society. NICONET e.V. http://www.niconet-ev.info/en/
The MathWorks, Inc. (2013) MATLAB version 8.1. The MathWorks, Inc., Natick
The Numerical Algorithms Group (1993) NAG SLICOT library manual, release 2. The Numerical Algorithms Group, Wilkinson House, Oxford. Updates Release 1 of May 1990
The Working Group on Software (1996) SLICOT implementation and documentation standards 2.1. WGS-report 96-1. http://www.icm.tu-bs.de/NICONET/reports.html
Van Dooren P (1981) The generalized eigenstructure problem in linear system theory. IEEE Trans Autom Control AC-26:111–129
Varga A (ed) (2004) Special issue on numerical awareness in control. Control Syst Mag 24-1:14–17
Wieslander J (1977) Scandinavian control library. A subroutine library in the field of automatic control. Technical report, Department of Automatic Control, Lund Institute of Technology, Lund

Bilinear Control of Schrödinger PDEs

Karine Beauchard¹ and Pierre Rouchon²
¹CNRS, CMLS, Ecole Polytechnique, Palaiseau, France
²Centre Automatique et Systèmes, Mines ParisTech, Paris Cedex 06, France

Abstract

This entry is an introduction to modern issues about controllability of Schrödinger PDEs with bilinear controls. This model is pertinent for a quantum particle, controlled by an electric field. We review recent developments in the field, with discrimination between exact and approximate controllabilities, in finite or infinite time. We also underline the variety of mathematical tools used by various teams in the last decade. The results are illustrated on several classical examples.

Keywords

Approximate controllability; Global exact controllability; Local exact controllability; Quantum particles; Schrödinger equation; Small-time controllability

Introduction

A quantum particle, in a space with dimension N (N = 1, 2, 3), in a potential V = V(x), and in an electric field u = u(t), is represented by a wave function ψ: (t, x) ∈ R × Ω → C on the L²(Ω; C) sphere S,

    ∫_Ω |ψ(t, x)|² dx = 1,   ∀t ∈ R,

where Ω ⊆ R^N is a possibly unbounded open domain. In first approximation, the time evolution of the wave function is given by the Schrödinger equation

    i ∂_t ψ(t, x) = (−Δ + V(x)) ψ(t, x) − u(t) μ(x) ψ(t, x),   t ∈ (0, +∞), x ∈ Ω,
    ψ(t, x) = 0,   x ∈ ∂Ω,                                       (1)

where μ is the dipolar moment of the particle and ℏ = 1 here. Sometimes, this equation is considered in the more abstract framework

    i dψ/dt = (H_0 + u(t) H_1) ψ                                  (2)

where ψ lives on the unit sphere of a separable Hilbert space H and the Hamiltonians H_0, H_1 are Hermitian operators on H. A natural question, with many practical applications, is the existence of a control u that steers the wave function ψ from a given initial state ψ_0 to a prescribed target ψ_f.

The goal of this survey is to present well-established results concerning exact and approximate controllabilities for the bilinear control system (1), with applications to relevant examples.
The main difficulties are the infinite dimension of H and the nonlinearity of the control system.

Preliminary Results

When the Hilbert space H has finite dimension n, then controllability of Eq. (2) is well understood (D’Alessandro 2008). If, for example, the Lie algebra spanned by H_0 and H_1 coincides with u(n), the set of skew-Hermitian matrices, then system (2) is globally controllable: for any initial and final states ψ_0, ψ_f ∈ H of length one, there exist T > 0 and a bounded open-loop control [0, T] ∋ t ↦ u(t) steering from ψ(0) = ψ_0 to ψ(T) = ψ_f.

In infinite dimension, this idea served to intuit a negative controllability result in Mirrahimi and Rouchon (2004), but the above characterization cannot be generalized because iterated Lie brackets of unbounded operators are not necessarily well defined. For example, the quantum harmonic oscillator

    i ∂_t ψ(t, x) = (−∂²_x + x²) ψ(t, x) − u(t) x ψ(t, x),   x ∈ R,     (3)

is not controllable (in any reasonable sense) (Mirrahimi and Rouchon 2004), even if all its Galerkin approximations are controllable (Fu et al. 2001). Thus, much care is required in the use of Galerkin approximations to prove controllability in infinite dimension. This motivates the search for different methods to study exact controllability of bilinear PDEs of form (1).

In infinite dimension, the norms need to be specified. In this article, we use Sobolev norms. For s ∈ N, the Sobolev space H^s(Ω) is the space of functions ψ: Ω → C with square integrable derivatives d^k ψ for k = 0, …, s (derivatives are well defined in the distribution sense). H^s(Ω) is endowed with the norm

    ‖ψ‖_{H^s} := ( Σ_{k=0}^{s} ‖d^k ψ‖²_{L²(Ω)} )^{1/2}.

We also use the space H¹_0(Ω), which contains the functions ψ ∈ H¹(Ω) that vanish on the boundary ∂Ω (in the trace sense) (Brézis 1999).

The first control result of the literature states the noncontrollability of system (1) in H² ∩ H¹_0(Ω) ∩ S with controls u ∈ L²((0, T), R) (Ball et al. 1982; Turinici 2000). More precisely, by applying L²(0, T) controls u, the reachable wave functions ψ(T) form a subset of H² ∩ H¹_0(Ω) ∩ S with empty interior. This statement does not give obstructions for system (1) to be controllable in different functional spaces, as we will see below, but it indicates that controllability issues are much more subtle in infinite dimension than in finite dimension.

Local Exact Controllability

In 1D and with Discrete Spectrum
This section is devoted to the 1D PDE

    i ∂_t ψ(t, x) = −∂²_x ψ(t, x) − u(t) μ(x) ψ(t, x),   x ∈ (0, 1), t ∈ (0, T),
    ψ(t, 0) = ψ(t, 1) = 0.                                  (4)

We call “ground state” the solution of the free system (u = 0) built with the first eigenvalue and eigenvector of −∂²_x: ψ_1(t, x) = √2 sin(πx) e^{−iπ²t}. Under appropriate assumptions on the dipolar moment μ, system (4) is controllable around the ground state, locally in H³_{(0)}(0, 1) ∩ S, with controls in L²((0, T), R), as stated below.

Theorem 1 Assume μ ∈ H³((0, 1), R) and

    | ∫_0^1 μ(x) sin(πx) sin(kπx) dx | ≥ c / k³,   ∀k ∈ N*     (5)

for some constant c > 0. Then, for every T > 0, there exists δ > 0 such that for every ψ_0, ψ_f ∈ S ∩ H³_{(0)}((0, 1), C) with ‖ψ_0 − ψ_1(0)‖_{H³} + ‖ψ_f − ψ_1(T)‖_{H³} < δ, there exists u ∈ L²((0, T), R) such that the solution of (4) with initial condition ψ(0, x) = ψ_0(x) satisfies ψ(T) = ψ_f.

Here, H³_{(0)}(0, 1) := {ψ ∈ H³((0, 1), C); ψ = ψ'' = 0 at x = 0, 1}. We refer to Beauchard
and Laurent (2010) and Beauchard et al. (2013) for proof and generalizations to nonlinear PDEs. The proof relies on the linearization principle, by applying the classical inverse mapping theorem to the endpoint map. Controllability of the linearized system around the ground state is a consequence of assumption (5) and classical results about trigonometric moment problems. A subtle smoothing effect allows one to prove C¹ regularity of the endpoint map.

The assumption (5) holds for generic μ ∈ H³((0, 1), R) and plays a key role for local exact controllability to hold in small time T. In Beauchard and Morancey (2014), local exact controllability is proved under the weaker assumption μ'(0) ± μ'(1) ≠ 0, but only in large time T.

Moreover, under appropriate assumptions on μ, references Coron (2006) and Beauchard and Morancey (2014) propose explicit motions that are impossible in small time T with small controls in L². Thus, a positive minimal time is required for local exact controllability, even if information propagates at infinite speed. This minimal time is due to nonlinearities; its characterization is an open problem.

Actually, the assumption μ'(0) ± μ'(1) ≠ 0 is not necessary for local exact controllability in large time. For instance, the quantum box, i.e.,

    i ∂_t ψ(t, x) = −∂²_x ψ(t, x) − u(t) x ψ(t, x),   x ∈ (0, 1),
    ψ(t, 0) = ψ(t, 1) = 0,                               (6)

is treated in Beauchard (2005). Of course, these results are proved with additional techniques: power series expansions and Coron’s return method (Coron 2007).

There is no contradiction between the negative result of section “Preliminary Results” and the positive result of Theorem 1. Indeed, the wave function cannot be steered between any two points ψ_0, ψ_f of H² ∩ H¹_0, but it can be steered between any two points ψ_0, ψ_f of H³_{(0)}, which is smaller than H² ∩ H¹_0. In particular, H³_{(0)}((0, 1), C) has an empty interior in H²_{(0)}((0, 1), C). Thus, there is no incompatibility between the reachable set having empty interior in H² ∩ H¹_0 and the reachable set coinciding with H³_{(0)}.

Open Problems in Multi-D or with Continuous Spectrum
The linearization principle used to prove Theorem 1 does not work in multi-D: the trigonometric moment problem associated with the controllability of the linearized system cannot be solved. Indeed, its frequencies, which are the eigenvalues of the Dirichlet Laplacian operator, do not satisfy a required gap condition (Loreti and Komornik 2005).

The study of a toy model (Beauchard 2011) suggests that if local controllability holds in 2D (with a priori bounded L²-controls), then a positive minimal time is required, whatever μ is. The appropriate functional frame for such a result is an open problem.

In 3D or in the presence of continuous spectrum, we conjecture that local exact controllability does not hold (with a priori bounded L²-controls) because the gap condition on the spectrum of the Dirichlet Laplacian operator is violated (see Beauchard et al. (2010) for a toy model from nuclear magnetic resonance and ensemble controllability as originally stated in Li and Khaneja (2009)). Thus, exact controllability should be investigated with controls that are not a priori bounded in L²; this requires new techniques. We refer to Nersesyan and Nersisyan (2012a) for precise negative results.

Finally, we emphasize that exact controllability in multi-D but in infinite time has been proved in Nersesyan and Nersisyan (2012a, b), with techniques similar to the ones used in the proof of Theorem 1.

Approximate Controllability

Different approaches have been developed to prove approximate controllability.

Lyapunov Techniques
Due to measurement effect and back action, closed-loop controls in the Schrödinger frame are
not appropriate. However, closed-loop controls may be computed via numerical simulations and then applied to real quantum systems in open loop, without measurement. Then, the strategy consists in designing damping feedback laws, thanks to a controlled Lyapunov function which encodes the distance to the target. In finite dimension, the convergence proof relies on the LaSalle invariance principle. In infinite dimension, this principle works when the trajectories of the closed-loop system are compact (in the appropriate space), which is often difficult to prove. Thus, two adaptations have been proposed: approximate convergence (Beauchard and Mirrahimi 2009; Mirrahimi 2009) and weak convergence (Beauchard and Nersesyan 2010) to the target.

Variational Methods and Global Exact Controllability
The global approximate controllability of (1), in any Sobolev space, is proved in Nersesyan (2010), under generic assumptions on (V, μ), with Lyapunov techniques and variational arguments.

Theorem 2 Let V, μ ∈ C^∞(Ω, R) and let (λ_j)_{j∈N*}, (φ_j)_{j∈N*} be the eigenvalues and normalized eigenvectors of (−Δ + V). Assume ⟨μ φ_j, φ_1⟩ ≠ 0 for all j ≥ 2 and λ_1 − λ_j ≠ λ_p − λ_q for all j, p, q ∈ N* such that {1, j} ≠ {p, q} and j ≠ 1. Then, for every s > 0, the system (1) is globally approximately controllable in H^s_{(V)} := D((−Δ + V)^{s/2}), the domain of (−Δ + V)^{s/2}: for every ε, δ > 0 and ψ_0 ∈ S ∩ H^s_{(V)}, there exist a time T > 0 and a control u ∈ C^∞_0((0, T), R) such that the solution of (1) with initial condition ψ(0) = ψ_0 satisfies ‖ψ(T) − φ_1‖_{H^{s−δ}} < ε.

This theorem is of particular importance. Indeed, in 1D and for appropriate choices of (V, μ), global exact controllability of (1) in H^{3+ε} can be proved by combining the following:
• Global approximate controllability in H³ given by Theorem 2,
• Local exact controllability in H³ given by Theorem 1,
• Time reversibility of the Schrödinger equation (i.e., if (ψ(t, x), u(t)) is a trajectory, then so is (ψ*(T − t, x), u(T − t)), where ψ* is the complex conjugate of ψ).

Let us expose this strategy on the quantum box (6). First, one can check the assumptions of Theorem 2 with V(x) = −γx and μ(x) = (1 − γ)x when γ > 0 is small enough. This means that, in (6), we consider controls u(t) of the form γ + u(t). Thus, an initial condition ψ_0 ∈ H^{3+ε}_{(0)} can be steered arbitrarily close to the first eigenvector φ_{1,γ} of −(∂²_x + γx), in H³ norm. Moreover, by a variant of Theorem 1, the local exact controllability of (6) holds in H³_{(0)} around φ_{1,γ}. Therefore, the initial condition ψ_0 ∈ H^{3+ε}_{(0)} can be steered exactly to φ_{1,γ} in finite time. By the time reversibility of the Schrödinger equation, we can also steer exactly the solution from φ_{1,γ} to any target ψ_f ∈ H^{3+ε}. Therefore, the solution can be steered exactly from any initial condition ψ_0 ∈ H^{3+ε}_{(0)} to any target ψ_f ∈ H^{3+ε}_{(0)} in finite time.

Geometric Techniques Applied to Galerkin Approximations
In Boscain et al. (2012, 2013) and Chambrion et al. (2009), the authors study the control of Schrödinger PDEs, in the abstract form (2) and under technical assumptions on the (unbounded) operators H_0 and H_1 that ensure the existence of solutions with piecewise constant controls u:
1. H_0 is skew-adjoint on its domain D(H_0).
2. There exists a Hilbert basis (φ_k)_{k∈N} of H made of eigenvectors of H_0: H_0 φ_k = i λ_k φ_k and φ_k ∈ D(H_1), ∀k ∈ N.
3. H_0 + u H_1 is essentially skew-adjoint (not necessarily with domain D(H_0)) for every u ∈ [0, δ], for some δ > 0.
4. ⟨H_1 φ_j, φ_k⟩ = 0 for every j, k ∈ N such that λ_j = λ_k and j ≠ k.

Theorem 3 Assume that, for every j, k ∈ N, there exists a finite number of integers p_1, …, p_r ∈ N such that

    p_1 = j,   p_r = k,   ⟨H_1 φ_{p_l}, φ_{p_{l+1}}⟩ ≠ 0,   ∀l = 1, …, r − 1,
    |λ_L − λ_M| ≠ |λ_{p_l} − λ_{p_{l+1}}|,   ∀1 ≤ l ≤ r − 1,   ∀L, M ∈ N with {L, M} ≠ {p_l, p_{l+1}}.

Then for every ε > 0 and ψ_0, ψ_f in the unit sphere of H, there exists a piecewise constant function u: [0, T_ε] → [0, δ] such that the solution of (2) with initial condition ψ(0) = ψ_0 satisfies ‖ψ(T_ε) − ψ_f‖_H < ε.

We refer to Boscain et al. (2012, 2013) and Chambrion et al. (2009) for proof and additional results, such as estimates on the L¹ norm of the control. Note that H_0 is not necessarily of the form (−Δ + V), H_1 can be unbounded, δ may be arbitrarily small, and the two assumptions are generic with respect to (H_0, H_1). The connectivity and transition frequency conditions in Theorem 3 mean physically that each pair of H_0 eigenstates is connected via a finite number of first-order (one-photon) transitions and that the transition frequencies between pairs of eigenstates are all different.

Note that, contrary to Theorem 2, Theorem 3 cannot be combined with Theorem 1 to prove global exact controllability. Indeed, the functional spaces are different: H = L²(Ω) in Theorem 3, whereas H³-regularity is required for Theorem 1. This kind of result applies to several relevant examples, such as the control of a particle in a quantum box by an electric field (6) and the control of the planar rotation of a linear molecule by means of two electric fields:

    i ∂_t ψ(t, θ) = (−∂²_θ + u_1(t) cos(θ) + u_2(t) sin(θ)) ψ(t, θ),   θ ∈ T,

where T is the 1D-torus. However, several other systems of physical interest are not covered by these results, such as trapped ions modeled by two coupled quantum harmonic oscillators. In Ervedoza and Puel (2009), specific methods have been used to prove their approximate controllability.

Concluding Remarks

The variety of methods developed by different authors to characterize controllability of Schrödinger PDEs with bilinear control is the sign of a rich structure and subtle nature of control issues. New methods will probably be necessary to answer the remaining open problems in the field.

This survey is far from being complete. In particular, we do not consider numerical methods to derive the steering control, such as those used in NMR (Nielsen et al. 2010) to achieve robustness versus parameter uncertainties, or such as monotone algorithms (Baudouin and Salomon 2008; Liao et al. 2011) for optimal control (Cancès et al. 2000). We also do not consider open quantum systems, where the state is then the density operator ρ, a nonnegative Hermitian operator with unit trace on H. The Schrödinger equation is then replaced by the Lindblad equation

    dρ/dt = −i [H_0 + u H_1, ρ] + Σ_ν ( L_ν ρ L_ν† − ½ (L_ν† L_ν ρ + ρ L_ν† L_ν) )

with operator L_ν related to the decoherence channel ν. Even in the case of a finite dimensional Hilbert space H, controllability of such systems is not yet well understood and characterized (see Altafini (2003) and Kurniawan et al. (2012)).

Cross-References

• Control of Quantum Systems
• Robustness Issues in Quantum Control

Acknowledgments The authors were partially supported by the “Agence Nationale de la Recherche” (ANR), Projet Blanc EMAQS number ANR-2011-BS01-017-01.

Bibliography

Altafini C (2003) Controllability properties for finite dimensional quantum Markovian master equations. J Math Phys 44(6):2357–2372
Ball JM, Marsden JE, Slemrod M (1982) Controllability for distributed bilinear systems. SIAM J Control Optim 20:575–597
Baudouin L, Salomon J (2008) Constructive solutions of a bilinear control problem for a Schrödinger equation. Syst Control Lett 57(6):453–464
Beauchard K (2005) Local controllability of a 1-D Schrödinger equation. J Math Pures Appl 84:851–956
Beauchard K (2011) Local controllability and non-controllability for a 1D wave equation with bilinear control. J Diff Equ 250:2064–2098
Beauchard K, Laurent C (2010) Local controllability of 1D linear and nonlinear Schrödinger equations with bilinear control. J Math Pures Appl 94(5):520–554
Beauchard K, Mirrahimi M (2009) Practical stabilization of a quantum particle in a one-dimensional infinite square potential well. SIAM J Control Optim 48(2):1179–1205
Beauchard K, Morancey M (2014) Local controllability of 1D Schrödinger equations with bilinear control and minimal time, vol 4. Mathematical Control and Related Fields
Beauchard K, Nersesyan V (2010) Semi-global weak stabilization of bilinear Schrödinger equations. CRAS 348(19–20):1073–1078
Beauchard K, Coron J-M, Rouchon P (2010) Controllability issues for continuous spectrum systems and ensemble controllability of Bloch equations. Commun Math Phys 290(2):525–557
Beauchard K, Lange H, Teismann H (2013, preprint) Local exact controllability of a Bose-Einstein condensate in a 1D time-varying box. arXiv:1303.2713
Boscain U, Caponigro M, Chambrion T, Sigalotti M (2012) A weak spectral condition for the controllability of the bilinear Schrödinger equation with application to the control of a rotating planar molecule.
Kurniawan I, Dirr G, Helmke U (2012) Controllability aspects of quantum dynamics: unified approach for closed and open systems. IEEE Trans Autom Control 57(8):1984–1996
Li JS, Khaneja N (2009) Ensemble control of Bloch equations. IEEE Trans Autom Control 54(3):528–536
Liao S-K, Ho T-S, Chu S-I, Rabitz HH (2011) Fast-kick-off monotonically convergent algorithm for searching optimal control fields. Phys Rev A 84(3):031401
Loreti P, Komornik V (2005) Fourier series in control theory. Springer, New York
Mirrahimi M (2009) Lyapunov control of a quantum particle in a decaying potential. Ann Inst Henri Poincaré (c) Nonlinear Anal 26:1743–1765
Mirrahimi M, Rouchon P (2004) Controllability of quantum harmonic oscillators. IEEE Trans Autom Control 49(5):745–747
Nersesyan V (2010) Global approximate controllability for Schrödinger equation in higher Sobolev norms and applications. Ann IHP Nonlinear Anal 27(3):901–915
Nersesyan V, Nersisyan H (2012a) Global exact controllability in infinite time of Schrödinger equation. J Math Pures Appl 97(4):295–317
Nersesyan V, Nersisyan H (2012b) Global exact controllability in infinite time of Schrödinger equation: multidimensional case. Preprint: arXiv:1201.3445
Nielsen NC, Kehlet C, Glaser SJ, Khaneja N (2010) Optimal control methods in NMR spectroscopy. eMagRes
Turinici G (2000) On the controllability of bilinear quan-
Commun Math Phys 311(2):423–455 tum systems. In: Le Bris C, Defranceschi M (eds)
Boscain U, Chambrion T, Sigalotti M (2013) On Mathematical models and methods for ab initio quan-
some open questions in bilinear quantum control. tum chemistry. Lecture notes in chemistry, vol 74.
arXiv:1304.7181 Springer
Brézis H (1999) Analyse fonctionnelles: théorie et appli-
cations. Dunod, Paris
Cancès E, Le Bris C, Pilot M (2000) Contrôle optimal
bilinéaire d’une équation de Schrödinger. CRAS Paris
330:567–571 Boundary Control of 1-D Hyperbolic
Chambrion T, Mason P, Sigalotti M, Boscain M (2009)
Controllability of the discrete-spectrum Schrödinger
Systems
equation driven by an external field. Ann Inst Henri
Poincaré Anal Nonlinéaire 26(l):329–349 Georges Bastin1 and Jean-Michel Coron2
Coron J-M (2006) On the small-time local controllability 1
Department of Mathematical Engineering,
of a quantum particule in a moving one-dimensional
infinite square potential well. C R Acad Sci Paris I University Catholique de Louvain,
342:103–108 Louvain-La-Neuve, Belgium
2
Coron J-M (2007) Control and nonlinearity. Mathematical Laboratoire Jacques-Louis Lions, University
surveys and monographs, vol 136. American Mathe- Pierre et Marie Curie, Paris, France
matical Society, Providence
D’Alessandro D (2008) Introduction to quantum control
and dynamics. Applied mathematics and nonlinear
science. Chapman & Hall/CRC, Boca Raton
Ervedoza S, Puel J-P (2009) Approximate controllability Abstract
for a system of Schrödinger equations modeling a
single trapped ion. Ann Inst Henri Poincaré Anal One-dimensional hyperbolic systems are com-
Nonlinéaire 26(6):2111–2136
monly used to describe the evolution of various
Fu H, Schirmer SG, Solomon AI (2001) Complete con-
trollability of finite level quantum systems. J Phys A physical systems. For many of these systems,
34(8):1678–1690 controls are available on the boundary. There
Boundary Control of 1-D Hyperbolic Systems 83

 ! 
are then two natural questions: controllability (steer the system from a given state to a desired target) and stabilization (construct feedback laws leading to a good behavior of the closed-loop system around a given set point).

Keywords

Chromatography; Controllability; Electrical lines; Hyperbolic systems; Open channels; Road traffic; Stabilization

One-Dimensional Hyperbolic Systems

The operation of many physical systems may be represented by hyperbolic systems in one space dimension. These systems are described by the following partial differential equation:

  Y_t + A(Y) Y_x = 0,    t ∈ [0,T], x ∈ [0,L],    (1)

where:
• t and x are two independent variables: a time variable t ∈ [0,T] and a space variable x ∈ [0,L] over a finite interval.
• Y : [0,T] × [0,L] → R^n is the vector of state variables.
• A : R^n → M_{n,n}(R), with M_{n,n}(R) the set of n × n real matrices.
• Y_t and Y_x denote the partial derivatives of Y with respect to t and x, respectively.

The system (1) is hyperbolic, which means that A(Y) has n distinct real eigenvalues (called characteristic velocities) for all Y in a domain of R^n. Here are some typical examples of physical models having the form of a hyperbolic system.

Electrical Lines
First proposed by Heaviside in (1885, 1886 and 1887), the equations of (lossless) electrical lines (also called telegrapher equations) describe the propagation of current and voltage along electrical transmission lines (see Fig. 1). It is a hyperbolic system of the following form:

  ( I_t )   (   0     1/L_s ) ( I_x )
  ( V_t ) + ( 1/C_s    0    ) ( V_x ) = 0,    (2)

where I(t,x) is the current intensity, V(t,x) is the voltage, L_s is the self-inductance per unit length, and C_s is the self-capacitance per unit length. The system has two characteristic velocities (which are the eigenvalues of the matrix A):

  λ_1 = 1/√(L_s C_s) > 0 > λ_2 = −1/√(L_s C_s).    (3)

Saint-Venant Equation for Open Channels
First proposed by Barré de Saint-Venant in (1871), the Saint-Venant equations (also called shallow water equations) describe the propagation of water in open channels (see Fig. 2). In the case of a horizontal channel with rectangular cross section, unit width, and negligible friction, the Saint-Venant model is a hyperbolic system of the form

  ( H_t )   ( V  H ) ( H_x )
  ( V_t ) + ( g  V ) ( V_x ) = 0,    (4)

where H(t,x) is the water depth, V(t,x) is the horizontal water velocity, and g is the gravity acceleration. Under subcritical flow conditions, the system is hyperbolic with characteristic velocities

  λ_1 = V + √(gH) > 0 > λ_2 = V − √(gH).    (5)

Aw-Rascle Equations for Fluid Models of Road Traffic
In the fluid paradigm for road traffic modeling, the traffic is described in terms of two basic macroscopic state variables: the density ρ(t,x) and the speed V(t,x) of the vehicles at position x along the road at time t. The following dynamical model for road traffic was proposed by Aw and Rascle in (2000):

  ( ρ_t )   ( V   ρ        ) ( ρ_x )
  ( V_t ) + ( 0   V − Q(ρ) ) ( V_x ) = 0.    (6)
Boundary Control of 1-D Hyperbolic Systems, Fig. 1 Transmission line connecting a power supply to a resistive load R_ℓ; the power supply is represented by a Thévenin equivalent with emf U(t) and internal resistance R_g

Boundary Control of 1-D Hyperbolic Systems, Fig. 2 Lateral view of a pool of a horizontal open channel

The system is hyperbolic with characteristic velocities

  λ_1 = V > λ_2 = V − Q(ρ).    (7)

In this model, the first equation of (6) is a continuity equation that represents the conservation of the number of vehicles on the road. The second equation of (6) is a phenomenological model describing the speed variations induced by the drivers' behavior.

Chromatography
In chromatography, a mixture of species with different affinities is injected in the carrying fluid at the entrance of the process, as illustrated in Fig. 3. The various substances travel at different propagation speeds and are ultimately separated in different bands. The dynamics of the mixture are described by a system of partial differential equations:

  (P_i + L_i(P))_t + V (P_i)_x = 0,    i = 1, …, n,

  L_i(P) = k_i P_i / (1 + Σ_j k_j P_j / P_max),    (8)

where P_i (i = 1, …, n) denote the densities of the n carried species. The function L_i(P) (called the "Langmuir isotherm") was proposed by Langmuir in (1916).

Boundary Control of 1-D Hyperbolic Systems, Fig. 3 Principle of chromatography
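As a quick numerical sanity check (an illustration added here, with arbitrary parameter values that are not from the text), the characteristic velocities (3), (5), and (7) are exactly the eigenvalues of the matrices A(Y) appearing in (2), (4), and (6):

```python
import numpy as np

# Illustrative parameter values (assumed, for demonstration only).
Ls, Cs = 0.5e-6, 0.2e-9        # telegrapher equations (2)
g, H, V = 9.81, 2.0, 1.0       # Saint-Venant equations (4); subcritical: V < sqrt(g*H)
rho, Q = 0.3, 0.8              # Aw-Rascle equations (6), with Q standing for Q(rho)

def char_velocities(A):
    """Characteristic velocities = eigenvalues of A(Y), sorted in decreasing order."""
    return np.sort(np.linalg.eigvals(A))[::-1]

# Telegrapher line: eigenvalues are +-1/sqrt(Ls*Cs), cf. (3)
lam = char_velocities(np.array([[0.0, 1.0 / Ls], [1.0 / Cs, 0.0]]))
assert np.allclose(lam, [1 / np.sqrt(Ls * Cs), -1 / np.sqrt(Ls * Cs)])

# Saint-Venant: eigenvalues are V +- sqrt(g*H), cf. (5)
lam = char_velocities(np.array([[V, H], [g, V]]))
assert np.allclose(lam, [V + np.sqrt(g * H), V - np.sqrt(g * H)])

# Aw-Rascle: eigenvalues are V and V - Q(rho), cf. (7)
lam = char_velocities(np.array([[V, rho], [0.0, V - Q]]))
assert np.allclose(lam, [V, V - Q])
```

Each assertion passes, confirming that the explicit formulas quoted in the text agree with the spectra of the corresponding matrices.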
Boundary Control

Boundary control of 1-D hyperbolic systems refers to situations where manipulated control inputs are physically located at the boundaries. Formally, this means that the system (1) is considered under n boundary conditions having the general form

  B( Y(t,0), Y(t,L), U(t) ) = 0,    (9)

with B : R^n × R^n × R^q → R^n. The dependence of the map B on (Y(t,0), Y(t,L)) refers to natural physical constraints on the system. The function U(t) ∈ R^q represents a set of q exogenous control inputs. The following examples illustrate how the control boundary conditions (9) may be defined for some commonly used control devices:

1. Electrical lines. For the circuit represented in Fig. 1, the line model (2) is to be considered under the following boundary conditions:

  V(t,0) + R_g I(t,0) = U(t),
  V(t,L) − R_ℓ I(t,L) = 0.

The telegrapher equations (2) coupled with these boundary conditions constitute therefore a boundary control system with the voltage U(t) as control input.

2. Open channels. A standard situation is when the boundary conditions are assigned by tunable hydraulic gates, as in irrigation canals and navigable rivers; see Fig. 4. The hydraulic model of mobile spillways gives the boundary conditions

  H(t,0) V(t,0) = k_G √( (Z_0(t) − U_0(t))^3 ),
  H(t,L) V(t,L) = k_G √( (H(t,L) − U_L(t))^3 ),

where H(t,0) and H(t,L) denote the water depths at the boundaries inside the pool, Z_0(t) and Z_L(t) are the water levels on the other side of the gates, k_G is a constant gate shape parameter, and U_0 and U_L represent the weir elevations. The Saint-Venant equations coupled to these boundary conditions constitute a boundary control system with U_0(t) and U_L(t) as command signals.

Boundary Control of 1-D Hyperbolic Systems, Fig. 4 Hydraulic gates at the input and the output of a pool

3. Ramp metering. Ramp metering is a strategy that uses traffic lights to regulate the flow of traffic entering freeways according to measured traffic conditions, as illustrated in Fig. 5. For the stretch of motorway represented in this figure, the boundary conditions are

  ρ(t,0) V(t,0) = Q_in(t) + U(t),
  ρ(t,L) V(t,L) = Q_out(t),

where U(t) is the inflow rate controlled by the traffic lights. The Aw-Rascle equations (6) coupled to these boundary conditions constitute a boundary control system with U(t) as the command signal.
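Returning to the open-channel example 2: the first spillway relation can be solved for the weir elevation U_0 that produces a prescribed discharge q = H(t,0)V(t,0), namely U_0 = Z_0 − (q/k_G)^(2/3). The following minimal sketch (the helper name and all numerical values are made up for illustration, not part of the original entry) carries out this inversion:

```python
import numpy as np

# Hypothetical helper: invert the spillway relation
# q = k_gate * sqrt((z_upstream - u0)**3) for the weir elevation u0.
def weir_elevation(q_target, z_upstream, k_gate):
    return z_upstream - (q_target / k_gate) ** (2.0 / 3.0)

# Made-up numbers: upstream level 3.0 m, gate shape parameter 1.5
u0 = weir_elevation(q_target=2.0, z_upstream=3.0, k_gate=1.5)

# Substituting u0 back into the gate relation recovers the target discharge
q = 1.5 * np.sqrt((3.0 - u0) ** 3)
assert np.isclose(q, 2.0)
```

Such an inversion is the static counterpart of the feedback implementations discussed next, where the command signals are updated from boundary measurements.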
Boundary Control of 1-D Hyperbolic Systems, Fig. 5 Ramp metering on a stretch of a motorway

In a feedback implementation of the ramp metering strategy, U(t) may be a function of the measured disturbances Q_in(t) or Q_out(t) that are imposed by the traffic conditions.

4. Simulated moving bed chromatography is a technology where several interconnected chromatographic columns are switched periodically against the fluid flow. This allows for a continuous separation with a better performance than the discontinuous single-column chromatography. An efficient operation of SMB chromatography requires a tight control of the process by manipulating the inflow rates in the columns. This process is therefore a typical example of a periodic boundary control hyperbolic system.

Controllability

In this section and in the following one, Y* ∈ R^n is such that none of the eigenvalues of A(Y*) are 0. After an appropriate linear state transformation, the matrix A(Y*) can be assumed to be diagonal, with distinct and nonzero entries:

  A(Y*) = diag(λ_1, λ_2, …, λ_n),    λ_1 > λ_2 > … > λ_m > 0 > λ_{m+1} > … > λ_n.    (10)

Let Y^+ ∈ R^m and Y^- ∈ R^(n−m) be such that Y = ((Y^+)^T, (Y^-)^T)^T.

For the boundary control system (1), (9), the local controllability issue is to investigate if, starting from a given initial state Y_0 : x ∈ [0,L] ↦ Y_0(x) ∈ R^n, it is possible to reach in time T a desired target state Y_1 : x ∈ [0,L] ↦ Y_1(x) ∈ R^n, with Y_0(x) and Y_1(x) close to Y*.

Theorem 4 (See Li and Rao 2003) If there exist control inputs U^+(t) and U^-(t) such that the boundary conditions (9) are equivalent to

  Y^+(t,0) = U^+(t),    Y^-(t,L) = U^-(t),    (11)

then the boundary control system (1), (11) is locally controllable for the C^1-norm if and only if T > T_c, with

  T_c = max( L/|λ_1|, …, L/|λ_n| ).
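To make Theorem 4 concrete, here is a small sketch (the numerical values are assumed, for illustration only) computing the critical time T_c from the characteristic velocities of (10); T_c is dictated by the slowest characteristic direction:

```python
# Critical control time from Theorem 4: Tc = max_i L / |lambda_i|.
def critical_time(L, velocities):
    if any(lam == 0 for lam in velocities):
        raise ValueError("Theorem 4 assumes nonzero characteristic velocities")
    return max(L / abs(lam) for lam in velocities)

# Saint-Venant-type velocities (5) with V = 1.0 and g*H = 19.62 (assumed values)
lams = (1.0 + 19.62 ** 0.5, 1.0 - 19.62 ** 0.5)   # roughly (5.43, -3.43)
Tc = critical_time(L=100.0, velocities=lams)

# The slowest characteristic direction dictates the critical time
assert Tc == 100.0 / abs(lams[1])
```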
Feedback Stabilization

For the boundary control system (1), (9), the problem of local boundary feedback stabilization is the problem of finding boundary feedback control actions

  U(t) = F(Y(t,0), Y(t,L), Y*),    F : R^n × R^n × R^n → R^q,    (12)

such that the system trajectory exponentially converges to a desired steady state Y* (called set point) from any initial condition Y_0(x) close to Y*. In such a case, the set point is said to be exponentially stable.

Theorem 5 (See Coron et al. 2008) If there exists a boundary feedback U(t) = F(Y(t,0), Y(t,L), Y*) such that the boundary conditions (9) are written in the form

  ( Y^+(t,0) )       ( Y^+(t,L) )
  ( Y^-(t,L) )  =  G ( Y^-(t,0) ),    G(Y*) = Y*,    (13)

then, for the boundary control system (1), (13), the set point Y* is locally exponentially stable for the H^2-norm if

  inf { ‖Δ G′(Y*) Δ^(−1)‖ ; Δ ∈ D } < 1,

where ‖·‖ denotes the usual 2-norm of n × n real matrices, G′(Y*) denotes the Jacobian matrix of the map G at Y*, and D denotes the set of n × n diagonal real matrices with strictly positive diagonal entries.

For the stabilization in the C^1-norm, another sufficient condition is given in Li (1994).

Summary and Future Directions

With suitable boundary controls, hyperbolic systems can be controlled and stabilized around a desired set point. However, in many situations the hyperbolic model is not sufficient: one needs to add a zero-order term, and (1) has to be replaced by

  Y_t + A(Y) Y_x + C(Y) = 0,    t ∈ [0,T], x ∈ [0,L],    (14)

where C : R^n → R^n. This is, for example, the case for open channels when slope and friction cannot be neglected. Note that the set point Y* may now depend on x. For the controllability issue, the new term C(Y) turns out to be not essential; see in particular Li (2010). The situation is not the same for the stabilization, where only partial results are known. In particular, Coron et al. (2013) uses Krstic's backstepping approach (Krstic and Smyshlyaev 2008) to treat the case n = 2 and m = 1.

Another important issue for the system (1) is the observability problem: assuming that the state is measured on the boundary during the time interval [0,T], can one recover the initial data? As shown in Li (2010), this problem has strong connections with the controllability problem, and the system (1) is observable if the time T is large enough.

The above results are on smooth solutions of (1). However, the system (1) is known to be well posed in the class of BV solutions (functions of bounded variation), with extra conditions (e.g., of entropy type); see in particular Bressan (2000). There are partial results on the controllability in this class; see, in particular, Ancona and Marson (1998) and Horsin (1998) for n = 1. For n = 2, it is shown in Bressan and Coclite (2002) that Theorem 4 no longer holds in general in the BV class. However, there are positive results for important physical systems; see, for example, Glass (2007) for the 1-D isentropic Euler equation.

Cross-References

▶ Controllability and Observability
▶ Control of Fluids and Fluid-Structure Interactions
▶ Control of Linear Systems with Delays
▶ Feedback Stabilization of Nonlinear Systems
▶ Lyapunov's Stability Theory

Bibliography

Ancona F, Marson A (1998) On the attainable set for scalar nonlinear conservation laws with boundary control. SIAM J Control Optim 36(1):290–312
Aw A, Rascle M (2000) Resurrection of "second order" models of traffic flow. SIAM J Appl Math 60(3):916–938
Barré de Saint-Venant A-C (1871) Théorie du mouvement non permanent des eaux, avec application aux crues des rivières et à l'introduction des marées dans leur lit. Comptes rendus de l'Académie des Sciences de Paris, Série 1, Mathématiques, 53:147–154
Bressan A (2000) Hyperbolic systems of conservation laws. The one-dimensional Cauchy problem. Oxford lecture series in mathematics and its applications, vol 20. Oxford University Press, Oxford
Bressan A, Coclite GM (2002) On the boundary control of systems of conservation laws. SIAM J Control Optim 41(2):607–622
Coron J-M, Bastin G, d'Andréa-Novel B (2008) Dissipative boundary conditions for one-dimensional nonlinear hyperbolic systems. SIAM J Control Optim 47(3):1460–1498
Coron J-M, Vazquez R, Krstic M, Bastin G (2013) Local exponential H^2 stabilization of a 2 × 2 quasilinear hyperbolic system using backstepping. SIAM J Control Optim 51(3):2005–2035
Glass O (2007) On the controllability of the 1-D isentropic Euler equation. J Eur Math Soc (JEMS) 9(3):427–486
Heaviside O (1885, 1886 and 1887) Electromagnetic induction and its propagation. The Electrician; reprinted in Electrical Papers, 2 vols. Macmillan, London, 1892
Horsin T (1998) On the controllability of the Burgers equation. ESAIM Control Optim Calc Var 3:83–95
Krstic M, Smyshlyaev A (2008) Boundary control of PDEs. A course on backstepping designs. Advances in design and control, vol 16. Society for Industrial and Applied Mathematics (SIAM), Philadelphia
Langmuir I (1916) The constitution and fundamental properties of solids and liquids. Part I. Solids. J Am Chem Soc 38:2221–2295
Li T (1994) Global classical solutions for quasilinear hyperbolic systems. RAM: research in applied mathematics, vol 32. Masson, Paris
Li T (2010) Controllability and observability for quasilinear hyperbolic systems. AIMS series on applied mathematics, vol 3. American Institute of Mathematical Sciences (AIMS), Springfield
Li T, Rao B-P (2003) Exact boundary controllability for quasi-linear hyperbolic systems. SIAM J Control Optim 41(6):1748–1755


Boundary Control of Korteweg-de Vries and Kuramoto–Sivashinsky PDEs

Eduardo Cerpa
Departamento de Matemática, Universidad Técnica Federico Santa María, Valparaíso, Chile

Abstract

The Korteweg-de Vries (KdV) and the Kuramoto-Sivashinsky (KS) partial differential equations are used to model nonlinear propagation of one-dimensional phenomena. The KdV equation is used in fluid mechanics to describe wave propagation in shallow water surfaces, while the KS equation models front propagation in reaction-diffusion systems. In this article, the boundary control of these equations is considered when they are posed on a bounded interval. Different choices of controls are studied for each equation.

Keywords

Controllability; Dispersive equations; Higher-order partial differential equations; Parabolic equations; Stabilizability

Introduction

The Korteweg-de Vries (KdV) and the Kuramoto-Sivashinsky (KS) equations have very different properties because they do not belong to the same class of partial differential equations (PDEs). The first one is a third-order nonlinear dispersive equation

  y_t + y_x + y_xxx + y y_x = 0,    (1)

and the second one is a fourth-order nonlinear parabolic equation

  u_t + u_xxxx + λ u_xx + u u_x = 0,    (2)

where λ > 0 is called the anti-diffusion parameter. However, they have one important characteristic in common: they are both used to model nonlinear propagation phenomena in the space x-direction when the variable t stands for time. The KdV equation serves as a model for wave propagation in shallow water surfaces (Korteweg and de Vries 1895), and the KS equation models front propagation in reaction-diffusion phenomena including some instability effects (Kuramoto and Tsuzuki 1975; Sivashinsky 1977).

From a control point of view, a new common characteristic arises. Because of the order of the spatial derivatives involved, when studying these equations on a bounded interval [0,L], two boundary conditions have to be imposed at the same point, for instance, at x = L. Thus, we can consider control systems where we control one boundary condition but not all the boundary data at one endpoint of the interval. This
configuration is not possible for the classical wave and heat equations, where at each extreme only one boundary condition exists and, therefore, controlling one or all the boundary data at one point is the same.

The KdV equation being of third order in space, three boundary conditions have to be imposed: one at the left endpoint x = 0 and two at the right endpoint x = L. For the KS equation, four boundary conditions are needed to get a well-posed system, two at each extreme. We will focus on the cases where Dirichlet and Neumann boundary conditions are considered, because lack-of-controllability phenomena appear. This holds for some special values of the length of the interval for the KdV equation and depends on the anti-diffusion coefficient for the KS equation.

The particular cases where the lack of controllability occurs can be seen as isolated anomalies. However, those phenomena give us important information on the systems. In particular, any method independent of the value of those constants cannot control or stabilize the system when acting from the corresponding control input where trouble appears. In all of these cases, for both the KdV and the KS equations, the space of uncontrollable states is finite dimensional, and therefore some methods coming from the control of ordinary differential equations can be applied.

General Definitions

Infinite-dimensional control systems described by PDEs have attracted a lot of attention since the 1970s. In this framework, the state of the control system is given by the solution of an evolution PDE. This solution can be seen as a trajectory in an infinite-dimensional Hilbert space H, for instance, the space of square integrable functions or some Sobolev space. Thus, for any time t, the state belongs to H. Concerning the control input, this is either an internal force distributed in the domain, or a punctual force localized within the domain, or some boundary data as considered in this article. For any time t, the control belongs to a control space U, which can be, for instance, the space of bounded functions.

The main control properties to be mentioned in this article are controllability, stability, and stabilization. A control system is said to be exactly controllable if the system can be driven from any initial state to another one in finite time. This kind of property holds, for instance, for hyperbolic systems such as the wave equation. The notion of null-controllability means that the system can be driven to the origin from any initial state. The main example for this property is the heat equation, which presents regularizing effects. Even if the initial data is discontinuous, right after t = 0 the solution of the heat equation becomes very smooth, and therefore it is not possible to impose a discontinuous final state. A system is said to be asymptotically stable if the solutions of the system without any control converge, as the time goes to infinity, to a stationary solution of the PDE. When this convergence holds with a control depending at each time on the state of the system (feedback control), the system is said to be stabilizable by means of a feedback control law.

All these properties have local versions when a smallness condition on the initial and/or the final state is added. This local character is normally due to the nonlinearity of the system.

The KdV Equation

The classical approach to deal with nonlinearities is first to linearize the system around a given state or trajectory, then to study the linear system, and finally to go back to the nonlinear one by means of an inversion argument or a fixed-point theorem. Linearizing (1) around the origin, we get the equation

  y_t + y_x + y_xxx = 0,    (3)

which can be studied on a finite interval [0,L] under the following three boundary conditions:

  y(t,0) = h_1(t),  y(t,L) = h_2(t),  and  y_x(t,L) = h_3(t).    (4)
Thus, viewing h_1(t), h_2(t), h_3(t) ∈ R as controls and the solution y(t,·) : [0,L] → R as the state, we can consider the linear control system (3)–(4) and the nonlinear one (1)–(4).

We will report on the role of each input control when the other two are off. The tools used are mainly the duality controllability-observability, Carleman estimates, the multiplier method, the compactness-uniqueness argument, the backstepping method, and fixed-point theorems. Surprisingly, the control properties of the system depend strongly on the location of the controls.

Theorem 1 The linear KdV system (3)–(4) is:
1. Null-controllable when controlled from h_1 (i.e., h_2 = h_3 = 0) (Glass and Guerrero 2008).
2. Exactly controllable when controlled from h_2 (i.e., h_1 = h_3 = 0) if and only if L does not belong to a set O of critical lengths defined in Glass and Guerrero (2010).
3. Exactly controllable when controlled from h_3 (i.e., h_1 = h_2 = 0) if and only if L does not belong to a set of critical lengths N defined in Rosier (1997).
4. Asymptotically stable to the origin if L ∉ N and no control is applied (Perla Menzala et al. 2002).
5. Stabilizable by means of a feedback law using h_1 only (i.e., h_2 = h_3 = 0) (Cerpa and Coron 2013).

If L ∈ N or L ∈ O, one says that L is a critical length, since the linear control system (3)–(4) loses controllability properties when only one control input is applied. In those cases, there exists a finite-dimensional subspace of L^2(0,L) which is unreachable from 0 for the linear system. The sets N and O contain infinitely many critical lengths, but they are countable sets.

When one is allowed to use more than one boundary control input, there is no critical spatial domain, and the exact controllability holds for any L > 0. This is proved in Zhang (1999) when three boundary controls are used. The case of two control inputs is solved in Rosier (1997), Glass and Guerrero (2010), and Cerpa et al. (2013).

Previous results concern the linearized control system. Considering the nonlinearity y y_x, we obtain the original KdV control system and the following results.

Theorem 2 The nonlinear KdV system (1)–(4) is:
1. Locally null-controllable when controlled from h_1 (i.e., h_2 = h_3 = 0) (Glass and Guerrero 2008).
2. Locally exactly controllable when controlled from h_2 (i.e., h_1 = h_3 = 0) if L does not belong to the set O of critical lengths (Glass and Guerrero 2010).
3. Locally exactly controllable when controlled from h_3 (i.e., h_1 = h_2 = 0). If L belongs to the set of critical lengths N, then a minimal time of control may be required (see Cerpa 2014).
4. Asymptotically stable to the origin if L ∉ N and no control is applied (Perla Menzala et al. 2002).
5. Locally stabilizable by means of a feedback law using h_1 only (i.e., h_2 = h_3 = 0) (Cerpa and Coron 2013).

Item 3 in Theorem 2 is a truly nonlinear result obtained by applying a power series method introduced in Coron and Crépeau (2004). All other items are implied by perturbation arguments based on the linear control system. The related control system formed by (1) with the boundary controls

  y(t,0) = h_1(t),  y_x(t,L) = h_2(t),  and  y_xx(t,L) = h_3(t),    (5)

is studied in Cerpa et al. (2013), and the same phenomenon of critical lengths appears.
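For the control from h_3, the set N admits the explicit description given in Rosier (1997), N = { 2π √((k^2 + kl + l^2)/3) : k, l positive integers }. The following short sketch (added here as an illustration) enumerates its smallest elements; note that k = l = 1 gives the best-known critical length L = 2π:

```python
from math import pi, sqrt

def critical_lengths(max_kl):
    """Critical lengths N = {2*pi*sqrt((k^2 + k*l + l^2)/3)} from Rosier (1997)."""
    lengths = {2 * pi * sqrt((k * k + k * l + l * l) / 3.0)
               for k in range(1, max_kl + 1)
               for l in range(1, max_kl + 1)}
    return sorted(lengths)

smallest = critical_lengths(3)
# k = l = 1 gives the critical length L = 2*pi
assert abs(smallest[0] - 2 * pi) < 1e-12
```

At L = 2π, for instance, the stationary solution 1 − cos(x) of the linearized equation (3) is unreachable from 0, which is the prototype of the finite-dimensional unreachable subspace mentioned above.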
The KS Equation

Applying the same strategy as for KdV, we linearize (2) around the origin to get the equation

  u_t + u_xxxx + λ u_xx = 0,    (6)

which can be studied on the finite interval [0,1] under the following four boundary conditions:

  u(t,0) = v_1(t),  u_x(t,0) = v_2(t),  u(t,1) = v_3(t),  and  u_x(t,1) = v_4(t).    (7)

Thus, viewing v_1(t), v_2(t), v_3(t), v_4(t) ∈ R as controls and the solution u(t,·) : [0,1] → R as the state, we can consider the linear control system (6)–(7) and the nonlinear one (2)–(7). The role of the parameter λ is crucial. The KS equation is parabolic, and the eigenvalues of system (6)–(7) with no control (v_1 = v_2 = v_3 = v_4 = 0) go to −∞. If λ increases, then the eigenvalues move to the right. When λ > 4π^2, the system becomes unstable because there are a finite number of positive eigenvalues. In this unstable regime, the system loses control properties for some values of λ.

Theorem 3 The linear KS control system (6)–(7) is:
1. Null-controllable when controlled from v_1 and v_2 (i.e., v_3 = v_4 = 0). The same is true when controlling v_3 and v_4 (i.e., v_1 = v_2 = 0) (Cerpa and Mercado 2011; Lin Guo 2002).
2. Null-controllable when controlled from v_2 (i.e., v_1 = v_3 = v_4 = 0) if and only if λ does not belong to a countable set M defined in Cerpa (2010).
3. Asymptotically stable to the origin if λ < 4π^2 and no control is applied (Liu and Krstic 2001).
4. Stabilizable by means of a feedback law using v_2 only (i.e., v_1 = v_3 = v_4 = 0) if and only if λ ∉ M (Cerpa 2010).

In the critical case λ ∈ M, the linear system is not null-controllable anymore if we control v_2 only (item 2 in Theorem 3). The space of noncontrollable states is finite dimensional. To obtain the null-controllability of the linear system in these cases, we have to add another control. Controlling with v_2 and v_4 does not improve the situation in the critical cases. In contrast, the system becomes null-controllable if we can act on v_1 and v_2. This result with two input controls has been proved in Lin Guo (2002) for the case λ = 0 and in Cerpa and Mercado (2011) in the general case (item 1 in Theorem 3).

It is known from Liu and Krstic (2001) that if λ < 4π^2, then the system is exponentially stable in L^2(0,1). On the other hand, if λ = 4π^2, then zero becomes an eigenvalue of the system, and therefore the asymptotic stability fails. When λ > 4π^2, the system has positive eigenvalues and becomes unstable. In order to stabilize this system, a finite-dimensional-based feedback law can be designed by using the pole placement method (item 4 in Theorem 3).

Previous results concern the linearized control system. If we add the nonlinearity u u_x, we obtain the original KS control system and the following results.

Theorem 4 The KS control system (2)–(7) is:
1. Locally null-controllable when controlled from v_1 and v_2 (i.e., v_3 = v_4 = 0). The same is true when controlling v_3 and v_4 (i.e., v_1 = v_2 = 0) (Cerpa and Mercado 2011).
2. Asymptotically stable to the origin if λ < 4π^2 and no control is applied (Liu and Krstic 2001).

There are fewer results for the nonlinear system than for the linear one. This is due to the fact that the spectral techniques used to study the linear system with only one control input are not robust enough to deal with perturbations in order to address the nonlinear control system.

Summary and Future Directions

The KdV and the KS equations both exhibit noncontrollability results when one boundary control input is applied. This is due to the fact that both are higher-order equations, and therefore, when posed on a bounded interval, more than one boundary condition should be imposed at the same point. The KdV equation is exactly controllable when acting from the right and null-controllable when acting from the left. On the other hand, the KS equation, being parabolic as the heat equation, is not exactly controllable but null-controllable. Most of the results are implied by the behaviors of the corresponding linear system, which are very well understood.
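One instance of the critical behavior recalled above can be checked symbolically: for the linearized KS system, the function φ(x) = 1 − cos(2πx) (our choice of test function, not taken from the original entry) satisfies the homogeneous versions of the boundary conditions (7), and φ'''' + λφ'' vanishes identically exactly when λ = 4π², confirming that zero is an eigenvalue at the critical value. A sketch of this verification:

```python
import sympy as sp

x, lam = sp.symbols("x lambda")
phi = 1 - sp.cos(2 * sp.pi * x)

# Homogeneous boundary conditions (7): u = u_x = 0 at x = 0 and x = 1
assert phi.subs(x, 0) == 0 and phi.subs(x, 1) == 0
dphi = sp.diff(phi, x)
assert dphi.subs(x, 0) == 0 and dphi.subs(x, 1) == 0

# phi'''' + lam * phi'' is identically zero iff lam = 4*pi**2,
# so zero is an eigenvalue of the linearized KS operator at the critical value.
residual = sp.diff(phi, x, 4) + lam * sp.diff(phi, x, 2)
assert sp.simplify(residual.subs(lam, 4 * sp.pi**2)) == 0
assert sp.simplify(residual.subs(lam, 1)) != 0
```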
For the KdV equation, the main directions to investigate at this moment are the controllability and the stability for the nonlinear equation in critical domains. Among others, some questions concerning controllability, minimal time of control, and decay rates for the stability are open. Regarding the KS equation, there are few results for the nonlinear system with one control input, even if we are not at a critical value of the anti-diffusion parameter. In the critical cases, the controllability and stability issues are wide open.

In general, for PDEs, there are few results about delay phenomena, output feedback laws, adaptive control, and other classical questions in control theory. The existing results on these topics mainly concern the more popular heat and wave equations. As the KdV and KS equations are one dimensional in space, many mathematical tools are available to tackle those problems. For all that, in our opinion, the KdV and KS equations are excellent candidates to continue investigating these control properties in a PDE framework.

Bibliography

Armaou A, Christofides PD (2000) Feedback control of the Kuramoto-Sivashinsky equation. Physica D 137:49–61
Cerpa E (2010) Null controllability and stabilization of a linear Kuramoto-Sivashinsky equation. Commun Pure Appl Anal 9:91–102
Cerpa E (2014) Control of a Korteweg-de Vries equation: a tutorial. Math Control Relat Fields 4:45–99
Cerpa E, Coron J-M (2013) Rapid stabilization for a Korteweg-de Vries equation from the left Dirichlet boundary condition. IEEE Trans Autom Control 58:1688–1695
Cerpa E, Mercado A (2011) Local exact controllability to the trajectories of the 1-D Kuramoto-Sivashinsky equation. J Differ Equ 250:2024–2044
Cerpa E, Rivas I, Zhang B-Y (2013) Boundary controllability of the Korteweg-de Vries equation on a bounded domain. SIAM J Control Optim 51:2976–3010
Christofides PD, Armaou A (2000) Global stabilization of the Kuramoto-Sivashinsky equation via distributed output feedback control. Syst Control Lett 39:283–294
Coron J-M (2007) Control and nonlinearity. American Mathematical Society, Providence
Coron J-M, Crépeau E (2004) Exact boundary controllability of a nonlinear KdV equation with critical
lengths. J Eur Math Soc 6:367–398
Glass O, Guerrero S (2008) Some exact controllability
results for the linear KdV equation and uniform con-
Cross-References trollability in the zero-dispersion limit. Asymptot Anal
60:61–100
Glass O, Guerrero S (2010) Controllability of the KdV
 Boundary Control of 1-D Hyperbolic Systems
equation from the right Dirichlet boundary condition.
 Controllability and Observability Syst Control Lett 59:390–395
 Control of Fluids and Fluid-Structure Interac- Korteweg DJ, de Vries G (1895) On the change of form
tions of long waves advancing in a rectangular canal, and
on a new type of long stationary waves. Philos Mag
 Feedback Stabilization of Nonlinear Systems
39:422–443
 Stability: Lyapunov, Linear Systems Krstic M (2009) Delay compensation for nonlinear, adap-
tive, and PDE systems. Birkhauser, Boston
Kuramoto Y, Tsuzuki T (1975) On the formation of dissi-
pative structures in reaction-diffusion systems. Theor
Recommended Reading Phys 54:687–699
Lin Guo Y-J (2002) Null boundary controllability for a
The book Coron (2007) is a very good refer- fourth order parabolic equation. Taiwan J Math 6:
421–431
ence to study the control of PDEs. In Cerpa
Liu W-J, Krstic M (2001) Stability enhancement by
(2014), a tutorial presentation of the KdV con- boundary control in the Kuramoto-Sivashinsky equa-
trol system is given. Control system for PDEs tion. Nonlinear Anal Ser A Theory Methods 43:
with boundary conditions and internal controls is 485–507
Perla Menzala G, Vasconcellos CF, Zuazua E (2002)
considered in Rosier and Zhang (2009) and the Stabilization of the Korteweg-de Vries equation with
references therein for the KdV equation and in localized damping. Q Appl Math LX:111–129
Armaou and Christofides (2000) and Christofides Rosier L (1997) Exact boundary controllability for the
and Armaou (2000) for the KS equation. Control Korteweg-de Vries equation on a bounded domain.
ESAIM Control Optim Calc Var 2:33–55
topics as delay and adaptive control are studied Rosier L, Zhang B-Y (2009) Control and stabilization
in the framework of PDEs in Krstic (2009) and of the Korteweg-de Vries equation: recent progresses.
Smyshlyaev and Krstic (2010), respectively. J Syst Sci Complex 22:647–682
Sivashinsky GI (1977) Nonlinear analysis of hydrodynamic instability in laminar flames – I. Derivation of basic equations. Acta Astronaut 4:1177–1206
Smyshlyaev A, Krstic M (2010) Adaptive control of parabolic PDEs. Princeton University Press, Princeton
Zhang B-Y (1999) Exact boundary controllability of the Korteweg-de Vries equation. SIAM J Control Optim 37:543–565

Bounds on Estimation

Arye Nehorai¹ and Gongguo Tang²
¹Preston M. Green Department of Electrical and Systems Engineering, Washington University in St. Louis, St. Louis, MO, USA
²Department of Electrical Engineering & Computer Science, Colorado School of Mines, Golden, CO, USA

Abstract

We review several universal lower bounds on statistical estimation, including deterministic bounds on unbiased estimators such as the Cramér-Rao bound and the Barankin-type bound, as well as Bayesian bounds such as the Ziv-Zakai bound. We present explicit forms of these bounds, illustrate their usage for parameter estimation in Gaussian additive noise, and compare their tightness.

Keywords

Barankin-type bound; Cramér-Rao bound; Mean-squared error; Statistical estimation; Ziv-Zakai bound

Introduction

Statistical estimation involves inferring the values of parameters specifying a statistical model from data. The performance of a particular statistical algorithm is measured by the error between the true parameter values and those estimated by the algorithm. However, explicit forms of the estimation error are usually difficult to obtain except for the simplest statistical models. Therefore, performance bounds are derived as a way of quantifying estimation accuracy while maintaining tractability.

In many cases, it is beneficial to quantify performance using universal bounds that are independent of the estimation algorithms and rely only upon the model. In this regard, universal lower bounds are particularly useful as they provide means to assess the difficulty of performing estimation for a particular model and can act as benchmarks to evaluate the quality of any algorithm: the closer the estimation error of the algorithm to the lower bound, the better the algorithm. In the following, we review three widely used universal lower bounds on estimation: the Cramér-Rao bound (CRB), the Barankin-type bound (BTB), and the Ziv-Zakai bound (ZZB). These bounds find numerous applications in determining the performance of sensor arrays, radar, and nonlinear filtering; in benchmarking various algorithms; and in optimal design of systems.

Statistical Model and Related Concepts

To formalize matters, we define a statistical model for estimation as a family of parameterized probability density functions in $\mathbb{R}^N$: $\{p(x;\theta) : \theta \in \Theta \subset \mathbb{R}^d\}$. We observe a realization of $x \in \mathbb{R}^N$ generated from a distribution $p(x;\theta)$, where $\theta \in \Theta$ is the true parameter to be estimated from data $x$. Though we assume a single observation $x$, the model is general enough to encompass multiple independent, identically distributed (i.i.d.) samples by considering the joint probability distribution.

An estimator of $\theta$ is a measurable function of the observation, $\hat{\theta}(x) : \mathbb{R}^N \to \Theta$. An unbiased estimator is one such that

$$E_\theta\{\hat{\theta}(x)\} = \theta, \quad \forall \theta \in \Theta. \tag{1}$$
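The unbiasedness property can be checked empirically for a simple case. The following sketch is an illustration added to this text, not part of the original entry: the Gaussian location model, sample sizes, and seed are all choices made for the example. It draws many independent realizations and verifies that the sample mean, a measurable function of the observation, has negligible bias to within Monte Carlo accuracy.

```python
import random

def sample_mean_estimator(x):
    # theta_hat(x): a measurable function of the observation x
    return sum(x) / len(x)

def empirical_bias(theta, sigma=1.0, n_obs=10, trials=2000, seed=0):
    # Average theta_hat over many independent realizations x ~ p(x; theta);
    # for an unbiased estimator this average should approach theta.
    rng = random.Random(seed)
    est = [sample_mean_estimator([rng.gauss(theta, sigma) for _ in range(n_obs)])
           for _ in range(trials)]
    return sum(est) / trials - theta

if __name__ == "__main__":
    for theta in (0.0, 1.5, -2.0):
        print(f"theta = {theta:+.1f}, empirical bias = {empirical_bias(theta):+.4f}")
```

With 2000 trials of 10 observations each, the standard deviation of the averaged estimate is about $\sigma/\sqrt{20{,}000} \approx 0.007$, so the reported biases should be small compared to $\sigma$.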
Here we used the subscript $\theta$ to emphasize that the expectation is taken with respect to $p(x;\theta)$. We focus on the performance of unbiased estimators in this entry. There are various ways to measure the error of the estimator $\hat{\theta}(x)$. Two typical ones are the error covariance matrix

$$E_\theta\{(\hat{\theta} - \theta)(\hat{\theta} - \theta)^T\} = \mathrm{Cov}(\hat{\theta}), \tag{2}$$

where the equation holds only for unbiased estimators, and the mean-squared error (MSE)

$$E_\theta\{\|\hat{\theta}(x) - \theta\|_2^2\} = \mathrm{trace}\left(E_\theta\{(\hat{\theta} - \theta)(\hat{\theta} - \theta)^T\}\right). \tag{3}$$

Example 1 (Signal in additive Gaussian noise (SAGN)) To illustrate the usage of different estimation bounds, we use the following statistical model as a running example:

$$x_n = s_n(\theta) + w_n, \quad n = 0, \ldots, N-1. \tag{4}$$

Here $\theta \in \Theta \subset \mathbb{R}$ is a scalar parameter to be estimated and the noise $w_n$ follows an i.i.d. Gaussian distribution with mean 0 and known variance $\sigma^2$. Therefore, the density function for $x$ is

$$p(x;\theta) = \prod_{n=0}^{N-1} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x_n - s_n(\theta))^2}{2\sigma^2}\right) = \frac{1}{(\sqrt{2\pi}\sigma)^N} \exp\left\{-\sum_{n=0}^{N-1} \frac{(x_n - s_n(\theta))^2}{2\sigma^2}\right\}.$$

In particular, we consider the frequency estimation problem where $s_n(\theta) = \cos(2\pi\theta n)$ with $\Theta = [0, \tfrac{1}{4})$.

Cramér-Rao Bound

The Cramér-Rao bound (CRB) (Kay 2001a; Stoica and Nehorai 1989; Van Trees 2001) is arguably the most well-known lower bound on estimation. Define the Fisher information matrix $I(\theta)$ via

$$I_{i,j}(\theta) = E_\theta\left[\frac{\partial}{\partial \theta_i} \log p(x;\theta)\, \frac{\partial}{\partial \theta_j} \log p(x;\theta)\right] = -E_\theta\left[\frac{\partial^2}{\partial \theta_i \partial \theta_j} \log p(x;\theta)\right].$$

Then for any unbiased estimator $\hat{\theta}$, the error covariance matrix is bounded by

$$E_\theta\{(\hat{\theta} - \theta)(\hat{\theta} - \theta)^T\} \succeq [I(\theta)]^{-1}, \tag{5}$$

where $A \succeq B$ for two symmetric matrices means $A - B$ is positive semidefinite. The inverse of the Fisher information matrix, $\mathrm{CRB}(\theta) = [I(\theta)]^{-1}$, is called the Cramér-Rao bound.

When $\theta$ is scalar, $I(\theta)$ measures the expected sensitivity of the density function with respect to changes in the parameter. A density family that is more sensitive to parameter changes (larger $I(\theta)$) will generate observations that look more different when the true parameter varies, making it easier to estimate (smaller error).

Example 2 For the SAGN model (4), the CRB is

$$\mathrm{CRB}(\theta) = I(\theta)^{-1} = \frac{\sigma^2}{\sum_{n=0}^{N-1} \left[\frac{\partial s_n(\theta)}{\partial \theta}\right]^2}. \tag{6}$$

The inverse dependence on the $\ell_2$ norm of the signal derivative suggests that signals more sensitive to parameter change are easier to estimate.

For the frequency estimation problem with $s_n(\theta) = \cos(2\pi\theta n)$, the CRB as a function of $\theta$ is plotted in Fig. 1.

There are many modifications of the basic CRB such as the posterior CRB (Tichavsky et al. 1998; Van Trees 2001), the hybrid CRB (Rockah and Schultheiss 1987), the modified CRB (D'Andrea et al. 1994), the concentrated CRB (Hochwald and Nehorai 1994), and the constrained CRB (Gorman and Hero 1990; Marzetta 1993; Stoica and Ng 1998). The posterior CRB takes into account the prior information of the parameters when they are modeled as random
variables, while the hybrid CRB considers the case that the parameters contain both random and deterministic parts. The modified CRB and the concentrated CRB focus on handling nuisance parameters in a tractable manner. The application of these CRBs requires a regular parameter space (e.g., an open set in $\mathbb{R}^d$). However, in many cases, the parameter space $\Theta$ is a low-dimensional manifold in $\mathbb{R}^d$ specified by equalities and inequalities. In this case, the constrained CRB provides tighter lower bounds by incorporating knowledge of the constraints.

Bounds on Estimation, Fig. 1 Cramér-Rao bound on frequency estimation: N = 5 vs. N = 10

Barankin Bound

The CRB is a local bound in the sense that it involves only local properties (the first- or second-order derivatives) of the log-likelihood function. So if two families of log-likelihood functions coincide in a region near $\theta^0$, the CRB at $\theta^0$ would be the same, even if they are drastically different in other regions of the parameter space.

However, the entire parametric space should play a role in determining the difficulty of parameter estimation. To see this, imagine that there are two statistical models. In the first model there is another point $\theta^1 \in \Theta$ such that the likelihood family $p(x;\theta)$ behaves similarly around $\theta^0$ and $\theta^1$, but these two points are not in neighborhoods of each other. Then it would be difficult to distinguish these two points for any estimation algorithm, and the estimation performance for the first statistical model would be bad (an extreme case is $p(x;\theta^0) \equiv p(x;\theta^1)$, in which case the model is non-identifiable; more discussions on identifiability and the Fisher information matrix can be found in Hochwald and Nehorai (1997)). In the second model, we remove the point $\theta^1$ and its near neighborhood from $\Theta$; then the performance should get better. However, the CRB for both models would remain the same whether we exclude $\theta^1$ from $\Theta$ or not. As a matter of fact, $\mathrm{CRB}(\theta^0)$ uses only the fact that the estimator is unbiased in a neighborhood of the true parameter $\theta^0$.

The Barankin bound addresses the CRB's shortcoming of not respecting the global structure of the statistical model by introducing finitely many test points $\{\theta^i,\ i = 1, \ldots, M\}$ and ensuring that the estimator is unbiased in the neighborhood of $\theta^0$ as well as at these test points (Forster and Larzabal 2002). The original Barankin bound (Barankin 1949) is derived for a scalar parameter $\theta \in \Theta \subset \mathbb{R}$ and any unbiased estimator $\hat{g}(\theta)$ for a function $g(\theta)$:

$$E_\theta\big(\hat{g}(\theta) - g(\theta)\big)^2 \geq \sup_{M,\,\theta^i,\,a^i} \frac{\left[\sum_{i=1}^{M} a^i \big(g(\theta^i) - g(\theta)\big)\right]^2}{E_\theta\left[\sum_{i=1}^{M} a^i \frac{p(x;\theta^i)}{p(x;\theta)}\right]^2} \tag{7}$$

Using (7), we can derive a Barankin-type bound on the error covariance matrix of any unbiased estimator $\hat{\theta}(x)$ for a vector parameter $\theta \in \Theta \subset \mathbb{R}^d$ (Forster and Larzabal 2002):

$$E_\theta\{(\hat{\theta} - \theta)(\hat{\theta} - \theta)^T\} \succeq \Phi\,(B - \mathbf{1}\mathbf{1}^T)^{-1}\,\Phi^T, \tag{8}$$

where the matrices are defined via

$$B_{i,j} = E_\theta\left[\frac{p(x;\theta^i)}{p(x;\theta)}\,\frac{p(x;\theta^j)}{p(x;\theta)}\right], \quad 1 \leq i,j \leq M, \qquad \Phi = \left[\theta^1 - \theta \;\; \cdots \;\; \theta^M - \theta\right]$$
and $\mathbf{1}$ is the vector in $\mathbb{R}^M$ with all ones. Note that we have used $\theta^i$ with a superscript to denote different points in $\Theta$, while $\theta_i$ with a subscript denotes the $i$th component of a point $\theta$.

Since the bound (8) is valid for any $M$ and any choice of test points $\{\theta^i\}$, we obtain the tightest bound by taking the supremum over all finite families of test points. Note that when we have $d$ test points that approach $\theta$ in $d$ linearly independent directions, the Barankin-type bound (8) converges to the CRB. If we have more than $d$ test points, however, the Barankin-type bound is always no worse than the CRB. In particular, the Barankin-type bound is much tighter in the regime of low signal-to-noise ratio (SNR) and small numbers of measurements, which allows one to investigate the "threshold" phenomena as shown in the next example.

Example 3 For the SAGN model, if we have $M$ test points, the elements of the matrix $B$ are of the following form:

$$B_{i,j} = \exp\left\{\frac{1}{\sigma^2} \sum_{n=0}^{N-1} \big[s_n(\theta^i) - s_n(\theta)\big]\big[s_n(\theta^j) - s_n(\theta)\big]\right\}$$

In most cases, it is extremely difficult to derive an analytical form of the Barankin bound by optimizing with respect to $M$ and the test points $\{\theta^j\}$. In Fig. 2, we plot the Barankin-type bounds for $s_n(\theta) = \cos(2\pi\theta n)$ for $M = 10$ randomly selected test points. We observe that the Barankin-type bound is tighter than the CRB when the SNR is small. There is an SNR region around 0 dB where the Barankin-type bound drops drastically. This is usually called the "threshold" phenomenon. Practical systems operate much better in the region above the threshold.

Bounds on Estimation, Fig. 2 Cramér-Rao bound vs. Barankin-type bound on frequency estimation when $\theta^0 = 0.1$. The BTB is obtained using M = 10 uniform random points

The basic CRB and BTB belong to the family of deterministic "covariance inequality" bounds in the sense that the unknown parameter is assumed to be a deterministic quantity (as opposed to a random quantity). Additionally, both bounds work only for unbiased estimators, making them inappropriate performance indicators for biased estimators such as many regularization-based estimators.

Ziv-Zakai Bound

In this section, we introduce the Ziv-Zakai bound (ZZB) (Bell et al. 1997), which is applicable to any estimator (not necessarily unbiased). Unlike the CRB and BTB, the ZZB is a Bayesian bound and the errors are averaged over the prior distribution $p_\theta(\cdot)$ of the parameter. For any $a \in \mathbb{R}^d$, the ZZB states that

$$a^T E\{(\hat{\theta}(x) - \theta)(\hat{\theta}(x) - \theta)^T\}\, a \geq \frac{1}{2}\int_0^\infty V\left\{\max_{\delta : a^T\delta = h} \int_{\mathbb{R}^d} \big(p_\theta(\varphi) + p_\theta(\varphi + \delta)\big)\, P_{\min}(\varphi, \varphi + \delta)\, d\varphi\right\} h\, dh,$$

where the expectation is taken with respect to the joint density $p(x;\theta)\,p_\theta(\theta)$, $V\{q(h)\} = \max_{r \geq 0} q(h + r)$ is the valley-filling function, and $P_{\min}(\varphi, \varphi + \delta)$ is the minimal probability of error for the following binary hypothesis testing problem:

$$H_0 : \theta = \varphi; \quad x \sim p(x;\varphi)$$
$$H_1 : \theta = \varphi + \delta; \quad x \sim p(x;\varphi + \delta)$$

with

$$\Pr(H_0) = \frac{p_\theta(\varphi)}{p_\theta(\varphi) + p_\theta(\varphi + \delta)}, \qquad \Pr(H_1) = \frac{p_\theta(\varphi + \delta)}{p_\theta(\varphi) + p_\theta(\varphi + \delta)}.$$

Example 4 For the SAGN model, we assume a uniform prior probability, i.e., $p_\theta(\varphi) = 4$, $\varphi \in [0, 1/4)$. The ZZB simplifies to

$$E\{\|\hat{\theta}(x) - \theta\|_2^2\} \geq \frac{1}{2}\int_0^{1/4} V\left\{\int_0^{\frac{1}{4}-h} 8\, P_{\min}(\varphi, \varphi + h)\, d\varphi\right\} h\, dh.$$

Bounds on Estimation, Fig. 3 Ziv-Zakai bound vs. maximum likelihood estimator for frequency estimation

The binary hypothesis testing problem is to decide which one of two signals is buried in additive Gaussian noise. The optimal detector with minimal probability of error is the minimum distance receiver (Kay 2001b), and the associated probability of error is

$$P_{\min}(\varphi, \varphi + h) = Q\left(\frac{\sqrt{\sum_{n=0}^{N-1}\big(s_n(\varphi) - s_n(\varphi + h)\big)^2}}{2\sigma}\right),$$

where $Q(h) = \int_h^\infty \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\, dt$. For the frequency estimation problem, we numerically estimate the integral and plot the resulting ZZB in Fig. 3 together with the mean-squared error for the maximum likelihood estimator (MLE).

Summary and Future Directions

We have reviewed several important performance bounds on statistical estimation problems, particularly the Cramér-Rao bound, the Barankin-type bound, and the Ziv-Zakai bound. These bounds provide a universal way to quantify the performance of statistically modeled physical systems that is independent of any specific algorithm.

Future directions of performance bounds on estimation include deriving tighter bounds, developing computational schemes to approximate existing bounds in a tractable way, and applying them to practical problems.

Cross-References

▸ Estimation, Survey on
▸ Particle Filters

Recommended Reading

Kay SM (2001a), Chapters 2 and 3; Stoica P, Nehorai A (1989); Van Trees HL (2001), Chapter 2.7; Forster and Larzabal (2002); Bell et al. (1997).

Acknowledgments This work was supported in part by NSF Grants CCF-1014908 and CCF-0963742, ONR Grant N000141310050, and AFOSR Grant FA9550-11-1-0210.

Bibliography

Barankin EW (1949) Locally best unbiased estimates. Ann Math Stat 20(4):477–501
Bell KL, Steinberg Y, Ephraim Y, Van Trees HL (1997) Extended Ziv-Zakai lower bound for vector parameter estimation. IEEE Trans Inf Theory 43(2):624–637
D'Andrea AN, Mengali U, Reggiannini R (1994) The modified Cramér-Rao bound and its application to synchronization problems. IEEE Trans Commun 42(234):1391–1399
Forster P, Larzabal P (2002) On lower bounds for deterministic parameter estimation. In: IEEE international conference on acoustics, speech, and signal processing (ICASSP), 2002, Orlando, vol 2. IEEE, pp II–1141
Gorman JD, Hero AO (1990) Lower bounds for parametric estimation with constraints. IEEE Trans Inf Theory 36(6):1285–1301
Hochwald B, Nehorai A (1994) Concentrated Cramér-Rao bound expressions. IEEE Trans Inf Theory 40(2):363–371
Hochwald B, Nehorai A (1997) On identifiability and information-regularity in parametrized normal distributions. Circuits Syst Signal Process 16(1):83–89
Kay SM (2001a) Fundamentals of statistical signal processing, volume 1: estimation theory. Prentice Hall, Upper Saddle River, NJ
Kay SM (2001b) Fundamentals of statistical signal processing, volume 2: detection theory. Prentice Hall, Upper Saddle River, NJ
Marzetta TL (1993) A simple derivation of the constrained multiple parameter Cramér-Rao bound. IEEE Trans Signal Process 41(6):2247–2249
Rockah Y, Schultheiss PM (1987) Array shape calibration using sources in unknown locations – part I: far-field sources. IEEE Trans Acoust Speech Signal Process 35(3):286–299
Stoica P, Nehorai A (1989) MUSIC, maximum likelihood, and Cramér-Rao bound. IEEE Trans Acoust Speech Signal Process 37(5):720–741
Stoica P, Ng BC (1998) On the Cramér-Rao bound under parametric constraints. IEEE Signal Process Lett 5(7):177–179
Tichavsky P, Muravchik CH, Nehorai A (1998) Posterior Cramér-Rao bounds for discrete-time nonlinear filtering. IEEE Trans Signal Process 46(5):1386–1396
Van Trees HL (2001) Detection, estimation, and modulation theory: part 1, detection, estimation, and linear modulation theory. John Wiley & Sons, Hoboken, NJ

BSDE

▸ Backward Stochastic Differential Equations and Related Control Problems

Building Control Systems

James E. Braun
Purdue University, West Lafayette, IN, USA

Abstract

This entry provides an overview of systems and issues related to providing optimized controls for commercial buildings. It includes a description of the evolution of the control systems over time, typical equipment and control variables, the typical two-level hierarchical structure for feedback and supervisory control, definition of the optimal supervisory control problem, references to typical heuristic control approaches, and a description of current and future developments.

Keywords

Building automation systems (BAS); Cooling plant optimization; Energy management and control systems (EMCS); Intelligent building controls

Introduction

Computerized control systems were developed in the 1980s for commercial buildings and are typically termed energy management and control systems (EMCS) or building automation systems (BAS). They have been most successfully applied to large commercial buildings that have hundreds of building zones and thousands of control points. Less than about 15 % of commercial buildings have an EMCS, but they serve about 40 % of the floor area. Small commercial buildings tend not to have an EMCS, although there is a recent trend towards the use of wireless thermostats with cloud-based energy management solutions.

EMCS architectures for buildings have evolved from centralized to highly distributed systems as depicted in Fig. 1 in order to reduce
Building Control Systems, Fig. 1 Evolution from centralized to distributed network architectures

wiring costs and provide more modular solutions. The development of open communications protocols, such as BACnet, has enabled the use of distributed control devices from different vendors and improved the cost-effectiveness of EMCS. There has also been a recent trend towards the use of existing enterprise networks to reduce system installed costs and to more easily allow remote access and control from any Internet-accessible device.

An EMCS for a large commercial building can automate the control of many of the building and system functions, including scheduling of lights and zone thermostat settings according to occupancy patterns. Security and fire safety systems tend to be managed using separate systems. In addition to scheduling, an EMCS manages the control of individual equipment and subsystems that provide heating, ventilation, and air conditioning (HVAC) of the building. This control is achieved using a two-level hierarchical structure of local-loop and supervisory control. Local-loop control of individual set points is typically implemented using individual proportional-integral (PI) feedback algorithms that manipulate individual actuators in response to deviations from the set points. For example, the supply air temperature from a cooling coil is controlled by adjusting a valve opening that provides chilled water to the coil. The second level of supervisory control specifies the set points and other modes of operation that depend on time and external conditions.

Each local-loop feedback controller acts independently, but their performance can be coupled to other local-loop controllers if not tuned appropriately. Adaptive tuning algorithms have been developed in recent years to enable controllers to automatically adjust to changing weather and load conditions. There are typically a number of degrees of freedom in adjusting supervisory control set points over a wide range while still achieving adequate comfort conditions. Optimal control of supervisory set points involves minimizing a cost function with respect to the free variables and subject to constraints. Although model-based control optimization approaches are not typically employed in buildings, they have been used to inform the development and assessment of some heuristic control strategies. Most commonly, strategies for adjusting supervisory control variables are established at the control design phase based on some limited analysis of the HVAC system and specified as a sequence of operations that is programmed into the EMCS.

Systems, Equipment, and Controls

The greatest challenges and opportunities for optimizing supervisory control variables exist for
Building Control Systems, Fig. 2 Schematic of a chilled water cooling system

centralized cooling systems that are employed in large commercial buildings because of the large number of control variables and degrees of freedom along with utility rate incentives. A simplified schematic of a typical centralized cooling plant is shown in Fig. 2 with components grouped under air distribution, chilled water loop, chiller plant, and condenser water loop.

Typical air distribution systems include VAV (variable-air-volume) boxes within the zones, air-handling units, ducts, and controls. An air-handling unit (AHU) provides the primary conditioning, ventilation, and flow of air and includes cooling and heating coils, dampers, fans, and controls. A single air handler typically serves many zones and several air handlers are utilized in a large commercial building. For each AHU, outdoor ventilation air is mixed with return air from the zones and fed to the cooling coil. Outdoor and return air dampers are typically controlled using an economizer control that selects between minimum and maximum ventilation air depending upon the condition of the outside air. The cooling coil provides both cooling and dehumidification of the process air. The air outlet temperature from the coil is controlled with a local feedback controller that adjusts the flow of water using a valve. A supply fan and return fan (not shown in Fig. 2) provide the necessary airflow to and from the zones. With a VAV system, zone temperature set points are regulated using a feedback controller applied to dampers within the VAV boxes. The overall air flow provided by the AHU is typically controlled to maintain a duct static pressure set point within the supply duct.

The chilled water loop communicates between the cooling coils within the AHUs and chillers that provide the primary source for cooling. It consists of pumps, pipes, valves, and controls. Primary/secondary chilled water systems are commonly employed to accommodate variable-speed pumping. In the primary loop, fixed-speed pumps are used to provide relatively constant chiller flow rates to ensure good performance and reduce the risk of evaporator tube freezing. Individual pumps are typically cycled on and off with a chiller that each serves. The secondary loop incorporates one or more variable-speed pumps that are typically controlled to maintain a set point for chilled water
loop differential pressure between the building supplies and returns.

The primary source of cooling for the system is typically provided by one or more chillers that are arranged in parallel and have dedicated pumps. Each chiller has an on-board local-loop feedback controller that adjusts its cooling capacity to maintain a specified set point for chilled water supply temperature. Additional chiller control variables include the number of chillers operating and the relative loading for each chiller. The relative loading can be controlled for a given total cooling requirement by utilizing different chilled water supply set points for constant individual flows or by adjusting individual flows for identical set points. Chillers can be augmented with thermal storage to reduce the amount of chiller power required during occupied periods in order to reduce on-peak energy and power demand costs. The thermal storage medium is cooled during the unoccupied, nighttime period using the chillers when electricity is less expensive. During occupied times, a combination of the chillers and storage is used to meet cooling requirements. Control of thermal storage is defined by the manner in which the storage medium is charged and discharged over time.

The condenser water loop includes cooling towers, pumps, piping, and controls. Cooling towers reject energy to the ambient air through heat transfer and possibly evaporation (for wet towers). Larger systems tend to have multiple cooling towers, with each tower having multiple cells that share a common sump with individual fans having two or more speed settings. The number of operating cells and the tower fan speeds are often controlled using a local-loop feedback controller that maintains a set point for the water temperature leaving the cooling tower. Typically, condenser water pumps are dedicated to individual chillers (i.e., each pump is cycled on and off with the chiller that it serves).

In order to better understand building control variables, interactions, and opportunities, consider how controls change in response to increasing building cooling requirements for the system of Fig. 2. As energy gains to the zones increase, zone temperatures rise in the absence of any control changes. However, zone feedback controllers respond to higher temperatures by increasing VAV box airflow through increased damper openings. This leads to reduced static pressure in the primary supply duct, which causes the AHU supply fan controller to create additional airflow. The greater airflow causes an increase in supply air temperatures leaving the cooling coils in the absence of any additional control changes. However, the supply air temperature feedback controllers respond by opening the cooling coil valves to increase the water flow and the heat transfer to the chilled water (the cooling load). For variable-speed pumping, a feedback controller would respond to the decreasing pressure differential by increasing the pump speed. The chillers would then experience increased loads due to higher return water temperature and/or flow rate that would lead to increases in chilled water supply temperatures. However, the chiller controllers would respond by increasing chiller cooling capacities in order to maintain the chilled water supply set points (and match the cooling coil loads). In turn, the heat rejection to the condenser water loop would increase to balance the increased energy removed by the chiller, which would increase the temperature of water leaving the condenser. The temperature of water leaving the cooling tower would then increase due to an increase in its entering water temperature. However, a feedback controller would respond to the higher condenser water supply temperature and increase the tower airflow. At some load, the current set of operating chillers would not be sufficient to meet the load (i.e., maintain the chilled water supply set points) and an additional chiller would need to be brought online.

This example illustrated how different local-loop controllers might respond to load changes in order to maintain individual set points. Supervisory control might change these set points and modes of operation. At any given time, it is possible to meet the cooling needs with any number of different modes of operation and set points, leading to the potential for control optimization to minimize an objective function.

The system depicted in Fig. 2 and described in the preceding paragraphs represents one of
102 Building Control Systems

many different types of systems employed in commercial buildings. Medium-sized commercial buildings often employ multiple direct expansion (DX) cooling systems where refrigerant flows between each AHU and an outdoor condensing unit that employs variable capacity compressors. The compressor capacity is typically controlled to maintain a supply air temperature set point, which is still available as a supervisory control variable. However, the other condensing unit controls (e.g., condensing fans, expansion valve) are typically prepackaged with the unit and not available to the EMCS. For smaller commercial buildings, rooftop units (RTUs) are typically employed that contain a prepackaged AHU, refrigeration cycle, and controls. Each RTU directly cools the air in a portion of the building in response to an individual thermostat. The capacity control is typically on/off staging of the compressor, and constant volume air flow is most commonly employed. In this case, the only free supervisory control variables are the thermostat set points. In general, the degrees of freedom for supervisory control decrease in going from chilled water cooling plants to DX systems to RTUs. In addition, the utility rate incentives for taking advantage of thermal storage and advanced controls are greater for large commercial building applications.

Optimal Supervisory Control

In commercial buildings, it is common to have electric utility rates that have energy and demand charges that vary with time of use. The different rate periods can often include on-peak, off-peak, and mid-peak periods. For this type of rate structure, the time horizon necessary to truly minimize operating costs extends over the entire month. In order to better understand the control issues, consider the general optimal control problem for minimizing monthly electrical utility charges associated with operating an all-electric cooling system in the presence of time-of-use and demand charges. The dynamic optimization involves minimizing

    J = \sum_{p=1}^{N_{\text{rate periods}}} J_p + R_{d,a} \max_{k = 1,\ldots,N_{\text{month}}} P_k

with respect to a trajectory of controls u_k, k = 1 to N_month, and M_k, k = 1 to N_month, where

    J_p = R_{e,p} \sum_{j=1}^{N_p} P_{p,j} \, \Delta t + R_{d,p} \max_{j = 1,\ldots,N_p} P_{p,j}

with the optimization subject to the following general constraints:

    P_k = P(f_k, u_k, M_k)
    x_k = x(x_{k-1}, f_k, u_k, M_k)
    u_{k,\min} \le u_k \le u_{k,\max}
    x_{k,\min} \le x_k \le x_{k,\max}
    y_{k,\min} \le y_k(f_k, u_k, M_k) \le y_{k,\max}

where J is the monthly electrical cost ($), the subscript p denotes that a quantity is limited to a particular type of rate period p (e.g., on-peak, off-peak, mid-peak), R_{d,a} is an anytime demand charge ($/kW) that is applied to the maximum power consumption occurring over the month, P_k is average building power (kW) for stage k within the month, N_month is the number of stages in the month, R_{e,p} is the unit cost of electrical energy ($/kWh) for rate period type p, Δt is the length of the stage (h), N_p is the number of stages within rate period type p in the month, R_{d,p} is a rate period specific demand charge ($/kW) that is applied to the maximum power consumption occurring during the month within rate period p, f_k is a vector of uncontrolled inputs that affect building power consumption (e.g., weather, internal gains), u_k is a vector of continuous supervisory control variables (e.g., supply air temperature set point), M_k is a vector of discrete supervisory control variables (chiller on/off controls), x_k is a vector of state variables, y_k is a vector of outputs, and subscripts min and max denote minimum and maximum allowable values.

The state variables could characterize the state of a storage device such as a chilled water or
ice storage tank. In this case, the states would be constrained between limits associated with the device's practical storage potential. When variations in zone temperature set points are considered within an optimization, then state variables associated with the distributed nature of energy storage within the building structure are important to consider. The outputs are additional quantities of interest, such as equipment cooling capacities, occupant comfort conditions, etc., that often need to be constrained. In order to implement a model-based predictive control scheme, it would be necessary to have models for the building power, state variables, and outputs in terms of the control and uncontrolled variables. The uncontrolled variables would generally include weather (temperature, humidity, solar radiation) and internal gains due to lights and occupants, etc., that would need to be forecasted over a prediction horizon.

It is not feasible to solve this type of monthly optimization problem for buildings for a variety of reasons, including that forecasting of uncontrolled inputs beyond a day is unreliable. Also, it is very costly to develop the models necessary to implement a model-based control approach of this scale for a particular building. However, it is instructive to consider some special cases that have led to some practical control approaches.

First of all, consider the problem of optimizing only the cooling plant supervisory control variables when energy storage effects are not important. This is typically the case for systems that do not include ice or chilled water storage. For this scenario, the future does not matter and the problem can be reformulated as a static optimization problem, such that for each stage k the goal is to minimize the building power consumption, J = P_k, with respect to the current supervisory control variables, u_k and M_k, and subject to constraints. ASHRAE (2011) presents a number of heuristic approaches for adjusting supervisory control variables that have been developed through consideration of this type of optimization problem. This includes algorithms for adjusting cooling tower fan settings, chilled water and supply air set points, and chiller sequencing and loading.

Other heuristic approaches have been developed (e.g., ASHRAE 2011; Braun 2007) for controlling the charging and discharging of ice or chilled water storage that were derived from a daily optimization formulation. For the case of real-time pricing of energy, heuristic charging and discharging strategies were derived from minimizing a daily cost function

    J_{\text{day}} = \sum_{k=1}^{N_{\text{day}}} R_{e,k} P_k \, \Delta t

with respect to a trajectory of charging and discharging rates, subject to a constraint of equal beginning and ending storage states along with other constraints previously described. For the case of typical time-of-use (e.g., on-peak, off-peak) or real-time pricing energy charges with demand charges, heuristic strategies have been developed based on the same form of the daily cost function above with an added demand cost constraint R_{d,k} P_k ≤ TDC, where TDC is a target demand cost that is set heuristically at the beginning of each billing period and updated at each stage as TDC_{k+1} = max(TDC_k, R_{d,k} P_k). The heuristic storage control strategies can be readily combined with heuristic strategies for the cooling plant components.

There has been a lot of interest in developing practical methods for dynamic control of zone temperature set points within the bounds of comfort in order to minimize the utility costs. However, this is a very difficult problem, and so it remains in the research realm for the time being with limited commercial success.

Summary and Future Directions

Although there is great opportunity for reducing energy use and operating costs in buildings through optimal supervisory control, it is rarely implemented in practice because of high costs associated with engineering site-specific solutions. Current efforts are underway to develop scalable approaches that utilize general methods for configuring and learning models needed to implement model-based predictive control (MPC).
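To make the cost structures above concrete, the daily energy cost J_day and the heuristic demand-cost target update can be sketched numerically. The rates and load profile below are invented for illustration and are not from the cited references:

```python
# Sketch of the daily energy cost J_day = sum_k R_e[k] * P[k] * dt and the
# demand-cost target update TDC_{k+1} = max(TDC_k, R_d[k] * P[k]).
# All rates and the load profile are illustrative values, not real tariffs.

dt = 1.0                                                    # stage length (h)
R_e = [0.20 if 12 <= h < 18 else 0.08 for h in range(24)]   # energy rate ($/kWh)
R_d = [15.0 if 12 <= h < 18 else 0.0 for h in range(24)]    # demand charge ($/kW)
P = [40.0] * 12 + [90.0] * 6 + [40.0] * 6                   # stage-average load (kW)

# Daily energy cost
J_day = sum(R_e[k] * P[k] * dt for k in range(24))

# Target demand cost, updated at each stage of the billing period
TDC = 0.0
for k in range(24):
    TDC = max(TDC, R_d[k] * P[k])

print(f"J_day = ${J_day:.2f}, TDC = ${TDC:.2f}")
```

A storage charging/discharging heuristic would then try to keep R_d[k] * P[k] at or below the current TDC, for example by discharging storage during on-peak stages.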
The current thinking is that solutions for optimal supervisory control will be implemented in the cloud and overlay on existing building automation systems (BMS) through the use of universal middleware. This will reduce the cost of implementation compared to programming within existing BMS. There is also a need to reduce the cost of the additional sensors needed to implement MPC. One approach involves the use of virtual sensors that employ models with low-cost sensor inputs to provide higher value information that would normally require expensive sensors to obtain.

Cross-References

→ Model-Predictive Control in Practice
→ PID Control

Bibliography

ASHRAE (2011) Supervisory control strategies and optimization. In: 2011 ASHRAE handbook of HVAC applications, chap 42. ASHRAE, Atlanta, GA
Braun JE (2007) A near-optimal control strategy for cool storage systems with dynamic electric rates. HVAC&R Res 13(4):557–580
Li H, Yu D, Braun JE (2011) A review of virtual sensing technology and application in building systems. HVAC&R Res 17(5):619–645
Mitchell JW, Braun JE (2013) Principles of heating ventilation and air conditioning in buildings. Wiley, Hoboken
Roth KW, Westphalen D, Feng MY, Llana P, Quartararo L (2005) Energy impact of commercial building controls and performance diagnostics: market characterization, energy impact of building faults, and energy savings potential. TIAX report no. D0180
Wang S, Ma Z (2008) Supervisory and optimal control of building HVAC systems: a review. HVAC&R Res 14(1):3–32
C

CACSD

→ Computer-Aided Control Systems Design: Introduction and Historical Overview

Cascading Network Failure in Power Grid Blackouts

Ian Dobson
Iowa State University, Ames, IA, USA

Abstract

Cascading failure consists of complicated sequences of dependent failures and can cause large blackouts. The emerging risk analysis, simulation, and modeling of cascading blackouts are briefly surveyed, and key references are suggested.

Keywords

Branching process; Dependent failures; Outage; Power law; Risk; Simulation

Introduction

The main mechanism for the rare and costly widespread blackouts of bulk power transmission systems is cascading failure. Cascading failure can be defined as a sequence of dependent events that successively weaken the power system (IEEE PES CAMS Task Force on Cascading Failure 2008). The events and their dependencies are very varied and include outages or failures of many different parts of the power system and a whole range of possible physical, cyber, and human interactions. The events and dependencies tend to be rare or complicated, since the common and straightforward failures tend to be already mitigated by engineering design or operating practice.

Examples of a small initial outage cascading into a complicated sequence of dependent outages are the August 10, 1996, blackout of the Northwest United States that disconnected power to about 7.5 million customers (Kosterev et al. 1999) and the August 14, 2003 blackout of about 50 million customers in Northeastern United States and Canada (US-Canada Power System Outage Task Force 2004). Although such extreme events are rare, the direct costs run to billions of dollars and the disruption to society is substantial. Large blackouts also have a strong effect on shaping the way power systems are regulated and the reputation of the power industry. Moreover, some blackouts involve social disruptions that can multiply the economic damage. The hardship to people and possible deaths underscore the engineer's responsibility to work to avoid blackouts.

It is useful when analyzing cascading failure to consider cascading events of all sizes, including the short cascades that do not lead to interruption
of power to customers and cascades that involve events in other infrastructures, especially since loss of electricity can significantly impair other essential or economically important infrastructures. Note that in the context of interacting infrastructures, the term "cascading failure" sometimes has the more restrictive definition of events cascading between infrastructures (Rinaldi et al. 2001).

Blackout Risk

Cascading failure is a sequence of dependent events that successively weaken the power system. At a given stage in the cascade, the previous events have weakened the power system so that further events are more likely. It is this dependence that makes the long series of cascading events that cause large blackouts likely enough to pose a substantial risk. (If the events were independent, then the probability of a large number of events would be the product of the small probabilities of individual events and would be vanishingly small.) The statistics for the distribution of sizes of blackouts have correspondingly "heavy tails" indicating that blackouts of all sizes, including large blackouts, can occur. Large blackouts are rare, but they are expected to happen occasionally, and they are not "perfect storms." In particular, it has been observed in several developed countries that the probability distribution of blackout size has an approximate power law dependence (Carreras et al. 2004b; Dobson et al. 2007; Hines et al. 2009). (The power law is of course limited in extent because every grid has a largest possible blackout in which the entire grid blacks out.) The power law region can be explained using ideas from complex systems theory. The main idea is that over the long term, the power grid reliability is shaped by the engineering responses to blackouts and the slow load growth and tends to evolve towards the power law distribution of blackout size (Dobson et al. 2007; Ren et al. 2008).

Blackout risk can be defined as the product of blackout probability and blackout cost. One simple assumption is that blackout cost is roughly proportional to blackout size, although larger blackouts may well have costs (especially indirect costs) that increase faster than linearly. In the case of the power law dependence, the larger blackouts can become rarer at a similar rate as costs increase, and then the risk of large blackouts is comparable to or even exceeding the risk of small blackouts. Mitigation of blackout risk should consider both small and large blackouts, because mitigating the small blackouts that are easiest to analyze may inadvertently increase the risk of large blackouts (Newman et al. 2011).

Approaches to quantify blackout risk are challenging and emerging, but there are also valuable approaches to mitigating blackout risk that do not quantify the blackout risk. The n-1 criterion that requires the power system to survive any single component failure has the effect of reducing cascading failures. The individual mechanisms of dependence in cascades (overloads, protection failures, voltage collapse, transient stability, lack of situational awareness, human error, etc.) can be addressed individually by specialized analyses or simulations or by training and procedures. Credible initiating outages can be sampled and simulated, and those resulting in cascading can be mitigated (Hardiman et al. 2004). This can be thought of as a "defense in depth" approach in which mitigating a subset of credible contingencies is likely to mitigate other possible contingencies not studied. Moreover, when blackouts occur, a postmortem analysis of that particular sequence of events leads to lessons learned that can be implemented to mitigate the risk of some similar blackouts (US-Canada Power System Outage Task Force 2004).

Simulation and Models

There are many simulations of cascading blackouts using Monte Carlo and other methods, for example, Hardiman et al. (2004), Carreras et al. (2004a), Chen et al. (2005), Kirschen et al. (2004), Anghel et al. (2007), and Bienstock and Mattia (2007). All these simulations select and approximate a modest subset of the many physical and engineering mechanisms of
cascading failure, such as line overloads, voltage collapse, and protection failures. In addition, operator actions or the effects of engineering the network may also be crudely represented. Some of the simulations give a set of credible cascades, and others approximately estimate blackout risk.

Except for describing the initial outages, where there is a wealth of useful knowledge, much of standard risk analysis and modeling does not easily apply to cascading failure in power systems because of the variety of dependencies and mechanisms, the combinatorial explosion of rare possibilities, and the heavy-tailed probability distributions. However, progress has been made in probabilistic models of cascading (Chen et al. 2006; Dobson 2012; Rahnamay-Naeini et al. 2012).

The goal of high-level probabilistic models is to capture salient features of the cascade without detailed models of the interactions and dependencies. They provide insight into cascading failure data and simulations, and parameters of the high-level models can serve as metrics of cascading.

Branching process models are transient Markov probabilistic models in which, after some initial outages, the outages are produced in successive generations. Each outage in each generation (a "parent" outage) produces a probabilistic number of outages ("children" outages) in the next generation according to an offspring probability distribution. The children failures then become parents to produce the next generation and so on, until there is a generation with zero children and the cascade stops. As might be expected, a key parameter describing the cascading is its average propagation, which is the average number of children outages per parent outage. Branching processes have traditionally been applied to many cascading processes outside of risk analysis (Harris 1989), but they have recently been validated and applied to estimate the distribution of the total number of outages from utility outage data (Dobson 2012). A probabilistic model that tracks the cascade as it progresses in time through lumped grid states is presented in Rahnamay-Naeini et al. (2012).

There is an extensive complex networks literature on cascading in abstract networks that is largely motivated by idealized models of propagation of failures in the Internet. The way that failures propagate only along the network links is not realistic for power systems, which satisfy Kirchhoff's laws so that many types of failures propagate differently. For example, line overloads tend to propagate along cutsets of the network. However, the high-level qualitative results of phase transitions in the complex networks have provided inspiration for similar effects to be discovered in power system models (Dobson et al. 2007). There is also a possible research opportunity to elaborate the complex network models to incorporate some of the realities of power systems and then validate them.

Summary and Future Directions

One challenge for simulation is what selection of phenomena to model and in how much detail in order to get useful engineering results. Faster simulations would help to ease the requirements of sampling appropriately from all the sources of uncertainty. Better metrics of cascading in addition to average propagation need to be developed and extracted from real and simulated data in order to better quantify and understand blackout risk. There are many new ideas emerging to analyze and simulate cascading failure, and the next step is to validate and improve these new approaches by comparing them with observed blackout data. Overall, there is an exciting challenge to build on the more deterministic approaches to mitigate cascading failure and find ways to more directly quantify and mitigate cascading blackout risk by coordinated analysis of real data, simulation, and probabilistic models.

Cross-References

→ Hybrid Dynamical Systems, Feedback Control of
→ Lyapunov Methods in Power System Stability
→ Power System Voltage Stability
→ Small Signal Stability in Electric Power Systems
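To make the branching process description concrete, the following Monte Carlo sketch simulates a Galton–Watson cascade. The Poisson offspring distribution and the parameter values are illustrative choices, not taken from the cited studies:

```python
import math
import random

def poisson(lam, rng):
    # Knuth's method; adequate for small lam
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def cascade_size(initial, propagation, rng):
    """Total outages in one cascade: each 'parent' outage in a generation
    produces a Poisson(propagation) number of 'children' in the next."""
    total = current = initial
    while current > 0:
        children = sum(poisson(propagation, rng) for _ in range(current))
        total += children
        current = children
    return total

rng = random.Random(1)
sizes = [cascade_size(initial=2, propagation=0.6, rng=rng) for _ in range(20000)]
mean_size = sum(sizes) / len(sizes)
# Subcritical branching process: E[total] = initial / (1 - propagation) = 5
print(f"mean cascade size = {mean_size:.2f}")
```

As the average propagation approaches 1, the simulated cascade-size distribution develops the heavy tail discussed under "Blackout Risk."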
Bibliography

Anghel M, Werley KA, Motter AE (2007) Stochastic model for power grid dynamics. In: 40th Hawaii international conference on system sciences, Hawaii, Jan 2007
Bienstock D, Mattia S (2007) Using mixed-integer programming to solve power grid blackout problems. Discret Optim 4(1):115–141
Carreras BA, Lynch VE, Dobson I, Newman DE (2004a) Complex dynamics of blackouts in power transmission systems. Chaos 14(3):643–652
Carreras BA, Newman DE, Dobson I, Poole AB (2004b) Evidence for self-organized criticality in a time series of electric power system blackouts. IEEE Trans Circuits Syst Part I 51(9):1733–1740
Chen J, Thorp JS, Dobson I (2005) Cascading dynamics and mitigation assessment in power system disturbances via a hidden failure model. Int J Electr Power Energy Syst 27(4):318–326
Chen Q, Jiang C, Qiu W, McCalley JD (2006) Probability models for estimating the probabilities of cascading outages in high-voltage transmission network. IEEE Trans Power Syst 21(3):1423–1431
Dobson I, Carreras BA, Newman DE (2005) A loading-dependent model of probabilistic cascading failure. Probab Eng Inf Sci 19(1):15–32
Dobson I, Carreras BA, Lynch VE, Newman DE (2007) Complex systems analysis of series of blackouts: cascading failure, critical points, and self-organization. Chaos 17:026103
Dobson I (2012) Estimating the propagation and extent of cascading line outages from utility data with a branching process. IEEE Trans Power Syst 27(4):2146–215
Hardiman RC, Kumbale MT, Makarov YV (2004) An advanced tool for analyzing multiple cascading failures. In: Eighth international conference on probability methods applied to power systems, Ames, Sept 2004
Harris TE (1989) Theory of branching processes. Dover, New York
Hines P, Apt J, Talukdar S (2009) Large blackouts in North America: historical trends and policy implications. Energy Policy 37(12):5249–5259
IEEE PES CAMS Task Force on Cascading Failure (2008) Initial review of methods for cascading failure analysis in electric power transmission systems. In: IEEE power and energy society general meeting, Pittsburgh, July 2008
Kirschen DS, Strbac G (2004) Why investments do not prevent blackouts. Electr J 17(2):29–36
Kirschen DS, Jawayeera D, Nedic DP, Allan RN (2004) A probabilistic indicator of system stress. IEEE Trans Power Syst 19(3):1650–1657
Kosterev D, Taylor C, Mittelstadt W (1999) Model validation for the August 10, 1996 WSCC system outage. IEEE Trans Power Syst 14:967–979
Newman DE, Carreras BA, Lynch VE, Dobson I (2011) Exploring complex systems aspects of blackout risk and mitigation. IEEE Trans Reliab 60(1):134–143
Rahnamay-Naeini M, Wang Z, Ghani N, Mammoli A, Hayat MM (2014) Stochastic analysis of cascading-failure dynamics in power grids. IEEE Trans Power Syst, to appear
Ren H, Dobson I, Carreras BA (2008) Long-term effect of the n-1 criterion on cascading line outages in an evolving power transmission grid. IEEE Trans Power Syst 23(3):1217–1225
Rinaldi SM, Peerenboom JP, Kelly TK (2001) Identifying, understanding, and analyzing critical infrastructure interdependencies. IEEE Control Syst Mag 21:11–25
US-Canada Power System Outage Task Force (2004) Final report on the August 14, 2003 blackout in the United States and Canada

Cash Management

Abel Cadenillas
University of Alberta, Edmonton, AB, Canada

Abstract

Cash on hand (or cash held in highly liquid form in a bank account) is needed for routine business and personal transactions. The problem of determining the right amount of cash to hold involves balancing liquidity against investment opportunity costs. This entry traces solutions using both discrete-time and continuous-time stochastic models.

Keywords

Brownian motion; Inventory theory; Stochastic impulse control

Definition

A firm needs to keep cash, either in the form of cash on hand or as a bank deposit, to meet its daily transaction requirements. Daily inflows
and outflows of cash are random. There is a finite target for the cash balance, which could be zero in some cases. The firm wants to select a policy that minimizes the expected total discounted cost for being far away from the target during some time horizon. This time horizon is usually infinity. The firm has an incentive to keep the cash level low, because each unit of positive cash leads to a holding cost since cash has alternative uses like dividends or investments in earning assets. The firm has an incentive to keep the cash level high, because penalty costs are generated as a result of delays in meeting demands for cash. The firm can increase its cash balance by raising new capital or by selling some earning assets, and it can reduce its cash balance by paying dividends or investing in earning assets. This control of the cash balance generates fixed and proportional transaction costs. Thus, there is a cost when the cash balance is different from its target, and there is also a cost for increasing or reducing the cash reserve. The objective of the manager is to minimize the expected total discounted cost.

Hasbrouck (2007), Madhavan and Smidt (1993), and Manaster and Mann (1996) study inventories of stocks that are similar to the cash management problem.

The Solution

The qualitative form of optimal policies of the cash management problem in discrete time was studied by Eppen and Fama (1968, 1969), Girgis (1968), and Neave (1970). However, their solutions were incomplete.

Many of the difficulties that they and other researchers encountered in a discrete-time framework disappeared when it was assumed that decisions were made continuously in time and that demand is generated by a Brownian motion with drift. Vial (1972) formulated the cash management problem in continuous time with fixed and proportional transaction costs, linear holding and penalty costs, and demand for cash generated by a Brownian motion with drift. Under very strong assumptions, Vial (1972) proved that if an optimal policy exists, then it is of a simple form (a, α, β, b). This means that the cash balance should be increased to level α when it reaches level a and should be reduced to level β when it reaches level b. Constantinides (1976) assumed that an optimal policy exists and it is of a simple form, and determined the above levels and discussed the properties of the optimal solution. Constantinides and Richard (1978) proved the main assumptions of Vial (1972) and therefore obtained rigorously a solution for the cash management problem.

Constantinides and Richard (1978) applied the theory of stochastic impulse control developed by Bensoussan and Lions (1973, 1975, 1982). They used a Brownian motion W to model the uncertainty in the inventory. Formally, they considered a probability space (Ω, F, P) together with a filtration (F_t) generated by a one-dimensional Brownian motion W. They considered X_t := inventory level at time t, and assumed that X is an adapted stochastic process given by

    X_t = x - \int_0^t \mu \, ds - \int_0^t \sigma \, dW_s + \sum_{i=1}^{\infty} I_{\{\tau_i < t\}} \xi_i .

Here, μ > 0 is the drift of the demand and σ > 0 is the volatility of the demand. Furthermore, τ_i is the time of the i-th intervention and ξ_i is the intensity of the i-th intervention.

A stochastic impulse control is a pair

    ((\tau_n); (\xi_n)) = (\tau_0, \tau_1, \tau_2, \ldots, \tau_n, \ldots ; \xi_0, \xi_1, \xi_2, \ldots, \xi_n, \ldots),

where

    \tau_0 = 0 < \tau_1 < \tau_2 < \cdots < \tau_n < \cdots

is an increasing sequence of stopping times and (ξ_n) is a sequence of random variables such that each ξ_n : Ω → ℝ is measurable with respect
to F_{τ_n}. We assume ξ_0 = 0. The management (the controller) decides to act at time τ_i:

    X_{\tau_i^+} = X_{\tau_i} + \xi_i .

We note that ξ_i and X can also take negative values. The management wants to select the pair ((τ_n); (ξ_n)) that minimizes the functional J defined by

    J(x; ((\tau_n); (\xi_n))) := E\left[ \int_0^{\infty} e^{-\lambda t} f(X_t) \, dt + \sum_{n=1}^{\infty} e^{-\lambda \tau_n} g(\xi_n) I_{\{\tau_n < \infty\}} \right],

where

    f(x) = \max(hx, -px)

and

    g(\xi) = \begin{cases} C + c\xi & \text{if } \xi > 0 \\ \min(C, D) & \text{if } \xi = 0 \\ D - d\xi & \text{if } \xi < 0. \end{cases}

Furthermore, λ > 0; C, c, D, d ∈ (0, ∞), and h, p ∈ (0, ∞). Here, f represents the running cost incurred by deviating from the aimed cash level 0, C represents the fixed cost per intervention when the management pushes the cash level upwards, D represents the fixed cost per intervention when the management pushes the cash level downwards, c represents the proportional cost per intervention when the management pushes the cash level upwards, d represents the proportional cost per intervention when the management pushes the cash level downwards, and λ is the discount rate.

The results of Constantinides were complemented, extended, or improved by Cadenillas et al. (2010), Cadenillas and Zapatero (1999), Feng and Muthuraman (2010), Harrison et al. (1983), and Ormeci et al. (2008).

Cross-References

→ Financial Markets Modeling
→ Inventory Theory

Bibliography

Bensoussan A, Lions JL (1973) Nouvelle formulation de problemes de controle impulsionnel et applications. C R Acad Sci (Paris) Ser A 276:1189–1192
Bensoussan A, Lions JL (1975) Nouvelles methodes en controle impulsionnel. Appl Math Opt 1:289–312
Bensoussan A, Lions JL (1982) Controle impulsionnel et inequations quasi variationelles. Bordas, Paris
Cadenillas A, Zapatero F (1999) Optimal Central Bank intervention in the foreign exchange market. J Econ Theory 87:218–242
Cadenillas A, Lakner P, Pinedo M (2010) Optimal control of a mean-reverting inventory. Oper Res 58:1697–1710
Constantinides GM (1976) Stochastic cash management with fixed and proportional transaction costs. Manage Sci 22:1320–1331
Constantinides GM, Richard SF (1978) Existence of optimal simple policies for discounted-cost inventory and cash management in continuous time. Oper Res 26:620–636
Eppen GD, Fama EF (1968) Solutions for cash balance and simple dynamic portfolio problems. J Bus 41:94–112
Eppen GD, Fama EF (1969) Cash balance and simple dynamic portfolio problems with proportional costs. Int Econ Rev 10:119–133
Feng H, Muthuraman K (2010) A computational method for stochastic impulse control problems. Math Oper Res 35:830–850
Girgis NM (1968) Optimal cash balance level. Manage Sci 15:130–140
Harrison JM, Sellke TM, Taylor AJ (1983) Impulse control of Brownian motion. Math Oper Res 8:454–466
Hasbrouck J (2007) Empirical market microstructure. Oxford University Press, New York
Madhavan A, Smidt S (1993) An analysis of changes in specialist inventories and quotations. J Finance 48:1595–1628
Manaster S, Mann SC (1996) Life in the pits: competitive market making and inventory control. Rev Financ Stud 9:953–975
Neave EH (1970) The stochastic cash-balance problem with fixed costs for increases and decreases. Manage Sci 16:472–490
Ormeci M, Dai JG, Vande Vate J (2008) Impulse control of Brownian motion: the constrained average cost case. Oper Res 56:618–629
Vial JP (1972) A continuous time model for the cash balance problem. In: Szego GP, Shell C (eds) Mathematical methods in investment and finance. North Holland, Amsterdam
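As a numerical companion to the impulse-control formulation above, the sketch below simulates a simple (a, α, β, b) band policy on an Euler discretization of the controlled process X and estimates the expected discounted cost J by Monte Carlo; all parameter values are invented for illustration:

```python
import math
import random

def discounted_cost(a, alpha, beta, b, mu=0.1, sigma=1.0, lam=0.05,
                    h=1.0, p=2.0, C=5.0, c=0.5, D=5.0, d=0.5,
                    x0=0.0, dt=0.01, T=100.0, seed=0):
    """One path of dX = -mu dt - sigma dW plus impulses: jump up to alpha
    when X <= a (cost C + c*(alpha - x)), down to beta when X >= b
    (cost D + d*(x - beta)); running cost f(x) = max(h*x, -p*x)."""
    rng = random.Random(seed)
    x, t, cost = x0, 0.0, 0.0
    while t < T:
        disc = math.exp(-lam * t)
        if x <= a:
            cost += disc * (C + c * (alpha - x))   # push cash level up
            x = alpha
        elif x >= b:
            cost += disc * (D + d * (x - beta))    # push cash level down
            x = beta
        cost += disc * max(h * x, -p * x) * dt     # running cost f(x) dt
        x += -mu * dt - sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        t += dt
    return cost

# Monte Carlo estimate of J for one candidate band (a, alpha, beta, b)
costs = [discounted_cost(-4.0, -1.0, 1.0, 4.0, seed=i) for i in range(100)]
print(f"estimated J = {sum(costs) / len(costs):.2f}")
```

Searching over the four levels (a, α, β, b), e.g., by grid search on such estimates, mimics the simple-policy characterization discussed in the entry.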
Classical Frequency-Domain Design Methods

J. David Powell and Abbas Emami-Naeini
Stanford University, Stanford, CA, USA

Abstract

The design of feedback control systems in industry is probably accomplished using frequency-response (FR) methods more often than any other. Frequency-response design is popular primarily because it provides good designs in the face of uncertainty in the plant model (G(s) in Fig. 1). For example, for systems with poorly known or changing high-frequency resonances, we can temper the feedback design to alleviate the effects of those uncertainties. Currently, this tempering is carried out more easily using FR design than with any other method. The method is most effective for systems that are stable in open loop; however, it can also be applied to systems with instabilities. This section will introduce the reader to methods of design (i.e., finding D(s) in Fig. 1) using lead and lag compensation. It will also cover the use of FR design to reduce steady-state errors and to improve robustness to uncertainties in high-frequency dynamics.

Keywords

Amplitude stabilization; Bandwidth; Bode plot; Crossover frequency; Frequency response; Gain; Gain stabilization; Gain margin; Notch filter; Phase; Phase margin; Stability

Introduction

Finding an appropriate compensation (D(s) in Fig. 1) using frequency response is probably the easiest of all feedback control design methods. Designs are achievable starting with the FR plots of both magnitude and phase of G(s), then selecting D(s) to achieve certain values of the gain and/or phase margins and system bandwidth or error characteristics. This section will cover the design process for finding an appropriate D(s).

Design Specifications

As discussed in Section X, the gain margin (GM) is the factor by which the gain can be raised before instability results. The phase margin (PM) is the amount by which the phase of D(jω)G(jω) exceeds −180° when |D(jω)G(jω)| = 1, the crossover frequency. Performance requirements for control systems are often partially specified in terms of PM and/or GM. For example, a typical specification might include the requirement that PM > 50° and GM > 5. It can be shown that the PM tends to correlate well with the damping ratio, ζ, of the closed-loop roots. In fact, it is shown in Franklin et al. (2010) that

ζ ≅ PM/100

for many systems; however, the actual resulting damping and/or response overshoot of the final closed-loop system will need to be verified if they are specified as well as the PM. A PM of 50° would tend to yield a ζ of 0.5 for the closed-loop roots, which is a modestly damped system. The GM does not generally correlate directly with the damping ratio, but is a measure of the degree of stability and is a useful secondary specification to ensure stability.

Another design specification is the bandwidth, which was defined in Section X. The bandwidth is a direct measure of the frequency at which the closed-loop system starts to fail in following the input command. It is also a measure of the speed of response of a closed-loop system. Generally speaking, it correlates well with the step response rise time of the system.

In some cases, the steady-state error must be less than a certain amount. As discussed in Franklin et al. (2010), the steady-state error is a direct function of the low-frequency gain of
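The PM-to-damping rule of thumb quoted above can be sanity-checked against the exact relation for the standard second-order loop G(s) = ωn²/(s(s + 2ζωn)), for which the crossover is ωc = ωn√(√(1 + 4ζ⁴) − 2ζ²) and PM = arctan(2ζ/√(√(1 + 4ζ⁴) − 2ζ²)). A minimal sketch (function names are illustrative, not from the source):

```python
import math

def exact_pm_deg(zeta):
    """Exact phase margin (degrees) for the open loop G(s) = wn^2 / (s(s + 2*zeta*wn)).

    At the unity-gain crossover wc = wn*sqrt(sqrt(1 + 4*zeta^4) - 2*zeta^2),
    the phase margin works out to atan(2*zeta / sqrt(sqrt(1 + 4*zeta^4) - 2*zeta^2)).
    """
    r = math.sqrt(math.sqrt(1.0 + 4.0 * zeta**4) - 2.0 * zeta**2)
    return math.degrees(math.atan(2.0 * zeta / r))

def pm_rule_of_thumb_deg(zeta):
    """The approximation zeta ~= PM/100, rewritten as PM ~= 100*zeta."""
    return 100.0 * zeta

# The rule of thumb tracks the exact curve well for modest damping:
for zeta in (0.2, 0.5, 0.7):
    print(zeta, round(exact_pm_deg(zeta), 1), pm_rule_of_thumb_deg(zeta))
```

For ζ = 0.5 the exact value is about 51.8°, against 50° from the rule of thumb, consistent with the text's remark that a PM of 50° corresponds roughly to ζ = 0.5.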

[Figure 1 (block diagram): the reference R enters a summing junction Σ, the controller D(s) produces the control U, and the plant G(s) produces the output Y.]
Classical Frequency-Domain Design Methods, Fig. 1 Feedback system showing compensation, D(s) (Source: Franklin et al. (2010, p. 249), Reprinted by permission of Pearson Education, Inc., Upper Saddle River, NJ)

the FR magnitude plot. However, increasing the low-frequency gain typically will raise the entire magnitude plot upward, thus increasing the magnitude 1 crossover frequency and, therefore, increasing the speed of response and bandwidth of the system.

Compensation Design

In some cases, the design of a feedback compensation can be accomplished by using proportional control only, i.e., setting D(s) = K (see Fig. 1) and selecting a suitable value for K. This can be accomplished by plotting the magnitude and phase of G(s), looking at |G(jω)| at the frequency where ∠G(jω) = −180°, and then selecting K so that |KG(jω)| yields the desired GM. Similarly, if a particular value of PM is desired, one can find the frequency where ∠G(jω) = −180° + the desired PM. The value of |KG(jω)| at that frequency must equal 1; therefore, the value of |G(jω)| must equal 1/K. Note that the |KG(jω)| curve moves vertically based on the value of K; however, the ∠KG(jω) curve is not affected by the value of K. This characteristic simplifies the design process.

In more typical cases, proportional feedback alone is not sufficient. There is a need for a certain damping (i.e., PM) and/or speed of response (i.e., bandwidth) and there is no value of K that will meet the specifications. Therefore, some increased damping from the compensation is required. Likewise, a certain steady-state error requirement and its resulting low-frequency gain will cause the |D(jω)G(jω)| to be greater than desired for an acceptable PM, so more phase lead is required from the compensation. This is typically accomplished by lead compensation. A phase increase (or lead) is accomplished by placing a zero in D(s). However, that alone would cause an undesirable high-frequency gain which would amplify noise; therefore, a first-order pole is added in the denominator at frequencies substantially higher than the zero break point of the compensator. Thus, the phase lead still occurs, but the amplification at high frequencies is limited. The resulting lead compensation has a transfer function of

D(s) = K (Ts + 1)/(αTs + 1),   α < 1,   (1)

where 1/α is the ratio between the pole/zero break-point frequencies. Figure 2 shows the frequency response of this lead compensation. The maximum amount of phase lead supplied is dependent on the ratio of the pole to zero and is shown in Fig. 3 as a function of that ratio. For example, a lead compensator with a zero at s = −2 (T = 0.5) and a pole at s = −10 (αT = 0.1) (and thus α = 1/5) would yield the maximum phase lead of φmax = 40°. Note from the figure that we could increase the phase lead almost up to 90° using higher values of the lead ratio, 1/α; however, Fig. 2 shows that increasing values of 1/α also produce higher amplifications at higher frequencies. Thus, our task is to select a value of 1/α that is a good compromise between an acceptable PM and acceptable noise sensitivity at high frequencies. Usually the compromise suggests that a lead compensation should contribute a maximum of 70° to the phase. If a greater phase lead is needed, then a double-lead compensation would be suggested, where
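The maximum phase lead read graphically from Fig. 3 can also be computed in closed form: for the lead compensator of Eq. (1), sin φmax = (1 − α)/(1 + α), attained at the geometric mean of the two break frequencies. A small sketch of this standard lead-compensator relation (the helper names are ours, not from the source):

```python
import math

def max_phase_lead_deg(alpha):
    """Peak phase of D(s) = K*(T*s + 1)/(alpha*T*s + 1), alpha < 1.

    sin(phi_max) = (1 - alpha) / (1 + alpha), independent of T and K.
    """
    return math.degrees(math.asin((1.0 - alpha) / (1.0 + alpha)))

def freq_of_max_lead(alpha, T):
    """Frequency (rad/s) of the peak phase: the geometric mean of the
    zero break 1/T and the pole break 1/(alpha*T)."""
    return 1.0 / (T * math.sqrt(alpha))

# The alpha = 1/5 example from the text (zero at s = -2, pole at s = -10):
phi = max_phase_lead_deg(1.0 / 5.0)     # ~41.8 deg; Fig. 3 reads roughly 40 deg
w   = freq_of_max_lead(1.0 / 5.0, 0.5)  # ~4.47 rad/s
```

The formula confirms the roughly 40° value quoted from Fig. 3, and increasing the lead ratio 1/α (smaller α) indeed pushes the peak toward 90°.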

Classical Frequency-Domain Design Methods, Fig. 2 Lead-compensation frequency response with 1/α = 10, K = 1 (Source: Franklin et al. (2010, p. 349), Reprinted by permission of Pearson Education, Inc.)

D(s) = K [(Ts + 1)/(αTs + 1)]².

Even if a system had negligible amounts of noise present, the pole must exist at some point because of the impossibility of building a pure differentiator. No physical system – mechanical or electrical or digital – responds with infinite amplitude at infinite frequencies, so there will be a limit in the frequency range (or bandwidth) for which derivative information (or phase lead) can be provided.

As an example of designing a lead compensator, let us design compensation for a DC motor with the transfer function of

G(s) = 1/(s(s + 1)).

We wish to obtain a steady-state error of less than 0.1 for a unit-ramp input and we desire a system bandwidth greater than 3 rad/sec. Furthermore, we desire a PM of 45°. To accomplish the error requirement, Franklin et al. shows that

e_ss = lim_{s→0} s [1/(1 + D(s)G(s))] R(s),   (2)

and if R(s) = 1/s² for a unit ramp, Eq. (2) reduces to

e_ss = lim_{s→0} 1/(s + D(s)[1/(s + 1)]) = 1/D(0).

Therefore, we find that D(0), the steady-state gain of the compensation, cannot be less than 10 if it is to meet the error criterion, so we pick K = 10. The frequency response of KG(s) in Fig. 4 shows that the PM = 20° if no phase lead is added by compensation. If it were possible to simply add phase without affecting the magnitude, we would need an additional phase of only 25° at the KG(s) crossover frequency of ω = 3 rad/sec. However, maintaining the same low-frequency gain and adding a compensator zero will increase the crossover frequency; hence, more than a 25° phase contribution will be required from the lead compensation. To be safe, we will design the lead compensator so that it supplies a maximum phase lead of 40°. Figure 3 shows that 1/α = 5 will accomplish that goal. We will derive the greatest benefit from the compensation if the maximum phase lead from the compensator occurs at the crossover frequency. With some trial and error, we determine

Classical Frequency-Domain Design Methods, Fig. 3 Maximum phase increase for lead compensation (Source: Franklin et al. (2010, p. 350), Reprinted by permission of Pearson Education, Inc.)

Classical Frequency-Domain Design Methods, Fig. 4 Frequency response for lead-compensation design (Source: Franklin et al. (2010, p. 352), Reprinted by permission of Pearson Education, Inc.)

that placing the zero at ω = 2 rad/sec and the pole at ω = 10 rad/sec causes the maximum phase lead to be at the crossover frequency. The compensation, therefore, is

D(s) = 10 (s/2 + 1)/(s/10 + 1).

The frequency-response characteristics of L(s) = D(s)G(s) in Fig. 4 can be seen to yield a PM of 53°, which satisfies the PM and steady-state error design goals. In addition, the crossover frequency of 5 rad/sec will also yield a bandwidth greater than 3 rad/sec, as desired.

Lag compensation has the same form as the lead compensation in Eq. (1) except that α > 1. Therefore, the pole is at a lower frequency than the zero and it produces higher gain at lower frequencies. The compensation is used primarily to reduce steady-state errors by raising the low-frequency gain but without increasing the crossover frequency and speed of response. This can be accomplished by placing the pole and zero of the lag compensation well below the crossover frequency. Alternatively, lag compensation can also be used to improve the PM by keeping the low-frequency gain the same, but reducing the gain near crossover, thus reducing the crossover frequency. That will usually improve the PM since the phase of the uncompensated system typically is higher at lower frequencies.

Systems being controlled often have high-frequency dynamic phenomena, such as mechanical resonances, that could have an impact on the stability of a system. In very-high-performance designs, these high-frequency dynamics are included in the plant model, and a compensator is designed with a specific knowledge of those dynamics. However, a more robust approach for designing with uncertain high-frequency dynamics is to keep the high-frequency gain low, just as we did for sensor-noise reduction. The reason for this can be seen from the gain–frequency relationship of a typical system, shown in Fig. 5. The only way instability can result from high-frequency dynamics is if an unknown high-frequency resonance causes the magnitude to rise above 1.

Classical Frequency-Domain Design Methods, Fig. 5 Effect of high-frequency plant uncertainty (Source: Franklin et al. (2010, p. 372), Reprinted by permission of Pearson Education, Inc.)

Conversely, if all unknown high-frequency phenomena are guaranteed to remain below a magnitude of 1, stability can be guaranteed. The likelihood of an unknown resonance in the plant G rising above 1 can be reduced if the nominal high-frequency loop gain (L) is lowered by the addition of extra poles in D(s). When the stability of a system with resonances is assured by tailoring the high-frequency magnitude never to exceed 1, we refer to this process as amplitude or gain stabilization. Of course, if the resonance characteristics are known exactly and remain the same under all conditions, a specially tailored compensation, such as a notch filter at the resonant frequency, can be used to tailor the phase for stability even though the amplitude does exceed magnitude 1, as explained in Franklin et al. (2010). Design of a notch filter is more easily carried out using root locus or state-space design methods, all of which are discussed in Franklin et al. (2010). This method of stabilization is referred to as phase stabilization. A drawback to phase stabilization is that the resonance information is often not available with adequate precision or varies with time; therefore, the method is more susceptible to errors in the plant model used in the design. Thus, we see that sensitivity to plant uncertainty
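The finished DC-motor design above, L(s) = D(s)G(s) with D(s) = 10(s/2 + 1)/(s/10 + 1) and G(s) = 1/(s(s + 1)), can be verified numerically by scanning the frequency response for the unity-gain crossover and the resulting PM. A minimal check of the worked example (not production code; the frequency grid and names are ours):

```python
import cmath
import math

def L(w):
    """Loop transfer function L(jw) = D(jw)*G(jw) for the DC-motor example."""
    s = 1j * w
    D = 10.0 * (s / 2.0 + 1.0) / (s / 10.0 + 1.0)
    G = 1.0 / (s * (s + 1.0))
    return D * G

# |L(jw)| decreases monotonically here, so the first frequency where it
# falls through 1 is the crossover frequency.
ws = [0.01 * k for k in range(1, 2000)]   # 0.01 ... 19.99 rad/s
wc = next(w for w in ws if abs(L(w)) <= 1.0)
pm = 180.0 + math.degrees(cmath.phase(L(wc)))

# Steady-state ramp error: e_ss = 1/D(0) = 0.1, meeting the spec.
e_ss = 1.0 / 10.0
```

The scan lands near the values quoted in the text: a crossover close to 5 rad/sec and a PM of about 53°.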

and sensor noise are both reduced by sufficiently low gain at high frequency.

Summary and Future Directions

Before the common use of computers in design, frequency-response design was the only widely used method. While it is still the most widely used method for routine designs, complex systems and their design are being carried out using a multitude of methods. This section introduces just one of many possible methods.

Cross-References

Frequency-Response and Frequency-Domain Models
Polynomial/Algebraic Design Methods
Quantitative Feedback Theory
Spectral Factorization

Bibliography

Franklin GF, Powell JD, Workman M (1998) Digital control of dynamic systems, 3rd edn. Ellis-Kagle Press, Half Moon Bay
Franklin GF, Powell JD, Emami-Naeini A (2010) Feedback control of dynamic systems, 6th edn. Pearson, Upper Saddle River
Franklin GF, Powell JD, Emami-Naeini A (2015) Feedback control of dynamic systems, 7th edn. Pearson, Upper Saddle River

Computational Complexity Issues in Robust Control

Onur Toker
Fatih University, Istanbul, Turkey

Abstract

Robust control theory has introduced several new and challenging problems for researchers. Some of these problems have been solved by innovative approaches and led to the development of new and efficient algorithms. However, some of the other problems in robust control theory have attracted a significant amount of research, but none of the proposed algorithms are efficient, namely, have execution time bounded by a polynomial of the "problem size." Several important problems in robust control theory are either of decision type or of computation/approximation type, and one would like to have an algorithm which can be used to answer all or most of the possible cases and can be executed on a classical computer in a reasonable amount of time. There is a branch of theoretical computer science, called the theory of computation, which can be used to study the difficulty of problems in robust control theory. In the following, the notions of classical computer system, algorithm, efficient algorithm, unsolvability, tractability, NP-hardness, and NP-completeness will be introduced in a more rigorous fashion, with applications to problems from robust control theory.

Keywords

Approximation complexity; Computational complexity; NP-complete; NP-hard; Unsolvability

Introduction

The term algorithm is used to refer to different notions which are all somewhat consistent with our intuitive understanding. This ambiguity may sometimes generate significant confusion, and therefore, a rigorous definition is of extreme importance. One commonly accepted "intuitive" definition is a set of rules that a person can perform with paper and pencil. However, there are "algorithms" which involve random number generation, for example, finding a primitive root in Z_p (Knuth 1997). Based on this observation, one may ask whether a random number generation-based set of rules should also be considered as an algorithm, provided that it will terminate after finitely many steps for all instances of the

problem or for a significant majority of the cases. In a similar fashion, one may ask whether any real number, including irrational ones which cannot be represented on a digital computer without an approximation error, should be allowed as an input to an algorithm, and, furthermore, whether all calculations should be limited to algebraic functions only, or whether exact calculation of non-algebraic functions, e.g., trigonometric functions, the gamma function, etc., should be acceptable in an algorithm. Although all of these seem acceptable with respect to our intuitive understanding of the algorithm, from a rigorous point of view, they are different notions. In the context of robust control theory, as well as many other engineering disciplines, there is a separate and widely accepted definition of algorithm, which is based on today's digital computers, more precisely the Turing machine (Turing 1936). Alan M. Turing defined a "hypothetical computation machine" to formally define the notions of algorithm and computability. A Turing machine is, in principle, quite similar to today's digital computers widely used in many engineering applications. The engineering community seems to widely accept the use of current digital computers and Turing's definitions of algorithm and computability.

However, depending on new scientific, engineering, and technological developments, superior computation machines may be constructed. For example, there is no guarantee that quantum computing research will not lead to superior computation machines (Chen et al. 2006; Kaye et al. 2007). In the future, the engineering community may feel the need to revise formal definitions of algorithm, computability, tractability, etc., if such superior computation machines can be constructed and used for scientific/engineering applications.

Turing Machines and Unsolvability

A Turing machine is basically a mathematical model of a simplified computation device. The original definition involves a tape-like device for memory. For an easy-to-read introduction to the Turing machine model, see Garey and Johnson (1979) and Papadimitriou (1995), and for more details, see Hopcroft et al. (2001), Lewis and Papadimitriou (1998), and Sipser (2006). Despite this being quite simple and low-performance "hardware" compared to today's engineering standards, the following two observations justify its use in the study of the computational complexity of engineering problems. Anything which can be solved on today's current digital computers can be solved on a Turing machine. Furthermore, a polynomial-time algorithm on today's digital computers will correspond to again a polynomial-time algorithm on the original Turing machine, and vice versa.

A widely accepted definition of an algorithm is a Turing machine with a program which is guaranteed to terminate after finitely many steps. For some mathematical and engineering problems, it can be shown that there can be no algorithm which can handle all possible cases. Such problems are called unsolvable. The condition "all cases" may be considered too tough, and one may argue that such negative results have only theoretical importance and no practical implications. But such results do imply that we should concentrate our efforts on alternative research directions, like the development of algorithms only for cases which appear more frequently in real scientific/engineering applications, without asking the algorithm to work for the remaining cases as well.

Here is a famous unsolvable mathematical problem: Hilbert's tenth problem is basically the development of an algorithm for testing whether a Diophantine equation has an integer solution. However, in 1970, Matijasevich showed that there can be no such algorithm (Matiyasevich 1993). Therefore, we say that the problem of checking whether a Diophantine equation has an integer solution is unsolvable.

Several unsolvability results for dynamical systems can be proved by using the Post correspondence problem (Davis 1985) and the embedding of free semigroups into matrices. For example, the problem of checking the stability of saturated linear dynamical systems is proved to be undecidable (Blondel et al. 2001), meaning that no general stability test algorithm can be developed for such systems. A similar unsolvability result is reported in Blondel and
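The Diophantine example illustrates a useful distinction: solutions can be searched for by brute-force enumeration (the problem is semi-decidable), but Matiyasevich's theorem rules out any algorithm that also terminates with "no" on every unsolvable instance. A toy enumerator, purely illustrative and not from the source:

```python
from itertools import product

def diophantine_search(f, n_vars, bound):
    """Search integer tuples with coordinates in [-bound, bound] for a zero of f.

    Finding a solution certifies solvability; exhausting the box proves
    nothing, which is exactly why no general decision algorithm exists.
    """
    rng = range(-bound, bound + 1)
    for xs in product(rng, repeat=n_vars):
        if f(*xs) == 0:
            return xs
    return None

# x^2 + y^2 - 25 = 0 has integer solutions, e.g. (3, 4):
sol = diophantine_search(lambda x, y: x * x + y * y - 25, 2, 6)
# x^2 - 2 = 0 has none, but the search can only report "none in this box":
none_found = diophantine_search(lambda x: x * x - 2, 1, 100)
```

No matter how large `bound` is made, a `None` result never distinguishes "no solution exists" from "no solution in this box"; that asymmetry is the content of the unsolvability result.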

Tsitsiklis (2000a) for boundedness of switching systems of the type

x(k + 1) = A_{f(k)} x(k),

where f is assumed to be an arbitrary and unknown function from ℕ into {0, 1}. A closely related asymptotic stability problem is equivalent to testing whether the joint spectral radius (JSR) (Rota and Strang 1960) of a set of matrices is less than one. For a quite long period of time, there was a conjecture called the finiteness conjecture (FC) (Lagarias and Wang 1995), which was generally believed or hoped to be true, at least by a group of researchers. FC may be interpreted as "For asymptotic stability of x(k + 1) = A_{f(k)} x(k) type switching systems, it is enough to consider periodic switchings only." There was no known counterexample, and the truth of this conjecture would imply the existence of an algorithm for the abovementioned JSR problem. However, it was shown in Bousch and Mairesse (2002) that FC is not true (see Blondel et al. (2003) for a simplified proof). There are numerous known computationally very valuable procedures related to JSR approximation; for example, see Blondel and Nesterov (2005) and references therein. However, the development of an algorithm to test whether the JSR is less than one remains an open problem.

For further results on unsolvability and unsolved problems in robust control, see Blondel et al. (1999), Blondel and Megretski (2004), and references therein.

Tractability, NP-Hardness, and NP-Completeness

The engineering community is interested not only in solution algorithms but in algorithms which are fast, ideally even in the worst case and, if not, at least on the average. Sometimes, this speed requirement may be relaxed to being fast for most of the cases, and sometimes to only a significant percentage of the cases. Currently, the theory of computation is developed around the idea of algorithms which are polynomial time even in the worst case, namely, with execution time bounded by a polynomial of the problem size (Garey and Johnson 1979; Papadimitriou 1995). Such algorithms are also called efficient, and the associated problems are classified as tractable. The term problem size means the number of bits used in a suitable encoding of the problem (Garey and Johnson 1979; Papadimitriou 1995).

One may argue that this worst-case approach of being always polynomial time is a quite conservative requirement. In reality, a practicing engineer may consider being polynomial time on the average quite satisfactory for many applications. The same may be true for algorithms which are polynomial time for most of the cases. However, the existing computational complexity theory is developed around this idea of being polynomial time even in the worst case. Therefore, many of the computational complexity results proved in the literature do not imply the impossibility of algorithms which are neither polynomial time on the average nor polynomial time for most of the cases. Note that despite not being efficient, such algorithms may be considered quite valuable by a practicing engineer. Tractability and efficiency can be defined in several different ways, but the abovementioned polynomial-time solvability even in the worst case is the approach widely adopted by the engineering community.

NP-hardness and NP-completeness were originally defined to express the inherent difficulty of decision-type problems, not approximation-type problems. Although approximation complexity is an important and active research area in the theory of computation (Papadimitriou 1995), most of the classical results are on decision-type problems. Many robust control-related problems can be formulated as "Check whether γ < 1," which is a decision-type problem. An approximate value of γ may not always be good enough to "solve" the problem, i.e., to decide about robust stability. For certain other engineering applications for which approximate values of optimization problems are good enough to "solve" the problem, the complexity of a decision problem may not be very relevant. For example, in a minimum effort control problem, usually there may be no point in computing the exact value of the minimum, because good approximations will
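The periodic-switching idea behind the finiteness conjecture discussed above gives computable lower bounds on the JSR: for any product P of k matrices drawn from the set, ρ(P)^(1/k) ≤ JSR. A brute-force sketch for 2×2 matrices (spectral radius from the closed-form eigenvalues; the matrix pair and all names are illustrative):

```python
import cmath
from itertools import product

def mat_mul(A, B):
    """2x2 matrix product."""
    return [[A[0][0]*B[0][0] + A[0][1]*B[1][0], A[0][0]*B[0][1] + A[0][1]*B[1][1]],
            [A[1][0]*B[0][0] + A[1][1]*B[1][0], A[1][0]*B[0][1] + A[1][1]*B[1][1]]]

def spectral_radius(M):
    """Largest eigenvalue modulus of a 2x2 matrix, via the quadratic formula."""
    tr = M[0][0] + M[1][1]
    det = M[0][0]*M[1][1] - M[0][1]*M[1][0]
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return max(abs((tr + disc) / 2.0), abs((tr - disc) / 2.0))

def jsr_lower_bound(mats, max_len):
    """Best bound rho(P)^(1/k) over all products P of length k <= max_len,
    i.e., over all periodic switching sequences up to that period."""
    best = 0.0
    for k in range(1, max_len + 1):
        for seq in product(mats, repeat=k):
            P = seq[0]
            for M in seq[1:]:
                P = mat_mul(P, M)
            best = max(best, spectral_radius(P) ** (1.0 / k))
    return best

A0 = [[0.0, 1.0], [0.0, 0.0]]   # each matrix alone is nilpotent ...
A1 = [[0.0, 0.0], [1.0, 0.0]]   # ... but the product A0*A1 has spectral radius 1
lb = jsr_lower_bound([A0, A1], 4)
```

For this pair, no single matrix reveals anything (both have spectral radius 0), while the period-2 product certifies a lower bound of 1, which shows why arbitrary switching can behave very differently from each mode in isolation.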

be just fine for most cases. However, for a robust control problem, a result like γ = 0.99 ± 0.02 may not be enough to decide about robust stability, although the approximation error is only about 2 %. Basically, both the conservativeness of the current tractability definition and the differences between decision- and approximation-type problems should always be kept in mind when interpreting computational complexity results reported here as well as in the literature.

In this subsection, and in the next one, we will consider decision problems only. The class P corresponds to decision problems which can be solved by a Turing machine with a suitable program in polynomial time (Garey and Johnson 1979). This is interpreted as decision problems which have polynomial-time solution algorithms. The definition of the class NP is more technical and involves nondeterministic Turing machines (Garey and Johnson 1979). It may be interpreted as the class of decision problems for which the truth of the problem can be verified in polynomial time. It is currently unknown whether P and NP are equal or not. This is a major open problem, and its importance in the theory of computation is comparable to the importance of the Riemann hypothesis in number theory.

A problem is NP-complete if it is in NP and every NP problem polynomially reduces to it (Garey and Johnson 1979). For an NP-complete problem, being in P is equivalent to P = NP. There are literally hundreds of such problems, and it is generally argued that since after several years of research nobody was able to develop a polynomial-time algorithm for these NP-complete problems, there is probably no such algorithm, and most likely P ≠ NP. Although current evidence is more toward P ≠ NP, this does not constitute a formal proof, and the history of mathematics and science is full of surprising discoveries.

A problem (not necessarily in NP) is called NP-hard if and only if there is an NP-complete problem which is polynomial-time reducible to it (Garey and Johnson 1979). Being NP-hard is sometimes called being intractable and means that unless P = NP, which is considered to be very unlikely by a group of researchers, no polynomial-time solution algorithm can be developed. All NP-complete problems are also NP-hard, but they are only as "hard" as any other problem in the set of NP-complete problems.

The first known NP-complete problem is SATISFIABILITY (Cook 1971). In this problem, there is a single Boolean equation with several variables, and we would like to test whether there is an assignment to these variables which makes the Boolean expression true. This important discovery enabled proofs of NP-completeness or NP-hardness of several other problems by using simple polynomial reduction techniques only (Garey and Johnson 1979). Among these, quadratic programming is an important one and led to the discovery of several other complexity results in robust control theory. The quadratic programming (QP) problem can be defined as

q = min_{Ax ≤ b} xᵀQx + cᵀx,

more precisely, testing whether q < 1 or not (decision version). When the matrix Q is positive definite, convex optimization techniques can be used; however, the general version of the problem is NP-hard.

A related problem is linear programming (LP),

q = min_{Ax ≤ b} cᵀx,

which is used in certain robust control problems (Dahleh and Diaz-Bobillo 1994) and has a quite interesting status. The simplex method (Dantzig 1963) is a very popular computational technique for LP and is known to have polynomial-time complexity on the "average" (Smale 1983). However, there are examples where the simplex method requires an exponentially growing number of steps with the problem size (Klee and Minty 1972). In 1979, Khachiyan proposed the ellipsoid algorithm for LP, which was the first known polynomial-time approximation algorithm (Schrijver 1998). Because of the nature of the problem, one can answer the decision version of LP in polynomial time by using the ellipsoid algorithm for approximation and stopping when the error is below a certain level.
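SATISFIABILITY, named above as the first NP-complete problem, is easy to state in code: the obvious decision procedure tries all 2ⁿ assignments, which illustrates why membership in NP (a satisfying assignment is quickly checkable) does not by itself yield a polynomial-time algorithm. A toy checker, purely illustrative:

```python
from itertools import product

def brute_force_sat(formula, n_vars):
    """Decide satisfiability of a Boolean function of n_vars variables by
    exhaustive search: up to 2**n_vars candidate assignments."""
    for bits in product((False, True), repeat=n_vars):
        if formula(*bits):
            return bits          # a certificate, checkable in polynomial time
    return None

# (x or y) and (not x or z) and (not y or not z) is satisfiable:
sat = brute_force_sat(
    lambda x, y, z: (x or y) and (not x or z) and (not y or not z), 3)
# x and (not x) is not:
unsat = brute_force_sat(lambda x: x and not x, 1)
```

Verifying a returned assignment takes one evaluation of the formula; it is only the search over assignments that blows up, which is exactly the gap between NP membership and P membership discussed above.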

But all of these results are for standard Turing machines with input parameters restricted to rational numbers. An interesting open problem is whether LP admits a polynomial algorithm in the real number model of computation.

Complexity of Certain Robust Control Problems

There are several computational complexity results for robust control problems (see Blondel and Tsitsiklis (2000b) for a more detailed survey). Here we summarize some of the key results on interval matrices and structured singular values.

The Kharitonov theorem is about robust Hurwitz stability of polynomials with coefficients restricted to intervals (Kharitonov 1978). The problem is known to have a surprisingly simple solution; however, the matrix version of the problem has a quite different nature. If we have a matrix family

A = {A ∈ ℝⁿˣⁿ : α_{i,j} ≤ A_{i,j} ≤ β_{i,j}, i, j = 1, …, n},   (1)

where α_{i,j}, β_{i,j} are given constants for i, j = 1, …, n, then it is called an interval matrix. Such matrices do occur in descriptions of uncertain dynamical systems. The following two stability problems about interval matrices are known to be NP-hard:

Interval Matrix Problem 1 (IMP1): Decide whether a given interval matrix, A, is robustly Hurwitz stable or not. Namely, check whether all members of A are Hurwitz-stable matrices, i.e., all eigenvalues are in the open left half plane.

Interval Matrix Problem 2 (IMP2): Decide whether a given interval matrix, A, has a Hurwitz-stable matrix or not. Namely, check whether there exists at least one matrix in A which is Hurwitz stable.

For a proof of NP-hardness of IMP1, see Poljak and Rohn (1993) and Nemirovskii (1993), and for a proof for IMP2, see Blondel and Tsitsiklis (1997).

Another important problem is related to structured singular values (SSV) and linear fractional transformations (LFT), which are mainly used to study systems which have component-level uncertainties (Packard and Doyle 1993). Basically, bounds on the component-level uncertainties are given, and we would like to check whether the system is robustly stable or not. This is known to be NP-hard.

Structured Singular Value Problem (SSVP): Given a matrix M and an uncertainty structure Δ, check whether the structured singular value μ_Δ(M) < 1.

This is proved to be NP-hard for real, and mixed, uncertainty structures (Braatz et al. 1994), as well as for complex uncertainties with no repetitions (Toker and Ozbay 1996, 1998).

Approximation Complexity

The decision version of QP is NP-hard, but approximation of QP is quite "difficult" as well. An approximation is called an ε-approximation if the absolute value of the error is bounded by ε times the absolute value of the max–min of the function. The following is a classical result on QP (Bellare and Rogaway 1995): unless P = NP, QP does not admit a polynomial-time ε-approximation algorithm even for ε < 1/3. This is sometimes informally stated as "QP is NP-hard to approximate." Much work is needed toward similar results on robustness margin and related optimization problems of the classical robust control theory.

An interesting case is the complex structured singular value computation with no repeated uncertainties. There is a convex relaxation of the problem, the standard upper bound, which is known to result in quite tight approximations for most cases of the original problem (Packard and Doyle 1993). However, despite strong numerical evidence, a formal proof of a "good approximation for most cases" result is not available. We also do not have much theoretical information about how hard it is to approximate the complex structured singular value. For example, it is not known whether it admits a polynomial-time approximation algorithm with error bounded by, say, 5 %. In summary, much work needs to be done in these
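IMP1 asks for a certificate that every member of the interval family (1) is Hurwitz; its NP-hardness suggests no efficient general test exists. Random sampling inside the intervals runs in the opposite direction: it can falsify robust stability by exhibiting one unstable member, but it can never certify stability. A small NumPy sketch (the 2×2 bounds are invented for illustration):

```python
import numpy as np

def sample_interval_matrix(low, high, rng):
    """Draw one member of the interval family (1): entrywise uniform in [low, high]."""
    return rng.uniform(low, high)

def is_hurwitz(A):
    """True iff every eigenvalue has a negative real part."""
    return bool(np.all(np.linalg.eigvals(A).real < 0.0))

def try_to_falsify(low, high, n_samples=1000, seed=0):
    """Look for an unstable member; return one if found, else None.

    Finding nothing proves nothing: certification, not falsification,
    is the hard direction in IMP1.
    """
    rng = np.random.default_rng(seed)
    for _ in range(n_samples):
        A = sample_interval_matrix(low, high, rng)
        if not is_hurwitz(A):
            return A
    return None

# Illustrative bounds: diagonal entries in [-2, -0.5], off-diagonal in [0, 3].
# Large off-diagonal products push an eigenvalue into the right half plane.
low = np.array([[-2.0, 0.0], [0.0, -2.0]])
high = np.array([[-0.5, 3.0], [3.0, -0.5]])
witness = try_to_falsify(low, high)
```

For this family a destabilizing member is found quickly, so robust Hurwitz stability is refuted; had the search come up empty, no conclusion could be drawn, which is the practical face of the NP-hardness result.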

directions for many other robust control problems whose decision versions are NP-hard.

Summary and Future Directions

The study of the "Is P ≠ NP?" question turned out to be a quite difficult one. Researchers agree that really new and innovative tools are needed to study this problem. At the other extreme, one can question whether we can really say something about this problem within Zermelo-Fraenkel (ZF) set theory, or whether it will have a status similar to the axiom of choice (AC) and the continuum hypothesis (CH), which we can neither refute nor prove (Aaronson 1995). Therefore, the question may be indeed much deeper than we thought, and the standard axioms of today's mathematics may not be enough to provide an answer. As for any such problem, we can still hope that in the future new "self-evident" axioms may be discovered, and with their help we may provide an answer.

All of the complexity results mentioned here are with respect to the standard Turing machine, which is a simplified model of today's digital computers. Depending on the progress in science, engineering, and technology, if superior computation machines can be constructed, then some of the abovementioned problems can be solved much faster on these devices, and current results/problems of the theory of computation may no longer be of great importance or relevance for engineering and scientific applications. In such a case, one may also need to revise the definitions of the terms algorithm, tractable, etc., according to these new devices.

Currently, there are several NP-hardness results about robust control problems, mostly NP-hardness of decision problems about robustness. However, much work is needed on the approximation complexity and the conservatism of various convex relaxations of these problems. Even if a robust stability problem is NP-hard, a polynomial-time algorithm to estimate the robustness margin with, say, 5 % error is not ruled out by the NP-hardness of the decision version of the problem. Indeed, a polynomial-time and 5 % error-bounded result will be of great importance for practicing engineers. Therefore, such directions should also be studied, and various meaningful alternatives, like being polynomial time on the average or for most cases, or anything which makes sense for a practicing engineer, should be considered as an alternative direction.

In summary, computational complexity theory guides research on the development of algorithms, indicating which directions are dead ends and which directions are worth investigating.

Cross-References

▶ Optimization Based Robust Control
▶ Robust Control in Gap Metric
▶ Robust Fault Diagnosis and Control
▶ Robustness Issues in Quantum Control
▶ Structured Singular Value and Applications: Analyzing the Effect of Linear Time-Invariant Uncertainty in Linear Systems

Bibliography

Aaronson S (1995) Is P versus NP formally independent? Technical report 81, EATCS
Bellare M, Rogaway P (1995) The complexity of approximating a nonlinear program. Math Program 69:429–441
Blondel VD, Megretski A (2004) Unsolved problems in mathematical systems and control theory. Princeton University Press, Princeton
Blondel VD, Nesterov Y (2005) Computationally efficient approximations of the joint spectral radius. SIAM J Matrix Anal 27:256–272
Blondel VD, Tsitsiklis JN (1997) NP-hardness of some linear control design problems. SIAM J Control Optim 35:2118–2127
Blondel VD, Tsitsiklis JN (2000a) The boundedness of all products of a pair of matrices is undecidable. Syst Control Lett 41:135–140
Blondel VD, Tsitsiklis JN (2000b) A survey of computational complexity results in systems and control. Automatica 36:1249–1274
Blondel VD, Sontag ED, Vidyasagar M, Willems JC (1999) Open problems in mathematical systems and control theory. Springer, London
Blondel VD, Bournez O, Koiran P, Tsitsiklis JN (2001) The stability of saturated linear dynamical systems is undecidable. J Comput Syst Sci 62:442–462
122 Computer-Aided Control Systems Design: Introduction and Historical Overview
Blondel VD, Theys J, Vladimirov AA (2003) An elementary counterexample to the finiteness conjecture. SIAM J Matrix Anal 24:963–970
Bousch T, Mairesse J (2002) Asymptotic height optimization for topical IFS, Tetris heaps and the finiteness conjecture. J Am Math Soc 15:77–111
Braatz R, Young P, Doyle J, Morari M (1994) Computational complexity of μ calculation. IEEE Trans Autom Control 39:1000–1002
Chen G, Church DA, Englert BG, Henkel C, Rohwedder B, Scully MO, Zubairy MS (2006) Quantum computing devices. Chapman and Hall/CRC, Boca Raton
Cook S (1971) The complexity of theorem proving procedures. In: Proceedings of the third annual ACM symposium on theory of computing, Shaker Heights, pp 151–158
Dahleh MA, Diaz-Bobillo I (1994) Control of uncertain systems. Prentice Hall, Englewood Cliffs
Dantzig G (1963) Linear programming and extensions. Princeton University Press, Princeton
Davis M (1985) Computability and unsolvability. Dover
Garey MR, Johnson DS (1979) Computers and intractability, a guide to the theory of NP-completeness. W. H. Freeman, San Francisco
Hopcroft JE, Motwani R, Ullman JD (2001) Introduction to automata theory, languages, and computation. Addison Wesley, Boston
Kaye P, Laflamme R, Mosca M (2007) An introduction to quantum computing. Oxford University Press, Oxford
Kharitonov VL (1978) Asymptotic stability of an equilibrium position of a family of systems of linear differential equations. Differentsial'nye Uravneniya 14:2086–2088
Klee V, Minty GJ (1972) How good is the simplex algorithm? In: Inequalities III (proceedings of the third symposium on inequalities), Los Angeles. Academic, New York/London, pp 159–175
Knuth DE (1997) Art of computer programming, volume 2: seminumerical algorithms, 3rd edn. Addison-Wesley, Reading
Lagarias JC, Wang Y (1995) The finiteness conjecture for the generalized spectral radius of a set of matrices. Linear Algebra Appl 214:17–42
Lewis HR, Papadimitriou CH (1998) Elements of the theory of computation. Prentice Hall, Upper Saddle River
Matiyasevich YV (1993) Hilbert's tenth problem. MIT Press, Cambridge
Nemirovskii A (1993) Several NP-hard problems arising in robust stability analysis. Math Control Signals Syst 6:99–105
Packard A, Doyle J (1993) The complex structured singular value. Automatica 29:71–109
Papadimitriou CH (1995) Computational complexity. Addison-Wesley/Longman, Reading
Poljak S, Rohn J (1993) Checking robust nonsingularity is NP-hard. Math Control Signals Syst 6:1–9
Rota GC, Strang G (1960) A note on the joint spectral radius. Proc Neth Acad 22:379–381
Schrijver A (1998) Theory of linear and integer programming. Wiley, Chichester
Sipser M (2006) Introduction to the theory of computation. Thomson Course Technology, Boston
Smale S (1983) On the average number of steps in the simplex method of linear programming. Math Program 27:241–262
Toker O, Ozbay H (1996) Complexity issues in robust stability of linear delay differential systems. Math Control Signals Syst 9:386–400
Toker O, Ozbay H (1998) On the NP-hardness of the purely complex mu computation, analysis/synthesis, and some related problems in multidimensional systems. IEEE Trans Autom Control 43:409–414
Turing AM (1936) On computable numbers, with an application to the Entscheidungsproblem. Proc Lond Math Soc 42:230–265

Computer-Aided Control Systems Design: Introduction and Historical Overview

Andreas Varga
Institute of System Dynamics and Control, German Aerospace Center, DLR Oberpfaffenhofen, Wessling, Germany

Synonyms

CACSD

Abstract

Computer-aided control system design (CACSD) encompasses a broad range of methods, tools, and technologies for system modelling, control system synthesis and tuning, dynamic system analysis and simulation, as well as validation and verification. The domain of CACSD enlarged progressively over decades from simple collections of algorithms and programs for control system analysis and synthesis to comprehensive tool sets and user-friendly environments supporting all aspects of developing and deploying advanced control systems in various application fields. This entry gives a brief introduction to CACSD and reviews
the evolution of key concepts and technologies underlying the CACSD domain. Several cornerstone achievements in developing reliable numerical algorithms; implementing robust numerical software; and developing sophisticated integrated modelling, simulation, and design environments are highlighted.

Keywords

CACSD; Modelling; Numerical analysis; Simulation; Software tools

Introduction

To design a control system for a plant, a typical computer-aided control system design (CACSD) work flow comprises several interlaced activities. Model building is often a necessary first step, consisting in developing suitable mathematical models to accurately describe the plant dynamical behavior. High-fidelity physical plant models, obtained, for example, by using the first principles of modelling, primarily serve for analysis and validation purposes using appropriate simulation techniques. These dynamic models are usually defined by a set of ordinary differential equations (ODEs), differential algebraic equations (DAEs), or partial differential equations (PDEs). However, for control system synthesis purposes simpler models are required, which are derived by simplifying high-fidelity models (e.g., by linearization, discretization, or model reduction) or directly determined in a specific form from input-output measurement data using system identification techniques. Frequently used synthesis models are continuous- or discrete-time linear time-invariant (LTI) models describing the nominal behavior of the plant in a specific operating point. The more accurate linear parameter-varying (LPV) models may serve to account for uncertainties due to various performed approximations, nonlinearities, or varying model parameters.

Simulation of dynamical systems is an activity closely related to modelling and is concerned with performing virtual experiments on a given plant model to analyze and predict the dynamic behavior of a physical plant. Often, modelling and simulation are closely connected parts of dedicated environments, where specific classes of models can be built and appropriate simulation methods can be employed. Simulation is also a powerful tool for the validation of mathematical models and their approximations. In the context of CACSD, simulation is frequently used as a control system tuning aid, as, for example, in an optimization-based tuning approach using time-domain performance criteria.

System analysis and synthesis are concerned with the investigation of properties of the underlying synthesis model and the determination of a control system which fulfills basic requirements for the closed-loop controlled plant, such as stability or various time- or frequency-response requirements. The analysis also serves to check existence conditions for the solvability of synthesis problems, according to established design methodologies. An important synthesis goal is the guarantee of performance robustness. To achieve this goal, robust control synthesis methodologies often employ optimization-based parameter tuning in conjunction with worst-case analysis techniques. A rich collection of reliable numerical algorithms is available to perform such analysis and synthesis tasks. These algorithms form the core of CACSD, and their development represented one of the main motivations for CACSD-related research.

Performance robustness assessment of the resulting closed-loop control system is a key aspect of the verification and validation activity. For a reliable assessment, simulation-based worst-case analysis often represents the only way to prove the performance robustness of the synthesized control system in the presence of parametric uncertainties and variabilities.

Development of CACSD Tools

The development of CACSD tools for system analysis and synthesis started around 1960, immediately after general-purpose digital
computers, and new programming languages became available for research and development purposes. In what follows, we give a historical survey of these developments in the main CACSD areas.

Modelling and Simulation Tools

Among the first developments were modelling and simulation tools for continuous-time systems described by differential equations, based on dedicated simulation languages. Over 40 continuous-system simulation languages had been developed as of 1974 (Nilsen and Karplus 1974), which evolved out of attempts at digitally emulating the behavior of the analog computers widely used before 1960. A notable development in this period was the CSSL standard (Augustin et al. 1967), which defined a system as an interconnection of blocks corresponding to operators which emulated the main analog simulation blocks and implied the integration of the underlying ODEs using suitable numerical methods. For many years, the ACSL preprocessor to Fortran (Mitchel and Gauthier 1976) was one of the most successful implementations of the CSSL standard.

A turning point was the development of graphical user interfaces allowing graphical block-diagram-based modelling. The most important developments were SystemBuild (Shah et al. 1985) and SIMULAB (later marketed as Simulink) (Grace 1991). Both products used a customizable set of block libraries and were seamlessly integrated in, respectively, MATRIXx and MATLAB, two powerful interactive matrix computation environments (see below). SystemBuild provided several advanced features such as event management, code generation, and DAE-based modelling and simulation. Simulink excelled from the beginning with its intuitive, easy-to-use user interface. Recent extensions of Simulink allow the modelling and simulation of hybrid systems, code generation for real-time applications, and various enhancements of the model building process (e.g., object-oriented modelling).

The object-oriented paradigm for system modelling was introduced with Dymola (Elmqvist 1978) to support physical system modelling based on interconnections of subsystems. The underlying modelling language served as the basis of the first version of Modelica (Elmquist et al. 1997), a modern equation-based modelling language which was the result of a coordinated effort for the unification and standardization of the expertise gained over many years with object-oriented physical modelling. The latest developments in this area are comprehensive model libraries for different application domains such as mechanical, electrical, electronic, hydraulic, thermal, control, and electric power systems. Notable commercial front-ends for Modelica are Dymola, MapleSim, and SystemModeler, where the last two are tightly integrated in the symbolic computation environments Maple and Mathematica, respectively.

Numerical Software Tools

The computational tools for CACSD rely on many numerical algorithms whose development and implementation in computer codes were the primary motivation of this research area since its beginnings. The Automatic Synthesis Program (ASP) developed in 1966 (Kalman and Englar 1966) was implemented in FAP (Fortran Assembly Program) and ran only on an IBM 7090–7094 machine. The Fortran II version of ASP (known as FASP) can be considered the first collection of computational CACSD tools ported to several mainframe computers. Interestingly, the linear algebra computations were covered by only three routines (diagonal decomposition, inversion, and pseudoinverse), and no routines were used for eigenvalue or polynomial root computation. The main analysis and synthesis functions covered sampled-data discretization (via the matrix exponential), minimal realization, time-varying Riccati equation solution for quadratic control, filtering, and stability analysis. FASP itself performed the required computational sequences by interpreting simple commands with parameters. The extensive documentation, containing a detailed description of the algorithmic approaches and many examples, marked the starting point of an intensive research effort on algorithms and numerical software, which culminated in the development of the
high-performance control and systems library SLICOT (Benner et al. 1999; Huffel et al. 2004). In what follows, we highlight the main achievements along this development process.

The direct successor of FASP was the Variable Dimension Automatic Synthesis Program (VASP) (implemented in Fortran IV on the IBM 360) (White and Lee 1971), while a further development was ORACLS (Armstrong 1978), which included several routines from the newly developed eigenvalue package EISPACK (Garbow et al. 1977; Smith et al. 1976) as well as solvers for linear (Lyapunov, Sylvester) and quadratic (Riccati) matrix equations. From this point on, the mainstream development of numerical algorithms for linear system analysis and synthesis closely followed the development of algorithms and software for numerical linear algebra. A common feature of all subsequent developments was the extensive use of robust linear algebra software, with the Basic Linear Algebra Subprograms (BLAS) (Lawson et al. 1979) and the Linear Algebra Package (LINPACK) for solving linear systems (Dongarra et al. 1979). Several control libraries were developed almost simultaneously, relying on the robust numerical linear algebra core software formed of BLAS, LINPACK, and EISPACK. Notable examples are RASP (based partly on VASP and ORACLS) (Grübel 1983), developed originally by the University of Bochum and later by the German Aerospace Center (DLR); BIMAS (Varga and Sima 1985) and BIMASC (Varga and Davidoviciu 1986), two Romanian initiatives; and SLICOT (Boom et al. 1991), a Benelux initiative of several universities jointly with the Numerical Algorithms Group (NAG).

The last development phase was marked by the availability of the Linear Algebra Package (LAPACK) (Anderson et al. 1992), whose declared goal was to make the widely used EISPACK and LINPACK libraries run efficiently on shared-memory vector and parallel processors. To minimize the development efforts, several active research teams from Europe started, in the framework of the NICONET project, a concentrated effort to develop a high-performance numerical software library for CACSD as a new, significantly extended version of the original SLICOT. The goals of the new library were to cover the main computational needs of CACSD, by relying exclusively on LAPACK and BLAS, and to guarantee numerical performance similar to that of the LAPACK routines. The software development used rigorous standards for implementation in Fortran 77, modularization, testing, and documentation (similar to those used in LAPACK). The development of the latest versions of RASP and SLICOT eliminated practically any duplication of effort and led to a library which contained the best software from RASP, SLICOT, BIMAS, and BIMASC. The current version of SLICOT is fully maintained by the NICONET association (http://www.niconet-ev.info/en/) and serves as the basis for implementing advanced computational functions for CACSD in interactive environments such as MATLAB (http://www.mathworks.com), Maple (http://www.maplesoft.com/products/maple/), Scilab (http://www.scilab.org/), and Octave (http://www.gnu.org/software/octave/).

Interactive Tools

Early experiments during 1970–1985 included the development of several interactive CACSD tools employing menu-driven interaction, question-answer dialogues, or command languages. The April 1982 special issue of the IEEE Control Systems Magazine was dedicated to CACSD environments and presented software summaries of 20 interactive CACSD packages. This development period was marked by the establishment of new standards for programming languages (Fortran 77, C), the availability of high-quality numerical software libraries (BLAS, EISPACK, LINPACK, ODEPACK), the transition from mainframe computers to minicomputers, and finally to the nowadays-ubiquitous personal computers as computing platforms, spectacular developments in graphical display technologies, and the application of sound programming paradigms (e.g., strong data typing).

A remarkable event in this period was the development of MATLAB, a command-language-based interactive matrix laboratory (Moler 1980).
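The numerical kernels surveyed in this section, such as sampled-data discretization via the matrix exponential and Lyapunov-equation solvers, are exactly the kind of computation that these libraries and interactive matrix environments made accessible. The following Python/NumPy fragment is an illustration only: the plant matrix and step size are hypothetical, and the eigendecomposition-based exponential and Kronecker-product Lyapunov solve are pedagogical stand-ins for the far more robust algorithms (e.g., scaling-and-squaring, Bartels-Stewart) used in production codes such as SLICOT and LAPACK.

```python
import numpy as np

# Hypothetical stable 2nd-order plant x' = A x (eigenvalues -1 and -2).
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
n = A.shape[0]

# Sampled-data discretization Ad = exp(A*h), computed here via an
# eigendecomposition (assumes A is diagonalizable; illustration only).
h = 0.1
w, V = np.linalg.eig(A)
Ad = (V @ np.diag(np.exp(w * h)) @ np.linalg.inv(V)).real

# Continuous-time Lyapunov equation A X + X A^T = -Q, rewritten as a
# linear system using Kronecker products (row-major vectorization).
Q = np.eye(n)
M = np.kron(A, np.eye(n)) + np.kron(np.eye(n), A)
X = np.linalg.solve(M, -Q.flatten()).reshape(n, n)
```

Since A is Hurwitz, X is symmetric positive definite and all eigenvalues of Ad lie strictly inside the unit circle, confirming that the sampled model inherits the stability of the continuous one.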
The original version of MATLAB was written in Fortran 77. It was primarily intended as a student teaching tool and provided interactive access to selected subroutines from LINPACK and EISPACK. The tool circulated for a while in the public domain, and its high flexibility was soon recognized. Several CACSD-oriented commercial clones were implemented in the C language, the most important among them being MATRIXx (Walker et al. 1982) and PC-MATLAB (Moler et al. 1985).

The period after 1985 until around 2000 can be seen as a consolidation and expansion period for many commercial and noncommercial tools. In an inventory of CACSD-related software issued by the Benelux Working Group on Software (WGS) under the auspices of the IEEE Control Systems Society, there were in 1992 in active development 70 stand-alone CACSD packages, 21 tools based on or similar to MATLAB, and 27 modelling/simulation environments. It is interesting to look more closely at the evolution of the two main players, MATRIXx and MATLAB, which took place under harshly competitive conditions.

MATRIXx, with its main components Xmath, SystemBuild, and AutoCode, had for many years a leading role (especially among industrial customers), excelling with a rich functionality in domains such as system identification, control system synthesis, model reduction, modelling, simulation, and code generation. After 2003, MATRIXx (http://www.ni.com/matrixx/) became a product of the National Instruments Corporation and complements its main product family LabVIEW, a visual programming language-based system design platform and development environment (http://www.ni.com/labview).

MATLAB gained broad academic acceptance by integrating many new methodological developments in the control field into several control-related toolboxes. MATLAB also evolved into a powerful programming language, which allows easy object-oriented manipulation of different system descriptions via operator overloading. At present, the CACSD tools of MATLAB and Simulink represent the industrial and academic standard for CACSD. The existing CACSD tools are constantly extended and enriched with new model classes, new computational algorithms (e.g., structure-exploiting eigenvalue computations based on SLICOT), dedicated graphical user interfaces (e.g., for tuning of PID controllers or control-related visualizations), advanced robust control system synthesis, etc. Also, many third-party toolboxes contribute to the wide usage of this tool.

Basic CACSD functionality incorporating symbolic processing techniques and higher-precision computations is available in the Maple product MapleSim Control Design Toolbox as well as in the Mathematica Control Systems product. Free alternatives to MATLAB are the MATLAB-like environments Scilab, a French initiative pioneered by INRIA, and Octave, which has recently added some CACSD functionality.

Summary and Future Directions

The development and maintenance of integrated CACSD environments, which provide support for all aspects of the CACSD cycle such as modelling, design, and simulation, involve sustained, strongly interdisciplinary efforts. Therefore, CACSD tool development activities must rely on the expertise of many professionals covering such diverse fields as control system engineering, programming languages and techniques, man-machine interaction, numerical methods in linear algebra and control, optimization, computer visualization, and model building techniques. This may explain why currently only a few of the commercial developments of prior years are still in use and actively maintained/developed. Unfortunately, the number of actively developed noncommercial alternative products is even lower. The dominance of MATLAB as a de facto standard for both industrial and academic usage of integrated tools covering all aspects of the broader area of computer-aided control engineering (CACE) cannot be overlooked.
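The operator-overloading idea mentioned above, i.e., manipulating system descriptions with ordinary arithmetic operators, can be illustrated outside MATLAB as well. The sketch below is a hypothetical minimal Python class (not the implementation of any actual toolbox) that overloads * to form the series interconnection of two state-space models, following the usual sys1*sys2 convention in which the output of the second system feeds the first:

```python
import numpy as np

class StateSpace:
    """Minimal LTI state-space model: x' = A x + B u, y = C x + D u."""
    def __init__(self, A, B, C, D):
        self.A, self.B, self.C, self.D = (np.atleast_2d(np.asarray(M, float))
                                          for M in (A, B, C, D))

    def __mul__(self, other):
        # Series connection self * other: the output of `other` drives `self`.
        n1, n2 = self.A.shape[0], other.A.shape[0]
        A = np.block([[self.A, self.B @ other.C],
                      [np.zeros((n2, n1)), other.A]])
        B = np.vstack([self.B @ other.D, other.B])
        C = np.hstack([self.C, self.D @ other.C])
        D = self.D @ other.D
        return StateSpace(A, B, C, D)

# Two first-order lags 1/(s+1) and 1/(s+2), connected in series.
sys1 = StateSpace([[-1.0]], [[1.0]], [[1.0]], [[0.0]])
sys2 = StateSpace([[-2.0]], [[1.0]], [[1.0]], [[0.0]])
series = sys1 * sys2
```

The product sys1 * sys2 stacks the two state vectors, so the A matrix of the interconnection carries the poles of both subsystems (here −1 and −2); this is the kind of compositional model manipulation that operator overloading makes natural.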
The new trends in CACSD are partly related to handling more complex applications, involving time-varying (e.g., periodic, multi-rate sampled-data, and differential algebraic) linear dynamic systems, nonlinear systems with many parametric uncertainties, and large-scale models (e.g., originating from the discretization of PDEs). To address many computational aspects of model building (e.g., model reduction of large-order systems), optimization-based robust controller tuning using multiple-model approaches, or optimization-based robustness assessment using global-optimization techniques, parallel computation techniques allow substantial savings in computational time and facilitate addressing computational problems for large-scale systems. A topic which needs further research is the exploitation of the benefits of combining numerical and symbolic computations (e.g., in model building and manipulation).

Cross-References

▶ Interactive Environments and Software Tools for CACSD
▶ Model Building for Control System Synthesis
▶ Model Order Reduction: Techniques and Tools
▶ Multi-domain Modeling and Simulation
▶ Optimization-Based Control Design Techniques and Tools
▶ Robust Synthesis and Robustness Analysis Techniques and Tools
▶ Validation and Verification Techniques and Tools

Recommended Reading

The historical development of CACSD concepts and techniques was the subject of several articles in the reference works Rimvall and Jobling (1995) and Schmid (2002). A selection of papers on numerical algorithms underlying the development of CACSD appeared in Patel et al. (1994). The special issue No. 2/2004 of the IEEE Control Systems Magazine on Numerical Awareness in Control presents several surveys on different aspects of developing numerical algorithms and software for CACSD.

The main trends over the last three decades in CACSD-related research can be followed in the programs/proceedings of the biannual IEEE Symposia on CACSD from 1981 to 2013 (partly available at http://ieeexplore.ieee.org) as well as of the triennial IFAC Symposia on CACSD from 1979 to 2000. Additional information can be found in several CACSD-focused survey articles and special issues (e.g., No. 4/1982; No. 2/2000) of the IEEE Control Systems Magazine.

Bibliography

Anderson E, Bai Z, Bishop J, Demmel J, Du Croz J, Greenbaum A, Hammarling S, McKenney A, Ostrouchov S, Sorensen D (1992) LAPACK user's guide. SIAM, Philadelphia
Armstrong ES (1978) ORACLS – a system for linear-quadratic Gaussian control law design. Technical paper 1106 96-1, NASA
Augustin DC, Strauss JC, Fineberg MS, Johnson BB, Linebarger RN, Sansom FJ (1967) The SCi continuous system simulation language (CSSL). Simulation 9:281–303
Benner P, Mehrmann V, Sima V, Van Huffel S, Varga A (1999) SLICOT – a subroutine library in systems and control theory. In: Datta BN (ed) Applied and computational control, signals and circuits, vol 1. Birkhäuser, Boston, pp 499–539
Dongarra JJ, Moler CB, Bunch JR, Stewart GW (1979) LINPACK user's guide. SIAM, Philadelphia
Elmquist H et al (1997) Modelica – a unified object-oriented language for physical systems modeling (version 1). http://www.modelica.org/documents/Modelica1.pdf
Elmqvist H (1978) A structured model language for large continuous systems. PhD thesis, Department of Automatic Control, Lund University, Sweden
Garbow BS, Boyle JM, Dongarra JJ, Moler CB (1977) Matrix eigensystem routines – EISPACK guide extension. Springer, Heidelberg
Grace ACW (1991) SIMULAB, an integrated environment for simulation and control. In: Proceedings of the American Control Conference, Boston, pp 1015–1020
Grübel G (1983) Die regelungstechnische Programmbibliothek RASP. Regelungstechnik 31:75–81
Kalman RE, Englar TS (1966) A user's manual for the automatic synthesis program (program C). Technical report CR-475, NASA
Lawson CL, Hanson RJ, Kincaid DR, Krogh FT (1979) Basic linear algebra subprograms for Fortran usage. ACM Trans Math Softw 5:308–323
128 Consensus of Complex Multi-agent Systems
Mitchel EEL, Gauthier JS (1976) Advanced continuous simulation language (ACSL). Simulation 26:72–78
Moler CB (1980) MATLAB users' guide. Technical report, Department of Computer Science, University of New Mexico, Albuquerque
Moler CB, Little J, Bangert S, Kleinman S (1985) PC-MATLAB, users' guide, version 2.0. Technical report, The MathWorks Inc., Sherborn
Nilsen RN, Karplus WJ (1974) Continuous-system simulation languages: a state-of-the-art survey. Math Comput Simul 16:17–25. doi:10.1016/S0378-4754(74)80003-0
Patel RV, Laub AJ, Van Dooren P (eds) (1994) Numerical linear algebra techniques for systems and control. IEEE, Piscataway
Rimvall C, Jobling CP (1995) Computer-aided control systems design. In: Levine WS (ed) The control handbook. CRC, Boca Raton, pp 429–442
Schmid C (2002) Computer-aided control system engineering tools. In: Unbehauen H (ed) Control systems, robotics and automation. http://www.eolss.net/outlinecomponents/Control-Systems-Robotics-Automation.aspx
Shah CS, Floyd MA, Lehman LL (1985) MATRIXx: control design and model building CAE capabilities. In: Jamshidi M, Herget CJ (eds) Advances in computer aided control systems engineering. North-Holland/Elsevier, Amsterdam, pp 181–207
Smith BT, Boyle JM, Dongarra JJ, Garbow BS, Ikebe Y, Klema VC, Moler CB (1976) Matrix eigensystem routines – EISPACK guide. Lecture notes in computer science, vol 6, 2nd edn. Springer, Berlin/New York
van den Boom A, Brown A, Geurts A, Hammarling S, Kool R, Vanbegin M, Van Dooren P, Van Huffel S (1991) SLICOT, a subroutine library in control and systems theory. In: Preprints of 5th IFAC/IMACS symposium of CADCS'91, Swansea. Pergamon Press, Oxford, pp 89–94
Van Huffel S, Sima V, Varga A, Hammarling S, Delebecque F (2004) High-performance numerical software for control. Control Syst Mag 24:60–76
Varga A, Davidoviciu A (1986) BIMASC – a package of Fortran subprograms for analysis, modelling, design and simulation of control systems. In: Hansen NE, Larsen PM (eds) Preprints of 3rd IFAC/IFIP International Symposium on Computer Aided Design in Control and Engineering (CADCE'85), Copenhagen. Pergamon Press, Oxford, pp 151–156
Varga A, Sima V (1985) BIMAS – a basic mathematical package for computer aided systems analysis and design. In: Gertler J, Keviczky L (eds) Preprints of 9th IFAC World Congress, Hungary, vol 8, pp 202–207
Walker R, Gregory C, Shah S (1982) MATRIXx: a data analysis, system identification, control design and simulation package. Control Syst Mag 2:30–37
White JS, Lee HQ (1971) Users manual for the variable dimension automatic synthesis program (VASP). Technical memorandum TM X-2417, NASA

Consensus of Complex Multi-agent Systems

Fabio Fagnani
Dipartimento di Scienze Matematiche 'G.L. Lagrange', Politecnico di Torino, Torino, Italy

Abstract

This entry provides a broad overview of the basic elements of consensus dynamics. It describes the classical Perron-Frobenius theorem, which provides the main theoretical tool to study the convergence properties of such systems. Classes of consensus models that are treated include simple random walks on grid-like graphs and on graphs with a bottleneck, consensus on graphs with intermittently, randomly appearing edges between nodes (gossip models), and models with nodes that do not modify their state over time (stubborn agent models). Applications to cooperative control, sensor networks, and socioeconomic models are mentioned.

Keywords

Consensus; Electrical networks; Gossip model; Spectral gap; Stubborn agents

Multi-agent Systems and Consensus

Multi-agent systems constitute one of the fundamental paradigms of science and technology of the present century (Castellano et al. 2009; Strogatz 2003). The main idea is that of creating complex dynamical evolutions from the interactions of many simple units. Such collective behaviors are quite evident in biological and social systems and were indeed considered in earlier times. More recently, the digital revolution and the miniaturization in electronics have made possible the creation of man-made complex architectures of interconnected simple devices (computers, sensors, cameras). Moreover, the creation of the Internet has opened a totally
Consensus of Complex Multi-agent Systems 129

new form of social and economic aggregation. This has strongly pushed towards a systematic and deep study of multi-agent dynamical systems. Mathematically, they typically consist of a graph where each node possesses a state variable; states are coupled at the dynamical level through dependences determined by the edges of the graph. One of the challenging problems in the field of multi-agent systems is to analyze the emergence of complex collective phenomena from the interactions of the units, which are typically quite simple. Complexity is typically the outcome of the topology and the nature of the interconnections.

Consensus dynamics (also known as average dynamics) (Carli et al. 2008; Jadbabaie et al. 2003) is one of the most popular and simplest multi-agent dynamics. One convenient way to introduce it is with the language of the social sciences. Imagine that a number of independent units each possess a piece of information represented by a real number; for instance, such a number can represent an opinion on a given fact. Units interact and change their opinions by averaging with the opinions of other units. Under certain assumptions, this will lead the whole community to converge to a consensus opinion which takes into consideration all the initial opinions of the agents. In the social sciences, empirical evidence (Galton 1907) has shown how such an aggregate opinion may give a very good estimate of unknown quantities: this phenomenon has been proposed in the literature as the wisdom of crowds (Surowiecki 2004).

Consensus Dynamics, Graphs, and Stochastic Matrices

Mathematically, consensus dynamics are special linear dynamical systems of the type

    x(t + 1) = P x(t)    (1)

where x(t) ∈ R^V and P ∈ R^(V×V) is a stochastic matrix (i.e., a matrix with nonnegative elements such that every row sums to 1). V represents the finite set of units (agents) in the network, and x(t)_v is to be interpreted as the state (opinion) of agent v at time t. Equation (1) implies that the states of the agents at time t + 1 are convex combinations of the components of x(t): this motivates the term averaging dynamics. Stochastic matrices owe their name to their use in probability: the term P_vw can be interpreted as the probability of making a jump in the graph from state v to state w. In this way one constructs what is called a random walk on the graph G.

The network structure is hidden in the nonzero pattern of P. Indeed, we can associate to P a graph G_P = (V, E_P) where the set of edges is given by E_P := {(u, v) ∈ V × V | P_uv > 0}. Elements of E_P represent the communication edges among the units: if (u, v) ∈ E_P, it means that unit u has access to the state of unit v. Denote by 1 ∈ R^V the all-ones vector. Notice that P1 = 1: this shows that once the states of the units are at consensus, they will no longer evolve. Will the dynamics always converge to a consensus point?

Remarkably, some of the key properties of P responsible for the transient and asymptotic behavior of the linear system (1) are determined by the connectivity properties of the underlying graph G_P. We recall that, given two vertices u, v ∈ V, a path (of length l) from u to v in G_P is any sequence of vertices u = u_1, u_2, ..., u_(l+1) = v such that (u_i, u_(i+1)) ∈ E_P for every i = 1, ..., l. G_P is said to be strongly connected if for any pair of vertices u ≠ v in V there is a path in G_P connecting u to v. The period of a node u is defined as the greatest common divisor of the lengths of all closed paths from u to u. In a strongly connected graph, all nodes have the same period, and the graph is called aperiodic if this period is 1. If x is a vector, we will use the notation x* to denote its transpose. If A is a finite set, |A| denotes the number of elements of A. The following classical result holds true (Gantmacher 1959):

Theorem 1 (Perron-Frobenius) Assume that P ∈ R^(V×V) is such that G_P is strongly connected and aperiodic. Then,
1. 1 is an algebraically simple eigenvalue of P.
2. There exists a (unique) probability vector π ∈ R^V (π_v > 0 for all v and Σ_v π_v = 1) which is a left eigenvector for P, namely, π*P = π*.
3. All the remaining eigenvalues of P are of modulus strictly less than 1.

A straightforward linear algebra consequence of this result is that P^t → 1π* for t → +∞. This yields

    lim_(t→+∞) x(t) = lim_(t→+∞) P^t x(0) = 1(π* x(0))    (2)

All agents' states thus converge to the common value π* x(0), called the consensus point, which is a convex combination of the initial states with weights given by the components of the invariant probability π.

If π is the uniform vector (i.e., π_v = |V|^(-1) for all units v), the common asymptotic value is simply the arithmetic mean of the initial states: all agents contribute equally to the final common state. A special case when this happens is when P is symmetric. The distributed computation of the arithmetic mean is an important step in solving estimation problems for sensor networks. As a specific example, consider the situation where there are N sensors deployed in a certain area and each of them makes a noisy measurement of a physical quantity x. Let y_v = x + ω_v be the measurement obtained by sensor v, where ω_v is a zero-mean Gaussian noise. It is well known that if the noises are independent and identically distributed, the optimal mean square estimator of the quantity x given the entire set of measurements {y_v} is exactly given by x̂ = N^(-1) Σ_v y_v. Another field of application is the control of cooperative autonomous vehicles (Fax and Murray 2004; Jadbabaie et al. 2003).

Basic linear algebra allows us to study the rate of convergence to consensus. Indeed, if G_P is strongly connected and aperiodic, the matrix P has all its eigenvalues in the unit ball: 1 is the only eigenvalue with modulus equal to 1, while all the others have modulus strictly less than one. If we denote by λ_2 < 1 the largest modulus of such eigenvalues (different from 1), we can show that x(t) − 1(π* x(0)) converges exponentially to 0 as λ_2^t. In the following, we will briefly refer to λ_2 as the second eigenvalue of P.

Examples and Large-Scale Analysis

In this section, we present some classical examples. Consider a strongly connected graph G = (V, E). The adjacency matrix of G is a square matrix A_G ∈ {0, 1}^(V×V) such that (A_G)_uv = 1 iff (u, v) ∈ E. G is said to be symmetric if A_G is symmetric. Given a symmetric graph G = (V, E), we can consider the stochastic matrix P given by P_uv = d_u^(-1) (A_G)_uv, where d_u = Σ_v (A_G)_uv is the degree of node u. P is called the simple random walk (SRW) on G: each agent gives the same weight to the states of its neighbors. Clearly, G_P = G. A simple check shows that π_v = d_v/|E| is the invariant probability for P. The consensus point is given by

    π* x(0) = |E|^(-1) Σ_v d_v x(0)_v

Each node contributes with its initial state to this consensus with a weight proportional to the degree of the node. Notice that the SRW P is symmetric iff the graph is regular, namely, all units have the same degree.

We now present a number of classical examples based on families of graphs with a larger and larger number of nodes N. In this setting, it is particularly relevant to understand the behavior of the second eigenvalue λ_2 as a function of N. Typically, one considers ε > 0 fixed and solves the equation λ_2^τ = ε. The solution τ = (ln λ_2^(-1))^(-1) ln ε^(-1) will be called the convergence time: it essentially represents the time needed to shrink the distance to consensus by a factor ε. The dependence of λ_2 on N will then make τ a function of N as well. In the sequel, we will investigate such dependence for SRWs on certain classical families of graphs.

Example 1 (SRW on a complete graph) Consider the complete graph on the set V: K_V := (V, V × V) (self-loops are also present). The SRW on K_V is simply given by P = N^(-1) 11*, where N = |V|. Clearly, π = N^(-1) 1. The eigenvalues of P are 1 with multiplicity 1 and 0 with multiplicity N − 1. Therefore, λ_2 = 0. Consensus in this case

is achieved in just one step: x(t) = N^(-1) 11* x(0) for all t ≥ 1.

Example 2 (SRW on a cycle graph) Consider the symmetric cycle graph C_N = (V, E) where V = {0, ..., N − 1} and E = {(v, v + 1), (v + 1, v)}, where the sum is mod N. The graph C_N is clearly strongly connected and is also aperiodic if N is odd. The corresponding SRW P has eigenvalues

    λ_k = cos(2πk/N), k = 0, ..., N − 1

Therefore, if N is odd, the second eigenvalue is given by

    λ_2 = cos(2π/N) = 1 − 2π² N^(-2) + o(N^(-2)) for N → +∞    (3)

while the corresponding convergence time is given by

    τ = (ln λ_2^(-1))^(-1) ln ε^(-1) ≈ N² for N → +∞

Example 3 (SRW on toroidal grids) The toroidal d-grid C_n^d is formally obtained as a product of cycle graphs. The number of nodes is N = n^d. It can be shown that the convergence time behaves as

    τ ≈ N^(2/d) for N → +∞

The convergence time exhibits a slower growth in N as the dimension d of the grid increases: this is intuitive, since the increase in d determines a better connectivity of the graph and, consequently, a faster diffusion of information.

For a general stochastic matrix (even for the SRW on a general graph), the computation of the second eigenvalue is not possible in closed form and can actually be difficult even from a numerical point of view. It is therefore important to develop tools for efficient estimation. One of these is based on the concept of a bottleneck: if a graph can be split into two loosely connected parts, then the consensus dynamics will necessarily exhibit slow convergence.

Formally, given a symmetric graph G = (V, E) and a subset of nodes S ⊆ V, define e_S as the number of edges with at least one node in S and e_(SS^c) as the number of edges with one node in S and the other in its complement S^c. The bottleneck of S in G is defined as Φ(S) = e_(SS^c)/e_S. Finally, the bottleneck ratio of G is defined as

    Φ* := min_(S : e_S/e ≤ 1/2) Φ(S)

where e = |E| is the number of edges of the graph.

Let P be the SRW on G and let λ_2 be its second eigenvalue. Then,

Proposition 1 (Cheeger bound, Levin et al. 2008)

    1 − λ_2 ≤ 2Φ*    (4)

Example 4 (Graphs with a bottleneck) Consider two complete graphs with n nodes connected by just one edge. If S is the set of nodes of one of the two complete graphs, we obtain

    Φ(S) = 1/(n² + 1)

Bound (4) implies that the convergence time is at least of the order of n², in spite of the fact that in each complete graph alone convergence would occur in finite time!

Other Models

The systems studied so far are based on the assumptions that the units all behave the same, share a common clock, and update their states in a synchronous fashion. In this section, we discuss more general models.

Random Consensus Models
Regarding the assumption of synchronicity, it turns out to be unfeasible in many contexts. For instance, in opinion dynamics modeling, it

is not realistic to assume that all interactions happen at the same time: agents are embedded in physical continuous time, and interactions can be imagined to take place at random, for instance, in a pairwise fashion.

One of the most famous random consensus models is the gossip model. Fix a real number q ∈ (0, 1) and a symmetric graph G = (V, E). At every time instant t, an edge (u, v) ∈ E is activated with uniform probability |E|^(-1), and nodes u and v exchange their states and produce new states according to the equations

    x_u(t + 1) = (1 − q) x_u(t) + q x_v(t)
    x_v(t + 1) = q x_u(t) + (1 − q) x_v(t)

The states of the other units remain unchanged. Will these dynamics lead to a consensus? If the same edge is activated at every time instant, consensus will clearly not be achieved. However, it can be shown that, with probability one, consensus will be reached (Boyd et al. 2006).

Consensus Dynamics with Stubborn Agents
We now investigate consensus dynamics models where some of the agents do not modify their own state (stubborn agents). These systems are of interest in socioeconomic models (Acemoglu et al. 2013).

Consider a symmetric connected graph G = (V, E). We assume a splitting V = S ∪ R, with the understanding that the agents in S are stubborn agents that do not change their state, while those in R are regular agents whose states evolve in time according to an SRW consensus dynamics, namely,

    x_u(t + 1) = (1/d_u) Σ_(v∈V) (A_G)_uv x_v(t), ∀ u ∈ R

By assembling the states of the regular and of the stubborn agents in vectors denoted, respectively, as x^R(t) and x^S(t), the dynamics can be recast in matrix form as

    x^R(t + 1) = Q_11 x^R(t) + Q_12 x^S(t)
    x^S(t + 1) = x^S(t)    (5)

It can be proven that Q_11 is asymptotically stable ((Q_11)^t → 0). Hence, x^R(t) → x^R(∞) for t → +∞, with the limit opinions satisfying the relation

    x^R(∞) = Q_11 x^R(∞) + Q_12 x^S(0)    (6)

If we define Ξ := (I − Q_11)^(-1) Q_12, we can write x^R(∞) = Ξ x^S(0). It is easy to see that Ξ has nonnegative elements and that Σ_s Ξ_us = 1 for all u ∈ R: the asymptotic opinions of the regular agents are thus convex combinations of the opinions of the stubborn agents. If all stubborn agents are in the same state x, then consensus is reached by all agents at the point x. However, typically, consensus is not reached in such a system: we will discuss an example below.

There is a useful alternative interpretation of the asymptotic opinions. Interpreting the graph G as an electrical circuit where the edges are unit resistors, relation (6) can be seen as a Laplace-type equation on the graph G with boundary conditions given by assigning the voltage x^S(0) to the stubborn agents. In this way, x^R(∞) can be interpreted as the vector of voltages of the regular agents when the stubborn agents have fixed voltage x^S(0). Thanks to the electrical analogy, we can compute the asymptotic opinions of the agents by computing the voltages at the various nodes of the graph. We propose a concrete application in the following example.

Example 5 (Stubborn agents in a line graph) Consider the line graph L_N = (V, E) where V = {1, 2, ..., N} and E = {(u, u + 1), (u + 1, u) | u = 1, ..., N − 1}. Assume that S = {1, N} and R = V \ S. Consider the graph as an electrical circuit. Replacing the line with a single edge connecting 1 and N having resistance N − 1 and applying Ohm's law, we obtain that the current flowing from 1 to N is equal to Φ = (N − 1)^(-1) [x_N^S(0) − x_1^S(0)]. If we now fix an arbitrary node v ∈ V and apply the same argument to the part of the graph from 1 to v, we obtain that the voltage at v, x_v^R(∞), satisfies the relation x_v^R(∞) − x_1^S(0) = Φ(v − 1). We thus obtain
Control and Optimization of Batch Processes 133

    x_v^R(∞) = x_1^S(0) + ((v − 1)/(N − 1)) [x_N^S(0) − x_1^S(0)].

In Acemoglu et al. (2013), further examples are discussed showing how, because of the topology of the graph, different asymptotic configurations may show up. While in graphs presenting bottlenecks polarization phenomena can be observed, in graphs where the convergence rate is low there will be a typical asymptotic opinion shared by most of the regular agents.

Cross-References

Averaging Algorithms and Consensus
Information-Based Multi-Agent Systems

Bibliography

Acemoglu D, Como G, Fagnani F, Ozdaglar A (2013) Opinion fluctuations and disagreement in social networks. Math Oper Res 38(1):1–27
Boyd S, Ghosh A, Prabhakar B, Shah D (2006) Randomized gossip algorithms. IEEE Trans Inf Theory 52(6):2508–2530
Carli R, Fagnani F, Speranzon A, Zampieri S (2008) Communication constraints in the average consensus problem. Automatica 44(3):671–684
Castellano C, Fortunato S, Loreto V (2009) Statistical physics of social dynamics. Rev Modern Phys 81:591–646
Fax JA, Murray RM (2004) Information flow and cooperative control of vehicle formations. IEEE Trans Autom Control 49(9):1465–1476
Galton F (1907) Vox populi. Nature 75:450–451
Gantmacher FR (1959) The theory of matrices. Chelsea Publishers, New York
Jadbabaie A, Lin J, Morse AS (2003) Coordination of groups of mobile autonomous agents using nearest neighbor rules. IEEE Trans Autom Control 48(6):988–1001
Levin DA, Peres Y, Wilmer EL (2008) Markov chains and mixing times. AMS, Providence
Strogatz SH (2003) Sync: the emerging science of spontaneous order. Hyperion, New York
Surowiecki J (2004) The wisdom of crowds: why the many are smarter than the few and how collective wisdom shapes business, economies, societies and nations. Little, Brown (Italian translation: La saggezza della folla, Fusi Orari, 2007)

Control and Optimization of Batch Processes

Dominique Bonvin
Laboratoire d'Automatique, École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland

Abstract

A batch process is characterized by the repetition of time-varying operations of finite duration. Due to the repetition, there are two independent "time" variables, namely, the run time during a batch and the batch counter. Accordingly, the control and optimization objectives can be defined for a given batch or over several batches. This entry describes the various control and optimization strategies available for the operation of batch processes. These include conventional feedback control, predictive control, iterative learning control, and run-to-run control on the one hand and model-based repeated optimization and model-free self-optimizing schemes on the other.

Keywords

Batch control; Batch process optimization; Dynamic optimization; Iterative learning control; Run-to-run control; Run-to-run optimization

Introduction

Batch processing is widely used in the manufacturing of goods and commodity products, in particular in the chemical, pharmaceutical, and food industries. These industries account for several billion US dollars in annual sales. Batch operation differs significantly from continuous operation. While in continuous operation the process is maintained at an economically desirable operating point, in batch processing the process evolves continuously from an initial to a final time. In the chemical industry, for example, since the design of a continuous plant requires substantial engineering effort, continuous operation is rarely

Control and Optimization of Batch Processes, Fig. 1 (reconstructed schematically from the original 2 × 2 grid):

                           Run-time references             Run-end references
                           y_ref(t) or y_ref[0,t_f]        z_ref
    Online (within-run)    1. Feedback control (FBC):      2. Predictive control (MPC):
                           u_k(t) → y_k(t) → y_k[0,t_f]    u_k(t) → z_pred,k(t)
    Iterative              3. Iterative learning control   4. Run-to-run control (R2R),
    (run-to-run)           (ILC), with run delay:          with run delay:
                           u_k[0,t_f] → y_k[0,t_f]         u_k[0,t_f] = U(π_k) → z_k

Control strategies for batch processes. The strategies are classified according to the control objectives (horizontal division) and the implementation aspect (vertical division). Each objective can be met either online or iteratively over several batches, depending on the type of measurements available. u_k represents the input vector for the kth batch, u_k[0, t_f] the corresponding input trajectories, y_k(t) the run-time outputs measured online, and z_k the run-end outputs available at the final time. FBC stands for "feedback control," MPC for "model predictive control," ILC for "iterative learning control," and R2R for "run-to-run control."

used for low-volume production. Discontinuous operations can be of the batch or semi-batch type. In batch operations, the products to be processed are loaded in a vessel and processed without material addition or removal. This operation permits more flexibility than continuous operation by allowing adjustment of the operating conditions and the final time. Additional flexibility is available in semi-batch operations, where products are continuously added by adjusting the feed rate profile. We use the term batch process to include semi-batch processes.

Batch processes dealing with reaction and separation operations include reaction, distillation, absorption, extraction, adsorption, chromatography, crystallization, drying, filtration, and centrifugation. The operation of batch processes involves recipes developed in the laboratory. A sequence of operations is performed in a prespecified order in specialized process equipment, yielding a fixed amount of product. The sequence of tasks to be carried out on each piece of equipment, such as heating, cooling, reaction, distillation, crystallization, and drying, is predefined. The desired production volume is then achieved by repeating the processing steps on a predetermined schedule.

The main characteristics of batch process operations include the absence of a steady state, the presence of constraints, and the repetitive nature. These characteristics bring both challenges and opportunities to the operation of batch processes (Bonvin 1998). The challenges are related to the fact that the available models are often poor and incomplete, especially since they need to represent a wider range of operating conditions than in the case of continuous processes. Furthermore, although product quality must be controlled, this variable is usually not available online but only at run end. On the other hand, opportunities stem from the fact that industrial chemical processes are often slow, which facilitates larger sampling periods and extensive online computations. In addition, the repetitive nature of batch processes opens the way to run-to-run process improvement (Bonvin et al. 2006). More information on batch processes and their operation can be found in Seborg et al. (2004) and Nagy and Braatz (2003). Next, we will successively address the control and the optimization of batch processes.
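As a toy illustration of such run-to-run improvement (our own sketch, not part of the entry; the plant map, reference, and gains below are invented), the following code applies a simple iterative-learning update, u_(k+1) = u_k + γ e_k, to a hypothetical linear run-time map y_k = G u_k: the tracking error measured during run k corrects the input profile applied in run k + 1.

```python
import numpy as np

# Hypothetical linear batch: the run-time output profile is y_k = G u_k,
# with G the lower-triangular (causal) impulse-response matrix of an
# invented plant; none of these numbers come from the entry.
T = 20
h = [0.5, 0.3, 0.1]                        # assumed impulse response
G = sum(np.diag(np.full(T - i, h[i]), -i) for i in range(len(h)))

y_ref = np.sin(np.linspace(0.0, np.pi, T))  # desired run-time profile
u = np.zeros(T)                             # run 1: no prior knowledge
gamma = 1.0                                 # learning gain

for k in range(200):                        # one update per completed run
    y = G @ u                               # execute run k
    e = y_ref - y                           # error measured during run k
    u = u + gamma * e                       # input profile for run k+1

print(np.max(np.abs(y_ref - G @ u)) < 1e-6)
```

Convergence here follows from the contraction ||I − γG|| < 1, so the tracking error shrinks from run to run even though no feedback acts within a run.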

Control of Batch Processes

Control of batch processes differs from control of continuous processes in two main ways. First, since batch processes have no steady-state operating point, at least some of the set points are time-varying profiles. Second, batch processes are repeated over time and are characterized by two independent variables, the run time t and the run counter k. The independent variable k provides additional degrees of freedom for meeting the control objectives when these objectives do not necessarily have to be completed in a single batch but can be distributed over several successive batches. This situation brings into focus the concept of run-end outputs, which need to be controlled but are only available at the completion of the batch. The most common run-end output is product quality. Consequently, the control of batch processes encompasses four different strategies (Fig. 1):

1. Online control of run-time outputs. This control approach is similar to that used in continuous processing. However, although some controlled variables, such as temperature in isothermal operation, remain constant, the key process characteristics, such as process gain and time constants, can vary considerably because operation occurs along state trajectories rather than at a steady-state operating point. Hence, adaptation in run time t is needed to handle the expected variations. Feedback control is implemented using PID techniques or more sophisticated alternatives (Seborg et al. 2004).

2. Online control of run-end outputs. In this case it is necessary to predict the run-end outputs z based on measurements of the run-time outputs y. Model predictive control (MPC) is well suited to this task (Nagy and Braatz 2003). However, the process models available for prediction are often simplified and thus of limited accuracy.

3. Iterative control of run-time outputs. The manipulated variable profiles can be generated using iterative learning control (ILC), which exploits information from previous runs (Moore 1993). This strategy exhibits the limitations of open-loop control with respect to the current run, in particular the fact that there is no feedback correction for run-time disturbances. Nevertheless, this scheme is useful for generating a time-varying feedforward input term.

4. Iterative control of run-end outputs. In this case the input profiles are parameterized as u_k[0, t_f] = U(π_k) using the input parameters π_k. The batch process is thus seen as a static map between the input parameters π_k and the run-end outputs z_k (Francois et al. 2005).

It is also possible to combine online and run-to-run control for both y and z. However, in such a combined scheme, care must be taken so that the online and run-to-run corrective actions do not oppose each other. Stability during run time and convergence in run index must be guaranteed (Srinivasan and Bonvin 2007a).

Optimization of Batch Processes

The process variables undergo significant changes during batch operation. Hence, the major objective in batch operations is not to keep the system at optimal constant set points but rather to determine input profiles that optimize an objective function expressing the system performance.

Problem Formulation
A typical optimization problem in the context of batch processes is

    min_(u_k[0,t_f])  J_k = φ(x_k(t_f)) + ∫_0^(t_f) L(x_k(t), u_k(t), t) dt    (1)

subject to

    ẋ_k(t) = F(x_k(t), u_k(t)),   x_k(0) = x_(k,0)    (2)

    S(x_k(t), u_k(t)) ≤ 0,   T(x_k(t_f)) ≤ 0,    (3)


where x represents the state vector, J the scalar cost to be minimized, S the run-time constraints, T the run-end constraints, and t_f the final time. In constrained optimal control problems, the solution often lies on the boundary of the feasible region. Batch processes involve run-time constraints on inputs and states as well as run-end constraints.

Optimization Strategies
As can be seen from the cost objective (1), optimization requires information about the complete run and thus cannot be implemented in real time using only online measurements. Some information regarding the future of the run is needed in the form of either a process model capable of prediction or measurements from previous runs. Accordingly, measurement-based optimization methods can be classified depending on whether or not a process model is used explicitly for implementation, as illustrated in Fig. 2 and discussed next:

1. Online explicit optimization. This approach is similar to model predictive control (Nagy and Braatz 2003). Optimization uses a process model explicitly and is repeated whenever a new set of measurements becomes available. This scheme involves two steps, namely, updating the initial conditions for the subsequent optimization (and optionally the parameters of the process model) and numerical optimization based on the updated process model (Abel et al. 2000). Since both steps are repeated as measurements become available, the procedure is also referred to as repeated online optimization. The weakness of this method is its reliance on the model; if the model is not updated, its accuracy plays a crucial role. However, when the model is updated, there is a conflict between parameter identification and optimization, since parameter identification requires persistency of excitation, that is, the inputs must be sufficiently varied to uncover the unknown parameters, a condition that is usually not satisfied when near-optimal inputs are applied. Note that, instead of computing the input u_k[t, t_f], it is also possible to use a receding horizon and compute only u_k[t, t + T], with T the control horizon (Abel et al. 2000).

2. Online implicit optimization. In this scenario, measurements are used to update the inputs directly, that is, without the intermediary of a process model. Two classes of techniques can be identified. In the first class, an update law that approximates the optimal solution

Control and Optimization of Batch Processes, Fig. 2 (reconstructed schematically from the original 2 × 2 grid):

                           Explicit optimization           Implicit optimization
                           (with process model)            (without process model)
    Online (within-run)    1. Repeated online              2. Online input update using
                           optimization:                   measurements:
                           y_k[0,t] → (EST) → x_k(t)       y_k(t) → approximation of the
                           → (OPT) → u*_k[t,t_f];          optimal solution / NCO
                           repeat online                   prediction → u*_k(t)
    Iterative              3. Repeated run-to-run          4. Run-to-run input update
    (run-to-run)           optimization:                   using measurements:
                           y_k[0,t_f] → (IDENT) → θ_k      y_k[0,t_f] → NCO evaluation
                           → (OPT) → u*_(k+1)[0,t_f];      → u*_(k+1)[0,t_f];
                           repeat with run delay           repeat with run delay

Optimization strategies for batch processes. The strategies are classified according to whether or not a process model is used for implementation (horizontal division). Furthermore, each class can be implemented either online or iteratively over several runs (vertical division). EST stands for "estimation," IDENT for "identification," OPT for "optimization," and NCO for "necessary conditions of optimality."
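As a toy illustration of the fourth quadrant of Fig. 2 (run-to-run input update using only run-end measurements; the map and all numbers below are invented, not from the entry), a scalar input parameter π_k is corrected after each batch with a simple integral update until the run-end output meets its reference:

```python
# Static map from the input parameter pi_k to the run-end output z_k.
# The controller never sees this model; it only measures z_k at run end.
def batch_run(pi):
    return 2.0 * pi - 0.5 * pi ** 2      # invented concave "yield" curve

z_ref = 1.5                               # run-end reference (e.g., a quality target)
pi_k, gain = 0.2, 0.6                     # initial guess and run-to-run gain

for k in range(50):                       # one correction per completed batch
    z_k = batch_run(pi_k)                 # run-end measurement of batch k
    pi_k = pi_k + gain * (z_ref - z_k)    # integral run-to-run update

print(abs(batch_run(pi_k) - z_ref) < 1e-6)
```

The update converges here because the local sensitivity of the map near the solution keeps the run-to-run recursion contractive; in practice the gain would have to be chosen from such sensitivity information.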

is sought. For example, a neural network is trained with data corresponding to optimal behavior for various uncertainty realizations and used to update the inputs (Rahman and Palanki 1996). The second class of techniques relies on transforming the optimization problem into a control problem that enforces the necessary conditions of optimality (NCO) (Srinivasan and Bonvin 2007b). The NCO involve constraints that need to be made active and sensitivities that need to be pushed to zero. Since some of these NCO are evaluated at run time and others at run end, the control problem involves both run-time and run-end outputs. The main issue is the measurement or estimation of the controlled variables, that is, the constraints and sensitivities that constitute the NCO.

3. Iterative explicit optimization. The steps followed in run-to-run explicit optimization are the same as in online explicit optimization. However, there is substantially more data available at the end of the run as well as sufficient computational time to refine the model by updating its parameters and, if needed, its structure. Furthermore, data from previous runs can be collected for model update (Rastogi et al. 1992). As with online explicit optimization, this approach suffers from the conflict between estimation and optimization.

4. Iterative implicit optimization. In this scenario, the optimization problem is transformed into a control problem, for which the control approaches in the second row of Fig. 1 are used to meet the run-time and run-end objectives (Francois et al. 2005). The approach, which is conceptually simple, might be experimentally expensive since it relies more on data.

These complementary measurement-based optimization strategies can be combined by implementing some aspects of the optimization online and others on a run-to-run basis. For instance, in explicit schemes, the states can be estimated online, while the model parameters can be estimated on a run-to-run basis. Similarly, in implicit optimization, approximate update laws can be implemented online, leaving the responsibility for satisfying terminal constraints and sensitivities to run-to-run controllers.

Summary and Future Directions

Batch processing presents several challenges. Since there is little time for developing appropriate dynamic models, there is a need for improved data-driven control and optimization approaches. These approaches require the availability of online concentration-specific measurements such as chromatographic and spectroscopic sensors, which are not yet readily available in production.

Technically, the main operational difficulty in batch process improvement lies in the presence of run-end outputs such as final quality, which cannot be measured during the run. Although model-based solutions are available, process models in the batch area tend to be poor. On the other hand, measurement-based optimization for a given batch faces the challenge of having to know about the future to act during the batch. Consequently, the main research push is in the area of measurement-based optimization and the use of data from both the current and previous batches for control and optimization purposes.

Cross-References

Industrial MPC of continuous processes
Iterative Learning Control
Multiscale Multivariate Statistical Process Control
Scheduling of Batch Plants
State Estimation for Batch Processes

Bibliography

Abel O, Helbig A, Marquardt W, Zwick H, Daszkowski T (2000) Productivity optimization of an industrial semi-batch polymerization reactor under safety constraints. J Process Control 10(4):351–362
Bonvin D (1998) Optimal operation of batch reactors – a personal view. J Process Control 8(5–6):355–368
Bonvin D, Srinivasan B, Hunkeler D (2006) Control and optimization of batch processes: improvement of process operation in the production of specialty chemicals. IEEE Control Syst Mag 26(6):34–45
Francois G, Srinivasan B, Bonvin D (2005) Use of measurements for enforcing the necessary conditions of optimality in the presence of constraints and uncertainty. J Process Control 15(6):701–712
Moore KL (1993) Iterative learning control for deterministic systems. Advances in industrial control. Springer, London
Nagy ZK, Braatz RD (2003) Robust nonlinear model predictive control of batch processes. AIChE J 49(7):1776–1786
Rahman S, Palanki S (1996) State feedback synthesis for on-line optimization in the presence of measurable disturbances. AIChE J 42:2869–2882
Rastogi A, Fotopoulos J, Georgakis C, Stenger HG (1992) The identification of kinetic expressions and the evolutionary optimization of specialty chemical batch reactors using tendency models. Chem Eng Sci 47(9–11):2487–2492
Seborg DE, Edgar TF, Mellichamp DA (2004) Process dynamics and control. Wiley, New York
Srinivasan B, Bonvin D (2007a) Controllability and stability of repetitive batch processes. J Process Control 17(3):285–295
Srinivasan B, Bonvin D (2007b) Real-time optimization of batch processes by tracking the necessary conditions of optimality. Ind Eng Chem Res 46(2):492–504

Control Applications in Audio Reproduction

Yutaka Yamamoto
Department of Applied Analysis and Complex Dynamical Systems, Graduate School of Informatics, Kyoto University, Kyoto, Japan

Abstract

This entry gives a brief overview of the recent developments in audio sound reproduction via modern sampled-data control theory. We first review basics of the current sound processing technology and then proceed to the new idea derived from sampled-data control theory, which is different from the conventional Shannon paradigm based on the perfect band-limiting hypothesis. The hybrid nature of sampled-data systems provides an optimal platform for dealing with signal processing where the ultimate objective is to reconstruct the original analog signal one started with. After discussing some fundamental problems in the Shannon paradigm, we give our basic problem formulation that can be solved using modern sampled-data control theory. Examples are given to illustrate the results.

Keywords

Digital signal processing; Multirate signal processing; Sampled-data control; Sampling theorem; Sound reconstruction

Introduction: Status Quo

Consider the problem of reproducing sounds from recorded media such as compact discs. The current CD format is recorded at the sampling frequency 44.1 kHz. It is commonly claimed that the highest frequency for human audibility is 20 kHz, whereas the upper bound of reproduction in this format is believed to be half of 44.1 kHz, i.e., 22.05 kHz; hence, this format should have about a 10 % margin over the alleged audible limit of 20 kHz.

Early CD players used to process such digital signals with a simple zero-order hold at this frequency, followed by an analog low-pass filter. This process requires a sharp low-pass characteristic to cut out unnecessary high-frequency components beyond 20 kHz. However, a sharp cutoff low-pass characteristic inevitably requires a high-order filter, which in turn introduces a large amount of phase-shift distortion around the cutoff frequency.

To circumvent this defect, the idea of the oversampling DA converter was introduced, realized by the combination of a digital filter and a low-order analog filter (Zelniker and Taylor 1994). This is based on the following principle: Let {f(nh)}_{n=−∞}^{∞} be a discrete-time signal obtained from a continuous-time signal f(·) by sampling it with sampling period h. The upsampler appends the value 0, M − 1 times, between two adjacent sampling points:
Control Applications in Audio Reproduction, Fig. 1 Upsampler for M = 2
(↑_M w)[k] := { w(ℓ), for k = Mℓ; 0, elsewhere }.  (1)

See Fig. 1 for the case M = 2. This has the effect of making the unit operational time M times faster.

The bandwidth will also be expanded by M times, and the Nyquist frequency (i.e., half the sampling frequency) becomes Mπ/h [rad/sec]. As we see in the next section, the Nyquist frequency is often regarded as the true bandwidth of the discrete-time signal {f(nh)}_{n=−∞}^{∞}. But this upsampling process just inserts zeros between sampling points, and the real information content (the true bandwidth) is not really expanded. As a result, a copy of the frequency content for [0, π/h) appears as a mirror image repeatedly over the frequency range above π/h. This distortion is called imaging. In order to avoid the effect of such mirrored frequency components, one often truncates the frequency components beyond the (original) Nyquist frequency via a digital low-pass filter that has a sharp roll-off characteristic. One can then complete the digital-to-analog (DA) conversion process by postposing a slowly decaying analog filter. This is the idea of an oversampling DA converter (Zelniker and Taylor 1994). The advantage here is that by allowing a much wider frequency range, the final analog filter can be a low-order filter and hence yields a relatively small amount of phase distortion, supported in part by the linear-phase characteristic endowed on the digital filter preceding it.

Signal Reconstruction Problem

As before, consider the sampled discrete-time signal {f(nh)}_{n=−∞}^{∞} obtained from a continuous-time signal f. The main question is how we can recover the original continuous-time signal f(·) from sampled data. This is clearly an ill-posed problem without any assumption on f because there are infinitely many functions that can match the sampled data {f(nh)}_{n=−∞}^{∞}. Hence, one has to impose a reasonable a priori assumption on f to sensibly discuss this problem.

The following sampling theorem gives one answer to this question:

Theorem 1. Suppose that the signal f ∈ L² is perfectly band-limited, in the sense that there exists ω₀ ≤ π/h such that the Fourier transform f̂ of f satisfies

f̂(ω) = 0,  |ω| ≥ ω₀.  (2)

Then

f(t) = Σ_{n=−∞}^{∞} f(nh) · sin(π(t/h − n)) / (π(t/h − n)).  (3)

This theorem states that if the signal f does not contain any high-frequency components beyond the Nyquist frequency π/h, then the original signal f can be uniquely reconstructed from its sampled data {f(nh)}_{n=−∞}^{∞}. On the other hand, if this assumption does not hold, then the result does not necessarily hold. This is easy to see via the schematic representation in Fig. 2.

If we sample the sinusoid in the upper part of Fig. 2, these sampled values turn out to be compatible with another sinusoid of much lower frequency, as the lower part shows. In other words, this sampling period does not have enough resolution to distinguish these two sinusoids. The maximum frequency below which such a phenomenon does not occur is the Nyquist frequency. The sampling theorem above asserts that it is half of the sampling frequency 2π/h, that is, π/h [rad/sec]. In other words, if
we can assume that the original signal contains no frequency components beyond the Nyquist frequency, then one can uniquely reconstruct the original analog signal f from its sampled data {f(nh)}_{n=−∞}^{∞}. On the other hand, if this assumption does not hold, the distortion depicted in Fig. 2 occurs; this is called aliasing.

Control Applications in Audio Reproduction, Fig. 2 Aliasing

This is the content of the sampling theorem. It has been widely accepted as the basis for digital signal processing that bridges analog to digital. Concrete applications such as CD, MP3, or images are based on this principle in one way or another.

Difficulties

However, this paradigm (hereafter the Shannon paradigm) of the perfect band-limiting hypothesis and the resulting sampling theorem renders several difficulties as follows:
• The reconstruction formula (3) is not causal, i.e., one needs future sampled values to reconstruct the present value f(t). One can remedy this defect by allowing a certain amount of delay in reconstruction, but this delay can depend on how fast the formula converges.
• This formula is known to decay slowly; that is, we need many terms to approximate if we use this formula as it is.
• The perfect band-limiting hypothesis is hardly satisfied in reality. For example, for CDs, the Nyquist frequency is 22.05 kHz, and the energy distribution of real sounds often extends way over 20 kHz.
• To remedy this, one often introduces a band-limiting low-pass filter, but it can introduce distortions due to the Gibbs phenomenon, due to a required sharp decay in the frequency domain. See Fig. 3.

This is the Gibbs phenomenon well known in Fourier analysis. A sharp truncation in the frequency domain yields such a ringing effect.

In view of such drawbacks, there has been revived interest in the extension of the sampling theorem in various forms since the 1990s. There is by now a stream of papers that aim at studying signal reconstruction under the assumption of nonideal signal acquisition devices; an excellent survey is given in Unser (2000). In this research framework, the incoming signal is supposed to be acquired through a nonideal analog filter (acquisition device) and sampled, and then the reconstruction process attempts to recover the original signal. The idea is to place the problem into the framework of the (orthogonal or oblique) projection theorem in a Hilbert space (usually L²) and then project the signal space to the subspace generated by the shifted reconstruction functions. It is often required that the process give a consistent result, i.e., if we subject the reconstructed signal to the whole process again, it should yield the same sampled values from which it was reconstructed (Unser and Aldroubi 1994).

In what follows, we take a similar viewpoint, that is, the incoming signals are acquired through a nonideal filter, but develop a methodology different from the projection method, relying on sampled-data control theory.

The Signal Class

We have seen that the perfect band-limiting hypothesis is restrictive. Even if we adopt it, it is a fairly crude model for analog signals to allow for a more elaborate study. Let us now pose the question: What class of functions should we process in such systems?
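Before turning to that question, the aliasing effect of Fig. 2 and the reconstruction formula (3) can be checked numerically. The sketch below is an illustration added here (assuming NumPy; it is not part of the original entry): it truncates the infinite series (3) at ±2000 terms, which is only an approximation — consistent with the slow decay noted among the difficulties above — and verifies that a sinusoid whose frequency exceeds the Nyquist frequency produces exactly the same samples as its low-frequency alias.

```python
import numpy as np

h = 1.0                               # sampling period, so the Nyquist frequency is pi/h [rad/sec]
n = np.arange(-2000, 2001)            # truncation of the infinite index range in (3)

# (i) Sinc reconstruction of a band-limited signal at an off-grid point.
f0 = 0.123                            # cycles per sample, below the Nyquist limit of 0.5
samples = np.sin(2 * np.pi * f0 * n * h)
t = 0.4                               # an off-grid time instant
# np.sinc(x) = sin(pi x) / (pi x), exactly the kernel of formula (3)
f_hat = np.sum(samples * np.sinc(t / h - n))
recon_error = abs(f_hat - np.sin(2 * np.pi * f0 * t))   # small, but nonzero (truncation)

# (ii) Aliasing: shifting the frequency by the sampling rate 1/h leaves every
# sample unchanged, so the two sinusoids are indistinguishable (Fig. 2).
alias_gap = np.max(np.abs(np.cos(2 * np.pi * (f0 + 1 / h) * n * h)
                          - np.cos(2 * np.pi * f0 * n * h)))
```

With these parameters, recon_error is small (and shrinks as the truncation range grows), while alias_gap sits at floating-point level: the samples carry no trace of which of the two sinusoids produced them.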
Control Applications in Audio Reproduction, Fig. 3 Ringing due to the Gibbs phenomenon

Control Applications in Audio Reproduction, Fig. 4 Error system for sampled-data design
Consider the situation where one plays a musical instrument, say, a guitar. A guitar naturally has a frequency characteristic. When one picks a string, it produces a certain tone along with its harmonics, as well as a characteristic transient response. All these are governed by a certain frequency decay curve, demanded by the physical characteristics of the guitar. Let us suppose that such a frequency decay is governed by a rational transfer function F(s), and it is driven by varied exogenous inputs.

Consider Fig. 4. The exogenous analog signal wc ∈ L² is applied to the analog filter F(s). This F(s) is not an ideal filter and hence its bandwidth is not limited below the Nyquist frequency. The signal wc drives F(s) to produce the target analog signal yc, which should be the signal to be reconstructed. It is then sampled by the sampler Sh and becomes the recorded or transmitted digital signal yd. The objective here is to reconstruct the target analog signal yc out of this sampled signal yd. In order to recover the frequency components beyond the Nyquist frequency, one needs a faster sampling period, so we insert the upsampler ↑L to make the sampling period h/L. This upsampled signal is processed by the digital filter K(z) and then becomes a continuous-time signal again by going through the hold device H_{h/L}. It will then be processed by the analog filter P(s) to be smoothed out. The obtained signal is then compared with the delayed analog signal yc(t − mh) to form the delayed error signal ec. The objective is then to make this error ec as small as possible. The reason for allowing the delay e^{−mhs} is to accommodate certain processing delays. This is the idea of the block diagram in Fig. 4.

The performance index we minimize is the induced norm of the transfer operator Tew from wc to ec:

‖Tew‖∞ := sup_{wc ≠ 0} ‖ec‖₂ / ‖wc‖₂.  (4)
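The induced norm (4) is a worst-case gain over all inputs. As a simplified, purely discrete-time illustration (added here for intuition; it is not the hybrid continuous/discrete computation the actual design requires), the ℓ²-induced norm of a stable LTI filter equals its peak frequency-response gain, and inputs concentrated at the peak frequency approach that gain from below. The filter coefficients below are a hypothetical example, assuming NumPy.

```python
import numpy as np

# A hypothetical FIR filter used only for illustration.
hcoef = np.array([0.5, 1.0, 0.5])

# Peak gain sup_w |H(e^{jw})| on a dense frequency grid; for an LTI system
# this peak equals the l2-induced (worst-case) norm, the discrete analog of (4).
w = np.linspace(0.0, np.pi, 4096)
Hw = sum(c * np.exp(-1j * k * w) for k, c in enumerate(hcoef))
peak_gain = np.abs(Hw).max()          # here 2.0, attained at w = 0

# Drive the filter with a long input at the worst-case frequency (w = 0 here)
# and compare energies: the ratio approaches peak_gain but never exceeds it.
u = np.ones(10000)
y = np.convolve(u, hcoef)             # response to the finitely supported input
ratio = np.linalg.norm(y) / np.linalg.norm(u)
```

Here ratio ≈ 1.9999 ≤ peak_gain = 2; no input can do better, which is exactly the worst-case guarantee that the H∞ formulation provides for the error system of Fig. 4.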
In other words, this is the H∞-norm of the sampled-data control system of Fig. 4. Our objective is then to solve the following problem:

Filter Design Problem
Given the system specified by Fig. 4 and a performance level γ > 0, find a filter K(z) such that

‖Tew‖∞ < γ.

This is a sampled-data H∞ (sub)optimal control problem. It can be solved by using the standard solution method for sampled-data control systems (Chen and Francis 1995a; Yamamoto 1999; Yamamoto et al. 2012). The only anomaly here is that the system in Fig. 4 contains a delay element e^{−mhs}, which is infinite dimensional. However, by suitably approximating this delay by a successive series of shift registers, one can convert the problem to an appropriate finite-dimensional discrete-time problem (Yamamoto et al. 1999, 2002, 2012).

This problem setting has the following features:
1. One can optimize the continuous-time performance under the constraint of discrete-time filters.
2. By setting the class of input functions as L² functions band-limited by F(s), one can capture the continuous-time error signal ec and its worst-case norm in the sense of (4).

The first feature is due to the advantage of sampled-data control theory. It is a great advantage of sampled-data control theory that it allows the mixture of continuous- and discrete-time components. This is in marked contrast to the Shannon paradigm, where continuous-time performance is really demanded by the artificial perfect band-limiting hypothesis.

The second feature is an advantage due to H∞ control theory. Naturally, we cannot have access to each individual error signal ec, but we can still control the overall performance from wc to ec in terms of the H∞ norm that guarantees the worst-case performance. This is in clear contrast with the classical case where only a representative response, e.g., the impulse response in the case of H², is targeted. Furthermore, since we can control the continuous-time performance of the worst-case error signal, the present method can indeed minimize (continuous-time) phase errors. This is an advantage usually not possible with conventional methods since they mainly discuss the gain characteristics of the designed filters only. By the very property of minimizing the H∞ norm of the continuous-time error signal ec, the present method can even control the phase errors and yield much less phase distortion even around the cutoff frequency.

Figure 5 shows the response of the proposed sampled-data filter against a rectangular wave, with a suitable first- or second-order analog filter F(s); see Yamamoto et al. (2012) for more details. Unlike Fig. 3, the overshoot is controlled to be minimal.

The present method has been patented (Fujiyama et al. 2008; Yamamoto 2006; Yamamoto and Nagahara 2006) and implemented in sound processing LSI chips as a core technology by Sanyo Semiconductors and successfully used in mobile phones, digital voice recorders, and MP3 players; their cumulative production had exceeded 40 million units as of the end of 2012.

Summary and Future Directions

We have presented the basic ideas of a new signal processing theory derived from sampled-data control theory. The theory has advantages that are not possible with the conventional projection methods, whether based on the perfect band-limiting hypothesis or not.

The application of sampled-data control theory to digital signal processing was first made by Chen and Francis (1995b), with the performance measure in the discrete-time domain; see also Hassibi et al. (2006). The present author and his group have pursued the idea presented in this entry since 1996 (Khargonekar and Yamamoto 1996). See Yamamoto et al. (2012) and references therein. For the background of sampled-data
Control Applications in Audio Reproduction, Fig. 5 Response of the proposed sampled-data filter against a rectangular wave
control theory, consult, e.g., Chen and Francis (1995a) and Yamamoto (1999).

The same philosophy of emphasizing the importance of analog performance was proposed and pursued recently by Unser and co-workers (1994), Unser (2005), and Eldar and Dvorkind (2006). The crucial difference is that they rely on L²/H²-type optimization and orthogonal or oblique projections, which are very different from our method here. In particular, such projection methods can behave poorly for signals outside the projected space. The response shown in Fig. 3 is a typical such example.

Applications to image processing are discussed in Yamamoto et al. (2012). An application to Delta-Sigma DA converters is studied in Nagahara and Yamamoto (2012). Again, the crux of the idea is to assume a signal generator model and then design an optimal filter in the sense of Fig. 4 or a similar diagram with the same idea. This idea should be applicable to a much wider class of problems in signal processing and should prove to have more impact.

Some processed examples of still and moving images are downloadable from the site: http://www-ics.acs.i.kyoto-u.ac.jp/~yy/

For the sampling theorem, see Shannon (1949), Unser (2000), and Zayed (1996), for example. Note, however, that Shannon himself (1949) did not claim originality on this theorem; hence, it is misleading to attribute this theorem solely to Shannon. See Unser (2000) and Zayed (1996) for some historical accounts. For a general background in signal processing, Vetterli et al. (2013) is useful.

Cross-References

H-Infinity Control
Optimal Sampled-Data Control
Sampled-Data Systems

Acknowledgments The author would like to thank Masaaki Nagahara and Masashi Wakaiki for their help with the numerical examples. Part of this entry is based on the exposition (Yamamoto 2007) written in Japanese.

Bibliography

Chen T, Francis BA (1995a) Optimal sampled-data control systems. Springer, New York
Chen T, Francis BA (1995b) Design of multirate filter banks by H∞ optimization. IEEE Trans Signal Process 43:2822–2830
Eldar YC, Dvorkind TG (2006) A minimum squared-error framework for generalized sampling. IEEE Trans Signal Process 54(6):2155–2167
Fujiyama K, Iwasaki N, Hirasawa Y, Yamamoto Y (2008) High frequency compensator and reproducing device. US patent 7,324,024 B2, 2008
Hassibi B, Erdogan AT, Kailath T (2006) MIMO linear equalization with an H∞ criterion. IEEE Trans Signal Process 54(2):499–511
Khargonekar PP, Yamamoto Y (1996) Delayed signal reconstruction using sampled-data control. In: Proceedings of 35th IEEE CDC, Kobe, Japan, pp 1259–1263
Nagahara M, Yamamoto Y (2012) Frequency domain min-max optimization of noise-shaping delta-sigma modulators. IEEE Trans Signal Process 60(6):2828–2839
Shannon CE (1949) Communication in the presence of noise. Proc IRE 37(1):10–21
Unser M (2000) Sampling – 50 years after Shannon. Proc IEEE 88(4):569–587
Unser M (2005) Cardinal exponential splines: part II – think analog, act digital. IEEE Trans Signal Process 53(4):1439–1449
Unser M, Aldroubi A (1994) A general sampling theory for nonideal acquisition devices. IEEE Trans Signal Process 42(11):2915–2925
Vetterli M, Kovacěvić J, Goyal V (2013) Foundations of signal processing. Cambridge University Press, Cambridge
Yamamoto Y (1999) Digital control. In: Webster JG (ed) Wiley encyclopedia of electrical and electronics engineering, vol 5. Wiley, New York, pp 445–457
Yamamoto Y (2006) Digital/analog converters and a design method for the pertinent filters. Japanese patent 3,820,331, 2006
Yamamoto Y (2007) New developments in signal processing via sampled-data control theory – continuous-time performance and optimal design. Meas Control (Jpn) 46:199–205
Yamamoto Y, Nagahara M (2006) Sample-rate converters. Japanese patent 3,851,757, 2006
Yamamoto Y, Madievski AG, Anderson BDO (1999) Approximation of frequency response for sampled-data control systems. Automatica 35(4):729–734
Yamamoto Y, Anderson BDO, Nagahara M (2002) Approximating sampled-data systems with applications to digital redesign. In: Proceedings of the 41st IEEE CDC, Las Vegas, pp 3724–3729
Yamamoto Y, Nagahara M, Khargonekar PP (2012) Signal reconstruction via H∞ sampled-data control theory – beyond the Shannon paradigm. IEEE Trans Signal Process 60:613–625
Zayed AI (1996) Advances in Shannon's sampling theory. CRC Press, Boca Raton
Zelniker G, Taylor FJ (1994) Advanced digital signal processing: theory and applications. Marcel Dekker, New York

Control for High-Speed Nanopositioning

S.O. Reza Moheimani
School of Electrical Engineering & Computer Science, The University of Newcastle, Callaghan, NSW, Australia

Abstract

Over the last two and a half decades we have observed astonishing progress in the field of nanotechnology. This progress is largely due to the invention of the Scanning Tunneling Microscope (STM) and the Atomic Force Microscope (AFM) in the 1980s. Central to the operation of AFM and STM is a nanopositioning system that moves a sample or a probe, with extremely high precision, up to a fraction of an Angstrom, in certain applications. This note concentrates on the fundamental role of feedback, and the need for model-based control design methods, in improving accuracy and speed of operation of nanopositioning systems.

Keywords

Atomic force microscopy; High-precision mechatronic systems; Nanopositioning; Scanning probe microscopy

Introduction

Controlling the motion of an actuator to within a single atom, known as nanopositioning, may seem an impossible task. Yet, it has become a key requirement in many systems to emerge in recent years. In scanning probe microscopy, nanopositioning is needed to scan a probe over a sample surface for imaging and to control the interaction between the probe and the surface during interrogation and manipulation (Meyer et al. 2004). Nanopositioning is the enabling technology for mask-less lithography tools under development to replace optical lithography systems (Vettiger et al. 2002). Novel nanopositioning
tools are required for positioning of wafers and for mask alignment in the semiconductor industry (Verma et al. 2005). Nanopositioning systems are vital in molecular biology for imaging, alignment, and nanomanipulation in applications such as DNA analysis (Meldrum et al. 2001) and nanoassembly (Whitesides and Christopher Love 2001). Nanopositioning is an important technology in optical alignment systems (Krogmann 1999). In data storage systems, nanometer-scale precision is needed for emerging probe-storage devices, for dual-stage hard-disk drives, and for next generation tape drives (Cherubini et al. 2012).

The Need for High-Speed Nanopositioning

In all applications of nanopositioning, there is a significant and growing demand for high speeds. The ability to operate a nanopositioner at a bandwidth of tens of kHz, as opposed to today's hundreds of Hz, is the key to unlocking countless technological possibilities in the future (Gao et al. 2000; Pantazi et al. 2008; Salapaka 2003; Sebastian et al. 2008b; Yong et al. 2012). The atomic force microscope (AFM) is an example of such technologies. A typical commercial atomic force microscope is a slow device, taking up to a minute or longer to generate an image. Such imaging speeds are too slow to investigate phenomena with fast dynamics. For example, rapid biological processes that occur in seconds, such as rapid movement of cells or fast dehydration and denaturation of collagen, are too fast to be observed by a typical commercial AFM (Zou et al. 2004). A key obstacle in realizing high-speed and video-rate atomic force microscopy is the limited speed of nanopositioners.

The Vital Role of Feedback Control in High-Speed Nanopositioning

The systems described above depend on a precision mechatronic device, known as a nanopositioner or a scanner, for their operation.

Control for High-Speed Nanopositioning, Fig. 1 A 3DoF flexure-guided high-speed nanopositioner (Yong et al. 2013). The three axes are actuated independently using piezoelectric stack actuators. Movement of lateral axes is measured using capacitive sensors

A high-speed scanner is shown in Fig. 1. In all applications where nanopositioning is a necessity, the key objective is to make the scanner follow, or track, a given reference trajectory (Devasia et al. 2007). A large number of control design methods have been proposed for this purpose, including feedforward control (Clayton et al. 2009), feedback control (Salapaka 2003), and combinations of those (Yong et al. 2009). These control techniques are required in order to compensate for the mechanical resonances of the scanner as well as for various nonlinearities and uncertainties in the dynamics of the nanopositioner. At low speeds, feedforward techniques are usually sufficient to address many of the arising challenges. However, over a wide bandwidth, model uncertainties, sensor noise, and mechanical cross-couplings become significant, and hence feedback control becomes essential to achieve the requisite nanoscale accuracy and precision at high speeds (Devasia et al. 2007; Salapaka 2003).

Control Design Challenges

A feedback loop typically encountered in nanopositioning is illustrated in Fig. 2. The purpose of the feedback controller is to control
Control for High-Speed Nanopositioning, Fig. 2 A feedback loop typically encountered in nanopositioning. Purpose of the controller is to control the position of the scanner such that it follows the intended reference trajectory based on the position measurement obtained from a position sensor

the position of the scanner such that it follows a given reference trajectory based on the measurement provided by a displacement sensor. The resulting tracking error contains both deterministic and stochastic components. Deterministic errors are typically due to insufficient closed-loop bandwidth. They may also arise from excitation of mechanical resonant modes of the scanner or actuator nonlinearities such as piezoelectric hysteresis and creep (Croft et al. 2001). The factors that limit the achievable closed-loop bandwidth include phase delays and non-minimum phase zeros associated with the actuator and scanner dynamics (Devasia et al. 2007). The dynamics of the nanopositioner, the controller, and the reference trajectory selected for scanning play a key role in minimizing the deterministic component of the tracking error.

Tracking errors of a stochastic nature mostly arise from external noise and vibrations and from position measurement noise. External noise and vibrations can be significantly reduced by operating the nanopositioner in a controlled environment. However, dealing with the measurement noise is a significant challenge (Sebastian et al. 2008a). The feedback loop allows the sensing noise to generate a random positioning error that deteriorates the positioning precision. Increasing the closed-loop bandwidth (to decrease the deterministic errors) tends to worsen this effect. Low sensitivity to measurement noise is, therefore, a key requirement in feedback control design for high-speed nanopositioning and a very hard problem to address.

Summary and Future Directions

While high-precision nanoscale positioning systems have been demonstrated at low speeds, despite an intensive international race spanning several years, the longstanding challenge remains to achieve high-speed motion and positioning with Ångstrom-level accuracy. Overcoming this barrier is believed to be the necessary catalyst for the emergence of groundbreaking innovations across a wide range of scientific and technological fields. Control is a critical technology to facilitate the emergence of such systems.

Bibliography

Cherubini G, Chung CC, Messner WC, Moheimani SOR (2012) Control methods in data-storage systems. IEEE Trans Control Syst Technol 20(2):296–322
Clayton GM, Tien S, Leang KK, Zou Q, Devasia S (2009) A review of feedforward control approaches in nanopositioning for high-speed SPM. J Dyn Syst Meas Control Trans ASME 131(6):1–19
Croft D, Shed G, Devasia S (2001) Creep, hysteresis, and vibration compensation for piezoactuators: atomic force microscopy application. ASME J Dyn Syst Control 123(1):35–43
Devasia S, Eleftheriou E, Moheimani SOR (2007) A survey of control issues in nanopositioning. IEEE Trans Control Syst Technol 15(5):802–823
Gao W, Hocken RJ, Patten JA, Lovingood J, Lucca DA (2000) Construction and testing of a nanomachining instrument. Precis Eng 24(4):320–328
Krogmann D (1999) Image multiplexing system on the base of piezoelectrically driven silicon microlens arrays. In: Proceedings of the 3rd international conference on micro opto electro mechanical systems (MOEMS), Mainz, pp 178–185
Meldrum DR, Pence WH, Moody SE, Cunningham DL, Holl M, Wiktor PJ, Saini M, Moore MP, Jang L, Kidd M, Fisher C, Cookson A (2001) Automated, integrated modules for fluid handling, thermal cycling and purification of DNA samples for high throughput sequencing and analysis. In: IEEE/ASME international conference on advanced intelligent mechatronics, AIM, Como, vol 2, pp 1211–1219
Meyer E, Hug HJ, Bennewitz R (2004) Scanning probe microscopy. Springer, Heidelberg
Pantazi A, Sebastian A, Antonakopoulos TA, Bachtold P, Bonaccio AR, Bonan J, Cherubini G, Despont M, DiPietro RA, Drechsler U, DurIg U, Gotsmann B, Haberle W, Hagleitner C, Hedrick JL, Jubin D, Knoll A, Lantz MA, Pentarakis J, Pozidis H, Pratt RC, Rothuizen H, Stutz R, Varsamou M, Weismann D, Eleftheriou E (2008) Probe-based ultrahigh-density storage technology. IBM J Res Dev 52(4–5):493–511
Salapaka S (2003) Control of the nanopositioning devices. In: Proceedings of the IEEE conference on decision and control, Maui
Sebastian A, Pantazi A, Moheimani SOR, Pozidis H, Eleftheriou E (2008a) Achieving sub-nanometer precision in a MEMS storage device during self-servo write process. IEEE Trans Nanotechnol 7(5):586–595. doi:10.1109/TNANO.2008.926441
Sebastian A, Pantazi A, Pozidis H, Eleftheriou E (2008b) Nanopositioning for probe-based data storage [applications of control]. IEEE Control Syst Mag 28(4):26–35
Verma S, Kim W, Shakir H (2005) Multi-axis maglev nanopositioner for precision manufacturing and manipulation applications. IEEE Trans Ind Appl 41(5):1159–1167
Vettiger P, Cross G, Despont M, Drechsler U, Durig U, Gotsmann B, Haberle W, Lantz MA, Rothuizen HE, Stutz R, Binnig GK (2002) The "millipede" – nanotechnology entering data storage. IEEE Trans Nanotechnol 1(1):39–54
Whitesides GM, Christopher Love J (2001) The art of building small. Sci Am 285(3):38–47
Yong YK, Aphale S, Moheimani SOR (2009) Design, identification and control of a flexure-based XY stage for fast nanoscale positioning. IEEE Trans Nanotechnol 8(1):46–54
Yong YK, Moheimani SOR, Kenton BJ, Leang KK (2012) Invited review article: high-speed flexure-guided nanopositioning: mechanical design and control issues. Rev Sci Instrum 83(12):121101
Yong YK, Bhikkaji B, Moheimani SOR (2013) Design, modeling and FPAA-based control of a high-speed atomic force microscope nanopositioner. IEEE/ASME Trans Mechatron 18(3):1060–1071. doi:10.1109/TMECH.2012.2194161
Zou Q, Leang KK, Sadoun E, Reed MJ, Devasia S

Control Hierarchy of Large Processing Plants: An Overview

Cesar de Prada
Departamento de Ingeniería de Sistemas y Automática, University of Valladolid, Valladolid, Spain

Abstract

This entry provides an overview of the so-called control pyramid, which organizes the different types of control tasks in a processing plant in a set of interconnected layers, from basic control and instrumentation to plant-wide economic optimization. These layers have different functions, all of them necessary for the optimal functioning of large processing plants.

Keywords

Control hierarchy; Control pyramid; Model-predictive control; Optimization; Plant-wide control; Real-time optimization

Introduction

Operating a process plant is a complex task involving many different aspects ranging from the control of individual pieces of equipment and of process units to the management of the plant or factory as a whole, including relations with other plants or suppliers.

From the control point of view, the corresponding tasks are traditionally organized in several layers, placing at the bottom the ones closer to the physical processes and at the top those closer to plant-wide management, forming the so-called control pyramid represented in Fig. 1.

The process industry currently faces many challenges, originated from factors such as increased competition among companies and better global market information, new environmental
(2004) Control issues in high-speed AFM for biolog-
ical applications: collagen imaging example. Asian
regulations and safety standards, improved
J Control Spec Issue Adv Nanotechnol Control 6(2): quality, or energy efficiency requirements. Many
164–178 years ago, the main tasks were associated to the
Therefore, only the lower and top layers of the control pyramid were realized by computer-based systems, whereas the intermediate tasks were largely performed by human operators and managers; but, more and more, the intermediate layers are gaining importance in order to face the abovementioned challenges.

Control Hierarchy of Large Processing Plants: An Overview, Fig. 1 The control pyramid

Above the physical plant represented in Fig. 1, there is a layer related to instrumentation and basic control, devoted to obtaining direct process information and to maintaining selected process variables close to their desired targets by means of local controllers. Motivated by the need for more efficient operation and better quality assurance, an improvement of this basic control can be obtained using control structures such as cascades, feedforwards, ratios, and selectors. This is called advanced control in industry, but not in academia, where the term is reserved for more sophisticated controls.

A big step forward took place in the control field with the introduction of model-based predictive control (MBPC/MPC) in the late 1970s and 1980s (► Industrial MPC of Continuous Processes; Camacho and Bordóns 2004). MPC aims at regulating a process unit as a whole, considering all manipulated and controlled variables simultaneously. It handles all interactions, disturbances, and process constraints, using a process model in order to compute the control actions that optimize a control performance index. MPC is built on top of the basic control loops and partly replaces the complex control structures of the advanced control layer, adding new functionalities and better control performance. The improvements in control quality and the management of constraints and interactions by model-predictive controllers open the door to the implementation of local economic optimization. Linked to the MPC controller and taking advantage of its model, an optimizer may look for the best operating point of the unit by computing the controller set points that optimize an economic cost function of the process unit, considering its operational constraints. This task is usually formulated and solved as a linear programming (LP) problem, i.e., based on linear or linearized economic models and cost function (see Fig. 2).

A natural extension of these ideas was to consider the interrelations among the different parts of the processing plant and to look for the steady-state operating point that provides the best economic return and minimum energy expense, or that optimizes any other economic criterion, while satisfying the global production aims and constraints. These optimization tasks are known as real-time optimization (RTO) (► Real-Time Optimization of Industrial Processes) and form another layer of the control pyramid.

Finally, when we consider the whole plant operation, obvious links appear between RTO and the planning and economic management of the company. In particular, the organization and optimization of the flows of raw materials, purchases, etc., involved in the supply chains present important challenges that are placed in the top layer of Fig. 1.

This entry provides an overview of the different layers and associated tasks so that the reader can place in context the different controllers and related functionalities and tools, as well as appreciate the trends in process control, which focus attention toward the higher levels of the hierarchy and the optimal operation of large-scale processes.
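The LP-based local optimization described above can be sketched in a few lines. The steady-state gain matrix, prices, and limits below are invented for illustration and are not taken from the entry:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical linearized steady-state model of one process unit:
# controlled variables y = y0 + G @ u, with u the set-point moves.
G = np.array([[0.8, -0.3],
              [0.2, 0.9]])
y0 = np.array([10.0, 5.0])

# Economic objective: minimize c @ u (e.g., feed cost minus product value).
c = np.array([-2.0, 1.5])

# Operational constraints y_lo <= y0 + G @ u <= y_hi, plus actuator ranges.
y_lo = np.array([8.0, 4.0])
y_hi = np.array([14.0, 9.0])
A_ub = np.vstack([G, -G])
b_ub = np.concatenate([y_hi - y0, y0 - y_lo])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(-3, 3), (-3, 3)])
u_opt = res.x                 # economically optimal set-point moves
y_sp = y0 + G @ u_opt         # targets handed to the MPC/basic loops
```

In an industrial implementation, the gains in G would come from the MPC's own model, and the LP would be re-solved at every controller execution cycle.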
An Alternative View

The implementation in a process factory of the tasks and layers previously mentioned is possible nowadays due to important advances in many fields, such as modeling and identification, control and estimation, optimization methods, and, in particular, software tools, communications, and computing power. Today it is rather common to find in many process plants an information network that also follows a pyramidal structure, represented in Fig. 3.

At the bottom, there is the instrumentation layer that includes, besides sensors and actuators connected by the classical analog 4–20 mA signals, possibly enhanced by the transmission of information to and from the sensors by the HART protocol, digital field buses and smart transmitters and actuators that incorporate improved information and intelligence. New functionalities, such as remote calibration, filtering, self-test, and disturbance compensation, provide more accurate measurements that contribute to improving the functioning of local controllers, in the same way as the new methods and tools available nowadays for instrument monitoring and fault detection and diagnosis. The increased installation of wireless transmitters and the advances in analytical instrumentation will lead, without doubt, to the development of a stronger information base to support better decisions and operations in the plants.

Control Hierarchy of Large Processing Plants: An Overview, Fig. 2 MPC implementation with a local optimizer

Information from the transmitters is collected in the control rooms that are the core of the second layer. Many of them are equipped with distributed control systems (DCS) that implement monitoring and control tasks. Field signals are received in the control cabinets, where a large number of microprocessors execute the data acquisition and regulatory control tasks, sending signals back to the field actuators. Internal buses connect the controllers with the computers that support the displays of the human-machine interface (HMI) for the plant operators of the control room. In the past, DCS were mostly in charge of the regulatory control tasks, including basic control, alarm management, and historians, while interlocking systems related to safety and sequences related to batch operations were implemented either in the DCS or in programmable logic controllers (PLCs) (► Programmable Logic Controllers). Today, the bounds are not so clear, due to the increase in the computing power of the PLCs and the added functionalities of the DCS. Safety instrumented systems (SIS) for the maintenance of plant safety are usually implemented in dedicated PLCs, if not hard-wired; but, for the rest of the functions, a combination of PLC-like processors with I/O cards and SCADA (supervisory control and data acquisition) systems is the prevailing architecture. SCADA systems act as HMI and information systems, collecting large amounts of data that can be used at other levels for different purposes.

Above the basic and advanced control layer, using the information stored in the SCADA as well as other sources, there is an increasing number of applications covering diverse fields. Figure 3 depicts the perspective of the computing and information flow architecture and includes a level called supervisory control, placed in direct connection with the control room and the production tasks. It includes, for instance, MPC with local optimizers, statistical process control (SPC) for quality and production supervision (► Multiscale Multivariate Statistical Process Control), data reconciliation, inference and estimation of unmeasured quantities, fault detection and diagnosis, and controller performance monitoring (CPM) (► Controller Performance Monitoring).
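Among the supervisory functions just listed, data reconciliation has a particularly compact form: measured flows are adjusted in a weighted least-squares sense so that they satisfy the plant mass balances exactly. A minimal sketch, with a single stream splitter and invented numbers:

```python
import numpy as np

# Mass balance of a splitter: F1 - F2 - F3 = 0, written as A @ F = 0.
A = np.array([[1.0, -1.0, -1.0]])
y = np.array([100.0, 60.5, 38.2])      # raw flow measurements (imbalance 1.3)
sigma = np.array([2.0, 1.0, 1.0])      # assumed measurement std deviations
V = np.diag(sigma ** 2)                # measurement covariance

# Closed-form solution of  min (F - y)' V^-1 (F - y)  subject to  A F = 0:
F = y - V @ A.T @ np.linalg.solve(A @ V @ A.T, A @ y)
```

The least accurate measurement (largest sigma) absorbs most of the correction; here the reconciled flows are approximately [99.13, 60.72, 38.42], and the balance closes exactly.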
Control Hierarchy of Large Processing Plants: An Overview, Fig. 3 Information network in a modern process plant

The information flow becomes more complex when we move up from the basic control layer, looking more like a web than a pyramid when we enter the world of what can generally be called asset (plant and equipment) management: a collection of different activities oriented to sustaining performance and economic return, considering the entire life cycle of the assets and, in particular, aspects such as maintenance, efficiency, or production organization. Above the supervisory layer, one can usually distinguish at least two levels, denoted generically as manufacturing execution systems (MES) and enterprise resource planning (ERP) (Scholten 2009), as can be seen in Fig. 4.

MES are information systems that support the functions that a production department must perform in order to prepare and manage work instructions, schedule production activities, monitor the correct execution of the production process, gather and analyze information about the production process, and optimize procedures. Notice that, regarding the control of process units, up to this level no fundamental differences appear between continuous and batch processes. But at the MES level, which corresponds to the RTO level of Fig. 1, many process units may be involved, and the tools and problems are different, the main task in batch production being the optimal scheduling of those process units (► Scheduling of Batch Plants; Mendez et al. 2006).

MES are part of a larger class of systems called manufacturing operations management (MOM) that cover not only the management of production operations but also other functions such as maintenance, quality, laboratory information systems, or warehouse management.
Control Hierarchy of Large Processing Plants: An Overview, Fig. 4 Software/hardware view

One of their main tasks is to generate elaborated information, quite often in the form of key performance indicators (KPIs), with the purpose of facilitating the implementation of corrective actions.

ERP systems represent the top of the pyramid, corresponding to the enterprise business planning activities that allow assigning global targets to production scheduling. For many years, this level has been considered to be out of the scope of the field of control; but nowadays, more and more, supply chain management is viewed and addressed as a control and optimization problem in research.

Future Control and Optimization at Plant Scale

Going back to Fig. 1, the variety of control and optimization problems increases as we move up in the control hierarchy, entering the field of dynamic process operations and considering not only individual process units but also larger sets of equipment or whole plants. Examples at the RTO (or MES) level are the optimal management of shared resources or utilities, production bottleneck avoidance, optimal energy use or maximum efficiency, smooth transitions between production changes, etc.

Above, we have mentioned RTO as the most common approach to plant-wide optimization. Normally, RTO systems perform the optimization of an economic cost function using a nonlinear process model in steady state and the corresponding operational constraints to generate targets for the control systems on the lower layers. The implementation of RTO provides consistent benefits by looking at the optimal operation problem from a plant-wide perspective. Nevertheless, in practice, when MPCs with local optimizers are operating the process units, many coordination problems appear between these layers, due to differences in models and targets, so that driving the operation of these process units in a way coherent with the global economic targets is an additional challenge.

Control Hierarchy of Large Processing Plants: An Overview, Fig. 5 Two possible implementations of RTO

A different perspective is taken by so-called self-optimizing control (Fig. 5, right; Skogestad 2000) that, instead of implementing the RTO solution online, uses it to design a control structure that assures a near-optimum operation if some specially chosen variables are maintained close to their targets.
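As a rough sketch of the steady-state economic optimization performed by an RTO layer, consider a hypothetical single-reaction unit; the model, kinetic constants, prices, and limits below are all invented for illustration:

```python
import numpy as np
from scipy.optimize import minimize

# Invented steady-state model of a reaction A -> B in a stirred tank:
# conversion depends on temperature T [K] and residence time tau [h].
k0, E_R = 5.0e6, 6000.0

def conversion(T, tau):
    k = k0 * np.exp(-E_R / T)          # Arrhenius rate constant
    return k * tau / (1.0 + k * tau)   # steady-state CSTR conversion

def neg_profit(v):                     # economic cost = -profit
    T, tau = v
    return -(100.0 * conversion(T, tau)    # product value
             - 0.05 * (T - 300.0)          # heating cost
             - 1.0 * tau)                  # holdup/throughput penalty

res = minimize(neg_profit, x0=[330.0, 5.0], method="SLSQP",
               bounds=[(300.0, 380.0), (1.0, 20.0)],
               constraints=[{"type": "ineq",                  # conversion cap
                             "fun": lambda v: 0.95 - conversion(*v)}])
T_opt, tau_opt = res.x    # steady-state targets passed to the lower layers
```

The optimizer output (T_opt, tau_opt) plays the role of the targets that RTO hands down to the MPC and regulatory layers; in a real RTO system, the model would be a rigorous plant model updated with reconciled data.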
As in any model-based approach, the problem emerges of how to implement or modify the theoretical optimum computed by RTO so that the optimum computed with the model and the real optimum of the process coincide in spite of model errors, disturbances, etc. A common choice to deal with this problem is to update the model periodically using parameter estimation methods or data reconciliation with plant data in steady state. Also, uncertainty can be explicitly taken into account by considering different scenarios and optimizing the worst case, but this is conservative and does not take advantage of the plant measurements. Along this line, there are proposals of other solutions, such as modifier-adaptation methods, which use a fixed model and process measurements to modify the optimization problem so that the final result corresponds to the process optimum (Marchetti et al. 2009), or the use of stochastic optimization, where several scenarios are taken into account and future decisions are used as recourse variables (Lucia et al. 2013).

RTO is formulated in steady state, but in practice, most of the time the plants are in transients, and there are many problems, such as start-up optimization, that require a dynamic formulation. A natural evolution in this direction is to combine nonlinear MPC with economic optimization so that the target of the NMPC is not set point following but direct economic optimization, as in the right-hand side of Fig. 6 (► Economic Model Predictive Control and ► Model-Based Performance Optimizing Control; Engell 2007).

Control Hierarchy of Large Processing Plants: An Overview, Fig. 6 Direct dynamic optimization

The type of problems that can be formulated within this framework is very wide, as are the possible fields of application. Processes with distributed parameter structure or mixtures of real and on/off variables, batch and continuous units, statistical distributions of particle sizes or properties, etc., give rise to special types of NMPC problems (see, e.g., Lunze and Lamnabhi-Lagarrigue 2009), but a common characteristic of all of them is the fact that they are computationally intensive and should be solved taking into account the different forms of uncertainty that are always present.

Control Hierarchy of Large Processing Plants: An Overview, Fig. 7 Hierarchical, price coordination, and distributed approaches

Control and optimization are nowadays inseparable, essential parts of any advanced approach to dynamic process operation. Progress in the field and the spreading of industrial applications are possible thanks to the advances in optimization methods and tools and the computing power available at plant level, but implementation is still a challenge from many points of view, not only technical.
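The modifier-adaptation idea mentioned above can be illustrated with a scalar toy problem in which a fixed, mismatched model is iteratively corrected with "plant" gradient information; both cost functions are invented for the sketch:

```python
from scipy.optimize import minimize_scalar

# Scalar illustration of modifier adaptation (after Marchetti et al. 2009):
# the optimum of a fixed, wrong model is corrected with plant measurements
# until it matches the (unknown) plant optimum.
def J_plant(u):                 # "true" plant cost, normally only measurable
    return (u - 2.0) ** 2 + 0.5 * u

def J_model(u):                 # mismatched fixed model used by the optimizer
    return (u - 1.0) ** 2

def plant_gradient(u, h=1e-3):  # finite differences on plant measurements
    return (J_plant(u + h) - J_plant(u - h)) / (2 * h)

def model_gradient(u, h=1e-6):
    return (J_model(u + h) - J_model(u - h)) / (2 * h)

u, K = 0.0, 0.6                  # initial input and filter gain
for _ in range(30):
    lam = plant_gradient(u) - model_gradient(u)      # first-order modifier
    u_new = minimize_scalar(lambda v: J_model(v) + lam * v,
                            bounds=(-5, 5), method="bounded").x
    u = (1 - K) * u + K * u_new                      # filtered update
```

For these quadratic costs, the iteration converges to the plant optimum u = 1.75, where the modified model gradient matches the measured plant gradient, even though the model itself is never updated.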
Few suppliers offer commercial products, and finding optimal operation policies for a whole factory is a complex task that requires taking into consideration many aspects and elaborated information not available directly as process measurements. Solving large NMPC problems in real time may require breaking the associated optimization problem into subproblems that can be solved in parallel. This leads to several local controllers/optimizers, each one solving one subproblem involving the variables of a part of the process and linked by some type of coordination. This offers a new point of view of the control hierarchy. Typically, three types of architectures are mentioned for dealing with this problem, represented in Fig. 7: in the hierarchical approach, coordination between the local controllers is performed by an upper layer that deals with the interactions, assigning targets to them. In price coordination, the coordination task is performed by a market-like mechanism that assigns different prices to the cost functions of every local controller/optimizer. Finally, in the distributed approach, the local controllers coordinate their actions by exchanging information about their decisions or states with their neighbors (Scattolini 2009).

Summary and Future Research

Process control is a key element in the operation of process plants. At the lowest layer, it can be considered a mature, well-proven technology, even if many problems, such as control structure selection and controller tuning, are in reality often not solved well. The range of problems under consideration is continuously expanding to the upper layers of the hierarchy, merging control with process operation and optimization, creating new challenges that range from modeling and estimation to efficient large-scale optimization and robustness against uncertainty, and leading to new challenges and problems for research and possibly large improvements of plant operations.

Cross-References

► Controller Performance Monitoring
► Economic Model Predictive Control
► Industrial MPC of Continuous Processes
► Model-Based Performance Optimizing Control
► Multiscale Multivariate Statistical Process Control
► Programmable Logic Controllers
► Real-Time Optimization of Industrial Processes
► Scheduling of Batch Plants

Bibliography

Camacho EF, Bordóns C (2004) Model predictive control. Springer, London, pp 1–405. ISBN 1-85233-694-3
de Prada C, Gutierrez G (2012) Present and future trends in process control. Ingeniería Química 44(505):38–42. Special edition ACHEMA, ISSN 0210-2064
Engell S (2007) Feedback control for optimal process operation. J Process Control 17:203–219
Engell S, Harjunkoski I (2012) Optimal operation: scheduling, advanced control and their integration. Comput Chem Eng 47:121–133
Lucia S, Finkler T, Engell S (2013) Multi-stage nonlinear model predictive control applied to a semi-batch polymerization reactor under uncertainty. J Process Control 23:1306–1319
Lunze J, Lamnabhi-Lagarrigue F (2009) HYCON handbook of hybrid systems control. Theory, tools, applications. Cambridge University Press, Cambridge. ISBN 978-0-521-76505-3
Marchetti A, Chachuat B, Bonvin D (2009) Modifier-adaptation methodology for real-time optimization. Ind Eng Chem Res 48(13):6022–6033
Mendez CA, Cerdá J, Grossmann I, Harjunkoski I, Fahl M (2006) State-of-the-art review of optimization methods for short-term scheduling of batch processes. Comput Chem Eng 30:913–946
Scattolini R (2009) Architectures for distributed and hierarchical model predictive control – a review. J Process Control 19:723–731
Scholten B (2009) MES guide for executives: why and how to select, implement, and maintain a manufacturing execution system. ISA, Research Triangle Park. ISBN 978-1-936007-03-5
Skogestad S (2000) Plantwide control: the search for the self-optimizing control structure. J Process Control 10:487–507

Control of Biotechnological Processes

Rudibert King
Technische Universität Berlin, Berlin, Germany

Abstract

Closed-loop control can significantly improve the performance of bioprocesses, e.g., by an increase of the production rate of a target molecule or by guaranteeing reproducibility of the production with low variability. In contrast to the control of chemical reaction systems, the biological reactions take place inside cells, which constitute highly regulated, i.e., internally controlled, systems by themselves. As a result, through evolution, the same cell can and will mimic a first-order system in some situations and a high-dimensional, highly nonlinear system in others. A complete mathematical description of the possible behaviors of the cell is still beyond reach and would be far too complicated as a basis for model-based process control. This makes supervision, control, and optimization of biosystems very demanding.

Keywords

Bioprocess control; Control of uncertain systems; Optimal control; Parameter identification; State estimation; Structure identification; Structured models

Introduction

Biotechnology offers solutions to a broad spectrum of challenges faced today, e.g., for health care, remediation of environmental pollution, new sources for energy supplies, sustainable food production, and the supply of bulk chemicals. To explain the needs for control of bioprocesses, especially for the production of high-value and/or large-volume compounds, it is instructive to have a look at the development of a new process. If a potential strain is found or genetically engineered, the biologist will determine favorable environmental factors for the growth of the cells and for the production of the target product by the cells. These factors typically comprise the levels of temperature, pH, dissolved oxygen, etc. Moreover, concentration regions for the nutrients, precursors, and so-called trace elements are specified. Whereas for the former variables often "optimal" setpoints are provided which, at least in smaller-scale reactors, can be easily maintained by independent, classically designed controllers, information about the best nutrient supply is incomplete from a control engineering point of view. It is this dynamic nutrient supply that is most often not revealed in the biological laboratory and that, however, offers substantial room for production improvements by control.

Irrespective of whether bacteria, yeasts, fungi, or animal cells are used for production, these cells will consist of thousands of different compounds which react with each other in hundreds or more reactions. All reactions are tightly regulated on a molecular and genetic basis; see ► Deterministic Description of Biochemical Networks. For so-called unlimited growth conditions, all cellular compartments will be built up with the same specific growth rate, meaning that the cellular composition will not change over time. In a mathematical model describing growth and production, only one state variable will be needed to describe the biotic phase. This will give rise to unstructured models; see below. Whenever a cell enters a limitation, which is often needed for production, the cell will start to reorganize its internal reaction pathways. Model-based approaches of supervision and control based on unstructured models are now bound to fail. More biotic state variables are needed. However, it is not clear which and how many. As a result, modeling of limiting behaviors is challenging and crucial for the control of biotechnological processes. It requires a large amount of process-specific information. Moreover, model-based estimates of the state of the cell and of the environment are a key factor, as online measurements of the internal processes in the cell and of the nutrient concentrations are usually impossible.
Finally, as models used for process control have to be limited in size and thus only give an approximative description, robustness of the methods has to be addressed.

Mathematical Models

For the production of biotechnical goods, many up- and downstream unit operations are involved besides the biological reactions. As these pose no typical bio-related challenges, we will concentrate here on the cultivation of the organisms only. This is mostly performed in aqueous solutions in special bioreactors through which air is sparged for a supply with oxygen. In some cases, other gases are supplied as well; see Fig. 1. Disregarding wastewater treatment plants, most cultivations are still performed in a fed-batch mode, meaning that a small amount of cells and part of the nutrients are put into the reactor initially. Then more nutrients and correcting fluids, e.g., for pH or antifoam control, are added with variable rates, leading to an unsteady behavior. The system to be modeled consists of the gaseous, the liquid, and the biotic phase inside the reactor. For the former ones, balance equations can be formulated readily. The biotic phase can be modeled in a structured or unstructured way. Moreover, as not all cells behave similarly, this may give rise to a segregated model formulation, which is omitted here for brevity.

Control of Biotechnological Processes, Fig. 1 Modern laboratory reactor platform for control-oriented process development

Unstructured Models

If the biotic phase is represented by just one state variable, m_X, a typical example of a simple unstructured model of the liquid phase would be

\dot{m}_X = \mu_X m_X
\dot{m}_P = \mu_P m_X
\dot{m}_S = -a_1 \mu_X m_X - a_2 \mu_P m_X + c_{S,feed}\, u
\dot{m}_O = a_3 (a_4 - c_O) - a_5 \mu_X m_X - a_6 \mu_P m_X
\dot{V} = u
with the masses m_i, with i = X, P, S, O, for cells, product, substrate or nutrient, and dissolved oxygen, respectively. The volume is given by V, and the specific growth and production rates \mu_X and \mu_P depend on the concentrations c_i = m_i/V, e.g., of the substrate S or oxygen O, according to formal kinetics, e.g.,

\mu_X = \frac{a_7 c_S c_O}{(c_S + a_8)(c_O + a_9)}, \qquad
\mu_P = \frac{a_{10} c_S}{a_{11} c_S^2 + c_S + a_{12}}

The nutrient supply can be changed by the feed rate u(t) as a control input, with inflow concentration c_{S,feed}. Very often, just one feed stream is considered in unstructured models. As all parameters a_i have to be identified from noisy and infrequently sampled data, a low-dimensional, nonlinear, uncertain model results. All steps prior to the cultivation in which, e.g., from frozen cells, enough cells are produced to start the fermentation add to the uncertainty. Whereas the balance equations follow from first-principles-based modeling, the structure of the kinetics \mu_X and \mu_P is unknown, i.e., empirical relations are exploited. Many different kinetic expressions can be used here; see Bastin and Dochain (1990) or the small selection shown in Table 1.

Control of Biotechnological Processes, Table 1 Multiplicative rates depending on several concentrations c_1, ..., c_k, with possible kinetic terms

\rho_i = a_i^{\max}\, \mu_{i1}(c_1) \cdot \mu_{i2}(c_2) \cdot \ldots \cdot \mu_{ik}(c_k)

\mu_{ij}(c_j): \quad \frac{c_j}{c_j + a_{ij}}, \quad \frac{a_{ij}}{c_j + a_{ij}}, \quad \frac{c_j}{c_j^2 + a_{ij}}, \quad \frac{c_j}{c_j + a_{ij}}\, e^{-c_j/a_{ij+1}}, \quad \frac{c_j}{a_{ij} c_j^2 + c_j + a_{ij+1}}, \quad \ldots

It has to be pointed out that, most often, neither c_X, c_P, nor c_S is measured online. As the measurement of c_O might be unreliable, the exhaust gas concentration of the gaseous phase is the main online measurement, which can be used by employing an additional balance equation for the gaseous phase. Infrequent at-line measurements, though, are sometimes available for X, P, S, especially at the lab scale during process development.

Structured Models

In structured models, the changing composition and reaction pathways of the cell are accounted for. As detailed information about the cell's complete metabolism, including all regulations, is missing for the majority if not all cells exploited in bioprocesses, an approximative description is used. Examples are models in which a part of the real metabolism is described on a mechanistic level, whereas the rest is lumped together into one or very few states (Goudar et al. 2006), cybernetic models (Varner and Ramkrishna 1998), or compartment models (King 1997). As an example, all compartment models can be written down as

\dot{m} = A\, \rho(c) + f_{in}(u) + f_{out}(u)
\dot{V} = \sum_i u_i

with vectors of streams into and out of the reaction mixture, f_{in} and f_{out}, which depend on the control inputs u; a matrix of (stoichiometric) parameters, A; a vector of reaction rates \rho = \rho(c); and, finally, a vector m comprising substrates, products, and more than one biotic state. These biotic states can be motivated, for example, by physiological arguments, describing the total amounts of macromolecules in the cell, such as the main building blocks DNA, RNA, and proteins. In very simple compartment models, the cell is only divided up into what is called active and inactive biomass. Again, all coefficients in A and the structure and the coefficients of all entries in \rho(c) (see Table 1) are unknown and have to be identified based on experimental data. Issues of structural and practical identifiability are of major concern. For models of systems biology (see ► Deterministic Description of Biochemical Networks), algebraic equations are added that describe the dependencies between individual fluxes. Then at least part of A is known.
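The unstructured fed-batch balances given earlier can be simulated directly once numbers are chosen for the a_i; the parameter values, initial state, and feed policy below are invented solely to make the sketch runnable:

```python
import numpy as np
from scipy.integrate import solve_ivp

# All parameter values a1..a12 and the feed policy are invented for
# illustration; index 0 of `a` is unused so that a[i] matches a_i.
a = [None, 1.8, 0.9, 50.0, 0.008, 0.4, 0.1, 0.35, 0.05, 0.002, 0.12, 2.0, 0.03]
c_S_feed = 400.0                       # substrate concentration in the feed

def ode(t, x):
    m_X, m_P, m_S, m_O, V = x
    c_S, c_O = m_S / V, m_O / V        # concentrations from masses
    mu_X = a[7] * c_S * c_O / ((c_S + a[8]) * (c_O + a[9]))
    mu_P = a[10] * c_S / (a[11] * c_S**2 + c_S + a[12])
    u = 0.02                           # constant feed rate [L/h]
    return [mu_X * m_X,                                              # biomass
            mu_P * m_X,                                              # product
            -a[1] * mu_X * m_X - a[2] * mu_P * m_X + c_S_feed * u,   # substrate
            a[3] * (a[4] - c_O) - a[5] * mu_X * m_X - a[6] * mu_P * m_X,  # O2
            u]                                                       # volume

# 10 h fed-batch starting from 0.1 g biomass and 20 g substrate in 5 L
sol = solve_ivp(ode, (0.0, 10.0), [0.1, 0.0, 20.0, 0.04, 5.0], method="LSODA")
```

Such a simulation model, once the a_i are identified from experimental data, is the basis for the model-based estimation and optimization schemes discussed in the following sections.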
Describing the biotic phase with a higher degree of granularity does not change the measurement situation in the laboratory or at the production scale, i.e., still only very few online measurements will be available for control.

Identification

Even if the growth medium initially "only" consists of some 10–20 different, chemically well-defined substances, of which only a few are described in the model, this situation will change over the cultivation time as the organisms release further compounds, of which only a few may be known. If, for economic reasons, complex raw materials are used, even the initial composition is unknown. Hence, measuring the concentrations of some of the compounds of the large set of substances as a basis for modeling is not trivial. For structured models, intracellular substances have to be determined additionally. These are embedded in an even larger matrix of compounds, making chemical analysis more difficult. Therefore, the basis for parameter and structure identification is uncertain.

As the expensive experiments and chemical analysis tasks are very time consuming, sometimes lasting up to several weeks, methods of optimal experimental design should always be considered in biotechnology; see ► Experiment Design and Identification for Control.

The models to be built up should possess some predictive capability for a limited range of environmental conditions. This rules out unstructured models for many practical situations. However, for process control, the models should still be of manageable complexity. Medium-sized structured models seem to be well suited for such a situation. The choice of biotic states in m and of possible structures for the reaction rates \rho_i, however, is hardly supported by biological or chemical evidence. As a result, a combined structure and parameter identification problem with a combinatorial explosion of candidates arises. Although approaches exist to support this modeling step (see Herold and King 2013 or Mangold et al. 2005), finally the modeler will have to settle for a compromise with respect to the accuracy of the model found versus the number of fully identified model candidates. As a result, all control methods applied should be robust in some sense.

Soft Sensors

Despite many advances in the development of online measurements (see Mandenius and Titchener-Hooker 2013), systems for supervision and control of biotechnical processes often include model-based estimation schemes, such as extended Kalman filters (EKF); see ► Kalman Filters. Concentration estimates are needed for unmeasured substances and for quantities which depend on these concentrations, like the growth rate of the cells. In real applications, formulations have to be used which account for delays in laboratory analysis of up to several hours and for situations in which results from the laboratory will not be available in the same sequence as the samples were taken. An example from a real cultivation is shown in Fig. 2. Here, the at-line measurement of the biomass concentration, c_X = m_X/V, is the only measurement available. The result of a single measurement is obtained about 30 min after sampling. For reference, inaccessible state variables, which were analyzed later, are shown as well, along with the online estimates. The scatter of the data, especially of DNA and RNA, gives a qualitative impression of the measurement accuracy in biotechnology.

Control

Beside the relatively simple control of physical parameters, such as temperature, pH, dissolved oxygen, or carbon dioxide concentration, only few biotic variables are typically controlled
has to be solved. The choices of possible terms example is the growth rate of the biomass with
ij in all i give rise to a problem that exhibits the goal to reach a high cell concentration
[Figure: Control of Biotechnological Processes, Fig. 2. Eight panels plotting m_Protein, m_DNA, m_RNA, m_X (top row) and m_Amino acids, m_Phosphate, m_Ammonia, m_Glucose (bottom row), all in g, against time in h. Caption: Estimation of states of a structured model with an EKF with an unexpected growth delay initially. At-line measurement m_X (red filled circles), initially predicted evolution of states (black), online estimated evolution (blue), off-line data analyzed after the experiment (open red circles) (Data obtained by T. Heine)]

in the reactor as fast as possible. This is the predominant goal when the cells are the primary target, as in baker's yeast cultivations, or when the expression of the desired product is growth associated. For other, non-growth-associated products, a high cell mass is desirable as well, as production is proportional to the amount of cells. If the nutrient supply is maintained above a certain level, unlimited growth behavior results, allowing the use of unstructured models for model-based control. An excess of nutrients has to be avoided, though, as some organisms, like baker's yeast, will initiate an overflow metabolism, with products which may be inhibitory in later stages of the cultivation. For some products, such as the antibiotic penicillin, the organism has to grow slowly to obtain a high production rate. For these so-called secondary metabolites, low but not vanishing concentrations of some limiting substrates are needed. If setpoints are given for these concentrations instead, this can pose a rather challenging control problem. As the organisms try to grow exponentially, the controller must be able to increase the feed exponentially as well. The difficulty mainly arises from the inaccurate and infrequent measurements that the soft sensors/controller has to work with and from the danger that an intermediate shortage or oversupply of nutrients may switch the metabolism to an undesired state of low productivity.

For control of biotechnical processes, many methods explained in this encyclopedia, including feedforward, feedback, model-based, optimal, adaptive, fuzzy, neural nets, etc., can be and have been used (cf. Dochain 2008; Gnoth et al. 2008; Rani and Rao 1999). As in other areas of application, (robust) model-predictive control schemes (MPC) (see → Industrial MPC of Continuous Processes) are applied with great success in biotechnology.
[Figure: Control of Biotechnological Processes, Fig. 3. Twelve panels plotting c_X [g/l], m_Ni [g], V_Base [l], c_Pr [g/l], c_DNA [g/l], c_RNA [g/l], c_Am [g/l], c_Ph [g/l], c_C [g/l], and the feeds u_Am, u_Ph, u_C [l/h] against time in h. Caption: MPC control and state estimation of a cultivation with S. tendae. At-line measurement m_X (red filled circles), initially predicted evolution of states (black), online estimated evolution (blue), off-line data analyzed after the experiment (open red circles). Off-line optimal feeding profiles u_i (blue broken line), MPC-calculated feeds (black, solid) (Data obtained by T. Heine)]

For the antibiotic production shown in Fig. 3, optimal feeding profiles u_i for ammonia (AM), phosphate (PH), and glucose (C) were calculated before the experiment was performed, in a trajectory optimization such that the final mass of the desired antibiotic nikkomycin (Ni) was maximized. This resulted in the blue broken lines for the feeds u_i. However, due to disturbances and model inaccuracies, an MPC scheme had to significantly change the feeding profiles to actually obtain this high amount of nikkomycin; see the feeding profiles given in black solid lines. This example shows that, especially in biotechnology, off-line trajectory planning has to be complemented by closed-loop concepts.

On the other hand, the experimental data given in Fig. 2 show that significant disturbances, such as an unexpected initial growth delay, may occur in real systems as well. For this reason, the classical receding horizon MPC with an off-line determined optimal reference trajectory will not always be the best solution, and an online optimization over the whole horizon has a larger potential (cf. Kawohl et al. 2007).

Summary and Future Directions

Advanced process control including soft sensors can significantly improve biotechnical processes. Using these techniques promotes quality and reproducibility of processes (Junker and Wang 2006). These methods should, however, not only be exploited in the production scale. For new pharmaceutical products, the time to market is the decisive factor. Methods of (model-based) monitoring and control can help here to speed up process development. In recent years, a clear trend can be seen in biotechnology to miniaturize and parallelize process development using multi-fermenter systems and robotic technologies. This trend gives rise to new challenges for modeling on the basis of huge data sets and for control at very small scales. At the same time, it is expected that a continued increase of information from bioinformatic tools will be available, which has to be utilized for process control as well. Going to large-scale cultivations adds further spatial dimensions to the problem. Now, the assumption of a well-stirred, ideally mixed reactor no longer holds. Substrate concentrations will be space dependent. Cells will frequently experience changing good and bad nutrient environments. Thus, mass transfer has to be accounted for, leading to partial differential equations as models for the process.

Cross-References

→ Control and Optimization of Batch Processes
→ Deterministic Description of Biochemical Networks
→ Extended Kalman Filters
→ Experiment Design and Identification for Control
→ Industrial MPC of Continuous Processes
→ Nominal Model-Predictive Control
→ Nonlinear System Identification: An Overview of Common Approaches

Bibliography

Bastin G, Dochain D (2008) On-line estimation and adaptive control of bioreactors. Elsevier, Amsterdam
Dochain D (2008) Bioprocess control. ISTE, London
Gnoth S, Jentzsch M, Simutis R, Lübbert A (2008) Control of cultivation processes for recombinant protein production: a review. Bioprocess Biosyst Eng 31:21–39
Goudar C, Biener R, Zhang C, Michaels J, Piret J, Konstantinov K (2006) Towards industrial application of quasi real-time metabolic flux analysis for mammalian cell culture. Adv Biochem Eng Biotechnol 101:99–118
Herold S, King R (2013) Automatic identification of structured process models based on biological phenomena detected in (fed-)batch experiments. Bioprocess Biosyst Eng. doi:10.1007/s00449-013-1100-6
Junker BH, Wang HY (2006) Bioprocess monitoring and computer control: key roots of the current PAT initiative. Biotechnol Bioeng 95:226–261
Kawohl M, Heine T, King R (2007) Model-based estimation and optimal control of fed-batch fermentation processes for the production of antibiotics. Chem Eng Process 11:1223–1241
King R (1997) A structured mathematical model for a class of organisms: part 1–2. J Biotechnol 52:219–244
Mandenius CF, Titchener-Hooker NJ (eds) (2013) Measurement, monitoring, modelling and control of bioprocesses. Advances in biochemical engineering biotechnology, vol 132. Springer, Heidelberg
Mangold M, Angeles-Palacios O, Ginkel M, Waschler R, Kinele A, Gilles ED (2005) Computer aided modeling of chemical and biological systems – methods, tools, and applications. Ind Eng Chem Res 44:2579–2591
Rani KY, Rao VSR (1999) Control of fermenters. Bioprocess Eng 21:77–88
Varner J, Ramkrishna D (1998) Application of cybernetic models to metabolic engineering: investigation of storage pathways. Biotech Bioeng 58:282–291

Control of Fluids and Fluid-Structure Interactions

Jean-Pierre Raymond
Institut de Mathématiques, Université Paul Sabatier Toulouse III & CNRS, Toulouse Cedex, France

Abstract

We introduce control and stabilization issues for fluid flows along with known results in the field. Some models coupling fluid flow equations and equations for rigid or elastic bodies are presented, together with a few controllability and stabilization results.

Keywords

Control; Fluid flows; Fluid-structure systems; Stabilization

Some Fluid Models

We consider a fluid flow occupying a bounded domain Ω_F ⊂ R^N, with N = 2 or N = 3, at the initial time t = 0, and a domain Ω_F(t)
at time t > 0. Let us denote by ρ(x, t) ∈ R₊ the density of the fluid at time t at the point x ∈ Ω_F(t) and by u(x, t) ∈ R^N its velocity. The fluid flow equations are derived by writing the mass conservation

  ∂ρ/∂t + div(ρu) = 0 in Ω_F(t), for t > 0,  (1)

and the balance of momentum

  ρ(∂u/∂t + (u·∇)u) = div σ + ρf in Ω_F(t), for t > 0,  (2)

where σ is the so-called constraint tensor and f represents a volumic force. For an isothermal fluid, there is no need to complete the system by the balance of energy. The physical nature of the fluid flow is taken into account in the choice of the constraint tensor σ. When the volume is preserved by the fluid flow transport, the fluid is called incompressible. The incompressibility condition reads as div u = 0 in Ω_F(t). The incompressible Navier-Stokes equations are the classical model to describe the evolution of isothermal, incompressible, and Newtonian fluid flows. When in addition the density of the fluid is assumed to be constant, ρ(x, t) = ρ₀, the equations reduce to

  div u = 0,
  ρ₀(∂u/∂t + (u·∇)u) = μΔu − ∇p + ρ₀f in Ω_F(t), t > 0,  (3)

which are obtained by setting

  σ = μ(∇u + (∇u)^T) + (λ − (2μ/3))(div u) I − pI  (4)

in Eq. (2). When div u = 0, the expression of σ simplifies. The coefficients μ > 0 and λ > 0 are the viscosity coefficients of the fluid, and p(x, t) its pressure at the point x ∈ Ω_F(t) and at time t > 0. This model has to be completed with boundary conditions on ∂Ω_F(t) and an initial condition at time t = 0.

The incompressible Euler equations with constant density are obtained by setting μ = 0 in the above system.

The compressible Navier-Stokes system is obtained by coupling the equation of conservation of mass Eq. (1) with the balance of momentum Eq. (2), where the tensor σ is defined by Eq. (4), and by completing the system with a constitutive law for the pressure.

Control Issues

There are unstable steady states of the Navier-Stokes equations which give rise to interesting control problems (e.g., to maximize the ratio "lift over drag"), but which cannot be observed in real life because of their unstable nature. In such situations, we would like to maintain the physical model close to an unstable steady state by the action of a control expressed in feedback form, that is, as a function either depending on an estimation of the velocity or depending on the velocity itself. The estimation of the velocity of the fluid may be recovered by using some real-time measurements. In that case, we speak of a feedback stabilization problem with partial information. Otherwise, when the control is expressed in terms of the velocity itself, we speak of a feedback stabilization problem with full information.

Another interesting issue is to maintain a fluid flow (described by the Navier-Stokes equations) in the neighborhood of a nominal trajectory (not necessarily a steady state) in the presence of perturbations. This is a much more complicated issue which is not yet solved.

In the case of a perturbation in the initial condition of the system (the initial condition at time t = 0 is different from the nominal velocity held at time t = 0), the exact controllability to the nominal trajectory consists in looking for controls driving the system in finite time to the desired trajectory.

Thus, control issues for fluid flows are those encountered in other fields. However there are
specific difficulties which make the corresponding problems challenging. When we deal with the incompressible Navier-Stokes system, the pressure plays the role of a Lagrange multiplier associated with the incompressibility condition. Thus, we have to deal with an infinite-dimensional nonlinear differential algebraic system. In the case of a Dirichlet boundary control, the elimination of the pressure, by using the so-called Leray or Helmholtz projector, leads to an unusual form of the corresponding control operator; see Raymond (2006). In the case of an internal control, the estimation of the pressure to prove observability inequalities is also quite tricky; see Fernandez-Cara et al. (2004). From the numerical viewpoint, the approximation of feedback control laws leads to very large-size problems, and new strategies have to be found for tackling these issues.

Moreover, the issues that we have described for the incompressible Navier-Stokes equations may be studied for other models like the compressible Navier-Stokes equations, the Euler equations (describing nonviscous fluid flows) both for compressible and incompressible models, or even more complicated models.

Feedback Stabilization of Fluid Flows

Let us now describe the known results for the incompressible Navier-Stokes equations in 2D or 3D bounded domains, with a control acting locally in a Dirichlet boundary condition. Let us consider a given steady state (u_s, p_s) satisfying the equation

  −μΔu_s + (u_s·∇)u_s + ∇p_s = f_s, and div u_s = 0 in Ω_F,

with some boundary conditions which may be of Dirichlet type or of mixed type (Dirichlet-Neumann-Navier type). For simplicity, we only deal with the case of Dirichlet boundary conditions

  u_s = g_s on ∂Ω_F,

where g_s and f_s are time-independent functions. In the case Ω_F(t) = Ω_F, not depending on t, the corresponding instationary model is

  ∂u/∂t − μΔu + (u·∇)u + ∇p = f_s and div u = 0 in Ω_F × (0, ∞),
  u = g_s + Σ_{i=1}^{N_c} f_i(t) g_i on ∂Ω_F × (0, ∞),  (5)
  u(0) = u₀ on Ω_F.

In this model, we assume that u₀ ≠ u_s, g_i are given functions with localized supports in ∂Ω_F, and f(t) = (f₁(t), …, f_{N_c}(t)) is a finite-dimensional control. Due to the incompressibility condition, the functions g_i have to satisfy

  ∫_{∂Ω_F} g_i · n = 0,

where n is the unit normal to ∂Ω_F, outward Ω_F.

The stabilization problem, with a prescribed decay rate α < 0, consists in looking for a control f in feedback form, that is, of the form

  f(t) = K(u(t) − u_s),  (6)

such that the solution to the Navier-Stokes system Eq. (5), with f defined by Eq. (6), obeys

  ‖e^{−αt}(u(t) − u_s)‖_Z ≤ φ(‖u₀ − u_s‖_Z),

for some norm ‖·‖_Z, provided ‖u₀ − u_s‖_Z is small enough and where φ is a nondecreasing function. The mapping K, called the feedback gain, may be chosen linear.

The usual procedure to solve this stabilization problem consists in writing the system satisfied by u − u_s, in linearizing this system, and in looking for a feedback control stabilizing this linearized model. The issue is first to study the stabilizability of the linearized model and, when it is stabilizable, to find a stabilizing feedback gain. Among the feedback gains that stabilize the linearized model, we have to find one able to stabilize, at least locally, the nonlinear system too. The linearized controlled system associated with Eq. (5) is
  ∂v/∂t − μΔv + (u_s·∇)v + (v·∇)u_s + ∇q = 0 and div v = 0 in Ω_F × (0, ∞),
  v = Σ_{i=1}^{N_c} f_i(t) g_i on ∂Ω_F × (0, ∞),  (7)
  v(0) = v₀ on Ω_F.

The easiest way for proving the stabilizability of the controlled system Eq. (7) is to verify the Hautus criterion. It consists in proving the following unique continuation result. If (λ_j, φ_j, π_j) is the solution to the eigenvalue problem

  λ_j φ_j − μΔφ_j − (u_s·∇)φ_j + (∇u_s)^T φ_j + ∇π_j = 0 and div φ_j = 0 in Ω_F,
  φ_j = 0 on ∂Ω_F, Re λ_j ≥ α,  (8)

and if in addition (φ_j, π_j) satisfies

  ∫_{∂Ω_F} g_i · σ(φ_j, π_j) n = 0 for all 1 ≤ i ≤ N_c,

then (φ_j, π_j) = 0. By using a unique continuation theorem due to Fabre and Lebeau (1996), we can explicitly determine the functions g_i so that this condition is satisfied; see Raymond and Thevenet (2010). For feedback stabilization results of the Navier-Stokes equations in two or three dimensions, we refer to Fursikov (2004), Raymond (2006), Barbu et al. (2006), Raymond (2007), Badra (2009), and Vazquez and Krstic (2008).

Controllability to Trajectories of Fluid Flows

If (ũ(t), p̃(t)), 0 ≤ t < ∞, is a solution to the Navier-Stokes system, the controllability problem to the trajectory (ũ(t), p̃(t)), in time T > 0, may be rewritten as a null controllability problem satisfied by (v, q) = (u − ũ, p − p̃). The local null controllability in time T > 0 follows from the null controllability of the linearized system and from a fixed point argument. The linearized controlled system is

  ∂v/∂t − μΔv + (ũ(t)·∇)v + (v·∇)ũ(t) + ∇q = 0 and div v = 0 in Ω_F × (0, T),
  v = m_c f on ∂Ω_F × (0, T),  (9)
  v(0) = v₀ ∈ L²(Ω_F; R^N), div v₀ = 0.

The nonnegative function m_c is used to localize the boundary control f. The control f is assumed to satisfy

  ∫_{∂Ω_F} m_c f · n = 0.  (10)

As for general linear dynamical systems, the null controllability of the linearized system follows from an observability inequality for the solutions to the following adjoint system

  −∂ψ/∂t − μΔψ − (ũ(t)·∇)ψ + (∇ũ(t))^T ψ + ∇π = 0 and div ψ = 0 in Ω_F × (0, T),
  ψ = 0 on ∂Ω_F × (0, T),  (11)
  ψ(T) ∈ L²(Ω_F; R^N), div ψ(T) = 0.

Contrary to the stabilization problem, the null controllability by a control of finite dimension seems to be out of reach, and it will be impossible in general. We look for a control f ∈ L²(∂Ω_F × (0, T); R^N), satisfying Eq. (10), driving the solution to system Eq. (9) in time T to zero, that is, such that the solution v_{v₀,f} obeys v_{v₀,f}(T) = 0. The linearized system Eq. (9) is null controllable in time T > 0 by a boundary control f ∈ L²(∂Ω_F × (0, T); R^N) obeying Eq. (10), if and only if there exists C > 0 such that

  ∫_{Ω_F} |ψ(0)|² dx ≤ C ∫_{∂Ω_F × (0,T)} m_c |σ(ψ, π)n|² dx dt,  (12)

for all solutions (ψ, π) of Eq. (11). The observability inequality Eq. (12) may be proved by establishing weighted energy estimates called "Carleman-type estimates"; see Fernandez-Cara et al. (2004) and Fursikov and Imanuvilov (1996).
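The Hautus criterion invoked in the stabilization section above has a classical finite-dimensional counterpart, which may help build intuition: a pair (A, B) is stabilizable if and only if the matrix [λI − A, B] has full row rank at every eigenvalue λ of A with Re λ ≥ 0 (the PBH test). The following sketch is purely illustrative and not from this article; the matrices, the tolerance, and the assumption that the eigenvalues are supplied explicitly are all hypothetical choices.

```python
def rank(rows, tol=1e-9):
    """Rank of a small dense matrix (list of row lists) via Gaussian elimination."""
    m = [row[:] for row in rows]
    r = 0
    for c in range(len(m[0])):
        piv = max(range(r, len(m)), key=lambda i: abs(m[i][c]), default=None)
        if piv is None or abs(m[piv][c]) < tol:
            continue                       # no usable pivot in this column
        m[r], m[piv] = m[piv], m[r]
        for i in range(len(m)):
            if i != r:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def hautus_stabilizable(A, B, eigenvalues, tol=1e-9):
    """PBH test: rank [lam*I - A, B] must equal n at each eigenvalue with Re >= 0."""
    n = len(A)
    for lam in eigenvalues:
        if complex(lam).real >= 0:
            M = [[(lam if i == j else 0) - A[i][j] for j in range(n)] + B[i]
                 for i in range(n)]
            if rank(M, tol) < n:
                return False
    return True

A = [[1.0, 0.0], [0.0, -1.0]]              # one unstable mode (eigenvalue 1)
good_B = hautus_stabilizable(A, [[1.0], [0.0]], [1.0, -1.0])   # actuates that mode
bad_B = hautus_stabilizable(A, [[0.0], [1.0]], [1.0, -1.0])    # misses it
```

In the article's infinite-dimensional setting, the analogous rank condition becomes the unique continuation property for the eigenvalue problem (8), which is why the test reduces to checking the boundary integrals involving σ(φ_j, π_j)n.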
Additional Controllability Results for Other Fluid Flow Models

The null controllability of the 2D incompressible Euler equation has been obtained by J.-M. Coron with the so-called Return Method (Coron 1996). See also Coron (2007) for additional references (in particular, the 3D case has been treated by O. Glass).

Some null controllability results for the one-dimensional compressible Navier-Stokes equations have been obtained in Ervedoza et al. (2012).

Fluid-Structure Models

Fluid-structure models are obtained by coupling an equation describing the evolution of the fluid flow with an equation describing the evolution of the structure. The coupling comes from the balance of momentum and by writing that at the fluid-structure interface, the fluid velocity is equal to the displacement velocity of the structure.

The most important difficulty in studying those models comes from the fact that the domain occupied by the fluid at time t evolves and depends on the displacement of the structure. In addition, when the structure is deformable, its evolution is usually written in Lagrangian coordinates while fluid flows are usually described in Eulerian coordinates.

The structure may be a rigid or a deformable body immersed into the fluid. It may also be a deformable structure located at the boundary of the domain occupied by the fluid.

A Rigid Body Immersed in a Three-Dimensional Incompressible Viscous Fluid

In the case of a 3D rigid body Ω_S(t) immersed in a fluid flow occupying the domain Ω_F(t), the motion of the rigid body may be described by the position h(t) ∈ R³ of its center of mass and by a matrix of rotation Q(t) ∈ R^{3×3}. The domain Ω_S(t) and the flow X_S associated with the motion of the structure obey

  X_S(y, t) = h(t) + Q(t)Q₀⁻¹(y − h(0)), for y ∈ Ω_S(0) = Ω_S,  (13)
  Ω_S(t) = X_S(Ω_S(0), t),

and the matrix Q(t) is related to the angular velocity ω : (0, T) → R³ by the differential equation

  Q′(t) = ω(t) × Q(t), Q(0) = Q₀.  (14)

We consider the case when the fluid flow satisfies the incompressible Navier-Stokes system Eq. (3) in the domain Ω_F(t) corresponding to Fig. 1. Denoting by J(t) ∈ R^{3×3} the tensor of inertia at time t, and by m the mass of the rigid body, the equations of the structure are obtained by writing the balance of linear and angular momenta

  m h″ = ∫_{∂Ω_S(t)} σ(u, p) n dx,
  J ω′ = J ω × ω + ∫_{∂Ω_S(t)} (x − h) × σ(u, p) n dx,  (15)
  h(0) = h₀, h′(0) = h₁, ω(0) = ω₀,

where n is the normal to ∂Ω_S(t) outward Ω_F(t). The system Eqs. (3) and (13)–(15) has to be completed with boundary conditions. At the fluid-structure interface, the fluid velocity is equal to the displacement velocity of the rigid solid:

  u(x, t) = h′(t) + ω(t) × (x − h(t)),  (16)

for all x ∈ ∂Ω_S(t), t > 0.

[Control of Fluids and Fluid-Structure Interactions, Fig. 1]

The exterior boundary of the fluid domain is assumed to be fixed
Γ_e = ∂Ω_F(t)\∂Ω_S(t). The boundary condition on Γ_e × (0, T) may be of the form

  u = m_c f on Γ_e × (0, ∞),  (17)

with ∫_{Γ_e} m_c f · n = 0, where f is a control and m_c a localization function.

An Elastic Beam Located at the Boundary of a Two-Dimensional Domain Filled by an Incompressible Viscous Fluid

When the structure is described by an infinite-dimensional model (a partial differential equation or a system of p.d.e.), there are a few existence results for such systems, mainly existence of weak solutions (Chambolle et al. 2005). But for stabilization and control problems of nonlinear systems, we are usually interested in strong solutions. Let us describe a two-dimensional model in which a one-dimensional structure is located on a flat part Γ_S = (0, L)×{y₀} of the boundary of the reference configuration of the fluid domain Ω_F. We assume that the structure is a Euler-Bernoulli beam with or without damping. The displacement η of the structure in the direction normal to the boundary Γ_S is described by the partial differential equation

  η_tt − b η_xx − c η_txx + a η_xxxx = F in Γ_S × (0, ∞),
  η = 0 and η_x = 0 on ∂Γ_S × (0, ∞),  (18)
  η(0) = η₀¹ and η_t(0) = η₀² in Γ_S,

where η_x, η_xx, and η_xxxx stand for the first, the second, and the fourth derivative of η with respect to x ∈ Γ_S. The other derivatives are defined in a similar way. The coefficients b and c are nonnegative, and a > 0. The term −c η_txx is a structural damping term. At time t, the structure occupies the position Γ_S(t) = {(x, y) | x ∈ (0, L), y = y₀ + η(x, t)}. When Ω_F is a two-dimensional model, Γ_S is of dimension one, and ∂Γ_S is reduced to the two extremities of Γ_S. The momentum balance is obtained by writing that F in Eq. (18) is given by F = −√(1 + η_x²) σ(u, p)ñ · n, where ñ(x, y) is the unit normal at (x, y) ∈ Γ_S(t) to Γ_S(t) outward Ω_F(t), and n is the unit normal to Γ_S outward Ω_F(0) = Ω_F. If in addition a control f acts as a distributed control in the beam equation, we shall have

  F = −√(1 + η_x²) σ(u, p)ñ · n + f.  (19)

The equality of velocities on Γ_S(t) reads as

  u(x, y₀ + η(x, t)) = (0, η_t(x, t)), x ∈ (0, L), t > 0.  (20)

Control of Fluid-Structure Models

To control or to stabilize fluid-structure models, the control may act either in the fluid equation or in the structure equation or in both equations. There are very few controllability and stabilization results for systems coupling the incompressible Navier-Stokes system with a structure equation. We state below two of those results. Some other results are obtained for simplified one-dimensional models coupling the viscous Burgers equation with the motion of a mass; see Badra and Takahashi (2013) and the references therein.

We also have to mention here recent papers on control problems for systems coupling quasi-stationary Stokes equations with the motion of deformable bodies, modeling microorganism swimmers at low Reynolds number; see Alouges et al. (2008).

Null Controllability of the Navier-Stokes System Coupled with the Motion of a Rigid Body

The system coupling the incompressible Navier-Stokes system Eq. (3) in the domain drawn in Fig. 1, with the motion of a rigid body
described by Eqs. (13)–(16), with the boundary control Eq. (17), is null controllable locally in a neighborhood of 0. Before linearizing the system in a neighborhood of 0, the fluid equations have to be rewritten in Lagrangian coordinates, that is, in the cylindrical domain Ω_F × (0, ∞). The linearized system is the Stokes system coupled with a system of ordinary differential equations. The proof of this null controllability result relies on a Carleman estimate for the adjoint system; see, e.g., Boulakia and Guerrero (2013).

Feedback Stabilization of the Navier-Stokes System Coupled with a Beam Equation

The system coupling the incompressible Navier-Stokes system Eq. (3) in the domain drawn in Fig. 2, with beam Eqs. (18)–(20), can be locally stabilized with any prescribed exponential decay rate α < 0, by a feedback control f acting in Eq. (18) via Eq. (19); see Raymond (2010). The proof consists in showing that the infinitesimal generator of the linearized model is an analytic semigroup (when c > 0), that its resolvent is compact, and that the Hautus criterion is satisfied.

When the control acts in the fluid equation, the system coupling Eq. (3) in the domain drawn in Fig. 2, with the beam Eqs. (18)–(20), can be stabilized when c > 0. To the best of our knowledge, there is no null controllability result for such systems, even with controls acting both in the structure and fluid equations. The case where the beam equation is approximated by a finite-dimensional model is studied in Lequeurre (2013).

[Control of Fluids and Fluid-Structure Interactions, Fig. 2]

Cross-References

→ Bilinear Control of Schrödinger PDEs
→ Boundary Control of Korteweg-de Vries and Kuramoto–Sivashinsky PDEs
→ Motion Planning for PDEs

Bibliography

Alouges F, DeSimone A, Lefebvre A (2008) Optimal strokes for low Reynolds number swimmers: an example. J Nonlinear Sci 18:277–302
Badra M (2009) Lyapunov function and local feedback boundary stabilization of the Navier-Stokes equations. SIAM J Control Optim 48:1797–1830
Badra M, Takahashi T (2013) Feedback stabilization of a simplified 1d fluid-particle system. Ann Inst H Poincaré Anal Non Linéaire. http://dx.doi.org/10.1016/j.anihpc.2013.03.009
Barbu V, Lasiecka I, Triggiani R (2006) Tangential boundary stabilization of Navier-Stokes equations. Mem Am Math Soc 181(852):128
Boulakia M, Guerrero S (2013) Local null controllability of a fluid-solid interaction problem in dimension 3. J Eur Math Soc 15:825–856
Chambolle A, Desjardins B, Esteban MJ, Grandmont C (2005) Existence of weak solutions for unsteady fluid-plate interaction problem. J Math Fluid Mech 7:368–404
Coron J-M (1996) On the controllability of 2-D incompressible perfect fluids. J Math Pures Appl 75(9):155–188
Coron J-M (2007) Control and nonlinearity. American Mathematical Society, Providence
Ervedoza S, Glass O, Guerrero S, Puel J-P (2012) Local exact controllability for the one-dimensional compressible Navier-Stokes equation. Arch Ration Mech Anal 206:189–238
Fabre C, Lebeau G (1996) Prolongement unique des solutions de l'équation de Stokes. Commun Partial Differ Equ 21:573–596
Fernandez-Cara E, Guerrero S, Imanuvilov Yu O, Puel J-P (2004) Local exact controllability of the Navier-Stokes system. J Math Pures Appl 83:1501–1542
Fursikov AV (2004) Stabilization for the 3D Navier-Stokes system by feedback boundary control. Partial differential equations and applications. Discrete Contin Dyn Syst 10:289–314
Fursikov AV, Imanuvilov Yu O (1996) Controllability of evolution equations. Lecture notes series, vol 34. Seoul National University, Research Institute of Mathematics, Global Analysis Research Center, Seoul
Lequeurre J (2013) Null controllability of a fluid-structure system. SIAM J Control Optim 51:1841–1872
Raymond J-P (2006) Feedback boundary stabilization of the two dimensional Navier-Stokes equations. SIAM J Control Optim 45:790–828
Raymond J-P (2007) Feedback boundary stabilization of the three-dimensional incompressible Navier-Stokes equations. J Math Pures Appl 87:627–669
Raymond J-P (2010) Feedback stabilization of a fluid–structure model. SIAM J Control Optim 48:5398–5443
Raymond J-P, Thevenet L (2010) Boundary feedback stabilization of the two dimensional Navier-Stokes equations with finite dimensional controllers. Discret Contin Dyn Syst A 27:1159–1187
Vazquez R, Krstic M (2008) Control of turbulent and magnetohydrodynamic channel flows: boundary stabilization and estimation. Birkhäuser, Boston

Control of Linear Systems with Delays

Wim Michiels
KU Leuven, Leuven (Heverlee), Belgium

Abstract

The presence of time delays in dynamical systems may induce complex behavior, and this behavior is not always intuitive. Even if a system's equation is scalar, oscillations may occur. Time delays in control loops are usually associated with degradation of performance and robustness, but, at the same time, there are situations where time delays are used as controller parameters.

Keywords

Delay differential equations; Delays as controller parameters; Functional differential equation

Introduction

Time-delays are important components of many systems from engineering, economics, and the life sciences, due to the fact that the transfer of material, energy, and information is mostly not instantaneous. They appear, for instance, as computation and communication lags, they model transport phenomena and heredity, and they arise as feedback delays in control loops. An overview of applications, ranging from traffic flow control and lasers with phase-conjugate feedback, over (bio)chemical reactors and cancer modeling, to control of communication networks and control via networks, is included in Sipahi et al. (2011).

The aim of this contribution is to describe some fundamental properties of linear control systems subjected to time-delays and to outline principles behind analysis and synthesis methods. Throughout the text, the results will be illustrated by means of the scalar system

  ẋ(t) = u(t − τ),  (1)

which, controlled with instantaneous state feedback, u(t) = −kx(t), leads to the closed-loop system

  ẋ(t) = −kx(t − τ).  (2)

Although this didactic example is extremely simple, we shall see that its dynamics are already very rich and shed light on delay effects in control loops.

In some works, the analysis of (2) is called the hot shower problem, as it can be interpreted as an (over)simplified model for a human adjusting the temperature in a shower: x(t) then denotes the difference between the water temperature and the desired temperature as felt by the person, the term −kx(t) models the reaction of the person by further opening or closing taps, and the delay is due to the propagation with finite speed of the water in the ducts.

Basic Properties of Time-Delay Systems

Functional Differential Equation

We focus on a model for a time-delay system described by

  ẋ(t) = A₀x(t) + A₁x(t − τ), x(t) ∈ Rⁿ.  (3)
168 Control of Linear Systems with Delays
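A quick way to see the delay effects described above is to integrate the scalar loop (2) numerically. The sketch below (plain Python, forward Euler with an assumed step size; not a routine from the article) uses a constant initial history and shows that the loop decays for small $k\tau$ but oscillates with growth once $k\tau$ exceeds $\pi/2$:

```python
def simulate(k, tau, t_end, dt=1e-3):
    """Forward-Euler integration of x'(t) = -k x(t - tau) with constant
    initial history x(t) = 1 on [-tau, 0]."""
    n_delay = round(tau / dt)              # steps spanning one delay interval
    x = [1.0] * (n_delay + 1)              # stored trajectory, newest last
    for _ in range(round(t_end / dt)):
        # the delayed sample x(t - tau) is simply n_delay steps back
        x.append(x[-1] - dt * k * x[-1 - n_delay])
    return x

x_stable = simulate(k=0.5, tau=1.0, t_end=40.0)    # k*tau = 0.5 < pi/2
x_unstable = simulate(k=2.0, tau=1.0, t_end=40.0)  # k*tau = 2.0 > pi/2
print(max(abs(v) for v in x_stable[-2000:]))       # decays toward 0
print(max(abs(v) for v in x_unstable[-2000:]))     # oscillation with growth
```

Note that the simulation must store a whole window of past values, which mirrors the fact that the state of a delay system is a function segment rather than a point.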

This is an example of a functional differential equation (FDE) of retarded type. The term FDE stems from the property that the right-hand side can be interpreted as a functional evaluated at a piece of the trajectory. The term retarded expresses that the right-hand side does not explicitly depend on $\dot{x}$.

As a first difference with an ordinary differential equation, the initial condition of (3) at $t = 0$ is a function from $[-\tau, 0]$ to $\mathbb{R}^n$. For all $\phi \in C([-\tau,0]; \mathbb{R}^n)$, where $C([-\tau,0]; \mathbb{R}^n)$ is the space of continuous functions mapping the interval $[-\tau,0]$ into $\mathbb{R}^n$, a forward solution $x(\phi)$ exists and is uniquely defined. In Fig. 1, a solution of the scalar system (2) is shown.

The discontinuity in the derivative at $t = 0$ stems from $A_0\phi(0) + A_1\phi(-\tau) \neq \lim_{\theta \to 0^-} \dot{\phi}(\theta)$. Due to the smoothing property of an integrator, however, at $t = n\tau$, $n \in \mathbb{N}$, the discontinuity will only be present in the $(n+1)$th derivative. This illustrates a second property of functional differential equations of retarded type: solutions become smoother as time evolves. As a third major difference with ODEs, backward continuation of solutions is not always possible (Michiels and Niculescu 2007).

Reformulation in a First-Order Form
The state of system (3) at time $t$ is the minimal information needed to continue the solution, which, once again, boils down to a function segment $x_t(\phi)$, where $x_t(\phi)(\theta) = x(t+\theta)$, $\theta \in [-\tau, 0]$ (in Fig. 1, the function $x_t$ is shown in red for $t = 5$). This suggests that (3) can be reformulated as a standard ordinary differential equation over the infinite-dimensional space $C([-\tau,0]; \mathbb{R}^n)$. This equation takes the form

    $\frac{d}{dt} z(t) = \mathcal{A} z(t), \quad z(t) \in C([-\tau,0]; \mathbb{R}^n),$    (4)

where the operator $\mathcal{A}$ is given by

    $D(\mathcal{A}) = \left\{ \phi \in C([-\tau,0]; \mathbb{R}^n) :\ \dot{\phi} \in C([-\tau,0]; \mathbb{R}^n),\ \dot{\phi}(0) = A_0\phi(0) + A_1\phi(-\tau) \right\},$
    $\mathcal{A}\phi = \frac{d\phi}{d\theta}.$    (5)

The relation between the solutions of (3) and (4) is given by $z(t)(\theta) = x(t+\theta)$, $\theta \in [-\tau,0]$. Note that all system information is concentrated in the nonlocal boundary condition describing the domain of $\mathcal{A}$. The representation (4) is closely related to a description by an advection PDE with a nonlocal boundary condition (Krstic 2009).

Control of Linear Systems with Delays, Fig. 1 Solution of (2) for $\tau = 1$, $k = 1$, and initial condition $\phi \equiv 1$

Asymptotic Growth Rate of Solutions and Stability
The reformulation of (3) into the standard form (4) allows us to define stability notions and to generalize the stability theory for ordinary differential equations in a straightforward way, with the main change that the state space is $C([-\tau,0]; \mathbb{R}^n)$. For example, the null solution of (3) is exponentially stable if and only if there exist constants $C > 0$ and $\gamma > 0$ such that

    $\forall \phi \in C([-\tau,0]; \mathbb{R}^n): \quad \|x_t(\phi)\|_s \leq C e^{-\gamma t} \|\phi\|_s,$

where $\|\cdot\|_s$ is the supremum norm, $\|\phi\|_s = \sup_{\theta \in [-\tau,0]} \|\phi(\theta)\|_2$. As the system is linear,
asymptotic stability and exponential stability are equivalent. A direct generalization of Lyapunov's second method yields:

Theorem 1 The null solution of the linear system (3) is asymptotically stable if there exist a continuous functional $V: C([-\tau,0]; \mathbb{R}^n) \to \mathbb{R}$ (a so-called Lyapunov-Krasovskii functional) and continuous nondecreasing functions $u, v, w: \mathbb{R}^+ \to \mathbb{R}^+$ with

    $u(0) = v(0) = w(0) = 0$ and $u(s) > 0,\ v(s) > 0,\ w(s) > 0$ for $s > 0$,

such that for all $\phi \in C([-\tau,0]; \mathbb{R}^n)$

    $u(\|\phi(0)\|_2) \leq V(\phi) \leq v(\|\phi\|_s),$
    $\dot{V}(\phi) \leq -w(\|\phi(0)\|_2),$

where

    $\dot{V}(\phi) = \limsup_{h \to 0^+} \frac{1}{h} \left[ V(x_h(\phi)) - V(\phi) \right].$

Converse Lyapunov theorems and the construction of so-called complete-type Lyapunov-Krasovskii functionals are discussed in Kharitonov (2013). Imposing a particular structure on the functional, e.g., a form depending only on a finite number of free parameters, often leads to easy-to-check stability criteria (for instance, in the form of LMIs), yet, as a price to pay, the obtained results may be conservative in the sense that the sufficient stability conditions might not be close to necessary conditions.

As an alternative to Lyapunov functionals, Lyapunov functions can be used as well, provided that the condition $\dot{V} < 0$ is relaxed (the so-called Lyapunov-Razumikhin approach); see, for example, Gu et al. (2003).

Delay Differential Equations as Perturbations of ODEs
Many results on stability, robust stability, and control of time-delay systems are explicitly or implicitly based on a perturbation point of view, where delay differential equations are seen as perturbations of ordinary differential equations. For instance, in the literature, a classification of stability criteria is often presented in terms of delay-independent criteria (conditions holding for all values of the delays) and delay-dependent criteria (usually holding for all delays smaller than a bound). This classification has its origin in two different ways of seeing (3) as a perturbation of an ODE, with nominal system $\dot{x}(t) = A_0 x(t)$ and $\dot{x}(t) = (A_0 + A_1) x(t)$ (the system for zero delay), respectively. This observation is illustrated in Fig. 2 for results based on input-output and Lyapunov-based approaches.

The Spectrum of Linear Time-Delay Systems

Two Eigenvalue Problems
The substitution of an exponential solution in (3) leads us to the nonlinear eigenvalue problem

    $(\lambda I - A_0 - A_1 e^{-\lambda\tau}) v = 0, \quad \lambda \in \mathbb{C},\ v \in \mathbb{C}^n,\ v \neq 0.$    (6)

The solutions of the equation $\det(\lambda I - A_0 - A_1 e^{-\lambda\tau}) = 0$ are called characteristic roots. Similarly, formulation (4) leads to the equivalent infinite-dimensional linear eigenvalue problem

    $(\lambda I - \mathcal{A}) u = 0, \quad \lambda \in \mathbb{C},\ u \in C([-\tau,0]; \mathbb{C}^n),\ u \not\equiv 0.$    (7)

The combination of these two viewpoints lies at the basis of most methods for computing characteristic roots; see Michiels (2012). On the one hand, discretizing (7), i.e., approximating $\mathcal{A}$ with a matrix, and solving the resulting standard eigenvalue problem allows one to obtain global information, for example, estimates of all characteristic roots in a given compact set or in a given right half plane. On the other hand, the (finitely many) nonlinear equations (6) allow one to make local corrections on characteristic root approximations up to the desired accuracy, e.g., using Newton's method or inverse residual iteration. Linear time-delay systems satisfy a spectrum-determined growth property of solutions. For instance, the zero solution of (3) is asymptotically stable if and only if all characteristic roots are in the open left half plane.

Control of Linear Systems with Delays, Fig. 2 The classification of stability criteria into delay-independent results and delay-dependent results stems from two different perturbation viewpoints (in the original figure, the perturbation terms are printed in red). Delay-independent results view the system as $\dot{x}(t) = A_0 x(t) + A_1 x(t-\tau)$: in the input-output setting, the feedback interconnection of $(\lambda I - A_0)^{-1} A_1$ and $e^{-\lambda\tau} I$ is analyzed, using $|e^{-j\omega\tau}| = 1$; in the Lyapunov setting, $V = x^T P x + \ldots$, where $A_0^T P + P A_0 < 0$. Delay-dependent results view the system as $\dot{x}(t) = (A_0 + A_1) x(t) + A_1 (x(t-\tau) - x(t))$: in the input-output setting, the interconnection of $(\lambda I - (A_0 + A_1))^{-1} A_1 \lambda$ and $\frac{e^{-\lambda\tau} - 1}{\lambda} I$ is analyzed, using $\left|\frac{e^{-j\omega\tau} - 1}{j\omega}\right| \leq \tau$; in the Lyapunov setting, $V = x^T P x + \ldots$, where $(A_0 + A_1)^T P + P(A_0 + A_1) < 0$
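The delay-independent, input-output criterion sketched in Fig. 2 can be illustrated numerically: since $|e^{-j\omega\tau}| = 1$ for every $\omega$ and $\tau$, a small-gain argument gives stability for all delays whenever $A_0$ is stable and the magnitude of $(j\omega I - A_0)^{-1} A_1$ stays below 1 for all frequencies. A minimal frequency sweep on the assumed scalar system $\dot{x}(t) = -2x(t) + x(t-\tau)$ (chosen here purely for illustration):

```python
# Scalar system x'(t) = a0*x(t) + a1*x(t - tau) with a0 = -2, a1 = 1.
# The loop gain |a1 / (j*omega - a0)| = 1/sqrt(4 + omega^2) peaks at
# omega = 0, where it equals 0.5 < 1, so the small-gain condition of the
# delay-independent criterion holds: stability for every tau >= 0.
a0, a1 = -2.0, 1.0
peak = max(abs(a1 / complex(-a0, w)) for w in (0.01 * i for i in range(10000)))
print(peak)   # 0.5
```

The bound is tight at $\omega = 0$ here; for matrix-valued $A_0$, $A_1$ the same sweep would use a matrix norm of the frequency response instead of a scalar magnitude.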

Control of Linear Systems with Delays, Fig. 3 (Left) Rightmost characteristic roots of (2) for $k\tau = 1$, plotted in the $(\Re(\lambda\tau), \Im(\lambda\tau))$ plane. (Right) Real parts of the rightmost characteristic roots as a function of $k\tau$

In Fig. 3 (left), the rightmost characteristic roots of (2) are depicted for $k\tau = 1$. Note that since the characteristic equation can be written as $\lambda\tau + k\tau\, e^{-\lambda\tau} = 0$, $k$ and $\tau$ can be combined into one parameter. In Fig. 3 (right), we show the real parts of the characteristic roots as a function of $k\tau$. The plots illustrate some important spectral properties of retarded-type FDEs. First, even though there are in general infinitely many characteristic roots, the number of them in any right half plane is always finite. Second, the individual characteristic roots, as well as the spectral abscissa, i.e., the supremum of the real parts of all characteristic roots, depend continuously on parameters. Related to this, a loss or gain of stability is always associated with characteristic roots crossing the imaginary axis. Figure 3 (right) also illustrates the transition to a delay-free system as $k\tau \to 0^+$.

Critical Delays: A Finite-Dimensional Characterization
Assume that for a given value of $k$, we are looking for values of the delay $\tau_c$ for which (2)

has a characteristic root $j\omega_c$ on the imaginary axis. From $j\omega = -k e^{-j\omega\tau}$, we get

    $\omega_c = k, \quad \tau_c = \frac{\frac{\pi}{2} + 2\pi l}{\omega_c}, \quad l = 0, 1, \ldots, \qquad \Re\left[\left(\frac{d\lambda}{d\tau}(\tau_c, j\omega_c)\right)^{-1}\right] = \frac{1}{\omega_c^2} > 0.$    (8)

Critical delay values $\tau_c$ are indicated with green circles in Fig. 3 (right). The above formulas first illustrate an invariance property of imaginary axis roots and of their crossing direction with respect to delay shifts of $2\pi/\omega_c$. Second, the number of possible values of $\omega_c$ is one and thus finite. More generally, substituting $\lambda = j\omega$ in (6) and treating $\tau$ as a free parameter leads to a two-parameter eigenvalue problem

    $(j\omega I - A_0 - A_1 z) v = 0,$    (9)

with $\omega$ on the real axis and $z := \exp(-j\omega\tau)$ on the unit circle. Most methods to solve such a problem boil down to an elimination of one of the independent variables $\omega$ or $z$. As an example of an elimination technique, we directly get from (9)

    $j\omega \in \sigma(A_0 + A_1 z),\ -j\omega \in \sigma(A_0 + A_1 z^{-1}) \ \Rightarrow\ \det\left( (A_0 + A_1 z) \oplus (A_0 + A_1 z^{-1}) \right) = 0,$

where $\sigma(\cdot)$ denotes the spectrum and $\oplus$ the Kronecker sum. Clearly, the resulting eigenvalue problem in $z$ is finite dimensional.

Control of Linear Time-Delay Systems

Limitations Induced by Delays
It is well known that delays in a control loop may lead to a significant degradation of performance and robustness and even to instability (Niculescu 2001; Richard 2003). Let us return to example (2). As illustrated by Fig. 3 and expressions (8), the system loses stability when $\tau$ reaches the value $\pi/2k$, and stability cannot be recovered for larger delays. The maximum achievable exponential decay rate of the solutions, which corresponds to the minimum of the spectral abscissa, is given by $1/\tau$; hence, large delays can only be tolerated at the price of a degradation of the rate of convergence. It should be noted that the limitations induced by delays are even more stringent if the uncontrolled system is exponentially unstable, which is not the case for (2).

The analysis in the previous sections gives a hint why control is difficult in the presence of delays: the system is inherently infinite dimensional. As a consequence, most control design problems which involve determining a finite number of parameters can be interpreted as reduced-order control design problems or as control design problems for under-actuated systems, both of which are known to be hard problems.

Fixed-Order Control
Most standard control design techniques lead to controllers whose dimension is larger than or equal to the dimension of the system. For an infinite-dimensional time-delay system, such controllers might have the disadvantage of being complicated and hard to implement. To see this, note that for a system with delay in the state, the generalization of static state feedback, $u(t) = K x(t)$, is given by $u(t) = \int_{-\tau}^{0} x(t+\theta)\, d\mu(\theta)$, where $\mu$ is a function of bounded variation. However, in the context of large-scale systems, it is known that reduced-order controllers often perform relatively well compared to full-order controllers, while they are much easier to implement.

Recently, new methods for the design of controllers with a prescribed order (dimension) or structure have been proposed (Michiels 2012). These methods rely on a direct optimization of appropriately defined cost functions (spectral abscissa, $H_2$/$H_\infty$ criteria). While $H_2$ criteria can be addressed within a derivative-based optimization framework, $H_\infty$ criteria and the spectral abscissa require targeted methods for non-smooth optimization problems. To illustrate the need for such methods, consider again Fig. 3 (right): minimizing the spectral abscissa for a given value of $\tau$ as a function of the controller gain $k$ leads to an optimum where the objective function is not differentiable, and not even locally Lipschitz, as shown

by the red circle. In the case of multiple controller parameters, the path of steepest descent in the parameter space typically has phases along a manifold characterized by the non-differentiability of the objective function.

Using Delays as Controller Parameters
In contrast to the detrimental effects of delays, there are situations where delays have a beneficial effect and are even used as controller parameters; see Sipahi et al. (2011). For instance, delayed feedback can be used to stabilize oscillatory systems, where the delay serves to adjust the phase in the control loop. Delayed terms in control laws can also be used to approximate derivatives in the control action. Control laws which depend on the difference $x(t) - x(t-\tau)$, the so-called Pyragas-type feedback, have the property that the position of equilibria and the shape of periodic orbits with period $\tau$ are not affected, in contrast to their stability properties. Last but not least, delays can be used in control schemes to generate predictions or to stabilize predictors, which allows one to compensate delays and improve performance (Krstic 2009; Zhong 2006). Let us illustrate the main idea once more with system (1).

System (1) has a special structure, in the sense that the delay is only in the input, and it is advantageous to exploit this structure in the context of control. Coming back to the didactic example, the person who is taking a shower is – possibly after some bad experiences – aware of the delay and will take into account his/her prediction of the system's reaction when adjusting the cold and hot water supply. Let us, to conclude, formalize this. The uncontrolled system can be rewritten as $\dot{x}(t) = v(t)$, where $v(t) = u(t-\tau)$. We know $u$ up to the current time $t$; thus, we know $v$ up to time $t+\tau$, and if $x(t)$ is also known, we can predict the value of $x$ at time $t+\tau$,

    $x_p(t+\tau) = x(t) + \int_t^{t+\tau} v(s)\, ds = x(t) + \int_{t-\tau}^{t} u(s)\, ds,$

and use the predicted state for feedback. With the control law $u(t) = -k x_p(t+\tau)$, there is only one closed-loop characteristic root, at $\lambda = -k$; i.e., as long as the model used in the predictor is exact, the delay in the loop is compensated by the prediction. For further reading on prediction-based controllers, see, e.g., Krstic (2009) and the references therein.

Conclusions

Time-delay systems, which appear in a large number of applications, are a class of infinite-dimensional systems, resulting in rich dynamics and challenges from a control point of view. The different representations and interpretations and, in particular, the combination of viewpoints lead to a wide variety of analysis and synthesis tools.

Cross-References

- Control of Nonlinear Systems with Delays
- H-Infinity Control
- H2 Optimal Control
- Optimization-Based Control Design Techniques and Tools

Bibliography

Gu K, Kharitonov VL, Chen J (2003) Stability of time-delay systems. Birkhäuser, Basel
Kharitonov VL (2013) Time-delay systems. Lyapunov functionals and matrices. Birkhäuser, Basel
Krstic M (2009) Delay compensation for nonlinear, adaptive, and PDE systems. Birkhäuser, Basel
Michiels W (2012) Design of fixed-order stabilizing and H2 - H-infinity optimal controllers: an eigenvalue optimization approach. In: Time-delay systems: methods, applications and new trends. Lecture notes in control and information sciences, vol 423. Springer, Berlin/Heidelberg, pp 201–216
Michiels W, Niculescu S-I (2007) Stability and stabilization of time-delay systems: an eigenvalue based approach. SIAM, Philadelphia
Niculescu S-I (2001) Delay effects on stability: a robust control approach. Lecture notes in control and information sciences, vol 269. Springer, Berlin/New York
Richard J-P (2003) Time-delay systems: an overview of recent advances and open problems. Automatica 39(10):1667–1694

Sipahi R, Niculescu S, Abdallah C, Michiels W, Gu K (2011) Stability and stabilization of systems with time-delay. IEEE Control Syst Mag 31(1):38–65
Zhong Q-C (2006) Robust control of time-delay systems. Springer, London

Control of Machining Processes

Kaan Erkorkmaz
Department of Mechanical & Mechatronics Engineering, University of Waterloo, Waterloo, ON, Canada

Abstract

Control of machining processes encompasses a broad range of technologies and innovations, ranging from optimized motion planning and servo drive loop design, to on-the-fly regulation of cutting forces and power consumption, to applying control strategies for damping out chatter vibrations caused by the interaction of the chip generation mechanism with the machine tool structural dynamics. This article provides a brief introduction to some of the concepts and technologies associated with machining process control.

Keywords

Adaptive control; Chatter vibrations; Feed drive control; Machining; Trajectory planning

Introduction

Machining is used extensively in the manufacturing industry as a shaping process where high product accuracy, quality, and strength are required. From automotive and aerospace components, to dies and molds, to biomedical implants, and even mobile device chassis, many manufactured products rely on the use of machining.

Machining is carried out on machine tools, which are multi-axis mechatronic systems designed to provide the relative motion between the tool and workpiece, in order to facilitate the desired cutting operation. Figure 1 illustrates a single axis of a ball screw-driven machine tool performing a milling operation. Here, the cutting process is influenced by the motion of the servo drive. The faster the part is fed in towards the rotating cutter, the larger the cutting forces become, following a typically proportional relationship that holds for a large class of milling operations (Altintas 2012). The generated cutting forces, in turn, are absorbed by the machine tool and feed drive structure. They cause mechanical deformation and may also excite the vibration modes, if their harmonic content is near the structural natural frequencies. This may, depending on the cutting speed and the tool and workpiece engagement conditions, lead to forced vibrations or chatter (Altintas 2012).

The disturbance effect of cutting forces is also felt by the servo control loop, which consists of mechanical, electrical, and digital components. This disturbance may result in the degradation of tool positioning accuracy, thereby leading to part errors. Another input that influences the quality achieved in a machining operation is the commanded trajectory. Discontinuous or poorly designed motion commands, with acceleration discontinuity, typically lead to jerky motion, vibrations, and poor surface finish. Beyond motion controller design and trajectory planning, emerging trends in machining process control include regulating, by feedback, various outcomes of the machining process, such as the peak resultant cutting force, spindle power consumption, and amplitude of vibrations caused by the machining process. In addition to using actuators and instrumentation already available on a machine tool, such as feed and spindle drives and current sensors, additional devices, such as dynamometers, accelerometers, as well as inertial or piezoelectric actuators, may need to be used in order to achieve the required level of feedback and control injection capability.

Servo Drive Control

Stringent requirements for part quality, typically specified in microns, coupled with disturbance

Control of Machining Processes, Fig. 1 Single axis of a ball screw-driven machine tool performing milling

force inputs coming from the machining process, which can be in the order of tens to thousands of Newtons, require that the disturbance rejection of feed drives, which act as dynamic (i.e., frequency-dependent) "stiffness" elements, be kept as strong as possible. In traditional machine design, this is achieved by optimizing the mechanical structure for maximum rigidity. Afterwards, the motion control loop is tuned to yield the highest possible bandwidth (i.e., responsive frequency range), without interfering with the vibratory modes of the machine tool in a way that can cause instability. The P-PI position velocity cascade control structure, shown in Fig. 2, is the most widely used technique in machine tool drives. Its tuning guidelines have been well established in the literature (Ellis 2004). To augment the command following accuracy, velocity and acceleration feedforward and friction compensation terms are added. Increasing the closed-loop bandwidth yields better disturbance rejection and more accurate tracking of the commanded trajectory (Pritschow 1996), which is especially important in high-speed machining applications where elevated cutting speeds necessitate faster feed motion.

It can be seen in Fig. 3 that increased axis tracking errors ($e_x$ and $e_y$) may result in an increased contour error ($\varepsilon$). A practical solution to mitigate this problem, in machine tool engineering, is to also match the dynamics of the different motion axes, so that the tracking errors always assume an instantaneous proportion that brings the actual tool position as close as possible to the desired toolpath (Koren 1983). Sometimes, the control action can be designed to directly reduce the contour error as well, which leads to the structure known as "cross-coupling control" (Koren 1980).

Trajectory Planning

Smooth trajectory planning with at least acceleration-level continuity is required in machine tool control, in order to avoid inducing unwanted vibration or excessive tracking error during the machining process. For this purpose, computer numerical control (CNC) systems are equipped with various spline toolpath interpolation functions, such as B-splines and NURBS. The feedrate (i.e., the progression speed along the toolpath) is planned in the "look-ahead" function of the

Control of Machining Processes, Fig. 2 P-PI position velocity cascade control used in machine tool drives
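A minimal discrete-time sketch of the P-PI cascade of Fig. 2 is given below. The axis model (a rigid 1 kg mass) and all gains are illustrative assumptions, not values from the article; feedforward and friction compensation are omitted. The point is the disturbance rejection discussed above: the velocity-loop integrator absorbs a constant cutting-force load, so the position command is reached without steady-state error.

```python
m, dt = 1.0, 1e-4                   # axis mass [kg], sample time [s] (assumed)
kp, kv, ki = 50.0, 200.0, 8000.0    # P (position) and PI (velocity) gains (assumed)
x = v = integ = 0.0
x_ref, f_dist = 1e-3, 10.0          # 1 mm step command, 10 N cutting-force load
for _ in range(20000):              # 2 s of closed-loop operation
    v_cmd = kp * (x_ref - x)        # outer position loop (P)
    err_v = v_cmd - v
    integ += ki * err_v * dt        # integral action of the velocity loop
    force = kv * err_v + integ      # inner velocity loop (PI) output
    v += (force - f_dist) / m * dt  # axis dynamics: m*dv/dt = force - load
    x += v * dt
print(abs(x_ref - x))               # steady-state error driven to ~0
```

With these gains the closed-loop poles are well damped and far below the Nyquist rate of the assumed sample time; at steady state the integrator holds exactly the 10 N needed to cancel the disturbance.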

Control of Machining Processes, Fig. 3 Formation of contour error ($\varepsilon$), as a result of servo errors ($e_x$ and $e_y$) in the individual axes
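For a straight-line toolpath, the contour error of Fig. 3 can be computed explicitly by projecting the axis tracking errors onto the path normal (all numbers below are hypothetical):

```python
import math

# Unit tangent of a straight toolpath inclined 30 degrees to the x-axis.
tx, ty = math.cos(math.radians(30)), math.sin(math.radians(30))
ex, ey = 20e-6, -10e-6             # axis tracking errors [m] (assumed values)
tangential = ex * tx + ey * ty     # component along the path (timing error)
eps = abs(-ex * ty + ey * tx)      # normal component = contour error
print(eps)

# Matched axis dynamics: if (ex, ey) stays parallel to the tangent,
# the contour error vanishes even though the tracking errors do not.
matched = abs(-(15e-6 * tx) * ty + (15e-6 * ty) * tx)
print(matched)   # 0 up to rounding
```

The second computation illustrates the axis-matching argument in the text: equal axis dynamics keep the error vector aligned with the toolpath, so only the motion timing, not the contour, is affected.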

CNC so that the total machining cycle time is reduced as much as possible. This has to be done without violating the position-dependent feedrate limits already programmed into the numerical control (NC) code, which are specified by considering various constraints coming from the machining process.

In feedrate optimization, axis-level trajectories have to stay within the velocity and torque limits of the drives, in order to avoid damaging the machine tool or causing actuator saturation. Moreover, as an indirect way of containing tracking errors, the practice of limiting axis-level jerk (i.e., the rate of change of acceleration) is applied (Gordon and Erkorkmaz 2013). This results in a reduced machining cycle time, while avoiding excessive vibration or positioning error due to "jerky" motion.

An example of trajectory planning using quintic (5th-degree) polynomials for toolpath parameterization is shown in Fig. 4. Here, a comparison is provided between unoptimized and optimized feedrate profiles subject to the same axis velocity, torque (i.e., control signal), and jerk limits. As can be seen, significant machining time reduction can be achieved through trajectory optimization, while retaining the dynamic tool position accuracy. While Fig. 4 shows the result of an elaborate nonlinear optimization approach (Altintas and Erkorkmaz 2003), practical look-ahead

Control of Machining Processes, Fig. 4 Example of quintic spline trajectory planning without and with feedrate
optimization
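A single quintic segment with zero velocity and acceleration at both ends, the smoothness level required above, can be written in closed form. The stroke and duration below are assumed for illustration; a full planner would chain many such segments subject to the drive limits:

```python
L, T = 0.1, 0.5   # 100 mm stroke in 0.5 s (assumed values)

def s(t):         # position along the path
    u = t / T
    return L * (10 * u**3 - 15 * u**4 + 6 * u**5)

def v(t):         # feedrate (path velocity)
    u = t / T
    return L / T * (30 * u**2 - 60 * u**3 + 30 * u**4)

def a(t):         # path acceleration (continuous, zero at both ends)
    u = t / T
    return L / T**2 * (60 * u - 180 * u**2 + 120 * u**3)

print(s(T), v(0.0), v(T), a(0.0), a(T))   # full stroke, zero boundary v and a
peak_v = max(abs(v(0.001 * i * T)) for i in range(1001))
print(peak_v)     # peak feedrate 1.875*L/T, reached at mid-stroke
```

The peak feedrate $1.875\,L/T$ and peak acceleration of the segment are what a look-ahead function would check against the programmed feedrate, velocity, and torque limits.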

algorithms have also been proposed which lead to more conservative cycle times but are much better suited for real-time implementation inside a CNC (Weck et al. 1999).

Adaptive Control of Machining

There are established mathematical methods for predicting cutting forces, torque, power, and even surface finish for a variety of machining operations like turning, boring, drilling, and milling (Altintas 2012). However, when machining complex components, such as gas turbine impellers, or dies and molds, the tool and workpiece engagement and the workpiece geometry undergo continuous change. Hence, it may be difficult to apply such prediction models efficiently, unless they are fully integrated inside a computer-aided process planning environment, as reported for 3-axis machining by Altintas and Merdol (2007).

An alternative approach, which allows the machining process to take place within safe and efficient operating bounds, is to use feedback from the machine tool during the cutting process. This measurement can be of the cutting forces, using a dynamometer, or of the spindle power consumption. The measurement is then used inside a feedback control loop to override the commanded feedrate value, which has a direct impact on the cutting forces and power consumption. This scheme can be used to ensure that the cutting forces do not exceed a certain limit for process safety, or to increase the feed when the machining capacity is underutilized, thus boosting productivity. Since the geometry and tool engagement are generally

Control of Machining Processes, Fig. 5 Example of 5-axis impeller machining with adaptive force control (Source:
Budak and Kops (2000), courtesy of Elsevier)
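The feedrate-override idea can be caricatured in a few lines: model the cutting force as $F = K_c f$ (feedrate $f$) with an unknown, drifting gain $K_c$, estimate the gain from the measured force, and recompute the override each control period. This is only a simplified stand-in for the adaptive schemes discussed in this section, with all numbers invented for illustration:

```python
F_ref = 500.0                 # cutting force set point [N] (assumed)
kc_hat = 800.0                # initial estimate of the cutting gain [N*s/mm]
f = F_ref / kc_hat            # initial feedrate command [mm/s]
history = []
for step in range(200):
    kc_true = 1000.0 + 400.0 * step / 200.0   # drifting tool engagement
    F_meas = kc_true * f                       # measured cutting force
    kc_hat += 0.5 * (F_meas / f - kc_hat)      # first-order gain estimator
    f = F_ref / kc_hat                         # feedrate override update
    history.append(F_meas)
print(history[-1])   # force held near F_ref despite the 40 % gain drift
```

A fixed-gain loop tuned for the initial $K_c$ would see its bandwidth and margins change as the engagement varies; tracking the gain on-line, as sketched, keeps the force regulation consistent, which is the motivation for the adaptive techniques cited below.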

Control of Machining Processes, Fig. 6 Schematic of the chatter vibration mechanism for one degree of freedom (From: Altintas (2012), courtesy of Cambridge University Press)

continuously varying, the coefficients of a model that relates the cutting force (or power) to the feed command are also time-varying. Furthermore, in CNC controllers, depending on the trajectory generation architecture, the execution latency of a feed override command may not always be deterministic. Due to these sources of variability, rather than using classical fixed-gain feedback, machining control research has evolved around adaptive control techniques (Masory and Koren 1980; Spence and Altintas 1991), where changes in the cutting process dynamics are continuously tracked and the control law, which computes the upcoming feedrate override, is updated accordingly. This approach has produced significant cycle time reduction in 5-axis machining of gas turbine impellers, as reported in Budak and Kops (2000) and shown in Fig. 5.

Control of Chatter Vibrations

Chatter vibrations are caused by the interaction of the chip generation mechanism with the structural dynamics of the machine, tool, and workpiece assembly (see Fig. 6). The relative vibration between the tool and workpiece generates a wavy surface finish. In the consecutive tool

pass, a new wave pattern, caused by the current become available at lower cost, one can expect
instantaneous vibration, is generated on top of to see new features, such as more elaborate
the earlier one. If the formed chip, which has an trajectory planning algorithms, active vibration
undulated geometry, displays a steady average damping techniques, and real-time process
thickness, then the resulting cutting forces and and machine simulation and control capability,
vibrations also remain bounded. This leads to beginning to appear in CNC units. No doubt that
a stable steady-state cutting regime, known as the dynamic analysis and controller design for
“forced vibration.” On the other hand, if the such complicated systems will require higher
chip thickness keeps increasing at every tool levels of rigor, so that these new technologies can
pass, resulting in increased cutting forces and be utilized reliably and at their full potential.
vibrations, then chatter vibration is encountered.
Chatter can be extremely detrimental to the
machined part quality, tool life, and the machine Cross-References
tool.
Chatter has been reported in literature to be  Adaptive Control, Overview
caused by two main phenomena: self-excitation  PID Control
through regeneration and mode coupling. For  Robot Motion Control
further information on chatter theory, the reader is
referred to Altintas (2012) as an excellent starting
point. Bibliography
Various mitigation measures have been inves-
tigated and proposed in order to avoid and control Altintas Y (2012) Manufacturing automation: metal cut-
chatter. One widespread approach is to select chatter-free cutting conditions through detailed modal testing and stability analyses. Recently, to achieve higher material removal rates, the application of active damping has started to receive interest. This has been realized through specially designed tools and actuators (Munoa et al. 2013; Pratt and Nayfeh 2001) and demonstrated productivity improvement in boring and milling operations. As another method for chatter suppression, modulation of the cutting (i.e., spindle) speed has been successfully applied as a means of interrupting the regeneration mechanism (Soliman and Ismail 1997; Zatarain et al. 2008).

Summary and Future Directions

This article has presented an overview of various concepts and emerging technologies in the area of machining process control. The new generation of machine tools, designed to meet the ever-growing productivity and efficiency demands, will likely utilize advanced forms of these ideas and technologies in an integrated manner. As more computational power and better sensors

Bibliography

ting mechanics, machine tool vibrations, and CNC design, 2nd edn. Cambridge University Press, Cambridge
Altintas Y, Erkorkmaz K (2003) Feedrate optimization for spline interpolation in high speed machine tools. Ann CIRP 52(1):297–302
Altintas Y, Merdol DS (2007) Virtual high performance milling. Ann CIRP 55(1):81–84
Budak E, Kops L (2000) Improving productivity and part quality in milling of titanium based impellers by chatter suppression and force control. Ann CIRP 49(1):31–36
Ellis GH (2004) Control system design guide, 3rd edn. Elsevier Academic, New York
Gordon DJ, Erkorkmaz K (2013) Accurate control of ball screw drives using pole-placement vibration damping and a novel trajectory prefilter. Precis Eng 37(2):308–322
Koren Y (1980) Cross-coupled biaxial computer control for manufacturing systems. ASME J Dyn Syst Meas Control 102:265–272
Koren Y (1983) Computer control of manufacturing systems. McGraw-Hill, New York
Masory O, Koren Y (1980) Adaptive control system for turning. Ann CIRP 29(1):281–284
Munoa J, Mancisidor I, Loix N, Uriarte LG, Barcena R, Zatarain M (2013) Chatter suppression in ram type travelling column milling machines using a biaxial inertial actuator. Ann CIRP 62(1):407–410
Pratt JR, Nayfeh AH (2001) Chatter control and stability analysis of a cantilever boring bar under regenerative cutting conditions. Philos Trans R Soc 359:759–792
Pritschow G (1996) On the influence of the velocity gain factor on the path deviation. Ann CIRP 45(1):367–371
Soliman E, Ismail F (1997) Chatter suppression by adaptive speed modulation. Int J Mach Tools Manuf 37(3):355–369
Spence A, Altintas Y (1991) CAD assisted adaptive control for milling. ASME J Dyn Syst Meas Control 113(3):444–450
Weck M, Meylahn A, Hardebusch C (1999) Innovative algorithms for spline-based CNC controller. Ann Ger Acad Soc Prod Eng VI(1):83–86
Zatarain M, Bediaga I, Munoa J, Lizarralde R (2008) Stability of milling processes with continuous spindle speed variation: analysis in the frequency and time domains, and experimental correlation. Ann CIRP 57(1):379–384

Control of Networks of Underwater Vehicles

Naomi Ehrich Leonard
Department of Mechanical and Aerospace Engineering, Princeton University, Princeton, NJ, USA

Abstract

Control of networks of underwater vehicles is critical to underwater exploration, mapping, search, and surveillance in the multiscale, spatiotemporal dynamics of oceans, lakes, and rivers. Control methodologies have been derived for tasks including feature tracking and adaptive sampling and have been successfully demonstrated in the field despite the severe challenges of underwater operations.

Keywords

Adaptive sampling; Feature tracking; Gliders; Mobile sensor arrays; Underwater exploration

Introduction

The development of theory and methodology for control of networks of underwater vehicles is motivated by a multitude of underwater applications and by the unique challenges associated with operating in the oceans, lakes, and rivers. Tasks include underwater exploration, mapping, search, and surveillance, associated with problems that include pollution monitoring, human safety, resource seeking, ocean science, and marine archeology. Vehicle networks collect data on underwater physics, biology, chemistry, and geology for improving the understanding and predictive modeling of natural dynamics and human-influenced changes in marine environments. Because the underwater environment is opaque, inhospitable, uncertain, and dynamic, control is critical to the performance of vehicle networks.

Underwater vehicles typically carry sensors to measure external environmental signals and fields, and thus a vehicle network can be regarded as a mobile sensor array. The underlying principle of control of networks of underwater vehicles leverages their mobility and uses an interacting dynamic among the vehicles to yield a high-performing collective behavior. If the vehicles can communicate their state or measure the relative state of others, then they can cooperate and coordinate their motion.

One of the major drivers of control of underwater mobile sensor networks is the multiscale, spatiotemporal dynamics of the environmental fields and signals. In Curtin et al. (1993), the concept of the autonomous oceanographic sampling network (AOSN), featuring a network of underwater vehicles, was introduced for dynamic measurement of the ocean environment and resolution of spatial and temporal gradients in the sampled fields. For example, to understand the coupled biological and physical dynamics of the ocean, data are required both on the small-scale dynamics of phytoplankton, which are major actors in the marine ecosystem and the global climate, and on the large-scale dynamics of the flow field, temperature, and salinity.

Accordingly, control laws are needed to coordinate the motion of networks of underwater vehicles to match the many relevant spatial and temporal scales. And for a network of underwater vehicles to perform complex missions reliably and efficiently, the control must address the many
uncertainties and real-world constraints including the influence of currents on the motion of the vehicles and the limitations on underwater communication.

Vehicles

Control of networks of underwater vehicles is made possible with the availability of small (e.g., 1.5–2 m long), relatively inexpensive autonomous underwater vehicles (AUVs). Propelled AUVs such as the REMUS provide maneuverability and speed. These kinds of AUVs respond quickly and agilely to the needs of the network, and because of their speed, they can often power through strong ocean flows. However, propelled AUVs are limited by their batteries; for extended missions, they need docking stations or other means to recharge their batteries.

Buoyancy-driven autonomous underwater gliders, including the Slocum, the Spray, and the Seaglider, are a class of endurance AUVs designed explicitly for collecting data over large three-dimensional volumes continuously over periods of weeks or even months (Rudnick et al. 2004). They move slowly and steadily, and, as a result, they are particularly well suited to network missions of long duration.

Gliders propel themselves by alternately increasing and decreasing their buoyancy using either a hydraulic or a mechanical buoyancy engine. Lift generated by flow over fixed wings converts the vertical ascent/descent induced by the change in buoyancy into forward motion, resulting in a sawtooth-like trajectory in the vertical plane. Gliders can actively redistribute internal mass to control attitude; for example, they pitch by sliding their battery pack forward and aft. For heading control, they shift mass to roll, bank, and turn or deflect a rudder. Some gliders are designed for deep water, e.g., to 1,500 m, while others for shallower water, e.g., to 200 m.

Gliders are typically operated at their maximum speed, and thus they move at approximately constant speed relative to the flow. Because this is relatively slow, on the order of 0.3–0.5 m/s in the horizontal direction and 0.2 m/s in the vertical, ocean currents can sometimes reach or even exceed the speed of the gliders. Unlike a propelled AUV, which typically has sufficient thrust to maintain course despite currents, a glider trying to move in the direction of a strong current will make no forward progress. This makes coordinated control of gliders challenging; for instance, two sensors that should stay sufficiently far apart may be pushed toward each other, leading to less than ideal sampling conditions.

Communication and Sensing

Underwater communication is one of the biggest challenges to the control of networks of underwater vehicles and one that distinguishes it from control of vehicles on land or in the air. Radio-frequency communication is not typically available underwater, and acoustic data telemetry has limitations including sensitivity to ambient noise, unpredictable propagation, limited bandwidth, and latency.

When acoustic communication is too limiting, vehicles can surface periodically and communicate via satellite. This method may be bandwidth limited and will require time and energy. However, in the case of profiling propelled AUVs or underwater gliders, they already move in the vertical plane in a sawtooth pattern and thus regularly come closer to the surface. When on the surface, vehicles can also get a GPS fix, whereas there is no access to GPS underwater. The GPS fix is used for correcting onboard dead reckoning of the vehicle's absolute position and for updating onboard estimation of the underwater currents, both helpful for control.

Vehicles are typically equipped with conductivity-temperature-density (CTD) sensors to measure temperature, salinity, and density. From this, pressure can be computed, and thus depth and vertical speed. Attitude sensors provide measurements of pitch, roll, and heading. Position and velocity in the plane are estimated using dead reckoning. Many sensors for measuring the environment have been developed for use on underwater vehicles; these include chlorophyll fluorometers to estimate phytoplankton abundance, acoustic Doppler
profilers (ADPs) to measure variations in water velocity, and sensors to measure pH, dissolved oxygen, and carbon dioxide.

Control

Described here is a selection of control methodologies designed to serve a variety of underwater applications and to address many of the challenges described above for both propelled AUVs and underwater gliders. Some of these methodologies have been successfully field tested in the ocean.

Formations for Tracking Gradients, Boundaries, and Level Sets in Sampled Fields
While a small underwater vehicle can take only single-point measurements of a field, a network of N vehicles employing cooperative control laws can move as a formation and estimate or track a gradient in the field. This can be done in a straightforward way in 2D with three vehicles and can be extended to 3D with additional vehicles. Consider N = 3 vehicles moving together in an equilateral triangular formation and sampling a 2D field T: ℝ² → ℝ. The formation serves as a sensor array, and the triangle side length defines the resolution of the array.

Let the position of the ith vehicle be x_i ∈ ℝ². Consider double integrator dynamics ẍ_i = u_i, where u_i ∈ ℝ² is the control force on the ith vehicle. Suppose that each vehicle can measure the relative position of each of its neighbors, x_ij = x_i − x_j. Decentralized control that derives from an artificial potential is a popular method for each of the three vehicles to stay in the triangular formation of prescribed resolution d₀. Consider the nonlinear interaction potential V_I: ℝ² → ℝ defined as

  V_I(x_ij) = k_s ( ln ‖x_ij‖ + d₀ / ‖x_ij‖² ),

where k_s > 0 is a scalar gain. The control law for the ith vehicle derives as the gradient of this potential with respect to x_i as follows:

  ẍ_i = u_i = − ∑_{j=1, j≠i}^{N} ∇V_I(x_ij) − k_d ẋ_i,

where a damping term is added with scalar gain k_d > 0. Stability of the triangle of resolution d₀ is proved with the Lyapunov function

  V = (1/2) ∑_{i=1}^{N} ‖ẋ_i‖² + (1/2) ∑_{i=1}^{N} ∑_{j=i+1}^{N} V_I(x_ij).

Now let each vehicle use the sequence of single-point measurements it takes along its path to compute the projection of the spatial gradient onto its normalized velocity, e_ẋ = ẋ_i/‖ẋ_i‖, i.e., ∇T_P(x, ẋ_i) = (∇T(x) · e_ẋ) e_ẋ. Following Bachmayer and Leonard (2002), let

  ẍ_i = u_i = κ ∇T_P(x, ẋ_i) − ∑_{j=1, j≠i}^{N} ∇V_I(x_ij) − k_d ẋ_i,

where κ is a scalar gain. For κ > 0, each vehicle will accelerate along its path when it measures an increasing T and decelerate for a decreasing T. Each vehicle will also turn to keep up with the others so that the formation will climb the spatial gradient of T to find a local maximum.

Alternative control strategies have been developed that add versatility in feature tracking. The virtual body and artificial potential (VBAP) multivehicle control methodology (Ögren et al. 2004) was demonstrated with a network of Slocum autonomous underwater gliders in the AOSN II field experiment in Monterey Bay, California, in August 2003 (Fiorelli et al. 2006). VBAP is well suited to the operational scenario described above in which vehicles surface asynchronously to establish communication with a base.

VBAP is a control methodology for coordinating the translation, rotation, and dilation of a group of vehicles. A virtual body is defined by a set of reference points that move according to dynamics that are computed centrally and made available to the vehicles in the group. Artificial potentials are used to couple the dynamics of vehicles and a virtual body so that control laws can be derived that stabilize desired formations of vehicles and a virtual body. When sampled
measurements of a scalar field can be communicated, the local gradients can be estimated. Gradient climbing algorithms prescribe virtual body direction, so that, for example, the vehicle network can be directed to head for the coldest water or the highest concentration of phytoplankton. Further, the formation can be dilated so that the resolution can be adapted to minimize error in estimates. Control of the speed of the virtual body ensures stability and convergence of the vehicle formation.

These ideas have been extended further to design provable control laws for cooperative level set tracking, whereby small vehicle groups cooperate to generate contour plots of noisy, unknown fields, adjusting their formation shape to provide optimal filtering of their noisy measurements (Zhang and Leonard 2010).

Motion Patterns for Adaptive Sampling
A central objective in many underwater applications is to design provable and reliable mobile sensor networks for collecting the richest data set in an uncertain environment given limited resources. Consider the sampling of a single time- and space-varying scalar field, like temperature T, using a network of vehicles, where the control problem is to coordinate the motion of the network to maximize information on this field over a given area or volume.

The definition of the information metric will depend on the application. If the data are to be assimilated into a high-resolution dynamical ocean model, then the metric would be defined by uncertainty as computed by the model. A general-purpose metric, based on objective analysis (linear statistical estimation from given field statistics), specifies the statistical uncertainty of the field model as a function of where and when the data were taken (Bennett 2002). The a posteriori error A(r, t) is the variance of T about its estimate at location r and time t. Entropic information over a spatial domain of area A is

  I(t) = − log( (1/(σ₀ A)) ∫ A(r, t) dr ),

where σ₀ is a scaling factor (Grocholsky 2002).

Computing coordinated trajectories to maximize I(t) can in principle be addressed using optimal coverage control methods. However, this coverage problem is especially challenging since the uncertainty field is spatially nonuniform and it changes with time and with the motion of the sampling vehicles. Furthermore, the optimal trajectories may become quite complex, so that controlling vehicles to them in the presence of dynamic disturbances and uncertainty may lead to suboptimal performance.

An alternative approach decouples the design of motion patterns to optimize the entropic information metric from the decentralized control laws that stabilize the network onto the motion patterns (see Leonard et al. 2007). This approach was demonstrated with a network of 6 Slocum autonomous underwater gliders in a 24-day-long field experiment in Monterey Bay, California, in August 2006 (see Leonard et al. 2010). The coordinating feedback laws for the individual vehicles derive systematically from a control methodology that provides provable stabilization of a parameterized family of collective motion patterns (Sepulchre et al. 2008). These patterns consist of vehicles moving on a finite set of closed curves with spacing between vehicles defined by a small number of "synchrony" parameters. The feedback laws that stabilize a given motion pattern use the same synchrony parameters that distinguish the desired pattern.

Each vehicle moves in response to the relative position and direction of its neighbors so that it keeps moving, it maintains the desired spacing, and it stays close to its assigned curve. It has been observed in the ocean, for vehicles carrying out this coordinated control law, that "when a vehicle on a curve is slowed down by a strong opposing flow field, it will cut inside a curve to make up distance and its neighbor on the same curve will cut outside the curve so that it does not overtake the slower vehicle and compromise the desired spacing" (Leonard et al. 2010). The approach is robust to vehicle failure since there are no leaders in the network, and it is scalable since the control law for each vehicle can be defined in terms of the state of a few other vehicles, independent of the total number of vehicles.
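As a toy numerical illustration of the entropic information metric defined above, the spatial average of the a posteriori error can be approximated on a grid and substituted into I(t). The unit-square domain, the shape of the error map, and the value of σ₀ below are all invented for the example; in practice A(r, t) would come from the objective-analysis computation:

```python
import numpy as np

# Hypothetical a posteriori error map A(r, t) on a 50x50 grid over a unit
# square: the error is low along a sampled diagonal track and close to the
# prior variance sigma0 far from any measurement.
nx = ny = 50
x, y = np.meshgrid(np.linspace(0.0, 1.0, nx), np.linspace(0.0, 1.0, ny))
sigma0 = 1.0  # prior field variance, used as the scaling factor
A_err = sigma0 * (1.0 - 0.9 * np.exp(-((x - y) ** 2) / 0.02))

# (1/A) * integral of A(r, t) over the domain, approximated by the grid
# average (domain area A = 1), then I(t) = -log of the sigma0-normalized
# averaged error.
avg_err = A_err.mean()
info = -np.log(avg_err / sigma0)

print(f"I(t) = {info:.3f}")
```

With no data, A(r, t) equals σ₀ everywhere, the normalized average equals 1, and I(t) = 0; any sampling that lowers the averaged error makes I(t) positive, which is why maximizing I(t) over coordinated vehicle trajectories rewards motion that reduces the remaining uncertainty in the field estimate.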
The control methodology prescribes steering laws for vehicles operated at a constant speed. Assume that the ith vehicle moves at unit speed in the plane in the direction θ_i(t) at time t. Then, the velocity of the ith vehicle is ẋ_i = (cos θ_i, sin θ_i). The steering control u_i is the component of the force in the direction normal to velocity, such that θ̇_i = u_i for i = 1, …, N. Define

  U(θ₁, …, θ_N) = (N/2) ‖p‖²,   p = (1/N) ∑_{j=1}^{N} ẋ_j.

U is a potential function that is maximal at 1 when all vehicle directions are synchronized and minimal at 0 when all vehicle directions are perfectly anti-synchronized. Let x̃_i = (x̃_i, ỹ_i) = (1/N) ∑_{j=1}^{N} x_ij and let x̃_i^⊥ = (−ỹ_i, x̃_i). Define

  S(x₁, …, x_N, θ₁, …, θ_N) = (1/2) ∑_{i=1}^{N} ‖ẋ_i − ω₀ x̃_i^⊥‖²,

where ω₀ ≠ 0. S is a potential function that is minimal at 0 for circular motion of the vehicles around their center of mass with radius ρ₀ = |ω₀|⁻¹.

Define the steering control as

  θ̇_i = ω₀ (1 + K_c ⟨x̃_i, ẋ_i⟩) − K ∑_{j=1}^{N} sin(θ_j − θ_i),

where K_c > 0 and K are scalar gains. Then, circular motion of the network is a steady solution, with the phase-locked heading arrangement a minimum of K U, i.e., synchronized or perfectly anti-synchronized depending on the sign of K. Stability can be proved with the Lyapunov function V_c = K_c S + K U. This steering control law depends only on relative position and relative heading measurements of the other vehicles.

The general form of the methodology extends the above control law to network interconnections defined by possibly time-varying graphs with limited sensing or communication links, and it provides systematic control laws to stabilize symmetric patterns of heading distributions about noncircular closed curves. It also allows for multiple graphs to handle multiple scales. For example, in the 2006 field experiment, the default motion pattern was one in which six gliders moved in coordinated pairs around three closed curves; one graph defined the smaller-scale coordination of each pair of gliders about its curve, while a second graph defined the larger-scale coordination of gliders across the three curves.

Implementation

Implementation of control of networks of underwater vehicles requires coping with the remote, hostile underwater environment. The control methodology for motion patterns and adaptive sampling, described above, was implemented in the field using a customized software infrastructure called the Glider Coordinated Control System (GCCS) (Paley et al. 2008). The GCCS combines a simple model for control planning with a detailed model of glider dynamics to accommodate the constant speed of gliders, relatively large ocean currents, waypoint tracking routines, communication only when gliders surface (asynchronously), other latencies, and more. Other approaches consider control design in the presence of a flow field, formal methods to integrate high-resolution models of the flow field, and design tailored to propelled AUVs.

Summary and Future Directions

The multiscale, spatiotemporal dynamics of the underwater environment drive the need for well-coordinated control of networks of underwater vehicles that can manage the significant operational challenges of the opaque, uncertain, inhospitable, and dynamic oceans, lakes, and rivers. Control theory and algorithms have been developed to enable networks of vehicles to successfully operate as adaptable sensor arrays in missions that include feature tracking and adaptive sampling. Future work will improve control in the presence of strong and unpredictable flow
fields and will leverage the latest in battery and underwater communication technologies. Hybrid vehicles and heterogeneous networks of vehicles will also promote advances in control. Future work will draw inspiration from the rapidly growing literature in decentralized cooperative control strategies and complex dynamic networks. Dynamics of decision-making teams of robotic vehicles and humans is yet another important direction of research that will impact the success of control of networks of underwater vehicles.

Cross-References

▶ Motion Planning for Marine Control Systems
▶ Underactuated Marine Control Systems

Recommended Reading

In Bellingham and Rajan (2007), it is argued that cooperative control of robotic vehicles is especially useful for exploration in remote and hostile environments such as the deep ocean. A recent survey of robotics for environmental monitoring, including a discussion of cooperative systems, is provided in Dunbabin and Marques (2012). A survey of work on cooperative underwater vehicles is provided in Redfield (2013).

Bibliography

Bachmayer R, Leonard NE (2002) Vehicle networks for gradient descent in a sampled environment. In: Proceedings of the 41st IEEE Conference on Decision and Control, Las Vegas, pp 112–117
Bellingham JG, Rajan K (2007) Robotics in remote and hostile environments. Science 318(5853):1098–1102
Bennett A (2002) Inverse modeling of the ocean and atmosphere. Cambridge University Press, Cambridge
Curtin TB, Bellingham JG, Catipovic J, Webb D (1993) Autonomous oceanographic sampling networks. Oceanography 6(3):86–94
Dunbabin M, Marques L (2012) Robots for environmental monitoring: significant advancements and applications. IEEE Robot Autom Mag 19(1):24–39
Fiorelli E, Leonard NE, Bhatta P, Paley D, Bachmayer R, Fratantoni DM (2006) Multi-AUV control and adaptive sampling in Monterey Bay. IEEE J Ocean Eng 31(4):935–948
Grocholsky B (2002) Information-theoretic control of multiple sensor platforms. PhD thesis, University of Sydney
Leonard NE, Paley DA, Lekien F, Sepulchre R, Fratantoni DM, Davis RE (2007) Collective motion, sensor networks, and ocean sampling. Proc IEEE 95(1):48–74
Leonard NE, Paley DA, Davis RE, Fratantoni DM, Lekien F, Zhang F (2010) Coordinated control of an underwater glider fleet in an adaptive ocean sampling field experiment in Monterey Bay. J Field Robot 27(6):718–740
Ögren P, Fiorelli E, Leonard NE (2004) Cooperative control of mobile sensor networks: adaptive gradient climbing in a distributed environment. IEEE Trans Autom Control 49(8):1292–1302
Paley D, Zhang F, Leonard NE (2008) Cooperative control for ocean sampling: the glider coordinated control system. IEEE Trans Control Syst Technol 16(4):735–744
Redfield S (2013) Cooperation between underwater vehicles. In: Seto ML (ed) Marine robot autonomy. Springer, New York, pp 257–286
Rudnick D, Davis R, Eriksen C, Fratantoni D, Perry M (2004) Underwater gliders for ocean research. Mar Technol Soc J 38(1):48–59
Sepulchre R, Paley DA, Leonard NE (2008) Stabilization of planar collective motion with limited communication. IEEE Trans Autom Control 53(3):706–719
Zhang F, Leonard NE (2010) Cooperative filters and control for cooperative exploration. IEEE Trans Autom Control 55(3):650–663

Control of Nonlinear Systems with Delays

Nikolaos Bekiaris-Liberis and Miroslav Krstic
Department of Mechanical and Aerospace Engineering, University of California, San Diego, La Jolla, CA, USA

Abstract

The reader is introduced to the predictor feedback method for the control of general nonlinear systems with input delays of arbitrary length. The delays need not necessarily be constant but can be time-varying or state-dependent. The predictor feedback methodology employs a model-based construction of the (unmeasurable) future state of
the system. The analysis methodology is based on the concept of infinite-dimensional backstepping transformation – a transformation that converts the overall feedback system to a new, cascade "target system" whose stability can be studied with the construction of a Lyapunov function.

Keywords

Distributed parameter systems; Delay systems; Backstepping; Lyapunov function

Nonlinear Systems with Input Delay

Nonlinear systems of the form

  Ẋ(t) = f(X(t), U(t − D(t, X(t)))),   (1)

where t ∈ ℝ₊ is time, f: ℝⁿ × ℝ → ℝⁿ is a vector field, X ∈ ℝⁿ is the state, D: ℝ₊ × ℝⁿ → ℝ₊ is a nonnegative function of the state of the system, and U ∈ ℝ is the scalar input, are ubiquitous in applications. The starting point for designing a control law for (1), as well as for analyzing the dynamics of (1), is to consider the delay-free counterpart of (1), i.e., when D = 0, for which a plethora of results exists dealing with its stabilization and Lyapunov-based analysis (Krstic et al. 1995).

Systems of the form (1) constitute more realistic models for physical systems than delay-free systems. The reason is that often in engineering applications the control that is applied to the system does not immediately affect the system. This dead time until the controller can affect the system might be due to, among other things, the long distance of the controller from the system, such as, for example, in networked control systems, or due to finite-speed transport or flow phenomena, such as, for example, in additive manufacturing and cooling systems, or due to various after-effects, such as, for example, in population dynamics.

The first step toward control design and analysis for system (1) is to consider the special case in which D = const. The next step is to consider the special case of system (1), in which D = D(t), i.e., the delay is an a priori given function of time. Systems with time-varying delays model numerous real-world systems, such as networked control systems, traffic systems, or irrigation channels. Assuming that the input delay is an a priori defined function of time is a plausible assumption for some applications. Yet, the time-variation of the delay might be the result of the variation of a physical quantity that has its own dynamics, such as, for example, in milling processes (due to speed variations), 3D printers (due to distance variations), cooling systems (due to flow rate variations), and population dynamics (due to population's size variations). Processes in this category can be modeled by systems with a delay that is a function of the state of the system, i.e., by (1) with D = D(X).

In this article control designs are presented for the stabilization of nonlinear systems with input delays, with delays that are constant (Krstic 2009), time-varying (Bekiaris-Liberis and Krstic 2012), or state-dependent (Bekiaris-Liberis and Krstic 2013b), employing predictor feedback, i.e., employing a feedback law that uses the future rather than the current state of the system. Since one employs in the feedback law the future values of the state, the predictor feedback completely cancels (compensates) the input delay, i.e., after the control signal reaches the system, the state evolves as if there were no delay at all. Since the future values of the state are not a priori known, the main control challenge is the implementation of the predictor feedback law. Having determined the predictor, the control law is then obtained by replacing the current state in a nominal state-feedback law (which stabilizes the delay-free system) by the predictor.

A methodology is presented in the article for the stability analysis of the closed-loop system under predictor feedback by constructing Lyapunov functionals. The Lyapunov functionals are constructed for a transformed (rather than
the original) system. The transformed system is, in turn, constructed by transforming the original actuator state U(θ), θ ∈ [t − D, t], to a transformed actuator state with the aid of an infinite-dimensional backstepping transformation. The overall transformed system is easier to analyze than the original system because it is a cascade, rather than a feedback system, consisting of a delay line with zero input, whose effect fades away in finite time, namely, after D time units, cascaded with an asymptotically stable system.

Predictor Feedback

The predictor feedback designs are based on a feedback law U(t) = κ(X(t)) that renders the closed-loop system Ẋ = f(X, κ(X)) globally asymptotically stable. For stabilizing system (1), the following control law is employed instead

  U(t) = κ(P(t)),   (2)

where

  P(θ) = X(t) + ∫_{t−D(t,X(t))}^{θ} f(P(s), U(s)) / [1 − D_t(σ(s), P(s)) − ∇D(σ(s), P(s)) · f(P(s), U(s))] ds,   (3)

  σ(θ) = t + ∫_{t−D(t,X(t))}^{θ} 1 / [1 − D_t(σ(s), P(s)) − ∇D(σ(s), P(s)) · f(P(s), U(s))] ds,   (4)

for all t − D(t, X(t)) ≤ θ ≤ t. The signal P is the predictor of X at the appropriate prediction time σ, i.e., P(t) = X(σ(t)). This fact is explained in more detail in the next paragraphs of this section. The predictor employs the future values of the state X which are not a priori available. Therefore, for actually implementing the feedback law (2) one has to employ (3). Relation (3) is a formula for the future values of the state that depends on the available measured quantities, i.e., the current state X(t) and the history of the actuator state U(θ), θ ∈ [t − D(t, X(t)), t]. To make clear the definitions of the predictor P and the prediction time σ, as well as their implementation through formulas (3) and (4), the constant delay case is discussed first.

The idea of predictor feedback is to employ in the control law the future values of the state at the appropriate future time, such that the effect of the input delay is completely canceled (compensated). Define the quantity φ(t) = t − D, which from now on is referred to as the delayed time. This is the time instant at which the control signal that currently affects the system was actually applied. To cancel the effect of this delay, the control law (2) is designed such that U(φ(t)) = U(t − D) = κ(X(t)), i.e., such that U(t) = κ(X(φ⁻¹(t))) = κ(X(t + D)). Define the prediction time σ through the relation φ⁻¹(t) = σ(t) = t + D. This is the time instant at which an input signal that is currently applied actually affects the system. In the case of a constant delay, the prediction time is simply D time-units in the future. Next an implementable formula for X(σ(t)) = X(t + D) is derived. Performing a change of variables t = θ + D, for all t − D ≤ θ ≤ t, in Ẋ(t) = f(X(t), U(t − D)) and integrating in θ starting at θ = t − D, one can conclude that P defined by (3) with D_t = ∇D f = 0 and D = const is the D time-units ahead predictor of X, i.e., P(t) = X(σ(t)) = X(t + D).

To better understand definition (3), the case of a linear system with a constant input delay D, i.e., a system of the form Ẋ(t) = AX(t) + BU(t − D), is considered next (see also ▶ Control of Linear Systems with Delays and Hale and Verduyn Lunel (1993)). In this case, the predictor P(t) is given explicitly
Control of Nonlinear Systems with Delays 187

using the variation of constants formula, with the initial condition P(t − D) = X(t), as

P(t) = e^{AD} X(t) + ∫_{t−D}^{t} e^{A(t−θ)} B U(θ) dθ.

For systems that are nonlinear, P(t) cannot be written explicitly, for the same reason that a nonlinear ODE cannot be solved explicitly. So P(t) is represented implicitly using the nonlinear integral equation (3). The computation of P(t) from (3) is straightforward with a discretized implementation in which P(t) is assigned values based on the right-hand side of (3), which involves earlier values of P and the values of the input U.

The case D = D(t) is considered next. As in the case of constant delays, the main goal is to implement the predictor P. One needs first to define the appropriate time interval over which the predictor of the state is needed, which, in the constant delay case, is simply D time-units in the future. The control law has to satisfy U(φ(t)) = κ(X(t)), or, U(t) = κ(X(σ(t))). Hence, one needs to find an implementable formula for P(t) = X(σ(t)). In the constant delay case the prediction horizon over which one needs to compute the predictor can be determined based on the knowledge of the delay time, since the prediction horizon and the delay time are both equal to D. This is not anymore true in the time-varying case, in which the delayed time is defined as φ(t) = t − D(t), whereas the prediction time is φ⁻¹(t) = σ(t) = t + D(σ(t)). Employing a change of variables in Ẋ(t) = f(X(t), U(t − D(t))) as t = φ(θ), for all φ(t) ≤ θ ≤ t, and integrating in θ starting at θ = φ(t), one obtains the formula for P given by (3) with Dₜ = D′(σ(t)), ∇Df = 0, and D = D(t).

Next the case D = D(X(t)) is considered. First one has to determine the predictor, i.e., the signal P such that P(t) = X(σ(t)), where σ(t) = φ⁻¹(t) and φ(t) = t − D(X(t)). In the case of state-dependent delay, the prediction time σ(t) depends on the predictor itself, i.e., the time when the current control reaches the system depends on the value of the state at that time, namely, the following implicit relationship holds: P(t) = X(t + D(P(t))) (and X(t) = P(t − D(X(t)))). This implicit relation can be solved by proceeding as in the time-varying case, i.e., by performing the change of variables t = φ(θ), for all t − D(X(t)) ≤ θ ≤ t, in Ẋ(t) = f(X(t), U(t − D(X(t)))) and integrating in θ starting at θ = t − D(X(t)), to obtain the formula (3) for P with Dₜ = 0, ∇Df = ∇D(P(s)) f(P(s), U(s)), and D = D(X(t)).

Analogously, one can derive the predictor for the case D = D(t, X(t)), with the difference that now the prediction time is not given explicitly in terms of P, but is defined through an implicit relation, namely, it holds that σ(t) = t + D(σ(t), P(t)). Therefore, for actually computing σ one has to proceed as in the derivation of P, i.e., to differentiate the relation σ(θ) = θ + D(σ(θ), P(θ)) and then integrate starting at the known value σ(t − D(t, X(t))) = t. It is important to note that the integral equation (4) is needed in the computation of P only when D depends on both X and t.

Backstepping Transformation and Stability Analysis

The predictor feedback designs are based on a feedback law κ(X) that renders the closed-loop system Ẋ = f(X, κ(X)) globally asymptotically stable. However, in the rest of the section it is assumed that the feedback law κ(X) renders the closed-loop system Ẋ = f(X, κ(X) + v) input-to-state stable (ISS) with respect to v, i.e., there exist a smooth function S: ℝⁿ → ℝ₊ and class K∞ functions α₁, α₂, α₃, α₄ such that

α₃(|X(t)|) ≤ S(X(t)) ≤ α₄(|X(t)|)   (5)

(∂S(X(t))/∂X) f(X(t), κ(X(t)) + v(t)) ≤ −α₁(|X(t)|) + α₂(|v(t)|).   (6)

Imposing this stronger assumption enables one to construct a Lyapunov functional for the
closed-loop systems (1)–(4) with the aid of the Lyapunov characterization of ISS defined in (5) and (6).

The stability analysis of the closed-loop systems (1)–(4) is explained next. Denote the infinite-dimensional backstepping transformation of the actuator state as

W(θ) = U(θ) − κ(P(θ)), for all t − D(t, X(t)) ≤ θ ≤ t,   (7)

where P(θ) is given in terms of U(θ) from (3). Using the fact that P(t − D(t, X(t))) = X(t), for all t ≥ 0, one gets from (7) that U(t − D(t, X(t))) = W(t − D(t, X(t))) + κ(X(t)). With the fact that for all θ ≥ 0, U(θ) = κ(P(θ)), one obtains from (7) that W(θ) = 0 for all θ ≥ 0. Yet, for t − D(t, X(t)) ≤ θ < 0, i.e., for θ < 0, W(θ) might be nonzero due to the effect of the arbitrary initial condition U(θ), θ ∈ [−D(0, X(0)), 0]. With the above observations, one can transform system (1) with the aid of transformation (7) to the following target system

Ẋ(t) = f(X(t), κ(X(t)) + W(t − D(t, X(t))))   (8)

W(t − D(t, X(t))) = 0, for t − D(t, X(t)) ≥ 0.   (9)

Using relations (5), (6), and (8), (9), one can construct the following Lyapunov functional for showing asymptotic stability of the target system (8), (9), i.e., for the overall system consisting of the vector X(t) and the transformed infinite-dimensional actuator state W(θ), t − D(t, X(t)) ≤ θ ≤ t:

V(t) = S(X(t)) + (2/c) ∫₀^{L(t)} (α₂(r)/r) dr,   (10)

where c > 0 is arbitrary and

L(t) = sup_{t−D(t,X(t)) ≤ θ ≤ t} |e^{c(σ(θ)−t)} W(θ)|.   (11)

With the invertibility of the backstepping transformation one can then show global asymptotic stability of the closed-loop system in the original variables (X, U). In particular, there exists a class KL function β such that

|X(t)| + sup_{t−D(t,X(t)) ≤ θ ≤ t} |U(θ)| ≤ β( |X(0)| + sup_{−D(0,X(0)) ≤ θ ≤ 0} |U(θ)|, t ), for all t ≥ 0.   (12)

One of the main obstacles in designing globally stabilizing control laws for nonlinear systems with long input delays is the finite escape phenomenon. The input delay may be so large that the control signal cannot reach the system before its state grows unbounded. Therefore, one has to assume that the system Ẋ = f(X, ω) is forward complete, i.e., for every initial condition and every bounded input signal the corresponding solution is defined for all t ≥ 0.

With the forward completeness requirement, estimate (12) holds globally for constant but arbitrarily large delays. For the case of time-varying delays, estimate (12) holds globally as well, but under the following four conditions on the delay:

C1. D(t) ≥ 0. This condition guarantees the causality of the system.
C2. D(t) < ∞. This condition guarantees that all inputs applied to the system eventually reach the system.
C3. Ḋ(t) < 1. This condition guarantees that the system never feels input values that are older than the ones it has already felt, i.e., the input signal's direction never gets reversed. (This condition guarantees the existence of σ = φ⁻¹.)
C4. Ḋ(t) > −1. This condition guarantees that the delay cannot disappear instantaneously, but only gradually.

In the case of state-dependent delays, the delay depends on time as a result of its dependency on the state. Therefore, predictor feedback guarantees stabilization of the system when the delay satisfies the four conditions C1–C4.
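In the simplest case of a linear system with constant input delay, the predictor and the resulting feedback U(t) = κ(P(t)) can be sketched numerically by discretizing the integral in P(t) = e^{AD}X(t) + ∫_{t−D}^{t} e^{A(t−θ)}BU(θ)dθ with a buffer of past inputs. The scalar plant, gain, and delay values below are hypothetical illustration choices, not taken from this entry:

```python
import numpy as np

# Hypothetical scalar plant dX/dt = a*X + b*U(t - delay), open-loop unstable (a > 0).
a, b, delay, K = 0.5, 1.0, 1.0, -2.0       # illustrative values only
dt = 1e-3
n_delay = int(delay / dt)
u_buf = np.zeros(n_delay)                  # U(theta), theta in [t - delay, t); zero initial actuator state

X = 1.0
for _ in range(int(8.0 / dt)):
    # Discretized predictor: P(t) = e^{a D} X(t) + sum_j dt * e^{a (t - theta_j)} * b * U(theta_j)
    ages = delay - dt * np.arange(n_delay)     # t - theta_j; u_buf[0] is the oldest buffered input
    P = np.exp(a * delay) * X + dt * np.sum(np.exp(a * ages) * b * u_buf)
    u = K * P                                  # U(t) = kappa(P(t)) with kappa(x) = K*x
    X += dt * (a * X + b * u_buf[0])           # the plant only sees U(t - delay)
    u_buf = np.append(u_buf[1:], u)            # advance the delay line

print(abs(X))  # small: the delay is compensated and a + b*K < 0
```

During the first D time units the loop effectively runs open loop (the buffered inputs are zero); afterwards U(t − D) ≈ κ(X(t)) and the closed loop behaves like the delay-free design Ẋ = (a + bK)X.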
Yet, since the delay is a nonnegative function of the state, conditions C2–C4 are satisfied by restricting the initial state X and the initial actuator state. Therefore estimate (12) holds locally.

Cross-References
▸ Control of Linear Systems with Delays

Recommended Reading

The main control design tool for general systems with input delays of arbitrary length is predictor feedback. The reader is referred to Artstein (1982) for the first systematic treatment of general linear systems with constant input delays. The applicability of predictor feedback was extended in Krstic (2009) to several classes of systems, such as nonlinear systems with constant input delays and linear systems with unknown input delays. Subsequently, predictor feedback was extended to general nonlinear systems with nonconstant input and state delays (Bekiaris-Liberis and Krstic 2013a). The main stability analysis tool for systems employing predictor feedback is backstepping. Backstepping was initially introduced for adaptive control of finite-dimensional nonlinear systems (Krstic et al. 1995). The continuum version of backstepping was originally developed for the boundary control of several classes of PDEs in Krstic and Smyshlyaev (2008).

Bibliography
Artstein Z (1982) Linear systems with delayed controls: a reduction. IEEE Trans Autom Control 27:869–879
Bekiaris-Liberis N, Krstic M (2013) Nonlinear control under nonconstant delays. SIAM, Philadelphia
Bekiaris-Liberis N, Krstic M (2013) Compensation of state-dependent input delay for nonlinear systems. IEEE Trans Autom Control 58:275–289
Bekiaris-Liberis N, Krstic M (2012) Compensation of time-varying input and state delays for nonlinear systems. J Dyn Syst Meas Control 134:011009
Hale JK, Verduyn Lunel SM (1993) Introduction to functional differential equations. Springer, New York
Krstic M (2009) Delay compensation for nonlinear, adaptive, and PDE systems. Birkhauser, Boston
Krstic M, Kanellakopoulos I, Kokotovic PV (1995) Nonlinear and adaptive control design. Wiley, New York
Krstic M, Smyshlyaev A (2008) Boundary control of PDEs: a course on backstepping designs. SIAM, Philadelphia

Control of Quantum Systems

Ian R. Petersen
School of Engineering and Information Technology, University of New South Wales, the Australian Defence Force Academy, Canberra, Australia

Abstract

Quantum control theory is concerned with the control of systems whose dynamics are governed by the laws of quantum mechanics. Quantum control may take the form of open loop quantum control or quantum feedback control. Also, quantum feedback control may consist of measurement based feedback control, in which the controller is a classical system governed by the laws of classical physics. Alternatively, quantum feedback control may take the form of coherent feedback control, in which the controller is a quantum system governed by the laws of quantum mechanics. In the area of open loop quantum control, questions of controllability along with optimal control and Lyapunov control methods are discussed. In the case of quantum feedback control, LQG and H∞ control methods are discussed.

Keywords
Coherent quantum feedback; Measurement based quantum feedback; Quantum control; Quantum controllability

This work was supported by the Australian Research Council (ARC).
Introduction

Quantum control is the control of systems whose dynamics are described by the laws of quantum physics rather than classical physics. The dynamics of quantum systems must be described using quantum mechanics, which allows for uniquely quantum behavior such as entanglement and coherence. There are two main approaches to quantum mechanics, which are referred to as the Schrödinger picture and the Heisenberg picture. In the Schrödinger picture, quantum systems are modeled using the Schrödinger equation or a master equation, which describe the evolution of the system state or density operator. In the Heisenberg picture, quantum systems are modeled using quantum stochastic differential equations, which describe the evolution of system observables. These different approaches to quantum mechanics lead to different approaches to quantum control. Important areas in which quantum control problems arise include physical chemistry, atomic and molecular physics, and optics. Detailed overviews of the field of quantum control can be found in the survey papers Dong and Petersen (2010) and Brif et al. (2010) and the monographs Wiseman and Milburn (2010) and D'Alessandro (2007).

A fundamental problem in a number of approaches to quantum control is the controllability problem. Quantum controllability problems are concerned with finite dimensional quantum systems modeled using the Schrödinger picture of quantum mechanics and involve the structure of corresponding Lie groups or Lie algebras; e.g., see D'Alessandro (2007). These problems are typically concerned with closed quantum systems, which are quantum systems isolated from their environment. For a controllable quantum system, an open loop control strategy can be constructed in order to manipulate the quantum state of the system in a general way. Such open loop control strategies are referred to as coherent control strategies. Time optimal control is one method of constructing these control strategies which has been applied in applications including physical chemistry and in nuclear magnetic resonance systems; e.g., see Khaneja et al. (2001).

An alternative approach to open loop quantum control is the Lyapunov approach; e.g., see Wang and Schirmer (2010). This approach extends the classical Lyapunov control approach, in which a Lyapunov function is used to construct a stabilizing state feedback control law. However, in quantum control, state feedback control is not allowed, since classical measurements change the quantum state of a system and the Heisenberg uncertainty principle forbids the simultaneous exact classical measurement of noncommuting quantum variables. Also, in many quantum control applications, the timescales are such that real time classical measurements are not technically feasible. Thus, in order to obtain an open loop control strategy, the deterministic closed loop system is simulated as if the state feedback control were available, and this enables an open loop control strategy to be constructed. As an alternative to coherent open loop control strategies, some classical measurements may be introduced, leading to incoherent control strategies; e.g., see Dong et al. (2009).

In addition to open loop quantum control approaches, a number of approaches to quantum control involve the use of feedback; e.g., see Wiseman and Milburn (2010). This quantum feedback may either involve the use of classical measurements, in which case the controller is a classical (nonquantum) system, or it may involve the case where no classical measurements are used, since the controller itself is a quantum system. The case in which the controller itself is a quantum system is referred to as coherent quantum feedback control; e.g., see Lloyd (2000) and James et al. (2008). Quantum feedback control may be considered using the Schrödinger picture, in which case the quantum systems under consideration are modeled using stochastic master equations. Alternatively, using the Heisenberg picture, the quantum systems under consideration are modeled using quantum stochastic differential equations. Applications in which quantum feedback control can be applied include quantum optics and atomic physics. In addition, quantum control can potentially be applied to problems in quantum information (e.g., see Nielsen and Chuang 2000) such as quantum
error correction (e.g., see Kerckhoff et al. 2010) or the preparation of quantum states. Quantum information and quantum computing in turn have great potential in solving intractable computing problems such as factoring large integers using Shor's algorithm; see Shor (1994).

Schrödinger Picture Models of Quantum Systems

The state of a closed quantum system can be represented by a unit vector |ψ⟩ in a complex Hilbert space H. Such a quantum state is also referred to as a wavefunction. In the Schrödinger picture, the time evolution of the quantum state is defined by the Schrödinger equation, which is in general a partial differential equation. An important class of quantum systems are finite-level systems, in which the Hilbert space is finite dimensional. In this case, the Schrödinger equation is a linear ordinary differential equation of the form

iℏ (∂/∂t)|ψ(t)⟩ = H₀|ψ(t)⟩,

where H₀ is the free Hamiltonian of the system, which is a self-adjoint operator on H; e.g., see Merzbacher (1970). Also, ℏ is the reduced Planck's constant, which can be assumed to be one with a suitable choice of units. In the case of a controlled closed quantum system, this differential equation is extended to a bilinear ordinary differential equation of the form

i (∂/∂t)|ψ(t)⟩ = [H₀ + Σ_{k=1}^{m} u_k(t)H_k] |ψ(t)⟩,   (1)

where the functions u_k(t) are the control variables and the H_k are corresponding control Hamiltonians, which are also assumed to be self-adjoint operators on the underlying Hilbert space. These models are used in the open loop control of closed quantum systems.

To represent open quantum systems, it is necessary to extend the notion of quantum state to density operators, which are positive operators with trace one on the underlying Hilbert space H. In this case, the Schrödinger picture model of a quantum system is given in terms of a master equation which describes the time evolution of the density operator. In the case of an open quantum system with Markovian dynamics defined on a finite dimensional Hilbert space of dimension N, the master equation is a matrix differential equation of the form

ρ̇(t) = −i[H₀ + Σ_{k=1}^{m} u_k(t)H_k, ρ(t)] + (1/2) Σ_{j,k=0}^{N²−1} α_{j,k} ([F_j ρ(t), F_k†] + [F_j, ρ(t)F_k†]);   (2)

e.g., see Breuer and Petruccione (2002). Here the notation [X, Y] = XY − YX refers to the commutation operator, and the notation † denotes the adjoint of an operator. Also, {F_j}_{j=0}^{N²−1} is a basis set for the space of bounded linear operators on H with F₀ = I. Also, the matrix A = (α_{j,k}) is assumed to be positive definite. These models, which include the Lindblad master equation for dissipative quantum systems as a special case (e.g., see Wiseman and Milburn 2010), are used in the open loop control of finite-level Markovian open quantum systems.

In quantum mechanics, classical measurements are described in terms of self-adjoint operators on the underlying Hilbert space referred to as observables; e.g., see Breuer and Petruccione (2002). An important case of measurements are projective measurements, in which an observable M is decomposed as M = Σ_{k=1}^{m} k P_k, where the P_k are orthogonal projection operators on H; e.g., see Nielsen and Chuang (2000). Then, for a closed quantum system with quantum state |ψ⟩, the probability of an outcome k from the measurement is given by ⟨ψ|P_k|ψ⟩, which denotes the inner product between the vector |ψ⟩ and the vector P_k|ψ⟩. This notation is referred to as Dirac notation and is commonly used in quantum mechanics.
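The bilinear model (1) and the measurement probability ⟨ψ|P_k|ψ⟩ can be illustrated with a short simulation of a two-level system under piecewise-constant controls, propagated exactly over each step with a matrix exponential. The choices H₀ = σ_z, H₁ = σ_x and the control sequence are hypothetical examples, not taken from this entry:

```python
import numpy as np
from scipy.linalg import expm

H0 = np.array([[1, 0], [0, -1]], dtype=complex)   # free Hamiltonian (sigma_z)
H1 = np.array([[0, 1], [1, 0]], dtype=complex)    # control Hamiltonian (sigma_x)

psi = np.array([1, 0], dtype=complex)             # initial state |0>
dt = 0.05
controls = np.sin(np.linspace(0, 4, 80))          # piecewise-constant control u_1(t)

for u in controls:
    # one step of i d|psi>/dt = (H0 + u*H1)|psi>, with hbar = 1
    psi = expm(-1j * (H0 + u * H1) * dt) @ psi

print(np.linalg.norm(psi))                        # 1.0: unitary evolution preserves the norm
P0 = np.outer([1, 0], [1, 0])                     # projector onto |0>
print(np.vdot(psi, P0 @ psi).real)                # probability of measurement outcome "0"
```

The printed probability is the Dirac inner product ⟨ψ|P₀|ψ⟩ and always lies in [0, 1].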
If the outcome of the quantum measurement is k, the state of the quantum system collapses to the new value P_k|ψ⟩/√(⟨ψ|P_k|ψ⟩). This change in the quantum state as a result of a measurement is an important characteristic of quantum mechanics. For an open quantum system which is in a quantum state defined by a density operator ρ, the probability of a measurement outcome k is given by tr(P_k ρ). In this case, the quantum state collapses to P_k ρ P_k / tr(P_k ρ).

In the case of an open quantum system with continuous measurements of an observable X, we can consider a stochastic master equation as follows:

dρ(t) = −i[H₀ + Σ_{k=1}^{m} u_k(t)H_k, ρ(t)] dt − κ[X, [X, ρ(t)]] dt + √(2κ) (Xρ(t) + ρ(t)X − 2 tr(Xρ(t)) ρ(t)) dW,   (3)

where κ is a constant parameter related to the measurement strength and dW is a standard Wiener increment, which is related to the continuous measurement outcome y(t) by

dW = dy − 2√(2κ) tr(Xρ(t)) dt;   (4)

e.g., see Wiseman and Milburn (2010). These models are used in the measurement feedback control of Markovian open quantum systems. Also, Eqs. (3) and (4) can be regarded as a quantum filter in which ρ(t) is the conditional density of the quantum system obtained by filtering the measurement signal y(t); e.g., see Bouten et al. (2007) and Gough et al. (2012).

Heisenberg Picture Models of Quantum Systems

In the Heisenberg picture of quantum mechanics, the observables of a system evolve with time and the quantum state remains fixed. This picture may also be extended slightly by considering the time evolution of general operators on the underlying Hilbert space, rather than just observables, which are required to be self-adjoint operators. An important class of open quantum systems which are considered in the Heisenberg picture arise when the underlying Hilbert space is infinite dimensional and the system represents a collection of independent quantum harmonic oscillators interacting with a number of external quantum fields. Such linear quantum systems are described in the Heisenberg picture by linear quantum stochastic differential equations (QSDEs) of the form

dx(t) = Ax(t)dt + B dw(t);
dy(t) = Cx(t)dt + D dw(t),   (5)

where A, B, C, D are real or complex matrices and x(t) is a vector of possibly noncommuting operators on the underlying Hilbert space H; e.g., see James et al. (2008). Also, the quantity dw(t) is decomposed as

dw(t) = β_w(t)dt + dw̃(t),

where β_w(t) is an adapted process and w̃(t) is a quantum Wiener process with Itô table

dw̃(t) dw̃(t)† = F_w̃ dt.

Here, F_w̃ ≥ 0 is a real or complex matrix. The quantity w(t) represents the components of the input quantum fields acting on the system. Also, the quantity y(t) represents the components of interest of the corresponding output fields that result from the interaction of the harmonic oscillators with the incoming fields.

In order to represent physical quantum systems, the components of the vector x(t) are required to satisfy certain commutation relations of the form

[x_j(t), x_k(t)] = 2i Θ_{jk},  j, k = 1, 2, …, n, for all t,

where the matrix Θ = (Θ_{jk}) is skew symmetric. The requirement to represent a physical quantum system places restrictions on the matrices A, B, C, D.
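One such restriction can be checked directly in the noise-free special case B = 0: then x(t) = e^{At}x(0), so the commutation relations [x_j(t), x_k(t)] = 2iΘ_{jk} are preserved for all t exactly when AΘ + ΘAᵀ = 0. The sketch below verifies this for a single harmonic oscillator with quadratures x = (q, p); the frequency and damping values are hypothetical choices:

```python
import numpy as np

Theta = np.array([[0.0, 1.0], [-1.0, 0.0]])        # commutation matrix for x = (q, p)

omega = 2.0                                        # hypothetical oscillator frequency
A = np.array([[0.0, omega], [-omega, 0.0]])        # dq = omega*p dt, dp = -omega*q dt

# With B = 0, [x_j(t), x_k(t)] = 2i*(e^{At} Theta e^{A^T t})_{jk}, which stays
# equal to 2i*Theta_{jk} for all t iff A @ Theta + Theta @ A.T = 0.
print(np.allclose(A @ Theta + Theta @ A.T, 0))     # True: consistent dynamics

A_damped = np.array([[0.0, omega], [-omega, -0.5]])  # damping with no quantum noise
print(np.allclose(A_damped @ Theta + Theta @ A_damped.T, 0))  # False
```

The damped example illustrates why dissipation cannot appear without an accompanying noise term: for B ≠ 0 the full restrictions involve a compensating B-dependent term.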
These restrictions are referred to as physical realizability conditions; e.g., see James et al. (2008) and Shaiju and Petersen (2012). QSDE models of the form (5) arise frequently in the area of quantum optics. They can also be generalized to allow for nonlinear quantum systems such as arise in the areas of nonlinear quantum optics and superconducting quantum circuits; e.g., see Bertet et al. (2012). These models are used in the feedback control of quantum systems, both in the case of classical measurement feedback and in the case of coherent feedback, in which the controller is itself a quantum system and is represented by such a QSDE model.

(S, L, H) Quantum System Models

An alternative method of modeling an open quantum system, as opposed to the stochastic master equation (SME) approach or the quantum stochastic differential equation (QSDE) approach considered above, is to simply model the quantum system in terms of the physical quantities which underlie the SME and QSDE models. For a general open quantum system, these quantities are the scattering matrix S, which is a matrix of operators on the underlying Hilbert space; the coupling operator L, which is a vector of operators on the underlying Hilbert space; and the system Hamiltonian H, which is a self-adjoint operator on the underlying Hilbert space; e.g., see Gough and James (2009). For a given (S, L, H) model, the corresponding SME model or QSDE model can be calculated using standard formulas; e.g., see Bouten et al. (2007) and James et al. (2008). Also, in certain circumstances, an (S, L, H) model can be calculated from an SME model or a QSDE model. For example, if the linear QSDE model (5) is physically realizable, then a corresponding (S, L, H) model can be found. In fact, this amounts to the definition of physical realizability.

Open Loop Control of Quantum Systems

A fundamental question in the open loop control of quantum systems is the question of controllability. For the case of a closed quantum system of the form (1), the question of controllability can be defined as follows (e.g., see Albertini and D'Alessandro 2003):

Definition 1 (Pure State Controllability) The quantum system (1) is said to be pure state controllable if for every pair of initial and final states |ψ₀⟩ and |ψ_f⟩, there exist control functions {u_k(t)} and a time T > 0 such that the corresponding solution of (1) with initial condition |ψ₀⟩ satisfies |ψ(T)⟩ = |ψ_f⟩.

Alternative definitions have also been considered for the controllability of the quantum system (1); e.g., see Albertini and D'Alessandro (2003), and Grigoriu et al. (2013) in the case of open quantum systems. The following theorem provides a necessary and sufficient condition for pure state controllability in terms of the Lie algebra L₀ generated by the matrices {iH₀, iH₁, …, iH_m}, u(N) the Lie algebra corresponding to the unitary group of dimension N, su(N) the Lie algebra corresponding to the special unitary group of dimension N, sp(N/2) the N/2-dimensional symplectic group, and L̃ the Lie algebra conjugate to sp(N/2).

Theorem 1 (See D'Alessandro 2007) The quantum system (1) is pure state controllable if and only if the Lie algebra L₀ satisfies one of the following conditions:
(1) L₀ = su(N);
(2) L₀ is conjugate to sp(N/2);
(3) L₀ = u(N);
(4) L₀ = span{iI_{N×N}} ⊕ L̃.

Similar conditions have been obtained when alternative definitions of controllability are used. Once it has been determined that a quantum system is controllable, the next task in open loop quantum control is to determine the control functions {u_k(t)} which drive a given initial state to a given final state. An important approach to this problem is the optimal control approach, in which a time optimal control problem is solved using Pontryagin's maximum principle to construct the control functions {u_k(t)} which drive the given initial state to the given final state in minimum time; e.g., see Khaneja et al. (2001).
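The Lie-algebraic conditions of Theorem 1 can be tested numerically by generating L₀ through repeated commutators until the real span stops growing. The sketch below uses a hypothetical two-level example with H₀ = σ_z and H₁ = σ_x, for which L₀ = su(2) (condition (1), dimension N² − 1 = 3) is expected:

```python
import numpy as np
from itertools import combinations

sz = np.array([[1, 0], [0, -1]], dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
gens = [1j * sz, 1j * sx]                  # generators iH_0, iH_1 of L_0

def real_rank(mats):
    # dimension of the real span of the matrices, flattened into real vectors
    M = np.array([np.concatenate([m.real.ravel(), m.imag.ravel()]) for m in mats])
    return np.linalg.matrix_rank(M)

def lie_algebra_dim(gens, max_iter=10):
    basis = list(gens)
    for _ in range(max_iter):
        new = [a @ b - b @ a for a, b in combinations(basis, 2)]
        if real_rank(basis + new) == real_rank(basis):
            return real_rank(basis)        # closed under commutation
        basis += new
    return real_rank(basis)

print(lie_algebra_dim(gens))               # 3 = dim su(2), so condition (1) holds
```

For other examples, the computed dimension would be compared against dim su(N) = N² − 1, dim u(N) = N², or the symplectic cases of Theorem 1.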
The time optimal control approach works well for low dimensional quantum systems but is computationally intractable for high dimensional quantum systems.

An alternative approach for high dimensional quantum systems is the Lyapunov control approach. In this approach, a Lyapunov function is selected which provides a measure of the distance between the current quantum state and the desired terminal quantum state. An example of such a Lyapunov function is

V = ⟨ψ(t) − ψ_f | ψ(t) − ψ_f⟩ ≥ 0;

e.g., see Mirrahimi et al. (2005). A state feedback control law is then chosen to ensure that the time derivative of this Lyapunov function is negative. This state feedback control law is then simulated with the quantum system dynamics (1) to give the required open loop control functions {u_k(t)}.

Classical Measurement Based Quantum Feedback Control

A Schrödinger Picture Approach to Classical Measurement Based Quantum Feedback Control

In the Schrödinger picture approach to classical measurement based quantum feedback control with weak continuous measurements, we begin with the stochastic master equations (3) and (4), which are considered both as a model for the system being controlled and as a filter which will form part of the final controller. These filter equations are then combined with a control law of the form

u(t) = f(ρ(t)),

where the function f(·) is designed to achieve a particular objective such as stabilization of the quantum system. Here u(t) represents the vector of control inputs u_k(t). An example of such a quantum control scheme is given in the paper Mirrahimi and van Handel (2007), in which a Lyapunov method is used to design the control law f(·) so that a quantum system consisting of an atomic ensemble interacting with an electromagnetic field is stabilized about a specified state ρ_f = |ψ_m⟩⟨ψ_m|.

A Heisenberg Picture Approach to Classical Measurement Based Quantum Feedback Control

In this Heisenberg picture approach to classical measurement based quantum feedback control, we begin with a quantum system which is described by linear quantum stochastic equations of the form (5). In these equations, it is assumed that the components of the output vector all commute with each other and so can be regarded as classical quantities. This can be achieved if each of the components is obtained via a process of homodyne detection from the corresponding electromagnetic field; e.g., see Bachor and Ralph (2004). Also, it is assumed that the input electromagnetic field w(t) can be decomposed as

dw(t) = [β_u(t)dt + dw̃₁(t); dw₂(t)],   (6)

where β_u(t) represents the classical control input signal and w̃₁(t), w₂(t) are quantum Wiener processes. The control signal displaces components of the incoming electromagnetic field acting on the system via the use of an electro-optic modulator; e.g., see Bachor and Ralph (2004).

The classical measurement feedback based controllers to be considered are classical systems described by stochastic differential equations of the form

dx_K(t) = A_K x_K(t)dt + B_K dy(t),
β_u(t)dt = C_K x_K(t)dt.   (7)

For a given quantum system model (5), the matrices in the controller (7) can be designed using standard classical control theory techniques such as LQG control (see Doherty and Jacobs 1999) or H∞ control (see James et al. 2008).
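As a sketch of the LQG route, (5) can be treated as a classical linear stochastic system and the controller matrices (A_K, B_K, C_K) in (7) obtained from the standard LQR and Kalman filter Riccati equations. The system and weighting matrices below are hypothetical illustration values, not a physical model from this entry:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [-1.0, -0.1]])   # hypothetical system matrices for (5)
B = np.array([[0.0], [1.0]])               # channel through which beta_u enters
C = np.array([[1.0, 0.0]])                 # homodyne measurement output
Q, R = np.eye(2), np.eye(1)                # LQR state and control weights
W, V = np.eye(2), np.eye(1)                # process and measurement noise intensities

P = solve_continuous_are(A, B, Q, R)       # control Riccati equation
K = np.linalg.solve(R, B.T @ P)            # LQR gain
S = solve_continuous_are(A.T, C.T, W, V)   # filter Riccati equation
L = S @ C.T @ np.linalg.inv(V)             # Kalman gain

# Controller (7): dx_K = A_K x_K dt + B_K dy,  beta_u dt = C_K x_K dt
A_K, B_K, C_K = A - B @ K - L @ C, L, -K

# Separation principle: closed-loop eigenvalues are those of A - B K and A - L C
closed = np.block([[A, B @ C_K], [B_K @ C, A_K]])
print(np.all(np.linalg.eigvals(closed).real < 0))   # True
```

In the quantum setting the resulting performance must still be interpreted with care, since the output dy(t) in (5) also carries quantum noise through D dw(t); the sketch only illustrates the classical synthesis step.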
Coherent Quantum Feedback Control

Coherent feedback control of a quantum system corresponds to the case in which the controller itself is a quantum system which is coupled in a feedback interconnection to the quantum system being controlled; e.g., see Lloyd (2000). This type of control by interconnection is closely related to the behavioral interpretation of feedback control; e.g., see Polderman and Willems (1998). An important approach to coherent quantum feedback control occurs in the case when the quantum system to be controlled is a linear quantum system described by the QSDEs (5). Also, it is assumed that the input field is decomposed as in (6). However, in this case, the quantity β_u(t) represents a vector of noncommuting operators on the Hilbert space underlying the controller system. These operators are described by the following linear QSDEs, which represent the quantum controller:

dx_K(t) = A_K x_K(t)dt + B_K dy(t) + B̄_K dw̄_K(t),
dy_K(t) = C_K x_K(t)dt + D̄_K dw̄_K(t).   (8)

Then, the input β_u(t) is identified as

β_u(t) = C_K x_K(t).

Here the quantity

dw_K(t) = [dy(t); dw̄_K(t)]   (9)

represents the quantum fields acting on the controller quantum system, where w_K(t) corresponds to a quantum Wiener process with a given Itô table. Also, y(t) represents the output quantum fields from the quantum system being controlled. Note that in the case of coherent quantum feedback control, there is no requirement that the components of y(t) commute with each other, and this in fact represents one of the main advantages of coherent quantum feedback control as opposed to classical measurement based quantum feedback control.

An important requirement in coherent feedback control is that the QSDEs (8) should satisfy the conditions for physical realizability; e.g., see James et al. (2008). Subject to these constraints, the controller (8) can then be designed according to an H∞ or LQG criterion; e.g., see James et al. (2008) and Nurdin et al. (2009). In the case of coherent quantum H∞ control, it is shown in James et al. (2008) that for any controller matrices (A_K, B_K, C_K), the matrices (B̄_K, D̄_K) can be chosen so that the controller QSDEs (8) are physically realizable. Furthermore, the choice of the matrices (B̄_K, D̄_K) does not affect the H∞ performance criterion considered in James et al. (2008). This means that the coherent controller can be designed using the same approach as designing a classical H∞ controller.

In the case of coherent LQG control such as considered in Nurdin et al. (2009), the choice of the matrices (B̄_K, D̄_K) significantly affects the closed loop LQG performance of the quantum control system. This means that the approach used in solving the coherent quantum H∞ problem given in James et al. (2008) cannot be applied to the coherent quantum LQG problem. To date there exist only some nonconvex optimization methods which have been applied to the coherent quantum LQG problem (e.g., see Nurdin et al. 2009), and the general solution to the coherent quantum LQG control problem remains an open question.

Cross-References
▸ Bilinear Control of Schrödinger PDEs
▸ Robustness Issues in Quantum Control

Bibliography
Albertini F, D'Alessandro D (2003) Notions of controllability for bilinear multilevel quantum systems. IEEE Trans Autom Control 48:1399–1403
Bachor H, Ralph T (2004) A guide to experiments in quantum optics, 2nd edn. Wiley-VCH, Weinheim
Bertet P, Ong FR, Boissonneault M, Bolduc A, Mallet F, Doherty AC, Blais A, Vion D, Esteve D (2012) Circuit quantum electrodynamics with a nonlinear resonator.
196 Control of Ship Roll Motion
In: Dykman M (ed) Fluctuating nonlinear oscillators: from nanomechanics to quantum superconducting circuits. Oxford University Press, Oxford
Bouten L, van Handel R, James M (2007) An introduction to quantum filtering. SIAM J Control Optim 46(6):2199–2241
Breuer H, Petruccione F (2002) The theory of open quantum systems. Oxford University Press, Oxford
Brif C, Chakrabarti R, Rabitz H (2010) Control of quantum phenomena: past, present and future. New J Phys 12:075008
D'Alessandro D (2007) Introduction to quantum control and dynamics. Chapman & Hall/CRC, Boca Raton
Doherty A, Jacobs K (1999) Feedback-control of quantum systems using continuous state-estimation. Phys Rev A 60:2700–2711
Dong D, Petersen IR (2010) Quantum control theory and applications: a survey. IET Control Theory Appl 4(12):2651–2671
Dong D, Lam J, Tarn T (2009) Rapid incoherent control of quantum systems based on continuous measurements and reference model. IET Control Theory Appl 3:161–169
Gough J, James MR (2009) The series product and its application to quantum feedforward and feedback networks. IEEE Trans Autom Control 54(11):2530–2544
Gough JE, James MR, Nurdin HI, Combes J (2012) Quantum filtering for systems driven by fields in single-photon states or superposition of coherent states. Phys Rev A 86:043819
Grigoriu A, Rabitz H, Turinici G (2013) Controllability analysis of quantum systems immersed within an engineered environment. J Math Chem 51(6):1548–1560
James MR, Nurdin HI, Petersen IR (2008) H∞ control of linear quantum stochastic systems. IEEE Trans Autom Control 53(8):1787–1803
Kerckhoff J, Nurdin HI, Pavlichin DS, Mabuchi H (2010) Designing quantum memories with embedded control: photonic circuits for autonomous quantum error correction. Phys Rev Lett 105:040502
Khaneja N, Brockett R, Glaser S (2001) Time optimal control in spin systems. Phys Rev A 63:032308
Lloyd S (2000) Coherent quantum feedback. Phys Rev A 62:022108
Merzbacher E (1970) Quantum mechanics, 2nd edn. Wiley, New York
Mirrahimi M, van Handel R (2007) Stabilizing feedback controls for quantum systems. SIAM J Control Optim 46(2):445–467
Mirrahimi M, Rouchon P, Turinici G (2005) Lyapunov control of bilinear Schrödinger equations. Automatica 41:1987–1994
Nielsen M, Chuang I (2000) Quantum computation and quantum information. Cambridge University Press, Cambridge, UK
Nurdin HI, James MR, Petersen IR (2009) Coherent quantum LQG control. Automatica 45(8):1837–1846
Polderman JW, Willems JC (1998) Introduction to mathematical systems theory: a behavioral approach. Springer, New York
Shaiju AJ, Petersen IR (2012) A frequency domain condition for the physical realizability of linear quantum systems. IEEE Trans Autom Control 57(8):2033–2044
Shor P (1994) Algorithms for quantum computation: discrete logarithms and factoring. In: Goldwasser S (ed) Proceedings of the 35th annual symposium on the foundations of computer science. IEEE Computer Society, Los Alamitos, pp 124–134
Wang W, Schirmer SG (2010) Analysis of Lyapunov method for control of quantum states. IEEE Trans Autom Control 55(10):2259–2270
Wiseman HM, Milburn GJ (2010) Quantum measurement and control. Cambridge University Press, Cambridge, UK

Control of Ship Roll Motion

Tristan Perez¹ and Mogens Blanke²,³
¹Electrical Engineering & Computer Science, Queensland University of Technology, Brisbane, QLD, Australia
²Department of Electrical Engineering, Automation and Control Group, Technical University of Denmark (DTU), Lyngby, Denmark
³Centre for Autonomous Marine Operations and Systems (AMOS), Norwegian University of Science and Technology, Trondheim, Norway

Abstract

The undesirable effects of roll motion of ships (rocking about the longitudinal axis) became noticeable in the mid-nineteenth century, when significant changes were introduced to the design of ships as a result of sails being replaced by steam engines and the arrangement being changed from broad to narrow hulls. The combination of these changes led to lower transverse stability (lower restoring moment for a given angle of roll) with the consequence of larger roll motion. The increase in roll motion and its effect on cargo and human performance led to the development of several control devices that aimed at reducing and controlling roll motion. The control devices most commonly used today are fin stabilizers, rudder, anti-roll tanks, and gyrostabilizers. The use of
different types of actuators for control of ship roll motion has been amply demonstrated for over 100 years. Performance, however, can still fall short of expectations because of difficulties associated with control system design, which have proven to be far from trivial due to fundamental performance limitations and large variations of the spectral characteristics of wave-induced roll motion. This short article provides an overview of the fundamentals of control design for ship roll motion reduction. The overview is limited to the most common control devices. Most of the material is based on Perez (Ship motion control. Advances in industrial control. Springer, London, 2005) and Perez and Blanke (Ann Rev Control 36(1):1367–5788, 2012).

Keywords

Roll damping; Ship motion control

Ship Roll Motion Control Techniques

One of the most commonly used devices to attenuate ship roll motion are the fin stabilizers. These are small controllable fins located on the bilge of the hull, usually amidships. These devices attain a performance in the range of 60–90 % of roll reduction (root mean square) (Sellars and Martin 1992). They require control systems that sense the vessel's roll motion and act by changing the angle of the fins. These devices are expensive and introduce underwater noise that can affect sonar performance, they add to propulsion losses, and they can be damaged. Despite this, they are among the most commonly used ship roll motion control devices. From a control perspective, highly nonlinear effects (dynamic stall) may appear when operating in severe sea states and heavy rolling conditions (Gaillarde 2002).

During studies of ship damage stability conducted in the late 1800s, it was observed that under certain conditions the water inside the vessel moved out of phase with respect to the wave profile, and thus the weight of the water on the vessel counteracted the increase of pressure on the hull, hence reducing the net roll excitation moment. This led to the development of fluid anti-roll tank stabilizers. The most common type of anti-roll tank is the U-tank, which comprises two reservoirs, located one on port and one on starboard, connected at the bottom by a duct. Anti-roll tanks can be either passive or active. In passive tanks, the fluid flows freely from side to side. According to the density and viscosity of the fluid used, the tank is dimensioned so that the time required for most of the fluid to flow from side to side equals the natural roll period of the ship. Active tanks operate in a similar manner, but they incorporate a control system that modifies the natural period of the tank to match the actual ship roll period. This is normally achieved by controlling the flow of air from the top of one reservoir to the other. Anti-roll tanks attain a medium to high performance in the range of 20–70 % of roll angle reduction (RMS) (Marzouk and Nayfeh 2009). Anti-roll tanks increase the ship displacement. They can also be used to correct list (steady-state roll angle), and they are the preferred stabilizer for icebreakers.

Rudder-roll stabilization (RRS) is a technique based on the fact that the rudder is located not only aft, but also below the center of gravity of the vessel, and thus the rudder imparts not only yaw but also roll moment. The idea of using the rudder for simultaneous course keeping and roll reduction was conceived in the late 1960s from observations of anomalous behavior of autopilots that did not have appropriate wave filtering – a feature of the autopilot that prevents the rudder from reacting to every single wave; see, for example, Fossen and Perez (2009) for a discussion on wave filtering. Rudder-roll stabilization has been demonstrated to attain medium to high performance in the range of 50–75 % of roll reduction (RMS) (Baitis et al. 1983; Blanke et al. 1989; Källström et al. 1988; Oda et al. 1992; van Amerongen et al. 1990). An upgrade of the rudder machinery is required to attain slew rates in the range of 10–20 deg/s for RRS to have sufficient control authority.

A gyrostabilizer uses the gyroscopic effects of large rotating wheels to generate a roll-reducing torque. The use of gyroscopic effects was
proposed in the early 1900s as a method to eliminate roll, rather than to reduce it. Although the performance of these systems was remarkable, up to 95 % roll reduction, their high cost, the increase in weight, and the large stress produced on the hull masked their benefits and prevented further developments. However, a recent increase in the development of gyrostabilizers has been seen in the yacht industry (Perez and Steinmann 2009).

Fins and rudder give rise to lift forces in proportion to the square of the flow velocity past the fin. Hence, roll stabilization by fin or rudder is not possible at low or zero speed. Only U-tanks and gyro devices are able to provide stabilization in these conditions. For further details about the performance of different devices, see Sellars and Martin (1992), and for a comprehensive description of the early development of devices, see Chalmers (1931).

Modeling of Ship Roll Motion for Control Design

The study of roll motion dynamics for control system design is normally done in terms of either one- or four-degrees-of-freedom (DOF) models. The choice between models of different complexity depends on the type of motion control system considered.

For a one-degree-of-freedom (1DOF) case, the following model is used:

φ̇ = p,  (1)
Ixx ṗ = Kh + Kw + Kc,  (2)

where φ is the roll angle, p is the roll rate, and Ixx is the rigid-body moment of inertia about the x-axis of a body-fixed coordinate system, where Kh is the hydrostatic and hydrodynamic torque, Kw the torque generated by wave forces acting on the hull, and Kc the control torque. The hydrodynamic torque can be approximated by the following parametric model: Kh ≈ Kṗ ṗ + Kp p + Kp|p| p|p| + K(φ). The first term represents a hydrodynamic torque in roll due to pressure change that is proportional to the roll acceleration, and the coefficient Kṗ is called the roll added mass (inertia). The second term is a damping term, which captures forces due to wave making and linear skin friction, and the coefficient Kp is a linear damping coefficient. The third term is a nonlinear damping term, which captures forces due to viscous effects. The last term is the restoring torque due to gravity and buoyancy.

For a 4DOF model (surge, sway, roll, and yaw), the motion variables considered are η = [x y φ ψ]ᵀ, ν = [u v p r]ᵀ, τᵢ = [X Y K N]ᵀ, where ψ is the yaw angle, the body-fixed velocities are u-surge and v-sway, and r is the yaw rate. The forces and torques are X-surge, Y-sway, K-roll, and N-yaw. With these variables, the following mathematical model is usually considered:

η̇ = J(η) ν,  (3)
MRB ν̇ + CRB(ν) ν = τh + τc + τd,  (4)

where J(η) is a kinematic transformation, MRB is the rigid-body inertia matrix that corresponds to expressing the inertia tensor in body-fixed coordinates, CRB(ν) is the rigid-body Coriolis and centripetal matrix, and τh, τc, and τd represent the hydrodynamic, control, and disturbance vectors of force components and torques, respectively. The hydrostatic and hydrodynamic forces are τh ≈ −MA ν̇ − CA(ν) ν − D(ν) ν − K(φ). The first two terms have their origin in the motion of a vessel in an irrotational flow of a nonviscous fluid. The third term corresponds to damping forces due to potential (wave-making) effects, skin friction, vortex shedding, and circulation (lift and drag). The hydrodynamic effects involved are quite complex, and different approaches based on superposition of either odd-term Taylor expansions or square-modulus (x|x|) series expansions are usually considered (Abkowitz 1964; Fedyaevsky and Sobolev 1964). The K(φ) term represents the restoring forces in roll due to buoyancy and gravity. The 4DOF model captures parameter dependency on ship speed as well as the couplings between steering and roll, and it is useful for controller design. For additional details about mathematical models of marine vehicles, see Fossen (2011).
Wave-Disturbance Models

The action of the waves creates changes in pressure on the hull of the ship, which translate into forces and moments. It is common to model the ship motion response due to waves within a linear framework and to obtain two frequency-response functions (FRF), the wave-to-excitation Fᵢ(jω, U, χ) and wave-to-motion Hᵢ(jω, U, χ) response functions, where i indicates the degree of freedom. These FRF depend on the wave frequency ω, the ship speed U, and the angle χ at which the waves encounter the ship – this is called the encounter angle.

The wave elevation in deep water is approximately a stochastic process that is zero mean, stationary for short periods of time, and Gaussian (Haverre and Moan 1985). Under these assumptions, the wave elevation ζ is fully described by a power spectral density Φζζ(ω). With a linear response assumption, the power spectral densities of the wave-to-excitation force and wave-to-motion responses can be expressed as

ΦFF,i(jω) = |Fᵢ(jω, U, χ)|² Φζζ(jω),
Φηη,i(jω) = |Hᵢ(jω, U, χ)|² Φζζ(jω).

These spectra are models of the wave-induced forces and motions, respectively, from which it is common to generate either time series of wave excitation forces in terms of the encounter frequency to be used as input disturbances in simulation models, or time series of wave-induced motion to be used as output disturbances; see, for example, Perez (2005) and references herein.

Roll Motion Control and Performance Limitations

The analysis of the performance of ship roll motion control by means of force actuators is usually conducted within a linear framework by linearizing the models. For a SISO loop where the wave-induced roll motion is considered an output disturbance, the Bode integral constraint applies. This imposes restrictions on one's freedom to shape the closed-loop transfer function to attenuate the motion due to the wave-induced forces in different frequency ranges. These results have important consequences for the design of a roll motion control system since the frequency of the waves seen from the vessel changes significantly with the sea state, the speed of the vessel, and the wave encounter angle. The changing characteristics of open-loop roll motion in conjunction with the Bode integral constraint make the control design challenging, since roll amplification may occur if the control design is not done properly. For some roll motion control problems, like using the rudder for simultaneous roll attenuation and heading control, the system presents non-minimum phase dynamics. In this case, the trade-off of reduced sensitivity vs. amplification of roll motion is dominating at frequencies close to the non-minimum phase zero – a constraint with origin in the Poisson integral (Hearns and Blanke 1998); see also Perez (2005).

It should be noted that non-minimum phase dynamics also occur with fin stabilizers when the stabilizers are located aft of the center of gravity. With the fins at this location, they behave like a rudder and introduce non-minimum phase dynamics and heading interference at low wave-excitation frequencies. These aspects of fin location were discussed by Lloyd (1989).

The above discussion highlights general design constraints that apply to roll motion control systems in terms of the dynamics of the vessel and actuator. In addition to these constraints, one also needs to account for limitations in actuator slew rate and angle.

Control Techniques Used in Different Roll Control Systems

Fin Stabilizers
In regard to fin stabilizers, the control design is commonly addressed using the 1DOF model (1) and (2). The main issues associated with control design are the parametric uncertainty in the model and the Bode integral constraint. This integral constraint can lead to roll amplification due to changes in the spectrum of the wave-induced
roll moment with sea state and sailing conditions (speed and encounter angle). Fin machinery is designed so that the rate of the fin motion is fast enough, and actuator rate saturation is not an issue in moderate sea states. The fins could be used to correct heeling angles (steady-state roll) when the ship makes speed, but this is avoided due to added resistance. If it is used, integral action needs to include anti-windup. In terms of control strategies, PID, H∞, and LQR techniques have been successfully applied in practice. Highly nonlinear effects (dynamic stall) may appear when operating in severe sea states and heavy rolling conditions, and proposals for applications of model predictive control have been put forward to constrain the effective angle of attack of the fins. In addition, if the fins are located too far aft along the ship, the dynamic response from fin angle to roll can exhibit non-minimum phase dynamics, which can limit the performance at low encounter frequencies. A thorough review of the control literature can be found in Perez and Blanke (2012).

Rudder-Roll Stabilization
The problem of rudder-roll stabilization requires the 4DOF model (3) and (4), which captures the interaction between roll, sway, and yaw together with the changes in the hydrodynamic forces due to the forward speed. The response from rudder to roll is non-minimum phase (NMP), and the system is characterized by further constraints due to the single-input-two-output nature of the control problem – attenuate roll without too much interference with the heading. Studies of fundamental limitations due to NMP dynamics have been approached using standard frequency-domain tools by Hearns and Blanke (1998) and Perez (2005). A characterization of the trade-off between roll reduction vs. increase of interference was part of the controller design in Stoustrup et al. (1994). Perez (2005) determined the limits obtainable using optimal control with full disturbance information. The latter also incorporated constraints due to the limited authority of the control action in rate and magnitude of the rudder machinery and stall conditions of the rudder. The control design for rudder-roll stabilization has been addressed in practice using PID, LQG, H∞, and standard frequency-domain linear control designs. The characteristics of limited control authority were solved by van Amerongen et al. (1990) using automatic gain control. In the literature, there have been proposals put forward for the use of model predictive control, QFT, sliding-mode nonlinear control, and autoregressive stochastic control. Combined use of fin and rudder has also been investigated. Grimble et al. (1993) and later Roberts et al. (1997) used H∞ control techniques. A thorough comparison of controller performances for warships was published in Crossland (2003). A thorough review of the control literature can be found in Perez and Blanke (2012).

Gyrostabilizers
Using a single-gimbal-suspension gyrostabilizer for roll damping control, the coupled vessel-roll-gyro system can be modeled as follows:

φ̇ = p,  (5)
Kṗ ṗ + Kp p + K(φ) = Kw − Kg α̇ cos α,  (6)
Ip α̈ + Bp α̇ + Cp sin α = Kg p cos α + Tp,  (7)

where (6) represents the 1DOF roll dynamics and (7) represents the dynamics of the gyrostabilizer about the axis of the gimbal suspension, where α is the gimbal angle, equivalent to the precession angle for a single gimbal suspension, Ip is the gimbal and wheel inertia about the gimbal axis, Bp is the damping, and Cp is a restoring term of the gyro about the precession axis due to the location of the gyro center of mass relative to the precession axis (Arnold and Maunder 1961). Tp is the control torque applied to the gimbal. The use of twin counter-spinning wheels prevents gyroscopic coupling with other degrees of freedom. Hence, the control design for gyrostabilizers can be based on a linear single-degree-of-freedom model for roll.

The wave-induced roll moment Kw excites the roll motion. As the roll motion develops, the roll rate p induces a torque along the precession axis of the gyrostabilizer. As the precession angle α
develops, there is a reaction torque done on the vessel that opposes the wave-induced moment. The latter is the roll-stabilizing torque, Xg = −Kg α̇ cos α ≈ −Kg α̇. This roll torque can only be controlled indirectly through the precession dynamics in (7) via Tp. In the model above, the spin angular velocity ωspin is controlled to be constant; hence the wheels' angular momentum Kg = Ispin ωspin is constant.

The precession control torque Tp is used to control the gyro. As observed by Sperry (Chalmers 1931), the intrinsic behavior of the gyrostabilizer is to use roll rate to generate a roll torque. Hence, one could design a precession torque controller such that, from the point of view of the vessel, the gyro behaves as a damper. Depending on how the precession torque is delivered, it may be necessary to constrain the precession angle and rate. This problem has recently been considered in Donaire and Perez (2013) using passivity-based control.

U-tanks
U-tanks can be passive or active. Roll reduction is achieved by attempting to transfer energy from the roll motion to the motion of liquid within the tank and using the weight of the liquid to counteract the wave excitation moment. A key aspect of the design is the dimension and geometry of the tank, to ensure that there is enough weight due to the displaced liquid in the tank and that the oscillation of the fluid in the tank matches the vessel's natural frequency in roll; see Holden and Fossen (2012) and references herein. The design of the U-tank can ensure a single-frequency matching, at which the performance is optimized, and for this frequency the roll natural frequency is used. As the frequency of the roll motion departs from this, a degradation of roll reduction occurs. Active U-tanks use valves to control the flow of air from the top of the reservoirs to extend the frequency matching in sailing conditions in which the roll dominant frequency is lower than the roll natural frequency – the flow of air is used to delay the motion of the liquid from one reservoir to the other. This control is achieved by detecting the dominant roll frequency and using this information to control the air flow from one reservoir to the other. If the roll dominant frequency is higher than the roll natural frequency, the U-tank is used in passive mode, and the standard roll reduction degradation occurs.

Summary and Future Directions

This article provides a brief summary of control aspects for the most common ship roll motion control devices. These aspects include the type of mathematical models used to design and analyze the control problem, the inherent fundamental limitations and the constraints that some of the designs are subjected to, and the performance that can be expected from the different devices.

As an outlook, one of the key issues in roll motion control is the model uncertainty and the adaptation to changes in the environmental conditions. As the vessel changes speed and heading, or as the seas build up or abate, the dominant frequency range of the wave-induced forces changes significantly. Due to the fundamental limitations discussed, a nonadaptive controller may produce roll amplification rather than roll reduction. This topic has received some attention in the literature via multi-mode control switching, but further work in this area could be beneficial. In recent years, new devices have appeared for stabilization at zero speed, like flapping fins and rotating cylinders. Also, the industry's interest in roll gyrostabilizers has been re-ignited. The investigation of control designs for these devices has not yet received much attention within the control community. Hence, it is expected that this will create a potential for research activity in the future.

Cross-References

Fundamental Limitation of Feedback Control
H-Infinity Control
H2 Optimal Control
Linear Quadratic Optimal Control
Mathematical Models of Ships and Underwater Vehicles
202 Control Structure Selection
Bibliography

Abkowitz M (1964) Lecture notes on ship hydrodynamics – steering and manoeuvrability. Technical report Hy-5, Hydro and Aerodynamics Laboratory, Lyngby
Arnold R, Maunder L (1961) Gyrodynamics and its engineering applications. Academic, New York/London
Baitis E, Woollaver D, Beck T (1983) Rudder roll stabilization of coast guard cutters and frigates. Nav Eng J 95(3):267–282
Blanke M, Haals P, Andreasen KK (1989) Rudder roll damping experience in Denmark. In: Proceedings of IFAC workshop CAMS'89, Lyngby
Chalmers T (1931) The automatic stabilisation of ships. Chapman and Hall, London
Crossland P (2003) The effect of roll stabilization controllers on warship operational performance. Control Eng Pract 11:423–431
Donaire A, Perez T (2013) Energy-based nonlinear control of ship roll gyro-stabiliser with precession angle constraints. In: 9th IFAC conference on control applications in marine systems, Osaka
Fedyaevsky K, Sobolev G (1964) Control and stability in ship design. State Union Shipbuilding, Leningrad
Fossen TI (2011) Handbook of marine craft hydrodynamics and motion control. Wiley, Chichester
Fossen T, Perez T (2009) Kalman filtering for positioning and heading control of ships and offshore rigs. IEEE Control Syst Mag 29(6):32–46
Gaillarde G (2002) Dynamic behavior and operation limits of stabilizer fins. In: IMAM international maritime association of the Mediterranean, Creta
Grimble M, Katebi M, Zang Y (1993) H∞-based ship fin-rudder roll stabilisation. In: 10th ship control system symposium SCSS, Ottawa, vol 5, pp 251–265
Haverre S, Moan T (1985) On some uncertainties related to short term stochastic modelling of ocean waves. In: Probabilistic offshore mechanics. Progress in engineering science. CML
Hearns G, Blanke M (1998) Quantitative analysis and design of rudder roll damping controllers. In: Proceedings of CAMS'98, Fukuoka, pp 115–120
Holden C, Fossen TI (2012) A nonlinear 7-DOF model for U-tanks of arbitrary shape. Ocean Eng 45:22–37
Källström C, Wessel P, Sjölander S (1988) Roll reduction by rudder control. In: Spring meeting – STAR symposium, 3rd IMSDC, Pittsburgh
Lloyd A (1989) Seakeeping: ship behaviour in rough weather. Ellis Horwood
Marzouk OA, Nayfeh AH (2009) Control of ship roll using passive and active anti-roll tanks. Ocean Eng 36:661–671
Oda H, Sasaki M, Seki Y, Hotta T (1992) Rudder roll stabilisation control system through multivariable autoregressive model. In: Proceedings of IFAC conference on control applications of marine systems – CAMS
Perez T (2005) Ship motion control. Advances in industrial control. Springer, London
Perez T, Blanke M (2012) Ship roll damping control. Ann Rev Control 36(1):1367–5788
Perez T, Steinmann P (2009) Analysis of ship roll gyrostabiliser control. In: 8th IFAC international conference on manoeuvring and control of marine craft, Guaruja
Roberts G, Sharif M, Sutton R, Agarwal A (1997) Robust control methodology applied to the design of a combined steering/stabiliser system for warships. IEE Proc Control Theory Appl 144(2):128–136
Sellars F, Martin J (1992) Selection and evaluation of ship roll stabilization systems. Mar Technol SNAME 29(2):84–101
Stoustrup J, Niemann HH, Blanke M (1994) Rudder-roll damping for ships – a new H∞ approach. In: Proceedings of 3rd IEEE conference on control applications, Glasgow, pp 839–844
van Amerongen J, van der Klugt P, van Nauta Lemke H (1990) Rudder roll stabilization for ships. Automatica 26:679–690

Control Structure Selection

Sigurd Skogestad
Department of Chemical Engineering, Norwegian University of Science and Technology (NTNU), Trondheim, Norway

Abstract

Control structure selection deals with selecting what to control (outputs), what to measure and what to manipulate (inputs), and also how to split the controller in a hierarchical and decentralized manner. The most important issue is probably the selection of the controlled variables (outputs), CV = Hy, where y are the available measurements and H is a degree of freedom that is seldom treated in a systematic manner by control engineers. This entry discusses how to find H for both the upper (slower) economic layer and the lower (faster) regulatory layer in the control hierarchy. Each layer may be split in a decentralized fashion. Systematic approaches for input/output (IO) selection are presented.

Keywords

Control configuration; Control hierarchy; Control structure design; Decentralized control;
Economic control; Input-output controllability; Input/output selection; Plantwide control; Regulatory control; Supervisory control

Introduction

Consider the generalized controller design problem in Fig. 1, where P denotes the generalized plant model. Here, the objective is to design the controller K, which, based on the sensed outputs v, computes the inputs (MVs) u such that the variables z are kept small, in spite of variations in the variables w, which include disturbances (d), varying setpoints/references (CVs) and measurement noise (n),

w = [d; CVs; n].

The variables z, which should be kept small, typically include the control error for the selected controlled variables (CV) plus the plant inputs (u),

z = [CV − CVs; u].

The variables v, which are the inputs to the controller, include all known variables, including measured outputs (ym), measured disturbances (dm) and setpoints,

v = [ym; dm; CVs].

The cost function for designing the optimal controller K is usually the weighted control error, J′ = ‖W′z‖. The reason for using a prime on J (J′) is to distinguish it from the economic cost J, which we later use for selecting the controlled variables (CV).

Notice that it is assumed in Fig. 1 that we know what to measure (v), manipulate (u), and, most importantly, which variables in z we would like to keep at setpoints (CV); that is, we have assumed a given control structure. The term "control structure selection" (CSS) and its synonym "control structure design" (CSD) are associated with the overall control philosophy for the system, with emphasis on the structural decisions which are a prerequisite for the controller design problem in Fig. 1:
1. Selection of controlled variables (CVs, "outputs," included in z in Fig. 1)
2. Selection of manipulated variables (MVs, "inputs," u in Fig. 1)
3. Selection of measurements y (included in v in Fig. 1)
4. Selection of control configuration (structure of the overall controller K that interconnects the controlled, manipulated and measured variables; structure of K in Fig. 1)
5. Selection of the type of controller K (PID, MPC, LQG, H-infinity, etc.) and the objective function (norm) used to design and analyze it.
Decisions 2 and 3 (selection of u and y) are sometimes referred to as the input/output (IO) selection problem. In practice, the controller (K) is usually divided into several layers, operating on different time scales (see Fig. 2), which implies that, in addition to selecting the (primary) controlled variables (CV1 ≡ CV), we must also select the (secondary) variables that interconnect the layers (CV2).

Control structure selection includes all the structural decisions that the engineer needs to make when designing a control system, but it does not involve the actual design of each individual controller block. Thus, it involves the decisions necessary to make a block diagram (Fig. 1; used by control engineers) or process & instrumentation diagram (used by process engineers) for the entire plant, and provides the starting point for a detailed controller design.

Control Structure Selection, Fig. 1 General formulation for designing the controller K. The plant P is controlled by manipulating u, and is disturbed by the signals w. The controller uses the measurements v, and the control objective is to keep the outputs (weighted control error) z as small as possible
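The relation CV = Hy used for selecting controlled variables can be made concrete with a small numerical sketch. The measurement values and both H matrices below are purely illustrative (not taken from the entry); they only show that H may either pick individual measurements or form linear combinations of them.

```python
import numpy as np

# Four available measurements y (illustrative values)
y = np.array([1.2, 0.4, 3.0, 0.7])

# H as a pure selection matrix: each row picks one measurement as a CV
H_select = np.array([[0.0, 0.0, 1.0, 0.0],   # CV1 = y3
                     [1.0, 0.0, 0.0, 0.0]])  # CV2 = y1
cv_select = H_select @ y                     # -> [3.0, 1.2]

# H as a combination matrix: each row defines a CV as a linear
# combination of measurements (the extra degree of freedom noted above)
H_combine = np.array([[0.5, 0.5, 0.0, 0.0],   # CV1 = (y1 + y2)/2
                      [0.0, 0.0, 1.0, -1.0]]) # CV2 = y3 - y4
cv_combine = H_combine @ y                    # -> [0.8, 2.3]
```

In the first case H merely selects two of the measurements as controlled variables; in the second, each controlled variable is a linear combination of measurements, which is the degree of freedom the entry argues is seldom treated systematically.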
Control Structure Selection, Fig. 2 Typical control hierarchy, as illustrated for a process plant. From top to bottom: Scheduling (weeks); Site-wide optimization (day); Local optimization, e.g., RTO (hour); Supervisory control, e.g., MPC (minutes); Regulatory control, e.g., PID (seconds). The variables CV1 link the optimization layers to the supervisory layer, and CV2 link the supervisory layer to the regulatory layer; the two lowest layers make up the control layer.

The term "plantwide control," which is a synonym for "control structure selection," is used in the field of process control. Control structure selection is particularly important for process control because of the complexity of large processing plants, but it applies to all control applications, including vehicle control, aircraft control, robotics, power systems, biological systems, social systems, and so on.
It may be argued that control structure selection is more important than the controller design itself. Yet, control structure selection is hardly covered in most control courses. This is probably related to the complexity of the problem, which requires knowledge from several engineering fields. In the mathematical sense, the control structure selection problem is a formidable combinatorial problem which involves a large number of discrete decision variables.

Overall Objectives for Control and Structure of the Control Layer

The starting point for control system design is to define clearly the operational objectives. There are usually two main objectives for control:
1. Longer-term economic operation (minimize the economic cost J subject to satisfying operational constraints)
2. Stability and short-term regulatory control
The first objective is related to "making the system operate as intended," where economics are an important issue. Traditionally, control engineers have not been much involved in this step. The second objective is related to "making sure the system stays operational," where stability and robustness are important issues; this has traditionally been the main domain of control engineers. In terms of designing the control system, the second objective (stabilization) is usually considered first. An example is bicycle riding; we first need to learn how to stabilize the bicycle (regulation) before trying to use it for something useful (optimal operation), like riding to work and selecting the shortest path.
We use the term "economic cost" because usually the cost function J can be given a monetary value, but more generally, the cost J could be any scalar cost. For example, the cost J could be the "environmental impact," and the economics could then be given as constraints.
In theory, the optimal strategy is to combine the control tasks of optimal economic operation and stabilization/regulation in a single centralized controller K, which at each time step collects all the information and computes the optimal input changes. In practice, simpler controllers are used. The main reason for this is that in most cases one can obtain acceptable control performance with simple structures, where each controller block involves only a few variables. Such control systems can be designed and tuned with much less effort, especially when it comes to the modeling and tuning effort.
So how are large-scale systems controlled in practice? Usually, the controller K is decomposed
into several subcontrollers, using two main principles:
– Decentralized (local) control. This "horizontal decomposition" of the control layer is usually based on separation in space, for example, by using local control of individual units.
– Hierarchical (cascade) control. This "vertical decomposition" is usually based on time scale separation, as illustrated for a process plant in Fig. 2. The upper three layers in Fig. 2 deal explicitly with economic optimization and are not considered here. We are concerned with the two lower control layers, where the main objective is to track the setpoints specified by the layer above.
In accordance with the two main objectives for control, the control layer is in most cases divided hierarchically into two layers (Fig. 2):
1. A "slow" supervisory (economic) layer
2. A "fast" regulatory (stabilization) layer
Another reason for the separation into two control layers is that the tasks of economic operation and regulation are fundamentally different. Combining the two objectives in a single cost function, which is required for designing a single centralized controller K, is like trying to compare apples and oranges. For example, how much is an increased stability margin worth in monetary units [$]? Only if there is a reasonable benefit in combining the two layers, for example, because there is limited time scale separation between the tasks of regulation and optimal economics, should one consider combining them into a single controller.

Notation and Matrices H1 and H2 for Controlled Variable Selection

The most important notation is summarized in Table 1 and Fig. 3. To distinguish between the two control layers, we use "1" for the upper supervisory (economic) layer and "2" for the regulatory layer, which is "secondary" in terms of its place in the control hierarchy.

Control Structure Selection, Fig. 3 Block diagram of a typical control hierarchy, emphasizing the selection of controlled variables for supervisory (economic) control (CV1 = H1 y) and regulatory control (CV2 = H2 y)

Control Structure Selection, Table 1 Important notation
u = [u1; u2] = set of all available physical plant inputs
u1 = inputs used directly by supervisory control layer
u2 = inputs used by regulatory layer
ym = set of all measured outputs
y = [ym; u] = combined set of measurements and inputs
y2 = controlled outputs in regulatory layer (subset or combination of y); dim(y2) = dim(u2)
CV1 = H1 y = controlled variables in supervisory layer; dim(CV1) = dim(u)
CV2 = [y2; u1] = H2 y = controlled variables in regulatory layer; dim(CV2) = dim(u)
MV1 = CV2s = [y2s; u1] = manipulated variables in supervisory layer; dim(MV1) = dim(u)
MV2 = u2 = manipulated variables in regulatory layer; dim(MV2) = dim(u2) ≤ dim(u)

There is often limited possibility to select the input set (u) as it is usually constrained by the
plant design. However, there may be a possibility to add inputs or to move some to another location, for example, to avoid saturation or to reduce the time delay and thus improve the input-output controllability.
There is much more flexibility in terms of output selection, and the most important structural decision is related to the selection of controlled variables in the two control layers, as given by the decision matrices H1 and H2 (see Fig. 3):

CV1 = H1 y
CV2 = H2 y

Note from the definition in Table 1 that y = [ym; u]. Thus, y includes, in addition to the candidate measured outputs (ym), also the physical inputs u. This allows for the possibility of selecting an input u as a "controlled" variable, which means that this input is kept constant (or, more precisely, the input is left "unused" for control in this layer).
In general, H1 and H2 are "full" matrices, allowing for measurement combinations as controlled variables. However, for simplicity, especially in the regulatory layer, we often prefer to control individual measurements; that is, H2 is usually a "selection matrix," where each row in H2 contains one 1-element (to identify the selected variable) with the remaining elements set to 0. In this case, we can write CV2 = H2 y = [y2; u1], where y2 denotes the actual controlled variables in the regulatory layer, whereas u1 denotes the "unused" inputs, which are left as degrees of freedom for the supervisory layer. Note that this indirectly determines the inputs u2 used in the regulatory layer to control y2, because u2 is what remains in the set u after selecting u1. To have a simple control structure, with as few regulatory loops as possible, it is desirable that H2 is selected such that there are many inputs (u1) left "unused" in the regulatory layer.
Example. Assume there are three candidate output measurements (temperatures T) and two inputs (flowrates q),

ym = [Ta Tb Tc]; u = [qa qb]

and we have by definition y = [ym; u]. Then the choice

H2 = [0 1 0 0 0; 0 0 0 0 1]

means that we have selected CV2 = H2 y = [Tb; qb]. Thus, u1 = qb is an unused input for regulatory control, and in the regulatory layer we close one loop, using u2 = qa to control y2 = Tb. If we instead select

H2 = [1 0 0 0 0; 0 0 1 0 0]

then we have CV2 = [Ta; Tc]. None of these are inputs, so u1 is an empty set in this case. This means that we need to close two regulatory loops, using u2 = [qa; qb] to control y2 = [Ta; Tc].

Supervisory Control Layer and Selection of Economic Controlled Variables (CV1)

Some objectives for the supervisory control layer are given in Table 2. The main structural issue for the supervisory control layer, and probably the most important decision in the design of any control system, is the selection of the primary (economic) controlled variable CV1. In many cases, a good engineer can make a reasonable choice based on process insight and experience. However, the control engineer must realize that this is a critical decision.

Control Structure Selection, Table 2 Objectives of supervisory control layer
O1. Control primary "economic" variables CV1 at setpoint using as degrees of freedom MV1, which includes the setpoints to the regulatory layer (y2s = CV2s) as well as any "unused" degrees of freedom (u1)
O2. Switch controlled variables (CV1) depending on operating region, for example, because of change in active constraints
O3. Supervise the regulatory layer, for example, to avoid input saturation (u2), which may destabilize the system
O4. Coordinate control loops (multivariable control) and reduce effect of interactions (decoupling)
O5. Provide feedforward action from measured disturbances
O6. Make use of additional inputs, for example, to improve the dynamic performance (usually combined with input midranging control) or to extend the steady-state operating range (split range control)
O7. Make use of extra measurements, for example, to estimate the primary variables CV1

The main rules and issues for selecting CV1 are:
CV1 Rule 1. Control active constraints (almost always)
• Active constraints may often be identified by engineering insight, but more generally their identification requires optimization based on a detailed model.
For example, consider the problem of minimizing the driving time between two cities (cost J = T). There is a single input (u = fuel flow f [l/s]) and the optimal solution is often constrained. When driving a fast car, the active constraint may be the speed limit (CV1 = v [km/h] with setpoint vmax, e.g., vmax = 100 km/h). When driving
an old car, the active constraint may be the maximum fuel flow (CV1 = f [l/s] with setpoint fmax). The latter corresponds to an input constraint (umax = fmax), which is trivial to implement ("full gas"); the former corresponds to an output constraint (ymax = vmax), which requires a controller ("cruise control").
• For "hard" output constraints, which cannot be violated at any time, we need to introduce a backoff (safety margin) to guarantee feasibility. The backoff is defined as the difference between the optimal value and the actual setpoint; for example, we need to back off from the speed limit because of the possibility of measurement error and imperfect control:

CV1,s = CV1,max - backoff

For example, to avoid exceeding the speed limit of 100 km/h, we may set backoff = 5 km/h, and use a setpoint vs = 95 km/h rather than 100 km/h.
CV1 Rule 2. For the remaining unconstrained degrees of freedom, look for "self-optimizing" variables which, when held constant, indirectly lead to close-to-optimal operation, in spite of disturbances.
• Self-optimizing variables (CV1 = H1 y) are variables which, when kept constant, indirectly (through the action of the feedback control system) lead to close-to-optimal adjustment of the inputs (u) when there are disturbances (d).
• An ideal self-optimizing variable is the gradient of the cost function with respect to the unconstrained input, CV1 = dJ/du = Ju.
• More generally, since we rarely can measure the gradient Ju, we select CV1 = H1 y. The selection of a good H1 is a nontrivial task, but some quantitative approaches are given below.
For example, consider again the problem of driving between two cities, but assume that the objective is to minimize the total fuel, J = V [liters]. Here, driving at maximum speed will consume too much fuel, and driving too slowly is also nonoptimal. This is an unconstrained optimization problem, and identifying a good CV1 is not obvious. One option is to maintain a constant speed (CV1 = v), but the optimal value of v may vary depending on the slope of the road. A more "self-optimizing" option could be to keep a constant fuel rate (CV1 = f [l/s]), which will imply that we drive slower uphill and faster downhill. More generally, one can control combinations, CV1 = H1 y, where H1 is a "full" matrix.
CV1 Rule 3. For the unconstrained degrees of freedom, one should never control a variable that reaches its maximum or minimum value at the optimum; for example, never try to control directly the cost J. Violation of this rule gives either infeasibility (if attempting to control J at a lower value than Jmin) or nonuniqueness (if attempting to control J at a higher value than Jmin).
Assume again that we want to minimize the total fuel needed to drive between two cities, J = V [l]. Then one should avoid fixing the
total fuel, CV1 = V [l], or, alternatively, avoid fixing the fuel consumption ("gas mileage") in liters per km (CV1 = f [l/km]). Attempting to control the fuel consumption [l/km] below the car's minimum value is obviously not possible (infeasible). Alternatively, attempting to control the fuel consumption above its minimum value has two possible solutions: driving slower or faster than the optimum. Note that the policy of controlling the fuel rate f [l/s] at a fixed value will never become infeasible.
For CV1-Rule 2, it is always possible to find good variable combinations (i.e., H1 is a "full" matrix), at least locally, but whether or not it is possible to find good individual variables (H1 is a selection matrix) is not obvious. To help identify potential "self-optimizing" variables (CV1 = c), the following requirements may be used:
Requirement 1. The optimal value of c is insensitive to disturbances, that is, dcopt/dd = H1 F is small. Here F = dyopt/dd is the optimal sensitivity matrix (see below).
Requirement 2. The variable c is easy to measure and control accurately.
Requirement 3. The value of c is sensitive to changes in the manipulated variable, u; that is, the gain, G = H1 Gy, from u to c is large (so that even a large error in the controlled variable, c, results in only a small variation in u). Equivalently, the optimum should be "flat" with respect to the variable, c. Here Gy = dy/du is the measurement gain matrix (see below).
Requirement 4. For cases with two or more controlled variables c, the selected variables should not be closely correlated.
All four requirements should be satisfied. For example, for the operation of a marathon runner, the heart rate may be a good "self-optimizing" controlled variable c (to keep at constant setpoint). Let us check this against the four requirements. The optimal heart rate is weakly dependent on the disturbances (requirement 1), and the heart rate is easy to measure (requirement 2). The heart rate is quite sensitive to changes in power input (requirement 3). Requirement 4 does not apply since this is a problem with only one unconstrained input (the power). In summary, the heart rate is a good candidate.

Regions and switching. If the optimal active constraints vary depending on the disturbances, new controlled variables (CV1) must be identified (offline) for each active constraint region, and online switching is required to maintain optimality. In practice, it is easy to identify when to switch when one reaches a constraint. It is less obvious when to switch out of a constraint, but actually one simply has to monitor the value of the unconstrained CVs from the neighbouring regions and switch out of the constraint region when the unconstrained CV reaches its setpoint.
In general, one would like to simplify the control structure and reduce the need for switching. This may require using a suboptimal CV1 in some regions of active constraints. In this case, the setpoint for CV1 may not be its nominally optimal value (which is the normal choice), but rather a "robust setpoint" (with backoff) which reduces the loss when we are outside the nominal constraint region.

Structure of supervisory layer. The supervisory layer may either be centralized, e.g., using model predictive control (MPC), or decomposed into simpler subcontrollers using standard elements, like decentralized control (PID), cascade control, selectors, decouplers, feedforward elements, ratio control, split range control, and input midrange control (also known as input resetting, valve position control, or habituating control). In theory, the performance is better with the centralized approach (e.g., MPC), but the difference can be small when designed by a good engineer. The main reasons for using simpler elements are that (1) the system can be implemented in the existing "basic" control system, (2) it can be implemented with little model information, and (3) it can be built up gradually. However, such systems can quickly become complicated and difficult to understand for anyone other than the engineer who designed it. Therefore, model-based centralized solutions (MPC) are often preferred because the design is more systematic and easier to modify.
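The driving example can be made concrete with a small numerical experiment. The fuel model below is an illustrative assumption (not from the article): each candidate CV1 is held at its nominal setpoint, and we evaluate the extra fuel used, relative to true reoptimization, when the road slope (the disturbance) changes.

```python
import numpy as np

# Toy fuel model (illustrative assumption, not from the article):
# fuel per km q(v, d) = c0/v + c1*v**2 + c2*d*v, with speed v [km/h]
# and road slope d as the disturbance. Total cost J is proportional to q.
c0, c1, c2 = 2.0, 2e-6, 1e-4

def q(v, d):                     # fuel consumption per km [l/km]
    return c0 / v + c1 * v**2 + c2 * d * v

def f(v, d):                     # fuel rate, proportional to q * v
    return q(v, d) * v

v_grid = np.linspace(30.0, 130.0, 2001)

# Nominal optimum (flat road, d = 0) defines the setpoints
v_s = v_grid[np.argmin(q(v_grid, 0.0))]   # setpoint for CV1 = v
f_s = f(v_s, 0.0)                         # setpoint for CV1 = f

def loss(cv, d):
    """Cost with the candidate CV1 held at its nominal setpoint,
    minus the truly optimal cost for this disturbance."""
    if cv == "v":                # constant-speed policy
        v = v_s
    else:                        # constant fuel rate: find v with f(v, d) = f_s
        v = v_grid[np.argmin(np.abs(f(v_grid, d) - f_s))]
    return q(v, d) - q(v_grid, d).min()

for d in [0.0, 2.0, 5.0]:        # flat road and two uphill grades
    print(f"d={d}: loss(v)={loss('v', d):.4f}, loss(f)={loss('f', d):.4f}")
```

With these numbers, keeping the fuel rate f constant loses noticeably less than keeping the speed constant on the uphill grades, which is the "self-optimizing" behavior described above.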
Quantitative Approach for Selecting Economic Controlled Variables, CV1

A quantitative approach for selecting economic controlled variables is to consider the effect of the choice CV1 = H1 y on the economic cost J when disturbances d occur. One should also include noise/errors (ny) related to the measurements and inputs.
Step S1. Define operational objectives (economic cost function J and constraints)
We first quantify the operational objectives in terms of a scalar cost function J [$/s] that should be minimized (or, equivalently, a scalar profit function, P = -J, that should be maximized). For process control applications, this is usually easy, and typically we have

J = cost of feed + cost of utilities (energy) - value of products [$/s]

Note that the economic cost function J is used to select the controlled variables (CV1), and another cost function (J0), typically involving the deviation in CV1 from their optimal setpoints CV1s, is used for the actual controller design (e.g., using MPC).
Step S2. Find optimal operation for expected disturbances
Mathematically, the optimization problem can be formulated as

min_u J(u, x, d)

subject to:

Model equations: dx/dt = f(u, x, d)
Operational constraints: g(u, x, d) ≤ 0

In many cases, the economics are determined by the steady-state behavior, so we can set dx/dt = 0. The optimization problem should be resolved for the expected disturbances (d) to find the truly optimal operation policy, uopt(d). The nominal solution (dnom) may be used to obtain the setpoints (CV1s) for the selected controlled variables. In practice, the optimum input uopt(d) cannot be realized, because of model error and unknown disturbances d, so we use a feedback implementation where u is adjusted to keep the selected variables CV1 at their nominally optimal setpoints.
Together with obtaining the model, the optimization step S2 is often the most time-consuming step in the entire plantwide control procedure.
Step S3. Select supervisory (economic) controlled variables, CV1

CV1-Rule 1: Control Active Constraints
A primary goal for solving the optimization problem is to find the expected regions of active constraints, and a constraint is said to be "active" if g = 0 at the optimum. The optimally active constraints will vary depending on disturbances (d) and market conditions (prices).

CV1-Rule 2: Control Self-Optimizing Variables
After having identified (and controlled) the active constraints, one should consider the remaining lower-dimension unconstrained optimization problem, and for the remaining unconstrained degrees of freedom one should search for "self-optimizing" controlled variables c.
1. "Brute force" approach. Given a set of controlled variables CV1 = c = H1 y, one computes the cost J(c, d) when we keep c constant (c = cs + H1 ny) for various disturbances (d) and measurement errors (ny). In practice, this is done by running a large number of steady-state simulations to try to cover the expected future operation.
2. "Local" approaches based on a quadratic approximation of the cost J. Linear models are used for the effect of u and d on y:

y = Gy u + Gdy d

This is discussed in more detail in Alstad et al. (2009) and references therein. The main local approaches are:
2A. Maximum gain rule: maximize the minimum singular value of G = H1 Gy.
In other words, the maximum gain rule, which essentially is a quantitative version of Requirements 1, 3 and 4 given above, says that one should control "sensitive" variables, with a large scaled gain G from the inputs (u) to c = H1 y. This rule is good for pre-screening and also yields good insight.
2B. Nullspace method. This method yields optimal measurement combinations for the case with no noise, ny = 0. One must first obtain the optimal measurement sensitivity matrix F, defined as

F = dyopt/dd

Each column in F expresses the optimal change in the y's when the independent variable (u) is adjusted so that the system remains optimal with respect to the disturbance d. Usually, it is simplest to obtain F numerically by optimizing the model. Alternatively, we can obtain F from a quadratic approximation of the cost function:

F = Gdy - Gy Juu^(-1) Jud

Then, assuming that we have at least as many (independent) measurements y as the sum of the number of (independent) inputs (u) and disturbances (d), the optimum is to select c = H1 y such that

H1 F = 0

Note that H1 is a nonsquare matrix, so H1 F = 0 does not require that H1 = 0 (which is a trivial, uninteresting solution), but rather that H1 is in the nullspace of F^T.
2C. Exact local method (loss method). This extends the nullspace method to include noise (ny) and allows for any number of measurements. The noise and disturbances are normalized by introducing weighting matrices Wny and Wd (which have the expected magnitudes along the diagonal), and then the expected loss, L = J - Jopt(d), is minimized by selecting H1 to solve the following problem:

min_H1 ||M(H1)||2

where ||·||2 here denotes the Frobenius norm and

M(H1) = Juu^(1/2) (H1 Gy)^(-1) H1 Y,  Y = [F Wd, Wny]

Note here that the optimal choice with Wny = 0 (no noise) is to choose H1 such that H1 F = 0, which is the nullspace method. For the general case, when H1 is a "full" matrix, this is a convex problem and the optimal solution is H1^T = (Y Y^T)^(-1) Gy Q, where Q is any nonsingular matrix.

Regulatory Control Layer

The main purpose of the regulatory layer is to "stabilize" the plant, preferably using a simple control structure (e.g., single-loop PID controllers) which does not require changes during operation. "Stabilize" is here used in a more extended sense to mean that the process does not "drift" too far away from acceptable operation when there are disturbances. The regulatory layer should make it possible to use a "slow" supervisory control layer that does not require a detailed model of the high-frequency dynamics. Therefore, in addition to tracking the setpoints given by the supervisory layer (e.g., MPC), the regulatory layer may directly control primary variables (CV1) that require fast and tight control, like economically important active constraints.
In general, the design of the regulatory layer involves the following structural decisions:
1. Selection of controlled outputs y2 (among all candidate measurements ym).
2. Selection of inputs MV2 = u2 (a subset of all available inputs u) to control the outputs y2.
3. Pairing of inputs u2 and outputs y2 (since decentralized control is normally used).
Decisions 1 and 2 combined (IO selection) are equivalent to selecting H2 (Fig. 3). Note that we do not "use up" any degrees of freedom in the regulatory layer because the set points (y2s)
become manipulated variables (MV1) for the supervisory layer (see Fig. 3). Furthermore, since the set points are set by the supervisory layer in a cascade manner, the system eventually approaches the same steady state (as defined by the choice of economic variables CV1) regardless of the choice of controlled variables in the regulatory layer.
The inputs for the regulatory layer (u2) are selected as a subset of all the available inputs (u). For stability reasons, one should avoid input saturation in the regulatory layer. In particular, one should avoid using inputs (in the set u2) that are optimally constrained in some disturbance region. Otherwise, in order to avoid input saturation, one needs to include a backoff for the input when entering this operational region, and doing so will have an economic penalty.
In the regulatory layer, the outputs (y2) are usually selected as individual measurements, and they are often not important variables in themselves. Rather, they are "extra outputs" that are controlled in order to "stabilize" the system, and their setpoints (y2s) are changed by the layer above to obtain economically optimal operation. For example, in a distillation column one may control a temperature somewhere in the middle of the column (y2 = T) in order to "stabilize" the column profile. Its setpoint (y2s = Ts) is adjusted by the supervisory layer to obtain the desired product composition (y1 = c).

Input-Output (IO) Selection for Regulatory Control (u2, y2)

Finding the truly optimal control structure, including selecting inputs and outputs for regulatory control, requires finding also the optimal controller parameters. This is an extremely difficult mathematical problem, at least if the controller K is decomposed into smaller controllers. In this section, we consider some approaches which do not require that the controller parameters be found. This is done by making assumptions related to achievable control performance (controllability) or perfect control.
Before we look at the approaches, note again that the IO-selection for regulatory control may be combined into a single decision, by considering the selection of

CV2 = [y2; u1] = H2 y

Here u1 denotes the inputs that are not used by the regulatory control layer. This follows because we want to use all inputs u for control, so assuming that the set u is given, "selection of inputs u2" (decision 2) is by elimination equivalent to "selection of inputs u1." Note that CV2 includes all variables that we keep at desired (constant) values within the fast time horizon of the regulatory control layer, including the "unused" inputs u1.

Survey by Van de Wal and Jager
Van de Wal and Jager provide an overview of methods for input-output selection, some of which include:
1. "Accessibility," based on guaranteeing a cause–effect relationship between the selected inputs (u2) and outputs (y2). Use of such measures may eliminate unworkable control structures.
2. "State controllability and state observability," to ensure that any unstable modes can be stabilized using the selected inputs and outputs.
3. "Input-output controllability" analysis, to ensure that y2 can be acceptably controlled using u2. This is based on scaling the system and then analysing the transfer matrices G2(s) (from u2 to y2) and Gd2 (from expected disturbances d to y2). Some important controllability measures are right half plane zeros (unstable dynamics of the inverse), condition number, singular values, relative gain array, etc. One problem here is that there are many different measures, and it is not clear which should be given most emphasis.
4. "Achievable robust performance." This may be viewed as a more detailed version of input-output controllability, where several relevant issues are combined into a single measure. However, this requires that the control problem can actually be formulated clearly, which may be very difficult, as already mentioned.
In addition, it requires finding the optimal robust controller for the given problem, which may be very difficult.
Most of these methods are useful for analyzing a given structure (u2, y2) but less suitable for selection. Also, the list of methods is incomplete, as disturbance rejection, which is probably the most important issue for the regulatory layer, is hardly considered.

A Systematic Approach for IO-Selection Based on Minimizing State Drift Caused by Disturbances
The objectives of the regulatory control layer are many, and Yelchuru and Skogestad (2013) list 13 partly conflicting objectives. To have a truly systematic approach to regulatory control design, including IO-selection, we would need to quantify all these partially conflicting objectives in terms of a scalar cost function J2. We here consider a fairly general cost function,

J2 = ||W x||

which may be interpreted as the weighted state drift. One justification for considering the state drift is that the regulatory layer should ensure that the system, as measured by the weighted states Wx, does not drift too far away from the desired state, and thus stays in the "linear region" when there are disturbances. Note that the cost J2 is used to select controlled variables (CV2) and not to design the controller (for which the cost may be the control error, J2' = ||CV2 - CV2s||).
Within this framework, the IO-selection problem for the regulatory layer is then to select the nonsquare matrix H2, where

CV2 = H2 y

and y = [ym; u], such that the cost J2 is minimized. The cause of changes in J2 is the disturbances d, and we consider the linear model (in deviation variables)

y = Gy u + Gdy d
x = Gx u + Gxd d

where the G-matrices are transfer matrices. Here, Gxd gives the effect of the disturbances on the states with no control, and the idea is to reduce the disturbance effect by closing the regulatory control loops. Within the "slow" time scale of the supervisory layer, we can assume that CV2 is perfectly controlled and thus constant, or CV2 = 0 in terms of deviation variables. This gives

CV2 = H2 Gy u + H2 Gdy d = 0

and solving with respect to u gives

u = -(H2 Gy)^(-1) H2 Gdy d

and we have

x = Pxd(H2) d

where

Pxd(H2) = Gxd - Gx (H2 Gy)^(-1) H2 Gdy

is the disturbance effect for the "partially" controlled system with only the regulatory loops closed. Note that it is not generally possible to make Pxd = 0, because we have more states than we have available inputs. To have a small "state drift," we want J2 = ||W Pxd d|| to be small, and to have a simple regulatory control system we want to close as few regulatory loops as possible. Assume that we have normalized the disturbances so that the norm of d is 1; then we can solve the following problem:

For 0, 1, 2, ... loops closed, solve min_H2 ||M2(H2)||, where M2 = W Pxd and dim(u2) = dim(y2) = number of loops closed.

By comparing the value of ||M2(H2)|| with different numbers of loops closed (i.e., with different H2), we can then decide on an appropriate regulatory layer structure. For example, assume that we find that the value of J2 is 110 (0 loops closed), 0.2 (1 loop), and 0.02 (2 loops), and assume we have scaled the disturbances and states such that a J2-value less than about 1 is acceptable; then closing 1 regulatory loop is probably the best choice.
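This loop-counting comparison can be sketched numerically. The steady-state gain matrices below are made up (2 inputs, 3 measurements, 4 states, 1 disturbance); H2 always selects dim(u) = 2 rows of y = [ym; u], and an "unused" input corresponds to selecting that input's own row of y:

```python
import numpy as np

# Made-up steady-state gains: 2 inputs, 3 measurements, 4 states, 1 disturbance.
# y = [ym; u], so the last two rows of Gy form an identity (inputs "measure" themselves).
Gy  = np.array([[2.0, 0.2],
                [0.5, 1.5],
                [1.0, 1.0],
                [1.0, 0.0],
                [0.0, 1.0]])
Gdy = np.array([[3.0], [1.0], [2.0], [0.0], [0.0]])   # disturbance effect on y
Gx  = np.array([[1.0, 0.5], [0.2, 1.2], [0.8, 0.3], [0.1, 0.9]])
Gxd = np.array([[2.0], [1.5], [1.0], [0.5]])          # disturbance effect on states
W   = np.eye(4)                                       # state-drift weight

def drift(rows):
    """||W Pxd|| when CV2 = H2 y selects the given dim(u) = 2 rows of y."""
    H2 = np.zeros((2, 5))
    H2[range(2), rows] = 1.0
    Pxd = Gxd - Gx @ np.linalg.solve(H2 @ Gy, H2 @ Gdy)
    return np.linalg.norm(W @ Pxd)

# 0 loops closed: CV2 = u (both inputs held constant), so Pxd = Gxd
# 1 loop closed:  control ym[0] with one input, keep the other input constant
# 2 loops closed: control ym[0] and ym[1]
for rows in ([3, 4], [0, 4], [0, 1]):
    print(rows, drift(rows))
```

With these numbers the drift norm decreases as loops are closed; in practice one stops adding loops once the (properly scaled) value is acceptable.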
In principle, this is straightforward, but there are three remaining issues: (1) we need to choose an appropriate norm, (2) we should include measurement noise to avoid selecting insensitive measurements, and (3) the problem must be solvable numerically.

Issue 1. The norm of M2 should be evaluated in the frequency range between the "slow" bandwidth of the supervisory control layer (ω_B1) and the "fast" bandwidth of the regulatory control layer (ω_B2). However, since it is likely that the system sometimes operates without the supervisory layer, it is reasonable to evaluate the norm of Pxd in the frequency range from 0 (steady state) to ω_B2. Since we want H2 to be a constant (not frequency-dependent) matrix, it is reasonable to choose H2 to minimize the norm of M2 at the frequency where ||M2|| is expected to have its peak. For some mechanical systems, this may be at some resonance frequency, but for process control applications it is usually at steady state (ω = 0); that is, we can use the steady-state gain matrices when computing Pxd. In terms of the norm, we use the 2-norm (Frobenius norm), mainly because it has good numerical properties, and also because it has the interpretation of giving the expected variance in x for normally distributed disturbances.

Issues 2 and 3. If we also include measurement noise n^y, which we should, then the expected value of J2 is minimized by solving the problem min_{H2} ||M2(H2)||_F, where (Yelchuru and Skogestad 2013)

M2(H2) = J2uu^{1/2} (H2 G^y)^{−1} H2 Y2

Y2 = [F2 Wd   Wn^y],   F2 = ∂y^opt/∂d = −G^y J2uu^{−1} J2ud + G^y_d

where J2uu = ∂²J2/∂u² = 2 G^{xT} W^T W G^x and J2ud = ∂²J2/∂u∂d = 2 G^{xT} W^T W G^x_d.

Note that this is the same mathematical problem as the "exact local method" presented for selecting CV1 = H1 y to minimize the economic cost J, but because of the specific simple form for the cost J2, it is possible to obtain analytical formulas for the optimal sensitivity, F2. Again, Wd and Wn^y are diagonal matrices, expressing the expected magnitudes of the disturbances (d) and the noise (for y).

For the case when H2 is a "full" matrix, this can be reformulated as a convex optimization problem, and an explicit solution is

H2^T = (Y2 Y2^T)^{−1} G^y (G^{yT} (Y2 Y2^T)^{−1} G^y)^{−1} J2uu^{1/2}

and from this we can find the optimal value of J2. It may seem restrictive to assume that H2 is a "full" matrix, because we usually want to control individual measurements, and then H2 should be a selection matrix, with 1's and 0's. Fortunately, since we in this case want to control as many measurements (y2) as inputs (u2), H2 is square in the selected set, and the optimal value of J2 when H2 is a selection matrix is the same as when H2 is a full matrix. The reason for this is that specifying (controlling) any linear combination of y2 uniquely determines the individual y2's, since dim(u2) = dim(y2). Thus, we can find the optimal selection matrix H2 by searching through all the candidate square sets of y. This can be solved effectively using the branch and bound approach of Kariwala and Cao (2010), or alternatively it can be solved as a mixed-integer problem with a quadratic program (QP) at each node (Yelchuru and Skogestad 2012). The approach of Yelchuru and Skogestad can also be applied to the case where we allow for disjunct sets of measurement combinations, which may give a lower J2 in some cases.

Comments on the state drift approach:

1. We have assumed that we perfectly control y2 using u2, at least within the bandwidth of the regulatory control system. Once one has found a candidate control structure (H2), one should check that it is possible to achieve acceptable control. This may be done by analyzing the input-output controllability of the system y2 = G2 u2 + G2d d, based on the transfer matrices G2 = H2 G^y and G2d = H2 G^y_d. If the controllability of this system is not acceptable, then
one should consider the second-best matrix H2 (with the second-best value of the state drift J2), and so on.

2. The state drift cost J2 = ||Wx|| is in principle independent of the economic cost (J). This is an advantage because we know that the economically optimal operation (e.g., the active constraints) may change, whereas we would like the regulatory layer to remain unchanged. However, it is also a disadvantage, because the regulatory layer determines the initial response to disturbances, and we would like this initial response to be in the right direction economically, so that the required correction from the slower supervisory layer is as small as possible. Actually, this issue can be included by extending the state vector x to include also the economic controlled variables, CV1, which are selected based on the economic cost J. The weight matrix W may then be used to adjust the relative weights of avoiding drift in the internal states x and the economic controlled variables CV1.

3. The above steady-state approach does not consider input-output pairing, for which dynamics are usually the main issue. The main pairing rule is to "pair close" in order to minimize the effective time delay between the selected input and output. For a more detailed approach, decentralized input-output controllability must be considered.

Summary and Future Directions

Control structure design involves the structural decisions that must be made before designing the actual controller, and it is in most cases a much more important step than the controller design. In spite of this, the theoretical tools for making the structural decisions are much less developed than those for controller design. This chapter summarizes some approaches, and it is expected, or at least hoped, that this important area will further develop in the years to come.

The most important structural decisions are usually related to selecting the economic controlled variables, CV1 = H1 y, and the stabilizing controlled variables, CV2 = H2 y. However, control engineers have traditionally not used the degrees of freedom in the matrices H1 and H2, and this chapter has summarized some approaches.

There has been a belief that the use of "advanced control," e.g., MPC, makes control structure design less important. However, this is not correct, because also for MPC one must choose inputs (MV1 = CV2s) and outputs (CV1). The selection of CV1 may to some extent be avoided by the use of "Dynamic Real-Time Optimization (DRTO)" or "Economic MPC," but these optimizing controllers usually operate on a slower time scale by sending setpoints to the basic control layer (MV1 = CV2s), which means that selecting the variables CV2 is critical for achieving (close to) optimality on the fast time scale.

Cross-References

- Control Hierarchy of Large Processing Plants: An Overview
- Industrial MPC of Continuous Processes
- PID Control

Bibliography

Alstad V, Skogestad S (2007) Null space method for selecting optimal measurement combinations as controlled variables. Ind Eng Chem Res 46(3):846–853
Alstad V, Skogestad S, Hori ES (2009) Optimal measurement combinations as controlled variables. J Process Control 19:138–148
Downs JJ, Skogestad S (2011) An industrial and academic perspective on plantwide control. Ann Rev Control 35:99–110
Engell S (2007) Feedback control for optimal process operation. J Process Control 17:203–219
Foss AS (1973) Critique of chemical process control theory. AIChE J 19(2):209–214
Kariwala V, Cao Y (2010) Bidirectional branch and bound for controlled variable selection. Part III. Local average loss minimization. IEEE Trans Ind Inform 6:54–61
Kookos IK, Perkins JD (2002) An algorithmic method for the selection of multivariable process control structures. J Process Control 12:85–99
Morari M, Arkun Y, Stephanopoulos G (1980) Studies in the synthesis of control structures for chemical processes. Part I. AIChE J 26:220–232
Narraway LT, Perkins JD (1993) Selection of control structure based on economics. Comput Chem Eng 18:S511–S515
Skogestad S (2000) Plantwide control: the search for the self-optimizing control structure. J Process Control 10:487–507
Skogestad S (2004) Control structure design for complete chemical plants. Comput Chem Eng 28(1–2):219–234
Skogestad S (2012) Economic plantwide control, chapter 11. In: Rangaiah GP, Kariwala V (eds) Plantwide control. Recent developments and applications. Wiley, Chichester, pp 229–251. ISBN:978-0-470-98014-9
Skogestad S, Postlethwaite I (2005) Multivariable feedback control, 2nd edn. Wiley, Chichester
van de Wal M, de Jager B (2001) Review of methods for input/output selection. Automatica 37:487–510
Yelchuru R, Skogestad S (2012) Convex formulations for optimal selection of controlled variables and measurements using mixed integer quadratic programming. J Process Control 22:995–1007
Yelchuru R, Skogestad S (2013) Quantitative methods for regulatory layer selection. J Process Control 23:58–69

Controllability and Observability

H.L. Trentelman
Johann Bernoulli Institute for Mathematics and Computer Science, University of Groningen, Groningen, AV, The Netherlands

Abstract

State controllability and observability are key properties in linear input–output systems in state-space form. In the state-space approach, the relation between inputs and outputs is represented using the state variables of the system. A natural question is then to what extent it is possible to manipulate the values of the state vector by means of an appropriate choice of the input function. The concepts of controllability, reachability, and null controllability address this issue. Another important question is whether it is possible to uniquely determine the values of the state vector from knowledge of the input and output signals over a given time interval. This question is dealt with using the concept of observability.

Keywords

Controllability; Duality; Indistinguishability; Input–output systems in state-space form; Observability; Reachability

Introduction

In the state-space approach to input–output systems, the relation between input signals and output signals is represented by means of two equations. In the continuous-time case, the first of these equations is a first-order vector differential equation driven by the input signal and is often called the state equation. The second equation is an algebraic equation, often called the output equation. The unknown in the differential equation is called the state vector of the system. Given a particular input signal and an initial value of the state vector, the state equation generates a unique solution, called the state trajectory of the system. The output equation determines the corresponding output signal as a function of this state trajectory and the input signal. Thus, in the state-space approach, the input–output behavior of the system is obtained using the state vector as an intermediate variable.

In the context of input–output systems in state-space form, the properties of controllability and observability characterize the interaction between the input, the state, and the output. In particular, controllability describes the ability to manipulate the state vector of the system by applying appropriate input signals. Observability describes the ability to determine the values of the state vector from knowledge of the input and output over a certain time interval. The properties of controllability and observability are fundamental properties that play a major role in the analysis and control of linear input–output systems in state-space form.
Systems with Inputs and Outputs

Consider a continuous-time, linear, time-invariant, input–output system in state-space form represented by the equations

ẋ(t) = A x(t) + B u(t),
y(t) = C x(t) + D u(t).    (1)

This system is referred to as Σ. In Eq. (1), A, B, C, and D are maps (or matrices), and the functions x, u, and y are considered to be defined on the real axis R or on any subinterval of it. In particular, one often assumes the domain of definition to be the nonnegative part of R, which is without loss of generality since the system is time-invariant. The function u is called the input, and its values are assumed to be given. The class of admissible input functions is denoted by U. Often, U is the class of piecewise continuous or locally integrable functions, but for most purposes, the exact class from which the input functions are chosen is not important. We assume that input functions take values in an m-dimensional space U, which we often identify with R^m. The first equation of Σ is an ordinary differential equation for the variable x. For a given initial value of x and input function u, the function x is completely determined by this equation. The variable x is called the state variable, and it is assumed to take values in an n-dimensional space X. The space X is called the state space. It is usually identified with R^n. Finally, y is called the output of the system and takes values in a p-dimensional space Y, which we identify with R^p. Since the system Σ is completely determined by the maps (or matrices) A, B, C, and D, we identify Σ with the quadruple (A, B, C, D).

The solution of the differential equation of Σ with initial value x(0) = x0 is denoted as x_u(t, x0). It can be given explicitly using the variation-of-constants formula, namely,

x_u(t, x0) = e^{At} x0 + ∫_0^t e^{A(t−τ)} B u(τ) dτ.    (2)

The corresponding value of y is denoted by y_u(t, x0). As a consequence of (2), we have

y_u(t, x0) = C e^{At} x0 + ∫_0^t K(t−τ) u(τ) dτ + D u(t),    (3)

where K(t) := C e^{At} B. In the case D = 0, it is customary to call K(t) the impulse response. In the general case, one would call the distribution K(t) + D δ(t) the impulse response.

Controllability

Controllability is concerned with the ability to manipulate the state by choosing an appropriate input signal, thus steering the current state to a desired future state in a given finite time. Thus, in particular, in the differential equation in (1), we study the relation between u and x. We investigate to what extent one can influence the state x by a suitable choice of the input u.

For this purpose, we introduce the (at time T) reachable space W_T, defined as the space of points x1 for which there exists an input u such that x_u(T, 0) = x1, i.e., the set of points that can be reached from the origin at time T. It follows from the linearity of the differential equation that W_T is a linear subspace of X. In fact, (2) implies

W_T = { ∫_0^T e^{A(T−τ)} B u(τ) dτ | u ∈ U }.    (4)

We call system Σ reachable at time T if every point can be reached from the origin, i.e., if W_T = X. It follows from (2) that if the system is reachable at time T, then every point can be reached from every point at time T, because the condition for the point x1 to be reachable from x0 at time T is

x1 − e^{AT} x0 ∈ W_T.

The property that every point is reachable from any point in a given time interval [0, T] is called controllability (at T). Finally, we have the concept of null controllability, i.e., the possibility of reaching the origin from an arbitrary initial point. According to (2), for a point x0 to be null controllable at T, we must have
e^{AT} x0 + ∫_0^T e^{A(T−τ)} B u(τ) dτ = 0

for some u ∈ U. We observe that x0 is null controllable at T (by the control u) if and only if −e^{AT} x0 is reachable at T (by the control u). Since e^{AT} is invertible, we see that Σ is null controllable at T if and only if Σ is reachable at T. Henceforth, we refer to the equivalent properties reachability, controllability, and null controllability simply as controllability (at T). It should be remarked that the equivalence of these concepts does not hold in other situations, e.g., for discrete-time systems. We intend to obtain an explicit expression for the space W_T and, based on this, an explicit condition for controllability. This is provided by the following result.

Theorem 1 Let η be an n-dimensional row vector and T > 0. Then the following statements are equivalent:
1. η ⊥ W_T (i.e., ηx = 0 for all x ∈ W_T).
2. η e^{tA} B = 0 for 0 ≤ t ≤ T.
3. η A^k B = 0 for k = 0, 1, 2, ....
4. η (B  AB  ···  A^{n−1}B) = 0.

Proof (i) ⇔ (ii): If η ⊥ W_T, then by Eq. (4),

∫_0^T η e^{A(T−τ)} B u(τ) dτ = 0    (5)

for every u ∈ U. Choosing u(t) = B^T e^{A^T(T−t)} η^T for 0 ≤ t ≤ T yields

∫_0^T ‖η e^{A(T−τ)} B‖² dτ = 0,

from which (ii) follows. Conversely, assume that (ii) holds. Then (5) holds, and hence (i) follows.
(ii) ⇔ (iii): This is obtained by power series expansion of e^{At} = Σ_{k=0}^∞ (t^k/k!) A^k.
(iii) ⇒ (iv): This follows immediately from the evaluation of the vector-matrix product.
(iv) ⇒ (iii): This implication is based on the Cayley-Hamilton Theorem. According to this theorem, A^n is a linear combination of I, A, ..., A^{n−1}. By induction, it follows that A^k (k > n) is a linear combination of I, A, ..., A^{n−1} as well. Therefore, η A^k B = 0 for k = 0, 1, ..., n−1 implies that η A^k B = 0 for all k ∈ N. □

As an immediate consequence of the previous theorem, we find that the at time T reachable subspace W_T can be expressed in terms of the maps A and B as follows.

Corollary 1

W_T = im (B  AB  ···  A^{n−1}B).

This implies that, in fact, W_T is independent of T, for T > 0. Because of this, we often use W instead of W_T and call this subspace the reachable subspace of Σ. This subspace of the state space has the following geometric characterization in terms of the maps A and B.

Corollary 2 W is the smallest A-invariant subspace containing B := im B. Explicitly, W is A-invariant, B ⊆ W, and any A-invariant subspace L satisfying B ⊆ L also satisfies W ⊆ L.

We denote the smallest A-invariant subspace containing B by ⟨A|B⟩, so that we can write W = ⟨A|B⟩. For the space ⟨A|B⟩, we have the following explicit formula:

⟨A|B⟩ = B + AB + ··· + A^{n−1}B.

Corollary 3 The following statements are equivalent:
1. There exists T > 0 such that system Σ is controllable at T.
2. ⟨A|B⟩ = X.
3. rank (B  AB  ···  A^{n−1}B) = n.
4. The system Σ is controllable at T for all T > 0.

We say that the matrix pair (A, B) is controllable if one of these equivalent conditions is satisfied.

Example 1 Let A and B be defined by

A := [ −2  6 ; −2  5 ],   B := [ 3 ; 2 ].

Then (B AB) = [ 3  6 ; 2  4 ], rank(B AB) = 1, and consequently, (A, B) is not controllable. The reachable subspace is the span of (B AB), i.e., the
line given by the equation −2x1 + 3x2 = 0. This can also be seen as follows. Let z := −2x1 + 3x2; then ż = z. Hence, if z(0) = 0, which is the case if x(0) = 0, we must have z(t) = 0 for all t ≥ 0.

Observability

In this section, we include the second of equations (1), y = Cx + Du, in our considerations. Specifically, we investigate to what extent it is possible to reconstruct the state x if the input u and the output y are known. The motivation is that we often can measure the output and prescribe (and hence know) the input, whereas the state variable is hidden.

Definition 2 Two states x0 and x1 in X are called indistinguishable on the interval [0, T] if for any input u we have y_u(t, x0) = y_u(t, x1) for all 0 ≤ t ≤ T.

Hence, x0 and x1 are indistinguishable if they give rise to the same output values for every input u. According to (3), for x0 and x1 to be indistinguishable on [0, T], we must have that

C e^{At} x0 + ∫_0^t K(t−τ) u(τ) dτ + D u(t) = C e^{At} x1 + ∫_0^t K(t−τ) u(τ) dτ + D u(t)

for 0 ≤ t ≤ T and for any input signal u. We note that the input signal does not affect distinguishability, i.e., if one u is able to distinguish between two states, then any input is. In fact, x0 and x1 are indistinguishable if and only if C e^{At} x0 = C e^{At} x1 (0 ≤ t ≤ T). Obviously, x0 and x1 are indistinguishable if and only if v := x0 − x1 and 0 are indistinguishable. By applying Theorem 1 with η = v^T nonzero and transposing the equations, it follows that C e^{At} x0 = C e^{At} x1 (0 ≤ t ≤ T) if and only if C e^{At} v = 0 (0 ≤ t ≤ T) and hence if and only if C A^k v = 0 (k = 0, 1, 2, ...). The Cayley-Hamilton Theorem implies that we need to consider the first n terms only, i.e.,

[ C ; CA ; CA² ; ⋮ ; CA^{n−1} ] v = 0.    (6)

As a consequence, the distinguishability of two vectors does not depend on T. The space of vectors v for which (6) holds is denoted ⟨ker C|A⟩ and called the unobservable subspace. It is equivalently characterized as the intersection of the spaces ker CA^k for k = 0, ..., n−1, i.e.,

⟨ker C|A⟩ = ∩_{k=0}^{n−1} ker CA^k.

Equivalently, ⟨ker C|A⟩ is the largest A-invariant subspace contained in ker C. Finally, another characterization is "v ∈ ⟨ker C|A⟩ if and only if y_0(t, v) is identically zero," where the subscript "0" refers to the zero input.

Definition 3 System Σ is called observable if any two distinct states are not indistinguishable.

The previous considerations immediately lead to the following result.

Theorem 2 The following statements are equivalent:
1. The system Σ is observable.
2. Every nonzero state is not indistinguishable from the origin.
3. ⟨ker C|A⟩ = 0.
4. C e^{At} v = 0 (0 ≤ t ≤ T) ⇒ v = 0.
5. rank [ C ; CA ; CA² ; ⋮ ; CA^{n−1} ] = n.

Since observability is completely determined by the matrix pair (C, A), we will say "(C, A) is observable" instead of "system Σ is observable."

There is a remarkable relation between the controllability and observability properties, which is referred to as duality. This property is most conspicuous from the conditions (3) in Corollary 3 and (5) in Theorem 2, respectively.
Specifically, (C, A) is observable if and only if (A^T, C^T) is controllable. As a consequence of duality, many theorems on controllability can be translated into theorems on observability, and vice versa, by mere transposition of matrices.

Example 2 Let

A := [ 11  −3 ; 3  5 ],   B := [ 1 ; −1 ],   C := (1  −1).

Then

rank [ C ; CA ] = rank [ 1  −1 ; 8  −8 ] = 1;

hence, (C, A) is not observable. Notice that if v ∈ ⟨ker C|A⟩ and u = 0, identically, then y = 0, identically. In this example, ⟨ker C|A⟩ is the span of (1, 1)^T.

Summary and Future Directions

The property of controllability can be tested by means of a rank test on a matrix involving the maps A and B appearing in the state equation of the system. Alternatively, controllability is equivalent to the property that the reachable subspace of the system is equal to the state space. The property of observability allows a rank test on a matrix involving the maps A and C appearing in the system equations. An alternative characterization of this property is that the unobservable subspace of the system is equal to the zero subspace. Concepts of controllability and observability have also been defined for discrete-time systems and, more generally, for time-varying systems and for continuous-time and discrete-time nonlinear systems.

Cross-References

- Linear Systems: Continuous-Time, Time-Invariant State Variable Descriptions
- Linear Systems: Continuous-Time, Time-Varying State Variable Descriptions
- Linear Systems: Discrete-Time, Time-Invariant State Variable Descriptions
- Linear Systems: Discrete-Time, Time-Varying, State Variable Descriptions
- Realizations in Linear Systems Theory

Recommended Reading

The description of linear systems in terms of a state-space representation was particularly stressed by R. E. Kalman in the early 1960s (see Kalman 1960a, b, 1963; Kalman et al. 1963). See also Zadeh and Desoer (1963) and Gilbert (1963). In particular, Kalman introduced the concepts of controllability and observability and gave the conditions expressed in Corollary 3, item (3), and Theorem 2, item (5). Alternative conditions for controllability and observability have been introduced in Hautus (1969) and independently by a number of authors; see Popov (1966) and Popov (1973). Other references are Belevitch (1968) and Rosenbrock (1970).

Bibliography

Antsaklis PJ, Michel AN (2007) A linear systems primer. Birkhäuser, Boston
Belevitch V (1968) Classical network theory. Holden-Day, San Francisco
Gilbert EG (1963) Controllability and observability in multivariable control systems. J Soc Ind Appl Math A 2:128–151
Hautus MLJ (1969) Controllability and observability conditions of linear autonomous systems. Proc Nederl Akad Wetensch A 72(5):443–448
Kalman RE (1960a) Contributions to the theory of optimal control. Bol Soc Mat Mex 2:102–119
Kalman RE (1960b) On the general theory of control systems. In: Proceedings of the first IFAC congress. Butterworth, London, pp 481–491
Kalman RE (1963) Mathematical description of linear dynamical systems. J Soc Ind Appl Math A 1:152–192
Kalman RE, Ho YC, Narendra KS (1963) Controllability of linear dynamical systems. Contrib Diff Equ 1:189–213
Popov VM (1966) Hiperstabilitatea sistemelor automate. Editura Academiei Republicii Socialiste România (in Romanian)
Popov VM (1973) Hyperstability of control systems. Springer, Berlin (translation of the previous reference)
Rosenbrock HH (1970) State-space and multivariable theory. Wiley, New York
Trentelman HL, Stoorvogel AA, Hautus MLJ (2001) Control theory for linear systems. Springer, London
Wonham WM (1979) Linear multivariable control: a geometric approach. Springer, New York
Zadeh LA, Desoer CA (1963) Linear systems theory – the state-space approach. McGraw-Hill, New York

Controller Performance Monitoring

Sirish L. Shah
Department of Chemical and Materials Engineering, University of Alberta, Edmonton, AB, Canada

Abstract

Process control performance is a cornerstone of operational excellence in a broad spectrum of industries such as refining, petrochemicals, pulp and paper, mineral processing, power, and waste water treatment. Control performance assessment and monitoring applications have become mainstream in these industries and are changing the maintenance methodology surrounding control assets from predictive to condition based. The large numbers of these assets on most sites, compared to the number of maintenance and control personnel, have made monitoring and diagnosing control problems challenging. For this reason, automated controller performance monitoring technologies have been readily embraced by these industries.

This entry discusses the theory as well as the practical application of controller performance monitoring tools as a requisite for monitoring and maintaining basic as well as advanced process control (APC) assets in the process industry. The section begins with an introduction to the theory of performance assessment as a technique for assessing the performance of the basic control loops in a plant. Performance assessment allows detection of performance degradation in the basic control loops in a plant by monitoring the variance in the process variable and comparing it to that of a minimum variance controller. Other metrics of controller performance are also reviewed. The resulting indices of performance give an indication of the level of performance of the controller and of the action required to improve it; the diagnosis of poor performance may lead one to look at remediation alternatives such as retuning controller parameters, reengineering the process to reduce delays, or implementing feed-forward control, or it may attribute poor performance to faulty actuators or other process nonlinearities.

Keywords

Time series analysis; Minimum variance control; Control loop performance assessment; Performance monitoring; Fault detection and diagnosis

Introduction

A typical industrial process, as in a petroleum refinery or a petrochemical complex, includes thousands of control loops. Instrumentation technicians and engineers maintain and service these loops, but rather infrequently. However, industrial studies have shown that as many as 60 % of control loops may have poor tuning or configuration or actuator problems and may therefore be responsible for suboptimal process performance. As a result, monitoring of such control strategies to detect and diagnose cause(s) of unsatisfactory performance has received increasing attention from industrial engineers. Specifically, the methodology of data-based controller performance monitoring (CPM) is able to answer questions such as the following: Is the controller doing its job satisfactorily, and if not, what is the cause of poor performance?

The performance of process control assets is monitored on a daily basis and compared with industry benchmarks. The monitoring system also provides diagnostic guidance for poorly performing control assets. Many industrial sites have
established reporting and remediation workflows to ensure that improvement activities are carried out in an expedient manner. Plant-wide performance metrics can provide insight into company-wide process control performance. Closed-loop tuning and modeling tools can also be deployed to aid with the improvement activities. Survey articles by Thornhill and Horch (2007) and Shardt et al. (2012) provide a good overview of the overall state of CPM and the related diagnosis issues. CPM software is now readily available from most DCS vendors and has already been implemented successfully at many large-scale industrial sites throughout the world.

Univariate Control Loop Performance Assessment with Minimum Variance Control as Benchmark

It has been shown by Harris (1989) that, for a system with time delay d, a portion of the output variance is feedback control invariant and can be estimated from routine operating data. This is the so-called minimum variance output. Consider the closed-loop system shown in Fig. 1, where Q is the controller transfer function, T̃ is the process transfer function, d is the process delay (in terms of sample periods), and N is the disturbance transfer function driven by the random white-noise sequence a_t.

(Controller Performance Monitoring, Fig. 1: Block diagram of a regulatory control loop)

In the regulatory mode (when the set point is constant), the closed-loop transfer function relating the process output and the disturbance is given by the closed-loop response

y_t = [ N / (1 + q^{−d} T̃ Q) ] a_t

Note that all transfer functions are expressed, for the discrete-time case, in terms of the backshift operator q^{−1}; N represents the disturbance transfer function with numerator and denominator polynomials in q^{−1}. The division of the numerator by the denominator can be rewritten as N = F + q^{−d} R, where the quotient term F = F0 + F1 q^{−1} + ··· + F_{d−1} q^{−(d−1)} is a polynomial of order (d − 1) and the remainder R is a transfer function. The closed-loop transfer function can be reexpressed, after algebraic manipulation, as

y_t = [ N / (1 + q^{−d} T̃ Q) ] a_t
    = [ (F + q^{−d} R) / (1 + q^{−d} T̃ Q) ] a_t
    = [ ( F (1 + q^{−d} T̃ Q) + q^{−d} (R − F T̃ Q) ) / (1 + q^{−d} T̃ Q) ] a_t
    = [ F + q^{−d} (R − F T̃ Q) / (1 + q^{−d} T̃ Q) ] a_t
    = F0 a_t + F1 a_{t−1} + ··· + F_{d−1} a_{t−d+1}    (= e_t)
      + L0 a_{t−d} + L1 a_{t−d−1} + ···    (= w_{t−d})

The closed-loop output can then be expressed as

y_t = e_t + w_{t−d}

where e_t = F a_t corresponds to the first d − 1 lags of the closed-loop expression for the output, y_t, and, more importantly, is independent of the controller Q, i.e., it is controller invariant, while w_{t−d} is dependent on the controller. The variance of the output is then given by

Var(y_t) = Var(e_t) + Var(w_{t−d}) ≥ Var(e_t)

Since e_t is controller invariant, it provides the lower bound on the output variance. This is naturally achieved if w_{t−d} = 0, that is, when R = F T̃ Q, or when the controller is a minimum variance controller with Q = R / (F T̃). If the total output variance is denoted as Var(y_t) = σ_y², then
the lowest achievable variance is Var .et / D mv


2
To obtain an estimate of the lowest achievable variance from the time series of the process output, one needs to model the closed-loop output data y_t by a moving average process such as

y_t = f_0 a_t + f_1 a_{t-1} + ... + f_{d-1} a_{t-(d-1)}  (= e_t)  + f_d a_{t-d} + f_{d+1} a_{t-(d+1)} + ...    (1)

The controller-invariant term e_t can then be estimated by time series analysis of routine closed-loop operating data and subsequently used as a benchmark measure of the theoretically achievable absolute lower bound of output variance to assess control loop performance. Harris (1989), Desborough and Harris (1992), and Huang and Shah (1999) have derived algorithms for the calculation of this minimum variance term.

Multiplying Eq. (1) by a_t, a_{t-1}, ..., a_{t-d+1}, respectively, and then taking the expectation of both sides of the equation yields the sample covariance terms:

r_ya(0) = E[y_t a_t] = f_0 σ_a²
r_ya(1) = E[y_t a_{t-1}] = f_1 σ_a²
r_ya(2) = E[y_t a_{t-2}] = f_2 σ_a²
...
r_ya(d−1) = E[y_t a_{t-d+1}] = f_{d-1} σ_a²    (2)

The minimum variance, or the invariant portion of the output variance, is

σ_mv² = (f_0² + f_1² + f_2² + ... + f_{d-1}²) σ_a²
      = [(r_ya(0)/σ_a²)² + (r_ya(1)/σ_a²)² + (r_ya(2)/σ_a²)² + ... + (r_ya(d−1)/σ_a²)²] σ_a²
      = [r_ya²(0) + r_ya²(1) + r_ya²(2) + ... + r_ya²(d−1)] / σ_a²    (3)

A measure of controller performance can then be defined as

η(d) = σ_mv² / σ_y²    (4)

Substituting Eq. (3) into Eq. (4) yields

η(d) = [r_ya²(0) + r_ya²(1) + r_ya²(2) + ... + r_ya²(d−1)] / (σ_y² σ_a²)
     = ρ_ya²(0) + ρ_ya²(1) + ρ_ya²(2) + ... + ρ_ya²(d−1)
     = Z Zᵀ

where Z is the vector of cross-correlation coefficients between y_t and a_t for lags 0 to d − 1 and is denoted as

Z = [ρ_ya(0)  ρ_ya(1)  ρ_ya(2)  ...  ρ_ya(d−1)]

Although a_t is unknown, it can be replaced by the estimated innovation sequence â_t. The estimate â_t is obtained by whitening the process output variable y_t via time series analysis.
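This whitening-plus-correlation computation can be sketched as follows (a simplified illustration: the AR whitening order and the data-generating models in the test are assumptions, and industrial implementations take more care with model order selection and confidence bounds):

```python
import numpy as np

def fcor_index(y, d, ar_order=10):
    """Estimate the performance index eta(d) from routine closed-loop
    output data y (deviations from set point).

    (1) Whiten y with a least-squares AR fit to obtain the innovation
        estimate a_hat; (2) form the cross-correlations rho_ya(k) between
        y_t and a_hat_(t-k) for k = 0..d-1; (3) eta(d) = sum of squares."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    p = ar_order
    # Regression matrix of lagged outputs y_(t-1) .. y_(t-p)
    X = np.column_stack([y[p - i - 1:len(y) - i - 1] for i in range(p)])
    yt = y[p:]
    theta, *_ = np.linalg.lstsq(X, yt, rcond=None)
    a_hat = yt - X @ theta                 # whitened innovation sequence
    n = len(a_hat)
    eta = 0.0
    for k in range(d):
        rho = np.mean(yt[k:] * a_hat[:n - k]) / (yt.std() * a_hat.std())
        eta += rho ** 2
    return eta
```

For white-noise output data the index is close to one (nothing left to remove), while a sluggish first-order response yields a small index.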
This algorithm is denoted as the FCOR algorithm for Filtering and CORrelation analysis (Huang and Shah 1999). This derivation assumes that the delay d is known a priori. In practice, however, a priori knowledge of time delays may not always be available. It is therefore useful to assume a range of time delays and then calculate performance indices over this range. The indices over a range of time delays are also known as extended horizon performance indices (Thornhill et al. 1999). Through pattern recognition, one can assess the performance of the loop by visualizing the pattern of the performance indices versus time delay. There is a clear relation between the performance index curve and the impulse response curve of the control loop.

Consider a simple case where the process is subject to random disturbances. Figure 2 is one example of performance evaluation for a control loop in the presence of disturbances. This figure shows time series of process variable data for both loops (top), closed-loop impulse responses (left column), and corresponding performance indices (labeled as PI, right column). From the impulse responses, one can see that the loop under the first set of tuning constants (denoted TAG1.PV) has better performance; the loop under the second set of tuning constants (denoted TAG5.PV) has oscillatory behavior, indicating relatively poor control performance. With a performance index of "1" indicating the best possible performance and "0" indicating the worst performance, the performance indices for the first controller tuning (shown in the upper-right plot) approach "1" within 4 time lags, while the performance indices for the second controller tuning (shown in the bottom-right plot) take 10 time lags to approach "0.7." In addition, the performance indices for the second tuning show ripples as they approach an asymptotic limit, indicating a possible oscillation in the loop.

Notice that one cannot rank the performance of these two controller settings from the noisy time-series data. Instead, we can calculate performance indices over a range of time delays (from 1 to 10). The result is shown in the right column plots of Fig. 2. These simulations correspond to the same process with different controller tuning constants. It is clear from these plots that the performance index trajectory depends on the dynamics of the disturbance and on the controller tuning.

It is important to note that the minimum variance is just one of several benchmarks for obtaining a controller performance metric. It is seldom practical to implement minimum variance control, as it typically requires aggressive actuator action. However, the minimum variance benchmark serves to provide an indication of the opportunity for improving control performance; that is, should the performance index η(d) be near or just above zero, it gives the user an idea of the benefits possible in improving the control performance of that loop.

Performance Assessment and Diagnosis of Univariate Control Loops Using Alternative Performance Indicators

In addition to the performance index for performance assessment, there are several alternative indicators of control loop performance. These are discussed next.

Autocorrelation function: The autocorrelation function (ACF) of the output error, shown in Fig. 3, is an approximate measure of how close the existing controller is to the minimum variance condition, or of how predictable the error is over the time horizon of interest. If the controller is under the minimum variance condition, then the autocorrelation function should decay to zero after "d − 1" lags, where "d" is the delay of the process. In other words, there should be no predictable information beyond time lag d − 1. The rate at which the autocorrelation decays to zero after "d − 1" lags indicates how close the existing controller is to the minimum variance condition. Since it is straightforward to calculate the autocorrelation from process data, the autocorrelation function is often used as a first-pass test before carrying out further performance analysis.
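The first-pass ACF test just described can be sketched as follows (the significance band used here is the usual approximate 2/√N white-noise bound, an assumption since the text does not prescribe one):

```python
import numpy as np

def acf(e, max_lag):
    """Sample autocorrelation of the control error e for lags 0..max_lag."""
    e = np.asarray(e, dtype=float)
    e = e - e.mean()
    c0 = np.dot(e, e) / len(e)
    return np.array([np.dot(e[: len(e) - k], e[k:]) / (len(e) * c0)
                     for k in range(max_lag + 1)])

def predictable_beyond(e, d, n_lags=20):
    """Return True if the error is still predictable beyond lag d-1,
    i.e., the ACF exceeds an approximate 2/sqrt(N) white-noise band there --
    a sign that the loop is away from the minimum variance condition."""
    r = acf(e, d - 1 + n_lags)
    band = 2.0 / np.sqrt(len(e))
    return bool(np.any(np.abs(r[d:]) > band))
```

A sluggish loop (strongly autocorrelated error) fails the test immediately, flagging it for the more detailed analysis above.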
[Fig. 2 panels: trend plots of TAG1.PV and TAG5.PV (value versus sample number), impulse responses versus lag, and tables of PI(var) versus delay. PI(var) for TAG1.PV at delays 1–10: 0.484, 0.813, 0.934, 0.971, 0.982, 0.984, 0.985, 0.985, 0.987, 0.988. PI(var) for TAG5.PV at delays 1–10: 0.169, 0.284, 0.286, 0.398, 0.482, 0.483, 0.565, 0.628, 0.628, 0.684.]

Controller Performance Monitoring, Fig. 2 Time series of process variable (top), corresponding impulse responses (left column) and their performance indices (right column)
Controller Performance Monitoring, Fig. 3 Autocorrelation function of the controller error

Controller Performance Monitoring, Fig. 4 Impulse responses estimated from routine operating data

Impulse response: An impulse response function curve represents the closed-loop impulse response between the whitened disturbance sequence and the process output. This function is a direct measure of how well the controller is performing in rejecting disturbances or tracking set-point changes. In a stochastic framework, this impulse response function may be calculated using a time series model such as an autoregressive moving average (ARMA) or autoregressive integrated moving average (ARIMA) model. Once an ARMA-type time series model is estimated, the infinite-order moving average representation of the model shown in Eq. (1) can be obtained through long division of the ARMA model. As shown in Huang and Shah (1999), the coefficients of the moving average model, Eq. (1), are the closed-loop impulse response coefficients of the process between the whitened disturbances and the process output. Figure 4 shows closed-loop impulse responses of a control loop with two different controller tunings. Clearly they denote two different closed-loop dynamic responses: one is slow and smooth, and the other is relatively fast and slightly oscillatory. The sum of squares of the impulse response coefficients is the variance of the data.

Spectral analysis: The closed-loop frequency response of the process is an alternative way to assess control loop performance. Spectral analysis of output data easily allows one to detect oscillations, offsets, and measurement noise present in the process. The closed-loop frequency response is often plotted together with the closed-loop frequency response under minimum variance control. This is to check the possibility of performance improvement through controller tuning. The comparison gives a measure of how close the existing controller is to the minimum variance condition. In addition, it also provides the frequency range in which the controller significantly deviates from the minimum variance condition. Large deviation in the low-frequency range typically indicates lack of integral action or weak proportional gain. Large peaks in the medium-frequency range typically indicate an overtuned controller or the presence of oscillatory disturbances. Large deviation in the high-frequency range typically indicates significant measurement noise. As an illustrative example, the frequency responses of two control loops are shown in Fig. 5.
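The long-division step described above for obtaining the moving-average (impulse response) weights reduces to a simple recursion on the ARMA polynomial coefficients (a sketch; the AR(1) model in the usage note is an assumption for illustration):

```python
import numpy as np

def impulse_weights(ar, ma, n):
    """First n impulse-response (moving average) weights psi_j of an ARMA
    model A(q^-1) y_t = C(q^-1) a_t, via the long division A*psi = C.
    ar and ma are coefficient lists [1, a1, a2, ...] and [1, c1, c2, ...]."""
    psi = np.zeros(n)
    for j in range(n):
        c_j = ma[j] if j < len(ma) else 0.0
        # psi_j = c_j - sum_{i>=1} a_i * psi_{j-i}
        psi[j] = c_j - sum(ar[i] * psi[j - i]
                           for i in range(1, min(j, len(ar) - 1) + 1))
    return psi
```

For A = 1 − 0.8 q^{-1} and C = 1 this returns ψ_j = 0.8^j, and the sum of squared weights recovers the output variance, consistent with the statement above.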
[Fig. 5 panels: closed-loop versus minimum variance output response, magnitude (dB) versus frequency, for TAG1.PV and TAG5.PV.]

Controller Performance Monitoring, Fig. 5 Frequency response estimated from routine operating data
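A basic averaged-periodogram sketch of this kind of spectral check (Welch-style segment averaging; the segment count and the oscillatory test signal are assumptions):

```python
import numpy as np

def output_spectrum(y, n_seg=8):
    """Averaged periodogram of the closed-loop output: a crude spectral
    estimate for spotting oscillation peaks and comparing frequency ranges."""
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    seg = len(y) // n_seg
    power = np.mean(
        [np.abs(np.fft.rfft(y[i * seg:(i + 1) * seg])) ** 2 / seg
         for i in range(n_seg)], axis=0)
    return np.fft.rfftfreq(seg), power
```

An oscillatory loop produces a sharp peak at the oscillation frequency, which is the signature discussed for the mid-frequency range.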

The left graph of the figure shows that the closed-loop frequency response of the existing controller is almost the same as the frequency response under minimum variance control. A peak at mid-frequency indicates possible overtuned control. The right graph of Fig. 5 shows that the frequency response of the existing controller is oscillatory, indicating a possible overtuned controller or the presence of an oscillatory disturbance at the peak frequency; otherwise the controller is close to the minimum variance condition.

Segmentation of performance indices: Most process data exhibit time-varying dynamics; i.e., the process transfer function or the disturbance transfer function is time variant. Performance assessment with a non-overlapping sliding data window that can track time-varying dynamics is therefore often desirable. For example, segmentation of the data may lead to some insight into any cyclical behavior of the variation in controller performance, e.g., day/night or due to shift changes. Figure 6 is an example of performance segmentation over a 200-data-point window.

Performance Assessment of Univariate Control Loops Using User-Specified Benchmarks

The increasing level of global competitiveness has pushed chemical plants into high-performance operating regions that require advanced process control technology. See the
articles "Control Hierarchy of Large Processing Plants: An Overview" and "Control Structure Selection." Consequently, the industry has an increasing need to upgrade conventional PID controllers to advanced control systems. The most natural questions to ask for such an upgrade are as follows. Has the advanced controller improved performance as expected? If yes, where is the improvement, and can it be justified? Has the advanced controller been tuned to its full capacity? Can this improvement also be achieved by simply retuning the existing traditional (e.g., PID) controllers (see "PID Control")? In other words, what is the cost versus benefit of implementing an advanced controller? Unlike performance assessment using minimum variance control as the benchmark, the solution to this problem does not require a priori knowledge of time delays. Two possible relative benchmarks may be chosen: one is the historical data benchmark, or reference data set benchmark, and the other is a user-specified benchmark.

The purpose of reference data set benchmarking is to compare the performance of the existing controller with the previous controller during "normal" operation of the process. This reference data set may represent the process when the controller performance is considered satisfactory with respect to meeting the performance objectives. The reference data set should be representative of the normal conditions that the process is expected to operate at; i.e., the disturbances and set-point changes entering the process should not be unusually different. This analysis provides the user with a relative performance index (RPI), which compares the existing control loop performance with a reference control loop benchmark chosen by the user. The RPI satisfies RPI ≥ 0, with "<1" indicating deteriorated performance, "1" indicating no change in performance, and ">1" indicating improved performance. Figure 7 shows a result of reference data set benchmarking. The impulse response of the benchmark or reference data smoothly decays to zero, indicating good performance of the controller. After one increases the proportional gain of the controller, the impulse response shows oscillatory behavior, with an RPI = 0.4, indicating deteriorated performance due to the oscillation.

In some cases one may wish to specify certain desired closed-loop dynamics and carry out performance analysis with respect to such desired dynamics. One such desired dynamic benchmark is the closed-loop settling time. As an illustrative example, Fig. 8 shows a system where a settling time of ten sampling units is desired for a process with a delay of five sampling units. The impulse responses show that the existing loop is close to the desired performance, and the value RPI = 0.9918 confirms this. Thus no further tuning of the loop is necessary.

Diagnosis of Poorly Performing Loops

Whereas detection of poorly performing loops is now relatively simple, the task of diagnosing the reason(s) for poor performance and how to "mend" the loop is generally not straightforward. The reasons for poor performance could be any one of: interactions between various control loops, overtuned or undertuned controller settings, process nonlinearity, poor controller configuration (meaning the choice of pairing a process (or controlled) variable with a manipulated variable), or actuator problems such as stiction, large delays, and severe disturbances. Several studies have focused on the diagnosis issues related to actuator problems (Hägglund 2002; Choudhury et al. 2008; Srinivasan and Rengaswamy 2008; Xiang and Lakshminarayanan 2009; de Souza et al. 2012). Shardt et al. (2012) have given an overview of the overall state of CPM and the related diagnosis issues.

Industrial Applications of CPM Technology

As remarked earlier, CPM software is now readily available from most DCS vendors and has already been implemented successfully at several large-scale industrial sites.
Controller Performance Monitoring, Fig. 6 Performance indices for segmented data (each window of length 200 points). [Delay = 3; PI over successive windows ending at samples 200–2000: 0.541, 0.428, 0.387, 0.433, 0.462, 0.404, 0.479, 0.460, 0.496, 0.388]
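The windowed segmentation behind Fig. 6 can be sketched generically as follows (the per-window index function is left pluggable; the indices used in the test are stand-ins, not the FCOR index itself):

```python
import numpy as np

def segmented_indices(y, window, index_fn):
    """Evaluate a performance index over consecutive non-overlapping
    windows of the data, as in the segmentation approach above."""
    y = np.asarray(y, dtype=float)
    return [index_fn(y[i * window:(i + 1) * window])
            for i in range(len(y) // window)]
```

For example, `segmented_indices(data, 200, my_index)` reproduces the ten 200-point windows of Fig. 6 for a 2000-sample record.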

Controller Performance Monitoring, Fig. 7 Reference benchmarking based on impulse responses

Controller Performance Monitoring, Fig. 8 User-specified benchmark based on impulse responses
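A minimal sketch of the relative-benchmark idea behind Figs. 7 and 8 (this variance-ratio definition is an assumption for illustration only; commercial CPM packages compute the RPI from identified closed-loop models rather than raw variances):

```python
import numpy as np

def relative_performance_index(e_current, e_reference):
    """Compare current control-error variance against a reference
    ("golden") data set: RPI > 1 improved, RPI = 1 unchanged, RPI < 1
    deteriorated, mirroring the interpretation given in the text."""
    e_cur = np.asarray(e_current, dtype=float)
    e_ref = np.asarray(e_reference, dtype=float)
    return float(np.var(e_ref) / np.var(e_cur))
```

Note that, unlike the minimum variance index, no knowledge of the process delay is needed: only the reference data set.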
A summary of just two of many large-scale industrial implementations of CPM technology appears below. It gives clear evidence of the impact of this control technology and how readily it has been embraced by industry (Shah et al. 2014).

BASF Controller Performance Monitoring Application
As part of its excellence initiative OPAL 21 (Optimization of Production Antwerp and Ludwigshafen), BASF has implemented the CPM strategy on more than 30,000 control loops at its Ludwigshafen site in Germany and on over 10,000 loops at its Antwerp production facility in Belgium. The key factor in using this technology effectively is to combine process knowledge, basic chemical engineering, and control expertise to develop solutions for the control problems that are diagnosed by the CPM software (Wolff et al. 2012).

Saudi Aramco Controller Performance Monitoring Practice
As part of its process control improvement initiative, Saudi Aramco has deployed CPM on approximately 15,000 PID loops, 50 MPC applications, and 500 smart positioners across multiple operating facilities.

The operational philosophy of the CPM engine is incorporated in the continuous improvement process at BASF and Aramco, whereby all loops are monitored in real time and a holistic performance picture is obtained for the entire plant. Unit-wide performance metrics are displayed in color-coded graphic forms to effectively convey the analytics information of the process.

Concluding Remarks

In summary, industrial control systems are designed and implemented or upgraded with a particular objective in mind. The controller performance monitoring methodology discussed here permits automated and repeated reviews of the design, tuning, and upgrading of control loops. Poor design, tuning, or upgrading of the control loops can be detected, and repeated performance monitoring will indicate which loops should be retuned or which loops have not been effectively upgraded when changes in the disturbances, in the process, or in the controller itself occur. Obviously, better design, tuning, and upgrading will mean that the process operates at a point close to the economic optimum, leading to energy savings, improved safety, efficient utilization of raw materials, higher product yields, and more consistent product qualities. This entry has summarized the major features available in recent commercial software packages for control loop performance assessment. The illustrative examples have demonstrated the applicability of this technique when applied to process data. This entry has also illustrated how controllers, whether in hardware or software form, should be treated like "capital assets" and how there should be routine monitoring to ensure that they perform close to the economic optimum and that the benefits of good regulatory control are achieved.

Cross-References

Control Hierarchy of Large Processing Plants: An Overview
Control Structure Selection
Fault Detection and Diagnosis
PID Control
Statistical Process Control in Manufacturing

Bibliography

Choudhury MAAS, Shah SL, Thornhill NF (2008) Diagnosis of process nonlinearities and valve stiction: data driven approaches. Springer-Verlag, Berlin
Desborough L, Harris T (1992) Performance assessment measure for univariate feedback control. Can J Chem Eng 70:1186–1197
Desborough L, Miller R (2002) Increasing customer value of industrial control performance monitoring: Honeywell's experience. In: AIChE symposium series. American Institute of Chemical Engineers, New York, pp 169–189
de Prada C (2014) Overview: control hierarchy of large processing plants. In: Encyclopedia of systems and control. Springer, London
de Souza LCMA, Munaro CJ, Munareto S (2012) Novel model-free approach for stiction compensation in control valves. Ind Eng Chem Res 51(25):8465–8476
Hägglund T (2002) A friction compensator for pneumatic control valves. J Process Control 12(8):897–904
Harris T (1989) Assessment of closed loop performance. Can J Chem Eng 67:856–861
Huang B, Shah SL (1999) Performance assessment of control loops: theory and applications. Springer-Verlag, London
Shah SL, Nohr M, Patwardhan R (2014) Success stories in control: controller performance monitoring. In: Samad T, Annaswamy AM (eds) The impact of control technology, 2nd edn. www.ieeecss.org
Shardt YAW, Zhao Y, Lee KH, Yu X, Huang B, Shah SL (2012) Determining the state of a process control system: current trends and future challenges. Can J Chem Eng 90(2):217–245
Srinivasan R, Rengaswamy R (2008) Approaches for efficient stiction compensation in process control valves. Comput Chem Eng 32(1):218–229
Skogestad S (2014) Control structure selection and plantwide control. In: Encyclopedia of systems and control. Springer, London
Thornhill NF, Horch A (2007) Advances and new directions in plant-wide controller performance assessment. Control Eng Pract 15(10):1196–1206
Thornhill NF, Oettinger M, Fedenczuk P (1999) Refinery-wide control loop performance assessment. J Process Control 9(2):109–124
Wolff F, Roth M, Nohr A, Kahrs O (2012) Software based control-optimization for the chemical industry. In: VDI Tagungsband "Automation 2012", 13/14.06.2012, Baden-Baden
Xiang LZ, Lakshminarayanan S (2009) A new unified approach to valve stiction quantification and compensation. Ind Eng Chem Res 48(7):3474–3483

Cooperative Manipulators

Fabrizio Caccavale
School of Engineering, Università degli Studi della Basilicata, Potenza, Italy

Abstract

This chapter presents an overview of the main issues related to modeling and control of cooperative robotic manipulators. A historical path is followed to present the main research results on cooperative manipulation. Kinematics and dynamics of robotic arms cooperatively manipulating a tightly grasped rigid object are briefly discussed. Then, this entry presents the main strategies for force/motion control of the cooperative system.

Keywords

Cooperative task space; Coordinated motion; Force/motion control; Grasping; Manipulation; Multi-arm systems

Introduction

Since the early 1970s, it has been recognized that many tasks, which are difficult or even impossible to execute by a single robotic manipulator, become feasible when two or more manipulators work in a cooperative way. Examples of typical cooperative tasks are the manipulation of heavy and/or large payloads, assembly of multiple parts, and handling of flexible and articulated objects (Fig. 1).

Cooperative Manipulators, Fig. 1 An example of a cooperative robotic work cell composed of two industrial robot arms

In the 1980s, research achieved several theoretical results related to modeling and control of single-arm robots; this further fostered research on multi-arm robotic systems. Dynamics
and control as well as force control issues were widely explored during the decade.

In the 1990s, parameterization of the constraint forces/moments acting on the object was recognized as a key to solving control problems and was studied in several papers (e.g., Sang et al. 1995; Uchiyama and Dauchez 1993; Walker et al. 1991; Williams and Khatib 1993). Several control schemes for cooperative manipulators based on the sought parameterizations have been designed, including force/motion control (Wen and Kreutz-Delgado 1992) and impedance control (Bonitz and Hsia 1996; Schneider and Cannon 1992). Other approaches are adaptive control (Hu et al. 1995), kinematic control (Chiacchio et al. 1996), task-space regulation (Caccavale et al. 2000), and model-based coordinated control (Hsu 1993). Other important topics investigated in the 1990s were the definition of user-oriented task-space variables for coordinated control (Caccavale et al. 2000; Chiacchio et al. 1996), the development of meaningful performance measures (Chiacchio et al. 1991a,b) for multi-arm systems, and the problem of load sharing (Walker et al. 1989).

Most of the abovementioned works assume that the cooperatively manipulated object is rigid and tightly grasped. However, since the 1990s, several research efforts have been focused on the control of cooperative flexible manipulators (Yamano et al. 2004), since the merits of flexible-arm robots (lightweight structure, intrinsic compliance, and hence safety) can be conveniently exploited in cooperative manipulation. Other research efforts have been focused on the control of cooperative systems for the manipulation of flexible objects (Yukawa et al. 1996) as well.

Modeling, Load Sharing, and Performance Evaluation

The first modeling goal is the definition of suitable variables describing the kinetostatics of a cooperative system. Hereafter, the main results available are summarized for a dual-arm system composed of two cooperative manipulators grasping a common object.

Cooperative Manipulators, Fig. 2 Grasp geometry for a two-manipulator cooperative system manipulating a common object. The vectors r1 and r2 are the virtual sticks, Tc is the coordinate frame attached to the object, and T1 and T2 are the coordinate frames attached to each end effector

The kinetostatic formulation proposed by Uchiyama and Dauchez (1993), i.e., the so-called symmetric formulation, is based on kinematic and static relationships between the generalized forces/velocities acting at the object and their counterparts acting at the manipulators' end effectors. To this aim, the concept of virtual stick is defined as the vector which determines the position of an object-fixed coordinate frame with respect to the frame attached to each robot end effector (Fig. 2). When the object grasped by the two manipulators can be considered rigid and tightly attached to each end effector, the virtual stick behaves as a rigid stick fixed to each end effector.

According to the symmetric formulation, the vector h, collecting the generalized forces (i.e., forces and moments) acting at each end effector, is given by

h = W† h_E + V h_I    (1)
available are summarized for a dual-arm system end effector

grasp matrix, and hI is the generalized force a commonly manipulated object. In multifingered
vector which does not contribute to the object’s hands, only some motion components are
motion, i.e., it represents internal loading of the transmitted through the contact point to the
object (mechanical stresses) and is termed as manipulated object (unilateral constraints), while
internal forces, while hE represents the vector of cooperative manipulation via robotic arms is
external forces, i.e., forces and moments causing achieved by rigid (or near-rigid) grasp points and
the object’s motion. Later, a task-oriented interaction takes place by transmitting all the
formulation has been proposed (Chiacchio et al. motion components through the grasping points
1996), aimed at defining a cooperative task space (bilateral constraints). While many common
in terms of absolute and relative motion of problems between the two fields can be tackled
the cooperative system, which can be directly in a conceptually similar way (e.g., kinetostatic
computed from the position and orientation of modeling, force control), many others are specific
the end-effector coordinate frames. of each of the two application fields (e.g., form
The dynamics of a cooperative multi-arm sys- and force closure for multifingered hands).
tem can be written as the dynamics of the single
manipulators together with the closed-chain con-
straints imposed by the grasped object. By elimi- Control
nating the constraints, a reduced-order model can
be obtained (Koivo and Unseren 1991). When a cooperative multi-arm system is em-
Strongly related to kinetostatics and dynamics ployed for the manipulation of a common object,
of cooperative manipulators is the load sharing it is important to control both the absolute motion
problem, i.e., distributing the load among the of the object and the internal stresses applied
arms composing the system, which has been to it. Hence, most of the control approaches to
solved, e.g., in Walker et al. (1989). A very rele- cooperative robotic systems can be classified as
vant problem related to the load sharing is that of force/motion control schemes.
robust holding, i.e., the problem of determining Early approaches to the control of cooperative
forces/moments applied to object by the arms, in systems were based on the master/slave concept.
order to keep the grasp even in the presence of Namely, the cooperative system is decomposed
disturbing forces/moments. in a position-controlled master arm, in charge
A major issue in robotic manipulation is the of imposing the absolute motion of the object,
performance evaluation via suitably defined and the force-controlled slave arms, which are
indexes (e.g., manipulability ellipsoids). These to follow (as smoothly as possible) the motion
concepts have been extended to multi-arm robotic imposed by the master. A natural evolution of the
systems in Chiacchio et al. (1991a,b). Namely, by above-described concept has been the so-called
exploiting the kinetostatic formulations described leader/follower approach, where the follower
above, velocity and force manipulability arm reference motion is computed via closed-
ellipsoids can be defined, by regarding the whole chain constraints. However, such approaches
cooperative system as a mechanical transformer suffered from implementation issues, mainly due
from the joint space to the cooperative task space. to the fact that the compliance of the slave arms
The manipulability ellipsoids can be seen as has to be very large, so as to smoothly follow the
performance measures aimed at determining the motion imposed by the master arm. Moreover,
attitude of the system to cooperate in a given the roles of the master and slave (leader and
configuration. follower) may need to be changed during the task
Finally, it is worth mentioning the strict re- execution.
lationship between problems related to grasping Due to the abovementioned limitations, more
of objects by fingers/hands and those related natural nonmaster/slave approaches have been
to cooperative manipulation. In fact, in both pursued later, where the cooperative system is
cases, multiple manipulation structures grasp seen as a whole. Namely, the reference motion
Namely, the reference motion of the object is used to determine the motion of all the arms in the system, and the interaction forces are measured and fed back so as to be directly controlled. To this aim, the mappings between forces and velocities at the end effector of each manipulator and their counterparts at the manipulated object are considered in the design of the control laws.

An approach based on the classical hybrid force/position control scheme has been proposed in Uchiyama and Dauchez (1993), by exploiting the symmetric formulation described in the previous section.

In Wen and Kreutz-Delgado (1992) a Lyapunov-based approach is pursued to devise force/position PD-type control laws. This approach has been extended in Caccavale et al. (2000), where kinetostatic filtering of the control action is performed so as to eliminate all the components of the control input which contribute to internal stresses at the object.

A further improvement of the PD plus gravity compensation control approach has been achieved by introducing full model compensation, so as to achieve feedback linearization of the closed-loop system. The feedback linearization approach formulated at the operational space level is the basis of the so-called augmented object approach (Sang et al. 1995). In this approach, the system is modeled in the operational space as a whole, by suitably expressing its inertial properties via a single augmented inertia matrix M̂, i.e.,

M̂(x_E) ẍ_E + ĉ(x_E, ẋ_E) + ĝ(x_E) = h_E    (2)

where M̂, ĉ, and ĝ are the operational space terms modeling, respectively, the inertial properties of the whole system (manipulators and object), the Coriolis/centrifugal/friction terms, and the gravity terms, while x_E is the operational space vector describing the position and orientation of the coordinate frame attached to the grasped object. In the framework of feedback linearization (formulated in the operational space),

model (Williams and Khatib 1993) or according to the scheme proposed in Hsu (1993).

An alternative control approach is based on the well-known impedance concept (Bonitz and Hsia 1996; Schneider and Cannon 1992). In fact, when a manipulation system interacts with an external environment and/or other manipulators, large values of the contact forces and moments can be avoided by enforcing a compliant behavior with suitable dynamic features. In detail, the following mechanical impedance behavior between the object displacements and the forces due to the object-environment interaction can be enforced (external impedance):

M_E ã_E + D_E ṽ_E + K_E e_E = h_env    (3)

where e_E represents the vector of displacements between the object's desired and actual pose, ṽ_E is the difference between the object's desired and actual generalized velocities, ã_E is the difference between the object's desired and actual generalized accelerations, and h_env is the generalized force acting on the object due to the interaction with the environment. The impedance dynamics is characterized in terms of given positive definite mass, damping, and stiffness matrices (M_E, D_E, K_E). A mechanical impedance behavior between the i-th end-effector displacements and the internal forces can be imposed as well (internal impedance):

M_{I,i} ã_i + D_{I,i} ṽ_i + K_{I,i} e_i = h_{I,i}    (4)

where e_i is the vector expressing the displacement between the commanded and the actual pose of the i-th end effector, ṽ_i is the vector expressing the difference between the commanded and actual generalized velocities of the i-th end effector, ã_i is the vector expressing the difference between the commanded and actual generalized accelerations of the i-th end effector, and h_{I,i} is the contribution of the i-th end effector to the internal force. Again, the impedance dynamics is characterized in terms of given positive definite mass,
the problem of controlling the internal forces can damping, and stiffness matrices (MI;i; DI;i; K I;i ).
be solved, e.g., by resorting to the virtual linkage More recently, an impedance scheme for control
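The impedance relations (3) and (4) act as mass-spring-damper filters between pose errors and forces. As a rough illustration (not part of the original entry; the scalar model and all numerical values are invented for the example), the sketch below integrates a scalar analogue of the external impedance (3) with forward Euler and shows that a stiffer stiffness term leaves a smaller steady-state displacement under a constant environment force:

```python
# Scalar analogue of the external-impedance model (3):
#   m*e_ddot + d*e_dot + k*e = h_env.
# Forward-Euler integration; parameter values are illustrative only.

def simulate_impedance(m, d, k, h_env, dt=1e-3, t_final=10.0):
    e, e_dot = 0.0, 0.0                  # pose displacement and its rate
    for _ in range(int(t_final / dt)):
        e_ddot = (h_env - d * e_dot - k * e) / m   # impedance dynamics
        e += dt * e_dot
        e_dot += dt * e_ddot
    return e

# A stiffer spring yields a smaller steady-state displacement e = h_env / k.
e_soft = simulate_impedance(m=1.0, d=20.0, k=100.0, h_env=5.0)
e_stiff = simulate_impedance(m=1.0, d=60.0, k=1000.0, h_env=5.0)
print(e_soft, e_stiff)   # ≈ 0.05 and ≈ 0.005
```

In practice the same trade-off drives the choice of (M_E, D_E, K_E): a compliant (low-stiffness) behavior tolerates larger displacements in exchange for smaller contact forces.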
Summary and Future Directions

This entry has provided a brief survey of the main issues related to cooperative robots, with special emphasis on modeling and control problems. Among several open research topics in cooperative manipulation, it is worth mentioning the problem of cooperative transportation and manipulation of objects via multiple mobile manipulators. In fact, although notable results have already been devised in Khatib et al. (1996), the foreseen use of robotic teams in industrial settings (hyperflexible robotic work cells) and/or in collaboration with humans (the robotic coworker concept) raises new challenges related to autonomy and safety of multiple mobile manipulators. Also, an emerging application field is given by cooperative systems composed of multiple aerial vehicle-manipulator systems (see, e.g., Fink et al. 2011).

Cross-References

▶ Force Control in Robotics
▶ Robot Grasp Control
▶ Robot Motion Control

Recommended Reading

An overview of the field of cooperative manipulation can also be found in Caccavale and Uchiyama (2008), where a more extended literature review and further technical details are provided. Seminal contributions to control of cooperative manipulators can be found in Chiacchio et al. (1991a), Koivo and Unseren (1991), Sang et al. (1995), Uchiyama and Dauchez (1993), Walker et al. (1989), Wen and Kreutz-Delgado (1992), and Williams and Khatib (1993).

Bibliography

Bonitz RG, Hsia TC (1996) Internal force-based impedance control for cooperating manipulators. IEEE Trans Robot Autom 12:78–89
Caccavale F, Uchiyama M (2008) Cooperative manipulators. In: Siciliano B, Khatib O (eds) Springer handbook of robotics – chapter 29. Springer, Heidelberg
Caccavale F, Chiacchio P, Chiaverini S (2000) Task-space regulation of cooperative manipulators. Automatica 36:879–887
Caccavale F, Chiacchio P, Marino A, Villani L (2008) Six-DOF impedance control of dual-arm cooperative manipulators. IEEE/ASME Trans Mechatron 13:576–586
Chiacchio P, Chiaverini S, Sciavicco L, Siciliano B (1991a) Global task space manipulability ellipsoids for multiple arm systems. IEEE Trans Robot Autom 7:678–685
Chiacchio P, Chiaverini S, Sciavicco L, Siciliano B (1991b) Task space dynamic analysis of multiarm system configurations. Int J Robot Res 10:708–715
Chiacchio P, Chiaverini S, Siciliano B (1996) Direct and inverse kinematics for coordinated motion tasks of a two-manipulator system. ASME J Dyn Syst Meas Control 118:691–697
Fink J, Michael N, Kim S, Kumar V (2011) Planning and control for cooperative manipulation and transportation with aerial robots. Int J Robot Res 30:324–334
Hsu P (1993) Coordinated control of multiple manipulator systems. IEEE Trans Robot Autom 9:400–410
Hu Y-R, Goldenberg AA, Zhou C (1995) Motion and force control of coordinated robots during constrained motion tasks. Int J Robot Res 14:351–365
Khatib O, Yokoi K, Chang K, Ruspini D, Holmberg R, Casal A (1996) Coordination and decentralized cooperation of multiple mobile manipulators. J Robot Syst 13:755–764
Koivo AJ, Unseren MA (1991) Reduced order model and decoupled control architecture for two manipulators holding a rigid object. ASME J Dyn Syst Meas Control 113:646–654
Sang KS, Holmberg R, Khatib O (1995) The augmented object model: cooperative manipulation and parallel mechanisms dynamics. In: Proceedings of the 2000 IEEE international conference on robotics and automation, San Francisco, pp 470–475
Schneider SA, Cannon Jr RH (1992) Object impedance control for cooperative manipulation: theory and experimental results. IEEE Trans Robot Autom 8:383–394
Uchiyama M, Dauchez P (1993) Symmetric kinematic formulation and non-master/slave coordinated control of two-arm robots. Adv Robot 7:361–383
Walker ID, Marcus SI, Freeman RA (1989) Distribution of dynamic loads for multiple cooperating robot manipulators. J Robot Syst 6:35–47
Walker ID, Freeman RA, Marcus SI (1991) Analysis of motion and internal force loading of objects grasped
by multiple cooperating manipulators. Int J Robot Res 10:396–409
Wen JT, Kreutz-Delgado K (1992) Motion and force control of multiple robotic manipulators. Automatica 28:729–743
Williams D, Khatib O (1993) The virtual linkage: a model for internal forces in multi-grasp manipulation. In: Proceedings of the 1993 IEEE international conference on robotics and automation, Atlanta, pp 1025–1030
Yamano M, Kim J-S, Konno A, Uchiyama M (2004) Cooperative control of a 3D dual-flexible-arm robot. J Intell Robot Syst 39:1–15
Yukawa T, Uchiyama M, Nenchev DN, Inooka H (1996) Stability of control system in handling of a flexible object by rigid arm robots. In: Proceedings of the 1996 IEEE international conference on robotics and automation, Minneapolis, pp 2332–2339

Cooperative Solutions to Dynamic Games

Alain Haurie
ORDECSYS and University of Geneva, Switzerland
GERAD-HEC Montréal, PQ, Canada

Abstract

This article presents the fundamental elements of the theory of cooperative games in the context of dynamic systems. The concepts of Pareto optimality, Nash bargaining solution, characteristic function, cores, and C-optimality are discussed, and some fundamental results are recalled.

Keywords

Cores; Nash equilibrium; Pareto optimality

Introduction

Solution concepts in game theory are grouped into two main categories, called noncooperative and cooperative solutions, respectively. In the seminal book of von Neumann and Morgenstern (1944) this categorization is already made. These authors discuss zero-sum (matrix) games in normal form, where the noncooperative solution concept of saddle point was defined and characterized, and games in characteristic function form, where solution concepts for games of coalitions were introduced. In this article we present the fundamental solution concepts of the theory of cooperative games in the context of dynamical systems. The article is organized as follows: we first recall the papers which mark the origin of the development of a theory of dynamic games; then we recall the basic concept of Pareto optimality, proposed as a cooperative solution concept; we present the scalarization technique and the necessary or sufficient optimality conditions for Pareto optimality in mathematical programming and optimal control settings; we then explore the difficulties encountered when one tries to extend the Nash bargaining solution, characteristic function, and core concepts to dynamic games; and we show the links that exist with the theory of reachability for perturbed dynamic systems.

The Origins

One may consider that the first introduction of a cooperative game solution concept in systems and control science is due to L.A. Zadeh (1963). Two-player zero-sum dynamic games have been studied by R. Isaacs (1954) in a deterministic continuous time setting and by L. Shapley (1953) in a discrete time stochastic setting. Nonzero-sum and m-player differential games were introduced by Y.C. Ho and A.W. Starr (1969) and J.H. Case (1969). For these games, cooperative solutions can be looked for to complement the noncooperative Nash equilibrium concept.

Cooperation Solution Concept

In cooperative games one is interested in nondominated solutions. This solution type is related to a concept introduced by the well-known economist V. Pareto (1896) in the context of
welfare economics. Consider a system with decision variables x ∈ X ⊂ ℝⁿ and m performance criteria x ↦ ψ_j(x) ∈ ℝ, j = 1, …, m, that one tries to maximize.

Definition 1 The decision x* ∈ X is nondominated or Pareto optimal if the following condition holds:

    ψ_j(x) ≥ ψ_j(x*)  ∀j = 1, …, m
    ⟹  ψ_j(x) = ψ_j(x*)  ∀j = 1, …, m.

In other words, it is impossible to give one criterion j a value greater than ψ_j(x*) without decreasing the value of another criterion, say ℓ, which then takes a value lower than ψ_ℓ(x*).

This vector-valued optimization framework corresponds to a situation where m players are engaged in a game, described in its normal form, where the strategies of the m players constitute the decision vector x and their respective payoffs are given by the m performance criteria ψ_j(x), j = 1, …, m. One assumes that these players jointly take a decision that is cooperatively optimal, in the sense that no player can improve his/her payoff without deteriorating the payoff of at least one other player.

The Scalarization Technique
Let r = (r₁, r₂, …, r_m) be a given m-vector composed of normalized weights that satisfy r_j > 0, j = 1, …, m, and Σ_{j=1,…,m} r_j = 1.

Lemma 1 Let x* ∈ X be a maximum in X for the scalarized criterion Ψ(x; r) = Σ_{j=1}^m r_j ψ_j(x). Then x* is a nondominated solution for the multi-objective problem.

The proof is very simple. Suppose x* is dominated; then there exists x° ∈ X such that ψ_j(x°) ≥ ψ_j(x*), ∀j = 1, …, m, and ψ_i(x°) > ψ_i(x*) for one i ∈ {1, …, m}. Since all the r_j are > 0, this yields Σ_{j=1}^m r_j ψ_j(x°) > Σ_{j=1}^m r_j ψ_j(x*), which contradicts the maximizing property of x*. This result shows that it will be very easy to find many Pareto optimal solutions by varying a strictly positive weighting of the criteria. But this procedure will not find all of the nondominated solutions.

Conditions for Pareto Optimality in Mathematical Programming
N.O. Da Cunha and E. Polak (1967b) obtained the first necessary conditions for multi-objective optimization. The problem they consider is

    Pareto Opt:  ψ_j(x),  j = 1, …, m
    s.t.  φ_k(x) ≥ 0,  k = 1, …, p,

where the functions x ∈ ℝⁿ ↦ ψ_j(x) ∈ ℝ, j = 1, …, m, and x ↦ φ_k(x) ∈ ℝ, k = 1, …, p, are continuously differentiable (C¹) and where we assume that the constraint qualification conditions of mathematical programming hold for this problem too. They proved the following theorem.

Theorem 1 Let x* be a Pareto optimal solution of the problem defined above. Then there exists a vector λ of p multipliers λ_k, k = 1, …, p, and a vector r ≠ 0 of m weights r_j ≥ 0, such that the following conditions hold:

    ∂L(x*; r; λ)/∂x = 0
    φ_k(x*) ≥ 0
    λ_k φ_k(x*) = 0
    λ_k ≥ 0,

where L(x; r; λ) is the weighted Lagrangian defined by

    L(x; r; λ) = Σ_{j=1}^m r_j ψ_j(x) + Σ_{k=1}^p λ_k φ_k(x).

So there is a local scalarization principle for Pareto optimality.
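Lemma 1 suggests a simple recipe for generating Pareto optimal points: sweep the strictly positive weight vector r and maximize the scalarized criterion for each choice. The sketch below is purely illustrative (the two concave criteria, the discretized decision set, and the weight grid are invented for the example); it finds each scalarized maximizer by brute-force search and then checks that none of them is dominated, as the lemma guarantees:

```python
# Generate Pareto optimal decisions for a toy bi-objective problem via
# scalarization (Lemma 1): maximize r1*psi1 + r2*psi2 over a grid of x,
# for several strictly positive weight vectors r = (r1, 1 - r1).

def psi1(x):  # first criterion (peaks at x = 0.25)
    return -(x - 0.25) ** 2

def psi2(x):  # second criterion (peaks at x = 0.75)
    return -(x - 0.75) ** 2

X = [i / 1000 for i in range(1001)]          # discretized decision set

def scalarized_max(r1, r2):
    return max(X, key=lambda x: r1 * psi1(x) + r2 * psi2(x))

def dominated(x):
    # x is dominated if some x' is at least as good in both criteria
    # and strictly better in one of them.
    return any(psi1(y) >= psi1(x) and psi2(y) >= psi2(x)
               and (psi1(y) > psi1(x) or psi2(y) > psi2(x)) for y in X)

pareto_points = [scalarized_max(r, 1 - r) for r in (0.1, 0.3, 0.5, 0.7, 0.9)]
print(pareto_points)                              # points between the two peaks
print(any(dominated(x) for x in pareto_points))   # False, by Lemma 1
```

Note that the sweep only reaches the part of the nondominated set exposed by positive weights; as remarked above, in general it does not produce all nondominated solutions.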
Maximum Principle
The extension of the Pareto optimality concept to control systems was done by several authors (Basile and Vincent 1970; Bellassali and Jourani 2004; Binmore et al. 1986; Blaquière et al. 1972; Leitmann et al. 1972; Salukvadze 1971; Vincent and Leitmann 1970; Zadeh 1963), the main result being an extension of the maximum principle of Pontryagin. Let a system be governed by the state equations:

    ẋ(t) = f(x(t), u(t))    (1)
    u(t) ∈ U    (2)
    x(0) = x_o    (3)
    t ∈ [0, T],    (4)

where x ∈ ℝⁿ is the state variable of the system, u ∈ U ⊂ ℝᵖ with U compact is the control variable, and [0, T] is the control horizon. The system is evaluated by m performance criteria of the form

    ψ_j(x(·), u(·)) = ∫₀ᵀ g_j(x(t), u(t)) dt + G_j(x(T)),    (5)

for j = 1, …, m. Under the usual assumptions of control theory, i.e., f(·,·) and g_j(·,·), j = 1, …, m, being C¹ in x and continuous in u, and G_j(·) being C¹ in x, one can prove the following.

Theorem 2 Let {x*(t) : t ∈ [0, T]} be a Pareto optimal trajectory, generated at initial state x° by the Pareto optimal control {u*(t) : t ∈ [0, T]}. Then there exist costate vectors {λ(t) : t ∈ [0, T]} and a vector of weights r ≠ 0 ∈ ℝᵐ, with components r_j ≥ 0, Σ_{j=1}^m r_j = 1, such that the following relations hold:

    ẋ*(t) = ∂H(x*(t), u*(t); λ(t); r)/∂λ    (6)
    λ̇(t) = −∂H(x*(t), u*(t); λ(t); r)/∂x    (7)
    x*(0) = x_o    (8)
    λ(T) = ∂/∂x Σ_{j=1}^m r_j G_j(x(T))    (9)

with

    H(x*(t), u*(t); λ(t); r) = max_{u ∈ U} H(x*(t), u; λ(t); r),

where the weighted Hamiltonian is defined by

    H(x, u; λ; r) = Σ_{j=1}^m r_j g_j(x, u) + λᵀ f(x, u).

The proof of this result necessitates some additional regularity assumptions. Some of these conditions imply that there exist differentiable Bellman value functions (see, e.g., Blaquière et al. 1972); some others use the formalism of nonsmooth analysis (see, e.g., Bellassali and Jourani 2004).

The Nash Bargaining Solution

Since Pareto optimal solutions are numerous (actually, since a subset of Pareto outcomes is indexed over the weightings r, r_j > 0, Σ_{j=1}^m r_j = 1), one can expect, in the m-dimensional payoff space, to have a manifold of Pareto outcomes. Therefore, the problem that we must solve now is how to select the "best" Pareto outcome. "Best" is a misnomer here, because, by their very definition, two Pareto outcomes cannot be compared or gauged. The choice of a Pareto outcome that satisfies each player must be the result of some bargaining. J. Nash addressed this problem very early, in 1950, using a two-player game setting. He developed an axiomatic approach where he proposed four behavior axioms which, if accepted, would determine a unique choice for the bargaining solution. These axioms are called, respectively, (i) invariance to affine transformations of utility representations, (ii) Pareto optimality, (iii) independence of irrelevant alternatives, and (iv) symmetry. Then the bargaining point is the Pareto optimal solution that maximizes the product

    x* = argmax_x (ψ₁(x) − ψ₁(x°)) (ψ₂(x) − ψ₂(x°)),

where x° is the status quo decision in case bargaining fails, and ψ_j(x°), j = 1, 2, are the payoffs associated with this no-accord decision (this defines the so-called threat point).
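As a small numerical illustration (not from the article; the linear frontier and the threat point are invented for the example), the bargaining point can be located by maximizing the product of payoff gains over a discretized Pareto frontier, keeping only individually rational outcomes:

```python
# Locate the Nash bargaining point on a discretized Pareto frontier by
# maximizing (psi1 - psi1(x0)) * (psi2 - psi2(x0)) above the threat point.
# Frontier: psi1 + psi2 = 2 on [0, 2]; threat point (0.5, 0.5). Illustrative.

threat = (0.5, 0.5)

def nash_product(p, d=threat):
    return (p[0] - d[0]) * (p[1] - d[1])

frontier = [(t / 1000, 2 - t / 1000) for t in range(2001)]
candidates = [p for p in frontier
              if p[0] > threat[0] and p[1] > threat[1]]  # individually rational
bargain = max(candidates, key=nash_product)
print(bargain)   # (1.0, 1.0): the symmetry axiom picks the midpoint here
```

With a symmetric frontier and a symmetric threat point, the product maximizer splits the gains equally, which is exactly what axiom (iv) demands.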
It has been proved (Binmore et al. 1986) that this solution could also be obtained as the solution of an auxiliary dynamic game in which a sequence of claims and counterclaims is made by the two players when they bargain.

When extended directly to the context of differential or multistage games, the Nash bargaining solution concept proved to lack the important property of time consistency. This was first noticed in Haurie (1976). Let a dynamic game be defined by Eqs. (1)–(5), with j = 1, 2. Suppose the status quo decision, if no agreement is reached at initial state (t = 0, x(0) = x_o), consists in playing an open-loop Nash equilibrium, defined by the controls u_jᴺ(·) : [0, T] → U_j, j = 1, 2, and generating the trajectory xᴺ(·) : [0, T] → ℝⁿ, with xᴺ(0) = x_o. Now, applying the Nash bargaining solution scheme to the data of this differential game played at time t = 0 and state x(0) = x_o, one identifies a particular Pareto optimal solution, associated with the controls u_j*(·) : [0, T] → U_j, j = 1, 2, and generating the trajectory x*(·) : [0, T] → ℝⁿ, with x*(0) = x_o. Now assume the two players renegotiate the agreement to play u_j*(·) at an intermediate point of the Pareto optimal trajectory (τ, x*(τ)), τ ∈ (0, T). When computed from that point, the status quo strategies are in general not the same as they were at (0, x_o); furthermore, the shape of the Pareto frontier, when the game is played from (τ, x*(τ)), is different from what it is when the game is played at (0, x_o). For these two reasons, the bargaining solution at (τ, x*(τ)) will not coincide in general with the restriction to the interval [τ, T] of the bargaining solution from (0, x_o). This implies that the solution concept is not time consistent. Using feedback strategies, instead of open-loop ones, does not help, as the same phenomena (change of status quo and change of Pareto frontier) occur in a feedback strategy context.

This shows that the cooperative game solutions proposed in the classical theory of games cannot be applied without precaution in a dynamic setting when players have the possibility to renegotiate agreements at any intermediary point (t, x*(t)) of the bargained solution trajectory.

Cores and C-Optimality in Dynamic Games

Characteristic functions and the associated solution concept of core are important elements in the classical theory of cooperative games. In two papers (Haurie 1975; Haurie and Delfour 1974) the basic definitions and properties of the concept of core in dynamic cooperative games were presented. Consider the multistage system, controlled by a set M of m players and defined by

    x(k+1) = f^k(x(k), u_M(k)),  k = 0, 1, …, K−1
    x(i) = x^i,  i ∈ {0, 1, …, K−1}
    u_M(k) ≜ (u_j(k))_{j∈M} ∈ U_M(k) ≜ ∏_{j∈M} U_j(k).

From the initial point (i, x^i) a control sequence (u_M(i), …, u_M(K−1)) generates for each player j a payoff defined as follows:

    J_j(i, x^i; u_M(i), …, u_M(K−1)) ≜ Σ_{k=i}^{K−1} Φ_j(x(k), u_M(k)) + Υ_j(x(K)).

A subset S of M is called a coalition. Let γ_S^k : x(k) ↦ u_S(k) ∈ ∏_{j∈S} U_j(k) be a feedback control for the coalition, defined at each stage k. A player j ∈ S considers then, from any initial point (i, x^i), his guaranteed payoff:

    Ψ_j(i, x^i; γ_S^i, …, γ_S^{K−1}) ≜ inf { Σ_{k=i}^{K−1} Φ_j(x(k), γ_S^k(x(k)), u_{M∖S}(k)) + Υ_j(x(K)) },

where the infimum is taken over u_{M∖S}(i) ∈ U_{M∖S}(i), …, u_{M∖S}(K−1) ∈ U_{M∖S}(K−1).

Definition 2 The characteristic function at stage i for coalition S ⊂ M is the mapping v^i : (S, x^i) ↦ v^i(S, x^i) ⊂ ℝ^S defined by

    ω_S ≜ (ω_j)_{j∈S} ∈ v^i(S, x^i)  ⟺
    ∃ γ_S^i, …, γ_S^{K−1} : ∀j ∈ S,  Ψ_j(i, x^i; γ_S^i, …, γ_S^{K−1}) ≥ ω_j.

In other words, there is a feedback law for the coalition S which guarantees at least ω_j to each player j in the coalition.

Suppose that in a cooperative agreement, at point (i, x^i), the coalition S is proposed a gain vector ω_S which is interior to v^i(S, x^i). Then coalition S will block this agreement because, using an appropriate feedback, the coalition can guarantee a better payoff to each of its members. We can now extend the definition of the core of a cooperative game to the context of dynamic games, as the set of agreement gains that cannot be blocked by any coalition.

Definition 3 The core C(i, x^i) at point (i, x^i) is the set of gain vectors ω_M ≜ (ω_j)_{j∈M} such that:
1. There exists a Pareto optimal control u_M*(i), …, u_M*(K−1) for which ω_j = J_j(i, x^i; u_M*(i), …, u_M*(K−1)),
2. ∀S ⊂ M, the projection of ω_M in ℝ^S is not interior to v^i(S, x^i).

Playing a cooperative game, one would be interested in finding a solution where the gain-to-go remains in the core at each point of the trajectory. This leads us to define the following.

Definition 4 A control ũ° ≜ (u_M°(0), …, u_M°(K−1)) is C-optimal at (0, x⁰) if ũ° is Pareto optimal, generating a state trajectory

    {x°(0) = x⁰, x°(1), …, x°(K)}

and a sequence of gain-to-go values

    ω_j°(i) = J_j(i, x°(i); u_M°(i), …, u_M°(K−1)),  i = 0, …, K−1,

such that ∀i = 0, 1, …, K−1, the m-vector ω_M°(i) is an element of the core C(i, x°(i)).

A C-optimal control generates an agreement which cannot be blocked by any coalition along the Pareto optimal trajectory. It can be shown on examples that a Pareto optimal trajectory which has the gain-to-go vector in the core at the initial point (0, x₀) is not necessarily C-optimal.

Links with Reachability Theory for Perturbed Systems

The computation of characteristic functions can be made using the techniques developed to study reachability of dynamic systems with set constrained disturbances (see Bertsekas and Rhodes 1971). Consider the particular case of a linear system

    x(k+1) = Aᵏ x(k) + Σ_{j∈M} B_jᵏ u_j(k),    (10)

where x ∈ ℝⁿ, u_j ∈ U_jᵏ ⊂ ℝ^{p_j}, where U_jᵏ is a convex bounded set, and Aᵏ, B_jᵏ are matrices of appropriate dimensions. Let the payoff to player j be defined by:

    J_j(i, x^i; u_M(i), …, u_M(K−1)) ≜ Σ_{k=i}^{K−1} [φ_jᵏ(x(k)) + σ_jᵏ(u_j(k))] + Υ_j(x(K)).

Algorithm Here we use the notations φ_Sᵏ ≜ Σ_{j∈S} φ_jᵏ and B_Sᵏ u_S ≜ Σ_{j∈S} B_jᵏ u_j. Also we denote by {u + V}, where u is a vector in ℝᵐ and V ⊂ ℝᵐ, the set of vectors u + v, ∀v ∈ V. Then
1. ∀x^K:  v^K(S, x^K) ≜ {ω_S ∈ ℝ^S : Υ_S(x^K) ≥ ω_S}
2. ∀x:  E^{k+1}(S, x) ≜ ∩_{v ∈ U_{M∖S}} v^{k+1}(S, x + B_{M∖S}ᵏ v)
3. ∀xᵏ:  Hᵏ(S, xᵏ) ≜ ∪_{u ∈ U_Sᵏ} {σ_Sᵏ(u) + E^{k+1}(S, Aᵏ xᵏ + B_Sᵏ u)}
4. ∀xᵏ:  vᵏ(S, xᵏ) = {φ_Sᵏ(xᵏ) + Hᵏ(S, xᵏ)}.

In an open-loop control setting, the calculation of the characteristic function can be done using the concept of Pareto optimal solution for a system with set constrained disturbances, as shown in Goffin and Haurie (1973, 1976) and Haurie (1973).
Conclusion

Since the foundations of a theory of cooperative solutions to dynamic games, recalled in this article, the research has evolved toward the search for cooperative solutions that could also be equilibrium solutions, using for that purpose a class of memory strategies (Haurie and Towinski 1985), and has found a very important domain of application in the assessment of environmental agreements, in particular those related to the climate change issue. For example, the sustainability of solutions in the core of a dynamic game modeling international environmental negotiations is studied in Germain et al. (2003). A more encompassing model of dynamic formation of coalitions and stabilization of solutions through the use of threats is proposed in Breton et al. (2010). These references are indicative of the trend of research in this field.

Cross-References

▶ Dynamic Noncooperative Games
▶ Game Theory: Historical Overview
▶ Strategic Form Games and Nash Equilibrium

Bibliography

Basile G, Vincent TL (1970) Absolutely cooperative solution for a linear, multiplayer differential game. J Optim Theory Appl 6:41–46
Bellassali S, Jourani A (2004) Necessary optimality conditions in multiobjective dynamic optimization. SIAM J Control Optim 42:2043–2061
Bertsekas DP, Rhodes IB (1971) On the minimax reachability of target sets and target tubes. Automatica 7:233–247
Binmore K, Rubinstein A, Wolinsky A (1986) The Nash bargaining solution in economic modelling. Rand J Econ 17(2):176–188
Blaquière A, Juricek L, Wiese KE (1972) Geometry of Pareto equilibria and maximum principle in n-person differential games. J Optim Theory Appl 38:223–243
Breton M, Sbragia L, Zaccour G (2010) A dynamic model for international environmental agreements. Environ Resour Econ 45:25–48
Case JH (1969) Toward a theory of many player differential games. SIAM J Control 7(2):179–197
Da Cunha NO, Polak E (1967a) Constrained minimization under vector-valued criteria in linear topological spaces. In: Balakrishnan AV, Neustadt LW (eds) Mathematical theory of control. Academic, New York, pp 96–108
Da Cunha NO, Polak E (1967b) Constrained minimization under vector-valued criteria in finite dimensional spaces. J Math Anal Appl 19:103–124
Germain M, Toint P, Tulkens H, Zeeuw A (2003) Transfers to sustain dynamic core-theoretic cooperation in international stock pollutant control. J Econ Dyn Control 28:79–99
Goffin JL, Haurie A (1973) Necessary conditions and sufficient conditions for Pareto optimality in a multicriterion perturbed system. In: Conti R, Ruberti A (eds) 5th conference on optimization techniques, Rome. Lecture notes in computer science, vol 4. Springer
Goffin JL, Haurie A (1976) Pareto optimality with nondifferentiable cost functions. In: Thiriez H, Zionts S (eds) Multiple criteria decision making. Lecture notes in economics and mathematical systems, vol 130. Springer, Berlin/New York, pp 232–246
Haurie A (1973) On Pareto optimal decisions for a coalition of a subset of players. IEEE Trans Autom Control 18:144–149
Haurie A (1975) On some properties of the characteristic function and the core of a multistage game of coalition. IEEE Trans Autom Control 20(2):238–241
Haurie A (1976) A note on nonzero-sum differential games with bargaining solutions. J Optim Theory Appl 13:31–39
Haurie A, Delfour MC (1974) Individual and collective rationality in a dynamic Pareto equilibrium. J Optim Theory Appl 13(3):290–302
Haurie A, Towinski B (1985) Definition and properties of cooperative equilibria in a two-player game of infinite duration. J Optim Theory Appl 46(4):525–534
Isaacs R (1954) Differential games I: introduction. Rand Research Memorandum RM-1391-30. Rand Corporation, Santa Monica
Leitmann G, Rocklin S, Vincent TL (1972) A note on control space properties of cooperative games. J Optim Theory Appl 9:379–390
Nash J (1950) The bargaining problem. Econometrica 18(2):155–162
Pareto V (1896) Cours d'Economie Politique. Rouge, Lausanne
Salukvadze ME (1971) On the optimization of control systems with vector criteria. In: Proceedings of the 11th all-union conference on control, part 2. Nauka
Shapley LS (1953) Stochastic games. PNAS 39(10):1095–1100
Starr AW, Ho YC (1969) Nonzero-sum differential games. J Optim Theory Appl 3(3):184–206
Vincent TL, Leitmann G (1970) Control space properties of cooperative games. J Optim Theory Appl 6(2):91–113
von Neumann J, Morgenstern O (1944) Theory of games and economic behavior. Princeton University Press, Princeton
Zadeh LA (1963) Optimality and non-scalar-valued performance criteria. IEEE Trans Autom Control AC-8:59–60
Coordination of Distributed Energy Resources for Provision of Ancillary Services: Architectures and Algorithms

Alejandro D. Domínguez-García¹ and Christoforos N. Hadjicostis²
¹University of Illinois at Urbana-Champaign, Urbana-Champaign, IL, USA
²University of Cyprus, Nicosia, Cyprus

Abstract

We discuss the utilization of distributed energy resources (DERs) to provide active and reactive power support for ancillary services. Though the amount of active and/or reactive power provided individually by each of these resources can be very small, their presence in large numbers in power distribution networks implies that, under proper coordination mechanisms, they can collectively provide substantial active and reactive power regulation capacity. In this entry, we provide a simple formulation of the DER coordination problem for enabling their utilization to provide ancillary services. We also provide specific architectures and algorithmic solutions to solve the DER coordination problem, with a focus on decentralized solutions.

Keywords

Ancillary services; Consensus; Distributed algorithms; Distributed energy resources (DERs)

Introduction

On the distribution side of a power system, there are many distributed energy resources (DERs), e.g., photovoltaic (PV) installations, plug-in hybrid electric vehicles (PHEVs), and thermostatically controlled loads (TCLs), that can potentially be used to provide ancillary services, e.g., reactive power support for voltage control (see, e.g., Turitsyn et al. (2011) and the references therein) and active power up and down regulation for frequency control (see, e.g., Callaway and Hiskens (2011) and the references therein). To enable DERs to provide ancillary services, it is necessary to develop appropriate control and coordination mechanisms. One potential solution relies on a centralized control architecture in which each DER is directly coordinated by (and communicates with) a central decision maker. An alternative approach is to distribute the decision making, which obviates the need for a central decision maker to coordinate the DERs. In both cases, the decision making involves solving a resource allocation problem for coordinating the DERs to collectively provide a certain amount of a resource (e.g., active or reactive power).

In a practical setting, whether a centralized or a distributed architecture is adopted, the control of DERs for ancillary services provision will involve some aggregating entity that gathers together and coordinates a set of DERs, which will provide a certain amount of active or reactive power in exchange for monetary benefits. In general, these aggregating entities are the ones that interact with the ancillary services market, and through some market-clearing mechanism, they enter a contract to provide some amount of resource, e.g., active and/or reactive power, over a period of time. The goal of the aggregating entity is to provide this amount of resource by properly coordinating and controlling the DERs, while ensuring that the total monetary compensation to the DERs for providing the resource is below the monetary benefit that the aggregating entity obtains by selling the resource in the ancillary services market.

In the context above, a household with a solar PV rooftop installation and a PHEV might choose to offer the PV installation to a renewable aggregator so it is utilized to provide reactive power support (this can be achieved as long as the PV installation's power-electronics-based grid interface has the correct topology; Domínguez-García et al. 2011). Additionally, the household could offer its PHEV to a battery vehicle aggregator to be used as a controllable load for energy peak shaving during peak hours and load leveling at night (Guille and Gross 2009).
242 Coordination of Distributed Energy Resources for Provision of Ancillary Services. . .
Finally, the household might choose to enroll in a demand response program in which it allows a demand response provider to control its TCLs to provide frequency regulation services (Callaway and Hiskens 2011). In general, the renewable aggregator, the battery vehicle aggregator, and the demand response provider can be either separate entities or they can be the same entity. In this entry, we will refer to these aggregating entities as aggregators.

The Problem of DER Coordination

Without loss of generality, denote by x_i the amount of resource provided by DER i without specifying whether it is active or reactive power. [However, it is understood that each DER provides (or consumes) the same type of resource, i.e., all the x_i's are either active or reactive power.] Let 0 < \underline{x}_i < \bar{x}_i, for i = 1, 2, ..., n, denote the minimum (\underline{x}_i) and maximum (\bar{x}_i) capacity limits on the amount of resource x_i that node i can provide. Denote by X the total amount of resource that the DERs must collectively provide to satisfy the aggregator request. Let π_i(x_i) denote the price that the aggregator pays DER i per unit of resource x_i that it provides. Then, the objective of the aggregator in the DER coordination problem is to minimize the total monetary amount to be paid to the DERs for providing the total amount of resource X while satisfying the individual capacity constraints of the DERs. Thus, the DER coordination problem can be formulated as follows:

    minimize    Σ_{i=1}^n x_i π_i(x_i)
    subject to  Σ_{i=1}^n x_i = X                              (1)
                0 < \underline{x}_i ≤ x_i ≤ \bar{x}_i,  ∀i.

By allowing heterogeneity in the price per unit of resource that the aggregator offers to each DER, we can take into account the fact that the aggregator might value classes of DERs differently. For example, the downregulation capacity provided by a residential PV installation (which is achieved by curtailing its power) might be valued differently from the downregulation capacity provided by a TCL or a PHEV (both would need to absorb additional power in order to provide downregulation).

It is not difficult to see that if the price functions π_i(·), i = 1, 2, ..., n, are convex and nondecreasing, then the cost function Σ_{i=1}^n x_i π_i(x_i) is convex; thus, if the problem in (1) is feasible, then there exists a globally optimal solution. Additionally, if the price per unit of resource is linear in the amount of resource, i.e., π_i(x_i) = c_i x_i, i = 1, 2, ..., n, then x_i π_i(x_i) = c_i x_i^2, i = 1, 2, ..., n, and the problem in (1) reduces to a quadratic program. Also, if the price per unit of resource is constant, i.e., π_i(x_i) = c_i, i = 1, 2, ..., n, then x_i π_i(x_i) = c_i x_i, i = 1, 2, ..., n, and the problem in (1) reduces to a linear program. Finally, if π_i(x_i) = π(x_i) = c, i = 1, 2, ..., n, for some constant c > 0, i.e., the price offered by the aggregator is constant and the same for all DERs, then the optimization problem in (1) becomes a feasibility problem of the form

    find        x_1, x_2, ..., x_n
    subject to  Σ_{i=1}^n x_i = X                              (2)
                0 < \underline{x}_i ≤ x_i ≤ \bar{x}_i,  ∀i.

If the problem in (2) is indeed feasible (i.e., Σ_{l=1}^n \underline{x}_l ≤ X ≤ Σ_{l=1}^n \bar{x}_l), then there is an infinite number of solutions. One such solution, which we refer to as fair splitting, is given by

    x_i = \underline{x}_i + [(X − Σ_{l=1}^n \underline{x}_l) / (Σ_{l=1}^n (\bar{x}_l − \underline{x}_l))] (\bar{x}_i − \underline{x}_i),  ∀i.   (3)
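The fair-splitting rule (3) is straightforward to check numerically. The following sketch (all capacity values and the request X are made-up numbers, used purely for illustration) computes the allocation and confirms that it satisfies the constraints in (2):

```python
# Fair splitting (3): each DER supplies its minimum capacity plus a common
# fraction of its remaining headroom. Capacities and X are illustrative.

def fair_splitting(x_min, x_max, X):
    """Allocate the total request X among the DERs according to (3)."""
    if not (sum(x_min) <= X <= sum(x_max)):
        raise ValueError("the feasibility problem (2) has no solution")
    headroom = sum(M - m for m, M in zip(x_min, x_max))
    frac = (X - sum(x_min)) / headroom   # common fraction of headroom used
    return [m + frac * (M - m) for m, M in zip(x_min, x_max)]

x_min = [0.5, 1.0, 0.2]   # minimum capacities (illustrative)
x_max = [2.0, 3.0, 1.5]   # maximum capacities (illustrative)
X = 4.0                   # total resource requested by the aggregator

x = fair_splitting(x_min, x_max, X)
print(x, sum(x))          # the allocations respect the limits and sum to X
```

Note that any feasible point of (2) would do; fair splitting is singled out only because every DER uses the same fraction of its headroom.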
The formulation of the DER coordination problem provided in (2) is not the only possible one. In this regard, and in the context of PHEVs, several recent works have proposed game-theoretic formulations of the problem (Gharesifard et al. 2013; Ma et al. 2013; Tushar et al. 2012). For example, in Gharesifard et al. (2013), the authors assume that each PHEV is a decision maker and can freely choose to participate after receiving a request from the aggregator. The decision that each PHEV is faced with depends on its own utility function, along with some pricing strategy designed by the aggregator. The PHEVs are assumed to be price anticipating in the sense that they are aware of the fact that the pricing is designed by the aggregator with respect to the average energy available. Another alternative is to formulate the DER coordination problem as a scheduling problem (Chen et al. 2012; Subramanian et al. 2012), where the DERs are treated as tasks. Then, the problem is to develop real-time scheduling policies to service these tasks.

[Coordination of Distributed Energy Resources for Provision of Ancillary Services: Architectures and Algorithms, Fig. 1 Control architecture alternatives. (a) Centralized architecture. (b) Decentralized architecture. Each panel shows a bus s with complex power injections P_s + jQ_s, P_s^c + jQ_s^c, P_s^r + jQ_s^r, and P_s^u + jQ_s^u.]

Architectures

Next, we describe two possible architectures that can be utilized to implement the proper algorithms for solving the DER coordination problem as formulated in (1). Specifically, we describe a centralized architecture that requires the aggregator to communicate bidirectionally with each DER and a distributed architecture that requires the aggregator to only unidirectionally communicate with a limited number of DERs but requires some additional exchange of information (not necessarily through bidirectional communication links) among the DERs.

Centralized Architecture
A solution can be achieved through the completely centralized architecture of Fig. 1a, where the aggregator can exchange information with each available DER. In this scenario, each DER can inform the aggregator about its active and/or reactive capacity limits and other operational constraints, e.g., its maintenance schedule. After gathering all this information, the aggregator solves the optimization program in (1), the solution of which will determine how to allocate among the resources the total amount of active power P_s^r and/or reactive
power Q_s^r that it needs to provide. Then, the aggregator sends individual commands to each DER so they modify their active and/or reactive power generation according to the solution of (1) computed by the aggregator. In this centralized solution, however, it is necessary to overlay a communication network connecting the aggregator with each resource and to maintain knowledge of the resources that are available at any given time.

Decentralized Architecture
An alternative is to use the decentralized control architecture of Fig. 1b, where the aggregator relays information to a limited number of DERs that it can directly communicate with, and each DER is able to exchange information with a number of other close-by DERs. For example, the aggregator might broadcast the prices to be paid to each type of DER. Then, through some distributed protocol that adheres to the communication network interconnecting the DERs, the information relayed by the aggregator to this limited number of DERs is disseminated to all other available DERs. This dissemination process may rely on flooding algorithms, message-passing protocols, or linear-iterative algorithms as proposed in Domínguez-García and Hadjicostis (2010, 2011). After the dissemination process is complete, and through a distributed computation over the communication network, the DERs can solve the optimization program in (1) and determine their individual active and/or reactive power contributions.

A decentralized architecture like the one in Fig. 1b may offer several advantages over the centralized one in Fig. 1a, including the following. First, a decentralized architecture may be more economical because it does not require communication between the aggregator and the various DERs. Also, a decentralized architecture does not require the aggregator to have complete knowledge of the DERs available. Additionally, a decentralized architecture can be more resilient to faults and/or unpredictable behavioral patterns by the DERs. Finally, the practical implementation of such a decentralized architecture can rely on inexpensive and simple hardware. For example, the testbed described in Domínguez-García et al. (2012a), which is used to solve a particular instance of the problem in (1), uses Arduino microcontrollers (see Arduino for a description) outfitted with wireless transceivers implementing a ZigBee protocol (see ZigBee Alliance for a description).

Algorithms

Ultimately, whether a centralized or a decentralized architecture is adopted, it is necessary to solve the optimization problem in (1). If a centralized architecture is adopted, then solving (1) is relatively straightforward using, e.g., standard gradient-descent algorithms (see, e.g., Bertsekas and Tsitsiklis 1997). Beyond the DER coordination problem and the specific formulation in (1), solving an optimization problem is challenging if a decentralized architecture is adopted (especially if the communication links between DERs are not bidirectional); this has spurred significant research in the last few years (see, e.g., Bertsekas and Tsitsiklis 1997, Xiao et al. 2006, Nedic et al. 2010, Zanella et al. 2011, Gharesifard and Cortes 2012, and the references therein).

In the specific context of the DER coordination problem as formulated in (1), when the cost functions are assumed to be quadratic and the communication between DERs is not bidirectional, an algorithm amenable to implementation in a decentralized architecture like the one in Fig. 1b has been proposed in Domínguez-García et al. (2012a). Also, in the context of Fig. 1b, when the communication between DERs is bidirectional, the DER coordination problem, as formulated in (1), can be solved using an algorithm proposed in Kar and Hug (2012).

As mentioned earlier, when the price offered by the aggregator is constant and identical for all DERs, the problem in (1) reduces to the feasibility problem in (2). One possible solution to this feasibility problem is the fair-splitting solution in (3). Next, we describe a linear-iterative algorithm, originally proposed in Domínguez-García and Hadjicostis (2010, 2011) and referred to as ratio consensus, that allows the DERs to
individually determine their contributions so that the fair-splitting solution is achieved.

Ratio Consensus: A Distributed Algorithm for Fair Splitting
We assume that each DER is equipped with a processor that can perform simple computations and can exchange information with neighboring DERs. In particular, the information exchange between DERs can be described by a directed graph G = {V, E}, where V = {1, 2, ..., n} is the vertex set (each vertex, or node, corresponds to a DER) and E ⊆ V × V is the set of edges, where (i, j) ∈ E if node i can receive information from node j. We require G to be strongly connected, i.e., for any pair of vertices l and l′, there exists a path that starts in l and ends in l′. Let L⁺ ⊆ V, L⁺ ≠ ∅, denote the set of nodes that the aggregator is able to directly communicate with.

The processor of each DER i maintains two values y_i and z_i, which we refer to as internal states, and updates them (independently of each other) to be, respectively, a linear combination of DER i's own previous internal states and the previous internal states of all nodes that can possibly transmit information to node i (including itself). In particular, for all k ≥ 0, each node i updates its two internal states as follows:

    y_i[k+1] = Σ_{j ∈ N_i} (1/D_j^+) y_j[k],   (4)

    z_i[k+1] = Σ_{j ∈ N_i} (1/D_j^+) z_j[k],   (5)

where N_i = {j ∈ V : (i, j) ∈ E}, i.e., all nodes that can possibly transmit information to node i (including itself), and D_i^+ is the out-degree of node i, i.e., the number of nodes to which node i can possibly transmit information (including itself). The initial conditions in (4) are set to y_i[0] = X/m − \underline{x}_i if i ∈ L⁺, where m = |L⁺|, and y_i[0] = −\underline{x}_i otherwise, and the initial conditions in (5) are set to z_i[0] = \bar{x}_i − \underline{x}_i. Then, as shown in Domínguez-García and Hadjicostis (2011), as long as Σ_{l=1}^n \underline{x}_l ≤ X ≤ Σ_{l=1}^n \bar{x}_l, each DER i can asymptotically calculate its contribution as

    x_i = \underline{x}_i + γ (\bar{x}_i − \underline{x}_i),   (6)

where for all i

    lim_{k→∞} y_i[k]/z_i[k] = (X − Σ_{l=1}^n \underline{x}_l) / (Σ_{l=1}^n (\bar{x}_l − \underline{x}_l)) =: γ.   (7)

It is important to note that the algorithm in (4)–(7) also serves as a primitive for the algorithm proposed in Domínguez-García et al. (2012a), which solves the problem in (1) when the cost function is quadratic. Also, the algorithm in (4)–(7) is not resilient to packet-dropping communication links or imperfect synchronization among the DERs, which makes it difficult to implement in practice; however, there are robustified variants of this algorithm that address these issues (Domínguez-García et al. 2012b) and have been demonstrated to work in practice (Domínguez-García et al. 2012a).
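The iteration (4)–(5) is simple enough to simulate directly. The sketch below runs ratio consensus on a hypothetical four-node directed ring with self-loops (all capacities, the request X, and the graph are illustrative assumptions) and recovers the fair-splitting allocation (3) via (6)–(7):

```python
# Ratio consensus (4)-(7) on a directed ring with self-loops (illustrative).
n = 4
x_min = [0.5, 1.0, 0.2, 0.8]      # minimum capacities (illustrative)
x_max = [2.0, 3.0, 1.5, 2.0]      # maximum capacities (illustrative)
X = 5.0                           # total request; feasible since 2.5 <= 5 <= 8.5
L_plus = {0}                      # the aggregator talks directly to node 0 only
m = len(L_plus)

out_deg = [2] * n                               # D_j^+: self-loop plus successor
in_nbrs = [[i, (i - 1) % n] for i in range(n)]  # N_i: self and predecessor

# initial conditions for (4) and (5)
y = [X / m - x_min[i] if i in L_plus else -x_min[i] for i in range(n)]
z = [x_max[i] - x_min[i] for i in range(n)]

for _ in range(200):              # iterate (4)-(5); the sums of y and z are conserved
    y = [sum(y[j] / out_deg[j] for j in in_nbrs[i]) for i in range(n)]
    z = [sum(z[j] / out_deg[j] for j in in_nbrs[i]) for i in range(n)]

ratio = [y[i] / z[i] for i in range(n)]   # each node's estimate of the limit in (7)
x = [x_min[i] + ratio[i] * (x_max[i] - x_min[i]) for i in range(n)]  # as in (6)
print(x, sum(x))                  # matches fair splitting (3); the sum equals X
```

The per-step weights 1/D_j^+ make the update matrix column stochastic, which is why the sums of the y- and z-states are conserved and every ratio converges to the same constant γ in (7).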
Cross-References

▶ Averaging Algorithms and Consensus
▶ Distributed Optimization
▶ Electric Energy Transfer and Control via Power Electronics
▶ Flocking in Networked Systems
▶ Graphs for Modeling Networked Interactions
▶ Network Games
▶ Networked Systems

Bibliography

Arduino [Online]. Available: http://www.arduino.cc
Bertsekas DP, Tsitsiklis JN (1997) Parallel and distributed computation. Athena Scientific, Belmont
Callaway DS, Hiskens IA (2011) Achieving controllability of electric loads. Proc IEEE 99(1):184–199
Chen S, Ji Y, Tong L (2012) Large scale charging of electric vehicles. In: Proceedings of the IEEE power and energy society general meeting, San Diego
Domínguez-García AD, Hadjicostis CN (2010) Coordination and control of distributed energy resources for provision of ancillary services. In: Proceedings of the IEEE SmartGridComm, Gaithersburg
Domínguez-García AD, Hadjicostis CN (2011) Distributed algorithms for control of demand response and distributed energy resources. In: Proceedings of the IEEE conference on decision and control, Orlando
Domínguez-García AD, Hadjicostis CN, Krein PT, Cady ST (2011) Small inverter-interfaced distributed energy resources for reactive power support. In: Proceedings of the IEEE applied power electronics conference and exposition, Fort Worth
Domínguez-García AD, Cady ST, Hadjicostis CN (2012a) Decentralized optimal dispatch of distributed energy resources. In: Proceedings of the IEEE conference on decision and control, Maui
Domínguez-García AD, Hadjicostis CN, Vaidya N (2012b) Resilient networked control of distributed energy resources. IEEE J Sel Areas Commun 30(6):1137–1148
Gharesifard B, Cortes J (2012) Continuous-time distributed convex optimization on weight-balanced digraphs. In: Proceedings of the IEEE conference on decision and control, Maui
Gharesifard B, Domínguez-García AD, Başar T (2013) Price-based distributed control for networked plug-in electric vehicles. In: Proceedings of the American control conference, Washington, DC
Guille C, Gross G (2009) A conceptual framework for the vehicle-to-grid (V2G) implementation. Energy Policy 37(11):4379–4390
Kar S, Hug G (2012) Distributed robust economic dispatch in power systems: a consensus + innovations approach. In: Proceedings of the IEEE power and energy society general meeting, San Diego
Ma Z, Callaway DS, Hiskens IA (2013) Decentralized charging control of large populations of plug-in electric vehicles. IEEE Trans Control Syst Technol 21:67–78
Nedic A, Ozdaglar A, Parrilo PA (2010) Constrained consensus and optimization in multi-agent networks. IEEE Trans Autom Control 55(4):922–938
Subramanian A, Garcia M, Domínguez-García AD, Callaway DC, Poolla K, Varaiya P (2012) Real-time scheduling of deferrable electric loads. In: Proceedings of the American control conference, Montreal
Turitsyn K, Sulc P, Backhaus S, Chertkov M (2011) Options for control of reactive power by distributed photovoltaic generators. Proc IEEE 99(6):1063–1073
Tushar W, Saad W, Poor HV, Smith DB (2012) Economics of electric vehicle charging: a game theoretic approach. IEEE Trans Smart Grid 3(4):1767–1778
Xiao L, Boyd S, Tseng CP (2006) Optimal scaling of a gradient method for distributed resource allocation. J Optim Theory Appl 129(3):469–488
Zanella F, Varagnolo D, Cenedese A, Pillonetto G, Schenato L (2011) Newton-Raphson consensus for distributed convex optimization. In: Proceedings of the IEEE conference on decision and control, Orlando
ZigBee Alliance [Online]. Available: http://www.zigbee.org

Credit Risk Modeling

Tomasz R. Bielecki
Department of Applied Mathematics, Illinois Institute of Technology, Chicago, IL, USA

Abstract

Modeling of credit risk is concerned with constructing and studying formal models of the time evolution of credit ratings (credit migrations) in a pool of credit names, and with studying various properties of such models. In particular, this involves modeling and studying default times and their functionals.

Keywords

Credit risk; Credit migrations; Default time; Markov copulae

Introduction

Modeling of credit risk is concerned with constructing and studying formal models of the time evolution of credit ratings (credit migrations) in a pool of N credit names (obligors), and with studying various properties of such models. In particular, this involves modeling and studying default times and their functionals. In many ways, modeling techniques used in credit risk are similar to modeling techniques used in reliability theory. Here, we focus on modeling in continuous time.

Models of credit risk are used for the purpose of valuation and hedging of credit derivatives, for valuation and hedging of counter-party risk, for assessment of systemic risk in an economy, or for constructing optimal trading strategies involving credit-sensitive financial instruments, among other uses.

Evolution of credit ratings for a single obligor, labeled as i, where i ∈ {1, ..., N}, can be
modeled in many possible ways. One popular possibility is to model credit migrations in terms of a jump process, say C^i = (C_t^i)_{t≥0}, taking values in a finite set, say K^i := {0, 1, 2, ..., K^i − 1, K^i}, representing credit ratings assigned to obligor i. Typically, the rating state K^i represents the state of default of the i-th obligor, and typically it is assumed that process C^i is absorbed at state K^i.

Frequently, the case when K^i = 1, that is, K^i := {0, 1}, is considered. In this case, one is only concerned with the jump from the pre-default state 0 to the default state 1, which is usually assumed to be absorbing; this assumption is made here as well. It is assumed that process C^i starts from state 0. The (random) time of the jump of process C^i from state 0 to state 1 is called the default time and is denoted as τ^i. Process C^i is then the same as the indicator process of τ^i, which is denoted as H^i and defined as H_t^i = 1_{τ^i ≤ t}, for t ≥ 0. Consequently, modeling of the process C^i is equivalent to modeling of the default time τ^i.

The ultimate goal of credit risk modeling is to provide a feasible mathematical and computational methodology for modeling the evolution of the multivariate credit migration process C := (C^1, ..., C^N), so that relevant functionals of such processes can be computed efficiently. The simplest example of such a functional is P(C_{t_j} ∈ A_j, j = 1, 2, ..., J | G_s), representing the conditional probability, given the information G_s at time s ≥ 0, that process C takes values in the set A_j at time t_j ≥ 0, j = 1, 2, ..., J. In particular, in the case of modeling of the default times τ^i, i = 1, 2, ..., N, one is concerned with computing conditional survival probabilities P(τ^1 > t_1, ..., τ^N > t_N | G_s), which are the same as the probabilities P(H_{t_i}^i = 0, i = 1, 2, ..., N | G_s).

Based on that, one can compute more complicated functionals that naturally occur in the context of valuation and hedging of credit-risk-sensitive financial instruments, such as corporate (defaultable) bonds, credit default swaps, credit spread options, collateralized bond obligations, and asset-based securities, for example.

Modeling of Single Default Time Using Conditional Density

Traditionally, there were two main approaches to modeling default times: the structural approach and the reduced approach, also known as the hazard process approach. The main features of both these approaches are presented in Bielecki and Rutkowski (2004).

We focus here on modeling a single default time, denoted as τ, using the so-called conditional density approach of El Karoui et al. (2010). This approach allows for extension of results that can be derived using the reduced approach.

The default time τ is a strictly positive random variable defined on the underlying probability space (Ω, F, P), which is endowed with a reference filtration, say F = (F_t)_{t≥0}, representing the flow of all (relevant) market information available in the model, not including information about the occurrence of τ. The information about the occurrence of τ is carried by the (right-continuous) filtration H generated by the indicator process H := (H_t = 1_{τ≤t})_{t≥0}. The full information in the model is represented by the filtration G := F ∨ H. It is postulated that

    P(τ ∈ dθ | F_t) = α_t(θ) dθ,

for some random field α_t(θ), such that α_t(θ) is F_t ⊗ B(ℝ₊)-measurable for each t. The family α_t(θ) is called the F_t-conditional density of τ. In particular, P(τ > θ) = ∫_θ^∞ α_0(u) du. The following survival processes are associated with τ:
• S_t(θ) := P(τ > θ | F_t) = ∫_θ^∞ α_t(u) du, which is an F-martingale,
• S_t := S_t(t) = P(τ > t | F_t), which is an F-supermartingale (the Azéma supermartingale).
In particular, S_0(θ) = P(τ > θ) = ∫_θ^∞ α_0(u) du, and S_t(0) = S_0 = 1.

As an example of computations that can be done using the conditional density approach, we give the following result, in which the notations "bd" and "ad" stand for before default and at-or-after default, respectively.
Theorem 1 Let Y_T(τ) be an F_T ∨ σ(τ)-measurable and bounded random variable. Then

    E(Y_T(τ) | G_t) = Y_t^{bd} 1_{t<τ} + Y_t^{ad}(T, τ) 1_{t≥τ},

where

    Y_t^{bd} = [∫_t^∞ Y_T(u) α_t(u) du / S_t] 1_{S_t>0},

and

    Y_t^{ad}(T, θ) = [E(Y_T(θ) α_T(θ) | F_t) / α_t(θ)] 1_{α_t(θ)>0}.

There is an interesting connection between the conditional density process and the so-called default intensity processes, which are among the main objects used in the reduced approach. This connection starts with the following result.

Theorem 2 (i) The Doob–Meyer (additive) decomposition of the survival process S is given as

    S_t = 1 + M_t^F − ∫_0^t α_u(u) du,

where M_t^F = −∫_0^t (α_t(u) − α_u(u)) du = E(∫_0^∞ α_u(u) du | F_t) − 1.
(ii) Let ζ := inf{t ≥ 0 : S_{t−} = 0}. Define λ_t^F = α_t(t)/S_t for t < ζ and λ_t^F = λ_ζ^F for t ≥ ζ. Then, the multiplicative decomposition of S is given as

    S_t = L_t^F e^{−∫_0^t λ_u^F du},

where

    dL_t^F = e^{∫_0^t λ_u^F du} dM_t^F,   L_0^F = 1.

The process λ^F is called the F-intensity of τ.

The G-compensator of τ is the G-predictable increasing process Λ^G such that the process

    M_t^G = H_t − Λ_t^G

is a G-martingale. If Λ^G is absolutely continuous, the G-adapted process λ^G such that

    Λ_t^G = ∫_0^t λ_u^G du

is called the G-intensity of τ. The G-compensator is stopped at τ, i.e., Λ_t^G = Λ_{t∧τ}^G. Hence, λ_t^G = 0 when t > τ. In particular, we have

    λ_t^G = 1_{t<τ} λ_t^F = (1 − H_t) λ_t^F.

The conditional density process and the G-intensity of τ are related as follows: for any t < τ and θ ≥ t, we have

    α_t(θ) = E(λ_θ^G | F_t).

Example 1 This is a structural-model-like example.
• Suppose F = F^X is the filtration of a default driver process, say X, and Θ is the default barrier, assumed to be independent of X. Denote G(t) = P(Θ > t).
• Define

    τ := inf{t ≥ 0 : Λ_t ≥ Θ},

with Λ_t := sup_{s≤t} X_s. We then have S_t(θ) = G(Λ_θ) if θ ≤ t and S_t(θ) = E(G(Λ_θ) | F_t^X) if θ > t.
• Assume that F = 1 − G and Λ are absolutely continuous w.r.t. the Lebesgue measure, with respective densities f and λ. We then have

    α_t(θ) = f(Λ_θ) λ_θ = α_θ(θ),   t ≥ θ,

and the F^X-intensity of τ is

    λ_t^F = α_t(t)/G(Λ_t) = α_t(t)/S_t.

• In particular, if Θ is a unit exponential r.v., that is, if G(t) = e^{−t} for t ≥ 0, then we have that λ_t^F = λ_t = α_t(t)/S_t.
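The structural construction of Example 1 is easy to check by Monte Carlo in its simplest instance: a deterministic driver with Λ_t = λt (so the reference filtration is trivial) and a unit exponential barrier Θ, giving τ = Θ/λ. The parameter values below are illustrative:

```python
import math
import random

random.seed(0)
lam = 0.7        # density of Lambda_t = lam * t; here also the default intensity
N = 200_000      # number of Monte Carlo samples

# tau = inf{t : Lambda_t >= Theta} = Theta / lam, with Theta ~ Exp(1)
taus = [random.expovariate(1.0) / lam for _ in range(N)]

t = 1.5
S_empirical = sum(1 for tau in taus if tau > t) / N   # estimate of P(tau > t)
S_exact = math.exp(-lam * t)                          # S_t = G(Lambda_t) = e^{-lam*t}
print(S_empirical, S_exact)      # the two agree up to Monte Carlo error
```

Consistently with the last bullet of Example 1, the constant λ plays the role of both the density of Λ and the intensity α_t(t)/S_t.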
Example 2 This is a reduced-form-like example.
• Suppose S is a strictly positive process. Then, the F-hazard process of τ is denoted by Γ^F and is given as

    Γ_t^F = −ln S_t,   t ≥ 0.

In other words,

    S_t = e^{−Γ_t^F},   t ≥ 0.

• In particular, if Γ^F is absolutely continuous, that is, Γ_t^F = ∫_0^t γ_u^F du, then

    S_t = e^{−∫_0^t γ_u^F du},   t ≥ 0,   and   α_t(θ) = γ_θ^F S_θ,   t ≥ θ.

Modeling Evolution of Credit Ratings Using Markov Copulae

The key goal in modeling of the joint migration process C is that the distributional laws of the individual migration components C^i, i ∈ {1, ..., N}, agree with given (predetermined) laws. The reason for this is that the marginal laws of C, that is, the laws of C^i, i ∈ {1, ..., N}, can be calibrated from market quotes for prices of individual (as opposed to basket) credit derivatives, such as credit default swaps, and thus, the marginals of C should have laws agreeing with the market data.

One way of achieving this goal is to model C as a Markov chain satisfying the so-called Markov copula property. For brevity, we present here the simplest such model, in which the reference filtration F is trivial, assuming additionally, but without loss of generality, that N = 2 and that K^1 = K^2 = K := {0, 1, ..., K}.

Here we focus on the case of the so-called strong Markov copula property, which is reflected in Theorem 3.

Let us consider two Markov chains Z^1 and Z^2 on (Ω, F, P), taking values in a finite state space K, and with the infinitesimal generators A^1 := [a_{ij}^1] and A^2 := [a_{hk}^2], respectively. Consider the system of linear algebraic equations in the unknowns a_{ih,jk}^C,

    Σ_{k∈K} a_{ih,jk}^C = a_{ij}^1,   ∀i, j, h ∈ K, i ≠ j,   (1)

    Σ_{j∈K} a_{ih,jk}^C = a_{hk}^2,   ∀i, h, k ∈ K, h ≠ k.   (2)

It can be shown that this system admits at least one positive solution.

Theorem 3 Consider an arbitrary positive solution of the system (1)–(2). Then the matrix A^C = [a_{ih,jk}^C]_{i,h,j,k∈K} (where the diagonal elements are defined appropriately) satisfies the conditions for a generator matrix of a bivariate time-homogeneous Markov chain, say C = (C^1, C^2), whose components are Markov chains in the filtration of C and with the same laws as Z^1 and Z^2.

Consequently, the system (1)–(2) serves as a Markov copula between the Markovian margins C^1, C^2 and the bivariate Markov chain C. Note that the system (1)–(2) can contain more unknowns than the number of equations and is therefore underdetermined, which is a crucial feature for the ability to calibrate the joint migration process C to marginal market data.

Example 3 This example illustrates modeling of joint defaults using the strong Markov copula theory. Let us consider two processes, Z^1 and Z^2, that are time-homogeneous Markov chains, each taking values in the state space {0, 1}, with respective generators

    A^1 = [ −(a+c)   a+c ]
          [    0      0  ]   (3)

and

    A^2 = [ −(b+c)   b+c ]
          [    0      0  ],   (4)

for a, b, c ≥ 0 (rows and columns are indexed by the states 0 and 1). The off-diagonal elements of the matrix A^C below satisfy the system (1)–(2),
    A^C = [ −(a+b+c)     b        a       c   ]
          [    0      −(a+c)     0      a+c  ]
          [    0         0     −(b+c)   b+c  ]
          [    0         0        0      0   ],   (5)

where the rows and columns are indexed by the states (0,0), (0,1), (1,0), (1,1).
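A quick numerical check (with illustrative rates a, b, c) confirms that the matrix in (5) is a valid generator and that summing out one component recovers the marginal generators (3) and (4), i.e., that the conditions (1)–(2) hold:

```python
# Verify that the generator (5) satisfies the Markov copula conditions (1)-(2)
# for the marginal generators (3) and (4). The rates a, b, c are illustrative.
a, b, c = 0.3, 0.5, 0.2

A1 = [[-(a + c), a + c], [0.0, 0.0]]   # generator (3) of Z^1
A2 = [[-(b + c), b + c], [0.0, 0.0]]   # generator (4) of Z^2

states = [(0, 0), (0, 1), (1, 0), (1, 1)]
idx = {s: r for r, s in enumerate(states)}
AC = [[-(a + b + c), b, a, c],
      [0.0, -(a + c), 0.0, a + c],
      [0.0, 0.0, -(b + c), b + c],
      [0.0, 0.0, 0.0, 0.0]]            # the matrix in (5)

ok = True
for (i, h) in states:
    for j in (0, 1):
        if j != i:   # condition (1): summing over k recovers A^1
            ok &= abs(sum(AC[idx[i, h]][idx[j, k]] for k in (0, 1)) - A1[i][j]) < 1e-12
    for k in (0, 1):
        if k != h:   # condition (2): summing over j recovers A^2
            ok &= abs(sum(AC[idx[i, h]][idx[j, k]] for j in (0, 1)) - A2[h][k]) < 1e-12
ok &= all(abs(sum(row)) < 1e-12 for row in AC)   # generator rows sum to zero
print(ok)   # True: (5) couples the margins (3) and (4)
```

The parameter c is the rate of a simultaneous jump of both components, which is precisely the joint-default channel that the copula construction adds on top of the margins.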
Thus, matrix A^C generates a Markovian joint migration process C = (C^1, C^2), whose components C^1 and C^2 model individual defaults with the prescribed default intensities a + c and b + c, respectively.

For more information about Markov copulae and about their applications in credit risk, we refer to Bielecki et al. (2013).

Summary and Future Directions

The future directions in the development and applications of credit risk models are comprehensively laid out in the recent volume Bielecki et al. (2011). One additional future direction is the modeling of systemic risk.

Cross-References

▶ Financial Markets Modeling
▶ Option Games: The Interface Between Optimal Stopping and Game Theory

Bibliography

We do not give a long list of recommended reading here. That would be in any case incomplete. Up-to-date references can be found on www.defaultrisk.com.
Bielecki TR, Rutkowski M (2004) Credit risk: modeling, valuation and hedging. Springer, Berlin
Bielecki TR, Brigo D, Patras F (eds) (2011) Credit risk frontiers: subprime crisis, pricing and hedging, CVA, MBS, ratings and liquidity. Wiley, Hoboken
Bielecki TR, Jakubowski J, Niewęgłowski M (2013) Intricacies of dependence between components of multivariate Markov chains: weak Markov consistency and Markov copulae. Electron J Probab 18(45):1–21
Bluhm Ch, Overbeck L, Wagner Ch (2010) An introduction to credit risk modeling. Chapman & Hall, Boca Raton
El Karoui N, Jeanblanc M, Jiao Y (2010) What happens after a default: the conditional density approach. SPA 120(7):1011–1032
Schönbucher PhJ (2003) Credit derivatives pricing models. Wiley Finance, Chichester
J. Baillieul, T. Samad (eds.), Encyclopedia of Systems and Control, DOI 10.1007/978-1-4471-5058-9, © Springer-Verlag London 2015

D

Data Association

Yaakov Bar-Shalom and Richard W. Osborne
University of Connecticut, Storrs, CT, USA

Abstract

In tracking applications, following the signal detection process that yields measurements, there is a procedure that selects the measurement(s) to be incorporated into the state estimator; this is called data association (DA). In multitarget-multisensor tracking systems, there are generally three classes of data association: specifically, measurement-to-track association (M2TA), track-to-track association (T2TA), and measurement-to-measurement association (M2MA). M2TA is the process of associating each measurement from a list (originating from one or more sensors) to a new or existing track. T2TA is the process of associating multiple existing tracks (from multiple sensors or from different periods in time), generally with the intent of fusing them afterward. M2MA is the process of associating measurements from different sensors in order to form "composite measurements" and/or do track initialization. The processes of M2TA and T2TA will be discussed in more detail here, while details on M2MA can be found in Bar-Shalom et al. (2011).

Keywords

Clutter; Measurement origin uncertainty; Measurement validation; Persistent interference; Tracking

Introduction

In a radar the "return" from the target of interest is sought within a time interval determined by the anticipated range of the target when it reflects the energy transmitted by the radar: a "range gate" is set up and the detection(s) within this gate can be associated with the target of interest.

In general the measurements have a higher dimension:
• Range, azimuth (bearing), elevation, or direction sines for radar, possibly also range rate
• Bearing and frequency (when the signal is narrow band) or time difference of arrival and frequency difference in passive sonar
• Two line-of-sight angles or direction sines for optical or passive electromagnetic sensors

Then a multidimensional gate is set up for detecting the signal from the target. This is done to avoid searching for the signal from the target of interest in the entire measurement space. A measurement in the gate, while not guaranteed to have originated from the target the gate pertains to, is a valid association candidate; thus the name validation region or association region. If there
is more than one detection (measurement) in the gate, this leads to an association uncertainty. In the discussion to follow, it will be assumed that one has point measurements rather than measurements distributed over several resolution cells of the sensor, as in the case of an extended target. Similar validation has to be carried out in T2TA.

Validation Region

In view of the variety of variables that can be measured, a generic gating (or validation or association) procedure for continuous-valued measurements is discussed.

Consider a target that is in track, i.e., its filter has been initialized. Then, according to Sect. 5.2.3 of Bar-Shalom et al. (2001), one has the predicted value (mean) of the measurement ẑ(k+1|k) and the associated covariance S(k+1).

Assumption. The true measurement conditioned on the past is normally (Gaussian) distributed (the notation N(x; x̄, S) stands for the normal (Gaussian) pdf with the argument (vector) random variable x, mean x̄, and covariance matrix S; the reason for the use of the designation "normal" is to distinguish this omnipresent pdf from all the others (abnormal)), with its probability density function (pdf) given by

    p[z(k+1) | Z^k] = N[z(k+1); ẑ(k+1|k), S(k+1)]   (1)

where S(k+1) is the innovation (residual) covariance matrix and z is the true measurement. Then the true measurement will be in the following region:

    V(k+1, γ) = {z : d² ≤ γ}   (2)

with probability determined by the gate threshold γ, where

    d² = [z − ẑ(k+1|k)]′ S(k+1)⁻¹ [z − ẑ(k+1|k)]   (3)

is the squared distance of the measurement from its predicted value. This distance metric, d², is referred to in the literature as the normalized innovation squared (NIS), statistical distance squared, Mahalanobis distance, or chi-square distance. The region defined by (2) is called the gate or validation region (hence the notation V) or association region. It is also known as the ellipse (or ellipsoid) of probability concentration: the region of minimum volume that contains a given probability mass under the Gaussian assumption. The semiaxes of the ellipsoid (2) are the square roots of the eigenvalues of γS. The threshold γ is obtained from tables of the chi-square distribution, since the quadratic form (3) that defines the validation region in (2) is chi-square distributed with the number of degrees of freedom equal to the dimension n_z of the measurement.

Table 1 gives the gate probability (the notation P{·} is used to denote the probability of the event {·})

    P_G = P{z(k+1) ∈ V(k+1, γ)}   (4)

or the "probability that the (true) measurement will fall in the gate," for various values γ and dimensions n_z of the measurement. The square root g = √γ is sometimes referred to as the "number of sigmas" (standard deviations) of the gate. This, however, does not fully define the probability mass in the gate, as can be seen from Table 1.

Remark 1 It should be pointed out that thresholding in a detector is also a form of gating: only a signal above a certain intensity level (at the end of the signal processing chain) is accepted as a detection, and then one has a measurement. In this case the "gate" is the interval [τ, ∞) in the signal intensity space, where τ is the detection threshold.

A Single Target in Clutter
The validation procedure limits the region in the measurement space where the information processor will "look" to find the measurement from the target of interest. In spite of this, it can happen that more than one detection, i.e., several

d 2 D Œz  zO.k C 1jk/0 S.k C 1/1 Œz  zO.k C 1jk/ measurements, will be found in the validation
(3) region.
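The geometry of the gate can be checked numerically. A minimal sketch (the innovation covariance $S$ below is hypothetical; $\gamma = 9.2$ is the 99% gate threshold for a two-dimensional measurement):

```python
import numpy as np

# Semiaxes of the validation ellipsoid (2): the square roots of the
# eigenvalues of gamma * S (S is a hypothetical innovation covariance).
S = np.array([[4.0, 1.0],
              [1.0, 2.0]])
gamma = 9.2  # 99% gate threshold for n_z = 2
semiaxes = np.sqrt(gamma * np.linalg.eigvalsh(S))  # ascending order
print(semiaxes)
```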
Data Association, Table 1  Gate thresholds and the probability mass $P_G$ in the gate

  gamma:     1      4      6.6    9      9.2    11.4   16       25
  g:         1      2      2.57   3      3.03   3.38   4        5
  n_z = 1:   0.683  0.954  0.99   0.997  –      –      0.99994  1
  n_z = 2:   0.393  0.865  –      0.989  0.99   –      0.9997   1
  n_z = 3:   0.199  0.739  –      0.971  –      0.99   0.9989   0.99998
Measurements outside the validation region can be ignored: they are "too far" and thus very unlikely to have originated from the target of interest. This holds if the gate probability is close to unity and the model used to obtain the gate is correct.

The problem of tracking a single target in clutter considers the situation where there are possibly several measurements in the validation region (gate) of a target. The set of validated measurements consists of:
• The correct measurement (if detected and it fell in the gate)
• The undesirable measurements: clutter or false-alarm originated
In practice, detections are obtained by thresholding the signal received by the sensor after processing it. This is the simplest (binary) way of using a target feature – its intensity. More sophisticated ways of using such feature information can be found in Bar-Shalom et al. (2011).

It is assumed that the measurement contains all the information that could be used to discard the undesirable measurements. Therefore, any measurement that has been validated could have originated from the target of interest.

A situation with a single-target track and several validated measurements is depicted in Fig. 1. The (two-dimensional) validation region is an ellipse centered at the predicted measurement $\hat z_1$. The parameters of the ellipse are determined by the covariance matrix $S$ of the innovation, which is assumed to be Gaussian.

[Data Association, Fig. 1  Several measurements in the validation region of a single track]

All the measurements in the validation region can be said to be not too unlikely to have originated from the target of interest, even though only one is assumed to be the true one. The implication of the assumption that there is a single target is that the undesirable measurements constitute a random interference. The common mathematical model for such false measurements is that they are:
• Uniformly spatially distributed
• Independent across time
This corresponds to what is known as residual clutter – the constant clutter, if any, is assumed to have been removed.

Multiple Targets in Clutter
The situation where there are several target tracks in the same neighborhood, as well as clutter (or false alarms), is more complicated. Figure 2 illustrates such a case for a given time, with the predicted measurements for the two targets considered denoted as $\hat z_1$ and $\hat z_2$. In this figure the following measurement origins are possible:
• $z_1$ from target 1 or clutter
• $z_2$ from either target 1 or target 2 or clutter
• $z_3$ and $z_4$ from target 2 or clutter
However, if $z_2$ originated from target 2, then it is quite likely that $z_1$ originated from target 1. This illustrates the interdependence of the associations in a situation where a persistent interference (neighboring target) is present in addition to random interference (clutter).
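This interdependence can be made concrete by enumerating the joint association events for the scenario of Fig. 2, under the common assumption that a target gives rise to at most one measurement per scan (a sketch; the count ignores the unresolved-measurement possibility considered in the sequel):

```python
from itertools import product

# Possible origins of each validated measurement in the Fig. 2 scenario
# (0 = clutter; 1 and 2 = targets 1 and 2).
origins = {"z1": [0, 1], "z2": [0, 1, 2], "z3": [0, 2], "z4": [0, 2]}

def feasible(event):
    """A target can originate at most one measurement per scan."""
    from_targets = [t for t in event if t != 0]
    return len(from_targets) == len(set(from_targets))

events = [e for e in product(*origins.values()) if feasible(e)]
print(len(events))  # 11 mutually exclusive joint association events
```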
[Data Association, Fig. 2  Two tracks with a measurement in the intersection of their validation regions]

Up to this point, it was assumed that a measurement could have originated from one of the targets or from clutter. However, in view of the fact that any signal processing system has an inherent finite resolution capability, an additional possibility has to be considered: $z_2$ could be the result of the merging of the detections from the two targets – it is an unresolved measurement. This constitutes a fourth origin hypothesis for a measurement that lies in the intersection of two validation regions. Most tracking algorithms ignore the possibility that a measurement is an unresolved one.

This illustrates only the difficulty of association of measurements to tracks at one point in time. The full problem, as will be discussed later, consists of associating measurements across time.

Approaches to Tracking and Data Association

The problem of tracking and data association is a hybrid problem because it is characterized by:
(1) Continuous uncertainties – state estimation in the presence of continuous noises
(2) Discrete uncertainties – which measurement(s) should be used in the estimation process
Assuming the goal is to obtain the MMSE estimate of the target state – its conditional mean – one can distinguish the following approaches.

Pure MMSE Approach
The Pure MMSE Approach to tracking and data association is obtained using the smoothing property of expectations (see, e.g., Bar-Shalom et al. 2001, Sect. 1.4.12), as follows:

  $\hat x_{\text{MMSE}} = E[x \mid Z] = E\{ E[x \mid A, Z] \mid Z \} = \sum_{A_i \in \mathcal{A}} E[x \mid A_i, Z]\, P\{A_i \mid Z\}$   (5)

where $A$ is an association event (assuming a Bayesian model, with prior probabilities from which one can calculate posterior probabilities), and the summation is over all events $A_i$ in the set $\mathcal{A}$ of mutually exclusive and exhaustive association events.

The above, which requires the evaluation of all the conditional (posterior) probabilities $P\{A_i \mid Z\}$, is a direct consequence of the total probability theorem (see, e.g., Bar-Shalom et al. 2001, Sect. 1.4.10), which yields the conditional pdf of the state as the following mixture

  $p(x \mid Z) = \sum_{A_i \in \mathcal{A}} p(x \mid A_i, Z)\, P\{A_i \mid Z\}$   (6)

In the linear-Gaussian case the above becomes a Gaussian mixture. Algorithms that fall in this category are PDAF and JPDAF; see Bar-Shalom et al. (2011).

MMSE-MAP Approach
The MMSE-MAP Approach, instead of enumerating and summing over all the association events, selects the one with the highest posterior probability, namely,

  $A^{\text{MAP}} = \arg\max_i P\{A_i \mid Z\}$   (7)
and then

  $\hat x_{\text{MMSE-MAP}} = E[x \mid A^{\text{MAP}}, Z]$   (8)

The HOMHT as proposed by Reid (1979) falls in this category; see Bar-Shalom et al. (2011).

MMSE-ML Approach
The MMSE-ML Approach does not assume priors for the association events and relies on the maximum likelihood approach to select the event, that is,

  $A^{\text{ML}} = \arg\max_i p\{Z \mid A_i\}$   (9)

and then

  $\hat x_{\text{MMSE-ML}} = E[x \mid A^{\text{ML}}, Z]$   (10)

The TOMHT falls into this category, and S-D assignment (or MDA) is an implementation of this; see Bar-Shalom et al. (2011).

Heuristic Approaches
There are numerous simpler/heuristic approaches. The most common one relies on the distance metric (3) and makes the selection of which measurement is associated with which track based on the "nearest neighbor" rule. The same criterion can be used in a global cost function.

Remarks
It should be noted that the MMSE-MAP estimate (8) and the MMSE-ML estimate (10) are obtained assuming that the selected association is correct – a hard decision. This hard decision is sometimes correct, sometimes wrong. On the other hand, the pure MMSE estimate (5) yields a soft decision – it averages over all the possibilities. This soft decision is never totally correct, never totally wrong.

The uncertainties (covariances) associated with the MMSE-MAP and MMSE-ML estimates might be optimistic in view of the above observation. The uncertainty associated with the pure MMSE estimate will be increased (realistically) in view of the fact that it includes the data association uncertainty.

Estimation and Data Association in Nonlinear Stochastic Systems

The Model
Consider the discrete-time stochastic system

  $x(k+1) = f[k, x(k), u(k), v(k)]$   (11)

where $x \in \mathbb{R}^n$ is the stacked state vector of the targets under consideration, $u(k)$ is a known input (included here for the sake of generality), and $v(k)$ is the process noise with a known pdf. The measurements at time $k+1$ are described by the stacked vector

  $z(k+1) = h[k+1, x(k+1), A(k+1), w(k+1)]$   (12)

where $A(k+1)$ is the data association event at $k+1$ that specifies (i) which measurement component originated from which components of $x(k+1)$, namely, from which target, and (ii) which measurements are false, that is, originated from the clutter process. The vector $w(k)$ is the observation noise, consisting of the error in the true measurement and the false measurements. The pdf of the false measurements and the probability mass function (pmf) of their number are also assumed to be known.

The noise sequences and false measurements are assumed to be white with known pdf and mutually independent. The initial state is assumed to have a known pdf and to be independent of the noises. Additional assumptions are given below for the optimal estimator, which evaluates the pdf of the state conditioned on the observations.

The optimal state estimator in the presence of data association uncertainty consists of the computation of the conditional pdf of the state $x(k)$ given all the information available at time $k$, namely, the prior information about the initial state, the intervening inputs, and the sets of measurements through time $k$. The conditions under which the optimal state estimator consists of the computation of this pdf are presented in detail.
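The hard versus soft decisions contrasted in the Remarks above can be sketched numerically (the association probabilities and conditional estimates below are hypothetical):

```python
import numpy as np

# Posterior association probabilities P{A_i | Z} and conditional estimates
# E[x | A_i, Z] for three hypothetical association events.
beta = np.array([0.5, 0.3, 0.2])
x_cond = np.array([[1.0, 0.0],
                   [1.4, 0.2],
                   [0.6, -0.1]])

x_mmse = beta @ x_cond        # pure MMSE, Eq. (5): soft, probability-weighted
i_map = int(np.argmax(beta))  # MMSE-MAP, Eq. (7): most probable event only
x_map = x_cond[i_map]         # Eq. (8): hard decision

print(x_mmse)  # [1.04 0.04]
print(x_map)   # [1. 0.]
```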
The Optimal Estimator for the Pure MMSE Approach
The information set available at $k$ is

  $I^k = \{ Z^k, U^{k-1} \}$   (13)

where

  $Z^k = \{ z(j) \}_{j=1}^{k}$   (14)

is the cumulative set of observations through time $k$, which subsumes the initial information $Z^0$, and $U^{k-1}$ is the set of known inputs prior to time $k$.

For a stochastic system, an information state (Striebel 1965) is a function of the available information set that summarizes the past of the system in a probabilistic sense.

It can be shown that the conditional pdf of the state

  $p_k = p[x(k) \mid I^k]$   (15)

is an information state if (i) the two noise sequences (process and measurement) are white and mutually independent and (ii) the target detection and clutter/false-measurement processes are white. Once the conditional pdf (15) is available, the pure MMSE estimator, i.e., the conditional mean, as well as the conditional variance, or covariance matrix, can be obtained.

The optimal estimator, which consists of the recursive functional relationship between the information states $p_{k+1}$ and $p_k$, is given by

  $p_{k+1} = \phi[k+1, p_k, z(k+1), u(k)]$   (16)

where

  $\phi[k+1, p_k, z(k+1), u(k)] = \frac{1}{c} \sum_{i=1}^{M(k+1)} p[z(k+1) \mid x(k+1), A_i(k+1)] \int p[x(k+1) \mid x(k), u(k)]\, p_k \, dx(k)\; P\{A_i(k+1)\}$   (17)

is the transformation that maps $p_k$ into $p_{k+1}$; the integration in (17) is over the range of $x(k)$, and $c$ is the normalization constant.

The recursion (17) shows that the optimal MMSE estimator in the presence of data association uncertainty has the following properties:
P1. The pdf $p_{k+1}$ is a weighted sum of pdfs, conditioned on the current-time association events $A_i(k+1)$, $i = 1, \ldots, M(k+1)$, where $M(k+1)$ is the number of mutually exclusive and exhaustive association events at time $k+1$.
P2. If the exact previous pdf, which is the sufficient statistic, is available, then only the most recent association event probabilities are needed at each time.
However, the number of terms of the mixture in the right-hand side of (17) is, by time $k+1$, given by the product

  $M^{k+1} = \prod_{i=1}^{k+1} M(i)$   (18)

which amounts to an exponential increase in time. This increase is similar to the increase in the number of the branches of the MHT hypothesis tree.

A detailed derivation of the recursion for the optimal estimator can be found in Bar-Shalom et al. (2011).

Track-to-Track Association

In addition to measurement-to-track association (M2TA), an additional class of data association is track-to-track association (T2TA). Following T2TA, track-to-track fusion (T2TF) may be performed to (hopefully) improve the overall tracking accuracy. For more details on track fusion, see Bar-Shalom et al. (2011).

It is desired first to test the hypothesis that two tracks pertain to the same target. The optimal test would require using the entire database (the sequences of measurements that form the tracks) through the present time $k$ and is not practical. In view of this, the test to be presented is based only on the most recent estimates from the tracks. The test based on the state estimates within a time window is discussed in Tian and Bar-Shalom (2009).
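The exponential growth (18) of the number of mixture terms can be illustrated in a few lines (the per-scan association-event counts $M(i)$ below are made up):

```python
import math

# Hypothetical numbers M(i) of association events at scans i = 1..4.
M = [3, 4, 3, 5]

# Eq. (18): the mixture has prod_i M(i) terms by the last scan, i.e.,
# the count grows multiplicatively, like the branches of an MHT tree.
terms = math.prod(M)
print(terms)  # 180
```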
Association of Tracks with Independent Errors
Let $\hat x^i(k)$ be the estimated state of a target by sensor $i$ with its own information processor. Assume that one has an estimate $\hat x^j(k)$ from sensor $j$, corresponding to the same time. Both can be current estimates, or one can be a prediction, as long as they pertain to the same time (the second time argument has been omitted for simplicity). The corresponding covariances are denoted as $P^m(k)$, $m = i, j$. The state estimation errors at the different sensors (local trackers),

  $\tilde x^i(k) = x^i(k) - \hat x^i(k)$   (19)

  $\tilde x^j(k) = x^j(k) - \hat x^j(k)$   (20)

where $x^i$ and $x^j$ are the corresponding true states, are assumed to be independent. This is the state estimation error independence assumption.

Remark 2 As shown in the sequel, for independent sensors, the state estimation errors for the same target are dependent in the presence of process noise.

Denote the difference of the two estimates as

  $\hat\Delta^{ij}(k) = \hat x^i(k) - \hat x^j(k)$   (21)

This is the estimate of the difference of the true states

  $\Delta^{ij}(k) = x^i(k) - x^j(k)$   (22)

The same-target hypothesis is that the true states are equal,

  $H_0 : \Delta^{ij}(k) = 0$   (23)

while the different-target alternative is

  $H_1 : \Delta^{ij}(k) \ne 0$   (24)

While (21) is the appropriate statistic to test whether (22) is zero or not, the rigorous proof of this fact is presented in Bar-Shalom et al. (2011).

The error in the difference between the state estimates

  $\tilde\Delta^{ij}(k) = \Delta^{ij}(k) - \hat\Delta^{ij}(k)$   (25)

is zero mean and has covariance

  $T^{ij}(k) = E\{ \tilde\Delta^{ij}(k)\, \tilde\Delta^{ij}(k)' \} = E\{ [\tilde x^i(k) - \tilde x^j(k)][\tilde x^i(k) - \tilde x^j(k)]' \}$   (26)

given, under the error independence assumption, by

  $T^{ij}(k) = P^i(k) + P^j(k)$   (27)

Assuming the estimation errors to be Gaussian, the test of $H_0$ vs. $H_1$ – the T2TA test – is: Accept $H_0$ if

  $D = \hat\Delta^{ij}(k)'\, [T^{ij}(k)]^{-1}\, \hat\Delta^{ij}(k) \le D_\alpha$   (28)

The threshold $D_\alpha$ is such that

  $P\{ D > D_\alpha \mid H_0 \} = \alpha$   (29)

where, e.g., $\alpha = 0.01$. From the Gaussian assumption, the threshold is the $1-\alpha$ point of the chi-square distribution with $n_z$ degrees of freedom (Bar-Shalom et al. 2001)

  $D_\alpha = \chi^2_{n_z}(1-\alpha)$   (30)

Association of Tracks with Dependent Errors
In the previous section, the association testing was done under the assumption that the estimation errors in the tracks are independent. However, as shown in Bar-Shalom et al. (2011), whenever there is process noise (or, in general, motion uncertainty), the track errors based on data from independent sensors are dependent.

The dependence between the estimation errors $\tilde x^i(k|k)$ and $\tilde x^j(k|k)$ from the two tracks arises from the common process noise, which contributes to both errors. This is due to the fact that there is a common motion equation for both trackers.

The testing of the hypothesis that the two tracks under consideration originated from the same target is done in the same manner as before,
except for the following modification to account for the dependence of their state estimation errors.

The covariance associated with the difference of the estimates

  $\hat\Delta^{ij}(k) = \hat x^i(k|k) - \hat x^j(k|k)$   (31)

is, accounting for the dependence,

  $T^{ij}(k) = E\{ \tilde\Delta^{ij}(k)\, \tilde\Delta^{ij}(k)' \} = E\{ [\tilde x^i(k|k) - \tilde x^j(k|k)][\tilde x^i(k|k) - \tilde x^j(k|k)]' \}$   (32)

and, with the known cross-covariance $P^{ij}$, is given by the expression

  $T^{ij}(k) = P^i(k|k) + P^j(k|k) - P^{ij}(k|k) - P^{ji}(k|k)$   (33)

Note the difference between the above and (27).

Effect of the Dependence
The effect of the dependence between the estimation errors is to reduce the covariance of the difference (31) of the estimates. This is due to the fact that the cross-covariance term reflects a positive correlation between the estimation errors (this is always the case for linear systems).

The Test
The hypothesis testing for track-to-track association with the dependence accounted for is done in the same manner as before in (28), except that the "smaller" covariance from (33) is used in the test statistic, which is, as before, the normalized distance squared between the estimates

  $D = \hat\Delta^{ij}(k)'\, [T^{ij}(k)]^{-1}\, \hat\Delta^{ij}(k)$   (34)

The Cross-Covariance of the Estimation Errors
The cross-covariance recursion for synchronized sensors can be shown to be (see Bar-Shalom et al. 2011)

  $P^{ij}(k|k) = E[ \tilde x^i(k|k)\, \tilde x^j(k|k)' ] = [I - W^i(k) H^i(k)]\, [ F(k-1) P^{ij}(k-1|k-1) F(k-1)' + Q(k-1) ]\, [I - W^j(k) H^j(k)]'$   (35)

This is a linear recursion – a Lyapunov-type equation – and its initial condition is, assuming the initial errors to be uncorrelated,

  $P^{ij}(0|0) = 0$   (36)

This is a reasonable assumption in view of the fact that the initial estimates are usually based on the initial measurements, which were assumed to have independent errors.

The cross-covariance for the case of asynchronous sensors can be found in Bar-Shalom et al. (2011).

Summary and Future Directions

This entry surveyed the issues involved in data association (specifically M2TA and T2TA) with regard to multitarget-multisensor tracking systems.

The future developments in this topic will be in regard to the use of new feature variables and classification in data association (some preliminary results are in Bar-Shalom et al. (2011)).

Cross-References

► Estimation for Random Sets
► Estimation, Survey on

Bibliography

Bar-Shalom Y, Li XR, Kirubarajan T (2001) Estimation with applications to tracking and navigation: theory, algorithms and software. Wiley, New York
Bar-Shalom Y, Willett PK, Tian X (2011) Tracking and data fusion. YBS, Storrs
Blackman SS, Popoli R (1999) Design and analysis of modern tracking systems. Artech House, Norwood
Mallick M, Krishnamurthy V, Vo BN (eds) (2013) Integrated tracking, classification, and sensor management. Wiley, Hoboken
Reid DB (1979) An algorithm for tracking multiple targets. IEEE Trans Autom Control 24:843–854
Striebel C (1965) Sufficient statistics in the optimum control of stochastic systems. J Math Anal Appl 12:576–592
Tian X, Bar-Shalom Y (2009) Track-to-track fusion configurations and association in a sliding window. J Adv Inf Fusion 4(2):146–164

Data Rate of Nonlinear Control Systems and Feedback Entropy

Christoph Kawan
Courant Institute of Mathematical Sciences, New York University, New York, USA

Abstract

Topological feedback entropy is a measure for the smallest information rate in a digital communication channel between the coder and the controller of a control system above which the control task of rendering a subset of the state space invariant can be solved. It is defined purely in terms of the open-loop system, without making reference to a particular coding and control scheme, and can also be regarded as a measure for the inherent rate at which the system generates "invariance information."

Keywords

Communication constraints; Controlled invariance; Invariance entropy; Minimal data rates; Stabilization

Introduction

In the theory of networked control systems, the assumption of classical control theory that information can be transmitted within control loops instantaneously, losslessly, and with arbitrary precision is no longer satisfied. Realistic mathematical models of many important real-world communication and control networks have to take into account general data rate constraints in the communication channels, time delays, partial loss of information, and variable network topologies. This raises the question of the smallest possible information rate above which a given control task can be solved. Though networked control systems can have a complicated topology, consisting of multiple sensors, controllers, and actuators, a first step towards understanding the problem of minimal data rates is to analyze the simplest possible network topology, consisting of one controller and one dynamical system connected by a digital channel with a certain rate in bits per unit time. In this context, there is a wealth of literature concerned with the problem of stabilizing a system under different assumptions about the specific coding and control scheme. However, with few exceptions, mainly linear systems (both deterministic and stochastic) have been considered. A comprehensive and detailed overview of this literature until 2007 can be found in the survey Nair et al. (2007). The first systematic approach to the problem of minimal data rates for set invariance and stabilization of (deterministic, nonlinear) control systems was presented in Nair et al. (2004), where the notion of topological feedback entropy was introduced. This quantity, defined in terms of the open-loop control system, is a measure for the smallest data rate a communication channel may have if the system is supposed to solve the control task of rendering a subset of the state space invariant. Other challenges that digital communication channels come along with are not yet taken into account here.

Feedback entropy was first introduced in Nair et al. (2004), using a similar approach via open covers as in the definition of topological entropy for classical dynamical systems in Adler et al. (1965). In Colonius and Kawan (2009), a quantity named invariance entropy was defined, which later turned out to be equivalent to the feedback entropy of Nair et al. (cf. Colonius et al. 2013). The notion of invariance entropy has been further
studied in the papers Kawan (2011a), Kawan (2011b), and Kawan (2011c). Several variations and generalizations have been introduced in Colonius (2010), Colonius (2012), Colonius and Kawan (2011), Da Silva (2013), and Hagihara and Nair (2013). The research monograph Kawan (2013) provides a comprehensive presentation of the results obtained so far in the deterministic case.

Definition

Topological feedback entropy is a nonnegative real-valued quantity which serves as a measure for the smallest possible data rate in a digital channel, connecting a coder to a controller, above which the controller is able to generate inputs which guarantee invariance of a given subset of the state space. In the literature, one finds several slightly differing versions. The original definition given in Nair et al. (2004) is (with minor modifications) as follows. Consider a discrete-time control system

  $x_{k+1} = F(x_k, u_k) = F_{u_k}(x_k), \quad k \ge 0,$

with $F : X \times U \to X$, where $X$ is a topological space and $U$ a nonempty set such that $F_u : X \to X$ is continuous for every $u \in U$. The transition map associated to this system is

  $\varphi : \mathbb{N}_0 \times X \times U^{\mathbb{N}_0} \to X, \quad \varphi(k, x, (u_n)) := F_{u_{k-1}} \circ \cdots \circ F_{u_1} \circ F_{u_0}(x).$

A compact subset $K \subset X$ with nonempty interior is (strongly) controlled invariant if for every $x \in K$ there is an input $u \in U$ such that $F_u(x) \in \operatorname{int} K$. A triple $(\mathcal{A}, \tau, G)$ is called an invariant open cover of $K$ if $\mathcal{A}$ is an open cover of $K$, $\tau$ is a positive integer, and $G : \mathcal{A} \to U^\tau$ is a map with components $G_0, G_1, \ldots, G_{\tau-1}$ which assign control values to all sets in $\mathcal{A}$ such that for every $A \in \mathcal{A}$ the finite sequence of controls $G(A)$ yields $\varphi(k, A, G(A)) \subset \operatorname{int} K$ for $k = 1, 2, \ldots, \tau$. The entropy of $(\mathcal{A}, \tau, G)$ is defined as follows. For every sequence $\alpha = (A_i)_{i \ge 0}$ of sets in $\mathcal{A}$ one defines an associated sequence of controls by

  $u(\alpha) = (u_0, u_1, u_2, \ldots)$ with $(u_l)_{l=\tau(i-1)}^{\tau i - 1} = G(A_{i-1})$ for all $i \ge 1$,

and for every $j \ge 1$ a set

  $B_j(\alpha) := \{ x \in X : \varphi(i\tau, x, u(\alpha)) \in A_i \text{ for } i = 0, 1, \ldots, j-1 \}.$

The family $\mathcal{B}_j := \{ B_j(\alpha) : \alpha \in \mathcal{A}^{\mathbb{N}_0} \}$ is an open cover of $K$. Letting $N(\mathcal{B}_j \mid K)$ denote the minimal cardinality of a finite subcover, the entropy of $(\mathcal{A}, \tau, G)$ is

  $h(\mathcal{A}, \tau, G) := \lim_{j \to \infty} \frac{1}{j\tau} \log_2 N(\mathcal{B}_j \mid K) = \inf_{j \ge 1} \frac{1}{j\tau} \log_2 N(\mathcal{B}_j \mid K).$

Finally, the topological feedback entropy (TFE) of $K$ is given by

  $h_{\mathrm{fb}}(K) := \inf_{(\mathcal{A}, \tau, G)} h(\mathcal{A}, \tau, G),$

where the infimum is taken over all invariant open covers of $K$.

A conceptually simpler but equivalent definition, introduced in Colonius and Kawan (2009), is the following. A subset $S \subset U^\tau$ is called $(\tau, K)$-spanning if for every $x \in K$ there is $u \in S$ with $\varphi(k, x, u) \in \operatorname{int} K$ for $k = 1, \ldots, \tau$. Writing $r_{\mathrm{inv}}(\tau, K)$ for the minimal cardinality of such a set, it can be shown that

  $h_{\mathrm{fb}}(K) = \lim_{\tau \to \infty} \frac{1}{\tau} \log_2 r_{\mathrm{inv}}(\tau, K) = \inf_{\tau \ge 1} \frac{1}{\tau} \log_2 r_{\mathrm{inv}}(\tau, K).$

This definition and several variations of it are mostly referred to by the name invariance entropy instead of feedback entropy. The intuition behind this definition is that a controller which receives a certain amount of information about the state, say $n$ bits, can generate at most $2^n$ different control sequences to steer the system on a finite time interval and hence, the number of control sequences needed to
accomplish the control task on this interval is a measure for the necessary amount of information.

The different variations of feedback or invariance entropy which can be found in the literature are briefly summarized as follows. For simplicity, in this entry we refer to several of these variations by the name (topological) feedback entropy:
(i) Instead of requiring that trajectories enter the interior of $K$ after one step of time, one can allow for a waiting time $\tau_0$ before entering $\operatorname{int} K$.
(ii) One can require that trajectories stay in $K$ instead of $\operatorname{int} K$ or that they stay in an arbitrarily small neighborhood of $K$, respectively.
(iii) One can restrict the set of initial states to a subset $K' \subset K$. In this case, a set $S$ of control sequences is $(\tau, K', K)$-spanning if for every $x \in K'$ there is $u \in S$ with $\varphi(k, x, u) \in \operatorname{int} K$ for $k = 1, \ldots, \tau$.
(iv) Feedback entropy can be defined for other classes of systems, e.g., continuous-time deterministic systems, random control systems, or systems with piecewise continuous right-hand sides.
There is also a local version of topological feedback entropy (LTFE), which measures the smallest data rate for local uniform asymptotic stabilization at an equilibrium. Also for other control tasks there have been attempts to define corresponding versions of feedback or invariance entropy.

Comparison to Topological Entropy

Though there are similarities in the definitions of TFE and topological entropy of dynamical systems, which are also reflected in similar properties, there is no direct relation between these two quantities. Topological entropy detects exponential complexity in the orbit structure of a dynamical system. In contrast, TFE measures the complexity of the control task of keeping a system in a subset of the state space by applying appropriate inputs. If no escape from this subset is possible, the TFE is zero, no matter how complicated the orbit structure is. Hence, topological entropy is sensitive to the local behavior of the system, while TFE in general is not. Interpreted in terms of information rates, topological entropy is a measure for the largest average rate of information about the initial state a dynamical system can generate. TFE measures the smallest rate of information about the state of the system above which a controller is able to render the set invariant. It should also be mentioned that topological entropy was first introduced as a topological counterpart of the measure-theoretic entropy defined by Kolmogorov and Sinai, and that the two notions are related by the variational principle, which asserts that the topological entropy is the supremum of the measure-theoretic entropies with respect to all invariant probability measures of the given system. For TFE, so far no convincing measure-theoretic approach exists. An excellent survey on the entropy theory of dynamical systems can be found in Katok (2007).

The Data Rate Theorem

The data rate theorem for the TFE confirms that the infimal data rate in a coding and control loop which guarantees strong controlled invariance of a set $K$ is equal to $h_{\mathrm{fb}}(K)$. More precisely, suppose that a sensor which measures the state of the system at the discrete times $k\tau$, $k = 0, 1, 2, \ldots$, is connected to a coder which at time $k\tau$ has a finite alphabet $S_k$ of symbols available. The measured state is coded by use of this alphabet, and the corresponding symbol is sent via a noiseless digital channel to a controller which generates an input sequence of length $\tau$. This sequence is used to steer the system until the next symbol arrives at time $(k+1)\tau$. The associated asymptotic average bit rate, which depends on the sequence $S = (S_k)_{k \ge 0}$ of coding alphabets, is given by

  $R(S) = \lim_{k \to \infty} \frac{1}{k\tau} \sum_{i=0}^{k-1} \log_2 |S_i|.$

If the limit does not exist, one may replace it with $\liminf$ or $\limsup$. The data rate theorem establishes the equality

  $h_{\mathrm{fb}}(K) = \inf_S R(S),$
where the infimum is taken over all coding and control loops which guarantee strong controlled invariance of $K$, i.e., for initial states in $K$ they generate trajectories $(x_k)_{k \ge 0}$ with $x_k \in \operatorname{int} K$ for $k = 1, 2, \ldots$. Similar data rate theorems can be proved for other variants of feedback entropy. In particular, the data rate theorem for the LTFE asserts that the infimal bit rate for local uniform asymptotic stabilization at an equilibrium is given by the LTFE. Proofs of different data rate theorems can be found in Nair et al. (2004), Hagihara and Nair (2013), and Kawan (2013).

Estimates and Formulas

Linear Systems
For linear systems, under reasonable assumptions, the feedback entropy is given by the sum of the unstable eigenvalues of the dynamical matrix, i.e., if the system is given by $x_{k+1} = A x_k + B u_k$, then

  $h_{\mathrm{fb}}(K) = \sum_{\lambda \in \mathrm{Sp}(A)} \max\{ 0, n_\lambda \log_2 |\lambda| \},$   (1)

where $\mathrm{Sp}(A)$ denotes the spectrum of $A$ and $n_\lambda$ is the algebraic multiplicity of the eigenvalue $\lambda$ (cf. Colonius and Kawan 2009). It is worth mentioning that the TFE therefore coincides with the topological entropy of the uncontrolled system $x_{k+1} = A x_k$, as defined by Bowen for maps on non-compact metric spaces (cf. Bowen 1971). However, this is a special property of linear systems and is related to the facts that (i) there is no difference between the local and the global dynamical behavior of uncontrolled linear systems and that (ii) the control sequence does not affect the exponential complexity of the dynamics, since it only appears as an additive term in the transition map of the system. Formula (1) thus reflects that what forces an increasing number of control inputs as time increases is the volume expansion of the open-loop system in the unstable subspace.

Upper Bounds Under Controllability Assumptions
If the state space of the control system is a differentiable manifold and the right-hand side is continuously differentiable, under certain controllability assumptions upper bounds for the feedback entropy can be formulated in terms of the Lyapunov exponents of periodic trajectories (for the concept of Lyapunov exponents, see, e.g., Barreira and Valls 2008) (cf. Kawan 2011b, 2013; Nair et al. 2004). More precisely, if there is a periodic trajectory in the interior of the given set $K$ such that the linearization along this trajectory is controllable, and if complete approximate controllability holds on the interior of $K$ (cf. Colonius and Kliemann 2000), then

  $h_{\mathrm{fb}}(K) \le \sum_\lambda \max\{ 0, n_\lambda \lambda \},$   (2)

where the sum is taken over all Lyapunov exponents $\lambda$ of the periodic trajectory and $n_\lambda$ denotes the multiplicity of $\lambda$. Using the definition of feedback entropy in terms of $(\tau, K)$-spanning sets, one can prove this by constructing spanning sets of control functions which first steer all initial states in $K$ into a small neighborhood of a point on the given periodic orbit and then, by use of local controllability, keep the corresponding trajectories in a neighborhood of the periodic trajectory for arbitrary future times. Similar ideas were first used in Nair et al. (2004) to prove that the LTFE at an equilibrium is given by the sum of the unstable eigenvalues of the linearization about this equilibrium. For systems given by differential equations, the upper estimate (2) can be improved under additional regularity assumptions. Assuming that the system is smooth and
is in correspondence with several former results satisfies the strong jet accessibility rank condition
on minimal data rates for stabilization of linear (cf. Coron 1994), one can show that both as-
systems. Thinking of the definition of feedback sumptions, controllability of the linearization and
entropy via spanning sets of control sequences, periodicity, can be omitted. The only restriction
its interpretation is that in order to guarantee that remains is that the trajectory must not leave
invariance of a bounded set, the only reason for a compact subset of the interior of K. However,
exponential growth of the number of necessary in the case of nonperiodic trajectories, the sum
Data Rate of Nonlinear Control Systems and Feedback Entropy 263

of the positive Lyapunov exponents has to be set K, which under sufficiently strong hyperbol-
replaced by the maximal Lyapunov exponent of icity assumptions can be estimated in terms of
the induced linear flow on the exterior bundle of other quantities such as Lyapunov exponents and
the manifold. For control-affine systems, strong dynamical entropies. Key references for escape
jet accessibility can be weakened to local acces- rates in the classical theory of dynamical sys-
sibility. In general, it is unlikely that such upper tems are (Young (1990) and Demers and Young
bounds are tight, since they are related to very (2006)).
specific control strategies for making the given
set invariant.
D
Summary and Future Directions
Lower Bounds, Volume Growth Rates,
and Escape Rates The theory of feedback entropy for finite-
A general approach to obtain lower bounds of dimensional deterministic systems is very far
feedback entropy is via a volume growth ar- from being complete. The currently available
gument, which in its simplest form works as results only give valuable information in very
follows. Every .; K/-spanning set S defines a regular situations, and even those are not fully
cover of K, consisting of the sets (cf. Kawan understood. For the further development of
2011a,c) this theory, it will be necessary to combine
control-theoretic methods with techniques from
K;u D fx 2 K W '.k; x; u/ 2 intK; different fields such as classical, random, and
nonautonomous dynamical systems. Some of the
1  k  g; u 2 S:
main focuses of future research will probably be
the following:
It follows that ';u .K;u /  K and hence, since K • The generalization of feedback entropy to
is bounded, the volume expansion under ';u D more complex network topologies
'.; ; u/ gives upper bounds for the volumes of • The formulation of a feedback entropy theory
the sets K;u , which result in a lower bound for for stochastic systems
the number of these sets. For instance, the lower • The development of a probabilistic (resp.
estimate in (1) can be established by applying measure-theoretic) version of feedback
this argument to the system which arises by entropy for both deterministic and stochastic
projection of the given linear system to the un- systems, which is related to the topological
stable subspace of the uncontrolled part xkC1 D version via a variational principle
Axk . A refinement of this idea also leads to • The numerical computation of feedback en-
lower estimates of feedback entropy for inho- tropy
mogeneous bilinear systems in terms of volume
growth rates or Lyapunov exponents on unstable
bundles, respectively. For nonlinear systems, in
general only very rough estimates can be ob- Cross-References
tained by this method. However, a variation of the
volume growth argument leads to a lower bound  Quantized Control and Data Rate Constraints
of the form

1 Bibliography
hfb .K/   lim inf log sup .K;u /;
 !1  u
Adler RL, Konheim AG, McAndrew MH (1965) Topolog-
ical entropy. Trans Amer Math Soc 114:309–319
where  denotes a reference measure on the state
Barreira L, Valls C (2008) Stability of nonautonomous
space. The right-hand side of this inequality can differential equations. Lecture notes in mathematics,
be considered as a uniform escape rate from the vol. 1926. Springer, Berlin
Bowen R (1971) Entropy for group endomorphisms and homogeneous spaces. Trans Am Math Soc 153:401–414
Colonius F (2010) Minimal data rates and invariance entropy. Electronic Proceedings of the conference on mathematical theory of networks and systems (MTNS), Budapest, 5–9 July 2010
Colonius F (2012) Minimal bit rates and entropy for stabilization. SIAM J Control Optim 50:2988–3010
Colonius F, Kawan C (2009) Invariance entropy for control systems. SIAM J Control Optim 48:1701–1721
Colonius F, Kawan C (2011) Invariance entropy for outputs. Math Control Signals Syst 22:203–227
Colonius F, Kliemann W (2000) The dynamics of control. Birkhäuser-Verlag, Boston
Colonius F, Kawan C, Nair GN (2013) A note on topological feedback entropy and invariance entropy. Syst Control Lett 62:377–381
Coron J-M (1994) Linearized control systems and applications to smooth stabilization. SIAM J Control Optim 32:358–386
Da Silva AJ (2013) Invariance entropy for random control systems. Math Control Signals Syst 25:491–516
Demers MF, Young L-S (2006) Escape rates and conditionally invariant measures. Nonlinearity 19:377–397
Hagihara R, Nair GN (2013) Two extensions of topological feedback entropy. Math Control Signals Syst 25:473–490
Katok A (2007) Fifty years of entropy in dynamics: 1958–2007. J Mod Dyn 1:545–596
Kawan C (2011a) Upper and lower estimates for invariance entropy. Discret Contin Dyn Syst 30:169–186
Kawan C (2011b) Invariance entropy of control sets. SIAM J Control Optim 49:732–751
Kawan C (2011c) Lower bounds for the strict invariance entropy. Nonlinearity 24:1910–1935
Kawan C (2013) Invariance entropy for deterministic control systems – an introduction. Lecture notes in mathematics, vol 2089. Springer, Berlin
Nair GN, Evans RJ, Mareels IMY, Moran W (2004) Topological feedback entropy and nonlinear stabilization. IEEE Trans Autom Control 49:1585–1597
Nair GN, Fagnani F, Zampieri S, Evans RJ (2007) Feedback control under data rate constraints: an overview. Proc IEEE 95:108–137
Young L-S (1990) Large deviations in dynamical systems. Trans Am Math Soc 318:525–543

DES

▶ Models for Discrete Event Systems: An Overview

Deterministic Description of Biochemical Networks

Jörg Stelling and Hans-Michael Kaltenbach
ETH Zürich, Basel, Switzerland

Abstract

Mathematical models of living systems are often based on formal representations of the underlying reaction networks. Here, we present the basic concepts for the deterministic nonspatial treatment of such networks. We describe the most prominent approaches for steady-state and dynamic analysis using systems of ordinary differential equations.

Keywords

Michaelis-Menten kinetics; Reaction networks; Stoichiometry

Introduction

A biochemical network describes the interconversion of biochemical species such as proteins or metabolites by chemical reactions. Such networks are ubiquitous in living cells, where they are involved in a variety of cellular functions such as conversion of metabolites into energy or building material of the cell, detection and processing of external and internal signals of nutrient availability or environmental stress, and regulation of genetic programs for development.

Reaction Networks

A biochemical network can be modeled as a dynamic system with the chemical concentration of each species taken as the states and dynamics described by the changes in species concentrations as they are converted by reactions. Assuming that species are homogeneously distributed in the reaction volume and that copy numbers are sufficiently high, we may ignore spatial and stochastic
effects and derive a system of ordinary differential equations (ODEs) to model the dynamics.

Formally, a biochemical network is given by r reactions R_1, ..., R_r acting on n different chemical species S_1, ..., S_n. Reaction R_j is given by

  R_j: α_{1,j} S_1 + ... + α_{n,j} S_n → β_{1,j} S_1 + ... + β_{n,j} S_n,

where α_{i,j}, β_{i,j} ∈ ℕ are called the molecularities of the species in the reaction. Their differences form the stoichiometric matrix N = (β_{i,j} − α_{i,j})_{i=1...n, j=1...r}, with N_{i,j} describing the net effect of one turnover of R_j on the copy number of species S_i. The jth column is also called the stoichiometry of reaction R_j. The system can be opened to an un-modeled environment by introducing inflow reactions ∅ → S_i and outflow reactions S_i → ∅.

For example, consider the following reaction network:

  R_1: E + S → E·S
  R_2: E·S → E + S
  R_3: E·S → E + P

Here, an enzyme E (a protein that acts as a catalyst for biochemical reactions) binds to a substrate species S, forming an intermediate complex E·S and subsequently converting S into a product P. Note that the enzyme-substrate binding is reversible, while the conversion to a product is irreversible. This network contains r = 3 reactions interconverting n = 4 chemical species S_1 = S, S_2 = P, S_3 = E, and S_4 = E·S. The stoichiometric matrix is

      ⎛ −1  +1   0 ⎞
  N = ⎜  0   0  +1 ⎟
      ⎜ −1  +1  +1 ⎟
      ⎝ +1  −1  −1 ⎠

Dynamics
Let x(t) = (x_1(t), ..., x_n(t))^T be the vector of concentrations, that is, x_i(t) is the concentration of S_i at time t. Abbreviating this state vector as x by dropping the explicit dependence on time, its dynamics is governed by a system of n ordinary differential equations:

  d/dt x = N · v(x, p).   (1)

Here, the reaction rate vector v(x, p) = (v_1(x, p_1), ..., v_r(x, p_r))^T gives the rate of conversion of each reaction per unit time as a function of the current system state and of a set of parameters p.

A typical reaction rate is given by the mass-action rate law

  v_j(x, p_j) = k_j · ∏_{i=1}^{n} x_i^{α_{i,j}},

where the rate constant k_j > 0 is the only parameter and the rate is proportional to the concentration of each species participating as an educt (consumed component) in the respective reaction.

Equation (1) decomposes the system into a time-independent and linear part, described solely by the topology and stoichiometry of the reaction network via N, and a dynamic and typically nonlinear part, given by the reaction rate laws v(·, ·). One can define a directed graph of the network with one vertex per state and take N as the (weighted) incidence matrix. Reaction rates are then properties of the resulting edges. In essence, the equation describes the change of each species' concentration as the sum of the current reaction rates. Each rate is weighted by the molecularity of the species in the corresponding reaction; it is negative if the species is consumed by the reaction and positive if it is produced.

Using mass-action kinetics throughout, and using the species name instead of the numeric subscript, the reaction rates and parameters of the example network are given by

  v_1(x, p_1) = k_1 · x_E(t) · x_S(t)
  v_2(x, p_2) = k_2 · x_ES(t)
  v_3(x, p_3) = k_3 · x_ES(t)
  p = (k_1, k_2, k_3)
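The decomposition dx/dt = N · v(x, p) of Eq. (1) can be made concrete for this example. The following is an illustrative sketch only (the numeric rate constants and initial concentrations are assumptions, not values from the entry):

```python
# Illustrative sketch of Eq. (1), dx/dt = N * v(x, p), for the
# enzyme-substrate example. State order: x = (x_S, x_P, x_E, x_ES).

# Stoichiometric matrix N (rows: S, P, E, E.S; columns: R1, R2, R3)
N = [
    [-1, +1,  0],
    [ 0,  0, +1],
    [-1, +1, +1],
    [+1, -1, -1],
]

def v(x, p):
    """Mass-action rate vector (v1, v2, v3) = (k1*xE*xS, k2*xES, k3*xES)."""
    x_S, x_P, x_E, x_ES = x
    k1, k2, k3 = p
    return [k1 * x_E * x_S, k2 * x_ES, k3 * x_ES]

def dxdt(x, p):
    """Right-hand side of Eq. (1): the matrix-vector product N * v(x, p)."""
    rates = v(x, p)
    return [sum(n_ij * r for n_ij, r in zip(row, rates)) for row in N]

# Hypothetical values: k1 = k2 = k3 = 1, starting from substrate and
# free enzyme only.
x0 = [2.0, 0.0, 1.0, 0.0]            # x_S, x_P, x_E, x_ES
print(dxdt(x0, (1.0, 1.0, 1.0)))     # -> [-2.0, 0.0, -2.0, 2.0]
```

Note that the E-row and E·S-row of N sum to zero, so the computed derivatives of x_E and x_ES always cancel; this reflects the conservation relation for total enzyme discussed later under "Constrained Dynamics".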
The system equations are then

  d/dt x_S = −k_1 · x_S · x_E + k_2 · x_ES
  d/dt x_P = k_3 · x_ES
  d/dt x_E = −k_1 · x_S · x_E + k_2 · x_ES + k_3 · x_ES
  d/dt x_ES = k_1 · x_S · x_E − k_2 · x_ES − k_3 · x_ES.

Steady-State Analysis
A reaction network is in steady state if the production and consumption of each species are balanced. Steady-state concentrations x* then satisfy the equation

  0 = N · v(x*, p).

Computing steady-state concentrations requires explicit knowledge of reaction rates and their parameter values. For biochemical reaction networks, these are often very difficult to obtain. An alternative is the computation of steady-state fluxes v*, which only requires solving the system of homogeneous linear equations

  0 = N · v*.   (2)

Lower and upper bounds v_i^l, v_i^u for each flux v_i can be given such that v_i^l ≤ v_i ≤ v_i^u; an example is an irreversible reaction R_i, which implies v_i^l = 0. The set of all feasible solutions then forms a pointed, convex, polyhedral flux cone in ℝ^r. The rays spanning the flux cone correspond to elementary flux modes (EFMs) or extreme pathways (EPs), minimal subnetworks that are already balanced. Each feasible steady-state flux can be written as a nonnegative combination

  v* = ∑_i λ_i · e_i,   λ_i ≥ 0,

of EFMs e_1, e_2, ..., where the λ_i are the corresponding weights.

Even if in steady state, living cells grow and divide. Growth of a cell is often described by a combination of fluxes b^T · v, the total production rate of relevant metabolites to form new biomass. The biomass function given by b ∈ ℝ^r is determined experimentally. The technique of flux balance analysis (FBA) then solves the linear program

  max_v  b^T · v
  subject to
    0 = N · v
    v_i^l ≤ v_i ≤ v_i^u

to yield a feasible flux vector that balances the network while maximizing growth. Alternative objective functions have been proposed, for instance, for higher organisms that do not necessarily maximize the growth of each cell.

Quasi-Steady-State Analysis
In many reaction mechanisms, a quasi-steady-state assumption (QSSA) can be made, postulating that the concentration of some of the involved species does not change. This assumption is often justified if reaction rates differ hugely, leading to a time scale separation, or if some concentrations are very high, such that their change is negligible for the mechanism. A typical example is the derivation of Michaelis-Menten kinetics, which corresponds to our example network. There, we may assume that the concentration of the intermediate species E·S stays approximately constant on the time scale of the overall conversion of substrate into product and that the substrate, at least initially, is in much larger abundance than the enzyme. On the slower time scale, this leads to the Michaelis-Menten rate law:

  v_P = d/dt x_P(t) = v_max · x_S(t) / (K_m + x_S(t)),

with a maximal rate v_max = k_3 · x_E^tot, where x_E^tot is the total amount of enzyme, and the Michaelis-Menten constant K_m = (k_2 + k_3)/k_1 as a direct relation between substrate concentration and production rate. This approximation reduces the number of states by two. Both parameters
of the Michaelis-Menten rate law are also better suited for experimental determination: v_max is the highest rate achievable, and K_m corresponds to the substrate concentration that yields a rate of v_max/2.

Cooperativity and Ultra-sensitivity
In the Michaelis-Menten mechanism, the production rate gradually increases with increasing substrate concentration, until saturation (Fig. 1; h = 1). A different behavior is achieved if the enzyme has several binding sites for the substrate and these sites interact such that occupation of one site alters the affinity of the other sites positively or negatively, phenomena known as positive and negative cooperativity, respectively. With QSSA arguments as before, the fraction of enzymes completely occupied by substrate molecules at time t is given by

  v = v_max · x_S^h(t) / (K^h + x_S^h(t)),

where K > 0 is a constant and h > 0 is the Hill coefficient. The Hill coefficient determines the shape of the response with increasing substrate concentration: a coefficient of h > 1 (h < 1) indicates positive (negative) cooperativity; h = 1 reduces to the Michaelis-Menten mechanism. With increasing coefficient h, the response changes from gradual to switch-like, such that the transition from low to high response becomes more rapid, as indicated in Fig. 1. This phenomenon is also known as ultra-sensitivity.

Deterministic Description of Biochemical Networks, Fig. 1 Responses for cooperative enzyme reaction with Hill coefficient h = 1, 3, 10, respectively. All other parameters are set to 1. (The figure plots response [a.u.] against input [a.u.].)

Constrained Dynamics
Due to the particular structure of the system equation (1), the trajectories x(t) of the network with x_0 = x(0) are confined to the stoichiometric subspace, the intersection of x_0 + Img N with the positive orthant. Conservation relations that describe conservation of mass are thus found as solutions to

  c^T · N = 0,

and two initial conditions x_0, x_0' lead to the same stoichiometric subspace if c^T · x_0 = c^T · x_0'. This allows for the analysis of, for example, bistability using only the reaction network structure.

Summary and Future Directions

Reaction networks, even in simple cells, typically encompass thousands of components and reactions, resulting in potentially high-dimensional nonlinear dynamic systems. In contrast to engineered systems, biology is characterized by a high degree of uncertainty of both model structure and parameter values. Therefore, system identification is a central problem in this domain. Specifically, advanced methods for model topology and parameter identification as well as for uncertainty quantification need to be developed that take into account the very limited observability of biological systems. In addition, biological systems operate on multiple time, length, and concentration scales. For example, genetic regulation usually operates on the time scale of minutes and involves very few molecules, whereas metabolism is significantly faster and states are well approximated by species concentrations. Corresponding systematic frameworks for multiscale modeling, however, are currently lacking.

Cross-References

▶ Dynamic Graphs, Connectivity of
▶ Modeling of Dynamic Systems from First Principles
▶ Monotone Systems in Biology
▶ Robustness Analysis of Biological Models
▶ Spatial Description of Biochemical Networks
▶ Stochastic Description of Biochemical Networks
▶ Synthetic Biology

Bibliography

Craciun G, Tang Y, Feinberg M (2006) Understanding bistability in complex enzyme-driven reaction networks. Proc Natl Acad Sci USA 103(23):8697–8702
Higham DJ (2008) Modeling and simulating chemical reactions. SIAM Rev 50(2):347–368
LeDuc PR, Messner WC, Wikswo JP (2011) How do control-based approaches enter into biology? Annu Rev Biomed Eng 13:369–396
Sontag E (2005) Molecular systems biology and control. Eur J Control 11(4–5):396–435
Szallasi Z, Stelling J, Periwal V (eds) (2010) System modeling in cellular biology: from concepts to nuts and bolts. MIT, Cambridge
Tyson JJ, Chen KC, Novak B (2003) Sniffers, buzzers, toggles and blinkers: dynamics of regulatory and signaling pathways in the cell. Curr Opin Cell Biol 15(2):221–231

Diagnosis of Discrete Event Systems

Stéphane Lafortune
Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI, USA

Abstract

We discuss the problem of event diagnosis in partially observed discrete event systems. The objective is to infer the past occurrence, if any, of an unobservable event of interest based on the observed system behavior and the complete model of the system. Event diagnosis is performed by diagnosers that are synthesized from the system model and that observe the system behavior at run-time. Diagnosability analysis is the off-line task of determining which events of interest can be diagnosed at run-time by diagnosers.

Keywords

Diagnosability; Diagnoser; Fault diagnosis; Verifier

Introduction

In this entry, we consider discrete event systems that are partially observable and discuss the two related problems of event diagnosis and diagnosability analysis. Let the DES of interest be denoted by M with event set E. Since M is partially observable, its set of events E is the disjoint union of a set of observable events, denoted by Eo, with a set of unobservable events, denoted by Euo: E = Eo ∪ Euo. At this point, we do not specify how M is represented; it could be an automaton or a Petri net. Let LM be the set of all strings of events in E that the DES M can execute, i.e., the (untimed) language model of the system; cf. the related entries ▶ Models for Discrete Event Systems: An Overview, ▶ Supervisory Control of Discrete-Event Systems, and ▶ Modeling, Analysis, and Control with Petri Nets. The set Euo captures the fact that the set of sensors attached to the DES M is limited and may not cover all the events of the system. Unobservable events can be internal system events that are not directly "seen" by the monitoring agent that observes the behavior of M for diagnosis purposes. They can also be fault events that are included in the system model but are not directly observable by a dedicated sensor. For the purpose of diagnosis, let us designate a specific unobservable event of interest and denote it by d ∈ Euo. Event d could be a fault event or some other significant event that is unobservable.
Before we can state the problem of event diagnosis, we need to introduce some notation. E* is the set of all strings of any length n ∈ ℕ that can be formed by concatenating elements of E. The unique string of length n = 0 is denoted by ε and is the identity element of concatenation. As in article ▶ Supervisory Control of Discrete-Event Systems, section "Supervisory Control Under Partial Observations," we define the projection function P : E* → Eo* that "erases" the unobservable events in a string and replaces them by ε. The function P is naturally extended to a set of strings by applying it to each string in the set, resulting in a set of projected strings. The observed behavior of M is the language P(LM) over event set Eo.

The problem of event diagnosis, or simply diagnosis, is stated as follows: how to infer the past occurrence of event d when observing strings in P(LM) at run-time, i.e., during the operation of the system? This is model-based inferencing, i.e., the monitoring agent knows LM and the partition E = Eo ∪ Euo, and it observes strings in P(LM). When there are multiple events of interest, d_1 to d_n, and these events are fault events, we have a problem of fault diagnosis. In this case, the objective is not only to determine that a fault has occurred (commonly referred to as "fault detection") but also to identify which fault has occurred, namely, which event d_i (commonly referred to as "fault isolation and identification"). Fault diagnosis requires that LM contains not only the nominal or fault-free behavior of the system but also its behavior after the occurrence of each fault event d_i of interest, i.e., its faulty behavior. Event d_i is typically a fault of a component that leads to degraded behavior on the part of the system. It is not a catastrophic failure that would cause the system to completely stop operating, as such a failure would be immediately observable. The decision on which fault events d_i, along with their associated faulty behaviors, to include in the complete model LM is a design one that is based on practical considerations related to the diagnosis objectives.

A complementary problem to event diagnosis is that of diagnosability analysis. Diagnosability analysis is the off-line task of determining, on the basis of LM and of Eo and Euo, if any and all occurrences of the given event of interest d will eventually be diagnosed by the monitoring agent that observes the system behavior.

Event diagnosis and diagnosability analysis arise in numerous applications of systems that are modeled as DES. We mention a few application areas where DES diagnosis theory has been employed. In heating, ventilation, and air-conditioning systems, components such as valves, pumps, and controllers can fail in degraded modes of operation; for example, a valve gets stuck open or stuck closed, or a pump or controller gets stuck on or stuck off. The available sensors may not directly observe these faults, as the sensing abilities are limited. Fault diagnosis techniques are essential, since the components of the system are often not easily accessible. In monitoring communication networks, faults of certain transmitters or receivers are not directly observable and must be inferred from the set of successful communications and the topology of the network. In document processing systems, faults of internal components can lead to jams in the paper path or a decrease in image quality, and while the paper jam or the image quality is itself observable, the underlying fault may not be, as the number of internal sensors is limited.

Without loss of generality, we consider a single event of interest to diagnose, d. When there are multiple events to diagnose, the methodologies that we describe in the remainder of this entry can be applied to each event of interest d_i, i = 1, ..., n, individually; in this case, the other events of interest d_j, j ≠ i, are treated the same as the other unobservable events in the set Euo in the process of model-based inferencing.

Problem Formulation

Event Diagnosis
We start with a general language-based formulation of the event diagnosis problem. The information available to the agent that monitors the system behavior and performs the task of event diagnosis is the language LM and the set of observable events Eo, along with the specific
string t ∈ P(LM) that it observes at run-time. The actual string generated by the system is s ∈ LM where P(s) = t. However, as far as the monitoring agent is concerned, the actual string that has occurred could be any string in P⁻¹(t) ∩ LM, where P⁻¹ is the inverse projection operation, i.e., P⁻¹(t) is the set of all strings s_t ∈ E* such that P(s_t) = t. Let us denote this estimate set by E(t) = P⁻¹(t) ∩ LM, where "E" stands for "estimate." If a string s ∈ LM contains event d, we write that d ∈ s; otherwise, we write that d ∉ s.

The event diagnosis problem is to synthesize a diagnostic engine that will automatically provide the following answers from the observed t and from the knowledge of LM and Eo:
1. Yes, if and only if d ∈ s for all s ∈ E(t).
2. No, if and only if d ∉ s for all s ∈ E(t).
3. Maybe, if and only if there exist s_Y, s_N ∈ E(t) such that d ∈ s_Y and d ∉ s_N.
As defined, E(t) is a string-based estimate. In section "Diagnosis of Automata," we discuss how to build a finite-state structure that will encode the desired answers for the above three cases when the DES M is modeled by a deterministic finite-state automaton. The resulting structure is called a diagnoser automaton.

Diagnosability Analysis
Diagnosability analysis consists in determining, a priori, if any and all occurrences of event d in LM will eventually be diagnosed, in the sense that if event d occurs, then the diagnostic engine is guaranteed to eventually issue the decision "Yes." For the sake of technical simplicity, we assume hereafter that LM is a live language, i.e., any trace in LM can always be extended by one more event. In this context, we would not want the diagnostic engine to issue the decision "Maybe" for an arbitrarily long number of event occurrences after event d occurs. When this outcome is possible, we say that event d is not diagnosable in language LM.

The property of diagnosability of DES is defined as follows. In view of the liveness assumption on language LM, any string s_Y′ that contains event d can always be extended to a longer string, meaning that it can be made "arbitrarily long" after the occurrence of d. That is, for any s_Y′ in LM and for any n ∈ ℕ, there exists s_Y = s_Y′ t ∈ LM where the length of t is equal to n. Event d is not diagnosable in language LM if there exists such a string s_Y together with a second string s_N that does not contain event d, and such that P(s_Y) = P(s_N). This means that the monitoring agent is unable to distinguish between s_Y and s_N; yet, the number of events after an occurrence of d can be made arbitrarily large in s_Y, thereby preventing diagnosis of event d within a finite number of events after its occurrence. On the other hand, if no such pair of strings (s_Y, s_N) exists in LM, then event d is diagnosable in LM. (The mathematically precise definition of diagnosability is available in the literature cited at the end of this entry.)

Diagnosis of Automata

We recall the definition of a deterministic finite-state automaton, or simply automaton, from article ▶ Models for Discrete Event Systems: An Overview, with the addition of a set of unobservable events as in section "Supervisory Control Under Partial Observations" in article ▶ Supervisory Control of Discrete-Event Systems. The automaton, denoted by G, is a four-tuple G = (X, E, f, x_0), where X is the finite set of states, E is the finite set of events partitioned into E = Eo ∪ Euo, x_0 is the initial state, and f is the deterministic partial transition function f : X × E → X that is immediately extended to strings f : X × E* → X. For a DES M represented by an automaton G, LM is the language generated by automaton G, denoted by L(G) and formally defined as the set of all strings for which the extended f is defined. It is an infinite set if the transition graph of G has one or more cycles. In view of the liveness assumption made on LM in the preceding section, G has no reachable deadlocked state, i.e., for all s ∈ E* such that f(x_0, s) is defined, there exists an event e ∈ E such that f(x_0, se) is also defined.

To synthesize a diagnoser automaton that correctly performs the diagnostic task formulated in the preceding section, we proceed as follows.
Diagnosis of Discrete Event Systems 271

First, we perform the parallel composition (denoted by ||) of G with the two-state label automaton A_label that is defined as follows. A_label = ({N, Y}, {d}, f_label, N), where f_label has two transitions defined: (i) f_label(N, d) = Y and (ii) f_label(Y, d) = Y. The purpose of A_label is to record the occurrence of event d, which causes a transition to state Y. By forming G_labeled = G || A_label, we record in the states of G_labeled, which are of the form (x_G, x_A), if the first element of the pair, state x_G ∈ X, was reached or not by executing event d at some point in the past: if d was executed, then x_A = Y; otherwise x_A = N. (We refer the reader to Chap. 2 in Cassandras and Lafortune (2008) for the formal definition of parallel composition of automata.) By construction, L(G_labeled) = L(G).

The second step of the construction of the diagnoser automaton is to build the observer of G_labeled, denoted by Obs(G_labeled), with respect to the set of observable events E_o. (We refer the reader to Chap. 2 in Cassandras and Lafortune (2008) for the definition of the observer automaton and for its construction.) The construction of the observer involves the standard subset construction algorithm for nondeterministic automata in automata theory; here, the unobservable events are the source of nondeterminism, since they effectively correspond to ε-transitions. The diagnoser automaton of G with respect to E_o is defined as Diag(G) = Obs(G || A_label). Its event set is E_o.

The states of Diag(G) are sets of state pairs of the form (x_G, x_A) where x_A is either N or Y. Examination of the state of Diag(G) reached by string t ∈ P[L(G)] provides the answers to the event diagnosis problem. Let us denote that state by x^t_Diag. Then:
1. The diagnostic decision is Yes if all state pairs in x^t_Diag have their second component equal to Y; we call such a state a "Yes-state" of Diag(G).
2. The diagnostic decision is No if all state pairs in x^t_Diag have their second component equal to N; we call such a state a "No-state" of Diag(G).
3. The diagnostic decision is Maybe if there is at least one state pair in x^t_Diag whose second component is equal to Y and at least one state pair in x^t_Diag whose second component is equal to N; we call such a state a "Maybe-state" of Diag(G).

To perform run-time diagnosis, it therefore suffices to examine the current state of Diag(G). Note that Diag(G) can be computed off-line from G and stored in memory, so that run-time diagnosis requires only updating the new state of Diag(G) on the basis of the most recent observed event (which is necessarily in E_o). If storing the entire structure of Diag(G) is impractical, its current state can be computed on-the-fly on the basis of the most recent observed event and of the transition structure of G_labeled; this involves one step of the subset construction algorithm.

As a simple example, consider the automaton G1 shown in Fig. 1, where E_uo = {d}. The occurrence of event d changes the behavior of the system such that event c does not cause a return to the initial state 1 (identified by incoming arrow); instead, the system gets stuck in state 3 after d occurs.

Diagnosis of Discrete Event Systems, Fig. 1  Automaton G1

Its diagnoser is depicted in Fig. 2. It contains one Yes-state, state {(3, Y)} (abbreviated as "3Y" in the figure), one No-state, and two Maybe-states (similarly abbreviated). Two consecutive occurrences of event c, or an occurrence of b right after c, both indicate that the system must be in state 3,
i.e., that event d must have occurred; this is captured by the two transitions from Maybe-state {(1, N), (3, Y)} to the Yes-state in Diag(G1).

Diagnosis of Discrete Event Systems, Fig. 2  Diagnoser automaton of G1

As a second example, consider the automaton G2 shown in Fig. 3, where the self-loop b at state 3 in G1 has been removed. Its diagnoser is shown in Fig. 4.

Diagnosis of Discrete Event Systems, Fig. 3  Automaton G2

Diagnosis of Discrete Event Systems, Fig. 4  Diagnoser automaton of G2

Diagnoser automata provide as much information as can be inferred, from the available observations and the automaton model of the system, regarding the past occurrence of unobservable event d. However, we may want to answer the question: Can the occurrence of event d always be diagnosed? This is the realm of diagnosability analysis discussed in the next section.

Diagnosability Analysis of Automata

As mentioned earlier, diagnosability analysis consists in determining, a priori, if any and all occurrences of event d in L_M will eventually be diagnosed. In the case of diagnoser automata, we do not want Diag(G) to loop forever in a cycle of Maybe-states and never enter a Yes-state if event d has occurred, as happens in the diagnoser in Fig. 2 for the string adbⁿ when n gets arbitrarily large. In this case, Diag(G1) loops in Maybe-state {(2, N), (3, Y)}, and the occurrence of d goes undetected. This shows that event d is not diagnosable in G1; the counterexample is provided by strings s_Y = adbⁿ and s_N = abⁿ.

For systems modeled as automata, diagnosability can be tested with quadratic time complexity in the size of the state space of G by forming a so-called twin-automaton (also called "verifier") where G is parallel composed with itself, but synchronization is only enforced on observable events, allowing arbitrary interleavings of unobservable events. The test for diagnosability reduces to detection of cycles that occur after
event d in the twin-automaton. It can be verified that for automaton G2 in our example, event d is diagnosable. Indeed, it is clear from the structure of G2 that Diag(G2) in Fig. 4 will never loop in Maybe-state {(2, N), (3, Y)} if event d occurs; rather, after d occurs, G2 can only execute c events, and after two such events, Diag(G2) enters Yes-state {(3, Y)}.

Note that diagnosability will not hold in an automaton that contains a cycle of unobservable events after the occurrence of event d, although this is not the only instance where the property is violated, as we saw in our simple example.

Diagnosis and Diagnosability Analysis of Petri Nets

There are several approaches for diagnosis and diagnosability analysis of DES modeled by Petri nets, depending on the boundedness properties of the net and on what is observable about its behavior. Let N be the Petri net model of the system, which consists of a Petri net structure along with an initial marking of all places. If the transitions of N are labeled by events in a set E, some by observable events in E_o and some by unobservable events in E_uo, and if the contents of the Petri net places are not observed except for the initial marking of N, then we have a language-based diagnosis problem as considered so far in this entry, for language L(N) and for E = E_o ∪ E_uo, with event of interest d ∈ E_uo. In this case, if the set of reachable states of the net is bounded, then we can use the reachability graph as an equivalent automaton model of the same system and build a diagnoser automaton as described earlier. It is also possible to encode diagnoser states into the original structure of net N by keeping track of all possible net markings following the observation of an event in E_o, appending the appropriate label ("N" or "Y") to each marking in the state estimate. This is reminiscent of the on-the-fly construction of the current state of the diagnoser automaton discussed earlier, except that the possible system states are directly listed as Petri net markings on the structure of N. Regarding diagnosability analysis, it can be performed using the twin-automaton technique of the preceding section, from the reachability graph of N.

Another approach that is actively being pursued in current literature is to exploit the structure of the net model N for diagnosis and for diagnosability analysis, instead of working with the automaton model obtained from its reachability graph. In addition to potential computational gains from avoiding the explicit generation of the entire set of reachable states, this approach is motivated by the need to handle Petri nets whose sets of reachable states are infinite, in particular Petri nets that generate languages that are not regular and hence cannot be represented by finite-state automata. Moreover, in this approach, one can incorporate potential observability of token contents in the places of the Petri nets. We refer the interested reader to the relevant chapters in Campos et al. (2013) and Seatzu et al. (2013) for coverage of these topics.

Current and Future Directions

The basic methodologies described so far for diagnosis and diagnosability analysis have been extended in many different directions. We briefly discuss a few of these directions, which are active research areas. Detailed coverage of these topics is beyond the scope of this entry and is available in the references listed at the end.

Diagnosis of timed models of DES has been considered, for classes of timed automata and timed Petri nets, where the objective is to ensure that each occurrence of event d is detected within a bounded time delay. Diagnosis of stochastic models of DES has been considered, in particular stochastic automata, where the hard diagnosability constraints are relaxed and detection of each occurrence of event d must be guaranteed with some probability 1 − ε, for some small ε > 0. Stochastic models also allow handling of unreliable sensors or noisy environments where event observations may be corrupted with some probability, such as when an occurrence of event a is observed as event a 80 % of the time and as some other event a′ 20 % of the time.
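As a complement to the constructions described earlier, the diagnoser of the example automaton G1 can be sketched in a few lines. The transition structure used below is an assumption: a plausible reading of the textual description of Fig. 1 (the figure itself is not reproduced here), with a taking state 1 to 2, the unobservable fault d taking 2 to 3, b self-looping at 2 and 3, and c returning 2 to 1 while self-looping at 3. Each observed event triggers one step of the subset construction:

```python
# Sketch of diagnoser-based diagnosis by subset construction.
# The transitions of G1 below are an assumption consistent with the
# text's description of Fig. 1, not taken from the figure itself.
G1 = {  # (state, event) -> next state
    (1, "a"): 2, (2, "b"): 2, (2, "c"): 1, (2, "d"): 3,
    (3, "b"): 3, (3, "c"): 3,
}
UNOBSERVABLE = {"d"}  # E_uo; the event of interest d is unobservable

def unobservable_reach(pairs):
    """Close a set of (state, label) pairs under unobservable events,
    switching the label to 'Y' once d has occurred."""
    stack, seen = list(pairs), set(pairs)
    while stack:
        x, lab = stack.pop()
        for e in UNOBSERVABLE:
            if (x, e) in G1:
                nxt = (G1[(x, e)], "Y" if e == "d" else lab)
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return frozenset(seen)

def diagnose(observations):
    """Return the diagnoser state and decision (Yes/No/Maybe)
    after a string of observable events."""
    state = unobservable_reach({(1, "N")})  # initial state of Diag(G1)
    for e in observations:
        moved = {(G1[(x, e)], lab) for (x, lab) in state if (x, e) in G1}
        state = unobservable_reach(moved)
    labels = {lab for (_, lab) in state}
    decision = "Yes" if labels == {"Y"} else "No" if labels == {"N"} else "Maybe"
    return state, decision

if __name__ == "__main__":
    for obs in (["a"], ["a", "c"], ["a", "c", "c"], ["a", "b", "b"]):
        print(obs, diagnose(obs))
```

Under this assumed structure, two c's after a lead to the Yes-state, while a followed by any number of b's stays in a Maybe-state, reproducing the non-diagnosability of d in G1 discussed above.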
Decentralized diagnosis is concerned with DES that are observed by several monitoring agents i = 1, …, n, each with its own set of observable events E_{o,i} and each having access to the entire set of system behaviors, L_M. The task is to design a set of individual diagnosers, one for each set E_{o,i}, such that the n diagnosers together diagnose all the occurrences of event d. In other words, for each occurrence of event d in any string of L_M, there exists at least one diagnoser that will detect it (i.e., answer "Yes"). The individual diagnosers may or may not communicate with each other at run-time, or they may communicate with a coordinating diagnoser that will fuse their information; several decentralized diagnosis architectures have been studied and their properties characterized. The focus in these works is the decentralized nature of the information available about the strings in L_M, as captured by the individual observable event sets E_{o,i}, i = 1, …, n.

Distributed diagnosis is closely related to decentralized diagnosis, except that it normally refers to situations where each individual diagnoser uses only part of the entire system model. Let M be an automaton G obtained by parallel composition of subsystem models: G = ||_{i=1,…,n} G_i. In distributed diagnosis, one would want to design each individual diagnoser Diag_i on the basis of G_i alone or on the basis of G_i and of an abstraction of the rest of the system, ||_{j=1,…,n; j≠i} G_j. Here, the emphasis is on the distributed nature of the system, as captured by the parallel composition operation. In the case where M is a Petri net N, the distributed nature of the system may be captured by individual net models N_i, i = 1, …, n, that are coupled by common places, i.e., place-bordered Petri nets.

Robust diagnosis generally refers to decentralized or distributed diagnosis, but where one or more of the individual diagnosers may fail. Thus, there must be built-in redundancy in the set of individual diagnosers so that they together may still detect every occurrence of event d even if one or more of them ceases to operate.

So far we have considered a fixed and static set of observable events, E_o ⊆ E, where every occurrence of each event in E_o is always observed by the monitoring agent. However, there are many instances where one would want the monitoring agent to dynamically activate or deactivate the observability properties of a subset of the events in E_o; this arises in situations where event monitoring is "costly" for energy, bandwidth, or security reasons. This is referred to as the case of dynamic observations, and the goal is to synthesize sensor activation policies that minimize a given cost function while preserving the diagnosability properties of the system.

Cross-References

▶ Models for Discrete Event Systems: An Overview
▶ Modeling, Analysis, and Control with Petri Nets
▶ Supervisory Control of Discrete-Event Systems

Recommended Reading

There is a very large amount of literature on diagnosis and diagnosability analysis of DES that has been published in control engineering, computer science, and artificial intelligence journals and conference proceedings. We mention a few recent books or survey articles that are a good starting point for readers interested in learning more about this active area of research. In the DES literature, the study of fault diagnosis and the formalization of diagnosability properties started in Lin (1994) and Sampath et al. (1995). Chapter 2 of the textbook Cassandras and Lafortune (2008) contains basic results about diagnoser automata and diagnosability analysis of DES, following the approach introduced in Sampath et al. (1995). The research monograph Lamperti and Zanella (2003) presents DES diagnostic methodologies developed in the artificial intelligence literature. The survey paper Zaytoon and Lafortune (2013) presents a detailed overview of fault diagnosis research in the control engineering literature. The two edited books
Differential Geometric Methods in Nonlinear Control 275

Campos et al. (2013) and Seatzu et al. (2013) contain chapters specifically devoted to diagnosis of automata and Petri nets, with an emphasis on automated manufacturing applications for the latter. Specifically, Chaps. 5, 14, 15, 17, and 19 in Campos et al. (2013) and Chaps. 22–25 in Seatzu et al. (2013) are recommended for further reading on several aspects of DES diagnosis. Zaytoon and Lafortune (2013) and the cited chapters in Campos et al. (2013) and Seatzu et al. (2013) contain extensive bibliographies.

Bibliography

Campos J, Seatzu C, Xie X (eds) (2013) Formal methods in manufacturing. Series on industrial information technology. CRC/Taylor and Francis, Boca Raton, FL
Cassandras CG, Lafortune S (2008) Introduction to discrete event systems, 2nd edn. Springer, New York, NY
Lamperti G, Zanella M (2003) Diagnosis of active systems: principles and techniques. Kluwer, Dordrecht
Lin F (1994) Diagnosability of discrete event systems and its applications. Discret Event Dyn Syst Theory Appl 4(2):197–212
Sampath M, Sengupta R, Lafortune S, Sinnamohideen K, Teneketzis D (1995) Diagnosability of discrete event systems. IEEE Trans Autom Control 40(9):1555–1575
Seatzu C, Silva M, van Schuppen J (eds) (2013) Control of discrete-event systems. Automata and Petri net perspectives. Lecture notes in control and information sciences, vol 433. Springer, London
Zaytoon J, Lafortune S (2013) Overview of fault diagnosis methods for discrete event systems. Annu Rev Control 37(2):308–320

Differential Geometric Methods in Nonlinear Control

A.J. Krener
Department of Applied Mathematics, Naval Postgraduate School, Monterey, CA, USA

Abstract

In the early 1970s, concepts from differential geometry were introduced to study nonlinear control systems. The leading researchers in this effort were Roger Brockett, Robert Hermann, Henry Hermes, Alberto Isidori, Velimir Jurdjevic, Arthur Krener, Claude Lobry, and Hector Sussmann. These concepts revolutionized our knowledge of the analytic properties of control systems, e.g., controllability, observability, minimality, and decoupling. With these concepts, a theory of nonlinear control systems emerged that generalized the linear theory. This theory of nonlinear systems is largely parallel to the linear theory, but of course it is considerably more complicated.

Keywords

Codistribution; Distribution; Frobenius theorem; Involutive distribution; Lie jet

Introduction

This is a brief survey of the influence of differential geometric concepts on the development of nonlinear systems theory. Section "A Primer on Differential Geometry" reviews some concepts and theorems of differential geometry. Nonlinear controllability and nonlinear observability are discussed in sections "Controllability of Nonlinear Systems" and "Observability for Nonlinear Systems". Section "Minimal Realizations" discusses minimal realizations of nonlinear systems, and section "Disturbance Decoupling" discusses the disturbance decoupling problem.
A Primer on Differential Geometry

Perhaps a better title might be "A Primer on Differential Topology" since we will not treat Riemannian or other metrics. An n-dimensional manifold M is a topological space that is locally homeomorphic to a subset of IRⁿ. For simplicity, we shall restrict our attention to smooth (C∞) manifolds and smooth objects on them. Around each point p ∈ M, there is at least one coordinate chart that is a neighborhood N_p ⊆ M and a homeomorphism x : N_p → U where U is an open subset of IRⁿ. When two coordinate charts overlap, the change of coordinates should be smooth. For simplicity, we restrict our attention to differential geometric objects described in local coordinates.

In local coordinates, a vector field is just an ODE of the form

ẋ = f(x)    (1)

where f(x) is a smooth IR^{n×1}-valued function of x. In a different coordinate chart with local coordinates z, this vector field would be represented by a different formula:

ż = g(z)

If the charts overlap, then on the overlap they are related:

f(x(z)) = (∂x/∂z)(z) g(z),  g(z(x)) = (∂z/∂x)(x) f(x)

Since f(x) is smooth, it generates a smooth flow φ(t, x⁰) where for each t, the mapping x⁰ ↦ φ(t, x⁰) is a local diffeomorphism and for each x⁰ the mapping t ↦ φ(t, x⁰) is a solution of the ODE (1) satisfying the initial condition φ(0, x⁰) = x⁰. We assume that all the flows are complete, i.e., defined for all t ∈ IR, x ∈ M. The flows are one-parameter groups, i.e., φ(t, φ(s, x⁰)) = φ(t + s, x⁰) = φ(s, φ(t, x⁰)).

If f(x⁰) = b, a constant vector, then locally the flow looks like translation, φ(t, x¹) = x¹ + tb. If f(x⁰) ≠ 0, then we can always choose local coordinates z so that in these coordinates the vector field is constant. Without loss of generality, we can assume that x⁰ = 0 and that the first component of f is f₁(0) ≠ 0. Define the local change of coordinates x(z) = φ(z₁, x¹(z)) where x¹(z) = (0, z₂, …, zₙ)′. It is not hard to see that this is a local diffeomorphism and that, in z coordinates, the vector field is the first unit vector.

If f(x⁰) = 0, let F = (∂f/∂x)(x⁰); then if all the eigenvalues of F are off the imaginary axis, the integral curves of f(x) and

ż = Fz    (2)

are locally topologically equivalent (Arnol'd 1983). This is the Grobman-Hartman theorem. There exists a local homeomorphism z = h(x) that carries x(t) trajectories into z(s) trajectories in some neighborhood of x⁰ = 0. This homeomorphism need not preserve time, t ≠ s, but it does preserve the direction of time. Whether these flows are locally diffeomorphic is a more difficult question that was explored by Poincaré. See the section on feedback linearization (Krener 2013).

If all the eigenvalues of F are in the open left half plane, then the linear dynamics (2) is globally asymptotically stable around z⁰ = 0, i.e., if the flow of (2) is ψ(t, z), then ψ(t, z¹) → 0 as t → ∞. Then it can be shown that the nonlinear dynamics is locally asymptotically stable, φ(t, x¹) → x⁰ as t → ∞ for all x¹ in some neighborhood of x⁰.

One forms, ω(x), are dual to vector fields. The simplest example of a one form (also called a covector field) is the differential dh(x) of a scalar-valued smooth function h(x). This is the IR^{1×n} covector field

ω(x) = [(∂h/∂x₁)(x) … (∂h/∂xₙ)(x)]

Sometimes this is written as

ω(x) = Σᵢ₌₁ⁿ (∂h/∂xᵢ)(x) dxᵢ

The most general smooth one form is of the form

ω(x) = [ω₁(x) … ωₙ(x)] = Σᵢ₌₁ⁿ ωᵢ(x) dxᵢ

where the ωᵢ(x) are smooth functions. The duality between one forms and vector fields is the bilinear pairing

⟨ω(x), f(x)⟩ = ω(f)(x) = ω(x)f(x) = Σᵢ₌₁ⁿ ωᵢ(x)fᵢ(x)
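The pairing of a differential dh with a vector field f is exactly the directional derivative of h along f, which can be checked numerically; the particular h and f below are illustrative choices, not taken from the text:

```python
import math

# <dh(x), f(x)> = sum_i (dh/dx_i)(x) f_i(x) is the directional
# derivative of h along f; compare it against a finite difference.
def h(x):
    return x[0] ** 2 * x[1] + math.sin(x[1])

def dh(x):  # hand-computed differential of h, a 1 x 2 covector
    return [2 * x[0] * x[1], x[0] ** 2 + math.cos(x[1])]

def f(x):  # an illustrative smooth vector field on IR^2
    return [x[1], -x[0] + 0.5]

def pairing(omega, v):
    return sum(wi * vi for wi, vi in zip(omega, v))

x = [0.7, -0.3]
eps = 1e-6
lie_derivative = pairing(dh(x), f(x))  # L_f(h)(x) = <dh(x), f(x)>
finite_diff = (h([xi + eps * vi for xi, vi in zip(x, f(x))]) - h(x)) / eps
print(lie_derivative, finite_diff)
```

The two numbers agree to within the finite-difference error, illustrating that the coordinate formula for the pairing computes L_f(h)(x).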
Just as a vector field can be thought of as a first-order ODE, a one form can be thought of as a first-order PDE. Given ω(x), find h(x) such that dh(x) = ω(x). A one form ω(x) is said to be exact if there exists such an h(x). Of course, if there is one solution, then there are many, all differing by a constant of integration which we can take as the value of h at some point x⁰. Unlike smooth first-order ODEs, smooth first-order PDEs do not always have a solution. There are integrability conditions and topological conditions that must be satisfied. Suppose dh(x) = ω(x); then (∂h/∂xᵢ)(x) = ωᵢ(x), so

(∂ωᵢ/∂xⱼ)(x) = (∂²h/∂xⱼ∂xᵢ)(x) = (∂²h/∂xᵢ∂xⱼ)(x) = (∂ωⱼ/∂xᵢ)(x)

Therefore, for the PDE to have a solution, the integrability conditions

(∂ωᵢ/∂xⱼ)(x) − (∂ωⱼ/∂xᵢ)(x) = 0

must be satisfied. The exterior derivative of a one form is a skew-symmetric matrix field

dω(x) = Σ_{i<j} ((∂ωᵢ/∂xⱼ)(x) − (∂ωⱼ/∂xᵢ)(x)) dxᵢ ∧ dxⱼ

A one form ω(x) is said to be closed if dω(x) = 0. This is locally sufficient for there to exist an h(x) such that dh(x) = ω(x). Every exact form is closed, but not every closed form is exact. A counterexample on IR² minus the origin is

ω(x) = [−x₂  x₁] / (x₁² + x₂²)

This is closed but not exact. The line integral of this covector field around any circle centered at the origin is 2π. If it were exact, the line integral would have been zero because the curve ends where it begins.

The Lie derivative of a scalar-valued function h(x) by a vector field f(x) is defined to be the scalar-valued function

L_f(h)(x) = (∂h/∂x)(x) f(x) = ⟨dh(x), f(x)⟩

This can be iterated:

L_f^k(h)(x) = (∂L_f^{k−1}(h)/∂x)(x) f(x)

If h(x) ∈ IR^{p×1}, then L_f(h)(x) ∈ IR^{p×1}.

The Lie bracket of two vector fields f¹(x) and f²(x) is another vector field

[f¹, f²](x) = (∂f²/∂x)(x) f¹(x) − (∂f¹/∂x)(x) f²(x)

Clearly, the Lie bracket is skew symmetric, [f¹, f²](x) = −[f², f¹](x). It also satisfies the Jacobi identity

[[f¹, f²], f³](x) + [[f², f³], f¹](x) + [[f³, f¹], f²](x) = 0

Repeated Lie brackets are often expressed inductively as

ad_f⁰(g)(x) = g(x)
ad_f^k(g)(x) = [f, ad_f^{k−1}(g)](x)

The geometric interpretation of the Lie bracket [f, g](x) is the infinitesimal commutator of their flows φ(t, x) and ψ(t, x), i.e.,

ψ(t, φ(t, x)) − φ(t, ψ(t, x)) = [f, g](x) t² + O(t)³

Another interpretation of the Lie bracket is given by the Lie series expansion

g(φ(t, x)) = Σ_{k=0}^∞ (−1)^k (t^k / k!) ad_f^k(g)(x)

which is a Taylor series expansion that is convergent for small |t| if f, g are real analytic vector fields. Another Lie series is

h(φ(t, x)) = Σ_{k=0}^∞ (t^k / k!) L_f^k(h)(x)
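For linear vector fields f(x) = Ax and g(x) = Bx, the bracket formula gives [f, g](x) = (BA − AB)x and the flows are matrix exponentials, so the commutator interpretation above can be verified numerically. The matrices below are illustrative choices, and the matrix exponential is a simple truncated-series sketch:

```python
import numpy as np

def expm(M, terms=30):
    """Matrix exponential by truncated power series (adequate for
    the small matrices and small t used here)."""
    out, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

# Illustrative non-commuting linear vector fields f(x)=Ax, g(x)=Bx.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0]])
x = np.array([1.0, 2.0])

bracket = (B @ A - A @ B) @ x  # [f,g](x) for linear vector fields

# Flows: phi(t,x) = e^{At} x and psi(t,x) = e^{Bt} x; their
# commutator should equal [f,g](x) t^2 up to O(t^3).
t = 1e-3
commutator = expm(B * t) @ (expm(A * t) @ x) - expm(A * t) @ (expm(B * t) @ x)
print(bracket, commutator / t**2)
```

Dividing the flow commutator by t² recovers the bracket, as the expansion predicts.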
Given a smooth mapping z = θ(x) from an open subset of IRⁿ to an open subset of IRᵐ and vector fields f(x) and g(z) on these open subsets, we say f(x) is θ-related to g(z) if

g(θ(x)) = (∂θ/∂x)(x) f(x)

It is not hard to see that if f¹(x), f²(x) are θ-related to g¹(z), g²(z), then [f¹, f²](x) is θ-related to [g¹, g²](z). For this reason, we say that the Lie bracket is an intrinsic differentiation. The other intrinsic differentiation is the exterior derivative operation d.

The Lie derivative of a one form ω(x) by a vector field f(x) is given by

L_f(ω)(x) = Σᵢ,ⱼ ((∂ωᵢ/∂xⱼ)(x) fⱼ(x) + ωⱼ(x) (∂fⱼ/∂xᵢ)(x)) dxᵢ

It is not hard to see that

L_f(⟨ω, g⟩)(x) = ⟨L_f(ω), g⟩(x) + ⟨ω, [f, g]⟩(x)

and

L_f(dh)(x) = d(L_f(h))(x)    (3)

Control systems involve multiple vector fields. A distribution D is a set of vector fields on M that is closed under addition of vector fields and under multiplication by scalar functions. A distribution defines at each x ∈ M a subspace of the tangent space

D(x) = {f(x) : f ∈ D}

These subspaces form a subbundle D of the tangent bundle. If the subspaces are all of the same dimension, then the distribution is said to be nonsingular. We will restrict our attention to nonsingular distributions.

A codistribution (or Pfaffian system) E is a set of one forms on M that is closed under addition and multiplication by scalar functions. A codistribution defines at each x ∈ M a subspace of the cotangent space

E(x) = {ω(x) : ω ∈ E}

These subspaces form a subbundle E of the cotangent bundle. If the subspaces are all of the same dimension, then the codistribution is said to be nonsingular. Again, we will restrict our attention to nonsingular codistributions.

Every distribution D defines a dual codistribution

D⊥ = {ω(x) : ω(x)f(x) = 0 for all f(x) ∈ D}

and vice versa,

E⊥ = {f(x) : ω(x)f(x) = 0 for all ω(x) ∈ E}

A k-dimensional distribution D (or its dual codistribution D⊥) can be thought of as a system of PDEs on M. Find n − k independent functions h₁(x), …, h_{n−k}(x) such that

dhᵢ(x)f(x) = 0 for all f(x) ∈ D

The functions h₁(x), …, h_{n−k}(x) are said to be independent if dh₁(x), …, dh_{n−k}(x) are linearly independent at every x ∈ M. In other words, dh₁(x), …, dh_{n−k}(x) span D⊥ over the space of smooth functions.

The Frobenius theorem gives the integrability conditions for these functions to exist locally. The distribution D must be involutive, i.e., closed under the Lie bracket,

[D, D] = {[f, g] : f, g ∈ D} ⊆ D

When the functions exist, their joint level sets {x : hᵢ(x) = cᵢ, i = 1, …, n − k} are the leaves of a local foliation. Through each x⁰ in a convex local coordinate chart N, there exists locally a k-dimensional submanifold {x ∈ N : hᵢ(x) = hᵢ(x⁰)}. At each x¹ in this submanifold, its tangent space is D(x¹). Whether these hᵢ(x) exist globally to define a global foliation, a partition of M into smooth submanifolds, is a delicate question. Consider a distribution on IR² generated by a constant vector field f(x) = b
of irrational slope, b₂/b₁ irrational. Construct the torus T² as the quotient of IR² by the integer lattice Z². The distribution passes to the quotient, and since it is one dimensional, it is clearly involutive. The leaves of the quotient distribution are curves that wind around the torus indefinitely, and each curve is dense in T². Therefore, any smooth function h(x) that is constant on such a leaf is constant on all of T². Hence, the local foliation does not extend to a global foliation.

Another delicate question is whether the quotient space of M by a foliation induced by an involutive distribution is a smooth manifold (Sussmann 1975). This is always true locally, but it may not hold globally. Think of the foliation of T² discussed above.

Given k ≤ n vector fields f¹(x), …, f^k(x) that are linearly independent at each x and that commute, [fⁱ, fʲ](x) = 0, there exists a local change of coordinates z = z(x) so that in the new coordinates the vector fields are the first k unit vectors.

The involutive closure D̄ of D is the smallest involutive distribution containing D. As with all distributions, we always assume implicitly that it is nonsingular. A point x¹ is D-accessible from x⁰ if there exists a continuous and piecewise smooth curve joining x⁰ to x¹ whose left and right tangent vectors are always in D. Obviously, D-accessibility is an equivalence relation. Chow's theorem (1939) asserts that its equivalence classes are the leaves of the foliation induced by D̄. Chow's theorem goes a step further. Suppose f¹(x), …, f^k(x) span D(x) at each x ∈ M; then given any two points x⁰, x¹ in a leaf of D̄, there is a continuous and piecewise smooth curve joining x⁰ to x¹ whose left and right tangent vectors are always one of the fⁱ(x).

Controllability of Nonlinear Systems

An initialized nonlinear system that is affine in the control is of the form

ẋ = f(x) + g(x)u = f(x) + Σⱼ₌₁ᵐ gʲ(x)uⱼ
y = h(x)
x(0) = x⁰    (4)

where the state x are local coordinates on an n-dimensional manifold M, the control u is restricted to lie in some set U ⊆ IRᵐ, and the output y takes values in IRᵖ. We shall only consider such systems. A particular case is a linear system of the form

ẋ = Fx + Gu
y = Hx
x(0) = x⁰    (5)

where M = IRⁿ and U = IRᵐ.

The set A_t(x⁰) of points accessible at time t ≥ 0 from x⁰ is the set of all x¹ ∈ M such that there exists a bounded, measurable control trajectory u(s) ∈ U, 0 ≤ s ≤ t, so that the solution of (4) satisfies x(t) = x¹. We define A(x⁰) as the union of A_t(x⁰) for all t ≥ 0. The system (4) is said to be controllable at time t > 0 if A_t(x⁰) = M and controllable in forward time if A(x⁰) = M.

For linear systems, controllability is a rather straightforward matter, but for nonlinear systems it is more subtle with numerous variations. The variation of constants formula gives the solution of the linear system (5) as

x(t) = e^{Ft} x⁰ + ∫₀ᵗ e^{F(t−s)} G u(s) ds

so A_t(x⁰) is an affine subspace of IRⁿ for any t > 0. It is not hard to see that the columns of

[G … F^{n−1}G]    (6)

are tangent to this affine subspace, so if this matrix is of rank n, then A_t(x⁰) = IRⁿ for any t > 0. This is the so-called controllability rank condition for linear systems.

Turning to the nonlinear system (4), let D be the distribution spanned by the vector fields f(x), g¹(x), …, gᵐ(x), and let D̄ be its involutive closure. It is clear that A(x⁰) is contained in the leaf of D̄ through x⁰. Krener (1971) showed that A(x⁰) has nonempty interior in this leaf. More precisely, A(x⁰) is between an open set and its closure in the relative topology of the leaf.
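For the linear case, the rank condition on (6) is easy to check mechanically. The following sketch uses two illustrative systems, a double integrator (controllable) and a pair of decoupled modes with the input entering only the first (not controllable):

```python
import numpy as np

def controllability_matrix(F, G):
    """Stack the blocks [G, FG, ..., F^{n-1}G] column-wise, as in (6)."""
    n = F.shape[0]
    blocks, P = [G], G
    for _ in range(n - 1):
        P = F @ P
        blocks.append(P)
    return np.hstack(blocks)

# Double integrator: a single input steers both states.
F1 = np.array([[0.0, 1.0], [0.0, 0.0]])
G1 = np.array([[0.0], [1.0]])
rank1 = np.linalg.matrix_rank(controllability_matrix(F1, G1))

# Two decoupled modes, input entering only the first:
# the second state is never influenced by u, so the rank drops.
F2 = np.array([[-1.0, 0.0], [0.0, -2.0]])
G2 = np.array([[1.0], [0.0]])
rank2 = np.linalg.matrix_rank(controllability_matrix(F2, G2))
print(rank1, rank2)  # 2 and 1
```

Full rank n certifies that A_t(x⁰) = IRⁿ for every t > 0, as stated above.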
280 Differential Geometric Methods in Nonlinear Control

Let D0 be the smallest distribution containing arbitrarily small open set containing x 0 ; x 1 . The
the vector fields g 1 .x/; : : : ; g m .x/ and invariant open set need not be connected. A nonlinear sys-
under bracketing by f .x/, i.e., tem is (short time, locally short time) observable
if every pair of initial states is (short time, locally
Œf; D0   D0 short time) distinguishable. Finally, a nonlinear
system is (short time, locally short time) locally
and let DN 0 be its involutive closure. Sussmann and observable if every x 0 has a neighborhood such
Jurdjevic (1972) showed that A_t(x^0) is in the leaf of D̄^0 through x^0, and it is between an open set and its closure in the topology of this leaf.

For linear systems (5) where f(x) = Fx and g^j(x) = G^j, the jth column of G, it is not hard to see that

ad_f^k(g^j)(x) = (-1)^k F^k G^j

so the nonlinear generalization of the controllability rank condition is that at each x the dimension of D̄^0(x) is n. This guarantees that A_t(x^0) is between an open set and its closure in the topology of M.

The condition that

dim D̄(x) = n    (7)

is referred to as the nonlinear controllability rank condition. This guarantees that A(x^0) is between an open set and its closure in the topology of M.

There are stronger interpretations of controllability for nonlinear systems. One is short time local controllability (STLC): the set of points accessible from x^0 in any small t > 0, with state trajectories restricted to an arbitrarily small neighborhood of x^0, should contain x^0 in its interior. Hermes (1994) and others have done work on this.

Observability for Nonlinear Systems

Two possible initial states x^0, x^1 for the nonlinear system are distinguishable if there exists a control u(·) such that the corresponding outputs y^0(t), y^1(t) are not equal. They are short time distinguishable if there is a u(·) such that y^0(t) ≠ y^1(t) for all small t > 0. They are locally short time distinguishable if in addition the corresponding state trajectories do not leave an arbitrarily small neighborhood of x^0. The system is (observable, short time observable, locally short time observable) at x^0 if there is a neighborhood of x^0 such that every other point x^1 in the neighborhood is (short time, locally short time) distinguishable from x^0.

For a linear system (5), all these definitions coalesce into a single concept of observability, which can be checked by the observability rank condition: the rank of

\begin{bmatrix} H \\ HF \\ \vdots \\ HF^{n-1} \end{bmatrix}    (8)

equals n.

The corresponding concept for nonlinear systems involves E, the smallest codistribution containing dh_1(x), ..., dh_p(x) that is invariant under repeated Lie differentiation by the vector fields f(x), g^1(x), ..., g^m(x). Let

E(x) = {ω(x) : ω ∈ E}

The nonlinear observability rank condition is

dim E(x) = n    (9)

for all x ∈ M. This condition guarantees that the nonlinear system is locally short time observable. It follows from (3) that E is spanned by a set of exact one-forms, so its dual distribution E^⊥ is involutive.

For a linear system (5), the input–output mapping from x(0) = x^i, i = 0, 1, is

y^i(t) = H e^{Ft} x^i + ∫_0^t H e^{F(t-s)} G u(s) ds

The difference is

y^1(t) - y^0(t) = H e^{Ft} (x^1 - x^0)
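In the linear case, condition (8) is straightforward to check numerically. The sketch below is an illustration added here (not part of the original entry; the example matrices are arbitrary): it stacks H, HF, ..., HF^{n-1} and compares the rank with n.

```python
import numpy as np

def observability_matrix(F, H):
    """Stack H, HF, ..., HF^(n-1) as in condition (8)."""
    n = F.shape[0]
    return np.vstack([H @ np.linalg.matrix_power(F, k) for k in range(n)])

# Double integrator observed through position only: observable.
F = np.array([[0.0, 1.0],
              [0.0, 0.0]])
H = np.array([[1.0, 0.0]])
print(np.linalg.matrix_rank(observability_matrix(F, H)))   # 2 = n

# Observing velocity only loses the position: the rank drops to 1.
H2 = np.array([[0.0, 1.0]])
print(np.linalg.matrix_rank(observability_matrix(F, H2)))  # 1
```

The same rank test applied to the stacked differentials of the output and its Lie derivatives is, in spirit, what condition (9) asks for pointwise in the nonlinear case.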
Differential Geometric Methods in Nonlinear Control 281
So if one input u(·) distinguishes x^0 from x^1, then so does every input.

For nonlinear systems, this is not necessarily true. That prompted Gauthier et al. (1992) to introduce a stronger concept of observability for nonlinear systems. For simplicity, we describe it for scalar input and scalar output systems. A nonlinear system is uniformly observable for any input if there exist local coordinates so that it is of the form

y = x_1 + h(u)
ẋ_1 = x_2 + f_1(x_1, u)
  ⋮
ẋ_{n-1} = x_n + f_{n-1}(x_1, ..., x_{n-1}, u)
ẋ_n = f_n(x, u)

Clearly, if we know u(·), y(·), then by repeated differentiation of y(·) we can reconstruct x(·). It has been shown that for nonlinear systems that are uniformly observable for any input, the extended Kalman filter is a locally convergent observer (Krener 2002a), and the minimum energy estimator is globally convergent (Krener 2002b).

Minimal Realizations

The initialized nonlinear system (4) can be viewed as defining an input–output mapping from input trajectories u(·) to output trajectories y(·). Is it a minimal realization of this mapping, or does there exist an initialized nonlinear system on a smaller dimensional state space that realizes the same input–output mapping?

Kalman showed that (5) initialized at x^0 = 0 is minimal iff the controllability rank condition and the observability rank condition hold. He also showed how to reduce a linear system to a minimal one.

If the controllability rank condition does not hold, then the span of (6) has some dimension k < n. This subspace contains the columns of G and is invariant under multiplication by F. In fact, it is the maximal subspace with these properties. So the linear system can be restricted to this k-dimensional subspace, and it realizes the same input–output mapping from x^0 = 0. The restricted system satisfies the controllability rank condition.

If the observability rank condition does not hold, then the kernel of (8) is a subspace of IR^{n×1} of dimension n - l > 0. This subspace is in the kernel of H and is invariant under multiplication by F. In fact, it is the maximal subspace with these properties. Therefore, there is a quotient linear system on IR^{n×1} mod the kernel of (8) which has the same input–output mapping. The quotient is of dimension l < n, and it realizes the same input–output mapping. The quotient system satisfies the observability rank condition.

By employing these two steps in either order, we pass to a minimal realization of the input–output map of (5) from x^0 = 0. Kalman also showed that two linear minimal realizations differ by a linear change of state coordinates.

An initialized nonlinear system is a realization of minimal dimension of its input–output mapping if the nonlinear controllability rank condition (7) and the nonlinear observability rank condition (9) hold.

If the nonlinear controllability rank condition (7) fails to hold because the dimension of D(x) is k < n, then by replacing the state space M with the k-dimensional leaf through x^0 of the foliation induced by D, we obtain a smaller state space on which the nonlinear controllability rank condition (7) holds. The input–output mapping is unchanged by this restriction.

Suppose the nonlinear observability rank condition (9) fails to hold because the dimension of E(x) is l < n. Then consider a convex neighborhood N of x^0. The distribution E^⊥ induces a local foliation of N into leaves of dimension n - l > 0. The nonlinear system leaves this local foliation invariant in the following sense. Suppose x^0 and x^1 are on the same leaf; then if x^i(t) is the trajectory starting at x^i, then x^0(t) and x^1(t) are on the same leaf as long as the trajectories remain in N. Furthermore, h(x) is constant on leaves, so y^0(t) = y^1(t). Hence, there exists locally a nonlinear system whose state space is the leaf space. On this leaf space, the nonlinear observability rank condition (9) holds,
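Kalman's restriction step has a direct numerical sketch (illustrative code added here, not from the original entry): take an orthonormal basis V of the span of the controllability matrix [G FG ... F^{n-1}G]. Since this subspace is F-invariant and contains the columns of G, the triple (VᵀFV, VᵀG, HV) realizes the same input–output map from x^0 = 0.

```python
import numpy as np

def restrict_to_controllable(F, G, H, tol=1e-10):
    """Restrict (F, G, H) to the span of the controllability matrix."""
    n = F.shape[0]
    C = np.hstack([np.linalg.matrix_power(F, i) @ G for i in range(n)])
    U, s, _ = np.linalg.svd(C)
    V = U[:, : int(np.sum(s > tol))]   # orthonormal basis, k columns
    return V.T @ F @ V, V.T @ G, H @ V

# 3-state example whose third state is unreachable from the input.
F = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0]])
G = np.array([[0.0], [1.0], [0.0]])
H = np.array([[1.0, 0.0, 1.0]])
Fr, Gr, Hr = restrict_to_controllable(F, G, H)
print(Fr.shape)  # (2, 2): the controllable part is two-dimensional
# Markov parameters H F^i G, hence the i/o map from x0 = 0, are preserved:
print(np.allclose(Hr @ Fr @ Gr, H @ F @ G))
```

The dual quotient step (factoring out the kernel of (8)) can be coded the same way with an orthonormal basis of the unobservable subspace.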
and the projected system has the same input–output map as the original locally around x^0.

If the leaf space of the foliation induced by E^⊥ admits the structure of a manifold, then the reduced system can be defined globally on it.

Sussmann (1973) and Sussmann (1977) studied minimal realizations of analytic nonlinear systems.

The state spaces of two minimal nonlinear systems need not be diffeomorphic. Consider the system

ẋ = u,  y = sin x

where x, u are scalars. We can take the state space to be either M = IR or M = S^1, and we will realize the same input–output mapping. These two state spaces are certainly not diffeomorphic, but one is a covering space of the other.

Lie Jet and Approximations

Consider two initialized nonlinear controlled dynamics

ẋ = f^0(x) + Σ_{j=1}^m f^j(x) u_j,  x(0) = x^0    (10)

ż = g^0(z) + Σ_{j=1}^m g^j(z) u_j,  z(0) = z^0    (11)

Suppose that (10) satisfies the nonlinear controllability rank condition (7). Further, suppose that there is a smooth mapping Φ(x) = z and constants M > 0, δ > 0 such that for any ‖u(t)‖ < 1, the corresponding trajectories x(t) and z(t) satisfy

‖Φ(x(t)) - z(t)‖ < M t^{k+1}    (12)

for 0 ≤ t < δ.

Then it is not hard to show that the linear map L = ∂Φ/∂x(x^0) takes brackets up to order k of the vector fields f^j evaluated at x^0 into the corresponding brackets of the vector fields g^j evaluated at z^0,

L [f^{j_l}, [..., [f^{j_2}, f^{j_1}] ...]](x^0) = [g^{j_l}, [..., [g^{j_2}, g^{j_1}] ...]](z^0)    (13)

for 1 ≤ l ≤ k.

On the other hand, if there is a linear map L such that (13) holds for 1 ≤ l ≤ k, then there exists a smooth mapping Φ(x) = z and constants M > 0, δ > 0 such that for any ‖u(t)‖ < 1, the corresponding trajectories x(t) and z(t) satisfy (12).

The k-Lie jet of (10) at x^0 is the tree of brackets [f^{j_l}, [..., [f^{j_2}, f^{j_1}] ...]](x^0) for 1 ≤ l ≤ k. In some sense, these are the coordinate-free Taylor series coefficients of (10) at x^0.

The dynamics (10) is free-nilpotent of degree k if all these brackets are as linearly independent as possible consistent with skew symmetry and the Jacobi identity, and all higher degree brackets are zero. If it is free to degree k, the controlled dynamics (10) can be used to approximate any other controlled dynamics (11) to degree k. Because it is nilpotent, integrating (10) reduces to repeated quadratures. If all brackets with two or more f^i, 1 ≤ i ≤ m, are zero at x^0, then (10) is linear in appropriate coordinates. If all brackets with three or more f^i, 1 ≤ i ≤ m, are zero at x^0, then (10) is quadratic in appropriate coordinates. The references Krener and Schaettler (1988) and Krener (2010a, b) discuss the structure of the reachable sets for such systems.

Disturbance Decoupling

Consider a control system affected by a disturbance input w(t):

ẋ = f(x) + g(x)u + b(x)w
y = h(x)    (14)

The disturbance decoupling problem is to find a feedback u = α(x) so that, in the closed-loop system, the output y(t) is not affected by the disturbance w(t). Wonham and Morse (1970) solved this problem for a linear system

ẋ = Fx + Gu + Bw
y = Hx    (15)
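For the linear problem (15), the geometric object at the heart of the Wonham–Morse solution, the maximal F,G-invariant subspace V^max contained in ker H, can be computed by the classical fixed-point iteration V_0 = ker H, V_{i+1} = ker H ∩ F^{-1}(V_i + im G). The following numerical sketch is ours (illustrative, not from the original entry; the chain example is an arbitrary assumption):

```python
import numpy as np

def nullspace(A, tol=1e-10):
    """Orthonormal basis of ker A via SVD."""
    _, s, Vt = np.linalg.svd(A)
    return Vt[int(np.sum(s > tol)):].T

def vmax(F, G, H, tol=1e-10):
    """Maximal (F,G)-invariant subspace contained in ker H:
    V_0 = ker H,  V_{i+1} = ker H ∩ F^{-1}(V_i + im G)."""
    n = F.shape[0]
    V = nullspace(H)
    for _ in range(n):                       # fixed point in at most n steps
        U, s, _ = np.linalg.svd(np.hstack([V, G]))
        Q = U[:, : int(np.sum(s > tol))]     # basis of V + im G
        P = np.eye(n) - Q @ Q.T              # projector onto its complement
        # x lies in F^{-1}(V + im G) iff P F x = 0; intersect with ker H:
        V_new = nullspace(np.vstack([P @ F, H]))
        if V_new.shape[1] == V.shape[1]:
            break
        V = V_new
    return V_new

# Chain x1 <- x2 <- x3, control entering x2, disturbance entering x3.
F = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])
G = np.array([[0.0], [1.0], [0.0]])
H = np.array([[1.0, 0.0, 0.0]])
B = np.array([[0.0], [0.0], [1.0]])
V = vmax(F, G, H)
# Decoupling is solvable iff im B lies inside V^max (here it does:
# the feedback u = -x3 cancels the disturbance path through x2).
solvable = np.linalg.matrix_rank(np.hstack([V, B])) == V.shape[1]
print(V.shape[1], solvable)  # 1 True
```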
To do so, they introduced the concept of an F,G-invariant subspace. A subspace V ⊂ IR^n is F,G-invariant if

FV ⊂ V + G    (16)

where G is the span of the columns of G. It is easy to see that V is F,G-invariant iff there exists a K ∈ IR^{m×n} such that

(F + GK)V ⊂ V    (17)

The feedback gain K is called a friend of V.

It is easy to see from (16) that if V^i is F,G-invariant for i = 1, 2, then V^1 + V^2 is also. So there exists a maximal F,G-invariant subspace V^max in the kernel of H. Wonham and Morse showed that the linear disturbance decoupling problem is solvable iff B ⊂ V^max, where B is the span of the columns of B.

Isidori et al. (1981a) and independently Hirschorn (1981) solved the nonlinear disturbance decoupling problem. A distribution D is locally f,g-invariant if

[f, D] ⊂ D + G
[g^j, D] ⊂ D + G    (18)

for j = 1, ..., m, where G is the distribution spanned by the columns of g. In Isidori et al. (1981b), it is shown that if D is locally f,g-invariant, then so is its involutive closure.

A distribution D is f,g-invariant if there exist α(x) ∈ IR^{m×1} and invertible β(x) ∈ IR^{m×m} such that

[f + gα, D] ⊂ D
[Σ_j g^j β_{jk}, D] ⊂ D    (19)

for k = 1, ..., m. It is not hard to see that an f,g-invariant distribution is locally f,g-invariant. It is shown in Isidori et al. (1981b) that if D is a locally f,g-invariant distribution, then locally there exist α(x) and β(x) so that (19) holds. Furthermore, if the state space is simply connected, then α(x) and β(x) exist globally, but the matrix field β(x) may fail to be invertible at some x.

From (18), it is clear that if D^i is locally f,g-invariant for i = 1, 2, then so is D^1 + D^2. Hence, there exists a maximal locally f,g-invariant distribution D^max in the kernel of dh. Moreover, this distribution is involutive. The disturbance decoupling problem is locally solvable iff the columns of b(x) are contained in D^max. If M is simply connected, then the disturbance decoupling problem is globally solvable iff the columns of b(x) are contained in D^max.

Conclusion

We have briefly described the role that differential geometric concepts have played in the development of controllability, observability, minimality, approximation, and decoupling of nonlinear systems.

Cross-References

Feedback Linearization of Nonlinear Systems
Lie Algebraic Methods in Nonlinear Control
Nonlinear Zero Dynamics

Bibliography

Arnol'd VI (1983) Geometrical methods in the theory of ordinary differential equations. Springer, Berlin
Brockett RW (1972) Systems theory on group manifolds and coset spaces. SIAM J Control 10:265–284
Chow WL (1939) Über Systeme von linearen partiellen Differentialgleichungen erster Ordnung. Math Ann 117:98–105
Gauthier JP, Hammouri H, Othman S (1992) A simple observer for nonlinear systems with applications to bioreactors. IEEE Trans Autom Control 37:875–880
Griffith EW, Kumar KSP (1971) On the observability of nonlinear systems, I. J Math Anal Appl 35:135–147
Haynes GW, Hermes H (1970) Non-linear controllability via Lie theory. SIAM J Control 8:450–460
Hermann R, Krener AJ (1977) Nonlinear controllability and observability. IEEE Trans Autom Control 22:728–740
Hermes H (1994) Large and small time local controllability. In: Proceedings of the 33rd IEEE conference on decision and control, vol 2, pp 1280–1281
Hirschorn RM (1981) (A,B)-invariant distributions and the disturbance decoupling of nonlinear systems. SIAM J Control Optim 19:1–19
Isidori A, Krener AJ, Gori Giorgi C, Monaco S (1981a) Nonlinear decoupling via feedback: a differential
284 Disaster Response Robot
geometric approach. IEEE Trans Autom Control 26:331–345
Isidori A, Krener AJ, Gori Giorgi C, Monaco S (1981b) Locally (f,g) invariant distributions. Syst Control Lett 1:12–15
Kostyukovskii YML (1968a) Observability of nonlinear controlled systems. Autom Remote Control 9:1384–1396
Kostyukovskii YML (1968b) Simple conditions for observability of nonlinear controlled systems. Autom Remote Control 10:1575–1584
Kou SR, Elliot DL, Tarn TJ (1973) Observability of nonlinear systems. Inf Control 22:89–99
Krener AJ (1971) A generalization of the Pontryagin maximal principle and the bang-bang principle. PhD dissertation, University of California, Berkeley
Krener AJ (1974) A generalization of Chow's theorem and the bang-bang theorem to nonlinear control problems. SIAM J Control 12:43–52
Krener AJ (1975) Local approximation of control systems. J Differ Equ 19:125–133
Krener AJ (2002a) The convergence of the extended Kalman filter. In: Rantzer A, Byrnes CI (eds) Directions in mathematical systems theory and optimization. Springer, Berlin, pp 173–182. Corrected version available at arXiv:math.OC/0212255 v. 1
Krener AJ (2002b) The convergence of the minimum energy estimator. In: Kang W, Xiao M, Borges C (eds) New trends in nonlinear dynamics and control, and their applications. Springer, Heidelberg, pp 187–208
Krener AJ (2010a) The accessible sets of linear free nilpotent control systems. In: Proceedings of NOLCOS 2010, Bologna
Krener AJ (2010b) The accessible sets of quadratic free nilpotent control systems. Commun Inf Syst 11:35–46
Krener AJ (2013) Feedback linearization of nonlinear systems. In: Baillieul J, Samad T (eds) Encyclopedia of systems and control. Springer
Krener AJ, Schaettler H (1988) The structure of small time reachable sets in low dimensions. SIAM J Control Optim 27:120–147
Lobry C (1970) Contrôlabilité des systèmes non linéaires. SIAM J Control 8:573–605
Sussmann HJ (1973) Minimal realizations of nonlinear systems. In: Mayne DQ, Brockett RW (eds) Geometric methods in systems theory. D. Reidel, Dordrecht
Sussmann HJ (1975) A generalization of the closed subgroup theorem to quotients of arbitrary manifolds. J Differ Geom 10:151–166
Sussmann HJ (1977) Existence and uniqueness of minimal realizations of nonlinear systems. Math Syst Theory 10:263–284
Sussmann HJ, Jurdjevic VJ (1972) Controllability of nonlinear systems. J Differ Equ 12:95–116
Wonham WM, Morse AS (1970) Decoupling and pole assignment in linear multivariable systems: a geometric approach. SIAM J Control 8:1–18

Disaster Response Robot

Satoshi Tadokoro
Tohoku University, Sendai, Japan

Abstract

Disaster response robots are robotic systems used for preventing the worsening of disaster damage under emergent situations. Robots for natural disasters (water disaster, volcano eruption, earthquakes, landslides, and fire) and man-made disasters (explosive ordnance disposal, CBRNE disasters, Fukushima Daiichi nuclear power plant accident) are introduced. Technical challenges are described on the basis of a generalized data flow.

Keywords

Rescue robot; Response robot

Introduction

Disaster response robots are robotic systems used for preventing the worsening of disaster damage under emergent situations, such as for search and rescue, recovery construction, etc.

A disaster changes its state as time passes. The state starts as an unforeseen occurrence and proceeds to the prevention phase, emergency response phase, recovery phase, and revival phase. Although a disaster response robot usually means a system for disaster response and recovery in a narrow sense, a system used in every phase of a disaster can be called a disaster response robot in a broad sense.

When parties of firefighters and military personnel respond to disasters, robots are among the technical equipment used. The purposes of robots are (1) to perform tasks that are impossible/difficult to perform by humans and conventional equipment, (2) to reduce responders' risk of inflicting secondary damage, and (3) to improve rapidity/efficiency of tasks, by using remote/automatic robot equipment.
Response Robots for Natural Disasters

Water Disaster
Underwater robots (ROVs, remotely operated vehicles) are deployed to responder organizations in preparation for water damage such as that caused by tsunami, flood, cataract, and accidents in the sea and rivers. They are equipped with cameras and sonars and remotely controlled by crews via tether from land or shipboard within an area of several tens of meters for victim search and damage investigation. After the Great Eastern Japan Earthquake in 2011, the Self Defense Force and volunteers of the International Rescue System Institute (IRS) and the Center for Robot-Assisted Search and Rescue (CRASAR) used various types of ROVs, such as the SARbot shown in Fig. 1, for victim search and debris investigation in the port.

Volcano Eruption
In order to reduce risk in monitoring and recovery construction at volcano eruptions, application of robotics and remote systems is highly desired. Various types of UAVs (unmanned aerial vehicles), such as small-sized robot helicopters and airplanes, have been used for this purpose.

An unmanned construction system consists of teleoperated robot backhoes, trucks, and bulldozers with wireless relaying cars and camera vehicles, as shown in Fig. 2, and is remotely controlled from an operator vehicle. It has been used since the 1990s for remote civil engineering works from a distance of a few kilometers.

Structural Collapse by Earthquakes, Landslides, etc.
Small-sized UGVs (unmanned ground vehicles) were developed for victim search and monitoring in confined spaces of collapsed buildings and underground structures. VGTV X-treme, shown in Fig. 3, is a tracked vehicle remotely operated via a tether. It was used for victim search at mine accidents and the 9/11 terror attack. The active scope camera shown in Fig. 4 is a serpentine robot like a fiberscope and has been used for forensic investigation of structural collapse accidents.

Fire
Large-scale fires in chemical plants and forests sometimes have a high risk, and firefighters cannot approach near them. Remote-controlled robots with firefighting nozzles for water and chemical extinguishing agents are deployed.

Disaster Response Robot, Fig. 1 SARbot (Courtesy of SeaBotix Inc.) http://www.seabotix.com/products/sarbot.htm
Disaster Response Robot, Fig. 2 Unmanned construction system (Courtesy of Society for Unmanned Construction Systems) http://www.kenmukyou.gr.jp/f_souti.htm

Disaster Response Robot, Fig. 3 VGTV X-treme (Courtesy of Recce Robotics) http://www.recce-robotics.com/vgtv.html

Disaster Response Robot, Fig. 4 Active scope camera (Courtesy of International Rescue System Institute)
Large-sized robots can discharge large volumes of the fluid with water cannons, whereas small-sized robots have better mobility.

Response Robots for Man-Made Disasters

Explosive Ordnance Disposal (EOD)
Detection and disposal of explosive ordnance is one of the most dangerous tasks. TALON, PackBot, and Telemax are widely used in military and explosive ordnance disposal teams worldwide. Telemax has an arm with seven degrees of freedom on a tracked vehicle with four sub-tracks, as shown in Fig. 5. It can observe narrow spaces like overhead lockers of airplanes and the bottom of automobiles by cameras, manipulate objects by the arm, and deactivate explosives by a disrupter.

CBRNE Disasters
CBRNE (chemical, biological, radiological, nuclear, and explosive) disasters have a high risk and can cause large-scale damage because humans cannot detect contamination by the Hazmat (hazardous materials). Application of robotic systems is highly expected for this disaster. PackBot has sensors for toxic industrial chemicals (TIC), blood agents, blister agents, volatile organic compounds (VOCs), radiation, etc., as options and can measure the Hazmat in dangerous confined spaces (Fig. 6). Quince was developed for research into technical issues of UGVs at CBRNE disasters and has high mobility on rough terrain (Fig. 7).

Fukushima Daiichi Nuclear Power Plant Accident
At the Fukushima Daiichi nuclear power plant accident caused by tsunami in 2011, various disaster response robots were applied. They contributed to the cold shutdown and decommissioning of the plant. For example, PackBot and Quince gave essential data for task planning by shooting images and radiation measurement in nuclear reactor buildings there. The unmanned construction system removed debris outdoors that was contaminated by radiological materials and reduced the radiation rate there significantly.

Group INTRA in France and KHG in Germany are organizations for responding to nuclear plant accidents. They are equipped with robots and remote-controlled construction machines for radiation measurement, decontamination, and construction in emergencies. In Japan, the Assist Center for Nuclear Emergencies was established after the Fukushima accident.

Disaster Response Robot, Fig. 5 Telemax (Courtesy of Cobham Mission Equipment) http://www.cobham.com/about-cobham/mission-systems/about-us/mission-equipment/unmanned-systems/products-and-services/remote-controlled-robotic-solutions/telemax-explosive-ordnance-(eod)-robot.aspx
Disaster Response Robot, Fig. 6 PackBot (Courtesy of iRobot) http://www.irobot.com/us/learn/defense/packbot/Specifications.aspx

Disaster Response Robot, Fig. 7 Quince (Courtesy of International Rescue System Institute)

Summary and Future Directions

The data flow of disaster response robots is generally described by a feedback system, as shown in Fig. 8. Robots change the states of objects and the environment by movement and task execution. Sensors measure and recognize them, and their feedback enables the robots' autonomous motion and work. The sensed data are shown to operators via communication, data processing, and a human interface. The operators give commands of motion and work to the system via the human interface. The system recognizes and transmits them to the robot.
Disaster Response Robot, Fig. 8 Data flow of remotely controlled disaster response robots

Each functional block has its own technical challenges to be fulfilled under the extreme environments of disaster spaces, in response to the objectives and the conditions. They should be solved technically in order to improve robot performance. They include insufficient mobility in disaster spaces (steps, gaps, slippage, narrow space, obstacles, etc.), deficient workability (dexterity, accuracy, speed, force, work space, etc.), poor sensors and sensor data processing (image, recognition, etc.), lack of reliability and performance of autonomy (robot intelligence, multiagent collaboration, etc.), issues of wireless and wired communication (instability, delay, capacity, tether handling, etc.), operators' limitations (situation awareness, decision ability, fatigue, mistakes, etc.), basic performances (explosion proofing, weight, durability, portability, etc.), and system integration that combines the components into the solution. Mission-critical planning and execution, including human factors, training, role sharing, logistics, etc., have to be considered at the same time.

Research into systems and control is expected to solve the abovementioned challenges of components and systems. For example, intelligent control is essential for mobility and workability under extreme conditions; control of feedback systems including long delay and dynamic instability, control of human-in-the-loop systems, and system integration of heterogeneous systems are important research topics of systems and control.

In the research field of disaster robotics, various competitions of practical robots have been held, e.g., RoboCupRescue targeting CBRNE disasters, ELROB and euRathlon for field activities, MAGIC for multi-robot autonomy, and the DARPA Robotics Challenge for humanoid robots in nuclear disasters. These competitions seek to stimulate solutions of the above-mentioned technical issues in different environments by providing practical test beds for advanced technology developments.

Cross-References

Robot Teleoperation
Walking Robots
Wheeled Robots

Bibliography

ELROB (2013) www.elrob.org
Group INTRA (2013) www.groupe-intra.com
KHG (2013) www.khgmbh.de
Murphy R (2014) Disaster robotics. MIT, Cambridge
RoboCup (2013) www.robocup.org
Siciliano B, Khatib O (eds) (2008) Springer handbook of robotics, 1st edn. Springer, Berlin
Siciliano B, Khatib O (eds) (2014) Springer handbook of robotics, 2nd edn. Springer, Berlin
Tadokoro S (ed) (2010) Rescue robotics: DDT project on robots and systems for urban search and rescue. Springer, London
Tadokoro S, Seki S, Asama H (2013) Priority issues of disaster robotics in Japan. In: Proceedings of the IEEE region 10 humanitarian technology conference, Sendai, 27–29 Aug 2013
290 Discrete Event Systems and Hybrid Systems, Connections Between
Discrete Event Systems and Hybrid Systems, Connections Between

Alessandro Giua
DIEE, University of Cagliari, Cagliari, Italy
LSIS, Aix-en-Provence, France

Abstract

The causes of the complex behavior typical of hybrid systems are multifarious and are commonly explained in the literature using paradigms that are mainly focused on the connections between time-driven and hybrid systems. In this entry, we recall some of these paradigms and further explore the connections between discrete event and hybrid systems from other perspectives. In particular, the role of abstraction in passing from a hybrid model to a discrete event one and vice versa is discussed.

Keywords

Hybrid system; Logical discrete event system; Timed discrete event system

Introduction

Hybrid systems combine the dynamics of both time-driven systems and discrete event systems. The evolution of a time-driven system can be described by a differential equation (in continuous time) or by a difference equation (in discrete time). An example of such a system is the tank shown in Fig. 1 whose behavior, assuming the tank is not full, is ruled in continuous time t by the differential equation

d/dt V(t) = q_1(t) - q_2(t)

where V is the volume of liquid and q_1 and q_2 are, respectively, the input and output flows.

A discrete event system (Lafortune and Cassandras 2007; Seatzu et al. 2012) evolves in accordance with the abrupt occurrence, at possibly unknown irregular intervals, of physical events. Its states may have logical or symbolic, rather than numerical, values that change in response to events which may also be described in non-numerical terms. An example of such a system is a robot that loads parts on a conveyor, whose behavior is described by the automaton in Fig. 2. The robot can be "idle," "loading" a part, or in an "error" state when a part is incorrectly positioned. The events that drive its evolution are a (grasp a part), b (part correctly loaded), c (part incorrectly positioned), and d (part repositioned). In a logical discrete event system (see Supervisory Control of Discrete-Event Systems), the timings of event occurrences are ignored, while in a timed discrete event system (see Models for Discrete Event Systems: An Overview), they are described by means of a suitable timing structure.

In a hybrid system (see Hybrid Dynamical Systems, Feedback Control of), time-driven and event-driven evolutions are simultaneously present and mutually dependent. As an example, consider a room where a thermostat maintains the temperature x(t) between x_a = 20 °C and x_b = 22 °C by turning a heat pump on and off. Due to the exchange with the external environment at temperature x_e ≤ x(t), when the pump is off, the room temperature derivative is

d/dt x(t) = -k[x(t) - x_e]

where k is a suitable coefficient, while when the pump is on, the room temperature derivative is

d/dt x(t) = h(t) - k[x(t) - x_e]

where the positive term h(t) is due to the heat pump. The hybrid automaton that describes this system is shown in Fig. 3.

The causes of the complex behavior typical of hybrid systems are multifarious, and among the paradigms commonly used in the literature to describe them, we mention three.
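The logical discrete event model of the loading robot described above is easy to encode and exercise in software. The sketch below is illustrative (not part of the original entry): since Fig. 2 is not reproduced here, the transition structure, in particular that event d returns the robot to the loading state, is our assumption based on the event descriptions.

```python
# Logical DES model of the part-loading robot (cf. Fig. 2).
# The transition structure is an assumption inferred from the text.
TRANSITIONS = {
    ("idle",    "a"): "loading",   # a: grasp a part
    ("loading", "b"): "idle",      # b: part correctly loaded
    ("loading", "c"): "error",     # c: part incorrectly positioned
    ("error",   "d"): "loading",   # d: part repositioned
}

def run(state, events):
    """Play a string of events; an event not enabled in the current
    state is blocked, which we signal with an exception."""
    for e in events:
        if (state, e) not in TRANSITIONS:
            raise ValueError(f"event {e!r} not enabled in state {state!r}")
        state = TRANSITIONS[(state, e)]
    return state

print(run("idle", "acdb"))  # grasp, misposition, reposition, load -> idle
```

Note that the model carries no timing information at all: only the order of events matters, which is exactly what the logical abstraction discussed in this entry retains.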
Discrete Event Systems and Hybrid Systems, Connections Between, Fig. 1 A tank

Discrete Event Systems and Hybrid Systems, Connections Between, Fig. 2 A machine with failures

Discrete Event Systems and Hybrid Systems, Connections Between, Fig. 3 Hybrid automaton of the thermostat

• Logically controlled systems. Often, a physical system with a time-driven evolution is controlled in a feedback loop by means of a controller that implements discrete computations and event-based logic. This is the case of the thermostat mentioned above. Classes of systems that can be described by this paradigm are embedded systems or, when the feedback loop is closed through a communication network, cyber-physical systems.
• State-dependent mode of operation. A time-driven system can have different modes of evolution depending on its current state. As an example, consider a bouncing ball. While the ball is above the ground (vertical position h > 0), its behavior is that of a falling body subject to a constant gravitational force. However, when the ball collides with the ground (vertical position h = 0), its behavior is that of a (partially) elastic body that bounces up. Classes of systems that can be described by this paradigm are piecewise affine systems and linear complementarity systems.
• Variable structure systems. Some systems may change their structure, assuming different configurations, each characterized by a different behavior. As an example, consider a multicell voltage converter composed of a cascade of elementary commutation cells: controlling some switches, it is possible to insert or remove cells so as to produce a desired output voltage signal. Classes of systems that can be described by this paradigm are switched systems.

While these are certainly appropriate and meaningful paradigms, they are mainly focused on the connections between time-driven and
292 Discrete Event Systems and Hybrid Systems, Connections Between

hybrid systems. In the rest of this entry, we


will discuss the connections between discrete
event and hybrid systems from other different
perspectives. The focus is strictly on modeling,
thus approaches for analysis or control will not
be discussed. Discrete Event Systems Fig. 4 Logical discrete
and Hybrid Systems, event model of the
From Hybrid Systems to Discrete Connections Between, thermostat
Event System by Modeling
Abstraction abstraction of the hybrid model in Fig. 3
A system is a physical object, while a model is a (more or less accurate) mathematical description of its behavior that captures those features that are deemed most significant. In the previous pages, we have introduced different classes of systems, such as "time-driven systems," "discrete event systems," and "hybrid systems," but properly speaking, this taxonomy pertains to the models, because the terms "time driven," "discrete event," or "hybrid" should be used to classify the mathematical description and not the physical object.

According to this view, a discrete event model is often perceived as a high-level description of a physical system where the time-driven dynamics are ignored or, at best, approximated by a timing structure. This procedure to derive a simpler model in a way that preserves the properties being analyzed, while hiding the details that are of no interest, is called abstraction (Alur et al. 2000).

Consider, as an example, the thermostat in Fig. 3. In such a system, the time-driven evolution determines a change in the temperature, which in turn – reaching a threshold – triggers the occurrence of an event that changes the discrete state. Assume one does not care about the exact form this triggering mechanism takes and is only interested in determining if the heat pump is turned on or off. In such a case, we can completely abstract the time-driven evolution, obtaining a logical discrete event model such as the automaton in Fig. 4, where label a denotes the event "the temperature drops below 20 °C" and label b denotes the event "the temperature rises above 22 °C."

For some purposes, e.g., to determine the utilization rate of the heat pump and thus its operating cost, the model in Fig. 4 is inadequate. In such a case, one can consider a less coarse abstraction, obtaining a timed discrete event model such as the automaton in Fig. 5. Here, to each event is associated a firing delay: as an example, δ_a represents the time it takes – when the pump is off – to cool down until the lower temperature threshold is reached and event a occurs. The delay may be a deterministic value or even a random one, to take into account the uncertainty due to non-modeled time-varying parameters such as the temperature of the external environment. Note that a new state (START) and a new event b′ have now been introduced to capture the transient phase in which the room temperature, from the initial value x(0) = 15 °C, reaches the higher temperature threshold: in fact, event b′ has a delay greater than the delay of event b.

Timed Discrete Event Systems Are Hybrid Systems

Properly speaking, all timed discrete event systems may also be seen as hybrid systems, if one considers the dynamics of the timers – which specify the event occurrences – as elementary time-driven evolutions. In fact, the simplest model of hybrid systems is the timed automaton introduced by Alur and Dill (1994), whose main feature is that each continuous variable x(t) has a constant derivative ẋ(t) = 1 and thus can only describe the passage of time. Incidentally, we note that the term "timed automaton" is also used in the area of discrete event systems (Lafortune and Cassandras 2007) to denote an automaton in which a timing structure is associated with the events: such an example was shown in Fig. 5. To avoid any confusion, in the following we call the former model an Alur-Dill automaton and the latter a timed DES automaton.
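As a concrete illustration (not part of the original entry), a timed DES automaton such as the thermostat model can be simulated by firing, in each discrete state, the single enabled event once its delay has elapsed. The state and event names follow the discussion above; the numerical delay values are purely hypothetical.

```python
# Timed DES automaton of the thermostat (states and events as in Fig. 5);
# the delay values below are hypothetical, chosen only for illustration.
delays = {"b'": 300.0, "a": 120.0, "b": 60.0}      # firing delays
transitions = {"START": ("b'", "OFF"),              # transient heating phase
               "OFF": ("a", "ON"),                  # cool down to lower threshold
               "ON": ("b", "OFF")}                  # heat up to upper threshold

def run(horizon):
    """Return the timed event trace (time, event, next state) up to `horizon`."""
    state, t, trace = "START", 0.0, []
    while True:
        event, nxt = transitions[state]
        if t + delays[event] > horizon:
            return trace
        t += delays[event]            # the timer of the enabled event expires
        trace.append((t, event, nxt))
        state = nxt

print(run(500.0))  # [(300.0, "b'", 'OFF'), (420.0, 'a', 'ON'), (480.0, 'b', 'OFF')]
```

Replacing the deterministic delays with random samples (e.g., exponentially distributed) yields the stochastic timing structures mentioned in the text.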
In Fig. 6 is shown an Alur-Dill automaton that describes the thermostat, where the time-driven dynamics have been abstracted and only the timing of event occurrences is modeled, as in the timed DES automaton in Fig. 5. The only continuous variable is the value of a timer δ: when it goes beyond a certain threshold (e.g., δ > δ_a), an event occurs (e.g., event a), changing the discrete state (e.g., from OFF to ON) and resetting the timer to zero.

It is rather obvious that the behavior of the Alur-Dill automaton in Fig. 6 is equivalent to the behavior of the timed DES automaton in Fig. 5. In the former model, the notion of time is encoded by means of an explicit continuous variable δ. In the latter model, the notion of time is implicitly encoded by the timer that during an evolution is associated with each event. In both cases, however, the overall state of the system is described by a pair (ℓ(t), x(t)), where the first element ℓ takes values in the discrete set {START, ON, OFF} and the second element is a vector (in this particular case with a single component) of timer valuations.

It should be pointed out that an Alur-Dill automaton may have a more complex structure than that shown in Fig. 6: as an example, the guard associated with a transition, i.e., the set of timer values that enable it, can be an arbitrary rectangular set. However, the same is also true for a timed discrete event system: several policies can be used to define the time intervals enabling an event (enabling policy) or to specify when a timer is reset (memory policy) (Ajmone Marsan et al. 1995). Furthermore, timed discrete event systems can have arbitrary stochastic timing structures (e.g., semi-Markovian processes, Markov chains, and queuing networks (Lafortune and Cassandras 2007)), not to mention the possibility of having an infinite discrete state space (e.g., timed Petri nets (Ajmone Marsan et al. 1995; David and Alla 2004)). As a result, we can say that timed DES automata are far more general than Alur-Dill automata and represent a meaningful subclass of hybrid systems.

From Discrete Event Systems to Hybrid Systems by Fluidization

The computational complexity involved in the analysis and optimization of real-scale problems often becomes intractable with discrete event models due to the very large number of reachable states, and a technique that has shown to be effective in reducing this complexity is called fluidization (▶ Applications of Discrete-Event Systems). It should be noted that the derivation of a fluid (i.e., hybrid) model from a discrete event one is yet another example of abstraction, albeit going in the opposite direction with respect to the examples discussed in the section "From Hybrid Systems to Discrete Event Systems by Modeling Abstraction" above.

The main drive that motivated the fluidization approach derives from the observation that some discrete event systems are "heavily populated," in the sense that there are many identical items in some component (e.g., clients in a queue). Fluidization consists in replacing the integer counter of the number of items by a real number and in approximating the "fast" discrete event

Discrete Event Systems and Hybrid Systems, Connections Between, Fig. 5 Timed discrete event model of the thermostat
[Fig. 6 (schematic): locations START, OFF, and ON, each with clock dynamics δ̇ = 1 and invariants {δ ≤ δ_b′}, {δ ≤ δ_a}, and {δ ≤ δ_b}, respectively; transitions START → OFF, OFF → ON, and ON → OFF with guards δ > δ_b′?, δ > δ_a?, and δ > δ_b?, each resetting δ := 0.]
Discrete Event Systems and Hybrid Systems, Connections Between, Fig. 6 Alur-Dill automaton of the thermostat
dynamics that describe how the counter changes by a continuous dynamics. This approach has been successfully used to study the performance optimization of fluid-queuing networks (Cassandras and Lygeros 2006) or Petri net models (Balduzzi et al. 2000; David and Alla 2004; Silva and Recalde 2004) with applications in domains such as manufacturing systems and communication networks. We also remark that in general, different fluid approximations are necessary to describe the same system, depending on its discrete state, e.g., in the manufacturing domain, machines working or down, buffers full or empty, and so on. Thus, the resulting model can be better described as a hybrid model, where different time-driven dynamics are associated to different discrete states.

There are many advantages in using fluid approximations. First, there is the possibility of considerable increase in computational efficiency because the simulation of a fluid model can often be performed much faster than that of its discrete event counterpart. Second, fluid approximations provide an aggregate formulation to deal with complex systems, thus reducing the dimension of the state space. Third, the resulting simple structures often allow explicit computation of performance measures. Finally, some design parameters in fluid models are continuous; hence, it is possible to use gradient information to speed up optimization and to perform sensitivity analysis (Balduzzi et al. 2000): in many cases, it has also been shown that fluid approximations do not introduce significant errors when carrying out performance analysis via simulation (▶ Perturbation Analysis of Discrete Event Systems).

Cross-References

▶ Applications of Discrete-Event Systems
▶ Hybrid Dynamical Systems, Feedback Control of
▶ Models for Discrete Event Systems: An Overview
▶ Perturbation Analysis of Discrete Event Systems
▶ Supervisory Control of Discrete-Event Systems

Bibliography

Ajmone Marsan M, Balbo G, Conte G, Donatelli S, Franceschinis G (1995) Modelling with generalized stochastic Petri nets. Wiley, Chichester/New York
Alur R, Dill DL (1994) A theory of timed automata. Theor Comput Sci 126:183–235
Alur R, Henzinger TA, Lafferriere G, Pappas GJ (2000) Discrete abstractions of hybrid systems. Proc IEEE 88(7):971–984
Balduzzi F, Giua A, Menga G (2000) First-order hybrid Petri nets: a model for optimization and control. IEEE Trans Robot Autom 16:382–399
Cassandras CG, Lygeros J (eds) (2006) Stochastic hybrid systems: recent developments and research. CRC Press, New York
David R, Alla H (2004) Discrete, continuous and hybrid Petri nets. Springer, Berlin
Lafortune S, Cassandras CG (2007) Introduction to discrete event systems, 2nd edn. Springer, Boston
Seatzu C, Silva M, van Schuppen JH (eds) (2012) Control of discrete event systems. Automata and Petri net perspectives. Lecture notes in control and information science, vol 433. Springer, London
Silva M, Recalde L (2004) On fluidification of Petri net models: from discrete to hybrid and continuous models. Annu Rev Control 28(2):253–266

Discrete Optimal Control

David Martín de Diego
Instituto de Ciencias Matemáticas (CSIC-UAM-UC3M-UCM), Madrid, Spain

Synonyms

DOC

Abstract

Discrete optimal control is a branch of mathematics which studies optimization procedures for controlled discrete-time models – that is, the optimization of a performance index associated with a discrete-time control system. This entry gives an introduction to the topic. The formulation of a general discrete optimal control problem is described, and applications to mechanical systems are discussed.
Keywords

Discrete mechanics; Discrete-time models; Symplectic integrators; Variational integrators

Definition

Discrete optimal control is a branch of mathematics which studies optimization procedures for controlled discrete-time models, that is, the optimization of a performance index associated with a discrete-time control system.

Motivation

Optimal control theory is a mathematical discipline with innumerable applications in both science and engineering. Discrete optimal control is concerned with control optimization for discrete-time models. Recently, great interest has appeared in discrete optimal control theory in developing numerical methods to optimally control real mechanical systems, for instance, autonomous robotic vehicles in natural environments, robotic arms, spacecraft, or underwater vehicles.

During the last years, a huge effort has been made toward the comprehension of the fundamental geometric structures appearing in dynamical systems, including control systems and optimal control systems. This new geometric understanding of those systems has made possible the construction of suitable numerical techniques for integration. A collection of ad hoc numerical methods is available for both dynamical and control systems. These methods have grown up according to the needs of research coming from different fields such as physics and engineering. However, a new breed of ideas in numerical analysis has started recently: methods that incorporate the geometry of the systems into the analysis, which allows faster and more accurate algorithms with fewer spurious effects than the traditional ones. All this gives birth to a new field called Geometric Integration (Hairer et al. 2002). For instance, numerical integrators for Hamiltonian systems should preserve the symplectic structure underlying the geometry of the system. If so, they are called symplectic integrators.

Another approach used by more and more authors is based on the theory of discrete mechanics and variational integrators to obtain geometric integrators preserving some of the geometry of the original system (Hussein et al. 2006; Marsden and West 2001; Wendlandt and Marsden 1997a, b) (see also the section "Discrete Mechanics"). These geometric integrators are easily adapted and applied to a wide range of mechanical systems: forced or dissipative systems, holonomically constrained systems, explicitly time-dependent systems, reduced systems with frictional contact, nonholonomic dynamics, and multisymplectic field theories, among others.

As before, in optimal control theory it is necessary to distinguish two kinds of numerical methods: the so-called direct and indirect methods. If we use direct methods, we first discretize the state and control variables, control equations, and cost functional, and then we solve a nonlinear optimization problem with constraints given by the discrete control equations, additional constraints, and boundary conditions (Bock and Plitt 1984; Bonnans and Laurent-Varin 2006; Hager 2001; Pytlak 1999). In this case, we typically need to solve a system of the type (see the section "Formulation of a General Discrete Optimal Control Problem")

  minimize F(X),  X = (q^0, …, q^N, u^1, …, u^N)
  with Ψ(X) = 0
       Φ(X) ≤ 0

On the other hand, indirect methods consist of solving numerically the boundary value problem obtained from the equations derived after applying Pontryagin's Maximum Principle.

The combination of direct methods and discrete mechanics allows one to obtain numerical control algorithms which are geometric structure preserving and exhibit a good long-time behavior (Bloch et al. 2013; Jiménez et al. 2013; Junge and Ober-Blöbaum 2005; Junge et al. 2006; Kobilarov 2008; Leyendecker et al. 2007; Ober-Blöbaum 2008; Ober-Blöbaum et al. 2011). Furthermore, it is possible to adapt many of
the techniques used for continuous controlled mechanical systems to the design of quantitatively and qualitatively accurate numerical methods for optimal control (reduction by symmetries, preservation of geometric structures, Lie group methods, etc.).

Formulation of a General Discrete Optimal Control Problem

Let M be an n-dimensional manifold, let x denote the state variables in M for an agent's environment, and let u ∈ U ⊆ R^m be the control or action that the agent chooses to accomplish a task or objective. Let f_d(x, u) ∈ M be the resulting state after applying the control u to the state x. For instance, x may be the configuration of a vehicle at time t and u its fuel consumption, and then f_d(x, u) is the new configuration of the vehicle at time t + h, with t, h > 0. Of course, we want to minimize the fuel consumption. Hence, the optimal control problem consists of finding the cheapest way to move the system from a given initial position to a final state. The problem can be mathematically described as follows: find a sequence of controls (u_0, u_1, …, u_{N−1}) and a sequence of states (x_0, x_1, …, x_N) such that

  x_{k+1} = f_d(k, x_k, u_k),   (1)

where x_k ∈ M, u_k ∈ U, and the total cost

  C_d = Σ_{k=0}^{N−1} C_d(k, x_k, u_k) + Φ_d(N, x_N)   (2)

is minimized, where Φ_d is a function of the final time and of the state at the final time (the terminal payoff) and C_d is a function depending on the discrete time, the state, and the control at each intermediate discrete time k (the running payoff).

To solve the discrete optimal control problem determined by Eqs. (1) and (2), it is possible to use the classical Lagrange multiplier approach. In this case, we consider the control equations (1) as constraint equations, associating a Lagrange multiplier with each constraint. Assume for simplicity that M = R^n. Then, we construct the augmented cost function

  C̃_d = Σ_{k=0}^{N−1} [ p_{k+1} (x_{k+1} − f_d(k, x_k, u_k)) + C_d(k, x_k, u_k) ] + Φ_d(N, x_N)   (3)

where p_k ∈ R^n, k = 1, …, N, are considered as the Lagrange multipliers. The notation xy is used for the scalar (inner) product x · y of two vectors in R^n.

From the pseudo-Hamiltonian function

  H_d(k, x_k, p_{k+1}, u_k) = p_{k+1} f_d(k, x_k, u_k) − C_d(k, x_k, u_k),

we deduce the necessary conditions for a constrained minimum:

  x_{k+1} = ∂H_d/∂p (k, x_k, p_{k+1}, u_k) = f_d(k, x_k, u_k)   (4)

  p_k = ∂H_d/∂q (k, x_k, p_{k+1}, u_k) = p_{k+1} ∂f_d/∂q (k, x_k, u_k) − ∂C_d/∂q (k, x_k, u_k)   (5)

  0 = ∂H_d/∂u (k, x_k, p_{k+1}, u_k) = p_{k+1} ∂f_d/∂u (k, x_k, u_k) − ∂C_d/∂u (k, x_k, u_k)   (6)

where 0 ≤ k ≤ N − 1. Moreover, we have the boundary conditions

  x_0 is given and p_N = −∂Φ_d/∂q (N, x_N)   (7)

The variable p_k is called the costate of the system, and Eq. (5) is called the adjoint equation. Observe that the recursion for x_k given by Eq. (4) develops forward in the discrete time, but the recursion for the costate variable is backward in the discrete time.
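The forward/backward structure of conditions (4)–(7) can be checked numerically on a minimal scalar example (the model and payoffs below are illustrative choices, not from the entry): f_d(k, x, u) = x + u, C_d = (x² + u²)/2, Φ_d = x_N²/2, with N = 1 and x_0 = 1, for which the optimal control u_0* = −x_0/2 is available in closed form.

```python
x0, N = 1.0, 1

def total_cost(u):
    """Cost (2): running payoffs C_d plus terminal payoff Phi_d."""
    x, J = x0, 0.0
    for k in range(N):
        J += 0.5 * (x * x + u[k] * u[k])   # C_d(k, x_k, u_k)
        x = x + u[k]                       # dynamics (1): x_{k+1} = x_k + u_k
    return J + 0.5 * x * x                 # Phi_d(N, x_N)

# For N = 1, dJ/du = u + (x0 + u) = 0 gives the optimum u* = -x0/2.
u_opt = [-x0 / 2.0]

# State recursion (4), forward in discrete time:
x = [x0, x0 + u_opt[0]]
# Costate: terminal condition (7), then adjoint equation (5), backward in time
# (here df_d/dq = 1 and dC_d/dq = x_k):
p = [0.0, -x[1]]
p[0] = p[1] * 1.0 - x[0]

# Stationarity (6): 0 = p_{k+1} df_d/du - dC_d/du = p_1 - u_0 at the optimum.
residual = p[1] - u_opt[0]
print(residual)   # 0.0
```

Perturbing u_opt increases total_cost, consistent with (4)–(7) being necessary conditions for a minimum.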
In the sequel, the following regularity condition is assumed:

  det( ∂²H_d / ∂u^a ∂u^b ) ≠ 0

where 1 ≤ a, b ≤ m and (u^a) ∈ U ⊆ R^m. Applying the implicit function theorem, we obtain from Eq. (6) that locally u_k = g(k, x_k, p_{k+1}). Defining the function

  H̃_d : Z × R^{2n} → R,  (k, q_k, p_{k+1}) ↦ H_d(k, q_k, p_{k+1}, u_k),

Equations (4) and (5) are rewritten as the following discrete Hamiltonian system:

  x_{k+1} = ∂H̃_d/∂p (k, x_k, p_{k+1})   (8)

  p_k = ∂H̃_d/∂q (k, x_k, p_{k+1})   (9)

The expression of the solutions of the optimal control problem as a discrete Hamiltonian system (under some regularity properties) is important since it indicates that the discrete evolution preserves symplecticity. A simple proof of this fact is the following (de León et al. 2007). Construct the function

  G_k(x_k, x_{k+1}, p_{k+1}) = H̃_d(k, x_k, p_{k+1}) − p_{k+1} x_{k+1},

with 0 ≤ k ≤ N − 1. For each fixed k:

  dG_k = ∂H̃_d/∂q (k, x_k, p_{k+1}) dx_k + ∂H̃_d/∂p (k, x_k, p_{k+1}) dp_{k+1} − p_{k+1} dx_{k+1} − x_{k+1} dp_{k+1}.

Thus, along solutions of Eqs. (8) and (9), we have that dG_k |_solutions = p_k dx_k − p_{k+1} dx_{k+1}, which implies dx_k ∧ dp_k = dx_{k+1} ∧ dp_{k+1}.

In the next section, we will study the case of discrete optimal control of mechanical systems. First, we will need an introduction to discrete mechanics and variational integrators.

Discrete Mechanics

Let Q be an n-dimensional differentiable manifold with local coordinates (q^i), 1 ≤ i ≤ n. We denote by TQ its tangent bundle, with induced coordinates (q^i, q̇^i). Let L: TQ → R be a Lagrangian function; the associated Euler–Lagrange equations are given by

  d/dt (∂L/∂q̇^i) − ∂L/∂q^i = 0,  1 ≤ i ≤ n.   (10)

These equations are a system of implicit second-order differential equations. Assume that the Lagrangian is regular, that is, the matrix (∂²L/∂q̇^i ∂q̇^j) is non-singular. It is well known that the origin of these equations is variational (see Marsden and West 2001). Variational integrators retain this variational character and also some of the key geometric properties of the continuous system, such as symplecticity and momentum conservation (see Hairer et al. 2002 and references therein). In the following, we summarize the main features of this type of numerical integrators (Marsden and West 2001). A discrete Lagrangian is a map L_d: Q × Q → R, which may be considered as an approximation of the integral action defined by a continuous Lagrangian L: TQ → R:

  L_d(q_0, q_1) ≈ ∫_0^h L(q(t), q̇(t)) dt,

where q(t) is a solution of the Euler–Lagrange equations for L with q(0) = q_0, q(h) = q_1, and h > 0 small enough.

Remark 1 The Cartesian product Q × Q is equipped with an interesting differential structure, called a Lie groupoid, which allows the extension of variational calculus to more general settings (see Marrero et al. 2006, 2010 for more details).

Define the action sum S_d: Q^{N+1} → R corresponding to the Lagrangian L_d by S_d = Σ_{k=1}^{N} L_d(q_{k−1}, q_k), where q_k ∈ Q for 0 ≤ k ≤ N and N is the number of steps. The discrete variational principle states that the solutions of
the discrete system determined by L_d must extremize the action sum given fixed endpoints q_0 and q_N. By extremizing S_d over q_k, 1 ≤ k ≤ N − 1, we obtain the system of difference equations

  D_1 L_d(q_k, q_{k+1}) + D_2 L_d(q_{k−1}, q_k) = 0,   (11)

or, in coordinates,

  ∂L_d/∂x^i (q_k, q_{k+1}) + ∂L_d/∂y^i (q_{k−1}, q_k) = 0,

where 1 ≤ i ≤ n, 1 ≤ k ≤ N − 1, and x, y denote the first n and the second n variables of the function L_d, respectively.

These equations are usually called the discrete Euler–Lagrange equations. Under some regularity hypotheses (the matrix D_{12} L_d(q_k, q_{k+1}) is regular), it is possible to define a (local) discrete flow Υ_{L_d}: Q × Q → Q × Q by Υ_{L_d}(q_{k−1}, q_k) = (q_k, q_{k+1}) from (11). Define the discrete Legendre transformations associated with L_d as

  F⁻L_d: Q × Q → T*Q,  (q_0, q_1) ↦ (q_0, −D_1 L_d(q_0, q_1)),
  F⁺L_d: Q × Q → T*Q,  (q_0, q_1) ↦ (q_1, D_2 L_d(q_0, q_1)),   (12)

and the discrete Poincaré–Cartan 2-form ω_d = (F⁺L_d)* ω_Q = (F⁻L_d)* ω_Q, where ω_Q is the canonical symplectic form on T*Q. The discrete algorithm determined by Υ_{L_d} preserves the symplectic form ω_d, i.e., Υ*_{L_d} ω_d = ω_d. Moreover, if the discrete Lagrangian is invariant under the diagonal action of a Lie group G, then the discrete momentum map J_d: Q × Q → g* defined by

  ⟨J_d(q_k, q_{k+1}), ξ⟩ = ⟨D_2 L_d(q_k, q_{k+1}), ξ_Q(q_{k+1})⟩

is preserved by the discrete flow. Therefore, these integrators are symplectic-momentum preserving. Here, ξ_Q denotes the fundamental vector field determined by ξ ∈ g, where g is the Lie algebra of G. As stated in Marsden and West (2001), discrete mechanics is inspired by discrete formulations of optimal control problems (see Cadzow 1970; Hwang and Fan 1967; Jordan and Polak 1964).

Discrete Optimal Control of Mechanical Systems

Consider a mechanical system whose configuration space is an n-dimensional differentiable manifold Q and whose dynamics is determined by a Lagrangian L: TQ → R. The control forces are modeled as a mapping f: TQ × U → T*Q, where f(v_q, u) ∈ T*_q Q, v_q ∈ T_q Q, and u ∈ U, with U the control space. Observe that this last definition also covers configuration- and velocity-dependent forces such as dissipation or friction (see Ober-Blöbaum et al. 2011).

The motion of the mechanical system is described by applying the principle of Lagrange–d'Alembert, which requires that the solutions q(t) ∈ Q must satisfy

  δ ∫_0^T L(q(t), q̇(t)) dt + ∫_0^T f(q(t), q̇(t), u(t)) δq(t) dt = 0,

where (q, q̇) are the local coordinates of TQ and where we consider arbitrary variations δq(t) ∈ T_{q(t)} Q with δq(0) = 0 and δq(T) = 0 (since we are prescribing fixed initial and final conditions (q(0), q̇(0)) and (q(T), q̇(T))).

As we consider an optimal control problem, the forces f must be chosen, if they exist, as the ones that extremize the cost functional

  ∫_0^T C(q(t), q̇(t), u(t)) dt + Φ(q(T), q̇(T), u(T)),   (13)

where C: TQ × U → R.

The optimal equations of motion can now be derived using Pontryagin's Maximum Principle. In general, it is not possible to integrate these equations explicitly. Then, it is necessary to apply a numerical method. In this work, using
discrete variational techniques, we first discretize the Lagrange–d'Alembert principle and then the cost functional. We obtain a numerical method that preserves some geometric features of the original continuous system, as described in the sequel.

Discretization of the Lagrangian and Control Forces
To discretize this problem, we replace the tangent space TQ by the Cartesian product Q × Q and the continuous curves by sequences q_0, q_1, …, q_N (we are using N steps, with time step h fixed, in such a way that t_k = kh and Nh = T). The discrete Lagrangian L_d: Q × Q → R is constructed as an approximation of the action integral in a single time step (see Marsden and West 2001), that is,

  L_d(q_k, q_{k+1}) ≈ ∫_{kh}^{(k+1)h} L(q(t), q̇(t)) dt.

We choose the following discretization for the external forces: f_d^±: Q × Q × U → T*Q, where U ⊆ R^m, m ≤ n, such that

  f_d^−(q_k, q_{k+1}, u_k) ∈ T*_{q_k} Q,
  f_d^+(q_k, q_{k+1}, u_k) ∈ T*_{q_{k+1}} Q.

f_d^+ and f_d^− are the right and left discrete forces (see Ober-Blöbaum et al. 2011).

Discrete Lagrange–d'Alembert Principle
Given such forces, we define the discrete Lagrange–d'Alembert principle, which seeks sequences {q_k}_{k=0}^{N} that satisfy

  δ Σ_{k=0}^{N−1} L_d(q_k, q_{k+1}) + Σ_{k=0}^{N−1} [ f_d^−(q_k, q_{k+1}, u_k) δq_k + f_d^+(q_k, q_{k+1}, u_k) δq_{k+1} ] = 0,

for arbitrary variations {δq_k}_{k=0}^{N} with δq_0 = δq_N = 0. After some straightforward manipulations, we arrive at the forced discrete Euler–Lagrange equations

  D_2 L_d(q_{k−1}, q_k) + D_1 L_d(q_k, q_{k+1}) + f_d^+(q_{k−1}, q_k, u_{k−1}) + f_d^−(q_k, q_{k+1}, u_k) = 0,   (14)

with k = 1, …, N − 1.

Boundary Conditions
For simplicity, we assume that the boundary conditions of the continuous optimal control problem are given by q(0) = x_0, q̇(0) = v_0, q(T) = x_T, q̇(T) = v_T. To incorporate these conditions into the discrete setting, we use both the continuous and discrete Legendre transformations. From Marsden and West (2001), given a forced system, we can define the discrete momenta

  p_k = −D_1 L_d(q_k, q_{k+1}) − f_d^−(q_k, q_{k+1}, u_k),
  p_{k+1} = D_2 L_d(q_k, q_{k+1}) + f_d^+(q_k, q_{k+1}, u_k).

From the continuous Lagrangian, we have the momenta

  p_0 = FL(x_0, v_0) = (x_0, ∂L/∂v (x_0, v_0)),
  p_T = FL(x_T, v_T) = (x_T, ∂L/∂v (x_T, v_T)).

Therefore, the natural choice of boundary conditions is

  x_0 = q_0,  x_T = q_N,
  FL(x_0, v_0) = −D_1 L_d(q_0, q_1) − f_d^−(q_0, q_1, u_0),
  FL(x_T, v_T) = D_2 L_d(q_{N−1}, q_N) + f_d^+(q_{N−1}, q_N, u_{N−1}),

which we add to the discrete optimal control problem.
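A minimal sketch of Eq. (14) for a unit point mass on the line, with L(q, q̇) = q̇²/2, the discrete Lagrangian L_d(q_0, q_1) = (q_1 − q_0)²/(2h), and the common equal splitting f_d^−(q_k, q_{k+1}, u_k) = f_d^+(q_k, q_{k+1}, u_k) = (h/2) u_k. The splitting, step size, and constant control below are illustrative assumptions, not prescriptions from the entry.

```python
h = 0.1   # time step (illustrative)

def D1(q0, q1):
    """Derivative of L_d(q0, q1) = (q1 - q0)^2 / (2h) in its first slot."""
    return -(q1 - q0) / h

def D2(q0, q1):
    """Derivative of L_d in its second slot."""
    return (q1 - q0) / h

def step(q_prev, q, u_prev, u):
    """Solve the forced discrete Euler-Lagrange equation (14) for q_{k+1}:
    D2(q_{k-1}, q_k) + D1(q_k, q_{k+1}) + (h/2) u_{k-1} + (h/2) u_k = 0."""
    return q + h * (D2(q_prev, q) + 0.5 * h * (u_prev + u))

# q_1 is chosen so that the discrete boundary condition
# FL(x_0, v_0) = -D1 L_d(q_0, q_1) - f_d^-(q_0, q_1, u_0)
# holds with v_0 = 0 and u_0 = 1.
u = [1.0] * 10
q = [0.0, 0.5 * h * h * u[0]]
for k in range(1, len(u)):
    q.append(step(q[k - 1], q[k], u[k - 1], u[k]))

# With constant unit force the scheme reproduces q(t) = t^2/2 at the grid:
print(q[-1])   # close to 0.5 for T = N h = 1.0
```

The recursion reduces to q_{k+1} = 2q_k − q_{k−1} + h²u_k here, a discrete form of Newton's law, which is why the parabola is recovered exactly up to roundoff.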
Discrete Cost Function
We can also approximate the cost functional (13) in a single time step h by

  C_d(q_k, q_{k+1}, u_k) ≈ ∫_{kh}^{(k+1)h} C(q(t), q̇(t), u(t)) dt,

yielding the discrete cost functional

  Σ_{k=0}^{N−1} C_d(q_k, q_{k+1}, u_k) + Φ_d(q_{N−1}, q_N, u_{N−1}).

Discrete Optimal Control Problem
With all these elements, we have the following discrete optimal control problem:

  min Σ_{k=0}^{N−1} C_d(q_k, q_{k+1}, u_k) + Φ_d(q_{N−1}, q_N, u_{N−1})

subject to

  D_2 L_d(q_{k−1}, q_k) + D_1 L_d(q_k, q_{k+1}) + f_d^+(q_{k−1}, q_k, u_{k−1}) + f_d^−(q_k, q_{k+1}, u_k) = 0,
  x_0 = q_0,  x_T = q_N,
  ∂L/∂v (x_0, v_0) = −D_1 L_d(q_0, q_1) − f_d^−(q_0, q_1, u_0),
  ∂L/∂v (x_T, v_T) = D_2 L_d(q_{N−1}, q_N) + f_d^+(q_{N−1}, q_N, u_{N−1}),

with k = 1, …, N − 1 (see Jiménez and Martín de Diego 2010; Jiménez et al. 2013 for a modification of these equations admitting piecewise controls).

The system now is a constrained nonlinear optimization problem, that is, it corresponds to the minimization of a function subject to algebraic constraints. The necessary conditions for optimality are derived by applying nonlinear programming optimization. For the concrete implementation, it is possible to use sequential quadratic programming (SQP) methods to numerically solve the nonlinear optimization problem (Ober-Blöbaum et al. 2011).

Optimal Control Systems with Symmetries

In many interesting cases, the continuous optimal control of a mechanical system is defined on a Lie group, and the Lagrangian, cost function, and control forces are invariant under the group action. The goal is again the same as in the previous section, that is, to move the system from its current state to a desired state in an optimal way. In this particular case, it is possible to adapt the contents of the section "Discrete Mechanics" in a similar way to the continuous case (from the standard Euler–Lagrange equations to the Euler–Poincaré equations) and to produce the so-called Lie group variational integrators. These methods preserve the Lie group structure, avoiding the use of local charts, projections, or constraints. Based on these methods, for the case of controlled mechanical systems, we produce the discrete Euler–Poincaré equations with controls and the discrete cost function. Consequently, it is possible to deduce necessary optimality conditions for this class of invariant systems (see Bloch et al. 2009; Bou-Rabee and Marsden 2009; Hussein et al. 2006; Kobilarov and Marsden 2011).

Cross-References

▶ Differential Geometric Methods in Nonlinear Control
▶ Numerical Methods for Nonlinear Optimal Control Problems
▶ Optimal Control and Mechanics
▶ Optimal Control and Pontryagin's Maximum Principle

Bibliography

Bock HG, Plitt KJ (1984) A multiple shooting algorithm for direct solution of optimal control problems. In: 9th IFAC world congress, Budapest. Pergamon Press, pp 242–247
Bloch AM, Hussein I, Leok M, Sanyal AK (2009) Geometric structure-preserving optimal control of a rigid body. J Dyn Control Syst 15(3):307–330
Bloch AM, Crouch PE, Nordkvist N (2013) Continuous and discrete embedded optimal control problems and
their application to the analysis of Clebsch optimal control problems and mechanical systems. J Geom Mech 5(1):1–38
Bonnans JF, Laurent-Varin J (2006) Computation of order conditions for symplectic partitioned Runge-Kutta schemes with application to optimal control. Numer Math 103:1–10
Bou-Rabee N, Marsden JE (2009) Hamilton-Pontryagin integrators on Lie groups: introduction and structure-preserving properties. Found Comput Math 9(2):197–219
Cadzow JA (1970) Discrete calculus of variations. Int J Control 11:393–407
de León M, Martín de Diego D, Santamaría-Merino A (2007) Discrete variational integrators and optimal control theory. Adv Comput Math 26(1–3):251–268
Hager WW (2001) Numerical analysis in optimal control. In: International series of numerical mathematics, vol 139. Birkhäuser Verlag, Basel, pp 83–93
Hairer E, Lubich C, Wanner G (2002) Geometric numerical integration: structure-preserving algorithms for ordinary differential equations. Springer series in computational mathematics, vol 31. Springer, Berlin
Hussein I, Leok M, Sanyal A, Bloch A (2006) A discrete variational integrator for optimal control problems on SO(3). In: Proceedings of the 45th IEEE conference on decision and control, San Diego, pp 6636–6641
Hwang CL, Fan LT (1967) A discrete version of Pontryagin's maximum principle. Oper Res 15:139–146
Jiménez F, Martín de Diego D (2010) A geometric approach to discrete mechanics for optimal control theory. In: Proceedings of the IEEE conference on decision and control, Atlanta, pp 5426–5431
Jiménez F, Kobilarov M, Martín de Diego D (2013) Discrete variational optimal control. J Nonlinear Sci 23(3):393–426
Jordan BW, Polak E (1964) Theory of a class of discrete optimal control systems. J Electron Control 17:697–711
Junge O, Ober-Blöbaum S (2005) Optimal reconfiguration of formation flying satellites. In: IEEE conference on decision and control and European control conference ECC, Seville
Junge O, Marsden JE, Ober-Blöbaum S (2006) Optimal reconfiguration of formation flying spacecraft – a decentralized approach. In: IEEE conference on decision and control and European control conference ECC, San Diego, pp 5210–5215
Kobilarov M (2008) Discrete geometric motion control of autonomous vehicles. Thesis, Computer Science, University of Southern California
Kobilarov M, Marsden JE (2011) Discrete geometric optimal control on Lie groups. IEEE Trans Robot 27(4):641–655
Leok M (2004) Foundations of computational geometric mechanics, control and dynamical systems. Thesis, California Institute of Technology. Available at http://www.math.lsa.umich.edu/~mleok
Leyendecker S, Ober-Blöbaum S, Marsden JE, Ortiz M (2007) Discrete mechanics and optimal control for constrained multibody dynamics. In: 6th international conference on multibody systems, nonlinear dynamics, and control, ASME international design engineering technical conferences, Las Vegas
Marrero JC, Martín de Diego D, Martínez E (2006) Discrete Lagrangian and Hamiltonian mechanics on Lie groupoids. Nonlinearity 19:1313–1348. Corrigendum: Nonlinearity 19
Marrero JC, Martín de Diego D, Stern A (2010) Lagrangian submanifolds and discrete constrained mechanics on Lie groupoids. Preprint, to appear in DCDS-A 35-1, January 2015
Marsden JE, West M (2001) Discrete mechanics and variational integrators. Acta Numer 10:357–514
Marsden JE, Pekarsky S, Shkoller S (1999a) Discrete Euler-Poincaré and Lie-Poisson equations. Nonlinearity 12:1647–1662
Marsden JE, Pekarsky S, Shkoller S (1999b) Symmetry reduction of discrete Lagrangian mechanics on Lie groups. J Geom Phys 36(1–2):140–151
Ober-Blöbaum S (2008) Discrete mechanics and optimal control. Ph.D. thesis, University of Paderborn
Ober-Blöbaum S, Junge O, Marsden JE (2011) Discrete mechanics and optimal control: an analysis. ESAIM Control Optim Calc Var 17(2):322–352
Pytlak R (1999) Numerical methods for optimal control problems with state constraints. Lecture notes in mathematics, vol 1707. Springer-Verlag, Berlin, xvi+215 pp
Wendlandt JM, Marsden JE (1997a) Mechanical integrators derived from a discrete variational principle. Phys D 106:223–246
Wendlandt JM, Marsden JE (1997b) Mechanical systems with symmetry, variational principles and integration algorithms. In: Alber M, Hu B, Rosenthal J (eds) Current and future directions in applied mathematics (Notre Dame, IN, 1996). Birkhäuser Boston, Boston, MA, pp 219–261

Distributed Model Predictive Control

Gabriele Pannocchia
University of Pisa, Pisa, Italy

Abstract

Distributed model predictive control refers to a class of predictive control architectures in which a number of local controllers manipulate a subset of inputs to control a subset of outputs (states) composing the overall system. Different levels of communication and (non)cooperation exist, although in general the most compelling properties can be established only for cooperative schemes,
302 Distributed Model Predictive Control

those in which all local controllers optimize local inputs to minimize the same plantwide objective function. Starting from state-feedback algorithms for constrained linear systems, extensions are discussed to cover output feedback, reference target tracking, and nonlinear systems. An outlook of future directions is finally presented.

Keywords

Constrained large-scale systems; Cooperative control systems; Interacting dynamical systems

Introduction and Motivations

Large-scale systems (e.g., industrial processing plants, power generation networks, etc.) usually comprise several interconnected units which may exchange material, energy, and information streams. The overall effectiveness and profitability of such large-scale systems depend strongly on the level of local effectiveness and profitability of each unit but also on the level of interactions among the different units. An overall optimization goal can be achieved by adopting a single centralized model predictive control (MPC) system (Rawlings and Mayne 2009) in which all control input trajectories are optimized simultaneously to minimize a common objective.

This choice is often avoided for several reasons. When the overall number of inputs and states is very large, a single optimization problem may require computational resources (CPU time, memory, etc.) that are not available and/or compatible with the system's dynamics. Even if these limitations do not hold, it is often the case that organizational reasons require the use of smaller, local controllers, which are easier to coordinate and maintain.

Thus, industrial control systems are often decentralized, i.e., the overall system is divided into (possibly mildly coupled) subsystems, and a local controller is designed for each unit disregarding the interactions from/to other subsystems. Depending on the extent of dynamic coupling, it is well known that the performance of such decentralized systems may be poor, and stability properties may even be lost. Distributed predictive control architectures arise to meet performance specifications (stability at a minimum) similar to centralized predictive control systems, while still retaining the modularity and local character of the optimization problems solved by each controller.

Definitions and Architectures for Constrained Linear Systems

Subsystem Dynamics, Constraints, and Objectives
We start the description of distributed MPC algorithms by considering an overall discrete-time linear time-invariant system in the form:

x^+ = Ax + Bu,   y = Cx    (1)

in which x ∈ R^n and x^+ ∈ R^n are, respectively, the system state at a given time and at a successor time; u ∈ R^m is the input; and y ∈ R^p is the output.

We consider that the overall system (1) is divided into M subsystems, S_i, defined by (disjoint) sets of inputs and outputs (states), and each S_i is regulated by a local MPC. For each S_i, we denote by y_i ∈ R^{p_i} its output, by x_i ∈ R^{n_i} its state, and by u_i ∈ R^{m_i} the control input computed by the i-th MPC. Due to interactions among subsystems, the local output y_i (and state x_i) is affected by control inputs computed by (some) other MPCs. Hence, the dynamics of S_i can be written as

x_i^+ = A_i x_i + B_i u_i + Σ_{j∈N_i} B_{ij} u_j,   y_i = C_i x_i    (2)

in which N_i denotes the indices of the neighbors of S_i, i.e., the subsystems whose inputs have an influence on the states of S_i. To clarify the notation, we depict in Fig. 1 the case of three subsystems, with neighbors N_1 = {2, 3}, N_2 = {1}, and N_3 = {2}.

Without loss of generality, we assume that each pair (A_i, B_i) is stabilizable. Moreover, the state of each subsystem x_i is assumed known (to the i-th MPC) at each decision time. For each subsystem S_i, inputs are required to fulfill (hard) constraints:
u_i ∈ U_i,   i = 1, …, M    (3)

in which the U_i are polyhedrons containing the origin in their interior. Moreover, we consider a quadratic stage cost function ℓ_i(x, u) = (1/2)(x′Q_i x + u′R_i u) and a terminal cost function V_{fi}(x) = (1/2) x′P_i x, with Q_i ∈ R^{n_i×n_i}, R_i ∈ R^{m_i×m_i}, and P_i ∈ R^{n_i×n_i} positive definite. Without loss of generality, let x_i(0) be the state of S_i at the current decision time. Consequently, the finite-horizon cost function associated with S_i is given by:

V_i(x_i(0), u_i, {u_j}_{j∈N_i}) = Σ_{k=0}^{N−1} ℓ_i(x_i(k), u_i(k)) + V_{fi}(x_i(N))    (4)

in which u_i = (u_i(0), u_i(1), …, u_i(N−1)) is a finite-horizon sequence of control inputs of S_i, and u_j is similarly defined as a sequence of control inputs of each neighbor j ∈ N_i. Notice that V_i(·) is a function of the neighbors' input sequences, {u_j}_{j∈N_i}, due to the dynamics (2).

Distributed Model Predictive Control, Fig. 1 Interconnected systems and neighbors definition

Decentralized, Noncooperative, and Cooperative Predictive Control Architectures
Several levels of communication and (non)cooperation can exist among the controllers, as depicted in Fig. 2 for the case of two subsystems.

In decentralized MPC architectures, interactions among subsystems are neglected by forcing N_i = ∅ for all i even if this is not true. That is, the subsystem model used in each local controller, instead of (2), is simply

x_i^+ = A_i x_i + B_i u_i,   y_i = C_i x_i    (5)

Therefore, an inherent mismatch exists between the model used by the local controllers (5) and the actual subsystem dynamics (2). Each local MPC solves the following finite-horizon optimal control problem (FHOCP):

P_i^{De}:  min_{u_i} V_i(·)  s.t.  u_i ∈ U_i^N,  N_i = ∅    (6)

We observe that in this case V_i(·) depends only on the local inputs, u_i, because it is assumed that N_i = ∅. Hence, each P_i^{De} is solved independently of the neighbors' computations, and no iterations are performed. Clearly, depending on the actual level of interactions among subsystems, decentralized MPC architectures can perform poorly, namely, being non-stabilizing. Performance certifications are still possible by resorting to robust stability theory, i.e., by treating the neglected dynamics Σ_{j∈N_i} B_{ij} u_j as (bounded) disturbances (Riverso et al. 2013).

In noncooperative MPC architectures, the existing interactions among the subsystems are fully taken into account through (2). Given a known value of the neighbors' control input sequences, {u_j}_{j∈N_i}, each local MPC solves the following FHOCP:

P_i^{NCD}:  min_{u_i} V_i(·)  s.t.  u_i ∈ U_i^N    (7)

The obtained solution can be exchanged with the other local controllers to update the assumed neighbors' control input sequences, and iterations can be performed. We observe that this approach is noncooperative because local controllers try to optimize different, possibly competing, objectives. In general, no convergence is guaranteed in noncooperative iterations, and when this scheme converges, it leads to a so-called Nash equilibrium. However, the achieved local control inputs do not have proven stability properties (Rawlings
and Mayne 2009, §6.2.3). To ensure closed-loop stability, variants can be formulated by including a sequential solution of the local MPC problems, exploiting the notion (if any) of an auxiliary stabilizing decentralized control law. Non-iterative noncooperative schemes are also proposed, in which stability guarantees are provided by ensuring a decrease of a centralized Lyapunov function at each decision time.

Distributed Model Predictive Control, Fig. 2 Three distributed control architectures: decentralized MPC, noncooperative MPC, and cooperative MPC

Finally, in cooperative MPC architectures, each local controller optimizes a common (plantwide) objective:

V(x(0), u) = Σ_{i=1}^{M} ρ_i V_i(x_i(0), u_i, {u_j}_{j∈N_i})    (8)

in which the ρ_i > 0, for all i, are given scalar weights and u = (u_1, …, u_M) is the overall control sequence. In particular, given a known value of the other subsystems' control input sequences, {u_j}_{j≠i}, each local MPC solves the following FHOCP:

P_i^{CD}:  min_{u_i} V(·)  s.t.  u_i ∈ U_i^N    (9)

As in noncooperative schemes, the obtained solution can be exchanged with the other local controllers, and further iterations can be performed. Notice that in P_i^{CD} the (possible) implications of the local control sequence u_i for all other subsystems' objectives, V_j(·) with j ≠ i, are taken into account, as well as the effect of the neighbors' sequences {u_j}_{j∈N_i} on the local state evolution through (2). Clearly, this approach is termed cooperative because all controllers compute local inputs to minimize a global objective. Convergence of cooperative iterations is guaranteed, and under suitable assumptions the converged solution is the centralized Pareto-optimal solution (Rawlings and Mayne 2009, §6.2.4). Furthermore, the achieved local control inputs have proven stabilizing properties (Stewart et al. 2010). Variants are also proposed in which each controller still optimizes a local objective, but cooperative iterations are performed to ensure a
decrease of the global objective at each decision time (Maestre et al. 2011).

Cooperative Distributed MPC

Cooperative schemes are preferable over noncooperative schemes from many points of view, namely, in terms of superior theoretical guarantees and no larger computational requirements. In this section we focus on a prototype cooperative distributed MPC algorithm adapted from Stewart et al. (2010), highlighting the required computations and discussing the associated theoretical properties and guarantees.

Basic Algorithm
We present in Algorithm 1 a streamlined description of a cooperative distributed MPC algorithm, in which each local controller solves P_i^{CD}, given a previously computed value of all other subsystems' input sequences. For each local controller, the new iterate is defined as a convex combination of the newly computed solution with the previous iteration. A relative tolerance is defined, so that cooperative iterations stop when all local controllers have computed a new iterate sufficiently close to the previous one. A maximum number of cooperative iterations can also be defined, so that a finite bound on the execution time can be established.

Algorithm 1 (Cooperative MPC). Require: Overall warm start u^0 = (u_1^0, …, u_M^0), convex step weights w_i > 0 s.t. Σ_{i=1}^{M} w_i = 1, relative tolerance parameter ε > 0, maximum cooperative iterations c_max
1: Initialize: c ← 0 and e_i ← 2ε for i = 1, …, M.
2: while (c < c_max) and (∃i | e_i > ε) do
3:   c ← c + 1.
4:   for i = 1 to M do
5:     Solve P_i^{CD} in (9) obtaining u_i*.
6:   end for
7:   for i = 1 to M do
8:     Define new iterate: u_i^c = w_i u_i* + (1 − w_i) u_i^{c−1}.
9:     Compute convergence error: e_i = ‖u_i^c − u_i^{c−1}‖ / ‖u_i^{c−1}‖.
10:  end for
11: end while
12: return Overall solution: u^c = (u_1^c, …, u_M^c).

We observe that Step 8 implicitly defines the new overall iterate as a convex combination of the overall solutions achieved by each controller, that is,

u^c = Σ_{i=1}^{M} w_i (u_1^{c−1}, …, u_i*, …, u_M^{c−1})    (10)

It is also important to observe that Steps 5, 8, and 9 are performed separately by each controller.

Properties
The basic cooperative MPC described in Algorithm 1 enjoys several nice theoretical and practical properties, as detailed in (Rawlings and Mayne 2009, §6.3.1):
1. Feasibility of each iterate: u_i^{c−1} ∈ U_i^N implies u_i^c ∈ U_i^N, for all i = 1, …, M and c ∈ I_{>0}.
2. Cost decrease at each iteration: V(x(0), u^c) ≤ V(x(0), u^{c−1}) for all c ∈ I_{>0}.
3. Cost convergence to the centralized optimum: lim_{c→∞} V(x(0), u^c) = min_{u∈U^N} V(x(0), u), in which U = U_1 × ⋯ × U_M.

Resorting to suboptimal MPC theory, the above properties (1) and (2) can be exploited to show that the origin of the closed-loop system

x^+ = Ax + Bκ^c(x),  with  κ^c(x) = u^c(0)    (11)

is exponentially stable for any finite c ∈ I_{>0}. This result is of paramount (practical and theoretical) importance because it ensures closed-loop stability using cooperative distributed MPC with any finite number of cooperative iterations. As in centralized MPC based on the solution of a FHOCP (Rawlings and Mayne 2009, §2.4.3), particular care of the terminal cost function V_{fi}(·) is necessary, possibly in conjunction with a terminal constraint x_i(N) ∈ X_{fi}. Several options can be adopted, as discussed, e.g., in Stewart et al. (2010, 2011).

Moreover, the results in Pannocchia et al. (2011) can be used to show inherent robust stability to system disturbances and measurement errors. Therefore, we can confidently
state that well-designed distributed cooperative MPC and centralized MPC algorithms share the same guarantees in terms of stability and robustness.

Complementary Aspects

We discuss in this section a number of complementary aspects of distributed MPC algorithms, omitting technical details for the sake of space.

Coupled Input Constraints and State Constraints
Convergence of the solution of cooperative distributed MPC towards the centralized (global) optimum holds when input constraints are in the form of (3), i.e., when no constraints involve inputs of different subsystems. Sometimes this assumption fails to hold, e.g., when several units share a common utility resource, that is, when in addition to (3) some constraints involve inputs of more than one unit. In this situation, it is possible that Algorithm 1 remains stuck at a fixed point, without improving the cost, even if it is still away from the centralized optimum (Rawlings and Mayne 2009, §6.3.2). It is important to point out that this situation is harmless from a closed-loop stability and robustness point of view. However, the degree of suboptimality in comparison with centralized MPC could be undesired from a performance point of view. To overcome this situation, a slightly different partitioning of the overall inputs into non-disjoint sets can be adopted (Stewart et al. 2010).

Similarly, the presence of state constraints, even in decentralized form x_i ∈ X_i (with i = 1, …, M), can prevent convergence of a cooperative algorithm towards the centralized optimum. It is also important to point out that the local MPC controlling S_i needs to consider in the optimal control problem, besides the local state constraints x_i ∈ X_i, also the state constraints of all other subsystems S_j such that i ∈ N_j. This ensures feasibility of each iterate and cost reduction; hence closed-loop stability (and robustness) can be established.

Output Feedback and Offset-Free Tracking
When the subsystem state cannot be directly measured, each local controller can use a local state estimator, namely, a Kalman filter (or Luenberger observer). Assuming that the pair (A_i, C_i) is detectable, the subsystem state estimate evolves as follows:

x̂_i^+ = A_i x̂_i + B_i u_i + Σ_{j∈N_i} B_{ij} u_j + L_i (y_i − C_i x̂_i)    (12)

in which L_i ∈ R^{n_i×p_i} is the local Kalman predictor gain, chosen such that the matrix (A_i − L_i C_i) is Schur. Stability of the closed-loop origin can still be established using minor variations (Rawlings and Mayne 2009, §6.3.3).

When offset-free control is sought, each local MPC can be equipped with an integrating disturbance model similarly to centralized offset-free MPC algorithms (Pannocchia and Rawlings 2003). Given the current estimate of the subsystem state and disturbance, a target calculation problem is solved to compute the state and input equilibrium pair such that (a subset of) output variables correspond to given set points. Such a target calculation problem can be performed in a centralized fashion or in a distributed manner, although in the latter case several issues arise and associated precautions should be taken into account (Rawlings and Mayne 2009, §6.3.4).

Distributed Control for Nonlinear Systems
Several nonlinear distributed MPC algorithms have been recently proposed (Liu et al. 2009; Stewart et al. 2011). Some schemes require the presence of a coordinator, thus introducing a hierarchical structure (Scattolini 2009). In Stewart et al. (2011), instead, a cooperative distributed MPC architecture similar to the one discussed in the previous section has been proposed for nonlinear systems. Each local controller considers the following subsystem model:

x_i^+ = f_i(x_i, u_i, u_j), with j ∈ N_i,   y_i = h_i(x_i)    (13)
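As a small illustration of the coupled nonlinear model (13), the following sketch steps one subsystem with a single neighbor; the particular maps f_i and h_i below are invented for this example and are not taken from the cited works.

```python
# A small sketch of the coupled nonlinear model (13) for one subsystem with a
# single neighbor; the maps f_i and h_i below are invented for illustration.
import math

def f_i(x_i, u_i, u_j):
    """x_i^+ = f_i(x_i, u_i, u_j): nonlinear local dynamics with coupling."""
    return 0.8 * math.tanh(x_i) + u_i + 0.2 * u_j

def h_i(x_i):
    """y_i = h_i(x_i): nonlinear output map."""
    return x_i + 0.1 * x_i ** 3

x = 1.0                                             # subsystem state
inputs = [(0.1, 0.0), (0.0, -0.2), (0.05, 0.1)]     # (u_i, u_j) over 3 steps
for u_local, u_neighbor in inputs:
    x = f_i(x, u_local, u_neighbor)                 # roll the state forward
y = h_i(x)                                          # measured output
```

Because f_i saturates through tanh, the rolled-out state stays bounded for bounded inputs, which is one reason such toy models are convenient for testing distributed schemes.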
A problem (formally) identical to P_i^{CD} in (9) is solved by each controller, and cooperative iterations are performed. However, non-convexity of P_i^{CD} can make a convex combination step similar to Step 8 in Algorithm 1 not necessarily a cost improvement. As a workaround in such cases, Stewart et al. (2011) propose deleting the least effective control sequence computed by a local controller (repeating this deletion if necessary). In this way it is possible to show a monotonic decrease of the cost function at each cooperative iteration.

Summary and Future Directions

We presented the basics and foundations of distributed model predictive control (DMPC) schemes, which prove useful and effective in the control of large-scale systems for which a single centralized predictive controller is not regarded as a possible or desirable solution, e.g., due to organizational requirements and/or computational limitations. In DMPCs, the overall controlled system is organized into a number of subsystems, in general featuring some dynamic couplings, and for each subsystem a local MPC is implemented.

Different flavors of communication and cooperation among the local controllers can be chosen by the designer, ranging from decentralized to cooperative schemes. In cooperative DMPC algorithms, the dynamic interactions among the subsystems are fully taken into account, with limited communication overheads, and the same overall objective can be optimized by each local controller. When cooperative iterations are performed until convergence, such DMPC algorithms achieve the same global minimum control sequence as that of the centralized MPC. Termination prior to convergence does not hinder the stability and robustness guarantees.

In this contribution, after discussing an overview of possible communication and cooperation schemes, we addressed the design of a state-feedback distributed MPC algorithm for linear systems subject to input constraints, with convergence and stability guarantees. Then, we discussed various extensions to coupled input constraints and state constraints, output feedback, reference target tracking, and nonlinear systems.

The research on DMPC algorithms has been extensive during the last decade, and some excellent review papers have been recently made available (Christofides et al. 2013; Scattolini 2009). Still, we expect DMPC to attract research efforts in various directions, as briefly discussed:
• Nonlinear DMPC algorithms (Liu et al. 2009; Stewart et al. 2011) will require improvements in terms of global optimum goals.
• Economic DMPC and tracking DMPC (Ferramosca et al. 2013) will replace current formulations designed for regulation around the origin, especially for nonlinear systems.
• Reconfigurability, e.g., the addition/deletion of new local controllers, is an ongoing topic, and preliminary results available for decentralized architectures (Riverso et al. 2013) may be extended to cooperative and noncooperative schemes. It is also desirable to improve the resilience of DMPC to communication disruptions (Alessio et al. 2011).
• Preliminary results on constrained distributed estimation (Farina et al. 2012) will draw attention and require further insights to bridge the gap between constrained estimation and control algorithms.
• Specific optimization algorithms tailored to DMPC local problems (Doan et al. 2011) will increase the effectiveness of DMPC algorithms, and distributed optimization approaches will be exploited even for dynamically uncoupled systems.

Cross-References

▶ Cooperative Solutions to Dynamic Games
▶ Nominal Model-Predictive Control
▶ Optimization Algorithms for Model Predictive Control
▶ Tracking Model Predictive Control
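To make the cooperative iteration of Algorithm 1 concrete, the following toy sketch runs it on an invented example with two scalar subsystems, horizon N = 1, and no input constraints, so each local problem (9) has a closed-form solution. It is a sketch of the idea under these simplifying assumptions, not an implementation from the cited works.

```python
# Toy sketch of Algorithm 1: two scalar subsystems, horizon N = 1, no input
# constraints; all numbers are invented. Coupled dynamics (2):
#   x1^+ = a1*x1 + b1*u1 + b12*u2,   x2^+ = a2*x2 + b2*u2 + b21*u1
a1, b1, b12 = 0.9, 1.0, 0.4
a2, b2, b21 = 0.8, 1.0, 0.3
p1, p2, r1, r2 = 2.0, 2.0, 1.0, 1.0      # terminal and input cost weights

def V(x1, x2, u1, u2):
    """Plantwide objective (8): input costs plus terminal state costs."""
    x1p = a1 * x1 + b1 * u1 + b12 * u2
    x2p = a2 * x2 + b2 * u2 + b21 * u1
    return 0.5 * (p1 * x1p ** 2 + p2 * x2p ** 2 + r1 * u1 ** 2 + r2 * u2 ** 2)

def solve_local_1(x1, x2, u2):
    """Exact minimizer of V over u1 with u2 fixed: problem (9) for i = 1."""
    return -(p1 * b1 * (a1 * x1 + b12 * u2)
             + p2 * b21 * (a2 * x2 + b2 * u2)) / (p1 * b1 ** 2 + p2 * b21 ** 2 + r1)

def solve_local_2(x1, x2, u1):
    """Exact minimizer of V over u2 with u1 fixed: problem (9) for i = 2."""
    return -(p2 * b2 * (a2 * x2 + b21 * u1)
             + p1 * b12 * (a1 * x1 + b1 * u1)) / (p2 * b2 ** 2 + p1 * b12 ** 2 + r2)

x1, x2 = 1.0, -2.0        # current subsystem states
u1, u2 = 0.0, 0.0         # warm start u^0
w1, w2 = 0.5, 0.5         # convex step weights, w1 + w2 = 1
costs = [V(x1, x2, u1, u2)]
for c in range(50):       # cooperative iterations (c_max = 50)
    u1_star = solve_local_1(x1, x2, u2)    # Steps 4-6: local solves in parallel
    u2_star = solve_local_2(x1, x2, u1)
    u1 = w1 * u1_star + (1 - w1) * u1      # Step 8: convex combination
    u2 = w2 * u2_star + (1 - w2) * u2
    costs.append(V(x1, x2, u1, u2))

# Property 2: the plantwide cost never increases across cooperative iterations.
assert all(cn <= cp + 1e-9 for cp, cn in zip(costs, costs[1:]))
```

With exact local solves and convex step weights, the cost decrease follows directly from the convexity of V, and since the input constraints here are trivially decoupled, the iterates also approach the centralized optimum (property 3).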
Recommended Reading

General overviews on DMPC can be found in Christofides et al. (2013), Rawlings and Mayne (2009), and Scattolini (2009). DMPC algorithms for linear systems are discussed in Alessio et al. (2011), Ferramosca et al. (2013), Riverso et al. (2013), Stewart et al. (2010), and Maestre et al. (2011), and for nonlinear systems in Farina et al. (2012), Liu et al. (2009), and Stewart et al. (2011). Supporting results for implementation and robustness theory can be found in Doan et al. (2011), Pannocchia and Rawlings (2003), and Pannocchia et al. (2011).

Bibliography

Alessio A, Barcelli D, Bemporad A (2011) Decentralized model predictive control of dynamically coupled linear systems. J Process Control 21:705–714
Christofides PD, Scattolini R, Muñoz de la Peña D, Liu J (2013) Distributed model predictive control: a tutorial review and future research directions. Comput Chem Eng 51:21–41
Doan MD, Keviczky T, De Schutter B (2011) An iterative scheme for distributed model predictive control using Fenchel's duality. J Process Control 21:746–755
Farina M, Ferrari-Trecate G, Scattolini R (2012) Distributed moving horizon estimation for nonlinear constrained systems. Int J Robust Nonlinear Control 22:123–143
Ferramosca A, Limon D, Alvarado I, Camacho EF (2013) Cooperative distributed MPC for tracking. Automatica 49:906–914
Liu J, Muñoz de la Peña D, Christofides PD (2009) Distributed model predictive control of nonlinear process systems. AIChE J 55(5):1171–1184
Maestre JM, Muñoz de la Peña D, Camacho EF (2011) Distributed model predictive control based on a cooperative game. Optim Control Appl Methods 32:153–176
Pannocchia G, Rawlings JB (2003) Disturbance models for offset-free model predictive control. AIChE J 49:426–437
Pannocchia G, Rawlings JB, Wright SJ (2011) Conditions under which suboptimal nonlinear MPC is inherently robust. Syst Control Lett 60:747–755
Rawlings JB, Mayne DQ (2009) Model predictive control: theory and design. Nob Hill Publishing, Madison
Riverso S, Farina M, Ferrari-Trecate G (2013) Plug-and-play decentralized model predictive control for linear systems. IEEE Trans Autom Control 58:2608–2614
Scattolini R (2009) A survey on hierarchical and distributed model predictive control. J Process Control 19:723–731
Stewart BT, Venkat AN, Rawlings JB, Wright SJ, Pannocchia G (2010) Cooperative distributed model predictive control. Syst Control Lett 59:460–469
Stewart BT, Wright SJ, Rawlings JB (2011) Cooperative distributed model predictive control for nonlinear systems. J Process Control 21:698–704

Distributed Optimization

Angelia Nedić
Industrial and Enterprise Systems Engineering, University of Illinois, Urbana, IL, USA

Abstract

The paper provides an overview of the distributed first-order optimization methods for solving a constrained convex minimization problem, where the objective function is the sum of local objective functions of the agents in a network. This problem has gained a lot of interest due to its emergence in many applications in distributed control and coordination of autonomous agents and distributed estimation and signal processing in wireless networks.

Keywords

Collaborative multi-agent systems; Consensus protocol; Gradient-projection method; Networked systems

Introduction

There has been much recent interest in distributed optimization pertinent to optimization aspects arising in control and coordination of networks consisting of multiple (possibly mobile) agents and in estimation and signal processing in sensor networks (Bullo et al. 2009; Hendrickx 2008; Kar and Moura 2011; Martinoli et al. 2013; Mesbahi and Egerstedt 2010; Olshevsky 2010). In many of these applications, the network system goal is to optimize a global objective
through local agent-based computations and local information exchange with immediate neighbors in the underlying communication network. This is motivated mainly by the emergence of large-scale data and/or large-scale networks and new networking applications such as mobile ad hoc networks and wireless sensor networks, characterized by the lack of centralized access to information and time-varying connectivity. Control and optimization algorithms deployed in such networks should be completely distributed (relying only on local observations and information), robust against unexpected changes in topology (i.e., link or node failures) and against unreliable communication (noisy links or quantized data) (see ▶ Networked Systems). Furthermore, it is desired that the algorithms are scalable in the size of the network.

Generally speaking, the problem of distributed optimization consists of three main components:
1. The optimization problem that the network of agents wants to solve collectively (specifying an objective function and constraints)
2. The local information structure, which describes what information is locally known or observable by each agent in the system (who knows what and when)
3. The communication structure, which specifies the connectivity topology of the underlying communication network and other features of the communication environment

The algorithms for solving such global network problems need to comply with the distributed knowledge about the problem among the agents and obey the local connectivity structure of the communication network (▶ Networked Systems; ▶ Graphs for Modeling Networked Interactions).

Networked System Problem

Given a set N = {1, 2, …, n} of agents (also referred to as nodes), the global system problem has the following form:

minimize Σ_{i=1}^{n} f_i(x)
subject to x ∈ X.    (1)

Each f_i : R^d → R is a convex function which represents the local objective of agent i, while X ⊆ R^d is a closed convex set. The function f_i is a private function known only to agent i, while the set X is commonly known by all agents i ∈ N. The vector x ∈ X represents a global decision vector which the agents want to optimize using local information. The problem is a simple constrained convex optimization problem, where the global objective function is given by the sum of the individual objective functions f_i(x) of the agents in the system. As such, the objective function is the sum of non-separable convex functions corresponding to multiple agents connected over a network.

As an example, consider the problem arising in support vector machines (SVMs), which are a popular tool for classification problems. Each agent i has a set S_i = {(a_j^{(i)}, b_j^{(i)})}_{j=1}^{m_i} of m_i sample-label pairs, where a_j^{(i)} ∈ R^d is a data point and b_j^{(i)} ∈ {+1, −1} is its corresponding (correct) label. The number m_i of data points for every agent i is typically very large (hundreds of thousands). Without sharing the data points, the agents want to collectively find a hyperplane that separates all the data, i.e., a hyperplane that separates (with a maximal separation distance) the data with label 1 from the data with label −1 in the global data set ∪_{i=1}^{n} S_i. Thus, the agents need to solve an unconstrained version of the problem (1), where the decision variable x ∈ R^d is a hyperplane normal and the objective function f_i of agent i is given by

f_i(x) = (λ/2) ‖x‖² + Σ_{j=1}^{m_i} max{0, 1 − b_j^{(i)} x′a_j^{(i)}},

where λ is a regularization parameter (common to all agents).

The network communication structure is represented by a directed (or undirected) graph G = (N, E), with the vertex set N and the edge set E. The network is used as a medium to diffuse
the information from an agent to every other agent through local agent interactions over time (▶ Graphs for Modeling Networked Interactions). To accommodate the information spread through the entire network, it is typically assumed that the network communication graph G = (N, E) is strongly connected (▶ Dynamic Graphs, Connectivity of). In the graph, a link (i, j) means that agent i ∈ N receives the relevant information from agent j ∈ N.

Distributed Algorithms

The algorithms for solving problem (1) are constructed by using standard optimization techniques in combination with a mechanism for information diffusion through local agent interactions. A control point of view for the design of distributed algorithms has a nice exposition in Wang and Elia (2011).

One of the existing optimization techniques is the so-called incremental method, where the information is processed along a directed cycle in the graph. In this approach the estimate is passed from an agent to its neighbor (along the cycle), and only one agent updates at a time (Bertsekas 1997; Blatt et al. 2007; Johansson 2008; Johansson et al. 2009; Nedić and Bertsekas 2000, 2001; Nedić et al. 2001; Rabbat and Nowak 2004; Ram et al. 2009; Tseng 1998); for a detailed literature on incremental methods, see the textbooks Bertsekas (1999) and Bertsekas et al. (2003).

More recently, one of the techniques that gained popularity as a mechanism for information diffusion is a consensus protocol, in which the agents diffuse the information through the network by locally weighted averaging of their incoming data (▶ Averaging Algorithms and Consensus). The problem of reaching a consensus on a particular scalar value, or computing exact averages of the initial values of the agents, has gained an unprecedented interest as a central problem inherent to cooperative behavior in networked systems (Blondel et al. 2005; Boyd et al. 2005; Cao et al. 2005, 2008a,b; Jadbabaie et al. 2003; Olfati-Saber and Murray 2004; Olshevsky 2010; Olshevsky and Tsitsiklis 2006, 2009; Touri 2011; Vicsek et al. 1995; Wan and Lemmon 2009).

Using the consensus technique, a class of distributed algorithms has emerged as a combination of the consensus protocols and the gradient-type methods. The gradient-based approaches are particularly suitable, as they have a small overhead per iteration and are, in general, robust to various sources of errors and uncertainties.

The technique of using the network as a medium to propagate the relevant information for optimization purposes has its origins in the work by Tsitsiklis (1984), Tsitsiklis et al. (1986), and Bertsekas and Tsitsiklis (1997), where the network has been used to decompose the vector x components across different agents, while all agents share the same objective function.

Algorithms Using Weighted Averaging
The technique has recently been employed in Nedić and Ozdaglar (2009) (see also Nedić and Ozdaglar 2007, 2010) to deal with problems of the form (1) when the agents have different objective functions f_i, but their decisions are fully coupled through the common vector variable x. In a series of recent works (to be detailed later), the following distributed algorithm has emerged. Letting x_i(k) ∈ X be an estimate (of the optimal decision) at agent i and time k, the next iterate is constructed through two updates. The first update is a consensus-like iteration, whereby, upon receiving the estimates x_j(k) from its (in)neighbors j, the agent i aligns its estimate with its neighbors through averaging, formally given by

v_i(k) = Σ_{j∈N_i} w_{ij} x_j(k),    (2)

where N_i is the neighbor set

N_i = {j ∈ N | (j, i) ∈ E} ∪ {i}.

The neighbor set N_i includes agent i itself, since the agent always has access to its own information. The scalar w_{ij} is a nonnegative weight that
Distributed Optimization 311

agent i places on the incoming information from neighbor j ∈ N_i. These weights sum to 1, i.e., Σ_{j∈N_i} w_ij = 1, thus yielding v_i(k) as a (local) convex combination of the x_j(k), j ∈ N_i, obtained by agent i.

After computing v_i(k), agent i computes a new iterate x_i(k+1) by performing a gradient-projection step, aimed at minimizing its own objective f_i, of the following form:

    x_i(k+1) = Π_X[ v_i(k) − α(k) ∇f_i(v_i(k)) ],    (3)

where Π_X[x] is the projection of a point x on the set X (in the Euclidean norm), α(k) > 0 is the stepsize at time k, and ∇f_i(z) is the gradient of f_i at a point z.

When all functions are zero and X is the entire space R^d, the distributed algorithm (2) and (3) reduces to the linear-iteration method

    x_i(k+1) = Σ_{j∈N_i} w_ij x_j(k),    (4)

which is known as the consensus or agreement protocol. This protocol is employed when the agents in the network wish to align their decision vectors x_i(k) to a common vector x̂. The alignment is attained asymptotically (as k → ∞).

In the presence of an objective function and constraints, the distributed algorithm in (2) and (3) corresponds to a "forced alignment" guided by the gradient forces Σ_{i=1}^n ∇f_i(x). Under appropriate conditions, the alignment is forced to a common vector x* that minimizes the network objective Σ_{i=1}^n f_i(x) over the set X. This corresponds to the convergence of the iterates x_i(k) to a common solution x* ∈ X as k → ∞ for all agents i ∈ N.

The conditions under which the convergence of {x_i(k)} to a common solution x* ∈ X occurs are a combination of the conditions needed for the consensus protocol to converge and the conditions imposed on the functions f_i and the stepsize α(k) to ensure the convergence of standard gradient-projection methods. For the consensus part, the conditions should guarantee that the pure consensus protocol in (4) converges to the average (1/n) Σ_{i=1}^n x_i(0) of the initial agent values. This requirement transfers to the condition that the weights w_ij give rise to a doubly stochastic weight matrix W, whose entries are the w_ij defined by the weights in (4), augmented by w_ij = 0 for j ∉ N_i.

Intuitively, the requirement that the weight matrix W be doubly stochastic ensures that each agent has the same influence on the system behavior in the long run. More specifically, the doubly stochastic weights ensure that the system as a whole minimizes Σ_{i=1}^n (1/n) f_i(x), where the factor 1/n is seen as the portion of the influence of agent i. When the weight matrix W is only row stochastic, the consensus protocol converges to a weighted average Σ_{i=1}^n π_i x_i(0) of the agent initial values, where π is the left eigenvector of W associated with the eigenvalue 1. In general, the values π_i can be different for different indices i, and the distributed algorithm in (2) and (3) results in minimizing the function Σ_{i=1}^n π_i f_i(x) over X, thus not solving problem (1).

The distributed algorithm (2) and (3) has been proposed and analyzed in Ram et al. (2010a, 2012), where convergence has been established for a diminishing stepsize rule (i.e., Σ_k α(k) = ∞ and Σ_k α²(k) < ∞). As seen from the convergence analysis (see, e.g., Nedić and Ozdaglar 2009; Ram et al. 2012), for the convergence of the method it is critical that the iterate disagreements ‖x_i(k) − x_j(k)‖ converge linearly in time, for all i ≠ j. This fast disagreement decay seems to be indispensable for ensuring the stability of the iterative process (2) and (3).

According to the distributed algorithm (2) and (3), at first each agent aligns its estimate x_i(k) with the estimates x_j(k) that are received from its neighbors and then updates based on its local objective f_i, which is to be minimized over x ∈ X. Alternatively, the distributed method can be constructed by interchanging the alignment step and the gradient-projection step. Such a method, while having the same asymptotic performance as the method in (2) and (3), exhibits a somewhat slower (transient) convergence behavior due to a larger misalignment resulting from taking the gradient-based updates first. This alternative has been initially proposed independently in Lopes and Sayed (2006) (where simulation
results have been reported), in Nedić and Ozdaglar (2007, 2009) (where the convergence analysis for time-varying networks and a constant stepsize is given), and in Nedić et al. (2008) (with quantization effects), and further investigated in Lopes and Sayed (2008), Nedić et al. (2010), and Cattivelli and Sayed (2010). More recently, it has been considered in Lobel et al. (2011) for state-dependent weights and in Tu and Sayed (2012), where the performance is compared with that of an algorithm of the form (2) and (3) for estimation problems.

Algorithm Extensions: Over the past years, many extensions of the distributed algorithm in (2) and (3) have been developed, including the following:

(a) Time-varying communication graphs: The algorithm naturally extends to the case of time-varying connectivity graphs {G(k)}, with G(k) = (N, E(k)) defined over the node set N and time-varying links E(k). In this case, the weights w_ij in (2) are replaced with w_ij(k) and, similarly, the neighbor set N_i is replaced with the corresponding time-dependent neighbor set N_i(k) specified by the graph G(k). The convergence of the algorithm (2) and (3) with these modifications typically requires some additional assumptions on the network connectivity over time and on the entries in the corresponding weight matrix sequence {W(k)}, where w_ij(k) = 0 for j ∉ N_i(k). These conditions are the same as those that guarantee the convergence of the (row stochastic) matrix sequence {W(k)} to a rank-one row-stochastic matrix, such as connectivity over some fixed period (of a sliding time window), nonzero diagonal entries in W(k), and the existence of a uniform lower bound on the positive entries in W(k); see, for example, Cao et al. (2008b), Touri (2011), Tsitsiklis (1984), Nedić and Ozdaglar (2010), Moreau (2005), and Ren and Beard (2005).

(b) Noisy gradients: The algorithm in (2) and (3) works also when the gradient computations ∇f_i(x) in update (3) are corrupted by random errors. This corresponds to using a stochastic gradient ∇̃f_i(x) instead of ∇f_i(x), resulting in the following stochastic gradient-projection step:

    x_i(k+1) = Π_X[ v_i(k) − α(k) ∇̃f_i(v_i(k)) ]

instead of (3). The convergence of these methods is established for the cases when the stochastic gradients are consistent estimates of the actual gradient, i.e.,

    E[ ∇̃f_i(v_i(k)) | v_i(k) ] = ∇f_i(v_i(k)).

The convergence of these methods typically requires the use of a non-summable but square-summable stepsize sequence {α(k)} (i.e., Σ_k α(k) = ∞ and Σ_k α²(k) < ∞); see, e.g., Ram et al. (2010a).

(c) Noisy or unreliable communication links: The communication medium is not always perfect and, often, the communication links are characterized by some random noise process. In this case, while agent j ∈ N_i sends its estimate x_j(k) to agent i, the agent does not receive the intended message. Rather, it receives x_j(k) corrupted by some random link-dependent noise ξ_ij(k), i.e., it receives x_j(k) + ξ_ij(k) instead of x_j(k) (see Kar and Moura 2011; Patterson et al. 2009; Touri and Nedić 2009 for the influence of noise and link failures on consensus). In such cases, the distributed optimization algorithm needs to be modified to include a stepsize for noise attenuation in addition to the standard stepsize for gradient scaling. These stepsizes are coupled through appropriate relative-growth conditions, which ensure that the gradient information is maintained at the right level and, at the same time, the link noise is attenuated appropriately (Srivastava and Nedić 2011; Srivastava et al. 2010). Other imperfections of the communication links, such as link failures and quantization effects, can also be modeled and incorporated into the optimization method, building on the existing results for consensus protocols (e.g., Carli et al. 2007; Kar and Moura 2010, 2011; Nedić et al. 2008).

(d) Asynchronous implementations: The method in (2) and (3) has simultaneous updates,
evident in all agents exchanging and updating information in synchronous time steps indexed by k. In some communication settings, the synchronization of agents is impractical, and the agents use their own clocks, which are not synchronized but do tick according to a common time interval. Such communications result in random weights w_ij(k) and a random neighbor set N_i(k) in (2), which are typically independent and identically distributed (over time). The most common models are random gossip and random broadcast. In the gossip model, at any time, two randomly selected agents i and j communicate and update, while the other agents sleep (Boyd et al. 2005; Kashyap et al. 2007; Ram et al. 2010b; Srivastava 2011; Srivastava and Nedić 2011). In the broadcast model, a random agent i wakes up and broadcasts its estimate x_i(k). Its neighbors that receive the estimate update their iterates, while the other agents (including the agent who broadcasted) do not update (Aysal et al. 2008; Nedić 2011).

(e) Distributed constraints: One of the more challenging aspects is the extension of the algorithm to the case when the constraint set X in (1) is given as an intersection of closed convex sets X_i, one set per agent. Specifically, the set X in (1) is defined by

    X = ∩_{i=1}^n X_i,

where the set X_i is known to agent i only. In this case, the algorithm has a slight modification at the update of x_i(k+1) in (3), where the projection is on the local set X_i instead of X, i.e., the update in (3) is replaced with the following update:

    x_i(k+1) = Π_{X_i}[ v_i(k) − α(k) ∇f_i(v_i(k)) ].    (5)

The resulting method (2), (5) converges under some additional assumptions on the sets X_i, such as the nonempty interior assumption (i.e., the set ∩_{i=1}^n X_i has a nonempty interior), a linear-intersection assumption (each X_i is an intersection of finitely many linear equality and/or inequality constraints), or the Slater condition (Lee and Nedić 2012; Nedić et al. 2010; Srivastava 2011; Srivastava and Nedić 2011; Zhu and Martínez 2012).

In principle, most of the simple first-order methods that solve a centralized problem of the form (1) can also be distributed among the agents (through the use of consensus protocols) to solve the distributed problem (1). For example, the Nesterov dual-averaging subgradient method (Nesterov 2005) can be distributed as proposed in Duchi et al. (2012), a distributed Newton-Raphson method has been proposed and studied in Zanella et al. (2011), while a distributed simplex algorithm has been constructed and analyzed in Bürger et al. (2012). An interesting method based on finding, distributedly, a zero of the gradient ∇f = Σ_{i=1}^n ∇f_i has been proposed and analyzed in Lu and Tang (2012). Some other distributed algorithms and their implementations can be found in Johansson et al. (2007), Tsianos et al. (2012a,b), Dominguez-Garcia and Hadjicostis (2011), Tsianos (2013), Gharesifard and Cortés (2012a), Jakovetic et al. (2011a,b), and Zargham et al. (2012).

Summary and Future Directions

The distributed optimization algorithms have been developed mainly using consensus protocols that are based on weighted averaging, also known as linear-iterative methods. The convergence behavior and convergence rate analysis of these methods combine tools from optimization theory, graph theory, and matrix analysis. The main drawback of these algorithms is that they require (at least theoretically) the use of a doubly stochastic weight matrix W (or W(k) in the time-varying case) in order to solve problem (1). This requirement can be accommodated by allowing agents to exchange locally some additional information on the weights that they intend to use or on their degree knowledge. However, in general, constructing such doubly stochastic weights distributedly on directed graphs is rather a complex problem (Gharesifard and Cortés 2012b).
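To make the weighted-averaging machinery concrete, the following Python sketch illustrates the method of (2) and (3); it is an illustration only, not code from the cited works, and the network, the weights, and the objectives f_i(x) = ½(x − a_i)² over X = [0, 10] are all assumptions chosen for the example. It builds doubly stochastic Metropolis weights on a small undirected graph and runs the consensus step (2) followed by the projected gradient step (3) with a diminishing stepsize; with W doubly stochastic, every agent's estimate approaches the minimizer of (1/n) Σ_i f_i, i.e., the average of the a_i.

```python
import numpy as np

# Assumed example network: an undirected 4-cycle; N_i includes i itself.
n = 4
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
deg = [2, 2, 2, 2]  # number of neighbors of each node, excluding itself

# Metropolis weights: w_ij = 1/(1 + max(deg_i, deg_j)) on edges, and
# w_ii = 1 minus the sum of the off-diagonal row entries.  The resulting
# W is symmetric, hence doubly stochastic (rows and columns sum to 1).
W = np.zeros((n, n))
for i, j in edges:
    W[i, j] = W[j, i] = 1.0 / (1 + max(deg[i], deg[j]))
for i in range(n):
    W[i, i] = 1.0 - W[i].sum()

# Local objectives f_i(x) = 0.5*(x - a_i)^2 over X = [0, 10]; the network
# objective (1/n)*sum_i f_i is minimized at the average of the a_i.
a = np.array([1.0, 2.0, 3.0, 6.0])
x = np.array([10.0, 0.0, 5.0, 7.0])  # initial estimates x_i(0)

for k in range(2000):
    v = W @ x                        # consensus step (2)
    alpha = 1.0 / (k + 1)            # diminishing stepsize rule
    grad = v - a                     # gradient of f_i at v_i(k)
    x = np.clip(v - alpha * grad, 0.0, 10.0)  # projected gradient step (3)

print(x)  # every entry close to mean(a) = 3.0
```

Replacing W with a merely row-stochastic matrix makes the iterates converge instead to a minimizer of Σ_i π_i f_i, with π the left eigenvector of W for eigenvalue 1, which is exactly the distortion discussed above.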
An alternative, which seems a promising direction for future research, is the use of the so-called push-sum protocol (or sum-ratio algorithm) for the consensus problem (Benezit et al. 2010; Kempe et al. 2003). This direction is pioneered in Tsianos et al. (2012b), Tsianos (2013), and Tsianos and Rabbat (2011) for static graphs and recently extended to directed graphs in Nedić and Olshevsky (2013) for an unconstrained version of problem (1).

Another promising direction lies in the use of the alternating direction method of multipliers (ADMM) in combination with the graph-Laplacian formulation of the consensus constraints |N_i| x_i = Σ_{j∈N_i} x_j. A nice exposition of the ADMM method is given in Boyd et al. (2010). The first work to address the development of distributed ADMM over a network is Wei and Ozdaglar (2012), where a static network is considered. Its distributed implementation over time-varying graphs will be an important and challenging task.

Cross-References

▶ Averaging Algorithms and Consensus
▶ Dynamic Graphs, Connectivity of
▶ Graphs for Modeling Networked Interactions
▶ Networked Systems

Recommended Reading

In addition to the literature cited below, useful material relevant to consensus and matrix-product convergence theory includes:

Chatterjee S, Seneta E (1977) Towards consensus: some convergence theorems on repeated averaging. J Appl Probab 14(1):89–97
Cogburn R (1986) On products of random stochastic matrices. In: Random matrices and their applications. American Mathematical Society, vol 50, pp 199–213
DeGroot MH (1974) Reaching a consensus. J Am Stat Assoc 69(345):118–121
Lorenz J (2005) A stabilization theorem for continuous opinion dynamics. Physica A: Stat Mech Appl 355:217–223
Rosenblatt M (1965) Products of independent identically distributed stochastic matrices. J Math Anal Appl 11(1):1–10
Shen J (2000) A geometric approach to ergodic non-homogeneous Markov chains. In: He T-X (ed) Proc. wavelet analysis and multiresolution methods. Marcel Dekker, New York, vol 212, pp 341–366
Tahbaz-Salehi A, Jadbabaie A (2010) Consensus over ergodic stationary graph processes. IEEE Trans Autom Control 55(1):225–230
Touri B (2012) Product of random stochastic matrices and distributed averaging. Springer Theses. Springer, Berlin/New York
Wolfowitz J (1963) Products of indecomposable, aperiodic, stochastic matrices. Proc Am Math Soc 14(4):733–737

Acknowledgments The author would like to thank J. Cortés for valuable suggestions to improve the article. Also, the author gratefully acknowledges the support by the National Science Foundation under grant CCF 11-11342 and by the Office of Naval Research under grant N00014-12-1-0998.

Bibliography

Aysal T, Yildiz M, Sarwate A, Scaglione A (2008) Broadcast gossip algorithms: design and analysis for consensus. In: Proceedings of the 47th IEEE conference on decision and control, Cancún, pp 4843–4848
Benezit F, Blondel V, Thiran P, Tsitsiklis J, Vetterli M (2010) Weighted gossip: distributed averaging using non-doubly stochastic matrices. In: Proceedings of the 2010 IEEE international symposium on information theory, Austin
Bertsekas D (1997) A new class of incremental gradient methods for least squares problems. SIAM J Optim 7:913–926
Bertsekas D (1999) Nonlinear programming. Athena Scientific, Belmont
Bertsekas D, Tsitsiklis J (1997) Parallel and distributed computation: numerical methods. Athena Scientific, Belmont
Bertsekas D, Nedić A, Ozdaglar A (2003) Convex analysis and optimization. Athena Scientific, Belmont
Blatt D, Hero A, Gauchman H (2007) A convergent incremental gradient algorithm with a constant stepsize. SIAM J Optim 18:29–51
Blondel V, Hendrickx J, Olshevsky A, Tsitsiklis J (2005) Convergence in multiagent coordination, consensus, and flocking. In: Proceedings of IEEE CDC, Seville, pp 2996–3000
Boyd S, Ghosh A, Prabhakar B, Shah D (2005) Gossip algorithms: design, analysis, and applications.
In: Proceedings of IEEE INFOCOM, Miami, vol 3, pp 1653–1664
Boyd S, Parikh N, Chu E, Peleato B, Eckstein J (2010) Distributed optimization and statistical learning via the alternating direction method of multipliers. Found Trends Mach Learn 3(1):1–122
Bullo F, Cortés J, Martínez S (2009) Distributed control of robotic networks. Applied mathematics series. Princeton University Press, Princeton
Bürger M, Notarstefano G, Bullo F, Allgöwer F (2012) A distributed simplex algorithm for degenerate linear programs and multi-agent assignments. Automatica 48(9):2298–2304
Cao M, Spielman D, Morse A (2005) A lower bound on convergence of a distributed network consensus algorithm. In: Proceedings of IEEE CDC, Seville, pp 2356–2361
Cao M, Morse A, Anderson B (2008a) Reaching a consensus in a dynamically changing environment: a graphical approach. SIAM J Control Optim 47(2):575–600
Cao M, Morse A, Anderson B (2008b) Reaching a consensus in a dynamically changing environment: convergence rates, measurement delays, and asynchronous events. SIAM J Control Optim 47(2):601–623
Carli R, Fagnani F, Frasca P, Taylor T, Zampieri S (2007) Average consensus on networks with transmission noise or quantization. In: Proceedings of European control conference, Kos
Cattivelli F, Sayed A (2010) Diffusion LMS strategies for distributed estimation. IEEE Trans Signal Process 58(3):1035–1048
Dominguez-Garcia A, Hadjicostis C (2011) Distributed strategies for average consensus in directed graphs. In: Proceedings of the IEEE conference on decision and control, Orlando, Dec 2011
Duchi J, Agarwal A, Wainwright M (2012) Dual averaging for distributed optimization: convergence analysis and network scaling. IEEE Trans Autom Control 57(3):592–606
Gharesifard B, Cortés J (2012a) Distributed continuous-time convex optimization on weight-balanced digraphs. http://arxiv.org/pdf/1204.0304.pdf
Gharesifard B, Cortés J (2012b) Distributed strategies for generating weight-balanced and doubly stochastic digraphs. Eur J Control 18(6):539–557
Hendrickx J (2008) Graphs and networks for the analysis of autonomous agent systems. Ph.D. dissertation, Université Catholique de Louvain
Jadbabaie A, Lin J, Morse S (2003) Coordination of groups of mobile autonomous agents using nearest neighbor rules. IEEE Trans Autom Control 48(6):988–1001
Jakovetic D, Xavier J, Moura J (2011a) Cooperative convex optimization in networked systems: augmented Lagrangian algorithms with directed gossip communication. IEEE Trans Signal Process 59(8):3889–3902
Jakovetic D, Xavier J, Moura J (2011b) Fast distributed gradient methods. Available at: http://arxiv.org/abs/1112.2972
Johansson B (2008) On distributed optimization in networked systems. Ph.D. dissertation, Royal Institute of Technology, Stockholm
Johansson B, Rabi M, Johansson M (2007) A simple peer-to-peer algorithm for distributed optimization in sensor networks. In: Proceedings of the 46th IEEE conference on decision and control, New Orleans, Dec 2007, pp 4705–4710
Johansson B, Rabi M, Johansson M (2009) A randomized incremental subgradient method for distributed optimization in networked systems. SIAM J Control Optim 20(3):1157–1170
Kar S, Moura J (2010) Distributed consensus algorithms in sensor networks: quantized data and random link failures. IEEE Trans Signal Process 58(3):1383–1400
Kar S, Moura J (2011) Convergence rate analysis of distributed gossip (linear parameter) estimation: fundamental limits and tradeoffs. IEEE J Sel Top Signal Process 5(4):674–690
Kashyap A, Basar T, Srikant R (2007) Quantized consensus. Automatica 43(7):1192–1203
Kempe D, Dobra A, Gehrke J (2003) Gossip-based computation of aggregate information. In: Proceedings of the 44th annual IEEE symposium on foundations of computer science, Cambridge, Oct 2003, pp 482–491
Lee S, Nedić A (2012) Distributed random projection algorithm for convex optimization. IEEE J Sel Top Signal Process 48(6):988–1001 (accepted to appear)
Lobel I, Ozdaglar A, Feijer D (2011) Distributed multi-agent optimization with state-dependent communication. Math Program 129(2):255–284
Lopes C, Sayed A (2006) Distributed processing over adaptive networks. In: Adaptive sensor array processing workshop, MIT Lincoln Laboratory, Lexington, pp 1–5
Lopes C, Sayed A (2008) Diffusion least-mean squares over adaptive networks: formulation and performance analysis. IEEE Trans Signal Process 56(7):3122–3136
Lu J, Tang C (2012) Zero-gradient-sum algorithms for distributed convex optimization: the continuous-time case. IEEE Trans Autom Control 57(9):2348–2354
Martinoli A, Mondada F, Mermoud G, Correll N, Egerstedt M, Hsieh A, Parker L, Stoy K (2013) Distributed autonomous robotic systems. Springer tracts in advanced robotics. Springer, Heidelberg/New York
Mesbahi M, Egerstedt M (2010) Graph theoretic methods for multiagent networks. Princeton University Press, Princeton
Moreau L (2005) Stability of multiagent systems with time-dependent communication links. IEEE Trans Autom Control 50(2):169–182
Nedić A (2011) Asynchronous broadcast-based convex optimization over a network. IEEE Trans Autom Control 56(6):1337–1351
Nedić A, Bertsekas D (2000) Convergence rate of incremental subgradient algorithms. In: Uryasev S, Pardalos P (eds) Stochastic optimization: algorithms and applications. Kluwer Academic Publishers, Dordrecht, pp 263–304
Nedić A, Bertsekas D (2001) Incremental subgradient methods for nondifferentiable optimization. SIAM J Optim 56(1):109–138
Nedić A, Olshevsky A (2013) Distributed optimization over time-varying directed graphs. Available at http://arxiv.org/abs/1303.2289
Nedić A, Ozdaglar A (2007) On the rate of convergence of distributed subgradient methods for multi-agent optimization. In: Proceedings of IEEE CDC, New Orleans, pp 4711–4716
Nedić A, Ozdaglar A (2009) Distributed subgradient methods for multi-agent optimization. IEEE Trans Autom Control 54(1):48–61
Nedić A, Ozdaglar A (2010) Cooperative distributed multi-agent optimization. In: Eldar Y, Palomar D (eds) Convex optimization in signal processing and communications. Cambridge University Press, Cambridge/New York, pp 340–386
Nedić A, Bertsekas D, Borkar V (2001) Distributed asynchronous incremental subgradient methods. In: Butnariu D, Censor Y, Reich S (eds) Inherently parallel algorithms in feasibility and optimization and their applications. Studies in computational mathematics. Elsevier, Amsterdam/New York
Nedić A, Olshevsky A, Ozdaglar A, Tsitsiklis J (2008) Distributed subgradient methods and quantization effects. In: Proceedings of 47th IEEE conference on decision and control, Cancún, Dec 2008, pp 4177–4184
Nedić A, Ozdaglar A, Parrilo PA (2010) Constrained consensus and optimization in multi-agent networks. IEEE Trans Autom Control 55(4):922–938
Nesterov Y (2005) Primal-dual subgradient methods for convex problems. Center for Operations Research and Econometrics (CORE), Catholic University of Louvain (UCL), Technical report 67
Olfati-Saber R, Murray R (2004) Consensus problems in networks of agents with switching topology and time-delays. IEEE Trans Autom Control 49(9):1520–1533
Olshevsky A (2010) Efficient information aggregation for distributed control and signal processing. Ph.D. dissertation, MIT
Olshevsky A, Tsitsiklis J (2006) Convergence rates in distributed consensus averaging. In: Proceedings of IEEE CDC, San Diego, pp 3387–3392
Olshevsky A, Tsitsiklis J (2009) Convergence speed in distributed consensus and averaging. SIAM J Control Optim 48(1):33–55
Patterson S, Bamieh B, Abbadi A (2009) Distributed average consensus with stochastic communication failures. IEEE Trans Signal Process 57:2748–2761
Rabbat M, Nowak R (2004) Distributed optimization in sensor networks. In: Symposium on information processing of sensor networks, Berkeley, pp 20–27
Ram SS, Nedić A, Veeravalli V (2009) Incremental stochastic sub-gradient algorithms for convex optimization. SIAM J Optim 20(2):691–717
Ram SS, Nedić A, Veeravalli VV (2010a) Distributed stochastic sub-gradient projection algorithms for convex optimization. J Optim Theory Appl 147:516–545
Ram S, Nedić A, Veeravalli V (2010b) Asynchronous gossip algorithms for stochastic optimization: constant stepsize analysis. In: Recent advances in optimization and its applications in engineering: the 14th Belgian-French-German conference on optimization (BFG). Springer-Verlag, Berlin/Heidelberg, pp 51–60
Ram SS, Nedić A, Veeravalli VV (2012) A new class of distributed optimization algorithms: application to regression of distributed data. Optim Method Softw 27(1):71–88
Ren W, Beard R (2005) Consensus seeking in multi-agent systems under dynamically changing interaction topologies. IEEE Trans Autom Control 50(5):655–661
Srivastava K (2011) Distributed optimization with applications to sensor networks and machine learning. Ph.D. dissertation, University of Illinois at Urbana-Champaign, Industrial and Enterprise Systems Engineering
Srivastava K, Nedić A (2011) Distributed asynchronous constrained stochastic optimization. IEEE J Sel Top Signal Process 5(4):772–790
Srivastava K, Nedić A, Stipanović D (2010) Distributed constrained optimization over noisy networks. In: Proceedings of the 49th IEEE conference on decision and control (CDC), Atlanta, pp 1945–1950
Tseng P (1998) An incremental gradient(-projection) method with momentum term and adaptive stepsize rule. SIAM J Optim 8:506–531
Tsianos K (2013) The role of the network in distributed optimization algorithms: convergence rates, scalability, communication/computation tradeoffs and communication delays. Ph.D. dissertation, McGill University, Department of Electrical and Computer Engineering
Tsianos K, Rabbat M (2011) Distributed consensus and optimization under communication delays. In: Proceedings of Allerton conference on communication, control, and computing, Monticello, pp 974–982
Tsianos K, Lawlor S, Rabbat M (2012a) Consensus-based distributed optimization: practical issues and applications in large-scale machine learning. In: Proceedings of the 50th Allerton conference on communication, control, and computing, Monticello
Tsianos K, Lawlor S, Rabbat M (2012b) Push-sum distributed dual averaging for convex optimization. In: Proceedings of the IEEE conference on decision and control, Maui
Tsitsiklis J (1984) Problems in decentralized decision making and computation. Ph.D. dissertation, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology
Tsitsiklis J, Bertsekas D, Athans M (1986) Distributed asynchronous deterministic and stochastic gradient optimization algorithms. IEEE Trans Autom Control 31(9):803–812
Touri B (2011) Product of random stochastic matrices and distributed averaging. Ph.D. dissertation, University of Illinois at Urbana-Champaign, Industrial and Enterprise Systems Engineering
Touri B, Nedić A (2009) Distributed consensus over network with noisy links. In: Proceedings of the 12th international conference on information fusion, Seattle, pp 146–154
Tu S-Y, Sayed A (2012) Diffusion strategies outperform consensus strategies for distributed estimation over adaptive networks. http://arxiv.org/abs/1205.3993
Vicsek T, Czirok A, Ben-Jacob E, Cohen I, Schochet O (1995) Novel type of phase transitions in a system of self-driven particles. Phys Rev Lett 75(6):1226–1229
Wan P, Lemmon M (2009) Event-triggered distributed optimization in sensor networks. In: Symposium on information processing of sensor networks, San Francisco, pp 49–60
Wang J, Elia N (2011) A control perspective for centralized and distributed convex optimization. In: IEEE conference on decision and control, Florida, pp 3800–3805
Wei E, Ozdaglar A (2012) Distributed alternating direction method of multipliers. In: Proceedings of the 51st IEEE conference on decision and control and European control conference, Maui
Zanella F, Varagnolo D, Cenedese A, Pillonetto G, Schenato L (2011) Newton-Raphson consensus for distributed convex optimization. In: IEEE conference on decision and control, Florida, pp 5917–5922
Zargham M, Ribeiro A, Ozdaglar A, Jadbabaie A (2014) Accelerated dual descent for network flow optimization. IEEE Trans Autom Control 59(4):905–920
Zhu M, Martínez S (2012) On distributed convex optimization under inequality and equality constraints. IEEE Trans Autom Control 57(1):151–164

DOC

▶ Discrete Optimal Control

Dynamic Graphs, Connectivity of

Michael M. Zavlanos¹ and George J. Pappas²
¹Department of Mechanical Engineering and Materials Science, Duke University, Durham, NC, USA
²Department of Electrical and Systems Engineering, University of Pennsylvania, Philadelphia, PA, USA

Abstract

Dynamic networks have recently emerged as an efficient way to model various forms of interaction within teams of mobile agents, such as sensing and communication. This article focuses on the use of graphs as models of wireless communications. In this context, graphs have been used widely in the study of robotic and sensor networks and have provided an invaluable modeling framework to address a number of coordinated tasks ranging from exploration, surveillance, and reconnaissance to cooperative construction and manipulation. In fact, these success stories have almost always relied on efficient information exchange and coordination between the members of the team, as seen, e.g., in the case of distributed state agreement, where multi-hop communication has been proven necessary for convergence and performance guarantees.

Keywords

Algebraic graph theory; Convex optimization; Distributed and hybrid control; Graph connectivity

Introduction

Communication in networked dynamical systems has typically relied on constructs from graph theory, with disc-based and weighted-proximity graphs gaining the most popularity; see Fig. 1a, b. Besides their simplicity, these models owe their popularity to their resemblance to radio signal strength models, where the signals attenuate with the distance (Neskovic et al. 2000; Pahlavan and Levesque 1995; Parsons 2000). In this context, multi-hop communication becomes equivalent to network connectivity, defined as the property of a graph to transmit information between any pair of its nodes; see Fig. 1c.

Specifically, let G(t) = {V, E(t), W(t)} denote a graph on n nodes that can be robots or mobile sensors, so that V = {1, ..., n} is the set of vertices, E(t) ⊆ V × V is the set of edges at time t, and W(t) = {w_ij(t) | (i, j) ∈ V × V} is a set of weights so that w_ij(t) = 0 if (i, j) ∉ E(t) and w_ij(t) > 0 otherwise. If w_ij(t) = w_ji(t) for all pairs of nodes i, j, then the graph is called
Dynamic Graphs, Connectivity of, Fig. 1 (a) Disc-based model of communication; (b) weighted, proximity-based model of communication; (c) connected network of mobile robots

undirected; otherwise it is called directed. The weights in W(t) typically model signal strength or channel reliability, as per the disc-based and weighted-proximity models in Fig. 1a, b. In these models, communication between nodes is related to their pairwise distance, giving rise to the dynamic or time-varying nature of the graph G(t) due to node mobility. Given an undirected dynamic graph G(t), we say that this graph is connected at time t if there exists a path, i.e., a sequence of distinct vertices such that consecutive vertices are adjacent, between any two vertices in G(t). In the case of directed graphs, two notions of connectivity are defined. A directed graph G(t) is called strongly connected if there exists a directed path between any two of its vertices or, equivalently, if every vertex is reachable from any other vertex. On the other hand, a directed graph is called weakly connected if replacing all directed edges by undirected edges produces a connected undirected graph. Finally, a collection of graphs {G(t) | t = t_0, ..., t_k} is called jointly connected over time if the union graph ∪_{t=t_0}^{t_k} G(t) = {V, ∪_{t=t_0}^{t_k} E(t)} is connected. Clearly, checking for the existence of paths between all pairs of nodes in a graph is difficult, especially so as the number of nodes in the graph increases. For this reason, equivalent algebraic representations of graphs are employed that allow for efficient algebraic ways to check for connectivity, as we discuss in the following section.

While connectivity is necessary for information propagation in a network, it is also relevant to the performance of many networked dynamical processes, such as synchronization and gossiping, via its relation to the network eigenvalue spectra (Preciado 2008). For example, the spectrum of the Laplacian matrix of a network plays a key role in the analysis of synchronization in networks of nonlinear oscillators (Pecora and Carroll 1998; Preciado and Verghese 2005), distributed algorithms (Lynch 1997), and decentralized control problems (Fax and Murray 2004; Olfati-Saber and Murray 2004). Similarly, the spectrum of the adjacency matrix determines the speed of viral information spreading in a network (Van Mieghem et al. 2009). Additionally, more robust versions of connectivity, such as k-node or k-edge connectivity, can be used to introduce robustness of a network to node or link failures, respectively (Zavlanos and Pappas 2005, 2008).

Graph-Theoretic Connectivity Control

Connectivity Using the Graph Laplacian Matrix

A metric that is typically employed to capture connectivity of dynamic networks is the second smallest eigenvalue λ_2(L) of the Laplacian matrix L ∈ R^{n×n} of the graph, also known as the algebraic connectivity or Fiedler value of the graph. For a weighted graph G = {V, E, W}, the entries of the Laplacian matrix are typically related to the weights in W so that the (i, j) entry of L is given by [L]_ij = Σ_{k=1}^n w_ik if i = j and [L]_ij = −w_ij if i ≠ j. The Laplacian matrix of an undirected graph is always a symmetric, positive semidefinite matrix whose smallest eigenvalue λ_1(L) is identically zero
Dynamic Graphs, Connectivity of 319

with corresponding eigenvector the vector of all Connectivity Using the Graph Adjacency
entries equal to one. Additionally, the algebraic Matrix
connectivity 2 .L/ is a concave function of the Alternatively, connectivity can be captured by the
P
Laplacian matrix that is positive if and only if sum of powers K k
kD0 A of the adjacency matrix
the graph is connected (Fiedler 1973; Godsil and A2R nn
of the network for K  n  1. The en-
Royle 2001; Merris 1994; Mohar 1991). tries of the adjacency matrix are typically related
As the algebraic connectivity 2 .L/ plays to the weights in W as ŒAij D wij . For disc-
a critical role in determining whether a graph is based graphs as in Fig. 1a, the i; j entry of the
connected or not, a number of methods have been kth power of the adjacency matrix ŒAk ij captures
D
proposed for its decentralized estimation and the number of paths of length k between nodes
control. These range from methods that employ i and j ; for weighted graphs, ŒAk ij captures
market-based control to underestimate the a weighted sum of those paths. Therefore, the
P
algebraic connectivity and accordingly control entries of K k
kD0 A represent the number of paths
the network structure (Zavlanos and Pappas up to length K between every pair of nodes in
2008) to methods that enforce the states of the the graph (Godsil and Royle 2001). By definition
P
nodes to oscillate at frequencies that correspond of graph connectivity, if all entries of K kD0 A
k

to the Laplacian eigenvalues and then use fast are positive for K D n  1, then the network
Fourier transform to estimate these eigenvalues is connected. Clearly, for K < n  1, not all
P
(Franceschelli et al. 2013), to methods that entries of K k
kD0 A are necessarily positive, even
iteratively update the interval where the algebraic if the graph is connected. Maintaining positive
P
connectivity is supposed to lie (Montijano et al. definiteness of the positive entries of K kD0 A
k

2011), and to methods that rely on the power of an initially connected graph maintains paths
iteration method and its variants (DeGennaro of length K between the corresponding nodes
and Jadbabaie 2006; Kempe and McSherry 2008; and, as shown in Zavlanos and Pappas (2005),
Knorn et al. 2009; Oreshkin et al. 2010; Sabattini is sufficient to maintain connectivity of the graph
et al. 2011; Yang et al. 2010). All the above throughout.
techniques are often integrated with appropriate The ability to capture graph connectivity
controllers to regulate mobility of the nodes while using the adjacency matrix has given rise
ensuring connectivity of the network. Another to optimization-based connectivity controllers
way that 2 .L/ can be used to ensure connectivity (Srivastava and Spong 2008; Zavlanos and
of dynamic graphs is via optimization-based Pappas 2005) that are often centralized due to the
methods that maximize it away from its zero multi-hop dependencies between nodes due to the
value. Such approaches were initially centralized powers of the adjacency matrix. Since smaller
as connectivity is a global property of a graph powers correspond to shorter dependencies
(Kim and Mesbahi 2006), although recently (paths), decentralization is possible as K
distributed subgradient algorithms (DeGennaro decreases. If K D 1, connectivity maintenance
and Jadbabaie 2006) as well as non-iterative reduces to preserving the pairwise links between
decomposition techniques (Simonetto et al. the nodes in an initially connected network. Since
2013) have also been proposed. As the algebraic the adjacency matrix of weighted graphs is often
connectivity is a non-differentiable function a differentiable function, this approach can result
of the Laplacian matrix, designing continuous in continuous feedback solution techniques.
feedback controllers to maintain it positive Discrete-time approaches are discussed in Ando
definite is a challenging task. This problem et al. (1999), Notarstefano et al. (2006), and
was overcome in Zavlanos and Pappas (2007) Bullo et al. (2009), while Spanos and Murray
via the use of gradient flows that maintain (2004), Dimarogonas and Kyriakopoulos (2008),
positive definiteness of the determinant of the Cornejo and Lynch (2008), Yao and Gupta
projected Laplacian matrix to the space that is (2009), Zavlanos et al. (2007), and Ji and
perpendicular to eigenvector of ones. Egerstedt (2007) rely on local gradients that
may also incorporate switching in the case of link additions. Switching between arbitrary spanning topologies has also been studied in the literature, with the spanning subgraphs being updated by local auctions (Zavlanos and Pappas 2008), distributed spanning tree algorithms (Wagenpfeil et al. 2009), combinations of information dissemination algorithms and graph picking games (Schuresko and Cortes 2009b), or intermediate rendezvous (Schuresko and Cortes 2009a; Spanos and Murray 2005). This class of approaches is typically hybrid, combining continuous link maintenance and discrete topology control. The algebraic connectivity λ_2(L) and number-of-paths Σ_{k=0}^K A^k metrics can also be combined to give controllers that maintain connectivity while enforcing desired multi-hop neighborhoods for all agents (Stump et al. 2008).

A recent, comprehensive survey on graph-theoretic approaches for connectivity control of dynamic graphs can be found in Zavlanos et al. (2011).

Applications in Mobile Robot Network Control

Methods to control connectivity of dynamic graphs have been successfully applied to multiple scenarios that require network connectivity to achieve a global coordinated objective. Indicative of the impact of this work is recent literature on connectivity-preserving rendezvous (Ando et al. 1999; Cortes et al. 2006; Dimarogonas and Kyriakopoulos 2008; Ganguli et al. 2009; Ji and Egerstedt 2007), flocking (Zavlanos et al. 2007, 2009), and formation control (Ji and Egerstedt 2007; Schuresko and Cortes 2009a), where so far connectivity had been an assumption. Further extensions and contributions involve connectivity control for double-integrator agents (Notarstefano et al. 2006), agents with bounded inputs (Ajorlou and Aghdam 2010; Ajorlou et al. 2010; Dimarogonas and Johansson 2008), and indoor navigation (Stump et al. 2008), as well as for communication based on radio signal strength (Hsieh et al. 2008; Mostofi 2009; Powers and Balch 2004; Wagner and Arkin 2004) and visibility constraints (Anderson et al. 2003; Ando et al. 1999; Arkin and Diaz 2002; Flocchini et al. 2005; Ganguli et al. 2009). Periodic connectivity for robot teams that need to occasionally split in order to achieve individual objectives (Hollinger and Singh 2010; Zavlanos 2010) and sufficient conditions for connectivity in leader-follower networks (Gustavi et al. 2010) also add to the list. Early experimental results have demonstrated the efficiency of these algorithms also in practice (Hollinger and Singh 2010; Michael et al. 2009; Tardioli et al. 2010).

Summary and Future Directions

Although graphs provide a simple abstraction of inter-robot communications, it has long been recognized that since links in a wireless network do not entail tangible connections, associating links with arcs on a graph can be somewhat arbitrary. Indeed, topological definitions of connectivity start by setting target signal strengths to draw the corresponding graph. Even small differences in target strengths might result in dramatic differences in network topology (Lundgren et al. 2002). As a result, graph connectivity is necessary but not nearly sufficient to guarantee communication integrity, interpreted as the ability of a network to support desired communication rates.

To address these challenges, a new body of work is recently appearing that departs from traditional graph-based models of communication. Specifically, Zavlanos et al. (2013) employ a simple, yet effective, modification that relies on weighted graph models with weights that capture the packet error probability of each link (DeCouto et al. 2006). When using reliabilities as link metrics, it is possible to model routing and scheduling problems as optimization problems that accept link reliabilities as inputs (Ribeiro et al. 2007, 2008). The key idea proposed in Zavlanos et al. (2013) is to define connectivity in terms of communication rates and to use optimization formulations to describe optimal operating points of wireless networks. Then, the
communication variables are updated in discrete time via a distributed gradient descent algorithm on the dual function, while robot motion is regulated in continuous time by means of appropriate distributed barrier potentials that maintain desired communication rates. Related approaches consider optimal communications based on T-slot time averages of the primal variables for general mobility schemes (Neely 2010), as well as optimization of mobility and communications based on the end-to-end bit error rate between nodes (Ghaffarkhah and Mostofi 2011; Yan and Mostofi 2012).

Cross-References

▶ Flocking in Networked Systems
▶ Graphs for Modeling Networked Interactions

Bibliography

Ajorlou A, Aghdam AG (2010) A class of bounded distributed controllers for connectivity preservation of unicycles. In: Proceedings of the 49th IEEE conference on decision and control, Atlanta, pp 3072–3077
Ajorlou A, Momeni A, Aghdam AG (2010) A class of bounded distributed control strategies for connectivity preservation in multi-agent systems. IEEE Trans Autom Control 55(12):2828–2833
Anderson SO, Simmons R, Goldberg D (2003) Maintaining line-of-sight communications networks between planetary rovers. In: Proceedings of the 2003 IEEE/RSJ international conference on intelligent robots and systems, Las Vegas, pp 2266–2272
Ando H, Oasa Y, Suzuki I, Yamashita M (1999) Distributed memoryless point convergence algorithm for mobile robots with limited visibility. IEEE Trans Robot Autom 15(5):818–828
Arkin RC, Diaz J (2002) Line-of-sight constrained exploration for reactive multiagent robotic teams. In: Proceedings of the 7th international workshop on advanced motion control, Maribor, pp 455–461
Bullo F, Cortes J, Martinez S (2009) Distributed control of robotic networks. Applied Mathematics Series. Princeton University Press, Princeton
Cornejo A, Lynch N (2008) Connectivity service for mobile ad-hoc networks. In: Proceedings of the 2nd IEEE international conference on self-adaptive and self-organizing systems workshops, pp 292–297
Cortes J, Martinez S, Bullo F (2006) Robust rendezvous for mobile autonomous agents via proximity graphs in arbitrary dimensions. IEEE Trans Autom Control 51(8):1289–1298
DeCouto D, Aguayo D, Bicket J, Morris R (2006) A high-throughput path metric for multihop wireless routing. In: Proceedings of the international ACM conference on mobile computing and networking, San Diego, pp 134–146
DeGennaro MC, Jadbabaie A (2006) Decentralized control of connectivity for multi-agent systems. In: Proceedings of the 45th IEEE conference on decision and control, San Diego, pp 3628–3633
Dimarogonas DV, Johansson KH (2008) Decentralized connectivity maintenance in mobile networks with bounded inputs. In: Proceedings of the IEEE international conference on robotics and automation, Pasadena, pp 1507–1512
Dimarogonas DV, Kyriakopoulos KJ (2008) Connectedness preserving distributed swarm aggregation for multiple kinematic robots. IEEE Trans Robot 24(5):1213–1223
Fax A, Murray RM (2004) Information flow and cooperative control of vehicle formations. IEEE Trans Autom Control 49:1465–1476
Fiedler M (1973) Algebraic connectivity of graphs. Czechoslovak Math J 23(98):298–305
Flocchini P, Prencipe G, Santoro N, Widmayer P (2005) Gathering of asynchronous oblivious robots with limited visibility. Theor Comput Sci 337(1–3):147–168
Franceschelli M, Gasparri A, Giua A, Seatzu C (2013) Decentralized estimation of Laplacian eigenvalues in multi-agent systems. Automatica 49(4):1031–1036
Ganguli A, Cortes J, Bullo F (2009) Multirobot rendezvous with visibility sensors in nonconvex environments. IEEE Trans Robot 25(2):340–352
Ghaffarkhah A, Mostofi Y (2011) Communication-aware motion planning in mobile networks. IEEE Trans Autom Control Spec Issue Wirel Sens Actuator Netw 56(10):2478–248
Godsil C, Royle G (2001) Algebraic graph theory. Graduate Texts in Mathematics, vol 207. Springer, Berlin
Gustavi T, Dimarogonas DV, Egerstedt M, Hu X (2010) Sufficient conditions for connectivity maintenance and rendezvous in leader-follower networks. Automatica 46(1):133–139
Hollinger G, Singh S (2010) Multi-robot coordination with periodic connectivity. In: Proceedings of the IEEE international conference on robotics and automation, Anchorage, pp 4457–4462
Hsieh MA, Cowley A, Kumar V, Taylor C (2008) Maintaining network connectivity and performance in robot teams. J Field Robot 25(1–2):111–131
Ji M, Egerstedt M (2007) Coordination control of multi-agent systems while preserving connectedness. IEEE Trans Robot 23(4):693–703
Kempe D, McSherry F (2008) A decentralized algorithm for spectral analysis. J Comput Syst Sci 74(1):70–83
Kim Y, Mesbahi M (2006) On maximizing the second smallest eigenvalue of a state-dependent graph Laplacian. IEEE Trans Autom Control 51(1):116–120
Knorn F, Stanojevic R, Corless M, Shorten R (2009) A framework for decentralized feedback connectivity control with application to sensor networks. Int J Control 82(11):2095–2114
Lundgren H, Nordstrom E, Tschudin C (2002) The gray zone problem in IEEE 802.11b based ad hoc networks. ACM SIGMOBILE Mobile Comput Commun Rev 6(3):104–105
Lynch N (1997) Distributed algorithms. Morgan Kaufmann, San Francisco
Merris R (1994) Laplacian matrices of a graph: a survey. Linear Algebra Appl 197:143–176
Michael N, Zavlanos MM, Kumar V, Pappas GJ (2009) Maintaining connectivity in mobile robot networks. In: Experimental robotics. Tracts in advanced robotics. Springer, Berlin/Heidelberg, pp 117–126
Mohar B (1991) The Laplacian spectrum of graphs. In: Alavi Y, Chartrand G, Ollermann O, Schwenk A (eds) Graph theory, combinatorics, and applications. Wiley, New York, pp 871–898
Montijano E, Montijano JI, Sagues C (2011) Adaptive consensus and algebraic connectivity estimation in sensor networks with Chebyshev polynomials. In: Proceedings of the 50th IEEE conference on decision and control, Orlando, pp 4296–4301
Mostofi Y (2009) Decentralized communication-aware motion planning in mobile networks: an information-gain approach. J Intell Robot Syst 56(1–2):233–256
Neely MJ (2010) Universal scheduling for networks with arbitrary traffic, channels, and mobility. In: Proceedings of the 49th IEEE conference on decision and control, Atlanta, pp 1822–1829
Neskovic A, Neskovic N, Paunovic G (2000) Modern approaches in modeling of mobile radio systems propagation environment. IEEE Commun Surv 3(3):1–12
Notarstefano G, Savla K, Bullo F, Jadbabaie A (2006) Maintaining limited-range connectivity among second-order agents. In: Proceedings of the 2006 American control conference, Minneapolis, pp 2124–2129
Olfati-Saber R, Murray RM (2004) Consensus problems in networks of agents with switching topology and time-delays. IEEE Trans Autom Control 49:1520–1533
Oreshkin BN, Coates MJ, Rabbat MG (2010) Optimization and analysis of distributed averaging with short node memory. IEEE Trans Signal Process 58(5):2850–2865
Pahlavan K, Levesque AH (1995) Wireless information networks. Wiley, New York
Parsons JD (2000) The mobile radio propagation channel. Wiley, Chichester
Pecora L, Carroll T (1998) Master stability functions for synchronized coupled systems. Phys Rev Lett 80:2109–2112
Powers M, Balch T (2004) Value-based communication preservation for mobile robots. In: Proceedings of the 7th international symposium on distributed autonomous robotic systems, Toulouse
Preciado V (2008) Spectral analysis for stochastic models of large-scale complex dynamical networks. Ph.D. dissertation, Department of Electrical Engineering and Computer Science, MIT
Preciado V, Verghese G (2005) Synchronization in generalized Erdős–Rényi networks of nonlinear oscillators. In: 44th IEEE conference on decision and control, Seville, Spain, pp 4628–463
Ribeiro A, Luo Z-Q, Sidiropoulos ND, Giannakis GB (2007) Modelling and optimization of stochastic routing for wireless multihop networks. In: Proceedings of the 26th annual joint conference of the IEEE Computer and Communications Societies (INFOCOM), Anchorage, pp 1748–1756
Ribeiro A, Sidiropoulos ND, Giannakis GB (2008) Optimal distributed stochastic routing algorithms for wireless multihop networks. IEEE Trans Wirel Commun 7(11):4261–4272
Sabattini L, Chopra N, Secchi C (2011) On decentralized connectivity maintenance for mobile robotic systems. In: Proceedings of the 50th IEEE conference on decision and control, Orlando, pp 988–993
Schuresko M, Cortes J (2009a) Distributed tree rearrangements for reachability and robust connectivity. In: Hybrid systems: computation and control. Lecture notes in computer science, vol 5469. Springer, Berlin/New York, pp 470–474
Schuresko M, Cortes J (2009b) Distributed motion constraints for algebraic connectivity of robotic networks. J Intell Robot Syst 56(1–2):99–126
Simonetto A, Keviczky T, Babuska R (2013) Constrained distributed algebraic connectivity maximization in robotic networks. Automatica 49(5):1348–1357
Spanos DP, Murray RM (2004) Robust connectivity of networked vehicles. In: Proceedings of the 43rd IEEE conference on decision and control, Bahamas, pp 2893–2898
Spanos DP, Murray RM (2005) Motion planning with wireless network constraints. In: Proceedings of the 2005 American control conference, Portland, pp 87–92
Srivastava K, Spong MW (2008) Multi-agent coordination under connectivity constraints. In: Proceedings of the 2008 American control conference, Seattle, pp 2648–2653
Stump E, Jadbabaie A, Kumar V (2008) Connectivity management in mobile robot teams. In: Proceedings of the IEEE international conference on robotics and automation, Pasadena, pp 1525–1530
Tardioli D, Mosteo AR, Riazuelo L, Villarroel JL, Montano L (2010) Enforcing network connectivity in robot team missions. Int J Robot Res 29(4):460–480
Van Mieghem P, Omic J, Kooij R (2009) Virus spread in networks. IEEE/ACM Trans Networking 17(1):1–14
Wagenpfeil J, Trachte A, Hatanaka T, Fujita M, Sawodny O (2009) A distributed minimum restrictive connectivity maintenance algorithm. In: Proceedings of the 9th international symposium on robot control, Gifu
Wagner AR, Arkin RC (2004) Communication-sensitive multi-robot reconnaissance. In: Proceedings of the
IEEE international conference on robotics and automation, New Orleans, pp 2480–2487
Yan Y, Mostofi Y (2012) Robotic router formation in realistic communication environments. IEEE Trans Robot 28(4):810–827
Yang P, Freeman RA, Gordon GJ, Lynch KM, Srinivasa SS, Sukthankar R (2010) Decentralized estimation and control of graph connectivity for mobile sensor networks. Automatica 46(2):390–396
Yao Z, Gupta K (2009) Backbone-based connectivity control for mobile networks. In: Proceedings IEEE international conference on robotics and automation, Kobe, pp 1133–1139
Zavlanos MM (2010) Synchronous rendezvous of very-low-range wireless agents. In: Proceedings of the 49th IEEE conference on decision and control, Atlanta, pp 4740–4745
Zavlanos MM, Pappas GJ (2005) Controlling connectivity of dynamic graphs. In: Proceedings of the 44th IEEE conference on decision and control and European control conference, Seville, pp 6388–6393
Zavlanos MM, Pappas GJ (2007) Potential fields for maintaining connectivity of mobile networks. IEEE Trans Robot 23(4):812–816
Zavlanos MM, Pappas GJ (2008) Distributed connectivity control of mobile networks. IEEE Trans Robot 24(6):1416–1428
Zavlanos MM, Jadbabaie A, Pappas GJ (2007) Flocking while preserving network connectivity. In: Proceedings of the 46th IEEE conference on decision and control, New Orleans, pp 2919–2924
Zavlanos MM, Tanner HG, Jadbabaie A, Pappas GJ (2009) Hybrid control for connectivity preserving flocking. IEEE Trans Autom Control 54(12):2869–2875
Zavlanos MM, Egerstedt MB, Pappas GJ (2011) Graph theoretic connectivity control of mobile robot networks. Proc IEEE Spec Issue Swarming Nat Eng Syst 99(9):1525–154
Zavlanos MM, Ribeiro A, Pappas GJ (2013) Network integrity in mobile robotic networks. IEEE Trans Autom Control 58(1):3–18

Dynamic Noncooperative Games

David A. Castañón
Boston University, Boston, MA, USA

Abstract

In this entry, we present models of dynamic noncooperative games, solution concepts, and algorithms for finding game solutions. For the sake of exposition, we focus mostly on finite games, where the number of actions available to each player is finite, and discuss briefly extensions to infinite games.

Keywords

Extensive form games; Finite games; Nash equilibrium

Introduction

Dynamic noncooperative games allow multiple actions by individual players, and include explicit representations of the information available to each player for selecting its decision. Such games have a complex temporal order of play and an information structure that reflects uncertainty as to what individual players know when they have to make decisions. This temporal order and information structure is not evident when the game is represented as a static game between players that select strategies. Dynamic games often incorporate explicit uncertainty in outcomes, by representing such outcomes as actions taken by a random player (called chance or "Nature") with known probability distributions for selecting its actions.

We focus our exposition on models of finite games and discuss briefly extensions to infinite games at the end of the entry.

Finite Games in Extensive Form

The extensive form of a game was introduced by von Neumann (1928) and later refined by Kuhn (1953) to represent explicitly the order of play, the information and actions available to each player for making decisions at each of their turns, and the payoffs that players receive after a complete set of actions. Let I = {0, 1, …, n} denote the set of players in a game, where player 0 corresponds to Nature. The extensive form is represented in terms of a game tree, consisting of a rooted tree with nodes N and edges E. The root node represents the initial state of the game. Nodes x in this tree correspond to positions or "states" of the game. Any non-root node with
more than one incident edge is an internal node of the tree and is a decision node associated with a player in the game; the root node is also a decision node. Each decision node x has a player p(x) ∈ I assigned to select a decision from a finite set of admissible decisions A(x). Using distance from the root node to indicate direction of play, each edge that follows node x corresponds to an action in A(x) taken by player p(x), which evolves the state of the game into a subsequent node x′.
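The objects just defined can be made concrete in code. The following is a purely illustrative sketch: the tiny game tree, Nature's probabilities, and the payoffs below are hypothetical, and the expected-payoff routine anticipates the pure-strategy and play-probability constructions described later in this entry.

```python
# Purely illustrative sketch: a tiny extensive-form game tree. The game,
# its probabilities, and its payoffs are hypothetical, not from the entry.

# Decision nodes: x -> (p(x), {action in A(x): successor node}); player 0 is Nature.
tree = {
    "root": (0, {"H": "a", "T": "b"}),       # Nature moves first
    "a":    (1, {"L": "x1", "R": "x2"}),     # player 1 cannot tell a from b,
    "b":    (1, {"L": "x3", "R": "x4"}),     # so {a, b} is one information set
}
chance = {("root", "H"): 0.5, ("root", "T"): 0.5}   # Nature's distribution
payoffs = {"x1": (2, -2), "x2": (4, -4), "x3": (1, -1), "x4": (3, -3)}

def expected_payoffs(strategy):
    """Expected payoff tuple when each player uses a pure strategy, i.e., one
    action per information set (represented here as a frozenset of nodes)."""
    def walk(node, prob):
        if node in payoffs:                  # terminal node: weight by its probability
            return tuple(prob * j for j in payoffs[node])
        player, actions = tree[node]
        if player == 0:                      # Nature: average over its actions
            total = (0.0, 0.0)
            for act, child in actions.items():
                branch = walk(child, prob * chance[(node, act)])
                total = tuple(t + b for t, b in zip(total, branch))
            return total
        info_set = frozenset(n for n, (p, _) in tree.items() if p == player)
        return walk(actions[strategy[player][info_set]], prob)
    return walk("root", 1.0)

# Player 1 plays "L" at his single information set {a, b}:
print(expected_payoffs({1: {frozenset({"a", "b"}): "L"}}))  # (1.5, -1.5)
```

Under this encoding, a pure strategy is exactly a map from information sets to admissible actions, and the probability weighting mirrors how Nature's moves are averaged out when the expected payoff of a strategy profile is computed.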
Dynamic Noncooperative Games, Fig. 1 Illustration of game in extensive form

Non-root nodes with only one incident edge are terminal nodes, which indicate the end of the game; such nodes represent outcomes of the game. The unique path from the root node to a terminal node is called a play of the game. For each outcome node x, there are payoff functions J_i(x) associated with each player i ∈ {1, …, n}.

The above form represents the different players, the order in which players select actions and their possible actions, and the resulting payoffs to each player from a complete set of plays of the game. The last component of interest is to represent the information available for players to select decisions at each decision node. Due to the tree structure of the extensive form of a game, each decision node x contains exact information on all of the previous actions taken that led to state x. In games of perfect information, each player knows exactly the decision node x at which he/she is selecting an action. To represent imperfect information, the extensive form uses the notion of an information set, which represents a group of decision nodes, associated with the same player, where the information available to that player is that the game is in one of the states in that information set. Formally, let N_i ⊆ N be the set of decision nodes associated with player i, for i ∈ I. Let H_i denote a partition of N_i so that, for each set h_i^k ∈ H_i, we have the following properties:
• If x, x′ ∈ h_i^k, then they have the same admissible actions: A(x) = A(x′).
• If x, x′ ∈ h_i^k, then they cannot both belong to a play of the game; that is, both x and x′ cannot be on a path from the root node to an outcome.
Elements h_i^k for some player i are the information sets. Each decision node x belongs to one and only one information set associated with player p(x). The constraints above ensure that, for each information set, there is a unique player identified to select actions, and the set of admissible actions is unambiguously defined. Denote by A(h_i^k) the set of admissible actions at information set h_i^k. The last condition is a causality condition that ensures that a player who has selected a previous decision remembers that he/she has already made that previous decision.

Figure 1 illustrates an extensive form for a two-person game. Player 1 has two actions in any play of the game, whereas player 2 has only one action. Player 1 starts the game at the root node a; the information set h_2^1 shows that player 2 is unaware of this action, as both nodes that descend from node a are in this information set. After player 2's action, player 1 gets to select a second action. However, the information sets h_1^2, h_1^3 indicate that player 1 recalls what earlier action he/she selected, but he/she has not observed the action of player 2. The terminal nodes indicate the payoffs to player 1 and player 2 as an ordered pair.

Strategies and Equilibrium Solutions

A pure strategy γ_i for player i ∈ {1, …, n} is a function that maps each information set of player i into an admissible decision. That is,
γ_i : H_i → A

such that γ_i(h_i^k) ∈ A(h_i^k). The set of pure strategies for player i is denoted as Γ_i.

Note that pure strategies do not include actions selected by Nature. Nature's actions are selected using probability distributions over the choices available for the information sets corresponding to Nature, where the choice at each information set is made independently of other choices. For finite games, the number of pure strategies for each player is also finite. Given a tuple of pure strategies γ = (γ_1, …, γ_n), we can define the probability of the play of the game resulting in outcome node x as π(x), computed as follows: Each outcome node has a unique path (the play) p_rx from root node r. We initialize π(r) = 1 and node n = r. If the player at node n is Nature and the next node in p_rx is n′, then π(n′) = π(n)·p(n, n′), where p(n, n′) is the probability that Nature chooses the action that leads to n′. Otherwise, let i = p(n) be the player at node n and let h_i^k(n) denote the information set containing node n. Then, if γ_i(h_i^k(n)) = a(n, n′), let π(n′) = π(n), where a(n, n′) is the action in A(n) that leads to n′; otherwise, set π(n′) = 0. The above process is repeated letting n = n′, until n′ equals the terminal node x. Using these probabilities, the resulting expected payoff to player i is

J_i(γ) = Σ_{x terminal, x ∈ N} π(x) J_i(x)

This representation of payoffs in terms of strategies transforms the game from an extensive form representation to a strategic or normal form representation, where the concept of dynamics and information has been abstracted away. The resulting strategic form looks like a static game as discussed in the encyclopedia entry ▶ Strategic Form Games and Nash Equilibrium, where each player selects his/her strategy from a finite set, resulting in a vector of payoffs for the players.

Using these payoffs, one can now define solution concepts for the game. Let the notation γ_{−i} = (γ_1, …, γ_{i−1}, γ_{i+1}, …, γ_n) denote the set of strategies in a tuple excluding the ith strategy. A Nash equilibrium solution is a tuple of feasible strategies γ* = (γ*_1, …, γ*_n) such that

J_i(γ*) ≥ J_i(γ_i, γ*_{−i})  for all γ_i ∈ Γ_i, for all i ∈ {1, …, n}   (1)

Two-person games where J_1(γ) = −J_2(γ) are the special case known as zero-sum games. As discussed in the encyclopedia entry on static games (▶ Strategic Form Games and Nash Equilibrium), the existence of Nash equilibria or even saddle-point strategies in terms of pure strategies is not guaranteed for finite games. Thus, one must consider the use of mixed strategies. A mixed strategy μ_i for player i ∈ {1, …, n} is a probability distribution over the set of pure strategies Γ_i. The definition of payoffs can be extended to mixed strategies by averaging the payoffs associated with the pure strategies, as

J_i(μ_1, …, μ_n) = Σ_{γ_1 ∈ Γ_1} … Σ_{γ_n ∈ Γ_n} μ_1(γ_1) ⋯ μ_n(γ_n) J_i(γ_1, …, γ_n)

Denote the set of probability distributions over Γ_i as Δ(Γ_i). An n-tuple μ* = (μ*_1, …, μ*_n) of mixed strategies is said to be a Nash equilibrium if

J_i(μ*) ≥ J_i(μ_i, μ*_{−i})  for all μ_i ∈ Δ(Γ_i), for all i ∈ {1, …, n}

Theorem 1 (Nash 1950, 1951) Every finite n-person game has at least one Nash equilibrium point in mixed strategies.

Mixed strategies suggest that each player's randomization occurs before the game is played, by choosing a strategy at random from its choices of pure strategies. For games in extensive form, one can introduce a different class of strategies where a player makes a random choice of action at each information set, according to a probability distribution that depends on the specific information set. The choice of action is selected independently at each information set according
326 Dynamic Noncooperative Games

to a selected probability distribution, in a manner similar to how Nature's actions are selected. These random choice strategies are known as behavior strategies. Let Δ(A(h_i^k)) denote the set of probability distributions over the decisions A(h_i^k) for player i at information set h_i^k. A behavior strategy for player i is an element x_i ∈ Π_{h_i^k ∈ H_i} Δ(A(h_i^k)), where x_i(h_i^k) denotes the probability distribution of the behavior strategy x_i over the available decisions at information set h_i^k.

Note that the space of admissible behavior strategies is much smaller than the space of admissible mixed strategies. To illustrate this, consider a player with K information sets and two possible decisions for each information set. The number of possible pure strategies would be 2^K, and thus the space of mixed strategies would be a probability simplex of dimension 2^K − 1. In contrast, behavior strategies would require specifying probability distributions over two choices for each of K information sets, so the space of behavior strategies would be a product space of K probability simplices of dimension 1, totaling dimension K. One way of understanding this difference is that mixed strategies introduce correlated randomization across choices at different information sets, whereas behavior strategies introduce independent randomization across such choices.

For every behavior strategy, one can find an equivalent mixed strategy by computing the probabilities of every set of actions that result from the behavior strategy. The converse is not true for general games in extensive form. However, there is a special class of games for which the converse is true. In this class of games, players recall what actions they have selected previously and what information they knew previously. A formal definition of perfect recall is beyond the scope of this exposition but can be found in Hart (1991) and Kuhn (1953). The implication of perfect recall is summarized below:

Theorem 2 (Kuhn 1953) Given a finite n-person game in which player i has perfect recall, for each mixed strategy γ_i for player i, there exists a corresponding behavior strategy x_i that is equivalent, where every player receives the same payoffs under both strategies γ_i and x_i.

This equivalence was extended by Aumann to infinite games (Aumann 1964). In dynamic games, it is common to assume that each player has perfect recall, and thus solutions can be found in the smaller space of behavior strategies.

Computation of Equilibria
Algorithms for the computation of mixed-strategy Nash equilibria of static games can be extended to compute mixed-strategy Nash equilibria for games in extensive form when the pure strategies are enumerated as above. However, the number of pure strategies grows exponentially with the size of the extensive form tree, making these methods hard to apply. For two-person games in extensive form with perfect recall by both players, one can search for Nash equilibria in the much smaller space of behavior strategies. This was exploited in Koller et al. (1996) to obtain efficient linear complementarity problems for nonzero-sum games and linear programs for zero-sum games, where the number of variables involved is linear in the number of internal decision nodes of the extensive form of the game. A more detailed overview of computation algorithms for Nash equilibria can be found in McKelvey and McLennan (1996). The Gambit Web site (McKelvey et al. 2010) provides software implementations of several techniques for computation of Nash equilibria in two- and n-person games.

An alternative approach to computing Nash equilibria for games in extensive form is based on subgame decomposition, discussed next.

Subgames
Consider a game G in extensive form. A node c is a successor of a node n if there is a path in the game tree from n to c. Let h be a node in G that is not terminal and is the only node in its information set. Assume that if a node c is a successor of h, then every node in the information set containing c is also a successor of h. In this situation, one can define a subgame H of G with root node h, which consists of node h and its successors, connected by the same actions as in the original game G, with the payoffs and terminal vertices equal to those in G. This is the subgame that would be encountered by the players had the previous play of the game reached the state at node h. Since the information set containing node h contains no other node, and all the information sets in the subgame contain only nodes in the subgame, every player in the subgame knows that they are playing only in the subgame once h has been reached.

[Dynamic Noncooperative Games, Fig. 2: Simple game with multiple Nash equilibria]

Figure 2 illustrates a game where every nonterminal node is contained in its own information set. This game contains a subgame rooted at node c. Note that the full game has two Nash equilibria in pure strategies: strategies (L, L) and (R, R). However, strategy (L, L) is inconsistent with how player 2 would choose its decision if it were to find itself at node c. This inconsistency arises because the Nash equilibria are defined in terms of strategies announced before the game is played and may not be a reasonable way to select decisions if unanticipated plays occur. When player 1 chooses L, node c should not be reached in the play of the game, and thus player 2 can choose L because it does not affect the expected payoff. One can define the concept of Nash equilibria that are sequentially consistent as follows.

Let x_i be behavior strategies for player i in game G. Denote the restriction of these strategies to the subgame H as x_i^H. This restriction describes the probabilities for choices of actions for the information sets in H. Suppose the game G has a Nash equilibrium achieved by strategies (x_1, …, x_n). This Nash equilibrium is called subgame perfect (Selten 1975) if, for every subgame H of G, the strategies (x_1^H, …, x_n^H) are a Nash equilibrium for the subgame H. In the game in Fig. 2, there is only one subgame perfect equilibrium, which is (R, R). There are several other refinements of the Nash equilibrium concept to enforce sequential consistency, such as sequential equilibria (Kreps and Wilson 1982).

An important application of subgames is to compute subgame perfect equilibria by backward induction. The idea is to start with a subgame root node as close as possible to an outcome node (e.g., node c in Fig. 2). This small subgame can be solved for its Nash equilibria, to compute the equilibrium payoffs for each player. Then, in the original game G, the root node of the subgame can be replaced by an outcome node, with payoffs equal to the equilibrium payoffs in the subgame. This results in a smaller game, and the process can be applied inductively until the full game is solved. The subgame perfect equilibrium strategies can then be computed as the solution of the different subgames solved in this backward induction process. For Fig. 2, the subgame at c is solved by player 2 selecting R, with payoffs (4,1). The reduced new game has two choices for player 1, with best decision R. This leads to the overall subgame perfect equilibrium (R, R).

This backward induction procedure is similar to the dynamic programming approach to solving control problems. Backward induction was first used by Zermelo (1912) to analyze zero-sum games of perfect information such as chess. An extension of Zermelo's work by Kuhn (1953) establishes the following result:

Theorem 3 Every finite game of perfect information has a subgame perfect Nash equilibrium in pure strategies.

This result follows because, at each step in the backward induction process, the resulting subgame consists of a choice among finite actions for a single player, and thus a pure strategy achieves the maximal payoff possible.
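The backward-induction procedure described above can be sketched in a few lines of code. The game tree below is hypothetical: only the payoff pair (4, 1) on the (R, R) path is taken from the Fig. 2 discussion, and the remaining payoffs are invented so that the tree reproduces the same subgame perfect outcome (players are indexed 0 and 1 for "player 1" and "player 2").

```python
# Backward induction on a perfect-information game tree.
# Hypothetical tree consistent with the Fig. 2 discussion: player 1
# chooses L (ending with payoffs (2, 2)) or R (reaching node c), where
# player 2 chooses L -> (1, 0) or R -> (4, 1). Only (4, 1) comes from
# the text; the other payoffs are made up for illustration.

def backward_induction(node):
    """Return (payoff vector, strategy dict) for a subtree."""
    if "payoffs" in node:                       # terminal node
        return node["payoffs"], {}
    player = node["player"]
    best_action, best_value, strategy = None, None, {}
    for action, child in node["children"].items():
        value, sub = backward_induction(child)  # solve the subgame first
        strategy.update(sub)
        if best_value is None or value[player] > best_value[player]:
            best_action, best_value = action, value
    strategy[node["name"]] = best_action        # record the chosen action
    return best_value, strategy

game = {"name": "root", "player": 0, "children": {
    "L": {"payoffs": (2, 2)},
    "R": {"name": "c", "player": 1, "children": {
        "L": {"payoffs": (1, 0)},
        "R": {"payoffs": (4, 1)},
    }},
}}

value, strategy = backward_induction(game)
print(value, strategy)   # (4, 1) {'c': 'R', 'root': 'R'}
```

The recursion mirrors the text: the subgame at c is solved first (player 2 picks R), the root is then reduced to a choice between (2, 2) and (4, 1), and the subgame perfect equilibrium (R, R) emerges.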
Infinite Noncooperative Games

When the number of options available to players is infinite, game trees are no longer appropriate representations for the evolution of the game. Instead, one uses state space models with explicit models of actions and observations. A typical multistage model for the dynamics of such games is

    x(t + 1) = f(x(t), u_1(t), …, u_n(t), w(t), t),  t = 0, …, T − 1   (2)

with initial condition x(0) = w(0), where x(t) is the state of the game at stage t, u_1(t), …, u_n(t) are actions selected by players 1, …, n at stage t, and w(t) is an action selected by Nature at stage t. The space of actions for each player is restricted at each time to infinite sets A_i(t) with an appropriate topological structure, and the admissible state x(t) at each time belongs to an infinite set X with a topological structure. In terms of Nature's actions, for each stage t, there is a probability distribution that specifies the choice of Nature's action w(t), selected independently of other actions.

Equation (2) describes a play of the game, in terms of how different actions by players at the various stages evolve the state of the game. A play of the game is thus a history h = (x(0), u_1(0), …, u_n(0), x(1), u_1(1), …, u_n(1), …, x(T)). Associated with each play of the game is a set of real-valued functions J_i(h) that indicate the payoff to player i in this play. This function is often assumed to be separable across the variables in each stage.

To complete the description of the extensive form, one must now introduce the available information to each player at each stage. Define observation functions

    y_i(t) = g_i(x(t), v(t), t),  i = 1, …, n

where y_i(t) takes values in observation spaces which may be finite or infinite, and v(t) is selected by Nature according to its probability distribution, independent of other selections. Define the information available for player i at stage t to be I_i(t), a subset of {y_1(0), …, y_n(0), …, y_1(t), …, y_n(t); u_1(0), …, u_n(0), …, u_1(t − 1), …, u_n(t − 1)}. With this notation, strategies γ_i(t) for player i are functions that map, at each stage, the available information I_i(t) into admissible decisions u_i(t) ∈ A_i(t). With appropriate measurability conditions, specifying a full set of strategies γ = (γ_1, …, γ_n) for each of the players induces a probability distribution on the plays of the game, which leads to the expected payoff J_i(γ). Nash equilibria are defined in identical fashion to (1).

Obtaining solutions of multistage games is a difficult task that depends on the ability to use the subgame decomposition techniques discussed previously. Such subgame decompositions are possible when games do not include actions by Nature and the payoff functions have a stagewise additive property. In such cases, backward induction allows the construction of Nash equilibrium strategies through the recursive solution of static infinite games, such as those discussed in the encyclopedia entry on static games.

Generalizations of multistage games to continuous time result in differential games, covered in two articles in the encyclopedia, but for the zero-sum case. Additional details on infinite dynamic noncooperative games, exploiting different models and information structures and studying the existence, uniqueness, or nonuniqueness of equilibria, can be found in Basar and Olsder (1982).

Conclusions

In this entry, we reviewed models for dynamic noncooperative games that incorporate temporal order of play and uncertainty as to what individual players know when they have to make decisions. Using these models, we defined solution concepts for the games and discussed algorithms for determining solution strategies for the players. Active directions of research include development of new solution concepts for dynamic games, new approaches to computation of game solutions, the study of games with a large
number of players, evolutionary games where players' greedy behavior evolves toward equilibrium strategies, and special classes of dynamic games such as Markov games and differential games. Several of these topics are discussed in other entries in the encyclopedia.

Cross-References

Stochastic Dynamic Programming
Stochastic Games and Learning
Strategic Form Games and Nash Equilibrium

Bibliography

Aumann RJ (1964) Mixed and behavior strategies in infinite extensive games. In: Dresher M, Shapley LS, Tucker AW (eds) Advances in game theory. Princeton University Press, Princeton
Basar T, Olsder GJ (1982) Dynamic noncooperative game theory. Academic Press, London
Hart S (1991) Games in extensive and strategic forms. In: Aumann R, Hart S (eds) Handbook of game theory with economic applications, vol 1. Elsevier, Amsterdam
Koller D, Megiddo N, von Stengel B (1996) Efficient computation of equilibria for extensive two-person games. Games Econ Behav 14:247–259
Kreps DM, Wilson RB (1982) Sequential equilibria. Econometrica 50:863–894
Kuhn HW (1953) Extensive games and the problem of information. In: Kuhn HW, Tucker AW (eds) Contributions to the theory of games, vol 2. Princeton University Press, Princeton
McKelvey RD, McLennan AM (1996) Computation of equilibria in finite games. In: Amman HM, Kendrick DA, Rust J (eds) Handbook of computational economics, vol 1. Elsevier, Amsterdam
McKelvey RD, McLennan AM, Turocy TL (2010) Gambit: software tools for game theory, version 0.2010.09.01. http://www.gambit-project.org
Nash J (1950) Equilibrium points in n-person games. Proc Natl Acad Sci 36(1):48–49
Nash J (1951) Non-cooperative games. Ann Math 54(2):286–295
Selten R (1975) Reexamination of the perfectness concept for equilibrium points in extensive games. Int J Game Theory 4(1):25–55
von Neumann J (1928) Zur Theorie der Gesellschaftsspiele. Mathematische Annalen 100:295–320
Zermelo E (1912) Über eine Anwendung der Mengenlehre auf die Theorie des Schachspiels. In: Proceedings of the fifth international congress of mathematicians, vol 2. Cambridge University Press, Cambridge

Dynamic Positioning Control Systems for Ships and Underwater Vehicles

Asgeir J. Sørensen
Department of Marine Technology, Centre for Autonomous Marine Operations and Systems (AMOS), Norwegian University of Science and Technology, NTNU, Trondheim, Norway

Abstract

In 2012 the fleet of dynamically positioned (DP) ships and rigs was probably larger than 3,000 units, predominantly operating in the offshore oil and gas industry. The complexity and functionality vary subject to the targeted marine operation, vessel concept, and risk level. DP systems with advanced control functions and redundant sensor, power, and thruster/propulsion configurations are designed in order to provide high-precision fault-tolerant control in safety-critical marine operations. The DP system is customized for the particular application with integration to other control systems, e.g., power management, propulsion, drilling, oil and gas production, off-loading, crane operation, and pipe and cable laying. For underwater vehicles such as remotely operated vehicles (ROVs) and autonomous underwater vehicles (AUVs), DP functionality, also denoted as hovering, is implemented on several vehicles.

Keywords

Autonomous underwater vehicles (AUVs); Fault-tolerant control; Remotely operated vehicles (ROVs)

Introduction

The offshore oil and gas industry is the dominating market for DP vessels. The various offshore applications include offshore service
vessels, drilling rigs (semisubmersibles) and ships, shuttle tankers, cable and pipe layers, floating production, storage, and off-loading units (FPSOs), crane and heavy lift vessels, geological survey vessels, rescue vessels, and multipurpose construction vessels. DP systems are also installed on cruise ships, yachts, fishing boats, navy ships, tankers, and others.

A DP vessel is defined by the International Maritime Organization (IMO) and the maritime class societies (DNV GL, ABS, LR, etc.) as a vessel that maintains its position and heading (fixed location, denoted stationkeeping, or a predetermined track) exclusively by means of active thrusters. The DP system as defined by the class societies is not limited only to the DP control system, including computers and cabling. Position reference systems of various types measuring North-East coordinates (satellites, hydroacoustics, optics, taut wire, etc.), sensors (heading, roll, pitch, wind speed and direction, etc.), the power system, the thruster and propulsion system, and an independent joystick system are also essential parts of the DP system. In addition, the DP operator is an important element in securing safe and efficient DP operations. Further development of human-machine interfaces, alarm systems, and operator decision support systems is regarded as a top priority, bridging advanced and complex technology to safe and efficient marine operations. Sufficient DP operator training is a part of this.

The thruster and propulsion system controlled by the DP control system is regarded as one of the main power consumers on the DP vessel. An important control system for successful integration with the power plant and the other power consumers, such as the drilling system, process system, and heating and ventilation system, is the power and energy management system (PMS/EMS), balancing safety requirements and energy efficiency. The PMS/EMS controls the power generation and distribution and the load control of heavy power consumers. In this context both transient and steady-state behaviors are of relevance. A thorough understanding of the hydrodynamics, the dynamics between coupled systems, the load characteristics of the various power consumers, the control system architecture, the control layers, the power system, the propulsion system, and the sensors is important for successful design and operation of DP systems and DP vessels.

Thruster-assisted position mooring is another important stationkeeping application, often used for FPSOs, drilling rigs, and shuttle tanker operations, where the DP system has to be redesigned to account for the effect of the mooring system dynamics. In thruster-assisted position mooring, the DP system is renamed a position mooring (PM) system. PM systems have been commercially available since the 1980s. While for DP-operated ships the thrusters are the sole source of stationkeeping, the assistance of thrusters is only complementary to the mooring system. Here, most of the stationkeeping is provided by a deployed anchor system. In severe environmental conditions, the thrust assistance is used to minimize the vessel excursions and line tension by mainly increasing the damping in terms of velocity feedback control and adding a bias force minimizing the mean tensions of the most loaded mooring lines. Modeling and control of turret-anchored ships are treated in Strand et al. (1998) and Nguyen and Sørensen (2009).

Overviews of DP systems including references can be found in Fay (1989), Fossen (2011), and Sørensen (2011). The scientific and industrial contributions since the 1960s are vast, and many research groups worldwide have provided important results. In Sørensen et al. (2012), the development of DP systems for ROVs is presented.

Mathematical Modeling of DP Vessels

Depending on the operational conditions, the vessel models may briefly be classified into stationkeeping, low-velocity, and high-velocity models. As shown in Sørensen (2011) and Fossen (2011) and the references therein, different model reduction techniques are used for the various speed regimes. Vessel motions in waves are defined as seakeeping and will here apply both for stationkeeping (zero speed) and forward speed. DP vessels or PM vessels can in general be regarded as stationkeeping and low-velocity or low Froude
number applications. This assumption will particularly be used in the formulation of mathematical models used in conjunction with the controller design. It is common to use a two-time-scale formulation by separating the total model into a low-frequency (LF) model and a wave-frequency (WF) model (seakeeping) by superposition. Hence, the total motion is a sum of the corresponding LF and WF components. The WF motions are assumed to be caused by first-order wave loads. Assuming small amplitudes, these motions will be well represented by a linear model. The LF motions are assumed to be caused by second-order mean and slowly varying wave loads, current loads, wind loads, mooring loads (if any), and thrust and rudder forces and moments. For underwater vehicles operating below the wave zone, estimated to be deeper than half the wavelength, the wave loads can be disregarded, and of course the effect of the wind loads as well.

Modeling Issues
The mathematical models may be formulated at two complexity levels:
• Control plant model: a simplified mathematical description containing only the main physical properties of the process or plant. This model may constitute a part of the controller. The control plant model is also used in analytical stability analysis based on, e.g., Lyapunov stability.
• Process plant model or simulation model: a comprehensive description of the actual process, as detailed as needed. The main purpose of this model is to simulate the real plant dynamics. The process plant model is used in numerical performance and robustness analysis and testing of the control systems. As shown above, the process plant models may be implemented for off-line or real-time simulation purposes (e.g., HIL testing; see Johansen et al. 2007), defining different requirements for model fidelity.

Kinematics
The relationship between the Earth-fixed position and orientation of a floating structure and its body-fixed velocities is

    η̇ = [ η̇_1 ]  =  [ J_1(η_2)   0_{3×3} ] [ ν_1 ]
        [ η̇_2 ]     [ 0_{3×3}   J_2(η_2) ] [ ν_2 ]   (1)

The vectors defining the Earth-fixed vessel position (η_1) and orientation (η_2) using Euler angles, and the body-fixed translation (ν_1) and rotation (ν_2) velocities, are given by

    η_1 = [x, y, z]^T,  η_2 = [φ, θ, ψ]^T,
    ν_1 = [u, v, w]^T,  ν_2 = [p, q, r]^T.   (2)

The rotation matrix J_1(η_2) ∈ SO(3) and the velocity transformation matrix J_2(η_2) ∈ R^{3×3} are defined in Fossen (2011). For ships, if only surge, sway, and yaw (3 DOF) are considered, the kinematics and the state vectors reduce to

    η̇ = R(ψ)ν,  or  [ ẋ ]   [ cos ψ  −sin ψ  0 ] [ u ]
                     [ ẏ ] = [ sin ψ   cos ψ  0 ] [ v ]
                     [ ψ̇ ]   [   0       0    1 ] [ r ]   (3)

Process Plant Model: Low-Frequency Motion
The 6-DOF LF model formulation is based on Fossen (2011) and Sørensen (2011). The equations of motion for the nonlinear LF model of a floating vessel are given by

    M ν̇ + C_RB(ν)ν + C_A(ν_r)ν_r + D(ν_r) + G(η) = τ_wave2 + τ_wind + τ_thr + τ_moor,   (4)

where M ∈ R^{6×6} is the system inertia matrix including added mass; C_RB(ν) ∈ R^{6×6} and C_A(ν_r) ∈ R^{6×6} are the skew-symmetric Coriolis and centripetal matrices of the rigid body and the added mass; G(η) ∈ R^6 is the generalized restoring vector caused by the mooring lines (if any), buoyancy, and gravitation; τ_thr ∈ R^6 is the control vector consisting of forces and moments produced by the thruster system; and τ_wind and τ_wave2 ∈ R^6 are the wind and second-order wave load vectors, respectively.
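As an illustration of how a process plant model of this kind can be exercised in simulation, the sketch below integrates a drastically simplified 3-DOF version of the LF dynamics (linear damping only; Coriolis, wave, wind, and mooring terms dropped) together with the kinematics of (3) under a constant surge thrust. The mass and damping values are illustrative only and do not correspond to any real vessel.

```python
import numpy as np

# Drastically simplified 3-DOF (surge, sway, yaw) LF model:
#   M nu_dot + D_L nu = tau,   eta_dot = R(psi) nu
# Inertia and damping values are made up for illustration.
M = np.diag([5.0e6, 7.0e6, 4.0e9])      # inertia incl. added mass
D_L = np.diag([5.0e4, 1.0e5, 5.0e7])    # linear damping

def R(psi):
    """3-DOF rotation from body-fixed to Earth-fixed frame, as in (3)."""
    c, s = np.cos(psi), np.sin(psi)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def simulate(tau, t_end=2000.0, dt=1.0):
    eta = np.zeros(3)                   # North, East, heading
    nu = np.zeros(3)                    # u, v, r
    for _ in range(int(t_end / dt)):    # forward-Euler integration
        nu = nu + dt * np.linalg.solve(M, tau - D_L @ nu)
        eta = eta + dt * (R(eta[2]) @ nu)
    return eta, nu

# Constant surge thrust: speed settles at tau_x / D_L[0, 0] = 2 m/s.
eta, nu = simulate(np.array([1.0e5, 0.0, 0.0]))
print(round(nu[0], 3))   # 2.0
```

With the diagonal matrices chosen here, the surge speed converges to the steady-state value τ_x/d_x with time constant m_x/d_x = 100 s, which is easy to check against the closed-form solution of the decoupled first-order equation.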
The damping vector may be divided into linear and nonlinear terms according to

    D(ν_r) = d_L(ν_r, κ)ν_r + d_NL(ν_r, γ_r)ν_r,   (5)

where ν_r ∈ R^6 is the relative velocity vector between the current and the vessel. The nonlinear damping, d_NL, is assumed to be caused by turbulent skin friction and viscous eddy-making, also denoted as vortex shedding (Faltinsen 1990). The strictly positive linear damping matrix d_L ∈ R^{6×6} is caused by linear laminar skin friction and is assumed to vanish for increasing speed according to

    d_L(ν_r, κ) = [ X_u e^{−κ|u_r|}  ⋯  X_r e^{−κ|r|} ]
                  [       ⋮          ⋱        ⋮       ]
                  [ N_u e^{−κ|u_r|}  ⋯  N_r e^{−κ|r|} ]   (6)

where κ is a positive scaling constant such that κ ∈ R_+.

Process Plant Model: Wave-Frequency Motion
The coupled equations of the WF motions in surge, sway, heave, roll, pitch, and yaw are assumed to be linear and can be formulated as

    M(ω) η̈_Rw + D_p(ω) η̇_Rw + G η_Rw = τ_wave1,
    η̇_w = J(η̄_2) η̇_Rw,   (7)

where η_Rw ∈ R^6 is the WF motion vector in the hydrodynamic frame, η_w ∈ R^6 is the WF motion vector in the Earth-fixed frame, and τ_wave1 ∈ R^6 is the first-order wave excitation vector, which will be modified for varying vessel headings relative to the incident wave direction. M(ω) ∈ R^{6×6} is the system inertia matrix containing frequency-dependent added mass coefficients in addition to the vessel's mass and moment of inertia. D_p(ω) ∈ R^{6×6} is the wave radiation (potential) damping matrix. The linearized restoring coefficient matrix G ∈ R^{6×6} is due to gravity and buoyancy affecting heave, roll, and pitch only. For anchored vessels, it is assumed that the mooring system will not influence the WF motions.

Remark 1. Generally, a time-domain equation cannot be expressed with a frequency-domain coefficient ω. However, this is a commonly used formulation, denoted a pseudo-differential equation. An important feature of the added mass terms and the wave radiation damping terms is the memory effects, which in particular are important to consider for nonstationary cases, e.g., rapid changes of heading angle. Memory effects can be taken into account by introducing a convolution integral or a so-called retardation function (Newman 1977), or state space models as suggested by Fossen (2011).

Control Plant Model
For the purpose of controller design and analysis, it is convenient to apply model reduction and derive LF and WF control plant models in surge, sway, and yaw about zero vessel velocity according to

    ṗ_w = A_pw p_w + E_pw w_pw,   (8)
    η̇ = R(ψ)ν,   (9)
    ḃ = −T_b^{−1} b + E_b w_b,   (10)
    M ν̇ = −D_L ν + R^T(ψ) b + τ,   (11)
    ω̇_p = 0,   (12)
    y = [(η + C_pw p_w)^T, ω_p]^T,   (13)

where ω_p ∈ R is the peak frequency of the waves (PFW). The estimated PFW can be calculated by spectral analysis of the pitch and roll measurements, which are assumed to oscillate dominantly at the peak wave frequency. In the spectral analysis, the discrete Fourier transforms of the measured roll and pitch, collected over a period of time, are computed by taking the n-point fast Fourier transform (FFT). The PFW may be found as the frequency at which the power spectrum is maximal. The assumption ω̇_p = 0 is valid for a slowly varying sea state. The wave frequency may also be found using nonlinear observers/EKF and signal processing techniques. It is assumed that the second-order linear model is sufficient to describe the first-order wave-induced motions, and then p_w ∈ R^6 is the state of the WF model. A_pw ∈ R^{6×6} is assumed Hurwitz and describes the first-order wave-induced motion as a
mass-damper-spring system. w_pw ∈ R^3 is a zero-mean Gaussian white noise vector, and y is the measurement vector. The WF measurement matrix C_pw ∈ R^{3×6} and the disturbance matrix E_pw ∈ R^{6×3} are formulated as

    C_pw = [0_{3×3}  I_{3×3}],  E_pw^T = [0_{3×3}  K_w^T].   (14)

Here, a 3-DOF model is assumed, adopting the notation in (3) such that η ∈ R^3 and ν ∈ R^3 are the LF position vector in the Earth-fixed frame and the LF velocity vector in the body-fixed frame, respectively. M ∈ R^{3×3} and D_L ∈ R^{3×3} are the mass matrix including hydrodynamic added mass and the linear damping matrix, respectively. The bias term b ∈ R^3, accounting for unmodeled effects and slowly varying disturbances, is modeled as a Markov process with positive definite diagonal matrix T_b ∈ R^{3×3} of time constants. If T_b is removed, a Wiener process is used. w_b ∈ R^3 is a bounded disturbance vector, and E_b ∈ R^{3×3} is a disturbance scaling matrix. τ ∈ R^3 is the control force. As mentioned later in the paper, the proposed model reduction considering only horizontal motions may create problems when conducting DP operations of structures with low waterplane area such as semisubmersibles. More details can be found in Sørensen (2011) and Fossen (2011) and the references therein.

For underwater vehicles, a 6-DOF model should be used. For underwater vehicles with self-stabilizing roll and pitch, a 4-DOF model with surge, sway, yaw, and heave may be used; see Sørensen et al. (2012).

Control Levels and Integration Aspects

The real-time control hierarchy of a marine control system (Sørensen 2005) may be divided into three levels: the guidance system and local optimization, the high-level plant control (e.g., DP controller including thrust allocation), and the low-level thruster control. The DP control system consists of several modules as indicated in Fig. 1:
• Signal processing for analysis and testing of the individual signals, including voting and weighting when redundant measurements are available. To ensure robust and fault-tolerant control, proper diagnostics and change detection algorithms are regarded as perhaps one of the most important research areas. For an overview of the field, see Basseville and Nikiforov (1993) and Blanke et al. (2003).
• Vessel observer for state estimation and wave filtering. In case of lost sensor signals, the predictor is used to provide dead reckoning, which is required by class societies. The prediction error, which is the deviation between the measurements and the estimated measurements, is also one important barrier in failure detection.
• Feedback control law, often of multivariable PID type, where feedback is produced from the estimated low-frequency (LF) position and heading deviations and estimated LF velocities.
• Feedforward control law, normally the wind force and moment. For the different applications (pipe laying, ice operations, position mooring), tailor-made feedforward control functions are also used.
• Guidance system with reference models, needed to achieve a smooth transition between setpoints. In the most basic case, the operator specifies a new desired position and heading, and a reference model generates smooth reference trajectories/paths for the vessel to follow. A more advanced guidance system involves way-point tracking functionality with optimal path planning.
• Thrust allocation, which computes the force and direction commands to each thruster device based on input from the resulting feedback and feedforward controllers. The low-level thruster controllers will then control the propeller pitch, speed, torque, and power.
• Model adaptation, which provides the necessary corrections of the vessel model and the controller settings subject to changes in the vessel draft, wind area, and variations in the sea state.
• Power management system, which performs diesel engine control, power generation management with frequency and voltage monitoring, active and passive load sharing, and load-dependent start and stop of generator sets.
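A toy 1-DOF station-keeping loop can illustrate how the modules listed above fit together. Here a first-order low-pass filter stands in for the observer's wave filtering, a PD feedback law for the positioning controller (integral action omitted since no constant disturbance is modeled), and a simple saturation for thrust allocation. All numbers are invented for illustration and do not represent any real DP system.

```python
import numpy as np

rng = np.random.default_rng(0)
m, d = 1.0e6, 2.0e4          # illustrative vessel mass and damping
kp, kd = 4.0e4, 8.0e5        # PD feedback gains (hand-tuned)
tau_max = 2.0e5              # thrust saturation limit

x, v = 50.0, 0.0             # true offset from setpoint (m) and speed
x_hat = x                    # filtered position estimate
dt, alpha = 1.0, 0.05        # time step and low-pass coefficient

for _ in range(3000):
    y = x + rng.normal(0.0, 0.5)       # noisy position measurement
    x_hat += alpha * (y - x_hat)       # "observer": low-pass wave filter
    tau = -kp * x_hat - kd * v         # PD feedback (velocity assumed known)
    tau = float(np.clip(tau, -tau_max, tau_max))   # "thrust allocation"
    v += dt * (tau - d * v) / m        # 1-DOF vessel dynamics
    x += dt * v

print(f"final offset: {x:.2f} m")
```

The loop shows the division of labor in the module list: filtering the raw measurement before feeding it to the control law keeps the thrust command from chasing measurement noise, while the saturation reflects the fact that the demanded force must still pass through an allocation stage with physical limits.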
[Dynamic Positioning Control Systems for Ships and Underwater Vehicles, Fig. 1: Controller structure]

DP Controller

In the 1960s the first DP system was introduced for horizontal modes of motion (surge, sway, and yaw) using single-input single-output PID control algorithms in combination with low-pass and/or notch filters. In the 1970s more advanced output control methods based on multivariable optimal control and Kalman filter theory were proposed by Balchen et al. (1976) and later refined in Sælid et al. (1983), Grimble and Johnson (1988), and others, as referred to in Sørensen (2011). In the 1990s nonlinear DP controller designs were proposed by several research groups; for an overview see Strand et al. (1998), Fossen and Strand (1999), Pettersen and Fossen (2000), and Sørensen (2011). Nguyen et al. (2007) proposed the design of a hybrid controller for DP from calm to extreme sea conditions.

Plant Control
By copying the control plant model (8)–(13) and adding an injection term, a passive observer may be designed. A nonlinear output horizontal-plane positioning feedback controller of PID type may be formulated as

    τ_PID = −R_e^T K_p e − R_e^T K_p3 f(e) − K_d ν̃ − R^T K_i z,   (15)

where e ∈ R^3 is the position and heading deviation vector, ν̃ ∈ R^3 is the velocity deviation vector, z ∈ R^3 contains the integrator states, and f(e) is a third-order stiffness term, defined as

    e = [e_1, e_2, e_3]^T = R^T(ψ_d)(η̂ − η_d),
    ν̃ = ν̂ − R^T(ψ_d) η̇_d,
    ż = η̂ − η_d,
    R_e = R(ψ̂ − ψ_d) = R^T(ψ_d) R(ψ̂),
    f(e) = [e_1^3, e_2^3, e_3^3]^T.

Experience from implementation and operation of real DP control systems has shown that using ψ_d in the calculation of the error vector e generally gives better performance with less noisy signals than using ψ̂. However, this is only valid
Dynamic Positioning Control Systems for Ships and Underwater Vehicles, Fig. 2 Thrust allocation

under the assumption that the vessel maintains its desired heading with small deviations. As the DP capability for ships is sensitive to the heading angle, i.e., minimizing the environmental loads, heading control is prioritized in case of limitations in the thrust capacity. This feature is handled in the thrust allocation.

An advantage of this is the possibility to reduce the first-order proportional gain matrix, resulting in reduced dynamic thruster action for smaller position and heading deviations. Moreover, the third-order restoring term will make the thrusters work more aggressively for larger deviations. $K_p$, $K_{p3}$, $K_d$, and $K_i \in \mathbb{R}^{3 \times 3}$ are the nonnegative controller gain matrices for the proportional, third-order restoring, derivative, and integrator controller terms, respectively, found by appropriate controller synthesis methods.

For small-waterplane-area marine vessels such as semisubmersibles, often used as drilling rigs, Sørensen and Strand (2000) proposed a DP control law with the inclusion of roll and pitch damping according to

$$\tau_{\mathrm{rpd}} = -\begin{bmatrix} 0 & g_{xq} \\ g_{yp} & 0 \\ g_{\psi p} & 0 \end{bmatrix} \begin{bmatrix} \hat{p} \\ \hat{q} \end{bmatrix}, \qquad (16)$$

where $\hat{p}$ and $\hat{q}$ are the estimated roll and pitch angular velocities. The resulting positioning control law is written as

$$\tau = \tau_{\mathrm{wFF}} + \tau_{\mathrm{PID}} + \tau_{\mathrm{rpd}}, \qquad (17)$$

where $\tau_{\mathrm{wFF}} \in \mathbb{R}^3$ is the wind feedforward control law.

Thrust allocation or control allocation (Fig. 2) is the mapping between plant and actuator control. It is assumed to be a part of the plant control. The DP controller calculates the desired force in surge and sway and moment in yaw. Dependent on the particular thrust configuration with installed and enabled propellers, tunnel thrusters, azimuthing thrusters, and rudders, the allocation is a nontrivial optimization problem calculating the desired thrust and direction for each enabled thruster subject to various constraints such as thruster ratings, forbidden sectors, and thrust efficiency. References on thrust allocation are found in Johansen and Fossen (2013).

In Sørensen and Smogeli (2009), torque and power control of electrically driven marine propellers are shown. Ruth et al. (2009) proposed anti-spin thrust allocation, and Smogeli et al.
(2008) and Smogeli and Sørensen (2009) presented the concept of anti-spin thruster control.

Cross-References

▶ Control of Ship Roll Motion
▶ Fault-Tolerant Control
▶ Mathematical Models of Ships and Underwater Vehicles
▶ Motion Planning for Marine Control Systems
▶ Underactuated Marine Control Systems

Bibliography

Balchen JG, Jenssen NA, Sælid S (1976) Dynamic positioning using Kalman filtering and optimal control theory. In: IFAC/IFIP symposium on automation in offshore oil field operation, Amsterdam, pp 183–186
Basseville M, Nikiforov IV (1993) Detection of abrupt changes: theory and application. Prentice-Hall, Englewood Cliffs. ISBN:0-13-126780-9
Blanke M, Kinnaert M, Lunze J, Staroswiecki M (2003) Diagnosis and fault-tolerant control. Springer, Berlin
Faltinsen OM (1990) Sea loads on ships and offshore structures. Cambridge University Press, Cambridge
Fay H (1989) Dynamic positioning systems: principles, design and applications. Editions Technip, Paris. ISBN:2-7108-0580-4
Fossen TI (2011) Handbook of marine craft hydrodynamics and motion control. Wiley, Chichester
Fossen TI, Strand JP (1999) Passive nonlinear observer design for ships using Lyapunov methods: experimental results with a supply vessel. Automatica 35(1):3–16
Grimble MJ, Johnson MA (1988) Optimal control and stochastic estimation: theory and applications, vols 1 and 2. Wiley, Chichester
Johansen TA, Fossen TI (2013) Control allocation—a survey. Automatica 49(5):1087–1103
Johansen TA, Sørensen AJ, Nordahl OJ, Mo O, Fossen TI (2007) Experiences from hardware-in-the-loop (HIL) testing of dynamic positioning and power management systems. In: OSV Singapore, Singapore
Newman JN (1977) Marine hydrodynamics. MIT Press, Cambridge
Nguyen TD, Sørensen AJ (2009) Setpoint chasing for thruster-assisted position mooring. IEEE J Ocean Eng 34(4):548–558
Nguyen TD, Sørensen AJ, Quek ST (2007) Design of hybrid controller for dynamic positioning from calm to extreme sea conditions. Automatica 43(5):768–785
Pettersen KY, Fossen TI (2000) Underactuated dynamic positioning of a ship – experimental results. IEEE Trans Control Syst Technol 8(4):856–863
Ruth E, Smogeli ØN, Perez T, Sørensen AJ (2009) Anti-spin thrust allocation for marine vessels. IEEE Trans Control Syst Technol 17(6):1257–1269
Sælid S, Jenssen NA, Balchen JG (1983) Design and analysis of a dynamic positioning system based on Kalman filtering and optimal control. IEEE Trans Autom Control 28(3):331–339
Smogeli ØN, Sørensen AJ (2009) Antispin thruster control for ships. IEEE Trans Control Syst Technol 17(6):1362–1375
Smogeli ØN, Sørensen AJ, Minsaas KJ (2008) The concept of anti-spin thruster control. Control Eng Pract 16(4):465–481
Strand JP, Fossen TI (1999) Nonlinear passive observer for ships with adaptive wave filtering. In: Nijmeijer H, Fossen TI (eds) New directions in nonlinear observer design. Springer, London, pp 113–134
Strand JP, Sørensen AJ, Fossen TI (1998) Design of automatic thruster assisted position mooring systems for ships. Model Identif Control 19(2):61–75
Sørensen AJ (2005) Structural issues in the design and operation of marine control systems. Annu Rev Control 29(1):125–149
Sørensen AJ (2011) A survey of dynamic positioning control systems. Annu Rev Control 35:123–136
Sørensen AJ, Smogeli ØN (2009) Torque and power control of electrically driven marine propellers. Control Eng Pract 17(9):1053–1064
Sørensen AJ, Strand JP (2000) Positioning of small-waterplane-area marine constructions with roll and pitch damping. Control Eng Pract 8(2):205–213
Sørensen AJ, Dukan F, Ludvigsen M, Fernandez DA, Candeloro M (2012) Development of dynamic positioning and tracking system for the ROV Minerva. In: Roberts G, Sutton B (eds) Further advances in unmanned marine vehicles. IET, London, pp 113–128, Chapter 6
Economic Model Predictive Control

David Angeli
Department of Electrical and Electronic Engineering, Imperial College London, London, UK
Dipartimento di Ingegneria dell'Informazione, University of Florence, Italy

Abstract

Economic model predictive control (EMPC) is a variant of model predictive control aimed at maximization of the system's profitability. It allows one to explicitly deal with hard and average constraints on the system's input and output variables as well as with nonlinearity of dynamics. We provide basic definitions and concepts of the approach and highlight some promising research directions.

Keywords

Constrained systems; Dynamic programming; Profit maximization

Introduction

Most control tasks involve some kind of economic optimization. In classical linear quadratic (LQ) control, for example, this is cast as a trade-off between control effort and tracking performance. The designer is allowed to settle such a trade-off by suitably tuning weighting parameters of an otherwise automatic design procedure.

When the primary goal of a control system is profitability rather than tracking performance, a suboptimal approach has often been devised, namely, a hierarchical separation is enforced between the economic optimization layer and the dynamic real-time control layer. In practice, while set points are computed by optimizing economic revenue among all equilibria fulfilling the prescribed constraints, the task of the real-time control layer is simply to drive (basically as fast as possible) the system's state to the desired set-point value. Optimal control or LQ control may be used to achieve the latter task, possibly in conjunction with model predictive control (MPC), but the actual economics of the plant are normally neglected at this stage.

The main benefits of this approach are twofold:
1. Reduced computational complexity with respect to infinite-horizon dynamic programming
2. Stability robustness in the face of uncertainty, normally achieved by using some form of robust control in the real-time control layer

The hierarchical approach, however, is suboptimal in two respects:
1. First of all, given nonlinearity of the plant's dynamics and/or nonconvexity of the
J. Baillieul, T. Samad (eds.), Encyclopedia of Systems and Control, DOI 10.1007/978-1-4471-5058-9,


© Springer-Verlag London 2015
functions characterizing the economic revenue, there is no reason why the most profitable regime should be an equilibrium.
2. Even when systems are most profitably operated at equilibrium, transient costs are totally disregarded by the hierarchical approach, and this may be undesirable if the time constants of the plant are close enough to the time scales at which set point's variations occur.

Economic model predictive control seeks to remove these limitations by directly using the economic revenue in the stage cost and by the formulation of an associated dynamic optimization problem to be solved online in a receding horizon manner. It was originally developed by Rawlings and co-workers, in the context of linear control systems subject to convex constraints, as an effective technique to deal with infeasible set points (Rawlings et al. 2008) (in contrast to the classical approach of redesigning a suitable quadratic cost that achieves its minimum at the closest feasible equilibrium). Preserving the original cost has the advantage of slowing down convergence to such an equilibrium when the transient evolution occurs in a region where the stage cost is better than at steady state. Stability and convergence issues are at first analyzed thanks to convexity and for the special case of linear systems only. Subsequently Diehl introduced the notion of rotated cost (see Diehl et al. 2011) that allowed a Lyapunov interpretation of stability criteria and paved the way for the extension to general dissipative nonlinear systems (Angeli et al. 2012).

Economic MPC Formulation

In order to describe the most common versions of economic MPC, assume that a discrete-time finite-dimensional model of state evolution is available for the system to be controlled:

$$x^+ = f(x, u) \qquad (1)$$

where $x \in X \subseteq \mathbb{R}^n$ is the state variable, $u \in U \subseteq \mathbb{R}^m$ is the control input, and $f: X \times U \to X$ is a continuous map which computes the next state value, given the current one and the value of the input. We also assume that $Z \subseteq X \times U$ is a compact set which defines the (possibly coupled) state/input constraints that need to hold pointwise in time:

$$(x(t), u(t)) \in Z \quad \forall t \in \mathbb{N}. \qquad (2)$$

In order to introduce a measure of economic performance, to each feasible state/input pair $(x, u) \in Z$, we associate the instantaneous net cost of operating the plant at that state when feeding the specified control input:

$$\ell(x, u): Z \to \mathbb{R}. \qquad (3)$$

The function $\ell$ (which we assume to be continuous) is normally referred to as stage cost and, together with actuation and/or inflow costs, should also take into account the profits associated to possible output/outflows of the system. Let $(x^*, u^*)$ denote the best equilibrium/control input pair associated to (3) and (2), namely,

$$\begin{aligned}
\ell(x^*, u^*) = \min_{x,u}\; & \ell(x, u) \\
\text{subject to}\; & (x, u) \in Z \\
& x = f(x, u).
\end{aligned} \qquad (4)$$

Notice that, unlike in tracking MPC, it is not assumed here that

$$\ell(x^*, u^*) \le \ell(x, u) \quad \forall (x, u) \in Z. \qquad (5)$$

This is, technically speaking, the main point of departure between economic MPC and tracking MPC.

As there is no natural termination time to operation of a system, our goal would be to optimize the infinite-horizon cost functional:

$$\sum_{t \in \mathbb{N}} \ell(x(t), u(t)) \qquad (6)$$

possibly in an average sense (or by introducing some discounting factor to avoid infinite costs) and subject to the dynamic/operational constraints (1) and (2).
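As a numerical illustration of problem (4), the best steady-state pair $(x^*, u^*)$ for a toy scalar plant can be found by gridding the equilibrium manifold; the plant, cost, and constraint set below are all invented for this sketch and are not from the entry.

```python
import numpy as np

# Hypothetical toy plant x+ = f(x,u) and stage cost; numbers are illustrative only.
f = lambda x, u: 0.5 * x + u
stage = lambda x, u: (x - 2.0) ** 2 + u ** 2        # stage cost l(x, u)

# Equilibria of x+ = 0.5x + u satisfy u = 0.5x; grid them over Z = [-1,1] x [-1,1].
xs = np.linspace(-1.0, 1.0, 2001)
us = 0.5 * xs
feasible = np.abs(us) <= 1.0                        # keep (x, u) inside Z
costs = stage(xs[feasible], us[feasible])
i = int(np.argmin(costs))
x_star, u_star = float(xs[feasible][i]), float(us[feasible][i])
print(x_star, u_star, stage(x_star, u_star))        # -> 1.0 0.5 1.25
```

The unconstrained minimizer of $\ell$ sits at $x = 2$, outside $Z$, so the cheapest admissible equilibrium lands on the boundary; with economic costs, inequality (5) may then fail, which is exactly the point of departure from tracking MPC noted above.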
To make the problem computationally more tractable and yet retain some of the desirable economic benefits of dynamic programming, (6) is truncated to the following cost functional:

$$J(z, v) = \sum_{k=0}^{N-1} \ell(z(k), v(k)) + V_f(z(N)) \qquad (7)$$

where $z = [z(0), z(1), \ldots, z(N)] \in X^{N+1}$, $v = [v(0), v(1), \ldots, v(N-1)] \in U^N$, and $V_f: X \to \mathbb{R}$ is a terminal weighting function whose properties will be specified later.

The virtual state/control pair $(z^*, v^*)$ at time $t$ is the solution (which for the sake of simplicity we assume to be unique) of the following optimization problem:

$$\begin{aligned}
V(x(t)) = \min_{z,v}\; & J(z, v) \\
\text{subject to}\; & z(k+1) = f(z(k), v(k)), \\
& (z(k), v(k)) \in Z \quad \text{for } k \in \{0, 1, \ldots, N-1\}, \\
& z(0) = x(t), \quad z(N) \in X_f.
\end{aligned} \qquad (8)$$

Notice that $z(0)$ is initialized at the value of the current state $x(t)$. Thanks to this fact, $z^*$ and $v^*$ may be seen as functions of the current state $x(t)$. At the same time, $z(N)$ is constrained to belong to the compact set $X_f \subseteq X$ whose properties will be detailed in the next paragraph.

As customary in model predictive control, a state-feedback law is defined by applying the first virtual control to the plant, that is, by letting $u(t) = v^*(0)$ and restating, at the subsequent time instant, the same optimization problem from initial state $x(t+1)$ which, in the case of exact match between plant and model, can be computed as $f(x(t), u(t))$.

In the next paragraph, we provide details on how to design the "terminal ingredients" (namely, $V_f$ and $X_f$) in order to endow the basic algorithm (8) with important features such as recursive feasibility and a certain degree of average performance and/or stability.

Hereby it is worth pointing out how, in the context of economic MPC, it makes sense to treat, together with pointwise-in-time constraints, asymptotic average constraints on specified input/output variables. In tracking applications, where the control algorithm guarantees asymptotic convergence of the state to a feasible set point, the average asymptotic value of all input/output variables necessarily matches that of the corresponding equilibrium/control input pair. In economic MPC, the asymptotic regime resulting in closed loop may, in general, fail to be an equilibrium; therefore, it might be of interest to impose average constraints on the system's inflows and outflows which are more stringent than those indirectly implied by the fulfillment of (2). To this end, let the system's output be defined as

$$y(t) = h(x(t), u(t)) \qquad (9)$$

with $h(x, u): Z \to \mathbb{R}^p$ a continuous map, and consider the convex compact set $Y$. We may define the set of asymptotic averages of a bounded signal $y$ as follows:

$$\mathrm{Av}[y] = \left\{ \bar{y} \in \mathbb{R}^p : \exists \{t_n\}_{n=1}^{\infty} : t_n \to \infty \text{ as } n \to \infty \text{ and } \bar{y} = \lim_{n \to \infty} \frac{\sum_{k=0}^{t_n - 1} y(k)}{t_n} \right\}$$

Notice that for converging signals, or even for periodic ones, $\mathrm{Av}[y]$ always is a singleton but may fail to be such for certain oscillatory regimes. An asymptotic average constraint can be expressed as follows:

$$\mathrm{Av}[y] \subseteq Y \qquad (10)$$

where $y$ is the output signal as defined in (9).

Basic Theory

The main theoretical results in support of the approach discussed in the previous paragraph are discussed below. Three fundamental aspects are treated:
• Recursive feasibility and constraint satisfaction
• Asymptotic performance
• Stability and convergence
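The receding-horizon loop built around problem (8) can be sketched by brute-force enumeration for a hypothetical scalar plant with a terminal penalty function in the style of Assumption 2 below; the plant, stage cost, input grid, and bounds are all invented for illustration and are not part of the entry.

```python
from itertools import product

# Hypothetical scalar plant and economic stage cost (illustrative assumptions).
f = lambda x, u: 0.8 * x + u
stage = lambda x, u: -x + 0.5 * u * u       # "profit" grows with x, actuation costs
Vf = lambda x: 10.0 * (x - 2.0) ** 2        # terminal weight pulling z(N) toward 2

U_GRID = [-1.0, -0.5, 0.0, 0.5, 1.0]        # discretized admissible inputs
N = 4                                        # prediction horizon

def empc_step(x):
    """Solve the truncated problem (7)-(8) by enumeration; return v*(0).
    Constraints: z in [-3, 3] at every predicted step, u on U_GRID."""
    best_cost, best_u0 = float("inf"), None
    for v in product(U_GRID, repeat=N):
        z, cost, ok = x, 0.0, True
        for u in v:
            if abs(z) > 3.0:                 # predicted state constraint violated
                ok = False
                break
            cost += stage(z, u)
            z = f(z, u)
        if ok and abs(z) <= 3.0:
            cost += Vf(z)
            if cost < best_cost:
                best_cost, best_u0 = cost, v[0]
    return best_u0

# Receding horizon: apply the first virtual control, then re-solve from x(t+1).
x = 0.0
for _ in range(8):
    u = empc_step(x)
    x = f(x, u)
print(round(x, 3))
```

Each closed-loop step re-solves the finite-horizon problem from the measured state and discards all but the first planned input, which is exactly the state-feedback construction $u(t) = v^*(0)$ described above.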
Feasibility and Constraints

The departing point of most model predictive control techniques is to ensure recursive feasibility, namely, the fact that feasibility of the problem (8) at time 0 implies feasibility at all subsequent times, provided there is no mismatch between the true plant and its model (1). This is normally achieved by making use of a suitable notion of control invariant set which is used as a terminal constraint in (8). Economic model predictive control is not different in this respect, and either one of the following sets of assumptions is sufficient to ensure recursive feasibility:
1. Assumption 1: Terminal constraint
$$X_f = \{x^*\}, \quad V_f = 0$$
2. Assumption 2: Terminal penalty function
There exists a continuous map $K: X_f \to U$ such that
$$(x, K(x)) \in Z \quad \forall x \in X_f$$
$$f(x, K(x)) \in X_f \quad \forall x \in X_f$$

The following holds:

Theorem 1 Let $x(0)$ be a feasible state for (8) and assume that either Assumption 1 or 2 holds. Then, the closed-loop trajectory $x(t)$ resulting from receding horizon implementation of the feedback $u(t) = v^*(0)$ is well defined for all $t \in \mathbb{N}$ (i.e., $x(t)$ is a feasible initial state of (8) for all $t \in \mathbb{N}$) and the resulting closed-loop variables $(x(t), u(t))$ fulfill the constraints in (2).

The proof of this Theorem can be found in Angeli et al. (2012) and Amrit et al. (2011), for instance. When constraints on asymptotic averages are of interest, the optimization problem (8) can be augmented by the following constraints:

$$\sum_{k=0}^{N-1} h(z(k), v(k)) \in Y_t \qquad (11)$$

provided $Y_t$ is recursively defined as

$$Y_{t+1} = Y_t \oplus Y \oplus \{-h(x(t), u(t))\} \qquad (12)$$

where $\oplus$ denotes Pontryagin's set sum ($A \oplus B := \{c : \exists a \in A, \exists b \in B : c = a + b\}$). The sequence is initialized as $Y_0 = N Y \oplus Y_{00}$, where $Y_{00}$ is an arbitrary compact set in $\mathbb{R}^p$ containing 0 in its interior. The following result can be proved.

Theorem 2 Consider the optimization problem (8) with additional constraints (11), and assume that $x(0)$ is a feasible initial state. Then, provided a terminal equality constraint is adopted, the closed-loop solution $x(t)$ is well defined and feasible for all $t \in \mathbb{N}$ and the resulting closed-loop variable $y(t) = h(x(t), u(t))$ fulfills the constraint (10).

Extending average constraints to the case of economic MPC with terminal penalty function is possible but outside the scope of this brief tutorial. It is worth mentioning that the set $Y_{00}$ plays the role of an initial allowance that is shrunk or expanded as a result of how close the closed-loop output signals are to the prescribed region. In particular, $Y_{00}$ can be selected a posteriori (after computation of the optimal trajectory) just for $t = 0$, so that the feasibility region of the algorithm is not affected by the introduction of average asymptotic constraints.

Asymptotic Average Performance

Since economic MPC does not necessarily lead to converging solutions, it is important to have bounds which estimate the asymptotic average performance of the closed-loop plant. To this end, the following dissipation inequality is needed for the approach with terminal penalty function:

$$V_f(f(x, K(x))) \le V_f(x) - \ell(x, K(x)) + \ell(x^*, u^*) \qquad (13)$$

which shall hold for all $x \in X_f$. We are now ready to state the main bound on the asymptotic performance:

Theorem 3 Let $x(0)$ be a feasible state for (8) and assume that either Assumption 1 or Assumption 2 together with (13) hold. Then, the closed-loop trajectory $x(t)$ resulting from receding horizon implementation of the feedback $u(t) = v^*(0)$ is well defined for all $t \in \mathbb{N}$ and fulfills
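The bookkeeping behind the average constraints (11)–(12) can be sketched in one dimension, with intervals standing in for the convex compact sets. This sketch assumes the recursion subtracts the realized output $h(x(t), u(t))$ from the running allowance (the minus sign is hard to read in some renderings of (12)), and all numbers below are invented.

```python
# 1-D interval sketch of the average-constraint bookkeeping (illustrative only).

def interval_sum(a, b):
    """Pontryagin/Minkowski sum of two intervals (lo, hi)."""
    return (a[0] + b[0], a[1] + b[1])

Y   = (0.0, 0.5)     # prescribed average set  Y
Y00 = (-1.0, 1.0)    # initial allowance       Y_00
N   = 5              # prediction horizon

# Y_0 = N*Y (+) Y_00
Yt = interval_sum((N * Y[0], N * Y[1]), Y00)

ys = [0.9, 0.2, 0.1, 0.4, 0.0, 0.3, 0.2, 0.1]   # realized outputs y(t)
for y in ys:
    # Y_{t+1} = Y_t (+) Y (+) {-y(t)}
    Yt = interval_sum(interval_sum(Yt, Y), (-y, -y))

# After t steps, Y_t = Y_00 (+) (N+t)*Y (+) {-sum(y)}: the allowance grows by
# Y each step while realized outputs are subtracted, so keeping the predicted
# sum (11) inside Y_t forces the long-run averages toward Y.
t, S = len(ys), sum(ys)
expected = (Y00[0] + (N + t) * Y[0] - S, Y00[1] + (N + t) * Y[1] - S)
print(Yt, expected)
```

The closed form checked at the end shows why a temporary excursion of $y(t)$ outside $Y$ only shrinks the allowance, rather than causing immediate infeasibility, mirroring the "initial allowance" role of $Y_{00}$ discussed above.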
$$\limsup_{T \to +\infty} \frac{\sum_{t=0}^{T-1} \ell(x(t), u(t))}{T} \le \ell(x^*, u^*). \qquad (14)$$

The proof of this fact can be found in Angeli et al. (2012) and Amrit et al. (2011). When periodic solutions are known to outperform, in an average sense, the best equilibrium/control pair, one may replace terminal equality constraints by periodic terminal constraints (see Angeli et al. 2012). This leads to an asymptotic performance at least as good as that of the solution adopted as a terminal constraint.

Stability and Convergence

It is well known that the cost-to-go $V(x)$ as defined in (8) is a natural candidate Lyapunov function for the case of tracking MPC. In fact, the following estimate holds along solutions of the closed-loop system:

$$V(x(t+1)) \le V(x(t)) - \ell(x(t), u(t)) + \ell(x^*, u^*). \qquad (15)$$

This shows, thanks to inequality (5), that $V(x(t))$ is nonincreasing. Owing to this, stability and convergence can be easily achieved under mild additional technical assumptions. While property (15) holds for economic MPC, both in the case of terminal equality constraint and terminal penalty function, it is no longer true that (5) holds. As a matter of fact, $x^*$ might even fail to be an equilibrium of the closed-loop system, and hence, convergence and stability cannot be expected in general.

Intuitively, however, when the most profitable operating regime is an equilibrium, the average performance bound provided by Theorem 3 seems to indicate that some form of stability or convergence to $x^*$ could be expected. This is true under an additional dissipativity assumption which is closely related to the property of optimal operation at steady state.

Definition 1 A system is strictly dissipative with respect to the supply function $s(x, u)$ if there exists a continuous function $\lambda: X \to \mathbb{R}$ and $\rho: X \to \mathbb{R}$ positive definite with respect to $x^*$ such that for all $x$ and $u$ in $X \times U$, it holds:

$$\lambda(f(x, u)) \le \lambda(x) + s(x, u) - \rho(x). \qquad (16)$$

The next result highlights the connection between dissipativity of the open-loop system and stability of closed-loop economic MPC.

Theorem 4 Assume that either Assumption 1 or Assumption 2 together with (13) hold. Let the system (1) be strictly dissipative with respect to the supply function $s(x, u) = \ell(x, u) - \ell(x^*, u^*)$ as from Definition 1 and assume there exists a neighborhood of feasible initial states containing $x^*$ in its interior. Then provided $V$ is continuous at $x^*$, $x^*$ is an asymptotically stable equilibrium with basin of attraction equal to the set of feasible initial states.

See Angeli et al. (2012) and Amrit et al. (2011) for proofs and discussions. Convergence results are also possible for the case of economic MPC subject to average constraints. Details can be found in Müller et al. (2013a).

Hereby it is worth mentioning that finding a function satisfying (16) (should one exist) is in general a hard task (especially for nonlinear systems and/or nonconvex stage costs); it is akin to the problem of finding a Lyapunov function, and therefore general construction methods do not exist. Let us emphasize, however, that while existence of a storage function $\lambda$ is a sufficient condition to ensure convergence of closed-loop economic MPC, formulation and resolution of the optimization problem (8) can be performed irrespectively of any explicit knowledge of such function. Also, we point out that existence of $\lambda$ as in Definition 1 and Theorem 4 is only possible if the optimal infinite-horizon regime of operation for the system is an equilibrium.

Summary and Future Directions

Economic model predictive control is a fairly recent and active area of research with great potential in those engineering applications where economic profitability is crucial rather than tracking performance.

The technical literature is rapidly growing in application areas such as chemical engineering
(see Heidarinejad 2012) or power systems engineering (see Hovgaard et al. 2010; Müller et al. 2013a), where the system's output is in fact physical outflows which can be stored with relative ease.

We only dealt with the basic theoretical developments and would like to provide pointers to interesting recent and forthcoming developments in this field:
• Generalized terminal constraints: the possibility of enlarging the set of feasible initial states by using arbitrary equilibria as terminal constraints, possibly to be updated online in order to improve asymptotic performance (see Fagiano and Teel 2012; Müller et al. 2013b).
• Economic MPC without terminal constraints: removing the need for terminal constraints by taking a sufficiently long control horizon is an interesting possibility offered by standard tracking MPC. This is also possible for economic MPC, at least under suitable technical assumptions, as investigated in Grüne (2012, 2013).
• The basic developments presented in the previous paragraphs only deal with systems unaffected by uncertainty. This is a severe limitation of current approaches, and it is to be expected that, as for the case of tracking MPC, a great deal of research in this area could be developed in the future. In particular, both deterministic and stochastic uncertainties are of interest.

Cross-References

▶ Model-Predictive Control in Practice
▶ Optimization Algorithms for Model Predictive Control

Recommended Reading

Papers Amrit et al. (2011), Angeli et al. (2011, 2012), Diehl et al. (2011), Müller et al. (2013a), and Rawlings et al. (2008) set out the basic technical tools for performance and stability analysis of EMPC. To readers interested in the general theme of optimization of system's economic performance and its relationship with classical turnpike theory in economics, please refer to Rawlings and Amrit (2009). Potential applications of EMPC are described in Hovgaard et al. (2010), Heidarinejad (2012), and Ma et al. (2011), while Rawlings et al. (2012) is an up-to-date survey on the topic. Fagiano and Teel (2012) and Grüne (2012, 2013) deal with the issue of relaxation or elimination of terminal constraints, while Müller et al. (2013b) explore the possibility of adaptive terminal costs and generalized equality constraints.

Bibliography

Amrit R, Rawlings JB, Angeli D (2011) Economic optimization using model predictive control with a terminal cost. Annu Rev Control 35:178–186
Angeli D, Amrit R, Rawlings JB (2011) Enforcing convergence in nonlinear economic MPC. Paper presented at the 50th IEEE conference on decision and control and European control conference (CDC-ECC), Orlando, 12–15 Dec 2011
Angeli D, Amrit R, Rawlings JB (2012) On average performance and stability of economic model predictive control. IEEE Trans Autom Control 57:1615–1626
Diehl M, Amrit R, Rawlings JB (2011) A Lyapunov function for economic optimizing model predictive control. IEEE Trans Autom Control 56:703–707
Fagiano L, Teel A (2012) Model predictive control with generalized terminal state constraint. Paper presented at the IFAC conference on nonlinear model predictive control, Noordwijkerhout, 23–27 Aug 2012
Grüne L (2012) Economic MPC and the role of exponential turnpike properties. Oberwolfach Rep 12:678–681
Grüne L (2013) Economic receding horizon control without terminal constraints. Automatica 49:725–734
Heidarinejad M (2012) Economic model predictive control of nonlinear process systems using Lyapunov techniques. AIChE J 58:855–870
Hovgaard TG, Edlund K, Bagterp Jorgensen J (2010) The potential of economic MPC for power management. Paper presented at the 49th IEEE conference on decision and control, Atlanta, 15–17 Dec 2010
Ma J, Joe Qin S, Li B et al (2011) Economic model predictive control for building energy systems. Paper presented at the IEEE innovative smart grid technology conference, Anaheim, 17–19 Jan 2011
Müller MA, Angeli D, Allgöwer F (2013a) On convergence of averagely constrained economic MPC and necessity of dissipativity for optimal steady-state operation. Paper presented at the 2013 IEEE American control conference, Washington, DC, 17–19 June 2013
Müller MA, Angeli D, Allgöwer F (2013b) Economic model predictive control with self-tuning terminal weight. Eur J Control 19:408–416
Rawlings JB, Amrit R (2009) Optimizing process economic performance using model predictive control. In: Magni L, Raimondo DM, Allgöwer F (eds) Nonlinear model predictive control. Lecture notes in control and information sciences, vol 384. Springer, Berlin, pp 119–138
Rawlings JB, Bonne D, Jorgensen JB et al (2008) Unreachable setpoints in model predictive control. IEEE Trans Autom Control 53:2209–2215
Rawlings JB, Angeli D, Bates CN (2012) Fundamentals of economic model predictive control. Paper presented at the IEEE 51st annual conference on decision and control (CDC), Maui, 10–13 Dec 2012

EKF

▶ Extended Kalman Filters

Electric Energy Transfer and Control via Power Electronics

Fred Wang
University of Tennessee, Knoxville, TN, USA

Abstract

Power electronics and their applications for electric energy transfer and control are introduced. The fundamentals of power electronics are presented, including the commonly used semiconductor devices and power converter circuits. Different types of power electronic controllers for electric power generation, transmission and distribution, and consumption are described. The advantages of power electronics over traditional electromechanical or electromagnetic controllers are explained. The future directions for power electronic applications in electric power systems are discussed.

Keywords

Electric energy control; Electric energy transfer; Power electronics

Introduction

Modern society runs on electricity or electric energy. The electric energy generally must be transferred before consumption since the energy sources, such as thermal power plants, hydro dams, and wind farms, are often some distance away from the loads. In addition, electric energy needs to be controlled as well since the energy transfer and use often require electricity in a form different from the raw form generated at the source. Examples are the voltage magnitude and frequency for long-distance transmission: the voltage needs to be stepped up at the sending end to reduce the energy loss along the lines and then stepped down at the receiving end for users; for many modern consumer devices, DC voltage is needed and obtained through transforming the 50 or 60 Hz utility power. Note that electric energy transfer and control is often used interchangeably with electric power transfer and control. This is because modern electric power systems have very limited energy storage, and the energy generated must be consumed at the same time.

Since the beginning of the electricity era, electric energy transfer and control technologies have been an essential part of electric power systems. Many types of equipment were invented and applied for these purposes. The commonly used equipment includes electric transmission and distribution lines, generators, transformers, switchgears, inductors or reactors, and capacitor banks. The traditional equipment has limited control capability. Many cannot be controlled at all or can only be connected or disconnected with mechanical switches; others have only a limited range, such as transformers with tap changers. Even with fully controllable equipment such as generators, the control dynamics are relatively slow due to the electromechanical or magnetic nature of the controller.

Power electronics are based on semiconductor devices. These devices are derivatives of the transistors and diodes used in microelectronic circuits, with additional large power handling capability. Due to their electronic nature, power electronic devices are much more flexible and faster than their electromechanical or electromagnetic
counterparts for electric energy transfer and control. Since the advent of power electronics in the 1950s, they have steadily gained ground in power system applications. Today, power electronic controllers are an important part of the equipment for electric energy transfer and control. Their roles are growing rapidly with the continuous improvement of power electronic technologies.

Fundamentals of Power Electronics

Different from semiconductor devices in microelectronics, power electronic devices act only as switches for the desired control functions, such that they incur minimum losses when they are either on (closed) or off (open). As a result, power electronic controllers are basically switching circuits. The semiconductor switches are therefore the most important elements of power electronic controllers. Since the 1950s, many different types of power semiconductor switches have been developed and can be selected based on the application.

The performance of power semiconductor devices is mainly characterized by their voltage and current ratings, conduction (on-state) loss, as well as their switching speed (or switching frequency capability) and associated switching loss. The main types of power semiconductor devices are listed below, with their state-of-the-art ratings and frequency ranges shown in Table 1:
• Power diode – a two-terminal device with similar characteristics to the diodes used in microelectronics but with higher voltage and power ratings.
• Thyristor – also called SCR (silicon-controlled rectifier). Unlike the diode, the thyristor is a three-terminal device with an additional gate terminal. It can be turned on by a current pulse through the gate but can only be turned off when the main current goes to zero by external means. The thyristor has low conduction loss but slow switching speed.
• GTO – stands for gate-turn-off thyristor. A GTO can be turned on like a regular thyristor and can also be turned off with a large negative gate current pulse. The GTO has been largely replaced by the IGBT and IGCT due to its complex gate driving needs and slow switching speed.
• Power BJT – similar to the bipolar transistor used in microelectronics; it requires a sustained gate current to turn on and off. It has been replaced by the IGBT and power MOSFET, which offer simpler gate signals and faster switching speed.
• Power MOSFET – similar to the metal-oxide-semiconductor field-effect transistor used in microelectronics; it can be turned on and off with a gate voltage signal. It is the fastest device available but has relatively high conduction loss and relatively low voltage/power ratings.
• IGBT – stands for insulated-gate bipolar transistor. Unlike the regular BJT, it can be turned on and off with a gate voltage like a MOSFET. It has relatively low conduction loss and fast switching speed. The IGBT is becoming the workhorse of power electronics for high-power applications.

Electric Energy Transfer and Control via Power Electronics, Table 1 Commonly used Si-based power semiconductor devices and their ratings

Types         | Voltage                      | Current | Switching frequency
Power diode   | Max 80 kV, typical < 10 kV   | 10 kA   | Various
Thyristor     | Max 8 kV                     | 4.5 kA  | AC line frequency
GTO           | Max 10 kV                    | 6.5 kA  | < 500 Hz
Power MOSFET  | Max 4.5 kV, typical < 600 V  | 1.6 kA  | Tens of kHz to MHz
IGBT          | Max 6.5 kV, typical > 600 V  | 2.4 kA  | 1 kHz to tens of kHz
IGCT          | Max 10 kV, typical > 4.5 kV  | 6.5 kA  | < 2 kHz
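The conduction-versus-switching loss trade-off that separates these device families can be made concrete with a small numeric sketch. All numbers and the function below are illustrative assumptions of this example, not values from Table 1:

```python
# Rough loss sketch (illustrative, not from this entry): a switch's average
# dissipation splits into conduction loss and switching loss, the trade-off
# that separates, e.g., a thyristor (low conduction loss, line frequency
# only) from a MOSFET (fast but lossier in conduction).

def device_loss(v_on, i_avg, duty, f_sw, e_on, e_off):
    """Average power [W] dissipated by one semiconductor switch.

    v_on  : on-state voltage drop [V]
    i_avg : average conducted current [A]
    duty  : fraction of time the switch conducts (0..1)
    f_sw  : switching frequency [Hz]
    e_on, e_off : energy lost per turn-on / turn-off event [J]
    """
    conduction = v_on * i_avg * duty
    switching = f_sw * (e_on + e_off)
    return conduction + switching

# The same device dissipates far more at a higher switching frequency:
low_f = device_loss(2.0, 100.0, 0.5, 2e3, 0.01, 0.01)    # 140 W
high_f = device_loss(2.0, 100.0, 0.5, 20e3, 0.01, 0.01)  # 500 W
```

This is why slow devices such as thyristors dominate at line frequency while MOSFETs and IGBTs are preferred where fast switching matters.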


Electric Energy Transfer and Control via Power Electronics 345

• IGCT – stands for integrated-gate-commutated thyristor. It is basically a GTO with an integrated gate drive circuit allowing a hard-driven turnoff. It therefore has faster switching speed than the regular GTO but slower than the IGBT.

Except for diodes, all other devices above can be turned on and/or off through a gate signal, so they are active switches, while diodes are called passive switches.

With the different types of power semiconductors, many power electronics circuits have been developed. Based on their functions, they can be classified as:
• Rectifier – rectifiers convert AC to DC. Depending on the AC source, rectifiers can be three phase or single phase; depending on the device type, they can be passive (diode based), phase controlled (thyristor controlled), or actively switched.
• Inverter – inverters convert DC to AC. They again can be three phase or single phase. Inverters generally require active switching devices.
• DC-DC converter – also called choppers, DC-DC converters convert one DC voltage level to another. Sometimes they also contain magnetic isolation. DC-DC converters can have unidirectional or bidirectional power flow and generally require active switching devices.
• AC-AC converter – directly converts one AC to another, changing either only the voltage magnitude or both magnitude and frequency. The former can also be called an AC switch, and the latter a frequency changer. Active devices are needed for these types of converters.

There are a variety of converter topologies for each type of converter listed above. The most commonly used basic topologies for power system applications are shown in Fig. 1. These basic topologies can be expanded through paralleling or series connection of devices and/or converters to achieve higher current and voltage ratings. Other variations such as multilevel converters are also popular for high-voltage applications using lower-voltage rating devices.

It should be noted that passive components, i.e., inductors and capacitors, are essential parts of power electronic converters. In fact, power electronic converters transfer or control the electric energy by storing it temporarily in inductors or capacitors while reformatting the original voltage or current waveform through switching actions. The other key function of the passives is filtering the harmonics caused by switching.

Power Electronic Controller Types for Energy Transfer and Control

For almost all traditional non-power-electronic equipment for electric energy transfer and control, there can be a corresponding power electronic-based counterpart, often with better controllability. However, power electronic equipment can be more expensive and is therefore only used when it provides better overall performance and cost benefits. In other cases, only power electronic equipment can achieve the required control functions.

Electric Energy Transfer and Control via Power Electronics, Fig. 1 Commonly used basic power electronics converter topologies (only one phase shown for the AC switch). The panels show a thyristor-based rectifier, a voltage source inverter (VSI), and a bidirectional AC switch
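As a minimal concrete instance of these switching circuits, consider the ideal steady state of a buck DC-DC converter (a sketch of this example only; the function and values are not taken from this entry). It shows how the duty cycle sets the output voltage and how the inductor, one of the essential passives mentioned above, smooths the switched waveform:

```python
# Ideal steady-state sketch of one common DC-DC ("chopper") topology, the
# buck converter (illustrative, not a figure from this entry): the duty
# cycle sets the average output voltage, and the inductor limits the
# current ripple caused by the switching action.

def buck_steady_state(v_in, duty, L, f_sw):
    """Return (output voltage [V], peak-to-peak inductor ripple [A])."""
    v_out = duty * v_in                          # averaging of the chopped input
    ripple = v_out * (1.0 - duty) / (L * f_sw)   # ideal inductor current ripple
    return v_out, ripple

v_out, di = buck_steady_state(v_in=400.0, duty=0.5, L=1e-3, f_sw=10e3)
# v_out = 200.0 V, di = 10.0 A peak-to-peak
```

Raising the switching frequency or the inductance shrinks the ripple, which is the energy-buffering and filtering role of the passives described above.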

Power electronic controllers can be categorized as those for energy generation, delivery, and consumption. For generation, thermal and hydro generators both use synchronous machines with excitation windings on the rotor, which require DC current. A thyristor-based rectifier, called the exciter, is generally used for this purpose. Wind turbine generators usually use a back-to-back VSI to interface to the AC grid, and PV solar sources use a DC-DC converter cascaded with a VSI.

Power electronic controllers for transmission and distribution include the so-called flexible AC transmission systems (FACTS) and high-voltage DC transmission (HVDC). Some of the more commonly used controllers and their functions and circuit topologies are listed in Table 2.

The main power electronic controllers for loads include variable speed motor drives; electronic ballasts for fluorescent lights and power supplies for LEDs; various power supplies for computer, IT, and other electronic loads; and chargers for electric vehicles. The percentage of power-electronics-controlled loads in power systems has been steadily increasing. Power electronics can generally result in improved performance and efficiency.

Future Directions

Power electronics have progressed steadily since the invention of thyristors in the 1950s. The progress spans all aspects: semiconductor devices, passives, circuits, control, and system integration, leading to converter systems with better performance, higher efficiency, higher power density, higher reliability, and lower cost. Because of this progress, power electronics applications in power systems have become more and more widespread. However, in general, power electronic controllers are still not sufficiently cost-effective, reliable, or efficient. Many improvements are needed and expected, especially in the following areas:
• Semiconductor devices – Devices used today are almost exclusively based on silicon. The emerging devices based on wide-bandgap materials such as SiC and GaN are expected to revolutionize power electronics with their capabilities of higher voltage, lower loss, faster switching speed, higher temperature, and smaller size.
• Power electronic converters – More cost-effective and reliable converters will be developed as a result of better devices, passive components, and circuit structures. Modular, distributed, and hybrid (with non-power-electronics) approaches are expected to result in overall better benefits.
• Enhanced functions – Power electronic controllers can be designed to have multiple functions in the system. For example, wind and PV solar inverters can provide reactive power to the grid in addition to transferring real energy. Today, power electronic controllers are mostly locally controlled. With better measurement and communication technologies, they may be controlled over a wide area to support system-level functions.
• New applications – The new applications for future power systems include DC grids based on multiterminal HVDC and energy storage. Critical technologies include cost-effective and efficient DC transformers and DC circuit breakers. Power electronics will play key roles in these technologies.

Cross-References

▶ Cascading Network Failure in Power Grid Blackouts
▶ Coordination of Distributed Energy Resources for Provision of Ancillary Services: Architectures and Algorithms
▶ Lyapunov Methods in Power System Stability
▶ Power System Voltage Stability
▶ Small Signal Stability in Electric Power Systems
▶ Time-Scale Separation in Power System Swing Dynamics: Singular Perturbations and Coherency
Electric Energy Transfer and Control via Power Electronics, Table 2 Commonly used power electronic controllers for transmission and distribution (one-line configuration diagrams not reproduced here)

SVC – static VAR compensator with thyristor-controlled reactor and capacitor
  System functions: stability enhancement; voltage regulation and VAR compensation
  Control principle: VAR control through varying L and C in shunt connection
  Basic PE function: controlled bidirectional AC switch

TCSC – thyristor-controlled series capacitor
  System functions: power flow control; stability enhancement; fault current limiting
  Control principle: power and VAR control through varying C and L in series connection
  Basic PE function: controlled bidirectional AC switch

SSTS – solid state transfer switch
  System functions: power supply transfer for reliability and power quality
  Control principle: on and off control
  Basic PE function: controlled bidirectional AC switch

HVDC (classic)
  System functions: system interconnection; power flow control; stability enhancement
  Control principle: power control through back-to-back converters in shunt connections
  Basic PE function: bidirectional AC/DC current source converter

SSSC – static series synchronous compensator
  System functions: power flow control; stability enhancement
  Control principle: VAR control through voltage control in series connection
  Basic PE function: bidirectional AC/DC voltage source converter

STATCOM – static synchronous compensator
  System functions: stability enhancement; voltage regulation and VAR compensation
  Control principle: VAR control through current control in shunt connection
  Basic PE function: bidirectional AC/DC voltage source converter

HVDC (voltage source)
  System functions: system interconnection; power flow control; stability enhancement
  Control principle: power and VAR control through back-to-back converters in shunt connections
  Basic PE function: bidirectional AC/DC voltage source converter
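Several of the controllers in Table 2 (SVC, STATCOM, SSSC) regulate voltage by injecting reactive power (VARs). For an inverter-based device, the available VAR headroom follows from the apparent-power rating S and the real power P currently transferred, Q = sqrt(S² − P²). The function and numbers below are an illustrative sketch, not from this entry:

```python
# VAR headroom sketch (illustrative, not from this entry): an inverter
# rated for apparent power S [kVA] that currently transfers real power
# P [kW] can still inject up to sqrt(S^2 - P^2) kVAR of reactive power,
# the "enhanced function" mentioned for wind and PV inverters.

import math

def var_headroom(s_rating_kva, p_real_kw):
    """Remaining reactive power capability [kVAR] at real output p_real_kw."""
    if abs(p_real_kw) >= s_rating_kva:
        return 0.0
    return math.sqrt(s_rating_kva**2 - p_real_kw**2)

# A 100 kVA PV inverter delivering 80 kW can still supply 60 kVAR:
q = var_headroom(100.0, 80.0)   # -> 60.0
```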

Engine Control

Luigi del Re
Johannes Kepler Universität, Linz, Austria

Abstract

Engine control is the enabling technology for efficiency, performance, reliability, and cleanliness of modern vehicles for a wide variety of uses and users. It also has paramount importance for many other engine applications like power plants. Engines are essentially chemical reactors, and the core task of engine control consists in preparing and starting the reaction (mixing the reactants and igniting the mixture), while the reaction itself is not controlled. The technical challenge derives from the combination of high complexity, a wide range of conditions of use, performance requirements, significant time delays, and constraints on the choice of components. In practice, engine control is to a large extent feed-forward control, feedback loops being used either for low-level control or for updating the feed-forward. Industrial engine control is based on very complex structures calibrated experimentally, but there is a growing interest in model-based control with stronger feedback action, supported by the breakthrough of new computational and communication possibilities, as well as the introduction of new sensors.

Keywords

Compression ignition; Emissions; Exhaust aftertreatment; Internal combustion engines; Spark ignition

Introduction

Most vehicles are moved by internal combustion engines (ICE), whose key function is the conversion of chemical into mechanical energy, basically by oxidation, e.g., in the case of propane

C3H8 + 5 O2 = 3 CO2 + 4 H2O + 46.3 MJ/kg   (1)

The chemical energy is first transformed into heat and then converted by the ICE into mechanical energy (Heywood 1988). The key task of engine control (Guzzella and Onder 2010; Kiencke and Nielssen 2005) is to make sure that the reactants (fuel and oxygen) meet in the right proportion ("mixture formation") and that the combustion is started (or "ignited") to deliver the required torque at the engine crankshaft. Several combustion processes are known, the most common ones being Otto and Diesel. For the first kind (also called SI for spark ignited), the mixture is prepared outside the combustion chamber and combustion is ignited by a spark, while in the second one fuel is injected directly into the combustion chamber and combustion is ignited by compression (CI, compression ignited). GDI (gasoline direct injection) is a variant of SI engines with direct fuel injection as in CI but spark ignition as in SI.

Unfortunately, the chemical equation (1) is not the whole truth. Indeed, the way the mixture is prepared and ignited affects the efficiency of the conversion from thermal into mechanical energy, but also secondary reactions, like pollutant formation, and other aspects, like noise, vibrations, and harshness (NVH), and mechanical fatigue and thus life expectancy. Furthermore, driveability requirements are primarily determined by the ability of an ICE to change its operating point fast, and this sets additional requirements on the engine control. These requirements have to

be met for all vehicles in spite of production variability and under all relevant operating conditions, including all drivers, road, traffic, and weather conditions.

As first-principle models are often not available, or very time-consuming to tune and seldom precise enough, engine control is based on very complex heuristic descriptions which can be tuned experimentally and even automatically (Schoggl et al. 2002) – a modern engine control unit (ECU) can include up to 40,000 labels (parameters or maps). This structure is mainly feed-forward, with feedback loops typically used for control of actuators, primarily calibrated under laboratory conditions but with adaptation loops designed to correct parameters to take into account production and wear effects. Figure 1 shows an engine test bench setup with the engine control unit (ECU) and a calibration system.

The Target System

Figure 2 shows the basic setup of an ICE as CI and as SI. In both cases, the main components of an ICE are the fuel path, air path, combustion chamber, and exhaust aftertreatment system.

Roughly speaking, ICEs exhibit three time scales. Changes in the setting of the fuel path – responsible for delivering the fuel to the combustion chamber – act very fast for CI and GDI engines (e.g., 50 Hz) and rather fast for SI engines (10 Hz or more). The same is not true for the air path, which brings the gas mixture (fresh air and possibly recirculated exhaust gas) into the combustion chamber and is the slowest system (typically in the range of 0.5–2 Hz). In SI and GDI engines, spark timing can be changed for each combustion too. A still faster dynamics is associated with the combustion process itself; pressure sensors with the required dynamics to monitor it are being introduced in a growing number of applications, but until now no suitable actuators are available for its closed-loop control. The torque demand changes typically with the vehicle dynamics, which are usually still slower than the air path.

The Control Tasks

The high-level control task can be defined as the minimization of the average fuel consumption while providing the required torque and respecting the constraints on emissions (i.e., nitrogen oxides and dioxides (NOx) and particulate matter (PM)), noise, temperature, etc. The legislators in different countries have defined test procedures, including a specified road profile and corresponding emission limits. Figure 3 shows the progressive reduction of the limits and the speed profile used to assess them.

Even if fuel consumption is not yet limited by law, the associated control problem can be stated as an optimal constrained control problem:

min_{u(t)} ∫_0^1120 q̇_f dt   (2)

so that

v(t) = v_dem(t) ± Δv   (3)

and

∫_0^1120 q̇_i dt ≤ Q_i   (4)

where u(t) are all available control inputs, 1120 s is the duration of the European cycle, v_dem(t) the corresponding speed, Δv the speed tolerance, q̇_f the instantaneous fuel consumption, q̇_i each limited quantity (e.g., NOx), and Q_i the corresponding limit for the whole test. In practice, other criteria must be considered as well, like NVH, but even this problem is never solved using the standard tools of optimal control, essentially because of the nonlinearity (and consequent nonconvexity) of the problem, but even more because of the lack of explicit models of sufficient quality relating the inputs to the target quantities, especially combustion-dependent quantities like emissions. In practice, different simpler subproblems are solved separately and tuned to achieve sufficient results also in terms of the general problem, so as to achieve the required performance. In the following, we concentrate on the main high-level tasks, omitting many others, e.g., all the control loops required for the correct operation of the single actuators.
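The integrals in (2) and (4) run over the 1120 s cycle. A minimal compliance check of constraint (4), with invented numbers standing in for measured rates and limits, might look like this sketch:

```python
# Illustrative compliance check for constraint (4) (a sketch with invented
# rates and limits, not data from this entry): the total emission of each
# limited quantity is the integral of its instantaneous rate over the
# 1120 s test cycle, to be compared against the limit Q_i.

def integrate(rate, dt):
    """Trapezoidal integral of a uniformly sampled rate signal."""
    return sum(0.5 * (a + b) * dt for a, b in zip(rate, rate[1:]))

def cycle_compliant(rate_g_s, dt, limit_g):
    """True if the cycle total stays below the limit Q_i of Eq. (4)."""
    return integrate(rate_g_s, dt) <= limit_g

# 1120 s cycle sampled at 1 Hz with a constant 1 mg/s NOx rate -> 1.12 g total
nox_rate = [0.001] * 1121
total = integrate(nox_rate, 1.0)   # 1.12
```

A calibration tool would run such a check on every candidate calibration, for every limited quantity q̇_i, while also logging the fuel integral of (2).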

Engine Control, Fig. 1 Light duty engine test bench with ECU and calibration system

Engine Control, Fig. 2 Basic system scheme of CI (left) and SI (right) engines: 1a Control of the injector opening, 1b Injection premixing with air; 2 Measurement of the engine temperature; 3 Measurement of the engine rotational speed; 4 Measurement of oxygen concentration in the exhaust gases; 5a EGR valve, 5b throttle valve, 6a low-pressure EGR valve; 7a Diesel exhaust aftertreatment (DOC, SCR, DPF), 7b SI engine aftertreatment (3-way catalyst); 8 SCR dosing control

Engine Control, Fig. 3 Left: different steps of limits of emissions per km as defined by the European Union (Euro 1, introduced in 1991, through Euro 6, from 2014), plotted as PM emissions (g/km) versus NOx emissions (g/km). Right: New European Driving Cycle (NEDC), vehicle speed (km/h) versus time (s)

Air Path Control

The main source of oxygen for the reaction of Eq. (1) is ambient air, which contains about 21 % oxygen. The engine – essentially a volumetric air pump – aspirates an air flow roughly proportional to the cylinder volume and the revolution speed of the engine. The amount of oxygen entering the combustion chamber, however, will also depend on temperature, pressure, and moisture. This flow can be reduced (as in standard SI engines) by throttling, e.g., by adding an additional flow resistance between the air intake and the combustion chamber, or increased by compressing the fresh air, most commonly by turbocharging (especially in CI engines). A turbocharger consists essentially of a turbine, which transforms part of the enthalpy of the exhaust gas into mechanical power, and a compressor, driven by this power to compress the fresh air on its way to the combustion chamber, thus increasing both its density and temperature. Turbocharger operation is typically controlled either directly (for instance, with variable vane angles) or indirectly, by bypass valves which divert the gas flows in parallel to the turbine.

If only ambient air is fed to the combustion chamber, a proportional amount of the other gases present in the atmosphere will enter the combustion chamber and be available for combustion side reactions as well. In the case of nitrogen, these reactions lead to the undesired formation of nitrogen oxides (NOx). Therefore, in some engines, especially in CI engines, part of the combusted gases is recirculated to the combustion chamber ("exhaust gas recirculation", EGR), providing advantages in terms of NOx reduction. While EGR is typically realized at high pressure (path HP in Fig. 2), it is realized also at low pressure (path LP), even though less frequently. Typically, the air path includes some coolers designed to increase gas densities.

Air path control is designed to track dynamical references, for instance, the total fresh air mass (MAF) entering the cylinder and the corresponding pressure (MAP), but also other quantities are possible. The references are typically generated by the calibration engineers on the basis of tests. The control inputs of the air path are mostly the turbine (and possibly compressor) steering angle, the EGR, and – if available – the throttle setpoint(s). The most commonly used sensors include a mass flow meter (hot-film sensor), which is rather slow and dynamically not reliable, and pressure and temperature sensors; sometimes the actual positions of the valves and the turbocharger speed are measured as well.
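The kind of low-level feedback loop used for such reference tracking can be sketched as follows. The controller, plant model, and all gains below are illustrative assumptions of this sketch, not values from this entry:

```python
# Sketch of a low-level air path loop (illustrative, not from this entry):
# a discrete PI controller drives an actuator command u in [0, 1] (e.g.,
# a turbine vane or EGR valve position) to track a MAP reference produced
# from calibration maps. A first-order lag with a 1 s time constant
# stands in for the slow (~1 Hz) air path dynamics.

class PI:
    def __init__(self, kp, ki, dt, u_min=0.0, u_max=1.0):
        self.kp, self.ki, self.dt = kp, ki, dt
        self.u_min, self.u_max = u_min, u_max
        self.i = 0.0                                        # integral state

    def step(self, ref, meas):
        e = ref - meas
        self.i += self.ki * e * self.dt
        self.i = max(self.u_min, min(self.u_max, self.i))   # anti-windup clamp
        return max(self.u_min, min(self.u_max, self.kp * e + self.i))

pi = PI(kp=0.5, ki=2.0, dt=0.01)
map_kpa, ref_kpa = 100.0, 150.0           # manifold pressure and its reference
for _ in range(2000):                     # 20 s of simulated closed loop
    u = pi.step(ref_kpa, map_kpa)
    map_kpa += 0.01 * ((80.0 * u + 100.0) - map_kpa)   # tau = 1 s plant
```

The clamp on the integral state is the kind of anti-windup measure needed because real actuators (valves, vanes) saturate.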

Engine Control, Fig. 4 Pressure trace in a fired cylinder of a CI engine triggered by a pilot and a main injection (indicated pressure and injection command versus crank angle [deg])

Fuel Path Control

The fuel path delivers the correct amount of fuel for the reaction (1). In almost every ICE, a rail is filled with fuel at a given pressure (from a few bars for SI to about 2000 bars for CI), from which the required amount of fuel is injected into the cylinder. The injection can occur inside the combustion chamber (as for CI and GDI engines) or near the intake valve ("port injection") for standard SI engines.

The injection amount is always set taking into account the available oxygen mass. In SI engines with a three-way catalyst, the fuel injection is given by the stoichiometric condition. λ control uses an oxygen sensor in the exhaust to determine the actual fuel/oxygen ratio and, if appropriate, correct the injection tables. In CI and GDI, the maximum fuel injection is limited to prevent smoke formation, typically by tables, even though λ control can be and is partly used (Amstutz and del Re 1995).

In SI engines with port injection, the liquid fuel is injected near the inlet valve and is expected to vaporize due to the local temperature and pressure conditions. During load changes, however, it can happen that part of the fuel is not vaporized, remains on the duct wall ("wall wetting"), and vaporizes at a later time, leading in both cases to a deviation from the expected values (Turin et al. 1995), which must be compensated by the injection control.

Injection in CI engines is typically split into a main injection for torque and a pilot injection for NVH control, and sometimes also a post-injection for emission control or regeneration of aftertreatment devices. Figure 4 shows the typical effect of a pilot injection on the pressure trace of a CI engine.

Differences between the injectors of different cylinders are compensated by cylinder balancing control (typically using irregularities in the engine acceleration). Rail pressure is also an important control variable for direct injection.

Ignition

Once the combustion chamber is filled, the combustion can be started. In SI and GDI, combustion is started by a spark, leading to a flame front which propagates through the whole combustion chamber. Very few SI engines have a second spark plug to better control the combustion. Under some circumstances, e.g., high temperature, an undesired auto-ignition ("knock") can occur, with potentially catastrophic consequences for the engine durability but also unconventional NVH. To cope with this, SI engines have vibration sensors whose output is used to modify the engine operation, in particular the spark timing, to prevent it.
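The stoichiometric target that λ control tracks follows directly from the reaction stoichiometry of Eq. (1). A worked sketch for propane (the constants and function below are this example's assumptions, not from the entry):

```python
# Worked example (illustrative, not from this entry): for propane,
# C3H8 + 5 O2 -> 3 CO2 + 4 H2O, each kilogram of fuel needs a fixed mass
# of O2, hence of air; λ is the ratio of actual to stoichiometric air mass,
# with λ = 1 the target enforced by the exhaust oxygen sensor loop.

M_C, M_H, M_O2 = 12.011, 1.008, 31.998   # molar masses [g/mol]
O2_MASS_FRACTION_AIR = 0.232             # approximate O2 content of air, by mass

m_fuel = 3 * M_C + 8 * M_H               # one mole of C3H8, about 44.1 g
m_o2 = 5 * M_O2                          # five moles of O2, about 160 g
afr_stoich = (m_o2 / m_fuel) / O2_MASS_FRACTION_AIR   # ~15.6 kg air per kg fuel

def lam(air_mass, fuel_mass):
    """Normalized air/fuel ratio; lam = 1.0 is the stoichiometric target."""
    return (air_mass / fuel_mass) / afr_stoich
```

Values of λ below 1 indicate a rich mixture (excess fuel), above 1 a lean one, which is why the three-way catalyst of Table 1 requires λ = 1 and the oxidation catalyst λ > 1.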

In CI engines, the injection leads almost immediately to the combustion, which has more of the character of an explosion and starts typically at several undefined locations.

Additional control during the combustion is up to now only theoretically feasible, as the combustion takes place in an extremely short time, but also because adequate actuators are not available.

Aftertreatment

As the combustion mixture will always contain more potential reactants than oxygen and fuel, side reactions will always take place, yielding toxic products, in particular NOx, incompletely burnt fuel (HC), carbon monoxide (CO), and particulate matter (PM). Even if much effort is spent on reducing their formation, this is almost never sufficient, so additional aftertreatment equipment is used. Table 1 gives an overview of the most common aftertreatment systems as well as their control aspects.

Thermal Management

All main properties of an engine are strongly affected by its temperature, which depends on the varying load conditions. Engine operation is typically optimal for a relatively narrow temperature range; the same is even more critical for the exhaust aftertreatment system. Engine heat is also required for other purposes (like defrosting of windshields in cold climates).

Thus the engine control system has two main tasks: bringing the engine and the exhaust aftertreatment system as fast as possible into the target temperature range, and taking into account the deviation from this target. The first task is performed both by control of the cooling circuit and by specific combustion-related measures, the second one by taking the measured or estimated temperature as input for the controllers.

Fast heating is especially important for SI engines, because almost all toxic emissions are produced when the three-way catalyst is cold. To achieve faster heating, SI engines tend to operate in a less fuel-efficient, but "hotter", operation mode during this warm-up phase, one of the causes of the increased consumption of cold engines and short trips.

Cranking, Idle Speed, and Gear Shifting Control

Initially, the engine is cranked by the starter up to a relatively low speed, and then injection starts, bringing the engine to the minimum operational speed. If the injected fuel is not immediately burnt, very high emissions will arise. At cranking, the cylinder walls are typically very cold and combustion of a stoichiometric mixture is hardly possible. So engine control has the task of injecting as little as possible but as much as needed, and only in the cylinder which is going to fire.

Normally an ICE is expected to provide a torque to the driveline, speed being the result of the balance between it and the load. In idle control, no torque is transmitted to the driveline, but the engine speed is expected to remain stable in spite of possible changes of local loads (like cabin climate control). This boils down to a robust control problem (Hrovat and Sun 1997).
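The idle disturbance-rejection task just described can be sketched as follows; the dynamics, gains, and load numbers are invented for this illustration and are not from the entry:

```python
# Idle speed sketch (illustrative, not from this entry): a measurable
# accessory load step (e.g., the A/C compressor engaging) is largely
# cancelled by torque feedforward, while a PI term trims the residual
# speed error so idle speed stays near its reference despite the
# imperfect feedforward estimate.

def simulate_idle(load_profile, dt=0.01, gain=50.0, w_ref=800.0,
                  kp=0.1, ki=1.0, ff_quality=0.8):
    """Final engine speed [rpm] after applying a torque load profile [Nm]."""
    w, trim = w_ref, 0.0
    for t_load in load_profile:
        e = w_ref - w
        trim += ki * e * dt                            # integral trim
        t_cmd = ff_quality * t_load + kp * e + trim    # imperfect feedforward + PI
        w += dt * gain * (t_cmd - t_load)              # crankshaft speed dynamics
    return w

# A 10 Nm load stepping in after 1 s is rejected within a few seconds:
final_rpm = simulate_idle([0.0] * 100 + [10.0] * 500)
```

The split between feedforward (fast, model-based) and integral feedback (slow, removing the residual error) mirrors the general feed-forward-plus-adaptation structure of industrial engine control described earlier.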

Engine Control, Table 1 Main exhaust aftertreatment systems

System | Purpose | Control targets
Three-way catalyst | Reduction of HC, CO, and NOx by more than 98 % | Achieve fast and maintain operating temperature and keep λ = 1
Oxidation catalyst | Reduction of HC and CO, partly of PM | Achieve fast and maintain operating temperature and keep λ > 1
Particulate filter | Traps PM | Check trap state and regenerate by increasing exhaust temperature for a short time if needed
NOx lean trap | Traps NOx | Estimate trap state and shift combustion to CO-rich when required
Selective catalytic reduction | Reduces NOx | Estimate required quantity of additional reactant (urea) and dose it

Gear shifting requires several steps. Smoothness and speed of the shifting depend on the coordination of the engine operating point change. Current hardware developments (double clutches, automated gearboxes) make better operation possible, but require precise control.

New Trends

The utilization environment of engine control is changing. On one side, customer and legislator expectations continue producing pressure, but there is a shift in priority from emissions to fuel efficiency and safety. Driver support systems, for instance, automated parking, are becoming ever more pervasive, and many functions must be included in, or immediately affect, the ECU, even though they are frequently hosted on their own control hardware. Hybrid vehicles are gaining popularity, and this implies a different operation mode for the engine; for instance, thermal management becomes much more complex for range extender vehicles with long "cold" phases.

Maybe even more important is the diffusion of new devices and communication possibilities, so that, for instance, fuel-saving preview-based gear shifting can be easily implemented using infrastructure-to-vehicle information, or even just navigation data. Further extensions, like cooperative adaptive cruise control (CACC), plan to use vehicle-to-vehicle information to increase both safety and efficiency.

Against this background, there is a growing consciousness that the current industrial approach based on huge calibration work is becoming ever less viable and bears a steadily increasing risk of wasting potential performance. Some model-based controls have already found their way into the ECU, and academia has shown on several occasions that model-based control is able to achieve better performance, but it has not yet been shown how this could comply with other industrialization requirements.

Currently, new faster sensors (e.g., pressure sensors in the combustion chambers) are being introduced, and the interest in model-based control (Alberer et al. 2012) and in system identification techniques (del Re et al. 2010) is increasing, but these are not yet widespread.

Cross-References

▶ Powertrain Control for Hybrid-Electric and Electric Vehicles
▶ Transmission

Bibliography

Alberer D et al (2012) Identification for automotive systems. Springer, London
Amstutz A, del Re L (1995) EGO sensor based robust output control of EGR in diesel engines. IEEE Trans Control Syst Technol 3(1):39–48
del Re L et al (2010) Automotive model predictive control. Springer, Berlin
Guzzella L, Onder C (2010) Introduction to modeling and control of internal combustion engine systems, 2nd edn. Springer, Berlin
Heywood J (1988) Internal combustion engines fundamentals. McGraw-Hill, New York
Hrovat D, Sun J (1997) Models and control methodologies for IC engine idle speed control design. Control Eng Pract 5(8):1093–1100
Kiencke U, Nielssen L (2005) Automotive control systems: for engine, driveline, and vehicle, 2nd edn. Springer, New York
Schoggl P et al (2002) Automated EMS calibration using objective driveability assessment and computer aided optimization methods. SAE Trans 111(3):1401–1409
Turin RC, Geering HP (1995) Model-reference adaptive A/F-ratio control in an SI engine based on Kalman-filtering techniques. In: Proceedings of the American Control Conference, Seattle, vol 6

Estimation and Control over Networks

Vijay Gupta
Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN, USA

Abstract

Estimation and control of systems when data is being transmitted across nonideal
Estimation and Control over Networks 355
Estimation and control of systems when data is being transmitted across nonideal communication channels has now become an important research topic. While much progress has been made in the area over the last few years, many open problems still remain. This entry summarizes some results available for such systems and points out a few open research directions. Two popular channel models are considered – the analog erasure channel model and the digital noiseless model. Results are presented for both the multichannel and multisensor settings.

Keywords

Analog erasure channel; Digital noiseless channel; Networked control systems; Sensor fusion

Introduction

Networked control systems refer to systems in which estimation and control are done across communication channels. In other words, these systems feature data transmission among the various components – sensors, estimators, controllers, and actuators – across communication channels that may delay, erase, or otherwise corrupt the data. It has been known for a long time that the presence of communication channels has deep and subtle effects. As an instance, an asymptotically stable linear system may display chaotic behavior if the data transmitted from the sensor to the controller and from the controller to the actuator is quantized. Accordingly, the impact of communication channels on the estimation/control performance and the design of estimation/control algorithms to counter any performance loss due to such channels have both become areas of active research.

Preliminaries

It is not possible to provide a detailed overview of all the work in the area. This entry attempts to summarize the flavor of the results that are available today. We focus on two specific communication channel models – the analog erasure channel and the digital noiseless channel. Although other channel models, e.g., channels that introduce delays or additive noise, have been considered in the literature, these models are among the ones that have been studied the most. Moreover, the richness of the field can be illustrated by concentrating on these models.

An analog erasure channel model is defined as follows. At every time step k, the channel supports as its input a real vector i(k) ∈ ℝ^t with a bounded dimension t. The output o(k) of the channel is determined stochastically. The simplest model of the channel is when the output is determined by a Bernoulli process with probability p. In this case, the output is given by

    o(k) = i(k − 1)   with probability 1 − p
         = ∅          otherwise,

where the symbol ∅ denotes the fact that the receiver does not obtain any data at that time step and, importantly, recognizes that the channel has not transmitted any data. The probability p is termed the erasure probability of the channel. More intricate models in which the erasure process is governed by a Markov chain, or by a deterministic process, have also been proposed and analyzed. In our subsequent development, we will assume that the erasure process is governed by a Bernoulli process.

A digital noiseless channel model is defined as follows. At every time step k, the channel supports at its input one out of 2^m symbols. The output of the channel is equal to the input. The symbol that is transmitted may be generated arbitrarily; however, it is natural to consider the channel as supporting m bits at every time step and the specific symbol transmitted as being generated according to an appropriately designed quantizer. Once again, additional complications such as delays introduced by the channel have been considered in the literature.

A general networked control problem consists of a process whose states are measured by multiple sensors that transmit data to multiple controllers. The controllers generate control inputs that are applied by different actuators.
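The Bernoulli analog erasure channel described above is straightforward to simulate. The sketch below is illustrative only; the class name `ErasureChannel` is our own, and `None` stands in for the erasure symbol ∅.

```python
import random

class ErasureChannel:
    """Analog erasure channel: delivers the input vector with
    probability 1 - p and the erasure symbol (None) otherwise.
    The receiver can always distinguish an erasure from real data."""

    def __init__(self, p, seed=0):
        self.p = p                      # erasure probability
        self.rng = random.Random(seed)

    def transmit(self, i_k):
        return i_k if self.rng.random() >= self.p else None

ch = ErasureChannel(p=0.3)
received = [ch.transmit([1.0, 2.0]) for _ in range(1000)]
erasures = sum(r is None for r in received)
print(f"empirical erasure rate: {erasures / 1000:.2f}")  # close to p = 0.3
```

Because the erasure symbol is distinguishable from data, the receiver knows at every step whether a packet was dropped — the property the entry emphasizes.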
All the data is transmitted across communication channels. The design of control inputs when multiple controllers are present, even without communication channels, is known to be hard, since the control inputs in this case have a dual effect. It is thus not surprising that not many results are available for networked control systems with multiple controllers. We will therefore concentrate on the case when only one controller and actuator is present. However, we will review the known results for the analog erasure channel and the digital noiseless channel models when (i) multiple sensors observe the same process and transmit information to the controller and (ii) the sensor transmits information to the controller over a network of communication channels with an arbitrary topology.

An important distinction in the networked control system literature is that of one-block versus two-block designs. Intuitively, the one-block design arises from viewing the communication channel as a perturbation to a control system designed without a channel. In this paradigm, the only block that needs to be designed is the receiver. Thus, for instance, if an analog erasure channel is present between the sensor and the estimator, the sensor continues to transmit the measurements as if no channel were present. However, the estimator present at the output of the channel is now designed to compensate for any imperfections introduced by the communication channel. On the other hand, in the two-block design paradigm, both the transmitter and the receiver are designed to optimize the estimation or control performance. Thus, if an analog erasure channel is present between the sensor and the estimator, the sensor can now transmit an appropriate function of the information it has access to. The transmitted quantity needs to satisfy the constraints introduced by the channel in terms of dimensions, bit rate, power constraints, and so on. It is worth remembering that while the two-block design paradigm follows in spirit from communication theory, where both the transmitter and the receiver are design blocks, the specific design of these blocks is usually much more involved than in communication theory. It is not surprising that, in general, performance with two-block designs is better than with one-block designs.

Analog Erasure Channel Model

Consider the usual LQG formulation. A linear process of the form

    x(k + 1) = Ax(k) + Bu(k) + w(k),

with state x(k) ∈ ℝ^d and process noise w(k), is controlled using a control input u(k) ∈ ℝ^m. The process noise is assumed to be white, Gaussian, and zero mean, with covariance Σ_w. The initial condition x(0) is also assumed to be Gaussian and zero mean with covariance Π_0. The process is observed by n sensors, with the i-th sensor generating measurements of the form

    y_i(k) = C_i x(k) + v_i(k),

with the measurement noise v_i(k) assumed to be white, Gaussian, and zero mean, with covariance Σ_v^i. All the random variables in the system are assumed to be mutually independent. We consider two cases:
• If n = 1, the sensor communicates with the controller across a network consisting of multiple communication channels connected according to an arbitrary topology. Every communication channel is modeled as an analog erasure channel with a possibly different erasure probability. The erasure events on the channels are assumed, for simplicity, to be independent of each other. The sensor and the controller then form two nodes of a network, each edge of which represents a communication channel.
• If n > 1, then every sensor communicates with the controller across an individual communication channel that is modeled as an analog erasure channel with a possibly different erasure probability. The erasure events on the channels are assumed, for simplicity, to be independent of each other.
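For concreteness, the process and sensor models above can be written out numerically. The sketch below simulates the open-loop process with n = 2 sensors; all matrices and covariances are arbitrary illustrative values of our own choosing, not values from the entry.

```python
import numpy as np

rng = np.random.default_rng(0)

# Process x(k+1) = A x(k) + B u(k) + w(k), observed by n = 2 sensors
# through y_i(k) = C_i x(k) + v_i(k). Parameter values are examples only.
A = np.array([[1.1, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
C = [np.array([[1.0, 0.0]]), np.array([[0.0, 1.0]])]
Sigma_w = 0.01 * np.eye(2)          # process-noise covariance
Sigma_v = [0.04, 0.04]              # scalar measurement-noise variances

x = rng.multivariate_normal(np.zeros(2), np.eye(2))  # x(0) ~ N(0, Pi_0)
for k in range(50):
    u = np.zeros(1)                 # open loop here; a controller would set u(k)
    y = [Ci @ x + np.sqrt(s) * rng.standard_normal(1)
         for Ci, s in zip(C, Sigma_v)]
    w = rng.multivariate_normal(np.zeros(2), Sigma_w)
    x = A @ x + B @ u + w
print("final state:", x)
```

Each sensor's measurement sequence `y` is what would be encoded and sent over its erasure channel in the two cases listed above.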
The controller calculates the control input to optimize a quadratic cost function of the form

    J_K = E[ Σ_{k=0}^{K−1} ( x^T(k) Q x(k) + u^T(k) R u(k) ) + x^T(K) P_K x(K) ].

All the covariance matrices and the cost matrices Q, R, and P_K are assumed to be positive definite. The pair (A, B) is controllable and the pair (A, C) is observable, where C is formed by stacking the matrices C_i. The system is said to be stabilizable if there exists a design (within the specified one-block or two-block design framework) such that the cost lim_{K→∞} (1/K) J_K is bounded.

A Network of Communication Channels

We begin with the case n = 1 as mentioned above. The one-block design problem in the presence of a network of communication channels is identical to the one-block design as if only one channel were present. This is because the network can be replaced by an "equivalent" communication channel whose erasure probability is some function of the reliability of the network. This can lead to poor performance, since the reliability may decrease quickly as the network size increases. For this reason, we will concentrate on the two-block design paradigm.

The two-block design paradigm permits the nodes of the network to process the data prior to transmission and hence achieve much better performance. The only constraint imposed on the transmitter is that the quantity that is transmitted be a causal function of the information that the node has access to, with a bounded dimension. The design problem can be solved using the following steps. The first step is to prove that a separation principle holds if the controller knows the control input applied by the actuator at every time step. This can be the case if the controller transmits the control input to the actuator across a perfect channel or if the control input is transmitted across an analog erasure channel but the actuator can transmit an acknowledgment to the controller. For simplicity, we assume that the controller transmits the control input to the actuator across a perfect channel. The separation principle states that the optimal performance is achieved if the control input is calculated using the usual LQR control law, but the process state is replaced by the minimum mean squared error (MMSE) estimate of the state. Thus, the two-block design problem now needs to be solved for an optimal estimation problem.

The next step is to realize that for any allowed two-block design, an upper bound on estimation performance is provided by the strategy of every node transmitting every measurement it has access to at each time step. Notice that this strategy is not in the set of allowed two-block designs, since the dimension of the transmitted quantity is not bounded with time. However, the same estimate is calculated at the decoder if the sensor transmits an estimate of the state at every time step and every other node (including the decoder) transmits the latest estimate it has access to, from either its neighbors or its memory. This algorithm is recursive and involves every node transmitting a quantity with bounded dimension; since it leads to calculation of the same estimate at the decoder, it is optimal. It is worth remarking that the intermediate nodes do not require access to the control inputs. This is because the estimate at the decoder is a linear function of the control inputs and the measurements: thus, the effect of the control inputs on the estimate can be separated from the effect of the measurements and included at the controller. Moreover, as long as the closed-loop system is stable, the quantities transmitted by the various nodes are also bounded. Thus, the two-block design problem can be solved.
The stability and performance analysis with the optimal design can also be performed. As an example, a necessary and sufficient stabilizability condition is that the inequality

    p_maxcut ρ(A)² < 1

holds, where ρ(A) is the spectral radius of A and p_maxcut is the max-cut probability, evaluated as follows. Generate cut-sets from the network by dividing the nodes into two sets – a source set containing the sensor and a sink set containing the controller. For each cut-set, obtain the cut-set probability by multiplying the erasure probabilities of the channels from the source set to the sink set. The max-cut probability is the maximum such cut-set probability. The necessity of the condition follows by recognizing that the channels from the source set to the sink set need to transmit data at a high enough rate even if the channels within each set are assumed not to erase any data. The sufficiency of the condition follows by using the Ford-Fulkerson algorithm to reduce the network to a collection of parallel paths from the sensor to the controller such that each path has links with equal erasure probability and the product of these probabilities over all paths is the max-cut probability. More details can be found in Gupta et al. (2009a).

Multiple Sensors

Let us now consider the case when the process is observed using multiple sensors, each of which transmits data to the controller across an individual analog erasure channel. A separation principle reducing the control design problem to the combination of an LQR control law and an estimation problem can once again be proven. Thus, the two-block design for the estimation problem asks the following question: what quantity should the sensors transmit such that the decoder is able to generate the optimal MMSE estimate of the state at every time step, given all the information the decoder has received till that time step? This problem is similar to the track-to-track fusion problem that has been studied since the 1980s and is still open in general (Chang et al. 1997). Suppose that at time k, the last successful transmission from sensor i happened at time k_i ≤ k. The best estimate that the decoder can ever hope to achieve is the estimate of the state x(k) based on all measurements from sensor 1 till time k_1, from sensor 2 till time k_2, and so on. However, it is not known whether this estimate is achievable if the sensors are constrained to transmit real vectors with a bounded dimension. A fairly intuitive encoding scheme is for the sensors to transmit local estimates of the state based on their own measurements. However, it is known that the global estimate cannot, in general, be obtained from local estimates because of the correlation introduced by the process noise. If the erasure probabilities are zero, or if the process noise is not present, then the optimal encoding schemes are known. Another case for which the optimal encoding schemes are known is when the estimator sends back acknowledgments to the encoders.

Transmitting local estimates does, however, achieve the same stability conditions as the optimal (unknown) two-block design (Gupta et al. 2009b). As an example, the necessary and sufficient stability conditions for the two-sensor case are given by

    p_1 ρ(A_1)² < 1
    p_2 ρ(A_2)² < 1
    p_1 p_2 ρ(A_3)² < 1,

where p_1 and p_2 are the erasure probabilities of the channels from sensors 1 and 2, respectively; ρ(A_1) is the spectral radius of the part of the matrix A that is unobservable from the second sensor; ρ(A_2) is the spectral radius of the part of A that is unobservable from the first sensor; and ρ(A_3) is the spectral radius of the part of A that is observable from both sensors. The conditions are fairly intuitive. For instance, the first condition bounds the rate of increase of the modes for which only sensor 1 can provide information to the controller, in terms of how reliable the communication channel from sensor 1 is.
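As an illustrative check of these two-sensor conditions, suppose A is diagonal so that its modes decouple; the mode-to-sensor assignment and all numerical values below are our own example, not from the entry.

```python
# Check the two-sensor stability conditions for a diagonal A,
# whose modes decouple. All numerical values are illustrative.
eigs = {"a1": 1.3, "a2": 1.1, "a3": 1.05}  # diagonal entries of A
# Mode a1 is seen only by sensor 1 (unobservable from sensor 2),
# a2 only by sensor 2, and a3 by both sensors.
p1, p2 = 0.4, 0.5                          # channel erasure probabilities

rho1 = abs(eigs["a1"])   # spectral radius unobservable from sensor 2
rho2 = abs(eigs["a2"])   # spectral radius unobservable from sensor 1
rho3 = abs(eigs["a3"])   # spectral radius observable from both sensors

conditions = [p1 * rho1**2 < 1,
              p2 * rho2**2 < 1,
              p1 * p2 * rho3**2 < 1]
print("stabilizable:", all(conditions))    # prints: stabilizable: True
```

Note how the jointly observable mode a3 only needs the product p1·p2 to be small: a packet from either sensor carries information about it.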
Digital Noiseless Channels

Results similar to those above can be derived for the digital noiseless channel model. For this model, it is easier to consider the system without either measurement or process noise (although results with such noises are available). Moreover, since quantization is inherently highly nonlinear, results such as the separation between estimation and control are not available. Thus, encoders and controllers that optimize a cost function such as a quadratic performance metric are not available even for the single-sensor or single-channel case. Most available results thus discuss stabilizability conditions for a given data rate that the channels can support.

While early works used the one-block design framework to model the digital noiseless channel as introducing an additive white quantization noise, that framework obscures several crucial features of the channel. For instance, such an additive noise model suggests that at any bit rate, the process can be stabilized by a suitable controller. However, a simple argument shows that this is not true. Consider a scalar process in which, at time k, the controller knows that the state is within a set of length l(k). Then, stabilization is possible only if l(k) remains bounded as k → ∞. Now, the evolution of l(k) is governed by two processes: at every time step, this uncertainty is (i) decreased by a factor of at most 2^m due to the data transmission across the channel and (ii) increased by a factor of a (where a is the process matrix governing the evolution of the state) due to the process evolution. This implies that for stabilization to be possible, the inequality m ≥ log₂(a) must hold. Thus, the additive noise model is inherently wrong. Most results in the literature formalize this basic intuition (Nair et al. 2007).

A Network of Communication Channels

For the case when there is only one sensor that transmits information to the controller across a network of communication channels connected in an arbitrary topology, an analysis similar to that done for analog erasure channels can be performed (Tatikonda 2003). A max-flow min-cut-like theorem again holds. The stability condition now becomes that, for any cut-set,

    Σ R_j  >  Σ_{all unstable eigenvalues} log₂(λ_i),

where Σ R_j is the sum of the data rates supported by the channels joining the source set to the sink set of the cut-set and the λ_i are the eigenvalues of the process matrix A. Note that the summation on the right-hand side is only over the unstable eigenvalues, since no information needs to be transmitted about the modes that are stable in open loop.

Multiple Sensors

The case when multiple sensors transmit information across individual digital noiseless channels to a controller can also be considered. For every sensor i, define a rate vector {R_i1, R_i2, …, R_id} corresponding to the d modes of the system. If a mode j cannot be observed from sensor i, set R_ij = 0. For stability, the condition

    Σ_i R_ij ≥ max(0, log₂(λ_j))

must be satisfied for every mode j. All such rate vectors stabilize the system.

Summary and Future Directions

This entry provided a brief overview of some results available in the field of networked control systems. Although the area is seeing intense research activity, many problems remain open. For control across analog erasure channels, most existing results break down if a separation principle cannot be proved. Thus, for example, if control packets are also transmitted to the actuator across an analog erasure channel, the LQG optimal two-block design is unknown. There is some recent work on analyzing stabilizability under such conditions (Gupta and Martins 2010), but the problem remains open in general. For digital noiseless channels, controllers that optimize some performance metric are largely unknown. Considering more general channel models is also an important research direction (Martins and Dahleh 2008; Sahai and Mitter 2006).

Cross-References

▶ Averaging Algorithms and Consensus
▶ Oscillator Synchronization
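The scalar data-rate argument used above for the digital noiseless channel can be checked numerically: the uncertainty length l(k) shrinks by a factor of 2^m per transmission and grows by a factor of a per time step, so it stays bounded exactly when m ≥ log₂(a). The function name below is our own.

```python
import math

def uncertainty_after(a, m, steps, l0=1.0):
    """Evolve l(k+1) = a * l(k) / 2**m, the best case of the
    scalar data-rate argument for a digital noiseless channel."""
    l = l0
    for _ in range(steps):
        l = a * l / 2 ** m
    return l

a = 3.0                                     # unstable pole; log2(3) ~ 1.58 bits
print(uncertainty_after(a, m=2, steps=50))  # m > log2(a): l(k) shrinks toward 0
print(uncertainty_after(a, m=1, steps=50))  # m < log2(a): l(k) grows without bound
```

With m = 2 bits the per-step ratio is 3/4 < 1; with m = 1 bit it is 3/2 > 1, matching the condition m ≥ log₂(a).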
360 Estimation for Random Sets
Bibliography

Chang K-C, Saha RK, Bar-Shalom Y (1997) On optimal track-to-track fusion. IEEE Trans Aerosp Electron Syst AES-33:1271–1276
Gupta V, Martins NC (2010) On stability in the presence of analog erasure channels between controller and actuator. IEEE Trans Autom Control 55(1):175–179
Gupta V, Dana AF, Hespanha J, Murray RM, Hassibi B (2009a) Data transmission over networks for estimation and control. IEEE Trans Autom Control 54(8):1807–1819
Gupta V, Martins NC, Baras JS (2009b) Stabilization over erasure channels using multiple sensors. IEEE Trans Autom Control 54(7):1463–1476
Martins NC, Dahleh M (2008) Feedback control in the presence of noisy channels: 'Bode-like' fundamental limitations of performance. IEEE Trans Autom Control 53(7):1604–1615
Nair GN, Fagnani F, Zampieri S, Evans RJ (2007) Feedback control under data rate constraints: an overview. Proc IEEE 95(1):108–137
Sahai A, Mitter S (2006) The necessity and sufficiency of anytime capacity for stabilization of a linear system over a noisy communication link – Part I: scalar systems. IEEE Trans Inf Theory 52(8):3369–3395
Tatikonda S (2003) Some scaling properties of large distributed control systems. In: 42nd IEEE conference on decision and control, Maui, Dec 2003

Estimation for Random Sets

Ronald Mahler
Eagan, MN, USA

Abstract

The random set (RS) concept generalizes that of a random vector. It permits the mathematical modeling of random systems that can be interpreted as random patterns. Algorithms based on RSs have been extensively employed in image processing. More recently, they have found application in multitarget detection and tracking and in the modeling and processing of human-mediated information sources. The purpose of this entry is to briefly summarize the concepts, theory, and practical application of RSs.

Keywords

Image processing; Multitarget processing; Random finite sets; Stochastic geometry

Introduction

In ordinary signal processing, one models physical phenomena as "sources," which generate "signals" obscured by random "noise." The sources are to be extracted from the noise using optimal-estimation algorithms. Random set (RS) theory was devised about 40 years ago by mathematicians who also wanted to construct optimal-estimation algorithms. The "signals" and "noise" that they had in mind, however, were geometric patterns in images. The resulting theory, stochastic geometry, is the basis of the "morphological operators" commonly employed today in image-processing applications. It is also the basis for the theory of RSs. An important special case of RS theory, the theory of random finite sets (RFSs), addresses problems in which the patterns of interest consist of a finite number of points. It is the theoretical basis of many modern medical and other image-processing algorithms. In recent years, RFS theory has found application to the problem of detecting, localizing, and tracking unknown numbers of unknown, evasive point targets. Most recently and perhaps most surprisingly, RS theory provides a theoretically rigorous way of addressing "signals" that are human-mediated, such as natural-language statements and inference rules. The breadth of RS theory is suggested by the various chapters of Goutsias et al. (1997).

The purpose of this entry is to summarize the RS and RFS theories and their applications. It is divided into the following sections: A Simple Example, Mathematics of Random Sets, Random Sets and Image Processing, Random Sets and Multitarget Processing, Random Sets and Human-Mediated Data, Summary and Future Directions, Cross-References, and Recommended Reading.
A Simple Example

To illustrate the concept of an RS, let us begin by examining a simple example: locating stars in the nighttime sky. We will proceed in successively more illustrative steps:

Locating a single non-dim star (estimating a random point). When we try to locate a star, we are trying to estimate its actual position – its "state" x = (α₀, θ₀) – in terms of its azimuth angle α₀ and elevation angle θ₀. When the star is dim but not too dim, its apparent position will vary slightly. We can estimate its position by averaging many measurements – i.e., by applying a point estimator.

Locating a very dim star (estimating an RS with at most one element). Assume that the star is so dim that, when we see it, it might be just a momentary visual illusion. Before we can estimate its position, we must first estimate whether or not it exists. We must record not only its apparent position z = (α, θ) (if we see it) but its apparent existence ε, with ε = 1 (we saw it) or ε = 0 (we did not). Averaging ε over many observations, we get a number q between 0 and 1. If q > 1/4 (say), we could declare that the star probably actually is a star; and then we could average the non-null observations to estimate its position.

Locating multiple stars (estimating an RFS). Suppose that we are trying to locate all of the stars in some patch of sky. In some cases, two dim stars may be so close that they are difficult to distinguish. We will then collect three kinds of measurements from them: Z = ∅ (did not see either star), Z = {(α, θ)} (we saw one or the other), or Z = {(α₁, θ₁), (α₂, θ₂)} (saw both). The total collected measurement in the patch of sky is a finite set Z = {z₁, …, z_m} of point measurements with z_j = (α_j, θ_j), where each z_j is random, where m is random, and where m = 0 corresponds to the null measurement Z = ∅.

Locating multiple stars in a quantized sky (estimation using imprecise measurements). Suppose that, for computational reasons, the patch of sky must be quantized into a finite number of hexagonal cells, c₁, …, c_M. Then, the measurement from any star is not a specific point z, but instead the cell c that contains z. The measurement c is imprecise – a randomly varying hexagonal cell. There are two ways of thinking about the total measurement collection. First, it is a finite set Z = {c′₁, …, c′_m} ⊆ {c₁, …, c_M} of cells. Second, it is the union Z = c′₁ ∪ … ∪ c′_m of all of the observed cells – i.e., it is a geometrical pattern.

Locating multiple stars over an extended period of time (estimating multiple moving targets). As the night progresses, we must continually redetermine the existence and positions of each star – a process called multitarget tracking. We must also account for appearances and disappearances of the stars in the patch – i.e., for target death and birth.

Mathematics of Random Sets

The purpose of this section is to sketch the elements of the theory of random sets. It is organized as follows: General Theory of Random Sets, Random Finite Sets (Random Point Processes), and Stochastic Geometry. Of necessity, the material is less elementary than in later sections.

General Theory of Random Sets

Let Y be a topological space – for example, an N-dimensional Euclidean space ℝ^N. The power set 2^Y of Y is the class of all possible subsets S ⊆ Y. Any subclass of 2^Y is called a "hyperspace." The "elements" or "points" of a hyperspace are thus actually subsets of some other space. For a hyperspace to be of interest, one must extend the topology on Y to it. There are many possible topologies for hyperspaces (Michael 1950). The most well studied is the Fell-Matheron topology, also called the "hit-and-miss" topology (Matheron 1975). It is applicable when Y is Hausdorff, locally compact, and completely separable. It topologizes only the hyperspace c(2^Y) of all closed subsets C of Y. In this case, a random (closed) set Θ is a measurable mapping from some probability space into c(2^Y).
The Fell-Matheron topology's major strength is its relative simplicity. Let "Pr(E)" denote the probability of a probabilistic event E. Then, normally, the probability law of Θ would be described by a very abstract probability measure p_Θ(O) = Pr(Θ ∈ O). This measure must be defined on the Borel-measurable subsets O ⊆ c(2^Y), with respect to the Fell-Matheron topology, where O is itself a class of subsets of Y. However, define the Choquet capacity functional by c_Θ(G) = Pr(Θ ∩ G ≠ ∅) for all open subsets G ⊆ Y. Then, the Choquet-Matheron theorem states that the probability law of Θ is completely described by the simpler, albeit nonadditive, measure c_Θ(G).

The theory of random sets has evolved into a substantial subgenre of statistical theory (Molchanov 2005). For estimation theory, the concept of the expected value E[Θ] of a random set Θ is of particular interest. Most definitions of E[Θ] are very abstract (Molchanov 2005, Chap. 2). In certain circumstances, however, more conventional-looking definitions are possible. Suppose that Y is a Euclidean space and that c(2^Y) is restricted to K(2^Y), the bounded, convex, closed subsets of Y. If C, C′ are two such subsets, their Minkowski sum is C + C′ = {c + c′ | c ∈ C, c′ ∈ C′}. Endowed with this definition of addition, K(2^Y) can be homeomorphically and homomorphically embedded into a certain space of functions (Molchanov 2005, pp. 199–200). Denote this embedding by C ↦ C*. Then, the expected value E[Θ] of Θ, defined in terms of Minkowski addition, corresponds to the conventional expected value E[Θ*] of the random function Θ*.

Random Finite Sets (Random Point Processes)

Suppose that c(2^Y) is restricted to f(2^Y), the class of finite subsets of Y. (In many formulations, f(2^Y) is taken to be the class of locally finite subsets of Y – i.e., those whose intersection with compact subsets is finite.) A random finite set (RFS) is a measurable mapping from a probability space into f(2^Y). An example: the field of twinkling stars in some patch of a night sky. RFS theory is a particular mathematical formulation of point process theory (Daley and Vere-Jones 1998; Snyder and Miller 1991; Stoyan et al. 1995).

A Poisson RFS Ψ is perhaps the simplest nontrivial example of a random point pattern. It is specified by a spatial distribution s(y) and an intensity λ. At any given instant, the probability that there will be n points in the pattern is p(n) = e^{−λ} λ^n / n! (the value of the Poisson distribution). The probability that one of these n points will be y is s(y). The function D_Ψ(y) = λ s(y) is called the intensity function of Ψ.

At any moment, the point pattern produced by Ψ is a finite set Y = {y₁, …, y_n} of points y₁, …, y_n in Y, where n = 0, 1, … and where Y = ∅ if n = 0. If n = 0, then Y represents the hypothesis that no objects at all are present. If n = 1, then Y = {y₁} represents the hypothesis that a single object y₁ is present. If n = 2, then Y = {y₁, y₂} represents the hypothesis that there are two distinct objects y₁ ≠ y₂. And so on.

The probability distribution of Ψ – i.e., the probability that Ψ will have Y = {y₁, …, y_n} as an instantiation – is entirely determined by its intensity function D_Ψ(y):

    f_Ψ(Y) = f_Ψ({y₁, …, y_n}) = e^{−λ} · D_Ψ(y₁) ⋯ D_Ψ(y_n).

Every suitably well-behaved RFS Ψ has a probability distribution f_Ψ(Y) and an intensity function D_Ψ(y) (a.k.a. first-moment density). A Poisson RFS is unique in that f_Ψ(Y) is completely determined by D_Ψ(y).

Conventional signal processing is often concerned with single-object random systems that have the form

    Z = η(x) + V,

where x is the state of the system; η(x) is the "signal" generated by the system; the zero-mean random vector V is the random "noise" associated with the sensor; and Z is the random measurement that is observed. The purpose of signal processing is to construct an estimate x̂(z₁, …, z_k) of x, using the information contained in one or more draws z₁, …, z_k from the random variable Z.
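Returning to the Poisson RFS defined earlier: it can be sampled directly from its definition by drawing the cardinality n from a Poisson distribution with mean λ and then drawing n i.i.d. points from the spatial distribution s(y). The sketch below uses the unit square with uniform s(y); the function name is our own.

```python
import math
import random

def sample_poisson_rfs(lam, rng):
    """Sample a Poisson RFS on [0,1]^2 with uniform spatial
    distribution s(y) = 1, i.e., intensity function D(y) = lam."""
    # Draw n ~ Poisson(lam) by inverting the cumulative distribution.
    u, n, p, cdf = rng.random(), 0, math.exp(-lam), 0.0
    while True:
        cdf += p                  # cdf now includes P(N = n)
        if u <= cdf:
            break
        n += 1
        p *= lam / n              # P(N = n) from P(N = n - 1)
    # Draw n i.i.d. points from s(y): the finite-set instantiation Y.
    return [(rng.random(), rng.random()) for _ in range(n)]

rng = random.Random(0)
sizes = [len(sample_poisson_rfs(5.0, rng)) for _ in range(2000)]
print("mean cardinality:", sum(sizes) / len(sizes))  # close to lam = 5
```

The sampled set may well be empty (n = 0), the hypothesis that no objects are present, in line with the interpretation given above.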
Estimation for Random Sets 363

RFS theory is analogously concerned with random systems that have the form

Σ = Υ(X) ∪ Ω

where a random finite point pattern Υ(X) is the "signal" generated by the point pattern X (which is an instantiation of a random point pattern Ξ); Ω is a random finite point "noise" pattern; Σ is the total random finite point pattern that has been observed; and "∪" denotes set-theoretic union. One goal of RFS theory is to devise algorithms that can construct an estimate X̂(Z_1, …, Z_k) of X, using multiple point patterns Z_1, …, Z_k ⊆ Y drawn from Σ. One approximate approach is that of estimating only the first-moment density D_Ξ(x) of Ξ.

Stochastic Geometry

Stochastic geometry addresses more complicated random patterns. An example: the field of twinkling stars in a quantized patch of the night sky, in which case the measurement is the union c_1 ∪ … ∪ c_m of a finite number of hexagonally shaped cells.
This is one instance of a germ-grain process (Stoyan et al. 1995, pp. 59–64). Such a process is specified by two items: an RFS Ψ and a function c_y that associates with each y in Y a closed subset c_y ⊆ Z. For example, if Y = R² is the real-valued plane, then c_y could be the disk of radius r centered at y = (x, y). Let Y = {y_1, …, y_n} be a particular random draw from Ψ. The points y_1, …, y_n are the "germs," and c_{y_1}, …, c_{y_n} are the "grains" of this random draw from the germ-grain process Θ. The total pattern in Y is the union c_{y_1} ∪ … ∪ c_{y_n} of the grains – a random draw from Θ. Germ-grain processes can be used to model many kinds of natural processes. One example is the distribution of graphite particles in a two-dimensional section of a piece of iron, in which case the c_y could be chosen to be line segments rather than disks.
Stochastic geometry is concerned with random binary images that have observation structures such as

Θ = (S ∩ Γ) ∪ N

where S is a "signal" pattern; Γ is a random pattern that models obscurations; N is a random pattern that models clutter; and Θ is the total pattern that has been observed. A common simplifying assumption is that Γ and N are germ-grain processes. One goal of stochastic geometry is to devise algorithms that can construct an optimal estimate Ŝ(T_1, …, T_k) of S, using multiple patterns T_1, …, T_k ⊆ Y drawn from Θ.

Random Sets and Image Processing

Both point process theory and stochastic geometry have found extensive application to image processing. These are considered briefly in turn.

Stochastic Geometry and Image Processing. Stochastic geometry methods are based on the use of a "structuring element" B (a geometrical shape, such as a disk, sphere, or more complex structure) to modify an image.
The dilation of a set S by B is S ⊕ B, where "⊕" is Minkowski addition (Stoyan et al. 1995). Dilation tends to fill in cavities and fissures in images. The erosion of S is S ⊖ B = (S^c ⊕ B^c)^c, where "c" indicates set-theoretic complement. Erosion tends to create and increase the size of cavities and fissures. Morphological filters are constructed from various combinations of dilation and erosion operators.
Suppose that a binary image Σ = S has been degraded by some measurement process – for example, the process Θ = (S ∩ Γ) ∪ N. Then, image restoration refers to the construction of an estimate Ŝ(T) of the original image S from a single degraded image Θ = T. The restoration operator Ŝ(T) is optimal if it can be shown to be optimally close to S, given some concept of closeness. The symmetric difference

T_1 △ T_2 = (T_1 ∪ T_2) − (T_1 ∩ T_2)

is a commonly used method for measuring the dissimilarity of binary images. It can be used to construct measures of distance between random images. One such distance is

d(Θ_1, Θ_2) = E[|Θ_1 △ Θ_2|]

where |S| denotes the size of the set S and E[A] is the expected value of the random number A.
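The dilation, erosion, and symmetric-difference operations just defined can be sketched on small binary images coded as sets of pixel coordinates (a simplified stand-in for image arrays). The structuring element and test image below are hypothetical, and erosion is written in its equivalent translate form S ⊖ B = {x : x + b ∈ S for all b ∈ B} rather than through complements.

```python
# Binary images as sets of (x, y) integer pixel coordinates.
def dilate(S, B):
    # Minkowski addition S (+) B = { s + b : s in S, b in B }
    return {(x + u, y + v) for (x, y) in S for (u, v) in B}

def erode(S, B):
    # Keep the points of S whose whole B-translate stays inside S.
    return {(x, y) for (x, y) in S
            if all((x + u, y + v) in S for (u, v) in B)}

def sym_diff(T1, T2):
    # (T1 union T2) - (T1 intersect T2): the dissimilarity measure above.
    return (T1 | T2) - (T1 & T2)

B = {(0, 0), (1, 0), (0, 1)}              # small structuring element
S = {(0, 0), (0, 1), (1, 0), (1, 1)}      # a 2 x 2 square
grown = dilate(S, B)
shrunk = erode(S, B)
print(len(grown), len(shrunk), len(sym_diff(S, grown)))
```

Composing erosion after dilation (closing), or dilation after erosion (opening), gives the simplest of the morphological filters mentioned above.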
Other distances require some definition of the expected value E[Θ] of a random set Θ. It has been shown that, under certain circumstances, certain morphological operators can be viewed as consistent maximum a posteriori (MAP) estimators of S (Goutsias et al. 1997, p. 97).

RFS Theory and Image Processing. Positron-emission tomography (PET) is one example of the application of RFS theory. In PET, tissues of interest are suffused with a positron-emitting radioactive isotope. When a positron annihilates an electron in a suitable fashion, two photons are emitted in opposite directions. These photons are detected by sensors in a ring surrounding the radiating tissue. The location of the annihilation on the line can be estimated by calculating the time difference of arrival.
Because of the physics of radioactive decay, the annihilations can be accurately modeled as a Poisson RFS Ψ. Since a Poisson RFS is completely determined by its intensity function D_Ψ(x), it is natural to try to estimate D_Ψ(x). This yields the spatial distribution s_Ψ(y) of annihilations – which, in turn, is the basis of the PET image (Snyder and Miller 1991, pp. 115–119).

Random Sets and Multitarget Processing

The purpose of this section is to summarize the application of RFS theory to multitarget detection, tracking, and localization. An example: tracking the positions of stars in the night sky over an extended period of time.
Suppose that at time t_k there are an unknown number n of targets with unknown states x_1, …, x_n. The state of the entire multitarget system is a finite set X = {x_1, …, x_n} with n ≥ 0. When interrogating a scene, many sensors (such as radars) produce a measurement of the form Z = {z_1, …, z_m}, i.e., a finite set of measurements. Some of these measurements are generated by background clutter Ω_k. Others are generated by the targets, with some targets possibly not having generated any. Mathematically speaking, Z is a random draw from an RFS Σ_k that can be decomposed as Σ_k = Υ(X_k) ∪ Ω_k, where Υ(X_k) is the set of target-generated measurements.

Conventional Multitarget Detection and Tracking. This is based on a "divide and conquer" strategy with three basic steps: time update, data association, and measurement update. At time t_k we have n "tracks" τ_1, …, τ_n (hypothesized targets). In the time update, an extended Kalman filter (EKF) is used to time-predict the tracks τ_i to predicted tracks τ_i^+ at the time t_{k+1} of the next measurement set Z_{k+1} = {z_1, …, z_m}.
Given Z_{k+1}, we can construct the following data-association hypothesis H: for each i = 1, …, n, the predicted track τ_i^+ generated the detection z_{j_i}, for some index j_i, or, alternatively, this track was not detected at all. If we remove from Z_{k+1} all of the z_{j_1}, …, z_{j_n}, the remaining measurements are interpreted either as being clutter or as having been generated by new targets. Enumerating all possible association hypotheses (which is a combinatorially complex procedure), we end up with a "hypothesis table" H_1, …, H_ν. Given H_i, let z_{j_i} be the measurement that is hypothesized to have been generated by predicted track τ_i^+. Then, the measurement-update step of an EKF is used to construct a measurement-updated track τ_{i,j_i} from τ_i^+ and z_{j_i}. Attached to each H_i is a hypothesis probability p_i – the probability that the particular hypothesis H_i is the correct one. The hypothesis with largest p_i yields the multitarget estimate X̂ = {x̂_1, …, x̂_n̂}.

RFS Multitarget Detection and Tracking. In the place of tracks and hypothesis tables, this uses multitarget state sets and multitarget probability distributions. In place of the conventional time update, data association, and measurement update, it uses a recursive Bayes filter. A random multitarget state set is an RFS Ξ_{k|k} whose points are target states. A multitarget probability distribution is the probability distribution f(X_k|Z_{1:k}) = f_{Ξ_{k|k}}(X) of the RFS Ξ_{k|k}, where Z_{1:k}: Z_1, …, Z_k is the time sequence of measurement sets at time t_k.
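The recursive Bayes filter just mentioned can be sketched in a deliberately simplified setting: a single object on a two-element state space, with hypothetical transition and likelihood tables standing in for the multitarget Markov density and likelihood function.

```python
# One cycle of a recursive Bayes filter on a finite state space:
# a Markov time update followed by a Bayes-rule measurement update.
def time_update(prior, transition):
    n = len(prior)
    return [sum(prior[i] * transition[i][j] for i in range(n))
            for j in range(n)]

def measurement_update(predicted, likelihood_z):
    unnorm = [likelihood_z[j] * predicted[j] for j in range(len(predicted))]
    c = sum(unnorm)                 # normalizer (evidence of the measurement)
    return [u / c for u in unnorm]

prior = [0.5, 0.5]                      # current posterior over the two states
transition = [[0.9, 0.1], [0.2, 0.8]]   # hypothetical Markov model
likelihood_z = [0.7, 0.1]               # hypothetical f(z | x) for observed z
predicted = time_update(prior, transition)
posterior = measurement_update(predicted, likelihood_z)
print([round(p, 3) for p in posterior])
```

The multitarget version replaces these probability vectors with set-valued distributions f(X_k|Z_{1:k}) and the sums with set integrals, which is what makes the exact filter intractable and motivates approximations such as the PHD filter.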
RFS Time Update. The Bayes filter time-update step f(X_k|Z_{1:k}) → f(X_{k+1}|Z_{1:k}) requires a multitarget Markov transition function f(X_{k+1}|X_k). It is the probability that the multitarget system will have multitarget state set X_{k+1} at time t_{k+1}, if it had multitarget state set X_k at time t_k. It takes into account all pertinent characteristics of the targets: individual target motion, target appearance, target disappearance, environmental constraints, etc. It is explicitly constructed from an RFS multitarget motion model using a multitarget integrodifferential calculus.

RFS Measurement Update. The Bayes filter measurement-update step f(X_{k+1}|Z_{1:k}) → f(X_{k+1}|Z_{1:k+1}) is just Bayes' rule. It requires a multitarget likelihood function f_{k+1}(Z|X) – the likelihood that a measurement set Z will be generated, if a system of targets with state set X is present. It takes into account all pertinent characteristics of the sensor(s): sensor noise, fields of view and obscurations, probabilities of detection, false alarms, and/or clutter. It is explicitly constructed from an RFS measurement model using multitarget calculus.

RFS State Estimation. Determination of the number n and states x_1, …, x_n of the targets is accomplished using a Bayes-optimal multitarget state estimator. The idea is to determine the X_{k+1} that maximizes f(X_{k+1}|Z_{1:k+1}) in some sense.

Approximate Multitarget RFS Filters. The multitarget Bayes filter is, in general, computationally intractable. Central to the RFS approach is a toolbox of techniques – including the multitarget calculus – designed to produce statistically principled approximate multitarget filters. The two most well studied are the probability hypothesis density (PHD) filter and its generalization, the cardinalized PHD (CPHD) filter. In such filters, f(X_k|Z_{1:k}) is replaced by the first-moment density D(x_k|Z_{1:k}) of Ξ_{k|k}. These filters have been shown to be faster and perform better than conventional approaches in some applications.

Random Sets and Human-Mediated Data

Natural-language statements and inference rules have already been mentioned as examples of human-mediated information. Expert-systems theory was introduced in part to address situations – such as this – that involve uncertainties other than randomness. Expert-system methodologies include fuzzy set theory, the Dempster-Shafer (D-S) theory of uncertain evidence, and rule-based inference. RS theory provides solid Bayesian foundations for them and allows human-mediated data to be processed using standard Bayesian estimation techniques. The purpose of this section is to briefly summarize this aspect of the RS approach.
The relationships between expert-systems theory and random set theory were first established by researchers such as Orlov (1978), Höhle (1982), Nguyen (1978), and Goodman and Nguyen (1985). At a relatively early stage, it was recognized that random set theory provided a potential means of unifying much of expert-systems theory (Goodman and Nguyen 1985; Kruse et al. 1991).
A conventional sensor measurement at time t_k is typically represented as Z_k = η(x_k) + V_k – equivalently formulated as a likelihood function f(z_k|x_k). It is conventional to think of z_k as the actual "measurement" and of f(z_k|x_k) as the full description of the uncertainty associated with it. In actuality, z_k is just a mathematical model of some real-world measurement ζ_k. Thus, the likelihood actually has the form f(ζ_k|x_k) = f(z_k|x_k).
This observation assumes crucial importance when one considers human-mediated data. Consider the simple natural-language statement

ζ = "The target is near the tower"
where the tower is a landmark, located at a known position (x_0, y_0), and where the term "near" is assumed to have the following specific meaning: (x, y) is near (x_0, y_0) means that (x, y) ∈ T_5, where T_5 is a disk of radius 5 m, centered at (x_0, y_0). If z = (x, y) is the actual measurement of the target's position, then ζ is equivalent to the formula z ∈ T_5. Since z is just one possible draw from Z_k, we can say that ζ – or, equivalently, T_5 – is actually a constraint on the underlying measurement process: Z_k ∈ T_5.
Because the word "near" is rather vague, we could just as well say that z ∈ T_5 is the best choice, with confidence w_5 = 0.7; that z ∈ T_4 is the next best choice, with confidence w_4 = 0.2; and that z ∈ T_6 is the least best, with confidence w_6 = 0.1. Let Θ be the random subset of Z defined by Pr(Θ = T_i) = w_i for i = 4, 5, 6. In this case, ζ is equivalent to the random constraint

Z_k ∈ Θ.

The probability

f_k(Θ|x_k) = Pr(η(x_k) + V_k ∈ Θ) = Pr(Z_k ∈ Θ | X_k = x_k)

is called a generalized likelihood function (GLF). GLFs can be constructed for more complex natural-language statements, for inference rules, and more. Using their GLF representations, such "nontraditional measurements" can be processed using single- and multi-object recursive Bayes filters and their approximations. As a consequence, it can be shown that fuzzy logic, the D-S theory, and rule-based inference can be subsumed within a single Bayesian-probabilistic paradigm.

Summary and Future Directions

In the engineering world, the theory of random sets has been associated primarily with certain specialized image-processing applications, such as morphological filters and tomographic imaging. It has more recently found application in fields such as multitarget tracking and in expert-systems theory. All of these fields of application remain areas of active research.

Cross-References

▶ Estimation, Survey on
▶ Extended Kalman Filters
▶ Nonlinear Filters

Recommended Reading

Molchanov (2005) provides a definitive exposition of the general theory of random sets. Two excellent references for stochastic geometry are Stoyan et al. (1995) and Barndorff-Nielsen and van Lieshout (1999). The books by Kingman (1993) and Daley and Vere-Jones (1998) are good introductions to point process theory. The application of point process theory and stochastic geometry to image processing is addressed in, respectively, Snyder and Miller (1991) and Stoyan et al. (1995). The application of RFSs to multitarget estimation is addressed in the tutorials Mahler (2004, 2013) and the book Mahler (2007). Introductions to the application of random sets to expert systems can be found in Kruse et al. (1991) and Mahler (2007), Chaps. 3–6.

Bibliography

Barndorff-Nielsen O, van Lieshout M (1999) Stochastic geometry: likelihood and computation. Chapman/CRC, Boca Raton
Daley D, Vere-Jones D (1998) An introduction to the theory of point processes, 1st edn. Springer, New York
Goodman I, Nguyen H (1985) Uncertainty models for knowledge based systems. North-Holland, Amsterdam
Goutsias J, Mahler R, Nguyen H (eds) (1997) Random sets: theory and applications. Springer, New York
Höhle U (1982) A mathematical theory of uncertainty: fuzzy experiments and their realizations. In: Yager R (ed) Recent developments in fuzzy set and possibility theory. Pergamon, New York, pp 344–355
Kingman J (1993) Poisson processes. Oxford University Press, London
Kruse R, Schwencke E, Heinsohn J (1991) Uncertainty and vagueness in knowledge-based systems. Springer, New York
Mahler R (2004) 'Statistics 101' for multisensor, multitarget data fusion. IEEE Trans Aerosp Electron Sys Mag Part 2: Tutorials 19(1):53–64
Mahler R (2007) Statistical multisource-multitarget information fusion. Artech House, Norwood
Mahler R (2013) 'Statistics 102' for multisensor-multitarget tracking. IEEE J Spec Top Sign Proc 7(3):376–389
Matheron G (1975) Random sets and integral geometry. Wiley, New York
Michael E (1950) Topologies on spaces of subsets. Trans Am Math Soc 71:152–182
Molchanov I (2005) Theory of random sets. Springer, London
Nguyen H (1978) On random sets and belief functions. J Math Anal Appl 65:531–542
Orlov A (1978) Fuzzy and random sets. Prikladnoi Mnogomerni Statisticheskii Analys, Moscow
Snyder D, Miller M (1991) Random point processes in time and space, 2nd edn. Springer, New York
Stoyan D, Kendall W, Mecke J (1995) Stochastic geometry and its applications, 2nd edn. Wiley, New York

Estimation, Survey on

Luigi Chisci¹ and Alfonso Farina²
¹Dipartimento di Ingegneria dell'Informazione, Università di Firenze, Firenze, Italy
²Selex ES, Roma, Italy

Abstract

This entry discusses the history and describes the multitude of methods and applications of this important branch of stochastic process theory.

Keywords

Linear stochastic filtering; Markov step processes; Maximum likelihood estimation; Riccati equation; Stratonovich-Kushner equation

Estimation is the process of inferring the value of an unknown quantity of interest from noisy, direct or indirect, observations of such a quantity. Due to its great practical relevance, estimation has a long history and an enormous variety of applications in all fields of engineering and science. A certainly incomplete list of possible application domains of estimation includes the following: statistics (Bard 1974; Ghosh et al. 1997; Koch 1999; Lehmann and Casella 1998; Tsybakov 2009; Wertz 1978), telecommunication systems (Sage and Melsa 1971; Schonhoff and Giordano 2006; Snyder 1968; Van Trees 1971), signal and image processing (Barkat 2005; Biemond et al. 1983; Elliott et al. 2008; Itakura 1971; Kay 1993; Kim and Woods 1998; Levy 2008; Najim 2008; Poor 1994; Tuncer and Friedlander 2009; Wakita 1973; Woods and Radewan 1977), aerospace engineering (McGee and Schmidt 1985), tracking (Bar-Shalom and Fortmann 1988; Bar-Shalom et al. 2001, 2013; Blackman and Popoli 1999; Farina and Studer 1985, 1986), navigation (Dissanayake et al. 2001; Durrant-Whyte and Bailey 2006a,b; Farrell and Barth 1999; Grewal et al. 2001; Mullane et al. 2011; Schmidt 1966; Smith et al. 1986; Thrun et al. 2006), control systems (Anderson and Moore 1979; Athans 1971; Goodwin et al. 2005; Joseph and Tou 1961; Kalman 1960a; Maybeck 1979, 1982; Söderström 1994; Stengel 1994), econometrics (Aoki 1987; Pindyck and Roberts 1974; Zellner 1971), geophysics (e.g., seismic deconvolution) (Bayless and Brigham 1970; Flinn et al. 1967; Mendel 1977, 1983, 1990), oceanography (Evensen 1994a; Ghil and Malanotte-Rizzoli 1991), weather forecasting (Evensen 1994b, 2007; McGarty 1971), environmental engineering (Dochain and Vanrolleghem 2001; Heemink and Segers 2002; Nachazel 1993), demographic systems (Leibungudt et al. 1983), automotive systems (Barbarisi et al. 2006; Stephant et al. 2004), failure detection (Chen and Patton 1999; Mangoubi 1998; Willsky 1976), power systems (Abur and Gómez Espósito 2004; Debs and Larson 1970; Miller and Lewis 1971; Monticelli 1999; Toyoda et al. 1970), nuclear engineering (Robinson 1963; Roman et al. 1971; Sage and Masters 1967; Venerus and Bullock 1970), biomedical engineering (Bekey 1973; Snyder 1970; Stark 1968), pattern recognition (Andrews 1972; Ho and Agrawala 1968; Lainiotis 1972), social networks (Snijders et al. 2012), etc.
Chapter Organization

The rest of the chapter is organized as follows. Section "Historical Overview on Estimation" will provide a historical overview on estimation. The next section will discuss applications of estimation. Connections between estimation and information theories will be explored in the subsequent section. Finally, the section "Conclusions and Future Trends" will conclude the chapter by discussing future trends in estimation. An extensive list of references is also provided.

Historical Overview on Estimation

A possibly incomplete list of the major achievements on estimation theory and applications is reported in Table 1. The entries of the table, sorted in chronological order, provide for each contribution the name of the inventor (or inventors), the date, and a short description with main bibliographical references.
Probably the first important application of estimation dates back to the beginning of the nineteenth century, when least-squares estimation (LSE), invented by Gauss in 1795 (Gauss 1995; Legendre 1810), was successfully exploited in astronomy for predicting planet orbits (Gauss 1806). Least-squares estimation follows a deterministic approach by minimizing the sum of squares of residuals, defined as differences between observed data and model-predicted estimates. A subsequently introduced statistical approach is maximum likelihood estimation (MLE), popularized by R. A. Fisher between 1912 and 1922 (Fisher 1912, 1922, 1925). MLE consists of finding the estimate of the unknown quantity of interest as the value that maximizes the so-called likelihood function, defined as the conditional probability density function of the observed data given the quantity to be estimated. In intuitive terms, MLE maximizes the agreement of the estimate with the observed data. Whenever the observation noise is assumed Gaussian (Kim and Shevlyakov 2008; Park et al. 2013), MLE coincides with LSE.
While estimation problems had been addressed for several centuries, it was not until the 1940s that a systematic theory of estimation started to be established, mainly relying on the foundations of the modern theory of probability (Kolmogorov 1933). Actually, the roots of probability theory can be traced back to the calculus of combinatorics (the Stomachion puzzle invented by Archimedes (Netz and Noel 2011)) in the third century B.C. and to gambling theory (the work of Cardano, Pascal, de Fermat, Huygens) in the sixteenth–seventeenth centuries.
Differently from the previous work devoted to the estimation of constant parameters, in the period 1940–1960 the attention was mainly shifted toward the estimation of signals. In particular, Wiener in 1940 (Wiener 1949) and Kolmogorov in 1941 (Kolmogorov 1941) formulated and solved the problem of linear minimum mean-square error (MMSE) estimation of continuous-time and, respectively, discrete-time stationary random signals. In the late 1940s and in the 1950s, Wiener-Kolmogorov theory was extended and generalized in many directions, exploiting both time-domain and frequency-domain approaches. At the beginning of the 1960s, Rudolf E. Kálmán made pioneering contributions to estimation by providing the mathematical foundations of the modern theory based on state-variable representations. In particular, Kálmán solved the linear MMSE filtering and prediction problems both in discrete time (Kalman 1960b) and in continuous time (Kalman and Bucy 1961); the resulting optimal estimator was named after him, the Kalman filter (KF). As a further contribution, Kálmán also singled out the key technical conditions, i.e., observability and controllability, under which the resulting optimal estimator turns out to be stable. Kálmán's work went well beyond the earlier contributions of A. Kolmogorov, N. Wiener, and their followers (the "frequency-domain" approach) by means of a general state-space approach. From the theoretical viewpoint, the KF is an optimal estimator, in a wide sense, of the state of a linear dynamical system from noisy measurements; specifically, it is the optimal MMSE estimator in the Gaussian case (e.g., for normally distributed noises and initial state) and the best linear unbiased estimator irrespective of the noise and initial state distributions.
Estimation, Survey on, Table 1 Major developments on estimation

Inventor(s) | Date | Contribution
Archimedes | Third century B.C. | Combinatorics (Netz and Noel 2011) as the basis of probability
G. Cardano, B. Pascal, P. de Fermat, C. Huygens | Sixteenth–seventeenth centuries | Roots of the theory of probability (Devlin 2008)
J. F. Riccati | 1722–1723 | Differential Riccati equation (Riccati 1722, 1723), subsequently exploited in the theory of linear stochastic filtering
T. Bayes | 1763 | Bayes' formula on conditional probability (Bayes 1763; McGrayne 2011)
C. F. Gauss, A. M. Legendre | 1795–1810 | Least-squares estimation and its applications to the prediction of planet orbits (Gauss 1806, 1995; Legendre 1810)
P. S. Laplace | 1814 | Theory of probability (Laplace 1814)
R. A. Fisher | 1912–1922 | Maximum likelihood estimation (Fisher 1912, 1922, 1925)
A. N. Kolmogorov | 1933 | Modern theory of probability (Kolmogorov 1933)
N. Wiener | 1940 | Minimum mean-square error estimation of continuous-time stationary random signals (Wiener 1949)
A. N. Kolmogorov | 1941 | Minimum mean-square error estimation of discrete-time stationary random signals (Kolmogorov 1941)
H. Cramér, C. R. Rao | 1945 | Theoretical lower bound on the covariance of estimators (Cramér 1946; Rao 1945)
S. Ulam, J. von Neumann, N. Metropolis, E. Fermi | 1946–1949 | Monte Carlo method (Los Alamos Scientific Laboratory 1966; Metropolis and Ulam 1949; Ulam 1952; Ulam et al. 1947)
J. Sklansky, T. R. Benedict, G. W. Bordner, H. R. Simpson, S. R. Neal | 1957–1967 | α-β and α-β-γ filters (Benedict and Bordner 1962; Neal 1967; Painter et al. 1990; Simpson 1963; Sklansky 1957)
R. L. Stratonovich, H. J. Kushner | 1959–1964 | Bayesian approach to stochastic nonlinear filtering of continuous-time systems, i.e., the Stratonovich-Kushner equation for the evolution of the state conditional probability density (Jazwinski 1970; Kushner 1962, 1967; Stratonovich 1959, 1960)
R. E. Kalman | 1960 | Linear filtering and prediction for discrete-time systems (Kalman 1960b)
R. E. Kalman | 1961 | Observability of linear dynamical systems (Kalman 1960a)
R. E. Kalman, R. S. Bucy | 1961 | Linear filtering and prediction for continuous-time systems (Kalman and Bucy 1961)
A. E. Bryson, M. Frazier, H. E. Rauch, F. Tung, C. T. Striebel, D. Q. Mayne, J. S. Meditch, D. C. Fraser, L. E. Zachrisson, B. D. O. Anderson, etc. | Since 1963 | Smoothing of linear and nonlinear systems (Anderson and Chirarattananon 1972; Bryson and Frazier 1963; Mayne 1966; Meditch 1967; Rauch 1963; Rauch et al. 1965; Zachrisson 1969)
D. G. Luenberger | 1964 | State observer for a linear system (Luenberger 1964)
Y. C. Ho, R. C. K. Lee | 1964 | Bayesian approach to recursive nonlinear estimation for discrete-time systems (Ho and Lee 1964)
W. M. Wonham | 1965 | Optimal filtering for Markov step processes (Wonham 1965)
A. H. Jazwinski | 1966 | Bayesian approach to stochastic nonlinear filtering for continuous-time stochastic systems with discrete-time observations (Jazwinski 1966)
S. F. Schmidt | 1966 | Extended Kalman filter and its application to the manned lunar missions (Schmidt 1966)
P. L. Falb, A. V. Balakrishnan, J. L. Lions, S. G. Tzafestas, J. M. Nightingale, H. J. Kushner, J. S. Meditch, etc. | Since 1967 | State estimation for infinite-dimensional (e.g., distributed-parameter, partial differential equation (PDE), delay) systems (Balakrishnan and Lions 1967; Falb 1967; Kushner 1970; Kwakernaak 1967; Meditch 1971; Tzafestas and Nightingale 1968)
T. Kailath | 1968 | Principle of orthogonality and innovation approach to estimation (Frost and Kailath 1971; Kailath 1968, 1970; Kailath and Frost 1968; Kailath et al. 2000)
A. H. Jazwinski, B. Rawlings, etc. | Since 1968 | Limited-memory (receding-horizon, moving-horizon) state estimation with constraints (Alessandri et al. 2005, 2008; Jazwinski 1968; Rao et al. 2001, 2003)
F. C. Schweppe, D. P. Bertsekas, I. B. Rhodes, M. Milanese, etc. | Since 1968 | Set-membership recursive state estimation for systems with unknown but bounded noises (Alamo et al. 2005; Bertsekas and Rhodes 1971; Chisci et al. 1996; Combettes 1993; Milanese and Belforte 1982; Milanese and Vicino 1993; Schweppe 1968; Vicino and Zappa 1996)
J. E. Potter, G. Golub, S. F. Schmidt, P. G. Kaminski, A. E. Bryson, A. Andrews, G. J. Bierman, M. Morf, T. Kailath, etc. | 1968–1975 | Square-root filtering (Andrews 1968; Bierman 1974, 1977; Golub 1965; Kaminski and Bryson 1972; Morf and Kailath 1975; Potter and Stern 1963; Schmidt 1970)
C. W. Helstrom | 1969 | Quantum estimation (Helstrom 1969, 1976)
D. L. Alspach, H. W. Sorenson | 1970–1972 | Gaussian-sum filters for nonlinear and/or non-Gaussian systems (Alspach and Sorenson 1972; Sorenson and Alspach 1970, 1971)
T. Kailath, M. Morf, G. S. Sidhu | 1973–1974 | Fast Chandrasekhar-type algorithms for recursive state estimation of stationary linear systems (Kailath 1973; Morf et al. 1974)
A. Segall | 1976 | Recursive estimation from point processes (Segall 1976)
J. W. Woods, C. Radewan | 1977 | Kalman filter in two dimensions for image processing (Woods and Radewan 1977)
J. H. Taylor | 1979 | Cramér-Rao lower bound (CRLB) for recursive state estimation with no process noise (Taylor 1979)
D. Reid | 1979 | Multiple hypothesis tracking (MHT) filter for multitarget tracking (Reid 1979)
L. Servi, Y. Ho | 1981 | Optimal filtering for linear systems with uniformly distributed measurement noise (Servi and Ho 1981)
V. E. Benes | 1981 | Exact finite-dimensional optimal MMSE filter for a class of nonlinear systems (Benes 1981)
H. V. Poor, D. Looze, J. Darragh, S. Verdú, M. J. Grimble, etc. | 1981–1988 | Robust (e.g., H∞) filtering (Darragh and Looze 1984; Grimble 1988; Hassibi et al. 1999; Poor and Looze 1981; Simon 2006; Verdú and Poor 1984)
V. J. Aidala, S. E. Hammel | 1983 | Bearings-only tracking (Aidala and Hammel 1983; Farina 1999)
F. E. Daum | 1986 | Extension of the Benes filter to a more general class of nonlinear systems (Daum 1986)
L. Dai and others | Since 1987 | State estimation for linear descriptor (singular, implicit) stochastic systems (Chisci and Zappa 1992; Dai 1987, 1989; Nikoukhah et al. 1992)
N. J. Gordon, D. J. Salmond, A. F. M. Smith | 1993 | Particle (sequential Monte Carlo) filter (Doucet et al. 2001; Gordon et al. 1993; Ristic et al. 2004)
K. C. Chou, A. S. Willsky, A. Benveniste | 1994 | Multiscale Kalman filter (Chou et al. 1994)
G. Evensen | 1994 | Ensemble Kalman filter for data assimilation in meteorology and oceanography (Evensen 1994b, 2007)
R. P. S. Mahler | 1994 | Random set filtering (Mahler 1994, 2007a; Ristic et al. 2013)
S. J. Julier, J. K. Uhlmann, H. Durrant-Whyte | 1995 | Unscented Kalman filter (Julier and Uhlmann 2004; Julier et al. 1995)
A. Germani et al. | Since 1996 | Polynomial extended Kalman filter for nonlinear and/or non-Gaussian systems (Carravetta et al. 1996; Germani et al. 2005)
P. Tichavsky, C. H. Muravchik, A. Nehorai | 1998 | Posterior Cramér-Rao lower bound (PCRLB) for recursive state estimation (Tichavsky et al. 1998; van Trees and Bell 2007)
R. Mahler | 2003, 2007 | Probability hypothesis density (PHD) and cardinalized PHD (CPHD) filters (Mahler 2003, 2007b; Ristic 2013; Vo and Ma 1996; Vo et al. 2007)
A. G. Ramm | 2005 | Estimation of random fields (Ramm 2005)
M. Hernandez, A. Farina, B. Ristic | 2006 | PCRLB for tracking in the case of detection probability less than one and false alarm probability greater than zero (Hernandez et al. 2006)
Olfati-Saber and others | Since 2007 | Consensus filters for networked estimation (Olfati-Saber et al. 2007; Calafiore and Abrate 2009; Xiao et al. 2005; Alriksson and Rantzer 2006; Olfati-Saber 2007; Kamgarpour and Tomlin 2007; Stankovic et al. 2009; Battistelli et al. 2011, 2012, 2013; Battistelli and Chisci 2014)
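Among the simplest entries in Table 1, the α-β filter can be sketched in full: a constant-velocity prediction plus a fixed-gain correction by the measurement residual. The gains and the noise-free track below are illustrative.

```python
# Alpha-beta tracking filter (fixed-gain position/velocity estimator).
def alpha_beta(measurements, dt, alpha, beta, x0=0.0, v0=0.0):
    x, v = x0, v0
    for z in measurements:
        x_pred = x + dt * v          # predict position one step ahead
        r = z - x_pred               # residual (innovation)
        x = x_pred + alpha * r       # correct position
        v = v + (beta / dt) * r      # correct velocity
    return x, v

# A noise-free track moving at 1 unit per step: the filter should lock on.
zs = [float(k) for k in range(1, 21)]
x, v = alpha_beta(zs, dt=1.0, alpha=0.5, beta=0.1)
print(round(x, 2), round(v, 2))
```

With noisy measurements the same fixed gains trade responsiveness against smoothing; the Kalman filter, by contrast, computes its gain from modeled noise variances.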

the Gaussian case (e.g., for normally distributed noises and initial state) and the best linear unbiased estimator irrespective of the noise and initial state distributions. From the practical viewpoint, the KF enjoys the desirable properties of being linear and acting recursively, step-by-step, on a noise-contaminated data stream. This allows for cheap real-time implementation on digital computers. Further, the universality of "state-variable representations" allows almost any estimation problem to be included in the KF framework. For these reasons, the KF is, and continues to be, an extremely effective and easy-to-implement tool for a great variety of practical tasks, e.g., to detect signals in noise or to estimate unmeasurable quantities from accessible observables. Due to the generality of the state estimation problem, which actually encompasses parameter and signal estimation as special cases, the literature on estimation from 1960 until today has mostly concentrated on extensions and generalizations of Kalman's work in several directions. Considerable efforts, motivated by the ubiquitous presence of nonlinearities in practical estimation problems, have been devoted to nonlinear and/or non-Gaussian filtering, starting from the seminal papers of Stratonovich (1959, 1960) and Kushner (1962, 1967) for continuous-time systems, Ho and Lee (1964) for discrete-time systems, and Jazwinski (1966) for continuous-time systems with discrete-time observations. In these
372 Estimation, Survey on

papers, state estimation is cast in a probabilistic (Bayesian) framework as the problem of evolving in time the state conditional probability density given observations (Jazwinski 1970). Work on nonlinear filtering has produced over the years several nonlinear state estimation algorithms, e.g., the extended Kalman filter (EKF) (Schmidt 1966), the unscented Kalman filter (UKF) (Julier and Uhlmann 2004; Julier et al. 1995), the Gaussian-sum filter (Alspach and Sorenson 1972; Sorenson and Alspach 1970, 1971), the sequential Monte Carlo (also called particle) filter (SMCF) (Doucet et al. 2001; Gordon et al. 1993; Ristic et al. 2004), and the ensemble Kalman filter (EnKF) (Evensen 1994a,b, 2007), which have been, and still are, successfully employed in various application domains. In particular, the SMCF and EnKF are stochastic simulation algorithms taking inspiration from the work in the 1940s on the Monte Carlo method (Metropolis and Ulam 1949), which has recently received renewed interest thanks to the tremendous advances in computing technology. A thorough review of nonlinear filtering can be found, e.g., in Daum (2005) and Crisan and Rozovskii (2011).

Other interesting areas of investigation have concerned smoothing (Bryson and Frazier 1963), robust filtering for systems subject to modeling uncertainties (Poor and Looze 1981), and state estimation for infinite-dimensional (i.e., distributed parameter and/or delay) systems (Balakrishnan and Lions 1967). Further, a lot of attention has been devoted to the implementation of the KF, specifically square-root filtering (Potter and Stern 1963) for improved numerical robustness and fast KF algorithms (Kailath 1973; Morf et al. 1974) for enhanced computational efficiency. Worthy of mention is the work over the years on theoretical bounds on estimation performance, originating from the seminal papers of Rao (1945) and Cramér (1946) on the lower bound of the MSE for parameter estimation and subsequently extended in Tichavsky et al. (1998) to nonlinear filtering and in Hernandez et al. (2006) to more realistic estimation problems with possible missed and/or false measurements. An extensive review of this work on Bayesian bounds for estimation, nonlinear filtering, and tracking can be found in van Trees and Bell (2007). A brief review of the earlier (until 1974) state of the art in estimation can be found in Lainiotis (1974).

Applications

Astronomy
The problem of making estimates and predictions on the basis of noisy observations originally attracted attention many centuries ago in the field of astronomy. In particular, the first attempt to provide an optimal estimate, i.e., one such that a certain measure of the estimation error is minimized, was due to Galileo Galilei who, in his Dialogue on the Two World Chief Systems (1632) (Galilei 1632), suggested, as a possible criterion for estimating the position of Tycho Brahe's supernova, the estimate that required the "minimum amendments and smallest corrections" to the data. Later, C. F. Gauss mathematically specified this criterion by introducing in 1795 the least-squares method (Gauss 1806, 1995; Legendre 1810), which was successfully applied in 1801 to predict the location of the asteroid Ceres. This asteroid, originally discovered by the Italian astronomer Giuseppe Piazzi on January 1, 1801, and then lost in the glare of the sun, was in fact recovered 1 year later by the Hungarian astronomer F. X. von Zach exploiting the least-squares predictions of Ceres' position provided by Gauss.

Statistics
Starting from the work of Fisher in the 1920s (Fisher 1912, 1922, 1925), maximum likelihood estimation has been extensively employed in statistics for estimating the parameters of statistical models (Bard 1974; Ghosh et al. 1997; Koch 1999; Lehmann and Casella 1998; Tsybakov 2009; Wertz 1978).

Telecommunications and Signal/Image Processing
Wiener-Kolmogorov theory of signal estimation, developed in the period 1940-1960 and originally conceived by Wiener during the Second World War for predicting aircraft

trajectories in order to direct antiaircraft fire, subsequently originated many applications in telecommunications and signal/image processing (Barkat 2005; Biemond et al. 1983; Elliott et al. 2008; Itakura 1971; Kay 1993; Kim and Woods 1998; Levy 2008; Najim 2008; Poor 1994; Tuncer and Friedlander 2009; Van Trees 1971; Wakita 1973; Woods and Radewan 1977). For instance, Wiener filters have been successfully applied to linear prediction, acoustic echo cancellation, signal restoration, and image/video de-noising. But it was the discovery of the Kalman filter in 1960 that revolutionized estimation by providing an effective and powerful tool for the solution of any, static or dynamic, stationary or adaptive, linear estimation problem.

A recently conducted, and probably non-exhaustive, search has detected the presence of over 16,000 patents related to the "Kalman filter," spreading over all areas of engineering and over a period of more than 50 years. What is astonishing is that even nowadays, more than 50 years after its discovery, one can see the continuous appearance of new patents and scientific papers presenting novel applications and/or novel extensions of the KF in many directions (e.g., to nonlinear filtering). Since 1992 the number of patents registered every year and related to the KF has followed an exponential law.

Space Navigation and Aerospace Applications
The first important application of the Kalman filter was in the NASA (National Aeronautics and Space Administration) space program. As reported in a NASA technical report (McGee and Schmidt 1985), Kalman presented his new ideas while visiting Stanley F. Schmidt at the NASA Ames Research Center in 1960, and this meeting stimulated the use of the KF during the Apollo program (in particular, in the guidance system of the Saturn V during the Apollo 11 flight to the Moon) and, furthermore, in the NASA Space Shuttle and in Navy submarines, unmanned aerospace vehicles, and weapons such as cruise missiles. Further, to cope with the nonlinearity of the space navigation problem and the small word length of the onboard computer, the extended Kalman filter for nonlinear systems and square-root filter implementations for enhanced numerical robustness were developed as part of NASA's Apollo program. The aerospace field was only the first of a long and continuously expanding list of application domains where the Kalman filter and its nonlinear generalizations have found widespread and beneficial use.

Control Systems and System Identification
The work on Kalman filtering (Kalman 1960b; Kalman and Bucy 1961) also had a significant impact on control system design and implementation. In Kalman (1960a) the duality between estimation and control was pointed out, in that for a certain class of control and estimation problems one can solve the control (estimation) problem for a given dynamical system by resorting to a corresponding estimation (control) problem for a suitably defined dual system. In particular, the Kalman filter has been shown to be the dual of the linear-quadratic (LQ) regulator, and the two dual techniques constitute the linear-quadratic-Gaussian (LQG) regulator (Joseph and Tou 1961). The latter consists of an LQ regulator feeding back in a linear way the state estimate provided by a Kalman filter, which can be independently designed in view of the separation principle. The KF as well as LSE and MLE techniques are also widely used in system identification (Ljung 1999; Söderström and Stoica 1989) for both parameter estimation and output prediction purposes.

Tracking
One of the major application areas for estimation is tracking (Bar-Shalom and Fortmann 1988; Bar-Shalom et al. 2001, 2013; Blackman and Popoli 1999; Farina and Studer 1985, 1986), i.e., the task of following the motion of moving objects (e.g., aircraft, ships, ground vehicles, persons, animals) given noisy measurements of kinematic variables from remote sensors (e.g., radar, sonar, video cameras, wireless sensors). The development of the Wiener filter in the 1940s was actually motivated by radar tracking of aircraft for automatic control of antiaircraft guns. Such filters began to be used in the 1950s whenever

computers were integrated with radar systems, and then in the 1960s more advanced and better performing Kalman filters came into use. Still today it can be said that the Kalman filter and its nonlinear generalizations (e.g., the EKF (Schmidt 1966), the UKF (Julier and Uhlmann 2004), and the particle filter (Gordon et al. 1993)) represent the workhorses of tracking and sensor fusion. Tracking, however, is usually much more complicated than a simple state estimation problem due to the presence of false measurements (clutter) and multiple objects in the surveillance region of interest, as well as to the uncertainty about the origin of measurements. This requires using, besides filtering algorithms, smart techniques for object detection as well as for association between detected objects and measurements. The problem of joint target tracking and classification has also been formulated as a hybrid state estimation problem and addressed in a number of papers (see, e.g., Smeth and Ristic (2004) and the references therein).

Econometrics
State and parameter estimation have been widely used in econometrics (Aoki 1987) for analyzing and/or predicting financial time series (e.g., stock prices, interest rates, unemployment rates, volatility, etc.).

Geophysics
Wiener and Kalman filtering techniques are employed in reflection seismology for estimating the unknown earth reflectivity function given noisy measurements of the seismic wavelet's echoes recorded by a geophone. This estimation problem, known as seismic deconvolution (Mendel 1977, 1983, 1990), has been successfully exploited, e.g., for oil exploration.

Data Assimilation for Weather Forecasting and Oceanography
Another interesting application of estimation theory is data assimilation (Ghil and Malanotte-Rizzoli 1991), which consists of incorporating noisy observations into a computer simulation model of a real system. Data assimilation is in widespread use especially in weather forecasting and oceanography. A large-scale state-space model is typically obtained from the physical system model, expressed in terms of partial differential equations (PDEs), by means of a suitable spatial discretization technique, so that data assimilation is cast into a state estimation problem. To deal with the huge dimensionality of the resulting state vector, appropriate filtering techniques with reduced computational load have been developed (Evensen 2007).

Global Navigation Satellite Systems
Global Navigation Satellite Systems (GNSSs), such as GPS, put into service in 1993 by the US Department of Defense, nowadays provide a commercially widespread technology exploited by millions of users all over the world for navigation purposes, wherein the Kalman filter plays a key role (Bar-Shalom et al. 2001). In fact, not only is the Kalman filter employed in the core of the GNSS to estimate the trajectories of all the satellites, the drifts and rates of all system clocks, and hundreds of parameters related to atmospheric propagation delay, but any GNSS receiver also uses a nonlinear Kalman filter, e.g., an EKF, to estimate its own position and velocity along with the bias and drift of its own clock with respect to GNSS time.

Robotic Navigation (SLAM)
Recursive state estimation is commonly employed in mobile robotics (Thrun et al. 2006) to estimate on-line the robot pose, location, and velocity and, sometimes, also the location and features of the surrounding objects in the environment, exploiting measurements provided by onboard sensors; the overall joint estimation problem is referred to as SLAM (simultaneous localization and mapping) (Dissanayake et al. 2001; Durrant-Whyte and Bailey 2006a,b; Mullane et al. 2011; Smith et al. 1986; Thrun et al. 2006).

Automotive Systems
Several automotive applications of the Kalman filter, or of its nonlinear variants, are reported in the literature for the estimation of various

quantities of interest that cannot be directly measured, e.g., roll angle, sideslip angle, road-tire forces, heading direction, vehicle mass, state of charge of the battery (Barbarisi et al. 2006), etc. In general, one of the major applications of state estimation is the development of virtual sensors, i.e., estimation algorithms for physical variables of interest that cannot be directly measured for technical and/or economic reasons (Stephant et al. 2004).

Miscellaneous Applications
Other areas where estimation has found numerous applications include electric power systems (Abur and Gómez Espósito 2004; Debs and Larson 1970; Miller and Lewis 1971; Monticelli 1999; Toyoda et al. 1970), nuclear reactors (Robinson 1963; Roman et al. 1971; Sage and Masters 1967; Venerus and Bullock 1970), biomedical engineering (Bekey 1973; Snyder 1970; Stark 1968), pattern recognition (Andrews 1972; Ho and Agrawala 1968; Lainiotis 1972), and many others.

Connection Between Information and Estimation Theories

In this section, the link between two fundamental quantities in information theory and estimation theory, i.e., the mutual information (MI) and, respectively, the minimum mean-square error (MMSE), is investigated. In particular, a strikingly simple but very general relationship can be established between the MI of the input and the output of an additive Gaussian channel and the MMSE in estimating the input given the output, regardless of the input distribution (Guo et al. 2005). Although this functional relation holds for general settings of the Gaussian channel (e.g., both discrete-time and continuous-time, possibly vector, channels), in order to avoid the heavy mathematical preliminaries needed to treat the general problem rigorously, two simple scalar cases, a static and a (continuous-time) dynamic one, will be discussed just to highlight the main concept.

Static Scalar Case
Consider two scalar real-valued random variables, x and y, related by

    y = √γ x + v                                  (1)

where v, the measurement noise, is a standard Gaussian random variable independent of x, and γ can be regarded as the gain in the output signal-to-noise ratio (SNR) due to the channel. By considering the MI between x and y as a function of γ, i.e., I(γ) = I(x; √γ x + v), it can be shown that the following relation holds (Guo et al. 2005):

    dI(γ)/dγ = (1/2) E[(x − x̂(γ))²]              (2)

where x̂(γ) = E[x | √γ x + v] is the minimum mean-square error estimate of x given y. Figure 1 displays the behavior of both the MI, in natural logarithmic units of information (nats), and the MMSE versus the SNR.

As mentioned in Guo et al. (2005), the above information-estimation relationship (2) has found a number of applications, e.g., in nonlinear filtering, in multiuser detection, in power allocation over parallel Gaussian channels, in the proof of Shannon's entropy power inequality and its generalizations, as well as in the treatment of the capacity region of several multiuser channels.

Linear Dynamic Continuous-Time Case
While in the static case the MI is considered as a function of the SNR, in the dynamic case it is of great interest to investigate the relationship between the MI and the MMSE as a function of time. Consider the following first-order (scalar) linear Gaussian continuous-time stochastic dynamical system:

    dx_t = a x_t dt + dw_t
    dy_t = √γ x_t dt + dv_t                       (3)

where a is a real-valued constant while w_t and v_t are independent standard Brownian motion processes that represent the process and,

Estimation, Survey on, Fig. 1 MI and MMSE versus SNR
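For the standard Gaussian input underlying the curves of Fig. 1, both quantities have known closed forms, I(γ) = (1/2) ln(1 + γ) nats and MMSE(γ) = 1/(1 + γ), so relation (2) can be checked numerically. A minimal sketch follows; the Gaussian-input choice and the γ grid are illustrative assumptions, whereas relation (2) itself holds for any input distribution.

```python
import numpy as np

# Closed forms for a standard Gaussian input x ~ N(0,1) over the channel
# y = sqrt(g)*x + v, v ~ N(0,1):  I(g) = 0.5*ln(1+g),  MMSE(g) = 1/(1+g).
g = np.linspace(0.0, 10.0, 2001)
mi = 0.5 * np.log1p(g)       # mutual information, in nats
mmse = 1.0 / (1.0 + g)       # minimum mean-square error

# Relation (2): dI/dg = 0.5 * MMSE(g), checked here by finite differences.
dmi_dg = np.gradient(mi, g)
assert np.allclose(dmi_dg[1:-1], 0.5 * mmse[1:-1], atol=1e-4)
```

For non-Gaussian inputs the same check can be repeated by estimating the MI and the MMSE via Monte Carlo; only the closed forms above are specific to the Gaussian case.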

respectively, measurement noises. Defining by x_0^t ≜ {x_s; 0 ≤ s ≤ t} the collection of all states up to time t and analogously y_0^t ≜ {y_s; 0 ≤ s ≤ t} for the channel outputs (i.e., measurements), and considering the MI between x_0^t and y_0^t as a function of time t, i.e., I(t) = I(x_0^t; y_0^t), it can be shown that (Duncan 1970; Mayer-Wolf and Zakai 1983)

    dI(t)/dt = (γ/2) E[(x_t − x̂_t)²]             (4)

where x̂_t = E[x_t | y_0^t] is the minimum mean-square error estimate of the state x_t given all the channel outputs up to time t, i.e., y_0^t. Figure 2 depicts the time behavior of both the MI and the MMSE for several values of γ and a = −1.

Conclusions and Future Trends

Despite the long history of estimation and the huge amount of work on several theoretical and practical aspects of estimation, there is still a lot of research investigation to be done in several directions. Among the many new future trends, networked estimation and quantum estimation (briefly overviewed in the subsequent parts of this section) certainly deserve special attention due to the growing interest in wireless sensor networks and, respectively, quantum computing.

Networked Information Fusion and Estimation
Information or data fusion is about combining, or fusing, information or data from multiple sources to provide knowledge that is not evident from a single source (Bar-Shalom et al. 2013; Farina and Studer 1986). In 1986, an effort to standardize the terminology related to data fusion began, and the JDL (Joint Directors of Laboratories) data fusion working group was established. The result of that effort was the conception of a process model for data fusion and a data fusion lexicon (Blasch et al. 2012; Hall and Llinas 1997). Information and data fusion are mainly supported by sensor networks, which present the following advantages over a single sensor:

Estimation, Survey on, Fig. 2 MI and MMSE for different values of γ (−10 dB, 0 dB, +10 dB) and a = −1

• Can be deployed over wide regions
• Provide diverse characteristics/viewing angles of the observed phenomenon
• Are more robust to failures
• Gather more data that, once fused, provide a more complete picture of the observed phenomenon
• Allow better geographical coverage, i.e., a wider area and fewer terrain obstructions.

Sensor network architectures can be centralized, hierarchical (with or without feedback), or distributed (peer-to-peer). Today's trend for many monitoring and decision-making tasks is to exploit large-scale networks of low-cost and low-energy-consumption devices with sensing, communication, and processing capabilities. For scalability reasons, such networks should operate in a fully distributed (peer-to-peer) fashion, i.e., with no centralized coordination, so as to achieve in each node a global estimation/decision objective through localized processing only.

The attainment of this goal actually requires several issues to be addressed, such as:
• Spatial and temporal sensor alignment
• Scalable fusion
• Robustness with respect to data incest (or double counting), i.e., repeated use of the same information
• Handling data latency (e.g., out-of-sequence measurements/estimates)
• Communication bandwidth limitations

In particular, to counteract data incest, the so-called covariance intersection (Julier and Uhlmann 1997) robust fusion approach has been proposed to guarantee, at the price of some conservatism, consistency of the fused estimate when combining estimates from different nodes with unknown correlations. For scalable fusion, a consensus approach (Olfati-Saber et al. 2007) can be undertaken. This makes it possible to carry out a global (i.e., over the whole network) processing task by iterating local processing steps among neighboring nodes.

Several consensus algorithms have been proposed for distributed parameter (Calafiore and Abrate 2009) or state (Alriksson and

Rantzer 2006; Kamgarpour and Tomlin 2007; Olfati-Saber 2007; Stankovic et al. 2009; Xiao et al. 2005) estimation. Recently, Battistelli and Chisci (2014) introduced a generalized consensus on probability densities which opens up the possibility of performing any Bayesian estimation task over a sensor network in a fully distributed and scalable way. As by-products, this approach has allowed the derivation of consensus Kalman filters with guaranteed stability under minimal requirements of system observability and network connectivity (Battistelli et al. 2011, 2012; Battistelli and Chisci 2014), consensus nonlinear filters (Battistelli et al. 2012), and a consensus CPHD filter for distributed multitarget tracking (Battistelli et al. 2013). Despite these interesting preliminary results, networked estimation is still a very active research area with many open problems related to energy efficiency, estimation performance optimality, robustness with respect to delays and/or data losses, etc.

Quantum Estimation
Quantum estimation theory is a generalization of classical estimation theory in terms of quantum mechanics. As a matter of fact, the statistical theory can be seen as a particular case of the more general quantum theory (Helstrom 1969, 1976). Quantum mechanics has practical applications in several fields of technology (Personick 1971), such as the use of quantum number generators in place of classical random number generators. Moreover, by manipulating the energy states of cesium atoms, it is possible to suppress quantum noise levels and consequently improve the accuracy of atomic clocks. Quantum mechanics can also be exploited to solve optimization problems, sometimes yielding optimization algorithms that are faster than conventional ones. For instance, McGeoch and Wang (2013) provided an experimental study of algorithms based on quantum annealing; interestingly, their results have shown that this approach allows better solutions to be obtained than those found with conventional software solvers. In quantum mechanics, the Kalman filter has also found its proper form, as the quantum Kalman filter. In Iida et al. (2010) the quantum Kalman filter is applied to an optical cavity composed of mirrors and crystals inside, which interacts with a probe laser. In particular, a form of quantum stochastic differential equation can be written for such a system so as to design the algorithm that updates the estimates of the system variables on the basis of the measurement outcome of the system.

Cross-References

Averaging Algorithms and Consensus
Bounds on Estimation
Consensus of Complex Multi-agent Systems
Data Association
Estimation and Control over Networks
Estimation for Random Sets
Extended Kalman Filters
Kalman Filters
Moving Horizon Estimation
Networked Control Systems: Estimation and Control over Lossy Networks
Nonlinear Filters
Observers for Nonlinear Systems
Observers in Linear Systems Theory
Particle Filters

Acknowledgments The authors warmly recognize the contribution of S. Fortunati (University of Pisa, Italy), L. Pallotta (University of Napoli, Italy), and G. Battistelli (University of Firenze, Italy).

Bibliography

Abur A, Gómez Espósito A (2004) Power system state estimation – theory and implementation. Marcel Dekker, New York
Aidala VJ, Hammel SE (1983) Utilization of modified polar coordinates for bearings-only tracking. IEEE Trans Autom Control 28(3):283–294
Alamo T, Bravo JM, Camacho EF (2005) Guaranteed state estimation by zonotopes. Automatica 41(6):1035–1043
Alessandri A, Baglietto M, Battistelli G (2005) Receding-horizon estimation for switching discrete-time linear systems. IEEE Trans Autom Control 50(11):1736–1748
Alessandri A, Baglietto M, Battistelli G (2008) Moving-horizon state estimation for nonlinear discrete-time systems: new stability results and approximation schemes. Automatica 44(7):1753–1765

Alriksson P, Rantzer A (2006) Distributed Kalman filtering using weighted averaging. In: Proceedings of the 17th International Symposium on Mathematical Theory of Networks and Systems, Kyoto
Alspach DL, Sorenson HW (1972) Nonlinear Bayesian estimation using Gaussian sum approximations. IEEE Trans Autom Control 17(4):439–448
Anderson BDO, Chirarattananon S (1972) New linear smoothing formulas. IEEE Trans Autom Control 17(1):160–161
Anderson BDO, Moore JB (1979) Optimal filtering. Prentice Hall, Englewood Cliffs
Andrews A (1968) A square root formulation of the Kalman covariance equations. AIAA J 6(6):1165–1166
Andrews HC (1972) Introduction to mathematical techniques in pattern recognition. Wiley, New York
Aoki M (1987) State space modeling of time-series. Springer, Berlin
Athans M (1971) An optimal allocation and guidance laws for linear interception and rendezvous problems. IEEE Trans Aerosp Electron Syst 7(5):843–853
Balakrishnan AV, Lions JL (1967) State estimation for infinite-dimensional systems. J Comput Syst Sci 1(4):391–403
Barbarisi O, Vasca F, Glielmo L (2006) State of charge Kalman filter estimator for automotive batteries. Control Eng Pract 14(3):267–275
Bard Y (1974) Nonlinear parameter estimation. Academic, New York
Barkat M (2005) Signal detection and estimation. Artech House, Boston
Bar-Shalom Y, Fortmann TE (1988) Tracking and data association. Academic, Boston
Bar-Shalom Y, Li X, Kirubarajan T (2001) Estimation with applications to tracking and navigation. Wiley, New York
Bar-Shalom Y, Willett P, Tian X (2013) Tracking and data fusion: a handbook of algorithms. YBS Publishing, Storrs, CT
Battistelli G, Chisci L, Morrocchi S, Papi F (2011) An information-theoretic approach to distributed state estimation. In: Proceedings of the 18th IFAC World Congress, Milan, pp 12477–12482
Battistelli G, Chisci L, Mugnai G, Farina A, Graziano A (2012) Consensus-based algorithms for distributed filtering. In: Proceedings of the 51st IEEE Control and Decision Conference, Maui, pp 794–799
Battistelli G, Chisci L (2014) Kullback-Leibler average, consensus on probability densities, and distributed state estimation with guaranteed stability. Automatica 50:707–718
Battistelli G, Chisci L, Fantacci C, Farina A, Graziano A (2013) Consensus CPHD filter for distributed multitarget tracking. IEEE J Sel Topics Signal Process 7(3):508–520
Bayes T (1763) Essay towards solving a problem in the doctrine of chances. Philos Trans R Soc Lond 53:370–418. doi:10.1098/rstl.1763.0053
Bayless JW, Brigham EO (1970) Application of the Kalman filter. Geophysics 35(1):2–23
Bekey GA (1973) Parameter estimation in biological systems: a survey. In: Proceedings of the 3rd IFAC Symposium Identification and System Parameter Estimation, Delft, pp 1123–1130
Benedict TR, Bordner GW (1962) Synthesis of an optimal set of radar track-while-scan smoothing equations. IRE Trans Autom Control 7(4):27–32
Benes VE (1981) Exact finite-dimensional filters for certain diffusions with non linear drift. Stochastics 5(1):65–92
Bertsekas DP, Rhodes IB (1971) Recursive state estimation for a set-membership description of uncertainty. IEEE Trans Autom Control 16(2):117–128
Biemond J, Rieske J, Gerbrands JJ (1983) A fast Kalman filter for images degraded by both blur and noise. IEEE Trans Acoust Speech Signal Process 31(5):1248–1256
Bierman GJ (1974) Sequential square root filtering and smoothing of discrete linear systems. Automatica 10(2):147–158
Bierman GJ (1977) Factorization methods for discrete sequential estimation. Academic, New York
Blackman S, Popoli R (1999) Design and analysis of modern tracking systems. Artech House, Norwood
Blasch EP, Lambert DA, Valin P, Kokar MM, Llinas J, Das S, Chong C, Shahbazian E (2012) High level information fusion (HLIF): survey of models, issues, and grand challenges. IEEE Aerosp Electron Syst Mag 27(9):4–20
Bryson AE, Frazier M (1963) Smoothing for linear and nonlinear systems, TDR 63-119, Aero System Division. Wright-Patterson Air Force Base, Ohio
Calafiore GC, Abrate F (2009) Distributed linear estimation over sensor networks. Int J Control 82(5):868–882
Carravetta F, Germani A, Raimondi M (1996) Polynomial filtering for linear discrete-time non-Gaussian systems. SIAM J Control Optim 34(5):1666–1690
Chen J, Patton RJ (1999) Robust model-based fault diagnosis for dynamic systems. Kluwer, Boston
Chisci L, Zappa G (1992) Square-root Kalman filtering of descriptor systems. Syst Control Lett 19(4):325–334
Chisci L, Garulli A, Zappa G (1996) Recursive state bounding by parallelotopes. Automatica 32(7):1049–1055
Chou KC, Willsky AS, Benveniste A (1994) Multiscale recursive estimation, data fusion, and regularization. IEEE Trans Autom Control 39(3):464–478
Combettes P (1993) The foundations of set-theoretic estimation. Proc IEEE 81(2):182–208
Cramér H (1946) Mathematical methods of statistics. University Press, Princeton
Crisan D, Rozovskii B (eds) (2011) The Oxford handbook of nonlinear filtering. Oxford University Press, Oxford
Dai L (1987) State estimation schemes in singular systems. In: Preprints 10th IFAC World Congress, Munich, vol 9, pp 211–215
Dai L (1989) Filtering and LQG problems for discrete-time stochastic singular systems. IEEE Trans Autom Control 34(10):1105–1108

Darragh J, Looze D (1984) Noncausal minimax linear state estimation for systems with uncertain second-order statistics. IEEE Trans Autom Control 29(6):555–557
Daum FE (1986) Exact finite dimensional nonlinear filters. IEEE Trans Autom Control 31(7):616–622
Daum F (2005) Nonlinear filters: beyond the Kalman filter. IEEE Aerosp Electron Syst Mag 20(8):57–69
Debs AS, Larson RE (1970) A dynamic estimator for tracking the state of a power system. IEEE Trans Power Appar Syst 89(7):1670–1678
Devlin K (2008) The unfinished game: Pascal, Fermat and the seventeenth-century letter that made the world modern. Basic Books, New York
Dissanayake G, Newman P, Clark S, Durrant-Whyte H, Csorba M (2001) A solution to the simultaneous localization and map building (SLAM) problem. IEEE Trans Robot Autom 17(3):229–241
Dochain D, Vanrolleghem P (2001) Dynamical modelling and estimation of wastewater treatment processes. IWA Publishing, London
Doucet A, de Freitas N, Gordon N (eds) (2001) Sequential Monte Carlo methods in practice. Springer, New York
Duncan TE (1970) On the calculation of mutual information. SIAM J Appl Math 19(1):215–220
Durrant-Whyte H, Bailey T (2006a) Simultaneous localization and mapping: Part I. IEEE Robot Autom Mag 13(2):99–108
Durrant-Whyte H, Bailey T (2006b) Simultaneous localization and mapping: Part II. IEEE Robot Autom Mag 13(3):110–117
Elliott RJ, Aggoun L, Moore JB (2008) Hidden Markov models: estimation and control. Springer, New York (first edition in 1995)
Evensen G (1994a) Inverse methods and data assimilation in nonlinear ocean models. Physica D 77:108–129
Evensen G (1994b) Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics. J Geophys Res 99(10):143–162
Evensen G (2007) Data assimilation – the ensemble Kalman filter. Springer, Berlin
Falb PL (1967) Infinite-dimensional filtering; the Kalman-Bucy filter in Hilbert space. Inf Control 11(1):102–137
Farina A (1999) Target tracking with bearings-only measurements. Signal Process 78(1):61–78
Farina A, Studer FA (1985) Radar data processing. Volume 1 – introduction and tracking, Editor P. Bowron. Research Studies Press, Letchworth/Wiley, New York
Fisher RA (1912) On an absolute criterion for fitting frequency curves. Messenger Math 41(5):155–160
Fisher RA (1922) On the mathematical foundations of theoretical statistics. Philos Trans R Soc Lond Ser A 222:309–368
Fisher RA (1925) Theory of statistical estimation. Math Proc Camb Philos Soc 22(5):700–725
Flinn EA, Robinson EA, Treitel S (1967) Special issue on the MIT geophysical analysis group reports. Geophysics 32:411–525
Frost PA, Kailath T (1971) An innovations approach to least-squares estimation, Part III: nonlinear estimation in white Gaussian noise. IEEE Trans Autom Control 16(3):217–226
Galilei G (1632) Dialogue concerning the two chief world systems (Translated in English by S. Drake (Foreword of A. Einstein), University of California Press, Berkeley, 1953)
Gauss CF (1806) Theoria motus corporum coelestium in sectionibus conicis solem ambientum (Translated in English by C.H. Davis, Reprinted as Theory of motion of the heavenly bodies, Dover, New York, 1963)
Gauss CF (1995) Theoria combinationis observationum erroribus minimis obnoxiae (Translated by G.W. Stewart, Classics in Applied Mathematics), vol 11. SIAM, Philadelphia
Germani A, Manes C, Palumbo P (2005) Polynomial extended Kalman filter. IEEE Trans Autom Control 50(12):2059–2064
Ghil M, Malanotte-Rizzoli P (1991) Data assimilation in meteorology and oceanography. Adv Geophys 33:141–265
Ghosh M, Mukhopadhyay N, Sen PK (1997) Sequential estimation. Wiley, New York
Golub GH (1965) Numerical methods for solving linear least squares problems. Numerische Mathematik 7(3):206–216
Goodwin GC, Seron MM, De Doná JA (2005) Constrained control and estimation. Springer, London
Gordon NJ, Salmond DJ, Smith AFM (1993) Novel approach to nonlinear/non Gaussian Bayesian state estimation. IEE Proc F 140(2):107–113
Grewal MS, Weill LR, Andrews AP (2001) Global positioning systems, inertial navigation and integration. Wiley, New York
Grimble MJ (1988) H∞ design of optimal linear filters. In: Byrnes C, Martin C, Saeks R (eds) Linear circuits, systems and signal processing: theory and applications. North Holland, Amsterdam, pp 535–540
Guo D, Shamai S, Verdú S (2005) Mutual information and
(Translated in Russian (Radio I Sviaz, Moscow 1993) minimum mean-square error in Gaussian channels.
and in Chinese. China Defense Publishing House, IEEE Trans Inf Theory 51(4):1261–1282
1988) Hall DL, Llinas J (1997) An introduction to multisensor
Farina A, Studer FA (1986) Radar data processing. Vol- data fusion. Proc IEEE 85(1):6–23
ume 2 – advanced topics and applications, Editor P. Hassibi B, Sayed AH, Kailath T (1999) Indefinite-
Bowron. Research Studies Press, Letchworth/Wiley, quadratic estimation and control. SIAM, Philadelphia
New York (In Chinese, China Defense Publishing Heemink AW, Segers AJ (2002) Modeling and predic-
House, 1992) tion of environmental data in space and time using
Farrell JA, Barth M (1999) The global positioning system Kalman filtering. Stoch Environ Res Risk Assess 16:
and inertial navigation. McGraw-Hill, New York 225–240
Estimation, Survey on 381

Helstrom CW (1969) Quantum detection and estimation Kalman RE (1960b) A new approach to linear filtering and
theory. J Stat Phys 1(2):231–252 prediction problems. Trans ASME J Basic Eng Ser D
Helstrom CW (1976) Quantum detection and estimation 82(1):35–45
theory. Academic, New York Kalman RE, Bucy RS (1961) New results in linear filtering
Hernandez M, Farina A, Ristic B (2006) PCRLB for track- and prediction theory. Trans ASME J Basic Eng Ser D
ing in cluttered environments: measurement sequence 83(3):95–108
conditioning approach. IEEE Trans Aerosp Electron Kamgarpour M, Tomlin C (2007) Convergence properties
Syst 42(2):680–704 of a decentralized Kalman filter. In: Proceedings of the
Ho YC, Agrawala AK (1968) On pattern classification 47th IEEE Conference on Decision and Control, New
algorithms – introduction and survey. Proc IEEE Orleans, pp 5492–5498
56(12):2101–2113 Kaminski PG, Bryson AE (1972) Discrete square-root
Ho YC, Lee RCK (1964) A Bayesian approach to prob- smoothing. In: Proceedings of the AIAA Guidance and
lems in stochastic estimation and control. IEEE Trans Control Conference, paper no. 72–877, Stanford E
Autom Control 9(4):333–339 Kay SM (1993) Fundamentals of statistical signal pro-
Iida S, Ohki F, Yamamoto N (2010) Robust quantum cessing, volume 1: estimation theory. Prentice Hall,
Kalman filtering under the phase uncertainty of the Englewood Cliffs
probe-laser. In: IEEE international symposium on Kim K, Shevlyakov G (2008) Why Gaussianity? IEEE
computer-aided control system design, Yokohama, Signal Process Mag 25(2):102–113
pp 749–754 Kim J, Woods JW (1998) 3-D Kalman filter for image mo-
Itakura F (1971) Extraction of feature parameters of tion estimation. IEEE Trans Image Process 7(1):42–52
speech by statistical methods. In: Proceedings of Koch KR (1999) Parameter estimation and hypothesis
the 8th Symposium Speech Information Processing, testing in linear models. Springer, Berlin
Sendai, pp II-5.1–II-5.12 Kolmogorov AN (1933) Grundbegriffe der Wahrschein-
Jazwinski AH (1966) Filtering for nonlinear dynamical lichkeitsrechnung (in German) (English translation
systems. IEEE Trans Autom Control 11(4):765–766 edited by Nathan Morrison: Foundations of the theory
Jazwinski AH (1968) Limited memory optimal filtering. of probability, Chelsea, New York, 1956). Springer,
IEEE Trans Autom Control 13(5): 558–563 Berlin
Jazwinski AH (1970) Stochastic processes and filtering Kolmogorov AN (1941) Interpolation and extrapolation of
theory. Academic, New York stationary random sequences. Izv Akad Nauk SSSR
Joseph DP, Tou TJ (1961) On linear control theory. Trans Ser Mat 5:3–14 (in Russian) (English translation by
Am Inst Electr Eng 80(18):193–196 G. Lindqvist, published in Selected works of A.N.
Julier SJ, Uhlmann JK (1997) A non-divergent estimation Kolmogorov - Volume II: Probability theory and math-
algorithm in the presence of unknown correlations. In: ematical statistics, A.N. Shyryaev (Ed.), pp. 272–280,
Proceedings of the 1997 American Control Confer- Kluwer, Dordrecht, Netherlands, 1992)
ence, Albuquerque, vol 4, pp 2369–2373 Kushner HJ (1962) On the differential equations satis-
Julier SJ, Uhlmann JK (2004) Unscented Filtering and fied by conditional probability densities of Markov
Nonlinear Estimation. Proc IEEE 92(3): 401–422 processes with applications. SIAM J Control Ser A
Julier SJ, Uhlmann JK, Durrant-Whyte H (1995) A new 2(1):106–199
approach for filtering nonlinear systems. In: Proceed- Kushner HJ (1967) Dynamical equations for optimal non-
ings of the 1995 American control conference, Seattle, linear filtering. J Differ Equ 3(2):179–190
vol 3, pp 1628–1632 Kushner HJ (1970) Filtering for linear distributed param-
Kailath T (1968) An innovations approach to least- eter systems. SIAM J Control 8(3): 346–359
squares estimation, Part I: linear filtering in addi- Kwakernaak H (1967) Optimal filtering in linear sys-
tive white noise. IEEE Trans Autom Control 13(6): tems with time delays. IEEE Trans Autom Control
646–655 12(2):169–173
Kailath T (1970) The innovations approach to de- Lainiotis DG (1972) Adaptive pattern recognition: a state-
tection and estimation theory. Proc IEEE 58(5): variable approach. In: Watanabe S (ed) Frontiers of
680–695 pattern recognition. Academic, New York
Kailath T (1973) Some new algorithms for recursive Lainiotis DG (1974) Estimation: a brief survey. Inf Sci
estimation in constant linear systems. IEEE Trans Inf 7:191–202
Theory 19(6):750–760 Laplace PS (1814) Essai philosophique sur le probabilités
Kailath T, Frost PA (1968) An innovations approach (Translated in English by F.W. Truscott, F.L. Emory,
to least-squares estimation, Part II: linear smoothing Wiley, New York, 1902)
in additive white noise. IEEE Trans Autom Control Legendre AM (1810) Méthode des moindres quarrés, pour
13(6):655–660 trouver le milieu le plus probable entre les résultats
Kailath T, Sayed AH, Hassibi B (2000) Linear estimation. de diffèrentes observations. In: Mémoires de la Classe
Prentice Hall, Upper Saddle River des Sciences Mathématiques et Physiques de l’Institut
Kalman RE (1960a) On the general theory of control sys- Impérial de France, pp 149–154. Originally appeared
tems. In: Proceedings of the 1st International Congress as appendix to the book Nouvelle méthodes pour la
of IFAC, Moscow, USSR, vol 1, pp 481–492 détermination des orbites des comètes, 1805
382 Estimation, Survey on

Lehmann EL, Casella G (1998) Theory of point estima- Mendel JM (1977) White noise estimators for seismic
tion. Springer, New York data processing in oil exploration. IEEE Trans Autom
Leibungudt BG, Rault A, Gendreau F (1983) Application Control 22(5):694–706
of Kalman filtering to demographic models. IEEE Mendel JM (1983) Optimal seismic deconvolution: an
Trans Autom Control 28(3):427–434 estimation-based approach. Academic, New York
Levy BC (2008) Principles of signal detection and param- Mendel JM (1990) Maximum likelihood deconvolution
eter estimation. Springer, New York – a journey into model-based signal processing.
Ljung L (1999) System identification: theory for the user. Springer, New York
Prentice Hall, Upper Saddle River Metropolis N, Ulam S (1949) The Monte Carlo method.
Los Alamos Scientific Laboratory (1966) Fermi invention J Am Stat Assoc 44(247):335–341
rediscovered at LASL. The Atom, pp 7–11 Milanese M, Belforte G (1982) Estimation theory and un-
Luenberger DG (1964) Observing the state of a linear certainty intervals evaluation in presence of unknown
system. IEEE Trans Mil Electron 8(2):74–80 but bounded errors: linear families of models and
Mahler RPS (1996) A unified foundation for data fusion. estimators. IEEE Trans Autom Control 27(2):408–414
In: Sadjadi FA (ed) Selected papers on sensor and Milanese M, Vicino A (1993) Optimal estimation theory
data fusion. SPIE MS-124. SPIE Optical Engineering for dynamic systems with set-membership uncertainty.
Press, Bellingham, pp 325–345. Reprinted from 7th Automatica 27(6):427–446
Joint Service Data Fusion Symposium, pp 154–174, Miller WL, Lewis JB (1971) Dynamic state estima-
Laurel (1994) tion in power systems. IEEE Trans Autom Control
Mahler RPS (2003) Multitarget Bayes filtering via first- 16(6):841–846
order multitarget moments. IEEE Trans Aerosp Elec- Mlodinow L (2009) The drunkard’s walk: how random-
tron Syst 39(4):1152–1178 ness rules our lives. Pantheon Books, New York. Copy-
Mahler RPS (2007a) Statistical multisource multitarget right (C) Leonard Mlodinow
information fusion. Artech House, Boston Monticelli A (1999) State estimation in electric power
Mahler RPS (2007b) PHD filters of higher order in systems: a generalized approach. Kluwer, Dordrecht
target number. IEEE Trans Aerosp Electron Syst Morf M, Kailath T (1975) Square-root algorithms for
43(4):1523–1543 least-squares estimation. IEEE Trans Autom Control
Mangoubi ES (1998) Robust estimation and failure detec- 20(4):487–497
tion: a concise treatment. Springer, New York Morf M, Sidhu GS, Kailath T (1974) Some new al-
Maybeck PS (1979 and 1982) Stochastic models, estima- gorithms for recursive estimation in constant, linear,
tion and control, vols I, II and III. Academic, New discrete-time systems. IEEE Trans Autom Control
York. 1979 (volume I) and 1982 (volumes II and III) 19(4):315–323
Mayer-Wolf E, Zakai M (1983) On a formula relating Mullane J, Vo B-N, Adams M, Vo B-T (2011) Random
the Shannon information to the Fisher information for finite sets for robotic mapping and SLAM. Springer,
the filtering problem. In: Korezlioglu H, Mazziotto Berlin
G and Szpirglas J (eds) Lecture notes in control and Nachazel K (1993) Estimation theory in hydrology and
information sciences, vol 61. Springer, New York, water systems. Elsevier, Amsterdam
pp 164–171 Najim M (2008) Modeling, estimation and optimal fil-
Mayne DQ (1966) A solution of the smoothing problem tering in signal processing. Wiley, Hoboken. First
for linear dynamic systems. Automatica 4(2):73–92 published in France by Hermes Science/Lavoisier as
McGarty TP (1971) The estimation of the constituent Modélisation, estimation et filtrage optimal en traite-
densities of the upper atmosphere. IEEE Trans Autom ment du signal (2006)
Control 16(6):817–823 Neal SR (1967) Discussion of parametric relations for
McGee LA, Schmidt SF (1985) Discovery of the Kalman the filter predictor. IEEE Trans Autom Control 12(3):
filter as a practical tool for aerospace and industry, 315–317
NASA-TM-86847 Netz R, Noel W (2011) The Archimedes codex. Phoenix,
McGeoch CC, Wang C (2013) Experimental evaluation of Philadelphia, PA
an adiabatic quantum system for combinatorial opti- Nikoukhah R, Willsky AS, Levy BC (1992) Kalman
mization. In: Computing Frontiers 2013 Conference, filtering and Riccati equations for descriptor systems.
Ischia, article no. 23. doi:10.1145/2482767.2482797 IEEE Trans Autom Control 37(9):1325–1342
McGrayne SB (2011) The theory that would not die. How Olfati-Saber R (2007) Distributed Kalman filtering for
Bayes’ rule cracked the enigma code, hunted down sensor networks. In: Proceedings of the 46th IEEE
Russian submarines, and emerged triumphant from Conference on Decision and Control, New Orleans,
two centuries of controversies. Yale University Press, pp 5492–5498
New Haven/London Olfati-Saber R, Fax JA, Murray R (2007) Consensus and
Meditch JS (1967) Orthogonal projection and discrete cooperation in networked multi-agent systems. Proc
optimal linear smoothing. SIAM J Control 5(1):74–89 IEEE 95(1):215–233
Meditch JS (1971) Least-squares filtering and smoothing Painter JH, Kerstetter D, Jowers S (1990) Reconciling
for linear distributed parameter systems. Automatica steady-state Kalman and ˛  ˇ filter design. IEEE
7(3):315–322 Trans Aerosp Electron Syst 26(6):986–991
Estimation, Survey on 383

Park S, Serpedin E, Qarage K (2013) Gaussian assump- Sage AP, Masters GW (1967) Identification and modeling
tion: the least favorable but the most useful. IEEE of states and parameters of nuclear reactor systems.
Signal Process Mag 30(3):183–186 IEEE Trans Nucl Sci 14(1):279–285
Personick SD (1971) Application of quantum estimation Sage AP, Melsa JL (1971) Estimation theory with appli-
theory to analog communication over quantum chan- cations to communications and control. McGraw-Hill,
nels. IEEE Trans Inf Theory 17(3):240–246 New York
Pindyck RS, Roberts SM (1974) Optimal policies Schmidt SF (1966) Application of state-space methods to
for monetary control. Ann Econ Soc Meas 3(1): navigation problems. Adv Control Syst 3:293–340
207–238 Schmidt SF (1970) Computational techniques in Kalman
Poor HV (1994) An introduction to signal detection and filtering, NATO-AGARD-139
estimation, 2nd edn. Springer, New York. (first edition Schonhoff TA, Giordano AA (2006) Detection and esti-
in 1988) mation theory and its applications. Pearson Prentice
Poor HV, Looze D (1981) Minimax state estimation for Hall, Upper Saddle River E
linear stochastic systems with noise uncertainty. IEEE Schweppe FC (1968) Recursive state estimation with
Trans Autom Control 26(4):902–906 unknown but bounded errors and system inputs. IEEE
Potter JE, Stern RG (1963) Statistical filtering of space Trans Autom Control 13(1):22–28
navigation measurements. In: Proceedings of the Segall A (1976) Recursive estimation from discrete-
1963 AIAA Guidance and Control Conference, paper time point processes. IEEE Trans Inf Theory 22(4):
no. 63–333, Cambridge 422–431
Ramm AG (2005) Random fields estimation. World Sci- Servi L, Ho Y (1981) Recursive estimation in the presence
entific, Hackensack of uniformly distributed measurement noise. IEEE
Rao C (1945) Information and the accuracy attainable in Trans Autom Control 26(2):563–565
the estimation of statistical parameters. Bull Calcutta Simon D (2006) Optimal state estimation – Kalman, H1
Math Soc 37:81–89 and nonlinear approaches. Wiley, New York
Rao CV, Rawlings JB, Lee JH (2001) Constrained linear Simpson HR (1963) Performance measures and optimiza-
state estimation – a moving horizon approach. Auto- tion condition for a third-order tracker. IRE Trans
matica 37(10):1619–1628 Autom Control 8(2):182–183
Rao CV, Rawlings JB, Mayne DQ (2003) Constrained Sklansky J (1957) Optimizing the dynamic parameters of
state estimation for nonlinear discrete-time systems: a track-while-scan system. RCA Rev 18:163–185
stability and moving horizon approximations. IEEE Smeth P, Ristic B (2004) Kalman filter and joint tracking
Trans Autom Control 48(2): 246–257 and classification in TBM framework. In: Proceedings
Rauch HE (1963) Solutions to the linear smoothing prob- of the 7th international conference on information
lem. IEEE Trans Autom Control 8(4): 371–372 fusion, Stockholm, pp 46–53
Rauch HE, Tung F, Striebel CT (1965) Maximum like- Smith R, Self M, Cheeseman P (1986) Estimating uncer-
lihood estimates of linear dynamic systems. AIAA J tain spatial relationships in robotics. In: Proceedings
3(8):1445–1450 of the 2nd conference on uncertainty in artificial intel-
Reid D (1979) An algorithm for tracking multiple targets. ligence, Philadelphia, pp 435–461
IEEE Trans Autom Control 24(6):843–854 Snijders TAB, Koskinen J, Schweinberger M (2012) Max-
Riccati JF (1722) Animadversiones in aequationes differ- imum likelihood estimation for social network dynam-
entiales secundi gradi. Actorum Eruditorum Lipsiae, ics. Ann Appl Stat 4(2):567–588
Supplementa 8:66–73. Observations regarding differ- Snyder DL (1968) The state-variable approach to contin-
ential equations of second order, translation of the uous estimation with applications to analog communi-
original Latin into English, by I. Bruce, 2007 cations. MIT, Cambridge
Riccati JF (1723) Appendix in animadversiones in aequa- Snyder DL (1970) Estimation of stochastic intensity func-
tiones differentiales secundi gradi. Acta Eruditorum tions of conditional Poisson processes. Monograph no.
Lipsiae 128, Biomedical Computer Laboratory, Washington
Ristic B (2013) Particle filters for random set models. University, St. Louis
Springer, New York Söderström T (1994) Discrete-time stochastic system –
Ristic B, Arulampalam S, Gordon N (2004) Beyond the estimation & control. Prentice Hall, New York
Kalman filter: particle filters for tracking applications. Söderström T, Stoica P (1989) System identification. Pren-
Artech House, Boston tice Hall, New York
Ristic B, Vo B-T, Vo B-N, Farina A (2013) A tutorial on Sorenson HW, Alspach DL (1970) Gaussian sum approx-
Bernoulli filters: theory, implementation and applica- imations for nonlinear filtering. In: Proceedings of
tions. IEEE Trans Signal Process 61(13):3406–3430 the 1970 IEEE Symposium on Adaptive Processes,
Robinson EA (1963) Mathematical development of dis- Austin, TX pp 19.3.1–19.3.9
crete filters for detection of nuclear explosions. Sorenson HW, Alspach DL (1971) Recursive Bayesian
J Geophys Res 68(19):5559–5567 estimation using Gaussian sums. Automatica 7(4):
Roman WS, Hsu C, Habegger LT (1971) Parameter identi- 465–479
fication in a nonlinear reactor system. IEEE Trans Nucl Stankovic SS, Stankovic MS, Stipanovic DM (2009)
Sci 18(1):426–429 Consensus based overlapping decentralized estimation
384 Event-Triggered and Self-Triggered Control

with missing observations and communication faults. Vo B-N, Ma WK (1996) The Gaussian mixture probability
Automatica 45(6):1397–1406 hypothesis density filter. IEEE Trans Signal Process
Stark L (1968) Neurological control systems studies in 54(11):4091–4101
bioengineering. Plenum Press, New York Vo B-T, Vo B-N, Cantoni A (2007) Analytic implementa-
Stengel PF (1994) Optimal control and estimation. Dover, tions of the cardinalized probability hypothesis density
New York. Originally published as Stochastic optimal filter. IEEE Trans Signal Process 55(7):3553–3567
control. Wiley, New York, 1986 Wakita H (1973) Estimation of the vocal tract shape by
Stephant J, Charara A, Meizel D (2004) Virtual sensor: inverse optimal filtering. IEEE Trans Audio Electroa-
application to vehicle sideslip angle and transversal coust 21(5):417–427
forces. IEEE Trans Ind Electron 51(2):278–289 Wertz W (1978) Statistical density estimation – a survey.
Stratonovich RL (1959) On the theory of optimal nonlin- Vandenhoeck and Ruprecht, Göttingen
ear filtering of random functions. Theory Probab Appl Wiener N (1949) Extrapolation, interpolation and smooth-
4:223–225 ing of stationary time-series. MIT, Cambridge
Stratonovich RL (1960) Conditional Markov processes. Willsky AS (1976) A survey of design methods for fail-
Theory Probab Appl 5(2):156–178 ure detection in dynamic systems. Automatica 12(6):
Taylor JH (1979) The Cramér-Rao estimation error 601–611
lower bound computation for deterministic nonlin- Wonham WM (1965) Some applications of stochastic
ear systems. IEEE Trans Autom Control 24(2): differential equations to optimal nonlinear filtering.
343–344 SIAM J Control 2(3):347–369
Thrun S, Burgard W, Fox D (2006) Probabilistic robotics. Woods JW, Radewan C (1977) Kalman filtering in
MIT, Cambridge two dimensions. IEEE Trans Inf Theory 23(4):
Tichavsky P, Muravchik CH, Nehorai A (1998) Posterior 473–482
Cramér-Rao bounds for discrete-time nonlinear filter- Xiao L, Boyd S, Lall S (2005) A scheme for robust
ing. IEEE Trans Signal Process 6(5):1386–1396 distributed sensor fusion based on average consensus.
Toyoda J, Chen MS, Inoue Y (1970) An application of In: Proceedings of the 4th International Symposium
state estimation to short-term load forecasting, Part I: on Information Processing in Sensor Networks, Los
forecasting model and Part II: implementation. IEEE Angeles, pp 63–70
Trans Power Appar Syst 89(7):1678–1682 (Part I) and Zachrisson LE (1969) On optimal smoothing of
1683–1688 (Part II) continuous-time Kalman processes. Inf Sci 1(2):
Tsybakov AB (2009) Introduction to nonparametric esti- 143–172
mation. Springer, New York Zellner A (1971) An introduction to Bayesian inference in
Tuncer TE, Friedlander B (2009) Classical and modern econometrics. Wiley, New York
direction-of-arrival estimation. Academic, Burlington
Tzafestas SG, Nightingale JM (1968) Optimal filter-
ing, smoothing and prediction in linear distributed-
parameter systems. Proc Inst Electr Eng 115(8):
Event-Triggered and Self-Triggered Control

W.P.M.H. Heemels¹, Karl H. Johansson², and Paulo Tabuada³

¹Department of Mechanical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
²ACCESS Linnaeus Center, Royal Institute of Technology, Stockholm, Sweden
³Department of Electrical Engineering, University of California, Los Angeles, CA, USA

Abstract

Recent developments in computer and communication technologies have led to a new type of large-scale resource-constrained wireless
embedded control systems. It is desirable in these systems to limit the sensor and control computation and/or communication to instances when the system needs attention. However, classical sampled-data control is based on performing sensing and actuation periodically rather than when the system needs attention. This article discusses event- and self-triggered control systems where sensing and actuation are performed when needed. Event-triggered control is reactive and generates sensor sampling and control actuation when, for instance, the plant state deviates more than a certain threshold from a desired value. Self-triggered control, on the other hand, is proactive and computes the next sampling or actuation instance ahead of time. The basics of these control strategies are introduced together with references for further reading.

Keywords

Event-triggered control; Hybrid systems; Real-time control; Resource-constrained embedded control; Sampled-data systems; Self-triggered control

Introduction

In standard control textbooks, e.g., Åström and Wittenmark (1997) and Franklin et al. (2010), periodic control is presented as the only choice for implementing feedback control laws on digital platforms. Although this time-triggered control paradigm has proven to be extremely successful in many digital control applications, recent developments in computer and communication technologies have led to a new type of large-scale resource-constrained (wireless) control systems that call for a reconsideration of this traditional paradigm. In particular, the increasing popularity of (shared) wired and wireless networked control systems raises the importance of explicitly addressing energy, computation, and communication constraints when designing feedback control loops. Aperiodic control strategies that allow the inter-execution times of control tasks to vary in time offer potential advantages with respect to periodic control when handling these constraints, but they also introduce many new interesting theoretical and practical challenges.

Although the discussions regarding periodic vs. aperiodic implementation of feedback control loops date back to the beginning of computer-controlled systems, e.g., Gupta (1963), in the late 1990s two influential papers (Årzén 1999; Åström and Bernhardsson 1999) highlighted the advantages of event-based feedback control. These two papers spurred the development of the first systematic designs of event-based implementations of stabilizing feedback control laws, e.g., Yook et al. (2002), Tabuada (2007), Heemels et al. (2008), and Henningsson et al. (2008). Since then, several researchers have improved and generalized these results, and alternative approaches have appeared. In the meantime, so-called self-triggered control (Velasco et al. 2003) also emerged. Event-triggered and self-triggered control systems consist of two elements, namely, a feedback controller that computes the control input and a triggering mechanism that determines when the control input has to be updated again. The difference between event-triggered control and self-triggered control is that the former is reactive, while the latter is proactive. Indeed, in event-triggered control, a triggering condition based on current measurements is continuously monitored, and when the condition holds, an event is triggered. In self-triggered control, the next update time is precomputed at a control update time, based on predictions using previously received data and knowledge of the plant dynamics. In some cases, it is advantageous to combine event-triggered and self-triggered control, resulting in a control system that is reactive to unpredictable disturbances and proactive in predicting future use of resources.

Time-Triggered, Event-Triggered and Self-Triggered Control

To indicate the differences between various digital implementations of feedback control laws, consider the control of the nonlinear plant
ẋ = f(x, u)    (1)

with x ∈ R^{n_x} the state variable and u ∈ R^{n_u} the input variable. The system is controlled by a nonlinear state feedback law

u = h(x)    (2)

where h : R^{n_x} → R^{n_u} is an appropriate mapping that has to be implemented on a digital platform. Recomputing the control value and updating the actuator signals will occur at times denoted by t_0, t_1, t_2, … with t_0 = 0. If we assume the inputs to be held constant in between the successive recomputations of the control law (referred to as sample-and-hold or zero-order hold), we have

u(t) = u(t_k) = h(x(t_k))  ∀t ∈ [t_k, t_{k+1}), k ∈ N.    (3)

We refer to the instants {t_k}_{k∈N} as the triggering times or execution times. Based on these times we can easily explain the difference between time-triggered control, event-triggered control, and self-triggered control.

In time-triggered control we have the equality t_k = kT_s with T_s > 0 being the sampling period. Hence, the updates take place equidistantly in time, irrespective of how the system behaves. There is no "feedback mechanism" in determining the execution times; they are determined a priori and in "open loop." Another way of writing the triggering mechanism in time-triggered control is

t_{k+1} = t_k + T_s,  k ∈ N    (4)

with t_0 = 0.

In event-triggered control the next execution time of the controller is determined by an event-triggering mechanism that continuously verifies whether a certain condition based on the actual state variable becomes true. This condition often also includes information on the state variable x(t_k) at the previous execution time t_k and can be written, for instance, as C(x(t), x(t_k)) > 0. Formally, the execution times are then determined by

t_{k+1} = inf{ t > t_k | C(x(t), x(t_k)) > 0 }    (5)

with t_0 = 0. Hence, it is clear from (5) that there is a feedback mechanism present in the determination of the next execution time, as it is based on the measured state variable. In this sense event-triggered control is reactive.

Finally, in self-triggered control the next execution time is determined proactively based on the measured state x(t_k) at the previous execution time. In particular, there is a function M : R^{n_x} → R_{≥0} that specifies the next execution time as

t_{k+1} = t_k + M(x(t_k))    (6)

with t_0 = 0. As a consequence, in self-triggered control both the control value u(t_k) and the next execution time t_{k+1} are computed at execution time t_k. In between t_k and t_{k+1}, no further actions are required from the controller. Note that the time-triggered implementation can be seen as a special case of the self-triggered implementation by taking M(x) = T_s for all x ∈ R^{n_x}.

Clearly, in all three implementation schemes T_s, C, and M are chosen together with the feedback law given through h to provide stability and performance guarantees and to realize a certain utilization of computer and communication resources.

Lyapunov-Based Analysis

Much work on event-triggered control uses one of the following two modeling and analysis frameworks: the perturbation approach and the hybrid system approach.

Perturbation Approach
In the perturbation approach one adopts perturbed models that describe how the event-triggered implementation of the control law perturbs the ideal continuous-time implementation u(t) = h(x(t)), t ∈ R_{≥0}. In order to do so, consider the error e given by

e(t) = x(t_k) − x(t)  for t ∈ [t_k, t_{k+1}), k ∈ N.    (7)
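Before turning to the analysis, the triggering rules above can be made concrete with a small simulation. The following sketch (the scalar plant, gain, threshold, and horizon are illustrative choices, not from the text) runs the sample-and-hold loop (3) under the time-triggered rule (4) and under an event-triggered rule of the form (5) with a relative threshold on the error (7); it typically shows the event-triggered loop converging comparably while using far fewer control updates.

```python
# Toy comparison of time-triggered rule (4) vs. event-triggered rule (5)
# for the scalar plant x' = a*x + u with feedback u = h(x) = -k*x.
# All numerical values are illustrative choices, not taken from the text.

a, k = 1.0, 2.0        # unstable plant, stabilizing gain: ideal closed loop x' = -x
sigma, Ts = 0.3, 0.05  # relative event threshold and time-triggered sampling period
dt, T = 1e-4, 2.0      # forward-Euler integration step and simulation horizon

def run(event_triggered):
    x = xk = 1.0                     # state x(t) and last sampled state x(t_k)
    t, t_next, updates = 0.0, Ts, 1  # count the initial execution at t_0 = 0
    while t < T:
        x += dt * (a * x - k * xk)   # sample-and-hold input (3): u = -k*x(t_k)
        t += dt
        e = xk - x                   # error signal (7)
        if event_triggered:
            # rule (5) with C(x(t), x(t_k)) = |e(t)| - sigma*|x(t)|
            trigger = abs(e) > sigma * abs(x)
        else:
            trigger = t >= t_next    # rule (4): update every Ts seconds
        if trigger:
            xk = x                   # execution: resample the state, resetting e to 0
            t_next += Ts
            updates += 1
    return x, updates

x_tt, n_tt = run(False)
x_et, n_et = run(True)
print(f"time-triggered : x(T) = {x_tt:.3f} after {n_tt} updates")
print(f"event-triggered: x(T) = {x_et:.3f} after {n_et} updates")
```

Replacing the event test by a precomputed bound on the inter-event time would turn the same loop into a self-triggered implementation in the sense of (6).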
Event-Triggered and Self-Triggered Control 387

Using this error variable we can write the closed-loop system based on (1) and (3) as

ẋ = f(x, h(x + e)).    (8)

Essentially, the three implementations discussed above have their own way of indicating when an execution takes place and the error e is reset to zero. The equation (8) clearly shows how the ideal closed-loop system is perturbed by using a time-triggered, event-triggered, or self-triggered implementation of the feedback law in (2). Indeed, when e = 0 we obtain the ideal closed loop

ẋ = f(x, h(x)).    (9)

The control law in (2) is typically chosen so as to guarantee that the system in (9) has certain global asymptotic stability (GAS) properties. In particular, it is often assumed that there exists a Lyapunov function V : ℝ^{n_x} → ℝ_{≥0} in the sense that V is positive definite and for all x ∈ ℝ^{n_x} we have

(∂V/∂x) f(x, h(x)) ≤ −‖x‖².    (10)

Note that this inequality is stronger than strictly needed (at least for nonlinear systems), but for pedagogical reasons we choose this simpler formulation. For the perturbed model, the inequality in (10) can in certain cases (including linear systems) be modified to

(∂V/∂x) f(x, h(x + e)) ≤ −‖x‖² + β‖e‖²,    (11)

in which β > 0 is a constant used to indicate how the presence of the implementation error e affects the decrease of the Lyapunov function. Based on (10) one can now choose the function C in (5) to preserve GAS of the event-triggered implementation. For instance, C(x(t), x(t_k)) = ‖x(t_k) − x(t)‖ − σ‖x(t)‖ with σ > 0, i.e.,

t_{k+1} = inf{ t > t_k | ‖e(t)‖ > σ‖x(t)‖ },    (12)

assures that

‖e‖ ≤ σ‖x‖    (13)

holds. When σ² < 1/β, we obtain from (11) and (13) that GAS properties are preserved for the event-triggered implementation. Besides, under certain conditions provided in Tabuada (2007), a global positive lower bound exists on the inter-execution times, i.e., there exists a τ_min > 0 such that t_{k+1} − t_k > τ_min for all k ∈ ℕ and all initial states x_0.

Also self-triggered controllers can be derived using the perturbation approach. In this case, stability properties can be guaranteed by choosing M in (6) ensuring that C(x(t), x(t_k)) ≤ 0 holds for all times t ∈ [t_k, t_{k+1}) and all k ∈ ℕ.

Hybrid System Approach

By taking as a state variable ξ = (x, e), one can write the closed-loop event-triggered control system given by (1), (3), and (5) as the hybrid impulsive system (Goebel et al. 2009)

ξ̇ = ( f(x, h(x + e)), −f(x, h(x + e)) )   when C(x, x + e) ≤ 0,    (14a)
ξ⁺ = ( x, 0 )   when C(x, x + e) ≥ 0.    (14b)

This observation was made in Donkers and Heemels (2010, 2012) and Postoyan et al. (2011). Tools from hybrid system theory can be used to analyze this model, which is more accurate as it includes the error dynamics of the event-triggered closed-loop system. In fact, the stability bounds obtained via the hybrid system approach can be proven to be never worse than ones obtained using the perturbation approach in many cases, see, e.g., Donkers and Heemels (2012), and typically the hybrid system approach provides (strictly) better results in practice. However, in general an analysis via the hybrid system approach is more complicated than using a perturbation approach.

Note that by including a time variable τ, one can also write the closed-loop system corresponding to self-triggered control (1), (3), and (6) as a hybrid system using the state variable ξ = (x, e, τ). This leads to the model
ξ̇ = ( f(x, h(x + e)), −f(x, h(x + e)), 1 )   when 0 ≤ τ ≤ M(x + e),    (15a)
ξ⁺ = ( x, 0, 0 )   when τ = M(x + e),    (15b)

which can be used for analysis based on hybrid tools as well.

Alternative Event-Triggering Mechanisms

There are various alternative event-triggering mechanisms. A few of them are described in this section.

Relative, Absolute, and Mixed Triggering Conditions

Above we discussed a very basic event-triggering condition in the form given in (12), which is sometimes called relative triggering as the next control task is executed at the instant when the ratio of the norms of the error ‖e‖ and the measured state ‖x‖ is larger than or equal to σ. Also absolute triggering of the form

t_{k+1} = inf{ t > t_k | ‖e(t)‖ ≥ δ }    (16)

can be considered. Here δ > 0 is an absolute threshold, which has given this scheme the name send-on-delta (Miskowicz 2006). Recently, a mixed triggering mechanism of the form

t_{k+1} = inf{ t > t_k | ‖e(t)‖ ≥ σ‖x(t)‖ + δ },    (17)

combining an absolute and a relative threshold, was proposed (Donkers and Heemels 2012). It is particularly effective in the context of output-based control.

Model-Based Triggering

In the triggering conditions discussed so far, essentially the current control value u(t) is based on a held value x(t_k) of the state variable, as specified in (3). However, if good model-based information regarding the plant is available, one can use better model-based predictions of the actuator signal. For instance, in the linear context, Lunze and Lehmann (2010) proposed to use a control input generator instead of a plain zero-order hold function. In fact, the plant model was described by

ẋ = Ax + Bu + Ew    (18)

with x ∈ ℝ^{n_x} the state variable, u ∈ ℝ^{n_u} the input variable, and w ∈ ℝ^{n_w} a bounded disturbance input. It was assumed that a well-functioning state feedback controller u = Kx was available. The control input generator was then based on the model-based predictions given for [t_k, t_{k+1}) by

ẋ_s(t) = (A + BK) x_s(t) + E ŵ(t_k)  with  x_s(t_k) = x(t_k),    (19)

and ŵ(t_k) is an estimate for the (average) disturbance value, which is determined at execution time t_k, k ∈ ℕ. The applied input to the actuator is then given by u(t) = K x_s(t) for t ∈ [t_k, t_{k+1}), k ∈ ℕ. Note that (19) provides a prediction of the closed-loop state evolution using the latest received value of the state x(t_k) and the estimate ŵ(t_k) of the disturbances. Also the event-triggering condition is based on this model-based prediction of the state as it is given by

t_{k+1} = inf{ t > t_k | ‖x_s(t) − x(t)‖ ≥ δ }.    (20)

Hence, when the prediction x_s(t) deviates too far from the measured state x(t), the next event is triggered so that updates of the state are sent to the actuator. These model-based triggering schemes can enhance the communication savings as they reduce the number of events by using model-based knowledge.

Other model-based event-triggered control schemes are proposed, for instance, in Yook et al. (2002), Garcia and Antsaklis (2013), and Heemels and Donkers (2013).
Triggering with Time-Regularization

Time-regularization was proposed for output-based triggering to avoid the occurrence of accumulations in the execution times (Zeno behavior) that would obstruct the existence of a positive lower bound on the inter-execution times t_{k+1} − t_k, k ∈ ℕ. In Tallapragada and Chopra (2012a,b), the triggering update

t_{k+1} = inf{ t > t_k + T | ‖e(t)‖ ≥ σ‖x(t)‖ }    (21)

was proposed, where T > 0 is a built-in lower bound on the minimal inter-execution times. The authors discussed how T and σ can be designed to guarantee closed-loop stability. In Heemels et al. (2008) a similar triggering was proposed using an absolute type of triggering.

An alternative to exploiting a built-in lower bound T is combining ideas from time-triggered control and event-triggered control. Essentially, the idea is to only verify a specific event-triggering condition at certain equidistant time instants lT_s, l ∈ ℕ, where T_s > 0 is the sampling period. Such proposals were mentioned in, for instance, Årzén (1999), Yook et al. (2002), Henningsson et al. (2008), and Heemels et al. (2008, 2013). In this case the execution times are given by

t_{k+1} = inf{ t > t_k | t = lT_s, l ∈ ℕ, and ‖e(t)‖ ≥ σ‖x(t)‖ }    (22)

in case a relative triggering is used. In Heemels et al. (2013) the term periodic event-triggered control was coined for this type of control.

Decentralized Triggering Conditions

Another important extension of the mentioned event-triggered controllers, especially in large-scale networked systems, is the decentralization of the event-triggered control. Indeed, if one focuses on any of the abovementioned event-triggering conditions (take, e.g., (5)), it is obvious that the full state variable x(t) has to be continuously available in a central coordinator to determine if an event is triggered or not. If the sensors that measure the state are physically distributed over a wide area, this assumption is prohibitive for its implementation. In such cases, it is of high practical importance that the event-triggering mechanism can be decentralized and the execution of control tasks can be executed based on local information. One first idea could be to use local event-triggering mechanisms for the i-th sensor that measures x_i. One could “decentralize” the condition (5) into

t^i_{k_i+1} = inf{ t > t^i_{k_i} | ‖e_i(t)‖ ≥ σ‖x_i(t)‖ },    (23)

in which e_i(t) = x_i(t^i_{k_i}) − x_i(t) for t ∈ [t^i_{k_i}, t^i_{k_i+1}), k_i ∈ ℕ. Note that each sensor now has its own execution times t^i_{k_i}, k_i ∈ ℕ, at which the information x_i(t) is transmitted. More importantly, the triggering condition (23) is based on local data only and does not need a central coordinator having access to the complete state information. Besides, since (23) still guarantees that (13) holds, stability properties can still be guaranteed; see Mazo and Tabuada (2011).

Several other proposals for decentralized event-triggered control schemes were made, e.g., De Persis et al. (2013), Wang and Lemmon (2011), Garcia and Antsaklis (2013), Yook et al. (2002), and Donkers and Heemels (2012).

Triggering for Multi-agent Systems

Event-triggered control strategies are suitable for cooperative control of multi-agent systems. In multi-agent systems, local control actions of individual agents should lead to a desirable global behavior of the overall system. A prototype problem for control of multi-agent systems is the agreement problem (also called the consensus or rendezvous problem), where the states of all agents should converge to a common value (sometimes the average of the agents' initial conditions). The agreement problem has been shown to be solvable for certain low-order dynamical agents in both continuous and discrete time, e.g., Olfati-Saber et al. (2007). It was recently shown in Dimarogonas et al. (2012), Shi and Johansson (2011), and Seyboth et al. (2013) that the agreement problem can be solved using event-triggered control. In Seyboth et al. (2013)
the triggering times for agent i are determined by

t^i_{k_i+1} = inf{ t > t^i_{k_i} | C_i(x_i(t), x_i(t^i_{k_i})) > 0 },    (24)

which should be compared to the triggering times as specified through (5). The triggering condition compares the current state value with the one previously communicated, similarly to the previously discussed decentralized event-triggered control (see (23)), but now the communication is only to the agent's neighbors. Using such event-triggered communication, the convergence rate to agreement (i.e., ‖x_i(t) − x_j(t)‖ → 0 as t → ∞ for all i, j) can be maintained with a much lower communication rate than for time-triggered communication.

Outlook

Many simulation and experimental results show that event-triggered and self-triggered control strategies are capable of reducing the number of control task executions, while retaining a satisfactory closed-loop performance. In spite of these results, the actual deployment of these novel control paradigms in relevant applications is still rather marginal. Some exceptions include recent event-triggered control applications in underwater vehicles (Teixeira et al. 2010), process control (Lehmann et al. 2012), and control over wireless networks (Araujo et al. 2014). To foster the further development of event-triggered and self-triggered controllers in the future, it is therefore important to validate these strategies in practice, next to building up a complete system theory for them. Regarding the latter, it is fair to say that, even though many interesting results are currently available, the system theory for event-triggered and self-triggered control is far from being mature, certainly compared to the vast literature on time-triggered (periodic) sampled-data control. As such, many theoretical and practical challenges are ahead of us in this appealing research field.

Cross-References

Discrete Event Systems and Hybrid Systems, Connections Between
Hybrid Dynamical Systems, Feedback Control of
Models for Discrete Event Systems: An Overview
Supervisory Control of Discrete-Event Systems

Acknowledgments The work of Maurice Heemels was partially supported by the Dutch Technology Foundation (STW) and the Dutch Organization for Scientific Research (NWO) under the VICI grant “Wireless control systems: A new frontier in automation”. The work of Karl Johansson was partially supported by the Knut and Alice Wallenberg Foundation and the Swedish Research Council. Maurice Heemels and Karl Johansson were also supported by the European 7th Framework Programme Network of Excellence under grant HYCON2-257462. The work of Paulo Tabuada was partially supported by NSF awards 0834771 and 0953994.

Bibliography

Araujo J, Mazo M Jr, Anta A, Tabuada P, Johansson KH (2014) System architectures, protocols, and algorithms for aperiodic wireless control systems. IEEE Trans Ind Inform 10(1):175–184
Årzén K-E (1999) A simple event-based PID controller. In: Proceedings of the IFAC World Congress, Beijing, China, vol 18, pp 423–428
Åström KJ, Bernhardsson BM (1999) Comparison of periodic and event based sampling for first order stochastic systems. In: Proceedings of the IFAC World Congress, Beijing, China, pp 301–306
Åström KJ, Wittenmark B (1997) Computer controlled systems. Prentice Hall, Upper Saddle River
De Persis C, Sailer R, Wirth F (2013) Parsimonious event-triggered distributed control: a Zeno free approach. Automatica 49(7):2116–2124
Dimarogonas DV, Frazzoli E, Johansson KH (2012) Distributed event-triggered control for multi-agent systems. IEEE Trans Autom Control 57(5):1291–1297
Donkers MCF, Heemels WPMH (2010) Output-based event-triggered control with guaranteed L∞-gain and improved event-triggering. In: Proceedings of the IEEE conference on decision and control, Atlanta, pp 3246–3251
Donkers MCF, Heemels WPMH (2012) Output-based event-triggered control with guaranteed L∞-gain and improved and decentralised event-triggering. IEEE Trans Autom Control 57(6):1362–1376
Franklin GF, Powell JD, Emami-Naeini A (2010) Feedback control of dynamic systems. Prentice Hall, Upper Saddle River
Garcia E, Antsaklis PJ (2013) Model-based event-triggered control for systems with quantization and time-varying network delays. IEEE Trans Autom Control 58(2):422–434
Goebel R, Sanfelice R, Teel AR (2009) Hybrid dynamical systems. IEEE Control Syst Mag 29:28–93
Gupta S (1963) Increasing the sampling efficiency for a control system. IEEE Trans Autom Control 8(3):263–264
Heemels WPMH, Donkers MCF (2013) Model-based periodic event-triggered control for linear systems. Automatica 49(3):698–711
Heemels WPMH, Sandee JH, van den Bosch PPJ (2008) Analysis of event-driven controllers for linear systems. Int J Control 81:571–590
Heemels WPMH, Donkers MCF, Teel AR (2013) Periodic event-triggered control for linear systems. IEEE Trans Autom Control 58(4):847–861
Henningsson T, Johannesson E, Cervin A (2008) Sporadic event-based control of first-order linear stochastic systems. Automatica 44:2890–2895
Lehmann D, Kiener GA, Johansson KH (2012) Event-triggered PI control: saturating actuators and anti-windup compensation. In: Proceedings of the IEEE conference on decision and control, Maui
Lunze J, Lehmann D (2010) A state-feedback approach to event-based control. Automatica 46:211–215
Mazo M Jr, Tabuada P (2011) Decentralized event-triggered control over wireless sensor/actuator networks. IEEE Trans Autom Control 56(10):2456–2461. Special issue on wireless sensor and actuator networks
Miskowicz M (2006) Send-on-delta concept: an event-based data-reporting strategy. Sensors 6:49–63
Olfati-Saber R, Fax JA, Murray RM (2007) Consensus and cooperation in networked multi-agent systems. Proc IEEE 95(1):215–233
Postoyan R, Anta A, Nešić D, Tabuada P (2011) A unifying Lyapunov-based framework for the event-triggered control of nonlinear systems. In: Proceedings of the joint IEEE conference on decision and control and European control conference, Orlando, pp 2559–2564
Seyboth GS, Dimarogonas DV, Johansson KH (2013) Event-based broadcasting for multi-agent average consensus. Automatica 49(1):245–252
Shi G, Johansson KH (2011) Multi-agent robust consensus, part II: application to event-triggered coordination. In: Proceedings of the IEEE conference on decision and control, Orlando
Tabuada P (2007) Event-triggered real-time scheduling of stabilizing control tasks. IEEE Trans Autom Control 52(9):1680–1685
Tallapragada P, Chopra N (2012a) Event-triggered decentralized dynamic output feedback control for LTI systems. In: IFAC workshop on distributed estimation and control in networked systems, pp 31–36
Tallapragada P, Chopra N (2012b) Event-triggered dynamic output feedback control for LTI systems. In: IEEE 51st annual conference on decision and control (CDC), Maui, pp 6597–6602
Teixeira PV, Dimarogonas DV, Johansson KH, Borges de Sousa J (2010) Event-based motion coordination of multiple underwater vehicles under disturbances. In: IEEE OCEANS, Sydney
Velasco M, Marti P, Fuertes JM (2003) The self triggered task model for real-time control systems. In: Proceedings of 24th IEEE real-time systems symposium, work-in-progress session, Cancun, Mexico
Wang X, Lemmon MD (2011) Event-triggering in distributed networked systems with data dropouts and delays. IEEE Trans Autom Control 56(3):586–601
Yook JK, Tilbury DM, Soparkar NR (2002) Trading computation for bandwidth: reducing communication in distributed control systems using state estimators. IEEE Trans Control Syst Technol 10(4):503–518

Evolutionary Games

Eitan Altman
INRIA, Sophia-Antipolis, France

Abstract

Evolutionary games constitute the most recent major mathematical tool for understanding, modelling and predicting evolution in biology and other fields. They complement other well-established tools such as branching processes and the Lotka-Volterra (1910) equations (e.g. for the predator-prey dynamics or for epidemics evolution). Evolutionary games also bring novel features to game theory. First, they focus on the dynamics of competition rather than restricting attention to the equilibrium. In particular, they try to explain how an equilibrium emerges. Second, they bring new definitions of stability that are more adapted to the context of large populations. Finally, in contrast to standard game theory, players are not assumed to be "rational" or "knowledgeable" so as to anticipate the other players' choices. The objective of this article is to present foundations as well as recent advances in evolutionary games, highlight the novel concepts that they introduce with respect to game theory as formulated by John Nash, and describe through several examples their huge potential as tools for modeling interactions in complex systems.

Keywords

Evolutionary stable strategies; Fitness; Replicator dynamics

Introduction

Evolutionary game theory is the youngest of several mathematical tools used in describing and modeling evolution. It was preceded by the theory of branching processes (Watson and Francis Galton 1875) and its extensions (Altman 2008), which have been introduced in order to explain the evolution of family names in the English population of the second half of the nineteenth century. This theory makes use of the probabilistic distribution of the number of offspring of an individual in order to predict the probability at which the whole population would eventually become extinct. It describes the evolution of the number of offspring of a given individual. The Lotka-Volterra equations (Lotka-Volterra 1910) and their extensions are differential equations that describe the population size of each of several species that have a predator-prey type relation.

One of the foundations in evolutionary games (and its extension to population games), which is often used as the starting point in their definition, is the replicator dynamics which, similarly to the Lotka-Volterra equations, describes the evolution of the size of various species that interact with each other (or of various behaviors within a given population). In both the Lotka-Volterra equations and in replicator dynamics, the evolution of the size of one type of population may depend on the sizes of all other populations. Yet, unlike the Lotka-Volterra equations, the object of the modeling is the normalized sizes of populations rather than the size itself. By normalized size of some type, we mean the fraction of that type within the whole population. A basic feature in evolutionary games is, thus, that the evolution of the fraction of a given type in the population depends on the sizes of other types only through the normalized size rather than through their actual one.

The relative rate of the decrease or increase of the normalized population size of some type in the replicator dynamics is what we call fitness and is to be understood in the Darwinian sense. If some type or some behavior increases more than another one, then it has a larger fitness. The evolution of the fitness as described by the replicator dynamics is a central object of study in evolutionary games.

So far we did not actually consider any game and just discussed ways of modeling evolution. The relation to game theory is due to the fact that under some conditions, the fitness converges to some fixed limit, which can be identified as an equilibrium of a matrix game in which the utilities of the players are the fitnesses. This limit is then called an ESS, an evolutionary stable strategy, as defined by Maynard Smith and Price (1973). It can be computed using elementary tools in matrix games and then used for predicting the (long-term) distribution of behaviors within a population. Note that an equilibrium in a matrix game can be obtained only when the players of the matrix game are rational (each one maximizing its expected utility, being aware of the utilities of other players and of the fact that these players maximize their utilities, etc.). A central contribution of evolutionary games is thus to show that evolution of possibly nonrational populations converges under some conditions to the equilibrium of a game played by rational players. This surprising relationship between the equilibrium of a noncooperative matrix game and the limit points of the fitness dynamics has been supported by a rich body of experimental results; see Friedman (1996).

On the importance of the ESS for understanding the evolution of species, Dawkins writes in his book “The Selfish Gene” (Dawkins 1976): “we may come to look back on the invention of the ESS concept as one of the most important
advances in evolutionary theory since Darwin.” He further specifies: “Maynard Smith’s concept of the ESS will enable us, for the first time, to see clearly how a collection of independent selfish entities can come to resemble a single organized whole.”

Here we shall follow the nontraditional approach describing evolutionary games: we shall first introduce the replicator dynamics and then introduce the game-theoretic interpretation related to it.

Replicator Dynamics

In the biological context, the replicator dynamics is a differential equation that describes the way in which the usage of strategies changes in time. It is based on the idea that the average growth rate per individual that uses a given strategy is proportional to the excess of fitness of that strategy with respect to the average fitness.

In engineering, the replicator dynamics could be viewed as a rule for updating mixed strategies by individuals. It is a decentralized rule since it only requires knowing the average utility of the population rather than the strategy of each individual.

Replicator dynamics is one of the most studied dynamics in evolutionary game theory. It has been introduced by Taylor and Jonker (1978). The replicator dynamics has been used for describing the evolution of road traffic congestion in which the fitness is determined by the strategies chosen by all drivers (Sandholm 2009). It has also been studied in the context of the association problem in wireless communications (Shakkottai et al. 2007).

Consider a set of N strategies and let p_j(t) be the fraction of the whole population that uses strategy j at time t. Let p(t) be the corresponding N-dimensional vector. A function f_j is associated with the growth rate of strategy j, and it is assumed to depend on the fraction of each of the N strategies in the population. There are various forms of replicator dynamics (Sandholm 2009), and we describe here the one most commonly used. It is given by

ṗ_j(t) = γ p_j(t) [ f_j(p(t)) − Σ_{k=1}^{N} p_k(t) f_k(p(t)) ],    (1)

where γ is some positive constant and the payoff function f_k is called the fitness of strategy k. In evolutionary games, evolution is assumed to be due to pairwise interactions between players, as will be described in the next section. Therefore, f_k has the form f_k(p) = Σ_{i=1}^{N} J(k, i) p(i), where J(k, i) is the fitness of an individual playing k if it interacts with an individual that plays strategy i.

Within quite general settings (Weibull 1995), the above replicator dynamics is known to converge to an ESS (which we introduce in the next section).

Evolutionary Games: Fitnesses

Consider an infinite population of players. Each individual i plays at times t^i_n, n = 1, 2, 3, … (assumed to constitute an independent Poisson process with some rate λ) a matrix game against some player j(n) randomly selected within the population. The choice j(n) of the other players at different times is independent. All players have the same finite space of pure strategies (also called actions) K. Each time it plays, a player may use a mixed strategy p, i.e., a probability measure over the set of pure strategies. We consider J(k, i) (defined in the previous section) to be the payoff for a tagged individual if it uses a strategy k and it interacts with an individual using strategy i. With some abuse of notation, one denotes by J(p, q) the expected payoff for a player who uses a mixed strategy p when meeting another individual who adopts the mixed strategy q. If we define a payoff matrix A and consider p and q to be column vectors, then J(p, q) = p′Aq. The payoff function J is indeed linear in p and q. A strategy q is called a Nash equilibrium if

J(q, q) ≥ J(p, q)  for all p ∈ Δ(K),    (2)

where Δ(K) is the set of probabilities over the set K.
Suppose that the whole population uses a strategy q and that a small fraction ε (called “mutations”) adopts another strategy p. Evolutionary forces are expected to select against p if

J(q, εp + (1 − ε)q) > J(p, εp + (1 − ε)q).    (3)

Evolutionary Stable Strategies: ESS

Definition 1. q is said to be an evolutionary stable strategy (ESS) if for every p ≠ q there exists some ε̄_p > 0 such that (3) holds for all ε ∈ (0, ε̄_p).

The definition of ESS is thus related to a robustness property against deviations by a whole (possibly small) fraction of the population. This is an important difference that distinguishes the equilibrium in populations as seen by biologists and the standard Nash equilibrium often used in the economics context, in which robustness is defined against the possible deviation of a single user.

Why do we need the stronger type of robustness? Since we deal with large populations, it is likely to be expected that from time to time, some group of individuals may deviate. Thus robustness against deviations by a single user is not sufficient to ensure that deviations will not develop and end up being used by a growing portion of the population.

Often the ESS is defined through the following equivalent definition.

Theorem 1 (Weibull 1995, Proposition 2.1, or Hofbauer and Sigmund 1998, Theorem 6.4.1, p 63). A strategy q is an evolutionary stable strategy if and only if for all p ≠ q one of the following conditions holds:

J(q, q) > J(p, q),    (4)

or

J(q, q) = J(p, q)  and  J(q, p) > J(p, p).    (5)

In fact, if condition (4) is satisfied, then the fraction of mutations in the population will tend to decrease (as it has a lower fitness, meaning a lower growth rate). Thus, the strategy q is then immune to mutations. If (4) does not hold but condition (5) still holds, then a population using q is “weakly” immune against a mutation using p. Indeed, if the mutant's population grows, then we shall frequently have individuals with strategy q competing with mutants. In such cases, the condition J(q, p) > J(p, p) ensures that the growth rate of the original population exceeds that of the mutants.

A mixed strategy q that satisfies (4) for all p ≠ q is called a strict Nash equilibrium. Recall that a mixed strategy q that satisfies (2) for all p ≠ q is a Nash equilibrium. We conclude from the above theorem that being a strict Nash equilibrium implies being an ESS, and being an ESS implies being a Nash equilibrium. Note that whereas a mixed Nash equilibrium is known to exist in a matrix game, an ESS may not exist. However, an ESS is known to exist in evolutionary games where the number of strategies available to each player is 2 (Weibull 1995).

Proposition 1. In a symmetric game with two strategies for each player and no pure Nash equilibrium, there exists a unique mixed Nash equilibrium, which is an ESS.

Example: The Hawk and Dove Game

We briefly describe the hawk and dove game (Maynard Smith and Price 1973). A bird that searches for food finds itself competing with another bird over food and has to decide whether to adopt a peaceful behavior (dove) or an aggressive one (hawk). The advantage of behaving aggressively is that in an interaction with a peaceful bird, the aggressive one gets access to all the food. This advantage comes at a cost: a hawk which meets another hawk ends up fighting with it and thus takes a risk of getting wounded. In contrast, two doves that meet in a contest over food share it without fighting. The fitnesses for player 1 (who chooses a row) are summarized in Table 1, in which the cost for fighting is taken to be some parameter δ > 1/2.

This game has a unique mixed Nash equilibrium (and thus a unique ESS) in which the fraction p of aggressive birds is given by
Evolutionary Games, Table 1  The hawk-dove game

          H          D
  H    1/2 − δ       1
  D       0         1/2

p = 1/(2δ).

Extension: Evolutionary Stable Sets

Assume that there are two mixed strategies p_i and p_j that have the same performance against each other, i.e., J(p_i, p_j) = J(p_j, p_j). Then neither one of them can be an ESS, even if they are quite robust against other strategies. Now assume that when excluding one of them from the set of mixed strategies, the other one is an ESS. This could imply that different combinations of these two ESS's could coexist and would together be robust to any other mutations. This motivates the following definition of an ESSet (Cressman 2003):

Definition 2. A set E of symmetric Nash equilibria is an evolutionarily stable set (ESSet) if, for all q ∈ E, we have J(q, p) > J(p, p) for all p ∉ E such that J(p, q) = J(q, q).

Properties of ESSets:

(i) For all p and p′ in an ESSet E, we have J(p′, p) = J(p, p).
(ii) If a mixed strategy is an ESS, then the singleton containing that mixed strategy is an ESSet.
(iii) If the ESSet is not a singleton, then there is no ESS.
(iv) If a mixed strategy is in an ESSet, then it is a Nash equilibrium (see Weibull 1995, p. 48, Example 2.7).
(v) Every ESSet is a disjoint union of Nash equilibria.
(vi) A perturbation of a mixed strategy which is in the ESSet can move the system to another mixed strategy in the ESSet. In particular, every ESSet is asymptotically stable for the replicator dynamics (Cressman 2003).

Summary and Future Directions

The entry has provided an overview of the foundations of evolutionary games, which include the ESS (evolutionary stable strategy) equilibrium concept, stronger than the standard Nash equilibrium, and the modeling of the dynamics of the competition through the replicator dynamics. The evolutionary game framework is a first step in linking game theory to evolutionary processes. The payoff of a player is identified as its fitness, i.e., the rate of reproduction. Further development of this mathematical tool is needed for handling hierarchical fitness, i.e., the cases where the individual that interacts cannot be directly identified with the reproduction as it is part of a larger body. For example, the behavior of a blood cell in the human body when interacting with a virus cannot be modeled as directly related to the fitness of the blood cell but rather to that of the human body. A further development of the theory of evolutionary games is needed to define meaningful equilibrium notions and relate them to replication in such contexts.

Cross-References

Dynamic Noncooperative Games
Game Theory: Historical Overview

Recommended Reading

Several books cover evolutionary game theory well. These include Cressman (2003), Hofbauer and Sigmund (1998), Sandholm (2009), Vincent and Brown (2005), and Weibull (1995). In addition, the book The Selfish Gene by Dawkins presents an excellent background on evolution in biology.

Acknowledgments The author has been supported by CONGAS project FP7-ICT-2011-8-317672, see www.congas-project.eu.

Bibliography

Altman E (2008) Semi-linear stochastic difference equations. Discret Event Dyn Syst 19:115–136
396 Experiment Design and Identification for Control

Cressman R (2003) Evolutionary dynamics and extensive form games. MIT, Cambridge
Dawkins R (1976) The selfish gene. Oxford University Press, Oxford
Friedman D (1996) Equilibrium in evolutionary games: some experimental results. Econ J 106:1–25
Hofbauer J, Sigmund K (1998) Evolutionary games and population dynamics. Cambridge University Press, Cambridge/New York
Lotka AJ (1910) Contribution to the theory of periodic reaction. J Phys Chem 14(3):271–274
Maynard Smith J, Price GR (1973) The logic of animal conflict. Nature 246(5427):15–18
Sandholm WH (2009) Population games and evolutionary dynamics. MIT, Cambridge
Shakkottai S, Altman E, Kumar A (2007) Multihoming of users to access points in WLANs: a population game perspective. IEEE J Sel Areas Commun Spec Issue Non-Coop Behav Netw 25(6):1207–1215
Taylor P, Jonker L (1978) Evolutionary stable strategies and game dynamics. Math Biosci 16:76–83
Vincent TL, Brown JS (2005) Evolutionary game theory, natural selection & Darwinian dynamics. Cambridge University Press, Cambridge
Watson HW, Galton F (1875) On the probability of the extinction of families. J Anthropol Inst Great Br 4:138–144
Weibull JW (1995) Evolutionary game theory. MIT, Cambridge

Experiment Design and Identification for Control

Håkan Hjalmarsson
School of Electrical Engineering, ACCESS Linnaeus Center, KTH Royal Institute of Technology, Stockholm, Sweden

Abstract

Understanding the effect of the experiment on the estimation result is a crucial part of system identification. If the experiment is constrained or otherwise fixed, the implied limitations need to be understood; if the experiment can be designed, then, given its fundamental importance, this design freedom should be fully exploited. This entry gives an understanding of how it can be exploited. We also briefly discuss the particulars of identification for model-based control, one of the main applications of system identification.

Keywords

Adaptive experiment design; Application-oriented experiment design; Cramér-Rao lower bound; Crest factor; Experiment design; Fisher information matrix; Identification for control; Least-costly identification; MultiSine; Pseudorandom binary signal (PRBS); Robust experiment design

Introduction

The accuracy of an identified model is governed by:
(i) Information content in the data used for estimation
(ii) The complexity of the model structure
The former is related to the noise properties and the "energy" of the external excitation of the system and how it is distributed. In regard to (ii), a model structure which is not flexible enough to capture the true system dynamics will give rise to a systematic error, while an overly flexible model will be overly sensitive to noise (so-called overfitting). The model complexity is closely associated with the number of parameters used. For a linear model structure with n parameters modeling the dynamics, it follows from the invariance result in Rojas et al. (2009) that to obtain a model for which the variance of the frequency function estimate is less than 1/γ over all frequencies, the signal-to-noise ratio, as measured by input energy over noise variance, must be at least nγ. With energy being power × time and as input power is limited in physical systems, this indicates that the experiment time grows at least linearly with the number of model parameters. When the input energy budget is limited, the only way around this problem is to sacrifice accuracy over certain frequency intervals. The methodology to achieve this in a systematic way is known as experiment design.

Model Quality Measures

The Cramér-Rao bound provides a lower bound on the covariance matrix of the estimation error for an unbiased estimator. With θ̂_N ∈ R^n denoting the parameter estimate (based on N input–output samples) and θ_o the true parameters,

  N E[(θ̂_N − θ_o)(θ̂_N − θ_o)^T] ≥ N I_F^{-1}(θ_o, N)      (1)

where I_F(θ_o, N) ∈ R^{n×n} appearing in the lower bound is the so-called Fisher information matrix (Ljung 1999). For consistent estimators, i.e., when θ̂_N → θ_o as N → ∞, the inequality (1) typically holds asymptotically as the sample size N grows to infinity. The right-hand side in (1) is then replaced by the inverse of the per sample Fisher information I_F(θ_o) := lim_{N→∞} I_F(θ_o, N)/N. An estimator is said to be asymptotically efficient if equality is reached in (1) as N → ∞.

Even though it is possible to reduce the mean-square error by constraining the model flexibility appropriately, it is customary to use consistent estimators since the theory for biased estimators is still not well understood. For such estimators, using some function of the Fisher information as performance measure is natural.

General-Purpose Quality Measures
Over the years a number of "general-purpose" quality measures have been proposed. Perhaps the most frequently used is the determinant of the inverse Fisher information. This represents the volume of confidence ellipsoids for the parameter estimates, and minimizing this measure is known as D-optimal design. Two other criteria relating to confidence ellipsoids are E-optimal design, which uses the length of the longest principal axis (the minimum eigenvalue of I_F) as quality measure, and A-optimal design, which uses the sum of the squared lengths of the principal axes (the trace of I_F^{-1}).

Application-Oriented Quality Measures

When demands are high and/or experimentation resources are limited, it is necessary to tailor the experiment carefully according to the intended use of the model. Below we will discuss a couple of closely related application-oriented measures.

Average Performance Degradation
Let V_app(θ) ≥ 0 be a measure of how well the model corresponding to parameter θ performs when used in the application. In finance, V_app can, e.g., represent the ability to predict the stock market. In process industry, V_app can represent the profit gained using a feedback controller based on the model corresponding to θ. Let us assume that V_app is normalized such that min_θ V_app(θ) = V_app(θ_o) = 0. That V_app has its minimum at the parameters of the true system is quite natural. We will call V_app the application cost. Assuming that the estimator is asymptotically efficient, a second-order Taylor approximation gives that the average application cost can be expressed as (the first-order term vanishes since θ_o is the minimizer of V_app)

  E[V_app(θ̂_N)] ≈ (1/2) E[(θ̂_N − θ_o)^T V_app″(θ_o) (θ̂_N − θ_o)] = (1/(2N)) Tr{V_app″(θ_o) I_F^{-1}(θ_o)}      (2)

This is a generalization of the A-optimal design measure, and its minimization is known as L-optimal design.

Acceptable Performance
Alternatively, one may define a set of acceptable models, i.e., a set of models which will give acceptable performance when used in the application. With a performance degradation measure of the type V_app above, this would be a level set

  E_app = {θ : V_app(θ) ≤ 1/γ}      (3)

for some constant γ > 0. The objective of the experiment design is then to ensure that the

resulting estimate ends up in E_app with high probability.

Design Variables

In an identification experiment there are a number of design variables at the user's disposal. Below we discuss three of the most important ones.

Sampling Interval
For the sampling interval, the general advice from an information-theoretic point of view is to sample as fast as possible (Ljung 1999). However, sampling much faster than the time constants of the system may lead to numerical issues when estimating discrete-time models, as there will be poles close to the unit circle. Downsampling may thus be required.

Feedback
Generally speaking, feedback has three effects from an identification and experiment design point of view:
(i) Not all the power in the input can be used to estimate the system dynamics when a noise model is estimated, as a part of the input signal has to be used for the latter task; see Section 8.1 in Forssell and Ljung (1999). When a very flexible noise model is used, the estimate of the system dynamics then has to rely almost entirely on external excitation.
(ii) Feedback can reduce the effect of disturbances and noise at the output. When there are constraints on the outputs, this allows for larger (input) excitation and therefore more informative experiments.
(iii) The cross-correlation between input and noise/disturbances requires good noise models to avoid biased estimates (Ljung 1999).
Strictly speaking, (i) is only valid when the system and noise models are parametrized separately. Items (i) and (ii) imply that when there are constraints on the input only, then the optimal design is always in open loop, whereas for output-constrained-only problems, the experiment should be conducted in closed loop (Agüero and Goodwin 2007).

External Excitation Signals
The most important design variable is the external excitation, including the length of the experiment. Even for moderate experiment lengths, solving optimal experiment design problems with respect to the entire excitation sequence can be a formidable task. Fortunately, for experiments of reasonable length, the design can be split up in two steps:
(i) First, optimization of the probability density function of the excitation
(ii) Generation of the actual sequence from the obtained density function through a stochastic simulation procedure
More details are provided in section "Computational Issues."

Experimental Constraints

An experiment is always subject to constraints, physical as well as economical. Such constraints are typically translated into constraints on the following signal properties:
(i) Variability. For example, too high a level of excitation may cause the end product to go off-spec, resulting in product waste and associated high costs.
(ii) Frequency content. Often, too harsh movements of the inputs may damage equipment.
(iii) Amplitudes. For example, actuators have limited range, restricting input amplitudes.
(iv) Waveforms. In process industry, it is not uncommon that control equipment limits the type of signals that can be applied. In other applications, it may be physically possible to realize only certain types of excitation. See section "Waveform Generation" for further discussion.
It is also often desired to limit the experiment time so that the process may go back to normal operation, reducing, e.g., cost of personnel. The latter is especially important in the process industry where dynamics are slow. The above type of constraints can be formulated as constraints on

the design variables in section "Design Variables" and associated variables.

Experiment Design Criteria

There are two principal ways to define an optimal experiment design problem:
(i) Best effort. Here the best quality as, e.g., given by one of the quality measures in section "Model Quality Measures" is sought under constraints on the experimental effort and cost. This is the classical problem formulation.
(ii) Least-costly. The cheapest experiment is sought that results in a predefined model quality. Thus, as compared to best-effort design, the optimization criterion and constraint are interchanged. This type of design was introduced by Bombois and coworkers; see Bombois et al. (2006).
As shown in Rojas et al. (2008), the two approaches typically lead to designs only differing by a scaling factor.

Computational Issues

The optimal experiment design problem based on the Fisher information is typically non-convex. For example, consider a finite-impulse response model subject to an experiment of length N with the measured outputs collected in the vector

  Y = Φθ + E,

      ⎡ u(0)      ⋯   u(−(n−1)) ⎤
  Φ = ⎢   ⋮       ⋱       ⋮     ⎥
      ⎣ u(N−1)    ⋯   u(N−n)    ⎦

where E ∈ R^N is zero-mean Gaussian noise with covariance matrix σ² I_{N×N}. Then it holds that

  I_F(θ_o, N) = (1/σ²) Φ^T Φ      (4)

From an experiment design point of view, the input vector u = [u(−(n−1)) ⋯ u(N−1)]^T is the design variable, but with the elements of I_F(θ_o, N) being a quadratic function of the input sequence, all typical quality measures become non-convex.

While various methods for non-convex numerical optimization can be used to solve such problems, they often encounter problems with, e.g., local minima. To address this, a number of techniques have been developed where either the problem is reparametrized so that it becomes convex or a convex approximation is used. The latter technique is called convex relaxation and is often based on a reparametrization as well. We use the example above to provide a flavor of the different techniques.

Reparametrization
If the input is constrained to be periodic so that u(t) = u(t + N), t = −n, …, −1, it follows that the Fisher information is linear in the sample correlations of the input. Using these as design variables instead of u results in all the quality measures referred to above becoming convex functions.

This reparametrization thus results in the two-step procedure discussed in section "External Excitation Signals": First, the sample correlations are obtained from an optimal experiment design problem, and then an input sequence is generated that has this sample correlation. In the second step there is considerable freedom. Notice, however, that since correlations do not directly relate to the actual amplitudes of the resulting signals, it is difficult to incorporate waveform constraints in this approach. On the contrary, variance constraints are easy to incorporate.

Convex Relaxations
There are several approaches to obtain convex relaxations.

Using the per Sample Fisher Information
If the input is a realization of a stationary random process and the sample size N is large enough, I_F(θ_o, N)/N is approximately equal to the per sample Fisher matrix, which only depends on the correlation sequence of the input. Using this approximation, one can now follow the same procedure as in the reparametrization approach and first optimize the input correlation sequence.
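For the finite-impulse response example, the statement that the (per sample) Fisher information is linear in the sample correlations can be made concrete: under the periodicity assumption it is simply a Toeplitz matrix built from r_0, …, r_{n−1}. The following sketch (an illustration, not code from the entry) verifies this numerically for one period of a periodic input:

```python
import numpy as np

rng = np.random.default_rng(1)
n, N, sigma2 = 4, 64, 1.0
u = rng.standard_normal(N)          # one period of a (hypothetical) periodic input

# sample correlations r_k of the periodic input
r = np.array([np.mean(u * np.roll(u, k)) for k in range(n)])

# per sample Fisher information of the FIR(n) model: Toeplitz, hence LINEAR in r
IF_per_sample = np.array([[r[abs(i - j)] for j in range(n)]
                          for i in range(n)]) / sigma2

# empirical check: regressor matrix over one period in periodic steady state
Phi = np.column_stack([np.roll(u, k) for k in range(n)])   # column k holds u(t-k)
print(np.allclose(Phi.T @ Phi / (N * sigma2), IF_per_sample))   # prints True
```

Any quality measure that is convex in I_F can therefore be optimized over the finitely many correlations, e.g., by semidefinite programming, after which a signal realizing those correlations is generated.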

The generation of a stationary signal with a certain correlation is a stochastic realization problem which can be solved using spectral factorization followed by filtering a white noise sequence, i.e., a sequence of independent identically distributed random variables, through the (stable) spectral factor (Jansson and Hjalmarsson 2005).

More generally, it turns out that the per sample Fisher information for linear models/systems only depends on the joint input/noise spectrum (or the corresponding correlation sequence). A linear parametrization of this quantity thus typically leads to a convex problem (Jansson and Hjalmarsson 2005).

The set of all spectra is infinite dimensional, and this precludes a search over all possible spectra. However, since there is a finite-dimensional parametrization of the per sample Fisher information (it is a symmetric n×n matrix), it is also possible to find finite-dimensional sets of spectra that parametrize all possible per sample Fisher information matrices. Multisines with appropriately chosen frequencies are one possibility. However, even though all per sample Fisher information matrices can be generated, the solution may be suboptimal depending on which constraints the problem contains.

The situation for nonlinear problems is conceptually the same, but here the entire probability density function of the stationary process generating the input plays the same role as the spectrum in the linear case. This is a much more complicated object to parametrize.

Lifting
An approach that can deal with amplitude constraints is based on a so-called lifting technique: Introduce the matrix U = uu^T, representing all possible products of the elements of u. This constraint is equivalent to

  ⎡ U    u ⎤              ⎡ U    u ⎤
  ⎢        ⎥ ⪰ 0,   rank  ⎢        ⎥ = 1      (5)
  ⎣ u^T  1 ⎦              ⎣ u^T  1 ⎦

The idea of lifting is now to observe that the Fisher information matrix is linear in the elements of U, and by dropping the rank constraint in (5) a convex relaxation is obtained, where both U and u (subject to the matrix inequality in (5)) are decision variables.

Frequency-by-Frequency Design
An approximation for linear systems that allows frequency-by-frequency design of the input spectrum and feedback is obtained by assuming that the model is of high order. Then the variance of an nth-order estimate, G(e^{iω}, θ̂_N), of the frequency function can approximately be expressed as

  Var G(e^{iω}, θ̂_N) ≈ (n/N) · Φ_v(ω)/Φ_u(ω)      (6)

(▶ System Identification: An Overview) in the open-loop case (there is a closed-loop extension as well), where Φ_u and Φ_v are the input and noise spectra, respectively. Performance measures of the type (2) can then be written as

  ∫_{−π}^{π} W(e^{iω}) · (Φ_v(ω)/Φ_u(ω)) dω

where the weighting W(e^{iω}) ≥ 0 depends on the application. When only variance constraints are present, such problems can be solved frequency by frequency, providing both simple calculations and insight into the design.

Implementation

We have used the notation I_F(θ_o, N) to indicate that the Fisher information typically (but not always) depends on the parameter corresponding to the true system. That the optimal design depends on the to-be-identified system is a fundamental problem in optimal experiment design. There are two basic approaches to address this problem which are covered below. Another important aspect is the choice of waveform for the external excitation signal. This is covered last in this section.

Robust Experiment Design
In robust experiment design, it is assumed that it is known beforehand that the true parameter belongs to some set, i.e., θ_o ∈ Θ. A minimax

approach is then typically taken, finding the experiment that minimizes the worst performance over the set Θ. Such optimization problems are computationally very difficult.

Adaptive Experiment Design
The alternative to robust experiment design is to perform the design adaptively or sequentially, meaning that first a design is performed based on some initial "guess" of the true parameter, and then, as samples are collected, the design is revised taking advantage of the data information. Interestingly, the convergence rate of the parameter estimate is typically sufficiently fast that for this approach the asymptotic distribution is the same as for the design based on the true model parameter (Hjalmarsson 2009).

Waveform Generation
We have argued above that it is the spectrum of the excitation (together with the feedback) that determines the achieved model accuracy in the linear time-invariant case. In section "Using the per Sample Fisher Information" we argued that a signal with a particular spectrum can be obtained by filtering a white noise sequence through a stable spectral factor of the desired spectrum. However, we have also argued in section "Experimental Constraints" that particular applications may require particular waveforms. We will here elaborate further on how to generate a waveform with desired characteristics.

From an accuracy point of view, there are two general issues that should be taken into account when the waveform is selected:
• Persistence of excitation. A signal with a spectrum having n nonzero frequencies (on the interval (−π, π]) can be used to estimate at most n parameters. Thus, if, as is typically the case, there is uncertainty regarding which model structure to use before the experiment, one has to ensure that a sufficient number of frequencies is excited.
• The crest factor. For all systems, the maximum input amplitude, say A, is constrained. To deal with this from an experiment design point of view, it is convenient to introduce what is called the crest factor of a signal:

  Cr² = max_t u²(t) / ( lim_{N→∞} (1/N) Σ_{t=1}^{N} u²(t) )

The crest factor is thus the ratio between the squared maximum amplitude and the power of the signal. Now, for a class of signal waveforms with a given crest factor, the input power that can be used is upper-bounded by

  lim_{N→∞} (1/N) Σ_{t=1}^{N} u²(t) ≤ A²/Cr²      (7)

However, the power is the integral of the signal spectrum, and since increasing the amplitude of the input signal spectrum will increase a model's accuracy, cf. (6), it is desirable to use as much signal power as possible. By (7) we see that this means that waveforms with low crest factor should be used.

A lower bound for the crest factor is readily seen to be 1. This bound is achieved for binary symmetric signals. Unfortunately, there exists no systematic way to design a binary sequence that has a prescribed spectrum. However, the so-called arcsine law may be used. It states that the sign of a zero-mean Gaussian process with correlation sequence r_τ gives a binary signal having correlation sequence r̃_τ = (2/π) arcsin(r_τ). With r̃_τ given, one can try to solve this relation for the corresponding r_τ.

A crude, but often sufficient, method to generate binary sequences with desired spectral content is based on the use of pseudorandom binary signals (PRBS). Such signals (which are generated by a shift register) are periodic signals which have correlation sequences similar to random white noise, i.e., a flat spectrum. By resampling such sequences, the spectrum can be modified. It should be noted that binary sequences are less attractive when it comes to identifying nonlinearities. This is easy to understand by considering a static system. If only one amplitude of the input is used, it will be impossible to determine whether the system is nonlinear or not.
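The arcsine law is easy to verify numerically. The sketch below (an illustration with an arbitrarily chosen AR(1) correlation; not code from the entry) clips a Gaussian process to a binary signal, which attains the crest factor lower bound of 1, and compares the lag-1 correlation of the clipped signal with (2/π) arcsin(r_1):

```python
import numpy as np

rng = np.random.default_rng(2)
N, a = 100_000, 0.6            # sample count; AR(1) pole, so corr(x_t, x_{t-1}) = a

# zero-mean, unit-variance Gaussian AR(1) process
e = rng.standard_normal(N)
c = np.sqrt(1 - a * a)
x = np.empty(N)
x[0] = e[0]
for t in range(1, N):
    x[t] = a * x[t - 1] + c * e[t]

b = np.sign(x)                 # binary signal obtained by clipping

# crest factor: squared max amplitude over power; equals 1 for a binary signal
crest2 = np.max(b ** 2) / np.mean(b ** 2)

# arcsine law: correlation of sign(x) is (2/pi) * arcsin(correlation of x)
lag1 = np.mean(b[1:] * b[:-1])
print(crest2, lag1, 2 / np.pi * np.arcsin(a))
```

The empirical lag-1 correlation comes out close to (2/π) arcsin(0.6) ≈ 0.41, so a target binary correlation r̃ can be realized by pre-warping the Gaussian correlation to r = sin(π r̃/2), as suggested in the text.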

A PRBS is a periodic signal and can therefore be split into its Fourier terms. With a period of M, each such term corresponds to one frequency on the grid 2πk/M, k = 0, …, M − 1. Such a signal can thus be used to estimate at most M parameters. Another way to generate a signal with period M is to add sinusoids corresponding to the above frequencies, with desired amplitudes. A periodic signal generated in this way is commonly referred to as a MultiSine. The crest factor of a multisine depends heavily on the relation between the phases of the sinusoids; if, for example, all phases are set to zero, the squared crest factor of an equal-amplitude multisine equals two times the number of sinusoids. It is possible to optimize the crest factor with respect to the choice of phases (Rivera et al. 2009). There exist also simple deterministic methods for choosing phases that give a good crest factor, e.g., Schroeder phasing. Alternatively, phases can be drawn randomly and independently, giving what is known as random-phase multisines (Pintelon and Schoukens 2012), a family of random signals with properties similar to Gaussian signals. Periodic signals have some useful features:
• Estimation of nonlinearities. A linear time-invariant system responds to a periodic input signal with a signal consisting of the same frequencies, but with different amplitudes and phases. Thus, it can be concluded that the system is nonlinear if the output contains other frequencies than the input. This can be explored in a systematic way to estimate also the nonlinear part of a system.
• Estimation of noise variance. For a linear time-invariant system, the difference in the output between different periods is due entirely to the noise if the system is in steady state. This can be used to devise simple methods to estimate the noise level.
• Data compression. By averaging measurements over different periods, the noise level can be reduced at the same time as the number of measurements is reduced.
Further details on waveform generation and general-purpose signals useful in system identification can be found in Pintelon and Schoukens (2012) and Ljung (1999).

Implications for the Identification Problem Per Se

In order to get some understanding of how optimal experimental conditions influence the identification problem, let us return to the finite-impulse response model example in section "Computational Issues." Consider a least-costly setting with an acceptable performance constraint. More specifically, we would like to use the minimum input energy that ensures that the parameter estimate ends up in a set of the type (3). An approximate solution to this is that a 99 % confidence ellipsoid for the resulting estimate is contained in E_app. Now, it can be shown that a confidence ellipsoid is a level set for the average least-squares cost E[V_N(θ)] = E[‖Y − Φθ‖²] = ‖θ − θ_o‖²_{Φ^TΦ} + σ². Assuming the application cost V_app also is quadratic in θ, it follows after a little bit of algebra (see Hjalmarsson 2009) that it must hold that

  E[V_N(θ)] ≥ σ² (1 + c V_app(θ)),  for all θ      (8)

for a constant c that is not important for our discussion. The value of E[V_N(θ)] = ‖θ − θ_o‖²_{Φ^TΦ} + σ² is determined by how large the weighting Φ^TΦ is, which in turn depends on how large the input u is. In a least-costly setting with the energy ‖u‖² as criterion, the best solution would be that we have equality in (8). Thus we see that optimal experiment design tries to shape the identification criterion after the application cost. We have the following implications of this result:
(i) Perform identification under appropriate scaling of the desired operating conditions. Suppose that V_app(θ) is a function of how the system outputs deviate from a desired trajectory (determined by θ_o). Performing an experiment which follows the desired trajectory then gives that the sum of the squared prediction errors is an approximation of V_app(θ), at least for parameters close to θ_o. Obtaining equality in (8) typically requires an additional scaling

of the input excitation or the length of the experiment. The result is intuitively appealing: The desired operating conditions should reveal the system properties that are important in the application.
(ii) Identification cost for application performance. We see that the required energy grows (almost) linearly with γ, which is a measure of how close to the ideal performance (using the true parameter θ_o) we want to come. Furthermore, it is typical that as the performance requirements in the application increase, the sensitivity to model errors increases. This means that V_app(θ) increases, which thus in turn means that the identification cost increases. In summary, the identification cost will be higher, the higher the performance that is required in the application. The inequality (8) can be used to quantify this relationship.
(iii) Model structure sensitivity. As V_app will be sensitive to system properties important for the application, while insensitive to system properties of little significance, with the identification criterion V_N matched to V_app, it is only necessary that the model structure is able to model the important properties of the system.
In any case, whatever model structure is used, the identified model will be the best possible in that structure for the intended application. This is very different from an arbitrary experiment, where it is impossible to control the model fit when a model of restricted complexity is used.
We conclude that optimal experiment design simplifies the overall system identification problem.

Identification for Control

Model-based control is one of the most important applications of system identification. Robust control ensures performance and stability in the presence of model uncertainty. However, the majority of such design methods do not employ the parametric ellipsoidal uncertainty sets resulting from standard system identification. In fact, only in the last decade have analysis and design tools for such types of model uncertainty started to emerge, e.g., Raynaud et al. (2000) and Gevers et al. (2003).

The advantages of matching the identification criterion to the application have long been recognized in this line of research. For control applications this typically implies that the identification experiment should be performed under the same closed-loop operating conditions as the controller to be designed. This was perhaps first recognized in the context of minimum variance control (see Gevers and Ljung 1986), where variance errors were the concern. Later on this was recognized to be the case also for the bias error, although here pre-filtering can be used to achieve the same objective.

To account for the fact that the controller to be designed is not available, techniques where control and identification are iterated have been developed, cf. adaptive experiment design in section "Adaptive Experiment Design." Convergence of such schemes has been established when the true system is in the model set but has proved out of reach for models of restricted complexity.

In recent years, techniques integrating experiment design and model predictive control have started to appear. A general-purpose design criterion is used in Rathouský and Havlena (2013), while Larsson et al. (2013) uses an application-oriented criterion.

Summary and Future Directions

When there is the "luxury" to design the experiment, this opportunity should be seized by the user. Without informative data there is little that can be done. In this exposé we have outlined the techniques that exist but also emphasized that a well-conceived experiment, reflecting the intended application, can significantly simplify the overall system identification problem.

Further developments of computational techniques are high on the agenda, e.g., how to handle

time-domain constraints and nonlinear models. To this end, developments in optimization methods are rapidly being incorporated. While, as reported in Hjalmarsson (2009), there are some results on how the identification cost depends on the performance requirements in the application, further understanding of this issue is highly desirable. Theory and further development of the emerging model predictive control schemes equipped with experiment design may very well be the direction that will have most impact in practice.

Cross-References

▶ System Identification: An Overview

Recommended Reading

A classical text on optimal experiment design is Fedorov (1972). The textbooks Goodwin and Payne (1977) and Zarrop (1979) cover this theory adapted to a dynamical system framework. A general overview is provided in Pronzato (2008). A semi-definite programming framework based on the per sample Fisher information is provided in Jansson and Hjalmarsson (2005). The least-costly framework is covered in Bombois et al. (2006). The lifting technique was introduced for input design in Manchester (2010). Details of the frequency-by-frequency design approach can be found in Ljung (1999). References to robust and adaptive experiment design can be found in Pronzato (2008) and Hjalmarsson (2009). For an account of the implications of optimal experiment design for

Bibliography

Agüero JC, Goodwin GC (2007) Choosing between open and closed loop experiments in linear system identification. IEEE Trans Autom Control 52(8):1475–1480
Bombois X, Scorletti G, Gevers M, Van den Hof PMJ, Hildebrand R (2006) Least costly identification experiment for control. Automatica 42(10):1651–1662
Fedorov VV (1972) Theory of optimal experiments. Probability and mathematical statistics, vol 12. Academic, New York
Forssell U, Ljung L (1999) Closed-loop identification revisited. Automatica 35:1215–1241
Gevers M (2005) Identification for control: from the early achievements to the revival of experiment design. Eur J Control 11(4–5):335–352. Semi-plenary lecture at IEEE conference on decision and control – European control conference
Gevers M, Bombois X, Codrons B, Scorletti G, Anderson BDO (2003) Model validation for control and controller validation in a prediction error identification framework – part I: theory. Automatica 39(3):403–445
Gevers M, Ljung L (1986) Optimal experiment designs with respect to the intended model application. Automatica 22(5):543–554
Goodwin GC, Payne RL (1977) Dynamic system identification: experiment design and data analysis. Academic, New York
Hjalmarsson H (2005) From experiment design to closed loop control. Automatica 41(3):393–438
Hjalmarsson H (2009) System identification of complex and structured systems. Eur J Control 15(4):275–310. Plenary address. European control conference
Jansson H, Hjalmarsson H (2005) Input design via LMIs admitting frequency-wise model specifications in confidence regions. IEEE Trans Autom Control 50(10):1534–1549
Larsson CA, Hjalmarsson H, Rojas CR, Bombois X, Mesbah A, Modén P-E (2013) Model predictive control with integrated experiment design for output error systems. In: European control conference, Zurich
Ljung L (1999) System identification: theory for the user, 2nd edn. Prentice-Hall, Englewood Cliffs
Manchester IR (2010) Input design for system identification via convex relaxation. In: 49th IEEE conference on decision and control, Atlanta, pp 2041–2046
Pintelon R, Schoukens J (2012) System identification:
the system identification problem as a whole, a frequency domain approach, 2nd edn. Wiley/IEEE,
see Hjalmarsson (2009). Thorough accounts of Hoboken/Piscataway
the developments in identification for control Pronzato L (2008) Optimal experimental design and some
are provided in Hjalmarsson (2005) and Gevers related control problems. Automatica 44(2):303–325
Rathouský J, Havlena V (2013) MPC-based appro-
(2005). ximate dual controller by information matrix max-
imization. Int J Adapt Control Signal Process
Acknowledgments This work was supported by the 27(11):974–999
European Research Council under the advanced grant Raynaud HF, Pronzato L, Walter E (2000) Robust iden-
LEARN, contract 267381, and by the Swedish Research tification and control based on ellipsoidal parametric
Council, contract 621-2009-4017. uncertainty descriptions. Eur J Control 6(3):245–255
Rivera DE, Lee H, Mittelmann HD, Braun MW (2009) Constrained multisine input signals for plant-friendly identification of chemical process systems. J Process Control 19(4):623–635
Rojas CR, Agüero JC, Welsh JS, Goodwin GC (2008) On the equivalence of least costly and traditional experiment design for control. Automatica 44(11):2706–2715
Rojas CR, Welsh JS, Agüero JC (2009) Fundamental limitations on the variance of parametric models. IEEE Trans Autom Control 54(5):1077–1081
Zarrop M (1979) Optimal experiment design for dynamic system identification. Lecture notes in control and information sciences, vol 21. Springer, Berlin

Explicit Model Predictive Control

Alberto Bemporad
IMT Institute for Advanced Studies Lucca, Lucca, Italy

Abstract

Model predictive control (MPC) has been used in the process industries for more than 30 years because of its ability to control multivariable systems in an optimized way under constraints on input and output variables. Traditionally, MPC requires the solution of a quadratic program (QP) online to compute the control action, often restricting its applicability to slow processes. Explicit MPC completely removes the need for online solvers by precomputing the control law off-line, so that online operations reduce to a simple function evaluation. Such a function is piecewise affine in most cases, so that the MPC controller is equivalently expressed as a lookup table of linear gains, a form that is extremely easy to code, requires only basic arithmetic operations, and requires a maximum number of iterations that can be exactly computed a priori.

Keywords

Constrained control; Embedded optimization; Model predictive control; Multiparametric programming; Quadratic programming

Introduction

Model predictive control (MPC) is a well-known methodology for synthesizing feedback control laws that optimize closed-loop performance subject to prespecified operating constraints on inputs, states, and outputs (Borrelli et al. 2011; Mayne and Rawlings 2009). In MPC, the control action is obtained by solving a finite-horizon open-loop optimal control problem at each sampling instant. Each optimization yields a sequence of optimal control moves, but only the first move is applied to the process: at the next time step, the computation is repeated over a shifted time horizon by taking the most recently available state information as the new initial condition of the new optimal control problem. For this reason, MPC is also called "receding horizon control." In most practical applications, MPC is based on a linear discrete-time time-invariant model of the controlled system and quadratic penalties on tracking errors and actuation efforts; in such a formulation, the optimal control problem can be recast as a quadratic programming (QP) problem, whose linear cost term and right-hand side of the constraints depend on a vector of parameters that may change from one step to another (such as the current state and reference signals). To enable the implementation of MPC in real industrial products, a QP solution method must be embedded in the control hardware. The method must be fast enough to provide a solution within short sampling intervals and must require simple hardware, limited memory to store the data defining the optimization problem and the code implementing the algorithm itself, a simple program code, and good worst-case estimates of the execution time to meet real-time system requirements.

Several online solution algorithms have been studied for embedding quadratic optimization in control hardware, such as active-set methods (Ricker 1985), interior-point methods (Wang and Boyd 2010), and fast gradient projection methods (Patrinos and Bemporad 2014). Explicit MPC takes a different approach to meeting the above requirements: multiparametric quadratic programming is used to pre-solve the QP
off-line, thereby converting the MPC law into a continuous and piecewise-affine function of the parameter vector (Bemporad et al. 2002b). We review the main ideas of explicit MPC in the next section, referring the reader to Alessio and Bemporad (2009) for a more complete survey of explicit MPC.

Model Predictive Control Problem

Consider the following finite-time optimal control problem formulation for MPC:

V^*(x) = \min_z \; \ell_N(x_N) + \sum_{k=0}^{N-1} \ell(x_k, u_k)    (1a)
s.t.  x_{k+1} = A x_k + B u_k    (1b)
      C_x x_k + C_u u_k \le c,  k = 0, \ldots, N-1    (1c)
      C_N x_N \le c_N    (1d)
      x_0 = x    (1e)

where N is the prediction horizon; x ∈ R^m is the current state vector of the controlled system; u_k ∈ R^{n_u} is the vector of manipulated variables at prediction time k, k = 0, ..., N−1; z ≜ [u_0' ... u_{N−1}']' ∈ R^n, n ≜ n_u N, is the vector of decision variables to be optimized;

\ell(x, u) = \tfrac{1}{2} x'Qx + \tfrac{1}{2} u'Ru    (2a)
\ell_N(x) = \tfrac{1}{2} x'Px    (2b)

are the stage cost and terminal cost, respectively; Q, P are symmetric and positive-semidefinite matrices; and R is a symmetric and positive-definite matrix.

Let n_c ∈ N be the number of constraints imposed at prediction time k = 0, ..., N−1, namely, C_x ∈ R^{n_c×m}, C_u ∈ R^{n_c×n_u}, c ∈ R^{n_c}, and let n_N be the number of terminal constraints, namely, C_N ∈ R^{n_N×m}, c_N ∈ R^{n_N}. The total number q of linear inequality constraints imposed in the MPC problem formulation (1) is q = N n_c + n_N.

By eliminating the states x_k = A^k x + \sum_{j=0}^{k-1} A^j B u_{k-1-j} from problem (1), the optimal control problem (1) can be expressed as the convex quadratic program (QP)

V^*(x) \triangleq \min_z \; \tfrac{1}{2} z'Hz + x'F'z + \tfrac{1}{2} x'Yx    (3a)
s.t.  Gz \le W + Sx    (3b)

where H = H' ∈ R^{n×n} is the Hessian matrix; F ∈ R^{n×m} defines the linear term of the cost function; Y ∈ R^{m×m} has no influence on the optimizer, as it only affects the optimal value of (3a); and the matrices G ∈ R^{q×n}, S ∈ R^{q×m}, W ∈ R^q define in compact form the constraints imposed in (1). Because of the assumptions made on the weight matrices Q, R, P, the matrix H is positive definite and the matrix [H F'; F Y] is positive semidefinite.

The MPC control law is defined by setting

u(x) = [I \; 0 \; \cdots \; 0] \, z(x)    (4)

where z(x) is the optimizer of the QP (3) for the current value of x and I is the identity matrix of dimension n_u × n_u.

Multiparametric Solution

Rather than using a numerical QP solver online to compute the optimizer z(x) of (3) for each given current state vector x, the basic idea of explicit MPC is to pre-solve the QP off-line for the entire set of states x (or for a convex polyhedral subset X ⊆ R^m of interest) to get the optimizer function z, and therefore the MPC control law u, explicitly as a function of x.

The main tool for obtaining such an explicit solution is multiparametric quadratic programming (mpQP). For mpQP problems of the form (3), Bemporad et al. (2002b) proved that the optimizer function z : X_f → R^n is piecewise affine and continuous over the set X_f of parameters x for which the problem is feasible (X_f is a polyhedral set, possibly X_f = X) and that the value function V^* : X_f → R associating with every x ∈ X_f the corresponding optimal value of (3) is continuous, convex, and piecewise quadratic.

An immediate corollary is that the explicit version of the MPC control law u in (4), being given by the first n_u components of the vector z(x), is also a continuous and piecewise-affine state-feedback law defined over a partition of the set X_f of states into M polyhedral cells:

u(x) = \begin{cases} F_1 x + g_1 & \text{if } H_1 x \le K_1 \\ \quad \vdots & \\ F_M x + g_M & \text{if } H_M x \le K_M \end{cases}    (5)

Explicit Model Predictive Control, Fig. 1 Explicit MPC solution for the double integrator example

An example of such a partition is depicted in Fig. 1. The explicit representation (5) maps the MPC law (4) into a lookup table of linear gains, meaning that for each given x, the values computed by solving the QP (3) online and those obtained by evaluating (5) are exactly the same.

Multiparametric QP Algorithms
A few algorithms have been proposed in the literature to solve the mpQP problem (3). All of them construct the solution by exploiting the Karush-Kuhn-Tucker (KKT) conditions for optimality:

Hz + Fx + G'\lambda = 0    (6a)
\lambda_i (G^i z - W^i - S^i x) = 0, \quad \forall i = 1, \ldots, q    (6b)
Gz \le W + Sx    (6c)
\lambda \ge 0    (6d)

where λ ∈ R^q is the vector of Lagrange multipliers. For the strictly convex QP (3), conditions (6) are necessary and sufficient to characterize optimality.

An mpQP algorithm starts by fixing an arbitrary starting parameter vector x_0 ∈ R^m (e.g., the origin x_0 = 0), solving the QP (3) to get the optimal solution z(x_0), and identifying the subset

\tilde{G} z(x) = \tilde{S} x + \tilde{W}    (7a)

of all constraints (6c) that are active at z(x_0), together with the remaining inactive constraints

\hat{G} z(x) \le \hat{S} x + \hat{W}    (7b)
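The condensation step above, which eliminates the states to produce the QP matrices of (3), is easy to sketch numerically. The following minimal illustration is not code from the entry; the double-integrator data, horizon, and input bounds are illustrative choices:

```python
import numpy as np

# Double-integrator model and weights (illustrative values)
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
Q = np.eye(2); R = np.array([[1.0]]); P = np.eye(2)   # stage/terminal weights
N = 3                       # prediction horizon
m, nu = 2, 1
n = nu * N                  # decision variables z = [u_0; ...; u_{N-1}]

# Prediction matrices: [x_1; ...; x_N] = Phi x + Gam z
Phi = np.vstack([np.linalg.matrix_power(A, k + 1) for k in range(N)])
Gam = np.zeros((m * N, n))
for i in range(N):
    for j in range(i + 1):
        Gam[m*i:m*(i+1), nu*j:nu*(j+1)] = np.linalg.matrix_power(A, i - j) @ B

# Condensed cost 1/2 z'Hz + x'F'z + 1/2 x'Yx, cf. (3a)
Qbar = np.kron(np.eye(N), Q); Qbar[-m:, -m:] = P
Rbar = np.kron(np.eye(N), R)
H = Gam.T @ Qbar @ Gam + Rbar
F = Gam.T @ Qbar @ Phi
Y = Phi.T @ Qbar @ Phi

# Input bounds -1 <= u_k <= 1 written as G z <= W + S x, cf. (3b)
G = np.vstack([np.eye(n), -np.eye(n)])
W = np.ones(2 * n)
S = np.zeros((2 * n, m))

# H is positive definite since R > 0, as stated after (3)
assert np.allclose(H, H.T) and np.all(np.linalg.eigvalsh(H) > 0)

# Where no constraint is active, the optimizer is the affine law z(x) = -H^{-1} F x;
# by (4), u(x) is its first nu components
x = np.array([0.1, 0.0])
z = -np.linalg.solve(H, F @ x)
assert np.all(G @ z <= W + S @ x)   # bounds indeed inactive for this small x
print("u(x) =", z[:nu])
```

For such a small state, the unconstrained minimizer satisfies all constraints, so it coincides with the QP optimizer; this is exactly the interior critical region of the multiparametric solution discussed next.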
Correspondingly, in view of the complementarity condition (6b), the vector of Lagrange multipliers is split into two subvectors:

\tilde{\lambda}(x) \ge 0    (8a)
\hat{\lambda}(x) = 0    (8b)

We assume for simplicity that the rows of \tilde{G} are linearly independent. From (6a), we have the relation

z(x) = -H^{-1}(Fx + \tilde{G}'\tilde{\lambda}(x))    (9)

which, when substituted into (7a), provides

\tilde{\lambda}(x) = -\tilde{M}(\tilde{W} + (\tilde{S} + \tilde{G}H^{-1}F)x)    (10)

where \tilde{M} = (\tilde{G}H^{-1}\tilde{G}')^{-1} and, by substitution in (9),

z(x) = H^{-1}\big(\tilde{G}'\tilde{M}\tilde{W} + (\tilde{G}'\tilde{M}(\tilde{S} + \tilde{G}H^{-1}F) - F)x\big)    (11)

The solution z(x) provided by (11) is the correct one for all vectors x such that the chosen combination of active constraints remains optimal. All such vectors x are identified by imposing constraints (7b) and (8a) on z(x) and \tilde{\lambda}(x), respectively, which leads to constructing the polyhedral set ("critical region")

CR_0 = \{x \in R^m : \tilde{\lambda}(x) \ge 0, \; \hat{G}z(x) \le \hat{W} + \hat{S}x\}    (12)

Different mpQP solvers have been proposed to cover the rest X \setminus CR_0 of the parameter set with other critical regions corresponding to new combinations of active constraints. The most efficient methods exploit the so-called "facet-to-facet" property of the multiparametric solution (Spjøtvold et al. 2006) to identify neighboring regions, as in Tøndel et al. (2003a) and Baotić (2002). Alternative methods were proposed in Jones and Morari (2006), based on viewing (6) as a multiparametric linear complementarity problem, and in Patrinos and Sarimveis (2010), which provides algorithms for determining all neighboring regions even in the case that the facet-to-facet property does not hold.

All methods handle the case of degeneracy, which may occur for combinations of active constraints that are linearly dependent, that is, when the associated matrix \tilde{G} does not have full row rank (in this case, \tilde{\lambda}(x) may not be uniquely defined).

Extensions

The explicit approach described earlier can be extended to the following MPC setting:

\min_z \sum_{k=0}^{N-1} \tfrac{1}{2}(y_k - r_k)'Q_y(y_k - r_k) + \tfrac{1}{2}\Delta u_k' R_\Delta \Delta u_k + \tfrac{1}{2}(u_k - u_k^r)'R(u_k - u_k^r) + \rho_\epsilon \epsilon^2    (13a)
s.t.  x_{k+1} = A x_k + B u_k + B_v v_k    (13b)
      y_k = C x_k + D_u u_k + D_v v_k    (13c)
      u_k = u_{k-1} + \Delta u_k,  k = 0, \ldots, N-1    (13d)
      \Delta u_k = 0,  k = N_u, \ldots, N-1    (13e)
      u_k^{\min} \le u_k \le u_k^{\max},  k = 0, \ldots, N_u-1    (13f)
      \Delta u_k^{\min} \le \Delta u_k \le \Delta u_k^{\max},  k = 0, \ldots, N_u-1    (13g)
      y_k^{\min} - \epsilon V_{\min} \le y_k \le y_k^{\max} + \epsilon V_{\max},  k = 0, \ldots, N_c-1    (13h)

where R_\Delta is a symmetric and positive-definite matrix; the matrices Q_y and R are symmetric and positive semidefinite; v_k is a vector of measured disturbances; y_k is the output vector; r_k is its corresponding reference to be tracked; \Delta u_k is the vector of input increments; u_k^r is the input reference; u_k^{\min}, u_k^{\max}, \Delta u_k^{\min}, \Delta u_k^{\max}, y_k^{\min}, y_k^{\max} are bounds; and N, N_u, N_c are, respectively, the prediction, control, and constraint horizons. The extra variable \epsilon is introduced to soften the output constraints, penalized by the (usually large) weight \rho_\epsilon in the cost function (13a).

Everything marked in boldface in (13), together with the command input u_{-1} applied at the previous sampling step and the current state x, can be treated as a parameter with respect to
which to solve the mpQP problem and obtain the explicit form of the MPC controller. For example, for a tracking problem with no anticipative action (r_k ≡ r_0, ∀k = 0, ..., N−1), no measured disturbance, and fixed upper and lower bounds, the explicit solution is a continuous piecewise-affine function of the parameter vector [x' r_0' u_{−1}']'. Note that prediction models and/or weight matrices in (13) cannot be treated as parameters if the mpQP formulation (3) is to be maintained.

Linear MPC Based on Convex Piecewise-Affine Costs
A similar approach can be followed for MPC problems based on linear prediction models and convex piecewise-affine costs, such as 1- and ∞-norms. In this case, the MPC problem is mapped into a multiparametric linear programming (mpLP) problem, whose solution is again continuous and piecewise affine with respect to the vector of parameters. For details, see Bemporad et al. (2002a).

Robust MPC
Explicit solutions to min-max MPC problems that provide robustness with respect to additive and/or multiplicative unknown-but-bounded uncertainty were proposed in Bemporad et al. (2003), based on a combination of mpLP and dynamic programming. Again, the solution is piecewise affine with respect to the state vector.

Hybrid MPC
An MPC formulation based on 1- or ∞-norms and hybrid dynamics expressed in mixed logical dynamical (MLD) form can be solved explicitly by treating the optimization problem associated with MPC as a multiparametric mixed-integer linear programming (mpMILP) problem. The solution is still piecewise affine but may be discontinuous, due to the presence of binary variables (Bemporad et al. 2000). A better approach, based on dynamic programming combined with mpLP (or mpQP), was proposed in Borrelli et al. (2005) for hybrid systems in piecewise-affine (PWA) dynamical form with linear (or quadratic) costs.

Applicability of Explicit MPC

Complexity of the Solution
The complexity of the solution is given by the number M of regions that form the explicit solution (5). This number dictates the amount of memory needed to store the parametric solution (F_i, g_i, H_i, K_i, i = 1, ..., M) and the worst-case execution time required to compute F_i x + g_i once the problem of identifying the index i of the region {x : H_i x ≤ K_i} containing the current state x is solved (which usually takes most of the time). The latter is called the "point location problem," and a few methods have been proposed to solve it more efficiently than searching linearly through the list of regions (see, e.g., the tree-based approach of Tøndel et al. 2003b).

An upper bound on M is 2^q, the number of all possible combinations of active constraints. In practice, M is much smaller than 2^q, as most combinations are never active at optimality for any of the vectors x (e.g., lower and upper limits on an actuation signal cannot be active at the same time, unless they coincide). Moreover, regions in which the first n_u components of the multiparametric solution z(x) are the same can be joined together, provided that their union is a convex set (an optimal merging algorithm was proposed by Geyer et al. (2008) to get a minimal number M of partitions). Nonetheless, the complexity of the explicit MPC law typically grows exponentially with the number q of constraints. The number m of parameters is less critical and mainly affects the number of elements to be stored in memory (i.e., the number of columns of the matrices F_i, H_i). The number n of free variables also affects the number M of regions, mainly because the variables are usually upper and lower bounded.

Computer-Aided Tools
The Model Predictive Control Toolbox (Bemporad et al. 2014) has offered functions for designing explicit MPC controllers in MATLAB since 2014. Other tools exist, such as the Hybrid Toolbox (Bemporad 2003) and the Multi-Parametric Toolbox (Kvasnica et al. 2006).
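The online part of an explicit MPC controller, point location over the polyhedral partition followed by one affine-gain computation as in (5), can be sketched as follows. The three-region, one-dimensional partition below (a clipped linear gain) is invented purely for illustration:

```python
import numpy as np

def eval_explicit_mpc(x, regions):
    """Evaluate the PWA law (5): linear-search point location over the regions
    {x : H_i x <= K_i}, then one affine-gain evaluation u = F_i x + g_i."""
    for Hi, Ki, Fi, gi in regions:
        if np.all(Hi @ x <= Ki + 1e-9):   # region membership test
            return Fi @ x + gi
    raise ValueError("x lies outside the feasible set X_f")

# Hand-made partition of X_f = [-2, 2] realizing u(x) = clip(-x, -1, 1):
regions = [
    (np.array([[1.0], [-1.0]]), np.array([1.0, 1.0]),    # -1 <= x <= 1
     np.array([[-1.0]]), np.array([0.0])),               # u = -x
    (np.array([[1.0], [-1.0]]), np.array([2.0, -1.0]),   #  1 <= x <= 2
     np.array([[0.0]]), np.array([-1.0])),               # u = -1 (saturated)
    (np.array([[1.0], [-1.0]]), np.array([-1.0, 2.0]),   # -2 <= x <= -1
     np.array([[0.0]]), np.array([1.0])),                # u = +1 (saturated)
]

print(eval_explicit_mpc(np.array([0.5]), regions))   # -> [-0.5]
print(eval_explicit_mpc(np.array([1.5]), regions))   # -> [-1.]
```

The linear search here is the slow part alluded to in the text; real implementations replace it with structured searches such as the binary search tree of Tøndel et al. (2003b), while the per-region memory footprint (H_i, K_i, F_i, g_i) is what the region count M dictates.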
Summary and Future Directions

Explicit MPC is a powerful tool for converting an MPC design into an equivalent control law that can be implemented as a lookup table of linear gains. Whether the explicit form is preferable to solving the QP problem online depends on the available CPU time, data memory, program memory, and other practical considerations. Although suboptimal methods have been proposed to reduce the complexity of the control law, the explicit MPC approach remains convenient mainly for relatively small problems (such as one or two command inputs, short control and constraint horizons, and up to about ten states). For larger problems, and/or problems that are linear time varying, online QP solution methods tailored to embedded MPC may be preferable.

Cross-References

▶ Model-Predictive Control in Practice
▶ Nominal Model-Predictive Control
▶ Optimization Algorithms for Model Predictive Control

Recommended Reading

For getting started in explicit MPC, we recommend reading the paper by Bemporad et al. (2002b) and the survey paper by Alessio and Bemporad (2009). Hands-on experience using one of the MATLAB tools listed above is also useful for fully appreciating the potential and limitations of explicit MPC. For understanding how to program a good multiparametric QP solver, the reader is recommended to follow the approach of Tøndel et al. (2003a) and Spjøtvold et al. (2006) or, alternatively, of Patrinos and Sarimveis (2010) or Jones and Morari (2006).

Bibliography

Alessio A, Bemporad A (2009) A survey on explicit model predictive control. In: Magni L, Raimondo DM, Allgower F (eds) Nonlinear model predictive control: towards new challenging applications. Lecture notes in control and information sciences, vol 384. Springer, Berlin/Heidelberg, pp 345–369
Baotić M (2002) An efficient algorithm for multi-parametric quadratic programming. Tech. Rep. AUT02-05, Automatic Control Institute, ETH, Zurich
Bemporad A (2003) Hybrid toolbox – user's guide. http://cse.lab.imtlucca.it/~bemporad/hybrid/toolbox
Bemporad A, Borrelli F, Morari M (2000) Piecewise linear optimal controllers for hybrid systems. In: Proceedings of the American control conference, Chicago, pp 1190–1194
Bemporad A, Borrelli F, Morari M (2002a) Model predictive control based on linear programming – the explicit solution. IEEE Trans Autom Control 47(12):1974–1985
Bemporad A, Morari M, Dua V, Pistikopoulos E (2002b) The explicit linear quadratic regulator for constrained systems. Automatica 38(1):3–20
Bemporad A, Borrelli F, Morari M (2003) Min-max control of constrained uncertain discrete-time linear systems. IEEE Trans Autom Control 48(9):1600–1606
Bemporad A, Morari M, Ricker N (2014) Model predictive control toolbox for MATLAB – user's guide. The MathWorks, Inc. http://www.mathworks.com/access/helpdesk/help/toolbox/mpc/
Borrelli F, Baotić M, Bemporad A, Morari M (2005) Dynamic programming for constrained optimal control of discrete-time linear hybrid systems. Automatica 41(10):1709–1721
Borrelli F, Bemporad A, Morari M (2011, in press) Predictive control for linear and hybrid systems. Cambridge University Press
Geyer T, Torrisi F, Morari M (2008) Optimal complexity reduction of polyhedral piecewise affine systems. Automatica 44:1728–1740
Jones C, Morari M (2006) Multiparametric linear complementarity problems. In: Proceedings of the 45th IEEE conference on decision and control, San Diego, pp 5687–5692
Kvasnica M, Grieder P, Baotić M (2006) Multi-Parametric Toolbox (MPT). http://control.ee.ethz.ch/~mpt/
Mayne D, Rawlings J (2009) Model predictive control: theory and design. Nob Hill Publishing, LLC, Madison
Patrinos P, Bemporad A (2014) An accelerated dual gradient-projection algorithm for embedded linear model predictive control. IEEE Trans Autom Control 59(1):18–33
Patrinos P, Sarimveis H (2010) A new algorithm for solving convex parametric quadratic programs based on graphical derivatives of solution mappings. Automatica 46(9):1405–1418
Ricker N (1985) Use of quadratic programming for constrained internal model control. Ind Eng Chem Process Des Dev 24(4):925–936
Spjøtvold J, Kerrigan E, Jones C, Tøndel P, Johansen TA (2006) On the facet-to-facet property of solutions to convex parametric quadratic programs. Automatica 42(12):2209–2214
Tøndel P, Johansen TA, Bemporad A (2003a) An algorithm for multi-parametric quadratic programming and explicit MPC solutions. Automatica 39(3):489–497
Tøndel P, Johansen TA, Bemporad A (2003b) Evaluation of piecewise affine control via binary search tree. Automatica 39(5):945–950
Wang Y, Boyd S (2010) Fast model predictive control using online optimization. IEEE Trans Control Syst Technol 18(2):267–278

Extended Kalman Filters

Frederick E. Daum
Raytheon Company, Woburn, MA, USA

Synonyms

EKF

Abstract

The extended Kalman filter (EKF) is the most popular estimation algorithm in practical applications. It is based on a linear approximation to the Kalman filter theory. There are thousands of variations of the basic EKF design, which are intended to mitigate the effects of nonlinearities, non-Gaussian errors, ill-conditioning of the covariance matrix, and uncertainty in the parameters of the problem.

Keywords

Estimation; Nonlinear filters

The extended Kalman filter (EKF) is by far the most popular nonlinear filter in practical engineering applications. It uses a linear approximation to the nonlinear dynamics and measurements and exploits the Kalman filter theory, which is optimal for linear and Gaussian problems; Gelb (1974) is the most accessible but thorough book on the EKF. The real-time computational complexity of the EKF is rather modest; for example, one can run an EKF with high-dimensional state vectors (d = several hundreds) in real time on a single microprocessor chip. The computational complexity of the EKF scales as the cube of the dimension d of the state vector being estimated. The EKF often gives good estimation accuracy for practical nonlinear problems, although the EKF accuracy can be very poor for difficult nonlinear non-Gaussian problems. There are many different variations of EKF algorithms, most of which are intended to improve estimation accuracy. In particular, the following types of EKFs are common in engineering practice: (1) second-order Taylor series expansion of the nonlinear functions, (2) iterated measurement updates that recompute the point at which the first-order Taylor series is evaluated for a given measurement, (3) second-order iterated (i.e., a combination of items 1 and 2), (4) special coordinate systems (e.g., Cartesian, polar or spherical, modified polar or spherical, principal axes of the covariance matrix ellipse, hybrid coordinates, quaternions rather than Euler angles, etc.), (5) preferred order of processing sequential scalar measurement updates, (6) decoupled or partially decoupled or quasi-decoupled covariance matrices, and many more variations. In fact, there is no such thing as "the" EKF, but rather there are thousands of different versions of the EKF. There are also many different versions of the Kalman filter itself, and all of these can be used to design EKFs as well. For example, there are many different equations to update the Kalman filter error covariance matrices with the intent of mitigating ill-conditioning and improving robustness, including (1) square-root factorization of the covariance matrix, (2) information matrix update, (3) square-root information update, (4) Joseph's robust version of the covariance matrix update, and (5) at least three distinct algebraic versions of the covariance matrix update, as well as hybrids of the above.

Many of the good features of the Kalman filter are also enjoyed by the EKF, but unfortunately not all. For example, we have a very good theory of stability for the Kalman filter, but there is no theory that guarantees that an EKF will be stable in practical applications. The only method
to check whether the EKF is stable is to run Monte Carlo simulations that cover the relevant regions in state space with the relevant measurement parameters (e.g., data rate and measurement accuracy). Second, the Kalman filter computes the theoretical error covariance matrix, but there is no guarantee that the error covariance matrix computed by the EKF approximates the actual filter errors; rather, the EKF covariance matrix can be optimistic by orders of magnitude in real applications. Third, the numerical values of the process noise covariance matrix can be computed theoretically for the Kalman filter, but there is no guarantee that these will work well for the EKF; rather, engineers typically tune the process noise covariance matrix using Monte Carlo simulations or else use a heuristic adaptive process (e.g., IMM). All of these shortcomings of the EKF compared with the Kalman filter theory are due to a myriad of practical issues, including (1) nonlinearities in the dynamics or measurements, (2) non-Gaussian measurement errors, (3) unmodeled measurement error sources (e.g., residual sensor bias), (4) unmodeled errors in the dynamics, (5) data association errors, (6) unresolved measurement data, (7) ill-conditioning of the covariance matrix, etc. The actual estimation accuracy of an EKF can only be gauged by Monte Carlo simulations over the relevant parameter space.

The actual performance of an EKF can depend crucially on the specific coordinate system that is used to represent the state vector. This is extremely well known in practical engineering applications (e.g., see Mehra 1971; Stallard 1991; Miller 1982; Markley 2007; Daum 1983; Schuster 1993). Intuitively, this is because the dynamics and measurement equations can be exactly linear in one coordinate system but not in another; this is very easy to see: start with dynamics and measurements that are exactly linear in Cartesian coordinates and transform to polar coordinates, and we will get highly nonlinear equations. Likewise, we can have approximately linear dynamics and measurements in a specific coordinate system but highly nonlinear equations in another coordinate system. But in theory, the optimal estimation accuracy does not depend on the coordinate system. Moreover, in math and physics, coordinate-free methods are preferred, owing to their greater generality, simplicity, and power. The physics does not depend on the specific coordinate system; this is essentially a definition of what "physics" means, and it has resulted in great progress in physics over the last few hundred years (e.g., general relativity, gauge invariance in quantum field theory, Lorentz invariance in special relativity, as well as a host of conservation laws in classical mechanics that are explained by Noether's theorem, which relates invariance to conserved quantities). Similarly, in math, coordinate-free methods have been the royal road to progress over the last 100 years, but not so for the practical engineering of EKFs, because EKFs are approximations rather than being exact, and the accuracy of the EKF approximation depends crucially on the specific coordinate system used. Moreover, the effect of ill-conditioning of the covariance matrices in EKFs depends crucially on the specific coordinate system used in the computer; for example, if we could compute the EKF in principal coordinates, then the covariance matrices would be diagonal, and there would be no effect of ill-conditioning, despite enormous condition numbers of the covariance matrices. Surprisingly, these two simple points about coordinate systems are still not well understood by many researchers in nonlinear filtering.

Cross-References

▶ Estimation, Survey on
▶ Kalman Filters
▶ Nonlinear Filters
▶ Particle Filters

Bibliography

Crisan D, Rozovskii B (eds) (2011) The Oxford handbook of nonlinear filtering. Oxford University Press, Oxford/New York
Daum FE, Fitzgerald RJ (1983) Decoupled Kalman filters for phased array radar tracking. IEEE Trans Autom Control 28:269–283
Extremum Seeking Control 413

Gelb A et al (1974) Applied optimal estimation. MIT,


Cambridge q f '' f (q )
f*+ (q −q * )2
Markley FL, Crassidis JL, Cheng Y (2007) Nonlinear 2
attitude filtering methods. AIAA J 30:12–28
Mehra R (1971) A comparison of several nonlinear filters
for reentry vehical tracking. IEEE Trans Autom Con- q k
+ ×
trol 16:307–310 s
Miller KS, Leskiw D (1982) Nonlinear observations with
radar measurements. IEEE Trans Aerosp Electron Syst
2:192–200
Ristic B, Arulampalam S, Gordon N (2004) Beyond the
a sin (ωt)
Kalman filter. Artech House, Boston
Extremum Seeking Control, Fig. 1 The simplest perturbation-based extremum seeking scheme for a quadratic single-input map f(θ) = f* + (f''/2)(θ − θ*)², where f*, f'', θ* are all unknown. The user has to only know the sign of f'', namely, whether the quadratic map has a maximum or a minimum, and has to choose the adaptation gain k such that sgn k = −sgn f''. The user has to also choose the frequency ω as relatively large compared to a, k, and f''
Extremum Seeking Control

Miroslav Krstic
Department of Mechanical and Aerospace Engineering, University of California, San Diego, La Jolla, CA, USA

Abstract

Extremum seeking (ES) is a method for real-time non-model-based optimization. Though ES was invented in 1922, the "turn of the twenty-first century" has been its golden age, both in terms of the development of theory and in terms of its adoption in industry and in fields outside of control engineering. This entry overviews basic gradient- and Newton-based versions of extremum seeking with periodic and stochastic perturbation signals.

Keywords

Gradient climbing; Newton's method

The Basic Idea of Extremum Seeking

Many versions of extremum seeking exist, with various approaches to their stability study (Krstic and Wang 2000; Liu and Krstic 2012; Tan et al. 2006). The most common version employs perturbation signals for the purpose of estimating the gradient of the unknown map that is being optimized. To understand the basic idea of extremum seeking, it is best to first consider the case of a static single-input map of the quadratic form, as shown in Fig. 1.

Three different thetas appear in Fig. 1: θ* is the unknown optimizer of the map, θ̂(t) is the real-time estimate of θ*, and θ(t) is the actual input into the map. The actual input θ(t) is based on the estimate θ̂(t) but is perturbed by the signal a sin(ωt) for the purpose of estimating the unknown gradient f''·(θ − θ*) of the map f(θ). The sinusoid is only one choice for a perturbation signal – many other perturbations, from square waves to stochastic noise, can be used in lieu of sinusoids, provided they are of zero mean. The estimate θ̂(t) is generated with the integrator k/s, with the adaptation gain k controlling the speed of estimation.

The ES algorithm is successful if the error between the estimate θ̂(t) and the unknown θ*, namely, the signal

θ̃(t) = θ̂(t) − θ*   (1)
converges towards zero. Based on Fig. 1, the estimate is governed by the differential equation dθ̂/dt = k a sin(ωt) f(θ), which means that the estimation error is governed by

dθ̃/dt = k a sin(ωt) [f* + (f''/2)(θ̃ + a sin(ωt))²]   (2)

Expanding the right-hand side, one obtains

dθ̃(t)/dt = k a f* sin(ωt) + (k a³ f''/2) sin³(ωt) + (k a f''/2) sin(ωt) θ̃²(t) + k a² f'' sin²(ωt) θ̃(t)   (3)

where the first two terms have zero mean, the third term is the product of a fast zero-mean factor and the slow signal θ̃²(t), and the fourth term is the product of a fast factor of mean 1/2 and the slow signal θ̃(t).

A theoretically rigorous time-averaging procedure allows the above sinusoidal signals to be replaced by their means, yielding the "average system"

dθ̃_ave/dt = (k f'' a²/2) θ̃_ave,  with k f'' a²/2 < 0,   (4)

which is exponentially stable. The averaging theory guarantees that there exists a sufficiently large ω such that, if the initial estimate θ̂(0) is sufficiently close to the unknown θ*,

|θ(t) − θ*| ≤ |θ(0) − θ*| e^((k f'' a²/2) t) + O(1/ω) + a,  for all t ≥ 0.   (5)

For the user, the inequality (5) guarantees that, if a is chosen small and ω is chosen large, the input θ(t) exponentially converges to a small interval around the unknown θ* and, consequently, the output f(θ(t)) converges to the vicinity of the optimal output f*.

Extremum Seeking Control, Fig. 2 Extremum seeking algorithm for a multivariable map y = Q(θ), where θ is the input vector θ = [θ₁, θ₂, …, θₙ]ᵀ. The algorithm employs the additive perturbation vector signal S(t) given in (6) and the multiplicative demodulation vector signal M(t) given in (7)

ES for Multivariable Static Maps

For static maps, ES extends in a straightforward manner from the single-input case shown in Fig. 1 to the multi-input case shown in Fig. 2.

The algorithm measures the scalar signal y(t) = Q(θ(t)), where Q(·) is an unknown map whose input is the vector θ = [θ₁, θ₂, …, θₙ]ᵀ. The gradient is estimated with the help of the signals

S(t) = [a₁ sin(ω₁t) ⋯ aₙ sin(ωₙt)]ᵀ   (6)

M(t) = [(2/a₁) sin(ω₁t) ⋯ (2/aₙ) sin(ωₙt)]ᵀ   (7)

with nonzero perturbation amplitudes aᵢ and with a gain matrix K that is diagonal. To guarantee convergence, the user should choose ωᵢ ≠ ωⱼ. This is a key condition that differentiates the multi-input case from the single-input case. In addition, for simplicity in the convergence analysis, the user should choose ωᵢ/ωⱼ as rational and ωᵢ + ωⱼ ≠ ωₖ for distinct i, j, and k.

If the unknown map is quadratic, namely, Q(θ) = Q* + (1/2)(θ − θ*)ᵀ H (θ − θ*), the averaged system is

dθ̃_ave/dt = K H θ̃_ave,  H = Hessian.   (8)
Extremum Seeking Control, Fig. 3 The ES algorithm in the presence of dynamics with an equilibrium map θ ↦ y that satisfies the same conditions as in the static case. If the dynamics are stable and the user employs parameters in the ES algorithm that make the algorithm dynamics slower than the dynamics of the plant, convergence is guaranteed (at least locally). The two filters are useful in the implementation to reduce the adverse effect of the perturbation signals on asymptotic performance but are not needed in the stability analysis

If, for example, the map Q(·) has a maximum that is locally quadratic (which implies H = Hᵀ < 0) and if the user chooses the elements of the diagonal gain matrix K as positive, the ES algorithm is guaranteed to be locally convergent. However, the convergence rate depends on the unknown Hessian H. This weakness of the gradient-based ES algorithm is removed with the Newton-based ES algorithm.

A stochastic version of the algorithm in Fig. 2 also exists, in which S(t) and M(t) are replaced by

S(η(t)) = [a₁ sin(η₁(t)), …, aₙ sin(ηₙ(t))]ᵀ   (9)

M(η(t)) = [(2/(a₁(1 − e^(−q₁²)))) sin(η₁(t)), …, (2/(aₙ(1 − e^(−qₙ²)))) sin(ηₙ(t))]ᵀ   (10)

where ηᵢ = [qᵢ√εᵢ/(εᵢs + 1)] Ẇᵢ and the Ẇᵢ are independent unity-intensity white noise processes.

ES for Dynamic Systems

ES extends in a relatively straightforward manner from static maps to dynamic systems, provided the dynamics are stable and the algorithm's parameters are chosen so that the algorithm's dynamics are slower than those of the plant. The algorithm is shown in Fig. 3.

The technical conditions for convergence in the presence of dynamics are that the equilibria x = l(θ) of the system ẋ = f(x, α(x, θ)), where α(x, θ) is the control law of an internal feedback loop, are locally exponentially stable uniformly in θ and that, given the output map y = h(x), there exists at least one θ* ∈ ℝⁿ such that (∂/∂θ)(h∘l)(θ*) = 0 and (∂²/∂θ²)(h∘l)(θ*) = H < 0, H = Hᵀ.

The stability analysis in the presence of dynamics employs both averaging and singular perturbations, in a specific order. The design guidelines for the selection of the algorithm's parameters follow the analysis. Though the guidelines are too lengthy to state here, they ensure that the plant's dynamics are on a fast time scale, the perturbations are on a medium time scale, and the ES algorithm is on a slow time scale.

Newton ES Algorithm for Static Map

A Newton version of the ES algorithm, shown in Fig. 4, ensures that the convergence rate be user assignable, rather than being dependent on the unknown Hessian of the map.

The elements of the demodulating matrix N(t) for generating the estimate of the Hessian are given by
N_ii(t) = (16/aᵢ²) (sin²(ωᵢt) − 1/2),  N_ij(t) = (4/(aᵢaⱼ)) sin(ωᵢt) sin(ωⱼt)   (11)

Extremum Seeking Control, Fig. 4 A Newton-based ES algorithm for a static map. The multiplicative excitation N(t) helps generate the estimate of the Hessian ∂²Q(θ)/∂θ² as Ĥ(t) = N(t)y(t). The Riccati matrix differential equation Γ̇ = ω_rΓ − ω_rΓĤΓ generates an estimate of the Hessian's inverse matrix, avoiding matrix inversions of Hessian estimates that may be singular during the transient

For a quadratic map, the averaged system in the error variables θ̃ = θ̂ − θ*, Γ̃ = Γ − H⁻¹ is

dθ̃_ave/dt = −K θ̃_ave − K Γ̃_ave H θ̃_ave

dΓ̃_ave/dt = −ω_r Γ̃_ave − ω_r Γ̃_ave H Γ̃_ave   (12)

where the second term in each equation is quadratic in the error variables. Since the eigenvalues are determined by K and ω_r and are therefore independent of the unknown H, the (local) convergence rate is user assignable.

Further Reading on Extremum Seeking

Since the publication of the first proof of stability of extremum seeking (Krstic and Wang 2000), thousands of papers have been published on this topic, presenting further theoretical developments and applications of ES. A proof that expands the validity of extremum seeking from local to global stability was published in Tan et al. (2006). The book Liu and Krstic (2012) presents stochastic versions of the algorithms in this entry, where the sinusoids are replaced by filtered white noise perturbation signals.

Cross-References

► Adaptive Control, Overview
► Optimal Deployment and Spatial Coverage

Bibliography

Krstic M, Wang HH (2000) Stability of extremum seeking feedback for general dynamic systems. Automatica 36:595–601
Liu S-J, Krstic M (2012) Stochastic averaging and stochastic extremum seeking. Springer, London/New York
Tan Y, Nesic D, Mareels I (2006) On non-local stability properties of extremum seeking control. Automatica 42:889–903

Fault Detection and Diagnosis

Janos Gertler
George Mason University, Fairfax, VA, USA

Synonyms

FDD

Keywords

Consistency relations; Diagnostic observers; Fault detection; Fault diagnosis; Parity space; Principal component analysis; Residual generation

Abstract

The fundamental concepts and methods of fault detection and diagnosis are reviewed. Faults are defined and classified as additive or multiplicative. The model-free approach of alarm systems is described and critiqued. Residual generation, using the mathematical model of the plant, is introduced. The propagation of additive and multiplicative faults to the residuals is discussed, followed by a review of the effect of disturbances, noise, and model errors. Enhanced residuals (structured and directional) are introduced. The main residual generation techniques are briefly described, including direct consistency relations, parity space, and diagnostic observers. Principal component analysis and its application to fault detection and diagnosis are outlined. The article closes with some thoughts about future directions.

Introduction

Faults are malfunctions of various elements of technical systems. Extreme cases of faults, called failures, are catastrophic breakdowns of the same. The technical systems (the plant) we are concerned with range from complex production systems (chemical plants, oil refineries, power stations) through major transportation equipment (airplanes, ships) to consumer machines (automobiles, home-heating systems, etc.). The faults may affect various parts of the main technical system (motors, pumps, storage tanks, pipelines) or devices interfacing the main technical system with computers providing for control, monitoring, and operator information. These latter include sensors (measuring devices) and actuators (devices acting on the process, such as valves).

The objective of fault detection is to determine and signal if there is a fault anywhere in the system. Fault diagnosis is aimed at providing more specific information about the fault; fault isolation is to pinpoint the component(s) (sensors, actuators, or plant components) where the fault is located, while fault identification is to determine (estimate) the size of the fault and, in

J. Baillieul, T. Samad (eds.), Encyclopedia of Systems and Control, DOI 10.1007/978-1-4471-5058-9, © Springer-Verlag London 2015
some cases, the time of its arrival. With the ubiquitous presence of the computer, fault detection and diagnosis (FDD) is, in general, a function of the computer interfaced to the plant.

The simplest approaches to FDD consist of comparing individual plant measurements to preset limits, without utilizing any knowledge of the plant model (limit checking or alarm systems). More sophisticated techniques rely on an explicit mathematical model of the plant. They compare plant measurements to estimates obtained, from other measurements, by the model; any discrepancy may be an indication of faults. Another class of techniques (generally but incorrectly called "data driven"), most notably principal component analysis (PCA), includes the estimation of an implicit model, from empirical plant data, and then uses this in ways similar to the model-based methods. These approaches will be described in more detail in the sequel.

Alarm Systems

Alarm systems rely on the comparison of individual plant measurements to their respective limits. The limits may be two or one sided (upper and lower limit or upper limit only) and may have one or two levels (preliminary and full alarm). Momentary comparisons may be extended to include trend checks. Alarm systems are relatively simple but suffer from two major shortcomings:
– They have very limited fault specificity. A variable exceeding its limit is not a fault but a symptom of faults. A single-component fault may cause alarms on many variables, and a particular alarm may be due to various component faults.
– They have limited fault sensitivity. What is "normal" for a plant output variable depends on the value of the plant inputs. Such a relationship, however, cannot be considered without a plant model; therefore, the alarm thresholds need to be set conservatively high.

Because of their simplicity, and in spite of the above shortcomings, alarm systems are widely used in industrial applications.

Model-Based FDD Concepts

Model-based methods utilize an explicit mathematical model of the plant. Such a model is obtained usually from empirical plant data by systems identification methods or, exceptionally, from the "first principles" understanding of the plant. Model building, though critical to the success of model-based FDD, is usually not considered part of the FDD effort. The models may be linear or nonlinear, static or dynamic, and continuous or discrete time. In FDD, most frequently linear discrete-time dynamic models are used.

The fundamental idea of model-based FDD is the comparison of measured plant outputs to their estimates, obtained, via the mathematical model, from measured or actuated plant inputs (Fig. 1). Any discrepancy is (at least ideally) an indication that a fault (or faults) is (are) present in the system. Mathematically, the difference between the measured output yᵢ(t) and its estimate ŷᵢ(t) is a (primary) residual (Willsky 1976):

eᵢ(t) = yᵢ(t) − ŷᵢ(t)

In general, residuals are quantities that are zero in the absence of faults and nonzero in their presence.

Fault Detection and Diagnosis, Fig. 1 Analytical redundancy

Unfortunately, it is not only the faults that can make the residuals nonzero. Usually, the plant is subject to disturbances (unmeasured deterministic inputs) and noise (unmeasured random inputs) (Fig. 1). In addition, and most importantly, model-based FDD is subject to model errors (due either to initial inaccuracies in model building or to changes in the physical plant). The FDD algorithm should be designed, as much as possible, to be insensitive to noise and "robust" in the face of disturbances and model errors.

Additive and multiplicative faults. Depending on the way they appear in the system equations, faults may be additive or multiplicative. Additive faults are sensor and actuator biases, leaks in the plant, etc. Multiplicative faults are changes in the plant parameters. In the following input-output relationship, u(t) is the vector of observed (measured or commanded) plant inputs, y(t) is the vector of measured plant outputs, p(t) is the vector of additive faults, and t is the discrete time. M(q) and S(q) are transfer function matrices in the shift operator q, and θ is the vector of plant parameters. Then,

y(t) = M(q, θ) u(t) + S(q, θ) p(t)

The ("primary") residual vector e(t), in response to additive faults, is

e(t) = y(t) − M(q, θ) u(t) = S(q, θ) p(t)

If there are multiplicative faults, then θ = θ° + Δθ, where θ° is the nominal parameter vector and Δθ is its change (the parametric fault); now the residual vector e(t) is (Gertler 1998)

e(t) = y(t) − M(q, θ°) u(t) = Σⱼ (∂M(q, θ)/∂θⱼ) u(t) Δθⱼ

Enhanced residuals. To facilitate the isolation of faults, the primary residuals e(t) are subject to some enhancement manipulation. The three widely used enhancement techniques are:
– Structured residuals, whereby each residual is selectively sensitive to a subset of faults, resulting in a fault-specific set of zero/nonzero residuals upon a particular fault (fault codes)
– Directional residuals, whereby the residual vector maintains a fault-specific direction in response to each particular fault
– Diagonal residuals, whereby each residual responds only to a particular fault

Residual generators take the input and output observations from the plant and generate enhanced residuals by one of the above schemes, utilizing the mathematical model of the plant (Fig. 2).

Fault Detection and Diagnosis, Fig. 2 Generating model-based residuals

Dealing with noise. Noise is practically unavoidable in physical systems. In FDD, basically two steps may be taken to reduce the effect of noise:
– Residual filtering. This can be achieved by basing decisions on moving averages of the residuals, by applying explicit low-pass filters to the residuals, or by designing the residual generators in such a way that they have built-in low-pass behavior.
– Statistical testing of the residuals. Structured residuals are tested individually; each scalar residual is then represented by a Boolean 1 or 0, depending on the outcome of the
test. Directional residuals are tested as vectors against multivariable distributions. The test thresholds are determined either theoretically, using assumptions for the source noise, or empirically, based on measurements from fault-free operating conditions.

Dealing with disturbances. Additive disturbances are unmeasurable inputs. If the disturbance-to-output transfer function (or equivalent state-space representation) is known, then it is possible to design residuals that are completely decoupled from (insensitive to) those disturbances. However, the FDD algorithm is subject to a certain degree of "design freedom," defined by the number of outputs in the physical system; disturbance decoupling is competing for this freedom with fault isolation enhancement. If there are too many disturbances, or if their path to the outputs is unknown, then only approximate decoupling is possible, making FDD also approximate, usually designed to optimize some (H-infinity) performance index.

Dealing with model errors. Model errors are also unavoidable in most practical situations. This is the most serious obstacle in the application of model-based FDD techniques. In some very special cases, uncertainty of a particular plant parameter may be handled as a "multiplicative disturbance," and residuals designed to be explicitly decoupled from it. In general, however, only approximate solutions are possible, reducing the residuals' sensitivity to modeling errors at the expense of also reducing their sensitivity to faults. Design methods utilizing some optimization techniques, mostly based on H-infinity or similar performance indices, are available in the literature (Edelmayer et al. 1994).

Residual Generation Methods

For linear dynamic systems, provided an exact (non-approximate) solution is possible, there are three major techniques to design residual generators: (i) direct consistency (parity) relations, (ii) parity space, and (iii) diagnostic observers. We will briefly introduce the three methods, for discrete-time plant models and additive faults. Note that though they look formally different, if designed for the same plant under the same design conditions, the three methods yield identical residuals (Gertler 1991).

Direct consistency (parity) relations (Gertler 1998). The input-output model of the plant is utilized directly in the design. The enhanced residuals are obtained from the primary residuals by a transformation W(q):

r(t) = W(q) e(t) = W(q) [y(t) − M(q) u(t)] = W(q) S(q) p(t)

The desired behavior of the residuals is specified as r(t) = Z(q) p(t), where the specification Z(q) contains the basic residual properties (structure or directions) plus the residual dynamics. The resulting design condition is W(q) S(q) = Z(q). If the S(q) matrix is square, which is usually the case (Gertler 1998), then this can be solved for W(q) by direct inversion. The residual generator has to be causal and stable; this can always be achieved by the appropriate modification of the dynamics in Z(q).

Parity space (Chow and Willsky 1984). This method, also known as the "Chow-Willsky scheme," relies on the state-space description of the system:

x(t + 1) = A x(t) + B u(t) + E p(t)
y(t) = C x(t) + D u(t) + F p(t)

Stacking n consecutive output vectors y(t) (where n is the order of the model) and chain-substituting the state x(t) yields the equation

Y(t) = J x(t − n) + K U(t) + L P(t)

where Y(t), U(t), and P(t) are stacked vectors and J, K, and L are hyper-matrices composed of the A, B, C, D, E, F matrices. Now

E(t) = Y(t) − K U(t) = L P(t) + J x(t − n)
would be a stacked vector of primary residuals, were it not for the presence of the inaccessible initial state x(t − n). To obtain true residuals, a transformation rᵢ(t) = wᵢ E(t) is necessary, so that wᵢ J = 0. Any vector wᵢ satisfying this orthogonality condition is a parity vector, and these vectors together span the parity space. Any parity vector yields a true residual rᵢ(t); the vectors can be so chosen that a set of residuals possesses structured behavior.

Diagnostic observers. Various observer schemes have been extensively investigated as possible residual generator algorithms. The basic full-order Luenberger observer (assuming D = 0) is

x̂(t + 1) = A x̂(t) + B u(t) + K e(t)

where K is the observer gain matrix and

e(t) = y(t) − C x̂(t)

is the innovation vector. If the observer is stable, then, apart from the start-up transient of the observer, the innovation qualifies as the primary residual. The gain matrix K is the major design parameter; it is chosen to place the observer poles, thus achieving stability and desired dynamic behavior (e.g., noise suppression). The remaining design freedom can be utilized to influence residual properties. The latter are further affected by the transformation r(t) = H e(t), where the H matrix is an additional design parameter. Diagnostic observers can be designed for both structured and directional residuals (Chen and Patton 1999; White and Speyer 1987). Other observer schemes, most notably the unknown input observer, have also been proposed (Frank and Wunnenberg 1989). Because of their complexity, the detailed design procedures of diagnostic observers go beyond the scope of this entry.

Principal Component Analysis

Principal component analysis is extensively used in the monitoring of complex plants with hundreds of variables because, by revealing linear relations among the variables, it significantly reduces the dimensionality of the plant model (Kresta et al. 1991). The application of PCA for FDD implies two phases. In the training phase, an implicit plant model is created from empirical plant data. In the monitoring phase, this model is used for FDD.

Training data (measured inputs and outputs) are collected from the plant during fault-free operation. The covariance matrix of the data is formed and its eigenstructure obtained. Due to linear relations among the data, some of the eigenvalues will be zero (or near zero, in the presence of noise). The eigenvectors belonging to the nonzero eigenvalues form the data space, where the fault-free data exist, while those belonging to the zero eigenvalues form the residual space.

It is the residual space that is utilized for FDD. The projection of a measurement vector onto the residual space is the (primary) residual. A statistical test on its size leads to a detection decision (the absence or presence of faults). A threshold test is necessary because noise also causes nonzero residuals. An analysis of the eigenvectors spanning the residual space shows how the various faults propagate to the primary residual. This allows for the design of residual manipulations yielding structured or directional residuals, just like in the FDD methods based on exact models (Gertler et al. 1999).

The procedure as described above applies to sensor and actuator faults; inclusion of plant faults requires extra effort (and experiments). Also, PCA is primarily meant for static models. Its extension to discrete-time dynamic models is straightforward, but it increases the size of the model, proportionally to the dynamic order of the model.

Summary and Future Directions

Fault detection and diagnosis is today a mature field of systems and control engineering. There is a very significant level of activity, as measured in published papers and conference contributions, but much of this (in the opinion of this author)
is just minor refinements of earlier results. This applies particularly to the long ongoing quest to create "robust" FDD algorithms, especially in the face of model errors.

There are still open challenges in a couple of areas, most notably extensions to various nonlinear or parameter varying problems. Another open and active area, of great practical importance, is FDD in networked control systems. What is really of the greatest interest, though, is the application of the wealth of available theoretical results and design methods to real-life problems; there has recently been some visible progress here, a most welcome development.

Cross-References

► Controller Performance Monitoring
► Diagnosis of Discrete Event Systems
► Fault-Tolerant Control
► Multiscale Multivariate Statistical Process Control
► Observers for Nonlinear Systems
► Robust Fault Diagnosis and Control
► Statistical Process Control in Manufacturing

Bibliography

Chen J, Patton RJ (1999) Robust model-based fault diagnosis for dynamic systems. Kluwer, Boston/Dordrecht/Amsterdam
Chow EJ, Willsky AS (1984) Analytical redundancy and the design of robust failure detection systems. IEEE Trans Autom Control AC-29:603–614
Edelmayer A, Bokor J, Keviczky L (1994) An H-infinity filtering approach to robust detection of failures in dynamic systems. In: 33rd IEEE conference on decision and control, Lake Buena Vista
Frank PM, Wunnenberg J (1989) Robust fault diagnosis using unknown input observer schemes. In: Patton R, Frank P, Clark R (eds) Fault diagnosis in dynamic systems. Prentice Hall, Upper Saddle River
Gertler J (1991) A survey of analytical redundancy methods in fault detection and isolation. Plenary paper, IFAC Safeprocess Symposium, Baden-Baden
Gertler J (1998) Fault detection and diagnosis in engineering systems. Marcel Dekker, New York
Gertler J, Li W, Huang Y, McAvoy T (1999) Isolation enhanced principal component analysis. AIChE J 45:323–334
Isermann R (1984) Process fault detection based on modeling and estimation methods. Automatica 20:387–404
Kresta JV, MacGregor JF, Marlin TE (1991) Multivariate statistical monitoring of processes. Can J Chem Eng 69:35–47
White JE, Speyer JL (1987) Detection filter design: spectral theory and algorithm. IEEE Trans Autom Control AC-32:593–603
Willsky AS (1976) A survey of design methods for failure detection in dynamic systems. Automatica 12:601–611

Fault-Tolerant Control

Ron J. Patton
School of Engineering, University of Hull, Hull, UK

Synonyms

FTC

Abstract

A closed-loop control system for an engineering process may have unsatisfactory performance or even instability when faults occur in actuators, sensors, or other process components. Fault-tolerant control (FTC) involves the development and design of special controllers that are capable of tolerating the actuator, sensor, and process faults while still maintaining desirable and robust performance and stability properties. FTC designs involve knowledge of the nature and/or occurrence of faults in the closed-loop system either implicitly or explicitly using methods of fault detection and isolation (FDI), fault detection and diagnosis (FDD), or fault estimation (FE). FTC controllers are reconfigured or restructured using FDI/FDD information so that the effects of the faults are reduced or eliminated within each feedback loop in active or passive approaches or compensated in each control-loop using FE methods. A non-mathematical outline of the essential features of FTC systems is given with important definitions and a classification of FTC systems
Fault-Tolerant Control 423

into either active/passive approaches with exam- Patton 1999; Gertler 1998; Patton et al. 2000)
ples of some well-known strategies. motivated by studies in the 1980s on this topic
(Patton et al. 1989). Fault-tolerant control (FTC)
began to develop in the early 1990s (Patton 1993)
Keywords and is now a standard in the literature (Patton
1997; Blanke et al. 2006; Zhang and Jiang 2008),
Active FTC; Fault accommodation; Fault based on the aerospace subject of reconfigurable
detection and diagnosis (FDD); Fault detection flight control making use of redundant actua-
and isolation (FDI); Fault estimation (FE); Fault- tors and sensors (Steinberg 2005; Edwards et al.
tolerant control; Passive FTC; Reconfigurable 2010).
control
F
Introduction Definitions Relating to Fault-Tolerant
Control
The complexity of modern engineering systems
has led to strong demands for enhanced control FTC is a strategy in control systems architecture
system reliability, safety, and green operation in and design to ensure that a closed-loop system
the presence of even minor anomalies. There can continue acceptable operation in the face
is a growing need not only to determine the of bounded actuator, sensor, or process faults.
onset and development of process faults before The goal of FTC design must ensure that the
they become serious but also to adaptively com- closed-loop system maintains satisfactory stabil-
pensate for their effects in the closed-loop sys- ity and acceptable performance during either one
tem or using hardware redundancy to replace or more fault actions. When prescribed stability
faulty components by duplicate and fault-free and closed-loop performance indices are main-
alternatives. The title “failure tolerant control” tained despite the action of faults, the system
was given by Eterno et al. (1985) working on a is said to be “fault tolerant,” and the control
reconfigurable flight control study defining the scheme that ensures the fault tolerance is the
meaning of control system tolerance to failures fault-tolerant controller (Blanke et al. 2006; Pat-
or faults. The word “failure” is used when a fault ton 1997).
is so serious that the system function concerned Fault modelling is concerned with the rep-
fails to operate (Isermann 2006). The title failure resentation of the real physical faults and their
detection has now been superseded by fault de- effects on the system mathematical model. Fault
tection, e.g., in fault detection and isolation (FDI) modelling is important to establish how a fault
or fault detection and diagnosis (FDD) (Chen and should be detected, isolated, or compensated.

Fault-Tolerant Control, Actuator fault Process fault Sensor fault


Fig. 1 Closed-loop system fa(t) fp(t) fs(t)
with actuator, process, and
sensor faults

Actuator Plant Sensor


Outputs y(t)

System

Control signal u(t)


Controller
yref Reference
424 Fault-Tolerant Control

The faults illustrated in Fig. 1 act at system locations defined as follows (Chen and Patton 1999):

An actuator fault (f_a(t)) corresponds to variations of the control input u(t) applied to the controlled system, either complete or partial. The complete failure of an actuator means that it produces no actuation regardless of the input applied to it, e.g., as a result of breakage or burnout of wiring. For partial actuator faults, the actuator becomes less effective and provides the plant with only part of the normal actuation signal.

A sensor is an item of equipment that takes a measurement or observation from the system, e.g., potentiometers, accelerometers, tachometers, pressure gauges, strain gauges, etc.; a sensor fault (f_s(t)) implies that incorrect measurements are taken from the real system. This fault can also be subdivided into complete and partial sensor faults. When a sensor fails, the measurements no longer correspond to the required physical parameters. For a partial sensor fault, the measurements give an inaccurate indication of the required physical parameters.

A process fault (f_p(t)) directly affects the physical system parameters and in turn the input/output properties of the system. Process faults are often termed component faults, arising as variations from the structure or parameters used during system modelling, and as such cover a wide class of possible faults, e.g., dirty water having a different heat transfer coefficient compared to when it is clean, changes in the viscosity of a liquid, or components slowly degrading over time through wear and tear, aging, or environmental effects.

Architectures and Classification of FTC Schemes

FTC methods are classified according to whether they are "passive" or "active," using fixed or reconfigurable control strategies (Eterno et al. 1985). Various architectures have been proposed for the implementation of FTC schemes; for example, a structure of reconfigurable control based on generalized internal model control (GIMC) has been proposed by Zhou and Ren (2001), with other studies by Niemann and Stoustrup (2005). Figure 2 shows a suitable architecture to encompass active and passive FTC methods in which a distinction is made between "execution" and "supervision" levels. The essential differences and requirements of passive FTC (PFTC) and active FTC (AFTC) are as follows.

PFTC is based solely on the use of robust control, in which potential faults are treated as uncertain signals acting in the closed-loop system. This can be related to the concept of reliable control (Veillette et al. 1992). PFTC requires no online information from the fault diagnosis (FDI/FDD/FE) function about the occurrence or presence of faults; hence, it is not by itself an adaptive system and does not involve controller reconfiguration (Patton 1993, 1997; Šiljak 1980). The PFTC approach can be used if the time window during which the system remains stabilizable in the presence of a fault is short; see, for example, the problem of the double inverted pendulum (Weng et al. 2007), which is unstable during a loop failure.

AFTC has two conceptual steps to provide the system with fault-tolerant capability (Blanke et al. 2006; Patton 1997; Zhang and Jiang 2008; Edwards et al. 2010):
• Equip the system with a mechanism to detect and isolate (or even estimate) the fault promptly, identify the faulty component, and select the required remedial action to maintain acceptable operating performance. With no fault, a baseline controller attenuates disturbances and ensures good stability and closed-loop tracking performance (Patton 1997), and the diagnostic (FDI/FDD/FE) block recognizes that the closed-loop system is fault-free with no control law change required (supervision level).
• Make use of supervision-level information and adapt or reconfigure/restructure the controller parameters so that the required remedial activity can be achieved (execution level).

Figure 3 gives a classification of PFTC and AFTC methods (Patton 1997).
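As a minimal illustration of the passive idea (the plant, fault model, and gain below are hypothetical), a single fixed feedback designed with enough robustness margin keeps a first-order plant stable over a whole family of actuator-effectiveness faults, with no FDI/FDD and no reconfiguration:

```python
# Scalar plant x' = a*x + b*eff*u, where eff in [0.5, 1] models a
# partial actuator fault (loss of effectiveness). One fixed high-gain
# feedback u = -k*x is designed to stabilize the whole fault family --
# the passive-FTC idea: no fault diagnosis, no reconfiguration.
a, b = 1.0, 1.0
k = 6.0                       # fixed robust gain: a - b*eff*k < 0 for eff >= 0.5
dt, T = 0.001, 5.0

def simulate(eff):
    x = 1.0
    for _ in range(int(T / dt)):
        u = -k * x            # same controller, faulty or not
        x += dt * (a * x + b * eff * u)
    return x

for eff in (1.0, 0.75, 0.5):  # healthy case and two partial-fault cases
    print(f"effectiveness {eff:.2f}: final state {simulate(eff):.2e}")
```

The price of passivity is conservatism: the fixed gain must cover the worst anticipated fault, so performance in the healthy case is traded away accordingly.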
Fault-Tolerant Control, Fig. 2 Scheme of FTC (Adapted from Blanke et al. (2006))

Fault-Tolerant Control, Fig. 3 General classification of FTC methods

Figure 3 shows that AFTC approaches are divided into two main types of methods: projection-based methods and online automatic controller redesign methods. The latter involves the calculation of new controller parameters following control impairment, i.e., using reconfigurable control. In projection-based methods, a new precomputed control law is selected according to the required controller structure (i.e., depending on the type of isolated fault).

AFTC methods use online fault accommodation based on unanticipated faults, classified as (Patton 1997):
(a) Based on offline (precomputed) control laws
(b) Online fault accommodating
(c) Tolerant to unanticipated faults using FDI/FDD/FE
(d) Dependent upon use of a baseline controller

AFTC Examples

One example of AFTC is model-based predictive control (MPC), which uses online computed control redesign. MPC is online fault accommodating; it does not use an FDI/FDD unit and is not dependent on a baseline controller. MPC has a certain degree of fault tolerance against actuator faults under some conditions even if the faults are not detected. The representation of actuator faults in MPC is relatively natural and straightforward since actuator faults such as jams and slew-rate reductions can be represented by changing the
MPC optimization problem constraints. Other faults can be represented by modifying the internal model used by MPC (Maciejowski 1998). The fact that online fault information is not required means that MPC is an interesting method for flight control reconfiguration, as demonstrated by Maciejowski and Jones in the GARTEUR AG 16 project on "fault-tolerant flight control" (Edwards et al. 2010).

Another interesting AFTC example that makes use of the concept of model-matching in explicit model following is the so-called pseudo-inverse method (PIM) (Gao and Antsaklis 1992), which requires the nominal or reference closed-loop system matrix to compute the new controller gain after a fault has occurred. The challenges are:
1. Guarantee of stability of the reconfigured closed-loop system
2. Minimization of the time consumed to approach the acceptable matching
3. Achieving perfect matching through use of different control methodologies
Exact model-matching may be too demanding, and some extensions to this approach make use of alternative, approximate (norm-based) model-matching through the computation of the required model-following gain. To relax the matching condition further, Staroswiecki (2005) proposed an admissible model-matching approach, which was later extended by Tornil et al. (2010) using D-region pole assignment. The PIM approach requires an FDI/FDD/FE mechanism and is online fault accommodating only in terms of a priori anticipated faults. This limits the practical value of the approach.

As a third example, feedback linearization can be used to compensate for nonlinear dynamic effects while also implementing control law reconfiguration or restructuring. In flight control, an aileron actuator fault will cause a strong coupling between the lateral and longitudinal aircraft dynamics. Feedback linearization is an established technique in flight control (Ochi and Kanai 1991). The faults are identified indirectly by estimating aircraft flight parameters online, e.g., using a recursive least-squares algorithm to update the FTC.

Hence, an AFTC system provides fault tolerance either by selecting a precomputed control law (projection-based) (Boskovic and Mehra 1999; Maybeck and Stevens 1991; Rauch 1995) or by synthesizing a new control strategy online (online controller redesign) (Ahmed-Zaid et al. 1991; Richter et al. 2007; Efimov et al. 2012; Zou and Kumar 2011).

Another widely studied AFTC method is the estimation and compensation approach, where a fault compensation input is superimposed on the nominal control input (Noura et al. 2000; Boskovic and Mehra 2002; Sami and Patton 2013; Zhang et al. 2004). There is growing interest in robust FE methods based on sliding mode estimation (Edwards et al. 2000) and augmented observer methods (Gao and Ding 2007; Jiang et al. 2006; Sami and Patton 2013).

An important development of this approach is the so-called fault hiding strategy, which is centered on achieving FTC loop goals such that the nominal control loop remains unchanged, through the use of virtual actuators or virtual sensors (Blanke et al. 2006; Sami and Patton 2013). Fault hiding makes use of the difference between the nominal and faulty system states to adjust the system dynamics such that the required control objectives are continuously achieved even if a fault occurs. In the sensor fault case, the effect of the fault is hidden from the input of the controller, while actuator faults are compensated at the plant input (Lunze and Steffen 2006; Richter et al. 2007; Ponsart et al. 2010; Sami and Patton 2013); in both cases it is assumed that an FDI/FDD or FE scheme is available. The virtual actuator/sensor FTC can be of good practical value if FDI/FDD/FE robustness can be demonstrated.

Traditional adaptive control methods that automatically adapt controller parameters to system changes can be used in a special application of AFTC, potentially removing the need for the FDI/FDD and controller redesign steps (Tang et al. 2004; Zou and Kumar 2011) but possibly using the FE function. Adaptive control is suitable for FTC on plants that have slowly
varying parameters and can tolerate actuator and process faults. Sensor faults are not tolerated well, as the controller parameters must adapt according to the faulty measurements, causing incorrect closed-loop system operation; the FDI/FDD/FE unit is required for such cases.

Summary and Future Directions

FTC is now a significant subject in control systems science with many quite significant application studies, particularly since the new millennium. Most of the applications are within the flight control field, with studies such as the GARTEUR AG16 project "Fault-Tolerant Flight Control" (Edwards et al. 2010). As a very complex engineering-led and mathematically focused subject, it is important that FTC remains application-driven to keep the theoretical concepts moving in the right directions and satisfying end-user needs. The original requirement for FTC in safety-critical systems has now widened to encompass a good range of fault-tolerance requirements involving energy and economy, e.g., for greener aircraft and for FTC in renewable energy.

Faults and modelling uncertainties as well as endogenous disturbances have potentially competing effects on the control system performance and stability. This is the robustness problem in FTC, which is beyond the scope of this article. The FTC system provides a degree of tolerance to closed-loop system faults, and it is also subject to the effects of modelling uncertainty arising from the reality that all engineering systems are nonlinear and can even have complex dynamics. For example, consider the PFTC approach relying on robustness principles, as a more complex extension of robust control. PFTC design requires the closed-loop system to be insensitive to faults as well as modelling uncertainties. This requires the use of multi-objective optimization methods, e.g., using linear matrix inequalities (LMI), as well as methods of accounting for dynamical system parametric variations, e.g., linear parameter varying (LPV) system structures, Takagi-Sugeno, or sliding mode methods.

Cross-References

Diagnosis of Discrete Event Systems
Estimation, Survey on
Fault Detection and Diagnosis
H-infinity Control
LMI Approach to Robust Control
Lyapunov's Stability Theory
Model Reference Adaptive Control
Optimization Based Robust Control
Robust Adaptive Control
Robust Fault Diagnosis and Control
Robust H2 Performance in Feedback Control

Bibliography

Ahmed-Zaid F, Ioannou P, Gousman K, Rooney R (1991) Accommodation of failures in the F-16 aircraft using adaptive control. IEEE Control Syst 11:73–78
Blanke M, Kinnaert M, Lunze J, Staroswiecki M (2006) Diagnosis and fault-tolerant control. Springer, Berlin/New York
Boskovic JD, Mehra RK (1999) Stable multiple model adaptive flight control for accommodation of a large class of control effector failures. In: Proceedings of the ACC, San Diego, 2–4 June 1999, pp 1920–1924
Boskovic JD, Mehra RK (2002) An adaptive retrofit reconfigurable flight controller. In: Proceedings of the 41st IEEE CDC, Las Vegas, 10–13 Dec 2002, pp 1257–1262
Chen J, Patton RJ (1999) Robust model based fault diagnosis for dynamic systems. Kluwer, Boston
Edwards C, Spurgeon SK, Patton RJ (2000) Sliding mode observers for fault detection and isolation. Automatica 36:541–553
Edwards C, Lombaerts T, Smaili H (2010) Fault tolerant flight control: a benchmark challenge. Springer, Berlin/Heidelberg
Efimov D, Cieslak J, Henry D (2012) Supervisory fault-tolerant control with mutual performance optimization. Int J Adapt Control Signal Process 27(4):251–279
Eterno J, Weiss J, Looze D, Willsky A (1985) Design issues for fault tolerant restructurable aircraft control. In: Proceedings of the 24th IEEE CDC, Fort Lauderdale, Dec 1985
Gao Z, Antsaklis P (1992) Reconfigurable control system design via perfect model following. Int J Control 56:783–798
Gao Z, Ding S (2007) Actuator fault robust estimation and fault-tolerant control for a class of nonlinear descriptor systems. Automatica 43:912–920
Gertler J (1998) Fault detection and diagnosis in engineering systems. Marcel Dekker, New York
Isermann R (2006) Fault-diagnosis systems: an introduction from fault detection to fault tolerance. Springer, Berlin/New York
Jiang B, Staroswiecki M, Cocquempot V (2006) Fault accommodation for nonlinear dynamic systems. IEEE Trans Autom Control 51:1578–1583
Lunze J, Steffen T (2006) Control reconfiguration after actuator failures using disturbance decoupling methods. IEEE Trans Autom Control 51:1590–1601
Maciejowski JM (1998) The implicit daisy-chaining property of constrained predictive control. J Appl Math Comput Sci 8(4):101–117
Maybeck P, Stevens R (1991) Reconfigurable flight control via multiple model adaptive control methods. IEEE Trans Aerosp Electron Syst 27:470–480
Niemann H, Stoustrup J (2005) An architecture for fault tolerant controllers. Int J Control 78(14):1091–1110
Noura H, Sauter D, Hamelin F, Theilliol D (2000) Fault-tolerant control in dynamic systems: application to a winding machine. IEEE Control Syst Mag 20:33–49
Ochi Y, Kanai K (1991) Design of restructurable flight control systems using feedback linearization. J Guid Control Dyn 14(5):903–911
Patton RJ, Frank PM, Clark RN (1989) Fault diagnosis in dynamic systems: theory and application. Prentice Hall, New York
Patton RJ (1993) Robustness issues in fault tolerant control. In: Plenary paper at international conference TOOLDIAG'93, Toulouse, Apr 1993
Patton RJ (1997) Fault tolerant control: the 1997 situation. In: IFAC Safeprocess '97, Hull, pp 1033–1055
Patton RJ, Frank PM, Clark RN (2000) Issues of fault diagnosis for dynamic systems. Springer, London/New York
Ponsart J, Theilliol D, Aubrun C (2010) Virtual sensors design for active fault tolerant control system applied to a winding machine. Control Eng Pract 18:1037–1044
Rauch H (1995) Autonomous control reconfiguration. IEEE Control Syst 15:37–48
Richter JH, Schlage T, Lunze J (2007) Control reconfiguration of a thermofluid process by means of a virtual actuator. IET Control Theory Appl 1:1606–1620
Sami M, Patton RJ (2013) Active fault tolerant control for nonlinear systems with simultaneous actuator and sensor faults. Int J Control Autom Syst 11(6):1149–1161
Šiljak DD (1980) Reliable control using multiple control systems. Int J Control 31:303–329
Staroswiecki M (2005) Fault tolerant control using an admissible model matching approach. In: Joint Decision and Control Conference and European Control Conference, Seville, 12–15 Dec 2005, pp 2421–2426
Steinberg M (2005) Historical overview of research in reconfigurable flight control. Proc IMechE Part G J Aerosp Eng 219:263–275
Tang X, Tao G, Joshi S (2004) Adaptive output feedback actuator failure compensation for a class of non-linear systems. Int J Adapt Control Signal Process 19(6):419–444
Tornil S, Theilliol D, Ponsart JC (2010) Admissible model matching using D_R-regions: fault accommodation and robustness against FDD inaccuracies. Int J Adapt Control Signal Process 24(11):927–943
Veillette R, Medanic J, Perkins W (1992) Design of reliable control systems. IEEE Trans Autom Control 37:290–304
Weng J, Patton RJ, Cui P (2007) Active fault-tolerant control of a double inverted pendulum. J Syst Control Eng 221:895
Zhang Y, Jiang J (2008) Bibliographical review on reconfigurable fault-tolerant control systems. Annu Rev Control 32:229–252
Zhang X, Parisini T, Polycarpou M (2004) Adaptive fault-tolerant control of nonlinear uncertain systems: an information-based diagnostic approach. IEEE Trans Autom Control 49(8):1259–1274
Zhang K, Jiang B, Staroswiecki M (2010) Dynamic output feedback-fault tolerant controller design for Takagi-Sugeno fuzzy systems with actuator faults. IEEE Trans Fuzzy Syst 18:194–201
Zhou K, Ren Z (2001) A new controller architecture for high performance, robust, and fault-tolerant control. IEEE Trans Autom Control 46(10):1613–1618
Zou A, Kumar K (2011) Adaptive fuzzy fault-tolerant attitude control of spacecraft. Control Eng Pract 19:10–21

FDD

Fault Detection and Diagnosis

Feedback Linearization of Nonlinear Systems

A.J. Krener
Department of Applied Mathematics, Naval Postgraduate School, Monterey, CA, USA

Abstract

Effective methods exist for the control of linear systems, but this is less true for nonlinear systems. Therefore, it is very useful if a nonlinear system can be transformed into or approximated by a linear system. Linearity is not invariant under nonlinear changes of state coordinates and nonlinear state feedback. Therefore, it may be possible to convert a nonlinear system into a linear one via these transformations. This is called feedback
linearization. This entry surveys feedback linearization and related topics.

Keywords

Distribution; Frobenius theorem; Involutive distribution; Lie derivative

Introduction

A controlled linear dynamics is of the form

ẋ = Fx + Gu  (1)

where the state x ∈ IR^n and the control u ∈ IR^m. A controlled nonlinear dynamics is of the form

ẋ = f(x, u)  (2)

where x, u have the same dimensions but may be local coordinates on some manifolds X, U. Frequently, the dynamics is affine in the control, i.e.,

ẋ = f(x) + g(x)u  (3)

where f(x) ∈ IR^{n×1} is a vector field and g(x) = [g^1(x), ..., g^m(x)] ∈ IR^{n×m} is a matrix field.

Linear dynamics are much easier to analyze and control than nonlinear dynamics. For example, to globally stabilize the linear dynamics (1), all we need to do is to find a linear feedback law u = Kx such that all the eigenvalues of F + GK are in the open left half plane. Finding a feedback law u = κ(x) to globally stabilize a nonlinear dynamics is very difficult and frequently impossible. Therefore, finding techniques to linearize nonlinear dynamics has been a goal for several centuries.

The simplest example of a linearization technique is to approximate a nonlinear dynamics around a critical point by its first-order terms. Suppose x^0, u^0 is an operating point for the nonlinear dynamics (2), that is, f(x^0, u^0) = 0. Define displacement variables z = x − x^0 and v = u − u^0 and, assuming f(x, u) is smooth around this operating point, expand (2) to first order

ż = (∂f/∂x)(x^0, u^0) z + (∂f/∂u)(x^0, u^0) v + O(z, v)^2  (4)

Ignoring the higher order terms, we get a linear dynamics (1) where

F = (∂f/∂x)(x^0, u^0),  G = (∂f/∂u)(x^0, u^0)

This simple technique works very well in many cases and is the basis for many engineering designs. For example, if the linear feedback v = Kz puts all the eigenvalues of F + GK in the left half plane, then the affine feedback u = u^0 + K(x − x^0) makes the closed-loop dynamics locally asymptotically stable around x^0. So, one way to linearize a nonlinear dynamics is to approximate it by a linear dynamics.

The other way to linearize is by a nonlinear change of state coordinates and a nonlinear state feedback, because linearity is not invariant under these transformations. To see this, suppose we have a controlled linear dynamics (1) and we make the nonlinear change of state coordinates z = φ(x) and nonlinear feedback u = β(z, v). We assume that these transformations are invertible from some neighborhood of x^0 = 0, u^0 = 0 to some neighborhood of z^0, v^0, and the inverse maps ψ, α satisfy

ψ(φ(x)) = x,  φ(ψ(z)) = z
α(ψ(z), β(z, v)) = v,  β(φ(x), α(x, u)) = u

then (1) becomes

ż = (∂φ/∂x)(x)(Fx + Gu) = (∂φ/∂x)(ψ(z))(Fψ(z) + Gβ(z, v))

which is a controlled nonlinear dynamics (2). This raises the question asked by Brockett (1978): when is a controlled nonlinear dynamics a change of coordinates and feedback away from a controlled linear dynamics?
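The first-order expansion (4) is easy to carry out numerically. In the sketch below (the damped pendulum model and all its coefficients are a hypothetical example), F = ∂f/∂x(x⁰, u⁰) and G = ∂f/∂u(x⁰, u⁰) are approximated by central differences about an operating point:

```python
import numpy as np

# Jacobian (first-order) linearization of x' = f(x, u) about an
# operating point (x0, u0) with f(x0, u0) = 0, via central differences.
# Example system: damped pendulum, x = (angle, rate), u = torque.
def f(x, u):
    return np.array([x[1], -np.sin(x[0]) - 0.1 * x[1] + u[0]])

def linearize(f, x0, u0, eps=1e-6):
    n, m = len(x0), len(u0)
    F = np.zeros((n, n))
    G = np.zeros((n, m))
    for i in range(n):                     # F = df/dx at (x0, u0)
        dx = np.zeros(n); dx[i] = eps
        F[:, i] = (f(x0 + dx, u0) - f(x0 - dx, u0)) / (2 * eps)
    for j in range(m):                     # G = df/du at (x0, u0)
        du = np.zeros(m); du[j] = eps
        G[:, j] = (f(x0, u0 + du) - f(x0, u0 - du)) / (2 * eps)
    return F, G

# Upright equilibrium x0 = (pi, 0), held with u0 = 0 since sin(pi) = 0.
F, G = linearize(f, np.array([np.pi, 0.0]), np.array([0.0]))
print(F)    # approx [[0, 1], [1, -0.1]] -- unstable: -cos(pi) = +1
print(G)    # approx [[0], [1]]
```

Placing the eigenvalues of F + GK in the open left half plane then gives the locally stabilizing affine feedback u = u⁰ + K(x − x⁰) described above.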
Linearization of a Smooth Vector Field

Let us start by addressing an apparently simpler question that was first considered by Poincaré. Given an uncontrolled nonlinear dynamics around a critical point, say x^0 = 0,

ẋ = f(x),  f(0) = 0,

find a smooth local change of coordinates

z = φ(x),  φ(0) = 0

which transforms it into an uncontrolled linear dynamics

ż = Fz.

This question is apparently simpler, but as we shall see in the next section, the corresponding question for a controlled nonlinear dynamics that is affine in the control is actually easier to answer. Without loss of generality, we can restrict our attention to changes of coordinates which carry x^0 = 0 to z^0 = 0 and whose Jacobian at this point is the identity, i.e.,

z = x + O(x^2)

and we let

F = (∂f/∂x)(0).

Poincaré's formal solution to this problem was to expand the vector field and the desired change of coordinates in a power series,

ẋ = Fx + f^[2](x) + O(x^3)
z = x − φ^[2](x)

where f^[2], φ^[2] are n-dimensional vector fields whose entries are homogeneous polynomials of degree 2 in x. A straightforward calculation yields

ż = Fz + f^[2](x) − [Fx, φ^[2](x)] + O(x)^3

where the Lie bracket of two vector fields f(x), g(x) is the new vector field defined by

[f(x), g(x)] = (∂g/∂x)(x) f(x) − (∂f/∂x)(x) g(x).

Hence, φ^[2](x) must satisfy the so-called homological equation (Arnol'd 1983)

[Fx, φ^[2](x)] = f^[2](x).

This is a linear equation from the space of quadratic vector fields to the space of quadratic vector fields. The quadratic n-dimensional vector fields form a vector space of dimension n times n + 1 choose 2.

Poincaré showed that the eigenvalues of the linear map

φ^[2](x) ↦ [Fx, φ^[2](x)]  (5)

are λ_i + λ_j − λ_k, where λ_i, λ_j, λ_k are eigenvalues of F. If none of these expressions is zero, then the operator (5) is invertible. A degree two resonance occurs when λ_i + λ_j − λ_k = 0, and then the homological equation is not solvable for all f^[2](x).

Suppose a change of coordinates exists that linearizes the vector field up to degree r. In the new coordinates, the vector field is of the form

ẋ = Fx + f^[r](x) + O(x)^{r+1}.

We seek a change of coordinates of the form

z = x − φ^[r](x)

to cancel the degree r terms, i.e., we seek a solution of the rth degree homological equation

[Fx, φ^[r](x)] = f^[r](x).

A degree r resonance occurs if

λ_{i_1} + ... + λ_{i_r} − λ_k = 0.
If there is no resonance of degree r, then the degree r homological equation is uniquely solvable for every f^[r](x). When there are no resonances of any degree, the convergence of the formal power series solution is delicate. We refer the reader to Arnol'd (1983) for the details.

Linearization of a Controlled Dynamics by Change of State Coordinates

Given a controlled affine dynamics (3), when does there exist a smooth local change of coordinates

z = φ(x),  0 = φ(0)

transforming it to

ż = Fz + Gu

where

F = (∂f/∂x)(0),  G = g(0)?

This is an easier question to answer than that of Poincaré.

The controlled affine dynamics (3) is said to have well-defined controllability (Kronecker) indices if there exists a reordering of g^1(x), ..., g^m(x) and integers r_1 ≥ r_2 ≥ ... ≥ r_m ≥ 0 such that r_1 + ... + r_m = n, and the vector fields

{ad^k(f)g^j : j = 1, ..., m; k = 0, ..., r_j − 1}

are linearly independent at each x, where g^i denotes the ith column of g and

ad^0(f)g^i = g^i,  ad^k(f)g^i = [f, ad^{k−1}(f)g^i].

If there are several sets of indices that satisfy this definition, then the controllability indices are the smallest in the lexicographic ordering.

A necessary and sufficient condition is that

[ad^k(f)g^i, ad^l(f)g^j] = 0

for k = 0, ..., n − 1; l = 0, ..., n. The proof of this theorem is straightforward. Under a change of state coordinates, the vector fields and their Lie brackets are transformed by the Jacobian of the coordinate change. Trivially, for linear systems,

ad^k(Fx)G^i = (−1)^k F^k G^i
[ad^k(Fx)G^i, ad^l(Fx)G^j] = 0.

Feedback Linearization

We turn to a question posed and partially answered by Brockett (1978). Given a system affine in the m-dimensional control

ẋ = f(x) + g(x)u,

find a smooth local change of coordinates and smooth feedback

z = φ(x),  u = α(x) + β(x)v

transforming it to

ż = Fz + Gv.

Brockett solved this problem under the assumptions that β is a constant and the control is a scalar, m = 1. The more general question for β(x) and arbitrary m was solved in different ways by Korobov (1979), Jakubczyk and Respondek (1980), Sommer (1980), Hunt and Su (1981), Su (1982), and Hunt et al. (1983).

We describe the solution when m = 1. If the pair F, G is controllable, then there exists an H such that

HF^{k−1}G = 0,  k = 1, ..., n − 1
HF^{n−1}G = 1
If the nonlinear system is feedback linearizable, then there exists a function h(x) = Hφ(x) such that

L_{ad^{k−1}(f)g} h = 0,  k = 1, ..., n − 1
L_{ad^{n−1}(f)g} h ≠ 0

where the Lie derivative of a function h by a vector field g is given by

L_g h = (∂h/∂x) g.

This is a system of first-order PDEs, and the solvability conditions are given by the classical Frobenius theorem, namely, that

{g, ..., ad^{n−2}(f)g}

is involutive, i.e., its span is closed under Lie bracket. For controllable systems, this is a necessary and sufficient condition. The controllability condition is that {g, ..., ad^{n−1}(f)g} spans x space.

Suppose m = 2 and the system has controllability (Kronecker) indices r_1 ≥ r_2. Such a system is feedback linearizable iff

{g^1, g^2, ..., ad^{r_i−2}(f)g^1, ad^{r_i−2}(f)g^2}

is involutive for i = 1, 2. Another way of putting it is that the distribution spanned by the first through rth rows of the following matrix must be involutive for r = r_i − 1, i = 1, 2. This is equivalent to the distribution spanned by the first through rth rows of the following matrix being involutive for all r = 1, ..., r_1.

[ g^1                 g^2
  ad(f)g^1            ad(f)g^2
  ...                 ...
  ad^{r_2−2}(f)g^1    ad^{r_2−2}(f)g^2
  ad^{r_2−1}(f)g^1    ad^{r_2−1}(f)g^2
  ...
  ad^{r_1−2}(f)g^1
  ad^{r_1−1}(f)g^1 ]

One might ask if it is possible to use dynamic feedback to linearize a system that is not linearizable by static feedback. Suppose we treat one of the controls u_j as a state and let its derivative be a new control,

u̇_j = ū_j;

can the resulting system be linearized by state feedback and change of state coordinates? Loosely speaking, the effect of adding such an integrator to the jth control is to shift the jth column of the above matrix down by one row. This changes the distribution spanned by the first through rth rows of the above matrix and might make it involutive. A scalar input system m = 1 that is linearizable by dynamic state feedback is also linearizable by static state feedback. There are multi-input systems m > 1 that are dynamically linearizable but not statically linearizable (Charlet et al. 1989, 1991).

The generic system is not feedback linearizable, but mechanical systems with one actuator for each degree of freedom typically are feedback linearizable. This fact had been used in many applications, e.g., robotics, before the concept of feedback linearization.

One should not lose sight of the fact that stabilization, model-matching, or some other performance criterion is typically the goal of controller design. Linearization is a means to the goal. We linearize because we know how to meet the performance goal for linear systems.

Even when the system is linearizable, finding the linearizing coordinates and feedback can be a nontrivial task. Mechanical systems are the exception, as the linearizing coordinates are usually the generalized positions. Since the ad^{k−1}(f)g for k = 1, ..., n − 1 are characteristic directions of the PDE for h, the general solutions of the ODEs

ẋ = ad^{k−1}(f)g(x)

can be used to construct the solution (Blankenship and Quadrat 1984). The Gardner-Shadwick (GS) algorithm (1992) is the most efficient method that is known.

Linearization of discrete time systems was treated by Lee et al. (1986). Linearization of discrete time systems around an equilibrium
manifold was treated by Barbot et al. (1995) and Jakubczyk (1987). Banaszuk and Hauser have also considered the feedback linearization of the transverse dynamics along a periodic orbit (Banaszuk and Hauser 1995a,b).

Input–Output Linearization

Feedback linearization as presented above ignores the output of the system, but typically one uses the input to control the output. Therefore, one wants to linearize the input–output response of the system rather than the dynamics. This was first treated in Isidori and Krener (1982) and Isidori and Ruberti (1984).

Consider a scalar input, scalar output system of the form

ẋ = f(x) + g(x)u,  y = h(x).

The relative degree of the system is the number of integrators between the input and the output. To be more precise, the system is of relative degree r ≥ 1 if for all x of interest,

L_{ad^j(f)g} h(x) = 0,  j = 0, ..., r − 2
L_{ad^{r−1}(f)g} h(x) ≠ 0.

In other words, the control appears first in the rth time derivative of the output. Of course, a system might not have a well-defined relative degree, as r might vary with x.

Rephrasing the result of the previous section, a scalar input nonlinear system is feedback linearizable if there exists a pseudo-output map h(x) such that the resulting scalar input, scalar output system has a relative degree equal to the state dimension n.

Assume we have a scalar input, scalar output system with a well-defined relative degree 1 ≤ r ≤ n. We can define r partial coordinate functions

ξ_i(x) = (L_f)^{i−1} h(x),  i = 1, ..., r

and choose n − r functions η_i(x), i = 1, ..., n − r, so that (ξ, η) is a full set of coordinates on the state space. Furthermore, it is always possible (Isidori 1995) to choose η_i(x) so that

L_g η_i(x) = 0,  i = 1, ..., n − r.

In these coordinates, the system is in the normal form

y = ξ_1
ξ̇_1 = ξ_2
...
ξ̇_{r−1} = ξ_r
ξ̇_r = f_r(ξ, η) + g_r(ξ, η)u
η̇ = ψ(ξ, η)

The feedback u = u(ξ, η, v) defined by

u = (v − f_r(ξ, η)) / g_r(ξ, η)

transforms the system to

y = Hξ
ξ̇ = Fξ + Gv
η̇ = ψ(ξ, η)

where F, G, H are the r × r, r × 1, and 1 × r matrices

F = [ 0 1 0 ... 0
      0 0 1 ... 0
      ... 
      0 0 0 ... 1
      0 0 0 ... 0 ],  G = [ 0; 0; ...; 0; 1 ],  H = [ 1 0 0 ... 0 ].

The system has been transformed into a string of integrators plus additional dynamics that is unobservable from the output.

By suitable choice of additional feedback v = Kξ, one can ensure that the poles of F + GK are stable. The stability of the overall system then depends on the stability of the zero dynamics (Byrnes and Isidori 1984, 1988),
434 Feedback Linearization of Nonlinear Systems
The stability of the overall system then depends on the stability of the zero dynamics (Byrnes and Isidori 1984, 1988),

  η̇ = q(0, η)

If this is stable, then the overall system will be stable. The zero dynamics is so called because it is the dynamics that results from imposing the constraint y(t) = 0 on the system. For this to be satisfied, the initial value must satisfy ξ(0) = 0 and the control must satisfy

  u(t) = − f_r(0, η(t)) / g_r(0, η(t))

Similar results hold in the multiple input, multiple output case; see Isidori (1995) for the details.

Approximate Feedback Linearization

Since so few controlled dynamics are exactly feedback linearizable, Krener (1984) introduced the concept of approximate feedback linearization. The goal is to find a smooth local change of coordinates and a smooth feedback

  z = φ(x),  u = α(x) + β(x)v

transforming the affine controlled dynamics (3) to

  ż = Fz + Gv + N(x, u)

where the nonlinearity N(x, u) is small in some sense. In the design process, the nonlinearity is ignored; the controller design is done on the linear model and then transformed back into a controller for the original system.

The power series approach of Poincaré was taken by Krener et al. (1987, 1988, 1991), Krener (1990), and Krener and Maag (1991); see also Kang (1994). It is applicable to dynamics which may not be affine in the control. The controlled nonlinear dynamics (2), the change of coordinates, and the feedback are expanded in a power series

  ẋ = Fx + Gu + f^[2](x, u) + O(x, u)³
  z = x − φ^[2](x)
  v = u − α^[2](x, u)

The transformed system is

  ż = Fz + Gv + f^[2](x, u) − [Fx + Gu, φ^[2](x)] + Gα^[2](x, u) + O(x, u)³

To eliminate the quadratic terms, one seeks a solution of the degree two homological equations for φ^[2], α^[2]

  [Fx + Gu, φ^[2](x)] − Gα^[2](x, u) = f^[2](x, u)

Unlike before, the degree two homological equations are not square. Almost always, the number of unknowns is less than the number of equations. Furthermore, the mapping

  (φ^[2](x), α^[2](x, u)) ↦ [Fx + Gu, φ^[2](x)] − Gα^[2](x, u)

is less than full rank. Hence, only an approximate, e.g., a least squares, solution is possible. Krener has written a MATLAB toolbox (http://www.math.ucdavis.edu/~krener/1995) to compute term by term solutions to the homological equations. The routine fh2f_h_.m sequentially computes the least squares solutions of the homological equations to arbitrary degree.
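Matching coefficients in the degree two homological equations yields a linear system in the unknown coefficients of φ^[2] and α^[2], solved in the least squares sense. A sketch of this computation (a hypothetical two-dimensional example with a made-up residual f^[2]; in this tiny case the coefficient system happens to be underdetermined, whereas for larger n the equations generally outnumber the unknowns), assuming SymPy and NumPy:

```python
import numpy as np
import sympy as sp

x1, x2, u = sp.symbols("x1 x2 u")
x = sp.Matrix([x1, x2])

# Linear part (string of integrators) and a made-up quadratic residual f2.
F = sp.Matrix([[0, 1], [0, 0]])
G = sp.Matrix([0, 1])
f2 = sp.Matrix([x1 * x2, x2**2 + x1 * u])

# Unknown quadratic coefficients of phi2 (in x) and alpha2 (in x and u).
mon_x = [x1**2, x1 * x2, x2**2]
mon_xu = mon_x + [x1 * u, x2 * u, u**2]
c = sp.symbols("c0:6")
d = sp.symbols("d0:6")
phi2 = sp.Matrix([sum(ci * m for ci, m in zip(c[:3], mon_x)),
                  sum(ci * m for ci, m in zip(c[3:], mon_x))])
alpha2 = sum(di * m for di, m in zip(d, mon_xu))

# Homological equation [Fx + Gu, phi2] - G alpha2 = f2, with the bracket
# [Fx + Gu, phi2] = (d phi2/dx)(Fx + Gu) - F phi2.
residual = (phi2.jacobian(x) * (F * x + G * u)
            - F * phi2 - G * alpha2 - f2).expand()

# Match coefficients of every quadratic monomial: a linear system A s = b.
unknowns = list(c) + list(d)
rows, rhs = [], []
for comp in residual:
    for coeff in sp.Poly(comp, x1, x2, u).coeffs():
        lin = sp.Poly(coeff, *unknowns)
        rows.append([float(lin.coeff_monomial(s)) for s in unknowns])
        rhs.append(-float(lin.coeff_monomial(1)))
A, b = np.array(rows), np.array(rhs)

# Least squares solution of the (generally non-square) system.
sol, _, rank, _ = np.linalg.lstsq(A, b, rcond=None)
print(A.shape, rank)
```

The same coefficient matching, repeated degree by degree, is what a toolbox such as the one referenced above automates.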
Observers with Linearizable Error Dynamics

The dual of linear state feedback is linear input–output injection. Linear input–output injection is the transformation carrying

  ẋ = Fx + Gu
  y = Hx

into

  ẋ = Fx + Gu + Ly + Mu
  y = Hx

Linear input–output injection and linear change of state coordinates,

  z = Tx,  ż = TFT⁻¹z + TGu + TLy + TMu,

define a group action on the class of linear systems. Of course, output injection is not physically realizable on the original system, but it is realizable on the observer error dynamics.

Nonlinear input–output injection is not well defined independent of the coordinates; input–output injection in one coordinate system does not look like input–output injection in another coordinate system.

If a system

  ẋ = f(x, u),  y = h(x)

can be transformed by nonlinear changes of state and output coordinates

  z = φ(x),  w = ψ(y)

to a linear system with nonlinear input–output injection

  ż = Fz + Gu + α(y, u)
  w = Hz

then the observer

  ẑ̇ = (F + LH)ẑ + Gu + α(y, u) − Lw

has linear error dynamics: with z̃ = z − ẑ,

  z̃̇ = (F + LH)z̃

If H, F is detectable, then F + LH can be made Hurwitz, i.e., all its eigenvalues are in the open left half plane.

The case when ψ = identity, there are no inputs (m = 0), and there is one output (p = 1),

  ẋ = f(x)
  y = h(x)

was solved by Krener and Isidori (1983) and Bestle and Zeitz (1983) when the pair H, F defined by

  F = (∂f/∂x)(0),  H = (∂h/∂x)(0)

is observable. One seeks a change of coordinates z = φ(x) so that the system is linear up to output injection

  ż = Fz + α(y)
  y = Hz

If they exist, the z coordinates satisfy the PDEs

  L_{ad^{n−k}(f)g}(z_j) = δ_{k,j}

where the vector field g(x) is defined by

  L_g L_f^{k−1} h = 0,  1 ≤ k < n
  L_g L_f^{n−1} h = 1

The solvability conditions for these PDEs are that, for 1 ≤ k < l ≤ n,

  [ad^{k−1}(f)g, ad^{l−1}(f)g] = 0.

The general case with ψ, m, p arbitrary was solved by Krener and Respondek (1985). The solution is a three-step process. First, one must set up and solve a linear PDE for ψ(y). The integrability conditions for this PDE involve the vanishing of a pseudo-curvature (Krener 1986). The next two steps are similar to the above.
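For the linear(ized) error dynamics, the injection gain L can be computed by pole placement on the dual pair; a sketch with an illustrative observable pair (F, H), assuming NumPy and SciPy:

```python
import numpy as np
from scipy.signal import place_poles

# Illustrative observable pair (F, H); the gain L is obtained by pole
# placement on the dual pair, so that F + L H is Hurwitz.
F = np.array([[0.0, 1.0], [0.0, 0.0]])
H = np.array([[1.0, 0.0]])

res = place_poles(F.T, H.T, [-2.0, -3.0])   # eig(F.T - H.T K) = {-2, -3}
L = -res.gain_matrix.T                      # then eig(F + L H) = {-2, -3}

eigs = np.linalg.eigvals(F + L @ H)
print(sorted(eigs.real))
```

Because the injection term α(y, u) is fed to the observer exactly as it enters the transformed dynamics, it cancels in z̃ = z − ẑ, so the error obeys the linear equation z̃̇ = (F + LH)z̃ with the poles placed above.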
One defines a vector field g^j, 1 ≤ j ≤ p, for each output; these define a PDE for the change of coordinates, for which certain integrability conditions must be satisfied. The process is more complicated than feedback linearization and even less likely to be successful, so approximate solutions must be sought, which we will discuss later in this section. We refer the reader to Krener and Respondek (1985) and the related work of Zeitz (1987) and Xia and Gao (1988a,b, 1989).

Very few systems can be linearized by change of state coordinates and input–output injection, so Krener et al. (1987, 1988, 1991), Krener (1990), and Krener and Maag (1991) sought approximate solutions by the power series approach. Again, the system, the changes of coordinates, and the output injection are expanded in a power series. See the above references for details.

Conclusion

We have surveyed the various ways a nonlinear system can be approximated by a linear system.

Cross-References

- Differential Geometric Methods in Nonlinear Control
- Lie Algebraic Methods in Nonlinear Control
- Nonlinear Zero Dynamics

Bibliography

Arnol'd VI (1983) Geometrical methods in the theory of ordinary differential equations. Springer, Berlin
Banaszuk A, Hauser J (1995a) Feedback linearization of transverse dynamics for periodic orbits. Syst Control Lett 26:185–193
Banaszuk A, Hauser J (1995b) Feedback linearization of transverse dynamics for periodic orbits in R3 with points of transverse controllability loss. Syst Control Lett 26:95–105
Barbot JP, Monaco S, Normand-Cyrot D (1995) Linearization about an equilibrium manifold in discrete time. In: Proceedings of IFAC NOLCOS 95, Tahoe City. Pergamon
Barbot JP, Monaco S, Normand-Cyrot D (1997) Quadratic forms and approximate feedback linearization in discrete time. Int J Control 67:567–586
Bestle D, Zeitz M (1983) Canonical form observer design for non-linear time-variable systems. Int J Control 38:419–431
Blankenship GL, Quadrat JP (1984) An expert system for stochastic control and signal processing. In: Proceedings of IEEE CDC, Las Vegas. IEEE, pp 716–723
Brockett RW (1978) Feedback invariants for nonlinear systems. In: Proceedings of the international congress of mathematicians, Helsinki, pp 1357–1368
Brockett RW (1981) Control theory and singular Riemannian geometry. In: Hilton P, Young G (eds) New directions in applied mathematics. Springer, New York, pp 11–27
Brockett RW (1983) Nonlinear control theory and differential geometry. In: Proceedings of the international congress of mathematicians, Warsaw, pp 1357–1368
Brockett RW (1996) Characteristic phenomena and model problems in nonlinear control. In: Proceedings of the IFAC congress, Sydney. IFAC
Byrnes CI, Isidori A (1984) A frequency domain philosophy for nonlinear systems. In: Proceedings, IEEE conference on decision and control, Las Vegas. IEEE, pp 1569–1573
Byrnes CI, Isidori A (1988) Local stabilization of minimum-phase nonlinear systems. Syst Control Lett 11:9–17
Charlet B, Levine J, Marino R (1989) On dynamic feedback linearization. Syst Control Lett 13:143–151
Charlet B, Levine J, Marino R (1991) Sufficient conditions for dynamic state feedback linearization. SIAM J Control Optim 29:38–57
Gardner RB, Shadwick WF (1992) The GS-algorithm for exact linearization to Brunovsky normal form. IEEE Trans Autom Control 37:224–230
Hunt LR, Su R (1981) Linear equivalents of nonlinear time varying systems. In: Proceedings of the symposium on the mathematical theory of networks and systems, Santa Monica, pp 119–123
Hunt LR, Su R, Meyer G (1983) Design for multi-input nonlinear systems. In: Millman RS, Brockett RW, Sussmann HJ (eds) Differential geometric control theory. Birkhauser, Boston, pp 268–298
Isidori A (1995) Nonlinear control systems. Springer, Berlin
Isidori A, Krener AJ (1982) On the feedback equivalence of nonlinear systems. Syst Control Lett 2:118–121
Isidori A, Ruberti A (1984) On the synthesis of linear input–output responses for nonlinear systems. Syst Control Lett 4:17–22
Jakubczyk B (1987) Feedback linearization of discrete-time systems. Syst Control Lett 9:17–22
Jakubczyk B, Respondek W (1980) On linearization of control systems. Bull Acad Polonaise Sci Ser Sci Math 28:517–522
Kang W (1994) Approximate linearization of nonlinear control systems. Syst Control Lett 23:43–52
Korobov VI (1979) A general approach to the solution of the problem of synthesizing bounded controls in a control problem. Math USSR-Sb 37:535
Krener AJ (1973) On the equivalence of control systems and the linearization of nonlinear systems. SIAM J Control 11:670–676
Krener AJ (1984) Approximate linearization by state feedback and coordinate change. Syst Control Lett 5:181–185
Krener AJ (1986) The intrinsic geometry of dynamic observations. In: Fliess M, Hazewinkel M (eds) Algebraic and geometric methods in nonlinear control theory. Reidel, Amsterdam, pp 77–87
Krener AJ (1990) Nonlinear controller design via approximate normal forms. In: Helton JW, Grunbaum A, Khargonekar P (eds) Signal processing, Part II: control theory and its applications. Springer, New York, pp 139–154
Krener AJ, Isidori A (1983) Linearization by output injection and nonlinear observers. Syst Control Lett 3:47–52
Krener AJ, Maag B (1991) Controller and observer design for cubic systems. In: Gombani A, DiMasi GB, Kurzhansky AB (eds) Modeling, estimation and control of systems with uncertainty. Birkhauser, Boston, pp 224–239
Krener AJ, Respondek W (1985) Nonlinear observers with linearizable error dynamics. SIAM J Control Optim 23:197–216
Krener AJ, Karahan S, Hubbard M, Frezza R (1987) Higher order linear approximations to nonlinear control systems. In: Proceedings, IEEE conference on decision and control, Los Angeles. IEEE, pp 519–523
Krener AJ, Karahan S, Hubbard M (1988) Approximate normal forms of nonlinear systems. In: Proceedings, IEEE conference on decision and control, San Antonio. IEEE, pp 1223–1229
Krener AJ, Hubbard M, Karahan S, Phelps A, Maag B (1991) Poincare's linearization method applied to the design of nonlinear compensators. In: Jacob G, Lamnahbi-Lagarrigue F (eds) Algebraic computing in control. Springer, Berlin, pp 76–114
Lee HG, Arapostathis A, Marcus SI (1986) Linearization of discrete-time systems. Int J Control 45:1803–1822
Murray RM (1995) Nonlinear control of mechanical systems: a Lagrangian perspective. In: Krener AJ, Mayne DQ (eds) Nonlinear control systems design. Pergamon, Oxford
Nonlinear Systems Toolbox, available for download at http://www.math.ucdavis.edu/~krener/1995
Sommer R (1980) Control design for multivariable nonlinear time varying systems. Int J Control 31:883–891
Su R (1982) On the linear equivalents of control systems. Syst Control Lett 2:48–52
Xia XH, Gao WB (1988a) Nonlinear observer design by canonical form. Int J Control 47:1081–1100
Xia XH, Gao WB (1988b) On exponential observers for nonlinear systems. Syst Control Lett 11:319–325
Xia XH, Gao WB (1989) Nonlinear observer design by observer error linearization. SIAM J Control Optim 27:199–216
Zeitz M (1987) The extended Luenberger observer for nonlinear systems. Syst Control Lett 9:149–156

Feedback Stabilization of Nonlinear Systems

A. Astolfi
Department of Electrical and Electronic Engineering, Imperial College London, London, UK
Dipartimento di Ingegneria Civile e Ingegneria Informatica, Università di Roma Tor Vergata, Roma, Italy

Abstract

We consider the simplest design problem for nonlinear systems: the problem of rendering asymptotically stable a given equilibrium by means of state feedback. For such a problem, we provide a necessary condition, known as Brockett condition, and a sufficient condition, which relies upon the definition of a class of functions, known as control Lyapunov functions. The theory is illustrated by means of a few examples. In addition, we discuss a nonlinear enhancement of the so-called separation principle for stabilization by means of partial state information.

Keywords

Brockett theorem; Control Lyapunov function; Output feedback; State feedback

Introduction

The problem of feedback stabilization, namely, the problem of designing a feedback control law locally, or globally, asymptotically stabilizing a given equilibrium point, is the simplest design problem for nonlinear systems.
If the state of the system is available for feedback, then the problem is referred to as the state feedback stabilization problem, whereas if only part of the state, for example, an output signal, is available for feedback, the problem is referred to as the partial state feedback (or output feedback) stabilization problem. We initially focus on the state feedback stabilization problem, which can be formulated as follows.

Consider a nonlinear system described by the equation

  ẋ = F(x, u),  (1)

where x(t) ∈ IR^n denotes the state of the system, u(t) ∈ IR^m denotes the input of the system, and F: IR^n × IR^m → IR^n is a smooth mapping. Let x_0 ∈ IR^n be an achievable equilibrium, i.e., x_0 is such that there exists a constant u_0 ∈ IR^m such that F(x_0, u_0) = 0. The state feedback stabilization problem consists in finding, if possible, a state feedback control law, described by the equation

  u = α(x),  (2)

with α: IR^n → IR^m, such that the equilibrium x_0 is a locally asymptotically stable equilibrium for the closed-loop system

  ẋ = F(x, α(x)).  (3)

Alternatively, one could require that the equilibrium be globally asymptotically stable. Note that it is not always possible to extend local properties to global properties. For example, for the system described by the equations ẋ_1 = −x_2(1 − x_1²), ẋ_2 = u, with x_1(t) ∈ IR, x_2(t) ∈ IR, and u(t) ∈ IR, it is not possible to design a feedback law which renders the zero equilibrium globally asymptotically stable (ẋ_1 vanishes whenever x_1 = ±1, so no feedback can drive trajectories across the planes x_1 = ±1).

If only partial information on the state is available, then one has to resort to a dynamic output feedback controller, namely, a controller described by equations of the form

  ξ̇ = β(ξ, y),  u = α(ξ),  (4)

where ξ(t) ∈ IR^ν describes the state of the controller, y(t) ∈ IR^p is given by y = h(x), for some mapping h: IR^n → IR^p, and describes the available information on the state x, and β: IR^ν × IR^p → IR^ν and α: IR^ν → IR^m are smooth mappings. Within this scenario, the stabilization problem boils down to selecting the (nonnegative) integer ν (i.e., the order of the controller), a constant ξ_0 ∈ IR^ν, and the mappings α and β such that the closed-loop system

  ẋ = F(x, α(ξ)),  ξ̇ = β(ξ, h(x)),  (5)

has a locally (or globally) asymptotically stable equilibrium at (x_0, ξ_0). Alternatively, one may require that the equilibrium (x_0, ξ_0) of the closed-loop system (5) be locally asymptotically stable with a region of attraction that contains a given, user-specified, set.

The rest of the entry is organized as follows. We begin discussing two key results. The first is a necessary condition, due to R.W. Brockett, for continuous stabilizability. This provides an obstruction to the solvability of the problem and can be used to show that, for nonlinear systems, controllability does not imply stabilizability by continuous feedback. The second one is the extension of the Lyapunov direct method to systems with control. The main idea is the introduction of a control version of Lyapunov functions, the control Lyapunov functions, which can be used to design stabilizing control laws by means of a universal formula. We then describe two classes of systems for which it is possible to construct, with systematic procedures, smooth control laws yielding global asymptotic stability of a given equilibrium: systems in feedback and in feedforward form. There are several other constructive and systematic stabilization methods which have been developed in the last few decades. Worth mentioning are passivity-based methods and center manifold-based methods. We conclude the entry describing a nonlinear version of the separation principle for the asymptotic stabilization, by output feedback, of a general class of nonlinear systems.
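Candidate state feedback laws (2) can be screened by simulating the closed-loop system (3); a generic harness (the particular F and α below are illustrative placeholders, not systems from the text), assuming SciPy:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Generic closed-loop simulation of x' = F(x, alpha(x)).
# F and alpha are placeholders: a double integrator with a
# stabilizing linear feedback.
def F(x, u):
    return np.array([x[1], u])

def alpha(x):
    return -2.0 * x[0] - 3.0 * x[1]   # places closed-loop poles at -1, -2

sol = solve_ivp(lambda t, x: F(x, alpha(x)), (0.0, 10.0), [1.0, 0.0],
                rtol=1e-9, atol=1e-12)
print(np.linalg.norm(sol.y[:, -1]))   # small: x(t) -> x0 = 0
```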
Preliminary Results

To highlight the difficulties and peculiarities of the nonlinear stabilization problem, we recall some basic facts from linear systems theory and exploit such facts to derive a sufficient condition and a necessary condition. In the case of linear systems, i.e., systems described by the equation ẋ = Ax + Bu, with A ∈ IR^{n×n} and B ∈ IR^{n×m}, and linear state feedback, i.e., feedback described by the equation u = Kx, with K ∈ IR^{m×n}, the stabilization problem boils down to the problem of placing, in the complex plane, the eigenvalues of the matrix A + BK to the left of the imaginary axis. This problem is solvable if and only if the uncontrollable modes of the system are located, in the complex plane, to the left of the imaginary axis.

The linear theory may be used to provide a simple obstruction to feedback stabilizability and a simple sufficient condition. Let x_0 be an achievable equilibrium with u_0 = 0 and note that the linear approximation of the system (1) around x_0 is described by an equation of the form ẋ = Ax + Bu. If for any K ∈ IR^{m×n} the condition

  σ(A + BK) ∩ C⁺ ≠ ∅  (6)

holds, where C⁺ denotes the set of complex numbers with positive real part, then the equilibrium of the nonlinear system cannot be stabilized by any continuously differentiable feedback such that α(x_0) = 0. The notation σ(A) denotes the spectrum of the matrix A, i.e., the eigenvalues of A. Note however that, if the condition α(x_0) = 0 is dropped, the obstruction does not hold: the zero equilibrium of ẋ = x + xu is not stabilizable by any continuous feedback such that α(0) = 0, yet the feedback u = −2 is a (global) stabilizer. On the contrary, if there exists a K such that

  σ(A + BK) ⊂ C⁻,

where C⁻ denotes the set of complex numbers with negative real part, then the feedback α(x) = Kx locally asymptotically stabilizes the equilibrium x_0 of the closed-loop system. This fact is often referred to as the linearization approach.

The above linear arguments are often inadequate to design feedback stabilizers: a theory for nonlinear feedback has to be developed. However, this theory is much more involved. In particular, it is important to observe that the solvability of the stabilization problem may depend upon the regularity properties of the feedback, i.e., of the mapping α. In fact, a given equilibrium of a nonlinear system may be rendered locally asymptotically stable by a continuous feedback, whereas there may be no continuously differentiable feedback achieving the same goal. If the feedback is required to be continuously differentiable, then the problem is often referred to as the smooth stabilization problem.

Example 1 To illustrate the role of the regularity properties of the feedback, consider the system described by the equations

  ẋ_1 = x_1 − x_2³,  ẋ_2 = u,

with x_1(t) ∈ IR, x_2(t) ∈ IR, and u(t) ∈ IR, and the equilibrium x_0 = (0, 0). The equilibrium is globally asymptotically stabilized by the continuous feedback

  α(x) = −x_2 + (x_1 + (4/3)x_1³ − x_2³)^{1/3},

but it is not stabilizable by any continuously differentiable feedback. Note, in fact, that condition (6) holds.

Brockett Theorem

Brockett's necessary condition, which is far from being sufficient, provides a simple test to rule out the existence of a continuous stabilizer.

Theorem 1 Consider the system (1) and assume x_0 = 0 is an achievable equilibrium with u_0 = 0. Assume there exists a continuous stabilizing feedback u = α(x). Then, for each ε > 0 there exists δ > 0 such that, for all y with ‖y‖ < δ, the equation y = F(x, u) has at least one solution in the set ‖x‖ < ε, ‖u‖ < ε.
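The obstruction (6) can be verified symbolically for Example 1: the unstable eigenvalue 1 of the linearization at the origin is uncontrollable and therefore survives every linear gain K (a sketch assuming SymPy):

```python
import sympy as sp

k1, k2, lam = sp.symbols("k1 k2 lambda")

# Linearization of Example 1 (x1' = x1 - x2^3, x2' = u) at the origin.
A = sp.Matrix([[1, 0], [0, 0]])
B = sp.Matrix([0, 1])
K = sp.Matrix([[k1, k2]])

p = (A + B * K).charpoly(lam).as_expr()
print(sp.factor(p))   # (lambda - 1) is a factor for every K, so (6) holds
```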
Theorem 1 can be reformulated as follows. The existence of a continuous stabilizer implies that the image of the mapping F: IR^n × IR^m → IR^n covers a neighborhood of the origin. Note, in addition, that the obstruction expressed by Theorem 1 is of topological nature. Hence, it requires continuity of F and α, and time invariance: it does not hold if u = α(x, t), i.e., if a time-varying feedback is designed.

In the linear case, Brockett condition reduces to the condition

  rank [A − λI, B] = n

for λ = 0. This is a necessary, but clearly not sufficient, condition for the stabilizability of ẋ = Ax + Bu.

Example 2 Consider the kinematic model of a mobile robot given by the equations

  ẋ = v cos θ,
  ẏ = v sin θ,
  θ̇ = ω,

where (x(t), y(t)) ∈ IR² denotes the Cartesian position of the robot, θ(t) ∈ (−π, π] denotes the robot orientation (with respect to the x-axis), v(t) ∈ IR is the forward velocity of the robot, and ω(t) ∈ IR is its angular velocity. Simple intuitive considerations suggest that the system is controllable, i.e., it is possible to select the forward and angular velocities to drive the robot from any initial position/orientation to any final position/orientation in any given positive time. Nevertheless, the zero equilibrium (and any other equilibrium of the system) is not continuously stabilizable. In fact, the equations

  y_1 = v cos θ,  y_2 = v sin θ,  y_3 = ω,

with ‖(y_1, y_2, y_3)‖ < δ and ‖(x, y, θ)‖ < ε, ‖(v, ω)‖ < ε, are in general not solvable. For example, if ε < π/2 and y_1 = 0, y_2 ≠ 0, y_3 = 0, then the unique solution of the first and third equations is v = 0 and ω = 0, implying v sin θ = 0; hence, the second equation does not have a solution.

Control Lyapunov Functions

The Lyapunov theory states that the equilibrium x_0 of the system

  ẋ = f(x),

with f: IR^n → IR^n, is locally asymptotically stable if there exists a continuously differentiable function V: IR^n → IR, called Lyapunov function, and a neighborhood U of x_0 such that V(x_0) = 0, V(x) > 0, for all x ∈ U and x ≠ x_0, and (∂V/∂x) f(x) < 0, for all x ∈ U and x ≠ x_0.

To apply this idea to the stabilization problem, consider the system (1). If the equilibrium x_0 of ẋ = F(x, u) is continuously stabilizable, then there must exist a continuously differentiable function V and a neighborhood U of x_0 such that

  inf_u (∂V/∂x) F(x, u) < 0,

for all x ∈ U and x ≠ x_0. This motivates the following definition.

Definition 1 A continuously differentiable function V such that
• V(x_0) = 0 and V(x) > 0, for all x ∈ U and x ≠ x_0,
• inf_u (∂V/∂x) F(x, u) < 0, for all x ∈ U and x ≠ x_0,
is called a control Lyapunov function.

By Lyapunov theory, the existence of a continuous stabilizer implies the existence of a control Lyapunov function. On the other hand, the existence of a control Lyapunov function does not guarantee the existence of a stabilizer. However, in the case of systems affine in the control, i.e., systems described by the equation

  ẋ = f(x) + g(x)u,  (7)

with f: IR^n → IR^n and g: IR^n → IR^{n×m} smooth mappings, very general results can be proven. These have been proven by Z. Artstein, who gave a nonconstructive statement, and have been given a constructive form by E.D. Sontag.
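Sontag's constructive result turns a control Lyapunov function into an explicit feedback: for scalar input, α = −(L_fV + √((L_fV)² + (L_gV)⁴))/L_gV where L_gV ≠ 0, and α = 0 otherwise. A numeric sketch on a hypothetical scalar example (system and V chosen for illustration), assuming NumPy:

```python
import numpy as np

# Sontag's universal formula for scalar input: given a = LfV(x), b = LgV(x),
#   alpha = 0                              if b == 0
#   alpha = -(a + sqrt(a**2 + b**4)) / b   otherwise
def sontag(a, b):
    if b == 0.0:
        return 0.0
    return -(a + np.sqrt(a**2 + b**4)) / b

# Hypothetical example: x' = x + u with V = x^2/2, so LfV = x^2, LgV = x.
for x in (-1.0, 0.5, 2.0):
    u = sontag(x * x, x)
    assert x * (x + u) < 0.0   # Vdot = x (x + u) < 0 away from the origin
print("ok")
```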
In particular, for single-input nonlinear systems, the following statement holds.

Theorem 2 Consider the system (7), with m = 1, and assume f(0) = 0. There exists an almost smooth feedback, i.e., a feedback α(x) continuously differentiable for all x ∈ IR^n, x ≠ 0, and continuous at x = 0, which globally asymptotically stabilizes the equilibrium x = 0 if and only if there exists a positive definite, radially unbounded (i.e., lim_{‖x‖→∞} V(x) = ∞), and smooth function V(x) such that:
1. (∂V/∂x) g(x) = 0 ⇒ (∂V/∂x) f(x) < 0, for all x ≠ 0;
2. For each ε > 0 there is a δ > 0 such that ‖x‖ < δ implies that there is a |u| < ε such that

  (∂V/∂x) f(x) + (∂V/∂x) g(x) u < 0.

Condition 2 is known as the small control property, and it is necessary to guarantee continuity of the feedback at x = 0. If Conditions 1 and 2 hold, then an almost smooth feedback is given by the so-called Sontag's universal formula:

  α(x) = 0,  if (∂V/∂x) g(x) = 0,

  α(x) = − [ (∂V/∂x) f(x) + √( ((∂V/∂x) f(x))² + ((∂V/∂x) g(x))⁴ ) ] / ( (∂V/∂x) g(x) ),  elsewhere.

Constructive Stabilization

We now introduce two classes of nonlinear systems for which systematic design methods to solve the state feedback stabilization problem are available.

Feedback Systems
Consider a nonlinear system described by equations of the form

  ẋ_1 = f_1(x_1, x_2),  ẋ_2 = u,  (8)

with x_1(t) ∈ IR^n, x_2(t) ∈ IR, u(t) ∈ IR, and f_1(0, 0) = 0. This system belongs to the so-called class of feedback systems, for which a sort of reduction principle holds: the zero equilibrium of the system is smoothly stabilizable if the same holds for the reduced system ẋ_1 = f_1(x_1, v), which is obtained from the first of Eq. (8) replacing the state variable x_2 with a virtual control input v. To show this property, suppose there exist a continuously differentiable function α_1: IR^n → IR and a continuously differentiable and radially unbounded function V_1: IR^n → IR such that V_1(0) = 0, V_1(x_1) > 0, for all x_1 ≠ 0, and

  (∂V_1/∂x_1) f_1(x_1, α_1(x_1)) < 0,

for all x_1 ≠ 0, i.e., the zero equilibrium of the system ẋ_1 = f_1(x_1, v) is globally asymptotically stabilizable.
Consider now the function

  V(x_1, x_2) = V_1(x_1) + (1/2)(x_2 − α_1(x_1))²,

which is radially unbounded and such that V(0, 0) = 0 and V(x_1, x_2) > 0 for all nonzero (x_1, x_2), and note that

  V̇ = (∂V_1/∂x_1) f_1(x_1, x_2) + (x_2 − α_1(x_1)) (u + Δ_1(x_1, x_2))
    = (∂V_1/∂x_1) f_1(x_1, α_1(x_1)) + (x_2 − α_1(x_1)) (u + Δ_2(x_1, x_2)),

for some continuously differentiable mappings Δ_1 and Δ_2. As a result, the feedback

  α(x_1, x_2) = −Δ_2(x_1, x_2) − k (x_2 − α_1(x_1)),

with k > 0, yields V̇ < 0 for all nonzero (x_1, x_2); hence, the feedback is a continuously differentiable stabilizer for the zero equilibrium of the system (8). Note, finally, that the function V is a control Lyapunov function for the system (8); hence, Sontag's formula can also be used to construct a stabilizer.

The result discussed above is at the basis of the so-called backstepping technique for recursive stabilization of systems described, for example, by equations of the form

  ẋ_1 = x_2 + φ_1(x_1),
  ẋ_2 = x_3 + φ_2(x_1, x_2),
  ẋ_3 = x_4 + φ_3(x_1, x_2, x_3),
  ⋮
  ẋ_n = u + φ_n(x_1, …, x_n),

with x_i(t) ∈ IR for all i ∈ [1, n], and φ_i smooth mappings such that φ_i(0) = 0, for all i ∈ [1, n].

Feedforward Systems
Consider a nonlinear system described by equations of the form

  ẋ_1 = f_1(x_2),  ẋ_2 = f_2(x_2) + g_2(x_2) u,  (9)

with x_1(t) ∈ IR, x_2(t) ∈ IR^n, u(t) ∈ IR, f_1(0) = 0, and f_2(0) = 0. This system belongs to the so-called class of feedforward systems, for which, similarly to feedback systems, a sort of reduction principle holds: the zero equilibrium of the system is smoothly stabilizable if the zero equilibrium of the reduced system ẋ_2 = f_2(x_2) is globally asymptotically stable and some additional structural assumption holds. To show this property, suppose there exists a continuously differentiable and radially unbounded function V_2: IR^n → IR such that V_2(0) = 0, V_2(x_2) > 0, for all x_2 ≠ 0, and

  (∂V_2/∂x_2) f_2(x_2) < 0,

for all x_2 ≠ 0. Suppose, in addition, that there exists a continuously differentiable mapping M(x_2) such that

  f_1(x_2) − (∂M/∂x_2) f_2(x_2) = 0,

  M(0) = 0 and (∂M/∂x_2)|_{x_2=0} ≠ 0.

Existence of such a mapping is guaranteed, for example, by asymptotic stability of the linearization of the system ẋ_2 = f_2(x_2) around the origin and controllability of the linearization of the system (9) around the origin. Consider now the function

  V(x_1, x_2) = (1/2)(x_1 − M(x_2))² + V_2(x_2),

which is radially unbounded and such that V(0, 0) = 0 and V(x_1, x_2) > 0 for all nonzero (x_1, x_2), and note that

  V̇ = −(x_1 − M(x_2)) (∂M/∂x_2) g_2(x_2) u + (∂V_2/∂x_2) f_2(x_2) + (∂V_2/∂x_2) g_2(x_2) u
    = (∂V_2/∂x_2) f_2(x_2) + [ (∂V_2/∂x_2) − (x_1 − M(x_2)) (∂M/∂x_2) ] g_2(x_2) u.

As a result, the feedback

  α(x_1, x_2) = −k [ (∂V_2/∂x_2) − (x_1 − M(x_2)) (∂M/∂x_2) ] g_2(x_2),

with k > 0, yields V̇ < 0 for all nonzero (x_1, x_2); hence, the feedback is a continuously differentiable stabilizer for the zero equilibrium of the system (9). Note, finally, that the function V is a control Lyapunov function for the system (9); hence, Sontag's formula can also be used to construct a stabilizer.
The result discussed above is at the basis of the so-called forwarding technique for recursive stabilization of systems described, for example, by equations of the form

  ẋ_1 = φ_1(x_2, …, x_n),
  ẋ_2 = φ_2(x_3, …, x_n),
  ⋮
  ẋ_{n−1} = φ_{n−1}(x_n),
  ẋ_n = u,

with x_i(t) ∈ IR for all i ∈ [1, n], and φ_i smooth mappings such that φ_i(0) = 0, for all i ∈ [1, n].

Stabilization via Output Feedback

In the previous sections, we have studied the stabilization problem for nonlinear systems under the assumption that the whole state is available for feedback. This requires the online measurement of the state vector x, which may pose a severe constraint in applications. This observation motivates the study of the much more challenging, but more realistic, problem of stabilization with partial state information. This problem requires the introduction of a notion of observability. Note that for nonlinear systems, it is possible to define several, nonequivalent, observability notions. Similarly to section "Control Lyapunov Functions", we focus on the class of systems affine in the control, i.e., systems described by equations of the form

  ẋ = f(x) + g(x)u,
  y = h(x),  (10)

with f: IR^n → IR^n, g: IR^n → IR^{n×m}, and h: IR^n → IR^p smooth mappings. This is precisely the class of systems in Eq. (7) with the addition of the output map h, i.e., a map which describes the information that is available for feedback. In addition, we assume, to simplify the notation, that m = 1 and p = 1: the system is single input, single output. Finally assume, without loss of generality, that the equilibrium to be stabilized is x_0 = 0, that any stabilizing state feedback control law u = α(x) is such that α(0) = 0, and that h(0) = 0.

To define the observability notion of interest, consider the sequence of mappings

  θ_0(x) = h(x),
  θ_1(x, v_0) = (∂θ_0/∂x) [f(x) + g(x)v_0],
  θ_2(x, v_0, v_1) = (∂θ_1/∂x) [f(x) + g(x)v_0] + (∂θ_1/∂v_0) v_1,
  ⋮
  θ_k(x, v_0, v_1, …, v_{k−1}) = (∂θ_{k−1}/∂x) [f(x) + g(x)v_0] + Σ_{i=0}^{k−2} (∂θ_{k−1}/∂v_i) v_{i+1},  (11)

with k ≤ n. Note that if u(t) is of class C^{k−1}, then

  y^{(k)}(t) = θ_k(x(t), u(t), …, u^{(k−1)}(t)),

where the notation y^{(k)}(t), with k a positive integer, is used to denote the k-th order derivative of the function y(t), provided it exists. The mappings θ_0 to θ_{n−1} can be collected into a unique mapping Φ: IR^n × IR^{n−1} → IR^n defined as

  Φ(x, v_0, v_1, …, v_{n−2}) = [ θ_0(x); θ_1(x, v_0); ⋮; θ_{n−1}(x, v_0, v_1, …, v_{n−2}) ].

The mapping Φ is, by construction, such that

  Φ(x(t), u(t), u̇(t), …, u^{(n−2)}(t)) = [ y(t)  ẏ(t)  ⋯  y^{(n−1)}(t) ]′,

for any t in which the indicated signals exist.
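The mappings θ_k and the stacked map Φ are directly computable. A sketch for a hypothetical second-order system, for which the Jacobian of Φ with respect to x turns out to have full rank everywhere, assuming SymPy:

```python
import sympy as sp

x1, x2, v0 = sp.symbols("x1 x2 v0")
x = sp.Matrix([x1, x2])

# Hypothetical SISO system x' = f(x) + g(x) u, y = h(x) (n = 2).
f = sp.Matrix([x2, -x1 - x2**3])
g = sp.Matrix([0, 1])
h = x1

# theta_0 = h and theta_1 = (d theta_0/dx)(f + g v0) give y and y'.
theta0 = h
theta1 = (sp.Matrix([theta0]).jacobian(x) * (f + g * v0))[0, 0]

Phi = sp.Matrix([theta0, theta1])
print(Phi.jacobian(x).rank())   # rank n = 2 for all (x, v0) here
```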
444 Feedback Stabilization of Nonlinear Systems

$$\operatorname{rank} \frac{\partial \Phi}{\partial x}(\bar{x}, \bar{v}) = n, \qquad (12)$$

then, by the Implicit Function Theorem, there exists locally around $(\bar{x}, \bar{v})$ a smooth mapping $\Psi: \mathbb{R}^n \times \mathbb{R}^{n-1} \to \mathbb{R}^n$ such that

$$\omega = \Phi(\Psi(\omega, v), v),$$

i.e., the mapping $\Psi$ is the inverse of $\Phi$, parameterized by $v$. We conclude the discussion noting that if, at a certain time $\bar{t}$, $\bar{x} = x(\bar{t})$ and $\bar{v} = [\, u(\bar{t}) \;\; \dot{u}(\bar{t}) \;\; \cdots \;\; u^{(n-2)}(\bar{t}) \,]'$ are such that the rank condition (12) holds, then the mapping $\Psi$ can be used to reconstruct the state of the system from measurements of the input and its derivatives, and of the output and its derivatives, for all $t$ in a neighborhood of $\bar{t}$: $x(t) = \Psi(\omega(t), v(t))$ for all $t$ in a neighborhood of $\bar{t}$, where

$$v(t) = [\, u(t) \;\; \dot{u}(t) \;\; \cdots \;\; u^{(n-2)}(t) \,]', \qquad \omega(t) = [\, y(t) \;\; \dot{y}(t) \;\; \cdots \;\; y^{(n-1)}(t) \,]'. \qquad (13)$$

This property is a local property: to derive a property which allows a global reconstruction of the state, we need to impose additional conditions.

Definition 2 Consider the system (10) with $m = p = 1$. The system is said to be uniformly observable if:
(i) The mapping $H: \mathbb{R}^n \to \mathbb{R}^n$ defined as

$$H(x) = \begin{bmatrix} h(x) \\ L_f h(x) \\ \vdots \\ L_f^{n-1} h(x) \end{bmatrix}$$

is a global diffeomorphism. The functions $L_f^i h$, with $i$ a nonnegative integer, are defined as $L_f h(x) = L_f^1 h(x) = \frac{\partial h}{\partial x} f(x)$ and, recursively, as $L_f^{i+1} h(x) = L_f(L_f^i h(x))$.
(ii) The rank condition (12) holds for all $(x, v) \in \mathbb{R}^n \times \mathbb{R}^{n-1}$.

The notion of uniform observability allows to perform a global reconstruction of the state, i.e., it makes sure that the identities

$$\omega = \Phi(\Psi(\omega, v), v), \qquad x = \Psi(\Phi(x, v), v)$$

hold for all $x$, $v$ and $\omega$. In principle, this property may be used in an output feedback control architecture obtained by implementing a stabilizing state feedback $u = \alpha(x)$ as $u = \alpha(\Psi(\omega, v))$, with $v$ and $\omega$ as given in (13). This implementation is, however, not possible, since it gives an implicit definition of $u$ and requires the exact differentiation of the input and output signals.

To circumvent these difficulties, one needs to follow a somewhat longer path, as described hereafter. In addition, the global asymptotic stability requirement should be replaced by a less ambitious, yet practically meaningful, requirement: semi-global asymptotic stability. This requirement can be formalized as follows.

Definition 3 The equilibrium $x_0$ of the system (1), or (7), is said to be semi-globally asymptotically stabilizable if, for each compact set $K \subset \mathbb{R}^n$ such that $x_0 \in \operatorname{int}(K)$, i.e., the set of all interior points of $K$, there exists a feedback control law, possibly depending on $K$, such that the equilibrium $x_0$ is a locally asymptotically stable equilibrium of the closed-loop system and for any $x(0) \in K$ one has $\lim_{t \to \infty} x(t) = x_0$.

To bypass the need for the derivatives of the input signal, consider the extended system

$$\dot{x} = f(x) + g(x)v_0, \quad \dot{v}_0 = v_1, \quad \dot{v}_1 = v_2, \quad \ldots, \quad \dot{v}_{n-1} = \tilde{u}. \qquad (14)$$

Note that, as described in section "Feedback Systems", if the equilibrium $x_0 = 0$ of the system $\dot{x} = f(x) + g(x)u$ is globally asymptotically stabilizable by a (smooth) feedback $u = \alpha(x)$, then there exists a smooth state feedback $\tilde{u} = \tilde{\alpha}(x, v_0, v_1, \ldots, v_{n-1})$ which globally asymptotically stabilizes the zero equilibrium of the system (14). In the feedback $\tilde{\alpha}$, one can replace $x$ with $\Psi(\omega, v)$, thus yielding a feedback of the measurable part of the state of the system (14) and of the output $y$ and its derivatives. Note that if $\omega(t) = [\, y(t) \;\; \dot{y}(t) \;\; \cdots \;\; y^{(n-1)}(t) \,]'$, then the feedback $\tilde{u} = \tilde{\alpha}(\Psi(\omega, v), v_0, v_1, \ldots, v_{n-1})$ globally asymptotically stabilizes the zero equilibrium of the system (14).
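Condition (i) of Definition 2 can be checked on a given system by building the map $H(x)$ and inspecting its Jacobian. The following minimal numerical sketch does this for an illustrative pendulum-like system of our own choosing (not from the text), taking Lie derivatives and the Jacobian by finite differences:

```python
import math

# Sketch of condition (i) in Definition 2: build the observability map
# H(x) = (h(x), L_f h(x)) for an illustrative second-order system and check
# that its Jacobian is nonsingular at a sample point. The system below is our
# own example, not taken from the text.
def f(x):
    return [x[1], -math.sin(x[0])]          # drift vector field f(x)

def h(x):
    return x[0]                              # output map h(x)

def lie_derivative(phi, x, eps=1e-6):
    """L_f phi at x, with the gradient of phi taken by central differences."""
    total, fx = 0.0, f(x)
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        total += (phi(xp) - phi(xm)) / (2 * eps) * fx[i]
    return total

def H(x):
    return [h(x), lie_derivative(h, x)]      # here H(x) = (x1, x2)

# Jacobian of H at a sample point, again by central differences.
x0, eps = [0.3, -0.7], 1e-5
J = [[0.0, 0.0], [0.0, 0.0]]
for j in range(2):
    xp, xm = list(x0), list(x0)
    xp[j] += eps
    xm[j] -= eps
    Hp, Hm = H(xp), H(xm)
    for i in range(2):
        J[i][j] = (Hp[i] - Hm[i]) / (2 * eps)
det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
print(J, det)   # J is close to the identity here, so det is close to 1
```

For this example $H$ is the identity map, a global diffeomorphism, so condition (i) holds; for a general system one would inspect where the Jacobian loses rank.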
Feedback Stabilization of Nonlinear Systems 445
To avoid the computation of the derivatives of $y$, we exploit the uniform observability property, which implies that the auxiliary system

$$\dot{\xi}_0 = \xi_1, \quad \dot{\xi}_1 = \xi_2, \quad \ldots, \quad \dot{\xi}_{n-1} = \varphi_n(\Psi(\xi, v), v_0, v_1, \ldots, v_{n-1}), \qquad (15)$$

with $\xi = [\, \xi_0 \;\; \xi_1 \;\; \cdots \;\; \xi_{n-1} \,]'$, has the property of reproducing $y(t)$ and its derivatives up to $y^{(n-1)}(t)$ if properly initialized. This initialization is not feasible, since it requires the knowledge of the derivatives of the output at $t = 0$. Nevertheless, the auxiliary system (15) can be modified to provide an estimate of $y$ and its derivatives. The modification is obtained by adding a linear correction term, yielding the system

$$\dot{\xi}_0 = \xi_1 + L\, c_{n-1} (y - \xi_0),$$
$$\dot{\xi}_1 = \xi_2 + L^2 c_{n-2} (y - \xi_0),$$
$$\vdots \qquad\qquad\qquad\qquad\qquad\qquad (16)$$
$$\dot{\xi}_{n-1} = \varphi_n(\Psi(\xi, v), v_0, v_1, \ldots, v_{n-1}) + L^n c_0 (y - \xi_0),$$

with $L > 0$ and the coefficients $c_0, \ldots, c_{n-1}$ such that all roots of the polynomial $\lambda^n + c_{n-1}\lambda^{n-1} + \cdots + c_1 \lambda + c_0$ are in $\mathbb{C}^-$. The system (16) has the ability to provide asymptotic estimates of $y$ and its derivatives up to $y^{(n-1)}$, provided these are bounded and the gain $L$ is selected sufficiently large; i.e., the system (16) is a semi-global observer of $y$ and its derivatives up to $y^{(n-1)}$.

The closed-loop system obtained using the feedback law $\tilde{u} = \tilde{\alpha}(\Psi(\xi, v), v_0, v_1, \ldots, v_{n-1})$ has a locally asymptotically stable equilibrium at the origin. To achieve semi-global stability, one has to select $L$ sufficiently large and replace $\Psi$ with

$$\tilde{\Psi}(\xi, v) = \begin{cases} \Psi(\xi, v), & \text{if } \|\Psi(\xi, v)\| < M, \\[4pt] M\, \dfrac{\Psi(\xi, v)}{\|\Psi(\xi, v)\|}, & \text{if } \|\Psi(\xi, v)\| \geq M, \end{cases}$$

with $M > 0$ to be selected, as detailed in the following statement.

Theorem 3 Consider the system (10) with $m = p = 1$. Let $x_0 = 0$ be an achievable equilibrium. Assume $h(0) = 0$. Suppose the system is uniformly observable and there exists a smooth state feedback control law $u = \alpha(x)$ which globally asymptotically stabilizes the zero equilibrium and is such that $\alpha(0) = 0$.

Then for each $R > 0$, there exist $\tilde{R} > 0$ and $M^\star > 0$, and for each $M > M^\star$, there exists $L^\star > 0$, such that for each $M > M^\star$ and $L > L^\star$, the dynamic output feedback control law

$$\dot{v}_0 = v_1, \quad \dot{v}_1 = v_2, \quad \ldots, \quad \dot{v}_{n-1} = \tilde{\alpha}(\tilde{\Psi}(\xi, v), v_0, v_1, \ldots, v_{n-1}),$$
$$\dot{\xi}_0 = \xi_1 + L\, c_{n-1} (y - \xi_0),$$
$$\dot{\xi}_1 = \xi_2 + L^2 c_{n-2} (y - \xi_0),$$
$$\vdots$$
$$\dot{\xi}_{n-1} = \varphi_n(\tilde{\Psi}(\xi, v), v_0, v_1, \ldots, v_{n-1}) + L^n c_0 (y - \xi_0),$$
$$u = v_0,$$

yields a closed-loop system with the following properties:
• The zero equilibrium of the system is locally asymptotically stable.
• For any $x(0)$, $v(0)$ and $\xi(0)$ such that $\|x(0)\| < R$ and $\|(v(0), \xi(0))\| < \tilde{R}$, the trajectories of the closed-loop system are such that

$$\lim_{t \to \infty} x(t) = 0, \qquad \lim_{t \to \infty} v(t) = 0, \qquad \lim_{t \to \infty} \xi(t) = 0.$$
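The estimation mechanism in (16) can be illustrated numerically for $n = 2$. In the sketch below (our own example, not from the text), the nonlinear term $\varphi_n$ is replaced by zero, so the neglected $\ddot{y}$ acts as the bounded disturbance that the high gain must dominate; the gain $L$ and coefficients $c_i$ are illustrative choices:

```python
import math

# Minimal illustration of the high-gain correction structure in Eq. (16) for
# n = 2: estimate y and its derivative from samples of y(t) = sin(t).
# The term phi_n is replaced by 0 here, so y'' is an unknown but bounded
# disturbance; L and the c_i are illustrative choices (s^2 + c1 s + c0 Hurwitz).
L, c1, c0 = 20.0, 2.0, 1.0
dt, T = 1e-4, 10.0
xi0, xi1 = 0.0, 0.0                  # observer state: estimates of y and y'
t = 0.0
while t < T:
    y = math.sin(t)                  # measured output
    d0 = xi1 + L * c1 * (y - xi0)
    d1 = L**2 * c0 * (y - xi0)       # phi_n-term omitted (bounded disturbance)
    xi0 += dt * d0                   # explicit Euler integration
    xi1 += dt * d1
    t += dt
# After the fast transient, (xi0, xi1) track (sin t, cos t) closely.
print(abs(xi0 - math.sin(T)), abs(xi1 - math.cos(T)))
```

Increasing $L$ shrinks the residual estimation error caused by the neglected term, at the price of faster internal dynamics, which is the semi-global trade-off discussed above.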
The foregoing result can be informally formulated as follows: global state feedback stabilizability and uniform observability imply semi-global stabilizability by output feedback. This can be regarded as a nonlinear enhancement of the so-called separation principle for the stabilization, by output feedback, of linear systems. Note, finally, that a global version of the separation principle can be derived under one additional assumption: the existence of an estimator of the norm of the state $x$.

Summary and Future Directions

A necessary condition and a sufficient condition for stabilizability of an equilibrium of a nonlinear system have been given, together with two systematic design methods. The necessary condition allows one to rule out, using a simple algebraic test, the existence of continuous stabilizers, whereas the sufficient condition provides a link with classical Lyapunov theory. In addition, the problem of semi-global stability by dynamic output feedback has been discussed in detail. Several issues have not been discussed, including the use of discontinuous, hybrid, and time-varying feedbacks; stabilization by static output feedback and by dynamic state feedback; and robust stabilization. Note, finally, that similar considerations can be carried out for nonlinear discrete-time systems.

Cross-References

• Controllability and Observability
• Fundamental Limitation of Feedback Control
• Input-to-State Stability
• Linear State Feedback
• Lyapunov's Stability Theory
• Lyapunov Methods in Power System Stability
• Observers for Nonlinear Systems
• Observers in Linear Systems Theory
• Power System Voltage Stability
• Small Signal Stability in Electric Power Systems
• Stability and Performance of Complex Systems Affected by Parametric Uncertainty
• Stability: Lyapunov, Linear Systems

Recommended Reading

Classical references on stabilization for nonlinear systems and on recent research directions are given below.

Bibliography

Artstein Z (1983) Stabilization with relaxed controls. Nonlinear Anal Theory Methods Appl 7:1163–1173
Astolfi A, Praly L (2003) Global complete observability and output-to-state stability imply the existence of a globally convergent observer. In: Proceedings of the 42nd IEEE conference on decision and control, Maui, pp 1562–1567
Astolfi A, Praly L (2006) Global complete observability and output-to-state stability imply the existence of a globally convergent observer. Math Control Signals Syst 18:32–65
Astolfi A, Karagiannis D, Ortega R (2008) Nonlinear and adaptive control with applications. Springer, London
Bacciotti A (1992) Local stabilizability of nonlinear control systems. World Scientific, Singapore/River Edge
Brockett RW (1983) Asymptotic stability and feedback stabilization. In: Brockett RW, Millman RS, Sussmann HJ (eds) Differential geometric control theory. Birkhäuser, Boston, pp 181–191
Gauthier J-P, Kupka I (2001) Deterministic observation theory and applications. Cambridge University Press, Cambridge/New York
Isidori A (1995) Nonlinear control systems, 3rd edn. Springer, Berlin
Isidori A (1999) Nonlinear control systems II. Springer, London
Jurdjevic V, Quinn JP (1978) Controllability and stability. J Differ Equ 28:381–389
Khalil HK (2002) Nonlinear systems, 3rd edn. Prentice-Hall, Upper Saddle River
Krstić M, Kanellakopoulos I, Kokotović P (1995) Nonlinear and adaptive control design. Wiley, New York
Marino R, Tomei P (1995) Nonlinear control design: geometric, adaptive and robust. Prentice-Hall, London
Mazenc F, Praly L (1996) Adding integrations, saturated controls, and stabilization for feedforward systems. IEEE Trans Autom Control 41:1559–1578
Mazenc F, Praly L, Dayawansa WP (1994) Global stabilization by output feedback: examples and counterexamples. Syst Control Lett 23:119–125
Ortega R, Loría A, Nicklasson PJ, Sira-Ramírez H (1998) Passivity-based control of Euler-Lagrange systems. Springer, London
Ryan EP (1994) On Brockett's condition for smooth stabilizability and its necessity in a context of nonsmooth feedback. SIAM J Control Optim 32:1597–1604
Sepulchre R, Janković M, Kokotović P (1996) Constructive nonlinear control. Springer, Berlin
Sontag ED (1989) A "universal" construction of Artstein's theorem on nonlinear stabilization. Syst Control Lett 13:117–123
Teel A, Praly L (1995) Tools for semiglobal stabilization by partial state and output feedback. SIAM J Control Optim 33(5):1443–1488
van der Schaft A (2000) L2-gain and passivity techniques in nonlinear control, 2nd edn. Springer, London

Financial Markets Modeling

Monique Jeanblanc
Laboratoire Analyse et Probabilités, IBGBI, Université d'Evry Val d'Essonne, Evry Cedex, France

Abstract

Mathematical finance has been an important part of applied mathematics since the 1980s. At the beginning, the main goal was to price derivative products and to provide hedging strategies. Nowadays, the goal is to provide models for prices and interest rates such that a better calibration of parameters can be done. In these pages, we present some basic models. Details can be found in Musiela and Rutkowski (2005).

Keywords

Affine process; Brownian process; Default times; Lévy process; Wishart distribution

Models for Prices of Stocks

The first model of prices was elaborated by Louis Bachelier in his thesis (1900). The idea was that the dynamics of prices have a trend, perturbed by a noise. For this noise, Bachelier set out the fundamental properties of Brownian motion. Bachelier's prices were of the form

$$S_t^B = S_0^B + \mu t + \sigma W_t$$

where $W$ is a Brownian motion. This model, where prices can take negative values, was changed by taking the exponential, as in the celebrated Black-Scholes-Merton (BSM) model, where

$$S_t = e^{S_t^B} = S_0 \exp(\mu t + \sigma W_t).$$

Only two constant parameters are needed; the coefficient $\sigma$ is called the volatility. From Itô's formula, the dynamics of the BSM price are

$$dS_t = S_t (b\, dt + \sigma\, dW_t)$$

where $b = \mu + \tfrac{1}{2}\sigma^2$. The interest rate is assumed to be a constant $r$.

The price of a derivative product with payoff $H \in \mathcal{F}_T = \sigma(S_s,\ s \leq T)$ is obtained using a hedging procedure. One proves that there exists a (self-financing) portfolio with value $V$, investing in the savings account and in the stock $S$, with terminal value $H$, i.e., $V_T = H$. The price of $H$ at time $t$ is $V_t$. Using that methodology, the price of a European call with strike $K$ (i.e., for $H = (S_T - K)^+$) is

$$V_t = BS(\sigma, K)_t := S_t\, \mathcal{N}(d_1) - K e^{-r(T-t)}\, \mathcal{N}(d_2)$$

where $\mathcal{N}$ is the cumulative distribution function of a standard Gaussian law and

$$d_1 = \frac{1}{\sigma\sqrt{T-t}} \ln\!\left(\frac{S_t}{K e^{-r(T-t)}}\right) + \frac{1}{2}\sigma\sqrt{T-t}, \qquad d_2 = d_1 - \sigma\sqrt{T-t}.$$

Note that the drift coefficient $b$ plays no role in this pricing methodology. This formula opened the door to the notion of risk-neutral probability measure: the price of the option (or of any derivative product) is the expectation under the unique probability measure $\mathbb{Q}$, equivalent to $\mathbb{P}$, such that the discounted price $S_t e^{-rt}$, $t \geq 0$, is a $\mathbb{Q}$-martingale.

During one decade, the financial market was quite smooth and this simple model was efficient to price derivative products and to calibrate the coefficients from the data. After the Black Monday of October 1987, the model was recognized to suffer some weaknesses and the door was fully
open to more sophisticated models, with more parameters. In particular, the smile effect was seen in the data: were the BSM formula true, the price of the option would be a deterministic function of the fixed parameter $\sigma$ and of $K$ (the other parameters, such as maturity, underlying price, and interest rate, being fixed), and one would obtain, using $\Sigma(\cdot, K)$, the inverse of $BS(\cdot, K)$, the constant $\sigma = \Sigma(C^o(K), K)$, where $C^o(K)$ is the observed price associated with the strike $K$. This is not the case, the curve $K \to \Sigma(C^o(K), K)$ having a smile shape (or a smirk shape). The BSM model is still used as a benchmark through the concept of implied volatility: for a given observed option price $C^o(K, T)$ (with strike $K$ and maturity $T$), one can find the value $\sigma^\star$ such that $BS(\sigma^\star, K, T) = C^o(K, T)$. The surface $\sigma^\star(K, T)$ is called the implied volatility surface and plays an important role in calibration issues.

Due to the need for more accurate formulas, many models were presented, the only (mathematical) restriction being that prices have to be semi-martingales (to avoid arbitrage opportunities).

A first class is the local volatility models, in which the diffusion coefficient (called the local volatility) is assumed to be a deterministic function of time and of the underlying, i.e.,

$$dS_t = S_t (b\, dt + \sigma(t, S_t)\, dW_t).$$

Dupire proved, using the Kolmogorov backward equation, that the function $\sigma$ is determined by the observed prices of call options via

$$\frac{1}{2} K^2 \sigma^2(T, K) = \frac{\partial_T C^o(K, T) + r K\, \partial_K C^o(K, T)}{\partial^2_{KK} C^o(K, T)}$$

where $\partial_T$ (resp. $\partial_K$) is the partial derivative operator with respect to the maturity (resp., the strike). However, this important model (which allows hedging of derivative products) does not allow a full calibration of the volatility surface.

A second class consists of assuming that the volatility is a stochastic process. The first example is the Heston model, which assumes that

$$dS_t = S_t (b\, dt + \sqrt{v_t}\, dW_t), \qquad dv_t = -\kappa (v_t - \bar{v})\, dt + \eta \sqrt{v_t}\, dB_t,$$

where $B$ and $W$ are two Brownian motions with correlation factor $\rho$. This model is generalized by Gourieroux and Sufana (2003) to the Wishart model, where the risky asset $S$ is a $d$-dimensional process whose matrix of quadratic variation $\Sigma$ satisfies

$$dS_t = \operatorname{Diag}(S_t)\,(b\, dt + \sqrt{\Sigma_t}\, dW_t),$$
$$d\Sigma_t = (\Lambda\Lambda^T + M\Sigma_t + \Sigma_t M^T)\, dt + \sqrt{\Sigma_t}\, dB_t\, Q + Q^T (dB_t)^T \sqrt{\Sigma_t},$$

where $W$ is a $d$-dimensional Brownian motion, $B$ is a $(d \times d)$ matrix Brownian motion, $\Lambda, M, Q$ are $(d \times d)$ matrices, $M$ is negative semidefinite, and $\Lambda\Lambda^T = \beta Q Q^T$ with $\beta \geq d - 1$ to ensure strict positivity. The efficiency of these models was checked using calibration methodology; however, the main idea of hedging for pricing purposes is forgotten: there is, in general, no hedging strategy, even for common derivative products, and the validation of the model (i.e., of the form of the volatility, since the drift term plays no role in pricing) is done by calibration. The risk-neutral probability is no longer unique.
Interest Rate Models

In the beginning of the 1970s, specific attention was paid to interest rate modeling, a constant interest rate being far from the real world.

A first class consists of a dynamics for the instantaneous interest rate $r$. Vasicek suggested an Ornstein-Uhlenbeck diffusion, i.e., $dr_t = a(b - r_t)\, dt + \sigma\, dW_t$, the solution being a Gaussian process of the form

$$r_t = (r_0 - b) e^{-at} + b + \sigma \int_0^t e^{-a(t-u)}\, dW_u.$$

This model is fully tractable; one of its weaknesses is that the interest rate can take negative values.

Cox, Ingersoll, and Ross (CIR) studied the square root process

$$dr_t = a(b - r_t)\, dt + \sigma \sqrt{r_t}\, dB_t,$$

where $ab > 0$ (so that $r$ is nonnegative). No closed form for $r$ is known; however, a closed form for the price of the associated zero coupons is known (see below under Affine Processes).

A second class is the celebrated Heath-Jarrow-Morton (HJM) model: the starting point is to model the price of a zero coupon with maturity $T$ (i.e., an asset which pays one monetary unit at time $T$, these prices being observable) in terms of the instantaneous forward rate $f(t, T)$, as

$$B(t, T) = \exp\!\left(-\int_t^T f(t, u)\, du\right).$$

Assuming that

$$df(t, T) = \alpha(t, T)\, dt + \sigma(t, T)\, dW_t,$$

one finds

$$dB(t, T) = B(t, T)\,(a(t, T)\, dt + b(t, T)\, dW_t)$$

where the relationship between $a, b$ and $\alpha, \sigma$ is known. The instantaneous interest rate is $r_t = f(t, t)$. This model is quite efficient; however, no conditions are known which ensure that the interest rate is positive.

Models with Jumps

Lévy Processes
Following the idea of producing tractable models that fit the data, many models are now based on Lévy processes. These models are used for modeling prices as $S_t = e^{X_t}$, where $X$ is a Lévy process. They present a nice feature: even if closed forms for pricing are not known, numerical methods are efficient. One of the most popular is the Carr-Geman-Madan-Yor (CGMY) model, which is a Lévy process without Gaussian component and with Lévy density

$$\nu(x) = \frac{C e^{-Mx}}{x^{Y+1}}\, \mathbf{1}_{\{x > 0\}} + \frac{C e^{-G|x|}}{|x|^{Y+1}}\, \mathbf{1}_{\{x < 0\}},$$

with $C > 0$, $M \geq 0$, $G \geq 0$, and $Y < 2$.

These models are often presented as a special case of a change of time: roughly speaking, any semi-martingale is a time-changed Brownian motion, and many examples are constructed as $S_t = B_{A_t}$, where $B$ is a Brownian motion and $A$ an increasing process (the change of time), chosen independent of $B$ (for simplicity) and a Lévy process (for computational issues).

Affine Processes
Affine models were introduced by Duffie et al. (2003) and are now a standard tool for modeling stock prices or interest rates. An affine process enjoys the property that, for any affine function $g$,

$$E\Big(\exp\big(u X_T + \int_t^T g(X_s)\, ds\big) \,\Big|\, \mathcal{F}_t\Big) = \exp(\alpha(t, T) X_t + \beta(t, T))$$

where $\alpha$ and $\beta$ are deterministic solutions of PDEs.

A class of affine processes $X$ ($\mathbb{R}^n$-valued) is the one where

$$dX_t = b(X_t)\, dt + \sigma(X_t)\, dW_t + dZ_t$$

where the drift vector $b$ is an affine function of $x$, the covariance matrix $\sigma(x)\sigma^T(x)$ is an affine function of $x$, $W$ is an $n$-dimensional Brownian motion, and $Z$ is a pure jump process whose jumps have a given law on $\mathbb{R}^n$ and arrive with intensity $f(X_t)$, where $f$ is an affine function. An example is the CIR model (without jumps), where one can find the price of a zero coupon as

$$E\Big(\exp\big(-\int_t^T r_s\, ds\big) \,\Big|\, \mathcal{F}_t\Big) = \Phi(T - t)\, \exp[-r_t\, \Psi(T - t)]$$

where

$$\Psi(s) = \frac{2(e^{\gamma s} - 1)}{(\gamma + a)(e^{\gamma s} - 1) + 2\gamma}, \qquad \Phi(s) = \left(\frac{2\gamma\, e^{(\gamma + a)s/2}}{(\gamma + a)(e^{\gamma s} - 1) + 2\gamma}\right)^{2ab/\sigma^2},$$

$$\gamma^2 = a^2 + 2\sigma^2.$$

Models for Defaults

At the end of the 1990s, new kinds of financial products appeared on the market: defaultable options, defaultable zero coupons, and credit derivatives, such as CDOs. Then, particular attention was paid to the modeling of default times; see Bielecki and Rutkowski (2001). The most popular model for a single default is the reduced-form approach, which is based on the knowledge of the intensity rate process. Given a nonnegative process $\lambda$, adapted with respect to a reference filtration $\mathbb{F}$, a random time $\tau$ is defined so that

$$P(\tau > t \,|\, \mathcal{F}_\infty) = \exp\Big(-\int_0^t \lambda_s\, ds\Big).$$

This implies that $\mathbf{1}_{\{\tau \leq t\}} - \int_0^{t \wedge \tau} \lambda_u\, du$ is a martingale (in the smallest filtration $\mathbb{G}$ which contains $\mathbb{F}$ and makes $\tau$ a stopping time). This intensity rate $\lambda$ plays the role of the interest spread, due to the equality, for any $Y \in \mathcal{F}_T$ and $\mathbb{F}$-adapted interest rate $r$,

$$E\Big(Y \mathbf{1}_{\{T < \tau\}} \exp\big(-\int_t^T r_s\, ds\big) \,\Big|\, \mathcal{G}_t\Big)\, \mathbf{1}_{\{t < \tau\}} = \mathbf{1}_{\{t < \tau\}}\, E\Big(Y \exp\big(-\int_t^T (r_s + \lambda_s)\, ds\big) \,\Big|\, \mathcal{F}_t\Big).$$

The real challenge is to model multiple defaults. Defaults are assumed to occur at times $\tau_i$, and one has to describe the (conditional) joint law of the vector $\tau = (\tau_1, \tau_2, \ldots, \tau_n)$, the number $n$ of defaults being quite large (e.g., 100). A first step is to study the law of the defaults, i.e.,

$$P(\tau_1 > t_1, \ldots, \tau_n > t_n),$$

which is performed using the classical copula approach, based on the knowledge of the marginal laws of the $\tau_i$, i.e., on the knowledge of $P(\tau_i \leq s) =: F_i(s)$, where the cumulative distribution functions $F_i$ are assumed invertible. The simplest and most popular copula is the Gaussian one,

$$P(F_i(\tau_i) \leq u_i,\ i = 1, \ldots, n) = \mathcal{N}_\Sigma^n\big(\mathcal{N}^{-1}(u_1), \ldots, \mathcal{N}^{-1}(u_n)\big),$$

where $\mathcal{N}_\Sigma^n$ is the c.d.f. of the $n$-variate central normal distribution with linear correlation matrix $\Sigma$ and $\mathcal{N}^{-1}$ is the inverse of the c.d.f. of the univariate standard normal distribution. However, a dynamical model was needed, mainly to study the contagion effect, i.e., how the occurrence of a default affects the probability of occurrence of the next defaults.

A first class of models is based on the intensity: starting from a given family of intensities $(\lambda_t^i,\ i = 1, \ldots, n)$ which satisfy

$$d\lambda_t^i = f(t, \lambda_t^i)\, dt + g(t, \lambda_t^i)\, dW_t + dL_t^{(i)}$$

where $L_t^{(i)} = \sum_{k \neq i} \beta_k^i\, \mathbf{1}_{\{\tau_k \leq t\}}$, one constructs random times having the given intensities.

Another class of models, which allows for common jumps, is based on Markov chains with an absorbing state, the default time of the $i$-th entity being the time at which the $i$-th component of the Markov chain enters the absorbing state. These models are efficient due to the introduction of common factors.

Many studies aim to find a form of dynamic copula, i.e., a family of processes $G_t(\theta)$, $t \geq 0$, such that

$$G_t(\theta) = P(\tau_1 > \theta_1, \ldots, \tau_n > \theta_n \,|\, \mathcal{F}_t).$$

Some examples where, for any fixed $t$, the quantity $G_t(\cdot)$ is a (stochastic) Gaussian law are known; however, there are few concrete examples for other cases.

Cross-References

• Credit Risk Modeling
• Option Games: The Interface Between Optimal Stopping and Game Theory
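As a numerical companion to the section "Models for Prices of Stocks": the call price $BS(\sigma, K)$ can be evaluated directly, and, since the price is increasing in $\sigma$, the implied volatility $\sigma^\star$ can be recovered by bisection. A minimal sketch with illustrative parameter values (our own code, not from the text):

```python
import math

# Evaluate the BSM call price V = S N(d1) - K e^{-r tau} N(d2), tau = T - t,
# and invert it in sigma by bisection (the price is increasing in sigma).
# All numerical inputs are illustrative choices, not taken from the text.
def bs_call(S, K, r, sigma, tau):
    N = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal c.d.f.
    d1 = (math.log(S / (K * math.exp(-r * tau))) / (sigma * math.sqrt(tau))
          + 0.5 * sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * N(d1) - K * math.exp(-r * tau) * N(d2)

def implied_vol(C_obs, S, K, r, tau, lo=1e-6, hi=5.0):
    """Find sigma* with BS(sigma*, K, T) = C_obs by bisection."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, r, mid, tau) < C_obs:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

price = bs_call(S=100.0, K=100.0, r=0.05, sigma=0.2, tau=1.0)
sigma_star = implied_vol(price, 100.0, 100.0, 0.05, 1.0)
print(price, sigma_star)   # sigma_star recovers the input volatility 0.2
```

Repeating the inversion over a grid of observed prices $C^o(K, T)$ is exactly how the implied volatility surface $\sigma^\star(K, T)$ discussed above is produced in practice.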
Bibliography

Bachelier L (1900) Théorie de la spéculation. Thèse, Annales Scientifiques de l'École Normale Supérieure (3) 17:21–86
Bielecki TR, Rutkowski M (2001) Credit risk: modelling, valuation and hedging. Springer, Berlin
Duffie D, Filipović D, Schachermayer W (2003) Affine processes and applications in finance. Ann Appl Probab 13:984–1053
Gourieroux C, Sufana R (2003) Wishart quadratic term structure. CREF 03-10, HEC Montreal
Musiela M, Rutkowski M (2005) Martingale methods in financial modelling. Springer, Berlin

Flexible Robots

Alessandro De Luca
Sapienza Università di Roma, Roma, Italy

Abstract

Mechanical flexibility in robot manipulators is due to compliance at the joints and/or distributed deflection of the links. Dynamic models of the two classes of robots with flexible joints or flexible links are presented, together with control laws addressing the motion tasks of regulation to constant equilibrium states and of asymptotic tracking of output trajectories. Control design for robots with flexible joints takes advantage of the passivity and feedback linearization properties. In robots with flexible links, basic differences arise when controlling the motion at the joint level or at the tip level.

Keywords

Feedback linearization; Gravity compensation; Joint elasticity; Link flexibility; Noncausal and stable inversion; Regulation by motor feedback; Singular perturbation; Vibration damping

Introduction

Robot manipulators are usually considered as rigid multi-body mechanical systems. This ideal assumption simplifies dynamic analysis and control design but may lead to performance degradation and even unstable behavior, due to the excitation of vibrational phenomena.

Flexibility is mainly due to the limited stiffness of transmissions at the joints (Sweet and Good 1985) and to the deflection of slender and lightweight links (Cannon and Schmitz 1984). Joint flexibility is common when motion transmission/reduction elements such as belts, long shafts, cables, harmonic drives, or cycloidal gears are used. Link flexibility is present in large articulated structures, such as the very long arms needed for accessing hostile environments (deep sea or space) or automated crane devices for building construction. In both situations, static displacements and dynamic oscillations are introduced between the driving actuators and the actual position of the robot end effector. Undesired vibrations are typically confined beyond the closed-loop control bandwidth, but flexibility cannot be neglected when large speed/acceleration and high accuracy are requested by the task.

In dynamic modeling, flexibility is assumed to be concentrated at the robot joints or distributed along the robot links (most of the time with some finite-dimensional approximation). In both cases, additional generalized coordinates are introduced beside those used to describe the rigid motion of the arm in a Lagrangian formulation. As a result, the number of available control inputs is strictly less than the number of degrees of freedom of the mechanical system. This type of under-actuation, though counterbalanced by the presence of additional potential energy helping to achieve system controllability, suggests that the design of satisfactory motion control laws is harder than in the rigid case.

From a control point of view, different design approaches are needed because of structural differences arising between flexible-joint and flexible-link robots. These differences hold for single- or multiple-link robots, in the linear or nonlinear domain, and depend on the physical co-location or not of mechanical flexibility versus control actuation, as well as on the choice of controlled outputs.
In order to measure the state of flexible robots for trajectory tracking control or feedback stabilization purposes, a large variety of sensing devices can be used, including encoders, joint torque sensors, strain gauges, accelerometers, and high-speed cameras. In particular, measuring the full state of the system would require twice the number of sensors as in the rigid case for robots with flexible joints, and possibly more for robots with flexible links. The design of controllers that work provably well with a reduced set of measurements is thus particularly attractive.

Robots with Flexible Joints

Dynamic Modeling
A robot with flexible joints is modeled as an open kinematic chain of $n + 1$ rigid bodies, interconnected by $n$ joints undergoing deflection and actuated by $n$ electrical motors. Let $\theta$ be the $n$-vector of motor (i.e., rotor) positions, as reflected through the reduction gears, and $q$ the $n$-vector of link positions. The joint deflection is $\delta = \theta - q \neq 0$. The standard assumptions are:

A1 Joint deflections $\delta$ are small, limited to the domain of linear elasticity. The elastic torques due to joint deformations are $\tau_J = K(\theta - q)$, where $K$ is the positive definite, diagonal joint stiffness matrix.
A2 The rotors of the electrical motors are modeled as uniform bodies having their center of mass on the rotation axis.
A3 The angular velocity of the rotors is due only to their own spinning.

The last assumption, introduced by Spong (1987), is very reasonable for large reduction ratios and is also crucial for simplifying the dynamic model.

From the gravity and elastic potential energy, $U = U_g + U_\delta$, and the kinetic energy $T$ of the robot, applying the Euler-Lagrange equations to the Lagrangian $L = T - U$ and neglecting all dissipative effects leads to the dynamic model

$$M(q)\ddot{q} + n(q, \dot{q}) + K(q - \theta) = 0, \qquad (1)$$
$$B\ddot{\theta} + K(\theta - q) = \tau, \qquad (2)$$

where $M(q)$ is the positive definite, symmetric inertia matrix of the robot links (including the motor masses); $n(q, \dot{q})$ is the sum of the Coriolis and centrifugal terms $c(q, \dot{q})$ (quadratic in $\dot{q}$) and the gravitational terms $g(q) = (\partial U_g / \partial q)^T$; $B$ is the positive definite, diagonal matrix of motor inertias (reflected through the gear ratios); and $\tau$ are the motor torques (performing work on $\theta$). The inertia matrix of the complete system is then $\mathcal{M}(q) = \operatorname{block\ diag}\{M(q), B\}$. The two $n$-dimensional second-order differential equations (1) and (2) are referred to as the link and the motor equations, respectively. When the joint stiffness $K \to \infty$, it is $\theta \to q$ and $\tau_J \to \tau$, so that the two equations collapse in the limit into the standard dynamic model of rigid robots with total inertia $M(q) + B$. On the other hand, when the joint stiffness $K$ is relatively large but still finite, robots with elastic joints show a two-time-scale dynamic behavior. A common large scalar factor $1/\epsilon^2 \gg 1$ can be extracted from the diagonal stiffness matrix as $K = \hat{K}/\epsilon^2$. The slow subsystem is associated to the link dynamics

$$M(q)\ddot{q} + n(q, \dot{q}) = \tau_J, \qquad (3)$$

while the fast subsystem takes the form

$$\epsilon^2 \ddot{\tau}_J = \hat{K}\Big(B^{-1}(\tau - \tau_J) + M^{-1}(q)\,\big(n(q, \dot{q}) - \tau_J\big)\Big). \qquad (4)$$

For small $\epsilon$, Eqs. (3) and (4) represent a singularly perturbed system. The two separate time scales governing the slow and fast dynamics are $t$ and $\sigma = t/\epsilon$.

Regulation
The basic robotic task of moving between two arbitrary equilibrium configurations is realized by a feedback control law that asymptotically stabilizes the desired robot state.

In the absence of gravity ($g \equiv 0$), the equilibrium states are parameterized by the desired reference position $q_d$ of the links and take the
form q D q d ,  D  d D q d (with no joint is particularly convenient when a joint torque


deflection at steady state) and qP D P D 0. As sensor measuring  J is available (torque-
P
a result of passivity of the mapping from  to , controlled robots). Using
global regulation is achieved by a decentralized
PD law using only feedback from the motor  D K P . d  /  K D P C K T .g.q d /   J /
variables,
K S P J C g.q d /; (7)
P
 D K P . d  /  K D ; (5)
the four diagonal gain matrices can be given
with diagonal K P > 0 and K D > 0. a special structure so that asymptotic stability
In the presence of gravity, the (unique) is automatically guaranteed (Albu-Schäffer and
equilibrium position of the motor associated Hirzinger 2001). F
with a desired link position q d becomes  d D
q d C K 1 g.q d /. Global regulation is obtained Trajectory Tracking
by adding an extra gravity-dependent term $\tau_g$ to the PD control law (5),

$$\tau = K_P (\theta_d - \theta) - K_D \dot{\theta} + \tau_g, \qquad (6)$$

with diagonal matrices $K_P > 0$ (at least) and $K_D > 0$. The term $\tau_g$ needs to match the gravity load $g(q_d)$ at steady state. The following choices are of slightly increasing control complexity, with progressively better transient performance.
• Constant gravity compensation: $\tau_g = g(q_d)$. Global regulation is achieved when the smallest positive gain in the diagonal matrix $K_P$ is large enough (Tomei 1991). This sufficient condition can be enforced only if the joint stiffness $K$ dominates the gradient of the gravity terms.
• Online compensation: $\tau_g = g(\tilde{\theta})$, with $\tilde{\theta} = \theta - K^{-1} g(q_d)$. Gravity effects on the links are approximately compensated during robot motion. Global regulation is proven under the same conditions above (De Luca et al. 2005).
• Quasi-static compensation: $\tau_g = g(\tilde{q}(\theta))$. At any measured motor position $\theta$, the link position $\tilde{q}(\theta)$ is computed by solving numerically $g(q) + K(q - \theta) = 0$. This removes the need of a strictly positive lower bound on $K_P$ (Kugi et al. 2008), but the joint stiffness should still dominate the gradient of the gravity terms.

Additional feedback from the full robot state $(q, \dot{q}, \theta, \dot{\theta})$, measured or reconstructed through dynamic observers, can provide faster and damped transient responses.

Let a desired sufficiently smooth trajectory $q_d(t)$ be specified for the robot links over a finite or infinite time interval. The control objective is to asymptotically stabilize the trajectory tracking error $e = q_d(t) - q(t)$ to zero, starting from a generic initial robot state. Assuming that $q_d(t)$ is four times continuously differentiable, a torque input profile $\tau_d(t) = \tau_d(q_d, \dot{q}_d, \ddot{q}_d, q_d^{[3]}, q_d^{[4]})$ can be derived from the dynamic model (1) and (2) so as to reproduce exactly the desired trajectory, when starting from matched initial conditions. A local solution to the trajectory tracking problem is provided by the combination of such a feedforward term $\tau_d(t)$ with a stabilizing linear feedback from the partial or full robot state; see Eqs. (6) or (7).

When the joint stiffness is large enough, one can take advantage of the system being singularly perturbed. A control law $\tau_s$ designed for the rigid robot will deal with the slow dynamics, while a relatively simple action $\tau_f$ is used to stabilize the fast vibratory dynamics around an invariant manifold associated to the rigid robot control (Spong et al. 1987). This class of composite control laws has the general form

$$\tau = \tau_s(q, \dot{q}, t) + \tau_f(q, \dot{q}, \tau_J, \dot{\tau}_J). \qquad (8)$$

When setting $\epsilon = 0$ in Eqs. (3), (4), and (8), the control setup of the equivalent rigid robot is recovered as

$$(M(q) + B)\, \ddot{q} + n(q, \dot{q}) = \tau_s. \qquad (9)$$
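As a minimal numerical check of the regulation law (6), the sketch below simulates a single elastic joint under PD control with constant gravity compensation. All parameter values (link inertia $M$, motor inertia $B$, stiffness $K$, gravity torque $a\sin q$, gains) are hypothetical choices for illustration, not taken from the text; the desired motor position is offset by the steady-state elastic deflection $g(q_d)/K$.

```python
import math

def simulate_pd_gravity(q_d=0.6, T=20.0, dt=1e-3):
    # Single elastic joint with hypothetical parameters: link inertia M,
    # motor inertia B, joint stiffness K, gravity torque g(q) = a*sin(q).
    M, B, K, a = 1.0, 0.5, 100.0, 5.0
    g = lambda q: a * math.sin(q)
    # Desired motor position includes the steady-state elastic deflection:
    # at rest, K*(theta_d - q_d) = g(q_d).
    theta_d = q_d + g(q_d) / K
    KP, KD = 50.0, 20.0      # K_P dominates the gravity gradient a; K_D > 0
    q = dq = th = dth = 0.0
    for _ in range(int(T / dt)):
        # PD action on motor variables + constant gravity compensation, Eq. (6)
        tau = KP * (theta_d - th) - KD * dth + g(q_d)
        ddq = (-g(q) - K * (q - th)) / M    # link dynamics
        ddth = (tau + K * (q - th)) / B     # motor dynamics
        dq += ddq * dt; dth += ddth * dt    # semi-implicit Euler step
        q += dq * dt; th += dth * dt
    return q

print(simulate_pd_gravity())  # link position settles near q_d = 0.6
```

Note that only motor-side variables are fed back; the link position converges because the constant compensation term matches the gravity load exactly at the desired equilibrium.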
454 Flexible Robots
Though more complex, the best performing trajectory tracking controller for the general case is based on feedback linearization. Spong (1987) has shown that the nonlinear state feedback

$$\tau = \alpha(q, \dot{q}, \ddot{q}, \dddot{q}) + \beta(q)\, v, \qquad (10)$$

with

$$\alpha = M(q)\ddot{q} + n(q,\dot{q}) + BK^{-1}\left( (\ddot{M}(q) + K)\,\ddot{q} + 2\dot{M}(q)\,\dddot{q} + \ddot{n}(q,\dot{q}) \right),$$
$$\beta = BK^{-1} M(q),$$

leads globally to the closed-loop linear system

$$q^{[4]} = v, \qquad (11)$$

i.e., to decoupled chains of four input–output integrators from each auxiliary input $v_i$ to each link position output $q_i$, for $i = 1, \ldots, n$. The control design is then completed on the linear SISO side, by forcing the trajectory tracking error to be exponentially stable with an arbitrary decay rate. The control law (10) is expressed as a function of the linearizing coordinates $(q, \dot{q}, \ddot{q}, \dddot{q})$ (up to the link jerk), which can however be rewritten in terms of the original state $(q, \dot{q}, \theta, \dot{\theta})$ using the dynamic model equations. This fundamental result is the direct extension of the so-called "computed torque" method for rigid robots.

Robots with Flexible Links

Dynamic Modeling
For the dynamic modeling of a single flexible link, the distributed nature of structural flexibility can be captured, under suitable assumptions, by partial differential equations (PDE) with associated boundary conditions. A common model is the Euler-Bernoulli beam. The link is assumed to be a slender beam, with uniform geometric characteristics and homogeneous mass distribution, clamped at the base to the rigid hub of an actuator producing a torque $\tau$ and rotating on a horizontal plane. The beam is flexible in the lateral direction only, being stiff with respect to axial forces, torsion, and bending due to gravity. Deformations are small and remain in the elastic domain. The physical parameters of interest are the linear density $\rho$ of the beam, its flexural rigidity $EI$, the beam length $\ell$, and the hub inertia $I_h$ (with $I_t = I_h + \rho\ell^3/3$). The equations of motion combine lumped and distributed parameter parts, with the hub rotation $\theta(t)$ and the link deformation $w(x,t)$, where $x \in [0,\ell]$ is the position along the link. From the Hamilton principle, we obtain

$$I_t \ddot{\theta}(t) + \rho \int_0^\ell x\, \ddot{w}(x,t)\, dx = \tau(t) \qquad (12)$$
$$EI\, w''''(x,t) + \rho\left( \ddot{w}(x,t) + x\,\ddot{\theta}(t) \right) = 0 \qquad (13)$$
$$w(0,t) = w'(0,t) = 0, \qquad w''(\ell,t) = w'''(\ell,t) = 0, \qquad (14)$$

where a prime denotes partial derivative w.r.t. space. Equations (14) are the clamped-free boundary conditions at the two ends of the beam (no payload is present at the tip).

For the analysis of this self-adjoint PDE problem, one proceeds by separation of variables in space and time, defining

$$w(x,t) = \phi(x)\,\delta(t), \qquad \theta(t) = \alpha(t) + k\,\delta(t), \qquad (15)$$

where $\phi(x)$ is the link spatial deformation, $\delta(t)$ is its time behavior, $\alpha(t)$ describes the angular motion of the instantaneous center of mass of the beam, and $k$ is chosen so as to satisfy (12) for $\tau = 0$. Since system (12)–(14) is linear, nonrational transfer functions can be derived in the Laplace transform domain between the input torque and some relevant system output, e.g., the angular position of the hub or of the tip of the beam (Kanoh 1990). The PDE formalism also provides a convenient basis for analyzing distributed sensing, feedback from strain sensors (Luo 1993), or even distributed actuation with piezo-electric devices placed along the link.
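For a single elastic joint with constant link inertia, the $\dot{M}$ and $\ddot{M}$ terms of the linearizing law (10) vanish, and the law reduces to $\tau = M\ddot{q} + g(q) + BK^{-1}(Mv + K\ddot{q} + \ddot{g}(q))$. The sketch below simulates this special case with hypothetical parameters and a feedback $v$ placing the four error poles of (11) at $-5$; it illustrates the structure of (10)-(11), not any specific robot.

```python
import math

# Hypothetical single elastic joint: link inertia M, motor inertia B,
# joint stiffness K, gravity torque g(q) = a*sin(q). With M constant,
# the feedback-linearizing law (10) reduces to
#   tau = M*ddq + g(q) + (B/K)*(M*v + K*ddq + d^2 g(q)/dt^2).
M, B, K, a = 1.0, 0.5, 100.0, 5.0

def qd(t):
    # Desired link trajectory and its first four time derivatives.
    A, w = 0.5, 1.0
    s, c = math.sin(w * t), math.cos(w * t)
    return (A * (1 - c), A * w * s, A * w**2 * c, -A * w**3 * s, -A * w**4 * c)

def track(T=10.0, dt=1e-4):
    q = dq = th = dth = 0.0
    k0, k1, k2, k3 = 625.0, 500.0, 150.0, 20.0   # (s + 5)^4 error dynamics
    for i in range(int(T / dt)):
        g, g1, g2 = a * math.sin(q), a * math.cos(q), -a * math.sin(q)
        ddq = (-g - K * (q - th)) / M            # from the link equation
        dddq = (-g1 * dq - K * (dq - dth)) / M   # its time derivative
        r0, r1, r2, r3, r4 = qd(i * dt)
        v = r4 + k3 * (r3 - dddq) + k2 * (r2 - ddq) + k1 * (r1 - dq) + k0 * (r0 - q)
        gdd = g2 * dq * dq + g1 * ddq            # d^2 g(q)/dt^2
        tau = M * ddq + g + (B / K) * (M * v + K * ddq + gdd)
        ddth = (tau + K * (q - th)) / B          # motor dynamics
        dq += ddq * dt; dth += ddth * dt         # semi-implicit Euler
        q += dq * dt; th += dth * dt
    return abs(q - qd(T)[0])                     # final tracking error

print(track())
```

The linearizing coordinates $(q, \dot{q}, \ddot{q}, \dddot{q})$ used in the loop are computed from the original state via the model equations, as noted in the text.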
The transcendental characteristic equation associated to the spatial part of the solution to Eqs. (12)–(14) is

$$I_h \lambda^3 \left( 1 + \cos(\lambda\ell)\cosh(\lambda\ell) \right) + \rho \left( \sin(\lambda\ell)\cosh(\lambda\ell) - \cos(\lambda\ell)\sinh(\lambda\ell) \right) = 0. \qquad (16)$$

When the hub inertia $I_h \to \infty$, the second term can be neglected and the characteristic equation collapses into the so-called clamped condition. Equation (16) has an infinite but countable number of positive real roots $\lambda_i$, with associated resonant frequencies $\omega_i = \lambda_i^2 \sqrt{EI/\rho}$ and orthonormal eigenvectors $\phi_i(x)$, which are the natural deformation shapes of the beam (Barbieri and Özgüner 1988). A finite-dimensional dynamic model is obtained by truncation to a finite number $m_e$ of eigenvalues/shapes. From

$$w(x,t) = \sum_{i=1}^{m_e} \phi_i(x)\,\delta_i(t) \qquad (17)$$

we get

$$I_t \ddot{\alpha}(t) = \tau(t)$$
$$\ddot{\delta}_i(t) + \omega_i^2 \delta_i(t) = \phi_i'(0)\,\tau(t), \qquad i = 1, \ldots, m_e, \qquad (18)$$

where the rigid body motion (top equation) appears as decoupled from the flexible dynamics, thanks to the choice of variable $\alpha$ rather than $\theta$. Modal damping can be added on the left-hand sides of the lower equations through terms $2\zeta_i \omega_i \dot{\delta}_i$ with $\zeta_i \in [0,1]$. The angular position of the motor hub at the joint is given by

$$\theta(t) = \alpha(t) + \sum_{i=1}^{m_e} \phi_i'(0)\,\delta_i(t), \qquad (19)$$

while the tip angular position is

$$y(t) = \alpha(t) + \sum_{i=1}^{m_e} \frac{\phi_i(\ell)}{\ell}\,\delta_i(t). \qquad (20)$$

The joint-level transfer function $p_{joint}(s) = \theta(s)/\tau(s)$ will always have relative degree two and only minimum phase zeros. On the other hand, the tip-level transfer function $p_{tip}(s) = y(s)/\tau(s)$ will contain non-minimum phase zeros. This basic difference in the pattern of the transmission zeros is crucial for motion control design.

In a simpler modeling technique, a specified class of spatial functions $\phi_i(x)$ is assumed for describing the link deformation. The functions need to satisfy only a reduced set of geometric boundary conditions (e.g., clamped modes at the link base), but otherwise no dynamic equations of motion such as (13). The use of finite-dimensional expansions like (17) limits the validity of the resulting model to a maximum frequency. This truncation must be accompanied by suitable filtering of measurements and of control commands, so as to avoid or limit spillover effects (Balas 1978).

In the dynamic modeling of robots with $n$ flexible links, the resort to assumed modes of link deformation becomes unavoidable. In practice, some form of approximation and a finite-dimensional treatment is necessary. Let $\theta$ be the $n$-vector of joint variables describing the rigid motion, and $\delta$ be the $m$-vector collecting the deformation variables of all flexible links. Following a Lagrangian formulation, the dynamic model with clamped modes takes the general form (Book 1984)

$$\begin{pmatrix} M_{\theta\theta}(\theta,\delta) & M_{\theta\delta}(\theta,\delta) \\ M_{\theta\delta}^T(\theta,\delta) & M_{\delta\delta}(\theta,\delta) \end{pmatrix} \begin{pmatrix} \ddot{\theta} \\ \ddot{\delta} \end{pmatrix} + \begin{pmatrix} n_\theta(\theta,\delta,\dot{\theta},\dot{\delta}) \\ n_\delta(\theta,\delta,\dot{\theta},\dot{\delta}) \end{pmatrix} + \begin{pmatrix} 0 \\ K\delta + D\dot{\delta} \end{pmatrix} = \begin{pmatrix} \tau \\ 0 \end{pmatrix}, \qquad (21)$$

where the positive definite, symmetric inertia matrix $M$ of the complete robot and the Coriolis, centrifugal, and gravitational terms $n$ have been partitioned in blocks of suitable dimensions, $K > 0$ and $D \geq 0$ are the robot link stiffness and damping matrices, and $\tau$ is the $n$-vector of actuating torques.
The dynamic model (21) shows the general couplings existing between nonlinear rigid body motion and linear flexible dynamics. In this respect, the linear model (18) of a single flexible link is a remarkable exception.

The choice of specific assumed modes may simplify the blocks of the robot inertia matrix; e.g., orthonormal modes used for each link induce a decoupled structure of the diagonal inertia subblocks of $M_{\delta\delta}$. Quite often the total kinetic energy of the flexible robot is evaluated only in the undeformed configuration $\delta = 0$. With this approximation, the inertia matrix becomes independent of $\delta$, and so do the velocity terms in the model. Furthermore, due to the hypothesis of small deformation of each link, the gravity term in the lower component $n_\delta$ is only a function of $\theta$.

The validation of (21) goes through the experimental identification of the relevant dynamic parameters. Besides those inherited from the rigid case (mass, inertia, etc.), also the set of structural resonant frequencies and associated deformation profiles should be identified.

Control of Joint-Level Motion
When the target variables to be controlled are defined at the joint level, the control problem for robots with flexible links is similar to that of robots with flexible joints. As a matter of fact, the models (1), (2), and (21) are both passive systems with respect to the output $\theta$; see (19) in the scalar case. For instance, regulation is achieved by a PD action with constant gravity compensation, using a control law of the form (6) without the need of feeding back link deformation variables (De Luca and Siciliano 1993a). Similarly, stable tracking of a joint trajectory $\theta_d(t)$ is obtained by a singular perturbation control approach, with the flexible modes dynamics acting at multiple time scales with respect to rigid body motion (Siciliano and Book 1988), or by an inversion-based control (De Luca and Siciliano 1993b), where input–output (rather than full state) exact linearization is realized and the effects of link flexibility are canceled on the motion of the robot joints. While vibrational behavior will still affect the robot at the level of end-effector motion, the closed-loop dynamics of the $\delta$ variables is stable and link deformations converge to a steady-state constant value (zero in the absence of gravity) thanks to the intrinsic damping of the mechanical structure. Improved transients are indeed obtained by active modal damping control (Cannon and Schmitz 1984).

A control approach specifically developed for the rest-to-rest motion of flexible mechanical systems is command shaping (Singer and Seering 1990). The original command designed to achieve a desired motion for a rigid robot is convolved with suitable signals delayed in time, so as to cancel (or reduce to a minimum) the effects of the excited vibration modes at the time of motion completion. For a single slewing link with linear dynamics, as in (18), the rest-to-rest input command is computed in closed form by using impulsive signals and can be made robust via an over-parameterization.

Control of Tip-Level Motion
The design of a control law that allows asymptotic tracking of a desired trajectory for the end effector of a robot with flexible links needs to face the unstable zero dynamics associated to the problem. In the linear case of a single flexible link, this is equivalent to the presence of non-minimum phase zeros in the transfer function to the tip output (20). Direct inversion of the input–output map leads to instability, due to cancellation of non-minimum phase zeros by unstable poles, with link deformation growing unbounded and control saturation.

The solution requires instead to determine the unique reference state trajectory of the flexible structure that is associated to the desired tip trajectory and has bounded deformation. Based on regulation theory, the control law will be the superposition of a nominal feedforward action, which keeps the system along the reference state trajectory (and thus the output on the desired trajectory), and of a stabilizing feedback that reduces the error with respect to this state trajectory to zero without resorting to dangerous cancellations.

In general, computing such a control law requires the solution of a set of nonlinear partial
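The command-shaping idea above can be sketched for one damped vibration mode: the classical two-impulse zero-vibration (ZV) shaper of Singer and Seering places its impulses half a damped period apart, with amplitudes chosen so that the two excitations cancel, and the original command is convolved with it. The mode parameters and the ramp command below are illustrative assumptions.

```python
import math

def zv_shaper(w, zeta, dt):
    # Two-impulse Zero-Vibration shaper for natural frequency w and
    # damping ratio zeta (Singer-Seering construction).
    wd = w * math.sqrt(1 - zeta**2)                # damped frequency
    Kz = math.exp(-zeta * math.pi / math.sqrt(1 - zeta**2))
    imp = [0.0] * (int(round((math.pi / wd) / dt)) + 1)
    imp[0], imp[-1] = 1 / (1 + Kz), Kz / (1 + Kz)  # amplitudes sum to 1
    return imp

def residual_vibration(shaped, w=10.0, zeta=0.02, dt=1e-3, T=4.0):
    n = int(T / dt)
    cmd = [min(i * dt, 1.0) for i in range(n)]     # 1-s rest-to-rest ramp
    if shaped:
        imp = zv_shaper(w, zeta, dt)               # convolve command w/ shaper
        cmd = [sum(imp[k] * cmd[i - k] for k in range(len(imp)) if i - k >= 0)
               for i in range(n)]
    d = dd = peak = 0.0
    for i in range(n):       # mode: d'' + 2*zeta*w*d' + w^2 * d = w^2 * cmd
        acc = w * w * (cmd[i] - d) - 2 * zeta * w * dd
        dd += acc * dt
        d += dd * dt
        if i * dt > 2.5:     # motion completed: measure residual swing
            peak = max(peak, abs(d - cmd[i]))
    return peak

print(residual_vibration(False), residual_vibration(True))
```

The shaped command reaches the same final value but leaves far less residual vibration, at the price of a delay of half a damped period.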
differential equations. However, in the case of a single flexible link with linear dynamics, the feedforward profile is simply derived by an inversion defined in the frequency domain (Bayo 1987). The desired tip acceleration $\ddot{y}_d(t)$, $t \in [0,T]$, is considered as part of a rest-to-rest periodic signal, with zero mean value and zero integral. The procedure, implemented efficiently using the Fast Fourier Transform on discrete-time samples, will automatically generate bounded time signals only. The resulting unique torque profile $\tau_d(t)$ will be a noncausal command, anticipating the actual start of the output trajectory at $t = 0$ (so as to precharge the link to the correct initial deformation) and ending after $t = T$ (to discharge the residual link deformation and recover the final rest configuration).

The same result was recovered by Kwon and Book (1994) in the time domain, by forward integrating in time the stable part of the inverse system dynamics and backward integrating the unstable part. An extension to the multi-link nonlinear case uses an iterative approach on repeated linear approximations of the system along the nominal trajectory (Bayo et al. 1989).

Summary and Future Directions

The presence of mechanical flexibility in the joints and the links of multi-dof robots poses challenging control problems. Control designs take advantage of, or are limited by, some system-level properties. Robots with flexible joints are passive systems at the level of motor outputs, have no zero dynamics associated to the link position outputs, and are always feedback linearizable systems. Robots with flexible links are still passive for joint-level outputs, but cannot be feedback linearized in general, and have unstable zero dynamics (non-minimum phase zeros in the linear case) when considering the end-effector position as controlled output.

State-of-the-art control laws address regulation and trajectory tracking tasks in a satisfactory way, at least in nominal conditions and under full-state feedback. Current research directions are aimed at achieving robustness to model uncertainties and external disturbances (with adaptive, learning, or iterative schemes), and at further exploring the design of control laws under limited measurements and noisy sensing. Beyond free motion tasks, an accurate treatment of interaction tasks with the environment, requiring force or impedance controllers, is still missing for flexible robots. In this respect, passivity-based control approaches that do not necessarily operate dynamic cancellations may take advantage of the existing compliance, trading off between improved energy efficiency and some reduction in nominal performance.

Often seen as a limiting factor for performance, the presence of joint elasticity is now becoming an explicit advantage for safe physical human-robot interaction and for locomotion. Next-generation lightweight robots and humanoids will use flexible joints and also compact actuation with online controlled variable joint stiffness, an area of active research.

Cross-References
▶ Feedback Linearization of Nonlinear Systems
▶ Modeling of Dynamic Systems from First Principles
▶ Nonlinear Zero Dynamics
▶ PID Control
▶ Regulation and Tracking of Nonlinear Systems

Recommended Reading

In addition to the works cited in the body of this article, a detailed treatment of dynamic modeling and control issues for flexible robots can be found in De Luca and Book (2008). This includes also the use of dynamic feedback linearization for a more general model of robots with elastic joints. For the same class of robots, Brogliato et al. (1995) provided a comparison of passivity-based and inversion-based tracking controllers.
Bibliography

Albu-Schäffer A, Hirzinger G (2001) A globally stable state feedback controller for flexible joint robots. Adv Robot 15(8):799–814
Balas MJ (1978) Feedback control of flexible systems. IEEE Trans Autom Control 23(4):673–679
Barbieri E, Özgüner Ü (1988) Unconstrained and constrained mode expansions for a flexible slewing link. ASME J Dyn Syst Meas Control 110(4):416–421
Bayo E (1987) A finite-element approach to control the end-point motion of a single-link flexible robot. J Robot Syst 4(1):63–75
Bayo E, Papadopoulos P, Stubbe J, Serna MA (1989) Inverse dynamics and kinematics of multi-link elastic robots: an iterative frequency domain approach. Int J Robot Res 8(6):49–62
Book WJ (1984) Recursive Lagrangian dynamics of flexible manipulators. Int J Robot Res 3(3):87–106
Brogliato B, Ortega R, Lozano R (1995) Global tracking controllers for flexible-joint manipulators: a comparative study. Automatica 31(7):941–956
Cannon RH, Schmitz E (1984) Initial experiments on the end-point control of a flexible one-link robot. Int J Robot Res 3(3):62–75
De Luca A, Book W (2008) Robots with flexible elements. In: Siciliano B, Khatib O (eds) Springer handbook of robotics. Springer, Berlin, pp 287–319
De Luca A, Siciliano B (1993a) Regulation of flexible arms under gravity. IEEE Trans Robot Autom 9(4):463–467
De Luca A, Siciliano B (1993b) Inversion-based nonlinear control of robot arms with flexible links. AIAA J Guid Control Dyn 16(6):1169–1176
De Luca A, Siciliano B, Zollo L (2005) PD control with on-line gravity compensation for robots with elastic joints: theory and experiments. Automatica 41(10):1809–1819
Kanoh H (1990) Distributed parameter models of flexible robot arms. Adv Robot 5(1):87–99
Kugi A, Ott C, Albu-Schäffer A, Hirzinger G (2008) On the passivity-based impedance control of flexible joint robots. IEEE Trans Robot 24(2):416–429
Kwon D-S, Book WJ (1994) A time-domain inverse dynamic tracking control of a single-link flexible manipulator. ASME J Dyn Syst Meas Control 116(2):193–200
Luo ZH (1993) Direct strain feedback control of flexible robot arms: new theoretical and experimental results. IEEE Trans Autom Control 38(11):1610–1622
Siciliano B, Book WJ (1988) A singular perturbation approach to control of lightweight flexible manipulators. Int J Robot Res 7(4):79–90
Singer N, Seering WP (1990) Preshaping command inputs to reduce system vibration. ASME J Dyn Syst Meas Control 112(1):76–82
Spong MW (1987) Modeling and control of elastic joint robots. ASME J Dyn Syst Meas Control 109(4):310–319
Spong MW, Khorasani K, Kokotovic PV (1987) An integral manifold approach to the feedback control of flexible joint robots. IEEE J Robot Autom 3(4):291–300
Sweet LM, Good MC (1985) Redefinition of the robot motion control problem. IEEE Control Syst Mag 5(3):18–24
Tomei P (1991) A simple PD controller for robots with elastic joints. IEEE Trans Autom Control 36(10):1208–1213

Flocking in Networked Systems

Ali Jadbabaie
University of Pennsylvania, Philadelphia, PA, USA

Abstract

Flocking is a collective behavior exhibited by many animal species such as birds, insects, and fish. Such behavior is generated by distributed motion coordination through nearest-neighbor interactions. Empirical study of such behavior has been an active research area in ecology and evolutionary biology. Mathematical study of such behaviors has become an active research area in a diverse set of disciplines, ranging from statistical physics and computer graphics to control theory, robotics, opinion dynamics in social networks, and the general theory of multiagent systems. While models vary in detail, they are all based on local diffusive dynamics that result in the emergence of consensus in the direction of motion. Flocking is closely related to the notions of consensus and synchronization in multiagent systems, as examples of collective phenomena that emerge in multiagent systems as a result of local nearest-neighbor interactions.

Keywords

Consensus; Dynamics; Flocking; Graph theory; Markov chains; Switched dynamical systems; Synchronization

Flocking or social aggregation is a group behavior observed in many animal species, ranging
from various types of birds to insects and fish. The phenomenon can be loosely defined as any aggregate collective behavior in parallel rectilinear formation or (in the case of fish) in collective circular motion. The mechanisms leading to such behavior have been (and continue to be) an active area of research among ecologists and evolutionary biologists, dating back to the 1950s if not earlier. The engineering interest in the topic is much more recent.

In 1986, Craig Reynolds (1987), a computer graphics researcher, developed a computer model of collective behavior for animated artificial objects called boids. The flocking model for boids was used to realistically duplicate aggregation phenomena in bird flocks and fish schools for computer animation. Reynolds developed a simple, intuitive physics-based model: each boid was a point mass subject to three simple steering forces: alignment (to steer each boid towards the average heading of its local flockmates), cohesion (steering to move towards the average position of local flockmates), and separation (to avoid crowding local flockmates). The term local should be understood as those flockmates who are within each other's influence zone, which could be a disk (or a wide-angle sector of a disk) centered at each boid with a prespecified radius. This simple zone-based model created very realistic flocking behaviors and was used in many animations (Fig. 1).

Nine years later, in 1995, Vicsek and coauthors (Vicsek et al. 1995) independently developed a model for velocity alignment of self-propelled particles (SPPs) in a square with periodic boundary conditions. SPPs are essentially kinematic particles moving with constant speed, and the steering law determines the angle of their velocity vector (what control theorists call a kinematic, nonholonomic vehicle model). Vicsek et al.'s steering law was very intuitive and simple and was essentially Reynolds' zone-based alignment rule (Vicsek was not aware of Reynolds' result): each particle averages the angle of its velocity vector with that of its neighbors (those within a disk of a prespecified distance), plus a noise term used to model inaccuracies in averaging. Once the velocity vector is determined at each time, each particle takes a unit step along that direction, then determining its neighbors again and repeating the protocol.

The simulations were done in a square of unit length with periodic boundary conditions, to simulate infinite space. Vicsek and coauthors simulated this behavior and found that as the density of the particles increased, a preferred direction spontaneously emerged, resulting in a global group behavior with nearly aligned velocity vectors for all particles, despite the fact that the update protocol is entirely local.

With the interest in control theory shifting towards multiagent systems and networked coordination and control, it became clear that the mathematics of how birds flock and fish school, and how individuals in a social network reach agreement (even though they are often only influenced by other like-minded individuals), are quite related to the question of how one can engineer a swarm of robots to behave like bird flocks.

Flocking in Networked Systems, Fig. 1 Reynolds' 3 rules of flocking. Photo from http://red3d.com/cwr/boids/
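The Vicsek steering law described above takes only a few lines to state in code: each particle adopts the average heading of all particles within a prespecified radius (itself included), perturbed by a small noise term, and then moves one step of fixed size in the unit square with periodic boundary conditions. All numerical values below (particle count, radius, speed, noise level) are illustrative choices.

```python
import math, random

def vicsek_step(pos, ang, r=0.3, v=0.03, eta=0.1, L=1.0):
    # One synchronous update of the Vicsek alignment rule.
    n, new_ang = len(pos), []
    for i in range(n):
        sx = sy = 0.0
        for j in range(n):
            dx = abs(pos[i][0] - pos[j][0]); dx = min(dx, L - dx)  # wrap-around
            dy = abs(pos[i][1] - pos[j][1]); dy = min(dy, L - dy)
            if dx * dx + dy * dy <= r * r:        # j is a neighbor of i
                sx += math.cos(ang[j]); sy += math.sin(ang[j])
        new_ang.append(math.atan2(sy, sx) + random.uniform(-eta / 2, eta / 2))
    new_pos = [((x + v * math.cos(a)) % L, (y + v * math.sin(a)) % L)
               for (x, y), a in zip(pos, new_ang)]
    return new_pos, new_ang

def order_parameter(ang):
    # Normalized mean velocity: ~0 when disordered, 1 when fully aligned.
    return math.hypot(sum(math.cos(a) for a in ang),
                      sum(math.sin(a) for a in ang)) / len(ang)

random.seed(0)
pos = [(random.random(), random.random()) for _ in range(60)]
ang = [random.uniform(-math.pi, math.pi) for _ in range(60)]
for _ in range(200):
    pos, ang = vicsek_step(pos, ang)
print(order_parameter(ang))  # close to 1: a common heading has emerged
```

At this density and noise level the order parameter rises from near zero (random headings) to near one, reproducing the spontaneous alignment reported by Vicsek and coauthors.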
These questions have occupied the minds of many researchers in diverse areas ranging from control theory to robotics, mathematics, and computer science. As discussed above, most of the early research, which happened in computer graphics and statistical physics, was on modeling and simulation of collective behavior. Over the past 13 years, however, the focus has shifted to rigorous systems theoretic foundations, leading to what one might call a theory of collective phenomena in multiagent systems. This theory blends dynamical systems, graph theory, Markov chains, and algorithms.

These types of collective phenomena are often modeled as many-degrees-of-freedom (discrete-time or continuous-time) dynamical systems with an additional twist: the interconnection structure between individual dynamical systems changes, since the motion of each node in a flock (or opinion of an individual) is affected primarily by those in each node's local neighborhood. The twist here is that the local neighborhood is not fixed: neighbors are defined based on the actual state of the system. For example, in the case of Vicsek's alignment rule, as each particle averages its velocity direction with that of its neighbors and then takes a step, the set of its nearest neighbors can change.

Interestingly, very similar models were developed in the statistics and mathematical sociology literature to describe how individuals in a social network update their opinions as a function of the opinion of their friends. The first such model goes back to the seminal work of DeGroot (1974) in 1974. DeGroot's model simply described the evolution of a group's scalar opinion as a function of the opinion of their neighbors by an iterative averaging scheme that can be conveniently modeled as a Markov chain. Individuals are represented by nodes of a graph, start from an opinion at time zero, and then are influenced by the people in their social clique. In DeGroot's model, though, the network is given exogenously and does not change as a function of the opinions. The model therefore can be analyzed using the celebrated Perron-Frobenius theorem. The evolution of opinions is a discrete dynamic system that corresponds to an averaging map. When the network is fixed and connected (i.e., there is a path from every node to every other node) and agents also include their own opinions in the averaging, the update results in computation of a global weighted average of initial opinions, where the weight of each initial opinion in the final aggregate is proportional to the "importance" of each node in the network.

The flocking models of Reynolds (1987) and Vicsek et al. (1995), however, have an extra twist: the network changes as opinions are updated. Similarly, more refined sociological models developed over the past decade also capture this endogeneity (Hegselmann and Krause 2002): each individual agent is influenced by others only when their opinion is close to her own. In other words, as opinions evolve, neighborhood structures change as a function of the evolving opinion, resulting in a switched dynamical system in which switching is state dependent.

In a paper in 2003, Jadbabaie and coauthors (2003) studied Reynolds' alignment rule in the context of Vicsek's model when there is no exogenous noise. To model the endogeneity of the change in neighborhood structure, they developed a model based on repeated local averaging in which the neighborhood structure changes over time; therefore, instead of a simple discrete-time linear dynamical system, the model is a discrete linear inclusion or a switched linear system. The question of interest was to determine what regimes of network changes could result in flocking. Clearly, as also DeGroot's model suggests, when the local neighbor structures do not change, connectivity is the key factor for flocking. This is a direct consequence of Perron-Frobenius theory. The result can also be described in terms of directed graphs. What is the equivalent condition in changing networks? Jadbabaie and coauthors show in their paper that indeed connectivity is important, but it need not hold at every time: rather, there need to be time periods over which the graphs are connected in time. More formally, the process of neighborhood changes due to motion in Vicsek's model can be simply abstracted as a graph in which the links "blink" on and off. For flocking, one needs to ensure that there are
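DeGroot's averaging model is compactly expressed as $x(t+1) = W x(t)$ with a row-stochastic weight matrix $W$. On a fixed connected graph with positive self-weights, opinions converge to $\pi^T x(0)$, where $\pi$ is the left Perron eigenvector of $W$ (the "importance" weights mentioned above). The 4-node path graph and weights below are a hypothetical example.

```python
def degroot(W, x, steps=200):
    # Repeated local averaging: x <- W x, with each row of W summing to 1.
    n = len(x)
    for _ in range(steps):
        x = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x

# Row-stochastic weights on the path 0-1-2-3; every agent keeps a
# positive self-weight, so the averaging is aperiodic.
W = [[0.50, 0.50, 0.00, 0.00],
     [0.25, 0.50, 0.25, 0.00],
     [0.00, 0.25, 0.50, 0.25],
     [0.00, 0.00, 0.50, 0.50]]
print(degroot(W, [0.0, 1.0, 2.0, 3.0]))  # all entries near 1.5
```

For these weights the left Perron eigenvector is $(1, 2, 2, 1)/6$, so the initial opinions $(0, 1, 2, 3)$ aggregate to $1.5$: the middle agents, having more neighbors, carry twice the weight of the endpoints.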
time periods over which the union of edges (that occur as a result of proximity of particles) needs to correspond to a connected graph, and such intervals need to occur infinitely often. It turns out that many of these ideas were developed much earlier, in a thesis and following paper by Tsitsiklis (1984) and Tsitsiklis et al. (1986), in the context of distributed and asynchronous computation of global averages in changing graphs, and in a paper by Chatterjee and Seneta (1977), in the context of nonhomogeneous Markov chains. The machinery for proving such results, however, is classical and has a long history in the theory of inhomogeneous Markov chains, a subject studied since the time of Markov himself, followed by Birkhoff and other mathematicians such as Hajnal, Dobrushin, Seneta, Hartfiel, Daubechies, and Lagarias, to name a few.

The interesting twist in the analysis of Vicsek's model is that the Euclidean norm of the distance to the globally converged "consensus angle" (or consensus opinion in the case of opinion models) can actually grow in a single step of the process; therefore, standard quadratic Lyapunov function arguments, which serve as the main tool for the analysis of switched linear systems, are not suitable for the analysis of such models. However, it is fairly easy to see that under the process of local averaging, the largest value cannot increase and the smallest value cannot decrease. In fact, one can show that if enough connected graphs occur as the result of the switching process, the maximum value will be strictly decreasing and the minimum value will be strictly increasing.

The paper by Jadbabaie and coauthors (2003) has led to a flurry of results in this area over the past decade. One important generalization of the results of Jadbabaie et al. (2003), Tsitsiklis (1984), and Tsitsiklis et al. (1986) came 2 years later in a paper by Moreau (2005), who showed that these results can be generalized to nonlinear updates and directed graphs. Moreau showed that any dynamic process that assigns to each node a point in the interior of the convex hull of the values of that node and its neighbors will eventually result in agreement and consensus, if and only if the union of graphs from every time step till infinity contains a directed spanning tree (a node from which there is a directed path to every other node).

Some of these results were also extended to the analysis of Reynolds' model of flocking, including the other two behaviors. First, Tanner and coauthors (2003, 2007) showed in a series of papers in 2003 and 2007 that a zone-based model similar to Reynolds' can result in flocking for dynamic agents, provided that the graph representing interagent communications stays connected. Olfati-Saber and coauthors (2007) developed similar results with a slightly different model.

Many generalizations and extensions of these results exist in a diverse set of disciplines, resulting in a rich theory which has had applications from robotics (such as rendezvous in mobile robots) (Cortés et al. 2006) to mathematical sociology (Hegselmann and Krause 2002) and from economics (Golub and Jackson 2010) to distributed optimization theory (Nedic and Ozdaglar 2009). However, some of the fundamental mathematical questions related to flocking still remain open.

First, most results focus on exogenous models of network change. A notable extension is a paper by Cucker and Smale (2007), in which the authors develop and analyze an endogenous model of flocking that cleverly smoothens out the discontinuous change in network structure by allowing each node's influence to decay smoothly as a function of distance.

Recently, in a series of papers, Chazelle has made progress in this arena by using tools from computational geometry and algorithms for the analysis of endogenous models of flocking (Chazelle 2012). Chazelle has introduced the notion of the s-energy of a flock, which can be thought of as a parameterized family of Lyapunov functions that represent the evolution of global misalignment between flockmates. Via tools from dynamical systems, computational geometry, combinatorics, complexity theory, and algorithms, Chazelle creates an "algorithmic calculus" for diffusive influence systems: surprisingly, he shows that the orbit or flow of such systems is attracted to a fixed point in the case of undirected graphs and a limit cycle for almost all arbitrarily small random
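The max/min argument above can be checked on a toy switching example: under equal-neighbor averaging the largest value never increases and the smallest never decreases, and although neither graph below is connected by itself, their union is, so alternating between them still yields consensus. The two graphs are an arbitrary illustrative choice.

```python
def average_step(neighbors, x):
    # Each node moves to the average of itself and its current neighbors
    # (the equal-neighbor update used in Vicsek-style models).
    return [(x[i] + sum(x[j] for j in neighbors[i])) / (1 + len(neighbors[i]))
            for i in range(len(x))]

G1 = {0: [1], 1: [0], 2: [3], 3: [2], 4: []}   # edges 0-1, 2-3 only
G2 = {0: [], 1: [2], 2: [1], 3: [4], 4: [3]}   # edges 1-2, 3-4 only
x = [5.0, -3.0, 7.0, 0.0, 2.0]
hi, lo = max(x), min(x)
for t in range(100):
    x = average_step(G1 if t % 2 == 0 else G2, x)
    assert max(x) <= hi + 1e-12 and min(x) >= lo - 1e-12  # monotone bounds
    hi, lo = max(x), min(x)
print(max(x) - min(x))  # spread has contracted essentially to zero
```

Note that the spread max(x) - min(x), unlike a fixed quadratic form, is a valid common Lyapunov-like function for every averaging graph, which is exactly the point made in the text.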
perturbations. Furthermore, the convergence time can also be bounded in both cases, and the bounds are essentially optimal. The setup of the diffusive influence system developed by Chazelle creates a near-universal setup for analyzing various problems involving collective behavior in networked multiagent systems, from flocking, opinion dynamics, and information aggregation to synchronization problems.

To make further progress on the analysis of what one might call networked dynamical systems (which Chazelle calls influence systems), one needs to combine the mathematics of algorithms, complexity, combinatorics, and graphs with systems theory and dynamical systems.

Summary and Future Directions

This article presented a brief summary of the literature on flocking and distributed motion coordination. Flocking is the process by which various species exhibit synchronous collective motion from simple local interaction rules. Motivated by social aggregation in various species, various algorithms have been developed in the literature to design distributed control laws for group behavior in collective robotics and for the analysis of opinion dynamics in social networks. The models describe each agent as a kinematic or point-mass particle that aligns its direction with that of its neighbors using repeated local averaging of directions. Since the neighborhood structures change due to motion, this results in a distributed switched dynamical system. If a weak notion of connectivity among agents is preserved over time, then agents reach consensus in their direction of motion. Despite the flurry of results in this area, the analysis of this phenomenon that accounts for endogenous change in dynamics is for the most part open.

Cross-References

▶ Averaging Algorithms and Consensus
▶ Oscillator Synchronization

Bibliography

Chatterjee S, Seneta E (1977) Towards consensus: some convergence theorems on repeated averaging. J Appl Probab 14:89–97
Chazelle B (2012) Natural algorithms and influence systems. Commun ACM 55(12):101–110
Cortés J, Martínez S, Bullo F (2006) Robust rendezvous for mobile autonomous agents via proximity graphs in arbitrary dimensions. IEEE Trans Autom Control 51(8):1289–1298
Cucker F, Smale S (2007) Emergent behavior in flocks. IEEE Trans Autom Control 52(5):852–862
DeGroot MH (1974) Reaching a consensus. J Am Stat Assoc 69(345):118–121
Golub B, Jackson MO (2010) Naive learning in social networks and the wisdom of crowds. Am Econ J Microecon 2(1):112–149
Hegselmann R, Krause U (2002) Opinion dynamics and bounded confidence models, analysis, and simulation. J Artif Soc Soc Simul 5(3):2
Jadbabaie A, Lin J, Morse AS (2003) Coordination of groups of mobile autonomous agents using nearest neighbor rules. IEEE Trans Autom Control 48(6):988–1001
Moreau L (2005) Stability of multiagent systems with time-dependent communication links. IEEE Trans Autom Control 50(2):169–182
Nedic A, Ozdaglar A (2009) Distributed subgradient methods for multi-agent optimization. IEEE Trans Autom Control 54(1):48–61
Olfati-Saber R, Fax JA, Murray RM (2007) Consensus and cooperation in networked multi-agent systems. Proc IEEE 95(1):215–233
Reynolds CW (1987) Flocks, herds, and schools: a distributed behavioral model. Comput Graph 21(4):25–34 (SIGGRAPH '87 Conference Proceedings)
Tanner HG, Jadbabaie A, Pappas GJ (2003) Stable flocking of mobile agents, Part I: fixed topology, Part II: switching topology. In: Proceedings of the 42nd IEEE conference on decision and control, Maui, vol 2. IEEE, pp 2010–2015
Tanner HG, Jadbabaie A, Pappas GJ (2007) Flocking in fixed and switching networks. IEEE Trans Autom Control 52(5):863–868
Tsitsiklis JN (1984) Problems in decentralized decision making and computation (No. LIDS-TH-1424). Laboratory for Information and Decision Systems, Massachusetts Institute of Technology
Tsitsiklis J, Bertsekas D, Athans M (1986) Distributed asynchronous deterministic and stochastic gradient optimization algorithms. IEEE Trans Autom Control 31(9):803–812
Vicsek T, Czirók A, Ben-Jacob E, Cohen I, Shochet O (1995) Novel type of phase transition in a system of self-driven particles. Phys Rev Lett 75(6):1226
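The repeated local averaging that underlies the flocking and consensus results summarized above can be illustrated with a small simulation (a hypothetical fixed connected graph; plain Python, no libraries). Each agent replaces its value by the average of its own and its neighbors' values, a point in the convex hull of neighboring values in the sense of Moreau's condition, so the spread of values contracts and the agents reach consensus.

```python
# Consensus by repeated local averaging on a fixed connected graph.
# Each agent moves to the mean of its own and its neighbors' values --
# a point in the interior of the convex hull of neighboring values.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}  # path graph, connected
x = [0.0, 1.0, 4.0, 9.0]                            # initial headings

for _ in range(500):
    x = [sum([x[i]] + [x[j] for j in neighbors[i]]) / (1 + len(neighbors[i]))
         for i in range(len(x))]

print(max(x) - min(x) < 1e-9)   # True: the spread has contracted to zero
```

The consensus value stays inside the convex hull of the initial values, and with switching (time-varying) neighbor sets the same contraction argument goes through as long as the union of the graphs stays connected often enough.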
Force Control in Robotics

Luigi Villani
Dipartimento di Ingegneria Elettrica e Tecnologie dell'Informazione, Università degli Studi di Napoli Federico II, Napoli, Italy

Abstract

Force control is used to handle the physical interaction between a robot and the environment and also to ensure safe and dependable operation in the presence of humans. The control goal may be to keep the interaction forces limited or to guarantee a desired force along the directions where interaction occurs, while a desired motion is ensured in the other directions. This entry presents the basic control schemes, focusing on robot manipulators.

Keywords

Compliance control; Constrained motion; Force control; Force/torque sensor; Hybrid force/motion control; Impedance control; Stiffness control

Introduction

Control of the physical interaction between a robot manipulator and the environment is crucial for the successful execution of a number of practical tasks where the robot end effector has to manipulate an object or perform some operation on a surface. Typical examples in industrial settings include polishing, deburring, machining, or assembly.

During contact, the environment may set constraints on the geometric paths that can be followed by the robot's end effector (kinematic constraints), as in the case of sliding on a rigid surface. In other situations, the interaction occurs with a dynamic environment, as in the case of collaboration with a human. In all cases, a pure motion control strategy is not recommended, especially if the environment is stiff.

The higher the environment stiffness and position control accuracy are, the more easily the contact forces may rise and reach unsafe values. This drawback can be overcome by introducing compliance, either in a passive or in an active fashion, to accommodate the robot motion in response to interaction forces.

Passive compliance may be due to the structural compliance of the links, joints, and end effector or to the compliance of the position servo. Soft robot arms with elastic joints or links are purposely designed for intrinsically safe interaction with humans. In contrast, active compliance is entrusted to the control system, denoted interaction control or force control. In some cases, the measurement of the contact force and moment is required, which is fed back to the controller and used to modify or even generate online the desired motion of the robot (Whitney 1977).

The passive solution is faster than an active reaction commanded by a computer control algorithm. However, the use of passive compliance alone lacks flexibility and cannot guarantee that high contact forces will never occur. Hence, the most effective solution is to use active force control (with or without force feedback) in combination with some degree of passive compliance.

In general, six force components are required to provide complete contact force information: three translational force components and three torques. Often, a force/torque sensor is mounted at the robot wrist (see an example in Fig. 1), but other possibilities exist: for example, force sensors can be placed on the fingertips of robotic hands; also, external forces and moments can be estimated via shaft torque measurements of joint torque sensors.

The force control strategies can be grouped into two categories (Siciliano and Villani 1999): those performing indirect force control and those performing direct force control. The main difference between the two categories is that the former achieve force control via motion control,
without explicit closure of a force feedback loop; the latter instead offer the possibility of controlling the contact force and moment to a desired value, thanks to the closure of a force feedback loop.

Force Control in Robotics, Fig. 1 Industrial robot with wrist force/torque sensor and deburring tool

Modeling

The case of interaction of the end effector of a robot manipulator with the environment is considered, which is the most common situation in industrial applications.

The end-effector pose can be represented by the position vector $p_e$ and the rotation matrix $R_e$, corresponding to the position and orientation of a frame attached to the end effector with respect to a fixed-base frame.

The end-effector velocity is denoted by the $6 \times 1$ twist vector $v_e = [\dot{p}_e^T \; \omega_e^T]^T$, where $\dot{p}_e$ is the translational velocity and $\omega_e$ is the angular velocity, and can be computed from the joint velocity vector $\dot{q}$ using the linear mapping

$$v_e = J(q)\dot{q}.$$

The matrix $J$ is the end-effector Jacobian. For simplicity, the case of nonredundant nonsingular manipulators is considered; therefore, the Jacobian is a square nonsingular matrix.

The force $f_e$ and moment $m_e$ applied by the end effector to the environment are the components of the wrench $h_e = [f_e^T \; m_e^T]^T$. The joint torques $\tau$ corresponding to $h_e$ can be computed as

$$\tau = J^T(q) h_e.$$

It is useful to consider the operational space formulation of the dynamic model of a rigid robot manipulator in contact with the environment (Khatib 1987):

$$\Lambda(q)\dot{v}_e + \Gamma(q,\dot{q})v_e + \eta(q) = h_c - h_e, \quad (1)$$

where $\Lambda(q)$ is the $6 \times 6$ operational space inertia matrix, $\Gamma(q,\dot{q})$ is the wrench including centrifugal and Coriolis effects, and $\eta(q)$ is the wrench of the gravitational effects. The vector $h_c = J^{-T}\tau_c$ is the equivalent end-effector wrench corresponding to the input joint torques $\tau_c$.

Equation (1) can be seen as a representation of Newton's second law of motion where all the generalized forces acting on the joints of the robot are reported at the end effector.

The full specification of the system dynamics would also require the analytic description of the interaction force and moment $h_e$. This is a very demanding task from a modeling viewpoint. The design of the interaction control and the performance analysis are usually carried out under simplifying assumptions. The following two cases are considered:
1. The robot is perfectly rigid, all the compliance in the system is localized in the environment, and the contact wrench is approximated by a linear elastic model.
2. The robot and the environment are perfectly rigid, and purely kinematic constraints are imposed by the environment.
It is obvious that these situations are only ideal. However, the robustness of the control should be able to cope with situations where some of the ideal assumptions are relaxed. In that case the control laws may be adapted to deal with nonideal characteristics.
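As a concrete illustration of the static mapping $\tau = J^T(q) h_e$, consider a planar two-link arm (a hypothetical 2-D slice of the general wrench relation; link lengths and joint angles below are illustrative, not from the entry). The same Jacobian that maps joint velocities to end-effector velocity transposes to map end-effector forces to joint torques, as a virtual-work check confirms.

```python
import math

# Planar 2R arm: translational Jacobian and the static force map
# tau = J(q)^T * f_e  (illustrative values).
l1, l2 = 1.0, 0.5
q1, q2 = 0.3, 0.7

s1, c1 = math.sin(q1), math.cos(q1)
s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)

# J maps joint velocities to end-effector translational velocity.
J = [[-l1 * s1 - l2 * s12, -l2 * s12],
     [ l1 * c1 + l2 * c12,  l2 * c12]]

f_e = [2.0, -1.0]  # force applied by the end effector to the environment

# tau = J^T f_e
tau = [J[0][0] * f_e[0] + J[1][0] * f_e[1],
       J[0][1] * f_e[0] + J[1][1] * f_e[1]]

# Virtual-work check: f_e . (J qdot) == tau . qdot for any qdot,
# i.e., the two maps are power-consistent.
qdot = [0.11, -0.23]
v = [J[0][0] * qdot[0] + J[0][1] * qdot[1],
     J[1][0] * qdot[0] + J[1][1] * qdot[1]]
power_task = f_e[0] * v[0] + f_e[1] * v[1]
power_joint = tau[0] * qdot[0] + tau[1] * qdot[1]
print(abs(power_task - power_joint) < 1e-12)  # True
```

The power equality is exactly the duality that makes $J^{-T}\tau_c$ the equivalent end-effector wrench in the operational space model.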
Indirect Force Control

The aim of indirect force control is to achieve a desired compliant dynamic behavior of the robot's end effector in the presence of interaction with the environment.

Stiffness Control
The simplest approach is to impose a suitable static relationship between the deviation of the end-effector position and orientation from a desired pose and the force exerted on the environment, by using the control law

$$h_c = K_P \Delta x_{de} - K_D v_e + \eta(q), \quad (2)$$

where $K_P$ and $K_D$ are suitable matrix gains and $\Delta x_{de}$ is a suitable error between a desired and the actual end-effector position and orientation. The position error component of $\Delta x_{de}$ can be simply chosen as $p_d - p_e$. Concerning the orientation error component, different choices are possible (Caccavale et al. 1999), which are not all equivalent, but this issue is outside the scope of this entry.

The control input (2) corresponds to a wrench (force and moment) applied to the end effector, which includes a gravity compensation term $\eta(q)$, a viscous damping term $-K_D v_e$, and an elastic wrench provided by a virtual spring with stiffness matrix $K_P$ (or, equivalently, compliance matrix $K_P^{-1}$) connecting the end-effector frame with a frame of desired position and orientation. This control law is known as stiffness control or compliance control (Salisbury 1980).

Using the Lyapunov method, it is possible to prove the asymptotic stability of the equilibrium solution of the equation

$$K_P \Delta x_{de} = h_e,$$

meaning that, at steady state, the robot's end effector has a desired elastic behavior under the action of the external wrench $h_e$. It is clear that, if $h_e \neq 0$, then the end effector deviates from the desired pose, which is usually denoted as the virtual pose.

Physically, the closed-loop system (1) with (2) can be seen as a 6-DOF nonlinear and configuration-dependent mass-spring-damper system with inertia (mass) matrix $\Lambda(q)$ and adjustable damping $K_D$ and stiffness $K_P$, under the action of the external wrench $h_e$.

Impedance Control
A configuration-independent dynamic behavior can be achieved if the measure of the end-effector force and moment $h_e$ is available, by using the control law, known as impedance control (Hogan 1985):

$$h_c = \Lambda(q)\alpha + \Gamma(q,\dot{q})v_e + \eta(q) + h_e,$$

where $\alpha$ is chosen as

$$\alpha = \dot{v}_d + K_M^{-1}\left(K_D \Delta v_{de} + K_P \Delta x_{de} - h_e\right).$$

The following expression can be found for the closed-loop system:

$$K_M \Delta\dot{v}_{de} + K_D \Delta v_{de} + K_P \Delta x_{de} = h_e, \quad (3)$$

representing the equation of a 6-DOF configuration-independent mass-spring-damper system with adjustable inertia (mass) matrix $K_M$, damping $K_D$, and stiffness $K_P$, known as a mechanical impedance.

A block diagram of the resulting impedance control is sketched in Fig. 2.

The selection of good impedance parameters ensuring a satisfactory behavior is not an easy task and can be simplified under the hypothesis that all the matrices are diagonal, resulting in a decoupled behavior for the end-effector coordinates.

Moreover, the dynamics of the controlled system during the interaction depends on the dynamics of the environment, which, for simplicity, can be approximated by a simple elastic law for each coordinate, of the form
Force Control in Robotics, Fig. 2 Impedance control
$$h_e = k\,\Delta x_{eo},$$

where $\Delta x_{eo} = x_e - x_o$, while $x_o$ and $k$ are the undeformed position and the stiffness coefficient of the spring, respectively.

In the above hypotheses, the transient behavior of each component of Eq. (3) can be set by assigning the natural frequency and damping ratio with the relations

$$\omega_n = \sqrt{\frac{k_P + k}{k_M}}, \qquad \zeta = \frac{k_D}{2\sqrt{k_M (k_P + k)}}.$$

Hence, if the gains are chosen so that a given natural frequency and damping ratio are ensured during the interaction (i.e., for $k \neq 0$), a smaller natural frequency with a higher damping ratio will be obtained when the end effector moves in free space (i.e., for $k = 0$). As for the steady-state performance, the end-effector error and the interaction force for the generic component are

$$\Delta x_{de} = \frac{k}{k_P + k}\,\Delta x_{do}, \qquad h = \frac{k_P\, k}{k_P + k}\,\Delta x_{do},$$

showing that, during interaction, the contact force can be made small at the expense of a large position error in steady state, as long as the robot stiffness $k_P$ is set low with respect to the stiffness of the environment $k$, and vice versa.
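These steady-state formulas can be checked with a scalar (1-DOF) simulation sketch in plain Python (all numerical values below are illustrative, not from the entry): the closed-loop impedance dynamics of Eq. (3) is integrated against the elastic environment $h_e = k(x_e - x_o)$ and the settled error and contact force are compared with the closed-form expressions.

```python
# 1-DOF impedance-controlled end effector pushing against an elastic
# environment h_e = k*(x_e - x_o).  Closed loop, per coordinate:
#   kM*(dv_d - dv_e) + kD*(v_d - v_e) + kP*(x_d - x_e) = h_e
# Steady state should match  dx_de = k/(kP+k)*dx_do,
#                            h     = kP*k/(kP+k)*dx_do.
kM, kD, kP = 1.0, 40.0, 100.0     # impedance parameters (illustrative)
k, x_o = 1.0e4, 0.0               # environment stiffness / rest position
x_d, v_d = 0.02, 0.0              # constant desired (virtual) pose

x_e, v_e, dt = 0.0, 0.0, 1e-4
for _ in range(200000):           # 20 s of forward-Euler integration
    h_e = k * (x_e - x_o)
    a_e = (kP * (x_d - x_e) - kD * v_e - h_e) / kM
    x_e += dt * v_e
    v_e += dt * a_e

x_de = x_d - x_e
print(round(x_de, 6), round(k * (x_e - x_o), 3))
print(round(k / (kP + k) * (x_d - x_o), 6),
      round(kP * k / (kP + k) * (x_d - x_o), 3))
# both lines print: 0.019802 1.98
```

With the stiff environment ($k \gg k_P$) almost all of the 2 cm virtual penetration shows up as position error, keeping the contact force small, exactly the trade-off discussed above.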
Direct Force Control

Indirect force control does not require explicit knowledge of the environment, although, to achieve a satisfactory dynamic behavior, the control parameters have to be tuned for a particular task. On the other hand, a model of the interaction task is usually required for the synthesis of direct force control algorithms.

In the following, it is assumed that the environment is rigid and frictionless and imposes kinematic constraints on the robot's end-effector motion (Mason 1981). These constraints reduce the dimension of the space of the feasible end-effector velocities and of the contact forces and moments. In detail, in the presence of $m$ independent constraints ($m < 6$), the end-effector velocity belongs to a subspace of dimension $6 - m$, while the end-effector wrench belongs to a subspace of dimension $m$, and they can be expressed in the form

$$v_e = S_v(q)\,\nu, \qquad h_e = S_f(q)\,\lambda,$$

where $\nu$ is a suitable $(6-m) \times 1$ vector and $\lambda$ is a suitable $m \times 1$ vector. Moreover, the subspaces of forces and velocities are reciprocal, i.e.:

$$h_e^T v_e = 0, \qquad S_f^T(q)\,S_v(q) = 0.$$

The concept of reciprocity expresses the physical fact that, in the hypothesis of rigid and frictionless contact, the wrench does not cause any work against the twist.

An interaction task can be assigned in terms of a desired end-effector twist $v_d$ and wrench $h_d$ that are computed as

$$v_d = S_v\,\nu_d, \qquad h_d = S_f\,\lambda_d,$$

by specifying the vectors $\nu_d$ and $\lambda_d$.

In many robotic tasks it is possible to set an orthogonal reference frame, usually referred to as the task frame (De Schutter and Van Brussel 1988), in which the matrices $S_v$ and $S_f$ are constant. Moreover, the interaction task is specified by assigning a desired force/torque or a desired linear/angular velocity along/about each of the frame axes. An example of task frame definition and task specification is given below.

Peg-in-Hole: The goal of this task is to push the peg into the hole while avoiding wedging and jamming. The peg has two degrees of motion freedom; hence, the dimension of the velocity-controlled subspace is $6 - m = 2$, while the dimension of the force-controlled subspace is $m = 4$. The task frame can be chosen as shown in Fig. 3, and the task can be achieved by assigning the following desired forces and torques:
• Zero forces along the $x_t$ and $y_t$ axes
• Zero torques about the $x_t$ and $y_t$ axes
and the desired velocities:
• A nonzero linear velocity along the $z_t$-axis
• An arbitrary angular velocity about the $z_t$-axis
The task continues until a large reaction force in the $z_t$ direction is measured, indicating that the peg has hit the bottom of the hole (not represented in the figure). Hence, the matrices $S_f$ and $S_v$ can be chosen as

$$S_f = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \qquad S_v = \begin{pmatrix} 0 & 0 \\ 0 & 0 \\ 1 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 1 \end{pmatrix}.$$

Force Control in Robotics, Fig. 3 Insertion of a cylindrical peg into a hole

The task frame can be chosen attached either to the end effector or to the environment.
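The peg-in-hole selection matrices above can be verified numerically (a minimal plain-Python sketch): $S_f^T S_v = 0$ confirms that the force-controlled directions ($f_x$, $f_y$, $m_x$, $m_y$) and the velocity-controlled directions ($v_z$, $\omega_z$) are reciprocal.

```python
# Peg-in-hole selection matrices in the task frame of Fig. 3:
# wrench components ordered (f_x, f_y, f_z, m_x, m_y, m_z).
Sf = [[1, 0, 0, 0],
      [0, 1, 0, 0],
      [0, 0, 0, 0],
      [0, 0, 1, 0],
      [0, 0, 0, 1],
      [0, 0, 0, 0]]
Sv = [[0, 0],
      [0, 0],
      [1, 0],
      [0, 0],
      [0, 0],
      [0, 1]]

# Reciprocity check Sf^T Sv = 0: constraint forces do no work
# against the feasible twists.
StS = [[sum(Sf[r][i] * Sv[r][j] for r in range(6)) for j in range(2)]
       for i in range(4)]
print(all(v == 0 for row in StS for v in row))  # True
```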
Hybrid Force/Motion Control
The reciprocity of the velocity and force subspaces naturally leads to a control approach, known as hybrid force/motion control (Raibert and Craig 1981; Yoshikawa 1987), aimed at controlling simultaneously both the contact force and the end-effector motion in two reciprocal subspaces.

The reduced-order dynamics of the robot with kinematic constraints is described by $6 - m$ second-order equations

$$\Lambda_v(q)\,\dot{\nu} = S_v^T\left(h_c - \mu(q,\dot{q})\right),$$

where $\Lambda_v = S_v^T \Lambda S_v$ and $\mu(q,\dot{q}) = \Gamma(q,\dot{q})v_e + \eta(q)$, assuming constant matrices $S_v$ and $S_f$. Moreover, the vector $\lambda$ can be computed as

$$\lambda = S_f^{\dagger}(q)\left(h_c - \mu(q,\dot{q})\right),$$

revealing that the contact force is a constraint force which instantaneously depends on the applied input wrench $h_c$.

An inverse-dynamics inner control loop can be designed by choosing the control wrench $h_c$ as

$$h_c = \Lambda(q) S_v \alpha_v + S_f f_\lambda + \mu(q,\dot{q}),$$

where $\alpha_v$ and $f_\lambda$ are properly designed control inputs, which leads to the equations

$$\dot{\nu} = \alpha_v, \qquad \lambda = f_\lambda,$$

showing a complete decoupling between motion control and force control.

Then, the desired force $\lambda_d(t)$ can be achieved by setting

$$f_\lambda = \lambda_d(t),$$

but this choice is very sensitive to disturbance forces, since it contains no force feedback. Alternative choices are

$$f_\lambda = \lambda_d(t) + K_P\left(\lambda_d(t) - \lambda(t)\right),$$

or

$$f_\lambda = \lambda_d(t) + K_I \int_0^t \left(\lambda_d(\tau) - \lambda(\tau)\right) d\tau,$$

where $K_P$ and $K_I$ are suitable positive-definite matrix gains. The proportional feedback is able to reduce the force error due to disturbance forces, while the integral action is able to compensate for constant bias disturbances.

Velocity control is achieved by setting

$$\alpha_v = \dot{\nu}_d(t) + K_{P\nu}\left(\nu_d(t) - \nu(t)\right) + K_{I\nu} \int_0^t \left(\nu_d(\tau) - \nu(\tau)\right) d\tau,$$

where $K_{P\nu}$ and $K_{I\nu}$ are suitable matrix gains. It is straightforward to show that asymptotic tracking of $\nu_d(t)$ and $\dot{\nu}_d(t)$ is ensured with exponential convergence for any choice of positive-definite matrices $K_{P\nu}$ and $K_{I\nu}$.

Notice that the implementation of force feedback requires the computation of the vector $\lambda$ from the measurement of the end-effector wrench $h_e$ as $\lambda = S_f^{\dagger} h_e$, with $S_f^{\dagger}$ a suitable pseudoinverse of the matrix $S_f$. Analogously, the vector $\nu$ can be computed from $v_e$ as $\nu = S_v^{\dagger} v_e$.

The hypothesis of rigid contact can be removed; this implies that along some directions both motion and force are allowed, although they are not independent. Hybrid force/motion control schemes can be defined also in this case (Villani and De Schutter 2008).
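The role of the integral force feedback can be shown with a scalar sketch in plain Python (gains and disturbance values are illustrative). Because the constraint force responds instantaneously to the input, a constant disturbance $d$ shifts $\lambda$ away from $\lambda_d$ under pure feedforward; the integral term accumulates the force error and removes the bias.

```python
# Direct force control with integral force feedback, scalar case.
# The constraint force responds instantaneously: lam = f_lam + d,
# with a constant disturbance force d.  Pure feedforward would leave
# a steady error of d; the integral action drives it to zero.
lam_d, d, KI, dt = 5.0, 0.8, 50.0, 1e-3

z = 0.0                          # integral of the force error
for _ in range(20000):           # 20 s
    f_lam = lam_d + KI * z       # f_lam = lam_d + KI * int(lam_d - lam)
    lam = f_lam + d              # measured contact force (disturbed)
    z += dt * (lam_d - lam)      # accumulate the force error

print(abs(lam - lam_d) < 1e-6)   # True: bias removed at steady state
```

At convergence the integrator holds $z = -d/K_I$, so the applied $f_\lambda$ exactly cancels the disturbance; a proportional term alone would only reduce the error by the factor $1/(1+K_P)$.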
Summary and Future Directions

This entry has sketched the main approaches to force control in a unifying perspective. However, there are many aspects that have not been considered here. The two major paradigms of force control (impedance and hybrid force/motion control) are based on several simplifying assumptions that are only partially satisfied in practical implementations and that have been partially removed in more advanced control methods.

Notice that the performance of a force-controlled robotic system depends on the interaction with a changing environment, which is very difficult to model and identify correctly. Hence, the standard performance indices used to evaluate a control system, i.e., stability, bandwidth, accuracy, and robustness, cannot be defined by considering the robotic system alone, as for the case of robot motion control, but must always be referred to the particular contact situation at hand.

Force control in industrial applications can be considered a mature technology, although, for the reason explained above, standard design methodologies are not yet available. Force control techniques are also employed in medical robotics, haptic systems, telerobotics, humanoid robotics, micro-robotics, and nano-robotics. An interesting field of application is human-centered robotics, where control plays a key role in achieving adaptability, reaction capability, and safety. Robots and biomechatronic systems based on novel variable impedance actuators, with physically adjustable compliance and damping, capable of reacting softly when touching the environment, necessitate the design of specific control laws. The combined use of exteroceptive sensing (visual, depth, proximity, force, tactile sensing) for reactive control in the presence of uncertainty represents another challenging research direction.

Cross-References

▶ Robot Grasp Control
▶ Robot Motion Control

Recommended Reading

This entry has presented a brief overview of the basic force control techniques, and the cited references represent a selection of the main pioneering contributions. A more extensive treatment of this topic, with related bibliography, can be found in Villani and De Schutter (2008). Besides impedance control and hybrid force/position control, an approach designed to cope with uncertainties in the environment geometry is the parallel force/position control (Chiaverini and Sciavicco 1993; Chiaverini et al. 1994). In the paper by Ott et al. (2008) the passive compliance of lightweight robots is combined with the active compliance ensured by impedance control. A systematic constraint-based methodology to specify complex tasks has been presented by De Schutter et al. (2007).

Bibliography

Caccavale F, Natale C, Siciliano B, Villani L (1999) Six-DOF impedance control based on angle/axis representations. IEEE Trans Robot Autom 15:289–300
Chiaverini S, Sciavicco L (1993) The parallel approach to force/position control of robotic manipulators. IEEE Trans Robot Autom 9:361–373
Chiaverini S, Siciliano B, Villani L (1994) Force/position regulation of compliant robot manipulators. IEEE Trans Autom Control 39:647–652
De Schutter J, Van Brussel H (1988) Compliant robot motion I. A formalism for specifying compliant motion tasks. Int J Robot Res 7(4):3–17
De Schutter J, De Laet T, Rutgeerts J, Decré W, Smits R, Aerbeliën E, Claes K, Bruyninckx H (2007) Constraint-based task specification and estimation for sensor-based robot systems in the presence of geometric uncertainty. Int J Robot Res 26(5):433–455
Hogan N (1985) Impedance control: an approach to manipulation: parts I–III. ASME J Dyn Syst Meas Control 107:1–24
Khatib O (1987) A unified approach for motion and force control of robot manipulators: the operational space formulation. IEEE J Robot Autom 3:43–53
Mason MT (1981) Compliance and force control for computer controlled manipulators. IEEE Trans Syst Man Cybern 11:418–432
Ott C, Albu-Schaeffer A, Kugi A, Hirzinger G (2008) On the passivity based impedance control of flexible joint robots. IEEE Trans Robot 24:416–429
Raibert MH, Craig JJ (1981) Hybrid position/force control of manipulators. ASME J Dyn Syst Meas Control 103:126–133
Salisbury JK (1980) Active stiffness control of a manipulator in Cartesian coordinates. In: 19th IEEE conference on decision and control, Albuquerque, pp 95–100
Siciliano B, Villani L (1999) Robot force control. Kluwer, Boston
Villani L, De Schutter J (2008) Robot force control. In: Siciliano B, Khatib O (eds) Springer handbook of robotics. Springer, Berlin, pp 161–185
Whitney DE (1977) Force feedback control of manipulator fine motions. ASME J Dyn Syst Meas Control 99:91–97
Yoshikawa T (1987) Dynamic hybrid position/force control of robot manipulators – description of hand constraints and calculation of joint driving force. IEEE J Robot Autom 3:386–392

Frequency Domain System Identification

Johan Schoukens and Rik Pintelon
Department ELEC, Vrije Universiteit Brussel, Brussels, Belgium

Abstract

In this chapter we give an introduction to frequency domain system identification. We start from the identification work loop in ▶ System Identification: An Overview, Fig. 4, and we discuss the impact of selecting the time or frequency domain approach on each of the choices in this loop. Although there is a full theoretical equivalence between the time and frequency domain identification approaches, it turns out that, from a practical point of view, there can be a natural preference for one of the two domains.

Keywords

Discrete and continuous time models; Experiment setup; Frequency and time domain identification; Plant and noise model
Introduction

System identification provides methods to build a mathematical model for a dynamical system starting from measured input and output signals (▶ System Identification: An Overview; see the section "Models and System Identification"). Initially, the field was completely dominated by the time domain approach, and the frequency domain was used to interpret the results (Ljung and Glover 1981). This picture changed in the nineties of the last century with the development of advanced frequency domain methods (Ljung 2006; Pintelon and Schoukens 2012), and nowadays it is widely accepted that there is a full theoretical equivalence between time and frequency domain system identification under some weak conditions (Agüero et al. 2009; Pintelon and Schoukens 2012). Dedicated toolboxes are available for both domains (Kollar 1994; Ljung 1988). This raises the question of how to choose between time and frequency domain identification. Many times the choice between the two approaches can be made based on the user's familiarity with one of the two methods. However, for some problems it turns out that there is a natural preference for the time or the frequency domain. This contribution discusses the main issues that need to be considered when making this choice, and it provides additional insights to guide the reader to the best solution for her/his problem.

In the identification work loop (▶ System Identification: An Overview, Fig. 4), we need to address three important questions (Ljung 1999, Sect. 1.4; Söderström and Stoica 1989, Chap. 1; Pintelon and Schoukens 2012, Sect. 1.4) that directly interact with the choice between time and frequency domain system identification:
• What data are available? What data are needed? This discussion will influence the selection of the measurement setup, the model choice, and the design of the experiment.
• What kind of models will be used? We will mainly focus on the identification of discrete time and continuous time models, using exactly the same frequency domain tools.
• How will the model be matched to the data? This question boils down to the choice of a cost function that measures the distance between the data and the model. We will discuss the use of nonparametric weighting functions in the frequency domain.

In the next sections, we will address these and similar questions in more detail. First, we discuss the measurement of the raw data. The choices that are made in this step will have a strong impact on many user aspects of the identification process. The frequency domain formulation will turn out to be a natural choice to propose a unified formulation of the system identification problem, including discrete and continuous time modeling. Next, a generalized frequency domain description of the system relation will be proposed. This model will be matched to the data using a weighted least squares cost function, formulated in the frequency domain. This will allow for the use of nonparametric weighting functions, based on a nonparametric preprocessing of the data. Eventually, some remaining user aspects are discussed.

Data Collection

In this section, we discuss the measurement assumptions that are made when the raw data are collected. It will turn out that these will directly influence the natural choice of the models that are used to describe the continuous time physical systems.

Time Domain and Frequency Domain Measurements
The data can be collected either in the time or in the frequency domain, and we briefly discuss both options.

Time Domain Measurements
Most measurements are nowadays made in the time domain because very fast high-quality analog-to-digital convertors (ADC) became available at a low price. These allow us to sample
Frequency Domain System Identification, Fig. 1 Comparison of the ZOH and BL signal reconstruction of a discrete time sequence: (a) time domain: ··· BL, – ZOH; (b) spectrum ZOH signal; (c) spectrum BL signal

and discretize the continuous time input and output signals and process these on a digital computer. The excitation signals, too, are mostly generated from a discrete time sequence with a digital-to-analog convertor (DAC).

The continuous time input and output signals uc(t), yc(t) of the system to be modeled are measured at the sampling moments tk = kTs, with Ts = 1/fs the sampling period and fs the sampling frequency: u(k) = uc(kTs), and y(k) = yc(kTs). The discrete time signals u(k), y(k), k = 1, ..., N are transformed to the frequency domain using the discrete Fourier transform (DFT) (Nonparametric Techniques in System Identification, Eq. 1), resulting in the DFT spectra U(l), Y(l), at the frequencies f_l = l·fs/N. Making some abuse of notation, we will reuse the same symbols later in this text to denote the Z-transform of the signals; for example, Y(z) will also be used for the Z-transform of y(k).

Frequency Domain Measurements
A major exception to this general trend toward time domain measurements are the (high-frequency) network analyzers that measure the transfer function of a system frequency by frequency, starting from the steady-state response to a sine excitation. The frequency is stepped over the frequency band of interest, resulting directly in a measured frequency response function at a user-selected set of frequencies ωk, k = 1, ..., F:

G(ωk).

From the identification point of view, we can easily fit the latter situation in the frequency domain identification framework by putting

U(k) = 1, Y(k) = G(ωk).

For that reason we will focus completely on the time domain measurement approach in the remaining part of this contribution.

Zero-Order-Hold and Band-Limited Setup: Impact on the Model Choice
No information is available on how the continuous time signals uc(t), yc(t) vary in between the measured samples u(k), y(k). For that reason we need to make an assumption and make sure that the measurement setup is selected such that the intersample assumption is met. Two intersample assumptions are very popular (Pintelon and Schoukens 2012; Schoukens et al. 1994, pp. 498–512): the zero-order-hold (ZOH) and the band-limited (BL) assumption. Both options are shown in Fig. 1 and discussed below.
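The sampling-and-DFT step described above can be sketched in a few lines. The sampling frequency, record length, and test signal below are illustrative choices, not values from the text:

```python
import numpy as np

# Toy sketch of sampling u_c(t) at t = k*Ts and computing the DFT spectrum.
fs = 1000.0              # sampling frequency fs [Hz]
Ts = 1.0 / fs            # sampling period Ts = 1/fs
N = 1000                 # number of samples, so f_l = l*fs/N = l * 1 Hz

k = np.arange(N)
# "Measured" periodic input: sines of amplitude 1.0 at 10 Hz and 0.5 at 25 Hz.
u = np.sin(2*np.pi*10*k*Ts) + 0.5*np.sin(2*np.pi*25*k*Ts)

# DFT spectrum U(l) at the frequencies f_l = l*fs/N.
U = np.fft.fft(u) / N
f = np.arange(N) * fs / N

# An integer number of periods fits the record, so each sine shows up only
# in its own DFT line (an amplitude A appears as A/2 in the spectrum).
print(f[10], 2*abs(U[10]))   # 10.0 and ~1.0
print(f[25], 2*abs(U[25]))   # 25.0 and ~0.5
```

Because an integer number of signal periods fits the record here, there is no leakage; the begin/end effects discussed later in this entry appear as soon as that condition is violated.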
The choice of these assumptions does not only affect the measurement setup; it also has a strong impact on the selection between a continuous time and a discrete time model.

Zero-Order Hold
The ZOH setup assumes that the excitation remains constant in between the samples. In practice, the model is identified between the discrete time reference signal in the memory of the generator and the sampled output. The intersample behavior is an intrinsic part of the model: if the intersample behavior changes, the corresponding model will also change. The ZOH assumption is very popular in digital control. In that case the sampling frequency fs is commonly chosen 10 times larger than the frequency band of interest.

A discrete time model gives, in case of noise-free data, an exact description between the sampled input u(k) and output y(k) of the continuous time system:

y(k) = G(q, θ)u(k)

in the time domain (System Identification: An Overview, Eq. 6). In this expression, q denotes the shift operator (time domain); it is replaced by z in the z-domain (transfer function) description:

Y(z) = G(z, θ)U(z).

Evaluating the transfer function on the unit circle by replacing z = e^{jω} results in the frequency domain description of the system:

Y(e^{jω}) = G(e^{jω}, θ)U(e^{jω})

(System Identification: An Overview, Eq. 34).

Band-Limited Setup
The BL setup assumes that above a given frequency fmax < fs/2, there is no power in the signals. The continuous time signals are filtered by well-tuned anti-alias filters (cutoff frequency fmax < fs/2) before they are sampled. Outside the digital control world, the BL setup is the standard choice for discrete time measurements. Without anti-alias filters, large errors can be created by aliasing effects: the high-frequency (f > fs/2) content of the measured signals is folded down into the frequency band of interest and acts there as a disturbance. For that reason it is strongly advised always to use anti-alias filters in the measurement setup.

The exact relation between BL signals is described by a continuous time model, for example, in the frequency domain:

Y(ω) = G(ω, θ)U(ω)

(Schoukens et al. 1994).

Combining Discrete Time Models and BL Data
It is also possible to identify a discrete time model between the BL data, at the cost of creating (very) small model errors (Schoukens et al. 1994). This is the standard setup that is used in digital signal processing applications like digital audio processing. The level of the model errors can be reduced by lowering the ratio fmax/fs or by increasing the model order. In a properly designed setup, the discrete time model errors can be made very small, e.g., relative errors below 10⁻⁵. Table 1 gives an overview of the models corresponding to the experimental conditions.

Extracting Continuous Time Models from ZOH Data
Although the most robust and practical choice to identify a continuous time model is to start from BL data, it is also possible to extract a continuous time model under the ZOH setup. A first possibility is to assume that the ZOH assumption is perfectly met, which is a very hard assumption to realize in practice. In that case the continuous time model can be retrieved by a linear step-invariant transformation of the discrete time model. A second possibility is to select a very high sample frequency with respect to the bandwidth of the system.
Frequency Domain System Identification, Table 1 Relations between the continuous time system G(s) and the identified models as a function of the signal and model choices

            DT model (assuming ZOH setup)                 CT model (assuming BL setup)
ZOH setup   Exact DT model                                Not studied
            G(z) = (1 − z⁻¹) Z{G(s)/s}
            'standard conditions DT modelling'
BL setup    Approximate DT model G̃(z)                     Exact CT model G(s)
            G̃(z = e^{jωTs}) ≈ G(s = jω), |ω| < ωs/2       'standard conditions CT modelling'
            'digital signal processing field'
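The ZOH row of Table 1 can be checked numerically. The sketch below (the first-order system G(s) = a/(s + a) and the values of a and fs are illustrative choices) simulates the continuous time system with a piecewise-constant input by fine-grained Euler integration and compares the sampled output with the discrete time model G(z) = (1 − e^{−aTs})/(z − e^{−aTs}) obtained from the table's formula:

```python
import numpy as np

# Numerical check of Table 1's ZOH row for G(s) = a/(s+a) (values illustrative).
a, fs = 2.0, 10.0
Ts = 1.0 / fs
rng = np.random.default_rng(0)
u = rng.standard_normal(50)        # ZOH input: constant between samples

# Continuous-time side: integrate y' = -a*y + a*u with a fine Euler step,
# holding u constant over each sampling interval; sample at t = (k+1)*Ts.
M = 1000
dt = Ts / M
y, yc = 0.0, []
for uk in u:
    for _ in range(M):
        y += dt * (-a * y + a * uk)
    yc.append(y)

# Discrete-time side: G(z) = (1 - e^{-a Ts})/(z - e^{-a Ts}), i.e. the
# recursion y[k+1] = p*y[k] + (1-p)*u[k] with p = e^{-a Ts}.
p = np.exp(-a * Ts)
yd, yk = [], 0.0
for uk in u:
    yk = p * yk + (1.0 - p) * uk
    yd.append(yk)

# The two outputs agree up to the Euler integration error only: under the
# ZOH assumption the DT model is an exact description at the sample instants.
print(np.max(np.abs(np.array(yc) - np.array(yd))))
```

Repeating the experiment with a band-limited (non-staircase) input would reveal the model errors discussed in the BL row of the table.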
In that case it is advantageous to describe the discrete time model using the delta operator (Goodwin 2010); the coefficients of the discrete time model then converge to those of the continuous time model.

Models of Cascaded Systems
In some problems we want to build models for a cascade of two systems G1, G2. It is well known that the overall transfer function G is given by the product G(ω) = G1(ω)G2(ω). This result also holds for models that are identified under the BL signal assumption: the model for the cascade will be the product of the models of the individual systems. However, the result does not hold under the ZOH assumption, because the intermediate signal between G1 and G2 does not meet the ZOH assumption. For that reason, the ZOH model of a cascaded system is not obtained by cascading the ZOH models of the individual systems.

Experiment Design: Periodic or Random Excitations?
In general, arbitrary data can be used to identify a system as long as some basic requirements are respected (System Identification: An Overview, section on Experiment Design). Imposing periodic excitations can be an important restriction of the user's freedom to design the experiment, but we will show in the next sections that it also offers major advantages at many steps in the identification work loop (System Identification: An Overview, Fig. 4).

With the availability of arbitrary waveform generators, it became possible to generate arbitrary periodic signals. The user should make two major choices during the design of the periodic excitation: the selection of the amplitude spectrum (How is the available power distributed over the frequency?) and the choice of the frequency resolution (What is the frequency step between two successive points of the measured FRF?) (Pintelon and Schoukens 2012, Sect. 5.3). The amplitude spectrum is mainly set by the requirement that the excited frequency band should cover the frequency band of interest. A white noise excitation covers the full frequency band, including those bands that are of no interest to the user. This is a waste of power and it should be avoided. Designing a good power spectrum for identification and control purposes is discussed in Experiment Design and Identification for Control.

The frequency resolution f0 = 1/T is set by the inverse of the period of the signal. It should be small enough so that no important dynamics, e.g., a very sharp mechanical resonance, are missed.

The reader should be aware that exactly the same choices have to be made during the design of nonperiodic excitations. If, for example, a random noise excitation is used, the frequency resolution is also restricted by the length of the experiment Tm, and the corresponding frequency resolution is again f0 = 1/Tm. The power spectrum of the noise excitation should be well shaped using a digital filter.

Nonparametric Preprocessing of the Data in the Frequency Domain
Before a parametric model is identified from the raw data, a lot of information can be gained, almost for free, by making a nonparametric analysis of the data.
This can be done with very little user interaction. Some of these methods are explicitly linked to the periodic nature of the data; other methods also apply to random excitations.

Nonparametric Frequency Analysis of Periodic Data
By using simple DFT techniques, the following frequency domain information is extracted from sampled time domain data u(t), y(t), t = 1, ..., N (Pintelon and Schoukens 2012):
• The signal information: U(l), Y(l), the DFT spectra of the input and output, evaluated at the frequencies lf0, with l = 1, 2, ..., F.
• Disturbing noise variance information: The full data record is split, so that each sub-record contains a single period. For each of these, the DFT spectrum is calculated. Since the signals are periodic, they do not vary from one period to the other, so that the observed variations can be attributed to the noise. By calculating the sample mean and variance over the periods at each frequency, a nonparametric noise analysis is available. The estimated variances σ̂²_U(k), σ̂²_Y(k) measure the disturbing noise power spectrum at frequency fk on the input and the output, respectively. The covariance σ̂²_YU(k) characterizes the linear relations between the noise on the input and the output.
This is very valuable information because, even before starting the parametric identification step, we already get full access to the quality of the raw data. As a consequence, there is also no interference between the plant model estimation and the noise analysis: plant model errors do not affect the estimated noise model. It is also important to realize that no user interaction is required to make this analysis; it follows directly from a simple DFT analysis of the raw data. These are two major advantages of using periodic excitations.

Nonlinear Analysis
Using well-designed periodic excitations, it is possible to detect the presence of nonlinear distortions during the nonparametric frequency step. The level of the nonlinear distortions at the output of the system is measured as a function of the frequency, and it is even possible to differentiate between even (e.g., x²) and odd (e.g., x³) distortions. While the former only act as disturbing noise in a linear modeling framework, the latter will also affect the linearized dynamics and can change, for example, the pole positions of a system (Pintelon and Schoukens 2012, Sect. 4.3).

Noise and Data Reduction
By averaging the periodic signals over the successive periods, we get a first reduction of the noise. An additional noise reduction is possible when not all frequencies are excited. If a very wide frequency band has to be covered, a fine frequency resolution is needed at the low frequencies, whereas in the higher frequency bands the resolution can be reduced. Signals with a logarithmic frequency distribution are used for that purpose. Eliminating the unexcited frequencies not only reduces the noise, it also significantly reduces the amount of raw data to be processed. By combining different experiments that each cover a specific frequency band, it is possible to measure a system over multiple decades; e.g., electrical machines are measured from a few mHz to a few kHz.

In a similar way, it is also possible to focus the fit on the frequency band of interest by including only those frequencies in the parametric modeling step.

High-Quality Frequency Response Function Measurements
For periodic excitations, it is very simple to obtain high-quality measurements of the nonparametric frequency response function of the system. These results can be extended to random excitations at the cost of using more advanced algorithms that require more computation time (Pintelon and Schoukens 2012, Chap. 7). This approach is discussed in detail in Nonparametric Techniques in System Identification (Eq. 1).
Generalized Frequency Domain Models

A very important step in the identification work loop is the choice of the model class (System Identification: An Overview, Fig. 4). Although most physical systems are continuous time, the models that we need might be either discrete time (e.g., digital control, computer simulations, digital signal processing) or continuous time (e.g., physical interpretation of the model, analog control) (Ljung 1999, Sects. 2.1 and 4.3). A major advantage of the frequency domain is that both model classes are described by the same transfer function model. The only difference is the choice of the frequency variable. For continuous time models we operate in the Laplace domain, and the frequency variable is retrieved on the imaginary axis by putting s = jω. For discrete time systems we work on the unit circle, which is described by the frequency variable z = e^{j2πf/fs}. The application class can even be extended to include diffusion phenomena by putting Ω = √(jω) (Pintelon et al. 2005). From now on we will use the generalized frequency variable Ω, and depending on the selected domain, the proper substitution jω, e^{j2πf/fs}, or √(jω) should be made. The unified discrete and continuous time description

Y(k) = G(Ω_k, θ)U(k)

also illustrates very nicely that in the frequency domain there is a strong similarity between discrete and continuous time system identification.

To apply this model to finite length measurements, it should be generalized to include the effect of the initial conditions (time domain) or the begin and end effects (called leakage in the frequency domain). See Nonparametric Techniques in System Identification, "The Leakage Problem" section. The amazing result is that in the frequency domain, both effects are described by exactly the same mathematical expression. This leads eventually to the following model in the frequency domain that is valid for periodic and arbitrary (nonperiodic) BL or ZOH excitations (Pintelon et al. 1997; McKelvey 2002; Pintelon and Schoukens 2012, Chap. 6):

Y(k) = G(Ω, θ)U(k) + T_G(Ω, θ),

which becomes for SISO (single-input-single-output) systems:

G(Ω, θ) = B(Ω, θ)/A(Ω, θ), and T_G(Ω, θ) = I(Ω, θ)/A(Ω, θ).

A, B, I are all polynomials in Ω. The transient term T_G(Ω, θ) models transient and leakage effects. It is most important for the reader to realize that this is an exact description for noise-free data. Observe that it is very similar to the description in the time domain: y(t) = G(q, θ)u(t) + t_G(t, θ). In that case the transient term t_G(t, θ) models the initial transient that is due to the initial conditions.

Parametric Identification

Once we have the data and the model available in the frequency domain, we define the following weighted least squares cost function to match the model to the data (Schoukens et al. 1997):

V(θ) = (1/F) Σ_{k=1}^{F} |Ŷ(k) − G(Ω_k, θ)Û(k) − T_G(Ω_k, θ)|² / [σ̂²_Y(k) + σ̂²_U(k)|G(Ω_k, θ)|² − 2Re(σ̂²_YU(k) G(Ω_k, θ))].

The properties of this estimator are fully studied in Schoukens et al. (1999) and Pintelon and Schoukens (2012, Sect. 10.3), and it is shown that it is a consistent and almost efficient estimator under very mild conditions.
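Because G and T_G share the denominator A, multiplying the model through by A makes it linear in the polynomial coefficients, so a simple (suboptimal, unweighted) estimate already follows from ordinary linear least squares on the DFT data. The sketch below illustrates this for a toy first-order discrete time system; the system, record length, and the unweighted criterion are illustrative simplifications of the weighted cost function above:

```python
import numpy as np

# Toy sketch: joint least squares estimation of G(z) and the transient term
# from DFT data. True system (illustrative): y(k) = 0.5 y(k-1) + u(k-1),
# i.e. A(z^-1) = 1 + a1 z^-1 with a1 = -0.5 and B(z^-1) = b1 z^-1 with b1 = 1.
rng = np.random.default_rng(2)
N = 256
a1, b1 = -0.5, 1.0
u = rng.standard_normal(N)
y = np.zeros(N)
yprev, uprev = 5.0, 0.0            # nonzero initial state -> transient/leakage
for k in range(N):
    y[k] = -a1*yprev + b1*uprev
    yprev, uprev = y[k], u[k]

U, Y = np.fft.fft(u), np.fft.fft(y)
zinv = np.exp(-2j*np.pi*np.arange(N)/N)     # z_k^{-1} on the DFT grid

# Exact relation for noise-free data (I reduces to a constant i0 for a
# first-order model):  Y(k) = -a1 z_k^{-1} Y(k) + b1 z_k^{-1} U(k) + i0
Phi = np.column_stack([-zinv*Y, zinv*U, np.ones(N)])
Areg = np.vstack([Phi.real, Phi.imag])      # stack real and imaginary parts
breg = np.concatenate([Y.real, Y.imag])
a1_hat, b1_hat, i0_hat = np.linalg.lstsq(Areg, breg, rcond=None)[0]
print(a1_hat, b1_hat)                       # ~ -0.5 and 1.0: exact recovery
```

Note that the coefficients are recovered exactly despite the nonzero initial condition: the constant term i0 absorbs the leakage, exactly as the transient term T_G does in the model above.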
The formal link with the time domain cost function, as presented in System Identification: An Overview, can be made by assuming that the input is exactly known (σ̂²_U(k) = 0, σ̂²_YU(k) = 0) and replacing the nonparametric noise model on the output by a parametric model:

σ̂²_Y(k) = λ|H(Ω_k, θ)|².

These changes reduce the cost function, within a parameter-independent constant λ, to

V_F(θ) = (1/F) Σ_{k=1}^{F} |Ŷ(k) − G(Ω_k, θ)U₀(k) − T_G(Ω_k, θ)|² / |H(Ω_k, θ)|²,

which is exactly the same expression as Eq. 37 in System Identification: An Overview, provided that the frequencies Ω_k cover the full unit circle and the transient term T_G is omitted. The latter models the initial condition effects in the time domain. The expression shows the full equivalence with the classical discrete time domain formulation. If only a subsection of the full unit circle is used for the fit, the additional term N log det in System Identification: An Overview, Eq. 21, should be added to the cost function V_F(θ).

Additional User Aspects in Parametric Frequency Domain System Identification

In this section we highlight some additional user aspects that are affected by the choice for a time or frequency approach to system identification.

Nonparametric Noise Models
The use of a nonparametric noise model is a natural choice in the frequency domain. It is of course also possible to use the parametric noise model in the frequency domain formulation, but then we would lose two major advantages of the frequency domain formulation: (i) For periodic excitations, there is no interaction between the identification of the plant model and the nonparametric noise model. Plant model errors do not show up in the noise model. (ii) The availability of the nonparametric noise models eliminates the need for tuning the parametric noise model order, resulting in algorithms that are easier to use. A disadvantage of using a nonparametric noise model is that we can no longer express that the noise model can share some dynamics with the plant model, for example, when the disturbance is an unobserved plant input.

It is also possible to use a nonparametric noise model in the time domain. This leads to a Toeplitz weighting matrix, and the fast numerical algorithms that are used to deal with these internally make use of FFT (fast Fourier transform) algorithms, which brings us back to the frequency domain representation of the data.

Stable and Unstable Plant Models
In the frequency domain formulation, no special precaution is needed to deal with unstable models, so we can tolerate these models without any problem. There are multiple reasons why this can be advantageous. The most obvious one is the identification of an unstable system operating in a stabilizing closed loop. It can also happen that the intermediate models that are obtained during the optimization process are unstable. Imposing stability at each iteration can be too restrictive, resulting in estimates that are trapped in a local minimum. A last significant advantage is the possibility to split the identification problem (extract a model from the noisy data) and the approximation problem (approximate the unstable model by a stable one). This allows us to use in each step a cost function that is optimal for that step: maximum noise reduction in the first step, followed by a user-defined approximation criterion in the second step.

Model Selection and Validation
An important step in the system identification procedure is the tuning of the model complexity, followed by the evaluation of the model quality on a fresh data set.
The availability of a nonparametric noise model and a high-quality frequency response function measurement simplifies these steps significantly:

Absolute Interpretation of the Cost Function: Impact on Model Selection and Validation
The weighted least squares cost function, using the nonparametric noise weighting, is a normalized function. Its expected value equals E{V(θ̂)} = (F − nθ/2)/F, with nθ the number of free parameters in the model and F the number of frequency points. A cost function that is far too large points to remaining model errors. Unmodeled dynamics result in correlated residuals (difference between the measured and the modeled FRF): the user should increase the model order to capture these dynamics in the linear model. A cost function that is far too large, while the residuals are white, points to the presence of nonlinear distortions: the best linear approximation is identified, but the user should be aware that this approximation is conditioned on the actual excitation signal. A cost function that is far too low points to an error in the preprocessing of the data, resulting in a bad noise model.

Missing Resonances
Some systems are lightly damped, resulting in resonant behavior, for example, a vibrating mechanical structure. By comparing the parametric transfer function model with the nonparametric FRF measurements, it becomes clearly visible if a resonance is missed in the model. This can be either due to a too simple model structure (the model order should be increased), or it can appear because the model is trapped in a local minimum. In the latter case, better numerical optimization and initialization procedures should be looked for.

Identification in the Presence of Noise on the Input and Output Measurements
Within the band-limited measurement setup, both the input and the output have to be measured. This leads in general to an identification framework where both the input and the output are disturbed by noise. Such problems are studied in the errors-in-variables (EIV) framework (Söderström 2012). A special case is the identification of a system that is captured in a feedback loop. In that case the noisy output measurements are fed back to the input of the system, which creates a dependency between the input and output disturbances. We discuss both situations briefly below.

Errors-in-Variables Framework
The major difficulty of the EIV framework is the simultaneous identification of the plant model describing the input-output relations, the noise models that describe the input and the output noise disturbances, and the signal model describing the coloring of the excitation (Söderström 2012). Advanced identification methods have been developed, but today it is still necessary to impose strong restrictions on the noise models; e.g., correlations between input and output noise disturbances are not allowed. The periodic frequency domain approach encapsulates the general EIV problem, including mutually correlated colored input–output noise. Again, a full nonparametric noise model is obtained in the preprocessing step. This reduces the complexity of the EIV problem to that of a classical weighted least squares identification problem, which makes a huge difference in practice (Pintelon and Schoukens 2012; Söderström et al. 2010).

Identification in a Feedback Loop
Identification under feedback conditions can be solved in the time domain prediction error method (Ljung 1999, Sect. 13.4; Söderström and Stoica 1989, Chap. 10). This leads to consistent estimates, provided that the exact plant and noise model structure and order are retrieved. In the periodic frequency domain approach, a nonparametric noise model is extracted (variances of the input and output noise, and covariance between input and output noise) in the preprocessing step, without any user interaction. Next, these are used as a weighting in the weighted least squares cost function, which leads to consistent estimates provided that the plant model is flexible enough to capture the true plant
transfer function (Pintelon and Schoukens 2012, Sect. 9.18).

Summary and Future Directions

Theoretically, there is a full equivalence between the time and frequency domain formulations of the system identification problem. In many practical situations the user can make a free choice between both approaches, based on nontechnical arguments like familiarity with one of the two domains. However, some problems can be formulated more easily in the frequency domain. Identification of continuous time models is no more involved than retrieving a discrete time model. The frequency domain formulation is also the natural choice to use nonparametric noise models. This eliminates the need to select a specific noise model structure and order, although this might be a drawback for experienced users who can take advantage of a clever choice of this structure. The advantages that are directly linked to periodic excitation signals can be explored most naturally in the frequency domain: the noise model is available for free, EIV identification is no more involved than the output error identification problem, and identification under feedback conditions does not differ from open-loop identification. Nonstationary effects are an example of a problem that will be more easily detected in the time domain. In general, we advise the reader to take the best of both approaches and to swap from one domain to the other whenever it gives some advantage to do so. In the future, it will be necessary to extend the framework to include a characterization of nonlinear and time-varying effects.

Cross-References

System Identification: An Overview
Nonparametric Techniques in System Identification
Experiment Design and Identification for Control

Recommended Reading

We recommend to the reader the books of Ljung (1999) and Söderström and Stoica (1989) for a systematic study of time domain system identification. The book of Pintelon and Schoukens (2012) gives a comprehensive introduction to frequency domain identification. An extended discussion of the basic choices (intersample behavior, measurement setup) is given in Chap. 13 of Pintelon and Schoukens (2012) and in Schoukens et al. (1994). The other references in this list highlight some of the technical aspects that were discussed in this text.

Acknowledgments This work was supported in part by the Fund for Scientific Research (FWO-Vlaanderen), by the Flemish Government (Methusalem), by the Belgian Government through the Interuniversity Poles of Attraction (IAP VII) Program, and by the ERC Advanced Grant SNLSID.

Bibliography

Agüero JC, Yuz JI, Goodwin GC et al (2009) On the equivalence of time and frequency domain maximum likelihood estimation. Automatica 46:260–270
Goodwin GC (2010) Sampling and sampled-data models. In: American control conference, ACC 2010, Baltimore
Kollar I (1994) Frequency domain system identification toolbox for use with MATLAB. The MathWorks Inc., Natick
Ljung L (1988) System identification toolbox for use with MATLAB. The MathWorks Inc., Natick
Ljung L (1999) System identification: theory for the user, 2nd edn. Prentice Hall, Upper Saddle River
Ljung L (2006) Frequency domain versus time domain methods in system identification – revisited. In: Workshop on control of uncertain systems, University of Cambridge, 21–22 Apr
Ljung L, Glover K (1981) Frequency-domain versus time domain methods in system identification. Automatica 17:71–86
McKelvey T (2002) Frequency domain identification methods. Circuits Syst Signal Process 21:39–55
Pintelon R, Schoukens J (2012) System identification: a frequency domain approach, 2nd edn. Wiley, Hoboken/IEEE, Piscataway
Pintelon R, Schoukens J, Vandersteen G (1997) Frequency domain system identification using arbitrary signals. IEEE Trans Autom Control 42:1717–1720
Pintelon R, Schoukens J, Pauwels L et al (2005) Diffusion systems: stability, modeling, and identification. IEEE Trans Instrum Meas 54:2061–2067
Schoukens J, Pintelon R, Van hamme H (1994) Identification of linear dynamic systems using piecewise-constant excitations – use, misuse and alternatives. Automatica 30:1153–1169
Schoukens J, Pintelon R, Vandersteen G et al (1997) Frequency-domain system identification using nonparametric noise models estimated from a small number of data sets. Automatica 33:1073–1086
Schoukens J, Pintelon R, Rolain Y (1999) Study of conditional ML estimators in time and frequency-domain system identification. Automatica 35:91–100
Söderström T (2012) System identification for the errors-in-variables problem. Trans Inst Meas Control 34:780–792
Söderström T, Stoica P (1989) System identification. Prentice-Hall, Englewood Cliffs
Söderström T, Hong M, Schoukens J et al (2010) Accuracy analysis of time domain maximum likelihood method and sample maximum likelihood method for errors-in-variables and output error identification. Automatica 46:721–727


Frequency-Response and Frequency-Domain Models

Abbas Emami-Naeini and J. David Powell
Stanford University, Stanford, CA, USA

Abstract

A major advantage of using frequency response is the ease with which experimental information can be used for design purposes. Raw measurements of the output amplitude and phase of a plant undergoing a sinusoidal input excitation are sufficient to design a suitable feedback control. No intermediate processing of the data (such as finding poles and zeros or determining system matrices) is required to arrive at the system model. The wide availability of computers has rendered this advantage less important now than it was years ago; however, for relatively simple systems, frequency response is often still the most cost-effective design method. The method is most effective for systems that are stable in open-loop. Yet another advantage is that it is the easiest method to use for designing dynamic compensation.

Keywords

Bandwidth; Bode plot; Frequency response; Gain margin (GM); Magnitude; Phase; Phase margin (PM); Resonant peak; Stability

Introduction: Frequency Response

A very common way to use the exponential response of linear time-invariant systems (LTIs) is in finding the frequency response, or response to a sinusoid. First we express the sinusoid as a sum of two exponential expressions (Euler's relation):

A cos(ωt) = (A/2)(e^{jωt} + e^{−jωt}). (1)

Suppose we have an LTI system with input u and output y. If we let s = jω in the transfer function G(s), then the response to u(t) = e^{jωt} is y(t) = G(jω)e^{jωt}; similarly, the response to u(t) = e^{−jωt} is G(−jω)e^{−jωt}. By superposition, the response to the sum of these two exponentials, which make up the cosine signal, is the sum of the responses:

y(t) = (A/2)[G(jω)e^{jωt} + G(−jω)e^{−jωt}]. (2)

The transfer function G(jω) is a complex number that can be represented in polar form or in magnitude-and-phase form as G(jω) = M(ω)e^{jφ(ω)}, or simply G = Me^{jφ}. With this substitution, Eq. (2) becomes, for a specific input frequency ω = ωo,

y(t) = (A/2)M[e^{j(ωt+φ)} + e^{−j(ωt+φ)}]
     = AM cos(ωt + φ), (3)

M = |G(jω)| = |G(s)|_{s=jωo}
  = √({Re[G(jωo)]}² + {Im[G(jωo)]}²),

1 ImŒG. j!o / boundary between stability and instability; evalu-
' D †G. j!/ D tan : (4)
ReŒG. j!o / ating G. j!/ provides information that allows us
to determine closed-loop stability from the open-
loop G.s/.
This means that if an LTI system represented
For the second-order system
by the transfer function G.s/ has a sinusoidal
input with magnitude A, the output will be sinu-
soidal at the same frequency with magnitude AM 1
and will be shifted in phase by the angle '. M G.s/ D ; (5)
.s=!n /2 C 2.s=!n/ C 1
is usually referred to as the amplitude ratio or
magnitude and ' is referred to as the phase and
they are both functions of the input frequency, !. the Bode plot is shown in Fig. 3 for various values
The frequency response can be measured experi- of .
mentally quite easily in the laboratory by driving A natural specification for system per-
the system with a known sinusoidal input, letting formance in terms of frequency response is
the transient response die, and measuring the the bandwidth, defined to be the maximum
steady-state amplitude and phase of the system’s frequency at which the output of a system will
output as shown in Fig. 1. The input frequency track an input sinusoid in a satisfactory manner.
is set to sufficiently many values so that curves By convention, for the system shown in Fig. 4
such as the one in Fig. 2 are obtained. Bode sug- with a sinusoidal input r, the bandwidth is the
gested that we plot log jM j vs. log ! and '.!/ frequency of r at which the output y is attenuated
vs. log ! to best show the essential features of to a factor of 0:707 times the input (If the output
G.j!/. Hence, such plots are referred to as Bode is a voltage across a 1- resistor, the power is
plots. Bode plotting techniques are discussed in v 2 and when jvj D 0:707, the power is reduced
Franklin et al. (2015). by a factor of 2. By convention, this is called
We are interested in analyzing the frequency the half-power point.). Figure 5 depicts the idea
response not only because it will help us un- graphically for the frequency response of the
derstand how a system responds to a sinusoidal closed-loop transfer function
input, but also because evaluating G.s/ with s
taking on values along the j! axis will prove
to be very useful in determining the stability of Y .s/  KG.s/
D T .s/ D : (6)
a closed-loop system. Since the j! axis is the R.s/ 1 C KG.s/

Frequency-Response
and Frequency-Domain
Models, Fig. 1 Response
of G.s/ D .sC1/
1
to the
input u D sin 10t (Source:
Franklin et al. (2010),
p. 298, reprinted by
permission of Pearson
Education, Inc., Upper
Saddle River, NJ)
Frequency-Response and Frequency-Domain Models 481

Frequency-Response
and Frequency-Domain
Models, Fig. 2 Frequency
response for G.s/ D sC1 1

(Source: Franklin et al.


(2010), p. 83, reprinted by
permission of Pearson
Education, Inc., Upper
Saddle River, NJ)
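The steady-state values shown in Fig. 1 can be reproduced directly from G(jω); a minimal illustrative sketch (not part of the source):

```python
import numpy as np

# For G(s) = 1/(s+1) driven by u = sin(10 t), the steady-state output is
# M*sin(10 t + phi) with M = |G(j10)| and phi = angle(G(j10)).
G = lambda w: 1.0 / (1j * w + 1.0)
M, phi = np.abs(G(10.0)), np.angle(G(10.0))
print(M)                # ≈ 0.0995: the output amplitude is about 1/10 of the input
print(np.degrees(phi))  # ≈ -84.3 degrees: the output lags the input
```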
The plot is typical of most closed-loop systems in that (1) the output follows the input (|T| ≅ 1) at the lower excitation frequencies and (2) the output ceases to follow the input (|T| < 1) at the higher excitation frequencies. The maximum value of the frequency-response magnitude is referred to as the resonant peak Mr.

Bandwidth is a measure of speed of response and is therefore similar to time-domain measures such as rise time and peak time, or to the s-plane measure of the dominant-root(s) natural frequency. In fact, if the KG(s) in Fig. 4 is such that the closed-loop response is given by Fig. 3a, we can see that the bandwidth will equal the natural frequency of the closed-loop root (that is, ωBW = ωn for a closed-loop damping ratio of ζ = 0.7). For other damping ratios, the bandwidth is approximately equal to the natural frequency of the closed-loop roots, with an error typically less than a factor of 2.

For a second-order system, the time responses are functions of the pole-location parameters ζ and ωn. If we consider the curve for ζ = 0.5 to be an average, the rise time tr from y = 0.1 to y = 0.9 is approximately given by ωn·tr = 1.8. Thus, we can say that

tr ≅ 1.8/ωn.   (7)

Although this relationship could be embellished by including the effect of the damping ratio, it is important to keep in mind how Eq. (7) is typically used. It is accurate only for a second-order system with no zeros; for all other systems it is a rough approximation to the relationship between tr and ωn. Most systems being analyzed for control systems design are more complicated than the pure second-order system, so designers use Eq. (7) with the knowledge that it is only a rough approximation. Hence, for a second-order system the bandwidth is inversely proportional to the rise time tr, and we are thus able to link the time- and frequency-domain quantities.

The definition of the bandwidth stated here is meaningful for systems that have a low-pass filter behavior, as is the case for any physical control
Frequency-Response and Frequency-Domain Models, Fig. 3 Frequency responses of standard second-order systems: (a) magnitude; (b) phase (Source: Franklin et al. (2010), p. 303, reprinted by permission of Pearson Education, Inc., Upper Saddle River, NJ)
system. In other applications the bandwidth may be defined differently. Also, if the ideal model of the system does not have a high-frequency roll-off (e.g., if it has an equal number of poles and zeros), the bandwidth is infinite; however, this does not occur in nature, as nothing responds well at infinite frequencies.

In many cases, the designer's primary concern is the error in the system due to disturbances rather than the ability to track an input. For error analysis, we are more interested in the sensitivity function S(s) = 1 − T(s) than in T(s). For most open-loop systems with high gain at low frequencies, S(s) for a disturbance input has very low values at low frequencies and grows as the frequency of the input or disturbance approaches the bandwidth. For analysis of either T(s) or S(s), it is typical to plot its response versus the frequency of the input. Either frequency response for control systems design can be evaluated using the computer or can be quickly sketched for simple systems using the efficient methods described in Franklin et al. (2015). The methods described next are also useful to expedite the design process as well as to perform sanity checks on the computer output.
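Evaluating S(jω) and T(jω) "using the computer" amounts to sampling them on a frequency grid. A minimal sketch with an arbitrary, assumed loop KG(s) = 10/(s(s + 1)):

```python
import numpy as np

# Sensitivity and complementary sensitivity for an illustrative loop
# KG(s) = 10/(s(s+1)); note that S + T = 1 at every frequency.
w = np.logspace(-2, 2, 400)
KG = 10.0 / (1j * w * (1j * w + 1.0))
S = 1.0 / (1.0 + KG)        # error/disturbance response
T = KG / (1.0 + KG)         # tracking response

print(abs(S[0]), abs(T[0]))    # low frequency: |S| small, |T| ≈ 1
print(abs(S[-1]), abs(T[-1]))  # high frequency: |S| ≈ 1, |T| small
assert np.allclose(S + T, 1.0)
```

The printed values show |S| growing toward 1 as the frequency approaches and passes the bandwidth, exactly the behavior described above.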
Neutral Stability: Gain and Phase Margins

In the early days of electronic communications, most instruments were judged in terms of their frequency response. It is therefore natural that when the feedback amplifier was introduced, techniques to determine stability in the presence of feedback were based on this response.

Frequency-Response and Frequency-Domain Models, Fig. 4 Unity feedback system (Source: Franklin et al. (2010), p. 304, reprinted by permission of Pearson Education, Inc., Upper Saddle River, NJ)

Frequency-Response and Frequency-Domain Models, Fig. 5 Definitions of bandwidth and resonant peak (Source: Franklin et al. (2010), p. 304, reprinted by permission of Pearson Education, Inc., Upper Saddle River, NJ)

Suppose the closed-loop transfer function of a system is known. We can determine the stability of the system by simply inspecting the
Frequency-Response and Frequency-Domain Models, Fig. 6 Stability example: (a) system definition; (b) root locus (Source: Franklin et al. (2010), p. 318, reprinted by permission of Pearson Education, Inc., Upper Saddle River, NJ)
denominator in factored form (because the factors give the system roots directly) to observe whether the real parts are positive or negative. However, the closed-loop transfer function is usually not known. In fact, the whole purpose behind understanding the root-locus technique is to be able to find the factors of the denominator of the closed-loop transfer function, given only the open-loop transfer function. Another way to determine closed-loop stability is to evaluate the frequency response of the open-loop transfer function KG(jω) and then perform a test on that response. Note that this method likewise does not require factoring the denominator of the closed-loop transfer function. In this section we will explain the principles of this method.

Suppose we have a system defined by Fig. 6a whose root locus behaves as shown in Fig. 6b; that is, instability results if K is larger than 2. The neutrally stable points lie on the imaginary axis, that is, where K = 2 and s = ±j1.0. Furthermore, all points on the root locus have the property that

|KG(s)| = 1 and ∠G(s) = −180°.

At the point of neutral stability we see that these root-locus conditions hold for s = jω, so

|KG(jω)| = 1 and ∠G(jω) = −180°.   (8)

Thus, a Bode plot of a system that is neutrally stable (that is, with K defined such that a closed-loop root falls on the imaginary axis) will satisfy the conditions of Eq. (8). Figure 7 shows the frequency response for the system whose root locus is plotted in Fig. 6 for various values of K. The magnitude response corresponding to K = 2 passes through 1 at the same frequency (ω = 1 rad/s) at which the phase passes through −180°, as predicted by Eq. (8).

Having determined the point of neutral stability, we turn to a key question: Does increasing the gain increase or decrease the system's stability? We can see from the root locus in Fig. 6b that any value of K less than the value at the neutrally stable point will result in a stable system. At the frequency ω where the phase ∠G(jω) = −180° (ω = 1 rad/s), the magnitude |KG(jω)| < 1.0 for stable values of K and > 1 for unstable values of K. Therefore, we have the following trial stability condition, based on the character of the open-loop frequency response:

|KG(jω)| < 1 at ∠G(jω) = −180°.   (9)

This stability criterion holds for all systems for which increasing gain leads to instability and for which |KG(jω)| crosses the magnitude (= 1) once, the most common situation. However, there are systems for which an increasing gain can lead from instability to stability; in this case, the stability condition is

|KG(jω)| > 1 at ∠G(jω) = −180°.   (10)

Based on the above ideas, we can now define the robustness metrics gain and phase margins:

Phase Margin: Suppose that at ω1, K|G(jω1)| = 1. How much more phase could the system tolerate (as a time delay, perhaps) before reaching the stability boundary? The answer to this question follows from Eq. (8), i.e., the phase margin (PM) is defined as

PM = ∠G(jω1) − (−180°).   (11)

Gain Margin: Suppose that at ω2, ∠G(jω2) = −180°. How much more gain could the system tolerate (as an amplifier, perhaps) before reaching the stability boundary? The answer to this question follows from Eq. (9), i.e., the gain margin (GM) is defined as

GM = 1 / (K|G(jω2)|).   (12)

There are also rare cases when |KG(jω)| crosses magnitude (= 1) more than once, or when an increasing gain leads from instability to stability. A rigorous way to resolve these situations is to use the Nyquist
Frequency-Response and Frequency-Domain Models, Fig. 7 Frequency response for the stability example of Fig. 6 (Source: Franklin et al. (2010), p. 319, reprinted by permission of Pearson Education, Inc., Upper Saddle River, NJ)
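The margin definitions of Eqs. (11) and (12) are easy to evaluate numerically. The sketch below assumes a hypothetical plant G(s) = 1/(s(s + 1)²), chosen because it reproduces the example's neutral stability at K = 2 and ω = 1 rad/s; it is not necessarily the system of Fig. 6:

```python
import numpy as np
from scipy.optimize import brentq

# Hypothetical plant with neutral stability at K = 2, w = 1 rad/s:
# G(s) = 1/(s(s+1)^2), evaluated with K = 1.
def mag(w):
    return 1.0 / (w * (1.0 + w**2))                # |G(jw)|

def phase_deg(w):
    # unwrapped phase of G(jw), exact for this plant
    return -90.0 - 2.0 * np.degrees(np.arctan(w))

# Gain margin, Eq. (12): 1/(K|G|) where the phase crosses -180 degrees.
w180 = brentq(lambda w: phase_deg(w) + 180.0, 0.1, 10.0)
GM = 1.0 / mag(w180)
# Phase margin, Eq. (11): phase above -180 degrees where K|G| = 1.
wc = brentq(lambda w: mag(w) - 1.0, 0.1, 10.0)
PM = phase_deg(wc) + 180.0

print(w180, GM)   # 1.0, 2.0 -> the gain can double before instability
print(wc, PM)     # ≈ 0.68, ≈ 21.4 degrees
```

The computed GM of 2 matches the neutrally stable gain K = 2 used in the assumed example.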
stability criterion as discussed in Franklin et al. (2015).

Closed-Loop Frequency Response

The closed-loop bandwidth was defined earlier in this section. The natural frequency is always within a factor of 2 of the bandwidth for a second-order system. We can help establish a more exact correspondence by making a few observations. Consider a system in which |KG(jω)| shows the typical behavior

|KG(jω)| ≫ 1 for ω ≪ ωc,
|KG(jω)| ≪ 1 for ω ≫ ωc,

where ωc is the crossover frequency. The closed-loop frequency-response magnitude is approximated by

|T(jω)| = |KG(jω) / (1 + KG(jω))| ≅ 1 for ω ≪ ωc, and ≅ |KG| for ω ≫ ωc.   (13)

In the vicinity of crossover, where |KG(jω)| = 1, |T(jω)| depends heavily on the PM. A PM of 90° means that ∠G(jωc) = −90°, and therefore |T(jωc)| = 0.707. On the other hand, PM = 45° yields |T(jωc)| = 1.31.

The exact evaluation of Eq. (13) was used to generate the curves of |T(jω)| in Fig. 8. It shows that for smaller values of PM the bandwidth is typically somewhat greater than ωc, though usually less than 2ωc; thus,
Frequency-Response and Frequency-Domain Models, Fig. 8 Closed-loop bandwidth with respect to PM (Source: Franklin et al. (2010), p. 347, reprinted by permission of Pearson Education, Inc., Upper Saddle River, NJ)
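The values of |T(jωc)| quoted above follow from the PM alone, since at crossover |KG(jωc)| = 1 and ∠KG(jωc) = PM − 180°. A small added check (not from the source):

```python
import numpy as np

# At crossover |KG| = 1, so T(jwc) = KG/(1+KG) depends only on the PM;
# algebraically |T(jwc)| = 1/(2*sin(PM/2)).
def T_at_crossover(pm_deg):
    KG = np.exp(1j * np.radians(pm_deg - 180.0))
    return abs(KG / (1.0 + KG))

print(T_at_crossover(90.0))   # 0.707, as in the text
print(T_at_crossover(45.0))   # ≈ 1.31, as in the text
```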
ωc ≤ ωBW ≤ 2ωc.   (14)

Another specification related to the closed-loop frequency response is the resonant-peak magnitude Mr, defined in Fig. 5. For linear systems, Mr is generally related to the damping of the system. In practice, Mr is rarely used; most designers prefer to use the PM to specify the damping of a system, because the imperfections that make systems nonlinear or cause delays usually erode the phase more significantly than the magnitude.

It is also important in the design to achieve certain error characteristics, and these are often evaluated as a function of the input or disturbance frequency. In some cases, the primary function of the control system is to regulate the output to a certain constant input in the presence of disturbances. For these situations, the key item of interest for the design would be the closed-loop frequency response of the error with respect to disturbance inputs.

Summary and Future Directions

The frequency-response methods are the most popular because they can deal with model uncertainty and can be measured in the laboratory. A wide range of information about the system can be displayed in a Bode plot. The dynamic compensation can be carried out directly from the Bode plot. Extension of the ideas to multivariable systems has been done via singular value plots. Extension to nonlinear systems is still the subject of current research.

Cross-References

▶ Classical Frequency-Domain Design Methods
▶ Frequency Domain System Identification

Bibliography

Franklin GF, Powell JD, Emami-Naeini A (2010) Feedback control of dynamic systems, 6th edn. Pearson Education, Upper Saddle River
Franklin GF, Powell JD, Emami-Naeini A (2015) Feedback control of dynamic systems, 7th edn. Pearson Education, Boston

FTC

▶ Fault-Tolerant Control
Fundamental Limitation of Feedback Control

Jie Chen
City University of Hong Kong, Hong Kong, China

Abstract

Feedback systems are designed to meet many different objectives. Yet, not all design objectives are achievable as desired, because they are often mutually conflicting and because the system properties themselves may impose design constraints and thus limitations on the attainable performance. An important step in the control design process is then to analyze what and how system characteristics may impose constraints and, accordingly, how to make tradeoffs between different objectives by judiciously navigating between the constraints. Fundamental limitation of feedback control is an area of research that addresses these constraints, limitations, and tradeoffs.

Keywords

Bode integrals; Design tradeoff; Performance limitation; Tracking and regulation limits

Introduction

Fundamental control limitations refer to those intrinsic to feedback that can be neither overcome nor circumvented, regardless of how the controller is designed. By this nature, the study of fundamental limitations dwells on a Hamletian question: Can or can't it be done? What can and cannot be done? To be more specific, yet still general enough, at heart here are issues concerning the benefit and cost of feedback. We ask such questions as: (1) What system characteristics may impose inherent limitations regardless of controller design? (2) What inherent constraints may exist in design, and what kind of tradeoffs are to be made? (3) What are the best achievable performance limits? (4) How can the constraints, limitations, and limits be quantified in ways meaningful for control analysis and design? Needless to say, issues of this kind are very general and in fact commonplace in science and engineering. Analogies can be made, for example, to Shannon's theorems in communication theory, the Cramer-Rao bound in statistics, and Heisenberg's uncertainty principle in quantum mechanics; they all address fundamental limits and limitations, though for different problems and in different contexts. The search for fundamental limitations of feedback control, as such, may be considered a quest for an "ultimate truth" or the "law of feedback."

For their fundamentality and importance, inquiries into control performance limitations have persisted over time and continue to be of vital interest. It is worth emphasizing, however, that performance limitation studies are not merely driven by intellectual curiosity; they lead to better and more realistic feedback system designs and hence are of tangible practical value. An analysis of performance limitations can aid control design in several respects. First, it may provide a fundamental limit on the best performance attainable irrespective of controller design, thus furnishing a guiding benchmark in the design process. Second, it helps a designer assess which system properties may be inherently conflicting and thus pose inherent difficulties to performance objectives, which in turn helps the designer specify reasonable goals and make judicious modifications and revisions to the design. In this process, the theory of fundamental control limitations promises to provide valuable insights and analytical justifications for long-held design heuristics and, indeed, to extend such heuristics further. This has become increasingly relevant, as modern control design theory and practice relies heavily on optimization-based numerical routines and tools.

Systematic investigation and understanding of fundamental control limitations began with the classical work of Bode in the 1940s on
logarithmic sensitivity integrals, known as the Bode integrals. Bode's work has had a lasting impact on the theory and practice of control and has inspired continued research to the present day, leading to a variety of extensions and new results that seek to quantify design constraints and performance limitations by logarithmic integrals of Bode and Poisson type. On the other hand, the search for the best achievable performance is a natural goal in optimal control problems, which has lent bounds on optimal performance indices defined under various criteria. These latter developments have also been substantial and continue to branch into different problems and different system categories.

In this entry we attempt to provide a summary overview of the key developments in the study of fundamental limitations of feedback control. While the understanding of this subject has been compelling and the results are rather prolific, we focus on Bode-type integral relations and the best achievable performance limits, the two branches of the study that are arguably the most well developed. Roughly speaking, the Bode-type integrals are most useful for quantifying the inherent design constraints and tradeoffs in the frequency domain, while the performance results provide fundamental limits of canonical control objectives defined using frequency- and time-domain criteria. Invariably, the two sets of results are intimately related and reinforce each other. The essential message is that despite its many benefits, feedback has its own limitations and is subject to various constraints. Feedback design, for that reason, often requires a hard tradeoff.

Control Design Specifications

We begin by introducing the basic notation to be used in the sequel. Let C+ := {z : Re(z) > 0} denote the open right half plane (RHP) and C̄+ the closed RHP (CRHP). For a complex number z, we denote its conjugate by z̄. For a complex vector x, we denote its conjugate transpose by xᴴ and its Euclidean norm by ‖x‖₂. The largest singular value of a matrix A will be written as σ̄(A). If A is a Hermitian matrix, we denote by λ̄(A) its largest eigenvalue. For any unitary vectors u, v ∈ Cⁿ, we denote by ∠(u, v) the principal angle between the two one-dimensional subspaces, called the directions, spanned by u and v:

cos ∠(u, v) := |uᴴv|.

For a stable continuous-time system with transfer function matrix G(s), we define its H∞ norm by

‖G‖∞ := sup over Re(s) > 0 of σ̄(G(s)).

We consider the standard configuration of finite-dimensional linear time-invariant (LTI) feedback control systems given in Fig. 1. In this setup, P and K represent the transfer functions of the plant model and controller, respectively; r is a command signal, d a disturbance, n a noise signal, and y the output response. Define the open-loop transfer function, the sensitivity function, and the complementary sensitivity function by

L = PK, S = (I + L)⁻¹, T = L(I + L)⁻¹,

respectively. Then the output can be expressed as

y = Sd − Tn + SPr.

Fundamental Limitation of Feedback Control, Fig. 1 Feedback configuration

The goal of feedback control design is to design a controller K so that the closed-loop system is stable and achieves certain performance specifications. Typical design objectives include:
• Disturbance attenuation. The effect of the disturbance signal on the output should be kept small, which translates into the requirement that the sensitivity function be small in magnitude at the frequencies of interest. For a single-input single-output (SISO) system, this mandates that

|S(jω)| < 1, ∀ ω ∈ [0, ω1).

The sensitivity magnitude |S(jω)| is to be kept as small as possible in the low-frequency range.
• Noise reduction. The noise response should be reduced at the output. This requires that the complementary sensitivity function be small in magnitude at the frequencies of interest. For a SISO system, the objective is to achieve

|T(jω)| < 1, ∀ ω ∈ [ω2, ∞).

Similarly, the magnitude |T(jω)| is desired to be smallest at high frequencies.
Moreover, feedback can be introduced to achieve many other objectives, including regulation, command tracking, improved sensitivity to parameter variations, and, more generally, system robustness, all by manipulating the three key transfer functions: the open-loop transfer function, the sensitivity function, and the complementary sensitivity function.

The design and implementation of feedback systems, on the other hand, are also subject to many constraints, which include:
1. Causality: A system must be causal for it to be implementable. This constraint requires that no ideal filter can be used for compensation and that the system's relative degree and delay be preserved.
2. Stability: The closed-loop system must be stable. This implies that every closed-loop transfer function must be bounded and analytic in the CRHP.
3. Interpolation: There should be no unstable pole-zero cancelation between the plant and controller, in order to be rid of hidden instability. Thus, at each RHP pole pi and zero zi, it is necessary that

S(pi) = 0, T(pi) = 1; S(zi) = 1, T(zi) = 0.

4. Structural constraints: Constraints in this category arise from the feedback structure itself; for example, S(s) + T(s) = 1. The implication is that the closed-loop transfer functions cannot be independently designed, thus resulting in conflicting design objectives.
For a given plant, each of these constraints is unalterable and hence fundamental, and each will constrain the attainable performance in one way or another. The question we face then is how the constraints may be captured in a form that is directly pertinent and useful to feedback design.

Bode Integral Relations

In classical feedback control theory, Bode's gain-phase formula (Bode 1945) is used to express the aforementioned design constraints for SISO systems.

Bode Gain-Phase Integral Suppose that L(s) has no pole or zero in C̄+. Then at any frequency ω0,

∠L(jω0) = (1/π) ∫ from −∞ to ∞ of (d log|L|/dν) log coth(|ν|/2) dν,

where ν = log(ω/ω0).

A special form of the Hilbert transform, this gain-phase formula relates the gain and phase of the open-loop transfer function evaluated along the imaginary axis, whose implication may be explained as follows. In order to make the sensitivity response small in the low-frequency range, the open-loop transfer function is required to have a high gain; the higher, the better. On the other hand, for noise reduction and robustness purposes, we need to keep the loop gain low at high frequencies; the lower, the better. Evidently, to maximize these objectives, we want the two frequency bands to be as wide as possible. This then requires a steep decrease of the loop gain and hence a rather negative slope in the crossover region, say, the intermediate frequency range near ω0. But the gain-phase relationship tells us that a very negative derivative in the gain will lead to
a very negative phase, driving the phase close to −180°, namely, the critical point of stability. It consequently reduces the phase margin and may even cause instability. As a result, the gain-phase relationship demonstrates a conflict between the two design objectives. It is safe to claim that much of the classical feedback design theory came as a consequence of this simple relationship, aiming to shape the open-loop frequency response in the crossover region by trial and error, using lead or lag filters.

While in using the gain-phase formula the design specifications imposed on closed-loop transfer functions are translated only approximately into requirements on the open-loop transfer function, and the tradeoff between different design goals is achieved by shaping the open-loop gain and phase, a more direct vehicle to accomplish this same goal is Bode's sensitivity integral (Bode 1945).

Bode Sensitivity Integrals Let pi ∈ C+ be the unstable poles and zi ∈ C+ the nonminimum phase zeros of L(s). Suppose that the closed-loop system in Fig. 1 is stable.
(i) If L(s) has relative degree greater than one, then

∫ from 0 to ∞ of log|S(jω)| dω = π Σi pi.

(ii) If L(s) contains no less than two integrators, then

∫ from 0 to ∞ of (log|T(jω)| / ω²) dω = π Σi (1/zi).

Bode's original work concerns the sensitivity integral for open-loop stable systems only. The integral relations shown herein, which are attributed to Freudenberg and Looze (1985) and Middleton (1991), respectively, provide generalizations to open-loop unstable and nonminimum phase systems.

Why are the Bode sensitivity integrals important? What is the hidden message behind the mathematical formulas? Simply put, the Bode integral exhibits that a feedback system's sensitivity must abide by some kind of conservation law, or invariance property: the integral of the logarithmic sensitivity magnitude over the entire frequency range must be a nonnegative constant, determined by the open-loop unstable poles. This property mandates a tradeoff between sensitivity reduction and sensitivity amplification in different frequency bands. Indeed, to achieve disturbance attenuation, the logarithmic sensitivity magnitude must stay below zero dB, the lower the better. For noise reduction and robustness, however, its tail has to roll off sufficiently fast to zero dB at high frequencies. Since, in light of the integral relation, the total area under the logarithmic magnitude curve is nonnegative, the logarithmic magnitude must rise above zero dB somewhere, so that under its curve the positive and negative areas may cancel each other to yield a nonnegative value. As such, an undesirable sensitivity amplification occurs, resulting in a fundamental tradeoff between the desirable sensitivity reduction and the undesirable sensitivity amplification, known colloquially as the waterbed effect.

MIMO Integral Relations

For a multi-input multi-output (MIMO) system depicted in Fig. 1, the sensitivity and complementary sensitivity functions, which are now transfer function matrices, satisfy similar interpolation constraints: at each RHP pole pi and zero zi of L(s), the equations

S(pi)ηi = 0, T(pi)ηi = ηi,
wiᴴS(zi) = wiᴴ, wiᴴT(zi) = 0

hold with some unitary vectors ηi and wi, where ηi is referred to as a right pole direction vector associated with pi, and wi a left zero direction vector associated with zi.

While it seems both natural and tempting, the extension of the Bode integrals to MIMO systems has been a highly nontrivial task. Deep at the root is the complication resulting from the directionality properties of MIMO systems. Unlike in a SISO
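Case (i) above can be verified numerically. The following sketch uses an assumed loop L(s) = 8/((s − 1)(s + 2)(s + 3)), not an example from this entry, which has relative degree 3, a single unstable pole at p = 1, and a stable closed loop:

```python
import numpy as np
from scipy.integrate import quad

# Illustrative check of case (i): L(s) = 8/((s-1)(s+2)(s+3)) has relative
# degree 3 and one ORHP pole at p = 1; the closed loop is stable since the
# characteristic polynomial s^3 + 4s^2 + s + 2 passes the Routh test.
def log_abs_S(w):
    s = 1j * w
    d = s**3 + 4.0 * s**2 + s - 6.0    # open-loop denominator (s-1)(s+2)(s+3)
    return np.log(abs(d / (d + 8.0)))  # S = 1/(1+L) = d/(d+8)

val = quad(log_abs_S, 0.0, np.inf, limit=200)[0]
print(val, np.pi)   # both ≈ 3.1416 = pi * (sum of unstable poles)
```

The areas where log|S| is positive and negative must combine to exactly π here, regardless of the controller gain; this is the conservation law described above.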
system, the measure of frequency-response magnitude is now the largest singular value of a transfer function matrix, which represents the worst-case amplification of energy-bounded signals, a direct counterpart to the gain of a scalar transfer function. This fact alone proves to cast a fundamental difference and poses a formidable obstacle. From a technical standpoint, the logarithm of the largest singular value is no longer a harmonic function as in the SISO case, but only a subharmonic function. Much to our regret, then, familiar tools from analytic function theory, such as the Cauchy and Poisson theorems, the very backbone in developing the Bode integrals, cease to be applicable. Nevertheless, it remains possible to extend the Bode integrals in their essential spirit. Advances were made by Chen (1995, 1998, 2000).

MIMO Bode Sensitivity Integrals Let pi ∈ C+ be the unstable poles of L(s) and zi ∈ C+ the nonminimum phase zeros of L(s). Suppose that the closed-loop system in Fig. 1 is stable.
(i) If L(s) has relative degree greater than one, then

∫ from 0 to ∞ of log σ̄(S(jω)) dω ≥ π λ̄(Σi pi ηiηiᴴ).

(ii) If L(s) contains no less than two integrators, then

∫ from 0 to ∞ of (log σ̄(T(jω)) / ω²) dω ≥ π λ̄(Σi (1/zi) wiwiᴴ),

where ηi and wi are some unitary vectors related to the right pole direction vectors associated with pi and the left zero direction vectors associated with zi, respectively.

From these extensions, it is evident that the same limitations and tradeoffs on the sensitivity and complementary sensitivity functions carry over to MIMO systems; in fact, both integrals reduce to the Bode integrals when specialized to SISO systems. Yet there is something additional and unique to MIMO systems: the integrals now depend not only on the locations but also on the directions of the zeros and poles. In particular, it can be shown that they depend on the mutual orientation of these directions, and the dependence can be explicitly characterized geometrically by the principal angles between the directions. This new phenomenon, which finds no analog in SISO systems, thus highlights the important role of directionality in the sensitivity tradeoff and, more generally, in the design of MIMO systems.

A more sophisticated and, accordingly, more informative variant of the Bode integrals is the Poisson integral for the sensitivity and complementary sensitivity functions (Freudenberg and Looze 1985), which can be used to provide quantitative estimates of the waterbed effect. MIMO versions of the Poisson integrals are also available (Chen 1995, 2000).

Frequency-Domain Performance Bounds

Performance bounds complement the integral relations and provide fundamental thresholds for the best performance ever attainable. Such bounds are useful in providing benchmarks for evaluating a system's performance prior to and after controller design. In the frequency domain, fundamental limits can be specified as the minimal peak magnitude of the sensitivity and complementary sensitivity functions achievable by feedback, or formally, the minimal achievable H∞ norms:

S*min := inf {‖S(s)‖∞ : K(s) stabilizes P(s)},
T*min := inf {‖T(s)‖∞ : K(s) stabilizes P(s)}.

Drawing upon the Nevanlinna-Pick interpolation theory for analytic functions, one can obtain exact performance limits under rather general circumstances (Chen 2000).

H∞ Performance Limits Let zi ∈ C+ be the nonminimum phase zeros of P(s) with left direction vectors wi, and pi ∈ C+ the unstable poles
of P(s) with right direction vectors ηi, where the zi and pi are all distinct. Then

S*min = T*min = (1 + σ̄²(Qp^(−1/2) Qzp Qz^(−1/2)))^(1/2),

where Qz, Qp, and Qzp are the matrices given by

Qz := [wiᴴwj / (zi + z̄j)], Qp := [ηiᴴηj / (p̄i + pj)], Qzp := [wiᴴηj / (zi − pj)].

More explicit bounds showing how the zeros and poles may interact to affect these limits can be obtained, e.g., as

S*min = T*min ≥ (sin² ∠(wi, ηj) + |(p̄j + zi)/(pj − zi)|² cos² ∠(wi, ηj))^(1/2),

which demonstrates once again that the pole and zero directions play an important role in MIMO systems. Note that for RHP poles and zeros located in close vicinity, this bound can become excessively large, which serves as another vindication of why unstable pole-zero cancelation must be prohibited. Note also that for MIMO systems, however, whether a near pole-zero cancelation is problematic depends additionally on the mutual orientation of the pole and zero directions.

Tracking and Regulation Limits

Fundamental Limitation of Feedback Control, Fig. 2 2-DOF tracking control structure

Fundamental Limitation of Feedback Control, Fig. 3 2-DOF regulator

Tracking and regulation are two canonical objectives of servo mechanisms and constitute chief criteria in assessing the performance of feedback control systems. Understandings gained from these problems will shed light on more general issues indicative of feedback design.

In its full generality, a tracking system can be depicted as in Fig. 2, in which a 2-DOF (degree of freedom) controller K is to be designed for the output z to track a given reference input r, based on the feedforward of the reference signal r and the feedback of the measured output y. The tracking performance is defined in the time domain by the integral square error

J = ∫ from 0 to ∞ of ‖z(t) − r(t)‖₂² dt.

Typically, we take r to be a step signal, which in the MIMO setting corresponds to a unitary constant vector, i.e., r(t) = v for t > 0 and r(t) = 0 for t < 0, where ‖v‖₂ = 1. We assume P to be LTI, but K can be arbitrarily general, as long as it is causal and stabilizing. We want z not only to track r asymptotically but also to minimize J. But how small can J be?

For the regulation problem, a general setup is given in Fig. 3. Likewise, K may be taken as a 2-DOF controller. The control output energy is measured by the quadratic cost

E = ∫ from 0 to ∞ of ‖u(t)‖₂² dt.

We consider a disturbance signal d, typically taken as an impulse signal, d(t) = vδ(t), where v is a unitary vector. In this case, the disturbance can be interpreted as a nonzero initial condition, and the controller K is to regulate the system's zero-input response. Similarly, we assume that P
is LTI, but allow $K$ to be any causal, stabilizing controller. Evidently, for a stable $P$, the problem is trivial; the response will restore itself to the origin, and thus no energy is required. But what if the system is unstable? How much energy must the controller generate to combat the disturbance? What is the smallest amount of energy required? These questions are answered by the best achievable limits of the tracking and regulation performance (Chen et al. 2000, 2003).

Tracking and Regulation Performance Limits

Let $p_i \in \mathbb{C}_+$ and $z_i \in \mathbb{C}_+$ be the RHP poles and zeros of $P(s)$, respectively. Then,

$$\inf\{E : K \text{ stabilizes } P(s)\} = \sum_i p_i \cos^2 \angle(\psi_i, v),$$

$$\inf\{J : K \text{ stabilizes } P(s)\} = \sum_i \frac{1}{z_i} \cos^2 \angle(\phi_i, v),$$

where $\psi_i$ and $\phi_i$ are some unitary vectors related to the right pole direction vectors associated with $p_i$ and the left zero direction vectors associated with $z_i$, respectively.

It becomes instantly clear that the optimal performance depends on both the pole/zero locations and their directions. In particular, it depends on the mutual orientation between the input and pole/zero directions. This sheds some interesting light. Take the tracking performance as an example. For a SISO system, the minimal tracking error can never be made zero for a nonminimum phase plant; in other words, perfect tracking can never be achieved. Yet this is possible for MIMO systems, when the input and zero directions are appropriately aligned, specifically when they are orthogonal. Interestingly, the optimal performance in both cases can be achieved by LTI controllers, even though $K$ is allowed to be more general. As a result, the results herein provide the true fundamental limits that cannot be further improved, in spite of using any other more general forms of control, such as nonlinear, time-varying feedforward and feedback. They are simply the best one can ever hope for, and LTI controllers turn out to be optimal.

Summary and Future Directions

Whether in the time or frequency domain, while the results presented herein may differ in form and context, they unequivocally point to the fact that inherent constraints exist in feedback design, and fundamental limitations will necessarily arise, limiting the performance achievable regardless of controller design. Such constraints and limitations are especially exacerbated by nonminimum phase zeros and unstable poles in the system. Understanding of these constraints and limitations proves essential to the success of control design.

For both its intrinsic appeal and its fundamental implications, the study of fundamental control limitations will continue to be a topic of enduring vitality and indeed will prove timeless. Challenges are especially daunting, and endeavor is called for, e.g., to incorporate information and communication constraints into control limitation studies, of which networked control and multi-agent systems serve as notable testimonies.

Cross-References

H-Infinity Control
H2 Optimal Control
Linear Quadratic Optimal Control

Bibliography

Astrom KJ (2000) Limitations on control system performance. Eur J Control 6:2–20
Bode HW (1945) Network analysis and feedback amplifier design. Van Nostrand, Princeton
Chen J (1995) Sensitivity integral relations and design tradeoffs in linear multivariable feedback systems. IEEE Trans Autom Control 40(10):1700–1716
Chen J (1998) Multivariable gain-phase and sensitivity integral relations and design tradeoffs. IEEE Trans Autom Control 43(3):373–385
Chen J (2000) Logarithmic integrals, interpolation bounds, and performance limitations in MIMO systems. IEEE Trans Autom Control 45:1098–1115
Chen J, Qiu L, Toker O (2000) Limitations on maximal tracking accuracy. IEEE Trans Autom Control 45(2):326–331
Chen J, Hara S, Chen G (2003) Best tracking and regulation performance under control energy constraint. IEEE Trans Autom Control 48(8):1320–1336
Freudenberg JS, Looze DP (1985) Right half plane zeros and poles and design tradeoffs in feedback systems. IEEE Trans Autom Control 30(6):555–565
Middleton RH (1991) Trade-offs in linear control system design. Automatica 27(2):281–292
Qiu L, Davison EJ (1993) Performance limitations of non-minimum phase systems in the servomechanism problem. Automatica 29(2):337–349
Seron MM, Braslavsky JH, Goodwin GC (1997) Fundamental limitations in filtering and control. Springer, London
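To make the limits discussed above concrete, here is a minimal numerical sketch (the helper names are hypothetical, all zeros, poles, and direction vectors are taken real for simplicity, and the formulas are used exactly as stated in this article). For a single real RHP zero $z$ and pole $p$ with unit direction vectors, the $S_{\min}$ expression collapses to the classical SISO limit $(z+p)/|z-p|$, and the stated tracking limit vanishes when the reference direction $v$ is orthogonal to the zero direction:

```python
import math

def smin_scalar(z: float, p: float) -> float:
    """S_min = sqrt(1 + sigma^2(Qp^{-1/2} Qzp Qz^{-1/2})), specialized to a single
    real RHP zero z and real RHP pole p with unit direction vectors (w = eta = 1)."""
    q_z = 1.0 / (2.0 * z)        # Qz = w^H w / (z + z)
    q_p = 1.0 / (2.0 * p)        # Qp = eta^H eta / (p + p)
    q_zp = 1.0 / (z - p)         # Qzp = w^H eta / (z - p)
    sigma = abs(q_p ** -0.5 * q_zp * q_z ** -0.5)  # 1x1 case: |.| is the singular value
    return math.sqrt(1.0 + sigma ** 2)

def cos2_angle(a, b) -> float:
    """cos^2 of the angle between two real vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot * dot / (sum(x * x for x in a) * sum(y * y for y in b))

def tracking_limit(zero_data, v) -> float:
    """inf J = sum_i (1/z_i) cos^2(angle(phi_i, v)) over RHP zeros z_i with
    direction vectors phi_i, using the formula as stated above (real data)."""
    return sum(cos2_angle(phi, v) / z for z, phi in zero_data)

# A real RHP zero at z = 2 and a real RHP pole at p = 1:
print(smin_scalar(2.0, 1.0))              # ~3.0, i.e., (z + p)/|z - p|

# A MIMO plant with one RHP zero z = 1 and zero direction phi = (1, 0):
zeros = [(1.0, (1.0, 0.0))]
print(tracking_limit(zeros, (1.0, 0.0)))  # 1.0: v aligned with phi, worst case
print(tracking_limit(zeros, (0.0, 1.0)))  # 0.0: v orthogonal, perfect tracking
```

The scalar algebra behind the first call is $1 + 4pz/(z-p)^2 = ((z+p)/(z-p))^2$, recovering the well-known SISO sensitivity limit, while the last call illustrates the orthogonality remark in the text: with $v \perp \phi$, the MIMO tracking limit is exactly zero.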

Game Theory for Security

Tansu Alpcan
Department of Electrical and Electronic Engineering, The University of Melbourne, Melbourne, Australia

Abstract

Game theory provides a mature mathematical foundation for making security decisions in a principled manner. Security games help formalize security problems and decisions using quantitative models. The resulting analytical frameworks lead to better allocation of limited resources and result in more informed responses to security problems in complex systems and organizations. The game-theoretic approach to security is applicable to a wide variety of systems and critical infrastructures such as electricity, water, financial services, and communication networks.

Keywords

Complex systems; Cyberphysical system security; Game theory; Security games

J. Baillieul, T. Samad (eds.), Encyclopedia of Systems and Control, DOI 10.1007/978-1-4471-5058-9, © Springer-Verlag London 2015

Introduction

Securing a system involves making numerous decisions, whether the system is a computer network, part of a business process in an organization, or belongs to a critical infrastructure. One has to decide, for example, how to configure sensors for surveillance, whether to collect further information on system properties, how to allocate resources to secure a critical segment, or who should be able to access a specific function in the system. The decision-maker can be, depending on the setting, a regular employee, a system administrator, or the chief technical officer of an organization. In many cases, the decisions are made automatically by a computer program, such as allowing a packet to pass the firewall or filtering it out. The time frame of these decisions exhibits a high degree of variability, from milliseconds, if made by software, to days and weeks, e.g., when they are part of a strategic plan. Each security decision has a cost, and any decision-maker is always constrained by a limited amount of available resources. More importantly, each decision carries a security risk that needs to be taken into account when balancing the costs and the benefits.

Security games facilitate building analytical models which capture the interaction between malicious attackers, who aim to compromise networks, and the owners or administrators defending them. Attacks exploiting vulnerabilities of the underlying systems and defensive countermeasures constitute the moves of the game. Thus, the strategic struggle between attackers and defenders is formalized quantitatively, based on the solid mathematical foundation provided by the field of game theory.
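As a toy illustration of this quantitative formalization (a minimal sketch; the action names and payoff numbers below are illustrative assumptions, not taken from the article): an attacker and a defender each pick one action, each action pair yields a pair (attacker gain, defender gain), and a pure-strategy Nash equilibrium is any pair from which neither side gains by deviating unilaterally:

```python
# Attacker is the row player; defender is the column player.
# payoffs[(row, col)] = (attacker gain, defender gain); illustrative numbers only.
payoffs = {
    ("probe", "monitor"): (2, -2), ("probe", "harden"): (1, -1),
    ("strike", "monitor"): (3, -3), ("strike", "harden"): (0, 0),
}

def pure_nash(payoffs, rows=("probe", "strike"), cols=("monitor", "harden")):
    """A profile is a (pure) Nash equilibrium if no player can improve its own
    payoff by a unilateral deviation while the other sticks to its action."""
    equilibria = []
    for r in rows:
        for c in cols:
            att, dfn = payoffs[(r, c)]
            attacker_ok = all(payoffs[(r2, c)][0] <= att for r2 in rows)
            defender_ok = all(payoffs[(r, c2)][1] <= dfn for c2 in cols)
            if attacker_ok and defender_ok:
                equilibria.append((r, c))
    return equilibria

print(pure_nash(payoffs))   # [('probe', 'harden')]
```

Because these payoffs happen to be zero-sum (one side's loss is the other's gain), the equilibrium ('probe', 'harden') is also a saddle point of the attacker's gain matrix: the attacker can guarantee a gain of 1 and the defender can cap its loss at 1.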

An important aspect of security games is the allocation of limited available resources from the perspectives of both attackers and defenders. If the players had access to unlimited resources (e.g., time, computing power, bandwidth), then the resulting security games would be trivial. In real-world security settings, however, both attackers and defenders have to act strategically and make numerous decisions when allocating their respective resources. Unlike in an optimization approach, security games take into account the decisions and resource limitations of both the attackers and the defenders.

Security Games

A security game is defined with four components: the players, the set of possible actions or strategies for each player, the outcome of the game for each player as a result of their action-reaction, and the information structures in the game. The players have their own (selfish or malicious) motivations and inherent resource constraints. Based on these motivations and the information available, they choose the most beneficial strategies for themselves and act accordingly. Hence, game theory helps analyze decision-makers interacting on a system in a quantitative manner.

An Example Formulation

A simple security game can be formulated as a two-player, strategic (noncooperative) one, where one player is the attacker and the other is the defender protecting a system. Let the discrete actions available to the attacker and the defender be $\{a, b\}$ and $\{c, d\}$, respectively. Each attack-defense pair leads to one of the outcome pairs for the attacker and the defender, $\{(x1, y1), (x2, y2), (x3, y3), (x4, y4)\}$, which represent the respective players' gains (or losses). This security game is depicted graphically in Fig. 1. It can also be represented as a matrix game as follows, where the attacker is the row player and the defender is the column player:

            (c)        (d)
  (a)   (x1, y1)   (x2, y2)
  (b)   (x3, y3)   (x4, y4)

Game Theory for Security, Fig. 1 A simple, two-player security game

If the decision variables of the players are continuous, for example, $x \in [0, a]$ and $y \in [0, c]$ denote attack and defense intensity, respectively, then the resulting continuous-kernel game is described using functions instead of a matrix. Then, $J^{\text{attacker}}(x, y)$ and $J^{\text{defender}}(x, y)$ quantify the costs of the attacker and the defender as functions of their actions, respectively.

Security Game Types

In its simplest formulation, the conflict between those defending a system and the malicious attackers targeting it can be modeled as a two-person zero-sum security game, where the loss of one player is the gain of the other. Alternatively, two- and multi-person nonzero-sum security games generalize this to capture a broader range of interactions. Static game formulations and their repeated versions are helpful for modeling myopic behavior of players in fast-changing situations where planning future actions is of little use. In the case where the underlying system dynamics are predictable and available to the players, dynamic security game formulations can be utilized. If there is an order of actions in the game, for example, a purely reactionary defender, then leader-follower games can be used to formulate such cases, where the attacker takes the lead and the defender follows.

Within the framework of security games, the concept of Nash equilibrium, where no player gains from deviating from its own Nash equilibrium strategy if the others stick with theirs, provides a solid foundation. However, there are refinements and additional solution concepts when the game is dynamic or when there is more than one

Nash equilibrium or information limitations in the game. These constitute an open and ongoing research topic. A related research question is the design of security games to ensure a favorable outcome from a global perspective while taking into account the independence of individual players in their decisions.

In certain cases, it is useful to analyze the player interactions in multiple layers. For example, in some security games there may be defenders and malicious attackers trying to influence a population of other players by indirect means. In order to circumvent the modeling complexity of the problem, evolutionary games have been suggested. While evolutionary games forsake modeling individual player actions, they provide valuable insights into the collective behavior of populations of players and ensure tractability. Such models are useful, for example, in the analysis of various security policies affecting many users or the security of large-scale critical systems.

Another important aspect of security decisions is the availability and acquisition of information on the properties of the system at hand, the actions of other players, and the incentives behind them. Clearly, the amount of information available has a direct influence on the decisions, yet acquiring information is often costly or even infeasible in some cases. Then, the decisions have to be made with partial information, and information collection becomes part of the decision process itself, creating a complex feedback loop.

Statistical or machine learning techniques and system identification are other useful methods in the analysis of security games, where players use the acquired information iteratively to update their own models of the environment and of other players. The players then decide on their best courses of action. Existing work on fictitious play and reinforcement learning methods such as Q-learning is applicable and useful. A unique feature of security games is the fact that players try to hide their actions from others. These observability issues and distortions in observations can be captured by modeling the interaction between players who observe each other's actions as a noisy communication channel.

Applications

An early application of the decision- and game-theoretic approach has been to the well-defined jamming problem, where malicious attackers aim to disrupt wireless communication between legitimate parties (Kashyap et al. 2004; Zander 1990). Detection of security intrusions and anomalies due to attacks is another problem where the interaction between attackers and defenders has been modeled successfully using game theory (Alpcan and Başar 2011; Kodialam and Lakshman 2003). Decision- and game-theoretic approaches have been applied to a broad variety of networked system security problems such as security investments in organizations (Miura-Ko et al. 2008), (location) privacy (Buttyan and Hubaux 2008; Kantarcioglu et al. 2011), distributed attack detection, attack trees and graphs, adversarial control (Altman et al. 2010), network path selection (Zhang et al. 2010), and topology planning in the presence of adversaries (Gueye et al. 2010), as well as to other types of security games and decisions. More recently, security games have been used to investigate the cyberphysical security of the (smart) power grid (Law et al. 2012). The proceedings of the last three Conferences on Decision and Game Theory for Security, published as edited volumes in 2010 (Alpcan et al. 2010), 2011 (Baras et al. 2011), and 2012 (Grossklags and Walrand 2012), as well as the recent survey paper (Manshaei et al. 2013), present an extensive segment of the literature on the subject.

Analytical risk management is a related emerging research subject. Analytical methods and game theory have been applied to the field only recently but with increasing success (Guikema 2009; Mounzer et al. 2010). Another emerging topic is adversarial mechanism design (Chorppath and Alpcan 2011; Roth 2008), where the goal is to design mechanisms resistant to malicious behavior.

Summary and Future Directions

Game theory provides quantitative methods for studying the players in security problems, such

as attackers, defenders, and users, as well as their interactions and incentives. Hence, it facilitates making decisions on the best courses of action in addressing security problems while taking into account resource limitations, underlying incentive mechanisms, and security risks. Thus, security games and associated quantitative models have started to replace the prevalent ad hoc decision processes in a wide variety of security problems, from safeguarding critical infrastructure to risk management, trust, and privacy in networked systems.

Game theory for security is a young and active research area, as evidenced by the recently initiated conference series "Conference on Decision and Game Theory for Security" (www.gamesec-conf.org), the increasing number of journal and conference articles, as well as the recently published books (Alpcan and Başar 2011; Buttyan and Hubaux 2008; Tambe 2011).

Cross-References

Dynamic Noncooperative Games
Evolutionary Games
Learning in Games
Network Games
Stochastic Games and Learning
Strategic Form Games and Nash Equilibrium

Bibliography

Alpcan T, Başar T (2011) Network security: a decision and game theoretic approach. Cambridge University Press, Cambridge, UK. http://www.tansu.alpcan.org/book.php
Alpcan T, Buttyan L, Baras J (eds) (2010) Proceedings of the first international conference on decision and game theory for security, GameSec 2010, Berlin, 22–23 Nov 2010. Lecture notes in computer science, vol 6442. Springer, Berlin/Heidelberg. doi:10.1007/978-3-642-17197-0
Altman E, Başar T, Kavitha V (2010) Adversarial control in a delay tolerant network. In: Alpcan T, Buttyán L, Baras J (eds) Decision and game theory for security. Lecture notes in computer science, vol 6442. Springer, Berlin/Heidelberg, pp 87–106. doi:10.1007/978-3-642-17197-0_6
Baras J, Katz J, Altman E (eds) (2011) Proceedings of the second international conference on decision and game theory for security, GameSec 2011, College Park, 14–15 Nov 2011. Lecture notes in computer science, vol 7037. Springer, Berlin/Heidelberg. doi:10.1007/978-3-642-25280-8
Buttyan L, Hubaux JP (2008) Security and cooperation in wireless networks. Cambridge University Press, Cambridge. http://secowinet.epfl.ch
Chorppath AK, Alpcan T (2011) Adversarial behavior in network mechanism design. In: Proceedings of the 4th international workshop on game theory in communication networks (Gamecomm), ENS, Cachan
Grossklags J, Walrand JC (eds) (2012) Proceedings of the third international conference on decision and game theory for security, GameSec 2012, Budapest, 5–6 Nov 2012. Lecture notes in computer science, vol 7638. Springer
Gueye A, Walrand J, Anantharam V (2010) Design of network topology in an adversarial environment. In: Alpcan T, Buttyán L, Baras J (eds) Decision and game theory for security. Lecture notes in computer science, vol 6442. Springer, Berlin/Heidelberg, pp 1–20. doi:10.1007/978-3-642-17197-0_1
Guikema SD (2009) Game theory models of intelligent actors in reliability analysis: an overview of the state of the art. In: Bier VM, Azaiez MN (eds) Game theoretic risk analysis of security threats. Springer, New York, pp 1–19. doi:10.1007/978-0-387-87767-9
Kantarcioglu M, Bensoussan A, Hoe S (2011) Investment in privacy-preserving technologies under uncertainty. In: Baras J, Katz J, Altman E (eds) Decision and game theory for security. Lecture notes in computer science, vol 7037. Springer, Berlin/Heidelberg, pp 219–238. doi:10.1007/978-3-642-25280-8_17
Kashyap A, Başar T, Srikant R (2004) Correlated jamming on MIMO Gaussian fading channels. IEEE Trans Inf Theory 50(9):2119–2123. doi:10.1109/TIT.2004.833358
Kodialam M, Lakshman TV (2003) Detecting network intrusions via sampling: a game theoretic approach. In: Proceedings of 22nd IEEE conference on computer communications (Infocom), San Fransisco, vol 3, pp 1880–1889
Law YW, Alpcan T, Palaniswami M (2012) Security games for voltage control in smart grid. In: 50th annual Allerton conference on communication, control, and computing (Allerton 2012), Monticello, IL, USA, pp 212–219. doi:10.1109/Allerton.2012.6483220
Manshaei MH, Zhu Q, Alpcan T, Başar T, Hubaux JP (2013) Game theory meets network security and privacy. ACM Comput Surv 45(3):25:1–25:39. doi:10.1145/2480741.2480742
Miura-Ko RA, Yolken B, Bambos N, Mitchell J (2008) Security investment games of interdependent organizations. In: 46th annual Allerton conference, Monticello, IL, USA
Mounzer J, Alpcan T, Bambos N (2010) Dynamic control and mitigation of interdependent IT security risks. In: Proceedings of the IEEE conference on communication (ICC), Cape Town. IEEE Communications Society
Roth A (2008) The price of malice in linear congestion games. In: WINE '08: proceedings of the 4th international workshop on internet and network economics, Shanghai, pp 118–125
Tambe M (2011) Security and game theory: algorithms, deployed systems, lessons learned. Cambridge University Press, New York, NY, USA
Zander J (1990) Jamming games in slotted Aloha packet radio networks. In: IEEE military communications conference (MILCOM), Monterey, vol 2, pp 830–834. doi:10.1109/MILCOM.1990.117531
Zhang N, Yu W, Fu X, Das S (2010) gPath: a game-theoretic path selection algorithm to protect Tor's anonymity. In: Alpcan T, Buttyán L, Baras J (eds) Decision and game theory for security. Lecture notes in computer science, vol 6442. Springer, Berlin/Heidelberg, pp 58–71. doi:10.1007/978-3-642-17197-0_4

Game Theory: Historical Overview

Tamer Başar
Coordinated Science Laboratory, University of Illinois, Urbana, IL, USA

Abstract

This article provides an overview of the aspects of game theory that are covered in this Encyclopedia, which includes a broad spectrum of topics on static and dynamic game theory. It starts with a brief overview of game theory, identifying its basic ingredients, and continues with a brief historical account of the development and evolution of the field. It concludes by providing pointers to other articles in the Encyclopedia on game theory, and a list of references.

Keywords

Cooperation; Dynamic games; Evolutionary games; Game theory; Historical evolution of game theory; Nash equilibrium; Stackelberg equilibrium

What Is Game Theory?

Game theory deals with strategic interactions among multiple decision makers, called players (and in some contexts, agents), with each player's preference ordering among multiple alternatives captured in an objective function for that player, which she either tries to maximize (in which case the objective function is a utility function or a benefit function) or minimize (in which case we refer to the objective function as a cost function or a loss function). For a nontrivial game, the objective function of a player depends on the choices (actions, or equivalently decision variables) of at least one other player, and generally of all the players, and hence a player cannot simply optimize her own objective function independently of the choices of the other players. This thus brings in a coupling among the actions of the players and binds them together in decision making even in a noncooperative environment. If the players are able to enter into a cooperative agreement so that the selection of actions or decisions is done collectively and with full trust, so that all players would benefit to the extent possible, then we would be in the realm of cooperative game theory, where issues such as bargaining and the characterization of fair outcomes, coalition formation, and excess utility distribution are of relevance; an article in this Encyclopedia (by Haurie) discusses cooperation and cooperative outcomes in the context of dynamic games. Other aspects of cooperative game theory can be found in several standard texts on game theory, such as Owen (1995), Vorob'ev (1977), or Fudenberg and Tirole (1991). See also the 2009 survey article by Saad et al. (2009), which emphasizes applications of cooperative game theory to communication networks.

If no cooperation is allowed among the players, then we are in the realm of noncooperative game theory, where first one has to introduce a satisfactory solution concept. Leaving aside for the moment the issue of how the players can
reach such a solution point, let us address the issue of what would be the minimum features one would expect to see there. To first order, such a solution point should have the property that if all players but one stay put, then the player who has the option of moving away from the solution point should not have any incentive to do so, because she cannot improve her payoff. Note that we cannot allow two or more players to move collectively from the solution point, because such a collective move requires cooperation, which is not allowed in a noncooperative game. Such a solution point, where none of the players can improve her payoff by a unilateral move, is known as a noncooperative equilibrium or Nash equilibrium, named after John Nash, who introduced it and proved that it exists in finite games (i.e., games where each player has only a finite number of alternatives) over 60 years ago; see Nash (1950, 1951). This result and its various extensions for different frameworks, as well as its computation (both off-line and online), are discussed in several articles in this Encyclopedia. Another noncooperative equilibrium solution concept is the Stackelberg equilibrium, introduced in von Stackelberg (1934) and predating the Nash equilibrium, where there is a hierarchy in decision making among the players, with some of the players, designated as leaders, having the ability to first announce their actions (and make a commitment to play them) and the remaining players, designated as followers, taking these actions as given in the process of computation of their noncooperative (Nash) equilibria (among themselves). Before announcing their actions, the leaders would of course anticipate these responses and determine their actions in a way that the final outcome will be most favorable to them (in terms of their objective functions). For a comprehensive treatment of Nash and Stackelberg equilibria for different classes of games, see Başar and Olsder (1999).

We say that a noncooperative game is nonzero-sum if the sum of the players' objective functions cannot be made zero after appropriate positive scaling and/or translation that do not depend on the players' decision variables. We say that a two-player game is zero-sum if the sum of the objective functions of the two players is zero or can be made zero by appropriate positive scaling and/or translation that do not depend on the decision variables of the players; hence, two-player zero-sum games can be viewed as a special subclass of two-player nonzero-sum games, and in this case the Nash equilibrium becomes the saddle-point equilibrium. A game is a finite game if each player has only a finite number of alternatives, that is, the players pick their actions out of finite sets (action sets); otherwise, the game is an infinite game. Finite games are also known as matrix games. An infinite game is said to be a continuous-kernel game if the action sets of the players are continua and the players' objective functions are continuous with respect to the action variables of all players. A game is said to be deterministic if the players' actions uniquely determine the outcome, as captured in the objective functions, whereas if the objective function of at least one player depends on an additional variable (state of nature) with a probability distribution known to all players (or one that can be learned online), then we have a stochastic game. A game is a complete information game if the description of the game (i.e., the players, the objective functions, and the underlying probability distributions (if stochastic)) is common information to all players; otherwise, we have an incomplete information game. We say that a game is static if the players have access only to the a priori information (shared by all) and none of the players has access to information on the actions of any of the other players; otherwise, what we have is a dynamic game. A game is a single-act game if every player acts only once; otherwise, the game is multi-act. Note that it is possible for a single-act game to be dynamic and for a multi-act game to be static. A dynamic game is said to be a differential game if the evolution of the decision process (controlled by the players over time) takes place in continuous time and generally involves a differential equation; if it takes place over a discrete-time horizon, the dynamic game is sometimes called a discrete-time game.

In dynamic games, as the game progresses, players acquire information (complete or partial)
on past actions of other players and use this information in selecting their own actions (as also dictated by the equilibrium solution concept at hand). In finite dynamic games, for example, the progression of a game involves a tree structure (also called the extensive form), where each node is identified with a player along with the time when she acts, and branches emanating from a node show the possible moves of that particular player. A player, at any point in time, could generally be at more than one node, which is a situation that arises when the player does not have complete information on the past moves of other players and hence may not know with certainty which particular node she is at at any particular time. This uncertainty leads to a clustering of nodes into what are called information sets for that player. What players decide on within the framework of the extensive form is not their actions but their strategies, that is, what action they would take at each information set (in other words, correspondences between their information sets and their allowable actions). They then take specific actions (or actions are executed on their behalf), dictated by the strategies chosen as well as the progression of the game (decision) process along the tree. The equilibrium is then defined in terms of not actions but strategies.

The notion of a strategy, as a mapping from the collection of information sets to action sets, extends readily to infinite dynamic games, and hence, in both differential games and difference games, Nash equilibria are defined in terms of strategies. Several articles in this Encyclopedia discuss such equilibria, for both zero-sum and nonzero-sum dynamic games, with and without the presence of probabilistic uncertainty.

In the broad scheme of things, game theory, and particularly noncooperative game theory, can be viewed as an extension of two fields, both covered in this Encyclopedia: Mathematical Programming and Optimal Control Theory. Any problem in game theory collapses to a problem in one of these disciplines if there is only one player. One-player static games are essentially mathematical programming problems (linear programming or nonlinear programming), and one-player difference or differential games can be viewed as optimal control problems.

Highlights on the History and Evolution of Game Theory

Game theory has enjoyed over 70 years of scientific development, with the publication of the Theory of Games and Economic Behavior by von Neumann and Morgenstern (1947) generally acknowledged to have kick-started the field. It has experienced incessant growth in both the number of theoretical results and the scope and variety of applications. As a recognition of the vitality of the field, through 2012 a total of 10 Nobel Prizes were given in Economic Sciences for work primarily in game theory, with the first such recognition bestowed in 1994 on John Harsanyi, John Nash, and Reinhard Selten "for their pioneering analysis of equilibria in the theory of noncooperative games." The second round of Nobel Prizes in game theory went to Robert Aumann and Thomas Schelling in 2005, "for having enhanced our understanding of conflict and cooperation through game-theory analysis." The third round recognized Leonid Hurwicz, Eric Maskin, and Roger Myerson in 2007, "for having laid the foundations of mechanism design theory." And the most recent one was in 2012, recognizing Alvin Roth and Lloyd Shapley, "for the theory of stable allocations and the practice of market design." To this list of highest-level awards related to contributions to game theory, one should also add the 1999 Crafoord Prize (which is the highest prize in Biological Sciences), which went to John Maynard Smith (along with Ernst Mayr and G. Williams) "for developing the concept of evolutionary biology," where Smith's recognized contributions had a strong game-theoretic underpinning, through his work on evolutionary games and evolutionary stable equilibrium (Smith 1974, 1982; Smith and Price 1973); this is the topic of one of the articles in this Encyclopedia (by Altman). Several other "game theory" articles in the Encyclopedia also relate to the
502 Game Theory: Historical Overview
contributions of the Nobel Laureates mentioned above.

Even though von Neumann and Morgenstern's 1944 book is taken as the starting point of the scientific approach to game theory, game-theoretic notions and some isolated key results date back to earlier years and even centuries. Sixteen years earlier, in 1928, von Neumann himself had resolved completely an open fundamental problem in zero-sum games, that every finite two-player zero-sum game admits a saddle point in mixed strategies, which is known as the Minimax Theorem (von Neumann 1928) – a result which Emile Borel had conjectured to be false eight years before. Some early traces of game-theoretic thinking can be seen in the 1802 work (Considérations sur la théorie mathématique du jeu) of André-Marie Ampère (1775–1836), who was influenced by the 1777 writings (Essai d'Arithmétique Morale) of Georges Louis Buffon (1707–1788).

Which event or writing has really started game-theoretic thinking or approach to decision making (in law, politics, economics, operations research, engineering, etc.) may be a topic of debate, but what is indisputable is that in (zero-sum) differential games (which is most relevant to control theory) the starting point was the work of Rufus Isaacs in the RAND Corporation in the early 1950s, which remained classified for at least a decade, before being made accessible to a broad readership in 1965 (Isaacs 1965); see also the review (Ho 1965) which first introduced the book to the control community. One of the articles in this Encyclopedia (by Bernhard) talks about this history and the theory developed by Isaacs, within the context of pursuit-evasion games, and another article (again by Bernhard) discusses the impact the zero-sum differential game framework has made on robust control design (Başar and Bernhard 1995). Extension of the game-theoretic framework to nonzero-sum differential games with Nash equilibrium as the solution concept was initiated in Starr and Ho (1969) and with Stackelberg equilibrium as the solution concept in Simaan and Cruz (1973). Systematic study of the role information structures play in the existence of such equilibria and their uniqueness or nonuniqueness (termed informational nonuniqueness) was carried out in Başar (1974, 1976, 1977).

Related Articles on Game Theory in the Encyclopedia

Several articles in the Encyclopedia introduce various subareas of game theory and discuss important developments (past and present) in each corresponding area.

The article ▶Strategic Form Games and Nash Equilibrium introduces the static game framework along with the Nash equilibrium concept, for both finite and infinite games, and discusses the issues of existence and uniqueness as well as efficiency. The article ▶Dynamic Noncooperative Games focuses on dynamic games, again for both finite and infinite games, and discusses extensive form descriptions of the underlying dynamic decision process, either as trees (in finite games) or difference equations (in discrete-time infinite games). Bernhard, in two articles, discusses continuous-time dynamic games, described by differential equations (so-called differential games), but in the two-person zero-sum case. One of these articles, ▶Pursuit-Evasion Games and Zero-Sum Two-Person Differential Games, describes the framework initiated by Isaacs, and several of its extensions for pursuit-evasion games, and the other one, ▶Linear Quadratic Zero-Sum Two-Person Differential Games, presents results on the special case of linear quadratic differential games, with an important application of that framework to robust control and more precisely H∞-optimal control.

When the number of players in a nonzero-sum game is countably infinite, or even just sufficiently large, some simplifications arise in the computation and characterization of Nash equilibria. The mathematical framework applicable to this context is provided by mean field theory, which is the topic of the article ▶Mean Field Games, which discusses this relatively new theory within the context of stochastic differential games.
Cooperative solution concepts for dynamic games are discussed in the article ▶Cooperative Solutions to Dynamic Games, which introduces Pareto optimality, the bargaining solution concept by Nash, characteristic functions, core, and C-optimality, and presents some selected results using these concepts. In the article ▶Evolutionary Games, the foundations of, as well as the recent advances in, evolutionary games are presented, along with examples showing their potential as a tool for capturing and modeling interactions in complex systems.

The article ▶Learning in Games addresses the online computation of Nash equilibrium through an iterative process which takes into account each player's response to choices made by the remaining players, with built-in learning and adaptation rules; one such scheme that is discussed in the article is the well-known fictitious play. Learning is also the topic of the article ▶Stochastic Games and Learning, which presents a framework and a set of results using the stochastic games formulation introduced by Shapley in the early 1950s.

The article ▶Network Games shows how game theory plays an important role in modeling interactions between entities on a network, particularly communication networks, and presents a simple mathematical model to study one such instance, namely, resource allocation in the Internet. How to design a game so as to obtain a desired outcome (as captured by say a Nash equilibrium) is a question central to mechanism design, which is covered in the article ▶Mechanism Design, which discusses as a specific example the Vickrey-Clarke-Groves (VCG) mechanism.

Two other applications of game theory are to design of auctions and security. The article ▶Auctions addresses the former, discussing general auction theory along with equilibrium strategies and more specifically combinatorial auctions. The latter is addressed in the article ▶Game Theory for Security, which discusses how the game-theoretic approach leads to more effective responses to security in complex systems and organizations, with applications to a wide variety of systems and critical infrastructures such as electricity, water, financial services, and communication networks.

Future of Game Theory

The second half of the twentieth century was a golden era for game theory, and all evidence so far in the twenty-first century indicates that the next half century is destined to be a platinum era. In all respects game theory is on an upward slope in terms of its vitality, the wealth of topics that fall within its scope, the richness of the conceptual framework it offers, the range of applications, and the challenges it presents to an inquisitive mind.

Cross-References

▶Auctions
▶Cooperative Solutions to Dynamic Games
▶Dynamic Noncooperative Games
▶Evolutionary Games
▶Game Theory for Security
▶Game Theory: Historical Overview
▶Learning in Games
▶Linear Quadratic Zero-Sum Two-Person Differential Games
▶Mean Field Games
▶Mechanism Design
▶Network Games
▶Optimal Control and the Dynamic Programming Principle
▶Optimization Based Robust Control
▶Pursuit-Evasion Games and Zero-Sum Two-Person Differential Games
▶Stochastic Games and Learning
▶Strategic Form Games and Nash Equilibrium

Bibliography

Başar T (1974) A counter example in linear-quadratic games: existence of non-linear Nash solutions. J Optim Theory Appl 14(4):425–430
Başar T (1976) On the uniqueness of the Nash solution in linear-quadratic differential games. Int J Game Theory 5:65–90
Başar T (1977) Informationally nonunique equilibrium solutions in differential games. SIAM J Control 15(4):636–660
Başar T, Bernhard P (1995) H∞ optimal control and related minimax design problems: a dynamic game approach, 2nd edn. Birkhäuser, Boston
Başar T, Olsder GJ (1999) Dynamic noncooperative game theory. Classics in applied mathematics, 2nd edn.
SIAM, Philadelphia. (First edition, Academic, London, 1982)
Fudenberg D, Tirole J (1991) Game theory. MIT, Cambridge
Ho Y-C (1965) Review of 'Differential Games' by R. Isaacs. IEEE Trans Autom Control AC-10(4):501–503
Isaacs R (1975) Differential games, 2nd edn. Kruger, New York. (First edition: Wiley, New York, 1965)
Nash JF Jr (1950) Equilibrium points in N-person games. Proc Natl Acad Sci 36(1):48–49
Nash JF Jr (1951) Non-cooperative games. Ann Math 54(2):286–295
Owen G (1995) Game theory, 3rd edn. Academic, New York
Saad W, Han Z, Debbah M, Hjorungnes A, Başar T (2009) Coalitional game theory for communication networks [a tutorial]. IEEE Signal Process Mag 26(5):77–97. Special issue on Game Theory
Simaan M, Cruz JB Jr (1973) On the Stackelberg strategy in nonzero sum games. J Optim Theory Appl 11:533–555
Smith JM (1974) The theory of games and the evolution of animal conflicts. J Theor Biol 47:209–221
Smith JM (1982) Evolution and the theory of games. Cambridge University Press, Cambridge
Smith JM, Price GR (1973) The logic of animal conflict. Nature 246:15–18
Starr AW, Ho Y-C (1969) Nonzero-sum differential games. J Optim Theory Appl 3:184–206
von Neumann J (1928) Zur Theorie der Gesellschaftsspiele. Mathematische Annalen 100:295–320
von Neumann J, Morgenstern O (1947) Theory of games and economic behavior, 2nd edn. Princeton University Press, Princeton. (First edition: 1944)
von Stackelberg H (1934) Marktform und Gleichgewicht. Springer, Vienna. (An English translation appeared in 1952 entitled "The theory of the market economy," published by Oxford University Press, Oxford)
Vorob'ev NH (1977) Game theory. Springer, Berlin

Generalized Finite-Horizon Linear-Quadratic Optimal Control

Augusto Ferrante¹ and Lorenzo Ntogramatzidis²
¹Dipartimento di Ingegneria dell'Informazione, Università di Padova, Padova, Italy
²Department of Mathematics and Statistics, Curtin University, Perth, WA, Australia

Abstract

The linear-quadratic (LQ) problem is the prototype of a large number of optimal control problems, including the fixed endpoint, the point-to-point, and several $H_2/H_\infty$ control problems, as well as the dual counterparts. In the past 50 years, these problems have been addressed using different techniques, each tailored to their specific structure. It is only in the last 10 years that it was recognized that a unifying framework is available. This framework hinges on formulae that parameterize the solutions of the Hamiltonian differential equation in the continuous-time case and the solutions of the extended symplectic system in the discrete-time case. Whereas traditional techniques involve the solutions of Riccati differential or difference equations, the formulae used here to solve the finite-horizon LQ control problem only rely on solutions of the algebraic Riccati equations. In this article, aspects of the framework are described within a discrete-time context.

Keywords

Cyclic boundary conditions; Discrete-time linear systems; Fixed end-point; Initial value; Point-to-point boundary conditions; Quadratic cost; Riccati equations

Introduction

Ever since the linear-quadratic (LQ) optimal control problem was introduced in the 1960s by Kalman in his pioneering paper (1960), it has found countless applications in areas such as chemical process control, aeronautics, robotics, servomechanisms, and motor control, to name but a few.

For details on the raisons d'être of LQ problems, readers are referred to the classical textbooks on this topic (Anderson and Moore 1971; Kwakernaak and Sivan 1972) and to the Special Issue on LQ optimal control problems in IEEE Trans. Aut. Contr., vol. AC-16, no. 6, 1971. The LQ regulator is not only important per se. It is also the prototype of a variety of fundamental optimization problems. Indeed, several optimal control problems that are extremely relevant in
practice can be recast into composite LQ, dual LQ, or generalized LQ problems. Examples include LQG, $H_2$ and $H_\infty$ problems, and Kalman filtering problems. Moreover, LQ optimal control is intimately related, via matrix Riccati equations, to absolute stability, dissipative networks, and optimal filtering. The importance of LQ problems is not restricted to linear systems. For example, LQ control techniques can be used to modify an optimal control law in response to perturbations in the dynamics of a nonlinear plant. For these reasons, the LQ problem is universally regarded as a cornerstone of modern control theory.

In its simplest and most classical version, the finite-horizon discrete LQ optimal control can be stated as follows:

Problem 1 Let $A \in \mathbb{R}^{n \times n}$ and $B \in \mathbb{R}^{n \times m}$, and consider the linear system

$$x_{t+1} = A\,x_t + B\,u_t, \qquad y_t = C\,x_t + D\,u_t, \qquad (1)$$

where the initial state $x_0 \in \mathbb{R}^n$ is given. Let $W = W^\top \in \mathbb{R}^{n \times n}$ be positive semidefinite. Find a sequence of inputs $u_t$, with $t = 0, 1, \ldots, N-1$, minimizing the cost function

$$J_{N,x_0}(u) \stackrel{\text{def}}{=} \sum_{t=0}^{N-1} \|y_t\|^2 + x_N^\top W x_N. \qquad (2)$$

Historically, LQ problems were first introduced and solved by Kalman in (1960). In this paper, Kalman showed that the LQ problem can be solved for any initial state $x_0$, and the optimal control can be written as a state feedback $u(t) = K(t)\,x(t)$, where $K(t)$ can be found by solving a famous quadratic matrix difference equation known as the Riccati equation. When $W$ is no longer assumed to be positive semidefinite, the optimal solution may or may not exist. A complete analysis of this case has been worked out in Bilardi and Ferrante (2007). In the infinite-horizon case (i.e., when $N$ is infinite), the optimal control (when it exists) is stationary and may be computed by solving an algebraic Riccati equation (Anderson and Moore 1971; Kwakernaak and Sivan 1972).

Since its introduction, the original formulation of the classic LQ optimal control problem has been generalized in several different directions, to accommodate for the need of considering more general scenarios than the one represented by Problem 1. Examples include the so-called fixed endpoint LQ, in which the extreme states are sharply assigned, and the point-to-point case, in which the initial and terminal values of an output of the system are constrained to be equal to specified values. This led to a number of contributions in the area where different adaptations of the Riccati theory were tailored to these diversified contexts of LQ optimal control. These variations of the classic LQ problem are becoming increasingly important due to their use in several applications of interest. Indeed, many applications including spacecraft, aircraft, and chemical processes involve maneuvering between two states during some phases of a typical mission. Another interesting example is the $H_2$-optimization of transients in switching plants, where the problem can be divided into a set of finite-horizon LQ problems with welding conditions on the optimal arcs for each switch instant. This problem has been the object of a large number of contributions in the recent literature, under different names: Parameter varying systems, jump linear systems, switching systems, and bumpless systems are definitions extensively used to denote different classes of systems affected by sensible changes in their parameters or structures (Balas and Bokor 2004). In recent years, a new unified approach emerged in Ferrante et al. (2005), Ferrante and Ntogramatzidis (2005), Ferrante and Ntogramatzidis (2007a), and Ferrante and Ntogramatzidis (2007b) that solves the finite-horizon LQ optimal control problem via a formula which parameterizes the set of trajectories generated by the corresponding Hamiltonian differential equation in the continuous time and the extended symplectic difference equation in the discrete case. Loosely, we can say that the expressions parameterizing the trajectories of the Hamiltonian differential equation and the extended symplectic difference equation using this approach hinge on a pair of
"opposite" solutions of the associated algebraic Riccati equations. This active stream of research considerably enlarged the range of optimal control problems that can be successfully addressed. This point of view always requires some controllability-type assumption and the extended symplectic pencil (Ferrante and Ntogramatzidis 2005, 2007b) to be regular and devoid of generalized eigenvalues on the unit circle. More recently, a new point of view has emerged which yields a more direct solution to this problem, without requiring system-theoretic assumptions (Ferrante and Ntogramatzidis 2013a,b; Ntogramatzidis and Ferrante 2013). The discussion here is restricted to the discrete-time case; for the corresponding continuous-time counterpart, we will only make some comments and refer to the literature.

Notation. For the reader's convenience, we briefly review some, mostly standard, matrix notation used throughout the paper. Given a matrix $B \in \mathbb{R}^{n \times m}$, we denote by $B^\top$ its transpose and by $B^\dagger$ its Moore-Penrose pseudo-inverse, the unique matrix $B^\dagger$ that satisfies $B B^\dagger B = B$, $B^\dagger B B^\dagger = B^\dagger$, $(B B^\dagger)^\top = B B^\dagger$, and $(B^\dagger B)^\top = B^\dagger B$. The kernel of $B$ is the subspace $\{x \in \mathbb{R}^m \mid B x = 0\}$ and is denoted $\ker B$. The image of $B$ is the subspace $\{y \in \mathbb{R}^n \mid \exists x \in \mathbb{R}^m : y = B x\}$ and is denoted by $\operatorname{im} B$. Given a square matrix $A$, we denote by $\sigma(A)$ its spectrum, i.e., the set of its eigenvalues. We write $A_1 > A_2$ (resp. $A_1 \geq A_2$) when $A_1 - A_2$ is positive definite (resp. positive semidefinite).

Classical Finite-Horizon Linear-Quadratic Optimal Control

The simplest classical version of the finite-horizon LQ optimal control is Problem 1. By employing some standard linear algebra, this problem may be solved by the classical technique known as "completion of squares": First of all, the cost can be rewritten as

$$J_{N,x_0}(u) = \sum_{t=0}^{N-1} [\,x_t^\top \;\; u_t^\top\,] \,\Pi \begin{bmatrix} x_t \\ u_t \end{bmatrix} + x_N^\top W x_N, \qquad \Pi \stackrel{\text{def}}{=} \begin{bmatrix} Q & S \\ S^\top & R \end{bmatrix} = \begin{bmatrix} C^\top \\ D^\top \end{bmatrix} [\,C \;\; D\,] = \Pi^\top \geq 0. \qquad (3)$$

Now, let $X_0, X_1, \ldots, X_N$ be an arbitrary sequence of $n \times n$ symmetric matrices. We have the identity

$$\sum_{t=0}^{N-1} \left( x_{t+1}^\top X_{t+1} x_{t+1} - x_t^\top X_t x_t \right) + x_0^\top X_0 x_0 - x_N^\top X_N x_N = 0. \qquad (4)$$

Adding (4) to (3) and using the expression (1) for $x_{t+1}$, we get

$$J_{N,x_0}(u) = \sum_{t=0}^{N-1} [\,x_t^\top \;\; u_t^\top\,] \begin{bmatrix} Q + A^\top X_{t+1} A - X_t & S + A^\top X_{t+1} B \\ S^\top + B^\top X_{t+1} A & R + B^\top X_{t+1} B \end{bmatrix} \begin{bmatrix} x_t \\ u_t \end{bmatrix} + x_N^\top (W - X_N)\, x_N + x_0^\top X_0 x_0, \qquad (5)$$

which holds for any sequence of matrices $X_t$. With $X_N = W$ fixed, for $t = N-1, N-2, \ldots, 0$, let

$$X_t = Q + A^\top X_{t+1} A - (S + A^\top X_{t+1} B)(R + B^\top X_{t+1} B)^\dagger (S^\top + B^\top X_{t+1} A). \qquad (6)$$

It is now easy to see that all the matrices of the sequence $X_t$ defined above are positive semidefinite. Indeed, $X_N = W \geq 0$. Assume by induction that $X_{t+1} \geq 0$. Then,

$$M_{t+1} \stackrel{\text{def}}{=} \begin{bmatrix} Q + A^\top X_{t+1} A & S + A^\top X_{t+1} B \\ S^\top + B^\top X_{t+1} A & R + B^\top X_{t+1} B \end{bmatrix} = \Pi + \begin{bmatrix} A^\top \\ B^\top \end{bmatrix} X_{t+1} [\,A \;\; B\,] \geq 0.$$

Since $X_t$ is the generalized Schur complement of the lower-right block of $M_{t+1}$ in $M_{t+1}$, it follows that $X_t \geq 0$, which in turn implies
$$R + B^\top X_t B \geq 0.$$

Moreover, by employing Eq. (6) and recalling that, given a positive semidefinite matrix $\Pi_0 = \begin{bmatrix} Q_0 & S_0 \\ S_0^\top & R_0 \end{bmatrix} = \Pi_0^\top \geq 0$, we have $\begin{bmatrix} Q_0 & S_0 \\ S_0^\top & R_0 \end{bmatrix} - \begin{bmatrix} S_0 \\ R_0 \end{bmatrix} R_0^\dagger \,[\,S_0^\top \;\; R_0\,] \geq 0$, we easily see that

$$M_{t+1} - \begin{bmatrix} X_t & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} S + A^\top X_{t+1} B \\ R + B^\top X_{t+1} B \end{bmatrix} (R + B^\top X_{t+1} B)^\dagger \,[\,S^\top + B^\top X_{t+1} A \;\; R + B^\top X_{t+1} B\,] \geq 0.$$

Hence, (5) takes the form

$$J_{N,x_0}(u) = \sum_{t=0}^{N-1} \left\| \left[ (R + B^\top X_{t+1} B)^{1/2} \right]^\dagger (S^\top + B^\top X_{t+1} A)\, x_t + (R + B^\top X_{t+1} B)^{1/2}\, u_t \right\|_2^2 + x_0^\top X_0 x_0. \qquad (7)$$

Now it is clear that $u_t$ is optimal if and only if

$$\left[ (R + B^\top X_{t+1} B)^{1/2} \right]^\dagger (S^\top + B^\top X_{t+1} A)\, x_t + (R + B^\top X_{t+1} B)^{1/2}\, u_t = 0,$$

whose solutions are parameterized by the feedback control

$$u_t = -K_t x_t + G_t v_t, \qquad (8)$$

where $K_t \stackrel{\text{def}}{=} (R + B^\top X_{t+1} B)^\dagger (S^\top + B^\top X_{t+1} A)$ and $G_t \stackrel{\text{def}}{=} I - (R + B^\top X_{t+1} B)^\dagger (R + B^\top X_{t+1} B)$ is the orthogonal projector onto the linear space of vectors that can be added to the optimal control $u_t$ without affecting optimality, and $v_t$ is a free parameter. The optimal state trajectory is now given by the closed-loop dynamics

$$x_{t+1} = (A - B K_t)\, x_t + B G_t v_t. \qquad (9)$$

The optimal cost is clearly

$$J^* = x_0^\top X_0 x_0. \qquad (10)$$

The corresponding results in continuous time can be obtained along the same lines as the discrete-time case; see Ferrante and Ntogramatzidis (2013b) and references therein.

More General Linear-Quadratic Problems

The problem discussed in the previous section presents some limitations that prevent its applicability in several important situations. In particular, three relevant generalizations of the classical problem are:
1. The fixed endpoint case, where the states at the endpoints $x_0$ and $x_N$ are both assigned.
2. The point-to-point case, where the initial and terminal values $z_0$ and $z_N$ of a linear combination $z_t = V x_t$ of the state of the dynamical system described by (1) are constrained to be equal to two assigned vectors.
3. The cyclic case, where the states at the endpoints $x_0$ and $x_N$ are not sharply assigned, but they are constrained to be equal (clearly, we can have combinations of (2) and (3)).

All these problems are special cases of a general LQ problem that can be stated as follows:

Problem 2 Consider the dynamical setting (1) of Problem 1. Find a sequence of inputs $u_t$, with $t = 0, 1, \ldots, N-1$, and an initial state $x_0$ minimizing the cost function

$$J_{N,x_0}(u) = \sum_{t=0}^{N-1} \|y_t\|^2 + [\,x_0^\top - \bar{x}_0^\top \;\; x_N^\top - \bar{x}_N^\top\,] \begin{bmatrix} W_{11} & W_{12} \\ W_{12}^\top & W_{22} \end{bmatrix} \begin{bmatrix} x_0 - \bar{x}_0 \\ x_N - \bar{x}_N \end{bmatrix} \qquad (11)$$

under the dynamic constraints (1) and the endpoints constraints

$$V \begin{bmatrix} x_0 \\ x_N \end{bmatrix} = v. \qquad (12)$$
Here $W \stackrel{\text{def}}{=} \begin{bmatrix} W_{11} & W_{12} \\ W_{12}^\top & W_{22} \end{bmatrix}$ is a positive semidefinite matrix (partitioned in four blocks) that quadratically penalizes the differences between the initial state $x_0$ and a desired initial state $\bar{x}_0$ and between the final state $x_N$ and a desired final state $\bar{x}_N$. (This general problem formulation can also encompass problems where the difference $x_0 - x_N$ is not fixed but has to be quadratically penalized with a matrix $\Lambda = \Lambda^\top \geq 0$ in the performance index $J_{N,x_0}(u) = \sum_{t=0}^{N-1} [\,x_t^\top \;\; u_t^\top\,]\, \Pi \left[ \begin{smallmatrix} x_t \\ u_t \end{smallmatrix} \right] + (x_0 - x_N)^\top \Lambda\, (x_0 - x_N)$. It is simple to see that this performance index can be brought back to (11) by setting $W = \left[ \begin{smallmatrix} \Lambda & -\Lambda \\ -\Lambda & \Lambda \end{smallmatrix} \right]$ and $\bar{x}_0 = \bar{x}_N = 0$.) Equation (12) permits to impose also a hard constraint on an arbitrary linear combination of initial and final states.

The solution of Problem 2 can be obtained by parameterizing the solutions of the so-called extended symplectic system (see Ferrante and Levy (1998) and references therein for a discussion on symplectic matrices and pencils). This solution can be convenient also for the classical case of Problem 1. In fact it does not require to iterate the difference Riccati equation (which can be undesirable if the time horizon is large) but only to solve an algebraic Riccati equation and a related discrete Lyapunov equation.

This solution requires some definitions, preliminary results, and standing assumptions (see Ntogramatzidis and Ferrante (2013) and Ferrante and Ntogramatzidis (2013a) for a more general approach which does not require such assumptions). A detailed proof of the main result can be found in Ferrante and Ntogramatzidis (2007b); see also Zattoni (2008) and Ferrante and Ntogramatzidis (2012). The extended symplectic pencil is defined by

$$zF - G, \qquad F \stackrel{\text{def}}{=} \begin{bmatrix} I_n & 0 & 0 \\ 0 & A^\top & 0 \\ 0 & -B^\top & 0 \end{bmatrix}, \qquad G \stackrel{\text{def}}{=} \begin{bmatrix} A & 0 & B \\ -Q & I_n & -S \\ S^\top & 0 & R \end{bmatrix}, \qquad (13)$$

where $Q$, $S$, $R$ are defined as in (3). We make the following assumptions:

(A1) The pair $(A, B)$ is modulus controllable, i.e., $\forall \lambda \in \mathbb{C} \setminus \{0\}$ at least one of the two matrices $[\,\lambda I - A \mid B\,]$ and $[\,\lambda^{-1} I - A \mid B\,]$ has full row rank.
(A2) The pencil $zF - G$ is regular (i.e., $\det(zF - G)$ is not the zero polynomial) and has no generalized eigenvalues on the unit circle.

Consider the discrete algebraic Riccati equation

$$X = A^\top X A - (A^\top X B + S)(R + B^\top X B)^{-1} (S^\top + B^\top X A) + Q. \qquad (14)$$

Under assumptions (A1)–(A2), (14) admits a strongly unmixed solution $X = X^\top$, i.e., a solution $X$ for which the corresponding closed-loop matrix

$$A_X \stackrel{\text{def}}{=} A - B K_X, \qquad K_X \stackrel{\text{def}}{=} (R + B^\top X B)^{-1} (S^\top + B^\top X A) \qquad (15)$$

has spectrum that does not contain reciprocal values, i.e., $\lambda \in \sigma(A_X)$ implies $\lambda^{-1} \notin \sigma(A_X)$. It can now be proven that the following closed-loop Lyapunov equation admits a unique solution $Y = Y^\top \in \mathbb{R}^{n \times n}$:

$$A_X Y A_X^\top - Y + B (R + B^\top X B)^{-1} B^\top = 0. \qquad (16)$$

The following theorem provides an explicit formula parameterizing all the optimal state and control trajectories for Problem 2. Notice that this formula can be readily implemented starting from the problem data.

Theorem 1 With reference to Problem 2, assume that (A1) and (A2) are satisfied. Let $X = X^\top$ be any strongly unmixed solution of (14) and $Y = Y^\top$ be the corresponding solution of (16). Let $N_V$ be a basis matrix (in the case when $\ker V = \{0\}$, we consider $N_V$ to be void) of the null space of $V$. Moreover, let
$$F \stackrel{\text{def}}{=} A_X^N, \qquad K_\perp \stackrel{\text{def}}{=} K_X Y A_X^\top - (R + B^\top X B)^{-1} B^\top,$$
$$\hat{X} \stackrel{\text{def}}{=} \operatorname{diag}(X, X), \qquad \bar{x} \stackrel{\text{def}}{=} \begin{bmatrix} \bar{x}_0 \\ \bar{x}_N \end{bmatrix},$$
$$w \stackrel{\text{def}}{=} \begin{bmatrix} v \\ -N_V^\top W \bar{x} \end{bmatrix}, \qquad L \stackrel{\text{def}}{=} \begin{bmatrix} I_n & Y F^\top \\ F & Y \end{bmatrix},$$
$$U \stackrel{\text{def}}{=} \begin{bmatrix} 0 & F^\top \\ 0 & I_n \end{bmatrix}, \qquad M \stackrel{\text{def}}{=} \begin{bmatrix} V L \\ N_V^\top \left[ (\hat{X} - W)\, L - U \right] \end{bmatrix}.$$

Problem 2 admits solutions if and only if $w \in \operatorname{im} M$. In this case, let $N_M$ be a basis matrix of the null space of $M$, and define

$$\mathcal{P} \stackrel{\text{def}}{=} \{\, \pi = M^\dagger w + N_M \alpha \mid \alpha \text{ arbitrary} \,\}. \qquad (17)$$

Then, the set of optimal state and control trajectories of Problem 2 is parameterized, in terms of $\pi \in \mathcal{P}$, by

$$\begin{bmatrix} x(t) \\ u(t) \end{bmatrix} = \begin{cases} \begin{bmatrix} A_X^t & Y (A_X^\top)^{N-t} \\ -K_X A_X^t & K_\perp (A_X^\top)^{N-t-1} \end{bmatrix} \pi, & 0 \leq t \leq N-1, \\[1ex] \begin{bmatrix} A_X^N & Y \\ 0 & 0 \end{bmatrix} \pi, & t = N. \end{cases} \qquad (18)$$

The interpretation of the above result is the following. As $\pi$ varies, (18) describes the trajectories of the extended symplectic system. The set $\mathcal{P}$ defined in (17) is the set of $\pi$ for which these trajectories satisfy the boundary conditions. All the details of this construction can be found in Ferrante and Ntogramatzidis (2007b). If the pair $(A, B)$ is stabilizable, we can choose $X = X^\top$ to be the stabilizing solution of (14). In such case, the matrices $A_X^t$, $(A_X^\top)^{N-t}$ and $(A_X^\top)^{N-t-1}$ appearing in (18) are asymptotically stable for all $t = 0, \ldots, N$. Thus, in this case, the optimal state trajectory and control are expressed in terms of powers of strictly stable matrices in the overall time interval, thus ensuring the robustness of the obtained solution even for very large time horizons. Indeed, the stabilizing solution of an algebraic Riccati equation and the solution of a Lyapunov equation may be computed by standard and robust algorithms available in any control package (see the MATLAB routines dare.m and dlyap.m). We refer to Ferrante et al. (2005) and Ferrante and Ntogramatzidis (2013b) for the continuous-time counterpart of the above results.

Summary

With the technique discussed in this paper, a large number of LQ problems can be tackled in a unified framework. Moreover, several finite-horizon LQ problems that can be interesting and useful in practice can be recovered as particular cases of the control problem considered here. The generality of the optimal control problem herein considered is crucial in the solution of several $H_2$-$H_\infty$ optimization problems whose optimal trajectory is composed of a set of arches, each one solving a parametric LQ subproblem in a specific time horizon and all joined together at the endpoints of each subinterval. In these cases, in fact, a very general form of constraint on the extreme states is essential in order to express the condition of conjunction of each pair of subsequent arches at the endpoints.

Cross-References

▶H2 Optimal Control
▶H-Infinity Control
▶Linear Quadratic Optimal Control

Bibliography

Anderson BDO, Moore JB (1971) Linear optimal control. Prentice Hall International, Englewood Cliffs
Balas G, Bokor J (2004) Detection filter design for LPV systems – a geometric approach. Automatica 40:511–518
Bilardi G, Ferrante A (2007) The role of terminal cost/reward in finite-horizon discrete-time LQ optimal control. Linear Algebra Appl (Spec Issue honor Paul Fuhrmann) 425:323–344
Ferrante A (2004) On the structure of the solution of discrete-time algebraic Riccati equation with singular
closed-loop matrix. IEEE Trans Autom Control AC-49(11):2049–2054
Ferrante A, Levy B (1998) Canonical form for symplectic matrix pencils. Linear Algebra Appl 274:259–300
Ferrante A, Ntogramatzidis L (2005) Employing the algebraic Riccati equation for a parametrization of the solutions of the finite-horizon LQ problem: the discrete-time case. Syst Control Lett 54(7):693–703
Ferrante A, Ntogramatzidis L (2007a) A unified approach to the finite-horizon linear quadratic optimal control problem. Eur J Control 13(5):473–488
Ferrante A, Ntogramatzidis L (2007b) A unified approach to finite-horizon generalized LQ optimal control problems for discrete-time systems. Linear Algebra Appl (Spec Issue honor Paul Fuhrmann) 425(2–3):242–260
Ferrante A, Ntogramatzidis L (2012) Comments on "Structural Invariant Subspaces of Singular Hamiltonian Systems and Nonrecursive Solutions of Finite-Horizon Optimal Control Problems". IEEE Trans Autom Control 57(1):270–272
Ferrante A, Ntogramatzidis L (2013a) The extended symplectic pencil and the finite-horizon LQ problem with two-sided boundary conditions. IEEE Trans Autom Control 58(8):2102–2107
Ferrante A, Ntogramatzidis L (2013b) The role of the generalised continuous algebraic Riccati equation in impulse-free continuous-time singular LQ optimal control. In: Proceedings of the 52nd conference on decision and control (CDC 13), Florence, 10–13 Dec 2013
Ferrante A, Marro G, Ntogramatzidis L (2005) A parametrization of the solutions of the finite-horizon LQ problem with general cost and boundary conditions. Automatica 41:1359–1366
Kalman RE (1960) Contributions to the theory of optimal control. Bulletin de la Sociedad Matematica Mexicana 5:102–119
Kwakernaak H, Sivan R (1972) Linear optimal control systems. Wiley, New York
Ntogramatzidis L, Ferrante A (2010) On the solution of the Riccati differential equation arising from the LQ optimal control problem. Syst Control Lett 59(2):114–121
Ntogramatzidis L, Ferrante A (2013) The generalised discrete algebraic Riccati equation in linear-quadratic optimal control. Automatica 49:471–478. doi:10.1016/j.automatica.2012.11.006
Ntogramatzidis L, Marro G (2005) A parametrization of the solutions of the Hamiltonian system for stabilizable pairs. Int J Control 78(7):530–533
Prattichizzo D, Ntogramatzidis L, Marro G (2008) A new approach to the cheap LQ regulator exploiting the geometric properties of the Hamiltonian system. Automatica 44:2834–2839
Zattoni E (2008) Structural invariant subspaces of singular Hamiltonian systems and nonrecursive solutions of finite-horizon optimal control problems. IEEE Trans Autom Control AC-53(5):1279–1284

Graphs for Modeling Networked Interactions

Mehran Mesbahi¹ and Magnus Egerstedt²
¹University of Washington, Seattle, WA, USA
²Georgia Institute of Technology, Atlanta, GA, USA

Abstract

Graphs constitute natural models for networks of interacting agents. This chapter introduces graph theoretic formalisms that facilitate analysis and synthesis of coordinated control algorithms over networks.

Keywords

Distributed control; Graph theory; Multi-agent networks

Introduction

Distributed and networked systems are characterized by a set of dynamical units (agents, actors, nodes) that share information with each other in order to achieve a global performance objective using locally available information. Information can typically be shared if agents are within communication or sensing range of each other. It is useful to abstract away the particulars of the underlying information-exchange mechanism and simply say that an information link exists between two nodes if they can share information. Such an abstraction is naturally represented in terms of a graph.

A graph is a combinatorial object defined by two constructs: vertices (or nodes) and edges (or links) connecting pairs of distinct vertices. The set of $N$ vertices specified by $V = \{v_1, \ldots, v_N\}$ corresponds to the agents, and an edge between vertices $v_i$ and $v_j$ is represented by $(v_i, v_j)$; the set of all edges constitutes the edge set $E$. The graph $G$ is thus the pair $G = (V, E)$ and the interpretation is that an edge $(v_i, v_j) \in E$
Graphs for Modeling Networked Interactions 511

Graphs for Modeling


Networked Interactions,
Fig. 1 A network of
agents equipped with
omnidirectional range
sensors can be viewed as a
graph (undirected in this
case), with nodes
corresponding to the agents
and edges to their pairwise
interactions, which are
enabled whenever the
agents are within a certain
distance from each other

G
if information can flow from vertex i to vertex j. If the information exchange is sensor based, and if there is a state xi associated with vertex i, e.g., its position, then this information is typically relative, i.e., the states are measured relative to each other and the information obtained along the edge is xj − xi. If, on the other hand, the information is communicated, then the full state information xi can be transmitted along the edge (vi, vj). The graph abstraction for a network of agents with sensor-based information exchange is illustrated in Fig. 1.

It is often useful to differentiate between scenarios where the information exchange is bidirectional – if agent i can get information from agent j, then agent j can get information from agent i – and when it is not. In the language of graph theory, an undirected graph is one where (vi, vj) ∈ E implies that (vj, vi) ∈ E, while a directed graph is one where such an implication may not hold.

Graph-Based Coordination Models

Graphs provide structural insights into how different coordination algorithms behave over a network. A coordination algorithm, or protocol, is an update rule that describes how the node states should evolve over time. To understand such protocols, one needs to connect the interaction dynamics to the underlying graph structure. This connection is facilitated through the common intersection of linear system theory and graph theory, namely, the broad discipline of algebraic graph theory, by first associating matrices with graphs. For undirected graphs, the following matrices play a key role:

Degree matrix: Δ = Diag(deg(v1), ..., deg(vN));
Adjacency matrix: A = [aij],

where Diag denotes a diagonal matrix whose diagonal consists of its argument and deg(vi) is the degree of vertex i in the graph, i.e., the cardinality of the set of edges incident on vertex i. Moreover,

aij = 1 if (vj, vi) ∈ E, and aij = 0 otherwise.

As an example of how these matrices come into play, the so-called consensus protocol over scalar states can be compactly written in ensemble form as

ẋ = −Lx,

where x = [x1, ..., xN]^T and L is the graph Laplacian:

L = Δ − A.

A useful matrix for directed networks is the incidence matrix, obtained by associating an index to each edge in E. We say that vi = tail(ej) if edge ej starts at node vi and vi = head(ej) if ej ends up at vi, leading to the

Incidence matrix: D = [δij],
where

δij = 1 if vi = head(ej), δij = −1 if vi = tail(ej), and δij = 0 otherwise.

It now follows that for undirected networks, the Laplacian has an equivalent representation as

L = D D^T,

where D is the incidence matrix associated with an arbitrary orientation (assignment of directions to the edges) of the undirected graph. This in turn implies that for undirected networks, L is a positive semi-definite matrix and that all of its eigenvalues are nonnegative.

If the network is directed, one has to pay attention to the direction in which information is flowing, using the in-degree and out-degree of the vertices. The out-degree of vertex i is the number of directed edges that originate at i, and similarly the in-degree of node i is the number of directed edges that terminate at node i. A directed graph is balanced if the out-degree is equal to the in-degree at every vertex in the graph. The graph Laplacian for directed graphs is obtained by only counting information flowing in the correct direction, i.e., if L = [ℓij], then ℓii is the in-degree of vertex i, and ℓij = −1 if i ≠ j and (vj, vi) ∈ E.

As a final note, for both directed and undirected networks, it is possible to associate weights with the edges, w: E → R+, where R+ is the set of nonnegative reals or more generally a field, in which case the Laplacian's diagonal elements are the sums of the weights of the edges incident to node i, and the off-diagonal elements are −w(vj, vi) when (vj, vi) ∈ E.

Applications

Graph-based coordination has been used in a number of application domains, such as multi-agent robotics, mobile sensor and communication networks, formation control, and biological systems. One way in which the consensus protocol can be generalized is by defining an edge-tension energy Eij(‖xi − xj‖) along each edge in the graph, which gives the total energy in the network as

E(x) = Σ_{i=1}^{N} Σ_{j ∈ Ni} Eij(‖xi − xj‖).

If the agents update their states in such a way as to reduce the total energy in the system according to a gradient descent scheme, the update law becomes

ẋi = −∂E(x)/∂xi  ⟹  Ė(x) = −‖∂E(x)/∂x‖²,

which is nonpositive, i.e., the total energy in the network is reduced. For undirected networks, the ensemble version of this protocol assumes the form

ẋ = −Lw(x) x,

where the weighted graph Laplacian is

Lw(x) = D W(x) D^T,

with the weight matrix W(x) = Diag(w1(x), ..., wM(x)). Here M is the total number of edges in the network, and wk(x) is the weight that corresponds to the kth edge, given an arbitrary ordering of the edges consistent with the incidence matrix D.

This energy interpretation allows for the synthesis of coordination laws for multi-agent networks with desirable properties, such as Eij(‖xi − xj‖) = (‖xi − xj‖ − dij)² for making the agents reach the desired interagent distances dij, as shown in Fig. 2. Other applications where these types of constructions have been used include collision avoidance and connectivity maintenance.

Summary and Future Directions

A number of issues pertaining to graph-based distributed control remain to be resolved. These include how heterogeneous networks, i.e., networks comprising agents with different capabilities,
Graphs for Modeling Networked Interactions, Fig. 2 Fifteen mobile robots are forming the letter "G" by executing a weighted version of the consensus protocol. (a) Formation control (t = 0). (b) Formation control (t = 5)
can be designed and understood. A variation on this theme is networks of networks, i.e., networks that are loosely coupled together and that must coordinate at a higher level of abstraction. Another key issue concerns how human operators should interact with networked control systems.

Cross-References

Averaging Algorithms and Consensus
Distributed Optimization
Dynamic Graphs, Connectivity of
Flocking in Networked Systems
Networked Systems
Optimal Deployment and Spatial Coverage
Oscillator Synchronization
Vehicular Chains

Recommended Reading

There are a number of research manuscripts and textbooks that explore the role of network structure in the system-theoretic aspects of networked dynamic systems and its many ramifications. Some of these references are listed below.

Bibliography

Bai H, Arcak M, Wen J (2011) Cooperative control design: a systematic, passivity-based approach. Springer, Berlin
Bullo F, Cortés J, Martínez S (2009) Distributed control of robotic networks: a mathematical approach to motion coordination algorithms. Princeton University Press, Princeton
Mesbahi M, Egerstedt M (2010) Graph theoretic methods in multiagent networks. Princeton University Press, Princeton
Ren W, Beard R (2008) Distributed consensus in multi-vehicle cooperative control. Springer, Berlin
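The matrices and the consensus protocol discussed in this entry can be illustrated numerically. The following sketch uses a hypothetical four-node path graph and illustrative initial states and step size (none of which come from the entry itself); it builds the adjacency, degree, incidence, and Laplacian matrices, checks L = D D^T, and integrates ẋ = −Lx with forward Euler:

```python
import numpy as np

# Hypothetical undirected path graph v1 - v2 - v3 - v4 with an
# arbitrary orientation; column k of D has +1 at head(ek), -1 at tail(ek).
N = 4
edges = [(0, 1), (1, 2), (2, 3)]
A = np.zeros((N, N))
D = np.zeros((N, len(edges)))
for k, (i, j) in enumerate(edges):
    A[i, j] = A[j, i] = 1.0
    D[i, k], D[j, k] = -1.0, 1.0      # i = tail(ek), j = head(ek)

Delta = np.diag(A.sum(axis=1))        # degree matrix
L = Delta - A                         # graph Laplacian

# For an undirected graph, L = D D^T for any orientation, and L is
# positive semi-definite (all eigenvalues nonnegative).
assert np.allclose(L, D @ D.T)
assert np.linalg.eigvalsh(L).min() >= -1e-12

# Consensus protocol xdot = -L x, integrated with forward Euler: on a
# connected undirected graph, all states converge to the initial average.
x = np.array([1.0, 3.0, 5.0, 7.0])
dt = 0.01
for _ in range(5000):
    x = x - dt * (L @ x)
print(np.round(x, 3))                 # close to [4, 4, 4, 4]
```

The step size must satisfy dt · λmax(L) < 2 for the Euler iteration to be stable; here λmax(L) ≤ 4, so dt = 0.01 is comfortably small.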
H

H2 Optimal Control

Ben M. Chen
Department of Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore

Abstract

An optimization-based approach to linear feedback control system design uses the H2 norm, or energy of the impulse response, to quantify closed-loop performance. In this entry, an overview of state-space methods for solving H2 optimal control problems via Riccati equations and matrix inequalities is presented in a continuous-time setting. Both regular and singular problems are considered. Connections to so-called LQR and LQG control problems are also described.

Keywords

Feedback control; H2 control; Linear matrix inequalities; Linear systems; Riccati equations; State-space methods

Introduction

Modern multivariable control theory based on state-space models is able to handle multi-feedback-loop designs, with the added benefit that design methods derived from it are amenable to computer implementation. Indeed, over the last five decades, a number of multivariable analysis and design methods have been developed using the state-space description of systems. Of these design tools, H2 optimal control problems involve minimizing the H2 norm of the closed-loop transfer function from exogenous disturbance signals to pertinent controlled output signals of a given plant by appropriate use of an internally stabilizing feedback controller. It was not until the 1990s that a complete solution to the general H2 optimal control problem began to emerge. To elaborate on this, let us concentrate our discussion on H2 optimal control for a continuous-time system Σ expressed in the following state-space form:

ẋ = Ax + Bu + Ew   (1)
y = C1 x + D11 u + D1 w   (2)
z = C2 x + D2 u + D22 w   (3)

where x is the state variable, u is the control input, w is the exogenous disturbance input, y is the measurement output, and z is the controlled output. The system Σ is typically an augmented or generalized plant model including weighting functions that reflect design requirements. The H2 optimal control problem is to find an appropriate control law, relating the control input u to the measured output y, such that when it is applied to the given plant in Eqs. (1)–(3), the

J. Baillieul, T. Samad (eds.), Encyclopedia of Systems and Control, DOI 10.1007/978-1-4471-5058-9, © Springer-Verlag London 2015
resulting closed-loop system is internally stable, and the H2 norm of the resulting closed-loop transfer matrix from the disturbance input w to the controlled output z, denoted by Tzw(s), is minimized. For a stable transfer matrix Tzw(s), the H2 norm is defined as

‖Tzw‖2 = [ (1/(2π)) ∫_{−∞}^{∞} trace( Tzw(jω)^H Tzw(jω) ) dω ]^{1/2}   (4)

where Tzw^H is the conjugate transpose of Tzw. Note that the H2 norm is equal to the energy of the impulse response associated with Tzw(s), and it is finite only if the direct feedthrough term of the transfer matrix is zero.

It is standard to make the following assumptions on the problem data: D11 = 0; D22 = 0; (A, B) is stabilizable; (A, C1) is detectable. The last two assumptions are necessary for the existence of an internally stabilizing control law. The first assumption can be made without loss of generality via a constant loop transformation. Finally, either the assumption D22 = 0 can be achieved by a pre-static feedback law, or the problem does not yield a solution that has finite H2 closed-loop norm.

There are two main groups into which all H2 optimal control problems can be divided. The first group, referred to as regular H2 optimal control problems, consists of those problems for which the given plant satisfies two additional assumptions:
1. The subsystem from the control input to the controlled output, i.e., (A, B, C2, D2), has no invariant zeros on the imaginary axis, and its direct feedthrough matrix, D2, is injective (i.e., it is tall and of full rank).
2. The subsystem from the exogenous disturbance to the measurement output, i.e., (A, E, C1, D1), has no invariant zeros on the imaginary axis, and its direct feedthrough matrix, D1, is surjective (i.e., it is fat and of full rank).
Assumption 1 implies that (A, B, C2, D2) is left invertible with no infinite zero, and Assumption 2 implies that (A, E, C1, D1) is right invertible with no infinite zero. The second, referred to as singular H2 optimal control problems, consists of those which are not regular.

Most of the research in the literature was expended on regular problems. Also, most of the available textbooks and review articles (see, for example, Anderson and Moore (1989), Bryson and Ho (1975), Fleming and Rishel (1975), Kailath (1974), Kwakernaak and Sivan (1972), Lewis (1986), and Zhou et al. (1996), to name a few) cover predominantly only a subset of regular problems. The singular H2 control problem with state feedback was studied in Geerts (1989) and Willems et al. (1986). Using different classes of state- and measurement-feedback control laws, Stoorvogel et al. (1993) studied the general H2 optimal control problems for the first time. In particular, necessary and sufficient conditions are provided therein for the existence of a solution in the case of state-feedback control, and in the case of measurement-feedback control. Following this, Trentelman and Stoorvogel (1995) explored necessary and sufficient conditions for the existence of an H2 optimal controller within the context of discrete-time and sampled-data systems. At the same time, Chen et al. (1993, 1994a) provided a thorough treatment of the H2 optimal control problem with state-feedback controllers. This includes a parameterization and construction of the set of all H2 optimal controllers and the associated sets of H2 optimal fixed modes and H2 optimal fixed decoupling zeros. Also, they provided a computationally feasible design algorithm for selecting an H2 optimal state-feedback controller that places the closed-loop poles at desired locations whenever possible. Furthermore, Chen and Saberi (1993) and Chen et al. (1996) developed the necessary and sufficient conditions for the uniqueness of an H2 optimal controller. Interested readers are referred to the textbook by Saberi et al. (1995) for a detailed treatment of H2 optimal control problems in their full generality.

Regular Case

Solving regular H2 optimal control problems is relatively straightforward. In the case that all of
the state variables of the given plant are available for feedback, i.e., y = x, and Assumption 1 holds, the corresponding H2 optimal control problem can be solved in terms of the unique positive semi-definite stabilizing solution P ≥ 0 of the following algebraic Riccati equation:

A^T P + P A + C2^T C2 − (P B + C2^T D2)(D2^T D2)^{−1}(D2^T C2 + B^T P) = 0   (5)

The H2 optimal state-feedback law is given by

u = F x = −(D2^T D2)^{−1}(D2^T C2 + B^T P) x   (6)

and the resulting closed-loop transfer matrix from w to z, Tzw(s), has the following property:

‖Tzw‖2 = [trace(E^T P E)]^{1/2}   (7)

Note that the H2 optimal state-feedback control law is generally nonunique. A trivial example is the case when E = 0, whereby every stabilizing control law is an optimal solution. It is also interesting to note that the closed-loop system comprising the given plant with y = x and the state-feedback control law of Eq. (6) has poles at all the stable invariant zeros and all the mirror images of the unstable invariant zeros of (A, B, C2, D2), together with some other fixed locations in the left half complex plane. More detailed results about the optimal fixed modes and fixed decoupling zeros for general H2 optimal control can be found in Chen et al. (1993).

It can be shown that the well-known linear quadratic regulation (LQR) problem can be reformulated as a regular H2 optimal control problem. For a given plant

ẋ = A x + B u,  x(0) = X0   (8)

with (A, B) being stabilizable, the LQR problem is to find a control law u = F x such that the following performance index is minimized:

J = ∫_0^∞ (x^T Q* x + u^T R* u) dt   (9)

where R* > 0 and Q* ≥ 0 with (A, Q*^{1/2}) being detectable. The LQR problem is equivalent to finding a static state-feedback H2 optimal control law for the following auxiliary plant ΣLQR:

ẋ = A x + B u + X0 w   (10)
y = x   (11)
z = [Q*^{1/2}; 0] x + [0; R*^{1/2}] u   (12)

For the measurement-feedback case with both Assumptions 1 and 2 being satisfied, the corresponding H2 optimal control problem can be solved by finding a positive semi-definite stabilizing solution P ≥ 0 for the Riccati equation given in Eq. (5) and a positive semi-definite stabilizing solution Q ≥ 0 for the following Riccati equation:

Q A^T + A Q + E E^T − (Q C1^T + E D1^T)(D1 D1^T)^{−1}(D1 E^T + C1 Q) = 0   (13)

The H2 optimal measurement-feedback law is given by

v̇ = (A + B F + K C1) v − K y,  u = F v   (14)

where F is as given in Eq. (6) and

K = −(Q C1^T + E D1^T)(D1 D1^T)^{−1}   (15)

In fact, such an optimal control law is unique, and the resulting closed-loop transfer matrix from w to z, Tzw(s), has the following property:

‖Tzw‖2 = { trace(E^T P E) + trace[(A^T P + P A + C2^T C2) Q] }^{1/2}   (16)

Similarly, consider the standard LQG problem for the following system:

ẋ = A x + B u + G* d   (17)
y = C x + N* n,  N* > 0   (18)
z = [H* x; R* u],  R* > 0,  w = [d; n]   (19)

where x is the state, u is the control, d and n are white noises with identity covariance, and y is the measurement output. It is assumed that (A, B) is stabilizable and (A, C) is detectable. The control objective is to design an appropriate control law that minimizes the expectation of |z|². Such an LQG problem can be solved via the H2 optimal control problem for the following auxiliary system ΣLQG (see Doyle 1983):

ẋ = A x + B u + [G* 0] w   (20)
y = C x + [0 N*] w   (21)
z = [H*; 0] x + [0; R*] u   (22)

The H2 optimal control problem for discrete-time systems can be solved in a similar way via the corresponding discrete-time algebraic Riccati equations. It is worth noting that many works can be found in the literature that deal with solutions to discrete-time algebraic Riccati equations related to optimal control problems; see, for example, Kucera (1972), Pappas et al. (1980), and Silverman (1976), to name a few. It is proven in Chen et al. (1994b) that solutions to the discrete- and continuous-time algebraic Riccati equations for optimal control problems can be unified. More specifically, a discrete-time Riccati equation can be solved through an equivalent continuous-time one and vice versa.

Singular Case

As in the previous section, only the key procedure for solving the singular H2 optimization problem for continuous-time systems is addressed. For the singular problem, it is generally not possible to obtain an optimal solution, except in some situations when the given plant satisfies certain geometric constraints; see, e.g., Chen et al. (1993) and Stoorvogel et al. (1993). It is more feasible to find a suboptimal control law for the singular problem, i.e., to find an appropriate control law such that the H2 norm of the resulting closed-loop transfer matrix from w to z can be made arbitrarily close to the best possible performance. The procedure given below transforms the original problem into an H2 almost disturbance decoupling problem; see Stoorvogel (1992) and Stoorvogel et al. (1993).

Consider the given plant in Eqs. (1)–(3) with Assumption 1 and/or Assumption 2 not satisfied. First, find the largest solution P ≥ 0 of the following linear matrix inequality:

F(P) = [ A^T P + P A + C2^T C2,  P B + C2^T D2 ; B^T P + D2^T C2,  D2^T D2 ] ≥ 0   (23)

and find the largest solution Q ≥ 0 of

G(Q) = [ A Q + Q A^T + E E^T,  Q C1^T + E D1^T ; C1 Q + D1 E^T,  D1 D1^T ] ≥ 0   (24)

Note that by decomposing the quadruples (A, B, C2, D2) and (A, E, C1, D1) into various subsystems in accordance with their structural properties, solutions to the above linear matrix inequalities can be obtained by solving Riccati equations similar to those in Eq. (5) or Eq. (13) for the regular case. In fact, for the regular problem, the largest solution P ≥ 0 of Eq. (23) and the stabilizing solution P ≥ 0 of Eq. (5) are identical. Similarly, the largest solution Q ≥ 0 of Eq. (24) and the stabilizing solution Q ≥ 0 of Eq. (13) are also the same. Interested readers are referred to Stoorvogel et al. (1993) for more details, or to Chen et al. (2004) for a more systematic treatment of the structural decomposition of linear systems and its connection to the solutions of the linear matrix inequalities.

It can be shown that the best achievable H2 norm of the closed-loop transfer matrix from w to z, i.e., the best possible performance over all internally stabilizing control laws, is given by

γ2* = { trace(E^T P E) + trace[(A^T P + P A + C2^T C2) Q] }^{1/2}   (25)
Next, partition

F(P) = [CP^T; DP^T] [CP DP]  and  G(Q) = [EQ; DQ] [EQ^T DQ^T]   (26)

where [CP DP] and [EQ^T DQ^T] are of maximal rank, and then define an auxiliary system ΣPQ:

ẋPQ = A xPQ + B u + EQ wPQ   (27)
y = C1 xPQ + DQ wPQ   (28)
zPQ = CP xPQ + DP u   (29)

It can be shown that the quadruple (A, B, CP, DP) is right invertible and has no invariant zeros in the open right-half complex plane, and the quadruple (A, EQ, C1, DQ) is left invertible and has no invariant zeros in the open right-half complex plane. It can also be shown that there exists an appropriate control law such that when it is applied to ΣPQ, the resulting closed-loop system is internally stable and the H2 norm of the closed-loop transfer matrix from wPQ to zPQ can be made arbitrarily small. Equivalently, the H2 almost disturbance decoupling problem for ΣPQ is solvable.

More importantly, it can further be shown that if an appropriate control law solves the H2 almost disturbance decoupling problem for ΣPQ, then it solves the H2 suboptimal problem for Σ. As such, the singular H2 control problem for Σ can be solved by finding a solution to the H2 almost disturbance decoupling problem for ΣPQ. There are vast results available in the literature dealing with disturbance decoupling problems. More detailed treatments can be found in Saberi et al. (1995).

Conclusion

This entry considers the basic solutions to H2 optimal control problems for continuous-time systems. Both the regular problem and the general singular problem are presented. Readers interested in more details are referred to Saberi et al. (1995) and the references therein for the complete treatment of H2 optimal control problems, and to Chap. 10 of Chen et al. (2004) for the unification and differentiation of H2 control, H∞ control, and disturbance decoupling control problems. H2 optimal control is a mature area with a long history. Possible future research includes issues of how to effectively utilize the theory in solving real-life problems.

Cross-References

H-Infinity Control
Linear Matrix Inequality Techniques in Optimal Control
Linear Quadratic Optimal Control
Optimal Control via Factorization and Model Matching
Stochastic Linear-Quadratic Control

Bibliography

Anderson BDO, Moore JB (1989) Optimal control: linear quadratic methods. Prentice Hall, Englewood Cliffs
Bryson AE, Ho YC (1975) Applied optimal control, optimization, estimation, and control. Wiley, New York
Chen BM, Saberi A (1993) Necessary and sufficient conditions under which an H2-optimal control problem has a unique solution. Int J Control 58:337–348
Chen BM, Saberi A, Sannuti P, Shamash Y (1993) Construction and parameterization of all static and dynamic H2-optimal state feedback solutions, optimal fixed modes and fixed decoupling zeros. IEEE Trans Autom Control 38:248–261
Chen BM, Saberi A, Shamash Y, Sannuti P (1994a) Construction and parameterization of all static and dynamic H2-optimal state feedback solutions for discrete time systems. Automatica 30:1617–1624
Chen BM, Saberi A, Shamash Y (1994b) A non-recursive method for solving the general discrete time algebraic Riccati equation related to the H∞ control problem. Int J Robust Nonlinear Control 4:503–519
Chen BM, Saberi A, Shamash Y (1996) Necessary and sufficient conditions under which a discrete time H2-optimal control problem has a unique solution. J Control Theory Appl 13:745–753
Chen BM, Lin Z, Shamash Y (2004) Linear systems theory: a structural decomposition approach. Birkhäuser, Boston
Doyle JC (1983) Synthesis of robust controller and filters. In: Proceedings of the 22nd IEEE conference on decision and control, San Antonio
Fleming WH, Rishel RW (1975) Deterministic and stochastic optimal control. Springer, New York
Geerts T (1989) All optimal controls for the singular linear quadratic problem without stability: a new interpretation of the optimal cost. Linear Algebra Appl 122:65–104
Kailath T (1974) A view of three decades of linear filtering theory. IEEE Trans Inf Theory 20:146–180
Kucera V (1972) The discrete Riccati equation of optimal control. Kybernetika 8:430–447
Kwakernaak H, Sivan R (1972) Linear optimal control systems. Wiley, New York
Lewis FL (1986) Optimal control. Wiley, New York
Pappas T, Laub AJ, Sandell NR Jr (1980) On the numerical solution of the discrete-time algebraic Riccati equation. IEEE Trans Autom Control AC-25:631–641
Saberi A, Sannuti P, Chen BM (1995) H2 optimal control. Prentice Hall, London
Silverman L (1976) Discrete Riccati equations: alternative algorithms, asymptotic properties, and system theory interpretations. Control Dyn Syst 12:313–386
Stoorvogel AA (1992) The singular H2 control problem. Automatica 28:627–631
Stoorvogel AA, Saberi A, Chen BM (1993) Full and reduced order observer based controller design for H2-optimization. Int J Control 58:803–834
Trentelman HL, Stoorvogel AA (1995) Sampled-data and discrete-time H2 optimal control. SIAM J Control Optim 33:834–862
Willems JC, Kitapci A, Silverman LM (1986) Singular optimal control: a geometric approach. SIAM J Control Optim 24:323–337
Zhou K, Doyle JC, Glover K (1996) Robust and optimal control. Prentice Hall, Upper Saddle River

H-Infinity Control

Keith Glover
Department of Engineering, University of Cambridge, Cambridge, UK

Abstract

The area of robust control, where the performance of a feedback system is designed to be robust to uncertainty in the plant being controlled, has received much attention since the 1980s. System analysis and controller synthesis based on the H-infinity norm have been central to progress in this area. This article outlines how the control law that minimizes the H-infinity norm of the closed-loop system can be derived. Connections to other problems, such as game theory and risk-sensitive control, are discussed, and finally appropriate problem formulations to produce "good" controllers using this methodology are outlined.

Keywords

Loop-shaping; Robust control; Robust stability

Introduction

The H∞ norm probably first entered the study of robust control with the observations made by Zames (1981) in considering optimal sensitivity. The so-called H∞ methods were subsequently developed and are now routinely available to control engineers. In this entry we consider the H∞ methods for control, and for simplicity of exposition, we will restrict our attention to linear, time-invariant, finite-dimensional, continuous-time systems. Such systems can be represented by their transfer function matrix, G(s), which will then be a rational function of s. Although the Hardy space H∞ also includes nonrational functions, a rational G(s) is in H∞ if and only if it is proper and all its poles are in the open left half plane, in which case the H∞ norm is defined as

‖G(s)‖∞ = sup_{Re s > 0} σmax(G(s)) = sup_{−∞ < ω < ∞} σmax(G(jω))

(where σmax denotes the largest singular value). Hence for a single-input/single-output system with transfer function g(s), its H∞ norm ‖g(s)‖∞ gives the maximum value of |g(jω)| and hence the maximum amplification of sinusoidal signals by a system with this transfer function. In the multi-input/multi-output case a similar result holds regarding the system amplification of a vector of sinusoids. There is now a good collection of graduate-level textbooks that cover the area in some detail from a variety of approaches, and these are listed
in the Recommended Reading section; the references in this article are generally to these texts rather than to the original journal papers.

Consider a system with transfer function G(s), input vector u(t) ∈ L2(0, ∞), and output vector y(t), whose Laplace transforms are given by ū(s) and ȳ(s). Such a system will have a state-space realization

ẋ(t) = A x(t) + B u(t),  y(t) = C x(t) + D u(t)

giving G(s) = D + C(sI − A)^{−1} B, which we also denote

G(s) = [A B; C D],

and ȳ(s) = G(s) ū(s) if x(0) = 0.

There are two main reasons for using the H∞ norm. The first is in representing the system gain for input signals u(t) ∈ L2(0, ∞), or equivalently ū(jω) ∈ RL2(−∞, ∞), with corresponding norm ‖u‖2² = ∫_0^∞ u(t)* u(t) dt (where x* denotes the conjugate transpose of the vector x (or a matrix)). With these input and output spaces, the induced norm of the system is easily shown to be the H∞ norm of G(s), and in particular,

‖y‖2 ≤ ‖G(s)‖∞ ‖u‖2.

Hence in a control context the H∞ norm can give a measure of the gain, for example, from disturbances to the resulting errors. In the interconnection of systems, the property that ‖P(s)Q(s)‖∞ ≤ ‖P(s)‖∞ ‖Q(s)‖∞ is often useful.

The second reason for using the H∞ norm is in representing uncertainty in the plant being controlled; e.g., the nominal plant is Po(s) but the actual plant is P(s) = Po(s) + Δ(s), where ‖Δ(s)‖∞ ≤ δ.

A typical control design problem is given in Fig. 1, i.e.,

[z̄; ȳ] = P [w̄; ū] = [P11 w̄ + P12 ū; P21 w̄ + P22 ū],  ū = K ȳ

⟹  ȳ = (I − P22 K)^{−1} P21 w̄,  ū = K (I − P22 K)^{−1} P21 w̄,

z̄ = [P11 + P12 K (I − P22 K)^{−1} P21] w̄ =: Fl(P, K) w̄ =: Tzw w̄

where Fl(P, K) denotes the lower linear fractional transformation (LFT) with connection around the lower terminals of P as in Fig. 1.

H-Infinity Control, Fig. 1 Lower linear fractional transformation: feedback system

The standard H∞ control synthesis problem is to find a controller with transfer function K that stabilizes the closed-loop system in Fig. 1 and minimizes ‖Fl(P, K)‖∞. That is, the controller is designed to minimize the worst-case effect of the disturbance w on the output/error signal z as measured by the L2 norm of the signals. This article will describe the solution to this problem.

Robust Stability

Before we describe the solution to the synthesis problem, consider the problem of the robust stability of an uncertain plant with a feedback controller. Suppose the plant is given by the upper LFT Fu(P, Δ) with ‖Δ‖∞ ≤ 1/γ as illustrated in Fig. 2,

ȳ = Fu(P, Δ) ū   (1)
H-Infinity Control, Fig. 2 Upper linear fractional transformation

H-Infinity Control, Fig. 3 Feedback system with plant uncertainty

where

Fu(P, Δ) := P22 + P21 Δ (I − P11 Δ)^{−1} P12   (2)

The small gain theorem then states that the feedback system of Fig. 3 will be stable for all such Δ if the feedback connection of P22 and K is stable and ‖Fl(P, K)‖∞ < γ. This robust stability result is valid if P and Δ are both stable; more care is required when either or both are unstable, but with such care a similar result is true.

Let us consider a couple of examples. First suppose that the uncertainty is represented as output multiplicative uncertainty,

P = (I + W1 Δ W2) Po = Fu( [0, W2 Po; W1, Po], Δ ),

with robust stability test given by

‖Fl( [0, W2 Po; W1, Po], K )‖∞ = ‖W2 Po K (I − Po K)^{−1} W1‖∞ < γ

As a second example, consider the plants P = (M̃ + ΔM)^{−1}(Ñ + ΔN), with Δ = [ΔN ΔM] and ‖Δ‖∞ ≤ 1/γ. Here Po = M̃^{−1} Ñ is a left coprime factorization of the nominal plant, and the plants P are represented by perturbations to these coprime factors. In this case P = Fu(P, Δ), where

P = [0, I; −M̃^{−1}, −M̃^{−1} Ñ; M̃^{−1}, M̃^{−1} Ñ]

and the robust stability test will be

‖Fl(P, K)‖∞ = ‖ [K; I] (I − Po K)^{−1} M̃^{−1} ‖∞ < γ

This is related to plant perturbations in the gap metric (see Vinnicombe 2001). It is therefore observed that the robust stability test for these useful representations of uncertain plants is given by an H∞ norm test, just as in the controller synthesis problem.

Derivation of the H∞ Control Law

In this section we present a solution to the H∞ control problem and give some interpretations of the solution. The approach presented is as in Doyle et al. (1989); see also Zhou et al. (1996). We will make some simplifying structural assumptions to make the formulae less complex, and will not state the required assumptions on rank, stabilizability, and detectability. Let the system in Fig. 1 be described by the equations:

ẋ(t) = A x(t) + B1 w(t) + B2 u(t)   (3)
$$z(t) = C_1 x(t) + D_{12} u(t) \qquad (4)$$
$$y(t) = C_2 x(t) + D_{21} w(t) \qquad (5)$$

i.e., in Fig. 1

$$P = \begin{bmatrix} A & B_1 & B_2 \\ C_1 & 0 & D_{12} \\ C_2 & D_{21} & 0 \end{bmatrix},$$

where we also assume, with little loss of generality, that $D_{12}^* D_{12} = I$, $D_{21} D_{21}^* = I$, $D_{12}^* C_1 = 0$ and $B_1 D_{21}^* = 0$. Since we wish to have $\|T_{zw}\|_\infty < \gamma$, we need to find $u$ such that

$$\|z\|_2^2 - \gamma^2 \|w\|_2^2 < 0 \quad \text{for all } w \ne 0 \in L_2(0,\infty).$$

We could consider $w$ to be an adversary trying to make this expression positive, while $u$ has to ensure that it always remains negative in spite of the malicious intentions of $w$, as in a noncooperative game. Suppose that there exists a solution, $X_\infty$, to the Algebraic Riccati Equation (ARE),

$$A^* X_\infty + X_\infty A + C_1^* C_1 + X_\infty\left(\gamma^{-2} B_1 B_1^* - B_2 B_2^*\right) X_\infty = 0 \qquad (6)$$

with $X_\infty \ge 0$ and $A + (\gamma^{-2} B_1 B_1^* - B_2 B_2^*) X_\infty$ a stable "A-matrix." A simple substitution then gives that

$$\frac{d}{dt}\left(x(t)^* X_\infty x(t)\right) = -z^* z + \gamma^2 w^* w + v^* v - \gamma^2 r^* r$$

where

$$v := u + B_2^* X_\infty x, \qquad r := w - \gamma^{-2} B_1^* X_\infty x.$$

Now let $x(0) = 0$ and assuming stability so that $x(\infty) = 0$, then integrating from $0$ to $\infty$ gives

$$\|z\|_2^2 - \gamma^2 \|w\|_2^2 = \|v\|_2^2 - \gamma^2 \|r\|_2^2 \qquad (7)$$

If the state is available to $u$, then the control law $u = -B_2^* X_\infty x$ gives $v = 0$ and $\|z\|_2^2 - \gamma^2\|w\|_2^2 < 0$ for all $w \ne 0$. It can be shown that (6) has a solution if there exists a controller such that $\|F_l(P,K)\|_\infty < \gamma$. In addition since transposing a system does not change its $H_\infty$-norm, the following dual ARE will also have a solution, $Y_\infty \ge 0$,

$$A Y_\infty + Y_\infty A^* + B_1 B_1^* + Y_\infty\left(\gamma^{-2} C_1^* C_1 - C_2^* C_2\right) Y_\infty = 0 \qquad (8)$$

To obtain a solution to the output feedback case, note that (7) implies that $\|z\|_2^2 < \gamma^2\|w\|_2^2$ if and only if $\|v\|_2^2 < \gamma^2\|r\|_2^2$ and $\bar v = F_l(P_{\mathrm{tmp}}, K)\bar r$ where

$$\begin{bmatrix} \bar v \\ \bar y \end{bmatrix} = P_{\mathrm{tmp}} \begin{bmatrix} \bar r \\ \bar u \end{bmatrix}$$

and

$$P_{\mathrm{tmp}} = \begin{bmatrix} A + \gamma^{-2} B_1 B_1^* X_\infty & B_1 & B_2 \\ -B_2^* X_\infty & 0 & I \\ C_2 & D_{21} & 0 \end{bmatrix}.$$

The special structure of this problem enables a solution to be derived in much the same way as the dual of the state feedback problem. The corresponding ARE will have a solution $Y_{\mathrm{tmp}} = (I - \gamma^{-2} Y_\infty X_\infty)^{-1} Y_\infty \ge 0$ if and only if the spectral radius $\rho(Y_\infty X_\infty) < \gamma^2$.

The above outline, supported by significant technical detail and assumptions, will therefore demonstrate that there exists a stabilizing controller, $K(s)$, such that the system described by (3)–(5) satisfies $\|T_{zw}\|_\infty < \gamma$ if and only if there exist stabilizing solutions to the AREs in (6) and (8) such that

$$X_\infty \ge 0, \qquad Y_\infty \ge 0, \qquad \rho(Y_\infty X_\infty) < \gamma^2 \qquad (9)$$

The state equations for the resulting controller can be written as

$$\dot{\hat x} = A\hat x + B_1 \hat w_{\mathrm{worst}} + B_2 u + Z_\infty L_\infty (C_2 \hat x - y),$$
$$u = F_\infty \hat x, \qquad \hat w_{\mathrm{worst}} = \gamma^{-2} B_1^* X_\infty \hat x,$$
$$F_\infty := -B_2^* X_\infty, \qquad L_\infty := -Y_\infty C_2^*, \qquad Z_\infty := (I - \gamma^{-2} Y_\infty X_\infty)^{-1},$$

giving feedback from a state estimator in the presence of an estimate of the worst-case disturbance.
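The existence test (9) is straightforward to check numerically. The sketch below (not part of the original entry) computes stabilizing ARE solutions from the stable invariant subspace of the associated Hamiltonian matrices and bisects over $\gamma$; function names are hypothetical and any data used with them is made up for illustration.

```python
import numpy as np
from scipy.linalg import schur, solve

def stabilizing_are(A, R, Q):
    """Stabilizing solution X of A'X + XA + Q + X R X = 0, obtained from
    the stable invariant subspace of the Hamiltonian matrix; returns None
    if no n-dimensional stable subspace exists."""
    n = A.shape[0]
    H = np.block([[A, R], [-Q, -A.T]])
    T, Z, sdim = schur(H, output='real', sort='lhp')   # stable block first
    if sdim != n:
        return None
    X = solve(Z[:n, :n].T, Z[n:, :n].T).T              # X = Z21 @ inv(Z11)
    return (X + X.T) / 2                               # symmetrize

def gamma_feasible(gamma, A, B1, B2, C1, C2):
    """Check the three conditions of (9) for a candidate gamma."""
    X = stabilizing_are(A, gamma**-2 * B1 @ B1.T - B2 @ B2.T, C1.T @ C1)
    Y = stabilizing_are(A.T, gamma**-2 * C1.T @ C1 - C2.T @ C2, B1 @ B1.T)
    if X is None or Y is None:
        return False
    rho = max(abs(np.linalg.eigvals(Y @ X)))           # spectral radius
    return (np.linalg.eigvalsh(X).min() > -1e-9 and
            np.linalg.eigvalsh(Y).min() > -1e-9 and rho < gamma**2)

def bisect_gamma(A, B1, B2, C1, C2, lo=1e-3, hi=1e3, tol=1e-4):
    """Bisection for the minimum achievable gamma, checking (9) each step."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gamma_feasible(mid, A, B1, B2, C1, C2):
            hi = mid
        else:
            lo = mid
    return hi
```

In practice one would use mature Riccati solvers and dedicated synthesis routines; the point of the sketch is only the structure of the $\gamma$-iteration.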
As $\gamma \to \infty$ the standard LQG controller is obtained with state feedback of a state estimate obtained from a Kalman filter. In contrast to the LQG problem, the controller depends on the value of $\gamma$, and if this is chosen to be too small, then one of the conditions in (9) will be violated. In order to determine the minimum achievable value of $\gamma$, a bisection search over $\gamma$ can be performed, checking (9) for each candidate value of $\gamma$.

In the limit as $\gamma \to \gamma_{\mathrm{opt}}$ (its minimum value), a variety of situations can arise and the formulae given here may become ill-conditioned. Typically achieving $\gamma_{\mathrm{opt}}$ is more of an interesting and sometimes challenging mathematical exercise rather than a control system requirement.

This control problem does not have a unique solution, and all solutions can be characterized by an LFT form such as $K = F_l(M, Q)$ where $Q \in H_\infty$ with $\|Q\|_\infty < \gamma$; the present solution is sometimes referred to as the "central solution" obtained with $Q = 0$.

Relations for Other Solution Methods and Problem Formulations

The $H_\infty$-control problem has been shown to be related to an extraordinarily wide variety of mathematical techniques and to other problem areas, and investigations of these connections have been most fruitful. Earlier approaches (see Francis 1988) firstly used the characterization of all stabilizing controllers of Youla et al. (see Vidyasagar 1985), which shows that all stable closed-loop systems can be written as

$$F_l(P,K) = T_1 + T_2 Q T_3, \quad \text{where } Q \in H_\infty,$$

and then solved the model matching problem $\inf_{Q \in H_\infty} \|T_1 + T_2 Q T_3\|_\infty$. This model matching problem is related to interpolation theory and resulted in a productive interaction with operator theory. One solution method reduces this problem to J-spectral factorisation problems, where $J = \begin{bmatrix} I & 0 \\ 0 & -I \end{bmatrix}$, and generates state-space solutions to these problems (Kimura 1997).

The derivation above clearly demonstrates relations to noncooperative differential games, and this is fully developed in Başar and Bernhard (1995) and Green and Limebeer (1995).

The model matching problem is clearly a convex optimization problem. The solution of linear matrix inequalities can give effective methods for solving certain convex optimization problems (e.g., calculating the $H_\infty$ norm using the bounded real lemma) and can be exploited in the $H_\infty$-control problem. See Boyd and Barratt (1991) for a variety of results on convex optimization and control and Dullerud and Paganini (2000) for this approach in robust control.

As noted above there is a family of solutions to the $H_\infty$-control problem. The central solution in fact minimizes the entropy integral given by

$$I(T_{zw}; \gamma) := -\frac{\gamma^2}{2\pi} \int_{-\infty}^{\infty} \ln \left| \det\left(I - \gamma^{-2} T_{zw}(j\omega)^* T_{zw}(j\omega)\right) \right| \, d\omega \qquad (10)$$

It can be seen that this criterion will penalize the singular values of $T_{zw}(j\omega)$ from being close to $\gamma$ for a large range of frequencies.

One of the more surprising connections is with the risk-sensitive stochastic control problem (Whittle 1990), where $w$ is assumed to be Gaussian white noise and it is desired to minimize

$$J_T(\theta) := -\frac{2}{\theta T} \ln \mathbf{E}\left\{ e^{-\frac{1}{2}\theta V_T} \right\} \qquad (11)$$
$$\text{where } V_T := \int_0^T z(t)^* z(t)\, dt \qquad (12)$$

The situation with $\theta < 0$ corresponds to the risk averse controller since large values of $V_T$ are heavily penalized by the exponential function. It can be shown that if $\|T_{zw}\|_\infty < \gamma$, then, for $\theta = -\gamma^{-2}$,

$$\lim_{T \to \infty} J_T(\theta) = I(T_{zw}; \gamma)$$

and hence the central controller minimizes both the entropy integral and the risk-sensitive cost
function. When $\gamma$ is chosen to be too small, Whittle refers to the controller having a "neurotic breakdown" because the cost will be infinite for all possible control laws! If in (11) we set $\theta = -\gamma^{-2}$, then the entropy minimizing controller will have $\theta < 0$ and will be risk-averse. The risk neutral controller is when $\theta \to 0$, $\gamma \to \infty$ and gives the standard LQG case. If $\theta > 0$, then the controller will be risk-seeking, believing that large variance will be in its favor.

Controller Design with H∞ Optimization

The above solutions to the $H_\infty$ problem do not give guidance on how to set up a problem to give a "good" control system design. The problem formulation typically involves identifying frequency-dependent weighting matrices to characterize the disturbances, $w$, and the relative importance of the errors, $z$ (see Skogestad and Postlethwaite 1996). The choice of weights should also incorporate system uncertainty to obtain a robust controller.

One approach that combines both closed-loop system gain and system uncertainty is called $H_\infty$ loop-shaping, where the desired closed-loop behavior is determined by the design of the loop-shape using pre- and post-compensators and the system uncertainty is represented in the gap metric (see Vinnicombe 2001). This makes classical criteria such as low frequency tracking error, bandwidth, and high-frequency roll-off all easily incorporated. In this framework the performance and robustness measures are very well matched to each other. Such an approach has been successfully exploited in a number of practical examples (e.g., Hyde (1995) for flight control taken through to successful flight tests). Standard control design software packages now routinely have $H_\infty$-control design modules.

Summary and Future Directions

We have outlined the derivation of $H_\infty$ controllers with straightforward assumptions that nevertheless exhibit most of the features of linear time-invariant systems without such assumptions and for which routine design software is now available. Connections to a surprisingly large range of other problems are also discussed. Generalizations to more general cases such as time-varying and nonlinear systems, where the norm is interpreted as the induced norm of the system in $L_2$, can be derived although the computational aspects are no longer routine. For the problems of robust control, there are necessarily continuing efforts to match the mathematical representation of system uncertainty and system performance to the physical system requirements and to have such representations amenable to mathematical analysis and computation.

Cross-References

Fundamental Limitation of Feedback Control
H2 Optimal Control
Linear Quadratic Optimal Control
LMI Approach to Robust Control
Robust H2 Performance in Feedback Control
Structured Singular Value and Applications: Analyzing the Effect of Linear Time-Invariant Uncertainty in Linear Systems

Bibliography

Başar T, Bernhard P (1995) H∞-optimal control and related minimax design problems, 2nd edn. Birkhäuser, Boston
Boyd SP, Barratt CH (1991) Linear controller design: limits of performance. Prentice Hall, Englewood Cliffs
Doyle JC, Glover K, Khargonekar PP, Francis BA (1989) State-space solutions to standard H2 and H∞ control problems. IEEE Trans Autom Control 34(8):831–847
Dullerud GE, Paganini F (2000) A course in robust control theory: a convex approach. Springer, New York
Francis BA (1988) A course in H∞ control theory. Lecture notes in control and information sciences, vol 88. Springer, Berlin, Heidelberg
Green M, Limebeer D (1995) Linear robust control. Prentice Hall, Englewood Cliffs
Hyde RA (1995) H∞ aerospace control design: a VSTOL flight application. Springer, London
Kimura H (1997) Chain-scattering approach to H∞-control. Birkhäuser, Basel
Skogestad S, Postlethwaite I (1996) Multivariable feedback control: analysis and design. Wiley, Chichester
526 History of Adaptive Control

Vidyasagar M (1985) Control system synthesis: a factorization approach. MIT, Cambridge
Vinnicombe G (2001) Uncertainty and feedback: H∞ loop-shaping and the ν-gap metric. Imperial College Press, London
Whittle P (1990) Risk-sensitive optimal control. Wiley, Chichester
Zames G (1981) Feedback and optimal sensitivity: model reference transformations, multiplicative seminorms, and approximate inverses. IEEE Trans Automat Control 26:301–320
Zhou K, Doyle JC, Glover K (1996) Robust and optimal control. Prentice Hall, Upper Saddle River, New Jersey

History of Adaptive Control

Karl Åström
Department of Automatic Control, Lund University, Lund, Sweden

Abstract

This entry gives an overview of the development of adaptive control, starting with the early efforts in flight and process control. Two popular schemes, the model reference adaptive controller and the self-tuning regulator, are described with a thumbnail overview of theory and applications. There is currently a resurgence in adaptive flight control as well as in other applications. Some reflections on future development are also given.

Keywords

Adaptive control; Auto-tuning; Flight control; History; Model reference adaptive control; Process control; Robustness; Self-tuning regulators; Stability

Introduction

In everyday language, to adapt means to change a behavior to conform to new circumstances, for example, when the pupil area changes to accommodate variations in ambient light. The distinction between adaptation and conventional feedback is subtle because feedback also attempts to reduce the effects of disturbances and plant uncertainty. Typical examples are adaptive optics and adaptive machine tool control, which are conventional feedback systems with controllers having constant parameters. In this entry we take the pragmatic attitude that an adaptive controller is a controller that can modify its behavior in response to changes in the dynamics of the process and the character of the disturbances, by adjusting the controller parameters.

Adaptive control has had a colorful history with many ups and downs and intense debates in the research community. It emerged in the 1950s, stimulated by attempts to design autopilots for supersonic aircraft. Autopilots based on constant-gain, linear feedback worked well in one operating condition but not over the whole flight envelope. In process control there was also a need for automatic tuning of simple controllers.

Much research in the 1950s and early 1960s contributed to conceptual understanding of adaptive control. Bellman showed that dynamic programming could capture many aspects of adaptation (Bellman 1961). Feldbaum introduced the notion of dual control, meaning that control should be probing as well as directing; the controller should thus inject test signals to obtain better information. Tsypkin showed that schemes for learning and adaptation could be captured in a common framework (Tsypkin 1971).

Gabor's work on adaptive filtering (Gabor et al. 1959) inspired Widrow to develop an analogue neural network (Adaline) for adaptive control (Widrow 1962). Widrow's adaptation mechanism was inspired by Hebbian learning in biological systems (Hebb 1949).

There are adaptive control problems in economics and operations research. In these fields the problems are often called decision making under uncertainty. A simple idea, called the certainty equivalence principle proposed by Simon (1956), is to neglect uncertainty and treat estimates as if they are true. Certainty equivalence was commonly used in early work on adaptive control.
A period of intense research and ample funding ended dramatically in 1967 with a crash of the rocket powered X15-3 using Honeywell's MH-96 self-oscillating adaptive controller. The self-oscillating adaptive control system has, however, been successfully used in several missiles.

Research in adaptive control resurged in the 1970s, when the two schemes the model reference adaptive control (MRAC) and the self-tuning regulator (STR) emerged together with successful applications. The research was influenced by stability theory and advances in the field of system identification. There was an intensive period of research from the late 1970s through the 1990s. The insight and understanding of stability, convergence, and robustness increased. Recently there has been renewed interest because of flight control (Hovakimyan and Cao 2010; Lavretsky and Wise 2013) and other applications; there is, for example, a need for adaptation in autonomous systems.

The Brave Era

Supersonic flight posed new challenges for flight control. Eager to obtain results, there was a very short path from idea to flight test with very little theoretical analysis in between. A number of research projects were sponsored by the US air force. Adaptive flight control systems were developed by General Electric, Honeywell, MIT, and other groups. The systems are documented in the Self-Adaptive Flight Control Systems Symposium held at the Wright Air Development Center in 1959 (Gregory 1959) and the book (Mishkin and Braun 1961).

Whitaker of the MIT team proposed the model reference adaptive controller system, which is based on the idea of specifying the performance of a servo system by a reference. Honeywell proposed a self-oscillating adaptive system (SOAS) which attempted to keep a given gain margin by bringing the system to self-oscillation. The system was flight-tested on several aircraft. It experienced a disaster in a test on the X-15. Combined with the success of gain scheduling based on air data sensors, the interest in adaptive flight control diminished significantly.

There was also interest in adaptation for process control. Foxboro patented an adaptive process controller with a pneumatic adaptation mechanism in 1950 (Foxboro 1950). DuPont had joint studies with IBM aimed at computerized process control. Kalman worked for a short time at the Engineering Research Laboratory at DuPont, where he started work that led to a paper (Kalman 1958), which is the inspiration of the self-tuning regulator. The abstract of that paper has the statement, "This paper examines the problem of building a machine which adjusts itself automatically to control an arbitrary dynamic process," which clearly captures the dream of early adaptive control.

Draper and Li investigated the problem of operating aircraft engines optimally, and they developed a self-optimizing controller that would drive the system towards optimal working conditions. The system was successfully flight-tested (Draper and Li 1966) and initiated the field of extremal control.

Many of the ideas that emerged in the brave era inspired future research in adaptive control. The MRAC, the STR, and extremal control are typical examples.

Model Reference Adaptive Control (MRAC)

The MRAC was one idea from the early work on flight control that had a significant impact on adaptive control. A block diagram of a system with model reference adaptive control is shown in Fig. 1. The system has an ordinary feedback loop with a controller, having adjustable parameters, and the process. There is also a reference model which gives the ideal response $y_m$ to the command signal $u_c$ and a mechanism for adjusting the controller parameters $\theta$. The parameter adjustment is based on the process output $y$, the control signal $u$, and the output $y_m$ of the reference model. Whitaker proposed the following rule for adjusting the parameters:
History of Adaptive Control, Fig. 1 Block diagram of a feedback system with a model reference adaptive controller (MRAC)

$$\frac{d\theta}{dt} = -\gamma e \frac{\partial e}{\partial \theta}, \qquad (1)$$

where $e = y - y_m$ and $\partial e/\partial\theta$ is the sensitivity derivative. Efficient ways to compute the sensitivity derivative were already available in sensitivity theory. The adaptation law (1) became known as the MIT rule.
the MIT rule. flight control was, however, solved by using gain
Experiments and simulations of the model scheduling based on air data sensors and not by
reference adaptive systems indicated that there adaptive control (Stein 1980). The MRAC was
could be problems with instability, in particular also extended to nonlinear systems using back-
if the adaptation gain  in Eq. (1) is large. stepping (Krstić et al. 1993); Lyapunov stability
This observation inspired much theoretical and passivity were essential ingredients in devel-
research. The goal was to replace the MIT oping the algorithm and analyzing its stability.
rule by other parameter adjustment rules with
guaranteed stability; the models used were non
linear continuous time differential equations. The The Self-Tuning Regulator
papers Butchart and Shackcloth (1965) and Parks
(1966) demonstrated that control laws could The self-tuning regulator was inspired by
be obtained using Lyapunov theory. When all steady-state regulation in process control. The
state variables are measured, the adaptation laws mathematical setting was discrete time stochastic
obtained were similar to the MIT rule (1), but systems. A block diagram of a system with a self-
the sensitivity function was replaced by linear tuning regulator is shown in Fig. 2. The system
combinations of states and control variables. has an ordinary feedback loop with a controller
The problem was more difficult for systems and the process. There is an external loop for
that only permitted output feedback. Lyapunov adjusting the controller parameters based on real-
theory could still be used if the process transfer time parameter estimation and control design.
function was strictly positive real, establishing a There are many ways to estimate the process
connection with Popov’s hyper-stability theory parameters and many ways to do the control
(Landau 1979). The assumption of a positive design. Simple schemes do not take parameter
real process is a severe restriction because such uncertainty into account when computing the
systems can be successfully controlled by high- controller parameters invoking the certainty
gain feedback. The difficulty was finally resolved equivalence principle.
by using a scheme called error augmentation Single-input, single-output stochastic systems
(Monopoli 1974; Morse 1980). can be modeled by
History of Adaptive Control, Fig. 2 Block diagram of a feedback system with a self-tuning regulator (STR)

$$y(t) + a_1 y(t-h) + \cdots + a_n y(t-nh) = b_1 u(t-h) + \cdots + b_n u(t-nh) + c_1 w(t-h) + \cdots + c_n w(t-nh) + e(t), \qquad (2)$$

where $u$ is the control signal, $y$ the process output, $w$ a measured disturbance, and $e$ a stochastic disturbance. Furthermore, $h$ is the sampling period and $a_k$, $b_k$, and $c_k$ are the parameters. Parameter estimation is typically done using least squares, and a control design that minimized the variance of the variations was well suited for regulation. A surprising result was that if the estimates converge, the limiting controller is a minimum variance controller even if the disturbance $e$ is colored noise (Åström and Wittenmark 1973). Convergence conditions for the self-tuning regulator were given in Goodwin et al. (1980), and a very detailed analysis was presented in Guo and Chen (1991).
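The estimation step can be made concrete with a small recursive least-squares sketch (illustrative only, not from the original entry) for a first-order instance of model (2) without the measured disturbance, $y(t) + a_1 y(t-h) = b_1 u(t-h) + e(t)$; all numbers are made up.

```python
import numpy as np

# Recursive least squares (RLS) for a first-order case of model (2):
#   y(t) + a1*y(t-h) = b1*u(t-h) + e(t)
# written as y(t) = phi(t)'theta + e(t) with phi = [-y(t-h), u(t-h)].
rng = np.random.default_rng(0)
a1_true, b1_true = -0.8, 0.5

theta = np.zeros(2)                      # parameter estimates (a1, b1)
P = 1e3 * np.eye(2)                      # estimate covariance
y_prev = u_prev = 0.0
for _ in range(2000):
    u = rng.choice([-1.0, 1.0])          # exciting, PRBS-like input
    y = -a1_true * y_prev + b1_true * u_prev + 0.01 * rng.standard_normal()
    phi = np.array([-y_prev, u_prev])
    k = P @ phi / (1.0 + phi @ P @ phi)  # RLS gain vector
    theta = theta + k * (y - phi @ theta)
    P = P - np.outer(k, phi @ P)         # covariance update
    y_prev, u_prev = y, u
# theta is now close to (a1_true, b1_true)
```

A certainty-equivalence self-tuner would, at each step, compute (say) a minimum variance or pole placement control law from the current theta.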
The problem of output feedback does not appear for the model (2) because the sequence of past inputs and outputs $y(t-h), \ldots, y(t-nh), u(t-h), \ldots, u(t-nh)$ is indeed a state, albeit not a minimal state representation. The continuous analogue would be to use derivatives of states and inputs, which is not feasible because of measurement noise. The selection of the sampling period is however important.

Early industrial experience indicated that the ability of the STR to adapt feedforward gains was particularly useful, because feedforward control requires good models.

Insight from system identification showed that excitation is required to obtain good estimates. In the absence of excitation, a phenomenon of bursting could be observed. There could be epochs with small control actions due to insufficient excitation. The estimated parameters then drifted towards values close to or beyond the stability boundary, generating large control actions. Good parameter estimates were then obtained and the system quickly recovered stability. The behavior then repeated in an irregular fashion. There are two ways to deal with the problem. One possibility is to detect when there is poor excitation and stop adaptation (Hägglund and Åström 2000). The other is to inject perturbations when there is poor excitation, in the spirit of dual control.

Robustness and Unification

The model reference adaptive control and the self-tuning regulator originate from different application domains, flight control and process control. The differences are amplified because they are typically presented in different frameworks, continuous time for MRAC and discrete time for the STR. The schemes are, however, not too different. For a given process model and given design criterion the process model can often be re-parameterized in terms of controller parameters, and the STR is then equivalent to an MRAC. Similarly there are indirect MRAC where the process parameters are estimated (Egardt 1979).
A fundamental assumption made in the early analyses of model reference adaptive controllers was that the process model used for analysis had the same structure as the real process. Rohrs at MIT showed that systems with guaranteed convergence could be very sensitive to unmodeled dynamics, which generated a good deal of research to explore robustness to unmodeled dynamics. Averaging theory, which is based on the observation that there are two loops in an adaptive system, a fast ordinary feedback and a slow parameter adjustment loop, turned out to be a key tool for understanding the behavior of adaptive systems. A large body of theory was generated and many books were written (Ioannou and Sun 1995; Sastry and Bodson 1989).

The theory resulted in several improvements of the adaptive algorithms. In the MIT rule (1) and similar adaptation laws derived from Lyapunov theory, the rate of change of the parameters is a multiplication of the error $e$ with other signals in the system. The adaptation rate may then become very large when signals are large. The analysis of robustness showed that there were advantages in avoiding large adaptation rates by normalizing the signals. The stability analysis also required that parameter estimates had to be bounded. To achieve this, parameters were projected on regions given by prior parameter bounds. The projection did, however, require prior process knowledge. The improved insight obtained from the robustness analysis is well described in the books Goodwin and Sin (1984), Egardt (1979), Åström and Wittenmark (1989), Narendra and Annaswamy (1989), Sastry and Bodson (1989), Anderson et al. (1986), and Ioannou and Sun (1995).

Applications

There were severe practical difficulties in implementing the early adaptive controllers using the analogue technology available in the brave era. Kalman used a hybrid computer when he attempted to implement his controller. There were dramatic improvements when mini- and microcomputers appeared in the 1970s. Since computers were still slow at the time, it was natural that most experiments were executed in process control or ship steering, which are slow processes. Advances in computing eliminated the technological barriers rapidly.

Self-oscillating adaptive controllers are used in several missiles. In piloted aircraft there were complaints about the perturbation signals that were always exciting the system.

Self-tuning regulators have been used industrially since the early 1970s. Adaptive autopilots for ship steering were developed at the same time. They outperformed conventional autopilots based on PID control, because disturbances generated by waves were estimated and compensated for. These autopilots are still on the market (Northrop Grumman 2005). Asea (now ABB) developed a small distributed control system, Novatune, which had blocks for self-tuning regulators based on least-squares estimation and minimum variance control. The company First Control, formed by members of the Novatune team, has delivered SCADA systems with adaptive control since 1985. The controllers are used for high-performance process control systems for pulp mills, paper machines, rolling mills, and pilot plants for chemical process control. The adaptive controllers are based on recursive estimation of a transfer function model and a control law based on pole placement. The controller also admits feedforward. The algorithm is provided with extensive safety logic, parameters are projected, and adaptation is interrupted when variations in measured signals and control signals are too small.

The most common industrial uses of adaptive techniques are automatic tuning of PID controllers. The techniques are used both in single loop controllers and in DCS systems. Many different techniques are used, pattern recognition as well as parameter estimation. The relay auto-tuning has proven very useful and has been shown to be very robust because it provides proper excitation of the process automatically. Some of the systems use automatic tuning to automatically generate gain schedules, and they also have adaptation of feedback and feedforward gains (Åström and Hägglund 2005).
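The relay auto-tuner admits a compact illustration (a sketch under standard describing-function assumptions, not taken from this entry): a relay of amplitude $d$ in the loop produces a limit cycle whose amplitude $a$ and period $T_u$ give an estimate of the ultimate gain, $K_u \approx 4d/(\pi a)$, from which classical Ziegler–Nichols PID settings follow. The numbers below are hypothetical.

```python
import math

def zn_pid_from_relay(d, a, Tu):
    """Ziegler-Nichols PID settings from a relay experiment: relay
    amplitude d, measured oscillation amplitude a and period Tu.
    Describing-function estimate of the ultimate gain: Ku = 4d/(pi*a)."""
    Ku = 4.0 * d / (math.pi * a)
    return {'K': 0.6 * Ku, 'Ti': 0.5 * Tu, 'Td': 0.125 * Tu}

# Hypothetical experiment: a relay of amplitude 1 produced a limit cycle
# with amplitude 0.25 and period 8 s.
pid = zn_pid_from_relay(d=1.0, a=0.25, Tu=8.0)
```

Industrial auto-tuners add hysteresis in the relay and use more conservative tuning rules, but the experiment-then-design structure is the same.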
History of Adaptive Control 531

Summary and Future Directions little easier to analyze because the systems do
not have a feedback controller. New adaptive
Adaptive control has had turbulent history with schemes are appearing. The L1 adaptive con-
alternating periods of optimism and pessimism. troller is one example. It inherits features of
This history is reflected in the conferences. When both the STR and the MRAC. The model-free
the IEEE Conference on Decision and Control started in 1962, it included a Symposium on Adaptive Processes, which was discontinued after the 20th CDC in 1981. There were two IFAC symposia on the Theory of Self-Adaptive Control Systems, the first in Rome in 1962 and the second in Teddington in 1965 (Hammond 1966). The symposia were discontinued but reappeared when the Theory Committee of IFAC created a working group on adaptive control chaired by Prof. Landau in 1981. The group brought the communities of control and signal processing together, and a workshop on Adaptation and Learning in Signal Processing and Control (ALCOSP) was created. The first symposium was held in San Francisco in 1983 and the 11th in Caen in 2013.

Adaptive control can give significant benefits: it can deliver good performance over wide operating ranges, and commissioning of controllers can be simplified. Automatic tuning of PID controllers is now widely used in the process industry. Auto-tuning of more general controllers is clearly of interest. Regulation performance is often characterized by the Harris index, which compares actual performance with minimum variance control. Evaluation can be dispensed with by applying a self-tuning regulator.

There are adaptive controllers that have been in operation for more than 30 years, for example, in ship steering and rolling mills. There is a variety of products that use scheduling, MRAC, and STR in different ways. Automatic tuning is widely used; virtually all new single-loop controllers have some form of automatic tuning. Automatic tuning is also used to build gain schedules semiautomatically. The techniques appear in tuning devices, in single-loop controllers, in distributed systems for process control, and in controllers for special applications. There are strong similarities between adaptive filtering and adaptive control. Noise cancellation and adaptive equalization are widespread uses of adaptation. The signal processing applications are a … controller by Fliess and Join (2013) is another example. It is similar to a continuous-time version of the self-tuning regulator.

There is renewed interest in adaptive control in the aerospace industry, both for aircraft and missiles (Lavretsky and Wise 2013). Good results in flight tests have been reported both using MRAC and the recently developed L1 adaptive controller (Hovakimyan and Cao 2010).

Adaptive control is a rich field, and to understand it well, it is necessary to know a wide range of techniques: nonlinear, stochastic, and sampled-data systems, stability, robust control, and system identification.

In the early development of adaptive control, there was a dream of the universal adaptive controller that could be applied to any process with very little prior process knowledge. The insight gained by the robustness analysis shows that knowledge of bounds on the parameters is essential to ensure robustness. With the knowledge available today, adaptive controllers can be designed for particular applications. Design of proper safety nets is an important practical issue. One useful approach is to start with a basic constant-gain controller and provide adaptation as an add-on. This approach also simplifies design of supervision and safety networks.

There are still many unsolved research problems. Methods to determine the achievable adaptation rates are not known. Finding ways to provide proper excitation is another problem. The dual control formulation is very attractive because it automatically generates proper excitation when it is needed. The computations required to solve the Bellman equations are prohibitive, except in very simple cases. The self-oscillating adaptive system, which has been successfully applied to missiles, does provide excitation. The success of the relay auto-tuner for simple controllers indicates that it may be called in to provide excitation of adaptive controllers.

532 History of Adaptive Control

Adaptive control can be an important
component of the emerging autonomous system. One may expect that the current upswing in systems biology may provide more inspiration because many biological systems clearly have adaptive capabilities.

Cross-References

▶ Adaptive Control, Overview
▶ Autotuning
▶ Extremum Seeking Control
▶ Model Reference Adaptive Control
▶ PID Control

Bibliography

Anderson BDO, Bitmead RR, Johnson CR, Kokotović PV, Kosut RL, Mareels I, Praly L, Riedle B (1986) Stability of adaptive systems. MIT Press, Cambridge
Åström KJ, Hägglund T (2005) Advanced PID control. ISA – The Instrumentation, Systems, and Automation Society, Research Triangle Park
Åström KJ, Wittenmark B (1973) On self-tuning regulators. Automatica 9:185–199
Åström KJ, Wittenmark B (1989) Adaptive control. Addison-Wesley, Reading. Second edition 1994; reprinted by Dover 2006
Bellman R (1961) Adaptive control processes – a guided tour. Princeton University Press, Princeton
Butchart RL, Shackcloth B (1965) Synthesis of model reference adaptive control systems by Lyapunov's second method. In: Proceedings of the 1965 IFAC symposium on adaptive control, Teddington
Draper CS, Li YT (1966) Principles of optimalizing control systems and an application to the internal combustion engine. In: Oldenburger R (ed) Optimal and self-optimizing control. MIT Press, Cambridge
Egardt B (1979) Stability of adaptive controllers. Springer, Berlin
Fliess M, Join C (2013) Model-free control. Int J Control 86(12):2228–2252
Foxboro (1950) Control system with automatic response adjustment. US Patent 2,517,081
Gabor D, Wilby WPL, Woodcock R (1959) A universal non-linear filter, predictor and simulator which optimizes itself by a learning process. Proc Inst Electron Eng 108(Part B):1061–
Goodwin GC, Sin KS (eds) (1984) Adaptive filtering prediction and control. Prentice Hall, Englewood Cliffs
Goodwin GC, Ramadge PJ, Caines PE (1980) Discrete-time multivariable adaptive control. IEEE Trans Autom Control AC-25:449–456
Gregory PC (ed) (1959) Proceedings of the self adaptive flight control symposium. Wright Air Development Center, Wright-Patterson Air Force Base, Ohio
Guo L, Chen HF (1991) The Åström–Wittenmark self-tuning regulator revisited and ELS-based adaptive trackers. IEEE Trans Autom Control 36(7):802–812
Hägglund T, Åström KJ (2000) Supervision of adaptive control algorithms. Automatica 36:1171–1180
Hammond PH (ed) (1966) Theory of self-adaptive control systems. In: Proceedings of the second IFAC symposium on the theory of self-adaptive control systems, 14–17 Sept, National Physical Laboratory, Teddington. Plenum, New York
Hebb DO (1949) The organization of behavior. Wiley, New York
Hovakimyan N, Cao C (2010) L1 adaptive control theory. SIAM, Philadelphia
Ioannou PA, Sun J (1995) Stable and robust adaptive control. Prentice-Hall, Englewood Cliffs
Kalman RE (1958) Design of a self-optimizing control system. Trans ASME 80:468–478
Krstić M, Kanellakopoulos I, Kokotović PV (1993) Nonlinear and adaptive control design. Prentice Hall, Englewood Cliffs
Kumar PR, Varaiya PP (1986) Stochastic systems: estimation, identification and adaptive control. Prentice-Hall, Englewood Cliffs
Landau ID (1979) Adaptive control – the model reference approach. Marcel Dekker, New York
Lavretsky E, Wise KA (2013) Robust and adaptive control with aerospace applications. Springer, London
Mishkin E, Braun L (1961) Adaptive control systems. McGraw-Hill, New York
Monopoli RV (1974) Model reference adaptive control with an augmented error signal. IEEE Trans Autom Control AC-19:474–484
Morse AS (1980) Global stability of parameter-adaptive control systems. IEEE Trans Autom Control AC-25:433–439
Narendra KS, Annaswamy AM (1989) Stable adaptive systems. Prentice Hall, Englewood Cliffs
Northrop Grumman (2005) SteerMaster. http://www.srhmar.com/brochures/as/SPERRY%20SteerMaster%20Control%20System.pdf
Parks PC (1966) Lyapunov redesign of model reference adaptive control systems. IEEE Trans Autom Control AC-11:362–367
Sastry S, Bodson M (1989) Adaptive control: stability, convergence and robustness. Prentice-Hall, Englewood Cliffs
Simon HA (1956) Dynamic programming under uncertainty with a quadratic criterion function. Econometrica 24:74–81
Stein G (1980) Adaptive flight control: a pragmatic view. In: Narendra KS, Monopoli RV (eds) Applications of adaptive control. Academic, New York
Tsypkin YaZ (1971) Adaptation and learning in automatic systems. Academic, New York
Widrow B (1962) Generalization and information storage in networks of Adaline neurons. In: Yovits MC et al (eds) Self-organizing systems. Spartan Books, Washington, DC
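The Harris index mentioned in this entry compares the observed output variance with a minimum-variance benchmark estimated from routine operating data. As a rough illustration only (not from this entry), the following sketch estimates one common variant of the index, the ratio of estimated minimum variance to actual variance, using synthetic data; the delay, model order, and data-generating model are all hypothetical choices:

```python
import numpy as np

def harris_index(y, delay, m=10):
    """Estimate a Harris-type minimum-variance performance index.

    Regresses y(t) on m past outputs up to time t - delay; the residual
    variance approximates the minimum variance achievable under the given
    process delay. Returns sigma2_mv / sigma2_y, which equals 1 under
    minimum variance control and is smaller for sluggish regulation.
    """
    y = np.asarray(y, dtype=float) - np.mean(y)
    n = len(y)
    rows = range(delay + m - 1, n)
    # regressors: [y(t-delay), y(t-delay-1), ..., y(t-delay-m+1)]
    X = np.array([y[t - delay - m + 1 : t - delay + 1][::-1] for t in rows])
    Y = y[delay + m - 1 :]
    theta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ theta
    return np.var(resid) / np.var(y)

# Synthetic example: white noise through a first-order disturbance filter,
# i.e., a sluggishly regulated loop with unit delay.
rng = np.random.default_rng(0)
e = rng.standard_normal(5000)
y = np.zeros(5000)
for t in range(1, 5000):
    y[t] = 0.8 * y[t - 1] + e[t]
print(round(harris_index(y, delay=1), 2))  # well below 1: room to improve
```

On white noise (already at minimum variance), the same estimator returns a value close to 1, which is how the index separates well-tuned from poorly tuned loops.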
Hybrid Dynamical Systems, Feedback Control of

Ricardo G. Sanfelice
Department of Computer Engineering, University of California at Santa Cruz, Santa Cruz, CA, USA

Abstract

The control of systems with hybrid dynamics requires algorithms capable of dealing with the intricate combination of continuous and discrete behavior, which typically emerges from the presence of continuous processes, switching devices, and logic for control. Several analysis and design techniques have been proposed for the control of nonlinear continuous-time plants, but little is known about controlling plants that feature truly hybrid behavior. This short entry focuses on recent advances in the design of feedback control algorithms for hybrid dynamical systems. The focus is on hybrid feedback controllers that are systematically designed employing Lyapunov-based methods. The control design techniques summarized in this entry include control Lyapunov function-based control, passivity-based control, and trajectory tracking control.

Keywords

Feedback control; Hybrid control; Hybrid systems; Asymptotic stability

Definition

A hybrid control system is a feedback system whose variables may flow and, at times, jump. Such a hybrid behavior can be present in one or more of the subsystems of the feedback system: in the system to control, i.e., the plant; in the algorithm used for control, i.e., the controller; or in the subsystems needed to interconnect the plant and the controller, i.e., the interfaces/signal conditioners. Figure 1 depicts a feedback system in closed-loop configuration with such subsystems under the presence of environmental disturbances. Due to its hybrid dynamics, a hybrid control system is a particular type of hybrid dynamical system.

Motivation

Hybrid dynamical systems are ubiquitous in science and engineering as they permit capturing the complex and intertwined continuous/discrete behavior of a myriad of systems with variables that flow and jump. The recent popularity of feedback systems combining physical and software components demands tools for stability analysis and control design that can systematically handle such a complex combination. To avoid the issues due to approximating the dynamics of a system, in numerous settings, it is mandatory to keep the system dynamics as pure as possible and to be able to design feedback controllers that can cope with flow and jump behavior in the system.

Modeling Hybrid Dynamical Control Systems

In this entry, hybrid control systems are represented in the framework of hybrid equations/inclusions for the study of hybrid dynamical systems. Within this framework, the continuous dynamics of the system are modeled using a differential equation/inclusion, while the discrete dynamics are captured by a difference equation/inclusion. A solution to such a system can flow over nontrivial intervals of time and jump at certain time instants. The conditions determining whether a solution to a hybrid system should flow or jump are captured by subsets of the state space and input space of the hybrid control system. In this way, a plant with hybrid dynamics can be modeled by the hybrid inclusion
Hybrid Dynamical Systems, Feedback Control of, Fig. 1 A hybrid control system: a feedback system with a plant, controller, and interfaces/signal conditioners (along with environmental disturbances) as subsystems featuring variables that flow and, at times, jump

\[
\mathcal{H}_P :
\begin{cases}
\dot z \in F_P(z,u), & (z,u) \in C_P \\
z^+ \in G_P(z,u), & (z,u) \in D_P \\
y = h_P(z,u)
\end{cases}
\tag{1}
\]

where z is the state of the plant and takes values from the Euclidean space R^{n_P}, u is the input and takes values from R^{m_P}, y is the output and takes values from the output space R^{r_P}, and (C_P, F_P, D_P, G_P, h_P) is the data of the hybrid system. The set C_P is the flow set, the set-valued map F_P is the flow map, the set D_P is the jump set, the set-valued map G_P is the jump map, and the single-valued map h_P is the output map. (This hybrid inclusion captures the dynamics of (constrained or unconstrained) continuous-time systems when D_P = ∅ and G_P is arbitrary. Similarly, it captures the dynamics of (constrained or unconstrained) discrete-time systems when C_P = ∅ and F_P is arbitrary. Note that while the output inclusion does not explicitly include a constraint on (z,u), the output map is only evaluated along solutions.)

Given an input u, a solution to a hybrid inclusion is defined by a state trajectory φ that satisfies the inclusions. Both the input and the state trajectory are functions of (t,j) ∈ R_{≥0} × N := [0,∞) × {0,1,2,...}, where t keeps track of the amount of flow, while j counts the number of jumps of the solution. These functions are given by hybrid arcs and hybrid inputs, which are defined on hybrid time domains. More precisely, hybrid time domains are subsets E of R_{≥0} × N that, for each (T,J) ∈ E, E ∩ ([0,T] × {0,1,...,J}) can be written in the form

\[
\bigcup_{j=0}^{J-1} \left( [t_j, t_{j+1}] \times \{j\} \right)
\]

for some finite sequence of times 0 = t_0 ≤ t_1 ≤ t_2 ≤ ... ≤ t_J. A hybrid arc φ is a function on a hybrid time domain. The set E ∩ ([0,T] × {0,1,...,J}) defines a compact hybrid time domain since it is bounded and closed. The hybrid time domain of φ is denoted by dom φ. A hybrid arc is such that, for each j ∈ N, t ↦ φ(t,j) is absolutely continuous on intervals of flow I^j := {t : (t,j) ∈ dom φ} with nonzero Lebesgue measure. A hybrid input u is a function on a hybrid time domain such that, for each j ∈ N, t ↦ u(t,j) is Lebesgue measurable and locally essentially bounded on the interval I^j.

In this way, a solution to the plant H_P is given by a pair (φ,u) with dom φ = dom u (= dom(φ,u)) satisfying

(S0) (φ(0,0), u(0,0)) ∈ \overline{C}_P or (φ(0,0), u(0,0)) ∈ D_P, and dom φ = dom u;

(S1) For each j ∈ N such that I^j has nonempty interior int(I^j), we have (φ(t,j), u(t,j)) ∈ C_P for all t ∈ int(I^j) and

\[
\frac{d}{dt}\phi(t,j) \in F_P(\phi(t,j), u(t,j)) \quad \text{for almost all } t \in I^j
\]
(S2) For each (t,j) ∈ dom(φ,u) such that (t,j+1) ∈ dom(φ,u), we have

\[
(\phi(t,j), u(t,j)) \in D_P \quad \text{and} \quad \phi(t,j+1) \in G_P(\phi(t,j), u(t,j))
\]

A solution pair (φ,u) to H_P is said to be complete if dom(φ,u) is unbounded and maximal if there does not exist another pair (φ,u)′ such that (φ,u) is a truncation of (φ,u)′ to some proper subset of dom(φ,u)′. A solution pair (φ,u) to H_P is said to be Zeno if it is complete and the projection of dom(φ,u) onto R_{≥0} is bounded.

Input and output modeling remark: At times, it is convenient to define inputs u_c ∈ R^{m_{P,c}} and u_d ∈ R^{m_{P,d}} collecting every component of the input u that affects flows and that affects jumps, respectively. (Some of the components of u can be used to define both u_c and u_d; that is, there could be inputs that affect both flows and jumps.) Similarly, one can define y_c and y_d as the components of y that are measured during flows and jumps, respectively.

To control the hybrid plant H_P in (1), control algorithms that can cope with the nonlinearities introduced by the flow and jump equations/inclusions are required. In general, feedback controllers designed using classical techniques from the continuous-time and discrete-time domains fall short. Due to this limitation, hybrid feedback controllers would be more suitable for the control of plants with hybrid dynamics. Then, following the hybrid plant model above, hybrid controllers for the plant H_P in (1) will be given by the hybrid inclusion

\[
\mathcal{H}_K :
\begin{cases}
\dot\xi \in F_K(\xi,v), & (\xi,v) \in C_K \\
\xi^+ \in G_K(\xi,v), & (\xi,v) \in D_K \\
\zeta = \kappa(\xi,v)
\end{cases}
\tag{2}
\]

where ξ is the state of the controller and takes values from the Euclidean space R^{n_K}, v is the input and takes values from R^{r_P}, ζ is the output and takes values from the output space R^{m_P}, and (C_K, F_K, D_K, G_K, κ) is the data of the hybrid inclusion defining the hybrid controller.

The control of H_P via H_K defines an interconnection through the input/output assignment u = ζ and v = y; the system in Fig. 1 without interfaces represents this interconnection. The resulting closed-loop system is a hybrid dynamical system given in terms of a hybrid inclusion/equation with state x = (z, ξ). We will denote such a closed-loop system by H. Its data can be constructed from the data (C_P, F_P, D_P, G_P, h_P) and (C_K, F_K, D_K, G_K, κ) of each of the subsystems. Solutions to both H_K and H are understood following the notion introduced above.

Definitions and Notions

For convenience, we use the equivalent notation [x^⊤ y^⊤]^⊤ and (x, y) for vectors x and y. Also, we denote by K_∞ the class of functions from R_{≥0} to R_{≥0} that are continuous, zero at zero, strictly increasing, and unbounded.

The dynamics of hybrid inclusions have right-hand sides given by set-valued maps. Unlike functions or single-valued maps, set-valued maps may return a set when evaluated at a point. For instance, at points in C_P, the set-valued flow map F_P of the hybrid plant H_P might return more than one value, allowing for different values of the derivative of z. A particular continuity property of set-valued maps that will be needed later is lower semicontinuity. A set-valued map S from R^n to R^m is lower semicontinuous if for each x ∈ R^n one has that liminf_{x_i → x} S(x_i) ⊇ S(x), where

\[
\liminf_{x_i \to x} S(x_i) = \{ z : \forall x_i \to x,\ \exists z_i \to z \ \text{s.t.}\ z_i \in S(x_i) \}
\]

is the so-called inner limit of S.

A vast majority of control problems consist of designing a feedback algorithm that assures that a function of the solutions to the plant approaches a desired set-point condition (attractivity) and, when close to it, the solutions remain nearby (stability). In some scenarios, the desired set-point condition is not necessarily an isolated point, but rather a set. The problem of designing a hybrid controller H_K for a hybrid plant H_P typically pertains to the stabilization of sets, in
particular, due to the hybrid controller's state including timers that persistently evolve within a bounded time interval and logic variables that take values from discrete sets. Denoting by A the set of points to stabilize for the closed-loop system H and by |·|_A the distance to such a set, the following property captures the typically desired properties outlined above. A closed set A is said to be:

(S) Stable: for each ε > 0 there exists δ > 0 such that each maximal solution φ to H with φ(0,0) = x₀, |x₀|_A ≤ δ satisfies |φ(t,j)|_A ≤ ε for all (t,j) ∈ dom φ.
(A) Attractive: there exists μ > 0 such that every maximal solution φ to H with φ(0,0) = x₀, |x₀|_A ≤ μ is bounded and, if it is complete, satisfies lim_{(t,j) ∈ dom φ, t+j → ∞} |φ(t,j)|_A = 0.
(AS) Asymptotically stable: it is stable and attractive.

The basin of attraction of an asymptotically stable set A is the set of points from where the attractivity property holds. The set A is said to be globally asymptotically stable when the basin of attraction is equal to the entire state space.

A dynamical system with assigned inputs is said to be detectable when its output being held to zero implies that its state converges to the origin. A similar property can be defined for hybrid dynamical systems. For the closed-loop system H, given sets A and K, the distance to A is 0-input detectable relative to K for H if every complete solution φ to H satisfies

\[
\phi(t,j) \in K \ \ \forall (t,j) \in \operatorname{dom}\phi \ \Longrightarrow \ \lim_{(t,j) \in \operatorname{dom}\phi,\ t+j \to \infty} |\phi(t,j)|_{\mathcal{A}} = 0
\]

where "φ(t,j) ∈ K" captures the "output being held to zero" property in the usual detectability notion.

Feedback Control Design for Hybrid Dynamical Systems

Several methods for the design of a hybrid controller H_K rendering a given set asymptotically stable are given below. At the core of these methods are sufficient conditions in terms of Lyapunov functions guaranteeing that the asymptotic stability property defined in section "Definitions and Notions" holds. Some of the methods presented below exploit such sufficient conditions when applied to the closed-loop system H, while others exploit the properties of the hybrid plant to design controllers with a particular structure. The design methods are presented in order of complexity of the controller, namely, from it being a static state-feedback law to being a generic algorithm with true hybrid dynamics.

CLF-Based Control Design

In simple terms, a control Lyapunov function (CLF) is a regular enough scalar function that decreases along solutions to the system for some values of the unassigned input. When such a function exists, it is very tempting to exploit its properties to construct an asymptotically stabilizing control law. Following the ideas from the literature of continuous-time and discrete-time nonlinear systems, we define control Lyapunov functions for hybrid plants H_P and present results on CLF-based control design. For simplicity, as mentioned in the input and output modeling remark in section "Definitions and Notions," we use inputs u_c and u_d instead of u. Also, we restrict the discussion to sets A that are compact as well as hybrid plants with F_P, G_P single valued and such that h_P(z,u) = z. For notational convenience, we use Π to denote the "projection" of C_P and D_P onto R^{n_P}, i.e., Π(C_P) = {z : ∃u_c s.t. (z,u_c) ∈ C_P} and Π(D_P) = {z : ∃u_d s.t. (z,u_d) ∈ D_P}, and the set-valued maps Ψ_c(z) = {u_c : (z,u_c) ∈ C_P} and Ψ_d(z) = {u_d : (z,u_d) ∈ D_P}.

Given a compact set A, a continuously differentiable function V : R^{n_P} → R is a control Lyapunov function for H_P with respect to A if there exist α_1, α_2 ∈ K_∞ and a continuous, positive definite function ρ such that

\[
\alpha_1(|z|_{\mathcal{A}}) \le V(z) \le \alpha_2(|z|_{\mathcal{A}}) \quad \forall z \in \mathbb{R}^{n_P}
\]

\[
\inf_{u_c \in \Psi_c(z)} \langle \nabla V(z), F_P(z,u_c) \rangle \le -\rho(|z|_{\mathcal{A}}) \quad \forall z \in \Pi(C_P) \tag{3}
\]
\[
\inf_{u_d \in \Psi_d(z)} \left[ V(G_P(z,u_d)) - V(z) \right] \le -\rho(|z|_{\mathcal{A}}) \quad \forall z \in \Pi(D_P) \tag{4}
\]

With the availability of a CLF, the set A can be asymptotically stabilized if it is possible to synthesize a controller H_K from inequalities (3) and (4). Such a synthesis is feasible, in particular, for the special case of H_K being a static state-feedback law z ↦ κ(z). Sufficient conditions guaranteeing the existence of such a controller as well as a particular state-feedback law with point-wise minimum norm are given next.

Given a compact set A and a control Lyapunov function V (with respect to A), define, for each r ≥ 0, the set I(r) := {z ∈ R^{n_P} : V(z) ≥ r}. Moreover, for each (z, u_c) and r ≥ 0, define the function

\[
\Gamma_c(z,u_c,r) :=
\begin{cases}
\langle \nabla V(z), F_P(z,u_c) \rangle + \tfrac{1}{2}\rho(|z|_{\mathcal{A}}) & \text{if } (z,u_c) \in C_P \cap (I(r) \times \mathbb{R}^{m_{P,c}}), \\
-\infty & \text{otherwise}
\end{cases}
\]

and, for each (z, u_d) and r ≥ 0, the function

\[
\Gamma_d(z,u_d,r) :=
\begin{cases}
V(G_P(z,u_d)) - V(z) + \tfrac{1}{2}\rho(|z|_{\mathcal{A}}) & \text{if } (z,u_d) \in D_P \cap (I(r) \times \mathbb{R}^{m_{P,d}}), \\
-\infty & \text{otherwise}
\end{cases}
\]

The following result states conditions on the data of H_P guaranteeing that, for each r > 0, there exists a continuous state-feedback law z ↦ κ(z) = (κ_c(z), κ_d(z)) rendering the compact set

\[
\mathcal{A}_r := \{ z \in \mathbb{R}^{n_P} : V(z) \le r \}
\]

asymptotically stable. This property corresponds to a practical version of asymptotic stabilizability.

Theorem 1 Given a hybrid plant H_P = (C_P, F_P, D_P, G_P, h_P), a compact set A, and a control Lyapunov function V for H_P with respect to A, if
(C1) C_P and D_P are closed sets, and F_P and G_P are continuous;
(C2) The set-valued maps Ψ_c(z) = {u_c : (z,u_c) ∈ C_P} and Ψ_d(z) = {u_d : (z,u_d) ∈ D_P} are lower semicontinuous with convex values;
(C3) For every r > 0, we have that, for every z ∈ Π(C_P) ∩ I(r), the function u_c ↦ Γ_c(z,u_c,r) is convex on Ψ_c(z) and that, for every z ∈ Π(D_P) ∩ I(r), the function u_d ↦ Γ_d(z,u_d,r) is convex on Ψ_d(z);
then, for every r > 0, the compact set A_r is asymptotically stabilizable for H_P by a state-feedback law z ↦ κ(z) = (κ_c(z), κ_d(z)) with κ_c continuous on Π(C_P) ∩ I(r) and κ_d continuous on Π(D_P) ∩ I(r).

Theorem 1 assures the existence of a continuous state-feedback law practically asymptotically stabilizing A. However, Theorem 1 does not provide an expression of an asymptotically stabilizing control law. The following result provides an explicit construction of such a control law.

Theorem 2 Given a hybrid plant H_P = (C_P, F_P, D_P, G_P, h_P), a compact set A, and a control Lyapunov function V for H_P with respect to A, if (C1)–(C3) in Theorem 1 hold then, for every r > 0, the state-feedback law pair κ_c : Π(C_P) → R^{m_{P,c}}, κ_d : Π(D_P) → R^{m_{P,d}}, defined on Π(C_P) and Π(D_P) as

\[
\kappa_c(z) := \arg\min \{ |u_c| : u_c \in \mathcal{T}_c(z) \} \quad \forall z \in \Pi(C_P) \cap I(r)
\]

\[
\kappa_d(z) := \arg\min \{ |u_d| : u_d \in \mathcal{T}_d(z) \} \quad \forall z \in \Pi(D_P) \cap I(r)
\]
respectively, renders the compact set A_r asymptotically stable for H_P, where T_c(z) = Ψ_c(z) ∩ {u_c : Γ_c(z, u_c, V(z)) ≤ 0} and T_d(z) = Ψ_d(z) ∩ {u_d : Γ_d(z, u_d, V(z)) ≤ 0}. Furthermore, if the set-valued maps Ψ_c and Ψ_d have a closed graph, then κ_c and κ_d are continuous on Π(C_P) ∩ I(r) and Π(D_P) ∩ I(r), respectively.

The stability properties guaranteed by Theorems 1 and 2 are practical. Under further properties, similar results hold when the input u is not partitioned into u_c and u_d. To achieve asymptotic stability (or stabilizability) of A with a continuous state-feedback law, extra conditions are required to hold near the compact set, which for the case of stabilization of continuous-time systems are the so-called small control properties. Furthermore, the continuity of the feedback law assures that the closed-loop system has closed flow and jump sets as well as continuous flow and jump maps, which, in turn, due to the compactness of A, implies that the asymptotic stability property is robust. Robustness follows from results for hybrid systems without inputs.

Passivity-Based Control Design

Dissipativity and its special case, passivity, provide a useful physical interpretation of a feedback control system as they characterize the exchange of energy between the plant and its controller. For an open system, passivity (in its very pure form) is the property that the energy stored in the system is no larger than the energy it has absorbed over a period of time. The energy stored in a system is given by the difference between the initial and final energy over a period of time, where the energy function is typically called the storage function. Hence, conveniently, passivity can be expressed in terms of the derivative of a storage function (i.e., the rate of change of the internal energy) and the product between inputs and outputs (i.e., the system's power flow). Under further observability conditions, this power inequality can be employed as a design tool by selecting a control law that makes the rate of change of the internal energy negative. This method is called passivity-based control design.

The passivity-based control design method can be employed in the design of a controller for a "passive" hybrid plant H_P, in which energy might be dissipated during flows, jumps, or both. Passivity notions and a passivity-based control design method for hybrid plants are given next. Since the form of the plant's output plays a key role in asserting a passivity property, and this property may not necessarily hold both during flows and jumps, as mentioned in the input and output modeling remark in section "Definitions and Notions," we define outputs y_c and y_d, which, for simplicity, are assumed to be single valued: y_c = h_c(z) and y_d = h_d(z). Moreover, we consider the case when the dimension of the space of the inputs u_c and u_d coincides with that of the outputs y_c and y_d, respectively, i.e., a "duality" of the output and input space.

Given a compact set A and functions h_c, h_d such that h_c(A) = h_d(A) = 0, a hybrid plant H_P for which there exists a continuously differentiable function V : R^{n_P} → R_{≥0} satisfying, for some functions ω_c : R^{m_{P,c}} × R^{n_P} → R and ω_d : R^{m_{P,d}} × R^{n_P} → R,

\[
\langle \nabla V(z), F_P(z,u_c) \rangle \le \omega_c(u_c, z) \quad \forall (z,u_c) \in C_P \tag{5}
\]

\[
V(G_P(z,u_d)) - V(z) \le \omega_d(u_d, z) \quad \forall (z,u_d) \in D_P \tag{6}
\]

is said to be passive with respect to a compact set A if

\[
(u_c, z) \mapsto \omega_c(u_c, z) = u_c^\top y_c \tag{7}
\]

\[
(u_d, z) \mapsto \omega_d(u_d, z) = u_d^\top y_d \tag{8}
\]

The function V is the so-called storage function. If (5) holds with ω_c as in (7), and (6) holds with ω_d ≡ 0, then the system is called flow-passive, i.e., the power inequality holds only during flows. If (5) holds with ω_c ≡ 0, and (6) holds with ω_d as in (8), then the system is called jump-passive, i.e., the energy of the system decreases only during jumps.

Under additional detectability properties, these passivity notions can be used to design static output feedback controllers. The following result gives two design methods for hybrid plants.
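To make the passivity inequalities (5) and (6) concrete, the following sketch numerically checks them for a hypothetical flow-passive hybrid plant: a ball with a force input u_c during flows and dissipative impacts. The model, storage function, output choice, and parameter values are illustrative assumptions, not taken from this entry:

```python
import numpy as np

# Hypothetical hybrid plant (for illustration only):
#   flows:  z1' = z2, z2' = -g + u_c      on C_P = {z : z1 >= 0}
#   jumps:  z1+ = 0,  z2+ = -lam * z2     on D_P = {z : z1 = 0, z2 <= 0}
# Storage V(z) = g*z1 + 0.5*z2**2 (total energy), flow output y_c = z2.
g, lam = 9.81, 0.8
V   = lambda z: g * z[0] + 0.5 * z[1] ** 2
f_P = lambda z, uc: np.array([z[1], -g + uc])   # single-valued flow map
G_P = lambda z: np.array([0.0, -lam * z[1]])    # jump map (no jump input)
h_c = lambda z: z[1]                            # flow output

rng = np.random.default_rng(1)
for _ in range(1000):
    z = np.array([rng.uniform(0, 10), rng.uniform(-10, 10)])  # sample C_P
    uc = rng.uniform(-5, 5)
    gradV = np.array([g, z[1]])
    # inequality (5) with omega_c(u_c, z) = u_c * y_c (here an equality:
    # the gravity terms cancel, so <gradV, f> = u_c * z2)
    assert gradV @ f_P(z, uc) <= uc * h_c(z) + 1e-9
    zj = np.array([0.0, -rng.uniform(0, 10)])                 # sample D_P
    # inequality (6) with omega_d = 0: impacts only dissipate energy
    assert V(G_P(zj)) - V(zj) <= 1e-9
print("flow-passivity inequalities hold on all sampled points")
```

With flow-passivity checked, the natural damping choice u_c = -κ_c(y_c) with, e.g., κ_c(y_c) = k·y_c for some k > 0 satisfies the sign condition y_c^⊤ κ_c(y_c) > 0 used by the passivity-based designs discussed here.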
Theorem 3 Given a hybrid plant H_P = (C_P, F_P, D_P, G_P, h_P) satisfying
(C1′) C_P and D_P are closed sets; F_P and G_P are continuous; and h_c and h_d are continuous;
and a compact set A, we have:
(1) If H_P is flow-passive with respect to A with a storage function V that is positive definite with respect to A and has compact sublevel sets, and if there exists a continuous function κ_c : R^{m_{P,c}} → R^{m_{P,c}} with y_c^⊤ κ_c(y_c) > 0 for all y_c ≠ 0, such that the resulting closed-loop system with u_c = −κ_c(y_c) and u_d ≡ 0 has the following properties:
(1.1) The distance to A is detectable relative to

\[
\left\{ z \in \Pi(C_P) \cup \Pi(D_P) \cup G_P(D_P) : h_c(z)^\top \kappa_c(h_c(z)) = 0,\ (z, -\kappa_c(h_c(z))) \in C_P \right\};
\]

(1.2) Every complete solution φ is such that, for some δ > 0 and some J ∈ N, we have t_{j+1} − t_j ≥ δ for all j ≥ J;
then the control law u_c = −κ_c(y_c), u_d ≡ 0 renders A globally asymptotically stable.
(2) If H_P is jump-passive with respect to A with a storage function V that is positive definite with respect to A and has compact sublevel sets, and if there exists a continuous function κ_d : R^{m_{P,d}} → R^{m_{P,d}} with y_d^⊤ κ_d(y_d) > 0 for all y_d ≠ 0, such that the resulting closed-loop system with u_c ≡ 0 and u_d = −κ_d(y_d) has the following properties:
(2.1) The distance to A is detectable relative to

\[
\left\{ z \in \Pi(C_P) \cup \Pi(D_P) \cup G_P(D_P) : h_d(z)^\top \kappa_d(h_d(z)) = 0,\ (z, -\kappa_d(h_d(z))) \in D_P \right\};
\]

(2.2) Every complete solution φ is Zeno;
then the control law u_d = −κ_d(y_d), u_c ≡ 0 renders A globally asymptotically stable.

Strict passivity notions can also be formulated for hybrid plants, including the special cases where the power inequalities hold only during flows or jumps. In particular, strict passivity and output strict passivity can be employed to assert asymptotic stability with zero inputs.

Tracking Control Design

While numerous control problems pertain to the stabilization of a set-point condition, at times, it is desired to stabilize the solutions to the plant to a time-varying trajectory. In this section, we consider the problem of designing a hybrid controller H_K for a hybrid plant H_P to track a given reference trajectory r (a hybrid arc). The notion of tracking is introduced below. We propose sufficient conditions that general hybrid plants and controllers should satisfy to solve such a problem. For simplicity, we consider tracking of state trajectories and assume that the hybrid controller can measure both the state of the plant z and the reference trajectory r; hence, v = (z, r).

The particular approach used here consists of recasting the tracking control problem as a set stabilization problem for the closed-loop system H. To do this, we embed the reference trajectory r into an augmented hybrid model for which it is possible to define a set capturing the condition that the plant tracks the given reference trajectory. This set is referred to as the tracking set. More precisely, given a reference r : dom r → R^{n_P}, we define the set T_r collecting all of the points (t,j) in the domain of r at which r jumps, that is, every point (t_j^r, j) ∈ dom r such that (t_j^r, j+1) ∈ dom r. Then, the state of the closed loop H is augmented by the addition of states τ ∈ R_{≥0} and k ∈ N. The dynamics of the states τ and k are such that τ counts elapsed flow time, while k counts the number of jumps of H; hence, during flows τ̇ = 1 and k̇ = 0, while at jumps τ⁺ = τ and k⁺ = k + 1. These new states are used to parameterize the given reference trajectory r, which is employed in the definition of the tracking set

\[
\mathcal{A} = \left\{ (z, \xi, \tau, k) \in \mathbb{R}^{n_P} \times \mathbb{R}^{n_K} \times \mathbb{R}_{\ge 0} \times \mathbb{N} : z = r(\tau, k),\ \xi \in \Phi_K \right\} \tag{9}
\]

This set is the target set to be stabilized for H. The set Φ_K ⊆ R^{n_K} in the definition of A is some closed set capturing the set of points asymptotically approached by the controller's state ξ.

The following result establishes a sufficient condition for stabilization of the tracking set
A. For notational convenience, we define x = (z, ξ, τ, k) and

\[
C = \left\{ x : (z, \kappa_c(\xi, z, r(\tau,k))) \in C_P,\ \tau \in [t_k^r, t_{k+1}^r],\ (\xi, z, r(\tau,k)) \in C_K \right\}
\]

\[
F(z, \xi, \tau, k) = \left( F_P(z, \kappa_c(\xi, z, r(\tau,k))),\ F_K(\xi, z, r(\tau,k)),\ 1,\ 0 \right)
\]

\[
D = D_1 \cup D_2, \quad
D_1 = \left\{ x : (z, \kappa_c(\xi, z, r(\tau,k))) \in D_P,\ (\tau, k) \in T_r \right\},
\]
\[
D_2 = \left\{ x : \tau \in [t_k^r, t_{k+1}^r),\ (\xi, z, r(\tau,k)) \in D_K \right\}
\]

\[
G_1(z, \xi, \tau, k) = \left( G_P(z, \kappa_c(\xi, z, r(\tau,k))),\ \xi,\ \tau,\ k+1 \right),
\]
\[
G_2(z, \xi, \tau, k) = \left( z,\ G_K(\xi, z, r(\tau,k)),\ \tau,\ k \right)
\]

Theorem 4 Given a complete reference trajectory r : dom r → R^{n_P} and associated tracking set A in (9), if there exists a hybrid controller H_K guaranteeing that:
(1) The jumps of r and H_P occur simultaneously;
(2) There exist a function V : R^{n_P} × R^{n_K} × R_{≥0} × N → R that is continuously differentiable; functions α_1, α_2 ∈ K_∞; and continuous, positive definite functions ρ_1, ρ_2, ρ_3 such that
(a) For all (z, ξ, τ, k) ∈ C ∪ D ∪ G_1(D) ∪ G_2(D),

\[
\alpha_1(|(z,\xi,\tau,k)|_{\mathcal{A}}) \le V(z,\xi,\tau,k) \le \alpha_2(|(z,\xi,\tau,k)|_{\mathcal{A}})
\]

(b) For all (z, ξ, τ, k) ∈ C and all η ∈ F(z, ξ, τ, k),

\[
\langle \nabla V(z,\xi,\tau,k), \eta \rangle \le -\rho_1(|(z,\xi,\tau,k)|_{\mathcal{A}})
\]

(c) For all (z, ξ, τ, k) ∈ D_1 and all η ∈ G_1(z, ξ, τ, k),

\[
V(\eta) - V(z,\xi,\tau,k) \le -\rho_2(|(z,\xi,\tau,k)|_{\mathcal{A}})
\]

(d) For all (z, ξ, τ, k) ∈ D_2 and all η ∈ G_2(z, ξ, τ, k),

\[
V(\eta) - V(z,\xi,\tau,k) \le -\rho_3(|(z,\xi,\tau,k)|_{\mathcal{A}})
\]

then A is globally asymptotically stable.

Theorem 4 imposes that the jumps of the plant and of the reference trajectory occur simultaneously. Though restrictive, at times, this property can be enforced by proper design of the controller.

Summary and Future Directions

Advances over the last decade on modeling and robust stability of hybrid dynamical systems (without control inputs) have paved the road for the development of systematic methods for the design of control algorithms for hybrid plants. The results selected for this short expository entry, along with recent efforts on multimode/logic-based control, event-based control, and backstepping, which were not covered here, contribute to that long-term goal. Future research directions include the development of more powerful tracking control design methods, state observers, and optimal controllers for hybrid plants.

Cross-References

▶ Lyapunov's Stability Theory
▶ Output Regulation Problems in Hybrid Systems
▶ Stability Theory for Hybrid Dynamical Systems

Bibliography

Set-Valued Dynamics and Variational Analysis:
Aubin J-P, Frankowska H (1990) Set-valued analysis. Birkhäuser, Boston
Rockafellar RT, Wets RJ-B (1998) Variational analysis. Springer, Berlin/Heidelberg

Modeling and Stability:
Branicky MS (2005) Introduction to hybrid systems. In: Handbook of networked and embedded control systems. Springer, New York, pp 91–116
Haddad WM, Chellaboina V, Nersesov SG (2006) Impulsive and hybrid dynamical systems: stability,
dissipativity, and control. Princeton University Press, Princeton
Goebel R, Sanfelice RG, Teel AR (2012) Hybrid dynamical systems: modeling, stability, and robustness. Princeton University Press, Princeton
Lygeros J, Johansson KH, Simić SN, Zhang J, Sastry SS (2003) Dynamical properties of hybrid automata. IEEE Trans Autom Control 48(1):2–17
van der Schaft A, Schumacher H (2000) An introduction to hybrid dynamical systems. Lecture notes in control and information sciences. Springer, London

Control:
Biemond JJB, van de Wouw N, Heemels WPMH, Nijmeijer H (2013) Tracking control for hybrid systems with state-triggered jumps. IEEE Trans Autom Control 58(4):876–890
Forni F, Teel AR, Zaccarian L (2013) Follow the bouncing ball: global results on tracking and state estimation with impacts. IEEE Trans Autom Control 58(6):1470–1485
Lygeros J (2005) An overview of hybrid systems control. In: Handbook of networked and embedded control systems. Springer, New York, pp 519–538
Naldi R, Sanfelice RG (2013) Passivity-based control for hybrid systems with applications to mechanical systems exhibiting impacts. Automatica 49(5):1104–1116
Sanfelice RG (2013a) On the existence of control Lyapunov functions and state-feedback laws for hybrid systems. IEEE Trans Autom Control 58(12):3242–3248
Sanfelice RG (2013b) Control of hybrid dynamical systems: an overview of recent advances. In: Daafouz J, Tarbouriech S, Sigalotti M (eds) Hybrid systems with constraints. Wiley, Hoboken, pp 146–177
Sanfelice RG, Biemond JJB, van de Wouw N, Heemels WPMH (2013, to appear) An embedding approach for the design of state-feedback tracking controllers for references with jumps. Int J Robust Nonlinear Control

Hybrid Observers

Daniele Carnevale
Dipartimento di Ing. Civile ed Ing. Informatica, Università di Roma "Tor Vergata", Roma, Italy

Abstract

In the first part of this entry, two consolidated hybrid observer designs for non-hybrid systems are presented. In the second part, recent results available in the literature related to observability and observer design for different classes of hybrid systems are introduced.

Keywords

Hybrid systems; Observer design; Observability; Switching systems

Introduction

Observer design, which aims to estimate the unmeasured plant state, has received a lot of attention since the late 1960s. One of the first leading contributions to clearly formalize the estimation problem and propose a solution in the linear case is due to Luenberger (1966). The recipe to implement a Luenberger-type observer for a continuous-time linear system described by

  ẋ = Ax + Bu,  y = Cx + Du,   (1)

with x ∈ Rⁿ, u ∈ R^p, y ∈ R^m, A ∈ R^{n×n}, B ∈ R^{n×p}, C ∈ R^{m×n}, and D ∈ R^{m×p}, has three main ingredients: the system data, the correction term commonly referred to as output injection, and the observability/detectability/determinability conditions. A Luenberger-type observer for (1), which consists of a copy of the (system data) dynamics (1) with a linear correction term L(y − ŷ), is given by

  dx̂/dt = Ax̂ + Bu + L(y − ŷ),  ŷ = Cx̂ + Du,   (2)

with L ∈ R^{n×m} and where x̂ is the estimated value of x. The estimation error e = x − x̂ satisfies the differential equation ė = (A − LC)e with initial condition e(0) = x(0) − x̂(0). Since the observer has a copy of the plant dynamics and the correction term is L(y − ŷ) = LCe, the zero-estimation-error manifold x = x̂ is invariant (if x(0) = x̂(0), then e(t) ≡ 0 for all t ≥ 0), whereas its attractivity (yielding global exponential stability of the estimation error system) requires A − LC to be Hurwitz. Such an L, if A
is not already Hurwitz, exists if the pair (A, C) is detectable or (sufficient condition) observable.

The observer in (2) exploits the injection term only in the continuous-time dynamics (flow map), and one may ask how profitable resets of the observer state (a jump map) could be, i.e., designing a hybrid observer. The observer design for hybrid systems is a relatively new area of research, and results are consolidated only for a few classes of linear hybrid systems.

In section "Continuous-Time Plants," a hybrid redesign of the observer (2) is discussed first, and then a more general design for nonlinear systems is introduced, whereas in section "Systems with Flows and Jumps" recent results related to observability and observer designs for hybrid systems are discussed. Conclusions are given in section "Summary and Future Directions."

Hybrid Observers: Different Strategies

The community of researchers working on hybrid observers, which is a quite recent area and the subject of growing interest, is wide, and a unique formal definition/notation has not been reached yet. This fact is strictly related to the large number of different hybrid system models currently adopted by researchers. To render this short presentation as simple as possible, we let the state x(t) of a hybrid system be driven by the flow map (differential equation) when t ≠ t_j and by the jump map (difference equation) when t = t_j, with x(t) right continuous, i.e., lim_{t→t_j⁺} x(t) = x(t_j).

Continuous-Time Plants

Linear Case
A simple strategy to improve convergence to zero of the estimation error for (1) has been proposed in Raff and Allgower (2008) and consists in resetting the observer state x̂, at predetermined fixed time intervals t_j, by means of the linear correction term K(t)(y(t) − Cx̂(t)) at jump times, yielding

  dx̂(t)/dt = Ax̂(t) + Bu(t) + L(y(t) − Cx̂(t)),   (3a)

  x̂(t_j) = x̂(t_j⁻) + K(t_j)(y(t_j) − Cx̂(t_j⁻)),   (3b)

where t₀ = 0, t_{j+1} − t_j = T > 0, j ∈ N_{≥1}, and T is a parameter that defines the interval between resets and has to be chosen such that

  Im(λ_p − λ_r) T ≠ 2πr,  r ∈ Z\{0},   (4)

for each pair (λ_p, λ_r) of complex eigenvalues of the matrix A − LC. This preserves the (continuous-time, or flow) observability of the system (1) when sampled at the time instants t_j and allows one to select a matrix K₀ such that (I − K₀C) exp((A − LC)T) has all its eigenvalues at zero. Then, the estimation error e(t) converges to zero in finite time (nT) if (1) is observable and the matrix K(t) : R_{≥0} → R^{n×m} is selected such that K(t) = K₀ if t ≤ t_n and K(t) = 0 otherwise. It is important to note that the state reset (3b) yields a hybrid estimation error system given by

  ė(t) = (A − LC) e(t),   (5a)

  e(t_j) = (I − K(t_j)C) e(t_j⁻).   (5b)

The stability property of the origin can be easily deduced by noting that

  e(t_j) = ∏_{k=1}^{j} [(I − K(t_k)C) exp((A − LC)T)] e(0),

and given that (I − K₀C) exp((A − LC)T) is nilpotent, then e(t_n) = e(nT) = 0.
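The finite-time mechanism behind (3b) and (5b) can be checked numerically. The sketch below is illustrative only; the double-integrator plant and the values of L and T are assumptions made for the example, not data from the entry. It builds M = exp((A − LC)T), computes a deadbeat reset gain K₀ that makes (I − K₀C)M nilpotent, and verifies that the error is annihilated after n = 2 resets.

```python
# Illustrative check of the reset observer (3a)-(3b): with
# N = (I - K0 C) expm((A - L C) T) nilpotent, the error is zero after n*T.
# A, C, L, T below are example choices, not values from the entry.
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator
C = np.array([[1.0, 0.0]])               # position measurement
L = np.array([[3.0], [2.0]])             # A - L C Hurwitz (eigenvalues -1, -2)
T = 0.5                                  # reset period; condition (4) holds
                                         # trivially since the eigenvalues are real
M = expm((A - L @ C) * T)                # error flow between two resets

# Deadbeat gain K0: for C = [1 0], zero determinant forces K0[0] = 1 and
# zero trace forces K0[1] = M[1,1] / M[0,1], making N nilpotent (n = 2).
K0 = np.array([[1.0], [M[1, 1] / M[0, 1]]])
N = (np.eye(2) - K0 @ C) @ M

e = np.array([[1.0], [-2.0]])            # arbitrary initial estimation error
for _ in range(2):                       # two flow-plus-reset cycles
    e = N @ e                            # e(t_{j+1}) = (I - K0 C) M e(t_j)
print(np.allclose(e, 0))                 # prints True: error gone after 2T
```

The same construction generalizes to any observable pair (A, C): K₀ is chosen so that all eigenvalues of (I − K₀C)M are at zero, as stated in the text.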
The potential benefit of hybrid observers in improving the performance of classical continuous-time observers is a relatively new area of research. Along this line, the recent work
proposed in Prieur et al. (2012) allows one to limit the peaking phenomenon for a class of high-gain observers by opportunely resetting their (augmented) state. Moreover, when the output of (1) is a nonlinear function of the state, y = h(x), with h(·) not invertible (e.g., the saturation function), it may be possible to rewrite (1) as a hybrid system with a linear flow map and an augmented state, designing a hybrid observer as in Carnevale and Astolfi (2009).

Nonlinear Case
When the input of a continuous-time plant is piecewise constant, the hybrid observer proposed in Moraal and Grizzle (1995), exploiting sampled measurements, can be successfully applied to a class of nonlinear continuous (or discrete-time) systems

  ẋ = f(x(t), u(t)),  y(t) = h(x(t), u(t)),   (6)

with sufficiently smooth maps f(·,·) and h(·,·) and where

  x(t_j) = F(x(t_{j−1}), u(t_{j−1}))   (7)

is the sampled-data (discrete-time) version of (6) with sampling time T = t_j − t_{j−1}. Then, it is possible to define a hybrid observer of the following type:

  dx̂(t)/dt = f(x̂(t), u(t)),   (8a)

  x̂(t_j) = Ξ(y(t_j), x̂(t_j⁻), ξ(t_j)),   (8b)

where the reset map Ξ and the dynamics of the new variable ξ(t) have to be properly defined. The main idea in Moraal and Grizzle (1995) is that the Newton method, in continuous and discrete time, can be used to estimate the value of ξ that renders zero the function

  W_j^N(ξ) = Y_j^N − H(ξ, U_j^N),   (9)

where U_j^N = [u′(t_{j−N+1}), …, u′(t_j)]′ and Y_j^N = [y′(t_{j−N+1}), …, y′(t_j)]′ are the sampled input and output vectors, respectively, and H : Rⁿ × R^{mN} → R^N maps the state x(t_j) and the N-tuple of control inputs U_j^N into the output vector Y_j^N, i.e., H(x(t_j), U_j^N) = Y_j^N, and is defined as

  H(x, U_j^N) = [ h(F⁻¹(F⁻¹(…), u(t_{j−N+1})), u(t_{j−N+1}));
                  …;
                  h(F⁻¹(x, u(t_{j−1})), u(t_{j−1}));
                  h(x, u(t_j)) ],   (10)

where F⁻¹ shortly represents the inverse of the map F, such that x(t_{j−1}) = F⁻¹(x(t_j), u(t_{j−1})). The system (6)–(7) is said to be N-observable, for some N ≥ 1 (the generic selection is N = 2n + 1), when W_j^N(ξ) = 0 holds only if ξ = x(t_j), uniformly in U_j^N. Then, under certain technical assumptions (see Moraal and Grizzle 1995) related to the derivatives of f and h and the invertibility of the Jacobian matrix J(x) = ∂H(x)/∂x, it is possible to select

  dξ(t)/dt = k J(ξ(t))⁻¹ (Y_j^N − H(ξ(t), U_j^N)),   (11a)

  ξ(t_j) = F(ξ(t_j⁻), u(t_{j−1})),   (11b)

with a sufficiently high gain k > 0 and the reset Ξ(·) = F(ξ(t_j⁻), u(t_{j−1})). Note that (11a) is commonly referred to as the Newton flow.
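A scalar sketch of the Newton flow (11a) may help fix ideas. Everything below is an illustrative assumption rather than material from the entry: a toy monotone observability map h plays the role of H (so its Jacobian is always invertible), and the gain k and Euler step size are arbitrary choices. Along the flow, W(ξ) = Y − H(ξ) decays as exp(−kt), so ξ converges to the true state.

```python
# Euler integration of the scalar Newton flow
#   d(xi)/dt = k * J(xi)^(-1) * (Y - H(xi)),
# with a toy map h(x) = x^3 + x standing in for the observability map H.
# The gain k, step size dt, and horizon are illustrative assumptions.
import numpy as np

h = lambda x: x**3 + x          # toy output map H(xi), strictly monotone
J = lambda x: 3 * x**2 + 1      # Jacobian dH/d(xi), never zero here

x_true = 1.3
Y = h(x_true)                   # "measured" stacked output (N = 1)

xi, k, dt = -2.0, 5.0, 1e-3     # poor initial guess, high gain, Euler step
for _ in range(5000):           # integrate over t in [0, 5]
    xi += dt * k * (Y - h(xi)) / J(xi)   # one Euler step of (11a)

# Since dW/dt = -k W along the flow, W decays like exp(-k t).
print(round(xi, 6))             # prints 1.3: estimate reaches x_true
```

Replacing the right-hand side with a normalized gradient or secant-based update gives the variants mentioned next in the text.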
This approach could be easily extended to other
continuous-time minimization algorithms (normalized gradient, line search, etc.) by changing the right-hand side of (11a), or even with discrete-time methods iterated at higher frequency within the sampling time T, yielding faster convergence to zero of the estimation error.

The same approach can be used when a continuous-time observer for (6) is considered in place of (8a), and the Newton-based resets can be used to possibly improve the performance. The continuous- and discrete-time Newton algorithms require knowledge of the jump map F to define (7), i.e., the exact discrete-time model of (6), and of the Jacobian matrix J(x) = ∂H(x)/∂x. An approach that does not require such knowledge is proposed in Biyik and Arcak (2006), where continuous-time filters and the secant method allow one to estimate (numerically) the map F and the Jacobian matrix, or in Sassano et al. (2011), where an extremum-seeking-based technique is considered.

A different approach to estimating the state of a continuous-time plant, pursued for example in Ahrens and Khalil (2009) and Liu (1997), exploits switching output injections, letting the correction term l(·) switch among opportune values selected by a suitable definition (often derived from a Lyapunov-based proof) of the switching signal σ(t). These switching gains allow one to improve observer performance and robustness against measurement noise and model uncertainties.

Systems with Flows and Jumps
The classical notion of observability does not hold for hybrid systems. As an example, consider the autonomous linear hybrid system described by ẋ(t) = Ax(t) and x(t_j) = Jx(t_j⁻) with

  A = [0 0 0; 0 0 1; 0 0 0],  J = [0 0 1; 0 1 0; 1 0 0],   (12)

and C = [0, 1, 0]. Evidently the flow is not observable in the classic sense, given that O_flow = [C′, (CA)′, (CA²)′]′ is not full rank, and the flow-unobservable subspace is ker(O_flow) = {x ∈ R³ : x₂ = x₃ = 0}. Nevertheless, in the first flow time interval Δ = t₁ − t₀, it is possible to estimate (e.g., in finite time using the observability Gramian matrix) the initial conditions (x₂(t₀), x₃(t₀)). Then, when the first jump takes place at time t₁, thanks to the structure of the jump map J, which resets the value of x₃(t₁) with the flow-unobservable x₁(t₁⁻), it is possible to estimate in the next flow time interval the value of x₁(t₁⁻), so that the initial condition x(t₀) can be completely determined. The hybrid observability matrix in this case has the following expression:

  O_hybrid = [O′_flow, (O_flow J e^{AT₁})′, (O_flow (J e^{AT₂})²)′]′

and is full rank for all T_j = t_j − t_{j−1} that satisfy (4). Note that, from a practical point of view, in this case the time interval that allows one to reconstruct the complete state is [t₀, t₁ + ε), since the observer needs at least an ε time of new measurements (after the first jump) to evaluate the full state. This simple example suggests that (impulsive) hybrid systems might have a richer notion of observability than the classical ones.
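The rank tests in this example are easy to reproduce. The sketch below mirrors the structure of the hybrid observability matrix for (12); the jump times T₁ = T₂ = 0.5 are illustrative choices (any values satisfying (4) would do), and the matrix exponential simplifies to I + AT because A is nilpotent here.

```python
# Rank test for the example (12): the flow pair (A, C) alone is not
# observable, but stacking measurements across jumps through J e^{A T}
# restores full rank. T1 and T2 are illustrative choices.
import numpy as np

A = np.array([[0, 0, 0],
              [0, 0, 1],
              [0, 0, 0]], dtype=float)
J = np.array([[0, 0, 1],
              [0, 1, 0],
              [1, 0, 0]], dtype=float)
C = np.array([[0, 1, 0]], dtype=float)

O_flow = np.vstack([C, C @ A, C @ A @ A])    # classic observability matrix
print(np.linalg.matrix_rank(O_flow))          # prints 2: x1 is flow-unobservable

T1 = T2 = 0.5
E1 = np.eye(3) + A * T1                       # e^{A T} = I + A T since A^2 = 0
E2 = np.eye(3) + A * T2
O_hybrid = np.vstack([O_flow,
                      O_flow @ (J @ E1),
                      O_flow @ (J @ E2) @ (J @ E2)])
print(np.linalg.matrix_rank(O_hybrid))        # prints 3: full rank across jumps
```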
These properties have been studied also for mechanical systems subject to non-smooth impacts in Martinelli et al. (2004), where a high-gain-like observer design has been proposed assuming knowledge of the impact times t_i, no Zeno phenomena (no finite accumulation point for the t_j's), and a minimum dwell time, t_{j+1} − t_j ≥ δ > 0. With the aforementioned assumptions, and considering the more general class of hybrid systems described by

  ẋ(t) = f(x, u),  x(t_j) = g(x(t_j⁻), u(t_j)),   (13)

with y = h(x, u), a frequent choice is to consider a hybrid observer of the form

  dx̂(t)/dt = f(x̂, u) + l(y, x̂, u),   (14a)
  x̂(t_j) = g(x̂(t_j⁻), u(t_j)) + m(x̂(t_j⁻), u(t_j)),   (14b)

with l(·) and m(·) zero when x̂ = x, rendering the manifold x̂ = x flow- and jump-invariant, and relying only on the correction term l(·) (m ≡ 0) in a high-gain-like design during the flow. The correction during the flow has to recover, within the minimum dwell time δ, the worst deterioration of the estimation error induced by the jumps (if any) and the transients, such that |e(t_{j+1})| < |e(t_j)|, or V(e(t_{j+1})) < V(e(t_j)) if a Lyapunov analysis is considered. This type of observer design, with m ≡ 0 and the linear choice l(y, x̂, u) = L(y − Mx̂), has been proposed in Heemels et al. (2011) for linear complementarity systems (LCS) in the presence of state jumps induced by impulsive inputs. Therein, solutions of LCS are characterized by means of piecewise Bohl distributions and specially defined well-posedness and low-index properties, which, combined with passivity-based arguments, allow the design of a global hybrid observer with exponential convergence. A separation principle to design an output feedback controller is also proposed.

An interesting approach is pursued in Forni et al. (2013), where global output tracking results for a class of linear hybrid systems subject to impacts are introduced. Therein, the key ingredient is the definition of a "mirrored" tracking reference (a change of coordinates) that depends on the sequence of different jumps between the desired trajectory (a virtual bouncing ball) and the plant (the controlled ball). Exploiting this (time-varying) change of coordinates and assuming that the impact times are known, it is possible to define an estimation error that is not discontinuous even when the tracked ball has a bounce (state jump) and the plant does not. A time regularization is included in the model, embedding a minimum dwell time among jumps. In this way, it is possible to design a linear hybrid observer represented by (14) with a linear (mirrored) term l(·) and m(·) ≡ 0, proving (by standard quadratic Lyapunov functions) that the origin of the estimation error system is GES. In this case, the standard observability condition for the pair (A, C) is required.

Switching Systems and Hybrid Automata
Switching systems and hybrid automata have been the subject of intense study by many researchers in the last two decades. For these classes of systems, there is a neat separation x = [z, q]′ between the purely discrete-time state q (the switching signal or system mode) and the rest of the state z, which generically can both flow and jump. The observability of the entire system is often divided into the problem of determining the switching signal q first and then z. The switching signals can be divided into two categories: arbitrary (universal problems) or specific (existential problems) switchings.

In Vidal et al. (2003) the observability of autonomous linear switched systems with no state jumps, a minimum dwell time, and an unknown switching signal is analyzed. Necessary and sufficient observability conditions based on rank tests and output-discontinuity detection strategies are given. Along the same line, the results are extended in Babaali and Pappas (2005) to non-autonomous switched systems with non-Zeno solutions and without the minimum dwell-time requirement, providing state z and mode q observability characterized by linear-algebraic conditions.

Luenberger-type observers with two distinct gain matrices L₁ and L₂ are proposed for bimodal piecewise linear systems in Juloski et al. (2007) (where state jumps are considered), whereas recently in Tanwani et al. (2013) algebraic observability conditions and observer designs are proposed for switched linear systems admitting state jumps with a known switching signal (although some asynchronism between the observer and the plant switches is allowed). Results related to the observability of hybrid automata, which include switching systems, can be found in Balluchi et al. (2002) and the related references. Therein, the location observer first estimates the system's current location q, processing the system input and output and assuming that it is current-location observable, a property that
is related to the system's current-location observation tree. This graph is iteratively explored at each new input to determine the node associated with the current value of q(t). Then, a linear (switched) Luenberger-type observer for the estimation of the state z, assuming a minimum dwell time and observability of each pair (A_q, C_q), is proposed.

Summary and Future Directions

Observer design and observability properties of general hybrid systems are an active field of research, and a number of different results have been proposed, although they are not as consolidated as for classical linear systems. The results are based on different notations and definitions for hybrid systems. Efforts to provide a unified approach, in many cases considering the general framework for hybrid systems proposed in Goebel et al. (2009), are being pursued by the scientific community to improve the consistency and cohesion of the general results. Observer designs, observability properties, and separation principles, even with linear flow and jump maps, are not yet completely characterized and, in the nonlinear case, only a few works have been proposed (see Teel (2010)), providing open challenges for the scientific community.

Cross-References

▶ Hybrid Dynamical Systems, Feedback Control of
▶ Observer-Based Control
▶ Observers for Nonlinear Systems
▶ Observers in Linear Systems Theory

Bibliography

Ahrens JH, Khalil HK (2009) High-gain observers in the presence of measurement noise: a switched-gain approach. Automatica 45(5):936–943
Babaali M, Pappas GJ (2005) Observability of switched linear systems in continuous time. In: Morari M, Thiele L (eds) Hybrid systems: computation and control. Volume 3414 of lecture notes in computer science. Springer, Berlin/Heidelberg, pp 103–117
Balluchi A, Benvenuti L, Benedetto MDD, Vincentelli ALS (2002) Design of observers for hybrid systems. In: Hybrid systems: computation and control, Stanford, vol 2289
Biyik E, Arcak M (2006) A hybrid redesign of Newton observers in the absence of an exact discrete-time model. Syst Control Lett 55(8):429–436
Branicky MS (1998) Multiple Lyapunov functions and other analysis tools for switched and hybrid systems. IEEE Trans Autom Control 43(5):475–482
Carnevale D, Astolfi A (2009) Hybrid observer for global frequency estimation of saturated signals. IEEE Trans Autom Control 54(13):2461–2464
Forni F, Teel A, Zaccarian L (2013) Follow the bouncing ball: global results on tracking and state estimation with impacts. IEEE Trans Autom Control 58(6):1470–1485
Goebel R, Sanfelice R, Teel AR (2009) Hybrid dynamical systems. IEEE Control Syst Mag 29:28–93
Heemels WPMH, Camlibel MK, Schumacher J, Brogliato B (2011) Observer-based control of linear complementarity systems. Int J Robust Nonlinear Control 21(13):1193–1218. Special issue on hybrid systems
Juloski AL, Heemels WPMH, Weiland S (2007) Observer design for a class of piecewise linear systems. Int J Robust Nonlinear Control 17(15):1387–1404
Liu Y (1997) Switching observer design for uncertain nonlinear systems. IEEE Trans Autom Control 42(12):1699–1703
Luenberger DG (1966) Observers for multivariable systems. IEEE Trans Autom Control 11:190–197
Martinelli F, Menini L, Tornambè A (2004) Observability, reconstructibility and observer design for linear mechanical systems unobservable in absence of impacts. J Dyn Syst Meas Control 125:549
Moraal P, Grizzle J (1995) Observer design for nonlinear systems with discrete-time measurements. IEEE Trans Autom Control 40(3):395–404
Prieur C, Tarbouriech S, Zaccarian L (2012) Hybrid high-gain observers without peaking for planar nonlinear systems. In: 2012 IEEE 51st annual conference on decision and control (CDC), Maui, pp 6175–6180
Raff T, Allgower F (2008) An observer that converges in finite time due to measurement-based state updates. In: Proceedings of the 17th IFAC world congress, COEX, South Korea, vol 17, pp 2693–2695
Sassano M, Carnevale D, Astolfi A (2011) Extremum seeking-like observer for nonlinear systems. In: 18th IFAC world congress, Milano, vol 18, pp 1849–1854
Tanwani A, Shim H, Liberzon D (2013) Observability for switched linear systems: characterization and observer design. IEEE Trans Autom Control 58(5):891–904
Teel A (2010) Observer-based hybrid feedback: a local separation principle. In: American control conference (ACC), 2010, Baltimore, pp 898–903
Vidal R, Chiuso A, Soatto S, Sastry S (2003) Observability of linear hybrid systems. In: Maler O, Pnueli A (eds) Hybrid systems: computation and control. Volume 2623 of lecture notes in computer science. Springer, Berlin/Heidelberg, pp 526–539
I
Identification and Control of Cell Populations

Mustafa Khammash¹ and J. Lygeros²
¹Department of Biosystems Science and Engineering, Swiss Federal Institute of Technology at Zurich (ETHZ), Basel, Switzerland
²Automatic Control Laboratory, Swiss Federal Institute of Technology Zurich (ETHZ), Zurich, Switzerland

Abstract

We explore the problem of identification and control of living cell populations. We describe how de novo control systems can be interfaced with living cells and used to control their behavior. Using computer-controlled light pulses in combination with a genetically encoded light-responsive module and a flow cytometer, we demonstrate how in silico feedback control can be configured to achieve precise and robust set point regulation of gene expression. We also outline how external control inputs can be used in experimental design to improve our understanding of the underlying biochemical processes.

Keywords

Extrinsic variability; Heterogeneous populations; Identification; Intrinsic variability; Population control; Stochastic biochemical reactions

Introduction

Control systems, particularly those that employ feedback strategies, have been used successfully in engineered systems for centuries. But natural feedback circuits evolved in living organisms much earlier, as they were needed for regulating the internal milieu of the early cells. Owing to modern genetic methods, engineered feedback control systems can now be used to control biological systems in real time, much like they control any other process. The challenges of controlling living organisms are unique. To be successful, suitable sensors must be used to measure the output of a single cell (or a sample of cells in a population), actuators are needed to effect control action at the cellular level, and a controller that connects the two must be suitably designed. As a model-based approach is needed for effective control, methods for identification of models of cellular dynamics are also needed. In this entry, we give a brief overview of the problem of identification and control of living cells. We discuss the dynamic models that can be used, as well as the practical aspects of selecting sensors and actuators. The control systems can either be realized on a computer (in silico feedback) or through genetic manipulations (in vivo feedback). As an example, we describe how de novo control systems can be interfaced with living cells and used to control their behavior. Using computer-controlled light pulses in combination with a genetically encoded light-responsive module and a flow cytometer, we demonstrate how in silico feedback control can
J. Baillieul, T. Samad (eds.), Encyclopedia of Systems and Control, DOI 10.1007/978-1-4471-5058-9,


© Springer-Verlag London 2015
be configured to achieve precise and robust set point regulation of gene expression.

Dynamical Models of Cell Populations

In this entry, we focus on a model of an essential biological process: gene expression. The goal is to come up with a mathematical model for gene expression that can be used for model-based control. Due to cell variability, we will work with a model that describes the average concentration of the product of gene expression (the regulated variable). This allows us to use population measurements and treat them as measurements of the regulated variable. We refer the reader to the entry ▶ Stochastic Description of Biochemical Networks in this encyclopedia for more information on stochastic models of biochemical reaction networks. In this framework, the model consists of an N-vector stochastic process X(t) describing the number of molecules of each chemical species of interest in a cell. Given the chemical reactions in which these species are involved, the mean E[X(t)] of X(t) evolves according to deterministic equations described by

  (d/dt) E[X(t)] = S E[w(X(t))],

where S is an N × M matrix that describes the stoichiometry of the M reactions described in the model, while w(·) is an M-vector of propensity functions. The propensity functions reflect the rates of the reactions being modeled. When one considers elementary reactions (see ▶ Stochastic Description of Biochemical Networks), the propensity function of the ith reaction, w_i(·), is a quadratic function of the form w_i(x) = a_i + b_i′x + c_i x′Q_i x. Typically, w_i is either a constant, w_i(x) = a; a linear function of the form w_i(x) = bx_j; or a simple quadratic of the form w_i(x) = cx_j². Following the same procedure, similar dynamical models can be derived that describe the evolution of higher-order moments (variances, covariances, third-order moments, etc.) of the stochastic process X(t).

Identification of Cell Population Models

The model structure outlined above captures the fundamental information about the chemical reactions of interest. The model parameters that enter the functions w_i(x) reflect the reaction rates, which are typically unknown. Moreover, these reaction rates often vary between different cells because, for example, they depend on the local cell environment or on unmodeled chemical species whose numbers differ from cell to cell (Swain et al. 2002). The combination of this extrinsic parameter variability with the intrinsic uncertainty of the stochastic process X(t) makes the identification of the values of these parameters especially challenging.

To address this combination of intrinsic and extrinsic variabilities, one can compute the moments of the stochastic process X(t) together with the cross moments of X(t) and the extrinsic variability. In the process, the moments of the parametric uncertainty themselves become parameters of the extended system of ordinary differential equations and can, in principle, be identified from data. Even though doing so requires solving a challenging optimization problem, effective results can often be obtained by randomized optimization methods. For example, Zechner et al. (2012) present the successful application of this approach to a complex model of the system regulating the osmotic stress response in yeast.

When external signals are available, or when one would like to determine what species to measure and when, such moment-based methods can also be used in experiment design. The aim here is to determine a priori which perturbation signals and which measurements will maximize the information on the underlying chemical process that can be extracted from experimental data, reducing the risk of conducting expensive but uninformative experiments. One can show that, given a tentative model for the biochemical process, the moments of the stochastic process X(t) (and cross X(t)-parameter moments in the presence of extrinsic variability) can be used to approximate the
Fisher information matrix and hence characterize the information that particular experiments contain about the model parameters; an approximation of the Fisher information based on the first two moments was derived in Komorowski et al. (2011), and an improved estimate using correction terms based on moments up to order 4 was derived in Ruess et al. (2013). Once an estimate of the Fisher information matrix is available, one can design experiments to maximize the information gained about the parameters of the model. The resulting optimization problem (over an appropriate parametrization of the space of possible experiments) is again challenging but can be approached by randomized optimization methods.

Control of Cell Populations

There are two control strategies that one can implement. The control system can be realized on a computer, using real-time measurements from the cell population to be controlled. These cells must be equipped with actuators that respond to the computer signals that close the feedback loop. We will refer to this as in silico feedback. Alternatively, one can implement the sensors, actuators, and control system in their entirety within the machinery of the living cells. At least in principle, this can be achieved through genetic manipulation techniques that are common in synthetic biology. We shall refer to this type of control as in vivo feedback. Of course, some combination of the two strategies can be envisioned. In vivo feedback is generally more difficult to implement, as it involves working within the noisy, uncertain environment of the cell and requires implementations that are biochemical in nature. Such controllers will work autonomously and are heritable, which could prove advantageous in some applications. Moreover, coupled controllers. However, in silico controllers require a setup that maintains contact with all the cells to be controlled and cannot independently control large numbers of such cells. In this entry we focus exclusively on in silico controllers.

The Actuator
There could be several ways to send actuating signals into living cells. One consists of chemical inducers that the cells respond to, either through receptors outside the cell or through translocation of the inducer molecules across the cellular membrane. The chemical signal captured by these inducers is then transduced to affect gene expression. Another approach, which we describe here, is induction through light. There are several light systems that can be used. One of these involves a light-sensitive protein called phytochrome B (PhyB). When red light of wavelength 650 nm is shined on PhyB in the presence of the phycocyanobilin (PCB) chromophore, it is activated. In this active state it binds to another protein, Pif3, with high affinity, forming a PhyB-Pif3 complex. If far-red light (730 nm) is then shined, PhyB is deactivated and it dissociates from Pif3. This can be exploited for controlling gene expression as follows: PhyB is fused to a GAL4 binding domain (GAL4BD), which then binds to DNA at a specific site just upstream of the gene of interest. Pif3 in turn is fused to a GAL4 activating domain (GAL4AD). Upon red light induction, the Pif3-Gal4AD complex is recruited to PhyB, where Gal4AD acts as a transcription factor to initiate gene expression. After far-red light is shined, the dissociation of the GAL4BD-PhyB complex from Pif3-Gal4AD means that Gal4AD no longer activates gene expression, and the gene is off. This way, one can control gene expression – at least in open loop.

The Sensor
with intercellular signaling mechanisms such To measure the output protein concentration
as quorum sensing, in vivo feedback may lead in cell populations, a florescent protein tag is
to tighter regulation (e.g., reduced variance) of needed. This tag can be fused to the protein of
the cell population. On the other hand, in silico interest, and the fluorescence intensity emanating
controllers are much easier to program, debug, from each cell is a direct measure of the protein
and implement and can have much more complex concentration in that cell. There are several
dynamics that would be possible with in vivo technologies for measuring fluorescence of cell
Identification and Control of Cell Populations, Fig. 1 Top figure: a yeast cell whose gene expression can be induced by light; red light turns on gene expression, while far-red light turns it off. Bottom figure: each input light sequence can be applied to a culture of light-responsive yeast cells, resulting in a corresponding gene expression pattern that is measured by flow cytometry. By applying multiple carefully chosen light input test sequences and looking at their corresponding gene expression patterns, a dynamic model of gene expression can be identified
populations. While fluorimeters measure the overall intensity of a population, flow cytometry and microscopy can measure the fluorescence of each individual cell in a population sample at a given time. This provides a snapshot measurement of the probability density function of the protein across the population. Repeated measurements over time can be used as a basis for model identification (Fig. 1).

The Control System

Equipped with sensors, actuators, and a model identified with the methods outlined above, one can proceed to design control algorithms to regulate the behavior of living cells. Even though moment equations lead to models that look like conventional ordinary differential equations, from a control theory point of view, cell population systems offer a number of challenges. Biochemical processes, especially genetic regulation, are often very slow, with time constants of the order of tens of minutes. This suggests that pure feedback control without some form of preview may be insufficient. Moreover, due to our incomplete understanding of the underlying biology, the available models are typically inaccurate, or even structurally wrong. Finally, the control signals are often unconventional; for example, for the light control system outlined above, experimental limitations imply that the system must be controlled using discrete light pulses, rather than continuous signals.

Fortunately, advances in control theory allow one to effectively tackle most of these challenges. The availability of a model, for example, enables the use of model predictive control methods that introduce the necessary preview into the feedback process. The presence of unconventional inputs may make the resulting optimization problems difficult, but the slow dynamics work in our favor,
Identification and Control of Cell Populations, Fig. 2 Architecture of the closed-loop light control system. Cells are kept in darkness until they are exposed to light pulse sequences from the in silico feedback controller. Cell culture samples are passed to the flow cytometer, whose output is fed back to the computer, which implements a Kalman filter plus a model predictive controller. The objective of the control is to have the mean gene expression level follow a desired set value
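The closed-loop architecture of Fig. 2 can be caricatured in a few lines of code. The sketch below is purely illustrative and is not the model or tuning of Milias-Argeitis et al. (2011): a scalar Kalman filter estimates the mean expression level from noisy cytometry-like readings, and a one-step-ahead predictive rule decides whether the next light pulse is on or off. All dynamics, noise levels, and the setpoint are invented.

```python
# Toy one-state version of the in silico feedback loop of Fig. 2.
# All parameters below are hypothetical, chosen only for illustration.
import random

random.seed(2)

a, b = 0.9, 0.5            # assumed expression dynamics x[k+1] = a*x[k] + b*u[k]
q_var, r_var = 0.01, 0.25  # assumed process / measurement noise variances
setpoint = 3.0

x_true, x_hat, p_cov = 0.0, 0.0, 1.0
trace = []
for k in range(400):
    # choose the pulse whose one-step-ahead prediction lands closer to the setpoint
    on_pred, off_pred = a * x_hat + b, a * x_hat
    u = 1.0 if abs(on_pred - setpoint) <= abs(off_pred - setpoint) else 0.0

    # plant (with process noise) and noisy population measurement
    x_true = a * x_true + b * u + random.gauss(0.0, q_var ** 0.5)
    y = x_true + random.gauss(0.0, r_var ** 0.5)

    # scalar Kalman filter: predict, then correct with the new measurement
    x_hat = a * x_hat + b * u
    p_cov = a * a * p_cov + q_var
    k_gain = p_cov / (p_cov + r_var)
    x_hat += k_gain * (y - x_hat)
    p_cov *= 1.0 - k_gain

    trace.append(x_true)

avg_expression = sum(trace[200:]) / 200.0
```

Even this crude on/off pulse controller holds the mean expression near the setpoint, which is the essential point of the in silico loop.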

Identification and Control of Cell Populations, Fig. 3 Left panel: the closed-loop control strategy (orange) enables set point tracking, whereas an open-loop strategy (green) does not. Right panel: four different experiments, each with a different initial condition. Closed-loop control is turned on at time t = 0, showing that tracking can be achieved regardless of the initial condition. (See Milias-Argeitis et al. (2011))
providing time to search the space of possible input trajectories. Finally, the fundamental principle of feedback is often enough to deal with inaccurate models. Unlike systems biology applications, where the goal is to develop a model that faithfully captures the biology, in population control applications even an inaccurate model is often enough to provide adequate closed-loop performance. Exploring these issues, Milias-Argeitis et al. (2011) developed a feedback mechanism for genetic regulation using the light control system, based on an extended Kalman filter and a model predictive controller (Figs. 2 and 3). A related approach was taken in Uhlendorf et al. (2012) to regulate the osmotic stress response in yeast, while Toettcher et al. (2011) developed what is effectively a PI controller for a faster cell signaling system.

Summary and Future Directions

The control of cell populations offers novel challenges and novel vistas for control engineering as well as for systems and synthetic biology. Using external input signals and experiment design methods, one can more effectively probe biological systems to force them to reveal their secrets. Regulating cell populations in a feedback manner opens new possibilities for biotechnology applications, among them the reliable and efficient production of antibiotics and biofuels using bacteria. Beyond biology, the control of populations is bound to find further applications in the control of large-scale, multi-agent systems, including those in transportation, demand response schemes in energy systems, crowd control in emergencies, and education.

Cross-References

→ Deterministic Description of Biochemical Networks
→ Modeling of Dynamic Systems from First Principles
→ Stochastic Description of Biochemical Networks
→ System Identification: An Overview

Bibliography

Komorowski M, Costa M, Rand D, Stumpf M (2011) Sensitivity, robustness, and identifiability in stochastic chemical kinetics models. Proc Natl Acad Sci 108(21):8645–8650
Milias-Argeitis A, Summers S, Stewart-Ornstein J, Zuleta I, Pincus D, El-Samad H, Khammash M, Lygeros J (2011) In silico feedback for in vivo regulation of a gene expression circuit. Nat Biotech 29(12):1114–1116
Ruess J, Milias-Argeitis A, Lygeros J (2013) Designing experiments to understand the variability in biochemical reaction networks. J R Soc Interface 10:20130588
Swain P, Elowitz M, Siggia E (2002) Intrinsic and extrinsic contributions to stochasticity in gene expression. Proc Natl Acad Sci 99(20):12795–12800
Toettcher J, Gong D, Lim W, Weiner O (2011) Light-based feedback for controlling intracellular signaling dynamics. Nat Methods 8:837–839
Uhlendorf J, Miermont A, Delaveau T, Charvin G, Fages F, Bottani S, Batt G, Hersen P (2012) Long-term model predictive control of gene expression at the population and single-cell levels. Proc Natl Acad Sci 109(35):14271–14276
Zechner C, Ruess J, Krenn P, Pelet S, Peter M, Lygeros J, Koeppl H (2012) Moment-based inference predicts bimodality in transient gene expression. Proc Natl Acad Sci 109(21):8340–8345

ILC

→ Iterative Learning Control

Industrial MPC of Continuous Processes

Mark L. Darby
CMiD Solutions, Houston, TX, USA

Abstract

Model predictive control (MPC) has become the standard for implementing constrained, multivariable control of industrial continuous processes. These are processes designed to operate around nominal steady-state values, and they include many of the important processes found in the refining and chemical industries. The following provides an overview
of MPC, including its history, major technical developments, and how MPC is applied today in practice. Possible future developments are provided.

Keywords

Constraints; Modeling; Model predictive control; Multivariable systems; Process identification; Process testing

Introduction

Model predictive control (MPC) refers to a class of control algorithms that explicitly incorporate a process model for predicting the future response of a plant and rely on optimization as the means of determining control action. At each sample interval, MPC computes a sequence of future plant input signals that optimize future plant behavior. Only the first of the future input sequence is applied to the plant, and the optimization is repeated at subsequent sample intervals.

MPC provides an integrated solution for controlling non-square systems with complex dynamics, interacting variables, and constraints. MPC has become a standard in the continuous process industries, particularly in refining and chemicals, where it has been widely applied for over 25 years. In most commercial MPC products, an embedded steady-state optimizer is cascaded to the MPC controller. The MPC steady-state optimizer determines feasible, optimal settling values of the manipulated and controlled variables. The MPC controller then optimizes the dynamic path to the optimal steady-state values.

The scope of an MPC application may include a unit operation such as a distillation column or reactor, or a larger scope such as multiple distillation columns, or a scope that combines reaction and separation sections of a plant in one controller. MPC is positioned in the control and decision hierarchy of a processing facility as shown in Fig. 1. The variables associated with MPC consist of manipulated variables (MVs), controlled variables (CVs), and disturbance variables (DVs).

Industrial MPC of Continuous Processes, Fig. 1 Industrial control and decision hierarchy

CVs include variables normally controlled at a fixed value, such as a product impurity, as well as those considered constraints, for example, limits related to capacity or safety that may only sometimes be active. DVs are measurements that are treated as feedforward variables in MPC. The manipulated variables are typically setpoints of underlying PID controllers but may also include valve position signals. Most of the targets and limits are local to the MPC, but others come directly from real-time optimization (if present), or indirectly from planning/scheduling, and are normally translated to the MPC in an open-loop manner by the operations personnel.

Linear and nonlinear model forms are found in industrial MPC applications; however, the majority of the applications continue to rely on a linear model, identified from data generated from a dedicated plant test. Nonlinearities that primarily affect system gains are often adequately controlled with linear MPC through gain scheduling or by applying linearizing static transformations. Nonlinear MPC applications tend to be reserved for those applications where nonlinearities are present in both system gains and dynamic responses and the controller must operate at significantly different targets.
Origins and History

MPC has its origins in the process industries in the 1970s. The year 1978 marked the first published description of predictive control under the name IDCOM, an acronym for Identification and Command (Richalet et al. 1978). A short time later, Cutler and Ramaker (1979) published a predictive control algorithm under the name Dynamic Matrix Control (DMC). Both approaches had been applied industrially for several years before the first publications appeared. These predictive control approaches targeted the more difficult industrial control problems that could not be adequately handled with other methods, either with conventional PID control or with advanced regulatory control (ARC) techniques that rely on single-loop controllers augmented with overrides, feedforwards/decouplers, and custom logic.

The basic idea behind the predictive control approach is shown in Fig. 2 for the case of a single-input single-output, stable system. Future predictions of inputs and outputs are denoted with the hat symbol and shown as dashes; double indexes, $t|k$, indicate future values at time $t$ based on information up to and including time $k$. The optimization problem is to bring future predicted outputs $(\hat y_{k+1|k}, \ldots, \hat y_{k+P|k})$ close to a desired trajectory over a prediction horizon, $P$, by means of a future sequence of inputs $(\hat u_{k|k}, \ldots, \hat u_{k+M-1|k})$ calculated over a control horizon $M$. The trajectory may be a constant setpoint. In the general case, the optimization is performed subject to constraints that may be imposed on future inputs and outputs. Only the first of the future moves is implemented, and the optimization is repeated at the next time instant. Feedback, which accounts for unmeasured disturbances and model error, is incorporated by shifting all future output predictions, prior to the optimization, based on the difference between the output measurement $y_k$ and the previous prediction $\hat y_{k|k-1}$, denoted by $d_{k|k}$ (i.e., the prediction error at time instant $k$). Future predicted values of the outputs depend on both past and future values of the inputs. If no future input changes are made (at time $k$ or after), the model can be used to calculate the future "free" output response, $\hat y^0_{t|k}$, which will ultimately settle at a new steady-state value based on the settling time (or time to steady state of the model, $T_{ss}$). For the unconstrained case, it is straightforward to show that the optimal result is a linear control law that depends only on the error between the desired trajectory and the free output response.

Industrial MPC of Continuous Processes, Fig. 2 Predictive control approach

The predictive approach seemed to contrast with the state-space optimal control method of the time, the linear quadratic regulator (LQR). Later research exposed the similarities to LQR
and also Internal Model Control (IMC) (Garcia and Morari 1982), although these techniques did not solve an online optimization problem. Optimization-based control approaches became feasible for industrial applications due to (1) the slower sampling requirements of most industrial control problems (on the order of minutes) and (2) the hierarchical implementations in which MPC provides setpoints to lower level PID controllers, which execute on a much faster sample time (on the order of seconds or faster).

Although the basic ideas behind MPC remain, industrial MPC technology has changed considerably since the first formulations in the late 1970s. Qin and Badgwell (2003) describe the enhancements to MPC technology that occurred over the next 20-plus years, until the late 1990s. Enhancements since that time are highlighted in Darby and Nikolaou (2012). These improvements to MPC reflect increases in computer processing capability and additional requirements of industry, which have led to increased functionality and tools/techniques to simplify implementation. A summary of the significant enhancements that have been made to industrial MPC is highlighted below.

Constraints: Posing input and output constraints as linear inequalities, expressed as a function of the future input sequence (Garcia and Morshedi 1986), and solved by a standard quadratic program or an iterative scheme which approximates one.

Two-Stage Formulations: Limitations of a single objective function led to two-stage formulations to handle MV degrees of freedom (constraint pushing) and steady-state optimization via a linear program (LP).

Integrators: In their native form, impulse and step response models can be applied only to stable systems (in which the impulse response model coefficients approach zero). Extensions to handle integrating variables included embedding a model of the difference of the integrating signal or integrating a fraction of the current prediction error into the future (implying an increasing $|d_{k+j|k}|$ for $j \ge 1$ in Fig. 2). The desired value of an integrator at steady state (e.g., zero slope) has been incorporated into two-stage formulations (see, e.g., Lee and Xiao 2000).

State Space Models: The first state space formulation of MPC, which was introduced in the late 1980s (Marquis and Broustail 1988), allowed MPC to be extended to integrating and unstable processes. It also made use of the Kalman filter, which provided additional capability to estimate plant states and unmeasured disturbances. Later, a state space MPC offering was developed based on an infinite horizon (for both control and prediction) (Froisy 2006). These state space approaches provided a connection back to unconstrained LQR theory.

Nonlinear MPC: The first applications of nonlinear MPC, which appeared in the 1990s, were based on neural net models. In these approaches, a linear dynamic model was combined with a neural net model that accounted for static nonlinearity (Demoro et al. 1997; Zhao et al. 2001). The late 1990s saw the introduction of an industrial nonlinear MPC based on first-principles models derived from differential mass and energy balances and reaction kinetic expressions, expressed in differential algebraic equation (DAE) form (Young et al. 2002). A process where nonlinear MPC is routinely applied is polymer manufacturing.

Identification Techniques: Multivariable prediction error techniques are now routinely used. More recently, industrial application of subspace identification methods has appeared, following the development of these algorithms in the 1990s. Subspace methods incorporate the correlation of output measurements in the identification of a multivariable state space model, which can be used directly in a state space MPC or converted to an impulse or step response model based MPC.

Testing Methods: The 1990s saw increased use of automatic testing methods to generate data for (linear) dynamic model identification using uncorrelated binary signals. Since 2000, closed-loop testing methods have received considerable attention.
The motivation for closed-loop testing is to reduce the time and/or effort of the initial implementation as well as the ongoing need to re-identify the model of an industrial application in light of process changes. These closed-loop testing methods, which require a preliminary or existing model, utilize uncorrelated dither signals either introduced as biases to the controller MVs or injected through the steady-state LP or QP, where additional logic or optimization of the test protocol may be performed (Kalafatis et al. 2006; MacArthur and Zhan 2007; Zhu et al. 2012).

Mathematical Formulation

While there are differences in how the MPC problem is formulated and solved, the following general form captures most of the MPC products (Qin and Badgwell 2003), although not all terms may be present in a given product:

$$
\min_{\hat U}\;\sum_{j=1}^{P}\left\|\hat y_{k+j|k}-y^{\mathrm{ref}}_{k+j|k}\right\|^2_{Q_j}
+\sum_{j=1}^{P}\left\|s_{k+j|k}\right\|^2_{T_j}
+\sum_{j=0}^{M-1}\left\|\hat u_{k+j|k}-u_{ss}\right\|^2_{R_j}
+\sum_{j=0}^{M-1}\left\|\Delta u_{k+j|k}\right\|^2_{S_j}
\tag{1}
$$

subject to:

Model equations:
$$\hat x_{k+j|k}=f(\hat x_{k+j-1|k},\hat u_{k+j-1|k}),\quad j=1,\ldots,P$$
$$\hat y_{k+j|k}=g(\hat x_{k+j|k},\hat u_{k+j|k}),\quad j=1,\ldots,P$$

Output constraints/slacks:
$$y_{\min}-s_{k+j|k}\le \hat y_{k+j|k}\le y_{\max}+s_{k+j|k},\quad j=1,\ldots,P$$
$$s_{k+j|k}\ge 0,\quad j=1,\ldots,P$$

Input constraints:
$$u_{\min}\le \hat u_{k+j|k}\le u_{\max},\quad j=0,\ldots,M-1$$
$$\Delta u_{\min}\le \Delta u_{k+j|k}\le \Delta u_{\max},\quad j=0,\ldots,M-1$$
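An unconstrained instance of an objective of the form (1) can be assembled and solved in a few lines. In the sketch below the constraints are dropped, so the minimizer follows from the normal equations; industrial packages solve the full constrained QP instead. The plant, horizons, weights, and bias value are all invented for illustration.

```python
# Illustrative numpy sketch: unconstrained SISO version of objective (1)
# using a finite step response model. All numbers are invented.
import numpy as np

N, P, M = 30, 10, 3                       # model length, prediction, control horizons
s = np.array([1.5 * (1.0 - 0.7 ** (i + 1)) for i in range(N)])  # step response

# dynamic matrix: effect of the M future moves on the P future outputs
A = np.zeros((P, M))
for j in range(M):
    A[j:, j] = s[: P - j]

d_k = 0.1                                 # current prediction error (feedback bias)
y_free = np.full(P, 0.0) + d_k            # "free" response shifted by d_k (plant at rest here)
y_ref = np.full(P, 1.0)                   # constant setpoint trajectory

Q = np.eye(P)                             # output tracking weight (Q_j)
S = 0.1 * np.eye(M)                       # move suppression weight (S_j)

# min ||y_free + A du - y_ref||_Q^2 + ||du||_S^2  ->  (A'QA + S) du = A'Q (y_ref - y_free)
H = A.T @ Q @ A + S
g = A.T @ Q @ (y_ref - y_free)
du = np.linalg.solve(H, g)

y_pred = y_free + A @ du                  # predicted outputs under the planned moves
u_first = du[0]                           # receding horizon: only this move is sent
```

Only du[0] is sent to the plant; the whole problem is rebuilt and re-solved at the next sample interval, which is the receding-horizon mechanism described earlier.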

where the minimization is performed over the future sequence of inputs $\hat U = (\hat u_{k|k}, \hat u_{k+1|k}, \ldots, \hat u_{k+M-1|k})$. The four terms in the objective function represent conflicting quadratic penalties $(\|x\|^2_A = x^T A x)$; the penalty matrices are almost always diagonal. The first term penalizes the error relative to a desired reference trajectory (cf. Fig. 2) originating at $\hat y_{k|k}$ and terminating at a desired steady state, $y_{ss}$; the second term penalizes output constraint violations over the prediction horizon (constraint softening); the third term penalizes input deviations from a desired steady state, either manually specified or calculated. The fourth term penalizes input changes as a means of trading off output tracking and input movement (move suppression).

The above formulation applies to both linear and nonlinear MPC. For linear MPCs, except for state space formulations, there are no state equations and the outputs in the dynamic model are a function of only past inputs, such as with the finite step response model.

When a steady-state optimizer is present in the MPC, it provides the steady-state targets for $u_{ss}$ (in the third quadratic term) and $y_{ss}$ (in the output reference trajectory). Consider the case of linear MPC with an LP as the steady-state optimizer. The LP is typically formulated as

$$\min_{\Delta u_{ss}}\; c_u^T \Delta u_{ss} + c_y^T \Delta y_{ss} + q_+^T s_+ + q_-^T s_-$$

subject to:

Model equations:
$$\Delta y_{ss} = G_{ss}\,\Delta u_{ss}$$
$$u_{ss} = u_{k-1} + \Delta u_{ss}$$
$$y_{ss} = y^0_{k+T_{ss}|k} + \Delta y_{ss}$$
Output constraints:
$$y_{\min}-s_-\le y_{ss}\le y_{\max}+s_+$$

Input constraints:
$$u_{\min}\le u_{ss}\le u_{\max}$$
$$-M\,\Delta u_{\max}\le \Delta u_{ss}\le M\,\Delta u_{\max}$$

$G_{ss}$ is formed from the gains of the linear dynamic model. Deviations outside minimum and maximum output limits ($s_-$ and $s_+$, respectively) are penalized, which provides constraint softening in the event all outputs cannot be simultaneously controlled within limits. The weightings in $q_-$ and $q_+$ determine the relative priorities of the output constraints. The input constraints, expressed in terms of $\Delta u_{\max}$, prevent targets from being passed to the dynamic optimization that cannot be achieved. The resulting solution – $u_{ss}$ and $y_{ss}$ – provides a consistent, achievable steady state for the dynamic MPC controller. Notice that for inputs, the steady-state delta is applied to the current value and, for outputs, the steady-state delta is applied to the steady-state prediction of the output without future moves, after correcting for the current model error (cf. Fig. 2). If a real-time optimizer is present, its outputs, which may be targets for CVs and/or MVs, are passed to the MPC steady-state optimizer and considered with other objectives but at lower weights or priorities.

Some additional differences or features found in industrial MPCs include:
- 1-norm formulations, where absolute deviations, instead of quadratic deviations, are penalized.
- Use of zone trajectories or "funnels," with small or no penalty applied if predictions remain within the specified zone boundaries.
- Use of a minimum-movement criterion in either the dynamic or steady-state optimizations, which leads to MV movement only when CV predictions go outside specified limits. This can provide controller robustness to modeling errors.
- Multiobjective formulations, which solve a series of QP or LP problems instead of a single one, and can be applied to the dynamic or steady-state optimizations. In these formulations, higher priority objectives are solved first, followed by lesser priority objectives, with the solution of the higher priority objectives becoming equality constraints in subsequent optimizations (Maciejowski 2002).

MPC Design

Key design decisions for a given application are the number of MPC controllers and the selection of the MVs, DVs, and CVs for each controller; however, design decisions are not limited to just the MPC layer. The design problem is one of deciding on the best overall structure for the MPC(s) and the regulatory controls, given the control objectives, expected constraints, qualitative knowledge of the expected disturbances, and robustness considerations. It may be that existing measurements are insufficient and additional sensors may be required. In addition, a measurement may not be updated on a time interval consistent with acceptable dynamic control, for example, laboratory measurements and process composition analyzers. In this case, a soft sensor, or inferential estimator, may need to be developed from temperature and pressure measurements.

MPC is frequently applied to a major plant unit, with the MVs selected based on their sensitivity to key unit CVs and plant economics. Decisions regarding the number and size of the MPCs for a given application depend on plant objectives, (expected) constraints, and also designer preferences. When the objective is to minimize energy consumption at a fixed or specified feed rate, multiple smaller controllers can be used. In this situation, controllers are normally designed based on the grouping of MVs with the largest effect on the identified CVs, often leading to MPCs designed for individual sections of equipment, such as reactors and distillation columns. When the objective is to maximize feed (or certain products), larger controllers are normally designed, especially if there are multiple constraints that can limit plant throughput. The MPC steady-state LP or QP is ideally suited to solving the throughput maximization problem by utilizing all available MVs. The location of the most limiting constraints can impact the number of MPCs. If the major constraints are near the front end of the plant, one MPC can be designed
which connects these constraints with key MVs such as feed rates, and other MPCs designed for the rest of the plant. If the major constraints are located near the back of the plant, then a single MPC is normally considered; alternatively, an MPC cascade could be considered, although this is not a common practice across the industry (and often requires customization).

The feed maximization objective is a major reason why MPCs have become larger with the advances in computer processing capability. However, there is generally a higher requirement on model consistency for larger controllers due to the increased number of possible constraint sets against which the MPC can operate. A larger controller can also be harder to implement and understand. This is a reason why some practitioners prefer implementing smaller MPCs at the potential loss of benefits.

MPC Practice

An MPC project is typically implemented in the following sequence:
- Pretest and preliminary MPC design
- Plant testing
- Model and controller development
- Commissioning

These tasks apply whether the MPC is linear or nonlinear, but with some differences, primarily in model development and in plant testing. In nonlinear MPC, key decisions are related to the model form and level of rigor. Note that with a fundamental model, lower level PID loops must be included in the model if their dynamics are significant; this is in contrast to empirical modeling, where the dynamics of the PID loops are embedded in the plant responses. A fundamental model will typically require less plant testing and make use of historical operating data to fit certain model parameters such as heat transfer coefficients and reaction constants. Historical data and/or data from a validated nonlinear static model can also be used to develop nonlinear static models (e.g., neural nets) to combine with empirical dynamic models. As mentioned earlier, most industrial applications continue to rely on empirical linear dynamic models, fit to data from a dedicated plant test. This will be the basis in the following discussion.

In the pretest phase of work, the key activity is determining the base-level regulatory controls for MPC, tuning these controls, and determining if the current plant instrumentation is adequate. It is common to retune a significant number of PID loops, with significant benefits often resulting from this step alone.

A range of testing approaches is used in plant testing for linear MPC, including both manual and automatic (computer-generated) test signal designs, most often in open loop but, increasingly, in closed loop. Most input testing continues to be based on uncorrelated signals, implemented either manually or from computer-generated random sequences. Model accuracy requirements dictate accuracy across a range of frequencies, which is achieved by varying the duration of the steps. Model identification runs are made throughout the course of a test to determine when model accuracy is sufficient and a test can be stopped.

In the next phase of work, modeling of the plant is performed. This includes constructing the overall MPC model from individual identification runs; for example, deciding which models are significant and judging the model characteristics (dead times, inverse response, settling time, gains) based on engineering/process and a priori knowledge. An important step is analyzing, and adjusting if necessary, the gains of the constructed model to ensure the model gains satisfy mass balances and gain ratios do not result in fictitious degrees of freedom (due to model errors) that the steady-state optimizer could exploit. Also included is the development of any required inferentials or soft sensors, typically based on multivariate regression techniques such as principal component regression (PCR), principal component analysis (PCA), and partial least squares (PLS), or sometimes based on a fundamental model.

During controller development, initial controller tuning is performed. This relates to establishing criteria for utilizing available degrees of freedom and setting control variable priorities. In addition, initial tuning values are established for
Industrial MPC of Continuous Processes 559
the dynamic control. Steady-state responses corresponding to expected constraint scenarios are analyzed to determine if the controller behaves as expected, especially with respect to the steady-state changes in the manipulated variables.

Commissioning involves testing and tuning the controller against different constraint sets. It is not unusual to modify or revisit model decisions made earlier. In the worst case, control performance may be deemed unacceptable and the control engineer is forced to revisit earlier decisions such as the base-level regulatory strategy or plant model quality, which would require re-testing and re-identification of portions of the plant model. The main commissioning effort typically takes place over a two- to three-week period, but can vary based on the size and model density of the controller. In reality, commissioning, or more accurately, controller maintenance, is an ongoing activity. It is important that the operating company have in-house expertise that can be used to answer questions (“why is the controller doing that?”), troubleshoot, and modify the controller to reflect new operating modes and constraint sets.

Future Directions

Likely future developments are expected to follow extensions of current approaches. Due to the success in automatic, closed-loop testing, one possibility is extending it to “dual” or “joint” control, where control and identification objectives are combined and allow the user to select how much the control (e.g., output variance) can be affected by test perturbation signals. Another is in formulating the plant test as a DOE (design of experiments) optimization problem that could, for example, target specific models or model parameters. In the identification area, extensions have started to appear which allow constraints to be imposed, for example, on dead times or gains, thus allowing a priori knowledge to be used. Another important area that has seen recent emphasis, and in which more development can be expected, is monitoring and diagnosis, for example, detecting which submodels of the MPC have become inaccurate and require re-identification.

As mentioned earlier, one of the advantages of state-space modeling is the inherent flexibility to model unmeasured disturbances (i.e., $d_{k+j|j}$, cf. Fig. 2); however, these have not found widespread use in industry. A useful enhancement would be a framework for developing and implementing improved estimators in a convenient and transparent manner that would be applicable to traditional FIR- and FSR-based MPCs.

In the area of nonlinear control, the use of hybrid modeling approaches has increased, for example, integrating known fundamental model relationships with neural net or linear time-varying dynamic models. The motivation is in reducing complexity and controller execution times. The use of hybrid techniques can be expected to further increase, especially if nonlinear control is to be applied more broadly to larger control problems. Even in situations where control with linear MPC is adequate, there may be benefits from the use of hybrid or fundamental models, even if the models are not directly used in the control calculation. The resulting model could be used offline in model development or online to update the linear MPC model. Benefits would come from reduced plant testing and from ensuring model consistency. In the longer term, one can foresee a more general modeling and control environment where the user would not have to be concerned with the distinction between linear and nonlinear models and would be able to easily incorporate known relationships into the controller model.

An area that has not received significant attention, but is suggested as worth pursuing, concerns MPC cascades. Most of the applications and research are based on a single MPC or multiple distributed MPCs. An MPC cascade would permit the lower MPC to run at a faster time period and allow the user to decide which degrees of freedom are to be used for higher-level objectives, such as feed maximization.

Cross-References

► Control Hierarchy of Large Processing Plants: An Overview
► Control Structure Selection
► Model-Based Performance Optimizing Control
► Model-Predictive Control in Practice
► Nominal Model-Predictive Control
► Real-Time Optimization of Industrial Processes

Bibliography

Cutler CR, Ramaker BL (1979) Dynamic matrix control – a computer control algorithm. In: AIChE national meeting, Houston
Darby ML, Nikolaou M (2012) MPC: current practice and challenges. Control Eng Pract 20(4):328–352
Demoro E, Axelrud C, Johnson D, Martin G (1997) Neural network modeling and control of polypropylene process. In: Society of plastic engineers international conference, Houston, 487–799
Froisy JB (2006) Model predictive control – building a bridge between theory and practice. Comput Chem Eng 30:1426–1435
Garcia CE, Morari M (1982) Internal model control. 1. A unifying review and some new results. Ind Eng Chem Process Des Dev 21:308–323
Garcia CE, Morshedi AM (1986) Quadratic programming solution of dynamic matrix control (QDMC). Chem Eng Commun 46(1–3):73–87
Kalafatis K, Patel K, Harmse M, Zheng Q, Craig M (2006) Multivariable step testing for MPC projects reduces crude unit testing time. Hydrocarb Process 86:93–99
Lee JH, Xiao J (2000) Use of two-stage optimization in model predictive control of stable and integrating. Comput Chem Eng 24:1591–1596
MacArthur JW, Zhan C (2007) A practical multi-stage method for fully automated closed-loop identification of industrial processes. J Process Control 17(10):770–786
Maciejowski J (2002) Predictive control with constraints. Prentice Hall, Englewood Cliffs
Marquis P, Broustail JP (1988) SMOC, a bridge between state space and model predictive controllers: application to the automation of a hydrotreating unit. In: Proceedings of the 1988 IFAC workshop on model based process control, Atlanta, GA, 37–43
Qin JQ, Badgwell TA (2003) A survey of industrial model predictive control. Control Eng Pract 11:733–764
Richalet J, Rault A, Testud J, Papon J (1978) Model predictive control: applications to industrial processes. Automatica 14:413–428
Young RE, Bartusiak D, Fontaine RW (2002) Evolution of an industrial nonlinear model predictive controller. In: Sixth international conference on chemical process control, Tucson. CACHE, American Institute of Chemical Engineers, 342–351
Zhao H, Guiver JP, Neelakantan R, Biegler L (2001) A nonlinear industrial model predictive controller using integrated PLS and neural net state space model. Control Eng Pract 9(2):125–133
Zhu Y, Patwardhan R, Wagner S, Zhao J (2012) Towards a low cost and high performance MPC: the role of system identification. Comput Chem Eng 51(5):124–135


Information and Communication Complexity of Networked Control Systems

Serdar Yüksel
Department of Mathematics and Statistics, Queen's University, Kingston, ON, Canada

Abstract

Information and communication complexity of a networked control system identifies the minimum amount of information exchange needed between the decision makers (such as encoders, controllers, and actuators) to achieve a certain objective, which may be in terms of reaching a target state or achieving a given cost threshold. This formulation does not impose any constraints on the computational requirements to perform the communication or control. Both stochastic and deterministic formulations are considered.

Keywords

Communication complexity; Information theory; Networked control

Introduction

Consider a dynamic team problem with $L$ control stations (these will be referred to as decision makers and denoted by DMs) under the following dynamics and measurement equations:

$x_{t+1} = f_t(x_t, u_t^1, \dots, u_t^L, w_t), \quad t = 0, 1, \dots,$   (1)

$y_t^i = g_t^i(x_t, u_{t-1}^1, \dots, u_{t-1}^L; v_t^i),$   (2)
[Information and Communication Complexity of Networked Control Systems, Fig. 1: A decentralized networked control system with information exchange between decision makers (Stations 1–4 connected to each other and to the Plant)]
where $i \in \{1, 2, \dots, L\} =: \mathcal{L}$ and $x_0$, $w_{[0,T-1]}$, $v^i_{[0,T-1]}$ are mutually independent random variables with specified probability distributions. Here, we use the notation $w_{[0,t]} := \{w_s, 0 \le s \le t\}$. The DMs are allowed to exchange limited information: see Fig. 1. The information exchange is facilitated by an encoding protocol $\mathcal{E}$, which is a collection of admissible encoding functions described as follows. Let the information available to DM $i$ at time $t$ be

$I_t^i = \{y^i_{[1,t]}, u^i_{[1,t-1]}, z^{i,j}_{[0,t]}, z^{j,i}_{[0,t]}, j \in \mathcal{L}\},$

where $z_t^{i,j}$ takes values in $\mathcal{Z}_t^{i,j}$ and is the information variable transmitted from DM $i$ to DM $j$ at time $t$, generated with

$z_t^i = \{z_t^{i,j}, j \in \mathcal{L}\} = \mathcal{E}_t^i(I_{t-1}^i, u_{t-1}^i, y_t^i),$   (3)

and for $t = 0$, $z_0^i = \{z_0^{i,j}, j \in \mathcal{L}\} = \mathcal{E}_0^i(y_0^i)$. The control actions are generated with

$u_t^i = \gamma_t^i(I_t^i),$

for all DMs. Define $\log_2(|\mathcal{Z}_t^{i,j}|)$ to be the communication rate from DM $i$ to DM $j$ at time $t$ and $R(z_{[0,T-1]}) = \sum_{t=0}^{T-1} \sum_{i,j \in \mathcal{L}} \log_2(|\mathcal{Z}_t^{i,j}|)$ to be the (total) communication rate. The minimum (total) communication rate over all coding and control policies subject to a design objective is called the communication complexity for this objective.

The above is a fixed-rate formulation for communication complexity, since for any two coder outputs, a fixed number of bits is used at any given time. One could also use variable-rate formulations. The variable-rate formulation exploits the probabilistic distribution of the system variables: see Cover and Thomas (1991).

Communication Complexity for Decentralized Dynamic Optimization

Let $\mathcal{E}^i = \{\mathcal{E}_t^i, t \ge 0\}$ and $\gamma^i = \{\gamma_t^i, t \ge 0\}$. Under a team-encoding policy $\mathcal{E} = \{\mathcal{E}^1, \mathcal{E}^2, \dots, \mathcal{E}^L\}$ and a team-control policy $\gamma = \{\gamma^1, \gamma^2, \dots, \gamma^L\}$, let the induced cost be

$E^{\gamma,\mathcal{E}}\Big[\sum_{t=0}^{T-1} c(x_t, u_t^1, u_t^2, \dots, u_t^L)\Big].$   (4)

In networked control, the goal is to minimize (4) over all coding and control policies subject to information constraints in the system. Let $u_t = \{u_t^1, u_t^2, \dots, u_t^L\}$. The following definition and example are from Yüksel and Başar (2013).

Definition 1 Given a decentralized control problem as above, the team cost-rate function $C : \mathbb{R} \to \mathbb{R}$ is
$C(R) := \inf_{\gamma,\mathcal{E}} \Big\{ E^{\gamma,\mathcal{E}}\Big[\sum_{t=0}^{T-1} c(x_t, u_t)\Big] : \frac{1}{T} R(z_{[0,T-1]}) \le R \Big\}.$

We can define a dual function.

Definition 2 Given a decentralized control problem as above, the team rate-cost function $R : \mathbb{R} \to \mathbb{R}$ is

$R(C) := \inf_{\gamma,\mathcal{E}} \Big\{ \frac{1}{T} R(z_{[0,T-1]}) : E^{\gamma,\mathcal{E}}\Big[\sum_{t=0}^{T-1} c(x_t, u_t)\Big] \le C \Big\}.$

The formulation here can be adjusted to include sequential (iterative) information exchange given a fixed ordering of actions, as opposed to a simultaneous (parallel) information exchange at any given time $t$. That is, instead of (3), we may have

$z_t^i = \{z_t^{i,j}, j \in \{1, 2, \dots, L\}\} = \mathcal{E}_t^i(I_{t-1}^i, u_{t-1}^i, y_t^i, \{z_t^{k,i}, k < i\}).$   (5)

Both to make the discussion more explicit and to show that a sequential (iterative) communication protocol may perform strictly better than an optimal parallel communication protocol given a total rate constraint, we state the following example: Consider the following setup with two DMs. Let $x^1, x^2, p$ be uniformly distributed binary random variables, DM $i$ have access to $y^i$, $i = 1, 2$, and

$x = (p, x^1, x^2), \quad y^1 = p, \quad y^2 = (x^1, x^2),$

and the cost function be

$c(x, u^1, u^2) = 1_{\{p=0\}} c(x^1, u^1, u^2) + 1_{\{p=1\}} c(x^2, u^1, u^2),$

with

$c(s, u^1, u^2) = (s - u^1)^2 + (s - u^2)^2.$

Suppose that we wish to compute the minimum expected cost subject to a total rate of 2 bits that can be exchanged. Under a sequential scheme, if we allow DM 1 to encode $y^1$ to DM 2 with 1 bit, then a cost of 0 is achieved since DM 2 knows the relevant information that needs to be transmitted to DM 1, again with 1 bit: If $p = 0$, $x^1$ is the relevant random variable with an optimal policy $u^1 = u^2 = x^1$, and if $p = 1$, $x^2$ is relevant with an optimal policy $u^1 = u^2 = x^2$, and a cost of 0 is achieved. However, if the information exchange is parallel, then DM 2 does not know which state is the relevant one, and it can be shown that a cost of 0 cannot be achieved under any policy.

The formulation in Definition 1 can also be adjusted to allow for multiple rounds of communication per time stage. Having multiple rounds can enhance the performance for a class of team problems while keeping the total rate constant.

Communication Complexity in Decentralized Computation

Yao (1979) initiated the research on communication complexity in distributed computation. This may be viewed as a special case of the setting considered earlier, but with finite spaces and in a deterministic and error-free context: Consider two decision makers (DMs) who have access to local variables $x \in \{0,1\}^n$, $y \in \{0,1\}^n$. Given a function $f$ of variables $(x, y)$, what is the maximum (over all input variables $x, y$) of the minimum amount of information exchange needed for at least one agent to compute the value of the function? Let $s(x,y) = \{m_1, m_2, \dots, m_t\}$ be the communication symbols exchanged on input $(x,y)$ during the execution of a communication protocol. Let $m_i$ denote the $i$th binary message symbol with $|m_i|$ bits. The communication complexity for such a setup is defined as

$R(f) = \min_{\gamma,\mathcal{E}} \max_{(x,y) \in \{0,1\}^n \times \{0,1\}^n} |s(x,y)|,$   (6)

where $|s(x,y)| = \sum_{i=1}^{t} |m_i|$, $\mathcal{E}$ is a protocol which dictates the iterative encoding functions as in (5), and $\gamma$ is a decision policy.
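The two-DM example in the previous section can be verified exhaustively. The following sketch (our own illustrative code, not part of the original entry; all function names are hypothetical) evaluates the sequential scheme and brute-forces every deterministic one-bit parallel protocol, confirming that the sequential protocol attains expected cost 0 while no parallel protocol does.

```python
from itertools import product

# p, x1, x2 are independent fair bits; DM 1 observes y1 = p and DM 2
# observes y2 = (x1, x2); the cost is (s - u1)^2 + (s - u2)^2 with
# s = x1 if p == 0 else x2. Total budget: one 1-bit message per DM.

OUTCOMES = list(product((0, 1), repeat=3))  # all (p, x1, x2)
BITS = (0, 1)

def relevant(p, x1, x2):
    return x1 if p == 0 else x2

def sequential_cost():
    # DM 1 first sends p; DM 2, knowing p, then sends the relevant bit.
    total = 0.0
    for p, x1, x2 in OUTCOMES:
        m1 = p                         # 1 bit from DM 1
        m2 = relevant(m1, x1, x2)      # 1 bit from DM 2, chosen using m1
        u1 = m2                        # DM 1 decodes the relevant state
        u2 = relevant(p, x1, x2)       # DM 2 knows (x1, x2) and learns p
        s = relevant(p, x1, x2)
        total += (s - u1) ** 2 + (s - u2) ** 2
    return total / len(OUTCOMES)

def best_parallel_cost():
    # Both messages are sent simultaneously, so DM 2's message cannot
    # depend on p. The two cost terms decouple: the u1 term depends only
    # on (E2, G1), the u2 term only on (E1, G2); brute-force each pair.
    best_u1 = min(
        sum((relevant(p, x1, x2) - G1[2 * p + E2[2 * x1 + x2]]) ** 2
            for p, x1, x2 in OUTCOMES) / len(OUTCOMES)
        for E2 in product(BITS, repeat=4)    # m2 as a function of (x1, x2)
        for G1 in product(BITS, repeat=4))   # u1 as a function of (p, m2)
    best_u2 = min(
        sum((relevant(p, x1, x2) - G2[4 * x1 + 2 * x2 + E1[p]]) ** 2
            for p, x1, x2 in OUTCOMES) / len(OUTCOMES)
        for E1 in product(BITS, repeat=2)    # m1 as a function of p
        for G2 in product(BITS, repeat=8))   # u2 as a function of (x1, x2, m1)
    return best_u1 + best_u2

print(sequential_cost())     # 0.0
print(best_parallel_cost())  # 0.25: no parallel protocol reaches cost 0
```

The search confirms the claim in the text: the best parallel protocol loses exactly the cases where DM 1 must guess the coordinate that DM 2's single bit did not describe.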
For such problems, obtaining good lower bounds is in general challenging. One lower bound for such problems is obtained through the following reasoning: A subset of the form $A \times B$, where $A$ and $B$ are subsets of $\{0,1\}^n$, is called an $f$-monochromatic rectangle if for every $x \in A$, $y \in B$, $f(x,y)$ is the same. It can be shown that given any finite message sequence $\{m_1, m_2, \dots, m_t\}$, the set $\{(x,y) : s(x,y) = \{m_1, m_2, \dots, m_t\}\}$ is an $f$-monochromatic rectangle. Hence, to minimize the number of messages, one needs to minimize the number of $f$-monochromatic rectangles, which has led to research in this direction. Upper bounds are typically obtained by explicit constructions. For a comprehensive review, see Kushilevitz and Nisan (2006).

For control systems, the discussion takes further aspects into account, including a design objective, system dynamics, and the uncertainty in the system variables.

Communication Complexity in Reach Control

Wong (2009) defines the communication complexity in networked control as follows: Consider a design specification where two DMs wish to steer the state of a dynamical system in finite time. This can be viewed as a setting in (1)–(2) with 4 DMs, where there is iterative communication between a sensor and a DM, and there is no stochastic noise in the system. Given a set of initial states $x_0 \in X_0$, and finite sets of objective choices for each DM ($A$ for DM 1, $B$ for DM 2), the goal is to ensure that (i) there exists a finite time where both DMs know the final state of the system, (ii) the final state satisfies the choices of the DMs, and (iii) the finite time (when the objective is satisfied) is known by the DMs.

The communication complexity for such a system is defined as the infimum over all protocols of the supremum over the triple of initial states and choices of the DMs, such that the above is satisfied. That is,

$R(X_0, A, B) = \inf_{\gamma,\mathcal{E}} \sup_{\alpha,\beta,x_0} R(\gamma, \mathcal{E}, \alpha, \beta, x_0),$

where $R(\gamma, \mathcal{E}, \alpha, \beta, x_0)$ denotes the communication rate under the control and coding functions $\gamma, \mathcal{E}$, which satisfies the objectives given by the choices $\alpha, \beta$ and initial condition $x_0$.

Wong obtains a cut-set type lower bound: Given a fixed initial state, a lower bound is given by $2D(f)$, where $f$ is a function of the objective choices and $D(f)$ is a variation of $R(f)$ introduced in (6) with the additional property that both DMs know $f$ at the end of the protocol. An upper bound is established by the exchange of the initial states and objective functions, also taking into account signaling, that is, communication through control actions, which is discussed further below in the context of stabilization. Wong and Baillieul (2012) consider a detailed analysis for a real-valued bilinear controlled decentralized system.

Connections with Information Theory

The information theory literature has made significant contributions to such problems. An information theoretic setup typically entails settings where an unboundedly large sequence of messages is encoded and functions of which are to be computed. Such a setting is not applicable in a real-time setting, but is very useful for obtaining performance bounds (i.e., good lower bounds on complexity), which can at certain instances be achievable even in a real-time setting. That is, instead of a single realization of random variables in the setup of (1)–(2), the average performance for a large number of independent realizations/copies of such problems is typically considered.

In such a context, Definitions 1 and 2 can be adjusted so that the communication complexity is computed by mutual information (Cover and Thomas 1991). Replacing the fixed-rate or variable-rate (entropy) constraint in Definition 1 with a mutual information constraint leads to convexity properties for $C(R)$ and $R(C)$. Such an information theoretic formulation can provide useful lower bounds and desirable analytical properties.

We note here the interesting discussion between decentralized computation and
communication provided by Orlitsky and Roche (2001), as well as by Witsenhausen (1976), where a probability-free construction is considered and a zero-error (non-asymptotic and error-free) computation is considered in the same spirit as in Yao (1979).

Such decentralized computation problems can be viewed as multiterminal source coding problems with a cost function aligned with the computation objective. Ma and Ishwar (2011) and Gamal and Kim (2012) provide a comprehensive treatment and review of information exchange requirements for computing. Essential in such constructions is the method of binning, which is a key tool in distributed source coding problems. Binning efficiently designates the enumeration of symbols (which can be confused in the absence of coding) given the relevant information at a receiver DM.

Such problems involve interactive communications as well as multiterminal coding problems. As mentioned earlier, it is also important to point out that multi-round protocols typically reduce the average rate requirements.

Communication Complexity in Decentralized Stabilization

An important relevant setting of reach control is where the target final state is the zero vector: The system is to be stabilized. Consider the following special case of (1)–(2) for an LTI system:

$x_{t+1} = A x_t + \sum_{j=1}^{L} B^j u_t^j,$

$y_t^i = C^i x_t, \quad t = 0, 1, \dots$   (7)

where $i \in \mathcal{L}$, and it is assumed that the joint system is stabilizable and detectable, but the individual pairs $(A, B^i)$ may not be stabilizable or $(A, C^i)$ may not be detectable. Here, $x_t \in \mathbb{R}^n$ is the state, $u_t^i \in \mathbb{R}^{m_i}$ is the control applied by station $i$, and $y_t^i \in \mathbb{R}^{p_i}$ is the observation available at station $i$, all at time $t$. The initial state $x_0$ is generated according to a probability distribution supported on a compact set $X_0 \subset \mathbb{R}^n$.

[Information and Communication Complexity of Networked Control Systems, Fig. 2: Decentralized stabilization with multiple controllers: the Plant is measured by Stations 1–3 ($y^1, y^2, y^3$), which send signals $q^1, q^2, q^3$ to Actuators 1–3 applying $u^1, u^2, u^3$ to the Plant]

We denote the controllable and unobservable subspaces at station $i$ by $K^i$ and $N^i$ and refer to the subspace orthogonal to $N^i$ as the observable subspace at the $i$th station, denoted by $O^i$. The information available to station $i$ at time $t$ is $I_t^i = \{y^i_{[0,t]}, u^i_{[0,t-1]}\}$. For such a system (see Fig. 2), it is possible for the controllers to communicate through the plant by the process known as signaling, which can be used for communication of mode information among the decision makers. Denote by $i \to j$ the property that DM $i$ can signal to DM $j$. This holds if and only if $C^j A^l B^i \ne 0$ for at least one $l$, $1 \le l \le n$. A directed graph $G$ among the $L$ stations can be constructed through such a communication relationship.

Suppose that $A$ is such that in its Jordan form, each Jordan block admits distinct real eigenvalues. Then, a lower bound on the communication complexity (per time stage) for stabilizability is given by $\sum_{|\lambda_i|>1} M_i \log_2(|\lambda_i|)$, where

$M_i = \min_{l,m \in \{1,2,\dots,L\}} \{d(l,m) + 1 : l \to m, \ [x^i] \subseteq O^i \cup O^m, \ [x^i] \subseteq K^m\},$
with $d(l,m)$ denoting the graph distance (number of edges in a shortest path) between DM $l$ and DM $m$ in $G$, and $[x^i]$ denoting the subspace spanned by $x^i$. Furthermore, there exist stabilizing coding and control policies whose sum rate is arbitrarily close to this bound. When different Jordan blocks may admit repeated and possibly complex eigenvalues, variations of the result above are applicable. In the special case where there is a centralized controller which receives information from multiple sensors (under stabilizability and joint detectability), even in the presence of noise, to achieve asymptotic stability it suffices to have the average total rate be greater than $\sum_{|\lambda_i|>1} \log_2(|\lambda_i|)$. The results above follow from Matveev and Savkin (2008) and Yüksel and Başar (2013). For the case with a single sensor, this result has been studied extensively in networked control (see the chapter on ►Quantized Control and Data Rate Constraints in the Encyclopedia).

Summary and Future Directions

In this text, we discussed the problem of communication complexity in networked control systems. Our analysis considered both cost minimization and controllability/reachability problems subject to information constraints. We also discussed the communication complexity in distributed computing as it has been studied in the computer science community and provided a brief discussion of the information theoretic approaches to such problems together with structural results. There are many relevant open problems on structural results for optimal policies, explicit solutions, as well as nontrivial upper and lower bounds on the optimal performance.

Cross-References

► Data Rate of Nonlinear Control Systems and Feedback Entropy
► Flocking in Networked Systems
► Information-Based Multi-Agent Systems
► Networked Control Systems: Estimation and Control over Lossy Networks
► Quantized Control and Data Rate Constraints

Recommended Reading

The information exchange requirements for decentralized optimization depend also on the structural properties of the cost functional to be minimized. For a class of team problems, one might simply need to exchange a sufficient statistic needed for optimal solutions. For some problems, there may be no need for an exchange at all, if the sufficient statistics are already available, as in the case of mean field equilibrium problems when the number of decision makers is unbounded or very large, for almost optimal solutions; see Huang et al. (2006) and Lasry and Lions (2007). In case there is no common probabilistic information, the problem considered becomes further involved. The consensus literature, under both Bayesian and non-Bayesian contexts, aims to achieve agreement on a class of system variables under information constraints: see, e.g., Tsitsiklis et al. (1986). Optimization under local interaction and sparsity constraints and various criteria has been investigated in a number of publications, including Rotkowitz and Lall (2006). A review of the literature on norm-optimal control as well as optimal stochastic dynamic teams is provided in Mahajan et al. (2012). Tsitsiklis and Athans (1985) have observed that from a computational complexity viewpoint, obtaining optimal solutions for a class of such communication protocol design problems is non-tractable (NP-hard).

Even though obtaining explicit solutions for optimal coding and control results may be difficult, it is useful to obtain structural results on optimal coding and control policies, since one can reduce the search space to a smaller class of functions. For dynamic team problems, these typically follow from the construction of a controlled Markov chain (see Walrand and Varaiya 1983) and applying tools from stochastic control theory which obtain structural results on optimal coding and control policies (see Nayyar et al. 2013).
Along these lines, for system (1)–(2), if the DMs can agree on a joint belief $P(x_t \in \cdot \mid I_t^i, i \in \mathcal{L})$ at every time stage, then the optimal cost that would be achieved under a centralized system could be attained (see Yüksel and Başar 2013). As a further important illustrative case, if the problem described in Definition 1 is a real-time estimation problem for a Markov source, then the optimal causal fixed-rate coder minimizing any cost function uses only the last source symbol and the information at the controller's memory: see Witsenhausen (1979). We also note that the optimal design of information channels for optimization under information constraints is a non-convex problem; see Yüksel and Linder (2012) and Yüksel and Başar (2013) for a review of the literature and certain topological properties of the problem. We refer the reader to Nemirovsky and Yudin (1983) for a comprehensive resource on information complexity for optimization problems. A sequential setting with an information theoretic approach to the formulation of communication complexity has been considered in Raginsky and Rakhlin (2011). A formulation relevant to the one in Definition 1 has been considered in Teneketzis (1979) with mutual information constraints. Giridhar and Kumar (2006) discuss distributed computation for a class of symmetric functions under information constraints and present a comprehensive review.

Bibliography

Cover TM, Thomas JA (1991) Elements of information theory. Wiley, New York
Gamal AE, Kim YH (2012) Network information theory. Cambridge University Press, UK
Giridhar A, Kumar P (2006) Toward a theory of in-network computation in wireless sensor networks. IEEE Commun Mag 44:98–107
Huang M, Caines PE, Malhamé RP (2006) Large population stochastic dynamic games: closed-loop McKean–Vlasov systems and the Nash certainty equivalence principle. Commun Inf Syst 6:221–251
Kushilevitz E, Nisan N (2006) Communication complexity, 2nd edn. Cambridge University Press, New York
Lasry JM, Lions PL (2007) Mean field games. Jpn J Math 2:229–260
Ma N, Ishwar P (2011) Some results on distributed source coding for interactive function computation. IEEE Trans Inf Theory 57:6180–6195
Mahajan A, Martins N, Rotkowitz M, Yüksel S (2012) Information structures in optimal decentralized control. In: IEEE conference on decision and control, Hawaii
Matveev AS, Savkin AV (2008) Estimation and control over communication networks. Birkhäuser, Boston
Nayyar A, Mahajan A, Teneketzis D (2013) The common-information approach to decentralized stochastic control. In: Como G, Bernhardsson B, Rantzer A (eds) Information and control in networks. Springer International Publishing, Switzerland
Nemirovsky A, Yudin D (1983) Problem complexity and method efficiency in optimization. Wiley-Interscience, New York
Orlitsky A, Roche JR (2001) Coding for computing. IEEE Trans Inf Theory 47:903–917
Raginsky M, Rakhlin A (2011) Information-based complexity, feedback and dynamics in convex programming. IEEE Trans Inf Theory 57:7036–7056
Rotkowitz M, Lall S (2006) A characterization of convex problems in decentralized control. IEEE Trans Autom Control 51:274–286
Teneketzis D (1979) Communication in decentralized control. PhD dissertation, MIT
Tsitsiklis J, Athans M (1985) On the complexity of decentralized decision making and detection problems. IEEE Trans Autom Control 30:440–446
Tsitsiklis J, Bertsekas D, Athans M (1986) Distributed asynchronous deterministic and stochastic gradient optimization algorithms. IEEE Trans Autom Control 31:803–812
Walrand JC, Varaiya P (1983) Optimal causal coding-decoding problems. IEEE Trans Inf Theory 19:814–820
Witsenhausen HS (1976) The zero-error side information problem and chromatic numbers. IEEE Trans Inf Theory 22:592–593
Witsenhausen HS (1979) On the structure of real-time source coders. Bell Syst Tech J 58:1437–1451
Wong WS (2009) Control communication complexity of distributed control systems. SIAM J Control Optim 48:1722–1742
Wong WS, Baillieul J (2012) Control communication complexity of distributed actions. IEEE Trans Autom Control 57:2731–2345
Yao ACC (1979) Some complexity questions related to distributive computing. In: Proceedings of the 11th annual ACM symposium on theory of computing, Atlanta
Yüksel S, Başar T (2013) Stochastic networked control systems: stabilization and optimization under information constraints. Birkhäuser, Boston
Yüksel S, Linder T (2012) Optimization and convergence of observation channels in stochastic control. SIAM J Control Optim 50:864–887
Information Structures, the Witsenhausen Counterexample, and Communicating Using Actions

Pulkit Grover
Carnegie Mellon University, Pittsburgh, PA, USA

Abstract

The concept of "information structures" in decentralized control is a formalization of the notion of "who knows what and when do they know it." Even seemingly simple problems with simply stated information structures can be extremely hard to solve. Perhaps the simplest such unsolved problem is the celebrated Witsenhausen counterexample, formulated by Hans Witsenhausen in 1968. This entry discusses how the information structure of the Witsenhausen counterexample makes it hard and how an information-theoretic approach, which explores the knowledge gradient due to a nonclassical information pattern, has helped obtain insights into the problem.

Keywords

Decentralized control; Information theory; Implicit communication; Team decision theory

Introduction

Modern control systems often comprise multiple decentralized control agents that interact over communication channels (Fig. 1). What characteristic distinguishes a centralized control problem from a decentralized one? One fundamental difference is a "knowledge gradient": agents in a decentralized team often observe, and hence know, different things. This observation leads to the idea of information patterns (Witsenhausen 1971), a formalization of the notion of "who knows what and when do they know it" (Ho et al. 1978; Mitter and Sahai 1999).

The information pattern is said to be classical if all agents in the team receive the same information and have perfect recall (so they do not forget it). What is so special about classical information patterns? For these patterns, the presence of external communication links has no effect on the optimal costs! After all, what could the agents use the communication links for, when there is no knowledge gradient? More interesting, therefore, are the problems for which the information pattern is nonclassical. These problems sit at the intersection of communication and control: communication between agents can help reduce the knowledge differential that exists between them, helping them perform the control task. Intellectually and practically, the concept of nonclassical information patterns motivates many formulations at the control-communication intersection. Many of these formulations – including some discussed in this Encyclopedia (e.g., ►Data Rate of Nonlinear Control Systems and Feedback Entropy; ►Information and Communication Complexity of Networked Control Systems; ►Quantized Control and Data Rate Constraints; ►Networked Control Systems: Architecture and Stability Issues; and ►Networked Control Systems: Estimation and Control Over Lossy Networks) – intellectually ask the question: for a realistic channel that is constrained by noise, bandwidth, and speed, what is the optimal communication and control strategy?

One could ask the question of optimal control strategy even for decentralized control problems where no external channel is available to bridge this knowledge gradient. Why could these problems be of interest? First, these problems are limiting cases of control with communication constraints. Second, and perhaps more importantly, they bring out an interesting possibility that can allow the agents to "communicate," i.e., exchange information, even when the external channel is absent. It is possible to use control actions to communicate through changing the system state! We now introduce this form of communication using a simple toy example.

Information Structures, the Witsenhausen Counterexample, and Communicating Using Actions, Fig. 1 The evolution of control systems (open loop; feedback with analog control; digital control; decentralized control; networked control systems). Modern "networked control systems" (also called "cyber-physical systems") are decentralized and networked using communication channels

Communicating Using Actions: An Example

To gain intuition into when communication using actions could be useful, consider the inverted pendulum example shown in Fig. 2. The goal of the two agents is to bring the pendulum as close to the origin as possible. Both controllers have their strengths and weaknesses. The "weak" controller Cw has little energy, but has perfect state observations. On the other hand, the "blurry" controller Cb has infinite energy, but noisy observations. They act one after the other, and their goal is to move the pendulum close to the center from any initial state. The information structure of the problem is nonclassical: Cw, but not Cb, knows the initial state of the pendulum, and Cw does not know the precise (noisy) observation of Cb using which Cb takes actions.

Information Structures, the Witsenhausen Counterexample, and Communicating Using Actions, Fig. 2 Two controllers, with their respective strengths and weaknesses, attempting to bring an inverted pendulum close to the center. Also shown (using green "+" signs) are possible quantization points chosen by the controllers for a quantization-based control strategy

A possible strategy: A little thought reveals an interesting strategy – the weak controller, having perfect observations, can move the state to the closest of some predecided points in space, effectively quantizing the state. If these quantization points are sufficiently far from each other, they can be estimated accurately (through the noise) by the blurry controller, which can then use its energy to push the pendulum all the way to zero. In this way, the weak controller expends little energy, but is able to "communicate" the state through the noise to the blurry controller, by making it take values on a finite set. Once the blurry controller has received the state through the noise, it can use its infinite energy to push the state to zero.

The Witsenhausen Counterexample

The above two-controller inverted-pendulum example is, in fact, motivated by what is now known as "the Witsenhausen counterexample," formulated by Witsenhausen in 1968 (see Fig. 3).

Information Structures, the Witsenhausen Counterexample, and Communicating Using Actions, Fig. 3 The Witsenhausen counterexample is a deceptively simple two-time-step two-controller decentralized control problem, with x₀ ∼ N(0, σ₀²), observation noise z ∼ N(0, 1), and cost min k²E[u_w²] + E[x₂²]. The weak and the blurry controllers, Cw and Cb, act in a sequential manner

In the counterexample, two controllers (denoted here by Cw for "weak" and Cb for "blurry") act one after the other in two time-steps to minimize a quadratic cost function. The system state is denoted by x_t, where t is the time index. u_w and u_b denote the inputs generated by the two controllers. The cost function is k²E[u_w²] + E[x₂²] for some constant k. The initial state x₀ and the noise z at the input of the blurry controller are assumed to be Gaussian distributed and independent, with variances σ₀² and 1, respectively. The problem is a "linear-quadratic-Gaussian" (LQG) problem, i.e., the state evolution is linear, the costs are quadratic, and the primitive random variables are Gaussian.

Information Structures, the Witsenhausen Counterexample, and Communicating Using Actions, Fig. 4 The optimization solution of Baglietto et al. (1997) for k² = 0.5, σ₀² = 5. The information-theoretic strategy of "dirty-paper coding" (Costa 1983) also yields the same curve (Grover and Sahai 2010)

Why is the problem called a "counterexample"? The traditional "certainty-equivalence" principle (Bertsekas 1995) shows that for all centralized LQG problems, linear control laws are optimal. Witsenhausen (1968) provided a nonlinear strategy for the Witsenhausen problem which outperforms all linear strategies. Thus, the counterexample showed that the certainty-equivalence doctrine does not extend to decentralized control.

What is this strategy of Witsenhausen that outperforms all linear strategies? It is, in fact, a quantization-based strategy, as suggested in our inverted-pendulum story above. Further, it was shown by Mitter and Sahai (1999) that multipoint quantization strategies can outperform linear strategies by an arbitrarily large factor! This observation, combined with the simplicity of the counterexample, makes the problem very important in decentralized control. This simple two-time-step two-controller LQG problem needs to be understood to have any hope of understanding larger and more complex problems. While the optimal costs for the problem are still unknown (even though it is known that an optimal strategy exists (Witsenhausen 1968; Wu and Verdú 2011)), there exists a wealth of understanding of the counterexample that has helped address more complicated problems. A body of work, starting with that of Baglietto et al. (1997), numerically obtained solutions that could be close to optimal (although there is no mathematical proof thereof). All these solutions have a consistent form (illustrated in Fig. 4), with slight improvements in the optimal cost. Because the discrete version of the problem, appropriately relaxed, is known to be NP-complete (Papadimitriou and Tsitsiklis 1986), this approach cannot be used to understand the entire parameter space and hence has focused on one point: k² = 0.5, σ₀² = 5. Nevertheless, the approach has proven to be insightful: a recent information-theoretic body of work shows that the strategies of Fig. 4 can be thought of as information-theoretic strategies of "dirty-paper coding" (Costa 1983) that is related to the idea of

embedding information in the state. The question here is: how do we embed the information about the state in the state itself?

An information-theoretic view of the counterexample: This information-theoretic approach, which culminated in Grover et al. (2013), also obtained the first approximately optimal solutions to the Witsenhausen counterexample as well as its vector extensions. The result is established by analyzing information flows in the counterexample that work toward minimizing the knowledge gradient, effectively yielding an information pattern in which Cw can predict the observation of Cb more precisely. The analysis provides an information-theoretic lower bound on cost that holds irrespective of what strategy is used. For the original problem, this characterizes the optimal costs (with associated strategies) within a factor of 8 for all problem parameters (i.e., k and σ₀²). For any finite-length extension, uniform finite-ratio approximations also exist (Grover et al. 2013). The asymptotically infinite-length extension has been solved exactly (Choudhuri and Mitra 2012).

The problem has also driven the delineation of decentralized LQG control problems with optimal linear solutions from those with nonlinear optimal solutions. This led to the development and understanding of many variations of the counterexample (Bansal and Başar 1987; Başar 2008; Ho et al. 1978; Rotkowitz 2006) and understanding that can extend to larger decentralized control problems. More recent work shows that the promise of the Witsenhausen counterexample was not a misplaced one: the information-theoretic approach that provides approximately optimal solutions to the counterexample (Grover et al. 2013) yields solutions to other more complex (e.g., multi-controller, more time-steps) problems as well (Grover 2010; Park and Sahai 2012).

Summary and Future Directions

Even simple problems with nonclassical information structures can be hard to solve using classical techniques, as is demonstrated by the Witsenhausen counterexample. However, the nonclassical information pattern of some simple problems, starting with the counterexample, has recently been explored via an information-theoretic lens, yielding the first optimal or approximately optimal solutions to these problems. This approach is promising for larger decentralized control problems as well. It is now important to explore what is the simplest decentralized control problem that cannot be solved (exactly or approximately) using ideas developed for the counterexample. In this manner, the Witsenhausen counterexample can provide a platform to unify the more modern (i.e., external-channel centric) approaches (see Quantized Control and Data Rate Constraints; Data Rate of Nonlinear Control Systems and Feedback Entropy; Networked Control Systems: Architecture and Stability Issues; Networked Control Systems: Estimation and Control Over Lossy Networks; and Information and Communication Complexity of Networked Control Systems in this Encyclopedia) with the more classical decentralized LQG problems, leading to enriching and useful formulations.

Cross-References

Data Rate of Nonlinear Control Systems and Feedback Entropy
Information and Communication Complexity of Networked Control Systems
Networked Control Systems: Architecture and Stability Issues
Networked Control Systems: Estimation and Control over Lossy Networks
Quantized Control and Data Rate Constraints

Bibliography

Baglietto M, Parisini T, Zoppoli R (1997) Nonlinear approximations for the solution of team optimal control problems. In: Proceedings of the IEEE
Information-Based Multi-Agent Systems 571

conference on decision and control (CDC), San Diego, pp 4592–4594
Bansal R, Başar T (1987) Stochastic teams with nonclassical information revisited: when is an affine control optimal? IEEE Trans Autom Control 32:554–559
Başar T (2008) Variations on the theme of the Witsenhausen counterexample. In: Proceedings of the 47th IEEE conference on decision and control (CDC), Cancun, pp 1614–1619
Bertsekas D (1995) Dynamic programming. Athena Scientific, Belmont
Choudhuri C, Mitra U (2012) On Witsenhausen's counterexample: the asymptotic vector case. In: IEEE information theory workshop (ITW), Lausanne, pp 162–166
Costa M (1983) Writing on dirty paper. IEEE Trans Inf Theory 29(3):439–441
Grover P (2010) Actions can speak more clearly than words. Ph.D. dissertation, UC Berkeley, Berkeley
Grover P, Sahai A (2010) Vector Witsenhausen counterexample as assisted interference suppression. Int J Syst Control Commun 2:197–237. Special issue on Information Processing and Decision Making in Distributed Control Systems
Grover P, Park S-Y, Sahai A (2013) Approximately-optimal solutions to the finite-dimensional Witsenhausen counterexample. IEEE Trans Autom Control 58(9):2189
Ho YC, Kastner MP, Wong E (1978) Teams, signaling, and information theory. IEEE Trans Autom Control 23(2):305–312
Mitter SK, Sahai A (1999) Information and control: Witsenhausen revisited. In: Yamamoto Y, Hara S (eds) Learning, control and hybrid systems. Lecture notes in control and information sciences, vol 241. Springer, New York, pp 281–293
Papadimitriou CH, Tsitsiklis JN (1986) Intractable problems in control theory. SIAM J Control Optim 24(4):639–654
Park SY, Sahai A (2012) It may be easier to approximate decentralized infinite-horizon LQG problems. In: 51st IEEE conference on decision and control (CDC), Maui, pp 2250–2255
Rotkowitz M (2006) Linear controllers are uniformly optimal for the Witsenhausen counterexample. In: Proceedings of the 45th IEEE conference on decision and control (CDC), San Diego, pp 553–558
Witsenhausen HS (1968) A counterexample in stochastic optimum control. SIAM J Control 6(1):131–147
Witsenhausen HS (1971) Separation of estimation and control for discrete time systems. Proc IEEE 59(11):1557–1566
Wu Y, Verdú S (2011) Witsenhausen's counterexample: a view from optimal transport theory. In: IEEE conference on decision and control and European control conference (CDC-ECC), Orlando, pp 5732–5737

Information-Based Multi-Agent Systems

Wing-Shing Wong¹ and John Baillieul²
¹Department of Information Engineering, The Chinese University of Hong Kong, Hong Kong, China
²Mechanical Engineering, Electrical and Computer Engineering, Boston University, Boston, MA, USA

Abstract

Multi-agent systems are encountered in nature (animal groups), in various domains of technology (multi-robot networks, mixed robot-human teams), and in various human activities (such as dance and team athletics). Information exchange among agents ranges from being incidentally important to crucial in such systems. Several systems in which information exchange among the agents is either a primary goal or a primary enabler are discussed briefly. Specific topics include power management in wireless communication networks, data-rate constraints, the complexity of distributed control, robotic networks and formation control, action-mediated communication, and multi-objective distributed systems.

Keywords

Distributed control; Information constraints; Multi-agent systems

Introduction

The role of information patterns in the decentralized control of multi-agent systems has been studied in different theoretical contexts for more than five decades. The paper Ho (1972) provides references to early work in this area. While research on distributed decision making has continued, a large body of recent research on robotic networks has brought new dimensions of geometric aspects of information patterns to the

forefront (Bullo et al. 2009). At the same time, machine intelligence, machine learning, machine autonomy, and theories of operation of mixed teams of humans and robots have considerably extended the intellectual frontiers of information-based multi-agent systems (Baillieul et al. 2012). A further important development has been the study of action-mediated communication and the recently articulated theory of control communication complexity (Wong and Baillieul 2012). These developments may shed light on nonverbal forms of communication among biological organisms (including humans) and on the intrinsic energy requirements of information processing.

In conventional decentralized control, the control objective is usually well-defined and known to all agents. Multi-agent information-based control encompasses a broader scenario, where the objective can be agent dependent and is not necessarily explicitly announced to all. For illustration, consider the power control problem in wireless communication, one of the earliest engineering systems that can be regarded as multi-agent information based. It is common that multiple transmitter-receiver communication pairs share the same radio frequency band and that the transmission signals interfere with each other. The power control problem searches for a feedback control for each transmitter to set its power level. The goal is for each transmitter to achieve a targeted signal-to-interference ratio (SIR) level by using information of the observed levels at the intended receiver only.

A popular version of the power control problem (Foschini and Miljanic 1993) defines each individual objective target level by means of a requirement threshold, known only to the intended transmitter. As SIR measurements naturally reside on a receiver, the observed SIR needs to be communicated back to the transmitter. For obvious reasons, the bandwidth for such communication is limited. The resulting model fits the bill of multi-agent information-based control. In Sung and Wong (1999), a tristate power control strategy is proposed so that the power control outputs are either increased or decreased by a fixed dB step or left unchanged. Convergence of the feedback algorithm was shown using a Lyapunov-like function.

This entry surveys key topics related to multi-agent information-based control systems, including control complexity, control with data-rate constraints, robotic networks and formation control, action-mediated communication, and multi-objective distributed systems.

Control Complexity

In information-based distributed control systems, how to efficiently share computational and communication resources is a fundamental issue. One of the earliest investigations on how to schedule communication resources to support a network of sensors and actuators is discussed in Brockett (1995). The concept of communication sequencing was introduced to describe how the communication channel is utilized to convey feedback control information in a network consisting of interacting subsystems. In Brockett (1997), the concept of control attention was introduced to provide a measure of the complexity of a control law against its performance. As attention is a shared, limited resource, the goal is to find minimum attention control. Another approach to gauge control complexity in a distributed system is by means of the minimum amount of communicated data required to accomplish a given control task.

Control with Data-Rate Constraints

A fundamental challenge in any control implementation in which system components communicate with each other over communication links is ensuring that the channel capacity is large enough to deal with the fastest time constants among the system components. In a single-agent system, the so-called Data-Rate Theorem has been formulated in various ways to understand the constraints imposed between the sensor and the controller and between the controller and the actuator. Extensions to this fundamental result have been focused on addressing similar

problems in the network control system context. Information on such extensions in the distributed control setting can be found in Nair and Evans (2004) and Yüksel and Basar (2007).

Robotic Networks and Formation Control

The defining characteristic of robotic networks within the larger class of multi-agent systems is the centrality of spatial relationships among network nodes. Graph theory has been shown to provide a generally convenient mathematical language in which to describe spatial concepts, and it is the key to understanding spatial rigidity related to the control of formations of autonomous vehicles (Anderson et al. 2008), flocking systems (Leonard et al. 2012), consensus problems (Su and Huang 2012), and rendezvous problems (Cortés et al. 2006). For these distributed control research topics, readers can consult other sections in this Encyclopedia for a comprehensive reference list.

Much of the recent work on formation control has included information limitation considerations. For consensus problems, for example, Olfati-Saber and Murray (2004) introduced a sensing cost constraint, Ren and Beard (2005) considered information exchange constraints, and Yu and Wang (2010) explicitly modeled communication delays.

Action-Mediated Communication

Biological organisms communicate through motion. Examples of this include prides of lions or packs of wolves, whose pursuit of prey is a cooperative effort, and competitive team athletics in the case of humans. Recent research has been aimed at developing a theoretical foundation of action-mediated communication. Communication protocols for motion-based signaling between mobile robots have been developed (Raghunathan and Baillieul 2009), and preliminary steps towards a theory of artistic expression through controlled movements in dance have been reported in Baillieul and Özcimder (2012). Motion-based communication of this type involves specially tailored motion description languages in which sequences of motion primitives are assembled with the objective of conveying artistic intent, while minimizing the use of limited energy resources in carrying out the movement. These motion primitives constitute the alphabet that enables communication, and physical constraints on the motions define the grammatical rules that govern the ways in which motion sequences may be constructed.

Research on action-mediated communication helps illustrate the close connection between control and information theory. Further discussion of the deep connection between the two can be found, for example, in Park and Sahai (2011), which argues for the equivalence between the stabilization of a distributed linear system and the capacity characterization in linear network coding.

Multi-objective Distributive Systems

In a multi-agent system, agents may aim to carry out individual objectives. These objectives can either be cooperatively aligned (such as in a cooperative control setting) or may contend antagonistically (such as in a zero-sum game setting). In either case, a common assumption is that the objective functions are a priori known to all agents. However, in many practical applications, agents do not know the objectives of other agents, at least not precisely. For example, in the power control problem alluded to earlier, the signal-to-interference requirement of a user may be unknown to other users. Yet this does not prevent the possibility of deriving convergence algorithms to allow the joint goals to be achieved.

The issue of unknown objectives in a multi-agent system is formally analyzed in Wong (2009) via the introduction of choice-based actions. In an open access network, objectives of an individual agent may be known only partially, in some cases in the form of a random distribution. In order to achieve a joint control objective in general, some communication via the system

is required if there is no side communications channel. A basic issue is how to measure the minimum amount of information exchange that is required to perform a specific control task. Motivated by the idea of communication complexity in computer science, the idea of control communication complexity was introduced in Wong (2009), which can provide such a measure. In Wong and Baillieul (2009), the idea was extended to a rich class of nonlinear systems that arise as models of physical processes ranging from rigid body mechanics to quantum spin systems.

In some special cases, control objectives can be achieved without any communication among the agents. For systems with a bilinear input-output mapping, including the Brockett Integrator, it is possible to derive conditions that guarantee this property (Wong and Baillieul 2012). Moreover, for a quadratic type of control cost, it is possible to compute the optimal control cost. Similar results can be extended to linear systems, as discussed in Liu et al. (2013). This circle of ideas is connected to the so-called standard parts problem as investigated in Baillieul and Wong (2009). Another connection is to correlated equilibrium problems that have recently been studied by game theorists (Shoham and Leyton-Brown 2009).

Cross-References

Motion Description Languages and Symbolic Control
Multi-vehicle Routing
Networked Systems

Bibliography

Altman E, Avrachenkov K, Menache I, Miller G, Prabhu BJ, Shwartz A (2009) Dynamic discrete power control in cellular networks. IEEE Trans Autom Control 54(10):2328–2340
Anderson B, Yu C, Fidan B, Hendrickx J (2008) Rigid graph control architectures for autonomous formations. IEEE Control Syst Mag 28(6):48–63. doi:10.1109/MCS.2008.929280
Baillieul J, Özcimder K (2012) The control theory of motion-based communication: problems in teaching robots to dance. In: Proceedings of the American control conference (ACC), Montreal, 27–29 June 2012, pp 4319–4326
Baillieul J, Wong WS (2009) The standard parts problem and the complexity of control communication. In: Proceedings of the 48th IEEE conference on decision and control, held jointly with the 28th Chinese control conference, Shanghai, 16–18 Dec 2009, pp 2723–2728
Baillieul J, Leonard NE, Morgansen KA (2012) Interaction dynamics: the interface of humans and smart machines. Proc IEEE 100(3):776–801. doi:10.1109/JPROC.2011.2180055
Brockett RW (1995) Stabilization of motor networks. In: Proceedings of the 34th IEEE conference on decision and control, New Orleans, 13–15 Dec 1995, pp 1484–1488
Brockett RW (1997) Minimum attention control. In: Proceedings of the 36th IEEE conference on decision and control, San Diego, 10–12 Dec 1997, pp 2628–2632
Bullo F, Cortés J, Martínez S (2009) Distributed control of robotic networks: a mathematical approach to motion coordination algorithms. Princeton University Press, Princeton
Cortés J, Martínez S, Bullo F (2006) Robust rendezvous for mobile autonomous agents via proximity graphs in arbitrary dimensions. IEEE Trans Autom Control 51(8):1289–1298
Foschini GJ, Miljanic Z (1993) A simple distributed autonomous power control algorithm and its convergence. IEEE Trans Veh Technol 42(4):641–646
Ho YC (1972) Team decision theory and information structures in optimal control problems – Part I. IEEE Trans Autom Control 17(1):15–22
Leonard NE, Young G, Hochgraf K, Swain D, Chen W, Marshall S (2012) In the dance studio: analysis of human flocking. In: Proceedings of the 2012 American control conference, Montreal, 27–29 June 2012, pp 4333–4338
Liu ZC, Wong WS, Guo G (2013) Cooperative control of linear systems with choice actions. In: Proceedings of the American control conference, Washington, DC, 17–19 June 2013, pp 5374–5379
Nair GN, Evans RJ (2004) Stabilisability of stochastic linear systems with finite feedback data rates. SIAM J Control Optim 43(2):413–436
Olfati-Saber R, Murray RM (2004) Consensus problems in networks of agents with switching topology and time-delays. IEEE Trans Autom Control 49(9):1520–1533
Park SY, Sahai A (2011) Network coding meets decentralized control: capacity-stabilizability equivalence. In: Proceedings of the 50th IEEE conference on decision and control and European control conference, Orlando, 12–15 Dec 2011, pp 4817–4822
Raghunathan D, Baillieul J (2009) Motion based communication channels between mobile robots – a novel paradigm for low bandwidth information exchange. In: Proceedings of the IEEE/RSJ international conference
Input-to-State Stability 575

on intelligent robots and systems, 11–15 Oct 2009, St. Louis, pp 702–708
Ren W, Beard RW (2005) Consensus seeking in multiagent systems under dynamically changing interaction topologies. IEEE Trans Autom Control 50(5):655–661
Shoham Y, Leyton-Brown K (2009) Multiagent systems: algorithmic, game-theoretic, and logical foundations. Cambridge University Press, New York. ISBN:978-0-521-89943-7
Su Y, Huang J (2012) Two consensus problems for discrete-time multi-agent systems with switching network topology. Automatica 48(9):1988–1997
Sung CW, Wong WS (1999) A distributed fixed-step power control algorithm with quantization and active link quality protection. IEEE Trans Veh Technol 48(2):553–562
Wong WS (2009) Control communication complexity of distributed control systems. SIAM J Control Optim 48(3):1722–1742
Wong WS, Baillieul J (2009) Control communication complexity of nonlinear systems. Commun Inf Syst 9(1):103–140
Wong WS, Baillieul J (2012) Control communication complexity of distributed actions. IEEE Trans Autom Control 57(11):2731–2745. doi:10.1109/TAC.2012.2192357
Yu J, Wang L (2010) Group consensus in multi-agent systems with switching topologies and communication delays. Syst Control Lett 59(6):340–348
Yüksel S, Basar T (2007) Optimal signaling policies for decentralized multicontroller stabilizability over communication channels. IEEE Trans Autom Control 52(10):1969–1974

Input-to-State Stability

Eduardo D. Sontag
Rutgers University, New Brunswick, NJ, USA

Synonyms

ISS

Abstract

The notion of input to state stability (ISS) qualitatively describes stability of the mapping from initial states and inputs to internal states (and more generally outputs). This entry focuses on the definition of ISS and a discussion of equivalent characterizations.

Keywords

Asymptotic stability; Dissipation; Lyapunov functions

Introduction

We consider here systems with inputs in the usual sense of control theory:

ẋ(t) = f(x(t), u(t))

(the arguments "t" are often omitted). There are n state variables and m input channels. States x(t) take values in Euclidean space ℝⁿ, and the inputs (also called "controls" or "disturbances" depending on the context) are measurable locally essentially bounded maps u(·): [0, ∞) → ℝᵐ. The map f: ℝⁿ × ℝᵐ → ℝⁿ is assumed to be locally Lipschitz with f(0, 0) = 0. The solution, defined on some maximal interval [0, t_max(x⁰, u)), for each initial state x⁰ and input u, is denoted as x(t, x⁰, u) and, in particular, for systems with no inputs ẋ(t) = f(x(t)), just as x(t, x⁰). The zero system associated to ẋ = f(x, u) is by definition the system with no inputs ẋ = f(x, 0). Euclidean norm is written as |x|. For a function of time, typically an input or a state trajectory, ‖u‖, or ‖u‖∞ for emphasis, is the (essential) supremum or "sup" norm (possibly +∞, if u is not bounded). The norm of the restriction of a signal to an interval I is denoted by ‖u_I‖∞ (or just ‖u_I‖).

Input-to-State Stability

It is convenient to introduce "comparison functions" to quantify stability. A class K∞ function is a function α: ℝ≥0 → ℝ≥0 which is continuous, strictly increasing, and unbounded and satisfies α(0) = 0, and a class KL function is a function β: ℝ≥0 × ℝ≥0 → ℝ≥0 such that β(·, t) ∈ K∞ for each t and β(r, t) decreases to zero as t → ∞, for each fixed r.
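As a toy illustration of these definitions (not part of the original entry), the scalar system ẋ = −x admits the class-KL bound β(r, t) = r·e^(−t), which can be checked numerically along Euler-simulated trajectories:

```python
import math

# Illustrative sketch: beta(r, t) = r*exp(-t) is a class-KL function
# (increasing and unbounded in r with beta(0, t) = 0; decaying to 0 in t),
# and it bounds |x(t)| along solutions of the scalar system x' = -x.
def beta(r, t):
    return r * math.exp(-t)

def euler_trajectory(x0, t_end, dt=1e-3):
    x, t, out = x0, 0.0, []
    while t <= t_end:
        out.append((t, x))
        x += dt * (-x)   # forward-Euler step of x' = -x
        t += dt
    return out

ok = all(abs(x) <= beta(abs(x0), t) + 1e-9
         for x0 in (-3.0, 0.5, 2.0)
         for t, x in euler_trajectory(x0, 5.0))
print("KL estimate holds on sampled trajectories:", ok)
```

Exhibiting one such KL bound is exactly the kind of estimate used to characterize global asymptotic stability in the discussion that follows; forward Euler happens to underestimate |x(t)| here (since (1 − dt)ⁿ ≤ e^(−n·dt)), so the check passes with margin.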
For a system with no inputs ẋ = f(x), there is a well-known notion of global asymptotic stability (for short from now on, GAS, or "0-GAS" when referring to the zero system ẋ = f(x, 0) associated to a given system with inputs ẋ = f(x, u)) due to Lyapunov and usually defined in "ε–δ" terms. It is an easy exercise to show that this standard definition is in fact equivalent to the following statement:

(∃β ∈ KL)  |x(t, x⁰)| ≤ β(|x⁰|, t)  ∀x⁰, ∀t ≥ 0.

The notion of input to state stability (ISS) was introduced in Sontag (1989), and it provides theoretical concepts used to describe stability features of a mapping (u(·), x(0)) ↦ x(·) that sends initial states and input functions into states (or, more generally, outputs). Prominent among these features are that inputs that are bounded, "eventually small," "integrally small," or convergent should lead to outputs with the respective property. In addition, ISS and related notions quantify in what manner initial states affect transient behavior. The formal definition is as follows.

A system is said to be input to state stable (ISS) if there exist some β ∈ KL and γ ∈ K∞ such that

|x(t)| ≤ β(|x⁰|, t) + γ(‖u‖∞)    (ISS)

holds for all solutions (meaning that the estimate is valid for all inputs u(·), all initial conditions x⁰, and all t ≥ 0). Note that the supremum sup_{s∈[0,t]} γ(|u(s)|) over the interval [0, t] is the same as γ(‖u_{[0,t]}‖∞) = γ(sup_{s∈[0,t]} |u(s)|), because the function γ is increasing, so one may replace this term by γ(‖u‖∞), where ‖u‖∞ = sup_{s∈[0,∞)} |u(s)| is the sup norm of the input, because the solution x(t) depends only on values u(s), s ≤ t (so, one could equally well consider the input that has values ≡ 0 for all s > t).

Since, in general, max{a, b} ≤ a + b ≤ max{2a, 2b}, one can restate the ISS condition in a slightly different manner, namely, asking for the existence of some β ∈ KL and γ ∈ K∞ (in general, different from the ones in the ISS definition) such that

|x(t)| ≤ max{β(|x⁰|, t), γ(‖u‖∞)}

holds for all solutions. Such redefinitions, using "max" instead of sum, are also possible for each of the other concepts to be introduced later.

Intuitively, the definition of ISS requires that, for t large, the size of the state must be bounded by some function of the sup norm – that is to say, the amplitude – of inputs (because β(|x⁰|, t) → 0 as t → ∞). On the other hand, the β(|x⁰|, 0) term may dominate for small t, and this serves to quantify the magnitude of the transient (overshoot) behavior as a function of the size of the initial state x⁰ (Fig. 1). The ISS superposition theorem, discussed later, shows that ISS is, in a precise mathematical sense, the conjunction of two properties, one of them dealing with asymptotic bounds on the state as a function of the magnitude of the input and the other one providing a transient term obtained when one ignores inputs.

Input-to-State Stability, Fig. 1 ISS combines overshoot and asymptotic behavior

For internally stable linear systems ẋ = Ax + Bu, the variation of parameters formula gives immediately the following inequality:

|x(t)| ≤ β(t)|x⁰| + γ‖u‖∞,

where

β(t) = ‖e^{tA}‖ → 0  and  γ = ‖B‖ ∫₀^∞ ‖e^{sA}‖ ds < ∞.
This is a particular case of the ISS estimate, |x(t)| ≤ β(|x⁰|, t) + γ(‖u‖∞), with linear comparison functions.
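As a concrete scalar illustration (an assumed example, not from this entry): with A = −1 and B = 1, the formulas above give β(t) = e^(−t) and γ = ∫₀^∞ e^(−s) ds = 1, so every trajectory of ẋ = −x + u should satisfy |x(t)| ≤ e^(−t)|x⁰| + ‖u‖∞. The sketch below checks this bound along a forward-Euler simulation with the arbitrary input u(t) = sin 3t.

```python
import math

def largest_bound_violation(x0, dt=1e-3, t_end=10.0):
    # forward-Euler simulation of x' = -x + u with u(t) = sin(3t),
    # tracking the worst value of |x(t)| - (e^{-t}|x0| + ||u||_inf)
    x, t, worst = x0, 0.0, -float("inf")
    sup_u = 1.0  # sup norm of u(t) = sin(3t)
    while t < t_end:
        worst = max(worst, abs(x) - (math.exp(-t) * abs(x0) + sup_u))
        x += dt * (-x + math.sin(3.0 * t))
        t += dt
    return worst

gap = largest_bound_violation(2.0)
print("worst (|x(t)| - bound):", gap)  # stays negative: the estimate is never violated
```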
Feedback Redesign

The notion of ISS arose originally as a way to precisely formulate, and then answer, the following question. Suppose that, as in many problems in control theory, a system ẋ = f(x, u) has been stabilized by means of a feedback law u = k(x) (Fig. 2), that is to say, k was chosen such that the origin of the closed-loop system ẋ = f(x, k(x)) is globally asymptotically stable. (See, e.g., Sontag 1999 for a discussion of mathematical aspects of state feedback stabilization.) Typically, the design of k was performed by ignoring the effect of possible input disturbances d(·) (also called actuator disturbances). These "disturbances" might represent true noise or perhaps errors in the calculation of the value k(x) by a physical controller or modeling uncertainty in the controller or the system itself. What is the effect of considering disturbances? In order to analyze the problem, d is incorporated into the model, and one studies the new system ẋ = f(x, k(x) + d), where d is seen as an input (Fig. 3). One may then ask what is the effect of d on the behavior of the system. Disturbances d may well destabilize the system, and the problem may arise even when using a routine technique for control design, feedback linearization. To appreciate this issue, take the following very simple example. Given is the system

ẋ = f(x, u) = x + (x² + 1)u.

Input-to-State Stability, Fig. 2 Feedback stabilization, closed-loop system ẋ = f(x, k(x))

Input-to-State Stability, Fig. 3 Actuator disturbances, closed-loop system ẋ = f(x, k(x) + d)

In order to stabilize it, substitute u = ũ/(x² + 1) (a preliminary feedback transformation), rendering the system linear with respect to the new input ũ: ẋ = x + ũ, and then use ũ = −2x in order to obtain the closed-loop system ẋ = −x. In other words, in terms of the original input u, the feedback law is

k(x) = −2x/(x² + 1)

so that f(x, k(x)) = −x. This is a GAS system. The effect of the disturbance input d is analyzed as follows. The system ẋ = f(x, k(x) + d) is

ẋ = −x + (x² + 1)d.

This system has solutions which diverge to infinity even for inputs d that converge to zero; moreover, the constant input d ≡ 1 results in solutions that explode in finite time. Thus k(x) = −2x/(x² + 1) was not a good feedback law, in the sense that its performance degraded drastically once actuator disturbances were taken into account. The key observation for what follows is that if one adds a correction term "−x" to the above formula for k(x), so that now,

k̃(x) = −2x/(x² + 1) − x,

then the system ẋ = f(x, k̃(x) + d) with disturbance d as input becomes instead

ẋ = −2x − x³ + (x² + 1)d

and this system is much better behaved: it is still GAS when there are no disturbances (it reduces
to ẋ = −2x − x³), but, in addition, it is ISS (easy to verify directly, or appealing to some of the characterizations mentioned later). Intuitively, for large x, the term −x³ serves to dominate the term (x² + 1)d, for all bounded disturbances d(·), and this prevents the state from getting too large.
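The contrast between the two feedback laws can be reproduced numerically. The sketch below (plain Python, forward-Euler integration; the step size and blow-up threshold are illustrative choices) applies the constant disturbance d ≡ 1 to both closed loops: ẋ = −x + (x² + 1)d escapes in finite time, while the redesigned loop ẋ = −2x − x³ + (x² + 1)d remains bounded.

```python
def euler(rhs, x0, t_end, dt=1e-4, blow_up=1e6):
    # forward-Euler integration, flagging numerical finite-time escape
    x, t = x0, 0.0
    while t < t_end:
        x += dt * rhs(x)
        t += dt
        if abs(x) > blow_up:
            return x, True
    return x, False

d = 1.0  # constant actuator disturbance

def naive(x):       # loop closed with k(x):  x' = -x + (x^2 + 1) d
    return -x + (x * x + 1.0) * d

def redesigned(x):  # loop closed with k~(x): x' = -2x - x^3 + (x^2 + 1) d
    return -2.0 * x - x ** 3 + (x * x + 1.0) * d

x_naive, escaped_naive = euler(naive, 0.0, 5.0)
x_redes, escaped_redes = euler(redesigned, 0.0, 5.0)
print("loop with k(x) escaped in finite time:", escaped_naive)
print("loop with redesigned feedback escaped:", escaped_redes)
```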
This example is an instance of a general result, which says that, whenever there is some feedback law that stabilizes a system, there is also a (possibly different) feedback so that the system with external input d is ISS.

Theorem 1 (Sontag 1989). Consider a system affine in controls

ẋ = f(x, u) = g₀(x) + Σᵢ₌₁ᵐ uᵢ gᵢ(x)   (g₀(0) = 0)

and suppose that there is some differentiable feedback law u = k(x) so that

ẋ = f(x, k(x))

has x = 0 as a GAS equilibrium. Then, there is a feedback law u = k̃(x) such that

ẋ = f(x, k̃(x) + d)

is ISS with input d(·).

The reader is referred to the book Krstić et al. (1995), and the references given later, for many further developments on the subjects of recursive feedback design, the "backstepping" approach, and other far-reaching extensions.

Equivalences for ISS

This section reviews results that show that ISS is equivalent to several other notions, including asymptotic gain, existence of robustness margins, dissipativity, and an energy-like stability estimate.

Nonlinear Superposition Principle

Clearly, if a system is ISS, then the system with no inputs ẋ = f(x, 0) is GAS: the term ‖u‖∞ vanishes, leaving precisely the GAS property. In particular, then, the system ẋ = f(x, u) is 0-stable, meaning that the origin of the system without inputs ẋ = f(x, 0) is stable in the sense of Lyapunov: for each ε > 0, there is some δ > 0 such that |x⁰| < δ implies |x(t, x⁰)| < ε. (In comparison-function language, one can restate 0-stability as follows: there is some γ ∈ K such that |x(t, x⁰)| ≤ γ(|x⁰|) holds for all small x⁰.)

On the other hand, since β(|x⁰|, t) → 0 as t → ∞, for t large one has that the first term in the ISS estimate |x(t)| ≤ max{β(|x⁰|, t), γ(‖u‖∞)} vanishes. Thus an ISS system satisfies the following asymptotic gain property ("AG"): there is some γ ∈ K∞ so that:

lim sup_{t→+∞} |x(t, x⁰, u)| ≤ γ(‖u‖∞)  ∀x⁰, u(·)    (AG)

(see Fig. 4). In words, for all large enough t, the trajectory exists, and it gets arbitrarily close to a sphere whose radius is proportional, in a possibly nonlinear way quantified by the function γ, to the amplitude of the input. In the language of robust control, the estimate (AG) would be called an "ultimate boundedness" condition; it is a generalization of attractivity (all trajectories converge to zero, for a system ẋ = f(x) with no inputs) to the case of systems with inputs; the "lim sup" is required since the limit of x(t) as t → ∞ may well not exist.

Input-to-State Stability, Fig. 4 Asymptotic gain property

From now on (and analogously when defining other properties), we

will just say "the system is AG" instead of the more cumbersome "satisfies the AG property." Observe that, since only large values of t matter in the lim sup, one can equally well consider merely tails of the input u when computing its sup norm. In other words, one may replace γ(‖u‖∞) by γ(lim sup_{t→+∞} |u(t)|), or (since γ is increasing) lim sup_{t→+∞} γ(|u(t)|).

The surprising fact is that these two necessary conditions are also sufficient. This is summarized by the ISS superposition theorem:

Theorem 2 (Sontag and Wang 1996). A system is ISS if and only if it is 0-stable and AG.

A minor variation of the above superposition theorem is as follows. Let us consider the limit property (LIM):

inf_{t≥0} |x(t, x⁰, u)| ≤ γ(‖u‖∞)  ∀x⁰, u(·)    (LIM)

(for some γ ∈ K∞).

Theorem 3 (Sontag and Wang 1996). A system is ISS if and only if it is 0-stable and LIM.

Robust Stability

In this entry, a system is said to be robustly stable if it admits a margin of stability ρ, that is, a smooth function ρ ∈ K∞ so the system

ẋ = g(x, d) := f(x, d·ρ(|x|))

is GAS uniformly in this sense: for some β ∈ KL,

|x(t, x⁰, d)| ≤ β(|x⁰|, t)

for all possible d(·): [0,∞) → [−1, 1]ᵐ. An alternative way to interpret this concept (cf. Sontag and Wang 1995) is as uniform global asymptotic stability of the origin with respect to all possible time-varying feedback laws Δ bounded by ρ: |Δ(t, x)| ≤ ρ(|x|). In other words, the system

ẋ = f(x, Δ(t, x))

(Fig. 5) is stable uniformly over all such perturbations Δ. In contrast to the ISS definition, which deals with all possible "open-loop" inputs, the present notion of robust stability asks about all possible closed-loop interconnections. One may think of Δ as representing uncertainty in the dynamics of the original system, for example.

Input-to-State Stability, Fig. 5 Margin of robustness

Theorem 4 (Sontag and Wang 1995). A system is ISS if and only if it is robustly stable.

Intuitively, the ISS estimate |x(t)| ≤ max{β(|x⁰|, t), γ(‖u‖∞)} says that the β term dominates as long as |u(t)| ≪ |x(t)| for all t, but "|u(t)| ≪ |x(t)|" amounts to u(t) = d(t)·ρ(|x(t)|) with an appropriate function ρ. This is an instance of a "small gain" argument, see below. One analog for linear systems is as follows: if A is a Hurwitz matrix, then A + Q is also Hurwitz, for all small enough perturbations Q; note that when Q is a nonsingular matrix, |Qx| is a K∞ function of |x|.

Dissipation

Another characterization of ISS is as a dissipation notion stated in terms of a Lyapunov-like function. A continuous function V: ℝⁿ → ℝ is said to be a storage function if it is positive definite, that is, V(0) = 0 and V(x) > 0 for x ≠ 0, and proper, that is, V(x) → ∞ as |x| → ∞. This last property is equivalent to the requirement that the sets V⁻¹([0, A]) should be compact subsets of ℝⁿ, for each A > 0, and in the engineering literature, it is usual to call such functions radially unbounded. It is an easy exercise to show that V: ℝⁿ → ℝ is a storage function if and only if there exist α̲, ᾱ ∈ K∞ such that

α̲(|x|) ≤ V(x) ≤ ᾱ(|x|)  ∀x ∈ ℝⁿ

(the lower bound amounts to properness and V(x) > 0 for x ≠ 0, while the upper bound guarantees V(0) = 0). For convenience, V̇: ℝⁿ × ℝᵐ → ℝ is the function:

V̇(x, u) := ∇V(x) · f(x, u)

which provides, when evaluated at (x(t), u(t)), the derivative dV(x(t))/dt along solutions of ẋ = f(x, u).

An ISS-Lyapunov function for ẋ = f(x, u) is by definition a smooth storage function V for which there exist functions γ, α ∈ K∞ so that

V̇(x, u) ≤ −α(|x|) + γ(|u|)  ∀x, u.    (L-ISS)
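As a toy instance of verifying an (L-ISS) estimate (an assumed example, not from this entry): for the scalar system ẋ = −x + u, the storage function V(x) = x²/2 gives V̇(x, u) = x(−x + u) ≤ −x²/2 + u²/2 by Young's inequality, so (L-ISS) holds with α(r) = γ(r) = r²/2. The sketch below confirms the inequality on a sample grid.

```python
def V_dot(x, u):
    # derivative of V(x) = x^2/2 along x' = -x + u:  grad V . f = x(-x + u)
    return x * (-x + u)

def alpha(r):   # decay-rate comparison function
    return 0.5 * r * r

def gamma(r):   # input-gain comparison function
    return 0.5 * r * r

grid = [0.25 * i - 5.0 for i in range(41)]   # samples in [-5, 5]
violations = sum(
    1
    for x in grid
    for u in grid
    if V_dot(x, u) > -alpha(abs(x)) + gamma(abs(u)) + 1e-12
)
print("grid points violating (L-ISS):", violations)
```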
Integrating, an equivalent statement is that, along all trajectories of the system, there holds the following dissipation inequality:

V(x(t₂)) − V(x(t₁)) ≤ ∫_{t₁}^{t₂} w(u(s), x(s)) ds,

where, using the terminology of Willems (1976), the "supply" function is w(u, x) = γ(|u|) − α(|x|). For systems with no inputs, an ISS-Lyapunov function is precisely the same object as a Lyapunov function in the usual sense.

Theorem 5 (Sontag and Wang 1995). A system is ISS if and only if it admits a smooth ISS-Lyapunov function.

Since α(|x|) ≥ α(ᾱ⁻¹(V(x))), the ISS-Lyapunov condition can be restated as

V̇(x, u) ≤ −α̃(V(x)) + γ(|u|)  ∀x, u

for some α̃ ∈ K∞. In fact, one may strengthen this a bit (Praly and Wang 1996): for any ISS system, there is always a smooth ISS-Lyapunov function satisfying the "exponential" estimate V̇(x, u) ≤ −V(x) + γ(|u|).

The sufficiency of the ISS-Lyapunov condition is easy to show and was already in the original paper Sontag (1989). A sketch of proof is as follows, assuming for simplicity a dissipation estimate in the form V̇(x, u) ≤ −α(V(x)) + γ(|u|). Given any x and u, either α(V(x)) ≤ 2γ(|u|) or V̇ ≤ −α(V)/2. From here, one deduces by a comparison theorem that, along all solutions,

V(x(t)) ≤ max{β(V(x⁰), t), α⁻¹(2γ(‖u‖∞))},

where the KL function β(s, t) is the solution y(t) of the initial value problem

ẏ = −½ α(y) + γ(u),  y(0) = s.

Finally, an ISS estimate is obtained from V(x⁰) ≤ ᾱ(|x⁰|).

The proof of the converse part of the theorem is based upon first showing that ISS implies robust stability in the sense already discussed and then obtaining a converse Lyapunov theorem for robust stability for the system ẋ = f(x, d·ρ(|x|)) = g(x, d), which is asymptotically stable uniformly on all Lebesgue-measurable functions d(·): [0,∞) → B(0,1). This last theorem was given in Lin et al. (1996) and is basically a theorem on Lyapunov functions for differential inclusions. The classical result of Massera (1956) for differential equations (with no inputs) becomes a special case.

Using "Energy" Estimates Instead of Amplitudes

In linear control theory, H∞ theory studies L² → L² induced norms, which under coordinate changes leads to the following type of estimate:

∫₀ᵗ α(|x(s)|) ds ≤ α₀(|x⁰|) + ∫₀ᵗ γ(|u(s)|) ds

along all solutions and for some α, α₀, γ ∈ K∞. Just for the statement of the next result, a system is said to satisfy an integral-integral estimate if for every initial state x⁰ and input u, the solution x(t, x⁰, u) is defined for all t > 0 and an estimate as above holds. (In contrast to ISS, this definition explicitly demands that t_max = ∞.)

Theorem 6 (Sontag 1998). A system is ISS if and only if it satisfies an integral-integral estimate.
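For a scalar illustration of such an estimate (an assumed example, not from this entry): for ẋ = −x + u, integrating V̇ ≤ −x²/2 + u²/2 with V = x²/2 yields ∫₀ᵗ x(s)² ds ≤ (x⁰)² + ∫₀ᵗ u(s)² ds, an integral-integral estimate with quadratic comparison functions. The sketch below checks it along a forward-Euler simulation with an arbitrary input.

```python
import math

def integral_estimate_slack(x0, dt=1e-3, t_end=10.0):
    # forward-Euler simulation of x' = -x + u with u(t) = sin(2t);
    # returns RHS - LHS of  int_0^t x^2 ds <= x0^2 + int_0^t u^2 ds
    x, t = x0, 0.0
    int_x2 = int_u2 = 0.0
    while t < t_end:
        u = math.sin(2.0 * t)
        int_x2 += dt * x * x
        int_u2 += dt * u * u
        x += dt * (-x + u)
        t += dt
    return (x0 * x0 + int_u2) - int_x2

slack = integral_estimate_slack(3.0)
print("slack in the integral-integral estimate:", round(slack, 3))
```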


This theorem is quite easy to prove, in view of previous results. A sketch of proof is as follows. If the system is ISS, then there is an ISS-Lyapunov function satisfying V̇(x, u) ≤ −V(x) + γ(|u|), so, integrating along any solution:

∫₀ᵗ V(x(s)) ds ≤ −∫₀ᵗ V̇(x(s)) ds + ∫₀ᵗ γ(|u(s)|) ds ≤ V(x(0)) + ∫₀ᵗ γ(|u(s)|) ds

and thus an integral-integral estimate holds. Conversely, if such an estimate holds, one can prove that ẋ = f(x, 0) is stable and that an asymptotic gain exists.

Integral Input to State Stability

A concept of nonlinear stability that is truly distinct from ISS arises when considering a mixed notion which combines the "energy" of the input with the amplitude of the state. A system is said to be integral-input to state stable (iISS) provided that there exist α, γ ∈ K∞ and β ∈ KL such that the estimate

α(|x(t)|) ≤ β(|x⁰|, t) + ∫₀ᵗ γ(|u(s)|) ds    (iISS)

holds along all solutions. Just as with ISS, one could state this property merely for all times t ∈ [0, t_max(x⁰, u)). Since the right-hand side is bounded on each interval [0, t] (because, recall, inputs are by definition assumed to be bounded on each finite interval), it is automatically true that t_max(x⁰, u) = +∞ if such an estimate holds along maximal solutions. So forward-completeness (solution exists for all t > 0) can be assumed with no loss of generality.

One might also consider the following type of "weak integral to integral" mixed estimate:

∫₀ᵗ α(|x(s)|) ds ≤ κ(|x⁰|) + ∫₀ᵗ α(γ(|u(s)|)) ds

for appropriate K∞ functions (note the additional "α").

Theorem 7 (Angeli et al. 2000b). A system satisfies a weak integral to integral estimate if and only if it is iISS.

Another interesting variant is found when considering mixed integral/supremum estimates:

α(|x(t)|) ≤ β(|x⁰|, t) + ∫₀ᵗ γ₁(|u(s)|) ds + γ₂(‖u‖∞)

for suitable β ∈ KL and α, γᵢ ∈ K∞. One then has

Theorem 8 (Angeli et al. 2000b). A system satisfies a mixed estimate if and only if it is iISS.

Dissipation Characterization of iISS

A smooth storage function V is an iISS-Lyapunov function for the system ẋ = f(x, u) if there are a γ ∈ K∞ and an α: [0,+∞) → [0,+∞) which is merely positive definite (i.e., α(0) = 0 and α(r) > 0 for r > 0) such that the inequality

V̇(x, u) ≤ −α(|x|) + γ(|u|)    (L-iISS)

holds for all (x, u) ∈ ℝⁿ × ℝᵐ. To compare, recall that an ISS-Lyapunov function is required to satisfy an estimate of the same form but where α is required to be of class K∞; since every K∞ function is positive definite, an ISS-Lyapunov function is also an iISS-Lyapunov function.

Theorem 9 (Angeli et al. 2000a). A system is iISS if and only if it admits a smooth iISS-Lyapunov function.

Since an ISS-Lyapunov function is also an iISS one, ISS implies iISS. However, iISS is a strictly weaker property than ISS, because α may be bounded in the iISS-Lyapunov estimate, which means that V may increase, and the state become unbounded, even under bounded inputs, so long

as γ(|u(t)|) is larger than the range of α. This is also clear from the iISS definition, since a constant input with |u(t)| = r results in a term in the right-hand side that grows like rt.

An interesting general class of examples is given by bilinear systems

ẋ = (A + Σᵢ₌₁ᵐ uᵢAᵢ) x + Bu

for which the matrix A is Hurwitz. Such systems are always iISS (see Sontag 1998), but they are not in general ISS. For instance, in the case when B = 0, boundedness of trajectories for all constant inputs already implies that A + Σᵢ₌₁ᵐ uᵢAᵢ must have all eigenvalues with nonpositive real part, for all u ∈ ℝᵐ, which is a condition involving the matrices Aᵢ (e.g., ẋ = −x + ux is iISS but it is not ISS).
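The scalar bilinear system ẋ = −x + ux just mentioned can be simulated to see the difference between the two properties. The sketch below (plain Python; the inputs and horizon are illustrative choices) applies the bounded constant input u ≡ 2, under which the closed form is ẋ = x and the state grows unboundedly (so ISS fails), and the finite-energy input u(t) = 2e^(−t), under which the state eventually decays (consistent with iISS).

```python
import math

def euler_bilinear(u_of_t, x0=1.0, dt=1e-3, t_end=10.0):
    # forward-Euler integration of the scalar bilinear system x' = -x + u(t) x
    x, t = x0, 0.0
    while t < t_end:
        x += dt * (-x + u_of_t(t) * x)
        t += dt
    return x

x_bounded_input = euler_bilinear(lambda t: 2.0)                 # u constant: x' = +x
x_finite_energy = euler_bilinear(lambda t: 2.0 * math.exp(-t))  # integrable input

print("after bounded input u = 2:  x(10) =", x_bounded_input)
print("after finite-energy input:  x(10) =", x_finite_energy)
```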

The notion of iISS is useful in situations where an appropriate notion of detectability can be verified using LaSalle-type arguments. There follow two examples of theorems along these lines.

Theorem 10 (Angeli et al. 2000a). A system is iISS if and only if it is 0-GAS and there is a smooth storage function V such that, for some σ ∈ K∞:

V̇(x, u) ≤ σ(|u|)

for all (x, u).

The sufficiency part of this result follows from the observation that the 0-GAS property by itself already implies the existence of a smooth and positive definite, but not necessarily proper, function V₀ such that V̇₀ ≤ σ₀(|u|) − α₀(|x|) for all (x, u), for some σ₀ ∈ K∞ and positive definite α₀ (if V₀ were proper, then it would be an iISS-Lyapunov function). Now, one uses V₀ + V as an iISS-Lyapunov function (V provides properness).

Theorem 11 (Angeli et al. 2000a). A system is iISS if and only if there exists an output function y = h(x) (continuous and with h(0) = 0) which provides zero detectability (u ≡ 0 and y ≡ 0 ⇒ x(t) → 0) and dissipativity in the following sense: there exists a storage function V and σ ∈ K∞, α positive definite, so that

V̇(x, u) ≤ σ(|u|) − α(h(x))

holds for all (x, u).

Angeli et al. (2000b) contains several additional characterizations of iISS.

Superposition Principles for iISS

There are also asymptotic gain characterizations for iISS. A system is bounded energy weakly converging state (BEWCS) if there exists some σ ∈ K∞ so that the following implication holds:

∫₀^{+∞} σ(|u(s)|) ds < +∞  ⇒  lim inf_{t→+∞} |x(t, x⁰, u)| = 0    (BEWCS)

(more precisely: if the integral is finite, then t_max(x⁰, u) = +∞ and the lim inf is zero). It is bounded energy frequently bounded state (BEFBS) if there exists some σ ∈ K∞ so that the following implication holds:

∫₀^{+∞} σ(|u(s)|) ds < +∞  ⇒  lim inf_{t→+∞} |x(t, x⁰, u)| < +∞    (BEFBS)

(again, meaning that t_max(x⁰, u) = +∞ and the lim inf is finite).

Theorem 12 (Angeli et al. 2004). The following three properties are equivalent for any given system ẋ = f(x, u):
• The system is iISS.
• The system is BEWCS and 0-stable.
• The system is BEFBS and 0-GAS.

Summary and Future Directions

This entry focuses on stability notions relative to steady states, but a more general theory is also

possible that allows consideration of more arbitrary attractors, as well as robust and/or adaptive concepts. Much else has been omitted from this entry. Most importantly, one of the key results is the ISS small-gain theorem due to Jiang et al. (1994), which provides a powerful sufficient condition for the interconnection of ISS systems being itself ISS.

Other topics not treated include, among many others, all notions involving outputs; ISS properties of time-varying (and in particular periodic) systems; ISS for discrete-time systems; questions of sampling, relating ISS properties of continuous and discrete-time systems; ISS with respect to a closed subset K; stochastic ISS; applications to tracking, vehicle formations ("leader to followers" stability); and averaging of ISS systems. Sontag (2006) may also be consulted for further references, a detailed development of some of these ideas, and citations to the literature for others. In addition, the textbooks Isidori (1999), Krstić et al. (1995), Khalil (1996), Sepulchre et al. (1997), Krstić and Deng (1998), Freeman and Kokotović (1996), and Isidori et al. (2003) contain many extensions of the theory as well as applications.

Cross-References

▶ Feedback Stabilization of Nonlinear Systems
▶ Fundamental Limitation of Feedback Control
▶ Linear State Feedback
▶ Lyapunov's Stability Theory
▶ Stability and Performance of Complex Systems Affected by Parametric Uncertainty
▶ Stability: Lyapunov, Linear Systems

Bibliography

Angeli D, Sontag ED, Wang Y (2000a) A characterization of integral input-to-state stability. IEEE Trans Autom Control 45(6):1082–1097
Angeli D, Sontag ED, Wang Y (2000b) Further equivalences and semiglobal versions of integral input to state stability. Dyn Control 10(2):127–149
Angeli D, Ingalls B, Sontag ED, Wang Y (2004) Separation principles for input-output and integral-input-to-state stability. SIAM J Control Optim 43(1):256–276
Freeman RA, Kokotović PV (1996) Robust nonlinear control design, state-space and Lyapunov techniques. Birkhäuser, Boston
Isidori A (1999) Nonlinear control systems II. Springer, London
Isidori A, Marconi L, Serrani A (2003) Robust autonomous guidance: an internal model-based approach. Springer, London
Jiang Z-P, Teel A, Praly L (1994) Small-gain theorem for ISS systems and applications. Math Control Signals Syst 7:95–120
Khalil HK (1996) Nonlinear systems, 2nd edn. Prentice-Hall, Upper Saddle River
Krstić M, Deng H (1998) Stabilization of uncertain nonlinear systems. Springer, London
Krstić M, Kanellakopoulos I, Kokotović PV (1995) Nonlinear and adaptive control design. Wiley, New York
Lin Y, Sontag ED, Wang Y (1996) A smooth converse Lyapunov theorem for robust stability. SIAM J Control Optim 34(1):124–160
Massera JL (1956) Contributions to stability theory. Ann Math 64:182–206
Praly L, Wang Y (1996) Stabilization in spite of matched unmodelled dynamics and an equivalent definition of input-to-state stability. Math Control Signals Syst 9:1–33
Sepulchre R, Jankovic M, Kokotović PV (1997) Constructive nonlinear control. Springer, New York
Sontag ED (1989) Smooth stabilization implies coprime factorization. IEEE Trans Autom Control 34(4):435–443
Sontag ED (1998) Comments on integral variants of ISS. Syst Control Lett 34(1–2):93–100
Sontag ED (1999) Stability and stabilization: discontinuities and the effect of disturbances. In: Nonlinear analysis, differential equations and control (Montreal, 1998). NATO science series C, Mathematical and physical sciences, vol 528. Kluwer Academic, Dordrecht, pp 551–598
Sontag ED (2006) Input to state stability: basic concepts and results. In: Nistri P, Stefani G (eds) Nonlinear and optimal control theory. Springer, Berlin, pp 163–220
Sontag ED, Wang Y (1995) On characterizations of the input-to-state stability property. Syst Control Lett 24(5):351–359
Sontag ED, Wang Y (1996) New characterizations of input-to-state stability. IEEE Trans Autom Control 41(9):1283–1294
Willems JC (1976) Mechanisms for the stability and instability in feedback systems. Proc IEEE 64:24–35

Interactive Environments and Software Tools for CACSD

Vasile Sima
Advanced Research, National Institute for Research & Development in Informatics, Bucharest, Romania

Abstract

The main functional and support facilities offered by interactive environments and tools for computer-aided control system design (CACSD) and reference examples of such software systems are presented, from both a user and a developer perspective. The essential functions these environments should possess and requirements which should be satisfied are discussed. The importance of reliability and efficiency is highlighted, besides the desired friendliness and flexibility of the user interface. Widely used environments and software tools for CACSD, including MATLAB, Mathematica, Maple, and the SLICOT Library, serve as illustrative examples.

Keywords

Automatic control; Controller design; Numerical algorithms; Simulation; User interface

Introduction

The complexity of many processes or systems to be controlled, and the strong performance requirements to be fulfilled nowadays, make it very difficult or even impossible to design suitable control laws and algorithms without resorting to computers and dedicated software tools. Computer-aided control system design (CACSD) is the use of computer programs to support the creation, analysis, evaluation, or optimization of a control system design. CACSD is a specialization of computer-aided design (CAD) for control systems. CAD is used in many domains, to enhance the designer's productivity and the design quality and to manage the design versions and documentation. CACSD is not a new paradigm, since the first such software systems were developed about 50 years ago. See the historical overview in a companion paper.

The interactive environments and tools for CACSD have evolved significantly during the last decades, in parallel with the developments of numerical linear algebra, scientific computations, and computer hardware and software, including programming and networking capabilities. Starting from simple collections of specialized tools for solving well-defined system analysis and design problems, CACSD became increasingly sophisticated and powerful, allowing complicated tasks to be orchestrated for fully covering the stages of control engineering design, prototyping, and testing, including even the transfer to practical systems and applications. Modeling, system analysis and synthesis, and control system assessment are activities which are assisted by the nowadays advanced CACSD environments and software tools. The main aim is to help the designer to concentrate on the design problem itself, not on theoretical approaches, numerical algorithms, and computational details. Moreover, CACSD environments allow the developers and users to do conceptual thinking, but also programming and debugging, at a higher level of abstraction, in comparison with standard programming languages, like Fortran, C/C++, or Java™.

There are both commercial and free, open-source CACSD environments and tools. State-of-the-art CACSD systems exist for several platforms (Windows, Linux/UNIX, and Mac OS X). Multiple high-speed CPUs, graphics cards, and large amounts of RAM are well suited to perform graphically and computationally intensive tasks. A common feature is the presence of a "friendly" graphical user interface, but often a dedicated command language is also available. The user interacts with the CACSD environment, e.g., by specifying the model or control structure, the design requirements, and the values of essential parameters or by selecting and combining
Interactive Environments and Software Tools for CACSD 585

Interactive CACSD environment (e.g., MATLAB, Mathematica, Maple)


(for modeling, simulation, analysis, synthesis, etc.)
Toolboxes or packages with executables or functions written in the environment language
(Graphical) User interface, Interactive language, Graphical functions, API
CACSD subroutine libraries (e.g., SLICOT)
Mathematical subroutine libraries (e.g., LAPACK, ARPACK, IMSL, NAG)
Computer-optimized mathematical libraries or their generators (e.g., BLAS, MKL, ATLAS)
Libraries of intrinsic functions (e.g., in Fortran or C/C++)

Interactive Environments and Software Tools for CACSD, Fig. 1 Hierarchy of the software components incorpo-
rated in an interactive CACSD environment

the tools to be used. The process can be repeated until a satisfactory behavior is obtained.
Usually, the underlying computational tools on which the interactive environments are based are hidden from the user. Moreover, software for extensive testing is not normally provided, but demonstrators running a few examples are offered.
Unfortunately, even mathematically simple problems of small dimension can lead to wrong results when unsuitable algorithms are used. Illustrative control-related examples are given, e.g., in Van Huffel et al. (2004). Since system analysis and design tasks usually involve sequential or iterative solution of large and complex subproblems, it follows that the quality of the intermediate results is of utmost importance. Consequently, the interactive environments for CACSD should be based on reliable, efficient, and thoroughly tested computational building blocks, which are called at the lower layers of calculations. These blocks constitute the computational engine of an interactive environment. Figure 1 gives a typical hierarchy of the software components incorporated in an interactive CACSD environment.

Interactive Environments for CACSD

Main Functionality
A comprehensive set of functions of, and requirements for, interactive environments for control engineering is described in MacFarlane et al. (1989), but such a set has probably not yet been covered by any single environment. State-of-the-art interactive environments for CACSD include many attractive functional features:
• Define or find (via first principles or system identification) various system models (e.g., state-space models or transfer-function matrices) and convert between different representations
• Find reduced-order (or simplified) models, which can be used more economically for simulation, control, prediction, etc.
• Analyze basic system properties, like stability, controllability, observability, stabilizability, detectability, minimality, properness, etc.
• Analyze interactively the behavior of a control system for various scenarios
• Provide alternative tools for different categories of users, from novice to expert, and from classical to "modern" or advanced analysis and synthesis techniques, in the time domain or frequency domain
• Provide a wide range of tools, covering modeling, system identification, filtering, control system design, simulation, real-time behavior, hardware-in-the-loop simulation, and code generation for easy deployment, and ensure their interoperability
• Allow the user to add extensions at various levels (new functions, interfaces, or even toolboxes or packages), which can be made available to a general community, and allow customization
In addition to the functional and computational tools, essential components of an interactive environment are the user interface, the application program interface (API), and the support tools, which make it easy to specify, document, and store a design solution, to visualize and interpret the results, to export them to other applications for further processing, to generate reports, etc.
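The model-definition and conversion features in the list above can be illustrated with a tiny sketch. The function below is a hypothetical helper, not taken from any of the packages discussed here: it converts a two-state single-input single-output state-space model (A, B, C) with zero feedthrough into a transfer function C(sI − A)^{-1}B, using det(sI − A) = s² − tr(A)s + det(A) and the 2×2 adjugate.

```python
def ss2tf_2x2(A, B, C):
    """Transfer function C (sI - A)^{-1} B of a 2-state SISO system.

    Assumes zero feedthrough (D = 0). Returns (num, den) coefficient
    lists, highest power of s first, using the characteristic polynomial
    det(sI - A) = s^2 - tr(A) s + det(A) and the 2x2 adjugate formula
    adj(sI - A) = s*I + M with M = [[-a22, a12], [a21, -a11]].
    """
    (a11, a12), (a21, a22) = A
    den = [1.0, -(a11 + a22), a11 * a22 - a12 * a21]
    M = [[-a22, a12], [a21, -a11]]
    CB = C[0] * B[0] + C[1] * B[1]                       # coefficient of s
    CMB = (C[0] * (M[0][0] * B[0] + M[0][1] * B[1])      # constant term
           + C[1] * (M[1][0] * B[0] + M[1][1] * B[1]))
    return [CB, CMB], den

# Illustrative system: x1' = x2, x2' = -2 x1 - 3 x2 + u, y = x1
A = [[0.0, 1.0], [-2.0, -3.0]]
B = [0.0, 1.0]
C = [1.0, 0.0]
num, den = ss2tf_2x2(A, B, C)
print(num, den)   # [0.0, 1.0] and [1.0, 3.0, 2.0]: G(s) = 1/(s^2 + 3s + 2)
```

Real CACSD environments perform such conversions for arbitrary dimensions, typically with numerically careful algorithms rather than the adjugate formula, which is used here only because it is short.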
586 Interactive Environments and Software Tools for CACSD

A good paradigm for the data environment is object orientation.
It is a common feature of an interactive environment for CACSD to address the requirements of a large diversity of users, in various stages of familiarity with the environment. This feature is expressed, e.g., by the option to use either a graphical user interface or a command language to call and sequence various computational procedures. In addition, tools for easily building new computational or graphical procedures, or for managing the codes and results, are often included. The command language should operate both on low-level data constructs, such as a matrix, and on high-level ones (e.g., system objects), and it should allow operator overloading (e.g., taking G1*G2 as the result of a series interconnection of the systems represented by the system objects G1 and G2).
An environment for CACSD should integrate advanced user interfaces and APIs, a collection of problem solvers based on reliable and efficient numerical and possibly symbolic algorithms, and tools for visualizing and interpreting the results. Widely used environments of this kind are, for instance, MATLAB from The MathWorks, Inc., Mathematica from Wolfram Research, and Maple from Waterloo Maple Inc. (Maplesoft). Earlier developments of CACSD packages are surveyed in Frederick et al. (1991). There are also environments dedicated to modeling and simulation, which cover a broad range of technical and engineering computations, including those for mechanical, electrical, thermodynamic, hydraulic, pneumatic, or thermal systems. An example is Dymola, presented in a subsequent subsection.
Reference interactive environments and tools for CACSD are presented in the following (sub)sections.

Reference Interactive Environments
MATLAB (MATrix LABoratory) is an integrated, interactive environment for technical computing, visualization, and programming (MathWorks 2013). Based on a powerful high-level interpreted language and development tools, with an easy-to-use, flexible, and customizable graphical user interface, complemented with attractive visualization capabilities, and open to extensions with new toolkits, MATLAB can be used for solving intricate scientific and engineering problems, as well as for the development and deployment of applications. MATLAB® and Simulink® are registered trademarks of The MathWorks, Inc. MATLAB, Simulink, and several toolboxes, including the System Identification Toolbox, Control System Toolbox, and Robust Control Toolbox, are suitable for solving various control engineering problems; other toolboxes, such as the Signal Processing Toolbox, Optimization Toolbox, and Symbolic Math Toolbox, offer additional useful facilities. See http://www.mathworks.com/products/.
Simulink is a high-level implementation of the engineering approach, based on block diagrams, to analyzing and designing control systems. It is also a powerful modeling and multi-domain simulation and model-based design tool for dynamic systems, which supports hierarchical system-level design, simulation, automatic code generation, and continuous test and verification of embedded systems. Simulink offers a graphical editor, customizable block libraries, and solvers for modeling and simulating dynamic systems. The models may include MATLAB algorithms, and the simulation results may be further processed in MATLAB. Managing projects (files, components, data), connecting to hardware for real-time testing, and deploying the designed system are additional, useful Simulink features. Real-Time Workshop code generation speeds up the design and implementation by generating syntactically and semantically correct code which can be uploaded to the target machine.
The MATLAB environment is very suitable for rapid prototyping, seen in a broad sense. This may include not only fully designing and implementing a new control law, testing it on a host computer, and deploying it on a target computer but also support for developing and testing new mathematical or control theories and algorithms. Born around 1980, MATLAB has evolved and improved impressively. Since 2004, two releases have been issued each year. There was a major change of the interface in Release 2012b, visible
both in the core MATLAB "Desktop" and in Simulink. The so-called Toolstrip interface replaces the former menus and toolbars and includes tabs which group functionality for common tasks. A gallery of applications from the MATLAB family of products is additionally available and can be extended by the user.
MATLAB supports developing applications with graphical user interface (GUI) features; this itself can be done graphically using GUIDE (the GUI development environment). MATLAB has support for object-oriented programming and for interfacing with other languages or connecting to similar environments such as Maple or Mathematica.
When using the command-line interface, MATLAB helps the user, e.g., by showing the arguments of the typed MATLAB functions; also, MATLAB allows execution profiling, for increasing the computational efficiency, and its editor can suggest changes in the user functions (the so-called M-files) for improving the performance.
MATLAB users may upload their own contributions to the MATLAB Central website or may download tools developed by other people. User feedback is used by the MATLAB developers to improve the functionality, reliability, and efficiency of the computations.
Commercial competitors to MATLAB include Mathematica, Maple, and IDL; free open-source alternatives are, e.g., GNU Octave, FreeMat, and Scilab, intended to be mostly compatible with the MATLAB language. For instance, a set of free CACSD tools for GNU Octave version 3.6.0 or beyond has very recently been developed (see http://octave.sourceforge.net/control/). The Octave extension package called control is based on the SLICOT Library and includes functionalities for system identification, system analysis, control system design (including H∞ synthesis), and model reduction, which are the basic steps of the control engineer's design workflow.

Mathematica is an interactive environment which supports complete computational workflows, making it suitable for a convenient path from ideas to deployed solutions (see http://www.wolfram.com/mathematica/). Mathematica offers, e.g., tools for 2D and 3D data and function visualization and animation, numeric and symbolic tools for discrete and continuous calculus, a toolkit for adding user interfaces to applications, control systems libraries, tools for parallel programming, etc. High-performance computing capabilities include the use of packed and sparse arrays, multiple precision arithmetic, automatic multithreading on multi-core computers (based on processor-specific optimized libraries), hardware accelerators, support for grid technology, and CUDA and OpenCL GPU hardware. Mathematica and SystemModeler (based on the Modelica® language) offer numerous built-in functions which make it possible to design, analyze, and simulate continuous- and discrete-time control systems; simplify models; interactively test controllers; and document the design. Both classical and modern techniques are provided. A powerful symbolic-numeric computation engine and highly efficient numerical algorithms are used. Mathematica allows the system models to be defined in a more natural form than MATLAB. It can analyze not only numeric systems but also symbolic ones, represented by state-space or transfer-function models. The computational precision and algorithms can be automatically controlled and selected, respectively, and using arbitrary precision arithmetic is possible.

Maple is a computer algebra system which combines a powerful engine for mathematical calculations with an intuitive user interface (see http://www.maplesoft.com/). Classical mathematical notation can be used, and the interface is customizable. Arbitrary precision numerical computations, as well as symbolic computations, can be performed. The core of the Maple language is provided by a small kernel. The NAG Numerical Libraries, ATLAS libraries, and other libraries written in this language are used for numerical calculations. Symbolic expressions are stored as directed acyclic graphs. The latest release, Maple 17, added hundreds of new problem-solving commands and interface enhancements. Many calculations recorded an impressive improvement in efficiency, compared
to the previous release. Examples include calculations with complex floating-point numbers and linear algebra operations. It is possible to use multiple cores and CPUs. The parallel memory management has been improved. Maple includes some CACSD tools for linear and nonlinear dynamic systems. For instance, the built-in package DynamicSystems (available since the Maple 12 release) covers the analysis of linear time-invariant systems. Numerical solvers for Sylvester and Lyapunov equations were added to the LinearAlgebra package in Maple 13, and solvers for algebraic Riccati equations, based on SLICOT Library routines, were included in Maple 14 (available in multiple precision arithmetic since Maple 15). Moreover, the MapleSim environment, based on Modelica, is dedicated to physical modeling and simulation. Symbolic simplification, numerical solution of the differential-algebraic equations (DAEs), and model post-processing (sensitivity analysis, linearization, parameter optimization, code generation, etc.) can be performed in MapleSim. Its Control Design Toolbox provides solutions for optimal control, Kalman filtering, pole assignment, etc. Bidirectional communication with MATLAB is possible.

MuPAD is another computer algebra system, initially developed by a group at the University of Paderborn, Germany, and then in cooperation with SciFace Software GmbH & Co. KG, a company purchased in 2008 by The MathWorks, Inc. MuPAD has been used with Scilab, and it is now available in the Symbolic Math Toolbox. MuPAD is able to operate on formulas symbolically or numerically (with specified accuracy). It offers a programming language allowing object-oriented and functional programming; several packages for linear algebra, differential equations, number theory, and statistics; an interactive graphical system supporting animations and transparent areas in three-dimensional images; etc.

LabVIEW (Laboratory Virtual Instrumentation Engineering Workbench), from National Instruments, is an interactive development environment, based on MATRIXx, for a visual programming language mainly used for data acquisition, instrument control, and industrial automation. Its Control Design and Simulation Module (see http://sine.ni.com/psp/app/doc/p/id/psp-648/lang/en) can be used to build process and controller models using transfer-function, state-space, or zero-pole-gain representations, analyze the open- and closed-loop system behavior, deploy the designed controllers to real-time hardware using built-in functions and the LabVIEW Real-Time Module, etc.

Software Tools for CACSD

The software tools for CACSD are formally divided below into computational and support tools. The SLICOT Library and Dymola serve as illustrative examples. The support tools can also include computational components.

Computational Tools
The computational tools for CACSD implement the main numerical algorithms of systems and control theory and should satisfy several strong requirements:
• Reliability or guaranteed accuracy, which implies the use of numerically stable algorithms as much as possible and the estimation of the problem sensitivity (conditioning) and of the accuracy of the results; backward numerical stability ensures that the computed results are exact for slightly perturbed original data.
• Computational efficiency, which is important for large-scale engineering design problems or for real-time control.
• Robustness, which is mainly ensured by avoiding overflows, harmful underflows, and unacceptable accumulation of rounding errors; scaling the data may be essential.
• Ease of use, achieved by a simplified user interface (hiding the details) and default values for algorithmic parameters, such as tolerances.
• Wide scope and rich functionality, which address the range of problems and system representations that can be handled.
• Portability to various platforms, in the sense of functional correctness.
• Reusability, in building several dedicated engineering software systems or environments.
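The reliability requirement above is easy to motivate with a tiny experiment: even a mathematically trivial expression can lose most of its accuracy when evaluated by an unsuitable formula. The sketch below is a generic floating-point illustration, not tied to any particular CACSD routine; it evaluates (e^x − 1)/x, whose exact value is close to 1 for small x, both naively and via the numerically stable expm1.

```python
import math

x = 1e-12

# Naive evaluation: exp(x) is rounded to a double very close to 1, so the
# subtraction exp(x) - 1 cancels almost all significant digits.
naive = (math.exp(x) - 1.0) / x

# Stable evaluation: expm1 computes exp(x) - 1 without the cancellation.
stable = math.expm1(x) / x

print(naive)   # roughly 1.000089 on IEEE-754 doubles: relative error ~1e-4
print(stable)  # ~1.0000000000005, essentially the exact value 1 + x/2
```

The same phenomenon, on a larger scale, is why CACSD engines insist on numerically stable algorithms and condition estimates rather than textbook formulas.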
More details are given, e.g., in Van Huffel et al. (2004). An example addressing all these aspects is discussed in what follows.

The SLICOT Library (Benner et al. 1999; Van Huffel et al. 2004) is one of the most comprehensive libraries for numerical computations in control theory, containing over 500 subroutines which cover system analysis, benchmark and test problems, data analysis, filtering, identification, mathematical routines, some capabilities for nonlinear systems, synthesis, system transformation, and utility routines (see http://www.slicot.org/). The requirements above have been taken into account in the SLICOT Library development. Some of the SLICOT components are used in several interactive environments for CACSD, including MATLAB, Maple, Scilab, and the Octave control package. The library is still under development. Worth mentioning is the new focus on structure-preserving algorithms, which offer increased accuracy, reliability, and efficiency in comparison with standard solvers. Many procedures for optimal control and filtering, model reduction, etc., can benefit from using the "structured" solvers. There are also separate SLICOT-based toolboxes for MATLAB (Benner et al. 2010). SLICOT components follow predefined implementation and documentation standards.
SLICOT Library routines, and functions from many interactive environments for CACSD, call components from the Basic Linear Algebra Subprograms (BLAS; see Dongarra et al. 1990 and the references therein) and the Linear Algebra PACKage (LAPACK; Anderson et al. 1999). This approach enhances portability and efficiency, since optimized BLAS and LAPACK libraries are provided for major computer platforms.

Support Tools
The support software tools for CACSD offer additional capabilities compared to the computational tools. They may include alternative algorithms, symbolic computations (usually for low-dimensional problems), and extended functionality, e.g., for modeling/simulation of nonlinear systems, code generation, etc. The support tools can be used by software developers of CACSD environments or computational tools, or directly by other users. For instance, symbolic calculations are useful for checking the accuracy of numerical algorithms. The code generation facility offers safe and convenient support for deploying a design solution to the control hardware. A reference support software tool is briefly presented below.

Dymola (Dynamic Modeling Laboratory), from Dassault Systèmes (see http://www.3ds.com/products/catia/portfolio/dymola), deals with high-fidelity modeling and simulation of complex systems from various domains, like aerospace, automotive, robotics, process control, and other applications. Compatible and comprehensive model libraries, developed by leading experts, exist for many engineering branches. The users may create their own libraries or adapt existing ones. This flexibility and openness is provided by the use of the open, object-oriented modeling language Modelica®, currently developed further by the Modelica Association.
Equation-oriented models, based on DAEs, and symbolic manipulation are used, stimulating the reuse of components and enhancing the reliability and efficiency of the calculations. This approach simplifies generating the equations which result from interconnecting various subsystems and makes it possible to deal with algebraic loops and structurally singular models. Algebraic loops are encountered when some auxiliary variables depend algebraically upon each other in a mutual way (Cellier and Elmqvist 1992). Structural singularities are related to DAEs of index higher than 1.
Dymola allows performing hardware-in-the-loop simulation and real-time 3D animation. A model can be built by graphical composition, connecting components from various libraries using simple drag-and-drop operations. The parameters a model depends on can be tuned either by parameter estimation (also called model calibration), which minimizes the error between the physical measurements and the simulation results, or by optimization, which minimizes certain performance criteria. Sometimes, e.g., when designing certain controllers, the criteria values are obtained by simulation. Dymola also offers facilities for model management, including
checking, testing, encrypting, or comparing models, and version control.

Summary and Future Directions

The main functional and support facilities offered by interactive environments and software tools for CACSD have been presented, together with reference examples. Their remarkable evolution during the past decades, combined with the importance of the design solutions they offer, is a strong argument that the CACSD software arsenal will continue to evolve and that more reliable, efficient, and powerful systems will come into place. Progress is expected at all levels, including basic algorithms and numerical and symbolic libraries, but also command languages, user interfaces, human-machine communication, and associated hardware. Tools for adaptive, nonlinear, and distributed control systems design should be developed and integrated. Artificial intelligence support might be required to add expert capabilities to the forthcoming interactive environments.

Cross-References

Computer-Aided Control Systems Design: Introduction and Historical Overview
Model Order Reduction: Techniques and Tools
Multi-domain Modeling and Simulation
Optimization-Based Control Design Techniques and Tools
Robust Synthesis and Robustness Analysis Techniques and Tools
System Identification: An Overview
Validation and Verification Techniques and Tools

Recommended Reading

CACSD is well presented in many textbooks. A very recent one is Chin (2012), which covers modeling, control system design, implementation, and testing, and describes practical applications using MATLAB and Simulink.
Many IFAC (International Federation of Automatic Control) and IEEE (Institute of Electrical and Electronics Engineers) international conferences and symposia have been dedicated to CACSD, going back more than two decades. A wealth of material is available, e.g., on IEEE Xplore (ieeexplore.ieee.org), containing the proceedings of many of the IEEE CACSD events. A recent event is the 2011 IEEE International Symposium on CACSD. Similar IEEE events were held in 2010, 2008, 2006, 2004, 2002, 2000, 1999, 1996, 1994, 1992, and 1989. A new IEEE CACSD Conference, for Systems under Uncertainty, took place in July 2013.

Bibliography

Anderson E, Bai Z, Bischof C, Blackford S, Demmel J, Dongarra J, Du Croz J, Greenbaum A, Hammarling S, McKenney A, Sorensen D (1999) LAPACK users' guide, 3rd edn. SIAM, Philadelphia
Benner P, Mehrmann V, Sima V, Van Huffel S, Varga A (1999) SLICOT – a subroutine library in systems and control theory. In: Datta BN (ed) Applied and computational control, signals, and circuits, vol 1. Birkhäuser, Boston, pp 499–539
Benner P, Kressner D, Sima V, Varga A (2010) Die SLICOT-Toolboxen für Matlab. Automatisierungstechnik 58:15–25
Cellier FE, Elmqvist H (1992) The need for automated formula manipulation in object-oriented continuous-system modeling. In: Proceedings of the IEEE symposium on computer-aided control system design, Tucson, 17–19 Mar 1992, pp 1–8
Chin CS (2012) Computer-aided control systems design: practical applications using MATLAB® and Simulink®. CRC, Boca Raton
Dongarra JJ, Du Croz J, Duff IS, Hammarling S (1990) Algorithm 679: a set of level 3 basic linear algebra subprograms. ACM Trans Math Softw 16(1):18–28
Frederick DK, Herget CJ, Kool R, Rimvall M (1991) ELCS – the extended list of control software. Report, Department of Mathematics and Computer Science, Eindhoven University of Technology, Eindhoven
MacFarlane AGJ, Grübel G, Ackermann J (1989) Future design environments for control engineering. Automatica 25:165–176
MathWorks (2013) MATLAB® Primer, R2013a. The MathWorks, Inc., Natick
Van Huffel S, Sima V, Varga A, Hammarling S, Delebecque F (2004) High-performance numerical software for control. IEEE Control Syst Mag 24:60–76
Inventory Theory

Suresh P. Sethi
Jindal School of Management, The University of Texas at Dallas, Richardson, TX, USA

Abstract

This entry is a brief survey of classical inventory models and their extensions in several directions, such as world-driven demands, presence of forecast updates, multi-delivery modes and advanced demand information, incomplete inventory information, and decentralized inventory control in the context of supply chain management. Important references are provided. We conclude with suggestions for future research.

Keywords

Base stock policy; EOQ model; Incomplete information; Newsvendor model; (s, S) policy

Introduction

Optimal inventory theory deals with managing stock levels of goods to effectively meet the demand for those goods. Because of the huge amount of capital that is tied up in inventory, its management is critical to the profitability of firms. A systematic analysis of inventory problems began with the development of the classical economic order quantity (EOQ) formula of Ford W. Harris in 1913. A substantial amount of research was reported in 1958 by Kenneth J. Arrow, Samuel Karlin, and Herbert Scarf, and much more has accumulated since then. Books on the topic include Zipkin (2000), Porteus (2002), Axsäter (2006), and Bensoussan (2011).
In this entry, we review single- and multi-period models with deterministic, stochastic, or partially observed demand for a single product. In these models, our aim is to decide on the time of the orders and the order quantities. The time between issuing an order and its receipt is called the lead time. For most of this review, we will assume the lead time to be zero, and the reader can consult the referenced books for nonzero lead time extensions and other topics not covered here.

Deterministic Demand

We will describe two classical models: the EOQ model and the dynamic lot size model.

The EOQ Model
This basic and most important deterministic model is concerned with a product that has a constant demand rate D in continuous time over an infinite horizon. No shortages are allowed. The costs consist of a fixed setup/ordering cost K and a holding cost h per unit of average on-hand stock per unit time. The production/purchase cost per unit time is a sunk cost, since there is no choice of the total amount to produce, and hence it can be ignored. Although dynamic, the model can be reduced to a static model by a simple argument of periodicity. Moreover, it is obvious that one should never produce or order except when the inventory level is zero, and one should order the same lot size Q each time the inventory level reaches zero. Since the average inventory level over time is Q/2 and the number of setups per unit time is D/Q, the long-run average cost to be minimized is KD/Q + hQ/2. The optimal policy that minimizes this cost, obtained using the first-order condition, is to order the lot size

Q = √(2KD/h)    (1)

every time the inventory level reaches zero. Harris (1913) introduced the model. Erlenkotter (1990) provides a historical account of the formula, and Beyer and Sethi (1998) provide a mathematically rigorous proof involving the quasi-variational inequalities (QVI) that arise in the course of dealing with continuous-time optimization problems involving fixed costs.
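The EOQ formula (1) and the cost trade-off behind it are easy to check numerically. The sketch below uses made-up parameter values (K, D, and h are arbitrary illustrative numbers):

```python
import math

def eoq(K, D, h):
    """Economic order quantity, formula (1): Q = sqrt(2KD/h)."""
    return math.sqrt(2.0 * K * D / h)

def average_cost(Q, K, D, h):
    """Long-run average cost: setup cost K*D/Q plus holding cost h*Q/2."""
    return K * D / Q + h * Q / 2.0

# Illustrative values: setup cost K = 100, demand rate D = 400 per unit
# time, holding cost h = 2 per unit of stock per unit time.
K, D, h = 100.0, 400.0, 2.0
Q = eoq(K, D, h)
print(Q)                          # 200.0
print(average_cost(Q, K, D, h))   # 400.0: setup and holding parts balance

# Any other lot size costs more, confirming the first-order condition:
for other in (150.0, 250.0):
    assert average_cost(other, K, D, h) > average_cost(Q, K, D, h)
```

At the optimum the two cost components are equal (200 each here), a well-known property of the EOQ solution.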
The Dynamic Lot Size Model
This is an analogue of the EOQ model when the demand varies over time. Wagner and Whitin (1958) developed it in the discrete-time, finite-horizon framework. With D(t) denoting the demand in period t and other costs similar to those in the EOQ model, they showed that there exists an optimal policy in which an order is issued just as the inventory level reaches zero, except for the first order. This policy is called the zero-inventory policy. With this in hand, the problem reduces to selecting only the order times. This is accomplished by applying a shortest path algorithm. Moreover, there are forward (recursion) procedures for solving the problem.
An important feature of this model is that in most cases, one can detect a forecast horizon which essentially separates earlier periods from later ones. More specifically, T is a forecast horizon if the first order in a T-horizon problem remains optimal in any finite horizon problem with horizon longer than T, regardless of the demands beyond period T. For an extensive bibliography of this literature, see Chand et al. (2002).

Stochastic Demand

We shall discuss three classical models and some of their extensions.

The Single-Period Problem: The Newsvendor Model
The problem of a newsvendor is to decide on an order quantity of newspapers to meet a stochastic demand at minimum cost. If the realized demand is larger than the ordered quantity, the excess demand is lost, and there is an opportunity loss of cu (selling price minus purchase cost) for each paper short. On the other hand, for each paper ordered but not sold, there is an opportunity loss of co (purchase cost plus holding cost). The newsvendor conceptualizes the decision by treating each additional paper as a separate marginal contribution. The first is almost certain to be sold. Each additional paper is less likely to be sold than the previous one. Thus, each additional paper will be worth somewhat less, and the marginal paper at the optimum should be worth exactly zero. Thus, cu times the probability of selling the marginal paper minus co times the probability of not selling it should equal zero. Now, if F denotes the cumulative probability distribution function of the demand D, then the optimal order quantity Q clearly satisfies co·F(Q) − cu·(1 − F(Q)) = 0, which gives us the famous newsvendor formula for the optimal order quantity

Q = F^{-1}(cu/(cu + co)),    (2)

where cu/(cu + co) is known as the critical fractile.
If p denotes the unit sale price, c the unit cost, and h the holding cost per unit per unit time, then cu = p − c and co = c + h, and therefore the critical fractile can be expressed as (p − c)/(p + h). An extension of the newsvendor formula to allow for a unit cost g of lost goodwill and a unit salvage value s received at the end of the period for each unit not sold is immediate. If we let α > 0 denote the periodic discount factor, then cu = p + g − c and co = c + h − αs, and the critical fractile becomes (p + g − c)/(p + g + h − αs); therefore,

Q = F^{-1}((p + g − c)/(p + g + h − αs)).    (3)

The newsvendor model has been used extensively in the context of supply chain management with multiple agents maximizing their individual objectives. In this case, inefficiencies arise due to double marginalization. Then, a question of appropriate contracts that can lead to the first-best solution, or coordinate the supply chain, becomes important. Cachon (2003) surveys this literature.

Multi-period Inventory Models: No Fixed Cost
The newsvendor model is a single-period model, and its multi-period generalization requires that the inventory not sold in a period be carried over to the next period. This results in the multi-period inventory model with lost sales. It is assumed
that demand in each period is independent and where the second term represents the savings due
identically distributed (i.i.d.) with F denoting its to postponing the purchase of the backlogged
cumulative probability distribution function. A demand unit by one period, and co D c C h  ˛c
rigorous analysis requires the method of dynamic as in (4). This gives us the base stock level
programming, and it shows that there is a stock

level St called base stock in period t, that we 1 b  .1  ˛/c
S DF ; (6)
would ideally like to have at the beginning of bCh
period t. Thus, the optimal policy in period t,
called the base stock policy, is to order which can be used in (5) to give the optimal
policy.
n Sometimes it is possible to have multiple de-
Qt .x/ D St x if x<St ; (4)
0 if xSt : livery modes such as fast, regular, and slow as
well as demand forecast updates. Then, at the be-
In the special case when the terminal salvage value of an item is exactly equal to its cost c, it is possible to come up with the optimal policy using intuition. Since we do not need to salvage unused items in the multi-period setting, one can argue that an item carried over to the next period is worth its purchase cost c. Therefore, its presence means that the next period will need to order one less and thus save an amount c. In the last period, when there is no next period, our terminal salvage value assumption also guarantees a leftover item's worth to be also c. Thus, we can modify (3) and obtain a stationary base stock level

S = F^{-1}( (p + g - c) / ((p + g - c) + (c + h - αc)) )
  = F^{-1}( (p + g - c) / (p + g + h - αc) )   (5)

for each period t.

Thus, the elimination of the endgame effect delivers us a myopic policy: a policy optimal in the single-period case is also optimal in the dynamic multi-period setting. A more general concept than the optimality of a myopic policy is that of the forecast horizon mentioned earlier in the context of the dynamic lot size model.

Sometimes, when the demand exceeds the on-hand inventory in the period, the demand is not lost but backlogged. In this case, each unit of backlogged demand is satisfied in the next period, and unit revenue p is recovered, but a unit backlogging cost b is incurred, due to expediting, special handling, delayed receipt of revenue, and loss of goodwill. Thus, c_u = b - (1 - α)c,

beginning of each period, on-hand inventory and demand information are updated. At the same time, decisions on how much to order using each of the modes are made. Fast, regular, and slow orders are delivered at the ends of the current, the next, and one beyond the next periods, respectively. In such models, a modified base stock policy is optimal only for the two fastest modes. For details and further generalization, see Sethi et al. (2005).

An important extension includes serial inventory systems where stage 1 receives supplies from an outside source and each downstream stage receives supplies from its immediate upstream stage. Clark and Scarf (1960) introduced the notion of the echelon inventory position at a stage to consist of the stock at that stage plus stock in transit to that stage plus all downstream stock minus the amount backlogged at the final stage. Then, the optimal ordering policy at each stage is given by an echelon base stock policy with respect to the echelon inventory position at that stage. It is known that assembly systems can be reduced to a serial system. Details can be found in Zipkin (2000).

Multi-period Inventory Models: Fixed Cost

When there is a fixed cost of ordering, it is clear that it would not be reasonable to follow the base stock policy when the inventory level is not much below the base stock level. Indeed, Scarf (1960) proved that there are numbers s_t and S_t, s_t < S_t, for period t such that the optimal policy in period t is to order

Q_t(x) = { S_t - x   if x ≤ s_t,
         { 0         if x > s_t.   (7)
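The critical-fractile base stock level (5) and the (s, S) order rule (7) are straightforward to evaluate. The sketch below uses illustrative cost parameters and a normal demand distribution; these numbers and the choice of distribution are assumptions for the example, not values taken from the text:

```python
from statistics import NormalDist

def base_stock_level(p, g, c, h, alpha, demand):
    """Stationary base stock level of Eq. (5):
    S = F^{-1}((p + g - c) / (p + g + h - alpha*c)),
    where F is the single-period demand distribution."""
    fractile = (p + g - c) / (p + g + h - alpha * c)
    return demand.inv_cdf(fractile)

def order_quantity(x, s, S):
    """(s, S) policy of Eq. (7): order up to S when stock x is at or below s."""
    return S - x if x <= s else 0.0

# Illustrative numbers only: price p, shortage penalty g, unit cost c,
# holding cost h, discount factor alpha, and N(100, 20) demand.
S = base_stock_level(p=10.0, g=4.0, c=6.0, h=1.0, alpha=0.95,
                     demand=NormalDist(100.0, 20.0))
print(round(S, 1))                        # order-up-to level, roughly 121.6
print(order_quantity(30.0, s=40.0, S=S))  # below s: order up to S
print(order_quantity(50.0, s=40.0, S=S))  # above s: order nothing
```

With a base stock policy (no fixed ordering cost), the same rule applies with s = S, so an order is placed whenever the inventory position falls below the base stock level.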
594 Inventory Theory
Such a policy is famously known as an (s, S) policy.

When the demands are not i.i.d., the model has been extended to Markovian demands. In this case, there is an exogenous Markov process, and the distribution of the demand in each period depends on the state of the Markov process, called the demand state, in that period. It can be shown that the optimal policy in period t is (s_t^i, S_t^i), where i denotes the demand state in the period. Such a policy is also called a state-dependent (s, S) policy. Further details are available in Beyer et al. (2010). Recent advances in information technology have allowed managers to obtain advance demand information in addition to forecast updates. In such cases, a state-dependent (s, S) policy can be shown to be optimal. For details, refer to Ozer (2011).

The Continuous-Time Model: Fixed Cost

The marriage of the two classical results (1) and (7) is accomplished by Presman and Sethi (2006) in a continuous-time stochastic inventory model involving a demand that is the sum of a constant demand rate and a compound Poisson process. The optimal policies that minimize a discounted cost or the long-run average cost are both of (s, S) type. The (s, S) policy minimizing the long-run average cost reduces to the EOQ formula when the intensity of the compound Poisson process is set to zero. And when the constant demand component vanishes, the model reduces to the continuous-review stochastic inventory model with fixed cost and compound Poisson demand.

Incomplete Inventory Information Models (i3)

A critical assumption in the vast inventory theory literature has been that the level of inventory at any given time is fully observed. The celebrated results (1) and (7) have been obtained under the assumption of full observation. Yet the inventory level is often not fully observed in practice, for a variety of reasons such as replenishment errors, employee theft, customer shoplifting, improper handling and damaging of merchandise, misplaced inventories, uncertain yield, imperfect inventory audits, and incorrect recording of sales. In such an environment of incomplete information, inventories are known to be partially observed, and most of the well-known inventory policies including (1) and (7) are not even admissible, let alone optimal. In such cases, Bensoussan et al. (2010) show that the dynamic programming equation can be written in terms of the unnormalized conditional probability of the current inventory level given past observations, referred to as signals, instead of just the inventory level in the full observation case. Furthermore, one can write the evolution of the conditional probability in terms of its current value, the current order, and the current observation. However, there are no longer simple optimal policies except in cases of information delay reported in Bensoussan et al. (2009) where modified base stock and (s, S) policies are shown to be optimal.

Summary and Future Directions

We briefly describe some classical results in inventory theory. These are based on full observation. Some recent work on inventory models under incomplete information is reported. This work leads to a number of new research directions, both theoretical and empirical, as reported in Sethi (2010). It would be of much interest to know the industries where the i3 problem is serious enough to warrant the difficult mathematical analysis required. Furthermore, how are the observed signals related to the inventory level? It is also clear from the reviewed literature that there are no simple optimal policies for most i3 problems, so it would be important to develop efficient computational procedures to obtain optimal solutions or to specify a class of simple implementable policies and optimize within this class. An important benefit of solving i3 problems optimally is the provision of an economic justification for technologies such as RFID that may reduce inaccuracies in inventory observations.

Another area of research would be to study multi-period multi-agent supply chains with stochastic inventory dynamics. While these can be formulated as dynamic games, there are a number of equilibrium concepts to deal with,
depending on the information the agents have. Some of them are time consistent or subgame perfect and some are not. Regardless, there are inefficiencies that arise from these decentralized game settings, and developing contracts for coordinating dynamic supply chains remains a wide open topic of research.

Cross-References

► Nonlinear Filters
► Stochastic Dynamic Programming

Bibliography

Arrow KJ, Karlin S, Scarf H (1958) Studies in the mathematical theory of inventory and production. Stanford University Press, Stanford
Axsäter S (2006) Inventory control. Springer, New York
Bensoussan A (2011) Dynamic programming and inventory control. IOS Press, Washington, DC
Bensoussan A, Cakanyildirim M, Feng Q, Sethi SP (2009) Optimal ordering policies for stochastic inventory problems with observed information delays. Prod Oper Manag 18(5):546-559
Bensoussan A, Cakanyildirim M, Sethi SP (2010) Filtering for discrete-time Markov processes and applications to inventory control with incomplete information. In: Crisan D, Rozovsky B (eds) Handbook on nonlinear filtering. Oxford University Press, Oxford, pp 500-525
Beyer D, Cheng F, Sethi SP, Taksar MI (2010) Markovian demand inventory models. International series in operations research and management science. Springer, New York
Beyer D, Sethi SP (1998) A proof of the EOQ formula using quasi-variational inequalities. Int J Syst Sci 29(11):1295-1299
Cachon GP (2003) Supply chain coordination with contracts. In: de Kok AG, Graves S (eds) Supply chain management-handbook in OR/MS, chapter 6. North-Holland, Amsterdam, pp 229-340
Chand S, Hsu VN, Sethi SP (2002) Forecast, solution and rolling horizons in operations management problems: a classified bibliography. Manuf Serv Oper Manag 4(1):25-43
Clark AJ, Scarf H (1960) Optimal policies for a multi-echelon inventory problem. Manag Sci 6(4):475-490
Erlenkotter D (1990) Ford Whitman Harris and the economic order quantity model. Oper Res 38(6):937-946
Harris FW (1913) How many parts to make at once. Fact Mag Manag 10(2):135-136, 152. Reprinted (1990) Oper Res 38(6):947-950
Ozer O (2011) Inventory management: information, coordination, and rationality. In: Kempf KG, Keskinocak P, Uzsoy R (eds) Planning production and inventories in the extended enterprise. Springer, New York
Porteus EL (2002) Stochastic inventory theory. Stanford University Press, Stanford
Presman E, Sethi SP (2006) Inventory models with continuous and Poisson demands and discounted and average costs. Prod Oper Manag 15(2):279-293
Scarf H (1960) The optimality of (s, S) policies in dynamic inventory problems. In: Arrow KJ, Karlin S, Suppes P (eds) Mathematical methods in the social sciences. Stanford University Press, Stanford, pp 196-202
Sethi SP (2010) i3: incomplete information inventory models. Decis Line Oct:16-19
Sethi SP, Yan H, Zhang H (2005) Inventory and supply chain management with forecast updates. Springer, New York
Wagner HM, Whitin TM (1958) Dynamic version of the economic lot size model. Manag Sci 5:89-96
Zipkin P (2000) Foundations of inventory management. McGraw Hill, New York

Investment-Consumption Modeling

L.C.G. Rogers
University of Cambridge, Cambridge, UK

Abstract

The simplest investment-consumption problem is the celebrated example of Robert Merton (J Econ Theory 3(4):373-413, 1971). This survey shows three different ways of solving the problem, each of which is a valuable solution method for more complicated versions of the question.

Keywords

Budget constraint; Hamilton-Jacobi-Bellman (HJB) equation; Merton problem; Value function

Introduction

Consider an investor in a market with a riskless bank account accruing continuously compounded
interest at rate r_t, and with a single risky asset whose price S_t at time t evolves as

dS_t = S_t(σ_t dW_t + μ_t dt),   (1)

where W is a standard Brownian motion, and σ and μ are processes previsible with respect to the filtration of W. The investor starts with initial wealth w_0 and chooses the rate c_t of consuming, and the wealth θ_t to invest in the risky asset, so that his overall wealth evolves as

dw_t = θ_t(σ_t dW_t + μ_t dt) + r_t(w_t - θ_t) dt - c_t dt   (2)
     = r_t w_t dt + θ_t{σ_t dW_t + (μ_t - r_t) dt} - c_t dt.   (3)

For convenience, we assume that σ, σ^{-1}, and μ are bounded. See Rogers and Williams (2000a, b) for background information on stochastic processes. The three terms in (2) have natural interpretations: the first expresses the evolution of the wealth invested in the stock, the second the interest accruing on the wealth (w - θ) invested in the bank account, and the third is the cash being withdrawn for consumption.

To avoid so-called doubling strategies, we insist that the wealth process so generated by the controls (c, θ) must remain bounded below in some suitable way, which here is just the condition w_t ≥ 0 for all t ≥ 0; any (c, θ) satisfying this condition will be called admissible. The set of admissible (c, θ) will be denoted A(w_0), a notation which makes explicit the dependence on the investor's initial wealth.

The investor's objective is taken to be to obtain

V(w_0) ≡ sup_{(c,θ)∈A(w_0)} E ∫_0^∞ e^{-ρt} U(c_t) dt   (4)

for some constant ρ > 0. The problem cannot be solved explicitly at this level of generality, but if we take some special cases, we are able to illustrate the main methods used to attack it. Many other objectives with various different constraints can be handled by similar techniques: see Rogers (2013) for a wide range of examples.

The Main Techniques

We present here three important techniques for solving such problems: the value function approach, the use of dual variables, and the use of martingale representation. The first two methods only work if the problem is Markovian; the third only works if the market is complete. There is a further method, the Pontryagin-Lagrange approach; see Sect. 1.5 in Rogers (2013). While this is a quite general approach, we can only expect explicit solutions when further structure is available.

The Value Function Approach

To illustrate this, we focus on the original Merton problem (Merton 1971), where σ and μ are both constant, and the utility U is constant relative risk aversion (CRRA):

U′(x) = x^{-R}   (x > 0)   (5)

for some R > 0 different from 1. The case R = 1 corresponds to logarithmic utility and can be solved by similar methods. Perhaps the best starting point is the Davis-Varaiya Martingale Principle of Optimal Control (MPOC): the process

Y_t = e^{-ρt} V(w_t) + ∫_0^t e^{-ρs} U(c_s) ds

is a supermartingale under any control, and a martingale under optimal control. If we use Itô's formula, we find that

e^{ρt} dY_t = -ρV(w_t) dt + V′(w_t) dw_t + ½σ²θ_t² V″(w_t) dt + U(c_t) dt
           ≐ [-ρV + {θ_t(μ - r) - c_t + rw_t}V′ + ½σ²θ_t² V″ + U(c_t)] dt,   (6)

where the symbol ≐ denotes that the two sides differ by a (local) martingale. If the MPOC is to hold, then we expect that the drift in dY should be non-positive under any control, and equal to zero under optimal control. We simply assume for now that local martingales are martingales; this is of course not true in general, and is a point that needs to be handled carefully in a rigorous proof. Directly from (6), we then deduce the
Hamilton-Jacobi-Bellman (HJB) equation for this problem:

0 = sup_{c,θ} [-ρV + {θ(μ - r) - c + rw}V′ + ½σ²θ²V″ + U(c)].   (7)

Write Ũ(y) ≡ sup_x {U(x) - xy} for the convex dual of U, which in this case has the explicit form

Ũ(y) = -y^{1-R′}/(1 - R′)   (8)

with R′ ≡ 1/R. We are then able to perform the optimizations in (7) quite explicitly to obtain

0 = -ρV + rwV′ + Ũ(V′) - ½κ²(V′)²/V″,   (9)

where

κ ≡ (μ - r)/σ.   (10)

Nonlinear PDEs arising from stochastic optimal control problems are not in general easy to solve, but (9) is tractable in this special setting, because the assumed CRRA form of U allows us to deduce by a scaling argument that V(w) ∝ w^{1-R} ∝ U(w), and we find that

V(w) = γ_M^{-R} U(w),   (11)

where

Rγ_M = ρ + (R - 1)(r + ½κ²/R).   (12)

The optimal investment and consumption behavior is easily deduced from the optimal choices which took us from (7) to (9). After some calculations, we discover that

θ_t = ((μ - r)/(σ²R)) w_t,   c_t = γ_M w_t   (13)

specifies the optimal investment/consumption behavior in this example. (The positivity of γ_M is necessary and sufficient for the problem to be well posed; see Sect. 1.6 in Rogers (2013).) Unsurprisingly, the optimal solution scales linearly with wealth.

Dual Variables

We illustrate the use of dual variables in the constant-coefficient case of the previous section, except that we no longer suppose the special form (5) for U. The analysis runs as before all the way to (9), but now the convex dual Ũ is not simply given by (8). Although it is not now possible to guess and verify, there is a simple transformation which reduces the nonlinear ODE (9) to something we can easily handle. We introduce the new variable z > 0 related to w by z = V′(w), and define a function J by

J(z) = V(w) - wz.   (14)

Simple calculus gives us J′ = -w, J″ = -1/V″, so that the HJB equation (9) transforms into

0 = Ũ(z) - ρJ(z) + (ρ - r)zJ′(z) + ½κ²z²J″(z),   (15)

which is now a second-order linear ODE that can be solved by traditional methods; see Sect. 1.3 of Rogers (2013) for more details.

Use of Martingale Representation

This time, we shall suppose that the coefficients μ_t, r_t, and σ_t in the wealth evolution (3) are general previsible processes; to keep things simpler, we shall suppose that μ, r, σ, and σ^{-1} are all bounded previsible processes. The Markovian nature of the problem which allowed us to find the HJB equation in the first two cases is now destroyed, and a completely different method is needed. The way in is to define a positive semimartingale by

dζ_t = -ζ_t(r_t dt + κ_t dW_t),   ζ_0 = 1,   (16)

where κ_t = (μ_t - r_t)/σ_t is a previsible process, bounded by hypothesis. This process, called the state-price density process, or the pricing kernel, has the property that if w evolves as (3), then

M_t ≡ ζ_t w_t + ∫_0^t ζ_s c_s ds

is a positive local martingale.

Since positive local martingales are supermartingales, we deduce from this that
M_0 = w_0 ≥ E ∫_0^∞ ζ_s c_s ds.   (17)

Thus, for any (c, θ) ∈ A(w_0), the budget constraint (17) must hold. So the solution method here is to maximize the objective (4) subject to the constraint (17). Absorbing the constraint with a Lagrange multiplier λ, we find the unconstrained optimization problem

sup E ∫_0^∞ {e^{-ρs} U(c_s) - λζ_s c_s} ds + λw_0,   (18)

whose optimal solution is given by

e^{-ρs} U′(c_s) = λζ_s,   (19)

and this determines the optimal c, up to knowledge of the Lagrange multiplier λ, whose value is fixed by matching the budget constraint (17) with equality.

Of course, the missing logical piece of this argument is that if we are given some c ≥ 0 satisfying the budget constraint, is there necessarily some θ such that the pair (c, θ) is admissible for initial wealth w_0? In this setting, this can be shown to follow from the Brownian integral representation theorem, since we are in a complete market; however, in a multidimensional setting, this can fail, and then the problem is effectively insoluble.

Summary and Future Directions

This brief survey states some of the main ideas of consumption-investment optimization and sketches some of the methods in common use. Explicit solutions are rare, and much of the interest of the subject focuses on efficient numerical schemes, particularly when the dimension of the problem is large. A further area of interest is in continuous-time principal-agent problems; Cvitanic and Zhang (2012) is a recent account of some of the methods of this subject, but it has to be said that the theory of such problems is much less complete than the simple single-agent optimization problems discussed here.

Cross-References

► Risk-Sensitive Stochastic Control
► Stochastic Dynamic Programming
► Stochastic Linear-Quadratic Control
► Stochastic Maximum Principle

Bibliography

Cvitanic J, Zhang J (2012) Contract theory in continuous-time models. Springer, Berlin/Heidelberg
Merton RC (1971) Optimum consumption and portfolio rules in a continuous-time model. J Econ Theory 3(4):373-413
Rogers LCG (2013) Optimal investment. Springer, Berlin/New York
Rogers LCG, Williams D (2000a) Diffusions, Markov processes and martingales, vol 1. Cambridge University Press, Cambridge/New York
Rogers LCG, Williams D (2000b) Diffusions, Markov processes and martingales, vol 2. Cambridge University Press, Cambridge

ISS

► Input-to-State Stability

Iterative Learning Control

David H. Owens
University of Sheffield, Sheffield, UK

Synonyms

ILC

Abstract

Iterative learning control addresses tracking control where the repetition of a task allows improved tracking accuracy from task to task. The area inherits the analysis and design issues of classical control but adds convergence conditions
for task to task learning, the need for acceptable task-to-task performance, and the implications of modeling errors for task-to-task robustness.

Keywords

Adaptation; Optimization; Repetition; Robustness

Introduction

Iterative learning control (ILC) is relevant to trajectory tracking control problems on a finite interval [0, T] (Ahn et al. 2007b; Bien and Xu 1998; Chen and Wen 1999). It has close links to multipass process theory (Edwards and Owens 1982) and repetitive control (Rogers et al. 2007) plus conceptual links to adaptive control. It focuses on problems where the repetition of a specified task creates the possibility of improving tracking accuracy from task to task and, in principle, reducing the tracking error to exactly zero. The iterative nature of the control schemes proposed, the use of past executions of the control to update/improve control action, and the asymptotic learning of the required control signals put the topic in the area of adaptive control, although other areas of study are reflected in its methodologies.

Application areas include robotic assembly (Arimoto et al. 1984), electromechanical test systems (Daley et al. 2007), and medical rehabilitation robotics (Rogers et al. 2010). For example, consider a manufacturing robot required to undertake an indefinite number of identical tasks (such as "pick and place" of components) specified by a spatial trajectory on a defined time interval. The problem is two-dimensional. More precisely, the controlled system evolves with two variables, namely, time t ∈ [0, T] (elapsed in each iteration) and iteration index k ≥ 0. Data consists of signals f_k(t) denoting the value of the signal f at time t on iteration k. The conceptual algorithm used is:

Step one: (Preconditioning) Implement loop controllers to condition plant dynamics.
Step two: (Initialization) Given a demand signal r(t), t ∈ [0, T], choose an initial input u_0(t), t ∈ [0, T], and set k = 0.
Step three: (Response measurement) Return the plant to a defined initial state. Find the output response y_k to the input u_k. Construct the tracking error e_k = r - y_k. Store data.
Step four: (Input signal update) Use past records of inputs used and tracking errors generated to construct a new input u_{k+1}(t), t ∈ [0, T], to be used to improve tracking accuracy on the next trial.
Step five: (Termination/task repetition) Either terminate the sequence or increase k by unity and return to step three.

It is the updating of the input signal based on observation that provides the conceptual link to adaptive control. ILC causality defines "past data" at time t on iteration k as data on the interval [0, t] on that iteration plus all data on [0, T] on all previous iterations. Feedback plus feedforward control normally contains feedforward transfer of information from past iterations to the current iteration.

Modeling Issues

Design approaches have been model-based. Most nonlinear problems assume nonlinear state space models relating the ℓ × 1 input vector u(t) to the m × 1 output vector y(t) via an n × 1 state vector x(t) as follows:

ẋ(t) = f(x(t), u(t)),   y(t) = h(x(t), u(t)),

where t ∈ [0, T], x(0) = x_0, and f and h are vector-valued functions. The discrete time (sampled-data) version replaces derivatives by a forward shift, where t is now a sample counter, 0 ≤ t ≤ N (the index of the last sample). The continuous time linear model is

ẋ(t) = Ax(t) + Bu(t),   y(t) = Cx(t) + Du(t)

with an analogous model for discrete systems. In both cases, the matrices A, B, C, D are constant or time varying of appropriate dimension.
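The discrete time, time-invariant model above admits the lifted ("supervector") description y = Gu + d discussed in the next section. A minimal sketch for a scalar (SISO) system with hypothetical parameters, cross-checked against a direct time-domain simulation:

```python
def lifted_model(a, b, c, d, x0, N):
    """Lifted form y = G u + d0 of the scalar system
    x(t+1) = a x(t) + b u(t),  y(t) = c x(t) + d u(t),  x(0) = x0.
    G is (N+1) x (N+1), lower triangular, built from the Markov
    parameters d, cb, cab, ca^2 b, ...; d0 is the free response."""
    G = [[0.0] * (N + 1) for _ in range(N + 1)]
    for t in range(N + 1):
        G[t][t] = d                            # direct feedthrough D
        for j in range(t):                     # Markov parameter C A^{t-1-j} B
            G[t][j] = c * a ** (t - 1 - j) * b
    d0 = [c * a ** t * x0 for t in range(N + 1)]
    return G, d0

def simulate(a, b, c, d, x0, u):
    """Direct time-domain recursion, used to cross-check the lifted form."""
    x, y = x0, []
    for ut in u:
        y.append(c * x + d * ut)
        x = a * x + b * ut
    return y

# Hypothetical first-order plant and a short input sequence.
a, b, c, d, x0, N = 0.8, 1.0, 1.0, 0.0, 0.5, 4
G, d0 = lifted_model(a, b, c, d, x0, N)
u = [1.0, 0.0, 2.0, 0.0, 1.0]
y_lifted = [sum(G[t][j] * u[j] for j in range(N + 1)) + d0[t]
            for t in range(N + 1)]
y_direct = simulate(a, b, c, d, x0, u)
print(all(abs(p - q) < 1e-9 for p, q in zip(y_lifted, y_direct)))  # True
```

The multivariable case is identical in structure, with the scalar entries replaced by the matrix Markov parameters D, CB, CAB, and so on.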
Nonlinear systems present the greatest technical challenge. Linear systems' challenges are greater for the time-varying, continuous time case. The simplest linear case of discrete time, time-invariant systems can be described by a matrix relationship

y = Gu + d   (1)

where y denotes the m(N+1) × 1 "supervector" generated by the time series y(0), y(1), …, y(N) via the construction y = [y^T(0), y^T(1), …, y^T(N)]^T, the supervector u is generated, similarly, by the time series u(0), u(1), …, u(N), and d is generated by the time series Cx_0, CAx_0, …, CA^N x_0. The matrix G has the lower block triangular structure

G = [ D            0            0    …   0
      CB           D            0    …   0
      CAB          CB           D    …   0
      ⋮                              ⋱
      CA^{N-1}B    CA^{N-2}B         …   D ]

defined in terms of the Markov parameter matrices D, CB, CAB, … of the plant. This structure has led to a focus on the discrete time, time-invariant case and exploitation of matrix algebra techniques.

More generally, G: U → Y can be a bounded linear operator between suitable signal spaces U and Y. Taking G as a convolution operator, the representation (1) also applies to time-varying continuous time and discrete time systems. The representation also applies to differential-delay systems, coupled algebraic and differential systems, multi-rate systems, and other situations of interest.

Formal Design Objectives

Problem Statement: Given a reference signal r ∈ Y and an initial input signal u_0 ∈ U, construct a causal control update rule/algorithm

u_{k+1} = Φ_k(e_{k+1}, e_k, …, e_0, u_k, u_{k-1}, …, u_0)

that ensures that lim_{k→∞} e_k = 0 (convergence) in the norm topology of Y.

The update rule Φ_k(·) represents the simple idea of expressing u_{k+1} in terms of past data. A general linear "high-order" rule is

u_{k+1} = Σ_{j=0}^{k} W_j u_{k-j} + Σ_{j=0}^{k+1} K_j e_{k+1-j}   (2)

with bounded linear operators W_j: U → U and K_j: Y → U, regarded as compensation elements and/or filters to condition the signals. Typically K_j = 0 (resp. W_j = 0) for j > M_e (resp. j > M_u). A simple structure is

u_{k+1} = W_0 u_k + K_0 e_{k+1} + K_1 e_k.   (3)

Assuming that G and W_0 commute (i.e., GW_0 = W_0 G), the resultant error evolution takes the form

e_{k+1} = (I + GK_0)^{-1}(W_0 - GK_1)e_k + (I + GK_0)^{-1}(I - W_0)(r - d).

ROBUST ILC: An ILC algorithm is said to be robust if convergence is retained in the presence of a defined class of modeling errors.

Results from multipass systems theory (Edwards and Owens 1982) indicate robust convergence of the sequence {e_k}_{k≥0} to a limit e_∞ ∈ Y (in the presence of small modeling errors) if the spectral radius condition

r[(I + GK_0)^{-1}(W_0 - GK_1)] < 1   (4)

is satisfied, where r[·] denotes the spectral radius of its argument. However, the desired condition e_∞ = 0 is true only if W_0 = I. For a given r, it may be possible to retain the benefits of choosing W_0 ≠ I and still ensure that e_∞ is sufficiently small for the application in mind, e.g., by limiting limit errors to a high-frequency band. This and other spectral radius conditions form the underlying convergence condition when choosing controller elements but are rarely computed. The simplest algorithm using eigenvalue
computation for a linear discrete time system defines the relative degree to be k* = 0 if D ≠ 0 and the smallest integer k such that CA^{k-1}B ≠ 0 otherwise. Replacing Y by the range of G; choosing W_0 = I, K_0 = 0, and K_1 = I; and supposing that k* ≥ 1, the Arimoto input update rule u_{k+1}(t) = u_k(t) + e_k(t + k*), 0 ≤ t ≤ N + 1 - k*, provides robust convergence if, and only if, r[I - CA^{k*-1}B] < 1. It does not imply that the error signal necessarily improves each iteration. Errors can reach very high values before finally converging to zero. However, if (4) is replaced by the operator norm condition

‖(I + GK_0)^{-1}(W_0 - GK_1)‖ < 1,   (5)

then {‖e_k - e_∞‖_Q}_{k≥0} monotonically decreases to zero.

The spectral radius condition throws light on the nature of ILC robustness. Choosing, for simplicity, K_0 = 0 and W_0 = I, the requirement that r[I - GK_1] < 1 will be satisfied by a wide range of processes G, namely, those for which the eigenvalues of I - GK_1 lie in the open unit circle of the complex plane. Translating this requirement into useful robustness tests may not be easy in general. The discussion does however show that the behavior of GK_1 must be "sign-definite" to some extent as, if r[I - GK_1] < 1, then r[I - (-G)K_1] > 1, i.e., replacing the plant by -G (no matter how small) will inevitably produce non-convergent behavior. A more detailed characterization of this property is possible for inverse model ILC.

Inverse Model-Based Iteration

If a linear system G has a well-defined inverse model G^{-1}, then the required input signal is u_∞ = G^{-1}(r - d). The simple update rule

u_{k+1} = u_k + βG^{-1}e_k,   (6)

where β is a learning gain, produces the dynamics

e_{k+1} = (1 - β)e_k, i.e., e_k = (1 - β)^k e_0,

proving that zero error is attainable, with added flexibility in convergence rate control by choosing β ∈ (0, 2). Errors in the system model used in (6) are an issue. Insight into this problem has been obtained for single-input, single-output discrete time systems with multiplicative plant uncertainty U, as retention of monotonic convergence is ensured (Owens and Chu 2012) by the frequency domain condition

|1/β - U(e^{iθ})| < 1/β   for all θ ∈ [0, 2π]   (7)

that illustrates a number of general empirical rules for ILC robust design. The first is that a small learning gain (and hence small input update changes and slow convergence) will tend to increase robustness and, hence, that it is necessary that multiplicative uncertainties satisfy some form of strict positive real condition which, for (6), is

Re[U(e^{iθ})] > 0   for all θ ∈ [0, 2π],   (8)

a condition that limits high-frequency roll-off error and constrains phase errors to the range (-π/2, π/2). The second observation is that if G is non-minimum phase, the inverse G^{-1} is unstable, a situation that cannot be tolerated in practice.

Optimization-Based Iteration

Design criteria can be strengthened by a monotonicity requirement. Measuring error magnitude by a norm ‖e‖_Q on Y, such as the weighted mean square error (with Q symmetric, positive definite)

‖e‖_Q = ( ∫_0^T e^T(t) Q e(t) dt )^{1/2},

the condition ‖e_{k+1}‖_Q < ‖e_k‖_Q for all k ≥ 0 provides a performance improvement from iteration to iteration. This idea leads to a number of design approaches; see Owens and Daley (2008) and Ahn et al. (2007b) (which also examines aspects of robustness).
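As a toy illustration of optimization-based iteration, the sketch below uses a steepest-descent ILC update u_{k+1} = u_k + βGᵀe_k (a gradient step on ‖e‖², a simpler relative of the norm optimal law discussed next, not the full NOILC algorithm) on a small hypothetical lifted plant G, and checks that the error norm decreases monotonically from trial to trial:

```python
def matvec(M, v):
    return [sum(Mij * vj for Mij, vj in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def norm(v):
    return sum(x * x for x in v) ** 0.5

# Hypothetical 4-sample lifted plant matrix (lower triangular, cf. Eq. (1)).
G = [[0.5, 0.0, 0.0, 0.0],
     [0.3, 0.5, 0.0, 0.0],
     [0.1, 0.3, 0.5, 0.0],
     [0.0, 0.1, 0.3, 0.5]]
r = [1.0, 1.0, 1.0, 1.0]    # reference to be tracked over the trial
u = [0.0, 0.0, 0.0, 0.0]    # initial input u_0
beta = 0.5                  # learning gain

errs = []
for k in range(200):
    e = [ri - yi for ri, yi in zip(r, matvec(G, u))]   # e_k = r - G u_k
    errs.append(norm(e))
    grad = matvec(transpose(G), e)                     # G^T e_k
    u = [ui + beta * gi for ui, gi in zip(u, grad)]    # u_{k+1} = u_k + beta G^T e_k

print(all(b < a for a, b in zip(errs, errs[1:])))  # monotone decay of ||e_k||
print(errs[-1] < 1e-2)                             # error driven towards zero
```

Here monotonicity holds because the error evolves as e_{k+1} = (I - βGGᵀ)e_k and β is small enough that the eigenvalues of βGGᵀ lie in (0, 2); too large a gain would destroy the monotone decay.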
Function/Time Series Optimization

Norm optimal ILC (NOILC) (Owens and Daley 2008) guarantees monotonicity and convergence to e_∞ = 0 by computing u_{k+1} to minimize the objective function

J(u) = ‖e‖²_Q + ‖u - u_k‖²_R

subject to plant dynamics. For linear models (1),

u_{k+1} = u_k + G* e_{k+1},

where G*: Y → U is the adjoint operator of G. For continuous or discrete time linear state space models, the problem is a classical optimal tracking problem with a solution with online state feedback and a feedforward term generated off-line by simulation of an "adjoint" model. Reducing R in J leads to faster convergence rates, but the presence of non-minimum-phase zeros has a negative effect on convergence (Owens and Chu 2010). Monotonicity and convergence to zero is retained, but, after an initial fall, the error norm then reduces infinitesimally each iteration, producing the practical effect of limited error reductions over finite iteration horizons. Rules exist (Owens and Chu 2010) to minimize the effect by choice of u_0 and r.

Related Linear NOILC Problems

If Y and U are real Hilbert spaces, geometrical arguments can be used to generate algorithms extending the NOILC algorithm to include (Owens and Daley 2008) acceleration mechanisms, predictive control, and the inclusion of input signal constraints. They also allow more flexibility in the form and specification of the task. In the intermediate point NOILC problem (denoted IPNOILC), the task requirement is that the output signal y(t), 0 ≤ t ≤ T, takes specified values r(t_1), r(t_2), …, r(t_M) as it passes through the M intermediate points 0 < t_1 < t_2 < … < t_M. The precise nature of the trajectory between points is of secondary importance. Again, the solution, for linear state space systems, can be constructed from Riccati equation-based feedback rules combined with "jump" conditions and feedforward control signals computed off-line.

The IPNOILC solution is nonunique, and the remaining degrees of freedom can be used to satisfy other design objectives. Switching algorithms (Owens et al. 2013) converge to a solution of the problem while simultaneously minimizing an auxiliary criterion

J_aux(u) = ‖z - z⁰‖²_Q̃ + ‖u - u⁰‖²_R.

Auxiliary optimization is a tool for shaping the solution of the IPNOILC problem. The auxiliary variable z could be internal states whose behavior is important to plant operation or simply defined by the output, e.g., z = ÿ which, if small, might reduce input "forces" and hence actuator activity.

Parameter Optimization

NOILC can be simplified by reducing the degrees of freedom defining control action to a small number of control law parameters. For a discrete system (1), a general update rule is

u_{k+1} = u_k + Δ(β_{k+1})e_k,   k ≥ 0.

Here the matrix Δ(β) is linear in the p × 1 parameter vector β with Δ(0) = 0. Under these conditions, Δ(β)e = F(e)β, where the matrix F(e) is linear in e with F(0) = 0. Examples of useful parameterizations include inverse model control (Owens et al. 2012).

Monotonicity of the error norm is ensured by choosing the parameter vector β_{k+1} to minimize

J(β) = ‖e‖²_Q + β^T W_{k+1} β

subject to the dynamic constraint (1). Each p × p weighting matrix W_{k+1} is symmetric, positive definite, and may be iteration dependent. The algorithm creates a nonlinear ILC law providing a link between parameter evolution, past errors, and the choice of weight W_{k+1}.

Summary and Future Directions

The basic structure of ILC is now well understood, with a number of algorithms available with known convergence properties and empirical
links between parameter choice and convergence rates (Ahn et al. 2007a; Bristow et al. 2006; Owens and Daley 2008; Wang et al. 2009; Xu 2011). Optimization-based algorithms provide a structured approach to convergence and have a familiar quadratic optimal control structure. Despite the practical benefits of monotonic error norms, this approach underlines the difficulties induced by non-minimum-phase (NMP) properties of the plant. Operator representations extend this theory to include more general problems such as the intermediate point tracking problem and, where solutions are non-unique, can be converted into iterative algorithms that inherit the properties of NOILC but converge to a solution that also minimizes an auxiliary optimization criterion.

Many of the challenges addressed by NOILC are inherited by other algorithms, many of which mimic established control design paradigms. For example, the commonly used PD update law

u_{k+1}(t) = u_k(t) + K₁ e_k(t) + K₂ ė_k(t)

can produce convergence by suitable choice of K₁ and K₂. Proofs of convergence are typically based on spectral radius conditions similar to (4) for linear systems or on techniques such as contraction mapping (fixed point) theorems (Xu 2011) for nonlinear systems. The nonlinear case generally suggests local convergence conditions dependent on growth conditions on the nonlinearity. They typically cannot be checked in practice but do link convergence to simple, empirical, gain selection rules.
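Such a gain selection rule can be illustrated in simulation. The sketch below (assuming NumPy; the first-order plant, the reference signal, and the gain values are hypothetical choices for illustration, not taken from the text) applies a sampled version of the PD update law trial after trial and records the tracking error norm:

```python
import numpy as np

# Hypothetical first-order plant y(t+1) = 0.7*y(t) + u(t), y(0) = 0.
a, N, trials = 0.7, 20, 25
K1, K2 = 0.5, 0.5                                # PD learning gains (example values)
r = np.sin(2 * np.pi * np.arange(1, N + 1) / N)  # reference on t = 1..N

u = np.zeros(N)                                  # input u(0..N-1), zero on first trial
err_norms = []
for k in range(trials):
    # Run one trial of the plant with the current input
    y, x = np.zeros(N), 0.0
    for t in range(N):
        x = a * x + u[t]
        y[t] = x                                 # y[t] stores y(t+1)
    e = r - y                                    # tracking error e_k(t), t = 1..N
    err_norms.append(np.linalg.norm(e, np.inf))
    # Sampled PD update u_{k+1}(t) = u_k(t) + K1*e_k(t+1) + K2*(e_k(t+1) - e_k(t)),
    # with the derivative replaced by a backward difference and e_k(0) taken as 0.
    e_prev = np.concatenate(([0.0], e[:-1]))
    u = u + K1 * e + K2 * (e - e_prev)

print(err_norms[0], err_norms[-1])  # error norm shrinks over trials
```

For these particular gains (K₁ + K₂ times the plant's first Markov parameter equals one) the lifted iteration matrix I − PL is strictly lower triangular and hence nilpotent, so the error is driven to zero in at most N trials; with other gain choices, convergence is governed by spectral-radius-type conditions of the kind mentioned above.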
ILC, as a topic, is a very large area of study. Survey papers indicate that progress has been made in a number of other areas including adaptive ILC, the use of intelligent control ideas of fuzzy logic and neural networks-based control structures, 2D systems theory, and mathematical studies of fractional order control laws (Chen et al. 2013). The further development of ILC from its current strong base will draw extensively from classical control knowledge but relies on the three aspects of plant modeling, control design, and coping with uncertainty. Issues central to medium-term success include:

1. Extending current ILC knowledge to other classes of model needed for applications.
2. Integration of online data-based modeling into ILC schemes to enhance adaptive control options.
3. Ensuring the property of error monotonicity or characterizing any non-monotonicity to be expected.
4. The construction of robustness tests and using the ideas in new robust design methodologies.
5. Providing a better understanding of the effect of noise and disturbances on algorithm performance.
6. Extending the range of tasks to include, for example, different challenges for different outputs on different subintervals of [0, T].
7. Creating design tools for nonlinear plant that ensure convergence and a degree of robustness but, in particular, provide some control of internal plant states that may be subject to dynamical constraints.

Cross-References

Adaptive Control, Overview
Adaptive Control of Linear Time-Invariant Systems
Generalized Finite-Horizon Linear-Quadratic Optimal Control
Linear Systems: Continuous-Time, Time-Invariant State Variable Descriptions
Linear Systems: Discrete-Time, Time-Invariant State Variable Descriptions
Linear Quadratic Optimal Control
Optimal Control and Pontryagin's Maximum Principle

Bibliography

Ahn H-S, Chen YQ, Moore KL (2007a) Iterative learning control: brief survey and categorization. IEEE Trans Syst Man Cybern C Appl Rev 37(3):1099–1121
Ahn H-S, Moore KL, Chen YQ (2007b) Iterative learning control: robustness and monotonic convergence for interval systems. Series on communications and control engineering. Springer, New York/London

Arimoto S, Miyazaki F, Kawamura S (1984) Bettering operation of robots by learning. J Robot Syst 1(2):123–140
Bien Z, Xu J-X (1998) Iterative learning control: analysis, design, integration and applications. Kluwer Academic, Boston
Bristow DA, Tharayil M, Alleyne AG (2006) A survey of iterative learning control: a learning-based method for high-performance tracking control. IEEE Control Syst Mag 26(3):96–114
Chen Y, Wen C (1999) Iterative learning control: convergence, robustness and applications. Lecture notes in control and information sciences, vol 48. Springer, London/New York
Chen YQ, Li Y, Ahn H-S, Tian G (2013) A survey on fractional order iterative learning. J Optim Theory Appl 156:127–140
Daley S, Owens DH, Hatonen J (2007) Application of optimal iterative learning control to the dynamic testing of mechanical structures. Proc Inst Mech Eng I J Syst Control Eng 221:211–222
Edwards JB, Owens DH (1982) Analysis and control of multipass processes. Research Studies Press/Wiley, Chichester
Owens DH, Chu B (2010) Modelling of non-minimum phase effects in discrete-time norm optimal iterative learning control. Int J Control 83(10):2012–2027
Owens DH, Chu B (2012) Combined inverse and gradient iterative learning control: performance, monotonicity, robustness and non-minimum-phase zeros. Int J Robust Nonlinear Control (2):203–226. doi:10.1002/rnc.2893
Owens DH, Daley S (2008) Iterative learning control – monotonicity and optimization. Int J Appl Math Comput Sci 18(3):279–293
Owens DH, Chu B, Songjun M (2012) Parameter-optimal iterative learning control using polynomial representations of the inverse plant. Int J Control 85(5):533–544
Owens DH, Freeman CT, Chu B (2013) Multivariable norm optimal iterative learning control with auxiliary optimization. Int J Control 1(1):1–2
Rogers E, Galkowski K, Owens DH (2007) Control systems theory and applications for linear repetitive processes. Lecture notes in control and information sciences, vol 349. Springer, Berlin
Rogers E, Owens DH, Werner H, Freeman CT, Lewin PL, Schmidt C, Kirchhoff S, Lichtenberg G (2010) Norm-optimal iterative learning control with application to problems in acceleration-based free electron lasers and rehabilitation robotics. Eur J Control 5:496–521
Wang Y, Gao F, Doyle FJ III (2009) Survey on iterative learning control, repetitive control and run-to-run control. J Process Control 19:1589–1600
Xu J-X (2011) A survey on iterative learning control for nonlinear systems. Int J Control 84(7):1275–1294
K

Kalman Filters

Frederick E. Daum
Raytheon Company, Woburn, MA, USA

Abstract

The Kalman filter is a very useful algorithm for linear Gaussian estimation problems. It is extremely popular and robust in practical applications. The algorithm is easy to code and test. There are many reasons for the popularity of the Kalman filter in the real world, including stability and generality and simplicity. Moreover, the real-time computational complexity is very reasonable for high-dimensional problems. In particular, the computational complexity scales as the cube of the dimension of the state vector.

Keywords

Controllability; Discrete time measurements; Estimation algorithm; Extended Kalman filter; Filtering; Gaussian errors; Linear dynamical system; Linear system; Observability; Recursive; Smoothing; Stability

Description of Kalman Filter

The Kalman filter is an algorithm that computes the best estimate of the state vector of a linear dynamical system given discrete time measurements of a linear function of the state vector corrupted by additive white Gaussian noise. The Kalman filter also quantifies the uncertainty in its estimate of the state vector, using the covariance matrix of estimation errors. The detailed equations of the Kalman filter algorithm and the problem that it solves are given in Gelb et al. (1974), which is the most accessible but thorough book on Kalman filters. The linear dynamical system can be time varying, but its parameters must be known exactly. The measurements can be made at arbitrary (nonuniform) discrete times, but these times must be known exactly. Likewise, the covariance matrices of the measurement errors and the process noise can be arbitrary and time varying, but the numerical values of these covariance matrices must be known exactly. Also, the initial uncertainty in the state vector must be Gaussian and the mean and covariance matrix can be arbitrary, but these must be known exactly. There is a very powerful theory of Kalman filter stability due to Kalman (1963), which guarantees that the Kalman filter is stable under very mild technical assumptions which can always be satisfied in practice. In particular, the Kalman filter is stable for estimating the state vector of linear dynamical systems that are stable or unstable, for arbitrarily slow measurement rates, provided that the mild technical assumptions are fulfilled. These assumptions require that the dimension of the state vector is minimal and that the measurement error covariance matrix and the process noise covariance matrices are positive


definite, although weaker conditions are also sufficient for stability in some cases; see Kailath et al. (2000) for such details on the stability of the Kalman filter. Kalman filter stability is connected with observability and controllability of the input-output model of the relevant dynamical system in Kalman (1963). The corresponding algorithm for continuous time linear measurements (with Gaussian additive white noise) and continuous time linear dynamical systems (with Gaussian additive white process noise) is called the Kalman-Bucy filter; see Kalman (1961).
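The discrete-time recursion itself is compact. The following sketch (assuming NumPy; the scalar random-walk model and the noise levels are hypothetical values chosen only for illustration) runs the predict/update equations in matrix form and checks the posterior variance against its algebraic steady state:

```python
import numpy as np

# Hypothetical scalar random-walk model: x(k+1) = x(k) + w, z(k) = x(k) + v,
# with process noise Q and measurement noise R known exactly, as the theory requires.
F, H = np.array([[1.0]]), np.array([[1.0]])
Q, R = np.array([[0.1]]), np.array([[1.0]])

rng = np.random.default_rng(0)
x_true = 0.0
x_est, P = np.zeros((1, 1)), np.array([[10.0]])      # large initial uncertainty
for _ in range(200):
    x_true += rng.normal(scale=np.sqrt(Q[0, 0]))     # simulate the system
    z = x_true + rng.normal(scale=np.sqrt(R[0, 0]))  # noisy measurement
    # Predict
    x_est = F @ x_est
    P = F @ P @ F.T + Q
    # Update
    S = H @ P @ H.T + R                              # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)                   # Kalman gain
    x_est = x_est + K @ (np.array([[z]]) - H @ x_est)
    P = (np.eye(1) - K @ H) @ P

# The posterior variance converges to the positive root of p^2 + p*Q - Q*R = 0
p_star = (-Q[0, 0] + np.sqrt(Q[0, 0] ** 2 + 4 * Q[0, 0] * R[0, 0])) / 2
print(P[0, 0], p_star)  # the two values agree closely
```

Note that the covariance recursion does not depend on the measured data, so the filter's uncertainty quantification can be computed (and checked) off-line; it is these matrix products and the matrix inversion that make the real-time cost scale with the cube of the state dimension.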
Design Issues

In engineering practice, almost all real-world applications are nonlinear or non-Gaussian, and therefore, they do not fit the Kalman filter theory. Nevertheless, by approximating the nonlinear dynamics and measurements with linear equations, one can apply the Kalman filter theory; this is called the "extended Kalman filter" (EKF); see Gelb et al. (1974). The linearization of the nonlinear dynamics and measurements is made by computing the first-order Taylor series expansion and evaluating it at the estimated state vector; this is a very simple and fast approximation that is widely used in real-world applications, and it often gives good estimation accuracy, although there is no guarantee of that. Moreover, there is no guarantee that the EKF will be stable, even if the linearized system satisfies all the theoretical requirements for stability of the Kalman filter.

Even if the dynamics and measurements are exactly linear and if the measurement noise and process noise and initial uncertainty are all exactly Gaussian with exactly known means and covariance matrices, there can still be significant practical problems with Kalman filter accuracy, owing to ill-conditioning. In particular, the Kalman filter can be extremely sensitive to quantization errors in the computer arithmetic and storage. On the other hand, there are many different methods to try to mitigate ill-conditioning including (1) double or quadruple or octuple precision arithmetic, (2) making the covariance matrices symmetric before and after every operation, (3) Tychonov regularization, (4) tuning the process noise covariance matrix, (5) coding the Kalman filter in principal coordinates or approximately principal coordinates (i.e., aligned with the eigenvectors of the state vector error covariance matrix), (6) sequential scalar measurement updates in a preferred order, and (7) various factorizations of the covariance matrices (e.g., square root, information matrix, information square root, upper triangular and lower triangular factorization, UDL, etc.). The classic book on error covariance matrix factorizations is by Bierman (2006). Unfortunately, there is no guarantee that the Kalman filter will work well even if all of these mitigation methods are used. Moreover, there is no useful theoretical analysis of this phenomenon, with the exception of a few not very tight upper bounds on the condition number. Plotting the numerical values of the condition number of the covariance matrix vs. time is often a helpful diagnostic. In certain real-world applications, the condition number of the Kalman filter error covariance matrix can be ten billion or larger.

Why Is the Kalman Filter So Useful and Popular?

The Kalman filter has been enormously successful in real-world applications, and it is interesting to reflect on why it has been so useful and so popular. In particular, Kalman himself believes that his filter was successful because it was based on probability rather than statistics; see Kalman (1978). For example, the error covariance matrix for the Kalman filter is computed from the assumed dynamics and measurement model and the assumed values of the initial state uncertainty and process noise and measurement noise covariance matrices, rather than by computing sample covariance matrices. Likewise, the Kalman filter computes the estimated state vector from assumed Gaussian and linear probability models, rather than computing the sample mean, as would be done in statistics. There is substantial wisdom in Kalman's assertion, owing to the difficulty of estimating sample covariance matrices

and sample vectors that are sufficiently accurate, given a limited number of samples and ill-conditioning and high-dimensional state vectors. A second reason that Kalman filters are so popular is that the real-time computational complexity is very reasonable for modern digital computers, even for problems with a high-dimensional state vector. In particular, the computational complexity of the Kalman filter scales as the cube of the dimension of the state vector; for example, the modern GPS system uses a Kalman filter with a state vector of dimension of about 1,000 to jointly estimate the orbits of the satellites in the GPS constellation. In 1960, when Kalman's paper was first published, digital computers were starting to become fast enough at reasonable cost to multiply large matrices in real time, which is the most challenging computation in a Kalman filter. Today computers are roughly ten orders of magnitude faster per unit cost than in 1960, and hence, we can run Kalman filters for high-dimensional problems on very inexpensive computers that fit into your wristwatch. A third reason that Kalman filters are popular is that the algorithms are easy to understand and code and test. A fourth reason is the guaranteed stability of the Kalman filter under very mild conditions which can always be satisfied in practical applications. A fifth reason is that the Kalman filter is optimal for time-varying unstable linear dynamics with time-varying measurement noise covariance and process noise covariance. A sixth reason is that one can use Kalman filters for nonlinear problems by approximating the nonlinear dynamics and measurement equations with a first-order Taylor series; this is called the extended Kalman filter (EKF), which is without a doubt the most widely used algorithm in real-world estimation applications. A seventh reason is that the Kalman filter automatically provides a convenient quantification of uncertainty of the estimated state vector using the error covariance matrix. The final reason is that it is easy to test the accuracy of the Kalman filter by comparing the theoretical error covariance matrix to errors computed by Monte Carlo simulations of the filter; the two errors should agree approximately, and statistically significant discrepancies suggest bugs in the code or ill-conditioning of the error covariance matrices or nonlinearities or errors in modeling the dynamics or measurements.

Kalman's 1960 paper represented a big paradigm shift in two ways: (1) it exploited fast low-cost modern digital computers, whereas the literature up to that time did not, and (2) it used time domain methods rather than the ubiquitous Fourier transform methods, which limited the dynamics to steady state asymptotic in time, which in turn limited the theory to cover stable dynamics. Of course today we take both of these big points as normal engineering rather than revolutionary and surprising. The state of the art prior to Kalman's 1960 paper was the Wiener filter, which was based firmly on the Fourier transform. The Wiener filter required very lengthy and cumbersome algebraic spectral factorization with complex variables, resulting in erroneous formulas published in certain books, owing to algebraic errors which were not obvious and which could not be checked by computers, owing to the nonexistence of computer algebra software (e.g., MATHEMATICA) in 1960. Kalman explains many other problems with the Wiener filter in Kalman (2003).

Summary and Future Directions

To a large extent the Kalman filter theory is complete. There is a simple and useful theory of stability of the Kalman filter, which is completely lacking for nonlinear filters including the extended Kalman filter (EKF) and particle filters. Moreover, there are many robust versions of the Kalman filter that have been invented to mitigate ill-conditioning of the covariance matrix as well as uncertainty in system models and system parameters. However, there remain many important design issues that need to be addressed for practical applications; see Daum (2005) for details. The most obvious issues include nonlinear measurements, nonlinear plant models, robustness to uncertainty in measurement models and plant models, non-Gaussian measurement noise and plant noise, non-Gaussian initial uncertainty in the state vector, and ill-conditioning

of the error covariance matrix. That is, any deviation from the exact mathematical assumptions in the Kalman filter theory can cause problems in practice. It is the job of engineers to mitigate such problems and design filters that are robust to such perturbations. Kalman's recent papers on how the Kalman filter was invented and why it is so popular contain interesting and useful ideas; see Kalman (1978) and Kalman (2003).

Cross-References

Estimation, Survey on
Extended Kalman Filters
Nonlinear Filters

Bibliography

Bierman G (2006) Factorization methods for discrete sequential estimation. Dover, Mineola
Daum F (2005) Nonlinear filters: beyond the Kalman filter. IEEE AES Mag 20:57–69
Gelb A et al (1974) Applied optimal estimation. MIT, Cambridge
Kailath T (1974) A view of three decades of linear filtering theory. IEEE Trans Inf Theory 20:146–181
Kailath T, Sayed A, Hassibi B (2000) Linear estimation. Prentice Hall, Upper Saddle River
Kalman R (1960) A new approach to linear filtering and prediction problems. Trans ASME J 82:35–45
Kalman R (1961) New methods in Wiener filtering theory. RIAS technical report 61-1 (Feb 1961); also published in Bogdanoff J, Kozin F (eds) (1963) Proceedings of first symposium on engineering applications of random function theory and probability. Wiley, pp 270–388
Kalman R (1978) A retrospective after twenty years: from the pure to the applied. In: Applications of the Kalman filter to hydrology, edited by Chao-lin Chiu, University of Pittsburgh, LC card number 78-069752, available on-line, pp 31–54
Kalman R (2003) Discovery and invention: the Newtonian revolution in systems technology. AIAA J Guid Control Dyn 26(6):833–837
Kalman R, Bucy R (1961) New results in linear filtering and prediction theory. Trans ASME 83:95–107
Sorenson H (1970) Least squares estimation from Gauss to Kalman. IEEE Spectr 7:63–68
Stepanov OA (2011) Kalman filtering: past and present. J Gyroscopy Navig 2:99–110

KYP Lemma and Generalizations/Applications

Tetsuya Iwasaki
Department of Mechanical & Aerospace Engineering, University of California, Los Angeles, CA, USA

Abstract

Various properties of dynamical systems can be characterized in terms of inequality conditions on their frequency responses. The Kalman-Yakubovich-Popov (KYP) lemma shows equivalence of such frequency domain inequality (FDI) and a linear matrix inequality (LMI). The fundamental result has been a basis for robust and optimal control theories in the past several decades. The KYP lemma has recently been generalized to the case where an FDI on a possibly improper transfer function is required to hold in a (semi)finite frequency range. The generalized KYP lemma allows us to directly deal with practical situations where design parameters are sought to satisfy FDIs in multiple (semi)finite frequency ranges. Various design problems, including FIR filter and PID controller, reduce to LMI problems which can be solved via semidefinite programming.

Keywords

Bounded real; Frequency domain inequality; Linear matrix inequality; Multi-objective design; Optimal control; Positive real; Robust control

Introduction

In linear systems analysis and control design, dynamical properties are often characterized by frequency responses. The shape of a frequency response, as visualized by the Bode or Nyquist plot, is closely related to various performance measures including the

steady state error, fast and smooth transient, and robustness against unmodeled dynamics. Hence, desired system properties can be formalized in terms of a set of frequency domain inequalities (FDIs) on selected transfer functions. The analysis and design problems then reduce to verification and satisfaction of the FDIs.

The Kalman-Yakubovich-Popov (KYP) lemma (Anderson 1967; Kalman 1963; Rantzer 1996; Willems 1971) establishes the equivalence between an FDI and a linear matrix inequality (LMI). The LMI is defined by state space matrices of the transfer function in the FDI so that the FDI holds true if and only if the LMI admits a solution. The LMI characterization of an FDI is useful since it replaces the process of checking the FDI at infinitely many frequency points by the search for a symmetric matrix satisfying a finite dimensional convex constraint defined by the LMI. In addition to exact and tractable computations, benefits of the LMI conditions include analytical understanding of robust and optimal controls through spectral factorizations and storage/Lyapunov functions. The KYP lemma is a fundamental result in the systems and control field that has provided, in the past half century, a theoretical basis for developments of various tools for system analysis and design.

A drawback of the KYP lemma is its inability to characterize an FDI in a finite frequency range. Feedback control designs typically involve a set of specifications given in terms of multiple FDIs in various frequency ranges. However, the KYP lemma is not capable of treating such FDIs directly since it has to consider the entire frequency range. To address this deficiency, the KYP lemma has recently been generalized to characterize an FDI in a finite frequency range exactly (Iwasaki et al. 2000). Further generalizations (Iwasaki and Hara 2005) are available for FDIs within various frequency ranges for both continuous- and discrete-time, possibly improper, rational transfer functions. The generalized KYP lemma allows for direct multi-objective design of filters, controllers, and dynamical systems.

KYP Lemma

The KYP lemma may be motivated from various aspects, but let us explain it as an extension of a gain condition. Consider a stable linear system

ẋ = Ax + Bu,   G(s) := (sI − A)⁻¹B,

where x(t) ∈ Rⁿ is the state, u(t) ∈ Rᵐ is the input, and G(s) is the transfer function from u to x. If u is a disturbance to the system and x represents the error from a desired operating point, we may be interested in how large the state variables can become for a given magnitude of the disturbance. The gain ‖G(jω)‖ captures this property for the case of a sinusoidal disturbance at frequency ω, where ‖·‖ denotes the spectral norm (= absolute value for a scalar). If ‖G(jω)‖ < γ holds for all frequency ω with a small γ, then the system has a good disturbance attenuation property.

A version of the KYP lemma states that the FDI ‖G(jω)‖ < γ with γ = 1 holds for all frequency ω if and only if there exists a symmetric matrix P satisfying the LMI:

[PA + AᵀP + I, PB; BᵀP, −I] < 0.

Thus, existence of one particular P satisfying the LMI is enough to conclude that the gain is less than one for all, infinitely many, frequencies. This result is known as the bounded real lemma and has played a fundamental role in the robust and H∞ control theories.

The KYP lemma can be introduced as a generalization of the bounded real lemma. First, note that the gain bound condition ‖G(jω)‖ < 1 and the LMI condition can equivalently be written as

[G(jω); I]* Θ [G(jω); I] < 0,    (1)

[PA + AᵀP, PB; BᵀP, 0] + Θ < 0,    (2)

where

Θ := [I, 0; 0, −I].
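The implication from (2) to (1) can be sanity-checked numerically. The sketch below (assuming NumPy; the scalar system G(s) = 1/(s + 2) and the candidate P are hypothetical choices for illustration) verifies that one feasible P certifies the gain bound at every sampled frequency:

```python
import numpy as np

# Hypothetical scalar example: A = -2, B = 1, so G(s) = 1/(s + 2) with peak gain 1/2.
A, B = -2.0, 1.0
P = 1.0  # candidate solution of the LMI, found by inspection

# LMI (2) with the bounded-real Theta folded in: [[PA + A'P + I, PB], [B'P, -I]] < 0
lmi = np.array([[2 * P * A + 1.0, P * B],
                [B * P, -1.0]])
assert np.max(np.linalg.eigvalsh(lmi)) < 0  # this P is feasible

# FDI (1): [G; 1]* Theta [G; 1] = |G(jw)|^2 - 1 < 0 at every sampled frequency
Theta = np.diag([1.0, -1.0])
for w in np.linspace(-100, 100, 2001):
    G = 1.0 / (1j * w - A)         # (jw*I - A)^{-1} B in the scalar case
    v = np.array([G, 1.0])
    assert (v.conj() @ Theta @ v).real < 0

print("LMI feasible and |G(jw)| < 1 on the sampled grid")
```

One feasibility check on a single symmetric matrix thus replaces a gain test at infinitely many frequencies, which is the practical content of the lemma.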

In these equations, the particular matrix Θ is chosen to describe the gain bound condition as a special case of the quadratic form (1), and we observe that Θ appears in the LMI as in (2). It turns out that the equivalence of (1) and (2) holds not only for this particular Θ but also for an arbitrary symmetric matrix Θ. This result is called the KYP lemma, which states that, given arbitrary matrices A, B, and Θ = Θᵀ, the FDI (1) holds for all frequency ω if and only if there exists a matrix P = Pᵀ satisfying the LMI (2), provided A has no eigenvalues on the imaginary axis.

The FDI in (1) can be specialized to an FDI

[L(jω); I]* Π [L(jω); I] < 0    (3)

on the transfer function

L(s) := C(sI − A)⁻¹B + D

by choosing

Θ := [C, D; 0, I]ᵀ Π [C, D; 0, I].    (4)

The choice of matrix Π allows for characterizations of important system properties involving gain and phase of L(s). For instance, the FDI (3) with

Π := [0, −I; −I, 0]

gives L(jω) + L(jω)* > 0. This is called the positive real property, with which the phase angle remains between ±90° when L(jω) is a scalar.
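A concrete instance of (3)-(4) can be checked directly on the frequency axis. In the sketch below (assuming NumPy), the first-order transfer function L(s) = (s + 1)/(s + 2), realized with A = −2, B = 1, C = −1, D = 1, is a hypothetical example of a positive real function:

```python
import numpy as np

# Hypothetical positive real example: L(s) = (s + 1)/(s + 2) = 1 - 1/(s + 2).
A, B, C, D = -2.0, 1.0, -1.0, 1.0
Pi = np.array([[0.0, -1.0], [-1.0, 0.0]])   # positive real choice of Pi

# Theta from (4): [C D; 0 I]^T Pi [C D; 0 I]
M = np.array([[C, D], [0.0, 1.0]])
Theta = M.T @ Pi @ M

for w in np.linspace(-50, 50, 1001):
    L = C / (1j * w - A) * B + D            # L(jw)
    v = np.array([L, 1.0])
    fdi = (v.conj() @ Pi @ v).real          # equals -2*Re L(jw)
    assert fdi < 0 and L.real > 0           # FDI (3) holds, i.e., L is positive real
    # The same quadratic form expressed via Theta and the state/input pair:
    g = 1.0 / (1j * w - A) * B              # x-hat = G(jw)*u-hat with u-hat = 1
    vg = np.array([g, 1.0])
    assert abs((vg.conj() @ Theta @ vg).real - fdi) < 1e-9

print("Re L(jw) > 0 at all sampled frequencies: L is positive real")
```

The second assertion checks the mechanism behind (4): substituting [G(jω); I] into the Θ-form reproduces exactly the Π-form evaluated on [L(jω); I].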
Generalization

The standard KYP lemma deals with FDIs that are required to hold for all frequencies. To allow for more flexibility in practical system designs, the KYP lemma has been generalized to deal with FDIs in (semi)finite frequency ranges. For instance, a version of the generalized KYP lemma states that the FDI (1) holds in the low frequency range |ω| ≤ ϖ_ℓ if and only if there exist matrices P = Pᵀ and Q = Qᵀ > 0 satisfying

[A, B; I, 0]ᵀ [−Q, P; P, ϖ_ℓ²Q] [A, B; I, 0] + Θ < 0,    (5)

provided A has no imaginary eigenvalues in the frequency range. In the limiting case where ϖ_ℓ approaches infinity and the FDI is required to hold for the entire frequency range, the solution Q to (5) approaches zero, and we recover (2).

The role of the additional parameter Q is to enforce the FDI only in the low frequency range. To see this, consider the case where the system is stable and a sinusoidal input u = Re[û e^{jωt}], with (complex) phasor vector û, is applied. The state converges to the sinusoid x = Re[x̂ e^{jωt}] in the steady state where x̂ := G(jω)û. Multiplying (5) by the column vector obtained by stacking x̂ and û in a column from the right, and by its complex conjugate transpose from the left, we obtain

(ϖ_ℓ² − ω²) x̂*Qx̂ + [x̂; û]* Θ [x̂; û] < 0.

In the low frequency range |ω| ≤ ϖ_ℓ, the first term is nonnegative, enforcing the second term to be negative, which is exactly the FDI in (1). If ω is outside of the range, however, the first term is negative, and the FDI is not required to hold.

Similar results hold for various frequency ranges. The term involving Q in (5) can be expressed as the Kronecker product Ψ ⊗ Q with Ψ being a diagonal matrix with entries (−1, ϖ_ℓ²). The matrix Ψ arises from characterization of the low frequency range:

[jω; 1]* Ψ [jω; 1] = ϖ_ℓ² − ω² ≥ 0.

By different choices of Ψ, middle and high frequency ranges can also be characterized:

Low:    Ω = {ω : |ω| ≤ ϖ_ℓ},     Ψ = [−1, 0; 0, ϖ_ℓ²]
Middle: Ω = {ω : ϖ₁ ≤ ω ≤ ϖ₂},   Ψ = [−1, jϖ_c; −jϖ_c, −ϖ₁ϖ₂]
High:   Ω = {ω : |ω| ≥ ϖ_h},     Ψ = [1, 0; 0, −ϖ_h²]

where ϖ_c := (ϖ₁ + ϖ₂)/2 and Ω is the frequency range. For each pair (Ω, Ψ), the FDI (1) holds in the frequency range ω ∈ Ω if and only if there exist real symmetric matrices P and Q > 0 satisfying

Fᵀ(Φ ⊗ P + Ψ ⊗ Q)F + Θ < 0,    (6)

provided A has no eigenvalues in Ω, where

Φ := [0, 1; 1, 0],   F := [A, B; I, 0].

Further generalizations are available in Iwasaki and Hara (2005). The discrete-time case (frequency variable on the unit circle) can be similarly treated by a different choice of Φ. FDIs for descriptor systems and polynomial (rather than rational) functions can also be characterized in a form similar to (6) by modifying the matrix F. More specifically, the choices

Φ := [1, 0; 0, −1],   F := [A, B; E, O]

give the result for the discrete-time transfer function L(z) = (zE − A)⁻¹(B − zO).
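The three range characterizations can be verified directly. This sketch (assuming NumPy; the numeric cutoff frequencies are hypothetical sample values) checks that [jω; 1]* Ψ [jω; 1] ≥ 0 exactly when ω lies in the corresponding range:

```python
import numpy as np

wl, w1, w2, wh = 1.0, 1.0, 3.0, 2.0          # sample cutoff frequencies
wc = (w1 + w2) / 2
ranges = {
    "low":    (np.diag([-1.0, wl ** 2]),             lambda w: abs(w) <= wl),
    "middle": (np.array([[-1.0, 1j * wc],
                         [-1j * wc, -w1 * w2]]),     lambda w: w1 <= w <= w2),
    "high":   (np.diag([1.0, -wh ** 2]),             lambda w: abs(w) >= wh),
}

for name, (Psi, member) in ranges.items():
    for w in np.linspace(-5, 5, 400):        # grid chosen to avoid the exact cutoffs
        v = np.array([1j * w, 1.0])
        sigma = (v.conj() @ Psi @ v).real    # [jw; 1]* Psi [jw; 1]; Psi Hermitian, so real
        assert (sigma >= 0) == member(w), (name, w)

print("sign of [jw;1]* Psi [jw;1] matches each frequency range")
```

In the middle-range case the quadratic form works out to −(ω − ϖ₁)(ω − ϖ₂), which is nonnegative precisely on [ϖ₁, ϖ₂].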
Applications

The generalized KYP lemma is useful for a variety of dynamical system designs. As an example, let us consider a classical feedback control design via shaping of a scalar open-loop transfer function in the frequency domain. The objective is to design a controller K(s) for a given plant P(s) such that the closed-loop system is stable and possesses a good performance dictated by reference tracking, disturbance attenuation, noise sensitivity, and robustness against uncertainties.

KYP Lemma and Generalizations/Applications, Fig. 1 Loop shaping design specifications

Typical design specifications are given in terms of bounds on the gain and phase of the open-loop transfer function L(s) := P(s)K(s) in various frequency ranges as shown in Fig. 1. The controller K(s) should be designed so that the frequency response L(jω) avoids the shaded regions. For instance, the gain should satisfy |L(jω)| ≥ 1 for |ω| < ω₂ and |L(jω)| ≤ 1 for |ω| > ω₃ to ensure the gain crossover occurs in the range ω₂ ≤ ω ≤ ω₃, and the phase bound ∠L(jω) ≥ −π + φ in this range ensures robust stability by the phase margin.

The design specifications can be expressed as FDIs of the form (3), where a particular gain or phase condition can be specified by setting Π as

[±1, 0; 0, ∓γ²]   or   [0, j − tan φ; −j − tan φ, 0]

with γ = γ₁, γ₄, or 1, and the +/− signs for upper/lower gain bounds. These FDIs in the corresponding frequency ranges can be converted to inequalities of the form (6) with Θ given by (4). The control problem is now reduced to the search for design parameters satisfying the set of

inequality conditions (6). In general, both coefficient matrices F and Θ may depend on the design parameters, but if the poles of the controller are fixed (as in the PID control), then the design parameters will appear only in Θ. If in addition an FDI specifies a convex region for L(jω) on the complex plane, then the corresponding inequality (6) gives a convex constraint on P, Q, and the design parameters. This is the case for specifications of gain upper bound (disk: |L| < γ) and phase bound (half plane: θ ≤ ∠L ≤ θ + π). A gain lower bound |L| > γ is not convex but can often be approximated by a half plane. The design parameters satisfying the specifications can then be computed via convex programming.

Various design problems other than the open-loop shaping can also be solved in a similar manner, including finite impulse response (FIR) digital filter design with gain and phase constraints in a passband and stop-band and sensor or actuator placement for mechanical control systems (Hara et al. 2006; Iwasaki et al. 2003). Control design with the Youla parametrization also falls within the framework if a basis expansion is used for the Youla parameter and the coefficients are sought to satisfy convex constraints on closed-loop transfer functions.

Summary and Further Directions

The KYP lemma has played a fundamental role in systems and control theories, equivalently converting an FDI to an LMI. Dynamical systems properties characterized in the frequency domain are expressed in terms of state space matrices without involving the frequency variable. The resulting LMI condition has been found useful for developing robust and optimal control theories.

A recent generalization of the KYP lemma characterizes an FDI for a possibly improper rational function in a (semi)finite frequency range. The result allows for direct solutions of practical design problems to satisfy multiple specifications in various frequency ranges. A design problem is essentially solvable when transfer functions are affine in the design parameters and are required to satisfy convex FDI constraints. An important problem, which falls outside of this framework and remains open, is the design of feedback controllers to satisfy multiple FDIs on closed-loop transfer functions in various frequency ranges. There have been some attempts to address this problem, but none of them has so far succeeded in giving an exact solution.

The KYP lemma has been extended in other directions as well, including FDIs with frequency-dependent weights (Graham and de Oliveira 2010), internally positive systems (Tanaka and Langbort 2011), full rank polynomials (Ebihara et al. 2008), real multipliers (Pipeleers and Vandenberghe 2011), a more general class of FDIs (Gusev 2009), multidimensional systems (Bachelier et al. 2008), negative imaginary systems (Xiong et al. 2012), symmetric formulations for robust stability analysis (Tanaka and Langbort 2013), and multiple frequency intervals (Pipeleers et al. 2013). Extensions of the KYP lemma and related S-procedures are thoroughly reviewed in Gusev and Likhtarnikov (2006). A comprehensive tutorial of robust LMI relaxations is provided in Scherer (2006), where variations of the KYP lemma, including the generalized KYP lemma as a special case, are discussed in detail.

Cross-References

Classical Frequency-Domain Design Methods
H-Infinity Control
LMI Approach to Robust Control

Bibliography

Anderson B (1967) A system theory criterion for positive real matrices. SIAM J Control 5(2):171–182
Bachelier O, Paszke W, Mehdi D (2008) On the Kalman-Yakubovich-Popov lemma and the multidimensional models. Multidimens Syst Signal Proc 19(3–4):425–447
Ebihara Y, Maeda K, Hagiwara T (2008) Generalized S-procedure for inequality conditions on one-vector-lossless sets and linear system analysis. SIAM J Control Optim 47(3):1547–1555
KYP Lemma and Generalizations/Applications 613

Graham M, de Oliveira M (2010) Linear matrix inequality tests for frequency domain inequalities with affine multipliers. Automatica 46:897–901
Gusev S (2009) Kalman-Yakubovich-Popov lemma for matrix frequency domain inequality. Syst Control Lett 58(7):469–473
Gusev S, Likhtarnikov A (2006) Kalman-Yakubovich-Popov lemma and the S-procedure: a historical essay. Autom Remote Control 67(11):1768–1810
Hara S, Iwasaki T, Shiokata D (2006) Robust PID control using generalized KYP synthesis. IEEE Control Syst Mag 26(1):80–91
Iwasaki T, Hara S (2005) Generalized KYP lemma: unified frequency domain inequalities with design applications. IEEE Trans Autom Control 50(1):41–59
Iwasaki T, Meinsma G, Fu M (2000) Generalized S-procedure and finite frequency KYP lemma. Math Probl Eng 6:305–320
Iwasaki T, Hara S, Yamauchi H (2003) Dynamical system design from a control perspective: finite frequency positive-realness approach. IEEE Trans Autom Control 48(8):1337–1354
Kalman R (1963) Lyapunov functions for the problem of Lur'e in automatic control. Proc Natl Acad Sci 49(2):201–205
Pipeleers G, Vandenberghe L (2011) Generalized KYP lemma with real data. IEEE Trans Autom Control 56(12):2940–2944
Pipeleers G, Iwasaki T, Hara S (2013) Generalizing the KYP lemma to the union of intervals. In: Proceedings of European control conference, Zurich, pp 3913–3918
Rantzer A (1996) On the Kalman-Yakubovich-Popov lemma. Syst Control Lett 28(1):7–10
Scherer C (2006) LMI relaxations in robust control. Eur J Control 12(1):3–29
Tanaka T, Langbort C (2011) The bounded real lemma for internally positive systems and H-infinity structured static state feedback. IEEE Trans Autom Control 56(9):2218–2223
Tanaka T, Langbort C (2013) Symmetric formulation of the S-procedure, Kalman-Yakubovich-Popov lemma and their exact losslessness conditions. IEEE Trans Autom Control 58(6):1486–1496
Willems J (1971) Least squares stationary optimal control and the algebraic Riccati equation. IEEE Trans Autom Control 16:621–634
Xiong J, Petersen I, Lanzon A (2012) Finite frequency negative imaginary systems. IEEE Trans Autom Control 57(11):2917–2922
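The convex loop-shaping specifications described in this entry — a gain upper bound (disk: |L| < γ) and a phase bound (a half plane on ∠L) — can be sanity-checked on a frequency grid before invoking the LMI machinery. The sketch below is an illustration only: the loop L(s) = 1/(s + 1) and the bounds are made-up examples, and a gridded test is not a certificate — the KYP and generalized KYP results certify such FDIs exactly, without gridding.

```python
import numpy as np

def meets_specs(L_vals, gamma, phase_lo, phase_hi):
    """Gridded check of two convex loop-shaping specifications:
    a gain upper bound (disk: |L| < gamma) and a phase bound
    (half plane: phase_lo <= angle(L) <= phase_hi)."""
    ok_gain = np.all(np.abs(L_vals) < gamma)
    ok_phase = np.all((np.angle(L_vals) >= phase_lo)
                      & (np.angle(L_vals) <= phase_hi))
    return bool(ok_gain and ok_phase)

# Hypothetical loop gain L(s) = 1/(s + 1), sampled over a frequency band
w = np.logspace(-2, 2, 200)
L = 1.0 / (1j * w + 1.0)
print(meets_specs(L, gamma=1.5, phase_lo=-np.pi / 2, phase_hi=0.0))  # → True
```

Because both specifications are convex in L(jω), they translate (via the generalized KYP lemma) into LMI constraints on the design parameters when the transfer function is affine in those parameters.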
Lane Keeping

Paolo Falcone
Department of Signals and Systems, Mechatronics Group, Chalmers University of Technology, Göteborg, Sweden

Abstract

This chapter provides an overview of lane keeping systems. First, a general architecture is introduced and existing solutions for the necessary sensors and actuators are then overviewed. The threat assessment and the lane position control problems are discussed, highlighting challenges and solutions implemented in lane keeping systems available on the market.

Keywords

Active safety; Decision-making and control; Intelligent transportation systems; Threat assessment

Introduction

Lane keeping systems are vehicle guidance systems that aim at preventing lane departure maneuvers, which may lead to accidents, i.e., collision with surrounding obstacles and vehicles. By resorting to radar and/or lasers and cameras, a lane keeping system monitors the adjacent lanes. Crossing the lane markings in the absence of vehicles and/or obstacles in the adjacent lanes should not cause any reaction of the lane keeping system, which should let the driver freely perform the lane change maneuver. In the presence of vehicles or obstacles in the adjacent lanes, the system should assess the threat and, in case a risk of collision is detected, either warn the driver or automatically issue either a steering or a single-wheel braking command, in order to prevent the crossing of the lane markings. As discussed next, despite the simplicity of the threat assessment and the decision-making and control problem, challenges arise in real traffic scenarios which may lead to nuisance due to unnecessary warnings and/or assisting interventions.

In this entry, we overview the most important aspects in the design of a lane keeping system. This entry is structured as follows. Section "Lane Keeping Systems Architecture" illustrates a generic architecture. Section "Sensing and Actuation" reviews the most used sensors suitable for lane keeping applications. Section "Decision Making and Control" introduces the threat assessment and the lane position control problems, highlighting the most relevant challenges.

Lane Keeping Systems Architecture

The main components of a lane keeping system and their interconnections are shown in Fig. 1.

J. Baillieul, T. Samad (eds.), Encyclopedia of Systems and Control, DOI 10.1007/978-1-4471-5058-9, © Springer-Verlag London 2015
Lane Keeping, Fig. 1 The lane keeping architecture: lane data and obstacle data from the vision system and obstacle data from the radar are combined by a sensor fusion module, which feeds the decision-making and control module; steering and braking commands are implemented by low-level controllers
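The role of the sensor fusion block in Fig. 1 can be illustrated with a minimal sketch. The inverse-variance weighting and the numbers below are illustrative assumptions (production systems typically run a Kalman filter over vehicle, lane, and obstacle states); the point is only that two noisy estimates of the same quantity — here, the range to a lead vehicle seen by both radar and camera — are combined into a single, lower-variance estimate:

```python
def fuse(z_radar, var_radar, z_camera, var_camera):
    """Inverse-variance weighted average of two independent measurements
    of the same quantity; the fused variance is smaller than either input."""
    w_r, w_c = 1.0 / var_radar, 1.0 / var_camera
    z = (w_r * z_radar + w_c * z_camera) / (w_r + w_c)
    return z, 1.0 / (w_r + w_c)

# Radar range is accurate (std ~0.3 m); the camera range less so (std ~0.6 m)
z, var = fuse(52.3, 0.09, 51.8, 0.36)
print(round(z, 3), round(var, 3))  # → 52.2 0.072
```

The fused estimate leans toward the more accurate radar measurement, and its variance (0.072) is below that of either sensor alone — the basic payoff of the fusion stage.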

Relative positions and velocities of the host vehicle w.r.t. the surrounding environment are measured by one radar, typically installed on the front of the vehicle, and possibly by the camera, typically installed on the windshield. The position of the host vehicle within the lane and further information, e.g., road geometry, are measured by the camera. These measurements are then fused by the sensor fusion module to provide accurate measurements of the position and velocity of the vehicle w.r.t. the surrounding environment and the lane in the widest range of operating conditions and scenarios.

The task of the decision-making and control module is to assess the risk that the vehicle crosses the lane in a dangerous way and, possibly, to take an action that can range from warning the driver to issuing an assisting intervention, e.g., braking and/or steering. Such steering and braking commands are actually implemented by low-level controllers.

The different modules will be overviewed in the following sections.

Sensing and Actuation

Radar

Radars for automotive applications are placed in the front of the car, typically behind the grille. The radar emits radio waves, and the distance from the vehicle ahead is calculated by measuring the arrival time and direction of the reflected radio waves. The relative velocity is determined by relying on the Doppler effect, i.e., by measuring the frequency change of the reflected waves. Relative distance and velocity measurements are typically updated with a frequency of 10 Hz. Radars for automotive applications emit waves with a frequency of 77 GHz and detect objects within an approximate range of 150 m and a view angle of about ±10°, with a deviation of 20–30 cm from the correct value for 95 % of the measurements (Eidehall 2004). New radar systems increase the range up to about 200 m with a view angle of about ±10° (News Releases DENSO Corporation 2013a).

Typically, radar units are equipped with computer systems running signal processing algorithms that detect and track objects and, for each of them, calculate the relative position, speed, and azimuth angle, also providing additional information, e.g., the time an object has been tracked and a flag indicating that a target has been locked. Such additional information is typically used in the logics implementing the decision-making algorithms of, among others, the lane keeping system.

There are several issues arising from the use of a radar in automotive applications, e.g., wave reflections due to road bumps and barriers that may lead the signal processing algorithms to false object detections (Eidehall 2004). Moreover, interference and the vehicle dynamics (News Releases DENSO Corporation 2013a), e.g., pitching due to braking, may limit the capability of the signal processing algorithms to correctly detect and track the surrounding objects. The latter may be solved by, e.g., using electric motors that
adjust the radar antenna axes in order to compensate for the vehicle dynamics (News Releases DENSO Corporation 2013a).

Vision Systems

Vision systems in lane keeping applications are typically based on a single CCD camera mounted next to the rear-view mirror placed at the center of the windshield. The image is typically captured at 640 × 480 pixels and then processed by an image processing unit. The sampling time of the vision system is about 0.1 s, but it can change depending on, e.g., the complexity of the scene, for example, in city traffic (Eidehall 2004).

Lane markings are detected by using differences in the image contrast (Technology Daimler and Safety Innovation 2013). The camera can be either monochrome or full colored. The latter is used to enhance the detection of lane markings, which have different colors around the world (News Releases DENSO Corporation 2013b). Distances to the lane markings and road geometry parameters, like heading angle and curvature, are determined by the image processing algorithms, which must be robust to poor images due to bad weather conditions or worn lane markings. Estimation of road geometry parameters, like curvature measurement, can be a challenging problem (Lundquist and Schön 2011), especially during rain or fog (Eidehall 2004).

Depending on the image processing algorithms the cameras are equipped with, surrounding objects can also be detected and tracked. In particular, pattern recognition algorithms can be used to find objects in the images and classify them into cars, trucks, motorcycles, and pedestrians. Vehicles (or other objects) can typically be detected in a range of about 60–70 m, with lower accuracy than a radar (Eidehall 2004).

Actuators

In order to keep the vehicle within its lane, the most convenient actuator is the steering. Hence, a lane keeping system can be quite easily built in those vehicles equipped with electric power-assisted steering (EPAS) systems. In particular, an additional steering torque can be added by the EPAS to the driver's steering torque, in order to generate the desired yaw moment calculated by the decision-making and control module.

Clearly, the steering command is not the only command available to affect the vehicle yaw motion, thus changing its orientation and lateral position within the lane. Individual wheel braking may also be used (Technology Daimler and Safety Innovation 2013). In particular, in vehicles equipped with a yaw motion control system via individual braking, a braking torque request for each wheel can be sent to the yaw motion control system in order to generate the desired yaw motion.

Decision Making and Control

The decision making and control in a lane keeping problem can be conceptually divided into two tasks: the threat assessment and the lane position control. The threat assessment problem can be stated as the problem of detecting the risk of accident due to an unintended lane departure, for a given situation of the surrounding environment (i.e., surrounding vehicles and obstacles). The lane position control problem is the problem of controlling the vehicle yaw and lateral motion in order to stay within the lane. The lane position control is activated once the threat assessment detects a risk of accident.

We point out that the border between the corresponding modules executing these two tasks may be blurred for different existing commercial lane keeping systems. That is, the two problems may not be solved by two separate modules, but rather seen and solved as a single problem. Moreover, the following presentation of the threat assessment and the lane position control problems and approaches abstracts from the implementation of a particular lane keeping system available on the market, rather focusing on fundamental concepts.

Threat Assessment

The core information in a threat assessment algorithm for lane keeping applications is given by a measure called time to lane crossing (TLC). This is the predicted time when a front tire intersects a
lane boundary. As explained in van Winsum et al. (2000), the TLC can be calculated in different ways. Next, its simplest expression is reported as (Eidehall 2004)

TLC = (W/2 − W_veh/2 − y_off) / ẏ_off,    (1)

where W is the lane width, y_off is the vehicle lateral position within the lane, ẏ_off is its lateral velocity toward the boundary, and W_veh is the vehicle width. Equation (1) can be easily modified to calculate the TLC w.r.t. any lane boundary relative to the adjacent lanes.

The simplest way of using the TLC is just monitoring it and triggering an action as the TLC passes a threshold. Nevertheless, depending on the vehicle manufacturer, more sophisticated logics can be developed in order to correctly interpret the driver's intention and minimize unnecessary assisting interventions. Next, a few scenarios follow that must be taken into account while developing such logics in order not to interfere with the driver. In particular, the threat assessment module should stop or not trigger any assisting intervention while the vehicle is approaching or crossing a lane boundary if:
• The indicators are active,
• A risk of collision with the vehicle ahead is detected, such that the vehicle is crossing the lane markings as a result of an evasive maneuver,
• The radar detects a slower vehicle ahead and the driver accelerates, since this may be an overtaking (Technology Daimler and Safety Innovation 2013),
• The driver's steering wheel torque indicates that the driver is acting against the system,
• The driver manually initiates a maneuver, driving the vehicle back to its lane (i.e., the driver executes "the right" maneuver),
• The vehicle enters a motor highway or a bend (Technology Daimler and Safety Innovation 2013).

Part of the threat assessment task is predicting the trajectories of the surrounding vehicles. For instance, if a threat vehicle is traveling in the adjacent lane (in the same or opposite direction), its position has to be predicted at the TLC in order to decide whether to trigger an intervention, if a collision is predicted, or not (Eidehall 2004). This step is repeated for all the detected threat vehicles, provided that the onboard radar and the camera support multiple-target tracking.

In order to minimize the interference of the lane keeping system with the driver and/or to not let the system perform dangerous maneuvers, assisting interventions should not be triggered if the quality of the measurements is such that the information about the surrounding environment is poor. For instance, in case of low visibility that limits the detection of the lane markings and the estimation of the road geometry, the system should be temporarily deactivated or downgraded.

In summary, the threat assessment module has to be designed with the objective of detecting the risk of accident due to lane departure while not interfering with the driver with unnecessary interventions (i.e., nuisance minimization).

Lane Position Control

As observed in section "Actuators," the vehicle motion within the lane can be affected in two ways, i.e., through steering and individual wheel braking. Clearly, a steering command can be issued by both the driver and the lane keeping system.

Before issuing a steering command, in order to minimize the system nuisance, the lane keeping system may issue other types of low-intrusiveness interventions. For instance, if a "low"-level threat is detected by the threat assessment module (i.e., a threat where the risk of accident is not imminent), warnings or other stimuli may be issued in order to induce the driver to execute the right maneuver. For instance, based on, e.g., spectrum analysis of the driver's steering command, driver inattention or drowsiness may be detected and a warning issued. As observed in Technology Daimler and Safety Innovation (2013), different types of warning can be used for different vehicle types. In passenger cars, in such cases, a vibration motor in the steering wheel may warn the driver. In trucks, audible, directional warning signals can be used to let the driver know
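The TLC measure of Eq. (1) and the simple threshold trigger described above can be sketched in a few lines. The lane and vehicle dimensions, the drift rate, and the 1 s threshold below are illustrative assumptions, not values from any production system:

```python
def time_to_lane_crossing(W, W_veh, y_off, y_off_dot):
    """Simplest TLC expression, Eq. (1): the remaining lateral clearance
    (W/2 - W_veh/2 - y_off) divided by the lateral velocity toward the
    boundary. Only meaningful while the vehicle drifts outward."""
    if y_off_dot <= 0.0:
        return float("inf")  # not drifting toward this boundary
    return (W / 2.0 - W_veh / 2.0 - y_off) / y_off_dot

def trigger(tlc, threshold=1.0):
    """Monitor the TLC and trigger an action once it passes a threshold."""
    return tlc < threshold

# 3.5 m lane, 1.8 m wide car, 0.5 m off center, drifting outward at 0.3 m/s
tlc = time_to_lane_crossing(3.5, 1.8, 0.5, 0.3)
print(round(tlc, 2), trigger(tlc))  # → 1.17 False
```

A real threat assessment module would wrap this trigger in the suppression logics listed above (indicators active, evasive maneuver, overtaking, and so on) to keep nuisance interventions low.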
that the vehicle trajectory needs to be adjusted. In buses, in order to avoid bothering the passengers, driver warning is issued through vibration motors placed in the driver's seat.

Lane Keeping, Fig. 2 Vehicle modeling notation

Other types of "soft intervention" aim at increasing the steering impedance in the direction leading to a lane crossing that might cause a collision with surrounding vehicles. Generating the desired steering impedance can be easily formulated as a steering torque control problem. Nevertheless, tuning the control algorithm to obtain the desired steering feeling can be an involved and time-consuming procedure based on extensive in-vehicle testing.

Besides warnings and "soft interventions" aiming at inducing the driver to perform correct maneuvers, as part of the lane position control task in a lane keeping system, a lateral control algorithm w.r.t. the lane boundaries is needed. Consider the vehicle sketched in Fig. 2. The equations describing the vehicle motion within the lane can be compactly written in state-space form as

ẋ = Ax + Bδ + Dψ̇_des,    (2)

where x = [e_y ė_y e_ψ ė_ψ]ᵀ, ψ̇_des is the desired yaw rate, e.g., calculated based on the road curvature, and A, B, D are speed-dependent matrices that can be found in Rajamani (2006). The (unstable) system can be stabilized by the state-feedback control law

δ = Kx + δ_ff,    (3)
where K is a stabilizing static gain and δ_ff is a feedforward term that can be used to compensate for the road curvature. In Rajamani (2006), it is shown that, while e_y(t) → 0 as t → ∞, e_ψ approaches a nonzero steady-state value, no matter how δ_ff is chosen, for a non-straight road.

Despite a simple problem formulation and solution, controlling the vehicle position within the lane is not a trivial task. Indeed, having the control law (3) active all the time may increase the nuisance, leading to an unacceptable driving experience. For this reason, the steering command calculated through (3) may be active only when the vehicle significantly deviates from the road centerline, i.e., approaches the lane markings. Clearly, adding such logics complicates the analysis of the closed-loop behavior, thus making extensive in-vehicle tuning and verification necessary.

Summary and Future Directions

In this chapter, we have overviewed the general issues and requirements that must be considered in the design of a lane-keeping system.

The variety of environmental conditions the sensing system should operate in, together with the range of diverse scenarios the decision-making module should cope with, render the design and verification problems challenging, costly, and time consuming for a lane-keeping system. It is, therefore, necessary to approach the design of such systems by also providing safety guarantees to the largest extent, yet minimizing conservatism and intrusiveness of the overall system. Model-based approaches to threat assessment and decision-making problems, as proposed in Falcone et al. (2011) for a lane departure application, provide neat design and verification frameworks, which can clearly describe the safe operation of the overall system. Adopting such design methodologies can potentially contribute to a consistent reduction of the development time by reducing the a posteriori safety verification phase. On the other hand, the computational complexity of formal model-based verification methods can dramatically increase in those scenarios where system nonlinearity and nonconvex state spaces become relevant. Hence, future research efforts aiming at developing low-complexity verification methods might greatly impact the future development of automated driving systems.

Cross-References

► Adaptive Cruise Control
► Vehicle Dynamics Control

Acknowledgments The author would like to thank Dr. Erik Coelingh for the helpful discussion.

Bibliography

Eidehall A (2004) An automotive lane guidance system. Lic thesis 1122, Linköping University
Falcone P, Ali M, Sjöberg J (2011) Predictive threat assessment via reachability analysis and set invariance theory. IEEE Trans Intell Transp Syst 12(4):1352–1361
Lundquist C, Schön TB (2011) Joint ego-motion and road geometry estimation. Inf Fusion 12(4):253–263
News Releases DENSO Corporation (2013a) Denso develops higher performance millimeter-wave radar, June 2013
News Releases DENSO Corporation (2013b) Denso develops new vision sensor for active safety systems, June 2013
Rajamani R (2006) Vehicle dynamics and control, chap. 2. Springer, New York
Technology Daimler and Safety Innovation (2013) Lane keeping assist: always on the right track, June 2013
van Winsum W, Brookhuis KA, de Waard D (2000) A comparison of different ways to approximate time-to-line crossing (TLC) during car driving. Accid Anal Prev 32(1):47–56

Learning in Games

Jeff S. Shamma
School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, USA

Abstract

In a Nash equilibrium, each player selects a strategy that is optimal with respect to the strategies of other players. This definition does not mention the process by which players reach a
Nash equilibrium. The topic of learning in games seeks to address this issue in that it explores how simplistic learning/adaptation rules can lead to Nash equilibrium. This entry presents a selective sampling of learning rules and their long-run convergence properties, i.e., conditions under which player strategies converge or not to Nash equilibrium.

Keywords

Cournot best response; Fictitious play; Log-linear learning; Mixed strategies; Nash equilibrium

Introduction

In a Nash equilibrium, each player's strategy is optimal with respect to the strategies of other players. Accordingly, Nash equilibrium offers a predictive model of the outcome of a game. That is, given the basic elements of a game – (i) a set of players; (ii) for each player, a set of strategies; and (iii) for each player, a utility function that captures preferences over strategies – one can model/assert that the strategies selected by the players constitute a Nash equilibrium.

In making this assertion, there is no suggestion of how players may come to reach a Nash equilibrium. Two motivating quotations in this regard are:

The attainment of equilibrium requires a disequilibrium process (Arrow 1986).

and

The explanatory significance of the equilibrium concept depends on the underlying dynamics (Skyrms 1992).

These quotations reflect that a foundation for Nash equilibrium as a predictive model is dynamics that lead to equilibrium. Motivated by these considerations, the topic of "learning in games" shifts the attention away from equilibrium and towards underlying dynamic processes and their long-run behavior. The intent is to understand how players may reach an equilibrium as well as understand possible barriers to reaching Nash equilibrium.

In the setup of learning in games, players repetitively play a game over a sequence of stages. At each stage, players use past experiences/observations to select a strategy for the current stage. Once player strategies are selected, the game is played, information is updated, and the process is repeated. The question is then to understand the long-run behavior, e.g., whether or not player strategies converge to Nash equilibrium.

Traditionally the dynamic processes considered under learning in games have players selecting strategies based on a myopic desire to optimize for the current stage. That is, players do not consider long-run effects in updating their strategies. Accordingly, while players are engaged in repetitive play, the dynamic processes generally are not optimal in the long run (as in the setting of "repeated games"). Indeed, the survey article of Hart (2005) refers to the dynamic processes of learning in games as "adaptive heuristics." This distinction is important in that an implicit concern in learning in games is to understand how "low rationality" (i.e., suboptimal and heuristic) processes can lead to the "high rationality" (i.e., mutually optimal) notion of Nash equilibrium.

This entry presents a sampling of results from the learning in games literature through a selection of illustrative dynamic processes, a review of their long-run behaviors relevant to Nash equilibrium, and pointers to further work.

Illustration: Commuting Game

We begin with a description of learning in games in the specific setting of the commuting game, which is a special case of so-called congestion games (cf., Roughgarden 2005). The setup is as follows. Each player seeks to plan a path from an origin to a destination. The origins and destinations can differ from player to player. Players seek to minimize their own travel times. These travel times depend both on the chosen path (distance traveled) and the paths of other players (road congestion). Every day, a player uses past
information and observations to select that day's path according to some selection rule, and this process is repeated day after day.

In game-theoretic terms, player "strategies" are paths linking their origins to destinations, and player "utility functions" reflect travel times. At a Nash equilibrium, players have selected paths such that no individual player can find a shorter travel time given the chosen paths of others. The learning in games question is then whether player paths indeed converge to Nash equilibrium in the long run. Not surprisingly, the answer depends on the specific process that players use to select paths and possible additional structure of the commuting game.

Suppose that one of the players, say "Alice," is choosing among a collection of paths. For the sake of illustration, let us give Alice the following capabilities: (i) Alice can observe the paths chosen by all other players and (ii) Alice can compute off-line her travel time as a function of her path and the paths of others.

With these capabilities, Alice can compute running averages of the travel times along all available paths. Note that the assumed capabilities allow Alice to compute the travel time of a path and hence its running average, whether or not she took the path on that day. With average travel time values in hand, two possible learning rules are:
– Exploitation: Choose the path with the lowest average travel time.
– Exploitation with Exploration: With high probability, choose the path with the lowest average travel time, and with low probability, choose a path at random.
Assuming that all players implement the same learning rule, each case induces a dynamic process that governs the daily selection of paths and determines the resulting long-run behavior. We will revisit these processes in a more formal setting in the next section.

A noteworthy feature of these learning rules is that they do not explicitly depend on the utility functions of other players. For example, suppose one of the other players is willing to trade off travel time for more scenic routes. Similarly, suppose one of the other players prefers to travel on high congestion paths, e.g., a rolling billboard seeking to maximize exposure. The aforementioned learning rules for Alice remain unchanged. Of course, Alice's actions implicitly depend on the utility functions of other players, but only indirectly through their selected paths. This characteristic of no explicit dependence on the utility functions of others is known as "uncoupled" learning, and it can have major implications on the achievable long-run behavior (Hart and Mas-Colell 2003a).

In assuming the ability to observe the paths of other players and to compute off-line travel times as a function of these paths, these learning rules impose severe requirements on the information available to each player. Less restrictive are learning rules that are "payoff based" (Young 2005). A simple modification that leads to payoff-based learning is as follows. Alice maintains an empirical average of the travel times of a path using only the days that she took that path. Note the distinction – on any given day, Alice remains unaware of travel times for the routes not selected. Using these empirical average travel times, Alice can then mimic any of the aforementioned learning rules. As intended, she does not directly observe the paths of others, nor does she have a closed-form expression for travel times as a function of player paths. Rather, she only can select a path and measure the consequences. As before, all players implementing such a learning rule induce a dynamic process, but the ensuing analysis in payoff-based learning can be more subtle.

Learning Dynamics

We now give a more formal presentation of selected learning rules and results concerning their long-run behavior.

Preliminaries

We begin with the basic setup of games with a finite set of players, {1, 2, …, N}, and for each player i, a finite set of strategies, A_i. Let

A = A_1 × ⋯ × A_N
denote the set of strategy profiles. Each player, i, is endowed with a utility function

u_i : A → ℝ.

Utility functions capture player preferences over strategy profiles. Accordingly, for any a, a′ ∈ A, the condition

u_i(a) > u_i(a′)

indicates that player i prefers the strategy profile a over a′.

The notation −i indicates the set of players other than player i. Accordingly, we sometimes write a ∈ A as (a_i, a_{-i}) to isolate a_i, the strategy of player i, versus a_{-i}, the strategies of other players. The notation −i is used in other settings as well.

Utility functions induce best-response sets. For a_{-i} ∈ A_{-i}, define

B_i(a_{-i}) = { a_i : u_i(a_i, a_{-i}) ≥ u_i(a′_i, a_{-i}) for all a′_i ∈ A_i }.

In words, B_i(a_{-i}) denotes the set of strategies that are optimal for player i in response to the strategies of other players, a_{-i}.

A strategy profile a* ∈ A is a Nash equilibrium if for any player i and any a′_i ∈ A_i,

u_i(a*_i, a*_{-i}) ≥ u_i(a′_i, a*_{-i}).

In words, at a Nash equilibrium, no player can achieve greater utility by unilaterally changing strategies. Stated in terms of best-response sets, a strategy profile, a*, is a Nash equilibrium if for every player i,

a*_i ∈ B_i(a*_{-i}).

We also will need the notions of mixed strategies and mixed strategy Nash equilibrium. Let Δ(A_i) denote probability distributions (i.e., nonnegative vectors that sum to one) over the set A_i. A mixed strategy profile is a collection of probability distributions, α = (α_1, …, α_N), with α_i ∈ Δ(A_i) for each i. Let us assume that players choose a strategy randomly and independently according to these mixed strategies. Accordingly, define Pr[a; α] to be the probability of strategy a under the mixed strategy profile α, and define the expected utility of player i as

U_i(α) = Σ_{a∈A} u_i(a) · Pr[a; α].

A mixed strategy Nash equilibrium is a mixed strategy profile, α*, such that for any player i and any α′_i ∈ Δ(A_i),

U_i(α*_i, α*_{-i}) ≥ U_i(α′_i, α*_{-i}).

Special Classes of Games

We will reference three special classes of games: (i) zero-sum games, (ii) potential games, and (iii) weakly acyclic games.

Zero-sum games: There are only two players (i.e., N = 2), and u_1(a) = −u_2(a).

Potential games: There exists a (potential) function

φ : A → ℝ

such that for any pair of strategies, a = (a_i, a_{-i}) and a′ = (a′_i, a_{-i}), that differ only in the strategy of player i,

u_i(a_i, a_{-i}) − u_i(a′_i, a_{-i}) = φ(a_i, a_{-i}) − φ(a′_i, a_{-i}).

Weakly acyclic games: There exists a function

φ : A → ℝ

with the following property: if a ∈ A is not a Nash equilibrium, then at least one player, say player i, has an alternative strategy, say a′_i ∈ A_i, such that

u_i(a′_i, a_{-i}) > u_i(a_i, a_{-i})

and

φ(a′_i, a_{-i}) > φ(a_i, a_{-i}).

Potential games are a special class of games for which various learning dynamics converge to
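The definitions above can be illustrated concretely. The following sketch sets up a hypothetical two-player, two-route congestion game (the payoffs and the potential function below are illustrative assumptions, not taken from this entry), computes best-response sets, identifies the pure Nash equilibria, and verifies the potential-game identity profile by profile.

```python
import itertools

# Hypothetical two-player, two-route congestion game: each player's
# utility is minus the number of players on his or her chosen route.
A = [(0, 1), (0, 1)]                      # strategy sets A_i = {0, 1}

def congestion(route, profile):
    return sum(1 for a in profile if a == route)

def u(i, profile):
    return -congestion(profile[i], profile)

def best_response(i, profile):
    # B_i(a_{-i}): the strategies maximizing u_i(., a_{-i})
    def util(ai):
        q = list(profile)
        q[i] = ai
        return u(i, tuple(q))
    best = max(util(ai) for ai in A[i])
    return {ai for ai in A[i] if util(ai) == best}

profiles = list(itertools.product(*A))
nash = [p for p in profiles
        if all(p[i] in best_response(i, p) for i in range(2))]
print("pure Nash equilibria:", nash)  # the two "split" profiles

# A congestion-game potential: phi(a) = -sum_r (1 + 2 + ... + n_r),
# where n_r is the number of players on route r.
def phi(profile):
    return -sum(k for r in (0, 1)
                for k in range(1, congestion(r, profile) + 1))

# Verify u_i(a_i, a_-i) - u_i(a_i', a_-i) = phi(a_i, a_-i) - phi(a_i', a_-i)
for p in profiles:
    for i in range(2):
        for ai2 in A[i]:
            q = list(p); q[i] = ai2; q = tuple(q)
            assert u(i, p) - u(i, q) == phi(p) - phi(q)
```

The same brute-force enumeration applies to any finite game, although the number of profiles grows exponentially with the number of players.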
Potential games are a special class of games for which various learning dynamics converge to a Nash equilibrium. The aforementioned commuting game constitutes a potential game under certain special assumptions. These are as follows: (i) the delay on a road depends only on the number of users (and not their identities) and (ii) all players measure delay in the same manner (Monderer and Shapley 1996).

Weakly acyclic games are a generalization of potential games. In potential games, there exists a potential function that captures differences in utility under unilateral (i.e., single-player) changes in strategy. In weakly acyclic games (see Young 1998), if a strategy profile is not a Nash equilibrium, then there exists a player who can simultaneously achieve an increase in utility while increasing the potential function. The characterization of weakly acyclic games through a potential function herein is not traditional and is borrowed from Marden et al. (2009a).

Forecasted Best-Response Dynamics

One family of learning dynamics involves players formulating a forecast of the strategies of other players based on past observations and then playing a best response to this forecast.

Cournot Best-Response Dynamics
The simplest illustration is Cournot best-response dynamics. Players repetitively play the same game over stages t = 0, 1, 2, .... At stage t, a player forecasts that the strategies of other players are the strategies played at the previous stage t − 1. The following rules specify Cournot best response with inertia. For each stage t and for each player i:
• With probability p ∈ (0, 1), a_i(t) = a_i(t − 1) (inertia).
• With probability 1 − p, a_i(t) ∈ B_i(a_{−i}(t − 1)) (best response).
• If a_i(t − 1) ∈ B_i(a_{−i}(t − 1)), then a_i(t) = a_i(t − 1) (continuation).

Proposition 1 For weakly acyclic (and hence potential) games, player strategies under Cournot best-response dynamics with inertia converge to a Nash equilibrium.

Cournot best-response dynamics need not always converge in games with a Nash equilibrium, hence the restriction to weakly acyclic games.

Fictitious Play
In fictitious play, introduced in Brown (1951), players also use past observations to construct a forecast of the strategies of other players. Unlike Cournot best-response dynamics, this forecast is probabilistic.

As a simple example, consider the commuting game with two players, Alice and Bob, who both must choose between two paths, A and B. Now suppose that on stage t = 10, Alice has observed that Bob used path A for 6 out of the previous 10 days and path B for the remaining days. Then Alice's forecast of Bob is that he will choose path A with 60% probability and path B with 40% probability. Alice then chooses between paths A and B in order to optimize her expected utility. Likewise, Bob uses Alice's empirical averages to form a probabilistic forecast of her next choice and selects a path to optimize his expected utility.

More generally, let π_j(t) ∈ Δ(A_j) denote the empirical frequency vector of player j at stage t. This vector is a probability distribution that indicates the relative frequency with which player j played each strategy in A_j over stages 0, 1, ..., t − 1. In fictitious play, player i assumes (incorrectly) that at stage t, other players will select their strategies independently and randomly according to their empirical frequency vectors. Let Π_{−i}(t) denote the induced probability distribution over A_{−i} at stage t. Under fictitious play, player i selects an action according to

a_i(t) ∈ arg max_{a_i ∈ A_i} Σ_{a_{−i} ∈ A_{−i}} u_i(a_i, a_{−i}) · Pr[a_{−i}; Π_{−i}(t)].

In words, player i selects the action that maximizes expected utility assuming that other players select their strategies randomly and independently according to their empirical frequencies.
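The fictitious play rule above lends itself to a short simulation. The sketch below is a minimal illustration under assumed congestion payoffs (a two-path version of the commuting game; the utilities are hypothetical): each player best responds to the opponent's empirical frequencies, and in this game the frequencies approach the 50/50 mixed strategy profile.

```python
from collections import Counter

# Hypothetical 2x2 commuting game: each player's utility is minus the
# congestion on the chosen path (sharing a path costs 2, a free path costs 1).
def u(i, a):
    return -sum(1 for aj in a if aj == a[i])

PATHS = (0, 1)

def expected_utility(i, ai, opp_counts):
    # Expected utility of action ai against the opponent's empirical frequencies.
    total = sum(opp_counts.values())
    return sum((c / total) * u(i, (ai, aj) if i == 0 else (aj, ai))
               for aj, c in opp_counts.items())

history = [Counter({0: 1}), Counter({0: 1})]  # both players start on path 0

for t in range(1, 500):
    new_profile = []
    for i in (0, 1):
        # Best response to the forecast built from empirical frequencies.
        new_profile.append(
            max(PATHS, key=lambda ai: expected_utility(i, ai, history[1 - i])))
    for i in (0, 1):
        history[i][new_profile[i]] += 1

freqs = [{p: history[i][p] / sum(history[i].values()) for p in PATHS}
         for i in (0, 1)]
print(freqs)  # empirical frequencies approach the mixed equilibrium (1/2, 1/2)
```

The play itself cycles between the two congested profiles; it is the empirical frequencies, not the actions, that settle down, which is exactly the mode of convergence discussed in this entry.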
Proposition 2 For (i) zero-sum games, (ii) potential games, and (iii) two-player games in which one player has only two actions, player empirical frequencies under fictitious play converge to a mixed strategy Nash equilibrium.

These results are reported in Fudenberg and Levine (1998), Hofbauer and Sandholm (2002), and Berger (2005). Fictitious play need not converge to Nash equilibria in all games. An early counterexample is reported in Shapley (1964), which constructs a two-player game with a unique mixed strategy Nash equilibrium. A weakly acyclic game with multiple pure (i.e., non-mixed) Nash equilibria under which fictitious play does not converge is reported in Foster and Young (1998).

A variant of fictitious play is "joint strategy" fictitious play (Marden et al. 2009b). In this framework, players construct as forecasts empirical frequencies of the joint play of other players. This formulation is in contrast to constructing and combining empirical frequencies for each player. In the commuting game, it turns out that joint strategy fictitious play is equivalent to the aforementioned "exploitation" rule of selecting the path with lowest average travel time. Marden et al. (2009b) show that action profiles under joint strategy fictitious play (with inertia) converge to a Nash equilibrium in potential games.

Log-Linear Learning
Under forecasted best-response dynamics, players choose a best response to the forecasted strategies of other players. Log-linear learning, introduced in Blume (1993), allows the possibility of "exploration," in which players can select nonoptimal strategies, but with relatively low probabilities.

Log-linear learning proceeds as follows. First, introduce a "temperature" parameter, T > 0.
– At stage t, a single player, say player i, is selected at random.
– For player i,

Pr[a_i(t) = a′_i] = (1/Z) e^{u_i(a′_i, a_{−i}(t−1))/T}.

– For all other players, j ≠ i,

a_j(t) = a_j(t − 1).

In words, under log-linear learning, only a single player performs a strategy update at each stage. The probability of selecting a strategy is exponentially proportional to the utility garnered from that strategy (with other players repeating their previous strategies). In the above description, the dummy parameter Z is a normalizing variable used to define a probability distribution. In fact, the specific probability distribution for strategy selection is a Gibbs distribution with temperature parameter, T. For very large T, strategies are chosen approximately uniformly at random. However, for small T, the selected strategy is a best response (i.e., a_i(t) ∈ B_i(a_{−i}(t − 1))) with high probability, and an alternative strategy is selected with low probability.

Because of the inherent randomness, strategy profiles under log-linear learning never converge. Nonetheless, the long-run behavior can be characterized probabilistically as follows.

Proposition 3 For potential games with potential function φ(·) under log-linear learning, for any a ∈ A,

lim_{t→∞} Pr[a(t) = a] = (1/Z) e^{φ(a)/T}.

In words, the long-run probabilities of strategy profiles conform to a Gibbs distribution constructed from the underlying potential function. This characterization has the important implication of (probabilistic) equilibrium selection. Prior convergence results stated convergence to Nash equilibria but did not specify which Nash equilibrium in the case of multiple equilibria. Under log-linear learning, there is a probabilistic preference for the Nash equilibrium that maximizes the underlying potential function.
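This long-run behavior can be checked numerically. The sketch below runs log-linear learning on a hypothetical 2×2 common-interest game (the payoff table is an assumption, chosen so that the shared payoff is itself an exact potential) and compares empirical visit frequencies with the Gibbs distribution e^{φ(a)/T}/Z.

```python
import math
import random
from collections import Counter

random.seed(1)

# Hypothetical 2x2 common-interest game: both players receive phi(a),
# so phi itself is an exact potential function.
phi = {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 2.0}

def u(i, a):
    return phi[a]

ACTIONS = (0, 1)
T = 0.5                      # temperature
STEPS = 200_000
a = (0, 0)
visits = Counter()

for _ in range(STEPS):
    i = random.randrange(2)  # a single player is selected at random
    weights = []
    for ai in ACTIONS:       # Gibbs weights e^{u_i(ai, a_-i)/T}
        prof = (ai, a[1]) if i == 0 else (a[0], ai)
        weights.append(math.exp(u(i, prof) / T))
    ai_new = random.choices(ACTIONS, weights=weights)[0]
    a = (ai_new, a[1]) if i == 0 else (a[0], ai_new)
    visits[a] += 1

# Compare empirical visit frequencies with the Gibbs distribution.
Z = sum(math.exp(v / T) for v in phi.values())
for prof in sorted(phi):
    print(prof, round(visits[prof] / STEPS, 3),
          round(math.exp(phi[prof] / T) / Z, 3))
```

At T = 0.5 the potential-maximizing profile (1, 1) dominates the long-run distribution, illustrating the equilibrium-selection property described above.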
Extensions and Variations

Payoff-based learning. The discussion herein presumed that players can observe the actions of other players and can compute utility functions off-line. Payoff-based algorithms, i.e., algorithms in which players only measure the utility garnered in each stage, impose less restrictive informational requirements. See Young (2005) for a general discussion, as well as Marden et al. (2009c), Marden and Shamma (2012), and Arslan and Shamma (2004) for various payoff-based extensions.

No-regret learning. The broad class of so-called "no-regret" learning rules has the desirable property of converging to broader solution concepts (namely, Hannan consistency sets and correlated equilibria) in general games. See Hart and Mas-Colell (2000, 2001, 2003b) for an extensive discussion.

Calibrated forecasts. Calibrated forecasts are more sophisticated than empirical frequencies in that they satisfy certain long-run consistency properties. Accordingly, forecasted best-response learning using calibrated forecasts has stronger guaranteed convergence properties, such as convergence to correlated equilibria. See Foster and Vohra (1997), Kakade and Foster (2008), and Mannor et al. (2007).

Impossibility results. This entry focused on convergence results in various special cases. There are broad impossibility results that imply the impossibility of families of learning rules to converge to Nash equilibria in all games. The focus is on uncoupled learning, i.e., the learning dynamics for player i do not depend explicitly on the utility functions of other players (which is satisfied by all of the learning dynamics presented herein). See Hart and Mas-Colell (2003a, 2006), Hart and Mansour (2007), and Shamma and Arslan (2005). Another type of impossibility result concerns lower bounds on the required rate of convergence to equilibrium (e.g., Hart and Mansour 2010).

Welfare maximization. Of special interest are learning dynamics that select welfare (i.e., sum of utilities) maximizing strategy profiles, whether or not they are Nash equilibria. Recent contributions include Pradelski and Young (2012), Marden et al. (2011), and Arieli and Babichenko (2012).

Summary and Future Directions

We have presented a selection of learning dynamics and their long-run characteristics, specifically in terms of convergence to Nash equilibria. As stated early on, the original motivation of learning in games research has been to add credence to solution concepts such as Nash equilibrium as a model of the outcome of a game. An emerging line of research stems from engineering considerations, in which the objective is to use the framework of learning in games as a design tool for distributed decision architecture settings such as autonomous vehicle teams, communication networks, or smart grid energy systems. A related emerging direction is social influence, in which the objective is to steer the collective behaviors of human decision makers towards a socially desirable situation through the disbursement of incentives. Accordingly, learning in games can offer baseline models of how individuals update their behaviors to guide and inform social influence policies.

Cross-References

▶ Evolutionary Games
▶ Stochastic Games and Learning

Recommended Reading

Monographs on learning in games:
– Fudenberg D, Levine DK (1998) The theory of learning in games. MIT, Cambridge
– Hart S, Mas-Colell A (2013) Simple adaptive strategies: from regret-matching to uncoupled dynamics. World Scientific Publishing Company
– Young HP (1998) Individual strategy and social structure. Princeton University Press, Princeton
– Young HP (2005) Strategic learning and its limits. Oxford University Press, Oxford
Overview articles on learning in games:
– Fudenberg D, Levine DK (2009) Learning and equilibrium. Annu Rev Econ 1:385–420
– Hart S (2005) Adaptive heuristics. Econometrica 73(5):1401–1430
– Young HP (2007) The possible and the impossible in multi-agent learning. Artif Intell 171:429–433
Articles relating learning in games to distributed control:
– Li N, Marden JR (2013) Designing games for distributed optimization. IEEE J Sel Top Signal Process 7:230–242
– Mannor S, Shamma JS (2007) Multi-agent learning for engineers. Artif Intell 171:417–422
– Marden JR, Arslan G, Shamma JS (2009) Cooperative control and potential games. IEEE Trans Syst Man Cybern B Cybern 39:1393–1407

Bibliography

Arieli I, Babichenko Y (2012) Average-testing and Pareto efficiency. J Econ Theory 147:2376–2398
Arrow KJ (1986) Rationality of self and others in an economic system. J Bus 59(4):S385–S399
Arslan G, Shamma JS (2004) Distributed convergence to Nash equilibria with local utility measurements. In: Proceedings of the 43rd IEEE conference on decision and control, Atlantis, Paradise Island, Bahamas
Berger U (2005) Fictitious play in 2 × n games. J Econ Theory 120(2):139–154
Blume L (1993) The statistical mechanics of strategic interaction. Games Econ Behav 5:387–424
Brown GW (1951) Iterative solutions of games by fictitious play. In: Koopmans TC (ed) Activity analysis of production and allocation. Wiley, New York, pp 374–376
Foster DP, Vohra R (1997) Calibrated learning and correlated equilibrium. Games Econ Behav 21:40–55
Foster DP, Young HP (1998) On the nonconvergence of fictitious play in coordination games. Games Econ Behav 25:79–96
Fudenberg D, Levine DK (1998) The theory of learning in games. MIT, Cambridge
Hart S (2005) Adaptive heuristics. Econometrica 73(5):1401–1430
Hart S, Mansour Y (2007) The communication complexity of uncoupled Nash equilibrium procedures. In: STOC'07 proceedings of the 39th annual ACM symposium on the theory of computing, San Diego, CA, USA, pp 345–353
Hart S, Mansour Y (2010) How long to equilibrium? The communication complexity of uncoupled equilibrium procedures. Games Econ Behav 69(1):107–126
Hart S, Mas-Colell A (2000) A simple adaptive procedure leading to correlated equilibrium. Econometrica 68(5):1127–1150
Hart S, Mas-Colell A (2001) A general class of adaptive strategies. J Econ Theory 98:26–54
Hart S, Mas-Colell A (2003a) Uncoupled dynamics do not lead to Nash equilibrium. Am Econ Rev 93(5):1830–1836
Hart S, Mas-Colell A (2003b) Regret based continuous-time dynamics. Games Econ Behav 45:375–394
Hart S, Mas-Colell A (2006) Stochastic uncoupled dynamics and Nash equilibrium. Games Econ Behav 57(2):286–303
Hofbauer J, Sandholm W (2002) On the global convergence of stochastic fictitious play. Econometrica 70:2265–2294
Kakade SM, Foster DP (2008) Deterministic calibration and Nash equilibrium. J Comput Syst Sci 74:115–130
Mannor S, Shamma JS, Arslan G (2007) Online calibrated forecasts: memory efficiency vs universality for learning in games. Mach Learn 67(1):77–115
Marden JR, Shamma JS (2012) Revisiting log-linear learning: asynchrony, completeness and a payoff-based implementation. Games Econ Behav 75(2):788–808
Marden JR, Arslan G, Shamma JS (2009a) Cooperative control and potential games. IEEE Trans Syst Man Cybern Part B Cybern 39:1393–1407
Marden JR, Arslan G, Shamma JS (2009b) Joint strategy fictitious play with inertia for potential games. IEEE Trans Autom Control 54:208–220
Marden JR, Young HP, Arslan G, Shamma JS (2009c) Payoff based dynamics for multi-player weakly acyclic games. SIAM J Control Optim 48:373–396
Marden JR, Peyton Young H, Pao LY (2011) Achieving Pareto optimality through distributed learning. Oxford economics discussion paper #557
Monderer D, Shapley LS (1996) Potential games. Games Econ Behav 14:124–143
Pradelski BR, Young HP (2012) Learning efficient Nash equilibria in distributed systems. Games Econ Behav 75:882–897
Roughgarden T (2005) Selfish routing and the price of anarchy. MIT, Cambridge
Shamma JS, Arslan G (2005) Dynamic fictitious play, dynamic gradient play, and distributed convergence to Nash equilibria. IEEE Trans Autom Control 50(3):312–327
Shapley LS (1964) Some topics in two-person games. In: Dresher M, Shapley LS, Tucker AW (eds) Advances in game theory. Princeton University Press, Princeton, pp 1–29
Skyrms B (1992) Chaos and the explanatory significance of equilibrium: strange attractors in evolutionary game dynamics. In: PSA: proceedings of the biennial meeting of the Philosophy of Science Association. Volume Two: symposia and invited papers, East Lansing, MI, USA, pp 274–394
Young HP (1998) Individual strategy and social structure. Princeton University Press, Princeton
Young HP (2005) Strategic learning and its limits. Oxford University Press, Oxford
Learning Theory

Mathukumalli Vidyasagar
University of Texas at Dallas, Richardson, TX, USA

Introduction

How does a machine learn an abstract concept from examples? How can a machine generalize to previously unseen situations? Learning theory is the study of (formalized versions of) such questions. There are many possible ways to formulate such questions. Therefore, the focus of this entry is on one particular formalism, known as PAC (probably approximately correct) learning. It turns out that PAC learning theory is rich enough to capture intuitive notions of what learning should mean in the context of applications and, at the same time, is amenable to formal mathematical analysis. There are several precise and complete studies of PAC learning theory, many of which are cited in the bibliography. Therefore, this article is devoted to sketching some high-level ideas.

Keywords

Machine learning; Probably approximately correct (PAC) learning; Support vector machine; Vapnik-Chervonenkis (V-C) dimension

Problem Formulation

In the PAC formalism, the starting point is the premise that there is an unknown set, say an unknown convex polygon, or an unknown half-plane. The unknown set cannot be completely unknown; rather, something should be specified about its nature, in order for the problem to be both meaningful and tractable. For instance, in the first example above, the learner knows that the unknown set is a convex polygon, though it is not known which polygon it might be. Similarly, in the second example, the learner knows that the unknown set is a half-plane, though it is not known which half-plane. The collection of all possible unknown sets is known as the concept class, and the particular unknown set is referred to as the "target concept." In the first example, this would be the set of all convex polygons, and in the second case it would be the set of half-planes. The unknown set cannot be directly observed of course; otherwise, there would be nothing to learn. Rather, one is given clues about the target concept by an "oracle," which informs the learner whether or not a particular element belongs to the target concept. Therefore, the information available to the learner is a collection of "labelled samples," in the form {(x_i, I_T(x_i)), i = 1, ..., m}, where m is the total number of labelled samples and I_T(·) is the indicator function of the target concept T. Based on this information, the learner is expected to generate a "hypothesis" H_m that is a good approximation to the unknown target concept T.

One of the main features of PAC learning theory that distinguishes it from its forerunners is the observation that, no matter how many training samples are available to the learner, the hypothesis H_m can never exactly equal the unknown target concept T. Rather, all that one can expect is that H_m converges to T in some appropriate metric. Since the purpose of machine learning is to generate a hypothesis H_m that can be used to approximate the unknown target concept T for prediction purposes, a natural candidate for the metric that measures the disparity between H_m and T is the so-called generalization error, defined as follows: Suppose that, after m training samples that have led to the hypothesis H_m, a testing sample x is generated at random. One can now ask: what is the probability that the hypothesis H_m misclassifies x? In other words, what is the value of Pr{I_{H_m}(x) ≠ I_T(x)}? This quantity is known as the generalization error, and the objective is to ensure that it approaches zero as m → ∞.
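The generalization error can be illustrated with a direct Monte Carlo estimate. In the sketch below, the particular target and hypothesis half-planes and the uniform distribution P on [0, 1]² are illustrative assumptions; the estimated mismatch probability approximates the area of the region where the two half-planes disagree.

```python
import random

random.seed(0)

# Hypothetical target and hypothesis half-planes in the unit square:
# a point x belongs to the half-plane {x : w . x >= b}.
def indicator(w, b, x):
    return w[0] * x[0] + w[1] * x[1] >= b

target = ((1.0, 0.0), 0.5)       # T = {x : x1 >= 0.5}
hypothesis = ((1.0, 0.0), 0.6)   # H = {x : x1 >= 0.6}

# Monte Carlo estimate of Pr{ I_H(x) != I_T(x) } under the uniform
# distribution P on [0, 1]^2.
n = 100_000
mismatch = 0
for _ in range(n):
    x = (random.random(), random.random())
    if indicator(*target, x) != indicator(*hypothesis, x):
        mismatch += 1

print(mismatch / n)  # close to the true disagreement area, 0.1
```

Here the disagreement region is the strip 0.5 ≤ x1 < 0.6, so the estimate concentrates near 0.1 as n grows.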
The manner in which the samples are generated leads to different models of learning. For instance, if the learner is able to choose the next sample x_{m+1} on the basis of the previous m labelled samples, which is then passed on to the oracle for labeling, this is known as "active learning." More common is "passive learning," in which the sequence of training samples {x_i}_{i≥1} is generated at random, in an independent and identically distributed (i.i.d.) fashion, according to some probability distribution P. In this case, even the hypothesis H_m and the generalization error are random, because they depend on the randomly generated training samples. This is the rationale behind the nomenclature "probably approximately correct." The hypothesis H_m is not expected to equal the unknown target concept T exactly, only approximately. Even that is only probably true, because in principle it is possible that the randomly generated training samples could be totally unrepresentative and thus lead to a poor hypothesis. If we toss a coin many times, there is a small but always positive probability that it could turn up heads every time. As the coin is tossed more and more times, this probability becomes smaller, but will never equal zero.

Examples

Example 1 Consider the situation where the concept class consists of all half-planes in R², as indicated in the left side of Fig. 1. Here the unknown target concept T is some fixed but unknown half-plane. The symbol T is next to the boundary of the half-plane, and all points to the right of the line constitute the target half-plane. The training samples, generated at random according to some unknown probability distribution P, are also shown in the figure. The samples that belong to T are shown as blue rectangles, while those that do not belong to T are shown as red dots. Knowing only these labelled samples, the learner is expected to guess what T might be.

A reasonable approach is to choose some half-plane that agrees with the data and correctly classifies the labelled data. For instance, the well-known support vector machine (SVM) algorithm chooses the unique half-plane such that the closest sample to the dividing line is as far as possible from it; see the paper by Cortes and Vapnik (1997).

The symbol H denotes the boundary of a hypothesis, which is another half-plane. The shaded region is the symmetric difference between the two half-planes. The set T Δ H is the set of points that are misclassified by the hypothesis H. Of course, we do not know what this set is, because we do not know T. It can be shown that, whenever the hypothesis H is chosen to be consistent in the sense of correctly classifying all labelled samples, the generalization error goes to zero as the number of samples approaches infinity.

Learning Theory, Fig. 1 Examples of learning problems. (a) Learning half-planes. (b) Learning convex polygons in X = [0, 1]²

Example 2 Now suppose the concept class consists of all convex polygons in the unit square, and let T denote the (unknown) target convex polygon. This situation is depicted in the right side of Fig. 1. This time let us assume that the probability distribution that generates the samples is the uniform distribution on X. Given a set of positively and negatively labelled samples (the same convention as in Example 1), let us choose the hypothesis H to be the convex hull of all positively labelled samples, as shown in the figure. Since every positively labelled sample
belongs to T, and T is a convex set, it follows that H is a subset of T. Moreover, P(T \ H) is the generalization error. It can be shown that this algorithm also "works" in the sense that the generalization error goes to zero as the number of samples approaches infinity.

Vapnik-Chervonenkis Dimension

Given any concept class C, there is a single integer that offers a measure of the richness of the class, known as the Vapnik-Chervonenkis (or VC) dimension, after its originators.

Definition 1 A set S ⊆ X is said to be shattered by a concept class C if, for every subset B ⊆ S, there is a set A ∈ C such that S ∩ A = B. The VC dimension of C is the largest integer d such that there is a finite set of cardinality d that is shattered by C.

Example 3 It can be shown that the set of half-planes in R² has VC dimension three. Choose a set S = {x, y, z} consisting of three points that are not collinear, as in Fig. 2. Then there are 2³ = 8 subsets of S. The point is to show that for each of these eight subsets, there is a half-plane that contains precisely that subset, nothing more and nothing less. That this is possible is shown in Fig. 2. Four out of the eight situations are depicted in this figure, and the remaining four situations can be covered by taking the complement of the half-plane shown. It is also necessary to show that no set with four or more elements can be shattered, but that step is omitted; instead the reader is referred to any standard text such as Vidyasagar (1997). More generally, it can be shown that the set of half-planes in R^k has VC dimension k + 1.

Learning Theory, Fig. 2 VC dimension illustrations. (a) Shattering a set of three elements (the four panels show half-planes realizing B = ∅, B = {x}, B = {y}, and B = {z}). (b) Infinite VC dimension

Example 4 The set of convex polygons has infinite VC dimension. To see this, let K be a strictly convex set, as shown in Fig. 2b. (Recall that a set is "strictly convex" if none of its boundary points is a convex combination of other points in the set.) Choose any finite collection of boundary points of K, call it S = {x_1, ..., x_n}. If B is a subset of S, then the convex hull of B does not contain any other point of S, due to the strict convexity property. Since this argument holds for every integer n, the class of convex polygons has infinite VC dimension.

Two Important Theorems

Out of the many important results in learning theory, two are noteworthy.

Theorem 1 (Blumer et al. (1989)) A concept class is distribution-free PAC learnable if and only if it has finite VC dimension.

Theorem 2 (Benedek and Itai (1991)) Suppose P is a fixed probability distribution. Then the concept class C is PAC learnable if and only if, for every positive number ε, it is possible to cover C by a finite number of balls of radius ε, with respect to the pseudometric d_P.
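Shattering, as used in the examples above, can be tested directly on small point sets. The sketch below is an illustrative check (the coordinates are assumptions, and the scan over candidate directions is an approximate separability test, adequate for these well-separated points): three non-collinear points are shattered by half-planes, while the four corners of the unit square are not.

```python
import math
from itertools import combinations

def separable(B, C, n_angles=3600):
    """Approximately test whether point sets B and C in R^2 can be
    separated by a line, by scanning candidate normal directions."""
    if not B or not C:
        return True          # an empty side is always realizable
    for k in range(n_angles):
        theta = 2 * math.pi * k / n_angles
        w = (math.cos(theta), math.sin(theta))
        lo = min(w[0] * x + w[1] * y for x, y in B)
        hi = max(w[0] * x + w[1] * y for x, y in C)
        if lo > hi:
            return True      # {z : w.z >= b} contains exactly B for lo > b > hi
    return False

def shattered_by_halfplanes(S):
    S = list(S)
    return all(separable([p for p in S if p in B],
                         [p for p in S if p not in B])
               for r in range(len(S) + 1)
               for B in map(set, combinations(S, r)))

three = [(0, 0), (1, 0), (0, 1)]         # non-collinear triple
four = [(0, 0), (1, 0), (0, 1), (1, 1)]  # corners of the unit square

print(shattered_by_halfplanes(three))    # True
print(shattered_by_halfplanes(four))     # False
```

For the four corners, the diagonal pair {(0, 0), (1, 1)} cannot be separated from the anti-diagonal pair, which is the subset that defeats every candidate half-plane.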
Now let us return to the two examples studied previously. Since the set of half-planes has finite VC dimension, it is distribution-free PAC learnable. The set of convex polygons can be shown to satisfy the conditions of Theorem 2 if P is the uniform distribution and is therefore PAC learnable. However, since it has infinite VC dimension, it follows from Theorem 1 that it is not distribution-free PAC learnable.

Summary and Future Directions

This brief entry presents only the most basic aspects of PAC learning theory. Many more results are known about PAC learning theory, and of course many interesting problems remain unsolved. Some of the known extensions are:
• Learning under an "intermediate" family of probability distributions P that is not necessarily equal to P*, the set of all distributions (Kulkarni and Vidyasagar 1997)
• Relaxing the requirement that the algorithm should work uniformly well for all target concepts and requiring instead only that it should work with high probability (Campi and Vidyasagar 2001)
• Relaxing the requirement that the training samples are independent of each other and permitting them to have Markovian dependence (Gamarnik 2003; Meir 2000) or β-mixing dependence (Vidyasagar 2003)
There is considerable research in finding alternate sets of necessary and sufficient conditions for learnability. Unfortunately, many of these conditions are unverifiable and amount to tautological restatements of the problem under study.

Cross-References

▶ Iterative Learning Control
▶ Learning in Games
▶ Neural Control and Approximate Dynamic Programming
▶ Stochastic Games and Learning

Bibliography

Anthony M, Bartlett PL (1999) Neural network learning: theoretical foundations. Cambridge University Press, Cambridge
Anthony M, Biggs N (1992) Computational learning theory. Cambridge University Press, Cambridge
Benedek G, Itai A (1991) Learnability by fixed distributions. Theor Comput Sci 86:377–389
Blumer A, Ehrenfeucht A, Haussler D, Warmuth M (1989) Learnability and the Vapnik-Chervonenkis dimension. J ACM 36(4):929–965
Campi M, Vidyasagar M (2001) Learning with prior information. IEEE Trans Autom Control 46(11):1682–1695
Devroye L, Györfi L, Lugosi G (1996) A probabilistic theory of pattern recognition. Springer, New York
Gamarnik D (2003) Extension of the PAC framework to finite and countable Markov chains. IEEE Trans Inf Theory 49(1):338–345
Kearns M, Vazirani U (1994) Introduction to computational learning theory. MIT, Cambridge
Kulkarni SR, Vidyasagar M (1997) Learning decision rules under a family of probability measures. IEEE Trans Inf Theory 43(1):154–166
Meir R (2000) Nonparametric time series prediction through adaptive model selection. Mach Learn 39(1):5–34
Natarajan BK (1991) Machine learning: a theoretical approach. Morgan-Kaufmann, San Mateo
van der Vaart AW, Wallner JA (1996) Weak convergence and empirical processes. Springer, New York
Vapnik VN (1995) The nature of statistical learning theory. Springer, New York
Vapnik VN (1998) Statistical learning theory. Wiley, New York
Vidyasagar M (1997) A theory of learning and generalization. Springer, London
Vidyasagar M (2003) Learning and generalization: with applications to neural networks. Springer, London

Lie Algebraic Methods in Nonlinear Control

Matthias Kawski
School of Mathematical and Statistical Sciences, Arizona State University, Tempe, AZ, USA

This work was partially supported by the National Science Foundation through the grant DMS 09-08204.

Abstract

Lie algebraic methods generalize matrix methods and algebraic rank conditions to smooth nonlinear systems. They capture the essence of noncommuting flows and give rise to noncommutative analogues of Taylor expansions. Lie algebraic

rank conditions determine controllability, observability, and optimality. Lie algebraic methods are also employed for state-space realization, control design, and path planning.

Keywords

Baker-Campbell-Hausdorff formula; Chen-Fliess series; Lie bracket

Definition

This article considers generally nonlinear control systems (affine in the control) of the form

$$\dot x = f_0(x) + u_1 f_1(x) + \dots + u_m f_m(x), \qquad y = \varphi(x) \tag{1}$$

where the state $x$ takes values in $\mathbb{R}^n$, or more generally in an $n$-dimensional manifold $M^n$, the $f_i$ are smooth vector fields, $\varphi\colon \mathbb{R}^n \to \mathbb{R}^p$ is a smooth output function, and the controls $u = (u_1,\dots,u_m)\colon [0,T] \to U$ are piecewise continuous, or, more generally, measurable functions taking values in a closed convex subset $U \subseteq \mathbb{R}^m$ that contains $0$ in its interior.

"Lie algebraic techniques" refers to analyzing the system (1) and designing controls and stabilizing feedback laws by employing relations satisfied by iterated Lie brackets of the system vector fields $f_i$.

Introduction

Systems of the form (1) contain as a special case time-invariant linear systems $\dot x = Ax + Bu$, $y = Cx$ (with constant matrices $A \in \mathbb{R}^{n\times n}$, $B \in \mathbb{R}^{n\times m}$, and $C \in \mathbb{R}^{p\times n}$) that are well studied and are a mainstay of classical control engineering. Properties such as controllability, stabilizability, observability, and optimality are determined by relationships satisfied by higher-order matrix products of $A$, $B$, and $C$.

Since the early 1970s, it has been well understood that the appropriate generalization of this matrix algebra, and, e.g., of invariant linear subspaces, to nonlinear systems is in terms of the Lie algebra generated by the vector fields $f_i$, integral submanifolds of this Lie algebra, and the algebra of iterated Lie derivatives of the output function.

The Lie bracket of two smooth vector fields $f, g\colon M \to TM$ is defined as the vector field $[f,g]\colon M \to TM$ that maps any smooth function $\varphi \in C^\infty(M)$ to the function $[f,g]\varphi = fg\varphi - gf\varphi$. In local coordinates, if

$$f(x) = \sum_{i=1}^n f^i(x)\,\frac{\partial}{\partial x^i} \quad\text{and}\quad g(x) = \sum_{i=1}^n g^i(x)\,\frac{\partial}{\partial x^i},$$

then

$$[f,g](x) = \sum_{i,j=1}^n \left( f^j(x)\,\frac{\partial g^i}{\partial x^j}(x) - g^j(x)\,\frac{\partial f^i}{\partial x^j}(x) \right) \frac{\partial}{\partial x^i}.$$

With some abuse of notation, one may abbreviate this to $[f,g] = (Dg)f - (Df)g$, where $f$ and $g$ are considered as column vector fields and $Df$ and $Dg$ denote their Jacobian matrices of partial derivatives.

Note that with this convention the Lie bracket corresponds to the negative of the commutator of matrices: If $P, Q \in \mathbb{R}^{n\times n}$ define, in matrix notation, the linear vector fields $f(x) = Px$ and $g(x) = Qx$, then $[f,g](x) = (QP - PQ)x = -[P,Q]x$.

Noncommuting Flows

Geometrically the Lie bracket of two smooth vector fields $f_1$ and $f_2$ is an infinitesimal measure of the lack of commutativity of their flows. For a smooth vector field $f$ and an initial point $x(0) = p \in M$, denote by $e^{tf}p$ the solution of the differential equation $\dot x = f(x)$ at time $t$. Then
$$[f_1,f_2]\varphi(p) = \lim_{t\to 0}\, \frac{1}{t^2}\left( \varphi\!\left(e^{-tf_2}e^{-tf_1}e^{tf_2}e^{tf_1}p\right) - \varphi(p)\right).$$

As a most simple example, consider parallel parking a unicycle, moving it sideways without slipping. Introduce coordinates $(x,y,\theta)$ for the location in the plane and the steering angle. The dynamics are governed by $\dot x = u_1\cos\theta$, $\dot y = u_1\sin\theta$, and $\dot\theta = u_2$, where the control $u_1$ is interpreted as the signed rolling speed and $u_2$ as the angular velocity of the steering angle. Written in the form (1), one has $f_1 = (\cos\theta, \sin\theta, 0)^T$ and $f_2 = (0,0,1)^T$. (In this case the drift vector field $f_0 \equiv 0$ vanishes.) If the system starts at $(0,0,0)^T$, then via the sequence of control actions of the form turn left, roll forward, turn back, and roll backwards, one may steer the system to a point $(0,y,0)^T$ with $y > 0$. This sideways motion corresponds to the value $(0,1,0)^T$ of the Lie bracket $[f_1,f_2] = (-\sin\theta, \cos\theta, 0)^T$ at the origin. It encapsulates that steering and rolling do not commute. This example is easily expanded to model, e.g., the sideways motion of a car, or a truck with multiple trailers; see, e.g., Bloch (2003), Bressan and Piccoli (2007), and Bullo and Lewis (2005). In such cases longer iterated Lie brackets correspond to the more intricate control actions required to obtain, e.g., a pure sideways motion.

In the case of linear systems, if the Kalman rank condition $\operatorname{rank}[B, AB, A^2B, \dots, A^{n-1}B] = n$ is not satisfied, then all solution curves of the system starting from the same point $x(0) = p$ are at all times $T > 0$ constrained to lie in a proper affine subspace. In the nonlinear setting the role of the compound matrix of that condition is taken by the Lie algebra $L = L(f_0, f_1, \dots, f_m)$ of all finite linear combinations of iterated Lie brackets of the vector fields $f_i$. As an immediate consequence of the Frobenius integrability theorem, if at a point $x(0) = p$ the vector fields in $L$ span the whole tangent space, then it is possible to reach an open neighborhood of the initial point by concatenating flows of the system (1) that correspond to piecewise constant controls. Conversely, in the case of analytic vector fields and a compact set $U$ of admissible control values, the Hermann-Nagano theorem guarantees that if the dimension of the subspace $L(p) = \{f(p)\colon f \in L\}$ is less than $n$, i.e., not maximal, then all such trajectories are confined to stay in a lower-dimensional proper integral submanifold of $L$ through the point $p$. For a comprehensive introduction, see, e.g., the textbooks Bressan and Piccoli (2007), Isidori (1995), and Sontag (1998).

Controllability

Define the reachable set $R_T(p)$ as the set of all terminal points $x(T; u, p)$ at time $T$ of trajectories of (1) that start at the initial point $x(0) = p$ and correspond to admissible controls. Commonly known as the Lie algebra rank condition (LARC), the above condition determines whether the system is accessible from the point $p$, which means that for arbitrarily small time $T > 0$, the reachable set $R_T(p)$ has nonempty $n$-dimensional interior. For most applications one desires stronger controllability properties. Most amenable to Lie algebraic methods, and practically relevant, is small-time local controllability (STLC): The system is STLC from $p$ if $p$ lies in the interior of $R_T(p)$ for every $T > 0$. In the case that there is no drift vector field $f_0$, accessibility is equivalent to STLC. However, in general, the situation is much more intricate, and a rich literature is devoted to various necessary or sufficient conditions for STLC. A popular such condition is the Hermes condition. For this define the subspaces $S^1 = \operatorname{span}\{(\mathrm{ad}^j f_0, f_i)\colon 1 \le i \le m,\ j \in \mathbb{Z}_+\}$, and recursively $S^{k+1} = \operatorname{span}\{[g_1, g_k]\colon g_1 \in S^1,\ g_k \in S^k\}$. Here $(\mathrm{ad}^0 f, g) = g$, and recursively $(\mathrm{ad}^{k+1} f, g) = [f, (\mathrm{ad}^k f, g)]$. The Hermes condition guarantees, in the case of analytic vector fields and, e.g., $U = [-1,1]^m$, that if the system satisfies the (LARC) and for every $k \ge 1$, $S^{2k}(p) \subseteq S^{2k-1}(p)$, then the system is (STLC). For more general conditions, see Sussmann (1987) and also Kawski (1990) for a broader discussion.

The importance and value of Lie algebraic conditions may in large part be ascribed to their geometric character, their being invariant under
coordinate changes and feedback. In particular, in the analytic case, the Lie relations completely determine the local properties of the system, in the sense that a Lie algebra homomorphism between two systems gives rise to a local diffeomorphism that maps trajectories to trajectories (Sussmann 1974).

Exponential Lie Series

A central analytic tool in Lie algebraic methods that takes the role of Taylor expansions in the classical analysis of dynamical systems is the Chen-Fliess series, which associates to every admissible control $u\colon [0,T] \to U$ a formal power series

$$CF(u,T) = \sum_I \int_0^T du_I \cdot X_{i_1} \cdots X_{i_s} \tag{2}$$

over a set $\{X_0, X_1, \dots, X_m\}$ of noncommuting indeterminates (or letters). For every multi-index $I = (i_1, i_2, \dots, i_s) \in \{0,1,\dots,m\}^s$, $s \ge 0$, the coefficient of $X_I$ is the iterated integral defined recursively by

$$\int_0^T du_{(I,j)} = \int_0^T \left( \int_0^t du_I \right) du_j(t). \tag{3}$$

Upon evaluating this series via the substitutions $X_i \mapsto f_i$, it becomes an asymptotic series for the propagation of solutions of (1): For $f_j$, $\varphi$ analytic, $U$ compact, $p$ in a compact set, and $T \ge 0$ sufficiently small, one has

$$\varphi(x(T; u, p)) = \sum_I \int_0^T du_I \cdot (f_{i_1} \cdots f_{i_s}\varphi)(p). \tag{4}$$

One application of particular interest is to construct approximating systems of a given system (1) that preserve critical geometric properties, but which have a simpler structure. One such class is that of nilpotent systems, that is, systems whose Lie algebra $L = L(f_0, f_1, \dots, f_m)$ is nilpotent, and for which solutions can be found by simple quadratures. While truncations of the Chen-Fliess series never correspond to control systems of the same form, much work has been done in recent years to rewrite this series in more useful formats. For example, the infinite directed exponential product expansion in Sussmann (1986) that uses Hall trees may immediately be interpreted in terms of free nilpotent systems and consequently helps in the construction of nilpotent approximating systems. More recent work, much of it of a combinatorial algebra nature and utilizing the underlying Hopf algebras, further simplifies similar expansions and in particular yields explicit formulas for a continuous Baker-Campbell-Hausdorff formula or for the logarithm of the Chen-Fliess series (Gehrig and Kawski 2008).

Observability and Realization

In the setting of linear systems a well-defined algebraic sense dual to the concept of controllability is that of observability. Roughly speaking, the system (1) is observable if knowledge of the output $y(t) = \varphi(x(t; u, p))$ over an arbitrarily small interval suffices to construct the current state $x(t; u, p)$ and indeed the past trajectory $x(\,\cdot\,; u, p)$. In the linear setting observability is equivalent to the rank condition $\operatorname{rank}[C^T, (CA)^T, \dots, (CA^{n-1})^T] = n$ being satisfied. In the nonlinear setting, the place of the rows of this compound matrix is taken by the functions in the observation algebra, which consists of all finite linear combinations of iterated Lie derivatives $f_{i_s} \cdots f_{i_1}\varphi$ of the output function.

Similar to the Hankel matrices introduced in the latter setting, in the case of a finite Lie rank, one again can use the output algebra to construct realizations in the form of (1) for systems which are initially only given in terms of input-output descriptions, or in terms of formal Fliess operators; see, e.g., Fliess (1980), Gray and Wang (2002), and Jakubczyk (1986) for further reading.
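The unicycle computation from the noncommuting-flows discussion above can be checked numerically. The sketch below is a minimal illustration in plain Python (the helper names `roll`, `turn`, and `commutator_motion` are ours, not from any library): it composes the four exact flows of the maneuver turn left, roll forward, turn back, roll backward, and confirms that the net displacement is sideways, scaling like $t^2$ in the bracket direction $(0,1,0)^T$ at the origin.

```python
import math

def roll(state, s):
    # Exact flow of f1 = (cos(theta), sin(theta), 0): roll a signed distance s.
    x, y, theta = state
    return (x + s * math.cos(theta), y + s * math.sin(theta), theta)

def turn(state, s):
    # Exact flow of f2 = (0, 0, 1): change the steering angle by s.
    x, y, theta = state
    return (x, y, theta + s)

def commutator_motion(t):
    # Turn left, roll forward, turn back, roll backward, each for "time" t.
    p = (0.0, 0.0, 0.0)
    for flow, s in ((turn, t), (roll, t), (turn, -t), (roll, -t)):
        p = flow(p, s)
    return p

# Net motion is approximately t^2 * (0, 1, 0): sideways, with only O(t^3)
# parasitic displacement along x, and the steering angle returns exactly to 0.
for t in (0.1, 0.05, 0.025):
    x, y, theta = commutator_motion(t)
    print(f"t={t}: x={x:.2e}  y={y:.2e}  theta={theta:.0e}  y/t^2={y / t**2:.4f}")
```

Halving $t$ roughly quarters the sideways displacement ($y = t\sin t \approx t^2$), while the unwanted $x$-displacement ($t(\cos t - 1) \approx -t^3/2$) vanishes an order faster: the signature of a direction that is generated by a bracket rather than being directly available to the controls.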
Optimal Control

In a well-defined geometric way, conditions for optimal control are dual to conditions for controllability and thus are directly amenable to Lie algebraic methods. Instead of considering a separate functional

$$J(u) = \psi(x(T; u, p)) + \int_0^T L(t, x(t; u, p), u(t))\, dt \tag{5}$$

to be minimized, it is convenient for our purposes to augment the state by, e.g., defining $\dot x_0 = 1$ and $\dot x_{n+1} = L(x_0, x, u)$. For example, in the case of time-optimal control, one again obtains an enlarged system of the same form (1); else one utilizes more general Lie algebraic methods that also apply to systems not necessarily affine in the control.

The basic picture for systems with a compact set $U$ of admissible values of the controls involves the attainable funnel $R_T(p)$ consisting of all trajectories of the system (1) starting at $x(0) = p$ that correspond to admissible controls. The trajectory corresponding to an optimal control $u^*$ must at time $T$ lie on the boundary of the funnel $R_T(p)$, and hence also at all prior times (using the invariance of domain property implied by the continuity of the flow). Hence one may associate a covector field along such an optimal trajectory that at every time points in the direction of an outward normal. The Pontryagin Maximum Principle is a first-order characterization of such trajectory-covector field pairs. Its pointwise maximization condition essentially says that if at any time $t_0 \in [0,T]$ one replaces the optimal control $u^*(\cdot)$ by any admissible control variation on an interval $[t_0, t_0+\varepsilon]$, then such variation may be transported along the flow to yield, in the limit as $\varepsilon \searrow 0$, an inward pointing tangent vector to the reachable set $R_T(p)$ at $x(T; u^*, p)$. To obtain stronger higher-order conditions for maximality, one may combine several such families of control variations. The effects of such combinations are again calculated in terms of iterated Lie brackets of the vector fields $f_i$. Indeed, necessary conditions for optimality, for a trajectory to lie on the boundary of the funnel $R_T(p)$, immediately translate into sufficient conditions for STLC, for the initial point to lie in the interior of $R_T(p)$, and vice versa. For recent work employing Lie algebraic methods in optimality conditions, see, e.g., Agrachev et al. (2002).

Summary and Future Research

Lie algebraic techniques may be seen as a direct generalization of the matrix linear algebra tools that have proved so successful in the analysis and design of linear systems. However, in the nonlinear case, the known algebraic rank conditions still exhibit gaps between necessary and sufficient conditions for controllability and optimality. Also, new, not yet fully understood, topological and resonance obstructions stand in the way of controllability implying stabilizability. Systems that exhibit special structure, such as living on Lie groups, or being second order such as typical mechanical systems, are amenable to further refinements of the theory; compare, e.g., the use of affine connections and the symmetric product in Bullo et al. (2000). Other directions of ongoing and future research involve the extension of Lie algebraic methods to infinite-dimensional systems and the generalization of formulas to systems with less regularity; see, e.g., the work by Rampazzo and Sussmann (2007) on Lipschitz vector fields, thereby establishing closer connections with nonsmooth analysis (Clarke 1983) in control.

Cross-References

- Differential Geometric Methods in Nonlinear Control
- Feedback Linearization of Nonlinear Systems

Bibliography

Agrachev AA, Stefani G, Zezza P (2002) Strong optimality for a bang-bang trajectory. SIAM J Control Optim 41:991–1014 (electronic)
Bloch AM (2003) Nonholonomic mechanics and control. Interdisciplinary applied mathematics, vol 24. Springer, New York. With the collaboration of J. Baillieul, P. Crouch and J. Marsden, with scientific input from P. S. Krishnaprasad, R. M. Murray and D. Zenkov
Bressan A, Piccoli B (2007) Introduction to the mathematical theory of control. AIMS series on applied mathematics, vol 2. American Institute of Mathematical Sciences (AIMS), Springfield
Bullo F, Lewis AD (2005) Geometric control of mechanical systems: modeling, analysis, and design for simple mechanical control systems. Texts in applied mathematics, vol 49. Springer, New York
Bullo F, Leonard NE, Lewis AD (2000) Controllability and motion algorithms for underactuated Lagrangian systems on Lie groups: mechanics and nonlinear control systems. IEEE Trans Autom Control 45:1437–1454
Clarke FH (1983) Optimization and nonsmooth analysis. Canadian Mathematical Society series of monographs and advanced texts. Wiley, New York. A Wiley-Interscience publication
Fliess M (1980) Realizations of nonlinear systems and abstract transitive Lie algebras. Bull Am Math Soc (N.S.) 2:444–446
Gehrig E, Kawski M (2008) A Hopf-algebraic formula for compositions of noncommuting flows. In: Proceedings of the IEEE Conference on Decision and Control, Cancun, pp 1569–1574
Gray WS, Wang Y (2002) Fliess operators on $L_p$ spaces: convergence and continuity. Syst Control Lett 46:67–74
Isidori A (1995) Nonlinear control systems. Communications and control engineering series, 3rd edn. Springer, Berlin
Jakubczyk B (1986) Local realizations of nonlinear causal operators. SIAM J Control Optim 24:230–242
Kawski M (1990) High-order small-time local controllability. In: Nonlinear controllability and optimal control. Monographs and textbooks in pure and applied mathematics, vol 133. Dekker, New York, pp 431–467
Rampazzo F, Sussmann HJ (2007) Commutators of flow maps of nonsmooth vector fields. J Differ Equ 232:134–175
Sontag ED (1998) Mathematical control theory. Texts in applied mathematics, vol 6, 2nd edn. Springer, New York. Deterministic finite-dimensional systems
Sussmann HJ (1974) An extension of a theorem of Nagano on transitive Lie algebras. Proc Am Math Soc 45:349–356
Sussmann HJ (1986) A product expansion for the Chen series. In: Theory and applications of nonlinear control systems (Stockholm, 1985). North-Holland, Amsterdam, pp 323–335
Sussmann HJ (1987) A general theorem on local controllability. SIAM J Control Optim 25:158–194

Linear Matrix Inequality Techniques in Optimal Control

Robert E. Skelton
University of California, San Diego, CA, USA

Abstract

LMI (linear matrix inequality) techniques offer more flexibility in the design of dynamic linear systems than techniques that minimize a scalar functional for optimization. For linear state space models, multiple goals (performance bounds) can be characterized in terms of LMIs, and these can serve as the basis for controller optimization via finite-dimensional convex feasibility problems. LMI formulations of various standard control problems are described in this article, including dynamic feedback stabilization, covariance control, LQR, $H_\infty$ control, $L_\infty$ control, and information architecture design.

Keywords

Control system design; Covariance control; $H_\infty$ control; $L_\infty$ control; LQR/LQG; Matrix inequalities; Sensor/actuator design

Early Optimization History

Hamilton invented state space models of nonlinear dynamic systems with his generalized momenta work in the 1800s (Hamilton 1834, 1835), but at that time the lack of computational tools prevented broad acceptance of the first-order form of dynamic equations. With the rapid development of computers in the 1960s, state space models evoked a formal control theory for minimizing a scalar function of control and state, propelled by the calculus of variations and Pontryagin's maximum principle. Optimal control has been a pillar of control theory for the last 50 years. In fact, all of the problems discussed in this article can perhaps be solved
by minimizing a scalar functional, but a search is required to find the right functional. Globally convergent algorithms are available to do just that for quadratic functionals, but more direct methods are now available.

Since the early 1990s, the focus for linear system design has been to pose control problems as feasibility problems, to satisfy multiple constraints. Since then, feasibility approaches have dominated design decisions, and such feasibility problems may be convex or not. If the problem can be reduced to a set of linear matrix inequalities (LMIs) to solve, then convexity is proven. However, failure to find such LMI formulations of the problem does not mean it is not convex, and computer-assisted methods for convex problems are available to avoid the search for LMIs (see Camino et al. 2003).

In the case of linear dynamic models of stochastic processes, optimization methods led to the popularization of linear quadratic Gaussian (LQG) optimal control, which had globally optimal solutions (see Skelton 1988). The first two moments of the stochastic process (the mean and the covariance) can be controlled with these methods, even if the distribution of the random variables involved is not Gaussian. Hence, LQG became just an acronym for the solution of quadratic functionals of control and state variables, even when the stochastic processes were not Gaussian. The label LQG was often used even for deterministic problems, where a time integral, rather than an expectation operator, was minimized, with given initial conditions or impulse excitations. These were formally called LQR (linear quadratic regulator) problems. Later the book (Skelton 1988) gave the formal conditions under which the LQG and the LQR answers were numerically identical, and this particular version of LQR was called the deterministic LQG.

It was always recognized that the quadratic form of the state and control in the LQG problem was an artificial goal. The real control goals usually involved prespecified performance bounds on each of the outputs and bounds on each channel of control. This leads to matrix inequalities (MIs) rather than scalar minimizations. While it was known early that any stabilizing linear controller could be obtained by some choice of weights in an LQG optimization problem (see Chap. 6 and references in Skelton 1988), it was not known until the 1980s what particular choice of weights in LQG would yield a solution to the matrix inequality (MI) problem. See early attempts in Skelton (1988), and see Zhu and Skelton (1992) and Zhu et al. (1997) for a globally convergent algorithm to find such LQG weights when the MI problem has a solution. Since then, rather than stating a minimization problem for a meaningless sum of outputs and inputs, linear control problems can now be stated simply in terms of norm bounds on each input vector and/or each output vector of the system ($L_2$ bounds, $L_\infty$ bounds, or variance bounds and covariance bounds). These feasibility problems are convex for state feedback or full-order output feedback controllers (the focus of this elementary introduction), and these can be solved using linear matrix inequalities (LMIs), as illustrated in this article. However, the earliest approach to these MI problems was iterative LQG solutions (to find the correct weights to use in the quadratic penalty of the state), as in Skelton (1988), Zhu and Skelton (1992), and Zhu et al. (1997).

Matrix Inequalities

Let $Q$ be any square matrix. The linear matrix inequality (LMI) "$Q > 0$" is just a short-hand notation to represent a certain scalar inequality. That is, the matrix notation "$Q > 0$" means "the scalar $x^TQx$ is positive for all values of $x$, except $x = 0$." Obviously this is a property of $Q$, not $x$, hence the abbreviated matrix notation $Q > 0$. This is called a linear matrix inequality (LMI), since the matrix unknown $Q$ appears linearly in the inequality $Q > 0$. Note also that any square matrix $Q$ can be written as the sum of a symmetric matrix $Q_s = \frac{1}{2}(Q + Q^T)$ and a skew-symmetric matrix $Q_k = \frac{1}{2}(Q - Q^T)$, but $x^TQ_kx = 0$, so only the symmetric part of the matrix $Q$ affects the scalar $x^TQx$. We assume hereafter without loss of generality that $Q$ is
symmetric. The notation "$Q \ge 0$" means "the scalar $x^TQx$ cannot be negative for any $x$."

Lyapunov proved that $x(t)$ converges to zero if there exists a matrix $Q$ such that, along the nonzero trajectories of a dynamic system (e.g., the system $\dot x = Ax$), two scalars have the property $x(t)^TQx(t) > 0$ and $\frac{d}{dt}(x^T(t)Qx(t)) < 0$. This proves that the following statements are all equivalent:
1. For any initial condition $x(0)$ of the system $\dot x = Ax$, the state $x(t)$ converges to zero.
2. All eigenvalues of $A$ lie in the open left half plane.
3. There exists a matrix $Q$ with the two properties $Q > 0$ and $QA + A^TQ < 0$.
4. The set of all quadratic Lyapunov functions that can be used to prove the stability or instability of the null solution of $\dot x = Ax$ is given by $x^TQ^{-1}x$, where $Q$ is any square matrix with the two properties of item 3 above.

LMIs are prevalent throughout the fundamental concepts of control theory, such as controllability and observability. For the linear system example $\dot x = Ax + Bu$, $y = Cx$, the "observability Gramian" is the infinite integral $Q = \int_0^\infty e^{A^Tt}C^TCe^{At}\,dt$. Furthermore $Q > 0$ if and only if $(A, C)$ is an observable pair, and $Q$ is bounded only if the observable modes are asymptotically stable. When it exists, the solution of $QA + A^TQ + C^TC = 0$ satisfies $Q > 0$ if and only if the matrix pair $(A, C)$ is observable.

Likewise the "controllability Gramian" $X = \int_0^\infty e^{At}BB^Te^{A^Tt}\,dt > 0$ if and only if the pair $(A, B)$ is controllable. If $X$ exists, it satisfies $XA^T + AX + BB^T = 0$, and $X > 0$ if and only if $(A, B)$ is a controllable pair. Note also that the matrix pair $(A, B)$ is controllable for any $A$ if $BB^T > 0$, and the matrix pair $(A, C)$ is observable for any $A$ if $C^TC > 0$. Hence, the existence of $Q > 0$ or $X > 0$ satisfying either $(QA + A^TQ < 0)$ or $(AX + XA^T < 0)$ is equivalent to the statement that "all eigenvalues of $A$ lie in the open left half plane."

It should now be clear that the set of all stabilizing state feedback controllers, $u = Gx$, is parametrized by the inequalities $Q > 0$, $Q(A + BG) + (A + BG)^TQ < 0$. The difficulty in this MI is the appearance of the product of the two unknowns $Q$ and $G$, so more work is required to show how to use LMIs to solve this problem.

In the sequel some techniques are borrowed from linear algebra, where a linear matrix equality (LME) $\Gamma G\Lambda = \Theta$ may or may not have a solution $G$. For LMEs there are two separate questions to answer. The first question is "Does there exist a solution?" and the answer is "if and only if $\Gamma\Gamma^+\Theta\Lambda^+\Lambda = \Theta$." The second question is "What is the set of all solutions?" and the answer is "$G = \Gamma^+\Theta\Lambda^+ + Z - \Gamma^+\Gamma Z\Lambda\Lambda^+$, where $Z$ is arbitrary, and the $+$ symbol denotes the pseudo-inverse." LMI approaches employ the same two questions by formulating the necessary and sufficient conditions for the existence of an LMI solution and then parametrizing all solutions.

Perhaps the earliest book on LMI control methods was Boyd et al. (1994), but the results and notations used herein are taken from Skelton et al. (1998). Other important LMI papers and books can give the reader a broader background, including Iwasaki and Skelton (1994), Gahinet and Apkarian (1994), de Oliveira et al. (2002), Li et al. (2008), de Oliveira and Skelton (2001), Camino et al. (2001, 2003), Boyd and Vandenberghe (2004), Iwasaki et al. (2000), Khargonekar and Rotea (1991), Vandenberghe and Boyd (1996), Scherer (1995), Scherer et al. (1997), Balakrishnan et al. (1994), Gahinet et al. (1995), and Dullerud and Paganini (2000).

Control Design Using LMIs

Consider the feedback control system

$$\begin{bmatrix} \dot x_p \\ y \\ z \end{bmatrix} = \begin{bmatrix} A_p & D_p & B_p \\ C_p & D_y & B_y \\ M_p & D_z & 0 \end{bmatrix} \begin{bmatrix} x_p \\ w \\ u \end{bmatrix}, \qquad \begin{bmatrix} u \\ \dot x_c \end{bmatrix} = \begin{bmatrix} D_c & C_c \\ B_c & A_c \end{bmatrix} \begin{bmatrix} z \\ x_c \end{bmatrix} = G \begin{bmatrix} z \\ x_c \end{bmatrix}, \tag{1}$$

where $z$ is the measurement vector, $y$ is the output to be controlled, $u$ is the control vector, $x_p$ is the plant state vector, $x_c$ is the state of the controller,
and $w$ is the external disturbance (in some cases below we treat $w$ as zero-mean white noise). We seek to choose the control matrix $G$ to satisfy given upper bounds on the output covariance $E[yy^T] \le \bar Y$, where $E$ represents the steady-state expectation operator in the stochastic case (i.e., when $w$ is white noise), and in the deterministic case $E$ represents the infinite integral of the matrix $[yy^T]$. The math is the same in each case, with appropriate interpretations of certain matrices. For a rigorous equivalence of the deterministic and stochastic interpretations, see Skelton (1988).

By defining the matrices

$$x = \begin{bmatrix} x_p \\ x_c \end{bmatrix}, \qquad \begin{bmatrix} A_{cl} & B_{cl} \\ C_{cl} & D_{cl} \end{bmatrix} = \begin{bmatrix} A & D \\ C & F \end{bmatrix} + \begin{bmatrix} B \\ H \end{bmatrix} G \begin{bmatrix} M & E \end{bmatrix}, \tag{2}$$

$$A = \begin{bmatrix} A_p & 0 \\ 0 & 0 \end{bmatrix}, \quad B = \begin{bmatrix} B_p & 0 \\ 0 & I \end{bmatrix}, \quad M = \begin{bmatrix} M_p & 0 \\ 0 & I \end{bmatrix}, \quad D = \begin{bmatrix} D_p \\ 0 \end{bmatrix}, \quad E = \begin{bmatrix} D_z \\ 0 \end{bmatrix}, \tag{3}$$

$$C = \begin{bmatrix} C_p & 0 \end{bmatrix}, \quad H = \begin{bmatrix} B_y & 0 \end{bmatrix}, \quad F = D_y, \tag{4}$$

one can write the closed-loop system dynamics in the form

$$\begin{bmatrix} \dot x \\ y \end{bmatrix} = \begin{bmatrix} A_{cl} & B_{cl} \\ C_{cl} & D_{cl} \end{bmatrix} \begin{bmatrix} x \\ w \end{bmatrix}. \tag{5}$$

Often it is of interest to characterize the set of all controllers that can satisfy performance bounds on both the outputs and inputs, $E[yy^T] \le \bar Y$ and $E[uu^T] \le \bar U$, and we call these covariance control problems. But without prespecified performance bounds $\bar Y$, $\bar U$, one can require stability only. Such examples are given below.

Many Control Problems Reduce to the Same LMI

Let the left (right) null spaces of any matrix $B$ be defined by matrices $U_B$ ($V_B$), where $U_B^TB = 0$, $U_B^TU_B > 0$ ($BV_B = 0$, $V_B^TV_B > 0$). For any given matrices $\Gamma, \Lambda, \Theta$, Chap. 9 of the book (Skelton et al. 1998) provides all $G$ which solve

$$\Gamma G\Lambda + (\Gamma G\Lambda)^T + \Theta < 0, \tag{6}$$

and proves that there exists such a matrix $G$ if and only if the following two conditions hold:

$$U_\Gamma^T\Theta U_\Gamma < 0, \quad\text{or}\quad \Gamma\Gamma^T > 0; \tag{7}$$

$$V_\Lambda^T\Theta V_\Lambda < 0, \quad\text{or}\quad \Lambda^T\Lambda > 0. \tag{8}$$

If $G$ exists, then one set of such $G$ is given by

$$G = -\sigma\Gamma^T\Phi\Lambda^T(\Lambda\Phi\Lambda^T)^{-1}, \qquad \Phi = (\sigma\Gamma\Gamma^T - \Theta)^{-1}, \tag{9}$$

where $\sigma > 0$ is an arbitrary scalar such that

$$\Phi = (\sigma\Gamma\Gamma^T - \Theta)^{-1} > 0. \tag{10}$$

All $G$ which solve the problem are given by Theorem 2.3.12 in Skelton et al. (1998). As elaborated in Chap. 9 of Skelton et al. (1998), 17 different control problems (using either state feedback or full-order dynamic controllers) all reduce to this same mathematical problem. That is, by defining the appropriate $\Theta, \Lambda, \Gamma$, a very large number of different control problems, including the characterization of all stabilizing controllers, covariance control, $H$-infinity control, $L$-infinity control, LQG control, and $H_2$ control, can be reduced to the same matrix inequality (6). Several examples from Skelton et al. (1998) follow.

Stabilizing Control

There exists a controller $G$ that stabilizes the system (1) if and only if (7) and (8) hold, where the matrices are defined by

$$\begin{bmatrix} \Gamma & \Lambda^T & \Theta \end{bmatrix} = \begin{bmatrix} B & XM^T & AX + XA^T \end{bmatrix}. \tag{11}$$

One can also write such results in another way, as in Corollary 6.2.1 of Skelton et al. (1998, p. 135): There exists a control of the form $u = Gx$ that can stabilize the system $\dot x = Ax + Bu$ if and only if there exists a matrix $X > 0$
satisfying $B^\perp(AX + XA^T)(B^\perp)^T < 0$, where $B^\perp$ denotes the left null space of $B$. In this case all stabilizing controllers may be parametrized by $G = -B^TP + LQ^{1/2}$, for any $Q > 0$ and a $P > 0$ satisfying $PA + A^TP - PBB^TP + Q = 0$. The matrix $L$ is any matrix that satisfies the norm bound $\|L\| < 1$. Youla et al. (1976) provided a parametrization of the set of all stabilizing controllers, but the parametrization was infinite dimensional (as it did not impose any restriction on the order or form of the controller). So for finite calculations one had to truncate the set to a finite number before optimization or stabilization started. As noted above, on the other hand, all stabilizing state feedback controllers $G$ can be parametrized in terms of an arbitrary but finite-dimensional norm-bounded matrix $L$. Similar results apply for dynamic controllers of any fixed order (see Chap. 6 in Skelton et al. 1998).

Covariance Upper Bound Control

In the system (1), suppose that $D_y = 0$, $B_y = 0$ and that $w$ is zero-mean white noise with intensity $I$. Let a required upper bound $\bar Y > 0$ on the steady-state output covariance $Y = E[yy^T]$ be given. The following statements are equivalent:
(i) There exists a controller $G$ that solves the covariance upper bound control problem $Y < \bar Y$.
(ii) There exists a matrix $X > 0$ such that $Y = CXC^T < \bar Y$ and (7) and (8) hold, where the matrices are defined by

$$\begin{bmatrix} \Gamma & \Lambda^T & \Theta \end{bmatrix} = \begin{bmatrix} B & XM^T & AX + XA^T & D \\ 0 & E^T & D^T & -I \end{bmatrix} \tag{12}$$

($\Theta$ occupies the last two columns).

Proof is provided by Theorem 9.1.2 in Skelton et al. (1998).

Linear Quadratic Regulator

Consider the linear time-invariant system (1). Suppose that $D_y = 0$, $D_z = 0$ and that $w$ is the impulsive disturbance $w(t) = w_0\delta(t)$. Let a performance bound $\gamma > 0$ be given, where the required performance is to keep the integral squared output ($\|y\|_{L_2}^2$) less than the prespecified value: $\|y\|_{L_2} < \gamma$ for any vector $w_0$ such that $w_0^Tw_0 \le 1$, and $x_0 = 0$. This problem is labeled linear quadratic regulator (LQR). The following statements are equivalent:
(i) There exists a controller $G$ that solves the LQR problem.
(ii) There exists a matrix $Y > 0$ such that $\|D^TYD\| < \gamma^2$ and (7) and (8) hold, where the matrices are defined by

$$\begin{bmatrix} \Gamma & \Lambda^T & \Theta \end{bmatrix} = \begin{bmatrix} YB & M^T & YA + A^TY & M^T \\ H & 0 & M & -I \end{bmatrix}. \tag{13}$$

Proof is provided by Theorem 9.1.3 in Skelton et al. (1998).

$H_\infty$ Control

LMI techniques provided the first papers to solve the general $H_\infty$ problem, without any restrictions on the plant. See Iwasaki and Skelton (1994) and Gahinet and Apkarian (1994).

Let the closed-loop transfer matrix from $w$ to $y$ with the controller in (1) be denoted by $T(s)$:

$$T(s) = C_{cl}(sI - A_{cl})^{-1}B_{cl} + D_{cl}. \tag{14}$$

The $H_\infty$ control problem can be defined as follows:

Let a performance bound $\gamma > 0$ be given. Determine whether or not there exists a controller $G$ in (1) which asymptotically stabilizes the system and yields the closed-loop transfer matrix (14) such that the peak value of the frequency response is less than $\gamma$. That is, $\|T\|_{H_\infty} = \sup_\omega \|T(j\omega)\| < \gamma$.

For the $H_\infty$ control problem, we have the following result. Let a required $H_\infty$ performance bound $\gamma > 0$ be given. The following statements are equivalent:
(i) A controller $G$ solves the $H_\infty$ control problem.
(ii) There exists a matrix $X > 0$ such that (7) and (8) hold, where
Linear Matrix Inequality Techniques in Optimal Control 641
\[
\begin{bmatrix} \Gamma & \Lambda^T & \Theta \end{bmatrix} =
\begin{bmatrix} B & XM^T & AX + XA^T & XC^T & D \\ H & 0 & CX & -\gamma I & F \\ 0 & E^T & D^T & F^T & -\gamma I \end{bmatrix}
\tag{15}
\]

(Θ occupies the last three columns).
Proof is provided by Theorem 9.1.5 in Skelton et al. (1998).

L∞ Control

The peak value of the frequency response is controlled by the above H∞ controller. A similar theorem can be written to control the peak in the time domain.

Define sup_t y(t)ᵀy(t) = ‖y‖²_{L∞}, and let the statement ‖y‖_{L∞} < γ mean that the peak value of y(t)ᵀy(t) is less than γ². Suppose that D_y = 0 and B_y = 0. There exists a controller G which maintains ‖y‖_{L∞} < γ in the presence of any energy-bounded input w(t) (i.e., ∫₀^∞ wᵀw dt ≤ 1) if and only if there exists a matrix X > 0 such that CXCᵀ < γ²I and (7) and (8) hold, where

\[
\begin{bmatrix} \Gamma & \Lambda^T & \Theta \end{bmatrix} =
\begin{bmatrix} B & XM^T & AX + XA^T & D \\ 0 & E^T & D^T & -\gamma I \end{bmatrix}.
\tag{16}
\]

Proof is provided by Theorem 9.1.4 in Skelton et al. (1998).

Information Architecture in Estimation and Control Problems

In the typical “control problem” that occupies most research literature, the sensors and actuators have already been selected. Yet the selection of sensors and actuators and their locations greatly affects the ability of the control system to do its job efficiently. Perhaps in one location a high-precision sensor is needed, and in another location high precision is not needed, and paying for high precision in that location would therefore be a waste of resources. These decisions must be influenced by the control dynamics, which are yet to be designed. How does one know where to effectively spend money to improve the system? To answer this question, we must optimize the information architecture jointly with the control law.

Let us consider the problem of selecting the control law jointly with the selection of the precision (defined here as the inverse of the noise intensity) of each actuator/sensor, subject to the constraint of specified upper bounds on the covariance of output error and control signals, and specified upper bounds on the sensor/actuator cost. We assume the cost of these devices is proportional to their precision (i.e., the cost is equal to the price per unit of precision times the precision). Traditionally, with full-order controllers and prespecified sensor/actuator instruments (with specified precisions), this is a well-known solved convex problem (which means it can be converted to an LMI problem if desired); see Chap. 6 of Skelton et al. (1998). If we enlarge the domain of freedom to include sensor/actuator precisions, it is not obvious whether the feasibility problem is convex or not. The following shows that this problem of including the sensor/actuator precisions within the control design problem is indeed convex and therefore completely solved. The proof is provided in Li et al. (2008).

Consider the linear control system (1)–(5). Assume that the cost of sensors and actuators is proportional to their precision, which we herein define to be the inverse of the noise intensity (or variance, in the discrete-time case). So if the price per unit of precision of the i-th sensor/actuator is P_ii, and if the variance (or intensity) of the noise associated with the i-th sensor/actuator is W_ii, then the total cost of all sensors and actuators is Σᵢ P_ii W_ii⁻¹, or simply tr(PW⁻¹), where P = diag(P_ii) and W⁻¹ = diag(W_ii⁻¹).

Consider the control system (1). Suppose that D_y = 0, B_y = 0, w = [w_sᵀ w_aᵀ]ᵀ is the zero-mean sensor/actuator noise, D_p = [0 D_a], and D_z = [D_s 0]. If $̄ represents the allowed upper bound on sensor/actuator costs, there exists a dynamic controller G that satisfies the constraints

\[
E[uu^T] < \bar{U}, \qquad E[yy^T] < \bar{Y}, \qquad \mathrm{tr}(PW^{-1}) < \bar{\$}
\tag{17}
\]
in the presence of sensor/actuator noise with intensity diag(W_ii) = W (which, like G, should be considered a design variable, not fixed a priori) if and only if there exist matrices L, F, Q, X, Z, W⁻¹ such that

\[
\mathrm{tr}(PW^{-1}) < \bar{\$}
\tag{18}
\]

\[
\begin{bmatrix} \bar{Y} & C_p X & C_p \\ (C_p X)^T & X & I \\ C_p^T & I & Z \end{bmatrix} > 0, \qquad
\begin{bmatrix} \bar{U} & L & 0 \\ L^T & X & I \\ 0 & I & Z \end{bmatrix} > 0,
\qquad
\begin{bmatrix} \Phi_{11} & \Phi_{21}^T \\ \Phi_{21} & -W^{-1} \end{bmatrix} < 0,
\tag{19}
\]

\[
\Phi_{21} = \begin{bmatrix} D_a & 0 \\ Z D_a & F D_s \end{bmatrix}^T, \qquad
\Phi_{11} = \Psi + \Psi^T, \quad
\Psi = \begin{bmatrix} A_p X + B_p L & A_p \\ Q & Z A_p + F M_p \end{bmatrix}.
\tag{20}
\]

Note that the matrix inequalities (18)–(20) are LMIs in the collection of variables (L, F, Q, X, Z, W⁻¹), whereby joint control/sensor/actuator design is a convex problem. Assume a solution (L, F, Q, X, Z, W) is found for the LMIs (18)–(20). Then the problem (17) is solved by the controller

\[
G = \begin{bmatrix} 0 & I \\ V_l^{-1} & -V_l^{-1} Z B_p \end{bmatrix}
\begin{bmatrix} Q - Z A_p X & F \\ L & 0 \end{bmatrix}
\begin{bmatrix} 0 & V_r^{-1} \\ I & -M_p X V_r^{-1} \end{bmatrix},
\tag{21}
\]

where V_l and V_r are left and right factors of the matrix I − YX (which can be found from the singular value decomposition I − YX = UΣVᵀ = (UΣ^{1/2})(Σ^{1/2}Vᵀ) = (V_l)(V_r)).

To emphasize the theme of this article, relating optimization to LMIs, we note that three optimization problems present themselves in the above problem with its three constraints: control effort Ū, output performance Ȳ, and instrument costs $̄. To solve optimization problems, one can fix any two of these prespecified upper bounds and iteratively reduce the level set value of the third “constraint” until feasibility is lost. This process minimizes the resource expressed by the third constraint, while enforcing the other two constraints.

As an example, if cost is not a concern, one can always set large limits for $̄ and discover the best assignment of sensor/actuator precisions for the specified performance requirements. The precisions produced by the algorithm are the values W_ii⁻¹ obtained from the solution (18)–(20), where the observed rankings W_ii⁻¹ > W_jj⁻¹ > W_kk⁻¹ > … indicate which sensors or actuators are most critical to the required performance goals (Ū, Ȳ, $̄). If any precision W_nn⁻¹ is essentially zero compared to the other required precisions, then the math is asserting that the information from this sensor (n) is not important for the control objectives specified, or that the control signals through this actuator channel (n) are ineffective in controlling the system to these specifications. This information leads us to a technique for choosing the best sensors and actuators and their locations.

The previous discussion provides the precisions (W_ii⁻¹) required of each sensor and each actuator in the system. Our final application of this theory locates sensors and actuators in a large-scale system by discarding the least effective ones. Suppose we solve any of the above feasibility problems by starting with the entire admissible set of sensors and actuators (without regard to cost). For example, in a flexible structure control problem we might not know whether to place a rate sensor or a displacement sensor at a given location, so we add both. We might not know whether to use torque or force actuators, so we add both. We fill up the system with all the possibilities we might want to consider and let the above precision rankings (available after the above LMI problem is solved) reveal how much precision is needed at each location and at each sensor/actuator. If there is a large gap in the precisions required (say W_11⁻¹ > W_22⁻¹ > W_33⁻¹ ≫ … W_nn⁻¹), then delete sensor/actuator n and repeat the LMI problem with one less sensor or actuator. Continue deleting sensors/actuators in
this manner until feasibility is lost. Then this algorithm, stopping at the previous iteration, has selected the best distribution of sensors/actuators for solving the specific problem specified by the allowable bounds ($̄, Ū, Ȳ).

The most important contribution of the above algorithm has been to extend control theory to solve system design problems that involve more than just designing control gains. This enlarges the set of solved linear control problems, from solutions of linear controllers with sensors/actuators prespecified to solutions which specify the sensor/actuator requirements jointly with the control solution.

Summary

LMI techniques provide more powerful tools for designing dynamic linear systems than techniques that minimize a scalar functional for optimization, since multiple goals (bounds) can be achieved for each of the outputs and inputs. Optimal control has been a pillar of control theory for the last 50 years. In fact, all of the problems discussed in this article can perhaps be solved by minimizing a scalar functional, but a search is required to find the right functional. Globally convergent algorithms are available to do just that for quadratic functionals. But more direct methods are now available (since the early 1990s) for satisfying multiple constraints. Since then, feasibility approaches have dominated design decisions (at least for linear systems), and such feasibility problems may be convex or not. If the problem can be reduced to a set of LMIs to solve, then convexity is proven. However, failure to find such LMI formulations of the problem does not mean it is not convex, and computer-assisted methods for convex problems are available to avoid the search for LMIs (see Camino et al. 2003). Optimization can also be achieved with LMI methods by reducing the level set for one of the bounds while maintaining all the other bounds. This level set is reduced iteratively, between convex (LMI) solutions, until feasibility is lost. A most amazing fact is that most of the common linear control design problems all reduce to exactly the same matrix inequality problem (6). The set of such equivalent problems includes LQR, the set of all stabilizing controllers, the set of all H∞ controllers, and the set of all L∞ controllers. The discrete and robust versions of these problems are also included in this equivalent set; 17 control problems have been found to be equivalent to LMI problems.

LMI techniques extend the range of solvable system design problems beyond just control design. By integrating information architecture and control design, one can simultaneously choose the control gains and the precision required of all sensors/actuators to satisfy the closed-loop performance constraints. These techniques can be used to select the information (with precision requirements) required to solve a control or estimation problem, using the best economic solution (minimal precision). For a more complete discussion of LMI problems in control, read Dullerud and Paganini (2000), de Oliveira et al. (2002), Li et al. (2008), de Oliveira and Skelton (2001), Gahinet and Apkarian (1994), Iwasaki and Skelton (1994), Camino et al. (2001, 2003), Skelton et al. (1998), Boyd and Vandenberghe (2004), Boyd et al. (1994), Iwasaki et al. (2000), Khargonekar and Rotea (1991), Vandenberghe and Boyd (1996), Scherer (1995), Scherer et al. (1997), Balakrishnan et al. (1994), and Gahinet et al. (1995).

Cross-References

▶ H-Infinity Control
▶ H2 Optimal Control
▶ Linear Quadratic Optimal Control
▶ LMI Approach to Robust Control
▶ Stochastic Linear-Quadratic Control

Bibliography

Balakrishnan V, Huang Y, Packard A, Doyle JC (1994) Linear matrix inequalities in analysis with multipliers. In: Proceedings of the 1994 American control conference, Baltimore, vol 2, pp 1228–1232
Boyd SP, Vandenberghe L (2004) Convex optimization. Cambridge University Press, Cambridge
Boyd SP, El Ghaoui L, Feron E, Balakrishnan V (1994) Linear matrix inequalities in system and control theory. SIAM, Philadelphia
Camino JF, Helton JW, Skelton RE, Ye J (2001) Analysing matrix inequalities systematically: how to get Schur complements out of your life. In: Proceedings of the 5th SIAM conference on control & its applications, San Diego
Camino JF, Helton JW, Skelton RE, Ye J (2003) Matrix inequalities: a symbolic procedure to determine convexity automatically. Integral Equ Oper Theory 46(4):399–454
de Oliveira MC, Skelton RE (2001) Stability tests for constrained linear systems. In: Reza Moheimani SO (ed) Perspectives in robust control. Lecture notes in control and information sciences. Springer, New York, pp 241–257. ISBN:1852334525
de Oliveira MC, Geromel JC, Bernussou J (2002) Extended H2 and H∞ norm characterizations and controller parametrizations for discrete-time systems. Int J Control 75(9):666–679
Dullerud G, Paganini F (2000) A course in robust control theory: a convex approach. Texts in applied mathematics. Springer, New York
Gahinet P, Apkarian P (1994) A linear matrix inequality approach to H∞ control. Int J Robust Nonlinear Control 4(4):421–448
Gahinet P, Nemirovskii A, Laub AJ, Chilali M (1995) LMI control toolbox user's guide. The MathWorks Inc., Natick
Hamilton WR (1834) On a general method in dynamics; by which the study of the motions of all free systems of attracting or repelling points is reduced to the search and differentiation of one central relation, or characteristic function. Philos Trans R Soc (part II):247–308
Hamilton WR (1835) Second essay on a general method in dynamics. Philos Trans R Soc (part I):95–144
Iwasaki T, Skelton RE (1994) All controllers for the general H∞ control problem – LMI existence conditions and state-space formulas. Automatica 30(8):1307–1317
Iwasaki T, Meinsma G, Fu M (2000) Generalized S-procedure and finite frequency KYP lemma. Math Probl Eng 6:305–320
Khargonekar PP, Rotea MA (1991) Mixed H2/H∞ control: a convex optimization approach. IEEE Trans Autom Control 39:824–837
Li F, de Oliveira MC, Skelton RE (2008) Integrating information architecture and control or estimation design. SICE J Control Meas Syst Integr 1(2):120–128
Scherer CW (1995) Mixed H2/H∞ control. In: Isidori A (ed) Trends in control: a European perspective. Springer, Berlin, pp 173–216
Scherer CW, Gahinet P, Chilali M (1997) Multiobjective output-feedback control via LMI optimization. IEEE Trans Autom Control 42(7):896–911
Skelton RE (1988) Dynamic systems control: linear systems analysis and synthesis. Wiley, New York
Skelton RE, Iwasaki T, Grigoriadis K (1998) A unified algebraic approach to control design. Taylor & Francis, London
Vandenberghe L, Boyd SP (1996) Semidefinite programming. SIAM Rev 38:49–95
Youla DC, Bongiorno JJ, Jabr HA (1976) Modern Wiener–Hopf design of optimal controllers, part II: the multivariable case. IEEE Trans Autom Control 21:319–338
Zhu G, Skelton R (1992) A two-Riccati, feasible algorithm for guaranteeing output L∞ constraints. J Dyn Syst Meas Control 114(3):329–338
Zhu G, Rotea M, Skelton R (1997) A convergent algorithm for the output covariance constraint control problem. SIAM J Control Optim 35(1):341–361

Linear Quadratic Optimal Control

H.L. Trentelman
Johann Bernoulli Institute for Mathematics and Computer Science, University of Groningen, Groningen, AV, The Netherlands

Abstract

Linear quadratic optimal control is a collective term for a class of optimal control problems involving a linear input-state-output system and a cost functional that is a quadratic form of the state and the input. The aim is to minimize this cost functional over a given class of input functions. The optimal input depends on the initial condition, but can be implemented by means of a state feedback control law independent of the initial condition. Both the feedback gain and the optimal cost can be computed in terms of solutions of Riccati equations.

Keywords

Algebraic Riccati equation; Finite horizon; Infinite horizon; Linear systems; Optimal control; Quadratic cost functional; Riccati differential equation
Introduction

Linear quadratic optimal control is a generic term that collects a number of optimal control problems for linear input-state-output systems in which a quadratic cost functional is minimized over a given class of input functions. This functional is formed by integrating a quadratic form of the state and the input over a finite or an infinite time interval. Minimizing the energy of the output over a finite or infinite time interval can be formulated in this framework and in fact provides a major motivation for this class of optimal control problems. A common feature of the solutions to the several versions of the problem is that the optimal input functions can be given in the form of a linear state feedback control law. This makes it possible to implement the optimal controllers as a feedback loop around the system. Another common feature is that the optimal value of the cost functional is a quadratic form of the initial condition on the system. This quadratic form is obtained by taking the appropriate solution of a Riccati differential equation or algebraic Riccati equation associated with the system.

Systems with Inputs and Outputs

Consider the continuous-time, linear time-invariant input-output system in state space form represented by

\[
\dot{x}(t) = Ax(t) + Bu(t), \qquad z(t) = Cx(t) + Du(t).
\tag{1}
\]

This system will be referred to as Σ. In (1), A, B, C, and D are maps between suitable spaces (or matrices of suitable dimensions), and the functions x, u, and z are considered to be defined on the real axis ℝ or on any subinterval of it. In particular, one often assumes the domain of definition to be the nonnegative part of ℝ. The function u is called the input, and its values are assumed to be given from outside the system. The class of admissible input functions will be denoted 𝒰. Often, 𝒰 will be the class of piecewise continuous or locally integrable functions, but for most purposes, the exact class from which the input functions are chosen is not important. We will assume that input functions take values in an m-dimensional space U, which we often identify with ℝᵐ. The variable x is called the state variable, and it is assumed to take values in an n-dimensional space X. The space X will be called the state space. It will usually be identified with ℝⁿ. Finally, z is called the to-be-controlled output of the system and takes values in a p-dimensional space Z, which we identify with ℝᵖ. The solution of the differential equation of Σ with initial value x(0) = x₀ will be denoted as x_u(t, x₀). It can be given explicitly using the variation-of-constants formula (see Trentelman et al. 2001, p. 38). The set of eigenvalues of a given matrix M is called the spectrum of M and is denoted by σ(M). The system (1) is called stabilizable if there exists a map (matrix of suitable dimensions) F such that σ(A + BF) ⊂ ℂ⁻. Here, ℂ⁻ denotes the open left half complex plane, i.e., {λ ∈ ℂ | Re(λ) < 0}. We often express this property by saying that the pair (A, B) is stabilizable.

The Linear Quadratic Optimal Control Problem

Assume that our aim is to keep all components of the output z(t) as small as possible, for all t ≥ 0. In the ideal situation, with initial state x(0) = 0, the uncontrolled system (with control input u = 0) evolves along the stationary solution x(t) = 0. Of course, the output z(t) will then also be equal to zero for all t. If, however, at time t = 0 the state of the system is perturbed to, say, x(0) = x₀, with x₀ ≠ 0, then the uncontrolled system will evolve along a state trajectory unequal to the stationary zero solution, and we will get z(t) = Ce^{At}x₀. To remedy this, from time t = 0 on, we can apply an input function u, so that for t ≥ 0 the corresponding output becomes equal to z(t) = Cx_u(t, x₀) + Du(t). Keeping in mind that we want the output z(t) to be as small as possible for all t ≥ 0, we can measure its size by the quadratic cost functional
\[
J(x_0, u) = \int_0^\infty \|z(t)\|^2 \, dt,
\tag{2}
\]

where ‖·‖ denotes the Euclidean norm. Our aim to keep the values of the output as small as possible can then be expressed as requiring this integral to be as small as possible by suitable choice of the input function u. In this way we arrive at the linear quadratic optimal control problem:

Problem 1 Consider the system Σ: ẋ(t) = Ax(t) + Bu(t), z(t) = Cx(t) + Du(t). Determine for every initial state x₀ an input u ∈ 𝒰 (a space of functions [0, ∞) → U) such that

\[
J(x_0, u) := \int_0^\infty \|z(t)\|^2 \, dt
\tag{3}
\]

is minimal. Here z(t) denotes the output trajectory z_u(t, x₀) of Σ corresponding to the initial state x₀ and input function u.

Since the system is linear and the integrand in the cost functional is a quadratic function of z, the problem is called linear quadratic. Of course, ‖z‖² = xᵀCᵀCx + 2uᵀDᵀCx + uᵀDᵀDu, so the integrand can also be considered as a quadratic function of (x, u). The convergence of the integral in (3) is of course a point of concern. Therefore, one often considers the corresponding finite-horizon problem in a preliminary investigation. In this problem, a final time T is given and one wants to minimize the integral

\[
J(x_0, u, T) := \int_0^T \|z(t)\|^2 \, dt.
\tag{4}
\]

In contrast to this, the first problem above is sometimes called the infinite-horizon problem. An important issue is also the convergence of the state. Obviously, convergence of the integral does not always imply the convergence to zero of the state. Therefore, a distinction is made between the problem with zero and with free endpoint. Problem 1 as stated is referred to as the problem with free endpoint. If one restricts the inputs u in the problem to those for which the resulting state trajectory tends to zero, one speaks about the problem with zero endpoint. Specifically:

Problem 2 In the situation of Problem 1, determine for every initial state x₀ an input u ∈ 𝒰 such that x_u(t, x₀) → 0 (t → ∞) and such that, under this condition, J(x₀, u) is minimized.

In the literature, various special cases of these problems have been considered, and names have been associated with these special cases. In particular, Problems 1 and 2 are called regular if the matrix D is injective, equivalently, DᵀD > 0. If, in addition, CᵀD = 0 and DᵀD = I, then the problems are said to be in standard form. In the standard case, the integrand in the cost functional reduces to ‖z‖² = xᵀCᵀCx + uᵀu. We often write Q = CᵀC. The standard case is a special case, which is not essentially simpler than the general regular problem, but which gives rise to simpler formulas. The general regular problem can be reduced to the standard case by means of a suitable feedback transformation.

The Finite-Horizon Problem

The finite-horizon problem in standard form is formulated as follows:

Problem 3 Given the system ẋ(t) = Ax(t) + Bu(t), a final time T > 0, and symmetric matrices N and Q such that N ≥ 0 and Q ≥ 0, determine for every initial state x₀ a piecewise continuous input function u: [0, T] → U such that the integral

\[
J(x_0, u, T) := \int_0^T x(t)^\top Q x(t) + u(t)^\top u(t) \, dt + x(T)^\top N x(T)
\tag{5}
\]

is minimized.

In this problem, we have introduced a weight on the final state, using the matrix N. This generalization of the problem does not give rise to additional complications.

A key ingredient in solving this finite-horizon problem is the Riccati differential equation associated with the problem:

\[
\dot{P}(t) = A^\top P(t) + P(t)A - P(t)BB^\top P(t) + Q, \qquad P(0) = N.
\tag{6}
\]
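As an aside, (6) can be integrated numerically exactly as it stands. A minimal scalar sketch (the values a = −1, b = 1, q = 1, N = 0 are illustrative, not taken from the text), where for n = m = 1 the equation reduces to ṗ = 2ap − b²p² + q:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative scalar data: A = a, B = b, Q = q, N = n0
a, b, q, n0 = -1.0, 1.0, 1.0, 0.0

def rde(t, p):
    # Scalar Riccati differential equation: pdot = 2*a*p - b^2*p^2 + q
    return 2.0 * a * p - (b ** 2) * p ** 2 + q

sol = solve_ivp(rde, (0.0, 20.0), [n0], rtol=1e-10, atol=1e-12)
p_T = sol.y[0, -1]

# For N = 0, P(T) increases with T toward the smallest nonnegative
# root of the algebraic equation 2*a*p - p^2 + q = 0 (here b = 1),
# i.e., p = a + sqrt(a^2 + q)
p_are = a + np.sqrt(a ** 2 + q)
assert abs(p_T - p_are) < 1e-6
```

The optimal finite-horizon cost from x₀ is then x₀ᵀP(T)x₀, and the monotone convergence of P(T) as T grows is exactly the mechanism exploited in the infinite-horizon discussion of this entry.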
This is a quadratic differential equation on the interval [0, ∞) in terms of the matrices A, B, and Q, and with initial condition given by the weight matrix N on the final state. The unknown in the differential equation is the matrix-valued function P(t). The following theorem solves the finite-horizon problem. It states that the Riccati differential equation with initial condition (6) has a unique solution on [0, ∞), that the optimal value of the cost functional is determined by the value of this solution at time T, and that there exists a unique optimal input that is generated by a time-varying state feedback control law:

Theorem 1 Consider Problem 3. The following properties hold:
1. The Riccati differential equation with initial value (6) has a unique solution P(t) on [0, ∞). This solution is symmetric and positive semidefinite for all t ≥ 0.
2. For each x₀ there is exactly one optimal input function, i.e., a piecewise continuous function u* on [0, T] such that J(x₀, u*, T) = J*(x₀, T) := inf{J(x₀, u, T) | u ∈ 𝒰}. This optimal input function u* is generated by the time-varying feedback control law

\[
u(t) = -B^\top P(T - t)\, x(t) \qquad (0 \le t \le T).
\tag{7}
\]

3. For each x₀, the minimal value of the cost functional equals J*(x₀, T) = x₀ᵀP(T)x₀.
4. If N = 0, then the function t ↦ P(t) is an increasing function in the sense that P(t) − P(s) is positive semidefinite for t ≥ s.

The Infinite-Horizon Problem with Free Endpoint

We consider the situation as described in Theorem 1 with N = 0. An obvious conjecture is that x₀ᵀP(T)x₀ converges to the minimal cost of the infinite-horizon problem as T → ∞. The convergence of x₀ᵀP(T)x₀ for all x₀ is equivalent to the convergence of the matrix P(T) for T → ∞ to some matrix P⁻. Such a convergence does not always take place. In order to achieve convergence, we make the following assumption: for every x₀, there exists an input u for which the integral

\[
J(x_0, u) := \int_0^\infty x(t)^\top Q x(t) + u(t)^\top u(t) \, dt
\tag{8}
\]

converges, i.e., for which the cost J(x₀, u) is finite. Obviously, for the problem to make sense for all x₀, this condition is necessary. It is easily seen that stabilizability of (A, B) is a sufficient condition for the above assumption to hold (it is not necessary; take, e.g., Q = 0). Take an arbitrary initial state x₀ and assume that ū is an input function such that the integral (8) is finite. We then have for every T > 0 that

\[
x_0^\top P(T) x_0 \le J(x_0, \bar{u}, T) \le J(x_0, \bar{u}),
\]

which implies that for every x₀, the expression x₀ᵀP(T)x₀ is bounded. This implies that P(T) is bounded. Since P(T) is increasing with respect to T, it follows that P⁻ := lim_{T→∞} P(T) exists. Since P satisfies the differential equation (6), it follows that also Ṗ(t) has a limit as t → ∞. It is easily seen that this latter limit must be zero. Hence, P = P⁻ satisfies the following equation:

\[
A^\top P + PA - PBB^\top P + Q = 0.
\tag{9}
\]

This is called the algebraic Riccati equation (ARE). The solutions of this equation are exactly the constant solutions of the Riccati differential equation. The previous consideration shows that the ARE has a positive semidefinite solution P⁻. The solution is not necessarily unique, not even with the extra condition that P ≥ 0. However, P⁻ turns out to be the smallest real symmetric positive semidefinite solution of the ARE.

The following theorem now establishes a complete solution to the regular standard form version of Problem 1.
Theorem 2 Consider the system ẋ(t) = Ax(t) + Bu(t) together with the cost functional

\[
J(x_0, u) := \int_0^\infty x(t)^\top Q x(t) + u(t)^\top u(t) \, dt,
\]

with Q ≥ 0. Factorize Q = CᵀC. Then, the following statements are equivalent:
1. For every x₀ ∈ X, there exists u ∈ 𝒰 such that J(x₀, u) < ∞.
2. The ARE (9) has a real symmetric positive semidefinite solution P.

Assume that one of the above conditions holds. Then, there exists a smallest real symmetric positive semidefinite solution of the ARE, i.e., there exists a real symmetric solution P⁻ ≥ 0 such that for every real symmetric solution P ≥ 0, we have P⁻ ≤ P. For every x₀, we have

\[
J^-(x_0) := \inf\{J(x_0, u) \mid u \in \mathcal{U}\} = x_0^\top P^- x_0.
\]

Furthermore, for every x₀, there is exactly one optimal input function, i.e., a function u* ∈ 𝒰 such that J(x₀, u*) = J⁻(x₀). This optimal input is generated by the time-invariant feedback law

\[
u(t) = -B^\top P^- x(t).
\]

The Infinite-Horizon Problem with Zero Endpoint

In addition to the free endpoint problem, we consider the version of the linear quadratic problem with zero endpoint. In this case the aim is to minimize for every x₀ the cost functional over all inputs u such that x_u(t, x₀) → 0 (t → ∞). For each x₀ such u exists if and only if the pair (A, B) is stabilizable. A solution to the regular standard form version of Problem 2 is stated next:

Theorem 3 Consider the system ẋ(t) = Ax(t) + Bu(t) together with the cost functional

\[
J(x_0, u) := \int_0^\infty x(t)^\top Q x(t) + u(t)^\top u(t) \, dt,
\]

with Q ≥ 0. Assume that (A, B) is stabilizable. Then:
1. There exists a largest real symmetric solution of the ARE, i.e., there exists a real symmetric solution P⁺ such that for every real symmetric solution P, we have P ≤ P⁺. P⁺ is positive semidefinite.
2. For every initial state x₀, we have J₀(x₀) = x₀ᵀP⁺x₀, where J₀(x₀) denotes the infimum of J(x₀, u) over all inputs u for which x_u(t, x₀) → 0 (t → ∞).
3. For every initial state x₀, there exists an optimal input function, i.e., a function u* ∈ 𝒰 with x(∞) = 0 such that J(x₀, u*) = J₀(x₀), if and only if every eigenvalue of A on the imaginary axis is (Q, A) observable, i.e.,

\[
\operatorname{rank} \begin{bmatrix} A - \lambda I \\ Q \end{bmatrix} = n
\quad \text{for all } \lambda \in \sigma(A) \text{ with } \operatorname{Re}(\lambda) = 0.
\]

Under this assumption we have:
4. For every initial state x₀, there is exactly one optimal input function u*. This optimal input function is generated by the time-invariant feedback law

\[
u(t) = -B^\top P^+ x(t).
\]

5. The optimal closed-loop system ẋ(t) = (A − BBᵀP⁺)x(t) is stable. In fact, P⁺ is the unique real symmetric solution of the ARE for which σ(A − BBᵀP⁺) ⊂ ℂ⁻.

Summary and Future Directions

Linear quadratic optimal control deals with finding an input function that minimizes a quadratic cost functional for a given linear system. The cost functional is the integral of a quadratic form in the input and state variable of the system. If the integral is taken over a finite time interval, the problem is called a finite-horizon problem, and the optimal cost and optimal state feedback gain can be expressed in terms of the solution of an associated Riccati differential equation. If we integrate over an infinite time interval, the problem is called an infinite-horizon problem. The optimal cost and optimal feedback gain for the free endpoint problem can be found in terms of
Linear Quadratic Zero-Sum Two-Person Differential Games 649

the smallest nonnegative real symmetric solution worked out in detail in Hautus and Silverman
of the associated algebraic Riccati equation. For the zero endpoint problem, these are given in terms of the largest real symmetric solution of the algebraic Riccati equation.

Cross-References

▶ Generalized Finite-Horizon Linear-Quadratic Optimal Control
▶ H-Infinity Control
▶ H2 Optimal Control
▶ Linear State Feedback

Recommended Reading

The linear quadratic regulator problem and the Riccati equation were introduced by R.E. Kalman in the early 1960s (see Kalman 1960). Extensive treatments of the problem can be found in the textbooks Brockett (1969), Kwakernaak and Sivan (1972), and Anderson and Moore (1971). For a detailed study of the Riccati differential equation and the algebraic Riccati equation, we refer to Wonham (1968). Extensions of the linear quadratic regulator problem to linear quadratic optimization problems, where the integrand of the cost functional is a possibly indefinite quadratic function of the state and input variable, were studied in the classical paper of Willems (1971). A further reference for the geometric classification of all real symmetric solutions of the algebraic Riccati equation is Coppel (1974). For the question of what level of system performance can be obtained if, in the cost functional, the weighting matrix of the control input is singular or nearly singular, leading to singular and nearly singular linear quadratic optimal control problems and "cheap control" problems, we refer to Kwakernaak and Sivan (1972). An early reference for a discussion of the singular problem is the work of Clements and Anderson (1978). More details can be found in Willems (1971) and Schumacher (1983). In singular problems, in general one allows for distributions as inputs. This approach was taken in Hautus and Silverman (1983) and Willems et al. (1986). For a more recent reference, including an extensive list of references, we refer to the textbook of Trentelman et al. (2001).

Bibliography

Anderson BDO, Moore JB (1971) Linear optimal control. Prentice Hall, Englewood Cliffs
Brockett RW (1969) Finite dimensional linear systems. Wiley, New York
Clements DJ, Anderson BDO (1978) Singular optimal control: the linear quadratic problem. Volume 5 of lecture notes in control and information sciences. Springer, New York
Coppel WA (1974) Matrix quadratic equations. Bull Aust Math Soc 10:377–401
Hautus MLJ, Silverman LM (1983) System structure and singular control. Linear Algebra Appl 50:369–402
Kalman RE (1960) Contributions to the theory of optimal control. Bol Soc Mat Mex 5:102–119
Kwakernaak H, Sivan R (1972) Linear optimal control theory. Wiley, New York
Schumacher JM (1983) The role of the dissipation matrix in singular optimal control. Syst Control Lett 2:262–266
Trentelman HL, Hautus MLJ, Stoorvogel AA (2001) Control theory for linear systems. Springer, London
Willems JC (1971) Least squares stationary optimal control and the algebraic Riccati equation. IEEE Trans Autom Control 16:621–634
Willems JC, Kitapçi A, Silverman LM (1986) Singular optimal control: a geometric approach. SIAM J Control Optim 24:323–337
Wonham WM (1968) On a matrix Riccati equation of stochastic control. SIAM J Control Optim 6(4):681–697

Linear Quadratic Zero-Sum Two-Person Differential Games

Pierre Bernhard
INRIA-Sophia Antipolis-Méditerranée, Sophia Antipolis, France

Abstract

As in optimal control theory, linear quadratic (LQ) differential games (DG) can be solved, even in high dimension, via a Riccati equation.
However, contrary to the control case, existence of the solution of the Riccati equation is not necessary for the existence of a closed-loop saddle point. One may "survive" a particular, nongeneric, type of conjugate point. An important application of LQDGs is the so-called H∞-optimal control, appearing in the theory of robust control.

Keywords

Differential games; Finite horizon; H-infinity control; Infinite horizon

Perfect State Measurement

Linear quadratic differential games are a special case of differential games (DG). See the article ▶ Pursuit-Evasion Games and Zero-Sum Two-Person Differential Games. They were first investigated by Ho et al. (1965), in the context of a linearized pursuit-evasion game. This subsection is based upon Bernhard (1979, 1980). A linear quadratic DG is defined as

    ẋ = Ax + Bu + Dv,   x(t₀) = x₀,

with x ∈ ℝⁿ, u ∈ ℝᵐ, v ∈ ℝˡ, u(·) ∈ L²([0, T], ℝᵐ), v(·) ∈ L²([0, T], ℝˡ). Final time T is given, there is no terminal constraint, and, using the notation xᵗKx = ‖x‖²_K,

    J(t₀, x₀; u(·), v(·)) = ‖x(T)‖²_K + ∫_{t₀}^{T} (xᵗ uᵗ vᵗ) W (x; u; v) dt,

where (x; u; v) denotes the stacked column vector and the weighting matrix is

    W = ( Q    S₁   S₂ )
        ( S₁ᵗ  R    0  )
        ( S₂ᵗ  0   −Γ  ).

The matrices of appropriate dimensions, A, B, D, Q, Sᵢ, R, and Γ, may all be measurable functions of time. R and Γ must be positive definite with inverses bounded away from zero. To get the most complete results available, we assume also that K and Q are nonnegative definite, although this is only necessary for some of the following results. Detailed results without that assumption were obtained by Zhang (2005) and Delfour (2005). We chose to set the cross term in uv in the criterion null; this is to simplify the results and is not necessary. This problem satisfies Isaacs' condition (see article DG) even with nonzero such cross terms.

Using the change of control variables

    u = ũ − R⁻¹S₁ᵗx,   v = ṽ + Γ⁻¹S₂ᵗx,

yields a DG with the same structure, with modified matrices A and Q, but without the cross terms in xu and xv. (This extends to the case with nonzero cross terms in uv.) Thus, without loss of generality, we will proceed with (S₁ S₂) = (0 0).

The existence of open-loop and closed-loop solutions to that game is ruled by two Riccati equations for symmetric matrices P and P*, respectively, and by a pair of canonical equations that we shall see later:

    Ṗ + PA + AᵗP − PBR⁻¹BᵗP + PDΓ⁻¹DᵗP + Q = 0,   P(T) = K,        (1)

    Ṗ* + P*A + AᵗP* + P*DΓ⁻¹DᵗP* + Q = 0,   P*(T) = K.            (2)

When both Riccati equations have a solution over [t, T], it holds that, in the partial ordering of definiteness,

    0 ≤ P(t) ≤ P*(t).

When the saddle point exists, it is represented by the state feedback strategies

    u = φ*(t, x) = −R⁻¹BᵗP(t)x,
    v = ψ*(t, x) = Γ⁻¹DᵗP(t)x.                                      (3)

The control functions generated by this pair of feedbacks will be noted û(·) and v̂(·).

Theorem 1
• A sufficient condition for the existence of a closed-loop saddle point, then given by
(φ*, ψ*) in (3), is that Eq. (1) has a solution P(t) defined over [t₀, T].
• A necessary and sufficient condition for the existence of an open-loop saddle point is that Eq. (2) has a solution over [t₀, T] (and then so does (1)). In that case, the pairs (û(·), v̂(·)), (û(·), ψ*), and (φ*, ψ*) are saddle points.
• A necessary and sufficient condition for (φ*, v̂(·)) to be a saddle point is that Eq. (1) has a solution over [t₀, T].
• In all cases where a saddle point exists, the Value function is V(t, x) = ‖x‖²_{P(t)}.

However, Eq. (1) may fail to have a solution and a closed-loop saddle point still exists. The precise necessary condition is as follows: let X(·) and Y(·) be two square matrix function solutions of the canonical equations

    ( Ẋ )   ( A    −BR⁻¹Bᵗ + DΓ⁻¹Dᵗ ) ( X )         ( X(T) )   ( I )
    ( Ẏ ) = ( −Q   −Aᵗ              ) ( Y ) ,       ( Y(T) ) = ( K ).

The matrix P(t) exists for t ∈ [t₀, T] if and only if X(t) is invertible over that range, and then P(t) = Y(t)X⁻¹(t). Assume that the rank of X(t) is piecewise constant, and let X†(t) denote the pseudo-inverse of X(t) and R(X(t)) its range.

Theorem 2 A necessary and sufficient condition for a closed-loop saddle point to exist, which is then given by (3) with P(t) = Y(t)X†(t), is that
1. x₀ ∈ R(X(t₀)).
2. For almost all t ∈ [t₀, T], R(D(t)) ⊂ R(X(t)).
3. ∀t ∈ [t₀, T], Y(t)X†(t) ≥ 0.

In a case where X(t) is only singular at an isolated instant t* (then conditions 1 and 2 above are automatically satisfied), called a conjugate point, but where YX⁻¹ remains positive definite on both sides of it, the conjugate point is called even. The feedback gain F = −R⁻¹BᵗP diverges upon reaching t*, but on a trajectory generated by this feedback, the control u(t) = F(t)x(t) remains finite. (See an example in Bernhard 1979.)

If T = ∞, with all system and payoff matrices constant and Q > 0, Mageirou (1976) has shown that if the algebraic Riccati equation obtained by setting Ṗ = 0 in (1) admits a positive definite solution P, the game has a Value ‖x‖²_P, but (3) may not be a saddle point. (ψ* may not be an equilibrium strategy.)

H∞-Optimal Control

This subsection is entirely based upon Başar and Bernhard (1995). It deals with imperfect state measurement, using Bernhard's nonlinear minimax certainty equivalence principle (Bernhard and Rapaport 1996).

Several problems of robust control may be brought to the following one: a linear, time-invariant system with two inputs (control input u ∈ ℝᵐ and disturbance input w ∈ ℝˡ) and two outputs (measured output y ∈ ℝᵖ and controlled output z ∈ ℝ^q) is given. One wishes to control the system with a nonanticipative controller u(·) = φ(y(·)) in order to minimize the induced linear operator norm, between spaces of square-integrable functions, of the resulting operator w(·) ↦ z(·).

It turns out that the problem which has a tractable solution is a kind of dual one: given a positive number γ, is it possible to make this norm no larger than γ? The answer to this question is yes if and only if

    inf_{φ∈Φ} sup_{w(·)∈L²} ∫_{−∞}^{+∞} (‖z(t)‖² − γ²‖w(t)‖²) dt ≤ 0.

We shall extend somewhat this classical problem by allowing either a time variable system, with a finite horizon T, or a time-invariant system with an infinite horizon.

The dynamical system is

    ẋ = Ax + Bu + Dw,                                               (4)
    y = Cx + Ew,                                                    (5)
    z = Hx + Gu.                                                    (6)
We let

    ( D )             ( M   L )          ( Hᵗ )           ( Q   S )
    ( E ) (Dᵗ Eᵗ)  =  ( Lᵗ  N ) ,        ( Gᵗ ) (H G)  =  ( Sᵗ  R ),

and we assume that E is onto (i.e., N > 0) and G is one-to-one (i.e., R > 0).

Finite Horizon

In this part, we consider a time-varying system, with all matrix functions measurable. Since the state is not known exactly, we assume that the initial state is not known either. The issue is therefore to decide whether the criterion

    J_γ = ‖x(T)‖²_K + ∫_{t₀}^{T} (‖z(t)‖² − γ²‖w(t)‖²) dt − γ²‖x₀‖²_{Σ₀}        (7)

may be kept finite and with which strategy. Let

    γ* = inf{ γ | inf_{φ∈Φ} sup_{x₀∈ℝⁿ, w(·)∈L²} J_γ < ∞ }.

Theorem 3 γ > γ* if and only if the following three conditions are satisfied:
1. The following Riccati equation has a solution over [t₀, T]:

    −Ṗ = PA + AᵗP − (PB + S)R⁻¹(BᵗP + Sᵗ) + γ⁻²PMP + Q,   P(T) = K.             (8)

2. The following Riccati equation has a solution over [t₀, T]:

    Σ̇ = AΣ + ΣAᵗ − (ΣCᵗ + L)N⁻¹(CΣ + Lᵗ) + γ⁻²ΣQΣ + M,   Σ(t₀) = Σ₀.           (9)

3. The following spectral radius condition is satisfied:

    ∀t ∈ [t₀, T],   ρ(Σ(t)P(t)) < γ².                                            (10)

In that case, the optimal controller ensuring inf_φ sup_{x₀, w} J_γ is given by a "worst case state" x̂(·) satisfying x̂(0) = 0 and

    x̂̇ = [A − BR⁻¹(BᵗP + Sᵗ) + γ⁻²D(DᵗP + Lᵗ)]x̂ + (I − γ⁻²ΣP)⁻¹(ΣCᵗ + L)(y − Cx̂),   (11)

and the certainty equivalent controller

    φ*(y(·))(t) = −R⁻¹(BᵗP + Sᵗ)x̂(t).                                            (12)

Infinite Horizon

The infinite horizon case is the traditional H∞-optimal control problem reformulated in a state space setting. We let all matrices defining the system be constant. We take the integral in (7) from −∞ to +∞, with no initial or terminal term of course. We add the hypothesis that the pairs (A, B) and (A, D) are stabilizable and the pairs (C, A) and (H, A) detectable. Then, the theorem is as follows:

Theorem 4 γ > γ* if and only if the following conditions are satisfied: The algebraic Riccati equations obtained by placing Ṗ = 0 and Σ̇ = 0 in (8) and (9) have positive definite solutions, which satisfy the spectral radius condition (10). The optimal controller is given by Eqs. (11) and (12), where P and Σ are the minimal positive definite solutions of the algebraic Riccati equations, which can be obtained as the limit of the solutions of the differential equations as t → −∞ for P and t → +∞ for Σ.

Conclusion

The similarity of the H∞-optimal control theory with the LQG, stochastic, theory is in many respects striking, as is the duality between observation and control. Yet, the "observer" of H∞-optimal control does not arise from some estimation theory but from the analysis of a "worst case." The best explanation might be in the duality of the ordinary, or (+, ×), algebra with the idempotent (max, +) algebra (see Bernhard 1996).
Linear State Feedback 653

The complete theory of H1 -optimal control in


that perspective has yet to be written.
Linear State Feedback

Panos J. Antsaklis1 and A. Astolfi2;3


1
Department of Electrical Engineering,
Cross-References University of Notre Dame, Notre Dame,
IN, USA
 Dynamic Noncooperative Games 2
Department of Electrical and Electronic
 Game Theory: Historical Overview Engineering, Imperial College London,
 Generalized Finite-Horizon Linear-Quadratic London, UK
Optimal Control 3
Dipartimento di Ingegneria Civile e Ingegneria
 H-Infinity Control Informatica, Università di Roma Tor Vergata,
 H2 Optimal Control Roma, Italy
 Linear Quadratic Optimal Control
 Optimization Based Robust Control
 Pursuit-Evasion Games and Zero-Sum Abstract
Two-Person Differential Games
 Robust Control of Infinite Dimensional Sys- Feedback is a fundamental mechanism in nature
tems and central in the control of systems. The state
 Robust H2 Performance in Feedback Control contains important system information, and ap-
 Sampled-Data H-Infinity Optimization plying a control law that uses state information is
 Stochastic Dynamic Programming a very powerful control policy. To illustrate the
 Stochastic Linear-Quadratic Control effect of feedback in linear systems, continuous- L
time and discrete-time state variable descriptions
are used: these allow one to write explicitly the
Bibliography resulting closed-loop descriptions and to study
the effect of feedback on the eigenvalues of the
Başar T, Bernhard P (1995) H 1 -optimal control and closed-loop system. The eigenvalue assignment
related minimax design problems: a differential games
approach, 2nd edn. Birkhäuser, Boston
problem is also discussed.
Bernhard P (1979) Linear quadratic zero-sum two-person
differential games: necessary and sufficient condition.
JOTA 27(1):51–69
Bernhard P (1980) Linear quadratic zero-sum two-person Keywords
differential games: necessary and sufficient condition:
comment. JOTA 31(2):283–284 Feedback; Linear systems; State feedback; State
Bernhard P (1996) A separation theorem for expected variables
value and feared value discrete time control. COCV
1:191–206
Bernhard P, Rapaport A (1996) Min-max certainty equiv-
alence principle and differential games. Int J Robust
Nonlinear Control 6:825–842
Introduction
Delfour M (2005) Linear quadratic differential games:
saddle point and Riccati equation. SIAM J Control Feedback is a fundamental mechanism arising in
Optim 46:750–774 nature. Feedback is also common in engineered
Ho Y-C, Bryson AE, Baron S (1965) Differential games
systems and is essential in the automatic control
and optimal pursuit-evasion strategies. IEEE Trans
Autom Control AC-10:385–389 of dynamic processes with uncertainties in their
Mageirou EF (1976) Values and strategies for infinite time model descriptions and in their interactions with
linear quadratic games. IEEE Trans Autom Control the environment. When feedback is used, the
AC-21:547–550
actual values of the system variables are sensed,
Zhang P (2005) Some results on zero-sum two-person
linear quadratic differential games. SIAM J Control fed back, and used to control the system. That is,
Optim 43:2147–2165 a control law decision process is based not only
on predictions on the system behavior derived from a process model (as in open-loop control) but also on information about the actual behavior (closed-loop feedback control).

Linear Continuous-Time Systems

Consider, to begin with, time-invariant systems described by the state variable description

    ẋ = Ax + Bu,   y = Cx + Du,                                     (1)

in which x(t) ∈ ℝⁿ is the state, u(t) ∈ ℝᵐ is the input, y(t) ∈ ℝᵖ is the output, and A ∈ ℝⁿˣⁿ, B ∈ ℝⁿˣᵐ, C ∈ ℝᵖˣⁿ, D ∈ ℝᵖˣᵐ are constant matrices. In this case, the linear state feedback (lsf) control law is selected as

    u(t) = Fx(t) + r(t),                                            (2)

where F ∈ ℝᵐˣⁿ is the constant gain matrix and r(t) ∈ ℝᵐ is a new external input.

Substituting (2) into (1) yields the closed-loop state variable description, namely,

    ẋ = (A + BF)x + Br,
    y = (C + DF)x + Dr.                                             (3)

Appropriately selecting F, primarily to modify A + BF, one affects and improves the behavior of the system.

A number of comments are in order:
– Feeding back the information from the state x of the system is expected to be, and it is, an effective way to alter the system behavior. This is because knowledge of the (initial) state and the input uniquely determines the system's future behavior, and intuitively using the state information should be a good way to control the system, i.e., modifying its behavior.
– In a state feedback control law, the input u can be any function of the state u = f(x, r), not necessarily linear with constant gain F as in (2). Typically, given (1), (2) is selected as the linear state feedback primarily because the resulting closed-loop description (3) is also a linear time-invariant system. However, depending on the application needs, the state feedback control law (2) can be more complex.
– Although the Eqs. (3) that describe the closed-loop behavior are different from Eq. (1), this does not imply that the system parameters have changed. The way feedback control acts is not by actually changing the system parameters A, B, C, D but by changing u so that the closed-loop system behaves as if the parameters were changed. When one applies, say, a step via r(t) in the closed-loop system, then u(t) in (2) is modified appropriately so the system behaves in a desired way.
– It is possible to implement u in (2) as an open-loop control law, namely,

    û(s) = F[sI − (A + BF)]⁻¹x(0) + [I − F(sI − A)⁻¹B]⁻¹r̂(s),      (4)

where Laplace transforms have been used for notational convenience. Equation (4) produces exactly the same input as Eq. (2), but it has the serious disadvantage that it is based exclusively on prior knowledge on the system (notably x(0) and parameters A, B). As a result, when there are uncertainties (and there always are), the open-loop control law (4) may fail, while the closed-loop control law (2) succeeds.
– Analogous definitions exist for continuous-time, time-varying systems described by the equations

    ẋ = A(t)x + B(t)u,   y = C(t)x + D(t)u.                         (5)

In this framework, the control law is described by

    u = F(t)x + r,                                                  (6)

and the resulting closed-loop system is

    ẋ = [A(t) + B(t)F(t)]x + B(t)r,
    y = [C(t) + D(t)F(t)]x + D(t)r.                                 (7)
Linear Discrete-Time Systems

For the discrete-time, time-invariant case, the system description is

    x(k + 1) = Ax(k) + Bu(k),   y(k) = Cx(k) + Du(k),               (8)

the linear state feedback control law is defined as

    u(k) = Fx(k) + r(k),                                            (9)

and the closed-loop system is described by

    x(k + 1) = (A + BF)x(k) + Br(k),
    y(k) = (C + DF)x(k) + Dr(k).                                    (10)

Similarly, for the discrete-time, time-varying case

    x(k + 1) = A(k)x(k) + B(k)u(k),
    y(k) = C(k)x(k) + D(k)u(k),                                     (11)

the control law is defined as

    u(k) = F(k)x(k) + r(k),                                         (12)

and the resulting closed-loop system is

    x(k + 1) = [A(k) + B(k)F(k)]x(k) + B(k)r(k),
    y(k) = [C(k) + D(k)F(k)]x(k) + D(k)r(k).                        (13)

Selecting the Gain F

F (or F(t)) is selected so that the closed-loop system has certain desirable properties. Stability is of course of major importance. Many control problems are addressed using linear state feedback including tracking and regulation, diagonal decoupling, and disturbance rejection. Here we shall focus on stability. Stability can be achieved under appropriate controllability assumptions. In the time-varying case, one way to determine such stabilizing F(t) (or F(k)) is to use results from the optimal linear quadratic regulator (LQR) theory which yields the "best" F(t) (or F(k)) in some sense.

In the time-invariant case, one can also use an LQR formulation, but here stabilization is equivalent to the problem of assigning the n eigenvalues of (A + BF) in the stable region of the complex plane. If λᵢ, i = 1, …, n, are the eigenvalues of A + BF, then F should be chosen so that, for all i = 1, …, n, the real part of λᵢ, Re(λᵢ) < 0 in the continuous-time case, and the magnitude of λᵢ, |λᵢ| < 1 in the discrete-time case. Eigenvalue assignment is therefore an important problem, which is discussed hereafter.

Eigenvalue Assignment Problem

For continuous-time and discrete-time, time-invariant systems, the eigenvalue assignment problem can be posed as follows. Given matrices A ∈ ℝⁿˣⁿ and B ∈ ℝⁿˣᵐ, find F ∈ ℝᵐˣⁿ such that the eigenvalues of A + BF are assigned to arbitrary, complex conjugate, locations. Note that the characteristic polynomial of A + BF, namely, det(sI − (A + BF)), has real coefficients, which implies that any complex eigenvalue is part of a pair of complex conjugate eigenvalues.

Theorem 1 The eigenvalue assignment problem has a solution if and only if the pair (A, B) is reachable.

For single-input systems, that is, for systems with m = 1, the eigenvalue assignment problem has a simple solution, as illustrated in the following statement:

Proposition 1 Consider system (1) or (8). Let m = 1. Assume that

    rank R = n,

where

    R = [B, AB, …, A^(n−1)B],

that is, the system is reachable. Let p(s) be a desired monic polynomial of degree n. Then there is a (unique) linear state feedback gain matrix F such that the characteristic polynomial of A + BF is equal to p(s). Such linear state feedback gain matrix F is given by
    F = −[0 ⋯ 0 1] R⁻¹ p(A).                                        (14)

Proposition 1 provides a constructive way to assign the characteristic polynomial, hence the eigenvalues, of the matrix A + BF. Note that, for low order systems, i.e., if n = 2 or n = 3, it may be convenient to compute directly the characteristic polynomial of A + BF and then compute F using the principle of identity of polynomials, i.e., F should be such that the coefficients of the polynomials det(sI − (A + BF)) and p(s) coincide. Equation (14) is known as Ackermann's formula.

The result summarized in Proposition 1 can be extended to multi-input systems.

Proposition 2 Consider system (1) or (8). Suppose

    rank R = n,

that is, the system is reachable. Let p(s) be a monic polynomial of degree n. Then there is a linear state feedback gain matrix F such that the characteristic polynomial of A + BF is equal to p(s).

Note that in the case m > 1 the linear state feedback gain matrix F assigning the characteristic polynomial of the matrix A + BF is not unique. To compute such a gain matrix, one may exploit the following fact:

Lemma 1 Consider system (1). Suppose

    rank R = n,

that is, the system is reachable. Let bᵢ be a nonzero column of the matrix B. Then there is a matrix G such that the single-input system

    ẋ = (A + BG)x + bᵢv                                             (15)

is reachable. Similar results are true for discrete-time systems (8).

Exploiting Lemma 1, it is possible to design a matrix F such that the characteristic polynomial of A + BF equals some monic polynomial p(s) of degree n in two steps. First, we compute a matrix G such that the system (15) is reachable, and then we use Ackermann's formula to compute a linear state feedback gain matrix F such that the characteristic polynomial of

    A + BG + bᵢF

is p(s). Note also that if (A, B) is reachable, under mild conditions on A, there exists vector g so that (A, Bg) is reachable.

There are many other methods to assign the eigenvalues which may be found in the references below.

Transfer Functions

If H_F(s) is the transfer function matrix of the closed-loop system (3), it is of interest to find its relation to the open-loop transfer function H(s) of (1). It can be shown that

    H_F(s) = H(s)[I − F(sI − A)⁻¹B]⁻¹
           = H(s)[F(sI − (A + BF))⁻¹B + I].

In the single-input, single-output case, it can be readily shown that the linear state feedback control law (2) only changes the coefficients of the denominator polynomial in the transfer function (this result is also true in the multi-input, multi-output case). Therefore, if any of the (stable) zeros of H(s) need to be changed, the only way to accomplish this via linear state feedback is by pole-zero cancelation (assigning closed-loop poles at the open-loop zero locations; in the MIMO case, closed-loop eigenvalue directions also need to be assigned for cancelations to take place). Note that it is impossible to change the unstable zeros of H(s) under stability, since they would have to be canceled with unstable poles.

Observer-Based Dynamic Controllers

When the state x is not available for feedback, an asymptotic estimator (a Luenberger observer) is typically used to estimate the state. The estimate
x̃ of the state, instead of the actual x, is then used in (2) to control the system, in what is known as the certainty equivalence architecture.

Summary

The notion of state feedback for linear systems has been discussed. It has been shown that state feedback modifies the closed-loop behavior. The related problem of eigenvalue assignment has been discussed, and its connection with the reachability (controllability) properties of the system has been highlighted. The class of feedback laws considered is the simplest possible one. If additional constraints on the input signal, or on the closed-loop performance, are imposed, then one perhaps has to resort to nonlinear state feedback, for example, if the input signal is bounded in amplitude or rate. If constraints such as decoupling of the systems into m noninteracting subsystems or tracking under asymptotic stability are imposed, then dynamic state feedback may be necessary.

Cross-References

▶ Linear Systems: Continuous-Time, Time-Varying State Variable Descriptions
▶ Linear Systems: Discrete-Time, Time-Invariant State Variable Descriptions
▶ Observer-Based Control

Bibliography

Antsaklis PJ, Michel AN (2006) Linear systems. Birkhauser, Boston
Chen CT (1984) Linear system theory and design. Holt, Rinehart and Winston, New York
DeCarlo RA (1989) Linear systems. Prentice-Hall, Englewood Cliffs
Hespanha JP (2009) Linear systems theory. Princeton Press, Princeton
Kailath T (1980) Linear systems. Prentice-Hall, Englewood Cliffs
Rugh WJ (1996) Linear systems theory, 2nd edn. Prentice-Hall, Englewood Cliffs
Wonham WM (1967) On pole assignment in multi-input controllable linear systems. IEEE Trans Autom Control AC-12:660–665
Wonham WM (1985) Linear multivariable control: a geometric approach, 3rd edn. Springer, New York

Linear Systems: Continuous-Time Impulse Response Descriptions

Panos J. Antsaklis
Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN, USA

Abstract

An important input–output description of a linear continuous-time system is its impulse response, which is the response h(t, τ) to an impulse applied at time τ. In time-invariant systems that are also causal and at rest at time zero, the impulse response is h(t, 0) and its Laplace transform is the transfer function of the system. Expressions for h(t, τ) when the system is described by state-variable equations are also derived.

Keywords

Continuous-time; Impulse response descriptions; Linear systems; Time-invariant; Time-varying; Transfer function descriptions

Introduction

Consider linear continuous-time dynamical systems, the input–output behavior of which can be described by an integral representation of the form

    y(t) = ∫_{−∞}^{+∞} H(t, τ)u(τ) dτ,                              (1)

where t, τ ∈ ℝ, the output is y(t) ∈ ℝᵖ, the input is u(t) ∈ ℝᵐ, and H : ℝ × ℝ → ℝᵖˣᵐ is assumed to be integrable. For instance, any system in state-
variable form

    ẋ = A(t)x + B(t)u,
    y = C(t)x + D(t)u                                               (2)

or

    ẋ = Ax + Bu,
    y = Cx + Du                                                     (3)

also has a representation of the form (1) as we shall see below.

Note that it is assumed that at τ = −∞, the system is at rest. H(t, τ) is the impulse response matrix of the system (1). To explain, consider first a single-input single-output system:

    y(t) = ∫_{−∞}^{+∞} h(t, τ)u(τ) dτ,                              (4)

and recall that if δ(t̂ − τ) denotes an impulse (delta or Dirac) function applied at time τ = t̂, then for a function f(t),

    f(t̂) = ∫_{−∞}^{+∞} f(τ)δ(t̂ − τ) dτ.                            (5)

If now in (4) u(τ) = δ(t̂ − τ), that is, an impulse input is applied at τ = t̂, then the output y_I(t) is

    y_I(t) = h(t, t̂),

i.e., h(t, t̂) is the output at time t when an impulse is applied at the input at time t̂. So in (4), h(t, τ) is the response at time t to an impulse applied at time τ. Clearly, if the impulse response h(t, τ) is known, the response to any input u(t) can be derived via (4), and so h(t, τ) is an input/output description of the system.

Equation (1) is a generalization of (4) for the multi-input, multi-output case. If we let all the components of u(τ) in (1) be zero except the j-th component, then

    y_i(t) = ∫_{−∞}^{+∞} h_ij(t, τ)u_j(τ) dτ;                       (6)

h_ij(t, τ) denotes the response of the i-th component of the output of system (1) at time t due to an impulse applied to the j-th component of the input at time τ with all remaining components of the input being zero. H(t, τ) = [h_ij(t, τ)] is called the impulse response matrix of the system.

If it is known that system (1) is causal, then the output will be zero before an input is applied. Therefore,

    H(t, τ) = 0,   for t < τ,                                       (7)

and (1) becomes

    y(t) = ∫_{−∞}^{t} H(t, τ)u(τ) dτ.                               (8)

Rewrite (8) as

    y(t) = ∫_{−∞}^{t₀} H(t, τ)u(τ) dτ + ∫_{t₀}^{t} H(t, τ)u(τ) dτ
         = y(t₀) + ∫_{t₀}^{t} H(t, τ)u(τ) dτ.                       (9)

If (1) is at rest at t = t₀ (i.e., if u(t) = 0 for t ≥ t₀, then y(t) = 0 for t ≥ t₀), y(t₀) = 0 and (9) becomes

    y(t) = ∫_{t₀}^{t} H(t, τ)u(τ) dτ.                               (10)

If in addition system (1) is time-invariant, then H(t, τ) = H(t − τ, 0) (also written as H(t − τ)) since only the elapsed time (t − τ) from the application of the impulse is important. Then (10) becomes

    y(t) = ∫_{0}^{t} H(t − τ)u(τ) dτ,   t ≥ 0,                      (11)

where we chose t₀ = 0 without loss of generality. Equation (11) is the description for causal, time-invariant systems, at rest at t = 0.

Equation (11) is a convolution integral, and if we take the (one-sided or unilateral) Laplace transform of both sides,

    ŷ(s) = Ĥ(s)û(s),                                                (12)

where ŷ(s), û(s) are the Laplace transforms of y(t), u(t) and Ĥ(s) is the Laplace transform of
the impulse response H(t). Ĥ(s) is the transfer function matrix of the system. Note that the transfer function of a linear, time-invariant system is typically defined as the rational matrix Ĥ(s) that satisfies (12) for any input and its corresponding output assuming zero initial conditions, which is of course consistent with the above analysis.

Connection to State-Variable Descriptions

When a system is described by the state-variable description (2), then

    y(t) = ∫_{t₀}^{t} [C(t)Φ(t, τ)B(τ) + D(t)δ(t − τ)]u(τ) dτ,      (13)

where it was assumed that x(t₀) = 0, i.e., the system is at rest at t₀. Here Φ(t, τ) is the state transition matrix of the system defined by the Peano-Baker series:

    Φ(t, t₀) = I + ∫_{t₀}^{t} A(τ₁) dτ₁ + ∫_{t₀}^{t} A(τ₁) [ ∫_{t₀}^{τ₁} A(τ₂) dτ₂ ] dτ₁ + ⋯ ;

see ▶ Linear Systems: Continuous-Time, Time-Varying State Variable Descriptions.

Comparing (13) with (10), the impulse response

    H(t, τ) = { C(t)Φ(t, τ)B(τ) + D(t)δ(t − τ),   t ≥ τ;
              { 0,                                 t < τ.           (14)

Similarly, when the system is time-invariant and is described by (3),

    y(t) = ∫_{t₀}^{t} [C e^{A(t−τ)} B + Dδ(t − τ)]u(τ) dτ,          (15)

where x(t₀) = 0.

Comparing (15) with (11), the impulse response

    H(t − τ) = { C e^{A(t−τ)} B + Dδ(t − τ),   t ≥ τ;
               { 0,                            t < τ,               (16)

or as it is commonly written (taking the time when the impulse is applied to be zero, τ = 0)

    H(t) = { C e^{At} B + Dδ(t),   t ≥ 0;
           { 0,                    t < 0.                           (17)

Take now the (one-sided or unilateral) Laplace transform of both sides in (17) to obtain

    Ĥ(s) = C(sI − A)⁻¹B + D,                                        (18)

which is the transfer function matrix in terms of the coefficient matrices in the state-variable description (3). Note that (18) can also be derived directly from (3) by assuming zero initial conditions (x(0) = 0) and taking Laplace transform of both sides.

Finally, it is easy to show that equivalent state-variable descriptions give rise to the same impulse responses.

Summary

The continuous-time impulse response is an external, input–output description of linear, continuous-time systems. When the system is time-invariant, the Laplace transform of the impulse response h(t, 0) (which is the output response at time t due to an impulse applied at time zero with initial conditions taken to be zero) is the transfer function of the system – another very common input–output description. The relationships with the state-variable descriptions are shown.
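The relations above can be checked numerically. The sketch below (a hypothetical first-order example, assuming SciPy's `expm` for the matrix exponential) evaluates the impulse response H(t) = C e^{At} B of (17) and approximates the convolution (11) by a Riemann sum, recovering the step response of the transfer function 1/(s + 1):

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical state-space data: dx/dt = Ax + Bu, y = Cx (D = 0),
# i.e., transfer function H(s) = C(sI - A)^{-1}B = 1/(s + 1).
A = np.array([[-1.0]])
B = np.array([[1.0]])
C = np.array([[1.0]])

dt = 1e-3
t = np.arange(0.0, 5.0, dt)

# Impulse response H(t) = C e^{At} B for t >= 0, as in Eq. (17) with D = 0.
h = np.array([(C @ expm(A * tk) @ B)[0, 0] for tk in t])

# Convolution (11) with a unit step input u(t) = 1, Riemann-sum approximation.
u = np.ones_like(t)
y = np.convolve(h, u)[:len(t)] * dt

# The exact step response of 1/(s + 1) is 1 - e^{-t}.
err = np.max(np.abs(y - (1.0 - np.exp(-t))))
print(err)  # small discretization error
```

Here h(t) = e^{-t}, and the discrete convolution reproduces the analytic output up to the step-size error of the Riemann sum.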
660 Linear Systems: Continuous-Time, Time-Invariant State Variable Descriptions

Cross-References

Linear Systems: Continuous-Time, Time-Invariant State Variable Descriptions
Linear Systems: Continuous-Time, Time-Varying State Variable Descriptions

Recommended Reading

External or input–output descriptions such as the impulse response and the transfer function (in the time-invariant case) are described in several textbooks below.

Bibliography

Antsaklis PJ, Michel AN (2006) Linear systems. Birkhauser, Boston
DeCarlo RA (1989) Linear systems. Prentice-Hall, Englewood Cliffs
Kailath T (1980) Linear systems. Prentice-Hall, Englewood Cliffs
Rugh WJ (1996) Linear systems theory, 2nd edn. Prentice-Hall, Englewood Cliffs
Sontag ED (1990) Mathematical control theory: deterministic finite dimensional systems. Texts in applied mathematics, vol 6. Springer, New York

Linear Systems: Continuous-Time, Time-Invariant State Variable Descriptions

Panos J. Antsaklis
Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN, USA

Synonyms

LTI Systems

Abstract

Continuous-time processes that can be modeled by linear differential equations with constant coefficients can also be described in a systematic way in terms of state variable descriptions of the form $\dot{x}(t) = Ax(t) + Bu(t)$, $y(t) = Cx(t) + Du(t)$. The response of such systems due to a given input and a set of initial conditions is derived and expressed in terms of the variation of constants formula. Equivalence of state variable descriptions is also discussed.

Keywords

Continuous-time; Linear systems; State variable descriptions; Time-invariant

Introduction

Linear, continuous-time systems are of great interest because they model, exactly or approximately, the behavior over time of many practical physical systems of interest. We are particularly interested in systems, the behavior of which is described by linear, ordinary differential equations with constant coefficients.

Such descriptions can always be rewritten as a set of first-order differential equations, typically in the following convenient state variable form:

$$\dot{x}(t) = Ax(t) + Bu(t), \quad y(t) = Cx(t) + Du(t); \quad x(0) = x_0, \quad (1)$$

where $x(t)$, the state vector, is a column vector of dimension $n$ ($x(t) \in \mathbb{R}^n$) and $\dot{x}(t) = \frac{dx}{dt}$ with the derivative being taken element by element. $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$, $C \in \mathbb{R}^{p \times n}$, $D \in \mathbb{R}^{p \times m}$ are matrices with real entries (these are the constant coefficients that make the system time invariant); and $u(t) \in \mathbb{R}^m$, $y(t) \in \mathbb{R}^p$ are the inputs and outputs of the system. The vector differential equation is the state equation and the algebraic equation is the output equation.

The advantage of the above state variable description is that for given input $u(t)$ and initial condition $x(0)$, its solution (state and output motions or trajectories) can be conveniently and systematically characterized. This is shown below.
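As a quick numerical illustration of the state variable form (1), the state equation can be integrated approximately with forward Euler; for a stable $A$ and a constant input, the state settles near the equilibrium $Ax + Bu = 0$. This is only a sketch: the matrices, step size, and step input below are illustrative choices, not values from the text.

```python
import numpy as np

# Illustrative stable system in the form (1): x_dot = A x + B u, y = C x + D u
A = np.array([[0.0, 1.0], [-4.0, -5.0]])   # eigenvalues -1 and -4
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

dt, T = 1e-3, 5.0
x = np.array([[1.0], [0.0]])               # x(0) = x0
u = 1.0                                     # constant (step) input
for _ in range(int(T / dt)):
    x = x + dt * (A @ x + B * u)            # forward Euler step of x_dot = A x + B u
y = C @ x + D * u                           # output equation

# For a stable A, x(t) approaches the equilibrium solving A x + B u = 0
x_eq = -np.linalg.inv(A) @ B * u
assert np.linalg.norm(x - x_eq) < 0.02
```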
Deriving State Variable Descriptions

Description (1) may be derived directly, by modeling the behavior of a linear, continuous-time, time-invariant system, but more often it is derived either from the linearization of a nonlinear equation around an operating point or a trajectory or from higher-order differential equations that model the system's behavior. The example below illustrates the latter case.

[Figure: a mass m attached to a wall by a linear spring of spring constant k, sliding on a surface with damping coefficient b, subject to an applied force u(t); y(t) denotes the displacement of the mass.]

Consider a spring-mass example, where a mass $m$ slides horizontally on a surface with damping coefficient $b$ due to friction and it is attached to a wall by a linear spring of spring constant $k$. If $y(t)$ denotes the distance of the center of the mass from a position of rest of the spring, by applying Newton's law the following second-order linear ordinary differential equation with constant coefficients is obtained:

$$m\ddot{y}(t) + b\dot{y}(t) + ky(t) = u(t). \quad (2)$$

Here $\dot{y}(t) = \frac{dy(t)}{dt}$. The motion of the mass $y(t)$, $t > 0$ is uniquely determined if the applied force $u(t)$, $t \ge 0$ is known and at $t = 0$ the initial position $y(0) = y_0$ and initial velocity $\dot{y}(0) = y_1$ are given. To obtain a state variable description, introduce the state variables $x_1$ and $x_2$ as

$$x_1(t) = y(t), \quad x_2(t) = \dot{y}(t)$$

to obtain the set of first-order differential equations $m\dot{x}_2(t) + bx_2(t) + kx_1(t) = u(t)$ and $\dot{x}_1(t) = x_2(t)$, which can be rewritten in the form of (1)

$$\begin{bmatrix} \dot{x}_1(t) \\ \dot{x}_2(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -\frac{k}{m} & -\frac{b}{m} \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} + \begin{bmatrix} 0 \\ \frac{1}{m} \end{bmatrix} u(t) \quad (3)$$

and

$$y(t) = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix}$$

with $\begin{bmatrix} x_1(0) \\ x_2(0) \end{bmatrix} = \begin{bmatrix} y_0 \\ y_1 \end{bmatrix}$ as initial conditions. This is of the form (1) where $x(t)$ is a 2-dimensional column vector; $A$ is a $2 \times 2$ matrix; $B$ and $C$ are 2-dimensional column and row vectors, respectively; and $x(0) = x_0$.

Notes:
1. It is always possible to obtain a state variable description which is equivalent to a given set of higher-order differential equations.
2. The choice of the state variables, here $x_1$ and $x_2$, is not unique. Different choices will lead to different $A$, $B$, and $C$.
3. The number of the state variables is typically equal to the order of the set of the higher-order differential equations and equals the number of initial conditions needed to derive the unique solution; in the above example this number is 2.
4. In time-invariant systems, it can be assumed without loss of generality that the starting time is $t = 0$ and so the initial conditions are taken to be $x(0) = x_0$.

Solving $\dot{x}(t) = Ax(t)$; $x(0) = x_0$

Consider the homogeneous equation

$$\dot{x}(t) = Ax(t); \quad x(0) = x_0 \quad (4)$$

where $x(t) = [x_1(t), \ldots, x_n(t)]^T$ is the state vector of dimension $n$ and $A$ is an $n \times n$ matrix with entries real numbers (i.e., $A \in \mathbb{R}^{n \times n}$). Equation (4) is a special case of (1) where there are no inputs and outputs, $u$ and $y$. The homogeneous vector differential equation (4) will be solved first, and its solution will be used to find the solution of (1).
Solving (4) is an initial value problem. It can be shown that there always exists a unique solution $\varphi(t)$ such that

$$\dot{\varphi}(t) = A\varphi(t); \quad \varphi(0) = x_0.$$

To find the unique solution, consider first the one-dimensional case, namely,

$$\dot{y}(t) = ay(t); \quad y(0) = y_0$$

the unique solution of which is

$$y(t) = e^{at}y_0, \quad t \ge 0.$$

The scalar exponential $e^{at}$ can be expressed in a series form

$$e^{at} = \sum_{k=0}^{\infty} \frac{t^k}{k!}a^k \;\left(= 1 + ta + \tfrac{1}{2}t^2a^2 + \tfrac{1}{6}t^3a^3 + \cdots\right)$$

The generalization to the $n \times n$ matrix exponential ($A$ is $n \times n$) is given by

$$e^{At} = \sum_{k=0}^{\infty} \frac{t^k}{k!}A^k \;\left(= I_n + At + \tfrac{1}{2}A^2t^2 + \cdots\right) \quad (5)$$

By analogy, let the solution to (4) be

$$(x(t) =)\;\varphi(t) = e^{At}x_0 \quad (6)$$

It is a solution since if it is substituted into (4),

$$\dot{\varphi}(t) = \left[A + A^2t + \tfrac{1}{2}A^3t^2 + \cdots\right]x_0 = Ae^{At}x_0 = A\varphi(t)$$

and $\varphi(0) = e^{A\cdot 0}x_0 = x_0$, that is, it satisfies the equation and the initial condition. Since the solution of (4) is unique, (6) is the unique solution of (4).

The solution (6) can be derived more formally using the Peano-Baker series (see Linear Systems: Continuous-Time, Time-Varying State Variable Descriptions), which in the present time-invariant case becomes the defining series for the matrix exponential (5).

System Response

Based on the solution of the homogeneous equation (4), shown in (6), the solution of the state equation in (1) can be shown to be

$$x(t) = e^{At}x_0 + \int_0^t e^{A(t-\tau)}Bu(\tau)\,d\tau. \quad (7)$$

The following properties for the matrix exponential $e^{At}$ can be shown directly from the defining series:
1. $Ae^{At} = e^{At}A$.
2. $(e^{At})^{-1} = e^{-At}$.

Equation (7), which is known as the variation of constants formula, can be derived as follows: Consider $\dot{x} = Ax + Bu$ and let $z(t) = e^{-At}x(t)$. Then $x(t) = e^{At}z(t)$ and substituting,

$$Ae^{At}z(t) + e^{At}\dot{z}(t) = Ae^{At}z(t) + Bu(t)$$

or $\dot{z}(t) = e^{-At}Bu(t)$, from which

$$z(t) - z(0) = \int_0^t e^{-A\tau}Bu(\tau)\,d\tau$$

or

$$e^{-At}x(t) - x(0) = \int_0^t e^{-A\tau}Bu(\tau)\,d\tau$$

or

$$x(t) = e^{At}x_0 + \int_0^t e^{A(t-\tau)}Bu(\tau)\,d\tau$$

which is the variation of constants formula (7). Equation (7) is the sum of two parts, the state response (when $u(t) = 0$ and the system is driven only by the initial state conditions) and the input response (when $x_0 = 0$ and the system is driven only by the input $u(t)$); this illustrates the linear system principle of superposition.

If the output equation $y(t) = Cx(t) + Du(t)$ is considered, then in view of (7),
$$y(t) = Ce^{At}x_0 + \int_0^t Ce^{A(t-\tau)}Bu(\tau)\,d\tau + Du(t) \quad (8)$$
$$\phantom{y(t)} = Ce^{At}x_0 + \int_0^t \left[Ce^{A(t-\tau)}B + D\delta(t-\tau)\right]u(\tau)\,d\tau$$

The second expression involves the Dirac (or impulse or delta) function $\delta(t)$, and it is derived based on the basic property for $\delta(t)$, namely,

$$f(t) = \int_{-\infty}^{\infty}\delta(t-\tau)f(\tau)\,d\tau$$

It is clear that the matrix exponential $e^{At}$ plays a central role in determining the response of a linear continuous-time, time-invariant system described by (1).

Given $A$, $e^{At}$ may be determined using several methods including its defining series, diagonalization of $A$ using a similarity transformation ($PAP^{-1}$), the Cayley-Hamilton theorem, using expressions involving the modes of the system ($e^{At} = \sum_{i=1}^{n}A_ie^{\lambda_i t}$ when $A$ has $n$ distinct eigenvalues $\lambda_i$; $A_i = v_i\tilde{v}_i$ with $v_i$, $\tilde{v}_i$ the right and left eigenvectors of $A$ that correspond to $\lambda_i$ ($\tilde{v}_iv_j = 1$, $i = j$ and $\tilde{v}_iv_j = 0$, $i \ne j$)), or using the Laplace transform ($e^{At} = \mathcal{L}^{-1}[(sI - A)^{-1}]$). See references below for detailed algorithms.

Equivalent State Variable Descriptions

Given

$$\dot{x} = Ax + Bu, \quad y = Cx + Du \quad (9)$$

consider the new state vector $\tilde{x}$ where

$$\tilde{x} = Px$$

with $P$ a real nonsingular matrix. Substituting $x = P^{-1}\tilde{x}$ in (9), we obtain

$$\dot{\tilde{x}} = \tilde{A}\tilde{x} + \tilde{B}u, \quad y = \tilde{C}\tilde{x} + \tilde{D}u, \quad (10)$$

where

$$\tilde{A} = PAP^{-1}, \quad \tilde{B} = PB, \quad \tilde{C} = CP^{-1}, \quad \tilde{D} = D$$

The state variable descriptions (9) and (10) are called equivalent and $P$ is the equivalence transformation. This transformation corresponds to a change in the basis of the state space, which is a vector space. Appropriately selecting $P$, one can simplify the structure of $\tilde{A}$ $(= PAP^{-1})$; the matrices $\tilde{A}$ and $A$ are called similar. When the eigenvectors of $A$ are all linearly independent (this is the case, e.g., when all eigenvalues $\lambda_i$ of $A$ are distinct), then $P$ may be found so that $\tilde{A}$ is diagonal. When $e^{At}$ is to be determined, and $\tilde{A} = PAP^{-1} = \mathrm{diag}[\lambda_i]$ ($\tilde{A}$ and $A$ have the same eigenvalues), then

$$e^{At} = e^{P^{-1}\tilde{A}Pt} = P^{-1}e^{\tilde{A}t}P = P^{-1}\mathrm{diag}[e^{\lambda_i t}]P.$$

Note that it can be easily shown that equivalent state space representations give rise to the same impulse response and transfer function (see Linear Systems: Continuous-Time Impulse Response Descriptions).

Summary

State variable descriptions for continuous-time time-invariant systems are introduced and the state and output responses to inputs and initial conditions are derived. Equivalence of state variable representations is also discussed.

Cross-References

Linear Systems: Continuous-Time Impulse Response Descriptions
Linear Systems: Continuous-Time, Time-Varying State Variable Descriptions
Linear Systems: Discrete-Time, Time-Invariant State Variable Descriptions
Recommended Reading

The state variable description of systems received wide acceptance in systems theory beginning in the late 1950s. This was primarily due to the work of R.E. Kalman and others in filtering theory and quadratic control theory and to the work of applied mathematicians concerned with the stability theory of dynamical systems. For comments and extensive references on some of the early contributions in these areas, see Kailath (1980) and Sontag (1990). The use of state variable descriptions in systems and control opened the way for the systematic study of systems with multi-inputs and multi-outputs.

Bibliography

Antsaklis PJ, Michel AN (2006) Linear systems. Birkhauser, Boston
Brockett RW (1970) Finite dimensional linear systems. Wiley, New York
Chen CT (1984) Linear system theory and design. Holt, Rinehart and Winston, New York
DeCarlo RA (1989) Linear systems. Prentice-Hall, Englewood Cliffs
Hespanha JP (2009) Linear systems theory. Princeton Press, Princeton
Kailath T (1980) Linear systems. Prentice-Hall, Englewood Cliffs
Rugh WJ (1996) Linear systems theory, 2nd edn. Prentice-Hall, Englewood Cliffs
Sontag ED (1990) Mathematical control theory: deterministic finite dimensional systems. Texts in applied mathematics, vol 6. Springer, New York
Zadeh LA, Desoer CA (1963) Linear system theory: the state space approach. McGraw-Hill, New York

Linear Systems: Continuous-Time, Time-Varying State Variable Descriptions

Panos J. Antsaklis
Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN, USA

Abstract

Continuous-time processes that can be modeled by linear differential equations with time-varying coefficients can be written in terms of state variable descriptions of the form $\dot{x}(t) = A(t)x(t) + B(t)u(t)$, $y(t) = C(t)x(t) + D(t)u(t)$. The response of such systems due to a given input and initial conditions is derived using the Peano-Baker series. Equivalence of state variable descriptions is also discussed.

Keywords

Continuous-time; Linear systems; State variable descriptions; Time-varying

Introduction

Dynamical processes that can be described or approximated by linear high-order ordinary differential equations with time-varying coefficients can also be described, via a change of variables, by state variable descriptions of the form

$$\dot{x}(t) = A(t)x(t) + B(t)u(t); \quad x(t_0) = x_0$$
$$y(t) = C(t)x(t) + D(t)u(t), \quad (1)$$

where $x(t)$ ($t \in \mathbb{R}$, the set of reals) is a column vector of dimension $n$ ($x(t) \in \mathbb{R}^n$) and $A(t)$, $B(t)$, $C(t)$, $D(t)$ are matrices with entries functions of time $t$: $A(t) = [a_{ij}(t)]$, $a_{ij}(t): \mathbb{R} \to \mathbb{R}$; $A(t) \in \mathbb{R}^{n \times n}$, $B(t) \in \mathbb{R}^{n \times m}$, $C(t) \in \mathbb{R}^{p \times n}$, $D(t) \in \mathbb{R}^{p \times m}$. The input vector is $u(t) \in \mathbb{R}^m$ and the output vector is $y(t) \in \mathbb{R}^p$. The vector differential equation in (1) is the state equation, while the algebraic equation is the output equation.

The advantage of the state variable description (1) is that given an input $u(t)$, $t \ge t_0$ and an initial condition $x(t_0) = x_0$, the state trajectory or motion for $t \ge t_0$ can be conveniently characterized. To derive the expressions, we first consider the homogeneous state equation and the corresponding initial value problem.
Solving $\dot{x}(t) = A(t)x(t)$; $x(t_0) = x_0$

Consider the homogeneous equation with the initial condition

$$\dot{x}(t) = A(t)x(t); \quad x(t_0) = x_0 \quad (2)$$

where $x(t) = [x_1(t), \ldots, x_n(t)]^T$ is the state (column) vector of dimension $n$ and $A(t)$ is an $n \times n$ matrix with entries functions of time that take on values from the field of reals ($A(t) \in \mathbb{R}^{n \times n}$).

Under certain assumptions on the entries of $A(t)$, a solution of (2) exists and it is unique. These assumptions are satisfied, and a solution exists and is unique in the case, for example, when the entries of $A(t)$ are continuous functions of time. In the following we make this assumption.

To find the unique solution of (2), we use the method of successive approximations, which when applied to

$$\dot{x}(t) = f(t, x(t)), \quad x(t_0) = x_0 \quad (3)$$

is described by

$$\varphi_0(t) = x_0$$
$$\varphi_m(t) = x_0 + \int_{t_0}^{t} f(\tau, \varphi_{m-1}(\tau))\,d\tau, \quad m = 1, 2, \ldots \quad (4)$$

As $m \to \infty$, $\varphi_m$ converges to the unique solution of (3), assuming the $f$ satisfies certain conditions.

Applying the method of successive approximations to (2) yields

$$\varphi_0(t) = x_0$$
$$\varphi_1(t) = x_0 + \int_{t_0}^{t} A(\tau)x_0\,d\tau$$
$$\varphi_2(t) = x_0 + \int_{t_0}^{t} A(\tau)\varphi_1(\tau)\,d\tau$$
$$\vdots$$
$$\varphi_m(t) = x_0 + \int_{t_0}^{t} A(\tau)\varphi_{m-1}(\tau)\,d\tau$$

from which

$$\varphi_m(t) = \left[I + \int_{t_0}^{t} A(\tau_1)\,d\tau_1 + \int_{t_0}^{t} A(\tau_1)\int_{t_0}^{\tau_1} A(\tau_2)\,d\tau_2\,d\tau_1 + \cdots + \int_{t_0}^{t} A(\tau_1)\int_{t_0}^{\tau_1} A(\tau_2)\cdots\int_{t_0}^{\tau_{m-1}} A(\tau_m)\,d\tau_m\cdots d\tau_1\right]x_0$$

When $m \to \infty$, and under the above continuity assumptions on $A(t)$, $\varphi_m(t)$ converges to the unique solution of (2), i.e.,

$$\varphi(t) = \Phi(t, t_0)x_0 \quad (5)$$

where

$$\Phi(t, t_0) = I + \int_{t_0}^{t} A(\tau_1)\,d\tau_1 + \int_{t_0}^{t} A(\tau_1)\left[\int_{t_0}^{\tau_1} A(\tau_2)\,d\tau_2\right]d\tau_1 + \cdots \quad (6)$$

Note that $\Phi(t_0, t_0) = I$ and by differentiation it can be seen that

$$\dot{\Phi}(t, t_0) = A(t)\Phi(t, t_0), \quad (7)$$

as expected, since (5) is the solution of (2). The $n \times n$ matrix $\Phi(t, t_0)$ is called the state transition matrix of (2). The defining series (6) is called the Peano-Baker series.

Note that when $A(t) = A$, a constant matrix, then (6) becomes

$$\Phi(t, t_0) = I + \sum_{k=1}^{\infty} \frac{A^k(t - t_0)^k}{k!} \quad (8)$$
which is the defining series for the matrix exponential $e^{A(t-t_0)}$ (see Linear Systems: Continuous-Time, Time-Invariant State Variable Descriptions).

System Response

Based on the solution (5) of $\dot{x} = A(t)x(t)$, the solution of the non-homogeneous equation

$$\dot{x}(t) = A(t)x(t) + B(t)u(t); \quad x(t_0) = x_0 \quad (9)$$

can be shown to be

$$\varphi(t) = \Phi(t, t_0)x_0 + \int_{t_0}^{t}\Phi(t, \tau)B(\tau)u(\tau)\,d\tau. \quad (10)$$

Equation (10) is the variation of constants formula. This result can be shown via direct substitution of (10) into (9); note that $\varphi(t_0) = \Phi(t_0, t_0)x_0 = x_0$. That (10) is a solution can also be shown using a change of variables in (9), namely,

$$z(t) = \Phi(t_0, t)x(t).$$

Equation (10) is the sum of two parts, the state response (when $u(t) = 0$ and the system is driven only by the initial state conditions) and the input response (when $x_0 = 0$ and the system is driven only by the input $u(t)$); this illustrates the linear system principle of superposition.

In view of (10), the output $y(t)$ $(= C(t)x(t) + D(t)u(t))$ is

$$y(t) = C(t)\Phi(t, t_0)x_0 + \int_{t_0}^{t} C(t)\Phi(t, \tau)B(\tau)u(\tau)\,d\tau + D(t)u(t)$$
$$\phantom{y(t)} = C(t)\Phi(t, t_0)x_0 + \int_{t_0}^{t}\left[C(t)\Phi(t, \tau)B(\tau) + D(t)\delta(t-\tau)\right]u(\tau)\,d\tau$$

The second expression involves the Dirac (or impulse or delta) function $\delta(t)$ and is derived based on the basic property for $\delta(t)$, namely,

$$f(t) = \int_{-\infty}^{+\infty}\delta(t-\tau)f(\tau)\,d\tau,$$

where $\delta(t-\tau)$ denotes an impulse applied at time $\tau = t$.

Properties of the State Transition Matrix $\Phi(t, t_0)$

In general it is difficult to determine $\Phi(t, t_0)$ explicitly; however, $\Phi(t, t_0)$ may be readily determined in a number of special cases including the cases in which $A(t) = A$, $A(t)$ is diagonal, or $A(t)A(\tau) = A(\tau)A(t)$.

Consider $\dot{x} = A(t)x$. We can derive a number of important properties which are described below. It can be shown that given $n$ linearly independent initial conditions $x_0^i$, the corresponding $n$ solutions $\varphi^i(t)$ are also linearly independent. Let a fundamental matrix $\Psi(t)$ of $\dot{x} = A(t)x$ be an $n \times n$ matrix, the columns of which are a set of linearly independent solutions $\varphi^1(t), \ldots, \varphi^n(t)$. The state transition matrix $\Phi$ is the fundamental matrix determined from solutions that correspond to the initial conditions $[1, 0, 0, \ldots]^T$, $[0, 1, 0, \ldots, 0]^T$, $\ldots$, $[0, 0, \ldots, 1]^T$ (recall that $\Phi(t_0, t_0) = I$). The following are properties of $\Phi(t, t_0)$:
(i) $\Phi(t, t_0) = \Psi(t)\Psi^{-1}(t_0)$ with $\Psi(t)$ any fundamental matrix.
(ii) $\Phi(t, t_0)$ is nonsingular for all $t$ and $t_0$.
(iii) $\Phi(t, \tau) = \Phi(t, \sigma)\Phi(\sigma, \tau)$ (semigroup property).
(iv) $[\Phi(t, t_0)]^{-1} = \Phi(t_0, t)$.

In the special case of time-invariant systems and $\dot{x} = Ax$, the above properties can be written in terms of the matrix exponential since

$$\Phi(t, t_0) = e^{A(t-t_0)}.$$
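As a numerical sketch (the scalar system and grid below are illustrative choices), the successive approximations (4) for the commuting case $\dot{x} = t\,x$, i.e. $A(t) = [t]$, converge to $\Phi(t, 0) = e^{t^2/2}$, and the closed form $\Phi(t, \tau) = e^{(t^2 - \tau^2)/2}$ satisfies properties (iii) and (iv):

```python
import math
import numpy as np

# Successive approximations (4) for x_dot = t * x, so A(t) = [t] (scalar).
# Since A(t) commutes with itself here, Phi(t, 0) = exp(t^2 / 2).
ts = np.linspace(0.0, 1.0, 2001)
a = ts                                     # A(t) sampled on the grid

phi = np.ones_like(ts)                     # phi_0(t) = 1, i.e. x0 = 1
for _ in range(30):                        # phi_m = 1 + cumulative integral of a*phi_{m-1}
    f = a * phi
    phi = 1.0 + np.concatenate(
        ([0.0], np.cumsum((f[1:] + f[:-1]) / 2 * np.diff(ts))))

assert np.allclose(phi, np.exp(ts ** 2 / 2), atol=1e-5)

# Properties of the closed-form transition "matrix" (scalar here):
def Phi(t, tau):
    return math.exp((t * t - tau * tau) / 2)

t, s, tau = 1.3, 0.8, 0.2
assert math.isclose(Phi(t, tau), Phi(t, s) * Phi(s, tau))   # (iii) semigroup
assert math.isclose(1.0 / Phi(t, tau), Phi(tau, t))         # (iv) inverse
```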
Equivalence of State Variable Descriptions

Given the system

$$\dot{x} = A(t)x + B(t)u$$
$$y = C(t)x + D(t)u \quad (11)$$

consider the new state vector $\tilde{x}$

$$\tilde{x}(t) = P(t)x(t)$$

where $P^{-1}(t)$ exists and $P$ and $P^{-1}$ are continuous. Then the system

$$\dot{\tilde{x}} = \tilde{A}(t)\tilde{x} + \tilde{B}(t)u$$
$$y = \tilde{C}(t)\tilde{x} + \tilde{D}(t)u$$

where

$$\tilde{A}(t) = [P(t)A(t) + \dot{P}(t)]P^{-1}(t)$$
$$\tilde{B}(t) = P(t)B(t)$$
$$\tilde{C}(t) = C(t)P^{-1}(t)$$
$$\tilde{D}(t) = D(t)$$

is equivalent to (1). It can be easily shown that equivalent descriptions give rise to the same impulse responses.

Summary

State variable descriptions for continuous-time time-varying systems were introduced and the state and output responses to inputs and initial conditions were derived. The equivalence of state variable representations was also discussed.

Cross-References

Linear Systems: Continuous-Time, Time-Invariant State Variable Descriptions
Linear Systems: Continuous-Time Impulse Response Descriptions
Linear Systems: Discrete-Time, Time-Varying, State Variable Descriptions

Recommended Reading

Additional information regarding the time-varying case may be found in Brockett (1970), Rugh (1996), and Antsaklis and Michel (2006). For historical comments and extensive references on some of the early contributions, see Sontag (1990) and Kailath (1980).

Bibliography

Antsaklis PJ, Michel AN (2006) Linear systems. Birkhauser, Boston
Brockett RW (1970) Finite dimensional linear systems. Wiley, New York
Kailath T (1980) Linear systems. Prentice-Hall, Englewood Cliffs
Miller RK, Michel AN (1982) Ordinary differential equations. Academic, New York
Rugh WJ (1996) Linear system theory, 2nd edn. Prentice-Hall, Englewood Cliffs
Sontag ED (1990) Mathematical control theory: deterministic finite dimensional systems. Texts in applied mathematics, vol 6. Springer, New York

Linear Systems: Discrete-Time Impulse Response Descriptions

Panos J. Antsaklis
Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN, USA

Abstract

An important input-output description of a linear discrete-time system is its (discrete-time) impulse response (or pulse response), which is the response $h(k, k_0)$ to a discrete impulse applied at time $k_0$. In time-invariant systems that are also causal and at rest at time zero, the impulse response is $h(k, 0)$, and its z-transform is the transfer function of the system. Expressions for $h(k, k_0)$ when the system is described by state variable equations are derived.
Keywords

At rest; Causal; Discrete-time; Discrete-time impulse response descriptions; Linear systems; Pulse response descriptions; Time-invariant; Time-varying; Transfer function descriptions

Introduction

Consider linear, discrete-time dynamical systems that can be described by

$$y(k) = \sum_{l=-\infty}^{+\infty} H(k, l)u(l) \quad (1)$$

where $k, l \in \mathbb{Z}$, the set of integers, the output is $y(k) \in \mathbb{R}^p$, the input is $u(k) \in \mathbb{R}^m$, and $H(k, l): \mathbb{Z} \times \mathbb{Z} \to \mathbb{R}^{p \times m}$. For instance, any system that can be written in state variable form

$$x(k+1) = A(k)x(k) + B(k)u(k)$$
$$y(k) = C(k)x(k) + D(k)u(k) \quad (2)$$

or

$$x(k+1) = Ax(k) + Bu(k)$$
$$y(k) = Cx(k) + Du(k) \quad (3)$$

can be represented by (1). Note that it is assumed that at $l = -\infty$, the system is at rest, i.e., no energy is stored in the system at time $-\infty$.

Define the discrete-time impulse (or unit pulse) as

$$\delta(k) = \begin{cases} 1 & k = 0 \\ 0 & k \ne 0 \end{cases}, \quad k \in \mathbb{Z}$$

and consider a single-input, single-output system:

$$y(k) = \sum_{l=-\infty}^{+\infty} h(k, l)u(l) \quad (4)$$

If $u(l) = \delta(\hat{l} - l)$, that is, the input is a unit pulse applied at $l = \hat{l}$, then the output is

$$y_I(k) = h(k, \hat{l}),$$

i.e., $h(k, \hat{l})$ is the output at time $k$ when a unit pulse is applied at time $\hat{l}$.

So in (4) $h(k, l)$ is the response at time $k$ to a discrete-time impulse (unit pulse) applied at time $l$; $h(k, l)$ is the discrete-time impulse response of the system. Clearly if $h(k, l)$ is known, the response of the system to any input can be determined via (4). So $h(k, l)$ is an input/output description of the system.

Equation (1) is a generalization of (4) for the multi-input, multi-output case. If we let all the components of $u(l)$ in (1) be zero except for the $j$th component, then

$$y_i(k) = \sum_{l=-\infty}^{+\infty} h_{ij}(k, l)u_j(l) \quad (5)$$

$h_{ij}(k, l)$ denotes the response of the $i$th component of the output of system (1) at time $k$ due to a discrete impulse applied to the $j$th component of the input at time $l$ with all remaining components of the input being zero. $H(k, l) = [h_{ij}(k, l)]$ is called the impulse response matrix of the system.

If it is known that system (1) is causal, then the output will be zero before an input is applied. Therefore,

$$H(k, l) = 0, \quad \text{for } k < l, \quad (6)$$

and so when causality is present, (1) becomes

$$y(k) = \sum_{l=-\infty}^{k} H(k, l)u(l). \quad (7)$$

A system described by (1) is at rest at $k = k_0$ if $u(k) = 0$ for $k \ge k_0$ implies $y(k) = 0$ for $k \ge k_0$. For a system at rest at $k = k_0$, (7) becomes

$$y(k) = \sum_{l=k_0}^{k} H(k, l)u(l). \quad (8)$$

If system (1) is time-invariant, then $H(k, l) = H(k-l, 0)$ (also written as $H(k-l)$) since only the time elapsed $(k-l)$ from the application of the discrete-time impulse is important. Then (8) becomes
$$y(k) = \sum_{l=0}^{k} H(k-l)u(l), \quad k \ge 0, \quad (9)$$

where we chose $k_0 = 0$ without loss of generality. Equation (9) is the description for causal, time-invariant systems, at rest at $k = 0$.

Equation (9) is a convolution sum and if we take the (one-sided or unilateral) z-transform of both sides,

$$\hat{y}(z) = \hat{H}(z)\hat{u}(z), \quad (10)$$

where $\hat{y}(z)$, $\hat{u}(z)$ are the z-transforms of $y(k)$, $u(k)$ and $\hat{H}(z)$ is the z-transform of the discrete-time impulse response $H(k)$. $\hat{H}(z)$ is the transfer function matrix of the system. Note that the transfer function of a linear, time-invariant system is defined as the rational matrix $\hat{H}(z)$ that satisfies (10) for any input and its corresponding output assuming zero initial conditions.

Connections to State Variable Descriptions

When a system is described by (2), then

$$y(k) = \sum_{l=k_0}^{k-1} C(k)\Phi(k, l+1)B(l)u(l) + D(k)u(k), \quad k > k_0 \quad (11)$$

where it was assumed that $x(k_0) = 0$, i.e., the system is at rest at $k_0$. Here $\Phi(k, l)$ $(= A(k-1)\cdots A(l))$ is the state transition matrix of the system.

Comparing (11) with (8), the discrete-time impulse response of the system is

$$H(k, l) = \begin{cases} C(k)\Phi(k, l+1)B(l) & k > l \\ D(k) & k = l \\ 0 & k < l \end{cases} \quad (12)$$

Similarly, when the system is time-invariant and is described by (3),

$$y(k) = \sum_{l=k_0}^{k-1} CA^{k-(l+1)}Bu(l) + Du(k), \quad k > k_0 \quad (13)$$

where $x(k_0) = 0$ and

$$H(k, l) = H(k-l) = \begin{cases} CA^{k-(l+1)}B & k > l \\ D & k = l \\ 0 & k < l \end{cases} \quad (14)$$

When $l = 0$ (taking the time when the discrete impulse is applied to be zero, $l = 0$), the discrete-time impulse response is

$$H(k) = \begin{cases} CA^{k-1}B & k > 0 \\ D & k = 0 \\ 0 & k < 0 \end{cases} \quad (15)$$

Taking (one-sided or unilateral) z-transforms of both sides in (15),

$$\hat{H}(z) = C(zI - A)^{-1}B + D \quad (16)$$

which is the transfer function matrix in terms of the coefficient matrices in the state variable description (3). Note that (16) can also be derived directly from (3) by assuming zero initial conditions $(x(0) = 0)$ and taking z-transforms of both sides.

Finally, it is easy to show that equivalent state variable descriptions give rise to the same discrete-time impulse response.

Summary

The discrete-time impulse response is an external, input-output description of linear, discrete-time systems. When the system is time-invariant, the z-transform of the impulse response $h(k, 0)$ (which is the output response at time $k$ due to a discrete impulse applied at time zero with initial conditions taken to be zero) is the transfer function – another very common input-output description. The relationships to the state variable descriptions were shown.
Cross-References

Linear Systems: Discrete-Time, Time-Invariant State Variable Descriptions
Linear Systems: Discrete-Time, Time-Varying, State Variable Descriptions

Recommended Reading

External or input-output descriptions such as the impulse response and the transfer function (in the time-invariant case) are described in several textbooks below.

Bibliography

Antsaklis PJ, Michel AN (2006) Linear systems. Birkhauser, Boston
Kailath T (1980) Linear systems. Prentice-Hall, Englewood Cliffs
Rugh WJ (1996) Linear systems theory, 2nd edn. Prentice-Hall, Englewood Cliffs

Linear Systems: Discrete-Time, Time-Invariant State Variable Descriptions

Panos J. Antsaklis
Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN, USA

Abstract

Discrete-time processes that can be modeled by linear difference equations with constant coefficients can also be described in a systematic way in terms of state variable descriptions of the form $x(k+1) = Ax(k) + Bu(k)$, $y(k) = Cx(k) + Du(k)$. The response of such systems due to a given input and subject to initial conditions is derived. Equivalence of state variable descriptions is also discussed.

Keywords

Discrete-time; Linear systems; State variable descriptions; Time-invariant

Introduction

Discrete-time systems arise in a variety of ways in the modeling process. There are systems that are inherently defined only at discrete points in time; examples include digital devices, inventory systems, economic systems such as banking where interest is calculated and added to savings accounts at discrete time intervals, etc. There are also systems that describe continuous-time systems at discrete points in time; examples include simulations of continuous processes using digital computers and feedback control systems that employ digital controllers and give rise to sampled-data systems.

Linear, discrete-time, time-invariant systems can be modeled via state variable equations, namely,

$$x(k+1) = Ax(k) + Bu(k); \quad x(0) = x_0$$
$$y(k) = Cx(k) + Du(k) \quad (1)$$

where $k \in \mathbb{Z}$, the set of integers, the state vector $x \in \mathbb{R}^n$, i.e., an $n$-dimensional column vector; $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$, $C \in \mathbb{R}^{p \times n}$, $D \in \mathbb{R}^{p \times m}$ are matrices with entries of real numbers; and $y(k) \in \mathbb{R}^p$, $u(k) \in \mathbb{R}^m$ are the output and the input, respectively. The vector difference equation in (1) is the state equation and the algebraic equation is the output equation.

Note that (1) could have been equivalently written as $x(l) = Ax(l-1) + Bu(l-1)$ where $l = k+1$ and $x(l-1)$ is an easily visualized delayed version of $x(l)$; this is a form more common in signal processing (where a two-sided or bilateral z-transform is used). In control, where we assume a known initial condition at time equal to zero (and a one-sided or unilateral z-transform is taken), the form in (1) is common.

Similar to the continuous-time case, (1) can be derived from a set of high-order difference equations by introducing the state variables
$x(k) = [x_1(k), \ldots, x_n(k)]^T$. Description (1) can also be derived from continuous-time system descriptions by sampling (see Sampled-Data Systems).

The advantage of the above state variable description is that given any input $u(k)$ and initial conditions $x(0)$, its solution (state trajectory or motion) can be conveniently and systematically characterized. This is done below. We first consider the solutions of the homogeneous equation $x(k+1) = Ax(k)$.

Solving $x(k+1) = Ax(k)$; $x(0) = x_0$

Consider the homogeneous equation

$$x(k+1) = Ax(k); \quad x(0) = x_0 \quad (2)$$

where $k \in \mathbb{Z}^+$ is a nonnegative integer, $x(k) = [x_1(k), \ldots, x_n(k)]^T$ is the state column vector of dimension $n$, and $A$ is an $n \times n$ matrix with entries real numbers (i.e., $A \in \mathbb{R}^{n \times n}$).

Write (2) for $k = 0, 1, 2, \ldots$, namely, $x(1) = Ax(0)$, $x(2) = Ax(1) = A^2x(0), \ldots$ to derive the solution

$$x(k) = A^kx(0), \quad k > 0 \quad (3)$$

This result can be shown formally by induction. Note that $A^0 = I$ by convention and so (3) also satisfies the initial condition.

If the initial time were some (integer) $k_0$ instead of zero, then the solution would be

$$x(k) = A^{k-k_0}x(k_0), \quad k > k_0 \quad (4)$$

The solution can be written as

$$x(k) = \Phi(k, k_0)x(k_0) = \Phi(k - k_0, 0)x(k_0), \quad k > k_0 \quad (5)$$

where $\Phi(k, k_0)$ is the state transition matrix and it equals $\Phi(k, k_0) = A^{k-k_0}$. Note that for time-invariant systems, the initial time $k_0$ can always be taken to be zero without loss of generality; this is because the behavior depends only on the time elapsed $(k - k_0)$ and not on the actual initial time $k_0$.

In view of (3), it is clear that $A^k$ plays an important role in the solutions of the difference state equations that describe linear, discrete-time, time-invariant systems; it is actually analogous to the role $e^{At}$ plays in the solutions of the linear differential state equations that describe linear, continuous-time, time-invariant systems.

Notice that in (3), $k > 0$. This is so because $A^k$ for $k < 0$ may not exist; this is the case, for example, when $A$ is a singular matrix – it has at least one eigenvalue at the origin. In contrast, $e^{At}$ exists for any $t$ positive or negative. The implication is that in discrete-time systems we may not be able to determine uniquely the initial past state $x(0)$ from a current state value $x(k)$; in contrast, in continuous-time systems, it is always possible to go backwards in time.

There are several methods to calculate $A^k$ that mirror the methods to calculate $e^{At}$. One could, for example, use similarity transformations, or the z-transform. When all eigenvectors of $A$ are linearly independent (this is the case, e.g., when all eigenvalues $\lambda_i$ of $A$ are distinct), then a similarity transformation exists so that

$$PAP^{-1} = \tilde{A} = \mathrm{diag}[\lambda_i].$$

Then

$$A^k = P^{-1}\tilde{A}^kP = P^{-1}\begin{bmatrix} \lambda_1^k & & \\ & \ddots & \\ & & \lambda_n^k \end{bmatrix}P.$$

Alternatively, using the z-transforms, $A^k = \mathcal{Z}^{-1}\{z(zI - A)^{-1}\}$. Also when the eigenvalues $\lambda_i$ of $A$ are distinct, then

$$A^k = \sum_{i=1}^{n} A_i\lambda_i^k,$$

where $A_i = v_i\tilde{v}_i$ with $v_i$, $\tilde{v}_i$ the right and left eigenvectors of $A$ that correspond to $\lambda_i$. Note that
672 Linear Systems: Discrete-Time, Time-Invariant State Variable Descriptions

2 3
vQ1 Equivalence of State Variable
6 :: 7  1 Descriptions
4 : 5 D v1    vn ;
vQn
Given description (1), consider the new state
vector xQ where
Ai ki are the modes of the system. One could
also use the Cayley-Hamilton theorem to deter-
Q
x.k/ D P x.k/
mine Ak .

with P 2 Rnn a real nonsingular matrix.


System Response Substituting x D P 1 xQ in (1), we obtain

Consider the description (1). The response can Q C 1/ D AQx.k/


x.k Q Q
C Bu.k/
be easily derived by writing the equation for (10)
y.k/ D CQ x.k/
Q Q
C Du.k/
k D 0; 1; 2; : : : and substituting or formally by
induction. It is where
X
k1
x.k/ D Ak x.0/ C Ak.j C1/ Bu.j /; k>0 AQ D PAP 1 ; BQ D PB; CQ D CP 1 ; DQ D D
j D0
(6) The state variable descriptions (1) and (9)
and are called equivalent and P is the equivalence
transformation matrix. This transformation cor-
X
k1
responds to a change in the basis of the state
y.k/ D C Ak x.0/ C C Ak.j C1/Bu.j /
space, which is a vector space. Appropriately se-
j D0
lecting P one can simplify the structure of AQ .D
C Du.k/; k>0 PAP 1 /. It can be easily shown that equivalent
description gives rise to the same discrete impulse
y.0/ D C x.0/ C Du.0/: (7)
response and transfer function.
Note that (6) can also be written as
2 3 Summary
u.k  1/
6 7
x.k/ D Ak x.0/CŒB; AB;    ; Ak1 B4 ::: 5:
State variable descriptions for discrete-time,
u.0/ time-invariant systems were introduced and the
(8) state and output responses to inputs and initial
Clearly the response is the sum of two compo- conditions were derived. The equivalence of state
nents, one due to the initial condition (state re- variable representations was also discussed.
sponse) and one due to the input (input response).
This illustrates the linear system principle of
superposition. Cross-References
If the initial time is k0 and (4) is used, then
 Linear Systems: Continuous-Time, Time-In-
X
k1 variant State Variable Descriptions
k.j C1/
y.k/ D C A kk0
x.k0 / C CA Bu.j /  Linear Systems: Discrete-Time Impulse Re-
j Dk0 sponse Descriptions
 Linear Systems: Discrete-Time, Time-Varying,
C Du.k/; k > k0
State Variable Descriptions
y.k0 / D C x.k0 / C Du.k0 /: (9)  Sampled-Data Systems
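As a numerical illustration of the formulas above, the following Python sketch (using NumPy; the matrices A, B, C, D, the input, and the transformation P are arbitrary illustrative choices, not taken from the article) computes Aᵏ by the modal formula Aᵏ = P⁻¹ÃᵏP, evaluates the response by iterating the state equation — which equals the convolution-sum expressions (6)–(7) — and checks that an equivalence transformation (10) leaves the output unchanged.

```python
import numpy as np

# Illustrative 2-state system (made-up values, chosen only for the demo)
A = np.array([[0.5, 0.2],
              [0.1, 0.4]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

# Modal computation of A^k: A = V diag(lam) V^{-1}  =>  A^k = V diag(lam^k) V^{-1}
lam, V = np.linalg.eig(A)            # distinct eigenvalues => V is invertible
k = 5
Ak_modal = V @ np.diag(lam**k) @ np.linalg.inv(V)
assert np.allclose(Ak_modal, np.linalg.matrix_power(A, k))

def response(A, B, C, D, x0, u):
    """Iterate the state equation (1); the result equals formulas (6)-(7)."""
    x, ys = x0, []
    for uk in u:
        ys.append(C @ x + D @ uk)
        x = A @ x + B @ uk
    return np.array(ys)

x0 = np.array([1.0, -1.0])
u = [np.array([1.0])] * 10           # unit step input over 10 samples
y = response(A, B, C, D, x0, u)

# Equivalence transformation (10): identical input/output behavior
P = np.array([[2.0, 1.0], [0.0, 1.0]])   # any nonsingular P
Pinv = np.linalg.inv(P)
At, Bt, Ct, Dt = P @ A @ Pinv, P @ B, C @ Pinv, D
yt = response(At, Bt, Ct, Dt, P @ x0, u)
assert np.allclose(y, yt)            # equivalent descriptions, same output
print("modal A^k and equivalence checks passed")
```

The equivalence check illustrates the closing remark above: the transfer function C(zI − A)⁻¹B + D, and hence the output for any input, is invariant under x̃ = Px.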
Recommended Reading

The state variable descriptions received wide acceptance in systems theory beginning in the late 1950s. This was primarily due to the work of R.E. Kalman. For historical comments and extensive references, see Kailath (1980). The use of state variable descriptions in systems and control opened the way for the systematic study of systems with multi-inputs and multi-outputs.

Bibliography

Antsaklis PJ, Michel AN (2006) Linear systems. Birkhäuser, Boston
Franklin GF, Powell DJ, Workman ML (1998) Digital control of dynamic systems, 3rd edn. Addison-Wesley Longman, Menlo Park
Kailath T (1980) Linear systems. Prentice-Hall, Englewood Cliffs
Rugh WJ (1996) Linear systems theory, 2nd edn. Prentice-Hall, Englewood Cliffs

Linear Systems: Discrete-Time, Time-Varying, State Variable Descriptions

Panos J. Antsaklis
Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN, USA

Abstract

Discrete-time processes that can be modeled by linear difference equations with time-varying coefficients can be written in terms of state variable descriptions of the form x(k+1) = A(k)x(k) + B(k)u(k); y(k) = C(k)x(k) + D(k)u(k). The response of such systems due to a given input and initial conditions is derived. Equivalence of state variable descriptions is also discussed.

Keywords

Discrete-time; Linear systems; State variable descriptions; Time-varying

Introduction

Discrete-time systems arise in a variety of ways in the modeling process. There are systems that are inherently defined only at discrete points in time; examples include digital devices, inventory systems, and economic systems such as banking, where interest is calculated and added to savings accounts at discrete time intervals. There are also systems that describe continuous-time systems at discrete points in time; examples include simulations of continuous processes using digital computers and feedback control systems that employ digital controllers and give rise to sampled-data systems.

Dynamical processes that can be described or approximated by linear difference equations with time-varying coefficients can also be described, via a change of variables, by state variable descriptions of the form

x(k+1) = A(k)x(k) + B(k)u(k);  x(k0) = x0
y(k) = C(k)x(k) + D(k)u(k).   (1)

Above, the state vector x(k) (k ∈ ℤ, the set of integers) is a column vector of dimension n (x(k) ∈ ℝⁿ); the output is y(k) ∈ ℝᵖ and the input is u(k) ∈ ℝᵐ. A(k), B(k), C(k), and D(k) are matrices whose entries are functions of time k, A(k) = [a_ij(k)], a_ij(k): ℤ → ℝ (A(k) ∈ ℝⁿˣⁿ, B(k) ∈ ℝⁿˣᵐ, C(k) ∈ ℝᵖˣⁿ, D(k) ∈ ℝᵖˣᵐ). The vector difference equation in (1) is the state equation, while the algebraic equation is the output equation. Note that in the time-invariant case, A(k) = A, B(k) = B, C(k) = C, and D(k) = D.

The advantage of the state variable description (1) is that, given an input u(k), k ≥ k0, and an initial condition x(k0) = x0, the state trajectories or motions for k ≥ k0 can be conveniently characterized. To determine the expressions, we first consider the homogeneous state equation and the corresponding initial value problem.

Solving x(k+1) = A(k)x(k); x(k0) = x0

Consider the homogeneous equation
x(k+1) = A(k)x(k);  x(k0) = x0   (2)

Note that

x(k0+1) = A(k0)x(k0)
x(k0+2) = A(k0+1)A(k0)x(k0)
⋮
x(k) = A(k−1)A(k−2) ⋯ A(k0)x(k0)
     = [∏_{j=k0}^{k−1} A(j)] x(k0),  k > k0

This result can be shown formally by induction. The solution of (2) is then

x(k) = Φ(k, k0)x(k0),   (3)

where Φ(k, k0) is the state transition matrix of (2), given by

Φ(k, k0) = ∏_{j=k0}^{k−1} A(j),  k > k0;  Φ(k0, k0) = I.   (4)

Note that in the time-invariant case, Φ(k, k0) = A^(k−k0).

System Response

Consider now the state equation in (1). It can be easily shown that the solution is

x(k) = Φ(k, k0)x(k0) + Σ_{j=k0}^{k−1} Φ(k, j+1)B(j)u(j),  k > k0,   (5)

and the response y(k) of (1) is

y(k) = C(k)Φ(k, k0)x(k0) + C(k) Σ_{j=k0}^{k−1} Φ(k, j+1)B(j)u(j) + D(k)u(k),  k > k0,   (6)

and

y(k0) = C(k0)x(k0) + D(k0)u(k0).

Equation (5) is the sum of two parts, the state response (when u(k) = 0 and the system is driven only by the initial state conditions) and the input response (when x(k0) = 0 and the system is driven only by the input u(k)); this illustrates the linear systems principle of superposition.

Equivalence of State Variable Descriptions

Given (1), consider the new state vector x̃ where

x̃(k) = P(k)x(k)

where P⁻¹(k) exists. Then

x̃(k+1) = Ã(k)x̃(k) + B̃(k)u(k)
y(k) = C̃(k)x̃(k) + D̃(k)u(k)   (7)

where

Ã(k) = P(k+1)A(k)P⁻¹(k),
B̃(k) = P(k+1)B(k),
C̃(k) = C(k)P⁻¹(k),
D̃(k) = D(k)

is equivalent to (1). It can be easily shown that equivalent descriptions give rise to the same discrete impulse responses.

Summary

State variable descriptions for linear discrete-time, time-varying systems were introduced, and the state and output responses to inputs and initial conditions were derived. The equivalence of state variable representations was also discussed.

Cross-References

▶ Linear Systems: Discrete-Time, Time-Invariant State Variable Descriptions
▶ Linear Systems: Discrete-Time Impulse Response Descriptions
▶ Linear Systems: Continuous-Time, Time-Varying State Variable Descriptions
▶ Sampled-Data Systems

Recommended Reading

The state variable descriptions received wide acceptance in systems theory beginning in the late 1950s. This was primarily due to the work of R.E. Kalman. For historical comments and extensive references, see Kailath (1980). The use of state variable descriptions in systems and control opened the way for the systematic study of systems with multi-inputs and multi-outputs.

Bibliography

Antsaklis PJ, Michel AN (2006) Linear systems. Birkhäuser, Boston
Åström KJ, Wittenmark B (1997) Computer controlled systems: theory and design, 3rd edn. Prentice Hall, Upper Saddle River
Franklin GF, Powell DJ, Workman ML (1998) Digital control of dynamic systems, 3rd edn. Addison-Wesley, Menlo Park
Jury EI (1958) Sampled-data control systems. Wiley, New York
Kailath T (1980) Linear systems. Prentice-Hall, Englewood Cliffs
Ragazzini JR, Franklin GF (1958) Sampled-data control systems. McGraw-Hill, New York
Rugh WJ (1996) Linear systems theory, 2nd edn. Prentice-Hall, Englewood Cliffs

LMI Approach to Robust Control

Kang-Zhi Liu
Department of Electrical and Electronic Engineering, Chiba University, Chiba, Japan

Abstract

In the analysis and design of robust control systems, the LMI method plays a fundamental role. This article gives a brief introduction to this topic. After the introduction of LMI, it is illustrated how a control design problem is related with matrix inequality. Then, two methods are explained on how to transform a control problem characterized by matrix inequalities to LMIs, which is the core of the LMI approach. Based on this knowledge, the LMI solutions to various kinds of robust control problems are illustrated. Included are H∞ and H2 control, regional pole placement, and gain-scheduled control.

Keywords

Gain-scheduled control; H∞ and H2 control; LMI; Multi-objective control; Regional pole placement; Robust control

Introduction of LMI

A matrix inequality of the form

F(x) = F0 + Σ_{i=1}^{m} xᵢ Fᵢ > 0   (1)

is called an LMI (linear matrix inequality). Here, x = [x₁ ⋯ x_m] is the unknown vector and Fᵢ (i = 1, …, m) is a symmetric matrix. F(x) is an affine function of x. The inequality means that F(x) is positive definite.

LMIs can be solved effectively by numerical algorithms such as the famous interior-point method (Nesterov and Nemirovskii 1994). MATLAB has an LMI toolbox (Gahinet et al. 1995) tailored for solving the related control problems. Boyd et al. (1994) provide detailed theoretical fundamentals of LMI. A comprehensive and up-to-date treatment of the applications of LMI in robust control is given in Liu and Yao (2014).

The notation He(A) = A + Aᵀ is used to simplify the presentation of large matrices; A⊥ is a matrix whose columns form a basis of the kernel space of A, i.e., AA⊥ = 0. Further, A ⊗ B denotes the Kronecker product of the matrices (A, B).
Control Problems and LMI

In control problems, it is often the case that the variables are matrices. For example, the necessary and sufficient condition for the stability of a linear system ẋ(t) = Ax(t) is that there exists a positive-definite matrix P satisfying the inequality AP + PAᵀ < 0. Although this is different from the LMI of Eq. (1) in form, it can be converted to Eq. (1) equivalently by using a basis of symmetric matrices.

Next, consider the stabilization of the system ẋ = Ax + Bu by a state feedback u = Fx. The closed-loop system is ẋ = (A + BF)x. Therefore, the stability condition is that there exist a positive-definite matrix P and a feedback gain matrix F satisfying the inequality

(A + BF)P + P(A + BF)ᵀ < 0.   (2)

In this inequality, FP, the product of the unknown variables F and P, appears. Such a matrix inequality is called a bilinear matrix inequality, or BMI for short. A BMI problem is non-convex and difficult to solve. There are mainly two methods for transforming a BMI into an LMI: variable elimination and variable change.

From BMI to LMI: Variable Elimination

The method of variable elimination is good at optimizing single-objective problems. This method is based on the theorem below (Gahinet and Apkarian 1994).

Lemma 1 Given real matrices E, F, G with G being symmetric, the inequality

EᵀXF + FᵀXᵀE + G < 0   (3)

has a solution X if and only if the following two inequalities hold simultaneously:

E⊥ᵀ G E⊥ < 0,  F⊥ᵀ G F⊥ < 0.   (4)

Application of this theorem to the previous stabilization problem (2) yields (Bᵀ)⊥ᵀ(AP + PAᵀ)(Bᵀ)⊥ < 0, which is an LMI in P. Once P is obtained, it may be substituted back into the inequality (2), which is then solved for F.

For output feedback problems, it is often necessary to construct a new matrix from two given matrices in solving a control problem with the LMI approach. The method is given by the following lemma.

Lemma 2 Given two n-dimensional positive-definite matrices X and Y, a 2n-dimensional positive-definite matrix P whose leading n × n block is Y and whose inverse has leading n × n block X can be constructed if and only if

[ X  I ]
[ I  Y ] > 0.   (5)

Factorizing Y − X⁻¹ as FFᵀ, a solution is given by

P = [ Y   F ]
    [ Fᵀ  I ].

As an example of output feedback control design, let us consider the stabilization of the plant

ẋ_P = Ax_P + Bu,  y = Cx_P   (6)

with a full-order dynamic controller

ẋ_K = A_K x_K + B_K y,  u = C_K x_K + D_K y.   (7)

The closed-loop system is

d/dt [ x_P ] = A_c [ x_P ],  A_c = [ A + BD_K C   BC_K ]
     [ x_K ]       [ x_K ]         [ B_K C        A_K  ].   (8)

The stability condition is that the matrix inequality

A_cᵀP + PA_c < 0   (9)
has a solution P > 0. To apply the variable elimination method, we need to put all the coefficient matrices of the controller into a single matrix. This is done as follows:

A_c = Ā + B̄KC̄,  K = [ D_K  C_K ]
                       [ B_K  A_K ]   (10)

in which Ā = diag(A, 0), B̄ = diag(B, I), and C̄ = diag(C, I), all being block diagonal. Then, based on Lemma 1, the stability condition reduces to the existence of symmetric matrices X, Y satisfying the LMIs

(B̄ᵀ)⊥ᵀ (ĀX + XĀᵀ)(B̄ᵀ)⊥ < 0   (11)

(C̄⊥)ᵀ (YĀ + ĀᵀY)C̄⊥ < 0.   (12)

Meanwhile, the positive definiteness of the matrix P is guaranteed by Eq. (5) in Lemma 2.

From BMI to LMI: Variable Change

We may also use the method of variable change to transform a BMI into an LMI. This method is good at multi-objective optimization.

The detail is as follows (Gahinet 1996). A positive-definite matrix P can always be factorized as the quotient of two triangular matrices, i.e.,

PΠ₁ = Π₂,  Π₁ = [ X   I ],  Π₂ = [ I   Y  ]
                [ Mᵀ  0 ]         [ 0   Nᵀ ]   (13)

P > 0 is guaranteed by Eq. (5) for a full-order controller. Further, the matrices M, N are computed from MNᵀ = I − XY. Consequently, they are nonsingular.

An equivalent inequality Π₁ᵀA_cᵀΠ₂ + Π₂ᵀA_cΠ₁ < 0 is obtained by multiplying Eq. (9) with Π₁ᵀ and Π₁. After a change of variables, this inequality reduces to an LMI

He [ AX + BĈ   A + BD̂C + Âᵀ ] < 0.   (14)
   [ 0          YA + B̂C      ]

The new variables Â, B̂, Ĉ, D̂ are set as

Â = NA_K Mᵀ + NB_K CX + YBC_K Mᵀ + Y(A + BD_K C)X
B̂ = NB_K + YBD_K,  Ĉ = C_K Mᵀ + D_K CX,  D̂ = D_K.   (15)

The coefficient matrices of the controller become

D_K = D̂,  C_K = (Ĉ − D_K CX)M⁻ᵀ,
B_K = N⁻¹(B̂ − YBD_K),   (16)
A_K = N⁻¹(Â − NB_K CX − YBC_K Mᵀ − Y(A + BD_K C)X)M⁻ᵀ.

H2 and H∞ Control

In system optimization, the H2 and H∞ norms are the most popular and effective performance indices. The H2 norm of a transfer function is closely related to the squared area of its impulse response, so a smaller H2 norm implies a faster response. Meanwhile, the H∞ norm of a transfer function is the largest magnitude of its frequency response. Hence, for the transfer function from the disturbance to the controlled output, a smaller H∞ norm guarantees a better disturbance attenuation.

Usually H2 and H∞ optimization problems are treated in the generalized feedback system of Fig. 1. Here, the generalized plant G(s) includes the nominal plant, the performance index, and the weighting functions.

LMI Approach to Robust Control, Fig. 1 Generalized feedback system

Let the generalized plant G(s) be

G(s) = [ C₁ ] (sI − A)⁻¹ [ B₁  B₂ ] + [ D₁₁  D₁₂ ]
       [ C₂ ]                          [ D₂₁  0   ].   (17)

Further, the stabilizability of (A, B₂) and the detectability of (C₂, A) are assumed. The
closed-loop transfer matrix from the disturbance w to the performance output z is denoted by

H_zw(s) = C_c(sI − A_c)⁻¹B_c + D_c.   (18)

The condition for H_zw(s) to have an H2 norm less than γ, i.e., ‖H_zw‖₂ < γ, is that there are symmetric matrices P and W satisfying

[ PA_c + A_cᵀP   C_cᵀ ]           [ W     B_cᵀP ]
[ C_c            −I   ] < 0,      [ PB_c  P     ] > 0   (19)

as well as Tr(W) < γ². Here, Tr(W) denotes the trace of the matrix W, i.e., the sum of its diagonal entries.

The LMI solution is derived via the application of the variable change method, as given below.

Theorem 1 Suppose that D₁₁ = 0. The H2 control problem is solvable if and only if there exist symmetric matrices X, Y, W and matrices Â, B̂, Ĉ satisfying the following LMIs:

He [ AX + B₂Ĉ     0           0    ]
   [ Âᵀ + A       YA + B̂C₂   0    ] < 0   (20)
   [ C₁X + D₁₂Ĉ   C₁          −½I  ]

[ W     B₁ᵀ   B₁ᵀY ]
[ B₁    X     I    ] > 0,  Tr(W) < γ².   (21)
[ YB₁   I     Y    ]

When the LMI Eqs. (20) and (21) have solutions, an H2 controller is given by Eq. (16) by setting D̂ = 0.

The H∞ control problem is to design a controller so that ‖H_zw‖∞ < γ. The starting point of H∞ control is the famous bounded real lemma, which states that H_zw(s) has an H∞ norm less than γ if and only if there is a positive-definite matrix P satisfying

[ A_cᵀP + PA_c   PB_c   C_cᵀ ]
[ B_cᵀP          −γI    D_cᵀ ] < 0.   (22)
[ C_c            D_c    −γI  ]

There are two kinds of LMI solutions to this control problem: one based on variable elimination and one based on variable change.

To state the first solution, define the following matrices first:

N_Y = [C₂  D₂₁]⊥,  N_X = [B₂ᵀ  D₁₂ᵀ]⊥.   (23)

Theorem 2 The H∞ control problem has a solution if and only if Eq. (5) and the following LMIs have positive-definite solutions X, Y:

[ N_X  0 ]ᵀ [ AX + XAᵀ   XC₁ᵀ   B₁  ] [ N_X  0 ]
[ 0    I ]  [ C₁X        −γI    D₁₁ ] [ 0    I ] < 0   (24)
            [ B₁ᵀ        D₁₁ᵀ   −γI ]

[ N_Y  0 ]ᵀ [ YA + AᵀY   YB₁   C₁ᵀ  ] [ N_Y  0 ]
[ 0    I ]  [ B₁ᵀY       −γI   D₁₁ᵀ ] [ 0    I ] < 0.   (25)
            [ C₁         D₁₁   −γI  ]

Once a matrix P is computed according to Lemma 2, Eq. (22) becomes an LMI and its solution yields the controller.

The second solution is given below.

Theorem 3 The H∞ control problem has a solution if and only if Eq. (5) and the following LMI have solutions X, Y and Â, B̂, Ĉ, D̂:

He [ AX + B₂Ĉ     A + B₂D̂C₂     B₁ + B₂D̂D₂₁     0        ]
   [ Â            YA + B̂C₂      YB₁ + B̂D₂₁      0        ]
   [ 0            0              −(γ/2)I          0        ] < 0.   (26)
   [ C₁X + D₁₂Ĉ   C₁ + D₁₂D̂C₂   D₁₁ + D₁₂D̂D₂₁   −(γ/2)I ]
The controller is given by Eq. (16).

Regional Pole Placement

The location of the system poles determines the response quality. However, for uncertain systems it is impossible to place the closed-loop poles at fixed points because they move with the variation of the plant. Nevertheless, it is still possible to place the closed-loop poles inside a region. For convex regions characterized by LMIs, the design method is mature and proven effective in practice.

Let us see how to characterize a convex region. It is easy to see that a complex number z is inside the disk of Fig. 2a if and only if it satisfies

[ −r      z + c ]
[ z̄ + c   −r    ] < 0.

Similarly, z is inside the sector of Fig. 2b if and only if

[ (z + z̄) sin θ   (z − z̄) cos θ ]
[ (z̄ − z) cos θ   (z + z̄) sin θ ] < 0.

LMI Approach to Robust Control, Fig. 2 Typical examples of LMI regions

Generally, the set of complex numbers z characterized by

D = {z ∈ ℂ | L + zM + z̄Mᵀ < 0}   (27)

is called an LMI region, in which L is a symmetric matrix. For the dynamic system

ẋ = Ax,   (28)

all of its poles are in the LMI region D if and only if there is a positive-definite matrix P satisfying the LMI

L ⊗ P + M ⊗ (AP) + Mᵀ ⊗ (AP)ᵀ < 0.   (29)

This forms the basis for the regional pole placement design.

For the disk region in Fig. 2a, the condition becomes

[ −rP        cP + AP ]
[ cP + PAᵀ   −rP     ] < 0.   (30)

Meanwhile, for the sector region in Fig. 2b, the corresponding LMI is

[ (AP + PAᵀ) sin θ   (AP − PAᵀ) cos θ ]
[ (PAᵀ − AP) cos θ   (AP + PAᵀ) sin θ ] < 0.   (31)

Moreover, for a composite LMI region, such as the intersection of the disk and the sector, the pole placement is guaranteed by enforcing a common solution P to all the corresponding LMIs.

In the pole placement design, only the variable change method is applicable. For example, in the nominal closed-loop system Eq. (8), the pole placement condition is that the LMI

L ⊗ [ X  I ] + He( M ⊗ [ AX + BĈ   A + BD̂C ] ) < 0   (32)
    [ I  Y ]            [ Â          YA + B̂C ]

and Eq. (5) are solvable (Chilali and Gahinet 1996).

For systems with norm-bounded parameter uncertainty, a robust pole placement method is provided in Chilali et al. (1999).

Multi-objective Control

It is noted that all of the preceding control designs involve a positive-definite matrix P. Therefore, a multi-objective control design is easily realized
by enforcing a common solution P to the corresponding matrix inequality conditions.

Gain-Scheduled Control

In practice, many nonlinear systems can be expressed in the form of linear systems with state-dependent coefficients, which is known as the LPV (linear parameter-varying) form. For example, in the model of a robot arm, Jθ̈ + mgl sin θ = u, if we define a parameter as p(t) = sin θ/θ, then it can be written as an LPV model Jθ̈(t) + mgl p(t)θ(t) = u(t). In this class of systems, when the parameter p(t) is available online and its range is finite, one may tune the controller parameters based on the information of p(t), so as to achieve a higher performance. This is referred to as gain-scheduled control.

Consider the following affine model:

ẋ = A(p(t))x + B₁(p(t))d + B₂(p(t))u   (33)
z = C₁(p(t))x + D₁₁d + D₁₂u   (34)
y = C₂(p(t))x + D₂₁d   (35)

where A(p), …, C₂(p) are affine functions of the time-varying parameter vector p(t), such as A(p) = A₀ + Σ_{i=1}^{q} pᵢ(t)Aᵢ. The gain-scheduled control is to impose, on the coefficient matrices of the controller, the same affine structure in p(t), such as A_K(p) = A_K0 + Σ_{i=1}^{q} pᵢ(t)A_Ki.

To simplify the design, it is desirable that the coefficient matrices of the closed-loop system become affine functions of the parameter vector p(t). This may be satisfied by restraining some of the matrices of the controller to be constant. The easy-to-design structure of a gain-scheduled controller is summarized as follows:
• Both B₂(p) and C₂(p) depend on p(t): (B_K, C_K) must be constant matrices, besides D_K = 0.
• Constant (B₂, C₂): all coefficient matrices of the controller can be affine functions of the parameter vector p(t).
• Constant B₂: (B_K, D_K) must be constant matrices.
• Constant C₂: (C_K, D_K) must be constant matrices.

When the structure of the gain-scheduled controller is chosen as summarized above, the solvability conditions reduce to those at all vertices of the range of the scheduling parameter vector p(t). Further, a multi-objective design is achieved by imposing a common solution P to all LMI conditions. Some concrete examples are illustrated below:
H∞ norm spec: The conditions of Theorem 3 are satisfied at all vertices of the parameter vector p(t).
H2 norm spec: The conditions of Theorem 1 are satisfied at all vertices of the parameter vector p(t).
Regional pole placement: Eq. (32) is satisfied at all vertices of the parameter vector p(t), and Eq. (5) holds.
Moreover, a different gain-scheduled method is proposed in Packard (1994) for parametric systems with norm-bounded uncertainty.

Summary and Future Direction

The LMI approach is a very powerful method that can be applied to solve most robust control problems smartly and effectively. In particular, its capability of handling multi-objective control problems is very attractive and has proven useful in industrial applications.

Further study is needed in the following directions:
• New methods of variable change are desired in order to deal with the robust performance design of parametric systems.
• Almost all robust performance designs are carried out based on sufficient conditions. It is very important to discover less conservative design methods.

Cross-References

▶ H-Infinity Control
▶ Linear Matrix Inequality Techniques in Optimal Control
▶ Optimization Based Robust Control
▶ Optimization-Based Control Design Techniques and Tools
▶ Robust Synthesis and Robustness Analysis Techniques and Tools
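The scalar characterizations of the LMI regions above are easy to test pointwise. The sketch below (NumPy; the matrix A and the region parameters are made-up illustrative values) evaluates f_D(z) = L + zM + z̄Mᵀ for each eigenvalue of A and declares the pole inside the region when f_D(z) is negative definite. This checks membership pole by pole; the LMI (29) certifies it for all poles at once through a single matrix P.

```python
import numpy as np

def in_lmi_region(z, L, M, tol=1e-9):
    """z lies in {z : L + z*M + conj(z)*M^T < 0} iff f_D(z) is negative definite."""
    F = L + z * M + np.conj(z) * M.T
    return np.linalg.eigvalsh(F).max() < -tol   # F is Hermitian by construction

def disk_region(c, r):
    """Disk of radius r centered at -c on the real axis: [[-r, z+c],[zbar+c, -r]] < 0."""
    L = np.array([[-r, c], [c, -r]], dtype=complex)
    M = np.array([[0.0, 1.0], [0.0, 0.0]], dtype=complex)
    return L, M

def sector_region(theta):
    """Conic sector of half-angle theta about the negative real axis (Fig. 2b type)."""
    s, co = np.sin(theta), np.cos(theta)
    L = np.zeros((2, 2), dtype=complex)
    M = np.array([[s, co], [-co, s]], dtype=complex)
    return L, M

# Made-up stable matrix with eigenvalues -1 and -2
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
poles = np.linalg.eigvals(A)

Ld, Md = disk_region(c=1.5, r=1.6)     # disk centered at -1.5, radius 1.6
Ls, Ms = sector_region(np.pi / 4)      # 45-degree sector
for z in poles:
    print(z, in_lmi_region(z, Ld, Md), in_lmi_region(z, Ls, Ms))
# both poles lie inside both regions, so each line prints True True
```

For a composite region (the intersection mentioned above), a pole qualifies exactly when it passes every individual region test, mirroring the common-P requirement of the composite LMI.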
Bibliography

Apkarian P, Gahinet P (1995) A convex characterization of gain-scheduled H∞ controllers. IEEE Trans Autom Control 40(5):853–864
Boyd SP et al (1994) Linear matrix inequalities in system and control. SIAM, Philadelphia
Chilali M, Gahinet P (1996) H∞ design with pole placement constraints: an LMI approach. IEEE Trans Autom Control 41(3):358–367
Chilali M, Gahinet P, Apkarian P (1999) Robust pole placement in LMI regions. IEEE Trans Autom Control 44(12):2257–2270
Gahinet P (1996) Explicit controller formulas for LMI-based H∞ synthesis. Automatica 32(7):1007–1014
Gahinet P, Apkarian P (1994) A linear matrix inequality approach to H∞ control. Int J Robust Nonlinear Control 4:421–448
Gahinet P, Nemirovski A, Laub AJ, Chilali M (1995) LMI control toolbox. The MathWorks, Natick
Liu KZ, Yao Y (2014, to appear) Robust control: theory and applications. Wiley, New York
Nesterov Y, Nemirovskii A (1994) Interior-point polynomial methods in convex programming. SIAM, Philadelphia
Packard A (1994) Gain scheduling via linear fractional transformations. Syst Control Lett 22:79–92

LTI Systems

▶ Linear Systems: Continuous-Time, Time-Invariant State Variable Descriptions

Lyapunov Methods in Power System Stability

Hsiao-Dong Chiang
School of Electrical and Computer Engineering, Cornell University, Ithaca, NY, USA

Abstract

Energy functions, an extension of Lyapunov functions, have been used in electric power systems for several applications. An overview of energy function theory for general nonlinear autonomous dynamical systems along with its applications to electric power systems is presented. The issue of how to optimally determine the critical level value of an energy function for estimating stability regions of nonlinear dynamical systems is also addressed.

Keywords

Energy function; Lyapunov function theory; Optimal estimation; Power system stability; Stability region

Introduction

Energy functions, an extension of the Lyapunov functions, have been practically used in electric power systems for several applications. A comprehensive energy function theory for general nonlinear autonomous dynamical systems, along with its applications to electric power systems, will be summarized in this article.

We consider a general nonlinear autonomous dynamical system described by the following equation:

ẋ(t) = f(x(t))   (1)

We say a function V: ℝⁿ → ℝ is an energy function for the system (1) if the following three conditions are satisfied (Chiang et al. 1987):
(E1): The derivative of the energy function V(x) along any system trajectory x(t) is nonpositive, i.e., V̇(x(t)) ≤ 0.
(E2): If x(t) is a nontrivial trajectory (i.e., x(t) is not an equilibrium point), then along the nontrivial trajectory x(t) the set {t ∈ ℝ : V̇(x(t)) = 0} has measure zero in ℝ.
(E3): If a trajectory x(t) has a bounded value of V(x(t)) for t ∈ ℝ⁺, then the trajectory x(t) is also bounded.

Condition (E1) indicates that the value of an energy function is nonincreasing along its trajectory, but by itself does not imply that the energy function is strictly decreasing along any trajectory. Conditions (E1) and (E2) together imply that the energy function is strictly decreasing along any system trajectory. Property (E3) states that the energy function is a proper map along any system trajectory but need not be a proper map for the entire state space. Obviously, an energy function may not be a Lyapunov function.
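As a quick numerical check of condition (E1) before the power system illustration that follows, consider the sketch below (NumPy; the single-machine swing model and its parameter values are illustrative choices, not from the article). For the damped pendulum-type system δ̇ = ω, Mω̇ = P − Dω − b sin δ with the standard candidate V(δ, ω) = ½Mω² − Pδ + b(1 − cos δ), the derivative along trajectories is V̇ = ∇V·f = −Dω² ≤ 0, which is exactly (E1); it also vanishes only where ω = 0, consistent with (E2).

```python
import numpy as np

# Illustrative single-machine parameters (made-up values)
M, D, P, b = 1.0, 0.5, 0.4, 1.0

def f(x):
    """Swing dynamics: delta' = omega, M*omega' = P - D*omega - b*sin(delta)."""
    delta, omega = x
    return np.array([omega, (P - D * omega - b * np.sin(delta)) / M])

def gradV(x):
    """Gradient of V(delta, omega) = 0.5*M*omega^2 - P*delta + b*(1 - cos(delta))."""
    delta, omega = x
    return np.array([-P + b * np.sin(delta), M * omega])

# Condition (E1): Vdot = gradV . f = -D*omega^2 <= 0 at every state
rng = np.random.default_rng(0)
for x in rng.uniform(-3.0, 3.0, size=(100, 2)):
    vdot = gradV(x) @ f(x)
    assert vdot <= 1e-12
    assert np.isclose(vdot, -D * x[1] ** 2)
print("(E1) verified: Vdot = -D*omega^2 at all sampled states")
```

Note that this V grows unbounded along any trajectory whose angle diverges while its energy stays bounded is impossible here, illustrating the role of (E3) along trajectories; it is not, however, positive definite, so it is an energy function rather than a Lyapunov function.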
As an illustration of the energy function, we consider the following classical transient stability model and derive an energy function for the model. Consider a power system consisting of n generators. Let the loads be modeled as constant impedances. Under the assumption that the transfer conductance of the reduced network after eliminating all load buses is zero, the dynamics of the ith generator can be represented by the equations

δ̇ᵢ = ωᵢ
Mᵢω̇ᵢ = Pᵢ − Dᵢωᵢ − Σ_{j=1}^{n+1} VᵢVⱼBᵢⱼ sin(δᵢ − δⱼ)   (2)

where the voltage at node n+1 serves as the reference, i.e., δ_{n+1} := 0. This is a version of the so-called classical model of the power system. It can be shown that the following function is an energy function V(δ, ω) which satisfies conditions (E1)–(E3) for the classical model (2):

V(δ, ω) = ½ Σ_{i=1}^{n} Mᵢωᵢ² − Σ_{i=1}^{n} Pᵢ(δᵢ − δᵢˢ)
          − Σ_{i=1}^{n} Σ_{j=i+1}^{n+1} VᵢVⱼBᵢⱼ [cos(δᵢ − δⱼ) − cos(δᵢˢ − δⱼˢ)]   (3)

where xˢ = (δˢ, 0) is the stable equilibrium point under consideration.

Energy Function Theory

In general, the dynamical behaviors of trajectories of general nonlinear systems can be very complicated. The asymptotic behaviors (i.e., the ω-limit sets) of trajectories can be quasiperiodic trajectories or chaotic trajectories. However, as shown below, every trajectory of system (1) having an energy function has only two modes of behavior: the trajectory either converges to an equilibrium point or goes to infinity (becomes unbounded) as time increases. This result is explained in the following theorem:

Theorem 1 (Global Behavior of Trajectories) If there exists a function satisfying condition (E1) and condition (E2) of the energy function for system (1), then every bounded trajectory of system (1) converges to one of the equilibrium points.

Theorem 1 asserts that there does not exist any limit cycle (oscillatory behavior) or bounded complicated behavior, such as an almost periodic trajectory, chaotic motion, etc., in the system. We next show a sharper result, asserting that every trajectory on the stability boundary must converge to one of the unstable equilibrium points (UEPs) on the stability boundary. Recall that a hyperbolic equilibrium point is an (asymptotically) stable equilibrium point if all the eigenvalues of its corresponding Jacobian have negative real parts; otherwise it is an unstable equilibrium point. Let x̂ be a hyperbolic equilibrium point. Its stable and unstable manifolds, Wˢ(x̂) and Wᵘ(x̂), are well defined. There are many physical systems, such as electric power systems, containing multiple stable equilibrium points. A useful concept for these kinds of systems is that of the stability region (also called the region of attraction). The stability region of a stable equilibrium point xₛ is defined as

A(xₛ) := {x ∈ ℝⁿ : lim_{t→∞} Φₜ(x) = xₛ}

The boundary of the stability region A(xₛ) is called the stability boundary of xₛ and will be denoted by ∂A(xₛ).

Theorem 2 (Trajectories on the Stability Boundary (Chiang et al. 1987)) If there exists an energy function for system (1), then every trajectory on the stability boundary ∂A(xₛ) converges to one of the equilibrium points on the stability boundary ∂A(xₛ).

The significance of this theorem is that it offers an effective way to characterize the stability boundary. In fact, Theorem 2 asserts that the stability boundary ∂A(xₛ) is contained in the union of the stable manifolds of the UEPs on the stability boundary, i.e.,
Lyapunov Methods in Power System Stability 683

[
$$\partial A(x_s) \subseteq \bigcup_{x_i \in \{E \cap \partial A(x_s)\}} W^s(x_i)$$

The following two theorems give interesting results on the structure of the equilibrium points on the stability boundary. Moreover, they present a necessary condition for the existence of certain types of equilibrium points on a bounded stability boundary.

Theorem 3 (Structure of Equilibrium Points on the Stability Boundary (Chiang and Thorp 1989)) If there exists an energy function for system (1) which has an asymptotically stable equilibrium point $x_s$ (but not globally asymptotically stable), then the stability boundary $\partial A(x_s)$ must contain at least one type one equilibrium point. If, furthermore, the stability region is bounded, then the stability boundary $\partial A(x_s)$ must contain at least one type one equilibrium point and one source.

Theorem 4 (Sufficient Condition for Unbounded Stability Region (Chiang et al. 1987)) If there exists an energy function for system (1) which has an asymptotically stable equilibrium point $x_s$ (but not globally asymptotically stable) and if $\partial A(x_s)$ contains no source, then the stability region $A(x_s)$ is unbounded.

A direct application of this is that the stability boundary $\partial A(x_s)$ of an (asymptotically) stable equilibrium point of the classical power system stability model (2) is unbounded.

Optimally Estimating Stability Region Using Energy Functions

In this section, we focus on how to optimally determine the critical level value of an energy function for estimating the stability boundary $\partial A(x_s)$. We consider the following set:

$$S_v(k) = \{x \in \mathbb{R}^n : V(x) < k\} \qquad (4)$$

where $V(\cdot) : \mathbb{R}^n \to \mathbb{R}$ is an energy function. We shall call the boundary of set (4), $\partial S(k) := \{x \in \mathbb{R}^n : V(x) = k\}$, the level set (or constant energy surface) and $k$ the level value. Generally speaking, this set $S(k)$ can be very complicated with several connected components even in the 2-dimensional case. We use the notation $S_k(x_s)$ to denote the only component, among the several disjoint connected components of $S(k)$, that contains the stable equilibrium point $x_s$.

Theorem 5 (Optimal Estimation) Consider the nonlinear system (1) which has an energy function $V(x)$. Let $x_s$ be an asymptotically stable equilibrium point whose stability region $A(x_s)$ is not dense in $\mathbb{R}^n$. Let $E_1$ be the set of type one equilibrium points and $\hat{c} = \min_{x_i \in \partial A(x_s) \cap E_1} V(x_i)$; then
1. $S_{\hat{c}}(x_s) \subseteq A(x_s)$.
2. The set $\{S_b(x_s) \cap \bar{A}^c(x_s)\}$ is nonempty for any number $b > \hat{c}$.

This theorem leads to an optimal estimation of the stability region $A(x_s)$ via an energy function $V(\cdot)$ (Chiang and Thorp 1989). For the purpose of illustration, we consider the following simple example:

$$\dot{x}_1 = -\sin x_1 - 0.5 \sin(x_1 - x_2) + 0.01$$
$$\dot{x}_2 = -0.5 \sin x_2 - 0.5 \sin(x_2 - x_1) + 0.05 \qquad (5)$$

It is easy to show that the following function is an energy function for system (5):

$$V(x_1, x_2) = -2\cos x_1 - \cos x_2 - \cos(x_1 - x_2) - 0.02\,x_1 - 0.1\,x_2 \qquad (6)$$

The point $x_s = (x_1^s, x_2^s) = (0.02801, 0.06403)$ is the stable equilibrium point whose stability region is to be estimated. Applying the optimal scheme to system (5), we have the critical level value of 0.31329. Curve A in Fig. 1 is the exact stability boundary $\partial A(x_s)$, while Curve B is the stability boundary estimated by the connected component (containing the s.e.p. $x_s$) of the constant energy surface. It can be seen that the critical level value, 0.31329, is indeed the optimal value.
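The claim that (6) is an energy function for (5) can be checked numerically. In the sketch below (Python with NumPy; the function names and test points are our own), we use the fact, easily verified by differentiation, that the vector field of (5) equals $-\tfrac{1}{2}\nabla V$, so $\dot V = \nabla V \cdot f = -\tfrac{1}{2}\|\nabla V\|^2 \le 0$ along trajectories; we also confirm that the reported s.e.p. is indeed an equilibrium:

```python
import numpy as np

def f(x):
    """Vector field of system (5)."""
    x1, x2 = x
    return np.array([
        -np.sin(x1) - 0.5 * np.sin(x1 - x2) + 0.01,
        -0.5 * np.sin(x2) - 0.5 * np.sin(x2 - x1) + 0.05,
    ])

def V(x):
    """Energy function (6)."""
    x1, x2 = x
    return (-2.0 * np.cos(x1) - np.cos(x2) - np.cos(x1 - x2)
            - 0.02 * x1 - 0.1 * x2)

def V_dot(x):
    """Derivative of V along trajectories of (5): grad V . f."""
    x1, x2 = x
    grad = np.array([
        2.0 * np.sin(x1) + np.sin(x1 - x2) - 0.02,
        np.sin(x2) - np.sin(x1 - x2) - 0.1,
    ])
    return grad @ f(x)

x_s = np.array([0.02801, 0.06403])
assert np.allclose(f(x_s), 0.0, atol=1e-4)   # x_s is an equilibrium of (5)

# V decreases along trajectories (here V_dot = -|grad V|^2 / 2 <= 0)
rng = np.random.default_rng(0)
for x in rng.uniform(-4.0, 4.0, size=(1000, 2)):
    assert V_dot(x) <= 1e-12
print("V of Eq. (6) verified as an energy function for system (5)")
```

Finding the critical level value $\hat{c}$ of Theorem 5 additionally requires locating the type one equilibrium points on $\partial A(x_s)$; that is the numerically demanding step in practice.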
Lyapunov Methods in Power System Stability, Fig. 1 Curve A is the exact stability boundary $\partial A(x_s)$ of system (5), while Curve B is the stability boundary estimated by the constant energy surface (with level value of 0.31329) of the energy function

Constructing Analytical Energy Functions for Transient Stability Models

The task of constructing an energy function for a (post-fault) transient stability model is essential to direct stability analysis of power systems. The role of the energy function is to make feasible a direct determination of whether a given point (such as the initial point of a post-fault power system) lies inside the stability region of the post-fault SEP without performing numerical integration. It has been shown that a general (analytical) energy function for power systems with losses does not exist (Chiang 1989). One key implication is that any general procedure attempting to construct an energy function for a lossy power system transient stability model must include a step that checks for the existence of an energy function. This step essentially plays the same role as the Lyapunov equation in determining the stability of an equilibrium point.

Several schemes are available for constructing numerical energy functions for power system transient stability models expressed as a set of general differential-algebraic equations (DAEs) (Chu and Chiang 1999, 2005).

Applications

After decades of research and development in the energy-function-based direct methods and the time-domain simulation approach, it has become clear that the capabilities of direct methods and those of the time-domain approach complement each other. The current direction of development is to include appropriate direct methods and time-domain simulation programs within the body of overall power system stability simulation programs (Chiang 1999, 2011; Chiang et al. 1995; Fouad and Vittal 1991; Sauer and Pai 1998). For example, the direct method provides the advantages of fast computational speed and energy margins, which make it a good complement to the traditional time-domain approach. The energy margin and its functional relations to certain power system parameters are an effective basis for developing tools such as preventive control schemes for credible contingencies which are unstable, and for developing fast calculators for available transfer capability limited by transient stability.

An effective, theory-based methodology for online screening and ranking of a large set of contingencies at operating points obtained from state estimators has been developed in Chiang et al. (2013). A set of improved BCU classifiers, along with their analytical basis, has been developed. Extensive evaluation of the improved BCU classifiers on a large test system and on the actual PJM interconnection system for fast screening has been performed. This evaluation study is the largest, in terms of system size (14,500 buses and 3,000 generators), for a practical online transient stability assessment application. The evaluation results, obtained on a total of 5.3 million contingencies, were very promising in terms of speed, accuracy, reliability, and robustness (Chiang et al. 2013). This study also confirms the practicality of theory-based methodology for online transient stability assessment of large-scale power systems; in particular, theory-based methods are suitable for power system online applications, which demand speed, accuracy, reliability, and robustness.
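The construction discussion above noted that checking for the existence of an energy function plays the same role as the Lyapunov equation does for linear systems. For that linear baseline the test is fully computable: the sketch below (Python with NumPy; the system matrix and $Q$ are our own toy choices) solves $A^T P + PA = -Q$ by Kronecker-product vectorization and checks that the solution $P$ is positive definite, which certifies exponential stability of $\dot{x} = Ax$ with $V(x) = x^T P x$:

```python
import numpy as np

# For xdot = A x with A Hurwitz, the Lyapunov equation
# A^T P + P A = -Q (Q > 0) has a unique solution P > 0,
# and V(x) = x^T P x is a quadratic Lyapunov (energy-like) function.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])   # eigenvalues -1 and -2
Q = np.eye(2)

# Vectorize (column-major): (I (x) A^T + A^T (x) I) vec(P) = -vec(Q)
n = A.shape[0]
M = np.kron(np.eye(n), A.T) + np.kron(A.T, np.eye(n))
P = np.linalg.solve(M, -Q.flatten(order="F")).reshape(n, n, order="F")

assert np.allclose(A.T @ P + P @ A, -Q)               # equation satisfied
assert np.all(np.linalg.eigvalsh((P + P.T) / 2) > 0)  # P positive definite
print(P)   # [[1.25, 0.25], [0.25, 0.25]]
```

For the nonlinear, lossy transient stability models discussed above no such direct test exists, which is why the existence-checking step must be built into any numerical construction scheme.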
Cross-References

▶ Lyapunov's Stability Theory
▶ Model Order Reduction: Techniques and Tools
▶ Power System Voltage Stability
▶ Small Signal Stability in Electric Power Systems
▶ Time-Scale Separation in Power System Swing Dynamics: Singular Perturbations and Coherency

Recommended Reading

A recent book which contains a comprehensive treatment of energy function theory and applications is Chiang (2011).

Bibliography

Chiang HD (1989) Study of the existence of energy functions for power systems with losses. IEEE Trans Circuits Syst CAS-36(11):1423–1429
Chiang HD (1999) Power system stability. In: Webster JG (ed) Wiley encyclopedia of electrical and electronics engineering. Wiley, New York, pp 104–137
Chiang HD (2011) Direct methods for stability analysis of electric power systems: theoretical foundation, BCU methodologies, and applications. Wiley, Hoboken
Chiang HD, Thorp JS (1989) Stability regions of nonlinear dynamical systems: a constructive methodology. IEEE Trans Autom Control 34(12):1229–1241
Chiang HD, Wu FF, Varaiya PP (1987) Foundations of direct methods for power system transient stability analysis. IEEE Trans Circuits Syst CAS-34(2):160–173
Chiang HD, Chu CC, Cauley G (1995) Direct stability analysis of electric power systems using energy functions: theory, applications and perspective (invited paper)
Chiang HD, Li H, Tong J, Tada Y (2013) On-line transient stability screening of a practical 14,500-bus power system: methodology and evaluations. In: Khaitan SK, Gupta A (eds) High performance computing in power and energy systems. Springer, Berlin/New York, pp 335–358. ISBN:978-3-642-32682-0
Chu CC, Chiang HD (1999) Constructing analytical energy functions for network-reduction power systems with models: framework and developments. Circuits Syst Signal Process 18(1):1–16
Chu CC, Chiang HD (2005) Constructing analytical energy functions for network-preserving power system models. Circuits Syst Signal Process 24(4):363–383
Fouad AA, Vittal V (1991) Power system transient stability analysis: using the transient energy function method. Prentice-Hall, Englewood Cliffs
Sauer PW, Pai MA (1998) Power system dynamics and stability. Prentice-Hall, Upper Saddle River

Lyapunov's Stability Theory

Hassan K. Khalil
Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI, USA

Abstract

Lyapunov's theory for characterizing and studying the stability of equilibrium points is presented for time-invariant and time-varying systems modeled by ordinary differential equations.

Keywords

Asymptotic stability; Equilibrium point; Exponential stability; Global asymptotic stability; Hurwitz matrix; Invariance principle; Linearization; Lipschitz condition; Lyapunov function; Lyapunov surface; Negative (semi-)definite function; Perturbed system; Positive (semi-)definite function; Region of attraction; Stability; Time-invariant system; Time-varying system

Introduction

Stability theory plays a central role in systems theory and engineering. For systems represented by state models, stability is characterized by studying the asymptotic behavior of the state variables near steady-state solutions, like equilibrium points or periodic orbits. In this article, Lyapunov's method for determining the stability of equilibrium points is introduced. The attractive features of the method include a solid
theoretical foundation, the ability to conclude stability without knowledge of the solution (no extensive simulation effort), and an analytical framework that makes it possible to study the effect of model perturbations and design feedback control. Its main drawback is the need to search for an auxiliary function that satisfies certain conditions.

Stability of Equilibrium Points

We consider a nonlinear system represented by the state model

$$\dot{x} = f(x) \qquad (1)$$

where the $n$-dimensional locally Lipschitz function $f(x)$ is defined for all $x$ in a domain $D \subset \mathbb{R}^n$. A function $f(x)$ is locally Lipschitz at a point $x_0$ if it satisfies the Lipschitz condition $\|f(x) - f(y)\| \le L\|x - y\|$ for all $x, y$ in some neighborhood of $x_0$, where $L$ is a positive constant and $\|x\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$. The Lipschitz condition guarantees that Eq. (1) has a unique solution for a given initial state $x(0)$. Suppose $\bar{x} \in D$ is an equilibrium point of Eq. (1); that is, $f(\bar{x}) = 0$. Whenever the state of the system starts at $\bar{x}$, it will remain at $\bar{x}$ for all future time. Our goal is to characterize and study the stability of $\bar{x}$. For convenience, we take $\bar{x} = 0$. There is no loss of generality in doing so because any equilibrium point $\bar{x}$ can be shifted to the origin via the change of variables $y = x - \bar{x}$. Therefore, we shall always assume that $f(0) = 0$ and study stability of the origin $x = 0$.

The equilibrium point $x = 0$ of Eq. (1) is stable if for each $\varepsilon > 0$, there is $\delta = \delta(\varepsilon) > 0$ such that $\|x(0)\| < \delta$ implies that $\|x(t)\| < \varepsilon$ for all $t \ge 0$. It is asymptotically stable if it is stable and $\delta$ can be chosen such that $\|x(0)\| < \delta$ implies that $x(t)$ converges to the origin as $t$ tends to infinity. When the origin is asymptotically stable, the region of attraction (also called region of asymptotic stability, domain of attraction, or basin) is defined as the set of all points $x$ such that the solution of Eq. (1) that starts from $x$ at time $t = 0$ approaches the origin as $t$ tends to $\infty$. When the region of attraction is the whole space, we say that the origin is globally asymptotically stable. A stronger form of asymptotic stability arises when there exist positive constants $c$, $k$, and $\lambda$ such that the solutions of Eq. (1) satisfy the inequality

$$\|x(t)\| \le k\|x(0)\| e^{-\lambda t}, \quad \forall\, t \ge 0 \qquad (2)$$

for all $\|x(0)\| < c$. In this case, the equilibrium point $x = 0$ is said to be exponentially stable. It is said to be globally exponentially stable if the inequality is satisfied for any initial state $x(0)$.

Linear Systems
For the linear time-invariant system

$$\dot{x} = Ax \qquad (3)$$

the stability properties of the origin can be determined by the location of the eigenvalues of $A$. The origin is stable if and only if all the eigenvalues of $A$ satisfy $\mathrm{Re}[\lambda_i] \le 0$ and, for every eigenvalue with $\mathrm{Re}[\lambda_i] = 0$ and algebraic multiplicity $q_i \ge 2$, $\mathrm{rank}(A - \lambda_i I) = n - q_i$, where $n$ is the dimension of $x$ and $q_i$ is the multiplicity of $\lambda_i$ as a zero of $\det(\lambda I - A)$. The origin is globally exponentially stable if and only if all eigenvalues of $A$ have negative real parts; that is, $A$ is a Hurwitz matrix. For linear systems, the notions of asymptotic and exponential stability are equivalent because the solution is formed of exponential modes. Moreover, due to linearity, if the origin is exponentially stable, then the inequality of Eq. (2) will hold for all initial states.

Linearization
Suppose the function $f(x)$ of Eq. (1) is continuously differentiable in a domain $D$ containing the origin. The Jacobian matrix $[\partial f/\partial x]$ is an $n \times n$ matrix whose $(i,j)$ element is $\partial f_i/\partial x_j$. Let $A$ be the Jacobian matrix evaluated at the origin $x = 0$. It can be shown that

$$f(x) = [A + G(x)]x, \quad \text{where } \lim_{x \to 0} G(x) = 0$$
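The eigenvalue tests for the linear system (3) can be sketched in a few lines (Python with NumPy; the function name, tolerance, and example matrix are our own):

```python
import numpy as np

def linearization_test(A, tol=1e-9):
    """Classify the origin from the eigenvalues of the matrix A."""
    lam = np.linalg.eigvals(A)
    if np.all(lam.real < -tol):
        return "exponentially stable"   # A is Hurwitz
    if np.any(lam.real > tol):
        return "unstable"
    return "inconclusive"               # some Re[lambda_i] = 0

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])            # eigenvalues -1 and -2
print(linearization_test(A))            # exponentially stable
```

The same test, applied to the Jacobian of a nonlinear system at the origin, gives the conclusions stated in the next paragraph; the "inconclusive" branch reflects the fact that eigenvalues on the imaginary axis leave the stability of the nonlinear system undecided.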
This suggests that in a small neighborhood of the origin we can approximate the nonlinear system $\dot{x} = f(x)$ by its linearization about the origin $\dot{x} = Ax$. Indeed, we can draw conclusions about the stability of the origin as an equilibrium point for the nonlinear system by examining the eigenvalues of $A$. The origin of Eq. (1) is exponentially stable if and only if $A$ is Hurwitz. It is unstable if $\mathrm{Re}[\lambda_i] > 0$ for one or more of the eigenvalues of $A$. If $\mathrm{Re}[\lambda_i] \le 0$ for all $i$, with $\mathrm{Re}[\lambda_i] = 0$ for some $i$, we cannot draw a conclusion about the stability of the origin of Eq. (1).

Lyapunov's Method

Let $V(x)$ be a continuously differentiable scalar function defined in a domain $D \subset \mathbb{R}^n$ that contains the origin. The function $V(x)$ is said to be positive definite if $V(0) = 0$ and $V(x) > 0$ for $x \ne 0$. It is said to be positive semidefinite if $V(x) \ge 0$ for all $x$. A function $V(x)$ is said to be negative definite or negative semidefinite if $-V(x)$ is positive definite or positive semidefinite, respectively. The derivative of $V$ along the trajectories of Eq. (1) is given by

$$\dot{V}(x) = \sum_{i=1}^{n} \frac{\partial V}{\partial x_i}\,\dot{x}_i = \frac{\partial V}{\partial x} f(x)$$

where $[\partial V/\partial x]$ is a row vector whose $i$th component is $\partial V/\partial x_i$.

Lyapunov's stability theorem states that the origin is stable if there is a continuously differentiable positive definite function $V(x)$ so that $\dot{V}(x)$ is negative semidefinite, and it is asymptotically stable if $\dot{V}(x)$ is negative definite. A function $V(x)$ satisfying the conditions for stability is called a Lyapunov function. The surface $V(x) = c$, for some $c > 0$, is called a Lyapunov surface or a level surface.

When $\dot{V}(x)$ is only negative semidefinite, we may still conclude asymptotic stability of the origin if we can show that no solution can stay identically in the set $\{\dot{V}(x) = 0\}$, other than the zero solution $x(t) \equiv 0$. Under this condition, $V(x(t))$ must decrease toward 0, and consequently $x(t)$ converges to zero as $t$ tends to infinity. This extension of the basic theorem is known as the invariance principle.

Lyapunov functions can be used to estimate the region of attraction of an asymptotically stable origin, that is, to find sets contained in the region of attraction. Let $V(x)$ be a Lyapunov function that satisfies the conditions of asymptotic stability over a domain $D$. For a positive constant $c$, let $\Omega_c$ be the component of $\{V(x) \le c\}$ that contains the origin in its interior. The properties of $V$ guarantee that, by choosing $c$ small enough, $\Omega_c$ will be bounded and contained in $D$. Then, every trajectory starting in $\Omega_c$ remains in $\Omega_c$ and approaches the origin as $t \to \infty$. Thus, $\Omega_c$ is an estimate of the region of attraction. If $D = \mathbb{R}^n$ and $V(x)$ is radially unbounded, that is, $\|x\| \to \infty$ implies that $V(x) \to \infty$, then any point $x \in \mathbb{R}^n$ can be included in a bounded set $\Omega_c$ by choosing $c$ large enough. Therefore, the origin is globally asymptotically stable if there is a continuously differentiable, radially unbounded function $V(x)$ such that for all $x \in \mathbb{R}^n$, $V(x)$ is positive definite and $\dot{V}(x)$ is either negative definite or negative semidefinite but no solution can stay identically in the set $\{\dot{V}(x) = 0\}$ other than the zero solution $x(t) \equiv 0$.

Time-Varying Systems
Equation (1) is time-invariant because $f$ does not depend on $t$. The more general time-varying system is represented by

$$\dot{x} = f(t, x) \qquad (4)$$

In this case, we may allow the Lyapunov function candidate $V$ to depend on $t$. Let $V(t, x)$ be a continuously differentiable function defined for all $t \ge 0$ and $x \in D$. The derivative of $V$ along the trajectories of Eq. (4) is given by

$$\dot{V}(t, x) = \frac{\partial V}{\partial t} + \frac{\partial V}{\partial x} f(t, x)$$

If there are positive definite functions $W_1(x)$, $W_2(x)$, and $W_3(x)$ such that

$$W_1(x) \le V(t, x) \le W_2(x) \qquad (5)$$
$$\dot{V}(t, x) \le -W_3(x) \qquad (6)$$
for all $t \ge 0$ and all $x \in D$, then the origin is uniformly asymptotically stable, where "uniformly" indicates that the $\varepsilon$–$\delta$ definition of stability and the convergence of $x(t)$ to zero are independent of the initial time $t_0$. Such a uniformity annotation is not needed with time-invariant systems since the solution of a time-invariant state equation starting at time $t_0$ depends only on the difference $t - t_0$, which is not the case for time-varying systems. If the inequalities of Eqs. (5) and (6) hold globally and $W_1(x)$ is radially unbounded, then the origin is globally uniformly asymptotically stable. If $W_1(x) = k_1\|x\|^a$, $W_2(x) = k_2\|x\|^a$, and $W_3(x) = k_3\|x\|^a$ for some positive constants $k_1$, $k_2$, $k_3$, and $a$, then the origin is exponentially stable.

Perturbed Systems
Consider the system

$$\dot{x} = f(t, x) + g(t, x) \qquad (7)$$

where $f$ and $g$ are continuous in $t$ and locally Lipschitz in $x$, for all $t \ge 0$ and $x \in D$, in which $D \subset \mathbb{R}^n$ is a domain that contains the origin $x = 0$. Suppose $f(t, 0) = 0$ and $g(t, 0) = 0$ so that the origin is an equilibrium point of Eq. (7). We think of the system (7) as a perturbation of the nominal system (4). The perturbation term $g(t, x)$ could result from modeling errors, uncertainties, or disturbances. In a typical situation, we do not know $g(t, x)$, but we know some information about it, like knowing an upper bound on $\|g(t, x)\|$. Suppose the nominal system has an exponentially stable equilibrium point at the origin; what can we say about the stability of the origin as an equilibrium point of the perturbed system? A natural approach to address this question is to use a Lyapunov function for the nominal system as a Lyapunov function candidate for the perturbed system.

Let $V(t, x)$ be a Lyapunov function that satisfies

$$c_1\|x\|^2 \le V(t, x) \le c_2\|x\|^2 \qquad (8)$$
$$\frac{\partial V}{\partial t} + \frac{\partial V}{\partial x} f(t, x) \le -c_3\|x\|^2 \qquad (9)$$
$$\left\|\frac{\partial V}{\partial x}\right\| \le c_4\|x\| \qquad (10)$$

for all $x \in D$, for some positive constants $c_1$, $c_2$, $c_3$, and $c_4$. Suppose the perturbation term $g(t, x)$ satisfies the linear growth bound

$$\|g(t, x)\| \le \gamma\|x\|, \quad \forall\, t \ge 0,\ \forall\, x \in D \qquad (11)$$

where $\gamma$ is a nonnegative constant. We use $V$ as a Lyapunov function candidate to investigate the stability of the origin as an equilibrium point for the perturbed system. The derivative of $V$ along the trajectories of Eq. (7) is given by

$$\dot{V}(t, x) = \frac{\partial V}{\partial t} + \frac{\partial V}{\partial x} f(t, x) + \frac{\partial V}{\partial x} g(t, x)$$

The first two terms on the right-hand side are the derivative of $V(t, x)$ along the trajectories of the nominal system, which is negative definite and satisfies the inequality of Eq. (9). The third term, $[\partial V/\partial x]g$, is the effect of the perturbation. Using Eqs. (9) through (11), we obtain

$$\dot{V}(t, x) \le -c_3\|x\|^2 + \left\|\frac{\partial V}{\partial x}\right\| \|g(t, x)\| \le -c_3\|x\|^2 + c_4\gamma\|x\|^2$$

If $\gamma < c_3/c_4$, then

$$\dot{V}(t, x) \le -(c_3 - \gamma c_4)\|x\|^2, \quad (c_3 - \gamma c_4) > 0$$

which shows that the origin is an exponentially stable equilibrium point of the perturbed system (7).

Summary

Lyapunov's method is a powerful tool for studying the stability of equilibrium points. However, there are two drawbacks of the method. First,
there is no systematic procedure for finding Lyapunov functions. Second, the conditions of the theory are only sufficient; they are not necessary. Failure of a Lyapunov function candidate to satisfy the conditions for stability or asymptotic stability does not mean that the origin is not stable or asymptotically stable. These drawbacks have been mitigated by a long history of using the method in the analysis and design of engineering systems, where various techniques for finding Lyapunov functions for specific systems have been determined.

Cross-References

▶ Feedback Stabilization of Nonlinear Systems
▶ Input-to-State Stability
▶ Regulation and Tracking of Nonlinear Systems

Recommended Reading

For an introduction to Lyapunov's stability theory at the level of first-year graduate students, the textbooks Khalil (2002), Sastry (1999), Slotine and Li (1991), and Vidyasagar (2002) are recommended. The books by Bacciotti and Rosier (2005) and Haddad and Chellaboina (2008) cover a wider set of topics at the same introductory level. A deeper look into the theory is provided in the monographs Hahn (1967), Krasovskii (1963), Rouche et al. (1977), Yoshizawa (1966), and Zubov (1964). Lyapunov's theory for discrete-time systems is presented in Haddad and Chellaboina (2008) and Qu (1998). The monograph Michel and Wang (1995) presents Lyapunov's stability theory for general dynamical systems, including functional and partial differential equations.

Bibliography

Bacciotti A, Rosier L (2005) Liapunov functions and stability in control theory, 2nd edn. Springer, Berlin
Haddad WM, Chellaboina V (2008) Nonlinear dynamical systems and control. Princeton University Press, Princeton
Hahn W (1967) Stability of motion. Springer, New York
Khalil HK (2002) Nonlinear systems, 3rd edn. Prentice Hall, Princeton
Krasovskii NN (1963) Stability of motion. Stanford University Press, Stanford
Michel AN, Wang K (1995) Qualitative theory of dynamical systems. Marcel Dekker, New York
Qu Z (1998) Robust control of nonlinear uncertain systems. Wiley-Interscience, New York
Rouche N, Habets P, Laloy M (1977) Stability theory by Lyapunov's direct method. Springer, New York
Sastry S (1999) Nonlinear systems: analysis, stability, and control. Springer, New York
Slotine J-JE, Li W (1991) Applied nonlinear control. Prentice-Hall, Englewood Cliffs
Vidyasagar M (2002) Nonlinear systems analysis, classic edn. SIAM, Philadelphia
Yoshizawa T (1966) Stability theory by Liapunov's second method. The Mathematical Society of Japan, Tokyo
Zubov VI (1964) Methods of A. M. Lyapunov and their applications. P. Noordhoff Ltd., Groningen
Markov Chains and Ranking Problems in Web Search

Hideaki Ishii¹ and Roberto Tempo²
¹Tokyo Institute of Technology, Yokohama, Japan
²CNR-IEIIT, Politecnico di Torino, Torino, Italy

Abstract

Markov chains refer to stochastic processes whose states change according to transition probabilities determined only by the states of the previous time step. They have been crucial for modeling large-scale systems with random behavior in various fields such as control, communications, biology, optimization, and economics. In this entry, we focus on their recent application to the area of search engines, namely, the PageRank algorithm employed at Google, which provides a measure of importance for each page in the web. We present several research efforts carried out with control theoretic tools such as aggregation, distributed randomized algorithms, and PageRank optimization. Due to the large size of the web, computational issues are the underlying motivation of these studies.

Keywords

Aggregation; Distributed randomized algorithms; Markov chains; Optimization; PageRank; Search engines; World wide web

Introduction

For various real-world large-scale dynamical systems, reasonable models describing highly complex behaviors can be expressed as stochastic systems, and one of the most well-studied classes of such systems is that of Markov chains. A characteristic feature of Markov chains is that their behavior does not carry any memory. That is, the current state of a chain is completely determined by the state of the previous time step and not at all by the states prior to that step (Kumar and Varaiya 1986; Norris 1997).

Recently, Markov chains have gained renewed interest due to the extremely successful applications in the area of web search. The search engine of Google has been employing an algorithm known as PageRank to assist the ranking of search results. This algorithm models the network of web pages as a Markov chain whose states represent the pages that web surfers with various interests visit in a random fashion. The objective is to find an order among the pages according to their

popularity and importance, and this is done by focusing on the structure of hyperlinks among pages.

In this entry, we first provide a brief overview on the basics of Markov chains and then introduce the problem of PageRank computation. We proceed to provide further discussions on control theoretic approaches dealing with PageRank problems. The topics covered include aggregated Markov chains, distributed randomized algorithms, and Markov decision problems for link optimization.

Markov Chains

In the simplest form, a Markov chain takes its states in a finite state space with transitions in the discrete-time domain. The transition from one state to another is characterized completely by the underlying probability distribution.

Let $\mathcal{X}$ be a finite set given by $\mathcal{X} := \{1, 2, \ldots, n\}$, which is called the state space. Consider a stochastic process $\{X_k\}_{k=0}^{\infty}$ taking values on this set $\mathcal{X}$. Such a process is called a Markov chain if it exhibits the following Markov property:

$$\mathrm{Prob}\{X_{k+1} = j \mid X_k = i_k, X_{k-1} = i_{k-1}, \ldots, X_0 = i_0\} = \mathrm{Prob}\{X_{k+1} = j \mid X_k = i_k\},$$

where $\mathrm{Prob}\{\cdot \mid \cdot\}$ denotes the conditional probability and $k \in \mathbb{Z}_+$. That is, the state at the next time step depends only on the current state and not those of previous times.

Here, we consider the homogeneous case where the transition probability is constant over time. Thus, we have for each pair $i, j \in \mathcal{X}$, the probability that the chain goes from state $j$ to state $i$ at time $k$ expressed as

$$p_{ij} := \mathrm{Prob}\{X_k = i \mid X_{k-1} = j\}, \quad k \in \mathbb{Z}_+.$$

In matrix form, $P := (p_{ij})$ is called the transition probability matrix of the chain. It is obvious that all entries of $P$ are nonnegative, and for each $j$, the entries of the $j$th column of $P$ sum up to 1, i.e., $\sum_{i=1}^{n} p_{ij} = 1$ for $j \in \mathcal{X}$. In this respect, the matrix $P$ is (column) stochastic (Horn and Johnson 1985).

In this entry, we assume that the Markov chain is ergodic, meaning that for any pair of states, the chain can make a transition from one to the other over time. In this case, the chain and the matrix $P$ are called irreducible. This property is known to imply that $P$ has a simple eigenvalue of 1. Thus, there exists a unique steady state probability distribution $\pi \in \mathbb{R}^n$ given by

$$\pi = P\pi, \quad \mathbf{1}^T \pi = 1, \quad \pi_i > 0,\ \forall i \in \mathcal{X},$$

where $\mathbf{1} \in \mathbb{R}^n$ denotes a vector with entries one. Note that in this distribution $\pi$, all entries are positive.

Ranking in Search Engines: PageRank Algorithm

At Google, PageRank is used to quantify the importance of each web page based on the hyperlink structure of the web (Brin and Page 1998; Langville and Meyer 2006). A page is considered important if (i) many pages have links pointing to the page, (ii) such pages having links are important ones, and (iii) the numbers of links that such pages have are limited. Intuitively, these requirements are reasonable. For a web page, its incoming links can be viewed as votes supporting the page, and moreover the quality of the votes counts through their importance as well as the number of votes that they make. Even if a minor page (with low PageRank) has many outgoing links, its contribution to the linked pages will not be substantial.

An interesting way to explain PageRank is through the random surfer model: the random surfer starts from a randomly chosen page. Each time visiting a page, he/she follows a hyperlink in that page chosen at random with uniform probability. Hence, if the current page $i$ has $n_i$ outgoing links, then one of them is picked with probability $1/n_i$. If it happens that the current page has no outgoing link (e.g., at PDF documents), the surfer will use the back button. This process
will be repeated. The PageRank value for a page represents the probability of the surfer visiting the page. It is thus higher for pages visited more often by the surfer.

It is now clear that PageRank is obtained by describing the random surfer model as a Markov chain and then finding its stationary distribution. First, we express the network of web pages as the directed graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V} = \{1, 2, \ldots, n\}$ is the set of nodes corresponding to web page indices while $\mathcal{E} \subset \mathcal{V} \times \mathcal{V}$ is the set of edges for links among pages. Node $i$ is connected to node $j$ by an edge, i.e., $(i, j) \in \mathcal{E}$, if page $i$ has an outgoing link to page $j$.

Let $x_i(k)$ be the distribution of the random surfer visiting page $i$ at time $k$, and let $x(k)$ be the vector containing all $x_i(k)$. Given the initial distribution $x(0)$, which is a probability vector, i.e., $\sum_{i=1}^{n} x_i(0) = 1$, the evolution of $x(k)$ can be expressed as

$$x(k+1) = A\, x(k). \qquad (1)$$

The link matrix $A = (a_{ij}) \in \mathbb{R}^{n \times n}$ is given by $a_{ij} = 1/n_j$ if $(j, i) \in \mathcal{E}$ and 0 otherwise, where $n_j$ is the number of outgoing links of page $j$. Note that this matrix $A$ is the transition probability matrix of the random surfer. Clearly, it is stochastic, and thus $x(k)$ remains a probability vector so that $\sum_{i=1}^{n} x_i(k) = 1$ for all $k$.

As mentioned above, PageRank is the stationary distribution of the process (1) under the assumption that the limit exists. Hence, the PageRank vector is given by $x^* := \lim_{k \to \infty} x(k)$. In other words, it is the solution of the linear equation

$$x^* = A x^*, \quad x^* \in [0, 1]^n, \quad \mathbf{1}^T x^* = 1. \qquad (2)$$

Notice that the PageRank vector $x^*$ is a nonnegative unit eigenvector for the eigenvalue 1 of $A$. Such a vector exists since the matrix $A$ is stochastic, but may not be unique; the reason is that $A$ is a reducible matrix since in the web, not every pair of pages can be connected by simply following links. To resolve this issue, a slight modification is necessary in the random surfer model.

The idea of the teleportation model is that the random surfer, after a while, becomes bored and stops following the hyperlinks. At such an instant, the surfer "jumps" to another page not directly connected to the one currently visited. This page can in fact be completely unrelated in the domains and/or the contents. All $n$ pages in the web have the same probability $1/n$ of being reached by a jump.

The probability of making such a jump is denoted by $m \in (0, 1)$. The original transition probability matrix $A$ is now replaced with the modified one $M \in \mathbb{R}^{n \times n}$ defined by

$$M := (1 - m)A + \frac{m}{n}\,\mathbf{1}\mathbf{1}^T. \qquad (3)$$

For the value of $m$, we take $m = 0.15$ as reported in the original algorithm in Brin and Page (1998). Notice that $M$ is a positive stochastic matrix. By Perron's theorem (Horn and Johnson 1985), the eigenvalue 1 is of multiplicity 1 and is the unique eigenvalue with the maximum modulus. Further, the corresponding eigenvector is positive. Hence, we redefine the vector $x^*$ in (2) by using $M$ instead of $A$ as follows:

$$x^* = M x^*, \quad x^* \in [0, 1]^n, \quad \mathbf{1}^T x^* = 1.$$

Due to the large dimension of the link matrix $M$, the computation of $x^*$ is difficult. The solution employed in practice is based on the power method given by

$$x(k+1) = M x(k) = (1 - m)A\, x(k) + \frac{m}{n}\,\mathbf{1}, \qquad (4)$$

where the initial vector $x(0) \in \mathbb{R}^n$ is a probability vector. The second equality above follows from the fact that $\mathbf{1}^T x(k) = 1$ for $k \in \mathbb{Z}_+$. For implementation, the form on the far right-hand side is important, using only the sparse matrix $A$ and not the dense matrix $M$. This method asymptotically finds the value vector as $x(k) \to x^*$, $k \to \infty$.
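The power iteration (4) is straightforward to implement, and only the hyperlink matrix $A$ is ever multiplied. The sketch below (Python with NumPy; the 4-page link matrix is our own toy example, and dense arrays stand in for the sparse storage used in practice) iterates until the update stalls in the 1-norm:

```python
import numpy as np

def pagerank(A, m=0.15, tol=1e-12):
    """Power method of Eq. (4): x(k+1) = (1-m) A x(k) + (m/n) 1.
    A is the column-stochastic link matrix; the dense matrix M
    of Eq. (3) is never formed."""
    n = A.shape[0]
    x = np.full(n, 1.0 / n)                 # uniform initial probability vector
    while True:
        x_next = (1.0 - m) * (A @ x) + m / n
        if np.abs(x_next - x).sum() < tol:  # 1-norm stopping test
            return x_next
        x = x_next

# Toy 4-page web: column j spreads probability 1/n_j over the
# pages that page j links to (each column sums to 1).
A = np.array([[0.0, 0.0, 1.0, 0.5],
              [0.5, 0.0, 0.0, 0.0],
              [0.5, 0.5, 0.0, 0.5],
              [0.0, 0.5, 0.0, 0.0]])
x_star = pagerank(A)
assert np.isclose(x_star.sum(), 1.0) and np.all(x_star > 0)
print(np.argsort(-x_star))   # page indices ordered by decreasing PageRank
```

Since the iteration map contracts the 1-norm with factor $1 - m = 0.85$, convergence to $x^*$ is geometric regardless of the initial probability vector.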
Aggregation Methods for Large-Scale Markov Chains

In dealing with large-scale Markov chains, it is often desirable to predict their dynamic behaviors from reduced-order models that are more computationally tractable. This enables us, for example, to analyze the system performance at a macroscale with some approximation under different operating conditions. Aggregation refers to partitioning or grouping the states so that the states in each group can be treated as a whole. The technique of aggregation is especially effective for Markov chains possessing sparse structures with strong interactions among states in the same group and weak interactions among states in different groups. Such methods have been extensively studied, motivated by applications in queueing networks, power systems, etc. (Meyer 1989).

In the context of the PageRank problem, such sparse interconnection can be expressed in the link matrix $A$ with a block-diagonal structure (after some coordinate change, if necessary). The entries of the matrix $A$ are dense along its diagonal in blocks, and those outside the blocks take small values. More concretely, we write

$$A = I + B + \epsilon C, \qquad (5)$$

where $B$ is a block-diagonal matrix given as $B = \mathrm{diag}(B_{11}, B_{22}, \ldots, B_{NN})$; $B_{ii}$ is the $\tilde{n}_i \times \tilde{n}_i$ matrix corresponding to the $i$th group with $\tilde{n}_i$ member pages for $i = 1, 2, \ldots, N$; and $\epsilon$ is a small positive parameter. Here, the non-diagonal entries of $B_{ii}$ are the same as those in the corresponding diagonal block of $A$, but the diagonal entries are chosen such that $I + B_{ii}$ becomes stochastic and

To exploit the sparse structure in the computation of stationary probability distributions, one approach is to carry out decomposition or aggregation of the chains. The basic approach here is (i) to compute the local stationary distributions for $I + B_{ii}$, (ii) to find the global stationary distribution for a chain representing the group interactions, and (iii) to finally use the obtained vectors to compute the exact/approximate distribution for the entire chain; for details, see Meyer (1989). By interpreting such methods from the control theoretic viewpoint, singular perturbation approaches have been developed in Phillips and Kokotovic (1981) and Aldhaheri and Khalil (1991). These methods lead us to the two-time-scale decomposition of (controlled) Markov chain recursions.

In the case of PageRank computation, sparsity is a relevant property since it is well known that many links in the web are intra-host ones, connecting pages within the same domains or directories. However, in the real web, it is easy to find pages that have only a few outlinks, but some of them are external ones. Such pages will prevent the link matrix from having small $\epsilon$ when decomposed in the form (5). Hence, the general aggregation methods outlined above are not directly applicable.

An aggregation-based method suitable for PageRank computation is proposed in Ishii et al. (2012). There, the sparsity in the web is expressed by the limited number of external links pointing towards pages in other groups. For each page $i$, the node parameter $\delta_i \in [0, 1]$ is given by

$$\delta_i := \frac{\#\ \text{external outgoing links}}{\#\ \text{total outgoing links}}.$$

Note that smaller $\delta_i$ implies sparser networks. In
thus take nonpositive values. Thus, both B and this approach, for a given bound ı, the condition
C have column sums equal to zero. The small  ıi  ı is imposed only in the case page i belongs
suggests us that states can be aggregated into N to a group consisting of multiple members. Thus,
groups with strong interactions within the groups, a page forming a group by itself is not required
but connections among different groups are weak. to satisfy the condition. This means that we can
This class of Markov chains is known as nearly regroup the pages by first identifying pages that
completely decomposable. In general, however, it violate this condition in the initial groups and
is difficult to uniquely determine the form (5) for then making them separately as single groups.
a given chain. By repeating these steps, it is always possible to
obtain groups for a given web. Once the grouping is settled, an aggregation-based algorithm can be applied, which computes an approximated PageRank vector. A characteristic feature is the tradeoff between the accuracy in PageRank computation and the node parameter δ. More accurate computation requires a larger number of groups and thus a smaller δ.

Distributed Randomized Computation

For large-scale computation, distributed algorithms can be effective by employing multiple processors to compute in parallel. There are several methods of constructing algorithms to find stationary distributions of large Markov chains. In this section, motivated by the current literature on multi-agent systems, sequential distributed randomized approaches of gossip type are described for the PageRank problem.

In gossip-type distributed algorithms, nodes make decisions and transmit information to their neighbors in a random fashion. That is, at any time instant, each node decides whether to communicate or not depending on a random variable. The random property is important to make the communication asynchronous so that simultaneous transmissions resulting in collisions can be avoided. Moreover, there is no need for any centralized decision maker or fixed order among pages.

More precisely, each page i ∈ V is equipped with a random process η_i(k) ∈ {0, 1} for k ∈ Z_+. If at time k, η_i(k) is equal to 1, then page i broadcasts its information to its neighboring pages connected by outgoing links. All pages involved at this time renew their values based on the latest available data. Here, η_i(k) is assumed to be an independent and identically distributed (i.i.d.) random process, and its probability distribution is given by Prob{η_i(k) = 1} = α, k ∈ Z_+. Hence, all pages are given the same probability α to initiate an update.

One of the proposed randomized approaches is based on the so-called asynchronous iteration algorithms for distributed computation of fixed points in the field of numerical analysis (Bertsekas and Tsitsiklis 1989). The distributed update recursion is given as

   x̃(k+1) = M̃_{η_1(k),...,η_n(k)} x̃(k),        (6)

where the initial state x̃(0) is a probability vector and the distributed link matrices M̃_{p_1,...,p_n} are given as follows: the (i, j)th entry is equal to (1 − m)a_ij + m/n if p_i = 1; 1 if p_i = 0 and i = j; and 0 otherwise. Clearly, these matrices keep the rows of the original link matrix M in (3) for pages initiating updates. Other pages just keep their previous values. Thus, these matrices are not stochastic. From this update recursion, the PageRank x* is probabilistically obtained (in the mean square sense and with probability one), where the convergence speed is exponential in time k. Note that in this scheme (6), due to the way the distributed link matrices are constructed, each page needs to know which pages have links pointing towards it. This implies that popular pages linked by a number of pages must have extra memory to keep the data of such links.

Another recently developed approach (Ishii and Tempo 2010; Zhao et al. 2013) has several notable differences from the asynchronous iteration approach above. First, the pages need to transmit their states only over their outgoing links; the information of such links is by default available locally, and thus, pages are not required to have the extra memory regarding incoming links. Second, it employs stochastic matrices in the update as in the centralized scheme; this aspect is utilized in the convergence analysis. As a consequence, it is established that the PageRank vector x* is computed in a probabilistic sense through the time average of the states x(0), ..., x(k) given by y(k) := 1/(k+1) Σ_{ℓ=0}^{k} x(ℓ). The convergence speed in this case is of order 1/k.

PageRank Optimization via Hyperlink Designs

For owners of websites, it is of particular interest to raise the PageRank values of their web pages. Especially in the area of e-business, this can
be critical for increasing the number of visitors to their sites. The values of PageRank can be affected by changing the structure of hyperlinks in the owned pages. Based on the random surfer model, intuitively, it makes sense to arrange the links so that surfers will stay within the domain of the owners as long as possible.

PageRank optimization problems have rigorously been considered in, for example, de Kerchove et al. (2008) and Fercoq et al. (2013). In general, these are combinatorial optimization problems since they deal with the issue of where to place hyperlinks, and thus the computation for solving them can be prohibitive especially when the web data is large. However, the work of Fercoq et al. (2013) has shown that the problem can be solved in polynomial time. In what follows, we discuss a simplified discrete version of the problem setup of this work.

Consider a subset V_0 ⊂ V of web pages over which a webmaster has control. The objective is to maximize the total PageRank of the pages in this set V_0 by finding the outgoing links from these pages. Each page i ∈ V_0 may have constraints such as links that must be placed within the page and those that cannot be allowed. All other links, i.e., those that one can decide to have or not, are the design parameters. Hence, the PageRank optimization problem can be stated as

   max { U(x*, M) : x* = Mx*,  x* ∈ [0, 1]^n,  1^T x* = 1,  M ∈ ℳ },

where U is the utility function U(x*, M) := Σ_{i∈V_0} x_i* and ℳ represents the set of admissible link matrices in accordance with the constraints introduced above.

In Fercoq et al. (2013), an extended continuous problem is also studied where the set ℳ of link matrices is a polytope of stochastic matrices and a more general utility function is employed. The motivation for such a problem comes from having weighted links so that webmasters can determine which links should be placed in a more visible location inside their pages to increase clicks on those hyperlinks. Both discrete and continuous problems are shown to be solvable in polynomial time by modeling them as constrained Markov decision processes with ergodic rewards (see, e.g., Puterman 1994).

Summary and Future Directions

Markov chains form one of the simplest classes of stochastic processes but have been found powerful in their capability to model large-scale complex systems. In this entry, we introduced them mainly from the viewpoint of PageRank algorithms in the area of search engines and with a particular emphasis on recent works carried out based on control theoretic tools. Computational issues will remain in this area as major challenges, and further studies will be needed. As we have observed in PageRank-related problems, it is important to pay careful attention to the structures of the particular problems.

Cross-References

▶ Randomized Methods for Control of Uncertain Systems

Bibliography

Aldhaheri R, Khalil H (1991) Aggregation of the policy iteration method for nearly completely decomposable Markov chains. IEEE Trans Autom Control 36:178–187
Bertsekas D, Tsitsiklis J (1989) Parallel and distributed computation: numerical methods. Prentice-Hall, Englewood Cliffs
Brin S, Page L (1998) The anatomy of a large-scale hypertextual web search engine. Comput Netw ISDN Syst 30:107–117
de Kerchove C, Ninove L, Van Dooren P (2008) Influence of the outlinks of a page on its PageRank. Linear Algebra Appl 429:1254–1276
Fercoq O, Akian M, Bouhtou M, Gaubert S (2013) Ergodic control and polyhedral approaches to PageRank optimization. IEEE Trans Autom Control 58:134–148
Horn R, Johnson C (1985) Matrix analysis. Cambridge University Press, Cambridge
Ishii H, Tempo R (2010) Distributed randomized algorithms for the PageRank computation. IEEE Trans Autom Control 55:1987–2002
Ishii H, Tempo R, Bai EW (2012) A web aggregation approach for distributed randomized PageRank algorithms. IEEE Trans Autom Control 57:2703–2717
Kumar P, Varaiya P (1986) Stochastic systems: estimation, identification, and adaptive control. Prentice Hall, Englewood Cliffs
Langville A, Meyer C (2006) Google's PageRank and beyond: the science of search engine rankings. Princeton University Press, Princeton
Meyer C (1989) Stochastic complementation, uncoupling Markov chains, and the theory of nearly reducible systems. SIAM Rev 31:240–272
Norris J (1997) Markov chains. Cambridge University Press, Cambridge
Phillips R, Kokotovic P (1981) A singular perturbation approach to modeling and control of Markov chains. IEEE Trans Autom Control 26:1087–1094
Puterman M (1994) Markov decision processes: discrete stochastic dynamic programming. Wiley, New York
Zhao W, Chen H, Fang H (2013) Convergence of distributed randomized PageRank algorithms. IEEE Trans Autom Control 58:3255–3259

Mathematical Models of Marine Vehicle-Manipulator Systems

Gianluca Antonelli
University of Cassino and Southern Lazio, Cassino, Italy

Abstract

Marine intervention requires the use of manipulators mounted on support vehicles. Such systems, defined as vehicle-manipulator systems, exhibit specific mathematical properties and require proper control design methodologies. This article briefly discusses the mathematical model within a control perspective as well as sensing and actuation peculiarities.

Keywords

Floating-base manipulators; Marine robotics; Underwater intervention; Underwater robotics

Introduction

In case of marine operations that require interaction with the environment, an underwater vehicle is usually equipped with one or more manipulators; such systems are defined as underwater vehicle-manipulator systems (UVMSs). A UVMS holding six degree-of-freedom (DOF) manipulators is illustrated in Fig. 1. The vehicle carrying the manipulator may or may not be connected to the surface; in the first case we face a so-called remotely operated vehicle (ROV), while in the latter an autonomous underwater vehicle (AUV). ROVs, being physically linked via the tether to an operator that can be on a submarine or on a surface ship, receive power as well as control commands. AUVs, on the other hand, are supposed to be completely autonomous, thus relying on onboard power systems and intelligence.

Remotely controlled UVMSs represent the state of the art in underwater manipulation, while autonomous or semiautonomous UVMSs are still in their embryonic stage. All over the world, few experimental setups have been developed within on-purpose projects; see, e.g., the European project Trident (2012).

Sensory System

Any manipulation task requires that some variables are measured; those may concern the internal state of the system, such as the end-effector and vehicle positions and orientations or the velocities. Some others concern the surrounding environment, as is the case of vision systems or range measurements. Underwater sensing is characterized by poorer performance with respect to the corresponding variables on the ground, due to the physical properties of the water as the medium carrying the electromagnetic or acoustic signals.

One of the major challenges in underwater robotics is localization, due to the absence of a single proprioceptive sensor that measures the vehicle position and the impossibility of using the Global Navigation Satellite System (GNSS) under the water. The use of redundant multisensor systems, thus, is common in order to perform sensor fusion and give fault detection and tolerance capabilities to the vehicle.
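As a minimal illustration of the redundant-multisensor idea just described, the sketch below fuses two noisy position readings by inverse-variance weighting and applies a crude residual test as a fault-detection flag; the sensor labels and noise variances are hypothetical assumptions, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
true_pos = 12.0                          # vehicle position along one axis (m)

# Two redundant sensors with (assumed) known noise standard deviations
var_acoustic, var_vision = 0.5**2, 0.2**2
z_acoustic = true_pos + rng.normal(0.0, 0.5)
z_vision = true_pos + rng.normal(0.0, 0.2)

# Inverse-variance (maximum-likelihood) fusion of the two readings
w_a = 1.0 / var_acoustic
w_v = 1.0 / var_vision
pos_fused = (w_a * z_acoustic + w_v * z_vision) / (w_a + w_v)
var_fused = 1.0 / (w_a + w_v)            # smaller than either sensor's variance

# Large disagreement between redundant sensors suggests a fault
fault_suspected = abs(z_acoustic - z_vision) > 3.0 * np.sqrt(var_acoustic + var_vision)
```

The same weighting generalizes to Kalman-filter-based fusion when the measurements arrive sequentially.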
Mathematical Models of Marine Vehicle-Manipulator Systems, Fig. 1 Sketch of a UVMS; the inertial frame as well as the frames attached to all the rigid bodies are highlighted

Localization

A possible approach for AUV localization is to rely on inertial navigation systems (INSs); those are algorithms that implement dead reckoning techniques, i.e., the estimation of the position by properly merging and integrating measurements obtained with inertial and velocity sensors. Dead reckoning suffers from numerical drift due to the integration of sensor noise, as well as sensor bias and drift, and may be prone to the presence of external currents and model uncertainties. Since the variance of the estimation error grows with the distance traveled, this technique is only used for short dives.

Several algorithms are based on the concept of trilateration. The vehicle measures its distance with respect to known positions and properly uses this information by applying geometric-based formulas. Under the water, the technology for trilateration is not based on the electromagnetic field, due to the attenuation of its radiations, but on acoustics.

Among the commercially available solutions, long, short, and ultrashort baseline systems have found widespread use. The differences are in the baseline wavelength, the required distance among the transponders, the accuracy, and the installation cost. Acoustic underwater positioning is commercially mature, and several companies offer a variety of products.

In case of intervention, when the UVMS is close to the target, it is crucial to estimate the relative position with respect to the target itself, rather than the absolute position with respect to an inertial frame. In such a case, vision-based systems may be considered.

Actuation

Underwater vehicles are usually controlled by thrusters and/or control surfaces. Control surfaces, such as rudders and sterns, are typically used in vehicles working at cruise speed, i.e., torpedo-shaped vehicles usually used in
monitoring or cable/pipeline inspection. In such vehicles a main thruster is used together with at least one rudder and one stern.

This configuration is unsuitable for UVMSs since the force/moment provided by the control surfaces is a function of the velocity, and it is null in hovering, when manipulation is typically performed.

The relationship between the force/moment acting on the vehicle and the control input of the thrusters is highly nonlinear. It is a function of structural variables such as the density of the water, the tunnel cross-sectional area, the tunnel length, the volumetric flow rate between input and output of the thrusters, and the propeller diameter. The state of the dynamic system describing the thrusters is constituted by the propeller revolution, the speed of the fluid going into the propeller, and the input torque.

Modeling

UVMSs can be modeled as rigid bodies connected to form a serial chain; the vehicle is the floating base, while each link of the manipulator represents an additional rigid body with one DOF, typically the rotation around the corresponding joint's axis. Roughly speaking, modeling of a UVMS is the effort to represent the physical relationships of those bodies in order to measure and control its end effector, typically involved in a manipulation task.

The first step of modeling is the so-called direct kinematics, consisting in computing the position/orientation of the end effector with respect to an inertial, i.e., world-fixed, frame. This is done via a geometric relationship that is a function of the system kinematic parameters, typically the lengths of the links, and of the current system configuration, i.e., the vehicle position/orientation and the joint positions.

Velocities of each rigid body affect the following rigid bodies and thus the end effector. For example, a vehicle roll movement or a joint velocity is projected into a linear and angular end-effector velocity. This velocity transformation is studied by the differential kinematics. Analytic and/or geometric approaches may be used to retrieve those relationships. The study of the velocity-related equations is fundamental to understand how to balance the movement between vehicle and manipulator and, within the manipulator, how to distribute it among the joints. This topic is strictly related to the differential, and inverse, kinematics of industrial robots.

The extension of Newton's second law to UVMSs leads to a number of nonlinear differential equations that link together the system's generalized forces and accelerations. By generalized forces, here we intend the forces and moments acting on the vehicle and the joint torques. Correspondingly, one is interested in the vehicle linear and angular accelerations and in the joint accelerations. Those equations couple together all the DOFs of the structure; e.g., a force applied to the vehicle causes acceleration also on the joints. Study of the dynamics is crucial to design the controller.

It is not possible to neglect that the bodies are moving in the water; the theory of fluid dynamics is rather complex, and it is difficult to develop a simple model for most of the hydrodynamic effects. A rigorous analysis for incompressible fluids would need to resort to the Navier-Stokes equations (distributed fluid flow). However, most of the hydrodynamic effects have no significant influence in the range of the operative velocities for UVMS intervention tasks. In particular, it is necessary to model added masses, linear and quadratic damping terms, and the buoyancy.

Control

Not surprisingly, the mathematical model of a UVMS shares most of the characteristics of industrial robot as well as space robot modeling. Having taken into account the physical differences, the control problems are also similar:
• Kinematic control. The control problem is given in terms of motion of the end effector and needs to be transformed into the motion of the vehicle and the manipulator. This is often approached by resorting to the inverse
differential kinematic algorithms. In particular, algorithms for redundant systems need to be considered since a UVMS always possesses at least six DOFs. Moving a UVMS requires handling additional variables with respect to the end effector, such as the vehicle roll and pitch to preserve energy, the robot manipulability, the mechanical joint limits, or eventual directional sensing.
• Motion control. Low-level control algorithms are designed to allow the system to track the desired trajectory. UVMSs are characterized by different dynamics between vehicle and manipulator, uncertainty in the knowledge of the model parameters, poor sensing performance, and limit cycles in the thruster model. On the other hand, the limited bandwidth of the closed-loop system allows the use of simple control approaches.
• Interaction control. Several applications require exchange of forces with the environment. A pure motion control algorithm is not devised for such operations, and specific force control algorithms, both direct and indirect, may be necessary. Master/slave systems or haptic devices may be used for the purpose, while autonomous interaction control is still in the research phase for the marine environment.

Summary and Future Directions

This article is aimed at giving a short overview of the main mathematical and technological challenges arising with UVMSs. All the components of an underwater mission, perception, actuation, and communication with the surface, are characterized by poorer performances with respect to current industrial or advanced robotics applications.

The underwater environment is hostile; as an example, the marine current provides disturbances to be counteracted by the dynamic controller, or the sand's whirlwinds obfuscate the vision-based operations close to the sea bottom. Both teleoperated and autonomous underwater missions require a significant human effort in planning, testing, and monitoring all the operations. Fault detection and recovery policies are necessary in each step to avoid loss of expensive hardware. Future generations of UVMSs need to be autonomous, to perceive and contextualize the environment, to react to unplanned situations, and to safely reschedule the tasks of complex missions. Those characteristics are shared by all the branches of service robotics.

Cross-References

▶ Advanced Manipulation for Underwater Sampling
▶ Mathematical Models of Ships and Underwater Vehicles
▶ Redundant Robots

Recommended Reading

The book of Fossen (1994) is one of the first books dedicated to control problems of marine systems, both underwater and surface. The same author presents, in Fossen (2002), an updated and extended version of the topics developed in the first book and, in Fossen (2011), a handbook on marine craft hydrodynamics and control. A short introductory chapter on marine robotics may be found in Antonelli et al. (2008). Robotics fundamentals are also useful and can be found in Siciliano et al. (2009). To the best of our knowledge, Antonelli (2014) is the only monograph devoted to addressing the specific problems of underwater vehicle-manipulator systems.

Bibliography

Antonelli G (2014) Underwater robots: motion and force control of vehicle-manipulator systems. Springer tracts in advanced robotics, 3rd edn. Springer, Heidelberg
Antonelli G, Fossen T, Yoerger D (2008) Underwater robotics. In: Siciliano B, Khatib O (eds) Springer handbook of robotics. Springer, Heidelberg, pp 987–1008
Fossen T (1994) Guidance and control of ocean vehicles. Wiley, Chichester
Fossen T (2002) Marine control systems: guidance, navigation and control of ships, rigs and underwater vehicles. Marine Cybernetics, Trondheim
Fossen T (2011) Handbook of marine craft hydrodynamics and motion control. Wiley, Chichester
Siciliano B, Sciavicco L, Villani L, Oriolo G (2009) Robotics: modelling, planning and control. Springer, London
TRIDENT (2012) Marine robots and dexterous manipulation for enabling autonomous underwater multipurpose intervention missions. http://www.irs.uji.es/trident

Mathematical Models of Ships and Underwater Vehicles

Thor I. Fossen
Department of Engineering Cybernetics, Centre for Autonomous Marine Operations and Systems, Norwegian University of Science and Technology, Trondheim, Norway

Abstract

This entry describes the equations of motion of ships and underwater vehicles. Standard hydrodynamic models in the literature are reviewed and presented using the nonlinear robot-like vectorial notation of Fossen (1991, 1994, 2011). The matrix-vector notation is highly advantageous when designing control systems since well-known system properties such as symmetry, skew-symmetry, and positiveness can be exploited in the design.

Keywords

Autonomous underwater vehicle (AUV); Degrees of freedom; Euler angles; Hydrodynamics; Kinematics; Kinetics; Maneuvering; Remotely operated vehicle (ROV); Seakeeping; Ship; Underwater vehicles

Introduction

The subject of this entry is mathematical modeling of ships and underwater vehicles. With ship we mean "any large floating vessel capable of crossing open waters," as opposed to a boat, which is generally a smaller craft. An underwater vehicle is a "small vehicle that is capable of propelling itself beneath the water surface as well as on the water's surface." This includes unmanned underwater vehicles (UUVs), remotely operated vehicles (ROVs), autonomous underwater vehicles (AUVs), and underwater robotic vehicles (URVs).

This entry is based on Fossen (1991, 2011), which contains a large number of standard models for ships, rigs, and underwater vehicles. There exist a large number of textbooks on mathematical modeling of ships; see Rawson and Tupper (1994), Lewandowski (2004), and Perez (2005). For underwater vehicles, see Allmendinger (1990), Sutton and Roberts (2006), Inzartsev (2009), Antonelli (2010), and Wadoo and Kachroo (2010). Some useful references on ship hydrodynamics are Newman (1977), Faltinsen (1990), and Bertram (2012).

Degrees of Freedom

A mathematical model of marine craft is usually represented by a set of ordinary differential equations (ODEs) describing the motions in six degrees of freedom (DOF): surge, sway, heave, roll, pitch, and yaw.

Hydrodynamics
In hydrodynamics it is common to distinguish between two theories:
• Seakeeping theory: The motions of ships at zero or constant speed in waves are analyzed using hydrodynamic coefficients and wave forces, which depend on the wave excitation frequency and thus the hull geometry and mass distribution. For underwater vehicles operating below the wave-affected zone, the wave excitation frequency will not influence the hydrodynamic coefficients.
• Maneuvering theory: The ship is moving in restricted calm water – that is, in sheltered waters or in a harbor. Hence, the maneuvering model is derived for a ship moving at positive speed under a zero-frequency wave excitation
assumption such that added mass and damping can be represented by constant parameters.

Seakeeping models are typically used for ocean structures and dynamically positioned vessels. Several hundred ODEs are needed to effectively represent a seakeeping model; see Fossen (2011) and Perez and Fossen (2011a, b).

The remainder of this entry assumes maneuvering theory, since this gives lower-order models typically suited for controller-observer design. Six ODEs are needed to describe the kinematics, that is, the geometrical aspects of motion, while the Newton-Euler equations represent an additional six ODEs describing the forces and moments causing the motion (kinetics).

Notation
The equations of motion are usually represented using generalized position, velocity, and forces (Fossen 1991, 1994, 2011) defined by the state vectors:

   η := [x, y, z, φ, θ, ψ]^T        (1)
   ν := [u, v, w, p, q, r]^T        (2)
   τ := [X, Y, Z, K, M, N]^T        (3)

where η is the generalized position expressed in the north-east-down (NED) reference frame {n}. A body-fixed reference frame {b} with axes:
x_b – longitudinal axis (from aft to fore)
y_b – transversal axis (to starboard)
z_b – normal axis (directed downward)
is rotating about the NED reference frame {n} with angular velocity ω = [p, q, r]^T. The generalized velocity vector ν and forces τ are both expressed in {b}, and the 6-DOF states are defined according to SNAME (1950):
• Surge position x, linear velocity u, force X
• Sway position y, linear velocity v, force Y
• Heave position z, linear velocity w, force Z
• Roll angle φ, angular velocity p, moment K
• Pitch angle θ, angular velocity q, moment M
• Yaw angle ψ, angular velocity r, moment N

Kinematics
The generalized velocities η̇ and ν, expressed in {n} and {b}, respectively, satisfy the following kinematic transformation (Fossen 1994, 2011):

   η̇ = J(η)ν        (4)

   J(η) := [ R(Θ)    0_3×3
             0_3×3   T(Θ) ]        (5)

where Θ = [φ, θ, ψ]^T is the vector of Euler angles and

   R(Θ) = [  cψcθ    −sψcφ + cψsθsφ     sψsφ + cψcφsθ
             sψcθ     cψcφ + sφsθsψ    −cψsφ + sθsψcφ
            −sθ       cθsφ              cθcφ          ]        (6)

with s· = sin(·), c· = cos(·), and t· = tan(·). The matrix R is recognized as the Euler angle rotation matrix R ∈ SO(3) satisfying RR^T = R^T R = I and det(R) = 1, which implies that R is orthogonal. Consequently, the inverse rotation matrix is given by R^−1 = R^T. The Euler rates Θ̇ = T(Θ)ω are singular for θ = ±π/2 since:

   T(Θ) = [ 1   sφtθ     cφtθ
            0   cφ      −sφ
            0   sφ/cθ    cφ/cθ ],   θ ≠ ±π/2        (7)

Singularities can be avoided by using unit quaternions instead (Fossen 1994, 2011).

Kinetics

The rigid-body kinetics can be derived using the Newton-Euler formulation, which is based on Newton's second law. Following Fossen (1994, 2011) this gives:

   M_RB ν̇ + C_RB(ν)ν = τ_RB        (8)

where M_RB is the rigid-body mass matrix and C_RB is the rigid-body Coriolis and centripetal matrix due to the rotation of {b} about the geographical frame {n}. The generalized force vector τ_RB represents external forces and moments expressed in {b}. In the nonlinear case:

   τ_RB = −M_A ν̇ − C_A(ν)ν − D(ν)ν − g(η) + τ        (9)
Mathematical Models of Ships and Underwater Vehicles 703
where the matrices M_A and C_A(ν) represent hydrodynamic added mass due to the acceleration ν̇ and Coriolis acceleration due to the rotation of {b} about the geographical frame {n}. The potential and viscous damping terms are lumped together into a nonlinear matrix D(ν), while g(η) is a vector of generalized restoring forces. The control inputs are generalized forces given by τ.

Formulae (8) and (9) together with (4) are the fundamental equations when deriving the ship and underwater vehicle models. This is the topic for the next sections.

Ship Model

The ship equations of motion are usually represented in three DOFs by neglecting heave, roll, and pitch. Combining (4), (8), and (9) we get:

    \dot{\eta} = R(\psi)\nu \qquad (10)

    M\dot{\nu} + C(\nu)\nu + D(\nu)\nu = \tau + \tau_{wind} + \tau_{wave} \qquad (11)

where η := [x, y, ψ]^⊤, ν := [u, v, r]^⊤, and

    R(\psi) = \begin{bmatrix} \cos\psi & -\sin\psi & 0 \\ \sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad (12)

is the rotation matrix in yaw. It is assumed that the wind and wave-induced forces τ_wind and τ_wave can be linearly superpositioned. The system matrices M = M_RB + M_A and C(ν) = C_RB(ν) + C_A(ν) are usually derived under the assumption of port-starboard symmetry and that surge can be decoupled from sway and yaw (Fossen 2011). Moreover,

    M = \begin{bmatrix} m - X_{\dot u} & 0 & 0 \\ 0 & m - Y_{\dot v} & m x_g - Y_{\dot r} \\ 0 & m x_g - N_{\dot v} & I_z - N_{\dot r} \end{bmatrix} \qquad (13)

    C_{RB}(\nu) = \begin{bmatrix} 0 & -mr & -m x_g r \\ mr & 0 & 0 \\ m x_g r & 0 & 0 \end{bmatrix} \qquad (14)

    C_A(\nu) = \begin{bmatrix} 0 & 0 & Y_{\dot v} v + Y_{\dot r} r \\ 0 & 0 & -X_{\dot u} u \\ -Y_{\dot v} v - Y_{\dot r} r & X_{\dot u} u & 0 \end{bmatrix} \qquad (15)

Hydrodynamic damping will, in its simplest form, be linear:

    D = -\begin{bmatrix} X_u & 0 & 0 \\ 0 & Y_v & Y_r \\ 0 & N_v & N_r \end{bmatrix} \qquad (16)

while a nonlinear expression based on second-order modulus functions describing quadratic drag and cross-flow drag is:

    D(\nu) = -\begin{bmatrix} X_{|u|u}|u| & 0 & 0 \\ 0 & Y_{|v|v}|v| + Y_{|r|v}|r| & Y_{|v|r}|v| + Y_{|r|r}|r| \\ 0 & N_{|v|v}|v| + N_{|r|v}|r| & N_{|v|r}|v| + N_{|r|r}|r| \end{bmatrix} \qquad (17)

Other nonlinear representations are found in Fossen (1994, 2011).

In the case of irrotational ocean currents, we introduce the relative velocity vector:

    \nu_r = \nu - \nu_c

where ν_c = [u_c^b, v_c^b, 0]^⊤ is a vector of current velocities in {b}. Hence, the kinetic model takes the form:

    \underbrace{M_{RB}\dot{\nu} + C_{RB}(\nu)\nu}_{\text{rigid-body forces}} + \underbrace{M_A\dot{\nu}_r + C_A(\nu_r)\nu_r + D(\nu_r)\nu_r}_{\text{hydrodynamic forces}} = \tau + \tau_{wind} + \tau_{wave}
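The three-DOF maneuvering model is well suited to direct numerical simulation. The sketch below (not part of the original entry) integrates the kinematics (10) and kinetics (11) with the linear damping (16) using a fixed-step Euler method; every numerical parameter value is invented purely for illustration and does not correspond to any identified hull.

```python
import numpy as np

# All parameter values below are hypothetical, chosen only to make the
# sketch run; they are not identified data for any real ship.
m, x_g, I_z = 5.0e6, 1.0, 5.0e8                                # mass, CG offset, yaw inertia
X_ud, Y_vd, Y_rd, N_vd, N_rd = -5e5, -4e6, -1e6, -1e6, -2e8    # added-mass derivatives
X_u, Y_v, Y_r, N_v, N_r = -5e4, -2e5, -1e5, -1e5, -1e7         # linear damping derivatives

# System inertia matrix M = M_RB + M_A, cf. (13)
M = np.array([[m - X_ud, 0.0,            0.0],
              [0.0,      m - Y_vd,       m * x_g - Y_rd],
              [0.0,      m * x_g - N_vd, I_z - N_rd]])

# Linear damping, cf. (16)
D = -np.array([[X_u, 0.0, 0.0],
               [0.0, Y_v, Y_r],
               [0.0, N_v, N_r]])

def C(nu):
    """Coriolis-centripetal matrix C(nu) = C_RB(nu) + C_A(nu), cf. (14)-(15)."""
    u, v, r = nu
    C_RB = np.array([[0.0,          -m * r, -m * x_g * r],
                     [m * r,         0.0,    0.0],
                     [m * x_g * r,   0.0,    0.0]])
    C_A = np.array([[0.0, 0.0, Y_vd * v + Y_rd * r],
                    [0.0, 0.0, -X_ud * u],
                    [-(Y_vd * v + Y_rd * r), X_ud * u, 0.0]])
    return C_RB + C_A

def simulate(tau, t_end=100.0, dt=0.1):
    """Euler integration of (10)-(11); returns the final pose eta = [x, y, psi]."""
    eta, nu = np.zeros(3), np.zeros(3)      # eta in {n}, nu = [u, v, r] in {b}
    for _ in range(int(t_end / dt)):
        psi = eta[2]
        R = np.array([[np.cos(psi), -np.sin(psi), 0.0],
                      [np.sin(psi),  np.cos(psi), 0.0],
                      [0.0,          0.0,         1.0]])
        eta = eta + dt * (R @ nu)                                      # kinematics (10)
        nu = nu + dt * np.linalg.solve(M, tau - C(nu) @ nu - D @ nu)   # kinetics (11)
    return eta

eta_final = simulate(tau=np.array([1.0e5, 0.0, 0.0]))  # constant surge force only
print(eta_final)  # the ship advances along the x-axis; y and psi stay zero
```

Replacing D with the nonlinear damping (17), or working with the relative velocity ν_r of (18)-(19), is a one-line change inside the same integration loop.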
This model can be simplified if the ocean currents are assumed to be constant and irrotational in {n}. According to Fossen (2011, Property 8.1), M_RB ν̇ + C_RB(ν)ν ≡ M_RB ν̇_r + C_RB(ν_r)ν_r if the rigid-body Coriolis and centripetal matrix satisfies C_RB(ν_r) = C_RB(ν). One parametrization satisfying this is (14). Hence, the Coriolis and centripetal matrix satisfies C(ν_r) = C_RB(ν_r) + C_A(ν_r), and it follows that:

    M\dot{\nu}_r + C(\nu_r)\nu_r + D(\nu_r)\nu_r = \tau + \tau_{wind} + \tau_{wave} \qquad (18)

The kinematic equation (10) can be modified to include the relative velocity ν_r according to:

    \dot{\eta} = R(\psi)\nu_r + [u_c^n, v_c^n, 0]^\top \qquad (19)

where the ocean current velocities u_c^n = constant and v_c^n = constant in {n}. Notice that the body-fixed velocities ν_c = R(ψ)^⊤ [u_c^n, v_c^n, 0]^⊤ will vary with the heading angle ψ.

The maneuvering model presented in this entry is intended for controller-observer design, prediction, and computer simulations, as well as system identification and parameter estimation. A large number of application-specific models for marine craft are found in Fossen (2011, Chap. 7).

Hydrodynamic programs compute mass, inertia, potential damping, and restoring forces, while a more detailed treatment of viscous dissipative forces (damping) and sea loads is found in the extensive literature on hydrodynamics – see Faltinsen (1990) and Newman (1977).

Underwater Vehicle Model

The 6-DOF underwater vehicle equations of motion follow from (4), (8), and (9) under the assumption that wave-induced motions can be neglected:

    \dot{\eta} = J(\eta)\nu \qquad (20)

    M\dot{\nu} + C(\nu)\nu + D(\nu)\nu + g(\eta) = \tau \qquad (21)

with generalized position η := [x, y, z, φ, θ, ψ]^⊤ and velocity ν := [u, v, w, p, q, r]^⊤. Assume that the gravitational force acts through the center of gravity (CG) defined by the vector r_g := [x_g, y_g, z_g]^⊤ with respect to the coordinate origin of {b}. Similarly, the buoyancy force acts through the center of buoyancy (CB) defined by the vector r_b := [x_b, y_b, z_b]^⊤. For most vehicles y_g = y_b = 0.

For a port-starboard symmetrical vehicle with homogeneous mass distribution, CG satisfying y_g = 0, and products of inertia I_xy = I_yz = 0, the system inertia matrix becomes:

    M := \begin{bmatrix}
    m - X_{\dot u} & 0 & -X_{\dot w} & 0 & m z_g - X_{\dot q} & 0 \\
    0 & m - Y_{\dot v} & 0 & -m z_g - Y_{\dot p} & 0 & m x_g - Y_{\dot r} \\
    -X_{\dot w} & 0 & m - Z_{\dot w} & 0 & -m x_g - Z_{\dot q} & 0 \\
    0 & -m z_g - Y_{\dot p} & 0 & I_x - K_{\dot p} & 0 & -I_{zx} - K_{\dot r} \\
    m z_g - X_{\dot q} & 0 & -m x_g - Z_{\dot q} & 0 & I_y - M_{\dot q} & 0 \\
    0 & m x_g - Y_{\dot r} & 0 & -I_{zx} - K_{\dot r} & 0 & I_z - N_{\dot r}
    \end{bmatrix} \qquad (22)
where the hydrodynamic derivatives are defined according to SNAME (1950). The Coriolis and centripetal matrices are:

    C_A(\nu) = \begin{bmatrix}
    0 & 0 & 0 & 0 & -a_3 & a_2 \\
    0 & 0 & 0 & a_3 & 0 & -a_1 \\
    0 & 0 & 0 & -a_2 & a_1 & 0 \\
    0 & -a_3 & a_2 & 0 & -b_3 & b_2 \\
    a_3 & 0 & -a_1 & b_3 & 0 & -b_1 \\
    -a_2 & a_1 & 0 & -b_2 & b_1 & 0
    \end{bmatrix} \qquad (23)

where

    a_1 = X_{\dot u} u + X_{\dot w} w + X_{\dot q} q
    a_2 = Y_{\dot v} v + Y_{\dot p} p + Y_{\dot r} r
    a_3 = Z_{\dot u} u + Z_{\dot w} w + Z_{\dot q} q \qquad (24)
    b_1 = K_{\dot v} v + K_{\dot p} p + K_{\dot r} r
    b_2 = M_{\dot u} u + M_{\dot w} w + M_{\dot q} q
    b_3 = N_{\dot v} v + N_{\dot p} p + N_{\dot r} r

and

    C_{RB}(\nu) = \begin{bmatrix}
    0 & -mr & mq & m z_g r & -m x_g q & -m x_g r \\
    mr & 0 & -mp & 0 & m(z_g r + x_g p) & 0 \\
    -mq & mp & 0 & -m z_g p & -m z_g q & m x_g p \\
    -m z_g r & 0 & m z_g p & 0 & -I_{xz} p + I_z r & -I_y q \\
    m x_g q & -m(z_g r + x_g p) & m z_g q & I_{xz} p - I_z r & 0 & -I_{xz} r + I_x p \\
    m x_g r & 0 & -m x_g p & I_y q & I_{xz} r - I_x p & 0
    \end{bmatrix} \qquad (25)

Notice that this representation of C_RB(ν) only depends on the angular velocities p, q, and r, and not on the linear velocities u, v, and w. This property will be exploited when including drift due to ocean currents.

Linear damping for a port-starboard symmetrical vehicle takes the following form:

    D = -\begin{bmatrix}
    X_u & 0 & X_w & 0 & X_q & 0 \\
    0 & Y_v & 0 & Y_p & 0 & Y_r \\
    Z_u & 0 & Z_w & 0 & Z_q & 0 \\
    0 & K_v & 0 & K_p & 0 & K_r \\
    M_u & 0 & M_w & 0 & M_q & 0 \\
    0 & N_v & 0 & N_p & 0 & N_r
    \end{bmatrix} \qquad (26)

The expression for D can be extended to include nonlinear damping terms if necessary. Quadratic damping is important at higher speeds since the Coriolis and centripetal terms C(ν)ν can destabilize the system if only linear damping is used.

Let W = mg and B = ρg∇ denote the weight and buoyancy, where m is the mass of the vehicle including water in free-flooding spaces, ∇ is the volume of fluid displaced by the vehicle, g is the acceleration of gravity (positive downward), and ρ is the water density. Hence, the generalized restoring forces for a vehicle satisfying y_g = y_b = 0 become (Fossen 1994, 2011):

    g(\eta) = \begin{bmatrix}
    (W - B)\sin\theta \\
    -(W - B)\cos\theta\sin\phi \\
    -(W - B)\cos\theta\cos\phi \\
    (z_g W - z_b B)\cos\theta\sin\phi \\
    (z_g W - z_b B)\sin\theta + (x_g W - x_b B)\cos\theta\cos\phi \\
    -(x_g W - x_b B)\cos\theta\sin\phi
    \end{bmatrix} \qquad (27)

In the presence of irrotational ocean currents, we can rewrite (20) and (21) in terms of the relative velocity ν_r = ν − ν_c according to:

    \dot{\eta} = J(\eta)\nu_r + [u_c^n, v_c^n, w_c^n, 0, 0, 0]^\top \qquad (28)

    M\dot{\nu}_r + C(\nu_r)\nu_r + D(\nu_r)\nu_r + g(\eta) = \tau \qquad (29)

where it is assumed that C_RB(ν_r) = C_RB(ν), which clearly is satisfied for (25). In addition, it is assumed that u_c^n, v_c^n, and w_c^n are constant. For more details see Fossen (2011).
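The restoring-force vector (27) is straightforward to evaluate directly. The short sketch below (not part of the original entry; all numerical values are made up for illustration) implements (27) and checks the expected physics: a neutrally buoyant vehicle (W = B) with coincident CG and CB experiences no restoring forces in any attitude, while raising the CB above the CG (z is positive downward, so z_b < z_g) produces a positive restoring roll moment.

```python
import math

def restoring_forces(phi, theta, W, B, x_g, z_g, x_b, z_b):
    """Generalized restoring force vector g(eta) of (27), valid for y_g = y_b = 0.
    Angles in radians; z is positive downward per SNAME (1950)."""
    cth, sth = math.cos(theta), math.sin(theta)
    cph, sph = math.cos(phi), math.sin(phi)
    return [
        (W - B) * sth,
        -(W - B) * cth * sph,
        -(W - B) * cth * cph,
        (z_g * W - z_b * B) * cth * sph,
        (z_g * W - z_b * B) * sth + (x_g * W - x_b * B) * cth * cph,
        -(x_g * W - x_b * B) * cth * sph,
    ]

# Neutrally buoyant (W = B) with CG = CB: g(eta) vanishes in every attitude.
g0 = restoring_forces(phi=0.3, theta=-0.2, W=1.0e4, B=1.0e4,
                      x_g=0.1, z_g=0.05, x_b=0.1, z_b=0.05)

# Same vehicle with the CB raised above the CG (z_b < z_g): a positive roll
# angle now produces a positive restoring roll term g[3] in (21).
g1 = restoring_forces(phi=0.3, theta=0.0, W=1.0e4, B=1.0e4,
                      x_g=0.0, z_g=0.2, x_b=0.0, z_b=-0.2)
print(g0, g1[3])
```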
Programs and Data

The Marine Systems Simulator (MSS) is a Matlab/Simulink library and simulator for marine craft (http://www.marinecontrol.org). It includes models for ships, underwater vehicles, and floating structures.

Summary and Future Directions

This entry has presented standard models for simulation of ships and underwater vehicles. It is recommended to consult Fossen (1994, 2011) for a more detailed description of marine craft hydrodynamics.

Cross-References

▶ Control of Networks of Underwater Vehicles
▶ Control of Ship Roll Motion
▶ Dynamic Positioning Control Systems for Ships and Underwater Vehicles
▶ Underactuated Marine Control Systems

Bibliography

Allmendinger E (ed) (1990) Submersible vehicle systems design. SNAME, Jersey City
Antonelli G (2010) Underwater robots: motion and force control of vehicle-manipulator systems. Springer, Berlin/New York
Bertram V (2012) Practical ship hydrodynamics. Elsevier, Amsterdam/London
Faltinsen O (1990) Sea loads on ships and offshore structures. Cambridge University Press, Cambridge
Fossen TI (1991) Nonlinear modelling and control of underwater vehicles. PhD thesis, Department of Engineering Cybernetics, Norwegian University of Science and Technology
Fossen TI (1994) Guidance and control of ocean vehicles. Wiley, Chichester/New York
Fossen TI (2011) Handbook of marine craft hydrodynamics and motion control. Wiley, Chichester/Hoboken
Inzartsev AV (2009) Intelligent underwater vehicles. I-Tech Education and Publishing. Open access: http://www.intechweb.org/
Lewandowski EM (2004) The dynamics of marine craft. World Scientific, Singapore
MSS (2010) Marine systems simulator. Open access: http://www.marinecontrol.org/
Newman JN (1977) Marine hydrodynamics. MIT, Cambridge
Perez T (2005) Ship motion control. Springer, Berlin/London
Perez T, Fossen TI (2011a) Practical aspects of frequency-domain identification of dynamic models of marine structures from hydrodynamic data. Ocean Eng 38:426–435
Perez T, Fossen TI (2011b) Motion control of marine craft, Ch. 33. In: Levine WS (ed) The control systems handbook: control system advanced methods, 2nd edn. CRC, Hoboken
Rawson KJ, Tupper EC (1994) Basic ship theory. Longman Group, Harlow/New York
SNAME (1950) Nomenclature for treating the motion of a submerged body through a fluid. SNAME, Technical and Research Bulletin no 1–5, pp 1–15
Sutton R, Roberts G (2006) Advances in unmanned marine vehicles. IEE control series. The Institution of Engineering and Technology, London
Wadoo S, Kachroo P (2010) Autonomous underwater vehicles: modeling, control design and simulation. Taylor and Francis, London

Mean Field Games

Peter E. Caines
McGill University, Montreal, QC, Canada

Abstract

Mean Field Game (MFG) theory studies the existence of Nash equilibria, together with the individual strategies which generate them, in games involving a large number of agents modeled by controlled stochastic dynamical systems. This is achieved by exploiting the relationship between the finite and corresponding infinite limit population problems. The solution of the infinite population problem is given by the fundamental MFG Hamilton-Jacobi-Bellman (HJB) and Fokker-Planck-Kolmogorov (FPK) equations which are linked by the state distribution of a generic agent, otherwise known as the system's mean field.
Keywords

Fokker-Planck-Kolmogorov (FPK) equation; Hamilton-Jacobi-Bellman (HJB) equation; Nash equilibrium; Stochastic dynamical systems

Introduction

Large-population, dynamical, multi-agent, competitive, and cooperative phenomena occur in a wide range of designed and natural settings such as communication, environmental, epidemiological, transportation, and energy systems, and they underlie much economic and financial behavior. Analysis of such systems is intractable using the finite population game theoretic methods which have been developed for multi-agent control systems (see, e.g., Basar and Ho 1974; Basar and Olsder 1999; Ho 1980; and Bensoussan and Frehse 1984). The continuum population game theoretic models of economics (Aumann and Shapley 1974; Neyman 2002) are static, as, in general, are the large-population models employed in network games (Altman et al. 2002) and transportation analysis (Correa and Stier-Moses 2010; Haurie and Marcotte 1985; Wardrop 1952). However, dynamical (or sequential) stochastic games were analyzed in the continuum limit in the work of Jovanovic and Rosenthal (1988) and Bergin and Bernhardt (1992), where the fundamental mean field equations appear in the form of a discrete time dynamic programming equation and an evolution equation for the population state distribution.

The mean field equations for dynamical games with large but finite populations of asymptotically negligible agents originated in the work of Huang et al. (2003, 2006, 2007) (where the framework was called the Nash Certainty Equivalence Principle) and independently in that of Lasry and Lions (2006a,b, 2007), where the now standard terminology of Mean Field Games (MFGs) was introduced. Independent of both of these, the closely related notion of Oblivious Equilibria for large-population dynamic games was introduced by Weintraub et al. (2005) in the framework of Markov Decision Processes (MDPs).

One of the main results of MFG theory is that in large-population stochastic dynamic games individual feedback strategies exist for which any given agent will be in a Nash equilibrium with respect to the pre-computable behavior of the mass of the other agents; this holds exactly in the asymptotic limit of an infinite population and with increasing accuracy for a finite population of agents using the infinite population feedback laws as the finite population size tends to infinity, a situation which is termed an ε-Nash equilibrium. This behavior is described by the solution to the infinite population MFG equations which are fundamental to the theory; they consist of (i) a parameterized family of HJB equations (in the nonuniform parameterized agent case) and (ii) a corresponding family of McKean-Vlasov (MV) FPK PDEs, where (i) and (ii) are linked by the probability distribution of the state of a generic agent, that is to say, the mean field. For each agent, these yield (i) a Nash value of the game, (ii) the best response strategy for the agent, (iii) the agent's stochastic differential equation (SDE) (i.e., the MV-SDE pathwise description), and (iv) the state distribution of such an agent (via the MV FPK for the parameterized individual).

Dynamical Agents

In the diffusion-based models of large-population games, the state evolution of a collection of N agents A_i, 1 ≤ i ≤ N < ∞, is specified by a set of N controlled stochastic differential equations (SDEs) which in the important linear case take the form

    dx_i(t) = [F_{\theta_i} x_i(t) + G_{\theta_i} u_i(t)]\,dt + D_{\theta_i}\,dw_i(t), \quad 1 \le i \le N,

where x_i ∈ R^n is the state, u_i ∈ R^m the control input, and w_i the state Wiener process of the ith
agent A_i, where {w_i, 1 ≤ i ≤ N} is a collection of N independent standard Wiener processes in R^r independent of all mutually independent initial conditions. For simplicity, throughout this entry, all collections of system initial conditions are taken to be independent, zero mean and have finite second moment.

A simplified form of the general case treated in Huang et al. (2007) and Nourian and Caines (2013) is given by the following set of controlled SDEs which for each agent A_i includes state coupling with all other agents:

    dx_i(t) = \frac{1}{N}\sum_{j=1}^{N} f(t, x_i(t), u_i(t), x_j(t))\,dt + \sigma\,dw_i(t), \quad 1 \le i \le N,

where here, for the sake of simplicity, only the uniform (non-parameterized) generic agent case is presented. The dynamics of a generic agent in the infinite population limit of this system is then described by the following controlled MV stochastic differential equation:

    dx(t) = f[t, x(t), u(t), \mu_t]\,dt + \sigma\,dw(t),

where f[t, x, u, μ_t] = ∫_{R^n} f(t, x, u, y) μ_t(dy), with the initial condition measure μ_0 specified, where μ_t(·) denotes the state distribution of the population at t ∈ [0, T]. The dynamics used in the analysis in Lasry and Lions (2006a,b, 2007) and Cardaliaguet (2012) are of the form dx_i(t) = u_i(t) dt + σ dw_i(t).

The dynamical evolution of the state x_i of the ith agent A_i in the discrete time Markov Decision Processes (MDP)-based formulation of the so-called anonymous sequential games (Bergin and Bernhardt 1992; Jovanovic and Rosenthal 1988; Weintraub et al. 2005) is described by a Markov state transition function, or kernel, of the form P_{t+1} := P(x_i(t+1) | x_i(t), x_{-i}(t), u_i(t), P_t).

Agent Performance Functions

In the basic finite population linear-quadratic diffusion case, the agent A_i, 1 ≤ i ≤ N, possesses a performance, or loss, function of the form

    J_i^N(u_i, u_{-i}) = E\int_0^T \big\{ \|x_i(t) - m_N(t)\|_Q^2 + \|u_i(t)\|_R^2 \big\}\,dt,

where we assume the cost coupling to be of the form m_N(t) := γ(x̄_N(t) + η), η ∈ R^n, where u_{-i} denotes all agents' control laws except for that of the ith agent and x̄_N denotes the population average state (1/N) Σ_{i=1}^{N} x_i; and where here and below the expectation is taken over an underlying sample space which carries all initial conditions and Wiener processes.

For the nonlinear case introduced in the previous section, a corresponding finite population mean field loss function is

    J_i^N(u_i; u_{-i}) := E\int_0^T \Big( \frac{1}{N}\sum_{j=1}^{N} L(t, x_i(t), u_i(t), x_j(t)) \Big)\,dt, \quad 1 \le i \le N,

where L is the nonlinear state cost-coupling function. Setting, by possible abuse of notation, L[t, x, u, μ_t] = ∫_{R^n} L(t, x, u, y) μ_t(dy), the infinite population limit of this cost function for a generic individual agent A is given by

    J(u, \mu) := E\int_0^T L[t, x(t), u(t), \mu_t]\,dt,

which is the general expression for the infinite population individual performance functions appearing in Huang et al. (2006) and Nourian and Caines (2013) and which includes those of Lasry and Lions (2006a,b, 2007) and Cardaliaguet (2012). Exponentially discounted costs with discount rate parameter ρ are employed for infinite time horizon performance functions in Huang et al. (2003, 2007), while the sample path
limit of the long-range average is used for ergodic MFG problems in Lasry and Lions (2006a, 2007) and Li and Zhang (2008) and in the analysis of adaptive MFG systems (Kizilkale and Caines 2013).

The Existence of Equilibria

The objective of each agent is to find strategies (i.e., control laws) which are admissible with respect to information and other constraints and which minimize its performance function. The resulting problem is necessarily game theoretic, and consequently central results of the topic concern the existence of Nash equilibria and their properties.

The basic linear-quadratic mean field problem has an explicit solution characterizing a Nash equilibrium (see Huang et al. 2003, 2007). Consider the scalar infinite time horizon discounted case, with nonuniform parameterized agents A_θ with parameter distribution F(θ), θ ∈ A, and dynamical parameters identified as a_θ := F_θ, b_θ := G_θ, Q := 1, r := R; then the so-called Nash Certainty Equivalence (NCE) equation scheme generating the equilibrium solution takes the form

    \rho s_\theta = \frac{d s_\theta}{dt} + a_\theta s_\theta - \frac{b_\theta^2}{r}\,\Pi_\theta s_\theta - \Pi_\theta x^*,

    \frac{d\bar{x}_\theta}{dt} = \Big(a_\theta - \frac{b_\theta^2}{r}\,\Pi_\theta\Big)\bar{x}_\theta - \frac{b_\theta^2}{r}\,s_\theta, \quad 0 \le t < \infty,

    \bar{x}(t) = \int_A \bar{x}_\theta(t)\,dF(\theta),

    x^*(t) = \gamma(\bar{x}(t) + \eta),

    \rho\,\Pi_\theta = 2 a_\theta \Pi_\theta - \frac{b_\theta^2}{r}\,\Pi_\theta^2 + 1, \quad \Pi_\theta > 0, \qquad \text{(Riccati equation)}

where the control action of the generic parameterized agent A_θ is given by u_θ^0(t) = −(b_θ/r)(Π_θ x_θ(t) + s_θ(t)), 0 ≤ t < ∞. u^0 is the optimal tracking feedback law with respect to x*(t) which is an affine function of the mean field term x̄(t), the mean with respect to the parameter distribution F of the θ ∈ A parameterized state means of the agents. Subject to the conditions for the NCE scheme to have a solution, each agent is necessarily in a Nash equilibrium in all full information causal (i.e., non-anticipative) feedback laws with respect to the remainder of the agents when these are employing the law u^0.

It is an important feature of the best response control law u^0 that its form depends only on the parametric data of the entire set of agents, and at any instant it is a feedback function of only the state of the agent A_θ itself and the deterministic mean field-dependent offset s_θ.

For the general nonlinear case, the MFG equations on [0, T] are given by the linked equations for (i) the performance function V for each agent in the continuum, (ii) the FPK for the MV-SDE for that agent, and (iii) the specification of the best response feedback law depending on the mean field measure μ_t and the agent's state x(t). In the uniform agents case, these take the following form.

The Mean Field Game HJB: (MV) FPK Equations

    [MV-HJB] \quad -\frac{\partial V(t,x)}{\partial t} = \inf_{u \in U}\Big\{ f[t, x, u, \mu_t]\,\frac{\partial V(t,x)}{\partial x} + L[t, x, u, \mu_t] \Big\} + \frac{\sigma^2}{2}\,\frac{\partial^2 V(t,x)}{\partial x^2},
    \qquad\qquad V(T, x) = 0, \quad (t,x) \in [0,T] \times \mathbb{R},

    [MV-FPK] \quad \frac{\partial \mu(t,x)}{\partial t} = -\frac{\partial \{ f[t, x, u(t), \mu_t]\,\mu(t,x) \}}{\partial x} + \frac{\sigma^2}{2}\,\frac{\partial^2 \mu(t,x)}{\partial x^2},

    [MV-BR] \quad u(t) = \varphi(t, x(t)\,|\,\mu_t), \quad (t,x) \in [0,T] \times \mathbb{R}.

The general nonlinear MFG problem is approached by different routes in Huang et al. (2006) and Nourian and Caines (2013), and Lasry and Lions (2006a,b, 2007) and Cardaliaguet (2012), respectively. In the former, the so-called probabilistic method solves the MFG equations directly. Subject to technical conditions, an iterated contraction argument establishes the existence of a solution to the HJB-(MV) FPK equations; the best response control laws are obtained from these MFG equations, and these are necessarily Nash equilibria within all causal feedback laws for the infinite population problem. In Lasry and Lions (2006a, 2007) the MFG equations on the infinite time interval (i.e., the ergodic case) are obtained as the limit of Nash equilibria for increasing finite populations, while in the expository notes of Cardaliaguet (2012) the analytic properties of solutions to the HJB-FPK equations on the finite interval are analyzed using PDE methods including the theory of viscosity solutions.

In Huang et al. (2003, 2006, 2007), Nourian and Caines (2013), and Cardaliaguet (2012), it is shown that, subject to technical conditions, the solutions to the HJB-FPK scheme yield ε-Nash solutions for finite population MFGs in that for any ε > 0, there exists a population size N_ε such that for all larger populations the use of the feedback law given by the MFG infinite population scheme gives each agent a value to its performance function within ε of the infinite population Nash value.

A counterintuitive feature of these results is that, asymptotically in population size, observations of the states of rival agents are of no value to any given agent; this is in contrast to the situation in single-agent optimal control theory where the value of observations on an agent's environment is in general positive.

Current Developments and Open Problems

There is now an extensive literature on Mean Field Games, the following being a sample: the mathematical literature has focused on the study of general classes of solutions to the fundamental HJB-FPK equations (see, e.g., Cardaliaguet 2013), while in systems and control, the theory of major-minor agent MFG problems (in economics terminology, atoms and continua) is being developed (Huang 2010; Nourian and Caines 2013; Nguyen and Huang 2012), adaptive control extensions of the LQG theory have been carried out (Kizilkale and Caines 2013), and the risk-sensitive case has
been analyzed (Tembine et al. 2012). Much work is now under way in the applications of MFG theory to economics, finance, distributed energy systems, and electrical power markets. Each of these areas has significant open problems, including the application of mathematical transport theory to HJB-FPK equations, the role of MFG theory in portfolio optimization, and the analysis of systems where the presence of partially observed major and minor agent states incurs mean field and agent state estimation.

Cross-References

▶ Dynamic Noncooperative Games
▶ Game Theory: Historical Overview
▶ Stochastic Dynamic Programming
▶ Stochastic Linear-Quadratic Control
▶ Stochastic Maximum Principle

Bibliography

Altman E, Basar T, Srikant R (2002) Nash equilibria for combined flow control and routing in networks: asymptotic behavior for a large number of users. IEEE Trans Autom Control 47(6):917–930. Special issue on Control Issues in Telecommunication Networks
Aumann RJ, Shapley LS (1974) Values of non-atomic games. Princeton University Press, Princeton
Basar T, Ho YC (1974) Informational properties of the Nash solutions of two stochastic nonzero-sum games. J Econ Theory 7:370–387
Basar T, Olsder GJ (1999) Dynamic noncooperative game theory. SIAM, Philadelphia
Bensoussan A, Frehse J (1984) Nonlinear elliptic systems in stochastic game theory. J Reine Angew Math 350:23–67
Bergin J, Bernhardt D (1992) Anonymous sequential games with aggregate uncertainty. J Math Econ 21:543–562. North-Holland
Cardaliaguet P (2012) Notes on mean field games. Collège de France
Cardaliaguet P (2013) Long term average of first order mean field games and weak KAM theory. Dyn Games Appl 3:473–488
Correa JR, Stier-Moses NE (2010) Wardrop equilibria. In: Cochran JJ (ed) Wiley encyclopedia of operations research and management science. John Wiley & Sons, Chichester, UK
Haurie A, Marcotte P (1985) On the relationship between Nash-Cournot and Wardrop equilibria. Networks 15(3):295–308
Ho YC (1980) Team decision theory and information structures. Proc IEEE 68(6):15–22
Huang MY (2010) Large-population LQG games involving a major player: the Nash certainty equivalence principle. SIAM J Control Optim 48(5):3318–3353
Huang MY, Caines PE, Malhamé RP (2003) Individual and mass behaviour in large population stochastic wireless power control problems: centralized and Nash equilibrium solutions. In: IEEE conference on decision and control, Maui, pp 98–103
Huang MY, Malhamé RP, Caines PE (2006) Large population stochastic dynamic games: closed loop McKean-Vlasov systems and the Nash certainty equivalence principle. Commun Inf Syst 6(3):221–252
Huang MY, Caines PE, Malhamé RP (2007) Large population cost-coupled LQG problems with nonuniform agents: individual-mass behaviour and decentralized ε-Nash equilibria. IEEE Trans Autom Control 52(9):1560–1571
Jovanovic B, Rosenthal RW (1988) Anonymous sequential games. J Math Econ 17(1):77–87. Elsevier
Kizilkale AC, Caines PE (2013) Mean field stochastic adaptive control. IEEE Trans Autom Control 58(4):905–920
Lasry JM, Lions PL (2006a) Jeux à champ moyen. I – Le cas stationnaire. Comptes Rendus Math 343(9):619–625
Lasry JM, Lions PL (2006b) Jeux à champ moyen. II – Horizon fini et contrôle optimal. Comptes Rendus Math 343(10):679–684
Lasry JM, Lions PL (2007) Mean field games. Jpn J Math 2:229–260
Li T, Zhang JF (2008) Asymptotically optimal decentralized control for large population stochastic multiagent systems. IEEE Trans Autom Control 53(7):1643–1660
Neyman A (2002) Values of games with infinitely many players. In: Aumann RJ, Hart S (eds) Handbook of game theory, vol 3. North Holland, Amsterdam, pp 2121–2167
Nourian M, Caines PE (2013) ε-Nash mean field games theory for nonlinear stochastic dynamical systems with major and minor agents. SIAM J Control Optim 50(5):2907–2937
Nguyen SL, Huang M (2012) Linear-quadratic-Gaussian mixed games with continuum-parametrized minor players. SIAM J Control Optim 50(5):2907–2937
Tembine H, Zhu Q, Basar T (2012) Risk-sensitive mean field games. arXiv:1210.2806
Wardrop JG (1952) Some theoretical aspects of road traffic research. In: Proceedings of the institute of civil engineers, London, part II, vol 1, pp 325–378
Weintraub GY, Benkard C, Van Roy B (2005) Oblivious equilibrium: a mean field approximation for large-scale dynamic games. In: Advances in neural information processing systems. MIT, Cambridge
Mechanism Design

Ramesh Johari
Stanford University, Stanford, CA, USA

Abstract

Mechanism design is concerned with the design of strategic environments to achieve desired outcomes at equilibria of the resulting game. We briefly overview central ideas in mechanism design. We survey both objectives the mechanism designer may seek to achieve, as well as equilibrium concepts the designer may use to model agents. We conclude by discussing a seminal example of mechanism design at work: the Vickrey-Clarke-Groves (VCG) mechanisms.

Keywords

Game theory; Incentive compatibility; Vickrey-Clarke-Groves mechanisms

Introduction

Informally, mechanism design might be considered "inverse game theory." In mechanism design, a principal (the "designer") creates a system (the "mechanism") in which strategic agents interact with each other. Typically, the goal of the mechanism designer is to ensure that at an "equilibrium" of the resulting strategic interaction, a "desirable" outcome is achieved. Examples of mechanism design at work include the following:
1. The FCC chooses to auction spectrum among multiple competing, strategic bidders to maximize the revenue generated. How should the FCC design the auction?
2. A search engine decides to run a market for sponsored search advertising. How should the market be designed?
3. The local highway authority decides to charge tolls for certain roads to reduce congestion. How should the tolls be chosen?

In each case, the mechanism designer is shaping the incentives of participants in the system. The mechanism designer must first define the desired objective and then choose a mechanism that optimizes that objective given a prediction of how strategic agents will respond. The theory of mechanism design provides guidance in solving such optimization problems.

We provide a brief overview of some central concepts in mechanism design. In the first section, we delve into more detail on the structure of the optimization problem that a mechanism designer solves. In particular, we discuss two central features of this problem: (1) What is the objective that the mechanism designer seeks to achieve or optimize? (2) How does the mechanism designer model the agents, i.e., what equilibrium concept describes their strategic interactions? In the second section, we study a specific celebrated class of mechanisms, the Vickrey-Clarke-Groves mechanisms.

Objectives and Equilibria

A mechanism design problem requires two essential inputs, as described in the introduction. First, what is the objective the mechanism designer is trying to achieve or optimize? And second, what are the constraints within which the mechanism designer operates? On the latter question, perhaps the biggest "constraint" in mechanism design is that the agents are assumed to act rationally in response to whatever mechanism is imposed on them. In other words, the mechanism designer needs to model how the agents will interact with each other. Mathematically, this is modeled by a choice of equilibrium concept. For simplicity, we focus only on static mechanism design, i.e., mechanism design for settings where all agents act simultaneously.

Objectives
In this section we briefly discuss three objectives the mechanism designer may choose to optimize for: efficiency, revenue, and a fairness criterion.
1. Efficiency. When the mechanism designer focuses on "efficiency," they are interested in ensuring that the equilibrium outcome of the game they create is a Pareto efficient outcome. In other words, at an equilibrium of the game, there should be no individual that can be made strictly better off while leaving all others at least as well off as they were before. The most important instantiation of the efficiency criterion arises in quasilinear settings, i.e., settings where the utility of all agents is measured in a common, transferable monetary unit. In this case, it can be shown that achieving efficient outcomes is equivalent to maximizing the aggregate utility of all agents in the system. See Chap. 23 in Mas-Colell et al. (1995) for more details on mechanism design for efficient outcomes.
2. Revenue. Efficiency may be a reasonable goal for an impartial social planner; on the other hand, in many applied settings, the mechanism designer is often herself a profit-maximizing party. In these cases, it is commonly the goal of the mechanism designer to maximize her own payoff from the mechanism itself. A common example of this scenario is in the design of optimal auctions. An auction is a mechanism for the sale of a good (or multiple goods) among many competing buyers. When the principal is self-interested, she may wish to choose the auction design that maximizes her revenue from sale; the celebrated paper of Myerson (1981) studies this problem in detail.
3. Fairness. Finally, in many settings, the mechanism designer may be interested more in achieving a "fair" outcome – even if such an outcome is potentially not Pareto efficient. Fairness is subjective, and therefore, there are many potential objectives that might be viewed as fair by the mechanism designer. One common setting where the mechanism design strives for fair outcomes is in cost sharing: in a canonical example, the cost of a project must be shared "fairly" among many participants. See Chap. 15 of Nisan et al. (2007) for more discussion of such mechanisms.

Equilibrium Concepts
In this section we briefly discuss a range of equilibrium concepts the mechanism designer might use to model the behavior of players. From an optimization viewpoint, mechanism design should be viewed as maximization of the designer's objective, subject to an equilibrium constraint. The equilibrium concept used captures the mechanism designer's judgment about how the agents can be expected to interact with each other, once the mechanism designer has fixed the mechanism. Here we briefly discuss three possible equilibrium concepts that might be used by the mechanism designer.
1. Dominant strategies. In dominant strategy implementation, the mechanism designer assumes that agents will play a (weak or strict) dominant strategy against their competitors. This equilibrium concept is obviously quite strong, as dominant strategies may not exist in general. However, the advantage is that when the mechanism possesses dominant strategies for each player, the prediction of play is quite strong. The Vickrey-Clarke-Groves mechanisms described below are central in the theory of mechanism design with dominant strategies.
2. Bayesian equilibrium. In a Bayesian equilibrium, agents optimize given a common prior distribution about the other agents' preferences. In Bayesian mechanism design, the mechanism designer chooses a mechanism taking into account that the agents will play according to a Bayesian equilibrium of the resulting game. This solution concept allows the designer to capture a lack of complete information among players, but typically allows for a richer family of mechanisms than mechanism design with dominant strategies. Myerson's work on optimal auction design is carried out in a Bayesian framework (Myerson 1981).
3. Nash equilibrium. Finally, in a setting where the mechanism designer believes the agents will be quite knowledgeable about each other's preferences, it may be reasonable to assume they will play a Nash equilibrium of the resulting game. Note that in this case,
714 Mechanism Design
it is typically assumed the designer does not know the utilities of agents at the time the mechanism is chosen – even though agents do know their own utilities at the time the resulting game is actually played. See, e.g., Moore (1992) for an overview of mechanism design with Nash equilibrium as the solution concept.

The Vickrey-Clarke-Groves Mechanisms

In this section, we describe a seminal example of mechanism design at work: the Vickrey-Clarke-Groves (VCG) class of mechanisms. The key insight behind VCG mechanisms is that by structuring payment rules correctly, individuals can be incentivized to truthfully declare their utility functions to the market and in turn achieve an efficient allocation. VCG mechanisms are an example of mechanism design with dominant strategies and with the goal of welfare maximization, i.e., efficiency. The presentation here is based on the material in Chap. 5 of Berry and Johari (2011), and the reader is referred there for further discussion. See also Vickrey (1961), Clarke (1971), and Groves (1973) for the original papers discussing this class of mechanisms.

To illustrate the principle behind VCG mechanisms, consider a simple example where we allocate a single resource of unit capacity among $R$ competing users. Each user's utility is measured in terms of a common currency unit; in particular, if the allocated amount is $x_r$ and the payment to user $r$ is $t_r$, then her utility is $U_r(x_r) + t_r$; we refer to $U_r$ as the valuation function, and let the space of valuation functions be denoted by $\mathcal{U}$. For simplicity we assume the valuation functions are continuous. In line with our discussion of efficiency above, it can be shown that the Pareto efficient allocation is obtained by solving the following:

$$\text{maximize} \quad \sum_r U_r(x_r) \quad (1)$$
$$\text{subject to} \quad \sum_r x_r \le 1; \quad (2)$$
$$x \ge 0. \quad (3)$$

However, achieving the efficient allocation requires knowledge of the utility functions; what can we do if these are unknown? The key insight is to make each user act as if they are optimizing the aggregate utility, by structuring payments appropriately. The basic approach in a VCG mechanism is to let the strategy space of each user $r$ be the set $\mathcal{U}$ of possible valuation functions and make a payment $t_r$ to user $r$ so that her net payoff has the same form as the social objective (1). In particular, note that if user $r$ receives an allocation $x_r$ and a payment $t_r$, the payoff to user $r$ is
$$U_r(x_r) + t_r.$$
On the other hand, the social objective (1) can be written as
$$U_r(x_r) + \sum_{s \ne r} U_s(x_s).$$
Comparing the preceding two expressions, the most natural means to align user objectives with the social planner's objectives is to define the payment $t_r$ as the sum of the valuations of all users other than $r$.

A VCG mechanism first asks each user to declare a valuation function. For each $r$, we use $\hat{U}_r$ to denote the declared valuation function of user $r$ and use $\hat{U} = (\hat{U}_1, \ldots, \hat{U}_R)$ to denote the vector of declared valuations. Formally, given a vector of declared valuation functions $\hat{U}$, a VCG mechanism chooses the allocation $x(\hat{U})$ as an optimal solution to (1)–(2) given $\hat{U}$, i.e.,
$$x(\hat{U}) \in \arg\max_{x \ge 0:\ \sum_r x_r \le 1} \; \sum_r \hat{U}_r(x_r). \quad (4)$$
The payments are then structured so that
$$t_r(\hat{U}) = \sum_{s \ne r} \hat{U}_s(x_s(\hat{U})) + h_r(\hat{U}_{-r}). \quad (5)$$
Here $h_r$ is an arbitrary function of the declared valuation functions of users other than $r$, and various definitions of $h_r$ give rise to variants of the VCG mechanisms.
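As a concrete illustration of the allocation rule (4) and payment rule (5), the following sketch implements the mechanism for two users. The concave valuation family $U_r(x) = a_r \log(1+x)$, the grid search standing in for the maximization, and the Clarke-pivot choice of $h_r$ are all illustrative assumptions, not prescribed by the text:

```python
import numpy as np

a_true = np.array([3.0, 2.0])          # true valuation parameters (R = 2 users)

def U(a, x):
    # Hypothetical concave, increasing valuation U(x) = a*log(1 + x)
    return a * np.log1p(x)

grid = np.linspace(0.0, 1.0, 1001)     # candidate splits x = (g, 1 - g) of unit capacity

def allocate(a_decl):
    """Welfare-maximizing allocation x(U_hat), as in Eq. (4), by grid search."""
    welfare = U(a_decl[0], grid) + U(a_decl[1], 1.0 - grid)
    g = grid[np.argmax(welfare)]
    return np.array([g, 1.0 - g])

def payment(a_decl, r):
    """Payment (5) with the Clarke pivot: h_r = -(best welfare of others without r)."""
    x = allocate(a_decl)
    s = 1 - r                           # the other user
    return U(a_decl[s], x[s]) - U(a_decl[s], 1.0)

def payoff(r, a_report):
    """Realized payoff U_r(x_r) + t_r of user r when reporting a_report."""
    a_decl = a_true.copy()
    a_decl[r] = a_report
    x = allocate(a_decl)
    return U(a_true[r], x[r]) + payment(a_decl, r)
```

On the allocation grid, no misreport of user 0 can yield a strictly larger payoff than reporting the true parameter, which mirrors the dominant-strategy property of VCG mechanisms.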
Since user $r$ cannot affect this term through the choice of $\hat{U}_r$, she chooses $\hat{U}_r$ to maximize
$$U_r(x_r(\hat{U})) + \sum_{s \ne r} \hat{U}_s(x_s(\hat{U})).$$
Now note that given $\hat{U}_{-r}$, the above expression is bounded above by
$$\max_{x \ge 0:\ \sum_r x_r \le 1} \; \Big[ U_r(x_r) + \sum_{s \ne r} \hat{U}_s(x_s) \Big].$$
But since $x(\hat{U})$ satisfies (4), user $r$ can achieve the preceding maximum by truthfully declaring $\hat{U}_r = U_r$. Since this optimal strategy does not depend on the valuation functions $(\hat{U}_s, s \ne r)$ declared by the other users, we recover the important fact that in a VCG mechanism, truthful declaration is a weak dominant strategy for user $r$.

For our purposes, the interesting feature of the VCG mechanism is that it elicits the true utilities from the users and in turn (because of the definition of $x(\hat{U})$) chooses an efficient allocation. The feature that truthfulness is a dominant strategy is known as incentive compatibility: the individual incentives of users are aligned, or "compatible," with overall efficiency of the system. The VCG mechanism achieves this by effectively paying each agent to tell the truth. The significance of the approach is that this payment can be properly structured even if the resource manager does not have prior knowledge of the true valuation functions.

Summary and Future Directions

Mechanism design provides an overarching framework for the "engineering" of economic systems. However, many significant challenges remain. First, VCG mechanisms are not computationally tractable in complex settings, e.g., combinatorial auctions (Hajek 2013); finding computationally tractable yet efficient mechanisms is a very active area of current research. Second, VCG mechanisms optimize for overall welfare, rather than revenue, and finding simple mechanisms that maximize revenue also presents new challenges. Finally, we have only considered implementation in static environments. Most practical mechanism design settings are dynamic. Dynamic mechanism design remains an active area of fruitful research.

Cross-References

- Auctions
- Game Theory: Historical Overview
- Linear Quadratic Zero-Sum Two-Person Differential Games

Bibliography

Berry RA, Johari R (2011) Economic modeling in networking: a primer. Found Trends Netw 6(3):165–286
Clarke EH (1971) Multipart pricing of public goods. Public Choice 11(1):17–33
Groves T (1973) Incentives in teams. Econometrica 41(4):617–631
Hajek B (2013) Auction theory. In: Encyclopedia of systems and control. Springer
Mas-Colell A, Whinston M, Green J (1995) Microeconomic theory. Oxford University Press, New York
Moore J (1992) Implementation, contracts, and renegotiation in environments with complete information. Adv Econ Theory 1:182–281
Myerson RB (1981) Optimal auction design. Math Oper Res 6(1):58–73
Nisan N, Roughgarden T, Tardos E, Vazirani V (eds) (2007) Algorithmic game theory. Cambridge University Press, Cambridge/New York
Vickrey W (1961) Counterspeculation, auctions, and competitive sealed tenders. J Financ 16(1):8–37

Model Building for Control System Synthesis

Marco Lovera and Francesco Casella
Politecnico di Milano, Milan, Italy

Abstract

The process of developing control-oriented mathematical models of physical systems is a complex task, which in general implies a careful combination of prior knowledge about the physics of
the system under study with information coming from experimental data. In this article the role of mathematical models in control system design and the problem of developing compact control-oriented models are discussed.

Keywords

Analytical models; Computational modeling; Continuous-time systems; Control-oriented modeling; Discrete-time systems; Parameter-varying systems; Simulation; System identification; Time-invariant systems; Time-varying systems; Uncertainty

Introduction

The design of automatic control systems requires the availability of some knowledge of the dynamics of the process to be controlled. In this respect, current methods for control system synthesis can be classified in two broad categories: model-free and model-based ones.

The former aim at designing (or tuning) controllers solely on the basis of experimental data collected directly on the plant, without resorting to mathematical models.

The latter, on the contrary, assume that suitable models of the plant to be controlled are available, and rely on this information to work out control laws capable of meeting the design requirements.

While the research on model-free design methods is a very active field, the vast majority of control synthesis methods and tools fall in the model-based category and therefore assume that knowledge about the plant to be controlled is encoded in the form of dynamic models of the plant itself. Furthermore, in an increasing number of application areas, control system performance is becoming a key competitive factor for the success of innovative, high-tech systems. Consider, for example, high-performance mechatronic systems (such as robots); vehicles enhanced by active integrated stability, suspension, and braking control; aerospace systems; advanced energy conversion systems. All the abovementioned applications possess at least one of the following features, which in turn call for accurate mathematical modeling for the design of the control system: closed-loop performance critically depends on the dynamic behavior of the plant; the system is made of many closely interacting subsystems; advanced control systems are required to obtain competitive performance, and these in turn depend on explicit mathematical models for their design; the system is safety critical and requires extensive validation of closed-loop stability and performance by simulation.

Therefore, building control-oriented mathematical models of physical systems is a crucial prerequisite to the design process itself (see, e.g., Lovera (2014) for a more detailed treatment of this topic).

In the following, two aspects related to modeling for control system synthesis will be discussed, namely, the role of models for control system synthesis and the actual process of model building itself.

The Role of Models for Control System Synthesis

Mathematical models play a number of different roles in the design of control systems. In particular, different classes of mathematical models are usually employed: detailed, high-fidelity models for system simulation and compact models for control design. In this section the two model classes are presented and their respective roles in the design of control systems are described. Note, in passing, that although hybrid system control is an interesting and emerging field, this entry will focus on purely continuous-time physical models, with application to the design of continuous-time or sampled-time control systems.

Detailed Models for System Simulation

Detailed models play a double role in the control design process. On one hand, they allow checking how good (or crude) the compact model is, compared to a more detailed description, thus helping
to develop good compact models. On the other hand, they allow closed-loop performance verification of the controlled system, once a controller design is available. Indeed, verifying closed-loop performance using the same simplified model that was used for control system design is not a sound practice; conversely, verification performed with a more detailed model is usually deemed a good indicator of the control system performance, whenever experimental validation is not possible for some reason.

Object-oriented modeling (OOM) methodologies and equation-based, object-oriented languages (EOOLs) provide very good support for the development of such high-fidelity models, thanks to equation-based modeling, acausal physical ports, hierarchical system composition, and inheritance; see Tiller (2001) for a comprehensive overview. Any continuous-time EOOL model is equivalent to the set of differential-algebraic equations (DAEs)
$$F(\dot{x}(t), x(t), u(t), y(t), p, t) = 0, \quad (1)$$
where $x$ is the vector of dynamic variables, $u$ is the vector of input variables, $y$ is the vector of algebraic variables, $p$ is the vector of parameters, and $t$ is the time. It is interesting to highlight that the object-oriented approach (in particular, the use of replaceable components) allows defining and managing families of models of the same plant with different levels of complexity, by providing more or less detailed implementations of the same abstract interfaces. This feature of OOM allows the development of simulation models for different purposes and with different degrees of detail throughout the entire life of an engineering project, from preliminary design down to commissioning and personnel training, all within a coherent framework.

In particular, when focusing on control systems verification (and regardless of the actual control design methodology), once the controller has been set up, an OOM tool can be used to run closed-loop simulations, including both the plant and the controller model. Many OOM tools provide model export facilities, which allow connecting a plant model with only causal external connectors (actuator inputs and sensor outputs) to a causal controller model in a causal simulation environment. From a mathematical point of view, this corresponds to reformulating (1) in state-space form, by means of analytical and/or numerical transformations.

Finally, it is important to point out that physical model descriptions based on partial differential equations (PDEs) can be handled in the OOM framework by means of discretization using finite volume, finite element, or finite difference methods.

Compact Models for Control Design

The requirements for a control-oriented model can vary significantly from application to application. Design models can be tentatively classified in terms of two key features: complexity and accuracy. For a dynamic model, complexity can be measured in terms of its order; accuracy, on the other hand, can be measured using many different metrics (e.g., time-domain simulation or prediction error, frequency-domain matching with the real plant, etc.) related to the capability of the model to reproduce the behavior of the true system in the operating conditions of interest.

Broadly speaking, it can be safely stated that the required level of closed-loop performance drives the requirements on the accuracy and complexity of the design model. Similarly, it is intuitive that more complex models have the potential for being more accurate. So, one might be tempted to resort to very detailed mathematical representations of the plant to be controlled in order to maximize closed-loop performance. This consideration, however, is moderated by a number of additional requirements, which actually end up driving the control-oriented modeling process. First of all, present-day controller synthesis methods and tools have computational limitations in terms of the complexity of the mathematical models they can handle, so compact models representative of the dominant dynamics of the system under study are what is really needed. Furthermore, for many synthesis methods (such as, e.g., LQG or $H_\infty$ synthesis), the complexity of the design model has an impact on the complexity of the controller, which in turn is constrained by implementation issues.
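To make the DAE form (1) and its state-space reformulation concrete, here is a deliberately tiny sketch. The plant (a series RC circuit) and all numerical values are hypothetical, not an example from the article:

```python
import numpy as np

# Hypothetical series RC circuit written as a DAE F(xdot, x, u, y, p, t) = 0,
# with dynamic variable x = v (capacitor voltage) and algebraic variable
# y = i (loop current), driven by a source voltage u.
R_ohm, C_farad = 1.0e3, 1.0e-6

def dae_residual(vdot, v, u, i):
    """Residual of Eq. (1): capacitor law and Kirchhoff's voltage law."""
    return np.array([C_farad * vdot - i,      # C*dv/dt = i
                     u - R_ohm * i - v])      # u = R*i + v

# Eliminating the algebraic variable (i = (u - v)/R) reformulates the DAE in
# the state-space form vdot = f(v, u) used by causal simulators:
def f_state_space(v, u):
    return (u - v) / (R_ohm * C_farad)

# Consistency check: the state-space right-hand side satisfies the DAE residual.
v0, u0 = 0.2, 1.0
i0 = (u0 - v0) / R_ohm
residual = dae_residual(f_state_space(v0, u0), v0, u0, i0)
```

Here the elimination was done by hand; EOOL tools perform the analogous analytical/numerical transformations automatically on much larger equation sets.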
Last but not least, in engineering projects, the budget of the control-oriented modeling activity is usually quite limited, so the achievable level of accuracy is affected by this limitation.

It is clear from the above discussion that developing mathematical models suitable for control system synthesis is not a trivial exercise but rather corresponds to the pursuit of a careful tradeoff between complexity and accuracy. Furthermore, throughout the model development, one should keep in mind the eventual control application of the model, so its mathematical structure has to be compatible with currently available methods and tools for control system analysis and design.

Control-oriented models are usually formulated in state-space form:
$$\dot{x}(t) = f(x(t), u(t), p, t), \qquad y(t) = g(x(t), u(t), p, t), \quad (2)$$
where $x$ is the vector of state variables, $u$ is the vector of system inputs (control variables and disturbances), $y$ is the vector of system outputs, $p$ is the vector of parameters, and $t$ is the continuous time. In the following, however, the focus will be on linear models, which constitute the starting point for most control law design methods and tools. In this respect, the main categories of models used in control system synthesis can be defined as follows.

Linear Time-Invariant Models

Linear time-invariant (LTI) models can be described in state-space form as
$$\dot{x}(t) = Ax(t) + Bu(t), \qquad y(t) = Cx(t) + Du(t), \quad (3)$$
or, equivalently, using an input-output model given by the (rational) transfer function
$$G(s) = C(sI - A)^{-1}B + D, \quad (4)$$
where $s$ denotes the Laplace variable. In many cases, the dynamics of systems in the form (2) in the neighborhood of an equilibrium (trim) point is approximated by (3) via analytical or numerical linearization.

If, on the contrary, the control-oriented model is obtained by linearization of the DAE system (1), then a generalized LTI (or descriptor) model in the form
$$E\dot{x}(t) = Ax(t) + Bu(t), \qquad y(t) = Cx(t) + Du(t) \quad (5)$$
is obtained. Clearly, a generalized LTI model is equivalent to a conventional one as long as $E$ is nonsingular. The generalized form, however, encompasses the wider class of linearized plants with a singular $E$.

Linear Time-Varying Models

In some engineering applications, the need may arise to linearize the detailed model in the neighborhood of a trajectory rather than around an equilibrium point. Whenever this is the case, a linear time-varying (LTV) model is obtained, in the form
$$\dot{x}(t) = A(t)x(t) + B(t)u(t), \qquad y(t) = C(t)x(t) + D(t)u(t). \quad (6)$$
An important subclass is the one of time-periodic behavior of the state-space matrices of the model, which corresponds to a linear time-periodic (LTP) model. LTP models arise when considering the linearization along periodic trajectories or, as it occurs in a number of engineering problems, whenever rotating systems are considered (e.g., spacecraft, rotorcraft, wind turbines). Finally, it is interesting to recall that (discrete-time) LTP models are needed to model multi-rate sampled data systems.
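The passage from the nonlinear model (2) to the LTI forms (3)–(4) can be sketched numerically. The plant below (a damped pendulum) and the central finite-difference scheme are illustrative assumptions, not taken from the article:

```python
import numpy as np

g_over_l, d = 9.81, 0.5  # hypothetical pendulum parameters

def f(x, u):
    """Nonlinear dynamics (2): x = [angle, angular rate], torque input u."""
    return np.array([x[1], -g_over_l * np.sin(x[0]) - d * x[1] + u])

def linearize(fun, x0, u0, eps=1e-6):
    """Central finite-difference Jacobians A = df/dx, B = df/du at (x0, u0)."""
    n = len(x0)
    A = np.zeros((n, n))
    for j in range(n):
        dx = np.zeros(n)
        dx[j] = eps
        A[:, j] = (fun(x0 + dx, u0) - fun(x0 - dx, u0)) / (2 * eps)
    B = ((fun(x0, u0 + eps) - fun(x0, u0 - eps)) / (2 * eps)).reshape(n, 1)
    return A, B

# LTI model (3) around the downward equilibrium (trim) point x = 0, u = 0:
A, B = linearize(f, np.zeros(2), 0.0)
Cmat = np.array([[1.0, 0.0]])   # measured output: the angle
D = np.zeros((1, 1))

def G(s):
    """Transfer function (4): C (sI - A)^(-1) B + D."""
    return Cmat @ np.linalg.solve(s * np.eye(2) - A, B) + D
```

For this plant the finite-difference result matches the analytical linearization, and the DC gain `G(0)` equals `1/g_over_l`.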
Linear Parameter-Varying Models

The control-oriented modeling problem can be also formulated as the one of simultaneously representing all the linearizations of interest for control purposes of a given nonlinear plant. Indeed, in many control engineering applications a single control system must be designed to guarantee the satisfactory closed-loop operation of a given plant in many different operating conditions (either equilibria or trajectories). Many design techniques are now available for this problem (see, e.g., Mohammadpour and Scherer 2012), provided that a suitable model in parameter-dependent form has been derived for the system to be controlled. Linear parameter-varying (LPV) models, described in state-space form as
$$\dot{x}(t) = A(p(t))x(t) + B(p(t))u(t), \qquad y(t) = C(p(t))x(t) + D(p(t))u(t), \quad (7)$$
are linear models the state-space representation of which depends on a parameter vector $p$ that can be time varying. The elements of vector $p$ may or may not be measurable, depending on the specific problem formulation. The present state of the art of LPV modeling can be briefly summarized by defining two classes of approaches (see Lopes dos Santos et al. (2011) for details): analytical methods, based on the availability of reliable nonlinear equations for the dynamics of the plant, from which suitable control-oriented representations can be derived (by resorting to, broadly speaking, suitable extensions of the familiar notion of linearization, developed in order to take into account off-equilibrium operation of the system); and experimental methods, based entirely on identification, i.e., aimed at deriving LPV models for the plant directly from input/output data. In particular, some LPV identification techniques assume one global identification experiment in which both the control input and the parameter vector are (persistently) excited in a simultaneous way, while others aim at deriving a parameter-dependent model on the basis of local experiments only, i.e., experiments in which the parameter vector is held constant and only the control input is excited.

Modern control theory provides methods and tools to deal with design problems in which stability and performance have to be guaranteed also in the presence of model uncertainty, both for regulation around a specified operating point and for gain scheduled control system design. Therefore, modeling for control system synthesis should also provide methods to account for model uncertainty (both parametric and nonparametric) in the considered model class.

Most of the existing control design literature assumes that the plant model is given in the form of a linear fractional transformation (LFT) (see, e.g., Skogestad and Postlethwaite (2007) for an introduction to LFT modeling of uncertainty and Hecker et al. (2005) for a discussion of algorithms and software tools). LFT models consist of a feedback interconnection between a nominal LTI plant and a (usually norm-bounded) operator which represents model uncertainties, e.g., poorly known or time-varying parameters, nonlinearities, etc. A generic such LFT interconnection is depicted in Fig. 1, where the nominal plant is denoted with $P$ and the uncertainty block is denoted with $\Delta$. The LFT formalism can be also used to provide a structured representation for the state-space form of LPV models, as depicted in Fig. 2, where the block $\Delta(\alpha)$ takes into account the presence of the uncertain parameter vector $\alpha$, while the block $\Delta(p)$ models the effect of the varying operating point, parameterized by the vector of time-varying parameters $p$. Therefore, LFT models can be used for the design of robust and gain scheduling controllers; in addition they can also serve as a basis for structured model identification techniques, where the uncertain parameters that appear in the feedback blocks are estimated based on input/output data sequences.

[Model Building for Control System Synthesis, Fig. 1 Block diagram of the typical LFT interconnection adopted in the robust control framework]

[Model Building for Control System Synthesis, Fig. 2 Block diagram of the typical LFT interconnection adopted in the robust LPV control framework]

The process of extracting uncertain/scheduling parameters from the design model of the system to be controlled is a highly complex one, in which symbolic techniques play a very important role.
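As a minimal numerical sketch of the interconnection of Fig. 1 (with hypothetical scalar data): an uncertain gain $k = k_0(1 + w\delta)$ is pulled out into a $\Delta$ block, and evaluating the upper LFT $F_u(P, \Delta) = P_{22} + P_{21}\Delta(I - P_{11}\Delta)^{-1}P_{12}$ recovers the perturbed plant:

```python
# Hypothetical uncertain static gain y = k*u with k = k0*(1 + w*delta),
# |delta| <= 1; k0 is the nominal gain, w the relative uncertainty weight.
k0, w = 2.0, 0.1

# Coefficient matrix P = [[P11, P12], [P21, P22]] of the LFT realization:
P11, P12 = 0.0, 1.0
P21, P22 = k0 * w, k0

def f_upper(delta):
    """Upper LFT F_u(P, Delta) for a scalar Delta block."""
    return P22 + P21 * delta * (1.0 - P11 * delta) ** -1 * P12
```

Evaluating the LFT at a given `delta` reproduces the perturbed gain, e.g. `f_upper(0.5)` equals `k0 * (1 + 0.5 * w)`; the same algebra, with matrices in place of scalars, underlies the robust and LPV design tools cited above.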
Tools already exist to perform this task (see, e.g., Hecker et al. 2005), while a recent overview of the state of the art in this research area can be found in Hecker and Varga (2006).

Finally, it is important to point out that there is a vast body of advanced control techniques which are based on discrete-time models:
$$x(k+1) = f(x(k), u(k), p, k), \qquad y(k) = g(x(k), u(k), p, k), \quad (8)$$
where the integer time step $k$ usually corresponds to multiples of a sampling period $T_s$. Many techniques are available to transform (2) into (8). Furthermore, LTI, LTV, and LPV models can be formulated in discrete time rather than in continuous time.

Building Models for Control System Synthesis

The development of control-oriented models of physical systems is a complex task, which in general implies a careful combination of prior knowledge about the physics of the system under study with information coming from experimental data. In particular, this process can follow very different paths depending on the type of information which is available on the plant to be controlled. Such paths are typically classified in the literature as follows (see, e.g., Ljung (2008) for a more detailed discussion).

White box modeling refers to the development of control-oriented models on the basis of first principles only. In this framework, one uses the available information on the plant to develop a detailed model using OOM or EOOL tools and subsequently works out a compact control-oriented model from it. If the adopted tool only supports simulation, then one can run simulations of the plant model, subject to suitably chosen excitation inputs (ranging from steps to persistently exciting input sequences such as, e.g., pseudorandom binary sequences and sine sweeps), and then reconstruct the dynamics by means of system identification methods. Note that in this way the structure/order selection stage of the system identification process provides effective means to manage the complexity versus accuracy tradeoff in the derivation of the compact model. A more direct approach, presently supported by many tools, is to directly compute the $A, B, C, D$ matrices of the linearized system around specified equilibrium (trim) points, using symbolic and/or numerical linearization techniques. The result is usually a high-order linear system, which then can (sometimes must) be reduced to a low-order system by using model order reduction techniques (such as, e.g., balanced truncation). Model reduction techniques (see Antoulas (2009) for an in-depth treatment of this topic) allow to automatically obtain approximated compact models such as (3), starting from much more detailed simulation models, by formulating specific approximation bounds in control-relevant terms (e.g., percentage errors of steady-state output values, norm-bounded additive or multiplicative errors of weighted transfer functions, or $L_2$-norm errors of output transients in response to specified input signals).

Black box modeling, on the other hand, corresponds to situations in which the modeling activity is entirely based on input-output data collected on the plant (which therefore must be already available), possibly in dedicated, suitably designed experiments (see Ljung 1999). Regardless of the type of model to be built (i.e., linear or nonlinear, time invariant or time varying, discrete time or continuous time), the black box approach consists of a number of well-defined steps. First of all, the structure of the model to be identified must be defined: in the linear time-invariant case, this corresponds to the choice of the number of poles and zeros for an input-output model or to the choice of model order for a state-space representation; in the nonlinear case, structure selection is a much more involved process in view of the much larger number of degrees of freedom which are potentially involved.
Once a model structure has been defined, a suitable cost function to measure the model performance must be selected (e.g., time-domain simulation or prediction error, frequency-domain model fitting, etc.) and the experiments to collect identification and validation data must be designed. Finally, the uncertain model parameters must be estimated from the available identification dataset and the model must be validated on the validation dataset.

Grey box modeling (in various shades) corresponds to the many possible intermediate cases which can occur in practice, ranging from the white box approach to the black box one. As recently discussed in Ljung (2008), the critical issue in the development of an effective approach to control-oriented grey box modeling lies in the integration of existing methods and tools for physical systems modeling and simulation with methods and tools for parameter estimation. Such integration can take place in a number of different ways depending on the relative role of data and priors on the physics of the system in the specific application. A typical situation which occurs frequently in applications is when a white box model (developed by means of OOM or EOOL tools) contains parameters having unknown or uncertain numerical values (such as, e.g., damping factors in structural models, aerodynamic coefficients in aircraft models, and so on). Then, one may rely on input-output data collected in dedicated experiments on the real system to refine the white box model by estimating the parameters using the information provided by the data. This process is typically dependent on the specific application domain, as the type of experiment, the number of measurements, and the estimation technique must meet application-specific constraints (see, e.g., Klein and Morelli (2006) for an overview of grey box modeling in aerospace applications).

Summary and Future Directions

In this article the problem of model building for control system synthesis has been considered. An overview of the different uses of mathematical models in control system design has been provided and the process of building compact control-oriented models starting from prior knowledge about the system and/or experimental data has been discussed. Present-day modeling and simulation tools support advanced control system design in a much more direct way. In particular, while methods and tools for the individual steps in the modeling process (such as OOM, linearization and model reduction, parameter estimation) are available, an integrated environment enabling the pursuit of all the abovementioned paths to the development of compact control-oriented models is still a subject for future development. The availability of such a tool might further promote the application of advanced, model-based techniques that are currently limited by the model development process.

Cross-References

- Modeling of Dynamic Systems from First Principles
- Multi-domain Modeling and Simulation
- System Identification: An Overview

Bibliography

Antoulas A (2009) Approximation of large-scale dynamical systems. SIAM, Philadelphia
Hecker S, Varga A (2006) Symbolic manipulation techniques for low order LFT-based parametric uncertainty modelling. Int J Control 79(11):1485–1494
Hecker S, Varga A, Magni J-F (2005) Enhanced LFR-toolbox for MATLAB. Aerosp Sci Technol 9(2):173–180
Klein V, Morelli EA (2006) Aircraft system identification: theory and practice. AIAA, Reston
Ljung L (1999) System identification: theory for the user. Prentice-Hall, New Jersey
Ljung L (2008) Perspectives on system identification. In: 2008 IFAC world congress, Seoul
Lopes dos Santos P, Azevedo Perdicoulis TP, Novara C, Ramos JA, Rivera DE (eds) (2011) Linear parameter-varying system identification: new developments and trends. World Scientific, Singapore
Lovera M (ed) (2014) Control-oriented modelling and identification: theory and practice. IET, London
Mohammadpour J, Scherer C (eds) (2012) Control of linear parameter varying systems with applications. Springer, New York
Skogestad S, Postlethwaite I (2007) Multivariable feedback control analysis and design. Wiley, Chichester/New York
Tiller M (2001) Introduction to physical modelling with Modelica. Kluwer, Boston

MHE

► Moving Horizon Estimation

Model Order Reduction: Techniques and Tools

Peter Benner¹ and Heike Faßbender²
¹Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg, Germany
²Institut Computational Mathematics, Technische Universität Braunschweig, Braunschweig, Germany

Abstract

Model order reduction (MOR) is here understood as a computational technique to reduce the order of a dynamical system described by a set of ordinary or differential-algebraic equations (ODEs or DAEs) to facilitate or enable its simulation, the design of a controller, or optimization and design of the physical system modeled. It focuses on representing the map from inputs into the system to its outputs, while its dynamics are treated as a black box so that the large-scale set of describing ODEs/DAEs can be replaced by a much smaller set of ODEs/DAEs without sacrificing the accuracy of the input-to-output behavior.

Keywords

Balanced truncation; Interpolation-based methods; Reduced-order models; SLICOT; Truncation-based methods

Problem Description

This survey is concerned with linear time-invariant (LTI) systems in state-space form

  E x'(t) = A x(t) + B u(t),
  y(t) = C x(t) + D u(t),   (1)

where E, A ∈ R^{n×n} are the system matrices, B ∈ R^{n×m} is the input matrix, C ∈ R^{p×n} is the output matrix, and D ∈ R^{p×m} is the feedthrough (or input–output) matrix. The size n of the matrix A is often referred to as the order of the LTI system. It mainly determines the amount of time needed to simulate the LTI system.

Such LTI systems often arise from a finite element modeling using commercial software such as ANSYS or NASTRAN which results in a second-order differential equation of the form

  M x''(t) + D x'(t) + K x(t) = F u(t),
  y(t) = C_p x(t) + C_v x'(t),

where the mass matrix M, the stiffness matrix K, and the damping matrix D are square matrices in R^{s×s}, F ∈ R^{s×m}, C_p, C_v ∈ R^{q×s}, x(t) ∈ R^s, u(t) ∈ R^m, y(t) ∈ R^q. Such second-order differential equations are typically transformed to a mathematically equivalent first-order differential equation

  [ I  0 ] [ x'(t)  ]   [  0   I ] [ x(t)  ]   [ 0 ]
  [ 0  M ] [ x''(t) ] = [ -K  -D ] [ x'(t) ] + [ F ] u(t),

  y(t) = [ C_p  C_v ] [ x(t); x'(t) ],

that is, E z'(t) = A z(t) + B u(t), y(t) = C z(t) with the state z(t) = [x(t); x'(t)].
Here E, A ∈ R^{2s×2s}, B ∈ R^{2s×m}, C ∈ R^{q×2s}, z(t) ∈ R^{2s}, u(t) ∈ R^m, y(t) ∈ R^q. Various other linearizations have been proposed in the literature.

The matrix E may be singular. In that case the first equation in (1) defines a system of differential-algebraic equations (DAEs); otherwise it is a system of ordinary differential equations (ODEs). For example, for E = [J 0; 0 0] with a j × j nonsingular matrix J, only the first j equations in the left-hand side expression in (1) form ordinary differential equations, while the last n − j equations form homogeneous linear equations. If further A = [A_11 A_12; 0 A_22] and B = [B_1; B_2] with the j × j matrix A_11, the j × m matrix B_1 and a nonsingular matrix A_22, this is easily seen: partitioning the state vector x(t) = [x_1(t); x_2(t)] with x_1(t) of length j, the DAE E x'(t) = A x(t) + B u(t) splits into the algebraic equation 0 = A_22 x_2(t) + B_2 u(t) and the ODE

  J x_1'(t) = A_11 x_1(t) + (B_1 - A_12 A_22^{-1} B_2) u(t).

To simplify the description, only continuous-time systems are considered here. The discrete-time case can be treated mostly analogously; see, e.g., Antoulas (2005).

An alternative way to represent LTI systems is provided by the transfer function matrix (TFM), a matrix-valued function whose elements are rational functions. Assuming x(0) = 0 and taking Laplace transforms in (1) yields s E X(s) = A X(s) + B U(s), Y(s) = C X(s) + D U(s), where X(s), Y(s), and U(s) are the Laplace transforms of the time signals x(t), y(t) and u(t), respectively. The map from inputs U to outputs Y is then described by Y(s) = G(s) U(s) with the TFM

  G(s) = C (sE - A)^{-1} B + D,  s ∈ C.   (2)

The aim of model order reduction is to find an LTI system

  Ẽ x̃'(t) = Ã x̃(t) + B̃ u(t),  ỹ(t) = C̃ x̃(t) + D̃ u(t)   (3)

of reduced order r ≪ n such that the corresponding TFM

  G̃(s) = C̃ (sẼ - Ã)^{-1} B̃ + D̃   (4)

approximates the original TFM (2). That is, using the same input u(t) in (1) and (3), we want that the output ỹ(t) of the reduced-order model (ROM) (3) approximates the output y(t) of (1) well enough for the application considered (e.g., controller design). In general, one requires ||y(t) - ỹ(t)|| ≤ ε for all feasible inputs u(t), for (almost) all t in the time domain of interest, and for a suitable norm || · ||. In control theory one often employs the L2- or L∞-norms on R or [0, ∞), respectively, to measure time signals or their Laplace transforms. In the situations considered here, the L2-norms employed in frequency and time domain coincide due to the Paley-Wiener theorem (or Parseval's equation or the Plancherel theorem, respectively); see Antoulas (2005) and Zhou et al. (1996) for details. As Y(s) - Ỹ(s) = (G(s) - G̃(s)) U(s), one can therefore consider the approximation error of the TFM ||G(·) - G̃(·)|| measured in an induced norm instead of the error in the output ||y(·) - ỹ(·)||.

Depending on the choice of the norm, different MOR goals can be formulated. Typical choices are (see, e.g., Antoulas (2005) for a more thorough discussion)
• ||G(·) - G̃(·)||_{H∞}, where

  ||F(·)||_{H∞} = sup_{s ∈ C+} σ_max(F(s)).

Here, σ_max is the largest singular value of the matrix F(s). This minimizes the maximal magnitude of the frequency response of the error system and by the Paley-Wiener theorem bounds the L2-norm of the output error.
• ||G(·) - G̃(·)||_{H2}, where (with i = √(-1))

  ||F(·)||²_{H2} = (1/(2π)) ∫_{-∞}^{+∞} tr( F(iω)* F(iω) ) dω.

This ensures a small error ||y(·) - ỹ(·)||_{L∞(0,∞)} = sup_{t>0} ||y(t) - ỹ(t)||_∞ (with || · ||_∞ denoting the maximum norm of a vector).
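For the H2-norm there is a convenient state-space characterization: with E = I and D = 0, ||G||²_{H2} = tr(C P C^T), where P solves the controllability Lyapunov equation A P + P A^T + B B^T = 0. A small sketch (NumPy/SciPy; the function name is our own):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def h2_norm(A, B, C):
    """H2 norm of G(s) = C (sI - A)^{-1} B for stable A (E = I, D = 0)."""
    P = solve_continuous_lyapunov(A, -B @ B.T)  # solves A P + P A^T = -B B^T
    return float(np.sqrt(np.trace(C @ P @ C.T)))
```

For the scalar example G(s) = 1/(s + 1) this returns √(1/2) ≈ 0.7071, matching the integral definition above.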
The bound holds uniformly over all inputs u(t) having bounded L2-energy, that is, ∫_0^∞ u(t)^T u(t) dt ≤ 1; see Gugercin et al. (2008).

Besides a small approximation error, one may impose additional constraints for the ROM. One might require certain properties (such as stability and passivity) of the original systems to be preserved. Rather than considering the full nonnegative real line in time domain or the full imaginary axis in frequency domain, one can also consider bounded intervals in both domains. For these variants, see, e.g., Antoulas (2005) and Obinata and Anderson (2001).

Methods

There are a number of different methods to construct ROMs, see, e.g., Antoulas (2005), Benner et al. (2005), Obinata and Anderson (2001), and Schilders et al. (2008). Here we concentrate on projection-based methods which restrict the full state x(t) to an r-dimensional subspace by choosing x̃(t) = W* x(t), where W is an n × r matrix. Here the conjugate transpose of a complex-valued matrix Z is denoted by Z*, while the transpose of a matrix Y will be denoted by Y^T. Choosing V ∈ C^{n×r} such that W* V = I ∈ R^{r×r} yields an n × n projection matrix Π = V W* which projects onto the r-dimensional subspace spanned by the columns of V along the kernel of W*. Applying this projection to (1), one obtains the reduced-order LTI system (3) with

  Ẽ = W* E V,  Ã = W* A V,  B̃ = W* B,  C̃ = C V   (5)

and an unchanged D̃ = D. If V = W, Π is an orthogonal projector and is called a Galerkin projection. If V ≠ W, Π is an oblique projector, sometimes called a Petrov-Galerkin projection.

In the following, we will briefly discuss the main classes of methods to construct suitable matrices V and W: truncation-based methods and interpolation-based methods. Other methods, in particular combinations of the two classes discussed here, can be found in the literature. In case the original LTI system is real, it is often desirable to construct a real reduced-order model. All of the methods discussed in the following either do construct a real reduced-order system or there is a variant of the method which does. In order to keep this exposition at a reasonable length, the reader is referred to the cited literature.

Truncation Based Methods

The general idea of truncation is most easily explained by modal truncation: For simplicity, assume that E = I and that A is diagonalizable, T^{-1} A T = D_A = diag(λ_1, ..., λ_n). Further we assume that the eigenvalues λ_ℓ ∈ C of A can be ordered such that

  Re(λ_n) ≤ Re(λ_{n-1}) ≤ ... ≤ Re(λ_1) < 0,   (6)

(i.e., all eigenvalues lie in the open left half complex plane). This implies that the system is stable. Let V be the n × r matrix consisting of the first r columns of T and let W* be the first r rows of T^{-1}, so that W* V = I. Applying the transformation T to the LTI system (1) yields

  (T^{-1} x(t))' = (T^{-1} A T) T^{-1} x(t) + (T^{-1} B) u(t)   (7)
  y(t) = (C T) T^{-1} x(t) + D u(t)   (8)

with

  T^{-1} A T = [ W* A V  0; 0  A_2 ],  T^{-1} B = [ W* B; B_2 ],

and C T = [C V  C_2], where W* A V = diag(λ_1, ..., λ_r) and A_2 = diag(λ_{r+1}, ..., λ_n). Preserving the r dominant poles (eigenvalues with largest real part) by truncating the rest (i.e., discarding A_2, B_2, and C_2 from (7)) yields the ROM as in (5). It can be shown that the error bound

  ||G(·) - G̃(·)||_{H∞} ≤ (1 / |Re(λ_{r+1})|) ||C_2|| ||B_2||

holds (Benner 2006). As eigenvalues contain only limited information about the system, this is not necessarily a meaningful reduced-order system. In particular, the dependence of the input–output relation on B and C is completely ignored.
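The modal truncation just described amounts to an eigendecomposition, an ordering of the eigenvalues by real part, and a projection; a sketch (NumPy; assumes E = I, A diagonalizable and stable, with our own function name):

```python
import numpy as np

def modal_truncation(A, B, C, r):
    """Keep the r eigenvalues of A with largest real part (dominant poles).
    For a real ROM, complex-conjugate eigenvalue pairs must be kept together."""
    lam, T = np.linalg.eig(A)
    idx = np.argsort(lam.real)[::-1]        # sort by descending real part
    lam, T = lam[idx], T[:, idx]
    Tinv = np.linalg.inv(T)
    V = T[:, :r]                            # first r columns of T
    Wstar = Tinv[:r, :]                     # first r rows of T^{-1}, W* V = I
    return np.diag(lam[:r]), Wstar @ B, C @ V
```

Note that B and C enter only through the projection, not the choice of modes, which is exactly the weakness pointed out above.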
This can be enhanced by more refined dominance measures taking B and C into account; see, e.g., Varga (1995) and Benner et al. (2011).

More suitable reduced-order systems can be obtained by balanced truncation. To introduce this concept, we no longer need to assume A to be diagonalizable, but we require the stability of A in the sense of (6). For simplicity, we assume E = I. For treatment of the DAE case (E ≠ I), see Benner et al. (2005, Chap. 3).

Loosely speaking, a balanced representation of an LTI system is obtained by a change of coordinates such that the states which are hard to reach are at the same time those which are difficult to observe. This change of coordinates amounts to an equivalence transformation of the realization (A, B, C, D) of (1) called state-space transformation as in (7), where T now is the matrix representing the change of coordinates. The new system matrices (T^{-1} A T, T^{-1} B, C T, D) form a balanced realization of (1). Truncating in this balanced realization the states that are hard to reach and difficult to observe results in a ROM.

Consider the Lyapunov equations

  A P + P A^T + B B^T = 0,  A^T Q + Q A + C^T C = 0.   (9)

The solution matrices P and Q are called controllability and observability Gramians, respectively. If both Gramians are positive definite, the LTI system is minimal. This will be assumed from here on in this section.

In balanced coordinates the Gramians P and Q of a stable minimal LTI system satisfy P = Q = diag(σ_1, ..., σ_n) with the Hankel singular values σ_1 ≥ σ_2 ≥ ... ≥ σ_n > 0. The Hankel singular values are the positive square roots of the eigenvalues of the product of the Gramians PQ, σ_k = √(λ_k(PQ)). They are system invariants, i.e., they are independent of the chosen realization of (1) as they are preserved under state-space transformations.

Given the LTI system (1) in a non-balanced coordinate form and the Gramians P and Q satisfying (9), the transformation matrix T which yields an LTI system in balanced coordinates can be computed via the so-called square root algorithm as follows:
• Compute the Cholesky factors S and R of the Gramians such that P = S^T S, Q = R^T R.
• Compute the singular value decomposition of S R^T = Φ Σ Ψ^T, where Φ and Ψ are orthogonal matrices and Σ is a diagonal matrix with the Hankel singular values on its diagonal. T = S^T Φ Σ^{-1/2} yields the balancing transformation (note that T^{-1} = Σ^{1/2} Φ^T S^{-T} = Σ^{-1/2} Ψ^T R).
• Partition Φ, Σ, Ψ into blocks of corresponding sizes,

  Σ = [ Σ_1  0; 0  Σ_2 ],  Φ = [ Φ_1  Φ_2 ],  Ψ^T = [ Ψ_1^T; Ψ_2^T ],

with Σ_1 = diag(σ_1, ..., σ_r), and apply T to (1) to obtain (7) with

  T^{-1} A T = [ W^T A V  A_12; A_21  A_22 ],  T^{-1} B = [ W^T B; B_2 ],   (10)

and C T = [C V  C_2] for W = R^T Ψ_1 Σ_1^{-1/2} and V = S^T Φ_1 Σ_1^{-1/2}. Preserving the r dominant Hankel singular values by truncating the rest yields the reduced-order model as in (5).

As W^T V = I, balanced truncation is an oblique projection method. The reduced-order model is stable with the Hankel singular values σ_1, ..., σ_r. It can be shown that if σ_r > σ_{r+1}, the error bound

  ||G(·) - G̃(·)||_{H∞} ≤ 2 Σ_{k=r+1}^{n} σ_k   (11)

holds. Given an error tolerance, this allows one to choose the appropriate order r of the reduced system in the course of the computations.

As the explicit computation of the balancing transformation T is numerically hazardous, one usually uses the equivalent balancing-free square root algorithm (Varga 1991) in which orthogonal bases for the column spaces of V and W are computed. The so obtained ROM is no longer balanced, but preserves all other properties (error bound, stability). Furthermore, it is shown in Benner et al. (2000) how to implement the balancing-free square root algorithm using low-rank approximations to S and R.
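The square root algorithm above translates almost line by line into code; a dense-matrix sketch (NumPy/SciPy, assuming E = I, A stable, and a minimal realization; the function name is ours):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, cholesky, svd

def balanced_truncation(A, B, C, r):
    """Square-root balanced truncation to order r."""
    P = solve_continuous_lyapunov(A, -B @ B.T)    # controllability Gramian
    Q = solve_continuous_lyapunov(A.T, -C.T @ C)  # observability Gramian
    S = cholesky((P + P.T) / 2)   # upper triangular, P = S^T S
    R = cholesky((Q + Q.T) / 2)   # upper triangular, Q = R^T R
    Phi, sig, PsiT = svd(S @ R.T)  # sig holds the Hankel singular values
    d = 1.0 / np.sqrt(sig[:r])
    W = R.T @ PsiT[:r, :].T * d    # W = R^T Psi_1 Sigma_1^{-1/2}
    V = S.T @ Phi[:, :r] * d       # V = S^T Phi_1 Sigma_1^{-1/2}
    return W.T @ A @ V, W.T @ B, C @ V, sig
```

Since W^T V = I by construction, the reduced system is itself balanced with Gramians Σ_1 = diag(σ_1, ..., σ_r), which gives an easy consistency check on an implementation.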
This avoids ever having to resort to the square solution matrices P and Q of the Lyapunov equations (9) and yields an efficient algorithm for balanced truncation for LTI systems with large dense matrices. For systems with large-scale sparse A, efficient algorithms based on sparse solvers for (9) exist; see Benner (2006).

By replacing the solution matrices P and Q of (9) by other pairs of positive (semi-)definite matrices characterizing alternative controllability and observability related system information, one obtains a family of model reduction methods including stochastic/bounded-real/positive-real balanced truncation. These can be used if further properties like minimum phase, passivity, etc. are to be preserved in the reduced-order model; for further details, see Antoulas (2005) and Obinata and Anderson (2001).

The balanced truncation yields good approximation at high frequencies as G̃(iω) → G(iω) for ω → ∞ (as D̃ = D), while the maximum error is often attained for ω = 0. For a perfect match at zero and a good approximation for low frequencies, one may employ the singular perturbation approximation (SPA, also called balanced residualization). In view of (7) and (10), balanced truncation can be seen as partitioning T^{-1} x according to (10) into [x_1^T, x_2^T]^T and setting x_2 ≡ 0 (i.e., x_2' = 0 as well). For SPA, one only sets x_2' = 0, such that

  x_1' = W^T A V x_1 + A_12 x_2 + W^T B u,
  0 = A_21 x_1 + A_22 x_2 + B_2 u.

Solving the second equation for x_2 and inserting it into the first equation yields

  x_1' = (W^T A V - A_12 A_22^{-1} A_21) x_1 + (W^T B - A_12 A_22^{-1} B_2) u.

For the output equation, it follows

  ỹ = (C V - C_2 A_22^{-1} A_21) x_1 + (D - C_2 A_22^{-1} B_2) u.

This reduced-order model makes use of the information in the matrices A_12, A_21, A_22, B_2, and C_2 discarded by balanced truncation. It fulfills G̃(0) = G(0) and the error bound (11); moreover, it preserves stability.

Besides SPA, another related truncation method that is not based on projection is optimal Hankel norm approximation (HNA). The description of HNA is technically quite involved; for details, see Zhou et al. (1996) and Glover (1984). It should be noted that the so obtained ROM usually has similar stability and accuracy properties as for balanced truncation.

Interpolation-Based Methods

Another family of methods for MOR is based on (rational) interpolation. The unifying feature of the methods in this family is that the original TFM (2) is approximated by a rational matrix function of lower degree satisfying some interpolation conditions (i.e., the original and the reduced-order TFM coincide, e.g., G(s_0) = G̃(s_0), at some predefined value s_0 for which A - s_0 E is nonsingular). Computationally this is usually realized by certain Krylov subspace methods.

The classical approach is known under the name of moment-matching or Padé(-type) approximation. In these methods, the transfer functions of the original and the reduced-order systems are expanded into power series, and the reduced-order system is then determined so that the first coefficients in the series expansions match. In this context, the coefficients of the power series are called moments, which explains the term moment matching.

Classically the expansion of the TFM (2) in a power series about an expansion point s_0

  G(s) = Σ_{j=0}^{∞} M_j(s_0) (s - s_0)^j   (12)

is used. The moments M_j(s_0), j = 0, 1, 2, ..., are given by

  M_j(s_0) = -C [(A - s_0 E)^{-1} E]^j (A - s_0 E)^{-1} B.

Consider the (block) Krylov subspace K_k(F, H) = span{H, F H, F² H, ..., F^{k-1} H} for F = (A - s_0 E)^{-1} E and H = (A - s_0 E)^{-1} B.
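For a single input, an orthonormal basis of this Krylov subspace can be generated with an Arnoldi-type loop that reuses one LU factorization of A - s_0 E and never forms the moments explicitly; a sketch of the resulting one-sided (Galerkin) reduction for real s_0 (NumPy/SciPy, function name ours):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def krylov_rom(E, A, B, C, s0, k):
    """Basis of K_k((A - s0 E)^{-1} E, (A - s0 E)^{-1} B), then V^T-projection."""
    lu = lu_factor(A - s0 * E)          # factor once, solve repeatedly
    v = lu_solve(lu, B).ravel()
    V = [v / np.linalg.norm(v)]
    for _ in range(k - 1):
        w = lu_solve(lu, E @ V[-1])     # next Krylov direction F v
        for vi in V:                    # modified Gram-Schmidt
            w = w - (vi @ w) * vi
        V.append(w / np.linalg.norm(w))
    V = np.column_stack(V)
    # Galerkin projection with W = V
    return V.T @ E @ V, V.T @ A @ V, V.T @ B, C @ V
```

By the moment-matching property discussed next, the reduced TFM reproduces G exactly at s_0 (and matches the first k moments there), which can be checked numerically.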
Here s_0 is an appropriately chosen expansion point which may be real or complex. From the definitions of A, B, and E, it follows that F ∈ K^{n×n} and H ∈ K^{n×m}, where K = R or K = C depending on whether s_0 is chosen in R or in C. Considering K_k(F, H) column by column, this leads to the observation that the number of column vectors in {H, F H, F² H, ..., F^{k-1} H} is given by r = m · k, as there are k blocks F^j H ∈ K^{n×m}, j = 0, ..., k - 1. In the case when all r column vectors are linearly independent, the dimension of the Krylov subspace K_k(F, H) is m · k. Assume that a unitary basis for this block Krylov subspace is generated such that the column space of the resulting unitary matrix V ∈ C^{n×r} spans K_k(F, H). Applying the Galerkin projection Π = V V* to (1) yields a reduced system whose TFM satisfies the following (Hermite) interpolation conditions at s_0:

  G̃^(j)(s_0) = G^(j)(s_0),  j = 0, 1, ..., k - 1.

That is, the first k - 1 derivatives of G and G̃ coincide at s_0. Considering the power series expansion (12) of the original and the reduced-order TFM, this is equivalent to saying that at least the first k moments M̃_j(s_0) of the transfer function G̃(s) of the reduced system (3) are equal to the first k moments M_j(s_0) of the TFM G(s) of the original system (1) at the expansion point s_0:

  M_j(s_0) = M̃_j(s_0),  j = 0, 1, ..., k - 1.

If further the r columns of the unitary matrix W span the block Krylov subspace K_k(F, H) for F = (A - s_0 E)^{-T} E^T and H = (A - s_0 E)^{-T} C^T, applying the Petrov-Galerkin projection Π = V (W* V)^{-1} W* to (1) yields a reduced system whose TFM matches at least the first 2k moments of the TFM of the original system.

Theoretically, the matrix V (and W) can be computed by explicitly forming the columns which span the corresponding Krylov subspace K_k(F, H) and using the Gram-Schmidt algorithm to generate unitary basis vectors for K_k(F, H). The forming of the moments (the Krylov subspace blocks F^j H) is numerically precarious and has to be avoided under all circumstances. Using Krylov subspace methods to achieve an interpolation-based ROM as described above is recommended. The unitary basis of a (block) Krylov subspace can be computed by employing a (block) Arnoldi or (block) Lanczos method; see, e.g., Antoulas (2005), Golub and Van Loan (2013), and Freund (2003).

In the case when an oblique projection is to be used, it is not necessary to compute two unitary bases as above. An alternative is then to use the nonsymmetric Lanczos process (Golub and Van Loan 2013). It computes bi-unitary (i.e., W* V = I_r) bases for the above mentioned Krylov subspaces and the reduced-order model as a by-product of the Lanczos process. An overview of the computational techniques for moment-matching and Padé approximation summarizing the work of a decade is given in Freund (2003) and the references therein.

In general, the discussed MOR approaches are instances of rational interpolation. When the expansion point is chosen to be s_0 = ∞, the moments are called Markov parameters and the approximation problem is known as partial realization. If s_0 = 0, the approximation problem is known as Padé approximation.

As the use of one single expansion point s_0 leads to good approximation only close to s_0, it might be desirable to use more than one expansion point. This leads to multipoint moment-matching methods, also called rational Krylov methods; see, e.g., Ruhe and Skoogh (1998), Antoulas (2005), and Freund (2003).

In contrast to balanced truncation, these (rational) interpolation methods do not necessarily preserve stability. Remedies have been suggested; see, e.g., Freund (2003).

The use of complex-valued expansion points will lead to a complex-valued reduced-order system (3). In some applications (in particular, in case the original system is real valued), this is undesired. In that case one can always use complex-conjugate pairs of expansion points as then the entire computations can be done in real arithmetic.

The methods just described provide good approximation quality only locally around the chosen expansion points.
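One common way to realize the complex-conjugate trick in real arithmetic is to split the complex Krylov vector into real and imaginary parts: span{Re(v), Im(v)} equals the real part of span{v, conj(v)}, so a real Galerkin basis still interpolates G at both s_0 and its conjugate. A sketch under these assumptions (NumPy, single input, one vector per conjugate pair; function name ours):

```python
import numpy as np

def real_rom_conjugate_pair(E, A, B, C, s0):
    """Real ROM from the complex expansion point s0 (and implicitly conj(s0))."""
    v = np.linalg.solve(A - s0 * E, B).ravel()            # complex vector
    V, _ = np.linalg.qr(np.column_stack([v.real, v.imag]))  # real orthonormal basis
    return V.T @ E @ V, V.T @ A @ V, V.T @ B, C @ V
```

The reduced matrices are real, yet the reduced TFM matches the original at the complex point s_0, which makes this a convenient building block for rational Krylov methods with conjugate point sets.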
They do not aim at a global approximation as measured by the H2- or H∞-norms. In Gugercin et al. (2008), an iterative procedure is presented which determines locally optimal expansion points w.r.t. the H2-norm approximation under the assumption that the order r of the reduced model is prescribed and only 0th- and 1st-order derivatives are matched. Also, for multi-input multi-output systems (i.e., m and p in (1) are both larger than one), no full moment matching is achieved, but only tangential interpolation:

  G(s_j) b_j = G̃(s_j) b_j,  c_j* G(s_j) = c_j* G̃(s_j),  c_j* G'(s_j) b_j = c_j* G̃'(s_j) b_j,

for certain vectors b_j, c_j determined together with the optimal s_j by the iterative procedure.

Tools

Almost all commercial software packages for structural dynamics include modal analysis/truncation as a means to compute a ROM. Modal truncation and balanced truncation are available in the MATLAB Control System Toolbox and the MATLAB Robust Control Toolbox.

Numerically reliable, well-tested, and efficient implementations of many variants of balancing-based MOR methods as well as Hankel norm approximation and singular perturbation approximation can be found in the Subroutine Library In Control Theory (SLICOT, http://www.slicot.org) (Varga 2001). Easy-to-use MATLAB interfaces to the Fortran 77 subroutines from SLICOT are available in the SLICOT Model and Controller Reduction Toolbox (http://slicot.org/matlab-toolboxes/basic-control); see Benner et al. (2010). An implementation of moment matching via the (block) Arnoldi method is available in MOR for ANSYS (http://modelreduction.com/Software.html).

There exist benchmark collections containing mainly LTI systems from various applications. There one can find systems in computer-readable format which can easily be used to test new algorithms and software:
• Oberwolfach Model Reduction Benchmark Collection
  http://simulation.uni-freiburg.de/downloads/benchmark/
• NICONET Benchmark Examples
  http://www.icm.tu-bs.de/NICONET/benchmodred.html
The MOR WiKi http://morwiki.mpi-magdeburg.mpg.de/morwiki/ is a platform for MOR research and provides discussions of a number of methods, links to further software packages (e.g., MOREMBS and MORPACK), as well as additional benchmark examples.

Summary and Future Directions

MOR of LTI systems can now be considered as an established computational technique. Some open issues still remain and are currently investigated. These include methods yielding good approximation in finite frequency or time intervals. Though numerous approaches for these tasks exist, methods with sharp local error bounds are still desirable. A related problem is the reduction of closed-loop systems and controller reduction. Also, the generalization of the methods discussed in this essay to descriptor systems (i.e., systems with DAE dynamics), second-order systems, or unstable LTI systems has only been partially achieved. An important problem class getting a lot of current attention consists of (uncertain) parametric systems. Here it is important to preserve parameters as symbolic quantities in the ROM. Most of the current approaches are based in one way or another on interpolation. MOR for nonlinear systems has also been a research topic for decades. Still, the development of satisfactory methods in the context of control design having computable error bounds and preserving interesting system properties remains a challenging task.

Cross-References

► Basic Numerical Methods and Software for Computer Aided Control Systems Design
► Multi-domain Modeling and Simulation
Bibliography

Antoulas A (2005) Approximation of large-scale dynamical systems. SIAM, Philadelphia
Benner P (2006) Numerical linear algebra for model reduction in control and simulation. GAMM Mitt 29(2):275–296
Benner P, Quintana-Ortí E, Quintana-Ortí G (2000) Balanced truncation model reduction of large-scale dense systems on parallel computers. Math Comput Model Dyn Syst 6:383–405
Benner P, Mehrmann V, Sorensen D (2005) Dimension reduction of large-scale systems. Lecture Notes in Computational Science and Engineering, vol 45. Springer, Berlin/Heidelberg
Benner P, Kressner D, Sima V, Varga A (2010) Die SLICOT-Toolboxen für Matlab (The SLICOT Toolboxes for Matlab) [German]. at-Automatisierungstechnik 58(1):15–25. English version available as SLICOT working note 2009-1, 2009, http://slicot.org/working-notes/
Benner P, Hochstenbach M, Kürschner P (2011) Model order reduction of large-scale dynamical systems with Jacobi-Davidson style eigensolvers. In: Proceedings of the International Conference on Communications, Computing and Control Applications (CCCA), March 3–5, 2011, Hammamet, Tunisia. IEEE (6 pages)
Freund R (2003) Model reduction methods based on Krylov subspaces. Acta Numer 12:267–319
Glover K (1984) All optimal Hankel-norm approximations of linear multivariable systems and their L∞-error bounds. Internat J Control 39:1115–1193
Golub G, Van Loan C (2013) Matrix computations, 4th edn. Johns Hopkins University Press, Baltimore
Gugercin S, Antoulas AC, Beattie C (2008) H2 model reduction for large-scale dynamical systems. SIAM J Matrix Anal Appl 30(2):609–638
Obinata G, Anderson B (2001) Model reduction for control system design. Communications and Control Engineering Series. Springer, London
Ruhe A, Skoogh D (1998) Rational Krylov algorithms for eigenvalue computation and model reduction. In: Applied Parallel Computing. Large Scale Scientific and Industrial Problems. Lecture Notes in Computer Science, vol 1541. Springer, Berlin/Heidelberg, pp 491–502
Schilders W, van der Vorst H, Rommes J (2008) Model order reduction: theory, research aspects and applications. Springer, Berlin/Heidelberg
Varga A (1991) Balancing-free square-root algorithm for computing singular perturbation approximations. In: Proceedings of the 30th IEEE CDC, Brighton, pp 1062–1065
Varga A (1995) Enhanced modal approach for model reduction. Math Model Syst 1(2):91–105
Varga A (2001) Model reduction software in the SLICOT library. In: Datta B (ed) Applied and computational control, signals, and circuits. The Kluwer International Series in Engineering and Computer Science, vol 629. Kluwer Academic, Boston, pp 239–282
Zhou K, Doyle J, Glover K (1996) Robust and optimal control. Prentice Hall, Upper Saddle River, NJ

Model Reference Adaptive Control

Jing Sun
University of Michigan, Ann Arbor, MI, USA

Synonyms

MRAC

Abstract

The fundamentals and design principles of model reference adaptive control (MRAC) are described. The controller structure and adaptive algorithms are delineated. Stability and convergence properties are summarized.

Keywords

Certainty equivalence; Lyapunov-SPR design; MIT rule

Introduction

Model reference adaptive control (MRAC) is an important adaptive control approach, supported by rigorous mathematical analysis and effective design toolsets. It is made up of a feedback control law that contains a controller C(s, θ_c) and an adjustment mechanism that generates the controller parameter updates θ_c(t) online. While different MRAC configurations can be found in the literature, the structure shown in Fig. 1 is commonly used and includes all the basic components of an MRAC system. The prominent features of MRAC are that it incorporates a reference model which represents the desired input–output behavior.
Model Reference Adaptive Control, Fig. 1 Schematic of MRAC

The controller and adaptation law are designed to force the response of the plant, y_p, to track that of the reference model, y_m, for any given reference input r.

Different approaches have been used to design MRAC, and each may lead to a different implementation scheme. The implementation schemes fall into two categories: direct and indirect MRAC. The former updates the controller parameters θ_c directly using an adaptive law, while the latter updates the plant parameters θ_p first using an estimation algorithm and then updates θ_c by solving, at each time t, certain algebraic equations that relate θ_c with the online estimates of θ_p. In both direct and indirect MRAC schemes, the controller structure is kept the same as that which would be used in the case that the plant parameters are known.

MRC Controller Structure

Consider the design objective of model reference control (MRC) for linear time-invariant systems: Given a reference model M(s), find a control law such that the closed-loop system is stable and y_p → y_m as t → ∞ for any bounded reference signal r.

For the case of known plant parameters, the MRC objective can be achieved by designing the controller so that the closed-loop system has a transfer function equal to M(s). This is the so-called model matching condition. To assure the existence of a causal controller that meets the model matching condition and guarantees internal stability of the closed-loop system, the following assumptions are essential:
• A1. The plant has a stable inverse, and the reference model is chosen to be stable.
• A2. The relative degree of M(s) is equal to or greater than that of the plant G_p(s). Herein, the relative degree of a transfer function refers to the difference between the orders of the denominator and numerator polynomials.
It should be noted that these assumptions are imposed on the MRC problem so that there is enough structural flexibility in the plant and in the reference model to meet the control objectives. A1 is necessary for maintaining internal stability of the system while meeting the model matching condition, and A2 is needed to ensure the causality of the controller. Both assumptions are essential for non-adaptive applications when the plant parameters are known, let alone for the adaptive cases when the plant parameters are unknown.

The reference model plays an important role in MRAC, as it will define the feasibility of MRAC design as well as the performance of the resulting closed-loop MRAC system. The reference model should reflect the desired closed-loop performance. Namely, any time-domain or frequency-domain specifications, such as time constant, damping ratio, natural frequency, bandwidth, etc., should be properly reflected in the chosen transfer function M(s).

The controller structure for the MRAC is derived with these assumptions for the known plant case and extended to the adaptive case by combining it with a proper adaptive law. Under assumptions A1–A2, there exist infinitely many control solutions C that will achieve the MRC design objective for a given plant transfer function G_p(s). Nonetheless, only those extendable to MRAC with the simplest structure are of interest. It is known that if a special controller structure with the following parametrization is imposed, then the solution to the model matching condition will be unique.
Model Reference Adaptive Control 731
condition, in terms of the ideal parameter θ_c*, will be unique:

u_p = θ₁^T ω₁ + θ₂^T ω₂ + θ₃ y_p + c₀ r = θ_c^T ω

where θ_c ∈ R^{2n} and n is the order of the plant,

θ_c = [θ₁^T, θ₂^T, θ₃, c₀]^T,  ω = [ω₁^T, ω₂^T, y_p, r]^T

and ω₁, ω₂ ∈ R^{n−1} are signals internal to the controller generated by stable filters (Ioannou and Sun 1996).
This MRC control structure is particularly appealing for adaptive control development, as the parameters appear linearly in the control law expression, leading to a convenient linear parametric model for adaptive algorithm development.

Adaptation Algorithm

Design of adaptive algorithms for parameter updating can be pursued in several different approaches, thereby resulting in different MRAC schemes. Three direct design approaches, namely, the Lyapunov-SPR, the certainty equivalence, and the MIT rule, will be briefly described together with indirect MRAC.

Lyapunov-SPR Design
One popular MRAC algorithm is derived using Lyapunov's direct method and the Meyer-Kalman-Yakubovich (MKY) Lemma based on the strictly positive real (SPR) argument. The concept of SPR transfer functions originates from network theory and is related to the driving point impedance of dissipative networks. The MKY Lemma states that given a stable transfer function M(s) and its realization (A, B, C, d), where d ≥ 0 and all eigenvalues of the matrix A are in the open left half plane: If M(s) is SPR, then for any given positive definite matrix L = L^T > 0, there exists a scalar ν > 0, a vector q, and a P = P^T > 0 such that

A^T P + P A = −q q^T − ν L
P B − C = ± q √(2d)

By choosing M(s) to be SPR, one can formulate a Lyapunov function consisting of the state tracking and parameter estimation errors and use the MKY Lemma to define the adaptive law that will force the derivative of the Lyapunov function to be negative semidefinite. The resulting adaptive law has the following simple form:

θ̇_c = −Γ e₁ ω sign(c₀*)

where e₁ = y_p − y_m is simply the tracking error and c₀* = k_m/k_p, with k_m, k_p being the high frequency gains of the transfer functions of the reference model M(s) and the plant G_p(s), respectively. This algorithm, however, applies only to systems with relative degree equal to 0 or 1, which is implied by the SPR condition imposed on M(s) and assumption A2.
The Lyapunov-SPR-based MRAC design is mathematically elegant in its stability analysis but is restricted to a special class of systems. While it can be extended to more general cases with relative degrees equal to 2 and 3, the resulting control law and adaptive algorithm become much more complicated and cumbersome, as efforts must be made to augment the control signal in such a way that the MKY Lemma is applicable to the "reformulated" reference model.

Certainty Equivalence Design
For more general cases with a high relative degree, another design approach based on the "certainty equivalence" (CE) principle is preferred, due to the simplicity of its design as well as its robustness properties in the presence of modeling errors. This approach treats the design of the adaptive law as a parameter estimation problem, with the estimated parameters being the controller parameter vector θ_c*. Using the specific linear formulation of the control law and assuming that θ_c* satisfies the model matching condition, one can show that the ideal controller parameter satisfies the following parametric equation:
z = θ_c*^T ω_p

with

z = M(s) u_p,  ω_p = [M(s)ω₁^T, M(s)ω₂^T, M(s)y_p, y_p]^T

This parametric model allows one to derive adaptive laws to estimate the unknown controller parameter θ_c* using standard parameter identification techniques, such as the gradient and least squares algorithms. The corresponding MRAC is then implemented in the CE sense, where the unknown parameters are replaced by their estimated values. It should be noted that a CE design does not guarantee closed-loop stability of the resulting adaptive system, and additional analysis has to be carried out to establish closed-loop stability.

MIT Rule
Besides the Lyapunov-SPR and CE approaches mentioned earlier, the direct MRAC problem can also be approached using the so-called MIT rule, an early form of MRAC developed in the 1950s–1960s in the Instrumentation Laboratory at MIT for flight control. The designer defines a cost function, e.g., a quadratic function of the tracking error, and then adjusts the parameters in the direction of steepest descent. The negative gradient of the cost function is usually calculated through the sensitivity derivative approach. The formulation is quite flexible, as different forms of the MIT rule can be derived by changing the cost function, following the same procedure and reusing the same sensitivity functions. Despite its effectiveness in some practical applications, MRAC systems designed with the MIT rule have had stability and robustness issues.

Indirect MRAC
While most MRAC systems are designed as direct adaptive systems, indirect MRAC systems can also be developed which explicitly estimate the plant parameter θ_p* as an intermediate step. The adaptive law for an indirect MRAC includes two basic components: one for estimating the plant parameters and another for calculating the controller parameters based on the estimated plant parameters. This approach would be preferred if the plant transfer function is partially known, in which case the identification of the remaining unknown parameters represents a less complex problem. For example, if the plant has no zeros, the indirect scheme estimates n + 1 parameters, while the direct scheme has to estimate 2n parameters.
Indirect MRAC is a CE-based design. As such, the design is intuitive, but the design process does not guarantee closed-loop stability, and separate analysis has to be carried out to establish stability. Except for systems with a low number of zeros, the "feasibility" problem could also complicate the matter, in the sense that the MRC problem may not have a solution for the estimated plant at some time instants even though the solution exists for the real plant. This problem is unique to the indirect design, and several mitigating solutions have been found at the expense of more complicated adaptation or control algorithms.

Stability, Robustness, and Parameter Convergence

Stability for MRAC often refers to the properties that all signals are bounded and the tracking error converges to zero asymptotically. Robustness for adaptive systems implies that signal boundedness and tracking error convergence (to a small residual set) will be preserved in the presence of small perturbations such as disturbances, unmodeled dynamics, and time-varying parameters. For different MRAC schemes, different approaches are used to establish their properties.
For the Lyapunov-SPR-based MRAC systems, stability is established in the design process, where the adaptive law is derived to enforce a Lyapunov stability condition. For CE-based designs, establishing stability for the closed-loop MRAC system is a nontrivial exercise for both direct and indirect schemes. Using properly normalized adaptive laws for parameter
estimation, however, stability can be proved for direct and indirect MRAC schemes. For MRAC systems designed with the MIT rule, local stability can be established under more restrictive conditions, such as when the parameters are close to the ideal ones.
It should be noted that the adaptive control algorithm in its original form has been shown to have robustness issues, and extensive publications in the 1980s and 1990s were devoted to robust adaptive control in attempts to mitigate the problem. Many modifications have been proposed and shown to be effective in "robustifying" the MRAC; interested readers are referred to the article on Robust Adaptive Control for more details.
Parameter convergence is not an intrinsic requirement for MRAC, as tracking error convergence can be achieved without parameter convergence. It has been shown, however, that parameter convergence could enhance robustness, particularly for indirect schemes. As in the case of parameter identification, a persistent excitation (PE) condition needs to be imposed on the regression signal to assure parameter convergence in MRAC. In general, PE is accomplished by properly choosing the reference input r. It can be established for most MRAC approaches that parameter convergence is achieved if, in addition to the conditions required for stability, the reference input r is sufficiently rich of order 2n, ṙ is bounded, and there is no pole-zero cancelation in the plant transfer function. A signal is said to be sufficiently rich of order m if it contains at least m/2 distinct frequencies.

Summary and Future Directions

MRAC incorporates a reference model to capture the desired closed-loop responses and designs the control law and adaptation algorithm to force the output of the plant to follow the output of the reference model. Several different design approaches are available. Stability, robustness, and parameter convergence have been established for different MRAC designs under appropriate assumptions.
MRAC was a very active and fruitful research topic from the 1960s to the 1990s, and it formed important foundations for modern adaptive control theory. It also found many successful applications, ranging from chemical process control to automobile engine control. More recent efforts have been mostly devoted to integrating it with other design approaches to treat nonstandard MRAC problems for nonlinear and complex dynamic systems.

Cross-References

Adaptive Control of Linear Time-Invariant Systems
Adaptive Control, Overview
History of Adaptive Control
Robust Adaptive Control

Recommended Reading

MRAC has been well covered in several textbooks and research monographs. Astrom and Wittenmark (1994) presented different MRAC schemes in a tutorial fashion. Narendra and Annaswamy (1989) focused on the stability of deterministic MRAC systems. Ioannou and Sun (1996) covered the detailed derivation and analysis of different MRAC schemes and provided a unified treatment of their stability and robustness analysis. MRAC systems for discrete-time (Goodwin and Sin 1984) and nonlinear (Krstic et al. 1995) processes are also well explored.

Bibliography

Astrom KJ, Wittenmark B (1994) Adaptive control, 2nd edn. Prentice Hall, Englewood Cliffs
Goodwin GC, Sin KS (1984) Adaptive filtering, prediction and control. Prentice Hall, Englewood Cliffs
Ioannou PA, Sun J (1996) Robust adaptive control. Prentice Hall, Upper Saddle River
Krstic M, Kanellakopoulos I, Kokotovic PV (1995) Nonlinear and adaptive control design. Wiley, New York
Narendra KS, Annaswamy AM (1989) Stable adaptive systems. Prentice Hall, Englewood Cliffs
Model-Based Performance Optimizing Control

Sebastian Engell
Fakultät Bio- und Chemieingenieurwesen, Technische Universität Dortmund, Dortmund, Germany

Abstract

In many applications, e.g., in chemical process control, the purpose of control is to achieve an optimal performance of the controlled system despite the presence of significant uncertainties about its behavior and of external disturbances. Tracking of set points is often required for lower-level control loops, but at the system level in most cases, this is not the primary concern and may even be counterproductive. In this entry, the use of dynamic online optimization on a moving finite horizon to realize optimal system performance is discussed. By real-time optimization, a performance-oriented or economic cost criterion is minimized or maximized over a finite horizon, while the usual control specifications enter as constraints but not as set points. This approach integrates the computation of optimal set-point trajectories and of the regulation to these trajectories.

Keywords

Model-predictive control (MPC); Integrated optimization and control; Real-time optimization (RTO); Performance optimizing control; Process control

Introduction

From a systems point of view, the purpose of automatic feedback control (and that of manual control as well) in many cases is not primarily to keep the controlled variables at their set points as well as possible or to track dynamic set-point changes but to operate the system such that its performance is optimized in the presence of disturbances and uncertainties, exploiting the information gained in real time from the available measurements. This holds generally for the higher control layers in the process industries but similarly for many other applications. Suppose that, for example, the availability of cooling water at a lower temperature than assumed as a worst case during plant design enables plant operation at a higher throughput. In this case, what sense does it make to enforce the nominal operating point by tight feedback control? For a combustion engine, the goal is to achieve the desired torque with minimum consumption of fuel. For a cooling system, the goal is to keep the temperature of the goods or of a room within certain bounds with minimum consumption of energy, possibly weighted against the wear of the equipment. Regulating some variables to their set points may help to achieve these goals, but it is not the real performance target for the overall system. Feedback control loops therefore usually are part of control hierarchies that establish good performance of the overall system and the meeting of constraints on its operation.
There are four main approaches to the integration of feedback control with system performance optimization:
– Choice of the regulated variables such that, implicitly via the regulation of these variables to their set points, the performance of the overall system is close to optimal (see the chapter on Control Structure Selection).
– Tracking of necessary conditions of optimality, where the variables which determine the optimal operating policy are kept at or close to their constraints. This is a widespread approach especially in chemical batch processes where, e.g., the feeding of reactants is such that the maximum cooling power available is used (Finkler et al. 2014; see also the chapter on Control and Optimization of Batch Processes).
In these two approaches, the choice of the optimal set points or constraints to be tracked is done off-line, and they are then implemented by the feedback layer of the process control hierarchy (see the chapter on Control Hierarchy of Large Processing Plants: An Overview).
– Combination of a regulatory (tracking) feedback control with an optimization of the set points or system trajectories (called real-time optimization in the process industries) (see the chapter on Real-Time Optimization of Industrial Processes).
– Reformulation of model-predictive control such that the control target is not the tracking of references but the optimization of the system performance over a finite horizon, taking constraints on system variables or inputs into account directly within the online optimization. Here, the optimization is performed with a dynamic model, in contrast to the steady-state optimization in real-time optimization or in the choice of self-optimizing control structures.
The first three approaches are currently state of the art in the process industries. Tracking of necessary conditions of optimality is usually designed based on process insight rather than upon a rigorous analysis, and the same holds for the selection of regulatory control structures. The last one is the most challenging approach in terms of the required models, algorithms, and computing power, and its theoretical foundations are still under development. But on the other hand, it also has the highest potential in terms of the resulting performance of the controlled system, and it is structurally simple and easier to tune because the natural performance specification does not have to be translated into controller tunings, weights, etc. Therefore, the idea of direct model-based performance optimizing control has found much attention in process control in recent years.
The four approaches above are discussed in more detail below. We also provide some historical notes and outline some areas of continuing research.

Performance Optimization by Regulation to Fixed Set Points

Morari et al. (1980) stated that the objective in the synthesis of a control structure is "to translate the economic objectives into process control objectives." A subgoal in this "translation" is to select the regulatory control structure of a process such that steady-state optimality of process operations is realized to the maximum extent possible by driving the selected controlled variables to suitably chosen set points. A control structure with this property was termed "self-optimizing control" by Skogestad (2000). It should adjust the manipulated variables by keeping a function of the measured variables constant such that the process is operated at the economically optimal steady state in the presence of disturbances. From a system point of view, a control structure that yields nice transient responses and tight control of the selected variables may be of little use or even counterproductive if keeping the regulated variables at their set points does not improve the performance of the system. Ideally, in the steady state, a similar performance is obtained as would be realized by optimizing the stationary values of the operational degrees of freedom of the system for known disturbances d and a perfect model. By regulating the controlled variables to their set points at the steady state in the presence of disturbances, a mapping u = f(y_set, d) is implicitly realized, which should be an approximation of the performance optimizing inputs u_opt(d). The choice of the self-optimizing control structure takes only the steady-state performance into account, not the dynamic reaction of the controlled plant. An extension of the approach that also includes the dynamic behavior can be found in Pham and Engell (2011).

Tracking of Necessary Conditions of Optimality

Very often, the optimal operation of a system in a certain phase of its evolution or under certain conditions is defined by some variables being at their constraints. If these variables are known and the conditions can be monitored, a switching control structure can be built that keeps the (possibly changing) set of critical variables at their constraints despite inaccuracies of the model, external disturbances, etc. In fact, it turns out that such control schemes can, in the case of varying

parameters and in the presence of disturbances, As an approximation to real-time optimization


perform as good as sophisticated model-based with a nonlinear rigorous plant model, in many
optimization schemes (Finkler et al. 2013). MPC implementations nowadays, an optimiza-
tion of the steady-state values based on the linear
model that is used in the MPC controller is imple-
Performance Optimization mented. Then the gain matrix of the model must
by Steady-State Optimization be estimated carefully to obtain good results.
and Regulation

A well-established approach to create a link be- Performance Optimizing Control


tween regulatory control and the optimization of
the performance of a system is to compute the Model-predictive control has become the stan-
set points of the controllers by an optimization dard solution for demanding control problems in
layer. In process operations, this layer is called the process industries (Qin and Badgwell 2003)
real-time optimization (RTO) (see, e.g., Marlin and increasingly is used also in other domains.
and Hrymak (1997) and the references therein). The core idea is to employ a model to predict the
An RTO system is a model-based, upper-level effect of the future manipulated variables on the
control system that is operated in closed loop and future controlled variables over a finite horizon
provides set points to the lower-level control sys- and to use optimization to determine sequences
tems in order to maintain the process operation of inputs which minimize a cost function over the
as close as possible to the economic optimum. so-called prediction horizon. In the unconstrained
It usually comprises an estimation of the plant case with linear plant model and a quadratic
state and plant parameters from the measured cost function, the optimal control moves can
data and an economic or otherwise performance- be computed by a closed-form solution. When
related optimization of the operating point using constraints on inputs, outputs, and possibly also
a detailed nonlinear steady-state model. state variables are present, for a quadratic cost
As the RTO system employs a stationary pro- function and linear plant model, the optimization
cess model and the optimization is only per- problem becomes a quadratic program (QP) that
formed if the plant is approximately in a steady has to be solved in real time.
state, the time between successive RTO steps When the system dynamics are nonlinear
must be large enough for the plant to reach a new and linear models are only sufficiently accurate
steady state after the last commanded move. This within narrow operation bands, as is the case
structure is based upon a separation of concerns in many chemical processes, nonlinear model
and of time-scales between the RTO system and predictive control which is based on nonlinear
the process control system. The RTO system models of the process dynamics provides
optimizes the system economics on a medium superior performance and therefore has met
timescale (shifts to days), while the control sys- increasing interest both in theory and in practice.
tem provides tracking and disturbance rejection The classical formulation of nonlinear model-
on shorter timescales from seconds to hours. predictive tracking control (TC) is

N u/
min T C .y;
u

!
P
N P
P P
R P
M
T C D n;i yn;ref .k  i /  yNn .k C i // 2
C ˛l;j u2l .k C j/
nD1 i D1 lD1 j D1
s.t.

x(i+1) = f(x(i), z(i), u(i), i), i = k, …, k+P
0 = g(x(i), z(i), u(i), i), i = k, …, k+P
y(i+1) = h(x(i+1), u(i)), i = k, …, k+P
x_min ≤ x(i) ≤ x_max, i = k, …, k+P
y_min ≤ ȳ(i) ≤ y_max, i = k, …, k+P
u_min ≤ u(i) ≤ u_max, i = k, …, k+M
Δu_min ≤ Δu(i) ≤ Δu_max, i = k, …, k+M
u(i) = u(i−1) + Δu(i), i = k, …, k+M
u(i) = u(k+M), ∀ i > k+M.

Here f and g represent the plant model in the form of a system of differential-algebraic equations, and h is the output function. P is the length of the prediction horizon, M is the length of the control horizon, y₁, …, y_N are the predicted controlled outputs, and u₁, …, u_R are the control inputs. α and γ represent the weights on the control inputs and the controlled outputs, respectively. y_ref refers to the set point or the desired output trajectory, and ȳ(i) are the corrected model predictions. N is the number of controlled outputs, and R is the number of control inputs. Compensation for plant-model mismatch and unmeasured disturbances is usually done using the bias correction equations:

d(k) = y_meas(k) − y(k),
ȳ(k+i) = y(k+i) + d(k), i = k, …, k+P.

The idea of direct performance optimizing control (POC) is to replace this formulation by a performance-related objective function:

min_u Φ_POC(y, u)

Φ_POC = Σ_{l=1}^{R} Σ_{j=1}^{M} α_{l,j} Δu_l²(k+j) − Σ_{i=1}^{P} β_i Π(k+i)

Here Π(k+i) represents the value of the performance cost criterion at the time step k+i. The optimization of the future control moves is subject to the same constraints as before. In addition, instead of reference tracking, constraints are formulated for all outputs that are critical for the operation of the system or its performance, e.g., product quality specifications or limitations of the equipment. In contrast to reference tracking, these constraints usually are one-sided (inequalities) or define operation bands. By this formulation, e.g., the production revenues can be maximized online over a finite horizon, considering constraints on product purities and waste stream impurities. Feedback enters into the computation by the initialization of the model with a new initial state that is estimated from the available measurements of the system variables and by the bias correction. Thus, direct performance optimizing control realizes an online optimization of all operational degrees of freedom in a feedback structure without tracking of a priori fixed set points or reference trajectories. The regularization term that penalizes control moves is added to the purely economic objective function to obtain smoother solutions.
This approach has several advantages over a combined steady-state optimization/linear MPC scheme:
• Immediate reaction to disturbances; no waiting for the plant to reach a steady state is required.
• "Overregulation" is avoided – no variables are forced to fixed set points, and all degrees of freedom can be used to improve the (economic) performance of the plant.
• Performance goals and process constraints do not have to be mapped to a control cost that defines a compromise between different goals. In this way, the formulation of the optimization problem and the tuning are facilitated compared to achieving good performance by tuning the weights of a tracking formulation.
• More constraints than available manipulated variables can be handled, as well as more manipulated variables than variables that have to be regulated.
• No inconsistency arises from the use of different models on different layers.
• The overall scheme is structurally simple.
Similar to any NMPC controller that is designed for reference tracking, a successful implementation will require careful engineering such that as many uncertainties as possible are compensated by simple feedback controllers and only the key dynamic variables are handled by the optimizing controller, based on a rigorous model of the essential dynamics and of the stationary relations of the plant without too much detail.

History and Examples

The idea of economic or performance optimizing control originated from the process control community. In one of the first papers on directly integrating economic considerations into model-predictive control, Zanin et al. (2000) proposed to achieve a better economic performance by adding an economic term to a classical tracking performance criterion and applied this to the control of a fluidized bed catalytic cracker. Helbig et al. (2000) discussed different ways to integrate optimization and feedback control, including direct dynamic optimization, for the example of a semi-batch reactor. Toumi and Engell (2004) and Erdem et al. (2004) demonstrated online performance optimizing control schemes for simulated moving bed (SMB) chromatographic separations in lab scale. SMB processes are periodic processes and constitute prototypical examples where additional degrees of freedom can be used to simultaneously optimize system performance and meet product specifications. Bartusiak (2005) already reported industrial applications of carefully engineered performance optimizing NMPC controllers.
Direct performance optimizing control was suggested as a promising general new control paradigm for the process industries by Rolandi and Romagnoli (2005), Engell (2006, 2007), Rawlings and Amrit (2009), and others. Meanwhile, it has been demonstrated in many simulation studies that direct optimization of a performance criterion can lead to superior economic performance compared to classical tracking (N)MPC, e.g., Ochoa et al. (2010) for a bioethanol process and Idris and Engell (2012) for a reactive distillation column.

Further Issues

Modeling and Robustness
In a direct performance optimizing control approach, sufficiently accurate dynamic nonlinear process models are needed. While in the process industries nonlinear steady-state models are nowadays available for many processes, because they are built and used extensively in the process design phase, there is still a considerable additional effort required to formulate, implement, and validate nonlinear dynamic process models. The effort for rigorous or semi-rigorous modeling usually dominates the cost of an advanced control project. The alternative approach of using black-box or gray-box models, as proposed frequently in nonlinear model-predictive control, may be effective for regulatory control, where the model only has to capture the essential dynamic features of the plant near an operating point, but it seems to be less suitable for optimizing control, where the optimal plant performance is aimed at and hence the best stationary values of the inputs and of the controlled variables have to be computed by the controller. As increasingly so-called operator training simulators are built in parallel to the construction of a plant and are continuously used and updated after the commissioning phase, it seems attractive to use the models contained in the simulators also for online optimization. However, the model formulations often are not suitable for this purpose.
Model inaccuracies always have to be taken into account. They not only lead to suboptimal performance but can also cause constraints, even on measured variables, to be violated in the future because of an insufficient back-off from the constraints. A new approach to deal with uncertainties about model parameters and future influences on the process is multistage scenario-based optimization with recourse. Here the model uncertainties are represented by a set of scenarios of parameter variations, and the future availability
of additional information is taken into account. It has been demonstrated that this is an effective tool to handle model uncertainties and to automatically generate the necessary back-off without being overly conservative (Lucia et al. 2013).

State Estimation
For the computation of economically optimal process trajectories based upon a rigorous nonlinear process model, the state variables of the system at the beginning of the prediction horizon must be known. As not all states will be measured in a practical application, state estimation is a key ingredient of a performance optimizing controller. Extended Kalman filters are the standard solution used in the process industries; if the nonlinearities are significant, unscented Kalman filters or particle filters may be used. A novel approach is to formulate the state estimation problem also as an optimization problem on a moving horizon (Rao et al. 2003). The estimation of some important varying unknown model parameters can be included in this formulation. As accurate state estimation is at least as critical for the performance of the closed-loop system as the exact tuning of the optimizing controller, more attention should be paid to the investigation of the performance of state estimation schemes in realistic situations with non-negligible model-plant mismatch.

Stability
Optimization of a cost function over a finite horizon in general neither assures optimality of the complete trajectory beyond this horizon nor stability of the closed-loop system. Closed-loop stability has been addressed extensively in the theoretical research on nonlinear model-predictive control. Stability can be assured by a proper choice of the stage cost within the prediction horizon, the addition of a cost on the terminal state, and the restriction of the terminal state to a suitable set. In performance optimizing MPC, there is no a priori known steady state to which the trajectory should converge, and the economic cost function may not satisfy the usual conditions for closed-loop stability, e.g., because it only involves some of the inputs. In recent years, important results on formulations that guarantee closed-loop stability have nonetheless been obtained, involving terminal constraints or a quasi-infinite horizon (Angeli et al. 2012; Diehl et al. 2011; Grüne 2013).

Reliability and Transparency
Nowadays quite large nonlinear dynamic optimization problems can be solved in real time, not only for slow processes as they are found in the chemical industry but also in mechatronics and automotive control, so this issue no longer prohibits the application of a performance optimizing control scheme to complex systems. A practically very important limiting issue, however, is that of reliability and transparency. It is difficult to guarantee that a nonlinear optimizer will provide a solution which at least satisfies the constraints and gives a reasonable performance for all possible input data. While for an RTO scheme an inspection of the commanded set points by the operators usually will be feasible, this is less likely to be realistic in a dynamic situation. Hence, automatic result filters are necessary, as well as a backup scheme that stabilizes the process in the case where the result of the optimization is not considered safe. In the process industries, the operators will continue to supervise the operation of the plant in the foreseeable future, so a control scheme that includes performance optimizing control must be structured into modules whose outputs can still be understood by the operators, so that they build up trust in the optimization. Good operator interfaces that display the predicted moves and the predicted reaction of the plant and enable comparisons with the operators' intuitive strategies are believed to be essential for practical success.

Effort vs. Performance
The gain in performance by a more sophisticated control scheme always has to be traded against the increase in cost due to the complexity of the control scheme – a complex scheme will not only cause cost for its implementation, but it will need more maintenance by better qualified people than a simple one. If a carefully
740 Model-Based Performance Optimizing Control

chosen standard regulatory control layer leads to a close-to-optimal operation, there is no need for optimizing control. If the disturbances that affect profitability and cannot be handled well by the regulatory layer (in terms of economic performance) are slow, the combination of regulatory control and RTO is sufficient. In a more dynamic situation or for complex nonlinear multivariable plants, the idea of direct performance optimizing control should be explored and implemented if significant gains can be realized in simulations.

Cross-References
▶ Control and Optimization of Batch Processes
▶ Control Hierarchy of Large Processing Plants: An Overview
▶ Control Structure Selection
▶ Economic Model Predictive Control
▶ Extended Kalman Filters
▶ Model-Predictive Control in Practice
▶ Moving Horizon Estimation
▶ Particle Filters
▶ Real-Time Optimization of Industrial Processes

Bibliography
Angeli D, Amrit R, Rawlings J (2012) On average performance and stability of economic model predictive control. IEEE Trans Autom Control 57(7):1615–1626
Bartusiak RD (2005) NMPC: a platform for optimal control of feed- or product-flexible manufacturing. In: Preprints of the international workshop on assessment and future directions of NMPC, Freudenstadt, pp 3–14
Diehl M, Amrit R, Rawlings J, Angeli D (2011) A Lyapunov function for economic optimizing model predictive control. IEEE Trans Autom Control 56(3):703–707
Engell S (2007) Feedback control for optimal process operation. J Process Control 17:203–219 (plenary paper at IFAC ADCHEM 2006, Gramado)
Erdem G, Abel S, Morari M, Mazzotti M, Morbidelli M (2004) Automatic control of simulated moving beds. Part II: nonlinear isotherms. Ind Eng Chem Res 43:3895–3907
Finkler T, Lucia S, Dogru M, Engell S (2013) A simple control scheme for batch time minimization of exothermic semi-batch polymerizations. Ind Eng Chem Res 52:5906–5920
Finkler TF, Kawohl M, Piechottka U, Engell S (2014) Realization of online optimizing control in an industrial semi-batch polymerization. J Process Control 24:399–414
Grüne L (2013) Economic receding horizon control without terminal constraints. Automatica 49:725–734
Helbig A, Abel O, Marquardt W (2000) Structural concepts for optimization based control of transient processes. In: Allgöwer F, Zheng A (eds) Nonlinear model predictive control. Birkhäuser, Basel, pp 295–311
Idris IAN, Engell S (2012) Economics-based NMPC strategies for the operation and control of a continuous catalytic distillation process. J Process Control 22:1832–1843
Lucia S, Finkler T, Basak D, Engell S (2013) A new robust NMPC scheme and its application to a semi-batch reactor example. J Process Control 23:1306–1319
Marlin TE, Hrymak AN (1997) Real-time operations optimization of continuous processes. In: Proceedings of CPC V, Lake Tahoe. AIChE symposium series, vol 93, pp 156–164
Morari M, Stephanopoulos G, Arkun Y (1980) Studies in the synthesis of control structures for chemical processes, part I. AIChE J 26:220–232
Ochoa S, Wozny G, Repke J-U (2010) Plantwide optimizing control of a continuous bioethanol production process. J Process Control 20:983–998
Pham LC, Engell S (2011) A procedure for systematic control structure selection with application to reactive distillation. In: Proceedings of the 18th IFAC world congress, Milan, pp 4898–4903
Qin SJ, Badgwell TA (2003) A survey of industrial model predictive control technology. Control Eng Pract 11:733–764
Rao CV, Rawlings JB, Mayne DQ (2003) Constrained state estimation for nonlinear discrete-time systems. IEEE Trans Autom Control 48:246–258
Rawlings J, Amrit R (2009) Optimizing process economic performance using model predictive control. In: Nonlinear model predictive control – towards new challenging applications. Springer, Berlin, pp 119–138
Rolandi PA, Romagnoli JA (2005) A framework for online full optimizing control of chemical processes. In: Proceedings of ESCAPE 15, Barcelona. Elsevier, pp 1315–1320
Skogestad S (2000) Plantwide control: the search for the self-optimizing control structure. J Process Control 10:487–507
Toumi A, Engell S (2004) Optimization-based control of a reactive simulated moving bed process for glucose isomerization. Chem Eng Sci 59:3777–3792
Zanin AC, Tvrzska de Gouvea M, Odloak D (2000) Industrial implementation of a real-time optimization strategy for maximizing production of LPG in a FCC unit. Comput Chem Eng 24:525–531
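The formulation of state estimation as an optimization problem on a moving horizon, mentioned in this entry with reference to Rao et al. (2003), can be illustrated by a toy least-squares version. This is only a sketch under stated assumptions: the scalar model, the quadratic cost, and all tuning values (`weight`, `step`, `iters`) are illustrative choices, not taken from the entry or from Rao et al.

```python
# Toy moving-horizon estimation (MHE) sketch for a scalar linear system
# x[k+1] = a*x[k], measured as y[k] = x[k] + noise.  The state trajectory
# on the horizon minimizes a least-squares cost (measurement residuals
# plus weighted model residuals), solved here by plain gradient descent.
# All model and tuning values are illustrative assumptions.

def mhe(y, a, weight=10.0, iters=2000, step=0.02):
    """Estimate the states x[0..N-1] from the measurements y[0..N-1]."""
    n = len(y)
    x = list(y)                                  # initialize with measurements
    for _ in range(iters):
        g = [0.0] * n
        for k in range(n):
            g[k] += 2.0 * (x[k] - y[k])          # d/dx of (x[k]-y[k])^2
        for k in range(n - 1):
            r = x[k + 1] - a * x[k]              # model residual
            g[k] += -2.0 * weight * a * r        # d/dx[k] of weight*r^2
            g[k + 1] += 2.0 * weight * r         # d/dx[k+1] of weight*r^2
        x = [xk - step * gk for xk, gk in zip(x, g)]
    return x

a = 0.9
truth = [a ** k for k in range(8)]               # noiseless trajectory
noise = [0.05, -0.04, 0.03, -0.05, 0.04, -0.03, 0.05, -0.04]
y = [t + e for t, e in zip(truth, noise)]
xhat = mhe(y, a)
print([round(v, 3) for v in xhat])
```

Because the smoothing term couples neighboring states through the model, the estimates track the true trajectory much more closely than the raw measurements do; in a practical scheme the horizon would slide forward and an arrival cost would summarize older data.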
Modeling of Dynamic Systems from First Principles

S. Torkel Glad
Department of Electrical Engineering, Linköping University, Linköping, Sweden

Abstract
This entry describes how models can be formed from the basic principles of physics and the other fields of science. Use can be made of similarities between different domains, which leads to the concepts of bond graphs and, more abstractly, to port-controlled Hamiltonian systems. The class of models is naturally extended to differential algebraic equation (DAE) models. The concepts described here form a natural basis for parameter identification in gray box models.

Keywords
Bond graph; Differential algebraic equation (DAE); Differential algebra; Gray box model; Hamiltonian; Physical analogy; Physical modeling

Introduction
The approach to the modeling of dynamic systems depends on how much is known about the system. When the internal mechanisms are known, it is natural to model them using known relationships from physics, chemistry, biology, etc. Often the result is a model of the following form:

dx/dt = f(x, u; θ),   y = h(x, u; θ)   (1)

where u is the input, y is the output, and the state x contains internal physical variables, while θ contains parameters. Typically all of these are vectors. The model is known as a state space model. In many cases some elements in θ are unknown and have to be determined using parameter estimation. When used in connection with system identification, these models are sometimes referred to as gray box models (in contrast to black box models) to indicate that some degree of physical knowledge is assumed. In ▶ System Identification: An Overview, various connections between physical models and parameter estimation are discussed.

Overview of Physical Modeling
Since modeling covers such a wide variety of physical systems, there are no universal systematic principles. However, a few concepts have wide application. One of them is the preservation of certain quantities like energy, leading to balance equations. A simple example is given by the heating of a body. If W is the energy stored as heat, P1 an external power input, and P2 the heat loss to the environment per time unit, energy balance gives

dW/dt = P1 − P2   (2)

To get a complete model, one also needs constitutive relations, i.e., relations between relevant physical variables. For instance, one might know that the stored energy is proportional to the temperature T, W = CT, and that the energy loss is from black body radiation, P2 = kT^4. The model is then

C dT/dt = P1 − kT^4   (3)

The model is now an ordinary differential equation with state variable T, input variable P1, and parameters C and k.

Physical Analogies and General Structures

Physical Analogies
Physicists and engineers have noted that modeling in different areas of physics often gives very similar models. The term "analogies" is often used in modeling to describe this fact. Here we will show some analogies between electrical and mechanical phenomena. Consider the electric circuit given in Fig. 1. An ideal voltage source is connected in series with an inductor, a resistor, and a capacitor. Using u and v to denote the voltages over the voltage source and capacitor, respectively, and i to denote the current, a mathematical model is

C dv/dt = i
L di/dt + Ri + v = u   (4)

The first equation uses the definition of capacitance and the second one uses Kirchhoff's voltage law. Compare this to the mechanical system of Fig. 2 where an external force F is applied to a mass m that is also connected to a damper b and a spring with spring constant k. If S is the elongation force of the spring and w the velocity of the mass, a system model is

dS/dt = kw
m dw/dt + bw + S = F   (5)

Here the first equation uses the definition of the spring constant and the second one uses Newton's second law. The models are seen to be the same with the following correspondences between time-varying quantities

u ↔ F,   i ↔ w,   v ↔ S   (6)

and between parameters

C ↔ 1/k,   L ↔ m,   R ↔ b   (7)

Note that the products (voltage) × (current) and (force) × (velocity) give the power.

Modeling of Dynamic Systems from First Principles, Fig. 1 Electric circuit

Modeling of Dynamic Systems from First Principles, Fig. 2 Mechanical system

Bond Graphs
The bond graph is a tool to do systematic modeling based on the analogies of the previous section. The basic element is the bond

e
⇀
f

formed by a half arrow showing the direction of positive energy flow. Two variables are associated with the bond, the effort variable e and the flow variable f. The product ef of these variables gives the power. In the electric domain e is voltage and f is current. For mechanical systems e is force, while f is velocity. Bond graph theory has three basic components to describe storage and dissipation of energy. The relations

α de/dt = f,   β df/dt = e,   γf = e   (8)

are known as C, I, and R elements, respectively. Input signals are modeled by elements called effort sources Se or flow sources Sf, respectively. A bond graph describes the energy flow between these elements. When the energy flow is split, it can either be at s junctions, where the flows are equal and the efforts are added, or at p junctions, where the efforts are equal and the flows are additive. The model (5), for instance, can be described by the bond graph in Fig. 3. The graph shows how the energy from the external force is split into the acceleration of the mass, the elongation of the spring, and dissipation into the damper. The splitting of the energy flow is accomplished by an s element, meaning that the velocity is the same for all elements but that the forces are added:

F = N + S + T   (9)

Here T and N denote the forces associated with the damper and the mass, respectively. From (8) it follows that

k⁻¹ dS/dt = w,   m dw/dt = N,   bw = T   (10)

Together, (9) and (10) give the same model as (5). Using the correspondences (6) and (7), it is seen that the same bond graph can also represent the electric circuit (4). An overview of bond graph modeling is given in Rosenberg and Karnopp (1983). A general overview of modeling, including bond graphs and the connection with identification, can be found in Ljung and Glad (1994b).

Modeling of Dynamic Systems from First Principles, Fig. 3 Bond graph for the mechanical or electric system: the effort source Se feeds an s junction with common flow w, connected to the I:m, C:1/k, and R:b elements with forces N, S, and T

Port-Controlled Hamiltonian Systems
Many physical processes can be modeled as port Hamiltonian systems. This means that there are state variables x, a scalar function H, and a skew symmetric matrix M so that the system dynamics is

dx/dt = M ∇H(x)   (11)

The function H is called the Hamiltonian of the system. To be useful in a control context, this model class has to be extended to handle inputs and dissipation phenomena. To give an example, the mechanical system used above is considered again.

Introduce x1 as the length of the spring so that dx1/dt = w. If H1 is the energy stored in the spring, then the following relations hold:

H1(x1) = k x1^2 / 2,   ∂H1/∂x1 = k x1 = S   (12)

Introducing x2 for the momentum and H2 as the kinetic energy, one has dx2/dt = N and

H2(x2) = x2^2 / (2m),   ∂H2/∂x2 = x2/m = w   (13)

Let H = H1 + H2 be the total energy. Then the following relation holds:

dx/dt = ( [0 1; −1 0] − [0 0; 0 b] ) [∂H/∂x1; ∂H/∂x2] + [0; 1] F   (14)

This model is a special case of

dx/dt = (M − R) ∇H(x) + Bu   (15)

where M is a skew symmetric and R a nonnegative definite matrix, respectively. The model type is called a port-controlled Hamiltonian system with dissipation. Without external input (B = 0) and dissipation (R = 0), it reduces to an ordinary Hamiltonian system of the form (11). For systems generated by simple bond graphs, it can be shown that the junction structure gives the skew symmetric M, while the R elements give the matrix R. The storage of energy in I and C elements is reflected in H. The reader is directed to Duindam et al. (2009) for a description of the port Hamiltonian approach to modeling.

Component-Based Models and Modeling Languages
Since engineering systems are usually assembled from components, it is natural to treat their mathematical models in the same way. This is the idea behind block-oriented models where the output of one model is connected to the input of another one:

u1 → [system 1] → y1 = u2 → [system 2] → y2

A nice feature of this block connection is that the state space description is preserved. Suppose the individual models are of the form (1):

dxi/dt = fi(xi, ui),   yi = hi(xi, ui),   i = 1, 2   (16)

Then the connection u2 = y1 immediately gives the state space model

d/dt [x1; x2] = [f1(x1, u1); f2(x2, h1(x1, u1))],   y2 = h2(x2, h1(x1, u1))   (17)

with input u1, output y2, and state (x1, x2). This fact is the basis of block-oriented modeling and simulation tools like the MATLAB-based Simulink. Unfortunately, the preservation of the state space structure does not extend to more general connections of systems. Consider, for instance, two pieces of rotating machinery described by

Ji dωi/dt = −bi ωi + Mi,   i = 1, 2   (18)

where ωi is the angular velocity, Mi the external torque, Ji the moment of inertia, and bi the damping coefficient. Suppose the pieces are joined together so that they rotate with the same angular velocity. The mathematical model would then be

J1 dω1/dt = −b1 ω1 + M1
J2 dω2/dt = −b2 ω2 + M2
ω1 = ω2
M1 = −M2   (19)

This is no longer a state space model of the form (1), but a mixture of dynamic and static relationships, usually referred to as a differential algebraic equation (DAE). The difference from the connection of blocks in block diagrams is that now the connection is not between an input and an output. Instead there are the equations ω1 = ω2 and M1 = −M2 that destroy the state space structure. There exist modeling languages like Modelica (Fritzson 2000; Tiller 2001) or SimMechanics in MATLAB (The MathWorks 2002) that accept this more general type of model. It is then possible to form model libraries of basic components that can be interconnected in very general ways to form models of complex systems. However, this more general structure poses some challenges when it comes to analysis and simulation that are described in the next section.

Differential Algebraic Equations (DAE)
The model (19) is a special case of the general differential algebraic equation

F(dz/dt, z, u) = 0   (20)

A good description of both theory and numerical properties of such equations is given in Kunkel and Mehrmann (2006). In many cases it is possible to split the variables and equations in such a way that the following structure is achieved:

F1(dz1/dt, z1, z2, u) = 0,   F2(z1, z2, u) = 0   (21)

If z2 can be solved from the second equation and substituted into the first one, and if dz1/dt can then be solved from the first equation, the problem is reduced to an ordinary differential equation in z1. Often, however, the situation is not as simple as that. For the example (19), an addition of the first two equations gives

(J1 + J2) dω1/dt = −(b1 + b2) ω1   (22)

which is a standard first-order system description. Note, however, that in order to arrive at this result, the relation ω1 = ω2 has to be differentiated. This DAE thus includes an implicit differentiation.
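The elimination step above can be made concrete with a short numerical sketch. The following code (all parameter values are illustrative assumptions) integrates the reduced ODE (22) with the explicit Euler method and recovers the coupling torque M1, one of the algebraic variables of the DAE (19), from the first dynamic equation:

```python
# Numerical sketch of the coupled rotating-machinery DAE: the constraints
# w1 = w2 and M1 = -M2 are eliminated by hand (one implicit
# differentiation of w1 = w2), leaving the reduced ODE
# (J1 + J2) dw/dt = -(b1 + b2) w, integrated here with explicit Euler.
# The coupling torque M1 is recovered from J1 dw/dt = -b1 w + M1.
# All parameter values are illustrative.
import math

J1, J2, b1, b2 = 2.0, 1.0, 0.3, 0.2   # inertias and damping coefficients
w = 10.0                              # common angular velocity w1 = w2
h, T = 1e-4, 2.0                      # Euler step size and final time

steps = int(round(T / h))
for _ in range(steps):
    dw = -(b1 + b2) * w / (J1 + J2)   # reduced ODE (22)
    M1 = J1 * dw + b1 * w             # algebraic variable from the first equation
    w += h * dw

exact = 10.0 * math.exp(-(b1 + b2) / (J1 + J2) * T)
print(round(w, 4), round(exact, 4))
```

The integrated speed agrees with the analytic solution of (22) to within the Euler discretization error, and the recovered M1 is negative: the stiffer shaft is decelerated by the coupling, as expected from the action-reaction constraint M1 = −M2.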
In the general case one can investigate how many times (20) has to be differentiated in order to get an explicit expression for dz/dt. This number is called the (differentiation) index. Both theoretical analysis and practical experience show that the numerical difficulties encountered when solving a DAE increase with increasing index; see, e.g., the classical reference Brenan et al. (1987). It turns out that mechanical systems in particular give high-index models when constructed by joining components, and this has been an obstacle to the use of DAE models. For linear DAE models the role of the index can be seen more easily. A linear model is given by

E dz/dt + Fz = Gu   (23)

where the matrix E is singular (if E is invertible, multiplication with E⁻¹ from the left gives an ordinary differential equation). The system can be transformed by multiplying with P from the left and changing variables with z = Qw (P, Q nonsingular matrices). The transformed model is now

PEQ dw/dt + PFQ w = PGu   (24)

If λE + F is nonsingular for some value of the scalar λ, then it can be shown that there is a choice of P and Q such that (24) takes the form

[I 0; 0 N] [dw1/dt; dw2/dt] + [−A 0; 0 I] [w1; w2] = [B1; B2] u   (25)

where N is a nilpotent matrix, i.e., N^k = 0 for some positive integer k. The smallest such k turns out to be the index of the DAE. The transformed model (25) thus contains an ordinary differential equation:

dw1/dt = A w1 + B1 u   (26)

Using the nilpotency of N, the equation for w2 can be rewritten:

w2 = B2 u − N B2 du/dt + ⋯ + (−N)^(k−1) B2 d^(k−1)u/dt^(k−1)   (27)

This expression shows that an index k > 1 implies differentiation of the input (unless N B2 happens to be zero). This in turn implies potential difficulties, e.g., if u is a measured signal.

Identification of DAE Models
The extended use of DAE models in modern modeling tools also means that there is a need to use these models in system identification. To fully use system identification theory, one needs a stochastic model of disturbances. The inclusion of such disturbances leads to a class of models described as stochastic differential algebraic equations. The treatment of such models leads to some interesting problems. In the previous section it was seen that DAE models often contain implicit differentiations of external signals. If a DAE model is to be well posed, this differentiation must not affect signals modeled as white noise. In Gerdin et al. (2007), conditions are given that guarantee that stochastic DAEs are well posed. There it is also described how a maximum likelihood estimate can be made for DAE models, laying the basis for parameter estimation.

Differential Algebra
For the case where models consist of polynomial equations, it is possible to manipulate them in a very systematic way. The model (20) is then generalized to

F(d^n z/dt^n, …, dz/dt, z) = 0   (28)

where z is now a vector containing an arbitrary mix of inputs, outputs, and internal variables. There is then a theory based on Ritt (1950) that allows the transformation of (28) to a standard form where the properties of the system can be easily determined. The process is similar to the use of Gröbner bases but also includes the possibility of differentiating equations. Of particular interest to identification is the possibility of determining the identifiability of parameters with these tools. The model is then of the form

F(d^m y/dt^m, …, dy/dt, y, d^n z/dt^n, …, dz/dt, z; θ) = 0   (29)

where y contains measured signals, z contains unmeasured variables, and θ is a vector of parameters to be identified, while F is a vector of polynomials in these variables. It was shown in Ljung and Glad (1994a) that there is an algorithm giving for each parameter θk a polynomial

gk(d^m y/dt^m, …, dy/dt, y; θk) = 0   (30)

This relation can be regarded as a polynomial in θk where all coefficients are expressed in measured quantities. The local or global identifiability will then be determined by the number of solutions. If θk is unidentifiable, then no equation of the form (30) will exist, and this fact will also be demonstrated by the output of the algorithm.

Summary and Future Directions
There is no general method to derive models from first principles. However, modeling techniques based on bond graphs or port-controlled Hamiltonian systems offer a systematic approach for large model classes. Modeling languages like Modelica make the practical work with modeling much easier. A fundamental problem that comes up is that models are not necessarily in state space form but are so-called differential algebraic equation (DAE) models. Much of the future work is expected to deal with the handling of DAE models and with the development of modeling languages.

Cross-References
▶ Nonlinear System Identification: An Overview of Common Approaches
▶ System Identification: An Overview

Recommended Reading
A classical book on physical modeling is Rosenberg and Karnopp (1983), with emphasis on bond graph techniques. The physical modeling and identification perspectives are tied together in Ljung and Glad (1994b). A good reference for Hamiltonian techniques is Duindam et al. (2009). The Modelica modeling language is treated in Tiller (2001) and Fritzson (2000). The former emphasizes the physical modeling point of view; the latter also gives details of the language itself.

Bibliography
Brenan KE, Campbell SL, Petzold LR (1987) Numerical solution of initial-value problems in differential-algebraic equations. Classics in applied mathematics (Book 14). SIAM, Philadelphia
Duindam V, Macchelli A, Stramigioli S, Bruyninckx H (eds) (2009) Modeling and control of complex physical systems: the port-Hamiltonian approach. Springer, Berlin
Fritzson P (2000) Principles of object-oriented modeling and simulation with Modelica 2.1. IEEE/Wiley Interscience, Piscataway, NJ
Gerdin M, Schön T, Glad T, Gustafsson F, Ljung L (2007) On parameter and state estimation for linear differential-algebraic equations. Automatica 43:416–425
Kunkel P, Mehrmann V (2006) Differential-algebraic equations. European Mathematical Society, Zürich
Ljung L, Glad ST (1994a) On global identifiability of arbitrary model parameterizations. Automatica 30(2):265–276
Ljung L, Glad T (1994b) Modeling of dynamic systems. Prentice Hall, Englewood Cliffs
Ritt JF (1950) Differential algebra. American Mathematical Society, Providence
Rosenberg RC, Karnopp D (1983) Introduction to physical system dynamics. McGraw-Hill, New York
The MathWorks (2002) SimMechanics. User's guide. The MathWorks, Natick
Tiller MM (2001) Introduction to physical modeling with Modelica. Kluwer Academic, Boston

Modeling, Analysis, and Control with Petri Nets

Manuel Silva
Instituto de Investigación en Ingeniería de Aragón (I3A), Universidad de Zaragoza, Zaragoza, Spain

Abstract
Petri net is a generic term used to designate a broad family of related formalisms for discrete event views of (dynamic) systems (DES), all sharing some basic relevant features, such as
Modeling, Analysis, and Control with Petri Nets 747

sharing some basic relevant features, such as algebraic form using some matrices. As in the
minimality in the number of primitives, local- case of differential equations, an initial condition
ity of the states and actions (with consequences or state should be defined in order to represent
for model construction), or temporal realism. a dynamic system. This is done by means of an
The global state of a system is obtained by the initial distributed state. The English translation of
juxtaposition of the different local states. We the Carl Adam Petri’s seminal work, presented in
should initially distinguish between autonomous 1962, is Petri (1966).
formalisms and those extended by interpretation.
Models in the latter group are obtained by re-
stricting the underlying autonomous behaviors Untimed Place/Transition Net
by means of constraints that can be related to Systems
different kinds of external events, in particular to
time. This article first describes place/transition A place/transition net (PT-net) can be viewed as
nets (PT-nets), by default simply called Petri nets N D hP; T; Pre; Posti, where:
(PNs). Other formalisms are then mentioned. As • P and T are disjoint and finite nonempty sets
a system theory modeling paradigm for concur- of places and transitions, respectively.
rent DES, Petri nets are used in a wide variety of • Pre and Post are jP j  jT j sized, natural-
application fields. valued (zero included), incidence matrices.
The net is said to be ordinary if Pre and Post
are valued on f0; 1g. Weighted arcs permit
Keywords
the abstract modeling of bulk services and
arrivals.
Condition/event nets (CE-nets); Continuous Petri
A PT-net is a structure. The Pre (Post) function
nets (CPNs); Diagrams; Fluidization; Grafcet;
defines the connections from places to transitions M
Hybrid Petri nets (HPNs); Marking Petri nets;
(transitions to places). Those two functions can
Place/transition nets (PT-nets); High-level Petri
alternatively be defined as weighted flow relations
nets (HLPNs)
(nets as graphs). Thus, PT-nets can be represented
as bipartite directed graphs with places (p, using
Introduction circles) and transitions (t, using bars or rectan-
gles) as nodes: N D hP; T; F; W i, where F

Petri nets (PNs) are able to model concurrent .P  T / [ .T  P / is the flow relation (set of
and distributed DES ( Models for Discrete directed arcs, with dom.F /[range.F / D P [T ),
Event Systems: An Overview). They constitute and W W F ! NC assigns a natural weight to
a powerful family of formalisms with different each arc.
expressive purposes and power. They may be The net structure represents the static part
applied to inter alia, modeling, logical analysis, of the DES model. Furthermore, a “distributed
performance evaluation, parametric optimization, state” is defined over the set of places, known
dynamic control (minimum makespan, super- as the marking. This is “numerically quantified”
visory control, or other kinds), diagnosis, and (not in an arbitrary alphabet, as in automata),
implementation issues (eventually fault tolerant). associating natural values to the local state
Hybrid and continuous PNs are particularly variables, the places. If a place p has a value
useful when some parts of the system are highly .m.p/ D /, it is said to have  tokens
populated. Being multidisciplinary, formalisms (frequently depicted in graphic terms with v
belonging to the Petri nets paradigm may cover black dots or just the number inside the place).
several phases of the life cycle of complex DES. The places are “state variables,” while the
A Petri net can be represented as a bipartite markings are their “values”; the global state
directed graph provided with arcs inscriptions; is defined through the concatenation of local
alternatively, this structure can be represented in states. The net structure, provided with an initial
748 Modeling, Analysis, and Control with Petri Nets

Modeling, Analysis, and Control with Petri Nets, attributions (or meets); the logical AND is formed around
Fig. 1 Most basic PN constructions: The logical OR transitions, in joins (or waits or rendezvous) and forks (or
is present around places, in choices (or branches) and splits)

Modeling, Analysis, and Control with Petri Nets, Fig. 2 Only transitions b and c are initially enabled. The results
of firing b or c are shown subsequently

marking, to be denoted as .N ; m0 /, is a Petri net represents the number of resources to be con-


system, or marked Petri net. sumed. The post-condition defines the number
The last two basic PN constructions in Fig. 1 resources produced by the firing of the transition.
( join and fork) do not appear in finite-state ma- This is made explicit by the weights of the arcs
chines; moreover, the arcs may be valued with from the transition to the places. Three important
natural numbers. The dynamic behavior of the net observations should be taken into account:
system (trajectories with changes in the marking) • The underlying logic in the firing of a tran-
is produced by the firing of transitions, some sition is non-monotonic! It is a consump-
“local operations” which follows very simple tion/production logic.
rules. • Enabled transitions are never forced to fire:
Markings in net systems evolve according to This is a form of non-determinism.
the following firing (or occurrence) rules (see, • An occurrence sequence is a sequence of fired
Fig. 2): transitions D t1 : : : tk . In the evolution
• A transition is said to be enabled at a given from m0 , the reached marking m can be easily
marking if each input place has at least as computed as:
many tokens as the weight of the arc joining
them. m D m0 C C   ; (1)
• The firing or occurrence of an enabled tran-
sition is an instantaneous operation that re- where C D Post – Pre is the token flow matrix
moves from (adds to) each input (output) place (incidence matrix if N is self-loop free) and
a number of tokens equal to the weight of  the firing count vector corresponding to  .
the arc joining the place (transition) to the Thus m and  are vectors of natural numbers.
transition (place). The previous equation is the state-transition
The precondition of a transition can be seen as the equation (frequently known as the fundamental
resources required for the transition to be fired. or, simply, state equation). Nevertheless, two im-
The weight of the arc from a place to a transition portant remarks should be made:
Modeling, Analysis, and Control with Petri Nets 749

• It represents a necessary but not sufficient condition for reachability; the problem is that the existence of a σ⃗ does not guarantee that a corresponding sequence is firable from m0; thus, certain solutions – called spurious (Silva et al. 1998) – are not reachable. This implies that – except in certain net system subclasses – only semi-decision algorithms can usually be derived.
• All variables are natural numbers, which implies computational complexity.
It should be pointed out that in finite-state machines, the state is a single variable taking values in a symbolic unstructured set, while in PT-net systems, it is structured as a vector of nonnegative integers. This allows analysis techniques that do not require the enumeration of the state space.
At a structural level, observe that the negation is missing in Fig. 1; its inclusion leads to the so-called inhibitor arcs, an extension in expressive power. In its most basic form, if the place at the origin of an inhibitor arc is marked, it "inhibits" the enabling of the target transition. PT-net systems can model infinite-state systems, but not Turing machines. PT-net systems provided with inhibitor arcs (or priorities on the firing of transitions) can do it.
With this conceptually simple formalism, it is not difficult to express basic synchronization schemas (Fig. 3). All the illustrated examples use joins. When weights are allowed on the arcs, another kind of synchronization appears: several copies of the same resource are needed (or produced) in a single operation. Being able to express concurrency and synchronization, when viewing the system at a higher level, it is possible to build cooperation and competition relationships.

Analysis and Control of Untimed PT Models

The behavior of a concurrent (eventually distributed) system is frequently difficult to understand and control. Thus, misunderstandings and mistakes are frequent during the design cycle. A way of cutting down the cost and duration of the design process is to express in a formalized way properties that the system should enjoy and to use formal proof techniques. Errors can be eventually detected close to the moment they are introduced, reducing their propagation to subsequent stages. The goal in verification is to ensure that a given system is correct with respect to its specification (perhaps expressed in temporal-logic terms) or to a certain set of predetermined properties.
Among the most basic qualitative properties of "net systems" are the following: (1) reachability of a marking from a given one; (2) boundedness, characterizing finiteness of the state space; (3) liveness, related to potential fireability of all transitions starting from an arbitrary reachable marking (deadlock-freeness is a weaker condition in which only global infinite fireability of the net system model is guaranteed, even if some transitions no longer fire); (4) reversibility, characterizing recoverability of the initial marking from any reachable one; and (5) mutual exclusion of two places, dealing with the impossibility of reaching markings in which both places are simultaneously marked.
All the above are behavioral properties, which depend on the net system (N, m0). In practice, sometimes problems with a net model are rooted in the net structure; thus, the study of the structural counterpart of certain behavioral properties may be of interest. For example, a "net" is structurally bounded if it is bounded for any initial marking; a "net" is structurally live if an initial marking exists that makes the net system live (otherwise stated, it reflects non-liveness for arbitrary initial markings, a pathology of the net).
Basic techniques to analyze net systems include: (1) enumeration, in its most basic form based on the construction of a reachability graph (a sequentialized view of the behavior). If the net system is not bounded, losing some information, a finite coverability graph can be constructed; (2) transformation, based on an iterative rewriting process in which a net system enjoys a certain property if and only if a transformed ("simpler" to analyze) one also does. If the new system is easier to analyze, and the transformation is computationally cheap, the process may be extremely interesting; (3) structural, based on
Modeling, Analysis, and Control with Petri Nets, Fig. 3 Basic synchronization schemes: (1) Join or rendezvous, RV; (2) Semaphore, S; (3) Mutual exclusion semaphore (mutex), R, representing a shared resource; (4) Symmetric RV built with two semaphores; (5) Asymmetric RV built with two semaphores (master/slave); (6) Fork-join (or par-begin/par-end); (7) Non-recursive subprogram (places i and j cannot be simultaneously marked – must be in mutex – to remember the returning point; for simplicity, it is assumed that the subprogram is single input/single output); (8) Guard (a self-loop from a place through a transition); its role is like a traffic light: if at least one token is present at the place, it allows the evolution, but it is not consumed. Synchronizations can also be modeled by the weights associated to the arcs going to transitions
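The enabling and firing rules and the state equation (1) above can be sketched in a few lines of code. The following is a minimal, self-contained illustration; the two-place, two-transition example net and all names are invented for this sketch and are not part of any standard Petri net tool:

```python
# Two places x two transitions; rows are places, columns are transitions,
# entries are arc weights (the example net is invented for this sketch).
Pre  = [[1, 0],
        [0, 1]]
Post = [[0, 1],
        [1, 0]]
PLACES, TRANS = len(Pre), len(Pre[0])

# Token flow matrix C = Post - Pre (the incidence matrix when the net
# is self-loop free).
C = [[Post[p][t] - Pre[p][t] for t in range(TRANS)] for p in range(PLACES)]
m0 = [1, 0]  # initial marking

def enabled(m, t):
    """t is enabled at marking m if each input place holds at least
    as many tokens as the weight of the arc joining them."""
    return all(m[p] >= Pre[p][t] for p in range(PLACES))

def fire(m, t):
    """Firing removes Pre[p][t] tokens from and adds Post[p][t] tokens
    to each place p; it is only defined for enabled transitions."""
    assert enabled(m, t)
    return [m[p] - Pre[p][t] + Post[p][t] for p in range(PLACES)]

# Fire the occurrence sequence sigma = t0 t1 step by step ...
m = m0
for t in (0, 1):
    m = fire(m, t)

# ... and check the reached marking against the state equation
# m = m0 + C * sigma_vec, with sigma_vec the firing count vector.
sigma_vec = [1, 1]
m_eq = [m0[p] + sum(C[p][t] * sigma_vec[t] for t in range(TRANS))
        for p in range(PLACES)]
assert m == m_eq
```

Note that the state equation check is only a necessary condition for reachability: as discussed above, a natural solution σ⃗ may be spurious, i.e., no firable sequence may correspond to it.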
graph properties, or in mathematical programming techniques rooted in the state equation; (4) simulation, particularly interesting for gaining certain confidence about the absence of certain pathological behaviors. Analysis strategies combining all these kinds of techniques are extremely useful in practice.
Reachability techniques provide sequentialized views for a particular initial marking. Moreover, they suffer from the so-called state explosion problem. The reduction of this computational issue leads to techniques such as stubborn set methods (smaller, equally informative reachability graphs), also to non-sequentialized views such as those based on unfoldings. Transformation techniques are extremely useful, but not complete (i.e., not all net systems can be reduced in a practical way). In most cases, structural techniques only provide necessary or sufficient conditions (e.g., a sufficient condition for deadlock-freeness, a necessary condition for reversibility, etc.), but not a full characterization. As already pointed out, a limitation of methods based on the state equation for analyzing net systems is the existence of non-reachable solutions (the so-called spurious solutions). In
this context, three kinds of related notions that must be differentiated are the following: (1) some natural vectors (left and right annullers of the token flow matrix C: P-semiflows and T-semiflows), (2) some invariant laws (token conservation and repetitive behaviors), and (3) some peculiar subnets (conservative and consistent components, generated by the subsets of nodes in the P- and T-semiflows, respectively).
More than analysis, control leads to synthesis problems. The idea is to enforce the given system in order to fulfill a specification (e.g., to enforce certain mutual exclusion properties). Technically speaking, the idea is to "add" some elements in order to constrain the behavior in such a way that a correct execution is obtained. Questions related to control, observation, diagnosis, or identification are all areas of ongoing research.
With respect to classical control theory, there are two main differences: models are DES and untimed (autonomous, fully nondeterministic, eventually labeling the transitions in order to be able to consider the PN languages). Let us remark that for control purposes, the transitions should be partitioned into controllable (when enabled, you can either force or block the firing) and uncontrollable (if enabled, the firing is nondeterministic). A natural approach to synthesize a control is to start modeling the plant dynamics (by means of a PN, P) and adopting a specification for the desired closed-loop system (S). The goal is to compute a controller (L) such that S equals the parallel-composition of P and L; in other words, controllers (called "supervisors") are designed to ensure that only behaviors consistent with the specification may occur. The previous equality is not always possible, and the goal is usually relaxed to minimally limit the behavior within the specified legality (i.e., to compute maximally permissive controllers). For an approach in the framework of finite-state machines and regular languages, see ▶ Supervisory Control of Discrete-Event Systems. The synthesis in the framework of Petri nets, with goals such as enforcing generalized mutual exclusion constraints on markings or avoiding deadlocks, for example, can be efficiently approached by means of the so-called structure theory, based on the direct exploitation of the structure of the net model (using graph or mathematical programming theories and algorithms, where the initial marking is a parameter).
Similarly, transitions (or places) can be partitioned into observable and unobservable. Many observability problems may be of interest; for example, observing the firing of a subset of transitions to compute the subset of markings in which the system may be. Related to observability, diagnosis is the process of detecting a failure (any deviation of a system from its intended behavior) and identifying the cause of the abnormality. Diagnosability, like observability or controllability, is a logical criterion. If a model is diagnosable with respect to a certain subset of possible faults (i.e., it is possible to detect the occurrence of those faults in finite time), a diagnoser can be constructed (see section "Diagnosis and Diagnosability Analysis of Petri Nets" in ▶ Diagnosis of Discrete Event Systems). Identification of DES is also a question that has required attention in recent years. In general, the starting point is a behavioral observation, the goal being to construct a PN model that generates the observed behavior, either from examples/counterexamples of its language or from the structure of a reachability graph. So the results are derived models, not human-made models (i.e., not made by designers).

The Petri Nets Modeling Paradigm

Along the life cycle of DES, designers may deal with basic modeling, analysis, and synthesis from different perspectives together with implementation and operation issues. Thus, the designer may be interested in expressing the basic structure, understanding untimed possible behaviors, checking logic properties on the model when provided with some timing (e.g., in order to guarantee if a certain reaction is possible before 3 ms; something relevant in real-time systems), computing some performance indices on timed models (related to the throughput in the firing of a given transition, or to the length of a waiting line,
expressed as the number of tokens in a place), computing a schedule or control that optimizes a certain objective function, decomposing the model in order to prepare an efficient implementation, efficiently determining redundancies in order to increase the degree of fault tolerance, etc. For these different tasks, different formalisms may be used. Nevertheless, it seems desirable to have a family of related formalisms rather than a collection of "unrelated" or weakly related formalisms. The expected advantages would include coherence among models usable in different phases, economy in the transformations, and synergy in the development of models and theories.

Other Untimed PN Formalisms: Levels of Expressive Power

PT-net systems are more powerful than condition/event (CE) systems, roughly speaking the basic seminal formalism of Carl Adam Petri, in which places can be marked only with zero or one token (Boolean marking). CE-systems can model only finite-state systems. As already said, "extensions" of the expressive power of untimed PT-net systems to the level of Turing machines are obtained by adding inhibitor arcs or priorities to the firing of transitions.
An important idea is adding the notion of individuals to tokens (e.g., from anonymous to labeled or colored tokens). Information in tokens allows the objects to be named (they are no longer indistinguishable) and dynamic associations to be created. Moving from PT-nets to so-called high-level PNs (HLPNs) is something like "moving from assembler to high-level programming languages," or, at the computational level, like "moving from a pure numerical to a symbolic level." There are many proposals in this respect, the more important being predicate/transition nets and colored PNs. Sometimes, this type of abstraction has the same theoretical expressiveness as PT-net systems (e.g., colored PNs if the number of colors is finite); in other words, high-level views may lead to more compact and structured models, while keeping the same theoretical expressive power of PT-nets (i.e., we can speak of "abbreviations," not of "extensions"). In other cases, object-oriented concepts from computer programming are included in certain HLPNs. The analysis techniques of HLPNs can be approached with techniques based on enumeration, transformation, or structural considerations and simulation, generalizing those developed for PT-net systems.

Extending Net Systems with External Events and Time: Nonautonomous Formalisms

When dealing with net systems that interact with some specific environment, the marking evolution rule must be slightly modified. This can be done in an enormous number of ways, considering external events and logical conditions as inputs to the net model, in particular some depending on time. The same interpretation given to a graph in order to define a finite-state diagram can be used to define a marking diagram, a formalism in which the key point is to recognize that the state is now numerical (for PT-net systems) and distributed. For example, drawing a parallel with Moore automata, the transitions should be labeled with logical conditions and events, while unconditional actions are associated to the places. If a place is marked, the associated actions are emitted.
Even if only time-based interpretations are considered, there are a large number of successful proposals for formalisms. For example, it should be specified if time is associated to the firing of transitions (T-timing may be atomic or in three phases), to the residence of tokens in places (P-timed), to the arcs of the net, as tags to the tokens, etc. Moreover, even for T-timed models, there are many ways of defining the timing: time intervals, stochastic or possibilistic forms, and the deterministic case as a particular one. If the firing of transitions follows exponential pdfs, and the conflict resolution follows the race policy (i.e., fire in conflicts the first that ends the task,
not a preselection policy), the underlying Markov chain described is isomorphic to the reachability graph (due to the Markovian "memoryless" property). Moreover, the addition of immediate transitions (whose firing is instantaneous) enriches the practical modeling possibilities, eventually complicating the analysis techniques. Timed models are used to compute minimum and maximum time delays (when time intervals are provided, in real-time problems) or performance figures (throughput, utilization rate of resources, average number of tokens – clients – in services, etc.). For performance evaluation, there is an array of techniques to compute bounds, approximated values, or exact values, sometimes generalizing those that are used in certain queueing network classes of models. Simulation techniques are frequently very helpful in practice to produce an educated guess about the expected performance.
Time constraints on Petri nets may change logical properties of models (e.g., mutual exclusion constraints, deadlock-freeness, etc.), calling for new analysis techniques. For example, certain timings on transitions can transform a live system into a non-live one (if to the net system in Fig. 2 are associated deterministic times to transitions and a race policy, with the time associated to transition c smaller than that of transition a, transition b cannot be fired after firing transition d; thus it is non-live, while the untimed model was live). By the addition of some time constraints, the transformation of a non-live model into a live one is also possible. So additional analysis techniques need to be considered, redefining the state, now depending also on time, more than just on the marking.
Finally, as in any DES, the optimal control of timed Petri net models (scheduling, sequencing, etc.) may be approached by techniques such as dynamic programming or perturbation analysis (presented in the context of queueing networks and Markov chains, see ▶ Perturbation Analysis of Discrete Event Systems). In practice, those problems are frequently approached by means of some partially heuristic strategies. About the diagnosis of timed Petri nets, see ▶ Diagnosis of Discrete Event Systems. Of course, all these tasks can be done with HLPNs.

Fluid and Hybrid PN Models

Different ideas may lead to different kinds of hybrid PNs. One is to fluidize (here, to relax the natural numbers of discrete markings into the nonnegative reals) the firing of transitions that are "most time" enabled. Then the relaxed model has discrete and continuous transitions, thus also discrete and continuous places. If all transitions are fluidized, the PN system is said to be fluid or continuous, even if technically it is a hybrid one. In this approach, the main goal is to try to overcome the state explosion problem inherent to enumeration techniques. Proceeding in that way, some computationally NP-hard problems may become much easier to solve, eventually in polynomial time. In other words, fluidization is an abstraction that tries to make tractable certain real-scale DES problems (▶ Discrete Event Systems and Hybrid Systems, Connections Between).
When transitions are timed with the so-called infinite server semantics, the PN system can be observed as a time-differentiable piecewise affine system. Thus, even if the relaxation "simplifies" computations, it should be taken into account that continuous PNs with infinite server semantics are able to simulate Turing machines. From a different perspective, the steady-state throughput of a given transition may be non-monotonic with respect to the firing rates or the initial marking (e.g., if faster or more machines are used, the uncontrolled system may be slower); moreover, due to the important expressive power, discontinuities may even appear with respect to continuous design parameters such as firing rates, for example.
An alternative way to define hybrid Petri nets is a generalization of hybrid automata: the net system is a DES, but sets of differential equations are associated to the marking of places. If a place is marked, the corresponding differential equations contribute to define its evolution.

Summary and Future Directions

Petri nets designate a broad family of related DES formalisms (a modeling paradigm), each one specifically tailored to approach certain problems. Conceptual simplicity coexists with
powerful modeling, analysis, and synthesis capabilities. From a control theory perspective, much work remains to be done for both untimed and timed formalisms (remember, there are many different ways of timing), particularly when dealing with optimal control of timed models. In engineering practice, approaches to the latter class of problems frequently use heuristic strategies. From a broader perspective, future research directions include improvements required to deal with controllability and the design of controllers, with observability and the design of observers, with diagnosability and the design of diagnosers, and with identification. This work is not limited to the strict DES framework, but also applies to analogous problems relating to relaxations into hybrid or fluid approximations (particularly useful when high populations are considered). The distributed nature of systems is more and more frequent and is introducing new constraints, a subject requiring serious attention. In all cases, different from firing language approaches, the so-called structure theory of Petri nets should gain more interest.

Cross-References

▶ Applications of Discrete-Event Systems
▶ Diagnosis of Discrete Event Systems
▶ Discrete Event Systems and Hybrid Systems, Connections Between
▶ Models for Discrete Event Systems: An Overview
▶ Perturbation Analysis of Discrete Event Systems
▶ Supervisory Control of Discrete-Event Systems

Recommended Reading

Topics related to PNs are considered in well over a hundred thousand papers and reports. The first generation of books concerning this field is Brauer (1980), immediately followed by Starke (1980), Peterson (1981), Brams (1983), Reisig (1985), and Silva (1985). The fact that they are written in English, French, German, and Spanish is proof of the rapid dissemination of this knowledge. Most of these books deal essentially with PT-net systems. Complementary surveys are Murata (1989), Silva (1993), and David and Alla (1994), the latter also considering some continuous and hybrid models. Concerning high-level PNs, Jensen and Rozenberg (1991) is a selection of papers covering the main developments during the 1980s. Jensen and Kristensen (2009) focuses on state space methods and simulation, where elements of timed models are taken into account, but performance evaluation of stochastic systems is not covered. Approaching the present day, relevant works written with complementary perspectives include, inter alia, Girault and Valk (2003), Diaz (2009), David and Alla (2010), and Seatzu et al. (2013). The consideration of time in nets with an emphasis on performance and performability evaluation is addressed in monographs such as Ajmone Marsan et al. (1995), Bause and Kritzinger (1996), Balbo and Silva (1998), and Haas (2002), while timed models under different fuzzy interpretations are the subject of Cardoso and Camargo (1999). Structure-based approaches to controlling PN models are the main subject in Iordache and Antsaklis (2006) or Chen and Li (2013). Different kinds of hybrid PN models are studied in Di Febbraro et al. (2001), Villani et al. (2007), and David and Alla (2010), while a broad perspective about modeling, analysis, and control of continuous (untimed and timed) PNs is provided by Silva et al. (2011).
DiCesare et al. (1993) and Desrochers and Al-Jaar (1995) are devoted to the applications of PNs to manufacturing systems. A comprehensive updated introduction to business process systems and PNs can be found in van der Aalst and Stahl (2011). Special volumes dealing with other monographic topics are, for example, Billington et al. (1999), Agha et al. (2001), and Cortadella et al. (2002). An application domain for Petri nets emerging over the last two decades is systems biology, a model-based approach devoted to the analysis of biological systems (Koch et al. 2011; Wingender 2011). Furthermore, it should be pointed out that Petri nets have also been employed in many other application domains (e.g., from logistics to musical systems).
For an overall perspective of the field over the five decades that have elapsed since the publication of Carl Adam Petri's PhD thesis, including historical, epistemological, and technical aspects, see Silva (2013).

Bibliography

Agha G, de Cindio F, Rozenberg G (eds) (2001) Concurrent object-oriented programming and Petri nets, advances in Petri nets. Volume 2001 of LNCS. Springer, Berlin/Heidelberg/New York
Ajmone Marsan M, Balbo G, Conte G, Donatelli S, Franceschinis G (1995) Modelling with generalized stochastic Petri nets. Wiley, Chichester/New York
Balbo G, Silva M (eds) (1998) Performance models for discrete event systems with synchronizations: formalisms and analysis techniques. Proceedings of human capital and mobility MATCH performance advanced school, Jaca. Available online: http://webdiis.unizar.es/GISED/?q=news/matchbook
Bause F, Kritzinger P (1996) Stochastic Petri nets. An introduction to the theory. Vieweg, Braunschweig
Billington J, Diaz M, Rozenberg G (eds) (1999) Application of Petri nets to communication networks, advances in Petri nets. Volume 1605 of LNCS. Springer, Berlin/Heidelberg/New York
Brams GW (1983) Réseaux de Petri: théorie et pratique. Masson, Paris
Brauer W (ed) (1980) Net theory and applications. Volume 84 of LNCS. Springer, Berlin/New York
Cardoso J, Camargo H (eds) (1999) Fuzziness in Petri nets. Volume 22 of studies in fuzziness and soft computing. Physica-Verlag, Heidelberg/New York
Chen Y, Li Z (2013) Optimal supervisory control of automated manufacturing systems. CRC, Boca Raton
Cortadella J, Yakovlev A, Rozenberg G (eds) (2002) Concurrency and hardware design, advances in Petri nets. Volume 2549 of LNCS. Springer, Berlin/Heidelberg/New York
David R, Alla H (1994) Petri nets for modeling of dynamic systems – a survey. Automatica 30(2):175–202
David R, Alla H (2010) Discrete, continuous and hybrid Petri nets. Springer-Verlag, Berlin/Heidelberg
Desrochers A, Al-Jaar RY (1995) Applications of Petri nets in manufacturing systems. IEEE, New York
Diaz M (ed) (2009) Petri nets: fundamental models, verification and applications. Control systems, robotics and manufacturing series (CAM). Wiley, London
DiCesare F, Harhalakis G, Proth JM, Silva M, Vernadat FB (1993) Practice of Petri nets in manufacturing. Chapman & Hall, London/Glasgow/New York
Di Febbraro A, Giua A, Menga G (eds) (2001) Special issue on hybrid Petri nets. Discret Event Dyn Syst 11(1–2):5–185
Girault C, Valk R (2003) Petri nets for systems engineering. A guide to modeling, verification, and applications. Springer, Berlin
Haas PJ (2002) Stochastic Petri nets. Modelling, stability, simulation. Springer series in operations research. Springer, New York
Iordache MV, Antsaklis PJ (2006) Supervisory control of concurrent systems: a Petri net structural approach. Birkhäuser, Boston
Jensen K, Kristensen LM (2009) Coloured Petri nets. Modelling and validation of concurrent systems. Springer, Berlin
Jensen K, Rozenberg G (eds) (1991) High-level Petri nets. Springer, Berlin
Koch I, Reisig W, Schreiber F (eds) (2011) Modeling in systems biology. The Petri net approach. Computational biology, vol 16. Springer, Berlin
Murata T (1989) Petri nets: properties, analysis and applications. Proc IEEE 77(4):541–580
Peterson JL (1981) Petri net theory and the modeling of systems. Prentice-Hall, Upper Saddle River
Petri CA (1966) Communication with automata. Rome Air Development Center-TR-65-377, New York
Reisig W (1985) Petri nets. An introduction. Volume 4 of EATCS monographs on theoretical computer science. Springer-Verlag, Berlin/Heidelberg/New York
Seatzu C, Silva M, Schuppen J (eds) (2013) Control of discrete-event systems. Automata and Petri net perspectives. Number 433 in lecture notes in control and information sciences. Springer, London
Silva M (1985) Las Redes de Petri: en la Automática y la Informática. Ed. AC, Madrid (2nd ed., Thomson-AC, 2002)
Silva M (1993) Introducing Petri nets. In: Practice of Petri nets in manufacturing. Chapman and Hall, London/New York, pp 1–62
Silva M (2013) Half a century after Carl Adam Petri's PhD thesis: a perspective on the field. Annu Rev Control 37(2):191–219
Silva M, Teruel E, Colom JM (1998) Linear algebraic and linear programming techniques for the analysis of net systems. Volume 1491 of LNCS, advances in Petri nets. Springer, Berlin/Heidelberg/New York, pp 309–373
Silva M, Júlvez J, Mahulea C, Vázquez C (2011) On fluidization of discrete event models: observation and control of continuous Petri nets. Discret Event Dyn Syst 21:427–497
Starke P (1980) Petri-Netze. Deutscher Verlag der Wissenschaften, Berlin
van der Aalst W, Stahl C (2011) Modeling business processes: a Petri net oriented approach. MIT, Cambridge
Villani E, Miyagi PE, Valette R (2007) Modelling and analysis of hybrid supervisory systems. A Petri net approach. Springer, Berlin
Wingender E (ed) (2011) Biological Petri nets. Studies in health technology and informatics, vol 162. IOS Press, Lansdale
Model-Predictive Control in Practice

Thomas A. Badgwell¹ and S. Joe Qin²
¹ ExxonMobil Research & Engineering, Annandale, NJ, USA
² University of Southern California, Los Angeles, CA, USA

Synonyms

MPC

Abstract

This entry provides a brief description of model predictive control (MPC) technology and how it is used in practice. The emphasis here is on refining and chemical plant applications where the technology has achieved its greatest acceptance. After a short description of what MPC is and how it fits into the hierarchy of control functions, the basic algorithm is presented as a sequence of three optimization problems. The steps required for a successful application are then outlined, followed by a summary and outline of likely future directions for MPC technology.

Keywords

Computer control; Mathematical programming; Predictive control

Introduction

Model predictive control (MPC) refers to a class of computer control algorithms that utilize an explicit mathematical model to predict future process behavior. At each control interval, in the most general case, an MPC algorithm solves a sequence of three nonlinear programs to answer the following essential questions: where is the process heading (state estimation), where should the process go (steady-state target optimization), and what is the best sequence of control (input) adjustments to send it to the right place (dynamic optimization). The first control (input) adjustment is implemented and then the entire calculation sequence is repeated at the subsequent control cycles.

MPC technology arose first in the context of petroleum refinery and power plant control problems (Cutler and Ramaker 1979; Richalet et al. 1978). Specific needs that drove the development of MPC technology include the requirement for economic optimization and strict enforcement of safety and equipment constraints. Promising early results led to a wave of successful industrial applications, sparking the development of several commercial offerings (Qin and Badgwell 2003) and generating intense interest from the academic community (Mayne et al. 2000). Today MPC technology permeates the refining and chemical industries and has gained increasing acceptance in a wide variety of areas including chemicals, automotive, aerospace, and food processing applications. The total number of MPC applications worldwide was estimated in 2003 to be 4,500 (Qin and Badgwell 2003).

MPC Control Hierarchy

In a modern chemical plant or refinery, MPC is part of a multilevel hierarchy, as illustrated in Fig. 1. Moving from the top level to the bottom, the control functions execute at a higher frequency but cover a smaller geographic scope. At the bottom level, referred to as Level 0, proportional-integral-derivative (PID) controllers execute several times a second within distributed control system (DCS) hardware. These controllers adjust individual valves to maintain desired flows, pressures, levels, and temperatures.

At Level 1, MPC runs once a minute to perform dynamic constraint control for an individual processing unit, such as a crude distillation unit or a fluid catalytic cracker (Gary et al. 2007). It typically utilizes a linear dynamic
Model-Predictive Control in Practice, Fig. 1 Hierarchy of control functions in a refinery/chemical plant
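The hierarchy of Fig. 1, as described in the surrounding text, can be restated compactly; the small data structure below simply collects the execution rates given in the text and is purely illustrative:

```python
# Levels of the control hierarchy of Fig. 1, bottom to top: each level
# executes less frequently but covers a wider scope.  The tuples restate
# the rates given in the text; the structure itself is only a sketch.
HIERARCHY = [
    (0, "PID control in DCS hardware",    "several times a second"),
    (1, "MPC dynamic constraint control", "once a minute"),
    (2, "Real-time optimization (RTO)",   "hourly"),
    (3, "Planning and scheduling",        "daily"),
]

for level, function, rate in HIERARCHY:
    print(f"Level {level}: {function} -- executes {rate}")
```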
model identified directly from process step-test data. The MPC has the job of holding the unit at the best economic operating point in the face of dynamic disturbances and operational constraints.

At Level 2, a real-time optimizer (RTO) runs hourly to calculate optimal steady-state targets for a collection of processing units. It uses a rigorous first-principles steady-state model to calculate targets for key operating variables such as unit temperatures and feed rates. These are typically passed down to several MPCs for implementation.

At Level 3, planning and scheduling functions are carried out daily to optimize economics for an entire chemical plant or refinery. Simple steady-state models are typically used at this level, with some nonlinear but mostly linear connections between model inputs and outputs. Key operating targets and economic data are typically passed to several RTO applications for implementation.

Note that a different mathematical model of the process is used at each level of the hierarchy. These models must be reconciled in some manner with current plant operation and with each other in order for the overall system to function properly.

MPC Algorithms

MPC algorithms function in much the same way that an experienced human operator would approach a control problem. Figure 2 illustrates the flow of information for a typical MPC implementation. At each control interval, the algorithm compares the current model output prediction yp to the measured output ym and passes the prediction error e and control (input) u to a state estimator, which estimates the dynamic state x. The most commonly used methods for state estimation can be viewed as special cases of an optimization-based formulation called moving horizon estimation (MHE) (Rawlings and Mayne 2009). The state estimate x̂, which includes an estimate of the process disturbances d̂, is then passed to a steady-state optimizer to determine the best operating point for the unit. The steady-state optimizer must also consider operator-entered output and control (input) targets yt and ut. The steady-state state and control (input) targets xs and us are then passed, along with the state estimate x̂, to a dynamic optimizer to compute the best trajectory of future control (input) adjustments. The first computed control (input) adjustment is then implemented and the entire calculation sequence is repeated at the
758 Model-Predictive Control in Practice

Model-Predictive Control in Practice, Fig. 2 Information flow for MPC algorithm
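The calculation sequence of Fig. 2 can be sketched as a receding-horizon loop for a scalar linear model. The observer gain and the "optimizers" below are deliberately simplistic stand-ins (not the MHE or QP formulations used in practice); only the information flow follows the description in the text:

```python
# Schematic of the Fig. 2 sequence for a hypothetical scalar model
# x(k+1) = a*x(k) + b*u(k), y = x: estimate the state, compute a
# steady-state target, optimize a move plan, implement the first move.

a, b = 0.9, 0.5          # invented model parameters
HORIZON = 10

def estimate(x_hat, y_meas):
    y_pred = x_hat                      # model output prediction (yp)
    e = y_meas - y_pred                 # prediction error
    return x_hat + 0.5 * e              # crude stand-in for a state estimator

def steady_state_target(y_target):
    # At steady state, xs = a*xs + b*us with ys = xs = y_target
    xs = y_target
    us = (1 - a) * xs / b
    return xs, us

def dynamic_optimize(x_hat, xs):
    # Stand-in dynamic optimizer: pick each move so the predicted state
    # lands on the steady-state target (a deadbeat plan, not a real QP).
    u_plan, x = [], x_hat
    for _ in range(HORIZON):
        u = (xs - a * x) / b
        u_plan.append(u)
        x = a * x + b * u
    return u_plan

# Receding-horizon loop: implement only the first computed move, then
# repeat the entire calculation sequence at the next control interval.
x_true, x_hat = 2.0, 0.0
for _ in range(30):
    y_meas = x_true                     # measured output (ym)
    x_hat = estimate(x_hat, y_meas)
    xs, us = steady_state_target(y_target=1.0)
    u = dynamic_optimize(x_hat, xs)[0]
    x_true = a * x_true + b * u         # plant responds to the first move

print(round(x_true, 3))  # settles at the target 1.0
```

Applying only the first move and re-solving at every interval is what gives MPC its feedback character despite the open-loop optimization inside.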

next control interval. The various commercial MPC algorithms differ in such details as the mathematical form of the dynamic model and the specific formulations of the state estimation, steady-state optimization, and dynamic optimization problems (Qin and Badgwell 2003).

In the general case, the MPC algorithm must solve the three optimization problems outlined above at each control interval. For the case of linear models and reasonable tuning parameters, these problems take the form of a convex quadratic program (QP) with a constant, positive-definite Hessian. As such, they can be solved relatively easily using standard optimization codes. For the case of a linear state-space model, the structure can be exploited even further to develop a specialized solution algorithm using an interior point method (Rao et al. 1998).

For the case of nonlinear models, these problems take the form of a nonlinear program (NLP) for which the solution domain is no longer convex, greatly complicating the numerical solution. A typical strategy is to iterate on a linearized version of the problem until convergence (Biegler 2010).

Implementation

The combined experience of thousands of MPC applications in the process industries has led to a near consensus on the steps required for a successful implementation:
• Justification – make the economic case for the application.
• Pre-test – design the control and test sensors and actuators.
• Step-test – generate process response data.
• Modeling – develop model from process response data.
• Configuration – configure the software and test preliminary tuning by simulation.
• Commissioning – turn on and test the controller.
• Post-audit – measure and certify economic performance.
• Sustainment – monitor and maintain the application.

The most expensive of these steps, both in terms of engineering time and lost production, is the generation of process response data through the step test. This is accomplished, in principle, by making significant adjustments to each variable that will be adjusted by the MPC while operating open loop to prevent compensating control action. This will necessarily cause abnormal movement in key operating variables, which may lead to lower throughput and off-spec products. Significant progress has been made in recent years to minimize these difficulties through the use of approximate closed-loop step testing (Darby and Nikolaou 2012).

Once the application has been commissioned, it is critical to set up an aggressive monitoring and sustainment program. MPC application benefits can fall off quickly due to changes in the process operation and as new personnel interact with it. New constraint variables may need to be added and key sections of the model may need
to be updated as time goes on. The mathematical problem of MPC monitoring remains a topic of current academic research (Zagrobelny et al. 2012).

Note that the implementation steps outlined above must be carried out by a carefully selected project team that typically includes, in addition to the MPC expert, an engineer with detailed knowledge of the process and an operator with significant relevant experience.

Summary and Future Directions

Model predictive control is now a mature technology in the process industries. A representative MPC algorithm in this domain includes a state estimator, a steady-state optimizer, and a dynamic optimizer, running once a minute. A successful MPC application usually starts with a careful economic justification, includes significant participation from process engineers and operators, and is maintained with an aggressive sustainment program. Many thousands of such applications are currently operating around the world, generating billions of dollars per year in economic benefits.

Likely future directions for MPC practice include increasing use of nonlinear models, improved state estimation through unmeasured disturbance modeling (Pannocchia and Rawlings 2003), and development of more efficient numerical solution methods (Zavala and Biegler 2009).

Cross-References

▶ Distributed Model Predictive Control
▶ Nominal Model-Predictive Control
▶ Optimization Algorithms for Model Predictive Control
▶ Tracking Model Predictive Control

Recommended Reading

The first descriptions of MPC technology appear in papers by Richalet et al. (1978) and Cutler and Ramaker (1979). A detailed summary of the history of MPC technology development, as well as a survey of commercial offerings through 2003, can be found in the review article by Qin and Badgwell (2003). Darby and Nikolaou present a more recent summary of MPC practice (Darby and Nikolaou 2012). Textbook descriptions of MPC theory and design, suitable for classroom use, include Rawlings and Mayne (2009) and Maciejowski (2002). The book by Ljung (1999) provides a good summary of methods for identifying dynamic models from test data. Theoretical properties of MPC are analyzed in a highly cited paper by Mayne and coworkers (2000). Guidelines for designing disturbance models so as to achieve offset-free control can be found in Pannocchia and Rawlings (2003). Numerical solution strategies for the nonlinear programs found in MPC are discussed in the book by Biegler (2010). An efficient interior-point method for solving the linear MPC dynamic optimization is described in Rao et al. (1998). A promising algorithm for solving the nonlinear MPC dynamic optimization is outlined in Zavala and Biegler (2009). A data-based method for tuning Kalman filters, which are often used for MPC state estimation, is described in Odelson et al. (2006). A new method for monitoring the performance of MPC is summarized in Zagrobelny et al. (2012). A readable summary of refining operations can be found in Gary et al. (2007).

Bibliography

Biegler LT (2010) Nonlinear programming: concepts, algorithms, and applications to chemical processes. SIAM, Philadelphia
Cutler CR, Ramaker BL (1979) Dynamic matrix control – a computer control algorithm. Paper presented at the AIChE national meeting, Houston, April 1979
Darby ML, Nikolaou M (2012) MPC: current practice and challenges. Control Eng Pract 20:328–342
Gary JH, Handwerk GE, Kaiser MJ (2007) Petroleum refining: technology and economics. CRC, New York
Ljung L (1999) System identification: theory for the user. Prentice Hall, Upper Saddle River
Maciejowski JM (2002) Predictive control with constraints. Pearson Education Limited, Essex
Mayne DQ, Rawlings JB, Rao CV, Scokaert POM (2000) Constrained model predictive control: stability and optimality. Automatica 36:789–814
Odelson BJ, Rajamani MR, Rawlings JB (2006) A new autocovariance least-squares method for estimating noise covariances. Automatica 42:303–308
Pannocchia G, Rawlings JB (2003) Disturbance models for offset-free model predictive control. AIChE J 49:426–437
Qin SJ, Badgwell TA (2003) A survey of industrial model predictive control technology. Control Eng Pract 11:733–764
Rao CV, Wright SJ, Rawlings JB (1998) Application of interior-point methods to model predictive control. J Optim Theory Appl 99:723–757
Rawlings JB, Mayne DQ (2009) Model predictive control: theory and design. Nob Hill Publishing, Madison
Richalet J, Rault A, Testud JL, Papon J (1978) Model predictive heuristic control: applications to industrial processes. Automatica 14:413–428
Zagrobelny M, Luo J, Rawlings JB (2012) Quis custodiet ipsos custodes? In: IFAC conference on nonlinear model predictive control 2012, Noordwijkerhout, Aug 2012
Zavala VM, Biegler LT (2009) The advanced-step NMPC controller: optimality, stability, and robustness. Automatica 45:86–93

Models for Discrete Event Systems: An Overview

Christos G. Cassandras
Division of Systems Engineering, Center for Information and Systems Engineering, Boston University, Brookline, MA, USA

Synonyms

DES

Abstract

This article provides an introduction to discrete event systems (DES) as a class of dynamic systems with characteristics significantly distinguishing them from traditional time-driven systems. It also overviews the main modeling frameworks used to formally describe the operation of DES and to study problems related to their control and optimization.

Keywords

Automata; Dioid algebras; Event-driven systems; Hybrid systems; Petri nets; Time-driven systems

Introduction

Discrete event systems (DES) form an important class of dynamic systems. The term was introduced in the early 1980s to describe a DES in terms of its most critical feature: the fact that its behavior is governed by discrete events which occur asynchronously over time and which are solely responsible for generating state transitions. In between event occurrences, the state of a DES is unaffected. Examples of such behavior abound in technological environments, including computer and communication networks, manufacturing systems, transportation systems, logistics, and so forth. The operation of a DES is largely regulated by rules which are often unstructured and frequently human-made, as in initiating or terminating activities and scheduling the use of resources through controlled events (e.g., turning equipment "on"). On the other hand, their operation is also subject to uncontrolled, randomly occurring events (e.g., a spontaneous equipment failure) which may or may not be observable through sensors. It is worth pointing out that the term "discrete event dynamic system" (DEDS) is also commonly used to emphasize the importance of the dynamical behavior of such systems (Cassandras and Lafortune 2008; Ho 1991).

There are two aspects of a DES that define its behavior:
1. The variables involved are both continuous and discrete, sometimes purely symbolic, i.e., nonnumeric (e.g., describing the state of a traffic light as "red" or "green"). This renders traditional mathematical models based on differential (or difference) equations inadequate and related methods based on calculus of limited use.
2. Because of the asynchronous nature of events that cause state transitions in a DES, it is neither natural nor efficient to use time as a synchronizing element driving its dynamics.
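A minimal sketch of these two aspects, with purely symbolic (nonnumeric) states that change only when events occur, never merely because time passes; the states and event names below are invented for illustration:

```python
# Minimal event-driven model: the state is symbolic ("red"/"green"),
# and it changes only at event occurrences. Between events the state
# is unaffected. All names here are illustrative.

transitions = {
    ("red", "turn_green"): "green",
    ("green", "turn_red"): "red",
}

def step(state, event):
    """Next state under `event`; events not feasible in `state` leave it unchanged."""
    return transitions.get((state, event), state)

state = "red"
history = [state]
for event in ["turn_green", "pedestrian_button", "turn_red"]:
    state = step(state, event)
    history.append(state)

print(history)  # ['red', 'green', 'green', 'red']
```

Note that no differential or difference equation in time appears anywhere: the event sequence alone drives the state.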
It is for this reason that DES are often referred to as event driven, to contrast them to classical time-driven systems based on the laws of physics; in the latter, as time evolves, state variables such as position, velocity, temperature, voltage, etc., also continuously evolve. In order to capture event-driven state dynamics, however, different mathematical models are necessary.

In addition, uncertainties are inherent in the technological environments where DES are encountered. Therefore, associated mathematical models and methods for analysis and control must incorporate such uncertainties. Finally, complexity is also inherent in DES of practical interest, usually manifesting itself in the form of combinatorially explosive state spaces. Although purely analytical methods for DES design, analysis, and control are limited, they have still enabled reliable approximations of their dynamic behavior and the derivation of useful structural properties and provable performance guarantees. Much of the progress made in this field, however, has relied on new paradigms characterized by a combination of mathematical techniques, computer-based tools, and effective processing of experimental data.

Event-driven and time-driven system components are often viewed as coexisting and interacting and are referred to as hybrid systems (separately considered in the Encyclopedia, including the article ▶ Discrete Event Systems and Hybrid Systems, Connections Between). Arguably, most contemporary technological systems are combinations of time-driven components (typically, the physical parts of a system) and event-driven components (usually, the computer-based controllers that collect data from and issue commands to the physical parts).

Event-Driven vs. Time-Driven Systems

In order to explain the difference between time-driven and event-driven behavior, we begin with the concept of "event." An event should be thought of as occurring instantaneously and causing transitions from one system state value to another. It may be identified with an action (e.g., pressing a button), a spontaneous natural occurrence (e.g., a random equipment failure), or the result of conditions met by the system state (e.g., the fluid level in a tank exceeds a given value). For the purpose of developing a model for DES, we will use the symbol e to denote an event. Since a system is generally affected by different types of events, we assume that we can define a discrete event set E with e ∈ E.

In a classical system model, the "clock" is what drives a typical state trajectory: with every "clock tick" (which may be thought of as an "event"), the state is expected to change, since continuous state variables continuously change with time. This leads to the term time driven. In the case of time-driven systems described by continuous variables, the field of systems and control has based much of its success on the use of well-known differential-equation-based models, such as

ẋ(t) = f(x(t), u(t), t),   x(t0) = x0   (1)
y(t) = g(x(t), u(t), t),   (2)

where (1) is a (vector) state equation with initial conditions specified and (2) is a (vector) output equation. As is common in system theory, x(t) denotes the state of the system, y(t) is the output, and u(t) represents the input, often associated with controllable variables used to manipulate the state so as to attain a desired output. Common physical quantities such as position, velocity, temperature, pressure, flow, etc., define state variables in (1). The state generally changes as time changes, and, as a result, the time variable t (or some integer k = 0, 1, 2, ... in discrete time) is a natural independent variable for modeling such systems.

In contrast, in a DES, time no longer serves the purpose of driving such a system and may no longer be an appropriate independent variable. Instead, at least some of the state variables are discrete, and their values change only at certain points in time through instantaneous transitions which we associate with "events." If a clock is used, consider two possibilities: (i) At every clock tick, an event e is selected from the event set E (if no event takes place, we use a "null event" as a member of E such that it causes no state change), and (ii) at various time instants (not necessarily known in advance or coinciding with clock ticks), some event e "announces" that it is occurring. Observe that in (i) state transitions are synchronized by the clock, which is solely responsible for any possible state transition. In (ii), every event e ∈ E defines a distinct process through which the time instants when e occurs are determined. State transitions are the result of combining these asynchronous concurrent event processes. Moreover, these processes need not be independent of each other. The distinction between (i) and (ii) gives rise to the terms time-driven and event-driven systems, respectively.

Comparing state trajectories of time-driven and event-driven systems is useful in understanding the differences between the two and setting the stage for DES modeling frameworks. Thus, in Fig. 1, we observe the following: (i) For the time-driven system shown, the state space X is the set of real numbers R, and x(t) can take any value from this set. The function x(t) is the solution of a differential equation of the general form ẋ(t) = f(x(t), u(t), t), where u(t) is the input. (ii) For the event-driven system, the state space is some discrete set X = {s1, s2, s3, s4}. The sample path can only jump from one state to another whenever an event occurs. Note that an event may take place, but not cause a state transition, as in the case of e4. There is no immediately obvious analog to ẋ(t) = f(x(t), u(t), t), i.e., no mechanism to specify how events might interact over time or how their time of occurrence might be determined. Thus, a large part of the early developments in the DES field has been devoted to the specification of an appropriate mathematical model containing the same expressive power as (1)–(2) (Baccelli et al. 1992; Cassandras and Lafortune 2008; Glasserman and Yao 1994).

Models for Discrete Event Systems: An Overview, Fig. 1 Comparison of time-driven and event-driven state trajectories

We should point out that a time-driven system with continuous state variables, usually modeled through (1)–(2), may be abstracted as a DES through some form of discretization in time and quantization in the state space. We should also point out that discrete event systems should not be confused with discrete time systems. The class of discrete time systems contains both time-driven and event-driven systems.

Timed and Untimed Models of Discrete Event Systems

Returning to Fig. 1, instead of constructing the piecewise constant function x(t) as shown, it is convenient to simply write the timed sequence of events {(e1, t1), (e2, t2), (e3, t3), (e4, t4), (e5, t5)}, which contains the same information as the state trajectory. Assuming that the initial state of the system (s2 in this case) is known and that the system is "deterministic" in the sense that the next state after the occurrence of an event is unique, we can recover the state of the system at any point in time and reconstruct the DES state trajectory. The set of all possible timed sequences of events that a given system can ever execute is called the timed language model of the system. The word "language" comes from the fact that we can think of the event set E as an "alphabet" and of (finite) sequences of events as "words" (Hopcroft and Ullman 1979). We can further refine such a model by adding statistical information regarding the set of state trajectories (sample paths) of the system. Let us assume that probability distribution functions are available about the "lifetime" of each event type e ∈ E, that is, the elapsed time between successive occurrences of this particular e. A stochastic timed language is a timed language together with associated probability distribution functions for the events.

Stochastic timed language modeling is the most detailed in the sense that it contains event information in the form of event occurrences and their orderings, information about the exact times at which the events occur (not only their relative ordering), and statistical information about successive occurrences of events. If we delete the timing information from a timed language, we obtain an untimed language, or simply language, which is the set of all possible orderings of events that could happen in the given system. For example, the untimed sequence corresponding to the timed sequence of events in Fig. 1 is {e1, e2, e3, e4, e5}.

Untimed and timed languages represent different levels of abstraction at which DES are modeled and studied. The choice of the appropriate level of abstraction clearly depends on the objectives of the analysis. In many instances, we are interested in the "logical behavior" of the system, that is, in ensuring that all the event sequences it can generate satisfy a given set of specifications, e.g., maintaining a precise ordering of events. In this context, the actual timing of events is not required, and it is sufficient to model only the untimed behavior of the system. Supervisory control, discussed in the article ▶ Supervisory Control of Discrete-Event Systems, is the term established for describing the systematic means (i.e., enabling or disabling events which are controllable) by which the logical behavior of a DES is regulated to achieve a given specification (Cassandras and Lafortune 2008; Moody and Antsaklis 1998; Ramadge and Wonham 1987).

On the other hand, we may be interested in event timing in order to answer questions such as the following: "How much time does the system spend at a particular state?" or "Can this sequence of events be completed by a particular deadline?" More generally, event timing is important in assessing the performance of a DES, often measured through quantities such as throughput or response time. In these instances, we need to consider the timed language model of the system. Since DES frequently operate in a stochastic setting, an additional level of complexity is introduced, necessitating the development of probabilistic models and related analytical methodologies for design and performance analysis based on stochastic timed language models. Sample path analysis and perturbation analysis, discussed in the entry ▶ Perturbation Analysis of Discrete Event Systems, refer to the study of sample paths of DES, focusing on the extraction of information for the purpose of efficiently estimating performance sensitivities of the system and, ultimately, achieving online control and optimization (Cassandras and Lafortune 2008; Glasserman 1991; Ho and Cao 1991; Ho and Cassandras 1983).

These different levels of abstraction are complementary, as they address different issues about the behavior of a DES. Although the language-based approach to DES modeling is attractive, it is by itself not convenient to address verification, controller synthesis, or performance issues. This motivates the development of discrete event modeling formalisms which represent languages in a manner that highlights structural information about the system behavior and can be used to address analysis and controller synthesis issues. Next, we provide an overview of three major modeling formalisms which are used by most (but not all) system and control theoretic methodologies pertaining to DES. Additional modeling formalisms encountered in the computer science literature include process algebras (Baeten and Weijland 1990) and communicating sequential processes (Hoare 1985).
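The relation between a timed sequence, the reconstructed state trajectory, and the untimed "word" can be sketched as follows. The state set and transition function below are hypothetical (the text does not specify Fig. 1's actual transitions), chosen only so the example runs:

```python
# Replaying a timed sequence {(e1,t1),...,(e5,t5)} through a deterministic
# transition function reconstructs the piecewise-constant state trajectory;
# deleting the times yields the untimed sequence. Transitions are invented.

f = {
    ("s2", "e1"): "s3",
    ("s3", "e2"): "s1",
    ("s1", "e3"): "s4",
    ("s4", "e4"): "s4",   # an event may occur without changing the state
    ("s4", "e5"): "s2",
}

timed_sequence = [("e1", 1.0), ("e2", 2.3), ("e3", 3.1), ("e4", 4.0), ("e5", 5.6)]

def trajectory(x0, timed_seq):
    """List of (time, state) pairs: the state holds constant between events."""
    traj = [(0.0, x0)]
    x = x0
    for e, t in timed_seq:
        x = f[(x, e)]          # deterministic: the next state is unique
        traj.append((t, x))
    return traj

untimed = [e for e, _ in timed_sequence]   # the untimed "word"
print(untimed)                             # ['e1', 'e2', 'e3', 'e4', 'e5']
print(trajectory("s2", timed_sequence))
```

Attaching probability distributions to the interevent times of each event type would refine this to the stochastic timed level of abstraction discussed above.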
Automata

A deterministic automaton, denoted by G, is a six-tuple

G = (X, E, f, Γ, x0, Xm),

where X is the set of states, E is the finite set of events associated with the transitions in G, and f : X × E → X is the transition function; specifically, f(x, e) = y means that there is a transition labeled by event e from state x to state y and, in general, f is a partial function on its domain. Γ : X → 2^E is the active event function (or feasible event function); Γ(x) is the set of all events e for which f(x, e) is defined and is called the active event set (or feasible event set) of G at x. Finally, x0 is the initial state and Xm ⊆ X is the set of marked states. The terms state machine and generator (which explains the notation G) are also used to describe the above object. Moreover, if X is a finite set, we call G a deterministic finite-state automaton. A nondeterministic automaton is defined by means of a relation over X × E × X or, equivalently, a function from X × E to 2^X.

The automaton G operates as follows. It starts in the initial state x0, and upon the occurrence of an event e ∈ Γ(x0) ⊆ E, it makes a transition to state f(x0, e) ∈ X. This process then continues based on the transitions for which f is defined. Note that an event may occur without changing the state, i.e., f(x, e) = x. It is also possible that two distinct events occur at a given state causing the exact same transition, i.e., for a, b ∈ E, f(x, a) = f(x, b) = y. What is interesting about the latter fact is that we may not be able to distinguish between events a and b by simply observing a transition from state x to state y.

For the sake of convenience, f is always extended from domain X × E to domain X × E*, where E* is the set of all finite strings of elements of E, including the empty string (denoted by ε); the * operation is called the Kleene closure. This is accomplished in the following recursive manner: f(x, ε) := x and f(x, se) := f(f(x, s), e) for s ∈ E* and e ∈ E. The (untimed) language generated by G, denoted by L(G), is the set of all strings in E* for which the extended function f is defined. The automaton model above is one instance of what is referred to as a generalized semi-Markov scheme (GSMS) in the literature of stochastic processes. A GSMS is viewed as the basis for extending automata to incorporate an event timing structure as well as nondeterministic state transition mechanisms, ultimately leading to stochastic timed automata, discussed in the sequel.

Let us allow for generally countable sets X and E and leave out of the definition any consideration for marked states. Thus, we begin with an automaton model (X, E, f, Γ, x0). We extend the modeling setting to timed automata by incorporating a "clock structure" associated with the event set E, which now becomes the input from which a specific event sequence can be deduced. The clock structure (or timing structure) associated with an event set E is a set V = {vi : i ∈ E} of clock (or lifetime) sequences

vi = {vi,1, vi,2, ...},   i ∈ E,   vi,k ∈ R+,   k = 1, 2, ...

Timed Automaton. A timed automaton is defined as a six-tuple

(X, E, f, Γ, x0, V),

where V = {vi : i ∈ E} is a clock structure and (X, E, f, Γ, x0) is an automaton. The automaton generates a state sequence x′ = f(x, e′) driven by an event sequence {e1, e2, ...} generated through

e′ = arg min {yi : i ∈ Γ(x)}   (3)

with the clock values yi, i ∈ E, defined by

yi′ = yi − y*    if i ≠ e′ and i ∈ Γ(x)
yi′ = vi,Ni+1    if i = e′ or i ∉ Γ(x)
for i ∈ Γ(x′),   (4)

where the interevent time y* is defined as

y* = min {yi : i ∈ Γ(x)}.   (5)
Models for Discrete Event Systems: An Overview 765

and the event scores Ni , i 2 E, are defined by an event set E is a set of distribution functions
F D fFi W i 2 Eg characterizing the stochastic
clock sequences
Ni C 1 if i D e 0 or i … .x/
Ni0 D i 2 .x 0 /:
Ni otherwise fVi;k g D fVi;1 ; Vi;2 ; : : :g; i 2 E;
(6)
Vi;k 2 RC ; k D 1; 2; : : :
In addition, initial conditions are yi D vi;1 and
Ni D 1 for all i 2 .x0 /. If i … .x0 /, then yi
It is usually assumed that each clock sequence
is undefined and Ni D 0.
consists of random variables which are inde-
A simple interpretation of this elaborate def-
pendent and identically distributed (iid) and that
inition is as follows. Given that the system is at
all clock sequences are mutually independent.
some state x, the next event e 0 is the one with
Thus, each fVi;k g is completely characterized by
the smallest clock value among all feasible events
a distribution function Fi .t/ D P ŒVi  t. There
i 2 .x/. The corresponding clock value, y  ,
are, however, several ways in which a clock struc-
is the interevent time between the occurrence of
ture can be extended to include situations where
e and e 0 , and it provides the amount by which
elements of a sequence fVi;k g are correlated or
the time, t, moves forward: t 0 D t C y  . Clock
two clock sequences are interdependent. As for
values for all events that remain active in state x 0
state transitions which may be nondeterministic
are decremented by y  , except for the triggering
in nature, such behavior is modeled through state
event e 0 and all newly activated events, which are
transition probabilities as explained next.
assigned a new lifetime vi;Ni C1 . Event scores are
incremented whenever a new lifetime is assigned
Stochastic Timed Automaton. We can extend
to them. It is important to note that the “system
the definition of a timed automaton by viewing
clock” t is fully controlled by the occurrence of M
the state, event, and all event scores and clock
events, which cause it to move forward; if no
values as random variables denoted respectively
event occurs, the system remains at the last state
by capital letters X , E, Ni , and Yi , i 2 E. Thus,
observed.
a stochastic timed automaton is a six-tuple
Comparing x 0 D f .x; e 0 / to the state equa-
tion (1) for time-driven systems, we see that
the former can be viewed as the event-driven .X ; E; ; p; p0 ; F /;
analog of the latter. However, the simplicity of
x 0 D f .x; e 0 / is deceptive: unless an event se- where X is a countable state space; E is a count-
quence is given, determining the triggering event able event set; .x/ is the active event set (or fea-
e 0 which is required to obtain the next state x 0 sible event set); p.x 0 I x; e 0 / is a state transition
involves the combination of (3)–(6). Therefore, probability defined for all x; x 0 2 X , e 0 2 E and
the analog of (1) as a “canonical” state equation such that p.x 0 I x; e 0 / D 0 for all e 0 … .x/; p0 is
for a DES requires all Eqs. (3)–(6). Observe that the probability mass function P ŒX0 D x, x 2 X ,
this timed automaton generates a timed language, of the initial state X0 ; and F is a stochastic clock
thus extending the untimed language generated structure. The automaton generates a stochastic
by the original automaton G. state sequence fX0 ; X1 ; : : :g through a transition
In the definition above, the clock structure V mechanism (based on observations X D x, E 0 D
is assumed to be fully specified in a deterministic e 0 ):
sense and so are state transitions dictated by
x 0 D f .x; e 0 /. The sequences fvi g, i 2 E, can X 0 D x 0 with probability p.x 0 I x; e 0 / (7)
be extended to be specified only as stochastic
sequences through distribution functions denoted and it is driven by a stochastic event sequence
by Fi , i 2 E. Thus, the stochastic clock structure fE1 ; E2 ; : : :g generated exactly as in (3)–(6) with
(or stochastic timing structure) associated with random variables E, Yi , and Ni , i 2 E, instead
766 Models for Discrete Event Systems: An Overview

of deterministic quantities and with {V_{i,k}} ∼ F_i (∼ denotes "with distribution"). In addition, initial conditions are X₀ ∼ p₀(x), Y_i = V_{i,1}, and N_i = 1 if i ∈ Γ(X₀). If i ∉ Γ(X₀), then Y_i is undefined and N_i = 0.

It is conceivable for two events to occur at the same time, in which case we need a priority scheme to overcome a possible ambiguity in the selection of the triggering event in (3). In practice, it is common to expect that every F_i in the clock structure is absolutely continuous over [0, ∞) (so that its density function exists) and has a finite mean. This implies that two events can occur at exactly the same time only with probability 0.

A stochastic process {X(t)} with state space X which is generated by a stochastic timed automaton (X, E, Γ, p, p₀, F) is referred to as a generalized semi-Markov process (GSMP). This process is used as the basis of much of the sample path analysis methods for DES (see Cassandras and Lafortune 2008; Glasserman 1991; Ho and Cao 1991).

Petri Nets

An alternative modeling formalism for DES is provided by Petri nets, originating in the work of C. A. Petri in the early 1960s. Like an automaton, a Petri net (Peterson 1981) is a device that manipulates events according to certain rules. One of its features is the inclusion of explicit conditions under which an event can be enabled. The Petri net modeling framework is the subject of the article ▶ Modeling, Analysis, and Control with Petri Nets. Thus, we limit ourselves here to a brief introduction. First, we define a Petri net graph, also called the Petri net structure. Then, we adjoin to this graph an initial state, a set of marked states, and a transition labeling function, resulting in the complete Petri net model, its associated dynamics, and the languages that it generates and marks.

Petri Net Graph. A Petri net is a directed bipartite graph with two types of nodes, places and transitions, and arcs connecting them. Events are associated with transition nodes. In order for a transition to occur, several conditions may have to be satisfied. Information related to these conditions is contained in place nodes. Some such places are viewed as the "input" to a transition; they are associated with the conditions required for this transition to occur. Other places are viewed as the output of a transition; they are associated with conditions that are affected by the occurrence of this transition. A Petri net graph is formally defined as a weighted directed bipartite graph (P, T, A, w), where P is the finite set of places (one type of node in the graph), T is the finite set of transitions (the other type of node in the graph), A ⊆ (P × T) ∪ (T × P) is the set of arcs with directions from places to transitions and from transitions to places in the graph, and w : A → {1, 2, 3, …} is the weight function on the arcs. Let P = {p₁, p₂, …, pₙ} and T = {t₁, t₂, …, tₘ}. It is convenient to use I(t_j) to represent the set of input places to transition t_j. Similarly, O(t_j) represents the set of output places from transition t_j. Thus, we have I(t_j) = {p_i ∈ P : (p_i, t_j) ∈ A} and O(t_j) = {p_i ∈ P : (t_j, p_i) ∈ A}.

Petri Net Dynamics. Tokens are assigned to places in a Petri net graph in order to indicate the fact that the condition described by that place is satisfied. The way in which tokens are assigned to a Petri net graph defines a marking. Formally, a marking x of a Petri net graph (P, T, A, w) is a function x : P → ℕ = {0, 1, 2, …}. Marking x defines the row vector x = [x(p₁), x(p₂), …, x(pₙ)], where n is the number of places in the Petri net. The ith entry of this vector indicates the (nonnegative integer) number of tokens in place p_i, x(p_i) ∈ ℕ. In Petri net graphs, a token is indicated by a dark dot positioned in the appropriate place. The state of a Petri net is defined to be its marking vector x. The state transition mechanism of a Petri net is captured by the structure of its graph and by "moving" tokens from one place to another. A transition t_j ∈ T in a Petri net is said to be enabled if

x(p_i) ≥ w(p_i, t_j) for all p_i ∈ I(t_j).   (8)
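The enabling condition (8), together with the firing rule (9) that follows in this entry, can be made concrete in a short sketch. This is a minimal illustration rather than a full implementation; the class and attribute names are our own:

```python
class PetriNet:
    """Minimal marked Petri net sketch (illustrative names, not from the text)."""

    def __init__(self, w_in, w_out, marking):
        # w_in[(p, t)]  : arc weight w(p, t) from place p to transition t
        # w_out[(t, p)] : arc weight w(t, p) from transition t to place p
        self.w_in, self.w_out, self.x = dict(w_in), dict(w_out), dict(marking)

    def enabled(self, t):
        # Condition (8): every input place of t holds at least w(p, t) tokens.
        return all(self.x[p] >= w for (p, tt), w in self.w_in.items() if tt == t)

    def fire(self, t):
        # Firing rule (9): x'(p) = x(p) - w(p, t) + w(t, p).
        if not self.enabled(t):
            raise ValueError("transition not enabled")
        for (p, tt), w in self.w_in.items():
            if tt == t:
                self.x[p] -= w
        for (tt, p), w in self.w_out.items():
            if tt == t:
                self.x[p] += w
```

For a net with places p1, p2 and a single transition t1 with w(p1, t1) = 2 and w(t1, p2) = 1, firing t1 from the marking [3, 0] yields [1, 1], after which t1 is no longer enabled.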
In words, transition t_j in the Petri net is enabled when the number of tokens in p_i is at least as large as the weight of the arc connecting p_i to t_j, for all places p_i that are input to transition t_j. When a transition is enabled, it can occur or fire. The state transition function of a Petri net is defined through the change in the state of the Petri net due to the firing of an enabled transition. The state transition function, f : ℕⁿ × T → ℕⁿ, of Petri net (P, T, A, w, x) is defined for transition t_j ∈ T if and only if (8) holds. Then, we set x′ = f(x, t_j), where

x′(p_i) = x(p_i) − w(p_i, t_j) + w(t_j, p_i),   i = 1, …, n.   (9)

An "enabled transition" is therefore equivalent to a "feasible event" in an automaton. But whereas in automata the state transition function enumerates all feasible state transitions, here the state transition function is based on the structure of the Petri net. Thus, the next state defined by (9) explicitly depends on the input and output places of a transition and on the weights of the arcs connecting these places to the transition. According to (9), if p_i is an input place of t_j, it loses as many tokens as the weight of the arc from p_i to t_j; if it is an output place of t_j, it gains as many tokens as the weight of the arc from t_j to p_i. Clearly, it is possible that p_i is both an input and output place of t_j.

In general, it is entirely possible that, after several transition firings, the resulting state is x = [0, …, 0] or that the number of tokens in one or more places grows arbitrarily large after an arbitrarily large number of transition firings. The latter phenomenon is a key difference with automata, where finite-state automata have only a finite number of states, by definition. In contrast, a finite Petri net graph may result in a Petri net with an unbounded number of states. It should be noted that a finite-state automaton can always be represented as a Petri net; on the other hand, not all Petri nets can be represented as finite-state automata.

Similar to timed automata, we can define timed Petri nets by introducing a clock structure, except that now a clock sequence v_j is associated with a transition t_j. A positive real number, v_{j,k}, assigned to t_j has the following meaning: when transition t_j is enabled for the kth time, it does not fire immediately, but incurs a firing delay given by v_{j,k}; during this delay, tokens are kept in the input places of t_j. Not all transitions are required to have firing delays. Thus, we partition T into subsets T_0 and T_D, such that T = T_0 ∪ T_D. T_0 is the set of transitions always incurring zero firing delay, and T_D is the set of transitions that generally incur some firing delay. The latter are called timed transitions. The clock structure (or timing structure) associated with a set of timed transitions T_D ⊆ T of a marked Petri net (P, T, A, w, x) is a set V = {v_j : t_j ∈ T_D} of clock (or lifetime) sequences v_j = {v_{j,1}, v_{j,2}, …}, t_j ∈ T_D, v_{j,k} ∈ ℝ⁺, k = 1, 2, … A timed Petri net is a six-tuple (P, T, A, w, x, V), where (P, T, A, w, x) is a marked Petri net and V = {v_j : t_j ∈ T_D} is a clock structure. It is worth mentioning that this general structure allows for a variety of behaviors in a timed Petri net, including the possibility of multiple transitions being enabled at the same time or an enabled transition being preempted by the firing of another, depending on the values of the associated firing delays. The need to analyze and control such behavior in DES has motivated the development of a considerable body of analysis techniques for Petri net models which have been proven to be particularly suitable for this purpose (Moody and Antsaklis 1998; Peterson 1981).

Dioid Algebras

Another modeling framework is based on developing an algebra using two operations: min{a, b} (or max{a, b}) for any real numbers a and b and addition (a + b). The motivation comes from the observation that the operations "min" and "+" are the only ones required to develop the timed automaton model. Similarly, the operations "max" and "+" are the only ones used in developing the timed Petri net models described above. The operations are formally named addition and multiplication and denoted by ⊕ and ⊗,
respectively. However, their actual meaning (in terms of regular algebra) is different. For any two real numbers a and b, we define

Addition: a ⊕ b ≜ max{a, b}   (10)
Multiplication: a ⊗ b ≜ a + b.   (11)

This dioid algebra is also called a (max, +) algebra (Baccelli et al. 1992; Cuninghame-Green 1979). If we consider a standard linear discrete-time system, its state equation is of the form

x(k + 1) = Ax(k) + Bu(k),

which involves (regular) multiplication (·) and addition (+). It turns out that we can use a (max, +) algebra with DES, replacing the (+, ·) algebra of conventional time-driven systems, and come up with a representation similar to the one above, thus paralleling to a considerable extent the analysis of classical time-driven linear systems. We should emphasize, however, that this particular representation is only possible for a subset of DES. Moreover, while conceptually this offers an attractive way to capture the event timing dynamics in a DES, from a computational standpoint, one still has to confront the complexity of performing the "max" operation when numerical information is ultimately needed to analyze the system or to design controllers for its proper operation.

Control and Optimization of Discrete Event Systems

The various control and optimization methodologies developed to date for DES depend on the modeling level appropriate for the problem of interest.

Logical Behavior. Issues such as ordering events according to some specification or ensuring the reachability of a particular state are normally addressed through the use of automata and Petri nets (Chen and Lafortune 1991; Moody and Antsaklis 1998; Ramadge and Wonham 1987). Supervisory control theory provides a systematic framework for formulating and solving problems of this type; a comprehensive coverage can be found in Cassandras and Lafortune (2008). Logical behavior issues are also encountered in the diagnosis of partially observed DES, a topic covered in the article ▶ Diagnosis of Discrete Event Systems.

Event Timing. When timing issues are introduced, timed automata and timed Petri nets are invoked for modeling purposes. Supervisory control in this case becomes significantly more complicated. An important class of problems, however, does not involve the ordering of individual events, but rather the requirement that selected events occur within a given "time window" or with some desired periodic characteristics. Models based on the algebraic structure of timed Petri nets or the (max, +) algebra provide convenient settings for formulating and solving such problems (Baccelli et al. 1992; Glasserman and Yao 1994).

Performance Analysis. As in classical control theory, one can define a performance (or cost) function as a means for quantifying system behavior. This approach is particularly crucial in the study of stochastic DES. Because of the complexity of DES dynamics, analytical expressions for such performance metrics in terms of controllable variables are seldom available. This has motivated the use of simulation and, more generally, the study of DES sample paths; these have proven to contain a surprising wealth of information for control purposes. The theory of perturbation analysis presented in the article ▶ Perturbation Analysis of Discrete Event Systems has provided a systematic way of estimating performance sensitivities with respect to system parameters (Cassandras and Lafortune 2008; Cassandras and Panayiotou 1999; Glasserman 1991; Ho and Cao 1991).

Discrete Event Simulation. Because of the aforementioned complexity of DES dynamics, simulation becomes an essential part of DES performance analysis (Law and Kelton 1991). Discrete event simulation can be defined as a
systematic way of generating sample paths of a DES by means of a timed automaton or its stochastic counterpart. The same process can be carried out using a Petri net model or one based on the dioid algebra setting.

Optimization. Optimization problems can be formulated in the context of both untimed and timed models of DES. Moreover, such problems can be formulated in both a deterministic and a stochastic setting. In the latter case, the ability to efficiently estimate performance sensitivities with respect to controllable system parameters provides a powerful tool for stochastic gradient-based optimization (when one can define derivatives) (Vázquez-Abad et al. 1998).

A treatment of all such problems from an application-oriented standpoint, along with further details on the use of the modeling frameworks discussed in this entry, can be found in the article ▶ Applications of Discrete-Event Systems.

Cross-References

▶ Applications of Discrete-Event Systems
▶ Diagnosis of Discrete Event Systems
▶ Discrete Event Systems and Hybrid Systems, Connections Between
▶ Modeling, Analysis, and Control with Petri Nets
▶ Perturbation Analysis of Discrete Event Systems
▶ Perturbation Analysis of Steady-State Performance and Sensitivity-Based Optimization
▶ Supervisory Control of Discrete-Event Systems

Bibliography

Baccelli F, Cohen G, Olsder GJ, Quadrat JP (1992) Synchronization and linearity. Wiley, Chichester/New York
Baeten JCM, Weijland WP (1990) Process algebra. Volume 18 of Cambridge tracts in theoretical computer science. Cambridge University Press, Cambridge/New York
Cassandras CG, Lafortune S (2008) Introduction to discrete event systems, 2nd edn. Springer, New York
Cassandras CG, Panayiotou CG (1999) Concurrent sample path analysis of discrete event systems. J Discret Event Dyn Syst Theory Appl 9:171–195
Chen E, Lafortune S (1991) Dealing with blocking in supervisory control of discrete event systems. IEEE Trans Autom Control AC-36(6):724–735
Cuninghame-Green RA (1979) Minimax algebra. Number 166 in lecture notes in economics and mathematical systems. Springer, Berlin/New York
Glasserman P (1991) Gradient estimation via perturbation analysis. Kluwer Academic, Boston
Glasserman P, Yao DD (1994) Monotone structure in discrete-event systems. Wiley, New York
Ho YC (ed) (1991) Discrete event dynamic systems: analyzing complexity and performance in the modern world. IEEE, New York
Ho YC, Cao X (1991) Perturbation analysis of discrete event dynamic systems. Kluwer Academic, Dordrecht
Ho YC, Cassandras CG (1983) A new approach to the analysis of discrete event dynamic systems. Automatica 19:149–167
Hoare CAR (1985) Communicating sequential processes. Prentice-Hall, Englewood Cliffs
Hopcroft JE, Ullman J (1979) Introduction to automata theory, languages, and computation. Addison-Wesley, Reading
Law AM, Kelton WD (1991) Simulation modeling and analysis. McGraw-Hill, New York
Moody JO, Antsaklis P (1998) Supervisory control of discrete event systems using Petri nets. Kluwer Academic, Boston
Peterson JL (1981) Petri net theory and the modeling of systems. Prentice Hall, Englewood Cliffs
Ramadge PJ, Wonham WM (1987) Supervisory control of a class of discrete event processes. SIAM J Control Optim 25(1):206–230
Vázquez-Abad FJ, Cassandras CG, Julka V (1998) Centralized and decentralized asynchronous optimization of stochastic discrete event systems. IEEE Trans Autom Control 43(5):631–655

Monotone Systems in Biology

David Angeli
Department of Electrical and Electronic Engineering, Imperial College London, London, UK
Dipartimento di Ingegneria dell'Informazione, University of Florence, Italy

Abstract

Mathematical models arising in biology might sometimes exhibit the remarkable feature of preserving ordering of their solutions with
respect to initial data: in words, the "more" of x (the state variable) at time 0, the more of it at all subsequent times. Similar monotonicity properties are possibly exhibited also with respect to input levels. When this is the case, important features of the system's dynamics can be inferred on the basis of purely qualitative or relatively basic quantitative knowledge of the system's characteristics. We will discuss how monotonicity-related tools can be used to analyze and design biological systems with prescribed dynamical behaviors such as global stability, multistability, or periodic oscillations.

Keywords

Feedback interconnections; Monotone dynamics; Monotonicity checks

Introduction

Ordinary differential equations of a scalar unknown, under suitable assumptions for unicity of solutions, trivially enjoy the property that any pair of ordered initial conditions (according to the standard ≤ order defined for real numbers) gives rise to ordered solutions at all positive times (as well as negative, though this is less relevant for the developments that follow). Monotone systems are a special but significant class of dynamical models, possibly evolving in high-dimensional or even infinite-dimensional state spaces, that are nevertheless characterized by a similar property holding with respect to a suitably defined notion of partial order. They became the focus of considerable interest in mathematics after a series of seminal papers by Hirsch (1985, 1988) provided the basic definitions as well as deep results showing how generic convergence properties of their solutions are expected under suitable technical assumptions. Shortly before that, Smale's construction (Smale 1976) had already highlighted how specific solutions could instead exhibit arbitrary behavior (including periodic or chaotic). Further results along these lines provide insight into which extra assumptions allow one to strengthen generic convergence to global convergence, including, for instance, existence of positive first integrals (Banaji and Angeli 2010; Mierczynski 1987), tridiagonal structure (Smillie 1984), or positive translation invariance (Angeli and Sontag 2008a).

While these tools were initially developed having in mind applications arising in ecology, epidemiology, chemistry, or economics, it was due to the increased importance of mathematical modeling in molecular biology and the subsequent rapid development of systems biology as an emerging independent field of investigation that they became particularly relevant to biology. The paper by Angeli and Sontag (2003) first introduced the notion of monotone control systems, including input and output variables, which is of interest if one is looking at systems arising from the interconnection of monotone modules. Small-gain theorems and related conditions were defined to study both positive (Angeli and Sontag 2004b) and negative (Angeli and Sontag 2003) feedback interconnections by relating their asymptotic behavior to properties of the discrete iterations of a suitable map, called the steady-state characteristic of the system. In particular, convergence of this map is related to convergent solutions for the original continuous-time system; on the other hand, specific negative feedback interconnections can instead give rise to oscillations as a result of Hopf bifurcations as in Angeli and Sontag (2008b) or to relaxation oscillators as highlighted in Gedeon and Sontag (2007).

A parallel line of investigation, originating in the work of Volpert et al. (1994), exploited the specific features of models arising in biochemistry by focusing on structural conditions for monotonicity of chemical reaction networks (Angeli et al. 2010; Banaji 2009). Monotonicity is only one of the possible tools adopted in the study of dynamics for such classes of models in the related field of chemical reaction networks theory.
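The defining feature discussed in this introduction — ordered initial conditions giving rise to ordered solutions — can be illustrated numerically: integrate two copies of a system from ordered initial data and check the componentwise (positive-orthant) order along both trajectories. The linear system and all names below are our own illustrative choices (its off-diagonal coefficients are nonnegative, which is what makes it order preserving), and forward-Euler integration is only a rough approximation of the flow:

```python
def euler_flow(f, x0, dt=0.01, steps=500):
    """Forward-Euler approximation of the flow phi(t, x0) (illustration only)."""
    x = list(x0)
    traj = [tuple(x)]
    for _ in range(steps):
        dx = f(x)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
        traj.append(tuple(x))
    return traj

def leq(a, b):
    """Componentwise partial order induced by the positive orthant."""
    return all(ai <= bi for ai, bi in zip(a, b))

# Illustrative cooperative example: dx1/dt = -x1 + x2, dx2/dt = x1 - 2*x2.
f = lambda x: (-x[0] + x[1], x[0] - 2 * x[1])
```

Running `euler_flow(f, (0, 0))` and `euler_flow(f, (1, 0.5))` and comparing the trajectories pairwise with `leq` confirms that the initial ordering is never violated along the simulated solutions.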
Mathematical Preliminaries

To illustrate the main tools of monotone dynamics, we consider the following systems defined on partially ordered input, state, and output spaces. Namely, along with the sets U, X, Y (which denote input, state, and output space, respectively), we consider corresponding partial orders ⪯_X, ⪯_U, ⪯_Y. A typical way of defining a partial order on a set S embedded in some Euclidean space E, S ⊆ E, is to first identify a cone K of positive vectors which belong to E. A cone in this context is any closed convex set which is preserved under multiplication by nonnegative scalars and such that K ∩ −K = {0}. Accordingly, we may denote s₁ ⪰_S s₂ whenever s₁ − s₂ ∈ K. A typical choice of K in the case of finite-dimensional E = ℝⁿ is the positive orthant (K = [0, +∞)ⁿ), in which case ⪰ can be interpreted as a componentwise inequality. More general orthants are also very useful in several applications, as well as more exotic cones, smooth or polyhedral, according to the specific model considered. When dealing with input signals, we let U denote the set of locally essentially bounded and measurable functions of time. In particular, we inherit a partial order on U from the partial order on U according to the following definition:

u₁(·) ⪰_U u₂(·) ⇔ u₁(t) ⪰_U u₂(t) ∀ t ∈ ℝ.

When obvious from the context, we do not emphasize the space to which variables belong and simply write ⪰. Strict order notions are also of interest and especially relevant for some of the deepest implications of the theory. We let s₁ ≻ s₂ denote s₁ ⪰ s₂ and s₁ ≠ s₂, while for partial orders induced by positivity cones, we let s₁ ≫ s₂ denote s₁ − s₂ ∈ int(K).

A dynamical system is for us a continuous map φ : ℝ × X → X which fulfills the properties φ(0, x) = x for all x ∈ X and φ(t₂, φ(t₁, x)) = φ(t₁ + t₂, x) for all t₁, t₂ ∈ ℝ. Sometimes, when solutions are not globally defined (for instance, if the system is defined through a set of nonlinear differential equations), it is enough to restrict the definitions that follow to the domain of existence of solutions.

Definition 1 A monotone system φ is one that fulfills the following:

∀ x₁, x₂ ∈ X : x₁ ⪯ x₂ ⇒ φ(t, x₁) ⪯ φ(t, x₂)   ∀ t ≥ 0.   (1)

A system φ is strongly monotone when the following holds:

∀ x₁, x₂ ∈ X : x₁ ≺ x₂ ⇒ φ(t, x₁) ≪ φ(t, x₂)   ∀ t > 0.   (2)

A control system is characterized by two continuous mappings: φ : ℝ × X × U → X and the readout map h : X × U → Y.

Definition 2 A control system is monotone if

∀ u₁, u₂ ∈ U : u₁ ⪯ u₂, ∀ x₁, x₂ ∈ X : x₁ ⪯ x₂, ∀ t ≥ 0: φ(t, x₁, u₁) ⪯ φ(t, x₂, u₂)   (3)

and

∀ u₁, u₂ ∈ U : u₁ ⪯ u₂, ∀ x₁, x₂ ∈ X : x₁ ⪯ x₂: h(x₁, u₁) ⪯ h(x₂, u₂).   (4)

Notice that for any ordered state and input pairs x₁ ⪯ x₂, u₁ ⪯ u₂, the signals y₁ and y₂ defined as y₁(t) := h(φ(t, x₁, u₁), u₁(t)) and y₂(t) := h(φ(t, x₂, u₂), u₂(t)) also fulfill, thanks to Definition 2, y₁(t) ⪯_Y y₂(t) (for all t ≥ 0). A system which is monotone with respect to the positive orthant is called cooperative. If a system is cooperative after reverting the direction of time, it is called competitive.

Checking if a mathematical model specified by differential equations is monotone with respect to the partial order induced by some cone K is not too difficult. In particular, monotonicity, in its most basic formulation (1), simply amounts to a check of positive invariance of the set Δ := {(x₁, x₂) ∈ X² : x₁ ⪯ x₂} for a system formed by two copies of φ in parallel. This can be assessed without explicit knowledge of solutions, for instance, by using the notion of tangent cones and Nagumo's theorem (Angeli and Sontag 2003). Sufficient
conditions also exist to assess strong monotonicity, for instance, in the case of orthant cones. Finding whether there exists an order (as induced, for instance, by a suitable cone K) which can make a system monotone is instead a harder task which normally entails a good deal of insight into the system's dynamics.

It is worth mentioning that for the special case of linear systems, monotonicity is just equivalent to invariance of the cone K, as incremental properties (referred to pairs of solutions) are just equivalent to their non-incremental counterparts (referred to the 0 solution). In this respect, a substantial amount of theory exists starting from classical works such as the Perron–Frobenius theory on positive and cone-preserving maps; this is, however, outside the scope of this entry, and the interested reader may refer to Farina and Rinaldi (2000) for a recent book on the subject.

Monotone Dynamics

We divide this section into three parts: first we summarize the main tools for checking monotonicity with respect to orthant cones, then we recall some of the main consequences of monotonicity for the long-term behavior of solutions, and, finally, we study interconnections of monotone systems.

Checking Monotonicity

Orthant cones and the partial orders they induce play a major role in biology applications. In fact, for systems described by equations

ẋ = f(x)   (5)

with X ⊆ ℝⁿ open and f : X → ℝⁿ of class C¹, the following characterization holds:

Proposition 1 The system φ induced by the set of differential equations (5) is cooperative if and only if the Jacobian ∂f/∂x is a Metzler matrix for all x ∈ X.

We recall that M is Metzler if m_ij ≥ 0 for all i ≠ j. Let Λ = diag[λ₁, …, λₙ] with λᵢ ∈ {1, −1} and assume that the orthant O = Λ[0, +∞)ⁿ. It is straightforward to see that x₁ ⪰_O x₂ ⇔ Λx₁ ⪰ Λx₂, where ⪰ denotes the partial order induced by the positive orthant, while ⪰_O denotes the order induced by O. This means that we may check monotonicity with respect to O by performing a simple change of coordinates z = Λx. As a corollary:

Proposition 2 The system φ induced by the set of differential equations (5) is monotone with respect to ⪰_O if and only if Λ(∂f/∂x)Λ is a Metzler matrix for all x ∈ X.

Notice that the conditions of Propositions 1 and 2 can be expressed in terms of sign constraints on off-diagonal entries of the Jacobian; in biological terms, a sign constraint on an off-diagonal entry amounts to asking that a particular species (meaning chemical compound or otherwise) consistently exhibit, throughout the considered model's state space, either an excitatory or an inhibitory effect on some other species of interest. Qualitative diagrams showing effects of species on each other are commonly used by biologists to understand the working principles of biomolecular networks.

Remarkably, Proposition 2 also has an interesting graph-theoretical interpretation if one thinks of sign(∂f/∂x) as the adjacency matrix of a graph with nodes x₁ … xₙ corresponding to the state variables of the system.

Proposition 3 The system φ induced by the set of differential equations (5) is monotone with respect to some orthant order ⪰_O if and only if the directed graph of adjacency matrix sign(∂f/∂x) (neglecting diagonal entries) has undirected loops with an even number of negative edges.

This means in particular that ∂f/∂x must be sign symmetric (no predator-prey-type interactions) and, in addition, that a similar parity property has to hold on undirected loops of arbitrary length. Sufficient conditions for strong monotonicity are also known, for instance, in terms of irreducibility of the Jacobian matrix (Kamke's condition; see Hirsch and Smith 2003).

Asymptotic Dynamics

As previously mentioned, several important implications of monotonicity are with respect to
asymptotic dynamics. Let E denote the set of equilibria of φ. The following result is due to Hirsch (1985).

Theorem 1 Let φ be a strongly monotone system with bounded solutions. There exists a zero-measure set Q such that each solution starting in X \ Q converges toward E.

Global convergence results can be achieved for important classes of monotone dynamics. For instance, when increasing conservation laws are present (see Banaji and Angeli 2010):

Theorem 2 Let X ⊆ K ⊆ ℝⁿ be any two proper cones. Let φ on X be strongly monotone with respect to the partial order induced by K and preserving a K-increasing first integral. Then every bounded solution converges.

Dually to first integrals, positive translation invariance of the dynamics also provides grounds for global convergence (see Angeli and Sontag 2008a):

Theorem 3 If a system is strongly monotone and fulfills φ(t, x₀ + hv) = φ(t, x₀) + hv for all h ∈ ℝ and some v ≫ 0, then all solutions with bounded projections in v⊥ converge.

The class of tridiagonal cooperative systems has also been investigated as a remarkable class of globally convergent dynamics; see Smillie (1984). These arise from differential equations ẋ = f(x) when ∂f_i/∂x_j = 0 for all |i − j| > 1. Finally, it is worth emphasizing how significant results on singular perturbations are for biological systems, which are often subject to phenomena evolving at different timescales (Gedeon and Sontag 2007; Wang and Sontag 2008).

Interconnected Monotone Systems

Results on interconnected monotone SISO systems are surveyed in Angeli and Sontag (2004a). The main tool used in this context is the notion of input-state and input–output steady-state characteristics.

Definition 3 A control system admits a well-defined input-state characteristic if for all constant inputs u there exists a unique globally asymptotically stable equilibrium k_x(u) and the map k_x(u) is continuous. If, moreover, the equilibrium is hyperbolic, then k_x is called a non-degenerate characteristic. The input–output characteristic is defined as k_y(u) = h(k_x(u)).

Let φ be a system with a well-defined input–output characteristic k_y; we may define the iteration

u_{k+1} = k_y(u_k).   (6)

It is clear that fixed points of (6) correspond to input values (and therefore to equilibria through the characteristic map k_x) of the closed-loop system derived by considering the unity feedback interconnection u = y. What is remarkable for monotone systems is that, both in the case of positive and negative feedback and in a precise sense, stability properties of the fixed points of the discrete iteration (6) are matched by stability properties of the corresponding associated solutions of the original continuous-time system. See Angeli and Sontag (2004b) for the case of positive feedback interconnections and Angeli et al. (2004) for applications of such results to synthesis and detection of multistability in molecular biology.

Multistability, in particular, is an important dynamical feature of specific cellular systems and can be achieved, with a good degree of robustness with respect to different types of uncertainties, by means of positive feedback interconnections of monotone systems. The typical input–output characteristic k_y giving rise to such behavior is, in the SISO case, that of a sigmoidal function intersecting the diagonal u = y at three points. Two of the fixed points, namely, u₁ and u₃ (see Fig. 1), are asymptotically stable for (6), and the corresponding equilibria of the original continuous-time monotone system are also asymptotically stable with a basin of attraction which covers almost all initial conditions. The fixed point u₂ is unstable, and the corresponding equilibrium is also such (under suitable technical assumptions on the non-degeneracy of the I-O characteristic). Extensions of similar criteria to the MIMO case are presented in Enciso and Sontag (2005).
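The sigmoidal picture just described can be reproduced with a Hill-type characteristic; the functional form and parameter values below are our own illustrative choices, not taken from the text. With k_y(u) = 2u⁴/(1 + u⁴), the diagonal u = y is crossed at u₁ = 0, u₂ = 1, and u₃ ≈ 1.839, and iterating (6) from either side of the unstable point u₂ converges to one of the two stable fixed points:

```python
def ky(u, n=4, K=1.0, vmax=2.0):
    """Illustrative sigmoidal (Hill-type) input-output characteristic."""
    return vmax * u**n / (K**n + u**n)

def iterate(f, u0, steps=100):
    """Run the discrete iteration u_{k+1} = f(u_k) of Eq. (6)."""
    u = u0
    for _ in range(steps):
        u = f(u)
    return u
```

Starting below u₂ = 1, the iterates fall to u₁ = 0; starting above, they climb to u₃ ≈ 1.839 — the bistable behavior associated with positive feedback.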
ky(u) ky(u)

u1 u2 u3 u u1 u

Monotone Systems in Biology, Fig. 1 Fixed points of Monotone Systems in Biology, Fig. 2 Fixed point of a
a sigmoidal input–output characteristic decreasing input–output characteristic

Negative feedback normally destroys monotonicity. As a result, the likelihood of complex dynamical behavior is highly increased. Nevertheless, input–output characteristics can still provide useful insight into the system's dynamics, at least in the case of low feedback gain or, for high feedback gains, in the presence of sufficiently large delays. For instance, the unity negative feedback interconnection of a SISO monotone system may give rise to a unique and globally asymptotically stable fixed point of (6), thanks to the decreasingness of the input–output characteristic, as shown in Fig. 2. Under such circumstances a small-gain result applies, and global asymptotic stability of the corresponding equilibrium is guaranteed regardless of arbitrary input delays in the system. See Angeli and Sontag (2003) for the simplest small-gain theorem developed in the context of SISO negative feedback interconnections of monotone systems and Enciso and Sontag (2006) for generalizations to systems with multiple inputs as well as delays. A generalization of small-gain results to the case of MIMO systems which are neither in a positive nor a negative feedback configuration is presented in Angeli and Sontag (2011).

When the iteration (6) has an unstable fixed point (for instance, when it converges to a period-2 solution), one may expect the insurgence of oscillations around the equilibrium through a Hopf bifurcation, provided sufficiently large delays in the input channels are allowed. This situation is analyzed in Angeli and Sontag (2008b) and illustrated through the study of Goldbeter's classical model of the Drosophila circadian rhythm.

Summary and Future Directions

Verifying that a control system preserves some ordering of initial conditions provides important and far-reaching implications for its dynamics. The insurgence of specific behaviors can often be inferred on the basis of purely qualitative knowledge (as in the case of Hirsch's generic convergence theorem) as well as additional basic quantitative knowledge, as in the case of positive and negative feedback interconnections of monotone systems. For these reasons, applications of monotone systems theory in molecular biology are gradually emerging: for instance, in the study of MAPK cascades or circadian oscillations, as well as in Chemical Reaction Network Theory. Generally speaking, while monotonicity as a whole cannot be expected in large networks, experimental data show that the number of negative feedback loops in biological regulatory networks is significantly lower than in a random signed graph of comparable size (Maayan et al. 2008).

Analyzing the properties of monotone dynamics may potentially lead to a better understanding of the key regulatory mechanisms of complex networks, as well as to the development of bottom-up approaches for the identification of meaningful submodules in biological networks. Potential research directions may include both novel computational tools and specific applications to systems biology, for instance:
• Algorithms for the detection of monotonicity with respect to exotic orders (such as arbitrary polytopic cones or even state-dependent cones)
• Application of monotonicity-based ideas to control synthesis (see, for instance, Aswani and Tomlin (2009), where the special class of piecewise affine systems is considered)

Cross-References

▶ Deterministic Description of Biochemical Networks
▶ Spatial Description of Biochemical Networks
▶ Stochastic Description of Biochemical Networks

Recommended Reading

For readers interested in the mathematical details of monotone systems theory we recommend the following:

Smith H (1995) Monotone dynamical systems: an introduction to the theory of competitive and cooperative systems. Mathematical surveys and monographs, vol 41. AMS, Providence

A more recent technical survey of aspects related to the asymptotic dynamics of monotone systems is

Hirsch MW, Smith H (2005) Monotone dynamical systems (Chapter 4). In: Canada A, Drábek P, Fonda A (eds) Handbook of differential equations: ordinary differential equations, vol 2. Elsevier

Bibliography

Angeli D, Sontag ED (2003) Monotone control systems. IEEE Trans Autom Control 48:1684–1698
Angeli D, Sontag ED (2004a) Interconnections of monotone systems with steady-state characteristics. In: Optimal control, stabilization and nonsmooth analysis. Lecture notes in control and information sciences, vol 301. Springer, Berlin, pp 135–154
Angeli D, Sontag ED (2004b) Multi-stability in monotone input/output systems. Syst Control Lett 51:185–202
Angeli D, Sontag ED (2008a) Translation-invariant monotone systems, and a global convergence result for enzymatic futile cycles. Nonlinear Anal Ser B Real World Appl 9:128–140
Angeli D, Sontag ED (2008b) Oscillations in I/O monotone systems. IEEE Trans Circuits Syst 55:166–176
Angeli D, Sontag ED (2011) A small-gain result for orthant-monotone systems in feedback: the non-sign-definite case. In: 50th IEEE conference on decision and control, Orlando, 12–15 Dec 2011
Angeli D, Ferrell JE, Sontag ED (2004) Detection of multistability, bifurcations, and hysteresis in a large class of biological positive-feedback systems. Proc Natl Acad Sci USA 101:1822–1827
Angeli D, de Leenheer P, Sontag ED (2010) Graph-theoretic characterizations of monotonicity of chemical networks in reaction coordinates. J Math Biol 61:581–616
Aswani A, Tomlin C (2009) Monotone piecewise affine systems. IEEE Trans Autom Control 54:1913–1918
Banaji M (2009) Monotonicity in chemical reaction systems. Dyn Syst 24:1–30
Banaji M, Angeli D (2010) Convergence in strongly monotone systems with an increasing first integral. SIAM J Math Anal 42:334–353
Banaji M, Angeli D (2012) Addendum to "Convergence in strongly monotone systems with an increasing first integral". SIAM J Math Anal 44:536–537
Enciso GA, Sontag ED (2005) Monotone systems under positive feedback: multistability and a reduction theorem. Syst Control Lett 54:159–168
Enciso GA, Sontag ED (2006) Global attractivity, I/O monotone small-gain theorems, and biological delay systems. Discret Contin Dyn Syst 14:549–578
Enciso GA, Sontag ED (2008) Monotone bifurcation graphs. J Biol Dyn 2:121–139
Farina L, Rinaldi S (2000) Positive linear systems: theory and applications. Wiley, New York
776 Motion Description Languages and Symbolic Control
Gedeon T, Sontag ED (2007) Oscillations in multi-stable monotone systems with slowly varying feedback. J Differ Equ 239:273–295
Hirsch MW (1985) Systems of differential equations that are competitive or cooperative II: convergence almost everywhere. SIAM J Math Anal 16:423–439
Hirsch MW (1988) Stability and convergence in strongly monotone dynamical systems. Reine Angew Math 383:1–53
Hirsch MW, Smith HL (2003) Competitive and cooperative systems: a mini-review. In: Positive systems, Rome. Lecture notes in control and information science, vol 294. Springer, pp 183–190
Maayan A, Iyengar R, Sontag ED (2008) Intracellular regulatory networks are close to monotone systems. IET Syst Biol 2:103–112
Mierczynski J (1987) Strictly cooperative systems with a first integral. SIAM J Math Anal 18:642–646
Smale S (1976) On the differential equations of species in competition. J Math Biol 3:5–7
Smillie J (1984) Competitive and cooperative tridiagonal systems of differential equations. SIAM J Math Anal 15:530–534
Volpert AI, Volpert VA, Volpert VA (1994) Traveling wave solutions of parabolic systems. Translations of mathematical monographs, vol 140. AMS, Providence
Wang L, Sontag ED (2008) Singularly perturbed monotone systems and an application to double phosphorylation cycles. J Nonlinear Sci 18:527–550

Motion Description Languages and Symbolic Control

Sean B. Andersson
Mechanical Engineering and Division of Systems Engineering, Boston University, Boston, MA, USA

Abstract

The fundamental idea behind symbolic control is to mitigate the complexity of a dynamic system by limiting the set of available controls to a typically finite collection of symbols. Each symbol represents a control law that may be either open or closed loop. With these symbols, a simpler description of the motion of the system can be created, thereby easing the challenges of analysis and control design. In this entry, we provide a high-level description of symbolic control; discuss briefly its history, connections, and applications; and provide a few insights into where the field is going.

Keywords

Abstraction; Complex systems; Formal methods

Introduction

Systems and control theory is a powerful paradigm for analyzing, understanding, and controlling dynamic systems. Traditional tools in the field for developing and analyzing control laws, however, face significant challenges when one needs to deal with the complexity that arises in many practical, real-world settings, such as the control of autonomous, mobile systems operating in uncertain and changing physical environments. This is particularly true when the tasks to be achieved are not easily framed in terms of motion to a point in the state space. One of the primary goals of symbolic control is to mitigate this complexity by abstracting some combination of the system dynamics, the space of control inputs, and the physical environment to a simpler, typically finite, model.

This fundamental idea, namely, that of abstracting away the complexity of the underlying dynamics and environment, is in fact a quite natural one. Consider, for example, how you give instructions to another person wanting to go to a point of interest. It would be absurd to provide details at the level of their actuators, namely, with commands to their individual muscles (or, to carry the example to an even more absurd extreme, to the dynamic components that make up those muscles). Rather, very high-level commands are given, such as "follow the road," "turn right," and so on. Each of these provides a description of what to do, with the understanding that the person can carry out those commands in their own fashion. Similarly, the environment itself is abstracted, and only elements meaningful to the task at hand are described. Thus, continuing the example above, rather than
providing metric information or a detailed map, the instructions may use environmental features to determine when particular actions should be terminated and the next begun, such as "follow the road until the second intersection, then turn right."

Underlying the idea of symbolic control is the notion that rich behaviors can result from simple actions. This premise was used in many early robots and can be traced back at least to the ideas of Norbert Wiener on cybernetics (see Arkin 1998). It is at the heart of the behavior-based approach to robotics (Brooks 1986). Similar ideas can also be seen in the development of a high-level language (G-codes) for Computer Numerically Controlled (CNC) machines. The key technical ideas in the more general setting of symbolic control for dynamic systems can be traced back to Brockett (1988), which introduced the idea of formalizing a modular approach to programming motion control devices through the development of a Motion Description Language (MDL).

The goal of the present work is to introduce the interested reader to the general ideas of symbolic control as well as to some of its application areas and research directions. While it is not a survey paper, a few select references are provided throughout to point the reader in hopefully fruitful directions into the literature.

Models and Approaches

There are at least two related but distinct approaches to symbolic control. Both begin with a mathematical description of the system, typically given as an ordinary differential equation of the form

ẋ = f(x, u, t),   y = h(x, t)   (1)

where x is a vector describing the state of the system, y is the output of the sensors of the system, and u is the control input.

Under the first approach to symbolic control, the focus is on reducing the complexity of the space of possible control signals by limiting the system to a typically finite collection of control symbols. Each of these symbols represents a control law that may be open loop or may utilize output feedback. For example, follow the road could be a feedback control law that uses sensor measurements to determine the position relative to the road and then applies steering commands so that the system stays on the road while simultaneously maintaining a constant speed. There are, of course, many ways to accomplish the specifics of this task, and the details will depend on the particular system. Thus, an autonomous four-wheeled vehicle equipped with a laser range finder, an autonomous motorcycle equipped with ultrasonic sensors, or an autonomous aerial vehicle with a camera would each carry out the command in their own way, and each would have very different trajectories. They would all, however, satisfy the notion of follow the road. Descriptions of the behavior of the system can then be given in terms of the abstract symbols rather than in terms of the details of the trajectories.

Typically each of these symbols describes an action that, at least conceptually, is simple. In order to generate rich motions to carry out complex tasks, the system is switched between the available symbols. Switching conditions are often referred to as interrupts. Interrupts may be purely time-based (e.g., apply a given symbol for T seconds) or may be expressed in terms of symbols representing certain environmental conditions. These may be simple functions of the measurements (e.g., interrupt when an intersection is detected) or may represent more complicated scenarios with history and dynamics (e.g., interrupt after the second intersection is detected). Just as the input symbols abstract away the details of the control space and of the motion of the system, the interrupt symbols abstract away the details of the environment. For example, intersection has a clear high-level meaning but a very different sensor "signature" for particular systems.

As a simple illustrative example, consider a collection of control symbols designed for moving along a system of roads, {follow road, turn right, turn left}, and a collection of interrupt
Motion Description Languages and Symbolic Control, Fig. 1 Simple example of symbolic control with a focus on abstracting the inputs. Two systems, a snakelike robot and an autonomous car, are given a high-level plan in terms of symbols for navigating a right-hand turn. The systems each interpret the same symbols in their own ways, leading to different trajectories due both to differences in dynamics and also to different sensor cues as caused, for example, by the parked vehicle encountered by the car in this scenario

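The plan shown in Fig. 1 — control symbols applied in sequence, each until its interrupt fires — can be written as an ordered list of (control, interrupt) pairs in the spirit of an MDL program. A minimal sketch in which the control symbols and sensor events are hypothetical placeholders for the system-specific implementations:

```python
def run_plan(plan, sensor_events):
    """Apply each control symbol until its interrupt condition fires.

    plan: list of (control_symbol, interrupt_symbol) pairs.
    sensor_events: stream of abstract environmental events (hypothetical).
    Returns a log of (active control, observed event) pairs.
    """
    log = []
    events = iter(sensor_events)
    for control, interrupt in plan:
        for event in events:
            log.append((control, event))
            if event == interrupt:
                break  # interrupt fires -> switch to the next symbol
    return log

plan = [("follow_road", "in_intersection"),
        ("turn_right", "clear_of_intersection")]

# A hypothetical stream of abstract sensor events.
events = ["road", "road", "in_intersection", "turning", "clear_of_intersection"]
log = run_plan(plan, events)

assert [c for c, _ in log[:3]] == ["follow_road"] * 3
assert log[-1] == ("turn_right", "clear_of_intersection")
```

The same program could drive either vehicle in Fig. 1; only the (unshown) implementations of the control laws and event detectors would differ.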
symbols for triggering changes in such a setting, {in intersection, clear of intersection}. Suppose there are two vehicles that can each interpret these symbols, an autonomous car and a snakelike robot, as illustrated in Fig. 1. It is reasonable to assume that the control symbols each describe relatively complex dynamics that allow, for example, for obstacle avoidance while carrying out the action. Figure 1 illustrates a possible situation where the two systems carry out the plan defined by the symbolic sequence:

(Follow the road UNTIL in intersection)
(Turn right UNTIL clear of intersection)

The intent of this plan is for the system to navigate a right-hand turn. As shown in the figure, the actual trajectories followed by the systems can be markedly different due in part to system dynamics (the snakelike robot undulates, while the car does not) as well as to different sensor responses (when the car goes through, there is a parked vehicle that it must navigate around, while the snakelike robot found a clear path during its execution). Despite these differences, both systems achieve the goal of the plan.

The collection of control and interrupt symbols can be thought of as a language for describing and specifying motion and are used to write programs that can be compiled into an executable for a specific system. Different rules for doing this can be established that define different languages, analogous to different high-level programming languages such as C++, Java, or Python. Further details can be found in, for example, Manikonda et al. (1998).

Under the second approach, the focus is on representing the dynamics and state space (or environment) of the system in an abstract, symbolic way. The fundamental idea is to lump all the states in a region into a single abstract element and to then represent the entire system with a finite number of these elements. Control laws are then defined that steer all the states in one element into some state in a different region. The symbolic control system is then the finite set of elements representing regions together with the finite set of controllers for moving between them. It can be thought of essentially as a graph (or more accurately as a transition system) in which the nodes represent regions in the state space and the edges represent achievable transitions between them. The goal of this abstraction step is for the two representations to be equivalent (or at least approximately equivalent) in that any motion that can be achieved in one can be achieved in the other (in an appropriate sense). Planning and analysis can then
Motion Description Languages and Symbolic Control, Fig. 2 Simple example of symbolic control with a focus on abstracting the system dynamics and environment. The initial environment (left image) is segmented into different regions and simple controllers developed for moving from region to region. The image shows two possible controllers: one that actuates the robot through a tight slither pattern to move forward by one region and one that twists the robot to face the cell to the left before slithering across and then reorienting. The combination of regions and actions yields a symbolic abstraction (center image) that allows for planning to achieve specific goals, such as moving through the right-hand turn. Executing this plan leads to a physical trajectory of the system (right image)
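The center image of Fig. 2 — regions as nodes, region-to-region controllers as edges — is a graph on which standard search applies. A sketch assuming a hypothetical adjacency among the regions (the connectivity in the actual figure is richer):

```python
from collections import deque

# Hypothetical transition system: nodes are regions, edges are the
# achievable region-to-region controllers.
edges = {"R1": ["R2"], "R2": ["R1", "R4"], "R4": ["R2", "R6"],
         "R6": ["R4", "R8"], "R8": ["R6", "R9"],
         "R9": ["R8", "R10"], "R10": ["R9"]}

def plan(start, goal):
    """Breadth-first search over the symbolic model: returns a region
    sequence, i.e., a sequence of controllers to execute in order."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in edges.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

print(plan("R2", "R10"))  # ['R2', 'R4', 'R6', 'R8', 'R9', 'R10']
```

Each edge traversed by the search corresponds to one control law; executing the returned sequence on the real system produces a physical trajectory such as the one in the right image of Fig. 2.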
be done on the (simpler) symbolic model. Further details on such schemes can be found in, for example, Tabuada (2006) and Bicchi et al. (2006).

As an illustrative example, consider as before a snakelike robot moving through a right-hand turn. In a simplified view of this second approach to symbolic control, one begins by dividing the environment up into regions and then defining controllers to steer the robot from region to region, as illustrated in the left image in Fig. 2. This yields the symbolic model shown in the center of Fig. 2. A plan is then developed on this model to move from the initial position to the final position. This planning step can take into account restrictions on the motion and subgoals of the task. Here, for example, one may want the robot to stay to the right of the double yellow line that is in its lane of traffic. The plan R2 → R4 → R6 → R8 → R9 → R10 is one sequence that drives the system around the turn while satisfying the lane requirement. Each transition in the sequence corresponds to a control law. The plan is then executed by applying the sequence of control laws, resulting in the actual trajectory shown in the right image in Fig. 2.

Applications and Connections

The fundamental idea behind symbolic control, namely, mitigating complexity by abstracting a system, its environment, and even the tasks to be accomplished into a simpler but (approximately) equivalent model, is a natural and a powerful one. It has clear connections to both hybrid systems (Brockett 1993; Egerstedt 2002) and to quantized control (Bicchi et al. 2006), and the tools from those fields are often useful in describing and analyzing systems with symbolic representations of the control and of the dynamics. Symbolic control is not, however, strictly a subcategory of either field, and it provides a unique set of tools for the control and analysis of dynamic systems.

Brockett's original MDL was intended to serve as a tool for describing and planning robot motion. Inspired in part by this, languages for motion continue to be developed. Some of these extend and provide a more formal basis for motion programming (Manikonda et al. 1998) and interconnection of dynamic systems into a single whole (Murray et al. 1992), while some are designed for specialized dynamics or applications such as flight vehicles
(Frazzoli et al. 2005), self-assembly (Klavins 2007), and other areas. In addition to studying standard systems and control theoretic ideas, including notions of reachability (Bicchi et al. 2002) and stability (Tarraf et al. 2008), the framework of symbolic control introduces interesting questions such as how to understand the reduction of complexity that can be achieved for a given collection of symbols (Egerstedt and Brockett 2003).

While there are many application areas of symbolic control, the one that is perhaps most active is that of motion planning for autonomous mobile robots (Belta et al. 2007). As illustrated in Figs. 1 and 2, symbolic control allows the planning problem (i.e., the determination of how to achieve a desired task) to be separated from the complexities of the dynamics. The approach has been particularly fertile when coupled with symbolic descriptions of the tasks to be achieved. While point-to-point commands are useful, and can often be thought of as symbols themselves from which to build more complicated commands, most tasks that one would want mobile robots to carry out involve combinations of spatial goals (move to a certain location), sequencing (first do this and then do that) or other temporal requirements (repeatedly visit a collection of regions), as well as safety or other restrictions (avoid obstacles or regions that are dangerous for the robot to traverse). Such tasks can be described using a variety of temporal logics. These are, essentially, logic systems that include rules related to time in addition to the standard Boolean operators. These tasks can be combined with a symbolic description of a system, and then automated tools can be used both to check whether the system is able to perform the desired task and to design plans that ensure the system will do so (Fainekos et al. 2009). To ensure that results on the abstract, symbolic system are valid on the original dynamic system, methods exist for guaranteeing the equivalence of the two models, in an appropriate sense (Girard and Pappas 2007).

Summary and Future Directions

Symbolic control proceeds from the basic goal of mitigating the complexity of dynamic systems, especially in real-world scenarios, to yield a simplification of the problems of analysis and control design. It builds upon results from diverse fields while also contributing new ideas to those areas, including hybrid system theory, formal languages and grammars, and motion planning. There are many open, interesting questions that are the subject of ongoing investigations as well as the genesis of future research.

One particularly fruitful direction is that of combining symbolic control with stochasticity. Systems that operate in the real world are subject to noise with respect both to their inputs (noisy actuators) and to their outputs (noisy sensors). Recent work along these lines can be found in the formal methods approach to motion planning and in hybrid systems (Abate et al. 2011; Lahijanian et al. 2012). The fundamental idea is to use a Markov chain, Markov decision process, or similar model as the symbolic abstraction and then, as in all symbolic control, to do the analysis and planning on this simpler model.

Another interesting direction is to address questions of optimality with respect to the symbols and abstractions for a given dynamic system. Of course, the notion of "optimal" must be made clear, and there are several reasonable notions one could define. There is a clear trade-off between the complexity of individual symbols, the number of symbols used in the motion "alphabet," and the complexity in terms of, say, the average number of symbols required to code programs that achieve a given set of tasks. The complexity of a necessary alphabet is also related to the variety of tasks the system might need to perform. An autonomous vacuuming robot is likely to need far fewer symbols in its library than an autonomous vehicle that must operate in everyday traffic conditions and respond to unusual events such as traffic jams. The question of the "right" set of symbols can also
be of use in efficient descriptions of motion in domains such as dance (Baillieul and Ozcimder 2012).

It is intuitively clear that to handle complex scenarios and environments, a hierarchical approach is likely needed. Organizing symbols into progressively higher levels of abstraction should allow for more efficient reasoning, planning, and reaction to real-world settings. Such structures already appear in existing works, such as in the behavior-based approach of Brooks (1986), in the extended Motion Description Language in Manikonda et al. (1998), and in the Spatial Semantic Hierarchy of Kuipers (2000). Despite these efforts, there is still a need for a rigorous approach for analyzing and designing symbolic hierarchical systems.

The final direction discussed here is that of the connection of symbolic control to emergent behavior in large groups of dynamic agents. There are a variety of intriguing examples in nature in which large numbers of agents following simple rules produce large-scale, coherent behavior, including in fish schools and termite and ant colonies (Johnson 2002). How can one predict the global behavior that will emerge from a large collection of independent agents following simple rules (symbols)? How can one design a set of symbols to produce a desired collective behavior? While there has been some work in symbolic control for self-assembling systems (Klavins 2007), this general topic remains a rich area for research.

Cross-References

▶ Multi-vehicle Routing
▶ Robot Motion Control
▶ Walking Robots
▶ Wheeled Robots

Recommended Reading

Brockett's original paper Brockett (1988) is a surprisingly short but informational read. More thorough descriptions can be found in Manikonda et al. (1998) and Egerstedt (2002). An excellent description of symbolic control in robotics, particularly in the context of temporal logics and formal methods, can be found in Belta et al. (2007). There are also several related articles in a 2011 special issue of the IEEE Robotics and Automation Magazine (Kress-Gazit 2011).

Bibliography

Abate A, D'Innocenzo A, Di Benedetto MD (2011) Approximate abstractions of stochastic hybrid systems. IEEE Trans Autom Control 56(11):2688–2694
Arkin RC (1998) Behavior-based robotics. MIT, Cambridge
Baillieul J, Ozcimder K (2012) The control theory of motion-based communication: problems in teaching robots to dance. In: American control conference, Montreal, pp 4319–4326
Belta C, Bicchi A, Egerstedt M, Frazzoli E, Klavins E, Pappas GJ (2007) Symbolic planning and control of robot motion [Grand Challenges of Robotics]. IEEE Robot Autom Mag 14(1):61–70
Bicchi A, Marigo A, Piccoli B (2002) On the reachability of quantized control systems. IEEE Trans Autom Control 47(4):546–563
Bicchi A, Marigo A, Piccoli B (2006) Feedback encoding for efficient symbolic control of dynamical systems. IEEE Trans Autom Control 51(6):987–1002
Brockett RW (1988) On the computer control of movement. In: IEEE international conference on robotics and automation, Philadelphia, pp 534–540
Brockett RW (1993) Hybrid models for motion control systems. In: Trentelman HL, Willems JC (eds) Essays on control. Birkhauser, Boston, pp 29–53
Brooks R (1986) A robust layered control system for a mobile robot. IEEE J Robot Autom RA-2(1):14–23
Egerstedt M (2002) Motion description languages for multi-modal control in robotics. In: Bicchi A, Cristensen H, Prattichizzo D (eds) Control problems in robotics. Springer, pp 75–89
Egerstedt M, Brockett RW (2003) Feedback can reduce the specification complexity of motor programs. IEEE Trans Autom Control 48(2):213–223
Fainekos GE, Girard A, Kress-Gazit H, Pappas GJ (2009) Temporal logic motion planning for dynamic robots. Automatica 45(2):343–352
Frazzoli E, Dahleh MA, Feron E (2005) Maneuver-based motion planning for nonlinear systems with symmetries. IEEE Trans Robot 21(6):1077–1091
782 Motion Planning for Marine Control Systems
Girard A, Pappas GJ (2007) Approximation metrics for discrete and continuous systems. IEEE Trans Autom Control 52(5):782–798
Johnson S (2002) Emergence: the connected lives of ants, brains, cities, and software. Scribner, New York
Klavins E (2007) Programmable self-assembly. IEEE Control Syst 27(4):43–56
Kress-Gazit H (2011) Robot challenges: toward development of verification and synthesis techniques [from the Guest Editors]. IEEE Robot Autom Mag 18(3):22–23
Kuipers B (2000) The spatial semantic hierarchy. Artif Intell 119(1–2):191–233
Lahijanian M, Andersson SB, Belta C (2012) Temporal logic motion planning and control with probabilistic satisfaction guarantees. IEEE Trans Robot 28(2):396–409
Manikonda V, Krishnaprasad PS, Hendler J (1998) Languages, behaviors, hybrid architectures, and motion control. In: Baillieul J, Willems JC (eds) Mathematical control theory. Springer, New York, pp 199–226
Murray RM, Deno DC, Pister KSJ, Sastry SS (1992) Control primitives for robot systems. IEEE Trans Syst Man Cybern 22(1):183–193
Tabuada P (2006) Symbolic control of linear systems based on symbolic subsystems. IEEE Trans Autom Control 51(6):1003–1013
Tarraf DC, Megretski A, Dahleh MA (2008) A framework for robust stability of systems over finite alphabets. IEEE Trans Autom Control 53(5):1133–1146

Motion Planning for Marine Control Systems

Andrea Caiti
DII – Department of Information Engineering & Centro "E. Piaggio", ISME – Interuniversity Research Centre on Integrated Systems for the Marine Environment, University of Pisa, Pisa, Italy

Abstract

In this chapter we review motion planning algorithms for ships, rigs, and autonomous marine vehicles. Motion planning includes path and trajectory generation, and it ranges from optimized route planning (off-line, long-range path generation through operations research methods) to reactive on-line trajectory reference generation, as given by the guidance system. Crucial to the marine systems case is the presence of environmental external forces (sea state, currents, winds) that drive the optimized motion generation process.

Keywords

Configuration space; Dynamic programming; Grid search; Guidance controller; Guidance system; Maneuvering; Motion plan; Optimization algorithms; Path generation; Route planning; Trajectory generation; World space

Introduction

Marine control systems include primarily ships and rigs moving on the sea surface, but also underwater systems, manned (submarines) or unmanned, and can eventually be extended to any kind of off-shore moving platform.

A motion plan consists in determining what motions are appropriate for the marine system to reach a goal, or a target/final state (LaValle 2006). Most often, the final state corresponds to a geographical location or destination, to be reached by the system while respecting constraints of a physical and/or economic nature. Motion planning in marine systems hence starts from route planning, and then it covers desired path generation and trajectory generation. Path generation involves the determination of an ordered sequence of states that the system has to follow; trajectory generation requires that the states in a path are reached at a prescribed time.

Route, path, and trajectory can be generated off-line or on-line, exploiting feedback from the system navigation and/or from external sources (weather forecast, etc.). In the feedback case, planning overlaps with the guidance system, i.e., the continuous computation of the reference (desired) state to be used as reference input by the motion control system (Fossen 2011).

Formal Definitions and Settings

Definitions and classifications as in Goerzen et al. (2010) and Petres et al. (2007) are followed
Motion Planning for Marine Control Systems 783

throughout the section. The marine systems under consideration live in a physical space referred to as the world space (e.g., a submarine lives in a 3-D Euclidean space). A configuration $q$ is a vector of variables that define the position and orientation of the system in the world space. The set of all possible configurations is the configuration space, or C-space. The vector of configuration and configuration rate of change is the state of the system $x = [q^T \; \dot q^T]^T$, and the set of all possible states is the state space. The kino-dynamic model associated with the system is represented by the system state equations. The regions of C-space free from obstacles are called C-free.

The path planning problem consists in determining a curve $\gamma : [0,1] \to \text{C-free}$, $s \mapsto \gamma(s)$, with $\gamma(0)$ corresponding to the initial configuration and $\gamma(1)$ corresponding to the goal configuration. Both initial and goal configurations are in C-free. The trajectory planning problem consists in determining a curve $\gamma$ and a time law $t \mapsto s(t)$ such that the motion follows $\gamma(s(t))$. In both cases, either the path or the trajectory must be compatible with the system state equations. In the following, definitions of motion planning algorithm properties are given with reference to path planning, but the same definitions can easily be extended to trajectory planning.

A motion planning algorithm is complete if it finds a path when one exists and returns a proper flag when no path exists. The algorithm is optimal when it provides the path that minimizes some cost function $J$. The (strictly positive) cost function $J$ is isotropic when it depends only on the system configuration, $J = J(q)$, or anisotropic when it depends also on an external force field $f$ (e.g., sea currents, sea state, weather perturbations), $J = J(q, f)$. The cost function $J$ induces a pseudometric in the configuration space; the distance $d_\gamma$ between configurations $q_1$ and $q_2$ through the path $\gamma$ is the "cost-to-go" from $q_1$ to $q_2$ along $\gamma$:

$$d_\gamma(q_1, q_2) = \int_0^1 J(\gamma_{q_1 q_2}(s), f)\,ds \qquad (1)$$

An optimal motion planning problem is:
– Static if there is perfect knowledge of the environment at any time, dynamic otherwise
– Time-invariant when the environment does not evolve (e.g., a coastline that limits the C-free subspace), time-variant otherwise (e.g., other systems – ships, rigs – in navigation)
– Differentially constrained if the system state equations act as a constraint on the path, differentially unconstrained otherwise

In practice, optimal motion planning problems are solved numerically through discretization of the C-space. Resolution completeness/optimality of an algorithm implies the achievement of the solution as the discretization interval tends to zero. Probabilistic completeness/optimality implies that the probability of finding the solution tends to 1 as the computation time approaches infinity. Complexity of the algorithm refers to the computational time required to find a solution as a function of the dimension of the problem.

The scale of the motion with respect to the scale of the system defines the specific setting of the problem. In cargo ship route planning from one port call to the next, the problem is stated first as a static, time-invariant, differentially unconstrained path planning problem; once a large-scale route is thus determined, it can be refined on smaller scales, e.g., by smoothing it, to make it compatible with ship maneuverability. Maneuvering the same cargo ship in the approaches to a harbor has to be cast as a dynamic, time-variant, differentially constrained trajectory planning problem.
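On a discretized path, the cost-to-go (1) reduces to a sum of per-segment costs. The sketch below approximates $d_\gamma$ with the midpoint rule; the cost density `J` and the current field are hypothetical placeholders chosen only to show the anisotropic case, not quantities from this article:

```python
import math

def cost_to_go(path, J, f):
    """Approximate d_gamma(q1, q2) = integral of J(gamma(s), f) ds
    along a path sampled as a list of planar configurations (x, y)."""
    total = 0.0
    for q_a, q_b in zip(path, path[1:]):
        ds = math.dist(q_a, q_b)                          # segment length
        q_m = ((q_a[0] + q_b[0]) / 2, (q_a[1] + q_b[1]) / 2)
        total += J(q_m, f(q_m)) * ds                      # midpoint rule
    return total

# Hypothetical anisotropic cost: moving against an eastward current costs more.
J = lambda q, fq: 1.0 + max(0.0, -fq[0])
straight = [(k / 10, 0.0) for k in range(11)]             # unit-length eastbound path
with_current = cost_to_go(straight, J, lambda q: (0.5, 0.0))
against_current = cost_to_go(straight, J, lambda q: (-0.5, 0.0))
```

Here the same geometric path is 50% more expensive against the current; a route optimizer would trade path length against such exposure.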
Large-Scale, Long-Range Path Planning

Route determination is a typical long-range path planning problem for a marine system. The geographical map is discretized into a grid, and the optimal path between the approaches of the starting and destination ports is determined as a sequence of adjacent grid nodes. The problem is taken as time-invariant and differentially unconstrained, at least in the first stages of the procedure. It is assumed that the ship will cruise at its own (constant) most economical speed to optimize bunker consumption, the major source of operating costs (Wang and Meng 2012). Navigation constraints (e.g., allowed ship traffic corridors for the given ship class) are considered, as well as weather forecasts and predicted/prevailing currents and winds. The cost-to-go Eq. (1) is built between adjacent nodes either in terms of time to travel or in terms of operating costs, both computed by correcting the nominal speed with the environmental forces. Optimality is defined in terms of shortest time/minimum operating cost; the anisotropy introduced by sea/weather conditions is the driving element of the optimization, making the difference with respect to straightforward shortest-route computation. The approach is iterated, starting with a coarse grid and then increasing the grid resolution in the neighborhood of the previously found path.

The most widely used optimization approach for surface ships is dynamic programming (LaValle 2006); alternatively, since the determination of the optimal path along the grid nodes is equivalent to a search over a graph, the A* algorithm is applied (Delling et al. 2009). As the discretization grid gets finer, system dynamics are introduced, accounting for ship maneuverability and allowing for deviation from the constant-ship-speed assumption. Dynamic programming allows the inclusion of system dynamics at any level of resolution desired; however, when system dynamics are considered, the problem dimension grows from 2-D to 3-D (2-D space plus time).

In the case of underwater navigation, path planning takes place in a 3-D world space, and the inclusion of system dynamics makes it a 4-D problem; moreover, bathymetry has to be included as an additional constraint shaping the C-free subspace. Dynamic programming may become unfeasible, due to the increase in dimensionality. Computationally feasible algorithms for this case include global search strategies with probabilistic optimality, such as genetic algorithms (Alvarez et al. 2004), or improved grid-search methods with resolution optimality, such as FM* (Petres et al. 2007).

Environmental force fields are intrinsically dynamic; moreover, the prediction of such fields made at the moment of route planning may be updated while the ship is in transit along the route. The path planning algorithms can/must then be rerun, over a grid in the neighborhood of the nominal path, each time new environmental information becomes available. A kino-dynamic model of the ship must be included, allowing for deviation from the established path and for ship speed variation around the nominal most economical speed. The latter case is particularly important: increasing/decreasing the speed to avoid a weather perturbation while keeping the same route may indeed result in a reduced operating cost with respect to path modifications at constant speed. This implies that the timing over the path must be specified. Dynamic programming is well suited for this transition from path to trajectory generation, and it is still the most commonly used approach to trajectory (re)planning in reaction to environmental prediction updates.

When discretizing the world space, the minimum grid size should still be large enough to allow for ship maneuvering between grid nodes. This is required for safety, to allow evasive maneuvering when other ships are at close range, and for the generation of smooth, dynamics-compliant trajectories between grid points. This latter aspect bridges motion planning with guidance.
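The grid search described above can be sketched in a few lines — plain Dijkstra here, with A* obtained by adding an admissible heuristic to the queue key. The headwind-style edge cost is a hypothetical stand-in for the current-corrected travel times discussed in this section:

```python
import heapq

def plan_route(grid_w, grid_h, edge_cost, start, goal):
    """Dijkstra over a rectangular grid; edge_cost(a, b) is the anisotropic
    cost-to-go between adjacent nodes (e.g., travel time corrected for currents)."""
    frontier, best, parent = [(0.0, start)], {start: 0.0}, {start: None}
    while frontier:
        cost, node = heapq.heappop(frontier)
        if node == goal:
            break
        if cost > best[node]:
            continue                                      # stale queue entry
        x, y = node
        for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nb[0] < grid_w and 0 <= nb[1] < grid_h:
                c = cost + edge_cost(node, nb)
                if c < best.get(nb, float("inf")):
                    best[nb], parent[nb] = c, node
                    heapq.heappush(frontier, (c, nb))
    route, n = [], goal
    while n is not None:
        route.append(n)
        n = parent[n]
    return route[::-1], best[goal]

# Hypothetical anisotropy: eastbound legs face a headwind surcharge.
cost = lambda a, b: 1.0 + (0.8 if b[0] > a[0] else 0.0)
path, c = plan_route(5, 5, cost, (0, 0), (4, 0))
```

With a time-varying field, the same search is run over a space-time grid, which is the dimensional growth from 2-D to 3-D noted above.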
Trajectory Planning, Maneuvering Generation, and Guidance Systems

Once a path has been established over a spatial grid, a continuous reference has to be generated, linking the nodes of the grid. The generation of the reference trajectory has to take into account all the relevant dynamic properties and constraints of the marine system, so that the reference motion is feasible. In this scenario, the path/trajectory nodes are way-points, and trajectory generation connects the way-points along the route. The approaches to trajectory generation can be divided between those that do not compute the whole trajectory explicitly in advance and those that do.

Among the approaches that do not need explicit trajectory computation between way-points, the most common is the line-of-sight (LOS) guidance law (Pettersen and Lefeber 2001). LOS guidance can be considered a path generation, more than a trajectory generation, since it does not impose a time law over the path; it computes directly the desired ship reference heading on the basis of the current ship position and the previous and next way-point positions. A review of other guidance approaches can be found in Breivik and Fossen (2008), where maneuvering along the path and steering around the way-points are also discussed. From such a set of different maneuvers, a library of motion primitives can be built (Greytak and Hover 2010), so that any motion can be specified as a sequence of primitives. While each primitive is feasible by construction, an arbitrary sequence of primitives may not be feasible. An optimized search algorithm (dynamic programming, A*) is again needed to determine the optimal feasible maneuvering sequence.

Path/trajectory planning that explicitly computes the desired motion between two way-points may or may not include a system dynamic model. In the latter case, a sufficiently smooth curve that connects two way-points is generated, for instance, as splines or as Dubins paths (LaValle 2006). Curve generation parameters must be set so that the "sufficiently smooth" requirement is guaranteed. After curve generation, a trim velocity is imposed over the path (path planning), or a time law is imposed, e.g., by smoothly varying the system reference velocity with the local curvature radius.

Planners that do use a system dynamic model are described in Fossen (2011) as part of the guidance system. In practice, the dynamic model is used in simulation, with a (simulated) feedback controller (guidance controller), the next way-point as input, and the (simulated) system position and velocity as output. The simulated results are feasible maneuvers by construction and can be given as reference position/velocity to the physical control system (Fig. 1).

[Motion Planning for Marine Control Systems, Fig. 1: Generation of a reference trajectory with a system model and a guidance controller (Adapted from Fossen (2011))]
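A minimal planar sketch of the LOS heading computation discussed above — the lookahead-based variant is one common choice; the helper and its parameters are illustrative, not taken from Pettersen and Lefeber (2001):

```python
import math

def los_heading(pos, wp_prev, wp_next, lookahead=2.0):
    """Desired heading from line-of-sight guidance: steer toward a point
    on the leg wp_prev -> wp_next, a lookahead distance past the
    projection of the current position onto that leg."""
    dx, dy = wp_next[0] - wp_prev[0], wp_next[1] - wp_prev[1]
    seg_len = math.hypot(dx, dy)
    ux, uy = dx / seg_len, dy / seg_len                 # unit vector along leg
    s = (pos[0] - wp_prev[0]) * ux + (pos[1] - wp_prev[1]) * uy
    tx = wp_prev[0] + (s + lookahead) * ux              # LOS target point
    ty = wp_prev[1] + (s + lookahead) * uy
    return math.atan2(ty - pos[1], tx - pos[0])         # desired heading [rad]

# Ship slightly north of a due-east leg: the commanded heading points
# east and back toward the track.
psi = los_heading((5.0, 1.0), (0.0, 0.0), (10.0, 0.0))
```

Note that no time law is imposed: the heading reference is recomputed from the current position at each guidance update, which is why LOS is a path-generation rather than a trajectory-generation scheme.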
Summary and Future Directions

Motion planning for marine control systems employs methodological tools that range from operations research to guidance, navigation, and control systems. A crucial role in marine applications is played by the anisotropy induced by the dynamically changing environmental conditions (weather, sea state, winds, currents – the external force fields). The quality of the plan will depend on the quality of environmental information and predictions.

While motion planning can be considered a mature subject for ships, rigs, and even standalone autonomous vehicles, current and future research directions will likely focus on the following items:
– Coordinated motion planning and obstacle avoidance for teams of autonomous surface and underwater vehicles (Aguiar and Pascoal 2012; Casalino et al. 2009)
– Naval-traffic-regulation-compliant maneuvering in restricted spaces and collision evasion maneuvering (Tam and Bucknall 2010)
– Underwater intervention robotics (Antonelli 2006; Sanz et al. 2010)

Cross-References

– Control of Networks of Underwater Vehicles
– Mathematical Models of Marine Vehicle-Manipulator Systems
– Mathematical Models of Ships and Underwater Vehicles
– Underactuated Marine Control Systems

Recommended Reading

Motion planning is extensively treated in LaValle (2006), while the essential reference on marine control systems is the book by Fossen (2011). Goerzen et al. (2010) reviews motion planning algorithms in terms of computational properties. The book by Antonelli (2006) includes the treatment of planning and control in intervention robots. The papers by Breivik and Fossen (2008) and Tam et al. (2009) provide a survey of both terminology and guidance design for open- and closed-space maneuvering. In particular, Tam et al. (2009) links motion planning to navigation rules.

Bibliography

Aguiar AP, Pascoal A (2012) Cooperative control of multiple autonomous marine vehicles: theoretical foundations and practical issues. In: Roberts GN, Sutton R (eds) Further advances in unmanned marine vehicles. IET, London, pp 255–282
Alvarez A, Caiti A, Onken R (2004) Evolutionary path planning for autonomous underwater vehicles in a variable ocean. IEEE J Ocean Eng 29(2):418–429
Antonelli G (2006) Underwater robots – motion and force control of vehicle-manipulator systems, 2nd edn. Springer, Berlin/New York
Breivik M, Fossen TI (2008) Guidance laws for planar motion control. In: Proceedings of the 47th IEEE conference on decision & control, Cancun
Casalino G, Simetti E, Turetta A (2009) A three-layered architecture for real time path planning and obstacle avoidance for surveillance USVs operating in harbour fields. In: Proceedings of the IEEE OCEANS'09-Europe, Bremen
Delling D, Sanders P, Schultes D, Wagner D (2009) Engineering route planning algorithms. In: Lerner J, Wagner D, Zweig KA (eds) Algorithmics of large and complex networks. Lecture notes in computer science, vol 5515. Springer, Berlin/New York, pp 117–139
Fossen TI (2011) Handbook of marine craft hydrodynamics and motion control. Wiley, Chichester
Goerzen C, Kong Z, Mettler B (2010) A survey of motion planning algorithms from the perspective of autonomous UAV guidance. J Intell Robot Syst 57:65–100
Greytak MB, Hover FS (2010) Robust motion planning for marine vehicle navigation. In: Proceedings of the 18th international offshore & polar engineering conference, Vancouver
LaValle SM (2006) Planning algorithms. Cambridge University Press, Cambridge/New York. Available on-line at http://planning.cs.uiuc.edu. Accessed 6 June 2013
Petres C, Pailhas Y, Patron P, Petillot Y, Evans J, Lane D (2007) Path planning for autonomous underwater vehicles. IEEE Trans Robot 23(2):331–341
Pettersen KY, Lefeber E (2001) Way-point tracking control of ships. In: Proceedings of the 40th IEEE conference on decision & control, Orlando
Sanz PJ, Ridao P, Oliver G, Melchiorri C, Casalino G, Silvestre C, Petillot Y, Turetta A (2010) TRIDENT: a framework for autonomous underwater intervention missions with dexterous manipulation capabilities. In: Proceedings of the 7th IFAC symposium on intelligent autonomous vehicles, Lecce
Tam CK, Bucknall R (2010) Path-planning algorithm for ships in close-range encounters. J Mar Sci Technol 15:395–407
Tam CK, Bucknall R, Greig A (2009) Review of collision avoidance and path planning methods for ships in close range encounters. J Navig 62:455–476
Wang S, Meng Q (2012) Liner ship route schedule design with sea contingency time and port time uncertainty. Transp Res B 46:615–633

Motion Planning for PDEs

Thomas Meurer
Faculty of Engineering, Christian-Albrechts-University Kiel, Kiel, Germany

Abstract

Motion planning refers to the design of an open-loop or feedforward control to realize prescribed desired paths for the system states or outputs. For distributed-parameter systems described by partial differential equations (PDEs), this requires taking into account the spatial-temporal system dynamics. Here, flatness-based techniques provide
a systematic inversion-based motion planning approach, which is based on the parametrization of any system variable by means of a flat or basic output. With this, the motion planning problem can be solved rather intuitively, as is illustrated for linear and semilinear PDEs.

Keywords

Basic output; Flatness; Formal integration; Formal power series; Trajectory assignment; Trajectory planning; Transition path

Introduction

Motion planning or trajectory planning refers to the design of an open-loop control to realize prescribed desired temporal or spatial-temporal paths for the system states or outputs. Examples include smart structures with embedded actuators and sensors such as adaptive optics in telescopes, adaptive wings or smart skins; thermal and reheating processes in the steel industry; deep drawing; start-up, shutdown, or transitions between operating points in chemical engineering; as well as multi-agent deployment and formation control (see, e.g., the overview in Meurer 2013).

For the solution of the motion planning and tracking control problem for finite-dimensional linear and nonlinear systems, differential flatness as introduced in Fliess et al. (1995) has evolved into a well-established inversion-based technique. Differential flatness implies that any system variable can be parametrized in terms of a flat or so-called basic output and its time derivatives up to a problem-dependent order. As a result, the assignment of a suitable desired trajectory for the flat output directly yields the respective state and input trajectories to realize the prescribed motion. Flatness can be adapted to systems governed by partial differential equations (PDEs). For this, different techniques have been developed utilizing operational calculus or spectral theory for linear PDEs; (formal) power series for linear PDEs and PDEs involving polynomial nonlinearities; as well as formal integration for semilinear PDEs using a generalized Cauchy-Kowalevski approach. To illustrate the principal ideas and the evolving research results starting with Fliess et al. (1997), different techniques are subsequently introduced based on selected example problems. For this, the exposition is primarily restricted to parabolic PDEs, with a brief discussion of motion planning for hyperbolic PDEs before concluding with possible future research directions.

Linear PDEs

In the following, a scalar linear diffusion-reaction equation is considered in the state variable $x(z,t)$ with boundary control $u(t)$, governed by

$$\partial_t x(z,t) = \partial_z^2 x(z,t) + r\,x(z,t) \qquad (1a)$$
$$\partial_z x(0,t) = 0, \quad x(1,t) = u(t) \qquad (1b)$$
$$x(z,0) = 0. \qquad (1c)$$

This PDE describes a wide variety of thermal and fluid systems, including heat conduction and tubular reactors. Herein, $r \in \mathbb{R}$ refers to the reaction coefficient, and the initial state is without loss of generality assumed zero. In order to solve the motion planning problem for (1), a feedforward control $t \mapsto u^*(t)$ is determined to realize a finite-time transition between the initial state and a final stationary state $x_T(z)$ to be imposed for $t \geq T$.

Formal Power Series

By making use of the formal power series expansion of the state variable

$$x(z,t) \to \hat{x}(z,t) = \sum_{n=0}^{\infty} \hat{x}_n(t)\,\frac{z^n}{n!} \qquad (2)$$

the evaluation of (1) results in the 2nd-order recursion

$$\hat{x}_n(t) = \partial_t \hat{x}_{n-2}(t) - r\,\hat{x}_{n-2}(t), \quad n \geq 2 \qquad (3a)$$
$$\hat{x}_1(t) = 0. \qquad (3b)$$
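The recursion (3) and the resulting input parametrization $u(t) = \sum_n \hat{x}_{2n}(t)/(2n)!$ can be exercised numerically. The sketch below repeatedly applies $(\partial_t - r)$ to a polynomial $y(t)$; a genuine finite-time transition requires a Gevrey-class $y^*$ (introduced below), and the polynomial is used here only so that the derivatives stay exact:

```python
from math import factorial

def d_minus_r(poly, r):
    """Apply (d/dt - r) to a polynomial in t given by coefficients poly[k] of t^k."""
    deriv = [k * poly[k] for k in range(1, len(poly))] or [0.0]
    out = [0.0] * max(len(deriv), len(poly))
    for k, c in enumerate(deriv):
        out[k] += c
    for k, c in enumerate(poly):
        out[k] -= r * c
    return out

def feedforward(y_poly, r, t, N=10):
    """u(t) = sum_{n<=N} xhat_{2n}(t)/(2n)! with xhat_{2n} = (d/dt - r)^n y."""
    u, coeff = 0.0, list(y_poly)
    for n in range(N + 1):
        u += sum(c * t**k for k, c in enumerate(coeff)) / factorial(2 * n)
        coeff = d_minus_r(coeff, r)
    return u

# y(t) = t^2 with r = 0 (pure diffusion): the series terminates,
# u(t) = t^2 + t + 1/12.
u0 = feedforward([0.0, 0.0, 1.0], 0.0, 0.0)
u1 = feedforward([0.0, 0.0, 1.0], 0.0, 1.0)
```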
In order to be able to solve (3) for $\hat{x}_n(t)$, it is hence required to impose $\hat{x}_0(t) = \hat{x}(0,t)$. Denoting $y(t) = x(0,t)$, or respectively

$$\hat{x}_0(t) = y(t), \qquad (3c)$$

implies

$$\hat{x}_{2n}(t) = (\partial_t - r)^n \circ y(t), \quad \hat{x}_{2n+1}(t) = 0. \qquad (4)$$

Hence, any series coefficient in (2) can be differentially parametrized by means of $y(t)$. Taking into account the inhomogeneous boundary condition in (1b), i.e.,

$$u(t) = x(1,t) = \sum_{n=0}^{\infty} \frac{\hat{x}_n(t)}{n!} = \sum_{n=0}^{\infty} \frac{\hat{x}_{2n}(t)}{(2n)!} \qquad (5)$$

yields that $y(t) = x(0,t)$ can be considered a flat or basic output. In particular, by prescribing a suitable trajectory $t \mapsto y^*(t) \in C^\infty(\mathbb{R})$ for $y(t)$, the evaluation of (5) yields the feedforward control $u^*(t)$ which is required to realize the spatial-temporal path $x^*(z,t)$ obtained from the substitution of $y^*(t)$ into (2) with coefficients parametrized by (4). This, however, relies on the uniform convergence of (2) in view of (4), with at least a unit radius of convergence in $z$. For this, the notion of a Gevrey class function is needed (Rodino 1993).

Definition 1 (Gevrey class) The function $y(t)$ is in $G_{D,\alpha}(\Omega)$, the Gevrey class of order $\alpha$ in $\Omega \subseteq \mathbb{R}$, if $y(t) \in C^\infty(\Omega)$ and for every closed subset $\Omega'$ of $\Omega$ there exists a $D > 0$ such that $\sup_{t \in \Omega'} |\partial_t^n y(t)| \leq D^{n+1} (n!)^\alpha$.

The set $G_{D,\alpha}(\Omega)$ forms a linear vector space and a ring with respect to the arithmetic product of functions, which is closed under the standard rules of differentiation. Gevrey class functions of order $\alpha < 1$ are entire and are analytic if $\alpha = 1$.

Theorem 1 Let $y(t) \in G_{D,\alpha}(\mathbb{R})$ for $\alpha < 2$; then the formal power series (2) with coefficients (4) converges uniformly with infinite radius of convergence.

The proof of this result can, e.g., be found in Laroche et al. (2000) and Lynch and Rudolph (2002) and relies on the analysis of the recursion (3) taking into account the assumptions on the function $y(t)$.

Trajectory Assignment

To apply these results for the solution of the motion planning problem, i.e., to achieve finite-time transitions between stationary profiles, it is crucial to properly assign the desired trajectory $y^*(t)$ for the basic output $y(t)$. For this, observe that stationary profiles $x^s(z) = x^s(z; y^s)$ are, due to the flatness property (classically, stationary solutions are defined in terms of stationary input values $x^s(1) = u^s$), governed by

$$0 = \partial_z^2 x^s(z) + r\,x^s(z) \qquad (6a)$$
$$\partial_z x^s(0) = 0, \quad x^s(0) = y^s. \qquad (6b)$$

Hence, assigning different $y^s$ results in different stationary profiles $x^s(z; y^s)$. The connection between an initial stationary profile $x_0^s(z; y_0^s)$ and a final stationary profile $x_T^s(z; y_T^s)$ is achieved by assigning $y^*(t)$ such that

$$y^*(0) = y_0^s, \quad y^*(T) = y_T^s$$
$$\partial_t^n y^*(0) = 0, \quad \partial_t^n y^*(T) = 0, \quad n \geq 1.$$

This implies that $y^*(t)$ has to be locally nonanalytic at $t \in \{0, T\}$ and, in view of the previous discussion, has thus to be a Gevrey class function of order $\alpha \in (1,2)$. For specific problems, different functions fulfilling these properties have been suggested. In the following, the ansatz

$$y^*(t) = y_0^s + (y_T^s - y_0^s)\,\Phi_{T,\sigma}(t) \qquad (7a)$$

is used with

$$\Phi_{T,\sigma}(t) = \begin{cases} 0, & t \leq 0 \\[4pt] \dfrac{\int_0^t h_{T,\sigma}(\tau)\,d\tau}{\int_0^T h_{T,\sigma}(\tau)\,d\tau}, & t \in (0,T) \\[4pt] 1, & t \geq T \end{cases} \qquad (7b)$$

for $h_{T,\sigma}(t) = \exp\!\big(-[t/T\,(1 - t/T)]^{-\sigma}\big)$ if $t \in (0,T)$ and $h_{T,\sigma}(t) = 0$ else.
It can be shown that (7b) is a Gevrey class function of order $\alpha = 1 + 1/\sigma$ (Fliess et al. 1997). Alternative functions are presented, e.g., in Rudolph (2003).

[Motion Planning for PDEs, Fig. 1: Simulated spatial-temporal transition path (top) and applied flatness-based feedforward control $u^*(t)$ and desired trajectory $y^*(t)$ (bottom) for (1)]

Simulation Example

In order to illustrate the results of the motion planning procedure described above, let $r = -1$ in (1). The differential parametrization (4) of the series coefficients is evaluated for the desired trajectory $y^*(t)$ defined in (7) for $y_0^s = 0$ and $y_T^s = 1$, with the transition time $T = 1$ and the parameter $\sigma = 2$. With this, the finite-time transition between the zero initial stationary profile $x_0(z) = 0$ and the final stationary profile $x_T(z) = x_T^s(z) = y_T^s \cosh(z)$ is realized along the trajectory $x(0,t) = y^*(t)$. The corresponding feedforward control and spatial-temporal transition path are shown in Fig. 1.

Extensions and Generalizations

The previous considerations constitute a first systematic approach to solving motion planning problems for systems governed by PDEs. The underlying techniques can, however, be further generalized to address coupled systems of PDEs, certain classes of nonlinear PDEs (see also section "Semilinear PDEs"), or in-domain control.

While the application of formal power series is restricted to boundary-controlled diffusion-convection-reaction systems, the approach can be combined with so-called resummation techniques to overcome convergence issues such as slowly
converging or even divergent series expansions (Laroche et al. 2000; Meurer and Zeitz 2005). Flatness-based techniques for motion planning can also be embedded into an operator-theoretic context using semigroup theory by restricting the analysis to so-called Riesz spectral operators. This enables the analysis of coupled systems of linear PDEs, with both boundary and in-domain control, in single and multiple spatial coordinates within a common framework (Meurer 2011, 2013). In addition, experimental results for flexible beam and plate structures with embedded piezoelectric actuators confirm the applicability of this design approach and the achievable high tracking accuracy when transiently shaping the deflection profile (Schröck et al. 2013).

Semilinear PDEs

Flatness can be extended to semilinear PDEs. This is subsequently illustrated for the diffusion-reaction system

$$\partial_t x(z,t) = \partial_z^2 x(z,t) + r(x(z,t)) \qquad (8a)$$
$$\partial_z x(0,t) = 0, \quad x(1,t) = u(t) \qquad (8b)$$
$$x(z,0) = 0 \qquad (8c)$$

with boundary input $u(t)$. Similar to the previous section, the motion planning problem refers to the determination of a feedforward control $t \mapsto u^*(t)$ to realize finite-time transitions starting at the initial profile $x_0(z) = x(z,0) = 0$ to achieve a final stationary profile $x_T(z)$ for $t \geq T$.

Formal Power Series

If $r(x(z,t))$ is a polynomial in $x(z,t)$ or an analytic function, then, similar to the previous section, formal power series can be applied to solve the motion planning problem. This, however, relies on the successive evaluation of Cauchy's product formula. As an example, consider

$$r(x(z,t)) = r_1\,x(z,t) + r_2\,x^2(z,t);$$

then the formal power series ansatz (2) results in the recursion

$$\hat{x}_n(t) = \partial_t \hat{x}_{n-2}(t) - r_1\,\hat{x}_{n-2}(t) - r_2 \sum_{j=0}^{n-2} \binom{n-2}{j} \hat{x}_j(t)\,\hat{x}_{n-2-j}(t), \quad n \geq 2 \qquad (9a)$$
$$\hat{x}_1(t) = 0. \qquad (9b)$$

Similar to the linear setting in the section "Linear PDEs" above, the recursion can be solved for $\hat{x}_n(t)$ by imposing $\hat{x}_0(t) = \hat{x}(0,t)$, or respectively

$$\hat{x}_0(t) = y(t). \qquad (9c)$$

As a result, also in this nonlinear setting any series coefficient can be expressed in terms of $y(t)$ and its time derivatives. Hence, $y(t) = x(0,t)$ denotes a basic output for the semilinear PDE (8). The uniform series convergence can be analyzed by restricting any trajectory $y(t)$ to a certain Gevrey order $\alpha$ while simultaneously restricting the absolute values of $d$, $r_1$, and $r_2$ (Dunbar et al. 2003; Lynch and Rudolph 2002). These restrictions can be approached using, e.g., resummation techniques to sum slowly converging or divergent series to a meaningful limit. The reader is therefore referred to Meurer and Zeitz (2005) or Meurer and Krstic (2011), with the latter introducing a PDE-based approach for formation control of multi-agent systems.

Formal Integration

A generalization of these results has been suggested recently in Schörkhuber et al. (2013) by making use of an abstract Cauchy-Kowalevski theorem in Gevrey classes. In order to illustrate this, solve (8a) for $\partial_z^2 x(z,t)$ and formally integrate twice with respect to $z$, taking into account the boundary conditions (8b). This yields the implicit solution

$$x(z,t) = x(0,t) + \int_0^z\!\!\int_0^p \big( \partial_t x(q,t) - r(x(q,t)) \big)\,dq\,dp \qquad (10a)$$
$$u(t) = x(1,t). \qquad (10b)$$
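The implicit relation (10a) can be evaluated by successive substitution (the iteration written out in (12) below). A sketch over polynomials in $(z,t)$, stored as a dictionary mapping $(j,k)$ to the coefficient of $z^j t^k$; the polynomial $y$ and the linear reaction $r(x) = r\,x$ are simplifying assumptions made here so that every step stays exact — a nonlinear reaction such as (13) would additionally require composing series:

```python
def dt(p):
    """Time derivative of a (z, t)-polynomial {(j, k): coeff of z^j t^k}."""
    return {(j, k - 1): k * c for (j, k), c in p.items() if k > 0}

def izz(p):
    """Double integral in z from 0 (applied twice), as in (10a)."""
    return {(j + 2, k): c / ((j + 1) * (j + 2)) for (j, k), c in p.items()}

def add(p, q, s=1.0):
    out = dict(p)
    for key, c in q.items():
        out[key] = out.get(key, 0.0) + s * c
    return out

def evaluate(p, z, t):
    return sum(c * z**j * t**k for (j, k), c in p.items())

def iterate(y, r=0.0, steps=10):
    """Successive substitution x <- y + izz(dt(x) - r*x) for reaction r(x) = r*x."""
    x = dict(y)
    for _ in range(steps):
        x = add(y, izz(add(dt(x), x, -r)))
    return x

# y(t) = t^2 with r = 0: the fixed point x = t^2 + t z^2 + z^4/12 is
# reached exactly after a few steps, so u(0) = x(1, 0) = 1/12.
x = iterate({(0, 2): 1.0}, r=0.0, steps=5)
u0 = evaluate(x, 1.0, 0.0)

# Linear reaction r(x) = x: the iterates converge toward the series solution.
u_lin = evaluate(iterate({(0, 2): 1.0}, r=1.0, steps=30), 1.0, 0.0)
```

The $1/((j+1)(j+2))$ factor introduced by each double integration is what drives the contraction on $z \in [0,1]$, mirroring the convergence argument sketched below.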
This implicit solution can be used to develop a flatness-based design systematics for motion planning for semilinear PDEs. For this, introduce

$$y(t) = x(0,t), \qquad (11)$$

and rewrite (10) in terms of the sequence of functions $(x^{(n)}(z,t))_{n=0}^{\infty}$ according to

$$x^{(0)}(z,t) = y(t) \qquad (12a)$$
$$x^{(n+1)}(z,t) = x^{(0)}(z,t) + \int_0^z\!\!\int_0^p \big( \partial_t x^{(n)}(q,t) - r(x^{(n)}(q,t)) \big)\,dq\,dp. \qquad (12b)$$

From this, it is obvious that $y(t)$ denotes a basic output differentially parametrizing the state variable $x(z,t) = \lim_{n\to\infty} x^{(n)}(z,t)$ and the boundary input $u(t) = x(1,t)$, provided that the limit exists as $n \to \infty$. As is shown in Schörkhuber et al. (2013), by making use of scales of Banach spaces in Gevrey classes and abstract Cauchy-Kowalevski theory, the convergence of the parametrized sequence of functions $(x^{(n)}(z,t))_{n=0}^{\infty}$ can be ensured in some compact subset of the domain $z \in [0,1]$. Besides its general setup, this approach provides an iteration scheme which can be directly utilized for a numerically efficient solution of the motion planning problem.

Simulation Example

Let the reaction be subsequently described by

$$r(x(z,t)) = \sin(2 x(z,t)). \qquad (13)$$

The iterative scheme (12) is evaluated for the desired trajectory $y^*(t)$ defined in (7) for $y_0^s = 0$ and $y_T^s = 1$, with the transition time $T = 1$ and the parameter $\sigma = 1$, i.e., the desired trajectory is of Gevrey order $\alpha = 2$. The resulting feedforward control $u^*(t)$ and the spatial-temporal transition path resulting from the numerical solution of the PDE are depicted in Fig. 2. The desired finite-time transition between the zero initial stationary profile $x_0(z) = 0$ and the final stationary profile $x_T(z) = x_T^s(z)$ determined by

$$0 = \partial_z^2 x^s(z) + r(x^s(z)) \qquad (14a)$$
$$\partial_z x^s(0) = 0, \quad x^s(0) = y^s \qquad (14b)$$

is clearly achieved along the prescribed path $y^*(t)$.

Extensions and Generalizations

Generalizations of the introduced formal integration approach to solve motion planning problems for systems of coupled PDEs are, e.g., provided in Schörkhuber et al. (2013). Moreover, linear diffusion-convection-reaction systems with spatially and time-varying coefficients defined on a higher-dimensional parallelepipedon are addressed in Meurer and Kugi (2009) and Meurer (2013).

Hyperbolic PDEs

Hyperbolic PDEs exhibiting wavelike dynamics require the development of a design systematics explicitly taking into account the finite speed of wave propagation. For linear hyperbolic PDEs, operational calculus has been successfully applied to determine the state and input parametrizations in terms of the basic output and its advanced and delayed arguments (Petit and Rouchon 2001, 2002; Rouchon 2001; Rudolph and Woittennek 2008; Woittennek and Rudolph 2003). In addition, the method of characteristics can be utilized to address both linear and quasi-linear hyperbolic PDEs. Herein, a suitable change of coordinates enables a reformulation of the PDE in a normal form, which can be (formally) integrated in terms of a basic output. With this, an efficient numerical procedure can also be developed to solve motion planning problems for hyperbolic PDEs (Woittennek and Mounier 2010).
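For the simplest hyperbolic case, the parametrization by advanced and delayed arguments can be written down explicitly. This is a textbook special case assumed here for illustration, not taken verbatim from the article: for the wave equation $\partial_t^2 w = c^2 \partial_z^2 w$ on $z \in [0,1]$ with a free end $\partial_z w(0,t) = 0$ and boundary control $u(t) = w(1,t)$, the basic output $y(t) = w(0,t)$ gives the d'Alembert-type solution $w(z,t) = \tfrac12\big(y(t + z/c) + y(t - z/c)\big)$, hence:

```python
import math

def feedforward(y, c=1.0):
    """u(t) = w(1, t) = (y(t + 1/c) + y(t - 1/c)) / 2 for the wave equation
    d^2w/dt^2 = c^2 d^2w/dz^2 on [0, 1] with dw/dz(0, t) = 0, y(t) = w(0, t):
    the input depends on advanced and delayed values of the basic output."""
    return lambda t: 0.5 * (y(t + 1.0 / c) + y(t - 1.0 / c))

u = feedforward(math.sin)     # y(t) = sin(t) as a smooth test output
u_val = u(0.3)                # equals sin(0.3) * cos(1) by the sum formula
```

Realizing a finite-time transition then amounts to choosing $y^*$ as in (7); note that $u$ anticipates $y$ by the propagation time $1/c$, which is the finite wave speed made explicit.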
[Motion Planning for PDEs, Fig. 2: Simulated spatial-temporal transition path (top) and applied flatness-based feedforward control $u^*(t)$ and desired trajectory $y^*(t)$ (bottom) for (8) with (13)]

Summary and Future Directions

Motion planning constitutes an important design step when solving control problems for systems governed by PDEs. This is particularly due to the increasing demands on quality, accuracy, and efficiency, which require turning away from the pure stabilization of an operating point toward the realization of specific start-up, transition, or tracking tasks. In view of these aspects, future research directions might deepen and further evolve the following:
– Semi-analytic design techniques taking into account suitable approximation schemes for complex-shaped spatial domains
– Nonlinear PDEs and coupled systems of nonlinear PDEs with boundary and in-domain control
– Applications arising, e.g., in aeroelasticity, micromechanical systems, fluid flow, and fluid-structure interaction

Cross-References

– Boundary Control of 1-D Hyperbolic Systems
– Boundary Control of Korteweg-de Vries and Kuramoto–Sivashinsky PDEs
– Control of Fluids and Fluid-Structure Interactions

Bibliography

Dunbar W, Petit N, Rouchon P, Martin P (2003) Motion planning for a nonlinear Stefan problem. ESAIM Control Optim Calculus Var 9:275–296
Fliess M, Lévine J, Martin P, Rouchon P (1995) Flatness and defect of non-linear systems: introductory theory and examples. Int J Control 61:1327–1361
Fliess M, Mounier H, Rouchon P, Rudolph J (1997) Systèmes linéaires sur les opérateurs de Mikusiński et commande d'une poutre flexible. ESAIM Proc 2:183–193
Laroche B, Martin P, Rouchon P (2000) Motion planning for the heat equation. Int J Robust Nonlinear Control 10:629–643
Lynch A, Rudolph J (2002) Flatness-based boundary control of a class of quasilinear parabolic distributed parameter systems. Int J Control 75(15):1219–1230
Meurer T (2011) Flatness-based trajectory planning for diffusion-reaction systems in a parallelepipedon – a spectral approach. Automatica 47(5):935–949
Meurer T (2013) Control of higher-dimensional PDEs: flatness and backstepping designs. Communications and control engineering series. Springer, Berlin
Meurer T, Krstic M (2011) Finite-time multi-agent deployment: a nonlinear PDE motion planning approach. Automatica 47(11):2534–2542
Meurer T, Kugi A (2009) Trajectory planning for boundary controlled parabolic PDEs with varying parameters on higher-dimensional spatial domains. IEEE Trans Autom Control 54(8):1854–1868
Meurer T, Zeitz M (2005) Feedforward and feedback tracking control of nonlinear diffusion-convection-reaction systems using summability methods. Ind Eng Chem Res 44:2532–2548
Petit N, Rouchon P (2001) Flatness of heavy chain systems. SIAM J Control Optim 40(2):475–495
Petit N, Rouchon P (2002) Dynamics and solutions to some control problems for water-tank systems. IEEE Trans Autom Control 47(4):594–609
Rodino L (1993) Linear partial differential operators in Gevrey spaces. World Scientific, Singapore
Rouchon P (2001) Motion planning, equivalence, and infinite dimensional systems. Int J Appl Math Comput Sci 11:165–188
Rudolph J (2003) Flatness based control of distributed parameter systems. Berichte aus der Steuerungs- und Regelungstechnik. Shaker-Verlag, Aachen
Rudolph J, Woittennek F (2008) Motion planning and open loop control design for linear distributed parameter systems with lumped controls. Int J Control 81(3):457–474
Schörkhuber B, Meurer T, Jüngel A (2013) Flatness of semilinear parabolic PDEs – a generalized Cauchy-Kowalevski approach. IEEE Trans Autom Control 58(9):2277–2291
Schröck J, Meurer T, Kugi A (2013) Motion planning for Piezo-actuated flexible structures: modeling, design, and experiment. IEEE Trans Control Syst Technol 21(3):807–819
Woittennek F, Mounier H (2010) Controllability of networks of spatially one-dimensional second order P.D.E. – an algebraic approach. SIAM J Control Optim 48(6):3882–3902
Woittennek F, Rudolph J (2003) Motion planning for a class of boundary controlled linear hyperbolic PDE's involving finite distributed delays. ESAIM Control Optim Calculus Var 9:419–435

Motorcycle Dynamics and Control

Martin Corless
School of Aeronautics & Astronautics, Purdue University, West Lafayette, IN, USA

Abstract

A basic model due to Sharp which is useful in the analysis of motorcycle behavior and control is developed. This model is based on linearization of a bicycle model introduced by Whipple, but is augmented with a tire model in which the lateral tire force depends in a dynamic fashion on tire behavior. This model is used to explain some of the important characteristics of motorcycle behavior. The significant dynamic modes exhibited by this model are capsize, weave, and wobble.

Keywords

Bicycle; Capsize; Counter-steering; Motorcycle; Single-track vehicle; Tire model; Weave; Wobble

Introduction

The bicycle is mankind's ultimate solution to the quest for a human-powered vehicle (Herlihy 2006). The motorcycle just makes riding more fun. Bicycles, motorcycles, scooters, and mopeds are all examples of single-track vehicles and have similar dynamics. The dynamics of a motorcycle are considerably more complicated than those of a four-wheel vehicle such as a car. The first obvious difference in behavior is stability. An unattended upright stationary motorcycle is basically an inverted pendulum and is unstable about its normal upright position, whereas a car has no stability issues in the same configuration. Another difference is that a motorcycle must lean when cornering. Although a car leans a little due to suspension travel, there is no necessity for it to lean in cornering. A perfectly rigid car would not lean. Furthermore, beyond low speeds, the steering behavior of a

motorcycle is not intuitive like that of a car. To turn a car right, the driver simply turns the steering wheel right; on a motorcycle, the rider initially turns the handlebars to the left. This is called counter-steering and is not intuitive.

A Basic Model

To obtain a basic motorcycle model, we start with four rigid bodies: the rear frame (which includes a rigidly attached rigid rider), the front frame (which includes handlebars and front forks), the rear wheel, and the front wheel; see Fig. 1. We assume that both frames and wheels have a plane of symmetry which is vertical when the bike is in its nominal upright configuration. The front frame can rotate relative to the rear frame about the steering axis; the steering axis is in the plane of symmetry of each frame and, in the nominal upright configuration of the bike, the angle it makes with the vertical is called the rake angle or caster angle and is denoted by ε. The rear wheel rotates relative to the rear frame about an axis perpendicular to the rear plane of symmetry and is symmetrical with respect to this axis. The same relationship holds between the front wheel and the front frame. Although each wheel can be three dimensional, we model the wheels as terminating in a knife edge at their boundaries and contacting the ground at a single point. Points Q and P are the points on the ground in contact with the front and rear wheels, respectively.

Motorcycle Dynamics and Control, Fig. 1 Basic model

Each of the above four bodies is described by its mass, mass center location, and a 3 × 3 inertia matrix. Two other important parameters are the wheelbase w and the trail c. The wheelbase is the distance between the contact points of the two wheels in the nominal configuration, and the trail is the distance from the front wheel contact point Q to the intersection S of the steering axis with the ground. The trail is normally positive, that is, Q is behind S. The point G locates the mass center of the complete bike in its nominal configuration, whereas G_A is the location of the mass center of the front assembly (front frame and wheel).

Description of Motion

Considering a right-handed reference frame e = (ê₁, ê₂, ê₃) with origin O fixed in the ground, the bike motion can be described by the location of the rear wheel contact point P relative to O, the orientation of the rear frame relative to e, and the orientation of the front frame relative to the rear frame; see Fig. 2. Assuming the bike is moving along a horizontal plane, the location of P is usually described by Cartesian coordinates x and y. Let reference frame b be fixed in the rear frame with b̂₁ and b̂₃ in the plane of symmetry and with b̂₁ along the nominal P–S

Motorcycle Dynamics and Control, Fig. 2 Description of motion

line; see Fig. 2. Using this reference frame, the orientation of the rear frame is described by a 3-1-2 Euler angle sequence which consists of a yaw rotation by ψ about the 3-axis, followed by a lean (roll) rotation by φ about the 1-axis, and finally by a pitch rotation by θ about the 2-axis. The orientation of the front frame relative to the rear frame can be described by the steer angle δ. Assuming both wheels remain in contact with the ground, the pitch angle θ is not independent; it is uniquely determined by δ and φ. In considering small perturbations from the upright nominal configuration, the variation in pitch is usually ignored; here we consider it to be zero. Also, the dynamic behavior of the bike is independent of the coordinates x, y, and ψ. These coordinates can be obtained by integrating the velocity of P and ψ̇.

The Whipple Bicycle Model
The "simplest" model which captures all the salient features of a single-track vehicle for a basic understanding of low-speed dynamics and control is that originally due to Whipple (1899). We consider the linearized version of this model, which is further expounded on in Meijaard et al. (2007). The salient feature of this model is that there is no slip at each wheel. This means that the velocity of the point on the wheel which is instantaneously in contact with the ground is zero; this is illustrated in Fig. 3 for the rear wheel. No slip implies that there is no sideslip, which means that the velocity of the wheel contact point (v̄_P in Fig. 3) is parallel to the intersection of the wheel plane with the ground plane; the wheel contact point is the point moving along the ground which is in contact with the wheel.

Motorcycle Dynamics and Control, Fig. 3 No slip: v̄_P = 0

The rest of this entry is based on linearization of motorcycle dynamics about an equilibrium configuration corresponding to the bike traveling upright in a straight line at constant forward speed v := v_P, the speed of the rear wheel contact point P. In the linearized system, the longitudinal dynamics are independent of the lateral dynamics, and in the absence of driving or braking forces, the speed v is constant.

With no sideslip at both wheels, kinematical considerations (see Fig. 4) show that the yaw rate ψ̇ is determined by δ; specifically, for small angles we have the following linearized relationship:

ψ̇ = σvδ + μδ̇   (1)

where σ = c_ε/w, c_ε = cos ε, and μ = c c_ε/w is the normalized mechanical trail. In Fig. 4, δ_f = c_ε δ is the effective steer angle; it is the angle between the intersections of the front wheel plane and the rear frame plane with the ground. Thus we can completely describe the lateral bike dynamics with the roll angle φ and the steer angle δ. To obtain the above relationship, first note that, as a consequence of no sideslip, v̄_P = v b̂₁ and v̄_Q is perpendicular to f̂₂. Taking the dot product of the expression

v̄_Q = v̄_P + (w + c)ψ̇ b̂₂ − c(ψ̇ + δ̇_f) f̂₂

with f̂₂ while noting that b̂₁ · f̂₂ = −sin δ_f and b̂₂ · f̂₂ = cos δ_f results in

0 = −v sin δ_f + (w + c)ψ̇ cos δ_f − c(ψ̇ + δ̇_f).

Linearization about δ = 0 yields the desired result.

Motorcycle Dynamics and Control, Fig. 4 Some kinematics

The relationship in (1) also holds for four-wheel vehicles. There one can achieve a desired

constant yaw rate ψ̇_d by simply letting the steer angle δ = ψ̇_d/(σv). However, as we shall see, a motorcycle with its steering fixed at a constant steer angle is unstable. Neglecting gyroscopic terms, it is simply an inverted pendulum. With its steering free, a motorcycle can be stable over a certain speed range and, if unstable, can be easily stabilized above a very low speed by most people. Trials riders can stabilize a motorcycle at any speed, including zero.

To help understand the effect of steer angle on bike behavior, we initially ignore the mass and inertia of the front assembly along with gyroscopic effects, and we assume that the b̂₁ axis is a principal axis of inertia of the rear frame with moment of inertia I_xx. Angular momentum considerations about the b̂₁ axis and linearization result in

I_xx φ̈ + m h a_B = m g h φ + (N_f c c_ε) δ   (2)

where N_f is the normal force (vertical and upwards) on the front wheel and a_B is the lateral acceleration (perpendicular to the rear frame) of point B, which is the projection of G onto the b̂₁ axis. By considering a moment balance about the pitch axis b̂₂ through P, one can obtain that N_f = mgb/w. Notice that, with the steering fixed at δ = 0, Eq. (2) is the equation of motion of a simple inverted pendulum whose support axis is accelerating horizontally with acceleration a_B. This is illustrated in Fig. 5.

Motorcycle Dynamics and Control, Fig. 5 Inverted pendulum with accelerating support point

Basic kinematics reveal that a_B = vψ̇ + bψ̈ and, recalling relationship (1), Eq. (2) now yields the lean equation:

I_xx φ̈ − m g h φ = −m_δ δ̈ − c_δ v δ̇ − k_δ(v) δ   (3)

where m_δ = μmhb > 0, c_δ = mh(μ + σb) > 0, and k_δ(v) = −μmgb + σmhv². Note that v is a constant parameter corresponding to the nominal speed of the rear wheel contact point.

With δ = 0, we have a system whose behavior is characterized by two real eigenvalues: ±√(mgh/I_xx). This system is unstable due to the positive eigenvalue √(mgh/I_xx). For v sufficiently large, the coefficient k_δ(v) is positive, and one can readily show that the above system can be stabilized with positive feedback δ = Kφ provided K > mgh/k_δ(v). This helps explain the stabilizing effect of a rider turning the handlebars in the direction the bike is falling. Actually, the rider input is a steer torque T_δ about the steer axis.

To explain why an uncontrolled motorcycle can be stable or easily stabilized, one also has to look at the effect that φ has on δ; in general, a lean perturbation results in the front assembly turning in the same direction, that is, a positive perturbation of φ results in a positive change in δ.

The lean equation also explains why a motorcycle must lean when cornering above a certain speed. Suppose the motorcycle is in a right-hand corner of radius R at some constant speed v: in this scenario, ψ̇ = v/R and, with δ constant, (1) implies that δ = ψ̇/(σv) = 1/(σR); with δ and φ constant, the lean equation now requires that φ = k_δ(v)δ/(mgh) = k_δ(v)/(σmghR). For higher speeds, k_δ(v) ≈ σmhv²; hence φ ≈ v²/(gR). Since a_B = v²/R, the lean angle φ is approximately a_B/g. Hence, to corner with a lateral acceleration a_B = v²/R, the motorcycle must lean at an angle of approximately a_B/g.

The lean equation can also help explain counter-steering; that is, at speeds above a reasonably low speed, one can initiate a turn by turning the handlebars in the opposite direction to that in which one wants to go; to turn right, one initially turns the handlebars to the left. See Limebeer and Sharp (2006) for further discussion.

Taking into account the mass and inertia of the front assembly, gyroscopic effects, and cross products of inertia of the rear frame, one can

show (see Meijaard et al. 2007) that the lean equation (3) still holds with

m_δ = μ I_xz + I_Ax
c_δ = μmh + σ I_xz + μ S_T + c_ε S_F
k_δ(v) = k0_δ + k2_δ v²
k0_δ = −S_A g,  k2_δ = σ(mh + S_T)

Here I_xx is the moment of inertia of the total motorcycle and rider in the nominal configuration about the b̂₁ axis, and I_xz is the inertia cross product w.r.t. the b̂₁ and b̂₃ axes. The term I_Ax is the front assembly inertia cross product with respect to the steering axis and the b̂₁ axis; see Meijaard et al. (2007) for a further description of this parameter. Also, S_A = μmb + m_A u_A, where m_A is the mass of the front assembly (front wheel and front frame) and u_A is the offset of the mass center of the front assembly from the steering axis, that is, the distance of this mass center from the steering axis; see Fig. 1. The terms S_F = I_Fyy/r_F and S_T = I_Ryy/r_R + I_Fyy/r_F are gyroscopic terms due to the rotation of the front and rear wheels, where r_F and r_R are the radii of the front and rear wheels, while I_Fyy and I_Ryy are the moments of inertia of the front and rear wheels about their axles. It is assumed that the mass center of each wheel is located at its geometric center.

By considering an angular momentum balance about a vertical axis through P, one can obtain an expression for the lateral force at the front wheel. Angular momentum considerations about the steering axis for the front assembly and linearization then yield the steer equation:

m_δφ φ̈ + m_δδ δ̈ + v c_δφ φ̇ + v c_δδ δ̇ + k_δφ φ + k_δδ(v) δ = T_δ   (4)

where

m_δφ = m_δ
m_δδ = I_A + 2μ I_Az + μ² I_zz
k_δφ = k_δ
k_δδ(v) = k0_δδ + k2_δδ v²
k0_δδ = −s_ε S_A g  (s_ε = sin ε)
k2_δδ = σ(S_A + s_ε S_F)
c_δφ = −(μ S_T + c_ε S_F)
c_δδ = μ(S_A + σ I_zz) + σ I_Az

Here, I_zz is the moment of inertia of the total motorcycle and the rider in the nominal configuration about the b̂₃ axis, I_A is the moment of inertia of the front assembly about the steering axis, and I_Az is the front assembly inertia cross product with respect to the steering axis and the vertical axis through P. The lean equation (3) combined with the steer equation (4) provides an initial model for motorcycle dynamics. This is a linear model with the nominal speed v as a constant parameter and the rider's steering torque T_δ as an input.

Modes of Whipple Model

Motorcycle Dynamics and Control, Fig. 6 Variation of eigenvalues of Whipple model with speed v

At v = 0, the linearized Whipple model (3)–(4) has two pairs of real eigenvalues: ±p₁, ±p₂ with p₂ > p₁ > 0; see Fig. 6. The pair ±p₁ roughly describes inverted pendulum behavior of the whole bike with fixed steering, while ±p₂ describes inverted pendulum behavior of the front assembly with the rear frame fixed upright. As v increases, the real eigenvalues corresponding to p₁ and p₂ meet and from there on form a

complex conjugate pair of eigenvalues which results in a single oscillatory mode called the weave mode. Initially the weave mode is unstable, but it is stable above a certain speed v_w, and for large speeds, its eigenvalues are roughly a linear function of v; thus it becomes more damped and its frequency increases with speed. The eigenvalue corresponding to −p₂ remains real and becomes more negative with speed; this is called the castor mode, because it roughly corresponds to the front assembly castoring about the steer axis. The eigenvalue corresponding to −p₁ also remains real but increases, eventually becoming slightly positive above some speed v_c, resulting in an unstable system; the corresponding mode is called the capsize mode. Thus the bike is stable in the autostable speed range (v_w, v_c) and unstable outside this speed range. However, above v_c, the unstable capsize mode is easily stabilized by a rider, usually without conscious effort. This is because the divergence rate of the unstable capsize mode is very small (Astrom et al. 2005).

Sharp71 Model

The Whipple bicycle model is not applicable at higher speeds. In particular, it does not contain a wobble mode, which is common to bicycle and motorcycle behavior at higher speeds (Sharp 1971). A wobble mode is characterized mainly by oscillation of the front assembly about the steering axis and can sometimes be unstable. Also, in a real motorcycle, the damping and frequency of the weave mode do not continually increase with speed; the damping usually starts to decrease after a certain speed, and sometimes this mode even becomes unstable. At higher speeds, one must depart from the simple non-slipping wheel model. In the Whipple model, the lateral force F on a wheel is simply that force which is necessary to maintain the non-holonomic constraint which requires the velocity of the wheel contact point to be parallel to the wheel plane, that is, no sideslip. Actual tires on wheels slip in the longitudinal and lateral directions, and the lateral force depends on slip in the lateral direction, that is, sideslip. This lateral slip is defined by the slip angle α, which is the angle between the contact point velocity and the intersection of the wheel plane and the ground; see Fig. 7.

Motorcycle Dynamics and Control, Fig. 7 Lateral force, slip angle, and camber angle

The lateral force also depends on the tire camber angle φ, which is the roll angle of the tire; motorcycle tires can achieve large camber angles in cornering; modern MotoGP racing motorcycles can achieve camber angles of nearly 65°. Thus an initial linear model of a tire lateral force is given by

F = N(k_α α + k_φ φ)   (5)

where N is the normal force on the tire, k_α > 0 is called the tire cornering stiffness, and k_φ > 0 is called the camber stiffness. Modifying the above Whipple model with the tire force model results in the appearance of the wobble mode. Since lateral forces do not instantaneously respond to changes in slip angle and camber, the dynamic model

(σ/v) Ḟ + F = N(k_α α + k_φ φ)   (6)

is usually used, where σ > 0 is called the relaxation length of the tire. This yields more realistic behavior (Sharp 1971). In this model the weave mode damping eventually decreases at higher speeds and the frequency does not continually increase. The frequency of the wobble mode is higher than that of the weave mode, and its damping decreases at higher speeds.
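To illustrate how the coupled lean/steer equations (3)–(4) are used in practice, the following sketch assembles them in first-order form and examines the eigenvalues as the speed v varies. The coefficient values below are made up purely for illustration (they are not the Whipple/Meijaard benchmark values nor data for any real motorcycle); the point is the construction and the qualitative v = 0 behavior: with the damping term multiplied by v, the eigenvalues at standstill come in two real pairs ±p₁, ±p₂.

```python
import numpy as np

# Hypothetical, illustrative 2x2 coefficient matrices (NOT measured
# motorcycle data): collecting the lean equation (3) and the steer
# equation (4) gives a model of the form
#     M q'' + v C q' + (K0 + v**2 K2) q = (0, T_delta),  q = (phi, delta).
M  = np.array([[80.0,  2.4], [ 2.4,  0.3]])
C  = np.array([[ 0.0, 35.0], [-1.7, 14.0]])
K0 = np.array([[-900.0, -26.0], [-26.0, -10.0]])
K2 = np.array([[ 0.0,  9.0], [ 0.0,  1.0]])

def A(v):
    """System matrix for the state (phi, delta, phi_dot, delta_dot)."""
    Minv = np.linalg.inv(M)
    return np.block([
        [np.zeros((2, 2)),            np.eye(2)],
        [-Minv @ (K0 + v**2 * K2), -v * (Minv @ C)],
    ])

# At v = 0 the speed-dependent damping vanishes, so the eigenvalues
# come in two real pairs +/-p1, +/-p2: two inverted-pendulum-like
# instabilities, as described in the text.
print(np.sort(np.linalg.eigvals(A(0.0)).real))

# Sweeping the speed shows how the largest real part of the spectrum
# evolves; with a real parameter set this sweep exhibits the weave,
# capsize, and castor behavior of Fig. 6.
for v in (1.0, 3.0, 5.0, 8.0):
    print(v, float(np.linalg.eigvals(A(v)).real.max()))
```

With a genuine parameter set (e.g., from Meijaard et al. 2007), the same sweep locates the autostable range (v_w, v_c) as the interval where all eigenvalues have negative real parts.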

Further Models

To obtain nonlinear models, resort is usually made to multi-body simulation codes. In recent years, several researchers have used such codes to make nonlinear models which take into account other features such as frame flexibility, rider models, and aerodynamics; see Cossalter (2006), Cossalter and Lot (2002), Sharp and Limebeer (2001), and Sharp et al. (2004). The nonlinear behavior of the tires is usually modeled with a version of the Magic Formula; see Pacejka (2006), Sharp et al. (2004), and Cossalter et al. (2003). Another line of research is to use some of these models to obtain optimal trajectories for high performance; see Saccon et al. (2012).

Summary and Future Directions

We have presented a basic linearized model of a motorcycle or bicycle useful for the understanding and control of these two-wheeled vehicles. It seems that inclusion of further features in the model and the consideration of full nonlinear behavior require the use of multibody simulation software. Future research will consider models which will include the engine, transmission, and an active pilot. Autonomous control of these vehicles will also be considered.

Cross-References

▶ Pilot-Vehicle System Modeling
▶ Transmission
▶ Vehicle Dynamics Control

Bibliography

Astrom KJ, Klein RE, Lennartsson A (2005) Bicycle dynamics and control. IEEE Control Syst Mag 25(4):26–47
Cossalter V (2006) Motorcycle dynamics. Second English Edition, LULU
Cossalter V, Lot R (2002) A motorcycle multi-body model for real time simulations based on the natural coordinates approach. Veh Syst Dyn 37(6):423–447
Cossalter V, Doria A, Lot R, Ruffo N, Salvador M (2003) Dynamic properties of motorcycle and scooter tires: measurement and comparison. Veh Syst Dyn 39(5):329–352
Herlihy DV (2006) Bicycle: the history. Yale University Press, New Haven
Limebeer DJN, Sharp RS (2006) Bicycles, motorcycles, and models. IEEE Control Syst Mag 26(5):34–61
Meijaard JP, Papadopoulos JM, Ruina A, Schwab AL (2007) Linearized dynamics equations for the balance and steer of a bicycle: a benchmark and review – including appendix. Proc R Soc A 463(2084):1955–1982
Pacejka HB (2006) Tire and vehicle dynamics, 2nd edn. SAE International, Warrendale, PA
Saccon A, Hauser J, Beghi A (2012) Trajectory exploration of a rigid motorcycle model. IEEE Trans Control Syst Technol 20(2):424–437
Sharp RS (1971) The stability and control of motorcycles. J Mech Eng Sci 13(5):316–329
Sharp RS, Limebeer DJN (2001) A motorcycle model for stability and control analysis. Multibody Syst Dyn 6:123–142
Sharp RS, Evangelou S, Limebeer DJN (2004) Advances in the modelling of motorcycle dynamics. Multibody Syst Dyn 12(3):251–283
Whipple FJW (1899) The stability of the motion of a bicycle. Q J Pure Appl Math 30:312–348

Moving Horizon Estimation

James B. Rawlings
University of Wisconsin, Madison, WI, USA

Synonyms

MHE

Abstract

Moving horizon estimation (MHE) is a state estimation method that is particularly useful for nonlinear or constrained dynamic systems for which few general methods with established properties are available. This entry explains the concept of full information estimation and introduces moving horizon estimation as a computable approximation of full information. The basic design

methods for ensuring stability of MHE are presented. The relationships of full information and MHE to other state estimation methods such as Kalman filtering and statistical sampling are discussed.

Keywords

Full information estimation; Kalman filtering; Statistical sampling

Introduction

In state estimation, we consider a dynamic system from which measurements are available. In discrete time, the system description is

x⁺ = f(x, w)   y = h(x) + v   (1)

The state of the system is x ∈ ℝⁿ, the measurement is y ∈ ℝᵖ, and the notation x⁺ means x at the next sample time. A control input u may be included in the model, but it is considered a known variable, and its inclusion is irrelevant to state estimation, so we suppress it in the model under consideration here. We receive measurement y from the sensor, but the process disturbance, w ∈ ℝᵍ; measurement disturbance, v ∈ ℝᵖ; and system initial state, x(0), are considered unknown variables.

The goal of state estimation is to construct or estimate the trajectory of x from only the measurements y. Note that for control purposes, we are usually interested in the estimate of the state at the current time, T, rather than the entire trajectory over the time interval [0, T]. In the moving horizon estimation (MHE) method, we use optimization to achieve this goal. We have two sources of error: the state transition is affected by an unknown process disturbance (or noise), w, and the measurement process is affected by another disturbance, v. In the MHE approach, we formulate the optimization objective to minimize the size of these errors, thus finding a trajectory of the state that comes close to satisfying the (error-free) model while still fitting the measurements.

First, we define some notation necessary to distinguish the system variables from the estimator variables. We have already introduced the system variables (x, w, y, v). In the estimator optimization problem, these have corresponding decision variables, which we denote by the Greek letters (χ, ω, η, ν). The relationships between these variables are

χ⁺ = f(χ, ω)   y = h(χ) + ν   (2)

and they are depicted in Fig. 1. Notice that ν measures the gap between the model prediction η = h(χ) and the measurement y. The optimal decision variables are denoted (x̂, ŵ, ŷ, v̂), and these optimal decisions are the estimates provided by the state estimator.

Full Information Estimation

The full information objective function is

V_T(χ(0), ω) = ℓ_x(χ(0) − x̄₀) + Σ_{i=0}^{T−1} ℓ_i(ω(i), ν(i))   (3)

subject to (2), in which T is the current time, ω is the estimated sequence of process disturbances, (ω(0), …, ω(T−1)), y(i) is the measurement at time i, and x̄₀ is the prior, i.e., available, value of the initial state. Full information here means that we use all the data on the time interval [0, T] to estimate the state (or state trajectory) at time T. The stage cost ℓ_i(ω, ν) costs the model disturbance and the fitting error, the two error sources that we reconcile in all state estimation problems.

The full information estimator is then defined as the solution to

min_{χ(0), ω} V_T(χ(0), ω)   (4)

The solution to the optimization exists for all T ∈ 𝕀₀ under mild continuity assumptions and choice of stage cost. Many choices of (positive, continuous) stage costs ℓ_x(·) and ℓ_i(·) are possible, providing a rich class of estimation problems

Moving Horizon Estimation, Fig. 1 The state, measured output, and disturbance variables appearing in the state estimation optimization problem. The state trajectory (gray circles in lower half) is to be reconstructed given the measurements (black circles in upper half)

that can be tailored to different applications. Because the system model (1) and cost function (3) are so general, it is perhaps best to start off by specializing them to see the connection to some classic results.

Related Problem: The Kalman Filter

If we specialize to the linear dynamic model f(x, w) = Ax + Gw, h(x) = Cx, and let x(0), w, and v be independent, normally distributed random variables, the classic Kalman filter is known to be the statistically optimal estimator, i.e., the Kalman filter produces the state estimate that maximizes the conditional probability of x(T) given y(0), …, y(T). The full information estimator is equivalent to the Kalman filter given the linear model assumption and the following choice of quadratic stage costs

ℓ_x(χ(0) − x̄₀) = (1/2) ‖χ(0) − x̄₀‖²_{P₀⁻¹}
ℓ_i(ω, ν) = (1/2) (‖ω‖²_{Q⁻¹} + ‖ν‖²_{R⁻¹})

in which the random variable x(0) is assumed to have mean x̄₀ and variance P₀, and the random variables w and v are assumed zero mean with variances Q and R, respectively. The Kalman filter is also a recursive solution to the state estimation problem, so that only the current mean x̂ and variance P of the conditional density are required to be stored, instead of the entire history of measurements y(i), i = 0, …, T. This computational efficiency is critical for success in online application for processes with short time scales requiring fast processing.

But if we consider nonlinear models, the maximization of conditional density is usually an intractable problem, especially in online applications. So, MHE becomes a natural alternative for nonlinear models or if an application calls for hard constraints to be imposed on the estimated variables.
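The equivalence between full information estimation and the Kalman filter can be checked numerically. The sketch below (scalar linear system, illustrative parameter values, numpy only) solves the full information problem with the quadratic stage costs above as a weighted linear least-squares problem over the decision variables (χ(0), ω(0), …, ω(T−1)), and compares the resulting estimate of x(T) with the Kalman filter mean computed from the same measurements y(0), …, y(T−1).

```python
import numpy as np

# Scalar system x+ = a x + w, y = c x + v; all numbers illustrative.
rng = np.random.default_rng(1)
a, c = 0.9, 1.0
P0, Q, R = 1.0, 0.1, 0.5
x0bar, T = 0.0, 25

# Simulate data y(0), ..., y(T-1).
x = x0bar + np.sqrt(P0) * rng.standard_normal()
ys = []
for _ in range(T):
    ys.append(c * x + np.sqrt(R) * rng.standard_normal())
    x = a * x + np.sqrt(Q) * rng.standard_normal()

def full_information(ys):
    """Minimize the quadratic cost (3) over z = (chi(0), omega(0..T-1))
    as a weighted least-squares problem; return the estimate of x(T)."""
    T, n = len(ys), len(ys) + 1
    rows, rhs = [], []
    r = np.zeros(n); r[0] = 1 / np.sqrt(P0)          # prior term
    rows.append(r); rhs.append(x0bar / np.sqrt(P0))
    for i in range(T):                               # disturbance terms
        r = np.zeros(n); r[1 + i] = 1 / np.sqrt(Q)
        rows.append(r); rhs.append(0.0)
    for i in range(T):                               # fitting-error terms
        r = np.zeros(n); r[0] = c * a**i
        for j in range(i):
            r[1 + j] = c * a**(i - 1 - j)            # chi(i) in terms of chi(0), omega
        rows.append(r / np.sqrt(R)); rhs.append(ys[i] / np.sqrt(R))
    z, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    chi = z[0]
    for i in range(T):                               # propagate chi to time T
        chi = a * chi + z[1 + i]
    return chi

def kalman_prediction(ys):
    """Standard Kalman filter; return the mean of x(T) given y(0..T-1)."""
    xhat, P = x0bar, P0
    for y in ys:
        K = P * c / (c**2 * P + R)
        xhat += K * (y - c * xhat)                   # measurement update
        P -= K * c * P
        xhat, P = a * xhat, a**2 * P + Q             # time update
    return xhat

print(full_information(ys), kalman_prediction(ys))   # agree up to roundoff
```

Because the problem is linear with Gaussian noise, the two numbers coincide to machine precision, as the equivalence above asserts; for a nonlinear f or h, or with constraints on χ and ω, the least-squares step would become a general (and possibly nonconvex) optimization, which is exactly the setting where MHE is used.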

Moving the Horizon

An obvious problem with solving the full information optimization problem is that the number of decision variables grows linearly with time T, which quickly renders the problem intractable for continuous processes that have no final time. A natural alternative to full information is to consider instead a finite moving horizon of the most recent N measurements. Figure 2 displays this idea. The initial condition χ(0) is now replaced by the initial state in the horizon, χ(T−N), and the decision variable sequence of process disturbances is now just the last N variables ω = (ω(T−N), …, ω(T−1)). Now, the big question remaining is what to do about the neglected, past data. This question is strongly related to what penalty to use on the initial state in the horizon χ(T−N). If we make this initial state a free variable, that is equivalent to completely discounting the past data. If we wish to retain some of the influence of the past data and keep the moving horizon estimation problem close to the full information problem, then we must choose an appropriate penalty for the initial state. We discuss this problem next.

Moving Horizon Estimation, Fig. 2 Schematic of the moving horizon estimation problem

Arrival Cost. When time is less than or equal to the horizon length, T ≤ N, we can simply do full information estimation. So we assume throughout that T > N. For T > N, we express the MHE objective function as

V̂_T(χ(T−N), ω) = Γ_{T−N}(χ(T−N)) + Σ_{i=T−N}^{T−1} ℓ_i(ω(i), ν(i))

subject to (2). The MHE problem is defined to be

min_{χ(T−N), ω} V̂_T(χ(T−N), ω)   (5)

in which ω = {ω(T−N), …, ω(T−1)} and the hat on V distinguishes the MHE objective function from full information. The designer must now choose the prior weighting Γ_k(·) for k > N.

To think about how to choose this prior weighting, it is helpful to first think about solving the full information problem by breaking it into two non-overlapping sequences of decision variables: the decision variables in the time interval corresponding to the neglected data, (ω(0), ω(1), …, ω(T−N−1)), and those in the time interval corresponding to the considered data in the horizon, (ω(T−N), …, ω(T−1)). If we optimize over the first sequence of variables and store the solution as a function of the terminal state χ(T−N), we have defined what is known as the arrival cost. This is the optimal cost to arrive at a given state value.

Definition 1 (arrival cost) The (full information) arrival cost is defined for k ≥ 1 as

Z_k(x) = min_{χ(0), ω} V_k(χ(0), ω)   (6)

subject to (2) and χ(k; χ(0), ω) = x.

Notice the terminal constraint that χ at time k ends at value x. Given this arrival cost function, we can then solve the full information problem by optimizing over the remaining decision variables. What we have described is simply the dynamic programming strategy for optimizing over a sum of stage costs with a dynamic model (Bertsekas 1995).

We have the following important equivalence.

Lemma 1 (MHE and full information estimation) The MHE problem (5) is equivalent to
Moving Horizon Estimation 803

the full information problem (4) for the choice


k ./ D Zk ./ for all k > N and N  1.
Using dynamic programming to decompose the
full information problem into an MHE prob-
lem with an arrival cost penalty is conceptu-
ally important to understand the structure of the
problem, but it doesn’t yet provide us with an
implementable estimation strategy because we
cannot compute and store the arrival cost when
the model is nonlinear or other constraints are
present in the problem. But if we are not too
worried about the optimality of the estimator and
are mainly interested in other properties, such Moving Horizon Estimation, Fig. 3 Arrival cost Zk .x/,
as stability of the estimator, we can find simpler underbounding prior weighting k .x/, and MHE optimal
design methods for choosing the weighting k ./. value VOk0 ; for all x and k > N , Zk .x/  k .x/  VOk0 ,
We address this issue next. O
and Zk .x.k// D k .x.k//
O D VOk0
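The equivalence in Lemma 1 can be checked numerically on a small linear-quadratic example, in which the full information problem, the arrival cost, and the MHE problem are all least-squares problems. The sketch below uses plain NumPy; the scalar model, the weights, and the simulated data are illustrative assumptions and are not taken from this entry. It evaluates the arrival cost Z_k by equality-constrained least squares, recovers its (exactly quadratic) form from three samples, and verifies that MHE with the choice Γ_k = Z_k reproduces the optimal value and terminal state estimate of the full information problem:

```python
import numpy as np

# Scalar linear system chi+ = a*chi + omega, y = c*chi + nu, quadratic stage
# costs.  All numbers (a, c, weights, data record) are illustrative choices.
a, c = 0.9, 1.0
sp, sq, sr = 1.0, 0.1, 0.04     # weights of the prior, omega, and nu terms
T, N = 8, 3                     # data length and MHE horizon; split at k = T - N
k = T - N
rng = np.random.default_rng(1)
ys = np.cumsum(rng.standard_normal(T))   # any measurement record works here
xbar0 = 0.0                              # prior guess for chi(0)

def state_rows(n):
    """S[t] @ z gives chi_t for z = (chi_0, omega_0, ..., omega_{n-1})."""
    S = np.zeros((n + 1, n + 1))
    S[0, 0] = 1.0
    for t in range(n):
        S[t + 1] = a * S[t]
        S[t + 1, t + 1] = 1.0
    return S

def stack(t0, n, prior_rows, prior_rhs):
    """Least-squares rows of the cost over (chi_{t0}, omega_{t0..t0+n-1})."""
    S = state_rows(n)
    rows, rhs = list(prior_rows), list(prior_rhs)
    for t in range(n):
        e = np.zeros(n + 1); e[t + 1] = 1.0
        rows.append(e / np.sqrt(sq)); rhs.append(0.0)                    # omega cost
        rows.append(c * S[t] / np.sqrt(sr)); rhs.append(ys[t0 + t] / np.sqrt(sr))
    return np.array(rows), np.array(rhs), S

# Full information problem (cf. (4)): prior on chi(0) plus all stage costs.
e0 = np.zeros(T + 1); e0[0] = 1.0
A, b, S = stack(0, T, [e0 / np.sqrt(sp)], [xbar0 / np.sqrt(sp)])
z = np.linalg.lstsq(A, b, rcond=None)[0]
V_full, xT_full = float(np.sum((A @ z - b) ** 2)), float(S[T] @ z)

# Arrival cost Z_k(x) (cf. (6)): the same cost restricted to times t < k,
# with the terminal constraint chi_k = x, solved via the KKT system.
def Z(xval):
    e = np.zeros(k + 1); e[0] = 1.0
    A1, b1, S1 = stack(0, k, [e / np.sqrt(sp)], [xbar0 / np.sqrt(sp)])
    n1 = k + 1
    K = np.zeros((n1 + 1, n1 + 1))
    K[:n1, :n1] = 2 * A1.T @ A1
    K[:n1, n1] = S1[k]
    K[n1, :n1] = S1[k]
    sol = np.linalg.solve(K, np.concatenate([2 * A1.T @ b1, [xval]]))
    return float(np.sum((A1 @ sol[:n1] - b1) ** 2))

# Z_k is exactly quadratic here, so three samples recover al*x^2 + be*x + ga.
al, be, ga = np.polyfit([-1.0, 0.0, 1.0], [Z(v) for v in (-1.0, 0.0, 1.0)], 2)

# MHE (cf. (5)) with Gamma_k = Z_k: the prior row encodes al*(x + be/(2 al))^2,
# and the leftover constant ga - be^2/(4 al) is added back to the value.
e0 = np.zeros(N + 1); e0[0] = np.sqrt(al)
A2, b2, S2 = stack(k, N, [e0], [-be / (2 * np.sqrt(al))])
z2 = np.linalg.lstsq(A2, b2, rcond=None)[0]
V_mhe = float(np.sum((A2 @ z2 - b2) ** 2) + ga - be ** 2 / (4 * al))
xT_mhe = float(S2[N] @ z2)

print(V_full, V_mhe)    # equal up to round-off
print(xT_full, xT_mhe)  # identical terminal state estimates
```

The agreement holds to round-off only because this example is linear and unconstrained; as noted above, with nonlinear models or additional constraints the arrival cost can no longer be computed and stored in closed form.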

Estimator Properties: Stability

An estimator is termed stable if small disturbances (w, v) lead to small estimate errors x − x̂ as time increases. Precise definitions of this basic idea are available elsewhere (Rawlings and Ji 2012), but this basic notion is sufficient for the purposes of this overview. In applications, properties such as stability and insensitivity to model errors are usually more important than optimality. It is possible for a filter to be optimal and still not stable. In the linear system context, this cannot happen for "nice" systems. Such nice systems are classified as detectable. Again, the precise definition of detectability for the linear case is available in standard references (Kwakernaak and Sivan 1972). Defining detectability for nonlinear systems is a more delicate affair, but useful definitions are becoming available for the nonlinear case as well (Sontag and Wang 1997).

If we lower our sights and do not worry if MHE is equivalent to full information estimation, and require only that it be a stable estimator, then the key result is that the prior penalty Γ_k(·) need only be chosen smaller than the arrival cost as shown in Fig. 3. See Rawlings and Mayne (2009, Theorem 4.20) for a precise statement of this result. Of course this condition includes the flat arrival cost, which does not penalize the initial state in the horizon at all. So neglecting the past data completely leads to a stable estimator for detectable systems. If we want to improve on this performance, we can increase the prior penalty, and we are guaranteed to remain stable as long as we stay below the upper limit set by the arrival cost.

Related Problem: Statistical Sampling

MHE is based on optimizing an objective function that bears some relationship to the conditional probability of the state (trajectory) given the measurements. As discussed in the section on the Kalman filter, if the system is linear with normally distributed noise, this relationship can be made exact, and MHE is therefore an optimal statistical estimator. But in the nonlinear case, the objective function is chosen with engineering judgment and is only a surrogate for the conditional probability. By contrast, sampling methods such as particle filtering are designed to sample the conditional density also in the nonlinear case. The mean and variance of the samples then provide estimates of the mean and variance of the conditional density of interest. In the limit of infinitely many samples, these methods are exact. The efficiency of the sampling methods depends strongly on the model and the dimension of the state vector n, however. The efficiency of the sampling strategy is particularly important for online use of state estimators. Rawlings and Bakshi (2006) and Rawlings and Mayne (2009, pp. 329–355) provide some comparisons of particle filtering with MHE and also describe some hybrid methods combining MHE and particle filtering.

Summary and Future Directions

MHE is one of few state estimation methods that can be applied to nonlinear models for which properties such as estimator stability can be established (Rao et al. 2003; Rawlings and Mayne 2009). The required online solution of an optimization problem is computationally demanding in some applications but can provide significant benefits in estimator accuracy and rate of convergence (Patwardhan et al. 2012). Current topics for MHE theoretical research include treating bounded rather than convergent disturbances and establishing properties of suboptimal MHE (Rawlings and Ji 2012). The current main focus for MHE applied research involves reducing the online computational complexity to reliably handle challenging large dimensional, nonlinear applications (Kuhl et al. 2011; Lopez-Negrete and Biegler 2012; Zavala and Biegler 2009; Zavala et al. 2008).

Cross-References

▸ Bounds on Estimation
▸ Estimation, Survey on
▸ Extended Kalman Filters
▸ Nonlinear Filters
▸ Particle Filters

Recommended Reading

Moving horizon estimation has by this point a fairly extensive literature; a recent overview is provided in Rawlings and Mayne (2009, pp. 356–357). The following references provide either (i) general background required to understand MHE theory and its relationship to other methods or (ii) computational methods for solving the real-time MHE optimization problem or (iii) challenging nonlinear applications that demonstrate benefits and probe the current limits of MHE implementations.

Bibliography

Bertsekas DP (1995) Dynamic programming and optimal control, vol 1. Athena Scientific, Belmont
Kuhl P, Diehl M, Kraus T, Schloder JP, Bock HG (2011) A real-time algorithm for moving horizon state and parameter estimation. Comput Chem Eng 35:71–83
Kwakernaak H, Sivan R (1972) Linear optimal control systems. Wiley, New York. ISBN:0-471-51110-2
Lopez-Negrete R, Biegler LT (2012) A moving horizon estimator for processes with multi-rate measurements: a nonlinear programming sensitivity approach. J Process Control 22:677–688
Patwardhan SC, Narasimhan S, Jagadeesan P, Gopaluni B, Shah SL (2012) Nonlinear Bayesian state estimation: a review of recent developments. Control Eng Pract 20:933–953
Rao CV, Rawlings JB, Mayne DQ (2003) Constrained state estimation for nonlinear discrete-time systems: stability and moving horizon approximations. IEEE Trans Autom Control 48(2):246–258
Rawlings JB, Bakshi BR (2006) Particle filtering and moving horizon estimation. Comput Chem Eng 30:1529–1541
Rawlings JB, Ji L (2012) Optimization-based state estimation: current status and some new results. J Process Control 22:1439–1444
Rawlings JB, Mayne DQ (2009) Model predictive control: theory and design. Nob Hill Publishing, Madison, 576 p. ISBN:978-0-9759377-0-9
Sontag ED, Wang Y (1997) Output-to-state stability and detectability of nonlinear systems. Syst Control Lett 29:279–290
Zavala VM, Biegler LT (2009) Optimization-based strategies for the operation of low-density polyethylene tubular reactors: nonlinear model predictive control. Comput Chem Eng 33(10):1735–1746
Zavala VM, Laird CD, Biegler LT (2008) A fast moving horizon estimation algorithm based on nonlinear programming sensitivity. J Process Control 18:876–884

MPC

▸ Model-Predictive Control in Practice
MRAC

▸ Model Reference Adaptive Control

MSPCA

▸ Multiscale Multivariate Statistical Process Control

Multi-domain Modeling and Simulation

Martin Otter
Institute of System Dynamics and Control, German Aerospace Center (DLR), Wessling, Germany

Abstract

One starting point for the analysis and design of a control system is the block diagram representation of a plant. Since it is nontrivial to convert a physical model of a plant into a block diagram, this can be performed manually only for small plant models. Based on research from the last 35 years, more and more mature tools are available to achieve this transformation fully automatically. As a result, multi-domain plants, for example, systems with electrical, mechanical, thermal, and fluid parts, can be modeled in a unified way and can be used directly as input–output blocks for control system design. An overview of the basic principles of this approach is given. This provides also the possibility to use nonlinear, multi-domain plant models directly in a controller. Finally, the low-level "Functional Mockup Interface" standard is sketched to exchange multi-domain models between many different modeling and simulation environments.

Keywords

Block diagram; Bond graph; Differential-algebraic equation (DAE) system; Flow variable; FMI for Co-Simulation; FMI for Model Exchange; Functional Mockup Interface; Inverse models; Modelica; Object-oriented modeling; Potential variable; Stream variable; Symbolic transformation; VHDL-AMS

Introduction

Methods and tools for control system analysis and design usually require an input–output block diagram description of the plant to be controlled. Apart from small systems, it is nontrivial to derive such models from first principles of physics. For a long time, methods and tools have been available to construct such models automatically for one domain, for example, a mechanical model, an electronic, or a hydraulic circuit. These domain-specific methods and tools are, however, only of limited use for the modeling of multi-domain systems.

In the dissertation of Elmqvist (1978), a suitable approach for multi-domain, object-oriented modeling was developed by introducing a modeling language to define models on a high level based on first principles. The resulting DAE (differential-algebraic equation) systems are transformed with proper algorithms automatically into a block diagram description with input and output signals based on ODEs (ordinary differential equations).

In 1978, computers were not powerful enough to apply this method to larger systems. This changed in the 1990s; since then, the technology has been substantially improved, many different modeling languages appeared (and also disappeared), and the technology was introduced in commercial simulation environments.

In Table 1, an overview of the most important standards, languages, and tools in the year 2013 for multi-domain modeling is given:
• The Modelica language is a standard from The Modelica Association (Modelica Association 2012). The first version was released in 1997. Also a large free library is provided with about 1,300 model components from many domains. There are several software tools supporting this modeling language and the free Modelica
Multi-domain Modeling and Simulation, Table 1 Multi-domain modeling and simulation environments
Tool name Web (accessed December 2013)
Environments based on the Modelica Standard (https://www.Modelica.org)
CyModelica http://cydesign.com/
Dymola http://www.dymola.com/
JModelica.org http://www.jmodelica.org/
LMS Imagine.Lab AMESim http://www.lmsintl.com/LMS-Imagine-Lab-AMESim
MapleSim http://www.maplesoft.com/products/maplesim
MWorks http://en.tongyuan.cc/
OpenModelica https://openmodelica.org/
SimulationX http://www.itisim.com/simulationx/
Wolfram SystemModeler http://www.wolfram.com/system-modeler/
Environments based on the VHDL-AMS Standard (http://www.eda.org/twiki/bin/view.cgi/P10761)
ANSYS Simplorer http://www.ansys.com/Products
Saber http://www.synopsys.com/Systems/Saber
SMASH http://www.dolphin.fr/medal/products/smash/smashoverview.php
SystemVision http://www.mentor.com/products/sm/systemvision
Virtuoso AMS designer http://www.cadence.com
Environments with vendor-specific multi-domain modeling languages
EcosimPro http://www.ecosimpro.com/
gPROMS http://www.psenterprise.com/gproms
OpenMAST http://www.openmast.org/
Simscape https://www.mathworks.com/products/simscape
Environments based on the Bond Graph Methodology
20-sim http://www.20sim.com/

Standard Library. The examples of this entry are mostly provided from this standard.

The following registered trademarks are referenced:

Registered trademark   Owner of trademark
AMESim                 IMAGINE SA
ANSYS                  ANSYS Inc.
Dymola                 Dassault Systemes AB
EcosimPro              Empresarios Agrupados A.I.E.
gPROMS                 Process Systems Enterprise Limited
MATLAB                 The MathWorks Inc
Modelica               Modelica Association
Saber                  Sabremark Limited partnership
SimulationX            ITI GmbH
Simulink               The MathWorks Inc
SystemVision           Mentor Graphics Corporation
Virtuoso               Cadence Design

• The VHDL-AMS language is a standard from IEEE (IEEE 1076.1-2007 2007), first released in 1999. It is an extension of the widely used VHDL hardware description language. This language is especially used in the electronics community.
• There are several vendor-specific modeling languages, notably Simscape from MathWorks as an extension to Simulink, as well as MAST, the underlying modeling language of Saber (Mantoolh and Vlach 1992). In 2004, MAST was published as OpenMAST under an open source license.
• Bond graphs (see, e.g., Karnopp et al. 2012) are a special graphical notation to define multi-domain systems based on energy flow. It was invented in 1959 by Henry M. Paynter.

In the section "Modeling Language Principles", the principles of multi-domain modeling based on a modeling language are summarized. In the section "Models for Control Systems", it is shown how such models can be used not only for simulation but also as components in nonlinear control systems. Finally, in the section "The Functional Mockup Interface", an overview about a low-level standard for the exchange of multi-domain systems is described.

Modeling Language Principles

Schematics: The Graphical View
Modelers nowadays require a simple to use graphical environment to build up models. With very few exceptions, multi-domain environments define models by schematic diagrams. A typical example is given in Fig. 1, showing a simple direct-current electrical motor in Modelica.

In the lower left part, the electrical circuit diagram of the DC motor is visible, consisting mainly of the armature resistance and inductance of the motor, a voltage source, and component "emf" to model in an idealized way the electromotoric forces in the air gap. On the lower right part, the motor inertia, a gear box, and a load inertia are present. In the upper part, the heat transfer of the resistor losses to the environment is modeled with lumped elements.

Multi-domain Modeling and Simulation, Fig. 1 Modelica schematic of DC motor with mechanical load and heat losses

A component, like a resistor, rotational inertia, or convective heat transfer, is shown as an icon in the diagram. On the border of a component, small rectangular or circular signs are present representing the "physical ports." Ports are connected by lines and model the (idealized) physical or signal interaction between ports of different components, for example, the flow of electrical current or heat or the rigid mechanical coupling. Components are built up hierarchically from other components. On the lowest level, components are described textually with the respective modeling language (see section "Component Equations").

Coupling Components by Ports
The ports define how a component can interact with other components. A port contains (a) a definition of the variables that describe the interface and (b) defines in which way a tool can automatically construct the equations of connections. A typical scenario is shown in Fig. 2 where the ports of the three components A, B, C are connected together at one point P:
When cutting away the connection lines, the resulting system consists of three decoupled components A, B, C and a new component around P describing the infinitesimally small connection point. The balance equations and the boundary conditions of the respective domain must hold at all these components. When drawing the connection lines, enough information must be available in the port definitions so that the tool can construct the equations of the infinitesimally small connection points automatically.

To summarize, the component developer is responsible that the balance equations and boundary conditions are fulfilled for every component (A, B, C in Fig. 2), and the tool is responsible that the balance equations and boundary conditions are also fulfilled at the points where the components are connected together (P in Fig. 2). As a consequence, the balance equations and boundary conditions are fulfilled in the overall model containing all components and all connections.

Multi-domain Modeling and Simulation, Fig. 2 Cutting the connections around the connection point P results in three decoupled components A, B, C and a new component around P describing the infinitesimally small connection point

In order that a tool can automatically construct the equations at a connection point, every port variable needs to be associated to a port variable type. In Table 2, some port variable types of Modelica are shown. In this table it is assumed that u1, u2, ..., un, y, v1, v2, ..., vn, f1, f2, ..., fn, s1, s2, ..., sn are corresponding port variables from different components that are connected together at the same point P.

Port variable types "input" and "output" define the "usual" signal connections in block diagrams. "Potential variables" and "flow variables" are used to define standard physical connections. For example, an electrical port contains the electrical potential and the electrical current at the port, and when connecting electrical ports together, the electrical potentials are identical and the sum of the electrical currents is zero, according to Table 2. This corresponds exactly to Kirchhoff's voltage and current laws.

"Stream variables" are used to describe the connection semantics of intensive quantities in bidirectional fluid flow, such as specific enthalpy or mass fraction. Here, the idealized balance equation at a connection point states, for example, that the sum of the port enthalpy flow rates is zero and the port enthalpy flow rate is computed as the product of the mass flow rate (a flow variable fi) and the directional specific enthalpy ŝi, which is either the (yet unknown) mixing-specific enthalpy smix when the flow is from the connection point to the port or the specific enthalpy si in the port when the flow is from the port to the connection point. More details

Multi-domain Modeling and Simulation, Table 2 Some port variable types in Modelica

Input variables ui, output variable y:
  u1 = u2 = ... = un = y  (exactly one output variable can be connected to n input variables)
Potential variables vi:
  v1 = v2 = ... = vn
Flow variables fi:
  0 = Σ fi
Stream variables si (with associated flow variables fi):
  0 = Σ fi·ŝi, where ŝi = smix if fi > 0 and ŝi = si if fi ≤ 0  (in addition, 0 = Σ fi)
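The potential- and flow-variable semantics of Table 2 is what a tool applies mechanically when it expands a connection set into equations. A small sketch of that expansion (the component and variable names are hypothetical; production tools operate on their internal equation representation):

```python
# Expand one connection set into equations, following Table 2:
# potentials at the point are set equal, and flow variables sum to zero.
def connection_equations(ports):
    """ports: (component, potential_var, flow_var) tuples joined at one point."""
    first = ports[0]
    eqs = [f"{first[0]}.{first[1]} = {p[0]}.{p[1]}" for p in ports[1:]]
    eqs.append("0 = " + " + ".join(f"{p[0]}.{p[2]}" for p in ports))
    return eqs

# Three electrical pins (potential v, flow i) connected at one node:
eqs = connection_equations([("R1", "v", "i"), ("C1", "v", "i"), ("L1", "v", "i")])
for e in eqs:
    print(e)   # R1.v = C1.v / R1.v = L1.v / 0 = R1.i + C1.i + L1.i
```

The printed equations are exactly Kirchhoff's voltage and current laws for the node; applied to, say, the rotational port of Table 3, the same routine would equate angles and sum cut-torques to zero.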
Multi-domain Modeling and Simulation, Table 3 Some port definitions from Modelica

Electrical analog:
  electrical potential in [V] (pot.); electrical current in [A] (flow)
Elec. multiphase:
  vector of electrical ports
Electrical quasi-stationary:
  complex electrical potential (pot.); complex electrical current (flow)
Magnetic flux tubes:
  magnetic potential in [A] (pot.); magnetic flux in [Wb] (flow)
Translational (1-dim. mechanics):
  distance in [m] (pot.); cut-force in [N] (flow)
Rotational (1-dim. mechanics):
  absolute angle in [rad] (pot.); cut-torque in [Nm] (flow)
2-dim. mechanics:
  position in x-direction in [m] (pot.); position in y-direction in [m] (pot.); absolute angle in [rad] (pot.); cut-force in x-direction in [N] (flow); cut-force in y-direction in [N] (flow); cut-torque in z-direction in [Nm] (flow)
3-dim. mechanics:
  position vector in [m] (pot.); transformation matrix in [1] (pot.); cut-force vector in [N] (flow); cut-torque vector in [Nm] (flow)
1-dim. heat transfer:
  temperature in [K] (pot.); heat flow rate in [W] (flow)
1-dim. thermo-fluid pipe flow:
  pressure in [Pa] (pot.); mass flow rate in [kg/s] (flow); spec. enthalpy in [J/kg] (stream); mass fractions in [1] (stream)
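The port definitions of Table 3 plug directly into the connection rules of Table 2. As a numeric illustration with the 1-dim. heat transfer port (temperature as potential variable, heat flow rate as flow variable), consider two heat conduction components with fixed outer temperatures joined at one node; all values below are illustrative:

```python
# Two conduction components meet at node N.  Each drives heat flow
# Q = G*(T_fixed - T_N) into the node; the flow-variable rule of Table 2
# requires the heat flows at the connection point to sum to zero.
G1, T1 = 2.0, 100.0   # conductance [W/K] and fixed temperature [degC], side 1
G2, T2 = 8.0, 20.0    # side 2
# Connection equation 0 = G1*(T1 - TN) + G2*(T2 - TN), solved for TN:
TN = (G1 * T1 + G2 * T2) / (G1 + G2)
Q1 = G1 * (T1 - TN)   # heat flow from side 1 into the node
Q2 = G2 * (T2 - TN)   # heat flow from side 2 into the node
print(TN)             # 36.0
print(Q1 + Q2)        # 0.0: the flows balance at the connection point
```

The node temperature is the conductance-weighted mean of the fixed temperatures, which is exactly what the automatically generated connection equations enforce.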

and explanations are available from Franke et al. (2009). In Table 3 some of the port definitions are shown that are defined in the Modelica Standard Library.

Component Equations
Implementing a component in a modeling language means to (a) define the ports of the component and (b) provide the equations describing the relationships between the port variables. For example, an electrical capacitor with constant capacitance C can be defined by the equations in the right side of Fig. 3:

  0 = ip + in
  u = vp - vn
  C*der(u) = ip

Multi-domain Modeling and Simulation, Fig. 3 Equations of a capacitor component

Such a component has two ports, the pins "p" and "n," and the port variables are the electrical currents ip, in flowing into the respective ports and the electrical potentials vp, vn at the ports. The first component equation states that if the current ip at port "p" is positive, then the current in at port "n" is negative (therefore, the current flowing into "p" is flowing out of "n"). Furthermore, the two remaining equations state that the derivative of the difference of the port potentials is proportional to the current flowing into port "p."

One important question is how many equations are needed to describe such a component? For an input–output block, this is simple: all input variables are known, and for all other variables, one equation per unknown is needed. Counting equations for physical components, such as a capacitor, is more involved: the requirement

that any type of component connections shall stead, in VHDL-AMS (and some other modeling
always result in identical numbers of unknowns languages), port variables cannot be accessed in
and equations of the overall system leads to the a model, and instead via the “quantity .. across
following counting rule (for a proof, see Olsson .. through .. to ..” construction, the relationships
et al. 2008): between the port variables are implicitly defined
1. The number of potential and the number of and correspond to the Modelica equations “0 D
flow variables in a port must be identical. p.i C n.i” and “u D p.v  n. v.”
2. Input variables and variables that appear dif-
ferentiated are treated as known variables. Simulation of Multi-domain Systems
3. The number of equations of a component must Collecting all the component equations of
be equal to the number of unknowns minus the a multi-domain system model together with
number of flow variables. all connection equations results in a DAE
In the example of the capacitor, there are 5 un- (differential-algebraic equation) system:
knowns (ip , in , vp ; vn , du/dt) and 2 flow variables
(ip , in ). Therefore, 52 D 3 equations are needed
0 D f.Px; x; w; y; u; t/ (1)
to define this component.
Modeling languages are used to provide a tex-
tual description of the ports and of the equations where t 2 R is time, x.t/ 2 Rnx are vari-
in a specific syntax. For example, in Modelica ables appearing differentiated, w.t/ 2 Rnw are
the capacitor from Fig. 3 can be defined as Fig. 4 algebraic variables, y.t/ 2 Rny are outputs,
(keywords of the Modelica language are written u.t/ 2 Rnu are inputs, and f 2 Rnx Cnw Cny are
in boldface): the DAE equations. Equation (1) can be solved
In VHDL-AMS the capacitor model can be numerically with an integrator for DAE systems;
defined as shown in Fig. 5. see, for example, Brenan et al. (1996). For DAEs
One difference between Modelica and VHDL- that are linear in their unknowns, a complete
AMS is that in Modelica all equations need to be theory for solvability is available based on matrix
explicitly given and port variables (such as p.i) pencils (see, e.g., Brenan et al. 1996) and also
can be directly accessed in the model (Fig. 4). In- reliable software for their analysis (Varga 2000).
Unfortunately, only certain classes of nonlin-
ear DAEs can be directly solved numerically

type Voltage = Real (unit="V");


type Current = Real (unit="A");
subtype voltage is real;
subtype current is real;
connector Pin nature electrical is
Voltage v; voltage across
flow Current i; current through
end Pin; electrical_ref reference;

model Capacitor entity CapacitorInterface IS


parameter Real C(unit="F"); generic(C: real);
Pin p,n; port (terminal p, n: electrical);
end entity CapacitorInterface;
Voltage u;
architecture SimpleCapacitor of
equation
CapacitorInterface is
0 = p.i + n.i;
quantity u across i through p to n;
u = p.v – n.v;
begin
C*der(u) = p.i; i == C*u’dot;
end Capacitor; end architecture SimpleCapacitor;

Multi-domain Modeling and Simulation, Fig. 4 Multi-domain Modeling and Simulation, Fig. 5
Modelica model of capacitor component VHDL-AMS model of capacitor component
MRAC 811

in a reliable way. Domain-specific software, state, that is, it is required that xP 0 D 0 and
as, e.g., for mechanical systems, transforms therefore at the initial time (1) is required to
the underlying DAE into a form that can be satisfy
more reliably solved, using domain-specific
knowledge. This is performed by differentiating 0 D f .0; x0 ; w0 ; y0 ; u0 ; t0 / (2)
certain equations of the DAE analytically and
utilizing special integration methods for the Equation (2) is a nonlinear algebraic system of
resulting overdetermined set of differential- equations in the unknowns x0 ; w0 , y0 ; u0 . These
algebraic equations. Multi-domain simulation are nx C nw C ny equations for nx C nw C ny C nu
software uses the following approaches: unknowns. Therefore, nu further conditions must
(a) The DAE (1) is directly solved numerically be provided (usually some elements of u0 and/or
using an implicit integration method, such y0 are fixed to desired physical values). Solving
as a linear multistep method. Typically, all (2) for the unknowns is also called “DC operat-
VHDL-AMS simulators use this approach. ing point calculation” or “trimming.” Nonlinear
(b) The DAE (1) is symbolically transformed in equation solvers are based on iterative methods
a form that is equivalent to a set of ODEs that require usually a sufficiently accurate initial
(ordinary differential equations), and then guess for all unknowns. In a large multi-domain
either explicit or implicit ODE or DAE in- system model, this is not practical, and therefore,
tegration methods are used to numerically methods are needed to solve (2) even if generic
solve the transformed system. The transfor- guess values in a library are provided that might
mation is based on the algorithms of Pan- be far from the solution of the system at hand.
telides (1988) and of Mattsson and Söderlind For analog electronic circuit simulations, a
(1993) and might require to analytically dif- large body of theory, algorithms, and software is
ferentiate equations. Typically, all Modelica- available to solve (2) based on homotopy meth- M
based simulators, but also EcosimPro, use ods. The basic idea is to solve a sequence of non-
this approach. linear algebraic equation systems by starting with
For many models both approaches can be applied an easy to solve simplified system, characterized
successfully. There are, however, systems where by the homotopy parameter D 0. This system
approach (a) is successful and fails for (b) or vice is continuously “deformed” until the desired one
versa. is reached at D 1. The solution at iteration i
DAEs (1) derived from modeling languages is used as guess value for iteration i C 1, and at
usually have a large number of equations but with every iteration, the solution is usually computed
only a few unknowns in every equation. In order with a Newton-Raphson method.
to solve DAEs of this kind efficiently, both with The simplest such approach is “source step-
(a) or (b), typically graph theory and/or sparse ping”: the initial guess values of all electrical
matrix methods are utilized. For method (b) the components are set to “zero voltage” and/or “zero
fundamental algorithms have been developed in current.” All (voltage and current) sources start
Elmqvist (1978) and later improved in further at zero, and their values are gradually increased
publications. For a recent survey and comparison until the desired source values are reached. This
of some of the algorithms, see Frenkel et al. method may not converge, typically due to the
(2012). severe nonlinearities at switching thresholds in
Solving the DAE (1) means to solve an ini- logical circuits.
tial value problem. In order that this can be There are several, more involved approaches,
performed, a consistent set of initial variables called “probability one homotopy” methods. For
xP 0 D xP .t0 / ; x0 D x .t0 / ; w0 D w .t0 / ; y0 D these method classes, proofs exist that they con-
y .t0 / ; u0 D u .t0 / has to be determined first at verge with probability one (so practically al-
the initial time t0 : In general, this is a nontrivial ways). These algorithms can only be applied for
task. For example, often (1) shall start in steady certain classes of DAEs; see, for example, the
812 MRAC

“Variable Stimulus Probability One Homotopy” Multi-domain models might also be used
from Melville et al. (1993). directly in nonlinear Kalman filters, moving
Although strong results exist for analog elec- horizon estimators, or nonlinear model predictive
trical circuit simulators, it is difficult to generalize control. For example, the company ABB is using
them to the large class of multi-domain systems moving horizon estimation and nonlinear model
covered by a modeling language. In Modelica predictive control based on Modelica models
a “homotopy” operator was introduced into the to significantly improve the start-up process of
language (Sielemann et al. 2011) in order that a power plants (Franke and Doppelhamer 2006).
library developer can formulate simple homotopy
methods like the “source stepping” in a com- Inverse Models
ponent library. A generalization of probability-one methods for multi-domain systems was developed in the dissertation of Sielemann (2012) and was successfully applied to air distribution systems described as 1-dim. thermo-fluid pipe flow.

Models for Control Systems

Models for Analysis
The multi-domain models from section “Modeling Language Principles” can be utilized to evaluate the properties of a control system by simulation. Control systems can also be designed by nonlinear optimization, where at every optimization step one or several simulations of a plant model are executed. Furthermore, modeling environments usually provide a means to linearize the nonlinear DAE (1) of the underlying model around an operating point:

x(t) = x_op + Δx(t),  w(t) = w_op + Δw(t),
y(t) = y_op + Δy(t),  u(t) = u_op + Δu(t)        (3)

resulting in

Δẋ_red = A Δx_red + B Δu
Δy = C Δx_red + D Δu        (4)

where Δx_red is a vector consisting of elements of the vector Δx, the vector Δw is eliminated by exploiting the algebraic constraints, and A, B, C, D are constant matrices. Simulation tools provide linear analysis and synthesis methods on this linearized system and/or export it for usage in an environment like Matlab, Maple, Mathematica, or Python.

A large body of literature exists about the theory of nonlinear control systems that are based on inverse plant models; see, for example, Isidori (1995). Methods such as feedback linearization, nonlinear dynamic inversion, or flat systems use an inverse plant model in the control loop. However, a major obstacle is how to automatically utilize an inverse plant model in a controller without being forced to manually set up the equations in the needed form, which is not practical for larger systems. Modeling languages can solve this problem, as discussed below.

Nonlinear inverse models can be utilized in various ways in a control system. The simplest approach, as a feed-forward controller, is shown in Fig. 6.

Multi-domain Modeling and Simulation, Fig. 6 Controller with inverse plant model in the feed-forward path (blocks: low-pass filters, inverse plant model, feedback controller, plant). The inverse plant model usually also needs derivatives of yref as inputs; these derivatives are provided by appropriate low-pass filters

Under the assumption that the models of the plant and of the inverse plant are completely identical and start at the same initial state, it follows by construction that the control error e is zero and y = T(s) · yref, where T is a diagonal matrix with the transfer functions of the low-pass filters on the diagonal (so y ≈ yref for reference signals
that have a frequency spectrum below the cutoff frequency of the low-pass filters). Since in practice the assumption is usually not fulfilled, there will be a nonzero control error e, and the feedback controller has to cope with it. This controller structure with a nonlinear inverse plant model has the advantage that the feed-forward part is useful over the complete operating range of the plant.

Various other structures with nonlinear plant models are discussed in Looye et al. (2005), such as compensation controllers, feedback linearization controllers, and nonlinear disturbance observers.

It turns out that nonlinear inverse plant models can be generated automatically with the techniques that have been developed for modeling languages; see section “Modeling Language Principles”. In particular, constructing an inverse model from (1) means that the inputs u are defined to be outputs, so they are no longer knowns but unknowns, and the outputs y are defined to be inputs, so they are no longer unknowns but knowns. The resulting system is still a DAE and can therefore be handled as any other DAE.

Therefore, defining an inverse model with a modeling language just requires exchanging the definition of input and output signals. In Modelica, this can be performed graphically with the nonstandard input–output block from Fig. 7. This block has two inputs and two outputs and is described by the equations

u1 = u2;   y1 = y2

From a block-diagram point of view this looks strange. However, from a DAE point of view, this just states constraints between two input and two output signals. Fig. 8 shows how this block can be used to invert a simple second-order system. The output of the low-pass filter is connected to the output of the second-order system, and therefore this model computes the input of the second-order system from the input of the filter.

A Modelica environment will generate the inverse model from this type of definition, thereby differentiating equations analytically and solving for the algebraic variables of the model in a different way than for a simulation model. The whole transformation is nontrivial, but it is just the standard method used by Modelica tools for any other type of DAE system.

The question arises whether a solution of the inverse model exists, is unique, and whether the model is stable (otherwise, it cannot be applied in a control system). In general, a nonlinear inverse model consists of linear and/or nonlinear algebraic equation systems and of linear and/or nonlinear differential equations. Therefore, from a formal point of view, the same theorems as for a general DAE apply; see, for example, Brenan et al. (1996). Furthermore, all these equations need to be solved with a numerical method. For some classes of systems, it can be shown that mathematically a unique solution exists and that the system is stable. However, in general, one cannot expect that it is possible to provide such a proof for complex inverse plant models. Still, inverse plant models have been successfully utilized by automatic generation from a Modelica tool, e.g., for robots, satellites, aircraft, vehicles, and thermo-fluid systems.

The Functional Mockup Interface

Many different types of simulation environments are in use. One cannot expect that a generic approach as sketched in section “Modeling Language Principles” will replace all these environments with their rich set of domain-specific knowledge, analysis, and synthesis features. Practically all simulation environments provide a vendor-specific interface so that a user can import components that are not describable by the simulation environment itself. Typically, this requires providing a component as a set of C or Fortran functions with a particular calling interface. In the control community, the most widely used approach of this kind is the S-Function interface from The MathWorks, where Simulink is used as the integration platform, and model components from other environments are imported as S-Functions.
Multi-domain Modeling and Simulation, Fig. 7 Modelica InverseBlockConstraint block (signals u1, u2, y1, y2)

Multi-domain Modeling and Simulation, Fig. 8 Inversion of a second-order system in Modelica (diagram blocks: filter, inverse block, secondOrder; parameters w=0.1, f_cut=10)
In 2010 the vendor-independent standard “Functional Mockup Interface 1.0” was published (FMI Group 2010). This is a low-level standard for the exchange of models between different simulation environments. The standard allows the exchange of either the model equations alone (called “FMI for Model Exchange”) or the model equations with an embedded solver (called “FMI for Co-Simulation”). This standard was quickly adopted by many simulation environments, and in 2013 there were more than 40 tools that support it (for a current list of tools, see https://www.fmi-standard.org/tools). In particular, nearly all Modelica environments can export Modelica models in this format, and therefore Modelica multi-domain models can be imported in other environments with low effort.

A software component which implements the FMI is called a Functional Mockup Unit (FMU). An FMU consists of one zip-file with extension “.fmu” containing all necessary components to utilize the FMU either for Model Exchange, for Co-Simulation, or for both. The following summary is an adapted version from Blochwitz et al. (2012):
1. An XML-file contains the definition of all exposed variables of the FMU, as well as other model information. It is then possible to run the FMU on a target system without this information, i.e., without unnecessary overhead. Furthermore, this allows determining all properties of an FMU from a text file, without actually loading and running the FMU.
2. A set of C-functions is provided to execute model equations for the Model Exchange case and to simulate the equations for the Co-Simulation case. These C-functions can be provided either in binary form for different platforms or in source code. The different forms can be included in the same model zip-file.
3. Further data can be included in the FMU zip-file, especially a model icon (bitmap file), documentation files, maps and tables needed by the model, and/or all object libraries or DLLs that are utilized.

Summary and Future Directions

Multi-domain modeling based on a DAE description and defined with a modeling language is an established approach, and many tools support it. This allows plant models from many domains to be defined conveniently for the design and evaluation of control systems. Furthermore, nonlinear inverse plant models can be easily constructed with the same methodology and can be utilized in various ways in nonlinear control systems.

Current research focuses on the support of the complete life cycle: defining requirements of a system formally on a “high level,”
considerably improving testing by checking these requirements automatically when evaluating a system design by simulations, and providing complete tool chains from nonlinear multi-domain models to embedded systems. The latter will allow convenient and fast target-code generation of nonlinear controllers, extended and unscented Kalman filters, optimization-based controllers, or moving horizon estimators.

Furthermore, the methodology itself is being further improved. For example, in 2012, Modelica was extended with language elements to define multirate sampled data systems in a precise way, as well as state machines.

Cross-References

Computer-Aided Control Systems Design: Introduction and Historical Overview
Extended Kalman Filters
Feedback Linearization of Nonlinear Systems
Interactive Environments and Software Tools for CACSD
Model Building for Control System Synthesis

Bibliography

Blochwitz T, Otter M, Akesson J, Arnold M, Clauß C, Elmqvist H, Friedrich M, Junghanns A, Mauss J, Neumerkel D, Olsson H, Viel A (2012) The functional mockup interface 2.0: the standard for tool independent exchange of simulation models. In: Proceedings of 9th international Modelica conference, Munich, 3–5 Sept 2012, pp 173–184. http://www.ep.liu.se/ecp/076/017/ecp12076017.pdf
Brenan KE, Campbell SL, Petzold LR (1996) Numerical solution of initial-value problems in differential-algebraic equations. SIAM, Philadelphia
Elmqvist H (1978) A structured model language for large continuous systems. Dissertation. Report CODEN:LUTFD2/(TFRT–1015), Department of Automatic Control, Lund Institute of Technology, Lund. http://www.control.lth.se/database/publications/article.pike?action=fulltext&artkey=elm78dis
FMI Group (2010) The Functional Mockup Interface for Model Exchange and for Co-Simulation, version 1.0. https://www.fmi-standard.org
Franke R, Doppelhamer J (2006) Online application of Modelica models in the industrial IT extended automation system 800xA. In: Proceedings of 6th Modelica conference, Vienna, 4–5 Sept 2006, pp 293–302. https://modelica.org/events/modelica2006/Proceedings/sessions/Session3c2.pdf
Franke R, Casella F, Otter M, Sielemann M, Mattsson SE, Olsson H, Elmqvist H (2009) Stream connectors – an extension of Modelica for device-oriented modeling of convective transport phenomena. In: Proceedings of 7th Modelica conference, Como, pp 108–121. https://www.modelica.org/events/modelica2009/Proceedings/memorystick/pages/papers/0078/0078.pdf
Frenkel J, Kunze G, Fritzson P (2012) Survey of appropriate matching algorithms for large scale systems of differential algebraic equations. In: Proceedings of the 9th international Modelica conference, Munich, 3–5 Sept 2012, pp 433–442. http://www.ep.liu.se/ecp/076/045/ecp12076045.pdf
IEEE 1076.1-2007 (2007) IEEE standard VHDL analog and mixed-signal extensions. Standard of IEEE. http://standards.ieee.org/findstds/standard/1076.1-2007.html
Isidori A (1995) Nonlinear control systems, 3rd edn. Springer, Berlin/New York
Karnopp DC, Margolis DL, Rosenberg RC (2012) System dynamics: modeling, simulation, and control of mechatronic systems, 5th edn. Wiley, Hoboken
Looye G, Thümmel M, Kurze M, Otter M, Bals J (2005) Nonlinear inverse models for control. In: Proceedings of the 4th international Modelica conference, Hamburg, 7–8 March 2005, p 267. https://www.modelica.org/events/Conference2005/onlineproceedings/Session3/Session3c3.pdf
Mattsson SE, Söderlind G (1993) Index reduction in differential-algebraic equations using dummy derivatives. SIAM J Sci Comput 14:677–692
Mantooth HA, Vlach M (1992) Beyond spice with saber and MAST. In: IEEE international symposium on circuits and systems, San Diego, May 10–13 1992, vol 1, pp 77–80
Melville RC, Trajkovic L, Fang SC, Watson LT (1993) Artificial parameter homotopy methods for the DC operating point problem. IEEE Trans Comput Aided Des Integr Circuits Syst 12:861–877. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.212.9327&rep=rep1&type=pdf
Modelica Association (2012) Modelica – a unified object-oriented language for systems modeling. Language specification, version 3.3. https://www.modelica.org/documents/ModelicaSpec33.pdf
Olsson H, Otter M, Mattsson SE, Elmqvist H (2008) Balanced models in Modelica 3.0 for increased model quality. In: Proceedings of 6th Modelica conference, Bielefeld, pp 21–33. https://modelica.org/events/modelica2008/Proceedings/sessions/session1a3.pdf
Pantelides CC (1988) The consistent initialization of differential-algebraic systems. SIAM J Sci Stat Comput 9:213–233
Sielemann M (2012) Device-oriented modeling and simulation in aircraft energy systems design. Dissertation, Dr. Hut. ISBN 3843905045
Sielemann M, Casella F, Otter M, Clauß C, Eborn J, Mattsson SE, Olsson H (2011) Robust initialization of differential-algebraic equations using homotopy. In: 8th international Modelica conference, Dresden. https://www.modelica.org/events/modelica2011/Proceedings/pages/papers/041ID154afv.pdf
Varga A (2000) A descriptor systems toolbox for Matlab. In: Proceedings of CACSD’2000 IEEE international symposium on computer-aided control system design, Anchorage, 25–27 Sept 2000, pp 150–155. http://elib.dlr.de/11629/01/vargacacsd2000p2.pdf

Multiscale Multivariate Statistical Process Control

Julian Morris
School of Chemical Engineering and Advanced Materials, Centre for Process Analytics and Control Technology, Newcastle University, Newcastle Upon Tyne, UK

Synonyms

MSPCA

Abstract

Dynamic processes, both continuous and batch, are characterised by autocorrelated measurements which are allied to the effects of process dynamics and disturbances. The common multivariate statistical process control (MSPC) approaches have been to use principal component analysis (PCA) or projection to latent structures (PLS) to build a model that captures the simultaneous correlations amongst the variables, but that ignores the serial correlation in the data during normal operations. Under such conditions it is difficult to perform efficient fault detection and diagnosis. An alternative approach to account for the process dynamics in MSPC is to use multiresolution analysis (MRA) by way of wavelet decomposition. Here, the individual measurements are decomposed into different scales (or frequencies), and the signals in each decomposed scale are then used for MSPC, which provides an indirect way of handling process dynamics.

Keywords

Multiresolution analysis; Partial least squares (PLS); Principal component analysis (PCA); Projection to latent structures (PLS); Wavelet transform

Definition

Multiscale principal component analysis (MSPCA) and its extension multiscale projection to latent structures (MSPLS) combine the abilities of these multivariate tools to de-correlate the variables by extracting linear relationships with that of wavelet analysis, to extract deterministic features and approximately de-correlate autocorrelated measurements. Multiscale modeling makes use of the wavelet transform, which allows a signal (measurement) to be viewed in multiple resolutions, with each resolution representing a different frequency. That is, the wavelet transform allows complex information to be decomposed into basic components at different positions and scales.

Motivation and Background

One of the drawbacks of the conventional PCA (or PLS)-based MSPC is that although the PCA/PLS model captures the correlations among the variables, it ignores the serial (auto)correlation in the process variables and measurements. One way to overcome this issue is to include time-lagged variables in the PCA
or PLS model. In this way, PCA and PLS will explicitly model both the correlations among the variables and the serial correlations in the individual variables. The impact is an increase in the number of principal components required, but the multivariate monitoring model will be able to detect any changes in the serial correlation of the variables as well as changes in relationships among the variables. This article focuses on multiscale-multiway PCA using batchwise data unfolding. However, the methodology can equally be applied to PLS-based process performance monitoring (MSPC).

In multivariate statistical process control (MSPC), the multivariate statistical techniques of principal component analysis (PCA) and projection to latent structures {Partial Least Squares} (PLS), together with monitoring metrics based on Hotelling’s T² (directly related to the Mahalanobis distance, which monitors the fit of new observations to the model space) and the squared prediction error (SPE) or Q statistic (which monitors the residual space-model mismatch), are used to simultaneously monitor the process variables (Kourti and MacGregor 1996; Qin 2003). A recent survey provides an excellent state-of-the-art review of the methods and applications of data-driven fault detection and diagnosis that have been developed over the last two decades (Qin 2012).

Process measurements typically exhibit multiscale behavior as a consequence of representing the cumulative effect of a number of underlying process phenomena including process dynamics, measurement noise, and disturbances. To address these issues, methodologies are required to address (i) the multiscale nature of process data and (ii) the inability of some existing algorithms to handle autocorrelation. One approach is through the use of multiresolution analysis and wavelets (Mallat 1998). Informative discussions and application studies related to using multiresolution analysis and wavelet decompositions to enhance PCA-based process monitoring and fault detection have been presented, for example, by Bakshi (1998), Misra et al. (2002), Aradhye et al. (2003), Lu et al. (2003), Yoon and MacGregor (2004), and Reis and Saraiva (2006). Yoon and MacGregor, in their comprehensive MSPCA study, discussed their approach in the context of other multiscale approaches and illustrated the methodology using simulated data from a continuous stirred-tank reactor system. A major contribution of the paper was to extend fault isolation methods based on contribution plots to multiscale PCA approaches. Although some 9 years old, Ganesan et al. (2004) provided a review of wavelet-based multiscale statistical process monitoring.

The Approach

Multiresolution analysis (MRA) provides the theoretical basis for the derivation of a computationally efficient algorithm for the wavelet transform (Mallat 1998). MRA allows the dynamic aspects of the data to be taken into account in MSPC. The individual signals are decomposed into different scales (frequencies), and data in each decomposed scale are then used for MSPC, which provides an indirect approach to handling process dynamics. Multiscale MSPC (MSPCA) enables the simultaneous extraction of process correlations across data as well as accounting for autocorrelation within sensor data. In this way, it captures correlations among the process variables made by various events occurring at different scales.

MSPCA calculates the principal components of wavelet coefficients at each scale and combines these at the relevant wavelet scales. Due to its multiscale nature, MSPCA is very useful for the modeling of data containing contributions from events whose behavior changes over both time and frequency. Process monitoring by MSPCA, and process prediction by MSPLS, involves combining those scales where significant events are detected. Approximate de-correlation of wavelet coefficients also makes MSPCA effective for the monitoring of autocorrelated measurements.

The Algorithm

Wavelets are a family of basis functions that provide a mapping from the time domain to the time-frequency domain. They can be used to decompose the signal into different resolutions by projecting onto the corresponding wavelet basis functions using multiresolution analysis (MRA). A wavelet set is constructed from a fundamental basis function, or mother wavelet, by a process of translation and dilation.
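Before turning to the wavelet machinery, the two monitoring metrics introduced earlier can be made concrete with a minimal sketch (hypothetical numbers, one retained principal component): Hotelling's T² sums the squared scores weighted by their variances, while SPE (the Q statistic) measures the residual left after projection onto the model space.

```python
import math

def monitoring_stats(x, loadings, variances):
    """T^2 and SPE for an observation x given PCA loadings and score variances."""
    # scores: projection of the observation onto each retained loading vector
    scores = [sum(p[i] * x[i] for i in range(len(x))) for p in loadings]
    # Hotelling's T^2: squared scores scaled by the corresponding score variances
    t2 = sum(t * t / lam for t, lam in zip(scores, variances))
    # reconstruction from the retained components; SPE/Q is the squared residual
    x_hat = [sum(t * p[i] for t, p in zip(scores, loadings)) for i in range(len(x))]
    spe = sum((xi - xh) ** 2 for xi, xh in zip(x, x_hat))
    return t2, spe

# One (hypothetical) retained component along [1, 1]/sqrt(2) with score variance 2
p = [1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)]
t2_on, spe_on = monitoring_stats([1.0, 1.0], [p], [2.0])    # lies in the model plane
t2_off, spe_off = monitoring_stats([1.0, -1.0], [p], [2.0]) # orthogonal to it
```

An observation in the model plane shows up only in T², whereas one orthogonal to it produces zero T² but a large SPE, which is why both metrics are monitored simultaneously.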
Wavelet analysis provides methodologies for the extraction of the time and frequency content of a signal. Conventional frequency analysis based on the Fourier transform consists of decomposing a signal into sine waves of different frequencies. Wavelet analysis decomposes the original signal in a similar manner. The major difference is that while Fourier analysis uses sine waves of infinite length, multiresolution analysis uses waveforms of finite length. The finite length of the wavelets allows them to describe local events in both the time and frequency domain.

The wavelet transform, an extension to the Fourier transform, projects the original signal down onto wavelet basis functions, providing a mapping from the time domain to the timescale plane. The wavelet functions, which are localized in the time and frequency domain, are obtained from a single prototype wavelet, the mother wavelet, by dilation and translation. The wavelet set is defined as

ψ_{a,b}(t) = (1/√|a|) ψ((t − b)/a)

where ψ is the mother wavelet function, a the dilation parameter, and b the translation parameter, and the factor 1/√|a| is used to ensure that each wavelet function has the same energy as the mother wavelet. The discrete wavelet transform with dyadic dilation and translation is used in this overview. A definition of continuous and discrete wavelet transforms can be found in Daubechies (1992). In the discrete case, the dilation and translation parameters are discretized as a = a₀^m and b = k b₀ a₀^m. If a₀ = 2 and b₀ = 1, a dyadic dilation and translation is carried out; however, a₀ and b₀ are not restricted to these values. The discrete wavelet form, which is widely used in process monitoring and chemical signal analysis, is

ψ_{jk}(t) = a₀^(−j/2) ψ(a₀^(−j) t − k b₀)

A recursive algorithm for wavelet decomposition and the reconstruction of a discrete signal of dyadic length is often used (Mallat 1998) and is known as the pyramid algorithm. The fast discrete wavelet decomposition consists of three components: low-pass filters L(n), high-pass filters H(n), and dyadic decimation. By passing the input signal through this pair of filters, the projection of the original signal onto the scaling and wavelet functions for the multiresolution analysis is performed. Dyadic decimation, or down-sampling, removes every odd member of a sequence, thus halving the original number of samples. The low-pass filter resembles a moving average, while the high-pass filter extracts the detailed information contained in the signal. The discrete wavelet transform operates by taking a sequence of values, applying L(n) and H(n), and then repeating this same procedure on the approximation coefficients. In this way, the original signal vector is smoothed and halved through L, and the vector of approximation coefficients is again smoothed and halved through L. Successive application of the low-pass filter results in the approximation coefficients becoming an increasingly smooth version of the original signal. At the same time as smoothing the signal, each iteration extracts the high-frequency information in the data. The repeated application of L, followed by H, is, in effect, a band-pass filter. The result of applying the high-/low-pass filters to a signal is a set of coefficients describing the details of the signals, D_L, and a second set describing the approximations of the signals, A_L. The original signal x can then be represented by

x(t) = Σ_{j=1..L} D_j(t) + A_L(t)

where D_j and A_j are referred to as the j-th level wavelet details and approximation, respectively. Figure 1 shows schematically the multiresolution-based wavelet decomposition.

One of the most popular choices of wavelets are those of the Daubechies family. These wavelets are compactly supported in the time domain and have good frequency-domain decay. Moreover, Daubechies wavelets (DaubN) possess a different type of smoothness which is determined by the vanishing moments N. This makes it possible to match the wavelet smoothness to the smoothness of the signals to be analyzed.
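One level of the pyramid algorithm can be sketched with the simplest member of the family, the Haar wavelet (an illustration, not from the article): the low-pass filter followed by dyadic decimation yields the approximation coefficients, the high-pass filter yields the details, and orthonormal scaling by 1/√2 preserves the signal energy.

```python
import math

def haar_step(x):
    """One pyramid-algorithm level: returns (approximation, detail), each half length."""
    s = 1.0 / math.sqrt(2.0)
    # low-pass L (pairwise average) plus dyadic decimation
    approx = [s * (x[2 * k] + x[2 * k + 1]) for k in range(len(x) // 2)]
    # high-pass H (pairwise difference) plus dyadic decimation
    detail = [s * (x[2 * k] - x[2 * k + 1]) for k in range(len(x) // 2)]
    return approx, detail

a, d = haar_step([4.0, 2.0, 5.0, 5.0])
# the flat pair (5, 5) produces a zero detail coefficient, and the total
# energy of the coefficients equals the energy of the original signal
```

Repeating `haar_step` on the approximation coefficients realizes the recursion described above, with each pass extracting the detail at a coarser scale.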
Multiscale Multivariate Statistical Process Control, Fig. 1 Schematic of multiresolution wavelet decomposition

The signal can then be decomposed into its contributions from multiple scales as a weighted sum of dyadically discretized orthonormal basis functions:

x(t) = Σ_{m=1..L} Σ_{k=1..N} d_mk ψ_mk(t) + Σ_{k=1..N} a_Lk φ_Lk(t)

where x(t) represents the process measurements, d_mk represents the wavelet or detail signal coefficient at scale m and location k, and a_Lk represents the scaled signal or scaling function coefficient of φ(t) at the coarsest scale L and location k. The scaling function, or father wavelet, φ_mk captures the low-frequency content of the original signal that is not captured by wavelets at the corresponding or finer scales.

The wavelet transformation is applied to decompose a multivariate signal X into its approximation, A1 to AL, and detail, D1 to DL, coefficients for the first to Lth level, respectively. For more information, see Bakshi (1998), Misra et al. (2002), and Aradhye et al. (2003). Figure 2 shows a schematic representation of a typical MSPCA multivariate statistical process control scheme.

An example of the application of multiway-multiscale MPCA to a benchmark fed-batch fermentation process (Birol et al. 2002; http://www.chee.iit.edu/~control/software.html) was presented by Alawi and Morris (2007). The application used a combination of multiblock statistical modeling approaches together with multiscale-multiway batch monitoring. Figure 3 shows the multiscale-multiway monitoring scheme for process monitoring and fault detection. At every time point, the batch process variables are decomposed into scales in the wavelet domain and then reconstructed back to the time domain. The scales/details and the approximations are collected into separate matrices (blocks). Multiblock PCA is then applied to the wavelet details and approximation. Fault detection based on the T²_s and Q_s statistics was used along with contribution plots incorporating confidence bounds to enhance fault diagnosis.

Figure 4 compares the monitoring statistics for the multiscale-multiway PCA and conventional multiway PCA for a slowly drifting sensor fault, showing the potential for multiscale MPCA (MSPCA) in being able to detect subtle process and sensor faults faster than conventional multiway MSPC. It is noted that sensor drift is confined to one scale band at low frequency. It has been observed that multiscale approaches appear to provide little improvement if a fault effect is
Multiscale Multivariate Statistical Process Control, Fig. 2 Schematic representation of multiscale PCA-based MSPC

Multiscale Multivariate Statistical Process Control, Fig. 3 Multiscale-multiway batch process monitoring scheme

spread over more than one frequency band or the fault effect occurs mainly in a scale with dominant variance. Thus, the monitoring method that gives the best detection and identification of faults will depend on the fault characteristics, with multiscale approaches providing an advantage when the faults are localized in frequency or appear in scales that normally have small variance.

Other Applications of Multiscale MPCA

There have been a number of nonlinear extensions. For example, multiscale PLS approaches have been developed, e.g., Teppola and Minkkinen (2000) and Lee et al. (2009). Nonlinear approaches have also been explored. For example, Lee et al. (2004) proposed a batch monitoring approach using multiway kernel principal component analysis, Shao et al. (1999) proposed a wavelet-based nonlinear PCA algorithm, Choi et al. (2008) described a study of a kernel-based MSPCA algorithm for nonlinear multiscale monitoring, and most recently Zhang and Ma (2011) compared fault diagnosis of nonlinear processes using multiscale KPCA and multiscale KPLS. Wavelet multiscale approaches have also been widely discussed in spectroscopic data processing (Shao et al. 2004).
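The scale-wise separation underlying all of the approaches above, writing the signal as x(t) = Σ_j D_j(t) + A_L(t) and treating each scale separately, can be sketched with a Haar wavelet (illustrative only; practical MSPCA typically uses Daubechies filters, and a PCA model is then fitted per scale). Each detail and the final approximation is reconstructed back to full length, so the decomposition can be verified by summation.

```python
import math

S = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling factor

def split(x):
    """One analysis level: (approximation, detail), each at half length."""
    return ([S * (x[2 * k] + x[2 * k + 1]) for k in range(len(x) // 2)],
            [S * (x[2 * k] - x[2 * k + 1]) for k in range(len(x) // 2)])

def merge(approx, detail):
    """One synthesis level: exact inverse of split."""
    out = []
    for a, d in zip(approx, detail):
        out += [S * (a + d), S * (a - d)]
    return out

def decompose(x, levels):
    """Return the per-scale detail signals D_1..D_L and approximation A_L,
    each reconstructed to the original length."""
    details, approx = [], x
    for _ in range(levels):
        approx, d = split(approx)
        details.append(d)
    recon_details = []
    for j, d in enumerate(details):
        sig = merge([0.0] * len(d), d)       # invert the level that produced d
        for _ in range(j):                    # then upsample through finer levels
            sig = merge(sig, [0.0] * len(sig))
        recon_details.append(sig)
    sig = approx                              # same treatment for A_L
    for _ in range(levels):
        sig = merge(sig, [0.0] * len(sig))
    return recon_details, sig

x = [3.0, 1.0, 0.0, 4.0, 2.0, 6.0, 5.0, 5.0]
D, A = decompose(x, 2)
rebuilt = [D[0][i] + D[1][i] + A[i] for i in range(len(x))]  # equals x
```

In an MSPCA scheme, the matrices built from D_1, D_2, and A_L per variable are exactly the scale blocks to which PCA is applied before recombining the significant scales.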
Multiscale Multivariate Statistical Process Control, Fig. 4 Comparison of MPCA and multiscale MSPCA for a range of subtle sensor drift faults: fault magnitude against fault detection delay

Cross-References

Controller Performance Monitoring
Fault Detection and Diagnosis
Statistical Process Control in Manufacturing

Bibliography

Alawi A, Zhang J, Morris AJ (2007) Multiscale multiblock batch monitoring: sensor and process drift and degradation. DOI: 10.1021/op400337x, April 25 2014
Aradhye HB, Bakshi BR, Strauss A, Davis JF (2003) Multiscale SPC using wavelets: theoretical analysis and properties. AIChE J 49(4):939–958
Bakshi BR (1998) Multiscale PCA with application to multivariate statistical process monitoring. AIChE J 44:1596–1610
Birol G, Undey C, Cinar AA (2002) A modular simulation package for fed-batch fermentation: penicillin production. Comput Chem Eng 26:1553–1565
Choi SW, Morris J, Lee I-B (2008) Nonlinear multiscale modelling for fault detection and identification. Chem Eng Sci 63:2252–2266
Daubechies I (1992) Ten lectures on wavelets. SIAM, Philadelphia
Ganesan R, Das TT, Venkataraman V (2004) Wavelet-based multiscale statistical process monitoring: a literature review. IIE Trans 36:787–806
Kourti T, MacGregor JF (1996) Multivariate SPC methods for process and product monitoring. J Qual Technol 28:409–428
Lee J-M, Yoo C-K, Lee I-B (2004) Fault detection of batch processes using multiway kernel principal component analysis. Comput Chem Eng 28(9):1837–1847
Lee HW, Lee MW, Park JM (2009) Multi-scale extension of PLS algorithm for advanced on-line process monitoring. Chemom Intell Lab Syst 98:201–212
Liu Z, Cai W, Shao X (2009) A weighted multiscale regression for multivariate calibration of near infrared spectra. Analyst 134:261–266
Lu N, Wang F, Gao F (2003) Combination method of principal component and wavelet analysis for multivariate process monitoring and fault diagnosis. Ind Eng Chem Res 42:4198–4207
Mallat SG (1998) Multiresolution approximations and wavelet orthonormal bases. Trans Am Math Soc 315:69–87
Misra MH, Yue H, Qin SJ, Ling C (2002) Multivariate process monitoring and fault diagnosis by multi-scale PCA. Comp Chem Eng 26:1281–1293
Qin SJ (2003) Statistical process monitoring: basics and beyond. J Chemom 17:480–502
Qin SJ (2012) Survey on data-driven industrial process monitoring and diagnosis. Annu Rev Control 36:220–234
Reis MS, Saraiva PM (2006) Multiscale statistical process control with multiresolution data. AIChE J 52:2107–2119
Shao R, Jia F, Martin EB, Morris AJ (1999) Wavelets and nonlinear principal components analysis for process monitoring. Control Eng Pract 7:865–879
Shao X-G, Leung AK-M, Chau F-T (2004) Wavelet: a new trend in chemistry. Acc Chem Res 36:276–283
Teppola P, Minkkinen P (2000) Wavelet-PLS regression models for both exploratory data analysis and process monitoring. J Chemom 14:383–399
Yoon S, MacGregor JF (2004) Principal-component analysis of multiscale data for process monitoring and fault diagnosis. AIChE J 50(11):2891–2903
Zhang Y, Ma C (2011) Fault diagnosis of nonlinear processes using multiscale KPCA and multiscale KPLS. Chem Eng Sci 66:64–72

Multi-vehicle Routing

Emilio Frazzoli¹ and Marco Pavone²
¹Massachusetts Institute of Technology, Cambridge, MA, USA
²Stanford University, Stanford, CA, USA

Abstract

Multi-vehicle routing problems in systems and control theory are concerned with the design of control policies to coordinate several vehicles
moving in a metric space, in order to complete spatially localized, exogenously generated tasks in an efficient way. Control policies depend on several factors, including the definition of the tasks, of the task generation process, of the vehicle dynamics and constraints, of the information available to the vehicles, and of the performance objective. Ensuring the stability of the system, i.e., the uniform boundedness of the number of outstanding tasks, is a primary concern. Typical performance objectives are represented by measures of quality of service, such as the average or worst-case time a task spends in the system before being completed or the percentage of tasks that are completed before certain deadlines. The scalability of the control policies to large groups of vehicles often drives the choice of the information structure, requiring distributed computation.

Keywords

Cooperative control; Decentralized control; Dynamic routing; Networked robots; Task allocation

Introduction

Multi-vehicle routing problems in systems and control theory are concerned with the design of control policies to coordinate several vehicles moving in a metric space, in order to complete spatially localized, exogenously generated tasks in an efficient way. Key features of the problem are that tasks arrive sequentially over time and planning algorithms should provide control policies (in contrast to preplanned routes) that prescribe how the routes should be updated as a function of those inputs that change in real time. This problem is usually referred to as dynamic vehicle routing (DVR). In DVR problems, ensuring the stability of the system, i.e., the uniform boundedness of the number of outstanding tasks, is a primary concern.

Motivation and Background

As a motivating example, consider the following scenario: a team of unmanned aerial vehicles (UAVs) is responsible for investigating possible threats over a region of interest. As possible threats are detected, by intelligence, high-altitude or orbiting platforms, or by ground sensor networks, one of the UAVs must visit its location and investigate the cause of the alarm, in order to enable an appropriate response if necessary. Performing this task may require the UAV not only to fly to the possible threat's location but also to spend additional time on site. The objective is to minimize the average time between the appearance of a possible threat and the time one of the UAVs completes the close-range inspection task. Variations may include priority levels, time windows during which the inspection task must be completed, and sensors with limited range.

In order to perform the required mission, the UAVs (or, more generally, mission control) need to repeatedly solve three coupled decision-making problems:
1. Task allocation: which UAV shall pursue each task? What policy is used to assign tasks to UAVs? How often should the assignment be revised?
2. Service scheduling: given the list of tasks to be pursued, what is the most efficient ordering of these tasks?
3. Loitering paths: what should UAVs without pending assignments do?
The optimization process must take into account, for example, algebraic or differential constraints (such as obstacle avoidance or bounded curvature, respectively), sensing constraints, communication constraints, and energy constraints. Furthermore, one might require a decentralized control architecture.

DVR problems, including the above UAV routing problem, are generally intractable due to their multifaceted combinatorial, differential, and stochastic nature, and consequently solution approaches have been devised that look either at heuristic algorithms or at approximation algorithms with some guarantee on their performance.

Related Problems

DVR problems represent the dynamic counterpart of the well-known static vehicle routing
problem (VRP), whereby (i) a team of n vehicles is required to service a set of n_T "static" tasks in a metric space, (ii) each task requires a certain amount of on-site service, and (iii) the goal is to compute a set of routes that minimizes the cost of servicing the tasks; see Toth and Vigo (2001) for a thorough introduction to this problem. The VRP is static in the sense that vehicle routes are computed assuming that no new tasks arrive. The VRP is an important research topic in the operations research community.

Approaches for Multi-vehicle Routing

Broadly speaking, there are three main approaches available in the literature to tackle dynamic vehicle routing problems. The first approach relies on heuristic algorithms. In the second approach, called the "competitive analysis approach," routing policies are designed to minimize the worst-case ratio between their performance and the performance of an optimal off-line algorithm which has a priori knowledge of the entire input sequence. In the third approach, the routing problem is embedded within the framework of queueing theory. Routing policies are then designed to stabilize the system in terms of uniform boundedness of the number of outstanding tasks and to minimize typical queueing-theoretical cost functions such as the expected time the tasks remain in the queue. Since the generation of tasks and the motion of the vehicles take place within a Euclidean space, one can refer to this third approach as "spatial queueing theory."

Heuristic Approach
The main aspect of the heuristic approach is that routing algorithms are evaluated primarily via numerical, statistical and experimental studies, and formal performance guarantees are not available. A naïve, yet reasonable approach to design a heuristic algorithm for DVR would be to adapt classic queueing policies. However, perhaps surprisingly, this adaptation is not at all straightforward. For example, routing algorithms based on a first-come-first-served policy, whereby tasks are fulfilled in the order in which they arrive, are unable to stabilize the system for all stabilizable task arrival rates, in the sense that with such routing algorithms the average number of tasks grows over time without bound, even though there exist alternative routing algorithms that would maintain the number of tasks uniformly bounded (Bertsimas and van Ryzin 1991).

The most widely applied approach is to combine static routing methods (e.g., VRP-like methods, nearest neighbor strategies, or genetic algorithms) and sequential re-optimization, where the re-optimization horizon is chosen heuristically. In particular, greedy nearest neighbor strategies, whose formal characterization still represents an open problem, are known to perform particularly well in some notable cases (Bertsimas and van Ryzin 1991). However, the joint selection of a static routing method and of the re-optimization horizon in the presence of vehicle and task constraints (e.g., differential motion constraints, or task priorities) makes the application of this approach far from trivial. For example, one can show that an erroneous selection of the re-optimization horizon can lead to pathological scenarios where no task ever receives service (Pavone 2010). Additionally, performance criteria in dynamic settings commonly differ from those of the corresponding static problems. For example, in a dynamic setting, the time needed to complete a task may be a more important factor than the total vehicle travel cost.

Competitive Analysis Approach
The distinctive feature of the competitive analysis approach is the method used to evaluate an algorithm's performance, which is called competitive analysis. In competitive analysis, the performance of a (causal) algorithm is compared to the performance of a corresponding off-line algorithm (i.e., a non-causal algorithm that has a priori knowledge of the entire input) in the worst-case scenario. Specifically, an algorithm is c-competitive if its cost on any problem instance is at most c times the cost of an optimal off-line algorithm:
Cost_causal(I) ≤ c · Cost_optimal off-line(I), for all problem instances I.

In the recent past, several dynamic vehicle routing problems have been successfully studied in this framework, under the name of the online traveling repairman problem (Jaillet and Wagner 2006), and many interesting insights have been obtained. However, the competitive analysis approach has some potential disadvantages. First, competitive analysis is a worst-case analysis; hence, the results are often overly pessimistic for normal problem instances, and potential statistical information about the problem (e.g., knowledge of the spatial distribution of future tasks) is often neglected. Second, the worst-case analysis usually requires a finite horizon problem formulation, which precludes the study of useful properties such as stability. Third, competitive analysis is used to bound the performance relative to an optimal off-line algorithm, which, by being non-causal, does not belong to the feasible set of routing algorithms one is optimizing over. Hence, with this approach one minimizes the "cost of causality" in the worst-case scenario, but not necessarily the worst-case cost (which would require comparison with an optimal causal routing algorithm). Finally, many important real-world constraints for DVR, such as time windows, priorities, differential constraints on the vehicles' motion, and the requirement of teams to fulfill a task, have so far proved to be too complex to be considered in the competitive analysis framework (Golden et al. 2008, page 206). Some of these drawbacks have been recently addressed by Van Hentenryck et al. (2009), where a combined stochastic and competitive analysis approach is proposed for a general class of combinatorial optimization problems and is analyzed under some technical assumptions.

Spatial Queueing Theory
Spatial queueing theory embeds the dynamic vehicle routing problem within the framework of queueing theory. Spatial queueing theory consists of three main steps, namely, development of a spatial queueing model, establishment of fundamental limitations of performance, and design of algorithms with performance guarantees. More specifically, the formulation of a model entails detailing four main aspects:
1. A model for the dynamic component of the environment: this is usually achieved by assuming that new events are generated (either adversarially or stochastically) by an exogenous process.
2. A model for targets/tasks: tasks are usually modeled as points in a physical environment distributed according to some (possibly unknown) distribution, might require a certain level of on-site service time, and can be subject to a variety of constraints, e.g., time windows, priorities, etc.
3. A model for the vehicles and their motion: besides their number, one needs to specify whether the vehicles are subject to algebraic (e.g., obstacles) or differential (e.g., minimum turning radius) constraints, sensing constraints, and fuel constraints. Also, the control could be centralized (i.e., coordinated by a central station) or decentralized and subject to communication constraints.
4. Performance criterion: examples include the minimization of the waiting time before service, loss probabilities, expectation-variance analysis, etc.
Once the model is formulated, one seeks to characterize fundamental limitations of performance (in the form of lower bounds for the best achievable cost); the purpose of this step is essentially twofold: it allows the quantification of the degree of optimality of a routing algorithm and provides structural insights into the problem. As for the last step, the design of a routing algorithm usually relies on a careful combination of static routing methods with sequential re-optimization. Desirable properties for the static methods are the following: (i) the static problem can be solved (at least approximately) in polynomial time and (ii) the static method is amenable to a statistical characterization (this is essential for the computation of performance bounds). Formal performance guarantees on a routing algorithm are then obtained by
quantifying the ratio between an upper bound on the cost delivered by that algorithm and a lower bound for the best achievable cost. Such a ratio, being an estimate of the degree of optimality of the algorithm, should be close to one and possibly independent of system parameters. The proposed algorithms are finally evaluated via numerical, statistical and experimental studies, including Monte-Carlo comparisons with alternative approaches.

An interesting feature of this approach is that the performance analysis usually yields scaling laws for the quality of service in terms of model data, which can be used as useful guidelines to select system parameters when feasible (e.g., number of vehicles).

In order to make the model tractable, the arrival process of tasks is assumed stationary (with possibly unknown parameters) with statistically independent arrival times. These assumptions, however, can be unrealistic in some scenarios, in which case the competitive analysis approach may represent a better alternative. From a technical standpoint, one should note that spatial queueing models are inherently different from traditional, nonspatial queueing models. The main reason is that in spatial queueing models, the "service time" per task has both a travel and an on-site component. Although the on-site service requirements can often be modeled as "statistically" independent, the travel times are inherently statistically coupled. Hence, in contrast to standard queueing models, service times in spatial queueing models are statistically dependent, and this deeply affects the solution to the problem.

Pioneering work in this context is that of Bertsimas and van Ryzin (1991), who introduced queueing methods to solve the baseline DVR problem (a vehicle moves along straight lines and visits tasks whose time of arrival, location, and on-site service are stochastic; information about task location is communicated to the vehicle upon task arrival). The next section provides an overview of the application of spatial queueing theory to this simplified DVR problem, referred to in the literature as the dynamic traveling repairman problem (DTRP).

Applying Spatial Queueing Theory to DVR Problems

Spatial Queueing Theory Workflow for DTRP

Model
The DTRP, which, incidentally, captures well the salient features of the UAV scenario outlined in the Motivation Section, can be modeled as follows. In a geographical region Q of area A, a dynamic process generates spatially localized tasks. The process generating tasks is modeled as a spatio-temporal Poisson process, i.e., (i) the time between consecutive generation instants has an exponential distribution with intensity λ > 0 and (ii) upon arrival, the locations of tasks are independently and uniformly distributed in Q. The location of the new tasks is assumed to be immediately available to a team of n servicing vehicles. The vehicles provide service in Q, traveling at a speed at most equal to v; the vehicles are assumed to have unlimited fuel and task-servicing capabilities. Each task requires an independent and identically distributed amount of on-site service with finite mean duration s > 0. A task is completed when one of the vehicles moves to its location and performs its on-site service. The objective is to design a routing policy that maximizes the quality of service delivered by the vehicles in terms of the average steady-state time delay T between the generation of a task and the time it is completed (in general, in a dynamic setting, the focus is on the quality of service as perceived by the "end user," rather than, for example, fuel economies achieved by the vehicles). Other quantities of interest are the average number n_T of tasks waiting to be completed and the waiting time W of a task before its location is reached by a vehicle. These quantities, however, are related according to T = W + s (by definition) and by Little's law, stating that n_T = λW, for stable queues.

The system is considered stable if the expected number of waiting tasks is uniformly bounded at all times, or equivalently, if tasks are removed from the system at least at the same rate at which they are generated. In the case at hand,
the time to complete a task is the sum of the time to reach its location (which depends on the routing policy) plus the time spent at that location in on-site service (which is independent of the routing policy). Since, by definition, the service time is no shorter than the on-site service time s, a weaker necessary condition for stability is ρ := λs/n < 1; the quantity ρ measures the fraction of time the vehicles are performing on-site service. Remarkably, it turns out that this is also a sufficient condition for stability, in the sense that, if this condition is satisfied, one can find a stabilizing policy. Note that this stability condition is independent of the size and shape of Q and of the speed of the vehicles.

Fundamental Limitations of Performance
To derive lower bounds, the main difficulty consists in bounding (possibly in a statistical sense) the amount of time spent to reach a target location. The derivation of these bounds becomes simpler in asymptotic regimes, i.e., looking at cases when ρ → 0⁺ and ρ → 1⁻, which are often called "light-load" and "heavy-load" conditions, respectively.

Consider first the case in which ρ → 0⁺ (light-load regime). A set of n points is called the n-median of Q if it globally minimizes the expected distance between a random point sampled uniformly from Q and the closest point in such set. In other words, the n-median of Q globally minimizes the function

H_n(p_1, p_2, …, p_n) := E[min_{k ∈ {1,…,n}} ‖p_k − q‖] = (1/A) ∫_Q min_{k ∈ {1,…,n}} ‖p_k − q‖ dq.

Let H_n* be the global minimum of this function. Geometric considerations show that H_n* scales proportionally to √(A/n).

Incidentally, the n-median of Q induces a Voronoi partition that is called the Median Voronoi Tessellation, whose importance will become clear in the next section. Recall that the Voronoi diagram of Q induced by points (p_1, …, p_n) is defined by

V_i = {q ∈ Q : ‖q − p_i‖ ≤ ‖q − p_j‖, ∀ j ≠ i, j ∈ {1, …, n}},

where V_i is the region associated with the i-th "generator" point p_i (see also ▶Optimal Deployment and Spatial Coverage). The distance H_n* certainly provides a lower bound on the expected distance traveled by a vehicle to reach a task, and hence one obtains the lower bound

T ≥ H_n*/v + s.

This lower bound is tight in light-load conditions (ρ → 0⁺), as will be seen in the next section.

Consider now the case in which ρ → 1⁻ (heavy load). Let D be the average travel distance per task for some routing policy. By using arguments from geometrical probability (independent of algorithms), one can show that D ≥ β₂ √(A/(2 n_T)) as ρ → 1⁻, where β₂ is a constant that will be specified later. As discussed, for stability one needs s + D/v < n/λ. Combining the stability condition with the bound on the average travel distance per task, one obtains

s + (β₂/v) √(A/(2 n_T)) ≤ n/λ.

Since, by Little's law, n_T = λW and T = W + s, one finally obtains (recall that ρ = λs/n):

T ≥ β₂² λ A / (2 v² n² (1 − ρ)²) + s, (as ρ → 1⁻).

A salient feature of the above lower bound is that it scales quadratically with the number of vehicles (as opposed to the square-root scaling law one has in light-load conditions); note, however, that congestion effects are not included in this model. This bound also shows that the quality of service, which is proportional to 1/(1 − ρ)², degrades much faster as the target load increases than in nonspatial queueing systems (where the growth rate is proportional to 1/(1 − ρ)).
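The two lower bounds above are simple enough to evaluate numerically. The following is a minimal Python sketch (not part of the entry itself; the function names and the test values of λ, A, v, n, and s are illustrative assumptions, and β₂ ≈ 0.712 is the constant specified later in the entry) showing the contrasting scaling laws: the light-load travel term shrinks like 1/√n, while in heavy load, doubling both the fleet size and the arrival rate at fixed ρ halves the travel-induced delay term.

```python
import math

BETA2 = 0.712  # Euclidean TSP constant beta_2, given later in the entry


def light_load_bound(hn_star, v, s):
    """Light-load lower bound: T >= H_n*/v + s (as rho -> 0+)."""
    return hn_star / v + s


def heavy_load_bound(lam, area, v, n, s, beta2=BETA2):
    """Heavy-load lower bound: T >= beta2^2 lam A / (2 v^2 n^2 (1-rho)^2) + s."""
    rho = lam * s / n
    if rho >= 1.0:
        raise ValueError("unstable system: rho = lam*s/n must be < 1")
    return beta2**2 * lam * area / (2.0 * v**2 * n**2 * (1.0 - rho)**2) + s


# Light load: H_n* scales like sqrt(A/n), so the travel term shrinks as 1/sqrt(n).
t_light = light_load_bound(hn_star=math.sqrt(1.0 / 10), v=1.0, s=0.2)

# Heavy load: doubling n and lam together keeps rho fixed (here rho = 0.9),
# and the travel-induced term scales as lam/n^2, i.e., it halves.
b1 = heavy_load_bound(lam=9.0, area=1.0, v=1.0, n=10, s=1.0)
b2 = heavy_load_bound(lam=18.0, area=1.0, v=1.0, n=20, s=1.0)
print(f"light-load bound: {t_light:.3f}")
print(f"heavy-load bounds: n=10 -> {b1:.3f}, n=20 -> {b2:.3f}")
```

Sweeping ρ toward one in `heavy_load_bound` also makes the quadratic 1/(1 − ρ)² blow-up discussed above directly visible.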
Design of Routing Algorithms

The design of an optimal light-load policy essentially relies on mimicking the proof strategy employed for the light-load lower bound. Specifically, a routing policy whereby (1) one vehicle is assigned to each of the n median locations of Q, (2) new tasks are assigned to the nearest median location and its corresponding vehicle, and (3) each vehicle services tasks according to a first-come-first-served policy is asymptotically optimal, i.e.,

T → H_n*/v + s, (as ρ → 0⁺).

Note that under this strategy "regions of dominance" are implicitly assigned to vehicles according to a Median Voronoi Tessellation.

The heavy-load case is more challenging. Consider, first, the following single-vehicle routing policy, based on a partition of Q into p ≥ 1 subregions {Q_1, Q_2, …, Q_p} of equal area A/p. Such a partition can be obtained, e.g., as sectors centered at the median of Q. Define a cyclic ordering of the subregions, such that, e.g., if the vehicle is in region Q_i, the "next" region is Q_j, where j follows i in the cyclic ordering (in other words, j = (i + 1) mod p).

1. If there are no outstanding tasks, move to the median of the region Q.
2. Otherwise, visit the "next" subregion; subregions with no tasks are skipped. Compute a minimum-length path from the vehicle's current position through all the outstanding tasks in that subregion. Complete all tasks on this path, ignoring new tasks generated in the meantime. Repeat.

The problem of computing the shortest path through a number of points is related to the well-known traveling salesman problem (TSP). While the TSP is a prototypically hard combinatorial optimization problem, it is well known that the Euclidean version of the problem can be approximated efficiently (Vazirani 2001). Furthermore, the length ETSP(n_T) of a Euclidean TSP through n_T points independently and uniformly sampled in Q is known to satisfy the following property:

lim_{n_T → ∞} ETSP(n_T)/√n_T = β₂ √A, almost surely,

where β₂ ≈ 0.712 is a constant (the same β₂ constant that appeared in the previous section) (Steele 1990).

It can be shown that, using the above routing policy, the average system time T satisfies

T ≤ κ(p) λA / (v² (1 − ρ)²) + s, (as ρ → 1⁻),

where κ(1) = β₂² and κ(p) → β₂²/2 for large p. These results critically exploit the statistical characterization of the length of an optimal TSP tour. Hence, the proposed policy achieves a quality of service that is arbitrarily close to the optimal one in the asymptotic regime of heavy load (and, indeed, also of light load).

The above single-vehicle routing policies can be fairly easily lifted to an efficient multi-vehicle routing policy. The key idea (akin to the one in the light-load case) is to (1) partition the workspace into n regions of dominance (with disjoint interiors and whose union is Q), (2) assign one vehicle to each region, and (3) have each vehicle follow a single-vehicle routing policy within its own region. This approach leads to the following multi-vehicle routing policy for the DTRP problem:

1. Partition Q into n regions of dominance of equal area and assign one vehicle to each region.
2. Each vehicle executes a single-vehicle DTRP policy in its own subregion.

Using as single-vehicle policy the routing policy described above, the average system time T in heavy load satisfies

T ≤ κ(p) λA / (v² n² (1 − ρ)²) + s, (ρ → 1⁻).

Hence, by comparing this result with the corresponding lower bound, one concludes that a
simple partitioning strategy leads to a multi-vehicle routing policy whose performance is arbitrarily close to the optimal one in heavy load.

Mode of Implementation
The scalability of the control policies to large groups of vehicles often requires a distributed implementation of multi-vehicle routing strategies. For the DTRP, a distributed implementation can be obtained by devising decentralized algorithms for environment partitioning. In the solution proposed in Pavone (2010), power diagrams are the key geometric concept to obtain, in a decentralized fashion, partitions suitable for both the light-load case (requiring, as seen before, a Median Voronoi Tessellation) and the heavy-load case (requiring an equal-area partition). The power diagram of Q is defined as

V_i = {q ∈ Q : ‖q − p_i‖² − w_i ≤ ‖q − p_j‖² − w_j, ∀ j ≠ i, j ∈ {1, …, n}},

where (p_i, w_i) ∈ Q × ℝ are a set of "power points" and V_i is the subregion associated with the i-th power point. Note that power diagrams are a generalization of Voronoi diagrams: when all weights are equal, the power diagram and the Voronoi diagram are identical. The basic idea, then, is to associate to each vehicle i a virtual power point, which is an artificial (or logical) variable whose value is locally controlled by the i-th vehicle. The cell V_i becomes the region of dominance for vehicle i, and each vehicle updates its own power point according to a decentralized gradient-descent law with respect to a coverage function (▶Optimal Deployment and Spatial Coverage), until the desired partition is achieved. The reader is referred to Pavone (2010) for more details.

Extensions and Discussion
By integrating additional ideas from dynamics, teaming, and distributed algorithms, the spatial queueing theory approach has been recently applied to scenarios with complex models for the tasks, such as time constraints, service priorities, translating tasks, and adversarial generation; has been extended to address aspects concerning robotic implementation such as complex vehicle dynamics, limited sensing range, and team forming; and has even been tailored to integrate humans in the design space; see Bullo et al. (2011) and references therein. Despite the significant modeling differences, the "workflow" is essentially the same as in the DTRP: a queueing model that captures the salient features of the problem at hand, characterization of the fundamental limitations of performance, and design of algorithms with provable performance bounds. The last step, as for the DTRP, often involves lifting a single-vehicle policy to a multi-vehicle policy through the strategy of environment partitioning. Within this context, a number of partitioning schemes and corresponding decentralized partitioning algorithms relevant to a large variety of DVR problems are discussed in Pavone et al. (2009).

This workflow efficiently and transparently decouples the three decision-making problems mentioned in the Introduction Section, i.e., "task allocation," "service scheduling," and "loitering paths." In fact, task allocation is addressed via the strategy of environment partitioning, service scheduling is addressed by applying a single-vehicle routing policy within the individual regions of dominance, and loitering is resolved by placing the vehicles at or around specific points within the dominance regions (e.g., the median). Note, however, that in some important cases, e.g., DVR problems where goods have to be transported from a pickup location to a delivery location or where vehicles are differentially constrained and operate in a "congested" workspace, multi-vehicle policies that rely on static partitions perform poorly or are not even feasible (Pavone et al. 2009), and task allocation and service scheduling need to be addressed as tightly coupled problems.

Through spatial queueing theory one is usually able to characterize the performance of multi-vehicle routing policies in asymptotic regimes. To ensure "satisfactory" performance under general operation conditions, a common strategy is to consider heuristic modifications
to a baseline asymptotically efficient routing policy in such a way that, on the one hand, asymptotic performance is preserved, and, on the other hand, light- and heavy-load performances are "smoothly" and efficiently blended in the intermediate load case. The interested reader can find more information in Bullo et al. (2011).

Summary and Future Directions

The three main approaches available to tackle DVR problems are (i) heuristic algorithms, (ii) competitive analysis, and (iii) spatial queueing theory. Broadly speaking, the competitive analysis approach is well suited when worst-case guarantees are sought, e.g., because there is not enough statistical information about the problem at hand. Spatial queueing theory represents a powerful alternative in cases where it is possible to leverage statistical information and one seeks average-case guarantees. Finally, for some problems the complexity of the model makes an analytical treatment very difficult, in which case the only option is to resort to a heuristic approach (possibly relying on insights derived by applying competitive analysis and/or spatial queueing theory to a simplified version of the problem).

Future directions include the extension of the three aforementioned approaches to increasingly complex problem setups, for example, higher-fidelity vehicle dynamics and environments and sophisticated sensing and communication constraints, novel applications (e.g., search and rescue missions, map maintenance, and pursuit-evasion), and inclusion of game-theoretical tools to address adversarial scenarios. Specifically, for the spatial queueing theory approach, key future directions include the problem of addressing optimality of performance in intermediate regimes (current optimality results are only available either in the light- or the heavy-load regime), online estimation of the statistical parameters (e.g., spatial distribution of the tasks), and formulations that take into account second-order moments and large-deviation probabilities.

Cross-References

▶Averaging Algorithms and Consensus
▶Flocking in Networked Systems
▶Networked Systems
▶Optimal Deployment and Spatial Coverage
▶Particle Filters

Bibliography

Bertsimas DJ, van Ryzin GJ (1991) A stochastic and dynamic vehicle routing problem in the Euclidean plane. Oper Res 39:601–615
Bullo F, Frazzoli E, Pavone M, Savla K, Smith SL (2011) Dynamic vehicle routing for robotic systems. Proc IEEE 99(9):1482–1504
Golden B, Raghavan S, Wasil E (2008) The vehicle routing problem: latest advances and new challenges. Volume 43 of operations research/computer science interfaces. Springer, New York
Jaillet P, Wagner MR (2006) Online routing problems: value of advanced information and improved competitive ratios. Transp Sci 40(2):200–210
Pavone M (2010) Dynamic vehicle routing for robotic networks. PhD thesis, Department of Aeronautics and Astronautics, Massachusetts Institute of Technology
Pavone M, Savla K, Frazzoli E (2009) Sharing the load. IEEE Robot Autom Mag 16(2):52–61
Steele JM (1990) Probabilistic and worst case analyses of classical problems of combinatorial optimization in Euclidean space. Math Oper Res 15(4):749
Toth P, Vigo D (eds) (2001) The vehicle routing problem. Monographs on discrete mathematics and applications. SIAM, Philadelphia. ISBN:0898715792
Van Hentenryck P, Bent R, Upfal E (2009) Online stochastic optimization under time constraints. Ann Oper Res 177(1):151–183
Vazirani V (2001) Approximation algorithms. Springer, New York
N

Nash equilibrium

▶Strategic Form Games and Nash Equilibrium


Network Games

R. Srikant
Department of Electrical and Computer Engineering and the Coordinated Science Lab, University of Illinois at Urbana-Champaign, Champaign, IL, USA

Abstract

Game theory plays a central role in studying systems with a number of interacting players competing for a common resource. A communication network serves as a prototypical example of such a system, where the common resource is the network, consisting of nodes and links with limited capacities, and the players are the computers, web servers, and other end hosts who want to transfer information over the shared network. In this entry, we present several examples of game-theoretic interaction in communication networks and present a simple mathematical model to study one such instance, namely, resource allocation in the Internet.

Keywords and Phrases

Congestion games; Network economics; Price-taking users; Routing games; Strategic users

Introduction

A communication network can be viewed as a collection of resources shared by a set of competing users. If the network were totally unregulated, then each user would attempt to grab as many resources in the network as possible, resulting in poor network performance, a situation commonly referred to as the tragedy of the commons (Hardin 1968). In reality, there is a carefully designed set of network protocols and pricing mechanisms which provide incentives to users to act in a socially responsible manner. Since game theory is the mathematical discipline which studies the interactions between selfish users, it is a natural tool to use to design these network control mechanisms. We now provide a few examples of network problems which naturally lend themselves to game-theoretic analysis. Later, we will elaborate on the game-theoretic formulation of one of these examples.
• Resource Allocation: A network such as the Internet is a collection of links, where each link has a limited data-carrying capacity, usually measured in bits per second.
the Internet. capacity, usually measured in bits per second.


The Internet is shared by billions of users, and the actions of these users have to be regulated so that they share the resources in the network in a fair manner. Equivalently, this problem can be viewed as one in which a network designer has to design a collection of protocols so that the users of the network can equitably allocate the available resources among themselves without the intervention of a central authority. Such protocols are built into every computer connected to the Internet today, to allow for seamless operation of the network. The problem of designing such protocols can be posed as a game-theoretic problem in which the players are the network and the traffic sources using the network (Kelly 1997).
• Routing Games: Finding appropriate routes for each user's data traffic is a particular form of resource allocation mentioned above. However, routing has applications beyond communication networks (with the other major application area being transportation networks), so it is useful to discuss routing separately. In communication networks, each user may attempt to find the minimum-delay route for its traffic, with help from the network, to minimize the delay experienced by its packets. In a transportation network, each automobile on the road attempts to take the path of least congestion through the network. An active area of research in game theory is one which tries to understand the impact of individual user decisions on the global performance of the network (Roughgarden 2005). An interesting result in this regard is the Braess paradox, which is an example of a road transportation network in which the addition of a road leads to increased delays when each user selfishly chooses a route to minimize its delay. Of course, if routes are chosen to minimize the overall delay experienced in the network, such a paradox will not arise.
• Peer-to-Peer Applications: Many studies have indicated that file sharing between users (also known as peers) directly, without using a centralized web site such as YouTube, is a dominant source of traffic in the Internet. For such a peer-to-peer service to work, each peer should not only download files from others, but should also be willing to sacrifice some of its resources to upload files to others. Naturally, peers would prefer to only download and not upload to minimize their resource usage. The design of incentive schemes to induce users to both download and upload files is another example of a game-theoretic problem in a network (Qiu and Srikant 2004).
• Network Economics: In addition to end-user interaction, Internet service providers (ISPs) have to interact with each other to allow their customers access to all the web sites in the world. For example, one ISP may have a customer who wants to access a web site connected to another ISP. In this case, the data traffic must cross ISP boundaries, and thus, one ISP has to transport data destined for a customer of another ISP. Thus, ISPs must be willing to contribute resources to satisfy the needs of customers who do not directly pay them. In such a situation, ISPs must have bilateral agreements (commonly known as peering agreements) to ensure that the selfish interest of each ISP to minimize its resource usage is aligned with the needs of its customers. Again, game theory is the right tool to study such inter-ISP interactions (Courcoubetis and Weber 2003).
• Spectrum Sharing: Large portions of the radio spectrum are severely underutilized. Typically, portions of the spectrum are assigned to a primary user, but the primary user does not use it most of the time. There has been a surge of interest recently in the concept of cognitive radio, whereby radios are cognitive of the presence or absence of the primary user, and when the primary user is absent, another radio can use the spectrum to transmit its data. When there are many users and the available spectrum is split into many channels, it is impossible for users to perfectly coordinate their transmissions to achieve maximum network utilization. In these situations, game-theoretic protocols which take into account the noncooperative behavior of the users can be
designed to allow secondary users to access the available channels as efficiently as possible (Saad et al. 2009).
In the next section, we will elaborate on one of the applications above, namely, resource allocation in the Internet, and show how game-theoretic modeling can be used to design fair resource sharing.

Resource Allocation and Game Theory

Consider a network consisting of L links, with link l having capacity c_l. Suppose that there are R users sharing the network, with each user r being characterized by a set of links which connect the user's source to its destination. Since each user uses a fixed route in our model, we will use r to denote both the user and the route used by the user. We use the notation l ∈ r to denote that link l is a part of route r. Let x_r denote the rate at which user r transmits data. Thus, we have the following natural constraints, which state that the total data rate on any link must be less than or equal to the capacity of the link:

  Σ_{r: l ∈ r} x_r ≤ c_l,  ∀ l.   (1)

Associated with each user is a concave utility function U_r(x_r) which is the utility that user r derives by transmitting data at rate x_r. The network utility maximization problem is to solve

  max_{x ≥ 0} Σ_r U_r(x_r),   (2)

subject to the constraint (1). In (2), x denotes the vector (x_1, x_2, …, x_R) and x ≥ 0 means that each component of x must be greater than or equal to zero. Note that the goal of the network in (2) is to maximize the sum of the utilities of the users in the network.
Let p_l be the Lagrange multiplier corresponding to the capacity constraint in (1) for link l. Then the Lagrangian for the problem is given by

  L(x, p) = Σ_r U_r(x_r) − Σ_l p_l (y_l − c_l),   (3)

where we have used the notation y_l := Σ_{r: l ∈ r} x_r to denote the total data rate on link l. If p is known, then the optimal x can be calculated by solving

  max_{x ≥ 0} L(x, p).

Notice that the optimal solution for each x_r can be obtained by solving

  max_{x_r ≥ 0} U_r(x_r) − q_r x_r,   (4)

where q_r = Σ_{l ∈ r} p_l. Thus, if the Lagrange multipliers are known, then the network utility maximization can be interpreted as a game in the following manner. Suppose that the network charges each user q_r dollars for every bit transmitted by user r through the network. Then, q_r x_r is the dollars per second spent by the user if x_r is measured in bits per second. Interpreting U_r(x_r) as the dollars per second that the user is willing to pay for transmitting at rate x_r, the optimization problem in (4) is the problem faced by user r which wants to maximize its net utility, i.e., utility minus cost. Thus, the individual optimal solution for each user is also the solution to the network utility maximization problem. The above game-theoretic interpretation of the network utility maximization problem is somewhat trivial since, given the p_l's or q_r's, there is no interaction between the users. Of course, this interpretation relies on the ability of the network to compute p. We next present a scheme to compute p, which couples the users closely and thus allows for a richer game-theoretic interpretation.
Suppose that the network wants to compute p but does not have access to the utility functions of the users. The network asks each user r to bid an amount w_r which is interpreted as the dollars per second that the user is willing to pay. The network then assumes that user r's utility function is w_r log x_r and solves the network utility maximization. While this choice of utility function may seem arbitrary, the resulting solution x has a number of attractive properties, including a form of fairness called proportional fairness. The proportionally fair resource allocation solution to (4) is given by
  w_r / x_r = q_r.   (5)

The network then allocates rate x_r to user r and charges q_r dollars per bit. From (5), the amount charged to user r per second is w_r, thus satisfying the original interpretation of w_r. Knowing that the network charges users in this manner, how might a user choose its bid w_r? Recall that user r's goal is to solve (4). Substituting from (5), the problem in (4) can be rewritten as

  max_{w_r ≥ 0} U_r(w_r / q_r) − w_r.   (6)

Thus, the users' problem of selecting w can be viewed as a game, with each user's objective given by (6). Note that q_r is given by (5) and thus depends on all the w_r's. Depending upon the application, the game can be solved under one of two assumptions:
• Price-Taking Users: Under this assumption, users are assumed to take the price q_r as given, i.e., they do not attempt to infer the impact of their actions on the price. This is a reasonable assumption in a large network such as the Internet, where the impact of a single user on the link prices is negligible, and it is practically impossible for any user to infer the impact of its decisions on the prevailing price of the network resources. When the users are price taking, the socially optimal solution, i.e., the solution to the network utility maximization problem, coincides with the Nash equilibrium of the game. To see this, note that the solution to (6) is given by

  (1/q_r) U_r′(w_r / q_r) − 1 = 0,

under the assumption that the utility function is differentiable and the solution is bounded away from zero. Using (5), this equation reduces to

  U_r′(x_r) = q_r,

which maximizes the Lagrangian (3). It is not difficult to see that the complementary slackness equations in the Karush-Kuhn-Tucker conditions are satisfied since the constraints for (2) and the proportionally fair solution are the same. Thus, under the price-taking assumption, the equilibrium of the game solution is the same as the socially optimal solution provided the network computes q_r using the proportionally fair resource allocation formulation.
• Strategic Users: In networks where the number of users is small, it may be possible for each user to know the topology of the network, and thus, each user may be able to solve for the proportionally fair resource allocation if it has access to other users' bids. In other words, it may be possible to compute a Nash equilibrium by taking into account the impact of the w_r's on the q_r's. When the users are strategic, the socially optimal solution could be quite different from the Nash equilibrium. The ratio of the network utility under the socially optimal solution to the network utility under a Nash equilibrium is called the price of anarchy.
There is a rich literature associated with both interpretations of the network congestion game. In the case of price-taking users, much of the emphasis in the literature has been on designing distributed algorithms to achieve the socially optimal solution (Shakkottai and Srikant 2007). In the case of strategic users, the focus has been on characterizing the price of anarchy (Johari and Tsitsiklis 2004; Yang and Hajek 2007).

Summary and Future Directions

We have presented a number of applications which involve the interactions of selfish users over a network. For the resource allocation application, we have also described how simple mathematical models can be used to provide incentives for users to act in a socially optimal manner. In particular, we have shown that, under the reasonable price-taking assumption and an appropriate computation of link prices, selfish users automatically maximize network utility. In the case where the users are strategic, the goal is to characterize the price of anarchy.
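The price-based resource allocation described above can be checked numerically. The sketch below uses a hypothetical two-link, three-user network with logarithmic utilities and unit capacities (all values, the gradient step, and the iteration count are illustrative and not from the entry); the link prices are computed by gradient ascent on the dual of the network utility maximization problem:

```python
# Hypothetical network: user 0 crosses both links, users 1 and 2 cross one each.
routes = {0: (0, 1), 1: (0,), 2: (1,)}   # user r -> links on route r
c = [1.0, 1.0]                           # link capacities c_l
p = [1.0, 1.0]                           # link prices (Lagrange multipliers p_l)
step = 0.01                              # illustrative gradient step size

for _ in range(20000):
    # Each user solves max_{x_r >= 0} log(x_r) - q_r * x_r as in (4);
    # for log utility the solution is x_r = 1 / q_r.
    q = {r: sum(p[l] for l in links) for r, links in routes.items()}
    x = {r: 1.0 / q[r] for r in routes}
    # Dual update: raise the price of overloaded links, lower it otherwise,
    # projecting onto p_l >= 0.
    y = [sum(x[r] for r in routes if l in routes[r]) for l in range(len(c))]
    p = [max(0.0, p[l] + step * (y[l] - c[l])) for l in range(len(c))]

# x settles near the proportionally fair point (1/3, 2/3, 2/3),
# with both link prices near 1.5
```

With all bids fixed at w_r = 1, the fixed point satisfies w_r / x_r = q_r, consistent with (5): each user's rate is the reciprocal of its route price.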
Moving forward, two areas which require considerable further research are the following: (i) inter-ISP routing and (ii) spectrum sharing. The Internet is a fairly reliable network, and any unreliability often arises due to routing issues among ISPs. As mentioned in the introduction, peering arrangements between ISPs are necessary to make sure that ISPs carry each other's traffic and are appropriately compensated for it, either through reciprocal traffic-carrying agreements or actual monetary transfer. Thus, the policy that an ISP uses to route traffic may be governed by these peering agreements. The more complicated these policies are, the more chances there are for routing misconfigurations that lead to service interruptions. This interplay between policies and technology in the form of routing algorithms is an interesting topic for further study.
Cognitive radios and spectrum sharing are expected to be significant technological components of future wireless networks. Designing algorithms for selfish radios to share the available spectrum while respecting the rights of the primary user of the spectrum is a challenge that requires considerable further attention. This area of research requires one to combine sensing technologies to sense the presence of other users with game-theoretic models to ensure fair channel access to the secondary users, subject to the constraint that the primary user should not be affected by the presence of the secondary users.

Cross-References

▶ Game Theory: Historical Overview
▶ Networked Systems
▶ Optimal Deployment and Spatial Coverage

Bibliography

Courcoubetis C, Weber R (2003) Pricing communication networks: economics, technology and modelling. Wiley, Hoboken
Hardin G (1968) The tragedy of the commons. Science 162:1243–1248
Johari R, Tsitsiklis JN (2004) Efficiency loss in a network resource allocation game. Math Oper Res 29:407–435
Kelly FP (1997) Charging and rate control for elastic traffic. Eur Trans Telecommun 8:33–37
Qiu D, Srikant R (2004) Modeling and performance analysis of BitTorrent-like peer-to-peer networks. Proc ACM SIGCOMM, ACM Comput Commun Rev 34:367–378
Roughgarden T (2005) Selfish routing and the price of anarchy. MIT Press, Cambridge
Saad W, Han Z, Debbah M, Hjorungnes A, Basar T (2009) Coalitional game theory for communication networks: a tutorial. IEEE Signal Process Mag 26(5):77–97
Shakkottai S, Srikant R (2007) Network optimization and control. NoW Publishers, Boston-Delft
Yang S, Hajek B (2007) VCG-Kelly mechanisms for allocation of divisible goods: adapting VCG mechanisms to one-dimensional signals. IEEE J Sel Areas Commun 25:1237–1243


Networked Control Systems: Architecture and Stability Issues

Linda Bushnell¹ and Hong Ye²
¹Department of Electrical Engineering, University of Washington, Seattle, WA, USA
²The Mathworks, Inc., Natick, MA, USA

Abstract

When shared, band-limited, real-time communication networks are employed in a control system to exchange information between spatially distributed components, such as controllers, actuators, and sensors, it is categorized as a networked control system (NCS). The primary advantages of a NCS are reduced complexity and wiring, reduced design and implementation cost, ease of system maintenance and modification, and efficient data sharing. In addition, this unique architecture creates a way to connect the cyberspace to the physical space for remote operation of systems. The NCS architecture allows for performing more complex tasks, but also requires taking the network effects into account when designing control laws and stability conditions. In this entry, we review significant results on the architecture and stability analysis of a NCS. The results presented address communication network-induced challenges such as time delays, scheduling, and information packet dropouts.
Keywords

Architecture; Networked control system; Stability

Introduction

From the washing machine, air conditioner, and microwave oven to the telephone, stereo, and automobile, embedded computers are present in the modern home. In a factory environment, there are thousands of networked smart sensors and actuators with embedded processors, working to complete a coordinated task. The trend in manufacturing plants, homes, buildings, aircraft, and automobiles is toward distributed networking. This trend can be inferred from many proposed or emerging network standards, such as controller area network (CAN) for automotive and industrial automation, BACnet for building automation, PROFIBUS and WorldFIP fieldbus for process control, and IEEE 802.11 and Bluetooth wireless standards for applications such as mobile sensor networks, HVAC systems, and unmanned aerial vehicles.
The traditional dedicated point-to-point wired connection in control systems has been successfully implemented in industry for decades. With the advance of communication network and hardware technologies, it is common to integrate the communication network into the control system to replace the dedicated point-to-point connection to achieve reduced weight and power, lower cost, simpler installation and maintenance, and higher reliability, to name a few advantages. For example, a typical new automobile has two controller area networks (CANs): a high-speed one in front of the firewall for the engine, transmission, and traction control and a low-speed one for locks, windows, and other devices (Johansson et al. 2005).
The conventional definition of a networked control system (NCS) is as follows: When a feedback control system is closed via a communication channel, which may be shared with other nodes outside the control system, then the control system is called a NCS. A NCS can also be described as a feedback control system where the control loops are closed through a real-time communication network.

Architecture of Networked Control Systems

The architecture of a NCS consists of a band-limited, digital communication network physically and electronically integrated with a spatially distributed control system, operated on a given plant. Digital information, such as controller signals, actuator signals, sensor signals, and operator input, is transmitted via the network. The components connected by the network include all nodes of the control system, such as the supervisory (or "network owner") computer, controller software and hardware, actuators, and sensors. In this structure, the feedback control system's loops are closed over the shared communication network. The communication network can be wired or wireless and may be shared with other unrelated nodes outside the control system. As illustrated in Fig. 1, the shared communication channel, which multiplexes signals from the sensors to the controllers and/or from the controllers to the actuators, serves many other uses besides control. Each of the system components directly connected to the network via a network interface is denoted a physical node. Besides the network interface, the sensor and actuator nodes are typically smart nodes with embedded microprocessors. Sometimes, the controller is colocated with the smart actuator. Several key issues make networked control systems distinct from traditional control systems (Hespanha et al. 2007; Yang 2006).

Band-Limited Channels
Bandwidth limitation of the shared communication channel requires that all nodes in the network must share (e.g., time sharing or frequency sharing, etc.) the common network resource without interfering with each other.

Sampling and Delays
In a NCS, the plant outputs are sampled by the sensors, which can convert continuous-time
Networked Control Systems: Architecture and Stability Issues, Fig. 1 A general networked control system (NCS) architecture

analog signals to digital signals; perform preprocessing, filtering, and encoding; and package the data signal so that it is ready for transmission. After winning the medium access control and being transmitted over the network, the package containing the sampled data signal arrives at the receiver side, which could be a controller or a smart actuator with a controller collocated with it. The receiver unpacks and decodes the signal. This process is quite different from the traditional periodic sampling in digital control. The overall delay between sampling and the eventual decoding of the transmitted packet at the receiver can be time varying and random due to both the network access delay (i.e., the time it takes for a shared network to accept the data) and the transmission delay (i.e., the time during which data are in transit inside the network). This also depends on the highly variable network conditions, such as congestion and channel quality. In some NCSs, the data transmitted are time stamped, which means that the receiver may have an estimate of the delay's duration and could take appropriate corrective action. Given the rapid advance of embedded computation and communication hardware technology today, the transmission delay in many embedded systems can be neglected when compared with the magnitude of network access delay.

Packet Dropouts
It is possible in a NCS that a packet may be lost while it is in transit through the network. The packet that contains important sampling data or control signals may drop occasionally due to transmission errors of the physical network link, message collision, or node failures, to name a few. Overflow in a queue or buffer can lead to network congestion and packet loss. Thus, the use of queues is not favored by NCSs in general. Packet dropouts also happen if the receiver discards outdated arrivals that have long delays. Most network protocols are equipped with transmission-retry mechanisms, such as TCP, that guarantee the eventual delivery of packets. These protocols, unfortunately, are not appropriate for a NCS since the retransmission of old sensor data or calculated control signals is generally not very useful when new, time-critical data are available. Using selected old data for estimation or prediction is an exception, where old data may be packaged with the new
data in one packet. It is advantageous to discard the old, un-transmitted data and transmit a new packet if and when it becomes available. In this way, the controller always receives fresh data for its control calculation, and the actuator always executes the up-to-date command to control the plant.

Modeling Errors, Uncertainties, and Disturbances
In a distributed NCS, modeling errors, uncertainties, and disturbances always exist when using the mathematical model to describe the physical process. These factors may lead to a major impact on the overall system performance and cause failure in fulfilling the desired objectives. Wang and Hovakimyan (2013) proposed a reference model-based architecture to decouple the design of controller and communication schemes. A reference model is introduced in each subsystem as a bridge to build the connection between the real system and an ideal model, free of uncertainties. The closeness between the real system and the reference model is associated only with plant uncertainties, and the difference between the reference model and the ideal model is only in the communication constraints.

Stability of Networked Control Systems

The stability of a control system is often extremely important and is generally a safety requirement. Examples include the control of rockets, robots, airplanes, automobiles, or ships. Instability in any one of these systems can result in an unimaginable accident and loss of life. The stability of a general dynamical system with no input can be described with the Lyapunov stability criterion, which is stated as follows: A linear system is stable if its impulse response approaches zero as time approaches infinity or if every bounded input produces a bounded output.
When sensors, controllers, and actuators are not colocated and use a shared network to communicate, the feedback loop of a NCS is closed over the network. Network-induced, variable delays and packet dropouts can degrade the performance of a NCS. For example, the NCS may have a longer settling time or bigger overshoot in the step response. Furthermore, the NCS may become unstable when delays and/or packet dropouts exceed a certain range. Designers choosing to use a NCS architecture, however, are motivated not by performance but by cost, maintenance, and reliability gains.

Band-Limited Channels
Inspired by Shannon's results on the maximum bit rate that a communication channel can carry reliably, a significant research effort has been devoted to the problem of determining the minimum bit rate that is needed to stabilize a system through feedback over a finite capacity channel (Baillieul 1999; Nair and Evans 2000; Tatikonda and Mitter 2004; Wong and Brockett 1999; Baillieul and Antsaklis 2007). This has been solved exactly for linear plants, but only conservative results have been obtained for nonlinear plants. The data-rate theorem that quantifies a fundamental relationship between unstable physical systems and the rate at which information must be processed in order to stably control them was proved independently under a variety of assumptions. Minimum bit rate and quantization become especially important for networks designed to carry very small packets with little overhead, because encoding measurements or actuation signals with fewer bits can save network bandwidth.
Most of the NCS stability results presented here, however, are based on the observation that the channel can transmit a finite number of packets per unit of time (packet rate) and each packet can carry a certain number of bits in the data field. The packets on a real-time control network typically are frequent and have small data segments compared to their headers. For example, a CAN II packet with a single 16-bit data sample has fixed 64 bits of overhead associated with identifier, control field, CRC, ACK field, and frame delimiter, resulting in 25 % utilization, and this utilization can never exceed 50 % (data field length is limited to 64 bits). Thus, the quantization effects imposed by the communication networks are generally ignored.
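The data-rate theorem mentioned above can be illustrated on a scalar discrete-time example: a plant x(k+1) = a·x(k) + u(k) with |a| > 1 can be stabilized over an R-bit-per-sample channel only if R > log₂|a|. The "zoom"-quantizer sketch below is a hypothetical illustration of that threshold (the plant, initial conditions, and coding scheme are illustrative, not taken from the entry):

```python
def simulate(a, R, steps=60, x0=0.9, L0=1.0):
    """Quantized state feedback over an R-bit channel for x(k+1) = a*x(k) + u(k).
    The encoder sends the index of x inside [-L, L] split into 2**R bins; the
    controller applies u = -a * (bin center). The uncertainty radius then
    evolves as L <- a * L / 2**R, so it contracts only when R > log2(a)."""
    x, L = x0, L0
    N = 2 ** R
    for _ in range(steps):
        i = min(N - 1, max(0, int((x + L) / (2 * L) * N)))  # quantizer bin
        center = -L + (2 * i + 1) * L / N
        x = a * (x - center)    # closed loop: x <- a*x - a*center
        L = a * L / N           # worst-case quantization error is the new radius
    return abs(x), L

x_hi, L_hi = simulate(a=2.5, R=2)   # 2 bits > log2(2.5) ~ 1.32: contracts
x_lo, L_lo = simulate(a=2.5, R=1)   # 1 bit  < log2(2.5): uncertainty grows
```

With a = 2.5, two bits per sample shrink the state's uncertainty bound by the factor a/2^R = 0.625 at every step, while one bit per sample makes the bound grow by 1.25 per step, matching the R > log₂|a| threshold.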
Network-Induced Delays
A significant number of results have attempted to characterize a maximum upper bound on the sampling or transmission interval for which stability of the NCS can be guaranteed. The upper bound is sometimes called the maximum allowable transfer interval (MATI) (Walsh 2001a). These results implicitly attempt to minimize the packet rate or schedule the traffic of the control network that is needed to stabilize a system through feedback. The general approach is to design the controller using established techniques, considering the network to be transparent, and then to analyze the effect of the network on closed-loop system performance and stability.
The NCS with a linear time-invariant (LTI) plant/controller pair and one-channel feedback (see Fig. 2) can be modeled by the following continuous-time system, where x includes the states of the plant and the controller, x(t) = [x_p(t), x_c(t)]ᵀ:

  ẋ = A x + B ŷ,  y = C(x)   (1)

  ŷ(t) = ŷ_{k−1},  t ∈ [t_k, t_k + τ_k)
       = ŷ_k,      t ∈ [t_k + τ_k, t_{k+1})   (2)

Networked Control Systems: Architecture and Stability Issues, Fig. 2 A NCS architecture with one-channel feedback (controller collocated with actuator)

The signal y is a vector of sensor measurements and ŷ is the input to a continuous-time controller collocated with the actuators. Alternatively, ŷ can be viewed as the input to the actuators and y as the desired control signal computed by a controller collocated with the sensors. The signal y(t) is sampled at times {t_k : k ∈ ℕ} and the samples y_k := y(t_k) are sent through the network. But the samples arrive at the destination after a (possibly variable) delay of τ_k, where we assume that the network delays are always smaller than one sampling interval. For periodic sampling and constant delays, a sufficient and necessary condition for exponential stability of the NCS (Eqs. 1 and 2) was derived (Zhang et al. 2001). By using the augmented state space model and based on the stability of nonlinear hybrid systems, they also proved the sufficient condition for stability of the NCS in the time-invariant case.
If we now assume the sampling intervals are constant and the computation and transmission delays are negligible, then the variable network access delays serve as the main source of delays in a NCS (Lin et al. 2003, 2005). Using average dwell time results for discrete switched systems, Zhai et al. (2002) provided conditions such that NCS stability is guaranteed. Also, the authors consider robust disturbance attenuation analysis for this class of NCSs.
When the network delay is not constant or when the signal y(t) is sampled in a nonperiodic fashion, the system (1) and (2) is not time invariant and one needs a Lyapunov-based argument to prove its stability. Zhang and Branicky (2001) derived the sufficient condition to ensure the NCS in Fig. 2 is exponentially stable. They also proposed a randomized algorithm to find the largest value of the sampling interval for which stability can be guaranteed.
For a model-based NCS with state and output feedback, an explicit model of the plant is used to produce an estimate of the plant state behavior between transmission times (Montestruque and Antsaklis 2004). Sufficient conditions for Lyapunov stability are derived for a model-based NCS when the controller/actuator is updated with the sensor information at nonconstant time intervals. For a NCS whose transmission times are driven by a stochastic process, with identically and independently distributed or Markov-chain-driven transmission times, almost sure stability and mean-square sufficient conditions for stability are introduced. Onat et al. (2011) adapted the above stability results to model-based predictive NCSs with realistic structure assumptions.
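The sample-and-hold logic of Eqs. (1) and (2) can be exercised on a scalar example. The snippet below is a hypothetical illustration (plant x′ = x + u, static gain u = −2·ŷ, and all numbers chosen for demonstration rather than taken from the cited results): the loop is stable for a small sampling interval h but loses stability when h grows, mirroring the MATI discussion above.

```python
import math

def advance(x, u, a, dt):
    # exact solution of x' = a*x + u over dt with u held constant (a != 0)
    e = math.exp(a * dt)
    return e * x + (e - 1.0) / a * u

def simulate(h, tau, a=1.0, K=2.0, n=60):
    """Scalar NCS following Eqs. (1)-(2): sample y_k = x at t_k = k*h; the
    input u = -K*y_hat holds the previous sample on [t_k, t_k + tau) and the
    fresh sample on [t_k + tau, t_{k+1})."""
    x = y_prev = 1.0
    for _ in range(n):
        y_k = x                               # sample sent at t_k
        x = advance(x, -K * y_prev, a, tau)   # stale data during network delay
        x = advance(x, -K * y_k, a, h - tau)  # fresh data after delay tau
        y_prev = y_k
    return abs(x)
```

With these numbers, simulate(h=0.5, tau=0.1) decays toward zero while simulate(h=2.0, tau=0.1) diverges, so the largest stabilizing sampling interval lies between the two; bounds of exactly this kind are what the MATI results quantify.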
840 Networked Control Systems: Architecture and Stability Issues

Networked Control Systems: Architecture and Stability Issues, Fig. 3 A NCS architecture with two-channel feedback

Control Network Scheduler

In a general Multi-Input/Multi-Output (MIMO) NCS with two-channel feedback, both the sampled plant output and the controller output are transmitted via a network (see Fig. 3). Because of the network, only the reported output ŷ(t) is available to the controller and its prediction processes; similarly, only û(t) is available to the actuators on the plant. We label the network-induced error

e(t) := [ŷ(t), û(t)]ᵀ − [y(t), u(t)]ᵀ

and the combined state of controller and plant x(t) = [x_p(t), x_c(t)]ᵀ. The state of the entire NCS is given by z(t) = [x(t), e(t)]ᵀ. Following this general approach, the controller is designed using established techniques without considering the presence of the network.

The behavior of the network-induced error e(t) is mainly determined by the architecture of the NCS and the scheduling strategy. In the special case of one-package transmission, there is only one node transmitting data on the network; therefore, the entire vector e(t) is set to zero at each transmission time. For multiple nodes transmitting measured outputs y(t) and/or computed inputs u(t), the transmission order of the nodes depends on the scheduling strategy chosen for the NCS. In other words, the scheduling strategy decides which components of e(t) are set to zero at the transmission times.

Static and dynamic schedulers (a.k.a. protocols) are the two main categories used in a NCS. When the network resource or transmission order are pre-allocated or determined before run-time, it is called a static scheduler, such as round-robin scheduling. A dynamic scheduler determines the network allocation while the system runs. A novel dynamic network scheduler, try-once-discard (TOD), and several variations were introduced for wired and wireless NCSs (Walsh and Ye 2001; Ye et al. 2001). For linear and nonlinear NCSs with the new dynamic and commonly used static schedulers, an analytic proof of global exponential stability of a MIMO NCS was provided (Walsh et al. 2001a, b). Simulation and experiment results showed that the dynamic schedulers outperform static schedulers in terms of NCS performance, e.g., a larger MATI.

Nesic and Teel (2004a, b) generalize the above results by considering a nonlinear NCS with external disturbances and a more general class of protocols (or schedulers). They considered a new class of Lyapunov uniformly globally asymptotically stable (UGAS) protocols in a NCS. It is shown that if the controller is designed without taking into account the network, it yields input-to-state stability (ISS) with respect to external disturbances (not necessarily with respect to the network-induced error), and then the same controller will achieve semi-global practical ISS for the NCS when implemented via the network with a Lyapunov UGAS protocol. Moreover, the ISS gain is preserved. The adjustable parameter with respect to which semi-global practical ISS is achieved is the MATI between transmission times. The authors also studied the input-output Lp stability of a NCS for a large class of network scheduling protocols. It is shown that polling, static protocols, and dynamic protocols such as TOD belong to this class. Results in Nesic and Teel (2004a) provide a unifying framework for generating new scheduling protocols that preserve Lp stability properties of the system, if a design parameter is chosen to be sufficiently small. The most general version of these results can also be used to model a NCS with data packet dropouts. The proof technique used is based on the small gain theorem and lends itself to an easy interpretation.
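As an illustration of the maximum-error-first idea behind try-once-discard, the following minimal sketch (with hypothetical node and error values, not the implementation from the cited works) grants the channel to the node whose component of the network-induced error currently has the largest norm and zeroes that component upon transmission:

```python
import numpy as np

def tod_grant(errors):
    """Try-once-discard (TOD) rule: grant the network to the node whose
    network-induced error component currently has the largest norm."""
    return int(np.argmax([np.linalg.norm(e) for e in errors]))

def transmit(errors, node):
    """A successful transmission refreshes the transmitting node's
    reported value, so its component of e(t) is reset to zero."""
    errors = [e.copy() for e in errors]
    errors[node] = np.zeros_like(errors[node])
    return errors

# Three hypothetical nodes with current error components
errors = [np.array([0.1]), np.array([2.0]), np.array([0.5])]
node = tod_grant(errors)          # the node with the largest error wins
errors = transmit(errors, node)   # its component of e(t) is zeroed
```

A static round-robin scheduler would instead cycle through the nodes in a fixed order regardless of the current errors, which is the comparison made in the simulation studies cited above.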
the setting of Lp stability was provided by Tabbara et al. (2007). Their presentation provides sharper results for both gain and MATI than previously obtainable and details the property of uniformly persistently exciting scheduling protocols. This class of protocols was shown to lead to stability for high enough transmission rates. This was a natural property to demand, especially in the design of wireless scheduling protocols. The property is used directly in a novel proof technique based on the notions of vector comparison and (quasi)-monotone systems. Via simulations and analytical and numerical comparisons, it is verified that the uniform persistence of excitation property of protocols is, in some sense, the "finest" property that can be extracted from wireless scheduling protocols.

Delays and Packet Dropouts

Packet dropouts can be modeled as either stochastic or deterministic phenomena. For a one-channel feedback NCS, Zhang and Branicky (2001) consider a deterministic dropout model, with packet dropouts occurring at an asymptotic rate. Stability conditions were studied for a NCS with deterministic and stochastic dropouts (Seiler and Sengupta 2005).

Sometimes, the NCS is characterized as a continuous-time delayed differential equation (DDE) with the time-varying delay τ(t). One important advantage is that the equations remain valid even when the delays exceed the sampling interval. Researchers have successfully used the Lyapunov-Krasovskii (Yue et al. 2004) and the Razumikhin theorems (Yu et al. 2004) to study the stability of a NCS that is modeled as a DDE.

Summary and Future Directions

This article introduced the concept of a networked control system and its general architecture. Several key issues specific to a NCS, such as band-limited channels, network-induced delays, and information packet dropouts, were explained. The stability condition of a NCS with various network effects was discussed with several common modeling techniques.

In terms of future directions, there has been significant effort in analyzing networked control systems with variable sampling rate, but most results investigate the stability for a given worst-case interval between consecutive sampling times, leading to conservative results. An open area of research would be to look at methods that take into account a stochastic characterization of the inter-sampling times. Substantial work has also been devoted to determining the stability of a NCS, as described in this article. Possible open areas of research would be to consider design issues related to the joint stability and performance of the system. The design and development of controllers for a NCS is also an open area of research. In designing a controller for a NCS, one has to take into account the challenges introduced by the communication network. Only afterward can analysis of the whole system take place.

Cross-References

Data Rate of Nonlinear Control Systems and Feedback Entropy
Information and Communication Complexity of Networked Control Systems
Networked Control Systems: Estimation and Control over Lossy Networks
Networked Systems
Quantized Control and Data Rate Constraints

Bibliography

Baillieul J (1999) Feedback designs for controlling device arrays with communication channel bandwidth constraints. In: Lecture notes of the fourth ARO workshop on smart structures, Penn State University
Baillieul J, Antsaklis PJ (2007) Control and communication challenges in networked control systems. Proc IEEE 95(1):9–28
Hespanha JP, Naghshtabrizi P, Xu Y (2007) A survey of recent results in networked control systems. Proc IEEE 95(1):138–162
Johansson KH, Torngren M, Nielsen L (2005) Vehicle applications of controller area network. In: Levine WS, Hristu-Varsakelis D (eds) Handbook of networked and embedded control systems. Birkhäuser, Boston, pp 741–765
Lin H, Zhai G, Antsaklis PJ (2003) Robust stability and disturbance attenuation analysis of a class of networked control systems. In: 42nd IEEE conference on decision and control, Maui, vol 2, pp 1182–1187
Lin H, Zhai G, Fang L, Antsaklis PJ (2005) Stability and H∞ performance preserving scheduling policy for networked control systems. In: Proceedings 16th IFAC world congress, Prague
Montestruque LA, Antsaklis PJ (2004) Stability of model-based networked control systems with time-varying transmission times. IEEE Trans Autom Control 49(9):1562–1572
Nair GN, Evans RJ (2000) Stabilization with data-rate-limited feedback: tightest attainable bounds. Syst Control Lett 41(1):49–56
Nesic D, Teel A (2004a) Input-output stability properties of networked control systems. IEEE Trans Autom Control 49(10):1650–1667
Nesic D, Teel A (2004b) Input-to-state stability of networked control systems. Automatica 40(12):2121–2128
Onat A, Naskali T, Parlakay E, Mutluer O (2011) Control over imperfect networks: model-based predictive networked control systems. IEEE Trans Ind Electron 58:905–913
Seiler P, Sengupta R (2005) An H∞ approach to networked control. IEEE Trans Autom Control 50(3):356–364
Tabbara M, Nesic D, Teel A (2007) Stability of wireless and wireline networked control systems. IEEE Trans Autom Control 52(9):1615–1630
Tatikonda S, Mitter S (2004) Control under communication constraints. IEEE Trans Autom Control 49(7):1056–1068
Walsh G, Ye H (2001) Scheduling of networked control systems. IEEE Control Syst Mag 21(1):57–65
Walsh G, Ye H, Bushnell L (2001a) Stability analysis of networked control systems. IEEE Trans Control Syst Technol 10(3):438–446
Walsh G, Beldiman O, Bushnell L (2001b) Asymptotic behavior of nonlinear networked control systems. IEEE Trans Autom Control 44:1093–1097
Wang X, Hovakimyan N (2013) Distributed control of uncertain networked systems: a decoupled design. IEEE Trans Autom Control 58(10):2536–2549
Wong WS, Brockett RW (1999) Systems with finite communication bandwidth constraints-II: stabilization with limited information feedback. IEEE Trans Autom Control 44(5):1049–1053
Yang TC (2006) Networked control system: a brief survey. IEE Proc Control Theory Appl 153(4):403–412
Ye H, Walsh G, Bushnell L (2001) Real-time mixed-traffic wireless networks. IEEE Trans Ind Electron 48(5):883–890
Yu M, Wang L, Chu T, Hao F (2004) An LMI approach to networked control systems with data packet dropout and transmission delays. In: 43rd IEEE conference on decision and control, Paradise Island, Bahamas, vol 4, pp 3545–3550
Yue D, Han QL, Peng C (2004) State feedback controller design for networked control systems. IEEE Trans Circuits Syst 51(11):640–644
Zhai G, Hu B, Yasuda K, Michel A (2002) Qualitative analysis of discrete-time switched systems. In: Proceedings of the American control conference, vol 3, pp 1880–1885
Zhang W, Branicky MS (2001) Stability of networked control systems with time-varying transmission period. In: Allerton conference on communication, control, and computing, Monticello, IL
Zhang W, Branicky MS, Phillips SM (2001) Stability of networked control systems. IEEE Control Syst Mag 21(1):84–99

Networked Control Systems: Estimation and Control Over Lossy Networks

João P. Hespanha1 and Alexandre R. Mesquita2
1Center for Control, Dynamical Systems and Computation, University of California, Santa Barbara, CA, USA
2Department of Electronics Engineering, Federal University of Minas Gerais, Belo Horizonte, Brazil

Abstract

This entry discusses optimal estimation and control for lossy networks. Conditions for stability are provided both for two-link and multiple-link networks. The online adaptation of network resources (controlled communication) is also considered.

Keywords

Automatic control; Communication networks; Controlled communication; Estimation; Networked control systems; Stability

Introduction

Network Control Systems (NCSs) are spatially distributed systems in which the communication between sensors, actuators, and controllers
occurs through a shared band-limited digital communication network. In this entry, we consider the problem of estimation and control over such networks.

A significant difference between NCSs and standard digital control is the possibility that data may be lost while in transit through the network. Typically, packet dropouts result from transmission errors in physical network links (which is far more common in wireless than in wired networks) or from buffer overflows due to congestion. Long transmission delays sometimes result in packet reordering, which essentially amounts to a packet dropout if the receiver discards "outdated" arrivals. Reliable transmission protocols, such as TCP, guarantee the eventual delivery of packets. However, these protocols are not appropriate for NCSs since the retransmission of old data is generally not useful. Another important difference between NCSs and standard digital control systems is that, due to the nature of network traffic, delays in the control loop may be time varying and nondeterministic.

In this entry, we concentrate on the problem of control and estimation in the presence of packet losses, leaving other important features of NCSs (such as quantization and random delays) to be addressed in other entries of this encyclopedia. Consequently, we assume that the network can be viewed as a channel that can carry real numbers without distortion, but that some of the messages may be lost. This network model is appropriate when the number of bits in each data packet is sufficiently large so that quantization effects can be ignored, but packet dropouts cannot. For more general channel models, see, for example, Imer and Basar (2005).

This entry also does not address network transmission delays explicitly. In general, network delays have two components: one that is due to the time spent transmitting packets and another due to the time packets wait in buffers waiting to be transmitted. Delays due to packet transmission present little variation and may be modeled as constants. For control design purposes, these delays may be incorporated into the plant model. Delays due to buffering depend on the network traffic and are typically random; they can be analyzed using the techniques developed in Antunes et al. (2012).

Notation and Basic Definitions. Throughout the entry, R stands for the real numbers and N for the nonnegative integers. For a given matrix A ∈ R^{n×n} and vector x ∈ R^n, ‖x‖ := √(x′x) denotes the Euclidean norm of x, and λ(A) the set of eigenvalues of A. Random variables are generally denoted in boldface. For a random variable y, E[y] stands for the expectation of y.

Two-Link Networks

Here, we consider a control/estimation problem when all network effects can be modeled using two erasure channels: one from the sensor to the controller and the other from the controller to the actuator (see Fig. 1).

We restrict our attention to a linear time-invariant (LTI) plant with intermittent observation and control packets:

x_{k+1} = A x_k + ν_k B u_k + w_k,   (1a)
y_k = γ_k C x_k + v_k,   (1b)

∀k ∈ N, x_k, w_k ∈ R^n, y_k, v_k ∈ R^p, where (x_0, w_k, v_k) are mutually independent, zero-mean Gaussian with covariance matrices (P_0, R_w, R_v), and γ_k, ν_k ∈ {0, 1} are i.i.d. Bernoulli random variables with Pr{γ_k = 1} = γ̄ and Pr{ν_k = 1} = ν̄. The variable γ_k models the packet loss between sensor and controller, whereas ν_k models the packet loss between controller and actuator. When there is a packet drop from controller

Networked Control Systems: Estimation and Control Over Lossy Networks, Fig. 1 Control system with two network links: the actuator, plant, and sensor are connected to the controller through two erasure channels
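As a purely illustrative reading of model (1), the sketch below simulates a scalar plant in which the actuation packet is applied only when ν_k = 1 and the measurement carries information only when γ_k = 1; all numerical values are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical scalar data: A unstable, unit noise covariances
A = np.array([[1.1]]); B = np.array([[1.0]]); C = np.array([[1.0]])
gamma_bar, nu_bar = 0.7, 0.8   # packet arrival probabilities

def step(x, u):
    """One step of (1): x_{k+1} = A x_k + nu_k B u_k + w_k,
    y_k = gamma_k C x_k + v_k, with i.i.d. Bernoulli arrivals."""
    gamma = float(rng.random() < gamma_bar)
    nu = float(rng.random() < nu_bar)
    w = rng.normal(size=(1, 1))
    v = rng.normal(size=(1, 1))
    y = gamma * (C @ x) + v            # measurement at time k
    x_next = A @ x + nu * (B @ u) + w  # state at time k+1
    return x_next, y, gamma, nu

x, arrivals = np.zeros((1, 1)), []
for _ in range(2000):
    x, y, gamma, nu = step(x, np.zeros((1, 1)))
    arrivals.append(gamma)
sensor_rate = np.mean(arrivals)   # empirical estimate of gamma_bar
```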
to actuator, we set the actuator's output to zero. Different strategies, such as holding the control input, could still be modeled using (1) by augmenting the state vector.

The information available to the controller up to time k is given by the information set:

I_k = {P_0} ∪ {y_ℓ, γ_ℓ : ℓ ≤ k} ∪ {ν_ℓ : ℓ ≤ k − 1}.

Here, we make the important assumption that acknowledgment packets from the actuator are always received by the controller, so that ν_ℓ, ℓ ≤ k − 1, is available at time k to the remote estimator.

Optimal Estimation with Remote Computation

The optimal mean-square estimate of x_k, given the information known to the remote estimator at time k, is given by

x̂_{k|k} := E[x_k | I_k].

This estimate can be computed recursively using the following time-varying Kalman filter (TVKF) (Sinopoli et al. 2004):

x̂_{0|−1} = 0,   (2a)
x̂_{k|k} = x̂_{k|k−1} + γ_k F_k (y_k − C x̂_{k|k−1}),   (2b)
x̂_{k+1|k} = A x̂_{k|k} + ν_k B u_k,   (2c)

with the gain matrix F_k calculated recursively as follows:

F_k = P_k C′ (C P_k C′ + R_v)^{−1},
P_{k+1} = A P_k A′ + R_w − γ_k A F_k (C P_k C′ + R_v) F_k′ A′.

Each P_k corresponds to the estimation error covariance matrix

P_k = E[(x_k − x̂_{k|k−1})(x_k − x̂_{k|k−1})′].

For this estimator, there exists a critical value γ_c for the arrival probability γ̄, below which the estimation error covariance becomes unbounded:

Theorem 1 (Sinopoli et al. 2004) Assume that (A, R_w^{1/2}) is controllable, (A, C) is observable, and A is unstable. Then there exists a critical value γ_c ∈ (0, 1] such that

E[P_k] ≤ M, ∀k ∈ N ⟺ γ̄ > γ_c

where M is a positive definite matrix that may depend on P_0. Furthermore, the critical value γ_c satisfies γ_min ≤ γ_c ≤ γ_max, where the lower bound is given by

γ_min = 1 − 1/(max{|λ(A)|})²,   (3)

and the upper bound is given by the solution to the following (quasi-convex) optimization problem:

γ_max = min{γ ≥ 0 : Ψ_γ(Y, Z) > 0, 0 ≤ Y ≤ I for some Y, Z},

where

Ψ_γ(Y, Z) =
[ Y                   √γ (YA + ZC)    √(1−γ) YA ]
[ √γ (A′Y + C′Z′)     Y               0         ]
[ √(1−γ) A′Y          0               Y         ]

Remark 1 In some special cases, the lower bound in (3) is tight in the sense that γ_c = γ_min. The largest class of systems known for which this occurs is that of nondegenerate systems defined in Mo and Sinopoli (2012). Examples of systems in this class include (1) those for which the matrix C is invertible and (2) those with a detectable pair (A, C) and such that the matrix A is diagonalizable with unstable eigenvalues having distinct absolute values.
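To see the lower bound (3) in action, the following sketch (hypothetical scalar data; since C is invertible here, Remark 1 gives γ_c = γ_min) iterates the expected-covariance update associated with the TVKF for arrival probabilities on either side of γ_min:

```python
import numpy as np

# Hypothetical scalar plant; C is invertible, so gamma_c = gamma_min here
A = np.array([[1.5]]); C = np.array([[1.0]])
Rw = np.array([[1.0]]); Rv = np.array([[1.0]])

gamma_min = 1.0 - 1.0 / max(abs(np.linalg.eigvals(A))) ** 2

def cov_step(P, gbar):
    """Expected-covariance update of the TVKF under i.i.d. Bernoulli
    arrivals with Pr{gamma_k = 1} = gbar (modified Riccati operator)."""
    S = C @ P @ C.T + Rv
    F = P @ C.T @ np.linalg.inv(S)
    return A @ P @ A.T + Rw - gbar * (A @ F @ S @ F.T @ A.T)

def limit_trace(gbar, iters=2000):
    P = np.eye(1)
    for _ in range(iters):
        P = cov_step(P, gbar)
    return float(np.trace(P))

# gamma_min = 1 - 1/1.5**2 ~ 0.556: bounded above it, divergent below it
bounded = limit_trace(0.8)         # settles near a finite fixed point
divergent = limit_trace(0.4, 200)  # grows without bound
```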
Optimal Control with Remote Computation

From a control perspective, one may also be interested in finding control sequences u_N = {u_1, …, u_{N−1}}, as functions of the information set I_N, which minimize cost functions of the form

J = lim_{N→∞} (1/N) E[ ∑_{k=0}^{N−1} (x_k′ W x_k + ν_k u_k′ U u_k) | I_k ].

Theorem 2 (Schenato et al. 2007) Assume that (A, B) and (A, R_w^{1/2}) are controllable, (A, C) and (A, W^{1/2}) are observable, and A is unstable. Then, finite control costs J are achievable if and only if γ̄ > γ_c and ν̄ > ν_c, where the critical value ν_c is given by the (quasi-convex) optimization problem

ν_c = min{ν ≥ 0 : Φ_ν(Y, Z) > 0, 0 ≤ Y ≤ I for some Y, Z},

where

Φ_ν(Y, Z) =
[ Y                  Y        Z U^{1/2}   √ν (YA′ + ZB′)   √(1−ν) YA′ ]
[ Y                  W^{−1}   0           0                0          ]
[ U^{1/2} Z′         0        I           0                0          ]
[ √ν (AY + BZ′)      0        0           Y                0          ]
[ √(1−ν) AY          0        0           0                Y          ]

Moreover, under the above conditions, the separation principle holds in the sense that the optimal control is given by

u_k = −(B′SB + U)^{−1} B′SA x̂_{k|k},

where x̂_{k|k} is an optimal state estimate given by (2) and the matrix S is the solution to the modified algebraic Riccati (MARE) equation

S = A′SA + W − ν̄ A′SB (B′SB + U)^{−1} B′SA.

Solutions to the MARE may be obtained iteratively when ν̄ > ν_c.

Estimation with Local Computation

To reduce the gap between the bounds γ_min and γ_max on the critical value of the drop probability in Theorem 1 and to allow for larger probabilities of drop, one may choose to compute state estimates at the sensor and transmit those to the controller/actuator. This scheme is motivated by the growing number of smart sensors with embedded processing units that are capable of local computation. For the LTI plant

x_{k+1} = A x_k + B u_k + w_k,
y_k = C x_k + v_k,

the smart sensor can compute locally an optimal state estimate using a standard stationary Kalman filter and transmit this estimate to the controller. We model packet dropouts as before using the process γ_k and assume that the process ν_k is known to the smart sensor by means of a perfect acknowledgment mechanism. This allows the sensor to know u_k exactly and to use it in the Kalman filter.

Let x̃_{k|k} = E[x_k | y_ℓ, γ_ℓ, ℓ ≤ k] denote the local estimates transmitted by the sensor. Using the messages successfully received up to time k, the remote estimator computes the optimal estimate

x̂_{k|k−1} = E[x_k | γ_ℓ, x̃_{ℓ|ℓ}, ℓ ≤ k − 1]

recursively by

x̂_{0|−1} = 0,
x̂_{k|k} = (1 − γ_k) x̂_{k|k−1} + γ_k x̃_{k|k},   k ∈ N,
x̂_{k+1|k} = A x̂_{k|k} + B u_k.
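When ν̄ > ν_c, the MARE above can be solved by running its fixed-point iteration to convergence; the sketch below uses hypothetical scalar data and then forms the certainty-equivalence gain from u_k = −(B′SB + U)^{−1}B′SA x̂_{k|k}:

```python
import numpy as np

def solve_mare(A, B, W, U, nu_bar, iters=500):
    """Fixed-point iteration of the modified algebraic Riccati equation
    S = A'SA + W - nu_bar A'SB (B'SB + U)^{-1} B'SA."""
    S = W.copy()
    for _ in range(iters):
        G = np.linalg.inv(B.T @ S @ B + U)
        S = A.T @ S @ A + W - nu_bar * (A.T @ S @ B @ G @ B.T @ S @ A)
    return S

# Hypothetical scalar data; nu_bar is taken well above the critical value
A = np.array([[1.2]]); B = np.array([[1.0]])
W = np.array([[1.0]]); U = np.array([[1.0]])
nu_bar = 0.9

S = solve_mare(A, B, W, U, nu_bar)
# Gain applied to the remote state estimate x_hat_{k|k}
L = np.linalg.inv(B.T @ S @ B + U) @ (B.T @ S @ A)

# Residual of the MARE at the computed S (should be ~0 at convergence)
G = np.linalg.inv(B.T @ S @ B + U)
mare_residual = float(np.max(np.abs(
    A.T @ S @ A + W - nu_bar * (A.T @ S @ B @ G @ B.T @ S @ A) - S)))
```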
Notice that now we are applying the TVKF to estimate x̃_k, which is fully observable. Since γ_min and γ_max in Theorem 1 are equal for fully observable processes (Schenato et al. 2007), the local computation scheme grants a minimal critical value γ_c, as stated in the theorem below.

Theorem 3 Assume that (A, R_w^{1/2}) is controllable, (A, C) is observable, and A is unstable. Then the critical value γ_c is given by γ_min in (3), i.e.,

E[P_k] ≤ M, ∀k ∈ N ⟺ γ̄ > γ_min

where M is a positive definite matrix that may depend on P_0.

Drops in the Acknowledgement Packets

When there are drops in the acknowledgment channel from the actuator to the controller, the controller does not always know ν_k, and therefore, it might not always have access to the control inputs that are actually applied to the plant. In this case, the posterior state probability becomes a Gaussian mixture distribution with infinitely many components, and the separation principle no longer holds (Schenato et al. 2007). This makes the estimation and control problems computationally more difficult, and, due to the smaller information set, some degradation in control performance should be expected. For this reason, it is generally a good design choice to keep controller and actuator collocated when drops in the acknowledgment channels are significant.

Buffering

As an alternative to the approach described in section "Estimation with Local Computation" of using local computation at a smart sensor to allow for larger probabilities of drop, the designer may also consider the transmission of a sequence of previous measurements y_k, y_{k−1}, …, y_{k−N} in each packet. This approach is motivated by the fact that data packets can often carry much more than one vector of measured outputs. When N is reasonably large, one should expect estimation/control performance similar to that of the approach described in section "Estimation with Local Computation", but with a reduced computational effort at the sensor.

Analogously, an improvement over zeroing or simply holding the control input in case of packet drops between controller and actuator is for the controller to transmit a control sequence u_k, u_{k+1}, …, u_{k+N} that contains not only the control u_k to be used at the current time instant but also a few future controls u_{k+1}, u_{k+2}, …, u_{k+N}. In the case of packet drops between controller and actuator, the actuator can use previously received "future" control inputs in lieu of the one contained in the lost packet. The sequence of future control inputs may be obtained, e.g., by an optimal receding horizon control strategy (Gupta et al. 2006).

Estimation with Markovian Drops

When γ_k is a Markov process, we no longer have a separation principle, and the optimal controller may depend on the drop sequence. Yet, optimal state estimates are obtained using the same TVKF presented earlier. Below, we give conditions for the stability of the error covariance when drops are governed by the Gilbert-Elliot model: Pr{γ_{k+1} = j | γ_k = i} = p_ij, i, j ∈ {0, 1}.

Theorem 4 (Mo and Sinopoli 2012) Assume that (A, R_w^{1/2}) is controllable, A is unstable, and the system given by the pair (A, C) is nondegenerate as discussed in Remark 1. Moreover, suppose that the transition probabilities for the Gilbert-Elliot model satisfy p_01, p_10 > 0. Then the expected error covariance E[P_k] is uniformly bounded if

p_01 > γ_min

and it is unbounded for some initial condition if p_01 < γ_min.

Networks with Multiple Links

We now consider feedback loops that are closed over a network of communication links, each of which drops packets according to a Bernoulli
process. The sensor communicates with a controller across the network, and we assume that the controller and actuator are collocated. The network may be represented by a graph G with nodes in the set V and edges in the set E, where edges are drawn between two communicating nodes. We denote by p_ij the probability of a drop when node i transmits to node j. Drops are assumed to be independent across links and time.

To maximize robustness with respect to drops, sensors use a Kalman filter to compute an optimal estimate for the state of the process based on their measurements and transmit this estimate across the network. When the sensors do not have access to the process input, they can take advantage of the linearity of the Kalman filter: as the output of a Kalman filter is the sum of a term due to measurements with another term due to control inputs, sensors may compute only the contribution due to measurements and transmit it to the controller, which can subsequently add the contribution due to the control inputs. This guarantees that optimal state estimates can still be computed at the control node, even when the sensors do not know the control input (Gupta et al. 2009).

The communication in the network goes as follows. Sensors time stamp their estimates and broadcast them to all nodes in their communication ranges. After receiving information from their neighbors, nodes compare time stamps and keep only the most recent estimates. These estimates are broadcast to all neighboring nodes. When the controller receives new information, the optimal Kalman estimate is reconstructed, taking into account the total transmission delay (learned from the packet time stamps), and a standard LQG control can be used (Gupta et al. 2009).

To determine whether or not this procedure results in a stable closed loop, one defines a cut C = (S, T) to be a partition of the node set V such that the sensor node is in S and the controller node is in T. The cut-set is then defined as the set of edges (i, j) ∈ E such that i ∈ S and j ∈ T, i.e., the set of edges that connect the sets S and T. The max-cut probability is then defined as

p_max-cut = max_{cuts (S,T)} ∏_{(i,j) ∈ S×T} p_ij.

The above maximization can be rewritten as a minimization over the sums of −log p_ij, which leads to a linear program known as the minimum cut problem in network optimization theory (Cook 1995).

Theorem 5 (Gupta et al. 2009) Assume that R_w, R_v > 0, that (A, B) is stabilizable, that (A, C) is observable, and that A is unstable. Then the control and communication policy described above is optimal for quadratic costs, and the expected state covariance is bounded if and only if

p_max-cut · (max{|λ(A)|})² < 1.

Estimation with Controlled Communication

To actively reduce network traffic and power consumption, sensor measurements may not be sent to the remote estimator at every time step. In addition, one may have the ability to somewhat control the probability of packet drops by varying the transmit power or by transmitting copies of the same message through multiple channel realizations. This is known as controlled communication, and it allows the designer to establish a trade-off between communication and estimation performance.

We consider the local estimation scenario described in section "Estimation with Local Computation" with the difference that the Bernoulli drops are now modulated as follows:

γ_k = 1 with prob. π_k,  γ_k = 0 with prob. 1 − π_k,

where the sensor is free to choose π_k ∈ [0, p_max] as a function of the information available up to time k. With its choice, the sensor incurs a communication cost c(π_k) at time k, where c(·) is some increasing function that may represent, for example, the energy needed in order to transmit with a probability of drop equal to 1 − π_k. Note
that transmission scheduling, where π_k is either 0 or p_max, is a special case of this framework.

In order to choose π_k, the sensor considers the estimation error ẽ_k := x̃_{k|k} − x̂_{k|k−1} between the local and the remote estimators. This error evolves according to

ẽ_{k+1} = d_k with prob. π_k,  ẽ_{k+1} = A ẽ_k + d_k with prob. 1 − π_k,

where d_k is the innovations process arising from the standard Kalman filter in the smart sensor.

Our objective is to find a "communication policy" that minimizes the long-term average cost

J̃ := lim_{K→∞} (1/K) E[ ∑_{k=0}^{K−1} (‖ẽ_k‖² + λ c(π_k)) ],  λ > 0,   (4)

which penalizes a linear combination of the remote estimation error variance E[‖ẽ_k‖²] and the average communication cost E[c(π_k)]. In this context, a communication policy should be understood as a rule that selects π_k as a function of the information available to the sensor.

When

(1 − p_max) (max{|λ(A)|})² < 1,

there exists an optimal communication policy that chooses π_k as a function of ẽ_k, which may be computed via dynamic programming and value iteration (Mesquita et al. 2012). While this procedure can be computationally difficult, it is often possible to obtain suboptimal but reasonable performance with rollout policies such as the following one:

π_k = argmin_{π ∈ [0, p_max]} [ (p_max − π) ẽ_k′ A′ H A ẽ_k + λ c(π) ],   (5)

where H is the positive semidefinite solution to the Lyapunov equation (1 − p_max) A′HA − H = −I (Mesquita et al. 2012).

When computing ẽ_k and π_k in (5) is computationally too costly for the sensor, one may prefer to make π_k a function of the number of consecutive dropped packets ℓ_k. In this case, minimizing J̃ in (4) is equivalent to minimizing the cost

J̄ := lim_{K→∞} (1/K) E[ ∑_{k=0}^{K−1} (trace(Σ_{ℓ_k}) + λ c(π_k)) ],

where

Σ_ℓ := ∑_{m=0}^{ℓ} (A′)^m R_w A^m.

Since ℓ_k belongs to a countable set, one can very efficiently solve this optimization using dynamic programming (Mesquita et al. 2012).

Summary and Future Directions

Most positive results in the subject rely on the assumption of perfect acknowledgments and on actuators and controllers being collocated. Future research should address ways of circumventing these assumptions.

Cross-References

Data Rate of Nonlinear Control Systems and Feedback Entropy
Information and Communication Complexity of Networked Control Systems
Networked Control Systems: Architecture and Stability Issues
Networked Systems
Quantized Control and Data Rate Constraints

Bibliography

Antunes D, Hespanha JP, Silvestre C (2012) Volterra integral approach to impulsive renewal systems: application to networked control. IEEE Trans Autom Control 57:607–619
Cook WJ (1995) Combinatorial optimization: papers from the DIMACS special year, vol 20. American Mathematical Society, Providence
Gupta V, Sinopoli B, Adlakha S, Goldsmith A, Murray R (2006) Receding horizon networked control. In: Proceedings of the Allerton conference on communication, control, and computing, Monticello
Gupta V, Dana AF, Hespanha JP, Murray RM, Hassibi B (2009) Data transmission over networks for estimation and control. IEEE Trans Autom Control 54(8):1807–1819
Imer OC, Basar T (2005) Optimal estimation with limited measurements. In: Proceedings of the 44th IEEE conference on decision and control, 2005 and 2005 European control conference (CDC-ECC'05), Seville
Mesquita AR, Hespanha JP, Nair GN (2012) Redundant data transmission in control/estimation over lossy networks. Automatica 48(8):1612–1620
Mo Y, Sinopoli B (2012) Kalman filtering with intermittent observations: tail distribution and critical value. IEEE Trans Autom Control 57(3):677–689
Schenato L, Sinopoli B, Franceschetti M, Poolla K, Sastry SS (2007) Foundations of control and estimation over lossy networks. Proc IEEE 95(1):163–187
Sinopoli B, Schenato L, Franceschetti M, Poolla K, Jordan MI, Sastry SS (2004) Kalman filtering with intermittent observations. IEEE Trans Autom Control 49(9):1453–1464

Networked Systems

Jorge Cortés
Department of Mechanical and Aerospace Engineering, University of California, San Diego, La Jolla, CA, USA

Abstract

This entry provides a brief overview on networked systems from a systems and control perspective. We pay special attention to the nature of the interactions among agents; the critical role played by information sharing, dissemination, and aggregation; and the distributed control paradigm to engineer the behavior of networked systems.

Keywords

Autonomous networks; Cooperative control; Multi-agent systems; Swarms

Introduction

Networked systems appear in numerous scientific and engineering domains, including communication networks (Toh 2001), multi-robot networks (Arkin 1998; Balch and Parker 2002), sensor networks (Santi 2005; Schenato et al. 2007), water irrigation networks (Cantoni et al. 2007), power and electrical networks (Chow 1982; Chiang et al. 1995; Dörfler et al. 2013), camera networks (Song et al. 2011), transportation networks (Ahuja et al. 1993), social networks (Jackson 2010), and chemical and biological networks (Kuramoto 1984; Strogatz 2003). Their applications are pervasive, ranging from environmental monitoring, ocean sampling, and marine energy systems, through search and rescue missions, high-stress deployment in disaster recovery, and health monitoring of critical infrastructure, to science imaging, the smart grid, and cybersecurity.

The rich nature of networked systems makes it difficult to provide a definition that, at the same time, is comprehensive enough to capture their variety and simple enough to be expressive of their main features. With this in mind, we loosely define a networked system as a "system of systems," i.e., a collection of agents that interact with each other. These groups might be heterogeneous, composed of human, biological, or engineered agents possessing different capabilities regarding mobility, sensing, actuation, communication, and computation. Individuals may have objectives of their own or may share a common objective with others, which in turn might be adversarial with respect to another subset of agents. In a networked system, the evolutions of the states of individual agents are coupled. Coupling might be the result of the physical interconnection among the agents, the consequence of the implementation of coordination algorithms where agents use information about each other, or a combination of both. There is diversity too in the nature of agents themselves and the interactions among them, which might be cooperative, adversarial, or belong to the rich range between the two. Due to changes in the state of the agents, the network, or the environment, interactions among
agents may be changing and dynamic. Such interactions may be structured across different layers, which themselves might be organized in a hierarchical fashion. Networked systems may also interact with external entities that specify high-level commands that trickle down through the system all the way to the agent level.

A defining characteristic of a networked system is the fact that information, understood in a broad sense, is sparse and distributed across the agents. As such, different individuals have access to information of varying degrees of quality. As part of the operation of the networked system, mechanisms are in place to share, transmit, and/or aggregate this information. Some information may be disseminated throughout the whole network or, in some cases, all information can be made centrally available at a reasonable cost. In other scenarios, however, the latter might turn out to be too costly, unfeasible, or undesirable because of privacy and security considerations. Individual agents are the basic unit for decision making, but decisions might be made from intermediate levels of the networked system all the way to a central planner. The combination of information availability and decision-making capabilities gives rise to an ample spectrum of possibilities between the centralized control paradigm, where all information is available at a central planner who makes the decisions, and the fully distributed control paradigm, where individual agents only have access to the information shared by their neighbors in addition to their own.

Perspective from Systems and Control

There are many aspects that come into play when dealing with networked systems regarding computation, processing, sensing, communication, planning, motion control, and decision making. This complexity makes their study challenging and fascinating and explains the interest that, with different emphases, they generate in a large number of disciplines. In biology, scientists analyze synchronization phenomena and self-organized swarming behavior in groups with distributed agent-to-agent interactions (Okubo 1986; Parrish et al. 2002; Conradt and Roper 2003; Couzin et al. 2005). In robotics, engineers design algorithmic solutions to help multivehicle networks and embedded systems coordinate their actions and perform challenging spatially distributed tasks (Arkin 1998; Committee on Networked Systems of Embedded Computers 2001; Balch and Parker 2002; Howard et al. 2006; Kumar et al. 2008). Graph theorists and applied mathematicians study the role played by the interconnection among agents in the emergence of phase transition phenomena (Bollobás 2001; Meester and Roy 2008; Chung 2010). This interest is also shared in communication and information theory, where researchers strive to design efficient communication protocols and examine the effect of topology control on group connectivity and information dissemination (Zhao and Guibas 2004; Giridhar and Kumar 2005; Lloyd et al. 2005; Santi 2005; Franceschetti and Meester 2007). Game theorists study the gap between the performance achieved by global, network-wide optimizers and the configurations that result from selfish agents interacting locally in social and economic systems (Roughgarden 2005; Nisan et al. 2007; Easley and Kleinberg 2010; Marden and Shamma 2013). In mechanism design, researchers seek to align the objectives of individual self-interested agents with the overall goal of the network. Static and mobile networked systems and their applications to the study of natural phenomena in oceans (Paley et al. 2008; Graham and Cortés 2012; Zhang and Leonard 2010; Das et al. 2012; Ouimet and Cortés 2013), rivers (Ru and Martínez 2013; Tinka et al. 2013), and the environment (DeVries and Paley 2012) also raise exciting challenges in estimation theory, computational geometry, and spatial statistics.

The field of systems and control brings a comprehensive approach to the modeling, analysis, and design of networked systems. Emphasis is put on the understanding of the general principles that explain how specific collective behaviors emerge from basic interactions; the establishment of models, abstractions, and tools that allow us to reason rigorously about complex
interconnected systems; and the development of systematic methodologies that help engineer their behavior. The ultimate goal is to establish a science for integrating individual components into complex, self-organizing networks with predictable behavior. To realize the "power of many" and expand the realm of what is possible to achieve beyond the individual agent capabilities, special care is taken to obtain precise guarantees on the stability properties of coordination algorithms, understand the conditions and constraints under which they work, and characterize their performance and robustness against a variety of disturbances and disruptions.

Research Issues – and How the Entries in the Encyclopedia Address Them

Given the key role played by agent-to-agent interactions in networked systems, the Encyclopedia entries Graphs for Modeling Networked Interactions and Dynamic Graphs, Connectivity of deal with how their nature and effect can be modeled through graphs. This includes diverse aspects such as deterministic and stochastic interactions, static and dynamic graphs, state-dependent and time-dependent neighboring relationships, and connectivity. The importance of maintaining a certain level of coordination and consistency across the networked system is manifested in the various entries that deal with coordination tasks that are, in some way or another, related to some form of agreement. These include consensus (Averaging Algorithms and Consensus), formation control (Vehicular Chains), cohesiveness, flocking (Flocking in Networked Systems), synchronization (Oscillator Synchronization), and distributed optimization (Distributed Optimization). A great deal of work (e.g., see Optimal Deployment and Spatial Coverage and Multi-vehicle Routing) is also devoted to the design of cooperative strategies that achieve spatially distributed tasks such as optimal coverage, space partitioning, vehicle routing, and servicing. These entries explore the optimal placement of agents, the optimal tuning of sensors, and the distributed optimization of network resources. The entry Estimation and Control over Networks explores the impact that communication channels may have on the execution of estimation and control tasks over networks of sensors and actuators. A strong point of commonality among the contributions is the precise characterization of the scalability of coordination algorithms, together with the rigorous analysis of their correctness and stability properties. Another focal point is the analysis of the performance gap between centralized and distributed approaches in regard to the ultimate network objective.

Further information about other relevant aspects of networked systems can be found throughout this Encyclopedia. Among these, we highlight the synthesis of cooperative strategies for data fusion, distributed estimation, and adaptive sampling, the analysis of the network operation under communication constraints (e.g., limited bandwidth, message drops, delays, and quantization), the treatment of game-theoretic scenarios that involve interactions among multiple players and where security concerns might be involved, distributed model predictive control, and the handling of uncertainty, imprecise information, and events via discrete-event systems and triggered control.

Summary and Future Directions

In conclusion, this entry has illustrated ways in which systems and control can help us design and analyze networked systems. We have focused on the role that information and agent interconnection play in shaping their behavior. We have also placed emphasis on the increasingly rich set of methods and techniques that allow us to provide correctness and performance guarantees. The field of networked systems is vast and the amount of work impossible to survey in this brief entry. The reader is invited to further explore additional topics beyond the ones mentioned here. The monographs (Ren and Beard 2008; Bullo et al. 2009; Mesbahi and Egerstedt 2010; Alpcan and Başar 2010) and edited
volumes (Kumar et al. 2004; Shamma 2008; Saligrama 2008), and manuscripts (Olfati-Saber et al. 2007; Baillieul and Antsaklis 2007; Leonard et al. 2007; Kim and Kumar 2012), together with the references provided in the Encyclopedia entries mentioned above, are a good starting point to undertake this enjoyable effort. Given the big impact that networked systems have, and will continue to have, in our society, from energy and transportation, through human interaction and healthcare, to biology and the environment, there is no doubt that the coming years will witness the development of more tools, abstractions, and models that allow us to reason rigorously about intelligent networks, and of techniques that help design truly autonomous and adaptive networks.

Cross-References

Averaging Algorithms and Consensus
Dynamic Graphs, Connectivity of
Distributed Optimization
Estimation and Control over Networks
Flocking in Networked Systems
Graphs for Modeling Networked Interactions
Multi-vehicle Routing
Optimal Deployment and Spatial Coverage
Oscillator Synchronization
Vehicular Chains

Bibliography

Ahuja RK, Magnanti TL, Orlin JB (1993) Network flows: theory, algorithms, and applications. Prentice Hall, Englewood Cliffs
Alpcan T, Başar T (2010) Network security: a decision and game-theoretic approach. Cambridge University Press, Cambridge, UK
Arkin RC (1998) Behavior-based robotics. MIT, Cambridge, MA
Baillieul J, Antsaklis PJ (2007) Control and communication challenges in networked real-time systems. Proc IEEE 95(1):9–28
Balch T, Parker LE (eds) (2002) Robot teams: from diversity to polymorphism. A. K. Peters, Wellesley, MA
Bollobás B (2001) Random graphs, 2nd edn. Cambridge University Press, Cambridge, UK
Bullo F, Cortés J, Martínez S (2009) Distributed control of robotic networks. Applied mathematics series. Princeton University Press, Princeton, NJ. Electronically available at http://coordinationbook.info
Cantoni M, Weyer E, Li Y, Ooi SK, Mareels I, Ryan M (2007) Control of large-scale irrigation networks. Proc IEEE 95(1):75–91
Chiang HD, Chu CC, Cauley G (1995) Direct stability analysis of electric power systems using energy functions: theory, applications, and perspective. Proc IEEE 83(11):1497–1529
Chow JH (1982) Time-scale modeling of dynamic networks with applications to power systems. Springer, New York, NY
Chung FRK (2010) Graph theory in the information age. Not AMS 57(6):726–732
Committee on Networked Systems of Embedded Computers (2001) Embedded, everywhere: a research agenda for networked systems of embedded computers. National Academy Press, Washington, DC
Conradt L, Roper TJ (2003) Group decision-making in animals. Nature 421(6919):155–158
Couzin ID, Krause J, Franks NR, Levin SA (2005) Effective leadership and decision-making in animal groups on the move. Nature 433(7025):513–516
Das J, Py F, Maughan T, O'Reilly T, Messié M, Ryan J, Sukhatme GS, Rajan K (2012) Coordinated sampling of dynamic oceanographic features with AUVs and drifters. Int J Robot Res 31(5):626–646
DeVries L, Paley D (2012) Multi-vehicle control in a strong flowfield with application to hurricane sampling. AIAA J Guid Control Dyn 35(3):794–806
Dörfler F, Chertkov M, Bullo F (2013) Synchronization in complex oscillator networks and smart grids. Proc Natl Acad Sci 110(6):2005–2010
Easley D, Kleinberg J (2010) Networks, crowds, and markets: reasoning about a highly connected world. Cambridge University Press, Cambridge, UK
Franceschetti M, Meester R (2007) Random networks for communication. Cambridge University Press, Cambridge, UK
Giridhar A, Kumar PR (2005) Computing and communicating functions over sensor networks. IEEE J Sel Areas Commun 23(4):755–764
Graham R, Cortés J (2012) Adaptive information collection by robotic sensor networks for spatial estimation. IEEE Trans Autom Control 57(6):1404–1419
Howard A, Parker LE, Sukhatme GS (2006) Experiments with a large heterogeneous mobile robot team: exploration, mapping, deployment, and detection. Int J Robot Res 25(5–6):431–447
Jackson MO (2010) Social and economic networks. Princeton University Press, Princeton, NJ
Kim KD, Kumar PR (2012) Cyberphysical systems: a perspective at the centennial. Proc IEEE 100(Special Centennial Issue):1287–1308
Kumar V, Leonard NE, Morse AS (eds) (2004) Cooperative control. Lecture notes in control and information sciences, vol 309. Springer, New York, NY
Kumar V, Rus D, Sukhatme GS (2008) Networked robots. In: Siciliano B, Khatib O (eds) Springer handbook of robotics. Springer, New York, NY, pp 943–958
Kuramoto Y (1984) Chemical oscillations, waves, and turbulence. Springer, New York, NY
Leonard NE, Paley D, Lekien F, Sepulchre R, Fratantoni DM, Davis R (2007) Collective motion, sensor networks and ocean sampling. Proc IEEE 95(1):48–74
Lloyd EL, Liu R, Marathe MV, Ramanathan R, Ravi SS (2005) Algorithmic aspects of topology control problems for ad hoc networks. Mobile Netw Appl 10(1–2):19–34
Marden JR, Shamma JS (2013) Game theory and distributed control. In: Young P, Zamir S (eds) Handbook of game theory, vol 4. Elsevier, Oxford, UK
Meester R, Roy R (2008) Continuum percolation. Cambridge University Press, Cambridge, UK
Mesbahi M, Egerstedt M (2010) Graph theoretic methods in multiagent networks. Applied mathematics series. Princeton University Press, Princeton, NJ
Nisan N, Roughgarden T, Tardos E, Vazirani VV (2007) Algorithmic game theory. Cambridge University Press, Cambridge, UK
Okubo A (1986) Dynamical aspects of animal grouping: swarms, schools, flocks and herds. Adv Biophys 22:1–94
Olfati-Saber R, Fax JA, Murray RM (2007) Consensus and cooperation in networked multi-agent systems. Proc IEEE 95(1):215–233
Ouimet M, Cortés J (2013) Collective estimation of ocean nonlinear internal waves using robotic underwater drifters. IEEE Access 1:418–427
Paley D, Zhang F, Leonard N (2008) Cooperative control for ocean sampling: the glider coordinated control system. IEEE Trans Control Syst Technol 16(4):735–744
Parrish JK, Viscido SV, Grunbaum D (2002) Self-organized fish schools: an examination of emergent properties. Biol Bull 202:296–305
Ren W, Beard RW (2008) Distributed consensus in multi-vehicle cooperative control. Communications and control engineering. Springer, New York, NY
Roughgarden T (2005) Selfish routing and the price of anarchy. MIT, Cambridge, MA
Ru Y, Martínez S (2013) Coverage control in constant flow environments based on a mixed energy-time metric. Automatica 49(9):2632–2640
Saligrama V (ed) (2008) Networked sensing information and control. Springer, New York, NY
Santi P (2005) Topology control in wireless ad hoc and sensor networks. Wiley, New York, NY
Schenato L, Sinopoli B, Franceschetti M, Poolla K, Sastry SS (2007) Foundations of control and estimation over lossy networks. Proc IEEE 95(1):163–187
Shamma JS (ed) (2008) Cooperative control of distributed multi-agent systems. Wiley, New York, NY
Song B, Ding C, Kamal AT, Farrel JA, Roy-Chowdhury AK (2011) Distributed camera networks. IEEE Signal Process Mag 28(3):20–31
Strogatz SH (2003) SYNC: the emerging science of spontaneous order. Hyperion, New York, NY
Tinka A, Rafiee M, Bayen A (2013) Floating sensor networks for river studies. IEEE Syst J 7(1):36–49
Toh CK (2001) Ad hoc mobile wireless networks: protocols and systems. Prentice Hall, Englewood Cliffs, NJ
Zhang F, Leonard NE (2010) Cooperative filters and control for cooperative exploration. IEEE Trans Autom Control 55(3):650–663
Zhao F, Guibas L (2004) Wireless sensor networks: an information processing approach. Morgan-Kaufmann, San Francisco, CA


Neural Control and Approximate Dynamic Programming

Frank L. Lewis¹ and Kyriakos G. Vamvoudakis²
¹Arlington Research Institute, University of Texas, Fort Worth, TX, USA
²Center for Control, Dynamical Systems and Computation (CCDC), University of California, Santa Barbara, CA, USA

Abstract

There has been great interest recently in "universal model-free controllers" that do not need a mathematical model of the controlled plant, but mimic the functions of biological processes to learn about the systems they are controlling online, so that performance improves automatically. Neural network (NN) control has had two major thrusts: approximate dynamic programming, which uses NN to approximately solve the optimal control problem, and NN in closed-loop feedback control.

Keywords

Adaptive control; Learning systems; Neural networks; Optimal control; Reinforcement learning

Neural Feedback Control

The objective is to design NN feedback controllers that cause a system to follow, or track,
a prescribed trajectory or path. Consider the dynamics of an n-link robot manipulator

$$M(q)\ddot q + V_m(q,\dot q)\dot q + G(q) + F(\dot q) + \tau_d = \tau \qquad (1)$$

with $q(t) \in \mathbb{R}^n$ the joint variable vector, $M(q)$ an inertia matrix, $V_m$ a centripetal/Coriolis matrix, $G(q)$ a gravity vector, and $F(\cdot)$ representing friction terms. Bounded unknown disturbances and modeling errors are denoted by $\tau_d$, and the control input torque is $\tau(t)$. The sliding mode control approach (Slotine and Li 1987) can be generalized to NN control systems. Given a desired trajectory $q_d \in \mathbb{R}^n$, define the tracking error $e(t) = q_d(t) - q(t)$ and the sliding variable error $r = \dot e + \Lambda e$ with $\Lambda = \Lambda^T > 0$. Define the nonlinear robot function

$$f(x) = M(q)(\ddot q_d + \Lambda \dot e) + V_m(q,\dot q)(\dot q_d + \Lambda e) + G(q) + F(\dot q)$$

where the known vector $x(t)$ of measured signals is selected as $x = [\,e^T\ \dot e^T\ q_d^T\ \dot q_d^T\ \ddot q_d^T\,]^T$.

NN Controller for Continuous-Time Systems

The NN controller is designed based on functional approximation properties of NN as shown in Lewis et al. (1999). Thus, assume that $f(x)$ can be approximated by $\hat f(x) = \hat W^T \sigma(\hat V^T x)$ with $\hat V, \hat W$ the estimated NN weights. Select the control input $\tau = \hat W^T \sigma(\hat V^T x) + K_v r - v$ with $K_v$ a symmetric positive definite gain and $v(t)$ a robustifying function. This NN control structure is shown in Fig. 1. The outer proportional-derivative (PD) tracking loop guarantees robust behavior. The inner loop containing the NN is known as a feedback linearization loop, and the NN effectively learns the unknown dynamics online to cancel the nonlinearities of the system. Let the estimated sigmoid Jacobian be $\hat\sigma' \equiv d\sigma(z)/dz\,|_{z=\hat V^T x}$. Then, the NN weight tuning laws are provided by

$$\dot{\hat W} = F \hat\sigma r^T - F \hat\sigma' \hat V^T x\, r^T - \kappa F \|r\|\, \hat W, \qquad \dot{\hat V} = G x\, (\hat\sigma'^{T} \hat W r)^T - \kappa G \|r\|\, \hat V,$$

with any constant symmetric matrices $F, G > 0$ and scalar tuning parameter $\kappa > 0$.

NN Controller for Discrete-Time Systems

Most feedback controllers today are implemented on digital computers. This requires the specification of control algorithms in discrete-time or digital form (Lewis et al. 1999). To design such controllers, one may consider the discrete-time dynamics $x_{k+1} = f(x_k) + g(x_k)u_k$ with unknown functions $f(\cdot), g(\cdot)$. The digital NN controller derived in this situation has the form of a feedback linearization controller shown in Fig. 1. One can derive tuning algorithms, for a discrete-time neural network controller with $L$ layers, that guarantee system stability and robustness (Lewis et al. 1999). For the $i$-th layer, the weight updates are of the form

$$\hat W_i(k+1) = \hat W_i(k) - \alpha_i \hat\varphi_i(k)\, \hat y_i^T(k) - \Gamma \big\| I - \alpha_i \hat\varphi_i(k) \hat\varphi_i^T(k) \big\|\, \hat W_i(k)$$

where $\hat\varphi_i(k)$ are the output functions of layer $i$, $0 < \Gamma < 1$ is a design parameter, and

$$\hat y_i(k) = \begin{cases} \hat W_i^T \hat\varphi_i(k) + K_v r(k) & \text{for } i = 1, \dots, L-1,\\ r(k+1) & \text{for } i = L, \end{cases}$$

with $r(k)$ a filtered error.

Feedforward Neurocontroller

Industrial, aerospace, DoD, and MEMS assembly systems have actuators that generally contain deadzone, backlash, and hysteresis. Since these actuator nonlinearities appear in the feedforward loop, the NN compensator must also appear in the feedforward loop. This design is significantly more complex than for feedback NN controllers. Details are given in Lewis et al. (2002). Feedforward controllers can offset the effects of deadzone if properly designed. It can be shown that a NN deadzone compensator has the structure shown in Fig. 2.
Neural Control and Approximate Dynamic Programming, Fig. 1 Neural network robot controller

Neural Control and Approximate Dynamic Programming, Fig. 2 Feedforward NN for deadzone compensation
The NN compensator consists of two NNs. NN II is in the direct feedforward control loop, and NN I is not directly in the control loop but serves as an observer to estimate the (unmeasured) applied torque $\tau(t)$. The feedback stability and performance of the NN deadzone compensator have been rigorously proven using nonlinear stability proof techniques. The two NN were each selected as having one tunable layer, namely, the output weights. The activation functions were set as a basis by selecting fixed random values for the first-layer weights. To guarantee stability, the output weights of the inversion NN II (subscript $i$ denotes weights and sigmoids of the inversion)
and the estimator NN I should be tuned respectively as

$$\dot{\hat W}_i = T \sigma_i(V_i^T w)\, r^T \hat W^T \hat\sigma'(V^T u) V^T - k_1 T \|r\|\, \hat W_i - k_2 T \|r\|\, \|\hat W_i\|\, \hat W_i,$$

$$\dot{\hat W} = S\, \hat\sigma'(V^T u) V^T \hat W_i\, \sigma_i(V_i^T w)\, r^T - k_1 S \|r\|\, \hat W,$$

with design matrices $T, S > 0$ and tuning gains $k_1, k_2$.

Approximate Dynamic Programming for Feedback Control

The current status of work in approximate dynamic programming (ADP) for feedback control is given in Lewis and Liu (2012). ADP is a form of reinforcement learning based on an actor/critic structure. Reinforcement learning (RL) is a class of methods used in machine learning to methodically modify the actions of an agent based on observed responses from its environment (Sutton and Barto 1998). The actor/critic structures are RL systems that have two learning structures: a critic network evaluates the performance of a current action policy and, based on that evaluation, an actor structure updates the action policy, as shown in Fig. 3. Adaptive optimal controllers (Lewis et al. 2012b) have been proposed by adding optimality criteria to an adaptive controller or adding adaptive characteristics to an optimal controller.

Neural Control and Approximate Dynamic Programming, Fig. 3 RL with an actor/critic structure

Optimal Adaptive Control of Discrete-Time Nonlinear Systems

Consider a class of discrete-time systems described by the deterministic nonlinear dynamics in the affine state space difference equation form

$$x_{k+1} = f(x_k) + g(x_k) u_k, \qquad (2)$$

with state $x_k \in \mathbb{R}^n$ and control input $u_k \in \mathbb{R}^m$. A deterministic control policy is defined as a function from state space to control space, $\mathbb{R}^n \to \mathbb{R}^m$. That is, for every state $x_k$, the policy defines a control action $u_k = h(x_k)$ as a feedback controller. Define a deterministic cost function that yields the value function

$$V(x_k) = \sum_{i=k}^{\infty} \gamma^{\,i-k}\, r(x_i, u_i),$$

with $0 < \gamma \le 1$ a discount factor, $r(x_k, u_k) = Q(x_k) + u_k^T R u_k$ with $Q(x_k), R > 0$, and $u_k = h(x_k)$ a prescribed feedback control policy. The optimal value is given by Bellman's optimality equation

$$V^*(x_k) = \min_{h(\cdot)} \big( r(x_k, h(x_k)) + \gamma V^*(x_{k+1}) \big),$$

which is the discrete-time Hamilton-Jacobi-Bellman (HJB) equation. Two forms of RL can be based on policy iteration (PI) and value iteration (VI). For temporal difference learning, PI is
written as follows in terms of the deterministic Bellman equation.

Algorithm 1 PI for discrete-time systems
1: procedure
2: Given an admissible policy $h_0(x_k)$
3: while $\|V^{h_i} - V^{h_{i-1}}\| \ge \epsilon_{ac}$ do
4: Solve for the value $V_{i+1}(x)$ using $V_{i+1}(x_k) = r(x_k, h_i(x_k)) + \gamma V_{i+1}(x_{k+1})$
5: Update the control policy $h_{i+1}(x_k)$ using $h_{i+1}(x_k) = \arg\min_{h(\cdot)} \big( r(x_k, h(x_k)) + \gamma V_{i+1}(x_{k+1}) \big)$
6: $i := i + 1$
7: end while
8: end procedure

where $\epsilon_{ac}$ is a small number that checks the algorithm convergence. Value iteration is similar, but the policy evaluation procedure is performed as $V_{i+1}(x_k) = r(x_k, h_i(x_k)) + \gamma V_i(x_{k+1})$. In value iteration, we can select any initial control policy, not necessarily admissible or stabilizing. In the control system shown in Fig. 3, the critic and the actor NNs are tuned online using the observed data $x_k, x_{k+1}, r(x_k, h_i(x_k))$ along the system trajectory. The critic and actor are tuned sequentially in both the PI and the VI. That is, the weights of one neural network are held constant while the weights of the other are tuned until convergence. This procedure is repeated until both neural networks have converged. Thus, the controller learns the optimal controller online. The convergence of value iteration using two neural networks for the discrete-time nonlinear system (2) is proven in Al-Tamimi et al. (2008). Design of an ADP controller that uses only output feedback is given in Lewis and Vamvoudakis (2011).

Optimal Adaptive Control of Continuous-Time Nonlinear Systems

RL is considerably more difficult for continuous-time systems than for discrete-time systems, and fewer results are available. This subsection will provide the formulation of the optimal control problem followed by an offline PI algorithm provided in Abu-Khalaf and Lewis (2005) that will give us the structure for the proposed online algorithms that follow. Consider the following nonlinear time-invariant dynamical system, affine in the input, given by

$$\dot x(t) = f(x(t)) + g(x(t)) u(t); \quad x(0) = x_0 \qquad (3)$$

with $x(t) \in \mathbb{R}^n$, $f(x(t)) \in \mathbb{R}^n$, $g(x(t)) \in \mathbb{R}^{n \times m}$, and control input $u(t) \in \mathbb{R}^m$. We assume that $f(0) = 0$, that $f(x) + g(x)u$ is Lipschitz continuous on a set $\Omega \subseteq \mathbb{R}^n$ that contains the origin, and that the system is stabilizable on $\Omega$, that is, there exists a continuous control function $u(t) \in U$ such that the system is asymptotically stable on $\Omega$. Define the infinite horizon integral cost, for all $t \ge 0$,

$$V(x_t) = \int_t^{\infty} r(x(\tau), u(\tau))\, d\tau, \qquad (4)$$

with $Q(x)$ positive definite and $R \in \mathbb{R}^{m \times m}$ a symmetric positive definite matrix. For any admissible control policy, if the associated cost (4) is $C^1$, then an infinitesimal version is the Bellman equation, and the optimal cost function $V^*(x)$ is defined by

$$V^*(x_0) = \min_u \int_0^{\infty} r(x, u)\, d\tau,$$

which satisfies the HJB equation. By employing the stationarity condition, the optimal control function for the given problem is

$$u^*(x) = -\frac{1}{2} R^{-1} g^T(x) \frac{\partial V^*(x)}{\partial x}. \qquad (5)$$

Inserting the optimal control (5) into the Bellman equation, one obtains the formulation of the HJB equation in terms of $\partial V^*(x)/\partial x$ and with boundary condition $V^*(0) = 0$:

$$0 = r(x, u^*) + \left( \frac{\partial V^*(x)}{\partial x} \right)^T \big( f(x) + g(x) u^* \big), \qquad (6)$$

which for the linear case becomes the well-known Riccati equation. In order to find the
optimal control solution for the problem, one needs to solve the HJB equation (6) for the value function and then substitute in (5) to obtain the optimal control. However, due to the nonlinear nature of the HJB equation, finding its solution is generally difficult or impossible. The following PI algorithm is an iterative algorithm for solving optimal control problems and will give us the structure for the online learning algorithm.

Algorithm 2 PI for continuous-time systems
1: procedure
2: Given an admissible policy $u^{(0)}$
3: while $\|V^{u^{(i)}} - V^{u^{(i-1)}}\| \ge \epsilon_{ac}$ do
4: Solve for the value $V^{(i)}(x)$ using Bellman's equation
$$Q(x) + \left( \frac{\partial V^{u^{(i)}}}{\partial x} \right)^T \big( f(x) + g(x) u^{(i)} \big) + u^{(i)T} R\, u^{(i)} = 0, \quad V^{u^{(i)}}(0) = 0$$
5: Update the control policy $u^{(i+1)}$ using
$$u^{(i+1)} = -\frac{1}{2} R^{-1} g^T(x) \frac{\partial V^{u^{(i)}}}{\partial x}$$
6: $i := i + 1$
7: end while
8: end procedure

A PI algorithm that solves online the HJB equation without full information of the plant dynamics is proposed in Vrabie et al. (2009), where the Bellman equation is proved to be equivalent to the integral reinforcement learning (IRL) form, with the optimal value given, for some $T > 0$, as

$$V^*(x(t)) = \min_u \left( \int_t^{t+T} r(x(\tau), u(\tau))\, d\tau + V^*(x(t+T)) \right).$$

Therefore, the temporal difference error for continuous-time systems can be defined as

$$e(t : t+T) = \rho(t : t+T) + V(x(t+T)) - V(x(t)),$$

with $\rho(t : t+T) \equiv \int_t^{t+T} r(x(\tau), u(\tau))\, d\tau$, without any information of the plant dynamics. The IRL controller just given tunes the critic neural network to determine the value while holding the control policy fixed. The IRL algorithm can be implemented online by RL techniques using value function approximation $\hat V(x) = \hat W_1^T \varphi(x)$ in a critic approximator network. Using that approximation in the PI algorithm, one can use batch least squares or recursive least squares to update the value function and then, on convergence of the value parameters, the action is updated. The implementation of the IRL optimal adaptive control algorithm is shown in Fig. 4.
Neural Control and Approximate Dynamic Programming, Fig. 4 Hybrid optimal adaptive controller based on IRL
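The IRL idea can be mimicked on the simplest possible case. The sketch below is a hedged illustration, not the entry's algorithm: a scalar linear-quadratic problem in which the critic fits $P$ in $V(x) = P x^2$ by least squares from measured interval costs (the integral reinforcement $\rho$), and the actor update $u = -R^{-1} b P x$ follows (5). The plant parameters $a, b$, the weights $q, R$, and the rollout routine are invented for the demo; note that the drift $a$ enters only through the simulated plant, never through the critic update.

```python
import math

# Scalar plant x' = a*x + b*u with utility q*x^2 + R*u^2.
a, b, q, R = -1.0, 1.0, 1.0, 1.0
# Scalar algebraic Riccati solution, used only to check the result:
# 2aP - b^2 P^2 / R + q = 0  =>  P = R*(a + sqrt(a^2 + b^2 q / R)) / b^2
P_are = R * (a + math.sqrt(a * a + b * b * q / R)) / (b * b)

def rollout(K, x0, T, dt=1e-4):
    """Simulate u = -K*x over [0, T]; return x(T) and the integral reinforcement rho."""
    x, rho = x0, 0.0
    for _ in range(int(T / dt)):
        u = -K * x
        rho += dt * (q * x * x + R * u * u)
        x += dt * (a * x + b * u)      # the plant is evaluated here only, as a black box
    return x, rho

K, T = 0.0, 0.5                        # initial admissible policy (stabilizing since a < 0)
for _ in range(8):                     # policy iterations
    num = den = 0.0
    for x0 in (0.5, 1.0, 1.5, 2.0):    # batch of short measured trajectories
        xT, rho = rollout(K, x0, T)
        d = x0 * x0 - xT * xT          # phi(x(t)) - phi(x(t+T)) for the basis phi(x) = x^2
        num += d * rho                 # least squares fit of P in  P*d = rho
        den += d * d
    P = num / den                      # critic update: value estimate V(x) = P*x^2
    K = b * P / R                      # actor update:  u = -R^{-1} b P x
print(P, P_are)
```

Each pass evaluates the current policy purely from data $(x(t), x(t+T), \rho)$ and then improves it; the iteration converges to the Riccati solution, which is the scalar analogue of the hybrid scheme in Fig. 4.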
The work in Vamvoudakis and Lewis (2010) presents a way of finding the optimal control solution in a synchronous manner along with stability and convergence guarantees, but with known dynamics. This procedure is more nearly in line with accepted practice in adaptive control. A synchronous online learning algorithm that avoids the knowledge of the drift dynamics is proposed in Vamvoudakis et al. (2013).

Learning in Games

Reinforcement learning techniques have been applied to design adaptive controllers that converge to the solution of two-player zero-sum games in Vamvoudakis and Lewis (2012) and Vrabie et al. (2012), of multiplayer nonzero-sum games in Vamvoudakis et al. (2012a), and of Stackelberg games in Vamvoudakis et al. (2012b). In these cases, the adaptive control structure has multiple loops, with action networks and critic networks for each player. The adaptive controller for zero-sum games finds the solution to the H-infinity control problem online in real time. This adaptive controller does not require any system dynamics information.

Summary and Future Directions

This entry discusses some neuro-inspired adaptive control techniques. These controllers have multi-loop, multi-timescale structures and can learn the solutions to Hamilton-Jacobi design equations such as the Riccati equation online without knowing the full dynamical model of the system. A method known as Q learning allows the learning of optimal control solutions online, in the discrete-time case, for completely unknown systems. Q learning has not yet been fully investigated for continuous-time systems.

Cross-References

▶Adaptive Control, Overview
▶Stochastic Games and Learning
▶Optimal Control and the Dynamic Programming Principle

Acknowledgments This material is based upon work supported by NSF grant ECCS-1128050, ARO grant W91NF-05-1-0314, and AFOSR grant FA9550-09-1-0278.

Bibliography

Abu-Khalaf M, Lewis FL (2005) Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach. Automatica 41(5):779–791
Al-Tamimi A, Lewis FL, Abu-Khalaf M (2008) Discrete-time nonlinear HJB solution using approximate dynamic programming: convergence proof. IEEE Trans Syst Man Cybern Part B 38(4):943–949
Lewis FL, Liu D (2012) Reinforcement learning and approximate dynamic programming for feedback control. IEEE Press computational intelligence series. Wiley-Blackwell, Oxford
Lewis FL, Vamvoudakis KG (2011) Reinforcement learning for partially observable dynamic processes: adaptive dynamic programming using measured output data. IEEE Trans Syst Man Cybern Part B 41(1):14–25
Lewis FL, Jagannathan S, Yesildirek A (1999) Neural network control of robot manipulators and nonlinear systems. Taylor and Francis, London
Lewis FL, Campos J, Selmic R (2002) Neuro-fuzzy control of industrial systems with actuator nonlinearities. Society of Industrial and Applied Mathematics Press, Philadelphia
Lewis FL, Vrabie D, Syrmos VL (2012a) Optimal control. Wiley, New York
Lewis FL, Vrabie D, Vamvoudakis KG (2012b) Reinforcement learning and feedback control: using natural decision methods to design optimal adaptive controllers. IEEE Control Syst Mag 32(6):76–105
Slotine JJE, Li W (1987) On the adaptive control of robot manipulators. Int J Robot Res 6(3):49–59
Sutton RS, Barto AG (1998) Reinforcement learning – an introduction. MIT, Cambridge
Vamvoudakis KG, Lewis FL (2010) Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem. Automatica 46(5):878–888
Vamvoudakis KG, Lewis FL (2011) Multi-player non zero sum games: online adaptive learning solution of coupled Hamilton-Jacobi equations. Automatica 47(8):1556–1569
Vamvoudakis KG, Lewis FL (2012) Online solution of nonlinear two-player zero-sum games using synchronous policy iteration. Int J Robust Nonlinear Control 22(13):1460–1483
Vamvoudakis KG, Lewis FL, Hudas GR (2012a) Multi-agent differential graphical games: online adaptive learning solution for synchronization with optimality. Automatica 48(8):1598–1611
Vamvoudakis KG, Lewis FL, Johnson M, Dixon WE (2012b) Online learning algorithm for Stackelberg games in problems with hierarchy. In: Proceedings of the 51st IEEE conference on decision and control, Maui, pp 1883–1889
Vamvoudakis KG, Vrabie D, Lewis FL (2013) Online adaptive algorithm for optimal control with integral reinforcement learning. Int J Robust Nonlinear Control, Wiley. doi: 10.1002/rnc.3018
Vrabie D, Pastravanu O, Lewis FL, Abu-Khalaf M (2009) Adaptive optimal control for continuous-time linear systems based on policy iteration. Automatica 45(2):477–484
Vrabie D, Vamvoudakis KG, Lewis FL (2012) Optimal adaptive control and differential games by reinforcement learning principles. Control engineering series. IET Press, London
Werbos PJ (1989) Neural networks for control and system identification. In: Proceedings of the IEEE conference on decision and control, Tampa
Werbos PJ (1992) Approximate dynamic programming for real-time control and neural modeling. In: White DA, Sofge DA (eds) Handbook of intelligent control. Van Nostrand Reinhold, New York


Nominal Model-Predictive Control

Lars Grüne
Mathematical Institute, University of Bayreuth, Bayreuth, Germany

Abstract

Model-predictive control is a controller design method which synthesizes a sampled data feedback controller from the iterative solution of open-loop optimal control problems. We describe the basic functionality of MPC controllers, their properties regarding feasibility, stability, and performance, and the assumptions needed in order to rigorously ensure these properties in a nominal setting.

Keywords

Recursive feasibility; Sampled-data feedback; Stability

Introduction

Model-predictive control (MPC) is a method for the optimization-based control of linear and nonlinear dynamical systems. While the literal meaning of "model-predictive control" applies to virtually every model-based controller design method, nowadays the term commonly refers to control methods in which pieces of open-loop optimal control functions or sequences are put together in order to synthesize a sampled data feedback law. As such, it is often used synonymously with "receding horizon control."

The concept of MPC was first presented in Propoĭ (1963) and was reinvented several times already in the 1960s. Due to the lack of sufficiently fast computer hardware, for a while these ideas did not have much of an impact. This changed during the 1970s when MPC was successfully used in chemical process control. At that time, MPC was mainly applied to linear systems with quadratic cost and linear constraints, since for this class of problems algorithms were sufficiently fast for real-time implementation – at least for the typically relatively slow dynamics of process control systems. The 1980s then saw the development of theory and increasingly sophisticated concepts for linear MPC, while in the 1990s nonlinear MPC (often abbreviated as NMPC) attracted the attention of the MPC community. After the year 2000, several gaps in the analysis of nonlinear MPC without terminal constraints and costs were closed, and increasingly faster algorithms were developed. Together with the progress in hardware, this has considerably broadened the possible applications of both linear and nonlinear MPC.

In this entry, we explain the functionality of nominal MPC along with its most important properties and the assumptions needed to rigorously ensure these properties. We also give some hints on the underlying proofs. The term nominal MPC refers to the assumption that the mismatch between our model and the real plant is sufficiently small to be neglected in the following considerations. If this is not the case, methods from robust MPC must be used (▶Robust Model-Predictive Control). We describe all concepts for nonlinear discrete time systems, noting that the basic results outlined in this entry are conceptually similar for linear and for continuous-time systems.
Model-Predictive Control

In this entry, we discuss MPC for discrete time control systems of the form

$x_u(j+1) = f(x_u(j), u(j)), \quad x_u(0) = x_0$  (1)

with state $x_u(j) \in X$, initial condition $x_0 \in X$, and control input sequence $u = (u(0), u(1), \ldots)$ with $u(k) \in U$, where the state space $X$ and the control value space $U$ are normed spaces. For control systems in continuous time, one may apply the discrete time approach to a sampled data model of the system; alternatively, continuous-time versions of the concepts and results from this entry are available in the literature; see, e.g., Findeisen and Allgöwer (2002) or Mayne et al. (2000).

The core of any MPC scheme is an optimal control problem of the form

minimize $J_N(x_0, u)$  (2)

w.r.t. $u = (u(0), \ldots, u(N-1))$ with

$J_N(x_0, u) := \sum_{j=0}^{N-1} \ell(x_u(j), u(j)) + F(x_u(N))$  (3)

subject to the constraints

$u(j) \in \mathbb{U}, \ x_u(j) \in \mathbb{X}$ for $j = 0, \ldots, N-1$, and $x_u(N) \in \mathbb{X}_0$,  (4)

for control constraint set $\mathbb{U} \subseteq U$, state constraint set $\mathbb{X} \subseteq X$, and terminal constraint set $\mathbb{X}_0 \subseteq \mathbb{X}$. The function $\ell : X \times U \to \mathbb{R}$ is called stage cost or running cost; the function $F : X \to \mathbb{R}$ is referred to as terminal cost. We assume that for each initial value $x_0 \in X$, the optimal control problem (2) has a solution and denote the corresponding minimizing control sequence by $u^\star$. Algorithms for computing $u^\star$ are discussed in ▶Optimization Algorithms for Model Predictive Control and ▶Explicit Model Predictive Control.

The key idea of MPC is to compute the values $\mu_N(x)$ of the MPC feedback law $\mu_N$ from the open-loop optimal control sequences $u^\star$. To formalize this idea, consider the closed-loop system

$x_{\mu_N}(k+1) = f\big(x_{\mu_N}(k), \mu_N(x_{\mu_N}(k))\big).$  (5)

In order to evaluate $\mu_N$ along the closed-loop solution, given an initial value $x_{\mu_N}(0) \in X$, we iteratively perform the following steps.

Basic MPC Loop
1. Set $k := 0$.
2. Solve (2)–(4) for $x_0 = x_{\mu_N}(k)$; denote the optimal control sequence by $u^\star = (u^\star(0), \ldots, u^\star(N-1))$.
3. Set $\mu_N(x_{\mu_N}(k)) := u^\star(0)$, compute $x_{\mu_N}(k+1)$ according to (5), set $k := k+1$, and go to 2.

Due to its ability to handle constraints and possibly nonlinear dynamics, MPC has become one of the most popular modern control methods in the industry (▶Model-Predictive Control in Practice). While in the literature various variants of this basic scheme are discussed, here we restrict ourselves to this most widely used basic MPC scheme.

When analyzing an MPC scheme, three properties are important:
• Recursive feasibility, i.e., the property that the constraints (4) can be satisfied in Step 2 in each sampling instant
• Stability, i.e., in particular convergence of the closed-loop solutions $x_{\mu_N}(k)$ to a desired equilibrium $x^\ast$ as $k \to \infty$
• Performance, i.e., appropriate quantitative properties of $x_{\mu_N}(k)$

Here we discuss these three issues for two widely used MPC variants:
(a) MPC with terminal constraints and costs
(b) MPC with neither terminal constraints nor costs
In (a), $F$ and $\mathbb{X}_0$ in (3) and (4) are specifically designed in order to guarantee proper performance of the closed loop. In (b), we set $F \equiv 0$ and $\mathbb{X}_0 = \mathbb{X}$. Thus, the choice of $\ell$ and $N$ in (3) is the most important part of the design procedure.
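The basic MPC loop above is easy to prototype. The following sketch (an illustration, not taken from the entry) implements it for an unconstrained linear-quadratic problem with $F \equiv 0$, where the open-loop problem in step 2 can be solved exactly by a backward dynamic-programming (Riccati) recursion; the double-integrator model and the weights are illustrative assumptions.

```python
import numpy as np

def finite_horizon_gains(A, B, Q, R, N):
    """Backward Riccati recursion: feedback gains K_0, ..., K_{N-1} for the
    horizon-N problem with stage cost x'Qx + u'Ru and terminal cost F = 0."""
    P = np.zeros_like(Q)                 # terminal cost F = 0
    gains = []
    for _ in range(N):
        S = R + B.T @ P @ B
        K = np.linalg.solve(S, B.T @ P @ A)
        P = Q + A.T @ P @ A - A.T @ P @ B @ K
        gains.append(K)
    gains.reverse()                      # gains[0] belongs to time 0
    return gains

def mpc_step(A, B, Q, R, N, x):
    """Step 2 of the basic MPC loop: solve the horizon-N problem for
    x0 = x and return the first element u*(0) of the optimal sequence."""
    K0 = finite_horizon_gains(A, B, Q, R, N)[0]
    return -K0 @ x

# illustrative double-integrator model (assumption, not from the entry)
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)

x = np.array([5.0, 0.0])
for k in range(30):                      # basic MPC loop
    u = mpc_step(A, B, Q, R, N=8, x=x)   # step 2: open-loop problem
    x = A @ x + B @ u                    # step 3: closed-loop update (5)
print(np.linalg.norm(x))
```

Running the loop drives the state toward the origin; with state or input constraints, step 2 would instead require a constrained quadratic program (cf. ▶Optimization Algorithms for Model Predictive Control).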
Recursive Feasibility

Since the ability to handle constraints is one of the key features of MPC, it is important to ensure that the constraints $x_{\mu_N}(k) \in \mathbb{X}$ and $\mu_N(x_{\mu_N}(k)) \in \mathbb{U}$ are satisfied for all $k \ge 0$. However, beyond constraint satisfaction, the stronger property $x_{\mu_N}(k) \in X_N$ is required, where $X_N$ denotes the feasible set for horizon $N$,

$X_N := \{x \in \mathbb{X} \mid \text{there exists } u \text{ such that (4) holds}\}.$

The property $x \in X_N$ is called feasibility of $x$. Feasibility of $x = x_{\mu_N}(k)$ is a prerequisite for the MPC feedback $\mu_N$ being well defined, because the nonexistence of an admissible control sequence $u$ would imply that solving (2) under the constraints (4) in Step 2 of the MPC iteration is impossible.

Since for $k \ge 0$ the state $x_{\mu_N}(k+1) = f(x_{\mu_N}(k), u^\star(0))$ is determined by the solution of the previous optimal control problem, the usual way to address this problem is via the notion of recursive feasibility. This property demands the existence of a set $A \subseteq \mathbb{X}$ such that:
• For each $x_0 \in A$, the problem (2)–(4) is feasible.
• For each $x_0 \in A$ and the optimal control $u^\star$ from (2) to (4), the relation $f(x_0, u^\star(0)) \in A$ holds.
It is not too difficult to see that this property implies $x_{\mu_N}(k) \in A$ for all $k \ge 1$ if $x_{\mu_N}(0) \in A$.

For terminal-constrained problems, recursive feasibility is usually established by demanding that the terminal constraint set $\mathbb{X}_0$ is viable or controlled forward invariant. This means that for each $x \in \mathbb{X}_0$, there exists $u \in \mathbb{U}$ with $f(x, u) \in \mathbb{X}_0$. Under this assumption, it is quite straightforward to prove that the feasible set $A = X_N$ is also recursively feasible (Grüne and Pannek 2011, Lemma 5.11). Note that viability of $\mathbb{X}_0$ is immediate if $\mathbb{X}_0 = \{x^\ast\}$ and $x^\ast \in \mathbb{X}$ is an equilibrium, i.e., a point for which there exists $u^\ast \in \mathbb{U}$ with $f(x^\ast, u^\ast) = x^\ast$. This setting is referred to as equilibrium terminal constraint.

For MPC without terminal constraints, the most straightforward way to ensure recursive feasibility is to assume that the state constraint set $\mathbb{X}$ is viable (Grüne and Pannek 2011, Theorem 3.5). However, checking viability and, even more, constructing a viable state constraint set is in general a very difficult task. Hence, other methods for establishing recursive feasibility are needed. One method is to assume that the sequence of feasible sets $X_N$, $N \in \mathbb{N}$, becomes stationary for some $N_0$, i.e., that $X_{N+1} = X_N$ holds for all $N \ge N_0$. Under this assumption, recursive feasibility of $X_{N_0}$ follows; see Kerrigan (2000, Theorem 5.3). However, like viability, stationarity is difficult to verify.

For this reason, a conceptually different approach to ensure recursive feasibility was presented in Grüne and Pannek (2011, Theorem 8.20); a similar approach for linear systems can be found in Primbs and Nevistić (2000). The approach is suitable for stabilizing MPC problems in which the stage cost $\ell$ penalizes the distance to a desired equilibrium $x^\ast$ (cf. section "Stability"). Assuming the existence – but not the knowledge – of a viable neighborhood $\mathcal{N}$ of $x^\ast$, one can show that any initial point $x_0$ for which the corresponding open-loop optimal solution satisfies $x_{u^\star}(j) \in \mathcal{N}$ for some $j \le N$ is contained in a recursively feasible set. The fact that $\ell$ penalizes the distance to $x^\ast$ then implies $x_{u^\star}(j) \in \mathcal{N}$ for suitable initial values. Together, these properties yield the existence of recursively feasible sets $A_N$ which become arbitrarily large as $N$ increases.

Stability

Stability in the sense of this entry refers to the fact that a prespecified equilibrium $x^\ast \in \mathbb{X}$ – typically a desired operating point – is asymptotically stable for the MPC closed loop for all initial values in some set $S$. This means that the solutions $x_{\mu_N}(k)$ starting in $S$ converge to $x^\ast$ as $k \to \infty$ and that solutions starting close to $x^\ast$ remain close to $x^\ast$ for all $k \ge 0$. Note that this setting can be extended to time-varying reference solutions; see ▶Tracking Model Predictive Control.

In order to enforce this property, we assume that the stage cost $\ell$ penalizes the distance to the equilibrium $x^\ast$ in the following sense: $\ell$ satisfies
$\ell(x^\ast, u^\ast) = 0 \quad \text{and} \quad \alpha_1(|x|_{x^\ast}) \le \ell(x, u)$  (6)

for all $x \in \mathbb{X}$ and $u \in \mathbb{U}$. Here $\alpha_1$ is a $\mathcal{K}_\infty$ function, i.e., a continuous function $\alpha_1 : [0,\infty) \to [0,\infty)$ which is strictly increasing, unbounded, and satisfies $\alpha_1(0) = 0$. With $|x|_{x^\ast}$, we denote the distance $\|x - x^\ast\|$ measured in the norm on $X$. In this entry, we exclusively discuss stage costs $\ell$ satisfying (6). More general settings using appropriate detectability conditions are discussed in Rawlings and Mayne (2009, Sect. 2.7) or Grimm et al. (2005) in the context of stabilizing MPC. Even more general $\ell$ are allowed in the context of economic MPC; see the ▶Economic Model Predictive Control article.

In case of terminal constraints and terminal costs, a compatibility condition between $\ell$ and $F$ is needed on $\mathbb{X}_0$ in order to ensure stability. More precisely, we demand that for each $x \in \mathbb{X}_0$ there exists a control value $u \in \mathbb{U}$ such that $f(x, u) \in \mathbb{X}_0$ and

$F(f(x, u)) \le F(x) - \ell(x, u)$  (7)

holds. Observe that the condition $f(x, u) \in \mathbb{X}_0$ is again the viability condition which we already imposed for ensuring recursive feasibility. Note that (7) is trivially satisfied for $F \equiv 0$ in case of $\mathbb{X}_0 = \{x^\ast\}$ by choosing $u = u^\ast$.

Stability is now concluded by using the optimal value function

$V_N(x_0) := \inf_{u \ \text{s.t.} \ (4)} J_N(x_0, u)$

as a Lyapunov function. This will yield stability on $S = X_N$, as $X_N$ is exactly the set on which $V_N$ is defined. In order to prove that $V_N$ is a Lyapunov function, we need to check that $V_N$ is bounded from below and above by $\mathcal{K}_\infty$ functions $\alpha_1$ and $\alpha_2$ and that $V_N$ is strictly decaying along the closed-loop solution.

The first amounts to checking

$\alpha_1(|x|_{x^\ast}) \le V_N(x) \le \alpha_2(|x|_{x^\ast})$  (8)

for all $x \in X_N$. The lower bound follows immediately from (6) (with the same $\alpha_1$), and the upper bound can be ensured by conditions on the problem data (see, e.g., Rawlings and Mayne 2009, Sect. 4.5; Grüne and Pannek 2011, Sect. 5.3).

For ensuring that $V_N$ is strictly decreasing along the closed-loop solutions, we need to prove

$V_N(f(x, \mu_N(x))) \le V_N(x) - \ell(x, \mu_N(x)).$  (9)

In order to prove this inequality, one uses on the one hand the dynamic programming principle stating that

$V_{N-1}(f(x, \mu_N(x))) = V_N(x) - \ell(x, \mu_N(x)).$  (10)

On the other hand, one shows that (7) implies

$V_{N-1}(x) \ge V_N(x)$  (11)

for all $x \in X_N$. Inserting (11) with $f(x, \mu_N(x))$ in place of $x$ into (10) then immediately yields (9). Details of this proof can be found, e.g., in Mayne et al. (2000), Rawlings and Mayne (2009), or Grüne and Pannek (2011). The survey Mayne et al. (2000) is probably the first paper which develops the conditions needed for this proof in a systematic way; a continuous-time version of these results can be found in Fontes (2001).

Summarizing, for MPC with terminal constraints and costs, under the conditions (6)–(8), we obtain asymptotic stability of $x^\ast$ on $S = X_N$.

For MPC without terminal constraints and costs, i.e., with $\mathbb{X}_0 = \mathbb{X}$ and $F \equiv 0$, these conditions can never be satisfied, as (7) will immediately imply $\ell(x, u) = 0$ for all $x \in \mathbb{X}$, contradicting (6). Moreover, without terminal constraints and costs, one cannot expect (9) to be true. This is because without terminal constraints, the opposite inequality $V_{N-1}(x) \le V_N(x)$ holds, which together with the dynamic programming principle implies that if (9) holds, then it holds with equality. This, however, would imply that $\mu_N$ is the infinite horizon optimal feedback law, which – though not impossible – is very unlikely to hold.
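The dynamic programming principle (10) can be checked directly on a small finite example. The sketch below (with illustrative toy dynamics and stage cost, not taken from the entry) computes $V_k$ by value iteration with $F \equiv 0$ and verifies (10) for the resulting MPC feedback $\mu_N$.

```python
# Toy finite system (illustrative choice): clipped scalar dynamics
X = range(-2, 3)
U = (-1, 0, 1)
f = lambda x, u: max(-2, min(2, x + u))
ell = lambda x, u: x * x + u * u

def value_iteration(N):
    """V_k(x) for k = 0, ..., N with V_0 = 0 (no terminal cost)."""
    Vs = [{x: 0.0 for x in X}]
    for _ in range(N):
        Vs.append({x: min(ell(x, u) + Vs[-1][f(x, u)] for u in U)
                   for x in X})
    return Vs

N = 6
Vs = value_iteration(N)
mu = {x: min(U, key=lambda u: ell(x, u) + Vs[N - 1][f(x, u)]) for x in X}
# dynamic programming principle (10):
# V_{N-1}(f(x, mu_N(x))) = V_N(x) - ell(x, mu_N(x))
ok = all(abs(Vs[N - 1][f(x, mu[x])] - (Vs[N][x] - ell(x, mu[x]))) < 1e-12
         for x in X)
print(ok)
```

The same data also confirm the monotonicity $V_{N-1}(x) \le V_N(x)$ that holds without terminal constraints and costs.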
Thus, we need to relax (9). In order to do so, instead of (9), we assume the relaxed inequality

$V_N(f(x, \mu_N(x))) \le V_N(x) - \alpha\, \ell(x, \mu_N(x))$  (12)

for some $\alpha > 0$ and all $x \in \mathbb{X}$, which is still enough to conclude asymptotic stability of $x^\ast$ if (6) and (8) hold. The existence of such an $\alpha$ can be concluded from bounds on the optimal value function $V_N$. Assuming the existence of constants $\gamma_K \ge 0$ such that the inequality

$V_K(x) \le \gamma_K \min_{u \in \mathbb{U}} \ell(x, u)$  (13)

holds for all $K = 1, \ldots, N$ and $x \in \mathbb{X}$, there are various ways to compute $\alpha$ from $\gamma_1, \ldots, \gamma_N$; see Grüne (2012, Sect. 3). The best possible estimate for $\alpha$, whose derivation is explained in detail in Grüne and Pannek (2011, Chap. 6), yields

$\alpha = 1 - \dfrac{(\gamma_N - 1) \prod_{i=2}^{N} (\gamma_i - 1)}{\prod_{i=2}^{N} \gamma_i - \prod_{i=2}^{N} (\gamma_i - 1)}.$  (14)

Though not immediately obvious, a closer look at this term reveals $\alpha \to 1$ as $N \to \infty$ if the $\gamma_K$ are bounded. Hence, $\alpha > 0$ for sufficiently large $N$.

Summarizing the second part of this section, for MPC without terminal constraints and costs, under the conditions (6), (8), and (13), asymptotic stability follows on $S = \mathbb{X}$ for all optimization horizons $N$ for which $\alpha > 0$ holds in (14).

Note that the condition (13) implicitly depends on the choice of $\ell$. A judicious choice of $\ell$ can considerably reduce the size of the horizon $N$ for which $\alpha > 0$ holds (see Grüne and Pannek 2011, Sect. 6.6) and thus the computational effort for solving (2)–(4).

Performance

Performance of MPC controllers can be measured in many different ways. As the MPC controller is derived from successive solutions of (2), a natural quantitative way to measure its performance is to evaluate the infinite horizon functional corresponding to (3) along the closed loop, i.e.,

$J_\infty^{cl}(x_0, \mu_N) := \sum_{k=0}^{\infty} \ell\big(x_{\mu_N}(k), \mu_N(x_{\mu_N}(k))\big)$

with $x_{\mu_N}(0) = x_0$. This value can then be compared with the optimal infinite horizon value

$V_\infty(x_0) := \inf_{u:\, u(k) \in \mathbb{U},\, x_u(k) \in \mathbb{X}} J_\infty(x_0, u),$

where

$J_\infty(x_0, u) := \sum_{k=0}^{\infty} \ell(x_u(k), u(k)).$

To this end, for MPC with terminal constraints and costs, by induction over (9) and using nonnegativity of $\ell$, it is fairly easy to conclude the inequality

$J_\infty^{cl}(x_0, \mu_N) \le V_N(x_0)$

for all $x_0 \in X_N$. However, due to the conditions on the terminal cost in (7), $V_N$ may be considerably larger than $V_\infty$, and an estimate relating these two functions is in general not easy to derive (Grüne and Pannek 2011, Examples 5.18 and 5.19). However, it is possible to show that under the same assumptions guaranteeing stability, the convergence

$V_N(x) \to V_\infty(x)$

holds for $N \to \infty$ (Grüne and Pannek 2011, Theorem 5.21). Hence, we recover approximately optimal infinite horizon performance for sufficiently large horizon $N$.

For MPC without terminal constraints and costs, the inequality $V_N(x_0) \le V_\infty(x_0)$ is immediate; however, (9) will typically not hold. As a remedy, we can use (12) in order to derive an estimate. Using induction over (12), we arrive at the estimate

$J_\infty^{cl}(x_0, \mu_N) \le V_N(x_0)/\alpha \le V_\infty(x_0)/\alpha.$
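The estimate (14) is easy to evaluate numerically. The following sketch (with the illustrative constant choice $\gamma_i \equiv 3$, an assumption for demonstration) shows the typical behavior: $\alpha$ is negative for short horizons and tends to 1 as $N$ grows, so both the stability condition $\alpha > 0$ and the performance bound above improve with the horizon.

```python
from math import prod

def alpha(gammas):
    """Estimate (14): alpha = 1 - (gamma_N - 1) * prod_{i=2}^N (gamma_i - 1)
    / (prod_{i=2}^N gamma_i - prod_{i=2}^N (gamma_i - 1)); needs N >= 2."""
    g = gammas                            # g[0] = gamma_1, ..., g[-1] = gamma_N
    num = (g[-1] - 1.0) * prod(gi - 1.0 for gi in g[1:])
    den = prod(g[1:]) - prod(gi - 1.0 for gi in g[1:])
    return 1.0 - num / den

for N in (2, 5, 10, 20):
    print(N, alpha([3.0] * N))
```

For $\gamma_i \equiv 3$, the estimate first becomes positive at $N = 4$, i.e., this is the shortest horizon for which (14) certifies asymptotic stability in this example.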
Since $\alpha \to 1$ as $N \to \infty$, also in this case we obtain approximately optimal infinite horizon performance for sufficiently large horizon $N$.

Summary and Future Directions

MPC is a controller design method which uses the iterative solution of open-loop optimal control problems in order to synthesize a sampled data feedback controller $\mu_N$. The advantages of MPC are its ability to handle constraints, the rigorously provable stability properties of the closed loop, and its approximate optimality properties. Assumptions needed in order to rigorously ensure these properties together with the corresponding mathematical arguments have been outlined in this entry, both for MPC with terminal constraints and costs and without. Among the disadvantages of MPC are the computational effort and the fact that the resulting feedback is a full state feedback, thus necessitating the use of a state estimator to reconstruct the state from output data (▶Moving Horizon Estimation).

Future directions include the application of MPC to more general problems than set point stabilization or tracking, the development of efficient algorithms for large-scale problems including those originating from discretized infinite-dimensional control problems, and the understanding of the opportunities and limitations of MPC in increasingly complex environments; see also ▶Distributed Model Predictive Control.

Cross-References

▶Distributed Model Predictive Control
▶Economic Model Predictive Control
▶Explicit Model Predictive Control
▶Model-Predictive Control in Practice
▶Moving Horizon Estimation
▶Optimization Algorithms for Model Predictive Control
▶Robust Model-Predictive Control
▶Stochastic Model Predictive Control
▶Tracking Model Predictive Control

Recommended Reading

MPC in the form known today was first described in Propoĭ (1963) and is now covered in several monographs, two recent ones being Rawlings and Mayne (2009) and Grüne and Pannek (2011). More information on continuous-time MPC can be found in the survey by Findeisen and Allgöwer (2002). The nowadays standard framework for stability and feasibility of MPC with stabilizing terminal constraints is presented in Mayne et al. (2000); for a continuous-time version, see Fontes (2001). Stability of MPC without terminal constraints was proved in Grimm et al. (2005) under very general conditions; for a comparison of various such results, see Grüne (2012). Feasibility without terminal constraints is discussed in Kerrigan (2000) and Primbs and Nevistić (2000).

Bibliography

Findeisen R, Allgöwer F (2002) An introduction to nonlinear model predictive control. In: 21st Benelux meeting on systems and control, Veldhoven, The Netherlands (see also http://www.tue.nl/en/publication/ep/p/d/ep-uid/252788/), pp 119–141
Fontes FACC (2001) A general framework to design stabilizing nonlinear model predictive controllers. Syst Control Lett 42:127–143
Grimm G, Messina MJ, Tuna SE, Teel AR (2005) Model predictive control: for want of a local control Lyapunov function, all is not lost. IEEE Trans Autom Control 50(5):546–558
Grüne L (2012) NMPC without terminal constraints. In: Proceedings of the IFAC conference on nonlinear model predictive control – NMPC'12, pp 1–13
Grüne L, Pannek J (2011) Nonlinear model predictive control: theory and algorithms. Springer, London
Kerrigan EC (2000) Robust constraint satisfaction: invariant sets and predictive control. PhD thesis, University of Cambridge
Mayne DQ, Rawlings JB, Rao CV, Scokaert POM (2000) Constrained model predictive control: stability and optimality. Automatica 36:789–814
Primbs JA, Nevistić V (2000) Feasibility and stability of constrained finite receding horizon control. Automatica 36(7):965–971
Propoĭ A (1963) Application of linear programming methods for the synthesis of automatic sampled-data systems. Avtomat i Telemeh 24:912–920
Rawlings JB, Mayne DQ (2009) Model predictive control: theory and design. Nob Hill Publishing, Madison
Nonlinear Adaptive Control

A. Astolfi
Department of Electrical and Electronic Engineering, Imperial College London, London, UK
Dipartimento di Ingegneria Civile e Ingegneria Informatica, Università di Roma Tor Vergata, Roma, Italy

Abstract

We consider the control of nonlinear systems in which parameters are uncertain and may vary. For such systems the control must adapt to the parameter change to deliver closed-loop performance, such as asymptotic stability or tracking. A concise description of available methods and basic adaptive stabilization results, which can be used as building blocks for complex adaptive control problems, is provided.

Keywords

Adaptive stabilization; Linear parameterization; Lyapunov function; Nonlinear parameterization; Output feedback

Introduction

The adaptive control problem, namely, the problem of designing a feedback controller which contains an adaptation mechanism to counteract changes in the parameters of the system to be controlled, is of significant importance in applications. In almost all systems, physical parameters are subject to changes. These may be triggered, for example, by changes in temperature (the volume of a liquid/gas), aging (the friction coefficient of a mechanical system), or normal operation (the mass of the fuel of an aircraft changes during flight; the center of mass of a vehicle is affected by its load).

While adaptive control is naturally associated with the notion of estimation, i.e., the parameters of a system have to be identified to design a controller, it may be possible to design adaptive controllers which do not rely on a complete parameter estimation: it is sufficient to estimate the effect of the parameters on the control signal.

Adaptive control is different from robust control. In the simplest possible occurrence, the aim of robust control is to design a control law guaranteeing performance specifications for a given range of parameter values. Robust control thus requires some a priori information on the parameter. Adaptive control does not require any a priori information on the parameter, although any such information can be exploited in the controller design, but requires a parameterized model: a model which contains information on the way the parameters affect the dynamics of the system.

The adaptive control problem for general nonlinear systems can be formulated as follows. Consider a nonlinear system described by equations of the form

$\dot{x} = F(x, u, \theta), \qquad y = H(x, \theta),$  (1)

where $x(t) \in \mathbb{R}^n$ denotes the state of the system, $u(t) \in \mathbb{R}^m$ denotes the input of the system, $\theta \in \mathbb{R}^q$ denotes the constant unknown parameter, $y(t) \in \mathbb{R}^p$ denotes the measured output, and $F : \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^q \to \mathbb{R}^n$ and $H : \mathbb{R}^n \times \mathbb{R}^q \to \mathbb{R}^p$ are smooth mappings. While we focus on continuous-time systems, similar considerations apply to discrete-time systems. In what follows, for simplicity, we mostly assume that $y = x$: the whole state of the system is available for control design.

The adaptive control problem consists in finding, if possible, a dynamic control law described by equations of the form

$\dot{\hat{\theta}} = w(x, \hat{\theta}, r),$  (2)

$u = v(x, \hat{\theta}, r),$  (3)

with $r(t) \in \mathbb{R}^s$ an exogenous (reference) signal and $w : \mathbb{R}^n \times \mathbb{R}^q \times \mathbb{R}^s \to \mathbb{R}^q$ and $v : \mathbb{R}^n \times \mathbb{R}^q \times \mathbb{R}^s \to \mathbb{R}^m$ smooth mappings, such
Nonlinear Adaptive Control 867

that the closed-loop system, described by the using recursive least-square methods. This ap-
equations proach has its roots in identification theory and
has been studied in-depth for linear systems. The
latter relies on the observation that the design of
P
xP D F .x; v.x; O ; r/; /; O D w.x; O ; r/; an update law is equivalent to the design of a
(4) (reduced-order) observer for the extended system
has specific properties. For example, one could
xP D F .x; u; /; P D 0;
require that all trajectories be bounded and the
x-component of the state converge to a given
with output y D x. This approach has its roots in
value x ? (this is the so-called adaptive regulation
the theory of nonlinear observer design.
requirement) or that the input-output behavior
The approaches described so far relies on a
of the system from the input r to some user-
sort of separation principle: the update law and
defined output signal coincide with a given refer-
the feedback law are designed separately. While
ence model (this is the so-called model reference
this approach may be adequate for linear systems,
adaptive control requirement).
for nonlinear systems it is often necessary to
A natural way to characterize design specifi-
design the update law and the feedback law in
cations for the adaptive control problem and to
one step, i.e., the selection of the feedback law
facilitate its solution is to assume the existence of
depends upon the selection of the update law
a known parameter controller, described by the
and vice versa. To illustrate this design method,
equation
and provide some explicit adaptive control design
u D v ? .x; ; r/; (5)
tools, we focus on a special class of nonlinear sys-
such that the nonadaptive closed-loop system tems: systems which are linearly parameterized
xP D F .x; v ? .x; ; r/; / satisfies given design in the unknown parameter.
specifications. In this perspective, the adaptive
control problem boils down to the design of N
the update law (2) and of the feedback law (3) Linearly Parameterized Systems
such that the behavior of the adaptive closed-loop
system matches that of the nonadaptive closed- Consider the system (1) and assume the mapping
loop system xP D F .x; v ? .x; ; r/; /. F is affine in the parameter  and in the control
The above description suggests a design u, namely,
method for the feedback law: one could replace
 with O in Eq. (5). This design is often known F .x; u; / D f0 .x/ C g.x/u C f1 .x/; (6)
as certainty equivalence design and lends itself
to the interpretation that O be an estimate for . with f0 W Rn ! Rn , g W Rn ! Rn Rm
Naturally, one could also modify the feedback and f1 W Rn ! Rn Rq smooth mappings.
law, replacing  with O and adding x-dependant For this class of systems, under additional as-
terms: this is often called a redesign. Redesign sumptions, it is possible to provide systematic
may be guided by various considerations, for adaptive control design tools. We provide two
example, it may be based on the use of a specific formal results: additional results (depending on
Lyapunov function (yielding the so-called the specific assumptions imposed on the system)
Lyapunov redesign), or by structural properties may be derived. In both cases the focus is on
of the system, or by robustness constraints. the adaptive stabilization problem: the goal of
The interpretation of O as an estimate for  the adaptive controller is to render a given equi-
leads to two similar approaches for the design librium stable, in the sense of Lyapunov, and to
of the update law. The former, pursued in the guarantee convergence of the x-component of the
so-called indirect adaptive control, relies on the state (recall that the state of the adaptive closed-
design of a parameter estimator, for example, O
loop system is the vector .x; /).
868 Nonlinear Adaptive Control

Theorem 1 Consider the system (6) and a point x*. Assume there exist a known parameter controller

    u = v₀(x) + v₁(x)θ,

with v₀ : R^n → R^m and v₁ : R^n → R^(m×q) smooth mappings, and a positive definite and radially unbounded function V : R^n → R, such that V(x*) = 0 and

    (∂V/∂x) f*(x, θ) < 0

for all x ≠ x*. Then the update law

    dθ̂/dt = −((∂V/∂x) g(x) v₁(x))ᵀ

and the feedback law

    u = v₀(x) + v₁(x)θ̂

are such that all trajectories of the closed-loop system are bounded and lim_(t→∞) x(t) = x*.

Theorem 2 Consider the system (6) and a point x*. Assume there exists a known parameter controller u = v(x, θ) such that the closed-loop system

    ẋ = f*(x, θ),

where f*(x, θ) = f₀(x) + f₁(x)θ + g(x)v(x, θ), has a globally asymptotically stable equilibrium at x*. Assume, in addition, that there exists a mapping β : R^n → R^q such that all trajectories of the system

    ż = −(∂β/∂x) f₁(x) z,
    ẋ = f*(x, θ) + g(x)[v(x, θ + z) − v(x, θ)]        (7)

are bounded and satisfy

    lim_(t→∞) g(x(t))[v(x(t), θ + z(t)) − v(x(t), θ)] = 0.

Then the update law

    dθ̂/dt = −(∂β/∂x)[f₀(x) + f₁(x)(θ̂ + β(x)) + g(x) v(x, θ̂ + β(x))]        (8)

and the feedback law

    u = v(x, θ̂ + β(x))

are such that all trajectories of the closed-loop system are bounded and lim_(t→∞) x(t) = x*.

The stability properties of the adaptive closed-loop system in Theorem 1 can be studied with the Lyapunov function W(x, θ̂) = V(x) + ½‖θ̂ − θ‖², whereas a Lyapunov analysis for the adaptive closed-loop system of Theorem 2 can be carried out, under additional assumptions, via a Lyapunov function of the form W(x, θ̂) = V(x) + ½‖θ̂ − θ + β(x)‖². This suggests that in Theorem 1 θ̂ plays the role of the estimate of θ, whereas in Theorem 2 such a role is played by θ̂ + β(x). Note that in neither of the theorems is the parameter estimate required to converge to the true value of the parameters, although in Theorem 2 the feedback law is required to converge, along trajectories, to the known parameter controller. This has a very important, possibly counterintuitive, consequence: the asymptotic nonadaptive controller u = v(x, θ̂_∞), where θ̂_∞ = lim_(t→∞) θ̂(t), provided the limit exists, is not in general a stabilizing controller for system (6).

Example 1 Consider the nonlinear system described by the equation ẋ = u + θx², with x(t) ∈ R, u(t) ∈ R, and θ ∈ R. A known parameter controller satisfying the assumptions of Theorem 1 (with v₀(x) = −x, v₁(x) = −x², and V(x) = x²/2) and of Theorem 2 (with β(x) = x) is u = −x − θx². The resulting update laws and feedback laws are

    dθ̂₁/dt = x³,    u₁ = −x − θ̂₁x²,

and

    dθ̂₂/dt = x,    u₂ = −x − (θ̂₂ + x)x²,

respectively; the subscripts "1" and "2" refer to the constructions in Theorems 1 and 2, respectively.

The basic building blocks in Theorems 1 and 2 can be exploited repeatedly to design adaptive controllers for systems with a specific structure, for example, for systems described by the equations

    ẋ₁ = x₂ + φ₁(x₁)ᵀθ,
    ẋ₂ = x₃ + φ₂(x₁, x₂)ᵀθ,
    ⋮
    ẋᵢ = xᵢ₊₁ + φᵢ(x₁, …, xᵢ)ᵀθ,        (9)
    ⋮
    ẋₙ = u + φₙ(x₁, …, xₙ)ᵀθ,

with xᵢ(t) ∈ R, for i = 1, …, n, u(t) ∈ R, φᵢ : R^i → R^q, for i = 1, …, n, smooth mappings, and θ ∈ R^q. Note that the last of the equations (9) can be replaced by

    ẋₙ = b̄u + φₙ(x₁, …, xₙ)ᵀθ,

with b̄ ∈ R, provided its sign is known (this condition may be removed using the so-called Nussbaum gain). The parameter b̄ is often referred to as the high-frequency gain of the system: a terminology borrowed from linear systems theory.

Output Feedback Adaptive Control

A key feature of the parameterized systems described so far is that these are linearly parameterized in θ. The linear parameterization makes it possible to develop systematic design tools, such as those given in Theorems 1 and 2. Such results, however, require full information on the state of the system. If only partial information on the state is available, one has to combine an estimator of the state with an update law. Such a combination requires either strong assumptions on the system or very specific structures. For example, it is feasible if the system is not only linearly parameterized in the parameter θ, but it is also linearly parameterized in the unmeasured states, namely, it is described by equations of the form

    ẋ₁ = x₂ + ψ₁(x₁) + φ₁(x₁)ᵀθ,
    ẋ₂ = x₃ + ψ₂(x₁) + φ₂(x₁)ᵀθ,
    ⋮
    ẋᵢ = xᵢ₊₁ + ψᵢ(x₁) + φᵢ(x₁)ᵀθ + bᵢu,
    ⋮
    ẋₙ₋₁ = xₙ + ψₙ₋₁(x₁) + φₙ₋₁(x₁)ᵀθ + bₙ₋₁u,
    ẋₙ = ψₙ(x₁) + φₙ(x₁)ᵀθ + bₙu,
    y = x₁,

with xᵢ(t) ∈ R, for i = 1, …, n, u(t) ∈ R, y(t) ∈ R, φᵢ : R → R^q, and ψᵢ : R → R, for i = 1, …, n, smooth mappings, θ ∈ R^q, and b = [bᵢ, …, bₙ₋₁, bₙ]ᵀ unknown, but such that the sign of bₙ is known and the polynomial

    bₙ s^(n−i) + bₙ₋₁ s^(n−i−1) + ⋯ + bᵢ

has all roots with negative real part (this implies that the system, with input u and output y, is minimum phase).

Nonlinear Parameterized Systems

Adaptive control of nonlinearly parameterized systems is an open area of research. The design of adaptive controllers often relies upon structural assumptions, for example, the existence of a monotonic parameterization, as in the system described by the equation

    ẋ = F(x, u) + Φ(x, θ),

with x(t) ∈ R^n, u(t) ∈ R^m, θ ∈ R^q, and F : R^n × R^m → R^n and Φ : R^n × R^q → R^n smooth mappings and such that, for all x, the mapping Φ satisfies the monotonicity condition

    (θ_a − θ_b)ᵀ(Φ(x, θ_a) − Φ(x, θ_b)) > 0,

for all θ_a ≠ θ_b. Alternatively, the design may exploit the so-called over-parameterization; for example, the equation of the system

    ẋ = u + ψ₁(x) sin θ + ψ₂(x) cos θ,
with x(t) ∈ R, u(t) ∈ R, and θ ∈ R, may be rewritten in over-parameterized form as

    ẋ = u + ψ₁(x)θ₁ + ψ₂(x)θ₂,

with θᵢ ∈ R, for i = 1, 2. Note that the over-parameterized form overlooks the important information that θ₁² + θ₂² = 1.

Summary and Future Directions

The problem of adaptive stabilization for nonlinear systems has been discussed. Two conceptual building blocks for the design of stabilizing adaptive controllers have been presented, and classes of systems for which these blocks allow one to explicitly design adaptive controllers have been given. The role of parameter convergence, or lack thereof, has been briefly discussed, together with connections between adaptive and observer designs. The difficulties associated with non-full state measurement and with nonlinear parameterization have also been briefly highlighted. Several problems have not been discussed, for example, model reference adaptive control, robust adaptive control, universal adaptive controllers, and the use of projections to incorporate prior knowledge on the parameter. Details on these can be found in the bibliography below.

Cross-References

▶ Adaptive Control, Overview
▶ History of Adaptive Control
▶ Stochastic Adaptive Control
▶ Switching Adaptive Control

Bibliography

Astolfi A, Karagiannis D, Ortega R (2008) Nonlinear and adaptive control with applications. Springer, London
Hovakimyan N, Cao C (2010) L1 adaptive control theory. SIAM, Philadelphia
Ilchmann A (1997) Universal adaptive stabilization of nonlinear systems. Dyn Control 7(3):199–213
Ilchmann A, Ryan EP (1994) Universal λ-tracking for nonlinearly-perturbed systems in the presence of noise. Automatica 30(2):337–346
Jiang Z-P, Praly L (1998) Design of robust adaptive controllers for nonlinear systems with dynamic uncertainties. Automatica 34(7):825–840
Kanellakopoulos I, Kokotović PV, Morse AS (1991) Systematic design of adaptive controllers for feedback linearizable systems. IEEE Trans Autom Control 36(11):1241–1253
Krstić M, Kanellakopoulos I, Kokotović P (1995) Nonlinear and adaptive control design. Wiley, New York
Marino R, Tomei P (1995) Nonlinear control design: geometric, adaptive and robust. Prentice-Hall, London
Nussbaum RD (1982) Some remarks on a conjecture in parameter adaptive control. Syst Control Lett 3(5):243–246
Pomet J-B, Praly L (1992) Adaptive nonlinear regulation: estimation from the Lyapunov equation. IEEE Trans Autom Control 37(6):729–740
Sastry SS, Isidori A (1989) Adaptive control of linearizable systems. IEEE Trans Autom Control 34(11):1123–1131
Spooner JT, Maggiore M, Ordonez R, Passino KM (2002) Stable adaptive control and estimation for nonlinear systems. Wiley, New York
Townley S (1999) An example of a globally stabilizing adaptive controller with a generically destabilizing parameter estimate. IEEE Trans Autom Control 44(11):2238–2241


Nonlinear Filters

Frederick E. Daum
Raytheon Company, Woburn, MA, USA

Abstract

Nonlinear filters estimate the state of dynamical systems given noisy measurements related to the state vector. In theory, such filters can provide optimal estimation accuracy for nonlinear measurements with nonlinear dynamics and non-Gaussian noise. However, in practice, the actual performance of nonlinear filters is limited by the curse of dimensionality. There are many different types of nonlinear filters, including the extended Kalman filter, the unscented Kalman filter, and particle filters.
Keywords

Bayesian; Computational complexity; Curse of dimensionality; Estimation; Extended Kalman filter; Non-Gaussian; Particle filter; Prediction; Smoothing; Stability; Unscented Kalman filter

Description of Nonlinear Filters

Nonlinear filters are algorithms that estimate the state vector (x) of a nonlinear dynamical system given measurements of nonlinear functions of the state vector corrupted by noise. Such filters also quantify the uncertainty in the resulting estimate of the state vector (e.g., using the error covariance matrix). Some nonlinear filters compute the entire probability density of the state vector conditioned on the set of measurements available, rather than computing a point estimate of the state vector (e.g., conditional mean or maximum likelihood). For some applications the conditional probability density of x is highly non-Gaussian (e.g., strongly multimodal). Even if the measurement noise and the process noise and the initial uncertainty in x are all Gaussian, the conditional density of x can be non-Gaussian, owing to the nonlinearities in the dynamics or measurements. The dynamical systems can evolve in continuous time or discrete time, and the measurements can be made in continuous time or at discrete times. The most popular nonlinear filter in practical applications is the extended Kalman filter (EKF), but there are many other families of nonlinear filters, including particle filters, unscented Kalman filters (UKFs), batch least squares, exact finite-dimensional filters, Gaussian sum filters, cubature Kalman filters, etc. Table 1 summarizes the most popular nonlinear filters. The theory for nonlinear filters is relatively simple (see Ho and Lee 1964), but the crucial practical issue is computational complexity, even today with fast modern inexpensive computers, e.g., graphical processing units (GPUs). See Ristic et al. (2004) for a book which is both accessible to engineers and thorough.

Nonlinear Filters, Table 1 Summary of nonlinear filters

1. Extended Kalman filter (EKF). Conditional probability density: Gaussian. Computational complexity: d³. Comments: gives good accuracy for many practical applications but can be highly suboptimal in difficult problems. Reference: Gelb et al. (1974).
2. Unscented Kalman filter (UKF). Conditional probability density: Gaussian. Computational complexity: d³. Comments: often the UKF beats the EKF, but sometimes the EKF is better than the UKF; see Noushin (2008) for details. Reference: Julier and Uhlmann (2004).
3. Batch least squares. Conditional probability density: Gaussian. Computational complexity: d³. Comments: often beats the EKF accuracy but can fail for multimodal or other strongly non-Gaussian densities. Reference: Sorenson (1980).
4. Particle filter. Conditional probability density: arbitrary. Computational complexity: varies from d³ to exponential in d, depending on many features of the problem. Comments: often beats the EKF accuracy but can fail due to the curse of dimensionality, particle degeneracy, and ill-conditioning. Reference: Doucet (2011).
5. Cubature Kalman filter. Conditional probability density: Gaussian. Computational complexity: d³. Comments: sometimes beats the EKF and UKF for difficult nonlinear non-Gaussian problems, but not always. Reference: Haykin (2010).
6. Gaussian sum filter. Conditional probability density: arbitrary. Computational complexity: varies from d³ to exponential in d, depending on many features of the problem. Comments: beats the EKF for certain difficult nonlinear non-Gaussian problems. Reference: Sorenson (1988).
7. Exact finite-dimensional filters. Conditional probability density: exponential family. Computational complexity: d³. Comments: beats the EKF for certain difficult nonlinear non-Gaussian problems. Reference: Daum (2005).
8. Implicit particle filters. Conditional probability density: arbitrary. Computational complexity: suffers from the curse of dimensionality (i.e., computation time grows exponentially in d). Comments: only low-dimensional numerical examples have been published so far. Reference: Chorin (2009).
9. Particle flow filter. Conditional probability density: arbitrary. Computational complexity: faster than standard particle filters by many orders of magnitude for high-dimensional problems (but unfortunately there is no explicit formula for computation time). Comments: beats the EKF by orders of magnitude for certain difficult nonlinear non-Gaussian problems. Reference: Daum (2013).
10. Numerical solution of the Fokker-Planck equation. Conditional probability density: arbitrary. Computational complexity: suffers from the curse of dimensionality (i.e., computation time grows exponentially in d). Comments: beats the EKF by orders of magnitude for certain difficult nonlinear non-Gaussian problems. Reference: Ristic (2004).

Bayesian Formulation of Filtering Problem

The Bayesian approach to nonlinear filters is by far the most popular formulation of the problem (see Ho and Lee 1964), and it has virtually eliminated all other competing theories, because it is simple, general, systematic, and useful. All ten nonlinear filters listed in Table 1 are Bayesian. The Bayesian approach uses a model of the dynamics of x as well as a model of the measurements. For example, discrete-time dynamics and measurement models are typically of the form

    x(tₖ₊₁) = f(x(tₖ), tₖ) + w(tₖ)
    z(tₖ) = h(x(tₖ), tₖ) + v(tₖ)

in which x(tₖ) is the d-dimensional state vector at time tₖ, z(tₖ) is the m-dimensional measurement vector at time tₖ, v is the measurement noise, and w is the so-called process noise. Both v and w are often modeled as Gaussian zero-mean random processes with statistically independent values at distinct discrete times, but these models could be highly non-Gaussian with statistically correlated random values. The initial probability density of x before any measurements are available is also used in the Bayesian formulations. Real physical systems are most commonly modeled as evolving in continuous time using Itô stochastic differential equations:

    dx = f(x(t), t) dt + dw

However, most engineers would rather think of the above Itô equation as an ordinary differential equation driven by Gaussian white noise:

    dx/dt = f(x(t), t) + dw/dt

Mathematicians prefer the Itô equation to avoid the embarrassment that the time derivative of w(t) does not exist. For details of stochastic calculus, see Jazwinski (1998). Such mathematical subtleties rarely cause any trouble in practical engineering applications. We emphasize, however, that it is important to correctly model continuous-time random processes for the evolution of the state vector (x) in many practical applications. Similarly, one can model measurements in continuous time using Itô calculus:

    dz = h(x(t), t) dt + dv

Most engineers consider continuous-time measurement models as impractical and unnecessarily complicated mathematically, because digital computers always require discrete-time measurements and there are no practical analog computers that can be used for nonlinear filtering, owing to the overwhelming superiority of digital computers in terms of accuracy, stability, dynamic range, and flexibility. Nevertheless, there are many papers published by researchers using continuous-time measurement models. But the vast majority of practical papers on nonlinear filters use discrete-time measurement models for obvious reasons. This contrasts sharply with the practical importance of correctly modeling continuous-time random processes for the evolution of the state vector (x).

Nonlinear Filter Algorithms

There is no universally best nonlinear filter for all applications, and there is much debate about which is the best nonlinear filter for any given application. Even if we knew the best nonlinear filter for a given computer, the answer could be very different for a different computer; in particular, some filters can exploit massively parallel processing architectures, whereas others cannot. Research and development of nonlinear filters should continue rapidly for the foreseeable future. More generally, there is no universal theory of computational complexity for practical algorithms of this type; perhaps the closest approximation to such a theory is "information-based complexity" (IBC); e.g., see Traub and Werschulz (1998) and Dick et al. (2013). The estimation accuracy of x and the computational complexity of the nonlinear filter are intimately connected, as shown below for particle filters.

There is no useful way to quantify the computational complexity of nonlinear filters without also quantifying estimation accuracy of x. This contrasts with standard computational complexity theory (e.g., P vs. NP) because we are interested in approximations rather than exact solutions. This is the basic idea of IBC. In practice, engineers compare the estimation accuracy and computational complexity of different nonlinear filters using Monte Carlo simulations for specific applications and specific computers.

The most active area of current research in nonlinear filters is focused on particle filters, which have the promise of optimal accuracy for essentially any nonlinear filter problem, at the cost of very high computational complexity for high-dimensional problems. In the early days (1994–2004), researchers often asserted that particle filters "beat the curse of dimensionality," but it is well known today that this assertion is wrong (e.g., see Daum 2005). Unfortunately, there is no useful theory of computational complexity for particle filters, but rather the currently available theory gives asymptotic bounds on accuracy with generic "constants." Such bounds on the variance of estimation error are generally of the form c/N, in which N is the number of particles and c is the generic so-called constant. But we know that the so-called constant actually varies by many orders of magnitude depending on the specifics of the problem, including the following: (1) dimension of the state vector being estimated, (2) uncertainty in the initial state vector, (3) measurement accuracy, (4) stability of the dynamical system that describes the time evolution of the state vector, (5) geometry of the conditional probability densities (e.g., unimodal, log-concave, multimodal, etc.), (6) Lipschitz constants of the log probability densities, (7) curvature of the nonlinear dynamics and measurements, (8) ill-conditioning of the Fisher information matrix for the estimation problem, etc. Moreover, there are no tight bounds on the so-called constant c for practical nonlinear filter problems, but rather the best bounds for simple MCMC problems are known to be 30 orders of magnitude too large; see Dick et al. (2013).
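To make the c/N discussion concrete, here is a minimal bootstrap (sequential importance resampling) particle filter for a scalar toy problem; the dynamics, noise levels, particle count, and all numerical values are illustrative assumptions, not from the entry.

```python
import math
import random

def bootstrap_pf(zs, n_particles, f, h, q_std, r_std,
                 x0_mean=0.0, x0_std=1.0, seed=0):
    """Bootstrap particle filter: propagate particles through the dynamics,
    weight them by the Gaussian measurement likelihood p(z_k | x), and
    resample; returns the weighted-mean state estimate at each step."""
    rng = random.Random(seed)
    xs = [rng.gauss(x0_mean, x0_std) for _ in range(n_particles)]
    estimates = []
    for z in zs:
        # Predict step: sample from the dynamics plus process noise.
        xs = [f(x) + rng.gauss(0.0, q_std) for x in xs]
        # Update step: unnormalized weights from the likelihood.
        ws = [math.exp(-0.5 * ((z - h(x)) / r_std) ** 2) for x in xs]
        total = sum(ws)
        if total == 0.0:                # guard against total weight collapse
            ws, total = [1.0] * n_particles, float(n_particles)
        estimates.append(sum(w * x for w, x in zip(ws, xs)) / total)
        # Resample (multinomial) to obtain an unweighted particle set.
        xs = rng.choices(xs, weights=ws, k=n_particles)
    return estimates

# Illustrative scalar model: mildly nonlinear dynamics, direct noisy
# observation of the state.
f = lambda x: 0.9 * x + 0.2 * math.sin(x)
h = lambda x: x
rng = random.Random(1)
truth, zs, x = [], [], 1.0
for _ in range(50):
    x = f(x) + rng.gauss(0.0, 0.3)
    truth.append(x)
    zs.append(h(x) + rng.gauss(0.0, 0.5))

est = bootstrap_pf(zs, n_particles=500, f=f, h=h, q_std=0.3, r_std=0.5)
rmse = math.sqrt(sum((e - t) ** 2 for e, t in zip(est, truth)) / len(truth))
```

In this setup the per-step cost grows linearly in the number of particles, while the Monte Carlo estimation error decreases roughly like the c/N variance bound discussed above; how large N must be for a given accuracy depends strongly on the problem-specific "constant."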
Discrete-Time Measurement Models

Research papers on nonlinear filters are often mathematically abstract, but advanced math is not required for practical engineering applications (e.g., see Ho and Lee 1964). In particular, one can avoid the advanced stochastic mathematics used for continuous-time measurements by using discrete-time measurements, which is the practical case of interest anyway, owing to the use of digital computers to implement such algorithms. The notion that continuous-time measurements result in simpler, better, or more elegant results for nonlinear filters is misleading; for example, we have the elegant innovation theory for continuous-time measurements (Kailath 1970), but this theory is not applicable for discrete-time measurements, and likewise with the elegant formula for propagating the conditional mean for continuous-time measurements (the so-called Fujisaki-Kallianpur-Kunita formula). More generally, the simple discrete-time version of Bayes' rule suffices for practical real-world engineering applications; there is rarely a need to employ the more complex continuous-time version. The discrete-time formula for Bayes' rule is simply

    p(x(tₖ), tₖ | Zₖ) = p(x(tₖ), tₖ | Zₖ₋₁) p(zₖ | x(tₖ)) / p(zₖ | Zₖ₋₁)

in which

p(x(tₖ), tₖ | Zₖ) = probability density of x at time tₖ conditioned on Zₖ; this is also called the "a posteriori probability density"
x(t) = state vector of the dynamical system at time t
Zₖ = set of all measurements up to and including time tₖ
zₖ = measurement vector at time tₖ
p(zₖ | x(tₖ)) = probability density of zₖ conditioned on x(tₖ); this is also called the "likelihood"
p(A | B) = probability density of A conditioned on B

This is all one needs to know about Bayes' rule for practical engineering applications of nonlinear filtering; see Ho and Lee (1964). Bayes' rule is a simple formula that multiplies two probability densities and normalizes the result by dividing by p(zₖ | Zₖ₋₁). In most applications, there is no need to normalize the density, and hence, Bayes' rule for the unnormalized conditional density is even simpler:

    p(x(tₖ), tₖ | Zₖ) = p(x(tₖ), tₖ | Zₖ₋₁) p(zₖ | x(tₖ))

We see that Bayes' rule for the unnormalized conditional density is simply a multiplication of two densities (i.e., the likelihood and the prior).

Summary and Future Directions

In practical applications, the most popular nonlinear filter is the extended Kalman filter (EKF), followed by the unscented Kalman filter (UKF). These two filters give good accuracy and robust performance for many practical applications. The computational complexity of both the EKF and UKF grows as the cube of the dimension of the state vector, and hence, they are very practical to run in real time on laptops or PCs for many real-world applications. But there are also many difficult nonlinear or non-Gaussian problems for which the EKF and UKF give suboptimal accuracy, and in some cases, they give surprisingly bad accuracy. The accuracy of optimal nonlinear filters is limited by the curse of dimensionality. We know how to write the equations for the optimal nonlinear filter, but the solution generally takes an exponentially increasing time to compute as the dimension of the state vector grows. There are many different kinds of nonlinear filters, and this is still an active field of research, as shown in Crisan and Rozovskii (2011). Future research is likely to exploit advances in computational complexity theory for approximation of functions in the style of information-based complexity (IBC) rather than P vs. NP theory. This is because we want good fast approximations rather than exact algorithms. A lucid introduction to IBC is Traub and Werschulz (1998), and recent work is surveyed in Dick et al. (2013). Another fruitful direction of research is to exploit the recent advances in transport theory,
as explained in Daum (2013); the best introduction to transport theory is the book by Villani (2003), which is very accessible yet thorough. Research in exact finite-dimensional filters is difficult but could yield substantial improvements in accuracy and computational complexity; for example, see Benes (1981), Marcus (1984), and Daum (2005). Progress in nonlinear filter research could be inspired by many diverse fields, including fluid dynamics, quantum chemistry, quantum field theory, gauge theory, string theory, Lie superalgebras, Lie supergroups, and neuroscience. An important open research topic is the stability of nonlinear filters, which is obviously a fundamental limitation to good theoretical upper bounds on estimation error. We still do not have a practical theory of stability for nonlinear filters. Perhaps the closest approximation to such a theory is the lucid paper by van Handel (2010), which makes an interesting attempt at understanding the stability of nonlinear filters. In particular, van Handel's paper aims to generalize Kalman's theory of stability for the Kalman filter by connecting stability with the essence of controllability and observability. A good survey of what is known about stability theory for nonlinear filters is given in various articles in Crisan and Rozovskii (2011).

Cross-References

▶ Estimation, Survey on
▶ Extended Kalman Filters
▶ Kalman Filters
▶ Particle Filters

Bibliography

Arasaratnam I, Haykin S, Hurd TR (2010) Cubature Kalman filtering for continuous-discrete systems. IEEE Trans Signal Process 58:4977–4993
Benes V (1981) Exact finite-dimensional filters for certain diffusions with nonlinear drift. Stochastics 5:65–92
Chorin A, Tu X (2009) Implicit sampling for particle filters. Proc Natl Acad Sci 106:17249–17254
Crisan D, Rozovskii B (eds) (2011) Oxford handbook of nonlinear filtering. Oxford University Press, Oxford
Daum F (2005) Nonlinear filters: beyond the Kalman filter. IEEE AES Magazine 20:57–69
Daum F, Huang J (2013a) Particle flow with non-zero diffusion for nonlinear filters. In: Proceedings of SPIE conference, San Diego
Daum F, Huang J (2013b) Particle flow and Monge-Kantorovich transport. In: Proceedings of IEEE FUSION conference, Singapore
Dick J, Kuo F, Peters G, Sloan I (eds) (2013) Monte Carlo and quasi-Monte Carlo methods 2012. Proceedings of conference, Sydney. Springer, Heidelberg
Doucet A, Johansen AM (2011) A tutorial on particle filtering and smoothing: fifteen years later. In: Crisan D, Rozovskii B (eds) The Oxford handbook of nonlinear filtering. Oxford University Press, Oxford, pp 656–704
Gelb A et al (1974) Applied optimal estimation. MIT, Cambridge
Ho Y-C, Lee RCK (1964) A Bayesian approach to problems in stochastic estimation and control. IEEE Trans Autom Control 9:333–339
Jazwinski A (1998) Stochastic processes and filtering theory. Dover, Mineola
Julier S, Uhlmann J (2004) Unscented filtering and nonlinear estimation. Proc IEEE 92:401–422
Kailath T (1970) The innovations approach to detection and estimation theory. Proc IEEE 58:680–695
Kushner HJ (1964) On the differential equations satisfied by conditional probability densities of Markov processes. SIAM J Control 2:106–119
Marcus SI (1984) Algebraic and geometric methods in nonlinear filtering. SIAM J Control Optim 22:817–844
Noushin A, Daum F (2008) Some interesting observations regarding the initialization of unscented and extended Kalman filters. In: Proceedings of SPIE conference, Orlando
Ristic B, Arulampalam S, Gordon N (2004) Beyond the Kalman filter. Artech House, Boston
Sorenson HW (1974) On the development of practical nonlinear filters. Inf Sci 7:253–270
Sorenson HW (1980) Parameter estimation. Marcel-Dekker, New York
Sorenson HW (1988) Recursive estimation for nonlinear dynamic systems. In: Spall J (ed) Bayesian analysis of time series and dynamic models. Marcel-Dekker, New York, pp 127–165
Stratonovich RL (1960) Conditional Markov processes. Theory Probab Appl 5:156–178
Traub J, Werschulz A (1998) Complexity and information. Cambridge University Press, Cambridge
van Handel R (2010) Nonlinear filters and system theory. In: Proceedings of 19th international symposium on mathematical theory of networks and systems, Budapest
Villani C (2003) Topics in optimal transportation. American Mathematical Society, Providence
Zakai M (1969) On the optimal filtering of diffusion processes. Z für Wahrscheinlichkeitstheorie und verw Geb 11:230–243
Nonlinear Sampled-Data Systems

Dragan Nešić¹ and Romain Postoyan²,³
¹ Department of Electrical and Electronic Engineering, The University of Melbourne, Melbourne, VIC, Australia
² Université de Lorraine, CRAN, France
³ CNRS, CRAN, France

Abstract

Sampled-data systems are control systems in which the feedback law is digitally implemented via a computer. They are prevalent nowadays due to the numerous advantages they offer compared to analog control. Nonlinear sampled-data systems arise in this context when either the plant model or the controller is nonlinear. While their linear counterpart is now a mature area, nonlinear sampled-data systems are much harder to deal with and, hence, much less understood. Their inherent complexity leads to a variety of methods for their modeling, analysis, and design. A summary of these methods is presented in this entry.

Keywords

Discrete time; Nonlinear; Sampled data; Sampler; Zero-order hold

Introduction

Definition: A control system in which a continuous-time plant is controlled by a digital computer is referred to as a sampled-data control system or simply a sampled-data system (Chen and Francis 1994); see Fig. 1. Nonlinear sampled-data systems arise when either the model of the plant or the controller is nonlinear; otherwise the system is referred to as a linear sampled-data system.

Motivation: Sampled-data control is preferable to continuous-time (analog) control for a range of reasons including reduced cost, reduced wiring, more robust hardware, easier and more flexible programming, and so on. Nowadays, a large majority of controllers are implemented on digital computers, and, hence, sampled-data systems are prevalent in practice. On the other hand, nonlinear plant models are necessary in numerous applications when a wide range of operating conditions need to be considered or when truly nonlinear phenomena, such as friction or state/input constraints, are not negligible. Hence, there are many situations where nonlinear plant models are essential, such as vertical takeoff and landing of an aircraft, robots, automotive engines, and biochemical reactors, to name a few. It has to be noted that the nonlinearity may also come from the controller even when we consider linear plants, as is the case in adaptive control or model predictive control with constraints, for example.

Structure of sampled-data systems: Figure 1 presents a typical structure of a sampled-data system which consists of a continuous-time plant, an analog-to-digital (A/D) converter (i.e., a sampler), a digital-to-analog (D/A) converter (i.e., a hold device), and a discrete-time controller.

The A/D converter takes measurements y(tₖ) of a continuous-time output signal y(t), such as temperature or pressure, at sampling time instants tₖ, k = 0, 1, …, and sends them to the control algorithm. The measurements are obtained with finite precision (i.e., they are quantized); this effect is not considered in this entry. The sampling instants tₖ are often equidistant, that is, tₖ = kT, k = 0, 1, …, where the distance T between any two consecutive sampling instants is referred to as the sampling period. The sampling period is an important degree of freedom in the design of sampled-data systems and it needs to be carefully selected.

The control algorithm is discrete in nature. It takes the sequence of measurements y(tₖ) and processes them to produce a sequence of control values u(tₖ). The D/A converter converts the sequence of control values u(tₖ) into a continuous-time signal u(t) that drives the actuators which control the plant. Typically, a zero-order hold is used, i.e., u(t) = u(tₖ) for all t ∈ [tₖ, tₖ₊₁). However, it is possible to use other types of holds.
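The sampling/hold loop just described can be sketched numerically as follows; the plant, the control law (an emulation of the continuous-time design u = −x² − x for the illustrative plant ẋ = x² + u), and all numerical values are assumptions for illustration, not from the entry.

```python
# Sampled-data loop with a zero-order hold: the continuous-time plant
# dx/dt = x^2 + u is integrated on a fine grid, while the controller only
# sees the samples y(t_k) = x(t_k) at t_k = k*T and its output is held
# constant on [t_k, t_{k+1}).  Plant, control law, and numbers are
# illustrative assumptions.

def simulate_zoh(x0=0.5, T=0.05, t_end=5.0, n_sub=50):
    x = x0
    dt = T / n_sub                 # fine grid for the plant's ODE
    for _ in range(int(t_end / T)):
        xk = x                     # A/D: sample the output at t_k
        u = -xk**2 - xk            # discrete control value u(t_k), emulating
                                   # the continuous design u = -x^2 - x
        for _ in range(n_sub):     # D/A with ZOH: u held constant
            x += dt * (x**2 + u)   # continuous-time plant, Euler step
    return x

x_T = simulate_zoh()
```

Shrinking T recovers the behavior of the continuous-time design (here ẋ ≈ −x between samples), while enlarging the sampling period degrades the response and can eventually destabilize the loop, which is why T must be selected carefully.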
Nonlinear Sampled-Data Systems, Fig. 1 Sampled-data (control) system. [Block diagram: a discrete-time controller produces u(tₖ); a D/A converter produces u(t), which drives the continuous-time plant; the plant output y(t) is sampled by an A/D converter to give y(tₖ), which is fed back to the controller.]

Note that the system in Fig. 1 can be generalized in many ways. An important generalization is multi-rate sampling where the output of the system is sampled at one sampling rate while the control inputs are updated at a different sampling rate. Another generalization is networked control systems, which are discussed in the last section.

Modeling

The combination of continuous-time and discrete-time components renders the analysis and the design of sampled-data systems challenging. Still, linear systems allow for computationally efficient analysis and design techniques that benefit from the z and δ transforms, as well as convex optimization (Chen and Francis 1994). Nonlinear sampled-data systems, on the other hand, are much harder to deal with since the aforementioned methods do not apply in this case. This inherent difficulty has led to a variety of models for different analysis and design methods:
1. Continuous-time models
2. Discrete-time models
3. Sampled-data models
We discuss below each of these models, their features, and the analysis or design methods that exploit them.

Continuous-time models basically ignore the sampling process and assume that all signals are continuous time. They are the coarsest approximation of the sampled-data system and they are useful only for very small sampling periods. Nevertheless, they are invaluable and are used as the first step in the controller/observer design in the so-called emulation design approach.

Discrete-time models only capture the behavior of the sampled-data system at sampling instants. Indeed, they ignore the inter-sample behavior of the system and this is their main drawback. There are two ways in which nonlinear discrete-time models arise: (i) from the identification of the plant model using the sampled measurements and (ii) from the discretization of a known continuous-time plant model. For instance, black box identification methods often lead to nonlinear discrete-time models in input-output form, such as NARMA (nonlinear autoregressive moving average) models (Chen et al. 1989; Juditsky et al. 1995). Depending on the approximating functions used, the nonlinearities can be polynomial, neural network type, fuzzy type, and so on. On the other hand, the discretization of the continuous-time plant model requires an exact analytic solution of a set of nonlinear differential equations. When such an analytic solution exists, we can obtain the exact discrete-time models of the system; this is typically assumed for linear plants. Nonlinear sampled-data systems are different from their linear counterparts in that it is typically impossible to obtain the exact discrete-time model and only approximate discrete-time models are available for analysis and design (Nešić et al. 1999; Nešić and Teel 2004).

Sampled-data models capture the true behavior of the sampled-data system including its
inter-sample behavior. There are several ways in which this can be achieved. One way is to model the piecewise constant signals that arise from zero-order hold devices as signals with a time-varying delay; this gives rise to time-delay nonlinear models (Teel et al. 1998). Another recently proposed approach is to model nonlinear sampled-data systems as hybrid dynamical systems (Goebel et al. 2012). An extensive analysis and design toolbox has been developed for hybrid dynamical systems, and these results can be used for nonlinear sampled-data systems. Another class of models, based on the so-called lifting, has been applied for linear systems, where the system is represented as a discrete-time system with infinite-dimensional input and output spaces. While this approach has been very successful in the linear context (Chen and Francis 1994), it appears that it is not as useful for nonlinear systems due to difficulties arising from harder analysis and prohibitive computational requirements.

The Main Issues

Controllability/observability: Issues arising due to sampling in linear systems transfer to the nonlinear context, although they are less understood in this case. For instance, it is well known that sampling may "destroy" the controllability and/or observability properties of the system (Chen and Francis 1994). In other words, if the continuous-time plant model is controllable/observable, then the corresponding exact discrete-time model of the plant may not verify these properties for some sampling periods. A simple test is available for linear systems to avoid this phenomenon, but we are not aware of similar results in the nonlinear context.

Finite escape times: A major difference between continuous-time linear and nonlinear systems is that the former have well-defined solutions for constant control inputs and arbitrarily long sampling periods. This is not the case, in general, for nonlinear systems as they may exhibit finite escape times. In other words, for a constant input it may happen for some initial conditions of a nonlinear system that solutions blow up within a time that is shorter than the sampling period. As a consequence, for such an initial condition and input, the exact discrete-time system cannot be defined. This is a fundamental obstacle to achieving global stability results for nonlinear systems if the sampling period is fixed and independent of the size of the initial state. Nevertheless, it is possible to ensure semi-global stability properties for very general nonlinear systems, which means that any compact domain of convergence can be achieved if the sampling period is sufficiently reduced (Nešić and Teel 2004).

Model structure is changed: An important issue for nonlinear sampled-data systems is that the sampling modifies the structure of the model. When the continuous-time plant model has a certain structure, such as triangular or affine in the input, the corresponding exact discrete-time model will not inherit it; see Monaco and Normand-Cyrot (2007) and Yuz and Goodwin (2005). This significantly complicates the design and analysis of sampled-data systems via the discrete-time approach since many nonlinear design techniques, like backstepping or forwarding, are heavily reliant on the structure of the model.

Zero dynamics: Probably the most significant aspect of the changed structure are the so-called sampling zeros. In linear systems, it is well known that if a continuous-time linear system of relative degree r ≥ 2 is sampled, then generically for fast sampling the discrete-time models of the plant will have relative degree r = 1. In other words, sampling introduces extra zeros in the model, which are often unstable and thus render the system non-minimum phase. It is well known that controller design is much harder for non-minimum phase systems, and, moreover, there are certain fundamental performance limitations in this case. Recently, results that extend the notion of sampling zeros to nonlinear sampled-data systems have been reported; see the references in Monaco and Normand-Cyrot (2007).

Passivity: Some plant properties, like passivity, are much more restrictive in discrete time than in continuous time. Indeed, it is necessary for a
continuous-time plant to have relative degree 1 or 0 to be passive, whereas only relative degree 0 discrete-time plants may possess this property. In other words, an exact discrete-time model of a passive continuous-time plant of relative degree 1 will not be passive; that is, sampling typically destroys passivity.

Controller Design

Linearization: The simplest way to design sampled-data nonlinear systems is to linearize the plant at a given operating point. In this case, the nonlinear plant dynamics are approximated by a linear model around a chosen equilibrium, and then any of the linear sampled-data techniques can be applied to the linearized model. The obtained solution is then implemented on the true nonlinear plant. The drawback of this technique is that the solution would typically perform well only in the vicinity of the selected equilibrium point.

Nonlinear methods: An alternative is to perform designs that rely on a nonlinear plant model. These approaches can be divided into feedback linearization, the emulation design method, the (approximate and exact) discrete-time design method, and the sampled-data design method.

Feedback linearization: Some classical problems, like feedback linearization, are harder for sampled-data systems than continuous-time ones. It was shown that the class of discrete-time nonlinear systems for which feedback linearization is possible is smaller than the corresponding class of continuous-time systems (Grizzle 1987). This has led to approximate feedback linearization techniques, which consider achieving feedback linearization approximately with an error that can be reduced by reducing the length of the sampling period (Arapostathis et al. 1989).

Continuous-time design method (emulation design): Emulation is a design technique consisting of two steps. In the first step, a continuous-time controller or observer is designed for the continuous-time plant while ignoring sampling to achieve appropriate stability, performance, and/or robustness guarantees. In the second step, the designed controller/observer is discretized for implementation and the sampling period is reduced sufficiently for the method to work. This method is approximate since the continuous-time plant model approximates well the sampled-data system only for sufficiently small sampling periods. The discretization can be done using various implicit or explicit Runge-Kutta methods, such as the forward or backward Euler method (Monaco and Normand-Cyrot 2007; Yuz and Goodwin 2005). The emulation method is probably the best understood of all design methods. It was shown that a range of stability properties that can be cast in terms of dissipation inequalities are preserved in an appropriate sense under the emulation approach (Laila et al. 2002). Moreover, nonconservative estimates of the upper bound on the required sampling period in emulation have been reported recently (Nešić et al. 2009).

Exact discrete-time design method: The exact discrete-time design method assumes that an exact discrete-time model of the plant is available to the designer; see Kötta (1995) and the references cited therein. This approach is reasonable when black-box identification techniques are used for modeling. Moreover, in some rare cases it is possible to obtain the exact discrete-time model of the plant by integrating the continuous-time model with fixed inputs (assuming a zero-order hold is used). This is the case when the plant dynamics are linear while the control law is nonlinear (e.g., adaptive control) or the plant is linear with state/input constraints, which is a setup often used in model predictive control. The literature on the exact discrete-time design method is vast, and many of the nonlinear continuous-time design techniques, like backstepping, forwarding, and passivity-based designs, have been extended to discrete-time nonlinear systems; see Kötta (1995) and Grizzle (1987). A drawback of these methods is that they assume a special structure of the discrete-time nonlinear model, such as an upper or lower triangular structure, which is typically much more restrictive in discrete time than in continuous time due to the loss of structure under sampling that was discussed earlier.
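To make the two emulation steps concrete, here is a minimal sketch (hypothetical plant, controller structure, and gains, none of them from the entry): a continuous-time PI design for the scalar plant dx/dt = x + u is discretized with forward Euler, and the very same design works or fails depending on the sampling period, illustrating why emulation requires the sampling period to be reduced sufficiently.

```python
def emulated_pi_loop(T_s, n_steps=100):
    """Emulation design sketch: the continuous-time PI controller
    u = -3*x - 2*z with zdot = x (which stabilizes the plant xdot = x + u)
    is discretized with forward Euler and run at sampling period T_s."""
    x, z = 1.0, 0.0                                 # plant and controller states
    for _ in range(n_steps):
        u = -3.0 * x - 2.0 * z                      # control law designed in continuous time
        x, z = x + T_s * (x + u), z + T_s * x       # one Euler step of plant and controller
    return abs(x) + abs(z)                          # size of the closed-loop state

small = emulated_pi_loop(T_s=0.1)   # fast sampling: the emulated design works
large = emulated_pi_loop(T_s=1.5)   # slow sampling: the same design destabilizes
```

For T_s = 0.1 the closed-loop state decays to (numerically) zero, while for T_s = 1.5 it diverges, even though the underlying continuous-time design is stabilizing in both cases.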
Approximate discrete-time design method: Due to the nonlinearity, it is impossible in most cases to obtain an exact discrete-time plant model by integrating its continuous-time model equations; instead, a range of approximate discrete-time plant models, such as Runge-Kutta models, can be used for controller/observer design. It was recently shown that this design method may lead to disastrous consequences, where the controller stabilizes the approximate discrete-time plant model for all (arbitrarily small) sampling periods while the same controller destabilizes the exact discrete-time plant model for all sampling periods; see Nešić and Teel (2004) and Nešić et al. (1999). This is true even for linear systems and some commonly used discretization techniques and controller designs. These considerations have led to the development of a framework for controller design based on approximate discrete-time models (Nešić et al. 1999; Nešić and Teel 2004). This framework provides checkable conditions on the continuous-time plant model, the approximate discrete-time model, and the controller that guarantee that the controllers designed in this manner stabilize the exact discrete-time model and, hence, the nonlinear sampled-data system for sufficiently small sampling periods. The design is based on families of approximate discrete-time models parameterized by the sampling period, and the design objectives are more demanding than for continuous-time nonlinear systems. Ideas from numerical analysis are adapted to this context. This framework was used to design controllers and observers for classes of nonlinear sampled-data systems, where typically an Euler approximate discretization is employed to generate the approximate discrete-time model.

Sampled-data design method: Both emulation and discrete-time design methods have their drawbacks. Indeed, the former method ignores the sampling at the design stage, whereas the latter method ignores the inter-sample behavior and may therefore produce unacceptable inter-sample behavior. Thus, methods that use a sampled-data model of the plant for design are much more attractive. There are two possible ways in which this can be achieved for nonlinear sampled-data systems.

The first approach consists of representing nonlinear sampled-data systems as systems with time-varying delays (Teel et al. 1998). However, controller design tools for such systems need to be further developed.

The second approach involves representing the nonlinear sampled-data system as a hybrid dynamical system. Recent advances on modeling and analysis of hybrid dynamical systems (Goebel et al. 2012) offer great opportunities in this context, but the full potential of this approach is still to be exploited. Nonlinear sampled-data systems are just a small subclass of hybrid dynamical systems, and developing specific analysis and design tools tailored to this class of systems seems promising.

It should be emphasized that there are many related techniques, such as discrete-time adaptive control and model predictive control, that deal with classes of nonlinear sampled-data systems but are not a part of the mainstream nonlinear sampled-data literature.

Summary and Future Directions

Summary: Sampled-data control systems are nowadays prevalent, and there are many situations where nonlinear models need to be used to deal with wider ranges of operating conditions, more restrictive constraints, and enhanced performance specifications. Despite their increasing importance, the design of nonlinear sampled-data systems remains largely unexplored, and it is much less developed than its continuous-time counterpart. A variety of models, analysis, and design techniques make the nonlinear sampled-data literature very diverse, and a comprehensive textbook reference or a unifying approach is still missing. Many open questions remain for nonlinear sampled-data systems, such as results on multi-rate sampling, design techniques based on sampled-data models, and other generalizations which are discussed below.

Future Directions: In the 1990s, a new generation of digitally controlled systems evolved from the more classical sampled-data systems,
which are generally referred to as networked control systems (NCS); see Heemels et al. (2010) and the references cited therein. These systems exploit digital wired or wireless communication networks within the control loops. Such a setup is introduced to reduce the cost, weight, and volume of the engineered systems, but its special structure imposes new challenges due to the communication constraints, data packet dropouts, quantization of data, varying sampling periods, time delays, etc. At the same time, these systems provide new flexibilities due to the distributed computation within the control system that can be used to improve the performance and mitigate some of the undesirable network effects on the overall system performance. Moreover, embedded microprocessors allow for event-triggered and self-triggered sampling (Anta and Tabuada 2010), which are still largely unexplored, especially for nonlinear systems. Design of NCS was identified as one of the biggest challenges to the control research community in the twenty-first century, and more than a decade of intense research on this topic still has not provided a comprehensive and unifying approach for their analysis and design. Novel results on modeling and Lyapunov stability theory for (nonlinear) hybrid dynamical systems appear to offer the right analysis and design tools, but they are still to be converted into efficient and easy-to-use design tools in the control engineers' toolbox.

Cross-References

▸ Event-Triggered and Self-Triggered Control
▸ Hybrid Dynamical Systems, Feedback Control of
▸ Optimal Sampled-Data Control
▸ Sampled-Data Systems

Bibliography

Anta A, Tabuada P (2010) To sample or not to sample: self-triggered control for nonlinear systems. IEEE Trans Autom Control 55:2030–2042
Arapostathis A, Jakubczyk B, Lee HG, Marcus SI, Sontag ED (1989) The effect of sampling on linear equivalence and feedback linearization. Syst Control Lett 13:373–381
Chen T, Francis B (1994) Optimal sampled-data control systems. Springer, New York
Chen S, Billings SA, Luo W (1989) Orthogonal least squares methods and their application to non-linear system identification. Int J Control 50:1873–1896
Goebel R, Sanfelice RG, Teel AR (2012) Hybrid dynamical systems. Princeton University Press, Princeton
Grizzle JW (1987) Feedback linearization of discrete-time systems. Syst Control Lett 9:411–416
Heemels M, Teel AR, van de Wouw N, Nešić D (2010) Networked control systems with communication constraints: tradeoffs between transmission intervals, delays and performance. IEEE Trans Autom Control 55:1781–1796
Juditsky A, Hjalmarsson H, Benveniste A, Delyon B, Ljung L, Sjöberg J, Zhang Q (1995) Nonlinear black-box models in system identification: mathematical foundations. Automatica 31:1725–1750
Khalil HK (2004) Performance recovery under output feedback sampled-data stabilization of a class of nonlinear systems. IEEE Trans Autom Control 49:2173–2184
Kötta U (1995) Inversion method in the discrete-time nonlinear control systems synthesis problems. Lecture notes in control and information sciences, vol 205. Springer, Berlin
Laila DS, Nešić D, Teel AR (2002) Open and closed loop dissipation inequalities under sampling and controller emulation. Eur J Control 8:109–125
Monaco S, Normand-Cyrot D (2007) Advanced tools for nonlinear sampled-data systems' analysis and control. Eur J Control 13:221–241
Nešić D, Teel AR (2004) A framework for stabilization of nonlinear sampled-data systems based on their approximate discrete-time models. IEEE Trans Autom Control 49:1103–1122
Nešić D, Teel AR, Kokotović PV (1999) Sufficient conditions for stabilization of sampled-data nonlinear systems via discrete-time approximations. Syst Control Lett 38:259–270
Nešić D, Teel AR, Carnevale D (2009) Explicit computation of the sampling period in emulation of controllers for nonlinear sampled-data systems. IEEE Trans Autom Control 54:619–624
Teel AR, Nešić D, Kokotović PV (1998) A note on input-to-state stability of sampled-data nonlinear systems. In: Proceedings of the IEEE conference on decision and control '98, Tampa, pp 2473–2478
Yuz JI, Goodwin GC (2005) On sampled-data models for nonlinear systems. IEEE Trans Autom Control 50:1477–1488
Nonlinear System Identification Using Particle Filters

Thomas B. Schön
Department of Information Technology, Uppsala University, Uppsala, Sweden

Abstract

Particle filters are computational methods opening up for systematic inference in nonlinear/non-Gaussian state-space models. The particle filters constitute the most popular sequential Monte Carlo (SMC) methods. This is a relatively recent development, and the aim here is to provide a brief exposition of these SMC methods and how they are key enabling algorithms in solving nonlinear system identification problems. The particle filters are important for both frequentist (maximum likelihood) and Bayesian nonlinear system identification.

Keywords

Bayesian; Backward simulation; Maximum likelihood; Markov chain Monte Carlo (MCMC); Particle filter; Particle MCMC; Particle smoother; Sequential Monte Carlo

Introduction

The state-space model (SSM) offers a general tool for modeling and analyzing dynamical phenomena. The SSM consists of two stochastic processes: the states {x_t}_{t≥1} and the measurements {y_t}_{t≥1}, which are related according to

x_{t+1} | (x_t = x_t) ~ f_θ(x_{t+1} | x_t, u_t),   (1a)
y_t | (x_t = x_t) ~ h_θ(y_t | x_t, u_t),   (1b)

and the initial state x_1 ~ μ(x_1). We use boldface for random variables, and "~" means "distributed according to." The notation x_{t+1} | (x_t = x_t) stands for the conditional probability of x_{t+1} given x_t = x_t. The state process {x_t}_{t≥1} is a Markov process, implying that we only need to condition on the most recent state x_t, since that contains all information about the past. Furthermore, θ denotes the parameters, and f_θ(·) and h_θ(·) are probability density functions encoding the dynamic and the measurement models, respectively. In the interest of a compact notation, we will suppress the input u_t throughout the text.

The SSM introduced in (1) is general in that it allows for nonlinear and non-Gaussian relationships. Furthermore, it includes both black-box and gray-box models on state-space form. Nonlinear black-box and gray-box models are covered by ▸ Nonlinear System Identification: An Overview of Common Approaches. The off-line nonlinear system identification problem can (slightly simplified) be expressed as recovering information about the parameters θ based on the information in the T measured inputs u_{1:T} ≜ {u_1, …, u_T} and outputs y_{1:T}. For a thorough exposition of the system identification problem, we refer to ▸ System Identification: An Overview.

Nonlinear system identification has a long history, and a common assumption of the past has been that of linearity and Gaussianity. This assumption is very restrictive, and we have now witnessed well over half a century of research devoted to finding useful approximate algorithms allowing this assumption to be weakened. This development has significantly intensified during the past two decades of research on sequential Monte Carlo (SMC) methods (including particle filters and particle smoothers). However, the use of SMC for nonlinear system identification is more recent than that. The aim here is to introduce the key ideas enabling the use of SMC methods in solving nonlinear system identification problems, and as we will see, it is not a matter of straightforward application. The development of SMC-based identification follows two clear trends that are indeed more general: (1) the problems we are working with are analytically intractable, and hence, the mindset has to shift from searching for closed-form solutions to the use of computational methods, and (2) the new algorithms have basic building blocks that are themselves algorithms. Both these trends call for new developments.
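For concreteness (a sketch of my own, not part of the entry), an SSM of the form (1) can be instantiated and simulated as follows; the particular dynamic and measurement functions are a commonly used nonlinear benchmark, and all noise levels are arbitrary choices for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_ssm(T):
    """Draw one realization (x_{1:T}, y_{1:T}) from an SSM of the form (1),
    with additive Gaussian noise so that f and h are Gaussian densities
    centered at the deterministic maps below (all values are arbitrary)."""
    x = np.empty(T)
    y = np.empty(T)
    x[0] = rng.normal(0.0, 1.0)                        # x_1 ~ mu(x_1)
    for t in range(T):
        y[t] = 0.05 * x[t]**2 + rng.normal(0.0, 0.5)   # y_t ~ h(. | x_t)
        if t + 1 < T:                                  # x_{t+1} ~ f(. | x_t)
            x[t+1] = (0.5 * x[t] + 25.0 * x[t] / (1.0 + x[t]**2)
                      + 8.0 * np.cos(1.2 * (t + 1)) + rng.normal(0.0, 1.0))
    return x, y

x, y = simulate_ssm(T=100)
```

Data generated this way can serve as input to the inference methods discussed in the remainder of the entry, since only the sampled realizations y_{1:T} are assumed to be available to the identification algorithms.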
Before the SMC methods are introduced in section "Sequential Monte Carlo", their need is clearly explained by formulating both the Bayesian and the maximum likelihood identification problems in sections "Bayesian Problem Formulation" and "Maximum Likelihood Problem Formulation", respectively. Solutions to these problems are then provided in sections "Bayesian Solutions" and "Maximum Likelihood Solutions", respectively. Finally, we give some intuition for online (recursive) solutions in section "Online Solutions", and in section "Summary and Future Directions", we conclude with a summary and directions for future research.

Bayesian Problem Formulation

In formulating the Bayesian problem, the parameters θ are modeled as unknown stochastic variables, i.e., the model (1) needs to be augmented with a prior density for the parameters, θ ~ p(θ). The aim in Bayesian system identification is to compute the posterior density of θ given the measurements, p(θ | y_{1:T}). More generally, we typically compute the joint posterior of the parameters θ and the states x_{1:T},

p(θ, x_{1:T} | y_{1:T}) = p(x_{1:T} | θ, y_{1:T}) p(θ | y_{1:T}).   (2)

By explicitly including the state variables x_{1:T} in the problem formulation according to (2), they take on the role of auxiliary variables. The reason for including the state variables x_{1:T} as auxiliary variables is that the alternative of excluding them would require us to analytically marginalize the states x_{1:T}. This is not possible for the model (1) under study. However, once we have an approximation of p(θ, x_{1:T} | y_{1:T}) available, the density p(θ | y_{1:T}) is easily obtained by straightforward marginalization.

Maximum Likelihood Problem Formulation

In formulating the maximum likelihood (ML) problem, the parameters θ are modeled as unknown deterministic variables. The ML formulation offers a systematic way of computing point estimates of the unknown parameters θ in a model by making use of the information available in the obtained measurements y_{1:T}. The ML estimate is obtained by finding the θ that maximizes the so-called log-likelihood function, which is defined as

ℓ_T(θ) ≜ log p_θ(y_{1:T}) = Σ_{t=1}^{T} log p_θ(y_t | y_{1:t-1}).   (3)

Note that we use θ as a subindex to denote that the corresponding probability density function is parameterized by θ, analogously to what was done in (1). The one-step-ahead predictor p_θ(y_t | y_{1:t-1}) is computed by marginalizing p_θ(y_t, x_t | y_{1:t-1}) = h_θ(y_t | x_t) p_θ(x_t | y_{1:t-1}) w.r.t. x_t, i.e., integrating out x_t from p_θ(y_t, x_t | y_{1:t-1}). To summarize, the ML estimate θ̂_ML is obtained by solving the following optimization problem:

θ̂_ML ≜ arg max_θ Σ_{t=1}^{T} log ∫ h_θ(y_t | x_t) p_θ(x_t | y_{1:t-1}) dx_t.   (4)

This problem formulation clearly reveals the important fact that the nonlinear state inference problem (here computing p_θ(x_t | y_{1:t-1})) is inherent in any maximum likelihood formulation for identification of SSMs. For linear Gaussian models, the Kalman filter offers closed-form solutions for the state inference problem, but for nonlinear models, there are no closed-form solutions available.

Sequential Monte Carlo

Solving the nonlinear system identification problem implicitly requires us to solve various nonlinear state inference problems. We will, for example, need to approximate the smoothing density p(x_{1:T} | y_{1:T}) and the filtering density p(x_t | y_{1:t}). The SMC samplers offer approximate solutions to these and other nonlinear state inference problems, where
the accuracy is only limited by the available computational resources. This section only deals with the state inference problem, allowing us to drop the θ in the notation for brevity.

Most SMC samplers hinge upon importance sampling, motivating section "Importance Sampling". In section "Particle Filter", we make use of importance sampling in computing an approximation of the filtering density p(x_t | y_{1:t}), and in section "Particle Smoother", a particle smoothing strategy is introduced to approximately compute p(x_{1:T} | y_{1:T}).

Importance Sampling

Let z be a random variable distributed according to some complicated density π(z) and let φ(·) be some function of interest. Importance sampling offers a systematic way of evaluating integrals of the form

E_π[φ(z)] = ∫ φ(z) π(z) dz,   (5)

without requiring samples directly generated from π(z). The density π(z) is referred to as the target density, i.e., the density we are trying to sample from. The importance sampler relies on a proposal density q(z), from which it is simple to generate samples z^i ~ q(z), i = 1, …, N. Since each sample z^i is drawn from the proposal density rather than from the target density π(z), we must somehow account for this discrepancy. The so-called importance weights w̃^i = π(z^i)/q(z^i) encode the difference. By normalizing the weights, w^i = w̃^i / Σ_{j=1}^{N} w̃^j, we obtain a set of weighted samples {z^i, w^i}_{i=1}^{N} that can be used to approximately evaluate the integral (5), resulting in E_π[φ(z)] ≈ Σ_{i=1}^{N} w^i φ(z^i). Schön and Lindsten (2014) provide an introduction to importance sampling within a dynamical systems setting, whereas Robert and Casella (2004) provide a general treatment.

Particle Filter

The solution to the nonlinear filtering problem is provided by the following two recursive equations:

p(x_t | y_{1:t}) = h(y_t | x_t) p(x_t | y_{1:t-1}) / p(y_t | y_{1:t-1}),   (6a)

p(x_t | y_{1:t-1}) = ∫ f(x_t | x_{t-1}) p(x_{t-1} | y_{1:t-1}) dx_{t-1}.   (6b)

In the general case (1), there are no analytical solutions available for the above equations. The particle filter maintains an empirical approximation of the solution, which at time t−1 amounts to

p̂^N(x_{t-1} | y_{1:t-1}) = Σ_{i=1}^{N} w^i_{t-1} δ_{x^i_{t-1}}(x_{t-1}),   (7)

where δ_{x^i_{t-1}}(x_{t-1}) denotes the Dirac delta mass located at x^i_{t-1}. Furthermore, w^i_{t-1} and x^i_{t-1} are referred to as the weights and the particles, respectively. We will now derive the particle filter by designing an importance sampler allowing us to approximately solve (6). The derivation is performed in an inductive fashion, starting by assuming that p(x_{t-1} | y_{1:t-1}) is approximated by (7). Inserting (7) into (6b) results in p̂^N(x_t | y_{1:t-1}) = Σ_{i=1}^{N} w^i_{t-1} f(x_t | x^i_{t-1}), which is used in (6a) to compute an approximation of the filtering density p(x_t | y_{1:t}) up to proportionality. Hence, this allows us to target p(x_t | y_{1:t}) using an importance sampler, where the form of p̂^N(x_t | y_{1:t-1}) suggests that new samples can be proposed according to

x^i_t ~ q(x_t | y_{1:t}) = Σ_{i=1}^{N} w^i_{t-1} f(x_t | x^i_{t-1}).   (8)

It is worth noting that we can obtain a more general algorithm by replacing f(x_t | x^i_{t-1}) in the above mixture with a density q(x_t | x^i_{t-1}, y_t). However, in the interest of a simple, but still highly useful, algorithm we keep (8). The proposal density (8) is a weighted mixture consisting of N components, which means that we can generate a sample x̃^i_t from it via a two-step procedure: first we select which component to sample from, and secondly we generate a sample from that component. More precisely, the first
Algorithm 1 Bootstrap particle filter (for i = 1, …, N)
1. Initialization (t = 1):
   (a) Sample x^i_1 ~ μ(x_1).
   (b) Compute the importance weights w̃^i_1 = h(y_1 | x^i_1) and normalize w^i_1 = w̃^i_1 / Σ_{j=1}^{N} w̃^j_1.
2. For t = 2 to T do:
   (a) Resample {x^i_{t-1}, w^i_{t-1}}, resulting in equally weighted particles {x̃^i_{t-1}, 1/N}.
   (b) Sample x^i_t ~ f(x_t | x̃^i_{t-1}).
   (c) Compute the importance weights w̃^i_t = h(y_t | x^i_t) and normalize w^i_t = w̃^i_t / Σ_{j=1}^{N} w̃^j_t.

part amounts to selecting one of the N particles {x^i_{t-1}}_{i=1}^{N} according to

P(x̃_{t-1} = x^i_{t-1} | {x^j_{t-1}, w^j_{t-1}}_{j=1}^{N}) = w^i_{t-1},

where the selected particle is denoted as x̃_{t-1}. By repeating this N times, we obtain a set of equally weighted particles {x̃^i_{t-1}}_{i=1}^{N}, constituting an empirical approximation of p(x_{t-1} | y_{1:t-1}), analogously to (7). We can then draw x^i_t ~ f(x_t | x̃^i_{t-1}) to generate a realization from the proposal (8). This procedure, which turns a weighted set of samples into an unweighted one, is commonly referred to as resampling.

Finally, using the approximation p̂^N(x_t | y_{1:t-1}) in (6a) and the proposal density according to (8) allows us to compute the weights as w̃^i_t = h(y_t | x^i_t). Once all the N weights are computed and normalized, we obtain a collection of weighted particles {x^i_t, w^i_t}_{i=1}^{N} targeting the filtering density at time t. We have now (in a slightly nonstandard fashion) derived the so-called bootstrap particle filter, which was the first particle filter, introduced by Gordon et al. (1993) two decades ago. Since the introduction of Algorithm 1, the surrounding theory and practice have undergone significant developments; see, e.g., Doucet and Johansen (2011) for an up-to-date survey. The weights {w^i_{1:T}}_{i=1}^{N} and the particles {x^i_{1:T}}_{i=1}^{N} are random variables, and in executing the algorithm, we generate one realization from these. This is a useful insight both when it comes to understanding and when it comes to the analysis of the particle filters. There is by now a fairly good understanding of the convergence properties of the particle filter; see, e.g., Doucet and Johansen (2011) for basic results and further pointers into the literature.

Particle Smoother

A particle smoother is an SMC method targeting the joint smoothing density p(x_{1:T} | y_{1:T}) (or one of its marginals). There are several different strategies for deriving particle smoothers. Rather than mentioning them all, we introduce one powerful and increasingly popular strategy based on backward simulation, giving rise to the family of forward filtering/backward simulation (FFBSi) samplers.

In an FFBSi sampler, the joint smoothing density p(x_{1:T} | y_{1:T}) is targeted by complementing a forward particle filter with a second recursion evolving in the time-reversed direction. The following factorization of the joint smoothing density

p(x_{1:T} | y_{1:T}) = (Π_{t=1}^{T-1} p(x_t | x_{t+1}, y_{1:t})) p(x_T | y_{1:T})

immediately suggests a highly useful time-reversed recursion. Start by generating a sample x̃_T ~ p(x_T | y_{1:T}). We then continue generating samples backward in time by sampling from the so-called backward kernel p(x_t | x_{t+1}, y_{1:t}) according to x̃_t ~ p(x_t | x̃_{t+1}, y_{1:t}), for t = T−1, …, 1. The resulting sample x̃_{1:T} ≜ (x̃_1, …, x̃_T) is then by construction a sample from the joint smoothing density. Hence, in performing M backward simulations, we obtain the following approximation of the joint smoothing density:

p̂^M(x_{1:T} | y_{1:T}) = (1/M) Σ_{i=1}^{M} δ_{x̃^i_{1:T}}(x_{1:T}).   (9)

For details on how to design algorithms implementing the backward simulation strategy,
886 Nonlinear System Identification Using Particle Filters

derivations, properties, and references, we refer breaking work by Andrieu et al. (2010). During
to the recent survey on backward simulation the past 3 years, the PG samplers have developed
methods by Lindsten and Schön (2013). quite a lot, and improved versions are surveyed
and explained by Lindsten and Schön (2013).
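As a concrete illustration of Algorithm 1 and of the backward-simulation idea, the following minimal NumPy sketch runs a bootstrap particle filter and then draws one backward trajectory for an illustrative scalar linear-Gaussian model. The model, tuning constants, and function names are assumptions made for the example, not part of this entry:

```python
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_pf(y, N, mu_sample, f_sample, h_density):
    """Algorithm 1 (bootstrap particle filter); all names are illustrative."""
    T = len(y)
    X = np.empty((T, N))                       # particles x_t^i
    W = np.empty((T, N))                       # normalized weights w_t^i
    X[0] = mu_sample(N)                        # 1(a): x_1^i ~ mu(x_1)
    w = h_density(y[0], X[0])                  # 1(b): w~_1^i = h(y_1 | x_1^i)
    W[0] = w / w.sum()
    for t in range(1, T):
        idx = rng.choice(N, size=N, p=W[t-1])  # 2(a): resampling
        X[t] = f_sample(X[t-1, idx])           # 2(b): x_t^i ~ f(x_t | x~_{t-1}^i)
        w = h_density(y[t], X[t])              # 2(c): weight and normalize
        W[t] = w / w.sum()
    return X, W

def backward_trajectory(X, W, f_density):
    """One FFBSi backward draw x~_{1:T} from the filter output, cf. (9)."""
    T, N = X.shape
    traj = np.empty(T)
    j = rng.choice(N, p=W[-1])                 # x~_T ~ p(x_T | y_{1:T}) (approx.)
    traj[-1] = X[-1, j]
    for t in range(T - 2, -1, -1):
        # backward-kernel weights: w_t^i reweighted by f(x~_{t+1} | x_t^i)
        bw = W[t] * f_density(traj[t + 1], X[t])
        j = rng.choice(N, p=bw / bw.sum())
        traj[t] = X[t, j]
    return traj

# Illustrative scalar model: x_{t+1} = 0.7 x_t + v_t, y_t = x_t + e_t.
f_samp = lambda x: 0.7 * x + rng.standard_normal(x.shape)
f_dens = lambda xn, x: np.exp(-0.5 * (xn - 0.7 * x) ** 2)
h_dens = lambda yt, x: np.exp(-0.5 * (yt - x) ** 2 / 0.1)
mu_samp = lambda N: rng.standard_normal(N)

y = np.array([0.3, 0.5, 0.1, -0.2, 0.0])
X, W = bootstrap_pf(y, 500, mu_samp, f_samp, h_dens)
x_filt = (W * X).sum(axis=1)                   # filtering means
x_traj = backward_trajectory(X, W, f_dens)
```

Repeating `backward_trajectory` M times yields the equally weighted trajectories that form the empirical smoothing approximation (9).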
Bayesian Solutions

Strategies
The posterior density (2) is analytically intractable, but we can make use of Markov chain Monte Carlo (MCMC) samplers to address the inference problem. An MCMC sampler allows us to approximately generate samples from an arbitrary target density π(z). This is done by simulating a Markov chain (i.e., a Markov process) {z[r]}_{r≥1}, which is constructed in such a way that the stationary distribution of the chain is given by π(z). The sample paths {z[r]}_{r=1}^R of the chain can then be used to draw inference about the target distribution. Two constructive ways of finding a suitable Markov chain to simulate are provided by the Metropolis-Hastings (MH) and the Gibbs samplers, where the latter can be interpreted as a special case of the former. See, e.g., Robert and Casella (2004) for details on MCMC. A Gibbs sampler targeting p(θ, x_{1:T} | y_{1:T}) is given by:
(i) Draw θ′ ∼ p(θ | x_{1:T}, y_{1:T}).
(ii) Draw x′_{1:T} ∼ p(x_{1:T} | θ′, y_{1:T}).
The second step is hard, since it requires us to generate a sample from the joint smoothing density. Simply replacing step (ii) with a backward simulator does not result in a valid method (Andrieu et al. 2010).

One interesting solution is provided by the family of particle MCMC (PMCMC) samplers, first introduced by Andrieu et al. (2010). PMCMC provides a systematic way of combining SMC and MCMC, where SMC is used to construct the proposal density for the MCMC sampler. The so-called particle Gibbs (PG) sampler resolves the problems briefly mentioned above by a nontrivial modification of the SMC algorithm. Introducing the PG sampler lies outside the scope of this work; we refer the reader to the groundbreaking work by Andrieu et al. (2010). During the past 3 years, the PG samplers have developed quite a lot, and improved versions are surveyed and explained by Lindsten and Schön (2013).

A Nontrivial Example
To place PMCMC in the context of nonlinear system identification, we will now solve a nontrivial identification problem. The PG sampler is used to compute the posterior density for a general Wiener model (a linear Gaussian system followed by a static nonlinearity) (Giri and Bai 2010):

x_{t+1} = [A B] [x_t; u_t] + v_t,  v_t ∼ N(0, Q),   (10a)
z_t = C x_t,   (10b)
y_t = g(z_t) + e_t,  e_t ∼ N(0, r).   (10c)

Based on observed inputs u_{1:T} and outputs y_{1:T}, we wish to identify the model (10). We place a matrix normal inverse Wishart (MNIW) prior on {(A, B), Q}, an inverse gamma prior on r, and a Gaussian process prior (Rasmussen and Williams 2006) on the function g, resulting in a semiparametric model. We can, without loss of generality, fix the matrix C according to C = (1, 0, …, 0). For a complete model specification, we refer to Lindsten et al. (2013).

The posterior distribution p(θ, x_{1:T} | y_{1:T}) is computed using a newly developed PG sampler referred to as particle Gibbs with ancestor sampling (PGAS); see Lindsten and Schön (2013). In the present experiment we make use of T = 1,000 observations. The dimension of the state space is 6, the linear dynamics contain complex poles resulting in oscillations, as seen in Fig. 1, and the nonlinearity is non-monotonic; see Fig. 2. A subspace method is used to find an initial guess for the linear system, and the static nonlinearity is initialized using a linear function (i.e., a straight line).
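To make the structure of the Wiener model (10) concrete, the following sketch simulates input-output data from a small, hypothetical second-order instance. The experiment in the text uses a sixth-order system together with the MNIW and Gaussian process priors above; none of those are reproduced here, and all numeric values below are assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 2nd-order instance of the Wiener model (10).
A = np.array([[1.6, -0.8], [1.0, 0.0]])    # complex poles -> oscillations
B = np.array([1.0, 0.0])
C = np.array([1.0, 0.0])                   # C fixed to (1, 0, ..., 0)
q, r = 0.05, 0.01                          # process noise scale, meas. variance
g = lambda z: z - 0.5 * z ** 2             # a non-monotonic static nonlinearity

def simulate_wiener(u):
    x = np.zeros(2)
    y = np.empty(len(u))
    for t in range(len(u)):
        z = C @ x                                            # (10b)
        y[t] = g(z) + np.sqrt(r) * rng.standard_normal()     # (10c)
        x = A @ x + B * u[t] + q * rng.standard_normal(2)    # (10a), Q = q^2 I
    return y

u = rng.standard_normal(1000)              # T = 1,000 observations, as in the text
y = simulate_wiener(u)
```

Identification would then amount to recovering (A, B, Q, r) and the function g from (u, y) alone, which is what the PGAS sampler does in the experiment.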
Nonlinear System Identification Using Particle Filters, Fig. 1 Bode diagram (magnitude in dB and phase in degrees versus frequency in rad/s) of the sixth-order linear system. The black curve is the true system. The red curve is the estimated posterior mean of the Bode diagram, and the shaded area is the 99 % Bayesian credibility interval

Nonlinear System Identification Using Particle Filters, Fig. 2 The static nonlinearity h(z) plotted against z. The black curve is the true static nonlinearity (non-monotonic). The red curve is the estimated posterior mean of the static nonlinearity, and the shaded area is the 99 % Bayesian credibility interval

It is worth pausing for a moment to reflect upon the posterior distribution p(θ, x_{1:T} | y_{1:T}) that we are computing. The unknown “parameters” θ live in the space Θ = R^64 × F, where F is an appropriate function space. The states x_{1:T} live in the space R^{6×1,000}. Hence, p(θ, x_{1:T} | y_{1:T}) is actually a rather complicated object for this example.

Using the PGAS sampler (with N = 15 particles), we construct a Markov chain {θ[r], x_{1:T}[r]}_{r=1}^R with p(θ, x_{1:T} | y_{1:T}) as its stationary distribution. We run this Markov chain for R = 25,000 iterations, where the first 10,000 are discarded. The result is visualized in Figs. 1 and 2, where we plot the Bode diagram for the linear system and the static nonlinearity, respectively. In both figures we also provide the 99 % Bayesian credibility interval. MATLAB code for Bayesian identification of Wiener models is available from user.it.uu.se/~thosc112/research/software.html.

The resonance peaks are accurately modeled, but the result is less accurate at low frequencies (likely due to a lack of excitation). The fact that the posterior mean is inaccurate at low frequencies is encoded in our estimate of the posterior distribution, as shown by the credibility intervals.

In Figs. 1 and 2, we have visualized not only the posterior mean but also the uncertainty for the entire model. We could do this since the model is a linear dynamical system followed by a static nonlinearity. It would be most interesting to come up with ways of visualizing the uncertainty inherent in general nonlinear dynamical systems.

Maximum Likelihood Solutions

Identifying the parameters θ in a general nonlinear SSM using maximum likelihood amounts to solving the optimization problem (3). This is a challenging problem for several reasons; for example, it requires the computation of the predictor density p_θ(y_t | y_{1:t−1}). Furthermore, its gradient (possibly also its Hessian) is very useful in setting up an efficient optimization algorithm. There are no closed-form solutions available for these objects, forcing us to rely on approximations. The SMC methods briefly introduced in section “Sequential Monte Carlo” provide rather natural tools for this task, since they are capable of producing approximations whose accuracy is only limited by the available computational resources.

To establish a clear interface between the maximum likelihood problem (3) and the SMC methods, it has proven natural to make use of the expectation maximization (EM) algorithm (Dempster et al. 1977). The EM algorithm proceeds in an iterative fashion to compute ML estimates of unknown parameters θ in probabilistic models involving latent variables. The strategy underlying the EM algorithm is to exploit the structure inherent in the probabilistic model to separate the original problem into two closely linked problems. The first problem amounts to computing the so-called intermediate quantity

Q(θ, θ′) ≜ ∫ log p_θ(x_{1:T}, y_{1:T}) p_{θ′}(x_{1:T} | y_{1:T}) dx_{1:T} = E_{θ′}[log p_θ(x_{1:T}, y_{1:T}) | y_{1:T}],   (11)

where we have already made use of the fact that the latent variables in an SSM are given by the states. Furthermore, θ′ denotes a particular value for the parameters θ. We can show that by choosing a new θ such that Q(θ, θ′) ≥ Q(θ′, θ′), the likelihood is either increased or left unchanged, i.e., ℓ_T(θ) ≥ ℓ_T(θ′). The EM algorithm now suggests itself in that we can generate a sequence of iterates {θ_k}_{k≥1} that guarantees that the log-likelihood is not decreased for increasing k by alternating the following two steps: (1) (expectation) compute the intermediate quantity Q(θ, θ_k) and (2) (maximization) compute the subsequent iterate θ_{k+1} by maximizing Q(θ, θ_k) w.r.t. θ. This procedure is then repeated until convergence, guaranteeing convergence to a stationary point on the likelihood surface.

The FFBSi particle smoother offers an approximation of the joint smoothing density p_{θ′}(x_{1:T} | y_{1:T}) according to (9), which inserted into (11) provides an approximate solution Q̂^M(θ, θ′) to the expectation step. In solving the maximization step, we typically want gradients of the intermediate quantity, ∇_θ Q̂^M(θ, θ′). These can also be approximated using (9). The above development is summarized in Algorithm 2, providing a solution where the basic building blocks are themselves complex algorithms: an SMC algorithm for the E step and a nonlinear optimization algorithm for the M step. This means that we have the option of replacing the FFBSi particle smoother in step 2a with any other algorithm capable of producing estimates of the joint smoothing density. The family of PMCMC methods introduced in section “Bayesian Solutions” contains several highly interesting alternatives. A detailed account of Algorithm 2 is provided by Schön et al. (2011); see also Cappé et al. (2005).

Algorithm 2 EM for nonlinear system identification
1. Initialization: Set k = 0 and initialize θ_k.
2. Expectation (E) step:
(a) Compute an approximation p̂_{θ_k}^M(x_{1:T} | y_{1:T}), for example, using an FFBSi sampler.
(b) Calculate Q̂^M(θ, θ_k).
3. Maximization (M) step: Compute
θ_{k+1} = arg max_θ Q̂^M(θ, θ_k).
4. Check the termination condition. If satisfied, terminate; otherwise, update k → k + 1 and return to step 2.
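Algorithm 2 can be made concrete with a self-contained sketch. The scalar model below, x_{t+1} = θ x_t + v_t, y_t = x_t + e_t, is an illustrative assumption chosen so that the M step is available in closed form; the E step uses a bootstrap filter with backward simulation, i.e., the FFBSi approximation (9):

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative scalar model (an assumption for this sketch, not from the text):
#   x_{t+1} = theta x_t + v_t, v_t ~ N(0, 1);  y_t = x_t + e_t, e_t ~ N(0, 0.5)
theta_true, Qv, Re = 0.8, 1.0, 0.5

def simulate(T):
    x, ys = 0.0, []
    for _ in range(T):
        ys.append(x + np.sqrt(Re) * rng.standard_normal())
        x = theta_true * x + np.sqrt(Qv) * rng.standard_normal()
    return np.array(ys)

def smoother_trajectories(y, theta, N=200, M=20):
    """E step: bootstrap PF plus backward simulation, giving M draws as in (9)."""
    T = len(y)
    X, W = np.empty((T, N)), np.empty((T, N))
    X[0] = rng.standard_normal(N)
    w = np.exp(-0.5 * (y[0] - X[0]) ** 2 / Re)
    W[0] = w / w.sum()
    for t in range(1, T):
        idx = rng.choice(N, N, p=W[t - 1])                       # resample
        X[t] = theta * X[t - 1, idx] + np.sqrt(Qv) * rng.standard_normal(N)
        w = np.exp(-0.5 * (y[t] - X[t]) ** 2 / Re)
        W[t] = w / w.sum()
    trajs = np.empty((M, T))
    for m in range(M):                                           # backward passes
        j = rng.choice(N, p=W[-1])
        trajs[m, -1] = X[-1, j]
        for t in range(T - 2, -1, -1):
            bw = W[t] * np.exp(-0.5 * (trajs[m, t + 1] - theta * X[t]) ** 2 / Qv)
            j = rng.choice(N, p=bw / bw.sum())
            trajs[m, t] = X[t, j]
    return trajs

def em(y, theta0, iters=10):
    """Algorithm 2; here the M step maximizing Q^M(theta, theta_k) is closed form."""
    theta = theta0
    for _ in range(iters):
        tr = smoother_trajectories(y, theta)                     # step 2 (E step)
        theta = (tr[:, :-1] * tr[:, 1:]).sum() / (tr[:, :-1] ** 2).sum()  # step 3
    return theta

y = simulate(150)
theta_hat = em(y, theta0=0.2)
```

For this linear-in-θ dynamics, maximizing Q̂^M reduces to a least-squares regression of the smoothed states on their predecessors; for general models, step 3 would instead call a numerical optimizer using gradients of Q̂^M.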
Finally, we mention Fisher's identity, opening up yet another avenue for designing ML estimators using SMC approximations. Even if we are not interested in using EM when solving the nonlinear system identification problem, the intermediate quantity (11) is useful. The reason is provided via Fisher's identity,

∇_θ ℓ_T(θ)|_{θ=θ′} = ∇_θ Q(θ, θ′)|_{θ=θ′} = ∫ ∇_θ log p_θ(x_{1:T}, y_{1:T})|_{θ=θ′} p_{θ′}(x_{1:T} | y_{1:T}) dx_{1:T},   (12)

which provides a means to compute approximations of the log-likelihood gradient. Hessian approximations are also available, but these are more involved. Hence, Fisher's identity opens up for the direct use of any off-the-shelf gradient-based optimization method in solving (4).

Online Solutions

Online (also referred to as recursive or adaptive) identification refers to the problem where the parameter estimate is updated based on the parameter estimate at the previous time step and the new measurement. This is used when we are dealing with big data sets and in real-time situations. SMC offers interesting opportunities when it comes to deriving online solutions for nonlinear state-space models. The most direct idea is simply to make use of a gradient method,

θ_t = θ_{t−1} + γ_t ∇_θ log p_θ(y_t | y_{1:t−1}),

where {γ_t} is the sequence of step sizes. Fisher's identity (12) opens up for the use of SMC in approximating ∇_θ log p_θ(y_t | y_{1:t−1}). However, this leads to a rapidly increasing variance, something that can be dealt with by the so-called “marginal” Fisher identity; see Poyiadjis et al. (2011) for details.

An interesting alternative is provided by an online EM algorithm; see, e.g., Cappé (2011) for a solid introduction. The online EM approaches rely on the additive properties of the Q-function. The area of online solutions via SMC is likely to grow in the future, as there is a clear need motivated by the constantly growing data sets, and there are also clear theoretical opportunities.

Summary and Future Directions

We have discussed how SMC samplers can be used to solve nonlinear system identification problems, by sketching both Bayesian and ML solutions. A common feature of the resulting algorithms is that they are (nontrivial) combinations of more basic algorithms. We have, for example, seen the combined use of a particle smoother and a nonlinear optimization solver in Algorithm 2 to compute ML estimates. As another example we have the class of PMCMC methods, where the basic building blocks are provided by SMC samplers and MCMC samplers. The use of SMC and MCMC methods for nonlinear system identification has only just started to take off, and it presents very interesting future prospects. Some directions for future research are as follows: (1) The family of PMCMC algorithms is rich and fast growing, with great potential for further developments. For example, its use in solving the state smoothing problem (i.e., computing p(x_{1:T} | y_{1:T})) is likely to provide better algorithms in the near future. (2) Related to this is the potential to design new particle smoothers capable of generating new particles also in the time-reversed direction. (3) There are open and highly relevant challenges when it comes to designing backward simulators for Bayesian nonparametric methods (Hjort et al. 2010). A key question here is how to represent the backward kernel p(x_t | x_{t+1}, y_{1:t}) in such nonparametric settings. (4) The use of Bayesian nonparametric models will open up interesting possibilities for hybrid system identification, since they allow us to systematically express and work with uncertainties over segmentations.

Cross-References

▶ Nonlinear System Identification: An Overview of Common Approaches
▶ System Identification: An Overview
Recommended Reading

An overview of SMC methods for system identification is provided by Kantas et al. (2009), and a thorough introduction to SMC is provided by Doucet and Johansen (2011). The forthcoming monograph by Schön and Lindsten (2014) provides a textbook introduction to particle filters/smoothers (SMC), MCMC, PMCMC, and their use in solving problems in nonlinear system identification and nonlinear state inference. A self-contained introduction to particle smoothers and the backward simulation idea is provided by Lindsten and Schön (2013). The work by Cappé et al. (2005) also contains a lot of very relevant material in this respect.

Bibliography

Andrieu C, Doucet A, Holenstein R (2010) Particle Markov chain Monte Carlo methods. J R Stat Soc Ser B 72(2):1–33
Cappé O (2011) Online EM algorithm for hidden Markov models. J Comput Graph Stat 20(3):728–749
Cappé O, Moulines E, Rydén T (2005) Inference in hidden Markov models. Springer, New York
Dempster A, Laird N, Rubin D (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B 39(1):1–38
Doucet A, Johansen AM (2011) A tutorial on particle filtering and smoothing: fifteen years later. In: Crisan D, Rozovsky B (eds) Nonlinear filtering handbook. Oxford University Press, Oxford
Giri F, Bai E-W (eds) (2010) Block-oriented nonlinear system identification. Volume 404 of Lecture notes in control and information sciences. Springer, Berlin/Heidelberg
Gordon NJ, Salmond DJ, Smith AFM (1993) Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proc Radar Signal Process 140:107–113
Hjort N, Holmes C, Müller P, Walker S (eds) (2010) Bayesian nonparametrics. Cambridge University Press, Cambridge/New York
Kantas N, Doucet A, Singh S, Maciejowski J (2009) An overview of sequential Monte Carlo methods for parameter estimation in general state-space models. In: Proceedings of the 15th IFAC symposium on system identification, Saint-Malo, pp 774–785
Lindsten F, Schön TB (2013) Backward simulation methods for Monte Carlo statistical inference. Found Trends Mach Learn 6(1):1–143
Lindsten F, Schön TB, Jordan MI (2013) Bayesian semiparametric Wiener system identification. Automatica 49(7):2053–2063
Poyiadjis G, Doucet A, Singh SS (2011) Particle approximations of the score and observed information matrix in state space models with application to parameter estimation. Biometrika 98(1):65–80
Rasmussen CE, Williams CKI (2006) Gaussian processes for machine learning. MIT Press, Cambridge
Robert CP, Casella G (2004) Monte Carlo statistical methods, 2nd edn. Springer, New York
Schön TB, Lindsten F (2014) Learning of dynamical systems – particle filters and Markov chain methods. Forthcoming book, see user.it.uu.se/~thosc112/lds
Schön TB, Wills A, Ninness B (2011) System identification of nonlinear state-space models. Automatica 47(1):39–49

Nonlinear System Identification: An Overview of Common Approaches

Qinghua Zhang
Inria, Campus de Beaulieu, Rennes Cedex, France

Abstract

Nonlinear mathematical models are essential tools in various engineering and scientific domains, where more and more data are recorded by electronic devices. How to build nonlinear mathematical models essentially based on experimental data is the topic of this entry. Due to the large extent of the topic, this entry provides only a rough overview of some well-known results, from gray-box to black-box system identification.

Keywords

Black-box models; Block-oriented models; Gray-box models; Nonlinear system identification

Introduction

The wide success of linear system identification in various applications (Ljung 1999; ▶ System Identification: An Overview) does not necessarily mean that the underlying dynamic systems are
intrinsically linear. Quite often, linear system identification can be successfully applied to a nonlinear system if its working range is restricted to a neighborhood of some working point. Nevertheless, some advanced engineering systems may exhibit significant nonlinear behaviors under their normal working conditions, as do most biological or social systems. There is therefore an increasing demand for nonlinear dynamic system modeling theory. Nonlinear system identification is studied to partly answer this demand, when experimental data carry the essential information for modeling purposes.

Nonlinear system identification, compared to its linear counterpart, is a much vaster topic, as in principle a nonlinear model can be any description of a system which is not linear. For this reason, this entry provides only a rough overview of some well-known results.

An overview of the basic concepts of system identification can be found in ▶ System Identification: An Overview, notably the five basic elements to be taken into account in each application. Among them, the (nonlinear) model structures will be the main focus of this entry, as they represent the essential particularities of nonlinear system identification problems.

The various model structures used in nonlinear system identification are often classified by the level of available prior knowledge about the considered system: from white-box models to black-box models, via gray-box models. In principle, a white-box model is fully built from prior knowledge. Such a fully white-box approach is rarely feasible for complex systems, because of insufficient prior knowledge or of intractable system complexity. Therefore, the system identification methods summarized in this entry concern gray-box and black-box models, for which experimental data play an essential role.

For ease of presentation, the main content of this entry will be restricted to the single-input single-output (SISO) case. The multiple-input multiple-output (MIMO) case will be discussed in the section “Multiple-Input Multiple-Output Systems” below.

Gray-Box Models

This section covers gray-box models, from the most to the least demanding ones in terms of prior knowledge.

Parametrized Physical Models
The dynamic behaviors of some engineering systems are governed by well-known physical laws, typically in the form of differential equations, possibly with unknown parameters. These parametrized physical equations can be used as gray-box models for system identification. In most situations, such a model can be written in the form of a vectorial first-order ordinary differential equation (ODE), known as a state equation, which can be generally written as

dx(t)/dt = f(x(t), u(t); θ)   (1)

where t represents the time, x(t) is the state vector, u(t) the input, and f(·) a (nonlinear) function parametrized by the vector θ.

The observation of the system (typically made with electronic sensors), referred to as the output and denoted by y(t), is related to x(t) and u(t) through another known parametrized equation

y(t) = h(x(t), u(t); θ) + v(t)   (2)

where v(t) represents the measurement error. With digital electronic instruments, the input u(t) and the output y(t) are sampled at some discrete-time instants, say t = δ, 2δ, 3δ, …, Nδ, with some constant sampling period δ > 0. For the sake of notation simplicity, let the sampling period δ = 1 and assume ideal instantaneous samplers; then the sampled input-output data set is denoted by

Z^N = {u(1), y(1), u(2), y(2), …, u(N), y(N)}.   (3)

In some applications, data samples are made at irregular time instants. Some studies are particularly focused on system identification in this case (Garnier and Wang 2008).
The main remaining task of gray-box system identification is to estimate the parameter vector θ from the data set Z^N. The identification criterion is typically defined with the aid of an output predictor derived from the system model. A natural output predictor is simply based on the numerical solution of the state equation (1): for some given value of θ, an initial state x(0), and some assumed inter-sample behavior of the input u(t) (e.g., with a zero-order hold), the trajectory of x(t), denoted by x̂(t|θ), is computed with a numerical ODE solver; then the output prediction is computed as

ŷ(t|θ) = h(x̂(t|θ), u(t); θ).   (4)

The parameter vector θ is typically estimated by minimizing the sum of squared prediction errors ε(t|θ) = y(t) − ŷ(t|θ). See ▶ System Identification: An Overview and Bohlin (2006) for more details.

The predictor based on the numerical solution of the state equation (1) (known as a simulator) may be in trouble if this equation with the given value of θ is unstable. Moreover, the state equation (1) may also be subject to some modeling error that should be taken into account in the output predictor. In such cases, the output predictor can be built with the aid of some nonlinear state observer (Gauthier and Kupka 2001) or some nonlinear filtering algorithm (Doucet and Johansen 2011).

Alternatively, sequential Monte Carlo (SMC) methods can also be applied to the identification of (small size) nonlinear state-space systems, typically assuming a discrete-time counterpart of the model described by Eqs. (1) and (2). See ▶ Nonlinear System Identification Using Particle Filters.

The gray-box approach is particularly useful in an engineering field where some software library of commonly used components is available. In this case, a system model can be built by connecting available component models. Nevertheless, the “connection” of the component models may introduce algebraic constraints through variables shared by connected components, leading to differential algebraic equations (DAE), which are a wider class of dynamic system models than the abovementioned state-space models (▶ Modeling of Dynamic Systems from First Principles). For most dynamic systems, it is possible to avoid the DAE formulation by causality analysis, so that the connections between different system components are treated as information flows, instead of algebraic constraints. There exist also some theoretic studies on DAE system identifiability (Ljung and Glad 1994) and some recent developments on the identification of such systems (Gerdin et al. 2007).

Combined Physical and Black-Box Models
It may happen that, in a complex system, part of the components is well described by physical laws (possibly with available models from a software library), but some other components are not well studied. In this case, the latter components can be dealt with by black-box models (or possibly empirical models). The entire model can be fitted to a collected data set Z^N, like in the case of the previous subsection.

Block-Oriented Models
Complex systems, notably those studied in engineering, are often made of a certain number of components; thus a system model can be built by connecting component models. In this sense, such component-based models could be said to be “block-oriented.” In the system identification literature, however, the term block-oriented model is often used in a particular context (Giri and Bai 2010), where it is typically assumed that each component is either a linear dynamic subsystem or a nonlinear static one. Here, the term “static” means that the behavior of the component is memoryless and can be described by an algebraic equation. The study of system identification with such models is motivated by the fact that, when a controlled system is stabilized around a working point, its dynamic behavior can be well described by a linear model, but its actuators and sensors may exhibit significant nonlinear behaviors like saturation or dead zone. The choice of a particular block-oriented model structure depends on the prior knowledge about the underlying system, with specific identification methods available for different model structures.

Nonlinear System Identification: An Overview of Common Approaches, Fig. 1 Hammerstein system: u(t) → [Nonlinear Static] → x(t) → [Linear Dynamic] → y(t)

Nonlinear System Identification: An Overview of Common Approaches, Fig. 2 Wiener system: u(t) → [Linear Dynamic] → z(t) → [Nonlinear Static] → y(t)

The most frequently studied block-oriented models for system identification concern the Hammerstein system and the Wiener system, each composed of two blocks, as illustrated, respectively, in Figs. 1 and 2.

Hammerstein System Identification
A SISO Hammerstein system is typically formulated as

x(t) = f(u(t))   (5a)
y(t) + a_1 y(t−1) + ⋯ + a_{n_a} y(t−n_a) = b_1 x(t−1) + ⋯ + b_{n_b} x(t−n_b) + v(t).   (5b)

If the nonlinearity f(·) is expressed in the form

f(u) = Σ_{l=1}^m θ_l φ_l(u)   (6)

with some chosen basis functions φ_l(·), then the identification problem amounts to fitting the model parameters θ_l, a_i, b_j to a collected data set Z^N. A well-known method is based on over-parametrization (Bai 1998): replace in (5b) each x(t−j) with the right-hand side of (6) and treat each parameter product b_j θ_l as an individual parameter; then the newly parametrized model is equivalent to a linear ARX model (▶ System Identification: An Overview), which can be estimated by a well-established linear system identification method. As the n_b + m parameters b_j and θ_l are replaced by n_b m parameters in the new parametrization, the term “over-parametrization” refers to the fact that typically n_b + m < n_b m. The estimated over-parametrized model can be reduced to the original parametrization, usually through the singular value decomposition (SVD) of the matrix filled with the estimated parameter products b_j θ_l. See Giri and Bai (2010) for other identification methods with variant formulations of the Hammerstein system model.

When the linear subsystem is approximated by a finite impulse response (FIR) model, it is possible to first estimate the linear model before estimating a model for the nonlinear block (Greblicki and Pawlak 1989).

Wiener System Identification
A SISO Wiener system is typically formulated as

z(t) = Σ_{k=1}^∞ h_k u(t−k)   (7a)
y(t) = g(z(t)) + v(t)   (7b)

where the sequence h_1, h_2, … is the impulse response of the linear subsystem, g(·) is some nonlinear function, and v(t) is a noise independent of the input u(t).

Some methods for Wiener system identification assume a finite impulse response (FIR) of the linear subsystem. In this case, the linear subsystem model is characterized by the vector collecting the FIR coefficients, h^T = [h_1, h_2, …, h_n]. There are two typical kinds of efficient solutions, assuming either a Gaussian distribution of the input u(t) (Greblicki 1992) or the monotonicity of the nonlinear function g(·) (Bai and Reyland Jr 2008). In both cases, it is possible to directly estimate the FIR coefficients h from the input-output data Z^N, without explicitly estimating the unknown nonlinear function g(·). The estimated h can be used to compute the internal variable z(t). It then becomes relatively easy to estimate the nonlinear function g(·) from the computed z(t) and the measured y(t).
894 Nonlinear System Identification: An Overview of Common Approaches

Other Block-Oriented Model Structures each considered working point, the nonlinear
Among block-oriented models composed of system is linearized so that the corresponding
more blocks, the most well-known ones concern the Hammerstein-Wiener system and the Wiener-Hammerstein system. They are both composed of three blocks connected in series: the former has a linear dynamic block preceded and followed by two nonlinear static blocks, and the latter has a nonlinear static block in the middle of two linear dynamic blocks. In general, the prediction error method (PEM) (Ljung 1999) is applied to the identification of such systems, with heuristic methods for the initialization of model parameters. Some recent results on Hammerstein-Wiener system identification have been reported in Wills et al. (2013). There exist also some other variants, with parallel blocks or feedback loops. In most cases, each block is either linear dynamic or nonlinear static, but there is a notable exception: hysteresis blocks. Hysteresis is a phenomenon typically observed in some magnetic or mechanical systems. Its mathematical description is both dynamic and strongly nonlinear and cannot be decomposed into linear dynamic and nonlinear static blocks. Due to the importance of hysteresis components in some control systems, system identification involving such blocks is currently an active research topic (Giri et al. 2008).

LPV Models
Linear parameter-varying (LPV) models could be classified as black-box models, because typically they rely more on experimental data than on prior knowledge. However, engineers often have good insights into such models; they are thus presented in the gray-box section.

From Gain Scheduling to LPV Models
Gain scheduling is a method originally developed for the control of nonlinear systems. It consists in designing different controllers for different working points of a nonlinear system and in switching among the designed controllers according to the actual working point. It is typically assumed that the working point is determined by some observed variable (vector) referred to as the scheduling variable and denoted by ρ. Around each working point, the nonlinear system is approximated by a linearized model, for which a controller can be designed with linear control theory. A by-product of this controller design procedure is a collection of linearized models indexed by the scheduling variable ρ. This collection, seen as a whole model of the globally nonlinear system, is known as an LPV model (Toth 2010). This approach has been particularly successful in the field of flight control.

An LPV model can be formulated either in input-output form or in state-space form. In the input-output form, a SISO model can be written as

  y(t) + a_1(ρ) y(t-1) + ⋯ + a_{n_a}(ρ) y(t-n_a) = b_1(ρ) u(t-1) + ⋯ + b_{n_b}(ρ) u(t-n_b) + v(t)   (8)

and in the state-space form as

  x(t+1) = A(ρ) x(t) + B(ρ) u(t) + w(t)   (9a)
  y(t) = C(ρ) x(t) + D(ρ) u(t) + v(t)   (9b)

As a global model of the whole nonlinear system, the ρ-dependent parameters (matrices) a_i(ρ), b_j(ρ), A(ρ), etc., are functions defined for all ρ ∈ P, where P is the relevant working range of the considered system (a compact subset of a real vector space). If originally the LPV model was built through a collection of linearized models around different working points, then the values of these functions are first defined for the corresponding discrete values of ρ. For other values of ρ ∈ P, these functions can be defined by interpolation. Alternatively, by choosing some parametric forms of a_i(ρ), b_j(ρ), A(ρ), etc., the whole LPV model can also be estimated by fitting it to a data set Z^N, through nonlinear optimization (Toth 2010).

Local Linear Models
In an LPV model, the model parameters can in principle depend on the scheduling variable ρ in any chosen manner.
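The recursion behind the LPV input-output form (8) is easy to sketch in code. The following Python fragment is a minimal illustration, not taken from the entry: the model orders n_a = n_b = 1, the coefficient maps a_1(ρ) and b_1(ρ), and all numeric values are invented for the example.

```python
# Sketch of the LPV input-output model (8) with n_a = n_b = 1:
#   y(t) + a1(rho) y(t-1) = b1(rho) u(t-1) + v(t),
# where the coefficients depend on a scalar scheduling variable rho.
# The coefficient maps below are illustrative, not from the article.

def a1(rho):
    return -0.5 - 0.3 * rho   # pole location varies with the working point

def b1(rho):
    return 1.0 + 0.5 * rho    # gain varies with the working point

def simulate_lpv(u, rho, y0=0.0):
    """Simulate the noise-free LPV-ARX recursion."""
    y = [y0]
    for t in range(1, len(u)):
        y.append(-a1(rho[t]) * y[t - 1] + b1(rho[t]) * u[t - 1])
    return y

# With a frozen scheduling variable the model reduces to an LTI system:
# y(t) = 0.5 y(t-1) + u(t-1), whose step-response steady state is 2.
u = [1.0] * 50
rho_frozen = [0.0] * 50
y = simulate_lpv(u, rho_frozen)
print(round(y[-1], 3))   # -> 2.0
```

For a genuinely varying ρ(t), the same loop produces a time-varying linear response, which is precisely what makes LPV models a convenient global description of the scheduled family of linearized models.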
Nonlinear System Identification: An Overview of Common Approaches 895
A particularly useful case is when they are formulated as expansions over local basis functions. For example, in (8), the parameter a_i(ρ) may be expressed as

  a_i(ρ) = Σ_{l=1}^m a_{i,l} λ_l(ρ)   (10)

where λ_l(ρ) are some chosen bell-shaped (local) basis functions, typically Gaussian functions, centered at different positions ρ = c_l ∈ P, and a_{i,l} are coefficients of the expansion. Similarly,

  b_j(ρ) = Σ_{l=1}^m b_{j,l} λ_l(ρ).   (11)

Assume that the basis functions are normalized such that

  Σ_{l=1}^m λ_l(ρ) = 1   (12)

for all ρ ∈ P. Then the LPV model (8) can be viewed as an interpolation of m "local" models

  y(t) + a_{1,l} y(t-1) + ⋯ + a_{n_a,l} y(t-n_a) = b_{1,l} u(t-1) + ⋯ + b_{n_b,l} u(t-n_b) + v(t)   (13)

indexed by l = 1, 2, …, m. Each of these linear models is valid for ρ close to c_l, the center of the corresponding basis function λ_l(ρ); hence, they are called local linear models.

If the local basis functions λ_l(ρ) are viewed as membership functions of fuzzy sets, then the local linear model is strongly related to the Takagi-Sugeno fuzzy model (Takagi and Sugeno 1985). An advantage of this point of view is the possibility of incorporating prior knowledge in the form of linguistic rules and of interpreting some local linear models resulting from system identification.

There are two approaches to building local linear models. The first one is the local approach: for each chosen value of c_l ∈ P, a local model is estimated from data corresponding to the values of ρ within a neighborhood of c_l. This approach has the advantages of being computationally efficient, easily updatable, and well understood by engineers. The second one is the global approach: all the model parameters are estimated simultaneously by solving a single optimization problem for the whole model. This approach can produce more accurate models in terms of prediction error, but it is numerically much more expensive and may lead to models that are difficult for engineers to interpret.

The practical success of local linear models strongly depends on the possibility of finding a scheduling variable ρ of small dimension that relevantly determines the working point of the considered system. If there exists a nonlinear state-space model of the system, then in principle the working point is determined jointly by the state and the input of the system. As quite often physically meaningful state variables are not fully observed, they cannot be used in the definition of ρ. It is possible to define ρ as delayed output and input variables, e.g.,

  ρ^T = [y(t), …, y(t-n_a), u(t-1), …, u(t-n_b)]

but this typically leads to a vector of quite large dimension. It is thus important to use practical insights about a given system to find a relevant vector ρ of reduced dimension.

For a single-dimensional ρ, the choice of the local basis function centers c_l can be made following some practical insight, or the centers can be equally spaced within P. For a large-dimensional ρ, this task is more difficult. The equally spaced approach would lead to too many local models, as their number would increase exponentially with the dimension of ρ. In this case, an empirical approach, called local linear model tree (LOLIMOT) (Nelles 2001), can be applied. It iteratively partitions P in order to place the local basis functions where the system is more likely nonlinear or where the available data are more concentrated.

Black-Box Models

Ideally speaking, a black-box model should be solely built from experimental data, without any prior knowledge. In practice, some prior knowledge is always necessary, though experimental data play a much more important role.
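The interpolation of local models via normalized basis functions, Eqs. (10)–(12) described earlier, can be sketched as follows. The centers c_l, the width, and the local coefficients a_{i,l} below are all invented for illustration.

```python
import math

# Illustrative sketch of (10)-(12): a scheduling-dependent coefficient
# a_i(rho) built as a normalized expansion over m Gaussian basis functions
# centered at c_l. All numbers are made up for the example.

centers = [0.0, 0.5, 1.0]        # c_l, equally spaced in the range P = [0, 1]
a_local = [-0.9, -0.5, -0.1]     # a_{i,l}: one coefficient per local model
width = 0.25

def basis(rho):
    """Gaussian bells, normalized so they sum to one as required by (12)."""
    raw = [math.exp(-0.5 * ((rho - c) / width) ** 2) for c in centers]
    s = sum(raw)
    return [r / s for r in raw]

def a_i(rho):
    """Interpolated coefficient (10): a_i(rho) = sum_l a_{i,l} lambda_l(rho)."""
    lam = basis(rho)
    return sum(al * l for al, l in zip(a_local, lam))

# Near a center c_l the interpolation is dominated by that local model,
# so a_i smoothly transitions between the local values as rho sweeps P.
print(a_i(0.0), a_i(0.5), a_i(1.0))
```

Between the centers, the coefficient blends the neighboring local models, which is exactly the mechanism by which (13) is interpolated into a global LPV model.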
For instance, the choice of the input and output variables, implying some causality relationship, is an important piece of prior knowledge.

With the fast development of electronic devices, more and more sensor signals are available in various fields, notably for engineering, environmental, and biomedical systems. Meanwhile, the processing power of modern computers increases every year. Black-box modeling thus has more and more potential applications. Nevertheless, the importance of prior knowledge in a modeling procedure should not be forgotten. In general, prior knowledge leads to more reliable models in terms of validity range, as the validity of physical equations is often well understood. In contrast, for a black-box model essentially based on experimental data, it may be hard to ensure its validity for interpolation and even harder for extrapolation.

Input-Output Black-Box Models

As the primary role of a mathematical model is to predict the output of the system for given input values, it is natural to design black-box models directly in the form of a predictor. As the output y(t) of a dynamic system depends on the past inputs, a predicted output ŷ(t) may be formulated in the form

  ŷ(t) = f(u(t-1), u(t-2), …, u(t-n_b))   (14)

where f(·) is some nonlinear function (to be estimated from experimental data) and n_b is a chosen integer. In principle, n_b can be infinitely large (as y(t) depends on all the past inputs in general), but in practice, a model of finite complexity has to be chosen. If the considered system is stable in the sense that sufficiently old past inputs are (gradually) forgotten, then it is reasonable to truncate the dependence on the past inputs.

The model structure (14) is similar to the linear finite impulse response (FIR) model (Ljung 1999). It is known that, for linear system identification, the use of ARX models, predicting y(t) from both past inputs and past outputs, is often more efficient than FIR models, in the sense of requiring fewer model parameters. By analogy, the nonlinear ARX model takes the form

  ŷ(t) = f(y(t-1), …, y(t-n_a), u(t-1), …, u(t-n_b)).   (15)

This is likely the most frequently used black-box model structure for nonlinear dynamic system identification (Sjöberg et al. 1995; Juditsky et al. 1995).

Nonlinear Function Estimators
For a nonlinear ARX model in the form of (15), the nonlinear function f(·) has to be estimated from an available input-output data set Z^N. Typically, an estimator of f(·) with some chosen parametric structure is used. Let

  φ^T(t) = [y(t-1), …, y(t-n_a), u(t-1), …, u(t-n_b)];   (16)

then system identification in this case amounts to solving a nonlinear regression problem

  y(t) = g(φ(t); θ) + v(t)   (17)

where g(·) is a chosen nonlinear function parametrized by θ, capable of approximating a large class of nonlinear functions by appropriately adjusting θ, and v(t) is the modeling error to be minimized in some sense.

The most well-known nonlinear function estimators implementing g(·) in practice are polynomials, splines, multiple-layer neural networks, radial basis networks, wavelets, and fuzzy-neural estimators. Most of these estimators can be written in the form

  g(φ(t); θ) = Σ_{l=1}^m θ_l κ(α_l (φ - β_l))   (18)

or in some close variant of this form, where κ(·) is some "mother" basis function dilated and translated by α_l and β_l before being weighted by θ_l in the sum forming the estimator (Sjöberg et al. 1995). For example, κ(·) is typically chosen as a (Gaussian) bell-shaped function in radial basis networks or as a sigmoid (S-shaped) function in multiple-layer neural networks.
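A useful special case of (17)–(18) is linear-in-parameters estimation: if the dilations α_l and translations β_l are fixed in advance (here α_l = 1 on a grid of centers), only the weights θ_l remain, and they can be found by ordinary least squares. The sketch below fits a nonlinear ARX model (15) with n_a = n_b = 1 to data from an invented benchmark system; the system, the grid, and all values are assumptions for illustration.

```python
import math
import random

random.seed(0)

def true_system(u):
    """Invented data-generating system, not from the entry."""
    y = [0.0]
    for t in range(1, len(u)):
        y.append(0.8 * math.tanh(y[t - 1]) + 0.5 * u[t - 1])
    return y

# Data collection with an input covering several amplitudes, cf. the
# practical advice on exciting nonlinear systems at different amplitudes.
u = [random.uniform(-2.0, 2.0) for _ in range(400)]
y = true_system(u)
phi = [(y[t - 1], u[t - 1]) for t in range(1, len(u))]   # regressors, cf. (16)
target = y[1:]

grid = [-1.5, 0.0, 1.5]
centers = [(cy, cu) for cy in grid for cu in grid]       # beta_l on a 3x3 grid

def features(p):
    """Gaussian radial basis functions (kappa) plus a constant term."""
    return [math.exp(-((p[0] - cy) ** 2 + (p[1] - cu) ** 2))
            for cy, cu in centers] + [1.0]

def lstsq(X, z):
    """Solve the normal equations (X^T X) theta = X^T z by elimination."""
    n = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(n)] for i in range(n)]
    b = [sum(row[i] * zi for row, zi in zip(X, z)) for i in range(n)]
    for i in range(n):                 # forward elimination (SPD: no pivoting)
        for j in range(i + 1, n):
            fac = A[j][i] / A[i][i]
            A[j] = [ajk - fac * aik for ajk, aik in zip(A[j], A[i])]
            b[j] -= fac * b[i]
    theta = [0.0] * n
    for i in reversed(range(n)):       # back substitution
        s = sum(A[i][j] * theta[j] for j in range(i + 1, n))
        theta[i] = (b[i] - s) / A[i][i]
    return theta

X = [features(p) for p in phi]
theta = lstsq(X, target)               # the weights theta_l of (18)
pred = [sum(w * f for w, f in zip(theta, features(p))) for p in phi]
rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(pred, target)) / len(target))
```

With the basis fixed, the estimation reduces to a linear regression; fitting α_l and β_l as well would turn the problem into the nonlinear optimization mentioned in the text.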
Another approach to nonlinear function estimation is called nonparametric estimation. Its main idea is to estimate f(φ*) for any given value of φ* by the (weighted) average of the values of y(t) in the available data set corresponding to values of φ(t) close to φ*. This category includes kernel estimators (Nadaraya 1964) and memory-based estimators (Specht 1991).

The nonlinear function estimation problem as formulated in (17) can also be addressed with the Gaussian process model. Assume that g in (17) is a Gaussian process whose covariance matrix for any regressor pair φ(t) and φ(τ) is a known function of the regressor pair; then the posterior distribution of g given observations of y(t) can be computed by applying Bayes' theorem under certain assumptions (Rasmussen and Williams 2006). This method is strongly related to least squares support vector machines (Suykens et al. 2002) and to some extent is similar to kernel estimators.

The difficulty of estimating a nonlinear function f(φ) strongly depends on the dimension of φ. In the single-dimensional case, most existing methods can produce satisfactory results. When the dimension of φ, say n, increases, in order to keep the data "density" unchanged, the number of data points must increase exponentially with n. This fact implies that, in the high-dimensional case (say n > 10), for most practically available data sets, the data points are sparse in the space of φ. It is thus practically impossible to estimate f(φ) with good accuracy everywhere in the space of φ. In order to remedy this problem, prior knowledge can be used to form a more elaborate vector φ of reduced dimension, instead of the simple form of past input and output variables. The resulting model will be more of gray-box nature. If this approach is not possible, one has to expect that the estimation algorithm automatically discovers some low-dimensional nature of the nonlinear relationship being estimated. The success would depend on the suitability of the chosen particular nonlinear function estimator for the considered system.

State-Space Black-Box Models
For a gray-box model in the form of (1) and (2), it is assumed that the parametric forms of the nonlinear functions f(·) and h(·) are known from prior knowledge. If no such knowledge is available, it is possible to estimate these nonlinear functions with some function estimator, like those introduced in the previous subsection. Such an approach leads to state-space black-box models. In practice, it is easier to use the discrete-time counterpart of the state equation (1). Because typically the state vector x(t) is not directly observed, the estimation of f(·) and h(·) cannot be formulated as nonlinear regression problems, in contrast to the case of input-output black-box models. Another difficulty is related to the nonuniqueness of the state-space representation of a given system: any (linear or nonlinear) state transformation would lead to a different state-space representation of the same system. In some existing methods, a linear state-space model is first estimated; then nonlinear function estimators are used to compensate the residuals of f(·) and h(·) after their linear approximations (Paduart et al. 2010).

Multiple-Input Multiple-Output Systems

For multiple-input multiple-output (MIMO) systems, state-space models like (1)-(2) remain in the same form, by considering vector values of the notations u(t) and y(t) at each time instant, up to some similar adaptation of the other involved notations. For input-output models like (15), the involved notations can also be vector valued, but the fact that different inputs and/or outputs can have different delays makes the notations more complicated. For block-oriented models, though a MIMO linear block is usually described by a general linear model in state-space form or in input-output form, there is no consensus on the structural choice of MIMO nonlinear blocks.
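The nonparametric (kernel) estimation idea described earlier — estimating f(φ*) by a weighted average of outputs whose regressors lie close to φ* — can be sketched in Nadaraya-Watson form. The data, kernel choice, and bandwidth below are assumptions for the example, not from the entry.

```python
import math
import random

# Sketch of a kernel (Nadaraya-Watson) estimator: f(phi*) is estimated as a
# kernel-weighted average of observed outputs. Synthetic 1-D data.

random.seed(1)
xs = [random.uniform(-3.0, 3.0) for _ in range(1000)]
ys = [math.sin(x) + random.gauss(0.0, 0.1) for x in xs]

def nadaraya_watson(x_star, xs, ys, h=0.3):
    """Weighted average with a Gaussian kernel of bandwidth h."""
    w = [math.exp(-0.5 * ((x - x_star) / h) ** 2) for x in xs]
    return sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)

est = nadaraya_watson(1.0, xs, ys)
# With 1000 samples the estimate should be close to sin(1) ~ 0.841.
print(abs(est - math.sin(1.0)) < 0.1)
```

The bandwidth h plays the role of the neighborhood size: it trades off bias (large h oversmooths) against variance (small h averages too few points), and in higher-dimensional regressor spaces the data sparsity discussed above makes this trade-off much harder.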
Some Practical Issues

The general practical aspects discussed in ▶System Identification: An Overview are of course also valid for nonlinear system identification, but some particularities of the nonlinear case should be highlighted.

It is important to apply appropriate input signals so that the collected data convey sufficient information for system identification. The design of input signals for this purpose is known as experiment design. In the framework of linear system identification, experiment design is usually formulated through the optimization of the covariance matrix of model parameter estimates (▶Experiment Design and Identification for Control), which often leads to non-convex optimization problems. Experiment design in the nonlinear case has not been systematically studied. If possible, the chosen input signal should be similar to what will actually be applied to the considered system and cover various working conditions. Another simple rule is that the input should excite a nonlinear system at different amplitudes, whereas binary input signals are often used for linear systems.

Model validation is a particularly delicate task for nonlinear black-box models. As already mentioned when such models were introduced, the available data points are usually sparse when a nonlinear function is estimated in a high-dimensional space; it is thus practically impossible to uniformly ensure the estimation accuracy of the nonlinear function. It is important to extensively perform cross-validation, by testing the validity of the model on large data sets that have not been used for model estimation.

Regularization is also an important issue for nonlinear black-box models. Because of the lack of prior knowledge, each nonlinear black-box model has a flexible structure in order to cover a large class of nonlinear systems, typically with many model parameters, implying large variances of parameter estimates (▶System Identification: An Overview). Appropriately applying a regularized criterion for model parameter estimation can reduce the variances. For gray-box models, prior knowledge can be used for regularization through a Bayesian approach, but this approach is not applicable to black-box models.

Summary and Future Directions

Compared to linear system identification, the nonlinear case is a much vaster topic, of which this entry provides only a rough overview. The main lines that should be retained are that both prior knowledge and experimental data are required for system identification and that the more prior knowledge is incorporated in a model, the better the extent of its validity is understood. The lack of prior knowledge should be compensated by the processing of large amounts of data. The data that can be processed within an acceptable time depend on the power of computers, which progresses every year. Meanwhile, the research and development of efficient algorithms for processing large amounts of data with multiple or massively parallel processors are an exciting topic in system identification.

Cross-References

▶Experiment Design and Identification for Control
▶Modeling of Dynamic Systems from First Principles
▶Nonlinear System Identification Using Particle Filters
▶System Identification: An Overview

Recommended Reading

Nonlinear system identification is covered by a vast literature. After the readings about general topics on system identification (see ▶System Identification: An Overview and references therein), the reader may further read Nelles (2001) for black-box system identification, Bohlin (2006) for gray-box system identification, Giri and Bai (2010) for block-oriented system identification, and Toth (2010) for LPV system identification.
Nonlinear Zero Dynamics 899
Bibliography

Bai EW (1998) An optimal two-stage identification algorithm for Hammerstein-Wiener nonlinear systems. Automatica 34(3):333–338
Bai E-W, Reyland Jr J (2008) Towards identification of Wiener systems with the least amount of a priori information on the nonlinearity. Automatica 44(4):910–919
Bohlin T (2006) Practical grey-box process identification – theory and applications. Springer, London
Doucet A, Johansen AM (2011) A tutorial on particle filtering and smoothing: fifteen years later. In: Crisan D, Rozovsky B (eds) Nonlinear filtering handbook. Oxford University Press, Oxford
Garnier H, Wang L (eds) (2008) Identification of continuous-time models from sampled data. Springer, London
Gauthier J-P, Kupka I (2001) Deterministic observation theory and applications. Cambridge University Press, Cambridge/New York
Gerdin M, Schön T, Glad T, Gustafsson F, Ljung L (2007) On parameter and state estimation for linear differential-algebraic equations. Automatica 43:416–425
Giri F, Bai E-W (eds) (2010) Block-oriented nonlinear system identification. Springer, Berlin/Heidelberg
Giri F, Rochdi Y, Chaoui FZ, Brouri A (2008) Identification of Hammerstein systems in presence of hysteresis-backlash and hysteresis-relay nonlinearities. Automatica 44(3):767–775
Greblicki W (1992) Nonparametric identification of Wiener systems. IEEE Trans Inf Theory 38(5):1487–1493
Greblicki W, Pawlak M (1989) Nonparametric identification of Hammerstein systems. IEEE Trans Inf Theory 35(2):409–418
Juditsky A, Hjalmarsson H, Benveniste A, Delyon B, Ljung L, Sjöberg J, Zhang Q (1995) Nonlinear black-box models in system identification: mathematical foundations. Automatica 31(11):1725–1750
Ljung L (1999) System identification – theory for the user, 2nd edn. Prentice-Hall, Upper Saddle River
Ljung L, Glad T (1994) On global identifiability for arbitrary model parametrizations. Automatica 30(2):265–276
Nadaraya EA (1964) On estimating regression. Theory Probab Appl 9:141–142
Nelles O (2001) Nonlinear system identification. Springer, Berlin/New York
Paduart J, Lauwers L, Swevers J, Smolders K, Schoukens J, Pintelon R (2010) Identification of nonlinear systems using polynomial nonlinear state space models. Automatica 46(4):647–656
Rasmussen CE, Williams CKI (2006) Gaussian processes for machine learning. MIT, Cambridge
Sjöberg J, Zhang Q, Ljung L, Benveniste A, Delyon B, Glorennec P-Y, Hjalmarsson H, Juditsky A (1995) Non-linear black-box modeling in system identification: a unified overview. Automatica 31(11):1691–1724
Specht DF (1991) A general regression neural network. IEEE Trans Neural Netw 2(5):568–576
Suykens JAK, Van Gestel T, De Brabanter J, De Moor B, Vandewalle J (2002) Least squares support vector machines. World Scientific, Singapore
Takagi T, Sugeno M (1985) Fuzzy identification of systems and its applications to modeling and control. IEEE Trans Syst Man Cybern 15(1):116–132
Toth R (2010) Modeling and identification of linear parameter-varying systems. Springer, Berlin
Wills A, Schön T, Ljung L, Ninness B (2013) Identification of Hammerstein-Wiener models. Automatica 49(1):70–81

Nonlinear Zero Dynamics

Alberto Isidori
Department of Computer and System Sciences "A. Ruberti", University of Rome "La Sapienza", Rome, Italy

Abstract

The notion of zero dynamics plays a role in nonlinear systems that is analogous to the role played, in a linear system, by the notion of zeros of the transfer function. In this article, we review the basic concepts underlying the definition of zero dynamics and discuss its relevance in the context of nonlinear feedback design.

Keywords

High-gain feedback; Inverse systems; Minimum-phase nonlinear systems; Normal forms; Output regulation; Stabilization

Introduction

The concept of zero dynamics of a nonlinear system was introduced in the early 1980s as the nonlinear analogue of the concept of transmission zero of a linear system. This concept played a fundamental role in the development of systematic methods for asymptotic stabilization of relevant classes of nonlinear systems.
tions unified overview. Automatica 31(11):1691–1724 tematic methods for asymptotic stabilization of
900 Nonlinear Zero Dynamics

relevant classes of nonlinear systems. As a matter To see how this is possible, consider for sim-
of fact, a nonlinear system in which the zero dy- plicity the case of a system modeled by equations
namics possess a globally asymptotically stable of the form
equilibrium can be robustly stabilized, globally
or at least with guaranteed region of attraction, xP D f .x/ C g.x/u
by means of output feedback. This is a nonlinear y D h.x/
analogue of a well-know property of linear sys-
tems, namely, the property that an n-dimensional with state x 2 Rn , input u 2 R, output y 2
linear systems having n  1 zeros with neg- R and in which f .x/; g.x/; h.x/ are smooth
ative real part can be stabilized by means of functions. Systems of this forms are called input-
proportional output feedback, if the feedback affine systems. The analysis of such systems is
gain is sufficiently large. The concept of zero rendered particularly simple if appropriate no-
dynamics also plays a relevant role in variety of tations are used. Given any real-valued smooth
other problems of feedback design, such as input- function .x/ and any n-vector valued smooth
output linearization with internal stability, non- function X.x/, let LX .x/ denote the (direc-
interacting control with internal stability, output tional) derivative of .x/ along X.x/, that is the
regulation, and feedback equivalence to passive real-valued smooth function
systems.
X n
@
LX .x/ D Xi .x/ ;
i D1
@xi
The Zero Dynamics
d 1
and, recursively, set LdX  D LX LX .x/ for
One of the cornerstones of the geometric the- any d  1.
ory of control systems (for linear as well as Suppose there exists an integer r  1 with the
for nonlinear systems) is the analysis of how following properties
the observability property can be influenced by
feedback. This study, originally conceived in the
context of the problem of disturbance decoupling,
Lg h.x/ D Lg Lf h.x/ D    D Lg Lfr2 h.x/
had far reaching consequences in a number of
other domains. One of these consequences is the D 0 8x 2 Rn
possibility of characterizing in “geometric terms”
Lg Lfr1 h.x/ ¤ 0 8x 2 Rn :
the notion of zero of the transfer function of
a system. In a (single-input single-output and
If this is the case, it is possible to show that the
minimal) linear system, a complex number z is
set
a zero of the transfer function if and only if the
input u.t/ D exp.zt/ yields – for a suitable
choice of the initial state – a forced response Z  D fx 2 Rn W h.x/ D Lh .x/ D   
in which the output is identically zero. This D Lfr1 h.x/ D 0g
“open-loop” and “time-domain” characterization
has a “closed-loop and “geometric” counterpart: is a smooth sub-manifold of Rn , of codimension
all such z’s coincide with the eigenvalues of the r. It is also easy to show that the state-feedback
unobservable part of the system, once the latter law
has been rendered maximally unobservable by Lrf h.x/
means of feedback. One of the earlier successes u .x/ D 
Lg Lr1
f h.x/
of the geometric approach to the analysis and
design of nonlinear systems was the possibility renders the vector
of extending these equivalent characterizations to
the domain of nonlinear systems. f  .x/ D f .x/ C g.x/u .x/
In other words, Z* is an invariant manifold of the feedback-modified system

  ẋ = f*(x).

It is seen from this construction that the output y(t) = h(x(t)) of the system is identically zero if and only if x(0) ∈ Z* and u(t) = u*(x(t)), where x(t) is the solution of ẋ = f*(x) passing through x(0) at time t = 0. As a consequence, the restriction of ẋ = f*(x) to its invariant manifold Z* characterizes all internal dynamics that occur in the system once the initial condition and input are chosen in such a way that the output is constrained to be identically zero. The dynamics in question are called the zero dynamics of the system. Note that this construction demonstrates, as anticipated, the equivalence between an "open-loop" and a "closed-loop" characterization of all the (internal) dynamics of a given system that are compatible with the constraint that the output is identically zero. This construction can be extended to multi-input multi-output systems, with the aid of an appropriate recursive algorithm, known as the zero dynamics algorithm (Isidori 1995).

Normal Forms

The coordinate-free construction presented above becomes even more transparent if special coordinates are chosen. To this end, set

  g*(x) = g(x) / (L_g L_f^{r-1} h(x))

and define, recursively,

  X_0(x) = g*(x),  X_k(x) = [f*(x), X_{k-1}(x)],

for 1 ≤ k ≤ r - 1, in which [Y(x), X(x)] denotes the Lie bracket of Y(x) and X(x). It is possible to show that if the vector fields X_0(x), …, X_{r-1}(x) are complete, there exists a smooth, globally defined nonlinear change of variables by means of which the system can be transformed into a system of the form

  ż = f_0(z, ξ)
  ξ̇ = A_r ξ + B_r [q_0(z, ξ) + b(z, ξ) u]
  y = C_r ξ

in which z ∈ R^{n-r}, ξ ∈ R^r, the matrices A_r, B_r, C_r have the form

  A_r = [0 1 0 ⋯ 0; 0 0 1 ⋯ 0; ⋯ ; 0 0 0 ⋯ 1; 0 0 0 ⋯ 0],
  B_r = [0; 0; ⋯ ; 0; 1],
  C_r = [1 0 0 ⋯ 0],

and b(z, ξ) ≠ 0 for all (z, ξ). These equations are said to be in normal form (Isidori 1995).

It is easy to check that, in these coordinates, the manifold Z* is the set of all pairs (z, ξ) having ξ = 0, the state feedback law u*(x) is the function

  u*(z, ξ) = - q_0(z, ξ) / b(z, ξ),

and the restriction of ẋ = f*(x) to the manifold Z* is nothing else than

  ż = f_0(z, 0).

The latter provides a simple characterization of the zero dynamics of the system, once the latter has been brought to its normal form.

It is worth observing that, in the case of a linear system, the functions f_0(z, ξ) and q_0(z, ξ) are linear functions, and b(z, ξ) is a constant. Consequently, the normal form can be written as

  ż = F z + G ξ
  ξ̇ = A_r ξ + B_r [H z + K ξ + b u]
  y = C_r ξ.

It is also easy to check that the transfer function of the system can be expressed as

  T(s) = b det(sI - F) / det(sI - A),

in which

  A = [F  G; B_r H  A_r + B_r K].
From this it is concluded that in a (controllable and observable) linear system, the zeros of the transfer function T(s) coincide with the eigenvalues of F. In other words, in a linear system the zero dynamics are linear dynamics whose eigenvalues coincide with the zeros of the transfer function of the system.

The Inverse System

Another property associated with the notion of zero of the transfer function, in a (single-input single-output) linear system, is the fact that the zeros characterize the dynamics of the inverse system (the latter being - loosely speaking - a system able to reproduce the input u(t) from the output y(t) that this input has generated). This property has an immediate analogue for nonlinear systems. Considering the system in normal form and setting

  y_{r-1}(t) = col(y(t), y^{(1)}(t), …, y^{(r-1)}(t)),

it is easily seen that the input u(t) can be determined as the output of a dynamical system, driven by y_{r-1}(t) and y^{(r)}(t), modeled by

  ż = f_0(z, y_{r-1})
  u = [y^{(r)} - q_0(z, y_{r-1})] / b(z, y_{r-1}).   (1)

Thus, it is concluded that the unforced internal dynamics of the inverse system coincide with the zero dynamics as defined above.

It should be stressed, though, that the coincidence is limited to the case of single-input single-output systems. For multi-input multi-output nonlinear systems, the link between zero dynamics and the dynamics of the inverse system is more subtle. This is essentially due to the fact that while the concept of zero dynamics only seeks to determine the dynamics compatible with the constraint that the output is identically zero, the inverse system must describe all dynamics resulting in any admissible output function. As a consequence, computation of the zero dynamics and computation of the inverse system (whenever this is possible) are not equivalent, and the latter is possible only under substantially stronger assumptions. The computation of the zero dynamics is based on an extension (Isidori 1995) of the classical algorithm of Wonham (1979) for the computation of the largest controlled invariant subspace in the kernel of the output map, while the computation of the inverse system is based on extensions, due to Hirschorn (1979) and Singh (1981), of the so-called structure algorithm introduced by Silverman (1969) for the computation of inverses and of the zero structure at infinity. For a comparison of such assumptions and of their influence on the outcome of the associated algorithms, see Isidori and Moog (1988).

Input-Output Linearization

An appealing feature of the normal form described above is the straightforward observation that a (state) feedback law of the form

  u = (1 / b(z, ξ)) [-q_0(z, ξ) + K_r ξ + v]

changes the system into a system

  ż = f_0(z, ξ)
  ξ̇ = (A_r + B_r K_r) ξ + B_r v
  y = C_r ξ

whose input-output behavior (between input v and output y) is fully linear (and stable if K_r is chosen so that the matrix A_r + B_r K_r is Hurwitz). In fact, the law in question renders the system partially unobservable, with all nonlinearities confined to its unobservable part (Isidori et al. 1981). This control law is clearly non-robust, as it relies upon exact cancelation of possibly uncertain terms, but it can be rendered robust by means of appropriate dynamic compensation (Freidovich and Khalil 2008).
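The exactness of the cancelation can be seen in simulation on an invented normal-form example (r = 2): with ż = -z + ξ_1, ξ̇_1 = ξ_2, ξ̇_2 = q_0(z, ξ) + u and q_0 = z^2 + ξ_1 ξ_2, the law u = -q_0 + K ξ + v with K = [-1, -2] makes the ξ-subsystem exactly linear, regardless of z. All values below are assumptions for the sketch.

```python
# Sketch of the input-output linearizing feedback: the nonlinear loop's
# (xi1, xi2) should track a purely linear reference xi' = (A + B K) xi + B v.

def simulate(v_seq, dt=1e-3):
    z, x1, x2 = 0.5, 0.0, 0.0
    lin = [0.0, 0.0]                       # reference linear state
    K = (-1.0, -2.0)                       # places both poles at s = -1
    for v in v_seq:
        q0 = z ** 2 + x1 * x2
        u = -q0 + K[0] * x1 + K[1] * x2 + v
        z, x1, x2 = (z + dt * (-z + x1),
                     x1 + dt * x2,
                     x2 + dt * (q0 + u))   # q0 cancels inside u
        lin = [lin[0] + dt * lin[1],
               lin[1] + dt * (K[0] * lin[0] + K[1] * lin[1] + v)]
    return (x1, x2), lin

xi, lin = simulate([1.0] * 3000)           # step input v = 1 for 3 seconds
# The closed loop's output coordinates coincide with the linear reference,
# even though the z-subsystem evolves nonlinearly in the background.
print(abs(xi[0] - lin[0]) < 1e-6, abs(xi[1] - lin[1]) < 1e-6)
```

If q_0 were only approximately known, the cancelation would be inexact and the residual nonlinearity would reappear in the input-output behavior — precisely the robustness issue addressed by the dynamic compensation cited above.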
The system obtained in this way has the structure of a cascade of two sub-systems, one of which, modeled as

  ż = f_0(z, ξ),

is seen as "driven" by the input ξ. This motivates the interest in classifying the asymptotic properties of such a subsystem, as discussed below.

Asymptotic Properties of the Zero Dynamics

Linear systems with no zeros in the right-half complex plane are traditionally called minimum-phase systems, in view of certain properties of the Bode gain and phase plots of their transfer functions. Thus, in view of the interpretation given above, linear systems whose zero dynamics are asymptotically stable are minimum-phase systems. This terminology has been (somewhat abusively, but with the clear intent of providing a concise and expressive characterization) borrowed to classify nonlinear systems whose zero dynamics have desirable (from the stability viewpoint) properties. Assuming that z = 0 is an equilibrium of ż = f_0(z, 0), the following cases are considered:
• A nonlinear system is locally minimum-phase (respectively, locally exponentially minimum-phase) if the equilibrium z = 0 of ż = f_0(z, 0) is locally asymptotically (respectively, locally exponentially) stable (Byrnes and Isidori 1984).
• A nonlinear system is globally minimum-phase if the equilibrium z = 0 of ż = f_0(z, 0) is globally asymptotically stable (Byrnes and Isidori 1991).
• A nonlinear system is strongly minimum-phase if the system ż = f_0(z, ξ), viewed as a system with input ξ and state z, is input-to-state stable (Liberzon 2002).

According to the well-known criterion of Sontag (1995) for input-to-state stability, a system is strongly minimum phase if and only if there exists a positive definite and proper smooth real-valued function V(z), class K_∞ functions α_1(·), α_2(·), α_3(·), and a class K function γ(·) satisfying

  α_1(|z|) ≤ V(z) ≤ α_2(|z|) for all z,
  (∂V/∂z) f_0(z, ξ) ≤ -α_3(|z|) for all (z, ξ) such that |z| ≥ γ(|ξ|).

As a special case, it is seen that a system is globally minimum phase if and only if there exists a function V(z), bounded as above, such that

  (∂V/∂z) f_0(z, 0) ≤ -α_3(|z|) for all z.

If, instead, the weaker inequality

  (∂V/∂z) f_0(z, 0) ≤ 0 for all z

holds, the system is said to be globally weakly minimum-phase.

The criterion summarized above is of paramount importance in the design of feedback laws for the purpose of stabilizing nonlinear systems that are globally (or strongly) minimum phase, as will be seen below.

Zero Dynamics and Stabilization

The first and foremost immediate implication of the properties described above is the fact that the feedback law

  u = (1 / b(z, ξ)) [-q_0(z, ξ) + K_r ξ],

if K_r is chosen so that the matrix A_r + B_r K_r is Hurwitz, globally asymptotically stabilizes the equilibrium (z, ξ) = (0, 0) of a strongly minimum-phase system. In fact, as observed, the corresponding closed-loop system can be seen as an asymptotically stable (linear) system driving an input-to-state stable (nonlinear) system. As already observed, this control mode is non-robust (as it relies upon exact cancelations) and requires the availability of the full state (z, ξ) of the controlled system. However, both these deficiencies can be to some extent fixed, by means of appropriate techniques that will be briefly reviewed below.
904 Nonlinear Zero Dynamics

deficiencies can be to some extent fixed, by control in question by means of a dynamic control
means of appropriate techniques, that will be law that only depends on the output y, following
briefly reviewed below. a design paradigm originally proposed by H.
If the requirement of global stability is re- Khalil. In fact, if the system is strongly minimum
placed by the (weaker) requirement of stability phase and also locally exponentially minimum
with a guaranteed region of attraction, then the phase and if q0 .0; 0/ D 0, asymptotic stability
desired control goal can be achieved by means with a guaranteed region of attraction can be
of a much simpler law, depending only on the achieved by means of dynamical feedback law of
partial state  and not requiring cancelations. the form Khalil and Esfandiari (1993)
Stability with a guaranteed region of attraction
P
essentially means that a given equilibrium is O1 D O2 C cr1 .y  O1 /
rendered asymptotically stable, with a region of
P
attraction that contains an a priori fixed compact O2 D O3 C  2 cr2 .y  O1 /
set. In this context, the most relevant results can 
be summarized as follows. PO
Assume the system possesses a globally de- r1 D Or C  r1 c1 .y  O1 /
P
fined normal form and, without loss of generality, Or D  r c0 .y  O1 /
let b.z; / > 0. Let the system be controlled by a O ;
u D  L .kKr /
“partial state” feedback of the form

in which  and the ci are design parameters and


u D kKr  ;
L .s/ is a smooth saturation function, character-
ized as follows: L .s/ D s if jsj  L, L .s/
in which k 2 R. Under this control mode, the
is odd and monotonically increasing, with 0 <
following results are obtained:
L0 .s/  1, and lims!1 L .s/ D L.1 C c/ with
• Suppose the system is strongly minimum
0 < c 1. The number L is a design parameter
phase. Then, there is a matrix Kr and, for
also.
every choice of a compact set C and of a
It is also possible to show that a suitable
number " > 0, there are a number k  and a
“extension” of this dynamic feedback law can be
time T  such that, if k  k  , all trajectories of
used to asymptotically recover the effects of the
the closed-loop system with initial condition
input-output linearizing law considered earlier.
in C are bounded and satisfy jx.t/j  " for all
In this way, the lack of robustness intrinsically
t  T .
present in such control law is overcome (Frei-
• Suppose the system is strongly minimum
dovich and Khalil (2008)).
phase and also locally exponentially minimum
phase. Suppose q0 .0; 0/ D 0. Then, there
is a matrix Kr and, for every choice of a
compact set C there is a number k  such Output Regulation
that, if k  k  , the equilibrium x D 0 of
the system is locally asymptotically stable, The concept of zero dynamics plays a fundamen-
with a domain of attraction that contains the tal role in the problem of output regulation. The
set C. problem in question considers a controlled plant
In these results, the system is stabilized by modeled by
means of a static control law that depends only
on the partial state  and not on the (possi- xP D f .w; x; u/
bly unknown) quantities q0 .z; /, b.z; /. Bear- e D h.w; x/ ;
ing in mind the fact that the r components of
 coincide with the output y and its deriva- in which u is the control input, w is a set of ex-
tives y .1/ ; : : : ; y .r1/ , it is possible to replace the ogenous variables (command and disturbances),
Nonlinear Zero Dynamics 905

and e is a set of regulated variables. The exoge- the controlled plant. The second two equations,
nous variables are thought of as generated by an on the other hand, interpret the ability, of the
autonomous system controller, to generate the feedforward input nec-
essary to keep e.t/ D 0 in steady-state. This is
wP D s.w/ a nonlinear version of the well-known internal
model principle of Francis and Wonham (1975).
known as the exosysten. The problem is to design
a (possibly dynamic) controller
Passivity
xP c D fc .xc ; e/
u D hc .xc ; e/ Consider a nonlinear input-affine system having
the same number m of inputs and outputs and
driven by the regulated variable e, such that recall that this system is said to be passive if
in the resulting closed-loop system all trajecto- there exists a continuous nonnegative function
ries are ultimately bounded and limt !1 e.t/ D real-valued function W .x/, with W .0/ D 0, that
0: The problem in question has been the ob- satisfies
ject of intensive research in the past years. In Z t
what follows we limit ourselves to highlight the W .x.t//  W .x.0//  y T .s/u.s/ds
role of the concept of zero dynamics in this 0
problem.
Assume that the set W where the exosystem along trajectories. The function W .x/ is the so-
evolves is compact and invariant and suppose a called storage function of the system.
controller exists that solves the problem of output It is well known that the notion of passiv-
regulation. Then, the associated closed-loop has a ity plays an important role in system analysis
steady-state locus (see Isidori and Byrnes 2008), and that the theory of passive systems leads to
the graph of a possibly set-valued map defined on powerful methodologies for the design of feed-
N
W . Suppose the map in question is single-valued, back laws for nonlinear systems. In this context,
which means that for each given exogenous input the question of whether a given, non-passive,
function w.t/, there exists a unique steady-state nonlinear system could be rendered passive by
response, expressed as x.t/ D .w.t// and means of state feedback is indeed relevant. It
xc .t/ D c .w.t//. If, in addition, .w/ and c .w/ turns out that this possibility can be simply ex-
are continuously differentiable, it is readily seen pressed as a property of the zero dynamics of the
that system.
Suppose that Lg h.x/ is nonsingular and set
Ls .w/ D f .w; .w/; .w// g  .x/ D g.x/ŒLg h.x/1 . If the m columns
0 D h.w; .w// of g  .x/ are complete and commuting vector
8w 2 W: fields, there exists a globally defined change of
Ls c .w/ D fc .c .w/; 0/
coordinates that brings the system in normal form
.w/ D hc .c .w/; 0/

The first two equations, introduced in Isidori zP D f0 .z; y/


and Byrnes (1990), are known as the nonlinear yP D q0 .z; y/ C b.z; y/u
regulator equations. They clearly show that the
graph of the map .w/ is a manifold contained Then, there exists a feedback law u D ˛.z; y/ that
in the zero set of the output map e, rendered renders the resulting closed-loop system passive,
invariant by the control u D .w/. In particular, with a C 2 and positive definite storage function
the steady-state trajectories of the closed-loop W .x/, if and only if the system is globally weakly
system are trajectories of the zero dynamics of minimum phase (Byrnes et al. 1991).
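For a linear plant ẋ = Ax + Bu + Pw, e = Cx + Qw driven by a linear exosystem ẇ = Sw, the nonlinear regulator equations above reduce to the linear "Francis equations" ΠS = AΠ + BΓ + P, 0 = CΠ + Q, with x = Πw the steady-state locus and u = Γw = ψ(w) the feedforward input. The following sketch is an editorial illustration (not part of the entry; all numerical matrices are invented for the example) that solves these equations as one linear system via Kronecker products:

```python
import numpy as np

# Linear specialization of the regulator equations (the Francis equations):
#   Pi S = A Pi + B Gamma + P,   0 = C Pi + Q.
# All numerical matrices below are illustrative choices, not from the entry.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])   # plant: x' = A x + B u + P w
B = np.array([[0.0], [1.0]])
P = np.array([[0.0, 0.0], [1.0, 0.0]])
C = np.array([[1.0, 0.0]])                 # regulated variable e = C x + Q w
Q = np.array([[-1.0, 0.0]])
S = np.array([[0.0, 1.0], [-1.0, 0.0]])    # harmonic exosystem w' = S w

n, m, nw, p = A.shape[0], B.shape[1], S.shape[0], C.shape[0]
# Stack the unknowns X = [vec(Pi); vec(Gamma)] (column-major vec) using
# vec(A Pi) = (I kron A) vec(Pi) and vec(Pi S) = (S^T kron I) vec(Pi).
I_w = np.eye(nw)
M = np.block([
    [np.kron(S.T, np.eye(n)) - np.kron(I_w, A), -np.kron(I_w, B)],
    [np.kron(I_w, C), np.zeros((p * nw, m * nw))],
])
R = np.concatenate([P.flatten(order="F"), -Q.flatten(order="F")])
X = np.linalg.solve(M, R)
Pi = X[:n * nw].reshape((n, nw), order="F")
Gamma = X[n * nw:].reshape((m, nw), order="F")
```

A unique solution exists in this example because the plant has no transmission zeros, so none can coincide with the exosystem eigenvalues ±j (the usual nonresonance condition).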
Limits of Performance

It is well known that linear systems having zeros in the right-half plane are difficult to control, and obstructions exist to the fulfillment of certain control specifications. One of these is found in the analysis of the so-called cheap control problem, namely, the problem of finding a stabilizing feedback control that minimizes the functional

J_ε = (1/2) ∫₀^∞ [yᵀ(t) y(t) + ε uᵀ(t) u(t)] dt

when ε > 0 is small. As ε → 0, the optimal value J_ε tends to J₀, the ideal performance. It is well known that, in a linear system, J₀ = 0 if and only if the system is minimum phase and right invertible and, in case the system has zeros with positive real part, it is possible to express J₀ explicitly in terms of the zeros in question. If the (linear) system is expressed in normal form as

ż = F z + G ξ
ξ̇ = H z + K ξ + b u
y = ξ

with b ≠ 0, and the zero dynamics are antistable (that is, all the eigenvalues of F have positive real part), it can be shown that J₀ coincides with the minimal value of the energy

J* = (1/2) ∫₀^∞ ξᵀ(t) ξ(t) dt

required to stabilize the (antistable) system ż = Fz + Gξ. In other words, the limit as ε → 0 of the optimal value of J_ε is equal to the least amount of energy required to stabilize the dynamics of the inverse system.

This result has an appealing nonlinear counterpart (Seron et al. 1999). In fact, for a nonlinear input-affine system having the same number m of inputs and outputs in normal form, with f₀(z, ξ) of the form f₀(z, ξ) = f₀(z) + g₀(z)ξ and ż = f₀(z) antistable, under appropriate technical assumptions (mostly related to the existence of the solution of the associated optimal control problems), the same result holds: the lowest attainable value of the L₂ norm of the output coincides with the least amount of energy required to stabilize the dynamics of z.

Summary and Future Directions

The concept of zero dynamics plays an important role in a large number of problems arising in the analysis and design of nonlinear control systems, among which the most relevant ones are the problems of asymptotic stabilization and those of asymptotic tracking/rejection of exogenous command/disturbance inputs. Essentially, all such applications deal with single-input single-output systems, require the system to be preliminarily reduced to a special form by means of an appropriate change of coordinates, and assume the dynamics in question to be globally asymptotically stable. The analysis of systems having many inputs and many outputs, of systems in which normal forms cannot be defined, and of systems in which the zero dynamics are unstable is still a challenging and unexplored area of research.

Cross-References

▶ Differential Geometric Methods in Nonlinear Control
▶ Input-to-State Stability
▶ Regulation and Tracking of Nonlinear Systems

Bibliography

Byrnes CI, Isidori A (1984) A frequency domain philosophy for nonlinear systems. IEEE Conf Dec Control 23:1569–1573
Byrnes CI, Isidori A (1991) Asymptotic stabilization of minimum-phase nonlinear systems. IEEE Trans Autom Control AC-36:1122–1137
Byrnes CI, Isidori A, Willems JC (1991) Passivity, feedback equivalence, and the global stabilization of minimum phase nonlinear systems. IEEE Trans Autom Control AC-36:1228–1240
Francis BA, Wonham WM (1975) The internal model principle for linear multivariable regulators. J Appl Math Optim 2:170–194
Freidovich LB, Khalil HK (2008) Performance recovery of feedback-linearization-based designs. IEEE Trans Autom Control 53:2324–2334
Hirschorn RM (1979) Invertibility for multivariable nonlinear control systems. IEEE Trans Autom Control AC-24:855–865
Isidori A (1995) Nonlinear control systems, 3rd edn. Springer, Berlin/New York
Isidori A, Byrnes CI (1990) Output regulation of nonlinear systems. IEEE Trans Autom Control AC-35:131–140
Isidori A, Byrnes CI (2008) Steady-state behaviors in nonlinear systems, with an application to robust disturbance rejection. Ann Rev Control 32:1–16
Isidori A, Moog C (1988) On the nonlinear equivalent of the notion of transmission zeros. In: Byrnes CI, Kurzhanski A (eds) Modelling and adaptive control. Lecture notes in control and information sciences, vol 105. Springer, Berlin/New York, pp 445–471
Isidori A, Krener AJ, Gori-Giorgi C, Monaco S (1981) Nonlinear decoupling via feedback: a differential geometric approach. IEEE Trans Autom Control AC-26:331–345
Khalil HK, Esfandiari F (1993) Semiglobal stabilization of a class of nonlinear systems using output feedback. IEEE Trans Autom Control AC-38:1412–1415
Liberzon D, Morse AS, Sontag ED (2002) Output-input stability and minimum-phase nonlinear systems. IEEE Trans Autom Control AC-47:422–436
Seron MM, Braslavsky JH, Kokotovic PV, Mayne DQ (1999) Feedback limitations in nonlinear systems: from Bode integrals to cheap control. IEEE Trans Autom Control AC-44:829–833
Singh SN (1981) A modified algorithm for invertibility in nonlinear systems. IEEE Trans Autom Control AC-26:595–598
Silverman LM (1969) Inversion of multivariable linear systems. IEEE Trans Autom Control AC-14:270–276
Sontag ED (1995) On the input-to-state stability property. Eur J Control 1:24–36
Wonham WM (1979) Linear multivariable control: a geometric approach. Springer, New York

Nonparametric Techniques in System Identification

Rik Pintelon and Johan Schoukens
Department ELEC, Vrije Universiteit Brussel, Brussels, Belgium

Abstract

This entry gives an overview of classical and state-of-the-art nonparametric time- and frequency-domain techniques. In contrast to parametric methods, these techniques require no detailed structural information to get insight into the dynamic behavior of complex systems. Therefore, nonparametric methods are used in system identification to get an initial idea of the model complexity and for model validation purposes (e.g., detection of unmodeled dynamics). Their drawback is the increased variability compared with the parametric estimates. Although the main focus of this entry is on the classical identification framework (estimation of dynamical systems operating in open loop from known input, noisy output observations), the reader will also learn more about (i) the connection between transient and leakage errors, (ii) the estimation of dynamical systems operating in closed loop, (iii) the estimation in the presence of input noise, and (iv) the influence of nonlinear distortions on the linear framework. All results are valid for discrete- and continuous-time systems. The entry concludes with some user choices and practical guidelines for setting up a system identification experiment and choosing an appropriate estimation method.

Keywords

Best linear approximation; Correlation method; Empirical transfer function estimate; Errors-in-variables; Feedback; Frequency response function; Gaussian process regression; Impulse transient response modeling method; Local polynomial method; Local rational method; Noise (co)variances; Noise power spectrum; Spectral analysis

Introduction

Nonparametric representations such as frequency response functions (FRFs) and noise power spectra are very useful in system identification: they are used (i) to verify the quality of the identification experiment (high or poor signal-to-noise ratio?), (ii) to quickly get insight into the dynamic behavior of the plant (complex or easy identification problem?), and (iii) to validate the parametric plant and noise models (detection of unmodeled dynamics); see also ▶ System Identification: An Overview. In addition, via specially designed periodic excitation signals, it is possible to detect and quantify the nonlinear distortions in the FRF estimate. As such, without estimating a parametric model, the users can easily decide whether or not the linear framework is accurate enough for their particular application.

The estimation of the nonparametric models typically starts from sampled input-output signals u(nT_s) and y(nT_s), n = 0, 1, …, N − 1, that are transformed to the frequency domain via the discrete Fourier transform (DFT)

X(k) = (1/√N) Σ_{n=0}^{N−1} x(nT_s) e^{−j2πkn/N}   (1)

with T_s the sampling period, x = u or y, and X = U or Y. One of the main difficulties in estimating an FRF and noise power spectrum is the leakage error in the DFT spectrum X(k) = DFT(x(t)) (1). It is due to the finite duration NT_s of the experiment, and it increases the mean square error of the nonparametric estimates. Therefore, all methods try to suppress the leakage error as much as possible.

This entry starts with a detailed analysis of the leakage problem (section "The Leakage Problem"), followed by an overview of standard and advanced nonparametric time- (section "Nonparametric Time-Domain Techniques") and frequency- (section "Nonparametric Frequency-Domain Techniques") domain techniques. First, it is assumed that the system operates in open loop (see Fig. 1) and that known input, noisy output observations are available (sections "Nonparametric Time-Domain Techniques" and "Nonparametric Frequency-Domain Techniques"). Next, section "Extensions" extends the results to systems operating in closed loop (section "Systems Operating in Feedback"); to noisy input, noisy output observations (section "Noisy Input, Noisy Output Observations"); and to nonlinear systems (section "Nonlinear Systems"). Finally, some user choices are discussed (section "User Choices") and some practical guidelines are given (section "Guidelines"). Unless otherwise stated, the input u(t) and the disturbing noise v(t) are assumed to be statistically uncorrelated.

[Fig. 1: block diagram of the plant with additive noise at the output.] Nonparametric Techniques in System Identification, Fig. 1 Classical identification framework: discrete- or continuous-time plant operating in open loop; known input u(t), noisy output y(t) observations; and v(t) filtered discrete-time or band-limited continuous-time white noise e(t) that is independent of u(t). y₀(t) denotes the true output of the plant. In the continuous-time case, it is assumed that the unobserved driving noise source e(t) has finite variance and constant (white) power spectrum within the acquisition bandwidth.

The Leakage Problem

For arbitrary excitations u(t), the relationship between the true input U(k) and true output Y₀(k) DFT spectra (1) of a linear dynamic system is given by

Y₀(k) = G(Ω_k) U(k) + T_G(Ω_k)   (2)

where Ω_k = jω_k or exp(jω_k T_s) for, respectively, continuous- and discrete-time systems; ω_k = 2πk/(NT_s); G(Ω_k) is the plant frequency response function; and T_G(Ω_k) is the leakage error due to the plant dynamics (Pintelon and Schoukens 2012, Section 6.3.2). The leakage error T_G(·) is a smooth function of the frequency that decreases to zero as O(N^{−1/2}) for N increasing to infinity. It depends on the difference between the initial and final conditions of the experiment and has exactly the same poles as the plant transfer function. Therefore, the time-domain response of T_G(·) is decaying exponentially to zero as a transient error. From this short discussion, it can be concluded that the leakage error in the frequency domain is equivalent to the transient error in the time domain, the only difference being that the former depends on the difference between the initial and final conditions, while the latter solely depends on the initial conditions.

Standard spectral analysis methods (see section "Spectral Analysis Method") suppress the leakage term T_G(Ω_k) in (2) by multiplying the time-domain signals with a window w(t) before taking the DFT (1)

(X(k))_W = (1/(√N w_rms)) Σ_{n=0}^{N−1} w(nT_s) x(nT_s) e^{−j2πkn/N}   (3)

with w_rms = (Σ_{n=0}^{N−1} |w(nT_s)|²/N)^{1/2} the root mean square (rms) value of the window w(t). The scaling in (3) is such that the transformation preserves the rms value of the signal. The relationship between the DFT spectra (U(k))_W and (Y₀(k))_W of the windowed input-output signals w(t)u(t) and w(t)y₀(t) is given by

(Y₀(k))_W = G(Ω_k)(U(k))_W + E_int(k) + E_leak(k)   (4)

where E_int(k) and E_leak(k) are, respectively, the interpolation error and the remaining leakage error

E_int(k) = (G(Ω_k) U(k))_W − G(Ω_k)(U(k))_W   (5)
E_leak(k) = (T_G(Ω_k))_W   (6)

Note that E_int(k) = 0 if G(Ω_k) is constant within the bandwidth of W(k), while the interpolation error is large if the FRF varies significantly within the window bandwidth. To keep E_int(k) small, the frequency resolution 1/(NT_s) should be sufficiently large and the window bandwidth should be small enough. On the other hand, a larger window bandwidth is beneficial for reducing the leakage error E_leak(k). Hence, choosing an appropriate window for nonparametric FRF and noise power spectrum estimation is making a trade-off between the reduction of the leakage error E_leak(k) and the increase of the interpolation error E_int(k) (Schoukens et al. 2006).

Note that exactly the same analysis can be made for the continuous- or discrete-time dynamics of the disturbing output noise v(t) in Fig. 1

V(k) = H(Ω_k) E(k) + T_H(Ω_k)   (7)

with H(Ω_k) the noise frequency response function, E(k) the DFT of the unobserved driving discrete-time or band-limited continuous-time white noise source e(t) (Pintelon and Schoukens 2012, Section 6.7.3), and T_H(Ω_k) the noise leakage (transient) term. The noise leakage term is often neglected but can be important for lightly damped systems (e.g., in modal analysis). Most nonparametric techniques suppress the sum of the plant and noise leakage errors T_G(Ω_k) + T_H(Ω_k).

If an integer number of periods of the steady-state response to a periodic excitation is measured, then the plant leakage error T_G(Ω_k) in (2) is zero, which simplifies the estimation problem significantly. Therefore, for the frequency-domain techniques, a distinction is made between periodic and nonperiodic excitations. Note, however, that the noise leakage (transient) term T_H(Ω_k) in (7) remains different from zero.

Nonparametric Time-Domain Techniques

The time-domain methods estimate the impulse response of the plant via the time-domain relationship that the true output y₀(t) equals the convolution product between the impulse response g(t) and the true input u(t). For discrete-time systems, it takes the form

y₀(t) = Σ_{n=0}^{∞} g(n) u(t−n)   (8)

In practice, only a finite number of impulse response coefficients g(t) can be estimated from N input-output samples, and, therefore, (8) is approximated by a finite sum

y₀(t) ≈ Σ_{n=0}^{L} g(n) u(t−n)   (9)

where L ≤ N − 1 should also be determined from the data. From (9), it can be seen that the response depends on the past input values u(−1), u(−2), …, u(−L). Since these values are unknown, an exponentially decaying transient error is present in the first L samples of the predicted output (9). This transient error is the time-domain equivalent of the leakage error T_G(Ω_k) in (2). To remove the transient error, the first L output samples can be discarded in the predicted output (9). It reduces the amount of data from N to N − L and, hence, increases the mean square error of the estimates. If it is known that the transfer function has no direct term, then g(0) = 0, and the sum (9) starts from n = 1.

Correlation Methods
Correlation methods have been studied intensively since the end of the 1950s (see Eykhoff 1974) and are nowadays still used in telecommunication channel estimation and equalization. The impulse response coefficients are found by minimizing the sum of the squared differences between the observed output samples and the output samples predicted by (9)

Σ_{t=L}^{N−1} (y(t) − Σ_{n=0}^{L} g(n) u(t−n))²   (10)

w.r.t. g(m), m = 0, 1, …, L. The solution of this linear least squares problem is given by the famous Wiener-Hopf equation

R̂_yu(m) = Σ_{n=0}^{L} g(n) R̂_uu(m−n)   (11)

for m = 0, 1, …, L, where R̂_yu and R̂_uu are estimates of, respectively, the cross- and autocorrelation functions R_yu(τ) = E{y(t)u(t−τ)} and R_uu(τ) = E{u(t)u(t−τ)}

R̂_yu(m) = (1/(N−L)) Σ_{t=L}^{N−1} y(t) u(t−m)   (12)

R̂_uu(m−n) = (1/(N−L)) Σ_{t=L}^{N−1} u(t−n) u(t−m)   (13)

(Godfrey 1993, Chapter 1; Ljung 1999, Chapter 6). Since the number of estimated impulse response coefficients L can grow with the amount of data N, the correlation method (11) is classified as being nonparametric. If the input is white noise, then the expected value of R̂_uu(m) is proportional to the Kronecker delta δ(m), and the cross-correlation R̂_yu(m) (11) is – within a scaling factor – a good approximation of the impulse response. This property is used in blind channel estimation.

Gaussian Process Regression
The linear least squares (10) solution can be (very) sensitive to disturbing output noise if L is not much smaller than N. This problem is circumvented by the Gaussian process regression approach. The key idea consists in modeling the impulse response coefficients g(n) as a zero-mean Gaussian process with a certain covariance structure P_L that depends on a few hyper-parameters (Pillonetto et al. 2011). In Chen et al. (2012), it has been shown that the Gaussian process regression is equivalent to the following regularized (see also ▶ System Identification Techniques: Convexification, Regularization, and Relaxation) linear least squares problem

Σ_{t=L}^{N−1} (y(t) − Σ_{n=0}^{L} g(n) u(t−n))² + σ² gᵀ P_L⁻¹ g   (14)

where g = (g(0), g(1), …, g(L))ᵀ and with σ² the variance of the output disturbance. The hyper-parameters defining P_L and the noise variance σ² are estimated via an empirical Bayes method.
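The finite-sum model (9), the least squares criterion (10), and the regularized criterion (14) can be illustrated with a short numerical sketch (an editorial illustration with invented data, not part of the entry; the crude choice P_L = I is used, whereas a practical implementation would tune a smoothness/decay kernel P_L and the noise variance σ² by empirical Bayes as described above):

```python
import numpy as np

# Illustrative FIR estimation: eq. (10) (plain least squares, transient
# removed by starting at t = L) and eq. (14) with the crude choice P_L = I.
rng = np.random.default_rng(0)
N, L = 2000, 30
g_true = 0.8 ** np.arange(L + 1)           # toy impulse response
u = rng.standard_normal(N)                 # white-noise input
y = np.convolve(u, g_true)[:N] + 0.1 * rng.standard_normal(N)

# Row t of the regression matrix holds u(t), u(t-1), ..., u(t-L), t >= L.
U = np.column_stack([u[L - j:N - j] for j in range(L + 1)])
yt = y[L:]

g_ls = np.linalg.lstsq(U, yt, rcond=None)[0]               # eq. (10)
sigma2 = 0.1 ** 2                                          # noise variance
g_reg = np.linalg.solve(U.T @ U + sigma2 * np.eye(L + 1),  # eq. (14), P_L = I
                        U.T @ yt)
```

Discarding the first L output samples removes the transient (leakage) error as described above, and the ridge term σ² gᵀP_L⁻¹g stabilizes the estimate when L is not much smaller than N.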
Nonparametric Techniques in System Identification 911

Nonparametric Frequency-Domain subrecords of the total response (Ljung 1999,


Techniques Section 6.4). In Heath (2007), it is shown that the
optimally (in mean square sense) weighted ETFE
The frequency-domain techniques estimate the equals the spectral analysis method.
frequency response function (FRF) using rela-
tionship (2) or (4) between the input-output DFT Spectral Analysis Method
spectra. We start with the simplest approach and The spectral analysis method is available in any
gradually increase the complexity of the esti- digital spectrum analyzer. It is based on the
mation methods. Note that nonparametric FRF relationship between the FRF and the cross-
estimation is still a quickly evolving research and autopower spectra of the input-output
area, such that the pros and cons of the advanced signals
methods are yet not well established.
Syu ./ F fRyu ./g
G./ D D (17)
Empirical Transfer Function Estimation Suu ./ F fRuu ./g
If an integer number of periods P of the steady-
state response to a periodic excitation is ob- with F fg the Fourier transform (Bendat and Pier-
served, then the leakage term in TG .k / in (2) sol 1980, Chapter 4; Brillinger 1981, Chapter 8).
is zero, and the FRF is estimated by dividing the Comparing (11) and (17), it can be seen that the
output by the input DFT spectra at the excited spectral analysis method is the frequency-domain
frequencies (Pintelon and Schoukens 2012, Sec- equivalent of the correlation method (take the
tion 2.4) Fourier transform of the expected value of (11)).
O k / D Y .k/
G. (15) There are basically two methods for estimating
U.k/ the cross- and autopower spectra in (17) from
The output noise variance V2 .k/ is estimated via sampled data: the Blackman and Tukey (1958)
the sample variance O V2 .k/ of the output DFT and the Welch (1967) procedures.
spectra over the P consecutive signal periods. The Blackman-Tukey procedure (Blackman N
The variance of the FRF estimate (15) is then and Tukey 1958; Ljung 1999, Section 6.4) con-
given by sists in taking the DFT (3) of the windowed cross-
and autocorrelation functions, viz.,

O k // D V2 .k/
N 1
var.G.
P jU.k/j2
(16) 1 X
RO yu ./ D y.t/ u.t  / (18)
N tD
where jU.k/j is the magnitude of U.k/.
N 1
Applying (15) to random excitations gives 1 X
SORyu .k/ D p w./ RO yu ./e j 2 N (19)
k
the empirical transfer function estimate (Ljung
N  D0
1999, Section 6.3). Due to the presence of the
plant leakage error TG .k /=U.k/, the statistical resulting in an FRF estimate (17) at the full
properties of (15) for random inputs are quite frequency resolution 1=.NT s / of the measure-
different from those for periodic inputs. While ment. It can be shown that (19) is a smoothed
the empirical transfer function estimate (ETFE) is version of the periodogram Y .k/U.k/, where is
unbiased and has finite variance (16) for periodic UN the complex conjugate of U (Brillinger 1981,
inputs, it is biased and has infinite variance for Chapter 5).
random inputs (Broersen 2004). To improve the In the Welch approach (Welch 1967; Pin-
statistical properties of the ETFE for random telon and Schoukens 2012, Section 2.6), the N
inputs, one can either approximate locally the input-output samples are split into M subrecords
ETFE by a polynomial (Stenman et al. 2000) of N=M samples each, and the DFT spectra
or perform a weighted average of ETFEs over .U Œm .k//W and .Y Œm .k//W of the windowed
912 Nonparametric Techniques in System Identification

input and output samples are calculated via (3) more experiments (input-output data records) are
where N is replaced by N=M , giving available. If the number of measured records M
increases to infinity, then (22) converges to the
1 X  Œm   Œm 
M
true value, provided a perfect suppression of the
SOYW UW .k/ D Y .k/ W U .k/ W leakage error.
M mD1
(20) In measurement devices, the quality of the
XM
ˇ Œm  ˇ2 spectral analysis estimate (22) is often quantified
1 ˇ U .k/ W ˇ
SOUW UW .k/ D (21) via the coherence 2 .!/
M mD1
ˇ ˇ
The spectral analysis estimate of the FRF and its ˇSyu ./ˇ2
.!/ D
2
(25)
variance are then given by Syy ./Suu ./

O
O k / D SYW UW .k/
G. (22)
which is comprised between 0 and 1. It is related
Ŝ_{U_W U_W}(k) to the variance of the spectral analysis estimate as

  var(Ĝ(Ω_k)) = (1/M) · [(1 − γ²(ω_k)) / γ²(ω_k)] · |G(Ω_k)|² = (1/M) · σ²_V(k) · E{Ŝ^{−1}_{U_W U_W}(k)}   (23)

(Brillinger 1981, Chapter 8; Heath 2007). A coherence smaller than 1 indicates the presence of disturbing noise, residual leakage errors, nonlinear distortions, or a nonobserved input. Finally, the output noise variance σ²_V(k) in (23) is estimated as

  σ̂²_V(k) = [M/(M − 1)] · ( Ŝ_{Y_W Y_W}(k) − |Ŝ_{Y_W U_W}(k)|² / Ŝ_{U_W U_W}(k) )   (24)

(Brillinger 1981, Chapter 8; Pintelon and Schoukens 2012, Section 2.5.4). Due to the spectral width of the window used, the estimates (22) and (24) are correlated over the frequency (the correlation length is about twice the spectral width). Note that (21) is used for estimating noise power spectra (Brillinger 1981, Chapter 5). Note also that for periodic excitations combined with a rectangular window w(nT_s) = 1, the spectral analysis estimate (22), where each subrecord is equal to a signal period, simplifies to the ETFE (15).

Compared with the Blackman-Tukey procedure (19), the FRF estimate (22) based on the Welch approach (20) and (21) has a frequency resolution and a variance (23) that are M times smaller. In measurement devices, the FRFs are estimated using the Welch approach (20)–(22) where each subrecord is an independent measurement with a fixed number of samples. The reason for this is that the cross- and autopower spectra estimates (20) and (21) can easily be updated as each new subrecord becomes available.

Following the same lines as Welch (1967), the statistical properties of the spectral analysis estimate (22) can be improved via overlapping subrecords in the cross- and autopower spectra estimates (20) and (21). This has been studied in detail for noise power spectra in Carter and Nuttall (1980) and for FRFs in Antoni and Schoukens (2007).

Advanced Methods
The goal of the advanced methods is to estimate the FRF at the full frequency resolution 1/(NT_s) of the experiment duration NT_s while suppressing the influence of the leakage and the noise errors. Without some extra information, it is impossible to achieve this goal via (2). The additional piece of information that allows one to solve the problem is that the FRF and the leakage error are locally smooth functions of the frequency.

The local polynomial method (Pintelon and Schoukens 2012, Chapter 7) approximates the FRF and the leakage error in (2) locally in the frequency band [k − n, k + n] by a polynomial. From the residuals of the local linear least squares solution, one also gets an estimate of the output noise variance σ²_V and, hence, also of the variance of the FRF. The whole procedure is repeated for
Nonparametric Techniques in System Identification 913

all DFT frequencies Ω_k in the frequency band of interest. The correlation length of the estimates equals ±2n, which is twice the local bandwidth of the polynomial approximation.

The local rational method (McKelvey and Guérin 2012) follows the same lines as the local polynomial method, except that the FRF and the leakage error in (2) are locally approximated by rational forms with the same poles (G = B/A and T_G = I/A). Due to the common poles, the local rational approximation problem can be transformed into a local linear least squares problem. The method is biased but better suppresses the plant leakage error of lowly damped systems.

The transient impulse response modeling method (Hägg and Hjalmarsson 2012) approximates the FRF and the leakage error by, respectively, finite impulse and transient response models, giving a large sparse global linear least squares problem. From the residuals of the global linear least squares solution, one gets an estimate of the output noise variance σ²_V and, hence, also of the variance of the FRF. This approach has the best smoothing properties and is recommended in case the noise error is dominant.

Extensions
In sections "Nonparametric Time-Domain Techniques" and "Nonparametric Frequency-Domain Techniques," it is assumed that the linear plant operates in open loop and that the input is known exactly. If the plant operates in feedback and/or the input observations are noisy, then the presented time- and frequency-domain techniques are biased. In sections "Systems Operating in Feedback" and "Noisy Input, Noisy Output Observations," it is shown that the estimation bias can be avoided if a known external reference signal is available (typically the signal stored in the arbitrary waveform generator).

Since most real-life systems behave to some extent nonlinearly, it is important to detect and quantify the nonlinear effects in FRF estimates. This issue is handled in section "Nonlinear Systems."

Systems Operating in Feedback
The key difficulty of estimating the FRF of a plant operating in feedback (see Fig. 2) using nonperiodic excitations is that the true input u(t) is correlated with the process noise v(t). The direct approaches of sections "Nonparametric Time-Domain Techniques" and "Nonparametric Frequency-Domain Techniques" lead to biased estimates (Wellstead 1981). This can easily be seen from the ETFE (15) applied to the feedback setup in Fig. 2

  Ĝ(Ω_k) = [G(Ω_k) G_act(Ω_k) R(k) + V(k)] / [G_act(Ω_k) R(k) − G_fb(Ω_k) V(k)]   (26)

where G_act(Ω_k) and G_fb(Ω_k) are, respectively, the actuator and feedback dynamics. From (26), it follows that in those frequency bands where the process noise V(k) dominates, one rather estimates minus the inverse of the feedback dynamics instead of the plant FRF. On the other hand, at those frequencies where the reference signal injects most power, the ETFE (26) will be close to the plant FRF.

Nonparametric Techniques in System Identification, Fig. 2 Plant operating in closed loop: r(t) is the external reference signal, the known input u(t) depends on the process noise v(t), and y(t) is the noisy output observation
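The two regimes of (26) are easy to check numerically. A minimal sketch — the complex values chosen for G, G_act, and G_fb at a single frequency bin are made up for illustration and are not taken from the article:

```python
# Closed-loop ETFE of eq. (26) at one frequency bin:
#   Ghat = (G*Gact*R + V) / (Gact*R - Gfb*V)

def etfe_closed_loop(G, Gact, Gfb, R, V):
    """Evaluate (26) for complex scalars at a single DFT bin."""
    return (G * Gact * R + V) / (Gact * R - Gfb * V)

# Hypothetical plant, actuator, and feedback dynamics at this bin
G, Gact, Gfb = 2.0 - 1.0j, 1.5 + 0.5j, 0.8 + 0.2j

# Bin dominated by the reference signal: the ETFE is close to the plant FRF G
print(etfe_closed_loop(G, Gact, Gfb, R=1.0, V=1e-9))

# Bin dominated by the process noise: the ETFE is close to -1/Gfb,
# i.e., minus the inverse of the feedback dynamics
print(etfe_closed_loop(G, Gact, Gfb, R=1e-9, V=1.0))
```

Setting V = 0 (or R = 0) exactly reproduces the two limits G and −1/G_fb stated in the text.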
Nonparametric Techniques in System Identification, Fig. 3 Errors-in-variables framework: r(t) is the external reference signal; n_g(t) is the generator noise; m_u(t), m_y(t) are the input and output measurement errors; v(t) is the process noise; and u(t), y(t) are the noisy input, noisy output observations

If a known external reference signal is available, then the bias is avoided via the indirect method proposed in Wellstead (1981)

  G(Ω) = [S_yr(Ω)/S_rr(Ω)] / [S_ur(Ω)/S_rr(Ω)] = S_yr(Ω)/S_ur(Ω).   (27)

The basic idea consists in modeling the feedback setup (see Fig. 2) from the known reference to the input and output simultaneously. This reduces the single-input, single-output closed loop problem to a single-input, two-output open loop problem. Since the process noise v(t) is independent of the reference signal r(t), the direct estimate of the single-input, two-output FRF is unbiased. Calculating the ratio of the two FRFs finally gives the indirect estimate (27). This procedure can be applied to any of the direct methods of sections "Nonparametric Time-Domain Techniques" and "Nonparametric Frequency-Domain Techniques." Proceeding in this way, unstable plants operating in a stabilizing feedback loop can also be handled.

If the excitation is periodic, then the process noise v(t) is independent of the periodic part of the input u(t), and the ETFE (15) converges to the true value as the number of periods P tends to infinity (Pintelon and Schoukens 2012, Section 2.5). Hence, in the periodic case, no external reference is needed.

Noisy Input, Noisy Output Observations
The key difficulty of estimating the FRF of a plant excited by a nonperiodic signal from noisy input, noisy output observations (see Fig. 3) is that the input autopower spectrum in (17) is biased. Indeed, due to the noise on the input, S_uu(Ω) is too large, resulting in too small direct FRF estimates. This is true for all direct FRF approaches in sections "Nonparametric Time-Domain Techniques" and "Nonparametric Frequency-Domain Techniques." Applying the indirect method of section "Systems Operating in Feedback" removes the bias because the noise on the input is independent of the reference signal (e.g., see (27)). Proceeding in this way, the closed loop case (see Fig. 2) with noisy input, noisy output observations is also solved by the indirect method.

If the excitation is periodic, then the mean value of the input-output DFT spectra over the P consecutive periods converges to their true values as P tends to infinity (Pintelon and Schoukens 2012, Section 2.5). Hence, the ETFE (15) is still consistent, and no external reference is needed. The same conclusion is valid for systems operating in feedback.

Nonlinear Systems
The classes of nonlinear systems considered are those systems whose steady-state response to a periodic input is periodic with the same period as the input. It excludes phenomena such as chaos and subharmonics but allows for hard nonlinearities such as saturation, dead zones, and clipping.

The classes of excitations considered are stationary random signals with a specified power spectrum and probability density function. An important special case is the class of
Nonparametric Techniques in System Identification, Fig. 4 Best linear approximation (BLA) of a nonlinear (NL) period in, same period out (PISPO) system, excited by a zero-mean random signal u(t) with a given power spectrum and probability density function. y(t) is the zero-mean part of the actual output of the nonlinear system. u_DC and y_DC are the DC levels of the actual input and output of the nonlinear system. The zero-mean output residual y_s(t) is uncorrelated with – but not independent of – the input u(t)

Gaussian excitation signals with a specified power spectrum. This class includes random phase multisines (a sum of harmonically related sinewaves with user-specified amplitudes and random phases) with the same Riemann equivalent power spectrum (Pintelon and Schoukens 2012, Section 4.2).

Consider a nonlinear (NL) period in, same period out (PISPO) system excited by a random excitation belonging to a particular class (see Fig. 4). The FRF (17), where the expected value is taken w.r.t. the random realization of the excitation, is the best (in mean square sense) linear approximation (BLA) of the nonlinear PISPO system, because the difference y_s(t) between the actual output of the nonlinear system (DC value excluded) and the output predicted by the linear approximation is uncorrelated with the input u(t) (Enqvist and Ljung 2005). Although uncorrelated with the input, the output residual y_s(t) still depends on u(t). If the NL PISPO system operates in feedback (see Fig. 2), then the indirect method (27) is used for calculating the BLA, and the output residual y_s(t) is uncorrelated with – but not independent of – the reference signal r(t) (Fig. 2).

For the class of Gaussian excitation signals, it can be shown that the DFT spectrum Y_S(k) of y_s(t) has the following properties (Pintelon and Schoukens 2012, Section 3.4.4):
1. Y_S(k) has zero-mean value: E{Y_S(k)} = 0.
2. Y_S(k) is uncorrelated with – but not independent of – U(k): E{Y_S(k)Ū(k)} = 0.
3. Y_S(k) is asymptotically (N → ∞) normally distributed.
4. Y_S(k) is asymptotically (N → ∞) uncorrelated over the frequency.
These second-order properties are exactly the same as those of a filtered white noise disturbance, except that the noise is independent of the input. It shows that it is impossible to distinguish the nonlinear distortions y_s(t) from the disturbing noise v(t) in FRF measurements using stationary random excitations (only second-order statistics are involved in (22)–(24)).

Using random phase multisines, it is possible to detect and quantify the nonlinear distortions because y_s(t) is then periodically related to the input u(t) (property of the NL PISPO system). Indeed, analyzing the FRF over consecutive signal periods quantifies the variance of the noise v(t) (y_s(t) does not change over the periods), while analyzing the FRF over different random phase realizations of the input quantifies the sum of the noise variance and the variance of the nonlinear distortions (y_s(t) depends on the random phase realization of the input). Subtracting both variances gives an estimate of the variance of the nonlinear distortions. While this variance quantifies exactly the variability of the nonparametric FRF estimate due to the nonlinear distortions, it can (significantly) underestimate the variability of a parametric plant model. The basic reason for this is that the true variance of the parametric plant model also depends on the nonzero higher (>2) order moments between the input u(t) and the nonlinear distortions y_s(t).

User Choices
There is no clear answer to the question which of the presented techniques is the best. It strongly depends on the intended use of the nonparametric estimates and the particular application handled. For example, the intended use can be:
1. A smooth representation of the FRF
2. Use of the nonparametric estimates as an intermediate step for parametric modeling of the plant
In the first case, one should opt for the minimum mean square error solution, while in the second case, it is crucial that the nonparametric estimates are unbiased, possibly at the price of an increased variance. Indeed, the parametric plant modeling step cannot eliminate the bias error in the nonparametric estimates while it suppresses the variance error.

The application-dependent answers to the following questions strongly influence the choice and the settings of the method used:
1. Is a large frequency resolution needed and/or is leakage the dominant error?
2. Is the noise or the leakage error dominant?
3. Is it necessary to detect and quantify the nonlinear behavior?
If the answer to the first question is yes, then one should opt for one of the advanced methods (section "Advanced Methods") or use the spectral analysis estimates (section "Spectral Analysis Method") with a small number M of subrecords. On the other hand, if the noise error is dominant, then M in (22)–(24) should be chosen as large as possible. To detect and quantify the nonlinear effects, one should use periodic signals (random phase multisines) combined with the ETFE (section "Empirical Transfer Function Estimation").

Finally, comparing the different nonparametric techniques is also not straightforward because of their different
1. Frequency resolution
2. Quality of the estimated noise model
3. Correlation length over the frequency
The latter is set by the spectral width of the window used in the spectral analysis method and by the local bandwidth in the advanced methods.

Guidelines
While the previous sections give well-established facts about the different nonparametric techniques, in this section we provide some advice/guidelines based on our personal interpretation of these facts:
• Always store the reference signal together with the observed input-output signals. The knowledge of the reference signal allows one to solve nonparametrically the closed loop and errors-in-variables problems.
• Whenever possible use periodic excitation signals (random phase multisines): they allow one to estimate from one experiment the FRF, the noise level, and the level of the nonlinear distortions. As such, the deviation of the true dynamic behavior from the ideal linear time-invariant framework is quantified.
• Select one of the advanced methods if frequency resolution is of prime interest.
• If the goal of the identification experiment is to minimize the prediction error, then the Gaussian process regression method is a very promising approach.
• For lowly damped systems and a limited frequency resolution, the local rational method is a good candidate solution.
• Use a minimum mean square solution for a smooth representation of the FRF.
• Choose unbiased nonparametric estimates for use in parametric plant modeling (estimation, validation, and model selection).
• When comparing nonparametric techniques, always take into account all aspects of the estimates: the bias and variance of the FRF and noise model, the frequency resolution, and the correlation length over the frequency.

Summary and Future Directions
Nonparametric techniques are very useful because they simplify the parametric plant modeling in the initial selection of the model complexity and in the detection of unmodeled dynamics. The classical correlation and spectral analysis methods developed in the 1950s and refined till the 1980s are still widely used. Recently, advanced time- and frequency-domain
methods have been developed which all try to minimize the sensitivity (bias and variance) of the nonparametric estimates to disturbing noise, nonlinear distortion, and transient (leakage) errors. The renewed research interest in nonparametric techniques should be continued to handle the following challenging problems: short data sets, missing data, detection and quantification of time-variant behavior, modeling of time-variant dynamics, and modeling of nonlinear dynamics.

Cross-References
▶ Frequency Domain System Identification
▶ Frequency-Response and Frequency-Domain Models
▶ System Identification: An Overview
▶ System Identification Techniques: Convexification, Regularization, and Relaxation

Recommended Reading
The classical correlation (see section "Correlation Methods") and spectral analysis (see section "Spectral Analysis Method") methods are well covered by the text books listed below. The recommended reading list includes the basic papers on the spectral analysis methods (Blackman and Tukey 1958; Welch 1967; Wellstead 1981) and the most recent developments described in sections "Gaussian Process Regression" and "Advanced Methods."

Acknowledgments This work is sponsored by the Research Foundation Flanders (FWO-Vlaanderen), the Flemish Government (Methusalem Fund, METH1), the Belgian Federal Government (Interuniversity Attraction Poles programme IAP VII, DYSCO), and the European Research Council (ERC Advanced Grant SNLSID).

Bibliography
Antoni J, Schoukens J (2007) A comprehensive study of the bias and variance of frequency-response-function measurements: optimal window selection and overlapping strategies. Automatica 43(10):1723–1736
Bendat JS, Piersol AG (1980) Engineering applications of correlations and spectral analysis. Wiley, New York
Blackman RB, Tukey JW (1958) The measurement of power spectra from the point of view of communications engineering – Part II. Bell Syst Tech J 37(2):485–569
Brillinger DR (1981) Time series: data analysis and theory. McGraw-Hill, New York
Broersen PMT (2004) Mean square error of the empirical transfer function estimator for stochastic input signals. Automatica 40(1):95–100
Carter GC, Nuttall AH (1980) On the weighted overlapped segment averaging method for power spectral estimation. Proc IEEE 68(10):1352–1353
Chen T, Ohlsson H, Ljung L (2012) On the estimation of transfer functions, regularizations and Gaussian processes – revisited. Automatica 48(8):1525–1535
Enqvist M, Ljung L (2005) Linear approximations of nonlinear FIR systems for separable input processes. Automatica 41(3):459–473
Eykhoff P (1974) System identification. Wiley, New York
Godfrey K (1993) Perturbation signals for system identification. Prentice-Hall, Englewood Cliffs
Hägg P, Hjalmarsson H (2012) Non-parametric frequency response function estimation using transient impulse response modelling. Paper presented at the 16th IFAC symposium on system identification, Brussels, 11–13 July, pp 43–48
Heath WP (2007) Choice of weighting for averaged nonparametric transfer function estimates. IEEE Trans Autom Control 52(10):1914–1920
Ljung L (1999) System identification: theory for the user, 2nd edn. Prentice-Hall, Upper Saddle River
McKelvey T, Guérin G (2012) Non-parametric frequency response estimation using a local rational model. Paper presented at the 16th IFAC symposium on system identification, Brussels, 11–13 July, pp 49–54
Pillonetto G, Chiuso A, De Nicolao G (2011) Prediction error identification of linear systems: a nonparametric Gaussian regression approach. Automatica 47(2):291–305
Pintelon R, Schoukens J (2012) System identification: a frequency domain approach, 2nd edn. IEEE, Piscataway/Wiley, Hoboken
Schoukens J, Rolain Y, Pintelon R (2006) Analysis of windowing/leakage effects in frequency response function measurements. Automatica 42(1):27–38
Stenman A, Gustafsson F, Rivera DE, Ljung L, McKelvey T (2000) On adaptive smoothing of empirical transfer function estimates. Control Eng Pract 8(11):1309–1315
Welch PD (1967) The use of the fast Fourier transform for the estimation of power spectra: a method based on time averaging over short, modified periodograms. IEEE Trans Audio Electroacoust 15(2):70–73
Wellstead PE (1981) Non-parametric methods of system identification. Automatica 17(1):55–69
918 Numerical Methods for Continuous-Time Stochastic Control Problems

Numerical Methods for Continuous-Time Stochastic Control Problems

George Yin
Department of Mathematics, Wayne State University, Detroit, MI, USA

Abstract
This expository article provides a brief review of numerical methods for stochastic control in continuous time. It concentrates on the methods of Markov chain approximation for controlled diffusions. Leaving most of the technical details out with the broad general audience in mind, it aims to serve as an introductory reference or a user's guide for researchers, practitioners, and students who wish to know something about numerical methods for stochastic control.

Keywords
Markov chain approximation; Numerical methods; Stochastic control

Introduction
This expository article provides a brief review of numerical methods for stochastic control in continuous time. Leaving most of the technical details out with the broad general audience in mind, it aims to serve as an introductory reference for researchers, practitioners, and students who wish to know something about numerical methods for stochastic controls.

The study of stochastic control has witnessed tremendous progress in the last few decades; see, for example, Fleming and Rishel (1975), Fleming and Soner (1992), Kushner (1977), and Yong and Zhou (1999) among others, for fundamentals of stochastic controls as well as historical remarks. Much of the development has been accompanied by the needs and progress in science, engineering, as well as finance. Typically, the problems are highly nonlinear, so a closed-form solution is very difficult to obtain. As a result, designing feasible numerical algorithms becomes vitally important. Among the many approximation methods, the Markov chain approximation methods have shown the most promising features. Primarily for treating diffusions, the Markov chain approximation method was initiated in the 1970s (Kushner 1977) and substantially developed further in Kushner (1990b) and Kushner and Dupuis (1992). Nowadays, such methods are used for more complex jump diffusions, or systems with random switchings. There were also efforts to incorporate the methods into an expert system so that the methods can be placed into an easily usable tool box (Chancelier et al. 1986, 1987). In addition to the existing applications in a wide variety of engineering problems, recent applications include such areas as insurance, quantile hedging for guaranteed minimum death benefits, dividend payment and investment strategies with capital injection, singular control, risk management, portfolio selection with bounded constraints, and production planning and manufacturing problems; see Jin et al. (2011, 2012, 2013), Sethi and Zhang (1994), and Yin et al. (2009) and references therein.

Let us begin with the controlled diffusion problem. We wish to minimize the cost function defined by

  J(x, u(·)) = E_x [ ∫_0^τ R(X(t), u(t)) dt + B(X(τ)) ],   (1)

with the R^r-valued process X(t) given by the solution of the stochastic differential equation

  dX(t) = b(X(t), u(t)) dt + σ(X(t)) dW,  X(0) = x   (2)
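For a concrete feel of (1)–(2), the cost of a fixed control can be estimated by simulating (2) with the Euler–Maruyama scheme and averaging (1) over sample paths. The scalar drift b, diffusion σ, running cost R, boundary cost B, and domain D = (0, 1) below are illustrative choices, not taken from the entry:

```python
import random

def cost_estimate(x0, u, n_paths=500, dt=1e-3, seed=0):
    """Monte Carlo estimate of J(x, u(.)) in (1) for a constant control u."""
    rng = random.Random(seed)
    b = lambda x, v: -x + v          # drift b(x, u) (illustrative)
    sigma = lambda x: 0.3            # diffusion coefficient (illustrative)
    R = lambda x, v: x * x + v * v   # running cost (illustrative)
    B = lambda x: 0.0                # boundary cost (illustrative)
    total = 0.0
    for _ in range(n_paths):
        x, J = x0, 0.0
        while 0.0 < x < 1.0:         # run until the exit time tau from D
            J += R(x, u) * dt
            # Euler-Maruyama step for dX = b dt + sigma dW
            x += b(x, u) * dt + sigma(x) * dt ** 0.5 * rng.gauss(0.0, 1.0)
        total += J + B(x)
    return total / n_paths

print(cost_estimate(0.5, u=0.0))
```

Such a brute-force evaluation handles one fixed control at a time; the Markov chain approximation described next instead optimizes over controls on a discretized state space.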
where x ∈ R^r, u(·) is a U-valued, measurable process with U ⊂ R^d being a compact control set, W(·) is an r-dimensional standard Brownian motion, and τ is the first exit time of the diffusion from a bounded domain D, that is, τ = min{t : X(t) ∉ D⁰} with D⁰ denoting the interior of D, b(·,·): R^r × R^d → R^r, σ(·): R^r → R^r × R^r, and R(·,·): R^r × R^d → R and B(·): R^r → R. In the above, b(·) is the control-dependent drift, σ(·) is the diffusion matrix, R(·) is the running cost, and B(·) is the terminal or boundary cost. Throughout the entry, we assume that the stopping time τ < ∞ with probability one (w.p.1) for simplicity. Denote the value function by V(x) = inf_u J(x, u(·)), where the inf is taken over all admissible controls. Write the transpose of Y ∈ R^{d₁×d₂} as Y′ with d₁, d₂ ≥ 1, a(x) = σ(x)σ′(x), and define the generator of the controlled Markov process by

  L^u f(x) = (1/2) tr(a(x) f_xx(x)) + b′(x, u) f_x(x),   (3)

for a suitably smooth function f(·), where f_x(·) and f_xx(·) denote the gradient and Hessian of f(·), respectively. Note that the operator is control dependent. Using ∂D to denote the boundary of D, the associated Hamilton-Jacobi-Bellman (HJB) equation satisfied by the value function is given by

  inf_u [L^u V(x) + R(x, u)] = 0,  x ∈ D⁰,
  V(x) = B(x),  x ∈ ∂D.   (4)

The subject matter of this article is to solve the optimal stochastic control problem numerically.

The rest of the entry is arranged as follows. Section "Markov Chain Approximation" focuses on Markov chain approximation. It illustrates how one can construct the controlled Markov chain in discrete time for the approximation of the continuous-time stochastic control problems. Section "Illustration: A One-Dimensional Problem" uses a one-dimensional case as an example for illustration. Section "Numerical Computation" discusses the implementation issues. We conclude the entry with a few further remarks.

Markov Chain Approximation
The main idea was initiated in Kushner (1977) and streamlined, extended, and further developed in Kushner and Dupuis (1992). An earlier paper describing how to discretize the elliptic HJB equation and then interpret it according to a controlled Markov chain can be found in Kushner and Kleinman (1968). This section illustrates the Markov chain approximation methods with a simple setup. The reader is encouraged to read the references mentioned above for a comprehensive treatment. To begin, let h > 0 be a small "step size" in the approximation. Instead of the domain D, we need to work with a finite set to ensure computational feasibility. Set R_h^r to be the r-dimensional lattice, i.e., R_h^r = {..., −2h, −h, 0, h, 2h, ...}^r (an r-dimensional product of the indicated set). Denote the interior of D by D⁰, and define D_h⁰ = D⁰ ∩ R_h^r.

We shall construct a controlled, discrete-time Markov chain, whose transition probabilities have desired properties in line with the controlled diffusion and whose values are in D_h⁰. Suppose that {α_n^h} is a time-homogeneous, discrete-time, controlled Markov chain with finite state space D_h⁰ and transition probabilities P = (p(x, y|v)) with x, y ∈ D_h⁰. Here we only consider the case that the Markov chain has a finite state space. This is sufficient for our computational purposes. At any time n, the control action is a random variable denoted by u_n^h taking values in a compact set U. Set the interpolation interval by Δt^h(x, v) > 0 and write Δt_n^h = Δt^h(α_n^h, u_n^h) such that sup_{x,v} Δt^h(x, v) → 0 as h → 0 but inf_{x,v} Δt^h(x, v) > 0 for each h > 0. The control is admissible if the Markov property P(α_{n+1}^h = y | α_i^h, u_i^h; i ≤ n) = P(α_{n+1}^h = y | α_n^h, u_n^h) = p(α_n^h, y | u_n^h) holds. Use U^h to denote the collection of controls, which
are determined by a sequence of measurable functions F_n^h(·) such that u_n^h = F_n^h(α_k^h, k ≤ n; u_k^h, k < n). Denote the conditional expectation given {α_j^h, u_j^h : j ≤ n, α_n^h = x, u_n^h = v} by E_n^h. We say that a control policy is locally consistent if

  E_n^h Δα_n^h = b(x, v) Δt^h(x, v) + o(Δt^h(x, v)),
  E_n^h [Δα_n^h − E_n^h Δα_n^h][Δα_n^h − E_n^h Δα_n^h]′ = a(x) Δt^h(x, v) + o(Δt^h(x, v)),   (5)
  a(x) = σ(x)σ′(x),  |Δα_n^h| → 0 as h → 0 uniformly in n, ω,

where Δα_n^h = α_{n+1}^h − α_n^h. The meaning of the local consistency can be seen from the corresponding controlled diffusion (2) (with X(0) = x and u(t) = v for t ∈ [0, δ], where δ > 0 is a small parameter) in that E_x(X(δ) − x) = b(x, v)δ + o(δ), E_x[X(δ) − x][X(δ) − x]′ = a(x)δ + o(δ). Let τ_h be the first time that {α_n^h} leaves the set D_h⁰. We have an approximation for the cost function of the controlled diffusion (1) given by

  J^h(x, u^h) = E_x^{u^h} [ Σ_{j=0}^{N_h − 1} R(α_j^h, u_j^h) Δt_j^h + B(α_{N_h}^h) ].   (6)

Define t_n^h = Σ_{j=0}^{n−1} Δt_j^h and the continuous-time interpolations α^h(t) = α_n^h, u^h(t) = u_n^h for t ∈ [t_n^h, t_{n+1}^h). Define the first exit time of α^h(·) from D_h⁰ by τ_h = t_{N_h}^h. Corresponding to the continuous-time problems, the first term on the right-hand side of (6) represents the running cost and the last term gives the terminal cost. Denote the value function by V^h(x). Then it satisfies the dynamic programming equation

  V^h(x) = inf_{v∈U} [ R(x, v) Δt^h(x, v) + Σ_y p^h(x, y|v) V^h(y) ],  x ∈ D_h⁰,
  V^h(x) = B(x),  x ∉ D_h⁰.   (7)

Proving the convergence of the numerical algorithms is an important task. This requires the use of local consistency, interpolation of the approximating sequences in continuous time, as well as martingale representation. The proof is facilitated by the use of the so-called relaxed controls (Kushner and Dupuis 1992, p. 267), which enables us to characterize the limit under the framework of weak convergence. The detailed argument is beyond the scope of this entry. We refer the reader to Kushner and Dupuis (1992, Chapter 10) for further reading on the proof of convergence and the conditions needed.

Illustration: A One-Dimensional Problem
In this section, we use a one-dimensional example to illustrate the Markov chain approximation methods, which enables us to present the results with a better visualization. Consider (2) with x ∈ R. We proceed to find the transition probabilities and interpolation intervals for the Markov chain {α_n^h}. To construct a controlled Markov chain that is locally consistent, we first consider a special case, namely, the control space has only one admissible control u^h ∈ U^h. In this case, the min in (7) can be removed. Discretize the HJB equation using the upwind finite difference method with step size h > 0 by

  V(x) → V^h(x),
  V_x(x) → [V^h(x + h) − V^h(x)]/h  for b(x, v) > 0,
  V_x(x) → [V^h(x) − V^h(x − h)]/h  for b(x, v) < 0,
  V_xx(x) → [V^h(x + h) − 2V^h(x) + V^h(x − h)]/h².

For x ∈ D_h⁰, it leads to
Numerical Methods for Continuous-Time Stochastic Control Problems 921

V h .x C h/  V h .x/ C V h .x/  V h .x  h/ 
b .x; v/  b .x; v/
h h
a.x/ V .x C h/  2V .x/ C V .x  h/
h h h
C C R.x; v/ D 0;
2 h2

where b C and b  are the positive and nega- 2. Find an improved control by P
tive parts of b, respectively. Comparing with the uh;nC1 .x/ WD argminv2U h Πy p h ..x; y/jv/
dynamic programming equation, we obtain the V h;n .y/ C R.x; v/t h .x; v/:
transition probabilities 3. Find V h;nC1 ./ with uh;nC1 ./ by solving (7). If
jV h;nC1  V h;n j > ", go to Step 2 above with
.a.x/=2/ C hb C .x; v/ n ! n C 1.
p h .x; x C h/jv/ D ;
Q 
.a.x/=2/ C hb .x; v/
p h .x; x  h/jv/ D ; Further Remarks
Q 2
h
p h ./ D 0; otherwise, t h .x; v/ D ; Variations of the Problems. Variants of the
Q problems can be considered. For example, one
may consider nonlinear filtering problems or sin-
with Q D a.x/ C hjb.x; v/j being well defined. gularly perturbed control and filtering problems.
With the transition probabilities given above, we can proceed to verify the local consistency by straightforward calculations and prove the desired convergence.

Numerical Computation

To numerically approximate the controlled diffusions, frequently used methods are either value iterations or policy iterations (iteration in policy space). Using Markov chain approximation in conjunction with either value iteration or iteration in policy space, we can further obtain a sequence of value functions $\{V^{h,n}\}$ such that $V^{h,n} \to V^h$ as $n \to \infty$. The procedures can be described as follows.

Value Iteration
1. Given a tolerance $\varepsilon > 0$, set $n = 0$; for $x \in D_h^0$, set $V^{h,0} = $ constant (for instance, 0).
2. Using $V^{h,n}$ obtained in (7), obtain $V^{h,n+1}$.
3. If $|V^{h,n+1} - V^{h,n}| > \varepsilon$, go to Step 2 above with $n \to n+1$.

Policy Iteration
1. Given a tolerance $\varepsilon > 0$, set $n = 0$; for $x \in D_h^0$, take an initial control $u_0^h(x) = $ constant. Use $u_0^h(x)$ in lieu of $v$, and solve (7) to find $V^{h,0}(\cdot)$.

For problems arising in manufacturing systems, one often needs to treat controlled Markov chains with no diffusion terms. Such a case can also be handled by the Markov chain approximation methods; see Sethi and Zhang (1994) for the problem and Yin and Zhang (2013, Chapter 9) for the numerical methods. In this article, we mainly discussed a probabilistic approach for getting the weak convergence of the interpolations of the controlled Markov chain. One can also use the so-called viscosity solution methods to treat the convergence; see Barles and Souganidis (1991) (also Kushner and Dupuis 1992, Chapter 11).

Variance Control. In this entry, only the drift involves a control term. When the diffusion term is also subject to controls, the problem becomes more difficult. In Peng (1990), the idea of using backward stochastic differential equations was initiated, which has had a significant impact on the development of such stochastic control problems. Detailed discussions can be found in Yong and Zhou (1999). The numerical problems for a diffusion term involving controls can also be treated; see Kushner (2000) for further discussion. In this case, so-called numerical noise or numerical viscosity can be introduced, so care must be taken.
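As a purely illustrative sketch of the value iteration described above, the following Python fragment runs the dynamic programming recursion on a randomly generated controlled Markov chain. The transition matrices, costs, and discount factor are our own stand-ins for the locally consistent transition probabilities $p^h$ and the factor $e^{-\delta\,\Delta t^h}$ of a Markov chain approximation; they are not taken from any specific model.

```python
import numpy as np

# Illustrative sizes (hypothetical): 50 grid states, 5 control values.
num_states, num_controls = 50, 5
rng = np.random.default_rng(0)

# p[u, x, y]: transition probability from state x to y under control u.
# Random stochastic matrices stand in for the locally consistent
# transition probabilities p^h(x, y | u) of a Markov chain approximation.
p = rng.random((num_controls, num_states, num_states))
p /= p.sum(axis=2, keepdims=True)

g = rng.random((num_states, num_controls))    # running cost g(x, u)
discount = 0.95                               # stands in for e^{-delta * dt^h}

def value_iteration(p, g, discount, tol=1e-8, max_iter=10_000):
    """Iterate V^{h,n+1}(x) = min_u [ g(x,u) + discount * sum_y p(y|x,u) V^{h,n}(y) ]."""
    V = np.zeros(p.shape[1])
    for n in range(max_iter):
        # Q[x, u] = g(x, u) + discount * E[V(next state) | x, u]
        Q = g + discount * np.einsum('uxy,y->xu', p, V)
        V_new = Q.min(axis=1)
        if np.max(np.abs(V_new - V)) <= tol:   # stopping rule of Step 3
            return V_new, Q.argmin(axis=1), n + 1
        V = V_new
    return V, Q.argmin(axis=1), max_iter

V, policy, iters = value_iteration(p, g, discount)
print(iters, V[:3])
```

Since the map is a contraction with modulus equal to the discount factor, the iterates converge geometrically; the returned `policy` is the minimizing control index at each grid state.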
922 Numerical Methods for Continuous-Time Stochastic Control Problems
Complex Models Involving Jump and Switching. Note that only controlled diffusions are considered in this entry. More complex models such as controlled jump diffusions (Kushner and Dupuis 1992), switching diffusions (Yin and Zhu 2010), and switching jump diffusions can be treated (Song et al. 2006). Differential games can also be treated (Kushner 2002; Song et al. 2008).

Differential Delay Systems. Stochastic differential delay systems may come into play. The corresponding numerical algorithms have been studied extensively in Kushner (2008). Due to their inherent infinite dimensionality, a main issue here concerns suitable finite approximation to the memory segments.

Rates of Convergence. This entry mainly discusses the convergence of the approximation methods. There is also much interest in ascertaining rates of convergence. Such effort goes back to the paper Menaldi (1989) (see also Zhang 2006). Subsequently, there has been a resurgence of effort in dealing with this issue from a nonlinear partial differential equation point of view; see Krylov (2000). Our recent work Song and Yin (2009) complements the study by providing a probabilistic approach for treating switching diffusions.

Stochastic Approximation. In certain optimal control problems, the optimal controls or near-optimal controls turn out to be of threshold type. An alternative way of solving such problems, leading to at least suboptimal or near-optimal control, is to use a stochastic approximation approach; see Kushner and Yin (2003) for a comprehensive treatment of stochastic approximation algorithms. Some successful examples include manufacturing systems (Yin and Zhang 2013, Section 9.3) and liquidation decision making (Yin et al. 2002).

Cross-References

Stochastic Dynamic Programming
Stochastic Maximum Principle

Acknowledgments The research of this author was supported in part by the Army Research Office under grant W911NF-12-1-0223.

Bibliography

Barles G, Souganidis P (1991) Convergence of approximation schemes for fully nonlinear second order equations. J Asymptot Anal 4:271–283
Bertsekas DP, Castanon DA (1989) Adaptive aggregation methods for infinite horizon dynamic programming. IEEE Trans Autom Control 34:589–598
Chancelier P, Gomez C, Quadrat J-P, Sulem A, Blankenship GL, La Vigna A, MaCenary DC, Yan I (1986) An expert system for control and signal processing with automatic FORTRAN program generation. In: Mathematical systems symposium, Stockholm. Royal Institute of Technology, Stockholm
Chancelier P, Gomez C, Quadrat J-P, Sulem A (1987) Automatic study in stochastic control. In: Fleming W, Lions PL (eds) IMA volume in mathematics and its applications, vol 10. Springer, Berlin
Crandall MG, Lions PL (1983) Viscosity solutions of Hamilton-Jacobi equations. Trans Am Math Soc 277:1–42
Crandall MG, Ishii H, Lions PL (1992) User's guide to viscosity solutions of second order partial differential equations. Bull Am Math Soc 27:1–67
Fleming WH, Rishel RW (1975) Deterministic and stochastic optimal control. Springer, New York
Fleming WH, Soner HM (1992) Controlled Markov processes and viscosity solutions. Springer, New York
Jin Z, Wang Y, Yin G (2011) Numerical solutions of quantile hedging for guaranteed minimum death benefits under a regime-switching-jump-diffusion formulation. J Comput Appl Math 235:2842–2860
Jin Z, Yin G, Zhu C (2012) Numerical solutions of optimal risk control and dividend optimization policies under a generalized singular control formulation. Automatica 48:1489–1501
Jin Z, Yang HL, Yin G (2013) Numerical methods for optimal dividend payment and investment strategies of regime-switching jump diffusion models with capital injections. Automatica 49:2317–2329
Krylov NV (2000) On the rate of convergence of finite-difference approximations for Bellman's equations with variable coefficients. Probab Theory Relat Fields 117:1–16
Kushner HJ (1977) Probability methods for approximation in stochastic control and for elliptic equations. Academic, New York
Kushner HJ (1990a) Weak convergence methods and singularly perturbed stochastic control and filtering problems. Birkhäuser, Boston
Kushner HJ (1990b) Numerical methods for stochastic control problems in continuous time. SIAM J Control Optim 28:999–1048
Kushner HJ (2000) Consistency issues for numerical methods for variance control with applications to optimization in finance. IEEE Trans Autom Control 44:2283–2296
Kushner HJ (2002) Numerical approximations for stochastic differential games. SIAM J Control Optim 40:457–486
Kushner HJ (2008) Numerical methods for controlled stochastic delay systems. Birkhäuser, Boston
Kushner HJ, Dupuis PG (1992) Numerical methods for stochastic control problems in continuous time. Springer, New York
Kushner HJ, Kleinman AJ (1968) Numerical methods for the solution of the degenerate nonlinear elliptic equations arising in optimal stochastic control theory. IEEE Trans Autom Control AC-13:344–353
Kushner HJ, Yin G (2003) Stochastic approximation and recursive algorithms and applications, 2nd edn. Springer, New York
Menaldi J (1989) Some estimates for finite difference approximations. SIAM J Control Optim 27:579–607
Peng S (1990) A general stochastic maximum principle for optimal control problems. SIAM J Control Optim 28:966–979
Sethi SP, Zhang Q (1994) Hierarchical decision making in stochastic manufacturing systems. Birkhäuser, Boston
Song QS, Yin G (2009) Rates of convergence of numerical methods for controlled regime-switching diffusions with stopping times in the costs. SIAM J Control Optim 48:1831–1857
Song QS, Yin G, Zhang Z (2006) Numerical method for controlled regime-switching diffusions and regime-switching jump diffusions. Automatica 42:1147–1157
Song QS, Yin G, Zhang Z (2008) Numerical solutions for stochastic differential games with regime switching. IEEE Trans Autom Control 53:509–521
Warga J (1962) Relaxed variational problems. J Math Anal Appl 4:111–128
Yan HM, Yin G, Lou SXC (1994) Using stochastic optimization to determine threshold values for control of unreliable manufacturing systems. J Optim Theory Appl 83:511–539
Yin G, Zhang Q (2013) Continuous-time Markov chains and applications: a two-time-scale approach, 2nd edn. Springer, New York
Yin G, Zhu C (2010) Hybrid switching diffusions: properties and applications. Springer, New York
Yin G, Liu RH, Zhang Q (2002) Recursive algorithms for stock liquidation: a stochastic optimization approach. SIAM J Optim 13:240–263
Yin G, Jin H, Jin Z (2009) Numerical methods for portfolio selection with bounded constraints. J Comput Appl Math 233:564–581
Yong J, Zhou XY (1999) Stochastic controls: Hamiltonian systems and HJB equations. Springer, New York
Zhang J (2006) Rate of convergence of finite difference approximations for degenerate ODEs. Math Comput 75:1755–1778

Numerical Methods for Nonlinear Optimal Control Problems

Lars Grüne
Mathematical Institute, University of Bayreuth, Bayreuth, Germany

Abstract

In this article we describe the three most common approaches for numerically solving nonlinear optimal control problems governed by ordinary differential equations. For computing approximations to optimal value functions and optimal feedback laws, we present the Hamilton-Jacobi-Bellman approach. For computing approximately optimal open-loop control functions and trajectories for a single initial value, we outline the indirect approach based on Pontryagin's maximum principle and the approach via direct discretization.

Keywords

Direct discretization; Hamilton-Jacobi-Bellman equations; Optimal control; Ordinary differential equations; Pontryagin's maximum principle

Introduction

This article concerns optimal control problems governed by nonlinear ordinary differential equations of the form

$$\dot x(t) = f(x(t), u(t)) \qquad (1)$$

with $f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$. We assume that for each initial value $x \in \mathbb{R}^n$ and measurable control function $u(\cdot) \in L^\infty(\mathbb{R}, \mathbb{R}^m)$ there exists a unique solution $x(t) = x(t, x, u(\cdot))$ of (1) satisfying $x(0, x, u(\cdot)) = x$.

Given a state constraint set $X \subseteq \mathbb{R}^n$ and a control constraint set $U \subseteq \mathbb{R}^m$, a running cost $g : X \times U \to \mathbb{R}$, a terminal cost $F : X \to \mathbb{R}$, and
a discount rate $\delta \ge 0$, we consider the optimal control problem

$$\operatorname*{minimize}_{u(\cdot)\in\mathcal{U}^T(x)} J^T(x, u(\cdot)) \qquad (2)$$

where

$$J^T(x,u(\cdot)) := \int_0^T e^{-\delta s}\, g(x(s,x,u(\cdot)),u(s))\,ds + e^{-\delta T} F(x(T,x,u(\cdot))) \qquad (3)$$

and

$$\mathcal{U}^T(x) := \left\{\, u(\cdot)\in L^\infty(\mathbb{R},U) \,\middle|\, x(s,x,u(\cdot))\in X \text{ for all } s\in[0,T] \,\right\} \qquad (4)$$

In addition to this finite horizon optimal control problem, we also consider the infinite horizon problem in which $T$ is replaced by "$\infty$," i.e.,

$$\operatorname*{minimize}_{u(\cdot)\in\mathcal{U}^\infty(x)} J^\infty(x, u(\cdot)) \qquad (5)$$

where

$$J^\infty(x,u(\cdot)) := \int_0^\infty e^{-\delta s}\, g(x(s,x,u(\cdot)),u(s))\,ds \qquad (6)$$

and

$$\mathcal{U}^\infty(x) := \left\{\, u(\cdot)\in L^\infty(\mathbb{R},U) \,\middle|\, x(s,x,u(\cdot))\in X \text{ for all } s\ge 0 \,\right\}, \qquad (7)$$

respectively.

The term "solving" (2)–(4) or (5)–(7) can have various meanings. First, the optimal value functions

$$V^T(x) = \inf_{u(\cdot)\in\mathcal{U}^T(x)} J^T(x,u(\cdot))$$

or

$$V^\infty(x) = \inf_{u(\cdot)\in\mathcal{U}^\infty(x)} J^\infty(x,u(\cdot))$$

may be of interest. Second, and often more importantly, one would like to know the optimal control policy. This can be expressed in open-loop form $u^* : \mathbb{R} \to U$, in which the function $u^*$ depends on the initial value $x$ and on the initial time, which we set to 0 here. Alternatively, the optimal control can be computed in state- and time-dependent closed-loop form, in which a feedback law $\mu^* : \mathbb{R}\times X \to U$ is sought. Via $u^*(t) = \mu^*(t,x(t))$, this feedback law can then be used in order to generate the time-dependent optimal control function for all possible initial values. Since the feedback law is evaluated along the trajectory, it is able to react to perturbations and uncertainties which may make $x(t)$ deviate from the predicted path. Finally, knowing $u^*$ or $\mu^*$, one can reconstruct the corresponding optimal trajectory by solving

$$\dot x(t) = f(x(t),u^*(t)) \quad\text{or}\quad \dot x(t) = f(x(t),\mu^*(t,x(t))).$$

Hamilton-Jacobi-Bellman Approach

In this section we describe the numerical approach to solving optimal control problems via Hamilton-Jacobi-Bellman equations. We first describe how this approach can be used in order to compute approximations to the optimal value functions $V^T$ and $V^\infty$, respectively, and afterwards how the optimal control can be synthesized using these approximations. In order to formulate this approach for finite horizon $T$, we interpret $V^T(x)$ as a function in $T$ and $x$. We denote differentiation w.r.t. $T$ and $x$ with subscripts $T$ and $x$, i.e., $V^T_x(x) = dV^T(x)/dx$, $V^T_T(x) = dV^T(x)/dT$, etc.

We define the Hamiltonian of the optimal control problem as

$$H(x,p) := \max_{u\in U}\{-g(x,u) - p\cdot f(x,u)\},$$

with $x,p\in\mathbb{R}^n$, $f$ from (1), $g$ from (3) or (6), and "$\cdot$" denoting the inner product in $\mathbb{R}^n$. Then, under appropriate regularity conditions on the problem data, the optimal value functions $V^T$ and $V^\infty$ satisfy the first order partial differential equations (PDEs)
$$V^T_T(x) + \delta V^T(x) + H(x, V^T_x(x)) = 0$$

and

$$\delta V^\infty(x) + H(x, V^\infty_x(x)) = 0$$

in the viscosity solution sense. In the case of $V^T$, the equation holds for all $T \ge 0$ with the boundary condition $V^0(x) = F(x)$.

The framework of viscosity solutions is needed because in general the optimal value functions will not be smooth; thus, a generalized solution concept for PDEs must be employed (see Bardi and Capuzzo Dolcetta 1997). Of course, appropriate boundary conditions are needed at the boundary of the state constraint set $X$.

Once the Hamilton-Jacobi-Bellman characterization is established, one can compute numerical approximations to $V^T$ or $V^\infty$ by solving these PDEs numerically. To this end, various numerical schemes have been suggested, including various types of finite element and finite difference schemes. Among those, semi-Lagrangian schemes (Falcone 1997; Falcone and Ferretti 2013) allow for a particularly elegant interpretation in terms of optimal control synthesis, which we explain for the infinite horizon case.

In the semi-Lagrangian approach, one takes advantage of the fact that by the chain rule, for $p = V^\infty_x(x)$ and constant control functions $u$, the identity

$$\delta V^\infty(x) - p\cdot f(x,u) = -\left.\frac{d}{dt}\right|_{t=0} (1-\delta t)\, V^\infty(x(t,x,u))$$

holds. Hence, the left-hand side of this equality can be approximated by the difference quotient

$$\frac{V^\infty(x) - (1-\delta h)\, V^\infty(x(h,x,u))}{h}$$

for small $h > 0$. Inserting this approximation into the Hamilton-Jacobi-Bellman equation, replacing $x(h,x,u)$ by a numerical approximation $\tilde x(h,x,u)$ (in the simplest case, the Euler method $\tilde x(h,x,u) = x + hf(x,u)$), multiplying by $h$, and rearranging terms, one arrives at the equation

$$V^\infty_h(x) = \min_{u\in U}\{\, h\,g(x,u) + (1-\delta h)\, V^\infty_h(\tilde x(h,x,u)) \,\}$$

defining an approximation $V^\infty_h \approx V^\infty$. This is now a purely algebraic dynamic programming-type equation which can be solved numerically, e.g., by using a finite element approach. The equation is typically solved iteratively using a suitable minimization routine for computing the "min" in each iteration (in the simplest case, $U$ is discretized with finitely many values and the minimum is determined by direct comparison). We denote the resulting approximation of $V^\infty$ by $\tilde V^\infty_h$. Here, approximation is usually understood in the $L^\infty$ sense (see Falcone 1997 or Falcone and Ferretti 2013).

The semi-Lagrangian scheme is appealing for synthesis of an approximately optimal feedback because $V^\infty_h$ is the optimal value function of the auxiliary discrete-time problem defined by $\tilde x$. This implies that the expression

$$\mu^*_h(x) := \operatorname*{argmin}_{u\in U}\{\, h\,g(x,u) + (1-\delta h)\, V^\infty_h(\tilde x(h,x,u)) \,\}$$

is an optimal feedback control value for this discrete-time problem for the next time step, i.e., on the time interval $[t, t+h)$ if $x = x(t)$. This feedback law will be approximately optimal for the continuous-time control system when applied as a discrete-time feedback law, and this approximate optimality remains true if we replace $V^\infty_h$ in the definition of $\mu^*_h$ by its numerically computable approximation $\tilde V^\infty_h$. A similar construction can be made based on any other numerical approximation $\tilde V^\infty \approx V^\infty$, but the explicit correspondence of the semi-Lagrangian scheme to a discrete-time auxiliary system facilitates the interpretation and error analysis of the resulting control law.

The main advantage of the Hamilton-Jacobi approach is that it directly computes an approximately optimal feedback law. Its main
disadvantage is that the number of grid nodes needed for maintaining a given accuracy in a finite element approach to compute $\tilde V^\infty_h$ in general grows exponentially with the state dimension $n$. This fact – known as the curse of dimensionality – restricts this method to low-dimensional state spaces. Unless special structure is available which can be exploited, as, e.g., in the max-plus approach (see McEneaney 2006), it is currently almost impossible to go beyond state dimensions of about $n = 10$, typically less for strongly nonlinear problems.

Maximum Principle Approach

In contrast to the Hamilton-Jacobi-Bellman approach, the approach via Pontryagin's maximum principle does not compute a feedback law. Instead, it yields an approximately optimal open-loop control $u^*$ together with an approximation to the optimal trajectory $x^*$ for a fixed initial value. We explain the approach for the finite horizon problem. For simplicity of presentation, we omit state constraints, i.e., we set $X = \mathbb{R}^n$, and refer to, e.g., Vinter (2000), Bryson and Ho (1975), or Grass et al. (2008) for more general formulations as well as for rigorous versions of the following statements.

In order to state the maximum principle (which, since we are considering a minimization problem here, could also be called minimum principle), we define the non-minimized Hamiltonian as

$$H(x,p,u) = g(x,u) + p\cdot f(x,u).$$

Then, under appropriate regularity assumptions, there exists an absolutely continuous function $p : [0,T] \to \mathbb{R}^n$ such that the optimal trajectory $x^*$ and the corresponding optimal control function $u^*$ for (2)–(4) satisfy

$$\dot p(t) = \delta p(t) - H_x(x^*(t), p(t), u^*(t)) \qquad (8)$$

with terminal or transversality condition

$$p(T) = F_x(x^*(T)) \qquad (9)$$

and

$$u^*(t) = \operatorname*{argmin}_{u\in U} H(x^*(t), p(t), u), \qquad (10)$$

for almost all $t \in [0,T]$ (see Grass et al. 2008, Theorem 3.4). The variable $p$ is referred to as the adjoint or costate variable.

For a given initial value $x_0 \in \mathbb{R}^n$, the numerical approach now consists of finding functions $x : [0,T] \to \mathbb{R}^n$, $u : [0,T] \to U$, and $p : [0,T] \to \mathbb{R}^n$ satisfying

$$\dot x(t) = f(x(t), u(t)) \qquad (11)$$
$$\dot p(t) = \delta p(t) - H_x(x(t), p(t), u(t)) \qquad (12)$$
$$u(t) = \operatorname*{argmin}_{u\in U} H(x(t), p(t), u) \qquad (13)$$
$$x(0) = x_0, \quad p(T) = F_x(x(T)) \qquad (14)$$

for $t \in [0,T]$. Depending on the regularity of the underlying data, the conditions (11)–(14) may only be necessary but not sufficient for $x$ and $u$ being an optimal trajectory $x^*$ and control function $u^*$, respectively. However, usually $x$ and $u$ satisfying these conditions are good candidates for the optimal trajectory and control, thus justifying the use of these conditions for the numerical approach. If needed, optimality of the candidates can be checked using suitable sufficient optimality conditions, for which we refer to, e.g., Maurer (1981) or Malanowski et al. (2004). Due to the fact that in the maximum principle approach optimality conditions are first derived and then discretized for numerical simulation, it is also termed first optimize then discretize.

Solving (11)–(14) numerically amounts to solving a boundary value problem, because the condition $x^*(0) = x_0$ is posed at the beginning of the time interval $[0,T]$ while the condition $p(T) = F_x(x^*(T))$ is required at the end. In order to solve such a problem, the simplest approach is the single shooting method, which proceeds as follows.

We select a numerical scheme for solving the ordinary differential equations (11) and (12) for $t \in [0,T]$ with initial conditions $x(0) = x_0$,
$p(0) = p_0$, and control function $u(t)$. Then, we proceed iteratively as follows:
(0) Find initial guesses $p_0^0 \in \mathbb{R}^n$ and $u^0(t)$ for the initial costate and the control, fix $\varepsilon > 0$, and set $k := 0$.
(1) Solve (11) and (12) numerically with initial values $x_0$ and $p_0^k$ and control function $u^k$. Denote the resulting trajectories by $\tilde x^k(t)$ and $\tilde p^k(t)$.
(2) Apply one step of an iterative method for solving the zero-finding problem $G(p) = 0$ with

$$G(p_0^k) := \tilde p^k(T) - F_x(\tilde x^k(T))$$

for computing $p_0^{k+1}$. For instance, in case of the Newton method we get

$$p_0^{k+1} := p_0^k - DG(p_0^k)^{-1} G(p_0^k).$$

If $\|p_0^{k+1} - p_0^k\| < \varepsilon$, stop; else compute

$$u^{k+1}(t) := \operatorname*{argmin}_{u\in U} H(\tilde x^k(t), \tilde p^k(t), u),$$

set $k := k+1$, and go to (1).

The procedure described in this algorithm is called single shooting because the iteration is performed on the single initial value $p_0^k$. For an implementable scheme, several details still need to be made precise, e.g., how to parameterize the function $u(t)$ (e.g., piecewise constant, piecewise linear, or polynomial), how to compute the derivative $DG$ and its inverse (or an approximation thereof), and how to compute the argmin in (2). The last task simplifies considerably if the structure of the optimal control, e.g., the number of switchings in case of a bang-bang control, is known.

However, even if all these points are settled, the set of initial guesses $p_0^0$ and $u^0$ for which the method is going to converge to a solution of (11)–(14) tends to be very small. One reason for this is that the solutions of (11) and (12) typically depend very sensitively on $p_0^0$ and $u^0$. In order to circumvent this problem, multiple shooting can be used. To this end, one selects a time grid $0 = t_0 < t_1 < t_2 < \ldots < t_N = T$ and, in addition to $p_0^k$, introduces variables $x_1^k, \ldots, x_{N-1}^k, p_1^k, \ldots, p_{N-1}^k \in \mathbb{R}^n$. Then, starting from initial guesses $p_0^0$, $u^0$, and $x_1^0, \ldots, x_{N-1}^0, p_1^0, \ldots, p_{N-1}^0$, in each iteration the Eqs. (11)–(14) are solved numerically on the intervals $[t_j, t_{j+1}]$ with initial values $x_j^k$ and $p_j^k$, respectively. We denote the respective solutions in the $k$-th iteration by $\tilde x_j^k$ and $\tilde p_j^k$. In order to enforce that the trajectory pieces computed on the individual intervals $[t_j, t_{j+1}]$ fit together continuously, the map $G$ is redefined as

$$G(x_1^k,\ldots,x_{N-1}^k,\, p_0^k, p_1^k,\ldots,p_{N-1}^k) = \begin{pmatrix} \tilde x_0^k(t_1) - x_1^k \\ \vdots \\ \tilde x_{N-2}^k(t_{N-1}) - x_{N-1}^k \\ \tilde p_0^k(t_1) - p_1^k \\ \vdots \\ \tilde p_{N-2}^k(t_{N-1}) - p_{N-1}^k \\ \tilde p_{N-1}^k(T) - F_x(\tilde x_{N-1}^k(T)) \end{pmatrix}.$$

The benefit of this approach is that the solutions on the shortened time intervals depend much less sensitively on the initial values and the control, thus making the problem numerically much better conditioned. The obvious disadvantage is that the problem becomes larger, as the function $G$ is now defined on a much higher dimensional space, but this additional effort usually pays off.

While the convergence behavior of the multiple shooting method is considerably better than for single shooting, it is still a difficult task to select good initial guesses $x_j^0$, $p_j^0$, and $u^0$. In order to accomplish this, homotopy methods can be used (see, e.g., Pesch 1994), or the result of a direct approach as presented in the next section can be used as an initial guess. The latter can be reasonable as the maximum principle-based approach can yield approximations of higher accuracy than the direct method.

In the presence of state constraints or mixed state and control constraints, the conditions (12)–(14) become considerably more technical and thus more difficult to implement numerically (cf. Pesch 1994).
Direct Discretization

Despite being the most straightforward and simple of the approaches described in this article, the direct discretization approach is currently the most widely used approach for computing single finite horizon optimal trajectories. In the direct approach, we first discretize the problem and then solve a finite dimensional nonlinear optimization problem (NLP), i.e., we first discretize, then optimize. The main reasons for the popularity of this approach are the simplicity with which constraints can be handled and the numerical efficiency due to the availability of fast and reliable NLP solvers.

The direct approach again applies to the finite horizon problem and computes an approximation to a single optimal trajectory $x^*(t)$ and control function $u^*(t)$ for a given initial value $x_0 \in X$. To this end, a time grid $0 = t_0 < t_1 < t_2 < \ldots < t_N = T$ and a set $U_d$ of control functions which are parameterized by finitely many values are selected. The simplest way to do so is to choose $u(t) \equiv u_i \in U$ for all $t \in [t_i, t_{i+1})$. However, other approaches like piecewise linear or piecewise polynomial control functions are possible, too. We use a numerical algorithm for ordinary differential equations in order to approximately solve the initial value problems

$$\dot x(t) = f(x(t), u_i), \quad x(t_i) = x_i \qquad (15)$$

for $i = 0,\ldots,N-1$ on $[t_i, t_{i+1}]$. We denote the exact and numerical solutions of (15) by $x(t; t_i, x_i, u_i)$ and $\tilde x(t; t_i, x_i, u_i)$, respectively. Finally, we choose a numerical integration rule in order to compute an approximation

$$I(t_i, t_{i+1}, x_i, u_i) \approx \int_{t_i}^{t_{i+1}} e^{-\delta t}\, g(x(t; t_i, x_i, u_i), u(t))\,dt.$$

In the simplest case, one might choose $\tilde x$ as the Euler scheme and $I$ as the rectangle rule, leading to

$$\tilde x(t_{i+1}; t_i, x_i, u_i) = x_i + (t_{i+1}-t_i) f(x_i, u_i)$$

and

$$I(t_i, t_{i+1}, x_i, u_i) = (t_{i+1}-t_i)\, e^{-\delta t_i}\, g(x_i, u_i).$$

Introducing the optimization variables $u_0,\ldots,u_{N-1} \in \mathbb{R}^m$ and $x_1,\ldots,x_N \in \mathbb{R}^n$, the discretized version of (2)–(4) reads

$$\operatorname*{minimize}_{x_j\in\mathbb{R}^n,\, u_j\in\mathbb{R}^m}\; \sum_{i=0}^{N-1} I(t_i, t_{i+1}, x_i, u_i) + e^{-\delta T} F(x_N)$$

subject to the constraints

$$u_j \in U, \quad j = 0,\ldots,N-1$$
$$x_j \in X, \quad j = 1,\ldots,N$$
$$x_{j+1} = \tilde x(t_{j+1}; t_j, x_j, u_j), \quad j = 0,\ldots,N-1$$

This way, we have converted the optimal control problem (2)–(4) into a finite dimensional nonlinear optimization problem (NLP). As such, it can be solved with any numerical method for solving such problems. Popular methods are, for instance, sequential quadratic programming (SQP) or interior point (IP) algorithms. The convergence of this approach was proved in Malanowski et al. (1998); for an up-to-date account on the theory and practice of the method, see Gerdts (2012) and Betts (2010). These references also explain how information about the costates $p(t)$ can be extracted from a direct discretization, thus linking the approach to the maximum principle.

The direct method sketched here is again a multiple shooting method, and the benefit of this approach is the same as for solving boundary value problems: thanks to the short intervals $[t_i, t_{i+1}]$, the solutions depend much less sensitively on the data than the solution on the whole interval $[0,T]$, thus making the iterative solution of the resulting discretized NLP much easier. The price to pay is again the increase of the number of optimization variables. However, due to the particular structure of the constraints guaranteeing continuity of the solution, the resulting matrices in the NLP have a particular structure which can be exploited numerically by a method called condensing (see Bock and Plitt 1984).
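As an illustration of the resulting NLP, the following sketch transcribes a toy problem of our own choosing ($\dot x = u$, $g(x,u) = x^2+u^2$, $T=1$, $x(0)=1$, $\delta=0$, $F\equiv 0$, $U = X = \mathbb{R}$) with Euler dynamics and the rectangle rule, and hands it to a generic SQP-type solver (here SciPy's SLSQP); the grid size and solver options are illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize

# Direct transcription of a hypothetical test problem: minimize
# ∫₀¹ (x² + u²) dt with ẋ = u, x(0) = 1, δ = 0, F ≡ 0, on a uniform grid,
# using the Euler scheme and the rectangle rule.
N, T, x0 = 40, 1.0, 1.0
dt = T / N

def split(z):
    return z[:N], z[N:]            # states x_1..x_N, controls u_0..u_{N-1}

def objective(z):
    x, u = split(z)
    xs = np.concatenate(([x0], x[:-1]))      # x_i at the left grid points
    return np.sum(dt * (xs**2 + u**2))       # rectangle rule for the cost

def dynamics_residual(z):
    x, u = split(z)
    xs = np.concatenate(([x0], x[:-1]))
    return x - (xs + dt * u)     # x_{j+1} - (x_j + dt f(x_j, u_j)) = 0

res = minimize(objective, np.zeros(2 * N), method="SLSQP",
               constraints={"type": "eq", "fun": dynamics_residual},
               options={"maxiter": 500, "ftol": 1e-12})
x_opt, u_opt = split(res.x)
print(res.success, objective(res.x))
```

The continuous-time optimal value for this example is $\tanh(1) \approx 0.76$, so the discretized optimum should come out close to that, up to the first-order Euler error. Note how the state and control constraints of the NLP would enter simply as bounds or inequality constraints on the variables, which is the ease of constraint handling mentioned above.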
An alternative to multiple shooting methods are collocation methods, in which the internal variables of the numerical algorithm for solving (15) are also optimization variables. However, nowadays the multiple shooting approach as described above is usually preferred. For a more detailed description of various direct approaches, see also Binder et al. (2001), Sect. 5.

Further Approaches for Infinite Horizon Problems

The last two approaches only apply to finite horizon problems. While the maximum principle approach can be generalized to infinite horizon problems, the necessary conditions become weaker and the numerical solution becomes considerably more involved (see Grass et al. 2008). Both the maximum principle and the direct approach can, however, be applied in a receding horizon fashion, in which an infinite horizon problem is approximated by the iterative solution of finite horizon problems. The resulting control technique is known under the name of model predictive control (MPC; see Grüne and Pannek 2011), and under suitable assumptions, a rigorous approximation result can be established.

Summary and Future Directions

The three main numerical approaches to optimal control are:
• The Hamilton-Jacobi-Bellman approach, which provides a global solution in feedback form but is computationally expensive for higher dimensional systems
• The Pontryagin maximum principle approach, which computes single optimal trajectories with high accuracy but needs good initial guesses for the iteration
• The direct approach, which also computes single optimal trajectories but is less demanding in terms of the initial guesses at the expense of a somewhat lower accuracy

Currently, the main trends in numerical optimal control lie in the areas of Hamilton-Jacobi-Bellman equations and direct discretization. For the former, the focus is on the development of discretization schemes suitable for increasingly higher dimensional problems. For the latter, the popularity of these methods in online applications like MPC triggers continuing effort to make this approach faster and more reliable.

Beyond ordinary differential equations, the development of numerical algorithms for the optimal control of partial differential equations (PDEs) has attracted considerable attention during the last years. While many of these methods are still restricted to linear systems, in the near future we can expect to see many extensions to (classes of) nonlinear PDEs. It is worth noting that for PDEs, maximum principle-like approaches are more popular than for ordinary differential equations.

Cross-References

Discrete Optimal Control
Economic Model Predictive Control
Nominal Model-Predictive Control
Optimal Control and the Dynamic Programming Principle
Optimal Control and Pontryagin's Maximum Principle
Optimization Algorithms for Model Predictive Control

Bibliography

Bardi M, Capuzzo Dolcetta I (1997) Optimal control and viscosity solutions of Hamilton-Jacobi-Bellman equations. Birkhäuser, Boston
Betts JT (2010) Practical methods for optimal control and estimation using nonlinear programming, 2nd edn. SIAM, Philadelphia
Binder T, Blank L, Bock HG, Bulirsch R, Dahmen W, Diehl M, Kronseder T, Marquardt W, Schlöder JP, von Stryk O (2001) Introduction to model based optimization of chemical processes on moving horizons. In: Grötschel M, Krumke SO, Rambau J (eds) Online optimization of large scale systems: state of the art. Springer, Heidelberg, pp 295–340
Bock HG, Plitt K (1984) A multiple shooting algorithm for direct solution of optimal control problems. In: Proceedings of the 9th IFAC world congress, Budapest. Pergamon, Oxford, pp 242–247
Bryson AE, Ho YC (1975) Applied optimal control. Hemisphere Publishing Corp., Washington, DC. Revised printing
Falcone M (1997) Numerical solution of dynamic programming equations. Appendix A in: Bardi M, Capuzzo Dolcetta I, Optimal control and viscosity solutions of Hamilton-Jacobi-Bellman equations. Birkhäuser, Boston
Falcone M, Ferretti R (2013) Semi-Lagrangian approximation schemes for linear and Hamilton-Jacobi equations. SIAM, Philadelphia
Gerdts M (2012) Optimal control of ODEs and DAEs. De Gruyter textbook. Walter de Gruyter & Co., Berlin
Grass D, Caulkins JP, Feichtinger G, Tragler G, Behrens DA (2008) Optimal control of nonlinear processes. Springer, Berlin
Grüne L, Pannek J (2011) Nonlinear model predictive control: theory and algorithms. Springer, London
Malanowski K, Büskens C, Maurer H (1998) Convergence of approximations to nonlinear optimal control problems. In: Fiacco AV (ed) Mathematical programming with data perturbations. Lecture notes in pure and applied mathematics, vol 195. Dekker, New York, pp 253–284
Malanowski K, Maurer H, Pickenhain S (2004) Second-order sufficient conditions for state-constrained optimal control problems. J Optim Theory Appl 123(3):595–617
Maurer H (1981) First and second order sufficient optimality conditions in mathematical programming and optimal control. Math Program Stud 14:163–177
McEneaney WM (2006) Max-plus methods for nonlinear control and estimation. Systems & control: foundations & applications. Birkhäuser, Boston
Pesch HJ (1994) A practical guide to the solution of real-life optimal control problems. Control Cybern 23(1–2):7–60
Vinter R (2000) Optimal control. Systems & control: foundations & applications. Birkhäuser, Boston
O
Observer-Based Control

H.L. Trentelman¹ and Panos J. Antsaklis²
¹ Johann Bernoulli Institute for Mathematics and Computer Science, University of Groningen, Groningen, AV, The Netherlands
² Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN, USA

Abstract

An observer-based controller is a dynamic feedback controller with a two-stage structure. First, the controller generates an estimate of the state variable of the system to be controlled, using the measured output and known input of the system. This estimate is generated by a state observer for the system. Next, the state estimate is treated as if it were equal to the exact state of the system, and it is used by a static state feedback controller. Dynamic feedback controllers with this two-stage structure appear in various control synthesis problems for linear systems. In this entry, we explain observer-based control in the context of internal stabilization by dynamic measurement feedback.

Keywords

Detectability; Dynamic output feedback control; Internal stabilization; Separation principle; Stabilizability; State observers; Static state feedback

Introduction

In this entry, we explain the notion of observer-based feedback control. Given a to-be-controlled system in input-state-output form, together with a control objective, the problem is to design a feedback controller such that the closed-loop system meets the objective. In the case when all state variables of the system are available for control, the design problem is considered to be simpler, and often the controller can be chosen to be a static state feedback control law. In the more general case where the controller has access only to a linear function of the state variables, the problem is more involved and requires the design of a dynamic feedback control law. The key idea of observer-based feedback control is the following. As a first step, one determines a state observer for the system, i.e., a system that estimates the state of the system based on the measured outputs and inputs of the system. Next, the state estimate is treated as if it were exactly equal to the actual state of the system and is used by a static state feedback controller. In this way, a dynamic feedback controller is obtained that is composed of a (dynamic) state observer and a static feedback part.
J. Baillieul, T. Samad (eds.), Encyclopedia of Systems and Control, DOI 10.1007/978-1-4471-5058-9, © Springer-Verlag London 2015
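The two-stage structure described above can be illustrated numerically. The sketch below (plain Python; the two-state plant, the gains F and G, and all numbers are hypothetical choices for illustration, not taken from the entry) simulates an observer-based feedback loop by Euler integration: an observer produces an estimate ξ of the plant state from u and y, and the static law u = Fξ is applied to the estimate.

```python
# Observer-based stabilization of an unstable 2-state plant (hypothetical
# numerical example):
#   plant:     x' = A x + B u,   y = C x          (no disturbances)
#   observer:  xi' = (A - G C) xi + B u + G y     (a state observer)
#   feedback:  u = F xi                           (static law on the estimate)

def simulate(T=10.0, dt=1e-3):
    F = (-8.0, -4.0)       # places the eigenvalues of A + B F at -2, -3
    G = (8.0, 14.0)        # places the eigenvalues of A - G C at -4, -5
    x1, x2 = 1.0, 0.0      # true (unmeasured) plant state
    xi1, xi2 = 0.0, 0.0    # observer state, deliberately mis-initialized
    for _ in range(int(T / dt)):
        u = F[0] * xi1 + F[1] * xi2              # u = F xi
        y = x1                                   # y = C x with C = [1  0]
        # plant with A = [[0, 1], [2, -1]] (open-loop unstable), B = [0, 1]^T
        dx1 = x2
        dx2 = 2.0 * x1 - x2 + u
        # observer copy of the plant, corrected through the output error via G
        dxi1 = -8.0 * xi1 + xi2 + G[0] * y
        dxi2 = -12.0 * xi1 - xi2 + u + G[1] * y
        x1, x2 = x1 + dt * dx1, x2 + dt * dx2
        xi1, xi2 = xi1 + dt * dxi1, xi2 + dt * dxi2
    return (x1, x2), (xi1 - x1, xi2 - x2)

state, err = simulate()
print(state, err)  # both close to (0, 0): plant stabilized, estimate converged
```

Both the plant state x and the estimation error ξ − x tend to zero: the gains were chosen (by hand, matching characteristic polynomials) so that A + BF and A − GC each have all eigenvalues in the open left half-plane.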
Dynamic Output Feedback Control

Consider the controlled and observed system Σ:

  ẋ(t) = Ax(t) + Bu(t) + Ed(t),
  y(t) = Cx(t),                                   (1)
  z(t) = Hx(t),

with x(t) ∈ X = Rⁿ the state, u(t) ∈ Rᵐ the control input, and y(t) ∈ Rᵖ the measured output. The signal d(t) may represent a disturbance input or a desired reference signal, while the signal z(t) is a controlled output signal. A, B, C, E, and H are maps (or matrices).

In general, a linear controller for this system is a finite-dimensional linear time-invariant system Γ represented by

  ẇ(t) = Kw(t) + Ly(t),
  u(t) = Mw(t) + Ny(t).                           (2)

The state space of the controller is assumed to be W = R^q for some positive integer q. K, L, M, and N are assumed to be linear maps (or matrices). The controller (2) takes the observations y as its input and generates the control function u as its output. The closed-loop system resulting from the interconnection of Σ and Γ is described by the equations

  [ẋ(t)]   [A + BNC   BM] [x(t)]   [E]
  [ẇ(t)] = [LC        K ] [w(t)] + [0] d(t),
                                                  (3)
  z(t) = [H  0] [x(t)]
                [w(t)].

The control action of interconnecting the controller Γ with the system (1) is called dynamic feedback. The state space of the closed-loop system (3) is called the extended state space and is equal to the Cartesian product X × W = R^(n+q). In general, a feedback control problem amounts to finding linear maps K, L, M, and N such that the closed-loop system (3) satisfies the control design specifications.

Observer-Based Controllers

Given the system (1) and a control objective, the problem thus arises of how to determine the maps K, L, M, and N so that the closed-loop system meets the objective. As an example, take the special case when E in (1) is equal to zero (i.e., the system has no external disturbances or reference signals) and suppose that we wish the closed-loop system (3) to be internally stable, i.e., we want to find the maps K, L, M, and N so that the eigenvalues λᵢ of the system map of (3) are in the open left half-plane, i.e., satisfy Re(λᵢ) < 0 for all i. If we had access to the entire state variable x (instead of only to the linear function y = Cx), then this problem would be simpler: assuming that the system is stabilizable (the system ẋ = Ax + Bu is called stabilizable if there exists a map F such that A + BF has all its eigenvalues in the open left half-plane), find a map F such that the eigenvalues of A + BF are in the open left half-plane; then take the static state feedback controller u = Fx as the control law. That is, we would choose the state space dimension of the controller Γ equal to 0 and the maps K, L, and M to be void, and we would take N = F.

In general, however, we only have access to a given linear function y = Cx of x, determined by the output map C. The key idea of observer-based control is the following:

Use the theory of observer design to find an observer for the state x of the system (1), i.e., an observer that generates an estimate ξ of the system state x based on the measured output y and the control input u. Next, apply a static feedback u = Fξ mimicking the (not permissible) control law u = Fx.

This idea leads to a dynamic feedback controller (2) of a very particular structure: the controller is the combination of a state observer (with a certain state space dimension) and a static control law acting on the state estimate. This two-stage structure, separating estimation and control, is often called the separation principle. We will work out this idea in more detail for the case when E = 0 (no external disturbances or reference signals) and the aim is to obtain internal
stability of the system. Before doing this, we first explain the most important material on observers that is needed in the sequel.

State Observers

If the state is not available for measurement, one can try to reconstruct it using a system, called an observer, that takes the control input and the measured output of the original system as inputs and yields an output that is an estimate of the state of the original system. Again, we consider the case that in the system (1) we have E = 0, i.e., there are no disturbance signals. This is illustrated in the following picture:

  [Figure: the system Σ, with input u and output y, drives the observer Ω, which produces the estimate ξ.]

The quantity ξ is supposed to be an estimate, in some sense, of the state, and w is the state variable of the observer. In general, the observer, denoted by Ω, has equations of the form

  ẇ(t) = Pw(t) + Qu(t) + Ry(t),
  ξ(t) = Sw(t).                                   (4)

It turns out that particular choices for P, Q, R, and S, specifically P = A − GC (where the map G has to be determined), Q = B, R = G, and S = I, lead to

  ξ̇(t) = (A − GC)ξ(t) + Bu(t) + Gy(t).           (5)

Introducing the estimation error e := ξ − x and interconnecting the system (1) with (5), we find that the error e satisfies the differential equation

  ė(t) = (A − GC)e(t).                            (6)

Hence all possible errors converge to 0 as t tends to infinity if and only if A − GC is a stability matrix, i.e., has all its eigenvalues in the open left half-plane. In that case, we call (5) a stable state observer. Thus, a stable state observer exists if and only if G can be found such that A − GC is a stability matrix. The problem of finding such a G is dual to the problem of finding, for a pair (A, B), a matrix F such that A + BF is a stability matrix.

Definition 1 The pair (C, A) is called detectable if there exists a matrix G such that A − GC is a stability matrix, i.e., has all its eigenvalues in the open left half-plane.

Theorem 1 Given system Σ, the following statements are equivalent:
1. Σ has a stable state observer.
2. (C, A) is detectable.

The equation for ξ can be rewritten using an artificial output η = Cξ as ξ̇ = Aξ + Bu + G(y − η). The interpretation of this is as follows. If ξ is the exact state, then η = y, and hence ξ obeys exactly the same differential equation as x. Otherwise, the equation for ξ has to be corrected by a term determined by the output error y − η. Consequently, the state observer consists of an exact replica Σ_dup of the original system with an extra input channel for incorporating the output error and an extra output, the state of the observer, which serves as the desired estimate for the state of the original system. The following diagram depicts the situation:

  [Figure: the system Σ, with input u and output y, drives a replica Σ_dup of Σ both through u and through the output error y − η fed back via the gain G; the state ξ of Σ_dup is the estimate.]
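The duality between finding F for a pair (A, B) and finding G for a pair (C, A) can be made concrete for a two-state example. The sketch below (plain Python; the matrix A, the desired polynomial, and the helper names are illustrative assumptions, with C = [1 0] and A₁₂ ≠ 0 assumed) computes G by matching the characteristic polynomial of A − GC to a desired stable polynomial, and then integrates the error equation (6) to confirm that the estimation error decays:

```python
# Observer gain for a 2-state pair (C, A) with C = [1, 0], obtained by
# matching char(A - G C) to a desired lambda^2 + p1*lambda + p0
# (hypothetical worked example; requires A[0][1] != 0).

def observer_gain(A, p1, p0):
    (a11, a12), (a21, a22) = A
    g1 = a11 + a22 + p1                          # fixes trace(A - GC) = -p1
    g2 = a21 + (p0 - (a11 - g1) * a22) / a12     # fixes det(A - GC) = p0
    return g1, g2

def error_decay(A, G, e=(1.0, 1.0), dt=1e-3, steps=5000):
    # Euler integration of the error equation e' = (A - G C) e, cf. (6)
    g1, g2 = G
    e1, e2 = e
    for _ in range(steps):
        d1 = (A[0][0] - g1) * e1 + A[0][1] * e2
        d2 = (A[1][0] - g2) * e1 + A[1][1] * e2
        e1, e2 = e1 + dt * d1, e2 + dt * d2
    return e1, e2

A = [[0.0, 1.0], [2.0, -1.0]]
G = observer_gain(A, p1=9.0, p0=20.0)   # eigenvalues of A - GC at -4 and -5
print(G)                                 # -> (8.0, 14.0)
e = error_decay(A, G)                    # error after t = 5: essentially (0, 0)
```

This is exactly the dual of state feedback design: applying the same polynomial-matching step to (Aᵀ, Cᵀ) in place of (A, B) yields Gᵀ in place of F.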
Observer-Based Stabilization

We now work out the ideas put forward in the previous sections for the special case of stabilization by dynamic measurement feedback, i.e., to find a controller (2) such that the closed-loop system (3) is internally stable; equivalently, the system mapping of (3) is a stability matrix. Again, we restrict ourselves to the case when E = 0.

We assume that we know how to stabilize by state feedback and how to build a state observer. If we have a plant of which we do not have the state available for measurement, we use a state observer to obtain an estimate of the state, and we apply the state feedback to this estimate rather than to the true state. This is illustrated by the following picture:

  [Figure: the plant Σ, with input u and output y, drives the observer Ω; the estimate ξ is fed back through the static gain F to produce u.]

Again, consider the system Σ given by (1) and let the observer Ω be given by (5). Combining this with u = Fξ yields

  ẋ(t) = Ax(t) + BFξ(t),
  ξ̇(t) = (A − GC + BF)ξ(t) + GCx(t).             (7)

Introducing again e := ξ − x, we obtain, in accordance with the previous section,

  ẋ(t) = (A + BF)x(t) + BFe(t),
  ė(t) = (A − GC)e(t).

That is, we obtain the equation ẋₑ = Aₑxₑ with

  xₑ := [x]     Aₑ := [A + BF   BF    ]
        [e],          [0        A − GC].

Assume that Σ is stabilizable and detectable. Then F and G can be found such that A + BF and A − GC are stability matrices. Since the set of eigenvalues of Aₑ is the union of those of A + BF and A − GC, it follows that Aₑ is a stability matrix. Consequently, the system ẋₑ = Aₑxₑ is asymptotically stable; equivalently, every solution xₑ = (x, e) converges to 0 as t tends to infinity. Of course, if (x, ξ) is a solution of (7), then ξ = x + e, with xₑ = (x, e) a solution of ẋₑ = Aₑxₑ. Hence (x, ξ) also converges to 0 as t goes to infinity. Thus we have proved the "if" part of the following theorem:

Theorem 2 There exists an internally stabilizing dynamic feedback controller for Σ if and only if Σ is stabilizable and detectable. A controller is given by

  ξ̇(t) = (A − GC)ξ(t) + Bu(t) + Gy(t),
  u(t) = Fξ(t),                                   (8)

where F is any map such that A + BF is a stability matrix and G is any map such that A − GC is a stability matrix.

The controller (8) is an observer-based dynamic feedback controller, since it is composed of a state observer and a static feedback part.

Summary and Future Directions

We have given an introduction to observer-based feedback controllers and have explained that such controllers are dynamic feedback controllers that can be represented as the composition of a state observer for the system, together with a static control law mimicking a (not permitted) static state feedback control law. We have given a detailed description of this principle for the case that the system to be controlled has no external disturbances or reference signals and the control objective is internal stability of the system. More intricate versions of the principle of observer-based feedback control appear in control design problems for linear systems with external disturbances and reference signals and with different, more sophisticated, control objectives.
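The "if" part of the argument rests on Aₑ being block upper triangular, so its characteristic polynomial factors into those of A + BF and A − GC. This factorization can be checked numerically; the sketch below (plain Python, with hypothetical two-state numbers) evaluates det(sI − Aₑ) by cofactor expansion and compares it with the product of the two block determinants:

```python
# Numerical check of the separation principle: for the block upper triangular
# Ae = [[A+BF, BF], [0, A-GC]], det(sI - Ae) = det(sI-(A+BF)) * det(sI-(A-GC)),
# so spec(Ae) is the union of the two spectra. Hypothetical numbers:
# A = [[0,1],[2,-1]], B = [0,1]^T, C = [1,0], F = [-8,-4], G = [8,14]^T.

def det(M):
    # cofactor expansion along the first row (fine for small matrices)
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def shift(M, s):
    # returns s*I - M
    n = len(M)
    return [[(s if i == j else 0.0) - M[i][j] for j in range(n)] for i in range(n)]

ABF = [[0.0, 1.0], [-6.0, -5.0]]     # A + B F, eigenvalues -2, -3
AGC = [[-8.0, 1.0], [-12.0, -1.0]]   # A - G C, eigenvalues -4, -5
BF = [[0.0, 0.0], [-8.0, -4.0]]      # B F
Ae = [ABF[0] + BF[0], ABF[1] + BF[1],           # rows built by concatenation
      [0.0, 0.0] + AGC[0], [0.0, 0.0] + AGC[1]]

for s in (-2.0, -3.0, -4.0, -5.0):   # the designed closed-loop eigenvalues
    assert abs(det(shift(Ae, s))) < 1e-9              # s is an eigenvalue of Ae
for s in (0.5, 1.0, -1.5):           # generic points: the factorization holds
    assert abs(det(shift(Ae, s))
               - det(shift(ABF, s)) * det(shift(AGC, s))) < 1e-9
print("spec(Ae) = spec(A+BF) union spec(A-GC) verified")
```

The check mirrors the proof: the state feedback design and the observer design can be carried out independently, and the closed-loop eigenvalues are exactly the ones assigned in each stage.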
Examples of these are the regulator problem, the problem of disturbance decoupling with internal stability, the H₂ optimal control problem, and the H∞ suboptimal control problem.

Cross-References

▶ Linear State Feedback
▶ Observers in Linear Systems Theory

Bibliography

Antsaklis PJ, Michel AN (2007) A linear systems primer. Birkhäuser, Boston
Trentelman HL, Stoorvogel AA, Hautus MLJ (2001) Control theory for linear systems. Springer, London
Wonham WM (1979) Linear multivariable control: a geometric approach. Springer, New York
Zadeh LA, Desoer CA (1963) Linear systems theory – the state-space approach. McGraw-Hill, New York

Observers for Nonlinear Systems

Laurent Praly
MINES ParisTech, PSL Research University, CAS, Fontainebleau, France

Abstract

Observers are objects delivering estimates of variables which cannot be directly measured. Access to such hidden variables is made possible by combining modeling and measurements. But this brings the real world face to face with its abstraction, with, as a result, the need to deal with uncertainties and approximations, leading to difficulties in implementation and convergence.

Keywords

Detectability; Distinguishability; Estimation

Introduction

Observers are answers to the question of estimating, from observed/measured/empirical variables, denoted y, delivered by sensors equipping a real-world system, some "theoretical" variables, called hidden variables in this text, denoted z, which are involved in a mathematical model related to this system. The measured variables make up what is called the a posteriori information on the hidden variables, whereas the model is part of the a priori information. Because a model cannot fit a system exactly, the introduction of uncertainties is mandatory.

Typically this model describing the link between hidden and measured variables is made of three components:
• A dynamic model describes the dynamics/evolution (ẋ denotes the time derivative dx/dt):

  ẋ(t) = f(x(t), t, δˢ(t))   resp.   x_{k+1} = f_k(x_k, δˢ_k),           (1)

  where t, in the continuous case, or k, in the discrete case, is an evolution parameter, called time in this text; x is a state, assumed finite dimensional in this text; and δˢ represents the uncertainties in the state dynamics. Any possible known inputs are represented here by the time dependence of f.
• A sensor model relates state and measured variables:

  y(t) = h(x(t), t, δᵐ(t))   resp.   y_k = h_k(x_k, δᵐ_k)                (2)

  with δᵐ representing the uncertainties in the measurements.
• A model which relates state and hidden variables:

  z(t) = 𝔥(x(t), t, δʰ(t))   resp.   z_k = 𝔥_k(x_k, δʰ_k)                (3)

  where again δʰ represents the uncertainties in the hidden variables.

In a deterministic setting, the a priori information on the uncertainties (δˢ, δᵐ, δʰ) may be that the values of δˢ, δᵐ, and δʰ are unknown but belong to known sets Δˢ, Δᵐ, and Δʰ. Namely, we have:
  δˢ(t) ∈ Δˢ(t), δᵐ(t) ∈ Δᵐ(t), δʰ(t) ∈ Δʰ(t);
  respectively, δˢ_k ∈ Δˢ_k, δᵐ_k ∈ Δᵐ_k, δʰ_k ∈ Δʰ_k.                   (4)

In a stochastic setting, and more specifically in a Bayesian approach, it may be that δˢ, δᵐ, and δʰ are unknown realizations of stochastic processes for which we know the probability distributions. Similarly we may also know a priori that we have:

  x(t) ∈ X(t), z(t) ∈ Z(t)   respectively, x_k ∈ X_k, z_k ∈ Z_k          (5)

where the sets X and Z are known, or we may have a priori probability distributions for x and z. In this context, the a priori information is the data of the functions f, h, and 𝔥, of the sets Δˢ, Δᵐ, and Δʰ or the corresponding probability distributions, and possibly also of the sets X and Z or the corresponding a priori probability distributions.

In the next section, we state the observation problem and give the solutions which are direct consequences of the deterministic and stochastic settings given above. This will allow us to see that an observer is actually a dynamical system with the measurements as inputs and the estimate as output. But approximations in the implementation of these solutions, and not knowing how to initialize them, may lead to convergence problems even when the uncertainties disappear. The second part of this text is devoted to this convergence topic. To ease the presentation, we deal only with the discrete time case in section "Set-Valued and Conditional Probability-Valued Observers" and the continuous time case in sections "An Optimization Approach" and "Convergent Observers."

Observation Problem and Its Solutions

The Observation Problem

Let X^{δˢ}(x, t, s), respectively X^{δˢ}_l(x, k), denote a solution of (1) at time s, respectively l, going through x at time t, respectively k, under the action of δˢ.

Observation problem: At each time t, respectively k, given the function s ∈ ]t − T, t] ↦ y(s), respectively the sequence l ∈ {k − K, …, k} ↦ y_l, find an estimate ẑ(t), respectively ẑ_k, of z(t), respectively z_k, satisfying

  ẑ(t) = 𝔥(x̂(t), t, δʰ(t))   resp.   ẑ_k = 𝔥_k(x̂_k, δʰ_k),

where x̂(t), respectively x̂_k, is to be found as a solution of

  x̂(t) ∈ X(t),
  y(s) = h(X^{δˢ}(x̂(t), t, s), s, δᵐ(s))   ∀s ∈ ]t − T, t],

respectively

  x̂_k ∈ X_k,
  y_l = h_l(X^{δˢ}_l(x̂_k, k), δᵐ_l)   ∀l ∈ {k − K, …, k},

and where the time functions δˢ, δᵐ, and δʰ must agree with the a priori (deterministic/stochastic) information or be minimized in some way.

In this statement T, respectively K, quantifies the time window length or memory length during which we record the measurements. The accumulation of measurements over time, together with the model equations (1)–(3) and the assumptions on (δˢ, δᵐ, δʰ), gives a redundancy of data compared with the number of unknowns that the hidden variables represent. This is why it may be possible to solve this observation problem.

To simplify the following presentation, we restrict our attention to the case where the hidden variables are actually the full model state, i.e.,

  z = 𝔥(x) = x.

When z differs from x, observers are called functional observers.

Set-Valued and Conditional Probability-Valued Observers

Conceptually the answer to this problem is easy, at least when the memory increases with time (Ṫ(t) = 1 resp. K_{k+1} = K_k + 1), leading to an infinite non-fading memory. It consists in starting
from all that the a priori information makes possible and eliminating what is not consistent with the a posteriori information. In the set-valued observer setting, in the discrete time case, this gives the following observer. To ease its reading, we underline the data given by the a priori information. It requires the introduction of two sets Ξ_k and Ξ_{k|k−1} which are updated at each time k when a new measurement y_k is made available. Ξ_k is the set which x_k is guaranteed to belong to at time k, knowing all the measurements up to time k, and Ξ_{k|k−1} is the same but with measurements known up to time k − 1.

Set-valued observer:

  Initialization:   Ξ₀ = X₀
  At each time k:
    prediction (flowing):      Ξ_{k|k−1} = f_{k−1}(Ξ_{k−1}, Δˢ_{k−1})
    restriction (consistency): Ξ_k = {x ∈ Ξ_{k|k−1} ∩ X_k : y_k ∈ h_k(x, Δᵐ_k)}
    estimation:                x̂_k ∈ Ξ_k

A key feature here is that this observer has a state Ξ_k – a set – and is a dynamical system of the form

  Ξ_{k+1} = φ_k(Ξ_k, y_k),   x̂_k ∈ Ξ_k

with y as input and x̂ as output, which is not single valued. Important also, the initial condition of the state Ξ is given by the a priori information.

In the stochastic setting, following the Bayesian paradigm, the observer has the same structure but with the state Ξ_k being a conditional probability. See Jazwinski (2007, Theorem 6.4) or Candy (2009, Table 2.1). In that setting too the observer state is not a single point; it is the (a posteriori) conditional probability of the random variable x_k given the a priori information and the sequence of measurements l ∈ {k − K, …, k} ↦ y_l.

Comments

Implementation: For the time being, except for very specific cases (Kalman filter, …), the set-valued and the conditional probability-valued observers remain conceptual, since we do not know how to manipulate sets and probability laws numerically. Their implementation requires approximations. For instance, see Milanese et al. (1996) and Witsenhausen (1966) for the set case and Arulampalam et al. (2002), Bucy and Joseph (1987), Candy (2009), and Jazwinski (2007) for the conditional probability case.

Need of finite or infinite but fading memory: In these observers, model states x which are consistent with the a priori information but do not agree with the a posteriori information are eliminated (set intersection or probability product). But once a point is eliminated, this is forever. As a consequence, if there is, at some time, a misfit between a priori and a posteriori information, it is mistakenly propagated to future times. A way around this problem is to keep the information memory finite, or infinite but fading. In particular, with a fixed-length memory, consistent points which were disregarded due to measurements which are no longer in the memory are reintroduced. This says also that observers should not be sensitive to their initial condition.

Not single-valued estimate: The observers introduced above realize a lossless data compression, extracting and preserving everything concerning the hidden variables in the redundant data given by the a priori and a posteriori information. But this "lossless compression" answer is not single valued (it is set valued or conditional probability valued) as a result of taking uncertainties into account. Actually, to get a single-valued answer, the observation problem must be complemented by making precise what the estimation is made for. For instance, we may want to select the most likely or the average or, more generally, some cost-minimizing estimate x̂ among all the possible ones given by Ξ. In this way we obtain an observer giving a single-valued estimate:

  Ξ_{k+1} = φ_k(Ξ_k, y_k),   x̂_k = ψ_k(Ξ_k),

respectively,

  Ξ̇(t) = φ(Ξ(t), y(t), t),   x̂(t) = ψ(Ξ(t), t).                         (6)

But then, in general, we lose information, and in particular we have no idea of the confidence
level this estimate has. Also, since the function ψ, at least, encodes what the estimate x̂ is used for, different uses may require different functions ψ.

An Optimization Approach

A shortcut to obtain directly an observer giving a single-valued estimate is to design it by trading off among a priori and a posteriori information (see Cox 1964, pages 7–10; Alamir 2007). For example, in the continuous time case, we can select the estimate x̂(t) among the minimizers (in x) of

  C({s ↦ δˢ(s)}, x, t) = ∫₋∞ᵗ 𝒞(δˢ(s), y(s), X^{δˢ}(x, t, s), s) ds

where X^{δˢ}(x, t, s) is still the notation for a solution to (1) and {s ↦ δˢ(s)}, representing the unmodelled effects on the dynamics, is among the arguments of the minimization. The infinitesimal cost 𝒞 is chosen to take nonnegative values and be such that 𝒞(0, h(x, s), x, s) is zero. For instance, it can be

  𝒞(δˢ, y, x, s) = ‖δˢ‖²ₓ + d_y(y, h(x, s))²

where ‖·‖ₓ is a norm at the point x and d_y is a distance in the measurement space. In the same spirit, instead of optimization, a minimax approach can be followed. See, for instance, Bertsekas and Rhodes (1971), Başar and Bernhard (1995, Chapter 7), and Willems (2004).

With x fixed, the minimization of C is an infinite horizon optimal control problem in reverse time. Solving this problem on line is extremely difficult, and again approximations are needed. We do not pursue this approach further, but we remark that, under extra assumptions, the observer we obtain by following it can also be implemented in the form of a dynamical system (6), with the specificity that the estimate x̂ is part of the observer state ξ and its dynamics are a copy of the undisturbed model with a correction term which is zero when the estimated state reproduces the measurement. Namely, we get

  (d/dt) x̂(t) = f(x̂(t), t, 0) + E({τ ↦ y(τ)}, x̂(t), y(t), t)

where E is zero when h(x̂(t), t) = y(t). But, as opposed to what we saw in the previous section, the initial condition for the part x̂ of the observer state is unknown. Hence, we encounter again the need for the observer to forget its initial condition.

Convergent Observers

We have mentioned that often an observer can be implemented as a dynamical system, but without necessarily knowing how to initialize it. Also, approximation is involved both in its design and in its implementation. So, at least when it gives a single-valued estimate, we are facing the problem of convergence of this estimate to the "true" value, at least when there are no uncertainties. We concentrate now our attention on the study of this convergence but, to simplify, in the continuous time case only.

Let the model and observer dynamics be

  ẋ(t) = f(x(t), t),   y(t) = h(x(t), t)                                 (7)
  ξ̇(t) = φ(ξ(t), y(t), t),   x̂(t) = ψ(ξ(t), y(t), t)                    (8)

with the observer state ξ of finite dimension m. We denote by (X(x, t, s), Ξ((x, ξ), t, s)) a solution of (7)–(8).

Since we are dealing with convergence, the focus is on what is going on when the time becomes very large and in particular on the set Ω of model states which are accumulation points of some solution. Specifically we are interested in the stability properties of the set

  Z(t) = {(x, ξ) : x ∈ Ω & x = ψ(ξ, h(x, t), t)}

which is contained in the zero estimation error set associated with the given model-observer pair.

Definition 1 (convergent observer) We say the observer (8) is convergent if for each t, there
exists a set Z_a(t) ⊂ Z(t) such that, on the domain of existence of the solution, a distance between the point (X(x, t, s), Ξ((x, ξ), t, s)) and the set Z_a(s) is upper bounded by a real function s ↦ β^c_{x,ξ,t}(s), possibly dependent on (x, ξ, t), with nonnegative values, strictly decreasing and going to zero as s goes to infinity.

Necessary Conditions for Observer Convergence

No Restriction on ψ
It is possible to prove that, if the observer is convergent, then:
Necessity of detectability: When h and ψ are uniformly continuous in x and ξ, respectively, the estimate x̂ does converge to the model state x. In this case, two solutions of the model (7) which produce the same measurement must converge to each other. This is an asymptotic distinguishability property called detectability. If we are interested not only in the asymptotic behavior but also in the transient (as for output feedback), a property stronger than detectability is needed. In particular, instantaneous distinguishability (see section "Observers Based on Instantaneous Distinguishability") is necessary if we want to be able to impose the decay rate of the function β^c_{x,ξ,t}.
Necessity of m ≥ n − p: For each t, there exists a subset X_a(t) of Ω, supposed to collect the model states which can be asymptotically estimated, such that we can associate, to each of its points x, a set Ξⁱ(x, t) allowing us to redefine the set Z_a(t) as

  Z_a(t) = {(x, ξ) : x ∈ X_a(t) & ξ ∈ Ξⁱ(x, t)}.

This implies that for each t and each x in X_a(t), there is a point ξ satisfying

  x = ψ(ξ, h(x, t), t).                                                  (9)

This is a surjectivity property of the function ψ, but of a special kind, since h(x, t) is an argument of ψ. We say that, for each t, the function ψ is surjective to X_a(t) given h. In a "generic" situation this property requires the dimension m of the observer state ξ to be larger than or equal to the dimension n of the model state x minus the dimension p of the measurement y.

ψ Is Injective Given h
We consider now the case where the observer has been designed with a function ψ which is injective given h; namely, we have the following implication, when x is in X_a(t):

  [ψ(ξ₁, h(x, t), t) = ψ(ξ₂, h(x, t), t)  &  ξ₁ ∈ Ξⁱ(x, t)]  ⟹  ξ₁ = ξ₂.

In a "generic" situation, this property, together with the surjectivity given h, implies that the dimension m of the observer state ξ should be between n − p and n.

If a convergent observer has such a function ψ, then (x, t) ↦ Ξⁱ(x, t), which is (of course) a single-valued function, admits a Lie derivative

  L_f Ξⁱ(x, t) = lim_{dt→0} [Ξⁱ(X(x, t, t + dt), t + dt) − Ξⁱ(x, t)] / dt

satisfying

  L_f Ξⁱ(x, t) = φ(Ξⁱ(x, t), h(x, t), t)   ∀x ∈ X_a(t).                  (10)

This says (very approximately) that φ is nothing but the image of the vector field f under the change of coordinates (x, t) ↦ (Ξⁱ(x, t), t), but again all this given h. As partly obtained in the optimization approach, the observer dynamics are then a copy of the model dynamics, with maybe a correction term which is zero when the estimated state reproduces the measurement.

If moreover the functions h and ψ are uniformly continuous in x and ξ, respectively, then, given ξ₁ and ξ₂, a distance between Ξ((x, ξ₁), t, s) and Ξ((x, ξ₂), t, s) goes to zero as s goes to infinity. This property is related to what was called extreme stability (see Yoshizawa 1966) in the 1950s and 1960s and is called incremental stability today (see Angeli 2002). It holds when, denoting by Ξ_y(ξ, t, s) the solution at time
s of the observer dynamics

  ξ̇(t) = φ(ξ(t), y(t), t)

going through ξ at time t and under the action of y, the flow ξ ↦ Ξ_y(ξ, t, s) is a strict contraction (see Jouffroy (2005) for a bibliography on contraction) for each s > t or, at least, if a distance between any two solutions Ξ_y(ξ₁, t, s) and Ξ_y(ξ₂, t, s), with the same input y, converges to 0.

Sufficient Conditions

Knowing now what a convergent observer should look like, we move to a quick description of some such observers.

Observers Based on Contraction
Since the flow generated by the observer should be a contraction, we may start its design by picking the function φ as

  ξ̇(t) = φ(ξ(t), y(t), t) = Aξ(t) + B(y(t), t)

where A, not related to f, is a matrix whose eigenvalues have strictly negative real part. Under weak restrictions, there exists a function Ξⁱ satisfying (10), namely,

  L_f Ξⁱ(x, t) = A Ξⁱ(x, t) + B(h(x, t), t).                             (11)

To obtain a convergent observer, it is then sufficient that there exists a (uniformly continuous) function ψ satisfying

  x = ψ(Ξⁱ(x, t), h(x, t), t).

For this to be possible, the function Ξⁱ should be injective given h. This injectivity holds when the observer state has dimension m ≥ 2(n + 1), the model is distinguishable, and provided the eigenvalues of A have a sufficiently negative real part and are not in a set of zero Lebesgue measure.

Unfortunately, we are facing again a possible difficulty in the implementation, since an expression for a function Ξⁱ satisfying (11) is needed and the function ψ: (ξ, y, t) ↦ x̂(t) is known implicitly only via

  ξ = Ξⁱ(x̂(t), t).

See Andrieu and Praly (2006), Luenberger (1964), and Shoshitaishvili (1990).

Observers Based on Instantaneous Distinguishability
Instantaneous distinguishability means that we can distinguish as quickly as we want two model states by looking at the paths of the measurements they generate. A sufficient condition to have this property can be obtained by looking at the Taylor expansion in s of h(X(x, t, s), s). Indeed, we have:

  h(X(x, t, s), s) = Σ_{i=0}^{m−1} h_i(x, t) (s − t)ⁱ/i! + o((s − t)^{m−1})

where h_i is a function obtained recursively as

  h₀(x, t) = h(x, t),
  h_{i+1}(x, t) = ḣ_i(x, t) = (∂h_i/∂x)(x, t) f(x, t) + (∂h_i/∂t)(x, t).

If there exists an integer m such that, in some uniform way with respect to t, the function

  x ↦ H_m(x, t) = (h₀(x, t), …, h_{m−1}(x, t))

is injective, then we do have instantaneous distinguishability. We say the system is differentially observable of order m when this injectivity property holds. When a system has such a property, the model state space has a very specific structure, as discussed in Isidori (1995, Section 1.9). It means that we can reconstruct x from the knowledge of y and its m − 1 first time derivatives, i.e., there exists a function Φ such that we have:

  x = Φ(H_m(x, t), t).

This way, we are left with estimating the derivatives of y. This can be done as follows. With the notation η_i = h_{i−1}(x, t), we obtain:
  η̇(t) = Fη(t) + G h_m(Φ(η(t), t), t)

where y = h(x, t) and

  Fη = (η₂, …, η_m, 0),   G = (0, …, 0, 1).

When the last term on the right-hand side is Lipschitz, we can find a convergent observer of the form

  ξ̇(t) = Fξ(t) + G h_m(x̂(t), t) + K(y(t) − ξ₁(t)),
  x̂(t) = Φ̄(ξ(t), t),

with ξ being actually an estimate of η, where K is a constant matrix and Φ̄ is a modified version of Φ keeping the estimated state in its a priori given set X(t).

This is the high-gain observer paradigm. See Gauthier and Kupka (2001) and Tornambe (1988). The implementation difficulty is in the function Φ̄, not to mention the sensitivity to measurement uncertainty.

Observers with ψ Bijective Given h

Case Where ψ Is the Identity Function: A convergent observer whose function ψ is the identity has the following form:

  ξ̇(t) = f(ξ(t), t) + E({τ ↦ y(τ)}, ξ(t), y(t), t),   x̂(t) = ξ(t).      (12)

The only piece remaining to be designed is the correction term E. It has to ensure convergence, and maybe also other properties, like symmetry preservation (see Bonnabel et al. 2008).

For this design, a first step is to exhibit some specific properties of the vector field f by writing it in some appropriate coordinates. For example, there may exist coordinates such that the expression of f takes the form f(x(t), h(x(t), t), t) and the corresponding observer (12) is such that there exists a positive definite matrix P for which the function s ↦ (X(x, t, s) − X̂((x, x̂), t, s))ᵀ P (X(x, t, s) − X̂((x, x̂), t, s)) is strictly decaying (if not zero). A necessary condition for this is that f be monotonic tangentially to the level sets of the function h, i.e., for all (x, y, v, t) satisfying (∂h/∂x)(x, t) v = 0, we have:

  vᵀ P (∂f/∂x)(x, y, t) v ≤ 0.                                           (13)

This is another way of expressing a detectability condition. This expression is coordinate dependent, hence the importance of choosing the coordinates properly.

When this condition is strict and uniform in t, it is sufficient to get a locally convergent observer, and even a nonlocal one when h is linear in x, i.e., h(x, t) = H(t)x, again a coordinate-dependent condition. In this latter case the observer takes the form

  ξ̇(t) = f(ξ(t), y(t), t) + ℓ(ξ(t)) P⁻¹ H(t)ᵀ [y(t) − H(t)ξ(t)],
  x̂(t) = ξ(t),

where ℓ is a real function to be chosen with sufficiently large values. If (13) is strict and uniform and holds for all v, the correction term is not needed.

There are many other results of this type, exploiting one or the other specificity of the dependence on x of the function f – monotonicity, convexity, …. See Fan and Arcak (2003), Krener and Isidori (1983), Respondek et al. (2004), Sanfelice and Praly (2012), ….

Case Where (x, t) ↦ (Ξⁱ(x, t), h(x, t), t) Is a Diffeomorphism: At each time t, we know already that the model state x we want to estimate satisfies y(t) = h(x, t). So, as remarked in Luenberger (1964), when (h(x, t), t) can be used as part of coordinates for (x, t), we need to estimate the remaining part only. This can be done if we find a function Ξⁱ, whose values are n − p dimensional, such that (x, t) ↦ (y, ξ, t) = (h(x, t), Ξⁱ(x, t), t) is a diffeomorphism and the flow ξ ↦ Ξ_y(ξ, t, s) generated by

  ξ̇(t) = (∂Ξⁱ/∂x)(x(t), t) f(x(t), t) + (∂Ξⁱ/∂t)(x(t), t)
condition for this to be possible is that f is D '..t/; y.t/; t/
is a strict contraction for all s > t. Indeed in this case the observer dynamics can be chosen as

ξ̂̇(t) = φ(ξ̂(t), y(t), t)

and the estimate x̂(t) is obtained as solution of

τᵢ(x̂(t), t) = ξ̂(t), h(x̂(t), t) = y(t).

This is the reduced-order observer paradigm. See, for instance, Besançon (2000, Proposition 3.2), Carnevale et al. (2008), and Luenberger (1964, Theorem 4).

Cross-References

▸ Differential Geometric Methods in Nonlinear Control
▸ Observers in Linear Systems Theory
▸ Regulation and Tracking of Nonlinear Systems

Bibliography

Alamir M (2007) Nonlinear moving horizon observers: theory and real-time implementation. In: Nonlinear observers and applications. Lecture notes in control and information sciences. Springer, Berlin/New York
Andrieu V, Praly L (2006) On the existence of Kazantzis-Kravaris/Luenberger observers. SIAM J Control Optim 45(2):432–456
Angeli D (2002) A Lyapunov approach to incremental stability properties. IEEE Trans Autom Control 47(3):410–421
Arulampalam M, Maskell S, Gordon N, Clapp T (2002) A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Trans Signal Process 50(2):174–188
Bain A, Crisan D (2009) Fundamentals of stochastic filtering. Stochastic modelling and applied probability, vol 60. Springer, New York/London
Başar T, Bernhard P (1995) H∞ optimal control and related minimax design problems: a dynamic game approach, revised 2nd edn. Birkhäuser, Boston
Bertsekas D, Rhodes IB (1971) On the minimax reachability of target sets and target tubes. Automatica 7:233–247
Besançon G (2000) Remarks on nonlinear adaptive observer design. Syst Control Lett 41(4):271–280
Bonnabel S, Martin P, Rouchon P (2008) Symmetry-preserving observers. IEEE Trans Autom Control 53(11):2514–2526
Bucy R, Joseph P (1987) Filtering for stochastic processes with applications to guidance, 2nd edn. Chelsea Publishing Company, Chelsea
Candy J (2009) Bayesian signal processing: classical, modern, and particle filtering methods. Wiley series in adaptive learning systems for signal processing, communications and control. Wiley, Hoboken
Carnevale D, Karagiannis D, Astolfi A (2008) Invariant manifold based reduced-order observer design for nonlinear systems. IEEE Trans Autom Control 53(11):2602–2614
Cox H (1964) On the estimation of state variables and parameters for noisy dynamic systems. IEEE Trans Autom Control 9(1):5–12
Fan X, Arcak M (2003) Observer design for systems with multivariable monotone nonlinearities. Syst Control Lett 50:319–330
Gauthier J-P, Kupka I (2001) Deterministic observation theory and applications. Cambridge University Press, Cambridge/New York
Isidori A (1995) Nonlinear control systems, 3rd edn. Springer, Berlin/New York
Jazwinski A (2007) Stochastic processes and filtering theory. Dover, Mineola
Jouffroy J (2005) Some ancestors of contraction analysis. In: Proceedings of the IEEE conference on decision and control, Seville, pp 5450–5455
Krener A, Isidori A (1983) Linearization by output injection and nonlinear observer. Syst Control Lett 3(1):47–52
Luenberger D (1964) Observing the state of a linear system. IEEE Trans Mil Electron MIL-8:74–80
Milanese M, Norton J, Piet-Lahanier H, Walter E (eds) (1996) Bounding approaches to system identification. Plenum Press, New York
Respondek W, Pogromsky A, Nijmeijer H (2004) Time scaling for observer design with linearizable error dynamics. Automatica 40:277–285
Sanfelice R, Praly L (2012) Convergence of nonlinear observers on Rⁿ with a Riemannian metric (Part I). IEEE Trans Autom Control 57(7):1709–1722
Shoshitaishvili A (1990) Singularities for projections of integral manifolds with applications to control and observation problems. In: Arnol'd VI (ed) Advances in Soviet mathematics. Theory of singularities and its applications, vol 1. American Mathematical Society, Providence
Tornambe A (1988) Use of asymptotic observers having high gains in the state and parameter estimation. In: Proceedings of the IEEE conference on decision and control, Austin, pp 1791–1794
Willems JC (2004) Deterministic least squares filtering. J Econometrics 118:341–373
Witsenhausen H (1966) Minimax control of uncertain systems. Elec. Syst. Lab. M.I.T. Rep. ESL-R-269M, Cambridge, May 1966
Yoshizawa T (1966) Stability theory by Lyapunov's second method. The Mathematical Society of Japan, Tokyo

Observers in Linear Systems Theory

A. Astolfi¹,² and Panos J. Antsaklis³
¹Department of Electrical and Electronic Engineering, Imperial College London, London, UK
²Dipartimento di Ingegneria Civile e Ingegneria Informatica, Università di Roma Tor Vergata, Roma, Italy
³Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN, USA

Abstract

Observers are dynamical systems which process the input and output signals of a given dynamical system and deliver an online estimate of the internal state of the given system which asymptotically converges to the exact value of the state. For linear, finite-dimensional, time-invariant systems, observers can be designed provided a weak observability property, known as detectability, holds.

Keywords

Linear systems; Observers; Reduced order observer; State estimation

Introduction

Consider a linear, finite-dimensional, time-invariant system described by equations of the form

x⁺ = A x + B u,
y = C x + D u,  (1)

with x(t) ∈ Rⁿ, u(t) ∈ Rᵐ, y(t) ∈ Rᵖ and A, B, C, and D matrices of appropriate dimensions and with constant entries, and the problem of estimating its state from measurements of the input and output signals. In Eq. (1) x⁺(t) stands for ẋ(t), if the system is continuous-time, and for x(t + 1), if the system is discrete-time. In addition, if the system is continuous-time, then t ∈ R⁺, i.e., the set of nonnegative real numbers, whereas if the system is discrete-time, then t ∈ Z⁺, i.e., the set of nonnegative integers.

We are interested in determining an online estimate x_e(t) ∈ Rⁿ, i.e., the estimate at time t has to be a function of the available information (input and output) at the same time instant. This implies that the estimate is generated by means of a device (known as filter) processing the current input and output of the system and generating a state estimate. The filter may be instantaneous, i.e., the estimate is generated instantaneously by processing the available information. In this case we have a static filter. Alternatively, the state estimate can be generated processing the available information through a dynamical device. In this case we have a dynamic filter.

Assume, for simplicity, that D = 0. This assumption is without loss of generality. In fact, if y = C x + D u and u are measurable, then also ỹ = C x is measurable. Assume, in addition, that the filter which generates the online estimate is linear, finite-dimensional, and time-invariant. Then we may have the following two configurations:

• Static filter. The state estimate is generated via the relation

x_e = M y + N u,  (2)

with M and N constant matrices of appropriate dimensions. The resulting interconnected system is described by the equations

x⁺ = A x + B u,
x_e = M C x + N u.  (3)

• Dynamic filter. The state estimate is generated by the system
ξ⁺ = F ξ + L y + H u,
x_e = M ξ + N y + P u,  (4)

with F, L, H, M, N and P constant matrices of appropriate dimensions. The resulting interconnected system is described by the equations

x⁺ = A x + B u,
ξ⁺ = F ξ + L C x + H u,  (5)
x_e = M ξ + N C x + P u.

In what follows we study in detail the dynamic filter configuration. This is mainly due to the fact that this configuration allows us to solve most estimation problems for linear systems. Moreover, while the use of a static filter is very appealing, it provides a useful alternative only in very specific situations.

State Observer

A state observer is a filter that allows one to estimate, asymptotically or in finite time, the state of a system from measurements of the input and output signals.

The simplest possible observer can be constructed considering a copy of the system, the state of which has to be estimated. This means that a candidate observer for system (1) is given by

ξ⁺ = A ξ + B u,
x_e = ξ.  (6)

To assess the properties of this candidate state observer, let e = x − x_e be the estimation error and note that e⁺ = A e. As a result, if e(0) = 0, then e(t) = 0 for all t and for any input signal u. However, if e(0) ≠ 0, then, for any input signal u, e(t) is bounded only if the system (1) is stable and converges to zero only if the system (1) is asymptotically stable. If these conditions do not hold, the estimation error is not bounded and system (6) does not qualify as a state observer for system (1). The intrinsic limitation of the observer (6) is that it does not use all the available information, i.e., it does not use the knowledge of the output signal y. This observer is therefore an open-loop observer.

To exploit the knowledge of y, we modify the observer (6) adding a term which depends upon the available information on the estimation error, which is given by y_e = C x_e − y. This modification yields a candidate state observer described by

ξ⁺ = A ξ + B u + L y_e,
x_e = ξ.  (7)

To assess the properties of this candidate state observer, note that e = x − x_e is such that

e⁺ = (A + L C) e.  (8)

The matrix L (known as output injection gain) can be used to shape the dynamics of the estimation error. In particular, we may select L to assign the characteristic polynomial p(s) of A + L C. To this end, note that

p(s) = det(sI − (A + L C)) = det(sI − (Aᵀ + Cᵀ Lᵀ)).

Hence, there is a matrix L which arbitrarily assigns the characteristic polynomial of A + L C if and only if the system

ξ⁺ = Aᵀ ξ + Cᵀ v

is reachable or, equivalently, if and only if the system (1) is observable.

We summarize the above discussion with two formal statements.

Proposition 1  Consider system (1) and suppose the system is observable. Let p(s) be a monic polynomial of degree n. Then there is a matrix L such that the characteristic polynomial of A + L C is equal to p(s). Note that for single-output systems, the matrix L assigning the characteristic polynomial of A + L C is unique.

Proposition 2  System (1) is observable if and only if it is possible to arbitrarily assign the eigenvalues of A + L C.
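Propositions 1 and 2 can be checked numerically. The sketch below is not part of the entry: the matrices and the assigned eigenvalue locations are made up for illustration. It picks the output injection gain L for a small discrete-time pair (A, C) by matching the characteristic polynomial of A + LC by hand, then simulates the observer (7) and watches the error dynamics (8) contract.

```python
import numpy as np

# Hypothetical discrete-time system (A, C); values chosen only for illustration.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
C = np.array([[1.0, 0.0]])

# Output injection gain L placed so that A + L C has eigenvalues {0.5, 0.6}.
# For this pair L follows by matching det(sI - (A + L C)) = (s - 0.5)(s - 0.6):
#   A + L C = [[1 + l1, 0.1], [l2, 1.0]]
#   trace: 2 + l1 = 1.1            -> l1 = -0.9
#   det:   (1 + l1) - 0.1 l2 = 0.3 -> l2 = -2.0
L = np.array([[-0.9],
              [-2.0]])
assert np.allclose(sorted(np.linalg.eigvals(A + L @ C).real), [0.5, 0.6])

# Simulate the plant and the observer xi+ = A xi + L (C xi - y), with u = 0.
x = np.array([1.0, -1.0])   # true (unknown) initial state
xi = np.zeros(2)            # observer initial state
for _ in range(60):
    y = C @ x
    xi = A @ xi + (L @ (C @ xi - y)).ravel()
    x = A @ x
err = np.linalg.norm(x - xi)  # decays like 0.6^k even though A is not stable
```

For larger systems the same gain can be obtained with a generic pole-placement routine (e.g., `scipy.signal.place_poles` applied to the transposed pair, as suggested by the duality argument above); the hand computation here is just the two-state special case.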

Detectability

The main goal of a state observer is to provide an online estimate of the state of a system. This goal may be achieved, as discussed in the previous section, if the system is observable. However, observability is not necessary to achieve this goal: in fact the unobservable modes are not modified by the output injection gain. This implies that there exists a matrix L such that system (8) is asymptotically stable if and only if the unobservable modes of system (1) have negative real part, in the case of continuous-time systems, or have modulus smaller than one, in the case of discrete-time systems. To capture this situation, we introduce a new definition.

Definition 1 (Detectability)  System (1) is detectable if its unobservable modes have negative real part, in the case of continuous-time systems, or have modulus smaller than one, in the case of discrete-time systems.

Example 1 (Deadbeat observer)  Consider a discrete-time system described by equations of the form

x(t + 1) = A x(t) + B u(t),
y(t) = C x(t),

and the problem of designing a state observer, described by the equation (7), such that, for any initial condition x(0) and for any u, e(k) = 0 for all k ≥ N, and for some N > 0. A state observer achieving this goal is called a deadbeat state observer. To achieve this goal, it is necessary to select L such that (A + L C)^N = 0 or, equivalently, such that the matrix A + L C has all eigenvalues equal to 0. Note that N ≤ n.

Reduced Order Observer

We have shown that, under the hypotheses of observability or detectability, it is possible to design an asymptotic observer of order n for the system (1). However, this observer is somewhat oversized, i.e., it gives an estimate for the n components of the state vector, without making use of the fact that some of these components can be directly determined from the output function, e.g., if y = x₁ there is no need to reconstruct x₁. It makes, therefore, sense to design a reduced order observer, i.e., a device that estimates only the part of the state vector which is not directly attainable from the output. To this end consider the system (1) with D = 0 and assume that the matrix C has p independent rows. This is the case if rank C = p, whereas if rank C < p it is always possible to eliminate redundant rows. Then there exists a matrix Q such that, possibly after reordering the state variables,

Q C = [I  C₂].

Let

v = Q y = Q C x = x₁ + C₂ x₂,

in which x₁(t) ∈ Rᵖ and x₂(t) ∈ Rⁿ⁻ᵖ denote the first p and the last n − p components of x(t). Observe that the vector v is measurable. From the definition of v, we conclude that if v and x₂ are known, then x₁ can be easily computed, i.e., there is no need to construct an observer for x₁.

Define now the new coordinates

[x̂₁; x̂₂] = T x = [I  C₂; 0  I] [x₁; x₂]

and note that, by construction, v = Q y = x̂₁. In the new coordinates, the system, with output v, is described by equations of the form
x̂₁⁺ = Ã₁₁ x̂₁ + Ã₁₂ x̂₂ + B̃₁ u,
x̂₂⁺ = Ã₂₁ x̂₁ + Ã₂₂ x̂₂ + B̃₂ u,
v = x̂₁.

To construct an observer for x̂₂, consider the system

ξ⁺ = F ξ + H v + G u,

with state ξ, driven by u and v, and with output

w = ξ + L v.

The idea is to select the matrices F, H, G, and L in such a way that w be an estimate for x̂₂. Let w − x̂₂ be the observation error. Then

(w − x̂₂)⁺ = F ξ + H v + G u + L (Ã₁₁ x̂₁ + Ã₁₂ x̂₂ + B̃₁ u) − (Ã₂₁ x̂₁ + Ã₂₂ x̂₂ + B̃₂ u)
          = F ξ + (H + L Ã₁₁ − Ã₂₁) x̂₁ + (L Ã₁₂ − Ã₂₂) x̂₂ + (G + L B̃₁ − B̃₂) u.  (9)

To have convergence of the estimation error to zero, regardless of the initial conditions and of the input signal, we must have

(w − x̂₂)⁺ = F (w − x̂₂)  (10)

and F must have all eigenvalues with negative real part, in the case of continuous-time systems, or with modulus smaller than one, in the case of discrete-time systems. Comparing Eqs. (9) and (10), we obtain that the matrices F, H, G, and L must be such that

L Ã₁₂ − Ã₂₂ = −F,
H + L Ã₁₁ − Ã₂₁ = F L,
G + L B̃₁ − B̃₂ = 0.

We now show how the previous equations can be solved and how the stability condition of F can be enforced. Detectability of the system implies that the reduced system ζ⁺ = Ã₂₂ ζ with output ỹ = Ã₁₂ ζ is detectable. As a result, there exists a matrix L such that the matrix

F = Ã₂₂ − L Ã₁₂

has all eigenvalues with negative real part, in the case of continuous-time systems, or with modulus smaller than one, in the case of discrete-time systems. Then the remaining equations are solved by

H = F L − L Ã₁₁ + Ã₂₁,
G = −L B̃₁ + B̃₂.

Finally, from x̂₁ = v and the estimate w of x̂₂, we build an estimate x_e of the state x inverting the transformation T, i.e.,

[x₁ₑ; x₂ₑ] = [I  −C₂; 0  I] [v; w].

Summary and Future Directions

The problem of estimating the state of a linear system from input and output measurements can be solved provided a weak observability condition holds. The problem addressed in this entry is the simplest possible estimation problem: the underlying system is linear and all variables are exactly measured. Observers for nonlinear systems and in the presence of signals corrupted by noise can also be designed exploiting some of the basic ingredients, such as the notions of error system and of output injection, discussed in this entry.

Cross-References

▸ Controllability and Observability
▸ Estimation, Survey on
▸ Hybrid Observers
▸ Kalman Filters
▸ Linear Systems: Continuous-Time, Time-Invariant State Variable Descriptions
▸ Linear Systems: Discrete-Time, Time-Invariant State Variable Descriptions
▸ Observer-Based Control
▸ Observers for Nonlinear Systems
▸ State Estimation for Batch Processes

Recommended Reading

Classical references on observers for linear systems are given below.

Bibliography

Antsaklis PJ, Michel AN (2007) A linear systems primer. Birkhäuser, Boston
Brockett RW (1970) Finite dimensional linear systems. Wiley, London
Luenberger DG (1964) Observing the state of a linear system. IEEE Trans Mil Electron 8:74–80
Trentelman HL, Stoorvogel AA, Hautus MLJ (2001) Control theory for linear systems. Springer, London
Zadeh LA, Desoer CA (1963) Linear system theory. McGraw-Hill, New York

Optimal Control and Mechanics

Anthony Bloch
Department of Mathematics, The University of Michigan, Ann Arbor, MI, USA

Abstract

There are very natural close connections between mechanics and optimal control as both involve variational problems. This is a huge subject and we just touch on some interesting connections here. A survey and history may be found in Sussmann and Willems (1997). Other aspects may be found in Bloch (2003).

Keywords

Nonholonomic integrator; Sub-Riemannian optimal control; Variational problems

Variational Nonholonomic Systems and Optimal Control

Variational nonholonomic problems (i.e., constrained variational problems) are equivalent to optimal control problems under certain regularity conditions. This issue was investigated in Bloch and Crouch (1994), employing the classical results of Rund (1966) and Bliss (1930), which relate classical constrained variational problems to Hamiltonian flows, although not optimal control problems. We outline the simplest relationship and refer to Bloch (2003) for more details.

Let Q be a smooth manifold and TQ its tangent bundle with coordinates (qⁱ, q̇ⁱ). Let L: TQ → R be a given smooth Lagrangian and let Φ: TQ → Rⁿ⁻ᵐ be a given smooth function. We consider the classical Lagrange problem:

min_{q(·)} ∫₀ᵀ L(q, q̇) dt  (1)

subject to the fixed endpoint conditions q(0) = 0, q(T) = q_T and subject to the constraints

Φ(q, q̇) = 0.

Consider a modified Lagrangian Λ(q, q̇, λ) = L(q, q̇) + λ · Φ(q, q̇) with Euler–Lagrange equations

(d/dt)(∂Λ/∂q̇)(q, q̇, λ) − (∂Λ/∂q)(q, q̇, λ) = 0, Φ(q, q̇) = 0.  (2)

We can rewrite this equation in Hamiltonian form and show that the resulting equations are equivalent to the equations of motion given by the maximum principle for a suitable optimal control problem. Set p = (∂Λ/∂q̇)(q, q̇, λ) and consider this equation together with the constraints Φ(q, q̇) = 0. We can solve these two equations for (q̇, λ) under suitable conditions as discussed in Bloch (2003). We obtain the standard Hamiltonian equations with H(q, p) = p · q̇(q, p) − L(q, q̇(q, p)).

We now compare this to the optimal control problem
min_{u(·)} ∫₀ᵀ g(q, u) dt  (3)

subject to q(0) = 0, q(T) = q_T, q̇ = f(q, u), where u ∈ Rᵐ and f, g are smooth functions. Then we have the following:

Theorem 1  The Lagrange problem and optimal control problem generate the same (regular) extremal trajectories, provided that:
(i) Φ(q, q̇) = 0 if and only if there exists a u such that q̇ = f(q, u).
(ii) L(q, f(q, u)) = g(q, u).

For the proof and more details, see Bloch (2003).

The n-Dimensional Rigid Body

An interesting mechanical example is the n-dimensional rigid body. See Manakov (1976) and Ratiu (1980).

One can introduce a related system which we will call the symmetric representation of the rigid body; see Bloch et al. (2002).

By definition, the left invariant representation of the symmetric rigid body system is given by the first-order equations

Q̇ = Q Ω, Ṗ = P Ω  (4)

where Q, P ∈ SO(n) and where Ω is regarded as a function of Q and P via the equations

Ω := J⁻¹(M) ∈ so(n) and M := Qᵀ P − Pᵀ Q.

One can check that differentiating M yields the classical form of the n-dimensional rigid body equations. For more on the precise relationship, see Bloch et al. (2002).

Now we can link the symmetric representation of the rigid body equations with the theory of optimal control. This work, developed in Bloch and Crouch (1996) and more generally in Bloch et al. (2002), has been further extended to optimal control problems for the infinitesimal generators of group actions (so-called Clebsch optimal control problems) in Gay-Balmaz and Ratiu (2011) and Bloch et al. (2011, 2013) and even further to a class of embedded control problems in Bloch et al. (2011, 2013).

Let T > 0, Q₀, Q_T ∈ SO(n) be given and fixed. Let the rigid body optimal control problem be given by

min_{U ∈ so(n)} (1/4) ∫₀ᵀ ⟨U, J(U)⟩ dt  (5)

subject to the constraint on U that there be a curve Q(t) ∈ SO(n) such that

Q̇ = Q U, Q(0) = Q₀, Q(T) = Q_T.  (6)

Proposition 1  The rigid body optimal control problem has optimal evolution equations (4) where P is the costate vector given by the maximum principle.

The optimal controls in this case are given by

U = J⁻¹(Qᵀ P − Pᵀ Q).  (7)

Kinematic Sub-Riemannian Optimal Control Problems

Optimal control of underactuated kinematic systems gives rise to very interesting mechanical systems.

The problem is referred to as sub-Riemannian in that it gives rise to a geodesic flow with respect to a singular metric (see the work of Strichartz (1983, 1987) and Montgomery (2002) and references therein). This problem has an interesting history in control theory (see Brockett 1973, 1981; Baillieul 1975). See also Bloch et al. (1994) and Sussmann (1996) and further references below.

We consider control systems of the form

ẋ = Σᵢ₌₁ᵐ Xᵢ uᵢ, x ∈ M, u ∈ Ω ⊂ Rᵐ,  (8)

where Ω contains an open subset that contains the origin, M is a smooth manifold of dimension n, and each of the vector fields in the collection F := {X₁, …, X_k} is complete.
We assume that the system satisfies the accessibility rank condition and is thus controllable, since there is no drift term. Then we can pose the optimal control problem

min_{u(·)} ∫₀ᵀ (1/2) Σᵢ₌₁ᵐ uᵢ²(t) dt  (9)

subject to the dynamics (8) and the endpoint conditions x(0) = x₀ and x(T) = x_T. These problems were studied by Griffiths (1983) from the constrained variational viewpoint and from the optimal control viewpoint by Brockett (1981, 1983). In the sub-Riemannian geodesic problem, abnormal extremals play an important role. See work by Strichartz (1983), Montgomery (1994, 1995), Sussmann (1996), and Agrachev and Sarychev (1996).

Example: Optimal Control and a Particle in a Magnetic Field  The control analysis of the Heisenberg model or nonholonomic integrator goes back to Brockett (1981) and Baillieul (1975), while a modern treatment of the relationship with a particle in a magnetic field may be found in Montgomery (1993), for example. A nice treatment of the pure mechanical aspects of a particle in a magnetic field may be found in Marsden and Ratiu (1999).

The Heisenberg optimal control equations are a particular case of planar charged particle motion in a magnetic field. This may be seen by considering the slightly more general problem below.

We now consider the optimal control problem

min ∫ (u² + v²) dt  (10)

subject to the equations

ẋ = u,
ẏ = v,
ż = A₁ u + A₂ v,  (11)

where A₁(x, y) and A₂(x, y) are smooth functions of x and y. A₁ = −y and A₂ = x recover the Heisenberg/nonholonomic integrator equations. More generally we get the flow of a particle in a magnetic field; it is not hard to carry out the optimal control analysis to see this. Details are in Bloch (2003).

Cross-References

▸ Discrete Optimal Control
▸ Optimal Control and Pontryagin's Maximum Principle
▸ Optimal Control with State Space Constraints
▸ Singular Trajectories in Optimal Control

Bibliography

Agrachev AA, Sarychev AV (1996) Abnormal sub-Riemannian geodesics: Morse index and rigidity. Ann Inst H Poincaré Anal Non Linéaire 13:635–690
Baillieul J (1975) Some optimization problems in geometric control theory. Ph.D. thesis, Harvard University
Baillieul J (1978) Geometric methods for nonlinear optimal control problems. J Optim Theory Appl 25:519–548
Bliss G (1930) The problem of Lagrange in the calculus of variations. Am J Math 52:673–744
Bloch AM (2003) (with Baillieul J, Crouch PE, Marsden JE) Nonholonomic mechanics and control. Interdisciplinary applied mathematics. Springer, New York
Bloch AM, Crouch PE (1994) Reduction of Euler–Lagrange problems for constrained variational problems and relation with optimal control problems. In: Proceedings of the 33rd IEEE conference on decision and control, Lake Buena Vista. IEEE, pp 2584–2590
Bloch AM, Crouch PE (1996) Optimal control and geodesic flows. Syst Control Lett 28(2):65–72
Bloch AM, Crouch PE, Ratiu TS (1994) Sub-Riemannian optimal control problems. Fields Inst Commun AMS 3:35–48
Bloch AM, Crouch P, Marsden JE, Ratiu TS (2002) The symmetric representation of the rigid body equations and their discretization. Nonlinearity 15:1309–1341
Bloch AM, Crouch PE, Nordkivst N, Sanyal AK (2011) Embedded geodesic problems and optimal control for matrix Lie groups. J Geom Mech 3:197–223
Bloch AM, Crouch PE, Nordkivst N (2013) Continuous and discrete embedded optimal control problems and their application to the analysis of Clebsch optimal control problems and mechanical systems. J Geom Mech 5:1–38
Brockett RW (1973) Lie theory and control systems defined on spheres. SIAM J Appl Math 25(2):213–225
Brockett RW (1981) Control theory and singular Riemannian geometry. In: Hilton PJ, Young GS (eds) New directions in applied mathematics. Springer, New York, pp 11–27
Brockett RW (1983) Nonlinear control theory and differential geometry. In: Proceedings of the international congress of mathematicians, Warsaw, pp 1357–1368
Gay-Balmaz F, Ratiu TS (2011) Clebsch optimal control formulation in mechanics. J Geom Mech 3:41–79
Griffiths PA (1983) Exterior differential systems. Birkhäuser, Boston
Manakov SV (1976) Note on the integration of Euler's equations of the dynamics of an n-dimensional rigid body. Funct Anal Appl 10:328–329
Marsden JE, Ratiu TS (1999) Introduction to mechanics and symmetry. Texts in applied mathematics, vol 17. Springer, New York (1st edn. 1994; 2nd edn. 1999)
Montgomery R (1993) Gauge theory of the falling cat. Fields Inst Commun 1:193–218
Montgomery R (1994) Abnormal minimizers. SIAM J Control Optim 32:1605–1620
Montgomery R (1995) A survey of singular curves in sub-Riemannian geometry. J Dyn Control Syst 1:49–90
Montgomery R (2002) A tour of sub-Riemannian geometries, their geodesics and applications. Mathematical surveys and monographs, vol 91. American Mathematical Society, Providence
Ratiu T (1980) The motion of the free n-dimensional rigid body. Indiana U Math J 29:609–627
Rund H (1966) The Hamilton–Jacobi theory in the calculus of variations. Krieger, New York
Strichartz R (1983) Sub-Riemannian geometry. J Diff Geom 24:221–263; see also J Diff Geom 30:595–596 (1989)
Strichartz RS (1987) The Campbell–Baker–Hausdorff–Dynkin formula and solutions of differential equations. J Funct Anal 72:320–345
Sussmann HJ (1996) A cornucopia of four-dimensional abnormal sub-Riemannian minimizers. In: Bellaïche A, Risler J-J (eds) Sub-Riemannian geometry. Progress in mathematics, vol 144. Birkhäuser, Basel, pp 341–364
Sussmann HJ, Willems JC (1997) 300 years of optimal control: from the Brachystochrone to the maximum principle. IEEE Control Syst Mag 17:32–44

Optimal Control and Pontryagin's Maximum Principle

Richard B. Vinter
Imperial College, London, UK

Abstract

Pontryagin's Maximum Principle is a collection of conditions that must be satisfied by solutions of a class of optimization problems involving dynamic constraints called optimal control problems. It unifies many classical necessary conditions from the calculus of variations. This article provides an overview of the Maximum Principle, including free-time and nonsmooth versions. A time-optimal control problem is solved as an example to illustrate its application.

Keywords

Dynamic constraints; Hamiltonian system; Maximum principle; Nonlinear systems; Optimization

Optimal Control

A widely used framework for studying minimization problems, encountered in the optimal selection of flight trajectories and other areas of advanced engineering design and operation involving dynamic constraints, is to view them as special cases of the problem:

(P)  Minimize J(x(·), u(·)) := ∫₀ᵀ L(t, x(t), u(t)) dt + g(x(0), x(T))
     over measurable functions u(·): [0, T] → Rᵐ and absolutely continuous functions x(·): [0, T] → Rⁿ satisfying
     ẋ(t) = f(t, x(t), u(t)) a.e.,
     u(t) ∈ Ω a.e.,
     (x(0), x(T)) ∈ C,
Optimal Control and Pontryagin's Maximum Principle

the data for which comprise a number T > 0, functions f: [0, T] × ℝ^n × ℝ^m → ℝ^n, L: [0, T] × ℝ^n × ℝ^m → ℝ and g: ℝ^n × ℝ^n → ℝ, and sets C ⊂ ℝ^n × ℝ^n and Ω ⊂ ℝ^m.

It is assumed that set C has the functional inequality and equality constraint set representation

    C = {(x₀, x₁) ∈ ℝ^n × ℝ^n : φ_i(x₀, x₁) ≤ 0 for i = 1, 2, …, k₁ and
         ψ_i(x₀, x₁) = 0 for i = 1, 2, …, k₂},   (1)

in which φ_i: ℝ^n × ℝ^n → ℝ, i = 1, …, k₁ and ψ_i: ℝ^n × ℝ^n → ℝ, i = 1, …, k₂ are given functions.

A control function is a measurable function u(·): [0, T] → ℝ^m satisfying u(t) ∈ Ω a.e. t ∈ [0, T]. A state trajectory x(·) associated with a control function u(·) is a solution to the differential equation ẋ(t) = f(t, x(t), u(t)). A pair of functions (x(·), u(·)) comprising a control function u(·) and an associated state trajectory x(·) satisfying the condition (x(0), x(T)) ∈ C is a feasible process. A feasible process (x̄(·), ū(·)) which achieves the minimum of J(x(·), u(·)) over all feasible processes is called a minimizer.

Frequently, the initial state is fixed, i.e., C takes the form

    C = {x₀} × C₁ for some x₀ ∈ ℝ^n and some C₁ ⊂ ℝ^n.

In this case, (P) is a minimization problem over control functions. Allowing freedom in the choice of initial state introduces a flexibility into the formulation which is useful in some applications, however.

Optimization problems involving dynamic constraints (such as, but not exclusively, those expressed as controlled differential equations) are known as optimal control problems. Various frameworks are available for studying such problems. (P) is of special importance, since it embraces a wide range of significant dynamic optimization problems which are beyond the reach of traditional variational techniques and, at the same time, it is well suited to the derivation of general necessary conditions of optimality.

The Maximum Principle

The centerpiece of optimal control theory is a set of conditions that a minimizer (x̄(·), ū(·)) must satisfy, known as Pontryagin's Maximum Principle or, simply, the Maximum Principle. It came to prominence through a 1961 book, which appeared in English translation as Pontryagin LS et al. (1962). It bears the name of L S Pontryagin, because of his role as leader of the research group at the Steklov Institute, Moscow, which achieved this advance. But the first proof is attributed to Boltyanskii. For given λ ≥ 0, define the Hamiltonian function H^λ: [0, T] × ℝ^n × ℝ^n × ℝ^m → ℝ

    H^λ(t, x, p, u) := p^T f(t, x, u) − λ L(t, x, u).

Theorem 1 (The Maximum Principle) Let (x̄(·), ū(·)) be a minimizer for (P). Assume that the following hypotheses are satisfied:
(i) g is continuously differentiable.
(ii) φ_i, i = 1, …, k₁ and ψ_i, i = 1, …, k₂, are continuously differentiable.
(iii) With f̃(t, x, u) = (L(t, x, u), f(t, x, u)), f̃(·, ·, ·) is continuous, f̃(t, ·, u) is continuously differentiable for each (t, u), and there exist ε > 0 and k(·) ∈ L¹ such that

    |f̃(t, x, u) − f̃(t, x′, u)| ≤ k(t)|x − x′|

for all x, x′ ∈ ℝ^n such that |x − x̄(t)| ≤ ε and |x′ − x̄(t)| ≤ ε, and u ∈ Ω, a.e. t ∈ [0, T].
(iv) Ω is a Borel set.
Then, there exist a number λ (λ = 0 or 1), an absolutely continuous arc p: [0, T] → ℝ^n, numbers α_i ≥ 0 for i = 1, …, k₁ and numbers β_i for i = 1, …, k₂ satisfying

    (p(·), λ, {α_i}, {β_i}) ≠ (0, 0, {0, …, 0}, {0, …, 0})

and such that the following conditions are satisfied:
The Adjoint Equation:

    −ṗ(t) = (∂/∂x) f^T(t, x̄(t), ū(t)) p(t) − λ (∂/∂x) L^T(t, x̄(t), ū(t))  a.e.,

The Maximization of the Hamiltonian Condition:

    H^λ(t, x̄(t), p(t), ū(t)) = max_{u∈Ω} H^λ(t, x̄(t), p(t), u)  a.e.,

The Transversality Condition:

    (p^T(0), −p^T(T)) = λ ∇g(x̄(0), x̄(T)) + Σ_{i=1}^{k₁} α_i ∇φ_i(x̄(0), x̄(T)) + Σ_{i=1}^{k₂} β_i ∇ψ_i(x̄(0), x̄(T))

and α_i = 0 for all i ∈ {1, …, k₁} such that φ_i(x̄(0), x̄(T)) < 0; in which

    ∇h(x₀, x₁)(x̄₀, x̄₁) := ((∂/∂x₀) h(x̄₀, x̄₁), (∂/∂x₁) h(x̄₀, x̄₁)).   (2)

If the functions L(t, x, u) and f(t, x, u) are independent of t, then also

Constancy of the Hamiltonian for Autonomous Problems:

    H^λ(x̄(t), p(t), ū(t)) = c  a.e.

for some constant c.

We allow the cases k₁ = 0 (no inequality constraints) and k₂ = 0 (no equality endpoint constraints). In the first case, the non-degeneracy condition becomes (p(·), λ, {β_i}) ≠ (0, 0, 0) and the summation involving the α_i's is dropped from the transversality condition. The second case, or any combination of the two cases, is treated similarly.

Derivation of the costate equation and boundary conditions. A simple way to derive the differential equations for the p_i(·)'s is, first, to construct the Hamiltonian H(t, x, p, u) = p^T f(t, x, u) − L(t, x, u) and, second, to use the fact that the i-th component p_i(·) of the costate p(t) = [p₁(t), …, p_n(t)]^T satisfies the equation:

    ṗ_i(t) = −(∂/∂x_i) H(t, x̄(t), p(t), ū(t))  for i = 1, …, n.

The preceding equations are of course merely a component-wise statement of the costate equation above. In many applications the endpoint constraints take the form

    x_i(0) = ξ⁰_i for i ∈ J₀ and x_i(0) ∈ ℝ for i ∉ J₀,
    x_i(T) = ξ¹_i for i ∈ J₁ and x_i(T) ∈ ℝ for i ∉ J₁,

for given index sets J₀, J₁ ⊂ {1, …, n} and numbers ξ⁰_i for i ∈ J₀ and ξ¹_i for i ∈ J₁, i.e., the endpoints of each state trajectory component are either "fixed" or "free." In such cases the rules for setting up the boundary conditions on the p_i(·)'s are

    p_i(0) ∈ ℝ for i ∈ J₀ and p_i(0) = λ (∂/∂x₀ᵢ) g(x̄(0), x̄(T)) for i ∉ J₀,
    p_i(T) ∈ ℝ for i ∈ J₁ and −p_i(T) = λ (∂/∂x₁ᵢ) g(x̄(0), x̄(T)) for i ∉ J₁,

i.e., if x_i(0) (respectively x_i(T)) is fixed, then p_i(0) (respectively p_i(T)) is free, and if x_i(0) (respectively x_i(T)) is free, then p_i(0) (respectively p_i(T)) is fixed.

The optimal control problem (P) is a generalization of the following problem in the calculus of variations:

    Minimize ∫₀^T L(t, x(t), ẋ(t)) dt
    over absolutely continuous arcs x(·): [0, T] → ℝ^n satisfying
    (x(0), x(T)) = (a, b),   (3)
for given L: [0, T] × ℝ^n × ℝ^n → ℝ and (a, b) ∈ ℝ^n × ℝ^n. This problem is a special case of (P) in which f(t, x, u) = u, Ω = ℝ^n, k₁ = 0, k₂ = 2n and

    (ψ¹(x₀, x₁), …, ψ^n(x₀, x₁), ψ^{n+1}(x₀, x₁), …, ψ^{2n}(x₀, x₁)) = (x₀^T − a^T, x₁^T − b^T).

It is a straightforward exercise to deduce from the Maximum Principle, in this special case, that a minimizer satisfies the classical Euler–Lagrange and Weierstrass conditions and also that the minimizer and associated costate arc satisfy Hamilton's system of equations, under an additional uniform convexity hypothesis on L(t, x, ·). Thus, the Maximum Principle unifies many of the classical necessary conditions from the calculus of variations and, furthermore, validates them under reduced hypotheses. But it has far-reaching implications, beyond these conditions, because it allows the presence of pathwise constraints on the velocities, expressed in terms of a controlled differential equation and a control constraint set, which are encountered in engineering design, econometrics, and other areas.

The Hamiltonian System

In favorable circumstances, we are justified in setting the cost multiplier λ = 1 and, furthermore, the maximization of the Hamiltonian condition permits us, for each t, to express u as a function of x and p:

    u = u*(t, x, p).

The Maximum Principle now asserts that a minimizing arc x̄(·) is the first component of a pair of absolutely continuous functions (x̄(·), p(·)) satisfying Hamilton's system of equations:

    (−ṗ^T(t), ẋ̄^T(t)) = ∇_{xp} H₁(t, x̄(t), p(t), u*(t, x̄(t), p(t)))  a.e.,   (4)

in which ∇_{xp} H₁ denotes the gradient of H(t, x, p, u) w.r.t. the vector [x^T, p^T]^T variable for fixed (t, u), together with the endpoint conditions

    (x̄(0), x̄(T)) ∈ C and
    (p^T(0), −p^T(T)) = ∇g(x̄(0), x̄(T)) + Σ_{i=1}^{k₁} α_i ∇φ_i(x̄(0), x̄(T)) + Σ_{i=1}^{k₂} β_i ∇ψ_i(x̄(0), x̄(T)),

for some nonnegative numbers {α_i} and numbers {β_i} satisfying

    α_i = 0 for all i ∈ {1, …, k₁} such that φ_i(x̄(0), x̄(T)) < 0,

where ∇g, ∇φ_i and ∇ψ_i etc., are as defined in (2). The minimizing control satisfies the relation

    ū(t) = u*(t, x̄(t), p(t)).

Notice that the first-order vector differential equation (4) is a system of 2n scalar, first-order differential equations. Let us suppose that k̄₁ inequality endpoint constraints are active at (x̄(0), x̄(T)). Then, satisfaction of the active constraints and the transversality condition imposes 2n + k̄₁ + k₂ conditions on the boundary values of (x̄(·), p(·)). Taking account of the fact, however, that there are k̄₁ + k₂ unknown endpoint multipliers, we see that the effective number of endpoint constraints accompanying the differential equation (4) is

    2n + k̄₁ + k₂ − (k̄₁ + k₂) = 2n.

Thus, the set of 2n scalar first-order differential equations (4) defining the "two-point boundary value problem" to determine (x̄, p) has the "right" number of endpoint conditions.
Refinements

Free-Time Problems: Consider a variant on the "autonomous" case of problem (P) (L and f do not depend on t), call it (FT), in which the terminal time T is no longer fixed, but is a choice variable along with the control function and the initial state, and the cost function is

    J̃(T, x(·), u(·)) := ∫₀^T L(x(t), u(t)) dt + g̃(T, x(0), x(T))

for some function g̃(·, ·, ·). Take a minimizer (T̄, x̄(·), ū(·)) for (FT). Assume, in addition to hypotheses (i)–(iii), that Ω is bounded and the function k(·) in (iii) is bounded. Then the Maximum Principle conditions (for data in which the end time is frozen at T = T̄) continue to be satisfied for some p(·): [0, T̄] → ℝ^n and λ, including the constancy of the Hamiltonian condition

    H^λ(x̄(t), p(t), ū(t)) = c  a.e. t ∈ [0, T̄]

for some constant c. But a new condition is required to reflect the extra degree of freedom in the new problem specification, namely, the free end time. This is an additional transversality condition involving the constant value c of the Hamiltonian:

Free Time Transversality Condition: c = λ (∂/∂T) g̃(T̄, x̄(0), x̄(T̄)).

Other Refinements: Versions of the Maximum Principle are available to take account of pathwise functional inequality constraints on state variables ("pure" state constraints) and of both state and control variables ("mixed" constraints). Maximum Principle-like conditions have also been derived for optimal control problems in which the dynamic constraint takes the form of a retarded differential equation with control terms and in which the class of control functions is enlarged to include Dirac delta functions ("impulse" optimal control problems).

The Nonsmooth Maximum Principle

In early derivations of the Maximum Principle, it was assumed that the functions f(t, x, u) and L(t, x, u) were continuously differentiable with respect to the x variable. A major research endeavor since the early 1970s has been to find versions of the Maximum Principle that remain valid when the functions f(t, x, u) and L(t, x, u) satisfy merely a "bounded slope" or, synonymously, a Lipschitz continuity condition with respect to x. Such functions are "nonsmooth" in the sense that they can fail to be differentiable, in the conventional sense, at some points in their domains. An overview of the Maximum Principle would be incomplete without reference to such advances.

The search for nonsmooth optimality conditions is motivated by a desire to solve optimal control problems where, in particular, the function f(t, x, u) is a piecewise linear function of x (for fixed (t, u)). Such functions arise, for example, when f(t, x, u) is constructed empirically via a lookup table and linear interpolation. Nonsmooth cost integrands are encountered when they are constructed using "pointwise" supremum and/or "absolute value" operations. The function

    J(x(·)) = ∫₀^T |x(t)| dt + max{x(T), 0},

which penalizes the L¹ norm of the state trajectory and the terminal value of the scalar state, but only if this is nonnegative, is a case in point.

When attempting to generalize the Maximum Principle to allow for nonsmooth data, we encounter the challenge of interpreting the adjoint equation, which can be written as

    −ṗ(t) = (∂/∂x) H(t, x̄(t), p(t), ū(t)),

in circumstances when the x-gradients of f and L are not defined, at least not in a conventional sense. One approach to dealing with this problem is via the Clarke generalized gradient ∂m of a function m: ℝ^n → ℝ at a point x̄:
    ∂m(x̄) := co {ξ : there exist sequences x_i → x̄, ξ_i → ξ such that, for each i,
              m(·) is Fréchet differentiable at x_i and ξ_i = (∂/∂x) m(x_i)}.

Here, "co" means closed convex hull. In a landmark paper, Clarke FH (1976), Clarke proved a necessary condition commonly referred to as the nonsmooth Maximum Principle, in which the adjoint equation is replaced by a differential inclusion involving the (partial) generalized gradient ∂_x H(t, x̄(t), p(t), ū(t)) of H(t, ·, p(t), ū(t)) w.r.t. x, evaluated at x̄(t), namely,

    −ṗ^T(t) ∈ ∂_x H(t, x̄(t), p(t), ū(t))  a.e. t ∈ [0, T].

This formulation of the "adjoint inclusion" for the nonsmooth Maximum Principle and the unrestricted hypothesis under which it is derived in this paper remain state of the art.

Example

We illustrate the application of the Maximum Principle with a simple example. It has the following interpretation. A 1 kg mass is located 1 m along the line and has zero velocity. We seek a time T̄ > 0 s which is the minimum over all times T > 0 having the property: there exists a time-varying force u(t), 0 ≤ t ≤ T, satisfying

    −1 ≤ u(t) ≤ +1

such that, under the action of the force, the mass is located at the origin with zero velocity at time T. Note that, in consequence of Newton's second law, the vector x(t) = (x₁(t), x₂(t)) comprising the displacement and velocity of the mass satisfies

    [ẋ₁(t)]   [0 1] [x₁(t)]   [0]
    [ẋ₂(t)] = [0 0] [x₂(t)] + [1] u(t).   (5)

This is a special case of the free-time problem

    Minimize T
    over times T > 0, measurable functions u(·): [0, T] → ℝ and
    absolutely continuous functions x(·): [0, T] → ℝ² such that
    ẋ(t) = Ax(t) + bu(t)  a.e.,
    u(t) ∈ Ω  a.e.,
    (x₁(0), x₂(0)) = (1, 0) and (x₁(T), x₂(T)) = (0, 0),

in which A = [0 1; 0 0], b = [0; 1] and Ω = [−1, +1].

The (free-time) Maximum Principle provides the following information about a minimizing end time T̄, control ū(·), and corresponding state x̄(·) = (x̄₁(·), x̄₂(·)). There exists an arc p(·) = [p₁(·), p₂(·)]^T such that

    ẋ̄₁(t) = x̄₂(t) and ẋ̄₂(t) = ū(t),   (6)
    ṗ₁(t) = 0 and −ṗ₂(t) = p₁(t),   (7)
    ū(t) = arg max {p₂(t)u : u ∈ [−1, +1]},   (8)
    (x̄₁, x̄₂)(0) = (1, 0) and (x̄₁, x̄₂)(T̄) = (0, 0),   (9)
    p₁(t) x̄₂(t) + |p₂(t)| = λ for all t.   (10)

Condition (8) permits us to express ū(·) in terms of p₂(·), thus

    ū(t) = sign{p₂(t)},

and thereby eliminate ū(·). It can be shown that relations (6)–(10) have a unique solution for T̄, ū(t), x̄(t), p(t) and λ = 0 or 1. Furthermore, these relations cannot be satisfied with λ = 0. The unique solution (with λ = 1) is

    T̄ = 2,
    (x̄₁(t), x̄₂(t)) = (1 − ½t², −t)                              if t ∈ [0, 1),
                      (½ − (t − 1) + ½(t − 1)², −1 + (t − 1))    if t ∈ [1, 2],
    ū(t) = −1 if t ∈ [0, 1); +1 if t ∈ [1, 2],
    p₁(t) = −1 and p₂(t) = −1 + t for t ∈ [0, 2].
The Maximum Principle is a necessary condition of optimality. Since a minimizer exists and since (T̄, x̄(·), ū(·), p(·)) is a unique solution to the Maximum Principle relations, it follows that (T̄, x̄(·), ū(·)) is the solution to the problem. This problem is amenable to simpler, more elementary, solution techniques. But the above solution is enlightening, because it highlights important generic features of the Maximum Principle. We see how the "maximization of the Hamiltonian condition" can be used to eliminate the control function and thereby to set up a two-point boundary value problem for x̄(·) and p(·) (a very nonclassical construction).

Cross-References

▶ Numerical Methods for Nonlinear Optimal Control Problems
▶ Optimal Control and the Dynamic Programming Principle
▶ Optimal Control and Mechanics
▶ Optimal Control with State Space Constraints
▶ Singular Trajectories in Optimal Control

Bibliography

Clarke FH (1976) The maximum principle under minimal hypotheses. SIAM J Control Optim 14:1078–1091
Pontryagin LS et al (1962) The mathematical theory of optimal processes. Trirogoff KN (transl), Neustadt LW (ed). Wiley, New York
Reflections on the origins of the Maximum Principle appear in:
Pesch HJ, Plail M (2009) The maximum principle of optimal control: a history of ingenious ideas and missed opportunities. Control Cybern 38:973–995
Expository texts on the Maximum Principle and related control theory include:
Berkovitz LD (1974) Optimal control theory. Applied mathematical sciences, vol 12. Springer, New York
Fleming WH, Rishel RW (1975) Deterministic and stochastic optimal control. Springer, New York
Ioffe AD, Tihomirov VM (1979) Theory of extremal problems. North-Holland, Amsterdam
Ross IM (2009) A primer on Pontryagin's principle in optimal control. Collegiate Publishers, San Francisco
For expository texts that also cover advances in the theory of necessary conditions related to the Maximum Principle, based on techniques of Nonsmooth Analysis, we refer to:
Clarke FH (1983) Optimization and nonsmooth analysis. Wiley-Interscience, New York
Clarke FH (2013) Functional analysis, calculus of variations and optimal control. Graduate texts in mathematics. Springer, London
Vinter RB (2000) Optimal control. Birkhäuser, Boston
Engineering texts illustrating the application of the Maximum Principle to solve problems of optimal control and design, in flight mechanics and other areas, include:
Bryson AE, Ho Y-C (1975) Applied optimal control (revised edn). Halstead Press (a division of John Wiley and Sons), New York
Bryson AE (1999) Dynamic optimization. Addison Wesley Longman, Menlo Park

Optimal Control and the Dynamic Programming Principle

Maurizio Falcone
Dipartimento di Matematica, SAPIENZA – Università di Roma, Rome, Italy

Abstract

This entry illustrates the application of Bellman's dynamic programming principle within the context of optimal control problems for continuous-time dynamical systems. The approach leads to a characterization of the optimal value of the cost functional, over all possible trajectories given the initial conditions, in terms of a partial differential equation called the Hamilton–Jacobi–Bellman equation. Importantly, this can be used to synthesize the corresponding optimal control input as a state-feedback law.

Keywords

Continuous-time dynamics; Hamilton–Jacobi–Bellman equation; Optimization; Nonlinear systems; State feedback

Introduction

The dynamic programming principle (DPP) is a fundamental tool in optimal control theory. It was largely developed by Richard Bellman
in the 1950s (Bellman 1957) and has since been applied to various problems in deterministic and stochastic optimal control. The goal of optimal control is to determine the control function and the corresponding trajectory of a dynamical system which together optimize a given criterion, usually expressed in terms of an integral along the trajectory (the cost functional) (Fleming and Rishel 1975; Macki and Strauss 1982). The function which associates with the initial condition of the dynamical system the optimal value of the cost functional among all the possible trajectories is called the value function. The most interesting point is that via the dynamic programming principle, one can derive a characterization of the value function in terms of a nonlinear partial differential equation (the Hamilton–Jacobi–Bellman equation) and then use it to synthesize a feedback control law. This is the major advantage over the approach based on the Pontryagin Maximum Principle (PMP) (Boltyanskii et al. 1956; Pontryagin et al. 1962). In fact, the PMP merely gives necessary conditions for the characterization of the open-loop optimal control and of the corresponding optimal trajectory. The DPP has also been applied to construct approximation schemes for the value function, although this approach suffers from the "curse of dimensionality" since one has to solve a nonlinear partial differential equation in a high dimension. Despite the elegance of the DPP approach, its practical application is limited by this bottleneck, and the solution of many optimal control problems has been accomplished instead via the two-point boundary value problem associated with the PMP.

The Infinite Horizon Problem

Let us present the main ideas for the classical infinite horizon problem. Let a controlled dynamical system be given by

    ẏ(s) = f(y(s), α(s)),
    y(t₀) = x₀,   (1)

where x₀, y(s) ∈ ℝ^d, and

    α: [t₀, T] → A ⊂ ℝ^m,

with T finite or +∞. Under the assumption that the control is measurable, existence and uniqueness properties for the solution of (1) are ensured by the Carathéodory theorem:

Theorem 1 (Carathéodory) Assume that:
1. f(·, ·) is continuous.
2. There exists a positive constant L_f > 0 such that

    |f(x, a) − f(y, a)| ≤ L_f |x − y|,

for all x, y ∈ ℝ^d, t ∈ ℝ₊ and a ∈ A.
3. f(x, α(t)) is measurable with respect to t.
Then, there is a unique absolutely continuous function y: [t₀, T] → ℝ^d that satisfies

    y(s) = x₀ + ∫_{t₀}^s f(y(τ), α(τ)) dτ,   (2)

which is interpreted as the solution of (1).

Note that the solution is continuous, but only a.e. differentiable, so it must be regarded as a weak solution of (1). By the theorem above, fixing a control in the set of admissible controls

    α ∈ 𝒜 := {α: [t₀, T] → A, measurable}

yields a unique trajectory of (1), which is denoted by y_{x₀,t₀}(s; α). Changing the control policy generates a family of solutions of the controlled system (1) with index α. Since the dynamics (1) are "autonomous," the initial time t₀ can be shifted to 0 by a change of variable. So to simplify the notation for autonomous dynamics, we can set t₀ = 0, and we denote this family by y_{x₀}(s; α) (or even write it as y(s) if no ambiguity over the initial state or control arises). It is customary in dynamic programming, moreover, to use the notations x and t instead of x₀ and t₀ (since x and t appear as variables in the Hamilton–Jacobi–Bellman equation).

Optimal control problems require the introduction of a cost functional J: 𝒜 → ℝ which is used to select the "optimal trajectory" for (1). In the case of the infinite horizon problem, we set t₀ = 0, x₀ = x, and this functional is defined as
    J_x(α) = ∫₀^∞ g(y_x(s; α), α(s)) e^{−λs} ds   (3)

for a given λ > 0. The function g represents the running cost and λ is the discount factor, which can be used to take into account the reduced value, at the initial time, of future costs. From a technical point of view, the presence of the discount factor ensures that the integral is finite whenever g is bounded. Note that one can also consider the undiscounted problem (λ = 0) provided the integral is still finite. The goal of optimal control is to find an optimal pair (y*, α*) that minimizes the cost functional. If we seek optimal controls in open-loop form, i.e., as functions of t, then the Pontryagin Maximum Principle furnishes necessary conditions for a pair (y*, α*) to be optimal.

A major drawback of an open-loop control is that, being constructed as a function of time, it cannot take into account errors in the true state of the system, due, for example, to model errors or external disturbances, which may take the evolution far from the optimal forecasted trajectory. Another limitation of this approach is that a new computation of the control is required whenever the initial state is changed.

For these reasons, we are interested in the so-called feedback controls, that is, controls expressed as functions of the state of the system. Under feedback control, if the system trajectory is perturbed, the system reacts by changing its control strategy according to the change in the state. One of the main motivations for using the DPP is that it yields solutions to optimal control problems in the form of feedback controls.

DPP for the Infinite Horizon Problem
The starting point of dynamic programming is to introduce an auxiliary function, the value function, which for our problem is

    v(x) = inf_{α∈𝒜} J_x(α),   (4)

where, as above, x is the initial position of the system. The value function has a clear meaning: it is the optimal cost associated with the initial position x. This is a reference value which can be useful to evaluate the efficiency of a control: if J_x(ᾱ) is close to v(x), this means that ᾱ is "efficient."

Bellman's dynamic programming principle provides a first characterization of the value function.

Proposition 1 (DPP for the infinite horizon problem) Under the assumptions of Theorem 1, for all x ∈ ℝ^d and τ > 0,

    v(x) = inf_{α∈𝒜} { ∫₀^τ g(y_x(s; α), α(s)) e^{−λs} ds + e^{−λτ} v(y_x(τ; α)) }.   (5)

Proof Denote by ṽ(x) the right-hand side of (5). First, we remark that for any x ∈ ℝ^d and ᾱ ∈ 𝒜,

    J_x(ᾱ) = ∫₀^∞ g(ȳ(s), ᾱ(s)) e^{−λs} ds
           = ∫₀^τ g(ȳ(s), ᾱ(s)) e^{−λs} ds + ∫_τ^∞ g(ȳ(s), ᾱ(s)) e^{−λs} ds
           = ∫₀^τ g(ȳ(s), ᾱ(s)) e^{−λs} ds + e^{−λτ} ∫₀^∞ g(ȳ(s + τ), ᾱ(s + τ)) e^{−λs} ds
           ≥ ∫₀^τ g(ȳ(s), ᾱ(s)) e^{−λs} ds + e^{−λτ} v(ȳ(τ))

(here, y_x(s; ᾱ) is abbreviated as ȳ(s)). Taking the infimum over all trajectories, first over the right-hand side and then the left of this inequality, yields

    v(x) ≥ ṽ(x).   (6)

To prove the opposite inequality, we recall that ṽ is defined as an infimum, and so, for any x ∈ ℝ^d and ε > 0, there exists a control ᾱ_ε (and the corresponding evolution ȳ_ε) such that
    ṽ(x) + ε ≥ ∫₀^τ g(ȳ_ε(s), ᾱ_ε(s)) e^{−λs} ds + e^{−λτ} v(ȳ_ε(τ)).   (7)

On the other hand, the value function v being also defined as an infimum, for any x ∈ ℝ^d and ε > 0, there exists a control α̃_ε such that

    v(ȳ_ε(τ)) + ε ≥ J_{ȳ_ε(τ)}(α̃_ε).   (8)

Inserting (8) in (7), we get

    ṽ(x) ≥ ∫₀^τ g(ȳ_ε(s), ᾱ_ε(s)) e^{−λs} ds + e^{−λτ} J_{ȳ_ε(τ)}(α̃_ε) − (1 + e^{−λτ}) ε
         ≥ J_x(α̂) − (1 + e^{−λτ}) ε
         ≥ v(x) − (1 + e^{−λτ}) ε,   (9)

where α̂ is a control defined by

    α̂(s) = ᾱ_ε(s)       if 0 ≤ s < τ,
            α̃_ε(s − τ)   if s ≥ τ.   (10)

(Note that α̂(·) is still measurable). Since ε is arbitrary, (9) finally yields ṽ(x) ≥ v(x).

We observe that this proof crucially relies on the fact that the control defined by (10) still belongs to 𝒜, being a measurable control. The possibility of obtaining an admissible control by joining together two different measurable controls is known as the concatenation property.

The Hamilton–Jacobi–Bellman Equation

The DPP can be used to characterize the value function in terms of a nonlinear partial differential equation. In fact, let α* ∈ 𝒜 be the optimal control, and y* the associated evolution (to simplify, we are assuming that the infimum is a minimum). Then,

    v(x) = ∫₀^τ g(y*(s), α*(s)) e^{−λs} ds + e^{−λτ} v(y*(τ)),

that is,

    v(x) − e^{−λτ} v(y*(τ)) = ∫₀^τ g(y*(s), α*(s)) e^{−λs} ds,

so that adding and subtracting e^{−λτ} v(x) and dividing by τ, we get

    e^{−λτ} (v(x) − v(y*(τ)))/τ + v(x)(1 − e^{−λτ})/τ = (1/τ) ∫₀^τ g(y*(s), α*(s)) e^{−λs} ds.

Assume now that v is regular. By passing to the limit as τ → 0⁺, we have

    lim_{τ→0⁺} − (v(y*(τ)) − v(x))/τ = −Dv(x) · ẏ*(0) = −Dv(x) · f(x, α*(0)),
    lim_{τ→0⁺} v(x)(1 − e^{−λτ})/τ = λ v(x),
    lim_{τ→0⁺} (1/τ) ∫₀^τ g(y*(s), α*(s)) e^{−λs} ds = g(x, α*(0)),

where we have assumed that α*(·) is continuous at 0. Then, we can conclude

    λ v(x) − Dv(x) · f(x, a*) − g(x, a*) = 0   (11)

where a* = α*(0). Similarly, using the equivalent form

    v(x) + sup_{α∈𝒜} { −∫₀^τ g(y(s), α(s)) e^{−λs} ds − e^{−λτ} v(y(τ)) } = 0

of the DPP and the inequality, this implies for any (continuous at 0) control α ∈ 𝒜,

    λ v(x) − Dv(x) · f(x, a) − g(x, a) ≤ 0,  for every a ∈ A.   (12)

Combining (11) and (12), we obtain the Hamilton–Jacobi–Bellman equation (or dynamic programming equation):
    λ u(x) + sup_{a∈A} { −f(x, a) · Du(x) − g(x, a) } = 0,   (13)

which characterizes the value function for the infinite horizon problem associated with minimizing (3). Note that given x, the value of a achieving the max (assuming it exists) corresponds to the control a* = α*(0), and this makes it natural to interpret the argmax in (13) as the optimal feedback at x (see Bardi and Capuzzo Dolcetta (1997) for more details).

In short, (13) can be written as

    H(x, u, Du) = 0

with x ∈ ℝ^d, and

    H(x, u, p) = λ u(x) + sup_{a∈A} { −f(x, a) · p − g(x, a) }.   (14)

Note that H(x, u, ·) is convex (being the sup of a family of linear functions) and that H(x, ·, p) is monotone (since λ > 0). It is also easy to see that the solution u is not differentiable even when f and g are smooth functions (i.e., f, g ∈ C¹(ℝ^d × A)), so we need to deal with weak solutions of the Bellman equation. This can be done in the framework of viscosity solutions, a theory initiated by Crandall and Lions in the 1980s which has been successfully applied in many areas such as optimal control, fluid dynamics, and image processing (see the books Barles (1994) and Bardi and Capuzzo Dolcetta (1997) for an extended introduction and numerous applications to optimal control). Typically viscosity solutions are Lipschitz continuous solutions, so they are differentiable almost everywhere.

An Extension to the Minimum Time Problem

In the minimum time problem, we want to minimize the time of arrival of the state on a given target set 𝒯. We will assume that 𝒯 ⊂ ℝ^d is a closed set. Then our cost functional will be given by

    J(x, α) = t_x(α)

where

    t_x(α) := min{t : y_x(t, α) ∈ 𝒯}  if y_x(t, α) ∈ 𝒯 for some t ≥ 0,
    t_x(α) := +∞                      if y_x(t, α) ∉ 𝒯 for any t ≥ 0.

The corresponding value function is called the minimum time function

    T(x) := inf_{α(·)∈𝒜} t_x(α(·)).   (15)

The main difference with respect to the previous problem is that now the value function T will be finite valued only on a subset ℛ which depends on the target, on the dynamics, and on the set of admissible controls.

Definition 1 The reachable set ℛ is defined by

    ℛ := ∪_{t>0} ℛ(t) = {x ∈ ℝ^d : T(x) < +∞}

where, for t > 0, ℛ(t) := {x ∈ ℝ^d : T(x) < t}.

The meaning is clear: ℛ is the set of initial points which can be driven to the target in finite time. The system is said to be controllable on 𝒯 if for all t > 0, 𝒯 ⊂ int(ℛ(t)) (here, int(D) denotes the interior of the set D). Assuming controllability in a neighborhood of the target, one gets the continuity of the minimum time function, and under the assumptions made on f, A, and 𝒯, one can prove some interesting properties:
(i) ℛ is open.
(ii) T is continuous on ℛ.
(iii) lim_{x→x₀} T(x) = +∞, for any x₀ ∈ ∂ℛ.
Now let us denote by X_D the characteristic function of the set D. Using in ℛ arguments similar to the proof of the DPP in the previous section, one can obtain the following DPP:

Proposition 2 (DPP for the minimum time problem) For any x ∈ ℛ, the value function satisfies

    T(x) = inf_{α∈𝒜} { t ∧ t_x(α) + X_{{t ≤ t_x(α)}} T(y_x(t, α)) }  for any t ≥ 0   (16)
Optimal Control and the Dynamic Programming Principle 961

and (Bardi and Capuzzo Dolcetta 1997). For a


short introduction to numerical methods based
T .x/ D inf ft C T .yx .t; ˛//g on DP and exploiting the so-called “value
˛2A
for any $t \in [0, T(x)]$. (17)

From the previous DPP, one can also obtain the following characterization of the minimum time function.

Proposition 3 Let $\mathbb{R}^n \setminus \mathcal{T}$ be open and $T \in C(\mathbb{R}^n \setminus \mathcal{T})$; then $T$ is a viscosity solution of

$$\max_{a \in A} \{ -f(x,a) \cdot \nabla T(x) \} = 1, \quad x \in \mathbb{R}^n \setminus \mathcal{T}, \qquad (18)$$

coupled with the natural boundary condition

$$T(x) = 0 \ \text{ for } x \in \partial \mathcal{T}, \qquad \lim_{x \to \partial \mathcal{R}} T(x) = +\infty.$$

By the change of variable $v(x) = 1 - e^{-T(x)}$, one can obtain a simpler problem, getting rid of the boundary condition on $\partial \mathcal{R}$ (which is unknown). The new function $v$ will be the unique viscosity solution of an external Dirichlet problem (see Bardi and Capuzzo Dolcetta (1997) for more details), and the reachable set can be recovered a posteriori via the relation $\mathcal{R} = \{ x \in \mathbb{R}^d : v(x) < 1 \}$.

Further Extensions and Related Topics

The DPP has been extended from deterministic control problems to many other problems. In the framework of stochastic control problems, where the dynamics are given by a diffusion, the characterization of the value function obtained via the DPP leads to a second-order Hamilton–Jacobi–Bellman equation (Fleming and Soner 1993; Kushner and Dupuis 2001). Another interesting extension has been made in differential games, where the DPP is based on the delicate notion of nonanticipative strategies for the players and leads to a nonconvex nonlinear partial differential equation (the Isaacs equation). For numerical schemes based on the DPP, such as the "value iteration," we refer the interested reader to Appendix A in Bardi and Capuzzo Dolcetta (1997) and to Kushner and Dupuis (2001) (see also the book Howard (1960) for the "policy iteration").

Cross-References

- Numerical Methods for Nonlinear Optimal Control Problems
- Optimal Control and Pontryagin's Maximum Principle

Bibliography

Bardi M, Capuzzo Dolcetta I (1997) Optimal control and viscosity solutions of Hamilton-Jacobi-Bellman equations. Birkhäuser, Boston
Barles G (1994) Solutions de viscosité des équations de Hamilton-Jacobi. In: Mathématiques et applications, vol 17. Springer, Paris
Bellman R (1957) Dynamic programming. Princeton University Press, Princeton
Bertsekas DP (1987) Dynamic programming: deterministic and stochastic models. Prentice Hall, Englewood Cliffs
Boltyanskii VG, Gamkrelidze RV, Pontryagin LS (1956) On the theory of optimal processes (in Russian). Doklady Akademii Nauk SSSR 110:7–10
Fleming WH, Rishel RW (1975) Deterministic and stochastic optimal control. Springer, New York
Fleming WH, Soner HM (1993) Controlled Markov processes and viscosity solutions. Springer, New York
Howard RA (1960) Dynamic programming and Markov processes. Wiley, New York
Kushner HJ, Dupuis P (2001) Numerical methods for stochastic control problems in continuous time. Springer, Berlin
Macki J, Strauss A (1982) Introduction to optimal control theory. Springer, Berlin/Heidelberg/New York
Pontryagin LS, Boltyanskii VG, Gamkrelidze RV, Mishchenko EF (1961) Matematicheskaya teoriya optimal'nykh prozessov. Fizmatgiz, Moscow. Translated into English: The mathematical theory of optimal processes. John Wiley and Sons (Interscience Publishers), New York, 1962
Ross IM (2009) A primer on Pontryagin's principle in optimal control. Collegiate Publishers, San Francisco
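As a concrete illustration of the "value iteration" mentioned in this entry, the following sketch applies the dynamic programming principle to a toy discrete minimum-time problem on a one-dimensional grid. The grid size, target cell, and unit step cost are invented illustrative data, not taken from the entry:

```python
import numpy as np

# Toy discrete minimum-time problem: controls move one cell left or right,
# each step costs 1, and the target cell has minimum time 0.
N, target = 11, 5
T = np.full(N, np.inf)
T[target] = 0.0
for _ in range(N):  # value iteration: repeat the DPP update to a fixed point
    Tn = T.copy()
    for i in range(N):
        if i != target:
            neighbors = [T[j] for j in (i - 1, i + 1) if 0 <= j < N]
            Tn[i] = 1.0 + min(neighbors)  # DPP: one step plus best reachable value
    T = Tn
# At the fixed point, T[i] equals the grid distance |i - target|.
```

Continuous-state schemes discretize the Hamilton–Jacobi–Bellman equation analogously; see Kushner and Dupuis (2001).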
962 Optimal Control via Factorization and Model Matching

Optimal Control via Factorization and Model Matching

Michael Cantoni
Department of Electrical & Electronic Engineering, The University of Melbourne, Parkville, VIC, Australia

Abstract

One approach to linear control system design involves the matching of certain input-output models with respect to a quantification of closed-loop performance. The approach is based on a parametrization of all stabilizing feedback controllers, which relies on the existence of coprime factorizations of the plant model. This parametrization and spectral factorization methods for solving model-matching problems are described within the context of impulse-response energy and worst-case energy-gain measures of controller performance.

Keywords

Coprime factorization; $H_2$ control; $H_\infty$ control; Spectral factorization; Youla-Kučera controller parametrization

Introduction

Various linear control problems can be formulated in terms of the interconnection shown in Fig. 1; e.g., see Francis and Doyle (1987), Boyd and Barratt (1991), and Zhou et al. (1996). The linear system $K$ is a controller (with input $y$ and output $u - v_1$) to be designed for the generalized plant model $G$. The latter is constructed so that controller performance (i.e., the quality of $K$ relative to specifications) can be quantified as a nonnegative functional of

$$H(G,K) = G_{11} + G_{12} K (I - G_{22} K)^{-1} G_{21}, \qquad (1)$$

which relates the input $w$ and the output $z$ when $v_1 = 0$ and $v_2 = 0$. The objective is to select $K$ to minimize this measure of performance. Alternatively, controllers that achieve a specified upper bound are sought. It is also usual to require internal stability, which pertains to the fictitious signals $v_1$ and $v_2$, as discussed subsequently. The best known examples are $H_2$ and $H_\infty$ control problems. In the former, performance is quantified as the energy (resp. power) of $z$ when $w$ is impulsive (resp. unit white noise), and in the latter, as the worst-case energy gain from $w$ to $z$, which can be used to reflect robustness to model uncertainty; see Zhou et al. (1996).

Optimal Control via Factorization and Model Matching, Fig. 1 Standard interconnection for control system design. [Figure: the generalized plant $G = \begin{bmatrix} G_{11} & G_{12} \\ G_{21} & G_{22} \end{bmatrix}$ maps $(w, u)$ to $(z, y)$; the controller $K$ closes the loop from $y$ to $u$, with the fictitious signals $v_1$ and $v_2$ injected at summing junctions at the controller output and input.]

The special case of $G_{22} = 0$ gives rise to a (weighted) model-matching problem, in that the corresponding performance map $H(G,K) = G_{11} + G_{12} K G_{21}$ exhibits affine dependence on the design variable $K$, which is chosen to match $-G_{12} K G_{21}$ to $G_{11}$ with respect to the scalar quantification of performance. Any internally stabilizable problem with $G_{22} \neq 0$ can be converted into a model-matching problem. The key ingredients in this transformation are coprime factorizations of the plant model. The role of these and other factorizations in a model-matching approach to $H_2$ and $H_\infty$ control problems is the focus of this article.

For the sake of argument, finite-dimensional linear time-invariant systems are considered via real-rational transfer functions in the frequency domain, as the existence of all factorizations
employed is well understood in this setting. Indeed, constructions via state-space realizations and Riccati equations are well known. The merits of the model-matching approach pursued here are at least twofold: (i) the underlying algebraic input-output perspective extends to more abstract settings, including classes of distributed-parameter and time-varying systems (Desoer et al. 1980; Vidyasagar 1985; Curtain and Zwart 1995; Feintuch 1998; Quadrat 2006); and (ii) model matching is a convex problem for various measures of performance (including mixed indexes) and controller constraints. The latter can be exploited to devise numerical algorithms for controller optimization (Boyd and Barratt 1991; Dahleh and Diaz-Bobillo 1995; Qi et al. 2004).

First, some notation regarding transfer functions and two measures of performance for control system design is defined. Coprime factorizations are then described within the context of a well-known parametrization of stabilizing controllers, originally discovered by Youla et al. (1976) and Kucera (1975). This yields an affine parametrization of performance maps for problems in standard form and, thus, a transformation to a model-matching problem. Finally, the role of spectral factorizations in solving model-matching problems with respect to impulse-response energy ($H_2$) and worst-case energy-gain ($H_\infty$) measures of performance is discussed.

Notation and Nomenclature

$R$ generically denotes a linear space of matrices having fixed row and column dimensions, which are not reflected in the notation for convenience, and entries that are proper real-rational functions of the complex variable $s$; i.e., $\left( \sum_{k=1}^{m} b_k s^k \right) / \left( \sum_{k=1}^{n} a_k s^k \right)$ for sets of real coefficients $\{a_k\}_{k=1}^{n}$ and $\{b_k\}_{k=1}^{m}$ with $m \le n < \infty$. The compatibility of matrix dimensions is implicitly assumed henceforth. All matrices in $R$ have (nonunique) "state-space" realizations of the form $C(sI - A)^{-1} B + D$, where $A, B, C$ and $D$ are real valued matrices. This form naturally arises in frequency-domain analysis of the input-output map associated with the time-domain model $\dot{x}(t) = A x(t) + B u(t)$, with initial condition $x(0) = 0$ and output equation $y(t) = C x(t) + D u(t)$, where $\dot{x}$ denotes the time derivative of $x$ and $u$ is the input. The study of such linear time-invariant differential equation models via the Laplace transform and multiplication by real-rational transfer function matrices is fundamental in linear systems theory (Kailath 1980; Francis 1987; Zhou et al. 1996). $P \in R$ has an inverse $P^{-1} \in R$ if and only if $\lim_{|s| \to \infty} P(s)$ is a nonsingular matrix. The superscripts $T$ and $*$ denote the transpose and complex conjugate transpose. For a matrix $Z = Z^*$ with complex entries, $Z > 0$ means $z^* Z z \ge \epsilon \, z^* z$ for some $\epsilon > 0$ and all complex vectors $z$ of compatible dimension. $P^\sim(s) := P(-s)^T$, whereby $(P(j\omega))^* = P^\sim(j\omega)$ for all real $\omega$ with $j := \sqrt{-1}$. Zeros of transfer function denominators are called poles.

In subsequent sections, several subspaces of $R$ are used to define and solve two standard linear control problems. The subspace $B \subset R$ comprises transfer functions that have no poles on the imaginary axis in the complex plane. For $P \in B$, the scalar performance index

$$\|P\|_\infty := \max_{-\infty \le \omega \le \infty} \bar{\sigma}(P(j\omega)) \ge 0$$

is finite; the real number $\bar{\sigma}(Z)$ is the maximum singular value of the matrix argument $Z$. This index measures the worst-case energy-gain from an input signal $u$ to the output signal $y = P u$. Note that $\|P\|_\infty < \gamma$ if and only if $\gamma^2 I - P^\sim(j\omega) P(j\omega) > 0$ for all $-\infty \le \omega \le \infty$.

The subspace $S \subset B \subset R$ consists of transfer functions that have no poles with positive real part. A transfer function in $S$ is called stable because the corresponding input-output map is causal in the time domain, as well as bounded-in-bounded-out (in various senses). If $P \in S$ is such that $P^\sim P = I$, then it is called inner. If $P, P^{-1} \in S$, then both are called outer.
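The index $\|\cdot\|_\infty$ defined above can be estimated numerically by gridding the imaginary axis and maximizing the largest singular value. The following is a minimal sketch with a made-up first-order example; the crude gridding is for illustration only and is not a method from this article:

```python
import numpy as np

def hinf_norm_grid(P, omegas):
    # Crude H-infinity estimate: maximum over a frequency grid of the largest
    # singular value of P(jw); exact only if the peak lies on the grid.
    return max(np.linalg.svd(P(1j * w), compute_uv=False)[0] for w in omegas)

# Made-up example: P(s) = 1/(s+1), whose norm is 1, attained at w = 0.
P = lambda s: np.array([[1.0 / (s + 1.0)]])
est = hinf_norm_grid(P, np.linspace(0.0, 100.0, 2001))
```

For reliable computation one would instead use a bisection test on the condition $\gamma^2 I - P^\sim(j\omega)P(j\omega) > 0$ via state-space methods, as in Zhou et al. (1996).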
Let $L$ denote the subspace of strictly-proper transfer functions in $B$; i.e., for all entries of the matrix, the degree $n$ of the denominator exceeds the degree $m$ of the numerator. Observe that $P \in L$ if and only if $P^\sim \in L$. Moreover, $P_1 P_2 \in L$ and $P_3 P_1 \in L$ for all $P_1 \in L$ and $P_i \in B$, $i = 2, 3$. Now, for $P_1, P_2 \in L$, define the inner product

$$\langle P_1, P_2 \rangle := \frac{1}{2\pi} \int_{-\infty}^{\infty} \operatorname{trace}\big(P_1^\sim(j\omega) P_2(j\omega)\big) \, d\omega < \infty$$

and the scalar performance index $\|P\|_2 := \sqrt{\langle P, P \rangle} \ge 0$ for $P \in L$. This index equates to the root-mean-square (energy) measure of the impulse response and the covariance (power) of the output signal $y = P u$, when the input signal $u$ is unit white noise. By the properties $\operatorname{trace}(Z_1 + Z_2) = \operatorname{trace}(Z_1) + \operatorname{trace}(Z_2)$ and $\operatorname{trace}(Z_1 Z_2) = \operatorname{trace}(Z_2 Z_1)$ of the matrix trace, it follows that $\langle P_1 + P_2, P_3 \rangle = \langle P_1, P_3 \rangle + \langle P_2, P_3 \rangle$ and

$$\langle P_1, P_2 P_3 \rangle = \langle P_2^\sim P_1, P_3 \rangle = \langle P_1 P_3^\sim, P_2 \rangle = \langle P_3^\sim, P_1^\sim P_2 \rangle \quad \text{for } P_i \in L, \ i = 1, 2, 3. \qquad (2)$$

The (not closed) subspace $L \subset B \subset R$ can be expressed as the direct sum $L = H + H^\perp$, where $H = L \cap S$ and $H^\perp$ is the subspace of transfer functions in $L$ that have no poles with negative real part. That is, given $P \in L$, there is a unique decomposition $P = \Pi^+(P) + \Pi^-(P)$, with $\Pi^+(P) \in H$ and $\Pi^-(P) \in H^\perp$. Observe that $P \in H$ if and only if $P^\sim \in H^\perp$. It can be shown via Plancherel's theorem that $\langle P_1, P_2 \rangle = 0$ for $P_1 \in H^\perp$ and $P_2 \in H$. Finally, note that $P_1 P_2 \in H$ and $P_3 P_1 \in H$ for $P_1 \in H$ and $P_i \in S$, $i = 2, 3$.

Coprime and Spectral Factorizations

Given $P \in R$, the factorizations $P = N M^{-1} = \tilde{M}^{-1} \tilde{N}$ are said to be (doubly) coprime over $S$, if $N$, $M$, $\tilde{N}$, $\tilde{M}$ are all elements of $S$ and there exist $U_0$, $V_0$, $\tilde{U}_0$, $\tilde{V}_0$ all in $S$ such that

$$\begin{bmatrix} \tilde{V}_0 & -\tilde{U}_0 \end{bmatrix} \begin{bmatrix} M \\ N \end{bmatrix} = I \quad \text{and} \quad \begin{bmatrix} -\tilde{N} & \tilde{M} \end{bmatrix} \begin{bmatrix} U_0 \\ V_0 \end{bmatrix} = I \qquad (3)$$

hold; i.e., $\begin{bmatrix} M^T & N^T \end{bmatrix}$ and $\begin{bmatrix} -\tilde{N} & \tilde{M} \end{bmatrix}$ are right invertible in $S$. Importantly, if the factorizations are coprime and $P \in S$, then $M^{-1} = \tilde{V}_0 - \tilde{U}_0 P$ and $\tilde{M}^{-1} = V_0 - P U_0$ are in $S$, as sums of products of transfer functions in $S$; i.e., $M$ and $\tilde{M}$ are outer. Doubly coprime factorizations over $S$ always exist, but these are not unique. Constructions from state-space realizations can be found in Zhou et al. (1996, Chapter 6) and Francis (1987), for example. As mentioned above, coprime factorizations play a role in transforming a standard problem into the special case of a model matching problem, via the Youla-Kučera parametrization of internally stabilizing controllers presented in the next section.

Subsequently, a special coprime factorization proves to be useful. If $P^\sim(s) P(s) = M^{-\sim}(s) N^\sim(s) N(s) M^{-1}(s) > 0$ for $s$ on the extended imaginary axis (i.e., for $s = j\omega$ with $-\infty \le \omega \le \infty$), then it is possible to choose the factor $N$ to be inner. In this case, if $P$ is also an element of $S$, then $P = N M^{-1}$ is called an inner-outer factorization, and $P^\sim P = (M^{-1})^\sim M^{-1}$ is called a spectral factorization, since $M, M^{-1} \in S$. More generally, if $\Lambda = \Lambda^\sim \in B$ satisfies $\Lambda(s) > 0$ for $s$ on the extended imaginary axis, then there exists a (non-unique) spectral factor $\Sigma$, $\Sigma^{-1} \in S$ such that $\Lambda = \Sigma^\sim \Sigma$. Similarly, there exists a co-spectral factor $\tilde{\Sigma}$, $\tilde{\Sigma}^{-1} \in S$ such that $\Lambda = \tilde{\Sigma} \tilde{\Sigma}^\sim$. State-space constructions via Riccati equations can be found in Zhou et al. (1996, Chapter 13), for example.
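To make the notion of a spectral factor concrete, here is a scalar numerical check. The data $\Lambda$ and $\Sigma$ below are invented for illustration and do not come from the article:

```python
import numpy as np

# Invented scalar example: Lambda(s) = (2 - s^2)/(1 - s^2) satisfies
# Lambda = Lambda~ and Lambda(jw) > 0 on the imaginary axis; a stable factor
# with stable inverse is Sigma(s) = (s + sqrt(2))/(s + 1), so Lambda = Sigma~ Sigma.
Lam = lambda s: (2.0 - s**2) / (1.0 - s**2)
Sig = lambda s: (s + np.sqrt(2.0)) / (s + 1.0)

w = np.linspace(-50.0, 50.0, 1001)
lhs = Lam(1j * w).real            # Lambda(jw) is real and positive here
rhs = np.abs(Sig(1j * w))**2      # Sigma~(jw) Sigma(jw) = |Sigma(jw)|^2
```

Note that both $\Sigma$ and $\Sigma^{-1}$ have all poles in the open left half-plane, which is what distinguishes a spectral factor from an arbitrary square root of $\Lambda$.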
Affine Controller/Performance-Map Parametrization

With reference to Fig. 1, a generalized plant model $G = \begin{bmatrix} G_{11} & G_{12} \\ G_{21} & G_{22} \end{bmatrix} \in R$ is said to be internally stabilizable if there exists a $K \in R$ such that the nine transfer functions associated with the map from the vector of signals $(w, v_1, v_2)$ to the vector of signals $(z, u, y)$, which includes the performance map $H(G,K) = G_{11} + G_{12} K (I - G_{22} K)^{-1} G_{21}$, are all elements of $S$. Accounting in this way for the influence of the fictitious signals $v_1$ and $v_2$, and the behavior of the internal signals $u$ and $y$, amounts to the following requirement: Given minimal state-space realizations, any nonzero initial condition response decays exponentially in the time domain when $G$ and $K$ are interconnected according to Fig. 1 with $w = 0$, $v_1 = 0$ and $v_2 = 0$. Not every $G \in R$ is internally stabilizable in the sense just defined; for example, take $G_{11}$ to have a pole with positive real part and $G_{21} = G_{12} = G_{22} = 0$. A necessary condition for stabilizability is $(I - G_{22} K)^{-1} \in R$; i.e., the inverse must be proper. The latter always holds if $G_{22}$ is strictly proper, as assumed henceforth to simplify the presentation. It is also assumed that $G$ is internally stabilizable.

It can be shown that $G$ is internally stabilized by $K$ if and only if the standard feedback interconnection of $G_{22}$ and $K$, corresponding to $w = 0$ in Fig. 1, is internally stable. That is, if and only if the transfer function

$$\begin{bmatrix} I & -K \\ -G_{22} & I \end{bmatrix} \in R, \qquad (4)$$

which relates $u$ and $y$ to $v_1$ and $v_2$ by virtue of the summing junctions at the interconnection points, has an inverse in $S$; see Francis (1987, Theorem 4.2). Substituting the coprime factorizations $K = U V^{-1} = \tilde{V}^{-1} \tilde{U}$ and $G_{22} = N M^{-1} = \tilde{M}^{-1} \tilde{N}$, it follows that the inverse of (4) is an element of $S$ if and only if

$$\begin{bmatrix} M & U \\ N & V \end{bmatrix}^{-1} \in S \iff \begin{bmatrix} \tilde{V} & -\tilde{U} \\ -\tilde{N} & \tilde{M} \end{bmatrix}^{-1} \in S. \qquad (5)$$

The equivalent characterizations of internal stability in (5) lead directly to affine parametrizations of controllers and performance maps. Specifically, following the approach of Desoer et al. (1980), Vidyasagar (1985), and Francis (1987), suppose that the factorizations $G_{22} = N M^{-1} = \tilde{M}^{-1} \tilde{N}$ are doubly coprime in the sense that (3) holds for some $U_0, V_0, \tilde{U}_0, \tilde{V}_0 \in S$. Indeed, since $0 = G_{22} - G_{22} = \tilde{M}^{-1} (\tilde{M} N - \tilde{N} M) M^{-1}$, it follows that

$$\begin{bmatrix} \tilde{V}_0 & -\tilde{U}_0 \\ -\tilde{N} & \tilde{M} \end{bmatrix} \begin{bmatrix} M & U_0 \\ N & V_0 \end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & I \end{bmatrix} = \begin{bmatrix} M & U_0 \\ N & V_0 \end{bmatrix} \begin{bmatrix} \tilde{V}_0 & -\tilde{U}_0 \\ -\tilde{N} & \tilde{M} \end{bmatrix}. \qquad (6)$$

Exploiting this and the condition (5), it holds that $K = U V^{-1}$ stabilizes $G_{22}$ if and only if

$$U = (U_0 - M Q) \ \text{ and } \ V = (V_0 - N Q) \ \text{ with } \ Q \in S.$$

Similarly, $K$ stabilizes $G_{22}$ if and only if $K = (\tilde{V}_0 - Q \tilde{N})^{-1} (\tilde{U}_0 - Q \tilde{M})$ with $Q \in S$. Together, these constitute the Youla-Kučera parametrizations of internally stabilizing controllers. Importantly, the coprime factors that appear in these are affine functions of the stable parameter $Q$. Moreover, using (6), an affine parametrization of the standard performance map (1) holds by direct substitution of either controller parametrization. Specifically,

$$H(G,K) = G_{11} + G_{12} K (I - G_{22} K)^{-1} G_{21} = T_1 + T_2 Q T_3 \ \text{ with } \ Q \in S, \qquad (7)$$

where $T_1 = G_{11} + G_{12} U_0 \tilde{M} G_{21}$, $T_2 = -G_{12} M$ and $T_3 = \tilde{M} G_{21}$. Clearly, $T_1 \in S$ since this is the performance map when $Q = 0 \in S$. By the assumption that $G$ is stabilizable, it follows that $T_2$ and $T_3$ are also elements of $S$; see Francis (1987, Chapter 4). The so-called Q-parametrization in (7) motivates the subsequent consideration of model-matching problems with respect to the standard measures of control system performance $\| \cdot \|_2$ and $\| \cdot \|_\infty$.
Model-Matching via Spectral Factorization

Bearing in mind the Q-parametrization (7), consider the following $H_2$ model-matching problem, where inf denotes greatest lower bound (infimum) and $T_i \in S$, $i = 1, 2, 3$:

$$\inf_{Q \in S} \| T_1 + T_2 Q T_3 \|_2. \qquad (8)$$

Assume that $T_2(s)$ and $T_3(s)$ have full column and row rank, respectively, for $s$ on the extended imaginary axis. Also assume that $T_1$ is strictly proper, whereby $Q$ must be strictly proper, and thus an element of $H \subset S$, for the performance index to be finite. Under this standard collection of assumptions, the infimum is achieved as shown below.

A minimizer of the convex functional $f := Q \in H \mapsto \langle (T_1 + T_2 Q T_3), (T_1 + T_2 Q T_3) \rangle$ is a solution of the model matching problem. Given spectral factorizations $\Phi^\sim \Phi = T_2^\sim T_2$ and $\Lambda \Lambda^\sim = T_3 T_3^\sim$ (i.e., $\Phi, \Phi^{-1}, \Lambda, \Lambda^{-1} \in S$), which exist by the assumptions on the problem data, let $R := \Phi Q \Lambda$ and $W := \Phi^{-\sim} T_2^\sim T_1 T_3^\sim \Lambda^{-\sim}$. Then for $Q \in H$, which is equivalent to $R \in H$ by the properties of spectral factors, it follows that

$$\begin{aligned}
f(Q) &= \langle T_1, T_1 \rangle + \langle \Phi^{-\sim} T_2^\sim T_1 T_3^\sim \Lambda^{-\sim}, R \rangle + \langle R, \Phi^{-\sim} T_2^\sim T_1 T_3^\sim \Lambda^{-\sim} \rangle + \langle R, R \rangle \\
&= \langle T_1, T_1 \rangle + \langle (\Pi^-(W) + \Pi^+(W) + R), (\Pi^-(W) + \Pi^+(W) + R) \rangle - \langle W, W \rangle \\
&= \langle T_1, T_1 \rangle - \langle \Pi^+(W), \Pi^+(W) \rangle + \langle (\Pi^+(W) + R), (\Pi^+(W) + R) \rangle,
\end{aligned} \qquad (9)$$

where the second last equality holds by "completion-of-squares" and the last equality holds since $\langle \Pi^+(W), \Pi^-(W) \rangle = 0 = \langle R, \Pi^-(W) \rangle$. From (9) it is apparent that

$$Q = -\Phi^{-1} \, \Pi^+\!\big(\Phi^{-\sim} T_2^\sim T_1 T_3^\sim \Lambda^{-\sim}\big) \, \Lambda^{-1}$$

is a minimizer of $f$. As above, spectral factorization is a key component of the so-called Wiener-Hopf approach of Youla et al. (1976) and DeSantis et al. (1978).

Now consider the $H_\infty$ model-matching problem

$$\inf_{Q \in S} \| T_1 + T_2 Q T_3 \|_\infty, \qquad (10)$$

given $T_i \in S$, $i = 1, 2, 3$. This is more challenging than the problem discussed above, where $\| \cdot \|_2$ is the performance index. While sufficient conditions are again available for the infimum to be achieved, computing a minimizer is generally difficult; see Francis and Doyle (1987) and Glover et al. (1991). As such, nearly optimal solutions are often sought by considering the relaxed problem of finding the set of $Q \in S$ that satisfy $\| T_1 + T_2 Q T_3 \|_\infty < \gamma$ for a value of $\gamma > 0$ greater than, but close to, the infimum.

With a view to highlighting the role of factorization methods and simplifying the presentation, suppose that $T_2$ is inner, which is possible without loss of generality via inner-outer factorization if $T_2(s)$ has full column rank for $s$ on the extended imaginary axis. Furthermore, assume that $T_3 = I$. Following the approach of Francis (1987) and Green et al. (1990), let $X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} := \begin{bmatrix} T_2^\sim \\ I - T_2 T_2^\sim \end{bmatrix} \in B$, so that $X^\sim X = I$ and $X T_2 = \begin{bmatrix} I \\ 0 \end{bmatrix}$. Observe that

$$\| T_1 + T_2 Q \|_\infty = \| X (T_1 + T_2 Q) \|_\infty = \left\| \begin{bmatrix} T_2^\sim T_1 + Q \\ (I - T_2 T_2^\sim) T_1 \end{bmatrix} \right\|_\infty < \gamma$$

if and only if

$$0 < \gamma^2 I - T_1^\sim (I - T_2 T_2^\sim) T_1 - (T_2^\sim T_1 + Q)^\sim (T_2^\sim T_1 + Q) \qquad (11)$$

on the extended imaginary axis.
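For intuition about the level-$\gamma$ condition (11), consider a scalar instance with $T_2$ inner and $T_3 = I$; all data below are invented for illustration and are not from the article. Here the best achievable level turns out to be $1/3$ (the Hankel norm of the antistable part of $T_2^\sim T_1$, as identified by Nehari's theorem), attained when the error $T_1 + T_2 Q$ has constant modulus on the imaginary axis:

```python
import numpy as np

# Invented scalar data with T2 inner and T3 = I:
T1 = lambda s: 1.0 / (s + 2.0)
T2 = lambda s: (s - 1.0) / (s + 1.0)     # |T2(jw)| = 1, so T2 is inner
# With the stable choice Q(s) = (s+1)/(3(s+2)), the error collapses:
# T1 + T2*Q = 1/3 identically, so ||T1 + T2*Q||_inf = 1/3 (the optimum here).
Q = lambda s: (s + 1.0) / (3.0 * (s + 2.0))

w = np.linspace(-200.0, 200.0, 4001)
err = np.abs(T1(1j * w) + T2(1j * w) * Q(1j * w))  # flat at the level 1/3
```

The flatness of the optimal error is characteristic of $H_\infty$ model matching: at the infimum the error is all-pass, whereas the $H_2$-optimal error from (9) generally is not.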
Note that (11) implies $0 < \gamma^2 I - T_1^\sim (I - T_2 T_2^\sim)^2 T_1$. Thus, it follows that there exists a $Q \in S$ for which (10) holds if and only if the following are both satisfied: (a) there exists a spectral factorization $\gamma^2 \Lambda^\sim \Lambda = \gamma^2 I - T_1^\sim (I - T_2 T_2^\sim)^2 T_1$; and (b) there exists an $\bar{R} \, (= Q \Lambda^{-1}) \in S$ such that $\| \bar{W} + \bar{R} \|_\infty < \gamma$, where $\bar{W} := T_2^\sim T_1 \Lambda^{-1} \in B$. The condition (b) is a well-known extension problem, and a solution exists if and only if the induced norm of the Hankel operator with symbol $\bar{W}$ is less than $\gamma$, which is part of a result known as Nehari's theorem. In fact, (b) is equivalent to the existence of a spectral factor $\Theta$, $\Theta^{-1} \in S$ with $\Theta_{11}^{-1} \in S$ such that

$$\begin{bmatrix} I & \bar{W} \\ 0 & I \end{bmatrix}^\sim \begin{bmatrix} I & 0 \\ 0 & -\gamma^2 I \end{bmatrix} \begin{bmatrix} I & \bar{W} \\ 0 & I \end{bmatrix} = \Theta^\sim \begin{bmatrix} I & 0 \\ 0 & -\gamma^2 I \end{bmatrix} \Theta, \qquad (12)$$

in which case $\| \bar{W} + \bar{R} \|_\infty \le \gamma$ if and only if $\bar{R} = \bar{R}_1 \bar{R}_2^{-1}$ with $\begin{bmatrix} \bar{R}_1^T & \bar{R}_2^T \end{bmatrix} := \begin{bmatrix} \bar{S}^T & I \end{bmatrix} \Theta^{-T}$, $\bar{S} \in S$ and $\| \bar{S} \|_\infty \le \gamma$; see Ball and Ran (1987), Francis (1987), and Green et al. (1990) for details, including state-space constructions of the factors via Riccati equations. Noting that

$$\begin{bmatrix} I & T_2^\sim T_1 \\ 0 & \Lambda \end{bmatrix}^\sim \begin{bmatrix} I & 0 \\ 0 & -\gamma^2 I \end{bmatrix} \begin{bmatrix} I & T_2^\sim T_1 \\ 0 & \Lambda \end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & \Lambda \end{bmatrix}^\sim \begin{bmatrix} I & \bar{W} \\ 0 & I \end{bmatrix}^\sim \begin{bmatrix} I & 0 \\ 0 & -\gamma^2 I \end{bmatrix} \begin{bmatrix} I & \bar{W} \\ 0 & I \end{bmatrix} \begin{bmatrix} I & 0 \\ 0 & \Lambda \end{bmatrix},$$

it follows using (12) that there exists a $Q \in S$ such that (10) holds if and only if there exists a spectral factor $\Omega$, $\Omega^{-1} \in S$ with $\Omega_{11}^{-1} \in S$ that satisfies

$$\begin{bmatrix} I & T_2^\sim T_1 \\ 0 & \Lambda \end{bmatrix}^\sim \begin{bmatrix} I & 0 \\ 0 & -\gamma^2 I \end{bmatrix} \begin{bmatrix} I & T_2^\sim T_1 \\ 0 & \Lambda \end{bmatrix} = \Omega^\sim \begin{bmatrix} I & 0 \\ 0 & -\gamma^2 I \end{bmatrix} \Omega, \qquad (13)$$

in which case $\| T_1 + T_2 Q \|_\infty \le \gamma$ if and only if $Q = Q_1 Q_2^{-1}$, where $\begin{bmatrix} Q_1^T & Q_2^T \end{bmatrix} := \begin{bmatrix} S^T & I \end{bmatrix} \Omega^{-T}$, $S \in S$ and $\| S \|_\infty \le \gamma$; see Green et al. (1990). So-called $J$-spectral factorizations of the kind in (12) and (13) also appear in the chain-scattering/conjugation approach of Kimura (1989, 1997) and the factorization approach of Ball et al. (1991), for example.

Summary

The preceding sections highlight the role of coprime and spectral factorizations in formulating and solving model-matching problems that arise from standard $H_2$ and $H_\infty$ control problems. The transformation of standard control problems to model-matching problems hinges on an affine parametrization of internally stabilized performance maps. Beyond the problems considered here, this parametrization can be exploited to devise numerical algorithms for various other control problems in terms of convex mathematical programs.

Cross-References

- H-Infinity Control
- H2 Optimal Control
- Polynomial/Algebraic Design Methods
- Spectral Factorization

Bibliography

Ball JA, Ran ACM (1987) Optimal Hankel norm model reductions and Wiener-Hopf factorization I: the canonical case. SIAM J Control Optim 25(2):362–382
Ball JA, Helton JW, Verma M (1991) A factorization principle for stabilization of linear control systems. Int J Robust Nonlinear Control 1(4):229–294
Boyd SP, Barratt CH (1991) Linear controller design: limits of performance. Prentice Hall, Englewood Cliffs
Curtain RF, Zwart HJ (1995) An introduction to infinite-dimensional linear systems theory. Volume 21 of texts in applied mathematics. Springer, New York
Dahleh MA, Diaz-Bobillo IJ (1995) Control of uncertain systems: a linear programming approach. Prentice Hall, Upper Saddle River
DeSantis RM, Saeks R, Tung LJ (1978) Basic optimal estimation and control problems in Hilbert space. Math Syst Theory 12(1):175–203
Desoer C, Liu R-W, Murray J, Saeks R (1980) Feedback system design: the fractional representation approach to analysis and synthesis. IEEE Trans Autom Control 25(3):399–412
Feintuch A (1998) Robust control theory in Hilbert space. Applied mathematical sciences. Springer, New York
Francis BA (1987) A course in $H_\infty$ control theory. Lecture notes in control and information sciences. Springer, Berlin/New York
968 Optimal Control with State Space Constraints

Francis BA, Doyle JC (1987) Linear control theory with an $H_\infty$ optimality criterion. SIAM J Control Optim 25(4):815–844
Glover K, Limebeer DJN, Doyle JC, Kasenally EM, Safonov MG (1991) A characterization of all solutions to the four block general distance problem. SIAM J Control Optim 29(2):283–324
Green M, Glover K, Limebeer DJN, Doyle JC (1990) A J-spectral factorization approach to $H_\infty$ control. SIAM J Control Optim 28(6):1350–1371
Kailath T (1980) Linear systems. Prentice-Hall, Englewood Cliffs
Kimura H (1989) Conjugation, interpolation and model-matching in $H_\infty$. Int J Control 49(1):269–307
Kimura H (1997) Chain-scattering approach to $H_\infty$ control. Systems & control. Birkhäuser, Boston
Kucera V (1975) Stability of discrete linear control systems. In: 6th IFAC world congress, Boston. Paper 44.1
Qi X, Salapaka MV, Voulgaris PG, Khammash M (2004) Structured optimal and robust control with multiple criteria: a convex solution. IEEE Trans Autom Control 49(10):1623–1640
Quadrat A (2006) On a generalization of the Youla–Kučera parametrization. Part II: the lattice approach to MIMO systems. Math Control Signals Syst 18(3):199–235
Vidyasagar M (1985) Control system synthesis: a factorization approach. Signal processing, optimization and control. MIT, Cambridge
Youla D, Jabr H, Bongiorno J (1976) Modern Wiener-Hopf design of optimal controllers – Part II: the multivariable case. IEEE Trans Autom Control 21(3):319–338
Zhou K, Doyle JC, Glover K (1996) Robust and optimal control. Prentice Hall, Upper Saddle River

Optimal Control with State Space Constraints

Heinz Schättler
Washington University, St. Louis, MO, USA

Abstract

Necessary and sufficient conditions for optimality in optimal control problems with state space constraints are reviewed with emphasis on geometric aspects.

Keywords

Admissible control; Bolza form; Mayer problem

Problem Formulation and Terminology

Many practical problems in engineering or of scientific interest can be formulated in the framework of optimal control problems with state space constraints. Examples range from the space shuttle reentry problem in aeronautics (Bonnard et al. 2003) to the problem of minimizing the base transit time in bipolar transistors in electronics (Rinaldi and Schättler 2003).

An optimal control problem with state space constraints in Bolza form takes the following form: minimize a functional

$$J(u) = \int_{t_0}^{T} L(t, x(t), u(t)) \, dt + \Phi(T, x(T))$$

over all Lebesgue measurable functions $u : [t_0, T] \to U$ that take values in a control set $U \subset \mathbb{R}^m$, subject to the dynamics

$$\dot{x}(t) = F(t, x(t), u(t)), \quad x(t_0) = x_0,$$

terminal constraints

$$\Psi(T, x(T)) = 0,$$

and state space constraints

$$h_\alpha(t, x(t)) \le 0 \quad \text{for } \alpha = 1, \ldots, r.$$

The focus of this contribution is on state space constraints, and, for simplicity, in this formulation, we have omitted mixed control-state space constraints of the form $g_\beta(t, x, u) \le 0$. States $x$ lie in $\mathbb{R}^n$ and controls in $\mathbb{R}^m$; typically, the control set $U \subset \mathbb{R}^m$ is compact and convex, often a polyhedron.
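The Bolza objective above is easy to evaluate for a candidate control once the dynamics are discretized. The following forward-Euler sketch uses invented data ($\dot{x} = u$, $L = u^2$, $\Phi = x(T)^2$, $x(0) = 1$), not an example from the entry:

```python
import numpy as np

def bolza_cost(u, x0=1.0, T=1.0):
    # Forward-Euler evaluation of J(u) = integral of u^2 dt + x(T)^2
    # for the toy dynamics xdot = u with x(0) = x0.
    dt = T / len(u)
    x, running = x0, 0.0
    for uk in u:
        running += uk**2 * dt   # accumulate the running cost L = u^2
        x += uk * dt            # Euler step of the dynamics
    return running + x**2       # add the terminal cost Phi(T, x(T))

cost = bolza_cost(np.full(50, -0.5))  # steer halfway toward the origin
```

Evaluations of this kind are the inner loop of direct numerical methods, where $J(u)$ is minimized over the discretized control values subject to the constraints.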
The time-varying vector field $F : \mathbb{R} \times \mathbb{R}^n \times U \to \mathbb{R}^n$ is continuously differentiable in $(t, x)$, and the terminal constraint $N = \{(t,x) : \Psi(t,x) = 0\}$ is defined by continuously differentiable mappings $\Psi_i : \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}^k$ with the property that the gradients $\nabla \Psi_i = \left( \frac{\partial \Psi_i}{\partial t}, \frac{\partial \Psi_i}{\partial x} \right)$ (which we write as row vectors) are linearly independent on $N$. The terminal time $T$ can be free or fixed; a fixed terminal time simply would be prescribed by one of the functions $\Psi_i$. The state space constraints

$$M_\alpha = \{ (t,x) : h_\alpha(t,x) = 0 \}, \quad \alpha = 1, \ldots, r,$$

are defined by continuously differentiable time-varying functions $h_\alpha : \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}$, $(t,x) \mapsto h_\alpha(t,x)$, and we assume that the gradients $\nabla h_\alpha$ do not vanish on $M_\alpha$. In particular, each set $M_\alpha$ thus is an embedded submanifold of codimension 1 of $\mathbb{R}^{n+1}$. We denote by $h = (h_1, \ldots, h_r)^T$ the time-varying vector field defining the state space constraints.

Terminology: Admissible controls are locally bounded Lebesgue measurable functions that take values in the control set, $u : [t_0, T] \to U$. Given any admissible control, the initial value problem $\dot{x}(t) = F(t, x(t), u(t))$, $x(t_0) = x_0$, has a unique solution defined on some maximal open interval of definition $I$. This solution is called the trajectory corresponding to the control $u$, and the pair $(x, u)$ is a controlled trajectory. An arc $\Gamma$ of the graph of a trajectory defined over an open interval $I$ for which none of the state space constraints is active is called an interior arc, and $\Gamma$ is a boundary arc if at least one constraint is active on all of $I$. We call $\Gamma$ an $M_\alpha$-boundary arc over $I$ if only the constraint $h_\alpha \le 0$ is active on $I$. The times $\tau$ when interior arcs and boundary arcs meet are called junction times, and the corresponding pairs $(\tau, x(\tau))$ junction points.

Despite the abundance and importance of practical problems that can be described as optimal control problems with state space constraints, for such problems the theory still lacks the coherence that the theory for problems without state space constraints has reached, and there still exist significant gaps between the theories of necessary and sufficient conditions for optimality for optimal control problems with state space constraints. The theory of existence of optimal solutions differs little between optimal control problems with and without state space constraints, is well established, and will not be addressed here (e.g., see Cesari 1983 or the Filippov-Cesari theorem in Hartl et al. 1995).

Necessary Conditions for Optimality

First-order necessary conditions for optimality are given by the Pontryagin maximum principle (Pontryagin et al. 1962). The zero set of even a smooth ($C^1$) function can be an arbitrary closed subset of the state space. As a result, in necessary conditions for optimality, the multipliers associated with the state space constraints a priori are only known to be nonnegative Radon measures (Ioffe and Tikhomirov 1979; Vinter 2000). Let $u_* : [t_0, T] \to U$ be an optimal control with corresponding trajectory $x_*$ and, for simplicity of presentation, also assume that no state constraints are active at the terminal time so that the standard transversality conditions apply. Then it follows that there exist a constant $\lambda_0 \ge 0$, an absolutely continuous function $\lambda$, which we write as row-vector, $\lambda : [t_0, T] \to (\mathbb{R}^n)^*$, and nonnegative Radon measures $\mu_\alpha \in C^*([t_0, T]; \mathbb{R})$, $\alpha = 1, \ldots, r$, with support in the sets $R_\alpha = \{ t \in [t_0, T] : h_\alpha(t, x_*(t)) = 0 \}$, which do not all vanish simultaneously, i.e.,

$$\lambda_0 + \| \lambda \|_\infty + \sum_{\alpha=1}^{r} \mu_\alpha([t_0, T]) > 0,$$

such that, with

$$\eta(t) = \lambda(t) - \sum_{\alpha=1}^{r} \int_{[t_0, t)} \frac{\partial h_\alpha}{\partial x}(s, x_*(s)) \, d\mu_\alpha(s)$$

and

$$H = H(t, \lambda_0, \eta, x, u) = \lambda_0 L(t, x, u) + \eta \, F(t, x, u),$$

the following conditions hold:
(a) The adjoint equation holds in the form

$$\dot{\lambda}(t) = -\frac{\partial H}{\partial x}(t, \lambda_0, \eta(t), x_*(t), u_*(t)) = -\lambda_0 \frac{\partial L}{\partial x}(t, x_*(t), u_*(t)) - \eta(t) \frac{\partial F}{\partial x}(t, x_*(t), u_*(t)),$$

and there exists a row-vector $\nu \in (\mathbb{R}^k)^*$ such that

$$\lambda(T) = \lambda_0 \frac{\partial \Phi}{\partial x}(T, x_*(T)) + \nu \frac{\partial \Psi}{\partial x}(T, x_*(T))$$

and

$$0 = H(T, \lambda_0, \eta(T), x_*(T), u_*(T)) + \lambda_0 \frac{\partial \Phi}{\partial t}(T, x_*(T)) + \nu \frac{\partial \Psi}{\partial t}(T, x_*(T)).$$

(b) The optimal control minimizes the Hamiltonian over the control set $U$ along $(\eta(t), x_*(t))$:

$$H(t, \lambda_0, \eta(t), x_*(t), u_*(t)) = \min_{v \in U} H(t, \lambda_0, \eta(t), x_*(t), v).$$

Furthermore,

$$H(t, \lambda_0, \eta(t), x_*(t), u_*(t)) = H(T, \lambda_0, \eta(T), x_*(T), u_*(T)) - \int_{[t,T]} \frac{\partial H}{\partial t}(s, \lambda_0, \eta(s), x_*(s), u_*(s)) \, ds + \sum_{\alpha=1}^{r} \int_{[t,T]} \frac{\partial h_\alpha}{\partial t}(s, x_*(s)) \, d\mu_\alpha(s).$$

Controlled trajectories $(x, u)$ for which there exist multipliers such that these conditions are satisfied are called extremals. In general, it cannot be excluded that $\lambda_0$ vanishes, and extremals with $\lambda_0 = 0$ are called abnormal, while those with $\lambda_0 > 0$ are called normal. In this case, the multiplier can be normalized, $\lambda_0 = 1$.

Special Case: A Mayer Problem for Single-Input Control Linear Systems

Under the general assumptions formulated above, the sets $R_\alpha \subset [t_0, T]$ when a particular constraint is active can be arbitrarily complicated. But in many practical applications, state constraints have strong geometric properties (often they are embedded submanifolds), and it is possible to strengthen these necessary conditions for optimality in the sense of specifying the measures further. We formulate the conditions for a particular case of common interest.

We consider an optimal control problem in Mayer form (i.e., $L \equiv 0$) for a single-input control linear system with dynamics

$$\dot{x} = F(t, x, u) = f(t, x) + u \, g(t, x)$$

and the control set $U$ a compact interval, $U = [a, b]$. Adjoining time as an extra state variable, $\dot{t} = 1$, and defining

$$F_0(t,x) = \begin{pmatrix} 1 \\ f(t,x) \end{pmatrix} \quad \text{and} \quad G(t,x) = \begin{pmatrix} 0 \\ g(t,x) \end{pmatrix},$$

for a continuously differentiable function $k : \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}$, the expressions

$$(L_{F_0} k)(t,x) = \frac{\partial k}{\partial t}(t,x) + \frac{\partial k}{\partial x}(t,x) f(t,x)$$

and

$$(L_G k)(t,x) = \frac{\partial k}{\partial x}(t,x) g(t,x)$$

represent the Lie (or directional) derivatives of the function $k$ along the vector fields $F_0$ and $G$, respectively. In terms of this notation, the derivative of the function $h_\alpha$ (defining the manifold $M_\alpha$) along trajectories of the system is given by

$$\dot{h}_\alpha(t, x(t)) = \frac{d}{dt} h_\alpha(t, x(t)) = L_{F_0} h_\alpha(t, x(t)) + u(t) \, L_G h_\alpha(t, x(t)).$$
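The Lie-derivative computations above are mechanical, as the following symbolic sketch illustrates for a toy scalar example (dynamics $\dot{x} = -x + u$, constraint $h = x - 1 \le 0$; invented data, not from the entry). It recovers the boundary control $u = -L_{F_0}h / L_G h$ that keeps $\dot{h} = 0$:

```python
import sympy as sp

t, x = sp.symbols('t x')
f, g = -x, sp.Integer(1)   # toy dynamics: xdot = f(t,x) + u*g(t,x) = -x + u
h = x - 1                  # state space constraint h(t,x) <= 0

LF0h = sp.diff(h, t) + sp.diff(h, x) * f   # Lie derivative along F0 = (1, f)
LGh = sp.diff(h, x) * g                    # Lie derivative along G = (0, g)
u_b = sp.simplify(-LF0h / LGh)             # boundary feedback enforcing hdot = 0
# Here u_b = x, so along the boundary {x = 1} the control is u_b = 1; it is
# admissible whenever 1 lies in [a, b], making M = {h = 0} control invariant
# of relative degree 1 there.
```

The same three lines of calculus apply verbatim to time-dependent $f$, $g$, and $h$; only the symbolic expressions grow.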
control $u_\alpha = u_\alpha(t, x)$ which solves the equation $\dot h_\alpha(t, x) = 0$ on $V$, and $u_\alpha$ is given in feedback form as

$$u_\alpha(t, x) = -\frac{L_{F_0} h_\alpha(t, x)}{L_G h_\alpha(t, x)}.$$

The manifold $M_\alpha$ is said to be control invariant of relative degree 1 if the Lie derivative of $h_\alpha$ with respect to $G$, $L_G h_\alpha$, does not vanish anywhere on $M_\alpha$ and if the function $u_\alpha(t, x)$ is admissible, i.e., takes values in the control set $[a, b]$.

Thus, for a control-invariant submanifold of relative degree 1, the control that keeps the manifold invariant is unique, and the corresponding dynamics induce a unique flow on the constraint. This assumption corresponds to the least degenerate, i.e., in some sense most generic or common, scenario and is satisfied for many practical problems.

Suppose the reference extremal is normal, and let an $M_\alpha$-boundary arc be defined over an open interval $I$, with corresponding boundary control $u_\alpha$ taking values in the interior of the control set along the arc. Then the Radon measure $\mu_\alpha$ is absolutely continuous with respect to Lebesgue measure on $I$ with continuous and nonnegative Radon-Nikodym derivative $\eta_\alpha(t)$ given by

$$\eta_\alpha(t) = \frac{\lambda(t)\left(\dfrac{\partial g}{\partial t}(t, x_*(t)) + [f, g](t, x_*(t))\right)}{L_G h_\alpha(t, x_*(t))},$$

where $[f, g]$ denotes the Lie bracket of the time-varying vector fields $f$ and $g$ in the variable $x$,

$$[f, g](t, x) = \frac{\partial g}{\partial x}(t, x)\, f(t, x) - \frac{\partial f}{\partial x}(t, x)\, g(t, x).$$

In particular, in this case, the adjoint equation can be expressed in the more common form

$$\dot\lambda(t) = -\lambda(t)\,\frac{\partial F}{\partial x}(t, x_*, u_*) - \eta_\alpha(t)\,\frac{\partial h_\alpha}{\partial x}(t, x_*),$$

with all partial derivatives evaluated along the reference trajectory. Furthermore, the multiplier $\lambda$ remains continuous at entry or exit if the controlled trajectory $(x_*, u_*)$ meets the constraint $M_\alpha$ transversally (e.g., see Schättler 2006). This follows from the following characterization of transversal connections between interior and boundary arcs due to Maurer (1977): if $\tau$ is an entry or exit junction time between an interior arc and an $M_\alpha$-boundary arc for which the reference control $u_*$ has a limit at $\tau$ along the interior arc, then the interior arc is transversal to $M_\alpha$ at entry or exit if and only if the control $u_*$ is discontinuous at $\tau$.

Informal Formulation of Necessary Conditions

In order to ensure the practicality of necessary conditions for optimality, it is essential that, besides atomistic structures at junctions that lead to computable jumps in the multipliers, the Radon measures $\mu_\alpha$ have no singular parts with respect to Lebesgue measure. If it is assumed a priori that optimal controlled trajectories are finite concatenations of interior and boundary arcs, and if the constraint sets have a reasonably regular structure (embedded submanifolds and transversal intersections thereof) and satisfy a rather technical constraint qualification (see Hartl et al. 1995) that guarantees that the restrictions of the system to active constraints have solutions, then it is possible to specify the above necessary conditions further and formulate more user-friendly versions for the determination of the multipliers. Such formulations have become the standard for numerical computations, but they still have not always been established rigorously and somewhat carry the stigma of a heuristic nature. Nevertheless, it is often this more concrete set of conditions that allows one to solve problems numerically and analytically. If then, in conjunction with sufficient conditions for optimality, it is possible to verify the optimality of the computed extremal solutions, this generates a satisfactory theoretical procedure. Such conditions, following Hartl et al. (1995), generally are referred to as the "informal theorem".

Suppose $(x_*, u_*)$ is a normal extremal controlled trajectory defined over the interval $[t_0, T]$ with the property that the graph of $x_*$ is a finite concatenation of interior and boundary arcs with junction times $\tau_i$, $i = 1, \ldots, k$, $t_0 = \tau_0 < \tau_1 <$
$\cdots < \tau_k < \tau_{k+1} = T$. Under an appropriate constraint qualification, there exist a multiplier $\lambda$, $\lambda : [t_0, T] \to (\mathbb{R}^n)^*$, which is absolutely continuous on each subinterval $[\tau_i, \tau_{i+1}]$; multipliers $\eta_\alpha : [t_0, T] \to \mathbb{R}$, which are continuous on each interval $[\tau_i, \tau_{i+1}]$; a vector $\nu \in (\mathbb{R}^k)^*$; and vectors $\gamma(\tau_i) \in (\mathbb{R}^r)^*$, $i = 1, \ldots, k$, with nonnegative entries such that:

(a) (adjoint equation) On each interval $(\tau_i, \tau_{i+1})$, $i = 0, \ldots, k$, $\lambda$ satisfies the adjoint equation in the form

$$\dot\lambda(t) = -\frac{\partial L}{\partial x}(t, x_*(t), u_*(t)) - \lambda(t)\,\frac{\partial F}{\partial x}(t, x_*(t), u_*(t)) - \sum_{\alpha=1}^{r} \eta_\alpha(t)\,\frac{\partial h_\alpha}{\partial x}(t, x_*),$$

with $\eta_\alpha(t) = 0$ if the constraint $M_\alpha$ is not active at time $t$. Assuming that no state space constraint is active at the terminal time, the value of the multiplier $\lambda$ at the terminal time is given by the transversality condition

$$\lambda(T) = \frac{\partial \Phi}{\partial x}(T, x_*(T)) + \nu\,\frac{\partial \Psi}{\partial x}(T, x_*(T)).$$

At any junction time $\tau_i$ between an interior arc and a boundary arc, the multiplier $\lambda$ may be discontinuous, satisfying a jump condition of the form

$$\lambda(\tau_i-) = \lambda(\tau_i+) + \gamma(\tau_i)\,\frac{\partial h}{\partial x}(\tau_i, x_*(\tau_i)),$$

and the complementary slackness condition

$$\gamma(\tau_i)\, h(\tau_i, x_*(\tau_i)) = 0$$

holds.

(b) The optimal control minimizes the Hamiltonian over the control set $U$ along $(\lambda(t), x_*(t))$:

$$H(t, \lambda(t), x_*(t), u_*(t)) = \min_{v \in U} H(t, \lambda(t), x_*(t), v),$$

and at the junction times $\tau_i$ we have that

$$H(\tau_i, \lambda(\tau_i-), x_*(\tau_i), u_*(\tau_i-)) = H(\tau_i, \lambda(\tau_i+), x_*(\tau_i), u_*(\tau_i+)) - \gamma(\tau_i)\,\frac{\partial h}{\partial t}(\tau_i, x_*(\tau_i)).$$

Sufficient Conditions for Optimality

The literature on sufficient conditions for optimality for optimal control problems with state space constraints is limited. The value function for an optimal control problem at a point $(t, x)$ in the extended state space, $V = V(t, x)$, is defined as the infimum over all admissible controls $u$ for which the corresponding trajectory starts at the point $x$ at time $t$ and satisfies all the constraints of the problem,

$$V(t, x) = \inf_{u \in \mathcal{U}} J(u).$$

Any sufficiency theory for optimal control problems, one way or another, deals with the solution of the corresponding Hamilton-Jacobi-Bellman (HJB) equation:

$$\frac{\partial V}{\partial t}(t, x) + \min_{u \in U}\left\{\frac{\partial V}{\partial x}(t, x)\, F(t, x, u) + L(t, x, u)\right\} = 0,$$
$$V(T, x) = \Phi(T, x) \quad \text{whenever } \Psi(T, x) = 0.$$

Value functions for optimal control problems rarely are differentiable everywhere, but generally have singularities along lower-dimensional submanifolds. Nevertheless, under some technical assumptions and with proper interpretations of the derivatives, this equation describes the evolution of the value function of an optimal control problem and, if an appropriate solution can be constructed, indeed solves the optimal control problem.
There exists a broad theory of viscosity solutions to the HJB equation (e.g., Fleming and Soner 2005; Bardi and Capuzzo-Dolcetta 2008) that is also applicable to problems with state space constraints (Soner 1986) and, under varying technical assumptions, characterizes the value function $V$ as the unique viscosity solution to the HJB equation. This has led to the development of algorithms that can be used to compute numerical solutions.

A more classical and more geometric approach to solving the HJB equation is based on the method of characteristics and goes back to the work of Boltyansky on a regular synthesis for optimal control problems without state space constraints (Boltyanskii 1966). This work follows classical ideas of fields of extremals from the calculus of variations and imposes technical conditions that allow one to handle the singularities that arise in the value functions (e.g., see Schättler and Ledzewicz 2012). Stalford's results in Stalford (1971) follow this approach for problems with state space constraints, but a broadly applicable theory of regular synthesis, as it was developed by Piccoli and Sussmann (2000) for problems without state space constraints, does not yet exist for problems with state space constraints. Results that embed a controlled reference extremal into a local field of extremals have been given by Bonnard et al. (2003) or Schättler (2006), and these constructions show the applicability of the concepts of a regular synthesis to problems with state space constraints as well.

Examples of Local Embeddings of Boundary Arcs

We illustrate the typical, i.e., in some sense most common, generic structures of local embeddings of boundary arcs in Figs. 1 and 2. The state constraint $M_\alpha$ is a control-invariant submanifold of relative degree 1 and is represented by a horizontal line, as it arises when limits on the size of a particular state are imposed. Figure 1 shows the typical entry-boundary-exit concatenations of an interior arc followed by a boundary arc and
Optimal Control with State Space Constraints, Fig. 1 A typical local synthesis around a boundary arc when no terminal constraints are present

Optimal Control with State Space Constraints, Fig. 2 A typical local synthesis around a boundary arc when terminal constraints are present
another interior arc. The local embedding of the boundary arc differs substantially from classical local embeddings for unconstrained problems in the sense that this field necessarily contains small pieces of trajectories which, when propagated backward, are not close to the reference trajectory. This, however, does not affect the memoryless properties required for a synthesis forward in time, and strong local optimality of the reference trajectory can be proven by combining synthesis-type arguments with homotopy-type approximations of the synthesis (Schättler 2006). The one trajectory marked as a black line in Fig. 1 corresponds to an optimal trajectory that meets the constraint only at the junction point and immediately bounces back into the interior. Such a trajectory arises as the limit when the concatenation structure of optimal controlled trajectories changes from interior-boundary-interior arcs to trajectories that do not meet the constraint. These structures are one of the extra sources of singularities in the value function that come up in optimal control problems with state space constraints. Switching surfaces for the interior arcs, as one is also shown in this figure, do not cause such a loss of differentiability if they are crossed transversally by the extremal trajectories of the field.

Figure 2 depicts the structure of an optimal synthesis for a problem from electronics, the problem of minimizing the base transit time of bipolar homogeneous transistors. The electrical field that determines the transit time is controlled by tailoring a distribution of dopants in the base region, and this dopant profile becomes an important design parameter determining the speed of the device. But due to physical and engineering limitations, the variables describing the dopants need to be limited, and thus this becomes an optimal control problem with state space constraints represented by hard limits on the variables. The constraints here are control invariant of relative degree 1. Optimal solutions, in the presence of initial and terminal constraints, have portions along both the upper and lower control limits of the constrained variable and typically proceed from the upper to the lower values along an optimal singular control (which takes values in the interior of the control set) in the interior of the admissible domain, possibly with saturation if the control limits are reached.

Cross-References

Numerical Methods for Nonlinear Optimal Control Problems
Optimal Control and Pontryagin's Maximum Principle

Bibliography

Bardi M, Capuzzo-Dolcetta I (2008) Optimal control and viscosity solutions of Hamilton-Jacobi-Bellman equations. Springer, New York
Boltyanskii VG (1966) Sufficient conditions for optimality and the justification of the dynamic programming principle. SIAM J Control Optim 4:326–361
Bonnard B, Faubourg L, Launay G, Trélat E (2003) Optimal control with state space constraints and the space shuttle re-entry problem. J Dyn Control Syst 9:155–199
Cesari L (1983) Optimization – theory and applications. Springer, New York
Fleming WH, Soner HM (2005) Controlled Markov processes and viscosity solutions. Springer, New York
Frankowska H (2006) Regularity of minimizers and of adjoint states in optimal control under state constraints. Convex Anal 13:299–328
Hartl RF, Sethi SP, Vickson RG (1995) A survey of the maximum principles for optimal control problems with state constraints. SIAM Rev 37:181–218
Ioffe AD, Tikhomirov VM (1979) Theory of extremal problems. North-Holland, Amsterdam/New York
Maurer H (1977) On optimal control problems with bounded state variables and control appearing linearly. SIAM J Control Optim 15:345–362
Piccoli B, Sussmann H (2000) Regular synthesis and sufficient conditions for optimality. SIAM J Control Optim 39:359–410
Pontryagin LS, Boltyanskii VG, Gamkrelidze RV, Mishchenko EF (1962) Mathematical theory of optimal processes. Wiley-Interscience, New York
Rinaldi P, Schättler H (2003) Minimization of the base transit time in semiconductor devices using optimal control. In: Feng W, Hu S, Lu X (eds) Dynamical systems and differential equations. Proceedings of the 4th international conference on dynamical systems and differential equations, Wilmington, May 2002, pp 742–751
Schättler H (2006) A local field of extremals for optimal control problems with state space constraints of relative degree 1. J Dyn Control Syst 12:563–599
Schättler H, Ledzewicz U (2012) Geometric optimal control. Springer, New York
Soner HM (1986) Optimal control with state-space constraints I. SIAM J Control Optim 24:552–561
Stalford H (1971) Sufficient conditions for optimal control with state and control constraints. J Optim Theory Appl 7:118–135
Vinter RB (2000) Optimal control. Birkhäuser, Boston

Optimal Deployment and Spatial Coverage

Sonia Martínez
Department of Mechanical and Aerospace Engineering, University of California, San Diego, La Jolla, CA, USA

Abstract

Optimal deployment refers to the problem of how to allocate a finite number of resources over a spatial domain to maximize a performance metric that encodes a certain quality of service. Depending on the deployment environment, the type of resource, and the metric used, the solutions to this problem can vary greatly.

Keywords

Coverage control algorithms; Facility location problems

Introduction

The problem of deciding what are optimal geographic locations to place a set of facilities has a long history and is the main subject in operations research and management science; see Drezner (1995). A facility can be broadly understood as a service such as a school; a hospital; an airport; an emergency service, such as a fire station; or, more generally, the routes of a vehicle, from buses to aircraft, an autonomous vehicle, or a mobile sensor.

The specific formulation of facility location problems depends very much on the particular underlying application. A distinguishing feature is that all involve strategic planning, accounting for the long-term impact on the facility operating cost and for a fast response to demand. Thus, these problems lead to constrained optimization formulations which are typically very hard to solve optimally. The computational complexity of such problems, which even in their most basic formulations typically lead to NP-hard problems, has made their solution largely intractable until the advent of high-speed computing.

Locational optimization techniques have also been employed to solve optimal estimation problems by static sensor networks, mesh and grid optimization design, clustering analysis, data compression, and statistical pattern recognition; see Du et al. (1999). However, these solutions typically require centralized computations and availability of information at all times.

When the facilities are multiple vehicles or mobile sensors, the underlying dynamics may require additional changes and further analysis that guarantee overall system stability. In what follows, we review a particular coverage control problem formulation in terms of so-called expected-value multicenter functions that makes the analysis tractable, leading to robust, distributed algorithm implementations employing computational geometric objects such as Voronoi partitions.

Basic Ingredients from Computational Geometry

In order to formulate a basic optimal deployment problem and algorithm, we require several notions from computational geometry; see Bullo et al. (2009) for more information.

Let $S$ be a measurable subset of $\mathbb{R}^m$, for $m \in \mathbb{N}$, consider a distance function $d$ on $\mathbb{R}^m$, and let $P = \{p_1, \ldots, p_n\}$ be $n$ distinct points of $S$, corresponding to the locations of certain facilities. The Voronoi partition of $S$ generated by $P$ and associated with $d$ is given by $\mathcal{V}(P) = \{V_1(P), \ldots, V_n(P)\}$, where
$$V_i(P) = \{\, q \in S \mid d(p_i, q) \le d(p_j, q),\; j \in \{1, \ldots, n\} \setminus \{i\} \,\}, \qquad i \in \{1, \ldots, n\}.$$

Given $r \in \mathbb{R}_{>0}$, denote by $B(p_i, r)$ the closed ball of center $p_i$ and radius $r$. The $r$-limited Voronoi partition of $S$ generated by $P$ and associated with $d$ is the Voronoi partition of the set $S \cap \bigcup_{i=1}^{n} B(p_i, r)$, denoted as $\mathcal{V}_r(P) = \{V_{1,r}(P), \ldots, V_{n,r}(P)\}$.

Let $\phi : S \to \mathbb{R}_{\ge 0}$ be a measurable density function on $S$. The area and the centroid (or center of mass) of $W \subset S$ with respect to $\phi$ are the values

$$A_\phi(W) = \int_W \phi(q)\, dq, \qquad CM_\phi(W) = \frac{1}{A_\phi(W)} \int_W q\, \phi(q)\, dq.$$

We say that the set of distinct points $P$ in $S$ is a centroidal Voronoi configuration (resp., an $r$-limited centroidal Voronoi configuration) if each $p_i$ is at the centroid of its own Voronoi cell, that is, $p_i = CM_\phi(V_i(P))$, $i \in \{1, \ldots, n\}$ (resp., $p_i = CM_\phi(V_{i,r}(P))$, $i \in \{1, \ldots, n\}$). Voronoi partitions and centroidal Voronoi configurations help assess the distribution of locations in a spatial domain, as we establish below.

A Voronoi partition induces a natural proximity graph, called the Delaunay graph, over the set of points $P$. We recall that a graph $G$ is a pair $G = (V, E)$, where $V$ is a set of $n$ vertices and $E$ is a set of ordered pairs of vertices, $E \subset V \times V$, called the edge set. A proximity graph is a graph function defined on the set $S$, which assigns a set of distinct points $P \subset S$ to a graph $G(P) = (P, E(P))$, where $E(P)$ is a function of the relative locations of the point set. Example graphs include the following:
1. The $r$-disk graph, $G_{\text{disk},r}$, for $r \in \mathbb{R}_{>0}$. Here, $(p_i, p_j) \in E_{\text{disk},r}(P)$ if $d(p_i, p_j) \le r$.
2. The Delaunay graph, $G_D$. We have $(p_i, p_j) \in E_D(P)$ if $V_i(P) \cap V_j(P) \ne \emptyset$.
3. The $r$-limited Delaunay graph, $G_{\text{LD},r}$, for $r \in \mathbb{R}_{>0}$. Here, $(p_i, p_j) \in E_{\text{LD},r}(P)$ if $V_{i,r}(P) \cap V_{j,r}(P) \ne \emptyset$.

Expected-Value Multicenter Functions

Facility location problems consist of spatially allocating a number of sites to provide a certain quality of service. Problems of this class are formulated in terms of multicenter functions and, in particular, expected-value multicenter functions. To define these, consider $\phi : S \to \mathbb{R}_{\ge 0}$ a density function over a bounded measurable set $S \subset \mathbb{R}^m$. One can regard $\phi$ as a function measuring the probability that some event takes place over the environment. The larger the value of $\phi(q)$, the more important the location $q$ will be. We refer to a nonincreasing and piecewise continuously differentiable function $f : \mathbb{R}_{\ge 0} \to \mathbb{R}$, possibly with finite jump discontinuities, as a performance function.

Performance functions describe the utility of placing a node at a certain distance from a location in the environment. The smaller the distance, the larger the value of $f$, that is, the better the performance. For instance, in sensing problems, performance functions can encode the signal-to-noise ratio between a source with an unknown location and a sensor attempting to locate it. Without loss of generality, it can be assumed that $f(0) = 0$.

An expected-value multicenter function models the expected value of the coverage over any point in $S$ provided by a set of points $p_1, \ldots, p_n$. Formally,

$$\mathcal{H}(p_1, \ldots, p_n) = \int_S \max_{i \in \{1, \ldots, n\}} f(\|q - p_i\|_2)\, \phi(q)\, dq, \qquad (1)$$

where $\|\cdot\|_2$ denotes the 2-norm of $\mathbb{R}^m$. This definition can be understood as follows: consider the best coverage of $q \in S$ among those provided by each of the nodes $p_1, \ldots, p_n$, which corresponds to the value $\max_{i \in \{1, \ldots, n\}} f(\|q - p_i\|_2)$. Then, modulate the performance by the importance $\phi(q)$ of the location $q$. Finally, the infinitesimal sum of this quantity over the environment $S$ gives rise to $\mathcal{H}(p_1, \ldots, p_n)$ as a measure of the overall coverage provided by $p_1, \ldots, p_n$.
From here, we can formulate a geometric optimization problem, known as the continuous $p$-median problem; see Drezner (1995):

$$\max_{\{p_1, \ldots, p_n\} \subset S} \mathcal{H}(p_1, \ldots, p_n). \qquad (2)$$

The expected-value multicenter function can be alternatively described in terms of the Voronoi partition of $S$ generated by $P = \{p_1, \ldots, p_n\}$. Let us define the set

$$\mathcal{C} = \{\, (p_1, \ldots, p_n) \in (\mathbb{R}^m)^n \mid p_i = p_j \text{ for some } i \ne j \,\},$$

consisting of tuples of $n$ points, where some of them are repeated. Then, for $(p_1, \ldots, p_n) \in S^n \setminus \mathcal{C}$, one has

$$\mathcal{H}(p_1, \ldots, p_n) = \sum_{i=1}^{n} \int_{V_i(P)} f(\|q - p_i\|_2)\, \phi(q)\, dq. \qquad (3)$$

This expression of $\mathcal{H}$ is appealing because it clearly shows the result of the overall coverage of the environment as the aggregate contribution of all individual nodes. If $(p_1, \ldots, p_n) \in \mathcal{C}$, then a similar decomposition of $\mathcal{H}$ can be written in terms of the distinct points $P = \{p_1, \ldots, p_n\}$.

Inspired by (3), a more general version of the expected-value multicenter function is given next. Given $(p_1, \ldots, p_n) \in S^n$ and a partition $\{W_1, \ldots, W_n\}$ of $S$, let

$$\mathcal{H}(p_1, \ldots, p_n, W_1, \ldots, W_n) = \sum_{i=1}^{n} \int_{W_i} f(\|q - p_i\|_2)\, \phi(q)\, dq. \qquad (4)$$

For all $(p_1, \ldots, p_n) \in S^n \setminus \mathcal{C}$, we have that $\mathcal{H}(p_1, \ldots, p_n) = \mathcal{H}(p_1, \ldots, p_n, V_1(P), \ldots, V_n(P))$. With respect to, e.g., sensor networks, this function evaluates the performance associated with an assignment of the sensors' locations at $(p_1, \ldots, p_n)$ and a region assignment $(W_1, \ldots, W_n)$.

Moreover, one can establish that the Voronoi partition (Du et al. 1999) $\mathcal{V}(P)$ is optimal for $\mathcal{H}$ among all partitions of $S$. That is, let $P = \{p_1, \ldots, p_n\} \subset S$. For any performance function $f$ and for any partition $\{W_1, \ldots, W_n\}$ of $S$,

$$\mathcal{H}(p_1, \ldots, p_n, V_1(P), \ldots, V_n(P)) \ge \mathcal{H}(p_1, \ldots, p_n, W_1, \ldots, W_n),$$

with a strict inequality if any set in $\{W_1, \ldots, W_n\}$ differs from the corresponding set in $\{V_1(P), \ldots, V_n(P)\}$ by a set of positive measure.

Next, we characterize the smoothness of the expected-value multicenter function (Cortés et al. 2005). Before stating the precise properties, let us introduce some useful notation. For a performance function $f$, let $\mathrm{discont}(f)$ denote the (finite) set of points where $f$ is discontinuous. For each $a \in \mathrm{discont}(f)$, define the limiting values from the left and from the right, respectively, as

$$f_-(a) = \lim_{x \to a^-} f(x), \qquad f_+(a) = \lim_{x \to a^+} f(x).$$

Recall that the line integral of a function $g : \mathbb{R}^2 \to \mathbb{R}$ over a curve $C$ parameterized by a continuous and piecewise continuously differentiable map $\gamma : [0, 1] \to \mathbb{R}^2$ is defined as follows:

$$\int_C g = \int_C g(\gamma)\, d\gamma := \int_0^1 g(\gamma(t))\, \|\dot\gamma(t)\|_2\, dt,$$

and is independent of the selected parameterization.

Now, given a set $S \subset \mathbb{R}^m$ that is bounded and measurable, a density $\phi : S \to \mathbb{R}_{\ge 0}$, and a performance function $f : \mathbb{R}_{\ge 0} \to \mathbb{R}$, the expected-value multicenter function $\mathcal{H} : S^n \to \mathbb{R}$ is globally Lipschitz (given $S \subset \mathbb{R}^h$, a function $f : S \to \mathbb{R}^k$ is globally Lipschitz if there exists $K \in \mathbb{R}_{>0}$ such that $\|f(x) - f(y)\|_2 \le K\|x - y\|_2$ for all $x, y \in S$) on $S^n$ and continuously differentiable on $S^n \setminus \mathcal{C}$, where for $i \in \{1, \ldots, n\}$

$$\frac{\partial \mathcal{H}}{\partial p_i}(P) = \int_{V_i(P)} \frac{\partial}{\partial p_i} f(\|q - p_i\|_2)\, \phi(q)\, dq + \sum_{a \in \mathrm{discont}(f)} \big(f_-(a) - f_+(a)\big) \int_{V_i(P)\, \cap\, \partial B(p_i, a)} n_{\text{out}}(q)\, \phi(q)\, dq. \qquad (5)$$
In (5), $n_{\text{out}}$ is the outward normal vector to $B(p_i, a)$.

Different performance functions lead to different expected-value multicenter functions. Let us examine some important cases.

Distortion Problem

Consider the performance function $f(x) = -x^2$. Then, on $S^n \setminus \mathcal{C}$, the expected-value multicenter function takes the form

$$\mathcal{H}_{\text{distor}}(p_1, \ldots, p_n) = -\sum_{i=1}^{n} \int_{V_i(P)} \|q - p_i\|_2^2\, \phi(q)\, dq.$$

In signal compression, $\mathcal{H}_{\text{distor}}$ is referred to as the distortion function and is relevant in many disciplines, including vector quantization, signal compression, and numerical integration; see Gray and Neuhoff (1998) and Du et al. (1999). Here, distortion refers to the average deformation (weighted by the density $\phi$) caused by reproducing $q \in S$ with the location $p_i$ in $P = \{p_1, \ldots, p_n\}$ such that $q \in V_i(P)$. By means of the Parallel Axis Theorem (see Hibbeler 2006), it is possible to express $\mathcal{H}_{\text{distor}}$ as a sum

$$\mathcal{H}_{\text{distor}}(p_1, \ldots, p_n, W_1, \ldots, W_n) = \sum_{i=1}^{n} \left( -J_\phi(W_i, CM_\phi(W_i)) - A_\phi(W_i)\, \|p_i - CM_\phi(W_i)\|_2^2 \right), \qquad (6)$$

where $J_\phi(W, p) = \int_W \|q - p\|_2^2\, \phi(q)\, dq$ is the so-called moment of inertia of the region $W$ about $p$ with respect to $\phi$. In this way, the terms $J_\phi(W_i, CM_\phi(W_i))$ only depend on the partition of $S$, whereas the second terms, multiplied by $A_\phi(W_i)$, include the particular location of the points. As a consequence of this observation, the optimality of the centroid locations for $\mathcal{H}_{\text{distor}}$ follows (Bullo et al. 2009). More precisely, let $\{W_1, \ldots, W_n\}$ be a partition of $S$. Then, for any set of points $P = \{p_1, \ldots, p_n\}$ in $S$,

$$\mathcal{H}_{\text{distor}}(CM_\phi(W_1), \ldots, CM_\phi(W_n), W_1, \ldots, W_n) \ge \mathcal{H}_{\text{distor}}(p_1, \ldots, p_n, W_1, \ldots, W_n),$$

and the inequality is strict if there exists $i \in \{1, \ldots, n\}$ for which $W_i$ has nonvanishing area and $p_i \ne CM_\phi(W_i)$. In other words, the centroid locations $CM_\phi(W_1), \ldots, CM_\phi(W_n)$ are optimal for $\mathcal{H}_{\text{distor}}$ among all configurations in $S$. Note that when $n = 1$, the node location that optimizes $p \mapsto \mathcal{H}_{\text{distor}}(p)$ is the centroid of the set $S$, denoted by $CM_\phi(S)$.

Recall that the gradient of $\mathcal{H}_{\text{distor}}$ on $S^n \setminus \mathcal{C}$ takes the form

$$\frac{\partial \mathcal{H}_{\text{distor}}}{\partial p_i}(P) = 2 A_\phi(V_i(P))\, \big(CM_\phi(V_i(P)) - p_i\big), \qquad i \in \{1, \ldots, n\},$$

that is, the $i$th component of the gradient points in the direction of the vector going from $p_i$ to the centroid of its Voronoi cell. The critical points of $\mathcal{H}_{\text{distor}}$ are therefore the set of centroidal Voronoi configurations in $S$. This is a natural generalization of the result for the case $n = 1$, where the optimal node location is the centroid $CM_\phi(S)$.

Area Problem

For $r \in \mathbb{R}_{>0}$, consider the performance function $f(x) = 1_{[0,r]}(x)$, that is, the indicator function of the closed interval $[0, r]$. Then, the expected-value multicenter function becomes

$$\mathcal{H}_{\text{area},r}(p_1, \ldots, p_n) = \sum_{i=1}^{n} A_\phi\big(V_i(P) \cap B(p_i, r)\big) = A_\phi\Big(\bigcup_{i=1}^{n} B(p_i, r)\Big),$$

which corresponds to the area, measured according to $\phi$, covered by the union of the $n$ balls $B(p_1, r), \ldots, B(p_n, r)$.

Let us see how the computation of the partial derivatives of $\mathcal{H}_{\text{area},r}$ specializes in this case. Here, the performance function is differentiable everywhere except at a single discontinuity, and its derivative is identically zero. Therefore, the first term in (5) vanishes.
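The distortion gradient above, pointing from each node toward the centroid of its Voronoi cell, suggests the classical Lloyd-type iteration: repeatedly move every node to that centroid. A discretized, centralized sketch (ours, not from the article; uniform density on a sampled unit square, and all names are assumptions):

```python
import math
import random

def lloyd_step(points, samples):
    # One Voronoi-centroid step on a sampled environment: assign each
    # sample to its nearest node, then move each node to the centroid
    # of its (discretized) Voronoi cell; an empty cell leaves the node put.
    cells = [[] for _ in points]
    for q in samples:
        i = min(range(len(points)), key=lambda i: math.dist(q, points[i]))
        cells[i].append(q)
    return [(sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
            if c else p
            for p, c in zip(points, cells)]

def H_distor(points, samples):
    # Discretized distortion: minus the average of min_i ||q - p_i||^2.
    return -sum(min(math.dist(q, p) ** 2 for p in points)
                for q in samples) / len(samples)

rng = random.Random(2)
samples = [(rng.random(), rng.random()) for _ in range(4000)]
pts = [(0.05, 0.05), (0.06, 0.05), (0.9, 0.95)]
history = [H_distor(pts, samples)]
for _ in range(25):
    pts = lloyd_step(pts, samples)
    history.append(H_distor(pts, samples))
```

On a fixed sample set, both the assignment step and the centroid step cannot decrease $\mathcal{H}_{\text{distor}}$, so the iteration climbs monotonically toward a (discretized) centroidal Voronoi configuration — the same flavor as the distributed Voronoi-center deployment law reviewed in this article, here run centrally for illustration.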
The gradient of $\mathcal{H}_{\text{area},r}$ on $S^n \setminus \mathcal{C}$ then takes the form, for each $i \in \{1, \ldots, n\}$,

$$\frac{\partial \mathcal{H}_{\text{area},r}}{\partial p_i}(P) = \int_{V_i(P)\, \cap\, \partial B(p_i, r)} n_{\text{out}}(q)\, \phi(q)\, dq,$$

where $n_{\text{out}}$ is the outward normal vector to $B(p_i, r)$. The critical points of $\mathcal{H}_{\text{area},r}$ correspond to configurations with the property that each $p_i$ is a local maximum for the area of $V_{i,r}(P) = V_i(P) \cap B(p_i, r)$ at fixed $V_i(P)$. We refer to these configurations as $r$-limited area-centered Voronoi configurations.

Optimal Deployment Algorithms

Once a set of optimal deployment configurations has been characterized, the next step is to devise a distributed algorithm that allows a group of mobile robots to converge to such configurations. Gradient algorithms are the first of the options that should be explored. For the expected-value multicenter functions, and for robots whose dynamics can be described by first-order integrator dynamics and which can communicate at predetermined communication rounds of a fixed time schedule, these laws present a similar structure, loosely described as follows:

[Informal description] In each communication round, each robot performs the following tasks: (i) it transmits its position and receives its neighbors' positions; (ii) it computes a notion of the geometric center of its own cell, determined according to some notion of partition of the environment; (iii) between communication rounds, each robot moves toward this center.

The notions of geometric center and of partition of the environment differ depending on the type of expected-value multicenter function used. In the Voronoi-center deployment algorithm, the geometric center just reduces to $CM_\phi(V_i)$. In the limited-Voronoi-normal deployment algorithm, in (ii) each agent computes the direction of $v = \frac{\partial \mathcal{H}_{\text{area},r}}{\partial p_i}$ for some $r$, and in (iii) it moves for a maximum step size in this direction chosen to ensure that the area function will be increased.

The Voronoi-center deployment algorithm achieves convergence of a set of nodes to a centroidal Voronoi configuration, thus maximizing the expected-value multicenter function $\mathcal{H}_{\text{distor}}$. The algorithm is distributed over the proximity graph $G_D$, as the computation of the centroids requires information from $\mathcal{N}_{G_D}(p_i)$, for each $i \in \{1, \ldots, n\}$. Additional properties of this algorithm are that it is adaptive to agent departures or arrivals and amenable to asynchronous implementations.

On the other hand, the limited-Voronoi-normal deployment algorithm achieves convergence to a set that locally maximizes the area covered by the set of sensing balls. The algorithm is distributed in the sense that agents only need to know information from neighbors in the proximity graph $G_{2r}$ or, more precisely, $G_{\text{LD},r}$. Thus, it can be implemented by agents that employ range-limited interactions. It enjoys similar robustness properties as the Voronoi-center deployment algorithm.

Simulation Results

We show evolutions of the Voronoi-centroid deployment algorithm in Fig. 1. One can verify that the final network configuration is a centroidal Voronoi configuration. For each evolution, we depict the initial positions, the trajectories, and the final positions of all robots. Finally, we show an evolution of the limited-Voronoi-normal deployment algorithm in Fig. 2. One can verify that the final network configuration is an $\frac{r}{2}$-limited area-centered Voronoi configuration. In other words, the deployment task is achieved.

Future Directions for Research

The algorithms described above achieve locally optimal deployment configurations with respect to expected-value multicenter functions. However, this simplified setting does not account for many important constraints, such as obstacles and deployment in non-convex environments (Pimenta et al. 2008; Caicedo-Nùñez and Žefran 2008), deployment with visibility sensors, range-limited and wedge-shaped
Optimal Deployment and Spatial Coverage, Fig. 1 The evolution of the Voronoi-centroid deployment algorithm with n = 20 robots. The left-hand (resp., right-hand) figure illustrates the initial (resp., final) locations and Voronoi partition. The central figure illustrates the evolution of the robots. After 13 s, the value of $\mathcal{H}_{\text{distor}}$ has monotonically increased to approximately −0.515

Optimal Deployment and Spatial Coverage, Fig. 2 The evolution of the limited-Voronoi-normal deployment algorithm with n = 20 robots and r = 0.4. The left-hand (resp., right-hand) figure illustrates the initial (resp., final) locations and Voronoi partition. The central figure illustrates the evolution of the robots. The $\frac{r}{2}$-limited Voronoi cell of each robot is plotted in light gray. After 36 s, the value of $\mathcal{H}_{\text{area}}$, with $a = \frac{r}{2}$, has monotonically increased to approximately 14.141
footprints (Ganguli et al. 2006; Laventall and Cortés 2009), and energy and vehicle dynamical restrictions (Kwok and Martínez 2010a, b). Deployment strategies find application in exploration and data-gathering tasks, and so these algorithms have been expanded to account for uncertainty and learning of unknown density functions (Schwager et al. 2009; Graham and Cortés 2012; Zhong and Cassandras 2011; Martínez 2010). Gossip and self-triggered communications (Bullo et al. 2012; Nowzari and Cortés 2012), self-triggered computations for region approximation (Ru and Martínez 2013), and area-equitable partitions (Cortés 2010) have also been investigated. Much work is currently being devoted to overcoming the current limitations of these nontrivial extensions, which make the problem settings significantly harder to solve.

Cross-References

Graphs for Modeling Networked Interactions
Multi-vehicle Routing
Networked Systems

Bibliography

Bullo F, Cortés J, Martínez S (2009) Distributed control of robotic networks. Applied mathematics series. Princeton University Press. Available at http://www.coordinationbook.info
Bullo F, Carli R, Frasca P (2012) Gossip coverage control for robotic networks: dynamical systems on the space of partitions. SIAM J Control Optim 50(1):419–447
Caicedo-Nùñez CH, Žefran M (2008) Performing coverage on nonconvex domains. In: IEEE conference on control applications, San Antonio, pp 1019–1024
Cortés J (2010) Coverage optimization and spatial load balancing by robotic sensor networks. IEEE Trans Autom Control 55(3):749–754
Cortés J, Martínez S, Bullo F (2005) Spatially-distributed coverage optimization and control with limited-range interactions. ESAIM Control Optim Calc Var 11:691–719
Drezner Z (ed) (1995) Facility location: a survey of applications and methods. Springer series in operations research. Springer, New York
Du Q, Faber V, Gunzburger M (1999) Centroidal Voronoi tessellations: applications and algorithms. SIAM Rev 41(4):637–676
Ganguli A, Cortés J, Bullo F (2006) Distributed deployment of asynchronous guards in art galleries. In: American control conference, Minneapolis, pp 1416–1421
Graham R, Cortés J (2012) Cooperative adaptive sampling of random fields with partially known covariance. Int J Robust Nonlinear Control 22(5):504–534
Gray RM, Neuhoff DL (1998) Quantization. IEEE Trans Inf Theory 44(6):2325–2383. Commemorative Issue 1948–1998
Hibbeler R (2006) Engineering mechanics: statics & dynamics, 11th edn. Prentice Hall, Upper Saddle River
Kwok A, Martínez S (2010a) Deployment algorithms for a power-constrained mobile sensor network. Int J Robust Nonlinear Control 20(7):725–842
Kwok A, Martínez S (2010b) Unicycle coverage control via hybrid modeling. IEEE Trans Autom Control 55(2):528–532
Laventall K, Cortés J (2009) Coverage control by multi-robot networks with limited-range anisotropic sensory. Int J Control 82(6):1113–1121
Martínez S (2010) Distributed interpolation schemes for field estimation by mobile sensor networks. IEEE Trans Control Syst Technol 18(2):491–500
Nowzari C, Cortés J (2012) Self-triggered coordination of robotic networks for optimal deployment. Automatica 48(6):1077–1087
Pimenta L, Kumar V, Mesquita R, Pereira G (2008) Sensing and coverage for a network of heterogeneous robots. In: IEEE international conference on decision and control, Cancun, pp 3947–3952
Ru Y, Martínez S (2013) Coverage control in constant flow environments based on a mixed energy-time metric. Automatica 49:2632–2640
Schwager M, Rus D, Slotine J (2009) Decentralized, adaptive coverage control for networked robots. Int J Robot Res 28(3):357–375
Zhong M, Cassandras C (2011) Distributed coverage control and data collection with mobile sensor networks. IEEE Trans Autom Control 56(10):2445–2455

Optimal Sampled-Data Control

Yutaka Yamamoto
Department of Applied Analysis and Complex Dynamical Systems, Graduate School of Informatics, Kyoto University, Kyoto, Japan

Abstract

This article gives a brief overview of the modern development of sampled-data control. Sampled-data systems intrinsically involve a mixture of two different time sets, one continuous and the other discrete. Due to this, sampled-data systems cannot be characterized in terms of the standard notions of transfer functions, steady-state response, or frequency response. The technique of lifting resolves this difficulty and enables the recovery of such concepts and simplified solutions to sampled-data H∞ and H² optimization problems. We review the lifting point of view, its application to such optimization problems, and finally present an instructive numerical example.

Keywords

Computer control; Frequency response; H∞ and H² optimization; Lifting; Transfer operator

Introduction

A sampled-data control system consists of a continuous-time plant and a discrete-time controller, with sample and hold devices that serve as an interface between these two components. As can be seen from this fact, sampled-data systems are not time invariant, and various problems arise from this property.

To be more specific, consider the unity-feedback control system shown in Fig. 1; r is the reference signal, y the system output, and e the error signal. These are continuous-time signals. The error e(t) goes through the sampler

Optimal Sampled-Data Control, Fig. 1 A unity-feedback system

Optimal Sampled-Data Control, Fig. 2 Sampling with 0-order hold

(or an A/D converter) S. This sampler reads out the values of e(t) at every time step h, called the sampling period, and produces a discrete-time signal e_d[k], k = 0, 1, 2, ... (Fig. 2). In particular, the sampling operator S acts on a continuous-time signal w(t), t ≥ 0, as

S(w)[k] := w(kh),  k = 0, 1, 2, ...

The discretized signal is then processed by the discrete-time controller C(z) and becomes a control input u_d. There can also be a quantization effect, although for the sake of simplicity this is neglected here. The obtained signal u_d then goes through another interface H, called a hold device or a D/A converter, to become a continuous-time signal. A typical example is the 0-order hold, where H simply maintains the value of a discrete-time signal w[k] constant as its output until the next sampling time:

(H(w[k]))(t) := w[k],  for kh ≤ t < (k+1)h.

A typical sample-hold action is shown in Fig. 2. While one can consider a nonlinear plant P or controller C, or infinite-dimensional P and C, we confine ourselves to linear and finite-dimensional P and C, and also suppose that P and C are time invariant in continuous time and in discrete time, respectively.

The Main Difficulty

As stated above, the unity-feedback system of Fig. 1 is not time invariant either in continuous time or in discrete time, even when the plant and controller are both time invariant in their respective domains of operation. The mixture of the two time sets prohibits the total closed-loop system from being time invariant.
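To make these two interface operators concrete, the ideal sampler S and the 0-order hold H can be sketched in a few lines of Python. This is an illustrative sketch only; the helper names and the test signal are not part of the article:

```python
# Illustrative sketch of the ideal sampler S and the 0-order hold H.
import math

h = 0.5  # sampling period (an arbitrary illustrative value)

def sample(w, n, h):
    """S: continuous-time signal w(t) -> the sequence w(kh), k = 0, ..., n-1."""
    return [w(k * h) for k in range(n)]

def hold(wd, h):
    """H: discrete-time sequence wd[k] -> piecewise-constant function of t."""
    def w(t):
        k = int(t // h)                 # index of the interval [kh, (k+1)h)
        return wd[min(k, len(wd) - 1)]  # hold the last value beyond the data
    return w

e = lambda t: math.sin(t)               # a continuous-time error signal
ed = sample(e, 10, h)                   # e_d[k] = e(kh)
u = hold(ed, h)                         # (H e_d)(t) = e_d[k] for kh <= t < (k+1)h
```

Composing the hold with the sampler reproduces the staircase waveform of Fig. 2.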

The lack of time invariance implies that we cannot naturally associate to sampled-data systems such classical concepts as transfer functions, steady-state response, and frequency response.

One can regard Fig. 1 as a time-invariant discrete-time system by ignoring the intersample behavior and focusing attention on the sample-point behavior only. But the obtained model does not then reflect what happens between sampling times. This approach can lead to the neglect of undesirable intersample oscillations, called ripples. To monitor the intersample behavior, the notion of the modified z-transform was introduced; see, e.g., Jury (1958) and Ragazzini and Franklin (1958); however, this transform is usable only after the controller has been designed and hence not for the design problems considered in this article.

Lifting: A Modern Approach

A new approach was introduced around 1990–1991 (Bamieh et al. 1991; Tadmor 1991; Toivonen 1992; Yamamoto 1990, 1994). The new idea, now called lifting, makes it possible to describe sampled-data systems via a time-invariant model while maintaining the intersample behavior.

Let f(t) be a continuous-time signal. Instead of sampling f(t), we will represent it as a sequence of functions. Namely, we set up the correspondence:

L : f ↦ {f[k](·)}_{k=0}^∞,  f[k](θ) := f(kh + θ),  0 ≤ θ < h   (1)

See Fig. 3. This idea makes it possible to view a (time-invariant or even periodically time-varying) continuous-time system as a linear, time-invariant discrete-time system.

Let

ẋ(t) = Ax(t) + Bu(t),  y(t) = Cx(t)   (2)

be a given continuous-time plant and lift the input u(t) to obtain u[k](·). We apply this lifted input with the timing t = kh (h is the prespecified sampling rate as above) and observe how it affects the system. Let x[k] be the state at time t = kh. The state x[k+1] at time (k+1)h is given by

x[k+1] = e^{Ah} x[k] + ∫_0^h e^{A(h−τ)} B u[k](τ) dτ.   (3)

The right-hand side integral defines an operator

L^2[0,h) → R^n : u(·) ↦ ∫_0^h e^{A(h−τ)} B u(τ) dτ.

While the state transition (3) only describes a discrete-time update, the system keeps producing an output during the intersample period. If we consider the lifting of x(t), it is easily seen to be described by

x[k](θ) = e^{Aθ} x[k] + ∫_0^θ e^{A(θ−τ)} B u[k](τ) dτ.

As such, the lifted output y[k](·) is given by

y[k](θ) = C e^{Aθ} x[k] + C ∫_0^θ e^{A(θ−τ)} B u[k](τ) dτ.   (4)

Observe that formulas (3) and (4) take the form

Optimal Sampled-Data Control, Fig. 3 Lifting



x[k+1] = 𝒜x[k] + ℬu[k]
y[k] = 𝒞x[k] + 𝒟u[k],

and the operators 𝒜, ℬ, 𝒞, 𝒟 do not depend on the time variable k. In other words, it is possible to describe this continuous-time system with discrete timing, once we adopt the lifting point of view. To be more precise, the operators 𝒜, ℬ, 𝒞, 𝒟 are defined as follows:

𝒜 : R^n → R^n : x ↦ e^{Ah} x
ℬ : L^2[0,h) → R^n : u ↦ ∫_0^h e^{A(h−τ)} B u(τ) dτ
𝒞 : R^n → L^2[0,h) : x ↦ C e^{Aθ} x
𝒟 : L^2[0,h) → L^2[0,h) : u ↦ ∫_0^θ C e^{A(θ−τ)} B u(τ) dτ   (5)

Thus the continuous-time plant (2) can be described by a time-invariant discrete-time model. Once this is done, it is straightforward to connect this expression with a discrete-time controller, and hence sampled-data systems (for example, Fig. 1) can be fully described by time-invariant discrete-time equations, without discarding the intersampling information. We will also denote the overall equation (with discrete-time controller included) abstractly in the form

x[k+1] = 𝒜x[k] + ℬu[k]
y[k] = 𝒞x[k] + 𝒟u[k].   (6)

While the obtained discrete-time model is time invariant, the input and output spaces are now infinite dimensional. Its transfer function (operator) is defined as

G(z) := 𝒟 + 𝒞(zI − 𝒜)^{−1}ℬ.   (7)

Note that 𝒜 in (6) is a matrix because it is so for 𝒜 in (5). Hence, (6) is stable if G(z) is analytic for {z : |z| ≥ 1}, provided that there is no unstable pole-zero cancellation.

Definition 1 Let G(z) be the transfer operator of the lifted system given by (7), which is stable in the sense above. The frequency response operator is the operator

G(e^{jωh}) : L^2[0,h) → L^2[0,h)   (8)

regarded as a function of ω ∈ [0, ω_s) (ω_s := 2π/h). Its gain at ω is defined to be

‖G(e^{jωh})‖ = sup_{v ∈ L^2[0,h)} ‖G(e^{jωh})v‖ / ‖v‖.   (9)

The maximum of ‖G(e^{jωh})‖ over [0, ω_s) is the H∞ norm of G(z). The H²-norm of G is defined by

‖G‖_2 := ( (h/2π) ∫_0^{2π/h} trace{G*(e^{jωh}) G(e^{jωh})} dω )^{1/2},   (10)

where the trace here is taken in the sense of the Hilbert-Schmidt norm; see Chen and Francis (1995) for details.

H∞ and H² Control Problems

A significant consequence of the lifting approach described above is that various robust control problems such as H∞ and H² control problems for sampled-data control systems can be converted to corresponding discrete-time (finite-dimensional) problems. The approach was initiated by Chen and Francis (1990) and later solved by Bamieh and Pearson (1992), Kabamba and Hara (1993), Sivashankar and Khargonekar (1994), Tadmor (1991), and Toivonen (1992) in more complete forms; see Chen and Francis (1995) for the pertinent historical accounts.

Let us introduce the notion of generalized plants. Suppose that a continuous-time plant is given in the following model:

ẋ_c(t) = Ax_c(t) + B_1 w(t) + B_2 u(t)
z(t) = C_1 x_c(t) + D_11 w(t) + D_12 u(t)   (11)
y(t) = C_2 x_c(t)

Here w is the exogenous input, u(t) the control input, y(t) the measured output, and z(t) the controlled output. The objective is to design a controller that takes the sampled measurements of y and returns a control variable u according to the following formula:

x_d[k+1] = A_d x_d[k] + B_d S y[k]
v[k] = C_d x_d[k] + D_d S y[k]   (12)
u[k](θ) = H(θ) v[k]

Optimal Sampled-Data Control, Fig. 4 Sampled feedback system

where H(θ) is a suitable hold function. This is depicted in Fig. 4. The objective here is to design or characterize a controller that achieves a prescribed performance level γ > 0 in such a way that

‖T_zw‖_∞ < γ   (13)

where T_zw denotes the closed-loop transfer operator from w to z. This is the H∞ control problem for sampled-data systems. If we take the H²-norm (10) instead, then the problem becomes that of the H² (sub)optimal control problem.

The difficulty here is that both w and z are continuous-time variables, and hence their lifted variables are infinite dimensional. A remarkable fact here is that the H∞ problem (13) (and the H² problem as well) can be equivalently transformed to an H∞ problem for a finite-dimensional discrete-time system. We will indicate in the next section how this can be done.

H∞ Norm Computation and Reduction to Finite Dimension

Let us write the system (11) and (12) in the form

x[k+1] = 𝒜x[k] + ℬu[k]
y[k] = 𝒞x[k] + 𝒟u[k]   (14)

as in (6). For simplicity of treatment, assume D_11 in (11) to be zero; for the general case, see Yamamoto and Khargonekar (1996).

Let G(z) be the transfer operator G(z) := 𝒟 + 𝒞(zI − 𝒜)^{−1}ℬ. The H∞ norm of G is given as the maximum of the singular values of the gain G(e^{jωh}) for ω ∈ [0, 2π/h).

Now consider the singular value equation

(γ²I − G*G(e^{jωh}))w = 0,   (15)

and suppose that γ > ‖𝒟‖. A crux here is that 𝒜, ℬ, 𝒞 are finite-rank operators, and we can reduce this to a finite-dimensional rank condition. Taking the adjoint of (14), we obtain

p[k] = 𝒜*p[k+1] + 𝒞*v[k]
e[k] = ℬ*p[k+1] + 𝒟*v[k].

Taking the z-transforms of both sides, setting z = e^{jωh}, and substituting v = y and e = γ²w, we obtain

e^{jωh} x = 𝒜x + ℬw
p = e^{jωh} 𝒜*p + 𝒞*(𝒞x + 𝒟w)
(γ² − 𝒟*𝒟)w = e^{jωh} ℬ*p + 𝒟*𝒞x.

Eliminating the variable w then yields

( e^{jωh} [I, ℬR^{−1}ℬ*; 0, 𝒜* + 𝒞*𝒟R^{−1}ℬ*] − [𝒜 + ℬR^{−1}𝒟*𝒞, 0; −𝒞*(I + 𝒟R^{−1}𝒟*)𝒞, I] ) [x; p] = 0   (16)

where R = (γ²I − 𝒟*𝒟). The important point to be noted here is that all the operators appearing here are actually matrices. For example, ℬ is an operator from L^2[0,h) to R^n, and its adjoint ℬ* is an operator from R^n to L^2[0,h). Hence, the composition ℬR^{−1}ℬ* is a linear operator from

R^n into itself, i.e., a matrix. Thus, for a given γ, the singular value equation (15) admits a nontrivial solution w if and only if the finite-dimensional equation (16) admits a nontrivial solution [x; p] (Yamamoto 1993; Yamamoto and Khargonekar 1996). (Note that R is invertible since γ > ‖𝒟‖.)

It is possible to find matrices Ã, B̃, C̃ such that Ã = 𝒜 + ℬR^{−1}𝒟*𝒞, B̃B̃*/γ² = ℬR^{−1}ℬ*, and C̃*C̃ = 𝒞*(I + 𝒟R^{−1}𝒟*)𝒞, and hence (16) is equivalent to

( ζ [I, B̃B̃*/γ²; 0, Ã*] − [Ã, 0; −C̃*C̃, I] ) [x; p] = 0   (17)

for ζ = e^{jωh}. In other words, we have that ‖G‖_∞ < γ if and only if there exists no ζ of modulus 1 such that (17) holds.

It can be proven that by substituting the expressions of (11) and (12) for (𝒜, ℬ, 𝒞, 𝒟), one obtains a finite-dimensional discrete-time generalized plant G_d with digital controller (12) such that ‖G‖_∞ < γ if and only if ‖G_d‖_∞ < γ. The precise formulas for the discrete-time plant can be found in, e.g., Bamieh and Pearson (1992), Chen and Francis (1995), Kabamba and Hara (1993), Yamamoto and Khargonekar (1996), and Cantoni and Glover (1997).

An H∞ Design Example

For sampled-data control systems, there used to be, and still is, a rather common myth that if one takes a sufficiently fast sampling rate, it will not cause a major problem. This can be true for continuous-time design, but we here show that if we employ a sample-point discretization without a performance consideration for intersampling behavior, fast sampling rates can cause a serious problem.

Take a simple second-order plant P(s) = 1/(s² + 0.1s + 1), and consider the disturbance rejection problem minimizing the H∞-norm from w to z as given in Fig. 5. Set the sampling time h = 0.5. We execute the following:

• Sampled-data H∞ design with the generalized plant

G(s) = [P(s), P(s); P(s), P(s)],

• Discrete-time H∞ design with the discrete-time generalized plant G_d(z) given by the step-invariant transformation (see, e.g., Chen and Francis 1995) of G(s).

Figures 6 and 7 show the frequency and time responses of the two resulting closed-loop systems, respectively. In Fig. 6, the solid curve shows the response of the sampled-data design, while the dash-dotted curve shows the discrete-time frequency response, purely reflecting its sample-point behavior only. At first glance, it may appear that the discrete-time design performs better. But when we actually compute the lifted sampled-data frequency response in the sense defined in Definition 1, it becomes obvious that the sampled-data design is far superior. The dashed curve shows the frequency response of the closed loop, i.e., that of G(s) connected with the discrete-time designed K_d. The response is similar to the discrete-time frequency response in low frequency, but exhibits a very sharp peak around the Nyquist frequency (i.e., half the sampling frequency; in the present case, π/h ≈ 6.28 rad/s, i.e., 1/(2h) = 1 Hz).

This can also be verified from the initial-state responses in Fig. 7 with x(0) = (1, 1). The solid curve shows the sampled-data design and the dashed curve the discrete-time one. Both responses decay to zero rapidly at sampled instants, as shown by the circles for the discrete-time design. But the discrete-time design exhibits very large ripples, with period approximately 1 s. This corresponds to 1 Hz, which is the same as 2π = π/h [rad/s], i.e., the Nyquist frequency. This is precisely captured in the lifted frequency response in Fig. 6.

It is worth noting that when we take the sampling period h smaller, the response for the discrete-time design becomes even more oscillatory and shows a very high peak in the frequency response.
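The ripple phenomenon above can be illustrated without carrying out the two H∞ syntheses (which require a design tool): a signal oscillating at the Nyquist frequency 1/(2h) Hz vanishes at every sampling instant, so any evaluation based purely on sample-point behavior is blind to it. The following self-contained sketch uses an illustrative stand-in signal, not the actual closed-loop response of the example:

```python
# A ripple at the Nyquist frequency 1/(2h) Hz is invisible at t = kh.
import math

h = 0.5                                        # sampling period of the example
ripple = lambda t: math.sin(math.pi * t / h)   # period 2h, i.e., 1 Hz here

# Sample-point behavior: sin(pi*k) = 0 for every integer k.
max_at_samples = max(abs(ripple(k * h)) for k in range(20))

# Intersample behavior: halfway between samples the amplitude is 1.
max_between = max(abs(ripple((k + 0.5) * h)) for k in range(20))

print(max_at_samples)   # essentially 0: the sampler sees nothing
print(max_between)      # 1.0: a large oscillation hidden between samples
```

This is exactly the 1 s period ripple visible in Fig. 7: it is invisible at the circles marking the sampling instants.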

Optimal Sampled-Data Control, Fig. 5 Disturbance rejection

Optimal Sampled-Data Control, Fig. 6 Frequency responses, h = 0.5

Summary, Bibliographical Notes, and Future Directions

We have given a short summary of the main achievements of modern sampled-data control theory. Particularly, we have reviewed how the technique of lifting resolved the intrinsic difficulty arising from the mixture of two distinct time sets: continuous and discrete. This idea further led to the new notions of transfer operators and frequency response. These notions together enabled us to treat optimal sampled-data control problems in a unified and transparent way. We have outlined how the sampled-data H∞ control problem can equivalently be reduced to a corresponding discrete-time H∞ problem, without sacrificing the performance in the intersample behavior. This has been exemplified by a numerical example.

There are other performance indices for optimality, typically those arising from H² and L¹ norms. These problems have also been studied extensively, and fairly complete solutions are available. For lack of space, we cannot list all references, and the reader is referred to Chen and Francis (1995) and Yamamoto (1999) for a more concrete survey and the references therein.

For classical treatments of sampled-data control, it is instructive to consult Jury (1958) and Ragazzini and Franklin (1958). The textbook Åström and Wittenmark (1996) covers both classical and modern aspects of digital control. For a mathematical background of the computation of adjoints treated in section "H∞ Norm Computation and Reduction to Finite Dimension," consult Yamamoto (2012) as well as Yamamoto (1993).

Since control devices are now mostly digital, the importance of sampled-data control will

Optimal Sampled-Data Control, Fig. 7 Initial-state responses, h = 0.5

definitely increase. While the linear, time-invariant case as treated here is now fairly complete, sampled-data control for a nonlinear or an infinite-dimensional plant seems to be still quite an open issue, although it is unclear if the methodology treated here is effective for such classes of plants.

Sampled-data control has much to do with signal processing. Indeed, since it can optimize continuous-time performance, it can shed new light on digital signal processing. Traditionally, Shannon's paradigm based on the perfect band-limiting hypothesis and the sampling theorem has been prevalent in the signal processing community. Since the sampling theorem opts for perfect reconstruction, the resulting theory reduces mostly to discrete-time problems. In other words, the intersample information is buried in the sampling theorem. It should, however, be noted that the very stringent band-limiting hypothesis is almost never satisfied in reality, and various approximations are necessitated. In contrast, sampled-data control can provide an optimal platform for dealing with and optimizing the response between sampling points when the band-limiting hypothesis does not hold. See, for example, Yamamoto et al. (2012) and Nagahara and Yamamoto (2012) for the idea and some efforts in this direction.

Cross-References

▶ Control Applications in Audio Reproduction
▶ H2 Optimal Control
▶ H-Infinity Control
▶ Optimal Control via Factorization and Model Matching

Acknowledgments The author would like to thank Masaaki Nagahara and Masashi Wakaiki for their help in the numerical example.

Bibliography

Åström KJ, Wittenmark B (1996) Computer controlled systems—theory and design, 3rd edn. Prentice Hall, Upper Saddle River

Bamieh B, Pearson JB (1992) A general framework for linear periodic systems with applications to H∞ sampled-data control. IEEE Trans Autom Control 37:418–435
Bamieh B, Pearson JB, Francis BA, Tannenbaum A (1991) A lifting technique for linear periodic systems with applications to sampled-data control systems. Syst Control Lett 17:79–88
Cantoni M, Glover K (1997) H∞ sampled-data synthesis and related numerical issues. Automatica 33:2233–2241
Chen T, Francis BA (1990) On the L²-induced norm of a sampled-data system. Syst Control Lett 15:211–219
Chen T, Francis BA (1995) Optimal sampled-data control systems. Springer, New York
Jury EI (1958) Sampled-data control systems. Wiley, New York
Kabamba PT, Hara S (1993) Worst case analysis and design of sampled data control systems. IEEE Trans Autom Control 38:1337–1357
Nagahara M, Yamamoto Y (2012) Frequency domain min-max optimization of noise-shaping Delta-Sigma modulators. IEEE Trans Signal Process 60:2828–2839
Ragazzini JR, Franklin GF (1958) Sampled-data control systems. McGraw-Hill, New York
Sivashankar N, Khargonekar PP (1994) Characterization and computation of the L²-induced norm of sampled-data systems. SIAM J Control Optim 32:1128–1150
Tadmor G (1991) Optimal H∞ sampled-data control in continuous time systems. In: Proceedings of ACC'91, Boston, Massachusetts, pp 1658–1663
Toivonen HT (1992) Sampled-data control of continuous-time systems with an H∞ optimality criterion. Automatica 28:45–54
Yamamoto Y (1990) New approach to sampled-data systems: a function space method. In: Proceedings of 29th CDC, Honolulu, Hawaii, pp 1882–1887
Yamamoto Y (1993) On the state space and frequency domain characterization of H∞-norm of sampled-data systems. Syst Control Lett 21:163–172
Yamamoto Y (1994) A function space approach to sampled-data control systems and tracking problems. IEEE Trans Autom Control 39:703–712
Yamamoto Y (1999) Digital control. In: Webster JG (ed) Wiley encyclopedia of electrical and electronics engineering, vol 5. Wiley, New York, pp 445–457
Yamamoto Y (2012) From vector spaces to function spaces—introduction to functional analysis with applications. SIAM, Philadelphia
Yamamoto Y, Khargonekar PP (1996) Frequency response of sampled-data systems. IEEE Trans Autom Control 41:166–176
Yamamoto Y, Nagahara M, Khargonekar PP (2012) Signal reconstruction via H∞ sampled-data control theory—Beyond the Shannon paradigm. IEEE Trans Signal Process 60:613–625

Optimization Algorithms for Model Predictive Control

Moritz Diehl
Department of Microsystems Engineering (IMTEK), University of Freiburg, Freiburg, Germany
ESAT-STADIUS/OPTEC, KU Leuven, Leuven-Heverlee, Belgium

Abstract

This entry reviews optimization algorithms for both linear and nonlinear model predictive control (MPC). Linear MPC typically leads to specially structured convex quadratic programs (QP) that can be solved by structure-exploiting active set, interior point, or gradient methods. Nonlinear MPC leads to specially structured nonlinear programs (NLP) that can be solved by sequential quadratic programming (SQP) or nonlinear interior point methods.

Keywords

Banded matrix factorization; Convex optimization; Karush-Kuhn-Tucker (KKT) conditions; Sparsity exploitation

Introduction

Model predictive control (MPC) needs to solve at each sampling instant an optimal control problem with the current system state x̄_0 as initial value. MPC optimization is almost exclusively based on the so-called direct approach, which first discretizes the continuous-time system to obtain a discrete-time optimal control problem (OCP). This OCP has as optimization variables a state trajectory X = [x_0^T, ..., x_N^T]^T with x_i ∈ R^{n_x} for i = 0, ..., N and a control trajectory U = [u_0^T, ..., u_{N−1}^T]^T with u_i ∈ R^{n_u} for i = 0, ..., N−1. For simplicity of presentation, we

restrict ourselves to the time-independent case, and the OCP we treat in this article is stated as follows:

minimize_{X, U}   Σ_{i=0}^{N−1} L(x_i, u_i) + E(x_N)   (1a)

subject to   x_0 − x̄_0 = 0,   (1b)
   x_{i+1} − f(x_i, u_i) = 0,   i = 0, ..., N−1,   (1c)
   h(x_i, u_i) ≤ 0,   i = 0, ..., N−1,   (1d)
   r(x_N) ≤ 0.   (1e)

The MPC objective is stated in Eq. (1a), the system dynamics enter via Eq. (1c), while path and terminal constraints enter via Eqs. (1d) and (1e). All functions are assumed to be differentiable and to have appropriate dimensions (h(x, u) ∈ R^{n_h} and r(x) ∈ R^{n_r}). Note that x̄_0 ∈ R^{n_x} is not an optimization variable, but a parameter upon which the OCP depends via the initial value constraint in Eq. (1b). The optimal solution trajectories depend only on this value and can thus be denoted by X*(x̄_0) and U*(x̄_0). Obtaining them, in particular the first control value u_0*(x̄_0), as fast and reliably as possible for each new value of x̄_0 is the aim of all MPC optimization algorithms. The most important dividing line is between convex and non-convex optimal control problems (OCP). If the OCP is convex, algorithms exist that find a global solution reliably and in computable time. If the OCP is not convex, one usually needs to be satisfied with approximations of locally optimal solutions. The OCP (1) is convex if the objective (1a) and all components of the inequality constraint functions (1d) and (1e) are convex and if the equality constraints (1c) are linear.

We typically speak of linear MPC when the OCP to be solved is convex, and otherwise of nonlinear MPC.

General Algorithmic Features for MPC Optimization

In MPC we would dream to have the solution to a new optimal control problem instantly, which is impossible due to computational delays. Several ideas can help us to deal with this issue.

Off-line Precomputations and Code Generation

As consecutive MPC problems are similar and differ only in the value x̄_0, many computations can be done once and for all before the MPC controller execution starts. Careful preprocessing and code optimization for the model routines is essential, and many tools automatically generate custom solvers in low-level languages. The generated code has fixed matrix and vector dimensions, has no online memory allocations, and contains a minimal number of if-then-else statements to ensure a smooth computational flow.

Delay Compensation by Prediction

When we know how long our computations for solving an MPC problem will take, it is a good idea not to address a problem starting at the current state but to simulate at which state the system will be when we will have solved the problem. This can be done using the MPC system model and the open-loop control inputs that we will apply in the meantime. This feature is used in many practical MPC schemes with non-negligible computation time.

Division into Preparation and Feedback Phase

A third ingredient of several MPC algorithms is to divide the computations at each sampling time into a preparation phase and a feedback phase. The more CPU-intensive preparation phase is performed with a predicted state x̄_0, before the most current state estimate, say x̄_0′, is available. Once x̄_0′ is available, the feedback phase quickly delivers an approximate solution to the optimization problem for x̄_0′.

Warmstarting and Shift

An obvious way to transfer solution information from one solved MPC problem to the next one uses the existing optimal solution as an initial guess to start the iterative solution procedure of the next problem. We can either directly use the existing solution without modification for warmstarting or we can first shift it in order to

account for the advancement of time, which is particularly advantageous for systems with time-varying dynamics or objectives.

Iterating While the Problem Changes

A last important ingredient of some MPC algorithms is the idea to work on the optimization problem while it changes, i.e., to never iterate the optimization procedure to convergence for an MPC problem getting older and older during the iterations but to rather work with the most current information in each new iteration.

Convex Optimization for Linear MPC

Linear MPC is based on a linear system model of the form x_{i+1} = Ax_i + Bu_i and convex objective and constraint functions in (1a), (1d), and (1e). The most widespread linear MPC setting uses a convex quadratic objective function and affine constraints and solves the following quadratic program (QP):

minimize_{X, U}   (1/2) Σ_{i=0}^{N−1} [x_i; u_i]^T [Q, S; S^T, R] [x_i; u_i] + (1/2) x_N^T P x_N   (2a)

subject to   x_0 − x̄_0 = 0,   (2b)
   x_{i+1} − Ax_i − Bu_i = 0,   i = 0, ..., N−1,   (2c)
   b + Cx_i + Du_i ≤ 0,   i = 0, ..., N−1,   (2d)
   c + Fx_N ≤ 0.   (2e)

Here, b, c are vectors and Q, S, R, P, C, D, F matrices, and the matrices [Q, S; S^T, R] and P are symmetric and positive semi-definite to ensure the QP is convex.

Sparsity Exploitation

The QP (2) has a specific sparsity structure that can be exploited in different ways. One way is to reduce the variable space by a procedure called condensing and then to solve a smaller-scale QP instead of (2). Another way is to use a banded matrix factorization.

Condensing

The constraints (2b) and (2c) can be used to eliminate the state trajectory X. This yields an equivalent but smaller-scale QP of the following form:

minimize_{U ∈ R^{N n_u}}   (1/2) [U; x̄_0]^T [H, G; G^T, J] [U; x̄_0]   (3a)

subject to   d + K x̄_0 + M U ≤ 0.   (3b)

The number of inequality constraints is the same as in the original QP (2) and given by m = N n_h + n_r. Note that in the simplest case without inequalities (m = 0), the solution U*(x̄_0) of the condensed QP can be obtained by setting the gradient of the objective to zero, i.e., by solving H U*(x̄_0) + G x̄_0 = 0. The factorization of a dense matrix H with dimension N n_u × N n_u needs O(N³ n_u³) arithmetic operations, i.e., the computational cost of condensing-based algorithms typically grows cubically with the horizon length N.

Banded Matrix Factorization

An alternative way to deal with the sparsity is best sketched using a sparse convex QP (2) without inequality constraints (2d) and (2e). We define the vector of Lagrange multipliers Y = [y_0^T, ..., y_N^T]^T and the Lagrangian function by

L(X, U, Y) = y_0^T (x_0 − x̄_0) + (1/2) x_N^T P x_N + Σ_{i=0}^{N−1} ( (1/2) [x_i; u_i]^T [Q, S; S^T, R] [x_i; u_i] + y_{i+1}^T (x_{i+1} − Ax_i − Bu_i) ).   (4)

If we reorder all unknowns that enter the Lagrangian and summarize them in the vector

W = [y_0^T, x_0^T, u_0^T, y_1^T, x_1^T, u_1^T, ..., y_N^T, x_N^T]^T,

the optimal solution W*(x̄_0) is uniquely characterized by the first-order optimality condition
Due to the linear quadratic dependence of \(L\) on \(W\), this optimality condition is a block-banded linear equation in the unknown \(W^*\):

\[
\begin{bmatrix}
0 & I & & & & & \\
I & Q & S & -A^{\top} & & & \\
 & S^{\top} & R & -B^{\top} & & & \\
-A & -B & 0 & I & & & \\
 & & & I & Q & \ddots & \\
 & & & & \ddots & \ddots & \\
 & & & & \ddots & 0 & I \\
 & & & & & I & P
\end{bmatrix}
W^* =
\begin{bmatrix}
\bar{x}_0 \\ 0 \\ 0 \\ 0 \\ \vdots \\ 0 \\ 0
\end{bmatrix}.
\]

Because the above matrix has nonzero elements only on a band of width \(2n_x + n_u\) around the diagonal, it can be factorized with a computational cost of order \(O(N (2n_x + n_u)^3)\).

Treatment of Inequalities
An important observation is that both the uncondensed QP (2) and the equivalent condensed QP (3) typically fall into the class of strictly convex parametric quadratic programs: the solution \(U^*(\bar{x}_0), X^*(\bar{x}_0)\) is unique and depends piecewise affinely and continuously on the parameter \(\bar{x}_0\). Each affine piece of the solution map corresponds to one active set and is valid on one polyhedral critical region in parameter space. This observation forms the basis of explicit MPC algorithms which precompute the map \(u_0^*(\bar{x}_0)\), but it can also be exploited in online algorithms for quadratic programming, which are the focus of this section.

We sketch the different ways of how to treat inequalities only for the condensed QP (3), but they can equally be applied in sparse QP algorithms that directly address (2). The optimal solution \(U^*(\bar{x}_0)\) for a strictly convex QP (3) is – together with the corresponding vector of Lagrange multipliers, or dual variables, \(\lambda^*(\bar{x}_0) \in \mathbb{R}^m\) – uniquely characterized by the so-called Karush-Kuhn-Tucker (KKT) conditions, which we omit for brevity. There are three big families of solution algorithms for inequality-constrained QPs that differ in the way they treat inequalities: active set methods, interior point methods, and gradient projection methods.

Active Set Methods
The optimal solution of the QP is characterized by its active set, i.e., the set of inequality constraints (3b) that are satisfied with equality at this point. If one knew the active set for a given problem instance \(\bar{x}_0\), it would be easy to find the solution. Active set methods work with guesses of the active set which they iteratively refine. In each iteration, they solve one linear system corresponding to a given guess of the active set. If the KKT conditions are satisfied, the optimal solution is found; if they are not, another division into active and inactive constraints needs to be tried. A crucial observation is that an existing factorization of the linear system can be reused to a large extent when only one constraint is added or removed from the guess of the active set. Many different active set strategies exist, three of which we mention: Primal active set methods first find a feasible point and then add or remove active constraints, always keeping the primal variables \(U\) feasible. Due to the difficulty of finding a feasible point first, they are difficult to warmstart in the context of MPC optimization. Dual active set strategies always keep the dual variables \(\lambda\) positive. They can easily be warmstarted in the context of MPC. Parametric or online active set strategies ensure that all iterates stay primal and dual feasible and go on a straight line in parameter space from a solved QP problem to the current one, only updating the active set when crossing the boundary between critical regions, as implemented in the online QP solver qpOASES.

Active set methods are very competitive in practice, but their worst case complexity is not polynomial. They are often used together with the condensed QP formulation, for which each active set change is relatively cheap. This is particularly advantageous if an existing factorization can be kept between subsequent MPC optimization problems.

Interior Point Methods
Another approach is to replace the KKT conditions by a smooth approximation that uses a small positive parameter \(\tau > 0\):

\[
H U^* + G \bar{x}_0 + M^{\top} \lambda^* = 0, \tag{5a}
\]
\[
\left( d + K \bar{x}_0 + M U^* \right)_i \lambda_i^* + \tau = 0, \quad i = 1, \ldots, m. \tag{5b}
\]
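The mechanics of solving such smoothed conditions can be sketched on a hypothetical scalar QP with a single bound constraint (this toy instance is invented for illustration and is not taken from the article).

```python
import numpy as np

# Toy instance:  min 0.5*H*u**2 + g*u   s.t.  u - 1 <= 0,
# written as  d + M*u <= 0  with  M = 1, d = -1.  Newton's method is
# applied to the smoothed conditions
#     H*u + g + M*lam = 0,     (d + M*u)*lam + tau = 0,
# while tau is driven toward zero, as an interior point method does.
H, g, M, d = 1.0, -2.0, 1.0, -1.0
u, lam, tau = 0.0, 1.0, 1.0          # strictly interior start

for _ in range(25):                  # outer loop: shrink tau
    for _ in range(20):              # inner loop: Newton steps
        F = np.array([H * u + g + M * lam,
                      (d + M * u) * lam + tau])
        J = np.array([[H, M],
                      [M * lam, d + M * u]])
        du, dlam = np.linalg.solve(J, -F)
        t = 1.0                      # damp to stay strictly interior
        while d + M * (u + t * du) >= 0 or lam + t * dlam <= 0:
            t *= 0.5
        u, lam = u + t * du, lam + t * dlam
        if np.linalg.norm(F) < 1e-10:
            break
    tau *= 0.2

print(u, lam)   # approaches the exact solution u* = 1, lam* = 1
```

For this instance the exact KKT solution is \(u^* = 1\) with multiplier \(\lambda^* = 1\); the iterates follow the central path toward it as \(\tau \to 0\).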
Conditions (5a) and (5b) form a smooth nonlinear system of equations that uniquely determines a primal-dual solution \(U^*(\bar{x}_0, \tau)\) and \(\lambda^*(\bar{x}_0, \tau)\) in the interior of the feasible set. They are not equivalent to the KKT conditions, but for \(\tau \to 0\), their solution tends to the exact QP solution. An interior point algorithm solves the system (5a) and (5b) by Newton's method. Simultaneously, the path parameter \(\tau\), that was initially set to a large value, is iteratively reduced, making the nonlinear set of equations a closer approximation of the original KKT system. In each Newton iteration, a linear system needs to be factored and solved, which constitutes the major computational cost of an interior point algorithm. For the condensed QP (3) with dense matrices \(H, M\), the cost per Newton iteration is of order \(O(N^3)\). But the interior point algorithm can also be applied to the uncondensed sparse QP (2), in which case each iteration has a runtime of order \(O(N)\). In practice, for both cases, 10–30 Newton iterations usually suffice to obtain very accurate solutions. As an interior point method needs always to start with a high value of \(\tau\) and then reduces it during the iterations, warmstarting is of minor benefit. There exist efficient code generation tools that export convex interior point solvers as plain C-code such as CVXGEN and FORCES.

Gradient Projection Methods
Gradient projection methods do not need to factorize any matrix but only evaluate the gradient of the objective function \(H U^{[k]} + G \bar{x}_0\) in each iteration. They can only be implemented efficiently if the feasible set is a simple set in the sense that a projection \(\mathcal{P}(U)\) on this set is very cheap to compute, as, e.g., for upper and lower bounds on the variables \(U\), and if we know an upper bound \(L_H > 0\) on the eigenvalues of the Hessian \(H\). The simple gradient projection algorithm starts with an initialization \(U^{[0]}\) and proceeds as follows:

\[
U^{[k+1]} = \mathcal{P}\!\left( U^{[k]} - \frac{1}{L_H}\left(H U^{[k]} + G \bar{x}_0\right) \right).
\]

An improved version of the gradient projection algorithm is called the optimal or fast gradient method and has probably the best possible iteration complexity of all gradient type methods. All variants of gradient projection algorithms are easy to warmstart. Though they are not as versatile as active set or interior point methods, they have short code sizes and can offer advantages on embedded computational hardware, such as the code generated by the tool FiOrdOs.

Optimization Algorithms for Nonlinear MPC

When the dynamic system \(x_{i+1} = f(x_i, u_i)\) is not affine, the optimal control problem (1) is non-convex, and we speak of a nonlinear MPC (NMPC) problem. NMPC optimization algorithms only aim at finding a locally optimal solution of this problem, and they usually do it in a Newton-type framework. For ease of notation, we summarize problem (1) in the form of a general nonlinear programming problem (NLP):

\[
\begin{aligned}
\underset{X,\,U}{\text{minimize}}\quad & \Phi(X, U) && \text{(6a)}\\
\text{subject to}\quad & G_{\mathrm{eq}}(X, U, \bar{x}_0) = 0, && \text{(6b)}\\
& G_{\mathrm{ineq}}(X, U) \le 0. && \text{(6c)}
\end{aligned}
\]

Let us first discuss a fundamental choice that regards the problem formulation and number of optimization variables.

Simultaneous vs. Sequential Formulation
When an optimization algorithm addresses problem (6) iteratively, it works intermediately with nonphysical, infeasible trajectories that violate the system constraints (6b). Only at the optimal solution the constraint residual is brought to zero and a physical simulation is achieved. We speak
of a simultaneous approach to optimal control because the algorithm solves the simulation and the optimization problems simultaneously. Variants of this approach are direct discretization or direct multiple shooting.

On the other hand, the equality constraints (6b) could easily be eliminated by a nonlinear forward simulation for given initial value \(\bar{x}_0\) and control trajectory \(U\), similar to condensing in linear MPC. Such a forward simulation generates a state trajectory \(X_{\mathrm{sim}}(\bar{x}_0, U)\) such that for the given value of \(\bar{x}_0, U\) the equality constraints (6b) are automatically satisfied: \(G_{\mathrm{eq}}(X_{\mathrm{sim}}(\bar{x}_0, U), U, \bar{x}_0) = 0\). Inserting this map into the NLP (6) allows us to formulate an equivalent optimization problem with a reduced variable space:

\[
\begin{aligned}
\underset{U}{\text{minimize}}\quad & \Phi(X_{\mathrm{sim}}(\bar{x}_0, U), U) && \text{(7a)}\\
\text{subject to}\quad & G_{\mathrm{ineq}}(X_{\mathrm{sim}}(\bar{x}_0, U), U) \le 0. && \text{(7b)}
\end{aligned}
\]

When solving this reduced problem with an iterative optimization algorithm, we sequentially simulate and optimize the system, and we speak of the sequential approach to optimal control.

The sequential approach has a lower dimensional variable space and is thus easier to use with a black-box NLP solver. On the other hand, the simultaneous approach leads to a sparse NLP and is better able to deal with unstable nonlinear systems. In the remainder of this section, we thus only discuss the specific structure of the simultaneous approach.

Newton-Type Optimization
In order to simplify notation further, we summarize and reorder all optimization variables \(U\) and \(X\) in a vector \(V = [x_0^{\top}, u_0^{\top}, \ldots, x_{N-1}^{\top}, u_{N-1}^{\top}, x_N^{\top}]^{\top}\) and use the same problem function names as in (6) also with the new argument \(V\). As in the section on linear MPC, we can introduce multipliers \(Y\) for the equalities and \(\mu\) for the inequalities and define the Lagrangian

\[
L(V, Y, \mu) = \Phi(V) + Y^{\top} G_{\mathrm{eq}}(V, \bar{x}_0) + \mu^{\top} G_{\mathrm{ineq}}(V). \tag{8}
\]

All Newton-type optimization methods try to find a point satisfying the KKT conditions by using successive linearizations of the problem functions and Lagrangian. For this aim, starting with an initial guess \((V^{[0]}, Y^{[0]}, \mu^{[0]})\), they generate sequences of primal-dual iterates \((V^{[k]}, Y^{[k]}, \mu^{[k]})\).

An observation that is crucial for the efficiency of all NMPC algorithms is that the Hessian of the Lagrangian is at a current iterate given by a matrix of the form

\[
\nabla_V^2 L(\cdot) =
\begin{bmatrix}
Q_0^{[k]} & S_0^{[k]} & & & & \\
S_0^{[k],\top} & R_0^{[k]} & & & & \\
& & Q_1^{[k]} & S_1^{[k]} & & \\
& & S_1^{[k],\top} & R_1^{[k]} & & \\
& & & & \ddots & \\
& & & & & P^{[k]}
\end{bmatrix}.
\]

This block sparse matrix structure makes it possible to use in each Newton-type iteration the same sparsity-exploiting linear algebra techniques as outlined in section "Convex Optimization for Linear MPC" for the linear MPC problem. Major differences exist on how to treat the inequality constraints, and the two big families of Newton-type optimization methods are sequential quadratic programming (SQP) methods and nonlinear interior point (NIP) methods.

Sequential Quadratic Programming (SQP)
A first variant to iteratively solve the KKT system is to linearize all nonlinear functions at the current iterate \((V^{[k]}, Y^{[k]}, \mu^{[k]})\) and to find a new solution guess from the solution of a quadratic program (QP):

\[
\begin{aligned}
\underset{V}{\text{minimize}}\quad & \Phi_{\mathrm{quad}}(V;\, V^{[k]}, Y^{[k]}, \mu^{[k]}) && \text{(9a)}\\
\text{subject to}\quad & G_{\mathrm{eq,lin}}(V, \bar{x}_0;\, V^{[k]}) = 0, && \text{(9b)}\\
& G_{\mathrm{ineq,lin}}(V;\, V^{[k]}) \le 0. && \text{(9c)}
\end{aligned}
\]

Here, the subindex "lin" in the constraints \(G_{\cdot,\mathrm{lin}}\) expresses that a first-order Taylor expansion at \(V^{[k]}\) is used, while the QP objective is given by

\[
\Phi_{\mathrm{quad}}(V;\, V^{[k]}, Y^{[k]}, \mu^{[k]}) = \Phi_{\mathrm{lin}}(V;\, V^{[k]}) + \tfrac{1}{2}\,(V - V^{[k]})^{\top}\, \nabla_V^2 L(\cdot)\,(V - V^{[k]}).
\]
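A single iteration of this scheme can be sketched on a hypothetical one-step problem with only equality constraints (model, data, and dimensions are invented for illustration; practical implementations handle the inequality constraints (9c) through a QP solver).

```python
import numpy as np

# Toy NMPC-style problem with N = 1:
#   min 0.5*(x0**2 + u0**2 + x1**2)
#   s.t. x0 - xbar0 = 0,  x1 - f(x0, u0) = 0,  f(x, u) = x + u - 0.1*x**2
# Each iteration linearizes the constraints at the current iterate and
# solves the KKT system of the resulting convex equality-constrained QP
# (a Gauss-Newton flavor: the objective is already convex quadratic).
xbar0 = 1.0
f = lambda x, u: x + u - 0.1 * x ** 2

V = np.zeros(3)                       # V = (x0, u0, x1)
for _ in range(20):
    x0, u0, x1 = V
    grad = V.copy()                   # gradient of the quadratic objective
    G = np.array([x0 - xbar0, x1 - f(x0, u0)])      # constraint residuals
    Jc = np.array([[1.0, 0.0, 0.0],                 # constraint Jacobian
                   [-(1 - 0.2 * x0), -1.0, 1.0]])
    H = np.eye(3)                     # Hessian of the objective
    KKT = np.block([[H, Jc.T], [Jc, np.zeros((2, 2))]])
    sol = np.linalg.solve(KKT, -np.concatenate([grad, G]))
    V = V + sol[:3]                   # full SQP step

x0, u0, x1 = V
assert abs(x0 - xbar0) < 1e-8 and abs(x1 - f(x0, u0)) < 1e-8
print("u0, x1:", u0, x1)
```

At convergence the iterate is feasible for the nonlinear dynamics, illustrating how infeasible intermediate iterates of the simultaneous approach become a physical simulation only in the limit.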
Note that the QP has the same sparsity structure as the QP (2) resulting from linear MPC, with the only difference that all matrices are now time varying over the MPC horizon. In the case that the Hessian matrix is positive semi-definite, this QP is convex so that global solutions can be found reliably with any of the methods from section "Convex Optimization for Linear MPC." The solution of the QP along with the corresponding constraint multipliers gives the next SQP iterate \((V^{[k+1]}, Y^{[k+1]}, \mu^{[k+1]})\). Apart from the presented "exact Hessian" SQP variant, which has quadratic convergence speed, several other SQP variants exist, which make use of other Hessian approximations. A particularly useful Hessian approximation for NMPC is possible if the original objective function \(\Phi(V)\) is convex quadratic, and the resulting SQP variant is called the generalized Gauss-Newton method. In this case, one can just use the original objective as cost function in the QP (9a), resulting in convex QP subproblems and (often fast) linear convergence speed.

Nonlinear Interior Point (NIP) Method
In contrast to SQP methods, an alternative way to address the solution of the KKT system is to replace the last nonsmooth KKT conditions by a smooth nonlinear approximation, with \(\tau > 0\):

\[
\begin{aligned}
\nabla_V L(V^*, Y^*, \mu^*) &= 0 && \text{(10a)}\\
G_{\mathrm{eq}}(V^*, \bar{x}_0) &= 0 && \text{(10b)}\\
G_{\mathrm{ineq},i}(V^*)\, \mu_i^* + \tau &= 0, \quad i = 1, \ldots, m. && \text{(10c)}
\end{aligned}
\]

We summarize all variables in a vector \(W = [V^{\top}, Y^{\top}, \mu^{\top}]^{\top}\) and summarize the above set of equations as

\[
G_{\mathrm{NIP}}(W, \bar{x}_0, \tau) = 0. \tag{11}
\]

The resulting root finding problem is then solved with Newton's method, for a descending sequence of path parameters \(\tau^{[k]}\). The NIP method proceeds thus exactly as in an interior point method for convex problems, with the only difference that it has to re-linearize all problem functions in each iteration. An excellent software implementation of the NIP method is given in the form of the code IPOPT.

Continuation Methods and Tangential Predictors
In nonlinear MPC, a sequence of OCPs with different initial values \(\bar{x}_0^{[0]}, \bar{x}_0^{[1]}, \bar{x}_0^{[2]}, \ldots\) is solved. For the transition from one problem to the next, it is beneficial to take into account the fact that the optimal solution \(W^*(\bar{x}_0)\) depends almost everywhere differentiably on \(\bar{x}_0\). The concept of a continuation method is most easily explained in the context of an NIP method with fixed path parameter \(\bar{\tau} > 0\). In this case, the solution \(W^*(\bar{x}_0, \bar{\tau})\) of the smooth root finding problem \(G_{\mathrm{NIP}}(W^*(\bar{x}_0, \bar{\tau}), \bar{x}_0, \bar{\tau}) = 0\) from Eq. (11) is smooth with respect to \(\bar{x}_0\). This smoothness can be exploited by making use of a tangential predictor in the transition from one value of \(\bar{x}_0\) to another. Unfortunately, the interior point solution manifold is strongly nonlinear at points where the active set changes, and the tangential predictor is not a good approximation when we linearize at such points.

Generalized Tangential Predictor and Real-Time Iterations
In fact, the true NLP solution is not determined by a smooth root finding problem (10a)–(10c) but by the (nonsmooth) KKT conditions. The solution manifold has smooth parts when the active set does not change, but non-differentiable points occur whenever the active set changes. We can deal with this fact naturally in an SQP framework by solving one QP of form (9) in order to generate a tangential predictor that is also valid in the presence of active set changes. In the extreme case that only one such QP is solved per sampling time, we speak of a real-time iteration (RTI) algorithm. The computations in each iteration can be subdivided into two phases, the preparation phase, in which the derivatives are computed and the QP is condensed, and the feedback phase, which
only starts once \(\bar{x}_0^{[k+1]}\) becomes available and in which only a condensed QP of form (3) is solved, minimizing the feedback delay. This NMPC algorithm can be generated as plain C-code, e.g., by the tool ACADO. Another class of real-time NMPC algorithms based on a continuation method can be generated by the tool AutoGenU.

Cross-References
Explicit Model Predictive Control
Model-Predictive Control in Practice
Numerical Methods for Nonlinear Optimal Control Problems

Recommended Reading
Many of the algorithmic ideas presented in this article can be used in different combinations than those presented, and several other ideas had to be omitted for the sake of brevity. Some more details can be found in the following two overview articles on MPC optimization: Binder et al. (2001) and Diehl et al. (2009). The general field of numerical optimal control is treated in Bryson and Ho (1975) and Betts (2010), and the even broader field of numerical optimization is covered in the excellent textbooks (Fletcher 1987; Wright 1997; Nesterov 2004; Gill et al. 1999; Nocedal and Wright 2006; Biegler 2010). General purpose open-source software for MPC and NMPC is described in the following papers: FORCES (Domahidi et al. 2012), CVXGEN (Mattingley and Boyd 2009), qpOASES (Ferreau et al. 2008), FiOrdOs (Richter et al. 2011), AutoGenU (Ohtsuka and Kodama 2002), ACADO (Houska et al. 2011), and IPOPT (Wächter and Biegler 2006).

Bibliography
Betts JT (2010) Practical methods for optimal control and estimation using nonlinear programming, 2nd edn. SIAM, Philadelphia
Biegler LT (2010) Nonlinear programming. SIAM, Philadelphia
Binder T, Blank L, Bock HG, Bulirsch R, Dahmen W, Diehl M, Kronseder T, Marquardt W, Schlöder JP, Stryk OV (2001) Introduction to model based optimization of chemical processes on moving horizons. In: Grötschel M, Krumke SO, Rambau J (eds) Online optimization of large scale systems: state of the art. Springer, Berlin, pp 295–340
Bryson AE, Ho Y-C (1975) Applied optimal control. Wiley, New York
Diehl M, Ferreau HJ, Haverbeke N (2009) Efficient numerical methods for nonlinear MPC and moving horizon estimation. In: Nonlinear model predictive control. Lecture notes in control and information sciences, vol 384. Springer, Berlin, pp 391–417
Domahidi A, Zgraggen A, Zeilinger MN, Morari M, Jones CN (2012) Efficient interior point methods for multistage problems arising in receding horizon control. In: IEEE conference on decision and control (CDC), Maui, Dec 2012, pp 668–674
Ferreau HJ, Bock HG, Diehl M (2008) An online active set strategy to overcome the limitations of explicit MPC. Int J Robust Nonlinear Control 18(8):816–830
Fletcher R (1987) Practical methods of optimization, 2nd edn. Wiley, Chichester
Gill PE, Murray W, Wright MH (1999) Practical optimization. Academic, London
Houska B, Ferreau HJ, Diehl M (2011) An auto-generated real-time iteration algorithm for nonlinear MPC in the microsecond range. Automatica 47(10):2279–2285
Mattingley J, Boyd S (2009) Automatic code generation for real-time convex optimization. In: Convex optimization in signal processing and communications. Cambridge University Press, New York, pp 1–43
Nesterov Y (2004) Introductory lectures on convex optimization: a basic course. Applied optimization, vol 87. Kluwer, Boston
Nocedal J, Wright SJ (2006) Numerical optimization. Springer series in operations research and financial engineering, 2nd edn. Springer, New York
Ohtsuka T, Kodama A (2002) Automatic code generation system for nonlinear receding horizon control. Trans Soc Instrum Control Eng 38(7):617–623
Richter S, Morari M, Jones CN (2011) Towards computational complexity certification for constrained MPC based on Lagrange relaxation and the fast gradient method. In: 50th IEEE conference on decision and control and European control conference (CDC-ECC), Orlando, Dec 2011, pp 5223–5229
Wächter A, Biegler LT (2006) On the implementation of a primal-dual interior point filter line search algorithm for large-scale nonlinear programming. Math Program 106(1):25–57
Wright SJ (1997) Primal-dual interior-point methods. SIAM, Philadelphia
Optimization Based Robust Control

Didier Henrion
LAAS-CNRS, University of Toulouse, Toulouse, France
Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic

Abstract
This entry describes the basic setup of linear robust control and the difficulties typically encountered when designing optimization algorithms to cope with robust stability and performance specifications.

Keywords
Linear systems; Optimization; Robust control

Linear Robust Control

Robust control allows dealing with uncertainty affecting a dynamical system and its environment. In this section, we assume that we have a mathematical model of the dynamical system without uncertainty (the so-called nominal system) jointly with a mathematical model of the uncertainty. We restrict ourselves to linear systems: if the dynamical system we want to control has some nonlinear components (e.g., input saturation), they must be embedded in the uncertainty model. Similarly, we assume that the control system is relatively small scale (low number of states): higher-order dynamics (e.g., highly oscillatory but low energy components) are embedded in the uncertainty model. Finally, for conciseness, we focus exclusively on continuous-time systems, even though most of the techniques described in this section can be transposed readily to discrete-time systems.

Our control system is described by the first-order ordinary differential equation

\[
\begin{aligned}
\dot{x} &= A(\delta)\, x + B(\delta)\, u\\
y &= C(\delta)\, x
\end{aligned}
\]

where as usual \(x \in \mathbb{R}^n\) denotes the states, \(u \in \mathbb{R}^m\) denotes the controlled inputs, and \(y \in \mathbb{R}^p\) denotes the measured outputs, all depending on time \(t\), with \(\dot{x}\) denoting the time derivative of \(x\). The system is subject to uncertainty and this is reflected by the dependence of matrices \(A\), \(B\), and \(C\) on uncertain parameter \(\delta\) which is typically time varying and restricted to some bounded set

\[
\delta \in \Delta \subset \mathbb{R}^q.
\]

A linear control law

\[
u = K y
\]

modeled by a matrix \(K \in \mathbb{R}^{m \times p}\) must be designed to overcome the effect of the uncertainty while optimizing some performance criterion (e.g., pole placement, disturbance rejection, \(H_2\) or \(H_\infty\) norm). Sometimes, a relevant performance criterion is that the control should be stabilizing for the largest possible uncertainty (measured, e.g., by some norm on \(\Delta\)). In this section, for conciseness, we restrict our attention to static output feedback control laws, but most of the results can be extended to dynamical output feedback control laws, where the control signal \(u\) is the output of a controller (a linear system to be designed) whose input is \(y\).

Uncertainty Models

Amongst the simplest possible uncertainty models, we can find the following:
• Unstructured uncertainty, also called norm-bounded uncertainty, where

\[
\Delta = \{ \delta \in \mathbb{R}^q : \|\delta\| \le 1 \}
\]

and the given norm can be a standard vector norm or a more complicated matrix norm if \(\delta\) is
interpreted as a vector obtained by stacking the columns of a matrix
• Structured uncertainty, also called polytopic uncertainty, where

\[
\Delta = \mathrm{conv}\,\{ \delta_i,\ i = 1, \ldots, N \}
\]

is a polytope modeled as the convex combination of a finite number of given vertices \(\delta_i \in \mathbb{R}^q\), \(i = 1, \ldots, N\)
We can find more complicated uncertainty models (e.g., combinations of the two above: see Zhou et al. 1996), but to keep the developments elementary, they are not discussed here.

Nonconvex Nonsmooth Robust Optimization

The main difficulties faced when seeking a feedback matrix \(K\) are as follows:
• Nonconvexity: The stability conditions are typically nonconvex in \(K\).
• Nondifferentiability: The performance criterion to be optimized is typically a non-differentiable function of \(K\).
• Robustness: Stability and performance should be ensured for every possible instance of the uncertainty.
So if we are to formulate the robust control problem as an optimization problem, we should be ready to develop and use techniques from nonconvex, nondifferentiable, robust optimization.

Let us first elaborate on the first difficulty faced by optimization-based robust control, namely, the nonconvexity of the stability conditions. In continuous time, stability of a linear system \(\dot{x} = Ax\) is equivalent to negativity of the spectral abscissa, which is defined as the maximum real part of the eigenvalues of \(A\):

\[
\alpha(A) = \max\{ \operatorname{Re} \lambda : \det(\lambda I_n - A) = 0,\ \lambda \in \mathbb{C} \}.
\]

It turns out that the open cone of matrices \(A \in \mathbb{R}^{n \times n}\) such that \(\alpha(A) < 0\) is nonconvex (Ackermann 1993). This is illustrated in Fig. 1 where we represent the set of vectors \(K = (k_1, k_2, k_3) \in \mathbb{R}^3\) such that \(k_1^2 + k_2^2 + k_3^2 < 1\) and \(\alpha(A(K)) < 0\) for

\[
A(K) = \begin{bmatrix} -1 & k_1 \\ k_2 & k_3 \end{bmatrix}.
\]

There exist various approaches to handling nonconvexity. One possibility consists of building convex inner approximations of the stability region in the parameter space. The approximations can be polytopes, balls, ellipsoids, or more complicated convex objects described by linear matrix inequalities (LMI). The resulting stability conditions are convex, but surely conservative, in the sense that the conditions are only sufficient for stability and not necessary. Another approach to handling nonconvexity consists of formulating the stability conditions algebraically (e.g., via the Routh-Hurwitz stability criterion or its symmetric version by Hermite) and using converging hierarchies of LMI relaxations to solve the resulting nonconvex polynomial optimization problem: see, e.g., Henrion and Lasserre (2004) and Chesi (2010).

The second difficulty characteristic of optimization-based robust control is the potential nondifferentiability of the objective function. Consider for illustration one of the simplest optimization problems which consists of minimizing the spectral abscissa \(\alpha(A(K))\) of a matrix \(A(K)\) depending linearly on a matrix \(K\). Such a minimization makes sense since negativity of the spectral abscissa is equivalent to system stability. Then typically, \(\alpha(A(K))\) is a continuous but non-Lipschitz function of \(K\), which means that its gradient can be unbounded locally. In Fig. 2, we plot the spectral abscissa \(\alpha(A(K))\) for

\[
A(K) = \begin{bmatrix} 0 & 1 \\ -K & K \end{bmatrix}
\]

and \(K \in \mathbb{R}\). The function is non-Lipschitz at \(K = 0\), at which the global minimum \(\alpha(A(0)) = 0\) is achieved. Nonconvexity of the function is also apparent in this example. The lack of convexity and smoothness of the spectral abscissa and other similar performance criteria renders optimization of such functions particularly difficult (Burke et al. 2001, 2006b).
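This non-Lipschitz behavior can be checked numerically. The matrix below follows the reconstruction of the example above; treat its exact signs as an assumption recovered from the damaged source.

```python
import numpy as np

# For A(K) = [[0, 1], [-K, K]] (signs as reconstructed above), the
# spectral abscissa alpha(A(K)) is nonsmooth and non-Lipschitz at K = 0.
def alpha(K):
    A = np.array([[0.0, 1.0], [-K, K]])
    return np.max(np.linalg.eigvals(A).real)

assert abs(alpha(0.0)) < 1e-12          # global minimum alpha(A(0)) = 0
# non-Lipschitz growth on one side: alpha behaves like sqrt(|K|) as K -> 0-
for K in (-1e-2, -1e-4, -1e-6):
    assert abs(alpha(K) - np.sqrt(-K)) < 10 * abs(K)
# finite-difference slopes blow up near 0, i.e., the gradient is unbounded
slopes = [alpha(-h) / h for h in (1e-2, 1e-4, 1e-6)]
print(["%.1f" % s for s in slopes])
```

The growing finite-difference slopes are exactly the locally unbounded gradient mentioned in the text.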
Optimization Based Robust Control, Fig. 1 A nonconvex ball of stable matrices (3-D plot over the parameters \(K_1, K_2, K_3\))

Optimization Based Robust Control, Fig. 2 The spectral abscissa is typically nonconvex and nonsmooth (spectral abscissa plotted as a function of \(K\))

In Fig. 3, we represent graphs of the spectral abscissa (with flipped vertical axis for better visualization) of some small-size matrices depending on two real parameters, with randomly generated parametrization. We observe the typical nonconvexity and lack of smoothness around local and global optima.

The third difficulty for optimization-based robust control is the uncertainty. As explained above, optimization of a performance criterion with respect to controller parameters is already a potentially difficult problem for a nominal system (i.e., when the uncertainty parameter is equal to zero). This becomes even more difficult when this optimization must be carried out for all possible instances of the uncertainty \(\delta\) in \(\Delta\).
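To make this concrete: even just certifying a fixed gain \(K\) against a whole range of \(\delta\) requires care. The sketch below (all matrices are invented for illustration) checks the closed-loop spectral abscissa at the two extreme points of a scalar uncertainty interval and, as a sanity check, on a grid in between.

```python
import numpy as np

# Hypothetical uncertain closed loop: xdot = (A(delta) + B*K*C) x,
# with A(delta) = A0 + delta*A1 and delta in [-1, 1].  Checking stability
# only at the extreme points of the interval is a (possibly optimistic)
# relaxation of checking the whole uncertainty set.
def alpha(M):
    return np.max(np.linalg.eigvals(M).real)

A0 = np.array([[0.0, 1.0], [1.0, -0.5]])   # open loop unstable
A1 = np.array([[0.0, 0.0], [0.4, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

def worst_vertex(K):
    return max(alpha(A0 + s * A1 + B @ (K * C)) for s in (-1.0, 1.0))

assert worst_vertex(0.0) > 0                    # no feedback: unstable
K = -2.0
assert worst_vertex(K) < 0                      # both extreme points stable
grid = [alpha(A0 + d * A1 + B @ (K * C)) for d in np.linspace(-1, 1, 41)]
print("worst vertex:", worst_vertex(K), " worst on grid:", max(grid))
```

For this particular family the grid check agrees with the extreme-point check; in general, stability at the vertices alone does not certify the whole set, which is precisely the relaxation issue discussed below for polytopic uncertainty.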
Optimization Based Robust Control, Fig. 3 The graph of the negative spectral abscissa for some randomly generated matrix parametrizations

This is where the above assumption that the uncertainty set \(\Delta\) has a simple description proves useful. If the uncertainty \(\delta\) is unstructured and not time varying, then it can be handled with the complex stability radius (Ackermann 1993), the pseudospectral abscissa (Trefethen and Embree 2005), or via an \(H_\infty\) norm constraint (Zhou et al. 1996). If the uncertainty \(\delta\) is structured, then we can try to optimize a performance criterion at every vertex in the polytopic description (which is a relaxation of the problem of stabilizing the whole polytope). An example is the problem of simultaneous stabilization, where a controller \(K\) must be found such that the maximum spectral abscissa of several matrices \(A_i(K)\), \(i = 1, \ldots, N\) is negative (Blondel 1994). Finally, if the uncertainty \(\delta\) is time varying, then performance and stability guarantees can still be achieved with the help of Lyapunov certificates or potentially conservative convex LMI conditions: see, e.g., Boyd et al. (1994) and Scherer et al. (1997).

A unified approach to addressing conflicting performance criteria and uncertainty consists of searching for locally optimal solutions of a nonsmooth optimization problem that is built to incorporate minimization objectives and constraints for multiple plants. This is called (linear robust) multiobjective control, and formally, it can be expressed as the following optimization problem

\[
\begin{aligned}
\min_{K}\ \max_{i=1,\ldots,N}\quad & \{\, g_i(K) : \beta_i = \infty \,\}\\
\text{s.t.}\quad & g_i(K) \le \beta_i, \quad i = 1, \ldots, N,
\end{aligned}
\]

where each \(g_i(K)\) is a function of the closed-loop matrix \(A_i(K)\) (e.g., a spectral abscissa or an \(H_\infty\) norm) and the scalars \(\beta_i\) are given and such that if \(\beta_i = \infty\) for some \(i\), then \(g_i\) appears in the objective function and not in a constraint: see Gumussoy et al. (2009) for details. In the above problem, the objective function, a maximum of nonsmooth and nonconvex functions, is typically also nonsmooth and nonconvex. Moreover, without loss of generality, we can easily impose a sparsity pattern on controller matrix \(K\) to account for structural constraints (e.g., a low-order decentralized controller).
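A crude derivative-free search can stand in for this min-max program on a hypothetical two-plant instance (all data below are invented; dedicated nonsmooth solvers such as HIFOO handle the general problem properly).

```python
import numpy as np

# Hypothetical instance of the min-max program with g_i(K) = alpha(A_i(K))
# and beta_i = inf for both plants: minimize, over a scalar gain K, the
# worst spectral abscissa of two uncertain closed loops.  A simple grid
# search replaces the nonsmooth optimizers discussed in the text.
def alpha(M):
    return np.max(np.linalg.eigvals(M).real)

A = [np.array([[0.0, 1.0], [0.5, -0.2]]),
     np.array([[0.0, 1.0], [1.5, -0.6]])]
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

def g(K):                          # objective: max over the two plants
    return max(alpha(Ai + B @ (K * C)) for Ai in A)

Ks = np.linspace(-10.0, 0.0, 2001)
vals = [g(K) for K in Ks]
Kbest = Ks[int(np.argmin(vals))]
print("K =", Kbest, " worst abscissa =", min(vals))
assert g(Kbest) < 0                # one K simultaneously stabilizes both
assert g(0.0) > 0                  # the open loops are unstable
```

The objective \(\max_i g_i(K)\) is exactly the kind of nonsmooth, nonconvex function for which grid search only works in toy dimensions; this is what motivates the specialized algorithms below.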
Software Packages

Algorithms for nonconvex nonsmooth optimization have been developed and interfaced for linear robust multiobjective control in the public domain Matlab package HIFOO released in Burke et al. (2006a) and based on the theory described in Burke et al. (2006b). In 2011, The MathWorks released HINFSTRUCT, a commercial implementation of these techniques based on the theory described in Apkarian and Noll (2006).

Cross-References
H-Infinity Control
LMI Approach to Robust Control

Bibliography
Ackermann J (1993) Robust control – systems with uncertain physical parameters. Springer, Berlin
Apkarian P, Noll D (2006) Nonsmooth H-infinity synthesis. IEEE Trans Autom Control 51(1):71–86
Blondel VD (1994) Simultaneous stabilization of linear systems. Springer, Heidelberg
Boyd S, El Ghaoui L, Feron E, Balakrishnan V (1994) Linear matrix inequalities in system and control theory. SIAM, Philadelphia
Burke JV, Lewis AS, Overton ML (2001) Optimizing matrix stability. Proc AMS 129:1635–1642
Burke JV, Henrion D, Lewis AS, Overton ML (2006a) HIFOO – a Matlab package for fixed-order controller design and H-infinity optimization. In: Proceedings of the IFAC symposium robust control design, Toulouse
Burke JV, Henrion D, Lewis AS, Overton ML (2006b) Stabilization via nonsmooth, nonconvex optimization. IEEE Trans Autom Control 51(11):1760–1769
Chesi G (2010) LMI techniques for optimization over polynomials in control: a survey. IEEE Trans Autom Control 55(11):2500–2510
Gumussoy S, Henrion D, Millstone M, Overton ML (2009) Multiobjective robust control with HIFOO 2.0. In: Proceedings of the IFAC symposium on robust control design (ROCOND 2009), Haifa
Henrion D, Lasserre JB (2004) Solving nonconvex optimization problems – how GloptiPoly is applied to problems in robust and nonlinear control. IEEE Control Syst Mag 24(3):72–83
Scherer CW, Gahinet P, Chilali M (1997) Multi-objective output feedback control via LMI optimization. IEEE Trans Autom Control 42(7):896–911
Trefethen LN, Embree M (2005) Spectra and pseudospectra: the behavior of nonnormal matrices and operators. Princeton University Press, Princeton
Zhou K, Doyle JC, Glover K (1996) Robust and optimal control. Prentice Hall, Upper Saddle River

Optimization-Based Control Design Techniques and Tools

Pierre Apkarian(1) and Dominikus Noll(2)
(1) DCSD, ONERA – The French Aerospace Lab, Toulouse, France
(2) Institut de Mathématiques, Université de Toulouse, Toulouse, France

Abstract
Structured output feedback controller synthesis is an exciting new concept in modern control design, which bridges between theory and practice insofar as it allows for the first time to apply sophisticated mathematical design paradigms like \(H_\infty\) or \(H_2\) control within control architectures preferred by practitioners. The new approach to structured \(H_\infty\) control, developed during the past decade, is rooted in a change of paradigm in the synthesis algorithms. Structured design may no longer be based on solving algebraic Riccati equations or matrix inequalities. Instead, optimization-based design techniques are required. In this essay we indicate why structured controller synthesis is central in modern control engineering. We explain why non-smooth optimization techniques are needed to compute structured control laws, and we point to software tools which enable practitioners to use these new tools in high-technology applications.
Keywords

Controller tuning; H∞ synthesis; Multi-objective design; Non-smooth optimization; Structured controllers; Robust control

Introduction

In the modern high-technology field of control, engineers usually face a large variety of concurring design specifications such as noise or gain attenuation in prescribed frequency bands, damping, decoupling, constraints on settling or rise time, and much else. In addition, as plant models are generally only approximations of the true system dynamics, control laws have to be robust with respect to uncertainty in physical parameters or with regard to un-modeled high-frequency phenomena. Not surprisingly, such a plethora of constraints presents a major challenge for controller tuning, not only due to the ever-growing number of such constraints but also because of their very different provenience.

The dramatic increase in plant complexity is exacerbated by the desire that regulators should be as simple as possible, easy to understand and to tune by practitioners, convenient to implement in hardware, and generally available at low cost. Such practical constraints explain the limited use of black-box controllers, and they are the driving force for the implementation of structured control architectures, as well as for the tendency to replace hand-tuning methods by rigorous algorithmic optimization tools.

Structured Controllers

Before addressing specific optimization techniques, we introduce some basic terminology for control design problems with structured controllers. A state-space description of the given plant P used for design is given as

  P :  ẋ_P = A x_P + B1 w + B2 u
       z  = C1 x_P + D11 w + D12 u          (1)
       y  = C2 x_P + D21 w + D22 u

where A, B1, ... are real matrices of appropriate dimensions, x_P ∈ R^{n_P} is the state, u ∈ R^{n_u} the control, y ∈ R^{n_y} the measured output, w ∈ R^{n_w} the exogenous input, and z ∈ R^{n_z} the regulated output. Similarly, the sought output feedback controller K is described as

  K :  ẋ_K = A_K x_K + B_K y
       u  = C_K x_K + D_K y                 (2)

with x_K ∈ R^{n_K}, and K is called structured if the (real) matrices A_K, B_K, C_K, D_K depend smoothly on a design parameter x ∈ R^n, referred to as the vector of tunable parameters. Formally, we have differentiable mappings

  A_K = A_K(x),  B_K = B_K(x),  C_K = C_K(x),  D_K = D_K(x),

and we abbreviate these by the notation K(x) for short to emphasize that the controller is structured with x as tunable elements.

A structured controller synthesis problem is then an optimization problem of the form

  minimize    ‖T_wz(P, K(x))‖
  subject to  K(x) closed-loop stabilizing          (3)
              K(x) structured, x ∈ R^n

where T_wz(P, K) = F_l(P, K) is the lower feedback connection of (1) with (2) as in Fig. 1 (left), also called the linear fractional transformation (Varga and Looye 1999). The norm ‖·‖ stands for the H∞ norm, the H2 norm, or any other system norm, while the optimization variable x ∈ R^n regroups the tunable parameters in the design.

Standard examples of structured controllers K(x) include realizable PIDs and observer-based, reduced-order, or decentralized controllers, which in state space are expressed as

  PID:             [ 0      0    | 1          ]
                   [ 0    −1/τ   | kD/τ       ]
                   [ kI   −1/τ   | kP + kD/τ  ]

  Observer-based:  [ A − B2 Kc − Kf C2 | Kf ]
                   [ −Kc               | 0  ]

  Reduced-order:   [ A_K | B_K ]
                   [ C_K | D_K ]

  Decentralized:   [ diag_{i=1}^{q} A_Ki | diag_{i=1}^{q} B_Ki ]
                   [ diag_{i=1}^{q} C_Ki | diag_{i=1}^{q} D_Ki ]
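The realizable-PID parameterization above can be sanity-checked numerically. The following sketch (plain Python, independent of the MATLAB tooling discussed later in this entry) builds the realization from x = (τ, kP, kI, kD) and verifies that it reproduces kP + kI/s + kD·s/(τs + 1) at a sample frequency; the function names are illustrative only, not from any toolbox.

```python
def pid_realization(tau, kP, kI, kD):
    """(A_K, B_K, C_K, D_K) of the realizable PID
    K(s) = kP + kI/s + kD*s/(tau*s + 1), as in the text."""
    A = [[0.0, 0.0], [0.0, -1.0 / tau]]
    B = [1.0, kD / tau]
    C = [kI, -1.0 / tau]
    D = kP + kD / tau
    return A, B, C, D

def transfer(A, B, C, D, s):
    """C (sI - A)^{-1} B + D for a 2-state realization (2x2 inverse by hand)."""
    a, b = s - A[0][0], -A[0][1]
    c, d = -A[1][0], s - A[1][1]
    det = a * d - b * c
    x0 = (d * B[0] - b * B[1]) / det    # first entry of (sI - A)^{-1} B
    x1 = (-c * B[0] + a * B[1]) / det   # second entry
    return C[0] * x0 + C[1] * x1 + D

tau, kP, kI, kD = 0.1, 2.0, 1.0, 0.5
A, B, C, D = pid_realization(tau, kP, kI, kD)
s = 2.0j
ideal = kP + kI / s + kD * s / (tau * s + 1)
assert abs(transfer(A, B, C, D, s) - ideal) < 1e-12
```

The check works at any complex frequency s, since the realization matches the ideal PID transfer function exactly.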

Optimization-Based Control Design Techniques and Tools, Fig. 1 Black-box full-order controller K on the left, structured 2-DOF control architecture with K = block-diag(K1, K2) on the right
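For static blocks, the lower feedback connection F_l(P, K) shown in Fig. 1 (left) reduces to the familiar formula P11 + P12 K (I − P22 K)^{-1} P21. A minimal sketch with scalar blocks (illustrative only; in practice P and K are dynamic systems, and the partition blocks are matrices):

```python
def lft_lower(P11, P12, P21, P22, K):
    """Lower LFT F_l(P, K) = P11 + P12 K (1 - P22 K)^{-1} P21
    for scalar blocks; well-posedness requires P22 * K != 1."""
    return P11 + P12 * K * P21 / (1.0 - P22 * K)

# Closing the loop z = P11 w + P12 u, y = P21 w + P22 u with u = K y:
val = lft_lower(1.0, 2.0, 3.0, 4.0, 0.5)
assert abs(val - (-2.0)) < 1e-12
```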

Optimization-Based Control Design Techniques and Tools, Fig. 2 Synthesis of K = block-diag(K1, ..., KN) against multiple requirements or models P^(1), ..., P^(M). Each Ki(x) can be structured
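Evaluating a design requirement such as the H∞ norm of a closed-loop channel is a basic building block of the methods discussed in this entry; the classical approach is bisection on the imaginary-axis eigenvalues of a Hamiltonian matrix (the method of Boyd et al. 1989, listed in the Bibliography). A sketch for a one-state strictly proper system (pure Python; a general implementation would use an n-state eigensolver):

```python
import cmath

def eig2(H):
    """Eigenvalues of a 2x2 matrix from its characteristic polynomial."""
    tr = H[0][0] + H[1][1]
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    disc = cmath.sqrt(tr * tr / 4.0 - det)
    return tr / 2.0 + disc, tr / 2.0 - disc

def hinf_norm_1state(a, b, c, tol=1e-6):
    """||G||_inf for G(s) = c*b/(s - a), a < 0, by bisection on the
    Hamiltonian H(gamma) = [[a, b*b/gamma^2], [-c*c, -a]]:
    gamma is below the norm iff H(gamma) has imaginary-axis eigenvalues."""
    def crosses_axis(gamma):
        H = [[a, b * b / gamma**2], [-c * c, -a]]
        return any(abs(lam.real) < 1e-9 for lam in eig2(H))
    lo, hi = tol, 1.0
    while crosses_axis(hi):   # grow the upper bound past the norm
        hi *= 2.0
    while hi - lo > tol:      # then bisect down to the crossing point
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if crosses_axis(mid) else (lo, mid)
    return 0.5 * (lo + hi)

# G(s) = 1/(s + 1): peak gain 1 at omega = 0.
assert abs(hinf_norm_1state(-1.0, 1.0, 1.0) - 1.0) < 1e-4
```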

In the case of a PID, the tunable parameters are x = (τ, kP, kI, kD); for observer-based controllers, x regroups the estimator and state-feedback gains (Kf, Kc); for reduced-order controllers with n_K < n_P, the tunable parameters x are the n_K² + n_K n_y + n_K n_u + n_y n_u unknown entries in (A_K, B_K, C_K, D_K); and in the decentralized form, x regroups the unknown entries in A_K1, ..., D_Kq. In contrast, full-order controllers have the maximum number N = n_P² + n_P n_y + n_P n_u + n_y n_u of degrees of freedom and are referred to as unstructured or as black-box controllers.

More sophisticated controller structures K(x) arise from architectures like, for instance, a 2-DOF control arrangement with feedback block K2 and a set-point filter K1 as in Fig. 1 (right). Suppose K1 is the first-order filter K1(s) = a/(s + a) and K2 the PI feedback K2(s) = kP + kI/s. Then the transfer T_ry from r to y can be represented as the feedback connection of P and K(x) with

  P := [ A  0  0  B ]        K(x) := [ K1(s)    0    ]
       [ C  0  0  D ]                [   0    K2(s)  ]
       [ 0  I  0  0 ]
       [ C  0  I  D ] ,

where K(x) takes a typical block-diagonal structure featuring the tunable elements x = (a, kP, kI).

In much the same way, arbitrary multi-loop interconnections of fixed-model elements with tunable controller blocks Ki(x) can be rearranged as in Fig. 2 so that K(x) captures all tunable blocks in a decentralized structure general enough to cover most engineering applications.

The structure concept is equally useful to deal with the second central challenge in control design: system uncertainty. The latter may be handled with μ-synthesis techniques (Stein and Doyle 1991) if a parametric uncertain model is available. A less ambitious but often more practical alternative consists in optimizing the structured controller K(x) against a finite set of plants P^(1), ..., P^(M) representing model variations due to uncertainty, aging, sensor and actuator breakdown, and un-modeled dynamics, in tandem with the robustness and performance specifications. This is again formally covered by Fig. 2 and leads to a multi-objective constrained optimization problem of the form

  minimize    f(x) = max_{k∈SOFT, i∈I_k} ‖T^(k)_{w_i z_i}(K(x))‖

  subject to  g(x) = max_{k∈HARD, j∈J_k} ‖T^(k)_{w_j z_j}(K(x))‖ ≤ 1          (4)
              K(x) structured and stabilizing
              x ∈ R^n

where T^(k)_{w_i z_i} denotes the i-th closed-loop robustness or performance channel w_i → z_i for the k-th plant model P^(k)(s). The rationale of (4) is to minimize the worst-case cost of the soft constraints ‖T^(k)_{w_i z_i}‖, k ∈ SOFT, while enforcing the hard constraints ‖T^(k)_{w_j z_j}‖ ≤ 1, k ∈ HARD. Note that in mathematical programming terminology, soft and hard constraints are classically referred to as objectives and constraints. The terms soft and hard point to the fact that hard constraints prevail over soft ones and that meeting hard constraints is mandatory for solution candidates.

Optimization Techniques Over the Years

During the late 1990s, the necessity to develop design techniques for structured regulators K(x) was recognized (Fares et al. 2001), and the limitations of synthesis methods based on algebraic Riccati equations (AREs) or linear matrix inequalities (LMIs) became evident, as these techniques can only provide black-box controllers. The lack of appropriate synthesis techniques for structured K(x) led to the unfortunate situation where sophisticated approaches like the H∞ paradigm, developed by academia since the 1980s, could not be brought to work for the design of those controller structures K(x) preferred by practitioners. Design engineers had to continue to rely on heuristic and ad hoc tuning techniques, with only limited scope and reliability. As an example, post-processing to reduce a black-box controller to a practical size is prone to failure. It may at best be considered a fill-in for a rigorous design method which directly computes a reduced-order controller. Similarly, hand-tuning of the parameters x remains a puzzling task because of the loop interactions and fails as soon as complexity increases.

In the late 1990s and early 2000s, a change of methods was observed. Structured H2- and H∞-synthesis problems (3) were addressed by bilinear matrix inequality (BMI) optimization, which used local optimization techniques based on the augmented Lagrangian method (Fares et al. 2001; Noll et al. 2002; Kocvara and Stingl 2003), sequential semidefinite programming methods (Fares et al. 2002; Apkarian et al. 2003), and non-smooth methods for BMIs (Noll et al. 2009; Lemaréchal and Oustry 2000). However, these techniques were based on the bounded real lemma or similar matrix inequalities and were therefore of limited success due to the presence of Lyapunov variables, i.e., matrix-valued unknowns whose dimension grows quadratically in n_P + n_K and represents the bottleneck of that approach.

The epoch-making change came with the introduction of non-smooth optimization techniques (Noll and Apkarian 2005; Apkarian and Noll 2006b, c, 2007) for programs (3) and (4). Today non-smooth methods have superseded matrix-inequality-based techniques and may be considered the state of the art as far as realistic applications are concerned. The transition took almost a decade.

Alternative control-related local optimization techniques and heuristics include the gradient sampling technique of Burke et al. (2005), derivative-free optimization discussed in Kolda et al. (2003) and Apkarian and Noll (2006a), and particle swarm optimization; see Oi et al. (2008) and references therein, and also evolutionary computation techniques (Lieslehto 2001). The last three classes do not exploit derivative information and rely on function evaluations only. They are therefore applicable to a broad variety of problems, including those where function values arise from complex numerical simulations. The combinatorial nature of these techniques, however, limits their use to small problems with a few tens of variables. More significantly, these methods often lack a solid convergence theory. In contrast, as we have demonstrated over recent years (Apkarian and Noll 2006b; Noll et al. 2008),

specialized non-smooth techniques are highly efficient in practice, are based on a sophisticated convergence theory, are capable of solving medium-size problems in a matter of seconds, and are still operational for large-size problems with several hundreds of states.

Non-smooth Optimization Techniques

The benefit of the non-smooth casts (3) and (4) lies in the possibility to avoid searching for Lyapunov variables, a major advantage as their number (n_P + n_K)²/2 usually largely dominates n, the number of true decision parameters x. Lyapunov variables do still occur implicitly in the function evaluation procedures, but this has no harmful effect for systems with up to several hundred states. In abstract terms, a non-smooth optimization program has the form

  minimize    f(x)
  subject to  g(x) ≤ 0          (5)
              x ∈ R^n

where f, g : R^n → R are locally Lipschitz functions and are easily identified from the cast in (4).

In the realm of convex optimization, non-smooth programs are conveniently addressed by so-called bundle methods, introduced in the late 1970s by Lemaréchal (1975). Bundle methods are used to solve difficult problems in integer programming or in stochastic optimization via Lagrangian relaxation. Extensions of the bundling technique to non-convex problems like (3) or (4) were first developed in Apkarian and Noll (2006b, c, 2007), Apkarian et al. (2008), Noll et al. (2009), and, in more abstract form, Noll et al. (2008).

Optimization-Based Control Design Techniques and Tools, Fig. 3 Flowchart of proximity control bundle algorithm

Figure 3 shows a schematic view of a non-convex bundle method consisting of a descent-step generating inner loop (yellow block), comparable to a line search in smooth optimization, embedded into the outer loop

(blue box), where serious iterates are processed, stopping criteria are applied, and the model tradition is assured. Serious steps or iterates refer to steps accepted in a line search, while null steps are unsuccessful steps visited during the search. By model tradition, we mean continuity of the model between (serious) iterates x^j and x^{j+1}, obtained by recycling some of the older planes used at counter j into the new working model at j + 1. This avoids starting the first inner loop k = 1 at j + 1 from scratch and therefore saves time.

At the core of the interaction between inner and outer loop is the management of the proximity control parameter τ, which governs the stepsize ‖x − y^k‖ between trial steps y^k at the current serious iterate x. Similar to the management of a trust region radius or of the stepsize in a line search, proximity control allows to force shorter trial steps if agreement of the local model with the true objective function is poor and allows larger steps if agreement is satisfactory.

Oracle-based bundle methods traditionally assure global convergence in the sense of subsequences under the sole hypothesis that for every trial point x, the function value f(x) and a Clarke subgradient in ∂f(x) are provided. In automatic control applications, it is as a rule possible to provide more specific information, which may be exploited to speed up convergence.

Computing function value and gradients of the H2 norm f(x) = ‖T_wz(P, K(x))‖_2 requires essentially the solution of two Lyapunov equations of size n_P + n_K (see Apkarian et al. 2007; Rautert and Sachs 1997). For the H∞ norm, f(x) = ‖T_wz(P, K(x))‖_∞, function evaluation is based on the Hamiltonian algorithm of Benner et al. (2012) and Boyd et al. (1989). The Hamiltonian matrix is of size n_P + n_K, so that function evaluations may be costly for very large plant state dimension (n_P > 500), even though the number of outer loop iterations of the bundle algorithm is not affected by a large n_P and generally relates to n, the dimension of x. The additional cost for subgradient computation for large n_P is relatively cheap as it relies on linear algebra (Apkarian and Noll 2006b).

Computational Tools

The novel non-smooth optimization methods became available to the engineering community in 2010 via the MATLAB Robust Control Toolbox (Robust Control Toolbox 4.2 2012; Gahinet and Apkarian 2011). Routines HINFSTRUCT, LOOPTUNE, and SYSTUNE are versatile enough to define and combine tunable blocks Ki(x), to build and aggregate design requirements T^(k)_{wz} of different nature, and to provide suitable validation tools. Their implementation was carried out in cooperation with P. Gahinet (MathWorks). These routines further exploit the structure of problem (4) to enhance efficiency (see Apkarian and Noll 2006b, 2007).

It should be mentioned that design problems with multiple hard constraints are inherently complex. It is well known that even simultaneous stabilization of more than two plants P^(j) with a structured control law K(x) is NP-complete, so that exhaustive methods are expected to fail even for small to medium problems. The principled decision made in Apkarian and Noll (2006b) and reflected in the MATLAB routines is to rely on local optimization techniques instead. This leads to weaker convergence certificates but has the advantage of working successfully in practice. In the same vein, in (4) it is preferable to rely on a mixture of soft and hard requirements, for instance, by the use of exact penalty functions (Noll and Apkarian 2005). Key features implemented in the mentioned MATLAB routines are discussed in Apkarian (2013), Gahinet and Apkarian (2011), and Apkarian and Noll (2007).

Design Example

Design of a feedback regulator is an interactive process, in which tools like SYSTUNE, LOOPTUNE, or HINFSTRUCT support the designer in various ways. In this section we illustrate their enormous potential by solving a multi-model, fixed-structure reliable flight control design problem.
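Before turning to the flight control example, the abstract program (5) can be illustrated on a toy instance: f is a max of two smooth functions, and the constraint g ≤ 0 is folded in through an exact penalty (in the spirit of Noll and Apkarian 2005). The solver below is a plain subgradient method with diminishing steps, a far simpler relative of the bundle method of Fig. 3, used here only for illustration; it is not the SYSTUNE machinery.

```python
def F(x, mu=10.0):
    """Exact-penalty objective for a toy instance of program (5):
    f(x) = max(x^2, (x - 2)^2),  g(x) = 0.5 - x <= 0."""
    return max(x * x, (x - 2.0) ** 2) + mu * max(0.5 - x, 0.0)

def subgrad(x, mu=10.0):
    """A Clarke subgradient of F: gradient of the active branch of the max,
    plus the penalty gradient when the constraint is violated."""
    s = 2.0 * x if x * x >= (x - 2.0) ** 2 else 2.0 * (x - 2.0)
    if 0.5 - x > 0.0:
        s -= mu
    return s

x, best_x, best_F = 5.0, 5.0, F(5.0)
for k in range(1, 5001):
    x -= (1.0 / k) * subgrad(x)   # diminishing stepsizes 1/k
    if F(x) < best_F:
        best_x, best_F = x, F(x)

# Minimizer: x = 1 (feasible), value 1, at the kink of the max-function.
assert abs(best_x - 1.0) < 0.05 and best_F < 1.1
```

The iterates oscillate across the kink at x = 1, exactly the behavior that motivates the more refined cutting-plane models of bundle methods.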

[Figure 4: Simulink synthesis interconnection, comprising a band-limited white noise source driving a Dryden wind gust filter, set-point commands in μ, α, and β with integral action Ki and a rate limiter, an actuator outage gain, the F-16 aircraft model with state x = [u w q v p r], and the state-feedback gain Kx.]

Optimization-Based Control Design Techniques and Tools, Fig. 4 Synthesis interconnection for reliable control

Optimization-Based Control Design Techniques and Tools, Table 1 Outage scenarios where 0 stands for failure

Outage cases                               Diagonal of outage gain
Nominal mode                               1 1 1 1 1
Right elevator outage                      0 1 1 1 1
Left elevator outage                       1 0 1 1 1
Right aileron outage                       1 1 0 1 1
Left aileron outage                        1 1 1 0 1
Left elevator and right aileron outage     1 0 0 1 1
Right elevator and right aileron outage    0 1 0 1 1
Right elevator and left aileron outage     0 1 1 0 1
Left elevator and left aileron outage      1 0 1 0 1

In reliable flight control, one has to maintain stability and adequate performance not only in nominal operation but also in various scenarios where the aircraft undergoes outages in elevator and aileron actuators. In particular, wind gusts must be alleviated in all outage scenarios to maintain safety. Variants of this problem are addressed in Liao et al. (2002).

The open-loop F-16 aircraft in the scheme of Fig. 4 has six states, the body velocities u, v, w and the pitch, roll, and yaw rates q, p, r. The state is available for control, as are the flight-path bank angle rate μ (deg/s), the angle of attack α (deg), and the sideslip angle β (deg). Control inputs are the left and right elevator, left and right aileron, and rudder deflections (deg). The elevators are grouped symmetrically to generate the angle of attack. The ailerons are grouped antisymmetrically to generate roll motion. This leads to three control actions as shown in Fig. 4. The controller consists of two blocks, a 3 × 6 state-feedback gain matrix Kx in the inner loop and a 3 × 3 integral gain matrix Ki in the outer loop, which leads to a total of 27 = dim x parameters to tune.

In addition to nominal operation, we consider eight outage scenarios shown in Table 1. The different models associated with the outage scenarios are readily obtained by pre-multiplication of the aircraft control input by a diagonal matrix built from the rows in Table 1.

Optimization-Based Control Design Techniques and Tools, Fig. 5 Responses to step changes in μ, α, and β for the nominal design

The design requirements are as follows:
• Good tracking performance in μ, α, and β with adequate decoupling of the three axes.
• Adequate rejection of wind gusts of 5 m/s.
• Maintain stability and acceptable performance in the face of actuator outage.

Tracking is addressed by an LQG cost (Maciejowski 1989), which penalizes integrated tracking error e and control effort u via

  J = lim_{T→∞} E [ (1/T) ∫_0^T ( ‖W_e e‖² + ‖W_u u‖² ) dt ].          (6)

Diagonal weights W_e and W_u provide tuning knobs for the trade-off between responsiveness, control effort, and balancing of the three channels. We use W_e = diag(20, 30, 20), W_u = I_3 for normal operation and W_e = diag(8, 12, 8), W_u = I_3 for outage conditions. Model-dependent weights

Optimization-Based Control Design Techniques and Tools, Fig. 6 Responses to step changes in μ, α, and β for the fault-tolerant design

allow to express the fact that nominal operation prevails over failure cases. Weights for the failure cases are used to achieve limited deterioration of performance or of gust alleviation under deflection surface breakdown.

The second requirement, wind gust alleviation, is treated as a hard constraint limiting the variance of the error signal e in response to the white noise wg driving the Dryden wind gust model. In particular, the variance of e is limited to 0.01 for normal operation and to 0.03 for the outage scenarios.

With the notation of section "Non-smooth Optimization Techniques," the functions f(x) and g(x) in (5) are f(x) := max_{k=1,...,9} ‖T^(k)_{rz}(x)‖_2 and g(x) := max_{k=1,...,9} ‖T^(k)_{wg e}(x)‖_2, where r denotes the set-point inputs in μ, α, and β. The regulated output z is

  z^T := [ (W_e^{1/2} e)^T   (W_u^{1/2} u)^T ],

with x = (vec(Ki), vec(Kx)) ∈ R^27. Soft constraints are the square roots of J in (6) with appropriate weightings W_e and W_u; hard constraints are the RMS values of e, suitably weighted to reflect the variance bounds of 0.01 and 0.03. These requirements are covered by the Variance and WeightedVariance options in Robust Control Toolbox 4.2 (2012).

With this setup, we tuned the controller gains Ki and Kx for the nominal scenario only (nominal design) and for all nine scenarios (fault-tolerant design). The responses to set-point changes in μ, α, and β with a gust speed of 5 m/s are shown in Fig. 5 for the nominal design and in Fig. 6 for the fault-tolerant design. As expected, nominal responses are good but notably deteriorate when faced with outages. In contrast, the fault-tolerant controller maintains acceptable performance in outage situations. Optimal performance (square root of the LQG cost J in (6)) for the fault-tolerant design is only slightly worse than for the nominal design (26 vs. 23). The non-smooth program (5) was solved with SYSTUNE, and the fault-tolerant design (9 models, 11 states, 27 parameters) took 30 s on Mac OS X with a 2.66 GHz Intel Core i7 and 8 GB RAM. The reader is referred to Robust Control Toolbox 4.2 (2012), or higher versions, for further examples and additional details.

Future Directions

From an application viewpoint, non-smooth optimization techniques for control system design and tuning will become one of the standard techniques in the engineer's toolkit. They are currently studied in major European aerospace industries.

Future directions may include:
• Extension of these techniques to gain scheduling in order to handle larger operating domains.
• Application of the available tools to integrated system/control design when both system physical characteristics and controller elements are optimized to achieve higher performance. Application to fault detection and isolation may also reveal an interesting vein.

Cross-References

▶ H-Infinity Control
▶ Optimization Based Robust Control
▶ Robust Synthesis and Robustness Analysis Techniques and Tools

Bibliography

Apkarian P (2013) Tuning controllers against multiple design requirements. In: American control conference (ACC), Washington, DC, pp 3888–3893
Apkarian P, Noll D (2006a) Controller design via nonsmooth multi-directional search. SIAM J Control Optim 44(6):1923–1949
Apkarian P, Noll D (2006b) Nonsmooth H∞ synthesis. IEEE Trans Autom Control 51(1):71–86
Apkarian P, Noll D (2006c) Nonsmooth optimization for multidisk H∞ synthesis. Eur J Control 12(3):229–244
Apkarian P, Noll D (2007) Nonsmooth optimization for multiband frequency domain control design. Automatica 43(4):724–731
Apkarian P, Noll D, Thevenet JB, Tuan HD (2003) A spectral quadratic-SDP method with applications to fixed-order H2 and H∞ synthesis. Eur J Control 10(6):527–538
Apkarian P, Noll D, Rondepierre A (2007) Mixed H2/H∞ control via nonsmooth optimization. In: Proceedings of the 46th IEEE conference on decision and control, New Orleans, pp 4110–4115
Apkarian P, Noll D, Prot O (2008) A trust region spectral bundle method for nonconvex eigenvalue optimization. SIAM J Optim 19(1):281–306
Benner P, Sima V, Voigt M (2012) L∞-norm computation for continuous-time descriptor systems using structured matrix pencils. IEEE Trans Autom Control 57(1):233–238
Boyd S, Balakrishnan V, Kabamba P (1989) A bisection method for computing the H∞ norm of a transfer matrix and related problems. Math Control Signals Syst 2(3):207–219
Burke J, Lewis A, Overton M (2005) A robust gradient sampling algorithm for nonsmooth, nonconvex optimization. SIAM J Optim 15:751–779
Fares B, Apkarian P, Noll D (2001) An augmented Lagrangian method for a class of LMI-constrained problems in robust control theory. Int J Control 74(4):348–360
Option Games: The Interface Between Optimal Stopping and Game Theory 1011

Fares B, Noll D, Apkarian P (2002) Robust control via sequential semidefinite programming. SIAM J Control Optim 40(6):1791–1820
Gahinet P, Apkarian P (2011) Structured H∞ synthesis in MATLAB. In: Proceedings of the IFAC world congress, Milan, pp 1435–1440
Kocvara M, Stingl M (2003) A code for convex nonlinear and semidefinite programming. Optim Methods Softw 18(3):317–333
Kolda TG, Lewis RM, Torczon V (2003) Optimization by direct search: new perspectives on some classical and modern methods. SIAM Rev 45(3):385–482
Lemaréchal C (1975) An extension of Davidon methods to nondifferentiable problems. In: Balinski ML, Wolfe P (eds) Nondifferentiable optimization. Mathematical programming study, vol 31. North-Holland, Amsterdam, pp 95–109
Lemaréchal C, Oustry F (2000) Nonsmooth algorithms to solve semidefinite programs. In: El Ghaoui L, Niculescu S-I (eds) SIAM advances in linear matrix inequality methods in control series. SIAM
Liao F, Wang JL, Yang GH (2002) Reliable robust flight tracking control: an LMI approach. IEEE Trans Control Syst Technol 10:76–89
Lieslehto J (2001) PID controller tuning using evolutionary programming. In: American control conference, Arlington, Virginia, vol 4, pp 2828–2833
Maciejowski JM (1989) Multivariable feedback design. Addison-Wesley, Wokingham
Noll D, Apkarian P (2005) Spectral bundle methods for nonconvex maximum eigenvalue functions: first-order methods. Math Program B 104(2):701–727
Noll D, Torki M, Apkarian P (2002) Partially augmented Lagrangian method for matrix inequality constraints. Submitted Rapport Interne, MIP, UMR 5640, Maths. Dept. – Paul Sabatier University
Noll D, Prot O, Rondepierre A (2008) A proximity control algorithm to minimize nonsmooth and nonconvex functions. Pac J Optim 4(3):571–604
Noll D, Prot O, Apkarian P (2009) A proximity control algorithm to minimize nonsmooth and nonconvex semi-infinite maximum eigenvalue functions. J Convex Anal 16(3–4):641–666
Oi A, Nakazawa C, Matsui T, Fujiwara H, Matsumoto K, Nishida H, Ando J, Kawaura M (2008) Development of PSO-based PID tuning method. In: International conference on control, automation and systems, Seoul, Korea, pp 1917–1920
Rautert T, Sachs EW (1997) Computational design of optimal output feedback controllers. SIAM J Optim 7(3):837–852
Robust Control Toolbox 4.2 (2012) The MathWorks Inc., Natick
Stein G, Doyle J (1991) Beyond singular values and loopshapes. AIAA J Guid Control 14:5–16
Varga A, Looye G (1999) Symbolic and numerical software tools for LFT-based low order uncertainty modeling. In: Proceedings of the CACSD'99 symposium, Cohala, pp 1–6

Option Games: The Interface Between Optimal Stopping and Game Theory

Benoit Chevalier-Roignant¹ and Lenos Trigeorgis²
¹Oliver Wyman, Munich, Germany
²University of Cyprus, Nicosia, Cyprus

Abstract

Managers can stake a claim by committing to capital investments today that can influence their rivals' behavior or take a "wait-and-see" or step-by-step approach to avoid possible adverse market consequences tomorrow. At the core of this corporate dilemma lies the classic trade-off between commitment and flexibility. This trade-off calls for a careful balancing of the merits of flexibility against those of commitment. This balancing is captured by option games.

Keywords

Game theory; Option games; Optimal stopping; Real options

Introduction

The global competitive environment has become increasingly more challenging as modern economies undergo unprecedented changes in the midst of the global economic turmoil. Real-world dilemmas corporate managers face today are driven by the interplay among strategic and market uncertainty. The tech industry has evolved most rapidly, putting companies unable to respond to market developments and technological breakthroughs at severe disadvantage. Corporate management's plans and how they implement their strategy will likely determine whether the firm will survive and be successful in the marketplace or become extinct.

Formulating the right strategy in the right competitive environment at the right time is a nontrivial task. Whether to invest in a new technology, a new product, or enter a new market is a strategic decision of immense importance. Corporate management must assess strategic options with proper analytical tools that can help determine whether to commit to a particular strategic path, given scarce or costly resources, or whether to stay flexible. Oftentimes, firms need to position themselves flexibly to capitalize on future opportunities as they emerge, while limiting potential losses arising from adverse future circumstances. In many cases, corporate managers find themselves in need to revise their decision plans in view of actual market developments when facing an uncertain future; they can then decide to undertake only those projects with sufficiently high prospects in the future to justify commitment at that time. This needs to be balanced with the need to make irreversible strategic commitments to seize first-mover advantage, presenting rivals with a fait accompli to which they have no choice but to adapt.

Capital Budgeting Ignoring Strategic Interactions

Net Present Value
Prevailing management approaches simplify matters and often lead to investment decisions that are detrimental to the firm's long-term well-being. Suppose a firm's future cash flow at time t is given by a random variable X_t. Cash flows then evolve as a geometric Brownian motion

  dX_t = g X_t dt + σ X_t dB_t,   X_0 ≡ x,

with drift parameter g and volatility σ. The Brownian motion (B_t; t ≥ 0) captures exogenous market uncertainty. The standard criterion used in corporate finance is based on discounted cash flows (DCF) or net present value (NPV). This consists in assessing the current value of a project by discounting the expected future cash flows E[X_t] at a constant discount rate, r. Management supposedly creates shareholder value by undertaking projects with positive NPV, i.e., projects for which the present value of cash flows, v(x) = ∫_0^∞ e^{−rt} E[X_t] dt, exceeds the necessary investment cost, I. In the present case, the firm will invest under the zero-NPV criterion if

  x / (r − g) ≥ I          (1)

This traditional criterion views investment opportunities as now-or-never decisions under passive management. However, this precludes the possibility to adjust future decisions in case the market develops off the expected path. While market uncertainty is factored in through the discount rate, the flexibility management has is typically not properly accounted for.

Real Options Analysis
It has become standard practice in finance and strategy to interpret real investment opportunities as being analogous to financial options. This view is well accepted among academics and practitioners alike and is at the core of real options analysis (ROA). ROA is an extension of option-pricing theory to real investment situations (Myers 1977; Trigeorgis 1996). This approach effectively allows one to capture the dynamic nature of decision-making since it factors in management's flexibility to revise and adapt its decision in the face of market uncertainty. ROA allows managers with flexibility to adapt to actual market developments as uncertainty gets resolved. Managers may, for example, delay the start (or closure) of a project depending on its prospects. This approach leverages optimal stopping theory (e.g., see Bensoussan and Lions 1982; Dixit and Pindyck 1994) and is considered to be more reflective of real decision-making than traditional methods. In the case the firm can delay the decision to invest, for example, the problem is one of optimal stopping:

  V(x) = max_T E[ e^{−rT} ( v(X_T) − I ) ]

In ROA, the discount rate r is the risk-free interest rate (Dixit and Pindyck 1994; Trigeorgis 1996). The time of managerial action, T, is

now a strategic decision variable, random by nature, as the decision maker faces an uncertain environment. This problem has an analytical solution characterized by a threshold policy, say a trigger X̄, given by

X̄ / (r − g) = [b / (b − 1)] I,   (2)

where b is the positive root of a quadratic function (e.g., see Dixit and Pindyck 1994) and b/(b − 1) > 1. When decisions are costly or difficult to reverse, corporate managers would be more cautious and careful in making decisions. A firm should not always commit immediately – even if the NPV criterion (1) indicates so – but wait until the gross project value is sufficiently positive to cover the investment cost I by a factor larger than one, as expressed in (2). Investing prematurely may destroy shareholder value. Real options may sometimes justify undertaking projects with negative (static) net present value if they create a platform for growth options, or delaying projects with positive NPV.

Accounting for Strategic Interactions in Capital Budgeting

Strategic Uncertainty
As natural monopolies have lost their secular well-protected positions owing to market liberalization in the European Union and elsewhere across the globe, strategic interdependencies have become a new key challenge for managers. At the same time, sectors traditionally populated by multiple firms have undergone significant consolidation, often resulting in oligopolistic situations with a reduced number of players. The ongoing economic crisis has amplified these consolidation pressures. These two ongoing phenomena – liberalization and consolidation – have put high on the corporate agenda the assessment of strategic options under competition. Standard real options analysis often examines investment decisions as if the option holder has a proprietary right to exercise. This perspective may not be realistic in the new oligopolistic environment as several firms may share the right to a related investment opportunity in the industry.

Game Theory
In oligopolistic industries, firms often have difficulty predicting how rivals will behave and make decisions based on beliefs about their likely behavior. A theory that helps characterize beliefs and form predictions about which strategies opponents will follow is helpful in analyzing such oligopolistic situations. Game theory has traditionally been used to frame strategic interactions arising in conflict situations involving parties with different objectives or interests. It attempts to model behavior in strategic situations or games in which one party's success in making choices depends on the choices of other players through influencing one another's welfare. Game theory adopts a different perspective on optimization, as the focus is on the formation of beliefs about rivals' optimal strategies. Finance theory has been primarily concerned with "moves by nature," while game theory focuses on "optimization problems" involving multiple players. To solve a game, one needs to reduce a complex multiplayer problem into a simpler structure that captures the essence of the conflict situation. One can then derive useful predictions about how rivals are likely to react in a given situation. Game theory helped reshape microeconomics by providing analytical foundations for the study of market behavior and has been at the foundation of the Nobel prize winning research field of industrial organization.

Dynamic game theory (see, e.g., Basar and Olsder 1999) addresses problems in which several parties are in repeated interaction. Strategic management approaches based on dynamic economic theory can provide a richer foundation for understanding developments and competitive reactions within an industry. As firm competitiveness involves interactions among several players (rivals, suppliers, or clients), game-theoretic analysis brings important insights into strategic management in addressing such issues as first- and second-mover advantages, firm entry and exit decisions, strategic commitment, reputation, signaling, and other informational
effects. A key lesson is that, when firms react to one another, it may sometimes be appropriate for one firm to take an aggressive stance in the expectation that rivals will back off. Dynamic industrial organization includes the analysis of "games of timing" such as preemption games or wars of attrition, whereby firms decide on appropriate investment timing under rivalry.

Option Games

The earlier optimal stopping problem falls in the category of "games of timing" when a firm's entry decision influences another firm's market strategy. Option games are most suitable to help model situations where a firm that has a real option to (dis)invest faces rivalry. Here, the problem consists in finding a Nash equilibrium solution for the two-player equivalent of the above optimal stopping problem. This solution must also satisfy certain dynamic consistency criteria. For sequential investments, the follower is faced with a single-agent optimal investment timing problem; it will thus enter if the gross project value exceeds the investment cost by a sufficient factor. A firm entering the market early on, i.e., a leader, earns temporary monopoly rents as long as demand remains below the follower's entry threshold. Following the follower's entry, the firms act as a duopoly. As long as the leader's value exceeds the follower's, there is an incentive for one firm to invest, but not necessarily for both of them, leading to a "coordination problem." The competitive pressure will dissipate away the leader's first-mover advantage, leading to a market entry point that is not socially optimal and to rent dissipation. Unfortunately, the multiplayer problem does not admit a simple analytical solution, since at each point a duopolist firm might end up in any of four distinct situations (a two-by-two matrix) depending on the rival's entry decision. Option games indicate in each situation which driving force (commitment vs. flexibility) prevails and whether to go ahead with the investment or wait and see. Main drivers of the prevailing market equilibrium include the riskiness of the venture, the magnitude of the first-mover advantage, and the exclusive or shared ability to reap the benefits of the investment vis-à-vis rivals. When firms can grasp a large first-mover advantage from investing early but cannot differentiate themselves sufficiently from each other, they may be tempted to wage a preemptive war, investing prematurely at an early market stage in a way that actually kills option value. If firms are more on an equal footing but do not see much benefit from investing early, they may prefer to wait and invest (jointly) at a later stage when the future market is sufficiently mature. If, however, one firm has a comparative cost advantage that dominates its rivals (e.g., a radical or drastic technological superiority), industry participants may prefer a consensual leader-follower investment arrangement involving less option value destruction.

Conclusions

Corporate management's strategic tool kit should provide clearer guidance on whether to pursue a wait-and-see stance in the face of uncertain market developments or jump on the first-mover bandwagon to build competitive advantage. We discussed two different modeling approaches that provide complementary perspectives and insights to help management deal with issues of flexibility versus commitment: real options and dynamic game theory. While each approach separately might turn a blind eye to flexibility or commitment, an integrative perspective through "option games" might provide the right balance and serve as a tool kit for adaptive competitive strategy. Both perspectives ultimately aim to derive better insights into industry dynamics under conditions characterized by both market and strategic uncertainty.

Option games pave the way for a consistent approach to managerial decision-making, elevating the art of strategy to scientific analysis. Option games integrate in a common, consistent framework recent advances made in
these diverse sets of disciplines. This emerging field represents a promising strategic management tool that can help guide managerial decisions through the complexity of the modern competitive marketplace.

Cross-References

▶ Auctions
▶ Learning in Games

Recommended Reading

Smit and Trigeorgis (2004) discuss related trade-offs with discrete-time real option techniques. Grenadier (2000) and Huisman (2001) examine a number of continuous-time models. Chevalier-Roignant and Trigeorgis (2011) synthesize both types of "option games." An overview of the literature is provided in Chevalier-Roignant et al. (2011).

Bibliography

Basar T, Olsder GJ (1999) Dynamic noncooperative game theory, 2nd edn. SIAM, Philadelphia
Bensoussan A, Lions J-L (1982) Application of variational inequalities in stochastic control. North Holland, Amsterdam
Chevalier-Roignant B, Trigeorgis L (2011) Competitive strategy: options and games. MIT, Cambridge, MA
Chevalier-Roignant B, Flath CM, Huchzermeier A, Trigeorgis L (2011) Strategic investment under uncertainty: a synthesis. Eur J Oper Res 215(3):639–650
Dixit AK, Pindyck RS (1994) Investment under uncertainty. Princeton University Press, Princeton
Grenadier S (2000) Game choices: the intersection of real options and game theory. Risk Books, London
Huisman KJM (2001) Technology investment: a game theoretic real options approach. Springer, Boston
Myers SC (1977) Determinants of corporate borrowing. J Financ Econ 5(2):147–175
Smit HTJ, Trigeorgis L (2004) Strategic investment: real options and games. Princeton University Press, Princeton
Trigeorgis L (1996) Real options. MIT, Cambridge, MA

Oscillator Synchronization

Bruce A. Francis
Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, Canada

Abstract

The nonlinear Kuramoto equations for n coupled oscillators are derived and studied. The oscillators are defined to be synchronized when they oscillate at the same frequency and their phases are all equal. A control-theoretic viewpoint reveals that synchronized states of Kuramoto oscillators are locally asymptotically stable if every oscillator is coupled to all others. The problem of synchronization in Kuramoto oscillators is closely related to rendezvous, consensus, and flocking problems in distributed control. These problems, with their elegant solution by graph theory, are discussed briefly.

Keywords

Graph theory; Kuramoto model; Laplacian; Oscillator; Synchronization

Introduction

An oscillator is an electronic circuit or other kind of dynamical system that produces a periodic signal. If several oscillators are coupled together in some fashion and the periodic signals that they each produce are of the same frequency and are in phase, the oscillators are said to be synchronized. The book Sync: The Emerging Science of Spontaneous Order, by Strogatz, introduces a wide variety of phenomena where oscillators synchronize. Some examples from biology: networks of pacemaker cells in the heart, circadian pacemaker cells in the suprachiasmatic nucleus of the brain,
metabolic synchrony in yeast cell suspensions, groups of synchronously flashing fireflies, and crickets that chirp in unison. Engineering examples include clock synchronization in distributed communication networks and electric power networks with synchronous generators.

A very simple example of oscillator synchronization was discovered by Christiaan Huygens, the prominent Dutch scientist and mathematician who lived in the 1600s. One of his contributions was the invention of the pendulum clock, where a pendulum swings back and forth with a constant frequency. Huygens observed that two pendulum clocks in his house synchronized after some time. The explanation for this phenomenon is that the pendula were coupled mechanically through the wooden frame of the house. The same principle can be observed by a fun, simple experiment. As in Fig. 1, put two pop cans on a table, on their sides and parallel to each other. Place a board on top of them, and place two (or more) metronomes on the board. Set the metronomes to tick at the same frequency. Start them off ticking but not in unison. Within a few minutes they will be ticking in unison.

Oscillator Synchronization, Fig. 1 Two metronomes on a board that is on two pop cans. After the metronomes are let go at the same frequency but at different times, they soon become synchronized and tick in unison

In this essay we derive what are known as the Kuramoto equations, a mathematical model of n oscillators, and then we study when they will synchronize.

The Kuramoto Model

In 1975 the Japanese researcher Yoshiki Kuramoto gave one of the first serious mathematical studies of coupled oscillators. To derive Kuramoto's equations, we begin with a simple hypothetical setup. Imagine n runners going around a circular track. Suppose they're all going at roughly the same speed, but each adjusts his/her speed based on the speeds of his/her nearest neighbors. If some runner passes another, that one tends to speed up to close the gap. The synchronization question is: do the runners eventually end up running together in a tight pack?

Idealize the runners to be merely points, numbered k = 1, …, n. They move on the unit circle in the complex plane. A point on the unit circle can be written as e^(jθ), where j denotes the unit imaginary number and θ denotes the angle measured counterclockwise from the positive real axis. The position of point k at time t is z_k(t) = e^(j(ωt + θ_k(t))), where ω is the nominal rotational speed in rad/s, and θ_k(t) is the difference between the actual angle at time t and the nominal angle ωt. Notice that ω is a constant positive real number and it is the same for all n points. As in circuit theory, it simplifies the mathematics to refer all the positions to the sinusoid e^(jωt), and therefore we define the local position of point k to be p_k(t) = z_k(t)/e^(jωt), i.e., p_k(t) = e^(jθ_k(t)). Differentiate the local position with respect to time and let "dot" denote d/dt: ṗ_k = e^(jθ_k) j θ̇_k. Define the local rotational velocity v_k = θ̇_k and substitute into the preceding equation:

ṗ_k = v_k j p_k.   (1)

The local velocity v_k could be positive or negative. Notice that if we view p_k as a vector from the origin and view multiplication by j as rotation by π/2, then j p_k can be viewed as tangent to the circle at the point p_k – see the picture on the left in Fig. 2.

Now we propose a feedback law for v_k in Eq. (1); see the picture on the right in Fig. 2. Take v_k proportional to the projection of p_i onto the tangent at p_k, that is, v_k = ⟨p_i, j p_k⟩. Here the inner product between two complex numbers v, w is ⟨v, w⟩ = Re(v w̄). (You may check that this is equivalent to the usual dot product of vectors in R².) Thus from (1) the model to get k to close the gap is ṗ_k = ⟨p_i, j p_k⟩ j p_k.
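This closing-the-gap law is easy to check numerically. The sketch below is an illustration only (plain Python; the two initial phases, the forward-Euler step size, and the renormalization back onto the unit circle are choices made here, not taken from the text): two points each take the other as the watched neighbor, and their phase gap shrinks to zero.

```python
import cmath

# Model from the text: p_k' = <p_i, j p_k> j p_k, with <v, w> = Re(v * conj(w)).
def inner(v, w):
    return (v * w.conjugate()).real

p = [cmath.exp(0.3j), cmath.exp(2.0j)]  # two local positions on the unit circle
dt = 0.01                               # Euler step (illustrative choice)
for _ in range(3000):
    v = [inner(p[1], 1j * p[0]), inner(p[0], 1j * p[1])]  # projected velocities
    p = [pk + dt * vk * 1j * pk for pk, vk in zip(p, v)]  # Euler step of the model
    p = [pk / abs(pk) for pk in p]      # renormalize: Euler drifts off the circle

gap = abs(cmath.phase(p[0] / p[1]))     # residual phase difference
print(round(gap, 4))                    # → 0.0: the two points have closed ranks
```

Rewriting the same law in terms of the phases θ_k, as the derivation below does, removes the need for the renormalization step.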
Oscillator Synchronization, Fig. 2 Left: The vectors p_k and j p_k. Right: The local velocity v_k

More generally, suppose that point k pays attention to not just point i but a fixed set of points called its neighbors. Let N_k denote the index set of neighbors of point k and for simplicity assume N_k does not depend on time. We consider the control law v_k = Σ_{i ∈ N_k} ⟨p_i, j p_k⟩ and thereby arrive at the model of the evolution of the positions p_k:

ṗ_k = Σ_{i ∈ N_k} ⟨p_i, j p_k⟩ j p_k.

However, the Kuramoto model gives the evolution of the angles θ_k rather than the points p_k. To find the equation for θ_k, we observe that

⟨p_i, j p_k⟩ = Re(p̄_i j p_k) = Re(e^(−jθ_i) j e^(jθ_k)) = sin(θ_i − θ_k).

In this way, the controlled points move according to

ṗ_k = Σ_{i ∈ N_k} sin(θ_i − θ_k) j p_k.

Substitute in p_k = e^(jθ_k) and then cancel j p_k:

θ̇_k = Σ_{i ∈ N_k} sin(θ_i − θ_k),  k = 1, …, n.   (2)

This is the Kuramoto model of coupled oscillators in terms of the phases of the oscillators. There are n coupled nonlinear ordinary differential equations.

Equation (2) has the vector form θ̇ = g(θ). There are some variations in the literature about the state space associated with this equation. It is important to get the state space right because otherwise the concepts of stability and synchronization become shaky. The phase angles θ_k are real numbers with units of radians, so at first glance the state space is Rⁿ. But the angles are defined modulo 2π and so their values are restricted to lie in the interval [0, 2π). In this way the state space becomes [0, 2π)ⁿ. For example, if n = 2 the state space is the square [0, 2π) × [0, 2π) viewed as a subset of the plane R². The mapping θ ↦ e^(jθ) is a one-to-one correspondence from the interval [0, 2π) to the unit circle in the complex plane C. This unit circle is usually denoted S¹, the superscript signifying the circle's dimension as a manifold. By this correspondence the state space of (2) is the n-fold product S¹ × ⋯ × S¹, and this is sometimes called the n-torus, denoted Tⁿ.

To recap, in what follows, the state space is [0, 2π)ⁿ. This is an n-dimensional manifold rather than a vector space.

Synchronization

Control-theoretic methods, for example, that of Sepulchre et al. (2007), have been insightful. We address now the question of whether or not the oscillators in (2) synchronize, that is, whether the phases asymptotically converge to a common value. In the state space [0, 2π)ⁿ, the set of synchronized states is the set of vectors θ of the form c1, where c ∈ [0, 2π) and 1 is the vector of 1's. The simplest case is when every point is a neighbor of every other point, i.e., N_k contains every integer in the set 1, …, n except k. Then (2) becomes

θ̇_k = Σ_{i=1}^{n} sin(θ_i − θ_k),  k = 1, …, n.   (3)

Let us show that if the initial phases θ_k(0) are all close enough together, then θ(t) converges asymptotically to a synchronized state. This will show that the synchronized states are locally asymptotically stable in a certain sense.

As stated before, Eq. (3) has the form θ̇ = g(θ). The function g(θ) is the gradient of a
positive definite function. Indeed, let r e^(jψ) denote the average of the points e^(jθ_1), …, e^(jθ_n). Of course, r and ψ are functions of θ, and so we have

r(θ) e^(jψ(θ)) = (1/n) (e^(jθ_1) + ⋯ + e^(jθ_n))

and therefore

r(θ) = (1/n) |e^(jθ_1) + ⋯ + e^(jθ_n)|.

The average of n points on the unit circle lives inside the unit disc, and therefore r(θ) is a real number between 0 and 1. It equals 1 if and only if the n points are equal, that is, the n phases are equal, and this is the state where the phases are synchronized.

Define the function

V(θ) = (n²/2) r(θ)²
     = (1/2) |e^(jθ_1) + ⋯ + e^(jθ_n)|²
     = (1/2) (e^(−jθ_1) + ⋯ + e^(−jθ_n)) (e^(jθ_1) + ⋯ + e^(jθ_n)).

Thus

∂V(θ)/∂θ_k = sin(θ_1 − θ_k) + ⋯ + sin(θ_n − θ_k)

and therefore (3) can be written as θ̇ = ∂V(θ)/∂θ. This is a gradient equation. If θ(0) is chosen so that all the phases are close enough together, then r(θ(0)) will be close to 1, and therefore θ will move in a direction to increase V(θ), that is, increase r(θ), until in the limit r(θ) = 1 and the phases are synchronized.

There are results, e.g., Sepulchre et al. (2008), for when the coupling is not all-to-all. Also, the term "synchronization" is used more generally than just for oscillators (Wieland et al. 2011).

Rendezvous, Consensus, Flocking, and Infinitely Many Oscillators

Synchronization of coupled oscillators is closely related to other problems known as rendezvous, consensus, or flocking problems. Phase synchronization is replaced by the requirement of mobile robots gathering at some location, by the requirement of temperature sensors in a sensor network converging to the same temperature estimate, or by the requirement that mobile robots should head in the same direction. The simplest form of these problems has the equations

θ̇_k = Σ_{i ∈ N_k} (θ_i − θ_k),  k = 1, …, n.   (4)

Notice that this can be obtained from the Kuramoto model (2) merely by replacing sin(θ_i − θ_k) by θ_i − θ_k in (2), that is, by linearizing the latter at a synchronized state. We shall continue to call θ_k a phase of an oscillator. When do the phases evolving according to (4) synchronize? The answer to the question involves a lovely collaboration between graph theory and dynamics.

Introduce a directed graph that is in one-to-one correspondence with the neighbor structure. The graph is made up of n nodes, one for each oscillator. From each node there is an arrow to every neighbor of that node; that is, from node k there is an arrow to every node in N_k. Denote the adjacency matrix and the degree matrix of the graph by, respectively, A and D. That is, a_ij = 1 if j is a neighbor of i, and d_ii equals the sum of the elements on row i of A. The Laplacian of the graph is defined to be L = D − A. Then (4) is equivalent to simply

θ̇ = −Lθ,   (5)

where θ is still the vector with elements θ_1, …, θ_n. Whether or not synchronization occurs depends on the connectivity of the graph. We stop here and refer the reader to the articles ▶ Averaging Algorithms and Consensus and ▶ Flocking in Networked Systems.

Suppose there are an infinite but countable number of oscillators in the model (5). When will they synchronize? To answer this, we have to be more specific.

Let us allow an infinite number of oscillators numbered by the integers, positive, zero, and negative. Denote the phases by θ_k and let θ
denote the phase vector, whose kth component is θ_k. Assume each oscillator has only finitely many neighbors, let N_k denote the set of neighbors of oscillator k, and let L be the Laplacian of the associated graph. Finally, let θ(t) evolve according to Eq. (5). This equation isn't automatically well posed in the sense that there may not be a solution defined for all t > 0. We have to impose a framework so that solutions do indeed exist. One natural space in which to place θ(0) is ℓ², the Hilbert space of square-summable sequences. If L is a bounded operator on ℓ², then so is e^(−Lt) for every t > 0, and hence the phase vector exists and belongs to ℓ² for every t > 0. Another natural space in which to place θ(0) is ℓ^∞, the Banach space of bounded sequences. Again, a phase vector exists for all t > 0 if L is a bounded operator on ℓ^∞.

The following example is from Feintuch and Francis (2012). Take the neighbor sets to be N_k = {k − 1}. The graph is a chain: there is an arrow from node k to node k − 1, for every k, and the Laplacian is the infinite matrix with 1 on the diagonal, −1 on the first subdiagonal, and zero elsewhere. This Laplacian is a bounded operator on both ℓ² and ℓ^∞. Now the vector c1, where 1 is the vector of all 1's, belongs to ℓ^∞ for every real number c, but it belongs to ℓ² only for c = 0. So the phases can potentially synchronize at any value in ℓ^∞, but only at 0 in ℓ². For the example under discussion, if the initial phase vector is in ℓ², then the phases synchronize at 0. By contrast, there exist initial phase vectors in ℓ^∞ such that synchronization does not occur. Even worse, lim_{t→∞} θ(t) does not exist. The conclusion is that whether or not the oscillators will synchronize is a difficult question in general.

Summary and Future Directions

The Kuramoto model is a widely used paradigm for coupled oscillators. The model has the form θ̇ = f(Eθ), where θ is the vector of phases, the matrix E maps θ into the vector of possible differences θ_i − θ_k, and f is a function. The Kuramoto model considered in this essay is not the most general. A more general model allows different frequencies ω_k instead of just one, and also a coupling gain K, leading to the model

θ̇_k = ω_k + (K/n) Σ_{i ∈ N_k} sin(θ_i − θ_k),  k = 1, …, n.   (6)

An important problem associated with the Kuramoto model is to determine which synchronized states are stable. The linearized equation is interesting in its own right and relates to problems of rendezvous, consensus, and flocking.

Reference Dörfler and Bullo (2014) offers some questions for future study. In particular, it would be interesting to extend the Kuramoto model beyond the first-order oscillators of (2). Also, the case of general neighbor sets has much room for exploration.

Asymptotic stability is a robust property. For example, if the origin is asymptotically stable for the system ẋ = Ax, it remains so if A is perturbed by a sufficiently small amount. This is because the spectrum of a matrix is a continuous function of the matrix. The sketch in Fig. 1 vividly depicts the concept of synchronized oscillators. A topic for future study is that of robustness. Mathematically, if the two metronomes are identical, they will synchronize perfectly – this can be proved. Of course, physically two metronomes cannot be identical, and yet they will synchronize if they are close enough physically. A mathematical study of this phenomenon might be interesting.

Cross-References

▶ Averaging Algorithms and Consensus
▶ Flocking in Networked Systems
▶ Graphs for Modeling Networked Interactions
▶ Networked Systems
▶ Vehicular Chains

Recommended Reading

The literature on the Kuramoto model is huge – there are now many hundreds of journal
papers continuing the study of oscillators using Kuramoto's model. There is space here only to highlight a few sources.

You can find a mathematical study of coupled metronomes in Pantaleone (2002). Also, Pantaleone's webpage (Pantaleone) describes some experimental observations. Kuramoto's original paper is Kuramoto (1975). Dörfler and Bullo have recently written a comprehensive survey (Dörfler and Bullo 2014). Strogatz has written extensively on oscillator synchronization. His book Sync is fascinating and is highly recommended (Strogatz 2004). See also Strogatz (2000) and Strogatz and Stewart (1993). The papers Scardovi et al. (2007) and Dörfler and Bullo (2011) are recommended for more recent results, the latter treating the general model (6).

Getting phases in oscillators to synchronize is a special case of getting the states or outputs of coupled systems asymptotically to converge to a common value. There is a very large number of references on these subjects, a seminal one being Jadbabaie et al. (2003); others are Lin et al. (2007) and Moreau (2005). Regarding infinitely many oscillators, the physics literature treats only a continuum of oscillators, whereas countably many oscillators are the subject of Feintuch and Francis (2012).

Acknowledgments I greatly appreciate the help from Luca Scardovi, Florian Dörfler, and Francesco Bullo.

Bibliography

Dörfler F, Bullo F (2011) On the critical coupling for Kuramoto oscillators. SIAM J Appl Dyn Syst 10(3):1070–1099
Dörfler F, Bullo F (2014) Synchronization in complex networks of phase oscillators: a survey. Automatica 50(6). To appear
Feintuch A, Francis B (2012) Infinite chains of kinematic points. Automatica 48:901–908
Jadbabaie A, Lin J, Morse AS (2003) Coordination of groups of mobile autonomous agents using nearest neighbor rules. IEEE Trans Automatic Control 48(6):988–1001
Kuramoto Y (1975) Self-entrainment of a population of coupled nonlinear oscillators. In: Araki H (ed) International symposium on mathematical problems in theoretical physics, Kyoto. Lecture notes in physics, vol 39. Springer, p 420
Lin Z, Francis BA, Maggiore M (2007) State agreement for coupled nonlinear systems with time-varying interaction. SIAM J Control Optim 46:288–307
Moreau L (2005) Stability of multi-agent systems with time-dependent communication links. IEEE Trans Automatic Control 50:169–182
Pantaleone J. Webpage. http://salt.uaa.alaska.edu/jim/
Pantaleone J (2002) Synchronization of metronomes. Am J Phys 70(10):992–1000
Scardovi L, Sarlette A, Sepulchre R (2007) Synchronization and balancing on the N-torus. Syst Control Lett 56:335–341
Sepulchre R, Paley DA, Leonard NE (2007) Stabilization of planar collective motion: all-to-all communication. IEEE Trans Automatic Control 52(5):811–824
Sepulchre R, Paley DA, Leonard NE (2008) Stabilization of planar collective motion with limited communication. IEEE Trans Automatic Control 53(3):706–719
Strogatz SH (2000) From Kuramoto to Crawford: exploring the onset of synchronization in populations of coupled oscillators. Physica D 143:1–20
Strogatz SH (2004) Sync: the emerging science of spontaneous order. Hyperion Books, New York
Strogatz SH, Stewart I (1993) Coupled oscillators and biological synchronization. Sci Am 269:102–109
Wieland P, Sepulchre R, Allgower F (2011) An internal model principle is necessary and sufficient for linear output synchronization. Automatica 47:1068–1074

Output Regulation Problems in Hybrid Systems

Sergio Galeani
Dipartimento di Ingegneria Civile e Ingegneria Informatica, Università di Roma "Tor Vergata", Roma, Italy

Abstract

This entry discusses some of the salient features of the output regulation problem for hybrid systems, especially in connection with the steady-state characterization. In order to better highlight such peculiarities, the discussion is mostly focused on the simplest class of linear time-invariant systems exhibiting such behaviors. In comparison with the usual regulation theory, the role played by the zero dynamics and by the presence of more inputs than outputs is particularly striking.
Keywords The solution of an output regulation prob-


lem entails the solution of two subproblems: the
Disturbance rejection; Hybrid systems; Internal definition of a set of zero output steady-state
model principle; Output regulation; Tracking; solutions and the asymptotic stabilization of such
Zero dynamics solutions (or at least making them attractive; in
many cases of interest, the achievement of this
last objective actually yields asymptotic stabi-
Introduction lization). As a matter of fact, the stabilization
subproblem is already widely studied and de-
Output regulation is one of the most classical scribed per se; for this reason, after some short
problems in control theory, and its celebrated remarks in section “Stabilization Obstructions in
solution in the linear time-invariant case (Davison Hybrid Regulation”, the remainder of this pre-
1976; Francis and Wonham 1976) is characterized by remarkable elegance and ideas (like the internal model principle). While the extension to nonlinear systems is still an active field of investigation, the study of output regulation for hybrid systems is also being actively pursued, and several surprising results have already appeared for the linear case, suggesting that a richer structure arises in hybrid output regulation problems due to the interplay between flow and jump dynamics.

The problem can be stated as follows. A known exosystem E with initial state belonging to a suitably defined set W0 produces a signal w possibly affecting both the plant P and the compensator C; the compensator has to guarantee that for any initial state of E in a set W0:
• All closed-loop responses are bounded.
• The output e of P asymptotically converges to zero.
In order to avoid trivialities, the exosystem E is assumed to be such that its state evolution from nonzero initial states in W0 is bounded and not asymptotically converging to zero, both in forward and in backward time.

Two typical embodiments of the output regulation problem are the disturbance rejection and the reference tracking problems. In disturbance rejection, w acts as a disturbance on P and cannot be measured by C, and the output e from which the effect of w has to be canceled is the actual plant output. In reference tracking, w contains the references to be tracked by an output yr of P, so that w can be assumed to be known by C; by defining the regulated output e as e = yr − r, the reference tracking problem is cast as an output regulation problem.

The presentation will focus only on steady-state-related issues, for the simplest class of systems which exhibit the most peculiar and interesting phenomena of hybrid steady-state behavior (see in particular section "Key Features in Hybrid vs Classical Output Regulation"). For concreteness, only hybrid systems E, P characterized by linear time-invariant (flow and jump) dynamics will be considered; following Goebel et al. (2012, Chap. 2), a two-dimensional parameterization of hybrid time (t, k) ∈ R × N will be used, with t measuring the flow of (usual) time and k counting the number of jumps experienced by the solution (see Fig. 1 for a specific example). So, the exosystem E will be described at time (t, k) by

ẇ = S w,  (w, t, k) ∈ C_E,   (1a)
w⁺ = J w,  (w, t, k) ∈ D_E,   (1b)

[Output Regulation Problems in Hybrid Systems, Fig. 1: Hybrid time domain T for a "sampled data" hybrid system, plotted in the (t, k) plane with t marked in multiples of τ_M and k on the vertical axis; dots indicate the points (t, k) ∈ T where jumps occur (see section "Hybrid Steady-State Generation" for the t_k notation)]
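The (t, k) bookkeeping above can be made concrete with a small numerical sketch. The scalar data below (S = 0, J = −1, unit sampling period) are illustrative choices, not part of the entry's formal development; since the flow map is linear, the closed-form flow w(t, k) = e^{S(t − t_k)} w(t_k, k) is used instead of an ODE solver.

```python
import math

S, J = 0.0, -1.0   # illustrative scalar flow/jump maps: dw/dt = S*w, w+ = J*w
tau_M = 1.0        # sampling period: the k-th jump occurs at t_k = k*tau_M

def w(t, k, w0=1.0):
    """Exosystem value at hybrid time (t, k), with t in [k*tau_M, (k+1)*tau_M].

    Because S = 0 here the flow is the identity, so the state after k jumps
    is simply J**k * w0; for S != 0 the factors exp(S*tau_M) and J would
    have to be composed alternately, jump by jump.
    """
    t_k = k * tau_M
    w_after_jumps = (J ** k) * w0
    return math.exp(S * (t - t_k)) * w_after_jumps

# a few points of the hybrid time domain sketched in Fig. 1
samples = [w(0.5, 0), w(1.5, 1), w(2.5, 2)]
print(samples)  # [1.0, -1.0, 1.0]: the sign alternates at every jump
```

Each hybrid time (t, k) carries both the ordinary time t and the jump count k, which is what allows the same t to index different values before and after a jump.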
1022 Output Regulation Problems in Hybrid Systems
and the plant P will be described at time (t, k) by

ẋ = A x + B u + P w,  (x, u, t, k) ∈ C_P,   (2a)
x⁺ = E x + R w,  (x, u, t, k) ∈ D_P,   (2b)
e = C x + Q w,   (2c)

with x(t, k) ∈ Rⁿ, u(t, k) ∈ Rᵐ, e(t, k) ∈ Rᵖ, w(t, k) ∈ R^q, and suitably defined flow sets C_E, C_P and jump sets D_E, D_P.

Stabilization Obstructions in Hybrid Regulation

The achievement of asymptotic stabilization of the desired (zero output) steady-state responses for the considered class of linear hybrid systems crucially depends on whether the plant P and the exosystem E have synchronous jump times or not.

Asynchronous Jumps
Typically, jumps in P and E will be asynchronous, and this will cause the undesirable phenomenon that genuinely close trajectories will look "distant" around each jump when the distance is measured according to the usual Euclidean norm. The simplest illustration of such phenomenon consists in considering two trajectories of the same system starting from ε-close initial conditions. Consider the system

v̇ = 1,  v ∈ [0, 1];   v⁺ = 0,  v ∉ (0, 1),

with the initial states v0 = 0 and v1 = ε, 0 < ε < 1. The two ensuing solutions at time (t, k) are immediately computed as

v(t, k; v0) = t − k,  t ∈ [k, k + 1],
v(t, k; v1) = t − k + ε,  t ∈ [k, k + 1 − ε],
v(t, k; v1) = t − k + ε − 1,  t ∈ [k + 1 − ε, k + 1].

Hence, the (Euclidean) distance between the two solutions at time (t, k) is given by

d(t, k) = ε,  t ∈ [k, k + 1 − ε],
d(t, k) = 1 − ε,  t ∈ [k + 1 − ε, k + 1];

in other words, choosing ε > 0 as small as desired, arbitrarily close initial conditions generate trajectories which are apart by a finite amount (as close as desired to 1) during the arbitrarily small time intervals where t ∈ [k + 1 − ε, k + 1]. Since stability deals with trajectories remaining close forever and attractivity deals with trajectories getting closer and closer, examples such as the one above pose serious issues when defining (let alone establishing) stability and attractivity in the hybrid case. Similar problems arise not only in output regulation problems but also in other areas like state tracking, observers, and general interconnections of hybrid systems.

However, intuition suggests (and mathematics confirms, by using a suitable notion of "distance") that such trajectories are close indeed. Several approaches have been proposed in order to overcome such difficulty. Considering as an example a bouncing ball tracking another bouncing ball, the problematic time intervals are those between the bounce of the first ball hitting the ground and the bounce of the other ball; in such a case, the modified distances are defined by either
• Allowing to exclude sufficiently short "problematic" intervals (possibly requiring that their length asymptotically tends to zero); see, e.g., Galeani et al. (2008, 2012)
• Considering alternative "mirrored" trajectories computed as if the last jump did not happen; see, e.g., Forni et al. (2013a,b)
• Using a "stretched" distance function δ such that when point a is in the jump set and its image via the jump map is g(a), then δ(a, b) = δ(g(a), b); see, e.g., Biemond et al. (2013).
While the first approach has been proposed first, the other two (which are strongly related) have the advantage of providing (under mild additional hypotheses) global control Lyapunov functions.
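These closed-form expressions can be checked numerically; the sketch below (plain Python, with the illustrative value ε = 0.1) evaluates both solutions just before t = k + 1 and recovers the finite gap 1 − ε between trajectories whose initial conditions are only ε apart.

```python
eps = 0.1  # the two initial conditions are v0 = 0 and v1 = eps

def v_from_0(t, k):
    # solution from v0 = 0 on [k, k+1]; it jumps exactly at t = k + 1
    return t - k

def v_from_eps(t, k):
    # solution from v1 = eps; its next jump happens earlier, at t = k + 1 - eps
    if t <= k + 1 - eps:
        return t - k + eps          # has jumped the same number of times
    return t - k + eps - 1          # has jumped once more than v_from_0

t, k = 1.0 - eps / 2, 0             # a time inside the "problematic" interval
gap = abs(v_from_0(t, k) - v_from_eps(t, k))
print(gap)  # approximately 1 - eps = 0.9, despite eps-close initial data
```

Shrinking eps makes the initial conditions closer, yet the gap during the problematic interval grows toward 1, which is exactly the obstruction discussed above.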
Finally, it is worth noting that the most adequate tools to address similar issues for general hybrid systems are the "graphical distance" among hybrid arcs and related concepts (see Goebel et al. 2012, Chap. 5).

Synchronous Jumps
When synchronous jumps are considered, the above issue disappears, and asymptotic stabilization becomes a much simpler matter. Although synchronous jumps look more like an exception than a rule in hybrid systems, they are very reasonable for specific classes of problems.

In order to have synchronous jumps, some authors have considered the use of "jump inputs" which impose a jump at a certain time, which can be physically reasonable in some systems, e.g., two tanks separated by a movable wall, assuming that when the wall is removed the fluid reaches the equilibrium configuration almost instantaneously.

Another relevant class consists of "sampled data" systems, whose jumps are essentially due to digital components which operate at a fixed sampling rate, which will be considered in the rest of this entry. In such a case, letting τ_M be the sampling period, the time domain of the hybrid system is fixed as (see Fig. 1)

T := {(t, k) : t ∈ [k τ_M, (k + 1) τ_M], k ∈ Z},   (3)

all jumps happen exactly for (t, k) with t = (k + 1) τ_M, and then (1) can be simplified as

ẇ = S w,   (4a)
w⁺ = J w,   (4b)

and (2) can be simplified as

ẋ = A x + B u + P w,   (5a)
x⁺ = E x + R w,   (5b)
e = C x + Q w,   (5c)

since flow and jump times are clear from the context.

For the latter class of systems, by using linear time-invariant hybrid control laws and observers (and an easily provable separation principle), it is easily shown that:
• Under a hybrid stabilizability hypothesis, state feedback stabilization of (5) is easily achieved.
• Output feedback stabilization of (5) from e is also trivial under an additional hybrid detectability hypothesis.
• Under hybrid detectability of the cascade of (4) and (5), w can be asymptotically estimated from e.
Due to the above facts, it can be assumed without loss of generality that (5) is asymptotically stable (equivalently, that all eigenvalues of E e^{A τ_M} have modulus strictly less than one). Asymptotic stability then yields incremental stability: letting x̂ and x̄ denote two motions under the same inputs u, w and only differing in their initial states, it is immediate to see that their difference x̃ := x̂ − x̄ evolves as

d/dt x̃ = A x̂ + B u + P w − (A x̄ + B u + P w) = A x̃,
x̃⁺ = E x̂ + R w − (E x̄ + R w) = E x̃,

and so x̃ is just a free motion of the plant, asymptotically converging to zero. Incremental stability implies that regulation is achieved as soon as it is shown that, for any exogenous input w, it is possible to find an input u and an initial state of (5) such that e is identically zero, since then any other motion arising from a different initial state will asymptotically converge to the motion with identically zero e. Moreover, it is easy to see that asymptotic stability of the origin actually implies uniform, global, and exponential stability of any trajectory for such systems.

Hybrid Steady-State Generation

From this point on, the rest of the presentation will be focused only on the case where the problem data are of the form (3) to (5), since this allows to provide an uncluttered view on some
peculiar features of hybrid steady-state motions, without the burden of having to take care of delicate stability issues arising in more general contexts.

Based on the preceding discussion, there is no loss of generality at this point in assuming that:
• Plant (5) is asymptotically stable, which is equivalent to all eigenvalues of E e^{A τ_M} having a magnitude strictly less than one.
• Exosystem (4) is Poisson stable, which is equivalent to all eigenvalues of J e^{S τ_M} having a magnitude equal to one.
It is also customary to distinguish between full information and error feedback regulation, where in the first case the controller C has access to the complete state (w, x) of the cascade of E and P, whereas in the second case C can only measure the output e of P.

Having assumed asymptotic stability of plant P, the only role of compensator C consists in generating the correct steady-state input, since then, by incremental stability of P, asymptotic regulation is ensured from any initial state. Recalling the expression of T in (3), for the following developments it is useful to define the jump times t_k and the elapsed time of flow since the last jump τ as

t_k := k τ_M,   τ(t, k) := t − k τ_M;

the arguments of τ(t, k) will usually be omitted since clear from the context. Note that τ satisfies τ̇ = 1, τ⁺ = 0, and it is often explicitly introduced as an additional timer variable.

The Full Information Case
Consider the candidate steady-state motion and input

[x_ss(t, k); u_ss(t, k)] = [Π(τ); Γ(τ)] w(t, k).   (6)

Requiring that such expressions actually characterize a response of the considered plant, and that the associated output is zero, amounts to asking that:
• During flows, ẋ_ss(t, k) has to satisfy the two equations:

ẋ_ss(t, k) = Π̇(τ) w(t, k) + Π(τ) ẇ(t, k),
ẋ_ss(t, k) = A x_ss(t, k) + B u_ss(t, k) + P w(t, k).

• At jumps, x⁺_ss(t_{k+1}, k) has to satisfy the two equations:

x⁺_ss(t_{k+1}, k) = Π(0) w⁺(t_{k+1}, k),
x⁺_ss(t_{k+1}, k) = E x_ss(t_{k+1}, k) + R w(t_{k+1}, k).

• For the output e_ss to be identically zero:

0 = C x_ss(t, k) + Q w(t, k).

Substituting (6) in the above conditions and considering that such relations should hold for all values of w, the following hybrid regulator equations are obtained:

Π̇(τ) + Π(τ) S = A Π(τ) + B Γ(τ) + P,   (7a)
Π(0) J = E Π(τ_M) + R,   (7b)
0 = C Π(τ) + Q.   (7c)

Equations (7) can be shown to be both necessary and sufficient for (6) to solve the output regulation problem under the considered assumptions. Once a solution of (7) is available, the full information regulator simply reduces to the time-varying static feedforward controller

u(t, k) = Γ(τ) w(t, k)   (8)

which just provides as input the steady-state input u_ss characterized as in (6); in fact, since (5) is incrementally stable (as follows from its asymptotic stability, which was assumed without loss of generality), its output response under the control law (8) must converge to the output response associated to (6).

For later use, note that in the non-hybrid case where P and E only flow,

ẇ = S w,   (9a)
ẋ = A x + B u + P w,   (9b)
e = C x + Q w,   (9c)

the candidate steady state (6) is replaced by

[x_ss(t); u_ss(t)] = [Π; Γ] w(t),   (10)

and (7) reduces to the celebrated regulator equations (or Francis equations)

Π S = A Π + B Γ + P,   (11a)
0 = C Π + Q,   (11b)

and, as above, assuming without loss of generality that the plant is asymptotically stable, the full information regulator reduces to the time-invariant static feedforward controller

u(t, k) = Γ w(t, k).   (12)

The Error Feedback Case
When the exosystem state is not measured, a dynamic compensator of the form

η̇ = F η + G e,   (13a)
η⁺ = L η,   (13b)
u = H η,   (13c)

which is also supposed to flow and jump according to the a priori fixed time domain T considered for the plant, is introduced, and the corresponding candidate steady-state motion including η is

[x_ss(t, k); η_ss(t, k); u_ss(t, k)] = [Π(τ); Σ(τ); Γ(τ)] w(t, k).   (14)

By following similar steps as above, requiring invariance of such a manifold in the space of (x, η, u, w), as well as zero output on it, leads to the conclusion that in addition to (7), the following relations must be satisfied as well:

Σ̇(τ) + Σ(τ) S = F Σ(τ),   (15a)
Σ(0) J = L Σ(τ_M),   (15b)
Γ(τ) = H Σ(τ).   (15c)

Equations (7) and (15) can be shown to be both necessary and sufficient for (13) to solve the output regulation problem under the considered assumptions, and they generalize the corresponding conditions for the non-hybrid case where P and E only flow (see (9)), in which (13) and (14) are replaced by

η̇ = F η + G e,   (16a)
u = H η,   (16b)
[x_ss(t); η_ss(t); u_ss(t)] = [Π; Σ; Γ] w(t),   (16c)

and (15) reduces to

Σ S = F Σ,   (17a)
Γ = H Σ.   (17b)

Relations (17) are an expression of the internal model principle, stating that in order to achieve error feedback regulation, the compensator C must include a suitable "copy" of the exosystem; namely, (17a) imposes a constraint on the η dynamics of C which, coupled with (17b), ensures that the signal u_ss = Γ w used in the full information case can be equivalently produced (without measuring w!) as u_ss = H Σ w. A similar interpretation can be given to (15), which must be required in addition to (7) in order for (13) to solve the hybrid error feedback output regulation problem.

Key Features in Hybrid vs Classical Output Regulation

While the previous section mainly aimed at showing how the classical theory generalizes in the hybrid case (at least for a special class of hybrid systems), the aim of this section is to point out some of the striking differences between the two
cases. Before proceeding further, and in order to keep focus on the characterization of the steady-state response, it is worth mentioning here that although time-varying systems will be considered, no issue regarding nonuniform stability (like in general nonautonomous systems) arises, since the timer τ just ranges in the compact set [0, τ_M] due to the assumed periodic structure of T (see also the end of section "Synchronous Jumps").

Comparing the classical and the hybrid output regulators, and considering that P and E are time invariant, it seems somewhat strange that in the output feedback case the linear time-invariant regulator (16a) and (16b) generalizes to a hybrid linear time-invariant regulator (13), whereas in the full information case the linear time-invariant regulator (12) generalizes to a hybrid linear time-varying regulator (8).

One argument in favor of the time-varying regulator (8) is based on the following consideration. It is well known that (11) has a unique solution in the case of a square plant (m = p) under the nonresonance condition between the zeros of P and the eigenvalues of E, requiring that

rank [A − sI, B; C, 0] = n + p,  ∀ s ∈ Λ(S),

where Λ(S) denotes the spectrum of S. In such a case, (11) amounts to a system of nq + pq linear equations in nq + mq unknowns (the elements of Π, Γ), which might be expected to be solvable since m ≥ p. If one were trying to use the unique constant solution (Π, Γ) of (11) as a solution of (7), clearly (7a) and (7c) would be satisfied, but then (7b) would impose other nq equations on Π which would unlikely be satisfied. For this reason, apparently the additional degree of freedom offered by choosing time-dependent Π and Γ might be of help. In fact, it can be shown that if m = p and under a hybrid nonresonance condition (involving E e^{A τ_M} and J e^{S τ_M}) between P and E, (7a) and (7b) have a unique solution for any choice of Γ(τ), so that the design boils down to satisfying (7c) by choosing Γ(τ); but is this always possible? In order to answer this nontrivial question, a different path must be followed. While a complete formal analysis can be performed, the following discussion will be mainly based on showing the simplest examples exhibiting the pathologies of interest.

Consider the system with τ_M = 1 (so that t_k = k, for all k ∈ Z) and

ẇ = 0,   (18a)
w⁺ = −w,   (18b)
ẋ₁ = −x₁,  ẋ₂ = −2 x₂ + u + w,   (18c)
x₁⁺ = x₂,  x₂⁺ = 2e x₁ + x₂,   (18d)
e = x₂ − w.   (18e)

The unique steady-state solution achieving output regulation can be simply computed. In fact, by (18a) and (18b),

w(t, k) = (−1)^k w(0, 0);

then, by (18e), it appears that e_ss = 0, ∀ (t, k) ∈ T, implies

x₂,ss(t, k) = w(t, k) = (−1)^k w(0, 0),  ∀ (t, k) ∈ T,

which in turn implies that ẋ₂,ss = 0 for all t ∈ (k, k + 1), k ∈ Z, and the unique steady-state input

u_ss = 2 x₂,ss − w.

Since (18d) implies that x₁,ss(t_{k+1}, k + 1) = x₂,ss(t_{k+1}, k) = w(t_{k+1}, k), and (18c) implies that x₁,ss(t, k) = e^{−(t−t_k)} x₁,ss(t_k, k) for t ∈ (t_k, t_{k+1}), it follows that

x₁,ss(t, k) = −e^{−(t−t_k)} w(t_k, k),  t ∈ (t_k, t_{k+1}),   (19)

which finally is coherent with the jump equation for x₂,ss in (18d), since
x₂,ss(t_{k+1}, k + 1) = 2e x₁,ss(t_{k+1}, k) + x₂,ss(t_{k+1}, k)
  = 2e (−e^{−1}) w(t_k, k) + w(t_k, k)   (20a)
  = −w(t_k, k)   (20b)
  = (−1)^{k+1} w(0, 0).   (20c)

Before commenting on the meaning of the above derived steady-state evolution, it is worth noting that (18) might actually derive from an original system with (18c) replaced by

ẋ₁ = −x₁,  ẋ₂ = x₁ − 2 x₂ + u + w,   (21)

under the preliminary state feedback

u = −x₁ + v.   (22)

Such a feedback renders the subspace {x : x₂ = 0} unobservable (when the system only flows) and reveals that the dynamics of x₁ in (18c) is the flow zero dynamics of P, that is, the zero dynamics of P when jumps are inhibited. Having set the stage, several interesting observations can be made now.

The flow zero dynamics samples the exogenous signal w at jumps and then evolves according to its own modes (see (19)). In fact, while in the classical case (10) the state and input at steady state can be expressed as a constant matrix times the current value of w, the real nature of the time dependence of Γ and Π in (6) is linked to this phenomenon of sampling w(t_k, k) and propagating it along the zero dynamics. A suitable analysis shows that Π(τ), Γ(τ) contain products of matrices with rightmost factor e^{−Sτ} (which recovers w(t_k, k) = e^{−Sτ} w(t, k) from the current value w(t, k) of w) and leftmost factor containing the fundamental matrix of the flow zero dynamics. It is worth mentioning that the "motion along the zeros" in the present context is strongly related to the same kind of motions used for perfect tracking in non-hybrid systems. The above insight about the nature of the dependence on τ in (6) also reveals why in the output feedback case (13) such dependence is not needed: the required modes of the flow zero dynamics in that case are provided by copying them in the compensator dynamics!

An even stronger consequence of the analysis above is a flow zero dynamics internal model principle, which essentially states that any output feedback compensator solving the output regulation problem must be able to produce as free responses (during flow) a suitable subset of the natural modes of the flow zero dynamics (and a suitably modified version applies to the feedforward static compensator (8)). It is worth noting that while the classical internal model principle requires exact knowledge of the exosystem modes (which is kind of a mild requirement, especially when the exosystem models references or constant offsets), the flow zero dynamics internal model principle requires exact knowledge of the modes of the zero dynamics, which typically depend on not precisely known plant parameters; clearly, this fact poses serious questions in view of the achievement of robust regulation.

A final point, also raising serious issues about what can be robustly achieved (and how) in the setting of hybrid output regulation, is the fact that generically, existence of solutions is not robust to arbitrarily small parameter variations. In particular, looking again at the computations in (20), it should be clear that the involved functions are all fixed by the previous reasonings, whereas satisfaction of (20) crucially depends on exact cancellations of certain coefficients. Any small variations of such coefficients in (18d) imply that the problem admits no steady state yielding zero output. This fact is in sharp contrast with classical regulation, where the nonresonance condition ensures existence of (different) solutions for small parameter variations. It has to be noted, though, that under additional conditions, robust existence of solutions is guaranteed if the plant is fat, that is, m > p. Using again the previous example, this is the case if an additional input is introduced:

ẋ₁ = −x₁ + u₁,  ẋ₂ = −2 x₂ + u₂ + w,

since then even a constant (suitably chosen) value of u₁ can be used to ensure that when the time to
jump arrives, the value of x₁ is such to ensure a correct jump for x₂ (remember that since x₁ is unobservable during flows, its motion can be changed as wished if this helps with ensuring that the observable x₂ achieves zero output).

Summary and Future Directions

The investigation of the output regulation problem for hybrid systems is still at a very early stage. While the issues of stabilization of the manifold where regulation is achieved seem to be a relatively better understood topic (possibly drawing from a richer literature on stabilization of hybrid systems), the geometry and design of such manifold appear to involve several much more intricate issues, whose understanding will be crucial in order to achieve more complete solutions.

Already in the very simplified case of linear dynamics and synchronous jumps, the important role played by the whole flow zero dynamics for feasibility (existence of solutions for the nominal parameter values) and by the availability of more inputs than outputs for well posedness (existence of solutions for slightly perturbed parameter values) marks a strong difference with the linear non-hybrid case, where both properties are granted by satisfaction of the nonresonance condition, which only involves the spectrum of the zero dynamics, even for square plants.

While the expected final goal of this investigation should hopefully lead to the design of robust output regulators based on a suitable internal model principle, a deeper understanding of the structure of the steady-state motion achieving regulation, as well as of the effect of additional inputs in shaping it, seems to be an important preliminary step towards such goal.

Cross-References
▶ Hybrid Dynamical Systems, Feedback Control of
▶ Nonlinear Zero Dynamics
▶ Regulation and Tracking of Nonlinear Systems

Recommended Reading

Foundational contributions on classical output regulation are Francis and Wonham (1976), Davison (1976), and Wonham (1985); more recent monographs include Huang (2004), Trentelman et al. (2001), Pavlov et al. (2005), Saberi et al. (2000), and Byrnes et al. (1997). Goebel et al. (2012) provides a solid introduction to a powerful and elegant framework for hybrid systems, including a thorough discussion of stability issues related to those mentioned here. Regulation problems (mainly reference tracking) for classes of hybrid systems with asynchronous jumps are presented in Biemond et al. (2013), Forni et al. (2013a,b), Morarescu and Brogliato (2010), and Galeani et al. (2008, 2012); synchronous jumps (and the ensuing advantages) are considered, e.g., in Sanfelice et al. (2013). The class of linear systems with synchronous jumps considered in sections "Hybrid Steady-State Generation" and "Key Features in Hybrid vs Classical Output Regulation" has been proposed in Marconi and Teel (2010, 2013) and studied in Cox et al. (2011, 2012); the issues related to flow zero dynamics, fat plants, and robustness have been discussed in Carnevale et al. (2012a,b, 2013), partly developing remarks contained in Galeani et al. (2008, 2012).

Bibliography

Biemond J, van de Wouw N, Heemels W, Nijmeijer H (2013) Tracking control for hybrid systems with state-triggered jumps. IEEE Trans Autom Control 58(4):876–890
Byrnes CI, Priscoli FD, Isidori A (1997) Output regulation of uncertain nonlinear systems. Birkhäuser, Boston
Carnevale D, Galeani S, Menini L (2012a) Output regulation for a class of linear hybrid systems. Part 1: trajectory generation. In: Conference on decision and control, Maui, pp 6151–6156
Carnevale D, Galeani S, Menini L (2012b) Output regulation for a class of linear hybrid systems. Part 2: stabilization. In: Conference on decision and control, Maui, pp 6157–6162
Carnevale D, Galeani S, Sassano M (2013) Necessary and sufficient conditions for output regulation in a class of hybrid linear systems. In: Conference on decision and control, Florence, pp 2659–2664
Cox N, Teel AR, Marconi L (2011) Hybrid output regulation for minimum phase linear systems. In: American control conference, San Francisco, pp 863–868
Cox N, Marconi L, Teel AR (2012) Hybrid internal models for robust spline tracking. In: Conference on decision and control, Maui, pp 4877–4882
Davison E (1976) The robust control of a servomechanism problem for linear time-invariant multivariable systems. IEEE Trans Autom Control 21(1):25–34
Forni F, Teel AR, Zaccarian L (2013a) Follow the bouncing ball: global results on tracking and state estimation with impacts. IEEE Trans Autom Control 58(6):1470–1485
Forni F, Teel AR, Zaccarian L (2013b) Reference mirroring for control with impacts. In: Daafouz J, Tarbouriech S, Sigalotti M (eds) Hybrid systems with constraints. John Wiley & Sons, Inc., Hoboken, pp 213–260. doi:10.1002/9781118639856.ch8
Francis B, Wonham W (1976) The internal model principle of control theory. Automatica 12(5):457–465
Galeani S, Menini L, Potini A, Tornambe A (2008) Trajectory tracking for a particle in elliptical billiards. Int J Control 81(2):189–213
Galeani S, Menini L, Potini A (2012) Robust trajectory tracking for a class of hybrid systems: an internal model principle approach. IEEE Trans Autom Control 57(2):344–359
Goebel R, Sanfelice R, Teel A (2012) Hybrid dynamical systems: modeling, stability, and robustness. Princeton University Press, Princeton
Huang J (2004) Nonlinear output regulation: theory and applications, vol 8. Society for Industrial Mathematics, Philadelphia
Marconi L, Teel AR (2010) A note about hybrid linear regulation. In: Conference on decision and control, Atlanta, pp 1540–1545
Marconi L, Teel AR (2013) Internal model principle for linear systems with periodic state jumps. IEEE Trans Autom Control 58(11):2788–2802
Morarescu I, Brogliato B (2010) Trajectory tracking control of multiconstraint complementarity lagrangian systems. IEEE Trans Autom Control 55(6):1300–1313
Pavlov AV, Wouw N, Nijmeijer H (2005) Uniform output regulation of nonlinear systems: a convergent dynamics approach. Birkhäuser, Boston
Saberi A, Stoorvogel A, Sannuti P (2000) Control of linear systems with regulation and input constraints. Springer, London
Sanfelice RG, Biemond JJ, van de Wouw N, Heemels W (2013) An embedding approach for the design of state-feedback tracking controllers for references with jumps. Int J Robust Nonlinear Control. doi:10.1002/rnc.2944
Trentelman HL, Stoorvogel AA, Hautus MLJ (2001) Control theory for linear systems. Springer, London
Wonham W (1985) Linear multivariable control: a geometric approach. Applications of mathematics, vol 10, 3rd edn. Springer, New York
P

Parallel Robots

Frank C. Park
Robotics Laboratory, Seoul National University, Seoul, Korea

Abstract

Parallel robots are closed chains consisting of a fixed and moving platform that are connected by a set of serial chain legs. Parallel robots typically possess both actuated and passive joints and may even be redundantly actuated. Although more structurally complex and possessing a smaller workspace, parallel robots are usually designed to exploit one or more of the natural advantages they possess over their serial counterparts, e.g., higher stiffness, increased positioning accuracy, and higher speeds and accelerations. In this chapter we provide an overview of the kinematic and dynamic modeling of parallel robots, a description of their singularity behavior, and basic methods developed for their control.

Keywords

Closed kinematic chain; Closed loop mechanism; Parallel manipulator

Introduction

A parallel robot refers to a kinematic chain in which a fixed platform and moving platform are connected to each other by several serial chains, or legs. The legs, which typically have the same kinematic structure, are connected to the fixed and moving platforms at points that are distributed in a geometrically symmetric fashion. The Stewart-Gough platform (Fig. 1) is a well-known example of a parallel robot: each of the six legs is a UPS structure (i.e., consisting of rigid links serially connected by a universal, prismatic, and spherical joint), with the prismatic joint actuated. Other examples of parallel robots include the 6-RUS platform of Fig. 2, the haptic interface device of Fig. 3, and the eclipse mechanism of Fig. 4.

Parallel robots can be regarded as a special class of closed chain mechanisms (i.e., chains that contain one or more closed loops) and are purposely designed to exploit the specific advantages afforded by the closed chain structure, e.g., for improved stiffness, greater positioning accuracy, or higher speed. Parallel robots should be distinguished from two or more cooperating serial robots that may form closed loops during execution of a task (e.g., a robotic hand grasping an object). Some of the fastest velocities and accelerations recorded by industrial robots have been achieved by parallel robots, primarily by

J. Baillieul, T. Samad (eds.), Encyclopedia of Systems and Control, DOI 10.1007/978-1-4471-5058-9, © Springer-Verlag London 2015
1032 Parallel Robots

of freedom may exceed the kinematic degrees of


freedom, in which case we say that the robot is
redundantly actuated.
A parallel robot with a designated end-effector
frame also has a notion of forward and inverse
kinematics. While for serial chains the forward
kinematics is a well-defined mapping and the
inverse kinematics can typically have multiple
solutions, for parallel robots the situation is less
straightforward. For the Stewart-Gough platform
of Fig. 1, in which the leg lengths can be adjusted
by actuating the prismatic joints, the inverse kine-
matics is unique and straightforward to obtain,
whereas the forward kinematics will have multi-
Parallel Robots, Fig. 1 Stewart-Gough platform

placing the actuators on the fixed platform and thereby minimizing the mass of the moving parts.

Many of the model-based techniques developed for the control of traditional serial chain robots are also applicable to a large class of parallel robots. On the other hand, kinematic and dynamic models for parallel robots are inherently more complex. Parallel robots also possess features not found in serial robots, e.g., passive joints, the possibility of redundant actuation, and a diverse range of singularity behavior, that need to be considered when designing a control law. We therefore begin with a brief overview of the kinematic and dynamic modeling of parallel robots before discussing their control.

Modeling

Kinematics
Whereas the kinematic degrees of freedom, or mobility, of a serial chain robot can be obtained as the sum of the degrees of freedom of each of the joints, the situation is somewhat more complex for parallel robots and closed chains in general, since only a subset of the joints can be independently actuated. The mobility of a parallel robot corresponds to the total degrees of freedom of the joints that can be independently actuated. In some cases the number of actuated joint degrees

ple solutions. For other types of parallel robots in which the legs themselves contain one or more closed loops, both the forward and inverse kinematics can have multiple solutions.

The notion of kinematic singularities for parallel robots is also much more involved than the case for serial robots. Whereas kinematic singularities for serial chain robots are characterized by configurations at which the forward kinematics Jacobian (i.e., the linear mapping relating joint velocities to end-effector frame velocities) becomes singular, for parallel robots and closed chains in general, there exist other notions of singularities not found in serial chains. For example, given a parallel robot with kinematic mobility m – if the parallel robot consists only of one degree-of-freedom joints, this implies that exactly m joints can be actuated – there may exist configurations in which these m joints cannot be independently actuated. Conversely, even if the m actuated joints are each fixed to some value, the parallel robot may fail to be a structure, i.e., some of the links may be able to move.

In the above scenario, choosing a different set of m actuated joints may remedy this situation, in which case such singularities are referred to as actuator singularities. Configurations at which singularity behavior occurs regardless of which joints are actuated are denoted configuration singularities. The final class of singularities are end-effector singularities, which correspond to the usual serial chain notion of kinematic singularity, in which the end-effector loses one or more degrees of freedom of available motion.
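The mobility count discussed above can be made concrete with the spatial Grübler–Kutzbach criterion, a standard textbook formula that is not stated in the text itself; the sketch below applies it to the Stewart-Gough platform of Fig. 1 (the helper name is illustrative):

```python
# Spatial Grübler-Kutzbach mobility criterion (standard formula, supplied
# here as a supplement): m = 6(N - 1 - J) + sum f_i, where N counts links
# including the fixed base, J counts joints, and f_i is the DOF of joint i.

def mobility(num_links, joint_dofs):
    """Kinematic mobility of a spatial mechanism."""
    num_joints = len(joint_dofs)
    return 6 * (num_links - 1 - num_joints) + sum(joint_dofs)

# Stewart-Gough platform: base + moving platform + 6 legs of 2 links each
# gives N = 14; each leg carries a universal (2 DOF), prismatic (1 DOF),
# and spherical (3 DOF) joint, giving J = 18 joints.
print(mobility(14, 6 * [2, 1, 3]))   # -> 6

# A serial 6R arm recovers the serial-chain rule (sum of joint DOFs):
print(mobility(7, [1] * 6))          # -> 6
```

Note how the closed loops reduce the mobility well below the 36 total joint degrees of freedom, which is exactly why only a subset of the joints can be actuated independently.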
Parallel Robots 1033
Parallel Robots, Fig. 2 6 × RUS platform (with revolute, universal, and ball joints labeled)

Parallel Robots, Fig. 3 A 3 × PUU haptic interface (with prismatic and universal joints labeled)
Dynamics
In the case of a parallel robot whose actuated degrees of freedom coincide with its kinematic mobility m, it is possible to choose an independent set of generalized coordinates of dimension m, denoted q ∈ R^m and typically identified with the actuated joints, and to express the dynamics in the standard form

    M(q)q̈ + C(q, q̇)q̇ + G(q) = τ,    (1)

where τ ∈ R^m denotes the vector of input joint torques, M(q) denotes the m × m mass matrix, the matrix-vector product C(q, q̇)q̇ denotes the vector of Coriolis terms, and G(q) ∈ R^m denotes the vector of gravitational forces. The structure of the dynamic equations is identical to that for serial chain robots. Also like the case for serial chain robots, the Coriolis matrix term C(q, q̇) ∈ R^{m×m} is not unique, so that one should ensure that the correct C(q, q̇) is used in, e.g., any control law whose stability depends on the matrix Ṁ(q) − 2C(q, q̇) being skew-symmetric.

Parallel Robots, Fig. 4 The 3 × PPRS Eclipse parallel mechanism
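The skew-symmetry of Ṁ(q) − 2C(q, q̇) noted above holds whenever C is built from the Christoffel symbols of M. The sketch below checks this numerically for an illustrative two-link-style mass matrix (the matrix entries are hypothetical, not taken from the text):

```python
import math

def M(q):
    # illustrative symmetric, configuration-dependent 2x2 mass matrix
    c2 = math.cos(q[1])
    return [[3.0 + 2.0 * c2, 1.0 + c2],
            [1.0 + c2, 1.0]]

def dM_dq(q, k, h=1e-6):
    # central finite-difference partial derivative dM/dq_k
    qp, qm = list(q), list(q)
    qp[k] += h
    qm[k] -= h
    Mp, Mm = M(qp), M(qm)
    return [[(Mp[i][j] - Mm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]

def coriolis(q, qdot):
    # Christoffel construction:
    # C_ij = sum_k (1/2)(dM_ij/dq_k + dM_ik/dq_j - dM_jk/dq_i) qdot_k
    dM = [dM_dq(q, k) for k in range(2)]
    C = [[0.0] * 2 for _ in range(2)]
    for i in range(2):
        for j in range(2):
            for k in range(2):
                C[i][j] += 0.5 * (dM[k][i][j] + dM[j][i][k] - dM[i][j][k]) * qdot[k]
    return C

def mdot(q, qdot):
    # Mdot_ij = sum_k (dM_ij/dq_k) qdot_k
    dM = [dM_dq(q, k) for k in range(2)]
    return [[sum(dM[k][i][j] * qdot[k] for k in range(2)) for j in range(2)]
            for i in range(2)]

q, qdot = [0.3, -1.1], [0.7, 0.4]
Md, C = mdot(q, qdot), coriolis(q, qdot)
S = [[Md[i][j] - 2.0 * C[i][j] for j in range(2)] for i in range(2)]
skew_defect = max(abs(S[i][j] + S[j][i]) for i in range(2) for j in range(2))
print(skew_defect)   # ~0, up to finite-difference error
```

With a different (but still valid) choice of C the property can be lost, which is why the text warns that the "correct" C must be used in stability arguments.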
It is also important to keep in mind that the q must satisfy the kinematic constraint equations imposed by the loop closure constraints. That is, if θ ∈ R^n denotes the vector of all joints (both actuated and passive), then q ∈ R^m, m ≤ n, will be a subset of θ whose values can only be obtained by solution of the kinematic constraint equations; depending on the nature of the kinematic constraints, one may have to resort to iterative numerical methods.

If the parallel robot is redundantly actuated, then the dynamics are subject to a further set of constraints on the input torques. Letting qe denote the set of independent generalized coordinates and qa be the vector of all actuated joints, the vector of actuated joint torques τa must then further satisfy S^T τa = W^T τ, where τ denotes the vector of joint torques for an equivalent tree structure system that moves identically to the redundantly actuated parallel robot and W and S are defined, respectively, by

    S = ∂θ/∂qe,    W = ∂qa/∂qe.    (2)

Compared to the dynamics for serial chain robots, the dynamics for parallel robots is, in general, considerably more complex and computationally involved. The recursive algorithms that are available for computing the inverse and forward dynamics of serial chain robots can also be used to develop similar recursive algorithms for parallel robot dynamics; however, the computations will be considerably more involved and require multiple iterations.

Motion Control

Exactly Actuated Parallel Robots
For parallel robots whose actuated degrees of freedom match the kinematic mobility (this excludes the set of all redundantly actuated parallel robots), most control laws developed for serial chain robots are also applicable. This is not altogether surprising in light of the similarity in the structure of the kinematic and dynamic equations between serial and parallel robots. Control laws for serial robots are also covered in this handbook, and we refer the reader to  Linear Matrix Inequality Techniques in Optimal Control for the essential details. Here we summarize the most basic control laws and point out any additional computational or other requirements that are needed when applying these laws to parallel robots. Note that other control laws and techniques developed for serial chain robots, e.g., robust, sliding mode, can also be applied with the same additional considerations and requirements outlined below:
1. Computed torque control: Computed torque control for parallel robots has the same control law structure as for serial robots, i.e.,

    τ = M(q)(−Kp e − Kv ė) + τff,    (3)

where e denotes the tracking error, Kp and Kv are the proportional and derivative feedback gain matrices, and τff denotes the feedforward term required to cancel the nonlinear dynamics. Robust versions of computed torque control are also applicable to parallel robots under the same set of conditions, e.g., establishing appropriate bounds on the mass matrix eigenvalues and on the norm of the Coriolis matrix.
2. Augmented PD control: The augmented PD control law for serial robots is also applicable to parallel robots, i.e.,

    τ = −Kp e − Kv ė + M(q)q̈d + C(q, q̇)q̇ + G(q),    (4)

where qd is the reference trajectory to be tracked and Kp and Kv are the proportional and derivative feedback gains. Asymptotic stability is also established under the same conditions.
3. Adaptive control: Because the dynamic equations for parallel robots are also linear in the link mass and inertial parameters, i.e.,

    M(q)q̈ + C(q, q̇)q̇ + G(q) = Φ(q, q̇, q̈)p,    (5)
where p denotes the vector of link mass and inertial parameters, adaptive control laws developed for serial robots can also be used.
4. Other control methods: There exist numerous control methods developed for serial robots, e.g., task space or operational space control, sliding mode control, and various nonlinear control techniques; with few exceptions most of these algorithms can also be applied to exactly actuated parallel robots with minimal modification.

Redundantly Actuated Parallel Robots
As described earlier, parallel robots exhibit a much more diverse range of singularity behavior than their serial counterparts, many of which depend on the choice of actuated joints (actuator singularities). One way to eliminate actuator singularities is via redundant actuation, i.e., the total degrees of freedom of the actuated joints exceeds the kinematic mobility of the mechanism. Redundant actuation offers some protection in the event of failed actuators and, when combined with an appropriate control law, offers an effective means of reducing joint backlash, increasing speed and payload and stiffness, controlling compliance through the generation of internal forces, and even improving power efficiency (as an analogy, the human musculoskeletal system is redundantly actuated by antagonistic muscles). Of course, redundant actuation introduces a new set of control challenges, since the control inputs must be designed so as not to conflict with the kinematic constraints inherent in the parallel robot; loosely speaking, the actuated joints can no longer be independently controlled, since the consequences of unintended antagonistic actuation may be catastrophic.

The control of cooperating manipulators (see  Optimal Control and Mechanics) has a long history in robotics, and many of the control techniques developed for such multi-arm systems can also be applied to redundantly actuated parallel robots. One can also apply the control strategies developed for exactly actuated parallel robots to the redundantly actuated case, but modifications are necessary to account for the different structure of the dynamic equations.

Like all model-based control algorithms, the above control laws are subject to model uncertainties. Whereas in serial chains the effects of model uncertainty simply lead to errors in tracking, for redundantly actuated parallel robots, the consequences can lead to internal forces in addition to end-effector tracking errors. Perhaps the most significant effect of any modeling errors is that, unlike the serial chain case, the kinematic errors can potentially alter the shape of the configuration space (recall that the configuration space will in general be a curved space for closed chains) and also interfere with any PD feedback introduced into the control. The development of control laws that are robust to such modeling errors and disturbances remains an open and ongoing area of research in parallel robot control.

Force Control

Both hybrid force-position control and impedance control are well-known and widely applied concepts in serial robots and can be extended in a straightforward manner to exactly actuated parallel robots. Recall that the basic feature of hybrid force-position control is that the task space is decomposed into force- and position-controlled directions, whereas in impedance control, the goal is to have the robot maintain a certain desired spatial stiffness in the task space. Controllers that combine aspects of force-position and impedance control have also been proposed and developed for both serial robots and exactly actuated parallel robots. Modeling errors will cause deviations in both the force- and position-controlled directions – leading to motions in force-controlled directions and forces in position-controlled directions – which can be addressed by, e.g., a switching control strategy.

The problem of force control for redundantly actuated parallel robots, which encompasses both force-position and impedance control, has also received some attention in the literature.
The main difference with the exactly actuated case is that internal forces can now be generated, which requires a more detailed and coordinate-invariant examination of stiffness. Control methods that combine elements of force-position and impedance control for redundantly actuated parallel robots have received only limited attention in the literature.

Cross-References

 Linear Matrix Inequality Techniques in Optimal Control
 Optimal Control and Mechanics
 Optimal Control via Factorization and Model Matching
 Optimal Sampled-Data Control

Recommended Reading

The monograph (Merlet 2006) offers a detailed and comprehensive treatment of all aspects of parallel robots, with a particularly thorough treatment of the kinematics and singularity analysis. Mueller (2008) provides an excellent survey of the dynamics and control of redundantly actuated parallel robots and is based on the preceding work (Mueller 2005). Cheng et al. (2003) examines in detail the dynamic model for redundantly actuated parallel robots and the basic control strategies; Nakamura and Ghodoussi (1989) also examines dynamic models for redundantly actuated parallel robots. Stiffness analysis and control of redundantly actuated parallel robots are addressed in Yi and Freeman (1993), Chakarov (2004), and Fasse and Gosselin (1998). Analysis of specific parallel robots engaged in various control tasks includes Caccavale et al. (2003), Honegger et al. (1997), Kim et al. (2001), and Satya et al. (1995). The basic references on robot control are Murray et al. (1994) and Spong et al. (2006); Anderson and Spong (1988) treats hybrid impedance control, and Ghorbel (1995) focuses on PD control for closed chains.

Bibliography

Anderson RJ, Spong MW (1988) Hybrid impedance control of robotic manipulators. IEEE J Robot Autom 4:549–556
Caccavale F, Siciliano B, Villani L (2003) The Tricept robot: dynamics and impedance control. IEEE/ASME Trans Mechatron 8:263–268
Chakarov D (2004) Study of the antagonistic stiffness of parallel manipulators with actuation redundancy. Mech Mach Theory 39:583–601
Cheng H, Yiu Y-K, Li Z (2003) Dynamics and control of redundantly actuated parallel manipulators. IEEE Trans Mechatron 8(4):483–491
Fasse ED, Gosselin CM (1998) On the spatial impedance control of Gough-Stewart platforms. In: Proceedings of the IEEE international conference on robotics and automation, Leuven, pp 1749–1754
Ghorbel F (1995) Modeling and PD control of closed-chain mechanical systems. In: Proceedings of the IEEE conference on decision and control, New Orleans
Honegger M, Codourey A, Burdet E (1997) Adaptive control of the Hexaglide, a 6-DOF parallel manipulator. In: Proceedings of the IEEE international conference on robotics and automation, Washington, DC, pp 543–548
Kim J, Park FC, Ryu SJ, Kim J, Hwang JC, Park C, Iurascu CC (2001) Design and analysis of a redundantly actuated parallel mechanism for rapid machining. IEEE Trans Robot Autom 17(4):423–434
Merlet JP (1988) Force feedback control of parallel manipulators. In: Proceedings of the IEEE international conference on robotics automation, Philadelphia, pp 1484–1489
Merlet JP (2006) Parallel robots. Springer, Heidelberg
Mueller A (2005) Internal prestress control of redundantly actuated parallel manipulators and its application to backlash avoiding control. IEEE Trans Robot 21(4):668–677
Mueller A (2008) Redundant actuation of parallel manipulators. In: Wu H (ed) Parallel manipulators: towards new applications. I-Tech Publishing, Vienna
Murray R, Li ZX, Sastry S (1994) A mathematical introduction to robotic manipulation. CRC Press, Boca Raton
Nakamura Y, Ghodoussi M (1989) Dynamics computation of closed-link robot mechanisms with nonredundant and redundant actuators. IEEE Trans Robot Autom 5(3):294–302
Satya SM, Ferreira PM, Spong MW (1995) Hybrid control of a planar 3-DOF parallel manipulator for machining operations. Trans N Am Manuf Res Inst/SME 23:273–280
Spong MW, Hutchinson S, Vidyasagar M (2006) Robot modeling and control. Wiley, New York
Yi B-J, Freeman RA (1993) Geometric analysis of antagonistic stiffness in redundantly actuated parallel mechanisms. J Robot Syst 10:581–603
Particle Filters 1037
Particle Filters

Fredrik Gustafsson
Division of Automatic Control, Department of Electrical Engineering, Linköping University, Linköping, Sweden

Abstract

The particle filter computes a numeric approximation of the posterior distribution of the state trajectory in nonlinear filtering problems. This is done by generating random state trajectories and assigning a weight to them according to how well they predict the observations. The weights are instrumental in a resampling step, where trajectories are either kept or thrown away. This exposition will focus on explaining the main principles and the main theory in an intuitive way, illustrated with figures from a simple scalar example. A real-time application is used to graphically show how the particle filter solves a nontrivial nonlinear filtering problem.

Keywords

Estimation; Kalman filter; Nonlinear filtering; Sequential Monte Carlo

Introduction

The particle filter computes an arbitrarily good solution to nonlinear filtering problems. The goal in nonlinear filtering is to compute the posterior distribution of the state vector in a dynamic model, given measurements that are related to the state. Bayes rule provides a recursive but computationally intractable solution. Monte Carlo (MC) methods can essentially solve all Bayesian inference problems. However, for nonlinear filtering, the complexity increases exponentially in time. The MC approach would be to generate a large number of state trajectories (called particles) and their corresponding sequences of predicted measurements and then weigh together the trajectories according to how well the predicted and actual measurement sequences match each other.

With increasing time, the fit is deemed to be poor, since the state space increases exponentially in time. This is usually referred to as the depletion (or degeneracy) problem. The approach in the particle filter is to simulate only one step at a time and then resample the trajectories if needed. For this reason, the particle filter is sometimes referred to as a sequential Monte Carlo method. The resampling step keeps the trajectories that give a good fit, while the bad ones are discarded. The novel idea in the particle filter when it was first published in 1993 was the introduction of this resampling step.

Depletion is still a problem, despite the resampling step. Mitigating depletion has ever since the beginning been the most pressing issue in applied particle filtering. This tutorial will present the basic particle filter algorithm and discuss ways to avoid depletion problems both in general terms and in a simple example.

The particle filter computes an approximation to the Bayes optimal filter, conditioned on a sequence of observations and a nonlinear non-Gaussian system. It is important to note that the PF approximates the posterior distribution of the state trajectory, from which the mean and covariance are easily extracted. In contrast, the extended Kalman filter (EKF) computes the mean and covariance for an approximate dynamical system (linearized with Gaussian noise). The unscented Kalman filter (UKF) likewise also approximates the mean and covariance. Both EKF and UKF can only approximate unimodal (one peak) posterior distributions. There are filter bank approximations, like the interacting multiple model (IMM) algorithm, that can keep track of a given number of modes in the posterior.
However, the PF does this in a more natural way.

The Basic Particle Filter

Nonlinear filtering aims at estimating the distribution of a state sequence x_{1:N} = (x_1, x_2, ..., x_N) from a sequence of observations y_{1:N} = (y_1, y_2, ..., y_N), given a state space model of the form

    x_{k+1} = f(x_k, v_k)  or  p(x_{k+1} | x_k),    (1a)
    y_k = h(x_k, e_k)  or  p(y_k | x_k).    (1b)

Here, v_k denotes process noise, and e_k is the measurement noise. The stochastic variables v_k, e_k for all k and x_0 are assumed mutually independent, with known distributions p_v, p_e, and p_{x0}, which are all part of the model specification.

The particle filter (PF) works with a set of random trajectories. Each trajectory is formed recursively by iteratively simulating the model with some randomness and then updating the likelihood of each trajectory based on the observation. In words, we first evaluate the set of particles at hand by comparing how well they predict the current observation. In this way, the particles are assigned a weight. We keep the particles with large weight and throw away the particles with small weight, using a stochastic resampling procedure. After this step, we get a smaller set of particles, where many particles have several replicas. We then simulate each particle to the next observation time using the dynamical model. After this prediction step, all particles will be unique (because they are based on different realizations of the process noise).

Below, the basic algorithm (sometimes called bootstrap PF or sequential importance resampling (SIR) PF) is summarized:
• Define a set of random states (particles) by sampling x_0^(i) ~ p_{x0}(x_0).
• Iterate in k = 0, 1, ...:
  1. Measurement update: Compute the weight ω_k^(i) = p_e(y_k − h(x_k^(i))) and normalize so that Σ_{i=1}^N ω_k^(i) = 1.
  2. Resampling: Resample each particle with probability ω_k^(i).
  3. Time update: Simulate one time step by taking v_k^(i) ~ p_v(v_k) and then set x_{k+1}^(i) = f(x_k^(i), v_k^(i)).

The main design parameter here is the number N of particles. A common trick to make the filter more robust is to increase the variance of p_v and p_e above. This is called dithering (or jittering) and is a practical way to get more robust nonlinear filters.

To illustrate some of the aspects and for later reference, a simple example will be introduced.

Example: First-Order Linear Gaussian Model

The Kalman filter (KF) provides the posterior distribution in an analytical form for linear Gaussian models and is thus suitable for evaluations and comparisons. A linear Gaussian model looks like

    x_{k+1} = F x_k + v_k,  v_k ~ N(0, Q),    (2a)
    y_k = H x_k + e_k,  e_k ~ N(0, R),    (2b)
    x_0 ~ N(μ_0, P_0).    (2c)

We will use the scalar case for the illustrations, and the figures that follow are based on F = 0.9, H = 1, Q = 1, R = 0.01, P_0 = 1. The particle filter in the scalar case simplifies to the Matlab algorithm in Table 1. Figure 1 compares the sample-based representation of the PF with the Gaussian distribution provided by the KF for the first two time steps. This shows how well the marginal distribution p(x_k | y_{1:k}) is approximated by the samples x_k^(i) from the PF. A rule of thumb is that 30 samples are needed to approximate a univariate Gaussian distribution. As will be discussed later, the number of samples is effectively only 10 here, which explains the small deviation of the Gaussian functions.

To illustrate the fundamental depletion problem in the PF, the set of trajectories x_{1:k}^(i) that
Particle Filters, Table 1 Matlab code for scalar linear Gaussian model

Particle Filters, Fig. 1 Set of samples {x_k^(i)}_{i=1:N} compared first to a Gaussian approximation of the particles (blue) and second to the true posterior distribution provided by the Kalman filter (green)

approximates the posterior (smoothing) distribution p(x_k | y_{1:k}) is illustrated in Fig. 2. The upper plot shows a case where the trajectories are all the same initially. The behavior in the upper plot is typical for the basic particle filter in cases where the measurements are more informative than the state transition model (small measurement noise, R < Q). The lower plot shows a particle filter that is working better, and the modification is explained in the following section.

Proposal Distributions

The time update in the basic PF predicts particles in step 3 according to the dynamic model. The most general derivation of the particle filter allows for sampling from a more general proposal (also called importance) distribution. This proposal distribution can be any function that can be sampled from, and it can depend on both the previous state and the current measurement. From a filtering perspective, it may appear as
Particle Filters, Fig. 2 Set of trajectories {x_{1:10}^(i) − x_{1:10}}_{i=1:N} for two different proposal distributions, one bad one (prior) leading to particle depletion and one good one (likelihood). Note that the smoothing distribution p(x_k | y_{1:10}) can be approximated with the set of particles {x_k^(i)}_{i=1:N} using the marginalization principle

"cheating" to look at the next measurement when doing the time update, but one has to look at a full cycle of the iteration scheme.

If a proposal distribution of the functional form q(x_k | x_{k−1}, y_k) is used, then steps 1 and 3 have to be modified as follows:
  1. Weight update: Time and measurement updates:

    w_{k|k−1}^(i) ∝ w_{k−1|k−1}^(i) p(x_k^(i) | x_{k−1}^(i)) / q(x_k^(i) | x_{k−1}^(i), y_k),    (3a)
    w_{k|k}^(i) ∝ w_{k|k−1}^(i) p(y_k | x_k^(i)).    (3b)

  3. Prediction: Generate samples from the proposal

    x_{k+1}^(i) ~ q(x_{k+1} | x_k^(i), y_{k+1}).    (3c)

The most natural proposal distributions are the following:
• The prior q(x_{k+1} | x_k^(i), y_{k+1}) = p(x_{k+1} | x_k^(i)), as used in the basic PF.
• The likelihood q(x_{k+1} | x_k^(i), y_{k+1}) ∝ p(y_{k+1} | x_{k+1}). For the model (2), the proposal becomes N(y_{k+1}/H, R/H²).
• The optimal (minimizing weight variance) choice q(x_{k+1} | x_k^(i), y_{k+1}) ∝ p(y_{k+1} | x_{k+1}) p(x_{k+1} | x_k^(i)). For the model (2), the optimal proposal is provided by one cycle of the Kalman filter, initialized with the particle x_k^(i).

The optimal proposal keeps the weights constant, and this would in theory avoid depletion, where depletion is interpreted as excessive weight variance. Figure 2 compares the set of trajectories for the prior and likelihood proposals, respectively. Apparently, the likelihood proposal is to be preferred here, since it suffers less from depletion in the particle history. The practical limitation with the last two alternatives is that one has to be able to sample from the likelihood, so in practice there needs to be more measurements than states in the model.
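For concreteness, here is a pure-Python sketch of the summarized bootstrap (SIR) algorithm on the scalar model (2), with the parameter values used in the text (F = 0.9, H = 1, Q = 1, R = 0.01, P0 = 1). Table 1 gives a Matlab version; this re-implementation, including the log-domain weight normalization, is an illustration, not the original listing:

```python
import math
import random

random.seed(0)
F, H, Q, R, P0 = 0.9, 1.0, 1.0, 0.01, 1.0
N = 500   # number of particles

def run_pf(ys):
    x = [random.gauss(0.0, math.sqrt(P0)) for _ in range(N)]   # x0 ~ p_x0
    estimates = []
    for y in ys:
        # 1. measurement update: Gaussian likelihood p_e(y - H*x), evaluated
        #    in the log domain to avoid underflow, then normalized
        logw = [-0.5 * (y - H * xi) ** 2 / R for xi in x]
        mx = max(logw)
        w = [math.exp(lw - mx) for lw in logw]
        s = sum(w)
        w = [wi / s for wi in w]
        estimates.append(sum(wi * xi for wi, xi in zip(w, x)))  # posterior mean
        # 2. resampling (multinomial, every step as in the basic PF)
        x = random.choices(x, weights=w, k=N)
        # 3. time update: propagate through the dynamics with process noise
        x = [F * xi + random.gauss(0.0, math.sqrt(Q)) for xi in x]
    return estimates

# simulate the true system (2) and filter its measurements
xt = random.gauss(0.0, math.sqrt(P0))
truth, ys = [], []
for _ in range(50):
    truth.append(xt)
    ys.append(H * xt + random.gauss(0.0, math.sqrt(R)))
    xt = F * xt + random.gauss(0.0, math.sqrt(Q))

est = run_pf(ys)
print(abs(est[-1] - truth[-1]))   # small: R is much smaller than Q here
```

Swapping step 3 for draws from a likelihood or optimal proposal, with the weight update (3a)–(3b), is a local change to this loop.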
Particle Filters, Fig. 3 The efficient number of particles N_eff(k)/N for prior proposal (blue) and likelihood proposal (green) (N = 1,000)

Adaptive Resampling

Resampling is crucial to avoid depletion. Without resampling, all trajectories except for one will get zero weight quite quickly. However, there is no need to resample at every iteration. Actually, resampling increases the weight variance, which is undesired. The question is how to decide if resampling is needed. The efficient number of particles, estimated as

    N_eff(k) = 1 / Σ_{i=1}^N (w_{k|k}^(i))²,    (4)

is one suitable indicator. If all particles have the same weight w_{k|k}^(i) = 1/N, then N_eff(k) = N. Conversely, if one weight is one and all others zero, then N_eff(k) = 1. Thus, N_eff can be interpreted as a measure of how many particles actually contribute to the solution.

Figure 3 shows the evolution of N_eff(k) for the prior and likelihood proposals, respectively. With resampling in every iteration, the likelihood proposal performs very well with N_eff(k) ≈ N, while the prior proposal effectively uses only 10 % of the particles. Thus, N_eff(k) is a good indicator of the quality of the proposal distribution.

In Fig. 1, N = 100, so effectively 10 samples are contributing to the Gaussian approximation, which as mentioned before is too small a number to get a good result.

As a comparison, Fig. 3 also shows N_eff when resampling is never used; then N_eff(k) normally decreases over time. The likelihood proposal does not decrease as fast as the prior proposal, and for this very short data sequence resampling is really not needed at all.

In summary, resampling increases weight variance and decreases the performance of the filter. On the other hand, without resampling the effective number of particles converges monotonically to only one. So, the idea of adaptive resampling is natural. The key idea is to resample only if the effective number of particles is small. The usual rule of thumb is that resampling is needed if N_eff(k) < 2N/3.
Resampling was the main contribution in Gordon et al. (1993) to get a working algorithm.

Marginalization

The posterior distribution approximation provided by the particle filter converges with the number of particles. In theory, the convergence rate is g_k/N, where g_k is a polynomial function in time k. In practice, it appears that the required number of particles increases very quickly with state dimension. Unless a very good proposal distribution is found, the practical limit for the state dimension is around 3–4 as a rough rule of thumb.

For applications with a large number of states, one can in many cases still use the particle filter. The idea is to find a linear Gaussian substructure in the model and then divide the state vector x_k into two parts: x_k^l for the states that appear linearly and x_k^n for the remaining states. Bayes rule provides the factorization

    p(x_k^l, x_{1:k}^n | y_{1:k}) = p(x_k^l | x_{1:k}^n, y_{1:k}) p(x_{1:k}^n | y_{1:k}).    (5)

With a linear Gaussian substructure for x_k^l, given the whole trajectory x_{1:k}^n, the Kalman filter applies and provides a Gaussian distribution for the first factor in (5). The second factor is resolved using a marginalization procedure so that the particle filter can be applied.

The bottom line is that each particle is associated with one Kalman filter. The method is called the Rao-Blackwellized particle filter, or marginalized particle filter, in the literature.

Illustrative Application: Navigation by Map Matching

A nontrivial application where the PF solves a nonlinear filtering problem, where Kalman filter-based approaches would fail, is described in Forssell et al. (2002). The problem is to compute a robust estimate of the position of a car, without using infrastructure such as cellular networks or satellites. The approach is based on measuring wheel speeds on one axle and from these dead reckoning a nominal trajectory using standard odometric formulas. A road map is then used as a measurement, to rule out impossible or unlikely maneuvers. There is no numeric measurement y_k in this approach, but the likelihood p(y_k | x_k) in the model (1b) is large when x_k corresponds to a road position and decays quickly to zero outside the road network. Figure 4 illustrates how the particle cloud gradually focuses around the true position with time. In particular, the number of modes in the posterior distributions is rather high initially, but decreases over time, in particular after each turn.

The particle filter is sometimes believed to be too computer intensive for real-time applications. As described in Forssell et al. (2002), this demonstrator implemented a particle filter on a pocket computer anno 2001 with N = 15,000 particles running at 10 Hz. Thus, the computational complexity of the particle filter should not be overemphasized in practice.

Summary and Future Directions

The particle filter can be seen as a black-box solution to the nonlinear filtering problem, where any nonlinear dynamical model with arbitrary noise distributions can be plugged in. The main tuning parameter is the number of particles, and the PF will work in theory if this number is large enough. In practice, there are many tricks the user has to be aware of to mitigate the curse of dimensionality (depletion) that occurs for large state spaces (more than three) or long time sequences (more than a couple of samples). One engineering trick is dithering, to increase the variance in the involved noise distributions from their nominal values. More theoretical ways to mitigate depletion include clever choices of proposal distributions to sample from and marginalization (to solve a subset of the estimation problem with a Kalman filter).

Current and future research directions include the issues above. A further trend concerns the related smoothing problem, which is
Particle Filters, Fig. 4 Car navigation using wheel speed and a street map. The figures illustrate how the particle representation of the posterior distribution of position (course state is not shown) improves over time. After four turns, the posterior is essentially unimodal, and a position marker can be shown. The circle denotes the GPS position, which is only used for comparison. (a) After first turn. (b) After second turn. (c) After third turn. (d) After fourth turn

interesting in itself, but which has turned out to be instrumental in joint state and parameter estimation problems. There are also many attempts to make the particle filter more robust, including ideas of filter banks and invoking a second layer of sampling algorithms to implement the proposal distribution. There is also a trend to use the particle filter as a computational engine for more complex problems, such as the simultaneous localization and mapping (SLAM) problem, and to approximate the probability hypothesis density (PHD) for multi-sensor multi-target tracking. Finally, there are a large number of papers reporting on applications in traditional as well as new disciplines.
1044 Perturbation Analysis of Discrete Event Systems
Cross-References

 Estimation, Survey on
 Extended Kalman Filters
 Kalman Filters
 Nonlinear Filters
 Nonlinear System Identification Using Particle Filters

Recommended Reading

Particle filtering (PF) as a research area started with the seminal paper (Gordon et al. 1993) and the independent developments in Kitagawa (1996) and Isard and Blake (1998). The state of the art is summarized in the article collection Doucet et al. (2001), the surveys Liu and Chen (1998), Arulampalam et al. (2002), Djuric et al. (2003), Cappé et al. (2007), and Gustafsson (2010), and the monograph Ristic et al. (2004).

Bibliography

Arulampalam S, Maskell S, Gordon N, Clapp T (2002) A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Trans Signal Process 50(2):174–188
Cappé O, Godsill SJ, Moulines E (2007) An overview of existing methods and recent advances in sequential Monte Carlo. IEEE Proc 95:899
Djuric PM, Kotecha JH, Zhang J, Huang Y, Ghirmai T, Bugallo MF, Miguez J (2003) Particle filtering. IEEE Signal Process Mag 20:19
Doucet A, de Freitas N, Gordon N (eds) (2001) Sequential Monte Carlo methods in practice. Springer, New York
Forssell U, Hall P, Ahlqvist S, Gustafsson F (2002) Novel map-aided positioning system. In: Proceedings of FISITA, Helsinki, number F02-1131
Gordon NJ, Salmond DJ, Smith AFM (1993) A novel
Liu JS, Chen R (1998) Sequential Monte Carlo methods for dynamic systems. J Am Stat Assoc 93:1032–1044
Ristic B, Arulampalam S, Gordon N (2004) Beyond the Kalman filter: particle filters for tracking applications. Artech House, London

Perturbation Analysis of Discrete Event Systems

Yorai Wardi¹ and Christos G. Cassandras²
¹School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, USA
²Division of Systems Engineering, Center for Information and Systems Engineering, Boston University, Brookline, MA, USA

Abstract

Perturbation analysis (PA) is a systematic methodology for estimating the sensitivities (gradient) of performance measures in discrete event systems (DES) with respect to various model or control parameters of interest. PA takes advantage of the special structure of DES sample realizations and is based entirely on observable system data. In particular, it does not require knowledge of the stochastic characterizations of the random processes involved and is simple to implement in a nonintrusive manner. PA estimators, therefore, enable implementations for real-time control in addition to off-line optimization. The article presents the main ideas and statistical properties of PA techniques for both DES and recent generalizations to stochastic hybrid systems (SHS), especially for the simplest class of sensitivity estimators known
Novel map-aided positioning system. In: Proceedings
of FISITA, Helsinki, number F02-1131
stochastic hybrid systems (SHS), especially for
Gordon NJ, Salmond DJ, Smith AFM (1993) A novel the simplest class of sensitivity estimators known
approach to nonlinear/non-Gaussian Bayesian state es- as infinitesimal perturbation analysis (IPA).
timation. IEE Proc Radar Signal Process 140:107–113
Gustafsson F (2010) Particle filter theory and practice
with positioning applications. IEEE Trans Aerosp
Electron Mag Part II Tutor 7:53–82
Isard M, Blake A (1998) Condensation – conditional Keywords
density propagation for visual tracking. Int J Comput
Vis 29(1):5–28
Gradient estimation; Sample-path techniques;
Kitagawa G (1996) Monte Carlo filter and smoother for
non-Gaussian nonlinear state space models. J Comput Sensitivity analysis; Stochastic flow models;
Graph Stat 5(1):1–25 Queueing networks

Introduction

Sensitivity analysis is an essential component of the system design process in a wide variety of application areas. In essence, it provides quantitative variations of performance metrics resulting from possible perturbations in design set points and, hence, can be used in optimization and control as well as provide measures of performance robustness. Perturbation analysis (PA) is a systematic technique for computing sample-based sensitivity estimators of performance metrics in discrete event systems (DES) by using the special properties of their sample realizations. The effectiveness of such estimators (e.g., unbiasedness) depends on the characteristics of the DES to which PA is applied and on the specific performance metric of interest. The purpose of this article is to present and explain some of the main ideas and fundamental techniques of PA.

Figure 1 depicts an abstract schematic where the operation of a stochastic system depends on a parameter θ that is chosen from a given set Θ. Let J(θ) be an expected value performance function of the system, and suppose that J(θ) = E[L(θ)], where E[·] denotes expectation and L(θ) is a sample realization computable from a sample path of the system. In many situations J(θ) lacks a closed-form expression, and its sample realization, L(θ), provides the most practical way for its estimation. Applications of sensitivity analysis often concern the effects of perturbations in the parameter θ on the sample realization L(θ). Denoting such perturbations by Δθ, their effects can be characterized by the difference term L(θ + Δθ) − L(θ). PA provides such difference terms from the same sample path that was used for computing L(θ). Furthermore, if θ ∈ R^n and the function L(θ) is differentiable, then PA can compute the gradient term ∇L(θ) from the same sample path. These sample path-based sensitivities can be used, under suitable conditions, to estimate the quantities J(θ + Δθ) − J(θ) and ∇J(θ), respectively.

Perturbation Analysis of Discrete Event Systems, Fig. 1 Framework for perturbation analysis (PA)

The PA theory was pioneered by Yu-Chi Ho, who led its eventual development by his own group and other researchers over two decades. The early works were motivated by optimal resource management problems in manufacturing and concerned the effects of buffer allocation on throughput in transfer lines (Ho and Cassandras 1983; Ho et al. 1979). Subsequently PA was developed in the setting of queueing networks by virtue of their wide use as models in applications. In this setting, typically θ is a set point parameter affecting service times, inter-arrival times, routing fractions, buffer sizes, and various flow control laws; J(θ) is an expected value performance metric like average delay, throughput, and loss; and L(θ) is a sample realization of J(θ). The special structure of sample paths of queueing networks often yields simple PA algorithms for the difference terms L(θ + Δθ) − L(θ), as well as the gradient term ∇L(θ), from the common sample path. The PA techniques for computing L(θ + Δθ) − L(θ) are collectively referred to as finite perturbation analysis (FPA), while those for computing ∇L(θ) are called infinitesimal perturbation analysis (IPA) (Ho et al. 1983). Much of the development of PA has focused on IPA, rather than FPA, due to its greater simplicity and natural use in optimization, and, hence, it will be the focal point of this article. For comprehensive expositions of PA and its various techniques, please see Ho and Cao (1991), Glasserman (1991), and Cassandras and Lafortune (2008).

The purpose of the IPA gradient, ∇L(θ), is to estimate ∇J(θ). This, however, is only useful as long as ∇L(θ) is an unbiased realization of ∇J(θ), namely,

E[∇L(θ)] = ∇E[L(θ)] = ∇J(θ),

and in this case it is said that IPA is unbiased (Cao 1985). Since J(θ) = E[L(θ)], unbiasedness means the commutativity of the operators of differentiation with respect to θ and integration (expectation) in the probability space, and this is
closely related to the condition that, w.p.1, the random function L(θ) is continuous throughout Θ. As a matter of fact, the two conditions are practically synonymous. However, shortly after the emergence of IPA, it became apparent that in many queueing models of interest, L(θ) is not continuous and, hence, IPA is not unbiased (Heidelberger et al. 1988). Subsequently various techniques to overcome this problem were explored, including the so-called cut and paste of the sample paths and re-parametrization of the underlying probability space via statistical conditioning. For a more comprehensive coverage of such techniques, please see Cassandras and Lafortune (2008) and references therein. These techniques can yield unbiased gradient estimators in principle but often at the expense of prohibitive computing costs. Recently an alternative approach has emerged, based on stochastic flow models (SFM) that are comprised of fluid queues (Cassandras et al. 2002). It extends, significantly, the class of models and problems where IPA is unbiased and has the added advantage of yielding very simple gradient estimators.

The following sections of this article present IPA in the general setting of DES, explain the limits of its scope in queueing models, describe alternative PA techniques for extending those limits, present the SFM approach, and conclude with some thoughts on future research directions.

DES Setting for IPA

IPA can be applied to any DES modeled as a stochastic timed automaton, defined in Cassandras and Lafortune (2008) and discussed in "Models for Discrete Event Systems: An Overview". Briefly, a stochastic timed automaton is a sextuple (E, X, Γ, p, p_0, G), where E is an event set, X is a state space, and Γ(x) ⊆ E is the set of feasible events when the state is x, defined for all x ∈ X. The initial state is drawn from p_0(x) = P[X_0 = x]. Subsequently, given that the current state is x, with each feasible event i ∈ Γ(x), we associate a clock value Y_i, which represents the time until event i is to occur. Thus, comparing all such clock values, we identify the triggering event E′ = arg min_{i∈Γ(x)} {Y_i}, where Y* = min_{i∈Γ(x)} {Y_i} is the inter-event time (the time elapsed since the last event occurrence). To simplify the notation we define e′ := E′. Thus, with e′ determined, the state transition probabilities p(x′; x, e′) are used to specify the next state x′. Finally, the clock values are updated: Y_i is decremented by Y* for all i (other than the triggering event) which remain feasible in x′, while the triggering event (and all other events which are activated upon entering x′) is assigned a new lifetime sampled from a distribution G_i. The set G = {G_i : i ∈ E} defines the stochastic clock structure of the automaton.

Let T_{α,n} denote the nth occurrence time of event α ∈ E, and let V_{α,n} denote a realization of the event lifetime distribution G_α such that V_{α,n} = T_{α,n} − T_{β,m} for some (any) event β ∈ E and m ∈ {1, 2, ...}. One can then always write T_{α,n} = V_{β1,k1} + ... + V_{βs,ks} for some s. Let us now consider a parameter θ ∈ R which can only affect one or more of the event lifetime distributions G_α(x; θ); in particular, θ does not affect the state transition mechanism. The case where θ ∈ R^n can be handled in similar ways, but the one-dimensional case permits us to use the derivative notation rather than the gradient symbol, which simplifies the presentation. Viewing lifetimes as functions of θ, V_{α,k}(θ), it can be shown (under mild technical conditions, see Glasserman (1991)) that

dV_{α,k}/dθ = − [∂G_α(x; θ)/∂θ]_{(V_{α,k}, θ)} / [∂G_α(x; θ)/∂x]_{(V_{α,k}, θ)},    (1)

where the subscript (V_{α,k}, θ) indicates that the corresponding derivative of G_α is evaluated at the point (V_{α,k}, θ). This describes how a perturbation in θ generates a perturbation in the associated event lifetime V_{α,k}. Such a perturbation can now propagate through the DES to affect various event occurrence times according to the dynamics prescribed by the stochastic timed automaton. Event time derivatives dT_{α,n}(θ)/dθ are given by

dT_{α,n}/dθ = Σ_{β,m} (dV_{β,m}/dθ) η(α, n; β, m),    (2)
where η(α, n; β, m) is a triggering indicator taking values in {0, 1} so that η(α, n; β, m) = 1 if the nth occurrence of event α is triggered by the mth occurrence of β, η(α, n; α, n) = 1 for all α ∈ E, η(α, n; β′, m′) = 1 if η(α, n; β, m) = 1 and η(β, m; β′, m′) = 1, and η(α, n; β, m) = 0 otherwise.

This leads to a general-purpose algorithm for evaluating event time derivatives along an observed sample path (see Algorithm 1) of a DES modeled as a stochastic timed automaton. In particular, we define a perturbation accumulator, Δ_α, for every event α ∈ E. The accumulator Δ_α is updated at event occurrences in two ways: (i) It is incremented by dV_α/dθ whenever an event α occurs, and (ii) it is coupled to an accumulator Δ_β whenever an event β (possibly β = α) occurs that activates an event α. No particular stopping condition is specified, since this may vary depending on the problem of interest.

Algorithm 1 General-purpose IPA algorithm for stochastic timed automata
1. Initialization
   If event α is feasible at x_0: Δ_α := dV_{α,1}/dθ
   Else, for all other α ∈ E: Δ_α := 0
2. Whenever event β is observed
   If event α is activated with new lifetime V_α:
   2.1. Compute dV_α/dθ through (1)
   2.2. Δ_α := Δ_β + dV_α/dθ

Sample Function Derivatives. Since many sample performance functions L(θ) of interest can be expressed in terms of event times T_{α,n}, we can use (2) and Algorithm 1 in order to obtain derivatives of the form dL(θ)/dθ. As an example, a large class of such functions is of the form

L_T(θ) = ∫_0^T C(x(t, θ)) dt,

where C(x(t, θ)) is a bounded cost associated with operating the system at state x(t, θ). Then,

dL_T/dθ = Σ_{k=1}^{N(T)} (dT_k/dθ) [C(x_{k−1}) − C(x_k)],

where N(T) counts the total number of events observed in [0, T] and x_k is the state remaining fixed in any interval (T_k, T_{k+1}) with T_k = T_{α,n} for some α ∈ E, n = 1, 2, ....

Estimation of Performance Measure Derivatives. Using dL_T(θ)/dθ from above, we can obtain unbiased estimates of dJ/dθ if the following condition holds:

dJ(θ)/dθ = E[dL_T(θ)/dθ].

As mentioned earlier, this key condition is closely related to the continuity of the sample performance functions. A discontinuity often is caused by a swap in the order of two events that results from small variations in θ and yields different future state trajectories. However, it is possible that the future state trajectory following the occurrence of the two events is invariant under their order, and in this case the two events are said to commute. This commuting condition, defined by Glasserman, was shown to be identical, under broad assumptions, to the continuity of the sample functions L(θ) and, hence, to the unbiasedness of IPA (Glasserman 1991).

The main ideas discussed in this section will next be illustrated on a simple queue.

Queueing Example
Consider the IPA gradient (derivative) of the expected sojourn time (delay) in a GI/G/1 queue with respect to a real-valued parameter of its arrival process. Assume that the queue is empty at time t = 0, and it serves its customers according to the order of their arrivals. Let us denote by a_k, k = 1, 2, ..., the arrival times, and by s_k, k = 1, 2, ..., the service times of consecutive customers. Furthermore, we denote by v_k, k = 1, 2, ..., the kth inter-arrival time, namely, v_k = a_k − a_{k−1}, where a_0 := 0. Observe that the queue is a stochastic timed automaton as defined earlier,
with event space {arrival, departure}, state space {0, 1, ...} representing the queue's occupancy, and feasible event set {arrival} when x = 0 and {arrival, departure} when x > 0.

Let θ ∈ R be a parameter of the distribution of the inter-arrival times, and, hence, the realizations of v depend on θ in a functional manner. For example, if the arrival process is Poisson and θ is its rate, then a realization of the inter-arrival times has the form v = −θ^{-1} ln(1 − ω), where ω ∈ [0, 1] is a uniform variate. To emphasize the dependence of v on θ, we denote it by v(θ), while its dependence on ω is implicit. Similarly, the arrival times depend on θ and, hence, are denoted by a_k(θ), but the service times s_k do not depend on θ. The departure time of the kth customer and its delay depend on θ and are denoted by d_k(θ) and D_k(θ), respectively. The forthcoming paragraphs discuss the derivatives of these functions with respect to θ; we use the prime symbol to indicate such derivatives in order to simplify the notation.

Define the sample performance function

L_N(θ) := N^{-1} Σ_{k=1}^{N} D_k(θ)

for a given N > 0. Under stability conditions, with probability 1 (w.p.1), lim_{N→∞} L_N(θ) = J(θ), where J(θ) denotes the mean of the delay's stationary distribution. The role of IPA is to estimate J′(θ) via the sample derivative L′_N(θ), and this is justified as long as lim_{N→∞} L′_N(θ) = J′(θ) w.p.1. In this case, IPA is said to be strongly consistent. In contrast, unbiasedness pertains to the performance function J_N(θ) := E[L_N(θ)] and means that E[L′_N(θ)] = J′_N(θ). The concepts of strong consistency and unbiasedness are closely related except that the latter concerns finite-horizon processes while the former pertains to stationary distributions in steady state. Both concepts have been extensively investigated in recent years: strong consistency in the setting of Markov chains and Markov decision processes (Cao 2007) and unbiasedness in the context of stochastic hybrid systems, as will be described in the sequel. We will focus the rest of the discussion on the issue of unbiasedness.

Since L_N(θ) = N^{-1} Σ_{k=1}^{N} D_k(θ), its IPA derivative is L′_N(θ) = N^{-1} Σ_{k=1}^{N} D′_k(θ), and since D_k(θ) = d_k(θ) − a_k(θ), it follows that D′_k(θ) = d′_k(θ) − a′_k(θ). The last two derivative terms are computable via the following recursive procedures. First, a_k(θ) = a_{k−1}(θ) + v_k(θ), and, hence,

a′_k(θ) = a′_{k−1}(θ) + v′_k(θ).

The term v′_k(θ) has to be obtained directly from the realization of v, and this often is possible since such realizations depend functionally on θ; for instance, in the previous example, v = −θ^{-1} ln(1 − ω) and, hence, v′(θ) = θ^{-2} ln(1 − ω). Next, d_k(θ) is given by the Lindley equation

d_k(θ) = max{a_k(θ), d_{k−1}(θ)} + s_k,

and, therefore, denoting by ℓ_k the index of the customer that started the busy period containing customer k, we have that

d′_k(θ) = a′_{ℓ_k}(θ).

From these recursive relations it follows that D′_k(θ) = 0 if customer k starts a busy period and D′_k(θ) = −Σ_{i=ℓ_k+1}^{k} v′_i(θ) if customer k does not start a busy period.

These equations shed light on the structure of IPA in a general class of queueing networks. First there is the perturbation generation, namely, the sampling of derivative (gradient) terms directly from the sample sequence of variates (ω) which defines the sample path; that was v′(θ) in the above example. These terms drive the recursive equations that yield the IPA derivatives. The recursive equations often are based on the tracking of certain events such as the start of busy periods or idle periods, and the process of tracking the derivatives through them is referred to as perturbation propagation. In the above example it is obvious how the perturbation propagation tracks the busy periods at the queue. Furthermore, in a network setting, the perturbations can propagate from one queue to the next in a natural fashion. For instance, suppose that customers departing
from the queue analyzed above enter a second queue. Then the derivative terms d′_k(θ) of the upstream queue act as the derivative terms a′_k(θ) of the downstream queue. This structure of perturbations' generation and propagation through the tracking of busy periods and other events often yields simple recursive algorithms for computing the IPA derivatives.

Concerning the issue of unbiasedness, it is clear that in the above example, the sample function L_N(θ) is continuous in θ and, hence, IPA is unbiased. However, in many systems of interest the IPA derivative is biased. For example, consider the two-input, single-server queue shown in Fig. 2, where customers are served according to their arrival order regardless of source. Suppose that θ is a parameter of the upper arrival process, but not of the lower arrival process, and denote the respective inter-arrival times of the input streams by v_{1,k}(θ), k = 1, 2, ..., and v_{2,m}, m = 1, 2, ..., as indicated in the figure. Furthermore, let d_{1,k}(θ) denote the departure time from the queue of the kth customer that came from the upper source, and let d_{2,m}(θ) denote the departure time from the queue of the mth customer that came from the lower source. Similarly, let D_{1,k}(θ) and D_{2,m}(θ) be the delays of the kth customer from the upper source and the mth customer from the lower source, respectively. Lastly, in analogy with the previous example, consider the sample performance functions L_{1,N}(θ) := N^{-1} Σ_{k=1}^{N} D_{1,k}(θ) and L_{2,N}(θ) := N^{-1} Σ_{m=1}^{N} D_{2,m}(θ).

Perturbation Analysis of Discrete Event Systems, Fig. 2 Queue with two customer classes

The IPA derivatives L′_{1,N}(θ) and L′_{2,N}(θ) have quite similar expressions to those derived earlier, but they are not unbiased. To see this point, suppose that v_{1,k}(θ) is a monotone increasing function of θ, and consider the functions a_{1,k}(θ), d_{1,k}(θ), and d_{2,k}(θ) for a common sample path. Suppose that at some point θ̄ the order of arrivals of the nth customer from the upper source and the mth customer from the lower source is swapped. Then the service order of these customers will be swapped as well, inducing discontinuities in d_{1,k}(θ) and d_{2,m}(θ) at the point θ = θ̄. Consequently, the sample performance functions L_{1,N}(θ) and L_{2,N}(θ) also will be discontinuous at θ = θ̄, and hence, their IPA derivatives are biased. Furthermore, if the queue is a part of a network and its output process directs customers to other queues, then the discontinuities in the various traffic processes will propagate downstream.

The causes of biasedness in queueing networks include multiple customer classes, non-Markovian routing, and loss (spillover) due to finite buffers. This leaves out a limited class of networks where IPA can be unbiased and, hence, useful in applications. The following sections describe various approaches to overcome this problem.

IPA Extensions

When IPA fails (because the commuting condition is violated or a sample function exhibits discontinuities in θ), one can still use the PA approach to derive unbiased performance sensitivity estimates. There are two ways to accomplish this: (i) by modifying the stochastic timed automaton model so that IPA is "made to work" and (ii) by paying the price of more information collected from the observed sample path, in which case, the same essential PA philosophy can lead to unbiased and strongly consistent estimators, but these are no longer as simple as IPA ones. Regarding (i), the main idea here is that there may be more than one way to construct (statistically equivalent) sample paths of a stochastic DES, and while one way leads to discontinuous sample functions L(θ), another does not; a variety of such ways is provided in Cassandras and Lafortune (2008). Regarding (ii), the methodology of smoothed perturbation analysis (SPA) (Gong and Ho 1987) provides a generalization of IPA in which more information is extracted from a DES sample path in order to gain some knowledge about the magnitude of jumps in L(θ).
The main idea of SPA lies in the "smoothing property" of conditional expectation. If we are willing to extract information from a sample path and denote it by Z, called the characterization of the sample path, then we can evaluate, not just the sample function L(θ), but also the conditional expectation E[L(θ) | Z] (provided we have some distributional knowledge based on which this expectation can be evaluated). This can result in a much smoother function of θ than L(θ). Thus, starting with the condition for an IPA estimator to be unbiased,

∇J(θ) = E[∇L(θ)],

we rewrite the left-hand side above as shown below, replacing J(θ) = E[L(θ)] by the expectation of a conditional expectation:

∇J(θ) = ∇E[L(θ)] = ∇E[E[L(θ) | Z]],    (3)

where the inner expectation is a conditional one and the conditioning is on the characterization Z. Treating E[L(θ) | Z] as the new sample function, we expect it to be "smoother" than L(θ), and, in particular, continuous in θ. Then, under some additional conditions (comparable to those made in the development of IPA) the interchange of differentiation and expectation in (3) can be justified:

∇J(θ) = E[∇E[L(θ) | Z]].

Letting L_Z(θ) := E[L(θ) | Z], the SPA estimator of ∇J(θ) is

[∇J(θ)]_SPA = ∇L_Z(θ).    (4)

Naturally, the idea is to minimize the amount of added information represented by Z, since this incurs added costs we would like to avoid. The choice of the characterization Z generally depends on the sample function considered and the system under study.

A number of other extensions to IPA have also been developed (see Cassandras and Lafortune 2008). It is also worth mentioning that the PA approach can be applied to a parameter θ taking values from a finite set Θ = {θ_0, θ_1, ..., θ_M}. The theory of concurrent estimation and sample path constructability provides techniques to estimate performance measures of the DES through the process of constructing sample paths under each of θ_0, θ_1, ..., θ_M; for details see Cassandras and Lafortune (2008).

SFM Framework for IPA

The stochastic flow model (SFM) framework essentially consists of fluid queues which forego the notion of the individual customer and focus instead on the aggregate flow. In such a fluid queue, traffic and service processes are characterized by instantaneous flow rates as opposed to the arrival, departure, and service times of discrete customers. The SFM qualifies as a stochastic hybrid system with bilayer dynamics: discrete event dynamics at the upper layer and time-driven dynamics at the lower layer. The discrete events are associated with abrupt (discontinuous) changes in traffic-flow processes, such as the boundaries of busy periods at the queues. In contrast, the time-driven dynamics describe the continuous evolution of flow rates between successive discrete events, usually by differential equations or explicit functional terms. Performance metrics that are natural to SFMs typically reflect quantitative measures of flow rates, like average throughput, buffer workload, and loss.

Due to the smoothing effects of SFMs, they appear to provide a far more natural setting for IPA than their analogous discrete queueing counterparts. Furthermore, their IPA gradients often are computable via extremely simple algorithms that are based entirely on the observed sample path. Consequently SFMs could, in principle, be implemented on the sample paths generated by an actual system rather than simulations thereof and thus be used in real-time optimization.

All of this will next be explained via a concrete example of a queue which, though simple, captures the salient features of the SFM setting for IPA. For a more comprehensive discussion, please see Cassandras et al. (2010), Wardi et al. (2010), and Yao and Cassandras (2013).
Perturbation Analysis of Discrete Event Systems, Fig. 3 Basic SFM

Consider the fluid queue depicted in Fig. 3 whose input and output flow rate processes are denoted, respectively, by α(t) and δ(t). The output flow process depends on the input flow process via the action of the server as well as the buffer size. The server is characterized by an instantaneous processing rate, denoted by β(t), and the buffer size, namely, the maximum amount of fluid the buffer can hold, is denoted by c. Fluid overflow occurs when the inflow (arrival) rate exceeds the service rate while the buffer is full, and the overflow (loss) rate is denoted by γ(t).

Suppose that α(t) and β(t) are random functions defined on a suitable probability space, and assume that they are piecewise continuous and of bounded variation w.p.1. In order to describe their functional relations to δ(t) and γ(t), we define the state variable to be the amount of fluid in the buffer (workload) and denote it by x(t). The dynamics of the system evolve according to the following one-sided differential equation,

dx/dt⁺ = 0, if x(t) = 0 and α(t) ≤ β(t);
         0, if x(t) = c and α(t) ≥ β(t);
         α(t) − β(t), otherwise,

and δ(t) and γ(t) are related to them via

δ(t) = β(t), if x(t) > 0;
       α(t), if x(t) = 0,

and

γ(t) = α(t) − β(t), if x(t) = c;
       0, if x(t) < c.

Network arrangements of such fluid queues, with specified routing and control schemes, provide a rich class of SFMs.

Let θ ∈ R be a parameter of the inflow rate, the service rate, or a network control law; for instance, θ can be the on time of the flow from an off/on source, a uniform rate of the server, or the threshold level in a threshold-based flow control. Then the aforementioned traffic processes are functions of θ and t and, hence, are denoted by α(θ; t), β(θ; t), x(θ; t), etc. Fix the parameter θ, and consider the evolution of the system over a given time horizon [0, T]. Performance measures of interest in applications include the average loss rate over the horizon [0, T] and the average workload there, which is related to the delay by Little's Law. Related to them are the sample performance functions L_{γ,T}(θ) := ∫_0^T γ(θ, t) dt and L_{x,T}(θ) := ∫_0^T x(θ, t) dt; the former is called the loss volume and the latter, the cumulative workload.

To illustrate the forms of their IPA derivatives, consider the basic SFM shown in Fig. 3, and let θ be its buffer size, namely, θ = c. We say that a busy period of the queue is lossy if it incurs some loss in any amount. Let us denote by N_T the number of lossy busy periods in the horizon [0, T]. Then (see Cassandras et al. 2002), the IPA derivative of the loss volume has the following form:

L′_{γ,T}(θ) = −N_T,

where again we use the prime symbol to denote derivative with respect to θ. This formula amounts to a counting process, and indeed it is very simple. As an example, Fig. 4 depicts a typical state trajectory derived from a sample path. It is readily seen that the first busy period is lossy while the second one is not, and therefore, L′_{γ,T}(θ) = −1.

Concerning the cumulative workload, suppose that the queue has M lossy busy periods in the time interval [0, T], and let us enumerate them by the counter m = 1, ..., M. Moreover, denote by u_m the first time the buffer becomes full in its mth lossy busy period and by v_m, the end-time of that busy period. Then (see Cassandras et al. 2002),

L′_{x,T}(θ) = Σ_{m=1}^{M} (v_m − u_m).

In the example provided by Fig. 4, L′_{x,T}(θ) = v_1 − u_1.
Perturbation Analysis of Discrete Event Systems, Fig. 4 State trajectory of the SFM

These equations for the IPA derivatives not only are very simple, but require no knowledge of the specific form of the processes {α(t)} or {β(t)}, their realizations, or underlying probability law. They depend only on limited information which is directly observed from the sample path and, hence, are said to be nonparametric or model free. Furthermore, they have been shown to possess considerable robustness to modeling variations that do not cause significant alterations of the busy periods of the queue. A case of interest is when the SFM formalism is used as an abstraction of a queue. Then the above IPA formulas that were derived from an analysis of the SFM can be successfully applied to sample paths that are generated from the discrete queue. This is in contrast to the IPA formulas that are derived from the discrete queue, which generally are highly biased.

All of these properties of IPA in the SFM setting, including its unbiasedness, simplicity, the nonparametric nature of its algorithms, and its robustness to model variations, have had extensions to SFM networks and systems beyond the basic model (see Cassandras et al. 2010; Wardi et al. 2010; Yao and Cassandras 2013). As mentioned above, this suggests the potential application of IPA not only in system optimization via off-line simulation but also in real-time control where the sample paths are generated from the actual system.

Summary and Future Directions

In the past 10 years, the focus of research on IPA has shifted from the setting of queueing systems to the framework of SFMs. The main reason for this shift is that IPA yields unbiased gradient estimators for a considerably richer class of networks and performance functions in the SFM setting than for their analogous queueing models. Furthermore, the algorithms for computing the IPA gradients often are nonparametric, robust to modeling variations, and very simple to compute, and, hence, they hold out promise of implementations in real-time control in addition to off-line optimization.

In analogy with the extension of the scope of IPA from queueing systems to stochastic timed automata, the SFM framework has been extended to stochastic hybrid systems (SHS), defined in Cassandras et al. (2010). In such systems, including SFMs, the functional description of the time-driven dynamics is changed according to the occurrence of specific events. However, whereas in SFMs these dynamics are described by explicit functional relations, in SHS they are expressed by differential equations. The aforementioned appealing properties of IPA gradients in the SFM setting appear to have extensions to the wider context of SHS.

Future directions in the use of IPA are expected to focus on the control of high-speed large-volume networks and, more generally, the control of stochastic hybrid systems.

Cross-References

Models for Discrete Event Systems: An Overview
Perturbation Analysis of Steady-State Performance and Sensitivity-Based Optimization

Bibliography

Cao X-R (1985) Convergence of parameter sensitivity estimates in a stochastic experiment. IEEE Trans Autom Control 30:834–843
Cao X-R (2007) Stochastic learning and optimization: a sensitivity-based approach. Springer, Boston
Cassandras CG, Lafortune S (2008) Introduction to discrete event systems, 2nd edn. Springer, New York
Cassandras CG, Wardi Y, Melamed B, Sun G, Panayiotou CG (2002) Perturbation analysis for on-line control and optimization of stochastic fluid models. IEEE Trans Autom Control 47:1234–1248
Perturbation Analysis of Steady-State Performance and Sensitivity-Based Optimization 1053

Cassandras CG, Wardi Y, Panayiotou CG, Yao C (2010) Perturbation analysis and optimization of stochastic hybrid systems. Eur J Control 16:642–664
Glasserman P (1991) Gradient estimation via perturbation analysis. Kluwer, Boston
Gong WB, Ho YC (1987) Smoothed perturbation analysis of discrete event systems. IEEE Trans Autom Control 32:858–866
Heidelberger P, Cao X-R, Zazanis M, Suri R (1988) Convergence properties of infinitesimal perturbation analysis estimates. Manag Sci 34:1281–1302
Ho YC, Cao X-R (1991) Perturbation analysis of discrete event dynamical systems. Kluwer, Boston
Ho YC, Cassandras CG (1983) A new approach to the analysis of discrete event dynamic systems. Automatica 19:149–167
Ho YC, Eyler MA, Chien DT (1979) A gradient technique for general buffer storage design in a serial production line. Int J Prod Res 17:557–580
Ho YC, Cao X-R, Cassandras CG (1983) Infinitesimal and finite perturbation analysis for queueing networks. Automatica 19:439–445
Wardi Y, Adams R, Melamed B (2010) A unified approach to infinitesimal perturbation analysis in stochastic flow models: the single-stage case. IEEE Trans Autom Control 55:89–103
Yao C, Cassandras CG (2013) Perturbation analysis and optimization of multiclass multiobjective stochastic flow models. Discret Event Dyn Syst 21:219–256

Perturbation Analysis of Steady-State Performance and Sensitivity-Based Optimization

Xi-Ren Cao
Department of Finance and Department of Automation, Shanghai Jiao Tong University, Shanghai, China
Institute of Advanced Study, Hong Kong University of Science and Technology, Hong Kong, China

Abstract

We introduce the theories and methodologies that utilize the special features of discrete event dynamic systems (DEDSs) for perturbation analysis (PA) and optimization of steady-state performance. Such theories and methodologies usually take different perspectives from the traditional optimization approaches and therefore may lead to new insights and efficient algorithms. The topics discussed include gradient-based optimization for systems with continuous parameters and direct-comparison-based optimization for systems with discrete policies, which is an alternative to dynamic programming and may apply when the latter fails. Furthermore, these new insights can also be applied to continuous-time and continuous-state dynamic systems, leading to a new paradigm of optimal control.

Keywords

Gradient estimation; Sample-path techniques; Sensitivity analysis; Queueing networks

Introduction

In this chapter, we introduce the theories and methodologies that utilize the special features of discrete event dynamic systems (DEDSs) for perturbation analysis (PA) and optimization of steady-state performance. Such theories and methodologies usually take different perspectives from the traditional optimization approaches and therefore may lead to new insights and efficient algorithms. Furthermore, these new insights can also be applied to continuous-time and continuous-state dynamic systems, leading to a new paradigm of optimal control.

As discussed in ▶ Perturbation Analysis of Discrete Event Systems, perturbation analysis (PA) can be applied both to performance over a finite period and to steady-state performance. This chapter will mainly focus on the latter and a related topic, the sensitivity-based optimization of the steady-state performance of stochastic discrete event dynamic systems.

Gradient-Based Approaches

Basic Ideas
The gradient-based performance optimization of discrete event dynamic systems (DEDSs) consists of three steps:
1. Developing efficient algorithms to estimate the performance gradients using the special features of a DEDS
2. Studying the properties of the gradient estimates, including investigating whether they are unbiased and/or strongly consistent
3. With the gradient estimates, developing efficient optimization algorithms

Steps 1 and 2 are referred to as PA, and Step 3 is usually done together with standard gradient-based optimization approaches, such as hill-climbing approaches and stochastic approximation approaches such as the Robbins-Monro algorithm (Robbins and Monro 1951).

Our focus here is on PA for steady-state performance. The main principle for estimating the gradients of steady-state performance is decomposition. In a DEDS, a small change in the value of a system parameter induces a series of changes on the system's sample path; each of such changes is called a perturbation. A single perturbation alone will affect the sample path and therefore affect the system performance. Such an effect is typically small, and therefore, linear superposition usually holds. Thus, the effect of a small parameter change on the steady-state performance can be determined by summing up the effects of all the perturbations induced (or generated) by the parameter change. This principle is illustrated by Fig. 1, and it applies to many different systems with different performance criteria. Following the principle, efficient algorithms can be developed, and their strong consistency can be proved (Cao 2007). Its application to queueing systems and Markov systems will be discussed in the following two subsections.

Queueing Systems
The gradient-based approach for DEDSs started with queueing systems, where it is known as infinitesimal perturbation analysis (IPA), or simply PA (▶ Perturbation Analysis of Discrete Event Systems). Queueing systems are widely used in the literature as models for many DEDSs and have very unique structural features, and PA utilizes such special features to develop fast algorithms to estimate the performance gradients.

We first give a brief explanation of the simple rules of PA for queueing systems (Ho and Cao 1983, 1991). Consider a closed Jackson network with $M$ servers and $N$ customers. The service times of customers at server $i$ are exponentially distributed with service rate $\mu_i$, $i = 1, 2, \ldots, M$. After a customer completes its service at server $i$, it goes to server $j$ with routing probability $q_{ij}$, $i, j = 1, 2, \ldots, M$. The service discipline is first come first served. The number of customers at server $i$ is denoted as $n_i$, and the system state is denoted as $n = (n_1, n_2, \ldots, n_M)$. The state process is $n(t) = (n_1(t), n_2(t), \ldots, n_M(t))$, with $n_i(t)$ being the number of customers at server $i$ at time $t$, $i = 1, 2, \ldots, M$, $t \ge 0$. Define $T_l$ as the $l$th state transition time of the process $n(t)$.

Perturbation Analysis of Steady-State Performance and Sensitivity-Based Optimization, Fig. 1 The decomposition of performance changes
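To make the closed Jackson network model concrete, the following sketch (not part of the original entry; the rates, routing probabilities, initial state, and horizon are arbitrary illustrative values) simulates the state process $n(t)$ by letting the busy servers compete with exponential service completions:

```python
import random

def simulate_closed_jackson(mu, q, n0, t_end, seed=0):
    """Event-driven simulation of the state process n(t) of a closed
    Jackson network: busy servers compete with exponential service times."""
    rng = random.Random(seed)
    n = list(n0)
    t = 0.0
    path = [(0.0, tuple(n))]          # pairs (T_l, state after transition)
    while True:
        busy = [i for i in range(len(mu)) if n[i] > 0]
        total_rate = sum(mu[i] for i in busy)
        t += rng.expovariate(total_rate)
        if t > t_end:
            return path
        # server whose customer completes service (probability mu_i / total_rate)
        u, acc, src = rng.random() * total_rate, 0.0, busy[-1]
        for i in busy:
            acc += mu[i]
            if u < acc:
                src = i
                break
        # destination server drawn from the routing probabilities q[src][j]
        u, acc, dst = rng.random(), 0.0, len(mu) - 1
        for j in range(len(mu)):
            acc += q[src][j]
            if u < acc:
                dst = j
                break
        n[src] -= 1
        n[dst] += 1
        path.append((t, tuple(n)))

path = simulate_closed_jackson(
    mu=[1.0, 2.0, 3.0],                      # service rates mu_i
    q=[[0.0, 0.5, 0.5],                      # routing probabilities q_ij
       [1.0, 0.0, 0.0],
       [1.0, 0.0, 0.0]],
    n0=[4, 0, 0],                            # N = 4 customers, M = 3 servers
    t_end=50.0)
print(len(path), path[-1])
```

Each entry of `path` is a pair $(T_l, n(T_l))$, matching the transition times $T_l$ defined above; the total number of customers is conserved along the sample path.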
Perturbation Analysis of Steady-State Performance and Sensitivity-Based Optimization, Fig. 2 Perturbation generation and propagation in a queueing network

Figure 2 illustrates an example of a sample path where each stair-style line represents a trajectory of the number of customers at one server, and the customer transitions among servers are indicated by dotted arrows.

An exponentially distributed service time with rate $\mu$ can be generated according to $s = -\frac{1}{\mu}\ln \xi$, where $\xi$ is a uniformly distributed random number in $[0, 1]$. Now let the service rate of a server, say server $k$, change from $\mu_k$ to $\mu_k + \Delta\mu_k$, $\Delta\mu_k \ll \mu_k$. Then the service time $s$ will change to $s' = -\frac{1}{\mu_k + \Delta\mu_k}\ln \xi$ (the same $\xi$ is used for both $s$ and $s'$). Thus, the perturbation of the service time induced by the change in $\mu_k$ is

$$\Delta s = s' - s \approx -\frac{\Delta\mu_k}{\mu_k}\, s. \qquad (1)$$

In summary, because of a small (infinitesimal) change $\Delta\mu_k$ of the service rate, every customer's service completion time at server $k$ will obtain (be delayed by) a perturbation that is proportional to its original service time with the multiplier $-\frac{\Delta\mu_k}{\mu_k}$. This is the rule of perturbation generation.

Next, we observe that a perturbation will affect the service starting and completion times of other customers in the same server and in other servers in the network. We say a perturbation will be propagated through the network. First, the perturbations generated at a server will accumulate until this server is idle. When the server enters an idle period, all its perturbations will be lost. (After the idle period, the service starting time is determined by the customer that terminates the idle period, which carries the perturbation of another server.) Second, when a customer finishes its service at server $i$ and enters server $j$ after an idle period of server $j$, server $j$ will obtain the same amount of perturbations as server $i$. If server $j$ is not idle at this time, the perturbation at server $i$ will only affect the arrival time of this customer and will not affect any other customers/servers.

Figure 2 illustrates the perturbation generation and propagation process of a network in which the service rate of server 1 is decreased by an infinitesimal amount. Given a system sample path obtained by simulation or observation, we can record the perturbations of all servers as they are generated and propagated along the sample path. From the perturbations we can get a perturbed sample path and finally get the performance changes caused by the change in the service rate. Specifically, we have the following algorithm for the perturbations of the service completion times (if server $k$'s service rate is perturbed).

In Step 2, we add $s_{k,l}$ as the perturbation generated, instead of $-\frac{\Delta\mu_k}{\mu_k} s_{k,l}$ as indicated by (1).
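The first-order rule (1) can be checked numerically. The sketch below (illustrative values only, not part of the original entry) draws a common uniform random number $\xi$ for the nominal and perturbed service times and compares the exact perturbation $\Delta s = s' - s$ with the approximation $-(\Delta\mu_k/\mu_k)\, s$:

```python
import math
import random

rng = random.Random(1)
mu, dmu = 2.0, 0.001                  # mu_k and a small increase dmu_k
err = []
for _ in range(1000):
    xi = 1.0 - rng.random()           # xi in (0, 1]; the same xi generates s and s'
    s = -math.log(xi) / mu            # nominal service time
    s_pert = -math.log(xi) / (mu + dmu)
    ds = s_pert - s                   # exact perturbation of the service time
    approx = -(dmu / mu) * s          # first-order rule (1)
    err.append(abs(ds - approx))
print(max(err))                       # residual is second order in dmu/mu
```

Because the same $\xi$ drives both samples (the common-random-number construction in the text), the residual of rule (1) is of order $(\Delta\mu_k/\mu_k)^2$ and is tiny here.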
Algorithm 1 Perturbation Analysis
1. Initialization: Set the variables $\Delta_i = 0$, $i = 1, 2, \ldots, M$.
2. Perturbation generation: At the $l$th service completion time of server $k$, set $\Delta_k = \Delta_k + s_{k,l}$, $l = 1, 2, \ldots$, where $s_{k,l}$ is the service time of the $l$th customer at server $k$.
3. Perturbation propagation: If a customer from server $i$ terminates an idle period of server $j$, set $\Delta_j = \Delta_i$, $i, j = 1, 2, \ldots, M$.

The factor $-\frac{\Delta\mu_k}{\mu_k}$ will be canceled when estimating performance derivatives with respect to $\mu_k$. The algorithm yields a perturbed sample path. With the original path and the perturbed path, we can estimate the original and perturbed (steady-state) throughput; then the estimates of its derivative with respect to $\mu_k$ can be obtained, and it is proved that the estimate is strongly consistent. Only a few lines need to be added to the simulation code to obtain the derivatives. Experiments show that the results are very accurate (the error is around 5 %, compared with analytical results; Ho and Cao 1983).

Markov Systems
The decomposition principle shown in Fig. 1 has been applied to Markov systems (Cao 2007).

Consider an irreducible and aperiodic Markov chain $X = \{X_n : n \ge 0\}$ on a finite state space $S = \{1, 2, \ldots, M\}$ with transition probability matrix $P = [p(j|i)] \in [0,1]^{M \times M}$. Let $\pi = (\pi_1, \ldots, \pi_M)$ be the vector representing its steady-state probabilities and $f = (f_1, f_2, \ldots, f_M)^T$ be the reward (or cost) vector, where "$T$" represents transpose. We have $Pe = e$, where $e = (1, 1, \ldots, 1)^T$ is an $M$-dimensional vector whose components all equal 1, and we have $\pi = \pi P$. We consider the long-run average (steady-state) performance defined as

$$\eta = E_\pi(f) = \sum_{i=1}^{M} \pi_i f_i = \pi f = \lim_{L \to \infty} \frac{F_L}{L}, \quad \text{w.p.1}, \qquad (2)$$

where

$$F_L = \sum_{l=0}^{L-1} f(X_l).$$

Let $P'$ be another ergodic transition probability matrix on the same state space. Suppose $P$ changes to $P(\delta) = P + \delta Q = \delta P' + (1-\delta)P$, with $\delta > 0$, $Q = P' - P = [q(j|i)]$, and the reward function $f$ keeps the same. We have $Qe = 0$. The performance measure will change to $\eta(\delta) = \eta + \Delta\eta(\delta)$. The derivative of $\eta$ in the direction of $Q$ is defined as $\frac{d\eta}{d\delta} = \lim_{\delta \to 0} \frac{\Delta\eta(\delta)}{\delta}$.

In this discrete-state Markov system, a perturbation means that the system is perturbed from one state $i$ to another state $j$. For example, consider the case where $p(i|k) = \frac{1}{2}$, $p(j|k) = \frac{1}{2}$, and $p(l|k) = 0$ for all $l \ne i, j$. Suppose that these probabilities change to $p'(i|k) = \frac{1}{2} + \delta$, $p'(j|k) = \frac{1}{2} - \delta$, and $p'(l|k) = 0$ for all $l \ne i, j$. Then it may happen that at some time in the original sample path the system transits from state $k$ to state $i$, but in the perturbed path it transits from state $k$ to state $j$ instead. In this case, we say that the change in transition probabilities induces a perturbation from $i$ to $j$ at this time. To study the effect of such a perturbation, we consider two independent sample paths $X = \{X_n;\, n \ge 0\}$ and $X' = \{X'_n;\, n \ge 0\}$ with $X_0 = i$ and $X'_0 = j$; both of them have the same transition matrix $P$. The average effect of a perturbation from $i$ to $j$, $i, j = 1, \ldots, M$, on $F_L$ can be measured by the perturbation realization factor defined as

$$d(i,j) = \lim_{L \to \infty} E\left[ \sum_{l=0}^{L-1} \big(f(X'_l) - f(X_l)\big) \,\Big|\, X_0 = i,\, X'_0 = j \right]. \qquad (3)$$

The matrix $D \in \mathbb{R}^{M \times M}$, with $d(i,j)$ as its $(i,j)$th element, is called a realization matrix. We can prove that $D$ satisfies the equation (Cao 2007)

$$D - PDP^T = F, \qquad (4)$$

where $F = ef^T - fe^T$. Because $d(i,j) = -d(j,i)$ for any $i, j$, we may define a vector $g = (g(1), \ldots, g(M))^T$ such that
$$d(i,j) = g(j) - g(i), \qquad (5)$$

and $D = eg^T - ge^T$. $g$ is called a performance potential, which can be estimated with many sample-path-based algorithms, and it satisfies the Poisson equation (Cao 2007)

$$(I - P + e\pi)g = f. \qquad (6)$$

Intuitively, (3) and (5) indicate that every visit to state $i$ contributes to $F_L$ on the average by the amount $g(i)$, so the effect of a perturbation from $i$ to $j$ is $d(i,j) = g(j) - g(i)$. Now, we consider a sample path consisting of $L$ transitions. Among these transitions, on the average there are $L\pi_i$ transitions at which the system is at state $i$. After being at state $i$, the system jumps to state $j$ on the average $L\pi_i p(j|i)$ times. If the transition probability matrix $P$ changes to $P(\delta) = P + \delta Q$, then the change in the number of visits to state $j$ after being at state $i$ is $L\pi_i q(j|i)\delta = L\pi_i[p'(j|i) - p(j|i)]\delta$. This contributes a change of $\{L\pi_i[p'(j|i) - p(j|i)]\delta\}\, g(j)$ to $F_L$. Thus, the total change in $F_L$ due to the change of $P$ to $P(\delta)$ is

$$\Delta F_L = \sum_{i,j=1}^{M} L\pi_i \left[p'(j|i) - p(j|i)\right]\delta\, g(j) = L(\pi Q g)\delta.$$

Finally, we have

$$\frac{d\eta}{d\delta} = \lim_{\delta \to 0} \frac{1}{\delta}\, \frac{\Delta F_L}{L} = \pi Q g = \pi(P' - P)g. \qquad (7)$$

If the reward function also changes from $f$ to $f(\delta) = f + \delta(f' - f)$, then (7) becomes

$$\frac{d\eta}{d\delta} = \pi\left[(P'g + f') - (Pg + f)\right]. \qquad (8)$$

Further Works
The ideas described in the previous subsections may stimulate new research topics in theoretical analysis, estimation algorithms, and applications. Here we can only give a very brief review of some of them.
1. There is a large literature on whether the PA-based derivative estimates are unbiased (finite period) and/or strongly consistent (steady state). This was first formulated in Cao (1985) and was further discussed in Heidelberger et al. (1988). By now, there have been extensive studies in this direction: proving the unbiasedness or consistency for various systems and modifying the approach for systems where the IPA estimates are not unbiased (e.g., Cassandras and Lafortune 1999; Fu and Hu 1997; Glasserman 1991). This also includes the recently proposed fluid model; see ▶ Perturbation Analysis of Discrete Event Systems.
2. Another research topic is how to develop fast and efficient algorithms for estimating the performance gradients, especially in the case of Markov systems; see, e.g., Cao and Wan (1998), Baxter and Bartlett (2001), and Cao (2005). This is called policy gradients in the reinforcement learning literature.
3. There are also research works on how the gradient estimates and policy iteration (see section "Direct Comparison and Policy Iteration") can be combined with stochastic approximation approaches to develop fast convergent optimization algorithms; see Marbach and Tsitsiklis (2001) for the gradient-based approach and Fang and Cao (2004) for the policy iteration-based approach.

Direct Comparison and Policy Iteration

The sensitivity-based view has been extended to optimization in discrete spaces of policies. With this view, we can develop a new approach to performance optimization based on a direct comparison of the performance of any two policies. This provides an alternative to standard dynamic programming for solving Markov decision process (MDP) types of problems; it also has been applied to solve some problems where the standard MDP approach fails (Cao 2007; Cao and Wan 2013).
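The direct comparison of two policies can be checked numerically. The sketch below (arbitrary illustrative matrices, not part of the original entry) computes the potential $g$ of only the first policy from the Poisson equation (6) and verifies both the directional-derivative formula (7) and the performance difference formula used by this approach (Eq. (9) in the next subsection):

```python
import numpy as np

def stationary(P):
    """Steady-state probabilities: solve pi P = pi together with pi e = 1."""
    M = P.shape[0]
    A = np.vstack([P.T - np.eye(M), np.ones(M)])
    b = np.zeros(M + 1); b[-1] = 1.0
    return np.linalg.lstsq(A, b, rcond=None)[0]

def potential(P, f):
    """Performance potential g from the Poisson equation (I - P + e pi) g = f."""
    M = P.shape[0]
    pi = stationary(P)
    return np.linalg.solve(np.eye(M) - P + np.outer(np.ones(M), pi), f)

# two arbitrary ergodic policies (P, f) and (P', f') on three states
P  = np.array([[0.5, 0.3, 0.2], [0.1, 0.6, 0.3], [0.4, 0.4, 0.2]])
Pp = np.array([[0.2, 0.5, 0.3], [0.3, 0.3, 0.4], [0.1, 0.1, 0.8]])
f  = np.array([1.0, 3.0, 2.0])
fp = np.array([2.0, 1.0, 4.0])

pi, pip = stationary(P), stationary(Pp)
g = potential(P, f)                 # only the first policy's potential is needed
eta, etap = pi @ f, pip @ fp        # long-run average rewards of the two policies

rhs9 = pip @ ((Pp @ g + fp) - (P @ g + f))   # right-hand side of (9)
deriv7 = pi @ (Pp - P) @ g                   # directional derivative, formula (7)

print(etap - eta, rhs9, deriv7)
```

Here the identity $\eta' - \eta = \pi'\{(P'g + f') - (Pg + f)\}$ holds exactly, while `deriv7` agrees with a finite-difference estimate of $d\eta/d\delta$ along $P(\delta) = P + \delta(P' - P)$.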
In an MDP, there is an action space denoted as $\mathcal{A}$. For simplicity, we only consider the discrete case. When the system is at any state $i \in S$, an action $\alpha = d(i)$ is taken, which controls the system transition probability, denoted as $p^\alpha(j|i)$, $i, j \in S$, and the reward function, denoted as $f(i, \alpha)$. The mapping $d: S \to \mathcal{A}$ is called a policy. Since a policy corresponds to a transition probability matrix $P^d = [p^{d(i)}(j|i)]_{i,j=1}^{M}$ and the reward vector $f^d = (f(1, d(1)), \ldots, f(M, d(M)))^T$, we also call the pair $(P, f)$ a policy.

Consider two policies $(P, f)$ and $(P', f')$, and assume that the Markov chain under both policies is ergodic. We use the prime "$'$" to denote the quantities associated with $(P', f')$. First, multiplying both sides of the Poisson equation (6) by $\pi'$ on the left and after some calculations, we get

$$\eta' - \eta = \pi'\{(P'g + f') - (Pg + f)\}. \qquad (9)$$

This is called a performance difference formula, and many optimization results can be derived from it in an intuitive way.

Policy Iteration and the Optimality Equation
The difference formula (9) has a nice decomposition structure: it contains two factors, the first one $\pi'$, which does not depend on $P$, and the second one $(P'g + f') - (Pg + f)$, in which all the parameters are known except the performance potential $g$, which can be obtained by analyzing only the system with $P$. This nice feature makes the difference formula the basis of performance optimization.

For two $M$-dimensional vectors $a$ and $b$, we define $a = b$ if $a(i) = b(i)$ for all $i = 1, 2, \ldots, M$; $a \le b$ if $a(i) \le b(i)$ for all $i = 1, 2, \ldots, M$; $a < b$ if $a(i) < b(i)$ for all $i = 1, 2, \ldots, M$; and $a \lneq b$ if $a(i) < b(i)$ for at least one $i$ and $a(j) = b(j)$ for the other components. The relation $\le$ includes $=$, $\lneq$, and $<$. Similar definitions are used for the relations $>$, $\ge$, and $\gneq$.

Next, we note that $\pi'(i) > 0$ for all $i = 1, 2, \ldots, M$. Thus, from (9), we know that if $(P' - P)g + (f' - f) \gneq 0$, then $\eta' - \eta > 0$. From (9) and the fact $\pi' > 0$, the proof of the following lemma is straightforward.

Lemma 1 If $Pg + f \lneq\ (\le)\ P'g + f'$, then $\eta <\ (\le)\ \eta'$.

It is interesting to note that in the lemma, we use only the potentials of one Markov chain, i.e., $g$. Thus, because of the special structure of the performance difference formula (9), if the condition in Lemma 1 holds, to compare the performance measures under two policies, we may only need the potentials of one policy. Policy iteration and the optimality equation can be easily derived from (9) and Lemma 1.

Algorithm 2 Policy Iteration
1. Guess an initial policy $d_0$, and set $k = 0$.
2. (Policy evaluation) Obtain the potential $g^{d_k}$ by solving the Poisson equation $(I - P^{d_k})g^{d_k} + \eta^{d_k} e = f^{d_k}$ or estimating it on a sample path. (The superscript "$d_k$" is added to quantities associated with policy $d_k$.)
3. (Policy improvement) Choose
$$d_{k+1} \in \arg\max_{d \in \mathcal{D}} \left\{ f^d + P^d g^{d_k} \right\}, \qquad (10)$$
component-wise (i.e., to determine an action for each state). If in state $i$, action $d_k(i)$ attains the maximum, set $d_{k+1}(i) = d_k(i)$.
4. If $d_{k+1} = d_k$, stop; otherwise, set $k := k + 1$ and go to Step 2.

It follows directly from Lemma 1 that when the iteration does not stop, the performance improves at each iteration. It can be proved easily by construction that the iteration stops at an optimal policy. Again, from Lemma 1, when the iteration stops at a policy $\hat{d}$, it holds that

$$\eta^{\hat{d}} e + g^{\hat{d}} = \max_{d \in \mathcal{D}} \left\{ f^d + P^d g^{\hat{d}} \right\}. \qquad (11)$$

This is the Hamilton-Jacobi-Bellman (HJB) optimality equation. If we further define the Q-factor

$$Q^d(i, \alpha) = f(i, \alpha) + \sum_j p^\alpha(j|i)\, g^d(j), \qquad (12)$$

then the policy iteration equation (10) and the HJB equation (11) become
$$d_{k+1}(i) := \arg\max_{\alpha \in \mathcal{A}} \{Q^{d_k}(i, \alpha)\} \qquad (13)$$

and

$$Q^{\hat{d}}(i, \hat{d}(i)) = \max_{\beta \in \mathcal{A}} \{Q^{\hat{d}}(i, \beta)\}. \qquad (14)$$

The only difference between (8) and (9) is that $\pi$ in (8) is replaced by $\pi'$ in (9). This leads to an interesting observation: policy iteration in MDPs in fact chooses the policy with the steepest directional derivative as the policy in the next iteration. Therefore, policy iteration in fact can be viewed as "gradient-based" optimization in a discrete space.

In our approach, the HJB equation and policy iteration are obtained from the performance difference equation (9), which compares the performance of any two policies. It is hence called a direct-comparison-based approach. This approach also applies to more general problems, such as multichain Markov systems, systems with absorbing states, and problems with other performance criteria, such as the discounted performance, the bias, etc.

The direct-comparison approach has been successfully applied to the nth-bias optimality problem (Cao 2007). Essentially, starting with the performance difference formulas (those similar to (9) for different performance criteria), we can develop a simple and direct approach to derive results that are equivalent to the sensitive discount optimality for multichain Markov systems with long-run average criteria (Puterman 1994); no discounting is needed and no dynamic programming is used. The approach, motivated by the development for discrete event dynamic systems, provides a clear overall picture of the area of MDPs.

The direct-comparison approach can also be applied to some problems where dynamic programming fails, including the event-based optimization problems, where the sequence of events may not be Markovian; see the next section.

Event-Based Optimization

It is well known that for most systems modeled by Markov processes, the state spaces are too large, and it is not computationally feasible to implement policy iteration or to solve the HJB equations. On the other hand, in many practical problems in engineering, finance, and social sciences, control actions are only taken when certain events occur. For example, in the traffic control of a network of subnetworks, oftentimes one cannot control the traffic in the same subnetwork, and control actions are only applied when there are packets transferring among different subnetworks. In a portfolio management problem, the investor sells or buys stocks when the price history exhibits some predetermined pattern (e.g., reaches some level). In a sensor network, actions are taken only when one of the sensors detects some abnormal situation. In a material handling problem, actions are taken when the inventory level falls below a certain threshold.

Conceptually, anything that happened in the past can be chosen as an event. However, the number of such events is too big (much bigger than the number of states), and studying all these events makes analysis infeasible and defeats our original purpose. Therefore, to properly define events, we need to strike a balance between generality on the one hand and applicability on the other hand.

In the event-based setting, an event $e$ is defined as a set of state transitions with certain common properties. That is, $e := \{\langle i, j \rangle : i, j \in S \text{ and } \langle i, j \rangle \text{ has the common properties}\}$, where $\langle i, j \rangle$ denotes a state transition from $i$ to $j$. This definition can also be easily generalized to represent a finite sequence of state transitions. We shall see that in many real problems, the number of events requiring control actions is usually much smaller than that of the states.

An event-based policy $d$ is defined as a mapping from $E$ to $\mathcal{A}$, with $E$ being the space of all events. That is, $d: E \to \mathcal{A}$. Let $\mathcal{D}_e$ denote the set of all the stationary and deterministic policies. The reward function under policy $d$ is denoted as $f^d = f(i, d(e))$, and the associated long-run average performance is denoted as $\eta^d$. When an event $e$ happens, we choose an action $a = d(e)$ according to a policy $d$, where $e \in E$ and $d \in \mathcal{D}_e$. Our goal is to find an optimal policy $\hat{d}$ which maximizes the long-run average performance as follows.
$$\hat{d} = \arg\max_{d \in \mathcal{D}_e} \left\{ \eta^d \right\}. \qquad (15)$$

The main difficulty in developing the event-based optimization theories and algorithms lies in the fact that the sequence of events is usually not Markovian. A standard optimization approach such as dynamic programming does not apply to such problems. However, we can apply the direct-comparison approach to this event-based optimization problem. Here we give a very brief discussion.

Consider an event-based policy $d$, $d \in \mathcal{D}_e$. When event $e$ occurs, the conditional transition probability is denoted as $p^{d(e)}(j|i, e)$. Let $\pi^d(i|e)$ be the conditional steady-state probability of state $i$ when event $e$ occurs under policy $d$. Define the aggregated Q-factor

$$Q^d(e, \alpha) = \sum_i \pi^d(i|e) \left[ f(i, \alpha) + \sum_j p^\alpha(j|i, e)\, g^d(j) \right]. \qquad (16)$$

We may use these aggregated Q-factors to develop an event-based policy iteration algorithm analogous to Algorithm 2 and obtain the policy iteration equation (cf. (13))

$$d_{k+1}(e) := \arg\max_{\alpha \in \mathcal{A}} \{Q^{d_k}(e, \alpha)\}, \quad e \in E, \qquad (17)$$

and the event-based HJB optimality equation (cf. (14)) is

$$Q^{\hat{d}}(e, \hat{d}(e)) = \max_{\beta \in \mathcal{A}} \{Q^{\hat{d}}(e, \beta)\}. \qquad (18)$$

It can be proved that if the conditional probability $\pi^h(i|e)$ does not depend on the policy, i.e.,

$$\pi^h(i|e) = \pi^d(i|e), \quad \forall\, i, e, \text{ and } h, d, \qquad (19)$$

then policy iteration (17) indeed leads to a sequence of policies with increasing performance, and (18) specifies an event-based optimal policy.

The event-based aggregated Q-factor (16) can be estimated on a sample path in the same way as the potentials (Cao 2007). The number of events requiring actions is usually much smaller than the number of states, and in some cases, such as the network admission control problem, it is linear in the system size.

The crucial condition for the above event-based optimization is (19). There are many problems, such as the control of networks of networks and the portfolio management problem, for which the condition holds; there are also many problems for which the condition does not hold. The error in applying (18) and policy iteration (17) comes from the difference between $\pi^h(i|e)$ and $\pi^d(i|e)$ at each iteration. Further research is needed.

We use a computer network as an example, in which each computer or router can be modeled as a queueing subnetwork and the computer network is then a network of such subnetworks. Assume that there are $M$ subnetworks, and subnetwork $m$, $m = 1, 2, \ldots, M$, consists of $k_m$ servers. The number of customers at the $j$th server of subnetwork $m$ is denoted as $n_{m,j}$, $j = 1, 2, \ldots, k_m$, and the number of customers in all the servers in subnetwork $m$ is denoted as $N_m = \sum_{j=1}^{k_m} n_{m,j}$. Suppose the service times are exponentially distributed; then the system state is $n := (n_{1,1}, \ldots, n_{1,k_1}; \ldots; n_{M,1}, \ldots, n_{M,k_M})$, and the aggregated state is $N := (N_1, \ldots, N_M)$. Suppose that the transition probabilities among the servers in the same subnetwork are fixed and we can only control the transition probabilities among the subnetworks, and, furthermore, we can only observe $N$. Then the problem can be modeled as an event-based optimization with an event being a customer transition among subnetworks, and meanwhile, the aggregated state is $N$. It can be proved that condition (19) holds, and therefore, we may apply policy iteration (17) and HJB equation (18).

Many existing problems fit the event-based framework. For example, in a partially observable Markov decision process (POMDP), we may define an observation, or a sequence of observations, as an event. Other examples include state and time aggregations, hierarchical control (hybrid systems), and options. Different events can be defined to capture the special features of these different problems. In this sense, the event-based
Perturbation Analysis of Steady-State Performance and Sensitivity-Based Optimization 1061

approach may provide a unified view to these because in her/his mind, s/he enlarges the
different problems (Cao 2007). possibility of winning a large sum, and a risk
averse person buys insurance, because s/he is
afraid of a big loss and therefore enlarges its
The Sensitivity-Based Approach and probability.
Optimal Control The optimization problem with a distorted
probability suffers from the time-inconsistent is-
We refer to the approaches discussed in the previous sections as the sensitivity-based approach. When the parameters are continuous, it is the gradient-based approach, and the focus is on developing efficient algorithms that utilize the particular system structure to estimate the gradients (to justify the algorithms, the unbiasedness or the consistency of the estimates should be proved). When the parameters are discrete, it is the direct-comparison approach that leads to policy iteration and the HJB equations, and policy iteration can be viewed as the gradient-based method in discrete spaces. This approach is different from conventional dynamic programming, and it has been successfully applied to MDPs with different criteria and the n-bias optimization problems, as well as to event-based optimization and some other problems where dynamic programming fails.

This sensitivity-based approach was motivated by the study of discrete event dynamic systems (DEDS). It has been realized that the principles and methodologies developed for DEDSs also apply to the optimization of continuous-time and continuous-state (CTCS) systems. Here are some examples:

1. CTCS systems: For CTCS systems, the dynamics are driven by Brownian motions or Lévy processes. The transition probability matrix in DEDS should be replaced by the infinitesimal generator in CTCS, which is an operator on the space of continuous functions. With the performance difference formulas, we can redevelop the stochastic optimal control theory for many performance measures, including the long-run average, discounted performance, and finite horizon problems, with no dynamic programming; see Cao et al. (2011).

2. Time-inconsistent optimization problems: In behavioral finance, people's preference is modeled with a distorted probability. For example, a risk-taking person buys lotteries, and the resulting optimization problem suffers from the time-inconsistency issue; i.e., an optimal policy for the problem in period [t, T), 0 < t < T, is not optimal in the same period for the problem in [0, T]. Thus, the standard dynamic programming fails.

The gradient-based approach has been applied to the portfolio management problem with probability distortion. With this approach, we discovered that the performance with distorted probability maintains some sort of linearity called mono-linearity. This property sheds new insight into the portfolio management problem and the nonlinear expected utility theory (Cao and Wan 2013).

Conclusion

A sensitivity-based approach has been developed for the performance optimization of discrete event dynamic systems (DEDS). The approach utilizes the dynamic structure of DEDS. For systems with continuous parameters, it is the gradient-based optimization, in which the special features of a DEDS help in developing efficient algorithms to estimate the performance derivatives; for systems with discrete policies, it is the direct-comparison-based approach, with which policy iteration and HJB equations can be derived intuitively by using the performance difference formulas. Policy iteration can be viewed as the gradient method in a discrete space. The estimation of gradients and the implementation of policy iteration can be carried out on a given sample path, and efficient online learning algorithms can be developed (Cao 2007).

The sensitivity-based approach was developed for DEDSs, but its principle also applies to systems with continuous-state spaces. The approach provides an alternative to traditional dynamic programming and therefore can be applied to some problems where dynamic programming does not work.
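The direct-comparison view of policy iteration described above can be made concrete with a toy example. The following sketch is our own illustration, not taken from the entry or from Cao (2007): the two-state, two-action MDP and all its numbers are invented. Each policy is evaluated by solving the Poisson equation for its potentials g, and improved by directly comparing, in every state, the quantity r(s,a) + Σ_s' P(s'|s,a) g(s'); the performance difference formula guarantees the new policy performs no worse.

```python
import numpy as np

# A small illustrative average-reward MDP (2 states, 2 actions); all numbers
# are arbitrary. P[a][s, s'] is a transition probability, r[a][s] a reward.
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),
     np.array([[0.1, 0.9], [0.7, 0.3]])]
r = [np.array([1.0, 0.0]), np.array([0.5, 2.0])]

def stationary(Pd):
    # Stationary distribution pi of an ergodic chain: pi^T Pd = pi^T, sum(pi) = 1.
    n = Pd.shape[0]
    A = np.vstack([Pd.T - np.eye(n), np.ones(n)])
    b = np.zeros(n + 1); b[-1] = 1.0
    return np.linalg.lstsq(A, b, rcond=None)[0]

def potentials(Pd, rd):
    # Average reward eta and potentials g from the Poisson equation
    # (I - Pd) g = rd - eta*1, normalized by pi^T g = 0.
    n = Pd.shape[0]
    pi = stationary(Pd)
    eta = pi @ rd
    A = np.vstack([np.eye(n) - Pd, pi])
    b = np.append(rd - eta, 0.0)
    g = np.linalg.lstsq(A, b, rcond=None)[0]
    return eta, g

def policy_iteration(d):
    while True:
        Pd = np.array([P[d[s]][s] for s in range(2)])
        rd = np.array([r[d[s]][s] for s in range(2)])
        eta, g = potentials(Pd, rd)
        # Direct comparison: in each state pick the action with the larger
        # value of r(s,a) + P(.|s,a) @ g.
        d_new = [int(np.argmax([r[a][s] + P[a][s] @ g for a in range(2)]))
                 for s in range(2)]
        if d_new == d:
            return d, eta
        d = d_new

policy, eta = policy_iteration([0, 0])
print(policy, round(eta, 5))  # → [1, 1] 1.34375
```

Note that no value function over an optimization horizon is needed: each iteration only compares two policies through the potentials of the current one, which is what makes a sample-path (online) implementation possible.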
Cross-References

▶ Models for Discrete Event Systems: An Overview
▶ Perturbation Analysis of Discrete Event Systems

Acknowledgments This research was supported in part by the Collaborative Research Fund of the Research Grants Council, Hong Kong Special Administrative Region, China, under Grant No. HKUST11/CRF/10 and 610809.

Bibliography

Baxter J, Bartlett PL (2001) Infinite-horizon policy-gradient estimation. J Artif Intell Res 15:319–350
Cao XR (1985) Convergence of parameter sensitivity estimates in a stochastic experiment. IEEE Trans Autom Control 30:834–843
Cao XR (2005) A basic formula for online policy gradient algorithms. IEEE Trans Autom Control 50(5):696–699
Cao XR (2007) Stochastic learning and optimization – a sensitivity-based approach. Springer, New York
Cao XR, Wan YW (1998) Algorithms for sensitivity analysis of Markov systems through potentials and perturbation realization. IEEE Trans Control Syst Technol 6:482–494
Cao XR, Wan XW (2013) Analysis of non-linear behavior – a sensitivity-based approach. Submitted
Cao XR, Wang DX, Lu T, Xu YF (2011) Stochastic control via direct comparison. Discret Event Dyn Syst Theory Appl 21:11–38
Cassandras CG, Lafortune S (1999) Introduction to discrete event systems. Kluwer Academic Publishers, Boston
Fang HT, Cao XR (2004) Potential-based on-line policy iteration algorithms for Markov decision processes. IEEE Trans Autom Control 49:493–505
Fu MC, Hu JQ (1997) Conditional Monte Carlo: gradient estimation and optimization applications. Kluwer Academic Publishers, Boston
Glasserman P (1991) Gradient estimation via perturbation analysis. Kluwer Academic Publishers, Boston
Heidelberger P, Cao XR, Zazanis M, Suri R (1988) Convergence properties of infinitesimal perturbation analysis estimates. Manag Sci 34:1281–1302
Ho YC, Cao XR (1983) Perturbation analysis and optimization of queueing networks. J Optim Theory Appl 40:559–582
Ho YC, Cao XR (1991) Perturbation analysis of discrete-event dynamic systems. Kluwer Academic Publishers, Boston
Marbach P, Tsitsiklis JN (2001) Simulation-based optimization of Markov reward processes. IEEE Trans Autom Control 46:191–209
Puterman ML (1994) Markov decision processes: discrete stochastic dynamic programming. Wiley, New York
Robbins H, Monro S (1951) A stochastic approximation method. Ann Math Stat 22:400–407

PID Control

Sebastian Dormido¹ and Antonio Visioli²
¹Departamento de Informatica y Automatica, UNED, Madrid, Spain
²Dipartimento di Ingegneria Meccanica e Industriale, University of Brescia, Brescia, Italy

Synonyms

Proportional-Integral-Derivative Control

Abstract

Since their introduction in industry a century ago, proportional-integral-derivative (PID) controllers have become the de facto standard for the process industry. In this entry, the fundamentals of PID control are outlined, starting from the basic control law. Additional functionalities and the tuning and automatic tuning of the parameters are then considered.

Keywords

Anti-windup; Autotuning; Controller tuning; Derivative action; PI control; Proportional control; Proportional-integral-derivative control; Ziegler-Nichols

Introduction

A proportional-integral-derivative (PID) controller is a three-term controller that has a long history in the automatic control field, starting from the beginning of the last century. Owing to its intuitiveness and relative simplicity, in addition to the satisfactory performance that it is
able to provide with a wide range of processes, it has become the de facto standard controller in industry. It has been evolving along with the progress of technology, and nowadays it is very often implemented in digital form rather than with pneumatic or electrical components. It can be found in virtually all kinds of control equipment, either as a stand-alone (single-station) controller or as a functional block in Programmable Logic Controllers (PLCs) and Distributed Control Systems (DCSs). Actually, the new potentialities offered by the development of digital technology and software packages have led to a significant growth of research in the PID control field: new effective tools have been devised for the improvement of the analysis and design methods of the basic algorithm, as well as for the improvement of the additional functionalities that are implemented with the basic algorithm in order to increase its performance and its ease of use.

The success of PID controllers is also enhanced by the fact that they often represent the fundamental component of more sophisticated control schemes that can be implemented when the basic control law is not sufficient to achieve the required performance or when a more complicated control task is of concern.

Basics

Using a PID controller means applying a feedback controller that consists of the sum of three types of control actions: a proportional action, an integral action, and a derivative action.

The proportional control action is proportional to the current control error, according to the expression

u(t) = K_p e(t) = K_p (r(t) - y(t)),   (1)

where u is the controller output, K_p is the proportional gain, r is the reference signal, and y is the process output. Its meaning is straightforward, since it implements the typical operation of increasing the control variable when the control error is large (with appropriate sign). The transfer function of a proportional controller can be trivially derived as

C(s) = K_p.   (2)

The main drawback of using a pure proportional controller is that, in general, it cannot set the steady-state error to zero. This motivates the addition of a bias (or reset) term u_b, namely,

u(t) = K_p e(t) + u_b.   (3)

The value of u_b can then be adjusted manually until the steady-state error is reduced to zero. In commercial products, the proportional gain is often replaced by the proportional band PB, which is the range of error that causes a full-range change of the control variable, i.e.,

PB = 100 / K_p.   (4)

The integral action is proportional to the integral of the control error, i.e.,

u(t) = K_i ∫_0^t e(τ) dτ,   (5)

where K_i is the integral gain. It appears that the integral action is related to the past values of the control error. The corresponding transfer function is

C(s) = K_i / s.   (6)

The presence of an integral action allows the steady-state error to be reduced to zero when a step reference signal is applied or a step load disturbance occurs. In other words, the integral action is able to set automatically the correct value of u_b in (3) so that the steady-state error is zero. For this reason, the integral action is also often called automatic reset.

While the proportional action is based on the current value of the control error and the integral action is based on its past values, the derivative action is based on the predicted future values of the control error. An ideal derivative control law can be expressed as
u(t) = K_d de(t)/dt,   (7)

where K_d is the derivative gain. The corresponding controller transfer function is

C(s) = K_d s.   (8)

The meaning of the derivative action can be better understood by considering the first two terms of the Taylor series expansion of the control error at time T_d ahead:

e(t + T_d) ≈ e(t) + T_d de(t)/dt.   (9)

If a control law proportional to this expression is considered, i.e.,

u(t) = K_p (e(t) + T_d de(t)/dt),   (10)

this naturally results in a PD controller. The control variable at time t is therefore based on the predicted value of the control error at time t + T_d. For this reason, the derivative action is also called anticipatory control, or rate action, or pre-act.

The combination of the proportional, integral, and derivative actions can be done in different ways. In the so-called ideal or non-interacting form, the PID controller is described by the following transfer function:

C_i(s) = K_p (1 + 1/(T_i s) + T_d s),   (11)

where K_p is the proportional gain, T_i is the integral time constant, and T_d is the derivative time constant. An alternative representation is the series or interacting form:

C_s(s) = K_p' (1 + 1/(T_i' s)) (T_d' s + 1) = K_p' ((T_i' s + 1)/(T_i' s)) (T_d' s + 1),   (12)

where the fact that a modification of the value of the derivative time constant T_d' also affects the proportional action justifies the nomenclature adopted. Suitable conversion formulae can be applied to obtain an ideal PID controller equivalent to a series one. Obtaining an equivalent PID controller in series form starting from an ideal one is possible only if the zeros of the ideal PID controller are real.

Additional Functionalities

The expression (11) or (12) of a PID controller is actually not employed in practical cases because of a few problems that can be solved with suitable modifications of the basic control law.

Modifications of the Derivative Action
From expressions (11) and (12), it appears that the controller transfer function is not proper, because of the derivative action, and therefore it cannot be implemented in practice. Indeed, the high-frequency gain of the pure derivative action is responsible for the amplification of the measurement noise in the manipulated variable. This problem can be solved by filtering the derivative action with (at least) a first-order low-pass filter. The filter time constant should be selected in order to suitably filter the noise and to avoid a significant influence on the dominant dynamics of the PID controller. Thus, it can be selected as T_d/N, where N generally assumes a value between 1 and 33, although in the majority of practical cases its setting falls between 8 and 16. Alternatively, the overall control variable can be filtered.

Another issue related to the derivative action that has to be considered is the so-called derivative kick. In fact, when an abrupt (stepwise) change of the set-point signal occurs, the derivative action is very large, and this results in a spike in the control variable signal, which is undesirable. This problem can be simply avoided by applying the derivative term to the process output only instead of to the control error. In this case, the ideal (not filtered) derivative action becomes

u(t) = -K_p T_d dy(t)/dt.   (13)
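As a rough sketch of how these two modifications look in a digital implementation (the backward-Euler discretization, the parameter values, and the simulated first-order process below are our own assumptions, not part of the entry), the derivative term can be low-pass filtered with time constant T_d/N and applied to the process output only:

```python
class PID:
    """Minimal discrete-time PID in ideal form (11), with the derivative
    term filtered by a first-order lag (time constant Td/N) and acting on
    the process output y only, to limit noise amplification and avoid
    derivative kick. Parameter values used below are illustrative."""

    def __init__(self, Kp, Ti, Td, N=10.0, dt=0.01):
        self.Kp, self.Ti, self.Td, self.dt = Kp, Ti, Td, dt
        self.Tf = Td / N          # derivative filter time constant
        self.integral = 0.0
        self.d = 0.0              # filtered derivative term
        self.y_prev = None

    def update(self, r, y):
        e = r - y
        # Integral action, forward-Euler: I[k] = I[k-1] + (Kp*dt/Ti)*e[k].
        self.integral += self.Kp * self.dt / self.Ti * e
        # Backward-Euler discretization of -Kp*Td*s/(1 + Tf*s) applied to y:
        # d[k] = Tf/(Tf+dt)*d[k-1] - Kp*Td/(Tf+dt)*(y[k] - y[k-1]).
        if self.y_prev is not None:
            self.d = (self.Tf * self.d
                      - self.Kp * self.Td * (y - self.y_prev)) / (self.Tf + self.dt)
        self.y_prev = y
        return self.Kp * e + self.integral + self.d

# Closed-loop check on an assumed first-order process dy/dt = (u - y)/tau,
# with a unit set-point step: the integral action drives y to 1.
pid = PID(Kp=2.0, Ti=1.0, Td=0.1, N=10.0, dt=0.01)
y, tau = 0.0, 0.5
for _ in range(2000):          # simulate 20 s
    u = pid.update(1.0, y)
    y += 0.01 * (u - y) / tau
print(round(y, 3))
```

Because the derivative acts on y, a set-point step produces no spike in u, while the response to load disturbances (which enter through y) is unchanged, consistent with the remark following (13).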
Obviously, when the set-point signal is constant, applying the derivative term to the control error or to the process variable is equivalent. Thus, the load disturbance rejection performance is the same in both cases.

Set-Point Weighting for Proportional Action
A typical problem in the design of a feedback controller is to achieve high performance in both the set-point following task and the load disturbance rejection task at the same time. For example, for stable processes, a fast load disturbance rejection is achieved with a high-gain (aggressive) controller, which, on the other side, gives an oscillatory set-point step response. This problem can be approached by using a two-degree-of-freedom control architecture, where a feedback controller is designed to achieve a high bandwidth and therefore a satisfactory load disturbance rejection performance, and then the set-point signal is filtered before applying it to the closed-loop system.

In the context of PID control, this can be achieved by weighting the set-point signal for the proportional action, that is, by defining the proportional action as follows:

u(t) = K_p (β r(t) - y(t)),   (14)

where the value of β is between 0 and 1. In this way, the control scheme has a feedback controller (11), and the set-point signal is filtered by the system

F(s) = (1 + β T_i s + T_i T_d s^2) / (1 + T_i s + T_i T_d s^2).   (15)

The load disturbance rejection task is decoupled from the set-point following task, and obviously it does not depend on the weight β, which can be employed to smooth the (step) set-point signal.

Anti-windup
One of the most well-known possible sources of performance degradation is the so-called integrator windup phenomenon, which occurs when the controller output saturates (typically when a large set-point change occurs). In this case, the system operates as in the open-loop case, since the actuator is at its maximum (or minimum) limit, regardless of the process output value. The control error decreases more slowly than in the ideal case (where there are no saturation limits), and therefore the integral term becomes large (it winds up). Thus, even when the value of the process variable attains that of the reference signal, the controller still saturates due to the integral term, and this generally yields large overshoots and settling times.

In order to cope with this problem, an additional functionality designed for this purpose can be conveniently used. This can be done in different ways. For example, in the conditional integration approach, the integration is stopped when the control variable saturates and the control error and the control variable have the same sign. Alternatively, in the back-calculation approach, the integral term is recomputed when the controller saturates by feeding back the difference of the saturated and unsaturated control signals.

Tuning

The selection of the PID parameters, i.e., the tuning of the PID controller, is obviously the crucial issue in the overall controller design. This operation should be performed in accordance with the control specifications (which should take into account the set-point following, the load disturbance rejection, the control effort, and the robustness of the system). A major advantage of the PID controller is that its parameters have a clear physical meaning, and therefore manual tuning is relatively simple. For example, for sta-