A unified algebraic approach to linear control design

Book · January 2017

DOI: 10.1201/9781315136523

https://www.researchgate.net/publication/327386254



A UNIFIED ALGEBRAIC APPROACH
TO LINEAR CONTROL DESIGN

R.E. Skelton, T. Iwasaki, and K. M. Grigoriadis

February 26, 2013



Dedicated to Judy, Stephanie, Hope, Katie and Grahm. – Robert E. Skelton


To Junko. – Tetsuya Iwasaki
To my parents. – Karolos M. Grigoriadis
Contents

Preface 8

1 Introduction 15
1.1 Output Performance and Second-Order Information . . . . . . . . . . . . . . . . . . 16
1.2 Stability, Pole locations, and Second-Order Information . . . . . . . . . . . . . . . . 17
1.3 Stability Robustness and Second-Order Information . . . . . . . . . . . . . . . . . . 19
1.4 Disturbance Attenuation and Second-Order Information . . . . . . . . . . . . . . . . 19
1.5 Stability Margins Measured by H∞ Norms . . . . . . . . . . . . . . . . . . . . . . . . 21
1.6 Computational Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Chapter 1 Closure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

2 Linear Algebra Review 25


2.1 Singular Value Decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.2 Moore-Penrose Inverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.3 Solutions of Selected Linear Algebra Problems . . . . . . . . . . . . . . . . . . . . . 29
2.3.1 AXB = Y . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.3.2 AX = C, XB = D . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.3.3 AX = C, X = X∗ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.3.4 AX = C, XB = D, X = X∗ . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.3.5 AX = C, X = −X∗ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.3.6 XB = D, X = −X∗ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.3.7 AX = C, XB = D, X = −X∗ . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.3.8 AX + (AX)∗ + Q = 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.3.9 AXBC + (AXBC)∗ + Q = 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.3.10 AX = B, XX∗ = I . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.3.11 (AX + B)R(AX + B)∗ = Q . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.3.12 µBB∗ − Q > 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.3.13 (A + BXC)R(A + BXC)∗ < Q . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.3.14 BXC + (BXC)∗ + Q < 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Chapter 2 Closure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48


3 Analysis of First-Order Information 49


3.1 Solutions of Linear Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . 49
3.2 Solutions of Linear Difference Equations . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.3 Controllability and Observability of Continuous-Time Systems . . . . . . . . . . . . 51
3.3.1 Controllability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.3.2 Observability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.4 Controllability and Observability of Discrete-Time Systems . . . . . . . . . . . . . . 58
3.4.1 Controllability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.4.2 Observability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
3.5 Lyapunov Stability of Linear Systems . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.5.1 Continuous-Time Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.5.2 Discrete-Time Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
Chapter 3 Closure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

4 Second-Order Information in Linear Systems 69


4.1 The Deterministic Covariance Matrix for Continuous-Time Systems . . . . . . . . . 69
4.2 Models for Control Design (Continuous-Time) . . . . . . . . . . . . . . . . . . . . . . 72
4.3 Stochastic Interpretations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.4 The Discrete System D-Covariance . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.5 Models for Control Design (Discrete-Time) . . . . . . . . . . . . . . . . . . . . . . . 78
4.6 System Performance Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
4.6.1 Continuous-Time Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
4.6.2 Discrete-Time Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.7 Robust Stability and Performance Analysis . . . . . . . . . . . . . . . . . . . . . . . 93
4.7.1 Continuous-Time Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
4.7.2 Discrete-Time Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
Chapter 4 Closure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

5 Covariance Controllers 111


5.1 Covariance Control Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
5.2 Continuous-Time Covariance Controllers . . . . . . . . . . . . . . . . . . . . . . . . . 112
5.2.1 State Feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
5.2.2 Covariance Construction for Assignability . . . . . . . . . . . . . . . . . . . . 112
5.2.3 State Feedback for Single Input Systems . . . . . . . . . . . . . . . . . . . . . 117
5.2.4 Output Feedback without Measurement Noise . . . . . . . . . . . . . . . . . . 129
5.2.5 Static Output Feedback with Noisy Measurements . . . . . . . . . . . . . . . 130
5.2.6 Dynamic Output Feedback with Noisy Measurements . . . . . . . . . . . . . 131
5.2.7 Structure of Covariance Controllers . . . . . . . . . . . . . . . . . . . . . . . . 135
5.3 Discrete-Time Covariance Controllers . . . . . . . . . . . . . . . . . . . . . . . . . . 140
5.3.1 State Feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140

5.3.2 Output Feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141


5.3.3 Plant Covariance Assignment . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
5.4 Minimal Energy Covariance Control . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
5.4.1 Continuous-Time Output Feedback . . . . . . . . . . . . . . . . . . . . . . . . 145
5.4.2 Discrete-Time Output Feedback . . . . . . . . . . . . . . . . . . . . . . . . . 148
5.5 Finite Wordlength Covariance Control . . . . . . . . . . . . . . . . . . . . . . . . . . 151
5.6 Synchronous Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
5.7 Skewed Sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
5.8 Covariance Assignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
Chapter 5 Closure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155

6 Covariance Upper Bound Controllers 157


6.1 Covariance Bounding Control Problem . . . . . . . . . . . . . . . . . . . . . . . . . . 157
6.2 Continuous-Time Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
6.2.1 State Feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
6.2.2 Static Output Feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
6.2.3 Reduced Order Dynamic Output Feedback . . . . . . . . . . . . . . . . . . . 165
6.2.4 Full-Order Dynamic Output Feedback . . . . . . . . . . . . . . . . . . . . . . 169
6.3 Discrete Time Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
6.3.1 State Feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
6.3.2 Static Output Feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
6.3.3 Reduced-Order Dynamic Output Feedback . . . . . . . . . . . . . . . . . . . 178
6.3.4 Full-Order Dynamic Output Feedback . . . . . . . . . . . . . . . . . . . . . . 180
Chapter 6 Closure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184

7 H∞ Controllers 185
7.1 H∞ Control Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
7.2 Continuous-Time Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
7.2.1 State Feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
7.2.2 Static Output Feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
7.2.3 Dynamic Output Feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
7.3 Discrete-Time Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
7.3.1 State Feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
7.3.2 Static Output Feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
7.3.3 Dynamic Output Feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
Chapter 7 Closure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204

8 Model Reduction 205


8.1 H∞ Model Reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
8.1.1 Continuous-Time Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205

8.1.2 Discrete-Time Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209


8.2 Model Reduction with Covariance Error Bounds . . . . . . . . . . . . . . . . . . . . 213
8.2.1 Continuous-Time Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
8.2.2 Discrete-Time Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216
Chapter 8 Closure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219

9 Unified Perspective 221


9.1 Continuous-Time Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
9.1.1 Stabilizing Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
9.1.2 Covariance Upper Bound Control . . . . . . . . . . . . . . . . . . . . . . . . . 223
9.1.3 Linear Quadratic Regulator . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
9.1.4 L∞ Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
9.1.5 H∞ Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
9.1.6 Positive Real Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
9.1.7 Robust H2 Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
9.1.8 Robust L∞ Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
9.1.9 Robust H∞ Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
9.2 Discrete-Time Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
9.2.1 Stabilization Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
9.2.2 Covariance Upper Bound Control . . . . . . . . . . . . . . . . . . . . . . . . . 232
9.2.3 Linear Quadratic Regulator . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
9.2.4 ℓ∞ Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
9.2.5 H∞ Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
9.2.6 Robust H2 Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
9.2.7 Robust ℓ∞ Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
9.2.8 Robust H∞ Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
Chapter 9 Closure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237

10 Projection Methods 239


10.1 Alternating Convex Projection Techniques . . . . . . . . . . . . . . . . . . . . . . . . 239
10.1.1 Convexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
10.1.2 Feasibility, Optimization and Infeasible Optimization Problems . . . . . . . . 240
10.1.3 The Standard ACP Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
10.1.4 The Optimal ACP Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
10.1.5 The Directional ACP Method . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
10.2 Geometric Formulation of Covariance Control . . . . . . . . . . . . . . . . . . . . . . 248
10.2.1 State Feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
10.2.2 Dynamic Output Feedback with Measurement Noise . . . . . . . . . . . . . . 249
10.2.3 Output Performance Constraints . . . . . . . . . . . . . . . . . . . . . . . . . 250
10.2.4 Covariance Control Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 251

10.3 Projections for Covariance Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252


10.3.1 Projection onto the Assignability Set . . . . . . . . . . . . . . . . . . . . . . . 252
10.3.2 Projection onto the Positivity Set . . . . . . . . . . . . . . . . . . . . . . . . . 253
10.3.3 Projection onto the Variance Constraint Set . . . . . . . . . . . . . . . . . . . 253
10.3.4 Projection onto the Block Covariance Constraint Set . . . . . . . . . . . . . . 253
10.3.5 Projection onto the Output Cost Constraint Set . . . . . . . . . . . . . . . . 255
10.4 Geometric Formulation of LMI Control Design . . . . . . . . . . . . . . . . . . . . . 256
10.5 Fixed-Order Control Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
Chapter 10 Closure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263

11 Successive Centering Methods 265


11.1 Control Design with Unspecified Controller Order . . . . . . . . . . . . . . . . . . . . 265
11.1.1 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
11.1.2 Analytic Center . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
11.1.3 The Method of Centers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
11.2 Control Design with Fixed Controller Order . . . . . . . . . . . . . . . . . . . . . . . 272
11.2.1 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
11.2.2 A Minimization Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
11.2.3 The XY-Centering Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
11.2.4 Extension to Optimal Control . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
11.3 Control Design with Fixed Controller Structure . . . . . . . . . . . . . . . . . . . . . 283
11.3.1 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283
11.3.2 The VK-Centering Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
Chapter 11 Closure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289

A Linear Algebra Basics 291


A.1 Partitioned Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
A.2 Sign Definiteness of Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292
A.3 A Linear Vector Space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294
A.4 Fundamental Subspaces of Matrix Theory . . . . . . . . . . . . . . . . . . . . . . . . 296
A.4.1 Geometric Interpretations and Definitions . . . . . . . . . . . . . . . . . . . . 296
A.4.2 Construction of the Fundamental Subspaces by SVD . . . . . . . . . . . . . . 301
A.5 Convex Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304
A.6 Matrix Inner Products and the Projection Theorem . . . . . . . . . . . . . . . . . . 305

B Calculus of Vectors and Matrices 307


B.1 Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
B.2 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309

C Balanced Model Reduction 313


Preface

This book provides a unifying point of view of systems and control with a focus on linear systems.
With so many books already available, one may ask why another book on linear systems and control
is necessary, especially since each of the many directions in control theory has reached a fairly
mature state. However,
the tools used to develop the existing results are fundamentally different, and there is no unifying
point of view. In fact, there is still a wide-ranging debate over which approach is “best”. Some of
these debates and how they relate to the unifying themes of this book are described in this preface.

Frequency Domain versus State Space Methods.


The frequency domain methods do not directly apply to time-varying and nonlinear systems
(except for the isolated nonlinearities that allow describing function analysis). On the other hand,
the frequency domain, using the classical methods of Evans, Bode, and Nyquist, readily yields
simple controllers of low order. The tractable state space optimization theory yields controllers of
high order, as opposed to the easily accommodated low-order controllers of classical control. The
Youla parametrization [160], when used in optimization problems (see [6]), may yield even higher-
order controllers. Many researchers in control theory view the lack of a tractable method for the
design of low-order controllers as the most fundamental deficiency of modern control theory. This
book presents a parametrization of all stabilizing controllers of low order (equal to or less than
the plant order) to aid in the practical problem of designing a simple controller for a high-order
complex system.

Deterministic versus Stochastic Methods.


The debate over whether to treat the system as deterministic or stochastic has been as heated
as the debate over frequency or time domain methods. One argument against stochastic methods
is that guarantees of absolute values of signals are not possible; yet, bounds on signals are very
practical and important considerations when dealing with real systems with sensor and actuator
saturations, and physical limits of stresses in structures, etc. On the other hand, one argument
for stochastic analysis is that there are no physical sensors and actuators without some electronic
noise on the outputs. In the stochastic literature, covariance analysis is a cornerstone in filtering
theory and this powerful theory has found many practical uses. In the book [126] a step is taken to
unify the two points of view (deterministic and stochastic). By offering a deterministic treatment
of excitations (initial conditions and impulses), an analysis of the deterministic system is given that
is shown to be mathematically equivalent to the covariance analysis of stochastic processes. Thus,
a deterministic interpretation of the covariance matrix is given. This book will further extend these
ideas to both time-varying and discrete-time systems, so that a background in stochastic processes
is not required to read this book.

Control versus Signal Processing.


To our knowledge, all controllers that have been implemented in aerospace hardware have been
designed in two uncoordinated steps. First, the control analyst designs the controller under the
assumption of infinite precision computation (analog or digital). Secondly, the signal processing
and computer sciences group within the company implements the given controller in a special set
of coordinates chosen for scaling and minimization of computational errors in the flight computer.
To integrate these two steps, a characterization of computational errors should be considered in
the initial design of the controller. In this way it is possible to design controllers to be optimal in
the presence of delay, in the presence of finite wordlengths in the control computer, in the digital
to analog converter, and in the analog to digital converter. In this regard, this book follows the
lead of Williamson [153] and Gevers et al. [37] to suggest some degree of integration of the fields of
signal processing and control. Furthermore, we shall parametrize all controllers which can stabilize
a given system with controllers of a given wordlength and time-delay, assuming a stochastic model
of round-off errors.

Modeling versus Control Design.


It is well understood by now that the modeling and control design problems are not independent
[127]. For this reason, twenty years of model reduction research has left us with few guarantees
about closed-loop performance using reduced-order models. Hence, many of these researchers
moved to the more specialized subject of “controller reduction”, with little additional success.
Optimization problems for controllers of fixed order promise a better answer theoretically, but the
corresponding nonlinear mathematical program presents computational problems. Alternatively, one
can separate the low-order controller design into two uncoordinated steps (model reduction then
controller design, or controller design then controller reduction), a more tractable but less
theoretically satisfying approach. This book takes none of these three approaches to fixed-order
controller design, but introduces a unification of the modeling and control problem in the sense
that the controller order is not fixed at the outset, but is guaranteed to be equal to or less than
the order of the plant.

Scalar versus Multiobjective Methods.


Most of the literature on optimal control deals with a single objective function (or “cost” function).
It is not practical to judge the relative merits of a controller by computing a single number.
On the other hand, multiobjective optimal (or “Pareto-optimal”) control is computationally demanding.
Great improvement in such problems has been made by first ignoring optimization, and
characterizing all solutions that lie within a set of inequality constraints. The necessary and
sufficient conditions for a solution to lie within a set of (multiple) inequality constraints are usually
considerably simpler and more tractable than the necessary conditions for optimization. Optimization
can often be approached from the point of view of studying “feasibility” of the inequality
constraints, by reducing the upper bounds in the inequality constraints until there are no feasible
solutions. We take this view and do not focus on optimization, but on the satisfaction of matrix
inequality constraints. This gives a multiobjective nature to our problem.

Performance versus Stability.


The vast majority of control theory has focused on stability, while performance guarantees have
received much less attention. Being able to guarantee specific bounds on the response is needed in
practice. Indeed, “stability” is a mathematical property of a particular “mathematical model” of
the system and never a guaranteed property of the physical system itself. When writing a book
on controller design for real world implementation one must differentiate between the physical
system and its mathematical model. Quite often the model of the closed-loop system may have
the “asymptotic stability” property, but the physical system never does. Even in the absence of
external disturbances the control signal produced by a digital computer does not asymptotically go
to zero, but to some sort of limit cycle dictated by the computer wordlength. Therefore, the ability
to guarantee upper bounds on signals is more important than the (admittedly artificial) ability to
guarantee asymptotic stability. If stability has been overemphasized in the control literature, then
performance has been under-emphasized. Lack of stability might be considered a disaster, but in
physical systems disaster comes long before instability. A stable billion dollar space telescope would
be considered a disaster if it fails to meet the pointing accuracy required to make the observations
and pictures useful.
The comparison of classical versus modern control methods has been characterized by the
following oversimplification: “In classical control one designs for stability, but then must check
for performance, whereas in modern control, one designs for performance, but then must check for
stability”. Both classical and modern approaches have made progress toward integration of stability
and performance design objectives, but only for highly specialized definitions of performance and
stability margins. Without some assumptions, optimality does not guarantee stability. Hence, it
has been popular recently to optimize the H2 performance scalar (integral squared of the transfer
function) subject to an upper bound on the “H∞ norm” (peak of the transfer function). The H∞
bound delivers a certain kind of stability margin and the H2 norm represents a (scalar) performance
measure, related to the Root Mean Squared (RMS) behavior of the output signal. The control
design procedures in this book can include these kinds of design objectives. We also seek a method
that can treat time-domain “L∞ norms” (peak of time response), since these represent the physical
limits of real-time signals, such as sensors and actuators that are subject to saturation.
This book discusses two kinds of robustness: “performance” robustness and “stability” robust-
ness. By these phrases we mean that certain performance and stability guarantees hold in the
presence of specified errors in the plant model or in the disturbances.
The 1956 result of Massera [89] states that if a system is asymptotically stable, then there
exists a Lyapunov function to prove it so. The practical value of this theorem is that the search
for a Lyapunov function is not a waste of time (the set of functions with the properties we seek is
never an empty set). The theoretical value of this theorem is that the search for a characterization
of all stabilizing controllers would be well served by a characterization of all Lyapunov functions.
Moreover, for linear systems, only quadratic Lyapunov functions are needed to capture necessary
and sufficient conditions for stability. That is the approach of this book: to parametrize (for
linear systems) the set of all quadratic Lyapunov functions. This allows us to capture the set of all
stabilizing controllers. From this set, all controllers which can meet the (certain matrix inequality)
performance requirements are parametrized. The connections between Lyapunov stability and
deterministic definitions of RMS performance allow the above unification of the theories for
performance and stability guarantees.
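The quadratic-Lyapunov claim above can be illustrated numerically. The sketch below is hypothetical (the matrix A and weight Q are invented examples, not from the text): for a stable linear system ẋ = Ax, solving the Lyapunov equation AᵀP + PA = −Q with Q > 0 always produces a positive definite P, so V(x) = xᵀPx is a quadratic Lyapunov function.

```python
# For a stable linear system xdot = A x, a quadratic Lyapunov function
# V(x) = x' P x exists: solve A' P + P A = -Q with Q > 0.
# A is an illustrative stable matrix (an assumption, not from the text).
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])   # eigenvalues -1 and -2: stable
Q = np.eye(2)

# solve_continuous_lyapunov(a, q) solves a X + X a^H = q,
# so passing A' and -Q yields A' P + P A = -Q.
P = solve_continuous_lyapunov(A.T, -Q)

print(np.all(np.linalg.eigvalsh(P) > 0))   # True: P is positive definite
```

Since A here is stable and Q is positive definite, the solution P is guaranteed positive definite, which is exactly the "search for a Lyapunov function is never in vain" point made above.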

Choosing a Design Space.


From 1930 to 1955 the two-dimensional space of the complex plane was the workhorse of control
theory. Design in this space allows placement of poles and zeros. Various other two-dimensional
plots have special use in such designs, including the Bode plot, the Nichols plot, the Nyquist plot
and the Root Locus. The essential tool here was complex analysis.
In the 1960’s state space became the popular design space. Because the n-dimensional state
space does not lend itself to plotting, the graphical methods made popular for the two-dimensional
complex plane played less of a role in this period of control development. Rather, the essential tools
used here were optimization, the calculus of variations, ordinary differential equations and Hilbert
spaces.
In the 1980’s a modern version of complex analysis extended the classical notions of stability
margins to Multi-Input Multi-Output systems in a systematic way. In the Single-Input Single-
Output (SISO) system the Nyquist plot (for unit feedback systems) could indicate the peak magni-
tude of the closed-loop frequency response (closed-loop Bode diagram) by adding “M circles” [52]
to the Nyquist plot. The M circles are the locus of points in the complex plane of the open-loop
transfer function that correspond to the same magnitude in the closed-loop frequency response.
The modern H∞ theory allows the characterization of the peak magnitude of not just one transfer
function, but the “magnitude” (norm) of a transfer matrix. The essential tools here were Hardy
spaces and complex analysis [29].


In the late 1980’s a state space interpretation of H∞ theory was provided [25], and robust control
became a strong focus for research. Guaranteeing an upper bound on the H∞ norm in the presence
of parameter perturbations remains a major focus of control theorists to this day. The objectives of
robust H∞ control are stability-margin types of goals (in the presence of plant perturbations), and
maintaining an H∞ norm bound is qualitatively a stability-margin type of design goal.
In 1977, Nagayasu [93] introduced (in Japanese) the existence condition for solving the state
feedback covariance control problem. During the mid 1980’s this approach to control called “covari-
ance control” was developed in the English literature [53]. The objectives here were the assignment
of all n(n+1)/2 elements of the state covariance matrix. The n(n+1)/2-dimensional “covariance
space” has some important features. By increasing the design space from n (as in state space) to
n(n+1)/2, the class of systems that can be studied with the simple tools of linear algebra is
increased! That is, the class of problems that can be represented as linear problems in the
n(n+1)/2-dimensional space is larger than the class of problems that are linear in state space. In
addition to enlarging the class of control problems that can be treated by linear methods, the
covariance control theory needs only the tools of linear algebra.
The fundamental contribution of this book is to show that a large class of control problems
reduces to a problem in linear algebra. In fact, some 18 control problems (9 continuous-time plus 9
discrete-time) reduce to a single linear algebra problem! Hence linear algebra is the enabling tool
that allows students to view the vast majority of linear system control problems from a common
setting. The aim of this book is to show how to use linear algebra to accomplish this goal.
It is a pleasure to acknowledge the diligence and patience of Jill Comer and Becky May who
typeset much of the manuscript. Thanks also go to the students of Purdue’s course, “Control
of Uncertain Systems”, AAE 664, who provided helpful feedback since 1991 when this book was
initiated. In fact, two of those students were so helpful and imaginative that they became co-authors
of the book!
Chapter 1

Introduction

To illustrate the concepts of this book, a simple system described by a scalar differential equation
is useful:

    ẋ(t) = ax(t) + dw(t),    ẋ ≜ (d/dt)[x(t)],
    y(t) = cx(t),                                                  (1.1)

where x(t) is the state, w(t) is the external input to the system, and y(t) is the output of interest.
Scalars a, d and c are given constants. The solution of this differential equation is

    y(t) = c e^{at} x(0) + ∫₀ᵗ c e^{a(t−τ)} d w(τ) dτ.            (1.2)
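As a sanity check of the solution formula, one can compare it against direct numerical integration of the differential equation. This is a hypothetical sketch: the parameter values and the smooth input w(t) = sin(t) are illustrative choices, not from the text.

```python
# Check the solution formula (1.2) for xdot = a x + d w, y = c x
# against direct numerical integration, for a smooth input w(t) = sin(t).
import numpy as np
from scipy.integrate import solve_ivp, quad

a, c, d = -1.0, 2.0, 0.5   # illustrative constants
x0 = 1.0                   # initial state x(0)
w = np.sin

def formula_y(t):
    # y(t) = c e^{at} x(0) + integral_0^t c e^{a(t-tau)} d w(tau) dtau
    integral, _ = quad(lambda tau: c * np.exp(a * (t - tau)) * d * w(tau), 0.0, t)
    return c * np.exp(a * t) * x0 + integral

# Integrate the ODE directly and compare y(t) = c x(t) at a few times
sol = solve_ivp(lambda t, x: a * x + d * w(t), (0.0, 5.0), [x0],
                rtol=1e-10, atol=1e-12, dense_output=True)
for t in (0.5, 2.0, 5.0):
    assert abs(c * sol.sol(t)[0] - formula_y(t)) < 1e-6
print("formula (1.2) matches numerical integration")
```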

Consider two possible excitations due to the impulsive disturbance in w(t) and the nonzero initial
state x(0) indexed by i = 1, 2 where

{w(t) = wδ(t), x(0) = 0}    for i = 1    (1.3a)

{w(t) = 0, x(0) = x0}    for i = 2.    (1.3b)

Let y(i, t) denote the response y(t) in the event of excitation i. Then

y(1, t) = ce^{at} dw    (1.4a)

y(2, t) = ce^{at} x0.    (1.4b)

If the differential equation (1.1) describes the time history of the error due to the excitations w(t)
and x(0) from a certain desirable system state, or equivalently, if the variable y dictates the error,
performance of the system can be measured by
Y = Σ_{i=1}^{2} ∫_0^∞ y²(i, t) dt    (1.5)

X = Σ_{i=1}^{2} ∫_0^∞ x²(i, t) dt.    (1.6)

where it can readily be shown that X and Y satisfy the following:

Y = c²X,    0 = 2Xa + d²W + X0,    (1.7)

where W ≜ w² and X0 ≜ x0². Note that we do not presume to know the actual initial condition x0, or the impulsive disturbance w, only their magnitudes. Actually, the presumption about the excitations is even less specific than uncertainties about x0 and w. Note in (1.7) that our performance measure X is equivalent over a wide class of excitations characterized by d²W + X0 = constant.
Let us refer to y(t) as first-order information about system (1.1), and to Y as second-order
information. The study of first-order information occupies much of the control literature. However,
it is the premise of this book that second-order information can be more informative, and indeed,
that many essential properties of (1.1) can be characterized by X, the second-order information
about the state, x. We will later motivate the use of the word “covariance” as a label for X.
These ideas will be illustrated in this chapter, but for motivational illustrations we will use the
scalar system (1.1), or its control problem counterpart:

ẋ(t) = ax(t) + bu(t) + dw(t)
y(t) = cx(t),    u(t) = gy(t),    (1.8)

where u(t) is the control input and g is the feedback gain to be determined. The subsequent
chapters will develop these ideas and the mathematical design tools for the general Multi-Input
Multi-Output (MIMO) case.

1.1 Output Performance and Second-Order Information


From (1.1)-(1.5), define a certain norm (size) of y(i, t) by

||y||L2 ≜ [ Σ_{i=1}^{2} ∫_0^∞ y²(i, t) dt ]^{1/2}.    (1.9)

The quantity (1.9) is closely related to a Root Mean Square (RMS) value, [ (1/T) ∫_0^T y²(t) dt ]^{1/2}, for
some finite T , which is frequently used as a measure of system performance. However, the infinite
horizon (T → ∞) is often more important in engineering problems. More motivation for our use
of performance measure (1.9) will be added later. Quite often in practice the ||y||L2 value of the
output is of interest, rather than the actual output trajectory y(t). For example, in telescope or
antenna pointing problems, pointing control accuracies need not exceed the resolution of the film
or image processing equipment. As long as a light ray is controlled such that it remains within the
grain size of the film (the faster the film, the larger the grain size), the quality of the picture is
limited only by the properties of the film and optics and not by the control system. This is the ultimate objective of the control designer: to design the control system so that the controller is not the limiting factor in the total system performance.

Physical systems are never asymptotically stable, nor completely controllable, nor completely
observable. That is, there is always a limit on our ability to control the system state (we cannot
take it exactly to zero), or to observe (measure) what the complete system is actually doing. Nor can we measure with infinite precision. There are periodic behaviors (limit cycles) in the output
due to round-off in finite precision computers, or drifts and nonlinearities in the actuators and
sensors. But our inability to control the output to zero or to know exactly what the output is doing
might not prevent us from placing bounds on their RMS or ||y||L2 values. We will show that bounds
on the absolute value of first-order information (signals) can be related explicitly to second-order
information. This is the fundamental advantage of working with second-order information; and
we will show that “performance on the average”, or “bounds on the absolute values” constitutes a
more realistic objective for control design than the “drive the output zero” requirement associated
with first-order system properties. We will develop a complete theory for assigning a specific value
of X, which fixes the value ||y||L2 = Y^{1/2} = (c²X)^{1/2}, or provides upper bounds on the same. We will also be able to fix upper bounds on the absolute value |y(t)| for certain disturbances.
In this book on the control of second-order information, we begin with linear systems with a
detailed study of their stability properties.

1.2 Stability, Pole locations, and Second-Order Information


Consider the system (1.1). The system is said to be stable if x(t) approaches zero as t → ∞ for all
initial x0 with w(t) = 0. In view of (1.2), the system is stable if and only if a < 0. We argue here
that second-order information is closely connected to stability. Indeed, a given set of first-order
data x(t), 0 ≤ t < ∞, might not reflect any stability properties of a. See from (1.2) and (1.4)
that x(t) can be zero for some initial condition (x0 = −dw), independently of the properties of a.
On the other hand, a given positive value of X is equivalent to stability of a. To see this,
compute X in (1.7) to get for system (1.1),

2aX = −(d²W + X0).    (1.10)

Hence a < 0 is equivalent to X > 0 provided d²W + X0 > 0. Furthermore, Y in (1.5) can be computed from the experimental first-order data y(i, t), without knowledge of model data a, d, c. In
our attempt to develop control techniques that can be tested in practice, and improved by redesign
in the field, this fact cannot be overemphasized. If data is available, Y can be computed directly
from (1.5), or, equivalently, from the model-based equation (1.7).
Now consider the control problem to choose g in (1.8). To apply (1.7) to model (1.8), instead
of model (1.1), replace a in (1.7) by a + bgc and solve (1.7) for g,

g = − (2aX + d²W + X0) / (2bcX).    (1.11)
Having noted already the equivalence of X > 0 and a + bgc < 0 (an equivalence that does not depend on the specific numerical value of the positive number d²W + X0, hence stability does not depend on initial conditions), we can regard (1.11) as a parametrization of all stabilizing controllers, u = gy. The parametrization is explicit in terms of an arbitrary positive number X, where for design purposes X should be chosen according to a desired constraint Y = c²X = ||y||²L2 ≤ Ȳ.
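The design recipe (1.7) → (1.11) can be sketched in a few lines of Python; the numerical values of a, b, c, d, W and X0 below are illustrative assumptions, not data from the text.

```python
# A minimal sketch of the scalar covariance-control parametrization (1.11)
# for system (1.8); all numbers below are assumed for illustration.
a, b, c, d = 1.0, 2.0, 1.0, 1.0   # open-loop unstable since a > 0
W, X0 = 1.0, 1.0                  # squared intensities of w and x(0)

X = 0.5                           # any X > 0 is an admissible design choice
g = -(2*a*X + d**2*W + X0) / (2*b*c*X)   # equation (1.11)

a_cl = a + b*g*c                  # closed-loop pole
Y = c**2 * X                      # performance, equation (1.7)

# The closed loop is stable and (1.7) holds with a replaced by a + bgc.
assert a_cl < 0
assert abs(2*X*a_cl + d**2*W + X0) < 1e-12
print(g, a_cl, Y)
```

Any other X > 0 yields a different stabilizing gain, which is exactly the parametrization the text describes.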

Example 1.2.1 Consider a vertically fired rocket [126], where m = mass, g = gravity constant, v = speed, fc = control force, fn = impulsive disturbance such as arising from startup transients in the rocket firing. In the form of (1.8), the model is given by the following system parameters:

a = 0,    b = m⁻¹ = d,    c = 1,    u = fc − mg,    w = fn,    y = x = v − v̄,

fn(t) = wδ(t),    W = w²,    X0 = (v(0) − v̄)²,

where w(t) is the assumed impulsive error in the thrust, v̄ is the desired (constant) velocity and v(0) is the assumed initial velocity. (We specify here only the magnitude of the initial error X0 = (v(0) − v̄)²; we do not presume to know the actual initial condition.) Then the control u = g(v − v̄) that regulates the speed to the desired value v̄, with a guaranteed error ||y||L2 = ||(v − v̄)||L2 = .01, is, from (1.11),

g = − (m/2) 10⁴ [ W/m² + X0 ].
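The closed-form gain of the example can be cross-checked against the general formula (1.11); the mass m and the magnitudes W, X0 below are assumed for illustration.

```python
# A numeric sanity check of the rocket example; m, W, X0 are assumptions.
m = 2.0
W, X0 = 0.04, 0.01

a, b, c, d = 0.0, 1.0/m, 1.0, 1.0/m
X = 1e-4                                          # so that (c^2 X)^{1/2} = .01
g_general = -(2*a*X + d**2*W + X0) / (2*b*c*X)    # equation (1.11)
g_example = -(m/2.0) * 1e4 * (W/m**2 + X0)        # closed form in the example

assert abs(g_general - g_example) < 1e-9          # the two formulas agree
assert a + b*g_example*c < 0                      # regulated error decays
print(g_example)
```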

Note that the evaluation of the ||y||L2 performance of any controller requires knowledge of a
set of excitation events (which we loosely call “disturbances”, w and v(0) in the above example).
Since we are controlling only second-order information, an entire set of various disturbances can
lead to the same controller. In this example the controller is invariant under the infinite number
of disturbances (w, v(0)) satisfying
w²/m² + (v(0) − v̄)² = constant.
In first-order analysis the disturbance environment must be specified exactly to solve the differential
equations for y(t). However, in second-order analysis the disturbance environment need not be
exactly specified. It is reasonable and easier in practice to specify a possible set of disturbances,
rather than pinning our analytical predictions on a single specific disturbance.
Note that the eigenvalue of the closed-loop system (of course there is only one, a + bgc, in this example) can also be related to the second-order information X or Y:

a + bgc = − (d²W + X0) / (2X)    (1.12)

and the performance constraint X ≤ X̄ guarantees a stability margin

a + bgc ≤ − (d²W + X0) / (2X̄).    (1.13)

Hence, both transient properties (eigenvalues) and steady-state properties ||y||L2 can be related to second-order information X.

For the general MIMO system, we will develop a parametrization of all stabilizing controllers
(state feedback, output feedback, or dynamic controllers) in much the same spirit as (1.11). These
results will also be easy to use for assigning poles to a region (by choosing X appropriately, as in
(1.13)).

1.3 Stability Robustness and Second-Order Information


The “stability robustness” issue addresses this concern: given that the nominal closed-loop system (1.8) is stable (a + bgc < 0), what can be guaranteed about stability of a + ∆a + bgc, where ∆a represents the uncertainty in a? (Of course, we are oversimplifying everything in this motivational chapter; b, c and d may also change in general.) Y changes to Y + ∆Y when a changes to a + ∆a. Suppose we desire a specific ||y||L2 performance bound on the output, say ||y||²L2 = Y + ∆Y ≤ 0.02, for all uncertainty ∆a in the range

−.1 ≤ ∆a/|a| ≤ .1    (1.14)

and we also desire stability of a + bgc + ∆a over this range. From (1.10) with a + ∆a + bgc replacing
a, and X + ∆X replacing X,

Y + ∆Y = c²(X + ∆X) = − c²(d²W + X0) / (2(a + ∆a + bgc)) ≤ 0.02.    (1.15)

From (1.14) and (1.15), it may be shown that the controller (1.11) will satisfy both the stability and the performance constraint for this range of choices of X:

X ≤ (d²W + X0) / (0.2|a| + 50(d²W + X0)c²)    (1.16)

in the design equation (1.11). The text will relate stability robustness properties of general linear
systems to the second-order information X and Y , and parametrize robustly stabilizing controllers
in terms of X as in (1.11).
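A small numeric experiment, with assumed values for a, b, c, d, W and X0, suggests how the bound (1.16) behaves: choosing X at the bound keeps every perturbed closed loop stable with Y + ∆Y ≤ 0.02.

```python
import numpy as np

# Sketch checking the robust-performance bound (1.16) on assumed numbers.
a, b, c, d = -1.0, 1.0, 1.0, 1.0
W, X0 = 0.5, 0.5
S = d**2*W + X0                       # forcing term d^2 W + X0 = 1

X = S / (0.2*abs(a) + 50.0*S*c**2)    # largest X allowed by (1.16)
g = -(2*a*X + S) / (2*b*c*X)          # design equation (1.11)

for da in np.linspace(-0.1*abs(a), 0.1*abs(a), 201):
    a_cl = a + da + b*g*c
    assert a_cl < 0                               # robust stability
    Y_pert = -c**2 * S / (2.0*a_cl)               # equation (1.15)
    assert Y_pert <= 0.02 + 1e-9                  # performance bound
print(X, g)
```

At the worst perturbation ∆a = 0.1|a| the perturbed performance sits exactly at the 0.02 limit, which is why (1.16) is tight.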

1.4 Disturbance Attenuation and Second-Order Information



Suppose now we desire to bound the first-order property |y(t)| ≤ .02 subject to zero initial state
x(0) = 0 and arbitrary disturbances w(t) with a known upper bound β on the “energy” of the
disturbance:

∫_0^∞ w²(t) dt ≤ β.    (1.17)

By this definition, Figure 1.1 shows three disturbances with the same energy level β:

∫_0^∞ wj²(t) dt = 2,    j = 1, 2, 3

Figure 1.1: Energy Equivalent Signals (plots of w1(t), w2(t) and w3(t) versus time).


( √
√ −(1/2)t 2, 1 ≤ t ≤ 2
w1 (t) = 2e , w2 (t) =
0
( √
3
2 (t − 3), 3 ≤ t ≤ 5
w3 (t) = .
0
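A quick computation confirms that all three signals carry the same energy of 2 (a Riemann sum is used for w1; the piecewise signals are integrated exactly).

```python
import numpy as np

# Energies of the three disturbances of Figure 1.1: each integrates to 2.
t = np.linspace(0.0, 80.0, 400001)
w1 = np.sqrt(2.0) * np.exp(-0.5*t)
E1 = float(np.sum(w1[:-1]**2 * np.diff(t)))   # ~ int_0^inf 2 e^{-t} dt = 2

E2 = 2.0 * (2 - 1)                   # w2^2 = 2 on [1, 2]
E3 = (3.0/4.0) * (5 - 3)**3 / 3.0    # w3^2 = (3/4)(t-3)^2 on [3, 5]

assert abs(E1 - 2.0) < 1e-3
assert E2 == 2.0 and abs(E3 - 2.0) < 1e-12
print(E1, E2, E3)
```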
The text will show for the general case what we now state for the scalar case,
y²(t) ≤ Y [ ∫_0^∞ w²(τ) dτ ],    0 ≤ t ≤ ∞.    (1.18)

Hence, in the presence of an arbitrary disturbance w(t), the peak value of y(t) over all time is
bounded by the right-hand side of (1.18), where Y is computed by (1.7) with X0 = 0 as if w(t)
were impulsive with intensity W = 1. Hence, Y describes a disturbance robustness feature of a
linear system. In fact, the inequality in (1.18) becomes arbitrarily close to equality for a special
disturbance w̄(t) [19]. This special “worst-case” disturbance is useful in practice, since the engineer
can guarantee that if y(t) has acceptable peak values with disturbance w̄(t) satisfying (1.17), then
y(t) will have acceptable peak values for all other disturbances whose energy level is bounded by
the same value β. Such worst-case disturbances will be exhibited later in the text.
By choosing the second-order information c2 X appropriately, desired disturbance robustness
properties are obtained, guaranteeing upper bounds on the peak value of the signal y(t) over all
time. In systems with sensor and actuator saturations this signal bounding capability is extremely
important.

1.5 Stability Margins Measured by H∞ Norms


For the closed-loop system (1.8), the peak value of the frequency response is a reliable measure
of relative stability (for a unity feedback system the M circles on the Nyquist plot indicate this
magnitude),

H∞ = max_ω | c(jω − (a + bgc))⁻¹ d | = max_ω |cd| / {ω² + (a + bgc)²}^{1/2} = |cd| / |a + bgc|.
It will be shown below that the number we define here as H∞ is less than or equal to a specified
number γ > 0 if X∞ > 0 solves

0 = 2(a + bgc)X∞ + γ⁻²c²X∞² + d².    (1.19)

Suppose c ≠ 0, d ≠ 0 and the equation (1.19) has a positive real solution X∞. This is equivalent to stability of a + bgc and

|cd| / |a + bgc| ≤ γ

since the positive solution of (1.19) is

X∞ = (a + bgc)(γ²/c²) [ −1 + √( 1 − ( |cd| / (γ|a + bgc|) )² ) ].

Solving (1.19) for g, all stabilizing gains yielding the H∞ norm less than or equal to γ are

g = − (2aX∞ + d² + γ⁻²c²X∞²) / (2bcX∞),    X∞ > 0.    (1.20)
Notice that the linear algebra problem that solves the “covariance” control problem for g, yielding (1.11), also solves (1.19) for g, simply by replacing the forcing term (d²W + X0) in (1.7) (where a → a + bgc) by (d² + γ⁻²c²X∞²). Compare (1.20) with (1.11) to see this point. Hence, almost all essential results of this book come from the solution of the covariance control problem (1.11). If W = 1 and X0 = 0, then d²W + X0 < d² + γ⁻²c²X∞², and hence X < X∞ if X∞ is positive real. Hence X∞ is an upper bound on the covariance of the system. By choosing the same controller to satisfy both a performance constraint (X < X̄) from (1.7) and a stability margin constraint (1.19), for some γ > 0, the competing objectives of stability versus performance can be satisfied or traded. Solving either (1.7) or (1.19) for the controller g is a problem in linear algebra. To prepare for these problems the next chapter is devoted to a review of linear algebra and matrix methods.
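As a sketch of the H∞ test (1.19) on assumed scalar data, the quadratic can be solved numerically, compared with the closed form above, and checked against the frequency-response peak.

```python
import numpy as np

# Scalar H-infinity check of (1.19); a, b, c, d, g, gamma are assumptions.
a, b, c, d = -1.0, 1.0, 1.0, 1.0
g = -1.0                              # some stabilizing gain
acl = a + b*g*c                       # closed-loop pole, here -2
gamma = 1.0                           # required H-infinity level

# 0 = 2*acl*X + (c/gamma)^2 X^2 + d^2, equation (1.19)
roots = np.roots([(c/gamma)**2, 2.0*acl, d**2])
X_inf = min(r.real for r in roots if abs(r.imag) < 1e-12 and r.real > 0)

X_closed = (acl*gamma**2/c**2) * (-1.0
            + np.sqrt(1.0 - (abs(c*d)/(gamma*abs(acl)))**2))
assert abs(X_inf - X_closed) < 1e-9   # matches the closed form in the text

w = np.linspace(0.0, 100.0, 100001)
peak = np.max(np.abs(c*d / (1j*w - acl)))   # |cd|/|acl|, attained at w = 0
assert peak <= gamma + 1e-9
print(X_inf, peak)
```

A positive real X_inf existing is what certifies the peak stays below γ, exactly as the text claims.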

1.6 Computational Errors


One of the fundamental deficiencies of existing control theory is its lack of treatment of finite precision computational issues. Suppose (1.1) represents a dynamic system under measurement
feedback control, controlled by analog computation, which introduces computational errors of the
form

ẋ = ax + b(u + eu ) + w
u = g(y + ey )
y = cx

where ey and eu are the errors introduced by the sensor and the actuator, respectively (in digital control, ey and eu would be the errors introduced by the finite wordlength of the Analog to Digital (A/D) and Digital to Analog (D/A) converters). The magnitudes of the errors eu and ey depend upon the quality and accuracy of the sensor and actuator hardware. In this illustration these “magnitudes” Eu and Ey are the squares of the intensities of the impulsive errors eu and ey; W
is the square of the intensity of the impulsive w(t). Applying these three impulses one at a time,
the closed-loop system
ẋ = (a + bgc)x + beu + bgey + w
yields the norm ||x||L2 of the state
X = ||x||²L2 = Σ_{i=1}^{3} ∫_0^∞ x²(i, t) dt    (1.21)

satisfying

0 = 2X(a + bgc) + b²Eu + b²g²Ey + W.    (1.22)
Completing the square and factoring yields the equation, linear in g,

bg + cX/Ey = ± (1/√Ey) [ c²X²/Ey − 2aX − b²Eu − W ]^{1/2}.

The X corresponding to a real solution g follows from

c²X²/Ey − 2aX − b²Eu − W ≥ 0.
Hence the smallest admissible X = X° > 0 is

X° ≜ (1/c²) [ aEy + √( a²Ey² + Ey c²(W + b²Eu) ) ]    (1.23)
corresponding to the value g = g°,

g° = − cX° / (bEy).    (1.24)
Note that the optimal performance is achieved at a finite g°, as opposed to the infinite gain predicted by the standard optimal control theory, which ignores computational errors (hence assumes an infinite precision sensor, Ey = 0); there, from (1.22), X is arbitrarily small if |g| is arbitrarily large. For an infinite precision actuator, Eu = 0,

X°(Eu = 0) = (1/c²) [ aEy + √( a²Ey² + c²Ey W ) ] > 0.

Hence, an accurate sensor is more important than an accurate actuator, in the sense that X ◦ can
be taken to zero with a perfect sensor, but not with a perfect actuator. Two observations are
important here. In the presence of computational errors in the controller:

1. There exist lower bounds on performance which cannot be predicted by the usual infinite
precision assumptions for controller implementation.

2. The performance bounds are a function of the precision with which the controller (sensors,
actuators, control computer) is implemented.

These concepts will be developed in the discrete-time case to include the effects of A/D, and
D/A conversion, computational delay, and computer round-off.
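The closed forms (1.23)-(1.24) above can be checked numerically on assumed values of a, b, c, W, Eu and Ey: no stabilizing gain on a fine grid achieves a smaller X than X°.

```python
import numpy as np

# Check of the finite-precision optimum (1.23)-(1.24); numbers assumed.
a, b, c = -1.0, 1.0, 1.0
W, Eu, Ey = 1.0, 0.5, 0.5

Xo = (a*Ey + np.sqrt(a**2*Ey**2 + Ey*c**2*(W + b**2*Eu))) / c**2   # (1.23)
go = -c*Xo / (b*Ey)                                                # (1.24)

def X_of(g):
    """Solve 0 = 2X(a+bgc) + b^2 Eu + b^2 g^2 Ey + W, equation (1.22)."""
    acl = a + b*g*c
    return -(b**2*Eu + b**2*g**2*Ey + W) / (2.0*acl) if acl < 0 else np.inf

assert abs(X_of(go) - Xo) < 1e-12
# No stabilizing gain on a fine grid does better than X°:
assert min(X_of(g) for g in np.linspace(-20.0, -0.01, 4000)) >= Xo - 1e-9
print(Xo, go)
```

The grid search illustrates the first observation above: finite precision (Ey > 0) imposes a hard floor on performance at a finite gain.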

Chapter 1 Closure
The tools of this text will allow (but not be limited to) the following types of control design specifications, related to our example (1.8). We seek a g to satisfy some combination of the following:

(i) Disturbance attenuation requirement for finite energy disturbances: We require

|y(t)| ≤ ε1,    ∀t ≥ 0

for all ∫_0^∞ w²(t) dt ≤ ε2, for given ε1 and ε2.

(ii) Disturbance attenuation requirement for impulsive disturbances: We require

||y||L2 ≤ ε4
||u||L2 ≤ ε5

in response to w(t) = wδ(t) with |w| ≤ ε6, for given ε4, ε5 and ε6.

(iii) Robust stability requirement: For a given ε3 we require that the closed-loop system remain stable when a changes to a + ∆a, where the perturbation ∆a satisfies

|∆a| ≤ ε3.

(iv) The poles of the closed-loop system should lie in a specified region in the complex plane (e.g., a circle with center at −ε6 and radius 0 < ε7 < ε6, or a specified rectangular region).

(v) The H∞ norm of a specified closed-loop transfer function (the peak of its frequency response) should be less than a specified number ε8.

(vi) Controllers of low order (less than the plant order) are required to accomplish (i)-(v) (only constant gain state feedback is considered in the motivational examples of this introductory chapter).

(vii) Performance should be guaranteed in the presence of computational errors in the controller
and A/D, D/A converters.

Such design tools are developed for both continuous and discrete-time systems. Beginning with
these elementary motivations, we seek to develop a unified approach to control built upon second-
order information as the fundamentally important data. The advantages of this approach to edu-
cation and the theory of control are several:

1. Using second-order properties in the design goals allows one to parametrize all stabilizing
controllers in terms of physically meaningful data X.

2. The RMS performance of multiple outputs (Y is a matrix in this case) can be simultaneously controlled to preassigned values, or upper bounds:

Y ≤ Ȳ.

This lends a multiobjective capability to the theory of multi-input, multi-output systems.

3. In the presence of a class of uncertainties in the model data, stability and performance bounds
can be incorporated into the control design specifications. We refer to these two design
objectives as stability robustness and performance robustness.

4. All stabilizing controllers of fixed order can be parametrized in terms of the second-order
information. This text will develop tools to design low-order controllers with guaranteed
stability and RMS performances on multiple inputs and outputs.

5. Pole assignment can be accomplished by proper choices of the second-order information, and
poles can be assigned to a region.

6. Disturbance attenuation properties can be accomplished by proper choices of the second-order information.

7. The extensions to time-varying systems are straightforward, but are not contained in this
book.
Chapter 2

Linear Algebra Review

This chapter documents, for later use, certain results from linear algebra. Some common notations
are listed in Table 2.1, and a more fundamental review of linear algebra appears in Appendix A.
Many results of this chapter are taken from [154].

TABLE 2.1 Common Notations

j ≜ imaginary unit (√−1)
In = n × n identity matrix
Aᵀ = transpose of matrix A
A∗ = complex conjugate transpose of A
A+ = the Moore-Penrose inverse of A
tr A ≜ Σ_{i=1}^{n} aii (trace of A)
λi[A] = eigenvalue λi of matrix A
A > 0 : λi[A] > 0 ∀i (A “symmetric positive definite”)
A ≥ 0 : λi[A] ≥ 0 ∀i (A “symmetric positive semidefinite”)
A = EΛE⁻¹ = spectral decomposition of A, with Λ the Jordan form and E the modal matrix (of eigenvectors or generalized eigenvectors)
A = UΣV∗ = Singular Value Decomposition (SVD) of A
||A||F = (tr[AAᵀ])^{1/2} = Frobenius norm of A
||A|| = σ̄[A] = maximum singular value of A
σ[A] = minimum (nonzero) singular value of A
A = FF∗ = factoral decomposition of A ≥ 0

2.1 Singular Value Decomposition


The modal data of a square matrix are its eigenvalues and eigenvectors. The matrix decomposition
A = EΛE−1 (where the eigenvalues of A appear on the diagonal of Λ and the eigenvectors are
the columns of E) is called the spectral decomposition of A. If A is a Hermitian matrix, then
the eigenvalues and the eigenvectors of the spectral decomposition of A can be chosen to be real,
and the eigenvectors can be chosen to be orthonormal. The matrix decompositions in this section
require modal data computations for Hermitian matrices. In this regard, this section is a special
case of spectral decomposition. In another way it is more general, since we shall not restrict A to
be square.
A square unitary matrix U is defined by the property U∗ U = I = UU∗ , where the ∗ superscript
denotes complex conjugate transpose. The main result is as follows.

Theorem 2.1.1 Let A ∈ C^{k×n} be a matrix of rank r. Then there exist unitary matrices U and V such that

A = UΣV∗    (2.1)

where U satisfies

AA∗U = UΣΣ∗    (2.2)

and V satisfies

A∗AV = VΣ∗Σ    (2.3)

where Σ has the canonical structure

Σ = [ Σ0  0 ]
    [ 0   0 ],    Σ0 = diag(σ1, · · · , σr) > 0.    (2.4)

The numbers σi, i = 1, · · · , r, are called the nonzero singular values of A.

Proof. Since A∗A ≥ 0, its eigenvalues are all real and nonnegative. Let σi², i = 1, · · · , n, be the eigenvalues of A∗A. For convenience, we arrange the σi such that

σ1² ≥ σ2² ≥ · · · ≥ σn² ≥ 0.

Suppose that the rank of A is r. Then the rank of A∗A is r, and we have

σ1² ≥ σ2² ≥ · · · ≥ σr² > 0,    σr+1 = · · · = σn = 0.

Let vi, i = 1, · · · , n, be the orthonormal eigenvectors of A∗A associated with σi². Define

V = [v1 v2 · · · vr vr+1 · · · vn] = [V1 V2].

Then from the spectral decomposition theorem

V∗A∗AV = [ Σ0²  0 ]
         [ 0    0 ]

where Σ0² = diag(σ1², · · · , σr²). This provides

V2∗A∗AV2 = 0    (2.5)

and

V1∗A∗AV1 = Σ0²    (2.6)

or equivalently

Σ0⁻¹V1∗A∗AV1Σ0⁻¹ = I.

Define the k × r matrix U1 by

U1 = AV1Σ0⁻¹    (2.7)

which is column unitary since U1∗U1 = I. Hence, there exists a matrix U2 such that U = [U1 U2] is unitary. Then

U∗AV = [ U1∗AV1  U1∗AV2 ]
       [ U2∗AV1  U2∗AV2 ].

Now (2.7) implies that AV1 = U1Σ0 and (2.5) implies that AV2 = 0. Hence

U∗AV = [ Σ0  0 ]    or    A = U [ Σ0  0 ] V∗.
       [ 0   0 ]                [ 0   0 ]

□
For more on singular value decomposition, see [51], [154] or many other books on linear algebra.
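Theorem 2.1.1 is easy to confirm numerically; the sketch below applies numpy's SVD to an arbitrary rank-one 2 × 3 matrix and checks relations (2.1)-(2.3).

```python
import numpy as np

# Numerical illustration of Theorem 2.1.1 on a rank-deficient matrix.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])      # rank 1, k = 2, n = 3
U, s, Vh = np.linalg.svd(A)
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

assert np.allclose(A, U @ Sigma @ Vh)                        # (2.1)
assert np.allclose(A @ A.T @ U, U @ Sigma @ Sigma.T)         # (2.2)
assert np.allclose(A.T @ A @ Vh.T, Vh.T @ Sigma.T @ Sigma)   # (2.3)

r = int(np.sum(s > 1e-12))
print(r, s)   # one nonzero singular value, sqrt(70)
```

Note that numpy returns V∗ (here `Vh`), so V in the theorem's notation is `Vh.T` for a real matrix.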

2.2 Moore-Penrose Inverse


It has been shown that any n × m matrix A can be expressed as a singular value decomposition

A = UΣV∗ (2.8)

where U and V are n × n and m × m unitary matrices respectively, and Σ is an n × m matrix given by

Σ = [ Σ0  0 ]
    [ 0   0 ],    Σ0 = diag(σ1, σ2, · · · , σr)    (2.9)

where σk > 0 for k = 1, 2, · · · , r are the singular values of A, and r is the rank of A.
Given the n × m matrix A in (2.8) and (2.9), define the m × n matrix

A+ ≜ V [ Σ0⁻¹  0 ] U∗.    (2.10)
       [ 0     0 ]

From (2.8), (2.9) and (2.10) the n × n matrix AA+ is given by

AA+ = U [ Ir  0 ] U∗
        [ 0   0 ]

and the m × m matrix A+A is given by

A+A = V [ Ir  0 ] V∗.
        [ 0   0 ]
Exercise 2.2.1 Show that A+ in (2.10) satisfies

AA+A = A;    A+AA+ = A+;    (AA+)∗ = AA+;    (A+A)∗ = A+A.    (2.11)

Definition 2.2.1 A matrix A+ which satisfies properties (2.11) is called the Moore-Penrose inverse
of A.

Theorem 2.2.1 For every real n×m matrix A, there exists a unique m×n matrix A+ , the Moore-
Penrose inverse of A, which satisfies (2.11). Moreover, if A has the singular value decomposition
(2.8), (2.9) then A+ is given by (2.10).

Proof. By construction (and Exercise 2.2.1) we have shown that (2.11) has at least one solution
as given by (2.10). We now must show that this solution is unique. First consider a solution A(1)
of the equation
AA(1) A = A.
Substitution of A using the SVD formula in (2.8) and (2.9) implies A(1) is of the form

A(1) = V [ Σ0⁻¹  Z12 ] U∗
         [ Z21   Z22 ]

where the matrices Z12 of dimension r × (n − r), Z21 of dimension (m − r) × r and Z22 of dimension (m − r) × (n − r) are arbitrary. Now

A(1)AA(1) = V [ Σ0⁻¹  Z12       ] U∗.
              [ Z21   Z21Σ0Z12 ]

Hence A(1)AA(1) = A(1) implies

Z22 = Z21Σ0Z12.

Also, AA(1) = (AA(1))∗ implies Z12 = 0, and (A(1)A)∗ = A(1)A implies Z21 = 0. Therefore, if all four conditions hold in (2.11), A(1) = A+ is unique. □

Exercise 2.2.2 Consider an n × m matrix A. Show that

(i) If A has full row rank n, that is, rank(AA∗) = n, then

A+ = A∗(AA∗)⁻¹.

(ii) If A has full column rank m, that is, rank(A∗A) = m, then

A+ = (A∗A)⁻¹A∗.

Exercise 2.2.3 Show that if A∗A = I, then A+ = A∗.

Exercise 2.2.4 Let X = X∗, det[X] ≠ 0, A∗A = I. Then [AX]+ = X⁻¹A∗.
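The four conditions (2.11) and the full-rank formulas of Exercise 2.2.2 can be confirmed numerically with numpy's pinv; the random matrices below are arbitrary.

```python
import numpy as np

# Numerical check of the Penrose conditions (2.11) and Exercise 2.2.2.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))          # full row rank (generically)
Ap = np.linalg.pinv(A)

assert np.allclose(A @ Ap @ A, A)        # A A+ A = A
assert np.allclose(Ap @ A @ Ap, Ap)      # A+ A A+ = A+
assert np.allclose((A @ Ap).T, A @ Ap)   # (A A+)* = A A+
assert np.allclose((Ap @ A).T, Ap @ A)   # (A+ A)* = A+ A

# Exercise 2.2.2(i): full row rank gives A+ = A*(AA*)^{-1}
assert np.allclose(Ap, A.T @ np.linalg.inv(A @ A.T))
# Exercise 2.2.2(ii): full column rank gives A+ = (A*A)^{-1} A*
B = rng.standard_normal((5, 3))
assert np.allclose(np.linalg.pinv(B), np.linalg.inv(B.T @ B) @ B.T)
print("Penrose conditions verified")
```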



2.3 Solutions of Selected Linear Algebra Problems


This section records the solution of certain linear matrix equations and linear matrix inequalities
that will provide the solution of every control problem to be discussed later in the book. Indeed,
the major point of this book is to show that the following linear algebra results solve a large variety
of linear control design problems.

2.3.1 AXB = Y
For the given set of matrices A, B, Y, we consider in this section all solutions X to the linear
algebraic equations of the form AXB = Y.

Theorem 2.3.1 Let A be an n1 × n2 matrix, X be an n2 × n3 matrix, B be an n3 × n4 matrix and Y be an n1 × n4 matrix. Then the following statements are equivalent:

(i) The equation

AXB = Y    (2.12)

has a solution X.

(ii) A, B and Y satisfy

AA+YB+B = Y.    (2.13)

(iii) A, B and Y satisfy

(I − AA+)Y = 0,    Y(I − B+B) = 0.    (2.14)

In this case all solutions are

X = A+YB+ + Z − A+AZBB+,    (2.15)

where Z is an arbitrary n2 × n3 matrix and A+ denotes the Moore-Penrose inverse of A.

Proof. The implication (i) ⇒ (ii) can be verified by multiplying both sides of (2.12) by AA+ from the left and by B+B from the right. To prove the converse, suppose (2.13) holds. Then using (2.15),

AXB = A[A+YB+ + Z − A+AZBB+]B
    = AA+YB+B + AZB − AA+AZBB+B
    = Y,

which holds by virtue of the pseudo-inverse property AA+A = A and (2.13). Thus we have (ii) ⇒ (i), and it has been shown that any X given by (2.15) is a solution of (2.12).
To prove that any solution X to (2.12) can be generated by (2.15), we must show that for any solution of (2.12) there exists a Z satisfying (2.15). That is, solve

X = A+(AXB)B+ + Z − A+AZBB+

for Z. Obviously, the choice Z = X works.
To prove the equivalence of (ii) and (iii), suppose (2.13) holds. Replace Y in (2.14) by the left-hand side of (2.13) to get

(I − AA+)(AA+YB+B) = 0,    AA+YB+B(I − B+B) = 0.

Hence (2.13) implies (2.14). Now suppose (2.14) holds; then using AA+Y = Y and YB+B = Y, we have

(AA+Y)B+B = YB+B = Y.

This completes the proof. □
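A numerical sketch of Theorem 2.3.1: a consistent equation AXB = Y is manufactured from a hidden seed, condition (2.13) is tested, and the parametrization (2.15) is verified for an arbitrary Z.

```python
import numpy as np

# Solvability condition (2.13) and the general solution (2.15) in action.
rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((5, 2))
Xseed = rng.standard_normal((4, 5))
Y = A @ Xseed @ B                       # consistent by construction

Ap, Bp = np.linalg.pinv(A), np.linalg.pinv(B)
assert np.allclose(A @ Ap @ Y @ Bp @ B, Y)          # condition (2.13)

Z = rng.standard_normal((4, 5))                     # arbitrary free matrix
X = Ap @ Y @ Bp + Z - Ap @ A @ Z @ B @ Bp           # formula (2.15)
assert np.allclose(A @ X @ B, Y)
print("AXB = Y solved")
```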

Corollary 2.3.1 Consider an m × n matrix A and a vector y ∈ C^m. The following statements are equivalent.

(i) There exists a vector x ∈ C^n such that

Ax = y.    (2.16)

(ii) A and y satisfy

(I − AA+)y = 0.    (2.17)

In this case all solution vectors x are given by

x = A+y + (I − A+A)z

where z is an arbitrary vector in C^n.

Proof. The proof follows immediately from Theorem 2.3.1 by setting B = 1. □

Exercise 2.3.1 Show that in terms of the SVD of A,

A = [U1 U2] [ Σ  0 ] [ V1∗ ]
            [ 0  0 ] [ V2∗ ]

all solutions of (2.16) are given by

x = [V1 V2] [ Σ⁻¹U1∗y ]    (2.18)
            [ z1      ]

where z1 is an arbitrary vector of dimension p = n − r, Σ ∈ R^{r×r}, and r is the rank of A.



Example 2.3.1 Use the SVD to solve Ax = y, where y is a scalar and

A = [1 1].

The SVD of A is

A = UΣVᵀ

where

U = 1;    Σ = [Σ0 0],  Σ0 = √2;    V = (1/√2) [ 1   1 ]
                                              [ 1  −1 ].

Condition (2.17) is satisfied. Hence, all solutions x are parametrized as

x = (1/√2) [ 1   1 ] [ y/√2 ]
           [ 1  −1 ] [ z1   ]

where z1 is an arbitrary real number.

The following result provides a solution to a Frobenius norm minimization problem.

Theorem 2.3.2 Consider the matrices A, X, B and Y with dimensions as in Theorem 2.3.1 and the Frobenius norm minimization problem

min_X ||AXB − Y||F

with respect to the matrix X. Then the minimum is achieved by any X belonging to the following class of matrices, generated by an arbitrary matrix Z:

Xopt = A+YB+ + Z − A+AZBB+.    (2.19)

Proof. Let X be an arbitrary matrix and define Θ = X − Xopt. Then define

f(X) = ||AXB − Y||²F − ||AXoptB − Y||²F
     = 2 tr(AΘBA0ᵀ) + ||AΘB||²F,

where

A0 = AXoptB − Y.

Note that using the expression (2.19), we have

tr(AΘBA0ᵀ) = tr(ΘB(AXoptB − Y)ᵀA)
           = tr(ΘB(B+BYᵀAA+ − Yᵀ)A)
           = tr(Θ(BB+BYᵀAA+A − BYᵀA))
           = tr(Θ(BYᵀA − BYᵀA)) = 0,

where the properties (2.11) have been used. Therefore, we have f(X) ≥ 0 for any X, with equality holding when X = Xopt. □

Notice that if the equation AXB = Y is consistent, that is, if condition (2.13) is satisfied, then the minimum value of ||AXB − Y||F is zero and the parametrization (2.19) provides all solutions of AXB = Y.
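Theorem 2.3.2 can be probed numerically on an inconsistent AXB = Y (the matrices below are random, hence assumptions): the residual at Xopt is not beaten by nearby matrices, and the free term in Z leaves it unchanged.

```python
import numpy as np

# Frobenius-optimality of (2.19) on a generically inconsistent AXB = Y.
rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3))
A[:, 2] = A[:, 0] + A[:, 1]            # make A rank-deficient (rank 2)
B = rng.standard_normal((3, 4))
Y = rng.standard_normal((4, 4))

Ap, Bp = np.linalg.pinv(A), np.linalg.pinv(B)
Xopt = Ap @ Y @ Bp                     # (2.19) with Z = 0
res_opt = np.linalg.norm(A @ Xopt @ B - Y, 'fro')

for _ in range(200):                   # random perturbations never improve
    X = Xopt + 0.1*rng.standard_normal((3, 3))
    assert np.linalg.norm(A @ X @ B - Y, 'fro') >= res_opt - 1e-9

Z = rng.standard_normal((3, 3))
Xz = Xopt + Z - Ap @ A @ Z @ B @ Bp    # any Z in (2.19)
assert abs(np.linalg.norm(A @ Xz @ B - Y, 'fro') - res_opt) < 1e-9
print(res_opt)
```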

2.3.2 AX = C, XB = D
Theorem 2.3.3 Let matrices A, B, C and D be given. The following statements are equivalent.

(i) There exists a common solution X to the two linear matrix equations

AX = C,    XB = D.    (2.20)

(ii) The following three conditions hold:

AA+C = C,    (2.21a)
DB+B = D,    (2.21b)
AD = CB.    (2.21c)

In this case, all solutions are given by

X = A+C + DB+ − A+ADB+ + (I − A+A)Z(I − BB+)

where Z is arbitrary.

Proof. Necessity of conditions (2.21) is easy to establish since (2.21a) and (2.21b) correspond to
the solvability conditions of the first and second equation in (2.20), respectively, and (2.21c) is required
since
AXB = CB = AD.

To prove sufficiency of the conditions (2.21) as well as the expression for all solutions, consider the
general solution of the first equation AX = C

X = A+ C + (I − A+ A)Y

where Y is arbitrary. For a solution to exist, (2.21a) is a necessary and sufficient condition. Substituting the expression for X into the second equation XB = D we obtain

(I − A+ A)YB = D − A+ CB (2.22)

This equation has a solution for Y if and only if the two conditions corresponding to equations
(2.14) are satisfied. These provide

(D − A+ CB)(I − B+ B) = 0.

That is,

DB+B = D

and

[I − (I − A+A)(I − A+A)+](D − A+CB) = 0

or

A+A(D − A+CB) = 0.

That is,

A+AD = A+CB.

Pre-multiplying by A and using (2.21a) we obtain (2.21c).
The general solution of (2.22) with respect to Y is

Y = (I − A+A)(D − A+CB)B+ + Z − (I − A+A)ZBB+.

Hence, the general solution X of the equations (2.20) is given by

X = A+C + (I − A+A)Y
  = A+C + (I − A+A)(D − A+CB)B+ + (I − A+A)Z − (I − A+A)ZBB+
  = A+C + DB+ − A+CBB+ − A+ADB+ + A+AA+CBB+ + (I − A+A)Z(I − BB+)
  = A+C + DB+ − A+ADB+ + (I − A+A)Z(I − BB+),

which proves the general solution of the theorem. □
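A numerical check of Theorem 2.3.3 on arbitrary data: the pair AX = C, XB = D is built consistent from a hidden seed, conditions (2.21) hold, and the stated general solution solves both equations.

```python
import numpy as np

# Conditions (2.21) and the common solution of AX = C, XB = D.
rng = np.random.default_rng(3)
A = rng.standard_normal((2, 4))
B = rng.standard_normal((3, 2))
Xseed = rng.standard_normal((4, 3))
C, D = A @ Xseed, Xseed @ B            # consistent by construction

Ap, Bp = np.linalg.pinv(A), np.linalg.pinv(B)
assert np.allclose(A @ Ap @ C, C)      # (2.21a)
assert np.allclose(D @ Bp @ B, D)      # (2.21b)
assert np.allclose(A @ D, C @ B)       # (2.21c)

Z = rng.standard_normal((4, 3))
X = Ap @ C + D @ Bp - Ap @ A @ D @ Bp \
    + (np.eye(4) - Ap @ A) @ Z @ (np.eye(3) - B @ Bp)
assert np.allclose(A @ X, C) and np.allclose(X @ B, D)
print("common solution found")
```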

2.3.3 AX = C, X = X∗

Theorem 2.3.4 Let matrices A and C be given. The following statements are equivalent.

(i) There exists a Hermitian solution X = X∗ to AX = C.

(ii) The following two conditions hold:

CA∗ = (CA∗ )∗ , (I − AA+ )C = 0.

In this case, all such solutions are given by

    X = A+ C + C∗ A+∗ − A+ CA∗ A+∗ + (I − A+ A)Θ(I − A+ A),    (2.23)
    Θ = Θ∗ arbitrary.

Proof. We must prove that AX = C and XA∗ = C∗ have a common solution. From Theorem
2.3.3 there exists such a solution if and only if

(I − AA+ )C = 0

C∗ (I − A∗+ A∗ ) = 0

AC∗ = CA∗

where the first and second conditions are redundant. This proves the necessary and sufficient
conditions for existence. All solutions are from Theorem 2.3.3

X = A+ C + C∗ A∗+ − A+ AC∗ A∗+ + (I − A+ A)Θ(I − A∗ A∗+ ).

This proves (2.23) since AC∗ = CA∗ and (A+ A) = (A+ A)∗ . 2
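The Hermitian-solution formula (2.23) can be exercised the same way; in this sketch (not from the book) the data are randomly generated with C = A X0 for a Hermitian X0, so that both conditions in statement (ii) hold by construction:

```python
import numpy as np

rng = np.random.default_rng(2)
pinv = np.linalg.pinv

A = rng.standard_normal((2, 4))
X0 = rng.standard_normal((4, 4)); X0 = X0 + X0.T   # Hermitian (real symmetric)
C = A @ X0                                         # guarantees CA* = (CA*)* and C in range(A)

Ap = pinv(A)
I4 = np.eye(4)
Th = rng.standard_normal((4, 4)); Th = Th + Th.T   # arbitrary Hermitian parameter

# Formula (2.23)
X = Ap @ C + C.T @ Ap.T - Ap @ C @ A.T @ Ap.T + (I4 - Ap @ A) @ Th @ (I4 - Ap @ A)

assert np.allclose(A @ X, C) and np.allclose(X, X.T)
```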

2.3.4 AX = C, XB = D, X = X∗
Theorem 2.3.5 Let matrices A, B, C and D be given, and suppose (2.21) holds. Then the
following statements are equivalent.

(i) There exists a common Hermitian solution (X = X∗ ) to (2.20).

(ii) The conditions

RP∗ = PR∗
AA+ C = C (2.24)

(I − TT+ )(D∗ − B∗ A+ C) = 0

hold, where

                            [ A  ]          [ C  ]
    T = B∗ (I − A+ A),  P = [    ],     R = [    ].
                            [ B∗ ]          [ D∗ ]

In this case, all Hermitian solutions are

    X = P+ R + R∗ P+∗ − P+ RP∗ P+∗ + (I − P+ P)Θ(I − P+ P),    (2.25)
    Θ = Θ∗ arbitrary.

Proof. Equations (2.20) have a common Hermitian solution if and only if

PX = R (2.26)

has a Hermitian solution. According to Theorem 2.3.4, this is equivalent to RP∗ = (RP∗ )∗ (which
provides the first of the conditions (2.24)), and the equations

    AX = C,    B∗ X = D∗    (2.27)

being consistent. The first one of these equations is solvable for X if and only if

AA+ C = C (2.28)

which is the second condition in (2.24) and the general solution is

X = A+ C + (I − A+ A)Z

where Z is an arbitrary matrix of appropriate dimension. Substituting this expression in the second
equation B∗ X = D∗ results in
TZ = D∗ − B∗ A+ C.

This equation has a solution for Z if and only if the third condition in (2.24) is satisfied. The
expression for the general solution (2.25) is obtained by applying the general solution of Theorem
2.3.4. 2
By replacing X by jX, C by jC, and Θ by jΘ, where j = √−1, Theorem 2.3.4 can be used to
find skew-Hermitian solutions to linear algebra problems (note that jX is Hermitian if and only if
X is skew-Hermitian).

2.3.5 AX = C, X = −X∗
Corollary 2.3.2 Let matrices A and C be given. The following statements are equivalent.

(i) There exists a skew-Hermitian solution X to the equation

AX = C, X = −X∗ . (2.29)

(ii) The following two conditions hold:

CA∗ = −AC∗ , (I − AA+ )C = 0. (2.30)

In this case, all solutions are given by

    X = A+ C − C∗ A+∗ − A+ CA∗ A+∗ + (I − A+ A)Θ(I − A+ A)    (2.31)

    Θ = −Θ∗ arbitrary,    (2.32)

i.e., Θ is an arbitrary skew-Hermitian matrix.

Replacing X, C, D in Theorem 2.3.3 by jX, jC, jD leads to the following result.

2.3.6 XB = D, X = −X∗
Corollary 2.3.3 Let matrices B and D be given. The following statements are equivalent.

(i) There exists a skew-Hermitian solution to the equation

XB = D, X = −X∗ .

(ii) The following conditions hold:


D(I − B+ B) = 0

B∗ D = −D∗ B.

In this case, all solutions are given by

X = −B∗+ D∗ + DB+ + B∗+ D∗ BB+ + (I − BB+ )S(I − BB+ )

for arbitrary S = −S∗ .

2.3.7 AX = C, XB = D, X = −X∗
Corollary 2.3.4 Let matrices A, B, C and D be given. The following statements are equivalent.

(i) There exists a common skew-Hermitian solution X to the equations (2.20).

(ii) The condition (2.21) holds and in addition

(I − PP+ )R = 0
RP∗ = −PR∗ , (2.33)

where

        [ A  ]          [  C  ]
    P = [    ],     R = [     ].
        [ B∗ ]          [ −D∗ ]

In this case all skew-Hermitian solutions are

X = P+ R − R∗ P+∗ − P+ RP∗ P+∗ + (I − P+ P)Θ(I − P+ P) (2.34)

where Θ is an arbitrary skew-Hermitian matrix.

2.3.8 AX + (AX)∗ + Q = 0
Theorem 2.3.6 Let A and Q be given where Q = Q∗ . The following statements are equivalent.

(i) There exists a matrix X satisfying

AX + (AX)∗ + Q = 0. (2.35)

(ii) A and Q satisfy


(I − AA+ )Q(I − AA+ ) = 0. (2.36)

In this case, all such X are given by


    X = −(1/2)A+ Q(2I − AA+ ) + A+ SAA+ + (I − A+ A)Z    (2.37)
where Z is arbitrary and S is an arbitrary skew-Hermitian matrix.

Proof. First note that (2.35) holds if and only if


    AX = −(1/2)(Q + Ŝ)
holds for some skew-Hermitian Ŝ = −Ŝ∗ . Using Theorem 2.3.1, the above equation is solvable for
X if and only if
(I − AA+ )(Q + Ŝ) = 0, (2.38)
in which case, all solutions are
    X = −(1/2)A+ (Q + Ŝ) + (I − A+ A)Z    (2.39)
where Z is arbitrary. From Corollary 2.3.2, there exists a skew-Hermitian matrix Ŝ satisfying (2.38)
if and only if

− (I − AA+ )Q(I − AA+ )∗ = (I − AA+ )Q(I − AA+ )∗ ,


− [I − (I − AA+ )(I − AA+ )+ ](I − AA+ )Q = 0

or equivalently,
(I − AA+ )Q(I − AA+ ) = 0
holds. In this case, all such Ŝ are given by

Ŝ = −(I − AA+ )QAA+ + Q(I − AA+ ) + AA+ S̄AA+ (2.40)



where S̄ is an arbitrary skew-Hermitian matrix. Substituting (2.40) into (2.39) and defining S =
−(1/2)S̄ yields (2.37). 2
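A numerical illustration of Theorem 2.3.6 (a sketch added here, not from the book): Q is built as −(A X0 + (A X0)∗) for a random X0, which makes condition (2.36) hold automatically, and then the parametrization (2.37) is verified for random S (skew) and Z:

```python
import numpy as np

rng = np.random.default_rng(3)
pinv = np.linalg.pinv

# Rank-deficient A; choose Q = -(A X0 + (A X0)^T) so that (2.36) holds.
A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 3))   # 4x3, rank 2
X0 = rng.standard_normal((3, 4))
Q = -(A @ X0 + (A @ X0).T)                                      # Hermitian

Ap = pinv(A)
P = np.eye(4) - A @ Ap                       # projector onto range(A)^perp
assert np.allclose(P @ Q @ P, 0)             # solvability condition (2.36)

K = rng.standard_normal((4, 4))
S = K - K.T                                  # arbitrary skew-Hermitian parameter
Z = rng.standard_normal((3, 4))              # arbitrary parameter

# Formula (2.37)
X = -0.5 * Ap @ Q @ (2 * np.eye(4) - A @ Ap) + Ap @ S @ A @ Ap \
    + (np.eye(3) - Ap @ A) @ Z

assert np.allclose(A @ X + (A @ X).T + Q, 0)
```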

2.3.9 AXBC + (AXBC)∗ + Q = 0


Theorem 2.3.7 Let matrices A, B, C and Q be given, where Q is Hermitian and C is a square
invertible matrix. Then the following statements are equivalent.

(i) There exists a matrix X satisfying

AXBC + (AXBC)∗ + Q = 0. (2.41)

(ii) The three conditions hold;

(I − AA+ )Q(I − AA+ ) = 0 (2.42)


(I − B+ B)C−∗ QC−1 (I − B+ B) = 0 (2.43)
(I − DD+ )(I − B+ B)C−∗ Q = 0 (2.44)

where

    D = (I − B+ B)C−∗ AA+    (2.45)

and C−∗ = (C−1 )∗ .


In this case, all such matrices X are given by


    X = −(1/2)A+ (Q + S)C−1 B+ + Z − A+ AZBB+    (2.46)
where Z is arbitrary and

    S = [P+ R + (I − P+ P)Θ](I − P+ P) − (P+ R)∗    (2.47)

        [ I − AA+       ]          [ −(I − AA+ )   ]
    P = [               ],     R = [               ] Q
        [ (I − B+ B)C−∗ ]          [ (I − B+ B)C−∗ ]

where Θ = −Θ∗ is an arbitrary skew-Hermitian matrix.

Proof. The equality (2.41) holds if and only if

    AXBC = −(1/2)(Q + S),    S = −S∗
holds for some skew-Hermitian matrix S. From Theorem 2.3.1, the above equation is solvable for
X if and only if

(I − AA+ )(Q + S) = 0
(Q + S)C−1 (I − B+ B) = 0

hold, in which case, all solutions X are given by (2.46). Rearranging, we have

(I − AA+ )S = −(I − AA+ )Q,


SC−1 (I − B+ B) = −QC−1 (I − B+ B).

Using Corollary 2.3.4, there exists a skew-Hermitian matrix S satisfying the above equations if and
only if
    [I − (I − AA+ )(I − AA+ )+ ](I − AA+ )Q = 0,
    (I − DD+ )[(I − B+ B)C−∗ Q + (I − B+ B)C−∗ (I − AA+ )Q] = 0,

hold and
" #
−(I − AA+ )Q(I − AA+ ) −(I − AA+ )QC−1 (I − B+ B)
(I − B+ B)C−∗ Q(I − AA+ ) (I − B+ B)C−∗ QC−1 (I − B+ B)

is a skew-Hermitian matrix, or equivalently, (2.42)-(2.44) hold, where we used the identity

(I − DD+ )(I − B+ B)C−∗ AA+ = (I − DD+ )D = 0.

In this case, all such S are given by (2.47). 2



2.3.10 AX = B, XX∗ = I
Theorem 2.3.8 Let A ∈ C^{a×b} and B ∈ C^{a×c} be given matrices, where c ≥ b. Then the following
statements are equivalent.

(i) There exists X satisfying


AX = B, XX∗ = I. (2.48)

(ii) A and B satisfy


AA∗ = BB∗ . (2.49)

In this case, all such X are given by


" # " #
h i I 0 ∗
VB1
X= VA1 VA2 ∗
(2.50)
0 U VB2

where U is an arbitrary matrix such that UU∗ = I and VA1 , VA2 , VB1 and VB2 are defined from
the SVDs of A and B as follows
" # " #
h i ΣA 0 ∗
VA1 ∗
A= UA1 UA2 ∗
= UA ΣA VA , (2.51)
0 0 VA2

" # " #
h i ΣA 0 ∗
VB1 ∗
B= UA1 UA2 ∗
= UA ΣA VB , (2.52)
0 0 VB2

Proof. Square both sides of AX = B to see that

AX(AX)∗ = AXX∗ A∗ = AA∗ = BB∗ .

This proves necessity of (2.49). For sufficiency, recall from the SVD of A that UA satisfies
" #
ΣA 2 0
AA∗ UA = UA . (2.53)
0 0

But from (2.49) and (2.53) it is clear that we can choose UA = UB , ΣA = ΣB . Hence (2.52). Now
define
" # " #
∗ ∆ Z∗1 ∆

VA1
Z = = X. (2.54)
Z∗2 ∗
VA2

Then (2.48) is equivalent to


" #" # " #
h i ΣA 0 Z∗1 h i ΣA 0 ∗
UA1 UA2 = UA1 UA2 VB ,
0 0 Z∗2 0 0
Z∗ Z = I (2.55)

which is equivalent to

    ΣA Z1∗ = ΣA VB1∗ ,    Z2∗ Z2 = I,    Z2∗ Z1 = 0,    (2.56)

which is equivalent to (using the fact VB2∗ VB1 = 0)

Z1 = VB1 , Z2 = VB2 U∗ , UU∗ = I, (2.57)

which is equivalent to (since X = VA Z∗ )


" #
I 0 ∗
X = VA VB UU∗ = I. (2.58)
0 U

This completes the proof of Theorem 2.3.8. 2
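The constructive proof above translates directly into code. The following sketch (added for illustration; random data, NumPy SVD) builds B with AA∗ = BB∗ by design, forms VB1∗ = ΣA⁻¹UA1∗B as in (2.52), completes VB2 from the null space, and assembles X via (2.50):

```python
import numpy as np

rng = np.random.default_rng(4)
a, b, c = 3, 4, 6                        # A is a x b, B is a x c, c >= b

A = rng.standard_normal((a, b))
Qc, _ = np.linalg.qr(rng.standard_normal((c, c)))
X0 = Qc[:b, :]                           # b x c with X0 X0^T = I
B = A @ X0                               # guarantees AA^T = BB^T

assert np.allclose(A @ A.T, B @ B.T)     # condition (2.49)

UA, s, VAh = np.linalg.svd(A)            # SVD (2.51)
r = int(np.sum(s > 1e-10))
VA1, VA2 = VAh[:r].T, VAh[r:].T
VB1h = (UA[:, :r].T @ B) / s[:r, None]   # VB1^* = Sigma_A^{-1} U_A1^* B, as in (2.52)
_, _, Wh = np.linalg.svd(VB1h)
VB2 = Wh[r:].T                           # orthonormal basis of null(VB1^*)

QU, _ = np.linalg.qr(rng.standard_normal((c - r, c - r)))
U = QU[:b - r, :]                        # any U with U U^T = I

X = VA1 @ VB1h + VA2 @ U @ VB2.T         # formula (2.50)

assert np.allclose(A @ X, B) and np.allclose(X @ X.T, np.eye(b))
```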

2.3.11 (AX + B)R(AX + B)∗ = Q


Theorem 2.3.9 Let matrices A, B, R and Q be given. Suppose Q = Q∗ ∈ C^{n×n} , R = R∗ ∈ C^{r×r}
and R > 0. Then the following statements are equivalent.

(i) There exists a matrix X such that

(AX + B)R(AX + B)∗ = Q. (2.59)

(ii) The following conditions hold:

Q ≥ 0, rank(Q) ≤ r, (2.60)

(I − AA+ )(Q − BRB∗ )(I − AA+ ) = 0. (2.61)

In this case, all such X are given by

X = A+ (LUR−1/2 − B) + (I − A+ A)Z (2.62)

where Z is arbitrary and

    LL∗ = Q,    L ∈ C^{n×r}

           [ I  0  ]
    U = VL [       ] VR∗    (2.63)
           [ 0  UF ]

                     [ ΣL  0 ]
    (I − AA+ )L = UL [       ] VL∗ ,    (SVD)
                     [ 0   0 ]

                         [ ΣL  0 ]
    (I − AA+ )BR1/2 = UL [       ] VR∗    (SVD)
                         [ 0   0 ]

where UF is an arbitrary matrix such that UF UF∗ = I.



Proof. Since the left-hand side of (2.59) is positive semidefinite with rank less than or equal to r,
(2.60) is necessary. Pre- and post-multiplying both sides of (2.59) by I − AA+ , we have (2.61).
This proves the necessity of (2.60) and (2.61). To prove sufficiency, suppose (2.60) and (2.61) hold.
Then there exists L ∈ C n×r such that Q = LL∗ . Now,

(AX + B)R(AX + B)∗ = LL∗

holds if and only if


(AX + B)R1/2 = LU, UU∗ = I

or equivalently,
AX = LUR−1/2 − B, UU∗ = I

holds for some orthogonal matrix U. The above equation is solvable for X if and only if

(I − AA+ )(LUR−1/2 − B) = 0

and all solutions X are given by (2.62). Rearranging, we have

(I − AA+ )LU = (I − AA+ )BR1/2 , UU∗ = I.

From Theorem 2.3.8, the above equation is solvable for an orthogonal matrix U if and only if

(I − AA+ )LL∗ (I − AA+ ) = (I − AA+ )BRB∗ (I − AA+ )

or equivalently, (2.61) holds. In this case, all such U are given by (2.63). 2

2.3.12 µBB∗ − Q > 0


In the sequel, we shall need the following definition. For a matrix B ∈ C^{n×m} with rank r, let
B⊥ ∈ C^{(n−r)×n} be any matrix such that B⊥ B = 0 and B⊥ B⊥∗ > 0. Note that such a matrix B⊥
exists if and only if B has linearly dependent rows (n > r), and the set of all such matrices can be
captured by B⊥ = TU2∗ , where T is an arbitrary nonsingular matrix and U2 is from the SVD
" #" #
h i Σ1 0 V1∗
B= U1 U2 . (2.64)
0 0 V2∗

Theorem 2.3.10 (Finsler’s Theorem) Let matrices B ∈ C^{n×m} and Q ∈ C^{n×n} be given. Suppose
rank (B) < n and Q = Q∗ . Let (Br , B` ) be any full rank factor of B, i.e., B = B` Br , and define
D = (Br Br∗ )−1/2 B`+ . Then the following statements are equivalent.

(i) There exists a scalar µ such that


µBB∗ − Q > 0. (2.65)

(ii) The following condition holds;


    P = B⊥ QB⊥∗ < 0.    (2.66)

If the above statements hold, then all scalars µ satisfying (2.65) are given by

    µ > µmin = λmax [D(Q − QB⊥∗ P−1 B⊥ Q)D∗ ].    (2.67)

Proof. Let T be a square nonsingular matrix defined by


" #
∆ D
T= . (2.68)
B⊥

By a congruence transformation with T, (2.65) is equivalent to


" #
µI − DQD∗ −DQB⊥∗
>0 (2.69)
−B⊥ QD∗ −B⊥ QB⊥∗

or equivalently,
P = B⊥ QB⊥∗ < 0

(2.70)

µI − DQD∗ + DQB⊥∗ P−1 B⊥ QD∗ > 0 (2.71)

which proves the necessity of (2.66). Now, to prove sufficiency, suppose P < 0. Clearly, there exists
a µ satisfying (2.71) and all such µ are given by (2.67). 2
The existence condition (2.66) is known as Finsler’s Theorem (see references in [66], [106]). Theorem
2.3.10 provides not only the existence condition but also all acceptable values of µ. Note that
µmin ≤ 0 if and only if Q ≤ 0 since µmin ≤ 0 is equivalent to

D(Q − QB⊥∗ P−1 B⊥ Q)D∗ ≤ 0

which is equivalent to TQT∗ ≤ 0 since P < 0.
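Finsler's Theorem and the bound (2.67) are easy to exercise numerically. In this sketch (not from the book) Q = BB∗ − I is chosen so that B⊥QB⊥∗ = −B⊥B⊥∗ < 0 holds automatically, and µ just above and just below µmin is tested:

```python
import numpy as np

rng = np.random.default_rng(5)
n, m, r = 4, 3, 2

B = rng.standard_normal((n, r)) @ rng.standard_normal((r, m))   # rank r < n
Q = B @ B.T - np.eye(n)          # then Bperp Q Bperp^T = -Bperp Bperp^T < 0

U, s, Vh = np.linalg.svd(B)
Bperp = U[:, r:].T               # rows annihilate B
Bl = U[:, :r] * s[:r]            # full rank factors: B = Bl Br
Br = Vh[:r]                      # Br Br^T = I by construction
assert np.allclose(Bl @ Br, B)

P = Bperp @ Q @ Bperp.T
assert np.max(np.linalg.eigvalsh(P)) < 0     # condition (2.66)

D = np.linalg.pinv(Bl)           # D = (Br Br^T)^{-1/2} Bl^+ with Br Br^T = I
M = D @ (Q - Q @ Bperp.T @ np.linalg.solve(P, Bperp @ Q)) @ D.T
mu_min = np.max(np.linalg.eigvalsh((M + M.T) / 2))             # bound (2.67)

assert np.min(np.linalg.eigvalsh((mu_min + 0.1) * B @ B.T - Q)) > 0
assert np.min(np.linalg.eigvalsh((mu_min - 0.1) * B @ B.T - Q)) < 1e-9
```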


The following result can be verified by a similar procedure to the proof of Theorem 2.3.10.

Corollary 2.3.5 Let matrices B and Q = Q∗ be given. Then the following statements are equiv-
alent.

(i) There exists a symmetric matrix X such that

Q + BXB∗ > 0.

(ii) One of the following conditions holds;

B⊥ QB⊥∗ > 0 or BB∗ > 0.

Suppose (ii) holds and B∗ B > 0, but BB∗ is singular. Then all matrices X satisfying the condition
in (i) are given by
X > B+ [QB⊥∗ (B⊥ QB⊥∗ )−1 B⊥ Q − Q]B+∗ .

2.3.13 (A + BXC)R(A + BXC)∗ < Q


Theorem 2.3.11 Let matrices A ∈ C^{n×`} , B ∈ C^{n×m} , C ∈ C^{k×`} , R ∈ C^{`×`} and Q ∈ C^{n×n} be given.
Suppose B∗ B > 0, CC∗ > 0, R > 0 and Q > 0. The following statements are equivalent.

(i) There exists a matrix X such that

(A + BXC)R(A + BXC)∗ < Q. (2.72)

(ii) The following two conditions hold;

B⊥ (Q − ARA∗ )B⊥∗ > 0 or BB∗ > 0,

C∗⊥ (R−1 − A∗ Q−1 A)C∗⊥∗ > 0 or C∗ C > 0.

If the above statements hold, then all matrices X satisfying (2.72) are given by

X = −(B∗ ΦB)−1 B∗ ΦARC∗ (CRC∗ )−1 + (B∗ ΦB)−1/2 LΨ1/2

where L is an arbitrary matrix such that ‖L‖ < 1 and

Φ = (Q − ARA∗ + ARC∗ (CRC∗ )−1 CRA∗ )−1 ,


Ψ = Rc − Rc CRA∗ (Φ − ΦB(B∗ ΦB)−1 B∗ Φ)ARC∗ Rc ,


Rc = (CRC∗ )−1 .

Proof. After expanding and completing the square, the inequality (2.72) can equivalently be
written as
    (BX + ARC∗ Rc )Rc−1 (BX + ARC∗ Rc )∗ < Φ−1

where Φ and Rc are defined above. Note that the assumptions R > 0 and CC∗ > 0 imply that
Rc > 0. Then, by the Schur complement formula, the above inequality and Rc > 0 are equivalent
to
(X∗ B∗ + Rc CRA∗ )Φ(BX + ARC∗ Rc ) < Rc

and Φ > 0. After expanding, completing the square with respect to X yields

    (X + ΦB B∗ ΦARC∗ Rc )∗ ΦB−1 (X + ΦB B∗ ΦARC∗ Rc ) < Ψ    (2.73)

where ΦB = (B∗ ΦB)−1 . Since the left-hand side is nonnegative, we have Ψ > 0. Thus, Φ > 0
and Ψ > 0 are necessary for the existence of X satisfying (2.72). To prove the converse, suppose
Φ > 0 and Ψ > 0. In this case, ΦB−1/2 and Ψ−1/2 exist, and (2.73) can be equivalently written

    L = ΦB−1/2 (X + ΦB B∗ ΦARC∗ Rc )Ψ−1/2 ,    ‖L‖ < 1.

Solving for X, we have

    X = −ΦB B∗ ΦARC∗ Rc + ΦB1/2 LΨ1/2 ,    ‖L‖ < 1.

By construction, the above X solves (2.72) and in fact, any solution X can be generated by the
above formula.
Finally, we shall show that the existence conditions Φ > 0 and Ψ > 0 are equivalent to
statement (ii). To this end, note that Φ > 0 holds if and only if there exists V > 0 such that

Q − ARA∗ + ARC∗ (CRC∗ + V)−1 CRA∗ > 0,

or equivalently, using the matrix inversion lemma,

Q − A(R−1 + C∗ V−1 C)−1 A∗ > 0,

or equivalently, using the Schur complement formula,

R−1 + C∗ V−1 C − A∗ Q−1 A > 0.

From Finsler’s Theorem, there exists V > 0 satisfying the above inequality if and only if

C∗⊥ (R−1 − A∗ Q−1 A)C∗⊥∗ > 0 or C∗ C > 0

holds. The equivalence between Ψ > 0 and the first condition in statement (ii) can be shown by a
similar procedure. This completes the proof. 2
For the special case where C = I, Theorem 2.3.11 reduces to the following.

Corollary 2.3.6 Let matrices A, B, Q and R be given. Suppose Q = Q∗ , R = R∗ > 0 and


B∗ B > 0. Then the following statements are equivalent.

(i) There exists a matrix X such that

(A + BX)R(A + BX)∗ < Q. (2.74)

(ii) Q > 0 and


B⊥ (Q − ARA∗ )B⊥∗ > 0 or BB∗ > 0.

If the above statements hold, then all matrices X satisfying (2.74) are given by

X = −(B∗ Q−1 B)−1 B∗ Q−1 A + (B∗ Q−1 B)−1/2 LΨ1/2

where L is an arbitrary matrix such that ‖L‖ < 1 and

Ψ = R−1 − A∗ Q−1 A + A∗ Q−1 B(B∗ Q−1 B)−1 B∗ Q−1 A.
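A numerical sketch of Corollary 2.3.6 (added here for illustration, not from the book; B is taken square and invertible so that BB∗ > 0 holds trivially, and the matrix square roots are computed by eigendecomposition):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4

def psd_power(M, p):
    """M^p for a symmetric positive definite M, via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * w**p) @ V.T

A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))          # invertible => BB^* > 0
M = rng.standard_normal((n, n)); R = M @ M.T + np.eye(n)   # R > 0
N = rng.standard_normal((n, n)); Q = N @ N.T + np.eye(n)   # Q > 0

Qi = np.linalg.inv(Q)
G = B.T @ Qi @ B                         # B^* Q^{-1} B
Psi = np.linalg.inv(R) - A.T @ Qi @ A + A.T @ Qi @ B @ np.linalg.solve(G, B.T @ Qi @ A)
Psi = (Psi + Psi.T) / 2                  # symmetrize numerically
assert np.min(np.linalg.eigvalsh(Psi)) > 0

L = rng.standard_normal((n, n))
L = 0.9 * L / np.linalg.norm(L, 2)       # any L with ||L|| < 1

X = -np.linalg.solve(G, B.T @ Qi @ A) + psd_power(G, -0.5) @ L @ psd_power(Psi, 0.5)

res = Q - (A + B @ X) @ R @ (A + B @ X).T
assert np.min(np.linalg.eigvalsh(res)) > 0      # (2.74) holds strictly
```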


The following result is also a special case of Theorem 2.3.11 where B = I and C = I.

Corollary 2.3.7 Let matrices A, B, C, Q and R be given, where all the matrices except B are
symmetric. Then the following statements are equivalent.

(i) There exists a matrix X such that


" # " #
Q X A B
> . (2.75)
X∗ R B∗ C

(ii) V = Q − A > 0 and W = R − C > 0.

Suppose the above statements hold. Then all matrices X satisfying (2.75) are given by

X = B + V1/2 LW1/2 ,

where L is an arbitrary matrix such that ‖L‖ < 1.

Proof. Suppose (i) holds. Then the necessity of (ii) is obvious. To prove the converse, suppose
(ii) holds. Then using the Schur complement formula, (2.75) is equivalent to

(X − B)W−1 (X − B)∗ < V.

Now the result follows as a special case of Theorem 2.3.11. 2
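Corollary 2.3.7 is simple enough to verify end to end. In the sketch below (not from the book; random symmetric data), V and W are made positive definite by construction and the parametrization X = B + V^{1/2}LW^{1/2} is checked against (2.75):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 3

def sym(M):
    return (M + M.T) / 2

def psd_sqrt(M):
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(w)) @ V.T

A = sym(rng.standard_normal((n, n)))
C = sym(rng.standard_normal((n, n)))
B = rng.standard_normal((n, n))
G = rng.standard_normal((n, n)); V = G @ G.T + np.eye(n)    # V = Q - A > 0
H = rng.standard_normal((n, n)); W = H @ H.T + np.eye(n)    # W = R - C > 0
Q, R = A + V, C + W

L = rng.standard_normal((n, n))
L = 0.9 * L / np.linalg.norm(L, 2)          # any L with ||L|| < 1
X = B + psd_sqrt(V) @ L @ psd_sqrt(W)

gap = np.block([[Q, X], [X.T, R]]) - np.block([[A, B], [B.T, C]])
assert np.min(np.linalg.eigvalsh(sym(gap))) > 0     # (2.75) holds
```

The Schur complement of the gap matrix is V^{1/2}(I − LL∗)V^{1/2}, which is positive definite exactly when ‖L‖ < 1.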

2.3.14 BXC + (BXC)∗ + Q < 0


Theorem 2.3.12 Let matrices B ∈ C^{n×m} , C ∈ C^{k×n} and Q = Q∗ ∈ C^{n×n} be given. The following
statements are equivalent.

(i) There exists a matrix X satisfying

BXC + (BXC)∗ + Q < 0. (2.76)

(ii) The following two conditions hold.

B⊥ QB⊥∗ < 0 or BB∗ > 0,

C∗⊥ QC∗⊥∗ < 0 or C∗ C > 0.

Suppose the above statements hold. Let rb and rc be the ranks of B and C, respectively, and (B` ,Br )
and (C` ,Cr ) be any full rank factors of B and C, i.e., B = B` Br , C = C` Cr . Then all matrices
X in statement (i) are given by

    X = Br+ KC`+ + Z − Br+ Br ZC` C`+

where Z is an arbitrary matrix and

K = −R−1 B∗` ΦC∗r (Cr ΦC∗r )−1 + S1/2 L(Cr ΦC∗r )−1/2

S = R−1 − R−1 B∗` [Φ − ΦC∗r (Cr ΦC∗r )−1 Cr Φ]B` R−1


where L is an arbitrary matrix such that ‖L‖ < 1 and R is an arbitrary positive definite matrix
such that
Φ = (B` R−1 B∗` − Q)−1 > 0.



Proof. Suppose statement (i) holds. Then K = Br XC` satisfies

B` KCr + (B` KCr )∗ + Q < 0.

Let a matrix R > 0 be such that

B` KCr + (B` KCr )∗ + Q + C∗r K∗ RKCr < 0

or equivalently,

(B` R−1 + C∗r K∗ )R(R−1 B∗` + KCr ) < B` R−1 B∗` − Q =: Φ−1 .

Note that such a matrix R > 0 always exists and a choice of such R is R = εI for sufficiently small
ε > 0. Now, using Corollary 2.3.6, there exists a matrix K satisfying the above inequality if and
only if Φ > 0 and either Cr∗⊥ QCr∗⊥∗ < 0 or Cr∗ Cr > 0, in which case all such matrices K are given
r QCr < 0 or C∗r Cr > 0, in which case all such matrices K are given
by
K = −R−1 B∗` ΦC∗r (Cr ΦC∗r )−1 + S1/2 L(Cr ΦC∗r )−1/2

where L is an arbitrary matrix such that ‖L‖ < 1 and

S = R−1 − R−1 B∗` [Φ − ΦC∗r (Cr ΦC∗r )−1 Cr Φ] B` R−1 > 0.


Using Finsler’s Theorem (Corollary 2.3.5), Φ > 0 holds for some R > 0 if and only if B`⊥ QB`⊥∗ < 0
or B` B`∗ > 0. It is easy to verify that Cr∗⊥ QCr∗⊥∗ < 0 and B`⊥ QB`⊥∗ < 0 are equivalent to
C∗⊥ QC∗⊥∗ < 0 and B⊥ QB⊥∗ < 0, respectively. Finally, note that K = Br XC` holds if and only
if

    X = Br+ KC`+ + Z − Br+ Br ZC` C`+
X = B+ +

holds for some Z. Thus we have established the necessity of statement (ii). Sufficiency and the
explicit formula for X follow by construction. This completes the proof. 2
Note that all solutions X to (2.76) can be captured by the freedoms in the choice of parameters
Z, L and R. For the special case where B and C have full column and row rank, respectively, the
freedom due to Z disappears since Br+ Br = I and C` C`+ = I. In view of the above proof, the
freedom R > 0 can be restricted to have the structure R = rI for some scalar r > 0 without loss
of generality. The following result shows another special case where the freedom R > 0 (as well as
Z) disappears.

Corollary 2.3.8 Let matrices B, C and Q = Q∗ be given. Suppose the conditions in statement
(ii) of Theorem 2.3.12 hold. If we further assume that

B∗ B > 0, CB⊥∗ B⊥ C∗ > 0,

then all matrices X satisfying


BXC + (BXC)∗ + Q < 0

are given by
X = X1 + X2 LX3

where L is an arbitrary matrix such that ‖L‖ < 1 and

    X1 = (C1∗ − Q12 Q22−1 C2∗ )(C2 Q22−1 C2∗ )−1 ,

    X2 = (Q12 Q22−1 Q12∗ − Q11 + X1 X3−2 X1∗ )1/2 ,

    X3 = (−C2 Q22−1 C2∗ )−1/2 ,

    [ Q11   Q12 ]   [ B+ ]
    [           ] = [    ] Q [B+∗ B⊥∗ ],
    [ Q12∗  Q22 ]   [ B⊥ ]

    [C1 C2 ] = C [B+∗ B⊥∗ ].

Proof. By a congruence transformation, we have

    [ B+ ]
    [    ] (BXC + (BXC)∗ + Q) [B+∗ B⊥∗ ] < 0,
    [ B⊥ ]

or equivalently, using the definitions given above,


" #
Q11 + XC1 + C∗1 X∗ Q12 + XC2
< 0.
Q∗12 + C∗2 X∗ Q22

Since statement (ii) in Theorem 2.3.12 holds, we have Q22 < 0 and hence the above inequality is
equivalent to
    Q11 + XC1 + C1∗ X∗ − (Q12 + XC2 )Q22−1 (Q12 + XC2 )∗ < 0.

Now, by supposition,
    C2 C2∗ = CB⊥∗ B⊥ C∗ > 0,

and hence C2 Q22−1 C2∗ < 0. Thus, after expanding, we can complete the square as follows;

    (X − X1 )X3−2 (X − X1 )∗ < X2²

where matrices X1 , X2 and X3 are defined above. Then the result follows as a special case of
Corollary 2.3.6. This completes the proof. 2
Note that the result of Theorem 2.3.11 follows as a special case of Theorem 2.3.12. For R > 0,

(A + BXC)R(A + BXC)∗ < Q

is equivalent to
" # " # " #
B h i 0 h i Q A

(−X) 0 C + (−X) B∗ 0 − < 0.
0 C∗ A R−1


Thus, applying Theorem 2.3.12, we have the existence conditions in statement (ii) of Theorem
2.3.11. If B∗ B > 0 and CC∗ > 0, then the assumptions in Corollary 2.3.8 are satisfied, i.e.,
" #∗ " #
B B
= B∗ B > 0,
0 0
" #⊥∗ " #⊥ " #
h i B B 0
0 C = CC∗ > 0.
0 0 C∗
Hence, the result of Corollary 2.3.8 can be applied to obtain an explicit formula for X, where all
the freedoms are captured by ‖L‖ < 1 as is the case for the formula given in Theorem 2.3.11.
Theorem 2.3.12 reduces to the following when specialized to the case C = I.

Corollary 2.3.9 Let matrices B and Q = Q∗ be given. The following statements are equivalent.
(i) There exists a matrix X satisfying

BX + (BX)∗ + Q < 0.

(ii) The following condition holds.

B⊥ QB⊥∗ < 0 or BB∗ > 0.

Suppose the above statements hold and further assume that B∗ B > 0. Then all matrices X in
statement (i) are given by

    X = −ρB∗ + ρ1/2 LΩ1/2

where L is any matrix such that ‖L‖ < 1 and ρ > 0 is any scalar such that

Ω = ρBB∗ − Q > 0.
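The parametrization in Corollary 2.3.9 can be checked directly. In this sketch (added for illustration, not from the book) Q = BB∗ − I is chosen so that B⊥QB⊥∗ < 0 holds, and ρ = 2 gives Ω = BB∗ + I > 0:

```python
import numpy as np

rng = np.random.default_rng(8)
n, m = 4, 2

def psd_sqrt(M):
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(w)) @ V.T

B = rng.standard_normal((n, m))              # full column rank: B^*B > 0
Q = B @ B.T - np.eye(n)                      # then Bperp Q Bperp^* = -Bperp Bperp^* < 0

rho = 2.0
Omega = rho * B @ B.T - Q                    # = BB^T + I > 0
assert np.min(np.linalg.eigvalsh(Omega)) > 0

L = rng.standard_normal((m, n))
L = 0.9 * L / np.linalg.norm(L, 2)           # any L with ||L|| < 1
X = -rho * B.T + np.sqrt(rho) * L @ psd_sqrt(Omega)

res = B @ X + (B @ X).T + Q
assert np.max(np.linalg.eigvalsh(res)) < 0   # strict inequality holds
```

Expanding BX + (BX)∗ + Q with this X gives −(Ω^{1/2} − ρ^{1/2}BL)(Ω^{1/2} − ρ^{1/2}BL)∗ − ρB(I − LL∗)B∗, which is negative definite for ‖L‖ < 1.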

Chapter 2 Closure
Chapter 2 defines the notation, techniques, and virtually all the mathematical tools that are needed
to derive all results of this book. Linear algebra concepts have become increasingly important in
the state space analysis and design of control systems [128]. A detailed presentation of the singular
value decomposition and its connections with control theory can be found in [76]. The Moore-
Penrose generalized inverse was defined in [104]. Solvability conditions for linear matrix equations
can be found in [109], [3] and [73] although many results in this chapter are new in this area. Most
of the solvability conditions of linear and quadratic matrix inequalities are new although the basic
Theorem 2.3.10 dates back to Finsler [28]. More theoretical details in matrix methods are available
in the books [51], [42], [154].
Theorem 2.3.12 is the most important result in this book. Almost all control problems in
this book can be analytically solved by this theorem. That is, approximately twenty different control
problems all reduce to this one problem of linear algebra. The point of the book is to show how to
rearrange these control problems so that they take the form of Theorem 2.3.12.
Chapter 3

Analysis of First-Order Information

This chapter reviews the classical analysis of state space models including the system “abilities”;
observability, controllability, and stability.

3.1 Solutions of Linear Differential Equations


We consider a linear, time-varying dynamic system of the form

    ẋ(t) = A(t)x(t) + B(t)u(t),
    y(t) = C(t)x(t) + D(t)u(t)    (3.1)

where A(t), B(t), C(t), and D(t) are the system matrices, which may be functions of time t; x(t)
is the state vector, u(t) is the control input, and y(t) is the output. The solution of (3.1) is
given by

    x(t) = Φ(t, t0 )x(t0 ) + ∫_{t0}^{t} Φ(t, σ)B(σ)u(σ) dσ

where Φ(t, t0 ) is called the state transition matrix and is generated by solving the differential
equation
    (d/dt)Φ(t, t0 ) = A(t)Φ(t, t0 ),    Φ(t0 , t0 ) = I.
If A is a constant matrix, then

    Φ(τ + t0 , t0 ) = Φ(τ, 0) = eAτ = Σ_{i=0}^{∞} Ai τ i /i! ,    (3.2)

and
    x(t) = eA(t−t0 ) x(t0 ) + ∫_{t0}^{t} eA(t−σ) Bu(σ) dσ.    (3.3)

Various properties of eAt may be found in a first course in linear systems [11, 126, 68], such as the
following.


Theorem 3.1.1 Let A be a constant n × n matrix. Then

    eAt = L−1 [sI − A]−1    (3.4a)
    eAt = EeΛt E−1    (3.4b)
    eAt = Σ_{i=0}^{n−1} Ai αi (t)    (3.4c)

where L−1 is the inverse Laplace transformation operator, A = EΛE−1 is the spectral decomposition
of A, and the functions αi (t) are computed from the inverse Laplace transforms of

    αn−1 (s) = |sI − A|−1
    αn−2 (s) = |sI − A|−1 (s + an−1 )
    αn−3 (s) = |sI − A|−1 (s2 + an−1 s + an−2 )    (3.5a)
        ...
    α0 (s) = |sI − A|−1 (sn−1 + an−1 sn−2 + · · · + a2 s + a1 )

where

    |sI − A| = sn + an−1 sn−1 + an−2 sn−2 + · · · + a1 s + a0 .

The above theorem allows one to compute the first-order information y(t) for a linear system.
To illustrate how difficult and unreliable such calculations can be, consider four examples for the
computation of eAt .
" # " #
−1 1 −1 + ² 1
A1 = , A2 = ,
0 −1 0 −1
" # " #
−1 1 −1 1 + ²
A3 = , A4 = .
0 −1 + ² 0 −1

Matrix A1 is defective (meaning that it does not have linearly independent eigenvectors). Matrices
A2 , A3 , A4 might represent various effects of computational errors in an attempt to study the
first-order behavior of ẋ = A1 x. While both A1 and A4 are defective, matrices A2 and A3 are
nondefective, for any (even arbitrarily small) ε ≠ 0. Hence y(t) = CeAt x(0) for each A1 , A2
above yields, for C = [1 0], x(0) = [0 1]T ,

    y1 (t) = CeA1 t x(0)
                                         [ 1  0 ]        [ −1  1  ]
           = CE1 eΛ1 t E1−1 x(0),   E1 = [      ],  Λ1 = [        ]
                                         [ 0  1 ]        [ 0   −1 ]
           = te−t

    y2 (t) = CeA2 t x(0)
                                         [ 1   1 ]        [ −1   0    ]
           = CE2 eΛ2 t E2−1 x(0),   E2 = [       ],  Λ2 = [           ]
                                         [ −ε  0 ]        [ 0   −1+ε  ]

           = ((eεt − 1)/ε) e−t .

The error y2 (t) − y1 (t) is

    y2 (t) − y1 (t) = [ (eεt − 1)/ε − t ] e−t .
Hence, small modeling errors can drastically change the character of the first-order response of a
linear system, and great care is required to obtain a good model for control design.
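The sensitivity described above is easy to reproduce numerically. The sketch below (an illustration added here, not from the book) evaluates y1 and y2 by summing the power series (3.2) and compares them with the closed forms derived above:

```python
import numpy as np

def expm_series(M, terms=60):
    """Matrix exponential via its power series (3.2); fine for these tiny examples."""
    out, term = np.eye(2), np.eye(2)
    for i in range(1, terms):
        term = term @ M / i
        out = out + term
    return out

eps, t = 1e-3, 2.0
A1 = np.array([[-1.0, 1.0], [0.0, -1.0]])            # defective
A2 = np.array([[-1.0 + eps, 1.0], [0.0, -1.0]])      # nondefective perturbation
C, x0 = np.array([1.0, 0.0]), np.array([0.0, 1.0])

y1 = C @ expm_series(A1 * t) @ x0
y2 = C @ expm_series(A2 * t) @ x0

# closed forms from the text
assert np.isclose(y1, t * np.exp(-t))
assert np.isclose(y2, (np.expm1(eps * t) / eps) * np.exp(-t))
```

Even though A2 differs from A1 only by ε, the form of the response changes from te^{−t} to a difference of exponentials, which is the point of the example.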

3.2 Solutions of Linear Difference Equations


Consider a linear discrete-time dynamic system

xk+1 = Ak xk + Bk uk ,
yk = Ck xk + Dk uk (3.6)

where Ak , Bk , Ck , Dk , xk , yk , uk denote matrices and vectors that are functions of the time index

k, that is, at time tk , Ak = A(tk ), etc. The solution of (3.6) for xk is

    xk = Φk0 x0 + Σ_{i=1}^{k} Φki Bi−1 ui−1 ,

    Φkk = I,    Φki = Π_{α=i}^{k−1} Aα .

If A and B are constant then Aα = A for all α and this solution reduces to

    xk = Ak x0 + Σ_{i=1}^{k} Ak−i Bui−1 .    (3.7)
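A quick check of the closed-form solution (3.7) against direct simulation of the recursion (a sketch with random data, not from the book):

```python
import numpy as np

rng = np.random.default_rng(9)
A = rng.standard_normal((3, 3)) * 0.5
B = rng.standard_normal((3, 2))
x0 = rng.standard_normal(3)
u = rng.standard_normal((5, 2))              # u_0, ..., u_4

# simulate the recursion x_{k+1} = A x_k + B u_k
x = x0.copy()
for k in range(5):
    x = A @ x + B @ u[k]

# closed form (3.7): x_k = A^k x_0 + sum_{i=1}^{k} A^{k-i} B u_{i-1}
k = 5
xk = np.linalg.matrix_power(A, k) @ x0 + sum(
    np.linalg.matrix_power(A, k - i) @ B @ u[i - 1] for i in range(1, k + 1))

assert np.allclose(x, xk)
```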

3.3 Controllability and Observability of Continuous-Time Systems


3.3.1 Controllability
Consider the system

    ẋ(t) = A(t)x(t) + B(t)u(t)
    y(t) = C(t)x(t)    (3.8)

Definition 3.3.1 System (3.8) is said to be completely state controllable at time t = t0 if there
exists a time tf > t0 and a control u(t), t ∈ [t0 , tf ] such that the state is transferred from an
arbitrary initial state x(t0 ) = x0 to an arbitrarily specified x(tf ) = xf in a finite time tf < ∞.

Suppose we wish to know whether (3.8) is completely controllable at t0 . This is equivalent to


asking whether there exists a u(σ), σ ∈ [t0 , tf ] such that
    ∫_{t0}^{tf} Φ(tf , σ)B(σ)u(σ) dσ = x̃    (3.9)


for some tf < ∞ and for any specified x̃ = xf − Φ(tf , t0 )x0 . Since every element of the vector x̃
is arbitrary, the rows of the matrix

R(σ) = Φ(tf , σ)B(σ) (3.10)

must be linearly independent on the interval σ ∈ [t0 , tf ]. This is equivalent to


    X(tf ) = ∫_{t0}^{tf} R(σ)RT (σ) dσ > 0,    (3.11)
t0

which is equivalent to

Ẋ(t) = X(t)AT (t) + A(t)X(t) + B(t)BT (t), (3.12)


X(t0 ) = 0,
X(tf ) > 0.

Equation (3.12) may be derived from (3.11) by replacing tf by t and differentiating X(t) with
respect to t.

Theorem 3.3.1 The system ẋ(t) = A(t)x(t) + B(t)u(t) is completely state controllable at time to
if and only if there exists tf < ∞ such that (3.12) holds.

Exercise 3.3.1 Suppose system (3.8) is given. Using similar steps as (3.8)-(3.10) prove that the
output y(tf ) can be taken to an arbitrary value for some tf < ∞ if and only if
    C(tf ) ∫_{t0}^{tf} Φ(tf , σ)B(σ)BT (σ)ΦT (tf , σ) dσ CT (tf ) > 0,

or, equivalently

Ẋ(t) = X(t)AT (t) + A(t)X(t) + B(t)BT (t),


X(t0 ) = 0,
C(tf )X(tf )CT (tf ) > 0. (3.13)

Now suppose hereafter that A, B, C are constant matrices.


Equation (3.11) can also be written as follows,
    X(tf ) = ∫_{t0}^{tf} eA(tf −σ) BBT eAᵀ(tf −σ) dσ

           = −∫_{tf −t0}^{0} eAτ BBT eAᵀτ dτ

           = ∫_{0}^{tf −t0} eAτ BBT eAᵀτ dτ.

Since the integrand eAτ BBT eAᵀτ ≥ 0, it follows that X(t2 ) ≥ X(t1 ) if t2 ≥ t1 (for any given
t0 ). Hence, the existence of a tf such that X(tf ) > 0 does not depend upon the choice of t0 . Now
suppose X(tf ) > 0 for some tf < ∞. Then X(t̃) > 0 for every t̃ ≥ tf , including the limiting case
tf = ∞. Likewise, if X(∞) is not positive definite then X(tf ) is not positive definite for any tf < ∞.
This proves the following.

Corollary 3.3.1 The linear time-invariant system ẋ = Ax + Bu is completely state controllable


if and only if there exists some tf < ∞ such that
    X(tf ) = ∫_{0}^{tf} eAτ BBT eAᵀτ dτ > 0

or equivalently (3.12) holds for t0 = 0 and A, B constant.

Note that X(tf ) always exists for tf < ∞ but might not exist for tf = ∞. This is illustrated
by example.
" # " #
1 0 1
Exercise 3.3.2 Computing X(tf ) for A = ,B= yields
0 −1 1
Z tf Tτ
X(tf ) = eAτ BBT eA dτ
0
Z " #" #" #
tf eτ 0 1 1 eτ 0
= dτ
0 0 e−τ 1 1 0 e−τ
" #
1 2tf
2 (e − 1) tf
= −2tf )
2 (1 − e
1
tf

which is positive definite for every 0 < tf < ∞, but X(tf ) does not exist for tf = ∞.
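The finite-horizon Gramian in Exercise 3.3.2 can be confirmed by numerical quadrature (an illustration added here, not from the book; the trapezoidal rule is written out by hand to avoid NumPy version differences):

```python
import numpy as np

def trapz(y, x):
    """Plain trapezoidal rule."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2)

tf = 1.5
tau = np.linspace(0.0, tf, 20001)

# For A = diag(1, -1) and B = [1, 1]^T, e^{A tau} B = [e^tau, e^{-tau}]^T
rows = np.vstack([np.exp(tau), np.exp(-tau)])

# integrate (e^{A tau} B)(e^{A tau} B)^T entrywise
X = np.array([[trapz(rows[i] * rows[j], tau) for j in range(2)]
              for i in range(2)])

X_exact = np.array([[(np.exp(2 * tf) - 1) / 2, tf],
                    [tf, (1 - np.exp(-2 * tf)) / 2]])

assert np.allclose(X, X_exact, rtol=1e-6)
assert np.all(np.linalg.eigvalsh(X) > 0)     # positive definite for finite tf
```

Letting tf grow makes the (1,1) entry blow up like e^{2tf}/2, which is the numerical face of "X(tf) does not exist for tf = ∞."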
This example shows that tf cannot be taken as infinity in Corollary 3.3.1, without some as-
sumptions. Now define X(∞) by (if it exists),
    X(∞) = ∫_{0}^{∞} eAτ BBT eAᵀτ dτ.    (3.14)

This matrix is called the controllability Gramian. Using (3.4c) write (3.14) as
    X(∞) = [B AB · · · An−1 B] Ω [B AB · · · An−1 B]T    (3.15)

where

                  [ α0 (t)I   ]
                  [ α1 (t)I   ]
    Ω = ∫_{0}^{∞} [    ..     ] [ α0 (t)I  α1 (t)I  · · ·  αn−1 (t)I ] dt.
                  [ αn−1 (t)I ]

Suppose the integral in Ω is finite (i.e., Ω exists). The matrix Ω is positive definite if αi (t), i =
0, . . . , n − 1, are linearly independent functions on the interval t ∈ [0, ∞). This is true for any A and

may be proved from (3.5a), but we omit this proof. Hence, from (3.15), we see that rank[B AB · · ·
An−1 B] = n is necessary and sufficient for X(∞) > 0. We define A to be an asymptotically
stable matrix if all eigenvalues of A are in the open left half plane. Notice from (3.5) that for an
asymptotically stable matrix A,
    lim_{t→∞} αi (t) = 0,

and matrix Ω is finite if A is asymptotically stable. Hence, if A ∈ Rn×n is asymptotically stable


then Ω exists and is positive definite, and thus X(∞) > 0 is equivalent to rank[B AB . . . An−1 B] =
n. This discussion gives a necessary and sufficient condition for X(∞) > 0, under the assumption
that A is asymptotically stable. Finally, we must remove the stability assumption.

Corollary 3.3.2 X(∞) exists if and only if the controllable modes of ẋ = Ax + Bu are asymptot-
ically stable. If X(∞) exists, then X(∞) > 0 if and only if (A,B) is a controllable pair.

Proof. We will only prove the case for nondefective A. Describe the time invariant system
ẋ = Ax + Bu in its modal coordinates. That is

         [ λ1           ]       [ b∗1 ]
         [    λ2        ]       [ b∗2 ]
    ẋ =  [       ...    ] x  +  [  ..  ] u
         [           λn ]       [ b∗n ]

The “modes” are characterized by the eigenvalue, eigenvector pairs and a “controllable mode”
i that is “asymptotically stable” corresponds to {Real[λi ] < 0, bi ≠ 0}. Hence Xij (∞) =
∫_{0}^{∞} eλi τ b∗i bj eλj τ dτ exists if and only if bk = 0 whenever Real[λk ] ≥ 0. 2

Exercise 3.3.3 i) Show that X(∞) exists but is not positive definite for the pair

        [ 1  0  ]        [ 0 ]
    A = [       ],   B = [   ]
        [ 0  −1 ]        [ 1 ]

ii) Show that X(∞) exists and is positive definite for the pair

        [ −2  0  ]        [ 1 ]
    A = [        ],   B = [   ].
        [ 0   −1 ]        [ 1 ]

iii) Show that X(∞) does not exist but the pair

        [ 1  0  ]        [ 1 ]
    A = [       ],   B = [   ]
        [ 0  −1 ]        [ 1 ]

is controllable.

Note that if X(∞) exists it satisfies X(∞) = X, where

0 = XAT + AX + BBT . (3.16)

However, all X that satisfy (3.16) might not be matrices of the form (3.14), (note that (3.14) is
always a positive semidefinite matrix).
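Equation (3.16) is a Lyapunov equation, and for small problems it can be solved by vectorizing with Kronecker products (SciPy's `solve_continuous_lyapunov` does this job in practice; the sketch below, added for illustration, stays with plain NumPy). The matrix Aᵀ⊕A is invertible exactly under the uniqueness condition of Corollary 3.3.3 below, λi + λj ≠ 0:

```python
import numpy as np

def lyap(A, W):
    """Solve X A^T + A X + W = 0 via the vectorized (Kronecker) form.
    Unique solution iff no two eigenvalues of A satisfy lambda_i + lambda_j = 0."""
    n = A.shape[0]
    K = np.kron(np.eye(n), A) + np.kron(A, np.eye(n))
    x = np.linalg.solve(K, -W.flatten(order='F'))    # column-major vec
    return x.reshape((n, n), order='F')

A = np.array([[-2.0, 0.0], [0.0, -1.0]])
B = np.array([[1.0], [1.0]])

X = lyap(A, B @ B.T)
assert np.allclose(X @ A.T + A @ X + B @ B.T, 0)

# (A, B) controllable and A asymptotically stable => X = X(infinity) > 0
ctrb = np.hstack([B, A @ B])
assert np.linalg.matrix_rank(ctrb) == 2
assert np.min(np.linalg.eigvalsh(X)) > 0
```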
" # " #
1 0 0
Exercise 3.3.4 For A = , B= , (3.16) yields
0 −1 1
" #
0 X12
X=
X12 1/2

for arbitrary X12 (say X12 = 1), and X is not a positive semidefinite matrix for X12 6= 0. Hence
this X 6= X(∞) and X(∞) does not exist.

Corollary 3.3.3 The matrix X solving (3.16) is unique if and only if there are no two eigenvalues
of A that are symmetrically located about the jω axis.

Proof. The left eigenvectors li of A satisfy l∗i A = λi l∗i . Multiply (3.16) from the left by l∗i and
from the right by lk to get
0 = l∗i XAT lk + l∗i AXlk + l∗i BBT lk .

But using li* A = λi li* (and noting that, for a scalar, λ* = λ̄),

    0 = li* X lk (λi + λ̄k) + li* B B^T lk

yields unique values for the elements of the transformed X̂, X̂ik = [E^{−1} X E^{−∗}]ik = li* X lk, where E^{−∗} = [l1 · · · ln]:

    X̂ik = li* X lk = − (li* B B^T lk) / (λi + λ̄k)   ∀ i, k

if and only if λi + λ̄k ≠ 0 for all i and k. Or, equivalently, since the eigenvalues occur in complex conjugate pairs, λi + λk ≠ 0 for all i and k. 2
Corollary 3.3.3 suggests one computational procedure for solving Lyapunov equations of the
form (3.16), but the procedure is not efficient because the transformation to Jordan form is com-
putationally difficult. Instead of transforming to a diagonal form, it is much easier to transform A
to a triangular form
 
    U*AU = [λ1 λ12 · · · λ1n; 0 λ2 · · · λ2n; · · ·; 0 0 · · · λn]        (3.17)

since there exists a unitary U to do this. Assume that A in (3.16) is already in upper triangular
form. Then the solution of (3.16) can be expressed in terms of its columns defined by
 
    Xi = [X1i; X2i; · · ·; Xii],   i = 1, 2, · · ·, n                    (3.18)

by computing sequentially, for i = n, n − 1, · · ·, 2, 1,

    [X1i; X2i; · · ·; Xii] = −(λi Ii + Ai)^{−1} { [Q1i; Q2i; · · ·; Qii]
        + [λ1,i+1 · · · λ1n; λ2,i+1 · · · λ2n; · · ·; λi,i+1 · · · λin] [Xi,i+1; Xi,i+2; · · ·; Xin]
        + Σ_{k=i+1}^{n} λik [X1k; X2k; · · ·; Xik] },                     (3.19)

where Ai is the i × i upper left-hand corner of A. The proof is by direct construction; see [154]. Note that λi Ii + Ai is triangular and therefore easy to invert. Hence (3.19) yields the unique solution to (3.16) if λi + λj ≠ 0 for any i, j.
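The recursion (3.19) is easy to implement. Below is a minimal plain-Python sketch (the function name and the test matrices are my own choices, not from the text), assuming A is already upper triangular with λi + λj ≠ 0:

```python
def lyap_upper_triangular(A, Q):
    # Solve 0 = X A^T + A X + Q for symmetric X, with A upper triangular,
    # computing the columns backward (i = n, ..., 1) as in (3.19).
    n = len(A)
    X = [[0.0] * n for _ in range(n)]
    for i in range(n - 1, -1, -1):
        lam = A[i][i]
        rhs = [Q[r][i] for r in range(i + 1)]
        for r in range(i + 1):
            for k in range(i + 1, n):
                # upper-right block of A times known entries of column i,
                # plus the trailing entries of A's i-th row times column k
                rhs[r] += A[r][k] * X[i][k] + A[i][k] * X[r][k]
        # back-substitute the triangular system (lam*I + A_i) x = -rhs
        x = [0.0] * (i + 1)
        for r in range(i, -1, -1):
            s = -rhs[r] - sum(A[r][k] * x[k] for k in range(r + 1, i + 1))
            x[r] = s / (lam + A[r][r])
        for r in range(i + 1):
            X[r][i] = X[i][r] = x[r]
    return X

A = [[-1.0, 2.0], [0.0, -3.0]]    # upper triangular, stable
Q = [[1.0, 1.0], [1.0, 1.0]]      # B B^T with B = [1; 1]
X = lyap_upper_triangular(A, Q)   # exact solution: [[7/6, 1/3], [1/3, 1/6]]
```

Only the first i entries of each column are solved for; the remaining entries come from symmetry, which is the point of the recursion.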

Corollary 3.3.4 If the controllable modes of ẋ = Ax + Bu are asymptotically stable, the following
statements are equivalent.

(i) The system is completely state controllable.


(ii) ∫₀^∞ e^{Aτ} B B^T e^{A^Tτ} dτ > 0.

(iii) X > 0, 0 = AX + XAT + BBT

(iv) rank[B AB · · · An−1 B] = n (dimension of x)
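Condition (iv) is straightforward to test numerically. The following plain-Python sketch (the function names and the elimination-based rank routine are my own, not the text's) builds [B AB · · · A^{n−1}B] and checks its rank against n, using the pairs from Exercise 3.3.3:

```python
def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def rank(M, tol=1e-9):
    # rank via Gaussian elimination with partial pivoting
    M = [row[:] for row in M]
    rows, cols, r = len(M), len(M[0]), 0
    for c in range(cols):
        if r == rows:
            break
        p = max(range(r, rows), key=lambda i: abs(M[i][c]))
        if abs(M[p][c]) < tol:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(rows):
            if i != r:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def is_controllable(A, B):
    n = len(A)
    ctrb = [row[:] for row in B]       # columns of [B AB ... A^{n-1}B]
    AkB = B
    for _ in range(n - 1):
        AkB = mat_mul(A, AkB)
        ctrb = [ctrb[i] + AkB[i] for i in range(n)]
    return rank(ctrb) == n

# Exercise 3.3.3: pair (i) is uncontrollable, pair (iii) is controllable
print(is_controllable([[1.0, 0.0], [0.0, -1.0]], [[0.0], [1.0]]))   # False
print(is_controllable([[1.0, 0.0], [0.0, -1.0]], [[1.0], [1.0]]))   # True
```

Note that the rank test says nothing about stability, consistent with pair (iii) of Exercise 3.3.3 being controllable while X(∞) fails to exist.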

3.3.2 Observability
Now consider the system
ẋ(t) = A(t)x(t), y(t) = C(t)x(t) (3.20)

Suppose we wish to determine x(t0 ) given the data y(t), t0 ≤ t ≤ tf . Note that knowledge of x(t0 )
is equivalent to knowledge of x(t) for any t, since Φ(t, t0 ) is invertible and x(t) = Φ(t, t0 )x(t0 ).
From (3.20),
y(t) = C(t)Φ(t, t0 )x(t0 ) (3.21)

Definition 3.3.2 The system (3.20) is said to be completely observable at time tf > t0 if the data
y(t), t ∈ [t0 , tf ] yields a unique solution x(t0 ) to (3.21).

Now consider y(t) = C(t)x(t) and some given data y(t) over an interval t0 ≤ t ≤ tf. In order for x(t0) to have a unique solution in (3.21), the columns of C(t)Φ(t, t0) must be linearly independent on the interval [t0, tf]. This means that

    K(t0) ≜ ∫_{t0}^{tf} Φ^T(σ, t0) C^T(σ) C(σ) Φ(σ, t0) dσ > 0

(where Φ̇(t, t0) = A(t)Φ(t, t0), Φ(t0, t0) = I), or equivalently

    −K̇(t) = K(t)A(t) + A^T(t)K(t) + C^T(t)C(t)
    K(tf) = 0                                                          (3.22)
    K(t0) > 0 for some t0 < tf.

These results are summarized as follows.

Theorem 3.3.2 The system ẋ(t) = A(t)x(t), y(t) = C(t)x(t) is completely observable at time tf
if and only if there exists 0 < to < tf such that K(to ) > 0, where

− K̇(t) = K(t)A + AT K(t) + CT C


K(tf ) = 0. (3.23)

The time-invariant cases follow in a natural way from the above theorems by setting K̇(t) to zero. The matrix K below is called the observability Gramian:

    K = ∫₀^∞ e^{A^Tτ} C^T C e^{Aτ} dτ

Corollary 3.3.5 If the observable modes of ẋ = Ax, y = Cx are asymptotically stable the follow-
ing statements are equivalent:

(i) The system is completely observable.


(ii) ∫₀^∞ e^{A^Tτ} C^T C e^{Aτ} dτ > 0.

(iii) K > 0, 0 = KA + A^T K + C^T C

(iv) rank[C^T  A^T C^T  · · ·  (A^T)^{n−1} C^T] = n (dimension of x)

Exercise 3.3.5

1. Show that (A, BBT ) is a controllable pair if and only if (A, B) is a controllable pair.

2. Show that, for any W > 0, (A, BWBT ) is a controllable pair if and only if (A, B) is
controllable.

3.4 Controllability and Observability of Discrete-Time Systems


3.4.1 Controllability
Consider now the discrete-time system (3.6).

Definition 3.4.1 The system (3.6) is called “output controllable at time ko ” if there exists an
integer kf and a sequence {uko , uko +1 , uko +2 , · · · ukf } such that ykf = yf for an arbitrarily specified
yf , for any given initial state xko .

When (A, B, C, D) are constant matrices, the “at time ko” can be deleted in the definition and ko = 0 can be substituted without loss. When C = I, of course, y is replaced by x, C and D need not be stated, and output controllability reduces to state controllability in the definition.

Theorem 3.4.1 These statements are equivalent:

(i) The time-varying quadruple (Ak, Bk, Ck, Dk) is output controllable at time ko.

(ii) There exists kf > ko such that

Xk+1 = Ak Xk ATk + Bk BTk (3.24)


    Xko = 0,
    Ckf Xkf Ckf^T + Dkf Dkf^T > 0.                                     (3.25)

Theorem 3.4.2 Let A, B, C be constant and suppose X exists satisfying

X = AXAT + BBT . (3.26)

Then, the following two statements are equivalent:

(i) The system (3.6) is output controllable.

(ii) CXCT + DDT > 0.

The following statements are also equivalent:

(i) The system (3.6) is state controllable.

(ii) X > 0.

The solution to (3.26), if it exists, is



    X = Σ_{i=0}^{∞} A^i B B^T (A^T)^i                                  (3.27)

as proved by direct substitution into (3.26). From (3.27),

    X = Ω Ω^T,   Ω = [B  AB  A^2B  · · ·],

making it clear that rank X = rank Ω.


The Cayley-Hamilton theorem [11, 126, 68] states that for any real square n × n matrix A with characteristic equation

    λ^n + a_{n−1}λ^{n−1} + · · · + a_1λ + a_0 = 0,

the following holds:

    A^n + a_{n−1}A^{n−1} + · · · + a_1A + a_0 I = 0.

Due to the Cayley-Hamilton theorem

rank[B AB · · · An−1 B] = rank [B AB · · · Ai B]

for any i > n − 1, since An equals some linear combination of lower powers of A. Hence no new
linearly independent columns in Ω are added beyond the An−1 B column block.
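For a 2 × 2 matrix the theorem is quick to verify in code: the characteristic polynomial is λ² − tr(A)λ + det(A), so A² − tr(A)A + det(A)I must vanish. A small check (the example matrix is my own choice):

```python
A = [[1.0, 2.0], [3.0, 4.0]]
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
A2 = [[sum(A[i][k] * A[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]
# residual of Cayley-Hamilton: A^2 - tr(A) A + det(A) I
R = [[A2[i][j] - trace * A[i][j] + det * (1.0 if i == j else 0.0)
      for j in range(2)] for i in range(2)]
print(R)  # [[0.0, 0.0], [0.0, 0.0]]
```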

Theorem 3.4.3 If X in (3.27) exists, then the following statements are equivalent:

(i) The matrix pair (A, B) is (state) controllable.

(ii) rankX = rank[B AB · · · An−1 B] = n (= dimension of A).

(iii) X > 0.

Exercise 3.4.1 Define Xk as the matrix

    Xk ≜ Σ_{i=j+1}^{k} A^{k−i} B B^T (A^T)^{k−i}

and show that

    Xk+1 = A Xk A^T + B B^T.

Corollary 3.4.1 There exists a unique solution X to (3.26) if and only if λi [A] 6= (λj [A])−1 for
all i, j.

Proof. Pre- and postmultiply (3.26) by the matrix of left eigenvectors of A (where λi li* = li* A, E^{−∗} = [l1 l2 · · · ln]) as follows:

    E^{−1} X E^{−∗} = E^{−1}(A X A^∗ + B B^∗)E^{−∗}
                    = (E^{−1}AE)(E^{−1} X E^{−∗})(E^∗ A^∗ E^{−∗}) + E^{−1} B B^∗ E^{−∗}.

Therefore, with X̂ = E^{−1} X E^{−∗},

    X̂ij = λi X̂ij λ̄j + li* B B^∗ lj,   i.e.,   X̂ij = (1 − λi λ̄j)^{−1} li* B B^∗ lj.

Hence the ij element of the transformed matrix X̂ is unique if λi λ̄j ≠ 1. 2


Now solve (3.26) by exploiting the symmetric structure of X and the ith column of dimension
i, as in (3.18), assuming that A has been transformed to an upper triangular form (3.17). Then
the solution of (3.26) is given recursively for i = n, n − 1, · · · , 2, 1 by
 
    Xi = (Ii − λi Ai)^{−1} { Qi + Σ_{k=i+1}^{n} Aik Xk λik + Σ_{k=i}^{n−1} Ãi,n−k X̃k λik }      (3.28)

where Aik is the i × k upper left corner of A, Ãi,n−k is the i × (n − k) upper right corner of A, and the ith column of X is denoted by

    Xi^{col} = [Xi; X̃i],   Xi = [X1i; · · ·; Xii],   X̃i = [Xi,i+1; · · ·; Xi,n].

See [154] for a proof of (3.28), or easily verify by construction.

3.4.2 Observability
Define the system

    xk+1 = Ak xk + Bk uk
                                                                       (3.29)
    yk = Ck xk + Dk uk.

Definition 3.4.2 We say that (3.29) is observable at time p if there exists a time q ≤ p such that knowledge of {u(k), y(k), q ≤ k ≤ p} allows a unique solution for x(q).
From (3.29) write

    [yq; yq+1; · · ·; yp] = [Cq; Cq+1 Aq; · · ·; Cp Ap−1 · · · Aq] xq
        + [Dq 0 · · · 0;  Cq+1 Bq Dq+1 · · · 0;  · · ·;  Cp Ap−1 · · · Aq+1 Bq · · · Cp Bp−1 Dp] [u(q); · · ·; u(p)]

or, simply,

    ỹ(p, q) = C̃(p, q) xq + B̃(p, q) ũ(p, q)                             (3.30)

Hence, observability at time p is equivalent to the existence of a unique xq satisfying (3.30), given the matrix C̃(p, q) and the vector ŷ(p, q) ≜ ỹ(p, q) − B̃(p, q)ũ(p, q). This linear algebra problem has solution

    xq = C̃+(p, q) ŷ(p, q) + (I − C̃+(p, q)C̃(p, q)) z

if the following existence condition holds:

    (I − C̃(p, q)C̃+(p, q)) ŷ(p, q) = 0                                  (3.31)



The solution xq is unique if the columns of C̃(p, q) are linearly independent, in which case

I − C̃+ (p, q)C̃(p, q) = 0. (3.32)

There exists at least one solution for x(q) if (3.31) holds, and there exists a solution for arbitrary input/output data ŷ(p, q) if and only if the rows of C̃(p, q) are linearly independent, so that

    I − C̃(p, q)C̃+(p, q) = 0                                            (3.33)

Since C̃ ∈ R^{ny(p−q+1)×nx}, uniqueness requires ny(p − q + 1) ≥ nx. Specifically, we require a left inverse of C̃(p, q), or equivalently

    P(p, q) = C̃^T(p, q) C̃(p, q) > 0

or, in the time-invariant case,

    P(p, q) = Σ_{i=0}^{p−q} (A^T)^i C^T C A^i.                           (3.34)

Since P(p2, q2) ≥ P(p1, q1) whenever p2 − q2 ≥ p1 − q1, we test observability in the time-invariant case by the condition

    P = Σ_{i=0}^{∞} (A^T)^i C^T C A^i > 0.                               (3.35)

Since observability is a function only of the matrix pair (A, C), we may say, relative to (3.29), that the “matrix pair (A, C) is observable” (or not).

Exercise 3.4.2 Show that P(p, q) in (3.34) satisfies P(p + 1, q) = A^T P(p, q) A + C^T C, and hence that P in (3.35) satisfies

    P = A^T P A + C^T C.                                                (3.36)

Exercise 3.4.3 For the system given in Example 4.3.1, suppose C = [0 1]. Find out how much data {u(k), y(k), q ≤ k ≤ p} is required to uniquely compute the initial state x(0).
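The mechanics of this kind of exercise can be prototyped in a few lines. The sketch below (the system matrices are my own choices, not the book's Example 4.3.1) takes u = 0 and C = [0 1], stacks the rows C A^k as in C̃(p, q), and solves for the initial state by Cramer's rule:

```python
def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[0.9, 0.0], [0.5, 0.8]]
C = [[0.0, 1.0]]
x0 = [[2.0], [-1.0]]              # "unknown" initial state to recover
rows, ys = [], []
Ck = C
for _ in range(2):                # n = 2 states, so two samples suffice here
    rows.append(Ck[0][:])         # row C A^k of the observability stack
    ys.append(mul(Ck, x0)[0][0])  # measured output y_k = C A^k x_0 (u = 0)
    Ck = mul(Ck, A)
# solve the 2x2 system rows * x0 = ys by Cramer's rule
det = rows[0][0] * rows[1][1] - rows[0][1] * rows[1][0]
x0_rec = [(ys[0] * rows[1][1] - rows[0][1] * ys[1]) / det,
          (rows[0][0] * ys[1] - ys[0] * rows[1][0]) / det]
```

With this A (lower triangular), C̃ = [C; CA] is nonsingular and the recovery is unique; had A been upper triangular with C = [0 1], det would vanish and x(0) would not be recoverable.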

Theorem 3.4.4 The following statements are equivalent:

(i) The time varying system (3.29) is observable at time p.

(ii) There exists q such that

Pk = ATk Pk+1 Ak + CTk Ck , Pp = 0 (3.37)


Pq > 0

If (A, C) is a pair of constant matrices, and if Pq exists from (3.37) for p = ∞ the following
statements are equivalent:

(iii) The time invariant system (3.29) is observable.



(iv) P = AT PA + CT C, P > 0.

Note, for the time-invariant case and from the Cayley-Hamilton theorem, that

    rank P = rank Σ_{i=0}^{∞} (A^T)^i C^T C A^i = rank([C^T A^T C^T · · ·][C^T A^T C^T · · ·]^T)
           = rank[C^T  A^T C^T  · · ·  (A^T)^{n−1} C^T].

Hence observability is equivalent to

    rank[C^T  A^T C^T  · · ·  (A^T)^{n−1} C^T] = n.

The solution of (3.36) is unique if λi[A] ≠ (λj[A])^{−1} for any i, j, and it follows from the same algorithm (3.28) by the substitutions A → A^T, B → C^T.

3.5 Lyapunov Stability of Linear Systems


The early work of Lyapunov [85] remains, to this day, one of the most powerful methodologies for
stability analysis. No other stability method can treat such a large class of problems: Nonlinear
systems, time-varying systems, linear systems. In fact, Massera [89] has pointed out that under
mild assumptions, a Lyapunov function always exists for proving asymptotic stability of a solution,
if the solution is indeed asymptotically stable. This result extended the Lyapunov method beyond
the sufficient conditions of Lyapunov’s work, to include discussions of necessity as well. In fact, for
linear systems, this opens the door for our characterization of the class of all quadratic Lyapunov
functions that can be used to prove stability of a given stable system. This result can then be easily
extended beyond analysis to parameterize the set of all plant parameters and controller parameters
that can stabilize the system. Hence, relatively new tests are given for a plant to be stabilizable
by a controller of fixed-order (e.g. state feedback, output feedback). All these results follow in
subsequent chapters. This chapter states the relevant Lyapunov stability theory for linear systems.

3.5.1 Continuous-Time Systems


For the linear time-invariant system

ẋ = Ax, y = Cx. (3.38)

the solution for any τ is


y(t) = CeA(t−τ ) x(τ ). (3.39)

Theorem 3.5.1 The following result characterizes the stability of nonlinear systems. The null solution of the system ẋ = f(x, t) is asymptotically stable in the sense of Lyapunov if there exists a scalar function V with V(0, t) = 0, V(∞, t) = ∞, and V(x, t) > 0 for all x ≠ 0, such that either (a) V̇(x, t) ≤ 0 and {(d^i/dt^i)V(x, t) = 0 for all i > 0 implies x = 0}, or (b) V̇(x, t) < 0 for all x ≠ 0.

Generally, stability is a property of a solution. For linear systems stability of the null solution
x(t) = 0 is equivalent to the stability of any other solution since from (3.39) stability will depend
only on A and not x(τ ). By a slight abuse of language in the linear systems of this book we refer
simply to the “stability of the system” (3.38).
We will now show how to construct a Lyapunov function for any linear system, where y in (3.38)
has no physical significance in this discussion. We choose any C such that (3.38) is observable.
This means that

    ∫₀^∞ e^{A^Tσ} C^T C e^{Aσ} dσ > 0.                                   (3.40)

Define

    V(x(t)) = ∫_t^∞ y^T(σ) y(σ) dσ.
Then from (3.38) and (3.40)

V̇ (x(t)) = −yT (t)y(t) = −xT (t)CT Cx(t). (3.41)

From (3.40) and (3.41), using x(σ) = e^{A(σ−t)} x(t), the Lyapunov function is

    V(x(t)) = ∫_t^∞ x^T(σ) C^T C x(σ) dσ
            = x^T(t) [ ∫_t^∞ e^{A^T(σ−t)} C^T C e^{A(σ−t)} dσ ] x(t)
            = x^T(t) P x(t),   P ≜ ∫₀^∞ e^{A^Tτ} C^T C e^{Aτ} dτ

where P > 0 by assumption of observability of (A, C).

Exercise 3.5.1 Show that if P exists, it satisfies the equation

0 = PA + AT P + CT C. (3.42)
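Exercise 3.5.1 can be checked numerically. The sketch below (matrices, step size, and horizon are my own choices) accumulates P = ∫₀^T e^{A^Tt} C^T C e^{At} dt by a rectangle rule for a stable A and verifies that the residual of 0 = PA + A^T P + C^T C is small:

```python
def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def add(X, Y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def scale(c, X):
    return [[c * a for a in r] for r in X]

def tr(X):
    return [list(col) for col in zip(*X)]

def expm_small(M, terms=12):
    # Taylor series; adequate here because ||M|| = ||A*h|| is tiny
    n = len(M)
    E = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in E]
    for k in range(1, terms):
        term = scale(1.0 / k, mul(term, M))
        E = add(E, term)
    return E

A = [[-1.0, 2.0], [0.0, -3.0]]          # stable: eigenvalues -1, -3
C = [[1.0, 0.0]]
CtC = mul(tr(C), C)
h, T = 1e-3, 10.0
Eh = expm_small(scale(h, A))            # one-step transition matrix e^{Ah}
E = [[1.0, 0.0], [0.0, 1.0]]            # e^{At}, accumulated step by step
P = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(int(T / h)):             # rectangle-rule quadrature
    P = add(P, scale(h, mul(tr(E), mul(CtC, E))))
    E = mul(E, Eh)
residual = add(add(mul(P, A), mul(tr(A), P)), CtC)   # should be near zero
```

For these matrices the exact Gramian is P = [[1/2, 1/4], [1/4, 1/6]]; the quadrature error is O(h).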

Exercise 3.5.2 Show that, for some Ω,

    P ≜ ∫₀^∞ e^{A^Tτ} C^T C e^{Aτ} dτ = [C^T A^T C^T · · ·] Ω [C^T A^T C^T · · ·]^T

and that rank P = rank[C^T  A^T C^T  · · ·  (A^T)^{n−1} C^T]. Note that since the integrand e^{A^Tσ} C^T C e^{Aσ} is a nonnegative definite matrix, ∫_t^{t2} e^{A^Tσ} C^T C e^{Aσ} dσ ≥ ∫_t^{t1} e^{A^Tσ} C^T C e^{Aσ} dσ if t2 ≥ t1, and observability at any tf < ∞ implies observability at tf = ∞. Note also that observability (at any time) is guaranteed if one chooses a nonsingular C. Hence V(x(t)) > 0 ∀ x ≠ 0, where P > 0
is parametrized by C such that (A, C) is observable. Now, V̇ ≤ 0 obviously from (3.41), and
V̇ ≡ 0 is equivalent to y(t) ≡ 0. Hence, the question “does V̇ ≡ 0 imply x = 0?” reduces to “does
y(t) ≡ 0 imply x = 0?” But this is true if and only if all state variables are observable in y(t).
This is guaranteed by the (A, C) observable assumption. Hence, we have the conclusion,

Theorem 3.5.2 The following are equivalent statements:



(i) The system {ẋ = Ax} is asymptotically stable in the sense of Lyapunov.

(ii) The eigenvalues of A lie in the open left half plane.

(iii) If (A, C) is an observable pair, there exists P > 0 satisfying 0 = PA + A^T P + C^T C.

Proof that (ii) and (iii) are equivalent follows by multiplying the Lyapunov equation by e∗i from
the left and by ei from the right where Aei = λi ei to get

    0 = ei* P A ei + ei* A^∗ P ei + ei* C^∗ C ei
      = ei* P ei (λi + λ̄i) + ei* C^∗ C ei.

Since ei* C^∗ C ei > 0 (note that C ei ≠ 0 because of observability of every mode) and ei* P ei > 0, it follows that λi + λ̄i = 2 Real[λi] < 0.
Now consider another Lyapunov function, for t > 0,

V(x(t)) = x∗ (t)X−1 x(t) (3.43)

where X is defined by

    X ≜ ∫₀^∞ e^{Aσ} B B^T e^{A^Tσ} dσ
which satisfies (if X exists)
0 = XAT + AX + BBT . (3.44)

Then

    V̇(x(t)) = ẋ^T(t) X^{−1} x(t) + x^T(t) X^{−1} ẋ(t)
             = x^T(t) A^T X^{−1} x(t) + x^T(t) X^{−1} A x(t)
             = x^T(t) X^{−1} [X A^T + A X] X^{−1} x(t).                  (3.45)

Hence, V(x(t)) > 0, if X > 0 satisfies (3.44), and V̇(x(t)) ≤ 0, V̇(x(t)) ≡ 0 implies x = 0 if
(A, B) is a controllable pair.

Theorem 3.5.3 The following statements are equivalent:

(i) The system ẋ = Ax is asymptotically stable in the sense of Lyapunov.

(ii) The eigenvalues of A lie in the open left half plane.

(iii) If (A, B) is a controllable pair, then there exists X > 0, satisfying

0 = XAT + AX + BBT . (3.46)

(iv) If (A, B) is a stabilizable pair, then there exists X ≥ 0, satisfying (3.46).



Obviously (A, B) or equivalently (A, BBT ) is controllable for any nonsingular B, and (A, C)
is observable for any nonsingular C. Hence Theorems 3.5.2 and 3.5.3 readily lead to the following.

Corollary 3.5.1 The following statements are equivalent

(i) ẋ = Ax is asymptotically stable

(ii) There exists P > 0 satisfying


PA + AT P < 0

(iii) There exists X > 0 satisfying

XAT + AX < 0. (3.47)

One final statement is important for the case when only stability, rather than asymptotic sta-
bility is of interest.

Corollary 3.5.2 The following statements are equivalent

(i) ẋ = Ax is at least stable

(ii) There exists P > 0 satisfying


PA + AT P ≤ 0

(iii) There exists X > 0 satisfying


XAT + AX ≤ 0.

Example 3.5.1 Show that the eigenvalues of A lie in the open left half plane if and only if X > 0
satisfies
XAT + AX < 0

Solution: Define some left eigenvector of A by l. Then l∗ A = λl∗ where λ is an eigenvalue of


A. For negative definiteness:
l∗ (XAT + AX)l < 0 ∀l 6= 0

and using l∗ A = λl∗


    l^∗ X l λ̄ + λ l^∗ X l < 0
    (λ + λ̄) l^∗ X l < 0
    2(Re λ) l^∗ X l < 0.

But since X > 0, this is equivalent to


Re λ < 0.

Note that 2Reλ ≤ 0 results when the inequality XAT + AX ≤ 0 is used.



3.5.2 Discrete-Time Systems


Stability theorems for the discrete case follow in a similar manner as for the continuous-time case.
Hence, results are merely summarized without proof below. Consider a linear system described by

xk+1 = Axk , yk = Cxk

where C is any matrix such that (A, C) is observable. This means

    P = Σ_{i=k}^{∞} (A^T)^{i−k} C^T C A^{i−k} > 0.

Define

    V(xk) = Σ_{i=k}^{∞} yi^T yi
          = Σ_{i=k}^{∞} xk^T (A^T)^{i−k} C^T C A^{i−k} xk
          = xk^T [ Σ_{i=k}^{∞} (A^T)^{i−k} C^T C A^{i−k} ] xk
          = xk^T P xk.

Then

V(xk+1 ) − V(xk ) = −ykT yk


= −xTk CT Cxk .

Note also that


V(xk+1 ) − V(xk ) = xTk+1 Pxk+1 − xTk Pxk

It is straightforward by substitution of xk+1 = Axk to show that P satisfies the linear matrix
equation
P = AT PA + CT C.

Consider now another Lyapunov function

V(xk ) = xTk X−1 xk k>0

where X is defined by

    X = Σ_{i=0}^{∞} A^i B B^T (A^T)^i

which satisfies
X = AXAT + BBT .

Theorem 3.5.4 The following statements are equivalent



(i) All eigenvalues of A lie in the open unit circle.

(ii) P − AT PA > 0, P>0

(iii) X − AXAT > 0, X>0

Exercise 3.5.3 Show that the eigenvalues of A lie inside the open unit disk if and only if

−X + AXAT < 0, X > 0.
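Theorem 3.5.4 can be illustrated numerically. The sketch below (the example matrices are my own choices) sums the series X = Σ A^i B B^T (A^T)^i for an A with eigenvalues inside the unit circle, and checks both the discrete Lyapunov equation X = AXA^T + BB^T and positive definiteness:

```python
def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def add(X, Y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def tr(X):
    return [list(col) for col in zip(*X)]

A = [[0.5, 0.2], [0.0, -0.4]]        # eigenvalues 0.5 and -0.4
B = [[1.0], [1.0]]
BBt = mul(B, tr(B))
term = [row[:] for row in BBt]       # term_0 = B B^T
X = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(200):                 # terms decay geometrically
    X = add(X, term)
    term = mul(mul(A, term), tr(A))  # term_{i+1} = A term_i A^T
rhs = add(mul(mul(A, X), tr(A)), BBt)
residual = max(abs(X[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
detX = X[0][0] * X[1][1] - X[0][1] * X[1][0]
# residual is essentially zero, and X > 0 (X[0][0] > 0, detX > 0),
# consistent with (A, B) being a controllable pair
```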

Chapter 3 Closure
This chapter contains a solution to the state space equations, and a discussion of the classical tests
for observability, controllability of continuous and discrete-time systems. But since controllability
and observability have nothing to do with stability, we need more “abilities” of linear systems
to capture the essential performance. In later chapters, we will be interested in characterizing
the set of all observability and controllability Gramians (3.14)-(3.23) which can be assigned by
feedback control. More detailed expositions on the state space solution and the observability and
controllability of continuous and discrete-time systems can be found in traditional linear systems
textbooks such as [68], [11], [126]. Lyapunov stability theory is much more general than the linear system analysis presented here. For asymptotically stable solutions of linear or nonlinear systems, a Lyapunov function always exists which will prove asymptotic stability. For nonlinear systems such Lyapunov functions are hard to find, and there is no general procedure for constructing them. However, such searches for Lyapunov functions are not necessary
for linear systems, since quadratic functions always work. This chapter parameterizes all quadratic
Lyapunov functions that a linear system can have. There are two subdivisions of this set: Those
Lyapunov matrices satisfying equality constraints (3.46) for a controllable pair (A, B); and those
Lyapunov matrices satisfying inequality constraints (3.47). When the plant matrix A contains
adjustable parameters such as control parameters, then further chapters can take the next step of
parameterizing the set of all control parameters and Lyapunov matrices that a stable system can
have. A detailed exposition of Lyapunov stability theory for linear systems can be found in [152].
For discrete-time systems see [79]. Lyapunov techniques are used extensively for control design of
nonlinear systems in the very important work of Corless, Leitmann, and others [2, 17, 18, 111, 15,
16].
Chapter 4

Second-Order Information in Linear Systems

4.1 The Deterministic Covariance Matrix for Continuous-Time Systems
Consider the time-varying system

    ẋ(t) = A(t)x(t) + D(t)w(t),   w(t) ∈ R^{nw}, x(t) ∈ R^{nx}
                                                                         (4.1)
    y(t) = C(t)x(t).

Define x(i, t, τ ), y(i, t, τ ) as the state (output) response at time t due to the ith excitation
applied at time τ ≤ t. The admissible excitation events are r = nx + nw in number (nw impulses
and nx initial conditions):
    wα(t) = wα δ(t − τ),   α = 1, ..., nw
                                                                         (4.2)
    xβ(τ) = xβ0,           β = 1, ..., nx

where i = α for i ≤ nw and i = nw + β for i > nw , and where wα is the strength of the impulse
applied in the αth input channel.
With zero initial conditions x(0) = 0, and unit intensities wα = 1, α = 1, ..., nw , the vectors
y(i, t, τ ) associated with system (4.1) are simply the columns of the impulse response matrix

C(t)Φ(t, τ )D(τ ) = [y(1, t, τ ), y(2, t, τ ), ..., y(nw , t, τ )] (4.3)

where Φ(t, τ) is the state transition matrix for A(t). For the time-invariant case we can take τ = 0
and write
CeAt D = [y(1, t), y(2, t), ..., y(nw , t)]. (4.4)

Define

    X(t) ≜ Σ_{i=1}^{r} ∫₀^t x(i, t, τ) x^T(i, t, τ) dτ,   r = nx + nw.       (4.5)


Consider that, from (4.1) and (4.2),

    x(i, t, τ) = e^{A(t−τ)} xβ0,    i = nw + β,
    x(i, t, τ) = e^{A(t−τ)} D wi0,  i ≤ nw,

where xβ0 = [0, ..., xβ0, ..., 0]^T and wi0 = [0, ..., wi0, ..., 0]^T have a single nonzero entry (in position β and i, respectively), and

    ẋ(i, t, τ) = A e^{A(t−τ)} d,   d = xβ0 or D wi0.

Then from (4.5) it is easy to show that

    Ẋ(t) = Σ_{i=1}^{r} x(i, t, t) x^T(i, t, t)
           + A ∫₀^t Σ_{i=1}^{r} x(i, t, τ) x^T(i, t, τ) dτ
           + [ ∫₀^t Σ_{i=1}^{r} x(i, t, τ) x^T(i, t, τ) dτ ] A^T

and

    Σ_{i=1}^{r} x(i, t, t) x^T(i, t, t) = D W0 D^T + X0

where

    W0 ≜ diag[... wi² ...] > 0,   X0 = diag[... xi²(0) ...] > 0.           (4.6)

Hence X(t) satisfies the differential equation

Ẋ(t) = X(t)AT + AX(t) + DW0 DT + X0 , X(0) = 0. (4.7)

Definition 4.1.1 For system ẋ = Ax+Dw, the deterministic covariance, or simply “D-Covariance”
of the state is defined by (4.5) and satisfies (4.7).

In the stability theory of Theorem 3.5.3 we are free to choose B. Now comparing (3.46) with

(4.7) leads to the following conclusion by choosing BBT in (3.46) as BBT = DW0 DT + X0 .
Note that for any A the pair (A, B) is controllable with this choice, since X0 > 0 guarantees that
BBT > 0.
In the time-invariant case it is not necessary to let the time of excitation be variable as in (4.5), so τ can be taken as 0; integrating over t instead yields the steady-state answer

    X ≜ Σ_{i=1}^{r} ∫₀^∞ x(i, t) x^T(i, t) dt = ∫₀^∞ e^{At}(D W0 D^T + X0) e^{A^Tt} dt      (4.8)

where

    x(i, t) = e^{At} xi0  or  e^{At} D wi0,

and X satisfies

    0 = X A^T + A X + D W0 D^T + X0,   X0 > 0.                             (4.9)

Theorem 4.1.1 Let (A, D, W0 , X0 ) be constant, with X0 > 0. These statements are equivalent:

(i) The eigenvalues of A all lie in the open left half plane

(ii) The D-Covariance satisfying (4.9) is positive definite X > 0.

The proof follows immediately from Theorem 3.5.3.


Define

    Y(t) ≜ Σ_{i=1}^{r} ∫₀^t y(i, t, τ) y^T(i, t, τ) dτ = C(t) X(t) C^T(t)      (4.10)

where the outputs y(i, t, τ) denote the response at time t to the ith excitation alone, applied at time τ, and X(t) satisfies (4.7); or, in the time-invariant case,

    Y = Σ_{i=1}^{r} ∫₀^∞ y(i, t) y^T(i, t) dt = C X C^T                     (4.11)

where X satisfies (4.9). The matrices (4.10) and (4.11) have yet another physical significance.

Theorem 4.1.2 Let an unknown strictly proper linear system (time-invariant) have zero initial
conditions and let impulses (of any finite intensity wi , wi (t) = wi δ(t)) be applied at the input, one
at a time. Compute the following integral Y from the responses y(i, t) of these nw experiments:

    Y = Σ_{i=1}^{nw} ∫₀^∞ y(i, t) y^T(i, t) dt.                             (4.12)

There exists some input w(t) (not impulses) to take the output y(t) to an arbitrarily specified value
y(tf ) = ȳ in a finite time tf < ∞ if and only if the calculation (4.12) yields Y > 0.

The proof of this theorem follows from the output controllability results of (3.13) requiring

CXCT > 0
0 = XAT + AX + DW0 DT .

The important conclusion from Theorem 4.1.2 is that the necessary and sufficient condition for
output controllability can be stated in terms of a physically meaningful matrix (4.12), (which
is obtained from the physical impulse responses rather than knowledge of the internal model).
Consider, for example, the time-invariant case (4.8), where the square roots of the diagonal elements of X represent the (loosely called RMS) values of the state variables, defined by

    xk(RMS) = [Xkk]^{1/2} = [ Σ_{i=1}^{r} ∫₀^∞ xk²(i, t) dt ]^{1/2},   k = 1, ..., nx.

Likewise,

    yk(RMS) = [Ykk]^{1/2} = [C X C^T]_{kk}^{1/2},   k = 1, ..., ny.          (4.13)
Hence, a parametrization of all stable linear systems given in terms of all attainable RMS values
of the state variables xk , k = 1, ..., nx provides an explicit connection between stability and RMS
performance. Later in the text we exploit these relationships to show, for a single input system,
an explicit one-to-one correspondence between the nx coefficients of the characteristic polynomial
and the nx RMS values of the state variables.

4.2 Models for Control Design (Continuous-Time)


Consider now that (4.1) represents a closed-loop system to be designed. Then all results of the
previous section apply by replacing the matrices A, D, C by their closed-loop equivalents. To
develop the matrices let subscripts p and c denote, respectively, plant and controller matrices.
Then the system of interest is described by
PLANT : ẋp (t) = Ap xp (t) +Bp u(t) +Dp w(t)
OUTPUT : y(t) = Cp xp (t) +By u(t) +Dy w(t)
MEASUREMENT : z(t) = Mp xp (t) +Dz w(t) (4.14)
CONTROLLER : ẋc (t) = Ac xc (t) +Bc z(t)
u(t) = Cc xc (t) +Dc z(t)
or, by assembling these equations in compact form,

    [ẋp(t); ẋc(t)] = [Ap + Bp Dc Mp  Bp Cc; Bc Mp  Ac] [xp(t); xc(t)] + [Dp + Bp Dc Dz; Bc Dz] w(t)

    y(t) = [Cp + By Dc Mp  By Cc] [xp(t); xc(t)] + [Dy + By Dc Dz] w(t)

or, simply,

    ẋ(t) = Ac` x(t) + Bc` w(t)
                                                                          (4.15)
    y(t) = Cc` x(t) + Dc` w(t)

where

    Ac` = A + BGM,   Bc` = D + BGE
                                                                          (4.16)
    Cc` = C + HGM,   Dc` = F + HGE

with

    A ≜ [Ap 0; 0 0],   B ≜ [Bp 0; 0 Inc],   M ≜ [Mp 0; 0 Inc],              (4.17)

    E ≜ [Dz; 0],   H ≜ [By 0],   D ≜ [Dp; 0],   G ≜ [Dc Cc; Bc Ac],

    F ≜ Dy,   x ≜ [xp; xc],   C ≜ [Cp 0],

and the vector dimensions are: xp ∈ R^{np}, xc ∈ R^{nc}, y ∈ R^{ny}, z ∈ R^{nz}, w ∈ R^{nw}, u ∈ R^{nu}, x ∈ R^{nx}, nx = np + nc.
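The bookkeeping in (4.16)–(4.17) is mechanical and easy to get wrong by hand, so a small assembly routine is useful. The sketch below (the helper names and test numbers are mine) builds A, B, M, G, D, E from the plant and controller blocks and forms Ac` = A + BGM and Bc` = D + BGE; the sanity check is simply that the result matches the explicit block form of the closed-loop system above:

```python
def zeros(r, c):
    return [[0.0] * c for _ in range(r)]

def eye(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def hstack(X, Y):
    return [xr + yr for xr, yr in zip(X, Y)]

def vstack(X, Y):
    return [r[:] for r in X] + [r[:] for r in Y]

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def add(X, Y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def closed_loop(Ap, Bp, Dp, Mp, Dz, Ac, Bc, Cc, Dc):
    np_, nc = len(Ap), len(Ac)
    nu, nz = len(Bp[0]), len(Mp)
    A = vstack(hstack(Ap, zeros(np_, nc)), zeros(nc, np_ + nc))
    B = vstack(hstack(Bp, zeros(np_, nc)), hstack(zeros(nc, nu), eye(nc)))
    M = vstack(hstack(Mp, zeros(nz, nc)), hstack(zeros(nc, np_), eye(nc)))
    G = vstack(hstack(Dc, Cc), hstack(Bc, Ac))
    D = vstack(Dp, zeros(nc, len(Dp[0])))
    E = vstack(Dz, zeros(nc, len(Dz[0])))
    Acl = add(A, mul(mul(B, G), M))     # Ac` = A + BGM
    Bcl = add(D, mul(mul(B, G), E))     # Bc` = D + BGE
    return Acl, Bcl

Acl, Bcl = closed_loop(Ap=[[1.0, 2.0], [3.0, 4.0]], Bp=[[1.0], [0.0]],
                       Dp=[[1.0], [1.0]], Mp=[[1.0, 0.0]], Dz=[[0.5]],
                       Ac=[[-2.0]], Bc=[[3.0]], Cc=[[1.0]], Dc=[[2.0]])
```

For these numbers the explicit block form gives Acl = [[3, 2, 1], [3, 4, 0], [3, 0, −2]] and Bcl = [[2], [1], [1.5]], and the routine reproduces them.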

Applying the definition (4.5) to system (4.15) we can see from (4.7) that X(t) satisfies

    Ẋ(t) = X(t)(A + BGM)^T + (A + BGM)X(t) + (D + BGE)W0(D + BGE)^T + X0,
    X(0) = 0,                                                             (4.18)

where the initial conditions are arranged as follows:

    Xpo ≜ diag[xp1²(0), ..., xpnp²(0)] > 0,   Xco ≜ diag[xc1²(0), ..., xcnc²(0)] > 0,

    X0 = [Xpo 0; 0 Xco] > 0,

and the impulsive elements wi(t) have strengths wi, with

    W0 = diag[w1², ..., wnw²] > 0.


If the steady state exists for the time-invariant case, then lim_{t→∞} X(t) = X satisfies [126]

0 = X(A + BGM)T + (A + BGM)X + (D + BGE)W0 (D + BGE)T + X0 (4.19)

4.3 Stochastic Interpretations


Let wp(t), v(t), and xp(0), xc(0) in (4.14) be random with statistics described as follows [78]:

    E[w(t); x(0)] = 0,   E{ [w(t); x(0)] [w(τ); x(0)]^T } = [Ws(t)δ(t − τ) 0; 0 X^o]      (4.20)

where

    Ws = [Wp Wpv; Wpv^T V] > 0,   X^o > 0                                  (4.21)

and E[·] is the expectation operator (an integral over the sample space weighted by the probability density function; see any book on stochastic processes). The matrix X^o is the covariance of the random initial state x(0). The matrix Ws is not necessarily diagonal and is the intensity matrix of the white noise process w(t). Wpv is the correlation between the random
processes wp (t), v(t). Define the covariance matrix Xs (t) for the state of (4.15) by
    Xs(t) = E[x(t) x^T(t)].                                                (4.22)

Then Xs (t) satisfies

    Ẋs(t) = Xs(t)(A + BGM)^T + (A + BGM)Xs(t) + (D + BGE)Ws(D + BGE)^T,
    Xs(0) = X^o.                                                           (4.23)

It is interesting to compare the deterministic theory (4.18) with the stochastic theory (4.23), espe-
cially the conditions under which they represent the same mathematical problem.

Theorem 4.3.1 Define Xs (t) by (4.22) when (4.15) is excited by random variables with statistics
(4.20) and (4.21). Define X(t) by (4.5) where x(i, t, τ ) refers to the response of (4.15) under the
deterministic conditions similar to (4.2). Then Xs (t) = X(t) for all t ≥ 0 if:

(i) the stochastic process w(t) in (4.20) and (4.21) is an independent process (Ws (t) is diagonal)
with constant intensity (Ws constant).

(ii) E[wi (t)wi (τ )] = wi2 δ(t−τ ) (the intensity of the white noise process wi (t) is equal in magnitude
to the square of the intensity of the impulse applied in the deterministic case).

(iii) X^o = X0 = 0 (the initial conditions are zero in both the stochastic and deterministic problems).

The differences between the stochastic and deterministic problems deserve comment. In the
stochastic problem (4.23) the matrix Ws can be time-varying and an arbitrary positive definite
matrix, whereas in the deterministic problem (4.18) W0 is constant diagonal. This difference seems
to be insignificant, since a time-varying scaling and transformation of the disturbance input w(t)
could be applied to eliminate this difference. That is, one can choose Θ(t) such that

ws = w (stochastic) = Θ(t)w (new) = Θ(t)wsn

where wsn is intended to have constant diagonal intensity Wsn:

    E[ws(t) ws^T(τ)] = Θ(t) E[wsn(t) wsn^T(τ)] Θ^T(τ)
                     = Θ(t) Wsn Θ^T(t) δ(t − τ) = Ws(t) δ(t − τ).

The initial state enters the two problems in substantially different ways. In the stochastic case (4.23)
has a positive definite initial condition, whereas (4.18) has zero initial condition. But (4.18) has an
extra forcing term X0 to “compensate” for the zero initial condition. A scalar case illustrates the
4.3. STOCHASTIC INTERPRETATIONS 75

essential differences in the treatment of initial conditions. Let W0 = Ws . For the scalar equation
(4.18) (stable time-invariant case)

    X(t) = [(D + BGE)² W0 + X0] / [−2(A + BGM)] · [1 − e^{2(A+BGM)t}]

and for the scalar equation (4.23)

    Xs(t) = ( X^o + (D + BGE)² Ws / [2(A + BGM)] ) e^{2(A+BGM)t} − (D + BGE)² Ws / [2(A + BGM)]
2(A + BGM ) 2(A + BGM )

yielding an initial difference (t = 0),

    Xs(0) − X(0) = Xs(0) = X^o

and a steady-state difference

    Xs(∞) − X(∞) = X0 / [2(A + BGM)] < 0.

Hence the deterministic problem is conservative in comparison with variances from the stochastic analysis. That is, X(∞) > Xs(∞), meaning that the RMS value of the state, √X(∞), is larger than the standard deviation √Xs(∞) associated with the stochastic problem; this difference is zero when X0 in the deterministic problem is zero.
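These scalar formulas are easy to confirm numerically. The sketch below (all numerical values are my own choices, with Ws = W0 as assumed above) integrates the scalar versions of (4.18) and (4.23) by Euler's method and compares against the closed-form expressions, including the steady-state gap X0/[2(A + BGM)]:

```python
import math

a = -1.0                       # scalar A + BGM (stable)
d = 1.0                        # scalar D + BGE
W0, X0, Xo = 1.0, 0.5, 0.3     # impulse intensity, X0 forcing, E[x(0)^2]
h, T = 1e-4, 3.0
X, Xs = 0.0, Xo
for _ in range(int(T / h)):    # Euler integration of (4.18) and (4.23)
    X += h * (2 * a * X + d * d * W0 + X0)
    Xs += h * (2 * a * Xs + d * d * W0)
X_exact = (d * d * W0 + X0) / (-2 * a) * (1 - math.exp(2 * a * T))
Xs_exact = (Xo + d * d * W0 / (2 * a)) * math.exp(2 * a * T) - d * d * W0 / (2 * a)
gap = X0 / (2 * a)             # steady-state Xs(inf) - X(inf) = -0.25 here
```

With these numbers X(∞) = 0.75 while Xs(∞) = 0.5, illustrating the conservatism of the deterministic covariance.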
The final comparison is perhaps the most important one. Consider the steady-state covariance
Xs and D-covariance X, associated with (4.15), satisfying

    0 = Xs Ac`^T + Ac` Xs + Bc` Ws Bc`^T
    0 = X Ac`^T + Ac` X + Bc` W0 Bc`^T + X0

Due to X0 > 0, asymptotic stability of Acℓ is equivalent to X > 0, regardless of the properties of the
matrix D. On the other hand, asymptotic stability of Acℓ is not equivalent to Xs > 0, because
(Acℓ, Bcℓ) might not be controllable, whereas (Acℓ, Bcℓ W0 Bcℓ^T + X0) is always controllable. If
(A + BGM, D + BGE) is a controllable pair, then X > 0 is equivalent to asymptotic stability of A +
BGM. It is an unfortunate deficiency of the stochastic problem (4.23) that stability of A + BGM
is not equivalent to the condition Xs > 0. Furthermore, to make matters worse, the controllability
condition never holds in the physical world we seek to model. That is, physical systems are never
completely controllable nor observable, even though the (simplified) mathematical model might be.
See [126] for a more complete discussion of the uncontrollability and unobservability of physical
systems.
Hence, to capture the set of all stabilizing controllers of fixed order, one can parametrize the
set of all G(t) and X(t) > 0 satisfying (4.18), but, because (A, D) might not be controllable
one cannot capture the set of all stabilizing controllers by parametrizing all G(t) and X(t) > 0
satisfying (4.23). This gives the deterministic approach a big advantage.
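The scalar comparison above is easy to check numerically. The sketch below uses made-up values for the stable closed-loop scalar A + BGM and the other quantities (none of these numbers come from the text) and verifies that, with W0 = Ws, the steady-state gap is X0/(2(A + BGM)) < 0:

```python
# Numerical check of the scalar deterministic-vs-stochastic comparison.
# All values are assumed/illustrative, not from the book.
import numpy as np

A_cl = -2.0          # A + B*G*M (stable closed-loop scalar, assumed)
B_cl = 1.5           # D + B*G*E (assumed)
W0 = Ws = 0.8        # equal pulse and white-noise intensities
X0 = 0.4             # deterministic initial-condition forcing term

X_inf = (B_cl**2 * W0 + X0) / (-2.0 * A_cl)   # deterministic D-covariance limit
Xs_inf = (B_cl**2 * Ws) / (-2.0 * A_cl)       # stochastic covariance limit

# steady-state gap: Xs(inf) - X(inf) = X0 / (2*(A+BGM)) < 0
assert np.isclose(Xs_inf - X_inf, X0 / (2.0 * A_cl))
assert X_inf > Xs_inf   # the deterministic problem is the conservative one
```

The gap is negative exactly because A_cl < 0 and X0 > 0, confirming that the D-covariance dominates the stochastic covariance at steady state.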

However, the language of (4.23) (“covariance analysis”) is more familiar, and indeed covariance
analysis is the cornerstone of several fields of systems theory (filtering theory, state estimation,
identification, linear quadratic Gaussian optimal control, etc.). So, we should not dismiss the
stochastic problem, but rather modify it slightly to suit our needs. Hereafter, whenever we refer to
the stochastic problem (4.23) we shall assume that D is a nonsingular matrix. This is equivalent to
adding noise sources that can excite all state variables, which is the effect of the X0 term in (4.18).
We must choose a name for matrix X (appearing in (4.5), (4.8) and (4.18)). In order to simplify
our language we will call it the “D-covariance” (deterministic covariance) matrix, since it is the
deterministic equivalent of the familiar stochastic definition of covariance, (4.20)-(4.22). When the
context makes it clear whether we are discussing deterministic or stochastic problems, we may drop
the “D” and just use the word “covariance.”

4.4 The Discrete System D-Covariance


Consider the time-varying system

    x_{k+1} = A x_k + D w_k    (4.24)

Define x(i, k, j) as the state at time k due to the i-th excitation applied at time j ≤ k. The admissible
set of excitation events is r = nx + nw in number (nw pulses and nx initial conditions):

    w_α(k) = w_α δ_kj,   α = 1, ..., nw
    x_β(j) = x_β0,       β = 1, ..., nx        (4.25)

where for convention we (arbitrarily) assign i = α for i ≤ nw and i = nw + β for i > nw, and where
w_α is the magnitude of the pulse applied in the α-th input channel. Define

    X_k ≜ Σ_{i=1}^{r} Σ_{j=0}^{k−1} x(i, k, j) x^T(i, k, j).    (4.26)

Consider that from (4.24), (4.25)

    x(i, k, j) = A^{k−j} x_β0        if i = nw + β,
                 A^{k−j−1} D w_i0    if i ≤ nw,

where

    w_i0 = [ 0  ···  w_i  ···  0 ]^T   (w_i in the i-th entry),   when w(k) = w_i0 δ_kj, i ≤ nw,

    x_β0 = [ 0  ···  x_β0  ···  0 ]^T  (x_β0 in the β-th entry),  when i = nw + β, β = 1, ..., nx,

and

    x(i, k+1, j) = A^{k−j} d,   d = A x_β0 or D w_i0.

Then from (4.26) the reader should show that Xk satisfies

    X_{k+1} = A X_k A^T + D W D^T + X_0    (4.27)

where
    W = diag[ ···  w_α²  ··· ] > 0,   X_0 = diag[ ···  x_β0²  ··· ] > 0.

Definition 4.4.1 The matrix defined by (4.26), satisfying (4.27) is called the “D-Covariance” for
the system (4.24).

Theorem 4.4.1 Let (A,D) be constant. The eigenvalues of A all lie in the open unit disk if and
only if the steady-state “D-Covariance” exists satisfying

    X = A X A^T + D W D^T + X_0,   X > 0.    (4.28)
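Theorem 4.4.1 is easy to check numerically. The sketch below (with illustrative A, D, W, X0, not from the text) iterates the recursion (4.27) and compares the limit against the algebraic equation (4.28):

```python
# Iterate the D-covariance recursion (4.27) and compare its limit with the
# solution of the algebraic equation (4.28).  A, D, W, X0 are illustrative.
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

A = np.array([[0.5, 0.2],
              [0.0, 0.8]])             # eigenvalues inside the open unit disk
D = np.array([[1.0], [0.5]])
W = np.array([[0.6]])                  # pulse intensity w^2 > 0
X0 = np.diag([0.1, 0.2])               # initial-condition intensities > 0

Q = D @ W @ D.T + X0                   # constant forcing term
X = np.zeros((2, 2))
for _ in range(500):                   # X_{k+1} = A X_k A^T + Q
    X = A @ X @ A.T + Q

X_ss = solve_discrete_lyapunov(A, Q)   # solves X = A X A^T + Q
assert np.allclose(X, X_ss, atol=1e-8)
assert np.all(np.linalg.eigvalsh(X_ss) > 0)   # X > 0, as Theorem 4.4.1 asserts
```

Since the forcing term Q is positive definite here, positivity of the steady-state solution holds regardless of the controllability of (A, D), which is the point of adding X0.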

Further motivation for selecting the second-order information (the covariance matrix) as the
design space for linear systems comes from the fact that a larger class of systems can be treated with
linear methods in the space of covariance matrices than in the state space. Consider an example.
Let a certain system be described by

    x(k+1) = (1/2) x(k) + x(k) u(k)    (4.29)

If u(k) is a zero mean white noise (with covariance U ) uncorrelated with x(k), then the state
covariance satisfies the linear equation,

    X(k+1) = (1/4) X(k) + X(k) U    (4.30)

Hence, in the space of covariances, (4.29) is a linear system, whereas in the state space, (4.29)
is a nonlinear system. Thus, developing a control design theory based on second-order information
(covariances), where the design space is the space of all symmetric matrices, allows a larger class
of systems to be treated with linear methods.
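The covariance linearity of (4.30) can be illustrated with a Monte Carlo sketch (the noise intensity U and initial state below are assumed values):

```python
# Monte Carlo sketch of (4.29)-(4.30): the state recursion is nonlinear in x,
# but the second moment propagates linearly, X(k+1) = (1/4 + U) X(k).
import numpy as np

rng = np.random.default_rng(0)
U = 0.5                        # variance of the white noise u(k) (assumed)
N = 400_000                    # number of sample paths
x = np.full(N, 2.0)            # deterministic initial state, so X(0) = 4

X_pred = 4.0
for _ in range(4):
    u = rng.normal(0.0, np.sqrt(U), size=N)   # white, independent of x(k)
    x = 0.5 * x + x * u                        # nonlinear state equation (4.29)
    X_pred = 0.25 * X_pred + X_pred * U        # linear covariance equation (4.30)

# the empirical second moment tracks the linear covariance recursion
assert abs(np.mean(x**2) - X_pred) / X_pred < 0.1
```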

4.5 Models for Control Design (Discrete-Time)


The discrete-time equivalent of the previous section is presented now. The systems of interest are
described by
PLANT : xp (k + 1) = Ap xp (k) +Bp u(k) +Dp w(k)
OUTPUT : y(k) = Cp xp (k) +By u(k) +Dy w(k)
MEASUREMENT : z(k) = Mp xp (k) +Dz w(k) (4.31)
CONTROLLER : xc (k + 1) = Ac xc (k) +Bc z(k)
u(k) = Cc xc (k) +Dc z(k)
or, by assembling these equations in matrix form,

    [ xp(k+1) ]   [ Ap + Bp Dc Mp   Bp Cc ] [ xp(k) ]   [ Dp + Bp Dc Dz ]
    [ xc(k+1) ] = [ Bc Mp           Ac    ] [ xc(k) ] + [ Bc Dz         ] w(k)

    y(k) = [ Cp + By Dc Mp   By Cc ] [ xp(k) ; xc(k) ] + [ Dy + By Dc Dz ] w(k)
or simply

    x(k+1) = Acℓ x(k) + Bcℓ w(k)
    y(k)   = Ccℓ x(k) + Dcℓ w(k)    (4.32)

where

    Acℓ ≜ A + BGM,   Bcℓ ≜ D + BGE,   Ccℓ ≜ C + HGM,   Dcℓ ≜ F + HGE,

    A ≜ [ Ap  0 ; 0  0 ],   B ≜ [ Bp  0 ; 0  I_nc ],   M ≜ [ Mp  0 ; 0  I_nc ],   G ≜ [ Dc  Cc ; Bc  Ac ],

    E ≜ [ Dz ; 0 ],   H ≜ [ By  0 ],   D ≜ [ Dp ; 0 ],   F ≜ Dy,   x ≜ [ xp ; xc ],   C ≜ [ Cp  0 ],
and the vector dimensions are the same as in the continuous-time case (4.15). Consider the equation

    X(k+1) = (A + BGM) X(k) (A + BGM)^T + (D + BGE) W_0 (D + BGE)^T + X_0,   X(0) = 0    (4.33)

where

    X_0 ≜ [ Xp0  0 ; 0  Xc0 ],   Xp0 ≜ diag[ x²_{p1,0}, ..., x²_{p np,0} ],   Xc0 ≜ diag[ x²_{c1,0}, ..., x²_{c nc,0} ],

    W_0 = diag[ w_1², ..., w_{nw}² ],

where w_i is the magnitude of a pulse in the i-th channel, w_i(k) = w_i δ_kτ, applied at time k = τ,
and x_{pi,0} is the initial condition at k = m, x_{pi}(m) = x_{pi,0}. Then X(k) is defined by

    X(k) = Σ_{i=1}^{r} Σ_{m=0}^{k} x(i, k, m) x^T(i, k, m),   r = nx + nw    (4.34)

where x(i, k, m) denotes the response of (4.32) when only the ith excitation is applied at time m,
from the admissible set of excitations
    w_α(k) = w_α δ_km,   α = 1, 2, ..., nw
    x_β(m) = x_β0,        β = 1, ..., nx

for a total of r = nw + nx excitations.


In the time-invariant case, m = 0 and the steady-state value of X(k) becomes

    X = Σ_{i=1}^{r} Σ_{k=0}^{∞} x(i, k) x^T(i, k) = lim_{k→∞} X(k)    (4.35)

and satisfies, if it exists,

X = (A + BGM)X(A + BGM)T + (D + BGE)W0 (D + BGE)T + X0 . (4.36)

The stochastic equivalent of (4.33) is as follows:

    Xs(k+1) = (A + BGM) Xs(k) (A + BGM)^T + (D + BGE) Ws (D + BGE)^T,   Xs(0) = Xo,    (4.37)

where

    Xs(k) = E[x(k) x^T(k)],

    E[ w(k) ; x(0) ] = 0,   E{ [ w(k) ; x(0) ] [ w(j) ; x(0) ]^T } = [ Ws(k) δ_kj   0 ; 0   Xo ].

Similar arguments hold, comparing stochastic and deterministic interpretations of (4.33), (4.37),
as in the continuous-time case.

4.6 System Performance Analysis


4.6.1 Continuous-Time Systems
Consider the linear time-invariant system
    [ ẋ(t) ]   [ A  B ] [ x(t) ]
    [ y(t) ] = [ C  D ] [ w(t) ]    (4.38)

where x is the state, w is the disturbance, and y is the output of interest. Suppose that the system
is considered to have “good” performance if y is “small” regardless of the disturbance w. The
purpose of this section is to define quantitative measures of system performance and to provide
(computable) characterizations of the performance measures.
A standard way to quantify system performance is to consider the system gain Γ:
    Γ ≜ sup_w  size(y) / size(w),

or equivalently,

    Γ = sup_w { size(y) : size(w) ≤ 1 }.
The quantity Γ measures the size of the output signal y in response to the worst-case disturbance
w with zero initial state. Thus, the smaller the system gain, the better the system performance.
The definition of system gain Γ given above is still not concrete enough; we need to specify how to
measure the size of signals w and y. Clearly, different ways of measuring the size lead to different
performance measures. Several measures for the size of a signal are summarized below.
The size of a square-integrable (vector) signal v may be measured by

    ||v||_{L2} = ( ∫_0^∞ ||v(t)||² dt )^{1/2},

where ||·|| is the Euclidean norm of a (constant) vector; for a vector x, ||x|| ≜ (x^T x)^{1/2}. The
quantity ||v||_{L2} is called the L2 norm of signal v. It is also referred to as the energy of signal v
in the control literature. The size of a magnitude-bounded signal v can be measured by

    ||v||_{L∞} = sup_{t≥0} ||v(t)||.

The quantity ||v||_{L∞} is called the L∞ norm of signal v. Note that, if v is a scalar signal, then
||v||_{L∞} is simply its peak value. The size of an impulsive signal v(t) = v0 δ(t) may be quantified
as ||v0||, where δ(·) is the Dirac delta function.
Now, using these notions of the signal size, we define the following performance measures (system
gains) for system (4.38):

Impulse-to-Energy Gain:

    Γie = sup { ||y||_{L2} : w(t) = w0 δ(t), ||w0|| ≤ 1 }
4.6. SYSTEM PERFORMANCE ANALYSIS 81

Energy-to-Peak Gain:

    Γep = sup_{||w||_{L2} ≤ 1} ||y||_{L∞}

Energy-to-Energy Gain:

    Γee = sup_{||w||_{L2} ≤ 1} ||y||_{L2}

To analyze system performance, we would like to compute the system gains defined above. The
following results are useful for this purpose. We use ||A|| to denote the spectral norm (the maximum
singular value) of a matrix A.

Theorem 4.6.1 Consider system (4.38). The impulse-to-energy gain Γie is finite if the system is
strictly proper (D = 0) and asymptotically stable. In this case, Γie is given by

    Γie = ||B^T Y B||^{1/2},   Y A + A^T Y + C^T C = 0,    (4.39)

or alternatively,

    Γie = inf_P { ||B^T P B||^{1/2} : P A + A^T P + C^T C < 0 }.    (4.40)

Proof. Recall that the state trajectory of the system (4.38) in response to w with zero initial state
is given by

    x(t) = ∫_0^t e^{A(t−τ)} B w(τ) dτ.
Using this, for impulsive disturbance w(t) = w0 δ(t), we have

y(t) = CeAt Bw0 .

Hence, the L2 norm of the output signal y is given by


    ||y||²_{L2} = ∫_0^∞ w0^T B^T e^{A^T t} C^T C e^{At} B w0 dt = w0^T B^T Y B w0,    (4.41)

where

    Y ≜ ∫_0^∞ e^{A^T t} C^T C e^{At} dt.
Since the system is asymptotically stable, matrix Y defined above exists and can be computed
as the solution to the Lyapunov equation in (4.39). Now, the impulse-to-energy gain Γie is given
by maximizing the right-hand side of (4.41) over w0 subject to kw0 k ≤ 1. Clearly, the worst-case
direction w0 of the impulsive disturbance is given by the (unit) eigenvector of BT YB corresponding
to the largest eigenvalue. Hence

    max_{||w0|| ≤ 1} w0^T B^T Y B w0 = ||B^T Y B||,

and we have (4.39).


82 CHAPTER 4. SECOND-ORDER INFORMATION IN LINEAR SYSTEMS


Now we prove (4.40). Define R = P − Y and consider the Lyapunov inequality in (4.40):

(R + Y)A + AT (R + Y) + CT C < 0.

Using the Lyapunov equation in (4.39), we have

RA + AT R < 0.

Since A is stable, it follows that R > 0. Thus we have P > Y ≥ 0. Note that the inequality P > Y
is tight, i.e., for any given ε > 0, there exists P satisfying the Lyapunov inequality in (4.40) such
that ||P − Y|| ≤ ε. Hence, the smallest (infimum) value of ||B^T P B||^{1/2} is ||B^T Y B||^{1/2},
which equals Γie by (4.39). This completes the proof. □
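The characterization (4.39) can be sketched numerically; the 2×2 system below is illustrative only (not from the text), and the Lyapunov-based gain is cross-checked by summing the impulse-response energy on a fine time grid:

```python
# Gamma_ie from the Lyapunov equation (4.39), cross-checked by numerically
# integrating the impulse-response energy.  Illustrative test data.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, expm

A = np.array([[-1.0, 0.0],
              [ 1.0, -2.0]])
B = np.array([[1.0], [0.0]])
C = np.array([[0.0, 1.0]])

# Y A + A^T Y + C^T C = 0  (solve_continuous_lyapunov(a, q) solves a x + x a^T = q)
Y = solve_continuous_lyapunov(A.T, -C.T @ C)
gamma_ie = np.sqrt(np.linalg.norm(B.T @ Y @ B, 2))

# brute force: y(t) = C e^{At} B w0; single input, so the worst unit w0 is 1
dt, n = 1e-3, 40_000
Phi = expm(A * dt)
x, energy = B.copy(), 0.0
for _ in range(n):
    energy += (C @ x).item()**2 * dt
    x = Phi @ x

assert abs(np.sqrt(energy) - gamma_ie) < 1e-3
# for this example the impulse response is e^{-t} - e^{-2t}, with energy 1/12
assert abs(gamma_ie - np.sqrt(1.0 / 12.0)) < 1e-9
```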

Theorem 4.6.2 Consider system (4.38). The energy-to-peak gain Γep is finite if the system is
strictly proper (D = 0) and asymptotically stable. In this case, Γep is given by

    Γep = ||C X C^T||^{1/2},   A X + X A^T + B B^T = 0,    (4.42)

or alternatively,

    Γep = inf_Q { ||C Q C^T||^{1/2} : A Q + Q A^T + B B^T < 0 }.    (4.43)

Proof. [155, 19] Let Q be any symmetric matrix satisfying the Lyapunov inequality in (4.43).
Note that Q > 0 since A is stable. Multiplying the Lyapunov inequality in (4.43) by Q^{-1}
from the left and the right and then using the Schur complement formula yields

    Φ ≜ [ Q^{-1}A + A^T Q^{-1}   Q^{-1}B ; B^T Q^{-1}   −I ] < 0.

Let x and w be any signals that satisfy the state equation (4.38). Define v^T = [ x^T  w^T ] and
consider

    v^T(t) Φ v(t) = x^T(t) Q^{-1} (A x(t) + B w(t)) + (A x(t) + B w(t))^T Q^{-1} x(t) − w^T(t) w(t)
                  = (d/dt)(x^T(t) Q^{-1} x(t)) − ||w(t)||² < 0.
Integrating from t = 0 to τ with zero initial state,

    x^T(τ) Q^{-1} x(τ) < ∫_0^τ ||w(t)||² dt ≤ ||w||²_{L2}.

Using the Schur complement formula, we have

    [ ||w||²_{L2}   x^T(τ) ; x(τ)   Q ] > 0,

which implies that

    [ 1  0 ; 0  C ] [ ||w||²_{L2}   x^T(τ) ; x(τ)   Q ] [ 1  0 ; 0  C^T ] = [ ||w||²_{L2}   y^T(τ) ; y(τ)   C Q C^T ] ≥ 0.
4.6. SYSTEM PERFORMANCE ANALYSIS 83

Using the Schur complement formula again,

    (1/||w||²_{L2}) y(τ) y^T(τ) ≤ C Q C^T ≤ ||C Q C^T|| I.

Hence

    y^T(τ) y(τ) ≤ ||w||²_{L2} ||C Q C^T|| ≤ ||C Q C^T||

for all disturbances such that ||w||²_{L2} ≤ 1. Since the above inequality holds for all τ ≥ 0, and for
all Q satisfying the Lyapunov inequality in (4.43), we have established that

    Γep ≤ inf_Q { ||C Q C^T||^{1/2} : A Q + Q A^T + B B^T < 0 } = ||C X C^T||^{1/2},    (4.44)

where X is the solution to the Lyapunov equation in (4.42), and the last equality can be verified
in a similar manner to the proof of Theorem 4.6.1.
Now we show that the inequality ≤ in (4.44) is tight. Consider the disturbance signal given by

    w_T(t) ≜ λ_T^{-1/2} B^T e^{A^T(T−t)} C^T v_T   (0 ≤ t ≤ T),
             0                                       (T < t),    (4.45)

where v_T is the unit eigenvector of C X_T C^T corresponding to the largest eigenvalue λ_T, and

    X_T ≜ ∫_0^T e^{At} B B^T e^{A^T t} dt.

Note that

    ||w_T||²_{L2} = ∫_0^T (1/λ_T) v_T^T C e^{A(T−t)} B B^T e^{A^T(T−t)} C^T v_T dt
                  = (1/λ_T) v_T^T (C X_T C^T) v_T = 1,

for any fixed T > 0. We show that the L∞ norm of the output signal y_T, in response to this
disturbance w_T, approaches ||C X C^T||^{1/2} as T approaches infinity. To this end, first note that

    y_T(T) = ∫_0^T C e^{A(T−τ)} B ( λ_T^{-1/2} B^T e^{A^T(T−τ)} C^T v_T ) dτ = λ_T^{-1/2} C X_T C^T v_T.

Hence,

    ||y_T(T)||² = (1/λ_T) v_T^T (C X_T C^T)² v_T = λ_T.

Taking the limit,

    lim_{T→∞} ||y_T(T)||² = lim_{T→∞} λ_T = ||C X C^T||.

Thus we have shown that Γep ≥ ||C X C^T||^{1/2} using the particular disturbance w_T in (4.45). From this
inequality and (4.44), we conclude the result. □
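A compact numerical sketch of (4.42), with illustrative data: Γep of (A, B, C) is computed from the controllability Gramian, and cross-checked through the transposed realization (A^T, C^T, B^T), whose "observability" Gramian in Theorem 4.6.1 is exactly X:

```python
# Gamma_ep from the Lyapunov equation (4.42), plus a duality cross-check:
# Gamma_ep of (A, B, C) equals Gamma_ie of (A^T, C^T, B^T).  Illustrative data.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.array([[-0.5, 0.3],
              [ 0.0, -1.5]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 0.0]])

X = solve_continuous_lyapunov(A, -B @ B.T)      # A X + X A^T + B B^T = 0
gamma_ep = np.sqrt(np.linalg.norm(C @ X @ C.T, 2))

Ad, Bd, Cd = A.T, C.T, B.T                      # dual (transposed) realization
Yd = solve_continuous_lyapunov(Ad.T, -Cd.T @ Cd)   # Yd Ad + Ad^T Yd + Cd^T Cd = 0
gamma_ie_dual = np.sqrt(np.linalg.norm(Bd.T @ Yd @ Bd, 2))

assert np.isclose(gamma_ep, gamma_ie_dual)      # duality: Yd is exactly X
```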


System gains Γie and Γep are related to the H2 norm of the transfer matrix T(s) = C(sI − A)^{-1}B,
defined in the frequency domain by

    ||T||_{H2} = ( (1/2π) tr ∫_{−∞}^{∞} T(jω) T(jω)* dω )^{1/2},

or equivalently,

    ||T||_{H2} = ( (1/2π) tr ∫_{−∞}^{∞} T(jω)* T(jω) dω )^{1/2}.
Using the fact that

    X = (1/2π) ∫_{−∞}^{∞} (jωI − A)^{-1} B B^T (−jωI − A^T)^{-1} dω,
    Y ≜ (1/2π) ∫_{−∞}^{∞} (−jωI − A^T)^{-1} C^T C (jωI − A)^{-1} dω

satisfy

    A X + X A^T + B B^T = 0,
    Y A + A^T Y + C^T C = 0,

it is easy to see that the H2 norm of T(s) is given by

    ||T||²_{H2} = tr(C X C^T) = tr(B^T Y B).

Note that, if we replace tr(·) by ||·||, then we obtain the characterizations of Γie and Γep in
Theorems 4.6.1 and 4.6.2. Since tr(A) = ||A|| if A is a (nonnegative) scalar, we see that ||T||_{H2} =
Γie = Γep for single-input, single-output systems. A time-domain interpretation of the H2 norm is
given by
    ||T||²_{H2} = Σ_{i=1}^{nw} ||y(i,·)||²_{L2},
where y(i, ·) is the output signal in response to an impulsive disturbance with unit intensity applied
to the ith disturbance channel, and nw is the number of such channels (i.e., the dimension of w).
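The two trace formulas can be verified numerically; the two-input, two-output data below are illustrative:

```python
# Verify tr(C X C^T) = tr(B^T Y B) for the H2 norm on illustrative data.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.array([[-1.0, 0.0], [1.0, -2.0]])
B = np.array([[1.0, 0.0], [0.0, 2.0]])
C = np.eye(2)

X = solve_continuous_lyapunov(A, -B @ B.T)     # A X + X A^T + B B^T = 0
Y = solve_continuous_lyapunov(A.T, -C.T @ C)   # Y A + A^T Y + C^T C = 0

# both traces give the squared H2 norm
assert np.isclose(np.trace(C @ X @ C.T), np.trace(B.T @ Y @ B))
h2 = np.sqrt(np.trace(C @ X @ C.T))
assert h2 > 0
```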
Finally, we shall give a computable characterization of the energy-to-energy gain Γee .

Theorem 4.6.3 Consider system (4.38) and let a scalar γ > 0 be given. Suppose the system is
asymptotically stable. Then the following statements are equivalent.

(i) Γee < γ.



(ii) R ≜ γ²I − D^T D > 0 and there exists a symmetric matrix Y > 0 such that

    Y A + A^T Y + (Y B + C^T D) R^{-1} (Y B + C^T D)^T + C^T C = 0

and A + BR−1 (YB + CT D)T is asymptotically stable.



(iii) R ≜ γ²I − D^T D > 0 and there exists a symmetric matrix P > 0 such that

    P A + A^T P + (P B + C^T D) R^{-1} (P B + C^T D)^T + C^T C < 0.



(iv) There exists a symmetric matrix P > 0 such that


 
    [ P A + A^T P   P B    C^T ]
    [ B^T P         −γI    D^T ]  <  0.    (4.46)
    [ C             D      −γI ]

Proof. We shall prove (iii) ⇔ (iv) ⇒ (i) only. A complete proof may be found in [25, 151].
First note that statement (iii) is equivalent to

    Φ ≜ [ P A + A^T P   P B ; B^T P   −γ²I ] + [ C^T ; D^T ] [ C   D ] < 0,

where we used the Schur complement formula. Another use of the Schur complement formula, applied to
Φ/γ < 0, yields statement (iv). Let x and w be any signals that satisfy the state equation (4.38).
Then we have v^T(t) Φ v(t) < 0 for v = [ x^T  w^T ]^T, or equivalently,

    (d/dt)(x^T(t) P x(t)) − γ² w^T(t) w(t) + y^T(t) y(t) < 0.
Integrating from t = 0 to ∞ with zero initial state, and using the stability property lim_{t→∞} x(t) = 0,
we have

    ||y||²_{L2} < γ² ||w||²_{L2}.

Thus we conclude Γee < γ. □


The energy-to-energy gain Γee has a significant frequency-domain interpretation; it is equal to
the H∞ norm of the transfer matrix T(s) = C(sI − A)^{-1}B + D:

    ||T||_{H∞} = sup_ω ||T(jω)||.

Hence, ||T||_{H∞} < γ is equivalent to each of statements (i), (ii) and (iii) in Theorem 4.6.3. The fact
that Γee = ||T||_{H∞} may be proved by using Parseval's equality [84]. The H∞ norm is related
to robustness to norm-bounded perturbations, as will be shown later.
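A rough numerical sketch of this interpretation (not an exact H∞ algorithm): estimate ||T||_{H∞} by a dense frequency sweep for the first-order example T(s) = 1/(s + 1), whose H∞ norm is exactly 1, attained at ω = 0:

```python
# Estimate ||T||_Hinf = sup_w ||T(jw)|| by a dense frequency sweep.
# T(s) = C(sI - A)^{-1}B + D = 1/(s + 1), so the exact norm is 1.
import numpy as np

A = np.array([[-1.0]]); B = np.array([[1.0]])
C = np.array([[1.0]]);  D = np.array([[0.0]])

peak = 0.0
for w in np.logspace(-3, 3, 10_000):
    T = C @ np.linalg.solve(1j * w * np.eye(1) - A, B) + D
    peak = max(peak, np.linalg.norm(T, 2))   # largest singular value of T(jw)

assert abs(peak - 1.0) < 1e-3
```

A grid sweep only lower-bounds the supremum; exact computation uses the Riccati/LMI conditions of Theorem 4.6.3 (e.g., bisection on γ).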

Example 4.6.1 Consider the following second-order system:

    P(s) = k / (s² + 2ζωs + ω²)

where ω is the natural frequency and ζ is the damping ratio. A state-space realization of this system
is given by

    [ A  B ]   [ −2ζω   −ω²   1 ]
    [ C  D ] = [  1      0    0 ]
               [  0      k    0 ].
We consider the following two cases:

    P1(s) : ζ = 0.1, ω = 1, k = 1,
    P2(s) : ζ = 1,   ω = 3, k = 20.

The system gains Γie, Γep and Γee are computed by solving the Lyapunov equations in (4.39) and (4.42)
and the linear matrix inequality in (4.46) for Y, X and P, respectively. In particular, to compute
the energy-to-energy gain Γee, the scalar γ is minimized subject to (4.46); in this way, Γee is
found as the minimum value of γ. The results are summarized in Tables 5.1 and 5.2.

Table 5.1 System gains for P1(s) and P2(s)

            Γie     Γep     Γee
    P1(s)   1.581   1.581   5.026
    P2(s)   1.925   1.925   2.222

Table 5.2 Solutions to (4.39), (4.42) and (4.46)

            Y                     X                     P
    P1(s)   [ 2.500   0.500 ]     [ 2.500   0     ]     [ 1.005   0.101 ]
            [ 0.500   2.600 ]     [ 0       2.500 ]     [ 0.101   1.005 ]
    P2(s)   [ 3.704    22.222  ]  [ 0.0833  0      ]    [ 13.334  19.985  ]
            [ 22.222   166.667 ]  [ 0       0.0093 ]    [ 19.985  120.007 ]
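The Y and X entries of Table 5.2 for P1(s) can be reproduced with a few lines of scipy (a sketch of the Lyapunov solves; the gains then follow from (4.39) and (4.42)):

```python
# Reproduce the P1(s) row of Table 5.2 (zeta = 0.1, omega = 1, k = 1).
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.array([[-0.2, -1.0],
              [ 1.0,  0.0]])
B = np.array([[1.0], [0.0]])
C = np.array([[0.0, 1.0]])

Y = solve_continuous_lyapunov(A.T, -C.T @ C)   # Y A + A^T Y + C^T C = 0
X = solve_continuous_lyapunov(A, -B @ B.T)     # A X + X A^T + B B^T = 0

assert np.allclose(Y, [[2.5, 0.5], [0.5, 2.6]], atol=1e-6)
assert np.allclose(X, [[2.5, 0.0], [0.0, 2.5]], atol=1e-6)

gamma_ie = np.sqrt(np.linalg.norm(B.T @ Y @ B, 2))   # = sqrt(2.5) ~ 1.581
gamma_ep = np.sqrt(np.linalg.norm(C @ X @ C.T, 2))   # = sqrt(2.5) ~ 1.581
assert np.isclose(gamma_ie, gamma_ep)
```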

Note that, for single-input, single-output systems, the impulse-to-energy gain and the
energy-to-peak gain have exactly the same value, which is equal to the H2 norm. Recall also that the
energy-to-energy gain Γee is equal to the H∞ norm. From Table 5.1, we see that the system P1(s)
has a smaller H2 norm and a larger H∞ norm than P2(s). This fact is also evident from the
Bode plots of P1(s) and P2(s) shown in Figure 4.1.
[Figure 4.1 shows the Bode magnitude and phase plots of P1(s) (dashed) and P2(s) (dotted) over the frequency range 0.1–10 rad/s.]

Figure 4.1: Bode plots of P1(s) and P2(s)

The H∞ norm and the H2 norm correspond to the peak value of the magnitude plot and the
area under the magnitude plot, respectively. The system P1 (s) has a low damping, and hence its
magnitude plot has a sharp peak, resulting in a larger system gain Γee .

4.6.2 Discrete-Time Systems


Consider the linear time-invariant discrete-time system
    [ x(k+1) ]   [ A  B ] [ x(k) ]
    [ y(k)   ] = [ C  D ] [ w(k) ]    (4.47)

where x is the state, w is the disturbance input, and y is the output of interest.
The purpose of this section is to develop system performance analysis results which are analogous
to the continuous-time counterpart presented in the previous subsection. To this end, we shall first
give several ways of measuring the size of discrete-time signals (sequences).
The size of a square-summable signal v can be measured by the ℓ2 norm:

    ||v||_{ℓ2} ≜ ( Σ_{k=0}^{∞} ||v(k)||² )^{1/2}.

The size of a magnitude-bounded signal v may be measured by the ℓ∞ norm:

    ||v||_{ℓ∞} = sup_{k≥0} ||v(k)||.

The size of a pulse signal v(k) = v0 δ(k) may be quantified as ||v0||, where δ(·) is the Kronecker
delta: δ(0) = 1 and δ(k) = 0 for all k ≠ 0.
System gains for the discrete-time system (4.47) can be defined using the above signal-size
measures as follows.

Pulse-to-Energy Gain:

    Υpe = sup { ||y||_{ℓ2} : w(k) = w0 δ(k), ||w0|| ≤ 1 }

Energy-to-Peak Gain:

    Υep = sup_{||w||_{ℓ2} ≤ 1} ||y||_{ℓ∞}

Energy-to-Energy Gain:

    Υee = sup_{||w||_{ℓ2} ≤ 1} ||y||_{ℓ2}

Next we characterize these system gains in terms of algebraic conditions.

Theorem 4.6.4 Consider system (4.47). Suppose the system is asymptotically stable. Then the
pulse-to-energy gain Υpe is given by

    Υpe = ||B^T Y B + D^T D||^{1/2},   Y = A^T Y A + C^T C,    (4.48)

or alternatively,

    Υpe = inf_P { ||B^T P B + D^T D||^{1/2} : P > A^T P A + C^T C }.    (4.49)

Proof. Recall that the solution of the difference equation (4.47) is given by

    x(k) = A^k x(0) + Σ_{i=0}^{k−1} A^{k−i−1} B w(i).

For the pulse disturbance w(k) = w0 δ(k) with zero initial state, we have

    y(k) = D w0            (k = 0),
           C A^{k−1} B w0   (k ≥ 1).

The ℓ2 norm of this output signal is given by

    ||y||²_{ℓ2} = w0^T D^T D w0 + Σ_{k=1}^{∞} w0^T B^T (A^T)^{k−1} C^T C A^{k−1} B w0
                = w0^T (B^T Y B + D^T D) w0,

where

    Y ≜ Σ_{ℓ=0}^{∞} (A^T)^ℓ C^T C A^ℓ.
This Y satisfies the Lyapunov equation in (4.48). The worst-case direction of the pulse disturbance,
which maximizes ||y||_{ℓ2}, is given by the (unit) eigenvector of B^T Y B + D^T D corresponding to the
largest eigenvalue. In this case, the worst-case ℓ2 norm of the output is ||y||²_{ℓ2} = ||B^T Y B + D^T D||.
This proves (4.48). Finally, the other characterization (4.49) can be proved in a similar manner to
the proof of Theorem 4.6.1 and is hence omitted. □
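The characterization (4.48) can be sketched numerically for an illustrative discrete-time system (not from the text), cross-checking the Lyapunov-based gain against the pulse-response energy summed directly:

```python
# Upsilon_pe from the discrete Lyapunov equation (4.48), cross-checked by
# summing the pulse-response energy.  Illustrative test data.
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

A = np.array([[0.6, 0.0],
              [0.2, 0.4]])
B = np.array([[0.4], [0.1]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.2]])

# Y = A^T Y A + C^T C  (solve_discrete_lyapunov(a, q) solves x = a x a^T + q)
Y = solve_discrete_lyapunov(A.T, C.T @ C)
ups_pe = np.sqrt(np.linalg.norm(B.T @ Y @ B + D.T @ D, 2))

# pulse response with w0 = 1 (single input, so the worst unit direction is 1):
# y(0) = D w0 and y(k) = C A^{k-1} B w0 for k >= 1
energy = D[0, 0]**2
x = B.copy()
for _ in range(200):
    energy += (C @ x).item()**2
    x = A @ x

assert abs(np.sqrt(energy) - ups_pe) < 1e-10
```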

Theorem 4.6.5 Consider system (4.47) and suppose the system is asymptotically stable. Then
the energy-to-peak gain Υep is given by

    Υep = ||C X C^T + D D^T||^{1/2},   X = A X A^T + B B^T,    (4.50)

or alternatively,

    Υep = inf_Q { ||C Q C^T + D D^T||^{1/2} : Q > A Q A^T + B B^T }.    (4.51)

Proof. [19, 165] Consider the Lyapunov inequality in (4.51). Noting that Q > 0 due to the stability
assumption, and using the Schur complement formula, we have

    Φ ≜ [ Q^{-1}  0 ; 0  I ] − [ A^T ; B^T ] Q^{-1} [ A   B ] > 0.

Let w be any square-summable signal, and let x be the solution to the state equation (4.47) for this
disturbance. Then we have v^T(k) Φ v(k) > 0 for v = [ x^T  w^T ]^T, or equivalently,

    x^T(k) Q^{-1} x(k) − x^T(k+1) Q^{-1} x(k+1) + ||w(k)||² > 0.

Taking the summation over k = 0, 1, ..., n−1, we have

    x^T(n) Q^{-1} x(n) < Σ_{k=0}^{n−1} ||w(k)||²
where we used x(0) = 0. Using the Schur complement formula,


    [ Σ_{k=0}^{n−1} ||w(k)||²   x^T(n) ; x(n)   Q ] > 0,

which implies that

    [ 1  0 ; 0  C ] [ Σ_{k=0}^{n−1} ||w(k)||²   x^T(n) ; x(n)   Q ] [ 1  0 ; 0  C^T ] + [ w^T(n) ; D ] [ w(n)   D^T ] ≥ 0,

or equivalently,

    [ Σ_{k=0}^{n} ||w(k)||²   y^T(n) ; y(n)   C Q C^T + D D^T ] ≥ 0.
Using the Schur complement formula again,

    y(n) y^T(n) ≤ ( Σ_{k=0}^{n} ||w(k)||² ) (C Q C^T + D D^T) ≤ ||w||²_{ℓ2} ||C Q C^T + D D^T|| I.

Hence, after a few more manipulations, we have

    ||y(n)||² ≤ ||w||²_{ℓ2} ||C Q C^T + D D^T||

for all n ≥ 1 and all Q satisfying the Lyapunov inequality in (4.51). Note that, for n = 0,

    ||y(0)||² = w^T(0) D^T D w(0) ≤ ||D D^T|| ||w(0)||² ≤ ||C Q C^T + D D^T||.

Thus we conclude that

    Υep ≤ inf_Q { ||C Q C^T + D D^T||^{1/2} : Q > A Q A^T + B B^T } = ||C X C^T + D D^T||^{1/2},    (4.52)

where X is the solution to the Lyapunov equation in (4.50). In the above, the last equality can be
verified by a similar argument to that used in the proof of Theorem 4.6.1.
We now prove that Υep ≥ ||C X C^T + D D^T||^{1/2} by exhibiting a worst-case disturbance. Consider

    w_n(k) ≜ λ_n^{-1/2} B^T (A^T)^{n−k} C^T v_n   (k = 0, 1, ..., n),
             λ_n^{-1/2} D^T v_n                    (k = n+1),    (4.53)
             0                                      (k ≥ n+2),

where v_n is the unit eigenvector of C X_n C^T + D D^T corresponding to the largest eigenvalue λ_n, and

    X_n ≜ Σ_{k=0}^{n} A^k B B^T (A^T)^k.

For any fixed integer n ≥ 0, we have

    ||w_n||²_{ℓ2} = Σ_{k=0}^{n} (1/λ_n) v_n^T C A^{n−k} B B^T (A^T)^{n−k} C^T v_n + (1/λ_n) v_n^T D D^T v_n
                  = (1/λ_n) v_n^T (C X_n C^T + D D^T) v_n = 1.

With this unit-energy disturbance, the output signal y_n(k) at k = n+1 is given by

    y_n(n+1) = C ( Σ_{k=0}^{n} A^{n−k} B ( λ_n^{-1/2} B^T (A^T)^{n−k} C^T v_n ) ) + D ( λ_n^{-1/2} D^T v_n )
             = λ_n^{-1/2} (C X_n C^T + D D^T) v_n.

Therefore,

    ||y_n(n+1)||² = (1/λ_n) v_n^T (C X_n C^T + D D^T)² v_n = λ_n.

Taking the limit,

    lim_{n→∞} ||y_n(n+1)||² = ||C X C^T + D D^T||.

Thus, we have established that Υep ≥ ||C X C^T + D D^T||^{1/2}. From this inequality and (4.52),
we conclude the result. □
The energy-to-energy gain Υee can be characterized by the following algebraic conditions.

Theorem 4.6.6 Consider system (4.47) and let a scalar γ > 0 be given. Suppose the system is
asymptotically stable. Then the following statements are equivalent.

(i) Υee < γ.

(ii) There exists a solution Y = Y^T ≥ 0 to the Riccati equation

    Y = A^T Y A + (A^T Y B + C^T D)(γ²I − B^T Y B − D^T D)^{-1}(A^T Y B + C^T D)^T + C^T C

such that

    R = γ²I − B^T Y B − D^T D > 0

and all the eigenvalues of A + B R^{-1}(B^T Y A + D^T C) lie in the open unit disk.

(iii) There exists a symmetric matrix P > 0 such that

    P > A^T P A + (A^T P B + C^T D)(γ²I − B^T P B − D^T D)^{-1}(A^T P B + C^T D)^T + C^T C,

    γ²I − B^T P B − D^T D > 0.

(iv) There exists a symmetric matrix P > 0 such that


    [ P  0 ; 0  γ²I ] > [ A  B ; C  D ]^T [ P  0 ; 0  I ] [ A  B ; C  D ].    (4.54)

Proof. Here, we shall give a simple proof for (iii) ⇔ (iv) ⇒ (i). Detailed proofs for the other
implications may be found in [33, 97].
First note that the Riccati inequality in statement (iii) and the linear matrix inequality in
statement (iv) are related by the Schur complement formula, and are hence equivalent. To prove
(iv) ⇒ (i), consider any square-summable disturbance signal w (not identically zero) and the
corresponding state x satisfying the difference equation (4.47) with zero initial state. Define the
vector signal v = [ x^T  w^T ]^T. Then, multiplying inequality (4.54) by v^T(k) from the left and by
v(k) from the right, we have

    x^T(k) P x(k) + γ² ||w(k)||² > x^T(k+1) P x(k+1) + ||y(k)||².

Taking the summation from k = 0 to ∞ with zero initial state,

    γ² ||w||²_{ℓ2} > ||y||²_{ℓ2},

where we used the stability property lim_{k→∞} x(k) = 0. Thus, we conclude that Υee < γ. □
As in the continuous-time case, the energy-to-energy gain Υee has a frequency-domain interpretation:

    Υee = ||T||_{H∞} = max_{0≤θ≤2π} ||T(e^{jθ})||,

where

    T(z) = C(zI − A)^{-1} B + D.

The quantity ||T||_{H∞} is called the H∞ norm of the (discrete-time) transfer matrix T(z). We will
show in the next section that ||T||_{H∞} is related to robustness of the system (4.47) subject to
a perturbation ∆ that affects the system dynamics by w = ∆y.

Example 4.6.2 Consider the stable continuous-time system described by

    Â = [ −1  0 ; 1  −2 ],   B̂ = [ 1  0 ; 0  2 ].

This system is discretized using the zero-order hold with sampling time T = 0.5 to get

    A = [ 0.6065  0 ; 0.2387  0.3679 ],   B = [ 0.3935  0 ; 0.0774  0.6321 ].

Suppose that the C and D matrices are given by C = I2 and D = 0, respectively. In this case,
the energy-to-peak gain Υep can be computed by solving the Lyapunov equation in (4.50). The
solution X is found to be

    X = [ 0.2449  0.0848 ; 0.0848  0.5024 ]
and the energy-to-peak gain is computed as Υep = 0.7265. Note that the H2 norm of this system
is 0.8645, which differs from Υep since there is more than one output.
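The discretization and the gain computations can be reproduced with scipy (a sketch; `cont2discrete` with the zero-order-hold method gives the matrices quoted above):

```python
# Reproduce the zero-order-hold discretization and Upsilon_ep of this example.
import numpy as np
from scipy.linalg import solve_discrete_lyapunov
from scipy.signal import cont2discrete

Ahat = np.array([[-1.0, 0.0], [1.0, -2.0]])
Bhat = np.array([[1.0, 0.0], [0.0, 2.0]])
C, D = np.eye(2), np.zeros((2, 2))

A, B, _, _, _ = cont2discrete((Ahat, Bhat, C, D), dt=0.5, method='zoh')
assert np.allclose(A, [[0.6065, 0.0], [0.2387, 0.3679]], atol=1e-4)

X = solve_discrete_lyapunov(A, B @ B.T)        # X = A X A^T + B B^T, eq. (4.50)
ups_ep = np.sqrt(np.linalg.norm(C @ X @ C.T, 2))

assert np.allclose(X, [[0.2449, 0.0848], [0.0848, 0.5024]], atol=1e-3)
assert abs(ups_ep - 0.7265) < 1e-3
assert abs(np.sqrt(np.trace(X)) - 0.8645) < 1e-3   # the H2 norm of the system
```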
The worst-case disturbance w(k) is computed using the formula in (4.53) with n = 5, and is
plotted together with the system response to this disturbance in Figure 4.2. The solid line in the
figure shows the time history of ||y(k)|| = (y^T(k) y(k))^{1/2}. Note that the peak value of ||y(k)||
occurs at k = 6 with ||y(6)|| = 0.7259, which is very close to Υep. By using a larger n, the peak
value of ||y(k)|| can be made arbitrarily close to Υep. From Figure 4.2, we see that the worst-case
[Figure 4.2 plots the worst-case disturbance components w1 (dashed) and w2 (dotted), the output components y1 (dashed) and y2 (dotted), and ||y(k)|| (solid) for k = 0, ..., 15.]

Figure 4.2: Υep worst-case disturbance and system responses

disturbance w is a monotonically increasing function up to the time k = 5, just before the output
peak occurs.
Now we compute the pulse-to-energy gain of the system. Three cases are considered: y = x,
y = x1, and y = x2; equivalently, we consider C = I2, C1 = [ 1  0 ], and C2 = [ 0  1 ], with
D = 0 in all three cases. For each case, the pulse-to-energy gain Υpe is computed by solving
the Lyapunov equation in (4.48). The results are summarized in Table 4.3, together with the
worst-case direction w0 of the pulse disturbance. Recall from the proof of Theorem 4.6.4 that the
worst-case direction is given by the eigenvector of B^T Y B + D^T D, where Y is the solution to the
Lyapunov equation (4.48).

Table 4.3 Pulse-to-energy gains and the worst-case disturbance directions

                Υpe      w0
    Case: C     0.7065   [ 0.3844, 0.9232 ]^T
    Case: C1    0.4949   [ 1, 0 ]^T
    Case: C2    0.6929   [ −0.1985, −0.9801 ]^T

The trajectory of the pulse response for each case is plotted in the phase plane (the x1–x2 plane)
in Figure 4.3. The dashed lines indicate the directions of the eigenvectors of A:

    λ1 = 0.3679 with e1 = [ 0, 1 ]^T   and   λ2 = 0.6065 with e2 = [ 1, 1 ]^T,

where λi are the eigenvalues and ei are the corresponding eigenvectors. Note that e1 is the direction
of the faster mode. Recall that the pulse response for w(k) = w0 δ(k) with zero initial state is
exactly the same as the initial-state response with x(0) = B w0 and no external disturbance. The
possible locations of the initial state x(0) corresponding to ||w0|| ≤ 1 form the region inside the
ellipse x(0)^T (B B^T)^{-1} x(0) ≤ 1 (see Figure 4.3). Thus, the worst-case state trajectory must start
from a point on the ellipse and converge to the origin. From Figure 4.3, we see that, if the pulse
disturbance tries to maximize the ℓ2 norm of x1 (Case: C1), the worst-case disturbance pushes x(0)
close to the rightmost point of the ellipse, and the corresponding state trajectory is almost “horizontal,”
since the disturbance is chosen without regard for the size of x2. Similarly, if the ℓ2 norm of x2 is
of interest (Case: C2), the worst-case trajectory becomes almost “vertical.”

[Figure 4.3 plots the worst-case pulse-response state trajectories in the x1–x2 plane for the cases C, C1, and C2, together with the ellipse of possible initial states and the eigenvector directions.]

Figure 4.3: Υpe worst-case state trajectories

4.7 Robust Stability and Performance Analysis


4.7.1 Continuous-Time Systems

It is well known [127] that there are always four kinds of errors associated with a linear model:
errors in model order, errors in disturbances, errors in nonlinearities, and errors in parameters.
Furthermore, there is no control theory which guarantees stability or performance nonconservatively
in the presence of all four of these error types. Below, we focus on parameter errors to motivate
the analysis of uncertain systems.
Let a model of a linear system be described by

ẋ = Âx + K̂w
y = M̂x + Ĥw. (4.55)

Suppose that the system parameters Â, K̂, M̂, Ĥ are subject to two types of uncertainties:
scalar uncertainties δ and matrix uncertainties ∆.

Example 4.7.1 The system matrix A may be given by

    A = [ 0  1 ; −δ1  −δ2 ] = [ 0  1 ; 0  0 ] + δ1 [ 0  0 ; −1  0 ] + δ2 [ 0  0 ; 0  −1 ]

where δ1 and δ2 are the uncertain parameters.

Example 4.7.2 Consider the vector second-order system described by M̂ q̈ + K̂ q = B̂ u. Suppose
that the stiffness matrix K̂ is subject to an uncertainty ∆ as in K̂ = K0 + L∆R. The system can
be put into the state-space representation ẋ = Ax + Bu as follows:

    A = [ 0  I ; −M̂^{-1}K̂  0 ],   B = [ 0 ; M̂^{-1}B̂ ].

Note that the system matrix A depends linearly on the uncertainty ∆ and can be written as

    A = [ 0  I ; −M̂^{-1}K0  0 ] + [ 0 ; −M̂^{-1}L ] ∆ [ R  0 ].

Now let Â, K̂, M̂, Ĥ have the structure

    Â = A + Σ_i δ_i^A A_i + Σ_i A_i^L ∆_i^A A_i^R = A + A_L ∆_A A_R
    K̂ = K + Σ_i δ_i^K K_i + Σ_i K_i^L ∆_i^K K_i^R = K + K_L ∆_K K_R
    M̂ = M + Σ_i δ_i^M M_i + Σ_i M_i^L ∆_i^M M_i^R = M + M_L ∆_M M_R
    Ĥ = H + Σ_i δ_i^H H_i + Σ_i H_i^L ∆_i^H H_i^R = H + H_L ∆_H H_R    (4.56)

where A_L, A_R and ∆_A are defined by

    Â = A + [ I  ···  A_1^L  ··· ] block diag( δ_1^A I, ..., ∆_1^A, ... ) [ A_1 ; ··· ; A_1^R ; ··· ] ≜ A + A_L ∆_A A_R
and similarly for K̂, M̂, and Ĥ. Then it is clear that, for the choices of B, ∆, C and L indicated
below, we have the following description of the uncertain system:

    ẋ = A x + K w + [ A_L  K_L ] [ ∆_A  0 ; 0  ∆_K ] ( [ A_R  0 ; 0  K_R ] [ x ; w ] )
      = A x + K w + B ∆ (C x + L w)
      = A x + K w + B ω,   ω = ∆ ψ,   ψ = C x + L w.
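The A_L ∆_A A_R construction above can be sketched numerically for Example 4.7.1, where both uncertainties are scalar, so A_L = [I I], ∆_A = block diag(δ1 I, δ2 I), and A_R stacks A1 over A2 (illustrative code, not from the text):

```python
# Verify the affine factorization A(d1, d2) = A0 + AL @ DeltaA @ AR
# for Example 4.7.1 at random sample points.
import numpy as np
from scipy.linalg import block_diag

A0 = np.array([[0.0, 1.0], [0.0, 0.0]])
A1 = np.array([[0.0, 0.0], [-1.0, 0.0]])
A2 = np.array([[0.0, 0.0], [0.0, -1.0]])

AL = np.hstack([np.eye(2), np.eye(2)])   # [ I  I ]
AR = np.vstack([A1, A2])                 # [ A1 ; A2 ]

rng = np.random.default_rng(1)
for _ in range(5):
    d1, d2 = rng.uniform(-1.0, 1.0, size=2)
    DeltaA = block_diag(d1 * np.eye(2), d2 * np.eye(2))
    A_direct = np.array([[0.0, 1.0], [-d1, -d2]])
    assert np.allclose(A0 + AL @ DeltaA @ AR, A_direct)
```

The same pattern extends to K̂, M̂ and Ĥ, which is how the structured set U of (4.58) arises.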

Similar development allows ∆ to contain uncertain parameters in all input, output matrices of
(4.55). This motivates the model for a general class of uncertain systems given below.
We consider the following class of uncertain systems;
    
[ ẋ(t); ψ(t); y(t) ] = [ A B K; C D L; M N H ] [ x(t); ω(t); w(t) ],    ω(t) = ∆(t)ψ(t),    (4.57)

where x is the state, w is the disturbance, y is the output of interest, and ψ and ω are the exogenous signals that describe the uncertainty ∆. The nominal system (∆ ≡ 0) is linear time-invariant, and the uncertainty ∆ is assumed to belong to the following set of norm-bounded, time-varying, structured uncertainties:

BUC = { ∆ : R → R^{m×m}, ‖∆(t)‖ ≤ 1, ∆(t) ∈ U },

where

U = { blockdiag( δ1 Ik1, ··· , δs Iks, ∆1, ··· , ∆f ) : δi ∈ R, ∆i ∈ R^{k_{s+i}×k_{s+i}} }.    (4.58)

Note that the sizes of the (square) subblocks add up to m:

Σ_{i=1}^{s+f} k_i = m.

By considering the “square” uncertainty ∆, we have implicitly assumed that the dimensions of
vector signals w and y are equal. This assumption can always be satisfied by adding rows or
columns to appropriate system matrices in (4.57).
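A structured uncertainty of this kind is easy to assemble numerically. The sketch below builds one member of the uncertainty set with s = 2 repeated-scalar blocks and f = 1 full block; all block sizes and values are illustrative choices, not from the text:

```python
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(0)

# Illustrative structure: s = 2 repeated-scalar blocks (k1 = 2, k2 = 1)
# and f = 1 full block (k3 = 2), so the subblock sizes add up to m = 5.
k1, k2, k3 = 2, 1, 2
d1, d2 = 0.5, -0.8                       # scalar parameters, |delta_i| <= 1
D3 = rng.standard_normal((k3, k3))
D3 = 0.9 * D3 / np.linalg.norm(D3, 2)    # scale full block to norm <= 1

Delta = block_diag(d1 * np.eye(k1), d2 * np.eye(k2), D3)

assert Delta.shape == (k1 + k2 + k3, k1 + k2 + k3)
assert np.linalg.norm(Delta, 2) <= 1.0 + 1e-12   # inside the unit norm ball
```

The norm of a block-diagonal matrix is the largest of its block norms, so the membership test reduces to the per-block bounds above.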
Recall that the nominal system (∆ ≡ 0) is (asymptotically) stable if x(t) approaches zero as
t → ∞ regardless of the initial state x(0). The first objective in this section is to extend this
stability concept to the uncertain system (4.57).

Definition 4.7.1 The uncertain system (4.57) is said to be robustly stable if lim_{t→∞} x(t) = 0 regardless of the initial state x(0) and the uncertainty ∆ ∈ BUC.

Note that the robust stability property of (4.57) is completely determined by matrices A, B,
C and D and BUC . In this case, ignoring the other matrices, the uncertain system (4.57) can be
described by
ẋ = (A + B∆(I − D∆)−1 C)x.

We would like to find a necessary and sufficient condition for robust stability. However, the problem
is difficult and no results are available in the literature (see [23, 90, 125, 143] for conditions in the
case of dynamic uncertainty).

In the sequel, we shall develop a sufficient condition for robust stability. First, let us consider the simple special case where the uncertainty is unstructured, i.e., U in (4.58) is just the set of full block matrices (s = 0, f = 1 and U = R^{m×m}).

Theorem 4.7.1 (Small Gain Theorem) Let T(s) = C(sI − A)⁻¹B + D and suppose ‖T‖H∞ < 1, i.e., there exists P > 0 such that


[ PA + AᵀP  PB; BᵀP  −I ] + [ Cᵀ; Dᵀ ] [ C D ] < 0.    (4.59)


Then the uncertain system (4.57) is robustly stable for all ∆ ∈ BUC with U = R^{m×m}.

Proof. Let the uncertainty ∆ ∈ BUC be given and fixed (but unknown). Let x, ω and ψ be any
signals that satisfy (4.57) for some nonzero initial state x(0) with w(t) ≡ 0. Then, from (4.59),
à " # " # !" #
h i PA + AT P PB CT h i x(t)
xT (t) ω T (t) + C D < 0.
BT P −I DT ω(t)

Using (4.57), we have

xᵀ(t)Pẋ(t) + ẋᵀ(t)Px(t) − ωᵀ(t)ω(t) + ψᵀ(t)ψ(t) < 0.

Note that, for ∆ ∈ BUC, we have ‖∆(t)‖ ≤ 1 and

ψᵀ(t)ψ(t) − ωᵀ(t)ω(t) = ψᵀ(t)(I − ∆ᵀ(t)∆(t))ψ(t) ≥ 0.

Hence,
xᵀ(t)Pẋ(t) + ẋᵀ(t)Px(t) = d/dt ( xᵀ(t)Px(t) ) < 0.

Thus, the function V(x) = xᵀPx satisfies V(x) > 0 for all x ≠ 0 and V̇(x) < 0, which qualifies V(x) as a Lyapunov function proving stability of (4.57) for any given ∆ ∈ BUC. □
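A minimal scalar instance makes the theorem concrete (all numbers below are illustrative choices, not from the text). With A = −2, B = C = 1 and D = 0 we have T(s) = 1/(s + 2), so ‖T‖∞ = 1/2 < 1, and P = 0.3 verifies (4.59):

```python
import numpy as np

# Illustrative scalar data: T(s) = 1/(s+2), so the small gain hypothesis holds.
A, B, C, D = -2.0, 1.0, 1.0, 0.0
P = 0.3     # candidate (scalar) Lyapunov matrix

# Left-hand side of (4.59): [PA+A'P  PB; B'P  -I] + [C'; D'][C  D].
lmi = np.array([[2 * P * A, P * B],
                [B * P,    -1.0 ]]) + np.outer([C, D], [C, D])
assert np.linalg.eigvalsh(lmi).max() < 0          # (4.59) holds

# Independent cross-check that ||T||_inf < 1 (gamma = 1, D = 0): the
# Hamiltonian [[A, BB'], [-C'C, -A']] has no imaginary-axis eigenvalues.
H = np.array([[A,      B * B],
              [-C * C, -A   ]])
assert all(abs(ev.real) > 1e-9 for ev in np.linalg.eigvals(H))
```

The Hamiltonian test at the end is the standard state-space check of the H∞ norm bound, included only as a cross-check of the small gain hypothesis.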
In the above proof, a quadratic Lyapunov function V(x) = xᵀPx is constructed to show robust stability of the uncertain system (4.57). Note that the Lyapunov function is independent of ∆, and works for all ∆ ∈ BUC. Thus, the condition ‖T‖H∞ < 1 implies the existence of a single quadratic Lyapunov function proving stability of the uncertain system (4.57).

Definition 4.7.2 [80, 2] The uncertain system (4.57) is said to be quadratically stable if there exists a single quadratic Lyapunov function V(x) = xᵀPx proving stability for all ∆ ∈ BUC, i.e., if (I − D∆) is invertible for any ∆ ∈ BUC and there exists P > 0 such that

P(A + B∆(I − D∆)⁻¹C) + (A + B∆(I − D∆)⁻¹C)ᵀP < 0,  ∀ ∆ ∈ BUC.

In general, every quadratically stable system is robustly stable, but the converse is not necessarily true. The gap is due to the requirement that the Lyapunov function be independent of the uncertainty. In view of Theorem 4.7.1 and its proof, the H∞ norm condition ‖T‖H∞ < 1 implies quadratic stability of (4.57). In fact, it is known [72] that the converse is also true: if (4.57) is quadratically stable, then ‖T‖H∞ < 1 holds. In summary, for the case of unstructured uncertainty, we have

‖T‖H∞ < 1 ⇔ Quadratic Stability ⇒ Robust Stability.

Next, consider the structured uncertainty set BUC with U given by (4.58). In this case, quadratic stability does not, in general, imply ‖T‖H∞ < 1 [113], and we have

‖T‖H∞ < 1 ⇒ Quadratic Stability ⇒ Robust Stability.

Thus, the small gain condition ‖T‖H∞ < 1 guarantees robust stability of (4.57), but it may be very conservative. The conservatism due to the second “⇒” is by definition, and that due to the first “⇒” is (partly) caused by the fact that the available structure information on ∆ is totally ignored.
A standard approach [23, 116] to reduce this conservatism, by taking the structure information into account, is to introduce the following subset of positive definite matrices:

S = { blockdiag( S1, ··· , Ss, s1 I_{k_{s+1}}, ··· , sf I_{k_{s+f}} ) : Si ∈ R^{ki×ki}, si ∈ R, Si > 0, si > 0 }.


Clearly, for each S ∈ S, we have

∆ = S^{-1/2} ∆ S^{1/2},  ∀ ∆ ∈ U,

where S^{1/2} denotes the positive definite square root of S, introduced for later convenience. Hence, we can replace ∆ in (4.57) by S^{-1/2}∆S^{1/2} to obtain a new representation of the same uncertain system (4.57):
[ ẋ; ψ̂ ] = [ Â B̂; Ĉ D̂ ] [ x; ω̂ ],    ω̂ = ∆ψ̂,    (4.60)

where

[ Â B̂; Ĉ D̂ ] ≜ [ I 0; 0 S^{1/2} ] [ A B; C D ] [ I 0; 0 S^{-1/2} ].    (4.61)

From the small gain theorem (Theorem 4.7.1), the uncertain system (4.57) is robustly stable if ‖T̂‖H∞ < 1, where

T̂(s) = Ĉ(sI − Â)⁻¹B̂ + D̂.

Note that T̂ = S^{1/2}TS^{-1/2}, and S can be any matrix in S. Thus, this freedom can be used to reduce the conservativeness. The following definition is due to [26].

Definition 4.7.3 The uncertain linear system (4.57) is said to be Q-stable if there exists a matrix S ∈ S such that ‖S^{1/2}TS^{-1/2}‖H∞ < 1.

A characterization of Q-stability can be obtained by replacing the matrices A, B, C and D in (4.59) with Â, B̂, Ĉ and D̂ defined in (4.61):

[ PA + AᵀP  PBS^{-1/2}; S^{-1/2}BᵀP  −I ] + [ CᵀS^{1/2}; S^{-1/2}DᵀS^{1/2} ] [ S^{1/2}C  S^{1/2}DS^{-1/2} ] < 0.

Using the congruence transformation with blockdiag(I, S^{1/2}), we have

[ PA + AᵀP  PB; BᵀP  −S ] + [ Cᵀ; Dᵀ ] S [ C D ] < 0.    (4.62)

Thus, the uncertain system (4.57) is Q-stable if and only if there exist matrices P > 0 and S ∈ S
satisfying (4.62).

Exercise 4.7.1 (a) Show that, for each S ∈ S, the following holds:

S − ∆ᵀ(t)S∆(t) ≥ 0,  ∀ ∆ ∈ BUC.

(b) Using the result of (a), prove that Q-stability implies quadratic stability.
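Part (a) of the exercise can be spot-checked numerically for one admissible pair (S, ∆); the block sizes and values below are illustrative assumptions:

```python
import numpy as np
from scipy.linalg import block_diag

rng = np.random.default_rng(1)

# One repeated-scalar block (delta * I_2) and one full 2x2 block, norm <= 1.
delta = 0.6
D1 = rng.standard_normal((2, 2))
D1 = 0.8 * D1 / np.linalg.norm(D1, 2)
Delta = block_diag(delta * np.eye(2), D1)

# A member of the commuting scaling set S: a full positive definite block
# against the repeated-scalar block, a positive scalar against the full block.
M1 = rng.standard_normal((2, 2))
S = block_diag(M1 @ M1.T + np.eye(2), 2.5 * np.eye(2))

# Part (a): S - Delta' S Delta >= 0 for this admissible Delta.
gap = S - Delta.T @ S @ Delta
assert np.linalg.eigvalsh((gap + gap.T) / 2).min() >= -1e-10

# The same structure makes S and Delta commute, so the S^{1/2} scaling
# leaves Delta unchanged -- the fact used to obtain (4.60).
assert np.allclose(S @ Delta, Delta @ S)
```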

Now we can summarize the above discussion as follows:

‖T‖H∞ < 1 ⇒ Q-stability ⇒ Quadratic Stability ⇒ Robust Stability

for the system (4.57) with structured uncertainty. Note that the notion of Q-stability has been introduced to reduce the gap between ‖T‖H∞ < 1 and quadratic stability. In general, Q-stability is more conservative than quadratic stability [98, 111]. For the case of dynamic time-varying uncertainty ∆, Q-stability is equivalent to robust stability [90, 125]. The quadratic Lyapunov matrix P associated with Q-stability constitutes the basis for the robust performance analysis in the sequel.
Our next objective is to define performance measures by certain “sizes” of the output signal y
in response to certain classes of the disturbances w, in the presence of the uncertainty ∆, and to
provide (computable) characterizations of the performance measures. We shall define three robust
performance measures. All three are given by the worst-case disturbance attenuation level in the
presence of the uncertainty, but the measures of the “size” of the error signal and the classes of the
disturbance are different.
The first one is a generalization of the (nominal) H2 norm performance measure to the uncertain system. Since the uncertain system is time-varying, the transfer matrix from w to y does not exist, and hence the H2 norm is not defined. One way to extend the concept of the H2 norm to time-varying systems is to consider the energy (L2 norm) of the output signal y in response to an impulsive disturbance w.

Robust H2 Performance: Let a Linear Quadratic (LQ) cost be given by

ĴH2(∆, w0) = ‖y‖²L2 = ∫₀^∞ yᵀ(t)y(t) dt

where y is the output signal in the presence of the uncertainty ∆ when the system (4.57) is excited by an impulsive disturbance w(t) = w0 δ(t) with a zero initial state, i.e., x(0) = 0. Define the robust H2 performance measure as the worst-case LQ cost:

JH2 = sup_{w0, ∆} { ĴH2(∆, w0) : ‖w0‖ ≤ 1, ∆ ∈ BUC }.

Note that JH2 is the worst-case cost over all possible directions of the impulsive disturbance as well
as all admissible perturbations. It is also possible to define a similar robust performance measure by
taking the sum of the LQ costs for impulsive disturbances applied one at a time at each disturbance
channel, instead of the worst-case w0 .

Another measure we consider for the robust disturbance attenuation is the worst-case peak value
of the output signal y in response to the disturbance w with its energy being below a specified (a
priori known) level.

Robust L∞ Performance: Consider the following cost function:

ĴL∞(∆, w) = ‖y‖²L∞ = sup_{t≥0} yᵀ(t)y(t)

where y is the output signal in the presence of the disturbance w and the plant perturbation ∆ with a zero initial condition x(0) = 0. The robust L∞ performance measure is defined as the worst-case peak value of the error signal subject to a finite energy disturbance:

JL∞ = sup_{w, ∆} { ĴL∞(∆, w) : ‖w‖L2 ≤ 1, ∆ ∈ BUC }.

Thus the robust L∞ performance measure is the worst-case energy (L2 ) to peak (L∞ ) gain of the
system subject to the uncertainty ∆.

The last robust performance measure considered here is the worst-case energy of the output
signal y in response to the energy bounded disturbance w [26].

Robust H∞ Performance: Consider

ĴH∞(∆, w) = ∫₀^∞ yᵀ(t)y(t) dt

where y is the output signal in response to the disturbance w, in the presence of the uncertainty ∆, with the zero initial state x(0) = 0. The robust H∞ performance is defined by taking the worst case of ĴH∞(∆, w) over all possible uncertainties ∆ ∈ BUC and disturbances w with bounded energy:

JH∞ = sup_{w, ∆} { ĴH∞(∆, w) : ‖w‖L2 ≤ 1, ∆ ∈ BUC }.

Hence, this performance measure is a robustified version of the L2 gain. Since the L2 gain of the
nominal system is given by the H∞ norm, we call this the robust H∞ performance.

Now, we shall present characterizations of the robust performance measures defined above.
We restrict our attention to the class of Q-stable systems, by which the finiteness of each robust
performance measure is guaranteed.
The following theorem provides an upper bound on the robust H2 performance [86, 57, 140].

Theorem 4.7.2 The following statements are equivalent.

(i) The uncertain system (4.57) is Q-stable for all ∆ ∈ BUC .

(ii) There exist P > 0 and S ∈ S such that

[ PA + AᵀP  PB; BᵀP  −S ] + [ C D; M N ]ᵀ [ S 0; 0 I ] [ C D; M N ] < 0.    (4.63)

Suppose the above statements hold. Then the robust H2 performance measure JH2 is finite if L = 0 and H = 0, in which case it is bounded above by (recall that ‖M‖ = σ̄[M])

JH2 < ‖KᵀPK‖.

Proof. Recall that the system (4.57) is Q-stable if and only if (4.62) holds. This implies the existence of P̄ > 0, S̄ ∈ S and a sufficiently small ε > 0 such that

[ P̄A + AᵀP̄  P̄B; BᵀP̄  −S̄ ] + [ Cᵀ; Dᵀ ] S̄ [ C D ] + ε [ Mᵀ; Nᵀ ] [ M N ] < 0.

Dividing the left-hand side by ε, and defining P ≜ ε⁻¹P̄ > 0 and S ≜ ε⁻¹S̄ ∈ S, we have (4.63). This proves the equivalence (i) ⇔ (ii).


Now, consider the signals x and ω in response to the impulsive disturbance w(t) = w0 δ(t) with the zero initial state, and a (candidate) quadratic Lyapunov function V(x(t)) ≜ xᵀ(t)Px(t). Denote the left-hand side of (4.63) by Φ. Since Φ < 0, for any given vector v(t) = [ xᵀ(t) ωᵀ(t) ]ᵀ ≠ 0, we have vᵀ(t)Φv(t) < 0, or equivalently,

V̇(x(t)) < ωᵀ(t)Sω(t) − ψᵀ(t)Sψ(t) − yᵀ(t)y(t),  ∀ t ≥ 0,

where we note that x(0) = 0 and w(t) = w0 δ(t) are equivalent to x(0) = Kw0 and w(t) = 0, ∀ t ≥ 0. Integrating both sides from t = 0 to ∞,

w0ᵀ(KᵀPK)w0 > ∫₀^∞ ‖y(t)‖² dt + ∫₀^∞ ( ψᵀ(t)Sψ(t) − ωᵀ(t)Sω(t) ) dt,  ∀ ∆ ∈ BUC,

where we used the fact that lim_{t→∞} x(t) = 0 due to Q-stability (which implies asymptotic stability). Noting that, for each S ∈ S,

ωᵀ(t)Sω(t) − ψᵀ(t)Sψ(t) = ψᵀ(t)(∆ᵀ(t)S∆(t) − S)ψ(t) ≤ 0,  ∀ ∆ ∈ BUC, t ≥ 0,    (4.64)



and that the worst-case disturbance intensity w0 is given by the eigenvector of KᵀPK corresponding to the maximum eigenvalue, we see that

‖KᵀPK‖ > ∫₀^∞ ‖y(t)‖² dt,  ∀ ∆ ∈ BUC, ‖w0‖ ≤ 1.

This completes the proof. □
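A scalar instance of Theorem 4.7.2 can be verified end to end (all data below are illustrative assumptions, not from the text). For a constant admissible δ the closed loop after the impulse is ẋ = (A + BδC)x with x(0) = K, so the LQ cost has the closed form M²K²/(−2a), which must stay below the bound ‖KᵀPK‖:

```python
import numpy as np

# Illustrative scalar data for (4.57):
# x' = Ax + B*omega + K*w, psi = Cx, y = Mx, omega = delta(t)*psi, |delta|<=1.
A, B, C, K, M = -2.0, 0.5, 1.0, 1.0, 1.0
P, S = 1.0, 1.0                      # candidates for (4.63)

# Left-hand side of (4.63) for this data (D = L = N = H = 0).
lmi = np.array([[2 * P * A + S * C * C + M * M, P * B],
                [P * B,                         -S   ]])
assert np.linalg.eigvalsh(lmi).max() < 0

bound = K * P * K                    # ||K' P K||, the robust H2 bound
# LQ cost for constant admissible delta: impulse gives x(0) = K, then
# J = integral of (Mx)^2 dt = M^2 K^2 / (-2a) with a = A + B*delta*C < 0.
for delta in (-1.0, -0.5, 0.0, 0.5, 1.0):
    a = A + B * delta * C
    J = M**2 * K**2 / (-2 * a)
    assert J < bound
```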


It can be seen from the above proof that the performance bound in Theorem 4.7.2 holds as long as the uncertainty set BUC is defined such that

∫₀^∞ ( ψᵀ(t)Sψ(t) − ωᵀ(t)Sω(t) ) dt ≥ 0,  ∀ ∆ ∈ BUC.    (4.65)

This simple observation allows us to investigate the conservativeness of the bound by searching for the “largest” uncertainty set BUC such that (4.65) holds. Let us first consider the case where ∆ is a linear operator. In this case, defining ω̂ ≜ S^{1/2}ω and ψ̂ ≜ S^{1/2}ψ, and noting that ω̂ = S^{1/2}∆ψ = ∆ψ̂, the condition (4.65) becomes

‖ψ̂‖L2 ≥ ‖ω̂‖L2,  ∀ ∆ ∈ BUC.

Thus we see that the result of Theorem 4.7.2 holds for any linear dynamic (possibly noncausal) time-varying uncertainty with L2 gain less than or equal to 1. For the full blocks ∆i of the uncertainty, we can further relax the linearity assumption on the operator ∆i since, in this case, the scaling S is just a scalar and can be factored out in (4.65).
The following theorem provides a characterization of the robust L∞ performance bound [4, 57].
As in the nominal performance case discussed in the previous section, the performance bound is
given by the maximum singular value of a matrix associated with a controllability-Gramian-type
Lyapunov matrix, which is the dual of the Lyapunov matrix used in the robust H2 performance
characterization.

Theorem 4.7.3 The following statements are equivalent.

(i) The uncertain system (4.57) is Q-stable for all ∆ ∈ BUC .

(ii) There exist Q > 0 and S ∈ S such that

[ AQ + QAᵀ  QCᵀ; CQ  −S ] + [ B K; D L ] [ S 0; 0 I ] [ B K; D L ]ᵀ < 0.    (4.66)

Suppose the above statements hold. Then the robust L∞ performance measure JL∞ is finite if N = 0 and H = 0, in which case it is bounded above by

JL∞ < ‖MQMᵀ‖.



Proof. The equivalence (i) ⇔ (ii) can be verified by dualizing the result of Theorem 4.7.2. Now suppose statement (ii) holds. By the Schur complement formula, inequality (4.66) is equivalent to

[ AQ + QAᵀ  QCᵀ  B  K; CQ  −S  D  L; Bᵀ  Dᵀ  −S⁻¹  0; Kᵀ  Lᵀ  0  −I ] < 0.

After a congruence transformation involving Q⁻¹, another use of the Schur complement formula yields

Ω ≜ [ Q⁻¹A + AᵀQ⁻¹  Q⁻¹B  Q⁻¹K; BᵀQ⁻¹  −S⁻¹  0; KᵀQ⁻¹  0  −I ] + [ Cᵀ; Dᵀ; Lᵀ ] S⁻¹ [ C D L ] < 0.

Now we can prove the result by an approach similar to the case of the robust H2 performance bound. Consider the signals x and ω in response to some L2 disturbance w with the zero initial state, and a (candidate) Lyapunov function V(x(t)) ≜ xᵀ(t)Q⁻¹x(t). Then, for any vector u(t) ≜ [ xᵀ(t) ωᵀ(t) wᵀ(t) ]ᵀ ≠ 0, we have uᵀ(t)Ωu(t) < 0, or equivalently,

V̇(x(t)) < ‖w(t)‖² + ωᵀ(t)S⁻¹ω(t) − ψᵀ(t)S⁻¹ψ(t),  ∀ t ≥ 0.

Integrating both sides from t = 0 to τ, we have

xᵀ(τ)Q⁻¹x(τ) < ∫₀^τ ‖w(t)‖² dt + ∫₀^τ ( ωᵀ(t)S⁻¹ω(t) − ψᵀ(t)S⁻¹ψ(t) ) dt,  ∀ ∆ ∈ BUC, τ ≥ 0.
0 0

It is easily verified that

∫₀^τ ( ωᵀ(t)S⁻¹ω(t) − ψᵀ(t)S⁻¹ψ(t) ) dt ≤ 0,  ∀ ∆ ∈ BUC, τ ≥ 0,    (4.67)

and hence, for any (nonzero) finite energy disturbance w and any perturbation ∆ ∈ BUC,

xᵀ(t)Q⁻¹x(t) < ∫₀^t ‖w(s)‖² ds ≤ ‖w‖²L2,  ∀ t ≥ 0,

or equivalently,

[ ‖w‖²L2  xᵀ(t); x(t)  Q ] > 0,  ∀ t ≥ 0.
By a (nonsquare) congruence transformation,

[ 1 0; 0 M ] [ ‖w‖²L2  xᵀ(t); x(t)  Q ] [ 1 0; 0 Mᵀ ] = [ ‖w‖²L2  yᵀ(t); y(t)  MQMᵀ ] ≥ 0,  ∀ t ≥ 0,

where the last inequality is strict if M has linearly independent rows. Using the Schur complement formula,

(1/‖w‖²L2) y(t)yᵀ(t) ≤ MQMᵀ ≤ ‖MQMᵀ‖ I,  ∀ t ≥ 0,

or

‖y(t)‖² / ‖w‖²L2 ≤ ‖MQMᵀ‖,  ∀ t ≥ 0.

This completes the proof. □
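The dual bound can be exercised on the same illustrative scalar data used earlier (assumptions, not from the text). For a constant admissible δ, the energy-to-peak gain squared equals M² times the controllability Gramian K²/(−2a) of the closed loop, and must stay below ‖MQMᵀ‖:

```python
import numpy as np

# Illustrative scalar data, as before: A=-2, B=0.5, C=1, K=1, M=1,
# with D = L = N = H = 0.
A, B, C, K, M = -2.0, 0.5, 1.0, 1.0, 1.0
Q, S = 1.0, 1.0                      # candidates for (4.66)

BK = np.array([[B, K], [0.0, 0.0]])  # rows of [B K; D L]
lmi = (np.array([[2 * A * Q, Q * C],
                 [C * Q,    -S   ]])
       + BK @ np.diag([S, 1.0]) @ BK.T)
assert np.linalg.eigvalsh(lmi).max() < 0

bound = M * Q * M                    # ||M Q M'||, the robust L-infinity bound
# Energy-to-peak gain squared for constant admissible delta:
# M^2 * K^2 / (-2a), the output-scaled controllability Gramian.
for delta in (-1.0, 0.0, 1.0):
    a = A + B * delta * C
    peak_sq = M**2 * K**2 / (-2 * a)
    assert peak_sq < bound
```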
As in the case of the robust H2 performance bound, we shall examine how conservative our performance bound is, by searching for a larger class of uncertainties for which Theorem 4.7.3 is valid. From the proof of Theorem 4.7.3, our result holds for any uncertainty set BUC such that (4.67) is satisfied. If ∆ is a linear operator, then following a procedure similar to that given for the robust H2 performance, we see that (4.67) is equivalent to

∫₀^τ ( ‖ω̂(t)‖² − ‖ψ̂(t)‖² ) dt ≤ 0,  ∀ ∆ ∈ BUC, τ ≥ 0,    (4.68)

where ω̂ = ∆ψ̂. According to Lemma 2.4 of [140], (4.68) holds if BUC is the set of linear causal operators with L2 gain less than or equal to 1. Note that causality of the uncertainty comes into play here, in contrast to the robust H2 performance bound, where Theorem 4.7.2 is valid for possibly noncausal uncertainties.
Finally, we state a characterization of the robust H∞ performance bound [26, 97].

Theorem 4.7.4 The following statements are equivalent.

(i) The uncertain system (4.57) is Q-stable for all ∆ ∈ BUC .

(ii) There exist matrices P > 0, S ∈ S and a scalar γ > 0 such that

[ PA + AᵀP  PB  PK; BᵀP  −S  0; KᵀP  0  −γ²I ] + [ C D L; M N H ]ᵀ [ S 0; 0 I ] [ C D L; M N H ] < 0.    (4.69)

Suppose the above statements hold. Then the robust H∞ performance JH∞ is bounded above by

JH∞ < γ.

Proof. Using Finsler's Theorem, there exists γ > 0 satisfying (4.69) if and only if there exist P > 0 and S ∈ S satisfying (4.63). Hence, the equivalence (i) ⇔ (ii) follows from Theorem 4.7.2.

To prove the performance bound, consider the signals in (4.57) and let Ψ be the left-hand side of (4.69). Then, for any vector ζ(t) = [ xᵀ(t) ωᵀ(t) wᵀ(t) ]ᵀ ≠ 0, we have ζᵀ(t)Ψζ(t) < 0, or equivalently,

2xᵀ(t)Pẋ(t) + yᵀ(t)y(t) − γ²wᵀ(t)w(t) < ωᵀ(t)Sω(t) − ψᵀ(t)Sψ(t) ≤ 0,

where the last inequality holds since S ∈ S and ω(t) = ∆(t)ψ(t) with ∆ ∈ BUC . Thus,

d/dt ( xᵀ(t)Px(t) ) + yᵀ(t)y(t) − γ²wᵀ(t)w(t) < 0.

Integrating from t = 0 to ∞ with x(0) = 0, we have

∫₀^∞ yᵀ(t)y(t) dt < γ² ∫₀^∞ wᵀ(t)w(t) dt,

where we used the fact that lim_{t→∞} x(t) = 0 due to Q-stability. Hence,

‖y‖L2 < γ‖w‖L2 ≤ γ.

This completes the proof. □
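The robust H∞ characterization admits the same kind of scalar spot-check (illustrative data; γ = 1 is the candidate attenuation level). For a constant admissible δ the nominal-style L2 gain from w to y is MK/|A + BδC|, which must then lie below γ:

```python
import numpy as np

# Illustrative scalar data, as before, with D = L = N = H = 0.
A, B, C, K, M = -2.0, 0.5, 1.0, 1.0, 1.0
P, S, gamma = 1.0, 1.0, 1.0

# Left-hand side of (4.69) for this data.
lmi = np.array([[2 * P * A + S * C * C + M * M, P * B, P * K   ],
                [P * B,                         -S,    0.0     ],
                [P * K,                          0.0, -gamma**2]])
assert np.linalg.eigvalsh(lmi).max() < 0

# For constant admissible delta, the L2 gain from w to y is M*K/|a|.
for delta in (-1.0, 0.0, 1.0):
    a = A + B * delta * C
    assert M * K / abs(a) < gamma
```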

4.7.2 Discrete-Time Systems


Consider the discrete-time uncertain system

[ x(k+1); ψ(k); y(k) ] = [ A B K; C D L; M N H ] [ x(k); ω(k); w(k) ],    ω(k) = ∆(k)ψ(k),    (4.70)

where x is the state, w is the disturbance, y is the output of interest, and ψ and ω are the exogenous signals that describe the uncertainty ∆. Suppose that the uncertainty ∆ is known to belong to the following set of norm-bounded, time-varying, structured uncertainties:

BUD = { ∆ : I → R^{m×m}, ‖∆(k)‖ ≤ 1, ∆(k) ∈ U },

where I is the set of integers and U is defined in (4.58).


In this section, robust stability and performance of the uncertain system (4.70) are analyzed in
a manner that is completely analogous to the continuous-time counterpart. Let us start by defining
the concept for robust stability.

Definition 4.7.4 The uncertain discrete-time system (4.70) is said to be robustly stable if lim_{k→∞} x(k) = 0 regardless of the initial state x(0) and the uncertainty ∆ ∈ BUD.

Definition 4.7.5 [80] The uncertain discrete-time system (4.70) is said to be quadratically stable if
there exists a single quadratic Lyapunov function V (x) = xT Px to prove stability for all ∆ ∈ BUD ,
i.e., if (I − D∆) is invertible for any ∆ ∈ BUD and there exists P > 0 such that

P > (A + B∆(I − D∆)⁻¹C)ᵀ P (A + B∆(I − D∆)⁻¹C),  ∀ ∆ ∈ BUD.

Definition 4.7.6 [26] The uncertain discrete-time system (4.70) is said to be Q-stable if there exists a matrix S ∈ S such that ‖S^{1/2}TS^{-1/2}‖H∞ < 1, where T(z) = C(zI − A)⁻¹B + D.

As in the continuous-time case, the above notions of stability can be ordered as follows:

‖T‖H∞ < 1 ⇒ Q-stability ⇒ Quadratic Stability ⇒ Robust Stability.



Ideally, we would like to have a necessary and sufficient condition for robust stability, but this is difficult. The easiest (and hence most conservative) way to guarantee robust stability is the small gain condition ‖T‖H∞ < 1. This condition, Q-stability, and quadratic stability are all equivalent if the uncertainty is unstructured. However, in general, there are gaps between the three. Conservatism in ‖T‖H∞ < 1 can be reduced by taking the structure information on the uncertainty into account, which leads to the notion of Q-stability, or the scaled H∞ norm condition. Q-stability is still conservative, but has a nice (computable) characterization, as follows [32, 97]. (A proof is left for the reader as an easy exercise.)

Theorem 4.7.5 The following statements are equivalent.

(i) The uncertain system (4.70) is Q-stable.

(ii) There exist matrices P > 0 and S ∈ S such that

[ P 0; 0 S ] > [ A B; C D ]ᵀ [ P 0; 0 S ] [ A B; C D ].
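A scalar spot-check of this characterization (illustrative data, not from the text):

```python
import numpy as np

# Discrete-time sketch: x(k+1) = A x + B omega, psi = C x, |delta(k)| <= 1.
A, B, C, D = 0.5, 0.2, 1.0, 0.0
P, S = 1.0, 0.4                      # candidate structured Lyapunov data

M2 = np.array([[A, B], [C, D]])
lhs = np.diag([P, S]) - M2.T @ np.diag([P, S]) @ M2
assert np.linalg.eigvalsh(lhs).min() > 0          # condition (ii) holds

# Q-stability implies robust stability: the closed-loop pole A + B*delta*C
# stays inside the unit circle for every admissible constant delta.
for delta in (-1.0, 0.0, 1.0):
    assert abs(A + B * delta * C) < 1.0
```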

Now, we shall define robust performance measures and provide characterizations based on the
notion of Q-stability.

Robust H2 Performance: Consider

F̂H2(∆, w0) = ‖y‖²ℓ2 = Σ_{k=0}^∞ yᵀ(k)y(k)

where { y(k) }_{k=0}^∞ is the output signal in response to

w(k) = w0 (k = 0),   w(k) = 0 (k ≥ 1),   x(0) = 0,

in the presence of the perturbation { ∆(k) }_{k=0}^∞. The robust H2 performance measure is defined by

FH2 = sup_{w0, ∆} { F̂H2(∆, w0) : ‖w0‖ ≤ 1, ∆ ∈ BUD }.

Robust ℓ∞ Performance: Consider

F̂ℓ∞(∆, w) = ‖y‖²ℓ∞ = sup_{k≥0} yᵀ(k)y(k)

where { y(k) }_{k=0}^∞ is the output signal for the disturbance { w(k) }_{k=0}^∞ and the perturbation { ∆(k) }_{k=0}^∞, with a zero initial state x(0) = 0. The robust ℓ∞ performance measure is defined by

Fℓ∞ = sup_{w, ∆} { F̂ℓ∞(∆, w) : ‖w‖ℓ2 ≤ 1, ∆ ∈ BUD }.
w, ∆

Robust H∞ Performance: Consider

F̂H∞(∆, w) = ‖y‖²ℓ2 = Σ_{k=0}^∞ yᵀ(k)y(k)

where { y(k) }_{k=0}^∞ is the output signal in response to the disturbance w, with the zero initial state x(0) = 0, in the presence of the perturbation { ∆(k) }_{k=0}^∞. The robust H∞ performance is defined by

FH∞ = sup_{w, ∆} { F̂H∞(∆, w) : ‖w‖ℓ2 ≤ 1, ∆ ∈ BUD }.

The above robust performance measures are completely analogous to the ones for the continuous-
time case. We can obtain upper bounds for the performance measures by applying similar tech-
niques to those given in the previous section. However, for the robust H2 performance, there is
a difficulty associated with discrete-time systems, which results from the pulse disturbance as op-
posed to the (continuous-time) impulsive disturbance. To avoid this difficulty, let us introduce the
following technical assumption.

Assumption 1 ω(0) = 0.

This assumption holds if the uncertainty is modeled by a linear dynamic strictly proper system
with the zero initial state.
Now we are ready to give characterizations for upper bounds on the robust performance mea-
sures for the discrete-time uncertain system (4.70) [57].

Theorem 4.7.6 The following statements are equivalent.

(i) The discrete-time uncertain system (4.70) is Q-stable for all ∆ ∈ BUD .

(ii) There exist P > 0 and S ∈ S such that

[ P 0; 0 S ] > [ A B; C D ]ᵀ [ P 0; 0 S ] [ A B; C D ] + [ Mᵀ; Nᵀ ] [ M N ].    (4.71)
Suppose the above statements hold, and assume further that L = 0 and Assumption 1 is satisfied. Then the robust H2 performance measure FH2 is bounded above by

FH2 < ‖KᵀPK + HᵀH‖.

Proof. Using the discrete-time bounded real lemma (see e.g. [26, 97]), and following a similar
procedure to the proof of Theorem 4.7.2, the equivalence of statements (i) and (ii) can be verified.
Now, let x and ω be the signals in response to the pulse disturbance. Then, multiplying (4.71)
by [ xT (k) ω T (k) ] and its transpose from the left and right, respectively, we have

yᵀ(k)y(k) < xᵀ(k)Px(k) − xᵀ(k+1)Px(k+1) + ωᵀ(k)Sω(k) − ψᵀ(k)Sψ(k),  ∀ k ≥ 1.



Taking the summation over k = 1, 2, . . ., and adding ‖y(0)‖² to both sides, after some manipulations,

‖y‖²ℓ2 < [ xᵀ(0) ωᵀ(0) wᵀ(0) ] [ Aᵀ Cᵀ Mᵀ; Bᵀ Dᵀ Nᵀ; Kᵀ Lᵀ Hᵀ ] [ P 0 0; 0 S 0; 0 0 I ] [ A B K; C D L; M N H ] [ x(0); ω(0); w(0) ]
        − ωᵀ(0)Sω(0) + Σ_{k=0}^∞ ( ωᵀ(k)Sω(k) − ψᵀ(k)Sψ(k) ).

Note that, for S ∈ S and ∆ ∈ BUD, we have ωᵀ(k)Sω(k) − ψᵀ(k)Sψ(k) ≤ 0. Then the performance bound follows immediately, with the assumptions x(0) = 0 and ω(0) = 0, by taking the worst case over the direction of the pulse disturbance. □
The result of Theorem 4.7.6 shows a clear analogy to the characterization of the H2 norm for the nominal system, based on the Lyapunov inequality. Indeed, the nominal case result can be recovered by letting the matrices associated with the uncertainty be zero. In this case, the upper bound for the H2 norm is tight [110]. However, the upper bound in Theorem 4.7.6 on the robust H2 performance may not be tight in general, due to the conservativeness of Q-stability. In view of the above proof, Theorem 4.7.6 holds for the larger class of uncertainties, i.e., linear noncausal time-varying uncertainties with ℓ2 gain less than or equal to 1.
It is worth noting that the matrix blockdiag(P, S) in Theorem 4.7.6 may be considered as a structured Lyapunov matrix proving Q-stability. In particular, it is the structured observability Gramian, if we replace the inequality by equality, for the augmented system

 ≜ [ A B; C D ],   Ĉ ≜ [ M N ].

The following theorem characterizes an upper bound on the robust ℓ∞ performance [57]. The result exhibits a certain duality to the result of Theorem 4.7.6, and in this case, we have a structured controllability Gramian to describe the bound.

Theorem 4.7.7 The following statements are equivalent.

(i) The discrete-time uncertain system (4.70) is Q-stable for all ∆ ∈ BUD .

(ii) There exist Q > 0 and S ∈ S such that

[ Q 0; 0 S ] > [ A B; C D ] [ Q 0; 0 S ] [ A B; C D ]ᵀ + [ K; L ] [ Kᵀ Lᵀ ].    (4.72)

Suppose the above statements hold. Then the robust ℓ∞ performance measure Fℓ∞ is finite if N = 0, in which case it is bounded above by

Fℓ∞ < ‖MQMᵀ + HHᵀ‖.



Proof. The equivalence (i) ⇔ (ii) follows from Theorem 4.7.6 by dualizing the result. Note that, by the Schur complement formula, (4.72) is equivalent to

ΩD ≜ [ Q⁻¹ 0 0; 0 S⁻¹ 0; 0 0 I ] − [ Aᵀ Cᵀ; Bᵀ Dᵀ; Kᵀ Lᵀ ] [ Q⁻¹ 0; 0 S⁻¹ ] [ A B K; C D L ] > 0.

Then, for ζ(k) = [ xᵀ(k) ωᵀ(k) wᵀ(k) ]ᵀ ≠ 0, we have ζᵀ(k)ΩD ζ(k) > 0, or

‖w(k)‖² + xᵀ(k)Q⁻¹x(k) − xᵀ(k+1)Q⁻¹x(k+1) − ψᵀ(k)S⁻¹ψ(k) + ωᵀ(k)S⁻¹ω(k) > 0.

Taking the summation over k = 0, 1, . . . , N,

xᵀ(N+1)Q⁻¹x(N+1) < Σ_{k=0}^N ‖w(k)‖² + Σ_{k=0}^N ( ωᵀ(k)S⁻¹ω(k) − ψᵀ(k)S⁻¹ψ(k) ) + xᵀ(0)Q⁻¹x(0).

The rest of the proof is similar to the proof of Theorem 4.7.3. □


Finally, we give a characterization of robust H∞ performance bound [26, 97]. The result can
be proved by a similar procedure to the proof of Theorem 4.7.6, and hence is left for the reader as
an exercise.

Theorem 4.7.8 The following statements are equivalent.

(i) The uncertain system (4.70) is Q-stable for all ∆ ∈ BUD .

(ii) There exist matrices P > 0, S ∈ S and a scalar γ > 0 such that

[ P 0 0; 0 S 0; 0 0 γ²I ] > [ A B K; C D L; M N H ]ᵀ [ P 0 0; 0 S 0; 0 0 I ] [ A B K; C D L; M N H ].    (4.73)
Suppose the above statements hold. Then the robust H∞ performance FH∞ is bounded above by

FH∞ < γ.

Chapter 4 Closure
This chapter presents both deterministic and stochastic interpretations of the covariance of the state or output of a dynamic system. The covariance matrix may be constructed by injecting impulses (pulses) into one input channel at a time in continuous (discrete) systems, and summing the second-order information (the outer product of the state with itself) over the number of input channels. It is shown that some problems that are nonlinear in the state space remain linear in the state covariance equations. Hence, using second-order information (the covariance) permits a larger class of systems to be treated by linear methods.
It is of fundamental interest in this book that the system gains can be related to the second-order
information (covariance) of the state. In later chapters, the ability to assign a specified covariance
matrix will be interpreted in terms of the system gains that can be assigned by feedback. The system
gains are defined to relate the norm of an output to the norm of an input. A basic motivation for
defining these quantities is that many control design objectives can be given by the requirement that
the error output signal be small regardless of the disturbance and/or command input. Thus, the
system gains are measures for performance of control systems. A variety of nominal performance
measures are treated in [19, 155, 165] while several robust performance measures can be found in
[26, 57, 56, 140].
For further reading on stochastic processes in systems and control, see [10, 137, 78].
Chapter 5

Covariance Controllers

Covariance analysis is the workhorse of systems engineering. Almost every engineering discipline
uses covariance analysis to assess errors and to evaluate performance. It was not until [53] that
techniques were developed to use feedback control to assign a desired covariance. This chapter
outlines and extends those procedures.

5.1 Covariance Control Problem

Recall that the state covariance carries many system properties, and that the closed-loop stability is
equivalent to the existence of a positive definite state covariance under controllability assumptions.
This motivates the following covariance control problem [53]: Suppose a state covariance which
satisfies certain closed-loop performance specifications is given. We wish to find all controllers
which assign the given state covariance. We call such controllers covariance controllers. Note that
not all positive definite matrices can be assigned as state covariances. A positive definite matrix X
is assignable as a state covariance if it satisfies

Acℓ X + X Acℓᵀ + Bcℓ Bcℓᵀ = 0    (continuous-time case)

X = Acℓ X Acℓᵀ + Bcℓ Bcℓᵀ    (discrete-time case)

for some controller G (note that the matrices Acℓ and Bcℓ are functions of G). The solution to the covariance control problem consists of the following two parts: (1) necessary and sufficient conditions for a positive definite matrix to be assignable, and (2) an explicit formula for all controllers which assign a given state covariance to the closed-loop system. Throughout the chapter, we assume for simplicity that there is no correlation between the process and measurement noises, and we define the noise covariances W and V as follows:
[ Dp; Dz ] [ Dpᵀ Dzᵀ ] = [ W 0; 0 V ].    (5.1)


We shall also assume that (Ap , Dp ) is controllable and (Ap , Bp , Mp ) is a stabilizable and detectable
triple. In the following section, we consider the covariance control problem for the linear time-
invariant continuous-time system.

5.2 Continuous-Time Covariance Controllers


5.2.1 State Feedback
Consider the linear system given by (4.15), where we assume that all the states are available for feedback, i.e., Mp = I and Dz = 0. With a controller of constant state feedback gain G, the closed-loop state covariance X satisfies the following Lyapunov equation:

(Ap + BpG)X + X(Ap + BpG)ᵀ + W = 0.    (5.2)

Theorem 5.2.1 [53, 130] Let a positive definite matrix X ∈ R^{np×np} be given. Then the following
statements are equivalent.

(i) There exists a control gain G which assigns X as a state covariance.

(ii) X satisfies

    (I − Bp Bp^+)(Ap X + X Ap^T + W)(I − Bp Bp^+) = 0.

In this case, all such control gains G are given by

    G = −(1/2) Bp^+ (Ap X + X Ap^T + W)(2I − Bp Bp^+) X^{−1}
        + Bp^+ SF Bp Bp^+ X^{−1} + (I − Bp^+ Bp) ZF     (5.3)

where ZF is arbitrary and SF is an arbitrary skew-symmetric matrix.

Proof. Recall that a given matrix X > 0 is assignable as a state covariance if and only if there
exists a controller G satisfying (5.2). Rearranging (5.2), we have

    Bp G X + (Bp G X)^T + Q = 0,     (5.4)

where

    Q = Ap X + X Ap^T + W.

Then the result directly follows by applying Theorem 2.3.9 to solve (5.4) for GX. 2
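As a numerical illustration of Theorem 5.2.1 (a sketch, not from the book): the script below constructs an assignable X by closing the loop with an arbitrary stabilizing gain G0, checks condition (ii), and recovers a gain from formula (5.3) with the free parameters SF and ZF set to zero. All system matrices are illustrative choices.

```python
import numpy as np

def lyap(A, W):
    """Solve A X + X A^T + W = 0 for X via Kronecker vectorization."""
    n = A.shape[0]
    M = np.kron(A, np.eye(n)) + np.kron(np.eye(n), A)
    return np.linalg.solve(M, -W.flatten()).reshape(n, n)

def assignable(Ap, Bp, W, X, tol=1e-8):
    """Condition (ii) of Theorem 5.2.1: the projection of the Lyapunov
    residual onto the orthogonal complement of range(Bp) must vanish."""
    P = np.eye(Ap.shape[0]) - Bp @ np.linalg.pinv(Bp)
    return np.linalg.norm(P @ (Ap @ X + X @ Ap.T + W) @ P) < tol

def covariance_gain(Ap, Bp, W, X):
    """Formula (5.3) with the free parameters S_F and Z_F taken as zero."""
    n = Ap.shape[0]
    Bpp = np.linalg.pinv(Bp)
    Q = Ap @ X + X @ Ap.T + W
    return -0.5 * Bpp @ Q @ (2 * np.eye(n) - Bp @ Bpp) @ np.linalg.inv(X)

# Illustrative example: make an assignable X by closing the loop with
# a stabilizing G0 and solving the resulting Lyapunov equation.
Ap = np.array([[0.0, 1.0], [-2.0, -3.0]])
Bp = np.array([[0.0], [1.0]])
W = np.eye(2)
G0 = np.array([[-1.0, -1.0]])
X = lyap(Ap + Bp @ G0, W)          # a closed-loop covariance, hence assignable

G = covariance_gain(Ap, Bp, W, X)  # recover a gain assigning X
resid = (Ap + Bp @ G) @ X + X @ (Ap + Bp @ G).T + W
```

The residual confirms that the recovered gain reproduces the Lyapunov equation (5.2) for the prescribed X.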

5.2.2 Covariance Construction for Assignability


The following result permits the construction of covariances assignable by state feedback (Theo-
rem 5.2.1) in a finite number of steps. Let the SVD of Bp be given by
" #" #
h i Σ1 0 V1T
T
Bp = UΣV = U1 U2 .
0 0 V2T
and define A2 , B2 , W2 , X2 by

A2 = UT2 Ap U2 , B2 = UT2 Ap U1 , W2 = UT2 WU2 , X2 = UT2 XU2



Lemma 5.2.1 Given Ap , Bp , W, the following statements are equivalent:

(i) X > 0 satisfies

    (I − Bp Bp^+)(Ap X + X Ap^T + W)(I − Bp Bp^+) = 0     (5.5)

(ii) There exist real matrices X2 > 0, R = R^T > 0, S = −S^T and Z such that

    X = U [R + G2 X2 G2^T   G2 X2; X2 G2^T   X2] U^T     (5.6)

    G2 = −(1/2) B2^+ (A2 X2 + X2 A2^T + W2)(2I − B2 B2^+) X2^{−1}
         + B2^+ S B2 B2^+ X2^{−1} + (I − B2^+ B2) Z     (5.7)

where X2 satisfies

    (I − B2 B2^+)(A2 X2 + X2 A2^T + W2)(I − B2 B2^+) = 0.     (5.8)

Proof. Suppose (i) holds. Define

    [Q  L; L^T  X2] = [U1^T; U2^T] X [U1  U2]

where X2 > 0 follows from X > 0. Note that I − Bp Bp^+ = U2 U2^T, and (5.5) yields

    (A2 + B2 G2) X2 + X2 (A2 + B2 G2)^T + W2 = 0,     (5.9)

for some G2 = L X2^{−1}. Applying Theorem 5.2.1 to (5.9) yields (5.7)-(5.8), where S = −S^T and Z
is arbitrary. The existence of R follows from (5.6), whence

    X > 0 ⇐⇒ U^T X U > 0 ⇐⇒ [R + G2 X2 G2^T   G2 X2; X2 G2^T   X2] > 0

or, using the Schur complement, {X > 0} ⇐⇒ {X2 > 0 and R + G2 X2 G2^T − G2 X2 X2^{−1} X2 G2^T
= R > 0}. This proves that (i)⇒(ii). To show the converse, suppose (ii) holds. Then (5.6) yields
X > 0, since R > 0 and X2 > 0. From Theorem 5.2.1, X2 and G2 satisfy (5.9) for arbitrary
S = −S^T and Z. Then by substitution X given by (5.6) satisfies (5.5). 2
Note that if X2 > 0 can be found to satisfy (5.8), then (5.7) gives a set of G2 which can be
placed into a matrix of the form (5.6) to produce an assignable X, given an arbitrary R > 0. So
what is now needed is an X2 > 0 satisfying (5.8). Note that finding X2 to satisfy (5.8) is the same
as the mathematical problem of finding X to satisfy (5.5). What has been accomplished is that
(5.8) is a smaller matrix equation. Hence to solve (5.8) simply apply Theorem 5.2.1 again. This
yields another problem of the same form, but of smaller size. This process can be repeated until
the size of the problem is trivial to solve. This is the idea of the algorithm presented below.

Given matrices Ap, Bp, W define for k = 1, 2, . . . , q, the set of matrices,

    Ak+1 ≜ U2k^T Ak U2k,    A1 ≜ Ap
    Bk+1 ≜ U2k^T Ak U1k,    B1 ≜ Bp     (5.10)
    Wk+1 ≜ U2k^T Wk U2k,    W1 ≜ W

using the singular value decompositions,

    Bk = Uk Σk Vk^T = [U1k  U2k] [Σ1k  0; 0  0] [V1k^T; V2k^T]     (5.11)

where the terminal integer q is defined by the event

    Bq Bq^T > 0  or  Bq = 0.

To show that q < ∞, note that if Bk Bk^T is not positive definite and Bk ≠ 0, then U1k and U2k are
still well-defined (neither is an empty matrix). Since U1k is not empty, U2k is not square and U2k^T Ak
has fewer rows than Ak. That is, dim(Ak+1) = dim(Bk+1) < dim(Ak), where dim(·) denotes the row
dimension of (·). Hence the dimension of the matrices reduces by at least one on each iteration, and
thus the number of iterations is q < n, where Ap is n × n.

Recursive Algorithm for Assignable Covariances

Step 1 Compute the sequences Ak, Bk, Wk, k = 1, 2, . . . , q and Uk, k = 1, 2, . . . , q − 1 from
       (5.10)-(5.11).

Step 2 Initialize k = q and choose Xq according to the following rule: If Bq BTq > 0, choose Xq as
any positive definite matrix. If Bq = 0, choose Xq as the unique positive definite solution
to Aq Xq + Xq ATq + Wq = 0.

Step 3 Choose an arbitrary Zk and an arbitrary Sk = −Sk^T, and compute

    Gk = −(1/2) Bk^+ (Ak Xk + Xk Ak^T + Wk)(2I − Bk Bk^+) Xk^{−1}
         + Bk^+ Sk Bk Bk^+ Xk^{−1} + (I − Bk^+ Bk) Zk

Step 4 Choose an arbitrary Rk > 0 and update Xk:

    Xk−1 = Uk−1 [Rk + Gk Xk Gk^T   Gk Xk; Xk Gk^T   Xk] Uk−1^T

Step 5 If k = 2 go to Step 6. If k > 2, update k to k − 1 and return to Step 3.

Step 6 Let X = X1 .
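The six steps above can be sketched in code (an illustrative sketch, not from the book); the free parameters are fixed as Rk = I, Sk = 0, Zk = 0, and the terminal Xq is taken as the identity when Bq Bq^T > 0.

```python
import numpy as np

def lyap(A, W):
    """Solve A X + X A^T + W = 0 via Kronecker vectorization (used when B_q = 0)."""
    n = A.shape[0]
    M = np.kron(A, np.eye(n)) + np.kron(np.eye(n), A)
    return np.linalg.solve(M, -W.flatten()).reshape(n, n)

def step3_gain(A, B, W, X):
    """Step 3 with S_k = 0, Z_k = 0 (gain formula of Theorem 5.2.1)."""
    n = A.shape[0]
    Bp = np.linalg.pinv(B)
    Q = A @ X + X @ A.T + W
    return -0.5 * Bp @ Q @ (2 * np.eye(n) - B @ Bp) @ np.linalg.inv(X)

def assignable_covariance(Ap, Bp, W, tol=1e-10):
    """Steps 1-6 of the recursive algorithm; returns an assignable X."""
    # Step 1: sequences A_k, B_k, W_k and the orthogonal factors U_k
    seqs, Us = [(Ap, Bp, W)], []
    A, B, Wk = Ap, Bp, W
    while True:
        U, s, _ = np.linalg.svd(B)
        r = int(np.sum(s > tol))
        if r == B.shape[0] or r == 0:   # terminal event: B_q B_q^T > 0 or B_q = 0
            break
        Us.append(U)
        U1, U2 = U[:, :r], U[:, r:]
        A, B, Wk = U2.T @ A @ U2, U2.T @ A @ U1, U2.T @ Wk @ U2
        seqs.append((A, B, Wk))
    # Step 2: terminal choice of X_q
    if r > 0:
        X = np.eye(A.shape[0])          # B_q B_q^T > 0: any X_q > 0 will do
    else:
        X = lyap(A, Wk)                 # B_q = 0: unique solution of (5.12)
    # Steps 3-5: work back up the recursion, with R_k = I
    for (A, B, Wk), U in zip(seqs[::-1], Us[::-1]):
        G = step3_gain(A, B, Wk, X)
        R = np.eye(G.shape[0])
        X = U @ np.block([[R + G @ X @ G.T, G @ X],
                          [X @ G.T, X]]) @ U.T
    return X

# Fourth-order companion-form example (cf. Example 5.2.1, with a = (1, 2, 3, 4))
Ap = np.array([[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1],
               [-1.0, -2.0, -3.0, -4.0]])
Bp = np.array([[0.0], [0.0], [0.0], [1.0]])
W = Bp @ Bp.T
X = assignable_covariance(Ap, Bp, W)
```

Feeding the returned X back into the gain formula of Theorem 5.2.1 then yields a stabilizing state feedback, since X > 0 solves a closed-loop Lyapunov equation with W = Bp Bp^T.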

This algorithm is a natural consequence of Lemma 5.2.1, and the only thing that needs to be
justified is the choice of Xq in Step 2. At k = q we need to solve

    (I − Bq Bq^+)(Aq Xq + Xq Aq^T + Wq)(I − Bq Bq^+) = 0

for Xq > 0. If Bq Bq^T > 0 then Xq > 0 is arbitrary. If Bq = 0, and Aq is stable with (Aq, Wq)
controllable (assuming Wq > 0), then the Xq > 0 satisfying

    Aq Xq + Xq Aq^T + Wq = 0     (5.12)

is unique. If Bq = 0 and Aq is not stable, then no Xq > 0 solves (5.12). In this case the matrix
pair (A, B) is not stabilizable. In fact, if W = BB^T, then Wk = 0 for all k ≥ 2, and the matrix
pair (A, B) is stabilizable if and only if (5.12) has a positive definite solution Xq > 0.

Example 5.2.1 Consider a fourth-order single-input system given in the controllable canonical
form:

    Ap = [0 1 0 0; 0 0 1 0; 0 0 0 1; −a0 −a1 −a2 −a3],    Bp = [0; 0; 0; 1].

Let W = Dp Dp^T and Dp = Bp. In this case, (Ap + Bp G, Dp) is controllable for any G. We will
apply the recursive algorithm to generate all assignable covariances (controllability Gramians).
In step 1 of the algorithm, using (5.10) and (5.11), we find A1 = Ap, B1 = Bp, W1 = Bp Bp^T
and

    U1 = [0 1 0 0; 0 0 1 0; 0 0 0 1; 1 0 0 0],   A2 = [0 1 0; 0 0 1; 0 0 0],   B2 = [0; 0; 1],
    U2 = [0 1 0; 0 0 1; 1 0 0],

    A3 = [0 1; 0 0],   B3 = [0; 1],   U3 = [0 1; 1 0],   A4 = 0,   B4 = 1

and Wk = 0 (k = 2, 3, 4). Note that q = 4.


In step 2 of the algorithm, Bq Bq^T > 0 and therefore Xq can be chosen to be an arbitrary positive
scalar; we denote it by x4 > 0. For the computation of Gk in step 3 of the algorithm, the choices
of the parameters Sk = −Sk^T and Zk do not affect the value of Gk because of the special structure
of Bk. In step 4, the parameter Rk is a positive scalar and we denote it by rk.

Iterating steps 3 and 4 for k = 4, 3, 2, we have

    G4 = 0,    X3 = [x4  0; 0  r4],

    G3 = −[r4/x4   0],    X2 = [x4  0  −r4; 0  r4  0; −r4  0  r5],

    G2 = −[0   r3/r4 + r4/x4   0],    X1 = [x4  0  −r4  0; 0  r4  0  −r5; −r4  0  r5  0; 0  −r5  0  r6]

where

    r5 ≜ r3 + r4^2/x4,    r6 ≜ r2 + r5^2/r4.

Thus, the set of all assignable covariances is parametrized by X = X1 in terms of positive scalars
r2 , r3 , r4 and x4 .
Note that X has the signature Hankel structure. In fact, as shown in the next subsection,
when an nth order single-input system is in the controllable canonical form, the set of assignable
controllability Gramians coincides with the set of positive definite signature Hankel matrices. This
fourth-order example can be generalized to give a parametrization of n×n positive definite signature
Hankel matrices in terms of n positive scalars.
Finally, using (5.3), the unique state feedback gain G that assigns a given X is found to be

    G = [a0 − p2 p4;  a1 − p1 (p3 + p4);  a2 − p2 − p3 − p4;  a3 − p1]^T

where

    p1 ≜ 1/(2 r2),    p2 ≜ r2/r3,    p3 ≜ r3/r4,    p4 ≜ r4/x4.

Since pk > 0 (k = 1, . . . , 4) if and only if rk > 0 (k = 2, 3, 4) and x4 > 0, the above formula
for G provides a parametrization of all stabilizing state feedback gains in terms of the positive
scalars pk (k = 1, . . . , 4). From the analysis point of view, we see that the set of all stable
fourth-order polynomials1 is parametrized by

    φ(λ) = λ^4 + p1 λ^3 + (p2 + p3 + p4) λ^2 + p1 (p3 + p4) λ + p2 p4

where pk > 0 (k = 1, . . . , 4).

1 We say that a polynomial is stable if all its roots have negative real parts.
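The parametrization of stable fourth-order polynomials claimed above can be spot-checked numerically (an illustrative sketch, not from the book; the sampling range for the pk is arbitrary):

```python
import numpy as np

def phi_roots(p1, p2, p3, p4):
    """Roots of phi(s) = s^4 + p1 s^3 + (p2+p3+p4) s^2 + p1 (p3+p4) s + p2 p4."""
    return np.roots([1.0, p1, p2 + p3 + p4, p1 * (p3 + p4), p2 * p4])

# draw random positive parameters and record the largest root real part seen
rng = np.random.default_rng(0)
max_real = max(np.max(phi_roots(*rng.uniform(0.1, 10.0, 4)).real)
               for _ in range(200))
```

A Routh-Hurwitz computation confirms the claim analytically: for the quartic above, the critical Routh quantity reduces to p1^2 p2 p3, which is positive for all pk > 0.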

5.2.3 State Feedback for Single Input Systems


The above algorithm collapses to a closed form analytical solution for single-input systems. For
the system described by

ẋ = Ax + B(u + w) (5.13)

suppose (A, B) is in phase variable form, and


" # " #
0 Inx 0
A = , B= (5.14)
−aT 1
h i
aT = a1 a2 · · · an , u = Gx
h i
G = g1 g2 · · · gn (5.15)

where the characteristic equation for A is

λn + an λn−1 + · · · a2 λ + a1 = 0. (5.16)

and the characteristic equation of the closed-loop system is the same as (5.16) with ai replaced by

ai − gi , i = 1, · · · nx . Define σi = Xii where X satisfies

0 = X(A + BG)T + (A + BG)X + BBT . (5.17)

Hence σi is the impulse to energy gain between the input and the ith state variable, or σi is the
variance of the ith state variable if the input w is zero mean white noise. There are n values of ai
and n values of σi . We seek to show the one-to-one relationship between them.
Many engineering problems, such as antenna or telescope pointing, have performance require-
ments naturally stated in terms of inequalities on the variances, σi . Hence, it might be of interest
to explicitly relate stability and performance by expressing the allowed variance values σi in terms
of the coefficients ai, which contain all stability information. Note that x^T X^{−1} x is a Lyapunov
function for (5.13).
The motivation for this section was influenced by two known results:

I) The work of [88] proved the Hurwitz-Routh test [114, 55] from Lyapunov stability theory,
by using a transformation of (5.13) which yields a diagonal Lyapunov matrix. Positive
definiteness of this Lyapunov matrix is shown to be equivalent to the Hurwitz-Routh test. Other
studies connecting Lyapunov to the Hurwitz-Routh test include [81, 123, 13, 31, 147, 124,
150, 69, 107, 101, 100, 102, 88, 67].

II) Covariance control theory (Theorem 5.2.1) gives the necessary and sufficient conditions under
which a given state covariance matrix X may be assigned by a feedback controller.

These two results (I, II) are combined as follows. The set of all X that may be “assigned” by
a state feedback controller for the controllable system {ẋ = Ax + B(u + w), u = Gx} has special

structure (to be shown), and A + BG is asymptotically stable, if and only if the “assigned” X is
positive definite. Hence, we exploit this special structure of X and use xT (t)X−1 x(t) as a Lyapunov
function, in lieu of the diagonal choice of a Lyapunov matrix used by [88]. The potential advantages
are the new explicit relationships between the physical performance entities σi , i = 1, · · · , n and
stability (the coefficients ai , i = 1, · · · , n). Such information may also be useful in model reduction
[132].
Let the even and odd coefficients of (5.16) be collected in vectors ae, a0, respectively,

    a0 ≜ [a1; a3; a5; · · ·],    ae ≜ [a2; a4; a6; · · ·]     (5.18)

and define two Hankel matrices composed of the σi (shown here for the 6 × 6 case; the pattern
continues for larger dimensions):

    X0 ≜ [σ1  −σ2  σ3  −σ4  σ5  −σ6;
          −σ2  σ3  −σ4  σ5  −σ6  σ7;
          σ3  −σ4  σ5  −σ6  σ7  −σ8;
          −σ4  σ5  −σ6  σ7  −σ8  σ9;
          σ5  −σ6  σ7  −σ8  σ9  −σ10;
          −σ6  σ7  −σ8  σ9  −σ10  σ11]     (5.19)

    Xe ≜ [σ2  −σ3  σ4  −σ5  σ6  −σ7;
          −σ3  σ4  −σ5  σ6  −σ7  σ8;
          σ4  −σ5  σ6  −σ7  σ8  −σ9;
          −σ5  σ6  −σ7  σ8  −σ9  σ10;
          σ6  −σ7  σ8  −σ9  σ10  −σ11;
          −σ7  σ8  −σ9  σ10  −σ11  σ12]     (5.20)

i.e., (X0)ij = (−1)^{i+j} σ_{i+j−1} and (Xe)ij = (−1)^{i+j} σ_{i+j}. Here a0 ∈ R^{n0}, ae ∈ R^{ne}.
When n is even, n0 = ne = n/2. When n is odd, n0 = ne + 1, ne = (n − 1)/2. Also X0 ∈ R^{n0×n0},
Xe ∈ R^{ne×ne}. Define the phrase “stable polynomial” to mean that all roots of the polynomial lie
in the open left half plane.

Theorem 5.2.2 For the linear time-invariant system (5.13), let σi, i = 1, · · · , n denote the state
variances, and let ai, i = 1, · · · , n denote the coefficients of the characteristic polynomial (5.16).
The following statements (i-iii) are equivalent:

(i) The matrices X0, Xe in (5.19), (5.20) are positive definite.

(ii) The set of ai corresponds to a stable polynomial.

(iii) For n even,

    a0 = X0^{−1} Xe 1,    ae = (1/2) Xe^{−1} 1,    1 ≜ [0  · · ·  0  1]^T,     (5.21)

and for n odd,

    a0 = (1/2) X0^{−1} 1,    ae = Xe^{−1} [0  Ie] X0 1,     (5.22)

where Xe, X0 are any positive definite matrices having the structure (5.19), (5.20) and a0, ae are
defined by (5.18).

Theorem 5.2.2 gives the explicit (unique) dependence of the coefficients of the characteristic
polynomial on the assignable variances of each state variable of (5.13). Table 5.1 lists the results
of Theorem 5.2.2 for n = 2, 3, 4, 5, 6. Note that the stability conditions for n = 5 include all the
conditions for n = 4 plus one additional condition ∆5 > 0 (denoted “+∆5 > 0” in Table 5.1).
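The n = 4 case of Theorem 5.2.2 (iii) can be checked numerically: solve (5.17) with G = 0 for a stable companion-form A, read off the variances σi, and evaluate (5.21). This is an illustrative sketch, not from the book; consistent with (5.21), the symbol 1 is taken as the last standard basis vector, and the test polynomial (λ + 1)^4 is an arbitrary stable choice.

```python
import numpy as np

def lyap(A, W):
    """Solve A X + X A^T + W = 0 via Kronecker vectorization."""
    n = A.shape[0]
    M = np.kron(A, np.eye(n)) + np.kron(np.eye(n), A)
    return np.linalg.solve(M, -W.flatten()).reshape(n, n)

# lambda^4 + a4 l^3 + a3 l^2 + a2 l + a1 = (l + 1)^4, i.e. a = (1, 4, 6, 4)
a1, a2, a3, a4 = 1.0, 4.0, 6.0, 4.0
A = np.array([[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1],
              [-a1, -a2, -a3, -a4]])
B = np.array([[0.0], [0.0], [0.0], [1.0]])
X = lyap(A, B @ B.T)                            # (5.17) with G = 0
s = np.diag(X)                                  # variances sigma_i
X0 = np.array([[s[0], -s[1]], [-s[1], s[2]]])   # (5.19), n = 4
Xe = np.array([[s[1], -s[2]], [-s[2], s[3]]])   # (5.20), n = 4
one = np.array([0.0, 1.0])                      # the vector "1"

a_odd = np.linalg.solve(X0, Xe @ one)           # should equal [a1, a3]
a_even = 0.5 * np.linalg.solve(Xe, one)         # should equal [a2, a4]
```

The off-diagonal entries of X also exhibit the signature Hankel structure of (5.25) exactly, since the closed-loop Lyapunov equation forces the leading (n − 1) × (n − 1) block of X A^T + A X to vanish.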

Corollary 5.2.1 Define the positive scalar η by

    η ≜ (1/2) 1^T Xe^{−1} 1,    n even,
    η ≜ (1/2) 1^T X0^{−1} 1,    n odd.

All poles of the system (5.13) lie between the vertical lines −η + jω and 0 + jω (ω real) in the
complex plane. Furthermore, the average of all pole locations is −η/n.

Proof. The sum of all poles is

    Σ_{k=1}^{n} λk = tr[A] = −an,     (5.23)

where an is given by (5.21). Hence an = η and (1/n) Σ_{i=1}^{n} λi = (1/n) tr[A] = −(1/n) an = −(1/n) η.
Now consider (5.17) with G = 0, and let ℓi be a left eigenvector,

    ℓi* A = λi ℓi*;

then from (5.17),

    ℓi* [X A^T + A X + B B^T] ℓi = 0

yields

    λi + λi* = −(ℓi* B B^T ℓi)/(ℓi* X ℓi) ≥ −λ̄(X^{−1} B B^T),

[Table 5.1 (stability conditions of Theorem 5.2.2 in terms of the variances σi, n = 2, . . . , 6) appears here; its entries were not recovered in this extraction.]

and λ̄ denotes the maximum eigenvalue. For any nonsingular T,

    λ̄(X^{−1} B B^T) = λ̄[T^{−1} (X^{−1} B B^T) T].

For a certain choice of T (to be discussed) it is possible to show that

    λ̄[T^{−1} (X^{−1} B B^T) T] = η.

Now consider G ≠ 0 in (5.17). From Theorem 5.2.1, the set of all covariances assignable by G
satisfy

    (I − BB^+)(X A^T + A X)(I − BB^+) = 0.     (5.24)
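Corollary 5.2.1 can be spot-checked on a concrete stable fourth-order example (an illustrative sketch, not from the book; the pole locations chosen below are arbitrary):

```python
import numpy as np

def lyap(A, W):
    """Solve A X + X A^T + W = 0 via Kronecker vectorization."""
    n = A.shape[0]
    M = np.kron(A, np.eye(n)) + np.kron(np.eye(n), A)
    return np.linalg.solve(M, -W.flatten()).reshape(n, n)

# stable n = 4 example: (lambda + 1)^2 (lambda + 3)(lambda + 5)
poly = np.poly([-1.0, -1.0, -3.0, -5.0])         # [1, a4, a3, a2, a1]
a = poly[1:][::-1]                                # (a1, a2, a3, a4)
A = np.vstack([np.hstack([np.zeros((3, 1)), np.eye(3)]), -a])
B = np.array([[0.0], [0.0], [0.0], [1.0]])
X = lyap(A, B @ B.T)
s = np.diag(X)
Xe = np.array([[s[1], -s[2]], [-s[2], s[3]]])     # (5.20), n = 4
one = np.array([0.0, 1.0])                        # last basis vector
eta = 0.5 * one @ np.linalg.solve(Xe, one)        # eta for n even

poles = np.linalg.eigvals(A)
```

Here eta equals a4 (the negative trace of A), so the average pole location is exactly −eta/4 and every pole real part lies in [−eta, 0).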

Lemma 5.2.2 Let (A, B) have structure (5.14). These statements are equivalent.

(i) A matrix X has the “signature Hankel” structure

    X = [σ1  0  −σ2  0  σ3  0  −σ4  · · ·;
         0  σ2  0  −σ3  0  σ4  0  · · ·;
         −σ2  0  σ3  0  −σ4  0  σ5  · · ·;
         0  −σ3  0  σ4  0  −σ5  0  · · ·;
         σ3  0  −σ4  0  σ5  0  −σ6  · · ·;
         0  σ4  0  −σ5  0  σ6  0  · · ·;
         −σ4  0  σ5  0  −σ6  0  σ7  · · ·;
         · · ·]     (5.25)

(ii) X satisfies (5.24).

Lemma 5.2.3 Equation (5.13) describes an asymptotically stable system, if and only if matrix
X in (5.17) is a positive definite matrix of “signature Hankel” structure (5.25). Furthermore, a
suitable Lyapunov function is

    V(x(t)) = x^T(t) X^{−1} x(t),     (5.26)

and its time rate of change is, for n even,

    V̇(x(t)) = −4 (xe^T ae)^2,     (5.27)

where

    xe^T ≜ [x2  x4  x6  x8  · · ·],    ae^T ≜ [a2  a4  a6  a8  · · ·]     (5.28)

and, for n odd,

    V̇(x(t)) = −4 (x0^T a0)^2,     (5.29)

where

    x0^T ≜ [x1  x3  x5  x7  · · ·],    a0^T ≜ [a1  a3  a5  a7  · · ·].     (5.30)

Furthermore, the “time constant” of the system, defined by τ where

    τ^{−1} = sup_x  V̇(x(t)) / V(x(t)),     (5.31)

is

    τ = (ae^T Xe ae)^{−1},    (n even),
    τ = (a0^T X0 a0)^{−1},    (n odd).     (5.32)

Remark 1: Note from (5.17) that (5.26) is a Lyapunov function suitable for proving stability,
since X > 0 is a necessary and sufficient condition for the stability of A, given the controllability
of (A, B) assumed in (5.17).

Remark 2: The V̇ in (5.27), (5.29) is negative semidefinite and zero only at the origin x = 0.
Hence, X > 0 is necessary and sufficient for stability. Relationships (5.21), (5.22) have been
exploited to get the expressions (5.27), (5.29).

Remark 3: Note that the Lyapunov matrix (the inverse of the covariance matrix) has been
completely specified in terms of only the n variances σi , i = 1, · · · , n. Hence, the study of positive
definiteness of X (and the stability of A) involves only n positive numbers. We should not be
surprised, therefore, to later find a one-to-one correspondence between the two sets of parameters
(ai and σi ).

Proof of Lemma 5.2.2. From (5.14),

    I − BB^+ = [I_{n−1}  0; 0  0].

Hence (5.24) is equivalent to setting the leading (n − 1) × (n − 1) block of X A^T + A X to zero,

    (X A^T + A X)11 = 0_{(n−1)×(n−1)}     (5.33)

or equivalently,

    [0  I_{n−1}] X [I_{n−1}; 0] + [I_{n−1}  0] X [0; I_{n−1}] = 0     (5.34)

or equivalently,

    X_{i+1,j} = −X_{i,j+1},    i, j = 1, · · · , n − 1.     (5.35)

Note from (5.35) and symmetry that Xαβ = 0 when α + β is odd. By defining σi = Xii and
using symmetry (Xαβ = Xβα), (5.35) is equivalent to (5.25).
Consider the unitary coordinate transformation of (5.13) defined by T = [T0  Te],

    T0 ≜ [e1  e3  e5  e7  · · ·] ∈ R^{n×n0},    Te ≜ [e2  e4  e6  e8  · · ·] ∈ R^{n×ne},     (5.36)

where ei denotes the ith standard basis vector of R^n,

where

    n0 ≜ n̄/2,    n̄ ≜ n if n is even, n + 1 if n is odd,
    ne ≜ n/2 if n is even, (n − 1)/2 if n is odd,

and x0^T(t) xe(t) = 0 since T0^T Te = 0.


The transformed state vector has these properties:

    T^T x = [x0; xe],    x0 = [x1; x3; x5; · · ·],    xe = [x2; x4; x6; · · ·].     (5.37)

We shall refer to x0 ∈ Rn0 and xe ∈ Rne as the “odd” and “even” state vectors. Likewise, we shall
refer to “odd” and “even” coefficients a0 , ae in (5.18).
Define the “signature Toeplitz” matrices

    Λ00 ∈ R^{(n+1)/2 × n},    Λe0 ∈ R^{(n−1)/2 × n},

whose rows carry the diagonally arranged, sign-alternating coefficient patterns

    Λe0 :  ã2  ã4  ã6  · · ·  ãn−1  1̃     (5.38)

    Λ00 :  ã1  ã3  ã5  · · ·  ãn     (5.39)

and the two n/2 × n “signature Toeplitz” matrices Λ0e, Λee, composed, respectively, of odd and
even coefficients,

    Λ0e :  ã1  ã3  ã5  · · ·  ãn−1  1̃     (5.40)

    Λee :  ã2  ã4  ã6  · · ·  ãn     (5.41)

where each pattern denotes the elements of a vector (such as ãi) arranged diagonally in the
matrix, with each successive row shifted one column to the right (an explicit n = 8 instance of Λe
appears in the proof of Theorem 5.2.3), and where

    ãi ≜ 1̄ ai,    i = 1, 5, 9, 13, 17, · · · or i = 2, 6, 10, 14, 18, · · ·,    i ≤ n,     (5.42)

    ãj ≜ −1̄ aj,    j = 3, 7, 11, 15, · · · or j = 4, 8, 12, 16, · · ·,    j ≤ n,     (5.43)

where

    1̄^T = [1, −1, 1, −1, 1, −1, · · ·],     (5.44)

    1̃ ≜ (−1)^{ne} 1̄.     (5.45)

Theorem 5.2.3 Let ai, i = 1, . . . , n represent any coefficients associated with a stable polynomial.
Then the variances of all state variables are given uniquely by, for n even,

    σ = (1/2) Λe^{−1} 1,    Λe = [Λ0e; Λee]     (5.46)

and, for n odd,

    σ = (1/2) Λ0^{−1} 1,    Λ0 = [Λe0; Λ00].     (5.47)

In Theorem 5.2.2, it is clear that the n ai may be determined from the n σi by inverting a square
matrix of size no greater than (n + 1)/2. In (5.46), (5.47), it appears that a full n × n matrix must
be inverted to find the σi in terms of given ai. However, the matrices Λe, Λ0 have special structure,
and only the last column of Λe^{−1} or Λ0^{−1} need be computed.
Tables 5.2 (a) and (b) list the results of Theorem 5.2.3 for n = 2, 3, 4, 5, 6. Note that the stability
condition (X > 0) is the same as for Table 5.1, but when expressed in terms of the coefficients,
yields the same tests as the Hurwitz-Routh stability test.
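For n = 4, the matrix Λe of Theorem 5.2.3 follows the same pattern as the n = 8 instance displayed in the proof below. The sketch here (illustrative, not from the book; the stable coefficient set is an arbitrary choice) solves Λe σ = (1/2)1 and cross-checks the result against the Lyapunov equation (5.17):

```python
import numpy as np

def lyap(A, W):
    """Solve A X + X A^T + W = 0 via Kronecker vectorization."""
    n = A.shape[0]
    M = np.kron(A, np.eye(n)) + np.kron(np.eye(n), A)
    return np.linalg.solve(M, -W.flatten()).reshape(n, n)

a1, a2, a3, a4 = 15.0, 38.0, 32.0, 10.0    # (l+1)^2 (l+3)(l+5), stable
Lam_e = np.array([[a1, -a3, 1.0, 0.0],      # odd-coefficient rows (Lambda_0e)
                  [0.0, -a1, a3, -1.0],
                  [0.0, a2, -a4, 0.0],      # even-coefficient rows (Lambda_ee)
                  [0.0, 0.0, -a2, a4]])
rhs = np.array([0.0, 0.0, 0.0, 0.5])        # (1/2) 1
sigma = np.linalg.solve(Lam_e, rhs)         # (5.46)

# cross-check against the variances from the Lyapunov equation (5.17), G = 0
A = np.array([[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1],
              [-a1, -a2, -a3, -a4]])
B = np.array([[0.0], [0.0], [0.0], [1.0]])
sigma_lyap = np.diag(lyap(A, B @ B.T))
```

Only the last column of Λe^{−1} is actually needed, since the right-hand side is a multiple of the last basis vector; a banded back-substitution exploiting the Toeplitz structure would be cheaper than the dense solve shown.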
The Lyapunov equation (5.17), expressed in the transformed coordinates (T^T AT, T^T B, T^T XT),
yields the Lyapunov equation

    (T^T A T)(T^T X T) + (T^T X T)(T^T A T)^T + T^T B B^T T = 0     (5.48)

where

    T^T X T = [X0  0; 0  Xe],    since T0^T X Te = 0.     (5.49)

For n even, the last column of the 12 block partition of (5.48) yields

    Xe 1 − X0 a0 = 0,

leading to (5.21), and the remaining columns yield

    Xe [I_{n0−1}; 0] + X0 [0; I_{n0−1}] = 0,

which holds by virtue of (5.19). In a similar way, the 22 block of (5.48) yields ae in (5.21). For
n odd, the 11 block of (5.48) and the 12 block of (5.48) yield (5.22).
To prove stability, note that (A, B) is controllable, hence stability is equivalent to X > 0, or
T^T X T > 0, or X0 > 0, Xe > 0. 2

Proof of Theorem 5.2.3. Consider n even and verify (5.46). Let both equations in (5.21) be
arranged in one equation as follows:

    [X0 a0 − Xe 1; Xe ae] = [0; (1/2) 1].     (5.50)

This equation can be rearranged to collect the n elements σi, i = 1, . . . , n contained in X0, Xe
into a single vector σ,

    σ^T = [σ1  σ2  σ3  · · ·  σn],
[Tables 5.2 (a) and (b) (the variances σi in terms of the coefficients ai, n = 2, . . . , 6) appear here; their entries were not recovered in this extraction.]

so that the left-hand side of (5.50) becomes, equivalently, Λe σ, where Λe is given by (5.46), (5.40).
For example, for n = 8, (5.50) is rearranged as follows:

    Λe σ = [a1  −a3  a5  −a7  1  0  0  0;
            0  −a1  a3  −a5  a7  −1  0  0;
            0  0  a1  −a3  a5  −a7  1  0;
            0  0  0  −a1  a3  −a5  a7  −1;
            0  a2  −a4  a6  −a8  0  0  0;
            0  0  −a2  a4  −a6  a8  0  0;
            0  0  0  a2  −a4  a6  −a8  0;
            0  0  0  0  −a2  a4  −a6  a8] [σ1; σ2; σ3; σ4; σ5; σ6; σ7; σ8] = [0; 0; 0; 0; 0; 0; 0; 1/2].

which is defined by the notation in (5.40), and where

    Λe = [Λ0e; Λee] = [Λ0e1  Λ0e2; Λee1  Λee2].     (5.51)

Now consider (5.47). Rearrange (5.47) to get

    [Xe ae − [0  Ie] X0 1; X0 a0] = [0; (1/2) 1].     (5.52)

Similarly, since X0, Xe contain only the n positive numbers σi, i = 1, · · · , n, each of the terms
Xe ae, X0 a0, [0  Ie] X0 1 can be rearranged as some matrix times the n-vector σ. Hence, by
straightforward construction (5.52) leads to Λ0 σ = (1/2) 1 with Λ0 given by

    Λ0 = [Λe0; Λ00] = [0  a2  −a4  a6  −1  0  0;
                       0  0  −a2  a4  −a6  1  0;
                       0  0  0  a2  −a4  a6  −1;
                       a1  −a3  a5  −a7  0  0  0;
                       0  −a1  a3  −a5  a7  0  0;
                       0  0  a1  −a3  a5  −a7  0;
                       0  0  0  −a1  a3  −a5  a7]     (5.53)

leading to the notation (5.38). This completes the proof of Theorem 5.2.3. 2

Theorems 5.2.2 and 5.2.3 have two uses. When G = 0 these theorems characterize the one-to-
one relationship between the coefficients of the characteristic polynomial and the variances of each
state variable, due to a unit intensity white noise input. If G ≠ 0 then a_cl^T = a_ol^T − G, where
a_cl^T and a_ol^T contain the characteristic polynomial coefficients of the closed-loop and open-loop
systems, respectively, and G is the state feedback gain. Obviously, then,

    G = a_ol^T − a_cl^T

where a_cl^T is given by Theorem 5.2.2 expressed as a function of the variances σi.



5.2.4 Output Feedback without Measurement Noise

In this section, we generalize the result of the previous section to the case where not all the states
are available for feedback. However, we assume that a linear combination of the states can be
measured without noise, i.e., Dz = 0 (or equivalently, V = 0). In this case, the covariance
equation is given by

    (A + BGM) X + X (A + BGM)^T + D D^T = 0     (5.54)

where the matrices A, B, M, D and G are the augmented matrices defined in (4.15). Accordingly,
we shall partition the state covariance X and define the Schur complement of X as follows:

    X = [Xp  Xpc; Xpc^T  Xc],    X̄p ≜ Xp − Xpc Xc^{−1} Xpc^T.

Theorem 5.2.4 [159] Let a positive definite matrix X ∈ R^{(np+nc)×(np+nc)} be given. Then the
following statements are equivalent.

(i) There exists a controller G which assigns X as a state covariance.

(ii) X satisfies

    (I − Bp Bp^+)(Ap Xp + Xp Ap^T + W)(I − Bp Bp^+) = 0,     (5.55)

    (I − Mp^+ Mp) X̄p^{−1} (Ap X̄p + X̄p Ap^T + W) X̄p^{−1} (I − Mp^+ Mp) = 0,     (5.56)

    (I − Γ Γ^+)(I − M^+ M) X^{−1} Q = 0

where

    Γ = (I − M^+ M) X^{−1} B B^+.

In this case, all such controllers G are given by

    G = −(1/2) B^+ (Q + S) X^{−1} M^+ + ZF − B^+ B ZF M M^+

where ZF is arbitrary and

    Q = A X + X A^T + D D^T,     (5.57)

    S = [Θ^+ Φ + (I − Θ^+ Θ) SF](I − Θ^+ Θ) − (Θ^+ Φ)^T,     (5.58)

    Θ ≜ [I − B B^+; (I − M^+ M) X^{−1}],    Φ ≜ [−(I − B B^+); (I − M^+ M) X^{−1}] Q

where SF is an arbitrary skew-symmetric matrix.

Proof. A given X > 0 is assignable if and only if there exists a G satisfying (5.54), that is,

    B G M X + (B G M X)^T + Q = 0.

Then the result directly follows from Theorem 2.3.7. In this case, the solvability conditions are
given by

    (I − B B^+) Q (I − B B^+) = 0     (5.59)

    (I − M^+ M) X^{−1} Q X^{−1} (I − M^+ M) = 0     (5.60)

    (I − Γ Γ^+)(I − M^+ M) X^{−1} Q = 0.

Substituting the partitioned matrices defined in (4.15) into (5.59) and (5.60) yields (5.55) and
(5.56), respectively. 2
In Theorem 5.2.4, the controller order nc is fixed by the dimension of the prespecified state
covariance X ∈ R^{(np+nc)×(np+nc)}. The static output feedback case (nc = 0) can be deduced by
replacing the augmented matrices A, B, etc. with the original matrices Ap, Bp, etc., and setting
Xp = X̄p = X.

5.2.5 Static Output Feedback with Noisy Measurements


This section treats the case where the measurements are completely contaminated by noise, i.e.,
Dz Dz^T = V > 0. As in the state feedback case, we consider the covariance controller of static
(constant) feedback gain G. In this case, the covariance equation is given by

    (Ap + Bp G Mp) X + X (Ap + Bp G Mp)^T + (Dp + Bp G Dz)(Dp + Bp G Dz)^T = 0     (5.61)

or equivalently, using the definition (5.1),

    (Ap + Bp G Mp) X + X (Ap + Bp G Mp)^T + W + Bp G V G^T Bp^T = 0.     (5.62)

Theorem 5.2.5 Let a positive definite matrix X ∈ R^{np×np} be given. Then the following statements
are equivalent.

(i) There exists a control gain G which assigns X as a state covariance.

(ii) X satisfies

    (I − Bp Bp^+)(Ap X + X Ap^T + W)(I − Bp Bp^+) = 0,

    Ap X + X Ap^T − X Mp^T V^{−1} Mp X + W + Lp Lp^T = 0

for some matrix Lp ∈ R^{np×nz}.

In this case, all such control gains G are given by

    G = Bp^+ (Lp Up V^{−1/2} − X Mp^T V^{−1}) + (I − Bp^+ Bp) ZF     (5.63)

where ZF is arbitrary and

    Up = VL [I  0; 0  UF] VR^T

    (I − Bp Bp^+) Lp = UL [ΣL  0; 0  0] VL^T     (SVD)

    (I − Bp Bp^+) X Mp^T V^{−1/2} = UL [ΣL  0; 0  0] VR^T     (SVD)

where UF is an arbitrary orthogonal matrix.

Proof. We shall solve the equation (5.62) for G. After expanding, completing the square with
respect to Bp G yields

    (Bp G + X Mp^T V^{−1}) V (Bp G + X Mp^T V^{−1})^T = Q,

    Q = X Mp^T V^{−1} Mp X − Ap X − X Ap^T − W.

Applying Theorem 2.3.9, the above equation is solvable for G if and only if

    Q ≥ 0,    rank(Q) ≤ nz,

    (I − Bp Bp^+)(Q − X Mp^T V^{−1} Mp X)(I − Bp Bp^+) = 0

or equivalently, there exists Lp ∈ R^{np×nz} such that

    Q = Lp Lp^T,

    (I − Bp Bp^+)(Ap X + X Ap^T + W)(I − Bp Bp^+) = 0.

In this case, all solutions G are given by (5.63). 2
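Condition (ii) of Theorem 5.2.5 amounts to a projection test plus a semidefiniteness-and-rank test on Q = X Mp^T V^{−1} Mp X − Ap X − X Ap^T − W. A minimal sketch of that test (illustrative matrices, not from the book; the candidate X is generated as a closed-loop covariance so that it is assignable by construction):

```python
import numpy as np

def static_of_assignable(Ap, Bp, Mp, X, W, V, nz, tol=1e-8):
    """Check condition (ii) of Theorem 5.2.5 for a candidate X > 0."""
    n = Ap.shape[0]
    P = np.eye(n) - Bp @ np.linalg.pinv(Bp)
    c1 = np.linalg.norm(P @ (Ap @ X + X @ Ap.T + W) @ P) < tol
    Q = X @ Mp.T @ np.linalg.solve(V, Mp @ X) - Ap @ X - X @ Ap.T - W
    lam = np.linalg.eigvalsh(Q)
    c2 = lam.min() > -tol            # Q >= 0, so Q = Lp Lp^T for some Lp
    c3 = np.sum(lam > tol) <= nz     # rank(Q) <= nz, so Lp has nz columns
    return c1 and c2 and c3

# Illustrative second-order example with one noisy measurement (nz = 1)
Ap = np.array([[0.0, 1.0], [-2.0, -3.0]])
Bp = np.array([[0.0], [1.0]])
Mp = np.array([[1.0, 0.0]])
V = np.array([[1.0]])
W = np.eye(2)
G0 = np.array([[-1.0]])
Acl = Ap + Bp @ G0 @ Mp
# closed-loop covariance under gain G0, hence assignable by construction
M = np.kron(Acl, np.eye(2)) + np.kron(np.eye(2), Acl)
Wc = W + Bp @ G0 @ V @ G0.T @ Bp.T
X = np.linalg.solve(M, -Wc.flatten()).reshape(2, 2)
ok = static_of_assignable(Ap, Bp, Mp, X, W, V, nz=1)
```

For such an X, the proof above shows Q equals (Bp G0 + X Mp^T V^{−1}) V (·)^T, which is positive semidefinite with rank at most nz, so the test passes.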

5.2.6 Dynamic Output Feedback with Noisy Measurements


In this section, we impose the same assumption (V > 0) as in Section 5.2.5, but we consider the
dynamic output feedback covariance controller. In this case, the state covariance satisfies

    (A + BGM) X + X (A + BGM)^T + (D + BGE)(D + BGE)^T = 0.     (5.64)

The above Lyapunov equation has the same structure as in (5.61) for the static output feedback
case. However, due to the singularity of the matrix

    E E^T = [Dz Dz^T  0; 0  0] = [V  0; 0  0],    V > 0,

the derivation of the covariance controller is much more involved.

Theorem 5.2.6 Let a positive definite matrix X ∈ R^{(np+nc)×(np+nc)} be given. Then the following
statements are equivalent.

(i) There exists a controller which assigns X as a state covariance.

(ii) X satisfies

    (I − Bp Bp^+)(Ap Xp + Xp Ap^T + W)(I − Bp Bp^+) = 0

    Ap X̄p + X̄p Ap^T − X̄p Mp^T V^{−1} Mp X̄p + W + Lp Lp^T = 0

    Q̄ = (B PK)(B PK)^+ Q̄ (M̄ PC PN)^+ (M̄ PC PN)     (5.65)

for some Lp ∈ R^{np×nz}, where

    X ≜ [Xp  Xpc; Xpc^T  Xc],    X̄p ≜ Xp − Xpc Xc^{−1} Xpc^T

    M̄ ≜ [Mp  0],    C ≜ [0  Inc]     (5.66)

    Q̄ = (I − B K^+ PC X^{−1})(A X + X A^T + D D^T) X^{−1} PC PN

    PC ≜ I − C^+ C,    PN ≜ I − N N^+,    PK ≜ I − K^+ K

    K ≜ PC X^{−1} B,    N ≜ K B^+.

In this case, all such controllers are given by

    [Cc; Ac] = −(1/2) B^+ [Q̂ + Θ^+ Φ (I − Θ^+ Θ) − (Θ^+ Φ)^T] X^{−1} C^+
               + B^+ (I − Θ^+ Θ) SF (I − Θ^+ Θ) X^{−1} C^+
               + (I − B B^+) ZF1

    [Dc; Bc] = K^+ (L U V^{−1/2} − PC M̄^T V^{−1})
               − PK (B PK)^+ Q̄ (M̄ PC PN)^+     (5.67)
               − PK ZF2 + PK (B PK)^+ (B PK) ZF2 (M̄ PC PN)(M̄ PC PN)^+

where

    Q̂ = (A + B G1 M̄) X + X (A + B G1 M̄)^T + D D^T + B G1 V G1^T B^T

    Θ ≜ [I − B B^+; PC X^{−1}],    Φ ≜ [−(I − B B^+); PC X^{−1}] Q̂

    L ≜ [Lp; 0],    G1 ≜ [Dc; Bc]

    U ≜ VL [I  0; 0  UF] VR^T     (5.68)

    PN L = UL [ΣL  0; 0  0] VL^T     (SVD)

    PN PC M̄^T V^{−1/2} = UL [ΣL  0; 0  0] VR^T     (SVD)

where SF is an arbitrary skew-symmetric matrix, UF is an arbitrary orthogonal matrix, and ZF1,
ZF2 are arbitrary.

Proof. We shall solve (5.64) for the controller G. Defining M̄ and C as in (5.66) and

    Q ≜ A X + X A^T + D D^T,    G1 ≜ [Dc; Bc],    G2 ≜ [Cc; Ac],

(5.64) can be written

    B G1 V G1^T B^T + B G1 M̄ X + (B G1 M̄ X)^T + B G2 C X + (B G2 C X)^T + Q = 0.

Completing the square with respect to B G1 yields

    0 = (B G1 + X M̄^T V^{−1}) V (B G1 + X M̄^T V^{−1})^T + B G2 C X + (B G2 C X)^T
        + Q − X M̄^T V^{−1} M̄ X

or equivalently, using Q̂ defined as above,

    B G2 C X + (B G2 C X)^T + Q̂ = 0.

From Theorem 2.3.7, the above equation is solvable for G2 if and only if

    (I − B B^+) Q̂ (I − B B^+) = 0     (5.69)

    (I − C^+ C) X^{−1} Q̂ X^{−1} (I − C^+ C) = 0     (5.70)

    (I − N N^+)(I − C^+ C) X^{−1} Q̂ = 0     (5.71)

hold, in which case all solutions G2 are given by (5.67). Using the property (I − B B^+) B = 0,
(5.69) is equivalent to

    (I − B B^+) Q (I − B B^+) = 0.     (5.72)

Note that

    PN N = 0  ⇒  PN N B = (I − N N^+)(I − C^+ C) X^{−1} B = 0.

Hence (5.71) is equivalent to

    PN PC X^{−1} (Q + X M̄^T G1^T B^T) = 0.     (5.73)

Note that (5.70) holds if and only if there exists L ∈ R^{(np+nc)×nz} such that

    PC X^{−1} (Q − X M̄^T V^{−1} M̄ X) X^{−1} PC + L L^T = 0     (5.74)

    L L^T = PC X^{−1} (B G1 + X M̄^T V^{−1}) V (B G1 + X M̄^T V^{−1})^T X^{−1} PC.     (5.75)

Now we claim that, for any given L satisfying (5.74), (5.75) is always solvable for G1 provided
(5.72) holds. This can be shown as follows.
First note that (5.75) can be written

    L L^T = (K G1 + PC M̄^T V^{−1}) V (K G1 + PC M̄^T V^{−1})^T.     (5.76)

Note that (5.76) is solvable for G1 if and only if

    L L^T = (N Ĝ1 + PC M̄^T V^{−1}) V (N Ĝ1 + PC M̄^T V^{−1})^T     (5.77)

is solvable for Ĝ1 since, if G1 exists, then Ĝ1 = B G1 is a solution to (5.77), and if Ĝ1 exists, then
G1 = B^+ Ĝ1 is a solution to (5.76). Hence, from Theorem 2.3.9, there exists G1 satisfying (5.76)
if and only if

    PN (L L^T − PC M̄^T V^{−1} M̄ PC) PN = 0     (5.78)

holds, in which case all such G1 are given by

    G1 = K^+ (L U V^{−1/2} − PC M̄^T V^{−1}) + PK ZK     (5.79)

where ZK is arbitrary and U is given in (5.68). Using (5.74), the existence condition (5.78) is
equivalent to

    PN PC X^{−1} Q X^{−1} PC PN = 0.     (5.80)

Since Q satisfies (5.72), there exists ZQ such that

    Q = ZQ − (I − B B^+) ZQ (I − B B^+).

Noting that PN PC X^{−1} B B^+ = PN N = 0, it is easy to see that (5.80) holds. Thus (5.75) is
solvable for G1, and all solutions are given by (5.79). Finally, we need to find the existence
condition and all matrices UF and ZK in (5.79) such that (5.73) holds. Substituting (5.79) into
(5.73) yields

    Q X^{−1} PC PN + B K^+ (L U V^{−1/2} − PC M̄^T V^{−1}) M̄ PC PN + B PK ZK M̄ PC PN = 0.

Using the definition of U in (5.68), the second term in the above equation is

    B K^+ (L VL [I  0; 0  UF] VR^T V^{−1/2} − PC M̄^T V^{−1}) M̄ PC PN
        = B K^+ (L VL [I  0; 0  UF] [ΣL  0; 0  0]^T UL^T − PC M̄^T V^{−1} M̄ PC) PN
        = B K^+ (L L^T − PC M̄^T V^{−1} M̄ PC) PN
        = −B K^+ PC X^{−1} Q X^{−1} PC PN

where we used (5.74) in the last equality. Hence we have

    (I − B K^+ PC X^{−1}) Q X^{−1} PC PN = −B PK ZK M̄ PC PN


or equivalently,

    Q̄ = −B PK ZK M̄ PC PN.

There exists ZK solving the above equation if and only if

    Q̄ = (B PK)(B PK)^+ Q̄ (M̄ PC PN)^+ (M̄ PC PN)

holds, in which case all such ZK are

    ZK = −(B PK)^+ Q̄ (M̄ PC PN)^+ + ZF2 − (B PK)^+ (B PK) ZF2 (M̄ PC PN)(M̄ PC PN)^+     (5.81)

where ZF2 is arbitrary. Substituting (5.81) into (5.79) yields (5.67). Finally, the first two conditions
in statement (ii) can be obtained by substituting the augmented matrices into (5.72) and (5.74). 2

5.2.7 Structure of Covariance Controllers


In this section, we shall consider full-order (nc = np ) dynamic controllers. The closed-loop state
covariance

X = [ Xp    Xpc ]
    [ XTpc  Xc  ]          (5.82)
has dimensions 2np × 2np . For simplicity, we assume that the square matrix Xpc is invertible.
Under this assumption, the following theorem shows that the third assignability condition (5.65)
in Theorem 5.2.6 becomes redundant, and the structure of all full-order covariance controllers is
shown to be observer-based.

Theorem 5.2.7 Let a positive definite matrix X ∈ R2np ×2np be given. Then the following state-
ments are equivalent.

(i) There exists a full-order dynamic controller which assigns X as a state covariance.

(ii) X satisfies

(I − Bp B+p)(Ap Xp + Xp ATp + W)(I − Bp B+p) = 0,          (5.83)
Ap X̄p + X̄p ATp − X̄p MTp V−1 Mp X̄p + W + Lp LTp = 0          (5.84)

for some Lp ∈ Rnp×nz, where

X̄p ≜ Xp − Xpc X−1c XTpc.          (5.85)

If these conditions hold, all controllers which assign X are given by

Ac = Xc X−1pc (Ap + Bp Dc Mp + Lp V1/2 BTc X−1pc) Xpc X−1c
         − Bc Mp Xpc X−1c + Xc X−1pc Bp Cc,          (5.86)
Bc = Xc X−1pc (X̄p MTp V−1 + Bp Dc − Lp V−1/2),          (5.87)
Cc = −(1/2) B+p Q̂p (2I − Bp B+p) X−Tpc
         + B+p SF Bp B+p X−Tpc + (I − B+p Bp) ZF,          (5.88)
Dc = arbitrary,          (5.89)

where ZF is arbitrary, SF is an arbitrary skew-symmetric matrix, and

Q̂p = (Ap + Bp Dc Mp) Xp + Xp (Ap + Bp Dc Mp)T + W + Bp Dc V DTc BTp.          (5.90)

Proof. Consider the covariance equation



Q ≜ (A + BGM)X + X(A + BGM)T + (D + BGE)(D + BGE)T = 0.

Computing each partitioned block of Q, we have



Q11 = Q̂p + Bp Cc XTpc + Xpc CTc BTp = 0, (5.91)


Q12 = (Ap + Bp Dc Mp)Xpc + Xpc ATc + (Xp MTp + Bp Dc V)BTc + Bp Cc Xc = 0,          (5.92)

Q22 = Ac Xc + Xc ATc + Bc Mp Xpc + XTpc MTp BTc + Bc V BTc = 0,          (5.93)

where Q̂p is defined by (5.90).


Necessity: Suppose a given matrix X > 0 is assignable as a state covariance. Then there exists a
controller (Ac, Bc, Cc, Dc) satisfying (5.91)-(5.93). Pre- and post-multiplying (5.91) by (I − Bp B+p) immediately yields (5.83). Let

Px ≜ [ I   −Xpc X−1c ],     Q̄ ≜ Px Q PTx.          (5.94)

Using (5.91)-(5.93), (5.94) yields

Q̄ = Q̄p + (Bp Dc − Xpc X−1c Bc)Mp X̄p + X̄p MTp (Bp Dc − Xpc X−1c Bc)T
        + (Bp Dc − Xpc X−1c Bc)V(Bp Dc − Xpc X−1c Bc)T = 0          (5.95)

where

Q̄p = Ap X̄p + X̄p ATp + W. (5.96)

Completing the square, (5.95) is equivalent to

Q̄p − X̄p MTp V−1 Mp X̄p + Lp LTp = 0 (5.97)

where
Lp ≜ (Bp Dc − Xpc X−1c Bc + X̄p MTp V−1) V1/2.          (5.98)

Thus the condition (5.84) is necessary.


Sufficiency: Suppose the assignability conditions (5.83) and (5.84) are satisfied for a matrix X > 0.
Sufficiency will be shown by constructing all controller matrices satisfying (5.91)-(5.93). Applying
Theorem 2.3.9 to the equation (5.91), the existence of Cc satisfying Q11 = 0 is guaranteed ( for
any choice of Dc ) by (5.83) and invertibility of Xpc , and all such matrices Cc are given by
Cc = −(1/2) B+p Q̂p (2I − Bp B+p) X−Tpc
         + B+p SF Bp B+p X−Tpc + (I − B+p Bp) ZF          (5.99)

where ZF is arbitrary and SF is an arbitrary skew-symmetric matrix. Now, instead of solving


(5.92) and (5.93), we consider the following equivalent equation:
[ I   −Xpc X−1c ] [ Q12 ]
[ 0        I    ] [ Q22 ]  =  0.          (5.100)

The first row block of (5.100) gives

(Ap + Bp Dc Mp − Xpc X−1c Bc Mp) Xpc + Bp Cc Xc − Xpc X−1c Ac Xc + Lp V1/2 BTc = 0.

Solving for Ac , we have

Ac = Xc X−1pc (Ap + Bp Dc Mp + Lp V1/2 BTc X−1pc) Xpc X−1c
         − Bc Mp Xpc X−1c + Xc X−1pc Bp Cc.          (5.101)

Finally, we need to show the existence of Bc and Dc satisfying the second row block of (5.100), or
Q22 = 0, with Ac and Cc given by (5.101) and (5.99), respectively. Recall that, if any solution
Bc exists, then it must satisfy (5.98) for some Lp where Lp satisfies (5.97). Solving (5.98) for Bc ,
yields (5.87):

Bc = Xc X−1pc (X̄p MTp V−1 + Bp Dc − Lp V−1/2).          (5.102)
Now we claim that, given any matrix Dc , the controller matrices Ac , Bc and Cc given by (5.101),
(5.102) and (5.99) satisfy Q22 = 0. This can be verified as follows. Substituting Ac and Bc into
(5.93), then using (5.91) to eliminate Cc , we have

Q22 = −Xc X−1pc (Q̄p − X̄p MTp V−1 Mp X̄p + Lp LTp) X−Tpc Xc = 0

where the last equality holds due to (5.97). This completes the proof. 2
The following corollary simplifies Theorem 5.2.7.

Corollary 5.2.2 Assume that (Ap , Dp ) is controllable and (Ap , Bp , Mp ) is a stabilizable and de-
tectable triple, where from (5.1), Dp DTp = W. Let a positive definite matrix Xp ∈ Rnp ×np be given.
Then the following statements are equivalent:

(i) There exists a full-order (nc = np ) controller which assigns Xp as a plant state covariance.

(ii) Xp satisfies

(I − Bp B+p)(Xp ATp + Ap Xp + W)(I − Bp B+p) = 0,          (5.103)
Xp > P,

where P > 0 is the solution to

P ATp + Ap P − P MTp V−1 Mp P + W = 0          (5.104)

which stabilizes

Ap − P MTp V−1 Mp.          (5.105)



Proof. From Theorem 5.2.7 statement (i) is equivalent to (5.83) and the existence of Xpc , Xc > 0,
and Lp satisfying (5.84) and (5.85). To prove that (i) implies (ii), suppose (5.83) and (5.84) hold.
We need the following lemma.

Lemma 5.2.4 Let matrices A, Q, and Wi (i = 1, 2) be given, where Q ≥ 0. Suppose Xi (i = 1, 2)


satisfy

0 = Xi AT + AXi − Xi QXi + Wi , i = 1, 2. (5.106)

Then, X2 ≥ X1 if W2 ≥ W1 and A − X2 Q is stable.

Proof. Subtract equations (5.106) to get, for X̃ = X2 − X1 ,

0 = X̃AT + AX̃ + W2 − W1 − X2 QX2 + X1 QX1


= X̃(AT − QX2 ) + (A − X2 Q)X̃ + Q̃

where

Q̃ = X̃QX̃ + W2 − W1 ≥ 0.

Since A − X2 Q is stable,

X̃ = ∫0∞ e(A−X2Q)t Q̃ e(A−X2Q)T t dt ≥ 0.

This proves Lemma 5.2.4. 2
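Lemma 5.2.4 is easy to exercise numerically. The sketch below (illustrative data, not from the text) solves the two Riccati equations (5.106) with Q = MT M (i.e., V = I) and W2 ≥ W1 using SciPy's CARE solver, whose stabilizing solution satisfies the stability hypothesis of the lemma, and confirms X2 ≥ X1.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative data: Q = M^T M, corresponding to V = I in the filter Riccati.
A = np.array([[0.0, 1.0], [-2.0, -1.0]])
M = np.array([[1.0, 0.0]])
V = np.eye(1)
Q = M.T @ M

W1 = np.eye(2)           # smaller forcing term
W2 = 3.0 * np.eye(2)     # W2 >= W1

# solve_continuous_are(a, b, q, r) returns the stabilizing X of
# a^T X + X a - X b r^-1 b^T X + q = 0; with a = A^T, b = M^T this is
# exactly (5.106): 0 = X A^T + A X - X Q X + Wi.
X1 = solve_continuous_are(A.T, M.T, W1, V)
X2 = solve_continuous_are(A.T, M.T, W2, V)

# Residual of (5.106) for X1, and the monotonicity claim X2 >= X1.
res1 = X1 @ A.T + A @ X1 - X1 @ Q @ X1 + W1
print(np.linalg.norm(res1), np.linalg.eigvalsh(X2 - X1).min())
```

The smallest eigenvalue of X2 − X1 comes out nonnegative, as the lemma predicts.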


From Lemma 5.2.4, note that the stabilizing solution P to (5.104) and X̄p in (5.84) satisfy
X̄p ≥ P for any choice of Lp in (5.84). Since Xpc is of full rank (assumed in Section 5.2.7),

Xp = X̄p + Xpc X−1c XTpc > X̄p.

Hence, Xp > P, and (i) implies (ii).


To prove the converse, assume that (ii) holds. We will construct Xpc , Xc > 0, and Lp satisfying
(5.84) and (5.85). Choose Lp = 0. Then X̄p = P, and hence Xp − X̄p > 0. Now we can choose

Xpc = Xc = (Xp − X̄p )

Clearly this choice satisfies (5.85). This completes the proof of Corollary 5.2.2. 2
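The projection condition (5.103) is the easy half to visualize: for any feedback gain, the closed-loop Lyapunov equation forces Ap Xp + Xp ATp + W to equal −(Bp G Xp + Xp GT BTp), which is annihilated by I − Bp B+p on both sides. A minimal sketch with illustrative numbers (not from the text):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

Ap = np.diag([-1.0, -2.0, -3.0])
Bp = np.array([[1.0], [0.0], [1.0]])
G  = np.array([[-1.0, 0.5, 0.0]])     # a gain keeping Ap + Bp G stable
W  = np.eye(3)

Acl = Ap + Bp @ G
# Closed-loop covariance: Acl X + X Acl^T + W = 0.
X = solve_continuous_lyapunov(Acl, -W)

# Ap X + X Ap^T + W = -(Bp G X + X G^T Bp^T) lies in the range of Bp,
# so projecting with I - Bp Bp^+ on both sides gives zero, cf. (5.103).
Pperp = np.eye(3) - Bp @ np.linalg.pinv(Bp)
residual = Pperp @ (Ap @ X + X @ Ap.T + W) @ Pperp
print(np.linalg.norm(residual))
```

The printed residual norm is at machine-precision level, so any attainable plant covariance necessarily satisfies (5.103).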
In Theorem 5.2.5, an assignable state covariance can be constructed by solving the linear equa-
tion (5.103) for Xp > P and the Riccati equation (5.84) for X̄p > P. In this case, the parameter
Lp must be chosen such that X̄p < Xp since, from (5.85), we have

Xp − X̄p = Xpc Xc−1 XTpc > 0

where the right-hand side is positive definite since Xc > 0 and Xpc is square nonsingular. If
X̄p < Xp holds, then we can find Xpc and Xc > 0 satisfying the above equality. Thus we have

an assignable state covariance as in (5.82) where the positive definiteness of X is guaranteed by


Xc > 0 and X̄p > 0. Note that the choice of Xpc and Xc for given Xp and X̄p is immaterial and
contributes only to the coordinate transformation on the controller states. If we choose

Xpc = Xc = (Xp − X̄p ) = X̄c ,

and consider the strictly proper controller (Dc = 0), then the controller formulas (5.86)-(5.88)
become

Ac = Ap − Bc Mp + Bp Cc + Lp V1/2 BTc X̄−1c,          (5.107)
Bc = X̄p MTp V−1 − Lp V−1/2,          (5.108)
Cc = −(1/2) B+p (Ap Xp + Xp ATp + W)(2I − Bp B+p) X̄−1c
         + B+p SF Bp B+p X̄−1c + (I − B+p Bp) ZF          (5.109)

where ZF is arbitrary and SF is an arbitrary skew-symmetric matrix. Note that the controller has
the following observer-based structure:

ẋc = Ap xc + Bp u + Bc (z − Mp xc) + Lp V1/2 BTc X̄−1c xc,
u = Cc xc.
If we call the estimator part of the covariance controller obtained by choosing Lp = 0 the “central
estimator”, it is apparent that the central estimator is the Kalman filter. Recall that the Riccati
solution X̄p for the Kalman filter (Lp = 0) (we shall denote this solution by P) has a physical
significance: it is the estimation error covariance [1]. In fact, for nonzero choices of Lp , the Riccati
solution X̄p can still be considered as the estimation error covariance since

E[(xp − xc )(xp − xc )T ] = E[xp xTp ] − E[xp xTc ] − E[xc xTp ] + E[xc xTc ]
= Xp − X̄c − X̄c + X̄c = X̄p .

Since P satisfies P ≤ X̄p for any choice of Lp , the Kalman filter optimizes not only the scalar
objective tr(X̄p ) as in the standard LQG theorem but also the matrix-valued, or multiobjective
function X̄p (in the sense P ≤ X̄p over all Lp ). On the other hand, nonzero choices of the free
matrix Lp may improve some other performances such as robustness.
For the estimated-state feedback part of the covariance controller, we see that the feedback gain
Cc given by (5.109) is identical to the state feedback covariance controller G in (5.3) if we replace
X̄c by Xp . This difference can be interpreted as the compensation for the estimation error due to
noisy measurements by subtracting the error covariance X̄p from Xp to obtain X̄c . Also note that
the set of all assignable plant state covariances with full-order controllers is a subset of that with
state feedback with the additional constraint X̄p < Xp . This fact makes sense physically since the
estimation error covariance X̄p is zero if all the states are available without noise.
Finally, we see that the separation principle does not hold for the covariance controller, i.e.
the state estimator and the estimated-state feedback gain cannot be designed separately since the

determination of the estimator parameters Ac and Bc involves closed-loop information Xp , and


the computation of the estimated-state feedback gain Cc requires the estimator information X̄p .
Nevertheless, the plant state covariance Xp and the estimation error covariance X̄p to be assigned
can be specified by solving the linear equation (5.83) and the Riccati equation (5.84) with a simple
constraint 0 < X̄p < Xp .
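The construction above can be checked end to end. The sketch below (illustrative data, not from the text) takes Bp square and invertible, so that the projection condition is trivial and the SF and ZF terms vanish, chooses Lp = 0 and Dc = 0, builds (Ac, Bc, Cc) from (5.107)-(5.109) with X̄c = Xp − P, and verifies that X = [Xp X̄c; X̄c X̄c] satisfies the closed-loop covariance equation.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative plant data (not from the text); Bp invertible, so 2I - Bp Bp^+ = I.
Ap = np.array([[0.0, 1.0], [-2.0, -1.0]])
Bp = np.eye(2)
Mp = np.array([[1.0, 0.0]])
W, V = np.eye(2), np.eye(1)

# Kalman filter Riccati equation (5.104): Ap P + P Ap^T - P Mp^T V^-1 Mp P + W = 0.
P = solve_continuous_are(Ap.T, Mp.T, W, V)

Xp = P + np.eye(2)         # any Xp > P is assignable here
Xbc = Xp - P               # Xbar_c = Xpc = Xc

# Controller (5.107)-(5.109) with Lp = 0, Dc = 0, SF = 0:
Bc = P @ Mp.T @ np.linalg.inv(V)                     # Kalman gain
Cc = -0.5 * np.linalg.inv(Bp) @ (Ap @ Xp + Xp @ Ap.T + W) @ np.linalg.inv(Xbc)
Ac = Ap - Bc @ Mp + Bp @ Cc

# Closed loop and the assigned covariance X = [Xp Xbar_c; Xbar_c Xbar_c].
Acl = np.block([[Ap, Bp @ Cc], [Bc @ Mp, Ac]])
Wcl = np.block([[W, np.zeros((2, 2))], [np.zeros((2, 2)), Bc @ V @ Bc.T]])
X = np.block([[Xp, Xbc], [Xbc, Xbc]])

print(np.linalg.norm(Acl @ X + X @ Acl.T + Wcl))     # ~ 0: X is assigned
```

Note how the construction mirrors the discussion above: the estimator block is the Kalman filter, the plant covariance Xp is whatever was chosen above P, and the estimation error covariance is X̄p = P.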

5.3 Discrete-Time Covariance Controllers

5.3.1 State Feedback

Consider the linear system given by (4.31) where we assume that all the states are available for
feedback, i.e., Mp = I and Dz = 0. With a controller of constant state feedback gain G, the
closed-loop state covariance X satisfies the following Lyapunov equation.

X = (Ap + Bp G)X(Ap + Bp G)T + W. (5.110)

Theorem 5.3.1 Let a positive definite matrix X ∈ Rnp ×np be given. Then the following statements
are equivalent.

(i) There exists a state feedback gain G which assigns X as a state covariance.

(ii) X satisfies

(I − Bp B+p)(X − Ap X ATp − W)(I − Bp B+p) = 0,
X ≥ W.

In this case, all such control gains G are given by

G = B+p (Lp Up X−1/2 − Ap) + (I − B+p Bp) ZF

where ZF is arbitrary and

Lp ≜ (X − W)1/2,
Up ≜ VL [ I 0; 0 UF ] VTR,
(I − Bp B+p) Lp = UL [ ΣL 0; 0 0 ] VTL,      (SVD)
(I − Bp B+p) Ap X1/2 = UL [ ΣL 0; 0 0 ] VTR,      (SVD)

where UF is an arbitrary orthogonal matrix.



Proof. Rearranging (5.110), we have

(Bp G + Ap )X(Bp G + Ap )T = X − W.

Then the result directly follows from Theorem 2.3.9. Note that the rank condition

rank(X − W) ≤ np

is redundant since the dimension of X − W is np . 2
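When Bp is square and invertible, the first condition of Theorem 5.3.1 is trivial, the SVD factors collapse (Up may be taken as I), and the gain formula can be evaluated directly. A hedged sketch with illustrative numbers (not from the text):

```python
import numpy as np
from scipy.linalg import sqrtm

Ap = np.array([[0.5, 0.2], [0.0, 0.3]])
Bp = np.eye(2)                                   # invertible: I - Bp Bp^+ = 0
W  = 0.5 * np.eye(2)
X  = np.array([[2.0, 0.3], [0.3, 1.0]])          # target covariance, X >= W

Lp = np.real(sqrtm(X - W))                       # Lp Lp^T = X - W
G  = np.linalg.inv(Bp) @ (Lp @ np.linalg.inv(np.real(sqrtm(X))) - Ap)

# Check the assigned covariance: (Ap + Bp G) X (Ap + Bp G)^T + W = X,
# since Ap + Bp G = Lp X^{-1/2} and Lp X^{-1/2} X X^{-1/2} Lp^T = X - W.
Acl = Ap + Bp @ G
print(np.linalg.norm(Acl @ X @ Acl.T + W - X))   # ~ 0
```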

5.3.2 Output Feedback


In this section, we generalize the result of the previous section for the case where not all the
states are available for feedback. As opposed to the case for continuous-time systems, we can allow

general structure for the measurement noise for the discrete-time plant, that is, V = Dz DTz can
have arbitrary rank (of course it should be less than or equal to nz by the dimension constraint).
However, we assume that there is no redundant measured output, or equivalently, Mp MTp > 0.
Practically speaking, this assumption is reasonable since we don’t want to add a costly sensor to
obtain redundant information, and also from the theoretical point of view, this assumption can
be removed by increasing the complexity of the result. Consider the dynamic output feedback
controller of fixed-order given in (4.31). The closed-loop state covariance satisfies

X = (A + BGM)X(A + BGM)T + (D + BGE)(D + BGE)T . (5.111)

where X0 = 0, W0 = I in (4.36).

Theorem 5.3.2 Let a positive definite matrix X ∈ R(np +nc )×(np +nc ) be given. Then the following
statements are equivalent.

(i) There exists a controller G which assigns X as a state covariance.

(ii) X satisfies
(I − BB+ )(X − AXAT − DDT )(I − BB+ ) = 0,

X = AXAT − AXMT (MXMT + EET )−1 MXAT + DDT + LLT

for some L ∈ R(np +nc )×(nz +nc ) .

In this case, all such controllers G are given by

G = B+ (LUR− 2 − AXMT R−1 ) + (I − B+ B)ZF


1
(5.112)

where ZF is arbitrary and



R = MXMT + EET ,
" #
∆ I 0 T
U = VL VR ,
0 UF
142 CHAPTER 5. COVARIANCE CONTROLLERS
" #
ΣL 0
(I − BB )L = UL
+ T
VL , (SVD)
0 0
" #
− 12 ΣL 0
(I − BB )AXM R
+ T
= UL T
VR , (SVD)
0 0
where UF is an arbitrary orthogonal matrix.

Proof. Noting that EDT = 0 since we assume that there is no correlation between the process
and measurement noises (Dp DTz = 0), the covariance equation (5.111) can be expanded as

X = AXAT + DDT + BGMXAT + (BGMXAT )T + BGRGT BT

where

R = MXMT + EET .

Since X > 0 and there is no redundant sensor (MMT > 0), we have R > 0 for any matrix E.
Hence, we can complete the square with respect to BG as follows.

(BG + AXMT R−1 )R(BG + AXMT R−1 )T = Q

Q = X − AXAT + AXMT R−1 MXAT − DDT .


Using Theorem 2.3.9, the above equation is solvable for G if and only if

Q ≥ 0, rank(Q) ≤ np + nc , (5.113)

(I − BB+ )(Q − AXMT R−1 MXAT )(I − BB+ ) = 0

hold, in which case, all solutions G are given by (5.112), where L ∈ R(np +nc )×(nc +nz ) is such that
Q = LLT . Note that the existence of such matrix L is equivalent to the condition (5.113). 2

5.3.3 Plant Covariance Assignment


In the previous section, we have considered a problem of assigning a matrix X > 0 as a closed-loop
state covariance. For a dynamic controller of order nc , the closed-loop state covariance X can be
partitioned as follows:

X = [ Xp    Xpc ]
    [ XTpc  Xc  ]          (5.114)
where Xp is the plant state covariance, Xc is the controller state covariance, and Xpc is the correla-
tion between the plant and controller states. Since Xc and Xpc are dependent upon the choice of the
controller coordinate, there is less motivation to assign a prespecified matrix as a controller state
covariance or a correlation (except for the case with observer-based control where these matrices
are related to the estimation error covariance). Moreover, an important output performance can
be completely specified by the plant state covariance Xp. These facts motivate us to ask: (1) When
is a given matrix Xp assignable as a plant state covariance? (2) If Xp is assignable, what are the

controllers which assign this Xp ? Note that we can find all controllers which assign a given Xp using
the controller formula given in (5.112) if we can find all assignable closed-loop state covariances X
whose 11-block is a specified assignable plant state covariance Xp . The following theorem answers
these questions.

Theorem 5.3.3 Let a positive definite matrix Xp ∈ Rnp ×np be given. Then the following state-
ments are equivalent.

(i) There exists a controller (of some unspecified order) which assigns Xp as a plant state covari-
ance.

(ii) Xp satisfies

(I − Bp B+p)(Xp − Ap Xp ATp − W)(I − Bp B+p) = 0,
Xp ≥ P,

where P > 0 is the solution to

P = Ap P ATp − Ap P MTp (Mp P MTp + V)−1 Mp P ATp + W

which stabilizes Ap − Ap P MTp (Mp P MTp + V)−1 Mp.

Suppose Xp is assignable as a plant state covariance. Then all assignable closed-loop state co-
variances X > 0 whose 11-block is Xp can be constructed as follows. Choose an arbitrary matrix
Lp ∈ Rnp ×nz such that the stabilizing solution X̄p > 0 to

X̄p = Ap X̄p ATp − Ap X̄p MTp (Mp X̄p MTp + V)−1 Mp X̄p ATp + W + Lp LTp

satisfies
X̄p ≤ Xp .

Then let Xc and Xpc be any matrix factors such that

Xpc X−1c XTpc = Xp − X̄p,     Xc > 0,

and form the matrix

X = [ Xp    Xpc ]
    [ XTpc  Xc  ]

which is assignable as a closed-loop state covariance.

Proof. From Theorem 5.3.2, a given positive definite matrix X ∈ R(np +nc )×(np +nc ) is assignable
as a closed-loop state covariance if and only if it satisfies

(I − BB+ )(X − AXAT − DDT )(I − BB+ ) = 0,

X = AXAT − AXMT (MXMT + EET )−1 MXAT + DDT + LLT



for some L ∈ R(np +nc )×(nz +nc ) . Substituting the augmented matrices defined in (4.32), (5.114) and
L = [ L1 ],     L1 ∈ Rnp×(nc+nz),  L2 ∈ Rnc×(nc+nz),
    [ L2 ]

we have
(I − Bp B+p)(Xp − Ap Xp ATp − W)(I − Bp B+p) = 0,          (5.115)

X̄p = Ap X̄p ATp − Ap X̄p MTp (Mp X̄p MTp + V)−1 Mp X̄p ATp + W + L1 LT1 − Xpc X−1c XTpc,

Xpc = L1 LT2,     Xc = L2 LT2,

where
X̄p ≜ Xp − Xpc X−1c XTpc.          (5.116)

Let

L2 = U [ Σ  0 ] VT,      (SVD)
[ L̄1  L̄2 ] = L1 V,     L̄1 ∈ Rnp×nc,  L̄2 ∈ Rnp×nz,

where, in the SVD of L2, we used the fact that L2 has full row rank since Xc = L2 LT2 > 0. Using
these definitions, we have

X̄p = Ap X̄p ATp − Ap X̄p MTp (Mp X̄p MTp + V)−1 Mp X̄p ATp + W + L̄2 L̄T2 , (5.117)

Xpc = L̄1 ΣUT , Xc = UΣ2 UT . (5.118)

Therefore, X > 0 is assignable if and only if (5.115)-(5.118) hold for some L̄1 , L̄2 , U and Σ where
U is an orthogonal matrix and Σ is a diagonal matrix with positive diagonal elements. Note that
(5.118) is always satisfied by some L̄1 , U and Σ for any given Xpc and Xc > 0. Thus we need only
(5.115) and (5.117) for X > 0 to be assignable.
Now, a given matrix Xp > 0 is assignable as a plant state covariance if and only if it satisfies
(5.115) and there exists L̄2 such that the positive definite solution X̄p to (5.117) satisfies

X̄p ≤ Xp ,

which is necessary and sufficient for the existence of Xpc and Xc > 0 such that (5.116) holds. Conceptually, such a matrix L̄2 exists if and only if Xp is "larger" than the "smallest" X̄p satisfying (5.117)
for some L̄2 . Let the solution X̄p for the choice L̄2 = 0 be denoted by P. From the monotonicity
property of the stabilizing solution X̄p > 0 to the Riccati equation (5.117) with respect to the
forcing term L̄2 L̄T2 , the “smallest” X̄p is given by P. The term “smallest” means that P ≤ X̄p
holds for any choice of L̄2 . Thus, if Xp does not satisfy P ≤ Xp , then there is no L̄2 such that
X̄p ≤ Xp . Conversely, if Xp satisfies P ≤ Xp , then a choice L̄2 = 0 will surely yield X̄p ≤ Xp
(since X̄p = P in this case). This completes the proof. 2
For a given assignable plant state covariance Xp , all controllers which assign this Xp can be
obtained by first constructing an assignable closed-loop state covariance X by solving the Riccati

equation, then computing a controller using the explicit formula given in Theorem 5.3.2. In this
case, the controller order nc is determined when we factor the matrix Xp − X̄p to find Xpc and Xc ,
and is minimal if the factorization is such that Xpc has full column rank, in which case, we have

nc = rank(Xp − X̄p ).

Hence, to assign a given plant state covariance Xp , we can search for a lower-order controller
(nc < np ) by choosing the parameter Lp to reduce the rank of Xp − X̄p .
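The order count nc = rank(Xp − X̄p) is easy to illustrate: with the choice L̄2 = 0 the Riccati solution is X̄p = P, and a rank-one increment of P then calls for a first-order controller. A sketch using SciPy's DARE solver with its dual arguments (illustrative data, not from the text):

```python
import numpy as np
from scipy.linalg import solve_discrete_are

Ap = np.array([[0.5, 0.2], [0.0, 0.3]])
Mp = np.array([[1.0, 0.0]])
W, V = np.eye(2), np.eye(1)

# Filter DARE: P = Ap P Ap^T - Ap P Mp^T (Mp P Mp^T + V)^-1 Mp P Ap^T + W.
# solve_discrete_are(a, b, q, r) with a = Ap^T, b = Mp^T gives this dual form.
P = solve_discrete_are(Ap.T, Mp.T, W, V)

v  = np.array([[1.0], [1.0]])
Xp = P + v @ v.T                  # assignable: Xp >= P, rank(Xp - P) = 1

nc = np.linalg.matrix_rank(Xp - P)
print("minimal controller order:", nc)
```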
Finally, the plant covariance assignment formulation considered in this section can also be
applied to the full-order covariance controllers for continuous-time systems (see Section 5.2.7).
However, the formulation for more general (the controller order nc < np ) continuous-time covariance
controllers is much more difficult due to the third assignability condition (5.65) in Theorem 5.2.6.
The reader may want to prove a result parallel to Theorem 5.2.7 for the discrete-time system.

5.4 Minimal Energy Covariance Control


Covariance control theory provides a parametrization of all controllers which assign a particular
state covariance matrix to the closed-loop system. Such controllers are not unique. In this section
we obtain the analytic expressions for the covariance controllers which minimize the required control
effort. Both continuous-time and discrete-time systems are discussed.

5.4.1 Continuous-Time Output Feedback


We seek the continuous-time covariance controller to minimize the required control effort

‖u‖ = limt→∞ E[uT(t) R u(t)]1/2          (5.119)

subject to
limt→∞ E[x(t) xT(t)] = X          (5.120)
where R > 0 is a given weighting matrix and X > 0 is a given assignable covariance. An equivalent
deterministic formulation of the control effort (5.119) can be provided as in Chapter 4. We have
seen in Section 5.2 that all covariance controllers which assign X to the closed-loop system (4.15)
can be parametrized in the following form (Theorem 5.2.1),

G = G1 + G2 SF G3 (5.121)

where SF is an arbitrary skew-symmetric matrix and the G1 , G2 and G3 are known matrices
which depend on the plant parameters and the state covariance X. For simplicity we have assumed
that the input matrix Bp has full column rank and the measurement matrix Mp has full row rank.
Hence B+ B = I and MM+ = I and in this case ZF − B+ BZF MM+ = 0 in the parametrization
of all controllers provided in Theorem 5.2.4.
The following result provides the solution of the continuous-time minimum effort covariance
control.

Theorem 5.4.1 The continuous-time covariance controller G, which solves the minimum effort
covariance control problem is provided by any choice of the skew-symmetric matrix SF which solves
the following generalized Sylvester equation

K1 SF K2 + K2 SF K1 + K3 = 0 (5.122)

where

K1 = GT2 IT1 RI1 G2 = KT1


K2 = G3 MXMT GT3 = KT2
K3 = GT2 IT1 RI1 G1 MXMT GT3 − G3 MXMT GT1 IT1 RI1 G2 = −KT3
(5.123)

and I1 = [I 0].

Proof. The control effort ‖u‖ can be computed as follows:

‖u‖2 = E∞[(Cc xc + Dc Mp xp)T R (Cc xc + Dc Mp xp)]
     = E∞[(I1 G M x)T R (I1 G M x)] = tr(I1 G M X MT GT IT1 R).

Substituting the expression G = G1 + G2 SF G3 we obtain

‖u‖2 = tr[I1 (G1 + G2 SF G3) M X MT (G1 + G2 SF G3)T IT1 R]
     = tr(I1 G1 M X MT GT1 IT1 R)
       + 2 tr(I1 G2 SF G3 M X MT GT1 IT1 R)
       + tr(I1 G2 SF G3 M X MT GT3 STF GT2 IT1 R).

Since we are looking for a skew-symmetric solution, we substitute SF = (SF1 − STF1)/2 and minimize with respect to SF1. The minimum control effort solution is obtained by differentiating the above expression with respect to SF1, which provides the following condition for a global minimizer:

GT2 IT1 R I1 G2 (SF1 − STF1) G3 M X MT GT3          (5.124)
+ G3 M X MT GT3 (SF1 − STF1) GT2 IT1 R I1 G2          (5.125)
+ 2 GT2 IT1 R I1 G1 M X MT GT3 − 2 G3 M X MT GT1 IT1 R I1 G2 = 0.          (5.126)

Substituting (SF 1 − STF 1 )/2 = SF provides the condition (5.122). 2


Next we provide an explicit expression for the skew-symmetric solutions of the equation (5.122)
using a Kronecker matrix algebra approach. (Note that the equation (5.122) always has a skew-
symmetric solution.) To begin we need the following lemma [87].

Lemma 5.4.1 Let SF ∈ Rk×k be a skew-symmetric matrix, and let

s = [s12, s13, . . . , s1k, s23, . . . , s2k, . . . , s(k−1)k]T          (5.127)

where SF = [sij]. Then there exists a k2 × k(k − 1)/2 matrix ∆ such that

vec SF = ∆ s          (5.128)

where the vec operator stacks the columns of a matrix one underneath the other.

Note that the columns of ∆ form a basis for the vector space of k × k skew-symmetric matrices.
The choices of the skew-symmetric matrices SF which provide a solution to the minimum effort
covariance control problem are parametrized in the following result.

Theorem 5.4.2 The minimum effort continuous-time covariance controller G is provided by the
following choices of the skew-symmetric matrix SF in the parametrization (5.121)

vec SF = −∆ (K∆)+ vec(K3) + ∆ [I − (K∆)+ K∆] q          (5.129)

where K ≜ KT2 ⊗ K1 + KT1 ⊗ K2, q is an arbitrary (np + nc)(np + nc − 1)/2-dimensional vector, K1, K2 and K3 are defined in (5.123), and ∆ is defined by (5.128). The Kronecker product ⊗ is defined in Appendix A.

The optimal choice (5.129) is obtained by solving equation (5.122) using the vec operator. The
static state feedback case is provided by setting Mp = I and nc = 0.
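The recipe of Lemma 5.4.1 and Theorem 5.4.2 is short to implement with the column-stacking vec. In the sketch below (illustrative matrices; K3 is built from a known skew-symmetric solution so that (5.122) is certainly solvable), ∆ is assembled from the basis matrices ei eTj − ej eTi, and (5.129) with q = 0 is evaluated by pseudoinverse.

```python
import numpy as np

def skew_basis(k):
    """Columns of Delta are vec(e_i e_j^T - e_j e_i^T), i < j (Lemma 5.4.1)."""
    cols = []
    for i in range(k):
        for j in range(i + 1, k):
            B = np.zeros((k, k))
            B[i, j], B[j, i] = 1.0, -1.0
            cols.append(B.flatten(order="F"))     # column-stacking vec
    return np.column_stack(cols)

k = 3
K1 = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])  # symmetric
K2 = np.diag([1.0, 2.0, 3.0])                                       # symmetric
S0 = np.array([[0.0, 1.0, -1.0], [-1.0, 0.0, 2.0], [1.0, -2.0, 0.0]])
K3 = -(K1 @ S0 @ K2 + K2 @ S0 @ K1)              # solvable by construction

Delta = skew_basis(k)
K = np.kron(K2, K1) + np.kron(K1, K2)            # vec(K1 S K2 + K2 S K1) = K vec(S)
s = -np.linalg.pinv(K @ Delta) @ K3.flatten(order="F")
SF = (Delta @ s).reshape(k, k, order="F")

print(np.linalg.norm(K1 @ SF @ K2 + K2 @ SF @ K1 + K3))   # ~ 0
```

Because SF is a combination of the basis matrices, it is skew-symmetric by construction, and the residual of (5.122) is zero to machine precision.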

Example 5.4.1 Consider the following continuous-time model of a missile attitude regulator

ẋ = [  0   0  0   0 ]     [ 1    0   ]     [ 0 ]
    [ 10  −1  0  −1 ] x + [ 0.5  0.4 ] u + [ 1 ] w
    [  0   1  0   2 ]     [ 0.8  1   ]     [ 1 ]
    [  1   0  0  −1 ]     [ 0    0   ]     [ 0 ]

where w(t) is a white noise process with intensity W = 10. Suppose that the following state
feedback control gain has been designed to satisfy output variance requirements
Go = [ −40.8760  −66.5402  −78.3062  −100.6755 ]
     [  27.4402   47.1713   56.7027    73.5772 ]

This controller assigns the following covariance matrix to the closed-loop system
 
X = [  0.5968  −0.3167  −0.1201   0.0604 ]
    [ −0.3167   0.6068  −0.3172   0.0324 ]
    [ −0.1201  −0.3172   0.4848  −0.1298 ]
    [  0.0604   0.0324  −0.1298   0.0604 ]

and the required control effort (for unit weight matrix R = I) is

‖u‖2 = 535.6851.

We wish to redesign the controller Go preserving the original closed-loop state covariance matrix
X such that kuk2 is minimized. Theorem 5.4.1 provides the following state feedback gain as the
solution to this problem
G = [ −14.5586  −24.4277  −18.5667  −21.1357 ]
    [ −11.1921   −8.1730  −16.3240  −15.1148 ]
This controller assigns the same covariance matrix X to the closed-loop system, but the required control effort is

‖u‖2 = 109.6756.

Note that the controller Go corresponds to the choice SF = 0 in the covariance controller parametrization (5.121), but the minimum effort controller G corresponds to the following optimal choice provided by the expression (5.129):

SF = [ 0        −0.8465  −2.1162  0 ]
     [ 0.8465    0       −0.3809  0 ]
     [ 2.1162    0.3809   0       0 ]
     [ 0         0        0       0 ]

5.4.2 Discrete-Time Output Feedback


In this section we seek the discrete-time dynamic output feedback controller with measurement
noise to minimize the control effort


‖u‖ = limk→∞ E[u(k)T R u(k)]1/2          (5.130)

subject to
limk→∞ E[x(k) x(k)T] = X          (5.131)
where R > 0 is a given weighting matrix and X > 0 is a given assignable covariance. According
to the results in Section 5.3, the set of all covariance controllers which assign X to the closed-loop
system is parametrized by (5.112).
G = B+ (L VL [ Ik 0; 0 UF ] VTR R−1/2 − A X MT R−1)          (5.132)
where UF is an arbitrary orthogonal matrix. We have assumed that the input matrix Bp has full
column rank. The discrete-time minimum effort covariance controller is provided by the following
result.

Theorem 5.4.3 The discrete-time covariance controller with measurement noise G, which solves
the minimum effort covariance control problem is provided by the following choice of the orthogonal
matrix UF in the parametrization (5.132)

UF = U2 UT1          (5.133)

where U1 and U2 are defined from the singular value decomposition

ΦT2 RΘ2 = U1 ΛUT2 (5.134)

and the matrices Φ2 and Θ2 are defined by the following expressions:

[ Θ1  Θ2 ] = I1 B+ L VL,
[ Φ1  Φ2 ] = I1 B+ A X MT R−1/2 VR,          (5.135)

where I1 = [I 0]. The value of the minimum control effort is

‖u‖min = [‖Θ1 − Φ1‖2 + ‖Θ2‖2 + ‖Φ2‖2 − 2(λ1 + · · · + λk)]1/2          (5.136)

where Λ = diag(λ1 , . . . , λk ).

For the proof we need the following lemma (see [122], [51]).

Lemma 5.4.2 Let N ∈ Rn×n be a given matrix and let N = U1 Σ UT2 be a singular value decomposition of N. Then the optimization problem

maximize tr(NU),     U UT = I          (5.137)

has the solution U = U2 UT1, and the value of the maximum is σ1 + · · · + σn, where σi are the singular values of N.
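This is the classical trace-maximization (Procrustes-type) result: in the convention N = U1ΣUT2 for the SVD, the orthogonal maximizer of tr(NU) works out to U = U2UT1 and the maximum equals the sum of the singular values. A quick numerical sanity check (illustrative data, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)
N = rng.standard_normal((3, 3))

U1, sigma, U2T = np.linalg.svd(N)        # N = U1 @ diag(sigma) @ U2T
U_star = U2T.T @ U1.T                    # maximizer of tr(N U) over orthogonal U

best = np.trace(N @ U_star)
print(best, sigma.sum())                 # the two values agree

# Sanity check against random orthogonal matrices.
for _ in range(200):
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    assert np.trace(N @ Q) <= best + 1e-9
```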

We now prove Theorem 5.4.3 using the above lemma.


Proof. The control effort can be computed as follows:

‖u‖2 = E∞[u(k)T R u(k)] = tr[I1 G (M X MT + E ET) GT IT1 R].          (5.138)

By substituting the expression (5.132) for the covariance controller G we obtain, after simplifications,

‖u‖2 = ‖I1 B+ L VL [ Ik 0; 0 UF ] − I1 B+ A X MT R−1/2 VR‖2.          (5.139)

The definitions (5.135) result in the following expression:

‖u‖2 = ‖[ Θ1  Θ2 UF ] − [ Φ1  Φ2 ]‖2 = ‖Θ1 − Φ1‖2 + ‖Θ2 UF − Φ2‖2.          (5.140)



Hence, the discrete-time minimum effort covariance control problem is provided by the solution of
the minimum norm problem

minimize ‖Θ2 UF − Φ2‖2,     UF UTF = I.          (5.141)

Expanding this expression we obtain

‖Θ2 UF − Φ2‖2 = ‖Θ2‖2 + ‖Φ2‖2 − 2 tr(ΦT2 R Θ2 UF)          (5.142)

where the orthogonality of UF has been used. Hence, our problem requires the solution of the
maximization problem

maximize tr(ΦT2 R Θ2 UF),     UF orthogonal.          (5.143)

By applying Lemma 5.4.2 we obtain the optimal choice of the orthogonal matrix UF and the value
of the minimum control effort.

Example 5.4.2 Consider the following discretized state space model of a simply supported beam with 4 states, 2 inputs and 2 outputs

x(k + 1) = [  0.8778  0.4782   0        0      ]        [ 0.0718  −0.1222 ]
           [ −0.4782  0.8730   0        0      ] x(k) + [ 0.2811  −0.4782 ] u(k)
           [  0       0       −0.4075   0.2251 ]        [ 0.0837   0.1759 ]
           [  0       0       −3.6011  −0.4165 ]        [ 0.2141   0.4501 ]

             [  0.0214   0.0055  0        0      ]
           + [ −0.0055   0.0214  0        0      ] w(k)
             [  0        0       0.0102   0.0039 ]
             [  0        0      −0.0629   0.0101 ]

where w(k) is a white noise process with intensity W = 100I. Also, consider the following state
feedback control that has been designed to satisfy covariance control requirements
Go = [ −0.7604  −3.4667  9.5586   0.2955 ]
     [ −0.0010   1.2221  3.3200  −0.2642 ]

The corresponding control effort (for a unit weight matrix R = I) is

‖u‖2 = 3.3546.

Theorem 5.4.3 provides a controller that assigns the same closed-loop state covariance matrix to
the closed-loop system with minimum control effort. This minimum effort covariance controller is
G = [ −0.0645  −0.6566  1.6037  −0.0619 ]
    [ −0.0219   0.6336  1.7486  −0.1774 ]

and the minimum required control effort is

‖u‖2 = 0.3549.

The orthogonal matrix UF in the covariance controller parametrization (5.133) that corresponds to the minimum effort covariance controller is

UF = [  0.4704  0.8824 ]
     [ −0.8824  0.4704 ]

5.5 Finite Wordlength Covariance Control


This section introduces a control design which takes into account the state quantization errors
for controllers synthesized in finite precision fixed-point machines, using either synchronized or
skewed sampling. In addition, the problem of controller complexity for fixed point arithmetic is
addressed to answer the question “what memory (total length of all words) is required in the control
computer to guarantee a specified RMS performance bound on each output of the plant?” The
tools of covariance control are used to solve these problems.
The integration of the signal processing and control disciplines has received considerable atten-
tion in recent years. No longer is it considered wise to separate the design of controllers (with the
assumption of infinite precision implementation), and the signal processing concerns (the synthesis
of the given controller). It is well known [153], [37] that there exists an optimal realization of a
given controller, so that synthesis in these optimal coordinates will minimize the noise gain from
state round-off effects. (A controller that is designed without regard to controller synthesis, can
then be implemented in an optimal realization for this given controller.) It is also known that such
controllers are not optimal overall. That is, the design and synthesis problems are not independent
problems.
This section makes some improvements in digital control theory. First, we parametrize all
state covariances that are assignable to the closed-loop system in the presence of quantization
error in the control computer and in the A/D and D/A devices. Secondly, we characterize all
dynamic controllers which assign these covariances to the closed-loop system. To this end, desired
performance objectives expressed in terms of the covariance matrix can be traded with controller
complexity (controller order and available wordlength). All of these results are also derived for
problems with skewed sampling. We shall refer to "skewing" as the asynchronous sampling of the measurement and control: the sampling instants occur at the same rate, but are "skewed" by an amount δ seconds. Typically, δ is chosen as the computer duty cycle (the time required by the control computer to compute a new control given a new measurement). In this way, computational delays are accommodated in the model and in the control design process.
152 CHAPTER 5. COVARIANCE CONTROLLERS

5.6 Synchronous Sampling


Consider a controller which is synthesized in a digital computer with synchronous sampling and
fixed-point arithmetic. It is well known [153], [37] that the effects of the quantization error in
the control computer depend on the realization of the controller. To this end, we shall study the
control design problem in a transformed set of controller parameters Ac = Tc−1 Ãc Tc, Bc = Tc−1 B̃c, Cc = C̃c Tc, Dc = D̃c, and write the controller dynamics

    xc(k+1) = Tc−1 Ãc Tc (xc(k) + ex(k)) + Tc−1 B̃c (z(k) + ez(k))
       u(k) = C̃c Tc (xc(k) + ex(k)) + D̃c (z(k) + ez(k))                    (5.144)

where ex(k) is the quantization error introduced by the controller state computation xc(k) in the control computer (with wordlength βx), and ez(k) is the quantization error introduced by the A/D converter (with wordlength βz). The plant is described by

    xp(k+1) = Ap xp(k) + Bp (u(k) + eu(k)) + Dp wp(k)
       yp(k) = Cp xp(k)                                                      (5.145)
        z(k) = Mp xp(k) + v(k)

where eu(k) is the quantization error introduced by the D/A converter (with wordlength βu). Under sufficient excitation conditions we can approximate the quantization errors ex(k), ez(k) and eu(k) as zero-mean white noise processes with covariances Wx = diag[. . . qi . . .], Wz = qz I and Wu = qu I, respectively, where qi = (1/12) 2^(−2βi), qz = (1/12) 2^(−2βz), qu = (1/12) 2^(−2βu), and βi is the length of the fractional part of the word storing the ith controller state variable.
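The variance formulas above follow from modeling round-off as noise uniformly distributed over one quantization step. A minimal numerical sketch (the function names are ours, not the book's):

```python
import numpy as np

def quantization_variance(beta):
    # A fixed-point word with beta fractional bits rounds with error uniform
    # on [-q/2, q/2], q = 2**(-beta); the variance is q**2/12 = (1/12) 2**(-2*beta).
    return (2.0 ** (-2 * beta)) / 12.0

def Wx_matrix(betas):
    # Covariance Wx = diag(..., q_i, ...) of the controller-state round-off
    # noise, where betas[i] is the fractional wordlength of the i-th state.
    return np.diag([quantization_variance(b) for b in betas])
```

Each additional fractional bit reduces the noise variance by a factor of four, which is the mechanism trading wordlength against RMS performance in this section.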
Define the matrix

    W = [ qu I   0     0        0        ]
        [ 0      Wp    0        0        ]                                   (5.146)
        [ 0      0     Wz + V   0        ]
        [ 0      0     0        Tc Wx TTc ]
where Wp and V are the covariances of the plant noise wp(k) and the measurement noise v(k), respectively, and define

    A = [ Ap  0 ],   B = [ Bp  0 ],   C = [ Cp  0 ]
        [ 0   0 ]        [ 0   I ]        [ 0   0 ]

    M = [ Mp  0 ],   D = [ Bp  Dp  0  0 ],   H = [ 0  0 ]                    (5.147)
        [ 0   I ]        [ 0   0   0  0 ]        [ I  0 ]

    J = [ 0  0  I  0 ],   T = [ I  0  ],   G̃ = [ Dc  C̃c ],   G = T−1 G̃ T = [ Dc  Cc ]
        [ 0  0  0  I ]        [ 0  Tc ]        [ B̃c  Ãc ]                  [ Bc  Ac ]
Then the closed-loop system is described by

    x(k+1) = (A + BGM) x(k) + (D + BGJ) w(k)                                 (5.148)
      y(k) = (C + HGM) x(k) + HGJ w(k).

where x = [xTp xTc]T, w = [eTu wTp (ez + v)T eTx]T and y = [yTp uT]T. The state and output covariances satisfy

    X = (A + BGM) X (A + BGM)T + (D + BGJ) W (D + BGJ)T                      (5.149)
    Y = (C + HGM) X (C + HGM)T + HGJ W JT GT HT.

We seek to obtain necessary and sufficient conditions for assignability of a covariance matrix X, as
well as a parametrization of all covariance controllers which assign a particular covariance.
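For a Schur-stable closed loop, the fixed point of (5.149) can be computed with a standard discrete Lyapunov solver. A minimal sketch (the closed-loop data below are illustrative stand-ins, not taken from the text; SciPy's `solve_discrete_lyapunov` is assumed available):

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# Illustrative closed-loop data: Acl = A + BGM (Schur stable), Dcl = D + BGJ.
Acl = np.array([[0.5, 0.1],
                [0.0, 0.3]])
Dcl = np.array([[1.0],
                [0.5]])
W = np.array([[2.0]])    # covariance of the combined white noise w

# Solve X = Acl X Acl^T + Dcl W Dcl^T, the steady-state covariance in (5.149).
X = solve_discrete_lyapunov(Acl, Dcl @ W @ Dcl.T)
```

The output covariance then follows by substituting X into the second equation of (5.149).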

5.7 Skewed Sampling


In this section we present the finite wordlength control problem for the case where the computational time delay δ in the control computer is taken into account. With quantization errors, the dynamics of the plant for skewed sampling of z are described by

    xp(k+1) = Ap xp(k) + Bp (u(k) + eu(k)) + Dp wp(k)                        (5.150)
    zδ(k+1) = Mδ xp(k) + Hδ (u(k) + eu(k)) + v(k)

where zδ(k+1) is a measurement occurring δ seconds before xp(k+1). The matrix Hδ is a leakage term (allowing the plant inputs to appear in the output) that goes to zero as the skewing δ goes to zero. The controller with quantization error is described by

    xc(k+1) = Ac (xc(k) + ex(k)) + Bc (zδ(k+1) + ez(k+1))                    (5.151)
       u(k) = Cc (xc(k) + ex(k)) + Dc (zδ(k+1) + ez(k+1)).

Hence the closed-loop system is described by

    [ xp(k+1) ]   [ Ap               Bp               0     ] [ xp(k) ]
    [ u(k+1)  ] = [ (Dc + Cc Bc)Mδ   (Dc + Cc Bc)Hδ   Cc Ac ] [ u(k)  ]
    [ xc(k+1) ]   [ Bc Mδ            Bc Hδ            Ac    ] [ xc(k) ]

                  [ Bp               Dp   0            0     ] [ eu     ]
                + [ (Dc + Cc Bc)Hδ   0    Dc + Cc Bc   Cc Ac ] [ wp     ]     (5.152)
                  [ Bc Hδ            0    Bc           Ac    ] [ ez + v ]
                                                               [ ex     ]
where the extra state variable u(k) is required due to the skew delay δ. In the presence of controller state, input and output quantization errors, the closed-loop skewed system can be described by

    x(k+1) = (A + BḠM) x(k) + (D + BḠJ) w(k)                                (5.153)
      y(k) = (C + HḠM) x(k) + HḠJ w(k)
where x = [xTp uT xTc]T, w = [eTu wTp (ez + v)T eTx]T and y = [yTp uT]T, and

    A = [ As  0 ],   As = [ Ap  Bp ],   B = [ Bs  0 ],   Bs = [ 0 ]
        [ 0   0 ]         [ 0   0  ]        [ 0   I ]         [ I ]

    D = [ Ds  0 ],   Ds = [ Bp  Dp ],   M = [ Ms  0 ],   Ms = [ Mδ  Hδ ]
        [ 0   0 ]         [ 0   0  ]        [ 0   I ]

    C = [ Cp  0  0 ],   J = [ Hδ  0  I  0 ],   H = [ 0  0 ]                  (5.154)
        [ 0   I  0 ]        [ 0   0  0  I ]        [ I  0 ]

and

    Ḡ = [ Dc + Cc Bc   Cc Ac ]                                               (5.155)
        [ Bc           Ac    ]

The state covariance of system (5.153) satisfies the same equation as (5.149), except that Ḡ replaces G in (5.149).
Now suppose that G is found to provide the closed-loop system the desired response x(k). The
question is whether or not there exists a controller that yields this response. The following result
provides a constraint on G which allows the synthesis of a finite wordlength controller to give the
desired response.

Lemma 5.7.1 Given a matrix G, there exist controller matrices Ac, Bc, Cc and Dc satisfying (5.155) if and only if

    G12 (I − G22+ G22) = 0                                                   (5.156)

in which case all such controller matrices are given by

    Ac = G22,   Bc = G21,   Cc = G12 G22+ + Z(I − G22 G22+),   Dc = G11 − Cc G21    (5.157)

where Z is arbitrary.

Note that the controller satisfying (5.156) is unique if the controller matrix Ac = G22 is invertible.
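Lemma 5.7.1 is directly computational: the controller is recovered from the blocks of G with a Moore–Penrose pseudoinverse. A sketch (the function name is ours), with the solvability test (5.156) included:

```python
import numpy as np

def controller_from_G(G11, G12, G21, G22, Z=None):
    # Recover (Ac, Bc, Cc, Dc) from the blocks of G via (5.157); the
    # solvability condition (5.156) is G12 (I - pinv(G22) G22) = 0.
    G22p = np.linalg.pinv(G22)
    n = G22.shape[1]
    if not np.allclose(G12 @ (np.eye(n) - G22p @ G22), 0.0):
        raise ValueError("condition (5.156) fails: no controller exists")
    Ac, Bc = G22, G21
    Cc = G12 @ G22p
    if Z is not None:                       # free parameter Z in (5.157)
        Cc = Cc + Z @ (np.eye(G22.shape[0]) - G22 @ G22p)
    Dc = G11 - Cc @ G21
    return Ac, Bc, Cc, Dc
```

When Ac = G22 is invertible the free term vanishes, reflecting the uniqueness remark above.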

5.8 Covariance Assignment


In this section we shall describe the necessary and sufficient conditions for a covariance matrix to
be assignable (satisfies (5.149)) and parametrize all finite wordlength controllers G which assign
this covariance to the closed-loop system.

Theorem 5.8.1 The set of all covariance matrices X > 0 that satisfy the finite-wordlength covariance equation (5.149) for some G is parametrized by the following conditions:

    (I − BB+)(X − AXAT − DWDT)(I − BB+) = 0                                  (5.158)

    X = AXAT + DWDT − (AXMT + DWJT)(MXMT + JWJT)−1 (AXMT + DWJT)T + LLT      (5.159)

where L is some (np + nu + nc) × (nz + nc) matrix. For any such X, the set of all matrices G satisfying the covariance equation (5.149) is given by

    G = B+ (L U R−1/2 − (AXMT + DWJT) R−1) + ZF − B+ B ZF                    (5.160)

where ZF is an arbitrary matrix and

    U = VL [ I  0  ] VTM ,   UF UTF = I                                      (5.161)
           [ 0  UF ]

where VL and VM are obtained from the singular value decompositions

    (I − BB+) L = UL ΣL VTL                                                  (5.162)
    (I − BB+)(AXMT + DWJT) R−1/2 = UL ΣM VTM ,   ΣM = [ ΣL  0 ],

L is defined by (5.159), and R is defined by

    R = MXMT + JWJT.                                                         (5.163)

Proof. The proof follows that of Theorem 5.3.2, without the assumption EDT = 0 made there. Rearranging equation (5.149) we obtain

    X − AXAT − DWDT = BGMXAT + AXMTGTBT + BGRGTBT + DWJTGTBT + BGJWDT

where R = MXMT + JWJT > 0. Adding to both sides the term

    (AXMT + DWJT) R−1 (AXMT + DWJT)T

and completing the square provides

    P = (BG + SR−1) R (BG + SR−1)T

where S = AXMT + DWJT and P = X − AXAT − DWDT + SR−1ST. Using Theorem 2.3.9 we obtain the parametrization of all assignable covariances (5.159) and the set of all finite wordlength covariance controllers (5.160). □
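The necessity of (5.158) can be checked numerically: any X generated by some G through (5.149) is annihilated on both sides by the projector I − BB+, since every G-dependent term carries a factor B on the left or BT on the right. A small sketch with illustrative (not from the text) data:

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# Illustrative plant/controller data (dimensions chosen for a tiny example).
A = np.array([[0.3, 0.1], [0.0, 0.2]])
B = np.array([[0.0], [1.0]])
M = np.array([[1.0, 0.0]])
D = np.array([[1.0, 0.0], [0.0, 0.5]])
J = np.array([[0.0, 1.0]])
W = np.eye(2)
G = np.array([[-0.1]])

Acl = A + B @ G @ M
Dcl = D + B @ G @ J
X = solve_discrete_lyapunov(Acl, Dcl @ W @ Dcl.T)     # X solves (5.149)

Pi = np.eye(2) - B @ np.linalg.pinv(B)                # I - B B+
residual = Pi @ (X - A @ X @ A.T - D @ W @ D.T) @ Pi  # condition (5.158)
```

The residual vanishes (to machine precision) for every X that some controller G can assign.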

Chapter 5 Closure
Covariance analysis is a cornerstone of systems theory. Engineers have often evaluated performance and performed error analysis using variances and covariances because they have physical meaning and can be computed from signals. This is the first book chapter to provide a general method to control covariances, rather than merely analyzing them after a system is developed. The significance of these results goes beyond control theory, since they also allow new criteria for design by passive systems. For example, since the L2 to L∞ gain of a linear system is the maximum singular value of the output covariance matrix, one can choose physical parameters (spring stiffness, mass, material properties) to achieve a given covariance upper bound or a given L2 to L∞ gain. Peak stresses in aircraft structures can be limited by such designs. This is a new direction in structure design.
This chapter provides the necessary and sufficient conditions for a given covariance matrix to be assignable by output feedback control. The set of all covariances that can be assigned to a given linear system is parametrized. The set of all controllers that assign a given covariance matrix is given by an explicit formula with free parameters. A choice of the free parameters is given to minimize control energy while assigning a covariance. The necessary conditions for the existence of G to solve (5.2) were perhaps first presented in [93]. Since then many other papers on covariance control have appeared; interested readers are referred to [14, 30, 35, 54, 61, 95, 96, 130, 131, 133, 134, 135, 136, 148, 156, 157, 158, 159]. This chapter also describes the covariance assignment problem for discrete-time systems subject to skewed sampling between measurement and control, and subject to finite-precision computing in the A/D and D/A devices and in the controller, with quantization modeled as white state noise whose variance is related to the wordlength. The covariance equation for this problem has a similar structure to other problems in this book, and the same theorems for control design apply.
Chapter 6

Covariance Upper Bound Controllers

The motivation for this chapter is that upper bounds on performance may be more useful than the more difficult task of assigning exact performance. Mathematically, this allows the use of inequality constraints in lieu of equality constraints on covariance matrices.

6.1 Covariance Bounding Control Problem

Consider the linear time-invariant systems given in Sections 4.2 and 4.5 for the continuous-time and discrete-time cases, respectively. As we have seen in Chapter 4, the output covariance is closely related to many performance properties, such as the H2 norm, the L2 to L∞ gain and the variance of each output signal. In many cases, the "smaller" the output covariance, the better the system performance. This motivates a performance specification given in terms of a bound on the output covariance. In this chapter, we consider the following covariance bounding control problem:

Determine if there exists a controller which stabilizes the system and yields an output covariance bounded above by a given matrix Ȳ. Find all such controllers when one exists.

In Chapter 5, we formulated and solved the covariance control problem, where a given matrix X is assigned to the closed-loop system as the state covariance. This chapter is concerned with the design of a controller which guarantees an upper bound on the state covariance. Note also that our interest here is the output covariance, which is more general than the state covariance in the sense that the state covariance is a special case (Ccℓ = I) of the output covariance.
Consider the following Lyapunov inequalities:

    Acℓ X + X ATcℓ + Bcℓ BTcℓ < 0,   (continuous-time case)
    X > Acℓ X ATcℓ + Bcℓ BTcℓ,       (discrete-time case).


It is easy to verify that any matrix X > 0 satisfying each of the above inequalities is an upper bound on the state covariance when the system is excited by white noise w with covariance I, i.e.,

    X > limt→∞ E[x(t)xT(t)],   (continuous-time case),
    X > limk→∞ E[x(k)xT(k)],   (discrete-time case).

Thus, each inequality defines a set of covariance upper bounds. Moreover, the upper bound is tight, i.e., for any ε > 0 one can find a matrix X > 0 satisfying the above inequality such that

    ‖X − limα→∞ E[x(α)xT(α)]‖ < ε,

where α = t (k) for continuous-time (discrete-time) systems. From the above discussion, it can be seen that an inequality constraint on the output covariance matrix can be equivalently written in terms of the above Lyapunov inequality as follows.

Lemma 6.1.1 Let a symmetric matrix Ȳ be given. Consider the linear time-invariant continuous-time system (4.14) where w is a stochastic white noise process with intensity I. Suppose the system is strictly proper, i.e., Dcℓ = 0. Then the following statements are equivalent.

(i) The system is asymptotically stable and the output covariance is bounded above by Ȳ:

    limt→∞ E[y(t)yT(t)] < Ȳ.

(ii) There exists a matrix X > 0 such that

    Ccℓ X CTcℓ < Ȳ,   Acℓ X + X ATcℓ + Bcℓ BTcℓ < 0.                         (6.1)

The assumption Dcℓ = 0 is necessary for the output covariance to be finite for continuous-time systems. Note that strict properness of the system is not required for discrete-time systems, since the output covariance is finite even if Dcℓ ≠ 0, provided the system is asymptotically stable. Thus, we have the following lemma for the discrete-time case.

Lemma 6.1.2 Let a symmetric matrix Ȳ be given. Consider the linear time-invariant discrete-time system (4.31) where w is a stochastic white noise process with covariance I. The following statements are equivalent.

(i) The system is asymptotically stable and the output covariance is bounded above by Ȳ:

    limk→∞ E[y(k)yT(k)] < Ȳ.

(ii) There exists a matrix X > 0 such that

    Ccℓ X CTcℓ + Dcℓ DTcℓ < Ȳ,   X > Acℓ X ATcℓ + Bcℓ BTcℓ.                  (6.2)

Now, using Lemmas 6.1.1 and 6.1.2, the covariance bounding control problem can be converted to an algebraic problem of finding matrices X and G satisfying the matrix inequalities (6.1) or (6.2). In the sequel, we shall assume that there is no redundant actuator (BTp Bp > 0) and no redundant sensor (Mp MTp > 0). This assumption reflects a reasonable practical situation, and can easily be removed at the expense of more complicated controller formulas.
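The bounding role of the Lyapunov inequality can be illustrated numerically: adding a slack εI to the noise term produces an X that satisfies the strict inequality and dominates the exact covariance. A sketch with illustrative data (not from the text):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.array([[-1.0, 1.0], [0.0, -2.0]])   # assumed stable closed-loop matrix
B = np.array([[0.0], [1.0]])

# Exact steady-state covariance: A Xe + Xe A' + B B' = 0
Xe = solve_continuous_lyapunov(A, -B @ B.T)

# A strict upper bound: A X + X A' + (B B' + eps I) = 0 gives the Lyapunov
# *inequality*, and X - Xe solves A D + D A' + eps I = 0, hence X > Xe.
eps = 0.1
X = solve_continuous_lyapunov(A, -(B @ B.T + eps * np.eye(2)))
```

Shrinking ε tightens the bound toward the exact covariance, which is the tightness statement above.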

6.2 Continuous-Time Case


This section provides a solution to the covariance bounding control problem for the continuous-time case. Recall that the output covariance is finite only if the closed-loop system is strictly proper. In order to assure Dcℓ = 0, in (4.15) we require Dy = 0 and either By = 0 or Dz = 0, when designing proper (possibly Dc ≠ 0) controllers. In the subsequent sections, we shall assume Dy = 0 and By = 0 (no penalty on the control input) and allow possibly nonzero Dz (which accounts for the measurement noise). The case where By ≠ 0 with strictly proper controllers will be treated in Section 6.2.4.

6.2.1 State Feedback

In this section, we assume that all the state variables are available for feedback without measurement noise (Mp = I, Dz = 0). In this case, the Lyapunov inequality associated with the covariance bounding control problem is the following:

    (Ap + Bp G) X + X (Ap + Bp G)T + Dp DTp < 0.                             (6.3)

Theorem 6.2.1 Let a symmetric matrix Ȳ be given and consider the system (4.14) with Dy = 0, By = 0, Mp = I and Dz = 0. The following statements are equivalent.

(i) There exists a stabilizing state feedback gain G such that

    limt→∞ E[y(t)yT(t)] < Ȳ.

(ii) There exists a matrix X > 0 such that

    Cp X CTp < Ȳ,   B⊥p (Ap X + X ATp + Dp DTp) B⊥Tp < 0.

(iii) There exist a scalar γ > 0 and a matrix Q > 0 such that a positive definite solution P > 0 to

    P Ap + ATp P + P ((1/γ2) Dp DTp − Bp BTp) P + Q = 0                      (6.4)

satisfies

    γ2 Cp P−1 CTp < Ȳ.

In this case, all such state feedback gains are given by

    G = −BTp P + L Q1/2                                                      (6.5)

where the matrices P and Q are those in (iii) and L is an arbitrary matrix such that ‖L‖ < 1.

Proof. From Lemma 6.1.1, statement (i) holds if and only if there exist matrices G and X > 0 satisfying (6.3) and Cp X CTp < Ȳ. Note that a given matrix pair (X, G) satisfies the inequality in (6.3) if and only if there exists a scalar γ > 0 such that

    (Ap + Bp G) X + X (Ap + Bp G)T + (1/γ2) X GT G X + Dp DTp < 0.

Completing the square with respect to G, we have

    ((1/γ) X GT + γ Bp)((1/γ) X GT + γ Bp)T < γ2 Bp BTp − Dp DTp − Ap X − X ATp =: Φ.

From Corollary 2.3.6 there exists a matrix G satisfying the above inequality if and only if Φ > 0, in which case all such G are given by

    G = −γ2 BTp X−1 + γ L Φ1/2 X−1,   ‖L‖ < 1.

Defining P := γ2 X−1 and Q := γ2 X−1 Φ X−1, we have statement (iii). Finally, the equivalence of Φ > 0 and the second inequality in statement (ii) follows immediately from Finsler's Theorem (Theorem 2.3.10). □
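Throughout these conditions, B⊥ denotes a full-rank left annihilator of B (B⊥B = 0). One way to compute it numerically, sketched here via the SVD (the function name is ours):

```python
import numpy as np

def left_annihilator(B, tol=1e-10):
    # Rows of the result span the left null space of B, so
    # left_annihilator(B) @ B = 0 with orthonormal rows.
    U, s, Vt = np.linalg.svd(B)
    rank = int(np.sum(s > tol))
    return U[:, rank:].T

# Example: input matrix of a double integrator (illustrative)
B = np.array([[0.0], [1.0]])
Bp = left_annihilator(B)
```

With B⊥ in hand, the LMI conditions of statement (ii) can be evaluated for a candidate X.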
There are several ways to interpret Theorem 6.2.1. In view of the covariance control problem
considered in Chapter 5, we can define the set of assignable covariance bounds by any matrices
X > 0 satisfying (6.3) for some controller G. Then such a set is characterized by the Linear Matrix
Inequalities (LMIs) in statement (ii). Since the set is convex, an assignable covariance bound can
be computed by solving a convex feasibility problem.¹ Once we find such an X, all controllers
which assign X to the closed-loop system as a covariance upper bound can be obtained by finding
γ such that Φ > 0. Note that Φ > 0 can be achieved by sufficiently large γ > 0. Then P is
given by P = γ 2 X−1 and Q is determined by the Riccati equation in statement (iii). Alternatively,
one may find an assignable covariance bound by solving the Riccati equation in statement (iii) by
choosing parameters γ > 0 and Q > 0. Note that, for any Q > 0, there exists a positive definite
solution P to the Riccati equation if γ is sufficiently large, provided (Ap ,Bp ) is stabilizable.
Clearly, there are many controllers which achieve a given covariance bound. These freedoms are captured by γ, Q and L. We shall investigate the physical significance of these parameters. Note that, for any vector x ∈ Rnp, we have

    xT (P Ap + ATp P + P ((1/γ2) Dp DTp − Bp BTp) P + Q) x = 0,

or equivalently,

    xT [(P Bp + GT)(BTp P + G) − GT G + Q] x + γ2 wT w
      − ((1/γ) xT P Dp − γ wT)((1/γ) DTp P x − γ w) − 2 xT P (Ap x + Bp u + Dp w) = 0
¹See Chapters 10 and 11.

where w ∈ Rnw and G ∈ Rnu×np are arbitrary. Now considering any trajectory of the system (4.15) with the state feedback given in Theorem 6.2.1, we have

    ‖u(t)‖2 = ‖LQ1/2 x(t)‖2 − ‖Q1/2 x(t)‖2 + γ2 ‖w(t)‖2
              − γ2 ‖w(t) − (1/γ2) DTp P x(t)‖2 − (d/dt)(x(t)T P x(t)).

Integrating both sides from t = 0 to ∞, we have

    ∫0∞ ‖u(t)‖2 dt = γ2 ∫0∞ ‖w(t)‖2 dt + ∫0∞ (‖LQ1/2 x(t)‖2 − ‖Q1/2 x(t)‖2) dt
                     − γ2 ∫0∞ ‖w(t) − (1/γ2) DTp P x(t)‖2 dt + x(0)T P x(0),

where we used the stability property limt→∞ x(t) = 0. From the above observation, we have the following.

Theorem 6.2.2 Let G be a controller generated by Theorem 6.2.1. Then the following statements hold.

(a) The L2 gain from the disturbance to the control input is bounded above by γ, i.e.,

    ∫0∞ ‖u(t)‖2 dt ≤ γ2 ∫0∞ ‖w(t)‖2 dt,

for all L such that ‖L‖ < 1.

(b) If we choose the parameter L = 0, then the L2 gain from the disturbance to [u; Q1/2 x] is bounded above by γ, i.e.,

    ∫0∞ (xT(t) Q x(t) + uT(t) u(t)) dt ≤ γ2 ∫0∞ ‖w(t)‖2 dt.

If we let the output covariance bound be Ȳ = σI and take the limit σ → ∞, then the performance specification in Theorem 6.2.1 is effectively removed, in which case Theorem 6.2.1 provides a parametrization of all stabilizing (static) state feedback gains. Note that the choice of Dp is immaterial, since the closed-loop stability specification does not involve Dp. Thus, letting Dp = 0, we have the following.

Corollary 6.2.1 [63] Consider a system ẋ = Ax + Bu. The following statements are equivalent.

(i) There exists a stabilizing state feedback controller u = Gx.

(ii) There exists a matrix X > 0 such that

B⊥ (AX + XAT )B⊥T < 0.



(iii) There exists a matrix Q > 0 such that the Riccati equation

PA + AT P − PBBT P + Q = 0 (6.6)

has a solution P > 0.

In this case, all such state feedback gains are given by

G = −BT P + LQ1/2 (6.7)

where the matrices P and Q are the ones in statement (iii) and L is an arbitrary matrix such that ‖L‖ < 1.

Note that, for each choice of the matrix Q > 0, there exists a positive definite matrix P satisfying the Riccati equation (6.6) if and only if (A, B) is stabilizable, and in fact such a solution is unique. Hence, Corollary 6.2.1 parametrizes the set of all stabilizing state feedback gains in terms of an arbitrary positive definite matrix Q and an arbitrary matrix L with the norm bound ‖L‖ < 1. Clearly, the choice of the freedom L = 0 yields the LQ optimal state feedback gain with respect to the cost function

    J = ∫0∞ (xT(t) Q x(t) + uT(t) u(t)) dt,   x(0) = x0.

Indeed, for a stabilizing state feedback gain G generated by (6.7),

    xT Q x + uT u = xT (P B BT P − AT P − P A + GT G) x
                  = xT [(G + BT P)T (G + BT P) − P(A + BG) − (A + BG)T P] x
                  = xT Q1/2 LT L Q1/2 x − (d/dt)(xT P x).

Integrating both sides from t = 0 to ∞, and using the stability property limt→∞ (xT(t) P x(t)) = 0, we have

    J = xT0 P x0 + ∫0∞ ‖LQ1/2 x(t)‖2 dt.

Thus, the choice L = 0 provides the optimal controller which minimizes the quadratic cost function J.
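The LQ-optimal choice L = 0 can be reproduced with a standard Riccati solver. A sketch for a double integrator (illustrative data; SciPy's `solve_continuous_are` solves exactly the form of (6.6) with unit control weight):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator (illustrative)
B = np.array([[0.0], [1.0]])
Q = np.eye(2)

# Riccati equation (6.6): P A + A^T P - P B B^T P + Q = 0
P = solve_continuous_are(A, B, Q, np.eye(1))
G = -B.T @ P                              # the choice L = 0 in (6.7)

assert np.all(np.linalg.eigvals(A + B @ G).real < 0)   # closed loop is stable
```

Any L with ‖L‖ < 1 added through LQ1/2 in (6.7) preserves stability but gives up LQ optimality.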

Example 6.2.1 Consider the double integrator system given by the following state space realization:

    Ap = [ 0  1 ],   Bp = [ 0 ],   Cp = [ 1  0 ],   Dp = [ 1 ].
         [ 0  0 ]         [ 1 ]                          [ 1 ]

We will design a covariance upper bound controller with covariance bound Ȳ = 4 using the result in Theorem 6.2.1. Choosing the parameters γ = 2 and Q = I2, one can obtain a positive definite solution P to the Riccati equation in (6.4) as follows:

    P = [ 6.854  6.997 ].
        [ 6.997  8.794 ]

In this case, we have γ2 Cp P−1 CTp = 3.110 < Ȳ and thus the prescribed performance bound is achieved. Note that

    X := γ2 P−1 = [ 3.110   −2.475 ]
                  [ −2.475   2.424 ]

satisfies the conditions in statement (ii) of Theorem 6.2.1:

    Cp X CTp = 3.110 < Ȳ,   B⊥p (Ap X + X ATp + Dp DTp) B⊥Tp = −3.950 < 0.

Now, a class of state feedback gains that achieves the performance bound is given by (6.5) with L being the free parameter. Choosing L = 0, we have

    G = −BTp P = −[ 6.997  8.794 ].

With this controller, the state covariance Xcov := limt→∞ E[x(t)xT(t)], which is the solution to the following Lyapunov equation:

    (Ap + Bp G) Xcov + Xcov (Ap + Bp G)T + Dp DTp = 0,

is given by

    Xcov = [ 0.836   −0.500 ].
           [ −0.500   0.4547 ]

Note that X computed above is an upper bound on the state covariance Xcov; since λ(X − Xcov) = {4.102, 0.141}, where λ denotes the eigenvalues, we have X > Xcov. Finally, the actual output covariance is given by

    limt→∞ E[y(t)yT(t)] = Cp Xcov CTp = 0.836 < Ȳ,

confirming that the control design objective is achieved.
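The numbers in Example 6.2.1 can be reproduced with standard solvers. A verification sketch (values are the printed three-digit roundings, hence the loose tolerances):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Data of Example 6.2.1
Ap = np.array([[0.0, 1.0], [0.0, 0.0]])
Bp = np.array([[0.0], [1.0]])
Cp = np.array([[1.0, 0.0]])
Dp = np.array([[1.0], [1.0]])
gamma, Ybar = 2.0, 4.0
P = np.array([[6.854, 6.997], [6.997, 8.794]])

# Residual of the Riccati equation (6.4) is small (P printed to 3 digits)
res = P @ Ap + Ap.T @ P + P @ (Dp @ Dp.T / gamma**2 - Bp @ Bp.T) @ P + np.eye(2)

# Output covariance bound gamma^2 Cp P^{-1} Cp' < Ybar (approx. 3.110)
bound = gamma**2 * (Cp @ np.linalg.inv(P) @ Cp.T)[0, 0]

# L = 0 controller and the resulting state covariance
G = -Bp.T @ P
Xcov = solve_continuous_lyapunov(Ap + Bp @ G, -Dp @ Dp.T)
```

The output covariance Cp Xcov CpT evaluates to about 0.836, well inside the bound Ȳ = 4.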

6.2.2 Static Output Feedback


In this section, we consider the covariance bounding control problem with static output feedback. The Lyapunov inequality considered here is then given by

    (Ap + Bp G Mp) X + X (Ap + Bp G Mp)T + (Dp + Bp G Dz)(Dp + Bp G Dz)T < 0.    (6.8)

In the sequel, we shall assume

    [ Dp ] DTz = [ 0 ],   V > 0,                                             (6.9)
    [ Dz ]       [ V ]

i.e., there is no correlation between the process and measurement noises, and the measured outputs are fully contaminated by noise.

Theorem 6.2.3 Let a symmetric matrix Ȳ be given and consider the system (4.15) with Dy = 0 and By = 0. Suppose the assumption (6.9) holds. Then the following statements are equivalent.

(i) There exists a stabilizing static output feedback gain G such that

    limt→∞ E[y(t)yT(t)] < Ȳ.

(ii) There exists a matrix X > 0 such that

    Cp X CTp < Ȳ,
    B⊥p (Ap X + X ATp + Dp DTp) B⊥Tp < 0,
    Ap X + X ATp − X MTp V−1 Mp X + Dp DTp < 0.

In this case, all such gains are given by

    G = −(BTp Q Bp)−1 BTp Q X MTp V−1 + (BTp Q Bp)−1/2 L S1/2

where L is an arbitrary matrix such that ‖L‖ < 1 and

    Q = (X MTp V−1 Mp X − Ap X − X ATp − Dp DTp)−1 > 0,
    S = V−1 − V−1 Mp X [Q − Q Bp (BTp Q Bp)−1 BTp Q] X MTp V−1 > 0.

Proof. Consider the Lyapunov inequality (6.8) which, after expanding and completing the square with respect to G, yields

    (Bp G + X MTp V−1) V (Bp G + X MTp V−1)T < Q−1.

Then the result follows directly from Corollary 2.3.6. □


As in the state feedback case, we can obtain a parametrization of all stabilizing static output
feedback gains by removing the performance constraint. In particular, a variety of necessary and
sufficient conditions for stabilizability via static output feedback can be derived as follows.

Corollary 6.2.2 [56] Consider a system ẋ = Ax + Bu, z = Mx. The following statements are
equivalent.

(i) There exists a static output feedback controller u = Gz which stabilizes the system.

(ii) There exists a matrix X > 0 such that

B⊥ (AX + XAT )B⊥T < 0, (6.10)

AX + XAT − XMT MX < 0.



(iii) There exists a matrix pair (X,Y) such that

X = Y−1 > 0,

B⊥ (AX + XAT )B⊥T < 0,


MT ⊥ (YA + AT Y)MT ⊥T < 0.

(iv) There exist matrices Q > 0 and W > 0 such that the solutions X > 0 and Y > 0 to

AX + XAT − XMT MX + W = 0,

YA + AT Y − YBBT Y + Q = 0
exist and satisfy XY = ρI for some ρ > 0.

(v) There exist K, F and X > 0 such that

(A + BK)X + X(A + BK)T < 0,

(A + FM)X + X(A + FM)T < 0.

Proof. Letting Ȳ be arbitrarily large, Dp be zero and V be identity in Theorem 6.2.3, we immediately have (i) ⇔ (ii). In the above, if we let V = εI with a sufficiently small ε > 0 instead of V = I, we see the equivalence (i) ⇔ (iii) by Finsler's Theorem 2.3.10. To prove (ii) ⇔ (iv), note that (6.10) holds if and only if

    AX + XAT − ρBBT < 0

holds for some ρ > 0 by Finsler's Theorem, or equivalently,

    X−1 A + AT X−1 − ρ X−1 B BT X−1 < 0.

Multiplying both sides by ρ and defining Y = ρX−1, (iv) can be verified. Finally, (iii) ⇔ (v) follows from Corollary 6.2.1 and some algebraic manipulations. □

6.2.3 Reduced Order Dynamic Output Feedback


This section obtains all dynamic output feedback controllers, of order less than or equal to that of the plant, which solve the covariance bounding control problem. The Lyapunov inequality associated
with the closed-loop state covariance bounds is given by

(A + BGM)X + X(A + BGM)T + (D + BGE)(D + BGE)T < 0. (6.11)

Note that the above inequality has exactly the same structure as (6.8) for the static output feedback

case. However, EET is not positive definite (invertible) even if V = Dz DTz > 0, and hence the
result in the previous section (Theorem 6.2.3) cannot be applied directly to the dynamic output
feedback case. The following result is derived without any assumptions on matrices D and E, and
thus can be specialized to the previous result.

Theorem 6.2.4 Let a symmetric matrix Ȳ be given and consider the system (4.15) with Dy = 0 and By = 0. The following statements are equivalent.

(i) There exists a dynamic output feedback controller of order nc which stabilizes the system and yields

    limt→∞ E[y(t)yT(t)] < Ȳ.

(ii) There exists a positive definite matrix X ∈ R(np+nc)×(np+nc) such that

    B⊥ (AX + XAT + DDT) B⊥T < 0,   C X CT < Ȳ,

    [ MT ]⊥ [ X−1 A + AT X−1   X−1 D ] [ MT ]⊥T < 0.
    [ ET ]  [ DT X−1           −I    ] [ ET ]

In this case, all such controllers are given by

    G = −R−1 ΓT Φ ΛT (Λ Φ ΛT)−1 + S1/2 L (Λ Φ ΛT)−1/2

where L is an arbitrary matrix such that ‖L‖ < 1 and R is an arbitrary positive definite matrix such that

    Φ = (Γ R−1 ΓT − Θ)−1 > 0,

and

    S = R−1 − R−1 ΓT [Φ − Φ ΛT (Λ Φ ΛT)−1 Λ Φ] Γ R−1,

    Θ := [ AX + XAT  D  ],   Γ := [ B ],   Λ := [ MX  E ].
         [ DT        −I ]        [ 0 ]

Proof. Using the Schur complement formula, (6.11) can be equivalently written as

    [ (A + BGM)X + X(A + BGM)T   D + BGE ] < 0.
    [ (D + BGE)T                 −I      ]

By the definitions of Θ, Γ and Λ given in Theorem 6.2.4, the above inequality can be rewritten as

    Γ G Λ + (Γ G Λ)T + Θ < 0.

Then the result follows from Theorem 2.3.12, where we note that

    Γ⊥ = [ B ]⊥ = [ B⊥  0 ],   ΛT⊥ = [ XMT ]⊥ = [ MT ]⊥ [ X−1  0 ].
         [ 0 ]    [ 0   I ]          [ ET  ]    [ ET ]  [ 0    I ]

This completes the proof. □

Exercise 6.2.1 Specialize Theorem 6.2.4 for the static output feedback case with the assumption
(6.9). Verify your result by comparing with Theorem 6.2.3.

Let the set of assignable state covariance bounds be defined by any matrices X > 0 satisfying
the matrix inequalities in statement (ii) of Theorem 6.2.4. Note that the controller order nc which
assigns a given state covariance bound X to the closed-loop system is fixed by the dimension of X.
Recall that matrices A, B, etc. are the augmented matrices defined in (4.15). Utilizing the
structure of these matrices, we have another characterization of all assignable state covariance
bounds as follows.

Corollary 6.2.3 Let a matrix X > 0 be given. The following statements are equivalent.

(i) X is an assignable state covariance bound, i.e., X satisfies the matrix inequalities in statement (ii) of Theorem 6.2.4.

(ii) X satisfies

    B⊥p (Ap Xp + Xp ATp + Dp DTp) B⊥Tp < 0,   Cp Xp CTp < Ȳ,                 (6.12)

    [ MTp ]⊥ [ Yp Ap + ATp Yp   Yp Dp ] [ MTp ]⊥T < 0,                       (6.13)
    [ DTz ]  [ DTp Yp           −I    ] [ DTz ]

where Xp > 0 and Yp > 0 are defined by

    X = [ Xp    Xpc ],   Yp = (Xp − Xpc Xc−1 XTpc)−1.                        (6.14)
        [ XTpc  Xc  ]
Proof. The result follows by noting that

    B⊥ = [ Bp  0 ]⊥ = [ B⊥p  0 ],
         [ 0   I ]

    [ MT ]⊥   [ MTp  0 ]⊥   [ [ MTp ]⊥    ] [ I  0  0 ]
    [ ET ]  = [ 0    I ]  = [ [ DTz ]   0 ] [ 0  0  I ],
              [ DTz  0 ]                    [ 0  I  0 ]

and by the matrix inversion lemma,

    X−1 = [ Yp  ∗ ],
          [ ∗   ∗ ]

where ∗ denotes appropriate partitioned blocks. □


Corollary 6.2.3 provides a way to construct an assignable state covariance bound. Specifically, we first find matrices Xp > 0 and Yp > 0 satisfying (6.12) and (6.13), then determine Xpc and Xc > 0 such that

    Xpc Xc−1 XTpc = Xp − Yp−1                                                (6.15)

to construct X as in (6.14). Note that Xp and Yp must satisfy Xp ≥ Yp−1 in order for Xpc and Xc > 0 satisfying (6.15) to exist. Note that the constraint Xp ≥ Yp−1 > 0 is convex since it is equivalent to the following LMI by the Schur complement formula:

    [ Xp  I  ] ≥ 0.                                                          (6.16)
    [ I   Yp ]

Clearly, the resulting controller order nc is minimal for a given pair (Xp, Yp) if we choose Xpc to be of full column rank. In this case, we have

    nc = rank(Xp − Yp−1).                                                    (6.17)
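Given a feasible pair (Xp, Yp), the factorization (6.15) and the assembly of X in (6.14) can be carried out with an eigendecomposition. A sketch (our function name; the rank is detected with a tolerance):

```python
import numpy as np

def assemble_covariance_bound(Xp, Yp, tol=1e-9):
    # Factor Xp - inv(Yp) = Xpc inv(Xc) Xpc^T per (6.15) and build X of (6.14);
    # the controller order is nc = rank(Xp - inv(Yp)) as in (6.17).
    Delta = Xp - np.linalg.inv(Yp)        # must be positive semidefinite
    w, V = np.linalg.eigh(Delta)
    keep = w > tol
    Xpc = V[:, keep]                      # one valid full-column-rank choice
    Xc = np.diag(1.0 / w[keep])           # then Xpc inv(Xc) Xpc^T = Delta
    X = np.block([[Xp, Xpc], [Xpc.T, Xc]])
    return X, int(np.sum(keep))
```

By the Schur complement, Xp − Xpc Xc−1 XpcT = Yp−1 > 0, so the assembled X is positive definite.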

Example 6.2.2 Consider the double integrator plant with the following state space realization:

    [ Ap  Dp  Bp ]   [ 0  1  0  0  0 ]
    [ Cp  Dy  By ] = [ 0  0  1  0  1 ]
    [ Mp  Dz  ∗  ]   [ 1  0  0  0  0 ]
                     [ 1  0  0  1  ∗ ]

We will design a first-order dynamic controller that achieves the output covariance upper bound Ȳ = 1.5. Note that the plant is of order 2, and hence this is a reduced (low) order control design. To design such a controller, we need to find matrices Xp and Yp satisfying (6.12), (6.13), (6.16) and (6.17) with nc = 1. Note that the size of Xp − Yp−1 in (6.17) is 2 × 2, and thus we need to reduce its rank by one. This rank reduction occurs when the matrix in (6.16) is singular, i.e., when Xp and Yp satisfy constraint (6.16) on the boundary. This observation leads us to formulate a problem of minimizing tr(Xp + Yp) subject to constraints (6.12), (6.13), (6.16). This is a minimization problem with a linear objective function and LMI constraints, and hence can be solved using a commercial software package (e.g., MATLAB). In this way, we have obtained

    Xp = [ 1.500   −0.000 ],   Yp = [ 1.414   −1.000 ].
         [ −0.000  13.078 ]        [ −1.000   1.414 ]

Note that, for this particular example, rank(Xp − Yp−1) = 1 is achieved, although the above minimization formulation did not guarantee satisfaction of this rank condition. Using the singular value decomposition

    Xp − Yp−1 = U Σ UT = [ 0.085   −0.996 ] [ 11.749  0      ] [ 0.085   −0.996 ]T
                         [ −0.996  −0.085 ] [ 0       0.0000 ] [ −0.996  −0.085 ]

the matrices Xpc and Xc satisfying (6.15) are found to be the first column of U and 1/11.749, respectively, to give

    X = [ Xp    Xpc ]   [ 1.500   −0.000   0.085  ]
        [ XTpc  Xc  ] = [ −0.000  13.078  −0.996 ].
                        [ 0.085   −0.996   0.085  ]

It can be verified that this X satisfies the conditions in statement (ii) of Theorem 6.2.4. Now, using
the controller formula given in Theorem 6.2.4 with L = 0 and R = εI (ε > 0 sufficiently small), we
have a controller that achieves the performance bound 0:
" #
−17.493 154.083
G= ,
1.409 −13.077

or in transfer function form,

    Cc(sI − Ac)−1 Bc + Dc = −17.493 (s + 0.667)/(s + 13.077),

which is a first-order controller. The resulting closed-loop output covariance is found to be

    lim_{t→∞} E[y(t)yT(t)] = 1.49996 < 1.5 = Ȳ.
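This closed-loop covariance is easy to check by solving the closed-loop Lyapunov equation directly. The Python sketch below uses one state-space realization of the printed controller transfer function (the particular realization is our assumption; only the transfer function is given above) and solves Acℓ X + X AcℓT + Bcℓ BcℓT = 0 by Kronecker vectorization:

```python
import numpy as np

# Plant data of the example: x1'' = u + w1, y = x1, z = x1 + w2.
Ap = np.array([[0.0, 1.0], [0.0, 0.0]])
Bp = np.array([[0.0], [1.0]])
Dp = np.array([[0.0, 0.0], [1.0, 0.0]])
Cp = np.array([[1.0, 0.0]])
Mp = np.array([[1.0, 0.0]])
Dz = np.array([[0.0, 1.0]])

# One (assumed) realization of -17.493 (s + 0.667)/(s + 13.077).
Ac, Bc, Cc, Dc = -13.077, 1.409, 154.083, -17.493

# Closed-loop matrices (By = 0 and Dy = 0 in this example).
Acl = np.block([[Ap + (Bp * Dc) @ Mp, Bp * Cc],
                [Bc * Mp, np.array([[Ac]])]])
Bcl = np.vstack([Dp + (Bp * Dc) @ Dz, Bc * Dz])
Ccl = np.hstack([Cp, np.zeros((1, 1))])

assert max(np.linalg.eigvals(Acl).real) < 0   # closed loop is stable

# Solve Acl X + X Acl^T + Bcl Bcl^T = 0 by vectorization.
n = Acl.shape[0]
K = np.kron(np.eye(n), Acl) + np.kron(Acl, np.eye(n))
X = np.linalg.solve(K, -(Bcl @ Bcl.T).reshape(-1)).reshape(n, n)

yvar = float(Ccl @ X @ Ccl.T)     # steady-state output variance
assert abs(yvar - 1.5) < 0.01     # agrees with the 1.49996 reported above
```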

To see the effect of the parameter Ȳ on the closed-loop performance, the above design process is repeated for Ȳ = 2 and Ȳ = 5 to obtain two additional first-order controllers. The impulse responses (w1(t) = δ(t), w2(t) ≡ 0) of the closed-loop system for the three controllers thus obtained are plotted in Figure 6.1.
[Figure: two response plots over 0 to 15 seconds; the left panel shows the regulated output and the right panel the control input. Solid: bound = 1.5; dashed: bound = 2; dotted: bound = 5.]

Figure 6.1: Closed-loop impulse responses

We see from Figure 6.1 that the output performance (e.g. small peak and fast settling) is improved by reducing the value of the performance bound Ȳ, at the expense of more control effort.

6.2.4 Full-Order Dynamic Output Feedback

This section derives a full-order (nc = np ) dynamic output feedback controller which solves the
covariance bounding control problem. In the sequel, we will assume for simplicity that
    Dy = 0,     Dz [ DpT  DzT ] = [ 0  I ],     ByT [ Cp  By ] = [ 0  I ]        (6.18)

hold.

For the closed-loop output covariance to be bounded, it is necessary that the closed-loop transfer matrix from w to y be strictly proper:

    Dcℓ = Dy + By Dc Dz = 0,

where Dc is the high-frequency gain of the controller. With assumptions (6.18), this condition requires that the controller be strictly proper, i.e., Dc = 0.
Recall from Lemma 6.1.1 that the closed-loop system is stable and the output covariance is bounded above by Ȳ if and only if there exists a symmetric positive definite matrix X > 0 such that

    Ccℓ X CcℓT < Ȳ,     Acℓ X + X AcℓT + Bcℓ BcℓT < 0.        (6.19)

In the above inequalities, we need to consider matrix X of the following structure only:
" #
Xp Z
X= .
Z Z

This is because, if there exist a controller (Ac, Bc, Cc, Dc) and X satisfying (6.19), then there always exists a coordinate transformation matrix Tc such that the controller realization (Tc−1 Ac Tc, Tc−1 Bc, Cc Tc, Dc) solves (6.19) for some positive definite matrix X̂ of the above structure. Verification of this simple fact is left for the reader as a straightforward exercise.
Consider the coordinate transformation for the closed-loop states by
" #
∆ I 0
T= . (6.20)
I −I

This transformation is possible since the controller is of full-order, and the resulting new closed-loop
states consist of the plant state xp and the difference xp − xc , where the latter may be interpreted
as the estimation error if the controller were to dynamically estimate the plant state. The following
identity is useful:
" #" #" #" #
T 0 Ac` Bc` X 0 TT 0
=
0 I Cc` Dc` 0 I 0 I
 
Ap Xp + Bp Cc Z Ap Yp−1 Dp
 
 (Ap − Bc Mp )Xp + (Bp Cc − Ac )Z (Ap − Bc Mp )Y −1 Dp − Bc Dz 
 p  , (6.21)
Cp Xp + By Cc Z Cp Yp−1 Dy

where Yp = (Xp − Z)−1. Using this identity, it can readily be verified that the congruent transformation by T of the Lyapunov inequality in (6.19) yields

    Ψ ≜ [ Ψ11    Ψ12 ]  = T(Acℓ X + X AcℓT + Bcℓ BcℓT)TT < 0
        [ Ψ12T   Ψ22 ]

where

    Ψ11  = Ap Xp + Xp ApT + Bp Cc Z + Z CcT BpT + Dp DpT,                          (6.22)
    Ψ12T = Yp−1 ApT + (Ap − Bc Mp)Xp + (Bp Cc − Ac)Z + (Dp − Bc Dz)DpT,            (6.23)
    Ψ22  = (Ap − Bc Mp)Yp−1 + Yp−1(Ap − Bc Mp)T + (Dp − Bc Dz)(Dp − Bc Dz)T.       (6.24)

Note that the first inequality in (6.19) and X > 0 are together equivalent to

    [ I  0 ] [ Ȳ        Ccℓ X ] [ I  0  ]
    [ 0  T ] [ X CcℓT   X     ] [ 0  TT ]  > 0

or equivalently, with a congruent transformation involving Yp ,


 
    [ Ȳ                    Cp Xp + By Cc Z    Cp ]
    [ Xp CpT + Z CcT ByT   Xp                 I  ]  > 0.        (6.25)
    [ CpT                  I                  Yp ]

Our objective is to find (Ac , Bc , Cc , Dc ) and (Xp , Yp ) such that Ψ < 0 and (6.25) hold. To do
this, we will eliminate some of the controller parameters and reduce the control design problem to
a convex LMI problem (see Chapters 10 and 11 for the details of numerical solution procedures for
LMI problems).
We eliminate Ac first. Note that Ac appears only in inequality Ψ < 0 (not in (6.25)). If Ψ < 0
holds for some Ac , then Ψ11 < 0 and Ψ22 < 0 hold. Conversely, if Ψ11 < 0 and Ψ22 < 0 hold,
then there exists Ac such that Ψ < 0 and one such choice is given by solving Ψ12 = 0 as follows:

Ac = Yp−1 ATp Z−1 + (Ap − Bc Mp )Xp Z−1 + Bp Cc + (Dp − Bc Dz )DTp Z−1 . (6.26)

Thus Ψ < 0 has been replaced by Ψ11 < 0 and Ψ22 < 0 by eliminating Ac . Next, we eliminate Bc
which appears only in Ψ22 < 0. Completing the square with respect to Bc , we have

(Bc − Yp−1 MTp )(BTc − Mp Yp−1 ) + Ap Yp−1 + Yp−1 ATp − Yp−1 MTp Mp Yp−1 + Dp DTp < 0.

This inequality holds for some Bc if and only if

Ap Yp−1 + Yp−1 ATp − Yp−1 MTp Mp Yp−1 + Dp DTp < 0 (6.27)

holds, in which case a choice of Bc is given by

Bc = Yp−1 MTp . (6.28)

Thus we have shown that there exists a stabilizing controller achieving the covariance upper bound Ȳ if and only if there exist matrices Xp > 0, Yp > 0 and Cc such that Ψ11 < 0, (6.27) and (6.25) hold, where Ψ11 is defined in (6.22) and Z = Xp − Yp−1. Converting these inequalities to LMIs via a change of variables, we have the following.



Theorem 6.2.5 Let a symmetric matrix Ȳ be given and consider the system (4.15). Suppose the assumptions in (6.18) hold. Then the following statements are equivalent.

(i) There exists a dynamic output feedback controller which stabilizes the system and yields

    lim_{t→∞} E[y(t)yT(t)] < Ȳ.

(ii) There exist matrices Xp , Yp and K such that

    Ap Xp + Xp ApT + Bp K + KT BpT + Dp DpT < 0,                     (6.29)

    [ Yp Ap + ApT Yp − MpT Mp    Yp Dp ]
    [ DpT Yp                     −I    ]  < 0,                       (6.30)

    [ Ȳ                  Cp Xp + By K    Cp ]
    [ Xp CpT + KT ByT    Xp              I  ]  > 0.                  (6.31)
    [ CpT                I               Yp ]

In this case, one such controller is given by

Ac = Ap + Bp Cc − Bc Mp − Yp−1 Ω(I − Xp Yp )−1 ,


Bc = Yp−1 MTp ,
Cc = K(Xp − Yp−1 )−1 ,
Dc = 0,

where

Ω = Yp Ap + ATp Yp + Yp Dp DTp Yp − MTp Mp .


Proof. Inequalities (6.29) and (6.31) follow from Ψ11 < 0 and (6.25) by letting K = Cc Z.
Inequality (6.30) follows from (6.27) using the Schur complement formula and a congruent transformation. The controller formulas can be derived using (6.26), (6.28) and K = Cc Z.   2
The above result can be specialized to the standard LQG (Linear Quadratic Gaussian) result. The LQG problem is to minimize tr(Ȳ) subject to the conditions in statement (i) of Theorem 6.2.5. The solution to this problem can be obtained as follows.
In view of inequality (6.31), if Yp is larger (in the sense of positive definite matrices), then the performance bound tr(Ȳ) < γ is more likely to be satisfied. On the other hand, the “largest” Yp satisfying (6.30) is given by the inverse[2] of the stabilizing solution Q to

Ap Q + QATp − QMTp Mp Q + Dp DTp = 0. (6.32)

Therefore we choose Yp in (6.30) and (6.31) to be Q−1 .


[2] The inverse of Q exists if (A, Dp, Mp) is a controllable, detectable triple. This assumption can be relaxed by avoiding the use of Q−1 in the above discussion.

Using the Schur complement formula, (6.31) is equivalent to

    [ Ȳ − Cp Q CpT     (Cp + By Cc)Z ]
    [ Z(Cp + By Cc)T   Z             ]  > 0

or
    Ȳ > Cp Q CpT + (Cp + By Cc)Z(Cp + By Cc)T        (6.33)

where we used K = Cc Z. Also using (6.32), inequality (6.29) can be written

(Ap + Bp Cc )Z + Z(Ap + Bp Cc )T + QMTp Mp Q < 0. (6.34)

Now, it can be verified using the dual characterization of the H2 norm that, for given γ and Cc such that Ap + Bp Cc is asymptotically stable, there exist Z and Ȳ satisfying (6.33), (6.34) and tr(Ȳ) < γ if and only if there exists P > 0 such that

P(Ap + Bp Cc ) + (Ap + Bp Cc )T P + (Cp + By Cc )T (Cp + By Cc ) < 0, (6.35)

tr(Mp QPQMTp ) < γ − tr(Cp QCTp ). (6.36)

Completing the square with respect to Cc in (6.35),

(PBp + CTc )(BTp P + Cc ) + PAp + ATp P − PBp BTp P + CTp Cp < 0. (6.37)

In view of (6.36), a smaller P yields a smaller γ. From the monotonicity property of the Riccati equation solution [108], the smallest P such that (6.37) holds is given by the stabilizing solution to

PAp + ATp P − PBp BTp P + CTp Cp = 0,

with
Cc = −BTp P.

In summary, we have the following.

Theorem 6.2.6 Consider the system given in (4.15). Suppose the assumptions in (6.18) hold.
Then the controller that stabilizes (4.15) and solves


    γopt = min tr( lim_{t→∞} E[y(t)yT(t)] )

is given by

Ac = Ap + Bp Cc − Bc Mp
Bc = QMTp
Cc = −BTp P
Dc = 0

where P and Q are the stabilizing solutions to

PAp + ATp P − PBp BTp P + CTp Cp = 0,

Ap Q + QATp − QMTp Mp Q + Dp DTp = 0.

Furthermore, the minimum value of the cost function is given by

γopt = tr(Mp QPQMTp ) + tr(Cp QCTp ).
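Theorem 6.2.6 can be implemented in a few lines. The sketch below (Python with numpy; the double-integrator plant is invented example data, and the algebraic Riccati equations are solved by the classical Hamiltonian eigenvector method rather than a library routine) computes P, Q and the LQG controller and checks closed-loop stability:

```python
import numpy as np

def care(A, B, Q):
    """Stabilizing solution of A^T P + P A - P B B^T P + Q = 0,
    obtained from the stable invariant subspace of the Hamiltonian."""
    n = A.shape[0]
    H = np.block([[A, -B @ B.T], [-Q, -A.T]])
    w, V = np.linalg.eig(H)
    Vs = V[:, w.real < 0]                      # stable eigenvectors
    return (Vs[n:, :] @ np.linalg.inv(Vs[:n, :])).real

# Invented example: double integrator with cost on x1 only.
Ap = np.array([[0.0, 1.0], [0.0, 0.0]])
Bp = np.array([[0.0], [1.0]])
Cp = np.array([[1.0, 0.0], [0.0, 0.0]])        # Cp^T Cp = diag(1, 0)
Mp = np.array([[1.0, 0.0]])
Dp = np.eye(2)                                 # Dp Dp^T = I

P = care(Ap, Bp, Cp.T @ Cp)                    # control Riccati equation
Q = care(Ap.T, Mp.T, Dp @ Dp.T)                # filter Riccati (dual form)

Cc = -Bp.T @ P
Bc = Q @ Mp.T
Ac = Ap + Bp @ Cc - Bc @ Mp

# The LQG closed loop must be asymptotically stable (separation).
Acl = np.block([[Ap, Bp @ Cc], [Bc @ Mp, Ac]])
assert max(np.linalg.eigvals(Acl).real) < 0
```

For this plant the control Riccati solution is the classical P = [[√2, 1], [1, √2]], so the code can be cross-checked by hand.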

6.3 Discrete Time Case


This section presents a solution to the covariance bounding control problem for the discrete-time case. In the previous section, we made the closed-loop system strictly proper (Dcℓ = 0), since the output covariance is not defined for proper (Dcℓ ≠ 0) systems. For discrete-time systems, however, the output covariance is finite even if the system is not strictly proper, provided the system is stable. In the sequel, we do not assume Dy = 0. It is desirable to allow possibly nonzero By, which corresponds to a control input covariance constraint. However, By ≠ 0 introduces a technical difficulty due to the fact that Ccℓ becomes dependent upon the controller parameter G. For this reason, we shall treat the simple case By = 0 first, and remove this assumption later.

6.3.1 State Feedback


In this section, we consider the (static) state feedback case where Mp = I and Dz = 0. The
Lyapunov inequality which characterizes a set of state covariance bounds is given by

X > (Ap + Bp G)X(Ap + Bp G)T + Dp DTp .

Theorem 6.3.1 Let a symmetric matrix Ȳ be given and consider the system (4.32) with By = 0, Mp = I and Dz = 0. The following statements are equivalent.

(i) There exists a stabilizing state feedback gain G such that

    lim_{k→∞} E[ y(k)yT(k) ] < Ȳ.

(ii) There exists a matrix X such that

    Dp DpT < X,     Cp X CpT + Dy DyT < Ȳ,

    Bp⊥(X − Ap X ApT − Dp DpT)Bp⊥T > 0.

In this case, all such state feedback gains are given by

G = −(BTp PBp )−1 BTp PAp + (BTp PBp )−1/2 LS1/2



where L is an arbitrary matrix such that kLk < 1 and

P = (X − Dp DTp )−1 > 0,


S = X−1 − ATp PAp + ATp PBp (BTp PBp )−1 BTp PAp > 0.

Proof. The result directly follows by solving

(Bp G + Ap )X(Bp G + Ap )T < X − Dp DTp

for G using Corollary 2.3.6. 2
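The central gain (L = 0) of Theorem 6.3.1 is easy to exercise numerically. In the sketch below (Python/numpy, invented plant data), a feasible X is manufactured from an auxiliary stabilizing gain G0, so that the conditions of statement (ii) hold by construction with margin ε; the theorem's gain formula is then applied and the covariance inequality verified:

```python
import numpy as np

# Invented example: an unstable discrete-time plant.
A = np.array([[1.2, 1.0], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
D = 0.1 * np.eye(2)

# Construct a feasible X: pick any stabilizing gain G0 and iterate
# X <- (A + B G0) X (A + B G0)^T + D D^T + eps*I to its fixed point.
G0 = np.array([[-0.5, -1.5]])
Acl0 = A + B @ G0                     # spectral radius is about 0.88 < 1
X = np.zeros((2, 2))
for _ in range(2000):
    X = Acl0 @ X @ Acl0.T + D @ D.T + 0.05 * np.eye(2)

# Central gain (L = 0) of Theorem 6.3.1.
P = np.linalg.inv(X - D @ D.T)
G = -np.linalg.solve(B.T @ P @ B, B.T @ P @ A)

# Verify the covariance inequality (A + B G) X (A + B G)^T < X - D D^T.
Acl = A + B @ G
gap = X - D @ D.T - Acl @ X @ Acl.T
assert min(np.linalg.eigvalsh(gap)) > 1e-8
```

Note that B⊥(X − A X AT − D DT)B⊥T = ε B⊥B⊥T > 0 here, since B⊥A = B⊥(A + B G0).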


As a simple consequence, we have the following.

Corollary 6.3.1 Let a discrete-time system x(k + 1) = Ax(k) + Bu(k) be given. The following
statements are equivalent.

(i) There exists a (static) state feedback controller u(k) = Gx(k) which stabilizes the system.

(ii) There exist matrices K and X > 0 such that

X > (A + BK)X(A + BK)T .

(iii) There exists a matrix X > 0 such that

B⊥ (X − AXAT )B⊥T > 0.

(iv) There exists a matrix Y > 0 such that

Y > AT YA − AT YB(BT YB)−1 BT YA. (6.38)

Proof. The equivalence of (i) and (ii) is a direct consequence of Lyapunov's stability theory for discrete-time systems. (i) ⇔ (iii) follows from Theorem 6.3.1 by letting the performance bound Ȳ be arbitrarily large and specializing to the case Dp = 0 without loss of generality. Finally, (iii) ⇔ (iv) can be proved as follows. Note that a matrix Y > 0 satisfies (6.38) if and only if there exists R > 0 such that

    Y > AT YA − AT YB(BT YB + R)−1 BT YA.
Using the matrix inversion lemma, we have

Y > AT (Y−1 + BR−1 BT )−1 A,

or equivalently, using the Schur complement formula,

Y−1 + BR−1 BT > AY−1 AT .

From Finsler’s Theorem (Corollary 2.3.5), there exists R > 0 satisfying the above inequality if and
only if
B⊥ (Y−1 − AY−1 AT )B⊥T > 0
holds. Thus letting X = Y−1 , we have (iii).

2
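The matrix inversion lemma step used in this proof is easy to confirm on a random instance (a numerical sanity check only; numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(0)

# Random positive definite Y and R, random B.
M1 = rng.standard_normal((4, 4)); Y = M1 @ M1.T + 4 * np.eye(4)
M2 = rng.standard_normal((2, 2)); R = M2 @ M2.T + np.eye(2)
B = rng.standard_normal((4, 2))

# Matrix inversion lemma:
# (Y^{-1} + B R^{-1} B^T)^{-1} = Y - Y B (R + B^T Y B)^{-1} B^T Y
lhs = np.linalg.inv(np.linalg.inv(Y) + B @ np.linalg.inv(R) @ B.T)
rhs = Y - Y @ B @ np.linalg.inv(R + B.T @ Y @ B) @ B.T @ Y
assert np.allclose(lhs, rhs)
```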

6.3.2 Static Output Feedback


This section considers the covariance bounding control problem for the static output feedback case.
The Lyapunov inequality in this case is given by

X > (Ap + Bp GMp )X(Ap + Bp GMp )T + (Dp + Bp GDz )(Dp + Bp GDz )T . (6.39)

As in the continuous-time case, we shall assume orthogonality between the process and measurement noises (Dp DzT = 0) for simplicity. However, we do not assume that the measured outputs are fully contaminated by noise (V ≜ Dz DzT > 0). In other words, we allow possibly singular V ≥ 0, since this generality does not introduce additional complexity in the derivation that follows.

Theorem 6.3.2 Let a symmetric matrix Ȳ be given and consider the system (4.31) with By = 0 and Dp DzT = 0. The following statements are equivalent.

(i) There exists a stabilizing static output feedback gain G such that

    lim_{k→∞} E[ y(k)yT(k) ] < Ȳ.

(ii) There exists a matrix X > 0 such that

    Bp⊥(X − Ap X ApT − Dp DpT)Bp⊥T > 0,     Cp X CpT + Dy DyT < Ȳ,

    X > Ap X ApT − Ap X MpT(Mp X MpT + V)−1 Mp X ApT + Dp DpT

where V = Dz DzT.

In this case, all such static output feedback gains are given by

    G = −(BpT P Bp)−1 BpT P Ap X MpT(Mp X MpT + V)−1 + (BpT P Bp)−1/2 L S1/2

where L is an arbitrary matrix such that ‖L‖ < 1 and

    P = (X − Ap X ApT + Γ R ΓT − Dp DpT)−1 > 0,
    S = R−1 − ΓT[ P − P Bp(BpT P Bp)−1 BpT P ]Γ > 0,
    Γ ≜ Ap X MpT R−1,     R ≜ Mp X MpT + V.

Proof. After expanding each term in (6.39), completing the square with respect to G yields

(Bp G + Γ)R(Bp G + Γ)T < P−1 .

Then the result follows by applying Corollary 2.3.6. 2

Corollary 6.3.2 Let a discrete-time system

x(k + 1) = Ax(k) + Bu(k), z(k) = Mx(k)

be given. The following statements are equivalent.



(i) There exists a static output feedback controller u(k) = Gz(k) which stabilizes the system.

(ii) There exists a matrix X > 0 such that

B⊥ (X − AXAT )B⊥T > 0,

X > AXAT − AXMT (MXMT )−1 MXAT .

(iii) There exists a matrix pair (X,Y) such that

X = Y−1 > 0,

B⊥ (X − AXAT )B⊥T > 0,

MT ⊥ (Y − AT YA)MT ⊥T > 0.

(iv) There exist a matrix pair (X,Y) and a scalar µ > 0 such that

X = Y−1 > 0,

X > AXAT − µBBT ,

Y > AT YA − µMT M.

(v) There exist matrices K, F and X > 0 such that

X > (A + BK)X(A + BK)T ,

X > (A + FM)X(A + FM)T .

(vi) There exists a coordinate transformation matrix T such that the transformed system

    Â ≜ T−1 A T,     B̂ ≜ T−1 B,     M̂ ≜ M T

satisfies

    ‖Â x‖ < ‖x‖    for all x ≠ 0 such that M̂ x = 0,
    ‖ÂT x‖ < ‖x‖   for all x ≠ 0 such that B̂T x = 0.

Proof. The equivalence (i) ⇔ (ii) follows from Theorem 6.3.2 by letting Ȳ be arbitrarily large and setting Dp = 0 and V = 0. (ii) ⇔ (iii) and (iii) ⇔ (v) are immediate from Corollary 6.3.1 and its proof. (iii) ⇔ (iv) is a direct consequence of Finsler's Theorem. Finally, we shall prove (iii) ⇔ (vi). Choosing the coordinate transformation matrix T = X1/2 and noting that B̂⊥ = B⊥T, it can be shown that (iii) is equivalent to the existence of T such that

    B̂⊥(I − Â ÂT)B̂⊥T > 0,
    M̂T⊥(I − ÂT Â)M̂T⊥T > 0.



Then the result simply follows once we notice that, for given matrices Q and D, the following holds:

    D⊥ Q D⊥T > 0  ⇔  xT Q x > 0  for all x ≠ 0 such that DT x = 0.

This completes the proof. 2


Corollary 6.3.2 provides several characterizations of stabilizability via static output feedback in terms of matrix inequalities. Although none of them is immediately verifiable, they may be useful for developing computational algorithms or algebraically verifiable tests to determine whether a given system is stabilizable via static output feedback. For this purpose, we shall discuss each of the above conditions. Condition (iii) can be viewed as the existence of a matrix pair (X, Y) such that X and Y belong to convex sets defined by LMIs and satisfy the nonconvex coupling condition X = Y−1 > 0. Convexity of the sets may be useful for developing computational algorithms.[3]
Condition (iv) is given in terms of Lyapunov inequalities with a negative forcing term. Condition (ii) shows that the set of Lyapunov matrices is given by the intersection of two sets defined by an LMI and a Riccati-like inequality. Condition (v) exhibits a certain “separation” property of the static output feedback stabilization problem. In particular, each of the inequalities in (v) corresponds to the state feedback or the state estimation problem. What makes the whole problem difficult is the requirement of a single quadratic Lyapunov function (i.e., a single Lyapunov matrix X) to prove stability of both problems. Finally, condition (vi) says that the system is stabilizable via static output feedback if and only if there exists a basis for the state space in which Â and ÂT are contractions when restricted, respectively, to the unobservable and uncontrollable subspaces of the state space.

6.3.3 Reduced-Order Dynamic Output Feedback


This section solves the covariance bounding control problem for the reduced-order dynamic output
feedback case. The results presented in the previous sections can also be derived as special cases
of the results in this section. The underlying Lyapunov inequality is

X > (A + BGM)X(A + BGM)T + (D + BGE)(D + BGE)T . (6.40)

Theorem 6.3.3 Let a symmetric matrix Ȳ be given and consider the system (4.32) with By = 0. The following statements are equivalent.

(i) There exists a dynamic output feedback controller of order nc which stabilizes the system and yields

    lim_{k→∞} E[ y(k)yT(k) ] < Ȳ.

(ii) There exists a positive definite matrix X ∈ R(np+nc)×(np+nc) such that

    B⊥(X − A X AT − D DT)B⊥T > 0,     C X CT + F FT < Ȳ,

    [ MT ]⊥ [ X−1 − AT X−1 A    −AT X−1 D     ] [ MT ]⊥T
    [ ET ]  [ −DT X−1 A         I − DT X−1 D  ] [ ET ]      > 0.
[3] We shall propose algorithms in Chapters 10 and 11.

In this case, all such controllers are given by

    G = −(BT Φ B)−1 BT Φ Λ R ΓT(Γ R ΓT)−1 + (BT Φ B)−1/2 L Ψ1/2

where L is an arbitrary matrix such that ‖L‖ < 1 and

    Φ = [ X − Λ R ΛT + Λ R ΓT(Γ R ΓT)−1 Γ R ΛT ]−1,
    Ψ = Ω − Ω Γ R ΛT[ Φ − Φ B(BT Φ B)−1 BT Φ ]Λ R ΓT Ω,
    Ω ≜ (Γ R ΓT)−1,     Λ ≜ [ A  D ],     Γ ≜ [ M  E ],     R ≜ [ X  0 ]
                                                                [ 0  I ] .

Proof. Noting that the Lyapunov inequality (6.40) can be equivalently written

    X > ( [ A  D ] + B G [ M  E ] ) [ X  0 ] ( [ A  D ] + B G [ M  E ] )T ,
                                    [ 0  I ]

the result immediately follows from Theorem 2.3.11.   2

Corollary 6.3.3 Let a matrix X > 0 be given. The following statements are equivalent.

(i) X is an assignable state covariance bound, i.e., X satisfies the matrix inequalities in statement
(ii) of Theorem 6.3.3.

(ii) X satisfies

    Bp⊥(Xp − Ap Xp ApT − Dp DpT)Bp⊥T > 0,     Cp Xp CpT + Dy DyT < Ȳ,

    [ MpT ]⊥ [ Yp − ApT Yp Ap    −ApT Yp Dp     ] [ MpT ]⊥T
    [ DzT ]  [ −DpT Yp Ap        I − DpT Yp Dp  ] [ DzT ]      > 0,

where Xp > 0 and Yp > 0 are defined by

    X = [ Xp    Xpc ] ,     Yp = (Xp − Xpc Xc−1 XpcT)−1.
        [ XpcT  Xc  ]

Proof. The result follows by specializing Theorem 6.3.3 where we utilize the structure of augmented
matrices defined in (4.32). See the proof of Corollary 6.2.3. 2

6.3.4 Full-Order Dynamic Output Feedback


We now remove the assumption By = 0. In the general output feedback case, the best performance
can be achieved by a full-order controller. That is, increasing the order of the controller does not
help to improve performance if the controller order is larger than or equal to the plant order. Hence
we may restrict our attention to full-order controllers to achieve a given performance bound. The
following result does not require any assumptions on the plant and provides a state space formula
for a full-order covariance upper bound controller.

Theorem 6.3.4 Let a symmetric matrix Ȳ be given and consider the system (4.32). The following statements are equivalent.

(i) There exists a dynamic output feedback controller which stabilizes the system and yields

    lim_{k→∞} E[ y(k)yT(k) ] < Ȳ.

(ii) There exist matrices Xp, Yp, K and L such that

    [ Ȳ    Cp Xp + By K    Cp + By L Mp    Dy + By L Dz ]
    [ ∗    Xp              I               0            ]  > 0,        (6.41)
    [ ∗    ∗               Yp              0            ]
    [ ∗    ∗               ∗               I            ]

    [ Xp   Ap Xp + Bp K    Ap + Bp L Mp    Dp + Bp L Dz ]
    [ ∗    Xp              I               0            ]  > 0,        (6.42)
    [ ∗    ∗               Yp              0            ]
    [ ∗    ∗               ∗               I            ]

    [ MpT ]⊥ ( [ Yp  0 ]   [ ApT ]                ) [ MpT ]⊥T
    [ DzT ]   ( [ 0   I ] − [ DpT ] Yp [ Ap  Dp ] ) [ DzT ]     > 0,        (6.43)

where ∗ denotes symmetric entries. In this case, one such controller is given by

    Ac = (I + Ω Z−1)−1(Âp + Bp Cc) − Bc Mp
    Bc = (I + Ω Z−1)−1(Âp Yp−1 MpT + D̂p DzT)(Mp Yp−1 MpT + Dz DzT)−1
    Cc = (K − Dc Mp Xp)Z−1
    Dc = L

where

    [ Âp  D̂p ] = [ Ap  Dp ] + Bp Dc [ Mp  Dz ],
    Z = Xp − Yp−1,
    Ω = Yp−1 − Âp Yp−1 ÂpT + D̂p D̂pT
        + (Âp Yp−1 MpT + D̂p DzT)(Mp Yp−1 MpT + Dz DzT)−1(Âp Yp−1 MpT + D̂p DzT)T.

Proof. Recall that the closed-loop system satisfies the specifications in statement (i) if and only if there exists X > 0 such that

    Ȳ > Ccℓ X CcℓT + Dcℓ DcℓT,        (6.44)
    X > Acℓ X AcℓT + Bcℓ BcℓT.        (6.45)


Note that [ Acℓ  Bcℓ ] can be written

    [ Acℓ  Bcℓ ] = A1 + B1 G1 C1,

where

    A1 ≜ [ Âp  Bp Cc  D̂p ] ,   B1 ≜ [ 0 ] ,   G1 ≜ [ Ac  Bc ] ,   C1 ≜ [ 0   I  0  ]
         [ 0   0      0   ]         [ I ]                              [ Mp  0  Dz ] .
Applying Theorem 2.3.11 to (6.45), there exists G1 such that

    X > (A1 + B1 G1 C1)Ψ(A1 + B1 G1 C1)T ,     Ψ ≜ [ X  0 ]
                                                   [ 0  I ]

if and only if
B⊥ T ⊥T
1 (X − A1 ΨA1 )B1 >0 (6.46)

CT1 ⊥ (Ψ−1 − AT1 X−1 A1 )CT1 ⊥T > 0 (6.47)

hold. In this case, one such G1 is given by

G1 = −(BT1 ΦB1 )−1 BT1 ΦA1 ΨCT1 (C1 ΨCT1 )−1 (6.48)

where
Φ = (X − A1 ΨAT1 + A1 ΨCT1 (C1 ΨCT1 )−1 C1 ΨAT1 )−1 .

Now, without loss of generality, we consider X of the following structure:

    X = [ Xp  Z ]
        [ Z   Z ] .

Defining
Yp = (Xp − Z)−1 ,

it can be verified that (6.47) is equivalent to (6.43). Note that (6.46) is equivalent to

    Xp > [ Âp  Bp Cc ] [ Xp  Z ] [ ÂpT     ]
                       [ Z   Z ] [ CcT BpT ]  + D̂p D̂pT        (6.49)

while (6.44) is equivalent to

    Ȳ > [ Ĉp  By Cc ] [ Xp  Z ] [ ĈpT     ]
                      [ Z   Z ] [ CcT ByT ]  + D̂y D̂yT        (6.50)

where

    [ Ĉp  D̂y ] = [ Cp  Dy ] + By Dc [ Mp  Dz ].

Clearly, (6.49) and (6.50) have exactly the same form. Hence, we only show the equivalence between
(6.50) and (6.41); the equivalence between (6.49) and (6.42) follows in a similar manner.
Write (6.50) and X > 0 jointly as

    [ I  0  0 ] [ Ȳ        Ccℓ X    Dcℓ ] [ I  0   0 ]
    [ 0  T  0 ] [ X CcℓT   X        0   ] [ 0  TT  0 ]  > 0
    [ 0  0  I ] [ DcℓT     0        I   ] [ 0  0   I ]

where T is defined in (6.20). Using the identity in (6.21), this inequality becomes

    [ Ȳ    Ĉp Xp + By Cc Z    Ĉp Yp−1    D̂y ]
    [ ∗    Xp                 Yp−1       0   ]  > 0
    [ ∗    ∗                  Yp−1       0   ]
    [ ∗    ∗                  ∗          I   ]

where Cp and Dy have been replaced by Ĉp and D̂y, respectively, due to the presence of nonzero Dc. Defining

    K ≜ Cc Z + Dc Mp Xp,     L ≜ Dc        (6.51)

and using the congruent transformation involving Yp, we have (6.41).


Finally, the controller formulas for Ac and Bc can be derived by computing G1 in (6.48), and
those for Cc and Dc from (6.51). 2
We shall obtain a solution to the LQG problem based on Theorem 6.3.4. As in the continuous-time case, the LQG problem is to design a stabilizing controller that minimizes tr(Ȳ) subject to the closed-loop constraints (6.44) and (6.45). In the sequel, we assume

    Dz [ DpT  DzT ] = [ 0  V ],     ByT [ Cp  By ] = [ 0  R ]        (6.52)

with V > 0 and R > 0 for simplicity. Note that Dy is not necessarily zero.
The following is the discrete-time LQG solution.

Theorem 6.3.5 Consider the system given in (4.32). Suppose the assumptions in (6.52) hold.
Then the controller that stabilizes (4.32) and solves
    γopt = min tr( lim_{k→∞} E[ y(k)yT(k) ] )

is given by

    Ac = Ap + Bp Cc − Bc Mp + Bp Dc Mp
    Bc = Ap Q MpT(Mp Q MpT + V)−1 + Bp Dc
    Cc = −(BpT P Bp + R)−1 BpT P Ap − Dc Mp

    Dc = −(BpT P Bp + R)−1 [ BpT P   ByT ] [ Ap  Dp ] [ Q MpT ] (Mp Q MpT + V)−1
                                           [ Cp  Dy ] [ DzT   ]

where P and Q are the stabilizing solutions to

P = ATp PAp − ATp PBp (BTp PBp + R)−1 BTp PAp + CTp Cp (6.53)

Q = Ap QATp − Ap QMTp (Mp QMTp + V)−1 Mp QATp + Dp DTp . (6.54)

Furthermore, the minimum value of the cost function is given by

    γopt = ‖ [ P1/2  0 ] [ Âp  D̂p ] [ Q1/2  0 ] ‖F2  −  tr(PQ)
             [ 0     I ] [ Ĉp  D̂y ] [ 0     I ]

where

    [ Âp  D̂p ]   [ Ap  Dp ]   [ Bp ]
    [ Ĉp  D̂y ] = [ Cp  Dy ] + [ By ] Dc [ Mp  Dz ].

Proof. Since Dz has full row rank due to assumption (6.52), a choice for the left annihilator in (6.43) is given by

    [ MpT ]⊥   [ I    −MpT(DzT)+ ]
    [ DzT ]  = [ 0    DzT⊥       ] .

With this choice, after some manipulation, (6.43) becomes

    Q > Ap Q ApT − Ap Q MpT(Mp Q MpT + V)−1 Mp Q ApT + Dp DpT,

where Q = Yp−1 . Since the other constraints (6.41) and (6.42) are more likely to be satisfied with

larger Yp , we should search for the smallest Q ≥ 0 satisfying the above Riccati inequality. It turns
out [108] that such Q is given by the stabilizing solution to the Riccati equation (6.54). Thus we
choose Yp−1 = Q in (6.41) and (6.42) where Q is the stabilizing solution to (6.54).
By the Schur complement formula, it is readily verified that (6.41) and (6.42) are respectively
equivalent to
Ω > (Ĉp + By Cc )Z(Ĉp + By Cc )T + (Ĉp QĈTp + D̂y D̂Ty ), (6.55)

Z > (Âp + Bp Cc )Z(Âp + Bp Cc )T + (−Q + Âp QÂTp + D̂p D̂Tp ). (6.56)

Using the dual characterization of the H2 norm, we see that there exist Z > 0 and Ω > 0 satisfying (6.55), (6.56) and tr(Ω) < γ if and only if there exists P > 0 such that

P > (Âp + Bp Cc )T P(Âp + Bp Cc ) + (Ĉp + By Cc )T (Ĉp + By Cc ), (6.57)

γ > tr [ P(−Q + Âp QÂTp + D̂p D̂Tp ) ] + tr(Ĉp QĈTp + D̂y D̂Ty ). (6.58)

Noting that

Âp + Bp Cc = Ap + Bp Ĉc , Ĉp + By Cc = Cp + By Ĉc , Ĉc = Cc + Dc Mp ,

inequality (6.57) can be written

    [ Ĉc + (BpT P Bp + R)−1 BpT P Ap ]T (BpT P Bp + R) [ Ĉc + (BpT P Bp + R)−1 BpT P Ap ]
        − P + ApT P Ap − ApT P Bp(BpT P Bp + R)−1 BpT P Ap + CpT Cp < 0.

Hence, Cc satisfying (6.57) exists if and only if

P > ATp PAp − ATp PBp (BTp PBp + R)−1 BTp PAp + CTp Cp (6.59)

holds, in which case, one such Cc is given by

Cc = Ĉc − Dc Mp , Ĉc = −(BTp PBp + R)−1 BTp PAp .

Now, since the matrix multiplying P on the right in (6.58) is positive (semi)definite due to (6.54), a smaller P yields better performance (a smaller γ). The smallest P satisfying (6.59) is given by the stabilizing solution to the Riccati equation (6.53) [108].
Finally, we find an optimal Dc. Note that (6.58) can be written

    γ > ‖ [ P1/2  0 ] [ Âp  D̂p ] [ Q1/2  0 ] ‖F2  −  tr(PQ)
          [ 0     I ] [ Ĉp  D̂y ] [ 0     I ]

where ‖·‖F is the Frobenius norm. Also note that the matrix inside ‖·‖F can be written

    [ P1/2 Ap Q1/2    P1/2 Dp ]   [ P1/2 Bp ]
    [ Cp Q1/2         Dy      ] + [ By      ] Dc [ Mp Q1/2   Dz ].

Hence, applying Theorem 2.3.2, the matrix Dc that minimizes the above Frobenius norm is given by

    Dc = − [ P1/2 Bp ]+ [ P1/2 Ap Q1/2    P1/2 Dp ] [ Mp Q1/2   Dz ]+ .
           [ By      ]  [ Cp Q1/2         Dy      ]

After some manipulation this formula leads to the formula for Dc in Theorem 6.3.5. 2
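A minimal numerical sketch of Theorem 6.3.5's Riccati computations is given below (Python/numpy; the sampled double-integrator data are invented, and the two discrete Riccati equations are solved by simple fixed-point iteration, which converges here since the plant is stabilizable and detectable):

```python
import numpy as np

# Invented example: sampled double integrator (dt = 0.1).
Ap = np.array([[1.0, 0.1], [0.0, 1.0]])
Bp = np.array([[0.005], [0.1]])
Mp = np.array([[1.0, 0.0]])
CtC = np.eye(2)            # Cp^T Cp
DtD = 0.01 * np.eye(2)     # Dp Dp^T
R = np.array([[1.0]])      # By^T By
V = np.array([[0.1]])      # Dz Dz^T

# Fixed-point iteration of the Riccati equations (6.53) and (6.54).
P = np.eye(2)
Q = np.eye(2)
for _ in range(5000):
    P = Ap.T @ P @ Ap - Ap.T @ P @ Bp @ np.linalg.solve(
        Bp.T @ P @ Bp + R, Bp.T @ P @ Ap) + CtC
    Q = Ap @ Q @ Ap.T - Ap @ Q @ Mp.T @ np.linalg.solve(
        Mp @ Q @ Mp.T + V, Mp @ Q @ Ap.T) + DtD

# Stabilizing property: both closed-loop maps have spectral radius < 1.
F = Ap - Bp @ np.linalg.solve(Bp.T @ P @ Bp + R, Bp.T @ P @ Ap)  # regulator
L = Ap @ Q @ Mp.T @ np.linalg.inv(Mp @ Q @ Mp.T + V)             # filter gain
H = Ap - L @ Mp
assert max(abs(np.linalg.eigvals(F))) < 1.0
assert max(abs(np.linalg.eigvals(H))) < 1.0
```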

Chapter 6 Closure
This chapter repeats the control design question of Chapter 5 except that the equality constraints
of the covariance equations in Chapter 5 are replaced by inequality constraints in Chapter 6.
This provides controllers that guarantee upper bounds on the actual covariance assigned by the
controller. As shown later, the results obtained from this upper bound approach are computable
via convex programming when state feedback is used, or when dynamic controllers of order equal to
the plant are used. In all other cases the computation is nonconvex, and no algorithm is available
which guarantees a solution when one exists. However, Chapters 10 and 11 will provide useful
algorithms to obtain fixed-order controllers, even though the problems are not convex. All the
results in this chapter are essentially from [56]. See [110, 164] for related results.
Chapter 7

H∞ Controllers

In classical control theory the peak of the frequency response of a closed-loop system can be
determined from the intersection of the Nyquist diagram and the “M circles” (representing constant
closed-loop magnitudes). The pioneering work of [161] allowed these ideas to be extended to
MIMO systems, where analytical tests replaced the graphical work of Nyquist. The peak of the
frequency response had other interpretations as well, generating fundamental results in robust
control theory. This chapter characterizes all controllers which can yield specified upper bounds on
the peak frequency response.

7.1 H∞ Control Problem


Consider the linear time-invariant systems given in Sections 4.2 and 4.5. The closed-loop transfer matrix from disturbance w to regulated output y is given by T(α) = Ccℓ(αI − Acℓ)−1 Bcℓ + Dcℓ, where α = s in the continuous-time case and α = z in the discrete-time case. Recall that, if the transfer matrix T is stable, then its H∞ norm is defined by

    ‖T‖H∞ = sup_{ω∈R} ‖T(jω)‖          (continuous-time case),
    ‖T‖H∞ = sup_{θ∈[0,2π]} ‖T(ejθ)‖    (discrete-time case).

Analysis results in Section 4.6 show that the H∞ norm can be interpreted in the following two
ways. One is a measure for the disturbance attenuation level. In particular, the energy-to-energy
gain from w to y is exactly given by kTkH∞ . The other is a measure for robustness. Specifically,
the closed-loop system remains stable for all perturbations w = ∆y such that k∆k ≤ γ −1 if
kTkH∞ < γ. These properties associated with the H∞ norm motivate us to consider the following
H∞ control problem;

Let a performance bound γ > 0 be given. Determine if there exists a controller which
stabilizes the system and yields the closed-loop transfer matrix such that kTk∞ < γ.
Find all such controllers when one exists.


The following lemmas are useful for solving the above control problem. Both of these are just
restatements of the analysis results in Section 4.6 and hence proofs are omitted.

Lemma 7.1.1 [151, 162] Let a scalar γ > 0 be given and consider the linear time-invariant
continuous-time system (4.14). The following statements are equivalent.

(i) The controller G stabilizes the system and yields the closed-loop transfer matrix T(s) such
that kTkH∞ < γ.

(ii) R = γ2 I − DcℓT Dcℓ > 0 and there exists a matrix Y > 0 such that

    Y Acℓ + AcℓT Y + (Y Bcℓ + CcℓT Dcℓ)R−1(Y Bcℓ + CcℓT Dcℓ)T + CcℓT Ccℓ < 0.

(iii) There exists a matrix Y > 0 such that

    [ Y Acℓ + AcℓT Y    Y Bcℓ    CcℓT ]
    [ BcℓT Y            −γ2 I    DcℓT ]  < 0.
    [ Ccℓ               Dcℓ      −I   ]
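A well-known computational companion of Lemma 7.1.1, stated here as background rather than quoted from the text: for a stable Acℓ and Dcℓ = 0, ‖T‖H∞ < γ holds if and only if the Hamiltonian matrix [[A, B BT/γ²], [−CT C, −AT]] has no purely imaginary eigenvalues. A quick check on the invented system T(s) = 1/(s² + 0.2s + 1), whose peak is about 5.025:

```python
import numpy as np

A = np.array([[0.0, 1.0], [-1.0, -0.2]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

def min_abs_real(gamma):
    """Smallest |Re(lambda)| over eigenvalues of the gamma-Hamiltonian."""
    H = np.block([[A, (B @ B.T) / gamma ** 2],
                  [-C.T @ C, -A.T]])
    return min(abs(np.linalg.eigvals(H).real))

# gamma below the peak: eigenvalues land on the imaginary axis.
assert min_abs_real(4.0) < 1e-6
# gamma above the peak: eigenvalues stay off the axis.
assert min_abs_real(6.0) > 1e-2
```

This dichotomy is the basis of the standard bisection algorithm for computing the H∞ norm.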

Lemma 7.1.2 [32, 97] Let a scalar γ > 0 be given and consider the linear time-invariant discrete-
time system (4.32). The following statements are equivalent.

(i) The controller G stabilizes the system and yields the closed-loop transfer matrix T(z) such
that kTkH∞ < γ.

(ii) There exists a matrix X > 0 such that

    R = γ2 I − Ccℓ X CcℓT − Dcℓ DcℓT > 0,
    X > Acℓ X AcℓT + (Acℓ X CcℓT + Bcℓ DcℓT)R−1(Acℓ X CcℓT + Bcℓ DcℓT)T + Bcℓ BcℓT.

(iii) There exists a matrix X > 0 such that

    [ X  0    ]   [ Acℓ  Bcℓ ] [ X  0 ] [ Acℓ  Bcℓ ]T
    [ 0  γ2 I ] > [ Ccℓ  Dcℓ ] [ 0  I ] [ Ccℓ  Dcℓ ]   .

In view of these lemmas, the H∞ control problem is equivalent to an algebraic problem of solving a certain matrix inequality for the controller parameter G (recall that the closed-loop matrices Acℓ, Bcℓ, etc. are functions of G). In this case, solvability conditions are given in terms of qualifications on the Lyapunov matrix X or Y, which in turn define existence conditions for an H∞ controller. All H∞ controllers will be given explicitly as the general solution to the matrix inequality.
Throughout this chapter, we shall assume that there is no redundant actuator (BpT Bp > 0) and no redundant sensor (Mp MpT > 0), to facilitate the presentation. These assumptions reflect reasonable practical situations, and can easily be removed at the expense of slightly more complicated controller formulas.

7.2 Continuous-Time Case


7.2.1 State Feedback
In this section, we consider the case where all the state variables can be measured without noise:

    Mp = I,     Dz = 0.        (7.1)

We shall introduce the following simplifying assumptions:

    Dy = 0,     ByT [ Cp  By ] = [ 0  I ].        (7.2)

To provide another interpretation of this problem, define

    J = ∫0∞ yT(t)y(t) dt = ∫0∞ ( xpT(t) Q xp(t) + uT(t) R u(t) ) dt,

where Q = CpT Cp and R = I. Recall that the H∞ norm constraint ‖T‖H∞ < γ on the closed-loop transfer matrix is equivalent to guaranteeing J < γ2 for all disturbances w whose L2 norm is bounded above by 1. Note also that assumption (7.2) implies that the control input is fully penalized (R > 0), that there is no penalty on the disturbance w, and that there is no cross-weighting term between the state and the control input in the cost function. This cost function J is identical to that of the conventional Linear Quadratic Regulator (LQR) problem [1]. The difference is that the excitation w is impulsive (or equivalently, there are nonzero initial states with no external disturbances), which makes J1/2 the H2 norm of T(s).
In fact, the assumption (7.2) is only as restrictive as the assumption

D_y = 0,   U = B_y^T B_y > 0, (7.3)

since the H∞ control problem for plants with the property (7.3) can be converted to that for plants
with the property (7.2). This can be verified as follows. First note that

||y||² = (C_p x_p + B_y u)^T (C_p x_p + B_y u)
      = x_p^T C_p^T (I − B_y B_y^+) C_p x_p + (u + B_y^+ C_p x_p)^T U (u + B_y^+ C_p x_p).

Motivated by the above equality, let us define

B̂_y ≜ B_y U^{-1/2},   Ĉ_p ≜ (I − B_y B_y^+)^{1/2} C_p,

û ≜ U^{1/2} (u + B_y^+ C_p x_p),   ŷ ≜ Ĉ_p x_p + B̂_y û.

Then we see that ||y|| = ||ŷ||, and hence the original H∞ control problem can be converted to that for the following new system:

ẋp = Âp xp + B̂p û + Dp w, (7.4)


ŷ = Ĉp xp + B̂y û,

where
Â_p ≜ A_p − B_p B_y^+ C_p,   B̂_p ≜ B_p U^{-1/2}.

In this case, the control input for the original plant can be determined by

u = U^{-1/2} û − B_y^+ C_p x_p. (7.5)

Note that the new system (7.4) satisfies the assumption (7.2). Finally, it should be clear that the above transformation may not be possible for the more general output feedback case, since the controller implementation (7.5) requires knowledge of the quantity B_y^+ C_p x_p.
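The norm-preserving property of this transformation is easy to verify numerically. The following sketch (our own illustration; the matrices C_p, B_y and the sample vectors are arbitrary) checks that ||y|| = ||ŷ||:

```python
import numpy as np

rng = np.random.default_rng(0)

def sym_sqrt(M):
    """Symmetric square root of a symmetric PSD matrix."""
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.sqrt(np.clip(w, 0.0, None))) @ V.T

# Arbitrary data with By of full column rank (assumption (7.3)).
Cp = rng.standard_normal((3, 2))
By = rng.standard_normal((3, 2))
xp = rng.standard_normal(2)
u = rng.standard_normal(2)

U = By.T @ By                       # U = By^T By > 0
Byp = np.linalg.pinv(By)            # By^+
Proj = np.eye(3) - By @ Byp         # I - By By^+ (orthogonal projection)

# Transformed quantities from the text.
By_hat = By @ np.linalg.inv(sym_sqrt(U))    # By U^{-1/2}
Cp_hat = sym_sqrt(Proj) @ Cp                # (I - By By^+)^{1/2} Cp
u_hat = sym_sqrt(U) @ (u + Byp @ Cp @ xp)   # U^{1/2}(u + By^+ Cp xp)

y = Cp @ xp + By @ u
y_hat = Cp_hat @ xp + By_hat @ u_hat
assert np.isclose(np.linalg.norm(y), np.linalg.norm(y_hat))
```

Since I − B_y B_y^+ is an orthogonal projection, its square root equals itself; the eigendecomposition above is used only for generality.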
The following result provides all state feedback H∞ controllers with the above assumptions on
the plant.

Theorem 7.2.1 Let a scalar γ > 0 be given. Consider the system (4.14) and suppose the assumptions (7.1) and (7.2) hold. Then the following statements are equivalent.
(i) There exists a (static) state feedback controller u = G x_p which stabilizes the system and yields ||T||∞ < γ.

(ii) There exists a matrix X > 0 such that


" #
Ap X + XATp + Dp DTp − γ 2 Bp BTp XCTp
< 0.
Cp X −γ 2 I

(iii) There exists a matrix Q > 0 such that the Riccati equation

Y A_p + A_p^T Y + Y ((1/γ²) D_p D_p^T − B_p B_p^T) Y + C_p^T C_p + Q = 0

has a solution Y > 0.
In this case, all such state feedback controllers are given by

G = −B_p^T Y + L Q^{1/2}

where L is an arbitrary matrix such that ||L|| < 1.

Proof. From Lemma 7.1.1, a given state feedback gain G stabilizes the system and yields ||T||∞ < γ if and only if there exists a matrix Y > 0 such that

Y (A_p + B_p G) + (A_p + B_p G)^T Y + (1/γ²) Y D_p D_p^T Y + (C_p + B_y G)^T (C_p + B_y G) < 0.
After expanding, completing the square with respect to G yields

(G + B_p^T Y)^T (G + B_p^T Y) < Q

where
Q ≜ −(Y A_p + A_p^T Y + Y ((1/γ²) D_p D_p^T − B_p B_p^T) Y + C_p^T C_p).
Then statement (iii) and the controller formula follow from the Schur complement formula and
Corollary 2.3.7. The LMI in statement (ii) is related to γ² Y^{-1} Q Y^{-1} > 0 with X = γ² Y^{-1} by the Schur complement formula. □
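For a scalar plant, the Riccati equation in statement (iii) reduces to a quadratic in Y, and the whole design can be carried out by hand. The sketch below (our own illustration; the plant data and the choices γ = 2, Q = 0.1, L = 0 are arbitrary) builds the gain G and confirms the closed-loop H∞ bound by a frequency sweep:

```python
import numpy as np

# Scalar plant: xdot = a x + b u + d w, regulated output y = [c x; u],
# matching assumptions (7.1) and (7.2).
a, b, c, d = 0.0, 1.0, 1.0, 1.0
gamma, q = 2.0, 0.1

# Statement (iii) for scalars: 2 a y + (d^2/gamma^2 - b^2) y^2 + c^2 + q = 0.
coef = d**2 / gamma**2 - b**2          # must be negative for a root y > 0
y = np.sqrt(-(c**2 + q) / coef)        # positive root (a = 0 here)
g = -b * y                             # state feedback u = g x (choosing L = 0)

acl = a + b * g
assert acl < 0                         # closed loop is stable

# Closed-loop gain |T(jw)| = sqrt(c^2 + g^2) |d| / |jw - acl|; sweep it.
ws = np.linspace(0.0, 50.0, 2000)
gains = np.sqrt(c**2 + g**2) * np.abs(d) / np.abs(1j * ws - acl)
assert gains.max() < gamma             # H-infinity bound is met
```

With these numbers the achieved peak gain is about 1.30, comfortably below the target γ = 2.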



7.2.2 Static Output Feedback


This section considers the static output feedback case. Let us introduce the following assumption on the plant:

D_z [D_p^T  D_z^T] = [0  I]. (7.6)

This assumption restricts how the disturbance enters the system: D_z D_p^T = 0 means that the process and measurement disturbances are independent (or, in the stochastic setting, uncorrelated), and D_z D_z^T = I basically means that all the measurements are contaminated by the disturbance. The following result presents a solution to the H∞ control problem with static output feedback controllers.

Theorem 7.2.2 Let a scalar γ > 0 be given and consider the system (4.14). Suppose the assumptions (7.2) and (7.6) hold. Then the following statements are equivalent.

(i) There exists a static output feedback controller u = G z which stabilizes the system and yields ||T||_H∞ < γ.

(ii) There exists a matrix pair (X,Y) such that

X > 0,  Y > 0,  XY = γ² I,

A_p X + X A_p^T + X ((1/γ²) C_p^T C_p − M_p^T M_p) X + D_p D_p^T < 0, (7.7)

Y A_p + A_p^T Y + Y ((1/γ²) D_p D_p^T − B_p B_p^T) Y + C_p^T C_p < 0. (7.8)
(iii) There exists a matrix pair (X,Y) such that

X > 0,  Y > 0,  XY = γ² I,

[A_p X + X A_p^T + D_p D_p^T − γ² B_p B_p^T,  X C_p^T;  C_p X,  −γ² I] < 0, (7.9)

[Y A_p + A_p^T Y + C_p^T C_p − γ² M_p^T M_p,  Y D_p;  D_p^T Y,  −γ² I] < 0. (7.10)

In this case, all such controllers are given by

G = −γ² B_p^T Y P^{-1} M_p^T + V^{1/2} L W^{1/2} (7.11)

where L is an arbitrary matrix such that ||L|| < 1 and

V = I − B_p^T Y P^{-1} Y B_p > 0,
W = γ² I − γ⁴ M_p P^{-1} M_p^T > 0,
P ≜ −[Y A_p + A_p^T Y + Y ((1/γ²) D_p D_p^T − B_p B_p^T) Y + C_p^T C_p − γ² M_p^T M_p] > 0.

Proof. Recall that, from Lemma 7.1.1, a given controller G stabilizes the system and yields ||T||∞ < γ if and only if R > 0 and there exists Y > 0 satisfying Q < 0, where

R = γ² I − D_cl^T D_cl,
Q = Y A_cl + A_cl^T Y + (Y B_cl + C_cl^T D_cl) R^{-1} (Y B_cl + C_cl^T D_cl)^T + C_cl^T C_cl.

Using the matrix inversion lemma, after some manipulations, Q can be rewritten as

Q = [Y B_p  γ² M_p^T] [I, −G; −G^T, γ² I]^{-1} [B_p^T Y; γ² M_p] − P

where P is defined in Theorem 7.2.2. Using the Schur complement formula, it can be verified that R > 0 if and only if

[I, −G; −G^T, γ² I] > 0.
Hence, Q < 0 and R > 0 are equivalent to P > 0 and

[I, −G; −G^T, γ² I] > [B_p^T Y; γ² M_p] P^{-1} [Y B_p  γ² M_p^T].

Then, from Corollary 2.3.7, there exists a matrix G satisfying the above inequality if and only if V > 0 and W > 0, in which case all such G are given by (7.11). Note that P > 0 and V > 0 are equivalent to

P > Y B_p B_p^T Y,

which leads to (7.7) with X = γ² Y^{-1}. Similarly, P > 0 and W > 0 are equivalent to (7.8). Finally, multiplying (7.7) by γ X^{-1} from the left and right, we have

Y A_p + A_p^T Y + (1/γ²) Y D_p D_p^T Y + C_p^T C_p − γ² M_p^T M_p < 0

where Y = γ² X^{-1}. Then (7.10) follows from the Schur complement formula. A similar manipulation shows (7.8) ⇔ (7.9) for X = γ² Y^{-1}. This completes the proof. □

Exercise 7.2.1 Provide a step-by-step derivation of Theorem 7.2.2 following the guideline given above.

7.2.3 Dynamic Output Feedback


In this section, we shall present a solution to the H∞ control problem with (possibly) dynamic output feedback controllers. Recall that our derivations are based on the bounded real inequality given in Lemma 7.1.1. Since the closed-loop matrices for the (fixed-order) dynamic output feedback controller have exactly the same structure as those for the static output feedback controller (see Section 4.2), the mathematical problems of solving the bounded real inequalities for the two cases are essentially the same. However, for the dynamic output feedback case, the assumptions corresponding to (7.2) and (7.6), e.g., H^T H > 0 and E E^T > 0, cannot be made, since the matrix H (E) is not of full column (row) rank even if B_y (D_z) is of full column (row) rank, where F, H, E are defined in (4.17). Hence, the static result in the previous section cannot be specialized to the dynamic case. In the sequel, we shall impose no assumptions on the matrices E, F and H.

Theorem 7.2.3 Let a scalar γ > 0 be given and consider the system (4.14). The following statements are equivalent.

(i) There exists a controller of order n_c which stabilizes the system and yields ||T||_H∞ < γ.

(ii) There exists a matrix pair (X, Y) ∈ R^{(np+nc)×(np+nc)} × R^{(np+nc)×(np+nc)} such that

X > 0,  Y > 0,  XY = γ² I,

[B; H]^⊥ [A X + X A^T + D D^T,  X C^T + D F^T;  C X + F D^T,  F F^T − γ² I] [B; H]^{⊥T} < 0, (7.12)

[M^T; E^T]^⊥ [Y A + A^T Y + C^T C,  Y D + C^T F;  D^T Y + F^T C,  F^T F − γ² I] [M^T; E^T]^{⊥T} < 0. (7.13)

(iii) There exists a matrix pair (X_p, Y_p) ∈ R^{np×np} × R^{np×np} such that

[X_p, γI; γI, Y_p] ≥ 0,   rank [X_p, γI; γI, Y_p] ≤ n_p + n_c, (7.14)

[B_p; B_y]^⊥ [A_p X_p + X_p A_p^T + D_p D_p^T,  X_p C_p^T + D_p D_y^T;  C_p X_p + D_y D_p^T,  D_y D_y^T − γ² I] [B_p; B_y]^{⊥T} < 0, (7.15)

[M_p^T; D_z^T]^⊥ [Y_p A_p + A_p^T Y_p + C_p^T C_p,  Y_p D_p + C_p^T D_y;  D_p^T Y_p + D_y^T C_p,  D_y^T D_y − γ² I] [M_p^T; D_z^T]^{⊥T} < 0. (7.16)

In this case, all such controllers are given by

G = −R^{-1} Γ^T Φ Λ^T (Λ Φ Λ^T)^{-1} + S^{1/2} L (Λ Φ Λ^T)^{-1/2} (7.17)

where L is an arbitrary matrix such that ||L|| < 1, R is an arbitrary positive definite matrix such that

Φ = (Γ R^{-1} Γ^T − Θ)^{-1} > 0,

and

S = R^{-1} − R^{-1} Γ^T [Φ − Φ Λ^T (Λ Φ Λ^T)^{-1} Λ Φ] Γ R^{-1},

Θ ≜ [Y A + A^T Y,  Y D,  C^T;  D^T Y,  −γ² I,  F^T;  C,  F,  −I],   Γ ≜ [Y B; 0; H],   Λ ≜ [M  E  0].
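The existence conditions above are phrased in terms of left annihilators such as [B; H]^⊥. A standard way to compute one numerically (our own sketch; the matrix M below is an arbitrary stand-in for [B; H]) is via the singular value decomposition:

```python
import numpy as np

def left_annihilator(M, tol=1e-10):
    """Return N with rows spanning the left null space of M,
    so that N @ M = 0 and N has full row rank."""
    U, s, Vt = np.linalg.svd(M)
    r = int(np.sum(s > tol))       # numerical rank of M
    return U[:, r:].T              # directions orthogonal to range(M)

rng = np.random.default_rng(1)
M = rng.standard_normal((5, 2))    # tall matrix, full column rank
N = left_annihilator(M)

assert N.shape == (3, 5)
assert np.allclose(N @ M, 0.0)
assert np.linalg.matrix_rank(N) == 3
```

Any nonsingular row transformation of N is an equally valid annihilator; the SVD choice additionally gives orthonormal rows.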

Proof. Using the definitions for the closed-loop matrices given in (4.16), it is easy to verify that
the matrix inequality in statement (iii) of Lemma 7.1.1 can be written as

Γ G Λ + (Γ G Λ)^T + Θ < 0. (7.18)

Then the result follows from Theorem 2.3.12, where we note that

Γ^⊥ = [[B; H]^⊥, 0; 0, I] [Y^{-1}, 0, 0; 0, 0, I; 0, I, 0],

Λ^{T⊥} = [[M^T; E^T]^⊥, 0; 0, I],
and use the Schur complement formula to reduce the dimensions of the LMIs describing the existence conditions. Then, multiplying the reduced-order LMI for Γ^⊥ Θ Γ^{⊥T} < 0 by γ², and defining X = γ² Y^{-1}, we have the equivalence (i) ⇔ (ii). To prove (ii) ⇔ (iii), first suppose (ii) holds. Then, defining partitioned blocks of X and Y by

X = [X_p, X_pc; X_pc^T, X_c],   Y = [Y_p, Y_pc; Y_pc^T, Y_c],

and exploiting the structure of the augmented matrices defined in (4.17), and using

[B; H]^⊥ = [B_p, 0; 0, I_{nc}; B_y, 0]^⊥ = [[B_p; B_y]^⊥, 0] [I_{np}, 0, 0; 0, 0, I_{nz}; 0, I_{nc}, 0],

[M^T; E^T]^⊥ = [M_p^T, 0; 0, I_{nc}; D_z^T, 0]^⊥ = [[M_p^T; D_z^T]^⊥, 0] [I_{np}, 0, 0; 0, 0, I_{nw}; 0, I_{nc}, 0],
we see that (7.12) ⇒ (7.15) and (7.13) ⇒ (7.16). Note that X > 0 and Y > 0 imply X_p > 0 and Y_p > 0, and the 11-block of X = γ² Y^{-1} is given by

X_p = γ² (Y_p − Y_pc Y_c^{-1} Y_pc^T)^{-1}.

Hence

Y_p − γ² X_p^{-1} = Y_pc Y_c^{-1} Y_pc^T ≥ 0, (7.19)

or equivalently, using Lemma A.2.1,

[X_p, γI; γI, Y_p] ≥ 0.

Moreover, we have

rank [X_p, γI; γI, Y_p] = rank ([I, 0; −γ X_p^{-1}, I] [X_p, γI; γI, Y_p] [I, −γ X_p^{-1}; 0, I])
                        = rank [X_p, 0; 0, Y_p − γ² X_p^{-1}]
                        = rank(Y_p − γ² X_p^{-1}) + rank(X_p).

Noting that

rank(X_p) = n_p,   rank(Y_p − γ² X_p^{-1}) = rank(Y_pc Y_c^{-1} Y_pc^T) ≤ n_c,

we have (ii) ⇒ (iii). Conversely, if (iii) holds, then defining Y_pc and Y_c > 0 by any matrices satisfying (7.19), it can be verified that the matrix pair (X, Y) such that

Y = [Y_p, Y_pc; Y_pc^T, Y_c],   X = γ² Y^{-1} (7.20)

satisfies the conditions in statement (ii). This completes the proof. □

To design an H∞ controller, one must find a matrix pair (X, Y) satisfying the conditions in statement (ii) of Theorem 7.2.3. In this case, the controller order is fixed by the dimension of X. As we see in Chapters 11 and 12, it is extremely difficult to find such a matrix pair directly using the conditions in statement (ii). In view of the above proof, another way to construct (X, Y) is the following. First find a matrix pair (X_p, Y_p) in statement (iii). Then compute a matrix factor Y_pc and Y_c > 0 such that

Y_pc Y_c^{-1} Y_pc^T = Y_p − γ² X_p^{-1}, (7.21)

and construct Y and X as in (7.20). Note that it is possible not to fix the controller order a priori by removing the rank condition in statement (iii). In this case, a matrix pair (X_p, Y_p) can be found by convex programming¹, and the resulting controller order can be chosen to be any integer such that

n_c ≥ rank(Y_p − γ² X_p^{-1}).

In particular, the controller order is minimal (equality holds in the above inequality) for a given pair (X_p, Y_p) if the matrix factor in (7.21) is chosen so that Y_pc has full column rank. In this case, we have

n_c = rank(Y_p − γ² X_p^{-1}) ≤ n_p

where the inequality holds due to the dimension constraint. Thus we have the following.
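The factorization step (7.21) is straightforward to carry out numerically. The sketch below (our own illustration) uses the (X_p, Y_p) pair that appears in Example 7.2.1 later in this section, detects the minimal controller order from the numerical rank, and assembles Y and X; the choice Y_c = I is one of the admissible free choices:

```python
import numpy as np

gamma = 3.0
# The (Xp, Yp) pair found in Example 7.2.1 later in this section.
Xp = np.array([[6.810, -3.622], [-3.622, 6.440]])
Yp = np.array([[6.440, -3.622], [-3.622, 6.810]])

Delta = Yp - gamma**2 * np.linalg.inv(Xp)   # Yp - gamma^2 Xp^{-1} >= 0
w, V = np.linalg.eigh(Delta)                # ascending eigenvalues
nc = int(np.sum(w > 1e-2))                  # numerical rank = controller order
assert nc == 1                              # a first-order controller suffices

# Factor Delta = Ypc Yc^{-1} Ypc^T; take Yc = I (this freedom only amounts
# to a coordinate change in the controller).
Yc = np.eye(nc)
Ypc = V[:, -nc:] * np.sqrt(w[-nc:])

Y = np.block([[Yp, Ypc], [Ypc.T, Yc]])
X = gamma**2 * np.linalg.inv(Y)

assert np.all(np.linalg.eigvalsh(Y) > 0)    # Y > 0, hence X > 0
assert np.linalg.norm(Ypc @ np.linalg.inv(Yc) @ Ypc.T - Delta) < 0.05
```

The small residual in the last assertion reflects that Δ is only numerically rank one for these rounded data.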

Corollary 7.2.1 Suppose there exists a stabilizing dynamic controller of some order (possibly larger than the plant order n_p) such that ||T||_H∞ < γ. Then there exists a stabilizing dynamic controller of order n_p such that ||T||_H∞ < γ.

¹ See Chapters 10 and 11.

A similar statement holds for the state feedback case, as follows. Suppose all the state variables are available for feedback, i.e., M_p = I and D_z = 0. Then the LMI (7.16) reduces to ||D_y|| < γ. Hence, for any X_p > 0 satisfying (7.15), we can always choose Y_p to be γ² X_p^{-1} to satisfy the conditions in statement (iii). In this case, the controller order can be chosen as

n_c = rank(Y_p − γ² X_p^{-1}) = 0.

Thus we have the following.

Corollary 7.2.2 Suppose all the state variables are measured without noise, i.e., M_p = I and D_z = 0. Suppose further that there exists a (possibly dynamic) controller which stabilizes the system and yields ||T||_H∞ < γ. Then there exists a stabilizing static state feedback controller u = G x_p such that ||T||_H∞ < γ.

In other words, for the state feedback case, the (optimal) H∞ performance cannot be improved by increasing the controller order; an optimal H∞ state feedback controller can always be chosen to be a static feedback gain.

Once we find a pair (X_p, Y_p), we have the following controller design freedoms: the factorization into Y_pc and Y_c, the positive definite matrix R > 0, and the matrix L such that ||L|| < 1. The freedom in the choice of Y_pc and Y_c contributes only to the controller coordinate transformation. The positive definite matrix R can be restricted to have the structure R = µI, where µ is a real number, without loss of generality (see the proof of Theorem 2.3.12). It can be shown by Theorem 2.3.10 that the admissible interval of the scalar µ is 0 < µ < µ_max, where µ_max is a positive scalar (computable without iteration) or infinity. Hence, the essential controller design freedoms beyond the choice of (X_p, Y_p) are L with ||L|| < 1 and µ > 0. We shall discuss the use of the freedom due to (X_p, Y_p) intensively in a later section.
Finally, we shall show how our general result (Theorem 7.2.3) can be specialized with the standard assumptions (7.2) and (7.6). In this case, existence conditions are given by two Riccati inequalities and a spectral radius condition on the positive definite solutions to the Riccati inequalities. More importantly, we can eliminate the freedom R > 0 in the controller parametrization, leaving L as the only free parameter.

Corollary 7.2.3 Let γ > 0 be given and consider the system (4.14). Suppose the assumptions
(7.2) and (7.6) hold. Then the following statements are equivalent.

(i) There exists a controller of some order which stabilizes the system and yields ||T||∞ < γ.

(ii) There exists a matrix pair (X̄_p, Ȳ_p) such that

X̄_p > 0,  Ȳ_p > 0,  ρ(X̄_p Ȳ_p) ≤ γ²,

Ȳ_p A_p + A_p^T Ȳ_p + Ȳ_p ((1/γ²) D_p D_p^T − B_p B_p^T) Ȳ_p + C_p^T C_p < 0,

A_p X̄_p + X̄_p A_p^T + X̄_p ((1/γ²) C_p^T C_p − M_p^T M_p) X̄_p + D_p D_p^T < 0.

In this case, all such controllers are given by

G = G_1 + G_2 L G_3

where L is an arbitrary matrix such that ||L|| < 1 and

G_1 = (Θ_12 Θ_22^{-1} Λ_2^T − Λ_1^T) G_3²,
G_2 = (Θ_12 Θ_22^{-1} Θ_12^T − Θ_11 − G_1 G_3^{-2} G_1^T)^{1/2},
G_3 = (−Λ_2 Θ_22^{-1} Λ_2^T)^{-1/2},

[Θ_11, Θ_12; Θ_12^T, Θ_22] = [Γ^+; Γ^⊥] Θ [Γ^{+T}  Γ^{⊥T}],
[Λ_1  Λ_2] = Λ [Γ^{+T}  Γ^{⊥T}],

and Θ, Γ and Λ are defined in Theorem 7.2.3 in terms of the plant matrices and

Y ≜ [Y_p, Y_pc; Y_pc^T, Y_c] > 0,   Y_p ≜ γ² X̄_p^{-1},   Y_pc Y_c^{-1} Y_pc^T ≜ γ² X̄_p^{-1} − Ȳ_p.

Proof. With the assumptions (7.2) and (7.6), we can choose

[B_p; B_y]^⊥ = [I_{np}, −B_p B_y^+; 0, B_y^⊥],   [M_p^T; D_z^T]^⊥ = [I_{np}, −(D_z^+ M_p)^T; 0, D_z^{T⊥}],

and substitute into the LMIs (7.15) and (7.16) in statement (iii) of Theorem 7.2.3. Then, using the Schur complement formula and defining X̄_p ≜ γ² Y_p^{-1} and Ȳ_p ≜ γ² X_p^{-1}, it is easy to obtain the above Riccati inequalities. The first LMI in statement (iii) of Theorem 7.2.3 is equivalent to X_p ≥ γ² Y_p^{-1} > 0, which is equivalent to X_p > 0, Y_p > 0 and ρ(X_p^{-1} Y_p^{-1}) ≤ γ^{-2}. This proves the existence condition.
To prove the controller formula, recall that all H∞ controllers are given as the solution to the matrix inequality (7.18). The result simply follows by applying Corollary 2.3.8 to (7.18), and we only need to show that Γ^T Γ > 0 and Λ Γ^{⊥T} Γ^⊥ Λ^T > 0. To this end, note that

Γ^T = [B_p^T, 0, 0, B_y^T; 0, I_{nc}, 0, 0] [Y, 0, 0; 0, I, 0; 0, 0, I]

has linearly independent rows (Γ^T Γ > 0), since Y > 0 and B_y^T B_y > 0. Using the choice of Γ^⊥
  
Γ^⊥ = [I_{np}, 0, 0, −B_p B_y^+; 0, 0, 0, B_y^⊥; 0, 0, I_{nw}, 0] [Y^{-1}, 0, 0; 0, I, 0; 0, 0, I],

we have " #
⊥T γ −2 Mp Xp 0 Dz
ΛΓ =
γ −2 XTpc 0 0

where X_p and X_pc are the partitioned blocks of X = γ² Y^{-1}. Clearly, Λ Γ^{⊥T} has linearly independent rows (Λ Γ^{⊥T} Γ^⊥ Λ^T > 0) if and only if X_pc^T X_pc > 0, since D_z D_z^T > 0 due to (7.6). But we can assume X_pc^T X_pc > 0 without loss of generality, for the following reason. Let an H∞ controller G be given. Then there exists a matrix X > 0 such that Y = γ² X^{-1} satisfies the matrix inequality (7.18). If X_pc^T X_pc is not positive definite, let X̂_pc be the matrix given by replacing the zero singular value(s) of X_pc with ε > 0 in the singular value decomposition of X_pc. Correspondingly, define X̂ by replacing the 12-block X_pc and the 21-block X_pc^T by X̂_pc and X̂_pc^T, respectively. Then, for sufficiently small ε > 0, we have X̂_pc^T X̂_pc > 0, X̂ > 0, and X̂ satisfies (7.18). Thus, all H∞ controllers can still be captured even if we restrict our attention to the class of matrices Y > 0 with the property X_pc^T X_pc > 0 for X = γ² Y^{-1}. This completes the proof. □

Example 7.2.1 Consider the double integrator system given by

[A_p, D_p, B_p; C_p, D_y, B_y; M_p, D_z, *] =
[0, 1, 0, 0, 0;
 0, 0, 1, 0, 1;
 1, 0, 0, 0, 0;
 0, 0, 0, 0, 1;
 1, 0, 0, 1, *].

Note that this system is exactly the same as the one considered in Example 6.2.3, except for the fact that the regulated output y now takes the control input into account.

We will first design a controller that achieves an H∞ norm of the closed-loop transfer function bounded above by γ = 3. By minimizing tr(X_p + Y_p) subject to (7.15), (7.16), and the first inequality in (7.14), we have

X_p = [6.810, −3.622; −3.622, 6.440],   Y_p = [6.440, −3.622; −3.622, 6.810].

See Example 6.2.3 for a motivation of the above minimization problem. Note that rank(Y_p − γ² X_p^{-1}) = 1, and hence the H∞ performance bound γ = 3 can be achieved by a first-order controller. Using the singular value decomposition of Y_p − γ² X_p^{-1}, we have found matrices Y_pc and Y_c satisfying (7.19), and thus

Y = [Y_p, Y_pc; Y_pc^T, Y_c] = [6.440, −3.622, 0.697; −3.622, 6.810, −0.717; 0.697, −0.717, 0.107].

It can be verified that this Y and X := γ² Y^{-1} satisfy the conditions in statement (ii) of Theorem 7.2.3. Hence, a controller achieving the H∞ norm bound γ is computed from (7.17) by choosing the free parameters L = 0 and R = εI (ε > 0 sufficiently small) as follows:

G = [−2.615, −0.238; −22.814, −2.522].

To see the effect of the H∞ performance bound γ on the actual closed-loop performance, the above design procedure is repeated for γ = 3, 4, 5, yielding two additional controllers (we mention that there is no controller achieving γ ≤ 2.5). For each controller, the H∞ norm of the closed-loop transfer function and the closed-loop poles are computed. The results are summarized in Table 7.1.

Table 7.1: H∞ controllers and their closed-loop properties

γ | Achieved H∞ norm | Controller                 | Closed-loop poles
3 | 2.756            | −2.615 (s+0.448)/(s+2.522) | −0.699 ± 0.745j, −1.123
4 | 3.511            | −2.884 (s+0.775)/(s+3.900) | −0.340 ± 0.761j, −3.220
5 | 4.473            | −3.278 (s+1.198)/(s+5.999) | −0.232 ± 0.810j, −5.535

From this table, we see that there is a gap between γ and the achieved H∞ norm in every case. As γ becomes smaller, the maximum (the smallest in magnitude) of the real parts of the closed-loop eigenvalues becomes smaller, and thus the transient response becomes faster.
Figure 7.1 shows the impulse responses of the closed-loop system for each of the three controllers, where the disturbance input is taken to be w₁(t) = δ(t) and w₂(t) ≡ 0.

[Figure 7.1: Closed-loop impulse responses. Left panel: regulated output versus time; right panel: control input versus time. Solid: bound = 3; dashed: bound = 4; dotted: bound = 5.]

Note that the transient dies out faster if γ is smaller. In this case, the peak value of the control input remains small, as opposed to the result obtained in Example 6.2.3. This is not because of the difference between the performance measures (i.e., the output covariance or the H₂ norm versus the H∞ norm), but because the control input u is contained in the performance output y in this example, while it is not in Example 6.2.3.
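As a cross-check of the table entries (our own computation, not from the text), the closed loop for the γ = 3 design can be assembled from the plant data and the controller parameter G above, and its H∞ norm evaluated by a frequency sweep; the sweep range and grid are arbitrary choices:

```python
import numpy as np

# Plant data for the double integrator example (read off the block matrix).
Ap = np.array([[0.0, 1.0], [0.0, 0.0]])
Bp = np.array([[0.0], [1.0]])
Dp = np.array([[0.0, 0.0], [1.0, 0.0]])
Cp = np.array([[1.0, 0.0], [0.0, 0.0]])
By = np.array([[0.0], [1.0]])
Mp = np.array([[1.0, 0.0]])
Dz = np.array([[0.0, 1.0]])

# First-order controller for gamma = 3, read off G = [Dc, Cc; Bc, Ac].
Dc, Cc, Bc, Ac = -2.615, -0.238, -22.814, -2.522

# Closed loop: z = Mp xp + Dz w, u = Cc xc + Dc z, xc' = Ac xc + Bc z.
Acl = np.block([[Ap + Dc * Bp @ Mp, Cc * Bp],
                [Bc * Mp,           np.array([[Ac]])]])
Bcl = np.block([[Dp + Dc * Bp @ Dz],
                [Bc * Dz]])
Ccl = np.block([[Cp + Dc * By @ Mp, Cc * By]])
Dcl = Dc * By @ Dz

assert np.all(np.linalg.eigvals(Acl).real < 0)   # closed-loop stability

# H-infinity norm of T(s) = Ccl (sI - Acl)^{-1} Bcl + Dcl by a sweep.
hinf = max(np.linalg.norm(
    Ccl @ np.linalg.inv(1j * w * np.eye(3) - Acl) @ Bcl + Dcl, 2)
    for w in np.linspace(0.0, 20.0, 4000))
assert hinf < 3.0    # the design bound; the table reports about 2.756
```

The computed pole sum (trace of Acl, here −2.522) matches the tabulated closed-loop poles for γ = 3.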

7.3 Discrete-Time Case


7.3.1 State Feedback
In this section, we consider the H∞ control problem for discrete-time systems (4.31), where all the state variables are assumed to be measurable without noise, i.e., the assumption (7.1) holds. As in the continuous-time case, we shall impose the standard assumption (7.2).

Theorem 7.3.1 Let a scalar γ > 0 be given and consider the system (4.31). Suppose the assumption (7.2) holds. Then the following statements are equivalent.

(i) There exists a (static) state feedback controller u = G x_p which stabilizes the system and yields ||T||∞ < γ.

(ii) There exists a matrix X > 0 such that

X > D_p D_p^T,

[X − A_p X A_p^T − D_p D_p^T + γ² B_p B_p^T,  A_p X C_p^T;  C_p X A_p^T,  γ² I − C_p X C_p^T] > 0.

(iii) There exists a matrix Q > 0 such that the Riccati equation

Y = A_p^T Y A_p − A_p^T Y E_p (E_p^T Y E_p + J_p)^{-1} E_p^T Y A_p + C_p^T C_p + Q

has a solution Y > 0 satisfying ||D_p^T Y D_p|| < γ², where

E_p ≜ [B_p  D_p],   J_p ≜ [I_{nu}, 0; 0, −γ² I_{nw}].

In this case, all such controllers are given by

G = −(B_p^T P B_p + I)^{-1} B_p^T P A_p + (B_p^T P B_p + I)^{-1/2} L Q^{1/2}

where L is an arbitrary matrix such that ||L|| < 1 and

P = (Y^{-1} − (1/γ²) D_p D_p^T)^{-1} > 0.
Proof. From Lemma 7.1.2, a given state feedback controller G is stabilizing and yields ||T||∞ < γ if and only if there exists a matrix X > 0 such that

[X, 0; 0, γ² I] > [A_p + B_p G, D_p; C_p + B_y G, 0] [X, 0; 0, I] [A_p + B_p G, D_p; C_p + B_y G, 0]^T,

or equivalently,

[X − D_p D_p^T, 0; 0, γ² I] > [A_p + B_p G; C_p + B_y G] X [A_p^T + G^T B_p^T  C_p^T + G^T B_y^T].

Using the Schur complement formula, the above inequality and X > 0 are equivalent to

X^{-1} > [A_p^T + G^T B_p^T  C_p^T + G^T B_y^T] [X − D_p D_p^T, 0; 0, γ² I]^{-1} [A_p + B_p G; C_p + B_y G]

and P̂ = (X − D_p D_p^T)^{-1} > 0. After expanding, completing the square with respect to G yields

[G + R̂^{-1} B_p^T P̂ A_p]^T R̂ [G + R̂^{-1} B_p^T P̂ A_p] < Q̂

where

R̂ ≜ B_p^T P̂ B_p + (1/γ²) I,
Q̂ ≜ X^{-1} − A_p^T P̂ A_p + A_p^T P̂ B_p (B_p^T P̂ B_p + γ^{-2} I)^{-1} B_p^T P̂ A_p − (1/γ²) C_p^T C_p.
Then, defining P ≜ γ² P̂ and Q ≜ γ² Q̂, Corollary 2.3.6 yields the existence condition Q > 0 and the controller formula given above. Now, using the matrix inversion lemma, it can be verified that

Q = Y − A_p^T Y A_p + A_p^T Y E_p (E_p^T Y E_p + J_p)^{-1} E_p^T Y A_p − C_p^T C_p

where Y = γ² X^{-1}. Since P̂ > 0, we have

P = γ² P̂ = (Y^{-1} − (1/γ²) D_p D_p^T)^{-1} > 0,

or equivalently, γ² I − D_p^T Y D_p > 0. Thus we have (iii). To prove (ii), note that Q̂ can be rewritten as

Q̂ = X^{-1} − A_p^T (X − D_p D_p^T + γ² B_p B_p^T)^{-1} A_p − (1/γ²) C_p^T C_p > 0,
and hence X > D_p D_p^T and Q > 0 are equivalent to

[X^{-1} − γ^{-2} C_p^T C_p,  A_p^T;  A_p,  X − D_p D_p^T + γ² B_p B_p^T] > 0,   X > D_p D_p^T.

The first inequality holds if and only if

X − D_p D_p^T + γ² B_p B_p^T − A_p (X^{-1} − (1/γ²) C_p^T C_p)^{-1} A_p^T > 0, (7.22)
X^{-1} − (1/γ²) C_p^T C_p > 0. (7.23)
By the matrix inversion lemma, (7.22) is equivalent to

X − D_p D_p^T + γ² B_p B_p^T − A_p X A_p^T − A_p X C_p^T (γ² I − C_p X C_p^T)^{-1} C_p X A_p^T > 0.

By the Schur complement formula, (7.23) is equivalent to

γ² I − C_p X C_p^T > 0.

Thus, another use of the Schur complement formula yields (ii). □



7.3.2 Static Output Feedback


Consider the discrete-time system given by (4.31). This section provides all H∞ controllers for the
static output feedback case. The standard assumptions (7.2) and (7.6) are imposed throughout
this section.

Theorem 7.3.2 Let a scalar γ > 0 be given and consider the system (4.31). Suppose the assumptions (7.2) and (7.6) hold. Then the following statements are equivalent.
(i) There exists a static output feedback controller u = G z which stabilizes the system and yields ||T||_H∞ < γ.

(ii) There exists a matrix pair (X, Y) such that

X > 0,  Y > 0,  XY = γ² I, (7.24)

[X − A_p X A_p^T − D_p D_p^T + γ² B_p B_p^T,  A_p X C_p^T;  C_p X A_p^T,  γ² I − C_p X C_p^T] > 0, (7.25)

[Y − A_p^T Y A_p − C_p^T C_p + γ² M_p^T M_p,  A_p^T Y D_p;  D_p^T Y A_p,  γ² I − D_p^T Y D_p] > 0. (7.26)

(iii) There exists a matrix pair (X, Y) such that

X > 0,  Y > 0,  XY = γ² I, (7.27)
X > A_p X A_p^T − A_p X F_p^T (F_p X F_p^T + T_p)^{-1} F_p X A_p^T + D_p D_p^T, (7.28)
X^{-1} + F_p^T T_p^{-1} F_p > 0, (7.29)
Y > A_p^T Y A_p − A_p^T Y E_p (E_p^T Y E_p + J_p)^{-1} E_p^T Y A_p + C_p^T C_p, (7.30)
Y^{-1} + E_p J_p^{-1} E_p^T > 0, (7.31)

where

E_p ≜ [B_p  D_p],   J_p ≜ [I_{nu}, 0; 0, −γ² I_{nw}],
F_p ≜ [M_p; C_p],   T_p ≜ [I_{nz}, 0; 0, −γ² I_{ny}].
In this case, all such controllers are given by

G = Γ Φ^{-1} Λ^T + V^{1/2} L W^{1/2}

where L is an arbitrary matrix such that ||L|| < 1 and

Φ ≜ [X − D_p D_p^T + γ² B_p B_p^T,  A_p;  A_p^T,  X^{-1} − γ^{-2} C_p^T C_p + M_p^T M_p] > 0,
Γ ≜ [γ² B_p^T  0],   Λ ≜ [0  M_p],
V ≜ γ² I − Γ Φ^{-1} Γ^T,   W ≜ I − Λ Φ^{-1} Λ^T.

Proof. From the bounded real lemma, (i) holds if and only if there exist matrices X > 0 and G satisfying

[X, 0; 0, γ² I] > [A_cl, B_cl; C_cl, D_cl] [X, 0; 0, I] [A_cl, B_cl; C_cl, D_cl]^T, (7.32)

or equivalently, using the Schur complement formula,

Ψ ≜ [X, 0, A_cl, B_cl; 0, γ² I, C_cl, D_cl; A_cl^T, C_cl^T, X^{-1}, 0; B_cl^T, D_cl^T, 0, I] > 0.

Let us introduce a square nonsingular matrix T defined by

T ≜ [I_{np}, 0, 0, 0;
     0, 0, I_{np}, 0;
     0, B_y^T, 0, 0;
     0, 0, 0, D_z;
     0, B_y^⊥, 0, 0;
     0, 0, 0, D_z^{T⊥}]

where the left annihilators are normalized such that B_y^⊥ B_y^{⊥T} = I and D_z^{T⊥} D_z^{T⊥T} = I. Then, by the congruent transformation with T, we have T Ψ T^T > 0, or
 
[X, A_p + B_p G M_p, 0, B_p G, 0, D_p D_z^{T⊥T};
 (A_p + B_p G M_p)^T, X^{-1}, (G M_p)^T, 0, (B_y^⊥ C_p)^T, 0;
 0, G M_p, γ² I, G, 0, 0;
 (B_p G)^T, 0, G^T, I, 0, 0;
 0, B_y^⊥ C_p, 0, 0, γ² I, 0;
 D_z^{T⊥} D_p^T, 0, 0, 0, 0, I] > 0.

Using the Schur complement formula for the partitioned blocks,

[X − D_p D_p^T, A_p + B_p G M_p, 0, B_p G;
 (A_p + B_p G M_p)^T, X^{-1} − γ^{-2} C_p^T C_p, (G M_p)^T, 0;
 0, G M_p, γ² I, G;
 (B_p G)^T, 0, G^T, I] > 0

where we note that

D_p D_z^{T⊥T} D_z^{T⊥} D_p^T = D_p (I − D_z^T D_z) D_p^T = D_p D_p^T,
C_p^T B_y^{⊥T} B_y^⊥ C_p = C_p^T (I − B_y B_y^T) C_p = C_p^T C_p.

Using the Schur complement formula again for the partitioned blocks,

[X − D_p D_p^T, A_p + B_p G M_p; (A_p + B_p G M_p)^T, X^{-1} − γ^{-2} C_p^T C_p]
  > [0, B_p G; (G M_p)^T, 0] [γ² I, G; G^T, I]^{-1} [0, B_p G; (G M_p)^T, 0]^T, (7.33)

[γ² I, G; G^T, I] > 0. (7.34)
Substituting the identity

[γ² I, G; G^T, I]^{-1} = [R, −R G; −G^T R, I + G^T R G],   R = (γ² I − G G^T)^{-1},

into (7.33), and then collecting terms involving G, we have


" #
X − Dp DTp + γ 2 Bp BTp Ap
ATp −1 −2
X − γ CTp Cp + MTp Mp
" #" #−1 " #T
γ 2 Bp 0 γ2I G γ 2 Bp 0
> . (7.35)
0 MTp GT I 0 MTp
Now we see that (7.32) ⇔ (7.33)-(7.34) ⇔ (7.34)-(7.35) ⇔
" # " #
γ2I G Γ h i
> Φ−1 ΓT ΛT , Φ > 0.
GT I Λ

Then, from Corollary 2.3.7, there exists G satisfying the above inequality if and only if V > 0 and W > 0, in which case all such G are given by the formula in Theorem 7.3.2. Finally, note that V > 0 and Φ > 0 are equivalent to Φ > γ^{-2} Γ^T Γ, or

[X − D_p D_p^T, A_p; A_p^T, X^{-1} − γ^{-2} C_p^T C_p + M_p^T M_p] > 0. (7.36)

Then, by a similar procedure to the proof of Theorem 7.3.1, it is easy to verify that (7.36) ⇔ (7.28)-(7.29) ⇔ (7.26) with Y = γ² X^{-1}. Similarly, W > 0 and Φ > 0 are equivalent to

[X − D_p D_p^T + γ² B_p B_p^T, A_p; A_p^T, X^{-1} − γ^{-2} C_p^T C_p] > 0 (7.37)

and we have (7.37) ⇔ (7.30)-(7.31) ⇔ (7.25) with Y = γ² X^{-1}. This completes the proof. □

7.3.3 Dynamic Output Feedback


This section considers the H∞ control problem in a general setting, i.e., we shall not impose the
standard assumptions (7.2) and (7.6). The following theorem gives all dynamic (or static when
nc = 0) H∞ controllers of order nc .

Theorem 7.3.3 Let a scalar γ > 0 be given and consider the system (4.31). The following statements are equivalent.

(i) There exists a controller of order n_c which stabilizes the system and yields ||T||∞ < γ.

(ii) There exists a matrix pair (X, Y) ∈ R^{(np+nc)×(np+nc)} × R^{(np+nc)×(np+nc)} such that

X > 0,  Y > 0,  XY = γ² I,

[B; H]^⊥ { [X, 0; 0, γ² I] − [A, D; C, F] [X, 0; 0, I] [A, D; C, F]^T } [B; H]^{⊥T} > 0, (7.38)

[M^T; E^T]^⊥ { [Y, 0; 0, γ² I] − [A, D; C, F]^T [Y, 0; 0, I] [A, D; C, F] } [M^T; E^T]^{⊥T} > 0. (7.39)

(iii) There exists a matrix pair (X_p, Y_p) ∈ R^{np×np} × R^{np×np} such that

[X_p, γI; γI, Y_p] ≥ 0,   rank [X_p, γI; γI, Y_p] ≤ n_p + n_c,

[B_p; B_y]^⊥ { [X_p, 0; 0, γ² I] − [A_p, D_p; C_p, D_y] [X_p, 0; 0, I] [A_p, D_p; C_p, D_y]^T } [B_p; B_y]^{⊥T} > 0, (7.40)

[M_p^T; D_z^T]^⊥ { [Y_p, 0; 0, γ² I] − [A_p, D_p; C_p, D_y]^T [Y_p, 0; 0, I] [A_p, D_p; C_p, D_y] } [M_p^T; D_z^T]^{⊥T} > 0. (7.41)

In this case, all such controllers are given by

G = −(Γ^T Φ Γ)^{-1} Γ^T Φ Θ R Λ^T (Λ R Λ^T)^{-1} + (Γ^T Φ Γ)^{-1/2} L Ψ^{1/2}

where L is an arbitrary matrix such that ||L|| < 1 and

Φ = (Q − Θ R Θ^T + Θ R Λ^T (Λ R Λ^T)^{-1} Λ R Θ^T)^{-1},
Ψ = Ω − Ω Λ R Θ^T (Φ − Φ Γ (Γ^T Φ Γ)^{-1} Γ^T Φ) Θ R Λ^T Ω,
Ω = (Λ R Λ^T)^{-1},

Q ≜ [X, 0; 0, γ² I],   R ≜ [X, 0; 0, I],   Θ ≜ [A, D; C, F],
Γ ≜ [B; H],   Λ ≜ [M  E].

Proof. From Lemma 7.1.2, a given controller satisfies the conditions in statement (i) if and only if there exists a matrix X > 0 such that

(Θ + Γ G Λ) R (Θ + Γ G Λ)^T < Q.

Then statement (ii) and the controller formula directly follow from Theorem 2.3.11. Statement (iii) can be proved by a similar procedure to the proof of Theorem 7.2.3. □

Exercise 7.3.1 By specializing Theorem 7.3.3, obtain existence conditions

(a) for the state feedback case with assumptions (7.1) and (7.2), and

(b) for the static output feedback case with the assumptions (7.2) and (7.6).

Chapter 7 Closure
Explicit formulas for all H∞ controllers of any order are given in this chapter for both continuous
and discrete-time systems using state feedback or output feedback. The necessary and sufficient
existence conditions involve linear matrix inequalities with a coupling constraint. The results
presented here are essentially from [60]. Similar matrix inequality approaches may be found in
[34, 83, 119, 120, 121]. Riccati equation approaches for the state feedback case can be found in
[71, 105, 162]. Excellent, self-contained approaches include [25, 39, 141, 138, 139].
Chapter 4 shows the relationship between system gains and robust performance. The energy to
energy gain also has use in nonlinear systems, although nonlinear problems are not treated here.
Chapter 8

Model Reduction

Modeling of physical systems usually results in complex high-order models, and it is often desirable
to replace these models with simpler reduced-order models without significant error. The model
order reduction problem consists of approximating a high-order system G by a lower-order system
Ĝ according to some given criterion. In this chapter, necessary and sufficient conditions are derived
for the solution of the H∞ and the covariance bounded model reduction problems using a linear
matrix inequality formulation. These approaches are consistent with the algebraic emphasis of
this book. However, many other model reduction methods can be found in the literature, e.g., see
[21, 126, 75, 146, 22, 92, 5, 74, 40, 27, 82].

8.1 H∞ Model Reduction


The optimal H∞ model reduction seeks to provide a reduced-order model that minimizes the H∞
norm of the error between the full-order and the reduced-order model. The suboptimal H∞ model
reduction problem provides an upper bound on the H∞ norm of the error system.

8.1.1 Continuous-Time Case


Consider a stable nth-order linear time-invariant system G with the state space representation

ẋ = Ax + Bu (8.1)
y = Cx + Du (8.2)

The optimal H∞ model reduction problem is to find a stable n̂th-order system Ĝ with state space representation

x̂˙ = Âx̂ + B̂u (8.3)


ŷ = Ĉx̂ + D̂u (8.4)

where n̂ < n, such that the H∞ norm error ||G(s) − Ĝ(s)||∞ is minimized, where Ĝ(s) = Ĉ(sI − Â)^{-1} B̂ + D̂. The γ-suboptimal H∞ model reduction problem is to find a Ĝ, if one exists, such that ||G − Ĝ||∞ < γ, where γ is a given positive scalar. If n̂ = 0, the reduced-order system Ĝ is a constant matrix D̂ and the model reduction problem is called the zeroth-order H∞ approximation problem.
Appendix C describes a balanced realization and the balanced truncation method for model
reduction [91]. The balanced truncation method consists of transforming the system G to a special
realization such that the observability and controllability Gramians are equal and diagonal and
truncating the states that correspond to the smallest diagonal elements (called Hankel singular
values). This procedure provides a guaranteed twice-the-sum-of-the-tail H∞ error bound [27]

kG − Ĝk∞ ≤ γB = 2(σn̂+1 + . . . + σn ) (8.5)

where the Hankel singular values σ1 , σ2 , . . . , σn̂ , σn̂+1 , . . . , σn are ordered in descending order. The
optimal Hankel norm model reduction can be chosen to satisfy the sum-of-the-tail error bound [38]

kG − Ĝk∞ ≤ γH = σn̂+1 + . . . + σn (8.6)

The Hankel model reduction is H∞-optimal for n̂ = n − 1; however, it may be conservative in the
general case.
The following results provide necessary and sufficient conditions for the solution of the γ-
suboptimal H∞ model reduction problem in terms of LMIs, and an explicit parametrization of
all reduced-order models that correspond to a feasible solution.

Theorem 8.1.1 The following statements are equivalent:


(i) There exists an n̂th -order system Ĝ to solve the γ-suboptimal H∞ model reduction problem.

(ii) There exist matrices X > 0 and Y > 0 such that the following conditions are satisfied

    AX + XA^T + BB^T < 0                                             (8.7)

    YA + A^T Y + C^T C < 0                                           (8.8)

    [ X    γI ]
    [ γI   Y  ]  ≥ 0                                                 (8.9)

and
         [ X    γI ]
    rank [ γI   Y  ]  ≤ n + n̂ .                                     (8.10)

If these conditions are satisfied, all γ-suboptimal n̂th -order models that correspond to a feasible
matrix pair (X, Y) are given by
" #
D̂ Ĉ
= Ĝ1 + Ĝ2 LĜ3 (8.11)
B̂ Â

where L is a (p + n̂) × (m + n̂) matrix such that kLk < 1, and

    Ĝ1 = (M1 − Q12 Q22^{-1} M2^T)(M2 Q22^{-1} M2^T)^{-1}
    Ĝ2 = (−Q11 + Q12 Q22^{-1} Q12^T − Ĝ1 Ĝ3^2 Ĝ1^T)^{1/2}          (8.12)
    Ĝ3 = (−M2 Q22^{-1} M2^T)^{1/2}

where

    M1 = [ 0   0    ]         M2 = [ 0          I ]
         [ 0   Rx^2 ] ,            [ Rx Lx^T    0 ]

    Q11 = [ −γ^2 I          −C Lx Rx ]        Q12 = [ −CX            −D ]
          [ −Rx Lx^T C^T     0       ] ,            [ Rx Lx^T A^T     0 ]

    Q22 = [ AX + XA^T   B  ]
          [ B^T         −I ]                                         (8.13)

and Rx is an arbitrary positive-definite matrix and Lx is an arbitrary matrix factor such that

    Lx Lx^T = X − γ^2 Y^{-1} .                                       (8.14)

Proof. The necessary and sufficient conditions (8.7)-(8.10) can be obtained using the state
space representation of the error system G − Ĝ

x̄˙ = (Ā + B̄ḠM̄)x̄ + (D̄ + B̄ḠĒ)u (8.15)


z = (C̄ + H̄ḠM̄)x̄ + (F̄ + H̄ḠĒ)u (8.16)

where

    Ā = [ A  0 ]        B̄ = [ 0  0 ]        M̄ = [ 0  0 ]        Ē = [ I ]        D̄ = [ B ]        (8.17)
        [ 0  0 ] ,           [ 0  I ] ,          [ 0  I ] ,          [ 0 ] ,           [ 0 ]

    C̄ = [ C   0 ] ,    H̄ = [ −I   0 ] ,    F̄ = D                                                 (8.18)

    Ḡ = [ D̂   Ĉ ]
        [ B̂   Â ] ,    z = y − ŷ                                                                  (8.19)
and applying the conditions of Theorem 7.2.3 in Chapter 7, to guarantee that the transfer function
of (8.15)-(8.16) from u to z has H∞ norm less than γ. The parametrization (8.11)-(8.14) of all
reduced-order models that correspond to a feasible solution can be obtained using the corresponding
parametrization in Theorem 7.2.3. 2
Hence, the γ-suboptimal H∞ model reduction problem is characterized as a feasibility problem
of finding a pair of positive definite matrices (X, Y) in the intersection of the constraint sets (8.7),
(8.8), (8.9) and (8.10).
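For a candidate pair (X, Y), conditions (8.7)-(8.10) amount to three eigenvalue tests and one numerical rank test. A minimal sketch (my own helper function, not from the book; the sanity-check data and the tolerance choices are assumptions) is:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def hinf_reduction_feasible(A, B, C, X, Y, gamma, n_hat, tol=1e-9):
    """Test conditions (8.7)-(8.10) for the gamma-suboptimal problem."""
    n = A.shape[0]
    lmi1 = A @ X + X @ A.T + B @ B.T              # (8.7): must be < 0
    lmi2 = Y @ A + A.T @ Y + C.T @ C              # (8.8): must be < 0
    blk = np.block([[X, gamma * np.eye(n)],       # (8.9): must be >= 0
                    [gamma * np.eye(n), Y]])
    ok = (np.linalg.eigvalsh(lmi1).max() < -tol and
          np.linalg.eigvalsh(lmi2).max() < -tol and
          np.linalg.eigvalsh(blk).min() > -tol)
    rank = np.linalg.matrix_rank(blk, tol=1e-6)   # (8.10)
    return ok and rank <= n + n_hat

# Sanity check with n_hat = n (no reduction): strict Lyapunov solutions
# X and Y are feasible for a small enough gamma.
A = np.diag([-1.0, -2.0, -3.0])
B = np.array([[1.0], [1.0], [0.5]])
C = np.array([[1.0, 0.0, 1.0]])
X = solve_continuous_lyapunov(A, -(B @ B.T + np.eye(3)))
Y = solve_continuous_lyapunov(A.T, -(C.T @ C + np.eye(3)))
feasible = hinf_reduction_feasible(A, B, C, X, Y, gamma=1e-3, n_hat=3)
print(feasible)
```

Here the block matrix in (8.9) is full rank, so the same (X, Y) fails the rank condition (8.10) for any n̂ < n, as expected.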
Particularly simple necessary and sufficient conditions and a parametrization of all solutions
can be obtained for the zeroth-order H∞ approximation problem.

Theorem 8.1.2 The following statements are equivalent:

(i) There exists a constant matrix D̂ to solve the zeroth-order γ-suboptimal H∞ approximation
problem.

(ii) There exists a matrix X > 0 such that

AX + XAT + BBT < 0 (8.20)


" #
AX + XAT XCT
< 0. (8.21)
CX −γ 2 I

All zeroth-order γ-suboptimal H∞ solutions D̂ that correspond to a feasible matrix X are given by

    D̂ = D̂1 + D̂2 L D̂3                                             (8.22)

where

    D̂1 = D − CX(AX + XA^T)^{-1} B
    D̂2 = [ γ^2 I + CX(AX + XA^T)^{-1} XC^T ]^{1/2}                  (8.23)
    D̂3 = [ I + B^T (AX + XA^T)^{-1} B ]^{1/2}

and L is any p × m matrix such that kLk < 1.

Proof. For n̂ = 0, condition (8.10) of Theorem 8.1.1 is equivalent to rank(γ^2 I − XY) = 0,
i.e., XY = γ^2 I, where X > 0, Y > 0. Hence, it can easily be shown that condition (8.9) is trivially
satisfied. Multiplying (8.8) on the left and on the right by Y^{-1} = (1/γ^2)X provides

    AX + XA^T + (1/γ^2) XC^T CX < 0
which according to the Schur complement formula is equivalent to condition (8.21).
To obtain the parametrization of all zeroth-order γ-suboptimal H∞ approximations note that
for n̂ = 0 the expressions (8.12)-(8.14) become

    Ĝ1 = −Q12 Q22^{-1} M^T (M Q22^{-1} M^T)^{-1}
    Ĝ2 = (γ^2 I + Q12 Q22^{-1} Q12^T − Ĝ1 Ĝ3^{-2} Ĝ1^T)^{1/2}      (8.24)
    Ĝ3 = (−M Q22^{-1} M^T)^{-1/2}

where

    M = [ 0   I ] ,    Q12 = [ CX   D ] ,    Q22 = [ AX + XA^T   B      ]
                                                   [ B^T         −γ^2 I ] .

By defining

    Q22^{-1} = Ψ = [ Ψ11     Ψ12 ]
                   [ Ψ12^T   Ψ22 ]

we obtain after some matrix algebra

    Ĝ1 = CX Ψ12 Ψ22^{-1} + D
    Ĝ2 = [ γ^2 I + CX(Ψ11 − Ψ12 Ψ22^{-1} Ψ12^T)XC^T ]^{1/2}         (8.25)
    Ĝ3 = (−Ψ22)^{-1/2} .

Using the block matrix inverse expression for Q22^{-1} we find after several matrix algebraic
manipulations

    Ψ12 Ψ22^{-1} = −(AX + XA^T)^{-1} B
    Ψ11 − Ψ12 Ψ22^{-1} Ψ12^T = (AX + XA^T)^{-1}
    Ψ22 = −I − B^T (AX + XA^T)^{-1} B

Substituting these expressions in (8.25) we obtain (8.23). 2
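As a concrete check of Theorem 8.1.2, take the scalar system G(s) = 1/(s + 1) (A = −1, B = C = 1, D = 0). With X = 1 and γ = 1, conditions (8.20)-(8.21) hold, and (8.23) gives D̂1 = 0.5 and D̂2 = D̂3 = 1/√2, so all suboptimal approximants are D̂ = 0.5 + 0.5L with |L| < 1. The sketch below (NumPy; my own verification code, not from the text) evaluates the formulas and checks the error on a frequency grid — for this plant |G(jω) − 0.5| equals 0.5 at every frequency:

```python
import numpy as np
from scipy.linalg import sqrtm

A = np.array([[-1.0]]); B = np.array([[1.0]])
C = np.array([[1.0]]);  D = np.array([[0.0]])
X = np.array([[1.0]]);  gamma = 1.0

S = A @ X + X @ A.T                      # AX + XA^T = -2, negative definite
Si = np.linalg.inv(S)
D1 = D - C @ X @ Si @ B                  # (8.23): central solution
D2 = sqrtm(gamma**2 * np.eye(1) + C @ X @ Si @ X @ C.T).real
D3 = sqrtm(np.eye(1) + B.T @ Si @ B).real

# |G(jw) - D1| over a grid stays well below gamma = 1.
w = np.logspace(-3, 3, 2000)
err = np.abs(1.0 / (1j * w + 1.0) - D1[0, 0])
print(D1[0, 0], D2[0, 0], D3[0, 0], err.max())
```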

8.1.2 Discrete-Time Case


Next, we consider the discrete-time H∞ model reduction problem: Given a stable, nth -order
discrete-time system G described by

xk+1 = Axk + Buk (8.26)


yk = Cxk + Duk (8.27)

where G(z) = C(zI − A)−1 B + D, find a stable n̂th -order discrete-time system Ĝ

x̂k+1 = Âx̂k + B̂uk (8.28)


ŷk = Ĉx̂k + D̂uk (8.29)

such that the H∞ norm kG − Ĝk∞ is minimized, where Ĝ(z) = Ĉ(zI − Â)^{-1} B̂ + D̂. The discrete-
time γ-suboptimal H∞ model reduction and the discrete-time zeroth-order H∞ approximation
problems are defined accordingly.
The following results provide necessary and sufficient conditions for the existence of a solution
to the discrete-time γ-suboptimal H∞ model reduction problem and a state space parametrization
of all reduced-order models.

Theorem 8.1.3 The following statements are equivalent:


(i) There exists an n̂th -order system Ĝ to solve the discrete-time γ-suboptimal H∞ model re-
duction problem. (ii) There exist matrices X > 0 and Y > 0 such that the following conditions are
satisfied

X − AXAT − BBT > 0 (8.30)


Y − AT YA − CT C > 0 (8.31)
" #
X γI
≥0 (8.32)
γI Y
and " #
X γI
rank ≤ n + n̂ . (8.33)
γI Y
All γ-suboptimal n̂th -order models that correspond to a feasible matrix pair (X, Y) are given by
" #
D̂ Ĉ
= Ĝ1 + Ĝ2 LĜ3 (8.34)
B̂ Â

where L is any (p + n̂) × (m + n̂) matrix such that kLk < 1, and

    Ĝ1 = −(Γ^T Φ Γ)^{-1} Γ^T Φ Θ R Λ^T (Λ R Λ^T)^{-1}
    Ĝ2 = (Γ^T Φ Γ)^{-1/2}                                           (8.35)
    Ĝ3 = { Ω − Ω Λ R Θ^T [ Φ − Φ Γ (Γ^T Φ Γ)^{-1} Γ^T Φ ] Θ R Λ^T Ω }^{1/2}

where

    Φ = (Q − Θ R Θ^T + Θ R Λ^T (Λ R Λ^T)^{-1} Λ R Θ^T)^{-1}

    Q = [ X̄   0     ]        R = [ X̄   0 ]        Ω = [ I   0       ]
        [ 0   γ^2 I ] ,          [ 0   I ] ,          [ 0   Xc^{-1} ]

    X̄ = [ X       Xpc ]        Λ = [ 0   0   I ]
         [ Xpc^T   Xc  ] ,          [ 0   I   0 ]                    (8.36)

    Θ = [ A   0   B ]        Γ = [ 0    0 ]
        [ 0   0   0 ] ,          [ 0    I ]
        [ C   0   D ]            [ −I   0 ]

and Xpc , Xc are arbitrary matrices such that X̄ > 0.

Proof. The necessary and sufficient conditions for the solvability of the discrete-time γ-
suboptimal H∞ model reduction problem and the parametrization of all reduced-order models can
be obtained from the characterization of all solutions of the discrete-time H∞ problem via LMIs.
That is, the discrete-time equivalent of the error system (8.15), yields the same structures (8.17)-
(8.19). Then Theorem 7.3.3 gives the appropriate H∞ control solution for Ḡ, where substitutions
of the structure (8.17)-(8.19) readily verify the theorem. 2
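The discrete-time conditions (8.30)-(8.33) can be checked numerically in the same way as the continuous-time ones. A minimal sketch (my own helper, not from the book; the sanity-check system and tolerances are assumptions) is:

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def discrete_hinf_feasible(A, B, C, X, Y, gamma, n_hat, tol=1e-9):
    """Test conditions (8.30)-(8.33) of Theorem 8.1.3."""
    n = A.shape[0]
    blk = np.block([[X, gamma * np.eye(n)],
                    [gamma * np.eye(n), Y]])
    ok = (np.linalg.eigvalsh(X - A @ X @ A.T - B @ B.T).min() > tol and   # (8.30)
          np.linalg.eigvalsh(Y - A.T @ Y @ A - C.T @ C).min() > tol and   # (8.31)
          np.linalg.eigvalsh(blk).min() > -tol)                           # (8.32)
    return ok and np.linalg.matrix_rank(blk, tol=1e-6) <= n + n_hat       # (8.33)

# Full-order sanity check (n_hat = n) on a stable discrete-time system:
# strict Stein-equation solutions are feasible for small enough gamma.
A = np.array([[0.5, 0.2], [0.0, -0.4]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 1.0]])
X = solve_discrete_lyapunov(A, B @ B.T + np.eye(2))   # X = A X A^T + BB^T + I
Y = solve_discrete_lyapunov(A.T, C.T @ C + np.eye(2))
feasible = discrete_hinf_feasible(A, B, C, X, Y, gamma=1e-3, n_hat=2)
print(feasible)
```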
The conditions (8.30)-(8.33) and the parametrization of the reduced-order models (8.34)-(8.36)
can be significantly simplified in the case of the discrete-time zeroth-order γ-suboptimal H∞
approximation problem.

Theorem 8.1.4 The following statements are equivalent:


(i) There exists a constant matrix D̂ to solve the discrete-time zeroth-order γ-suboptimal H∞
approximation problem.

(ii) There exists a matrix X > 0 such that the following conditions are satisfied

X − AXAT − BBT > 0 (8.37)


" #
X− AXAT −AXCT
> 0. (8.38)
−CXAT γ 2 I − CXCT

All constant matrices D̂ that correspond to a feasible solution X are given by

D̂ = D̂1 + D̂2 LD̂3 (8.39)

where

    D̂1 = D + CXA^T (X − AXA^T)^{-1} B
    D̂2 = [ γ^2 I − CXC^T − CXA^T (X − AXA^T)^{-1} AXC^T ]^{1/2}     (8.40)
    D̂3 = [ I − B^T (X − AXA^T)^{-1} B ]^{1/2}

Proof. As in the continuous-time case, for n̂ = 0, condition (8.33) of Theorem 8.1.3 provides
XY = γ^2 I, where X > 0, Y > 0. Hence, (8.31) is equivalent to

    X^{-1} − A^T X^{-1} A − (1/γ^2) C^T C > 0

which provides, using the Schur complement formula,

    [ X^{-1}   A^T ]     [ (1/γ)C^T ]
    [ A        X   ]  >  [ 0        ] [ (1/γ)C   0 ] .

Using the Schur complement formula again we obtain

    [ X^{-1}    A^T   (1/γ)C^T ]
    [ A         X     0        ]  > 0 .
    [ (1/γ)C    0     I        ]

Using the dual formula we get

    [ X   0 ]     [ A      ]
    [ 0   I ]  >  [ (1/γ)C ] X [ A^T   (1/γ)C^T ]

which provides (8.38).


The parametrization of all constant matrices D̂ that solve the discrete-time zeroth-order H∞
approximation problem is obtained following similar steps as in the continuous-time case. 2
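A scalar check of Theorem 8.1.4: for G(z) = 1/(z − 0.5) (A = 0.5, B = C = 1, D = 0), the choice X = 2 with γ = 2 satisfies (8.37)-(8.38), and (8.40) gives the central approximant D̂1 = 2/3. On the unit circle |G(e^{jθ}) − 2/3| equals 4/3 at every frequency, comfortably below γ. The sketch below (NumPy; my own verification code, not from the text) confirms this:

```python
import numpy as np

A = np.array([[0.5]]); B = np.array([[1.0]])
C = np.array([[1.0]]); D = np.array([[0.0]])
X = np.array([[2.0]]); gamma = 2.0

S = X - A @ X @ A.T                      # X - AXA^T = 1.5 > 0, so (8.37) holds
Si = np.linalg.inv(S)
D1 = D + C @ X @ A.T @ Si @ B            # (8.40): central solution, = 2/3

# Condition (8.38) as a 2x2 eigenvalue test.
cond38 = np.block([[X - A @ X @ A.T, -A @ X @ C.T],
                   [-C @ X @ A.T, gamma**2 * np.eye(1) - C @ X @ C.T]])

# Error |G(e^{j theta}) - D1| on the unit circle.
theta = np.linspace(0.0, np.pi, 2000)
z = np.exp(1j * theta)
err = np.abs(1.0 / (z - 0.5) - D1[0, 0])
print(D1[0, 0], err.max())
```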

Example 8.1.1 Consider the zeroth-order H∞ approximation problem for a system G(s) with
state space matrices

    A = [ −2   3   −1    1 ]         B = [ −2.5    0     −1.2 ]
        [  0  −1    1    0 ]             [  1.3   −1      1   ]
        [  0   0   −3   12 ]             [  1.6    2      0   ]
        [  0   0    0   −4 ] ,           [ −3.4    0.1    2   ]

    C = [ −2.5   1.3   1.6   −3.4 ]
        [  0    −1     2      0.1 ] ,    D = 0 .
        [ −1.2   1     0      2   ]
It can be shown that the following matrix X satisfies the conditions (8.20)-(8.21) of Theorem
8.1.2 for γ = 6.2824.
 
28.3554 −4.3175 16.2043 0.9239
 
 −4.3175 9.7310 −3.5798 0.5693 
 
X= 
 16.2043 −3.5798 31.7152 −0.7418 
 
0.9239 0.5693 −0.7418 3.3769

Hence, according to Theorem 8.1.2, there exists a constant matrix D̂ to satisfy the zeroth-order
H∞ approximation bound

    kG(s) − D̂k∞ ≤ 6.2824 .
Numerical algorithms to obtain the feasible matrix X and the H∞ approximation norm bound γ
will be presented in Chapters 10 and 11. Using the techniques in these chapters, it can be shown
that the above value of γ is the optimal H∞ norm bound for this problem.
The zeroth-order H∞ approximants that correspond to the feasible solution X are parametrized
as follows

    D̂ = D̂1 + D̂2 L D̂3

where D̂1 , D̂2 and D̂3 are computed from (8.23) as follows:
 
    D̂1 = [  1.6731    0.7830    0.7233 ]
         [ −1.9232    1.3841    2.7552 ]
         [ −0.0738   −0.4314    2.2064 ]

    D̂2 = [  1.8600   −1.1183   −1.8145 ]
         [ −1.1183    1.3841   −0.2787 ]
         [ −1.8145   −0.2787    4.7054 ]

    D̂3 = [  0.3627    0.2575    0.3165 ]
         [  0.2575    0.7781   −0.0302 ]
         [  0.3165   −0.0302    0.4987 ]

and L is any matrix with kLk < 1.

8.2 Model Reduction with Covariance Error Bounds


The covariance error bound model reduction problem seeks to provide a reduced-order model to
bound the covariance of the output difference of the full and the reduced-order models.

8.2.1 Continuous-Time Case

Consider a stable nth -order strictly proper system with state space representation

ẋ = Ax + Bw (8.41)
y = Cx (8.42)

where w is a zero-mean white noise excitation with unit intensity. The suboptimal covariance upper
bound model reduction problem is to find a reduced-order strictly proper system

x̂˙ = Âx̂ + B̂w (8.43)


ŷ = Ĉx̂ (8.44)

of order n̂ < n such that the covariance matrix Ỹ = lim_{t→∞} E ỹỹ^T of the output error ỹ =
y − ŷ satisfies a bound Ỹ < Ȳ, where Ȳ is a specified symmetric positive definite matrix. For
simplicity we will consider the case Ȳ = εI, where ε is a given positive scalar. Following the system
performance analysis discussion in Chapter 4, a deterministic interpretation of this model reduction
problem is to find a reduced-order model (8.43)-(8.44) such that the peak value of the output error
kỹkL∞ is less than ε when w is any bounded-energy input with kwkL2 ≤ 1.
The solution to the upper bound covariance model reduction problem is given by the following
result

Theorem 8.2.1 Let ε be a given positive scalar. Consider the stable linear time-invariant system
(8.41)-(8.42) where w is a stochastic white noise process uncorrelated with x(0), and with covariance
I. The following statements are equivalent:
(i) There exists a reduced-order model (8.43)-(8.44) of order n̂ such that the output covariance
error is bounded above by ε, that is, Ỹ < εI.
(ii) There exist a matrix pair (X, Y) such that

    [ X   I ]                   [ X   I ]
    [ I   Y ]  ≥ 0 ,       rank [ I   Y ]  ≤ n + n̂                  (8.45)

    AX + XA^T + BB^T < 0                                             (8.46)

    εY − C^T C > 0                                                   (8.47)

    YA + A^T Y < 0 .                                                 (8.48)

In this case all such reduced-order models are given by

    [ B̂   Â ] = −R^{-1} Γ^T Φ Λ^T (Λ Φ Λ^T)^{-1} + S^{1/2} L1 (Λ Φ Λ^T)^{-1/2}
    Ĉ = C X21^T X2^{-1} + (εI − CY^{-1}C^T)^{1/2} L2 X2^{-1/2}

where

    Γ = [ 0 ]        Θ = [ AX + XA^T   A X21^T    B  ]        Λ^T = [ 0   X21^T ]
        [ I ]            [ X21 A^T     0          0  ]              [ 0   X2    ]        (8.49)
        [ 0 ] ,          [ B^T         0          −I ] ,            [ I   0     ]

S = R−1 − R−1 ΓT [Φ − ΦΛT (ΛΦΛT )−1 ΛΦ]ΓR−1


Φ = (ΓR−1 ΓT − Θ)−1 > 0

X21 and X2 are any matrix factors satisfying

    X − Y^{-1} = X21^T X2^{-1} X21 > 0

R is an arbitrary positive definite matrix and L1 and L2 satisfy kL1 k < 1 and kL2 k < 1.

Proof. Consider the augmented system

x̃˙ = Ãx̃ + B̃w̃ (8.50)


ỹ = C̃x̃ (8.51)

where

à = A1 + B1 Ĝ1 M1
B̃ = B2 + B1 Ĝ1 M2
T
C̃ = C1 − ĈB1

and
" # " # " # " # " #
A 0 0 B 0 0 I
A1 = , B1 = , B2 = , M1 = , M2 =
0 0 I I 0 I 0
h i h i
C1 = C 0 , Ĝ1 = B̂ Â

From Lemma 8.1.1 it follows that the upper bound model reduction problem has a solution if and
only if there exists a matrix X̃ > 0 such that
    ÃX̃ + X̃Ã^T + B̃B̃^T < 0
    C̃X̃C̃^T < εI

that is

    (A1 + B1 Ĝ1 M1) X̃ + X̃ (A1 + B1 Ĝ1 M1)^T + B̃B̃^T < 0          (8.52)
    (C1 − Ĉ B1^T) X̃ (C1 − Ĉ B1^T)^T < εI .                         (8.53)

The matrix inequality (8.52) is equivalent to the linear matrix inequality

ΓĜ1 Λ + (ΓĜ1 Λ)T + Θ < 0

where Γ, Λ and Θ are given by

    Γ = [ B1 ]        Θ = [ A1X̃ + X̃A1^T   B2 ]        Λ^T = [ X̃M1^T ]
        [ 0  ] ,          [ B2^T            −I ] ,            [ M2^T  ] .

Using Theorem 2.3.12, it is easy to show that there exists a solution Ĝ1 if and only if
    B1^⊥ (A1X̃ + X̃A1^T + B2B2^T) B1^⊥T < 0

    [ M1^T ]⊥ [ X̃^{-1}A1 + A1^T X̃^{-1}   X̃^{-1}B2 ] [ M1^T ]⊥T
    [ M2^T ]  [ B2^T X̃^{-1}               −I        ] [ M2^T ]     < 0 .

Partitioning X̃ and X̃^{-1} = Ỹ as

    X̃ = [ X     X21^T ]        Ỹ = [ Y     Y21^T ]
         [ X21   X2    ] ,          [ Y21   Y2    ]

and utilizing the definitions of A1 , B1 ,B2 , M1 and M2 we get (8.46) and (8.48).
The solvability condition of (8.53) is obtained from Theorem 2.3.11 as follows

    B1^⊥ (X̃^{-1} − ε^{-1} C1^T C1) B1^⊥T > 0

which provides (8.47). Moreover, from X̃^{-1} = Ỹ we get that

    X − Y^{-1} = X21^T X2^{-1} X21 > 0

and also

         [ X   I ]
    rank [ I   Y ]  = rank(X − Y^{-1}) + rank(Y) ≤ n + n̂

which provides (8.45).
The parametrization of all reduced-order models is obtained from the parametrization of all
solutions in Theorems 2.3.11 and 2.3.12. 2
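Conditions (8.45)-(8.48) are again simple eigenvalue and rank tests. The sketch below (my own helper, not from the book; the sanity-check data is an assumption) verifies them for a candidate pair, using the full-information choice Y = X^{-1}, which makes X − Y^{-1} = 0 so the rank condition in (8.45) is satisfied for every n̂:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def cov_bound_feasible(A, B, C, X, Y, eps, n_hat, tol=1e-9):
    """Test conditions (8.45)-(8.48) of Theorem 8.2.1."""
    n = A.shape[0]
    blk = np.block([[X, np.eye(n)], [np.eye(n), Y]])
    return (np.linalg.eigvalsh(blk).min() > -tol and                       # (8.45)
            np.linalg.matrix_rank(blk, tol=1e-6) <= n + n_hat and
            np.linalg.eigvalsh(A @ X + X @ A.T + B @ B.T).max() < -tol and # (8.46)
            np.linalg.eigvalsh(eps * Y - C.T @ C).min() > tol and          # (8.47)
            np.linalg.eigvalsh(Y @ A + A.T @ Y).max() < -tol)              # (8.48)

A = np.array([[-1.0, 0.5], [0.0, -2.0]])
B = np.array([[1.0], [1.0]])
C = np.array([[1.0, -1.0]])
X = solve_continuous_lyapunov(A, -(B @ B.T + np.eye(2)))
Y = np.linalg.inv(X)    # X - Y^{-1} = 0: the rank condition holds even for n_hat = 0
feasible = cov_bound_feasible(A, B, C, X, Y, eps=1e3, n_hat=0)
print(feasible)
```

A loose bound (large ε) is feasible even for n̂ = 0, while a tight bound fails condition (8.47).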

Example 8.2.1 Consider the following continuous-time system


" # " #
−0.005 −0.99 1
ẋ = x+ w
−0.99 −5000 100
h i
y= 1 100 x

We seek to obtain a first-order model that approximates the given system in a covariance error
bound sense.

The following pair of matrices (X, Y) satisfy the conditions (8.45)-(8.48) of Theorem 8.2.1 for
ε = 96.0787:

    X = [ 0.1161   0.0312   ]        Y = [ 8.6139   0.0003 ]
        [ 0.0312   108.9409 ] ,          [ 0.0003   1.3048 ]

Hence, there exists a first-order system to satisfy the output covariance error bound Ỹ <
96.0787I. Using the formulas of Theorem 8.2.1 we obtain the following reduced-order system:

x̂˙ = −4999.8x̂ − 100.0w


ŷ = −100.0x̂

The values of X, Y and ε can be obtained utilizing the algorithms that will be presented in
Chapters 10 and 11. In fact, it can be shown that the above value of the covariance error bound ε
is the minimum one.
For comparison, it is noted that a balanced model reduction method produces the following
reduced-order model

x̂˙ = −0.005x̂ + w
ŷ = x̂

and the covariance error bound that corresponds to this model is Ȳ = 10^4 I. Hence, balanced model
reduction can result in reduced-order models with poor covariance error bounds.

8.2.2 Discrete-Time Case

We consider here the discrete-time covariance bounded model reduction problem defined as follows.
Let the model, its reduction, and the corresponding output error ỹ = y − ŷ be described by (8.26)-
(8.27). Assume w is a zero-mean white noise with covariance I. Define the output covariance by
Ỹ = lim_{t→∞} E(y − ŷ)(y − ŷ)^T. Then the covariance bounded model reduction problem is to find a
realization of specified order n̂ such that Ỹ ≤ Ȳ where Ȳ is a specified symmetric positive definite
matrix. The solution to this problem is described as follows, where we set Ȳ = εI for simplicity.
Consider a stable nth -order LTI discrete-time system
" # " #" #
yk D C wk
= . (8.54)
xk+1 B A xk

We want to find a reduced-order system of order n̂ ≤ n


" # " #
ŷk wk
= Ĝ (8.55)
x̂k+1 x̂k

such that the output covariance of (y − ŷ) is bounded by εI.



By augmenting the given model (8.54) to the reduced model (8.55), and defining X̃ as a state
covariance upper bound of the augmented system, namely,

    X̃ ≥ E∞ [ x ] [ x^T   x̂^T ] ,
            [ x̂ ]

the upper bound X̃ and the covariance Ỹ satisfy

    ÃX̃Ã^T + B̃B̃^T < X̃                                            (8.56)
    Ỹ ≤ C̃X̃C̃^T + D̃D̃^T < εI                                      (8.57)

where

    Ĝ = [ Ĝ1 ] = [ D̂   Ĉ ]
        [ Ĝ2 ]   [ B̂   Â ]

and

    Ã = A1 + B1 Ĝ2 M1 ,    B̃ = B2 + B1 Ĝ2 M2
    C̃ = C1 − Ĝ1 M1 ,       D̃ = D − Ĝ1 M2

    A1 = [ A  0 ]        B1 = [ 0 ]        M1 = [ 0  0 ]
         [ 0  0 ] ,           [ I ] ,           [ 0  I ]

    B2 = [ B ]        M2 = [ I ]        C1 = [ C   0 ]
         [ 0 ] ,           [ 0 ] ,

The solution for the covariance bounded model reduction is given as follows:

Theorem 8.2.2 The following statements are equivalent.

(i) There exists an n̂th-order system (8.55) that solves the covariance bounded model reduction
problem, Ỹ < εI.

(ii) There exists a pair of n × n matrices (X, Y) such that

    [ X   I ]                   [ X   I ]
    [ I   Y ]  ≥ 0 ,       rank [ I   Y ]  ≤ n + n̂

    AXA^T − X + BB^T < 0
    C^T C − εY < 0
    A^T YA − Y < 0 .

If the above conditions hold for some X and Y, then all the solutions can be expressed by:

    Ĝ = [ Ĝ1^T   Ĝ2^T ]^T ,
    Ĝ1 = Â + Q̂^{1/2} L̂ R^{-1/2} ,                                                  kL̂k < 1
    Ĝ2 = −(B1^T Q^{-1} B1)^{-1} B1^T Q^{-1} Ā + (B1^T Q^{-1} B1)^{-1/2} L Ψ^{1/2} ,  kLk < 1

where X̃21 and X̃2 are any matrix factors satisfying

    X − Y^{-1} = X̃21^T X̃2^{-1} X̃21

and X̃ is defined by

    X̃ = [ X      X̃21^T ]
         [ X̃21   X̃2    ] ,

where

    Q̂ = εI − CY^{-1}C^T

    Q = [ X − AY^{-1}A^T   X̃21^T ]
        [ X̃21              X̃2   ]

    Ā = [ B   AX̃21^T X̃2^{-1} ]
        [ 0   0                ]

    Â = [ D   CX̃21^T X̃2^{-1} ]

    R = [ I   0   ]
        [ 0   X̃2 ]

    Ψ = R^{-1} − Ā^T Q^{-1} Ā + Ā^T Q^{-1} B1 (B1^T Q^{-1} B1)^{-1} B1^T Q^{-1} Ā .

Proof. (8.56) and (8.57) can be written as

    (A1 + B1 Ĝ2 M̄) X̃ (A1 + B1 Ĝ2 M̄)^T + (D̄ + B1 Ĝ2 Ē)(D̄ + B1 Ĝ2 Ē)^T < X̃     (8.58)
    (C̄ − Ĝ1 M̄) X̃ (C̄ − Ĝ1 M̄)^T + (D − Ĝ1 Ē)(D − Ĝ1 Ē)^T < εI                   (8.59)

i.e., the above two matrix inequalities are decoupled with respect to the unknowns Ĝ1 and Ĝ2 .
After completing the square in Ĝ2 , Theorem 2.3.11 can be used to show that there exists a Ĝ2
solving (8.58) if and only if there exists an X̃ = X̃^T > 0 with X̃ ∈ R^{(n+n̂)×(n+n̂)} such that

    B1^⊥ (X̃ − A1X̃A1^T − B2B2^T) B1^⊥T > 0

    [ M1^T ]⊥ [ X̃^{-1} − A1^T X̃^{-1} A1   −A1^T X̃^{-1} B2    ] [ M1^T ]⊥T
    [ M2^T ]  [ −B2^T X̃^{-1} A1            I − B2^T X̃^{-1} B2 ] [ M2^T ]     > 0 .

Partitioning X̃ and Ỹ = X̃^{-1} > 0 as

    X̃ = [ X      X̃21^T ]        Ỹ = [ Y      Ỹ21^T ]
         [ X̃21   X̃2    ] ,          [ Ỹ21   Ỹ2    ]

and substituting the known matrices A1 , B1 , B2 , M1 and M2 together with the partition of X̃
and Ỹ into the above inequalities leads to

X − AXAT − BBT > 0


Y − AT YA > 0.

The same technique for solving (8.58) for Ĝ2 can also solve (8.59) for Ĝ1 by substituting the
parameters A1 → C̄, B1 → −I, D̄ → D, etc. (Note that B1^⊥ is void in this case.) Hence there
exists a Ĝ1 solving (8.59) if and only if there exists X̃ satisfying

    [ M1^T ]⊥ [ X̃^{-1} − C1^T ε^{-1} C1   −C1^T ε^{-1} D    ] [ M1^T ]⊥T
    [ M2^T ]  [ −D^T ε^{-1} C1             I − D^T ε^{-1} D ] [ M2^T ]     > 0

which is equivalent to

    Y − C^T C/ε > 0 .
Notice that, by the Schur complement, Ỹ = X̃^{-1} implies

    Y = (X − X̃21^T X̃2^{-1} X̃21)^{-1}

or

    X − Y^{-1} = X̃21^T X̃2^{-1} X̃21 ≥ 0 .

By another Schur complement, the above leads to

    [ X   I ]
    [ I   Y ]  ≥ 0 .

Since X̃2 ∈ R^{n̂×n̂} we have

    rank(X − Y^{-1}) ≤ n̂ .

Also consider

         [ X   I ]          [ I   −Y^{-1} ] [ X   I ] [ I         0 ]
    rank [ I   Y ]  = rank  [ 0    I      ] [ I   Y ] [ −Y^{-1}   I ]

                            [ X − Y^{-1}   0 ]
                    = rank  [ 0            Y ]

                    = rank(X − Y^{-1}) + rank(Y) ≤ n̂ + n .

This proves the existence condition in (ii). The rest of the theorem comes from direct application
of the other part of Theorem 2.3.11. 2

Chapter 8 Closure
This chapter provides necessary and sufficient conditions for the solution of two kinds of model
reduction problems. All reduced-order models that have modeling error less than a specified
“value” are parametrized. The two measures of “value” are the frequency response error and the
covariance error. One cannot know what is a good model without knowing its purpose.
Hence, the modeling and control problems are not independent. Modeling for control design is
still an active research topic. This chapter only gives results that lend themselves to the matrix
inequality methods of this book. See the special journal issue [129] devoted to control related
modeling approaches.
Chapter 9

Unified Perspective

This chapter shows a unified perspective based on Linear Matrix Inequalities (LMIs) for designing
controllers with various specifications including stability, performance and robustness. In particular,
many control design problems can be reduced to just one linear algebra problem of solving an LMI
for the controller parameter G. In fact, the covariance control problem and the H∞ control problem
treated in the previous chapters are examples of such control problems. The purpose of this chapter
is to show explicitly the common mathematical structure of these and other control problems hidden
in the proofs of the previous synthesis results. We shall list 17 control problems that all reduce to
a single linear algebra problem to solve

ΓGΛ + (ΓGΛ)T + Θ < 0

for the matrix G.


Consider the linear time-invariant plant and the controller
    
    [ ∂xp ]   [ Ap    Dp1    Dp2    Bp  ] [ xp ]
    [ y1  ]   [ Cp1   Dy11   Dy12   By1 ] [ w1 ]         [ ∂xc ]   [ Ac   Bc ] [ xc ]
    [ y2  ] = [ Cp2   Dy21   Dy22   By2 ] [ w2 ] ,       [ u   ] = [ Cc   Dc ] [ z  ]        (9.1)
    [ z   ]   [ Mp    Dz1    Dz2    0   ] [ u  ]

where ∂ is the differentiation/delay operator for the continuous-time/discrete-time case, xp and


xc are the plant and the controller states, respectively, z and u are the measured output and the
control input, and y1 , y2 , w1 and w2 are the exogenous signals that will be used to describe control
design specifications. The closed-loop system can be described by

    
∂x Ac` Bc`1 Bc`2 x
    
 y1  =  C  
   c`1 Dc`11 Dc`12   w1  , (9.2)
y2 Cc`2 Dc`21 Dc`22 w2



where x = [ xp^T  xc^T ]^T is the closed-loop state and

    [ Ac`    Bc`1    Bc`2  ]      [ A    D1    D2  ]   [ B  ]
    [ Cc`1   Dc`11   Dc`12 ]  ≜   [ C1   F11   F12 ] + [ H1 ] G [ M   E1   E2 ]              (9.3)
    [ Cc`2   Dc`21   Dc`22 ]      [ C2   F21   F22 ]   [ H2 ]

    [ A    D1    D2    B   ]      [ Ap    0     Dp1    Dp2    Bp     0    ]
    [ C1   F11   F12   H1  ]      [ 0     0     0      0      0      Inc  ]
    [ C2   F21   F22   H2  ]  ≜   [ Cp1   0     Dy11   Dy12   By1    0    ]                  (9.4)
    [ M    E1    E2    G^T ]      [ Cp2   0     Dy21   Dy22   By2    0    ]
                                  [ Mp    0     Dz1    Dz2    Dc^T   Bc^T ]
                                  [ 0     Inc   0      0      Cc^T   Ac^T ]

These definitions will be used in the subsequent sections where we show how to solve different
control problems in a unified manner.

9.1 Continuous-Time Case


In this section, we shall present a unified perspective for the continuous-time case. The main result
of this section shows that certain control problems with stability, performance and robustness
specifications can be reduced to a mathematical problem of solving an LMI

ΓGΛ + (ΓGΛ)T + Θ < 0 (9.5)

for the controller parameter G. Specifically, we show appropriate matrices Γ, Λ, and Θ for each
control problem in terms of the plant data and a Lyapunov matrix (and possibly a scaling matrix).
The linear algebra problem (9.5) can be solved by using Theorem 2.3.12, which provides necessary
and sufficient conditions for the existence of G satisfying (9.5):

Γ⊥ ΘΓ⊥T < 0, ΛT ⊥ ΘΛT ⊥T < 0. (9.6)

As has been shown for the covariance upper bound control problem and the H∞ control problem
in Chapters 6 and 7, these existence conditions reduce to LMIs in terms of Lyapunov matrix X
and its inverse X−1 , and possibly a scaling matrix S, which can further be specialized to Riccati
inequalities with certain assumptions on the plant. Finally, when (9.6) holds, one such G is given
by
G = −ρΓT ΦΛT (ΛΦΛT )−1 (9.7)

where ρ > 0 is an arbitrary scalar such that

Φ = (ρΓΓT − Θ)−1 > 0.


All such G are also available in Theorem 2.3.12.



9.1.1 Stabilizing Control


The simplest control problem is the one with only the stability specification. Consider a linear
time-invariant plant;
" # " #" #
ẋp Ap Bp xp
= (9.8)
z Mp 0 u
where xp is the state, u is the control input and z is the measured output. The stabilization
problem is the following.

Determine whether or not there exists a controller in (9.1) which asymptotically stabi-
lizes the system (9.8). Parametrize all such controllers when one exists.

This problem can be reduced to the above linear algebra problem (9.5) as follows.

Theorem 9.1.1 Let a controller G be given. The following statements are equivalent.

(i) The controller G solves the stabilization problem.

(ii) There exists a matrix X > 0 such that (9.5) holds where
    Γ = B ,    Λ^T = XM^T ,    Θ = AX + XA^T .

Proof. The result directly follows from Lyapunov’s stability theory which states that (i) is equiv-
alent to the existence of a Lyapunov matrix X > 0 satisfying

(A + BGM)X + X(A + BGM)T < 0

where matrices A, B and M are the augmented matrices defined in (9.4). 2
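For the static state-feedback case (M = I, no controller dynamics) the construction (9.6)-(9.7) can be carried out by hand. The sketch below (my own illustration; the double-integrator data and the choices of X and ρ are assumptions, not an example from the text) picks X > 0 with B^⊥(AX + XA^T)B^⊥T < 0, forms Φ = (ρBB^T − Θ)^{-1} > 0 and G from (9.7), and confirms the closed loop A + BG is stable:

```python
import numpy as np

# Double-integrator plant with full state feedback (M = I).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])

X = np.array([[2.0, -1.0], [-1.0, 2.0]])   # X > 0, chosen so that (9.6) holds
Bperp = np.array([[1.0, 0.0]])             # a left annihilator of B
Theta = A @ X + X @ A.T
assert (Bperp @ Theta @ Bperp.T).item() < 0   # B_perp Theta B_perp^T = -2 < 0

rho = 4.0                                  # chosen large enough that Phi > 0
Phi = np.linalg.inv(rho * B @ B.T - Theta)
Lam = X                                    # Lambda^T = X M^T = X here
G = -rho * B.T @ Phi @ Lam.T @ np.linalg.inv(Lam @ Phi @ Lam.T)  # (9.7)

poles = np.linalg.eigvals(A + B @ G)
print(G, poles)
```

For this data the formula collapses to G = −ρ B^T X^{-1} = [−4/3, −8/3], placing the closed-loop poles at −2/3 and −2.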

9.1.2 Covariance Upper Bound Control


Next, we shall consider the covariance upper bound control problem discussed in Chapter 6. We
explicitly show that the problem can be reduced to the LMI of the form (9.5) although this reduction
process has already been done implicitly in the proofs given in Chapter 6. To this end, we shall
restate the covariance upper bound control problem for the reader’s convenience. Consider the
linear time-invariant system
    
ẋp Ap Dp Bp xp
    
 y  =  Cp 0 0  w 
  (9.9)
   
z Mp Dz 0 u

where xp is the state, w is the white noise with intensity I, u is the control input, y is the output
of interest and z is the measured output. The covariance upper bound control problem is the
following.

Let an output covariance bound Ȳ > 0 be given. Determine whether or not there
exists a controller in (9.1) which asymptotically stabilizes the system (9.9) and yields
an output covariance satisfying

    lim_{t→∞} E [y(t) y^T(t)] < Ȳ .

Parametrize all such controllers when one exists.

This problem can be reduced to the following.

Theorem 9.1.2 Let a controller G and an output covariance bound Ȳ > 0 be given. The following
statements are equivalent.

(i) The controller G solves the covariance upper bound control problem.

(ii) There exists a matrix X > 0 such that CXC^T < Ȳ and (9.5) hold where

    Γ = [ B ]        Λ^T = [ XM^T ]        Θ = [ AX + XA^T   D  ]
        [ 0 ] ,            [ E^T  ] ,          [ D^T         −I ] .

Proof. Recall from Chapter 6 that statement (i) holds if and only if there exists a state covariance
upper bound X > 0 such that CXC^T < Ȳ and

    (A + BGM)X + X(A + BGM)^T + (D + BGE)(D + BGE)^T < 0

or equivalently, by the Schur complement formula,


" #
(A + BGM)X + X(A + BGM)T D + BGE
< 0.
(D + BGE)T −I

It is trivial to verify the equivalence of the above statements and (ii). 2

9.1.3 Linear Quadratic Regulator

Another control problem which falls into the framework of (9.5) is the Linear Quadratic Regulator
(LQR) problem. Consider the linear time-invariant system
    
ẋp Ap Dp Bp xp
    
 y  =  Cp 0 By   w 
  (9.10)
   
z Mp 0 0 u

where xp is the state, w is the impulsive disturbance w(t) = w0 δ(t) where δ(·) is the Dirac’s delta
function, u is the control input, y is the output of interest and z is the measured output. The LQR
problem is to guarantee an upper bound on the square integral of the output signal as follows.

Let a performance bound γ > 0 be given. Determine whether or not there exists
a controller in (9.1) which asymptotically stabilizes the system (9.10) and yields the
zero initial state response y such that kykL2 < γ for all directions of the impulsive
disturbance kw0 k ≤ 1. Parametrize all such controllers when one exists.

It turns out that this problem can be reduced to a mathematical problem which is the dual of that
for the covariance upper bound control problem.

Theorem 9.1.3 Let a controller G and a performance bound γ > 0 be given. The following
statements are equivalent.

(i) The controller G solves the LQR problem.

(ii) There exists a matrix Y > 0 such that kD^T YDk < γ^2 and (9.5) hold where

    Γ = [ YB ]        Λ^T = [ M^T ]        Θ = [ YA + A^T Y   C^T ]
        [ H  ] ,            [ 0   ] ,          [ C            −I  ] .

Proof. Using the analysis result given in Section 4.6, statement (i) is equivalent to the existence
of a Lyapunov matrix Y > 0 such that kDT YDk < γ 2 and

Y(A + BGM) + (A + BGM)T Y + (C + HGM)T (C + HGM) < 0.

Then the result follows from the Schur complement formula. 2

9.1.4 L∞ Control
Consider the linear time-invariant system
    
ẋp Ap Dp Bp xp
    
 y  =  Cp 0 0  w 
  (9.11)
   
z Mp Dz 0 u

where xp is the state, w is the disturbance with finite energy, y is the output of interest, and z
and u are the measured output and the control input, respectively. The L∞ control problem is to
find a controller that guarantees a bound on the peak value of the output y in response to any unit
energy disturbance. This problem can be given as follows.

Let a performance bound γ > 0 be given. Determine whether or not there exists
a controller in (9.1) which asymptotically stabilizes the system (9.11) and yields the
output y satisfying kykL∞ < γ for all disturbances w such that kwkL2 ≤ 1. Parametrize
all such controllers when one exists.

This problem is mathematically equivalent to the covariance upper bound control problem with
covariance bound Ȳ = γ^2 I. This fact can readily be verified from the analysis results in Chapter 4,
and hence, we give a characterization of the L∞ controllers without proof as follows.

Theorem 9.1.4 Let a controller G and a performance bound γ > 0 be given. The following
statements are equivalent.

(i) The controller G solves the L∞ control problem.

(ii) There exists a matrix X > 0 such that CXC^T < γ^2 I and (9.5) hold where

    Γ = [ B ]        Λ^T = [ XM^T ]        Θ = [ AX + XA^T   D  ]
        [ 0 ] ,            [ E^T  ] ,          [ D^T         −I ] .

9.1.5 H∞ Control
Consider the linear time-invariant system
    
ẋp Ap Dp Bp xp
    
 y  =  Cp Dy By   w  (9.12)
    
z Mp D z 0 u

where xp is the state, y and w are the regulated output and the disturbance, and z and u are the
measured output and the control input, respectively. Let the closed-loop transfer matrix from w
to y with the controller in (9.1) be denoted by T(s);

T(s) = Cc` (sI − Ac` )−1 Bc` + Dc` .


The H∞ control problem can be stated as follows.

Let a performance bound γ > 0 be given. Determine whether or not there exists
a controller in (9.1) which asymptotically stabilizes the system (9.12) and yields the
closed-loop transfer matrix such that kTkH∞ < γ. Parametrize all such controllers
when one exists.

This problem may be interpreted in the following two ways: First, as we discussed in Chapter 4, the
energy-to-energy gain Γee defined in Section 4.6.1 is equal to the H∞ norm of the corresponding
transfer matrix, i.e., Γee = kTkH∞ . Hence, a controller that solves the H∞ control problem
guarantees that the energy (L2 norm) of the output y is less than γ for all disturbances w with
kwkL2 ≤ 1. Thus, γ is the worst-case performance bound. Another interpretation is to view the
H∞ control problem as a robust stabilization problem. As shown in Section 4.7.1, by the small
gain theorem, the condition kTkH∞ < γ guarantees robust stability with respect to norm-bounded
uncertainty of size less than or equal to γ −1 . Hence, in this case, γ is a robustness bound.
For the H∞ control problem, we have the following result.

Theorem 9.1.5 Let a controller G and a performance (or robustness) bound γ > 0 be given. The
following statements are equivalent.

(i) The controller G solves the H∞ control problem.



(ii) There exists a matrix X > 0 such that (9.5) holds where

    Γ = [ B ]        Λ^T = [ XM^T ]        Θ = [ AX + XA^T   XC^T   D   ]
        [ H ]              [ 0    ]            [ CX          −γI    F   ]
        [ 0 ] ,            [ E^T  ] ,          [ D^T         F^T    −γI ] .

Proof. The result simply follows from substituting the definitions of the closed-loop matrices in
(9.3) into the LMI in statement (iv) of Theorem 4.6.3, where we note that Γee = kTkH∞ . 2

9.1.6 Positive Real Control


Consider the linear time-invariant system
\[
\begin{bmatrix} \dot{x}_p \\ y \\ z \end{bmatrix} =
\begin{bmatrix} A_p & D_p & B_p \\ C_p & D_y & B_y \\ M_p & D_z & 0 \end{bmatrix}
\begin{bmatrix} x_p \\ w \\ u \end{bmatrix}
\tag{9.13}
\]

where xp is the state, z and u are the measured output and the control input, and y and w are the
exogenous signals to describe the design specification. Let the closed-loop transfer matrix from w
to y with the controller in (9.1) be given by

T(s) = Cc` (sI − Ac` )−1 Bc` + Dc` .


The transfer matrix T(s) is said to be strongly positive real if it is asymptotically stable and

T(jω) + TT (−jω) > 0

for all frequencies ω, including infinity. Using this notion, the positive real control problem can be
stated as follows.

Determine whether or not there exists a controller in (9.1) which asymptotically
stabilizes the system (9.13) and yields the strongly positive real closed-loop transfer
matrix T(s). Parametrize all such controllers when one exists.
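The strong positive realness condition can be spot-checked numerically on a frequency grid. A minimal sketch (our own illustrative example, not from the text), for T(s) = (s+2)/(s+1) = 1 + 1/(s+1), which is strongly positive real:

```python
import numpy as np

def min_pr_eig(A, B, C, D, omegas):
    """Minimum eigenvalue of T(jw) + T(jw)^* over a frequency grid."""
    worst = np.inf
    for w in omegas:
        T = C @ np.linalg.inv(1j * w * np.eye(A.shape[0]) - A) @ B + D
        H = T + T.conj().T  # Hermitian part (times 2) of T(jw)
        worst = min(worst, np.linalg.eigvalsh(H).min())
    return worst

# T(s) = (s+2)/(s+1): state space A = -1, B = 1, C = 1, D = 1.
A = np.array([[-1.0]]); B = np.array([[1.0]])
C = np.array([[1.0]]);  D = np.array([[1.0]])
omegas = np.linspace(0.0, 1000.0, 5001)
print(min_pr_eig(A, B, C, D, omegas) > 0)  # True: strongly positive real
```

Note that D + D^T = 2 > 0 here, which covers the "including infinity" part of the definition that a finite grid cannot reach.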

This problem can be considered as a robust stabilization problem for systems with positive real (or
passive) uncertainties [48, 142]. The positive real control problem can also be reduced to a problem
of the type given by (9.5) as follows.

Theorem 9.1.6 Let a controller G be given. The following statements are equivalent.

(i) The controller G solves the positive real control problem.

(ii) There exists a matrix X > 0 such that (9.5) holds where
\[
\begin{bmatrix} \Gamma & \Lambda^T & \Theta \end{bmatrix} =
\begin{bmatrix}
B & XM^T & AX + XA^T & XC^T - D \\
H & -E^T & CX - D^T & -F - F^T
\end{bmatrix}.
\]
228 CHAPTER 9. UNIFIED PERSPECTIVE

Proof. It can be shown that the transfer matrix T(s) is internally stable and is strongly positive
real if and only if there exists a matrix X > 0 such that
\[
\begin{bmatrix} A_{c\ell} & B_{c\ell} \\ C_{c\ell} & D_{c\ell} \end{bmatrix}
\begin{bmatrix} X & 0 \\ 0 & -I \end{bmatrix}
+
\begin{bmatrix} X & 0 \\ 0 & -I \end{bmatrix}
\begin{bmatrix} A_{c\ell} & B_{c\ell} \\ C_{c\ell} & D_{c\ell} \end{bmatrix}^T
< 0.
\]

Then the result follows by substituting the closed-loop matrices in (9.3) into the above matrix
inequality. 2

9.1.7 Robust H2 Control


Consider the uncertain system described by
\[
\begin{bmatrix} \dot{x}_p \\ y_1 \\ y_2 \\ z \end{bmatrix} =
\begin{bmatrix}
A_p & D_{p1} & D_{p2} & B_p \\
C_{p1} & D_{y11} & 0 & B_{y1} \\
C_{p2} & D_{y21} & 0 & B_{y2} \\
M_p & D_{z1} & 0 & 0
\end{bmatrix}
\begin{bmatrix} x_p \\ w_1 \\ w_2 \\ u \end{bmatrix},
\qquad w_1 = \Delta y_1,
\tag{9.14}
\]

where xp is the state, y1 and w1 are the exogenous signals to describe the uncertainty ∆, y2 and w2
are the output of interest and the impulsive disturbance, and z and u are the measured output and
the control input. The nominal system (∆ ≡ 0) is linear time-invariant, while the uncertainty ∆
is assumed to belong to the following set of norm-bounded time-varying, structured uncertainties:

BUC = { ∆ : R → R^{m×m} , ‖∆(t)‖ ≤ 1, ∆(t) ∈ U } (9.15)

where

U = { block diag(δ1 Ik1 , · · · , δs Iks , ∆1 , · · · , ∆f ) : δi ∈ R, ∆i ∈ R^{k_{s+i} × k_{s+i}} }. (9.16)

In the above, we have implicitly assumed for simplicity that the uncertainty ∆ is square. Define a
subset of positive definite matrices that commute with ∆ ∈ U:

S = { block diag(S1 , · · · , Ss , s1 I_{k_{s+1}} , · · · , sf I_{k_{s+f}} ) :
Si ∈ R^{ki×ki} , si ∈ R, Si > 0, si > 0 }. (9.17)

We consider the following robust performance problem to guarantee a bound on the energy of the
output y2 in response to the worst-case impulsive disturbance w2 , for all uncertainties ∆ ∈ BUC .

Let a robust performance bound γ > 0 be given. Find a controller in (9.1), for the
uncertain system (9.14), such that the closed-loop system is robustly stable and the
output y2 satisfies ‖y2‖L2 < γ for all impulsive disturbances w2(t) = w0 δ(t) with
‖w0‖ ≤ 1, and for all possible uncertainties ∆ ∈ BUC .

This problem can be approached using the robust H2 performance bound given in Theorem 4.7.2
as follows.

Theorem 9.1.7 Let a controller G and a robust performance bound γ > 0 be given. Suppose there
exist a Lyapunov matrix Y > 0 and a scaling matrix S ∈ S such that D_2^T Y D_2 < γ²I and (9.5)
hold where
\[
\begin{bmatrix} \Gamma & \Lambda^T & \Theta \end{bmatrix} \triangleq
\begin{bmatrix}
YB & M^T & YA + A^T Y & YD_1 & C_1^T & C_2^T \\
0 & E_1^T & D_1^T Y & -S & F_{11}^T & F_{21}^T \\
H_1 & 0 & C_1 & F_{11} & -S^{-1} & 0 \\
H_2 & 0 & C_2 & F_{21} & 0 & -I
\end{bmatrix}.
\]
Then the controller G solves the robust H2 control problem.

Proof. The result follows from Theorem 4.7.2 and the definitions for the closed-loop matrices given
in (9.3). 2

9.1.8 Robust L∞ Control


Consider the uncertain system given by
\[
\begin{bmatrix} \dot{x}_p \\ y_1 \\ y_2 \\ z \end{bmatrix} =
\begin{bmatrix}
A_p & D_{p1} & D_{p2} & B_p \\
C_{p1} & D_{y11} & D_{y12} & B_{y1} \\
C_{p2} & 0 & 0 & 0 \\
M_p & D_{z1} & D_{z2} & 0
\end{bmatrix}
\begin{bmatrix} x_p \\ w_1 \\ w_2 \\ u \end{bmatrix},
\qquad w_1 = \Delta y_1,
\tag{9.18}
\]
where xp is the state, y1 and w1 are the exogenous signals used to describe the uncertainty ∆, y2
and w2 are the output of interest and the finite energy disturbance, and z and u are the measured
output and the control input. The uncertainty ∆ is assumed to belong to the set BUC defined in
(9.15). The robust L∞ control problem is to guarantee a bound on the peak value of the output
y2 for any unit energy disturbance w2 in the presence of uncertainty ∆ ∈ BUC .

Let a robust performance bound γ > 0 be given. Find a controller in (9.1), for the
uncertain system (9.18), such that the closed-loop system is robustly stable and the
output y2 satisfies ‖y2‖L∞ < γ for any disturbance w2 such that ‖w2‖L2 ≤ 1, and for
all possible uncertainties ∆ ∈ BUC .

We may try to solve this problem using the robust L∞ performance bound given in Theorem 4.7.3
as follows.

Theorem 9.1.8 Let a controller G and a robust performance bound γ > 0 be given. Suppose there
exist a Lyapunov matrix X > 0 and a scaling matrix S ∈ S such that C_2 X C_2^T < γ²I and (9.5)
hold where
\[
\begin{bmatrix} \Gamma & \Lambda^T & \Theta \end{bmatrix} \triangleq
\begin{bmatrix}
B & XM^T & AX + XA^T & XC_1^T & D_1 & D_2 \\
H_1 & 0 & C_1 X & -S & F_{11} & F_{12} \\
0 & E_1^T & D_1^T & F_{11}^T & -S^{-1} & 0 \\
0 & E_2^T & D_2^T & F_{12}^T & 0 & -I
\end{bmatrix}.
\]
Then the controller G solves the robust L∞ control problem.

Proof. The result follows from Theorem 4.7.3 and the definitions for the closed-loop matrices in
(9.3). 2

9.1.9 Robust H∞ Control


Consider the uncertain system
\[
\begin{bmatrix} \dot{x}_p \\ y_1 \\ y_2 \\ z \end{bmatrix} =
\begin{bmatrix}
A_p & D_{p1} & D_{p2} & B_p \\
C_{p1} & D_{y11} & D_{y12} & B_{y1} \\
C_{p2} & D_{y21} & D_{y22} & B_{y2} \\
M_p & D_{z1} & D_{z2} & 0
\end{bmatrix}
\begin{bmatrix} x_p \\ w_1 \\ w_2 \\ u \end{bmatrix},
\qquad w_1 = \Delta y_1,
\tag{9.19}
\]
where xp is the state, y1 and w1 are the exogenous signals used to describe the uncertainty ∆ ∈
BUC , y2 and w2 are the output of interest and the finite energy disturbance, and z and u are the
measured output and the control input. The robust H∞ control problem is to guarantee a bound
on the energy of the output y2 in response to any unit energy disturbance w2 in the presence of
uncertainty ∆ ∈ BUC .

Let a robust performance bound γ > 0 be given. Find a controller in (9.1), for the
uncertain system (9.19), such that the closed-loop system is robustly stable and the
output y2 satisfies ‖y2‖L2 < γ for any disturbance w2 such that ‖w2‖L2 ≤ 1, and for
all possible uncertainties ∆ ∈ BUC .

This problem can be reduced to the following.

Theorem 9.1.9 Let a controller G and a robust performance bound γ > 0 be given. Suppose there
exist a Lyapunov matrix Y > 0 and a scaling matrix S ∈ S such that (9.5) holds where
\[
\begin{bmatrix} \Gamma & \Lambda^T & \Theta \end{bmatrix} \triangleq
\begin{bmatrix}
YB & M^T & YA + A^T Y & YD_1 & YD_2 & C_1^T & C_2^T \\
0 & E_1^T & D_1^T Y & -S & 0 & F_{11}^T & F_{21}^T \\
0 & E_2^T & D_2^T Y & 0 & -\gamma I & F_{12}^T & F_{22}^T \\
H_1 & 0 & C_1 & F_{11} & F_{12} & -S^{-1} & 0 \\
H_2 & 0 & C_2 & F_{21} & F_{22} & 0 & -\gamma I
\end{bmatrix}.
\]
Then the controller G solves the robust H∞ control problem.

Proof. Using the definitions for the closed-loop matrices (9.3), the result can be verified from
Theorem 4.7.4. 2

9.2 Discrete-Time Case


This section provides a unified perspective for the discrete-time case. We shall show that, for the
discrete-time case, the control design problems considered in the previous section can be reduced
to the problem of solving a Quadratic Matrix Inequality (QMI)

(Θ + ΓGΛ)R(Θ + ΓGΛ)T < Q (9.20)



for the controller parameter G where matrices Θ, Γ, Λ, R and Q are appropriately defined for
each control problem. Note that the above linear algebra problem is a special case of (9.5) which
arose in the continuous-time case. To see this, using the Schur complement formula, write (9.20)
as
\[
\begin{bmatrix} \Gamma \\ 0 \end{bmatrix} G \begin{bmatrix} 0 & -\Lambda \end{bmatrix}
+ \begin{bmatrix} 0 \\ -\Lambda^T \end{bmatrix} G^T \begin{bmatrix} \Gamma^T & 0 \end{bmatrix}
- \begin{bmatrix} Q & \Theta \\ \Theta^T & R^{-1} \end{bmatrix} < 0,
\tag{9.21}
\]
where we used R > 0 (which holds for all the control problems considered below). Thus, both
continuous-time and discrete-time control problems reduce to the LMI (9.5). In this regard, the
LMI (9.5) defines a fundamental algebraic problem in the LMI formulation of control problems.
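The Schur complement equivalence between the QMI (9.20) and the block form above is easy to verify numerically. In the sketch below (our own illustration, not from the text), M stands in for Θ + ΓGΛ, and the sizes are chosen arbitrarily:

```python
import numpy as np

def is_pd(M):
    """True if the symmetric part of M is positive definite."""
    return np.linalg.eigvalsh((M + M.T) / 2).min() > 0

rng = np.random.default_rng(1)
n, p = 3, 2
M = rng.standard_normal((n, p))        # stands in for Theta + Gamma G Lambda
R = np.eye(p) + 0.1 * np.ones((p, p))  # any R > 0
Q = M @ R @ M.T + 0.5 * np.eye(n)      # chosen so the QMI M R M^T < Q holds

qmi = is_pd(Q - M @ R @ M.T)           # the QMI (9.20)
lmi = is_pd(np.block([[Q, M], [M.T, np.linalg.inv(R)]]))  # Schur-complement form
print(qmi, lmi)  # True True: the two conditions agree
```

The same check with a Q violating the QMI flips both booleans together, which is the content of the Schur complement formula.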
To solve the QMI (9.20) for G, we could apply Theorem 2.3.12 to the LMI (9.21). However,
the QMI (9.20) can be directly solved by Theorem 2.3.11. In this case, necessary and sufficient
conditions for the existence of G satisfying (9.20) are given by

Γ⊥ (Q − ΘRΘT )Γ⊥T > 0,

ΛT ⊥ (R−1 − ΘT Q−1 Θ)ΛT ⊥T > 0,

and all such G are explicitly parametrized by a free parameter L such that ‖L‖ < 1. As in the
continuous-time case, these existence conditions lead to LMIs in terms of the Lyapunov matrix X
and its inverse X−1 , which can further be reduced to a convex LMI problem for certain cases.

9.2.1 Stabilization Problem


Let us first consider the simplest control problem of stabilizing a linear time-invariant plant:
\[
\begin{bmatrix} x_p(k+1) \\ z(k) \end{bmatrix} =
\begin{bmatrix} A_p & B_p \\ M_p & 0 \end{bmatrix}
\begin{bmatrix} x_p(k) \\ u(k) \end{bmatrix}
\tag{9.22}
\]

where xp is the state, u is the control input and z is the measured output. The stabilization
problem is the following.

Determine whether or not there exists a controller in (9.1) which asymptotically
stabilizes the system (9.22). Parametrize all such controllers when one exists.

This problem can be reduced to the following linear algebra problem.

Theorem 9.2.1 Let a controller G be given. The following statements are equivalent.

(i) The controller G solves the stabilization problem.

(ii) There exists a matrix X > 0 such that (9.20) holds where
\[
\begin{bmatrix} \Theta & \Gamma & Q \end{bmatrix} =
\begin{bmatrix} A & B & X \end{bmatrix},
\qquad
\begin{bmatrix} \Lambda^T & R \end{bmatrix} =
\begin{bmatrix} M^T & X \end{bmatrix}.
\]

Proof. The result is just a restatement of Lyapunov’s stability theory for discrete-time systems,
which says that statement (i) holds if and only if there exists a Lyapunov matrix X > 0 such that

X > (A + BGM)X(A + BGM)T .

This completes the proof. 2
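This Lyapunov characterization can be checked numerically for a given gain. The sketch below uses our own hypothetical plant data (A_p, B_p, M_p) and a static gain G, and assumes SciPy is available; it verifies that a stabilizing closed loop admits a certificate X > 0 with X − A_cl X A_cl^T > 0:

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# Hypothetical data: plant (A_p, B_p, M_p) with z = M_p x, and static u = G z.
Ap = np.array([[1.1, 0.3], [0.0, 0.8]])
Bp = np.array([[0.0], [1.0]])
Mp = np.eye(2)
G = np.array([[-0.5, -0.5]])

Acl = Ap + Bp @ G @ Mp                            # closed-loop A + BGM
assert np.abs(np.linalg.eigvals(Acl)).max() < 1   # Schur stable (eigs 0.8, 0.6)

# Certificate: X = Acl X Acl^T + I gives X > 0 and X - Acl X Acl^T = I > 0.
X = solve_discrete_lyapunov(Acl, np.eye(2))
print(np.linalg.eigvalsh(X).min() > 0)  # True
```

Conversely, for an unstable A_cl the Lyapunov solve produces an indefinite X, so no certificate exists, in line with the theorem.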

9.2.2 Covariance Upper Bound Control

Next, we shall consider the covariance upper bound control problem discussed in Chapter 6. The
state space model of the plant is given by
\[
\begin{bmatrix} x_p(k+1) \\ y(k) \\ z(k) \end{bmatrix} =
\begin{bmatrix} A_p & D_p & B_p \\ C_p & D_y & 0 \\ M_p & D_z & 0 \end{bmatrix}
\begin{bmatrix} x_p(k) \\ w(k) \\ u(k) \end{bmatrix}
\tag{9.23}
\]

where xp is the state, w is the white noise with covariance I, u is the control input, y is the output
of interest and z is the measured output. The covariance upper bound control problem is the
following.

Let an output covariance bound Ȳ > 0 be given. Determine whether or not there exists
a controller in (9.1) which asymptotically stabilizes the system (9.23) and yields an
output covariance satisfying

lim_{k→∞} E [y(k)yᵀ(k)] < Ȳ.

Parametrize all such controllers when one exists.

For this problem, we have the following result.

Theorem 9.2.2 Let a controller G and an output covariance bound Ȳ > 0 be given. The following
statements are equivalent.

(i) The controller G solves the covariance upper bound control problem.

(ii) There exists a matrix X > 0 such that CXCᵀ + FFᵀ < Ȳ and (9.20) hold where
\[
\begin{bmatrix} \Theta & \Gamma & Q \end{bmatrix} =
\begin{bmatrix} A & D & B & X \end{bmatrix},
\qquad
\begin{bmatrix} \Lambda^T & R \end{bmatrix} =
\begin{bmatrix} M^T & X & 0 \\ E^T & 0 & I \end{bmatrix}.
\]

Proof. The result simply follows from Lemma 6.1.2. 2
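The quantity bounded in Theorem 9.2.2 can be computed directly for a given closed loop: the steady-state state covariance solves a discrete Lyapunov equation, and the output covariance follows. A minimal sketch with hypothetical closed-loop data (A, D, C, F are our own, not from the text), assuming SciPy:

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# Hypothetical stable closed-loop data; w is white noise with covariance I.
A = np.array([[0.5, 0.2], [0.0, 0.4]])
D = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])
F = np.array([[0.1]])

# Steady-state state covariance X solves X = A X A^T + D D^T.
X = solve_discrete_lyapunov(A, D @ D.T)
Y_out = C @ X @ C.T + F @ F.T       # lim E[y y^T]
Ybar = Y_out + 0.1 * np.eye(1)      # any bound strictly above Y_out works
print(np.linalg.eigvalsh(Ybar - Y_out).min() > 0)  # True: bound satisfied
```

Any Ȳ exceeding C X Cᵀ + F Fᵀ in the positive definite sense is achievable for this fixed loop, which is exactly the statement of condition (ii).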



9.2.3 Linear Quadratic Regulator


We consider the Linear Quadratic Regulator problem for the linear time-invariant system
\[
\begin{bmatrix} x_p(k+1) \\ y(k) \\ z(k) \end{bmatrix} =
\begin{bmatrix} A_p & D_p & B_p \\ C_p & D_y & B_y \\ M_p & 0 & 0 \end{bmatrix}
\begin{bmatrix} x_p(k) \\ w(k) \\ u(k) \end{bmatrix}
\tag{9.24}
\]

where xp is the state, w is the pulse disturbance w(k) = w0 δ(k) where δ(k) is the Kronecker delta
function, u is the control input, y is the output of interest and z is the measured output. The LQR
problem is defined by an upper bound on the square summation of the output signal as follows.

Let a performance bound γ > 0 be given. Determine whether or not there exists a
controller in (9.1) which asymptotically stabilizes the system (9.24) and yields the zero
initial state response y satisfying ‖y‖ℓ2 < γ for all pulse disturbance directions w0 with
‖w0‖ ≤ 1. Parametrize all such controllers when one exists.

This problem can be reduced to the following.

Theorem 9.2.3 Let a controller G and a performance bound γ > 0 be given. The following
statements are equivalent.

(i) The controller G solves the LQR problem.

(ii) There exists a matrix Y > 0 such that ‖DᵀYD + FᵀF‖ < γ² and (9.20) hold where
\[
\begin{bmatrix} \Theta & \Gamma & Q \end{bmatrix} =
\begin{bmatrix} A^T & C^T & M^T & Y \end{bmatrix},
\qquad
\begin{bmatrix} \Lambda^T & R \end{bmatrix} =
\begin{bmatrix} B & Y & 0 \\ H & 0 & I \end{bmatrix}.
\]

Proof. From the analysis result in Chapter 4, statement (i) holds if and only if there exists a
matrix Y > 0 such that ‖DᵀYD + FᵀF‖ < γ² and

Y > (A + BGM)ᵀ Y(A + BGM) + (C + HGM)ᵀ (C + HGM). (9.25)

Then it is straightforward to verify the result. 2
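The cost behind this proof can be cross-checked by simulation: the ℓ2 energy of the zero-initial-state pulse response equals w0ᵀ(DᵀYD + FᵀF)w0, with Y the observability Gramian of the closed loop. A sketch with hypothetical closed-loop data (A, D, C, F are ours), assuming SciPy:

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# Hypothetical stable closed-loop matrices and a pulse w(k) = w0 * delta(k).
A = np.array([[0.6, 0.1], [0.0, 0.5]])
D = np.array([[1.0], [0.0]])
C = np.array([[1.0, 1.0]])
F = np.array([[0.2]])
w0 = np.array([[1.0]])

# Observability Gramian: Y = A^T Y A + C^T C.
Y = solve_discrete_lyapunov(A.T, C.T @ C)
energy_gramian = (w0.T @ (D.T @ Y @ D + F.T @ F) @ w0).item()

# Direct simulation of the zero-initial-state pulse response.
x = np.zeros((2, 1)); energy_sim = 0.0
for k in range(200):
    w = w0 if k == 0 else np.zeros((1, 1))
    y = C @ x + F @ w
    energy_sim += (y.T @ y).item()
    x = A @ x + D @ w
print(abs(energy_sim - energy_gramian) < 1e-9)  # True
```

Requiring this energy to stay below γ² for every unit-norm w0 is exactly the norm condition ‖DᵀYD + FᵀF‖ < γ² in statement (ii).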

9.2.4 ℓ∞ Control
Consider the linear time-invariant system
\[
\begin{bmatrix} x_p(k+1) \\ y(k) \\ z(k) \end{bmatrix} =
\begin{bmatrix} A_p & D_p & B_p \\ C_p & D_y & 0 \\ M_p & D_z & 0 \end{bmatrix}
\begin{bmatrix} x_p(k) \\ w(k) \\ u(k) \end{bmatrix}
\tag{9.26}
\]
where xp is the state, y is the output of interest, w is the finite energy disturbance, z and u are
the measured output and the control input, respectively. The ℓ∞ control problem can be stated as
follows.

Let a performance bound γ > 0 be given. Determine whether or not there exists
a controller in (9.1) which asymptotically stabilizes the system (9.26) and yields the
output y such that ‖y‖ℓ∞ < γ for any disturbance w with ‖w‖ℓ2 ≤ 1. Parametrize all
such controllers when one exists.

This problem reduces to the following.

Theorem 9.2.4 Let a controller G and a performance bound γ > 0 be given. The following
statements are equivalent.

(i) The controller G solves the ℓ∞ control problem.

(ii) There exists a matrix X > 0 such that ‖CXCᵀ + FFᵀ‖ < γ² and (9.20) hold where
\[
\begin{bmatrix} \Theta & \Gamma & Q \end{bmatrix} =
\begin{bmatrix} A & D & B & X \end{bmatrix},
\qquad
\begin{bmatrix} \Lambda^T & R \end{bmatrix} =
\begin{bmatrix} M^T & X & 0 \\ E^T & 0 & I \end{bmatrix}.
\]

Proof. The result follows from Theorem 4.6.5. 2

9.2.5 H∞ Control
Consider the linear time-invariant system
\[
\begin{bmatrix} x_p(k+1) \\ y(k) \\ z(k) \end{bmatrix} =
\begin{bmatrix} A_p & D_p & B_p \\ C_p & D_y & B_y \\ M_p & D_z & 0 \end{bmatrix}
\begin{bmatrix} x_p(k) \\ w(k) \\ u(k) \end{bmatrix}
\tag{9.27}
\]

where xp is the state, y and w are the performance signals that we use to describe the design
specification, and z and u are the measured output and the control input. We denote the closed-
loop transfer matrix from w to y, with the controller in (9.1), by

T(z) = Cc` (zI − Ac` )−1 Bc` + Dc` .


The H∞ control problem is the following.

Let a performance bound γ > 0 be given. Determine whether or not there exists
a controller in (9.1) which asymptotically stabilizes the system (9.27) and yields the
closed-loop transfer matrix such that ‖T‖H∞ < γ. Parametrize all such controllers
when one exists.

As in the continuous-time case, this problem has two distinct physical significances; namely, robust-
ness with respect to norm-bounded perturbations, and the disturbance attenuation level measured
by the energy-to-energy gain. See Chapter 4. The H∞ control problem for the discrete-time case
reduces to the following.

Theorem 9.2.5 Let a controller G and a robustness/performance bound γ > 0 be given. The
following statements are equivalent.

(i) The controller G solves the H∞ control problem.

(ii) There exists a matrix X > 0 such that (9.20) holds where
\[
\begin{bmatrix} \Theta & \Gamma & Q \end{bmatrix} =
\begin{bmatrix} A & D & B & X & 0 \\ C & F & H & 0 & \gamma^2 I \end{bmatrix},
\qquad
\begin{bmatrix} \Lambda^T & R \end{bmatrix} =
\begin{bmatrix} M^T & X & 0 \\ E^T & 0 & I \end{bmatrix}.
\]

Proof. The result can be verified using Theorem 4.6.6 and noting that Υee = ‖T‖H∞ . 2

9.2.6 Robust H2 Control

Consider the uncertain system described by
\[
\begin{bmatrix} x_p(k+1) \\ y_1(k) \\ y_2(k) \\ z(k) \end{bmatrix} =
\begin{bmatrix}
A_p & D_{p1} & D_{p2} & B_p \\
C_{p1} & D_{y11} & 0 & B_{y1} \\
C_{p2} & D_{y21} & D_{y22} & B_{y2} \\
M_p & D_{z1} & 0 & 0
\end{bmatrix}
\begin{bmatrix} x_p(k) \\ w_1(k) \\ w_2(k) \\ u(k) \end{bmatrix},
\qquad w_1 = \Delta y_1,
\tag{9.28}
\]

where xp is the state, y1 and w1 are the exogenous signals to describe the uncertainty ∆, y2
and w2 are the output of interest and the impulsive disturbance, and z and u are the measured
output and the control input. We assume that the uncertainty ∆ belongs to the following set of
norm-bounded, time-varying, structured uncertainties:


BUD = { ∆ : I → R^{m×m} , ‖∆(k)‖ ≤ 1, ∆(k) ∈ U } (9.29)

where I is the set of integers and U is the set of structured matrices defined
by (9.16). In the sequel, we will use the set of scaling matrices S in (9.17), corresponding to the
uncertainty structure (9.16). The robust H2 control problem for the discrete-time case is analogous
to the continuous-time counterpart, and can be stated as follows.

Let a robust performance bound γ > 0 be given. Find a controller in (9.1), for the
uncertain system (9.28), such that the closed-loop system is robustly stable and the
output y2 satisfies ‖y2‖ℓ2 < γ for all pulse disturbances w2(k) = w0 δ(k) with ‖w0‖ ≤ 1,
and for all uncertainties ∆ ∈ BUD .

To address this problem in the framework of Theorem 4.7.6, we need a technical assumption w1 (0) =
0. With this assumption, we have the following.

Theorem 9.2.6 Let a controller G and a robust performance bound γ > 0 be given. Suppose there
exist a Lyapunov matrix Y > 0 and a scaling matrix S ∈ S such that D_2^T Y D_2 + F_{22}^T F_{22} < γ²I and
(9.20) hold where
\[
\begin{bmatrix} \Theta & \Gamma & Q \end{bmatrix} =
\begin{bmatrix}
A^T & C_1^T & C_2^T & M^T & Y & 0 \\
D_1^T & F_{11}^T & F_{21}^T & E_1^T & 0 & S
\end{bmatrix},
\qquad
\begin{bmatrix} \Lambda^T & R \end{bmatrix} \triangleq
\begin{bmatrix}
B & Y & 0 & 0 \\
H_1 & 0 & S & 0 \\
H_2 & 0 & 0 & I
\end{bmatrix}.
\]
Then the controller G solves the robust H2 control problem.

Proof. The result follows from Theorem 4.7.6 by substituting the closed-loop matrices defined in
(9.3). 2

9.2.7 Robust `∞ Control


Consider the uncertain system described by
\[
\begin{bmatrix} x_p(k+1) \\ y_1(k) \\ y_2(k) \\ z(k) \end{bmatrix} =
\begin{bmatrix}
A_p & D_{p1} & D_{p2} & B_p \\
C_{p1} & D_{y11} & D_{y12} & B_{y1} \\
C_{p2} & 0 & D_{y22} & 0 \\
M_p & D_{z1} & D_{z2} & 0
\end{bmatrix}
\begin{bmatrix} x_p(k) \\ w_1(k) \\ w_2(k) \\ u(k) \end{bmatrix},
\qquad w_1 = \Delta y_1,
\tag{9.30}
\]

where xp is the state, y1 and w1 are the exogenous signals to describe the uncertainty ∆ ∈ BUD , y2
and w2 are the output of interest and the finite energy disturbance, and z and u are the measured
output and the control input. We consider the following robust ℓ∞ control problem.

Let a robust performance bound γ > 0 be given. Find a controller in (9.1), for the
uncertain system (9.30), such that the closed-loop system is robustly stable and the
output y2 satisfies ‖y2‖ℓ∞ < γ for all disturbances such that ‖w2‖ℓ2 ≤ 1, and for all
uncertainties ∆ ∈ BUD .

This problem may be (conservatively) approached using the following theorem.

Theorem 9.2.7 Let a controller G and a robust performance bound γ > 0 be given. Suppose there
exist a Lyapunov matrix X > 0 and a scaling matrix S ∈ S such that C_2 X C_2^T + F_{22} F_{22}^T < γ²I and
(9.20) hold where
\[
\begin{bmatrix} \Theta & \Gamma & Q \end{bmatrix} =
\begin{bmatrix}
A & D_1 & D_2 & B & X & 0 \\
C_1 & F_{11} & F_{12} & H_1 & 0 & S
\end{bmatrix},
\qquad
\begin{bmatrix} \Lambda^T & R \end{bmatrix} \triangleq
\begin{bmatrix}
M^T & X & 0 & 0 \\
E_1^T & 0 & S & 0 \\
E_2^T & 0 & 0 & I
\end{bmatrix}.
\]
Then the controller G solves the robust ℓ∞ control problem.

Proof. The result follows by replacing matrices in Theorem 4.7.7 by the closed-loop matrices
defined in (9.3). 2

9.2.8 Robust H∞ Control


Consider the uncertain system described by
\[
\begin{bmatrix} x_p(k+1) \\ y_1(k) \\ y_2(k) \\ z(k) \end{bmatrix} =
\begin{bmatrix}
A_p & D_{p1} & D_{p2} & B_p \\
C_{p1} & D_{y11} & D_{y12} & B_{y1} \\
C_{p2} & D_{y21} & D_{y22} & B_{y2} \\
M_p & D_{z1} & D_{z2} & 0
\end{bmatrix}
\begin{bmatrix} x_p(k) \\ w_1(k) \\ w_2(k) \\ u(k) \end{bmatrix},
\qquad w_1 = \Delta y_1,
\tag{9.31}
\]

where xp is the state, y1 and w1 are the exogenous signals to describe the uncertainty ∆ ∈ BUD ,
y2 and w2 are the signals to describe the performance specification, and z and u are the measured
output and the control input. The robust H∞ control problem can be stated as follows.

Let a robust performance bound γ > 0 be given. Find a controller in (9.1), for the
uncertain system (9.31), such that the closed-loop system is robustly stable and the
output y2 satisfies ‖y2‖ℓ2 < γ for all disturbances such that ‖w2‖ℓ2 ≤ 1, and for all
uncertainties ∆ ∈ BUD .

The following formulation of the robust H∞ control problem is called the state space upper bound
µ-synthesis [26, 97].

Theorem 9.2.8 Let a controller G and a robust performance bound γ > 0 be given. Suppose there
exist a Lyapunov matrix X > 0 and a scaling matrix S ∈ S satisfying inequality (9.20) with
\[
\begin{bmatrix} \Theta & \Gamma & Q \end{bmatrix} \triangleq
\begin{bmatrix}
A & D_1 & D_2 & B & X & 0 & 0 \\
C_1 & F_{11} & F_{12} & H_1 & 0 & S & 0 \\
C_2 & F_{21} & F_{22} & H_2 & 0 & 0 & \gamma^2 I
\end{bmatrix},
\qquad
\begin{bmatrix} \Lambda^T & R \end{bmatrix} \triangleq
\begin{bmatrix}
M^T & X & 0 & 0 \\
E_1^T & 0 & S & 0 \\
E_2^T & 0 & 0 & I
\end{bmatrix}.
\]
Then the controller G solves the robust H∞ control problem.

Proof. The result follows from Theorem 4.7.8, using the closed-loop matrices defined in (9.3). 2

Chapter 9 Closure
We have shown that many linear control design problems, with stability, performance, and robust-
ness specifications, can be reduced to a single problem of solving a matrix inequality

ΓGΛ + (ΓGΛ)T + Θ < 0



for the controller parameter G, where the other matrices are appropriately defined for each control
problem, in terms of the plant data, a Lyapunov matrix X (or Y), and possibly a scaling
matrix S.
When designing a controller based on this approach, one must find a Lyapunov matrix X (or
Y) satisfying the existence conditions

Γ⊥ ΘΓ⊥T < 0, ΛT ⊥ ΘΛT ⊥T < 0.

Once such an X is found, then a controller can be computed by the explicit formula given in either
Theorem 2.3.12 (for the continuous-time case) or Theorem 2.3.11 (for the discrete-time case). In
the subsequent chapters, we shall give some algorithms to find a Lyapunov matrix X. As will be
shown, this Lyapunov-matrix-search for each control problem can also be unified.
The unified approach in this chapter is essentially from [56], and has appeared in a conference
paper [62]. An LMI solution and detailed discussion for each control problem considered here can
be found in the literature, including [63, 65, 60]. The impact of the unified approach on control
education is discussed in [128]. A unified perspective for control design based on algebraic problem
formulations can be traced back to [131]. Note, however, that the formulation in [131] is based on
matrix equations rather than matrix inequalities.
The unified approach presented here is based on a single algebraic problem of the type (9.5).
A complete solution (existence conditions and an explicit formula) for this problem was first given
in [59] (see [56, 60] for a proof). The existence condition also appeared in [8, 34]. The problem
(9.20), which is a special case of (9.5), has been solved [20, 103] and the solution has been applied
to the scaled H∞ control problem [99].
Chapter 10

Projection Methods

The main objectives of Chapters 10 and 11 are to provide effective computational tools for the
numerical solution of the control design problems discussed in the previous chapters. To this
end, the concept of convexity, defined later in this chapter, will play a fundamental role in our
discussion. Convexity will allow us to apply existing computational techniques for our design
purposes, and it will also enable the development of some new tools specifically designed to find
solutions for these design problems. Convexity is the most important property to seek in the control
design problems formulated in the previous chapters, since effective techniques exist to solve such problems.
In this chapter, we will follow a geometric approach in our computational design. The design
problems will often be formulated as feasibility problems, where the desired solution lies in the
intersection of a family of constraint sets; as minimum distance problems, where the solution
is the point in the intersection that minimizes the distance from a given point; or as problems
involving the minimum distance between disjoint sets. The constraint sets will often have a simple
geometric structure, such as planes, cones, or polygons, and the reader is encouraged to visualize the
geometry of the problems in simple examples. The simple geometry of some of the design problems
will motivate the use of Alternating Convex Projection techniques to obtain a numerical solution.
The reader should be familiar with Appendix A before reading this chapter.

10.1 Alternating Convex Projection Techniques

10.1.1 Convexity

Consider a set K in a vector space. The set K is called convex if for any two vectors x and y in K
the vector (1 − λ)x + λy is also in K for 0 ≤ λ ≤ 1. This definition says that given two points in a
convex set, the line segment between them is also in the set. For example, a subspace is a convex
set. Also, the set of matrices X satisfying X ≥ P, for a given P, forms a convex set.

Example 10.1.1 Show that the set of positive semidefinite matrices forms a convex set.


Solution: Let X and Y be n × n positive semidefinite matrices. Define the matrix C =
(1 − λ)X + λY where 0 ≤ λ ≤ 1. We have to show that C is positive semidefinite. To this end,
consider any vector z ∈ Cⁿ and observe that z∗Cz = (1 − λ)z∗Xz + λz∗Yz. However, z∗Xz ≥ 0
and z∗Yz ≥ 0 for any z. Hence, z∗Cz ≥ 0 for any z, that is, C is positive semidefinite.
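The same argument can be confirmed numerically (our own illustration): every convex combination of two randomly generated positive semidefinite matrices remains positive semidefinite.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
# Two random positive semidefinite matrices (M M^T is always PSD).
Mx = rng.standard_normal((n, n)); X = Mx @ Mx.T
My = rng.standard_normal((n, n)); Y = My @ My.T

for lam in np.linspace(0.0, 1.0, 11):
    Cmat = (1 - lam) * X + lam * Y
    # Smallest eigenvalue stays nonnegative (up to floating-point noise).
    assert np.linalg.eigvalsh(Cmat).min() >= -1e-10
print("all convex combinations are PSD")
```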

10.1.2 Feasibility, Optimization and Infeasible Optimization Problems


We begin our discussion with a description of the type of computational problems we will encounter.
Generally, our unknowns will be symmetric matrices constrained to satisfy given matrix equality or
inequality constraints. Each one of these matrix constraints defines a set in the space of symmetric
matrices. For example, consider the matrix equality constraint

AX + XAT + Q = 0 (10.1)

where
\[
A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \qquad
Q = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}.
\]
It is simple to verify that (10.1) defines the following set in the space of 2 × 2 symmetric matrices

{X : X11 = arbitrary, X12 = X21 = −1/2, X22 = 0}. (10.2)
In the (X11 , X22 , X12 ) space, this set corresponds to a line parallel to the X11 axis which crosses
the X12 axis at −1/2. Note that when Q = I, the set defined by (10.1) is the empty set, since in
this case there exists no matrix X to satisfy (10.1). In our computational problems we will often
seek a matrix in the intersection of these type of matrix constraint sets.
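For the example above, membership in the set (10.2) is easy to confirm numerically (our own check): every symmetric matrix with X12 = X21 = −1/2, X22 = 0, and arbitrary X11 satisfies (10.1).

```python
import numpy as np

A = np.array([[0.0, 1.0], [0.0, 0.0]])
Q = np.array([[1.0, 0.0], [0.0, 0.0]])

# Any X with X12 = X21 = -1/2 and X22 = 0 solves A X + X A^T + Q = 0,
# regardless of the value of X11.
for x11 in (-3.0, 0.0, 7.5):
    X = np.array([[x11, -0.5], [-0.5, 0.0]])
    assert np.allclose(A @ X + X @ A.T + Q, 0)
print("every point of the set (10.2) solves (10.1)")
```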
To begin, consider a family C1 , C2 , . . . , Cm of m sets in the space of symmetric matrices. We
assume that these sets are convex and closed (i.e., they contain their limit points [115]). For our
purposes, any set defined by a matrix equality constraint, as in (10.1), or by a matrix semidefinite
constraint, such as AX + XAᵀ ≤ 0, is closed. Note that the concept of closedness of a
set should not be confused with the concept of boundedness. A set can be closed but unbounded,
as for example the set (10.2).
We now define the type of problems we will solve:
Feasibility problem: Suppose that the sets C1 , C2 , . . . , Cm have a nonempty intersection. Then,
the feasibility problem is to find a symmetric matrix X in this intersection, i.e., to find X such that
X ∈ C1 ∩ C2 ∩ · · · ∩ Cm . (10.3)

Note that there might be no solution or an infinite number of solutions in a feasibility problem (see
Fig. 10.1).
Optimization problem: Again suppose that the sets C1 , C2 , . . . , Cm have nonempty intersec-
tion and consider a given n × n symmetric matrix Xo . The optimization problem we will solve is to
Figure 10.1: Feasibility Problem

find the matrix in the intersection of the sets, which is closest to the matrix Xo . In mathematical
terms, we seek to solve the minimization problem
minimize ‖X − Xo‖ subject to X ∈ C1 ∩ C2 ∩ · · · ∩ Cm . (10.4)

According to the projection theorem for convex sets [84], this minimization problem has a unique
solution, given by the projection of the matrix Xo on the intersection of the sets (see Fig. 10.2).
Infeasible optimization problem: Now consider the case of two nonempty constraint sets
C1 and C2 and suppose that they are disjoint, i.e., their intersection is empty. The infeasible
optimization problem is to find a symmetric matrix X in the set C1 which is closest to the set C2
(see Fig. 10.3). In mathematical terms we seek to solve the minimization problem

minimize dist(X, C2 ) subject to X ∈ C1 (10.5)

where dist(X, C2 ) is the distance of the matrix X from the set C2 , defined by

dist(X, C2 ) = inf{ ‖X − Y‖ : Y ∈ C2 }.

The solution to the infeasible optimization problem might not be unique. For example, when C1
and C2 are two parallel planes, then any point of C1 provides a solution to the minimization
problem (10.5).
In the following, we will describe numerical techniques to solve the above types of problems
using Alternating Convex Projection (ACP) techniques.
Figure 10.2: Optimization Problem

10.1.3 The Standard ACP Method

Consider a family C1 , C2 , . . . , Cm of closed, convex sets in the space of symmetric matrices. We


suppose that the sets have a nonempty intersection and we seek to solve the Feasibility problem
defined in Section 10.1.2. Let PCi denote the orthogonal projection operator onto the set Ci where
i = 1, . . . , m. That is, for any n × n symmetric matrix X, the matrix PCi (X) denotes the orthogonal
projection of X onto Ci , i.e., the matrix in Ci which has minimum distance from the matrix X. The
orthogonal projection theorem ([84] and Appendix A) guarantees that this projection is uniquely
defined. We assume that the sets C1 , C2 , . . . , Cm are of simple geometric structure (for example
planes, cones, spheres, polygons, etc) so that an analytical expression for each projection operator
PCi can be derived. The question we would like to answer is the following: Is it possible to provide a
solution to the feasibility problem by making use of the orthogonal projections onto each constraint
set? The answer is yes, and is provided by the following result [12], [47] which we call the Standard
Alternating Projection Theorem.

Theorem 10.1.1 Let X0 be any n × n symmetric matrix, and C1 , C2 , . . . , Cm be a family of closed,
convex sets in the space of symmetric matrices. Then, if there exists an intersection, the sequence
convex sets in the space of symmetric matrices. Then, if there exists an intersection, the sequence
of alternating projections

X1 = PC1 X0
X2 = PC2 X1
..
.
Xm = PCm Xm−1
Figure 10.3: Infeasible Optimization Problem

..
.
X2m = PCm X2m−1 (10.6)
X2m+1 = PC1 X2m
..
.
X3m = PCm X3m−1
..
.

converges to a point in the intersection of the sets, that is, Xi → X where X ∈ C1 ∩ C2 ∩ · · · ∩ Cm .
If no intersection exists, the sequence converges to a limit cycle (a periodic iteration between the
disjoint sets).

Hence, starting from any symmetric matrix, the sequence of alternating projections onto the
constraint sets converges to a solution of the feasibility problem, if one exists. A schematic repre-
sentation of the Standard Alternating Projection Method is shown in Fig. 10.4. It can be easily
verified that the limit X of the alternating projection sequence depends on the starting point X0 ,
as well as the order of the projections. Hence, by rearranging the sequence of projections we can
obtain a different feasible point. See [74] for some examples.
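A minimal numerical illustration of the standard ACP iteration (our own example, not from the text) alternates between the positive semidefinite cone and the affine set {X : X11 = 1}; clipping negative eigenvalues is the Frobenius-norm projection onto the cone:

```python
import numpy as np

def proj_psd(X):
    """Orthogonal (Frobenius-norm) projection onto the PSD cone."""
    w, V = np.linalg.eigh((X + X.T) / 2)
    return V @ np.diag(np.maximum(w, 0)) @ V.T

def proj_affine(X):
    """Projection onto {X symmetric : X[0,0] = 1}: reset one entry."""
    Y = X.copy(); Y[0, 0] = 1.0
    return Y

X = np.array([[-2.0, 1.0], [1.0, 0.5]])  # symmetric, indefinite start
for _ in range(200):
    X = proj_affine(proj_psd(X))         # one full alternating-projection cycle

assert abs(X[0, 0] - 1.0) < 1e-6         # in the affine set
assert np.linalg.eigvalsh(X).min() > -1e-6  # in the PSD cone
print("converged to a matrix in the intersection")
```

The limit is a feasible point, but, as noted in the text, which point is reached depends on the starting matrix and the projection order.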
Figure 10.4: Standard Alternating Convex Projection Algorithm

10.1.4 The Optimal ACP Method


Our next objective is to provide a technique to solve the Optimization Problem (10.4), using an
alternating projection approach. We first observe that the Standard ACP algorithm (10.6) does not
necessarily converge to the solution of the Optimization Problem. In fact, a simple example where
C1 is a disc in the plane and C2 is a line which intersects the disc, can easily show that the sequence
of alternating convex projections (10.6) is not adequate in this case. However, a simple modification
of the Standard ACP method provides an algorithm which solves the Optimization problem. To
describe this result, consider the closed, convex sets C1 , C2 , . . . , Cm and a given n × n matrix X0 .
The following result [9], [49], which we call the Optimal Alternating Convex Projection Theorem,
provides an algorithm which converges to the solution of the Optimization Problem (10.4).

Theorem 10.1.2 Consider the sequence of matrices {Xi}, i = 1, 2, . . . , ∞, computed as follows:

X1 = PC1 X0,                 Z1 = X1 − X0
X2 = PC2 X1,                 Z2 = X2 − X1
  ⋮                            ⋮
Xm = PCm Xm−1,               Zm = Xm − Xm−1
Xm+1 = PC1 (Xm − Z1),        Zm+1 = Z1 + Xm+1 − Xm
Xm+2 = PC2 (Xm+1 − Z2),      Zm+2 = Z2 + Xm+2 − Xm+1        (10.7)
  ⋮                            ⋮
X2m = PCm (X2m−1 − Zm),      Z2m = Zm + X2m − X2m−1
X2m+1 = PC1 (X2m − Zm+1),    Z2m+1 = Zm+1 + X2m+1 − X2m
  ⋮                            ⋮
10.1. ALTERNATING CONVEX PROJECTION TECHNIQUES 245

Figure 10.5: Directional Alternating Projection Algorithm

Then the sequence {Xi } converges to a matrix X that solves the Optimization Problem (10.4).

Note that the algorithm (10.7) is a modified alternating projection algorithm, where in each
step, an increment Zi is removed before projection to the corresponding convex set. This forces
the algorithm to converge to the solution X of the Optimization Problem.
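The increments Zi make (10.7) an instance of what is often called Dykstra's algorithm. A minimal sketch in the plane (our own toy sets, echoing the disc/line example mentioned earlier: C1 the closed unit disc, C2 the vertical line x = 0.5) shows that it converges to the nearest point of the intersection, not merely to some feasible point:

```python
import numpy as np

def proj_disc(p):
    """Projection onto C1 = the closed unit disc."""
    n = np.linalg.norm(p)
    return p if n <= 1.0 else p / n

def proj_line(p):
    """Projection onto C2 = the line {x = 0.5}."""
    return np.array([0.5, p[1]])

def optimal_acp(p0, iters=1000):
    """Modified alternating projections (10.7): remove the previous increment
    for each set before projecting, then update the increment."""
    x = np.asarray(p0, dtype=float)
    z = [np.zeros(2), np.zeros(2)]          # one increment per set
    projections = [proj_disc, proj_line]
    for _ in range(iters):
        for i, proj in enumerate(projections):
            y = proj(x - z[i])              # X_{m+1} = P(X_m - Z_i)
            z[i] = z[i] + y - x             # Z_{m+1} = Z_i + X_{m+1} - X_m
            x = y
    return x

x_star = optimal_acp(np.array([2.0, 2.0]))
```

Here the limit is the point of the chord closest to (2, 2), namely (0.5, √3/2); the plain alternating projections (10.6) started from the same point would stop at a different feasible point.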
An important feature of the two projection algorithms (10.6) and (10.7) is that, when the
analytical expressions for the projections onto the constraint sets are available, then the algorithms
can be implemented very easily and the amount of calculations in one iteration is very small.
However, in some cases the algorithms may suffer from slow convergence. For example, consider
the case of two planes intersecting with a small angle. In this case the standard ACP algorithm
(10.6) might oscillate for many iterations between the two sets before it converges to a point in
the intersection. An effective remedy, for the case of the Feasibility Problem, might be to use the
Directional Alternating Convex Projection Algorithm, described next [47].

10.1.5 The Directional ACP Method


The Directional ACP method uses information about the geometry of the constraint sets to provide
an algorithm with accelerated convergence to solve the Feasibility Problem (10.3). The basic idea
behind this approach is to utilize in each iteration the tangent plane of one of the constraint sets,
so that the sequence of points we obtain approaches the intersection of the sets more rapidly (see
Fig. 10.5). For simplicity, we will consider the case of two closed and convex constraint sets C1
and C2. The Directional Alternating Convex Projection Theorem is described next, where ⟨X, Y⟩
denotes the inner product of two matrices X and Y (see Appendix A.5).

Theorem 10.1.3 Let X0 be any n × n symmetric matrix. Then the sequence of matrices {Xi},
i = 1, 2, . . . , ∞, given by

X1 = PC1 X0,   X2 = PC2 X1,   X3 = PC1 X2,

X4 = X1 + λ1 (X3 − X1),   λ1 = ‖X1 − X2‖² / ⟨X1 − X3, X1 − X2⟩

X5 = PC1 X4,   X6 = PC2 X5,   X7 = PC1 X6,                    (10.8)

X8 = X5 + λ2 (X7 − X5),   λ2 = ‖X5 − X6‖² / ⟨X5 − X7, X5 − X6⟩
⋮

converges to a point in the intersection of the sets C1 and C2 .

Hence, starting from any symmetric matrix, the sequence of directional alternating projections
(10.8) provides an accelerated numerical algorithm to solve the Feasibility Problem (10.3). In
fact, it can be easily verified that when the two sets C1 and C2 are hyperplanes in the space of
symmetric matrices then the alternating projection algorithm converges to a feasible point in one
cycle, independently of the angle between the two hyperplanes.
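The one-cycle claim for hyperplanes is easy to check numerically. In this sketch (our own toy sets) C1 is the x-axis and C2 the line y = x, which meet only at the origin:

```python
import numpy as np

def proj_xaxis(p):
    """Projection onto C1 = {(x, y) : y = 0}."""
    return np.array([p[0], 0.0])

def proj_diag(p):
    """Projection onto C2 = {(x, y) : y = x}."""
    t = (p[0] + p[1]) / 2.0
    return np.array([t, t])

def directional_cycle(x0):
    """One cycle of the directional ACP (10.8)."""
    x1 = proj_xaxis(x0)
    x2 = proj_diag(x1)
    x3 = proj_xaxis(x2)
    lam = np.dot(x1 - x2, x1 - x2) / np.dot(x1 - x3, x1 - x2)
    return x1 + lam * (x3 - x1)

x4 = directional_cycle(np.array([3.0, 5.0]))   # lands exactly on the intersection
```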

Exercise 10.1.1 Based on the geometry of Fig. 10.5 derive the expression for X4 and λ1 in (10.8),
in terms of X1 , X2 and X3 .

Example 10.1.2 We consider the state space model of a fighter aircraft provided in the MATLAB
Robust Control Toolbox (1988).

Ap = [ −0.0226  −36.6170  −18.8970  −32.0900    3.2509   −0.7626
        0.0001   −1.8997    0.9831   −0.0007   −0.1708   −0.0050
        0.0123   11.7200   −2.6316    0.0009  −31.6040   22.3960
        0         0         1.0000    0         0         0
        0         0         0         0       −30.0000    0
        0         0         0         0         0       −30.0000 ]

Bp = [  0   0
        0   0
        0   0
        0   0
       30   0
        0  30 ],    Cp = [ 0  1  0  0  0  0
                           0  0  0  1  0  0 ]

We assume that all states are available for feedback, i.e., z = xp. We look for a static state
feedback controller to satisfy the following output performance constraints

E[y1²] ≤ 0.01,   E[y2²] ≤ 0.01

i.e., the output covariance constraint set V has the form

V = {X ∈ S : X22 ≤ 0.01, X44 ≤ 0.01}

We seek to solve the Covariance Feasibility Problem (10.27), i.e., to find an assignable covariance
matrix Xp such that Xp ∈ V, and the Covariance Optimization Problem (10.28)-(10.29), where we
look for an assignable covariance matrix Xp such that Xp ∈ V and ‖Xp − 2I‖ is minimized. For
comparison, we will apply all three alternating projection methods described in Section 10.1. The
algorithms are initialized from the same starting point (Xp)0 = 2I and we require an error
bound for the solution (sum of the distances from the constraint sets) less than 10⁻⁵. The standard
alternating projection technique (10.6) provides the following feasible covariance matrix
 
Xp = [  2.0037   0.0138  −0.0277  −0.0638  −0.5342   0.3760
        0.0138   0.0028   0.0058  −0.0038   0.0042  −0.0531
       −0.0277   0.0058   1.3283   0.0000  −0.0596   0.0689
       −0.0638  −0.0038   0.0000   0.0100   0.0752   0.0488
       −0.5342   0.0042  −0.0596   0.0752   2.0009  −0.0014
        0.3760  −0.0531   0.0689   0.0488  −0.0014   2.0044 ]
This method required 15855 iterations (6.3607 × 10⁷ floating point operations (flops)) to converge
to the feasible solution. The distance of the starting point (Xp)0 = 2I to the feasible
covariance Xp is ‖Xp − (Xp)0‖ = 3.0499. The algorithm (10.7), which provides the feasible
covariance of minimum distance from (Xp)0 = 2I, converged to the following solution
 
Xp = [  2.0005   0.0207  −0.0144  −0.0795  −0.5332   0.3762
        0.0207   0.0028   0.0054  −0.0037   0.0022  −0.0561
       −0.0144   0.0054   1.3301   0.0000  −0.0599   0.0690
       −0.0795  −0.0037   0.0000   0.0100   0.0713   0.0432
       −0.5332   0.0022  −0.0599   0.0713   2.0014  −0.0020
        0.3762  −0.0561   0.0690   0.0432  −0.0020   2.0029 ]
This method required 16259 iterations (9.3526 × 10⁷ flops) and the distance from the starting point
is ‖Xp − (Xp)0‖ = 3.0496. The directional alternating projection algorithm (10.8) provided the
following answer
 
Xp = [  2.0056   0.0101  −0.0328  −0.0567  −0.5350   0.3757
        0.0101   0.0028   0.0060  −0.0039   0.0046  −0.0525
       −0.0328   0.0060   1.3274   0.0000  −0.0595   0.0688
       −0.0567  −0.0039  −0.0000   0.0100   0.0767   0.0511
       −0.5350   0.0046  −0.0595   0.0767   2.0009  −0.0010
        0.3757  −0.0525   0.0688   0.0511  −0.0010   2.0055 ]

This method required 6 iterations (1.7070 × 10⁵ flops) and the distance from the starting point is
‖Xp − (Xp)0‖ = 3.0502. Note that in all cases the output variance constraint corresponding
to the first output y1 (the (2,2) element of Xp) is not binding, while the one corresponding to
y2 (the (4,4) element of Xp) is binding (i.e., it reaches the allowed bound 0.01).
A controller which assigns to the closed-loop system the covariance matrix Xp which resulted
from the directional projection algorithm (10.8) (setting the free parameters to minimize the re-
quired control effort as in [43]) is the following
" #
−0.0103 26.5768 1.4216 14.0812 0.3636 0.2426
G=
0.0402 −10.6659 −1.0383 −4.9792 0.2426 −0.8011

We observe that for our example, the directional alternating projection algorithm converges to a
feasible solution much faster than the other two algorithms.

10.2 Geometric Formulation of Covariance Control


In this section we will provide a geometric formulation of the covariance control design problem, as
a Feasibility, Optimization or Infeasible Optimization problem in the space of np × np symmetric
matrices. Hence, later we will apply the ACP techniques of the previous section for a numerical
solution. The results of this section can also be found in [44] and [46].

10.2.1 State Feedback

Consider the static state feedback covariance control design problem described in Section 6.2.1.
Let S be the set (vector space) of np × np real symmetric matrices and define the following subsets
of S
A = {X ∈ S : (I − BB⁺)(AX + XAᵀ + W)(I − BB⁺) = 0}     (10.9)

P = {X ∈ S : X > 0}. (10.10)

Then, according to Theorem 6.2.1, a matrix X is assignable to the closed-loop system as a state
covariance if and only if X is in the intersection of the two sets A and P. Hence, we have the
following geometric interpretation of the assignability theory.

Corollary 10.2.1 A matrix X is an assignable plant state covariance if and only if X ∈ A ∩ P.

Note that the sets A and P have special structure. Specifically, since the assignability equation
in (10.9) is a linear equation, the set A is a plane (an affine subspace) in the space of symmetric
matrices S. On the other hand, the set P of positive definite matrices is a convex cone in S. The

set A is obviously closed and convex; however, P is not closed. Since it will be easier for us to deal
with closed sets, we define the following closed ε-approximation of the set P

Pε = {X ∈ S : X ≥ εI}     (10.11)

where ε is an arbitrarily small positive number. We call any matrix X ∈ A ∩ Pε an ε-assignable plant
covariance. Since we will choose ε to be very small, we will not distinguish between ε-assignability
and assignability.
Our geometric formulation implies that the static state feedback covariance assignability prob-
lem can be seen as a feasibility problem: Find a matrix X such that

X ∈ A ∩ Pε.     (10.12)

This formulation of the covariance control problem will allow us to use the alternating convex
projection techniques for a numerical solution.

10.2.2 Dynamic Output Feedback with Measurement Noise


Now we consider a more complex problem, that of full-order dynamic output feedback covariance
control. The necessary and sufficient conditions for a matrix Xp to be assignable as a plant state
covariance by a full-order dynamic controller are provided in Corollary 5.2.2. Define the following
set in the space S of np × np symmetric matrices.

P = {Xp ∈ S : Xp > P}. (10.13)

where P is defined by
Ap P + P Apᵀ − P Mpᵀ V⁻¹ Mp P + W = 0.     (10.14)
Note that the set P in (10.13) reduces to the one in (10.10) when P = 0. Hence, we denote
both sets by the same symbol, and the reader must distinguish between the cases of static state
feedback and full-order dynamic feedback. We conclude that the following result is true.

Corollary 10.2.2 A matrix Xp is an assignable plant state covariance by full-order dynamic output
feedback if and only if Xp ∈ A ∩ P.

As earlier, we define the following closed ε-approximation of P

Pε = {Xp ∈ S : Xp ≥ P + εI}     (10.15)

where ε is an arbitrarily small positive number.


Therefore, the full-order dynamic covariance control problem can be posed as a feasibility prob-
lem: Find a matrix Xp such that

Xp ∈ A ∩ Pε     (10.16)
Hence, both the static state feedback and the full-order dynamic output feedback covariance control
problems can be formulated as similar feasibility problems of finding a symmetric matrix in the
intersection of a plane with a convex cone.

10.2.3 Output Performance Constraints


To make the covariance control problems of the previous sections more practical, one can add some
output performance constraints to the design problem. To this end, consider the following output
variance constraints (or L2 to L∞ constraints) on the system outputs

(CXCT )ii ≤ σi , i = 1, 2, . . . , ny (10.17)

where C = [Cp , 0] is the system output matrix and σi , i = 1, 2, . . . , ny , are the desired bounds on
the output variances. We suppose that
Cpᵀ = [c1, c2, . . . , cny]

where ci are the columns of Cpᵀ. Then, by defining the following subsets of S

Vi = {Xp ∈ S : ci T Xp ci ≤ σi } (10.18)

each one of the output variance constraints (10.17) is equivalent to the feasibility condition

Xp ∈ Vi.     (10.19)

In a similar way we consider the output quadratic cost constraint

tr(Xp CTp QCp ) ≤ γ (10.20)

where Q is a positive definite weighting matrix and γ is a given positive bound on the output cost.
We define the following constraint set in S

C = {Xp ∈ S : tr(Xp CTp QCp ) ≤ γ}. (10.21)

Then, the output cost constraint (10.20) is equivalent to the feasibility condition

Xp ∈ C. (10.22)

We can also treat block output covariance constraints as follows. Suppose that the output
matrix Cp is decomposed in block matrices as follows

CTp = [C1 , C2 , . . . , Ck ] .

where the decomposition is compatible with the block output covariance constraints of interest.
That is, we assume that we want to satisfy matrix covariance constraints of the form

Ci T Xp Ci ≤ Yi , i = 1, 2, . . . , k. (10.23)

where Yi are given bounding matrices. By defining the constraint sets

Oi = {Xp ∈ S : Ci T Xp Ci ≤ Yi } (10.24)

the ith block covariance constraint is equivalent to the feasibility condition

Xp ∈ Oi . (10.25)

From the above geometric formulation we conclude that the covariance control design problem,
subject to output variance, output cost and output matrix covariance constraints can be formulated
as a feasibility problem of the general form: Find a matrix Xp such that
Xp ∈ A ∩ Pε ∩ V1 ∩ V2 ∩ · · · ∩ Vny ∩ O1 ∩ O2 ∩ · · · ∩ Ok ∩ C.     (10.26)

We examine next the fundamental covariance control problems we wish to solve, using our
geometric approach.

10.2.4 Covariance Control Problems

Following the abstract mathematical formulation of Section 10.1.2, we can define the following
covariance control design problems using the geometric approach of the previous section.

Covariance Feasibility Problem

As we have already seen, the covariance control design problem, subject to output covariance and
output cost constraints, can be formulated as a feasibility problem in S, as follows: Find a matrix
Xp such that:
Xp ∈ A ∩ Pε ∩ V1 ∩ V2 ∩ · · · ∩ Vny ∩ O1 ∩ O2 ∩ · · · ∩ Ok ∩ C.     (10.27)

or determine that none exists. Note that stabilizability of the system implies that the intersection
of the sets A and Pε is nonempty. However, when the output performance constraints, determined
by the sets Vi, Oi and C, are too stringent, it is possible that no matrix Xp exists to satisfy
the feasibility condition (10.27).

Covariance Optimization Problem

Alternatively, we can formulate the covariance design problem as an optimization problem, as


follows: Suppose that X∗ denotes a desirable but nonassignable plant covariance matrix. Hence,
X∗ contains the desired closed-loop system properties we would like our system to possess (we
know from Chapter 4 that many closed-loop properties can be characterized in terms of the plant
covariance matrix, for example robustness, pole location, output performance, etc.) We seek an
assignable plant covariance Xp which also satisfies output variance and output cost constraints,
and minimizes the distance from the desired covariance X∗ . In mathematical terms, the problem
can be posed as: Given X∗ , find Xp to solve the minimization problem

minimize kXp − X∗ k (10.28)



subject to
Xp ∈ A ∩ Pε ∩ V1 ∩ V2 ∩ · · · ∩ Vny ∩ O1 ∩ O2 ∩ · · · ∩ Ok ∩ C.     (10.29)

Note that when the intersection of the constraint sets is nonempty, this problem has a unique
solution.

Infeasible Optimization Problem

The natural question which arises from the above discussion is what happens when the intersection
of the constraint sets is empty? Can we obtain a satisfactory answer by relaxing some of the
constraints? Note that this is a very important question in practical design problems since usually
the design engineer does not know a priori if all the design constraints can be met. In the covariance
design problem the assignability constraints A and P² are hard constraints (that is they must be
satisfied in the design). Hence, in the case of infeasible constraints we seek an assignable covariance
which approximates as close as possible the desired, but unachievable, performance constraints. In
mathematical terms, we look to solve the following optimization problem: Find a matrix Xp to
solve the minimization problem
minimize dist(Xp, V ∩ O ∩ C)   subject to   Xp ∈ A ∩ Pε     (10.30)

where V = V1 ∩ V2 ∩ · · · ∩ Vny and O = O1 ∩ O2 ∩ · · · ∩ Ok. The function dist(Xp, V ∩ O ∩ C) is
the distance of the matrix Xp from the intersection of the sets V, O and C.
When we obtain a matrix Xp that solves the covariance design problem then all controllers
which assign this covariance to the closed-loop system, can be obtained using the covariance control
parametrizations of Chapter 5 or Chapter 6.

10.3 Projections for Covariance Control


In this section we develop the analytic expressions for the projection operators onto the constraint
sets of the covariance design problem formulated in the previous section. Using these expressions
we will be able to compute numerical solutions for the covariance design problem, utilizing the
alternating convex projection techniques of Section 10.1. We start with the assignability constraint
set A.

10.3.1 Projection onto the Assignability Set

Recall that the assignability set A is a plane (or an affine space) in the space of symmetric ma-
trices S. Hence, we can derive an expression for the orthogonal projection onto this set using the
Projection Theorem. Given any matrix X0 in S, the orthogonal projection PA can be calculated
using the following result [44].

Theorem 10.3.1 The projection PA X0 of the matrix X0 onto the assignability constraint set A
is given by

PA X0 = vec⁻¹{vec(X0) − ∆(K∆)⁺ vec(Ep Qp Ep)}     (10.31)

where Ep = I − Bp Bp⁺, K = Ep ⊗ (Ep Ap) + (Ep Ap) ⊗ Ep, Qp = Ap X0 + X0 Apᵀ + W, and ∆ is
the np² × np(np + 1)/2 matrix whose columns form an orthonormal basis for the vectorized
symmetric matrices vec(S).

Recall that the Kronecker product M ⊗ N of two matrices M ∈ R^{m×m} and N ∈ R^{k×k} is the
mk × mk matrix L = [Mij N]. Note that the generalized inverse (K∆)⁺ is independent of X0;
hence this pseudoinverse must be calculated only once in the alternating projection algorithm.
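This block convention is exactly what numpy's kron implements, so the Kronecker structure used in K above can be inspected numerically (a small illustration of the definition, not part of the text):

```python
import numpy as np

M = np.array([[1.0, 2.0],
              [3.0, 4.0]])
N = np.eye(2)

K = np.kron(M, N)   # the mk x mk block matrix whose (i, j) block is M[i, j] * N
```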

10.3.2 Projection onto the Positivity Set


The orthogonal projection of a matrix X0 onto the positivity set Pε, defined by (10.11) or (10.15),
can be easily computed using the following result [50].

Theorem 10.3.2 Let X0 be a given matrix in S, and let X0 − P = U L Uᵀ be the eigenvalue-
eigenvector decomposition of X0 − P, where L is a diagonal matrix of eigenvalues and U is the
orthogonal matrix of eigenvectors. The projection PPε X0 of the matrix X0 onto the constraint set
Pε is given by

PPε X0 = U L+ Uᵀ + P     (10.32)

where L+ is the diagonal matrix obtained by replacing the negative eigenvalues in L by zero.

Hence, the numerical computation of this projection requires an eigenvalue-eigenvector decomposition of a symmetric matrix.
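Expression (10.32) translates directly into a few lines of numpy (a sketch; the function name and the test data are ours):

```python
import numpy as np

def proj_positivity(X0, P):
    """Projection onto {X : X >= P} per (10.32): clip the negative eigenvalues
    of X0 - P to zero, then shift back by P."""
    w, U = np.linalg.eigh(X0 - P)
    L_plus = np.diag(np.maximum(w, 0.0))
    return U @ L_plus @ U.T + P

X0 = np.diag([0.0, 3.0])
P = np.eye(2)
X = proj_positivity(X0, P)   # here the result is diag(1, 3)
```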

10.3.3 Projection onto the Variance Constraint Set


The projection of a matrix X0 on the variance constraint set Vi is provided by the following result.

Theorem 10.3.3 Let X0 be a given matrix in S. The orthogonal projection PVi X0 is given by

PVi X0 = X0 + ((σi* − ciᵀ X0 ci)/‖ci‖⁴) ci ciᵀ     (10.33)

where σi* = min(σi, ciᵀ X0 ci).

The proof of the above result is left as an exercise to the reader.
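As a starting point for that exercise, (10.33) can be transcribed and checked numerically (a sketch; the function name and data are ours):

```python
import numpy as np

def proj_variance(X0, c, sigma):
    """Projection onto Vi = {X : c^T X c <= sigma}, per (10.33)."""
    val = float(c @ X0 @ c)
    sigma_star = min(sigma, val)
    return X0 + (sigma_star - val) / np.linalg.norm(c) ** 4 * np.outer(c, c)

c = np.array([1.0, 0.0])
X0 = np.array([[4.0, 1.0],
               [1.0, 2.0]])
X = proj_variance(X0, c, 1.0)   # pulls the (1,1) entry down to the bound 1.0
```

A quick consistency check: cᵀXc equals min(σ, cᵀX0c), so matrices already in the set are left unchanged.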

10.3.4 Projection onto the Block Covariance Constraint Set


The following result provides the expression for the projection onto the set of the ith block output
covariance constraint Oi.

Theorem 10.3.4 Let X0 be a given matrix in S. Consider the singular value decomposition

Ciᵀ = Ui [Σi 0] Viᵀ     (10.34)

and define

X̄pi ≜ Viᵀ X0 Vi = [X̄pi11 X̄pi12; X̄pi12ᵀ X̄pi22],   X̄pi11 ∈ R^{nyi × nyi}.     (10.35)

Consider the eigenvalue-eigenvector decomposition

X̄pi11 − Σi⁻¹ Uiᵀ Yi Ui Σi⁻¹ = Wi Λi Wiᵀ     (10.36)

where Λi is a diagonal matrix which contains the eigenvalues of the matrix X̄pi11 − Σi⁻¹ Uiᵀ Yi Ui Σi⁻¹
and Wi is the orthogonal matrix of eigenvectors. The orthogonal projection POi X0 is given by

POi X0 = Vi [X̄pi11* X̄pi12; X̄pi12ᵀ X̄pi22] Viᵀ     (10.37)

where

X̄pi11* ≜ Wi Λi− Wiᵀ + Σi⁻¹ Uiᵀ Yi Ui Σi⁻¹     (10.38)

and Λi− is the diagonal matrix obtained by replacing the positive eigenvalues in Λi by zero.

Proof. Let

X̂pi = [X̂pi11 X̂pi12; X̂pi12ᵀ X̂pi22],   X̂pi11 ∈ R^{nyi × nyi}

be any element of Oi, and write X*pi = POi X0 for the candidate projection (10.37). Consider the
inner product

tr[(X*pi − X0)(X*pi − X̂pi)].

Since Vi is an orthogonal matrix, this inner product is equal to

tr[(Viᵀ X*pi Vi − Viᵀ X0 Vi)(Viᵀ X*pi Vi − Viᵀ X̂pi Vi)].     (10.39)

Define

Viᵀ X̂pi Vi = [X̃pi11 X̃pi12; X̃pi12ᵀ X̃pi22]

and note that (10.37) implies that

Viᵀ X*pi Vi = [X̄pi11* X̄pi12; X̄pi12ᵀ X̄pi22].

Hence, using (10.35), the inner product (10.39) is equal to

tr( ([X̄pi11* X̄pi12; X̄pi12ᵀ X̄pi22] − [X̄pi11 X̄pi12; X̄pi12ᵀ X̄pi22])
    ([X̄pi11* X̄pi12; X̄pi12ᵀ X̄pi22] − [X̃pi11 X̃pi12; X̃pi12ᵀ X̃pi22]) )
= tr[(X̄pi11* − X̄pi11)(X̄pi11* − X̃pi11)].     (10.40)

Now observe that since X̂pi ∈ Oi we have

Ciᵀ X̂pi Ci ≤ Yi

and by substituting the singular value decomposition (10.34) into this inequality and pre- and post-
multiplying by Σi⁻¹ Uiᵀ and Ui Σi⁻¹ we obtain

[I 0] Viᵀ X̂pi Vi [I 0]ᵀ ≤ Σi⁻¹ Uiᵀ Yi Ui Σi⁻¹

or

X̃pi11 ≤ Σi⁻¹ Uiᵀ Yi Ui Σi⁻¹.

Hence, X̃pi11 is an element of the set

{X̃pi11 : X̃pi11 ≤ Σi⁻¹ Uiᵀ Yi Ui Σi⁻¹}.

By the same argument as in Theorem 10.3.2 (expression (10.32)), the matrix X̄pi11* defined in
(10.38) is the orthogonal projection of X̄pi11 onto this set. The minimum-distance characterization
of this projection therefore implies that the inner product (10.40) is nonpositive. Hence the inner
product (10.39) is nonpositive, and this completes the proof. □
This projection requires an eigenvalue-eigenvector decomposition and matrix manipulation to
compute the projection matrix.
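Putting (10.34)-(10.38) together, the block projection can be sketched with one SVD and one symmetric eigendecomposition (a hedged transcription: variable names and test data are ours, and C is assumed to have full column rank):

```python
import numpy as np

def proj_block_cov(X0, C, Y):
    """Projection onto {X : C^T X C <= Y}, following (10.34)-(10.38)."""
    m = C.shape[1]
    U, s, Vt = np.linalg.svd(C.T)            # C^T = U [S 0] V^T
    Sinv = np.diag(1.0 / s)
    M = Sinv @ U.T @ Y @ U @ Sinv            # S^{-1} U^T Y U S^{-1}
    Xbar = Vt @ X0 @ Vt.T                    # V^T X0 V
    w, W = np.linalg.eigh(Xbar[:m, :m] - M)  # eigendecomposition (10.36)
    Lam_minus = np.diag(np.minimum(w, 0.0))  # replace positive eigenvalues by zero
    Xbar[:m, :m] = W @ Lam_minus @ W.T + M   # the corrected block (10.38)
    return Vt.T @ Xbar @ Vt                  # reassemble as in (10.37)

C = np.array([[1.0], [0.0]])                 # constrains the (1,1) entry of X
Y = np.array([[1.0]])
X0 = np.array([[4.0, 1.0],
               [1.0, 2.0]])
X = proj_block_cov(X0, C, Y)                 # the (1,1) entry drops to the bound 1.0
```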

10.3.5 Projection onto the Output Cost Constraint Set


Next, the projection onto the output cost constraint set C is provided. This result is a special case
of Theorem 10.3.3.

Theorem 10.3.5 Let X0 be a given matrix in S and γ > 0. The projection PC X0 is given by

PC X0 = X0 + ((γ* − tr(X0 R))/‖R‖²) R     (10.41)

where R = Cᵀ Q C and γ* = min[γ, tr(X0 R)].

The above expressions for the projections can be used to solve the Covariance Feasibility Problem
(10.27), the Covariance Optimization Problem (10.28)-(10.29) or the Infeasible Covariance Opti-
mization Problem (10.30), using the Alternating Convex Projection methods described in Section
10.1. An application of alternating projection techniques to the controller redesign problem for the
Hubble Space Telescope can be found in [163]. An application to structural control appears in [74].

Exercise 10.3.1 Consider the state space model of a double integrator:

ẋ1 = x2
ẋ2 = u + w
y = x1

where w(t) is white noise with unit intensity. We seek to find a static state feedback control law to
stabilize the system and satisfy the output variance constraint

E[y²] ≤ 0.1

Show that the assignable covariance matrix X should be in the intersection of the following con-
straint sets

A = {X ∈ S : X12 = 0}
P = {X ∈ S : X11 ≥ 0, X22 ≥ 0, X12² ≤ X11 X22}
V = {X ∈ S : X11 ≤ 0.1}

Draw a sketch of these sets in the (X11, X22, X12) space. Show that the standard alternating
projection method will converge to a feasible solution in at most 3 steps. Show that all covariance
controllers are parametrized as

G = [ −X22/X11   −1/(2X22) ]

10.4 Geometric Formulation of LMI Control Design


In this section we reformulate the LMI control design problems described in Sections 8.2.3 and 9.2.3
as matrix feasibility problems of a simpler geometry. This allows us to explicitly derive the ex-
pressions for the orthogonal projection operators onto the constraint sets and to use alternating
projection techniques for their solution. For details see [45].
To start our discussion, we consider the following set of real symmetric matrices

L = {X ∈ S : EXF + Fᵀ X Eᵀ + Q < 0}     (10.42)

where the matrices E, F and Q ∈ S are given real matrices of compatible dimensions. It can be
easily verified that L is a convex subset of the space of symmetric matrices. Note that each one
of the LMI constraints (7.15) and (7.16) can be written as in (10.42) by appropriate choices of the
matrices E, F and Q. For example, the LMI (7.15) obtains the form in (10.42) by defining

E ≜ [Bp; By]⊥ [Ap; Cp],   F ≜ [I 0] ([Bp; By]⊥)ᵀ,
Q ≜ [Bp; By]⊥ [Dp Dpᵀ, Dp Dyᵀ; Dy Dpᵀ, Dy Dyᵀ − γ²I] ([Bp; By]⊥)ᵀ     (10.43)

where [M; N] denotes vertical and [M, N] horizontal concatenation.
For computational purposes, we prefer to have closed sets (i.e., sets which include their limit points),
so we consider the following closed "ε-approximation" of the set L

Lε = {X ∈ S : EXF + Fᵀ X Eᵀ ≤ −Q − εI}     (10.44)

where ε is a small positive constant. Hence, the set Lε is closed, convex, contained in L, and
approaches L as ε → 0. We will use Lε to represent (in an ε-approximation sense) any of the
constraint sets defined by LMIs, by an appropriate choice of the matrices E, F, Q and X.

The next result provides a decomposition of the set Lε into two closed, convex sets of simpler
geometry by increasing the dimension of the parameter space.

Theorem 10.4.1 Define the following sets

Jε = {W ∈ S2n : [E Fᵀ] W [E Fᵀ]ᵀ ≤ −Qε}     (10.45)

T ≜ {W ∈ S2n : W = [W11 W12; W12ᵀ W22], W11 = W22 = 0, W12 ∈ Sn}     (10.46)

where Qε = Q + εI. Then the following statements are equivalent.

(i) X ∈ Lε

(ii) X = W12 where W ∈ Jε ∩ T

Proof. Let X be in Lε. Then X satisfies

EXF + Fᵀ X Eᵀ ≤ −Q − εI     (10.47)

which, for W = [0 X; X 0] ∈ T, is equivalent to

[E Fᵀ] W [E Fᵀ]ᵀ ≤ −Qε.     (10.48)

This provides condition (ii). Conversely, if X satisfies (ii) then simple calculations reveal that
(10.47) holds, hence X is in Lε. □
Therefore the two closed convex sets Jε and T can be used to provide an equivalent description
of the LMI constraint set Lε. The advantage of describing the set Lε in terms of the intersection
Jε ∩ T follows from the fact that we can obtain explicit expressions for the orthogonal projection
operators onto the sets Jε and T. These expressions are computed next.
The following proposition provides the orthogonal projection onto the constraint set Jε.
Theorem 10.4.2 Let W ∈ S2n. Consider the singular value decomposition of [E Fᵀ]

[E Fᵀ] = U [Σ 0] Vᵀ     (10.49)

where U and V are orthogonal matrices, and define

W̄ ≜ Vᵀ W V = [W̄11 W̄12; W̄12ᵀ W̄22],   W̄11 ∈ Sn.     (10.50)

Consider the eigenvalue-eigenvector decomposition

W̄11 + Σ⁻¹ Uᵀ Qε U Σ⁻¹ = L Λ Lᵀ     (10.51)

where Λ is a diagonal matrix that contains the eigenvalues of the matrix W̄11 + Σ⁻¹ Uᵀ Qε U Σ⁻¹
and L is the corresponding orthogonal matrix of the normalized eigenvectors. The projection W* =
PJε W of the matrix W onto the set Jε is given by

W* = V [W̄11* W̄12; W̄12ᵀ W̄22] Vᵀ     (10.52)

where

W̄11* = L Λ− Lᵀ − Σ⁻¹ Uᵀ Qε U Σ⁻¹     (10.53)

where Λ− is the diagonal matrix obtained by replacing the positive eigenvalues of Λ by zero.

The proof of this theorem is similar to the proof of Theorem 10.3.4. The following result provides
the projection onto the set T.

Theorem 10.4.3 Let W ∈ S2n. The orthogonal projection W* = PT W of the matrix W onto
the set T is provided by

W* = [0 X*; X* 0]     (10.54)

where X* = (W12 + W12ᵀ)/2.

Proof. It is clear that T is a subspace of S2n and W* is in T. Now let

W = [W11 W12; W12ᵀ W22]     (10.55)

be in S2n and let

Ŵ = [0 X̂; X̂ 0]     (10.56)

be any element of T. Then simple calculations reveal that

⟨W − W*, Ŵ − W*⟩
= tr( [W11, (W12 − W12ᵀ)/2; (W12ᵀ − W12)/2, W22]
      [0, X̂ − (W12 + W12ᵀ)/2; X̂ − (W12 + W12ᵀ)/2, 0] ) = 0     (10.57)

Hence, W* is the orthogonal projection of W onto T. □
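In practice the projection (10.54) amounts to zeroing the diagonal blocks and symmetrizing the off-diagonal block (a sketch; the names are ours):

```python
import numpy as np

def proj_T(W):
    """Projection onto T per (10.54): keep only the symmetrized off-diagonal block."""
    n = W.shape[0] // 2
    X_star = (W[:n, n:] + W[:n, n:].T) / 2.0
    Z = np.zeros_like(W)
    Z[:n, n:] = X_star
    Z[n:, :n] = X_star
    return Z

W = np.arange(16.0).reshape(4, 4)
W = (W + W.T) / 2.0                # a symmetric test matrix in S_4 (n = 2)
W_star = proj_T(W)
```

Since T is a subspace, applying the projection twice changes nothing, which is a quick sanity check.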


In addition to the LMI constraint sets, we seek explicit expressions for the orthogonal projections
onto the positivity and the rank constraint sets corresponding to the conditions (7.14). To this
end, define the following sets in S2n

D ≜ {Z ∈ S2n : Z = [X 0; 0 Y], X, Y ∈ Sn}     (10.58)

P = {Z ∈ S2n : Z ≥ −J}     (10.59)

R = {Z ∈ S2n : rank(Z + J) ≤ k}     (10.60)

where k is a given integer such that n ≤ k ≤ 2n, and

J = [0 In; In 0] ∈ S2n.     (10.61)
Notice that the sets D and P are closed convex sets. The expressions for the orthogonal projections
onto these sets are provided next.

Theorem 10.4.4 Let

Z = [Z11 Z12; Z12ᵀ Z22] ∈ S2n.     (10.62)

The orthogonal projection Z* = PD Z of Z onto the set D is given by

Z* = [Z11 0; 0 Z22] ∈ S2n.     (10.63)

The proof of the above result is simple and is left to the reader. The orthogonal projection
onto the set P is provided by the following result, which follows from [50].

Theorem 10.4.5 Let Z ∈ S2n and let Z + J = L Λ Lᵀ be the eigenvalue-eigenvector decomposition
of Z + J, where Λ is the diagonal matrix of the eigenvalues and L is the orthogonal matrix of the
normalized eigenvectors. The orthogonal projection Z* = PP Z of Z onto the set P is given by

Z* = L Λ− Lᵀ − J     (10.64)

where Λ− is the diagonal matrix obtained by replacing the negative eigenvalues in Λ by zero.

Hence, this projection requires an eigenvalue-eigenvector decomposition of the 2n × 2n symmetric
matrix Z + J.
We note that the rank constraint set R, defined by (10.60), is a closed set, but it is not convex.
Therefore, given a matrix Z in S2n , there might be several matrices in R which minimize the
distance from Z. We will call any such matrix, a projection of Z on R. The following result (see
[51]) provides a projection onto the set R.

Theorem 10.4.6 Let Z ∈ S2n and let Z + J = U Σ Vᵀ be a singular value decomposition of Z + J.
A projection Z* = PR Z of Z onto the set R is given by

Z* = U Σk Vᵀ − J     (10.65)

where Σk is the diagonal matrix obtained by replacing the smallest 2n − k singular values of Z + J
by zero.
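A sketch of (10.65) via a truncated SVD (names and data are ours; since R is nonconvex this yields a projection, not necessarily a unique one):

```python
import numpy as np

def proj_rank(Z, k):
    """A projection onto R = {Z : rank(Z + J) <= k}, per (10.65)."""
    n = Z.shape[0] // 2
    J = np.block([[np.zeros((n, n)), np.eye(n)],
                  [np.eye(n), np.zeros((n, n))]])
    U, s, Vt = np.linalg.svd(Z + J)
    s[k:] = 0.0                     # drop the 2n - k smallest singular values
    return U @ np.diag(s) @ Vt - J

Z = np.diag([3.0, 0.5])             # n = 1, so J = [[0, 1], [1, 0]]
Z_star = proj_rank(Z, 1)            # now rank(Z_star + J) <= 1
```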

To summarize this section: the constraint sets of the linear matrix inequality control design
problems of Chapters 6 through 9 have been reformulated as matrix constraint sets of simpler
geometric structure. This allows analytical expressions for the projection operators onto these
sets, so that alternating projection algorithms can be used to provide numerical solutions.

10.5 Fixed-Order Control Design

For fixed-order control design, the linear matrix inequality problems involve a nonconvex set R.
We seek a solution to the feasibility problem (10.3) for the case where some of the sets Ci are
not convex. To extend our projection techniques to this case, we define the orthogonal projection
onto a nonconvex set Ci to be given by any solution of the minimum distance problem (10.4). We
notice that a given point might have several (possibly infinitely many) projections onto a nonconvex set Ci.
Unfortunately, when some of the sets Ci are nonconvex, the alternating projection algorithms of
Theorems 10.1.1 and 10.1.3 are not guaranteed to converge to a feasible solution, even when such a
solution exists. However, it can be shown that convergence is still guaranteed in a local sense, i.e.,
when the starting point of the algorithm is in a neighborhood of a feasible solution.
The following procedure is proposed to address the low-order control design problem:

Step 1 Solve the feasibility problem (10.3) that corresponds to a full-order controller nc = np . This
is a convex problem and convergence of the alternating projection algorithms to a feasible
solution (X, Y) is guaranteed.

Step 2 Consider the problem where the controller order is reduced by one, that is, set nc = nc − 1.
Using as the initial condition the solution of the previous step, solve the low-order controller
feasibility problem using the proposed algorithms.

Step 3 Return to step 2, until the controller order nc is the desired one, or the alternating projection
algorithm does not converge. In this last case, a different initial condition might be tried
for solving step 1, and the process can be repeated.

We conclude this section by summarizing the fact that alternating projection techniques provide
simple, easy-to-implement algorithms for the solution of feasibility problems. These algorithms
require expressions for the projection operators onto the constraint sets; the constraint sets must
therefore have a geometry simple enough that such expressions can be derived analytically. For
convex feasibility problems these algorithms converge globally; for nonconvex problems, however,
convergence is guaranteed only locally.

Example 10.5.1 In this example we use the alternating convex projection techniques for control
design for a 2-mass-spring system. We assume that the two bodies have equal mass m1 = m2 = 1
and they are connected by a spring with stiffness k = 1. We assume that only the position of body
2 is measured and a control force acts on body 1, that is, the problem is noncollocated.
A state-space representation of the system, for the noise-free case (w = 0), is the following:

ẋp = Ap xp + Bp u
z = Mp xp (10.66)
10.5. FIXED-ORDER CONTROL DESIGN 261

where

    Ap = [  0   0   1   0          Bp = [ 0          Mp = [ 0  1  0  0 ]              (10.67)
            0   0   0   1                 0
           −1   1   0   0                 1
            1  −1   0   0 ],              0 ],
and the state variables x1 and x3 are the position and velocity, respectively, of body 1, while x2 and
x4 are the position and velocity of body 2.
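For reference, the plant data (10.67) can be entered directly; its open-loop poles are 0, 0, ±j√2 (a rigid-body mode plus an undamped oscillation), so the plant is not asymptotically stable and feedback is indeed required. A minimal sketch:

```python
import numpy as np

# Two-mass-spring plant (10.67); states are [p1, p2, v1, v2], m1 = m2 = k = 1
Ap = np.array([[0., 0., 1., 0.],
               [0., 0., 0., 1.],
               [-1., 1., 0., 0.],
               [1., -1., 0., 0.]])
Bp = np.array([[0.], [0.], [1.], [0.]])   # control force on body 1
Mp = np.array([[0., 1., 0., 0.]])         # only the position of body 2 measured

# Open-loop poles: 0, 0, +-j*sqrt(2) -- not asymptotically stable
eigs = np.linalg.eigvals(Ap)
```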
We seek a stabilizing controller for this system, which places the closed-loop poles to the left of
a vertical line Re(z) = −α in the complex plane. To this end, according to the results of Chapter 6,
we seek a feasible solution to the following set of matrix inequalities

    Bp⊥ ((Ap + αI)Xp + Xp (Ap + αI)T ) Bp⊥T < 0                                   (10.68)

    MpT⊥ (Yp (Ap + αI) + (Ap + αI)T Yp ) MpT⊥T < 0                                (10.69)

    [ Xp  I  ]                      [ Xp  I  ]
    [ I   Yp ]  ≥ 0    and    rank [ I   Yp ]  ≤  np + nc                         (10.70)
where the matrix Ap + αI has been used to guarantee that the closed-loop poles are to the left of
Re(z) = −α. We set α = 0.2 and begin with the search for a full-order controller (nc = 4). The
feasibility problem (10.68)-(10.70) is a convex problem and our alternating projection methods are
guaranteed to converge to a feasible solution. Using the directional alternating projection algorithm
of Section 12.1.5 along with the orthogonal projection expressions developed in Section 12.4, we
obtain the following feasible solution
 
    Xp = [  2.4864   1.3467  −0.4973  −1.1690
            1.3467   3.4575   0.6304  −0.6915
           −0.4973   0.6304   2.9937  −0.6721
           −1.1690  −0.6915  −0.6721   2.3874 ]

    Yp = [  2.3874  −0.6721  −0.6915  −1.1690
           −0.6721   2.9937   0.6304  −0.4973
           −0.6915   0.6304   3.4575   1.3467
           −1.1690  −0.4973   1.3467   2.4864 ] .
It can be easily verified that these matrices satisfy the conditions (10.68)-(10.70). The directional
alternating projection algorithm needed 606 iterations to converge to this solution (this corresponds
to a CPU time of 152.66 sec on a Sun Sparc II Workstation). Using the results of Chapter 6, we
can obtain the following stabilizing controller which corresponds to this feasible set (Xp , Yp )
   
    ẋc = [ −0.5231   4.4631   7.5167    0.0000            [ −8.4938
            0.7067  −0.4077   1.0140    0.0000               0.8348
           −0.3783  −5.5355  −5.7237    0.0000               9.0487
            0.0000   0.0000   0.0000  −33.5333 ] xc   +      0.0000 ] z

    u = [ −0.2128  3.4045  4.4224  0.0000 ] xc − 6.2692 z .
262 CHAPTER 10. PROJECTION METHODS

This controller provides the following closed-loop poles:

    −0.2028 ± 0.4625i,   −0.2257 ± 1.1387i,   −0.2982 ± 1.6737i,   −5.2012,   −33.5333,

which satisfy the required decay-rate constraint: all poles lie to the left of Re(z) = −0.2.


In the following, we seek a second-order controller (nc = 2) to satisfy the same requirements. In
this case the rank condition in (10.70) becomes a genuine nonconvex constraint. The directional
alternating projection algorithm converged to the following feasible solution:
 
    Xp = [  2.7374   1.9351  −0.5475  −1.4226
            1.9351   4.3499   0.6486  −0.8700
           −0.5475   0.6486   1.5012  −0.2332
           −1.4226  −0.8700  −0.2332   2.7629 ]

    Yp = [  2.7629  −0.2332  −0.8700  −1.4226
           −0.2332   1.5012   0.6486  −0.5475
           −0.8700   0.6486   4.3499   1.9351
           −1.4226  −0.5475   1.9351   2.7374 ] .

It can be easily checked that these matrices satisfy the conditions (10.68)-(10.70) for nc = 2. In
this case the directional alternating projection method required 901 iterations (a corresponding
CPU time of 202.76 sec) to converge. A second-order controller which corresponds to the feasible
solution (Xp , Yp ) is
" # " #
−0.5667 0.6145 1.2966
ẋc = xc + z
−0.6954 −0.6373 1.3876
h i
u = −0.2271 −0.5598 xc + 1.1671y.

This controller provides the following closed-loop poles:

    −0.2006 ± 0.3778i,   −0.2007 ± 1.9358i,   −0.2007 ± 1.3343i,

which satisfy the desired pole regional constraint. Note that a second-order controller is the lowest
order stabilizing controller for the plant (10.66)-(10.67).
Next we consider the H∞ control design problem for the case where there is a plant disturbance
w2 acting on body 2 and a sensor measurement noise v; that is, the state space representation of
the system is

ẋp = Ap xp + Bp u + Dp wp
z = Mp xp + v
y = Cp xp (10.71)

where Ap , Bp , Mp are as before and


 
    Dp = [ 0
           0
           0
           1 ] ,     Cp = Mp ,     wp = w2 .

We seek a second-order stabilizing controller to guarantee that the H∞ norm of the closed-loop
transfer function from the disturbance w = [w2 v]T to the regulated output y = x2 is less than
γ = 8. To this end, we use the low-order alternating projection algorithm to solve the feasibility
problem (7.14)-(7.16). The directional alternating projection algorithm provides a feasible pair
(Xp , Yp ) in 2173 iterations (which required 745.96 sec of CPU time). A second-order controller
which corresponds to this feasible pair is
" # " #
−0.6113 0.4507 0.2624
ẋc = xc + z
−1.3614 −3328 0.3165
h i
u = −0.4880 −0.3328 xc + 0.3100 y

This controller provides a closed-loop H∞ norm equal to 4.96 < 8. Hence, the required disturbance
attenuation properties are satisfied.
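A γ bound of this kind can be sanity-checked on any closed-loop realization (A, B, C, D) by gridding the frequency response and evaluating sup over ω of σmax(C(jωI − A)⁻¹B + D). The sketch below is a toy (not the closed loop above): it checks G(s) = 1/(s + 1), whose H∞ norm is exactly 1.

```python
import numpy as np

def hinf_norm_grid(A, B, C, D, w_max=1e3, n=4000):
    # Grid estimate of sup_w sigma_max(C (jw I - A)^{-1} B + D); this is a
    # lower bound on the H-infinity norm, adequate as a sanity check on gamma.
    ws = np.concatenate(([0.0], np.logspace(-3, np.log10(w_max), n)))
    I = np.eye(A.shape[0])
    return max(np.linalg.svd(C @ np.linalg.solve(1j * w * I - A, B) + D,
                             compute_uv=False)[0] for w in ws)

# Toy check: G(s) = 1/(s + 1) has H-infinity norm 1, attained at w = 0
A = np.array([[-1.0]]); B = np.array([[1.0]])
C = np.array([[1.0]]);  D = np.array([[0.0]])
assert abs(hinf_norm_grid(A, B, C, D) - 1.0) < 1e-6
```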

Chapter 10 Closure
Alternating projection algorithms are proposed to solve (i) covariance control design problems
and (ii) linear matrix inequality control design problems. These problems are formulated as matrix
feasibility problems of finding matrix parameters in the intersection of a family of matrix con-
straint sets. Analytical expressions for the orthogonal projections onto these constraint sets have
been developed. Alternating projection methods utilize these projections in an iterative fashion to
obtain matrix parameters that satisfy the design constraints. The full-order covariance control and
linear matrix inequality control problems are described via convex matrix equality and inequality
constraints and convergence of the alternating projection algorithms is guaranteed when a solution
exists.
Fixed-order control design problems have also been considered. Fixed-order design adds a matrix
rank constraint to the matrix feasibility problem. Projections onto this rank constraint set can be
easily obtained, and alternating projection methods can still be used for a solution, but global
convergence of the algorithms is not guaranteed in this case because of the nonconvexity of the
rank constraint.
Algorithms are given here to find the intersection of convex sets. The method is called Al-
ternating Projection since it alternately projects from a given point to the surface of a convex

set and repeats this process, projecting onto the next set, etc. The computational advantage of
this process is that for the problems of interest in this book, it is possible to derive analytical
solutions to the orthogonal projection, avoiding the need that most convex problem solvers have
of numerically solving a least squares problem to compute the minimum distance to the desired
set. Geometrically speaking, the orthogonal projection onto one of the member sets is not the best
direction (the intersection sought may be in another direction). While gradient calculations might
give better search directions, they provide no clue about the step size. Alternating
projection methods know exactly how far to go, hence a convergence proof is possible.
Analytical expressions for the projections onto “rank constrained” nonconvex sets are also given.
This is useful in solving fixed-order control problems and is, in fact, the most important use
of the Alternating Projection method.
Chapter 11

Successive Centering Methods

11.1 Control Design with Unspecified Controller Order


This section addresses computational aspects of our approach to linear control design based on
the unified LMI formulation discussed in Chapter 9. Recall that our unified approach to control
system design yields analytical solutions which are of the same mathematical nature. Specifically,
many of the control problems considered in the earlier chapters can be formulated as a convex
feasibility/optimization problem if the controller order is not fixed. The objective of this section is
to give a computational algorithm to solve this convex problem.

11.1.1 Problem Formulation


The results given in Chapters 6 to 9 show that many suboptimal control problems¹ can be reduced
to the following type of convex problem:

The Suboptimal Control Problem: Find a matrix pair (Xp ,Yp )∈ C where C is a
convex set defined by LMIs as follows;

C = { (X, Y) : Φ(X) > 0, Ω(Y) > 0, Ψ(X, Y) ≥ 0 }

where Φ(·) and Ω(·) are affine mappings on the set of real symmetric matrices, and
" #
∆ X I
Ψ(X, Y) = . (11.1)
I Y

For instance, it can be easily verified using Theorems 9.2.3 and 2.3.11 that there exists a controller
of some order which solves the discrete-time LQR problem described in Section 9.2.3 if and only if
C ≠ φ where

    Φ(X)  =∆  [ Bp ]⊥ [ X − Ap XApT    −Ap XCpT    ] [ Bp ]⊥T
              [ By ]  [ −Cp XApT       I − Cp XCpT ] [ By ]    ,               (11.2)
¹By “suboptimal” we mean that we do not optimize a performance measure but merely guarantee an upper bound
on it.

" #
∆ MTp ⊥ (Y − ATp YAp − CTp Cp )MTp ⊥T 0
Ω(Y) = .
0 γI − Dp YDp − DTy Dy
T

In fact, this existence condition can be directly derived by dualizing Theorem 6.3.3 since the
covariance control problem and the LQR problem are dual to each other as shown in Chapter 9.
Once we find a matrix pair (Xp , Yp ) ∈ C, a Lyapunov matrix Y ∈ R(np +nc )×(np +nc ) in Theorem
9.2.3 can be constructed by taking any matrix factor of Yp − Xp⁻¹ to obtain Ypc and Yc as follows:

    Y  =  [ Yp    Ypc
            YpcT  Yc  ] ,

where

    Ypc Yc⁻¹ YpcT  =∆  Yp − Xp⁻¹ ,    Yc > 0.

Then all feasible controllers can be given by the explicit formula in Theorem 2.3.11.
It should be clear from the unified LMI formulations that all the control problems in Chapter 9
can be solved by the same design procedure. Namely, we first compute a pair (Xp , Yp ) ∈ C, find
a Lyapunov matrix X or Y, then obtain a controller by the explicit formulas. Note that, since
all feasible controllers G are given explicitly, we do not require an iterative computation for the
last step of the control design process. The only iterations we need are to find the matrix pair
(Xp ,Yp )∈ C.
In view of the above example for the LQR problem, the set C depends on the performance bound
γ in general. So let us express the dependence explicitly by C(γ). As shown above, suboptimal
control problems naturally lead to the convex feasibility problem of finding a matrix pair (Xp ,Yp )∈
C(γ) for a given value of γ > 0. On the other hand, optimal control problems can be formulated in
the following way.

The Optimal Control Problem: Solve²

    γ∗ = min{ γ : C(γ) ≠ φ }

where

    C(γ) = { (X, Y) : Φ(X, γ) > 0, Ω(Y, γ) > 0, Ψ(X, Y) ≥ 0 }

where Φ(·, ·) and Ω(·, ·) are affine with respect to their first arguments, and Ψ(·, ·) is
defined in (11.1).

For the LQR problem, only Ω(·, ·) depends on γ and Φ(·, ·) does not. However, in general, both
of these may depend on γ as for the H∞ control problem. Note also that, according to Theorems
7.2.3 and 7.3.3, the LMI corresponding to Ψ(X, Y) ≥ 0 also depends on γ as follows;
" #
X γI
≥ 0.
γI Y
²Strictly speaking, the “min” should be replaced by “inf” since C(γ) is an open set and the optimal solution may
lie on the boundary of C(γ). However, we prefer the use of “min” since we are interested in finding a matrix pair
(Xp , Yp ) ∈ C(γ) where γ is arbitrarily close to the infimum γ∗ . Note that γ∗ may not be attained.

However, a simple change of variables X̂ =∆ γ⁻¹X, Ŷ =∆ γ⁻¹Y leads to the constraint Ψ(X̂, Ŷ) ≥ 0,
and thus the Optimal Control Problem stated above can also treat this class of problems. For the
covariance upper bound control problem, we may fix the structure of the output covariance bound
as in CXCT < γ0, then minimize γ.
It should be noted that the Suboptimal Control Problem for a fixed performance level γ can also
be formulated as an optimization problem of the form given above. For instance, we can minimize
α over (X,Y) subject to

Φ(X) + αI > 0, Ω(Y) + αI > 0, Ψ(X, Y) ≥ 0.

In this case, the Suboptimal Control Problem is feasible if and only if the minimum value of α is
such that α∗ ≤ 0.
The Optimal Control Problem can equivalently be written

(X∗p , Yp∗ , γ ∗ ) = arg min{γ : Φ(X, γ) > 0, Ω(Y, γ) > 0, Ψ(X, Y) ≥ 0}


where ∗ denotes the optimal solution (not the complex conjugate transpose of a matrix). In this
case, γ ∗ is the optimal value of the performance measure, and the pair (X∗p ,Yp∗ ) will be used to
construct an optimal Lyapunov matrix in the next control design step. In many cases, the feasible
domain of the above problem, i.e., the set

D = {(X, Y, γ) : (X, Y) ∈ C(γ) }

is convex and hence a globally optimal solution can be found.


If we want to fix the controller order to be nc , then the following additional nonconvex constraint
on (Xp ,Yp )∈ C(γ) is necessary (and sufficient):

    nc = rank(Xp − Yp⁻¹ ),

or equivalently (see the proof of Theorem 7.2.3),

    np + nc  =  rank [ Xp  I
                       I   Yp ] .
This additional rank constraint destroys the convexity of the problem and makes it much harder
to solve. Note that, if we do not fix the controller order a priori, then the computational problem
is convex and the resulting controller can always be chosen to be of order equal to or less than the
plant order since the rank of matrix Xp − Yp−1 can never exceed np due to its dimension constraint.
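The rank relation can be illustrated numerically: for X > 0, Y > 0 with [X I; I Y] ≥ 0, the rank of the block matrix equals np plus the rank of X − Y⁻¹. A toy 3 × 3 sketch (random matrices, not from a control problem; rank is computed from the singular values):

```python
import numpy as np

rng = np.random.default_rng(0)
n_p = 3
M = rng.standard_normal((n_p, n_p))
Y = M @ M.T + n_p * np.eye(n_p)          # a positive definite Y
w = rng.standard_normal((n_p, 1))
X = np.linalg.inv(Y) + w @ w.T           # X - Y^{-1} is PSD of rank 1 ("nc = 1")

Psi = np.block([[X, np.eye(n_p)], [np.eye(n_p), Y]])

def num_rank(A, tol=1e-9):
    # Numerical rank from singular values (relative threshold)
    s = np.linalg.svd(A, compute_uv=False)
    return int((s > tol * s[0]).sum())

assert num_rank(X - np.linalg.inv(Y)) == 1
assert num_rank(Psi) == n_p + 1          # rank[X I; I Y] = np + rank(X - Y^{-1})
```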
Note that for the static output feedback case (nc = 0), the above rank condition and Ψ(Xp ,Yp )≥ 0
reduce to Xp = Yp−1 > 0. For the general fixed-order control problem (nc > 0), we have an
alternative formulation leading to the conditions which have exactly the same structure as those
for the static output feedback case (e.g. Theorems 6.3.3 and 7.2.3) where the Lyapunov matrices
X and Y are constrained by X = Y−1 > 0. We shall address this nonconvex problem in later
sections. The following two sections present an algorithm to solve the Optimal Control Problem
stated above.

11.1.2 Analytic Center


This section summarizes preliminary materials for the algorithm given in the next section. Specifi-
cally, we shall define the analytic center of an LMI and provide necessary steps to compute it. The
concept of the analytic center is very simple, and in fact, is essential to all the algorithms given in
this chapter.
Consider an LMI
                          m
    F(x)  =∆  F0  +      Σ   xi Fi  >  0
                         i=1

where Fi = FiT ∈ Rn×n for i = 0, 1, · · · , m, and the corresponding feasible set F:

    F = { x : F(x) > 0 },

where x = (x1 · · · xm )T . A barrier function φ : F → R for the set F can be given by

    φ(x) = log det F(x)⁻¹ .


Note that φ(x) approaches infinity when x approaches the boundary of F (where F(x) becomes
singular). The function is strictly convex on F. Hence, if F is nonempty and bounded, then it has
a unique minimizer x∗ ;
x∗ = arg min{φ(x) : x ∈ F},

or equivalently,
x∗ = arg max{ det F(x) : x ∈ F }.

We call x∗ the analytic center of the LMI F(x) > 0. Intuitively, the analytic center is the “most
feasible point” of the set F in the sense that x∗ is the point where the “distance from the boundary
of F” (det F(x)) is maximum.
Given an initial feasible point x0 ∈ F, the analytic center can be computed by Newton’s method:

    xk+1 = xk − ζk H(xk )⁻¹ g(xk )

where g(xk ) and H(xk ) are the gradient and Hessian of φ at xk ∈ F, and can be computed by

    gi (x) = −tr F(x)⁻¹ Fi ,

    Hij (x) = tr F(x)⁻¹ Fi F(x)⁻¹ Fj ,

and ζk is the damping factor of the kth iteration:

    ζk  =∆  { 1                  if δ(xk ) ≤ 1/4
            { 1/(1 + δ(xk ))     if δ(xk ) > 1/4

where

    δ(xk )  =  √( g(xk )T H(xk )⁻¹ g(xk ) ) .
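These formulas translate directly into a damped Newton iteration. A minimal sketch, applied to the toy LMI F(x) = diag(1 − x, 1 + x) > 0, whose feasible set is (−1, 1) and whose analytic center is x* = 0:

```python
import numpy as np

def analytic_center(F0, Fs, x0, iters=60):
    # Damped Newton method for the barrier phi(x) = log det F(x)^{-1}, using
    # g_i = -tr(F^{-1} F_i),  H_ij = tr(F^{-1} F_i F^{-1} F_j), and the
    # damping factor zeta = 1 if delta <= 1/4, else 1/(1 + delta).
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        F = F0 + sum(xi * Fi for xi, Fi in zip(x, Fs))
        Finv = np.linalg.inv(F)
        g = np.array([-np.trace(Finv @ Fi) for Fi in Fs])
        H = np.array([[np.trace(Finv @ Fi @ Finv @ Fj) for Fj in Fs]
                      for Fi in Fs])
        step = np.linalg.solve(H, g)
        delta = np.sqrt(g @ step)          # Newton decrement delta(x_k)
        zeta = 1.0 if delta <= 0.25 else 1.0 / (1.0 + delta)
        x = x - zeta * step
    return x

# Toy LMI: F(x) = diag(1 - x, 1 + x) > 0; the analytic center is x* = 0
F0 = np.eye(2)
F1 = np.diag([-1.0, 1.0])
xc = analytic_center(F0, [F1], [0.9])
assert abs(xc[0]) < 1e-8
```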

11.1.3 The Method of Centers


This section provides a comprehensive summary of the “Method of Centers” for solving a class of
(quasi)convex minimization problems involving Linear Matrix Inequalities (LMIs).
The optimization problem, for which an algorithm based on the notion of the analytic center
will be given, is the following:

    γ∗ = min{ γ : C(γ) ≠ φ }

where

C(γ) = { x : F(x, γ) > 0 },

and F(·, ·) is a matrix-valued function which maps Rn × R into Psm×m , where Ps denotes the set
of real symmetric matrices. Note that the Optimal Control Problem stated in Section 11.1.1 falls
into this type of problem where
 
    F(x, γ)  =∆  [ Φ(X, γ)    0          0
                   0          Ω(Y, γ)    0
                   0          0          Ψ(X, Y) ]

where the vector x ∈ Rn (n = np (np + 1)) consists of the np (np + 1)/2 independent elements of
each of the symmetric matrices X and Y. Notice that this formulation enforces the constraint Ψ(X, Y) > 0, which is stronger than
the constraint Ψ(X, Y) ≥ 0 of the original problem. However, this difference is immaterial since we
can always come as close to the boundary of Ψ(X, Y) ≥ 0 as desired while satisfying Ψ(X, Y) > 0.
We shall consider the class of problems for which F(·, ·) has the following properties;

(a) C(γ) ≠ φ for sufficiently large γ > 0, and

        C(γ0 ) ≠ φ  ⇒  C(γ) ≠ φ,  ∀ γ ≥ γ0 .

(b) C(γ) = φ for γ < 0.

(c) C(γ) is bounded for each γ such that C(γ) 6= φ, i.e., there exists a scalar σ(γ) such that

kxk ≤ σ(γ), ∀ x ∈ C(γ).

(d) F(·, ·) is affine with respect to the first argument, and hence it can be written in the following
    form:
                               n
        F(x, γ) = F0 (γ) +    Σ   xi Fi (γ)
                              i=1

    where Fi (γ) = Fi (γ)T for all γ > 0 and i = 0, 1, . . . , n.

Items (a) and (b) basically guarantee that the optimization problem is well-posed, i.e., the feasible
domain is nonempty and the value of the objective function is bounded below in the feasible
domain. Items (c) and (d) imply that C(γ) is a bounded convex set whenever it is nonempty. Thus

the existence of the analytic center of F(x, γ) > 0 for each fixed (large enough) γ is guaranteed. In
the context of control design, the first statement of item (a) corresponds to the requirement that
the plant be stabilizable (by a possibly dynamic output feedback controller of some order)³ since,
for sufficiently large γ, the performance requirement is effectively removed, leaving the closed-loop
stability as the only control design specification. The meaning of the second statement of item
(a) becomes clear once we recall that, for a given performance bound γ, there exists a controller
which guarantees the performance level γ if and only if C(γ) is nonempty. Clearly, if there exists
a controller with performance γ0 , then such a controller also guarantees any worse performance level γ ≥ γ0 .
Item (b) is a reasonable assumption since we usually define a performance measure such that
zero is the best possible performance. Thus, there should not exist a controller with a negative
performance level.
We are now ready to state the algorithm based on the notion of the analytic center for solving
the above LMI optimization problem.

The Algorithm:

1. Initialize γ0 and x0 such that F(x0 , γ0 ) > 0 and let k = 0. Choose a reduction factor 0 < θ < 1
and an error tolerance ε > 0.

2. Update γk by

γk+1 = (1 − θ)f (xk ) + θγk
where f (x) is the smallest value of γ such that F(x, γ) ≥ 0;

f (x) = inf { γ : F(x, γ) > 0 }.

3. Let xk+1 be the analytic center of F(x, γk+1 ) > 0;

       xk+1 = arg max { det F(x, γk+1 ) : F(x, γk+1 ) > 0 }.

4. If γk − f (xk ) ≤ ε, then stop. Otherwise, let k ← k + 1 and go to 2.

The idea of the above algorithm is to update x so that a better (strictly smaller) value of γ can
be found at each iteration, which leads to a strictly decreasing sequence of γk . One way to update
x to achieve this purpose is to let xk be the analytic center of F(x, γk ) > 0. Since this inequality
is strict (i.e., the pair (xk ,γk ) is an interior point of the closed set defined by F(x, γ) ≥ 0), small
perturbation in γk does not push the point to the boundary of the set. Therefore, we can always
find a better γ. Intuitively, this improvement for the value of γk+1 over γk will be large if the point
(xk ,γk ) is “deep inside” the set. Thus, it makes sense to choose xk to be the analytic center.
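The algorithm can be exercised on a one-variable toy problem: minimize γ subject to F(x, γ) = diag(γ − (x − 1), γ + (x − 1), 3 − x, 3 + x) > 0, i.e. γ > |x − 1| with x ∈ (−3, 3), whose optimum is γ* = 0 at x = 1. In this sketch f(x) is available in closed form and the analytic center is found by a simple ternary search (log det F is concave in x); this is illustrative only.

```python
import numpy as np

theta = 0.5   # reduction factor, 0 < theta < 1

def f(x):
    # Smallest gamma with F(x, gamma) >= 0 for the toy LMI: f(x) = |x - 1|
    return abs(x - 1.0)

def logdetF(x, gamma):
    terms = np.array([gamma - (x - 1.0), gamma + (x - 1.0), 3.0 - x, 3.0 + x])
    return np.sum(np.log(terms)) if np.all(terms > 0) else -np.inf

def center(gamma):
    # Analytic center of F(x, gamma) > 0 via ternary search on the concave
    # function x -> log det F(x, gamma)
    lo, hi = max(1.0 - gamma, -3.0), min(1.0 + gamma, 3.0)
    for _ in range(200):
        m1, m2 = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if logdetF(m1, gamma) < logdetF(m2, gamma):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

x, gamma = 0.0, 5.0                               # step 1: a feasible start
for _ in range(40):
    gamma = (1 - theta) * f(x) + theta * gamma    # step 2: lower the level
    x = center(gamma)                             # step 3: recenter
assert gamma < 1e-3 and abs(x - 1.0) < 1e-3       # gamma_k decreases toward 0
```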
This algorithm generates a decreasing sequence of γk ;

γ0 > γ1 > · · · .
³The system ẋp = Ap xp + Bp u, z = Mp xp can be stabilized by a dynamic controller if and only if the triple
(Ap , Bp , Mp ) is stabilizable and detectable.

If F is given by

    F(x, γ)  =  [ γA(x) − B(x)    0       0
                  0               A(x)    0
                  0               0       C(x) ] ,
then the problem is a quasiconvex optimization problem and the sequence γk converges to the
globally optimal solution γ ∗ [7].
An initial feasible point (x0 ,γ0 ) can be found by solving a Lyapunov or Riccati equation. For
instance, consider the LQR problem for the discrete-time case. Recall that, in our formulation, γ
is the performance bound and x is the vector formed by stacking the elements of (Xp ,Yp )∈ C(γ),
which is directly related to the Lyapunov matrix Y solving the Lyapunov inequality (9.25). Noting
that the performance measure is finite if and only if the closed-loop system is stable, we can solve

    Y = AclT Y Acl + CclT Ccl + Q

for some stabilizing (possibly full-order) controller G and an arbitrary Q > 0, and compute the
value of the performance measure ‖BT YB + FT F‖. Since this Y satisfies (9.25), the matrix pair
(Xp , Yp ) defined by

    Y  =  [ Yp    Ypc            Xp = (Yp − Ypc Yc⁻¹ YpcT )⁻¹
            YpcT  Yc  ] ,

must be such that (Xp , Yp ) ∈ C(γ) for any γ satisfying ‖BT YB + FT F‖ < γ. Thus, using the
triple (Xp ,Yp ,γ), an initial point (x0 ,γ0 ) can be determined. In the above, the stabilizing controller
can be designed using any method such as LQG, pole assignment, etc. Finally, for the covariance
upper bound control problem, the dual of the above discussion can be applied. For the H∞ control
problem, we need to solve the H∞ Riccati equations in Lemmas 7.1.1 and 7.1.2 for sufficiently large
γ > 0 instead of the Lyapunov equation.
The choice of the parameter θ does not alter the convergence property (i.e., any θ such that
0 < θ < 1 will result in convergent sequences γk and xk ). However, the value of θ does affect
the speed of convergence. In general, a smaller θ yields a smaller number of iteration steps for
convergence, but in this case, the computational load for the analytic center determination at each
iteration will be more demanding.
If we use Newton’s method for computing the analytic center at each iteration, we need an
initial feasible point. Clearly, xk is a feasible point for the (k+1)th iteration since F(xk , γk+1 ) > 0 due
to the property (a) of F(·, ·) given above. Thus initialization for the analytic center computation
does not introduce additional computational burdens.
We have described an algorithm for solving a class of LMI optimization problems. The algorithm
is based on the notion of the analytic center, and utilizes the special structure (LMI constraints) of
the problem. In this sense, the algorithm is more specialized than other general convex programming
methods such as the cutting plane algorithm and the ellipsoid algorithm, and hence one can expect
a better performance for solving the particular LMI problem. Indeed, numerical experiences suggest
that the algorithm converges much faster than the above mentioned methods.

11.2 Control Design with Fixed Controller Order


This section addresses computational aspects of the fixed-order control design. As mentioned in
the previous section, the fixed-order control problem is much harder than the problem of designing
controllers of unspecified order. Specifically, the fixed-order control problem involves a nonconvex
coupling constraint and hence, it is extremely difficult to determine the feasibility of the control
specifications even for the most fundamental stabilization problem. We shall first define a general
form of the fixed-order control problems in the next section, and then suggest an approach to tackle
this problem in later sections.

11.2.1 Problem Formulation


Consider the following feasibility problem consisting of two convex sets and a nonconvex coupling
constraint;

The Dual LMI Problem: Find a matrix pair (X,Y) such that

X ∈ X = { X : Φ(X) > 0, X ∈ Ps } (11.3)

Y ∈ Y = { Y : Ω(Y) > 0, Y ∈ Ps } (11.4)

X = Y−1 > 0

where Φ(·) and Ω(·) are affine mappings on the set of real symmetric matrices, and Ps
is the set of real symmetric matrices with a block diagonal structure.

We have shown in Chapters 6 to 9 that many fixed-order controller design problems with stability,
performance and robustness specifications for both continuous-time and discrete-time systems can
be reduced to the Dual LMI Problem. For example, the linear time-invariant continuous-time
system
ẋp = Ap xp + Bp u, z = Mp xp (11.5)

is stabilizable via a dynamic output feedback controller of order nc if and only if there exists a matrix
pair (X,Y) such that
    Φ(X) = −B⊥ (AX + XAT )B⊥T > 0,                                             (11.6)

    Ω(Y) = −MT⊥ (YA + AT Y)MT⊥T > 0,                                           (11.7)

    X = Y⁻¹ > 0,                                                               (11.8)

where

    A  =∆  [ Ap  0        B  =∆  [ Bp  0          M  =∆  [ Mp  0
             0   0 ] ,            0   Inc ] ,             0   Inc ] .          (11.9)

The controller order is fixed by the dimension of X ∈ R(np +nc )×(np +nc ) . Thus, stabilizability via
a fixed-order controller can be checked by solving the Dual LMI Problem with Φ(·) and Ω(·)
defined as in (11.6) and (11.7), respectively, and Ps being the set of (unstructured) real symmetric

matrices. Moreover, all stabilizing static output feedback gains can be computed using solutions
(X,Y) to the Dual LMI Problem by the explicit parametrization given in Theorem 2.3.12. In fact,
all the control problems considered in Chapter 9 can be formulated as Dual LMI Problems.
Typically, the structure constraint Ps arises in robust control problems for systems with structured
uncertainty. For example, the discrete-time state space upper bound (SSUB) µ-synthesis problem
can be reduced to the Dual LMI Problem with

    Φ(Xs ) = Γ⊥ (Xs − ΘXs ΘT )Γ⊥T ,

    Ω(Ys ) = ΛT⊥ (Ys − ΘT Ys Θ)ΛT⊥T ,

    Ps  =∆  { [ X  0
                0  S ]  :  X = XT ,  S ∈ S }
where Γ, Λ and Θ are the augmented matrices defined in Theorem 9.2.8, and S is the set of scaling
matrices defined in (9.17).

11.2.2 A Minimization Approach


This section introduces an approach to address the Dual LMI Problem. An algorithm based on a
minimization approach is proposed and its advantages and limitations are discussed. The objective
here is to provide a motivation for the XY-Centering Algorithm to be presented in the next section.
To this end, consider a minimization problem

    λ∗ = min{ λmax (XY) : (X, Y) ∈ C }                                         (11.10)

where

    C = { (X, Y) : X ∈ X , Y ∈ Y, Ψ(X, Y) ≥ 0 }                                (11.11)

where Ψ(X, Y) is defined in (11.1). Recall that the convex set C is related to the control problem
with unspecified controller order (possibly equal to the plant order). For example, for the stabi-
lization problem in Section 9.1.1, C ≠ φ holds (with Ps being the set of unstructured real symmetric
matrices) if and only if (A, B) is stabilizable and (A, C) is detectable. Thus, the Dual LMI
Problem is feasible only if the above minimization problem is feasible (i.e., C ≠ φ). Moreover, the
Dual LMI Problem is feasible if and only if the optimal value of the above problem is λ∗ = 1 and
is attained at an interior point of C (not on the boundary of C). This can be easily seen once we
notice that, if X > 0, Y > 0,

    X ≥ Y⁻¹  ⇔  λmin (XY) ≥ 1,

    λmin (XY) = λmax (XY) = 1  ⇔  XY = I.

Unfortunately, the minimization problem is not convex since the objective function λmax (XY) is
not. Thus, it is extremely difficult to compute a globally optimal solution.
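The two equivalences above are easy to confirm numerically on random positive definite matrices (a toy sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
M = rng.standard_normal((n, n))
Y = M @ M.T + n * np.eye(n)            # Y > 0
P = rng.standard_normal((n, n))
X = np.linalg.inv(Y) + P @ P.T         # X >= Y^{-1}, so lambda_min(XY) >= 1

eigs = np.linalg.eigvals(X @ Y).real   # eigenvalues of XY are real here
assert eigs.min() >= 1.0 - 1e-9

X = np.linalg.inv(Y)                   # lambda_min = lambda_max = 1 <=> XY = I
assert np.allclose(X @ Y, np.eye(n))
```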

An approach to this nonconvex problem is to utilize the fact that, for each fixed Y > 0, the
function ϕ(X) = λmax (XY) is convex. Thus we can successively minimize λmax (XY) over one
variable while fixing the other, by solving convex programming problems:

    Initialization:  k = 1,  (X0 , Y1 ) ∈ C

    Xk    = arg min{ λmax (XYk ) : (X, Yk ) ∈ C },

    Yk+1  = arg min{ λmax (Xk Y) : (Xk , Y) ∈ C }.

Of course, this algorithm may not give a global solution to the original problem (11.10). Note that
the optimal value of each minimization problem may not be attained, i.e., Xk and Yk+1 lie on the
boundaries of the (open) feasible domains. Specifically, it is possible that Xk 6∈ X or Yk+1 6∈ Y.
Hence, the sets X and Y must be replaced by closed set inner approximations X̄ and Ȳ using a
small scalar ε > 0 to guarantee Xk ∈ X and Yk ∈ Y for all k, where


X̄ = {X : Φ(X) ≥ εI, X ∈ Ps },


Ȳ = {Y : Ω(Y) ≥ εI, Y ∈ Ps }.

With these modifications, the feasibility of each minimization problem at any iteration is guaranteed
since Yk (Xk−1 ) is a feasible point of the second (first) minimization problem. Moreover, this fact
guarantees that the optimal values of the minimization problems are nonincreasing;

λmax (Xk Yk ) ≥ λmax (Xk Yk+1 ) ≥ λmax (Xk+1 Yk+1 ).

The algorithm we shall present in the next section is obtained by modifying the above successive
minimization algorithm so that an upper bound on λmax (Xk Yk ) strictly decreases at each iteration
without requiring special modifications for X and Y. The idea is to replace the solution to each
minimization problem in the above algorithm by analytic centers.

11.2.3 The XY-Centering Algorithm

This section presents an algorithm based on the notion of the analytic center to address the Dual
LMI Problem. We shall state the algorithm first, then discuss its properties and limitations.

The XY-Centering Algorithm:

1. Choose a parameter θ such that 0 < θ < 1.

2. Find (X̂, Ŷ) ∈ C and let k = 1 and


Y1 = Ŷ, β1 > λmax (X̂Ŷ).

3. Compute the analytic center Xk and update αk :

       Xk  =∆  ac{ I < Yk^{1/2} X Yk^{1/2} < βk I,  X ∈ X },

       αk  =  (1 − θ)λmax (Xk Yk ) + θβk .

4. Compute the analytic center Yk+1 and update βk+1 :

       Yk+1  =∆  ac{ I < Xk^{1/2} Y Xk^{1/2} < αk I,  Y ∈ Y },

       βk+1  =  (1 − θ)λmax (Xk Yk+1 ) + θαk .

5. If Xk⁻¹ ∈ Y or Yk+1⁻¹ ∈ X , then stop. Otherwise let k ← k + 1 and go to step 3.

In the above algorithm, an initialization parameter (X̂, Ŷ) ∈ C can be found by convex program-
ming, or it can be determined that C = φ, in which case the Dual LMI Problem is infeasible. In
fact, noniterative methods to find (X̂, Ŷ) ∈ C are available for many control problems. For instance,
an element of C for the fixed-order dynamic output feedback stabilization problem can be found as
follows; Find a stabilizing controller of any order (e.g., full-order LQG) and compute the closed-loop
system matrix Ac` . Choosing an arbitrary matrix Q > 0, solve the Lyapunov equation for X > 0;

    Acl X + X AclT + Q = 0.                                                    (11.12)

Then letting

    X  =  [ Xp    Xpc            X̂  =∆  Xp ,     Ŷ  =∆  (Xp − Xpc Xc⁻¹ XpcT )⁻¹ ,
            XpcT  Xc  ] ,

yields (X̂, Ŷ) ∈ C. A justification of this method is left for the reader as an exercise.
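The Lyapunov equation (11.12) can be solved by any standard means; a small self-contained sketch uses the Kronecker (vec) identity vec(AX + XAᵀ) = (I ⊗ A + A ⊗ I) vec(X), with a toy stable matrix standing in for Acl:

```python
import numpy as np

def lyap_continuous(A, Q):
    # Solve A X + X A^T + Q = 0 via vec(AX + XA^T) = (I kron A + A kron I) vec(X)
    n = A.shape[0]
    K = np.kron(np.eye(n), A) + np.kron(A, np.eye(n))
    X = np.linalg.solve(K, -Q.flatten(order='F')).reshape((n, n), order='F')
    return 0.5 * (X + X.T)             # symmetrize against roundoff

# Toy: a stable matrix standing in for A_cl, and Q = I > 0
Acl = np.array([[-1.0, 2.0], [0.0, -3.0]])
Q = np.eye(2)
X = lyap_continuous(Acl, Q)
assert np.allclose(Acl @ X + X @ Acl.T + Q, 0.0, atol=1e-10)
assert np.all(np.linalg.eigvalsh(X) > 0)   # X > 0 since Acl is stable, Q > 0
```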
The notation ac{·} in steps 3 and 4 denotes the analytic center of the LMI {·}. For example,
Xk is the analytic center of
 1/2 1/2

βk I − Yk XY k 0 0
 
F(X) =  0 
∆ 1/2 1/2
 0 Yk XYk −I  > 0,
0 0 Φ(X)

subject to the structure constraint X ∈ Ps .


The XY-Centering Algorithm has the following property.

Theorem 11.2.1 Suppose C 6= φ. Then the XY-Centering Algorithm is well-defined, i.e., each set
defined by the argument of ac{·} at steps 3 and 4 is nonempty and bounded, and hence the analytic
centers exist at any iteration k ≥ 1. Moreover, upper bounds αk and βk on λmax (Xk Yk ) are strictly
decreasing;
βk > αk > βk+1 > 1, ∀ k ≥ 1.

Proof. Since C ≠ φ, we can find an initial pair (Y1 , β1 ) as in step 2. Note that the set

    Xk  =∆  { X : I < Yk^{1/2} X Yk^{1/2} < βk I,  X ∈ X }

is nonempty since X̂ ∈ Xk for k = 1. Clearly, Xk is bounded, and hence the analytic center Xk is
well-defined. Noting that
λmax (Xk Yk ) < αk < βk ,

the set
∆ 1/2 1/2
Yk = { Y : I < Xk YXk < αk I, Y ∈ Y }

is again nonempty since Y1 ∈ Yk for k = 1. βk+1 is determined so that

λmax (Xk Yk+1 ) < βk+1 < αk .

Thus, in general, Xk and Yk are nonempty for all k since Xk−1 ∈ Xk , Yk−1 ∈ Yk . 2
The XY-Centering Algorithm may be interpreted as a modified version of the successive
minimization algorithm discussed in the previous section, except that each minimization problem is
replaced by the computation of the analytic center. The strictly decreasing property of λmax(Xk Yk)
may not hold for the XY-Centering Algorithm, but the upper bounds αk and βk strictly decrease. A
novel aspect of the XY-Centering Algorithm is that we do not require λmax(Xk Yk) = 1 to have
a solution to the Dual LMI Problem. Intuitively, the algorithm tries to make Xk and Yk^{−1} (or
equivalently, Xk^{−1} and Yk) closer to each other while keeping Xk and Yk located "deep
inside" the sets X and Y. Thus, Xk^{−1} and Yk^{−1} are forced to move into Y and X , respectively.
To compute the analytic center by the method described in the previous section, we need an
initial feasible point. As discussed in the proof of Theorem 11.2.1, the analytic centers Xk and
Yk+1 at step k can be used as feasible starting points for the computation of the analytic
centers Xk+1 and Yk+2 at step k + 1. Thus, the result of the previous iteration is directly useful
to the next iteration. This fact speeds up the algorithm. Moreover, in general, the analytic center
can be computed much faster than the solution to the optimization problems in the successive
minimization algorithm. Of course, the total number of iterations for the successive minimization
algorithm may be less than that for the XY-Centering Algorithm. However, numerical experience
suggests that the XY-Centering Algorithm has an overall advantage.
Since the analytic center is uniquely determined, the XY-Centering Algorithm can be considered
as a point-to-point continuous mapping (Yk , βk ) → (Yk+1 , βk+1 ) on D where

    D = { (Y, β) : 0 < Y ∈ Y, ∃X ∈ X such that I < Y^{1/2} X Y^{1/2} < βI }.

Hence, in view of the global convergence theorem [84], we see that the sequence (Yk, βk) converges
to the boundary of D or diverges. From Theorem 11.2.1, we know that the sequence βk is strictly
decreasing and bounded below by one, and hence converges. If the limit is β∞ > 1, then Yk
will never converge to the solution of the Dual LMI Problem. To see this, suppose Yk converges
to Y∞ > 0, Y∞ ∈ Y. If Y∞ were a solution of the Dual LMI Problem, i.e., Y∞^{−1} ∈ X , then
(Y∞, β∞) ∈ D since a matrix X can be chosen as Y∞^{−1} + εI for sufficiently small ε > 0. Thus,
(Y∞, β∞) is not on the boundary of D and, by contradiction, we conclude that Y∞ is not a solution
to the Dual LMI Problem. If the limit is β∞ = 1, then ‖Xk Yk − I‖ can be made arbitrarily small by
choosing sufficiently large k, while Xk ∈ X , Yk ∈ Y. Note, however, that Xk and Yk may approach
the boundary of X and Y, respectively, and it may still be possible that Xk^{−1} ∉ Y and Yk^{−1} ∉ X even
though ‖Xk Yk − I‖ is arbitrarily small. From the above discussion, the parameters αk and βk in the
XY-Centering Algorithm approach one if and only if the Dual LMI Problem is feasible, provided
there exist fixed, closed inner approximations of the sets X and Y such that Xk and Yk belong to
the closed sets for all k, and also the sequences Xk and Yk are contained in compact subsets of the
set of positive definite matrices.

Example 11.2.1 Consider the double integrator system given by (11.5) where
" # " #
0 1 0 h i
Ap := , Bp := , Mp := 1 0 .
0 0 1
This system is not stabilizable via static output feedback, but stabilizable by a dynamic controller
of order nc ≥ 1. We apply the XY-Centering Algorithm for designing stabilizing controllers of order
nc = 0 and nc = 1. Here, we use Φ(·) and Ω(·) defined in (11.6) and (11.7), respectively.
Figure 11.1 shows the behavior of the upper bounds αk and βk on λmax(Xk Yk), and the
minimum eigenvalues of Φ(Yk^{−1}) and Ω(Xk^{−1}) for the case nc = 0. The parameter θ is chosen to be
θ = 0.2 (see the discussion in [7] about the effect of the choice of θ on the speed of convergence).
The initial Y1 is determined by the method described in Section 11.2.3, where the free
matrix Q > 0 in (11.12) is chosen to be the identity. To visualize the convergence property, β_{k−1} is
plotted instead of βk; in this way, the distance between the two curves corresponding to αk and
β_{k−1} can be considered as a measure for the "distance" between Xk and Yk^{−1}. Similarly, the curves
−λmin(Φ(Yk^{−1})) and −λmin(Ω(Xk^{−1})) express the "distance" between X and Yk^{−1}, and Y and Xk^{−1},
respectively. For example, Yk^{−1} ∈ X if and only if −λmin(Φ(Yk^{−1})) < 0.


Interestingly, the sequence Xk Yk approaches I (or equivalently, αk → 1) even though the
system is not stabilizable for nc = 0. This is not a violation of the stabilizability conditions, since
Yk^{−1} approaches X but never reaches (belongs to) X (i.e., λmin(Φ(Yk^{−1})) ≤ 0 for all k). After 33
iterations, we have

    X = [  1.5470           −4.4991 × 10^−5 ]
        [ −4.4991 × 10^−5    4.7298         ],    Φ(X) = 9.00 × 10^−5,    Ω(X^−1) = −1.23 × 10^−5,

    Y = [  0.64646          −5.1534 × 10^−6 ]
        [ −5.1534 × 10^−6    0.21144        ],    Φ(Y^−1) = −7.54 × 10^−5,    Ω(Y) = 1.03 × 10^−5,

    λ(XY) = 1.00008, 1.00002.

Thus X ∈ X and Y ∈ Y but X^−1 ∉ Y and Y^−1 ∉ X . Note that, since Xk and Yk are approaching
the boundaries of X and Y, respectively, the fact that αk → 1 does not imply that the Dual LMI
Problem is feasible (i.e., the system is not stabilizable via static output feedback).
[Figure 11.1: Behavior of the XY-Centering Algorithm (nc = 0). The curves αk, β_{k−1},
−λmin(Φ(Yk^{−1})), and −λmin(Ω(Xk^{−1})) are plotted against the iteration number k.]

[Figure 11.2: Behavior of the XY-Centering Algorithm (nc = 1). The same four curves are
plotted against the iteration number k.]

For the fixed-order (nc = 1) dynamic controller case, the parameter is chosen as before: θ = 0.2.
The initial matrix Y1 is also generated by the same procedure as before, but Q > 0 in (11.12) is
chosen randomly (using the Matlab command "rand"). It can be proved that if Y1 is chosen to
be block diagonal (two blocks with dimensions np and nc, where np is the plant order), then the
XY-Centering Algorithm always fails even if the Dual LMI Problem is feasible, when the system
matrices have the structure as in (11.9). However, for all of the (nonblock diagonal) initial conditions
we have tried, the XY-Centering Algorithm was successful. Figure 11.2 shows the behavior of the
algorithm. To design a stabilizing controller, 4 iterations suffice since Yk^{−1} ∈ X and Xk^{−1} ∈ Y
at k = 4. If the algorithm continues to run, Xk^{−1} and Yk^{−1} move "deep inside" the sets Y and X ,
respectively. This is a typical behavior of the XY-Centering Algorithm. The result is
respectively. This is a typical behavior of the XY-Centering Algorithm. The result is
    Y^−1 ≅ X = [  1.1458  −0.6910  −1.6325 ]
               [ −0.6910   1.7966   2.0712 ]      λ(XY) = { 1.000048, 1.000048, 1.000048 }.
               [ −1.6325   2.0712   3.6253 ],

11.2.4 Extension to Optimal Control

The Dual LMI Problem defined in Section 11.2.1 is a feasibility problem. Hence, it naturally
applies to suboptimal control problems rather than optimal control problems. To make this point
clear, let us take an example: the continuous-time LQR problem discussed in Section 9.1.3. Using
Theorem 9.1.3, one can derive the following: there exists a fixed-order dynamic output feedback
controller which solves the LQR problem if and only if there exists a matrix pair (X,Y) such that
X = Y^−1 > 0 and
" #⊥ " #" #⊥T
∆ B AX + XAT XCT B
Φ(X) = − > 0, (11.13)
H CX −I H
" #
∆ −MT ⊥ (YA + AT Y + CT C)MT ⊥T 0
Ω(Y, γ) = > 0. (11.14)
0 γI − DT YD
Thus, the existence of a suboptimal controller which guarantees the LQ performance bound γ can
be examined by solving the Dual LMI Problem with the above definitions for Φ and Ω for each
fixed γ > 0.
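Conditions such as (11.13) involve the annihilator notation: for a matrix M, M⊥ denotes a maximal full-row-rank matrix with M⊥M = 0. One common way to compute it is from the left singular vectors of M. The sketch below assumes numpy, and the specific matrix is made-up data standing in for the stacked matrix [B; H]:

```python
import numpy as np

def left_annihilator(M, tol=1e-10):
    """Return M_perp with M_perp @ M = 0 and orthonormal rows
    spanning the left null space of M."""
    U, s, _ = np.linalg.svd(M)
    rank = int(np.sum(s > tol))
    return U[:, rank:].T

# Illustrative data standing in for the stacked matrix [B; H] in (11.13)
M = np.array([[0.0],
              [1.0],
              [2.0]])
M_perp = left_annihilator(M)
```

The rows of M_perp are orthonormal and annihilate M, so the quadratic forms in (11.13) and (11.14) are well-defined.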
The purpose of this section is to extend the XY-Centering Algorithm which solves suboptimal
control problems to handle optimal control problems: Find a controller which minimizes a specified
performance measure. To this end, we shall define the following:

The Dual LMI Optimization Problem: Solve

    γ* = inf { γ : L(γ) ≠ ∅ }


where
L(γ) = { (X, Y) : X = Y^−1 > 0, X ∈ X (γ), Y ∈ Y(γ) },


X (γ) = { X : Φ(X, γ) > 0, X ∈ Ps },

Y(γ) = { Y : Ω(Y, γ) > 0, Y ∈ Ps },
where Φ(·, ·) and Ω(·, ·) are affine with respect to the first arguments, and Ps is the set
of structured real symmetric matrices.

In general, the set L(γ) corresponding to a well-posed optimal control problem has the following
property:
    L(γ0) ≠ ∅ ⇒ L(γ) ≠ ∅,   ∀ γ ≥ γ0,
since the condition L(γ) ≠ ∅ is equivalent to the existence of a suboptimal controller which achieves
the performance level γ. Each of the sets X (γ) and Y(γ) also has a similar property:

    X (γ0) ≠ ∅ ⇒ X (γ) ≠ ∅,   ∀ γ ≥ γ0,                                   (11.15)

    Y(γ0) ≠ ∅ ⇒ Y(γ) ≠ ∅,   ∀ γ ≥ γ0.                                     (11.16)


These properties are crucial to develop an algorithm to solve the Dual LMI Optimization Problem.
We shall propose the following algorithm to solve the Dual LMI Optimization Problem. To
state the result, let us define the set C(γ) as in (11.11) by replacing X and Y by X (γ) and Y(γ),
respectively.

The XY-Centering Optimization Algorithm:

1. Choose parameters 0 < θλ < 1 and 0 < θγ < 1.

2. Let γ1 > 0 be a sufficiently large number, and find (X̂,Ŷ)∈ C(γ1 ). Then initialize k = 1 and

Y1 = Ŷ, α1 > λmax (X̂Ŷ).

3. Compute the analytic centers:

       Xk := ac{ X : I < Yk^{1/2} X Yk^{1/2} < αk I, X ∈ X (γk) },

       Yk+1 := ac{ Y : I < Xk^{1/2} Y Xk^{1/2} < αk I, Y ∈ Y(γk) }.

4. If Xk^{−1} ∈ Y(γk) or Yk+1^{−1} ∈ X (γk), then

       γk+1 := (1 − θγ)ψ(Xk, Yk+1) + θγ γk,    αk+1 := αk,

   where

       ψ(X, Y) = min{ γ : Φ(X, γ) ≥ 0, Ω(Y, γ) ≥ 0 }.                     (11.17)

   Otherwise, let

       αk+1 := (1 − θλ)λmax(Xk Yk+1) + θλ αk,    γk+1 := γk.

5. If γk − ψ(Xk , Yk+1 ) < ε for sufficiently small ε > 0, then stop. Otherwise let k ← k + 1 and
go to step 3.

As in the XY-Centering Algorithm for the Dual LMI (feasibility) Problem, the initialization
parameter (X̂,Ŷ)∈ C(γ1 ) can be found by convex programming or some other noniterative methods.
For the (optimal) LQR problem, for instance, we can use the following method similar to the one
described in Section 11.1.3. First, design a stabilizing controller of any order (possibly equal to
the plant order). Then compute the closed-loop system matrices Ac` , Bc` and Cc` . Choosing an
arbitrary positive definite matrix Q > 0, solve the Lyapunov equation for Y > 0;

    Y Acℓ + Acℓ^T Y + Ccℓ^T Ccℓ + Q = 0.                                   (11.18)

Then, partitioning Y as

    Y = [ Yp     Ypc ]
        [ Ypc^T  Yc  ],

and letting

    Ŷ := Yp,    X̂ := (Yp − Ypc Yc^{−1} Ypc^T)^{−1},

yields (X̂, Ŷ) ∈ C(γ1) for a given γ1 such that

    ‖Bcℓ^T Y Bcℓ‖ < γ1.

Note that the above condition on the initial parameter γ1 may not be sufficient to guarantee
L(γ1) ≠ ∅. In this case, the only available information is that L(γ) ≠ ∅ for sufficiently large γ > 0
if and only if the system is stabilizable via a fixed-order dynamic output feedback controller. For the
H∞ control problem, a similar procedure can be applied to find (X̂, Ŷ) ∈ C(γ1), where we replace
the above Lyapunov equation (11.18) by the H∞ Riccati equation.
The idea of the XY-Centering Optimization Algorithm is the following. If Xk^{−1} ∈ Y(γk) or
Yk+1^{−1} ∈ X (γk), then there exists a fixed-order dynamic output feedback controller which yields the
cost function bounded above by γk. Indeed, this bound γk is not tight since the analytic centers
Xk and Yk+1 are interior points, and the best (smallest) bound that can be obtained from this
information is ψ(Xk, Yk+1). Thus we can tighten the performance bound as

    ψ(Xk, Yk+1) < γk+1 < γk.

Feasibility of any γk+1 satisfying the above inequality is guaranteed by the properties (11.15) and
(11.16). In many cases, the value of ψ(Xk, Yk+1) can be evaluated without iteratively solving the
minimization problem (11.17). For the LQR problem, for instance,

    ψ(X, Y) = ‖D^T Y D‖.

Now, if Xk^{−1} ∉ Y(γk) and Yk+1^{−1} ∉ X (γk), then do the feasibility iteration: tighten the bound αk
on λmax(Xk Yk+1). Then, as in the XY-Centering Algorithm, Xk and Yk^{−1} get closer to each other
by feasibility iterations and, when Xk^{−1} ∈ Y(γk) or Yk+1^{−1} ∈ X (γk), we switch to the optimality
iteration as described above.
[Figure 11.3: Behavior of the XY-Centering Optimization Algorithm. The curves γk, αk,
−λmin(Ω(Xk^{−1})) × 10, and −λmin(Φ(Yk^{−1})) × 100 are plotted against the iteration number k.]

Example 11.2.2 Consider the longitudinal motion of the VTOL helicopter [70] described by
(9.10) where
    Ap := [ −0.0366   0.0271    0.0188   −0.4555 ]        Bp := [  0.4422    0.1761 ]
          [  0.0482  −1.0100    0.0024   −4.0208 ]              [  3.5446   −7.5922 ]
          [  0.1002   0.3681   −0.7070    1.4200 ]              [ −5.5200    4.4900 ]
          [  0         0        1         0      ],             [  0         0      ],

    Dp := I4,    Cp := [ I2  0 ]        By := [ 0  ]        Mp := [ 0  1  0  0 ].
                       [ 0   0 ],             [ I2 ],

We apply the XY-Centering Optimization Algorithm for designing an LQ (optimal) static
output feedback controller. The system is open-loop unstable; the LQ optimal state feedback cost
is J(Gsf) = 1.5039 and the (full-order) LQG cost is J(Gfo) = 7.7062, where the LQG controller is
designed with the process noise intensity I and the measurement noise intensity εI for sufficiently
small ε > 0. For the XY-Centering Optimization Algorithm, the parameters θλ and θγ are chosen
as θλ = 0.6 and θγ = 0.3. These values are chosen after a few trials, and may be refined for faster
convergence. The initial matrix Y1 is determined by the method described above, where the LQG
controller is chosen as the stabilizing controller and the arbitrary matrix Q > 0 in (11.18) is set to
the identity. The initial cost bound γ1 is chosen as γ1 = 1.2 × ‖Dp^T Y1 Dp‖. Figure 11.3 shows the
behavior of the algorithm. It can be seen that the first 18 iterations are feasibility iterations where we
update αk. At k = 18, −λmin(Φ(Yk^{−1})) becomes negative and γk is updated. Then, for k > 18, the

feasibility and optimality iterations alternate. As a result, the bound αk on λmax (Xk Yk ) converged
to one and the cost bound γk converged to 7.77561. Thus we have
  
5.7066 −1.3896 −2.2371 −1.9297 
 1.000010
  

 
∼  −1.3896 1.0544 0.7844 0.5429 

 1.000037
X−1 =Y= , λ(XY) = ,
 −2.2371 0.7844 1.2420 0.9183  
 1.000044
  


 1.000056
−1.9297 0.5429 0.9183 1.0556

λmin (Φ(X)) = 1.3626 × 10−5 , λmin (Ω(X−1 )) = −4.8012 × 10−5 ,

λmin (Φ(Y−1 )) = 2.0180 × 10−6 , λmin (Ω(Y)) = 3.5281 × 10−7 .

The matrix X satisfies Φ(X) > 0 but Ω(X^−1) ≯ 0. On the other hand, the matrix Y satisfies
Ω(Y) > 0 and Φ(Y^−1) > 0, and thus we have (Y^−1, Y) ∈ L(7.77561). Using Y^−1 in place of X
in the definition for Γ, Λ, and Θ given in Theorem 9.1.1, a controller can be computed from the
formula in (9.7) as

    G = [ −1.4427 ]
        [  8.8963 ].

We cannot claim that this is the optimal static output feedback controller, but we may say this is
a satisfactory result since the value of the cost is very close to that for the LQG controller.

11.3 Control Design with Fixed Controller Structure


In this section, we consider a problem of designing controllers with some fixed structure. Specifically,
control design specifications include the structure of the controller (e.g. decentralized structure, low
controller order, etc.) as well as the usual stability, performance and robustness requirements. The
corresponding computational problem is more difficult than the other two considered in the previous
sections. We shall first formulate the fixed structure control design problem as a computational
problem in the next section, then provide an algorithm to address this problem later.

11.3.1 Problem Formulation


Recall from Chapter 9 that many control design problems are reduced to the following type of
linear algebra problems;

    ΓGΛ + (ΓGΛ)^T + Θ < 0              (continuous-time case),             (11.19)

    (Θ + ΓGΛ) R (Θ + ΓGΛ)^T < Q        (discrete-time case),               (11.20)

where G is the controller parameter and the other matrices are defined in terms of the plant matrices
and the Lyapunov matrix X or Y, and possibly the scaling matrix S. Our approach was to solve
the above algebraic problems for the controller parameter G, while fixing the Lyapunov matrix

(and the scaling matrix). In this way, the parameter search for (X,G) can be replaced by that
for X, in which case, all controllers associated with a given Lyapunov matrix X are parametrized
explicitly. This approach was possible if the controller has no structural constraints (although we
can handle the fixed controller order case as has been done in the previous sections). If we restrict
the controller to have a certain structure G ∈ Gs , then the above approach does not work, and in
general, we need to find G ∈ Gs and P ∈ Ps satisfying the above matrix inequality, where Ps is
the set of (structured) Lyapunov matrices which possibly includes the scaling matrix as well as the
original Lyapunov matrix X.
As an example, consider the discrete-time SSUB µ-synthesis problem with a decentralized con-
troller structure. This problem can be reduced to a search for (P,G)∈ Ps × Gs satisfying

    (Θ + ΓGΛ) P (Θ + ΓGΛ)^T < P

where

    Ps := { block diag(X, S) : X > 0, S ∈ S },

    Gs := { block diag(G1, · · ·, Gng) : Gi ∈ R^{nai×nsi} },

    [ Θ  Γ  Λ^T ] = [ A  D  B  M^T ]
                    [ C  F  H  E^T ].
Now we state a general form of the fixed structure control design problem.

The Fixed Structure Control Problem: For a set of structured Lyapunov matrices
Ps and a set of fixed structure controllers Gs, solve

    α* = min{ α : M(α) ≠ ∅ }

where M(α) ⊂ Ps × Gs is such that

    P(G, α) = { P : (P, G) ∈ M(α) },

    G(P, α) = { G : (P, G) ∈ M(α) }

are bounded convex sets characterized by LMIs.

In the above, the parameter α has been introduced to convert a certain feasibility problem to
the above optimization problem. For instance, the discrete-time SSUB µ-synthesis problem can be
converted to the above optimization problem with

    M(α) = { (P, G) : (Θ + ΓGΛ) P (Θ + ΓGΛ)^T < αP, P ∈ Ps, G ∈ Gs }      (11.21)

where Θ, Γ and Λ are defined above. Note that a given pair (P,G) solves the SSUB µ-synthesis
problem if and only if (P,G)∈ M(α) for some α ≤ 1. Thus, the problem is feasible if and only
if the optimal value of the above minimization problem is α∗ ≤ 1. Clearly, the set P(G, α) is

characterized by an LMI and thus convex. Note that the set G(P, α) is in fact convex and can be
characterized by an LMI as follows:
( " # )
αP Θ + ΓGΛ
G(P, α) = (G : >0 .
(Θ + ΓGΛ)T P−1

The set G(P, α) is bounded if α > 0, P > 0, and Γ and Λ are of full column and row rank,
respectively. These conditions are usually satisfied in practice. The set P(G, α) is not bounded
in general. However, adding another constraint tr(P) ≤ 1 in M(α), the set P(G, α) can be
made bounded without loss of generality. Boundedness of P(G, α) and G(P, α) is required in the
algorithm given below, to guarantee the existence of the analytic centers.
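The LMI characterization of G(P, α) is a Schur-complement rewriting of the quadratic inequality in (11.21): since P^{−1} > 0, the block matrix is positive definite exactly when αP − (Θ + ΓGΛ)P(Θ + ΓGΛ)^T > 0. This equivalence can be cross-checked numerically; the sketch below assumes numpy, and the data are random placeholders, not taken from any example in the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2
# Random placeholder data for Theta, Gamma, Lambda and a Lyapunov matrix P > 0
Theta = 0.3 * rng.standard_normal((n, n))
Gamma = rng.standard_normal((n, m))
Lam = rng.standard_normal((m, n))
A = rng.standard_normal((n, n))
P = A @ A.T + n * np.eye(n)
alpha = 1.0

def in_G_lmi(G):
    """Membership in G(P, alpha) via the block LMI."""
    M = Theta + Gamma @ G @ Lam
    lmi = np.block([[alpha * P, M],
                    [M.T, np.linalg.inv(P)]])
    return bool(np.all(np.linalg.eigvalsh(lmi) > 0))

def in_G_quadratic(G):
    """Membership via the quadratic inequality (Theta + Gamma G Lam) P (.)^T < alpha P."""
    M = Theta + Gamma @ G @ Lam
    return bool(np.all(np.linalg.eigvalsh(alpha * P - M @ P @ M.T) > 0))
```

For any controller parameter G, the two membership tests agree, which is exactly the Schur complement argument.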

11.3.2 The VK-Centering Algorithm

This section provides an algorithm to address the Fixed Structure Control Problem defined in the
previous section. The idea is very simple: noting that the sets P(G, α) and G(P, α) are
convex, we can alternately minimize α over P and G:

    Initialize k = 0, G0,

    (Pk, αk) = arg min { α : P ∈ P(Gk, α) },

    (Gk+1, βk+1) = arg min { α : G ∈ G(Pk, α) }.

In this case, each minimization problem is quasi-convex and the optimal values αk and βk+1 are
nonincreasing. Since the value of α is bounded below, the sequences converge.
As in the XY-Centering Algorithm, we shall replace each minimization problem by the compu-
tation of the analytic center. The following function will be used:


ψ(P, G) = inf {α : (P, G) ∈ M(α)} .

The VK-Centering Algorithm:

1. Choose a parameter θ such that 0 < θ < 1.


2. Find α̂, Ĝ and P̂ such that (P̂, Ĝ) ∈ M(α̂), and let G1 := Ĝ, β1 := α̂, and k = 1.

3. Compute the analytic center Pk and update αk:

       Pk = ac{ P(Gk, βk) },

       αk = (1 − θ)ψ(Pk, Gk) + θβk.

4. Compute the analytic center Gk+1 and update βk+1:

       Gk+1 = ac{ G(Pk, αk) },

       βk+1 = (1 − θ)ψ(Pk, Gk+1) + θαk.

5. If βk+1 − ψ(Pk , Gk+1 ) < ε for sufficiently small ε > 0, then stop. Otherwise, let k ← k + 1
and go to step 3.

For the special case of discrete-time unstructured (but fixed-order) control design, we do not
require iterative computations for finding the analytic center Gk ; we have the following closed form
solution to the analytic center.

Theorem 11.3.1 Let matrices B, C, D, R and Q be given. Suppose B^T B > 0, CC^T > 0, R > 0
and Q > 0, and consider the matrix-valued function

    F(G) = Q − (BGC + D) R (BGC + D)^T.

Suppose further that there exists a matrix G such that F(G) > 0. Then, the optimization problem

    ψ = max_G { det F(G) : F(G) > 0 }

is well-posed, and has the unique maximizer

    G* = −(B^T Φ B)^−1 B^T Φ D R C^T (C R C^T)^−1                          (11.22)

where

    Φ = ( Q − D R D^T + D R C^T Rc C R D^T )^−1,

    Rc = ( C R C^T )^−1.

Moreover, the maximum value ψ is given by

              det Ψ
    ψ = ----------------,                                                  (11.23)
         det Φ det Rc

where

    Ψ = Rc − Rc C R D^T ( Φ − Φ B (B^T Φ B)^−1 B^T Φ ) D R C^T Rc.

Proof. Following the proof of Theorem 2.3.11, we have

    F(G) = Φ^−1 − (BG + D R C^T Rc) Rc^−1 (BG + D R C^T Rc)^T.

Then, using the determinant formula,

                    1        [        Φ^−1          BG + D R C^T Rc ]
    det F(G) = ---------- det [                                     ]
                 det Rc      [ (BG + D R C^T Rc)^T        Rc       ]

                det Φ^−1
             = ---------- det[ Rc − (BG + D R C^T Rc)^T Φ (BG + D R C^T Rc) ].
                 det Rc

After expanding the term [·] and completing the square, we have

                det[ Ψ − (G − G*)^T (B^T Φ B)(G − G*) ]
    det F(G) = -----------------------------------------
                           det Φ det Rc

where Ψ is as defined in the theorem statement and G* is given by (11.22). Noting that Ψ > 0,
B^T Φ B > 0 and

    Ψ > Ψ − (G − G*)^T (B^T Φ B)(G − G*) > 0

for all G ≠ G* such that F(G) > 0 (see the proof of Theorem 2.3.11), we conclude that the
determinant of F(G) is maximized at G = G*, and the maximum value is given by (11.23). □
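The closed-form maximizer can be sanity-checked numerically. The sketch below assumes numpy; the matrices are small made-up data chosen to satisfy the theorem's hypotheses (B^T B > 0, CC^T > 0, R > 0, Q > 0, and F(0) > 0). It builds G* from (11.22), Ψ from the theorem statement, and ψ from (11.23).

```python
import numpy as np

rng = np.random.default_rng(1)
# Small made-up data satisfying the hypotheses of Theorem 11.3.1
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, -0.5]])
D = np.array([[0.2, 0.1],
              [0.0, 0.3]])
R = np.eye(2)
Q = 3.0 * np.eye(2)

def F(G):
    M = B @ G @ C + D
    return Q - M @ R @ M.T

Rc = np.linalg.inv(C @ R @ C.T)
Phi = np.linalg.inv(Q - D @ R @ D.T + D @ R @ C.T @ Rc @ C @ R @ D.T)
G_star = -np.linalg.inv(B.T @ Phi @ B) @ B.T @ Phi @ D @ R @ C.T @ Rc   # (11.22)
Psi = Rc - Rc @ C @ R @ D.T @ (
    Phi - Phi @ B @ np.linalg.inv(B.T @ Phi @ B) @ B.T @ Phi) @ D @ R @ C.T @ Rc
psi = np.linalg.det(Psi) / (np.linalg.det(Phi) * np.linalg.det(Rc))     # (11.23)
```

Numerically, F(G*) is positive definite, perturbing G* only decreases det F, and det F(G*) matches the value ψ from (11.23).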

Example 11.3.1 We consider a mechanical system consisting of two masses connected by a spring
[149]. A state space realization of the system is given by
    [ ẋ1 ]   [   0      0    1  0 ] [ x1 ]   [   0   ]
    [ ẋ2 ] = [   0      0    0  1 ] [ x2 ] + [   0   ] u,
    [ ẍ1 ]   [ −k/m1  k/m1   0  0 ] [ ẋ1 ]   [ 1/m1  ]
    [ ẍ2 ]   [ k/m2  −k/m2   0  0 ] [ ẋ2 ]   [   0   ]

where m1 and m2 are masses and k is the spring constant. The position of the mass mi is denoted
by xi , and u is the control force input. The sensor measures the position of m2 , that is, x2 . We
choose the following values for the parameters:

m1 = m2 = 1, k = 1.

The continuous-time system is first converted to a discrete-time system assuming the zero-order
hold for the control input with sampling period T = 1 as follows:
    [ Ap  Bp ]   [  0.5780   0.4220   0.8492   0.1508   0.4610 ]
    [ Mp   ∗ ] = [  0.4220   0.5780   0.1508   0.8492   0.0390 ]
                 [ −0.6985   0.6985   0.5780   0.4220   0.8492 ]
                 [  0.6985  −0.6985   0.4220   0.5780   0.1508 ]
                 [  0        1        0        0        0      ].

This system is unstable since all the eigenvalues of Ap are on the unit circle. Our objective here is
to design an nc-th order stabilizing controller with nc less than the plant order (np = 4).
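The zero-order-hold conversion above can be reproduced numerically. The sketch below assumes numpy and uses the standard augmented-matrix identity exp([A B; 0 0]·T) = [Ad Bd; 0 I] (a general fact, not specific to this text); a plain Taylor series for the matrix exponential suffices for this small, well-scaled matrix.

```python
import numpy as np

m1 = m2 = 1.0
k = 1.0
A = np.array([[0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [-k/m1, k/m1, 0.0, 0.0],
              [k/m2, -k/m2, 0.0, 0.0]])
B = np.array([[0.0], [0.0], [1.0/m1], [0.0]])
T = 1.0

# Zero-order-hold discretization: exp([[A, B], [0, 0]] * T) = [[Ad, Bd], [0, I]]
M = np.zeros((5, 5))
M[:4, :4] = A * T
M[:4, 4:] = B * T

E = np.eye(5)
term = np.eye(5)
for j in range(1, 30):            # Taylor series for the matrix exponential
    term = term @ M / j
    E = E + term

Ad, Bd = E[:4, :4], E[:4, 4:]
```

The computed (Ad, Bd) agree with the tabulated (Ap, Bp) to the four digits shown, and all eigenvalues of Ad lie on the unit circle, confirming the instability remark.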
To address this objective, we will try to minimize the spectral radius of the closed-loop “A”
matrix. This is a special case of the Fixed Structure Control Problem defined above, with M(α)
given by (11.21) where Ps is the set of (np + nc ) × (np + nc ) positive definite matrices such that
P ∈ Ps implies tr(P) ≤ 1, Gs is the set of (1 + nc ) × (1 + nc ) unstructured matrices, and
" #
h i Ap 0 Bp 0 MTp 0

Θ Γ ΛT = .
0 0 0 Inc 0 Inc

[Figure 11.4: Convergence property of βk. The solid curve shows βk and the dashed curve shows
ψ(Pk, Gk+1), plotted against the iteration number k.]

Here, α is a (tight) upper bound on the square of the spectral radius of the closed-loop "A"
matrix (that is, Acℓ := Θ + ΓGΛ). Hence, the plant is stabilizable via an nc-th order controller if
and only if α* < 1.
We choose nc = 2 and apply the VK-Centering Algorithm to obtain an upper bound on α∗ , and
corresponding controller G and Lyapunov matrix P. The algorithm generated decreasing sequences
of αk and βk ; the sequence βk is plotted in Fig. 11.4 together with the curve for ψ(Pk , Gk+1 ).
The results are:
    G = [ Dc  Cc ] = [ −0.6942   0.3803  −0.1565 ]
        [ Bc  Ac ]   [  0.4832   0.4467  −0.1839 ]
                     [ −0.1989  −0.1839   0.0757 ],

    P = [  9.2522   4.2627  −1.6270   −3.7541    8.7372  −3.5965 ]
        [  4.2627  20.2459   2.2757   −2.2217    6.1116  −2.5156 ]
        [ −1.6270   2.2757   5.7888   −2.9487   −2.4767   1.0196 ]
        [ −3.7541  −2.2217  −2.9487   16.4480  −11.2276   4.6215 ]
        [  8.7372   6.1116  −2.4767  −11.2276   18.9101  −1.9036 ]
        [ −3.5965  −2.5156   1.0196    4.6215   −1.9036  15.0693 ] × 10^−2.
The state space realization G for the controller has eigenvalues at 0.5224 and 0. The latter eigen-
value is in fact canceled by a zero at 0, and thus the controller becomes
                       z − 0.8319
    C(z) = −0.6942 × ------------- ,
                       z − 0.5224

which is of order one. With this controller, the eigenvalues of the resulting closed-loop system are



 0.7938 (0.7938)
λ= 0.7235 ± 0.5485i (0.9079)


 0.2832 ± 0.8675i (0.9125)

where the numbers in (·) indicate the magnitudes. Thus we have obtained a first-order stabilizing
controller.

Chapter 11 Closure
Control problems whose controller order is not fixed a priori often reduce to convex problems.
This chapter gives methods for obtaining numerical solutions to such problems. The centering
method moves to the center of each convex constraint set, and is quite different from the alternating
projection method of Chapter 10, which moves to the boundary of each convex set on each iteration.
As yet, there is no definitive example proving that one of these two methods (Chapters 10
and 11) is better than the other.
Section 11.1 gives an algorithm that computes the globally optimal solution to a certain class
of convex LMI problems, which includes those arising from control design with unspecified controller
order. This algorithm is cited from [7]. The reader is warned that this is not necessarily the best
algorithm for solving convex LMI optimization problems. Other algorithms include the potential
reduction method [144], the projective method [94], positive definite programming [145], and the
linear complementarity problem formulation [77].
The XY-Centering Algorithm is cited from [64]. The idea for this algorithm is motivated by
the min/max algorithm of [36] proposed for the fixed-order stabilization problem. The min/max
algorithm has been extended for the linear quadratic suboptimal control problem in [65]. An
algorithm similar to the XY-Centering Algorithm (but with a different parameter space) has been
proposed in [112] to deal with the constantly scaled H∞ synthesis problem.
The VK-Centering Algorithm is a conceptually simple way of treating control problems with
design specifications related to input/output scaling and/or state similarity scaling (the Lyapunov
function) “V” (while “K” is the controller). This simple idea has been used in the µ/km -synthesis
literature (the DK iteration) [24, 117]. Other related results include [41, 58, 118].
Appendix A

Linear Algebra Basics

A.1 Partitioned Matrices


Definition A.1.1 Consider the partitioned matrix

    M = [ A  B ]
        [ C  D ].

Then (i) if A^−1 exists, a Schur complement of M is defined as D − C A^−1 B, and (ii) if D^−1
exists, a Schur complement of M is defined as A − B D^−1 C.

Theorem A.1.1 Suppose A, B, C, D are all n × n matrices. Then:

1.  det [ A  B ] = det[A] det[D − C A^−1 B]   provided det[A] ≠ 0;
        [ C  D ]

    det [ A  B ] = det[D] det[A − B D^−1 C]   provided det[D] ≠ 0.
        [ C  D ]

2. If A is an m × n matrix and B is an n × m matrix, then

       det[Im − AB] = det[In − BA].

3. If A is invertible, det[A^−1] = det[A]^−1.
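These identities are easy to spot-check numerically. The sketch below assumes numpy and uses random, well-conditioned placeholder data (the diagonal shifts only keep the required determinants away from zero):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3
A = rng.standard_normal((n, n)) + 3.0 * np.eye(n)   # shift keeps det[A] != 0
B = rng.standard_normal((n, n))
C = rng.standard_normal((n, n))
D = rng.standard_normal((n, n)) + 3.0 * np.eye(n)
M = np.block([[A, B], [C, D]])

# Item 1: det M via either Schur complement
d1 = np.linalg.det(A) * np.linalg.det(D - C @ np.linalg.inv(A) @ B)
d2 = np.linalg.det(D) * np.linalg.det(A - B @ np.linalg.inv(D) @ C)

# Item 2: det(I - AB) = det(I - BA) for rectangular A2 (m x n), B2 (n x m)
A2 = rng.standard_normal((2, 4))
B2 = rng.standard_normal((4, 2))
```

Item 2 is particularly useful in practice because it trades a large determinant (n × n) for a small one (m × m).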

Theorem A.1.2 (Matrix Inversion Lemma) Suppose A, B, C and D are n × n, n × p, p × p and
p × n matrices, respectively. Assume A^−1 and C^−1 both exist. Then

    (A + BCD)^−1 = A^−1 − A^−1 B (D A^−1 B + C^−1)^−1 D A^−1.              (A.11)

Theorem A.1.3 Suppose A, B, C and D are n × n, n × p, p × n and p × p matrices, respectively.

1. Assume A^−1 exists. Then

       [ A  B ]^−1   [ A^−1 + E G^−1 F   −E G^−1 ]
       [ C  D ]    = [     −G^−1 F          G^−1 ]                         (A.12)

   provided G^−1 exists, where

       G := D − C A^−1 B;    E := A^−1 B;    F := C A^−1.

2. Assume D^−1 exists. Then

       [ A  B ]^−1   [     H^−1           −H^−1 J     ]
       [ C  D ]    = [   −K H^−1      D^−1 + K H^−1 J ]                    (A.13)

   provided H^−1 exists, where

       H := A − B D^−1 C;    J := B D^−1;    K := D^−1 C.
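Both the Matrix Inversion Lemma (A.11) and the partitioned-inverse formula (A.12) can be verified numerically. The sketch below assumes numpy; the diagonal shifts and the 0.3 scaling are only there to keep every required inverse well-defined.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 4, 2
A = rng.standard_normal((n, n)) + 4.0 * np.eye(n)
B = 0.3 * rng.standard_normal((n, p))
C = rng.standard_normal((p, p)) + 4.0 * np.eye(p)
D = 0.3 * rng.standard_normal((p, n))
Ai = np.linalg.inv(A)

# Matrix Inversion Lemma (A.11)
lemma_rhs = Ai - Ai @ B @ np.linalg.inv(D @ Ai @ B + np.linalg.inv(C)) @ D @ Ai

# Partitioned inverse (A.12) for M = [[A, B], [Cb, Db]] with Cb p x n, Db p x p
Cb, Db = D, C
G = Db - Cb @ Ai @ B          # Schur complement of A in M
E = Ai @ B
Fm = Cb @ Ai
Gi = np.linalg.inv(G)
M_inv = np.block([[Ai + E @ Gi @ Fm, -E @ Gi],
                  [-Gi @ Fm, Gi]])
```

The lemma is most valuable when p is much smaller than n, since the only new inverse required has size p × p.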

A.2 Sign Definiteness of Matrices


Definition A.2.1 A Hermitian matrix Q ∈ C^{n×n} is called positive definite if

    x∗Qx > 0   for all x ≠ 0,                                              (A.21)

positive semidefinite if

    x∗Qx ≥ 0   for all x,                                                  (A.22)

negative definite if

    x∗Qx < 0   for all x ≠ 0,                                              (A.23)

and negative semidefinite if

    x∗Qx ≤ 0   for all x.                                                  (A.24)

Since the definiteness of the scalar x∗ Qx is a property only of the matrix Q, we need a test for
determining definiteness of a constant matrix Q. Define a principal submatrix of a square matrix
K as any square submatrix sharing some diagonal elements of K.

Theorem A.2.1 The constant Hermitian matrix K ∈ C n×n is

(a) Positive definite (K > 0) if either of these equivalent conditions holds:

1. All eigenvalues of K are positive,


2. All successive principal submatrices of K (minors of successively increasing size) have
positive determinants;

(b) Positive semidefinite (K ≥ 0) if either of these equivalent conditions holds:



1. All eigenvalues of K are zero or positive,


2. All principal submatrices of K have zero or positive determinants;

(c) Negative definite (K < 0) if either of these equivalent conditions holds:

1. All eigenvalues of (−K) are positive


2. All successive principal submatrices of (−K) have positive determinants;

(d) Negative semidefinite (K ≤ 0) if either of these equivalent conditions holds:

1. All eigenvalues of (−K) are zero or positive,


2. All principal submatrices of (−K) have zero or positive determinants.

Example A.2.1 Consider the matrix

    A = [ 1  2  1 ]
        [ 2  4  2 ]
        [ 1  2  0 ].

The eigenvalues of A are

    λ1 = −0.8541,   λ2 = 0,   λ3 = 5.8541.

Hence, A is an indefinite matrix (not a positive semidefinite or negative semidefinite matrix).
Note that the successive principal minors of A have the following determinants:

    1 > 0,    det [ 1  2 ] = 0,    det(A) = 0.
                  [ 2  4 ]

Hence, we see that nonnegativeness of all successive principal submatrices is not enough to
determine positive semidefiniteness of a matrix. We must check all principal submatrices. Here,
the principal minors

    det[0] = 0,    det [ 4  2 ] = −4 < 0
                       [ 2  0 ]

indicate indefiniteness.
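The computations in this example are easily reproduced (a sketch assuming numpy): the leading principal minors are all nonnegative, yet a non-leading principal minor is negative and the spectrum confirms indefiniteness.

```python
import numpy as np

A = np.array([[1.0, 2.0, 1.0],
              [2.0, 4.0, 2.0],
              [1.0, 2.0, 0.0]])

eigs = np.sort(np.linalg.eigvalsh(A))
# Successive (leading) principal minors: all nonnegative...
leading = [np.linalg.det(A[:j, :j]) for j in (1, 2, 3)]
# ...yet the principal submatrix on rows/columns {2, 3} has a negative determinant
sub = A[np.ix_([1, 2], [1, 2])]
```

This is why part (b) of Theorem A.2.1 requires checking all principal submatrices, not just the successive ones.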

Exercise A.2.1 Show that, for any matrix A,

(1) A∗ A ≥ 0

(2) A∗ A > 0 if and only if A has linearly independent columns.

The following result provides conditions for the positive definiteness and semidefiniteness of a
partitioned matrix in terms of its submatrices.

Lemma A.2.1 The following three statements are equivalent:

    (i)    [ A11     A12 ]
           [ A12^T   A22 ] > 0;                                            (A.25)

    (ii)   A22 > 0,   A11 − A12 A22^−1 A12^T > 0;                          (A.26)

    (iii)  A11 > 0,   A22 − A12^T A11^−1 A12 > 0.                          (A.27)

The following three statements are equivalent:

    (i)    [ A11     A12 ]
           [ A12^T   A22 ] ≥ 0;

    (ii)   A11 ≥ 0,   A22 − A12^T A11^+ A12 ≥ 0,   (I − A11 A11^+) A12 = 0;

    (iii)  A22 ≥ 0,   A11 − A12 A22^+ A12^T ≥ 0,   A12 (I − A22^+ A22) = 0.
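For the semidefinite case, the pseudoinverse conditions can be exercised numerically (a sketch assuming numpy): a matrix M = LL^T is positive semidefinite by construction, and condition (ii) can be checked even when A11 is singular; a small counterexample shows why the range condition (I − A11 A11^+)A12 = 0 cannot be dropped.

```python
import numpy as np

rng = np.random.default_rng(4)

def is_psd(S, tol=1e-9):
    return bool(np.all(np.linalg.eigvalsh(S) >= -tol))

L = rng.standard_normal((4, 2))
M = L @ L.T                        # 4 x 4, PSD, rank 2 (so A11 below is singular)
A11, A12, A22 = M[:3, :3], M[:3, 3:], M[3:, 3:]
A11p = np.linalg.pinv(A11)

cond_ii = (is_psd(A11)
           and is_psd(A22 - A12.T @ A11p @ A12)
           and np.allclose((np.eye(3) - A11 @ A11p) @ A12, 0))

# Counterexample: N = [[0, 1], [1, 1]] satisfies the two inequality parts of (ii)
# but violates the range condition, and is indeed indefinite (det N = -1).
N11, N12 = np.zeros((1, 1)), np.array([[1.0]])
range_ok = np.allclose((np.eye(1) - N11 @ np.linalg.pinv(N11)) @ N12, 0)
```

The range condition says that the columns of A12 must lie in the range of A11; without it, the Schur-complement inequality alone does not guarantee semidefiniteness.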

A.3 A Linear Vector Space


Thinking in an abstract manner, let L be a nonempty set of elements, and assume each pair of
elements x and y in L can be combined by a process called addition to yield an element z in L
denoted by
x+y =z

Definition A.3.1 A linear space L is defined by the following 8 properties:

1. x + y = y + x

2. x + (y + z) = (x + y) + z

3. There exists in L a unique element, denoted by 0, and called the zero element such that: for
all x in L
x+0=x

4. To each element x in L there corresponds a unique element in L, denoted by −x, and called
the negative of x, such that
x + (−x) = 0

Furthermore, each real (or complex) scalar α and each element x in L can be combined by a
process called scalar multiplication to yield an element y in L denoted by y = αx such that:

5. α(x + y) = αx + αy
Furthermore, for all scalars α and β

6. (α + β)x = αx + βx

7. (αβ)x = α(βx)

8. 1 · x = x.

The algebraic system L, defined by the two operations of addition and scalar multiplication
satisfying properties 1-8, is called a linear space. If all the scalars are real, then we
refer to the real linear space L; otherwise we refer to the complex linear space L.
The set of all real n-dimensional vectors of the form

vT = [v1 v2 · · · vn ]

satisfies properties 1-8 for real scalars α and so defines a real linear n-dimensional space L, which we
denote by
L = Rn . (A.31)

Definition A.3.2 A nonempty subset S of a linear space L is called a linear subspace of L if x + y


and αx are in S whenever x and y are in S for any scalar α.

Since S is a nonempty set, 0 · x = 0 and so the zero element 0 is in any subspace S of L.


A set of elements X = {x1 , x2 , · · · , xn } is said to be a spanning set for a linear subspace S of L
if every element s in S can be written as a linear combination of the xk . That is, we have

S = {s ∈ L : s = α1 x1 + α2 x2 + · · · + αn xn }                        (A.32)

for some scalars α1 , α2 , · · · , αn .


A spanning set X is said to be a basis for S if no element xk of the spanning set X of S can
be written as a linear combination of the remaining elements x1 , x2 , · · · , xk−1 , xk+1 , · · · , xn , i.e., xi ,
1 ≤ i ≤ n form a linearly independent set.
The subspace S in (A.32) is said to be of dimension n (or said to be n-dimensional) if {x1 , x2 , · · · , xn }
is a basis for S. For any linear space L, the largest subspace S is given by

S = L. (A.33)

Example A.3.1 Suppose L = Rn . Then

X = {e1 , e2 , · · · , en }

where
eTk = [0 0 · · · 0 1 0 · · · 0]

has a 1 in the k th position and zeros elsewhere, is a spanning set for Rn . A 3-dimensional subspace
S3 of L = Rn (assuming n ≥ 8) is defined by
S3 = {s ∈ Rn : s = α1 e1 + α6 e6 + α8 e8 }

for any scalars α1 , α6 and α8 .



The set of all m × n matrices of the form

A = [aij ] ; 1 ≤ i ≤ m, 1 ≤ j ≤ n

satisfies properties 1-8 for real scalars α, and so also defines a real linear space L which we denote by

L = Rm×n (A.34)

Example A.3.2 Let L(n, n) denote the linear space of all real n × n matrices. Then the following
sets are all linear subspaces of L(n, n):

1.

S1 = {A ∈ L(n, n) : A = AT } (A.35)

2.

S2 = {A ∈ L(n, n) : A = −AT } (A.36)

S1 is a subspace since (a) 0 = 0T and so belongs to S1 , and (b)

A = AT , B = BT implies A + B = (A + B)T

For similar reasons, S2 is also a subspace. The following sets X are not subspaces of L:

1. X = {A ∈ L(n, n) : det(A) = 1}

2. X = {A ∈ L(n, n) : A−1 exists }

A.4 Fundamental Subspaces of Matrix Theory


The geometric ideas of linear vector spaces have led to the concepts of "spanning a space" and
a "basis for a space". The idea now is to define four fundamental subspaces associated with a matrix.
The entire linear vector space of a specific problem can be decomposed into the sum of these subspaces.

A.4.1 Geometric Interpretations and Definitions

Definition A.4.1 The column space of A is the space spanned by the columns of A; it is also
called the range space of A, denoted by R[A]. The row space of A is the space spanned by the
rows of A.
Example A.4.1 Determine whether y = [ 1 ; −1 ] lies in the column space of

A = [ 1 −1 −3 ; 0 10 0 ]

Solution: The vector y lies in the column space of A if y can be expressed as a linear combination
of the columns of A. In other words, we need to decide if

y = [ 1 ; −1 ] = x1 [ 1 ; 0 ] + x2 [ −1 ; 10 ] + x3 [ −3 ; 0 ]   for some x1 , x2 , x3 ,

or equivalently to decide if

Ax = y   for some x.

Since the columns of A span the entire two-dimensional space, the answer is yes for any y. In the
present example, one can choose, for instance, x1 = 9/10, x2 = −1/10, x3 = 0.

Example A.4.2 Does the vector x = [ 1 ; 10 ; 0 ] lie within the row space of the A in Example A.4.1?
Solution: Note from the definition of row space that the row space of A is the column space of A^T,
denoted by R[A^T ]. For complex matrices, we write R[A∗ ]. Thus, the question reduces to "Does x
lie in the space spanned by the rows of A?" or "Does A∗ β = x for some β?":

[ 1 0 ; −1 10 ; −3 0 ] [ β1 ; β2 ] = [ 1 ; 10 ; 0 ]

In three-dimensional space the row space of A is the plane spanned by the two vectors [ 1 ; −1 ; −3 ]
and [ 0 ; 10 ; 0 ]. The vector [ 1 ; 10 ; 0 ] is not in this plane. Hence, the answer is no. Note from the
scalar equations

β1 = 1,  −β1 + 10β2 = 10,  −3β1 = 0

that the first and third equations are contradictory.



Since the column rank of a matrix is the dimension of the space spanned by the columns and
the row rank is the dimension of the space spanned by the rows, it is clear that the spaces R[A]
and R[A∗ ] have the same dimension r = rankA.

Definition A.4.2 The right null space of A is the space spanned by all vectors x that satisfy
Ax = 0, and is denoted N [A]. The right null space of A is also called the kernel of A. The left
null space of A is the space spanned by all vectors y satisfying y∗ A = 0. This space is denoted
N [A∗ ], since it is also characterized by all vectors y such that A∗ y = 0.
 
Example A.4.3 Does x = [ 1 ; 10 ; 0 ] lie in the right null space of the A in Example A.4.1?
Solution: The question reduces to deciding whether or not

Ax = 0.

Specifically,

[ 1 −1 −3 ; 0 10 0 ] [ 1 ; 10 ; 0 ] = [ −9 ; 100 ] ≠ [ 0 ; 0 ]

so the answer is no.

The dimensions of the four spaces R[A], R[A∗ ], N [A], and N [A∗ ] are to be determined now.
Since A is m × n, we have the following:

r = rankA = dimension of column space R[A],

dim N [A] = dimension of null space N [A],

n = total number of columns of A.

Hence,
r + dim N [A] = n

yields the dimension of the null space N [A],

dim N [A] = n − r

Now, do the same for A∗ , using the fact that rankA = rankA∗ ,

r = rank[A∗ ] = dimension of row space R[A∗ ],

dim N [A∗ ] = dimension of left null space, N [A∗ ],

m = total number of rows of A.



Hence,
r + dim N [A∗ ] = m

yields
dim N [A∗ ] = m − r
These facts are summarized below:

R[A∗ ] = row space of A: dimension r,
N [A] = right null space of A: dimension n − r.                          (A.41)
R[A] = column space of A: dimension r,
N [A∗ ] = left null space of A: dimension m − r.                         (A.42)

Note from these facts that the entire n-dimensional space can be decomposed into the sum of the
two subspaces R[A∗ ] and N [A]. Note also that the entire m-dimensional space can be decomposed
into the sum of the two subspaces R[A] and N [A∗ ].
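These dimension counts are just the rank-nullity theorem applied to A and A∗ ; a short numpy sketch (illustrative, using the A of Example A.4.1):

```python
import numpy as np

A = np.array([[1., -1., -3.],
              [0., 10.,  0.]])
m, n = A.shape

r = np.linalg.matrix_rank(A)     # dim R[A] = dim R[A^T] = r
dim_null = n - r                 # dim N[A]
dim_left_null = m - r            # dim N[A^T]

assert (r, dim_null, dim_left_null) == (2, 1, 0)
```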
 
Example A.4.4 For the A in Example A.4.1, express the vector x = [ 1 ; 10 ; 0 ] as the sum of
vectors in R[A∗ ] and N [A].
Solution: Write

x = α1 x1 + α2 x2 + α3 x3

where xi , i = 1, 2, 3, are vectors in either R[A∗ ] or N [A]. To find out how many are in each
subspace, compute dim R[A∗ ] = rank A = 2, and dim N [A] = 3 − 2 = 1. Hence, x1 and x2 will be
in R[A∗ ] and x3 will be in N [A]. By definition, x3 satisfies

Ax3 = 0  →  x31 − x32 − 3x33 = 0,  10x32 = 0  →  x3 = [ 3 ; 0 ; 1 ].

By definition, x1 , x2 satisfy

A∗ β1 = x1 ,  A∗ β2 = x2

for some β1 , β2 . Independent x1 and x2 are obtained by the choice β1 = [1, 0]∗ , β2 = [0, 1]∗ :

A∗ β1 = [ 1 ; −1 ; −3 ] = x1 ,  A∗ β2 = [ 0 ; 10 ; 0 ] = x2 .

Now, N [A] is one-dimensional and is spanned by x3 , and R[A∗ ] is two-dimensional and is spanned
by x1 , x2 . Solve now for α1 , α2 , α3 :

x = [ 1 ; 10 ; 0 ] = [ x1 x2 x3 ] [ α1 ; α2 ; α3 ] = [ 1 0 3 ; −1 10 0 ; −3 0 1 ] [ α1 ; α2 ; α3 ],

so that

[ α1 ; α2 ; α3 ] = [ 1 0 3 ; −1 10 0 ; −3 0 1 ]^{-1} [ 1 ; 10 ; 0 ] = [ 1/10 ; 101/100 ; 3/10 ].

Note that x3 is orthogonal to both x1 and x2 ; that is,

x3∗ x2 = 0,  x3∗ x1 = 0.

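The decomposition of Example A.4.4 can also be obtained with orthogonal projectors; the numpy sketch below (not part of the original) recovers the same N [A] component, (3/10) x3 :

```python
import numpy as np

A = np.array([[1., -1., -3.],
              [0., 10.,  0.]])
x = np.array([1., 10., 0.])

# Orthogonal projector onto the row space R[A^T]: P = A^+ A
P_row = np.linalg.pinv(A) @ A
x_row = P_row @ x      # component in R[A^T]
x_null = x - x_row     # component in N[A]

assert np.allclose(A @ x_null, 0)             # x_null lies in N[A]
assert np.allclose(x_null, [0.9, 0.0, 0.3])   # equals (3/10) x3, x3 = [3, 0, 1]
assert abs(x_row @ x_null) < 1e-10            # the two parts are orthogonal
```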
The following shows that this orthogonality is no accident.

Theorem A.4.1 N [A] and R[A∗ ] are orthogonal subspaces. This fact is denoted by R[A∗ ]⊥ =
N [A].

Proof. The meaning of theorem A.4.1 is that every vector in N [A] is orthogonal to every vector
in R[A∗ ]. To prove this, show that x ∈ N [A] satisfying

Ax = 0 (A.43)

is orthogonal to z ∈ R[A∗ ] satisfying


A∗ β = z. (A.44)

This is accomplished as follows:

x∗ z = x∗ A∗ β = (Ax)∗ β = 0

where (A.44) is used to obtain the second equality and (A.43) is used to obtain the last equality.
□
It should now be clear how to construct a basis for R[A∗ ] and N [A]: simply take the independent
rows of A as a basis for R[A∗ ], and take the vectors perpendicular to these independent rows of A
as a basis for N [A].
Example A.4.5 Graphically show the spaces N [A] and R[A∗ ] if

A = [ 1 10 ; 0 0 ].

Solution: Since r = rank A = 1, both N [A] and R[A∗ ] are one-dimensional: dim N [A] = n − r =
2 − 1 = 1, dim R[A∗ ] = r = 1. The independent row of A forms a basis for R[A∗ ]. Any vector
perpendicular to it is a basis for N [A].

Corollary A.4.1 R[A] and N [A∗ ] are orthogonal subspaces. This fact is denoted by R[A]⊥ =
N [A∗ ].

Proof. Substitute A for A∗ in Theorem A.4.1. □

Exercise A.4.1 Graphically show the spaces R[A] and N [A∗ ] if

A = [ 1 −10 ; 1 −10 ].

Express x = [ 1 ; 7 ] as the sum of vectors in R[A] and N [A∗ ]. Does x = [ 1 ; −1 ] lie in the left null
space of A?

Exercise A.4.2 Find the set of all vectors v such that

Av = 0

where

(i) A = [ 1 3 ; 4 12 ],   (ii) A = [ 1 2 ; 3 5 ; 6 9 ],   (iii) A = [ 4 1 6 ; 0 1 3 ].

A.4.2 Construction of the Fundamental Subspaces by SVD

Consider an SVD of the m × n matrix A of rank r:

A = [U1 U2 ] [ Σ1 0 ; 0 0 ] [ V1∗ ; V2∗ ] = UΣV∗                        (A.45)

where Σ1 > 0 and

U1 ∈ C m×r ,  U2 ∈ C m×(m−r) ,  Σ1 ∈ Rr×r ,
V1 ∈ C n×r ,  V2 ∈ C n×(n−r) .

Then the following interpretations of U1 , U2 , V1 , V2 construct the four fundamental subspaces.

Theorem A.4.2 1) U1 is an orthogonal basis for R[A], the range space of A, and all matrices
that lie in R[A] are given by U1 K1 for arbitrary K1 with r rows.

2) U2 is an orthogonal basis for N [A∗ ], the right null space of A∗ , and all matrices that lie in
N [A∗ ] are given by U2 K2 for arbitrary K2 with (m − r) rows.

3) V1 is an orthogonal basis for R[A∗ ], the range space of A∗ , and all matrices that lie in R[A∗ ]
are given by V1 P1 for arbitrary P1 with r rows.

4) V2 is an orthogonal basis for N [A], the right null space of A, and all matrices that lie in this
space are given by V2 P2 for arbitrary P2 with (n − r) rows.

Remark: The left null space of A is often denoted A⊥ . Hence, A⊥ A = 0, and A⊥ = K∗2 U∗2 for
any (m − r) × (m − r) matrix K2 .

Proof.

1) The range space of A is spanned by the set of vectors Ax for arbitrary x. Note from
the SVD that A = U1 Σ1 V1∗ ; hence

Ax = U1 Σ1 V1∗ x = U1 z

for arbitrary z, or z = K1 y for arbitrary K1 , y.

2) From the SVD

A∗ = [V1 V2 ] [ Σ1 0 ; 0 0 ] [ U1∗ ; U2∗ ]

and

A∗ [U1 U2 ] = [V1 V2 ] [ Σ1 0 ; 0 0 ] = [ V1 Σ1  0 ] .

Hence

A∗ U2 = 0

and for arbitrary K2

A∗ U2 K2 = 0.

3) The range space of A∗ is shown to be spanned by V1 by applying the arguments in 1) to the
SVD of A∗ .

4) From the SVD,

A [V1 V2 ] = [U1 U2 ] [ Σ1 0 ; 0 0 ] = [ U1 Σ1  0 ] .

Hence

AV2 P2 = 0

for arbitrary P2 .                                                       □
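Theorem A.4.2 translates directly into a numerical recipe: partition the SVD factors at the rank r. A numpy sketch (illustrative), using the rank-1 matrix of Example A.4.5:

```python
import numpy as np

# The A of Example A.4.5 (rank 1)
A = np.array([[1., 10.],
              [0.,  0.]])

U, s, Vh = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))           # numerical rank; here r = 1

U1, U2 = U[:, :r], U[:, r:]          # orthonormal bases for R[A], N[A^*]
V1, V2 = Vh[:r, :].T, Vh[r:, :].T    # orthonormal bases for R[A^*], N[A]

assert np.allclose(A.T @ U2, 0)      # columns of U2 span the left null space
assert np.allclose(A @ V2, 0)        # columns of V2 span the right null space
assert np.allclose(U1.T @ U1, np.eye(r))  # orthonormality of U1
```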
It is clear from the unitary properties of U and V,

U∗ U = [ U1∗ ; U2∗ ] [U1 U2 ] = [ I 0 ; 0 I ],   V∗ V = [ V1∗ ; V2∗ ] [V1 V2 ] = [ I 0 ; 0 I ],

that R[A] and N [A∗ ] are orthogonal since

(U1 K1 )∗ (U2 K2 ) = 0

and that R[A∗ ] and N [A] are orthogonal since

(V1 P1 )∗ (V2 P2 ) = 0.

Note that for any given matrix B ∈ C m×p and any given unitary matrix U ∈ C m×m there exists K
such that
B = UK.

Now suppose U is obtained from the SVD of any m × n matrix A ∈ C m×n as in (A.45). Then B
can be written, for some K1 , K2 ,

B = UK = [U1 U2 ] [ K1 ; K2 ] = U1 K1 + U2 K2 .                          (A.46)

From Theorem 2.9.2 and (A.46) we can say that, for any m × p matrix B and any m × n matrix A
(for any n), the matrix B can be decomposed into two orthogonal parts

B = B1 + B2 , B∗1 B2 = 0 (A.47)

where, for some K1 , K2

B1 = U1 K1
B2 = U2 K2 .

Equation (A.47) is often written as a direct sum of R[A] and N [A∗ ], using the notation

B = R[A] ⊕ N [A∗ ], (A.48)

where the orthogonality of R[A] and N [A∗ ] has already been shown.
Likewise B ∈ C m×p can be decomposed as the sum of the remaining two (of the four) funda-
mental subspaces. In the above decomposition we need not specify n, the column dimension of A.

In what follows we need not specify the row dimension. Let A be any q × m matrix; then any m × p
matrix B can be written

B = B3 + B4 ,  B∗3 B4 = 0

where

B3 = V1 P1 ∈ R[A∗ ],  B4 = V2 P2 ∈ N [A],
A = [U1 U2 ] [ Σ1 0 ; 0 0 ] [ V1∗ ; V2∗ ] .

A.5 Convex Sets


In this section the basic definitions of convexity of sets and functions are provided. These concepts
play a fundamental role in the computational techniques developed in the text.
Consider a finite dimensional vector space L. Often, in this book, L will be a subspace of C n×n
or Rn×n , for example the space of Hermitian or skew-Hermitian n × n matrices.

Definition A.5.1 A set K in L is said to be convex if for any two vectors x and y in K any vector
of the form (1 − λ)x + λy is also in K where 0 ≤ λ ≤ 1.

This definition merely says that given two points in a convex set, the line segment between
them is also in the set. Note in particular that subspaces and linear varieties (a linear variety is a
translation of linear subspaces) are convex. Also the empty set is considered convex. The following
facts provide important properties for convex sets. Their proofs may be found in [84].
Fact 1: Let Ci , i = 1, · · · , m, be a family of m convex sets in L. Then the intersection C1 ∩ C2 ∩ · · · ∩ Cm
is convex.
Fact 2: Let C be a convex set in L and xo ∈ L. Then the set {xo + x : x ∈ C} is convex.

Exercise A.5.1 Is the statement provided in Fact 1 true when the intersection is replaced by
union?

Exercise A.5.2 Show that sets S1 , S2 defined by (A.35), (A.36) are convex sets.

An important special case of a convex set is the convex cone.

Definition A.5.2 A set K in L is said to be a convex cone with vertex xo if K is convex and x ∈ K
implies that xo + λx ∈ K for any λ ≥ 0.

An important class of convex cones for our purposes is the one defined by the positive semidefinite
ordering of matrices, e.g. A1 ≥ A2 ≥ A3 .

Theorem A.5.1 Let P be a positive semidefinite n × n matrix. The set of matrices X ∈ C n×n such
that X ≥ P is a convex cone in C n×n .

Proof. Let A and B be n × n matrices such that A ≥ P and B ≥ P. Then for any x ∈ C n we have

x∗ (A − P)x ≥ 0

and
x∗ (B − P)x ≥ 0.
Consider a scalar 0 ≤ λ ≤ 1. Then

x∗ {[(1 − λ)A + λB] − P}x


= x∗ {(1 − λ)A + λB − (1 − λ)P − λP}x
= (1 − λ)x∗ (A − P)x + λx∗ (B − P)x ≥ 0

Hence
(1 − λ)A + λB ≥ P
i.e., the set of matrices X ≥ P is convex. To show that this set is a convex cone, note that for any
λ ≥ 0 and any X ≥ P we have P + λX ≥ P, since X ≥ P ≥ 0. □
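The two properties established in the proof can be spot-checked numerically; the numpy sketch below (with randomly generated matrices, not from the text) tests both the convexity and the cone property:

```python
import numpy as np

rng = np.random.default_rng(0)

def rand_psd(n):
    """Random positive semidefinite n x n matrix."""
    M = rng.standard_normal((n, n))
    return M @ M.T

n = 4
P = rand_psd(n)
A = P + rand_psd(n)    # A >= P
B = P + rand_psd(n)    # B >= P

# Convexity: every convex combination of A and B still dominates P
for lam in (0.0, 0.3, 0.7, 1.0):
    C = (1 - lam) * A + lam * B
    assert np.all(np.linalg.eigvalsh(C - P) >= -1e-10)

# Cone property with vertex P: P + lam*X >= P for lam >= 0, X >= P >= 0
X = P + rand_psd(n)
for lam in (0.0, 1.0, 5.0):
    assert np.all(np.linalg.eigvalsh(P + lam * X - P) >= -1e-10)
```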

Exercise A.5.3 Let Y ∈ C k×k be a given positive semidefinite matrix and let C ∈ C k×n , k ≤ n, be
a given matrix. Show that the set of matrices X ∈ C n×n such that CXC∗ ≥ Y is a convex
cone.

A.6 Matrix Inner Products and the Projection Theorem


Definition A.6.1 The trace inner product of two k × n matrices A and B is defined by trA∗ B
and is denoted by ⟨A, B⟩, that is

⟨A, B⟩ = trA∗ B.

Definition A.6.2 The Frobenius norm of a matrix A is defined by [trA∗ A]1/2 and is denoted by
‖A‖F , that is

‖A‖F = [trA∗ A]1/2 = [⟨A, A⟩]1/2 .

Two matrices A and B will be called orthogonal to each other if their inner product is zero,
that is, ⟨A, B⟩ = trA∗ B = 0.

Example A.6.1 Determine the conditions on α and β such that the following matrices

[ α 1 ; 0 −1 ; β β ] ,  [ 0 1 ; 5 1 ; 0 1 ]

are orthogonal to each other with respect to the trace inner product.
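The trace inner product and the Frobenius norm are one line each in code; the numpy sketch below (not part of the original) also suggests that the inner product in Example A.6.1 depends only on β:

```python
import numpy as np

def inner(A, B):
    """Trace inner product <A, B> = tr(A* B)."""
    return np.trace(A.conj().T @ B)

B = np.array([[0., 1.],
              [5., 1.],
              [0., 1.]])

# For the first matrix of Example A.6.1, <A, B> depends only on beta
for alpha, beta in [(2., 0.), (-1., 0.), (3., 4.)]:
    A = np.array([[alpha,  1.],
                  [0.,    -1.],
                  [beta,  beta]])
    assert np.isclose(inner(A, B), beta)

# Frobenius norm as [<A, A>]^{1/2}
A = np.array([[3., 0.], [0., 4.], [0., 0.]])
assert np.isclose(np.sqrt(inner(A, A)), np.linalg.norm(A, 'fro'))
```

This suggests that the two matrices are orthogonal exactly when β = 0, with α arbitrary.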

The following result is the classical projection theorem onto a linear subspace.

Theorem A.6.1 Consider a finite dimensional vector space L and let M be a subspace of L. For
any vector x in L there exists a unique vector yo in M such that ‖x − yo ‖ ≤ ‖x − y‖ for any vector
y in M. Furthermore, the vector x − yo is orthogonal to any vector in M. The vector yo is
called the orthogonal projection of x onto the subspace M and is denoted by yo = PM x.

The generalized projection theorem onto convex sets is as follows.

Theorem A.6.2 Consider a finite dimensional vector space L and let M be a closed convex set
in L. For any vector x in L there exists a unique vector yo in M such that ‖x − yo ‖ ≤ ‖x − y‖ for
any vector y in M. Furthermore, the vector x − yo satisfies ⟨x − yo , y − yo ⟩ ≤ 0 for any vector y in
M. The vector yo is called the projection of x onto the convex set M and is denoted
by yo = PM x.

In this text we are interested in orthogonal projections of matrices onto convex matrix constraint
sets. Expressions for the orthogonal projections onto some simple matrix constraint sets that are
important for control design are obtained in Chapter 10.
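Chapter 10 derives the projections actually used for design; as a generic illustration (a standard construction sketched in numpy with invented data, not necessarily the form used there), the Frobenius-norm projection of a symmetric matrix onto the positive semidefinite cone simply clips negative eigenvalues:

```python
import numpy as np

def project_psd(X):
    """Frobenius-norm projection of a symmetric X onto the PSD cone:
    zero out the negative eigenvalues in an eigendecomposition."""
    w, V = np.linalg.eigh((X + X.T) / 2)
    return V @ np.diag(np.clip(w, 0, None)) @ V.T

X = np.array([[1.,  0.],
              [0., -2.]])
Y = project_psd(X)

assert np.allclose(Y, [[1., 0.], [0., 0.]])
# Y is no farther from X (in Frobenius norm) than another PSD candidate
Z = np.eye(2)
assert np.linalg.norm(X - Y) <= np.linalg.norm(X - Z)
```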
Appendix B

Calculus of Vectors and Matrices

B.1 Vectors
Definition B.1.1 The derivative of the real scalar-valued function f (v) of an n-dimensional real
vector v, where v^T = [v1 v2 · · · vn ], vk real, is defined by

∂f (v)/∂v ≜ [ ∂f (v)/∂v1  ∂f (v)/∂v2  · · ·  ∂f (v)/∂vn ]^T             (B.11)

where the partial derivatives are defined by

∂f (v)/∂vk ≜ lim_{Δvk → 0} [ f (v + Δv) − f (v) ] / Δvk ,
Δv^T = [ 0 · · · Δvk · · · 0 ] .                                         (B.12)

Exercise B.1.1 Show that

∂(y^T Qx)/∂x = Q^T y    if x, y, Q are real,                             (B.13)
∂(y∗ Qx)/∂xR = Q^T y    if Q is real, x, y are complex (and x = xR + jxI ),   (B.14)
∂(x∗ Qx)/∂x = Q^T x     if Q is real, x is complex,                      (B.15)
∂(x∗ Qx)/∂x∗ = Qx       if Q is real, x is complex.                      (B.16)
Suppose f (x) is a real scalar function of a real vector x ∈ Rn . The first three terms of the
Taylor series expansion of f (x) about x0 (in terms of δx = x − x0 ) are

δf (x) ≜ f (x) − f (x0 ) = Σ_{α=1}^{n} (∂f (x)/∂xα ) δxα + (1/2) Σ_{β,γ=1}^{n} (∂^2 f (x)/∂xβ ∂xγ ) δxβ δxγ .   (B.17)

307

By defining the gradient and variational vectors by (B.11),

∂f (x)/∂x ≜ [ ∂f (x)/∂x1 · · · ∂f (x)/∂xn ]^T                            (B.18)

(δx)^T = [ δx1 · · · δxn ]                                               (B.19)

and the second derivative, called the Hessian matrix, by

∂^2 f (x)/∂x^2 ≜ (∂/∂x)(∂f /∂x)^T
= [ ∂^2 f /∂x1^2 · · · ∂^2 f /∂x1 ∂xn ; ... ; ∂^2 f /∂xn ∂x1 · · · ∂^2 f /∂xn^2 ]   (B.110)

(B.17) may be written in the compact vector-matrix form

δf (x) = (∂f /∂x)^T δx + (1/2) δx^T (∂^2 f /∂x^2 ) δx.                   (B.111)

Suppose one wishes to choose x so as to minimize a scalar function f (x). A necessary condition
is that small perturbations (from the minimizing x) in f (x) are not negative; δf (x) ≥ 0.

Exercise B.1.2 (i) Consider the function

f (x) = x1^2 + 2x2^2 − 2x1 x2 + x1 + x2 + 1.

Construct the matrix (called the Hessian)

Hij = ∂^2 f /∂xi ∂xj ,  i, j = 1, 2

to get

H = [ 2 −2 ; −2 4 ].

Note that since

Hij = ∂^2 f (x)/∂xi ∂xj = ∂^2 f (x)/∂xj ∂xi = Hji

the Hessian is symmetric for any twice continuously differentiable f (x).

(ii) Repeat exercise (i) in vector notation by writing f (x) = x∗ Qx + b∗ x + c (find Q, b, c
first).
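A finite-difference check of the gradient for this f (x) (a numpy sketch, not part of the text; the Hessian of this quadratic is the constant matrix [ 2 −2 ; −2 4 ]):

```python
import numpy as np

def f(x):
    x1, x2 = x
    return x1**2 + 2*x2**2 - 2*x1*x2 + x1 + x2 + 1

def num_grad(f, x, h=1e-6):
    """Central finite-difference gradient."""
    g = np.zeros_like(x)
    for k in range(len(x)):
        e = np.zeros_like(x); e[k] = h
        g[k] = (f(x + e) - f(x - e)) / (2*h)
    return g

x0 = np.array([0.5, -1.0])
# Analytic gradient: [2x1 - 2x2 + 1, 4x2 - 2x1 + 1]
g_exact = np.array([2*x0[0] - 2*x0[1] + 1, 4*x0[1] - 2*x0[0] + 1])
assert np.allclose(num_grad(f, x0), g_exact, atol=1e-5)

# The (constant) Hessian of this quadratic
H = np.array([[2., -2.], [-2., 4.]])
assert np.all(np.linalg.eigvalsh(H) > 0)   # f is strictly convex
```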

B.2 Matrices
The previous section shows how to differentiate a scalar function of a vector with respect to the
vector. This section shows how to differentiate a scalar function of a matrix with respect to the
matrix.
The derivative of a scalar f (A) with respect to a matrix A = [a1 , · · · , an ] is defined by

∂f /∂A = [ ∂f /∂a1 , · · · , ∂f /∂an ]
= [ ∂f /∂A11 · · · ∂f /∂A1n ; ... ; ∂f /∂An1 · · · ∂f /∂Ann ]            (B.21)

Note that

trAB = Σ_{α=1}^{k} Σ_{β=1}^{n} Aαβ Bβα = Σ_{β=1}^{n} Σ_{α=1}^{k} Bβα Aαβ = trBA          (B.22)

proves the identity

trAB = trBA                                                              (B.23)

Exercise B.2.1 Prove another useful identity

trAB = trAT BT . (B.24)

Now, the right-hand side of (B.23) can be expanded in terms of the columns of A,

trBA = tr [ b∗1 ; ... ; b∗n ] [ a1 , · · · , an ] = Σ_{β=1}^{n} b∗β aβ   (B.25)

where b∗i is defined as the ith row of B. Equation (B.25) readily leads to the conclusions

[ ∂(trAB)/∂aα ]^T = b∗α    or    ∂(trAB)/∂aα = bα                        (B.26)

See from (B.21) and (B.26) that the structure of ∂(trAB)/∂A is

∂(trAB)/∂A = [ ∂(trAB)/∂a1 , · · · , ∂(trAB)/∂an ] = [ b1 , · · · , bn ] = B^T .

This proves an identity worth remembering:

∂(trAB)/∂A = B^T .                                                       (B.27)
Exercise B.2.2 Prove that for any A ∈ C k×n , B ∈ C n×k ,

∂(trAB)/∂B = A^T .                                                       (B.28)

Example B.2.1 Derive (B.27) by appealing directly to (B.22).

Solution:

[ ∂(trAB)/∂A ]αβ = ∂(trAB)/∂Aαβ = ∂/∂Aαβ [ Σ_{ω=1}^{k} Σ_{γ=1}^{n} Aωγ Bγω ]
= Σ_{ω=1}^{k} Σ_{γ=1}^{n} (∂Aωγ /∂Aαβ ) Bγω = Σ_{ω=1}^{k} Σ_{γ=1}^{n} δαω δγβ Bγω = Bβα .

Hence,

[ ∂(trAB)/∂A ]αβ = Bβα  ⇒  ∂(trAB)/∂A = B^T                              (B.29)

Trace Identities
For convenient reference the following identities are recorded. Each identity can be derived by re-
peated application of (B.23), (B.24) and (B.27). The elements of A are assumed to be independent:

∂(trAB)/∂A = ∂(trA^T B^T )/∂A = ∂(trB^T A^T )/∂A = ∂(trBA)/∂A = B^T     (B.210)

∂(trBAC)/∂A = ∂(trB^T C^T A^T )/∂A = ∂(trC^T A^T B^T )/∂A = ∂(trACB)/∂A
            = ∂(trCBA)/∂A = ∂(trA^T B^T C^T )/∂A = B^T C^T              (B.211)

∂(trA^T BA)/∂A = ∂(trBAA^T )/∂A = ∂(trAA^T B)/∂A = (B + B^T )A          (B.212)

Using these basic ideas, a list of matrix calculus results is given below:

∂/∂X tr[AX^T ] = A
∂/∂X tr[AXB] = A^T B^T
∂/∂X tr[AX^T B] = BA
∂/∂X^T tr[AX] = A
∂/∂X^T tr[AX^T ] = A^T
∂/∂X^T tr[AXB] = BA
∂/∂X^T tr[AX^T B] = A^T B^T
∂/∂X tr[XX] = 2X^T
∂/∂X tr[XX^T ] = 2X
∂/∂X tr[X^n ] = n(X^{n−1} )^T
∂/∂X tr[AX^n ] = ( Σ_{i=0}^{n−1} X^i AX^{n−1−i} )^T
∂/∂X tr[AXBX] = A^T X^T B^T + B^T X^T A^T
∂/∂X tr[AXBX^T ] = A^T XB^T + AXB
∂/∂X tr[X^{-1} ] = −(X^{-1} X^{-1} )^T = −(X^{-2} )^T
∂/∂X tr[AX^{-1} B] = −(X^{-1} BAX^{-1} )^T
∂/∂X log det[X] = (X^{-1} )^T
∂/∂X det[X^T ] = ∂/∂X det[X] = (det[X])(X^{-1} )^T
∂/∂X det[X^n ] = n (det[X])^n (X^{-1} )^T
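Any of these identities can be spot-checked by finite differences; the numpy sketch below (illustrative, with random matrices) verifies ∂(tr[AXB])/∂X = A^T B^T :

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))
X = rng.standard_normal((4, 5))
B = rng.standard_normal((5, 3))

def g(X):
    return np.trace(A @ X @ B)

# Finite-difference derivative with respect to each entry of X
h = 1e-6
D = np.zeros_like(X)
for i in range(X.shape[0]):
    for j in range(X.shape[1]):
        E = np.zeros_like(X); E[i, j] = h
        D[i, j] = (g(X + E) - g(X - E)) / (2*h)

assert np.allclose(D, A.T @ B.T, atol=1e-5)   # matches the identity
```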
Exercise B.2.3 The elements of A are not always independent. Show for symmetric A that

∂(trAB)/∂A = B^T + B − diag[B].

where diag[B] = diag[· · · Bii · · ·].
Appendix C

Balanced Model Reduction

Consider the H∞ model reduction problem in Chapter 8. Obviously a suboptimal reduced-order
model Ĝ(s) exists that satisfies an H∞ bound ‖G(s) − Ĝ(s)‖∞ < γ for some γ. We shall construct
one such realization. Consider a realization with the following properties:

Ĝ(s) = Gb (s) = Cb (sI − Ab )^{-1} Bb

0 = Σ^{1/2} Ab^T + Ab Σ^{1/2} + Bb Bb^T
0 = Ab^T Σ^{1/2} + Σ^{1/2} Ab + Cb^T Cb

Σ = diag{σ1 , . . . , σn̂ },   γB = 2(σn̂+1 + · · · + σn ).

Such a realization (Ab , Bb , Cb ) is called a balanced realization. The following is an algorithm


to compute a balanced realization of a stabilizable, detectable model (A, B, C):

Step 1: Solve for X from

0 = XAT + AX + BBT . (C.01)

This is possible to do uniquely if and only if A has no pair of eigenvalues symmetric about
the imaginary axis (λi + λj ≠ 0 ∀i, j) and if the controllable modes are stable.

Step 2: Find the singular value decomposition of X:

X = [U11 U12 ] [ Σ1 0 ; 0 0 ] [ U11^T ; U12^T ] = U11 Σ1 U11^T .        (C.02)

Now, the columns of U11 span the controllable subspace and the columns of U12 span the
uncontrollable subspace, and

Σ1 = diag{σ11 , . . . , σ1nc },  U11 ∈ Rn×nc .                           (C.03)


Step 3: Solve for K:

0 = KA + AT K + CT C. (C.04)

This is possible if (λi + λj ≠ 0 ∀i, j) holds and if the observable modes are stable.

Step 4: Find the singular value decomposition

T1^T KT1 = [U21 U22 ] [ Σ2 0 ; 0 0 ] [ U21^T ; U22^T ] = U21 Σ2 U21^T ,  (C.05)

where T1 = U11 Σ1^{1/2} . Now, the columns of U21 span the subspace that is both controllable
and observable; the columns of U22 span the controllable but unobservable subspace; and

Σ2 = diag{σ21 , . . . , σ2nc0 },  U21 ∈ Rnc ×nc0 .                       (C.06)

A balanced realization is constructed as follows:

ẋb = Ab xb + Bb u,  xb ∈ Rnc0 ,
y = Cb xb ,
Ab = (Σ2^{1/4} U21^T Σ1^{-1/2} U11^T ) A (U11 Σ1^{1/2} U21 Σ2^{-1/4} ),  (C.07)
Bb = (Σ2^{1/4} U21^T Σ1^{-1/2} U11^T ) B,
Cb = C (U11 Σ1^{1/2} U21 Σ2^{-1/4} ).

Note from Step 2 that Σ1 contains all of the nonzero singular values of X and nc is the number
of controllable states (i.e., the controllable subspace is spanned by U11 ). Note from Step 4 that
Σ2 contains all of the nonzero singular values of T1^T KT1 , and hence nc0 is the dimension of the
controllable states that are also observable (i.e., the controllable and observable subspace is
spanned by U21 ).
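For a system that is already controllable and observable, the algorithm collapses to the standard balanced-realization construction. The numpy sketch below (illustrative; the Lyapunov solver is a naive Kronecker-based stand-in and the system matrices are invented) verifies that both gramians become equal and diagonal in the balanced coordinates:

```python
import numpy as np

def lyap(A, Q):
    """Solve A X + X A^T + Q = 0 by Kronecker vectorization
    (adequate for small examples; not a production solver)."""
    n = A.shape[0]
    L = np.kron(np.eye(n), A) + np.kron(A, np.eye(n))
    x = np.linalg.solve(L, -Q.reshape(-1, order='F'))
    return x.reshape((n, n), order='F')

# A small stable, controllable and observable example system
A = np.array([[-1., 0.], [1., -3.]])
B = np.array([[1.], [0.]])
C = np.array([[0., 1.]])

X = lyap(A, B @ B.T)       # controllability gramian
K = lyap(A.T, C.T @ C)     # observability gramian

# Balancing transformation: X = R R^T, then an SVD of R^T K R
R = np.linalg.cholesky(X)
U, s2, _ = np.linalg.svd(R.T @ K @ R)
T = R @ U @ np.diag(s2 ** -0.25)
Ti = np.linalg.inv(T)

Ab, Bb, Cb = Ti @ A @ T, Ti @ B, C @ T
Xb = lyap(Ab, Bb @ Bb.T)
Kb = lyap(Ab.T, Cb.T @ Cb)

# In balanced coordinates both gramians are equal and diagonal
assert np.allclose(Xb, Kb, atol=1e-8)
assert np.allclose(Xb, np.diag(np.diag(Xb)), atol=1e-8)
```

Truncating the balanced state to the largest entries of the common diagonal gramian then gives a reduced-order model.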

Exercise C.0.4 If the original system (A, B, C) is observable and controllable (that is, if nc0 =
nx ), show that Ab = Eb^{-1} AEb , Bb = Eb^{-1} B, Cb = CEb , where Eb = T1 T2 , T1 = U11 Σ1^{1/2} ,
and T2 = U21 Σ2^{-1/4} .

Exercise C.0.5 Verify that (C.07) is both controllable and observable even if the triple (A, B, C)
is not controllable and observable.
Bibliography

[1] B. D. O. Anderson and J. B. Moore. Optimal Control: Linear Quadratic Methods. Prentice
Hall, Englewood Cliffs, New Jersey, second edition, 1990.

[2] B. R. Barmish. Stabilization of uncertain systems via linear control. IEEE Trans. Auto.
Control, AC-28(8):848–850, 1983.

[3] A. Ben-Israel and T. N. Greville. Generalized Inverses: Theory and Applications. Wiley-Interscience,
New York, 1974.

[4] D. S. Bernstein and W. M. Haddad. Robust stability and performance analysis for state-space
systems via quadratic Lyapunov bounds. SIAM J. Matrix Anal. Appl., 11(2):239–271, 1990.

[5] D. S. Bernstein and D. C. Hyland. Optimal projection maximum entropy stochastic modelling
and reduced order design synthesis. IFAC Workshop on Model Errors and Compensation,
1985.

[6] S. Boyd and C. H. Barratt. Linear Controller Design: Limits of Performance. Prentice Hall,
Englewood Cliffs, New Jersey, 1991.

[7] S. Boyd and L. El Ghaoui. Method of centers for minimizing generalized eigenvalues. Linear
Algebra and Appl., 188,189:63–111, 1993.

[8] S. Boyd, L. El Ghaoui, E. Feron, and V. Balakrishnan. Linear Matrix Inequalities in System
and Control Theory. SIAM Studies in Applied Mathematics, 1994.

[9] J. P. Boyle and R. L. Dykstra. A method for finding projections onto the intersection of
convex sets in Hilbert space. Lecture Notes in Statistics, 37:28–47, 1986.

[10] A. E. Bryson and Y.-C. Ho. Applied Optimal Control. Hemisphere, New York, 1975.

[11] C.-T. Chen. Linear System Theory and Design. Holt, Rinehart and Winston, New York,
1984.

[12] W. Cheney and A. A. Goldstein. Proximity maps for convex sets. Proc. Amer. Math. Society,
12(2):448–450, 1959.


[13] A. Cohn. Über die Anzahl der Wurzeln einer algebraischen Gleichung in einem Kreise. Math.
Zeit., 14:110–148, 1922.

[14] E. Collins and R. Skelton. A theory of state covariance assignment for discrete systems. IEEE
Trans. Auto. Control, AC-32(1):35–41, January 1987.

[15] M. Corless. Control of uncertain nonlinear systems. ASME JDSCD, 115:362–372, 1993.

[16] M. Corless. Robust stability analysis and controller design with quadratic lyapunov functions.
Variable Structure and Lyapunov Control, 1993.

[17] M. Corless and L. Glielmo. On the exponential stability of singularly perturbed systems.
SIAM J. Control and Optimization, 30(6):1338–1360, 1992.

[18] M. Corless and G. Leitmann. Bounded controllers for robust exponential convergence. J.
Optimization Theory and Application, 76(1), 1993.

[19] M. Corless, G. Zhu, and R. E. Skelton. Robustness of covariance controllers. Proc. IEEE
Conf. Dec. Control, pages 2667–2672, 1989.

[20] C. Davis, W. M. Kahan, and H. F. Weinberger. Norm-preserving dilations and their applications
to optimal error bounds. SIAM J. Numerical Analysis, 19(3):445–469, 1982.

[21] C. de Villemagne and R. E. Skelton. Model reductions using a projection formulation. Int.
J. Control, 46:2141–2169, 1987.

[22] C. de Villemagne and R. E. Skelton. Controller reduction using a projection formulation.


IEEE Trans. on Automatic Control, 33(8), 1988.

[23] J. C. Doyle. Analysis of feedback systems with structured uncertainties. IEE Proc., 129, Part
D(6):242–250, 1982.

[24] J. C. Doyle. Synthesis of robust controllers and filters. Proc. IEEE Conf. Decision Contr.,
pages 109–114, 1983.

[25] J. C. Doyle, K. Glover, P. P. Khargonekar, and B. A. Francis. State-space solutions to


standard H2 and H∞ control problems. IEEE Trans. Automat. Contr., AC-34(8):831–847,
August 1989.

[26] J. C. Doyle, A. Packard, and K. Zhou. Review of LFTs, LMIs, and µ. Proc. IEEE Conf.
Decision Contr., pages 1227–1232, 1991.

[27] D. Enns. Model reduction with balanced realizations: An error bound and a frequency
weighted generalization. Proc. IEEE Conf. Decision Contr., pages 127–132, 1984.

[28] P. Finsler. Über das Vorkommen definiter und semidefiniter Formen in Scharen quadratischer
Formen. Commentarii Mathematici Helvetici, 1:19–28, 1937.

[29] B. A. Francis. A Course in H∞ Control Theory. Springer-Verlag, New York, 1987.

[30] H. Fujioka and S. Hara. State covariance assignment problem with measurement noise: A
unified approach based on a symmetric matrix equation. Linear Algebra Appl., 203-204:579–
605, 1994.

[31] M. Fujiwara. Über die algebraischen Gleichungen, deren Wurzeln in einem Kreise oder in einer
Halbebene liegen. Math. Zeit., pages 160–169, 1926.

[32] K. Furuta. Closed-form solution to discrete-time LQ optimal control and disturbance atten-
uation. Sys. Control Lett., pages 427–437, 1993.

[33] K. Furuta and S. Phoojaruenchanachai. An algebraic approach to discrete-time H∞ control


problems. Proc. American Contr. Conf., pages 3067–3072, 1990.

[34] P. Gahinet and P. Apkarian. A linear matrix inequality approach to H∞ control. Int. J.
Robust Nonlin. Contr., 4:421–448, 1994.

[35] X. Gang, C. Xuemin, G. Zhi, and F. Zuangang. Q-markov output covariance assignment
control of continuous systems. pages 137–140, 1993.

[36] J. C. Geromel, P. L. D. Peres, and S. R. Souza. Output feedback stabilization of uncertain


systems through a min/max problem. IFAC World Congress, 1993.

[37] M. Gevers and G. Li. Parametrizations in Control, Estimation and Filtering Problems.
Springer-Verlag, New York, 1993.

[38] K. Glover. All optimal Hankel norm approximations of linear multivariable systems and
L∞ -error bounds. Int. J. Contr., 26:1115–1193, 1984.

[39] K. Glover and J. Doyle. State-space formulae for all stabilizing controllers that satisfy an
H∞ norm bound and relations to risk sensitivity. Sys. Contr. Lett., 11:167–172, 1988.

[40] K. Glover and D. Limebeer. Robust multivariable control system design using optimal reduced
order plant models. ACC, page 644, 1983.

[41] K. C. Goh, L. Turan, M. G. Safonov, G. Papavassilopoulos, and J. Ly. Biaffine matrix


inequality properties and computational methods. Proc. American Contr. Conf., pages 850–
855, 1994.

[42] G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins, Baltimore, 1989.

[43] K. M. Grigoriadis and R. E. Skelton. Minimum energy covariance controllers. Proc. IEEE
Conf. Decision Contr., pages 823–824, December 1993.

[44] K. M. Grigoriadis and R. E. Skelton. Alternating convex projection methods for covariance
control design. Int. J. Control, 60(6):1083–1106, December 1994.

[45] K. M. Grigoriadis and R. E. Skelton. Low order control design for LMI problems using
alternating projection methods. Automatica, 1994. To appear.

[46] K. M. Grigoriadis and R. E. Skelton. Alternating convex projection methods for discrete-time
covariance control design. J. Optimization Theory Appl., 88(2):399–432, February 1996.

[47] L. G. Gubin, B. T. Polyak, and E. V. Raik. The method of projections for finding the common
point of convex sets. USSR Comp. Math. Phys., 7:1–24, 1967.

[48] W. M. Haddad and D. S. Bernstein. Robust stabilization with positive real uncertainty:
Beyond the small gain theorem. Sys. Contr. Lett., 17:191–208, 1991.

[49] S. P. Han. A successive projection method. Math. Program., 40:1–14, 1988.

[50] N. J. Higham. Computing the nearest symmetric positive semidefinite matrix. Linear Algebra
Appl., 103:103–118, 1988.

[51] R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge, New York, 1990.

[52] I. Horowitz. Synthesis of Feedback Systems. Academic Press, 1963.

[53] A. Hotz and R. E. Skelton. Covariance control theory. Int. J. Contr., 46(1):13–32, 1987.

[54] C. Hsieh and R. E. Skelton. All covariance controllers for linear discrete-time systems. IEEE
Trans. Automat. Contr., AC-35(8):908–915, 1990.

[55] A. Hurwitz. Über die Bedingungen, unter welchen eine Gleichung nur Wurzeln mit negativen
reellen Teilen besitzt. Math. Annalen, 46:273–284, 1895.

[56] T. Iwasaki. A unified matrix inequality approach to linear control design. Ph.D Dissertation,
Purdue University, West Lafayette, IN 47907, December 1993.

[57] T. Iwasaki. Robust performance analysis for systems with norm-bounded time-varying
structured uncertainty. Int. J. Robust Nonlinear Contr., 6:85–99, 1996.

[58] T. Iwasaki and M. A. Rotea. Fixed order scaled H∞ synthesis. Optimal Control Theory and
Applications, 1995. Submitted.

[59] T. Iwasaki and R. E. Skelton. A complete solution to the general H∞ control problem:
LMI existence conditions and state space formulas. Proc. American Contr. Conf., pages
605–609, 1993.

[60] T. Iwasaki and R. E. Skelton. All controllers for the general H∞ control problem: LMI
existence conditions and state space formulas. Automatica, 30(8):1307–1317, 1994.

[61] T. Iwasaki and R. E. Skelton. On the observer-based structure of covariance controllers. Sys.
Contr. Lett., 22:17–25, 1994.

[62] T. Iwasaki and R. E. Skelton. A unified approach to fixed order controller design via linear
matrix inequalities. Proc. American Contr. Conf., pages 35–39, 1994.

[63] T. Iwasaki and R. E. Skelton. Parametrization of all stabilizing controllers via quadratic
Lyapunov functions. J. Optimiz. Theory Appl., 85:291–307, 1995.

[64] T. Iwasaki and R. E. Skelton. The XY-centering algorithm for the dual LMI problem: a new
approach to fixed order control design. Int. J. Contr., 62:1257–1272, 1995.

[65] T. Iwasaki, R. E. Skelton, and J. C. Geromel. Linear quadratic suboptimal control with static
output feedback. Sys. Contr. Lett., 23:421–430, 1994.

[66] D. H. Jacobson. Extensions of Linear-Quadratic Control, Optimization and Matrix Theory.
Academic Press, 1977.

[67] E. I. Jury. Inners and Stability of Dynamic Systems. Wiley-Interscience, N.Y., 1974.

[68] T. Kailath. Linear Systems. Prentice Hall, New Jersey, 1980.

[69] R. E. Kalman and J. E. Bertram. Control system analysis and design via the second method
of Lyapunov I: Continuous-time systems. J. Basic Engineering, 82:371–393, June 1960.

[70] L. H. Keel, S. P. Bhattacharyya, and J. W. Howze. Robust control with structured
perturbations. IEEE Trans. Automat. Contr., AC-33(1):68–78, January 1988.

[71] P. P. Khargonekar, I. R. Petersen, and M. A. Rotea. H∞ optimal control with state feedback.
IEEE Trans. Automat. Contr., AC-33(8):786–788, 1988.

[72] P. P. Khargonekar, I. R. Petersen, and K. Zhou. Robust stabilization of uncertain linear
systems: Quadratic stabilizability and H∞ control theory. IEEE Trans. Automat. Contr.,
AC-35(3):356–361, 1990.

[73] C. G. Khatri and S. K. Mitra. Hermitian and nonnegative definite solutions of linear matrix
equations. SIAM J. Appl. Math., 14(4):579–585, 1976.

[74] T. Kimura. A fault tolerant controller design using alternating projection techniques. M.S.
Thesis, Purdue University, West Lafayette, IN 47907, December 1994.

[75] A. M. King, U. B. Desai, and R. E. Skelton. A generalized approach to Q-Markov covariance
equivalent realizations of discrete systems. Automatica, pages 507–515, 1988.

[76] V. C. Klema and A. J. Laub. The singular value decomposition: Its computation and some
applications. IEEE Trans. Automatic Control, 25(2):164–176, 1980.

[77] M. Kojima, S. Shindoh, and S. Hara. Interior-point methods for the monotone linear
complementarity problem in symmetric matrices. Tech. Rep., Dept. Info. Sci., Tokyo Inst.
Tech., 1994.

[78] H. Kwakernaak and R. Sivan. Linear Optimal Control Systems. Wiley-Interscience, 1972.

[79] J. P. LaSalle. The Stability and Control of Discrete Processes. Springer-Verlag, 1986.

[80] G. Leitmann. Guaranteed asymptotic stability for some linear systems with bounded uncer-
tainties. J. Dyn. Sys., Meas. Contr., 101:202–216, 1979.

[81] A. Liénard and M. H. Chipart. Sur le signe de la partie réelle des racines d'une équation
algébrique. J. Math. Pures et Appl., 10:291–346, 1914.

[82] Y. Liu. Frequency-weighted controller and model order reduction in linear systems design.
Ph.D. dissertation, Australian National University, 1989.

[83] W. M. Lu and J. C. Doyle. H∞ control of LFT systems: An LMI approach. Proc. IEEE
Conf. Decision Contr., pages 1997–2001, 1992.

[84] D. G. Luenberger. Optimization by Vector Space Methods. John Wiley, 1968.

[85] A. M. Lyapunov. Problème général de la stabilité du mouvement. Ann. Fac. Sci. Toulouse,
9:203–474, 1907.

[86] A. Madiwale, W. Haddad, and D. Bernstein. Robust H∞ control design for systems with
structured parametric uncertainty. Sys. Contr. Lett., 12:393–407, 1989.

[87] J. R. Magnus. L-structured matrices and linear matrix equations. Lin. Multilin. Algebra,
14:67–88, 1983.

[88] M. Mansour. Stability criteria of linear systems and the second method of Lyapunov. Scientia
Electrica, XI(Fasc. 3):87–96, 1965.

[89] J. L. Massera. Contributions to stability theory. Ann. Math., 64:182–206, 1956.

[90] A. Megretski. Necessary and sufficient conditions of stability: a multiloop generalization of
the circle criterion. IEEE Trans. Auto. Contr., AC-38(5):753–756, 1993.

[91] B. Moore. Principal component analysis in linear systems: Controllability, observability and
model reduction. IEEE Trans. Automat. Contr., AC-26(1):17–31, 1981.

[92] W. J. Naeije and O. H. Bosgra. The design of dynamic compensators for linear multivariable
systems. IFAC, Fredrericton, Canada, pages 205–212, 1977.

[93] M. Nagayasu. Realization of prescribed state covariance for linear state feedback control
systems with disturbances. Technical Report of National Aerospace Laboratory, TR-492:1–15,
1977. In Japanese.

[94] A. Nemirovskii and P. Gahinet. The projective method for solving linear matrix inequalities.
Proc. American Contr. Conf., pages 840–844, 1994.

[95] A. Ohara and T. Kitamori. Geometric structures of stable state feedback systems. Proc.
IEEE Conf. Decision Contr., pages 2494–2499, 1990.

[96] A. Ohara and T. Kitamori. Geometric structures of stable state feedback systems. IEEE
Trans. Automat. Contr., July 1993.

[97] A. Packard and J. Doyle. The complex structured singular value. Automatica, 29(1):71–109,
1993.

[98] A. Packard and J. C. Doyle. Quadratic stability with real and complex perturbations. IEEE
Trans. Automat. Contr., AC-35(2):198–201, 1990.

[99] A. Packard, K. Zhou, P. Pandey, and G. Becker. A collection of robust control problems
leading to LMIs. Proc. IEEE Conf. Decision Contr., pages 1245–1250, 1991.

[100] P. C. Parks. Further comment on Ralston (1962). IEEE Trans. Automatic Control, AC-
8(3):270–271, 1963.

[101] P. C. Parks. A new proof of the Hurwitz stability criterion by the second method of
Lyapunov with applications to optimum transfer functions. IEEE Trans. Automatic Control,
AC-9(3):319–322, 1963.

[102] P. C. Parks. Lyapunov and the Schur-Cohn stability criterion. IEEE Trans. Automatic Control,
AC-9(1):121, 1964.

[103] S. Parrott. On a quotient norm and the Sz-Nagy Foias lifting theorem. J. Functional Analysis,
30:311–328, 1978.

[104] R. A. Penrose. A generalized inverse for matrices. Proc. Cambridge Phil. Soc., 52:17–19,
1955.

[105] I. R. Petersen. Disturbance attenuation and H∞ optimization: A design method based on


the algebraic Riccati equation. IEEE Trans. Automat. Contr., AC-32(5):427–429, May 1987.

[106] I. R. Petersen and C. V. Hollot. A Riccati equation approach to the stabilization of uncertain
linear systems. Automatica, 22(4):397–411, 1986.

[107] A. Ralston. A symmetric matrix formulation of the Hurwitz-Routh criterion. IEEE Trans.
Automatic Control, AC-7:50–51, 1962.

[108] A. C. M. Ran and R. Vreugdenhil. Existence and comparison theorems for algebraic Riccati
equations for continuous and discrete time systems. Linear Algebra Appl., 99:63–83, 1988.

[109] C. R. Rao and S. K. Mitra. Generalized Inverse of Matrices and its Applications. John Wiley
& Sons, 1971.

[110] M. A. Rotea. The generalized H2 control problem. Automatica, 29(2):373–386, 1993.



[111] M. A. Rotea, M. Corless, D. Da, and I. R. Petersen. Systems with structured uncertainty:
relations between quadratic and robust stability. IEEE Trans. Automat. Contr., AC-38(5):799–
803, 1993.

[112] M. A. Rotea and T. Iwasaki. An alternative to the D-K iteration? Proc. American Contr.
Conf., pages 53–57, 1994.

[113] M. A. Rotea and P. P. Khargonekar. H2 -optimal control with an H∞ -constraint: The state
feedback case. Automatica, 27(2):307–316, 1991.

[114] E. J. Routh. Dynamics of a System of Rigid Bodies. 6th edition, Pt. II, 221, London, 1905.

[115] W. Rudin. Principles of Mathematical Analysis. McGraw-Hill, 1976.

[116] M. G. Safonov. Stability margins of diagonally perturbed multivariable feedback systems.
IEE Proc., 129, Part D(6):251–256, November 1982.

[117] M. G. Safonov. L∞ -optimal sensitivity vs. stability margin. Proc. IEEE Conf. Decision
Contr., pages 115–118, 1983.

[118] M. G. Safonov, K. C. Goh, and J. H. Ly. Control system synthesis via bilinear matrix
inequalities. Proc. American Contr. Conf., pages 45–49, 1994.

[119] M. Sampei, T. Mita, and M. Nakamichi. An algebraic approach to H∞ output feedback
control problems. Sys. Contr. Lett., 14:13–24, 1990.

[120] C. Scherer. H∞ -control by state feedback for plants with zeros on the imaginary axis. SIAM
J. Contr. Opt., 30(1):123–142, 1992.

[121] C. Scherer. H∞ -optimization without assumptions on finite or infinite zeros. SIAM J. Contr.
Opt., 30(1):143–166, 1992.

[122] S. Schonemann. A generalized solution to the orthogonal Procrustes problem. Psychometrika,
31:1–10, 1966.

[123] I. Schur. Über Potenzreihen, die im Innern des Einheitskreises beschränkt sind. Crelle's J.,
147:205–232, 1917 and 148:122–145, 1918.

[124] H. Schwarz. Ein Verfahren zur Stabilitätsfrage bei Matrizen-Eigenwertproblemen. Zeit. F.
Angew Math. u. Physik, pages 473–500, 1956.

[125] J. S. Shamma. Robust stability with time-varying structured uncertainty. IEEE Trans. Auto.
Contr., AC-39(4):714–724, 1994.

[126] R. E. Skelton. Dynamic Systems Control. Wiley, 1988.

[127] R. E. Skelton. Model error concepts in control design. Int. J. Contr., 49(5):1725–1753, 1989.

[128] R. E. Skelton. Increased roles of linear algebra in control education. Proc. Amer. Control
Conf., pages 393–397, 1994.

[129] R. E. Skelton. Model validation for control design. Mathematical Modelling of Systems, 1996.

[130] R. E. Skelton and M. Ikeda. Covariance controllers for linear continuous-time systems. Int.
J. Contr., 49(5):1773–1785, 1989.

[131] R. E. Skelton and T. Iwasaki. Liapunov and covariance controllers. Int. J. Contr., 57(3):519–
536, 1993.

[132] V. Sreeram and P. Agathoklis. The generation of Q-Markov covers via the inverse solution
of the Lyapunov equation. 30th IEEE CDC, 138(6), 1991.

[133] V. Sreeram and P. Agathoklis. On the theory of state-covariance assignment for linear SISO
systems. Proceedings of the 1992 ACC, 1992.

[134] V. Sreeram and P. Agathoklis. Solution of lyapunov equation with system matrix in com-
panion form. IEE Proceedings-D, 138(6):529–534, 1991.

[135] V. Sreeram and P. Agathoklis. On covariance control theory for linear continuous systems.
Proceedings of the 31st CDC, 1992.

[136] V. Sreeram and P. Agathoklis. On the theory of state-covariance assignment for single-input
linear discrete systems. Proceedings of the 1992 ACC, 1992.

[137] R. F. Stengel. Stochastic Optimal Control: Theory and Applications. Wiley, New York, 1986.

[138] A. A. Stoorvogel. The singular H∞ control problem with dynamic measurement feedback.
SIAM J. Contr. Opt., 29(1):160–184, 1991.

[139] A. A. Stoorvogel. The H∞ Control Problem: A State Space Approach. Prentice Hall, 1992.

[140] A. A. Stoorvogel. The robust H2 control problem: a worst case design. IEEE Trans. Automat.
Contr., AC-38(9):1358–1370, 1993.

[141] A. A. Stoorvogel and H. L. Trentelman. The quadratic matrix inequality in singular H∞
control with state feedback. SIAM J. Contr. Opt., 28(5):1190–1208, 1990.

[142] W. Sun, P. Khargonekar, and D. Shim. Solution to the positive real control problem for linear
time-invariant systems. IEEE Trans. Automat. Contr., AC-39(10):2034–2046, October 1994.

[143] A. Tikku and K. Poolla. Robust performance against slowly-varying structured perturbations.
Proc. IEEE Conf. Dec. Contr., pages 990–995, 1993.

[144] L. Vandenberghe and S. Boyd. A primal-dual potential reduction method for problems in-
volving matrix inequalities. Mathematical Programming, Series B, June 1993.

[145] L. Vandenberghe and S. Boyd. Positive definite programming. SIAM Review, 1994.

[146] D. Wagie. Model reduction and controller synthesis in the presence of parameter uncertainty.
Automatica, 2:295–308, 1986.

[147] H. Wall. Polynomials whose zeros have negative real parts. Amer. Math. Monthly, 52:308–322,
1945.

[148] M. A. Wicks and R. A. DeCarlo. Gramian assignment based on the Lyapunov equation.
IEEE Trans. Automat. Contr., AC-35(4):465–468, 1990.

[149] B. Wie and D. S. Bernstein. A benchmark problem for robust control design. Proc. American
Contr. Conf., pages 961–962, May 1990.

[150] H. Wilf. A stability criterion for numerical integration. J. Assoc. Comp. Mach., 6:363, 1959.

[151] J. C. Willems. Least squares stationary optimal control and the algebraic Riccati equation.
IEEE Trans. Automat. Contr., AC-16:621–634, 1971.

[152] J. L. Willems. Stability Theory of Dynamical Systems. John Wiley & Sons, 1970.

[153] D. Williamson. Digital Control and Implementation: Finite Wordlength Considerations.
Prentice Hall International, 1991.

[154] D. Williamson and R. E. Skelton. Linear Algebra with Engineering Applications. Book in
preparation.

[155] D. A. Wilson. Convolution and Hankel operator norms for linear systems. IEEE Trans.
Automat. Contr., AC-34(1):94–98, 1989.

[156] J. H. Xu and R. E. Skelton. Plant covariance equivalent controller reduction for discrete
systems. Proc. IEEE Conf. Decision Contr., 30:2668–2669, December 1991.

[157] C. Xuemin, X. Gang, G. Zhi, and F. Zuangang. Pole assignment in state covariance control.
pages 48–51, 1993.

[158] K. Yasuda and R. E. Skelton. Assigning controllability and observability grammians in feed-
back control. AIAA J. Guidance, 14(5):878–885, 1990.

[159] K. Yasuda, R. E. Skelton, and K. M. Grigoriadis. Covariance controllers: A new parametriza-
tion of the class of all stabilizing controllers. Automatica, 29(3):785–788, 1993.

[160] D. C. Youla, H. A. Jabr, and J. J. Bongiorno. Modern Wiener-Hopf design of optimal
controllers: Part 2. IEEE Trans. Automat. Contr., AC-21:319–338, 1976.

[161] G. Zames. Feedback and optimal sensitivity: Model reference transformations, multiplicative
seminorms, and approximate inverses. IEEE Trans. Automat. Contr., AC-26:301–320, 1981.

[162] K. Zhou and P. P. Khargonekar. An algebraic Riccati equation approach to H∞ optimization.
Sys. Contr. Lett., 11:85–92, 1988.

[163] G. Zhu, K. M Grigoriadis, and R. E. Skelton. Covariance control design for the Hubble space
telescope. J. Guidance, Control and Dynamics, 18(2):230–236, 1995.

[164] G. Zhu, M. A. Rotea, and R. E. Skelton. A convergent algorithm for the output covariance
constrained control problem. Submitted for publication.

[165] G. Zhu and R. E. Skelton. Mixed L2 and L∞ problems by weight selection in quadratic
optimal control. Int. J. Contr., 53(5):1161–1176, 1991.
Index

Algorithm for assignable covariances, 114 robust L∞ , 229


Alternating convex projection method, 239, 241 robust `∞ , 236
directional, 245, 246, 248, 261–263 SSUB µ, 273, 284
optimal, 244 stabilizing, 223, 226, 231
standard, 242–245, 247 suboptimal, 265, 266, 279, 289
Analytic center, 268, 270 Controllability, 51, 58
output, 71
Balanced realization, 313
state, 75
Bounded real lemma, 190, 201
Controllable observable subspace, 314
Calculus of vectors and matrices, 307 Controllable subspace, 313
Cayley-Hamilton theorem, 59 Controller
Central estimator, 139 complexity, 151
Change of variables, 171 structure, 131, 135, 139, 141
Characteristic polynomial, 72, 118, 119, 128 Convex, 239
Congruent transformation, 47, 102 feasibility problem, 160, 168, 171, 178, 184,
Control problem 265, 266
H∞ , 185–187, 189, 190, 198, 202, 226 optimization problem, 265
L∞ , 225 programming, 274
covariance, 111, 112, 140, 146, 147, 149, 150, set, 239, 241, 304
160 Convex projection method
covariance upper bound, 157–160, 163, 165, optimal, 244
169, 174, 176, 178 Covariance
covariance upperbound, 222–225, 232, 267 algorithm, 114
finite wordlength covariance, 151 analysis, 75, 76
fixed structure, 283, 284 assignable, 112, 114, 160, 167, 179
fixed-order, 260, 272 assignment, 142, 145, 148, 154, 156
LQ, 162, 289 deterministic, 69, 70, 76
LQG, 172, 182 equation, 129, 130, 136
LQR, 187, 224, 225, 233, 265, 271, 279, 281 feasibility problem, 247, 251
optimal, 147, 150, 266, 279 optimization problem, 251
positive real, 227, 228 output, 69, 71, 81, 87, 90, 95, 98, 104, 105,
robust H∞ , 230, 237 108, 157–159, 161, 170, 174
robust H2 , 228, 229, 235, 236 state, 69


stochastic, 73 Kalman filter, 139


upper bound, 157, 158, 160, 171, 180, 184 Kronecker matrix algebra, 146
upperbound, 223
Linear
subspace, 295
Derivative, 309 vector space, 294
Detectability, 137 LMI (Linear matrix inequality), 91, 160, 168,
Devices 171, 178, 188, 192, 194, 195, 221, 222,
A/D, 151 227, 231, 238, 259, 269, 270
D/A, 151 LQG (Linear quadratic Gaussian), 139, 172
Disturbance attenuation, 98 LQR (Linear quadratic regulator), 187, 224
Dual LMI problem, 272, 274–277, 279 Lyapunov
equation, 81
Eigenvectors, 59
function, 64, 96, 104
Energy of signal, 80
inequality, 82, 83, 88, 89, 107, 157–159, 163–
Estimation error covariance, 139
165, 170, 174, 176, 178, 179, 186
Feasibility problem, 207, 240 matrix, 186, 222, 223, 225, 229–232, 236–238
Feasible stability, 62
domain, 269
Matrix, 121, 124
point, 246
negative definite, 293
Finite-dimensional vector space, 239
defective, 50
Finsler’s theorem, 41, 44, 46, 175, 177
Hermitian, 26
Fixed-point arithmetic, 152
inequality, 158
Fundamental subspaces, 296, 301
inversion lemma, 291
column space, 296
negative semidefinite, 293
orthogonal subspaces, 300
orthogonal, 41, 133
range space, 296
positive definite, 97, 292
right null space, 298
positive semidefinite, 292
Global convergence theorem, 276 signature Hankel, 121
Gradient, 268 signature Toeplitz, 123
skew-Hermitian, 35, 38
Hankel signature, 121 skew-symmetric, 147
Hankel singular values, 206 square root of, 97
Hessian, 268 unitary, 26
Hurwitz-Routh test, 117, 125 Method of centers, 269
Min/max algorithm, 289
Infeasible optimization, 241, 243
Minimal energy covariance control, 145
Infinite precision implementation, 151
Model reduction, 205
Jordan form, 55 H∞ , 205–207, 209, 210

γ-suboptimal H∞ , 206 onto the assignability set, 252


balanced, 313 onto the block covariance constraint set, 253
covariance error bounds, 213 onto the output cost constraint set, 255
covariance upperbound, 213, 217 onto the positivity set, 253
Hankel, 206 onto the variance constraint set, 253
Moore-Penrose Inverse, 27 orthogonal, 242, 258
Projection Theorem, 306
Newton’s method, 268, 271
Noise gain, 151 QMI (Quadratic matrix inequality), 230, 231
Norm Quantization error, 151–153
H∞ , 85, 91, 97, 99, 105, 185, 187
Rank constraint, 267
H2 , 84, 98, 107, 183, 187
Riccati
L∞ , 80
equation, 90, 138, 140, 144, 145, 160, 162,
L2 , 80, 98, 187
173, 183, 184, 198
`∞ , 87
inequality, 91, 178, 183, 194, 195
`2 , 87, 88
Euclidean, 80 Sampling
Frobenius, 31, 184 skewed, 151, 153, 156
Hankel, 206 synchronous, 152
Scaling matrix, 222, 229, 230, 235–238
Observability, 51, 56, 60
Schur complement, 43, 82, 113, 172, 173, 175,
gramian, 107
183, 188, 190, 192, 195, 199, 201, 291
Observer based control, 135, 139, 142
Separation principle, 139
Orthogonal projection theorem, 242
Signal processing, 151
Output feedback, 129, 141
Signature Hankel structure, 121
dynamic, 131, 165, 178, 190, 202
Signature Toeplitz, 124
full-order dynamic, 169, 180
Singular value, 26, 27, 114, 149
static, 130, 163, 164, 176, 178, 189, 200
decomposition, 27, 259, 313
Performance Small gain theorem, 96, 97, 105
analysis, 93 Spectral decomposition, 26, 50
robust, 72, 98–106, 108, 109 Spectral radius, 288
robust H∞ , 99 Stability
robust H2 , 98, 105 Q-, 97, 100, 105
robust L∞ , 99 asymptotic, 64
robust `∞ , 105 Lyapunov, 62, 232
Positive real, 227 quadratic, 96–98, 104, 105
Projection robust, 93, 95, 97, 104, 185
for covariance control, 252 Stabilizability, 137
methods, 239 Stable polynomial, 118, 119, 124
onto constraint sets, 242 Standard alternating projection theorem, 242

Standard assumptions, 194, 198, 200, 202


State
estimator, 139
feedback, 112, 129, 130, 139–141, 147, 159,
187, 198, 248
State feedback for single input systems, 117
Strictly proper system, 82
Successive centering methods, 265
Sylvester equation, 146
System gain, 80, 81, 84, 87, 109
energy to energy, 87
energy to peak, 87
impulse to energy, 80
pulse to energy, 87

Trace identities, 310

Uncertainty
norm bounded, 104
structured, 95, 97–99, 103
unstructured, 97
Uncontrollable subspace, 313
Unitary coordinate transformation, 123

VK-centering algorithm, 285, 288, 289

White noise, 158

XY-centering algorithm, 273, 274, 276, 277, 279,


285, 289

