Lecture Notes in Control and Information Sciences

Edited by A. V. Balakrishnan and M. Thoma

42

Advances in Filtering
and Optimal Stochastic Control
Proceedings of the IFIP-WG 7/1 Working Conference
Cocoyoc, Mexico, February 1-6, 1982

Edited by
W.H. Fleming and L.G. Gorostiza

Springer-Verlag
Berlin Heidelberg New York 1982
Series Editors
A. V. Balakrishnan • M. Thoma

Advisory Board
L. D. Davisson • A. G. J. MacFarlane • H. Kwakernaak
J. L. Massey • Ya. Z. Tsypkin • A. J. Viterbi

Editors
Wendell H. Fleming
Division of Applied Mathematics
Brown University
Providence, Rhode Island 02912
USA

Luis G. Gorostiza
Departamento de Matemáticas
Centro de Investigación y de Estudios Avanzados del IPN
Apartado Postal 14-740
México 07000, D.F.
México

ISBN 3-540-11936-4 Springer-Verlag Berlin Heidelberg New York


ISBN 0-387-11936-1 Springer-Verlag New York Heidelberg Berlin

This work is subject to copyright. All rights are reserved, whether the whole
or part of the material is concerned, specifically those of translation, re-
printing, re-use of illustrations, broadcasting, reproduction by photocopying
machine or similar means, and storage in data banks.
Under § 54 of the German Copyright Law where copies are made for other
than private use, a fee is payable to 'Verwertungsgesellschaft Wort', Munich.
© Springer-Verlag Berlin Heidelberg 1982
Printed in Germany
Printing and binding: Beltz Offsetdruck, Hemsbach/Bergstr.
2061/3020-543210
PREFACE

This volume contains contributions to a conference on filtering, optimal stochastic control, and related topics, held in Cocoyoc, Mexico in February 1982. The conference was held under the auspices of IFIP WG 7.1. The host institution was the Centro de Investigación y de Estudios Avanzados, whose assistance was appreciated by conference participants. In addition, we wish to thank the following sponsoring organizations in Mexico, whose generous support was essential to the success of the conference:

Centro de Investigación y de Estudios Avanzados del IPN
Compañía Nestlé
Consejo Nacional de Ciencia y Tecnología
Fondo de Fomento Educativo BCH
Instituto de Investigaciones Eléctricas
Instituto Politécnico Nacional
Secretaría de Asentamientos Humanos y Obras Públicas
Secretaría de Comercio
Secretaría de Educación Pública
Secretaría de Turismo
Universidad Autónoma Metropolitana-Azcapotzalco
Universidad Autónoma Metropolitana-Iztapalapa

Our thanks also go to other members of the International Program Committee and of the Local Organizing Committee for their valuable assistance in arranging the conference, and to Y-C. Liao, R. McGwier, and S-J. Sheu for their help in editing this volume.

In addition to specialists in nonlinear filtering and stochastic control, several outstanding probabilists working in related fields came to the conference. Their presence was an important element in its success, both through the formal lectures presented and through informal discussion with other participants. Conference speakers included senior scientists of long experience and energetic younger people. As put by one participant, we had both "old coyotes and young lions"*.

The intent of the conference organizers was to focus not only on the mathematical aspects of the theory, but also on some newer areas of application and on numerical techniques of approximate solution to problems in filtering and stochastic control. We think that this objective was fairly well met.

Wendell H. Fleming
Luis G. Gorostiza

* This simile was in part suggested by the fact that Cocoyoc means "place of the coyotes" in the Náhuatl language, and in part by the name of one young speaker.
ADDRESSES OF CONTRIBUTORS

BARAS, J. S.
Department of Electrical Engineering
University of Maryland
College Park, MD 20742
U.S.A.

BENES, V. E.
Bell Laboratories
Murray Hill, NJ 07974
U.S.A.

BENSOUSSAN, A.
INRIA
Domaine de Voluceau - Rocquencourt
B.P. 105
78150 Le Chesnay
FRANCE

BLANKENSHIP, G. L.
Department of Electrical Engineering
University of Maryland
College Park, MD 20742
U.S.A.

CLARK, J. M. C.
Department of Electrical Engineering
Imperial College
London SW7 2BT
ENGLAND

DAVIS, M. H. A.
Department of Electrical Engineering
Imperial College
London SW7 2BT
ENGLAND

DAWSON, D. A.
Department of Mathematics and Statistics
Carleton University
Ottawa K1S 5B6
CANADA

EL KAROUI, N.
École Normale Supérieure
3, rue Boucicaut
92260 Fontenay-aux-Roses
FRANCE

ELLIOTT, R. J.
Department of Pure Mathematics
The University of Hull
Hull HU5 2DW
ENGLAND

FLEISCHMANN, K.
Akademie der Wissenschaften der DDR
Zentralinstitut für Mathematik und Mechanik
DDR-1080 Berlin, Mohrenstrasse 39
GERMAN DEMOCRATIC REPUBLIC

FLEMING, W. H.
Division of Applied Mathematics
Brown University
Providence, RI 02912
U.S.A.

GOROSTIZA, L. G.
Departamento de Matemáticas
Centro de Investigación y de Estudios Avanzados, IPN
Apartado Postal 14-740
México 14, D.F.
MÉXICO

HAUSSMANN, U. G.
Department of Mathematics
University of British Columbia
Vancouver, B.C. V6T 1W5
CANADA

HELMES, K.
Institut für Angewandte Mathematik
Universität Bonn
5300 Bonn, Wegelerstr. 6-10
FEDERAL REPUBLIC OF GERMANY

HIJAB, O.
Department of Mathematics and Statistics
Case Western Reserve University
Cleveland, OH 44106
U.S.A.

KURTZ, T. G.
Department of Mathematics
University of Wisconsin
Madison, WI 53706
U.S.A.

KUSHNER, H. J.
Division of Applied Mathematics
Brown University
Providence, RI 02912
U.S.A.

LIONS, P-L.
Ceremade, Paris IX University
Place de Lattre de Tassigny
75775 Paris Cedex 16
FRANCE

MANDL, P.
Department of Probability and Mathematical Statistics
Charles University
Sokolovská 83
186 Prague 8
CZECHOSLOVAKIA

MARCUS, S. I.
Department of Electrical Engineering
University of Texas at Austin
Austin, TX 78712
U.S.A.

MAZZIOTTO, G.
Centre National d'Études des Télécommunications
92131 Issy-les-Moulineaux
FRANCE

MENALDI, J-L.
Department of Mathematics
Wayne State University
Detroit, MI 48202
U.S.A.

MITTER, S. K.
Department of Electrical Engineering and Computer Science and
Laboratory for Information and Decision Systems
Massachusetts Institute of Technology
Cambridge, MA 02139
U.S.A.

NISIO, M.
Department of Mathematics
Faculty of Sciences
Kobe University
Rokkodai-machi, Nada-ku
Kobe 657
JAPAN

PARDOUX, E.
U.E.R. de Mathématiques
Université de Provence
3 Place Victor-Hugo
13331 Marseille Cedex 3
FRANCE

PLISKA, S. R.
Department of Industrial Engineering and Management Science
Northwestern University
Evanston, IL 60201
U.S.A.

PRAGARAUSKAS, H.
Institute of Mathematics and Cybernetics
Academy of Sciences of the Lithuanian SSR
232600 Vilnius 54, K. Pozelos Str.
U.S.S.R.

QUADRAT, J-P.
Domaine de Voluceau - Rocquencourt
B.P. 105
78150 Le Chesnay
FRANCE

RISHEL, R. W.
Department of Mathematics
University of Kentucky
Lexington, KY 40506
U.S.A.

SAZONOV, V. V.
Steklov Mathematical Institute
Academy of Sciences of the USSR
42 Vavilova Street
Moscow B-333
U.S.S.R.

SHENG, D. D.
Bell Laboratories
Holmdel, NJ 07733
U.S.A.

STROOCK, D. W.
Mathematics Department
University of Colorado
Boulder, CO 80309
U.S.A.

VARADHAN, S. R. S.
Courant Institute of Mathematical Sciences
New York University
251 Mercer Street
New York, NY 10012
U.S.A.
CONTENTS

BARAS, J. S., HOPKINS, Jr., W. E., BLANKENSHIP, G. L. Existence, uniqueness and tail behavior of solutions to Zakai equations with unbounded coefficients  1

BENES, V. E. Optimal stopping under partial observations  18

BENSOUSSAN, A. Optimal control of partially observed diffusions  38

BLANKENSHIP, G. L., HOPKINS, Jr., W. E., BARAS, J. S. Accurate evaluation of conditional densities in nonlinear filtering  54

CLARK, J. M. C. An efficient approximation scheme for a class of stochastic differential equations  69

DAVIS, M. H. A. Stochastic control with noisy observations  79

DAWSON, D. A., KURTZ, T. G. Applications of duality to measure-valued diffusion processes  91

EL KAROUI, N., LEPELTIER, J-P., MARCHAL, B. Optimal stopping of controlled Markov processes  106

ELLIOTT, R. J., AL-HUSSAINI, A. Two parameter filtering equations for jump process semimartingales  113

FLEISCHMANN, K. Space-time mixing in a branching model  125

FLEMING, W. H. Logarithmic transformations and stochastic control  131

GOROSTIZA, L. G. Generalized Gaussian random solutions of certain evolution equations  142

HAUSSMANN, U. G. Extremal controls for completely observable diffusions  149

HELMES, K., SCHWANE, A. Lévy's stochastic area formula in higher dimensions  161

HIJAB, O. Asymptotic nonlinear filtering and large deviations  170

KURTZ, T. G. Representation and approximation of counting processes  177

KUSHNER, H. J. Approximate invariant measures for the asymptotic distributions of differential equations with wide band noise inputs  192

LIONS, P. L. Optimal stochastic control of diffusion type processes and Hamilton-Jacobi-Bellman equations  199

MANDL, P. On reducing the dimension of control problems by diffusion approximation  216

MARCUS, S. I., LIU, C-H., BLANKENSHIP, G. L. Lie algebraic and approximation methods for some nonlinear filtering problems  225

MAZZIOTTO, G., SZPIRGLAS, J. Optimal stopping for two-parameter processes  239

MENALDI, J-L. Stochastic control problem for reflected diffusions in a convex bounded domain  246

MITTER, S. K. Nonlinear filtering of diffusion processes: a guided tour  256

NISIO, M. Note on uniqueness of semigroup associated with Bellman operator  267

PARDOUX, E., BOUC, R. PDE with random coefficients: asymptotic expansion for the moments  276

PLISKA, S. R. A discrete time stochastic decision model  290

PRAGARAUSKAS, H. On the approximation of controlled jump diffusion processes  305

QUADRAT, J-P. On optimal stochastic control problem of large systems  312

RISHEL, R. W. Unnormalized conditional probabilities and optimality for partially observed controlled jump Markov processes  326

SAZONOV, V. V. On normal approximation in Banach spaces  344

SHENG, D. D. A class of problems in the optimal control of diffusions with finitely many controls  353

STROOCK, D. W. A resumé of some of the applications of Malliavin's calculus  376

VARADHAN, S. R. S. Large deviations  382


EXISTENCE, UNIQUENESS AND TAIL BEHAVIOR
OF SOLUTIONS TO ZAKAI EQUATIONS WITH UNBOUNDED COEFFICIENTS

W. E. Hopkins, Jr., J. S. Baras and G. L. Blankenship

Electrical Engineering Department
University of Maryland
College Park, Maryland 20742

ABSTRACT

Conditions are given to guarantee the existence and uniqueness of solutions to the Duncan-Mortensen-Zakai equation for nonlinear filtering of multivariable diffusions with unbounded coefficients. Sharp upper and lower bounds on the tail of conditional densities are also obtained. A methodology is described to treat these problems using classical p.d.e. methods applied to the "robust" version of the DMZ equation. Several examples are included.

Supported in part by ONR Contract N00014-79-C-0808.


1. INTRODUCTION AND STATEMENT OF THE PROBLEM

Recently the problem of filtering a diffusion process x(t) from nonlinear observations y(t) in additive Gaussian noise has been studied by analyzing an unnormalized version of the conditional distribution of x(t) given the past of y(·). If this conditional distribution is absolutely continuous with respect to Lebesgue measure, then it has a density which satisfies a linear stochastic partial differential equation known as the Duncan-Mortensen-Zakai (DMZ) equation [1]. Background information on this equation and other aspects of the nonlinear filtering problem may be found in [2]. In the present paper we focus on existence-uniqueness results for the DMZ equation and on tail estimates of the resulting solutions. Our motivation for these problems stems primarily from the following areas: (a) numerical algorithms for the solution of the DMZ equation and their subsequent implementation by special purpose array processors; (b) numerical evaluation of the Kallianpur-Striebel path integral representation of the solution; (c) accuracy and convergence analysis in asymptotic expansions of the solution.

In cases where the process x(t) evolves in a bounded domain in ℝⁿ, or when the state space is unbounded but the coefficients of the DMZ equation are bounded, a satisfactory existence-uniqueness theory is available [3]-[5]. More recently, existence-uniqueness of solutions has been established for filtering problems having "strongly" unbounded coefficients. In [6] polynomial observations are studied via a related optimal control problem. In [7] classical results for fundamental solutions of parabolic equations with unbounded coefficients are applied to the "robust" version of the DMZ equation [1], [2], for scalar diffusions. In [8], [9], the method of [7] is extended to multidimensional problems. Furthermore in [7]-[9] tight estimates of the tail behavior of solutions are obtained. In the present paper we review and summarize the method and main results of [7]-[9].
To set the problem, consider the pair of Itô stochastic differential equations

dx(t) = f(x,t)dt + g(x,t)dw(t),  x(0) = x₀
                                                          (1)
dy(t) = h(x,t)dt + dv(t),  y(0) = 0

where x(t), w(t) ∈ ℝⁿ; y(t), v(t) ∈ ℝᵐ; w(·) and v(·) are mutually independent Wiener processes independent of x₀, and x₀ has a density p₀(·) ∈ L¹(ℝⁿ) ∩ C⁰(ℝⁿ). The coefficients f, g, h are assumed to satisfy fⁱ ∈ HC^{1,0}_loc, hⁱ ∈ HC^{2,1}_loc, where HC^{i,j}_loc denotes the space of functions having locally Hölder continuous derivatives of order i in x and j in t. Furthermore the generator L of the diffusion process x(·) is assumed to be uniformly elliptic; that is, there exist continuous functions θᵢ(x,t), i = 1,2, and a constant θ₀ > 0 such that for all ξ ∈ ℝⁿ and (x,t),

θ₀|ξ|² ≤ θ₁(x,t)|ξ|² ≤ Σ_{i,j=1}^n a^{ij}(x,t)ξ_i ξ_j ≤ θ₂(x,t)|ξ|²,

where (a^{ij}) = ½ggᵀ. The filtering problem for (1) is to estimate statistics of x(t) given the σ-algebra Y_t = σ{y(s) | 0 ≤ s ≤ t}. Equivalently we can compute the conditional distribution of x(t) given Y_t.
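The pair (1) is easy to experiment with numerically. The sketch below (not from the paper; the cubic drift, unit diffusion, and cubic observation function are illustrative choices) generates a sample path of the signal and of the observation process by an Euler-Maruyama discretization:

```python
import math
import random

def simulate(f, g, h, x0, T=1.0, n=1000, seed=0):
    """Euler-Maruyama discretization of the signal/observation pair (1):
    dx = f(x,t)dt + g(x,t)dw,  dy = h(x,t)dt + dv,  y(0) = 0."""
    rng = random.Random(seed)
    dt = T / n
    sdt = math.sqrt(dt)
    x, y = x0, 0.0
    xs, ys = [x], [y]
    for k in range(n):
        t = k * dt
        dw = rng.gauss(0.0, sdt)
        dv = rng.gauss(0.0, sdt)   # observation noise independent of dw
        x = x + f(x, t) * dt + g(x, t) * dw
        y = y + h(x, t) * dt + dv
        xs.append(x)
        ys.append(y)
    return xs, ys

# illustrative coefficients: stable cubic drift, constant g, cubic h
xs, ys = simulate(lambda x, t: -x**3, lambda x, t: 1.0,
                  lambda x, t: x**3, x0=0.5)
```

The observation path ys produced this way can serve as input when testing pathwise ("robust") solution schemes discussed below.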

Formally, the conditional density of x(t) given Y_t is given by

p(x,t) = U(x,t) / ∫_{ℝⁿ} U(x,t)dx

where U(x,t) is a solution of the DMZ equation

dU = (L* − ½|h|²)U dt + ⟨h, dy(t)⟩U,  (x,t) ∈ Q
                                                      (2)
U(x,0) = p₀(x),  x ∈ ℝⁿ

Here L* is the adjoint of L and (2) is written using Stratonovich calculus.

It is well known, particularly in recent studies, that if one introduces the transformation

V(x,t) = U(x,t) exp(−⟨h(x,t), y(t)⟩)   (3)

then V satisfies a classical, linear, parabolic partial differential equation, parametrized by the observation paths y(·), which is called the "robust" version of the DMZ equation [1], [2]:

∂V/∂t (x,t) = Σ_{i,j=1}^n A^{ij}(x,t)V_{x_i x_j}(x,t) + Σ_{i=1}^n B^i(x,t)V_{x_i}(x,t) + C(x,t)V(x,t),  (x,t) ∈ Q   (4)

V(x,0) = p₀(x),  x ∈ ℝⁿ.

The functions B^i(x,t) are pointwise linear functions of y(t), while C(x,t) is pointwise a quadratic function of y(t). Since the paths of y(·) are Hölder continuous, (4) is a classical p.d.e., and classical results on existence-uniqueness of fundamental solutions for linear parabolic equations due to Besala [10] can be fruitfully applied to the robust DMZ equation. Direct analysis of the DMZ equation is rather complicated. However, since the transformation (3) is invertible, positivity preserving, and (2), (4) are linear, existence, uniqueness, and tail behavior of solutions of the DMZ equation may be obtained by analyzing (4) instead.
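To make the structure of (4) concrete, here is a minimal numerical sketch for a toy linear-Gaussian case (f = 0, g = 1, h(x) = x — our illustrative choice, not an example from the paper). For this model the robust equation reduces to V_t = ½V_xx + y(t)V_x + (½y(t)² − ½x²)V, so the observation path enters only through the pointwise value of y(t), as stated above. The step below is a plain explicit finite-difference update on a truncated grid:

```python
import math

def robust_step(V, xs, y, dt, dx):
    """One explicit Euler step of the robust equation for the toy model
    f = 0, g = 1, h(x) = x, where (4) reads
        V_t = 1/2 V_xx + y(t) V_x + (1/2 y(t)^2 - 1/2 x^2) V.
    B is linear and C quadratic in the observation value y(t)."""
    n = len(V)
    W = V[:]                       # boundary values held fixed
    for i in range(1, n - 1):
        Vxx = (V[i+1] - 2*V[i] + V[i-1]) / dx**2
        Vx = (V[i+1] - V[i-1]) / (2*dx)
        C = 0.5*y**2 - 0.5*xs[i]**2
        W[i] = V[i] + dt * (0.5*Vxx + y*Vx + C*V[i])
    return W

# Gaussian initial density; y frozen at one sample value for illustration
dx, dt = 0.1, 0.002
xs = [-3.0 + i*dx for i in range(61)]
V = [math.exp(-x*x) for x in xs]
for _ in range(100):
    V = robust_step(V, xs, y=0.3, dt=dt, dx=dx)
```

In a pathwise solver one would update y between steps from a recorded observation path; the explicit scheme is used only for transparency (dt must satisfy the usual stability restriction dt ≲ dx²).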

2. OUTLINE OF THE METHOD

Our method can be applied to a variety of problems with unbounded coefficients. We outline briefly here the main steps. We shall refer to the zeroth order coefficient of a parabolic equation like (4) as the potential (i.e. C(x,t) in (4)). We use a result of Besala [10] which asserts that if we can find a "weight" of the form μ(x,t) = exp(φ(x,t)) such that the function Vμ satisfies a parabolic p.d.e. where the potential term and the potential term of the adjoint are nonpositive for (x,t) ∈ Q, then there exists a classical fundamental solution for (4). Furthermore integral growth estimates for the fundamental solution are given in [10].

To apply this idea to (4) we must consider stopping time partitions of the interval [0,T]; there are special cases where stopping times are not needed. These partitions are defined as follows. Given a small positive number ε > 0, we define 0 = t₀ < t₁ < ... < t_N = T via

t₀ = 0
t_{k+1} = inf{t: t_k < t ≤ T, |y(t) − y(t_k)| = ε}, whenever the inf exists; T, otherwise   (5)
N = min{k: t_k = T}.

Here ε will be fixed by other considerations. Then on each set Q_k ≜ ℝⁿ × (t_k, t_{k+1}], 0 ≤ k ≤ N−1, we introduce the transformations

u^k(x,t) = U(x,t) exp(φ^k(x,t) − ⟨h(x,t), y(t)⟩).   (6)
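On a sampled observation path the stopping-time partition (5) can be computed by a single scan; the following sketch (a discrete-time analogue on a deterministic stand-in path, not code from the paper) illustrates the construction:

```python
import math

def partition(ts, ys, eps):
    """Discrete analogue of the stopping-time partition (5): t_0 = 0, and
    t_{k+1} is the first sample time at which the observation has moved by
    at least eps since t_k; T closes the partition."""
    ks = [0]                          # indices of t_0 < t_1 < ... < t_N = T
    for j in range(1, len(ts)):
        if abs(ys[j] - ys[ks[-1]]) >= eps:
            ks.append(j)
    if ks[-1] != len(ts) - 1:
        ks.append(len(ts) - 1)        # t_N = T
    return [ts[i] for i in ks]

# deterministic stand-in for an observation path on [0, 1]
ts = [k * 0.01 for k in range(101)]
ys = [math.sin(3 * t) for t in ts]
pts = partition(ts, ys, eps=0.25)
```

Smaller eps gives finer partitions; in the analysis above eps controls how far ⟨h, y⟩ can drift within each subinterval Q_k.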


Then u^k satisfies

∂u^k/∂t = 𝓛^k u^k on Q_k,

u^k(x,t_k) = p₀(x) exp(φ⁰(x,0)),  k = 0   (7)
u^k(x,t_k) = u^{k−1}(x,t_k) exp(φ^k(x,t_k) − φ^{k−1}(x,t_k)),  1 ≤ k ≤ N−1

where

𝓛^k u^k = Σ_{i,j=1}^n a^{ij} u^k_{x_i x_j} + Σ_{i=1}^n b_i^k u^k_{x_i} + c^k u^k   (8)

b_i^k = −fⁱ + 2Σ_{j=1}^n a^{ij}_{x_j} + 2Σ_{j=1}^n a^{ij}(⟨h_{x_j}, y⟩ − φ^k_{x_j})

c^k = c^k_ess − Σ_{i,j=1}^n a^{ij}(φ^k_{x_i x_j} − ⟨h_{x_i x_j}, y⟩) − 2Σ_{i,j=1}^n a^{ij}_{x_i}(φ^k_{x_j} − ⟨h_{x_j}, y⟩) + Σ_{i,j=1}^n a^{ij}_{x_i x_j} − Σ_{i=1}^n fⁱ_{x_i} − ⟨h_t, y⟩

c^k_ess = ½|gᵀ∇φ^k − gᵀ(⟨h_{x_i}, y⟩)_{i=1}^n + (g^{−1})ᵀf|² − ½|h|² − ½|(g^{−1})ᵀf|² + φ^k_t.

Thus the u^k satisfy linear parabolic equations of the same type as (4). We then choose φ^k on each subinterval so as to render the potential term c^k of 𝓛^k, as well as the potential term of (𝓛^k)*, nonpositive, and apply Besala's results.

The guidelines for this construction are relatively simple to understand. The actual computations and resulting conditions depend on the case at hand. First one poses appropriate assumptions on f, g, h so that c^k and the adjoint potential are dominated by c^k_ess as |x| → ∞. Then one fixes the parameters in φ^k (in particular φ^k_t) so that c^k_ess is ≤ 0. To construct φ^k one first constructs the piece of φ^k corresponding to the Fokker-Planck equation for x(·). This is natural since the diffusion itself must be well-behaved: no explosions, nice solutions of the Fokker-Planck equation, etc. Indeed if we set h = 0 in (4) we get the Fokker-Planck equation. The construction of this first part of φ^k (here we mean that typically φ^k = φ₁ + φ₂) brings us naturally into contact with Khas'minskii's test for nonexplosions and generalizations thereof. To construct φ₂, after obtaining φ₁, one returns to the full expressions for c^k and the adjoint potential and appropriately balances growth conditions on h with those of f, g, so as to make both potentials ≤ 0. For uniqueness we use classical weak maximum principles. The procedure gives us upper bounds on the tails by means of the uniqueness class identified. To obtain lower bounds on the tails we choose φ^k such that the potential c^k of 𝓛^k is positive, using classical comparison theorems for linear parabolic equations. We refer to the references cited above [7]-[9] for the details of these constructions. In the remainder of the paper we summarize the results in particular cases.

3. THE SCALAR CASE n=1, m=1

In this section we describe our results for the scalar case including polynomial nonlinearities. Our assumptions on the coefficients in (1) are stated in terms of the original functions f, g, h. To state these succinctly, we will use the relative order notation:

Definition. Let F, G: ℝ → ℝ and

L = lim sup_{|x|→∞} |F(x)/G(x)| ∈ [0,∞].

Then F = O(G) if L < ∞ and F = o(G) if L = 0.

The coefficients of the diffusion x are assumed to satisfy

(A1) f ∈ C¹(ℝ), g ∈ C²(ℝ), f_x, g_xx are locally Hölder continuous;
(A2) g(x) ≥ λ > 0, ∀x ∈ ℝ and some λ;
(A3) −∫₀ˣ (f/g²)(ξ)dξ ≥ M, ∀x ∈ ℝ and some M;
(A4) (f/g²)_x = o(f²/g⁴), f_x = o(f²/g²); and
(A5) the martingale problem for (f,g) is well-posed.

The last condition implies that the stochastic differential equation for x has a unique weak solution for all t > 0. A sufficient condition for this is the existence of a Lyapunov function for the backwards Kolmogorov equation associated with the process x [11]. If the integral in (A3) diverges to +∞ as |x| → ∞, then it could serve as the Lyapunov function. If the martingale problem is not well posed, then the process x may have "explosions" (escape times which are finite with probability one). In this case the conditional distribution of x(t) given Y_t may have singular components which are not computed by the DMZ equation.
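The relative order conditions are easy to check numerically for concrete coefficients. A small sketch (the cubic drift below is a hypothetical example, not one taken from the paper) samples the ratio appearing in (A4) at increasingly large points; (A4) holds when the ratio tends to zero:

```python
def rel_order_ratio(F, G, xs):
    """Largest |F(x)/G(x)| over the sample points xs; F = o(G) in the sense
    of the definition above requires this to tend to 0 as the points move
    out to infinity."""
    return max(abs(F(x)) / abs(G(x)) for x in xs)

# hypothetical cubic drift f(x) = -x**3 with g = 1, so that
# (f/g^2)_x = f_x = -3x^2 and f^2/g^4 = f^2/g^2 = x^6:
num = lambda x: -3 * x**2
den = lambda x: x**6
near = rel_order_ratio(num, den, [10.0, 20.0])
far = rel_order_ratio(num, den, [100.0, 200.0])
```

Here the ratio decays like 3/x⁴, so both relations in (A4) are satisfied for this drift.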

The observation function h is assumed to satisfy:

(B1) h ∈ C²(ℝ), h_xx is locally Hölder continuous;
(B2) either g²h_xx, (g²h_x)_x = o(h²), or g²h_xx, (g²h_x)_x = o(g²h_x²);
(B3) either gh_x = O(h) or gh_x = o(f/g);
(B4) either (g²)_xx = o(h²) or (g²)_xx = o(f²/g²);
(B5) one of the two mutually exclusive cases holds:
  (i) either h = O(f/g) or h = O(gh_x); or
  (ii) both f/g = o(h) and gh_x = o(h); in addition, gh_x, g_x h = o(h²);
(B6) in case (B5)(i),
  lim_{|x|→∞} max{|h(x)|, −∫₀ˣ (f/g²)(ξ)dξ} = +∞;
and in case (B5)(ii),
  lim_{|x|→∞} |∫₀ˣ (h/g)(ξ)dξ| = +∞.

Remarks. (1) The growth conditions are relatively easy to understand in the case when f, g, h are polynomials, especially f(x) = f₀xʲ, g(x) = g₀(1+x²)ᵏ, h(x) = h₀xˢ.
(2) The conditions (A1)-(B6) are not necessary; different choices of the weight functions used in the proofs would lead to different growth restrictions. In fact, one could consider optimizing the choice of the weight functions. Here the weights are chosen as

φ^k(x,t) = ψ^k(x) − γ^k t   (9)

where

ψ^k(x) = αφ₁(x) + β₁^k φ₂(x) + β₂[1 + φ₂²(x)]^{1/2}

φ₁(x) = −∫₀ˣ (f/g²)(ξ)dξ in case (B5)(i); 0 in case (B5)(ii)   (9a)

φ₂(x) = h(x) in case (B5)(i); ∫₀ˣ (h/g)(ξ)dξ in case (B5)(ii)

The parameters α, β₂, {β₁^k, γ^k, t_k}_{k=0}^N will be functionals of the path y(t), t ≥ 0; for their explicit definition see [7].

Assumptions (A3) and (B6), together with the constraints α > 0, β₂ > 0, β₂ > |β₁^k|, imply that the weight functions ψ^k(x) diverge to +∞ as |x| → ∞. The remaining growth conditions serve to identify the dominant terms (as |x| → ∞) in the potential c^k(t,x) in (7) and in the potential of the adjoint of (7). Assumption (B3) permits us to select the functions ψ^k and the constants γ^k so that these potentials are nonpositive. This in turn permits the use of a maximum principle.

Under these assumptions we show that the robust equation (4) has a fundamental solution which may be used to construct a unique solution to the DMZ equation within a certain class of functions. To describe this class, we define the constants

η_i = lim sup_{|x|→∞} |g·(φ_i)_x| / [h² + f²/g²]^{1/2}   (10)

ν_i = lim inf_{|x|→∞} |g·(φ_i)_x| / [h² + f²/g²]^{1/2},  i = 1,2.

The assumptions imply η₁, ν₁ ∈ [0,1] and η₂, ν₂ ∈ [0,∞) when (B5)(i) holds, while η₁ = 0 = ν₁, η₂ = 1 = ν₂ when (B5)(ii) holds. The assumption that either (B5)(i) or (ii) holds implies (ν₁ + ν₂) > 0.
Theorem 1. Suppose (A1)-(A5), (B1)-(B6) hold. Let p₀(x) be continuous, p₀(x) > 0, and assume that there exist constants θ_i > 0, i = 1,2, such that 0 < θ₁η₁ + θ₂η₂ < 1, and

p₀(x) exp[θ₁φ₁(x) + θ₂|φ₂(x)|] ≤ M,  ∀x ∈ ℝ   (11)

and some M < ∞. Then for any constants θ̄_i, 0 < θ̄_i < θ_i, i = 1,2, there exists a unique solution to the DMZ equation (2) within the class of functions satisfying

lim sup_{|x|→∞} U(x,t) exp[θ̄₁φ₁(x) + θ̄₂|φ₂(x)|] = 0,  ∀t > 0.   (12)

This solution satisfies U(x,t) = u^k(x,t), t ∈ (t_k, t_{k+1}], with

u^k(x,t) = e^{h(x)y(t)} ∫_ℝ Γ^k(t,x; z,t_k) u^{k−1}(t_k,z)dz   (13)
u⁰(x,0) = p₀(x),  k = 1,2,...

where Γ^k is the fundamental solution of (7).

Theorem 2. Suppose (A1)-(A5), (B1)-(B6) hold, and assume, when case (B5)(i) holds with ν₁ > 0, ν₂ > 0, that −f(x) sgn(x) and h_x(x) sgn(x h(x)) are non-negative for |x| sufficiently large. Let p₀(x) satisfy the conditions in Theorem 1, and suppose further that there exist M₀ > 0, K₀ > 0 such that

M₀ exp[−K₀φ(x)] ≤ p₀(x),  ∀x ∈ ℝ   (14)

where

φ(x) = φ₁(x) + |φ₂(x)|.   (15)

Then for any T < ∞, there exist positive constants M₁, M₂, K₁, K₂, which may depend on the path {y(t), 0 ≤ t ≤ T}, such that the solution of the DMZ equation given by (13) satisfies

M₁ exp[−K₁φ(x)] ≤ U(x,t) ≤ M₂ exp[−K₂φ(x)]   (16)

∀(x,t) ∈ ℝ × [0,T].

To illustrate our results and make contact with other recent work on nonlinear filtering (e.g., [6], [12]), we consider a class of systems with polynomial f, h.

So let f, h be polynomials with f odd and stable, i.e.,

f(x) = Σ_{i=0}^{2q−1} f_i xⁱ,  −f_{2q−1} > 0   (17)

h(x) = Σ_{j=0}^{s} h_j xʲ,  h_s ≠ 0

where q, s are positive integers. Suppose

g(x) = g₀(1+x²)^{r/2},  g₀ > 0   (18)

where r ∈ [0,∞). Our conditions for existence and uniqueness and estimates of the asymptotic behavior of the density depend on whether or not g(x) is globally Lipschitz and on the degree of h(x) relative to the degree (or stability) of f(x). There are two cases covered by Theorems 1-2.

Case 1: r ∈ [0,1], q > r, s ≥ 1, q ≥ 1.

The restrictions (A1)-(A5), (B1)-(B6) applied here require g(x) to satisfy a linear growth constraint r ∈ [0,1], that f be at least a cubic polynomial, q ≥ 2, where g(x) is of linear growth, r = 1, and that h(x) be non-constant.

Then from Theorem 2, for any 0 ≤ t₁ ≤ t₂, there exist constants M_i, K_i depending on the observation path such that

M₁ exp[−K₁|x|ᵖ] ≤ U(x,t) ≤ M₂ exp[−K₂|x|ᵖ]   (19)

where

p = s − r + 1, if r < 1 and r + s > 2q − 1;  max[s, 2(q−r)], otherwise.   (20)
We refer to [7] for details. Although it is not covered in the present case, the situation where r = 0, f and its first two derivatives are bounded, and h is asymptotic to a non-constant polynomial can be easily treated by adapting the arguments in Theorems 1-2. In particular, the inequalities (19) hold with p = s+1, and this result overlaps [6], [12]. For example, if f = 0, g = 1, h(x) = h_s xˢ, h_s ≠ 0, then

0 < K₂ ≤ |h_s|/(s+1),  p = s+1.   (21)

This was obtained by Sussmann for s = 3 in [12].

Case 2: r > 1, q > r + ½s, s ≥ 1, q ≥ 2.

Here g(x) is of super-linear growth, f(x) is at least a cubic polynomial, and h(x) is dominated, as indicated, by the dynamics of the state process. In this case the asymptotic behavior of the conditional density is the same as that of the a priori density (of x(t)).

4. THE SCALAR BILINEAR FILTERING PROBLEM

The example presented here illustrates a different grouping of terms and different growth conditions, necessary for this class of problems. It is presented here as another application of the method described earlier.

Consider the system

dz(t) = f(z(t))dt + z(t)dw(t)

dy(t) = h(z(t))dt + dv(t)   (22)

z(0) = z₀,  y(0) = 0,  0 ≤ t ≤ T < ∞

with z₀ having density p₀(z), and z₀, w, v mutually independent as before. Since z(t) will eventually be trapped in either the positive or negative half space, we shall arrange that z(t) ∈ [0,∞) by taking f ∈ C¹(0,∞) satisfying

(C1) f(z) ≤ K(1+z) for some K > 0, and f(0) ≥ 0   (23)

and by taking p₀(z) defined on (0,∞) and continuous and integrable there. We also assume that h ∈ C²(0,∞) with h_z, h_zz locally Hölder continuous (and so, bounded at zero).

We impose the following growth conditions on f, h.

(C1') f_z(z) is bounded and locally Hölder continuous;
(C2) lim_{z→0} [f(z)/z] exists and is > 0;
(C3) lim |h(z)|/log z = +∞;
(C4) lim [h_zz(z)/h²(z)] = 0;
(C5) for some constants K_i, M_i, i = 1,2,
  M₁ + K₁|z h_z(z)| ≤ |h(z)| ≤ M₂ + K₂|z h_z(z)|.

Note that these conditions are satisfied when f(z) is affine and h(z) is a non-constant polynomial; we do not consider the case h(z) constant. The assumptions that f, h and their derivatives are bounded at the origin are made for convenience only. They can be relaxed by introducing more complex growth restrictions. The other assumptions (C2)-(C5) are essential (to our method).

The DMZ equation associated with (22) is

dU(z,t) = [½(z²U)_zz − (fU)_z − ½h²U]dt + hU dy(t)   (24)

U(z,0) = p₀(z),  (z,t) ∈ (0,∞) × [0,T]

Because the generator for the diffusion z in (22) is not uniformly elliptic, our theorems are not directly applicable. If we make the logarithmic change of coordinates x = log z and let W(x,t) = U(eˣ,t), x ∈ ℝ, (24) becomes

dW(x,t) = {½W_xx + [3/2 − e⁻ˣf(eˣ)]W_x + [1 − f_z(eˣ) − ½h²(eˣ)]W}dt + h(eˣ)W dy(t)   (25)

W(x,0) = p₀(eˣ)

The methods of Besala [10] as used in the proofs of Theorems 1 and 2 can be directly applied to the robust version of this equation. The proofs of Theorems 1-2 go through when the weight functions ψ^k(x) are defined by (9) with x = log z and

φ₁(z) = −∫₁^z (f(ξ)/ξ²)dξ   (26)

φ₂(z) = h(z)

Similarly as before define

ν₁ = η₁ = lim_{z→0} |f/z| / [h² + f²/g²]^{1/2} = 1 by (C2)   (27)

ν₂ = lim inf z|h_z(z)| / [h² + f²/g²]^{1/2} ∈ (0,∞)   (28)
η₂ = lim sup z|h_z(z)| / [h² + f²/g²]^{1/2} ∈ (0,∞)

Then we have obtained [7] the following result.

Theorem 3. Suppose (C1)-(C5) hold. Let θ_i > 0, 0 < θ₁ + η₂θ₂ < 1, and suppose p₀(z) satisfies

p₀(z) ≤ M₁ exp[θ₁∫₁^z (f(ξ)/ξ²)dξ − θ₂|h(z)|]   (29)

for all z ∈ (0,∞) and some M₁ > 0. Then for any θ̄_i ≤ θ_i the DMZ equation (24) has a unique solution in the class of functions satisfying, for all t > 0,

lim sup U(z,t) exp[−θ̄₁∫₁^z (f(ξ)/ξ²)dξ + θ̄₂|h(z)|] = 0.   (30)

Moreover, if there exist constants M₂, θ̃_i, 0 < θ̃_i < θ_i, such that for all z ∈ (0,∞)

M₂ exp[θ̃₁∫₁^z (f(ξ)/ξ²)dξ − θ̃₂|h(z)|] ≤ p₀(z)   (31)

then for all t ≥ 0 the solution U(z,t) is asymptotic to

exp[∫₁^z (f(ξ)/ξ²)dξ − |h(z)|]   (32)

in the sense of (16) in Theorem 2.

For example, when f(z) = az + b, assumption (C2) implies either b > 0, or a > 0 and b = 0. Then

φ₁(z) = b z⁻¹ − b − a log z

and, whenever (29), (30) are satisfied, U(z,t) is asymptotic to

exp[−|h(z)| − b/z],  if b > 0
                                                     (33)
zᵃ exp[−|h(z)|],  if b = 0 and a > 0.
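The closed form just given for φ₁ can be checked against the integral definition (26) by numerical quadrature; in the sketch below (the values of a, b, z are arbitrary test choices) a composite trapezoid rule is compared with the expression in the text:

```python
import math

def phi1(z, a, b, n=100000):
    """phi_1(z) = -∫_1^z f(ξ)/ξ² dξ for the affine drift f(z) = a*z + b,
    evaluated with the composite trapezoid rule."""
    h = (z - 1.0) / n
    g = lambda t: (a * t + b) / t**2
    s = 0.5 * (g(1.0) + g(z)) + sum(g(1.0 + i * h) for i in range(1, n))
    return -h * s

a, b, z = 2.0, 3.0, 5.0
numeric = phi1(z, a, b)
closed = b / z - b - a * math.log(z)   # the expression given above
```

The two values agree to quadrature accuracy, confirming that the b/z term drives the small-z behavior in (33) when b > 0.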

5. THE MULTIVARIABLE CASE

In [8], [9] the previous results have been generalized to multivariable diffusions. We shall briefly describe one of the cases here and refer the reader to [8], [9] for other cases and further details. The assumptions on the functions f, g, h are basically of two types: relative order relations implying that the potentials of 𝓛^k and its adjoint are dominated by certain terms in c^k_ess as |x| → ∞, and inequalities providing for control of these dominant terms. These assumptions may be stated succinctly using the following definitions.

Definition. Let f, g ∈ C(ℝⁿ × [0,T]) with g ≥ 0. Then f = O_b(g) if for every ε > 0 there exists a constant K(ε) such that for all (x,t) ∈ Q̄

|f(x,t)| ≤ ε g(x,t) + K(ε).

Definition. A nonnegative function r ∈ H²_loc(ℝⁿ) is said to be a scale function if
(i) there exist positive constants D₁, R such that |∇r(x)|² ≥ D₁ for all |x| ≥ R;
(ii) lim_{R→∞} min_{|x|=R} r(x) = +∞.

We shall use on occasion the notation

A_r(x,t) ≜ 2 Σ_{i,j=1}^n a^{ij}(x,t) r_{x_i}(x) r_{x_j}(x) = |gᵀ(x,t)∇r(x)|² ≥ 0.
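The two defining properties of a scale function can be spot-checked numerically on sampled circles; below is a sketch in ℝ² for the hypothetical choice r(x) = |x|² (so ∇r = 2x and |∇r|² = 4|x|², which stays above any D₁ < 4 outside the unit ball):

```python
import math
import random

def is_scale_like(r, grad_r, D1, R, radii, samples=200, seed=0):
    """Spot-check the scale-function requirements on sampled points of R^2:
    |grad r|^2 >= D1 outside the ball of radius R, and the minimum of r on
    circles of growing radius is increasing (a proxy for divergence)."""
    rng = random.Random(seed)
    mins = []
    for Rp in radii:
        vals = []
        for _ in range(samples):
            th = rng.uniform(0.0, 2 * math.pi)
            x = (Rp * math.cos(th), Rp * math.sin(th))
            gx, gy = grad_r(x)
            if Rp >= R and gx * gx + gy * gy < D1:
                return False              # gradient condition fails
            vals.append(r(x))
        mins.append(min(vals))
    return all(m2 > m1 for m1, m2 in zip(mins, mins[1:]))

# r(x) = |x|^2 with D1 = 3.9 (slightly below 4 to absorb rounding):
ok = is_scale_like(lambda x: x[0]**2 + x[1]**2,
                   lambda x: (2 * x[0], 2 * x[1]),
                   D1=3.9, R=1.0, radii=[1.0, 2.0, 4.0, 8.0])
```

Such a check is only a sampling heuristic, of course; the definitions above are what the proofs actually use.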

2'l']Rx
F E HCIo c( [0,T]), reH~oc(]Rn) and ~(1) Fz(z,t) ~ Ob(F2(z,t))

r (x) r (x)
f F(z,t)dz, I Ft(z,t)dz = Ob(F2Ar)
0 0
n n
~(ii) F ~ a ij r , F Y. a ij r = Oh(F2
i,J=l xixj i,j=l x i xj Ar) "

Definition. Two time-varying vector fields f₁(x,t), f₂(x,t) are said to be compatible if there exists a constant R > 0 such that Σ_{i,j=1}^n a^{ij}(x,t) f₁^i(x,t) f₂^j(x,t) ≥ 0 for all |x| ≥ R, t ∈ [0,T]. If we now let (ā_{ij}) denote the inverse of (a^{ij}), we can state the assumptions on the coefficients f,g of the diffusion as follows. We let β₁, β₂ be such that

    β₁(x,t)|ξ|² ≤ Σ_{i,j=1}^n a^{ij} ξ_i ξ_j ≤ β₂(x,t)|ξ|².
Hypothesis F. There exist a scale function r(x), nonnegative functions F̲, F̄ ∈ HC^{2,1}_loc(ℝ×[0,T]) satisfying 0 ≤ F̲ ≤ F̄, and constants F₀ ≥ 0, F₁ > 0, R > 0 such that (F̲,r), (F̄,r) ∈ 𝔅, and for all |x| ≥ R, t ∈ [0,T],

(i)   F̲(r(x),t) ≤ (F₀ − Σ_{i=1}^n f^i(x,t) r_{x_i}(x)) / A_r(x,t)

(ii)  F̄(r(x),t) ≥ (−F₀ − Σ_{i=1}^n f^i(x,t) r_{x_i}(x)) / A_r(x,t)

(iii) A_r(x,t) F̄²(r(x),t) ≥ 2F₁[−F₀ + ((β₂/β₁) Σ_{i,j=1}^n ā_{ij} f^i f^j)(x,t)]

(iv)  div(f), Σ_{i,j=1}^n a^{ij}_{x_i x_j}(x,t) = O_b(F̄² A_r).
Hypothesis G. There exist constants 0<μ<ν<2 and positive constants R, K₀, K₁ such that for all |x| ≥ R, t ∈ [0,T],

    K₁|x|^{2−ν} ≤ β₁(x,t)
    |a^{ij}(x,t)| ≤ K₀|x|^{2−μ}
    |a^{ij}_{x_i}(x,t)| ≤ K₀|x|^{1−μ}
    |a^{ij}_{x_i x_j}(x,t)| ≤ K₀.

If hypothesis G is not satisfied, it will be assumed that:

Hypothesis L. lim_{|x|→∞} ∫₀^{r(x)} F̄(z,t)dz = +∞, uniformly in t ∈ [0,T].

One sees easily that F(i) is a generalization of Khas'minskii's test for explosions. In fact we have the following result [8], [9].

Theorem 4. If either F(i), F(iii) and G hold, or F(i) and L hold, then the martingale problem for the pair (f,g) is well-posed.

Next the observation nonlinearity is assumed to satisfy the following conditions. Let

    h⁺(x,t) = ([1 + h_k²(x,t)]^{1/2})_{k=1}^m

where 1⃗ = (1,1,…,1)^T ∈ ℝ^m.

Hypothesis H. There exist a scale function s(x), nonnegative functions H̲, H̄ ∈ HC^{2,1}_loc(ℝ×[0,T]) satisfying 0 ≤ H̲ ≤ H̄, and constants R>0, H₀ ≥ 0 such that (H̲,s), (H̄,s) ∈ 𝔅, and for all |x| ≥ R, t ∈ [0,T],

(i)   H̄²(s(x),t) ≥ (|h(x,t)|² + H₀)/A_s(x,t)

(ii)  H̲²(s(x),t) ≤ (|h(x,t)|² − H₀)/A_s(x,t)

(iii) |h_t| = O_b(|h|²),  |h_t| = O_b(H̲² A_s),

      Σ_{i,j=1}^n a^{ij}|h_{x_i x_j}|, Σ_{i,j=1}^n a^{ij}_{x_i}|h_{x_j}|, Σ_{i,j=1}^n a^{ij}⟨h⁺,1⃗⟩_{x_i x_j}, Σ_{i,j=1}^n a^{ij}_{x_i}⟨h⁺,1⃗⟩_{x_j} = O_b(Σ_{i,j=1}^n a^{ij}|h_{x_i}||h_{x_j}|)

(iv)  Σ_{i,j=1}^n |a^{ij}| |h_{x_i}||h_{x_j}| = O_b(H̲² A_s + F̲² A_r)

(v)   |h(x,t)| = O_b(∫₀^{s(x)} H̲(z,t)dz).

Finally we need the following hypothesis, which helps control the growth of the potentials.

Hypothesis I.

(i)   x_i|h_{x_i}| = O_b(H̲² A_s), i = 1,…,n

(ii)  Σ_{i=1}^n f^i|h_{x_i}| = O_b(H̲² A_s + F̲² A_r)

(iii) F̄ Σ_{i,j=1}^n a^{ij} r_{x_i}|h_{x_j}|,  H̄ Σ_{i,j=1}^n a^{ij} s_{x_i}|h_{x_j}| = O_b(H̄² A_s + F̄² A_r),

      F̲ Σ_{i,j=1}^n a^{ij} r_{x_i}|h_{x_j}|,  H̲ Σ_{i,j=1}^n a^{ij} s_{x_i}|h_{x_j}| = O_b(H̲² A_s + F̲² A_r).

To compute lower bounds on the density we shall use

Hypothesis K. (i) ∇r and ∇s are compatible; (ii) x is compatible with both ∇r and ∇s.

We can now state the following existence result. For a proof see [9].

Theorem 5. (Existence of fundamental solutions). Suppose hypotheses F, H, I(ii) and I(iii) hold. Then for each Hölder continuous path {y(t), 0 ≤ t ≤ T} of the observation process, there exists a classical fundamental solution of the robust DMZ equation (4).

To describe our results for uniqueness and tail behavior we need the definition of certain function classes, given below.

Definition. Let f ∈ C(ℝⁿ×[0,T]) and φ ∈ C(ℝⁿ×[0,T]). Then f ∈ 𝒪(φ) if there exists a constant K such that |f| ≤ K exp(φ) for all (x,t), and f ∈ o(φ) if lim_{|x|→∞} |f| exp(−φ) = 0, uniformly in t ∈ [0,T].

Let

    φ⁰(x,t) = log|x| if ν=0, |x|^ν/ν if ν∈(0,2], for |x| ≥ R;
              any C^∞ time-invariant extension for |x| ≤ R

    φ¹(x,t) = ∫₀^{r(x)} F̄(z,t)dz

    φ²(x,t) = ∫₀^{s(x)} H̄(z,t)dz.

Theorem 6. [9] (Existence and uniqueness of solutions). Suppose F, G, H, I hold and that p₀ is continuous, nonnegative and integrable. Then there exist positive constants ᾱᵢ, i = 0,1,2, and α̲₁ such that whenever

    p₀(x) ∈ 𝒪(−α₀φ⁰(x,0) − α₁φ¹(x,0) − α₂φ²(x,0))

for some constants α₀∈(0,ᾱ₀), α₁∈(α̲₁,ᾱ₁), α₂∈(0,ᾱ₂), the DMZ equation has a unique solution in

    𝒪(ᾱ₀φ⁰(x,t) − α̲₁φ¹(x,t) + (1−ε)ᾱ₂φ²(x,t))

for all ε∈(0,1). If instead of hypothesis G, hypothesis L holds, the result remains valid with α₀ = ᾱ₀ = 0 and α₁ chosen arbitrarily small.

Theorem 7. [9] (Lower bounds). Suppose the same hypotheses hold as in Theorem 6, including the growth assumption on p₀. Suppose in addition that μ>0 and hypothesis K holds.

(a) If supp(p₀) ⊆ {|x| ≤ R}, where R is the maximum of the radii R in F, G, H, K, then there exist constants M, αᵢ, i=0,1,2, depending on the path of the observation process y(·), such that

    U(x,t) ≥ M exp[−(1/t)(1 + Σ_{i=0}² αᵢφⁱ(x,t))].

(b) If there exist constants M₀, αᵢ⁰>0, i = 0,1,2, such that for all x ∈ ℝⁿ

    p₀(x) ≥ M₀ exp[−Σ_{i=0}² αᵢ⁰φⁱ(x,0)]

then there exist constants M, αᵢ, i=0,1,2, with M depending on the observation path, such that

    U(x,t) ≥ M exp[−Σ_{i=0}² αᵢφⁱ(x,t)].

There are several other cases where the method described here has been successfully applied. We refer to [9] for the details.

We close this section with a class of relatively simple but interesting examples. Consider the problem of a scalar observation of a two-dimensional Wiener process,

    dx(t) = dw(t)

    dy(t) = h(x₁,x₂)dt + db(t),  y(t) ∈ ℝ¹.

Then with F̲ = F̄ = 0, the natural upper and lower bounds of Theorems 6, 7 are of the form

    exp(−A(x₁² + x₂²) − B s(x)).

Examples of pairs of functions (h,s) for which Theorems 5-7 hold include the following: h = (x₁^p + x₂^q)², s = c((1 + x₁^{2p+2})^{1/2} + (1 + x₂^{2q+2})^{1/2}); h = a x₁^{2p} + b x₂^{2p}, s = c ρ^{2p+2}, where ρ = (x₁² + x₂²)^{1/2}. Here p,q are positive integers and a,b,c positive numbers.

REFERENCES

[1] M.H.A. Davis, S.I. Marcus, "An introduction to nonlinear filtering", in Stochastic Systems: The Mathematics of Filtering and Identification and Applications, M. Hazewinkel, J.C. Willems, eds., NATO Adv. Study Inst. Ser., Reidel, Dordrecht, 1981, pp. 53-76.
[2] M. Hazewinkel and J.C. Willems, eds., Stochastic Systems: The Mathematics of Filtering and Identification and Applications, NATO Adv. Study Inst. Ser., Reidel, Dordrecht, 1981.
[3] N.V. Krylov, B.L. Rozovskii, "On the conditional distribution of diffusion processes", Izvestiya Akad. Nauk SSSR, Math. Ser., 42, 1978, pp. 356-378.
[4] E. Pardoux, "Stochastic partial differential equations and filtering of diffusion processes", Stochastics, 3, 1979, p. 127.
[5] D. Michel, "Régularité des lois conditionnelles en théorie du filtrage non-linéaire et calcul des variations stochastique", J. Funct. Anal., 41, 1981, pp. 8-36.
[6] W.H. Fleming, S.K. Mitter, "Optimal control and pathwise nonlinear filtering of nondegenerate diffusions", 20th IEEE Conf. Dec. Contr., San Diego, Dec. 1981.
[7] J.S. Baras, G.L. Blankenship and W.E. Hopkins, Jr., "Existence, uniqueness, and asymptotic behavior of solutions to a class of Zakai equations with unbounded coefficients", to appear in IEEE Trans. Aut. Contr.
[8] W.E. Hopkins, Jr., "Nonlinear filtering of multivariable diffusions with strongly unbounded coefficients", to appear in Proc. of 1982 Conference on Information Sciences and Systems, Princeton Univ.
[9] W.E. Hopkins, Jr., Ph.D. Thesis, Electrical Engineering Dept., University of Maryland, in preparation.
[10] P. Besala, "Fundamental solution and Cauchy problem for a parabolic system with unbounded coefficients", J. Diff. Eqs., 33, 1979, pp. 26-38.
[11] D.W. Stroock and S.R.S. Varadhan, Multidimensional Diffusion Processes, Springer-Verlag, New York, 1979.
[12] H.J. Sussmann, "Rigorous results on the cubic sensor problem", in [2], above.
OPTIMAL STOPPING UNDER PARTIAL OBSERVATIONS
Vaclav E. Benes
Bell Laboratories
Murray Hill, New Jersey 07974

1. Introduction

The area of stochastic control, though long ago supplied by Richard Bellman with the powerful methodology of dynamic programming, is still strewn with unsolved problems. Yes, problems, because although the theory of stochastic control is well and even elegantly formulated, the gritty task of finding or describing optimal control laws languishes, and problems continue to stay generically unsolved, except on a set of empty interior by (shudder!) numerical methods. In the difficult and currently active field of control with incomplete noisy information, solutions are even more sparse; for good Lie algebraic reasons they are limited, with a few exceptions, to linear dynamics and Gauss-Markov processes.

It is therefore important to note that a special but significant subclass of control problems with incomplete noisy information turns out to be more readily approachable than one might expect: these are the problems of optimally stopping a process (say a diffusion x_t) while observing only a function h(x_t) muddied up by white noise, so as to maximize E k(x_τ) for some return function k.
Now stopping a process to maximize an expected return depending on the final value at the stopping time is a problem that has attracted much theoretical interest, and found applications in economics, finance, and inventory theory, inter alia. Here the admissible stopping times are those of the process itself, because one has perfect information. Thus in the problem's usual formulation it is assumed that the process to be stopped is Markov and that it is exactly observed; the task then becomes one of finding a continuation region in which the optimal return is harmonic (in the generalized sense appropriate to the process) while everywhere dominating the return function k, and fitting it smoothly at the (unknown) boundary. These optimality conditions can also be reformulated more analytically as a quasi-variational inequality, convenient for proofs but obscuring the "free boundary" problem.

We pose a version of the stopping problem in which only noisy incomplete information about the past of the process x_t to be stopped is available: the decision to stop must be based on a given initial distribution for x₀ and on a noisy observation dy_t = h(x_t)dt + db_t, consisting of a function h of x_t corrupted by white Gaussian noise. This kind of problem was posed first by Grigelionis,1 and studied extensively by Mazziotto and Szpirglas.2-5
It turns out, not surprisingly, that our form of the problem can be formulated and attacked by means of nonlinear filtering, and that the resulting structure is a kind of infinite-dimensional version of the original problem: the conditional distribution of the state is used as a new state, and constitutes a measure-valued diffusion; again one looks for a function (defined now on the positive cone of a Banach space of measures) which is "harmonic" in a certain continuation region, and dominates a given function. The meaning of 'harmonic' is clarified by the interpretation in terms of filtering: we "Markovize" the problem by taking the conditional distribution of x_t as an infinite-dimensional state variable, with dynamics given by Zakai's equation. It is natural, if not almost obvious, that this conditional measure is a sufficient statistic in the sense that by using only it one can do as well as by using all the available data; so we can expect that a particular optimal stopping time can be defined as the hitting time of a suitable boundary by the conditional distribution. The "harmonic" functions in this setting are those which are annihilated by a natural infinite-dimensional "Laplacian" operator obtained from the Zakai equation. In this way we inject, into the problem of stopping under partial observations, a more analytical method than had been present in the previous work2-4 of Szpirglas and Mazziotto.

We end this introduction with a summary of the rest of the work. Sections 2 and 3 describe a formulation and interpretation of the problem whose neatness owes much to Fleming and Pardoux7 and to Davis and Kohlmann.8,9 In Section 4 we advance some heuristics on the appropriate meaning of 'harmonic' to be used in an analytical approach, leading, in Section 5, to an Itô differential rule. In Section 6 we describe a quasi-variational inequality proposed as equivalent to the stopping problem; the inequality is stated in terms of an infinite-dimensional "Laplacian" advanced in Section 4 as expressing the appropriate meaning of 'harmonic.' There follow a verification lemma in Section 7, a discussion of the regularity of the optimal return in Section 8, a reformulation of the whole problem as a fixed point problem in Section 9, and a characterization of the optimal return as the least superharmonic majorant of the linear functional (k,μ). Section 10 studies the optimal return as a fixed point in a complete lattice, while Section 11 suggests and calculates some examples. The final Section 13 gives the proof of the differential formula.

2. Formulations

It is convenient to give our problem a series of formulations of increasing tractability. We start with a diffusion process observed through noise, described by stochastic DEs

    dx_t = b dt + a dw_t        (state)

    dy_t = h(x_t)dt + db_t      (observation)

and the problem is to maximize the expected return E k(x_τ) over stopping times τ of y, for a given initial measure μ for x₀. The first reformulation consists in eliminating all reference to x_t. Let Y_t = σ{y_s, s ≤ t}, and for a stopping time τ of Y_t, let Y_τ = σ-algebra of events prior to τ. Since

    E{k(x_τ)|Y_τ} = E{k(x_t)|Y_t}|_{t=τ}   a.s.

we can write the criterion as

    E E{k(x_t)|Y_t}|_{t=τ} = E k̂_τ

where k̂_t is the conditional expectation: k̂_t = E{k(x_t)|Y_t}. The process k̂_t is a functional of the past of y, and our problem can be rephrased as the optimal stopping of k̂_t on the basis of Y_t itself. It is now natural to invoke nonlinear filtering to describe the local character or dynamics of k̂_t.

For a starting measure μ, let σ_tμ be the unnormalized conditional measure of x_t given Y_t, defined by the Zakai dynamics

    σ₀μ = μ

    d(f,σ_tμ) = (Af,σ_tμ)dt + (fh,σ_tμ)dy_t

for bounded continuous f in the domain of an elliptic generator

    A = ½ tr a∇²a′ + b′∇.

The notation ( , ) refers to the pairing between the set C_b of bounded continuous functions and the set M of finite regular Borel measures, with the weak topology for M. σ_tμ is a measure-valued diffusion driven by the observation process. In terms of it we can write

    k̂_t = (k,σ_tμ)/(1,σ_tμ)

and state a first reformulation of the original problem as the search for

    sup_τ E (k,σ_τμ)/(1,σ_τμ),   τ a s.t. of Y_t

subject to the Zakai dynamics for σ_tμ, the expectation E now being purely over the Y_t-process.

Finally, on the suggestion of M. H. A. Davis6,8 we can use an idea due to W. H. Fleming and E. Pardoux7 to eliminate the troublesome-looking normalizing denominator (1,σ_tμ) if we replace the measure induced by Y_t by Wiener measure. The final formulation is then this: Find, for w_t Brownian, a stopping time τ* of w_t that achieves

    S(μ) = sup_τ E(k,σ_τμ)

subject to σ₀μ = μ and

    d(f,σ_tμ) = (Af,σ_tμ)dt + (fh,σ_tμ)dw_t.
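As a numerical illustration (entirely our own sketch, not part of the paper: the grid, the function `zakai_step`, and all parameters are hypothetical), the final formulation can be discretized by restricting the state to finitely many points, so that σ_tμ becomes a nonnegative vector driven by a scalar Brownian increment:

```python
import numpy as np

def zakai_step(p, A, h, dw, dt):
    """One Euler step of the unnormalized recursion dp = A^T p dt + (h * p) dw."""
    return p + A.T @ p * dt + h * p * dw

rng = np.random.default_rng(0)
n, dt, steps = 5, 0.01, 200
# generator of a nearest-neighbour random walk on the grid: rows sum to 0
A = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
A[0, 0] = A[-1, -1] = -1.0
h = np.linspace(-1.0, 1.0, n)          # observation function on the grid
k = np.abs(np.linspace(-1.0, 1.0, n))  # return function, values in [0, 1]
p = np.full(n, 1.0 / n)                # initial measure mu (uniform)
for _ in range(steps):
    p = zakai_step(p, A, h, rng.normal(scale=np.sqrt(dt)), dt)
khat = (k @ p) / p.sum()               # normalized conditional expectation k-hat
```

The normalization is done only at read-out, mirroring the paper's move of keeping σ_tμ unnormalized along the way.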

3. Interpretation via convex analysis

The following explanation of S(μ) is adapted from Davis and Kohlmann.8 The functions of the form

    S(μ) = sup_τ E(k,σ_τμ)

are convex, positive homogeneous, and bounded in the sense (‖·‖ = variation norm)

    sup_{‖μ‖=1} |S(μ)| < ∞ for bounded k.

Viewing E(k,σ_τμ) dually as (k^τ,μ) for suitable k^τ, and letting

    K = {k^τ: τ s.t.}

be the "reachable" set, we see that S(μ) is the support function of K, in the language of convex analysis. S(μ) is not changed if the bounded but possibly nonconvex set K is replaced by its convex hull co K, and in terms of S(·), co K is given by

    co K = {f ∈ C_b: (f,μ) ≤ S(μ) for all μ}.

This has a nice control theoretic interpretation: the set of which S is the support function consists of those f such that for any μ, given the choice of (f,μ) pesos now or of taking the maximum expected payoff from the stopping problem, you would go for the latter.

4. Some informed guesses and heuristics

We provide here an informal motivation for the mathematical ideas used in the sequel. It is tempting to conjecture that our problem of stopping with partial information, when reformulated as a complete information problem on a new state space, will have the same kind of structure as the well-known stopping problems governed by free boundary problems or quasi-variational inequalities. This structure is revealed when we view the M⁺-valued diffusion σ_tμ as having a drift associated with the generator A, and a degenerate kind of diffusion proportional to h.

To fix ideas, let us use the density formulation of the Zakai equation. For t > 0, the unnormalized measure σ_tμ has a density p_t which, according to the Zakai equation, moves with "drift" A*p_t, and diffusion "h p_t" when driven by dy_t. This suggests that our problem might also be governed analytically by a second-order operator based on A and h.

For guidance, let us see how a smooth functional U(t,p_t) would develop in time under the Zakai dynamics: expanding in Taylor's series up to terms of order 2, we find

    U(t+dt, p_{t+dt}) = U(t,p_t) + U₁(t,p_t)dt + U₂(t,p_t)[A*p_t dt + h p_t dy_t]
                        + ½ U₂₂(t,p_t)[dp_t,dp_t] + ⋯

Here U₁, U₂, U₂₂ are the ordinary and Fréchet derivatives of U; U₂ is a linear functional acting on a function f to produce U₂(t,p_t)[f], while U₂₂ is a bilinear functional acting on a pair f,g to produce U₂₂(t,p_t)[f,g]. Considerations of quadratic variation indicate that U₂₂ will contribute only through a term U₂₂(t,p_t)[h p_t, h p_t](dy_t)². Thus in p-space the analog of a backward operator on a smooth U is

    L U(p) = ½ U₂₂(p)[h p, h p] + U₂(p)[A* p].

We can expect that under reasonable circumstances the second term can be expressed in an "adjoint" manner: U₂(p) is represented by a function u = u(p) in the domain of A, such that Au is a bounded continuous function, and with ( , ) the natural pairing,

    U₂(p)[A* p] = (Au,p).

Let now V: M⁺ → ℝ be twice continuously Fréchet differentiable with respect to μ (weak topology), and such that its first derivative V₂ is (a linear functional) representable by a function v ∈ D(A), with Av ∈ C_b. We call such functions V smooth. Then with hμ the measure defined by (g,hμ) = (gh,μ), it is natural to associate with the process σ_tμ the operator

    L V(μ) = ½ V₂₂(μ)[hμ,hμ] + (Av,μ)        (v depends on μ)

as a kind of analog of the generators used in finite-dimensional diffusion. Such an interpretation of L is justified by its consequences: (i) L is associated with an Itô differential rule for σ_tμ; (ii) σ_tμ solves a martingale problem posed in terms of L: for suitable test functions F,

    F(σ_tμ) − ∫₀ᵗ LF(σ_sμ)ds

is a martingale when w_t is Brownian; finally, (iii) use of L leads to rather complete analogs of classical results for optimal stopping.

5. A differential rule for σ_tμ

In order to use the method of quasi-variational inequalities, prove a "verification" lemma, and generally, do dynamic programming for partially observed diffusions, it is convenient to have an Itô differential rule for smooth functions of σ_tμ. In the notation of Section 4, such a rule is given in

(1) Itô lemma: Let F: ℝ⁺×M⁺ → ℝ be C¹ in t and Fréchet C² in μ, such that its first μ-derivative F₂ is represented by a function f₂ in the domain of A such that Af₂ ∈ C_b. Then

    F(t,σ_tμ) = F(0,μ) + ∫₀ᵗ F₁(s,σ_sμ)ds + ∫₀ᵗ (Af₂(s,σ_sμ),σ_sμ)ds
                + ½∫₀ᵗ F₂₂(s,σ_sμ)[hσ_sμ, hσ_sμ]ds
                + ∫₀ᵗ (hf₂(s,σ_sμ),σ_sμ)dw_s,        f₂ = f₂(s,σ_sμ).

Proof: The proof is long, not quite standard, and in the Appendix, Section 13.

6. Quasi-variational inequality

We guess, in analogy with the finite-dimensional case, that an appropriate value function for our problem will be a positive homogeneous function V: M⁺ → ℝ in the domain of L, such that

    V(μ) ≥ (k,μ) = ∫ k dμ

    L V(μ) ≤ 0

    L V(μ)·[V(μ) − (k,μ)] = 0.

This quasi-variational inequality (QVI) generalizes the finite-dimensional version to our context. The first condition says that the best return is at least as large as what you get by stopping at once; the other two express optimality. Hence L incorporates the sense of 'harmonic' that is appropriate to the problem. Continuing the analogy we can expect that

    τ* = inf{t: V(σ_tμ) = (k,σ_tμ)}

is an optimal stopping time. The continuation region R is such that V(μ) > (k,μ) and LV = 0, and it is bounded by a surface on which V(μ) = (k,μ). Starting inside R we should calculate σ_tμ and stop when V(σ_tμ) = (k,σ_tμ); when starting outside R, we should stop at once.

7. Verification lemma

With all these preliminaries behind us, our first result is that a smooth solution of the QVI must be the value function of the problem.

(2) Theorem: Let V be a Fréchet-C² function from M⁺ to ℝ such that

(i) V satisfies the QVI
(ii) the linear functional V₂ is representable by a function v = v(μ,·) in the domain of A, with Av ∈ C_b (V is smooth, we say.)

Then

    V(μ) = sup_τ E(k,σ_τμ) = S(μ)

and an optimal stopping time τ* is given by

    τ* = inf{t: V(σ_tμ) = (k,σ_tμ)}.

Proof: Let τ be any stopping time of the Brownian process w_t that drives σ_tμ. By the Itô lemma and the QVI,

    (k,σ_τμ) ≤ V(σ_τμ)
             ≤ V(μ) + ∫₀^τ (Av,σ_sμ)ds + ½∫₀^τ V₂₂(σ_sμ)[hσ_sμ,hσ_sμ]ds + ∫₀^τ (vh,σ_sμ)dw_s
             ≤ V(μ) + supermartingale at τ.

Taking expectations, we find that the return E(k,σ_τμ) obtained by stopping at τ is not greater than V(μ), so V ≥ S. For the reverse inequality, consider the special stopping time τ* = inf{t: V(σ_tμ) = (k,σ_tμ)}. Writing

    V(σ_{τ*}μ) = V(μ) + ∫₀^{τ*} LV(σ_sμ)ds + martingale at τ*

we note that LV(σ_sμ) = 0 prior to τ*, and that by definition of τ*, V(σ_{τ*}μ) = (k,σ_{τ*}μ). Hence taking expectations we see that

    E(return from τ*) = E(k,σ_{τ*}μ) = E V(σ_{τ*}μ) = V(μ)

so τ* achieves the upper bound V(μ), and hence is optimal. Only one smooth V can satisfy the conditions of Theorem (2); if there were two, each would equal S, and so the other.

8. Regularity of S

(3) Remarks: It is of course of prime importance here, as in any other problem governed by a QVI or a Bellman equation, to determine when there exist solutions having the regularity requisite for applicability of these analytical methods. This is already a difficult task in the well-known cases, and it is correspondingly harder in the present problem, if only because of the singularity of L. However, some results on regularity can be obtained directly from the definition of the optimal return S(μ) = sup_τ E(k,σ_τμ).

(4) Lemma: S is lower semi-continuous in the weak topology.

Proof: Let μ_n → μ. For each τ the map μ → E(k,σ_τμ) is bounded and continuous on M, so E(k,σ_τμ_n) → E(k,σ_τμ), and hence for any ε > 0

    S(μ_n) = sup_τ E(k,σ_τμ_n) > S(μ) − ε   eventually.

This result is essentially the observation that the upper envelope of a family of linear functions is l.s.c. "Stronger" continuity can be obtained by strengthening the topology to that of the variational norm ‖·‖.

(5) Lemma: E‖σ_τμ‖ = ‖μ‖ for all μ ∈ M.

Proof: Let μ⁺ − μ⁻ be a Hahn decomposition of μ, so that ‖μ‖ = ‖μ⁺‖ + ‖μ⁻‖. By linearity and positivity of σ_t

    σ_tμ = σ_tμ⁺ − σ_tμ⁻

and since ‖μ^±‖ = (1,μ^±) and A1 = 0, we find

    ‖σ_tμ^±‖ = (1,σ_tμ^±) = (1,μ^±) + ∫₀ᵗ (A1,σ_sμ^±)ds + ∫₀ᵗ (h,σ_sμ^±)dw_s

    ‖σ_τμ‖ = ‖μ‖ + martingale at τ

    E‖σ_τμ‖ = ‖μ‖.
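Lemma (5) can be checked numerically on a finite-state discretization of the Zakai dynamics (a sketch under our own discretization, not the paper's construction): when the generator's rows sum to zero, the discrete analog of A1 = 0, the total mass of the unnormalized vector is a martingale, so its Monte Carlo average stays near ‖μ‖ = 1.

```python
import numpy as np

# p evolves by p <- p + A^T p dt + h*p dw; rows of A sum to 0 (A1 = 0),
# so sum(p) is a discrete martingale and E sum(p_t) = sum(p_0) = 1.
rng = np.random.default_rng(0)
n, dt, steps, paths = 5, 0.01, 50, 4000
A = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
A[0, 0] = A[-1, -1] = -1.0
h = np.linspace(-1.0, 1.0, n)
masses = []
for _ in range(paths):
    p = np.full(n, 1.0 / n)
    for _ in range(steps):
        p = p + A.T @ p * dt + h * p * rng.normal(scale=np.sqrt(dt))
    masses.append(p.sum())
mean_mass = np.mean(masses)  # Monte Carlo estimate of E||sigma_t mu||, close to 1
```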

(6) Lemma: S is Lip_K in the ‖·‖ topology, with K = sup|k|.

Proof: Using Lemma (5), we obtain

    S(μ₁) − S(μ₂) ≤ sup_τ E(k, σ_τμ₁ − σ_τμ₂)
                  ≤ K E‖σ_τ(μ₁−μ₂)‖ = K‖μ₁−μ₂‖.

Similarly, −K‖μ₁−μ₂‖ is a lower bound. This argument is from Davis and Kohlmann.9

9. A fixed point formulation

Consider a positive homogeneous function V: M⁺ → ℝ satisfying the two properties V(μ) ≥ (k,μ) and

(7)    V(μ) = sup_τ E V(σ_τμ) ≡ ΓV(μ).

If we can interpret V(μ) as an achievable average return starting from "knowledge" μ, then the first property says that the policy leading to V is at least as good as stopping at once, while the second says that if you wait for any stopping time τ (while the initial μ moves stochastically under its driving Wiener process to σ_τμ) and then from σ_τμ follow the policy yielding V, then the average return starting from σ_τμ is no larger: you can still do as well as if you had followed the V-policy in the first place. Thus the second property expresses a kind of optimality, since the null stopping time τ = 0 is admissible.

Intuitively, then, we expect (7) to act as a kind of Bellman equation for the noisy stopping problem, with the value function satisfying the two conditions V = ΓV and V(μ) ≥ (k,μ). This guess will be borne out in part by theorems to follow. However, it turns out that these two conditions are only necessary and seemingly not sufficient: such a function V might not be achievable by a stopping policy. Still, we shall show that the smallest such function is the right one, and so characterize the value function (in a manner similar to using the Snell envelope) as the least fixed point of Γ that majorizes (k,μ).

(8) Remark: An analogous equation (and cognate results) can be formulated for the ordinary stopping problems in ℝⁿ: v(x) ≥ k(x) and

    v(x) = sup_τ E v(x_τ),   τ a s.t. of x_t

(the superharmonic property.)
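The finite-state analog of remark (8) is easy to compute (our illustration, not the paper's): for a Markov chain with transition matrix P, the least superharmonic majorant of k, i.e. the Snell envelope, is the fixed point of the iteration v ← max(k, Pv).

```python
import numpy as np

def snell_envelope(P, k, tol=1e-10):
    """Least superharmonic majorant of k: iterate v <- max(k, P v) to a fixed point."""
    v = k.copy()
    while True:
        v_new = np.maximum(k, P @ v)
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new

# symmetric random walk on 4 states, absorbing at both ends;
# reward 1 only at the right end: v is the hitting probability of state 3
P = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 0.0, 1.0]])
k = np.array([0.0, 0.0, 0.0, 1.0])
v = snell_envelope(P, k)   # -> (0, 1/3, 2/3, 1)
```

The computed v dominates k, satisfies v ≥ Pv, and stopping at {v = k} is optimal, exactly mirroring (7) in finite dimensions.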

(9) Remark: From an analytical viewpoint, introduction of Γ begs all the questions, because so much is hidden in the definition of Γ itself, and you cannot easily calculate with it, or iterate it. This is one reason for trying to use the prima facie more explicit operator L. However, it will appear that the inequalities V ≥ ΓV and LV ≤ 0 are both analogous to the superharmonic (excessive) property. We have

(10) Theorem: If V is a Fréchet-C² function with V₂ represented by v ∈ D(A) with Av ∈ C_b, then

    LV ≤ 0   iff   V = ΓV.

Proof: For any stopping time τ

    V(σ_τμ) = V(μ) + ∫₀^τ LV(σ_sμ)ds + martingale at τ.

If LV ≤ 0, taking E and sup on τ gives V ≥ ΓV; the reverse inequality ΓV ≥ V is trivially true, so V = ΓV. Conversely, V ≥ ΓV implies E∫₀^τ LV(σ_sμ)ds ≤ 0 for all stopping times τ. So for τ(ω) ≡ h > 0 we find

    LV(μ) = lim_{h→0} h⁻¹ E∫₀ʰ LV(σ_sμ)ds ≤ 0.

Remark: If LV ≤ 0, then V(σ_tμ) is a supermartingale.


We next show directly that the optimal return S(μ) = sup_τ E(k,σ_τμ) is itself the least fixed point of Γ majorizing (k,μ).

(11) Theorem: S ≥ ΓS, and

    S(μ) = inf{V(μ): V(μ) ≥ (k,μ) and V = ΓV}.

Proof: V = ΓV and V(·) ≥ (k,·) imply

    V(μ) ≥ sup_τ E V(σ_τμ) ≥ sup_τ E(k,σ_τμ) = S(μ).

To show S ≥ ΓS, it is enough to prove that for any s.t. τ̄

    sup_τ E(k,σ_τμ) ≥ E S(σ_τ̄μ).

Let τ̄ be a fixed s.t. The function S is the upper envelope of a family of linear functions: S(μ) = sup_τ E(k,σ_τμ). So S is lower semi-continuous in the weak topology of M, and hence measurable. Let ε > 0 be fixed. By Lusin's theorem, there is a compact subset K of M such that S is continuous on K and Pr{σ_τ̄μ ∈ K} ≥ 1 − ε. For each μ̄ ∈ K there is a neighborhood N_μ̄ such that

    ν ∈ K ∩ N_μ̄ implies S(ν) ≤ S(μ̄) + ε.

This is upper semi-continuity on K. By definition of S there is a stopping time τ(μ̄) such that

    E(k,σ_{τ(μ̄)}μ̄) > S(μ̄) − ε.

On N_μ̄ ∩ K then

    E(k,σ_{τ(μ̄)}ν) > S(ν) − 2ε.

From the relative cover {N_μ̄ ∩ K, μ̄ ∈ K} one can pick a finite subcover N_{μ_i} = N_i with corresponding s.t.s τ(μ_i) = τ_i. Define a partition {A_i} of K by A₁ = N₁ ∩ K,

    A_{n+1} = N_{n+1} ∩ K − ∪_{i=1}^n A_i.

Now we can mimic each τ_i starting at τ̄; that is, there is a stopping time τ̄_i of w_{τ̄+·} − w_τ̄ with the same law as τ_i, and of course independent of F_τ̄ = events prior to τ̄, such that τ̄ + τ̄_i is a s.t. Now set τ′ = 0 on {σ_τ̄μ ∈ K^c} and τ′ = τ̄_i on {σ_τ̄μ ∈ A_i}. Then τ̄ + τ′ is a s.t., and

    E(k,σ_{τ̄+τ′}μ) = E(k,σ_{τ′}σ_τ̄μ)
        ≥ E Σ_i 1_{σ_τ̄μ∈A_i}(k,σ_{τ̄_i}σ_τ̄μ) − ε sup|k|
        ≥ E 1_{σ_τ̄μ∈K} S(σ_τ̄μ) − 2εP{K} − ε sup|k|
        ≥ E S(σ_τ̄μ) − 2εP{K} − 2ε sup|k|.

But ε was arbitrary. To verify that τ′ is a stopping time we appeal to

(12) Lemma: Let τ̄ be a stopping time, F_τ̄ the σ-algebra of events prior to τ̄, and τ_i stopping times of w_{τ̄+·} − w_τ̄. Let B_i be a countable system of disjoint sets from F_τ̄, of total probability 1. Then the r.v. equal to τ̄ + τ_i on B_i is a stopping time.

Proof: Similar to Meyer's T58, p. 74, Ref. 10.

(13) Theorem: If V is smooth, superharmonic (V ≥ ΓV), and such that for

    τ = inf{t: V(σ_tμ) = (k,σ_tμ)}

V(σ_{t∧τ}μ) is a martingale, then S ≥ V.

Proof: From

    (k,σ_τμ) = V(σ_τμ) = V(μ) + ∫₀^τ LV(σ_sμ)ds + martingale at τ

it follows that

    S(μ) ≥ E(k,σ_τμ) = V(μ) + E∫₀^τ LV(σ_sμ)ds.

Similarly

    V(σ_{t∧τ}μ) = V(μ) + ∫₀^{t∧τ} LV(σ_sμ)ds + martingale at τ

so if V(σ_{t∧τ}μ) is a martingale then

    ∫₀^{t∧τ} LV(σ_sμ)ds = 0   a.s.

By monotone convergence and LV ≤ 0 we find E∫₀^τ LV(σ_sμ)ds = 0 and so S ≥ V. Notice that the domination condition V(μ) ≥ (k,μ) is not relevant here.

(14) Theorem: For smooth V with ΓV = V and τ = inf{t: V(σ_tμ) = (k,σ_tμ)}, V(σ_{t∧τ}μ) is a martingale iff LV = 0 on {V(μ) > (k,μ)}.

Proof:

    V(σ_{t∧τ}μ) = V(μ) + ∫₀^{t∧τ} LV(σ_sμ)ds + martingale at t.

If this is a martingale we find ∫₀^{t∧τ} LV(σ_sμ)ds ≡ 0, and for V(μ) > (k,μ)

    LV(μ) = lim_{t↓0} E (t∧τ)⁻¹ ∫₀^{t∧τ} LV(σ_sμ)ds = 0.

The converse is obvious. Note that the domination condition V(μ) ≥ (k,μ), required by the optimal return S, plays no role here.

We can now put together the following multiple characterization of the optimal return S:

(15) Proposition: For a smooth fixed point V of Γ, satisfying V(μ) ≥ (k,μ), the following conditions are equivalent:

(i)   For τ* = inf{t: V(σ_tμ) = (k,σ_tμ)}, V(σ_{t∧τ*}μ) is a martingale
(ii)  LV·(V − (k,μ)) ≡ 0
(iii) V(μ) = inf_{U∈E} U(μ), E = excessive functions, i.e. U ≥ ΓU
(iv)  V = S = sup_τ E(k,σ_τμ).

Proof: Apply (2), (11), (13), and (14).

10. S as a fixed point in a complete lattice

The equation 1/ -----FF" can be studied directly in an abstract context by several methods
of functional analysis. Needless to say, the degrees of difficulty of such approaches are
directly related to the degree of regularity sought for the solution. W e have seen that S is
l.s.c, in the weak topology, and Lip, in the variational norm, of M; differentiability of S
remains an open question, with a negative answer in general, we suspect.
It is convenient to replace the weak topology of M + by the strong topology induced by
the variational norm Ifql, in which it is harder for functions to be continuous. We consider
the Banach lattice of bounded uniformly continuous functions ~: M +---, R with uniform
norm, and pointwise order. The subclass Llpa, a > 0, is defined by the condition
1~0,1)-~0,2)1 - < afl/~l--/*211 -

L e m m a (6) established that S belongs in Lip,, x = sup Ik I.


29

(16) Lemma: r carries Lipa into itself

Proof: If C) E Lipa
( P $ ) ( # I ) -- (F$)#2 -< sup E{C)(~.V.1)--¢(a,/~2)}

sup E IC)(#T~I)-C)(#,~2) I
1"

<_~a sup
I"
E IItTrl~ l--O rlz211

_~< a II#l--bt211

by Lemma (6). Similarly, --a 11#i--#21l is a lower bound.

Remark: Since we know that S is the smallest superharmonic majorant of (k,μ), it is tempting and natural to try to construct it directly as the appropriate, i.e., smallest, fixed point of Γ in a suitable complete lattice, as follows.

We now single out the set K of functions φ defined for μ ≥ 0, with these properties:

(i) φ is positive homogeneous and convex
(ii) ‖μ‖ · sup|k| ≥ φ(μ) ≥ (k, μ)
(iii) φ ∈ Lip_κ, κ = sup|k|.

It can be verified that K is a complete lattice, and that Γ preserves (i)-(iii). Then Γ is an isotone function from K to itself: φ ≤ ψ implies Γφ ≤ Γψ. By the fixed point theorem for complete lattices, there is a φ ∈ K such that φ = Γφ. In particular, if E is the set of functions φ from K that are excessive or superharmonic in the sense φ ≥ Γφ, then

V = inf_{φ ∈ E} φ

is in K and is a fixed point of Γ. The set E is not empty, because ψ defined by ψ(μ) = ‖μ‖ · sup|k| belongs to it and, by Lemma (6), is itself the largest fixed point of Γ in K! To show V ≥ ΓV, we note that

ΓV = Γ inf_{ψ ∈ E} ψ ≤ Γφ ≤ φ   for any φ ∈ E,

so that

(ΓV)(μ) ≤ inf_{φ ∈ E} φ(μ) = V(μ)

Since the value function S belongs to Lip_κ, it follows that V = S.
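The lattice construction above is easy to carry out numerically in finite dimensions. The sketch below is a toy finite-state analogue (all data hypothetical: a three-state chain with transition matrix P standing in for the semigroup, a reward vector g standing in for (k,μ), and a discount factor): iterating the isotone map φ ↦ max(g, βPφ) downward from a large excessive function converges to the smallest excessive majorant of g, the finite-state counterpart of S.

```python
import numpy as np

# Toy finite-state sketch: the smallest excessive majorant of a reward g
# for a Markov chain with transition matrix P, obtained by iterating the
# isotone map Gamma(phi) = max(g, beta * P @ phi) from an excessive start.
def smallest_excessive_majorant(P, g, beta=0.95, tol=1e-10):
    phi = np.full_like(g, g.max() / (1.0 - beta))  # large excessive function
    while True:
        nxt = np.maximum(g, beta * P @ phi)        # one application of Gamma
        if np.max(np.abs(nxt - phi)) < tol:
            return nxt
        phi = nxt

P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.5, 0.3],
              [0.0, 0.4, 0.6]])
g = np.array([1.0, 0.0, 2.0])
S = smallest_excessive_majorant(P, g)
```

By construction the limit S majorizes g and is excessive, S ≥ βPS, mirroring properties (ii)-(iii) of the set K.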


11. Examples

It is known that the infinite-dimensional Zakai equation can have a unique solution describable in finite-dimensional terms. In such cases our problem also reduces to one of ordinary analysis. For an example with nonlinear drift we pick the one-dimensional system

(17) dx_t = tanh x_t dt + dw_t
     dy_t = x_t dt + db_t
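For readers who want to experiment, a sample path of the system (17) is easy to generate. The Euler-Maruyama sketch below is illustrative only; the step size and horizon are arbitrary choices, not from the paper.

```python
import numpy as np

# Euler-Maruyama simulation of (17):
#   dx_t = tanh(x_t) dt + dw_t,   dy_t = x_t dt + db_t
rng = np.random.default_rng(0)
dt, T = 1e-3, 1.0
n = int(T / dt)
x = np.empty(n + 1)
y = np.empty(n + 1)
x[0] = y[0] = 0.0
for i in range(n):
    dw = rng.normal(0.0, np.sqrt(dt))
    db = rng.normal(0.0, np.sqrt(dt))
    x[i + 1] = x[i] + np.tanh(x[i]) * dt + dw   # signal increment
    y[i + 1] = y[i] + x[i] * dt + db            # observation increment
```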

As in Ref. 11 we calculate explicitly that for Borel A ⊂ R

σ_tμ(A) = ∫ μ(dx) sech x · E{1_{x+w_t ∈ A} cosh(x+w_t) exp[∫₀ᵗ (x+w_s) dy_s − ½ ∫₀ᵗ (x+w_s)² ds − t/2]}

= ∫ μ(dx) ∫_A dz sech x cosh z exp{−m′v + ½ v′Rv − O_t}

where

v′ = (0, 1, −1)
dO_t = −(tanh t) O_t dt + tanh t dy_t,   O₀ = 0
ṁ = Am,   m₀ = (x, 0, 0)

A = ⎛ 0    0  0 ⎞
    ⎜ y_t  0  0 ⎟
    ⎝ y_t² 0  0 ⎠

The "statistics" in m, R, and O can be combined into a single vector ξ_t, and a function k̂ defined, so that

∫ k̂(t, ξ_t, x) μ(dx) = (k, σ_tμ)

dξ_t = F_t ξ_t dt + g_t(ξ_t) dy_t

where F_t is a matrix and g_t a vector, neither depending on y. Now introduce an operator B by

B = ∂/∂t + ½ Σ_{i,j} g_i g_j ∂²/∂ξ_i∂ξ_j + Σ_i (Fξ)_i ∂/∂ξ_i

and pose the QVI: to find a function u(t, ξ, μ) such that

Bu ≤ 0

Bu · [u − (k, μ)] = 0

It can be verified that this QVI is exactly equivalent to finding the best admissible stopping time for (17). In particular, the time t functions here as another nonrandom statistic, and

S(μ) = sup_τ E(k, σ_τμ) = u(0, 0, μ)

with the sup achieved by inf{t : u(t, ξ_t, μ) = (k, σ_tμ)}.


The reader is invited to construct a similar finite-dimensional reduction for the drift

x + e^{−x²} / ∫_{−∞}^x e^{−u²} du.

12. Acknowledgements

We note with pleasure the value of discussions with M.H.A. Davis, G. Mazziotto, and L. A. Shepp, and the stimulating influence of the preprint (Ref. 9) by Davis and Kohlmann. W. H. Fleming suggested the invitation that led to presentation of this material at the IFIP Working Conference on Recent Advances in Filtering and Optimization, Cocoyoc, Mexico, February 1-6, 1982.

13. Appendix: Differential formula for F(t, σ_tμ)


The weak topology of M can be metrized as follows: let {φ_n} be a countable set of uniformly continuous functions dense in the bounded uniformly continuous functions on R^d, and write for a metric on M

d(μ, ν) = Σ_{n=1}^∞ w_n |(φ_n, μ) − (φ_n, ν)| / ‖φ_n‖,   μ, ν ∈ M

with {w_n} a sequence of weights to be chosen. There is no loss of generality in supposing that the φ_n are all C² functions of norm 1 in the domain of A. Furthermore we can choose the weights w_n to decrease so fast that

Σ_{n=1}^∞ w_n⁴ 2⁴ⁿ < ∞

When the coefficients in A are bounded, we can further adjust the w_n so that Σ_n w_n |Aφ_n(x)| defines a bounded function.
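The metric is straightforward to compute for discrete measures. The sketch below is illustrative: the test-function family phi_n and the weights w_n = 2^{-4n} are hypothetical choices (the weights are picked so that Σ w_n⁴ 2⁴ⁿ converges, as required later), and measures are represented as (points, weights) pairs.

```python
import numpy as np

def phi(n, x):
    # a countable family of bounded, uniformly continuous functions, sup-norm 1
    return np.cos(n * np.arctan(x))

def d(mu, nu, n_terms=50):
    # d(mu, nu) = sum_n w_n |(phi_n, mu) - (phi_n, nu)|, truncated to n_terms
    xp, wp = mu
    xq, wq = nu
    total = 0.0
    for n in range(1, n_terms + 1):
        wn = 2.0 ** (-4 * n)
        total += wn * abs(np.dot(wp, phi(n, xp)) - np.dot(wq, phi(n, xq)))
    return total

mu = (np.array([0.0, 1.0]), np.array([0.5, 0.5]))
nu = (np.array([0.1, 1.0]), np.array([0.5, 0.5]))
```

Because each pairing (φ_n, ·) is linear in the measure, the homogeneity property (19), d(ν, aν + (1−a)μ) = (1−a) d(μ, ν), holds exactly for this construction.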
Put ĥ_t = (h, σ_tμ)/(1, σ_tμ). If w were really the observed function y, then ĥ_t would be E{h(x_t)|Y_t}. We have the relation

‖σ_tμ‖ ≡ (1, σ_tμ) = ‖μ‖ + ∫₀ᵗ q(ĥ, w)_s ĥ_s dw_s

where q is the Girsanov functional,

q(φ, w)_t = exp{∫₀ᵗ φ_s dw_s − ½ ∫₀ᵗ φ_s² ds}.

By the Kallianpur-Striebel formula, σ_tμ can be written in the form

σ_tμ(A) = ∫ μ(dx) E 1_{x+W_t ∈ A} q(a⁻¹b, W)_t q(h, w)_t

where E represents integration over an independent Brownian motion W, and a⁻¹b = (a⁻¹b)(x+W), h = h(x+W). Further, assuming a⁻¹b is of at most linear growth, q(a⁻¹b, W)_t is in L^p for some p > 1, and E q(a⁻¹b, W)_t = 1. Then by Hölder

σ_tμ(A) ≤ ∫ μ(dx) E^{(p−1)/p}{1_{x+W_t ∈ A} q^{p/(p−1)}(a⁻¹b, W)_t} E^{1/p}{q^p(h, w)_t q(a⁻¹b, W)_t}
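The unit-expectation property of the Girsanov functional used above is easy to check numerically. The sketch below discretizes q(φ, w)_t on a grid and verifies E q ≈ 1 by Monte Carlo; the bounded integrand φ = tanh(w) and all grid parameters are illustrative choices.

```python
import numpy as np

# Discretized Girsanov functional
#   q(phi, w)_t = exp( int_0^t phi_s dw_s - 1/2 int_0^t phi_s^2 ds )
# for a bounded integrand; its mean is 1 (it is a martingale).
rng = np.random.default_rng(1)
dt, n_steps, n_paths = 1e-2, 100, 20000
dw = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
w = np.cumsum(dw, axis=1)

# Ito convention: evaluate the integrand at the left endpoint of each step
phi_left = np.hstack([np.zeros((n_paths, 1)), np.tanh(w[:, :-1])])
ito = np.sum(phi_left * dw, axis=1)          # int phi dw
quad = np.sum(phi_left**2, axis=1) * dt      # int phi^2 ds
q = np.exp(ito - 0.5 * quad)
```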

The first E does not involve w, the second is bounded on t-intervals, and the third has mean

E^w E^W q(a⁻¹b, W)_t q^p(h, w)_t = E q(a⁻¹b, W)_t q(ph, w)_t

also bounded on intervals. Hence there is an increasing sequence of w-sets A_N, of probabilities increasing to 1 as N → ∞, and for each N a class of compact sets {K_t^N ⊂ R^d, t > 0}, such that on A_N

sup_{0≤s≤t} E q(a⁻¹b, W)_s q^p(h, w)_s ≤ N

Thus on A_N the orbit {σ_sμ, 0 ≤ s ≤ t} remains in a weakly compact set K_t^N of positive measures.
As usual, we consider a partition π = {0 = t₀ < t₁ < … < t_n = t} of the interval [0,t]; setting mesh(π) = max_i |t_{i+1} − t_i|, we start with the identity

F(t, σ_tμ) − F(0, μ) = Σ_{i=0}^{n−1} [F(t_{i+1}, μ_{i+1}) − F(t_i, μ_i)] = Σ_{i=0}^{n−1} ΔF_i

where μ_i = σ_{t_i}μ. Next, by the ordinary mean value theorem, for some θ_i ∈ [0,1]

ΔF_i = F(t_{i+1}, μ_{i+1}) − F(t_i, μ_{i+1}) + F(t_i, μ_{i+1}) − F(t_i, μ_i)
     = F₁(t_i, μ_{i+1}) Δt_i + [F₁(t_i + θ_iΔt_i, μ_{i+1}) − F₁(t_i, μ_{i+1})] Δt_i + [F(t_i, μ_{i+1}) − F(t_i, μ_i)]

Except on sets of arbitrarily low probability the orbit s ↦ σ_sμ stays in a compact set, so a standard uniform continuity argument shows that the sum over t_i ∈ π of the second term on the right is o(1) in pr. We turn then to the third term.

By the calculus theorem for Fréchet differentials

F(t_i, μ_{i+1}) − F(t_i, μ_i) = ∫₀¹ F₂(t_i, μ_i + sΔμ_i)[Δμ_i] ds

= F₂(t_i, μ_i)[Δμ_i] + ∫₀¹ ∫₀ˢ F₂₂(t_i, μ_i + uΔμ_i)[Δμ_i, Δμ_i] du ds

= F₂(t_i, μ_i)[Δμ_i] + ½ F₂₂(t_i, μ_i)[Δμ_i, Δμ_i]
+ ∫₀¹ ∫₀ˢ {F₂₂(t_i, μ_i + uΔμ_i) − F₂₂(t_i, μ_i)}[Δμ_i, Δμ_i] du ds

For some s_i ∈ [0,1] the last term is

∫₀^{s_i} {F₂₂(t_i, μ_i + uΔμ_i) − F₂₂(t_i, μ_i)}[Δμ_i, Δμ_i] du,

which in turn is equal, for some u_i ∈ [0, s_i], to

s_i {F₂₂(t_i, μ_i + u_iΔμ_i) − F₂₂(t_i, μ_i)}[Δμ_i, Δμ_i].

Thus we have

F(t_i, μ_{i+1}) − F(t_i, μ_i) = F₂(t_i, μ_i)[Δμ_i] + ½ F₂₂(t_i, μ_i)[Δμ_i, Δμ_i]
(18) + s_i {F₂₂(t_i, μ_i + u_iΔμ_i) − F₂₂(t_i, μ_i)}[Δμ_i, Δμ_i]

We claim that the sum over t_i ∈ π of the third term on the right is o(1) in pr. Thinking of F₂₂ as a map of [0,t] × M⁺ into the linear vector space of bilinear forms B on M × M, such that

|B[μ, ν]| ≤ ‖B‖ d(μ,0) d(ν,0),   ‖B‖ = sup_{μ,ν} |B[μ,ν]| / (d(μ,0) d(ν,0)),

we find that, with ΔF_i² = F₂₂(t_i, μ_i + u_iΔμ_i) − F₂₂(t_i, μ_i),

Σ_{t_i∈π} s_i ΔF_i²[Δμ_i, Δμ_i] ≤ max_{t_i∈π} ‖ΔF_i²‖ Σ_{t_i∈π} d²(0, Δμ_i)

For a ∈ [0,1] the metric d has the homogeneity property

(19) d(ν, aν + (1−a)μ) = (1−a) d(μ, ν)

To proceed further we need a series of technical results:

(20) Lemma: E|∫_s^{s+ε} (φ_n ĥ, σ_uμ) dw_u|⁴ = O(ε²) uniformly in n and 0 ≤ s < t, in an interval.

Proof: First estimate E‖σ_sμ‖⁴ by noting that

‖σ_sμ‖ = ‖μ‖ + ∫₀ˢ q(ĥ, w)_u ĥ_u dw_u

‖σ_sμ‖⁴ ≤ 8‖μ‖⁴ + 8 (∫₀ˢ q(ĥ, w)_u ĥ_u dw_u)⁴

q⁴(ĥ, w)_s = q(4ĥ, w)_s exp{6 ∫₀ˢ ĥ_u² du}

E q⁴(ĥ, w)_s ≤ e^{Ks},   K = 6 sup|h|²

We can now use an inequality of Skorokhod¹² to conclude that

E(∫₀ˢ q(ĥ, w)_u ĥ_u dw_u)⁴ ≤ 36 s ∫₀ˢ E q⁴(ĥ, w)_u ĥ_u⁴ du
≤ const. s(e^{Ks} − 1)

Hence

E‖σ_sμ‖⁴ ≤ 8‖μ‖⁴ + const. s(e^{Ks} − 1)

Then using Skorokhod's result again, we see that

E|∫_s^{s+ε} (φ_n ĥ, σ_uμ) dw_u|⁴ ≤ const. ε ∫_s^{s+ε} E‖σ_uμ‖⁴ du

≤ const. ε ∫_s^{s+ε} [8‖μ‖⁴ + const. u(e^{Ku} − 1)] du

= O(ε²) uniformly in n and s ∈ [0,t].

In the next two lemmas we use the abbreviations Δμ_i = σ_{t_{i+1}}μ − σ_{t_i}μ and d_i = d(σ_{t_{i+1}}μ, σ_{t_i}μ). Thus d_i is the distance, in our special metric for the weak topology of measures on R^d, from σ_{t_{i+1}}μ to σ_{t_i}μ, t_{i+1} > t_i.

(21) Lemma: E Σ_{t_i∈π} d_i² ≤ o(1) + O(t) as mesh(π) → 0

Proof: We have

d_i = Σ_{n=1}^∞ w_n |(φ_n, Δμ_i)|
    = Σ_{n=1}^∞ w_n |∫_{t_i}^{t_{i+1}} (Aφ_n, σ_sμ) ds + ∫_{t_i}^{t_{i+1}} (φ_n h, σ_sμ) dw_s|

Use (a+b)² ≤ 2(a² + b²) and Schwarz's inequality to get

E Σ_{t_i∈π} d_i² ≤ 2E Σ_{t_i∈π} (Σ_n w_n ∫_{t_i}^{t_{i+1}} |(Aφ_n, σ_sμ)| ds)² + 2E Σ_{t_i∈π} (Σ_n w_n |∫_{t_i}^{t_{i+1}} (φ_n h, σ_sμ) dw_s|)²

The Lebesgue integral has quadratic variation zero, and the last sum has mean at most t sup|h|².

(22) Lemma: max_{t_i∈π} d_i → 0 in probability, as mesh(π) → 0

Proof: Write d_i = χ_i + M_i, where χ_i is the Lebesgue-integral part and M_i the sum of stochastic integrals. Evidently, under our choice of w_n, Eχ_i² ≤ const. (t_{i+1}−t_i)², and

Pr{|χ_i| > θ} ≤ θ⁻² const. (t_{i+1}−t_i)²

Pr{max_i |χ_i| > θ} ≤ θ⁻² const. Σ_i (t_{i+1}−t_i)²

≤ θ⁻² const. t mesh(π)

Hence max|χ_i| goes to zero in pr. as π refines. Similarly, using

M_i ≤ Σ_{n=1}^∞ w_n X_{i,n},   X_{i,n} = |∫_{t_i}^{t_{i+1}} (φ_n h, σ_sμ) dw_s|,

we find that by Lemma (20)

Pr{X_{i,n} > θ} ≤ θ⁻⁴ E|X_{i,n}|⁴

Pr{w_n X_{i,n} > 2⁻ⁿθ} ≤ θ⁻⁴ 2⁴ⁿ w_n⁴ E|X_{i,n}|⁴

Pr{Σ_n w_n X_{i,n} > θ} ≤ θ⁻⁴ Σ_n E|X_{i,n}|⁴ 2⁴ⁿ w_n⁴

Pr{max_i |M_i| > θ} ≤ θ⁻⁴ Σ_i Σ_n E|X_{i,n}|⁴ 2⁴ⁿ w_n⁴

≤ θ⁻⁴ const. [Σ_i (t_{i+1}−t_i)²] Σ_n w_n⁴ 2⁴ⁿ

≤ θ⁻⁴ const. t mesh(π)



so that max|M_i| goes to zero in pr.

On [0,t] × K_t^N, F₂₂ is uniformly continuous, and there is a modulus ω_N, a function such that, using (19),

‖F₂₂(t_i, μ_i + u_iΔμ_i) − F₂₂(t_i, μ_i)‖ ≤ ω_N(u_i d_i),

whence on A_N

Σ_{t_i∈π} |ΔF_i²[Δμ_i, Δμ_i]| ≤ ω_N(max_i d_i) Σ_{t_i∈π} d_i²

By (22) and (21) resp., the maximum converges to zero in probability, and the sum Σ d_i² is bounded in the mean uniformly in π. Since ω_N(u) decreases to zero with u, it follows that

Σ_{t_i∈π} 1_{A_N} ΔF_i²[Δμ_i, Δμ_i]

goes to zero in pr.; but P{A_N} → 1, so we can drop the 1_{A_N}. This completes the proof that the third term on the right contributes o(1) in pr. as mesh(π) → 0.
It remains to deal with the first two terms on the right of (18). If F₂ is represented by the function f₂ in the domain of A with Af₂ ∈ C_b, as assumed, then with f_i² = f₂(t_i, μ_i), f_s² = f₂(s, σ_sμ), the first of these terms can be written as

F₂(t_i, μ_i)[Δμ_i] = ∫_{t_i}^{t_{i+1}} (Af_s², σ_sμ) ds + ∫_{t_i}^{t_{i+1}} (h f_s², σ_sμ) dw_s
− ∫_{t_i}^{t_{i+1}} (Af_s² − Af_i², σ_sμ) ds − ∫_{t_i}^{t_{i+1}} (h f_s² − h f_i², σ_sμ) dw_s

Methods exactly like those already used show that the sums on t_i ∈ π of the absolute values of the third and fourth terms on the right go to zero in pr. as mesh(π) → 0. The second term can be written out in the form:

F₂₂(t_i, μ_i)[Δμ_i, Δμ_i] = F₂₂(t_i, μ_i)[h σ_{t_i}μ, h σ_{t_i}μ] Δt_i

(23) + F₂₂(t_i, μ_i)[h σ_{t_i}μ, h σ_{t_i}μ]((Δw_i)² − Δt_i)

+ F₂₂(t_i, μ_i)[∫_{t_i}^{t_{i+1}} h(σ_sμ − σ_{t_i}μ) dw_s, ∫_{t_i}^{t_{i+1}} h(σ_sμ − σ_{t_i}μ) dw_s] + …

where α_i is the signed measure defined by

(24) (α_i, φ) = ∫_{t_i}^{t_{i+1}} (Aφ, σ_sμ) ds

and ∫ h σ_sμ dw_s is the one whose value at a set A is

∫_{t_i}^{t_{i+1}} Σ_{j=1}^{dim h} ∫_{x∈A} h_j(x) (σ_sμ)(dx) dw_s^j

That (24) defines α_i as a signed measure, rather than just a linear functional on a dense subspace, can be seen from

Δμ_i = α_i + ∫_{t_i}^{t_{i+1}} h σ_sμ dw_s.

Again, methods like those used before show that the sums on t_i ∈ π of the absolute values of all the terms of F₂₂(t_i, μ_i)[Δμ_i, Δμ_i] in (23) except the first go to zero in pr. as mesh(π) → 0.

References

1. B. Grigelionis, On sufficient statistics for optimal stopping problems of stochastic processes, Lietuvos Math. Rink., Vol. XI, No. 3, (1971), pp. 529-533.
2. J. Szpirglas and G. Mazziotto, Modèle général de filtrage non linéaire et équations différentielles stochastiques associées, Ann. Inst. H. Poincaré, Vol. XV, No. 2, (1979), pp. 147-173.
3. ——, Théorème de séparation pour l'arrêt optimal, Sém. Prob. XIII, Lecture Notes in Maths. No. 721, Springer Verlag, 1979.
4. ——, Théorème de séparation pour le contrôle impulsionnel, Note technique NT/PAA/ATR/MTI/246, CNET, September 1980.
5. ——, Separation theorems for optimal stopping and impulse control, ibid.
6. M. H. A. Davis, private communication.
7. W. H. Fleming and E. Pardoux, Optimal control for partially observed diffusions, SIAM J. on Control and Optimization, Vol. 20, No. 2, March 1982, pp. 261-285.
8. M. H. A. Davis, Some current issues in stochastic control theory, preprint of paper for Stochastics, Special Issue on Stochastic Optimization (ed. M. H. A. Dempster), based on a lecture at the Taskforce Workshop on Stochastic Optimization, IIASA, December 1980.
9. M. H. A. Davis and M. Kohlmann, On the nonlinear semigroup of stochastic control under partial observations, unfinished manuscript, 1981.
10. P. A. Meyer, Probability and Potentials, Blaisdell, Waltham, 1966.
11. V. E. Beneš, Exact finite dimensional filters for certain diffusions with nonlinear drift, Stochastics, Vol. 5, (1981), pp. 65-92.
12. A. V. Skorokhod, Studies in the Theory of Random Processes, Addison-Wesley, Reading, Mass., 1965, p. 23, Theorem 4.
OPTIMAL CONTROL OF PARTIALLY OBSERVED DIFFUSIONS

A. Bensoussan
University Paris-Dauphine and INRIA
INTRODUCTION.

We discuss in this paper the stochastic maximum principle and dynamic programming approaches to the problem of stochastic control under partial observation. We formulate the problem as a problem of stochastic control for a stochastic partial differential equation, with full observation. The stochastic PDE is the Zakai equation. A formal approach to the stochastic maximum principle has already been given by H. Kwakernaak [7]. Our approach is rigorous. We also present an analytic problem which is a surrogate of Dynamic Programming. The solution of the analytic problem coincides with the value function of the problem. We present here the results and some ideas of the proofs. A detailed version of the paper will appear elsewhere.

1. STUDY OF A STOCHASTIC PARTIAL DIFFERENTIAL EQUATION.

1.1. Notation.

Let:

(1.1) g(x,v): R^n × R^k → R^n, bounded continuous

U_ad will be a compact nonempty subset of R^k (hereafter the set of control values)

(1.2) σ(x): R^n → L(R^n; R^n), bounded uniformly continuous;

Setting a_ij = ½(σσ*)_ij, we assume that:

Σ_ij a_ij ξ_i ξ_j ≥ α|ξ|²,   ∀ξ ∈ R^n, α > 0

We will assume the additional regularity:

(1.3) ∂g_i/∂x_j ∈ L^∞,   ∂a_ij/∂x_j ∈ L^∞,   ∂²a_ij/∂x_i∂x_j ∈ L^∞

We define the operator:

(1.4) A = − Σ_ij a_ij ∂²/∂x_i∂x_j − Σ_i g_i ∂/∂x_i

which we may write in the divergence form

A = − Σ_ij ∂/∂x_i (a_ij ∂/∂x_j) + Σ_i a_i ∂/∂x_i

where

(1.5) a_i = −g_i + Σ_j ∂a_ij/∂x_j

We also consider the adjoint:

(1.6) A* = − Σ_ij ∂²/∂x_i∂x_j (a_ij ·) + Σ_i ∂/∂x_i (g_i ·)

In fact, since g_i, hence a_i, depends on a parameter v, we will index the operators A and A* by v, and write

A^v, A*^v.

Let h(x) be such that:

(1.7) h: R^n → R^d, bounded, h ∈ W^{2,∞}(R^n)

Let (Ω, M, P) be a probability space, on which we can construct a standardized d-dimensional Wiener process, denoted by y(t). We will define

Y^t = σ(y(s), s ≤ t)

We write:

(1.8) H = L²(R^n), V = H¹(R^n), V' = H⁻¹(R^n)

(1.9) L²_Y(0,T;V) = subspace of L²((0,T) × Ω; dt × dP; V) of processes z(t) such that a.e. t, z(t) ∈ L²(Ω, Y^t, P; V).

In (1.9) we can take T = ∞, and replace V by any Hilbert space (in particular we will use it with V replaced by R^k)

1.2. Admissible controls. State equation.

The set of admissible controls is defined as follows:

(1.10) v(·) ∈ L²_Y(0,T;R^k) ∀T finite, a.e. v(t) ∈ U_ad, a.s.

Let

(1.11) ξ ∈ L²(R^n), ξ ≥ 0

For a given control we want to solve the stochastic PDE

(1.12) dp + A*^{v(·)} p dt = p h · dy
       p(0) = ξ,

which we will call the Zakai equation, following the common practice. It is convenient to define:

(1.13) A₀ = − Σ_ij ∂/∂x_i (a_ij ∂/∂x_j)

which does not depend on the control, and

(1.14) B^v φ = Σ_i ∂/∂x_i (a_i(x,v) φ)

Therefore (1.12) can be written as follows:

(1.15) dp + A₀p dt = B^{v(·)} p dt + p h · dy
       p(0) = ξ

We state the following result, which is a variant of the general results of E. Pardoux [9] (see also A. Bensoussan [2]).

Theorem 1.1: We assume (1.1), (1.2), (1.3), (1.11); then for any admissible control v(·) (cf. (1.10)), there exists one and only one solution of (1.15) in the following functional space:

(1.16) p ∈ L²_Y(0,T;V) ∩ L²(Ω,M,P;C(0,T;H)), ∀T finite.

Moreover one has:

(1.17) p ≥ 0
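A crude numerical picture of an equation of Zakai type (1.15) can be obtained by operator splitting: alternate a deterministic Fokker-Planck step with the pointwise multiplicative observation update p ← p·exp(hΔy − ½h²Δt). The 1-D sketch below is purely illustrative: the coefficients (a = ½, zero drift, h(x) = x), the grid, and the synthetic observation increments are assumptions, not data from the paper.

```python
import numpy as np

# Splitting scheme for a 1-D Zakai-type equation:
#   prediction: explicit finite-difference step for p_t = (1/2) p_xx
#   correction: p <- p * exp(h * dy - 0.5 * h^2 * dt)
rng = np.random.default_rng(2)
x = np.linspace(-5.0, 5.0, 201)
dx = x[1] - x[0]
dt, n_steps = 1e-3, 200                      # 0.5*dt/dx^2 = 0.2 <= 0.5: stable
p = np.exp(-x**2 / 2); p /= p.sum() * dx     # initial density (the datum xi)
h = x                                        # observation function h(x) = x
for _ in range(n_steps):
    lap = (np.roll(p, 1) - 2 * p + np.roll(p, -1)) / dx**2
    p = p + 0.5 * dt * lap                   # prediction step
    dy = rng.normal(0.0, np.sqrt(dt))        # synthetic observation increment
    p = p * np.exp(h * dy - 0.5 * h**2 * dt) # correction step

p_norm = p / (p.sum() * dx)                  # normalized conditional density
```

Both half-steps preserve positivity, consistent with (1.17): the explicit heat step is a convex combination under the stability condition, and the correction is multiplication by a positive factor.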

1.3. Additional properties.

Let us notice that for φ ∈ H¹(R^n) we have:

<B^v φ, φ> = ½ ∫_{R^n} Σ_i (∂a_i/∂x_i)(x,v) φ² dx

Setting:

(1.18) ā(x,v) = Σ_i (∂a_i/∂x_i)(x,v) + |h(x)|²

we can write the following energy equality:

(1.19) E|p(T)|² + 2E ∫₀ᵀ <A₀p, p> dt = E ∫₀ᵀ ∫_{R^n} ā(x,v(t)) p²(x,t) dx dt + |ξ|²

also

(1.20) E e^{−2γT}|p(T)|² + 2E ∫₀ᵀ e^{−2γt} <A₀p, p> dt + E ∫₀ᵀ ∫_{R^n} (2γ − ā(x,v(t))) p² e^{−2γt} dx dt = |ξ|²

For a convenient choice of γ, we deduce:

(1.21) E ∫₀^∞ e^{−2γt} ‖p(t)‖² dt < ∞

Consider next the case of a constant control v(·) ≡ v. Let us write p^v_ξ(t) for the solution of (1.15) at time t, emphasizing the dependence with respect to v and to the initial condition. The map:

(1.22) ξ ↦ p^v_ξ(t) belongs to L(H; L²(Ω,M,P;H))

and from (1.20) we have:

(1.23) E|p^v_ξ(t)|² ≤ e^{2γt} |ξ|²

Let us consider the semi-group T^v(t) from H into itself, which is defined by solving the Cauchy problem

(1.24) dz/dt + (A₀ − B^v)z = 0
       z(0) = ξ
       z(t) = T^v(t)ξ

We have the following:



Lemma 1.1: The following representation formula holds:

(1.25) p^v_ξ(t) = T^v(t)ξ + ∫₀ᵗ T^v(t−s)(p(s)h) · dy(s)

We can consider (1.25) as a linear integral equation, whose solution is sought in C(0,T; L²(Ω,M,P;H)), for T fixed. It has a unique solution since, taking ξ = 0, we deduce the estimate:

E|p(t)|² ≤ c_T ∫₀ᵗ E|p(s)|² ds

and by Gronwall's inequality it follows that p = 0.

Let us set:

y^θ(s) = y(s + θ) − y(θ)

which is a standardized Wiener process with respect to Ỹ^s = Y^{s+θ}. In addition, the process y^θ(s) is independent of Y^θ.

Consider now the integral equation analogous to (1.25)(¹)

(1.26) q^v(t) = T(t)ξ + ∫₀ᵗ T(t−s)(q(s)h) · dy^θ(s)

which has a unique solution in, say, C(0,T; L²(Ω,M,P;H)). We can claim that the random variable q^v(t;ξ) with values in H is independent of Y^θ. Moreover we have the following

Proposition 1.1: We have the property:

(1.27) E[F(p^v_ξ(t+θ)) | Y^θ] = E[F(p^v_η(t))]|_{η = p^v_ξ(θ)}

and the process p^v_ξ(t) is a Markov process with values in H.
D
1.4. Linear semi-group.

Since p^v_ξ(t) is a Markov process in H, we can define a linear semi-group operating on the Banach spaces:

B = space of Borel bounded functionals on H

C = space of uniformly continuous bounded functionals on H

setting:

(1.28) Φ^v(t)(F)(ξ) = E[F(p^v_ξ(t))]

(¹) We omit writing the index v.

for F ∈ B; then we have from Proposition 1.1 the semi-group property:

(1.29) Φ^v(t+s) = Φ^v(t) Φ^v(s)

In addition we can assert that:

(1.30) Φ^v(t): C → C.

This follows easily from (1.22), (1.23). Now, since p^v_ξ(t) depends linearly on ξ, it is useful to consider functionals F which are not bounded, and rather have linear growth.

To that extent, we introduce:

B₁ = space of Borel functionals on H which have linear growth.

We put on B₁ the norm:

‖F‖ = sup_ξ |F(ξ)| / (1 + |ξ|)

and B₁ is a Banach space.

Similarly, we define:

C₁ = subspace of B₁ of functionals F such that F(ξ)/(1 + |ξ|) ∈ C.

For F ∈ B₁ or C₁, we have:

|F(p^v(t))| ≤ ‖F‖ (1 + |p^v(t)|)

|E F(p^v(t))| ≤ ‖F‖ (1 + |ξ| e^{γt})

therefore Φ^v(t) ∈ L(B₁,B₁) or L(C₁,C₁), with norm

(1.31) ‖Φ^v(t)‖_{L(B₁;B₁)} ≤ e^{γt}

Hence Φ^v(t) is not a contraction on B₁ (we recall that γ does not depend on v). The semi-group Φ^v(t) has also the following regularity property:

Proposition 1.2: If F satisfies

(1.32) |F(ξ₁) − F(ξ₂)| ≤ ‖F‖_δ |ξ₁ − ξ₂|^δ,   0 < δ ≤ 1

then one has:

(1.33) |Φ^v(t)(F)(ξ₁) − Φ^v(t)(F)(ξ₂)| ≤ ‖F‖_δ e^{γδt} |ξ₁ − ξ₂|^δ □

2. A STOCHASTIC CONTROL PROBLEM.

2.1. Setting of the problem.

Let, for any v ∈ U_ad,

(2.1) f(x,v) ≡ f_v(x) ∈ H,   |f_v|_H ≤ C, independent of v in U_ad

(2.2) g ∈ H

Consider an admissible control v(·) (cf. (1.10)) and let p^{v(·)}(t) be the corresponding solution of (1.15). We define:

(2.3) J(v(·)) = E[∫₀ᵀ (f_{v(t)}, p^{v(·)}(t)) dt + (g, p^{v(·)}(T))]

The integral on the right-hand side of (2.3) is well defined.

2.2. Preliminaries.

We want to minimize the functional (2.3). The existence of an optimal control for (2.3) seems to be a very difficult problem. Introducing a concept of relaxed control, W. Fleming and E. Pardoux [6] were able to obtain an existence theorem. Our objective here is to derive a Maximum Principle. We follow an idea of H. Kwakernaak [7], who derived a Maximum Principle working with the Kushner-Stratonovich equation of the conditional probability. The treatment of Kwakernaak is formal, whereas our derivation is rigorous.

We will assume that:

(2.4) g(x,v) is C² in v and ∂g_i/∂v_j, ∂²g_i/∂v_j∂v_l are bounded, i = 1,…,n, j,l = 1,…,k

(2.5) f(x,v) is C² in v and

∫_{R^n} (∂²f/∂v_i∂v_j (x,v))² dx ≤ constant independent of v

∫_{R^n} (∂f/∂v_i (x,v))² dx ≤ constant independent of v

We will set

f'_v = (∂f/∂v_i (x,v))_{i=1,…,k}

and f'_v can be considered as an element of H^k. We start with some preliminaries. Let u(·) be an optimal control. We denote by p(·) the corresponding state, which is the solution of

(2.6) dp + A₀p dt = B^{u(·)} p dt + p h · dy
      p(0) = ξ
      p ∈ L²_{Y,γ}(0,∞;V) ∩ L²(Ω,M,P;C(0,T;H)) ∀T

where we have denoted

L²_{Y,γ}(0,∞;V) = {z | E ∫₀^∞ e^{−2γt} ‖z(t)‖² dt < ∞, z(t) ∈ L²(Ω,Y^t,P;V) a.e. t}

The fact that p ∈ L²_{Y,γ}(0,∞;V) follows from the energy inequality (derived from (1.20))

2E ∫₀^∞ e^{−2γt} <A₀p, p> dt + E ∫₀^∞ ∫_{R^n} (2γ − ā(x,u(t))) p² e^{−2γt} dx dt ≤ |ξ|²

Since ξ will not vary in this context, we write J(v(·)) for the functional (2.3). We consider it as a functional on the Hilbert space L²_Y(0,T;R^k), and we want to show that it is Gateaux-differentiable. Let v̂(·) ∈ L²_Y(0,T;R^k); we consider the equation:

(2.7) dz + A₀z dt = B^{u(·)} z dt + Σ_i ∂/∂x_i (a_{i,v}(x,u(t)) v̂(t) p) dt + z h · dy
      z(0) = 0
      z ∈ L²_Y(0,T;V) ∩ L²(Ω,M,P;C(0,T;H)).

Provided that v̂(t) is bounded by a deterministic constant, (2.7) has one and only one solution. In fact, we have the energy equality:

(2.8) E e^{−2γT}|z(T)|² + 2E ∫₀ᵀ e^{−2γt} <A₀z, z> dt + E ∫₀ᵀ ∫_{R^n} (2γ − ā(x,u(t))) z² e^{−2γt} dx dt =

= −2E ∫₀ᵀ ∫_{R^n} e^{−2γt} Σ_i a_{i,v}(x,u(t)) v̂(t) p ∂z/∂x_i dx dt

We then state the following:

Lemma 2.1: We have the formula

(2.9) d/dε J(u(·) + εv̂(·))|_{ε=0} = E{∫₀ᵀ [(f'_{u(t)} v̂(t), p(t)) + (f_{u(t)}, z(t))] dt + (g, z(T))}
D
2.3. Maximum Principle.

Let φ ∈ L²_Y(0,T;V'). Consider p̃ to be the solution of:

(2.10) dp̃ + A₀p̃ dt = B^{u(·)} p̃ dt + φ(t) dt + p̃h · dy
       p̃(0) = 0
       p̃ ∈ L²_Y(0,T;V) ∩ C(0,T;L²(Ω,M,P))

From the energy equality:

(2.11) E e^{−2γT}|p̃(T)|² + 2E ∫₀ᵀ e^{−2γt} <A₀p̃, p̃> dt + E ∫₀ᵀ ∫_{R^n} (2γ − ā(x,u(t))) p̃² e^{−2γt} dx dt =

= 2E ∫₀ᵀ <p̃(t), φ(t)> e^{−2γt} dt

it follows that the map φ ↦ p̃

(2.12) is linear and continuous from L²_Y(0,T;V') into L²_Y(0,T;V) ∩ C(0,T;L²(Ω,M,P))

Consider then the functional

φ ↦ E{∫₀ᵀ (f_{u(t)}, p̃(t)) dt + (g, p̃(T))};

it is linear and continuous on L²_Y(0,T;V').

Therefore, there exists one and only one element λ ∈ L²_Y(0,T;V) such that:

(2.13) E{∫₀ᵀ (f_{u(t)}, p̃(t)) dt + (g, p̃(T))} = E ∫₀ᵀ <λ(t), φ(t)> dt

Applying this result to (2.7), we can write (2.9) as follows:

(2.14) d/dε J(u(·) + εv̂(·))|_{ε=0} = E ∫₀ᵀ [(f'_{u(t)} v̂(t), p(t)) + (λ(t), Σ_i ∂/∂x_i (a_{i,v}(x,u(t)) v̂(t) p))] dt =

= E ∫₀ᵀ Σ_j v̂_j(t) [∫_{R^n} (∂f/∂v_j (x,u(t)) + Σ_i ∂λ/∂x_i (x;t) ∂g_i/∂v_j (x,u(t))) p(t,x) dx] dt

where we have used the fact that:

∂a_i/∂v_j = −∂g_i/∂v_j

We can then state the following

Theorem 2.1: We make the assumptions of Theorem 1.1 as well as (2.1), (2.2), (2.4), (2.5). We also assume that

(2.15) U_ad is convex.

Let u(·) be an optimal control for problem (1.15), (2.3) and p the corresponding trajectory, solution of (2.6). Then there exists λ in L²_Y(0,T;V) such that the following condition holds:

(2.16) Σ_{j=1}^k (v_j − u_j(t)) [∫_{R^n} (∂f/∂v_j (x,u(t)) + Σ_{i=1}^n ∂λ/∂x_i (x;t) ∂g_i/∂v_j (x,u(t))) p(t;x) dx] ≥ 0   ∀v ∈ U_ad, a.e. t, a.s.

2.4. Equation for the adjoint process.

We derive here an equation for λ, the adjoint process. Let us introduce ρ to be the solution of

(2.17) −∂ρ/∂t + A₀ρ − Σ_i a_i(x,u(t)) ∂ρ/∂x_i + [½|h|² − Σ_ij a_ij ∂/∂x_i(y(t)·h) ∂/∂x_j(y(t)·h) + Σ_ij ∂/∂x_i (a_ij ∂/∂x_j (y(t)·h)) − Σ_i a_i(x,u(t)) ∂/∂x_i (y(t)·h)] ρ =

= f(x,u(t)) exp y(t)·h(x)

ρ(x,T) = g(x) exp y(T)·h(x)

For a.s. ω, (2.17) has one and only one solution in the functional space:

(2.18) ρ ∈ L²(0,T;V), ∂ρ/∂t ∈ L²(0,T;V').

We will also set:

(2.19) ν(x,t) = ρ(x,t) exp(−y(t)·h(x)).

Of course, we can assert that

(2.20) ν ∈ L²(0,T;V) ∩ C(0,T;H) a.s.

One has to be careful in taking the mathematical expectation. However, we have the following result:

Lemma 2.2: ∀φ ∈ H (deterministic), ∀s ∈ [0,T], (φ, ν(s)) ∈ L²(Ω,M,P) and there exists ν̃(s) ∈ L²(Ω,Y^s,P;H) such that

(2.21) (φ, ν̃(s)) = E[(φ, ν(s)) | Y^s]

Moreover the adjoint process λ satisfies

(2.22) λ(s) = ν̃(s) a.e. s, a.s. □

To proceed we will need an additional regularity property of ν(s); namely, we have:

Lemma 2.3: The process ν(s) satisfies

(2.23) E|ν(s)|² ≤ C, ∀s ∈ [0,T]. □

The previous result can be strengthened as follows:
The previous result can be strenghtened as follows :

Lemma 2.4: Assume that

(2.24) ∂h/∂x_k ∈ L^∞

and

(2.25) g ∈ V.

Then the process ν̃(s) satisfies

(2.26) E‖ν̃(s)‖² ≤ C □

We then deduce that:

Lemma 2.5: Set

μ(x,t) = ν̃(x,t) exp y(t)·h(x);

then one has

(2.27) E‖μ(t)‖² ≤ C

(2.28) E^{Y^t} ∂μ/∂t ∈ L²(0,T;L²(Ω,M,P;V')). □

Lemma 2.6: There exists r ∈ L²_Y(0,T;V'^d) such that

(2.29) m(t) = μ(t) − Eμ(0) − ∫₀ᵗ E^{Y^s} ∂μ/∂s ds = ∫₀ᵗ r(s) dy □

We can now state the following:

Theorem 2.2: We make the assumptions of Theorem 2.1, (2.24), (2.25). Then the adjoint process λ defined in Theorem 2.1 satisfies:

(2.30) λ ∈ L²_Y(0,T;V) ∩ L^∞(0,T;L²(Ω,M,P;V))

λ(t) exp y(t)·h ∈ L²(Ω,M,P;C(0,T;V'))

dλ + (A₀λ + Σ_i a_i(x,u(t)) ∂λ/∂x_i + ½λ|h|²) dt = λh · dy − exp(−y(t)·h) r(t) · dy + (f(x,u(t)) + exp(−y(t)·h) r(t)·h) dt

λ(x,T) = g(x)

Moreover, there exists one and only one pair (λ, r) with r ∈ L²_Y(0,T;V'^d) such that (2.30) holds. □

3. SEMI-GROUP ENVELOPE AND APPLICATIONS TO STOCHASTIC CONTROL WITH PARTIAL INFORMATION.

3.1. Setting of the problem.

We go back to the notation of section 1 and consider the family of semi-groups Φ^v(t)(F) on B₁ and C₁ defined in (1.28). Let f_v be as in (2.1). We identify it with the element of C₁,

(3.1) f_v(ξ) = (f_v, ξ) ∀ξ ∈ H.

We take

(3.2) β > γ

where γ has been chosen in §1.3.

We consider the following problem, called the problem of semi-group envelope. This problem, considered in A. Bensoussan - M. Robin [3], is closely connected to the approach of M. Nisio [8], who introduced a nonlinear semi-group associated with stochastic control. As we shall see, the framework fits perfectly with the control problem for the Zakai equation considered in section 2 (although we consider here an infinite horizon version of the problem). For different semi-group approaches we refer to W. Fleming [5] and M.H.A. Davis, M. Kohlmann [4].

We introduce the set of functions S(ξ):

(3.3) S ∈ C₁

S ≤ ∫₀ᵗ e^{−βs} Φ^v(s) f_v ds + e^{−βt} Φ^v(t) S   ∀t ≥ 0

Our objective is to study the structure of the set (3.3).

3.2. Preliminaries.

We give here some useful additional properties of the semi-group Φ^v(t)(F).

Lemma 3.1: We have the property

(3.4) t ↦ Φ^v(t)(F)(ξ) ∈ C[0,∞), ∀ξ ∈ H, ∀F ∈ C₁. □

Lemma 3.1 and (3.2) imply in particular that the integral

(3.5) ∫₀^∞ e^{−βt} Φ^v(t) f_v dt ∈ C₁

Let h be a parameter which will tend to 0. We define the family of operators

(3.6) T_h(F) = Min_{v ∈ U_ad} [∫₀ʰ e^{−βt} Φ^v(t) f_v dt + e^{−βh} Φ^v(h)(F)],   F ∈ C₁.

We assume

(3.7) v ↦ f_v is continuous from U_ad into H, and bounded.

Lemma 3.2: The operator T_h maps C₁ into itself. □

3.3. Approximation.

We solve the following equation:

(3.8) S_h = T_h(S_h),   S_h ∈ C₁.

Lemma 3.3: Equation (3.8) has one and only one solution.

Lemma 3.4: The solution S_h is uniformly Lipschitz and

(3.9) |S_h(ξ) − S_h(ξ')| ≤ (C/(β−γ)) |ξ − ξ'|

where C is the bound on |f_v|_H, ∀v.


D
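The fixed-point equation (3.8) has a transparent finite-state caricature, which the sketch below implements. All data are hypothetical: a stochastic matrix Phi_v stands in for the semi-group Φ^v(h) at each of two controls, and a cost vector f stands in for ∫₀ʰ e^{−βt}Φ^v(t)f_v dt. Because the discount factor e^{−βh} is less than 1, the operator T_h is a contraction and Banach's iteration converges to S_h.

```python
import numpy as np

# Solve S_h = T_h(S_h), with
#   T_h(F) = min_v [ f_v + exp(-beta*h) * Phi_v @ F ]
# by fixed-point (value) iteration; convergence follows from contraction.
def solve_Sh(Phi, f, beta, hstep, tol=1e-12):
    # Phi: (n_controls, n, n) stochastic matrices; f: (n_controls, n) costs
    disc = np.exp(-beta * hstep)
    S = np.zeros(f.shape[1])
    while True:
        candidates = f + disc * np.einsum('vij,j->vi', Phi, S)
        nxt = candidates.min(axis=0)       # minimize over controls
        if np.max(np.abs(nxt - S)) < tol:
            return nxt
        S = nxt

Phi = np.array([[[0.9, 0.1], [0.3, 0.7]],
                [[0.5, 0.5], [0.5, 0.5]]])
f = np.array([[1.0, 2.0], [1.5, 0.5]])
S_h = solve_Sh(Phi, f, beta=1.0, hstep=0.1)
```

The Lipschitz bound (3.9) has the same mechanism here: each iterate contracts differences by e^{−βh}, so the limit inherits a uniform modulus independent of h.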
We can then state the following:

Theorem 3.1: We assume (1.1), (1.2), (1.3), (1.7), (3.2), (3.7). Then the set (3.3) is not empty and has a maximum element, which is moreover uniformly Lipschitz.
D
3.4. Interpretation of the maximum element.

Let us consider the functional (2.3) over an infinite horizon, with discount β, namely:

(3.10) J(v(·)) = E ∫₀^∞ e^{−βt} (f_{v(t)}, p^{v(·)}(t)) dt

where v(·) is an admissible control and p^{v(·)}(t) is the solution of (1.15).

We denote by W the class of step processes adapted to Y^t with values in U_ad. More precisely, if v(·) ∈ W, there exists a sequence

τ₀ = 0 < τ₁ < … < τ_n < …

which is deterministic, increasing and convergent to +∞, such that

v(t;ω) = v_n(ω),   t ∈ [τ_n, τ_{n+1})

and v_n is Y^{τ_n} measurable.

We have the following:

Lemma 3.5: Let φ ∈ B and v(·) ∈ W. Then for τ_n ≤ t < τ_{n+1}, one has

(3.11) E[φ(p^{v(·)}(t)) | Y^{τ_n}] = Φ^{v_n}(t − τ_n) φ(p^{v(·)}(τ_n)). □

Define for v(·) ∈ W

(3.12) p_n = p^{v(·)}(τ_n)

(3.13) f_{h,v} = ∫₀ʰ e^{−βt} Φ^v(t) f_v dt

From Lemma 3.5, which can be extended to φ ∈ C₁, we deduce

E ∫_{τ_n}^{τ_{n+1}} e^{−βt} (f_{v_n}, p^{v(·)}(t)) dt = E e^{−βτ_n} ∫₀^{τ_{n+1}−τ_n} e^{−βt} Φ^{v_n}(t) f_{v_n}(p_n) dt = E e^{−βτ_n} f_{τ_{n+1}−τ_n, v_n}(p_n)

hence finally the formula:

(3.14) J(v(·)) = E Σ_{n=0}^∞ e^{−βτ_n} f_{τ_{n+1}−τ_n, v_n}(p_n).

Let us consider the subset of W

W_h = {v(·) ∈ W | τ_n = nh};

then we have the following:

Theorem 3.2: The assumptions are those of Theorem 3.1. We then have:

(3.15) S_h(ξ) = Inf_{v(·) ∈ W_h} J(v(·)).

We finally obtain the interpretation of the maximum element, still denoted by S, of the set (3.3). We have:

Theorem 3.3: The assumptions are those of Theorem 3.1 and (2.15). Then we have:

(3.16) S(ξ) = Inf_{v(·)} J(v(·)).

REFERENCES.

[1] A. Bensoussan, Filtrage Optimal des Systèmes Linéaires, Dunod, Paris, 1971.
[2] A. Bensoussan, Control of Stochastic Partial Differential Equations, in Distributed Parameter Systems, edited by W.H. Ray and D.G. Lainiotis, Marcel Dekker, N.Y., 1978.
[3] A. Bensoussan; M. Robin, On the Convergence of the Discrete Time Dynamic Programming Equation for General Semi-groups, SIAM J. Control, to be published.
[4] M.H.A. Davis; M. Kohlmann, On the Nonlinear Semi-group of Stochastic Control under Partial Observations, submitted to SIAM J. Control.
[5] W.H. Fleming, Nonlinear Semi-group for Controlled Partially Observed Diffusions, preprint.
[6] W.H. Fleming; E. Pardoux, Optimal Control for Partially Observed Diffusions, submitted to SIAM J. Control.
[7] H. Kwakernaak, A Minimum Principle for Stochastic Control Problems with Output Feedback, Systems and Control Letters, Vol. 1, No. 1 (July 1981).
[8] M. Nisio, On a Nonlinear Semi-group Attached to Stochastic Optimal Control, RIMS, Kyoto University, 13 (1976), 513-537.
[9] E. Pardoux, Stochastic Partial Differential Equations and Filtering of Diffusion Processes, Stochastics, 1979, Vol. 3, pp. 127-167.
ACCURATE EVALUATION OF CONDITIONAL DENSITIES IN NONLINEAR FILTERING

W.E. Hopkins, Jr., G.L. Blankenship and J.S. Baras

Electrical Engineering Department
University of Maryland
College Park, Maryland 20742

ABSTRACT

Using some approximation formulas for stochastic Wiener function space integrals, it is possible to approximate the conditional densities which arise in the nonlinear filtering of diffusion processes to within O(n⁻²), with n > 1 arbitrary, by n-fold ordinary integrals. The latter have the simple form of a "rectangular rule", but their accuracy is an order of magnitude better. The n-fold integral can be further decomposed into a recursion involving n one dimensional integrals. The sequence is recursive in the increments of the observation process in the filtering problem. It is not, however, recursive in time. The one dimensional integrals are naturally treated by an m-step Gaussian quadrature which has an error proportional to n m!/[2^m (2m)!]. (The proportionality constant can be estimated and optimized.) The computation of these individual integrals can be reduced further by exploiting certain inherent symmetries of the problem, and by doing some preliminary, "off-line" computing. The end result is a highly accurate, computationally efficient numerical algorithm for evaluating conditional densities for a substantial class of nonlinear filtering problems. By accepting slight reductions in accuracy, one can obtain an algorithm (apparently) fast enough, when efficiently coded, for "on-line," recursive filtering in real time.

1. THE PROBLEM

Consider the stochastic dynamical system

dx(t) = f(x(t))dt + g(x(t))dv(t)                                (1.1a)

dy(t) = h(x(t))dt + dw(t)                                       (1.1b)

x(0) = x_0 ~ p_0(x),  0 <= t <= T

written in the Ito calculus. Here f, g, h are smooth functions mapping
R into R; v(t), w(t) are independent, R-valued standard Wiener processes
mutually independent of the initial condition x_0, which has a smooth,
bounded density p_0(x). We call x the signal and y the observation
process. The filtering problem is to determine an estimate of x(t)
given observations y(s), 0 <= s <= t, or, more precisely, the sigma-algebra Y_t
generated by {y(s), 0 <= s <= t}. For the estimate one usually takes the
conditional mean x̂(t) = E[x(t)|Y_t], since this produces the minimum
mean square error.
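The pair (1.1) is easy to simulate directly. The sketch below generates one sample path by the Euler-Maruyama discretization; the particular choices of f, g, h and the step count are illustrative assumptions, not data from this paper.

```python
import numpy as np

def simulate(f, g, h, x0, T=1.0, n=1000, rng=None):
    """Simulate one path of the signal/observation pair (1.1) by Euler-Maruyama."""
    rng = np.random.default_rng(rng)
    dt = T / n
    x = np.empty(n + 1)
    y = np.empty(n + 1)
    x[0], y[0] = x0, 0.0
    dv = rng.normal(0.0, np.sqrt(dt), n)   # increments of the signal noise v
    dw = rng.normal(0.0, np.sqrt(dt), n)   # increments of the observation noise w
    for i in range(n):
        x[i + 1] = x[i] + f(x[i]) * dt + g(x[i]) * dv[i]
        y[i + 1] = y[i] + h(x[i]) * dt + dw[i]
    return x, y

# Illustrative "cubic sensor"-type data: f = -x, g = 1, h = x**3.
x, y = simulate(lambda x: -x, lambda x: 1.0, lambda x: x**3, x0=0.5, rng=0)
```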
To compute x̂(t), it suffices to know p(t,x|Y_t), the conditional density
of x(t) given Y_t, if it exists. This satisfies a complex, nonlinear
stochastic partial differential equation which is difficult to analyze
or to treat numerically [1], [2]. Alternately, one can write

p(t,x|Y_t) = u(t,x) / [ int_R u(t,z) dz ]                       (1.2)

where the unnormalized conditional density u satisfies

∂u(t,x) = [ (1/2) ∂^2/∂x^2 (g^2 u) - ∂/∂x (f u) - (1/2) h^2 u ] dt
          + h(x) u ∂y(t)                                        (1.3)

u(0,x) = p_0(x),  0 <= t <= T

(written in the Stratonovich calculus), a linear stochastic PDE
discovered by M. Zakai [3] (and independently by R. Mortensen and T.
Duncan - see the remarks in [3]). The solution to this equation may
be written in terms of a function space integral. Specifically,

u(t,x) = E_x{ exp[ int_0^t h(x(s)) dy(s) - (1/2) int_0^t h^2(x(s)) ds ] p_0(x(t)) }   (1.4)

where E_x{.} is expectation over the paths of (1.1) starting at x(0) = x.
That is, (1.4) is formally the solution of (1.3), as a short calculation
using the Stratonovich calculus shows. (Existence and uniqueness of
solutions to (1.3) are discussed in [4], among other papers.)
Our objective here is the evaluation of the function space integral
(1.4) by quadrature type approximations.
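Before developing the quadrature approach, note that (1.4) also suggests a crude Monte Carlo evaluation: average the exponential weight over independent copies of the signal. The sketch below assumes a left-endpoint discretization of the stochastic integral and Euler-Maruyama signal paths; it is for orientation only, and the test case (h = 0, f = 0, g = 1, Gaussian p_0) is our own choice.

```python
import numpy as np

def unnormalized_density(x, y_incr, f, g, h, p0, T=1.0, n_paths=2000, rng=None):
    """Monte Carlo estimate of the Kac-type formula (1.4):
    u(t,x) = E_x{ exp[ int h dy - 1/2 int h^2 ds ] p0(x(t)) },
    with the path expectation replaced by an average over Euler-Maruyama
    copies of the signal started at x, and the stochastic integral by a
    left-point Riemann sum against the recorded increments y_incr."""
    rng = np.random.default_rng(rng)
    n = len(y_incr)
    dt = T / n
    xs = np.full(n_paths, float(x))
    expo = np.zeros(n_paths)
    for i in range(n):
        expo += h(xs) * y_incr[i] - 0.5 * h(xs) ** 2 * dt
        xs = xs + f(xs) * dt + g(xs) * rng.normal(0.0, np.sqrt(dt), n_paths)
    return np.mean(np.exp(expo) * p0(xs))

# Sanity check with h = 0 (no observations): u(1,0) = E[p0(x(1))], which for
# f = 0, g = 1 and standard normal p0 is the N(0,2) density at 0, 1/sqrt(4*pi).
val = unnormalized_density(
    0.0, np.zeros(200),
    f=lambda x: 0.0, g=lambda x: 1.0, h=lambda x: 0.0,
    p0=lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi),
    T=1.0, n_paths=8000, rng=1,
)
```

The slow Monte Carlo rate is precisely what the deterministic quadrature developed below is designed to avoid.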

We use ∂u, ∂y, etc. to signify the Stratonovich calculus.



The first step of the approximation is carried out by regarding (a
transformed version of) (1.4) as a "stochastic Wiener integral" and
applying the formulas in [5]. This leads to an approximation of the
function space integral by an n-fold ordinary integral with an error
O(n^-2). The second step of the approximation is reduction of the n-fold
integral to a recursive sequence of one-dimensional integrals which
may be evaluated by Gaussian quadrature.

To describe the formulas in [5], we will use the following setup: Let
C_0([0,t]) be the space of R-valued, continuous functions x(s) on [0,t]
with x(0) = 0, and let W be Wiener measure on C_0. If F: C_0 -> R is a smooth
functional, the Wiener integral

I = int_{C_0} F(x) dW(x)                                        (1.5)

is defined as the sequential limit [6]

I = lim_{max_{1<=j<=n} |t_j - t_{j-1}| -> 0} int_R ... int_R da_1 ... da_n F(Z_s x)

    prod_{j=1}^n exp[-(a_j - a_{j-1})^2 / 2(t_j - t_{j-1})] / [2π(t_j - t_{j-1})]^{1/2}   (1.6)

where 0 < t_1 < t_2 < ... < t_n = t and Z_s x is the polygonal function on [0,t]
passing through 0 at s = 0 and through a_j at t_j, j = 1,2,...,n.

Except for a few simple cases, it is impossible to evaluate Wiener
integrals explicitly. Approximation formulas suitable for numerical
computations are, therefore, of considerable interest in applications.
In [7] A. J. Chorin presented some formulas of this type. His results
were based on the use of parabolas to interpolate the Wiener paths
and on expansions of the nonlinear functional F in a Taylor series,
with the quadrature formula adjusted to optimize the approximation of
the first two terms. Chorin's formulas were of the form

int_{C_0} F(x) dW(x) = π^{-n/2} int_{R^n} F_n(u_1,...,u_n)
    exp[-u_1^2 - ... - u_n^2] du_1 ... du_n + O(n^-2)           (1.7)

where F_n had the simple form of a rectangle rule in specific cases.
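The gain promised by such formulas can be seen on a functional whose Wiener integral is known in closed form. For F(x) = exp(int_0^1 x(s) ds) one has I = e^{1/6}, and because the functional is Gaussian the n-fold "rectangle rule" integral can itself be evaluated exactly, so different choices of evaluation point can be compared. This toy check is our own illustration, not a computation from [7]:

```python
import numpy as np

def rect_rule(n, midpoint=False):
    """n-fold rectangular-rule value of I = int_{C_0} exp(int_0^1 x(s) ds) dW(x).

    The rule replaces int_0^1 x ds by (1/n) * (sum of path values at n
    evaluation points); that sum is Gaussian under Wiener measure, so the
    n-fold integral collapses to exp(variance/2) and is computed exactly.
    """
    t = (np.arange(1, n + 1) - (0.5 if midpoint else 0.0)) / n
    cov = np.minimum.outer(t, t)    # Cov(x(t_j), x(t_k)) = min(t_j, t_k)
    return np.exp(0.5 * cov.sum() / n**2)

exact = np.exp(1.0 / 6.0)           # int_0^1 x(s) ds ~ N(0, 1/3) under W
err_end = abs(rect_rule(100) - exact)                  # endpoint rule: O(1/n)
err_mid = abs(rect_rule(100, midpoint=True) - exact)   # midpoint rule: O(1/n^2)
```

With n = 100 the endpoint rule is off by about 3e-3 while the midpoint variant is off by about 1e-5, matching the order-of-magnitude gain claimed for the Chorin-type formulas.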

The function space integral in (1.4) involves a random (Ito) process
{y(s), 0 <= s <= t}, and the formulas of Chorin do not apply to it. In [5]
these formulas were adapted and extended to cover this case. The result
we need is the following:

Lemma. Let {w(s), 0 <= s <= t} be an R-valued standard Wiener process on
(Omega, F, P), and let {f(s), g(s), 0 <= s <= t} be R-valued random processes non-
anticipative with respect to w which have continuous paths almost surely
and second moments uniformly bounded in s in [0,t]. Let

dy(s) = f(s) ds + g(s) dw(s)
                                                                (1.8)
y(0) = 0,  0 <= s <= t.

Suppose V: R -> R has derivatives up to order 4 with

int_t^{t+1/n} |V''''(z(s))|^2 ds = O(n^-1)                      (1.9)

for all t in [0,T] and any continuous z: [0,T] -> R. Then for any t in [0,T]

I = int_{C_0} exp[ int_0^t V(x(s)) dy(s) ] dW(x)

  = (2π)^{-n/2} int_{R^n} exp[ sum_{i=1}^n V(x_{i-1} + (t/(2 sqrt(n))) u_i) Δy_{i-1} ]   (1.10)

    . exp[-(1/2)(u_1^2 + ... + u_n^2)] du_1 ... du_n + e_n

where

x_i = t(u_1 + ... + u_i)/sqrt(n),  i = 1,2,...,n
                                                                (1.11)
t_i = i t/n,  Δy_{i-1} = y(t_i) - y(t_{i-1})

The approximation error is

(E e_n^2)^{1/2} = O(n^-2)                                       (1.12)

where E is expectation with respect to the distribution of {y(s), 0 <= s <= t}.

Remarks 1. The "ordinary" integral which appears in (1.4) admits a
similar approximation with Δt = t/n replacing Δy_{i-1} in (1.10). The
error, which is deterministic, is also O(n^-2).
     2. The remarkable feature of formula (1.10), as noted in [7],
is that it is no more complicated in structure, nor does it require more
computing effort, than the standard "rectangle rule" [8] which has
accuracy O(n^-1).
     3. The evaluation of the n-fold integral in (1.10) may be
reduced to a sequence of one-dimensional integrals which are recursive
in the increments Δy_{i-1}. This has some important implications in the
filtering problem.
     4. The simple form of (1.10), the error estimate, and the
recursive evaluation depend on the fact that the underlying measure
W is Wiener measure. Since the process x(t) in (1.1) is not a Wiener

process, it is necessary to make a change of coordinates or a change
of measure (i.e., a Girsanov transformation) in (1.1) to take advantage
of this structure.

2. COORDINATE TRANSFORMATIONS

If the coefficients in (1.1a) are sufficiently smooth, we can change
coordinates (pathwise) so that the resulting diffusion is a Wiener
process. When the coefficients are not smooth or when x(t) is a multi-
variable process, this procedure may not work, and a change of measure,
i.e., a Girsanov transformation, may be required to implement our compu-
tational algorithms.

Suppose

(A1) g(x) >= g_0 > 0 for some g_0 and all x in R,

     int_0^∞ dx/g(x) = int_{-∞}^0 dx/g(x) = +∞

(A2) p_0(x) = exp[φ_0(x)]

(A3) f in C^1(R), g in C^2(R)

Consider the change of coordinates in (1.1)

z(t) = ψ[x(t)] = int_0^{x(t)} dx/g(x)                           (2.1)

Using Ito's formula,

dz(t) = (f/g - (1/2)g')(ψ^{-1}[z(t)]) dt + dv(t)
                                                                (2.2)
dy(t) = h(ψ^{-1}[z(t)]) dt + dw(t)

The associated Zakai equation is

∂u(t,z) = { (1/2) u_zz + [(1/2)g' - f/g](ψ^{-1}(z)) u_z
            + [g((1/2)g'' - (f/g)') - (1/2)h^2](ψ^{-1}(z)) u } dt      (2.3)
            + h(ψ^{-1}(z)) u ∂y(t)

(written in the Stratonovich calculus). We can eliminate the first
order term in (2.3) by using the exponential transformation

v(t,z) = u(t,z) exp[-φ(z)]
                                                                (2.4)
φ(z) = int_0^z (f/g - (1/2)g')(ψ^{-1}(x)) dx

The equation for v(t,z) is

∂v(t,z) = [ (1/2) v_zz(t,z) - V(z) v(t,z) ] dt + H(z) v ∂y(t)   (2.5)


where

V(z) = (1/2)[ h^2 + (f/g - (1/2)g')^2 + g(f/g - (1/2)g')' ](ψ^{-1}(z))
                                                                (2.6)
H(z) = h(ψ^{-1}(z))

Since the Laplacian in (2.5) is "isolated," the fundamental solution
of (2.5) involves Wiener measure, and we can apply the formulas of [5]
to evaluate it. This is done in the next section.


Instead of changing coordinates in (I.I) we could have changed the
coordinates in the Zakai equation 41.3) an@ then used an exponential
transformation to e l i m i n a t e the drift term. This would also lead to an
equation like 42.5), hut involving different functions ~(z), V(z).
While this procedure is p e r f e c t l y valid in the context of n u m e r i c a l
studies, the resulting equation cannot, in g e n e r a l , be a s s o c i a t e d with
a nonlinear filtering problem like (I.i), and we shall not pursue it
further.

In s i t u a t i o n s where f and g are not smooth, i.e., (A3) does not hold,
then we m u s t consider some more general type of w e a k transformation to
accomplish the reduction of the Zakai equation. Also, when x(t) is an
Rd-valued process with d~2, it w i l l be n e c e s s a r y to h a v e the integrand
in 42.4) be a g r a d i e n t . That is, with g(x~ dxd and non-singular
-I I
g (x)f(x) - ~ g ' ( x ) w i l l have to be a g r a d i e n t for the e x p o n e n t i a l
transformation (2.A) to h a v e a meaning.

3. APPROXIMATE EVALUATION OF THE CONDITIONAL DENSITY

Using the Feynman-Kac formula (or the Kallianpur-Striebel formula, as it
is called in filtering theory), we can write the solution to (2.5) as

v(t,z) = E_z{ exp[ int_0^t H(z(s)) dy(s) - (1/2) int_0^t H^2(z(s)) ds ]
                                                                (3.1)
            . exp[ -int_0^t V(z(s)) ds + (φ_0 - φ)(ψ^{-1}(z(t))) ] }

where E_z is expectation over Brownian paths starting at z(0) = z.
Applying the formulas of Chorin and the Lemma, we can write

v(t,z) = I_n(t,z) + O(n^-2)                                     (3.2)

where

I_n(t,z) = (2π)^{-n/2} int_{R^n} K_n(t,z,r,y) dr                (3.3a)

y = (y_0, y_1, ..., y_{n-1}),  y_i = y(t_{i+1}) - y(t_i)
                                                                (3.3b)
r = (r_0, r_1, ..., r_{n-1}),  t_i = i t/n,  i = 0,1,...,n-1

K_n(t,z,r,y) = exp{ sum_{i=0}^{n-1} [ H(z + <α_i^n(t), r>) y_i
                    - (1/2) H^2(z + <α_i^n(t), r>) t/n
                    - V(z + <α_i^n(t), r>) t/n ]                (3.3c)
                    + (φ_0 - φ)(ψ^{-1}(z + <α_{n-1}^n(t), r>))
                    - (1/2) <r, r> }

where for i = 0,1,...,n-1

α_i^n(t) = (t/sqrt(2n), t/sqrt(n), ..., t/sqrt(n), 0, ..., 0)^T in R^n   (3.4)

(the 0th entry is t/sqrt(2n), the entries through the i-th are t/sqrt(n),
and the remaining entries vanish), and <.,.> is the Euclidean inner product.
The error in (3.2) is interpreted as in the Lemma (even though it has a
deterministic component which is O(n^-2)). Efficient evaluation of the
n-fold integral (3.3a) is our main objective here.

3.1 A Recursive Evaluation of I_n(t,z)

Let

w_i = z + <α_i^n(t), r>,  i = 0,1,...,n-1
                                                                (3.5)
w = [w_0, w_1, ..., w_{n-1}]^T

and note dw_0 = (t/sqrt(2n)) dr_0 and dw_i = (t/sqrt(n)) dr_i for i >= 1. Then

I_n(t,z) = sqrt(2) (sqrt(n)/(sqrt(2π) t))^n int_{R^n} K̂_n(t,z,w,y) dw   (3.6)

where

K̂_n(t,z,w,y) = exp{ sum_{i=0}^{n-1} [ H(w_i) y_i - (1/2) H^2(w_i) t/n
                    - V(w_i) t/n ] + (φ_0 - φ)(ψ^{-1}(w_{n-1}))        (3.7)
                    - (1/2)(n/t^2)[ 2(w_0 - z)^2 + (w_1 - w_0)^2 + ... + (w_{n-1} - w_{n-2})^2 ] }

If we now define

g(t,w,y) = H(w) y - (1/2) H^2(w) t/n - V(w) t/n                 (3.8)

and the sequence of integrals

I_1(t,z,w_1) = C_n(t) int_R exp{ -(1/2)(n/t^2)[2(w_0 - z)^2 + (w_1 - w_0)^2]
               + g(t,w_0,y_0) } dw_0                            (3.9)_1

I_k(t,z,w_k) = C_n(t) int_R exp{ -(1/2)(n/t^2)(w_k - w_{k-1})^2 + g(t,w_{k-1},y_{k-1}) }
               . I_{k-1}(t,z,w_{k-1}) dw_{k-1},  k = 2,3,...,n-1,     (3.9)_k

then

I_n(t,z) = C_n(t) int_R exp{ (φ_0 - φ)(ψ^{-1}(w_{n-1})) + g(t,w_{n-1},y_{n-1}) }
           . I_{n-1}(t,z,w_{n-1}) dw_{n-1}                      (3.9)_n

where C_n(t) := 2^{1/2n} sqrt(n) / (sqrt(2π) t).
Remarks 1. The integrals I_1, I_2, ..., I_{n-1} are independent of the initial
data; and so, (3.9) has the form of a Green's function representation

I_n(t,z) = int q_0(w) G(t,z;0,w) dw                             (3.10)

That is, C_n(t) exp{g(t,w_{n-1},y_{n-1})} I_{n-1}(t,z,w_{n-1}) approximates the Green's
function of the partial differential equation (2.5) for the (transformed)
unnormalized conditional density.
     2. The system (3.9) is recursive in the observations
y_0, y_1, ..., y_{n-1}; that is, I_k depends on y_0, y_1, ..., y_{k-1}, and it is com-
puted from I_{k-1} and y_{k-1}. Unfortunately, it is not recursive in n or
in t. This last means that (3.9) does not constitute a recursive filter
in the usual sense of the term.

At this point there is considerable flexibility in designing a numerical
implementation of (3.9). In discussing possible designs we will show
that the n-step recursive evaluation (3.9) has a significant compu-
tational advantage over (most) direct evaluations of the n-fold
integral in (3.6).
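The structure of (3.9) - an n-fold integral with nearest-neighbour coupling collapsed into n one-dimensional integrals over a shared grid - can be exercised on a model chain integral whose value is known in closed form. The Gaussian transition kernel below plays the role of the exponential factors in (3.9); the grid parameters are arbitrary choices of ours.

```python
import numpy as np

def chain_integral(n, half_width=12.0, m=801):
    """n-fold integral reduced to n one-dimensional integrals, as in (3.9).
    The chain couples neighbours through a Gaussian transition kernel, so the
    exact value E[exp(-W_n^2 / 2)] = (1 + n) ** -0.5 is available for checking
    (W_n is a sum of n independent N(0,1) increments)."""
    w = np.linspace(-half_width, half_width, m)
    dw = w[1] - w[0]
    # Transition kernel N(w_k ; w_{k-1}, 1), discretized on the shared grid.
    K = np.exp(-0.5 * (w[:, None] - w[None, :]) ** 2) / np.sqrt(2 * np.pi)
    stage = np.exp(-0.5 * w**2) / np.sqrt(2 * np.pi)   # density of W_1
    for _ in range(n - 1):                             # one 1-D integral per step
        stage = K @ stage * dw
    return float(np.sum(np.exp(-0.5 * w**2) * stage) * dw)
```

Because the same grid is reused at every stage, each step is a single matrix-vector product - exactly the property exploited in Section 4.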

3.2 Implementation

Suppose we require the evaluation of v(t,z) = I_n(t,z) + O(n^-2) at points
z = a_1, a_2, ..., a_m for some m >= 1, and t = t_1, t_2, ..., t_N = T. The t_j are pre-
specified observation times. Let n_1, n_2, ..., n_N be integers with n_j >= 1,
j = 1,2,...,N. Consider the discretization of the time interval shown
below

0 < t_11 < t_12 < ... < t_1 < t_21 < t_22 < ... < t_2 < ... < t_N = T

Here t_11 = t_1/n_1, t_12 = 2 t_1/n_1, ..., t_21 = t_1 + (t_2 - t_1)/n_2,
t_22 = t_1 + 2(t_2 - t_1)/n_2, ....

If we only want v(t,a_i) at t_1, t_2, ..., t_N, then the mesh points t_ij can
be chosen arbitrarily; that is, the integers n_1, n_2, ..., n_N can be

different. The recursion (3.9) is then used to evaluate I_n(t,z) at
t = t_1, t_2, ..., t_N. If v(t,z) is required at the intermediate times t_ij,
then it is advantageous to take n_1 = n_2 = ... = n_N = n, since this permits the
use of parallel processors (n of them) to compute the intermediate
values.

The operational count for this procedure is as follows: Each integral
I_k(t,z,w) is evaluated for each t at (z,w) = (a_i,a_j), i,j = 1,2,...,m,
a total of m^2 evaluations of I_k. For this purpose we can use any
quadrature rule for one-dimensional integrals of the form I(z,w) =
int_R e^{-φ(v)} F(z,w,v) dv, with φ a given weight function; that is,

I(z,w) = sum_{i=1}^m F(z,w,a_i) A_i                             (3.11)

Here the A_i are weights associated with the rule, and the a_i are chosen in
accordance with the prescription of the quadrature rule. By using the
same rule for all the I_k, no interpolation is needed to evaluate I_{k+1}
in terms of I_k. At the final stage we require only m values of
v(t_1,z). Thus, on the initial interval (0,t_1) we make (n-1)m^2 + m
evaluations of the integral (3.11), and (n-1)m^3 + m^2 function evaluations
and the same number of multiplications and additions in (3.11), to com-
pute (3.9). For example, if m = 20, n = 20, we must compute about 7500
integrals, implying about 150,000 additions and multiplications. While
this is not insubstantial, it compares very favorably with product
formulas for (3.6) that do not use the special structure in (3.9). By
introducing a particular quadrature representation, we can obtain a
more precise count of the elementary operations required in the
evaluation.
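For concreteness, the counts quoted above can be reproduced directly (the "about 7500" and "about 150,000" in the text are these numbers rounded):

```python
m, n = 20, 20
integral_evals = (n - 1) * m**2 + m   # evaluations of (3.11) on (0, t_1)
func_evals = (n - 1) * m**3 + m**2    # function evaluations (and the same number
                                      # of multiplications and additions)
print(integral_evals, func_evals)     # 7620 152400
```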

4. FURTHER COMPUTATIONAL ASPECTS OF THE RECURSIVE FORMULAS

By modifying the notation we can streamline the representation of the
recursion (3.9) and, in this way, make the underlying structure of the
computation more evident. Referring to (3.9), a natural choice for
the weight functions φ in (3.11) are the diagonal terms of the quadratic
form in w_i in (3.7). Unfortunately, it will be seen in section 5 that
the error associated with the resulting Hermite-Gauss quadrature rule
is an increasing function of bounds on derivatives of the integrand in
(3.11); these may not be finite because of the remaining cross terms in
the quadratic form. This suggests selecting as weight functions a
portion of each diagonal element, and choosing the proportionality
constants to preserve the negative definiteness of the quadratic form
in (3.7) while simultaneously optimizing the error bounds for the
quadrature rule. To this end let α_i in (0,1), 0 <= i <= n-1, and define

J^1 = (t/sqrt(n)) diag[ sqrt(2/3), 1, ..., 1, sqrt(2) ]
                                                                (4.1)
J^2 = diag[ (1 - α_i)^{-1/2} ]

(diag = diagonal matrix), and

w = J ξ,  J = J^1 J^2 := diag[J_i]                              (4.2)

Then

I_n(t,z) = C̄_n(t,z) int_{R^n} [ prod_{i=0}^{n-1} e^{-ξ_i^2}/sqrt(π) ] K̄_n(t,z,ξ,y) dξ   (4.3)

where

C̄_n(t,z) = (2 sqrt(2)/sqrt(3)) prod_{i=0}^{n-1} (1 - α_i)^{-1/2} e^{-n z^2/t^2} / 2^{n/2}

K̄_n(t,z,ξ,y) = exp{ [ (φ_0 - φ)(ψ^{-1}(w_{n-1})) + sum_{i=0}^{n-1} g(t,w_i,y_i) ]|_{w=Jξ}
                                                                (4.4)
               - sum_{i=0}^{n-1} a_ii ξ_i^2 - 2 sum_{i=0}^{n-2} a_{i,i+1} ξ_i ξ_{i+1} - b_0 ξ_0 }

b_0 = -(2z/t) sqrt(2n/3) (1 - α_0)^{-1/2}

a_ii = α_i/(1 - α_i)

a_{i,i+1} = -(1/sqrt(6)) / sqrt((1 - α_0)(1 - α_1)),            i = 0
          = -(1/2) / sqrt((1 - α_i)(1 - α_{i+1})),              1 <= i <= n-3
          = -(1/sqrt(2)) / sqrt((1 - α_{n-2})(1 - α_{n-1})),    i = n-2

The transformation J^1 isolates a factor exp[-sum_i (1 - α_i) ξ_i^2], which is
then normalized by J^2. To express (4.3) recursively, define

r_i^1(ξ_i,t) = exp[H(J_i ξ_i)]

r_i^2(ξ_i,t) = exp[-(1/2) H^2(J_i ξ_i) t/n - V(J_i ξ_i) t/n]

q_0(ξ_0,z,t) = exp[ (2 sqrt(2n/3)/(t sqrt(1 - α_0))) z ξ_0 ]     (4.5)

s_k(ξ_k,ξ_{k+1}) = exp[-2 a_{k,k+1} ξ_k ξ_{k+1} - a_kk ξ_k^2],  0 <= k <= n-2

s_{n-1}(ξ_{n-1}) = exp[ (φ_0 - φ)(ψ^{-1}(J_{n-1} ξ_{n-1})) - a_{n-1,n-1} ξ_{n-1}^2 ]

Here w_i = J_i ξ_i, and the dependence on α_i has been suppressed. Then (4.3)
becomes

I_n(t,z) = C̄_n(t,z) int_{R^n} [ prod_{i=0}^{n-1} e^{-ξ_i^2}/sqrt(π) ] [ prod_{i=0}^{n-1} (r_i^1(ξ_i,t))^{y_i} ]
                                                                (4.6)
           . [ prod_{i=0}^{n-1} r_i^2(ξ_i,t) s_i(ξ_i,ξ_{i+1}) ] q_0(ξ_0,z,t) dξ

To express this recursively in terms of one-dimensional integrals, let
p(ξ) = e^{-ξ^2}/sqrt(π), and consider the sequence

I_1(ξ_1,z) = c_n(t,z) int p(ξ_0) (r_0^1(ξ_0,t))^{y_0} r_0^2(ξ_0,t)
             . s_0(ξ_0,ξ_1) q_0(ξ_0,z,t) dξ_0

I_{k+1}(ξ_{k+1},z) = c_n(t,z) int p(ξ_k) (r_k^1(ξ_k,t))^{y_k} r_k^2(ξ_k,t)
                     . s_k(ξ_k,ξ_{k+1}) I_k(ξ_k,z) dξ_k,  k = 1,2,...,n-2   (4.7)

I_n(t,z) = c_n(t,z) int p(ξ_{n-1}) (r_{n-1}^1(ξ_{n-1},t))^{y_{n-1}} r_{n-1}^2(ξ_{n-1},t)
           . s_{n-1}(ξ_{n-1}) I_{n-1}(ξ_{n-1},z) dξ_{n-1}

where

c_n(t,z) = [C̄_n(t,z)]^{1/n}                                     (4.8)

Note that the functions r_k^1, r_k^2, s_k do not depend on k, for k = 1,...,n-2.

For fixed values of the remaining variables, we can integrate
I = int_{-∞}^{∞} p(ξ) f(ξ) dξ with an M-point Gaussian formula
(see, e.g., [9]):

I ≈ S + e_M
                                                                (4.9)
S = sum_{i=1}^M A_i f(a_i)

where the A_i > 0 are weights and the a_i are the (real) zeros of the M-th
Hermite polynomial

H_M(x) = 2^M x^M + ...                                          (4.10)

and are symmetric about x = 0. Using this, the k-th term in (4.7),
k = 1,2,...,n-2, becomes

I_{k+1}(a_j,a_i) = c_n(t,a_i)[ S_k(a_j,a_i) + e_k ]             (4.11)

where

S_k(a_j,a_i) = sum_{l=1}^M A_l [r^1(a_l,t)]^{y_k} r^2(a_l,t)
               . s(a_l,a_j) I_k(a_l,a_i)
                                                                (4.12)
             ≈ sum_{l=1}^M A_l [r^1(a_l,t)]^{y_k} r^2(a_l,t)
               . s(a_l,a_j) c_n(t,a_i) S_{k-1}(a_l,a_i)

(Note: we have dropped the k-dependence on r^1, r^2, s.)
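NumPy ships the nodes and weights for exactly this rule; after dividing the weights by sqrt(π) they match the normalized weight p(ξ) = e^{-ξ^2}/sqrt(π) used here. A quick check of the symmetry of the nodes and of exactness on low moments:

```python
import numpy as np

M = 8
nodes, weights = np.polynomial.hermite.hermgauss(M)  # Gauss-Hermite, weight e^{-x^2}
weights = weights / np.sqrt(np.pi)                   # normalize to p(xi) = e^{-xi^2}/sqrt(pi)

# The nodes are the zeros of H_M, symmetric about x = 0, with positive weights.
assert np.allclose(nodes, -nodes[::-1]) and np.all(weights > 0)

# The rule is exact for polynomials of degree <= 2M - 1: under p the second
# moment is 1/2 and the fourth moment is 3/4.
m2 = weights @ nodes**2
m4 = weights @ nodes**4
```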

The error in the second expression should be clear from (4.11). Now
let S_k be the M x M matrix with elements S_k(a_j,a_i), i,j = 1,2,...,M,

S_k = { S_k(a_j,a_i) }_{i,j=1}^M                                (4.13)

and let

R(t,i,j,l) = r^2(a_l,t) s(a_l,a_j) c_n(t,a_i)                   (4.14)

Then

S_k(a_j,a_i) = sum_{l=1}^M A_l [r^1(a_l,t)]^{y_k} R(t,i,j,l) S_{k-1}(a_l,a_i)   (4.15)

The terms A_l, r^1(a_l,t), R(t,i,j,l) can be precomputed "off-line." In
this case, calculation of the matrix S_k consists of
   (i) raising the elements of the vector r^1(a_l,t),
       l = 1,2,...,M, to the y_k power;
  (ii) performing M^2 vector products of
       x_ij = { A_l [r^1(a_l,t)]^{y_k} R(t,i,j,l),  l = 1,...,M }
       and v_i^{k-1} = { S_{k-1}(a_l,a_i),  l = 1,...,M }
       for i,j = 1,2,...,M.
To compute I_{k+1}(a_j,a_i), it is necessary to multiply S_k(a_j,a_i) by
c_n(t,a_i).
Thus, a total of 2M^3 + M^2 + M elementary operations is required to
compute the M^2 entries I_{k+1}(a_j,a_i), i,j = 1,...,M. This is, for
k = 1,2,...,n-2, a total of (n-2)(2M^3 + M^2 + M) operations. The operation
count for I_1(a_l,a_i) is the same, adding (2M^3 + M^2 + M) operations. Only
2M^2 + 2M operations are required to compute the M-vector
I_n(t,a_l), l = 1,...,M. Thus, the total number N_oper of elementary operations
required to compute the approximation I_n(t,z) of v(t,z) at the
points z = a_l, l = 1,...,M is

N_oper = 2(n-1)M^3 + (n+1)(M^2 + M)                             (4.16)

For n = 20, M = 20 we have N_oper = 312,820.
Note that it is possible to do the multiplications of x_ij and
v_i^{k-1}, and that of S_k(a_j,a_i) and c_n(t,a_i), using parallel processing,
which leads to considerable time savings.
Finally, referring back to (2.4), the evaluation of the
unnormalized density u(t,a_i) from v(t,a_i), i = 1,2,...,M, requires an
additional M function evaluations and M multiplications.
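Steps (i) and (ii) map directly onto array operations. The sketch below uses random placeholder values for r^1 and R (the real ones come from the off-line precomputation of the filtering problem's data) and performs the M^2 inner products of (4.15) as a single einsum per step:

```python
import numpy as np

rng = np.random.default_rng(1)
M, n = 6, 10
A = np.polynomial.hermite.hermgauss(M)[1] / np.sqrt(np.pi)  # weights A_l
r1 = np.exp(rng.normal(0.0, 0.1, M))          # r^1(a_l, t)   (placeholder values)
R = np.exp(rng.normal(0.0, 0.1, (M, M, M)))   # R(t, i, j, l) (placeholder values)
y = rng.normal(0.0, 0.1, n)                   # observation increments y_k

S = np.ones((M, M))                           # S[j, i] = S_k(a_j, a_i)
for k in range(1, n - 1):
    powered = A * r1 ** y[k]                  # step (i): M elementwise powers
    # step (ii): sum over l of powered[l] * R[i, j, l] * S[l, i], for all (j, i)
    S = np.einsum("l,ijl,li->ji", powered, R, S)
```

Each pass costs O(M^3) multiplications and additions, matching the 2M^3 + M^2 + M count per stage derived above.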

5. SOME COMMENTS ON AN ERROR ANALYSIS

Our final formula (4.11) for the approximation of v(t,z) depends
on the weight parameters α_i and involves two basic errors: (i) the
error O(n^-2) in the approximation (1.10) of the function space integral
by an n-fold "ordinary" integral; and (ii) the error e_M in the M-point
Gaussian quadrature formula (4.9). (The error in the substitution
(4.12) is clearly a linear multiple of e_M.) For obvious reasons we
would like to have sharp estimates of these quantities, the last of
which should be minimized with respect to α.
     The analysis in [5] which led to the estimate O(n^-2) in (1.10) is
so complex that a precise evaluation of the order expression appears
to be out of the question. In specific cases one can probably emulate
the analysis of Cameron [8], sec. 6, which shows that in some cases
the O(n^-2) estimate is actually very conservative. (E.g., "Simpson's
rule," included in our formulas, evaluates the Wiener integral of the
functional F[x] = [int_0^1 x^2(s) ds]^2 to within an error of (1/8) π^4 n^-3.)
     We can say somewhat more about the error e_M = e_M(t,z,α). Natural
bounds are of the form

|e_M(t,z,α)| <= C(n,M) N(t,z,y,n,α)                             (5.1)

where C is the error coefficient and N is some bound on a derivative of
the integrand in (4.6). There are several possible procedures; see,
e.g., [10], [11]. We shall follow the approach used by Lether in [11].
Since the recursive evaluation (4.7) does not reduce or increase the
error relative to direct evaluation of the n-fold integral (4.3), we
may as well apply Lether's procedure to the latter.
     Because the weight function in (4.3) is normalized, and the error
coefficient for each coordinate is (independent of the coordinate)

ε(M) = M!/[2^M (2M)!]                                           (5.2)

the error for fixed (t,z,α) is bounded by (from [11], equ. (7))

E_M(t,z,α) = ε(M) sum_{i=0}^{n-1} (∂^{2M} K̄_n/∂ξ_i^{2M})(ξ̄_0, ..., ξ̄_{n-1})   (5.3)

for some point (ξ̄_0, ..., ξ̄_{n-1}) in R^n. Here K̄_n is the integrand in
(4.3). (We have suppressed the t,z,y,α variables in the argument of K̄_n.)
Of course, we assume that K̄_n and, therefore, p_0, f, g, h are sufficiently
smooth for (5.3) to make sense.
     The dependence of E_M on α may be found by computing upper bounds
for the derivatives of K̄_n. Writing

K̄_n(ξ,α) = F(Jξ) G(ξ,α),
                                                                (5.4)
F(w) = exp[ (φ_0 - φ)(ψ^{-1}(w_{n-1})) + sum_{i=0}^{n-1} g(w_i) ]

G(ξ,α) = exp[ -( sum_{i=0}^{n-1} a_ii(α_i) ξ_i^2 + 2 sum_{i=0}^{n-2} a_{i,i+1}(α_i,α_{i+1}) ξ_i ξ_{i+1} + b_0(α_0) ξ_0 ) ]

and assume F has continuous derivatives up to order 2M+1, bounded by a
constant F_0 possibly depending on the observation path y(t). By
completing the square with respect to ξ_i in G(ξ,α) and employing a
bound for Hermite polynomials [12],

|H_j(x) e^{-x^2}| <= C_0 2^{j/2} sqrt(j!) e^{-x^2/2},  C_0 ≈ 1.086   (5.5)

together with d^j e^{-x^2}/dx^j = (-1)^j e^{-x^2} H_j(x), it may be shown that

E_M(t,z,α) <= C_1 ε(M) sum_{i=0}^{n-1} P_i(α) sup_ξ e^{-Q_i(ξ,α)}   (5.6)

C_1(n,t,z) := C_0 F_0^n (2 sqrt(2)/sqrt(3)) 2^{-n/2} exp(-n z^2/t^2)

P_i(α) = ( prod_{j=0}^{n-1} (1 - α_j) )^{-1/2} (1 - α_i)^{-M}
         sum_{j=0}^{2M} (2M choose j) sqrt(j!) (2 J_i^2)^{j/2}
         . exp[ (1/4) b_i^2/a_ii + (1/4) B_i^T A_i^{-1} B_i ](α)

Q_i(ξ,α) = [ (ξ + (1/2) A_i^{-1} B_i)^T A_i (ξ + (1/2) A_i^{-1} B_i) ](α)

where

A_i(α) = M_i(α) A(α),  A = (a_ij),

M_i differs from the n x n identity only in column i, whose (i-1)-th and
(i+1)-th entries are a_{i-1,i}/a_ii and a_{i,i+1}/a_ii, and the vectors
B_0, B_i in R^n collect the corresponding linear terms. Here the a_ij, b_i
are defined in (4.4) and A_i is quint-diagonal. By continu-
ity arguments it may be shown that the matrices A_i are all positive
definite in some nonempty subset S of the cube Q = prod_{i=0}^{n-1} (0,1). In this
region the factors exp(-Q_i) in (5.6) may be bounded by unity. Also, the
function sum_i P_i(α) is continuous in α and diverges to +∞ as α approaches
the boundary of Q. Therefore, sum_i P_i(α) attains a minimum in S at some
point α*(M).

This value α* may be computed numerically and used to define
the approximations in section 4, giving rise to the error bound

|e_M(t,z,α*)| <= E_M(t,z,α*) <= C_1 ε(M) sum_{i=0}^{n-1} P_i(α*)   (5.7)

It is robust in the sense that α* does not depend on the data f, g, h,
or the observation process y(t). Of course, the dependence of
(5.7) on (M,n) is crucial. At present, the asymptotic behavior of
α* as M -> ∞ has not been determined, but analysis of similar integrals
with A_i diagonal suggests that the performance should be good.
Remark: The assumption that F(w) have bounded derivatives is not
very restrictive. From (2.6), (3.8), (5.4) it is clear that we include
the cases
   (i) f, g, h and their derivatives bounded;
  (ii) g constant; f, h together with their derivatives of
       "polynomial" growth, and lim_{|x| -> ∞} sgn(x) f(x) < ∞.
More general classes of f, g, h may be identified from (2.6), (3.8), (5.4).

REFERENCES

1. H. Kushner, "Dynamical equations for optimal nonlinear filtering",
   J. Differential Equations, 3 (1967), pp. 179-190.
2. F. Levieux, "Conception d'algorithmes parallélisables et convergents
   de filtrage récursif non-linéaire", Appl. Math. Opt. (1977),
   pp. 61-95.
3. M. Zakai, "On the optimal filtering of diffusion processes",
   Z. Wahrsch. Verw. Geb., 11 (1969), pp. 230-243.
4. J. S. Baras, G. L. Blankenship, W. E. Hopkins, Jr., "Existence,
   uniqueness, and asymptotic behavior of solutions to a class of
   Zakai equations with unbounded coefficients", to appear.
5. G. L. Blankenship, J. S. Baras, "Accurate evaluation of stochastic
   Wiener integrals with applications to scattering in random media
   and to nonlinear filtering", SIAM J. Appl. Math., 41 (1981),
   pp. 518-552.
6. R. H. Cameron, "A family of integrals serving to connect the
   Wiener and Feynman integrals", J. Math. and Phys., 39 (1960),
   pp. 126-140.
7. A. J. Chorin, "Accurate evaluation of Wiener integrals", Math.
   Comp., 27 (1973), pp. 1-15.
8. R. H. Cameron, "A Simpson's rule for the numerical evaluation of
   Wiener's integrals in function space", Duke Math. J., 18 (1951),
   pp. 111-130.
9. A. H. Stroud, D. Secrest, Gaussian Quadrature Formulas, Prentice-
   Hall, Englewood Cliffs, N.J., 1966.
10. F. G. Lether, "Cross-product cubature error bounds", Math. Comp.,
    24 (1970), pp. 583-592.
11. F. G. Lether, "An error representation for product cubature rules",
    SIAM J. Numer. Anal., 7 (1970), pp. 363-365.
12. A. Erdelyi, ed., Bateman Manuscript Project, Higher Transcendental
    Functions, vol. II, Calif. Inst. Tech., McGraw-Hill, New York,
    1955.
AN EFFICIENT APPROXIMATION SCHEME FOR A CLASS
OF STOCHASTIC DIFFERENTIAL EQUATIONS

J.M.C. Clark

Department of Electrical Engineering


Imperial College
London SW7 2BT, U.K.

INTRODUCTION

A feature that is characteristic of many of the multidimensional stochastic
differential equations that arise in nonlinear filtering theory is that they are
forced by a single, "drifting", Brownian motion; that is, when written in
Stratonovich form, they look like

dX_t(x,w) = f(X_t(x,w))dt + g(X_t(x,w)) ∘ dw(t),  X_0(x,w) = x       (1)

where (X_t) is a d-dimensional continuous process on [0,T] and (w(t)) is a scalar
continuous process with a distribution P that is absolutely continuous with res-
pect to Wiener measure P_0. In what follows (w(t)) is taken to be the coordinate
process of the canonical space C[0,T], topologised with the uniform norm, and P
and P_0 are distributions on its Borel field B_T. X_t(x,w) is the natural solution of
(1) that is known to be continuous in t, x and w for sufficiently smooth f and g
([4], p. 215, [2], [10]).
This paper is concerned with the asymptotic behaviour of a scheme for approx-
imating the solutions of such equations in which each approximant depends on the
values of the forcing process only at the points of a regular partition; that is,
the scheme produces for each n a P_n^0-measurable approximation, where P_n^0 is the
partition sigma-field σ{w(iT/n) : i = 0,1,...,n} ⊂ B_T.
     Suppose the accuracy of approximation is measured in terms of the L^2-norm
(E[X_T - X̂_T]^2)^{1/2} of the error of approximation of X_T; then the best rate of
convergence among all (P_n^0)-adapted schemes is provided by the sequence of conditional
means E(X_T | P_n^0). For scalar equations of the type (1) with a single forcing Brownian
motion it is known [1] that for such a sequence the convergence is of order n^-1 and
that a number of difference schemes, such as those of McShane ([4], p. 205) and
Mil'stein [5], and the second order Runge-Kutta schemes of Rumelin [8], have the
same order of convergence. If the norm sup_{0<=t<=T} (E[X_t - X̂_t]^2)^{1/2} were to be used in-
stead, the best order would drop to n^{-1/2}; so the choice of norm is crucial here. If
there are two or more forcing Brownian motions driving non-commutative vector fields
then the best order is again n^{-1/2} [1].
Newton [6], in an analysis of approximations for bilinear equations with
Brownian motions driving commuting vector fields, has shown that a truncated Taylor-
series scheme is efficient, in the sense of possessing the best rate of convergence,

only if most terms of third order are retained. This suggests that the schemes just
mentioned, which are basically second-order methods, would generally be inefficient
when applied to equations of the type (1).
     This paper considers the efficiency of a "prototypical" approximation scheme
for (1). It is shown that the solutions of a sequence of relatively simple differ-
ential equations driven by piecewise linear approximations of the process w form
a scheme that is efficient for all sufficiently regular distributions of w. This
is done by means of a central limit theorem for the conditional distributions of the
normalized error. The technique of representing the solution of a differential
equation as the composition of the solutions of two simpler equations will be used
repeatedly in the proofs; this technique is at the heart of the work of Doss [2]
and Sussmann [10] and has also been used by Kunita [3] in a more general context.

AN ODE APPROXIMATION SCHEME

Let ŵ_n denote the piecewise linear approximation of w ∈ C[0,T]:

ŵ_n(t) = (1/h)((i+1)h - t) w(ih) + (1/h)(t - ih) w((i+1)h)

for ih <= t <= (i+1)h, i = 0,...,n-1,

where h = T/n. Let ⁿX̂_t(x,w) be the solution of the ordinary differential equation

(d/dt) ⁿX̂_t = f(ⁿX̂_t) + (T/12n)[g,[g,f]](ⁿX̂_t) + g(ⁿX̂_t) ŵ_n'(t),  ⁿX̂_0(x,w) = x   (2)

for 0 <= t <= T. Here the vector fields f and g on R^d are of class C_b^2 (continuous
bounded derivatives up to second order) and [g,f] is the Lie bracket vector field
(sum_j (g_j ∂f_i/∂x_j - f_j ∂g_i/∂x_j)). Then (ⁿX̂_t) is our scheme of approximations
for X_t. The presence of the double Lie bracket in (2) distinguishes it from the
familiar Wong-Zakai piecewise linear scheme [11], [4]. It also has an incidental
geometric virtue not possessed by the simpler difference schemes: suppose f and g
are C_b and N is a maximal integral manifold generated by f and g with a dimension
d_1 possibly less than d. It follows from the special structure of (1) ([2], [10];
also Lemmas 1 and 2) that if x ∈ N then X_t(x,·) stays in N for all t. However the
vector field generating ⁿX̂_t depends only on f and g and their brackets, and so
ⁿX̂_t also stays in N.
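To make the scheme concrete in the scalar case (d = 1), the sketch below forms the Lie brackets by central differences and integrates across each partition interval with a few Heun sub-steps. The finite-difference brackets and the sub-stepping are implementation choices of ours, not part of the scheme's definition, and the drift correction coefficient is taken as T/12n following (2).

```python
import numpy as np

def lie_bracket(a, b, eps=1e-5):
    """Scalar Lie bracket [a,b](x) = a(x) b'(x) - b(x) a'(x), via central differences."""
    def ab(x):
        da = (a(x + eps) - a(x - eps)) / (2 * eps)
        db = (b(x + eps) - b(x - eps)) / (2 * eps)
        return a(x) * db - b(x) * da
    return ab

def clark_scheme(f, g, x0, w, T):
    """Integrate the corrected piecewise-linear ODE (2) for scalar f, g.
    w holds the forcing process sampled at the n+1 partition points, so the
    slope of w-hat_n is constant on each interval; each interval is crossed
    with four Heun (second-order Runge-Kutta) sub-steps."""
    n = len(w) - 1
    h = T / n
    ggf = lie_bracket(g, lie_bracket(g, f))        # double bracket [g,[g,f]]
    x = x0
    for i in range(n):
        slope = (w[i + 1] - w[i]) / h              # derivative of w-hat_n here
        rhs = lambda u: f(u) + (T / (12 * n)) * ggf(u) + g(u) * slope
        for _ in range(4):
            k1 = rhs(x)
            k2 = rhs(x + (h / 4) * k1)
            x = x + (h / 8) * (k1 + k2)
    return x
```

When g is constant and f is linear both brackets vanish and the scheme reduces to the ordinary piecewise-linear method, which gives a quick sanity check against the explicit solution of dX = -X dt + dw with the deterministic forcing w(t) = t.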

CONDITIONAL CENTRAL LIMIT THEOREMS

It turns out that the most natural way of studying the asymptotic behaviour
of ⁿX̂_T (= ⁿX̂_T(·,w)) is by the action of its inverse map; roughly speaking, if x lies
in a maximal integral manifold N of f and g, then in the limit n(ⁿX̂_T^{-1} ∘ X_T(x) - x)
behaves like a normal random vector in the tangent space of N at x. Let P_n be
the completion of P_n^0 by the P-null sets in B_T.

Theorem 1: Suppose f and g are of class C_b^3 and [g,f] and [g,[g,f]] are of class
C_b^2.
i) If P = P_0, Wiener measure, then for almost all w, any version of the P_n-
conditional distribution of n(ⁿX̂_T^{-1} ∘ X_T(x) - x) converges weakly to a normal
distribution in R^d with mean at the origin and covariance matrix

V_0(x,w) := (T^2/12) int_0^T {X_t*^{-1}[g,f](X_t(x,w))}{X_t*^{-1}[g,f](X_t(x,w))}' dt.   (3)

Furthermore the P_n-conditional covariance matrix converges a.s. to V_0.
ii) If P << P_0 and E[dP/dP_0] := int (dP/dP_0) dP < ∞,
then for almost all w (P), the conditional distribution and conditional covariance
matrix converge to the same limits along any infinite subsequence I for which
(P_n : n ∈ I) is increasing.
     Here X_t*^{-1}[g,f] denotes the vector field obtained by "pulling back" [g,f]
through the diffeomorphism X_t(·,w); that is,

X_t*^{-1}[g,f](x) := DX_t^{-1}(X_t(x)) · [g,f](X_t(x))            (4)

where DF denotes the Jacobian (∂F_i/∂x_j) of a diffeomorphism F. The proof, which
is rather long, is given in the final section.
is rather long, is given in the final section.
Rootzén [7] has a comparable "Renyi-stable" functional limit theorem for
approximations of Stratonovich integrals. In the present context a functional
limit theorem is not appropriate, for the following reason. In the proof of
Theorem 1, a "ripple-free" approximation ⁿX̄_t of X_t is introduced that equals
X_t at the points of a partition and that is differentiable between these points.
It is to this process that ⁿX̂_t converges uniformly with rate 1/n; it converges
uniformly to X_t only with rate 1/n^{1/2}.
In the theorem corresponding to Theorem 1 for the usual piecewise linear
approximations ((2) without the bracket term) the limit distribution would have a
non-zero mean.
The corresponding theorem for n(ⁿX̂_T − X_T) takes a more complicated form.
Intuitively, for fixed x, the pair {X_T(x,·), n(ⁿX̂_T(x,·) − X_T(x,·))} behaves in
the limit like a random element of the tangent bundle of the maximal integral
manifold containing x.

Theorem 2: Suppose f and g are as in Theorem 1. Under either of the conditions
(i), (ii) on P, the P_n-conditional distribution of n(ⁿX̂_T − X_T) and its
covariance matrix converge in the same way as those of n(ⁿX̂_T^{-1} ∘ X_T(x) − x),
except that the covariance matrix limit now has the form

    V_T(x,w) := E[U_T U_T′ | w]   (5)

where (U_t), 0 ≤ t ≤ T, is a continuous process in R^d that solves the
Stratonovich equation

    dU_t = Df(X_t)U_t dt + Dg(X_t)U_t ∘ dw(t) + (T/√12)[g,f](X_t) ∘ dv(t),  U_0 = 0,   (6)

on a probability space augmented to carry a Brownian motion (v(t)) that is
independent of (w(t)).

Again, the proof is given in the final section.

EFFICIENCY

An immediate consequence of Theorem 2 is the following (Var[·] denotes a
variance):

Corollary: Suppose E[dP/dP_0] < ∞ and x, c ∈ R^d. Then (c′ ⁿX̂_T) is an efficient
set of approximations of c′X_T in the sense that for all infinite subsequences of
integers I for which (P_n : n ∈ I) is increasing,

    lim_{n∈I} Var[c′X_T | P_n] / E[(c′X_T − c′ ⁿX̂_T)² | P_n] = 1

for almost all w in the set {w : c′V_T(x,w)c > 0}.
It is of interest to determine conditions under which the covariance matrix
V_T(x,w) is a.s. positive definite. It is instructive to look at the elementary
Gaussian case where

    dX_t = AX_t dt + b dw(t),  X_0(x,w) = x.

Then (6) reduces to

    dU_t = AU_t dt + (T/√12) Ab dv(t),  U_0 = 0,

and V_T(x,w) becomes the "controllability Grammian"

    V_T = (T²/12) ∫_0^T (e^{A(T−t)}Ab)(e^{A(T−t)}Ab)′ dt,

which is positive definite if and only if the matrix

    [Ab, A²b, ..., A^d b] = A[b, Ab, ..., A^{d−1}b] =: AR

is of full rank. If there exists a vector c that is orthogonal to the range of A
or, more generally, to the image under A of the "controllable" subspace of
dx/dt = Ax + bu, then c′V_T c = 0 and the corollary is uninformative about the
efficiency of (c′ ⁿX̂_T). It so happens that in this case c′ ⁿX̂_T = c′X_T, and the
scheme is still "efficient" in a trivial sense, but what is of more significance
is that the positive definiteness of V_T is not guaranteed by the full rank of R,
which is equivalent to the condition that the transition densities of (X_t) are
non-degenerate in R^d.
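The rank criterion above is easy to check mechanically. The following sketch (an
illustration added for concreteness, not part of the original text; it assumes
NumPy) exhibits a hypothetical nilpotent pair (A,b) for which R has full rank, so
the transition densities are non-degenerate, while AR is rank-deficient and V_T
is therefore singular:

```python
import numpy as np

# V_T is positive definite iff AR = [Ab, A^2 b, ..., A^d b] has full rank,
# where R = [b, Ab, ..., A^{d-1} b] is the usual controllability matrix.
def controllability_matrix(A, b):
    d = A.shape[0]
    cols = [b]
    for _ in range(d - 1):
        cols.append(A @ cols[-1])
    return np.column_stack(cols)

A = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [0., 0., 0.]])   # nilpotent "triple integrator" (illustrative)
b = np.array([0., 0., 1.])

R = controllability_matrix(A, b)
print(np.linalg.matrix_rank(R))        # full rank: R = [e3, e2, e1]
print(np.linalg.matrix_rank(A @ R))    # rank-deficient: AR = [e2, e1, 0]
```

Here R is invertible, yet AR loses a rank because A annihilates A²b, which is
exactly the situation described in the text.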
For the general case a promising line of argument (suggested to me by
D. W. Stroock) would seem to be to make use of the exact-time reachability
theorems of Sussmann and Jurdjevic [9] and others to establish the positive
definiteness of V_T, but I have not yet looked at this in detail.

PROOFS

We begin with some lemmas. In what follows, rather than saying that a
sequence of continuous functions (q_n) converges uniformly on bounded sets to a
continuous limit q_∞, we shall say that (x,n) → q_n(x) is "continuous on
R^d × N̄", where N̄ is the usual one-point compactification N ∪ {∞} of the
integers. |·| will denote Euclidean norms in various finite dimensional spaces,
and also the uniform norm in C[0,T].
Suppose g: R^d → R^d is a vector field of class C³_b and q: R^d × N̄ → R^d an
indexed vector field with bounded first and second derivatives continuous in
(x,n). Let Y_t(x) solve

    dY_t/dt = g(Y_t),  Y_0(x) = x ∈ R^d,  t ∈ R,   (7)

and let Z_t(x,w,n) solve, for w ∈ C[0,T],

    dZ_t/dt = DY_{−w(t)}(Y_{w(t)}(Z_t)) · q_n(Y_{w(t)}(Z_t)),  Z_0(x,w,n) = x   (8)
            =: Y_{w(t)*}^{-1} q_n(Z_t).

Lemma 1: i) (Doss [2], Sussmann [10]) If the distribution P of w is absolutely
continuous with respect to Wiener measure P_0, the composite function

    Y_t(x,w,n) := Y_{w(t)} ∘ Z_t(x,w,n)   (9)

is continuous in t, x, w and n on Δ := [0,T] × R^d × C[0,T] × N̄, solves the
Stratonovich equation

    dY_t = q_n(Y_t)dt + g(Y_t) ∘ dw(t),  Y_0 = x,

and solves the corresponding ordinary differential equation for all w that are
piecewise C¹.
ii) The x-derivatives of Y_t(x) up to third order are continuous in t and x
and bounded by k e^{kt} for some constant k.
iii) The first and second x-derivatives of Y_t := Y_t(·,w,n) and those of its
inverse Y_t^{-1} are all continuous on Δ and bounded uniformly in t, x, and n
for bounded w.
Proof: (i) is simply an amalgam of theorems in [2] and [10]. A standard theorem
on ODEs shows that Y_t and its x-derivatives exist and are continuous in t and x.
Furthermore, for d = 1,

    d(DY_t)/dt = Dg·DY_t,  DY_0 = 1,
    d(D²Y_t)/dt = Dg·D²Y_t + D²g·(DY_t)²,  D²Y_0 = 0,

and boundedness of the derivatives of g and Gronwall's inequality establish (ii).
The same argument applies in the third-order and vector cases. For w in a
bounded region, the vector field in (8) has bounded x-derivatives up to second
order that are continuous on Δ. The same ODE theorem implies that Z_t has the
same properties. Application of the chain rule of calculus to (9) shows that Y_t
also has these properties. Now consider the solutions Z_{s,t} and Y_{s,t}
defined by equations (8) and (9) with w(t) replaced by w(t) − w(s) and with the
initial condition Z_{s,s}(x,w,n) = x. Then it is clear that the arguments that
have just been applied to Y_t also show that Y_{s,t}(x,w,n) and its derivatives
are continuous in s, t, x, w and n, and are bounded for bounded w. But it is
easy to verify that Y_{t,0} = Y_t^{-1}, and so the proof of (iii) is complete.
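Lemma 1 (i) is constructive, and the decomposition (9) can be illustrated
numerically. In the following sketch (my own illustration, with the hypothetical
scalar coefficients g(x) = σx and q(x) = μx; NumPy assumed) the flow of g is
explicit, Y_u(x) = x e^{σu}, the process Z_t solves the random ODE (8), and the
composite Y_{w(t)}(Z_t) reproduces the closed-form Stratonovich solution
x exp(μt + σw(t)):

```python
import numpy as np

# Pathwise (Doss-Sussmann) solution via the decomposition (9); all
# parameter values here are illustrative, not taken from the paper.
mu, sigma, T, n = 0.5, 0.3, 1.0, 10_000
rng = np.random.default_rng(0)
t = np.linspace(0.0, T, n + 1)
dt = T / n
w = np.concatenate([[0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), n))])

def Y(u, x):                     # flow of g: Y_u(x) = x*exp(sigma*u)
    return x * np.exp(sigma * u)

def pullback_q(u, z):            # (DY_u)^{-1} q(Y_u(z)); here it reduces to mu*z
    return np.exp(-sigma * u) * mu * Y(u, z)

Z = np.empty(n + 1)
Z[0] = 1.0
for i in range(n):               # Euler step for the random ODE (8)
    Z[i + 1] = Z[i] + dt * pullback_q(w[i], Z[i])

X = Y(w, Z)                      # composite solution (9)
exact = np.exp(mu * t + sigma * w)   # Stratonovich closed form for this example
print(np.max(np.abs(X - exact)))     # small: only the ODE discretization error
```

Because the stochastic integration has been absorbed into the explicit flow Y,
the only numerical error left is that of an ordinary ODE solver, which is the
practical appeal of the decomposition.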

To simplify the notation, from now on we shall generally suppress the
arguments n and w.

Lemma 2: Suppose ⁿX̂_t is defined by (2) and f, g and [g,[g,f]] are of class
C²_b. The Jacobian matrices DX̂_t, DX̂_t^{-1} and D(X̂_{t*}^{-1}k), where k is a
C²_b vector field on R^d, are continuous on Δ, bounded for bounded w and
P_n-measurable for fixed t, x and n.

Proof: Set q_n := f + (T/12n)[g,[g,f]] in Lemma 1. Then ⁿX̂_t(x,w) = Y_t(x,ŵ,n).
Since |ŵ| ≤ |w| and (w,n) → ŵ is continuous, it follows from Lemma 1 (iii) that
DX̂_t and DX̂_t^{-1} have the required properties. That the Jacobian of

    k̂(t) := (DX̂_t^{-1} k) ∘ X̂_t(x)

also has these properties follows from the chain rule and the continuity and
boundedness of the first and second derivatives of Y_t and Y_t^{-1}.
P_n-measurability is obvious.
Now let w̃(t) be the process w(t) − ŵ(t); then on (C[0,T], B_T, P_0), w̃ is a
"Brownian bridge" process. The following lemma summarizes its properties. The
proof is elementary and is omitted.

Lemma 3: i) w̃ is a continuous Gaussian process, independent of ŵ, that is
pinned to zero at t = ih, i = 0,1,...,n, where h = T/n.
ii) E w̃(t) = 0 for all t ∈ [0,T]
iii) E[w̃(s)w̃(t)] = h^{-1}(s − ih)((i+1)h − t), for ih ≤ s ≤ t ≤ (i+1)h,
                  = 0 otherwise
iv) E ∫_0^h w̃(t)² dt = h²/6
v) ∫_0^h ∫_0^h E[w̃(t)w̃(s)] ds dt = h³/12
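Properties (iv) and (v) are easy to confirm by simulation. The following Monte
Carlo sketch (illustrative values, not from the text; NumPy assumed) realises
the bridge on a single interval [0,h] as w̃(t) = w(t) − (t/h)w(h) and checks both
identities, using the fact that the double integral in (v) equals the variance
of the integrated bridge ∫_0^h w̃(t)dt:

```python
import numpy as np

# Monte Carlo check of Lemma 3 (iv) and (v) on one interval [0,h].
rng = np.random.default_rng(1)
h, m, paths = 0.5, 200, 20_000
t = np.linspace(0.0, h, m + 1)
dt = h / m
dw = rng.normal(0.0, np.sqrt(dt), (paths, m))
w = np.hstack([np.zeros((paths, 1)), np.cumsum(dw, axis=1)])
bridge = w - (t / h) * w[:, -1:]          # pinned to 0 at t = 0 and t = h

# Riemann-sum estimates of the two integrals, averaged over paths
est_iv = np.mean((bridge[:, :-1] ** 2).sum(axis=1) * dt)       # near h^2/6
est_v = np.mean((bridge[:, :-1].sum(axis=1) * dt) ** 2)        # near h^3/12
print(est_iv, h**2 / 6)
print(est_v, h**3 / 12)
```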
Proof of Theorem 1 (i): As before, we set h = T/n and we generally suppress the
arguments n and w. Now P = P_0; so w is a Brownian motion and w̃ a Brownian
bridge process. First we introduce a piecewise smooth approximation of (X_t);
let X̄_t(x,w,n) be the solution of the ordinary differential equation

    dX̄_t/dt = DY_{−w̃(t)}(Y_{w̃(t)}(X̄_t)) · f(Y_{w̃(t)}(X̄_t)) + g(X̄_t)ŵ′(t),  X̄_0 = x.   (10)

Then it follows from the decomposition-of-solution arguments used in the proof
of Lemma 1 that X_t = Y_{w̃(t)} ∘ X̄_t and X̄_t = Y_{−w̃(t)} ∘ X_t. Notice that
X̄_{ih} = X_{ih} for i = 0,1,...,n, because w̃(ih) = 0. Now let L_g f denote
[g,f], regarded as the Lie derivative of the vector field f in the direction of
g. The first term of (10) can be expanded as a Lie series. So

    dX̄_t/dt = f(X̄_t) + L_g f(X̄_t)w̃(t) + ½ L²_g f(X̄_t)w̃(t)² + R_1(t) + g(X̄_t)ŵ′(t)
             = f(X̄_t) + L_g f(X̄_t)w̃(t) + (T/12n) L²_g f(X̄_t) + g(X̄_t)ŵ′(t)
               + R_1(t) + R_2(t),

where R_1(t) is the integral remainder

    R_1(t) := w̃(t)³ ∫_0^1 ((1−s)²/2) L³_g f(Y_{s w̃(t)}(X̄_t)) ds

and

    R_2(t) := ½ L²_g f(X̄_t)w̃(t)² − (T/12n) L²_g f(X̄_t).
By well-known moment inequalities for stochastic differential equations we have
that for all r ≥ 1, sup_t E|X_t|^r < ∞; that is, X_t is bounded in L_r-norm
uniformly in t. Now w̃ is Gaussian, and it follows from Lemma 3 that exp K|w̃(t)|
is similarly bounded. But by Lemma 1 (ii),
|X̄_t| ≤ K exp K|w̃(t)| · |X_t|, and so sup_t E|X̄_t|^r < ∞ also for all r ≥ 1.
Now it follows from Lemma 1 (ii) and the Lipschitz nature of L³_g f that
|R_1(t)| ≤ k_1 exp k_2|w̃(t)| · |X̄_t| · |w̃(t)|³. Since |w̃(t)| is O(h^{1/2}) in
L_r for all r ≥ 1, uniformly in t, Hölder's inequality implies that |R_1(t)| is
O(h^{3/2}) in L_r for all r, uniformly in t. Similarly, by Lemma 3, |R_2(t)| is
uniformly O(h).
It is convenient at this point to introduce a special notation for orders.
We shall say a parameterized process U(t,x,n,w) is "O_c(h^α)" if for all
integers r ≥ 1, sup_{t,x,n} E|U/(b_n h^α)|^r < ∞, where, as before, h = T/n,
and where (b_n(w)) is a positive (P_n)-adapted sequence bounded uniformly in n.
It is clear from Lemma 2 that DX̂_t, DX̂_t^{-1} and X̂_{t*}^{-1}[g,f] are all
bounded by such a (P_n)-adapted sequence b_n and so are O_c(1) in this sense.
Now consider the "error" process Z_t := X̂_t^{-1} ∘ X̄_t. We have Z_0(x) = x.
Ordinary calculus yields

    dZ_t/dt = X̂_{t*}^{-1}[g,f](Z_t)w̃(t) + DX̂_t^{-1}(X̄_t)(R_1(t) + R_2(t))   (11)
            =: J_1(t) + J_2(t) + J_3(t).

The term J_1(t) has the alternative expression DX̂_t^{-1}(X̄_t)[g,f](X̄_t)w̃(t);
hence |J_1(t)| ≤ k_3 b_n |X̄_t| |w̃(t)|. An appeal to the previous moment bounds
then shows that J_1(t) is O_c(h^{1/2}); similarly J_2(t) is O_c(h^{3/2}) and
J_3(t) is O_c(h).

Now an application of Gronwall's inequality to (11) shows that Z_t(x) − x is
O_c(h^{1/2}) (with b_n being replaced here by k_4 exp(k_4 b_n T) for some k_4)
and that for ih ≤ t ≤ (i+1)h, Z_t − Z_{ih} is O_c(h^{3/2}). Furthermore it
follows from (2) and the equations for its derivatives, and the Lévy modulus
theorem, that there is a (P_n)-adapted sequence c_n, uniformly bounded in n,
such that

    |X̂_{t*}^{-1} L^i_g f(X̄_t) − X̂_{s*}^{-1} L^i_g f(X̄_s)| ≤ c_n |x| h^{1/2−δ},   i = 1 or 2,

for all |t−s| ≤ h, 0 ≤ s ≤ t ≤ T, and for 0 < δ < ½. These bounds, and the
bounds on the derivatives of the vector fields of (11) given by Lemma 2, show
that (11) can be transformed into the discrete form

    Z_{ih+h} − Z_{ih} = X̂_{ih*}^{-1}[g,f](Z_{ih}) Δ_i w̃ + ½ X̂_{ih*}^{-1} L²_g f(Z_{ih}) Δ_i u + R_{3,i}
                     =: J_3(i) Δ_i w̃ + J_4(i) Δ_i u + R_{3,i},

where Δ_i w̃ = ∫_{ih}^{ih+h} w̃(t) dt and Δ_i u = ∫_{ih}^{ih+h} w̃(t)² dt − h²/6.

The remainder R_{3,i} is O_c(h^{5/2−δ}). Both J_3(i) and J_4(i) are
P_n ∨ B_{ih}-measurable; Δ_i w̃ and Δ_i u are independent of this σ-field and
have zero mean. It follows from the Brownian bridge properties of w̃ and the
usual moment inequalities for martingale transforms that for some (P_n)-adapted
sequence d_n, bounded in n,

    ( E[ |Σ_{i=0}^{n−1} J_4(i) Δ_i u|² | P_n ] )^{1/2} ≤ d_n h^{3/2},

and, more generally, that Σ_i J_4(i) Δ_i u is O_c(h^{3/2}). Clearly
Σ_i R_{3,i} is O_c(h^{3/2−δ}). A further expansion of the coefficients about x
gives

    Z_T(x) − x = Σ_{i=0}^{n−1} X̂_{ih*}^{-1}[g,f](x) Δ_i w̃
                 + ( Σ_i J_5(i)(Z_{ih} − x) Δ_i w̃ + R_4 )   (12)
               =: J_6 + R_5,

where R_4 is O_c(h^{3/2−δ}). Since J_5(i) := D(X̂_{ih*}^{-1}[g,f])(y) for some
y = (1−θ)x + θZ_{ih}, 0 < θ < 1, J_5(i) is O_c(1); so J_5(i)(Z_{ih} − x) is
O_c(h^{1/2}) and is P_n ∨ B_{ih}-measurable and independent of Δ_i w̃. It
follows, again by martingale arguments, that Σ_i J_5(i)(Z_{ih} − x) Δ_i w̃ is
O_c(h^{3/2}).

We now consider the behaviour of the normalized error ξ_n := n(Z_T(x) − x).
If R_5 is the sum of the remainder terms in (12), then nR_5 = O_c(h^{1/2−δ}).
So, for all sufficiently large r, it follows that for some positive n-bounded
(P_n)-adapted sequence (b_n) (remembering that h = T/n),

    Σ_n E[ b_n^{−r} E[|nR_5|^r | P_n] ] = Σ_n E[ b_n^{−r} |nR_5|^r ] < ∞,

and the moment form of the Borel-Cantelli lemma, and Jensen's inequality, then
show that for all r ≥ 1, E[|nR_5|^r | P_n] converges to zero a.s.
Now consider nJ_6. This possesses a P_n-conditional normal distribution on R^d
which is the same as that of (T/√12) Σ_i X̂_{ih*}^{-1}[g,f](x) Δ_i v, where
(v(t)) is some Brownian motion on an augmented probability space that is
independent of w, and where Δ_i v := v(ih+h) − v(ih). Given the continuity
properties of X̂_{t*}^{-1}[g,f], it is clear that this distribution converges
weakly to the limit normal distribution of the theorem, and that its moments
converge correspondingly. The a.s. convergence to zero of all the conditional
moments of the remainder terms nR_5 then implies that the distribution of ξ_n
behaves in the same way. This completes the proof of (i).
Now consider (ii). Let E and E_0 denote expectations for P and P_0
respectively. Let M := dP/dP_0 and assume for the moment that M > 0 a.s. (P_0).
For n ∈ I, (P_n) is increasing and ∨_n P_n = B_T; and so, with
M_n := E_0[M | P_n], (M_n, P_n, P_0)_{n∈I} is a martingale with M_n → M a.s.
(P_0). Since E_0[M²] = E[M] < ∞, we have that the tail of the quadratic
variation satisfies

    E_0[(M − M_n)² | P_n] = E_0[M² | P_n] − M_n² → 0 a.s. (P_0).

To prove (ii) it is sufficient to establish convergence to the corresponding
limits of the moments E[|ξ_n|² | P_n] and of the values of the characteristic
function E[exp(ic′ξ_n) | P_n] for rational c ∈ R^d. We consider just the first;
the argument for the others is the same. Now M > 0 a.s. (P_0) implies that
M_n > 0 a.s. (P_0). We have, by routine arguments,

    E[|ξ_n|² | P_n] = M_n^{−1} E_0[M |ξ_n|² | P_n]
                    = E_0[|ξ_n|² | P_n] + M_n^{−1} E_0[(M − M_n)|ξ_n|² | P_n].

The second term is bounded by

    M_n^{−1} ( E_0[(M − M_n)² | P_n] )^{1/2} ( E_0[|ξ_n|⁴ | P_n] )^{1/2};

so it follows from (i) and the properties just established for M_n that this
converges to zero a.s. (P_0), and that E[|ξ_n|² | P_n] converges to the correct
limit.
For cases where P_0(M = 0) > 0, we restrict our attention to those w for
which M_n > 0; since {M_n > 0} is P_n-measurable and P(M_n > 0) = 1, the
results still hold a.s. (P). Similarly, we have that

    P( E[exp(ic′ξ_n) | P_n] → exp(−½ c′V_0 c) for all rational c ∈ R^d ) = 1,

and, given the continuity properties of characteristic functions, this suffices
to prove the a.s. weak convergence of the conditional distributions. The proof
of Theorem 1 is complete.
Proof of Theorem 2: This follows from Theorem 1 and an application of the
"δ-method". First notice that X_T(x) = X̂_T ∘ Z_T(x); it follows from the
smoothness properties of X̂_T that

    X_T(x) = X̂_T(x) + DX̂_T(x)(Z_T(x) − x) + O_c(n^{−2}).

The limits of the conditional distribution and moments of n(X̂_T − X_T) can be
established in exactly the same way as in the proof of Theorem 1, and it is
easy to see that these correspond to the conditionally normal distribution of
the Ito (or Stratonovich) integral

    U_T := DX_T(x) · (T/√12) ∫_0^T X_{t*}^{-1}[g,f](x) dv(t)   (13)

where (v(t)) is a Brownian motion independent of (w(t)). But it follows from
Lemma 2 that

    d(DX_t) = Df(X_t)·DX_t dt + Dg(X_t)·DX_t ∘ dw(t),

and an application of "Stratonovich" stochastic calculus to (13) then yields
the alternative form (6).

REFERENCES

[1] J.M.C. Clark, R.J. Cameron. The maximum rate of convergence of discrete
approximations for stochastic differential equations. B. Grigelionis (Ed.),
Stochastic Differential Systems, Proc. IFIP-WG 7/1 Working Conference, Vilnius
1978. Springer-Verlag, Berlin 1980, pp. 162-171.

[2] Halim Doss. Liens entre équations différentielles stochastiques et
ordinaires. Ann. Inst. Henri Poincaré XIII(2) 1977, pp. 99-125.

[3] Hiroshi Kunita. On the decomposition of solutions of stochastic
differential equations. D. Williams (Ed.), Stochastic Integrals, Proc. Durham
Symposium 1980. Lect. Notes in Maths. 851, Springer-Verlag, Berlin 1981.

[4] E.J. McShane. Stochastic Calculus and Stochastic Models. Academic Press,
NY 1974.

[5] G.N. Mil'stein. Approximate integration of stochastic differential
equations. Theory Prob. Appl. 19 1974, pp. 557-562.

[6] N.J. Newton. PhD Thesis, E.E. Dept., Imperial College, Univ. of London
1982.

[7] Holger Rootzén. Limit distributions for the error in approximations of
stochastic integrals. Ann. Prob. 8(2) 1980, pp. 241-251.

[8] W. Rümelin. Numerical treatment of stochastic differential equations. To
appear in SIAM J. Num. Anal. (1982).

[9] Hector J. Sussmann, V. Jurdjevic. Controllability of nonlinear systems.
J. Diff. Equ. 12(1) 1972, pp. 95-116.

[10] Hector J. Sussmann. On the gap between deterministic and stochastic
ordinary differential equations. Ann. Prob. 6(1) 1978, pp. 19-41.

[11] E. Wong, M. Zakai. On the convergence of ordinary integrals to stochastic
integrals. Ann. Math. Statist. 36 (1965).
STOCHASTIC CONTROL WITH NOISY OBSERVATIONS

M.H.A. Davis

Department of Electrical Engineering
Imperial College, London,
ENGLAND

The last few years have seen considerable progress in nonlinear filtering
theory; the proceedings [23] of the 1980 Les Arcs summer school can be
consulted for an up-to-date account. It is natural to ask what the impact of
these developments might be on control theory for stochastic systems with
noisy observations, since, as indicated by the "separation principle",
filtering plays an essential part in the optimal control of such systems. In
my talk at the Cocoyoc meeting I discussed the general problem of control with
incomplete observations and outlined some recent approaches based on nonlinear
filtering theory. Most of this material is covered in a survey [14] written
for a special issue of Stochastics. In this paper I aim to provide a very
brief summary of recent work together with an updated list of references.

I. PROBLEM FORMULATION

Let us first consider control of a partially-observed diffusion process of
the form

    dx_t = b(x_t,u_t)dt + g(x_t)dv_t   (1)
    dy_t = h(x_t)dt + dw_t             (2)

Here the state process (x_t) takes values in R^d and is governed by equation
(1), in which (v_t) is a vector Brownian motion (BM) and (u_t) is the control
process. The observation process (y_t) is supposed, for notational
convenience, to be scalar, and is given by (2), (w_t) being a BM independent
of (v_t). The control u_t should in some sense be a function of the
observations (y_s, 0 ≤ s ≤ t). Precisely how this dependence should be
formulated is a major question in the theory. The objective will be to
minimize the cost

    J(u) = E[Φ(x_T)]   (3)

where Φ is a non-negative real-valued function and T a fixed

terminal time. Some apparently more general cost functions can be put in this
form; see for example [15]. Standing assumptions will be that b, g are
Lipschitz continuous and have at most linear growth in x (uniformly in u in
the case of b). The control process (u_t) takes values in some compact Borel
space U.
If u_t = u(t, {y_s, 0 ≤ s ≤ t}) depends sufficiently smoothly on y, then
(1), (2) can be considered as a stochastic differential equation in the normal
Ito sense. However, one does not wish to be restricted to smooth functions,
since discontinuous controls of the bang-bang type arise in many situations.
One way around this is to use the Girsanov measure transformation; for this it
is necessary that the diffusion matrix g(x) be square and non-singular for
each x, with, say, bounded inverse. Let (v°_t, y_t) be independent BMs on some
probability space (Ω,F,P) and let x_t be the solution of the Ito equation

    dx_t = g(x_t)dv°_t.

Now let u_t be any process adapted to Y_t = σ{y_s, 0 ≤ s ≤ t} and define a new
measure P^u by the Radon-Nikodym derivative

    dP^u/dP = exp( ∫_0^T g^{-1}(x_s)b(x_s,u_s)dv°_s − ½ ∫_0^T |g^{-1}(x_s)b(x_s,u_s)|² ds
                   + ∫_0^T h(x_s)dy_s − ½ ∫_0^T h²(x_s) ds ).
Under the standing assumptions P^u is a probability measure, i.e. P^u(Ω) = 1,
and under P^u, (x_t, y_t) are a weak solution of (1), (2), in that the
processes (v_t), (w_t) defined by (1), (2) are independent BMs. (u_t) is then
a feedback control in the direct sense of being adapted to Y_t.
An alternative method of defining a solution, which does not require
non-singularity of g, was introduced by Fleming and Pardoux [22]. For a given
BM (v_t), equation (1) has a strong (Ito) solution for any given measurable
function u: [0,T] → U. Denote by P̃^u the measure induced by this solution on
the canonical space C[0,T;R^d], and let (x_t) be the coordinate process on
C[0,T;R^d]. Define Ω = C[0,T;R^d] × C[0,T;R¹] and let (y_t) be the coordinate
functions on the second factor, generating σ-fields Y_t, and define a measure
P°^u on Ω by

    P°^u(dx,dy) = P̃^{u(y)}(dx) μ_w(dy)   (4)
where μ_w is Wiener measure. Now define P^u by taking

    dP^u/dP°^u = exp( ∫_0^T h(x_s)dy_s − ½ ∫_0^T h²(x_s)ds ) =: Λ_T.   (5)

It is not hard to show that this definition of P^u coincides with that
obtained previously via the Girsanov transformation.
A somewhat wider class of controls can be introduced in this framework,
following [22]. Let G be the set L²[0,T;U] × C[0,T;R¹] with coordinate
functions (u_t, y_t), and let (G_t) be the filtration

    G_t = σ{ y_s, ∫_0^s u(r)dr : 0 ≤ s ≤ t }.

Let A be the set of probability measures ν on (G, G_t) such that (y_t) is a
G_t-BM. This is the set of wide-sense controls. For any ν ∈ A there is a
regular conditional distribution γ(du;y) of u given y, so that

    ν(du,dy) = γ(du;y) μ_w(dy).

The strict-sense controls A_s are those for which γ is a Dirac measure:
γ(du;y) = δ_{u(y)}(du). Then the process u_t = u(y)(t) is adapted to Y_t, so
that strict-sense controls coincide with feedback controls as previously
introduced. The sample space measure corresponding to ν ∈ A is given not by
(4) but by

    P°^ν(dx,dy,du) = P̃^u(dx) ν(du,dy),

and finally the measure P^ν is given by the exponential transformation (5) as
before. The cost corresponding to ν ∈ A is

    J(ν) = E^ν[Φ(x_T)] = ∫ Φ(x_T) dP^ν.

We now wish to choose ν ∈ A so as to minimize J(ν).

II. NONLINEAR FILTERING AND THE "SEPARATED" PROBLEM

Since the state (x_t) cannot be observed directly, we reformulate the
problem in terms of observed (i.e. Y_T-measurable) quantities. First, the cost
can be written

    J(u) = E^u[ E^u[Φ(x_T) | Y_T] ]   (6)

It is advantageous to calculate the conditional expectation in terms of the
measure P°^u rather than P^u. The relation between the two is

    E^u[Φ(x_T) | Y_T] = E°^u[Φ(x_T)Λ_T | Y_T] / E°^u[Λ_T | Y_T] =: σ_T(Φ)/σ_T(1)

where E°^u denotes integration with respect to the measure P°^u and "1"
denotes the function 1(x) ≡ 1. Since the denominator does not depend on the
function Φ, σ_T can be regarded as an unnormalized conditional distribution of
x_T given Y_T, i.e.

    σ_T(Φ) = ∫_{R^d} Φ(x) σ_T(dx).
Returning now to (6) we have

    J(u) = E^u[ σ_T(Φ)/σ_T(1) ]
         = E°^u[ Λ_T σ_T(Φ)/σ_T(1) ]
         = E°^u[ E°^u[Λ_T | Y_T] σ_T(Φ)/σ_T(1) ]
         = E°^u[ σ_T(Φ) ]   (7)

A small extension of standard results in filtering theory shows that the
unnormalized conditional distribution σ_t satisfies the "Zakai equation"

    dσ_t(f) = σ_t(A(u_t)f)dt + σ_t(hf)dy_t   (8)

where A(u) is the differential generator

    A(u)f(x) = ½ Σ_{i,j} (g(x)g′(x))_{ij} ∂²f/∂x_i∂x_j + Σ_i b_i(x,u) ∂f/∂x_i.

Recall that under the measure P°^u, (y_t) is a BM. Equations (7), (8) give the
control problem in "separated" form. The new "state" is the unnormalized
conditional distribution σ_t, whose "dynamics" are given by the Zakai equation
(8), and the cost is the linear functional of σ_T in (7). The initial
condition for (8) is σ_0 = π, where π is the given probability distribution of
the initial state x_0. If this has a density p_0(x) in a suitable function
space, then σ_t has a density p(t,x) which satisfies the "forward" version of
equation (8), namely

    dp(t,x) = A*(u_t)p(t,x)dt + h(x)p(t,x)dy_t,   (9)
    p(0,x) = p_0(x).

See Pardoux [23][31] for details of the Zakai equation in this form.
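The ratio σ_T(Φ)/σ_T(1) can also be estimated directly by Monte Carlo: under
the reference measure the signal is independent of the observations, and each
signal path is weighted by its likelihood Λ_T. The following sketch (my own
illustration with hypothetical coefficients b(x) = −x, g = 1, h(x) = x,
Φ(x) = x; NumPy assumed) computes such a weighted estimate:

```python
import numpy as np

# Weighted-path estimate of E[Φ(x_T)|Y_T] = σ_T(Φ)/σ_T(1), with weights
# Λ_T = exp(∫ h(x)dy − ½∫ h(x)² dt). All parameter values are illustrative.
rng = np.random.default_rng(3)
T, n, M = 1.0, 200, 5000
dt = T / n

def signal_paths(m):             # m independent copies of dx = -x dt + dv
    x = np.zeros((m, n + 1))
    for k in range(n):
        x[:, k + 1] = x[:, k] - x[:, k] * dt + rng.normal(0, np.sqrt(dt), m)
    return x

x_true = signal_paths(1)         # "true" state generating the observations
dy = x_true[0, :-1] * dt + rng.normal(0, np.sqrt(dt), n)   # dy = h(x)dt + dw

x = signal_paths(M)              # independent copies under the reference measure
loglik = (x[:, :-1] * dy).sum(axis=1) - 0.5 * (x[:, :-1] ** 2).sum(axis=1) * dt
lam = np.exp(loglik - loglik.max())            # unnormalized weights Λ_T
estimate = (x[:, -1] * lam).sum() / lam.sum()  # ratio σ_T(Φ)/σ_T(1)
print(estimate)
```

Subtracting the maximum log-likelihood before exponentiating changes nothing in
the ratio but avoids overflow, a standard device in this kind of computation.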



III. CONTROL OF FINITE-STATE MARKOV PROCESSES WITH NOISY OBSERVATIONS

An exactly analogous formulation can be given for control of a Markov
process on the state space {1,2,...,N} with controlled transition intensities
a_ij(u):

    P[x_{t+δ} = j | x_t = i] = a_ij(u)δ + o(δ).

See [16], [8] for details. The observations are as before, i.e.

    dy_t = h(x_t)dt + dw_t

where (w_t) is a BM independent of (x_t). The unnormalized conditional
distribution is given by

    P[x_t = i | Y_t] = q_t^i / Σ_{j=1}^N q_t^j

where q_t′ = (q_t^1,...,q_t^N) satisfies the Zakai equation

    dq_t = A′(u_t)q_t dt + H q_t dy_t   (10)

Here A(u) is the matrix with (i,j)-th entry a_ij(u) and H is the diagonal
matrix with (i,i)-th entry h(i). This problem is formulated for Y_t-adapted or
wide-sense controls in a way exactly similar to the diffusion case. By
conditioning on Y_T, the cost function (3) becomes

    J(u) = E<Φ, q_T>   (11)

where Φ′ := (Φ(1),...,Φ(N)) and <·,·> denotes the inner product in R^N. Thus
the "separated" problem (10)-(11) concerns control of the degenerate process
(q_t). This is an instructive case to study, as it is a problem of partial
observations where the conditional distribution is finite-dimensional. It has
been studied in detail by Bismut [8], but his methods do not generalize to the
diffusion case.
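Because (10) is a finite-dimensional SDE, it can be simulated directly. The
following minimal sketch (an illustration added here, with hypothetical rates
and no control; NumPy assumed) runs an Euler-Maruyama discretization of the
Zakai equation for a two-state chain and normalizes at the end:

```python
import numpy as np

# Euler-Maruyama for the finite-state Zakai equation (10):
#   dq = A'q dt + Hq dy,  H = diag(h(1), h(2)).
# Q[i, j] is the (illustrative) jump rate i -> j; rows sum to zero.
rng = np.random.default_rng(2)
T, n = 1.0, 5000
dt = T / n
Q = np.array([[-1.0, 1.0],
              [2.0, -2.0]])
h = np.array([0.0, 1.0])         # observation function h(i)

x = 0                            # hidden chain; observations dy = h(x)dt + dw
dy = np.empty(n)
for k in range(n):
    dy[k] = h[x] * dt + rng.normal(0.0, np.sqrt(dt))
    if rng.random() < -Q[x, x] * dt:
        x = 1 - x

q = np.array([0.5, 0.5])         # unnormalized conditional distribution
for k in range(n):
    q = q + (Q.T @ q) * dt + h * q * dy[k]   # A'q dt + Hq dy
pi = q / q.sum()                 # normalized filter P[x_T = i | Y_T]
print(pi)
```

Only the final normalization is nonlinear; the evolution of q itself is
bilinear in (q, y), which is the point exploited throughout this section.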

IV. EXISTENCE OF OPTIMAL CONTROLS

This has been a long-standing open question. Even in the linear case,
satisfactory existence results are available only under very restrictive
conditions [15][35]. For nonlinear systems, Christopeit [10] and Elliott and
Kohlmann [20] have given existence results for systems with rather specific
information patterns, using arguments not involving nonlinear filtering. The
result of [20] has also been proved by Cutland [13] by a method involving
non-standard analysis.
The best recent contribution in this area is undoubtedly that of Fleming
and Pardoux [22], who demonstrate the existence of an optimal control in the
class A for the problem described in §I

above. Their argument takes its simplest form when the cost is a terminal cost
as in (3). The cost is re-expressed as in (7), but with σ_t now being the
unnormalized conditional distribution given G_t (rather than Y_t as would be
appropriate for strict-sense controls). σ_T(Φ) is then a G_T-measurable random
variable, and P^ν restricted to G_T is just the control measure ν, so that

    J(ν) = ∫_G σ_T(Φ) dν.

The class of control measures A is compact under weak convergence, so it
remains to show that J(ν) is lower semi-continuous, and this is done in [22]
under some additional assumptions on the system coefficients. The argument is
more involved if J(ν) includes a "running cost". There is some evidence,
adduced in [22], that the optimal control is not a strict-sense one, so that
the generalized controls cannot be dispensed with. Their status is not
entirely clear: they are not randomized or relaxed controls in the
conventional sense of the term (this would involve selecting a distribution
over U at each time t in a Y_t-adapted way), although they can be approximated
by sequences of piecewise-constant relaxed controls. These remarks are
relevant to the question of conditions for optimality, discussed in §VII
below.

V. DYNAMIC PROGRAMMING

A formal "Bellman equation" for a problem similar to the "separated"
problem (7)-(8) was derived many years ago by Mortensen [28], and the
corresponding equation for the Markov chain problem was given even earlier by
Shiryaev [33]. Little, however, could be said about the solution of these
equations. This subject has been revisited by Beneš and Karatzas [2] [5]. The
"Mortensen equation" is to be solved for the function V(t,p) representing the
minimal cost starting at time t with the state x_t having density p. The
unnormalized density then evolves according to the Zakai equation (9) with
p(t,x) = p. In [2] the densities live in an L¹ space and the Mortensen
equation involves first and second Fréchet derivatives in the p variable. A
change-of-variables formula is proved for L¹-valued processes, and this is
used to prove a "verification theorem", to the effect that any solution of the
Mortensen equation gives a lower bound to the achievable cost. The class of
controls that arises naturally from the dynamic programming formalism is that
of functions u(t,p_t), i.e. feedback of the current "state", the state being
in this case the unnormalized conditional density. One is now faced with the
problem of solving the
state equation (9) with u_t = u(t,p_t). The existence of strong solutions is
shown in [5] for sufficiently smooth dependence of u(t,p) on p, but no general
conditions have so far been stated, to my knowledge, under which the control
constructed in the usual way from the solution of the (Bellman-)Mortensen
equation would have these smoothness properties. In view of the remarks in
the previous section, it seems unlikely that such conditions can be given in
any generality, since this would imply the existence of optimal strict-sense
controls.

VI. NONLINEAR SEMIGROUPS

Let W(t,p,φ) be the value function for the control problem
with initial density p, time-to-go t and terminal cost function
φ. Thus W(t,p,φ) = V(T-t,p) in the notation of the previous section.
A natural way of formulating Bellman's "principle of optimality" is to
say that W has the semigroup property

      W(t+s,p,φ) = W(t,p,W(s,·,φ))                             (12)

If we define S_t φ(p) = W(t,p,φ) then S_t is a family of (non-linear)
operators acting on real-valued functions and (12) becomes

      S_{t+s} φ(p) = S_t(S_s φ)(p)

If W is obtained by solving the Bellman-Mortensen equation the
semigroup property is automatic since the equation is of "evolution" type,
evolving forwards from initial data. Since, however, little
information is available about the Bellman equation, there is some interest
in examining the semigroup S_t directly. This idea was initiated by
Nisio [29], [30] for a variety of control problems with complete
observations. Her method was to construct S_t as the lower envelope of
a sequence S_t^(n) of operators corresponding to piecewise-constant
controls. Bensoussan and Lions [7] took a different approach, defining
W(t,p,φ) as the infimal cost and then establishing directly its semigroup
and other properties.
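The semigroup property (12) is exact in discrete time as well, which makes it easy to check numerically. The sketch below is purely illustrative and not from the paper: a hypothetical three-state, two-control chain, with the one-step dynamic-programming operator (Sφ)(x) = min_u Σ_y P_u(x,y)φ(y), for which S_{t+s}φ = S_t(S_sφ) holds by construction.

```python
import numpy as np

# Two controls, each giving a (hypothetical) transition matrix on 3 states.
P = {0: np.array([[0.9, 0.1, 0.0], [0.2, 0.7, 0.1], [0.0, 0.3, 0.7]]),
     1: np.array([[0.5, 0.5, 0.0], [0.0, 0.6, 0.4], [0.1, 0.1, 0.8]])}

def step(phi):
    """One application of the dynamic-programming operator:
    (S phi)(x) = min over controls u of sum_y P_u(x, y) phi(y)."""
    return np.minimum(P[0] @ phi, P[1] @ phi)

def S(t, phi):
    """t-fold composition S_t phi (t discrete steps)."""
    for _ in range(t):
        phi = step(phi)
    return phi

phi = np.array([0.0, 1.0, 3.0])   # terminal cost function
lhs = S(5, phi)                   # S_{2+3} phi
rhs = S(2, S(3, phi))             # S_2 (S_3 phi)
assert np.allclose(lhs, rhs)      # semigroup property (12)
```

The point of the check is that composing the minimizing operator is associative whatever the transition data, mirroring the "principle of optimality" behind (12).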
For the Markov chain problem with noisy observations the Zakai
equation (10) for the un-normalized conditional density is finite-
dimensional, and hence the "separated" problem falls within the scope
of Nisio's original theory; the details were worked out in Davis [16].
The diffusion case has been studied by Fleming [21] following the
Bensoussan/Lions "direct" approach. In his formulation the Zakai
equation (8) is taken to define (for fixed control value u) a process
σ_t taking values in the set M of positive measures on R^d with the
weak topology. The semigroup S_t is then constructed, acting on
functions φ : M → R satisfying growth properties of the form
φ(σ) ≤ (1 + ||σ||)^k for some k.

Davis and Kohlmann [19] noted - but have so far failed to exploit
successfully - some very strong convexity properties in this problem.
Details will be found in [14]. The Zakai equation (8) is a bilinear
equation and hence linear in the initial condition σ_0 = μ. The cost
E^u[c_T(σ_T)] is a linear functional of σ_T. Using these facts it is easy
to show that the infimal cost over constant controls, regarded as a
function S(μ) of the initial measure, is a support function, i.e. it
satisfies the conditions of
- positive homogeneity : S(λμ) = λS(μ), λ ∈ R_+, μ ∈ M
- concavity : S(μ_1 + μ_2) ≥ S(μ_1) + S(μ_2)

It is argued in [19] that support functions form the natural class of
functions on which the semigroup S_t should act. However, certain
topological questions arise: if one takes the space M with any
weak topology then (in contrast to the situation in finite-dimensional
spaces) concave functions are not automatically continuous, and the
Nisio construction depends on some continuity properties. There seem
to be two ways around this obstacle. N. El Karoui [25] has outlined
a semigroup construction which does not depend on continuity properties.
It would, however, be disappointing if continuity had to be thrown away
altogether, since one major purpose of constructing S_t is to establish
smoothness properties of the value function. An alternative approach
is to take the Zakai equation in its density function form (9) and prove
that the solution evolves in some barrelled space (e.g. Hilbert space).
Then the required continuity is automatic [32]. See Bensoussan [6].

VII. NECESSARY CONDITIONS FOR OPTIMALITY

Consider first the separated Markov chain problem (10), (11).
One notes that the observation process (y_t) appears in the role of
"noise" in (10), so that this problem looks like a "stochastic open-loop"
problem if Y_t-adapted controls are being considered. Thus, the
stochastic maximum principles given by Bismut [9] and Kushner [26] are
applicable. Bismut's result states that if u^0 is optimal and (p_t) is
the corresponding solution of (10) then

      ⟨A(u_t^0)p_t, q_t⟩ = max_{a∈U} ⟨A(a)p_t, q_t⟩            (13)

where q_t is the adjoint process, which is characterized in the following
way: there is a unique pair of processes (q_t, r_t) adapted to Y_t such
that q_T = -Φ and

      dq_t = A'(u_t^0)q_t dt + H r_t dt + r_t dy_t             (14)

For the diffusion problem of §II it is appropriate to consider
the dynamics in density function form (9). Then, reasoning in a purely
formal manner, one obtains the same maximum principle (13), (14), A now
being the differential generator and H the operator of multiplication
by h(x). This idea will be found in Kwakernaak [27] and a rigorous
version will be found in Bensoussan [6]. However, such results require
caution: as mentioned in §III above it seems highly likely that in many
cases no strict-sense optimal control exists, in which case necessary
conditions are, of course, vacuous. Necessary conditions should be
given in a framework for which good existence results are available.
In the present context, this means the wide-sense controls of Fleming-
Pardoux [22]. But as mentioned above, it is not currently understood
in what way these controls relate to "local" choice of control value,
and this is of course just what one needs to know in order to
formulate local conditions for optimality.

VIII. PREDICTED MISS PROBLEMS

In 1977 V.E. Beneš started a minor industry with the publication
of his paper [1] on bang-bang stochastic control. The problem is
to steer a linear system to a hyperplane in fixed time using bounded
controls, and the solution is to use bang-bang controls of the form
u_t = -sgn(s'(t)x_t) where x_t is the state and s(t) a deterministic
vector-valued function. The problem is interesting because it has an
explicit solution, but this solution cannot be obtained by standard
application of dynamic programming since the control function is
discontinuous. The result was re-proved and extended by several authors:
Ikeda and Watanabe [24], Davis and Clark [18], Shreve [34], Christopeit
and Helmes [11]. Recently Beneš and Karatzas [3] have shown that the
optimal control using partial observations is just u_t = -sgn(s'(t)x̂_t)
where x̂_t = E[x_t|Y_t]. This is one of the very few partially observable
problems which can be solved explicitly, and, interestingly, although
it is a nonlinear problem it exhibits the "certainty-equivalence"
principle - the optimal control is obtained by estimating the state
and then using this estimate as though it were the true state.
Christopeit and Helmes [11] have further information on this problem.
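A purely illustrative simulation can convey the flavour of the bang-bang rule. The sketch below is not Beneš' exact setting: it takes the scalar, fully observed case dx_t = u_t dt + dW_t with |u_t| ≤ 1, feedback u_t = -sgn(x_t), hypothetical horizon and step sizes, and compares the terminal miss |x_T| against the uncontrolled paths driven by the same noise.

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_steps, n_paths = 1.0, 200, 2000        # hypothetical parameters
dt = T / n_steps

x_ctrl = np.full(n_paths, 1.0)              # bang-bang controlled state
x_free = np.full(n_paths, 1.0)              # uncontrolled state, same noise
for _ in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(dt), n_paths)
    u = -np.sign(x_ctrl)                    # bang-bang feedback on the state
    x_ctrl = x_ctrl + u * dt + dW           # Euler-Maruyama step
    x_free = x_free + dW

# On average the controlled paths end closer to the target hyperplane {x = 0}.
assert np.mean(np.abs(x_ctrl)) < np.mean(np.abs(x_free))
```

Using common noise for both ensembles makes the comparison a paired one, so the reduction in mean miss is visible with relatively few paths.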

IX. CONCLUDING REMARKS

As will be apparent from the above brief survey, much interesting
work has been done on the partially-observable stochastic control
problem, but some aspects of the theory are still in a rudimentary
state. The nonlinear semigroup side is perhaps the best-developed,
with satisfactory treatments existing in [21] and [6]. Further sub-
stantial progress with the Bellman-Mortensen equation seems to me
very hard, while progress on necessary conditions must await satis-
factory resolution of some questions in existence theory, which remains
- shall we say - a challenging area. Certainly the study of particular
problems admitting explicit solutions should continue.

X. REFERENCES

ZW = Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete
SICON = SIAM Journal on Control (and Optimization)

[1] V.E. Beneš, Full "bang" to reduce predicted miss is optimal, SICON 15(1976)62-83
[2] V.E. Beneš and I. Karatzas, On the relation of Zakai's and Mortensen's equations, submitted to SICON
[3] V.E. Beneš and I. Karatzas, Examples of optimal control for partially-observable systems: comparison, classical and martingale methods, Stochastics (1980)43-64
[4] V.E. Beneš and I. Karatzas, Estimation and control for linear partially-observable systems with non-gaussian initial distribution, 10th Conference on Stochastic Processes and their Applications, Montreal, Canada, 1981
[5] V.E. Beneš and I. Karatzas, Filtering of diffusions controlled through their conditional measures, 20th IEEE Conference on Decision and Control, San Diego, Cal., 1981
[6] A. Bensoussan, this volume
[7] A. Bensoussan and J.L. Lions, Applications des inéquations variationnelles en contrôle stochastique, Dunod, Paris, 1978
[8] J.M. Bismut, Un problème de contrôle stochastique avec observation partielle, ZW 49(1979)63-95
[9] J.M. Bismut, An introductory approach to duality in optimal stochastic control, SIAM Review 20(1978)62-78
[10] N. Christopeit, Existence of optimal stochastic control under partial observation, ZW 51(1980)201-213
[11] N. Christopeit and K. Helmes, Optimal control for a class of partially-observable systems, Stochastics, to appear
[12] N. Christopeit and K. Helmes, On Beneš' bang-bang control problem, preprint, University of Bonn, 1981
[13] N.J. Cutland, Optimal controls for partially observed stochastic systems: an infinitesimal approach, submitted to Stochastics
[14] M.H.A. Davis, Some current issues in stochastic control theory, Stochastics, to appear
[15] M.H.A. Davis, Martingale methods in stochastic control, in Stochastic Control and Stochastic Differential Systems, Lecture Notes in Control and Information Sciences 16, Springer-Verlag, Berlin
[16] M.H.A. Davis, Nonlinear semigroup in the control of partially-observable stochastic systems, in Measure Theory and Applications to Stochastic Analysis, Lecture Notes in Mathematics 695, Springer-Verlag, Berlin, 1978
[17] M.H.A. Davis, The separation principle in stochastic control via Girsanov solutions, SICON 14(1976)176-188
[18] M.H.A. Davis and J.M.C. Clark, On "predicted-miss" stochastic control problems, Stochastics 2(1979)197-210
[19] M.H.A. Davis and M. Kohlmann, On the nonlinear semigroup of stochastic control under partial observations, preprint, Imperial College, London, 1981
[20] R.J. Elliott and M. Kohlmann, On the existence of optimal partially observed controls, preprint, Hull University, 1981
[21] W.H. Fleming, Nonlinear semigroup for controlled partially-observed diffusions, SICON 20(1982)286-301
[22] W.H. Fleming and E. Pardoux, Existence of optimal controls for partially-observed diffusions, SICON 20(1982)261-285
[23] M. Hazewinkel and J.C. Willems (eds), Stochastic Systems: the mathematics of filtering and identification and applications, D. Reidel, Dordrecht, 1981
[24] N. Ikeda and S. Watanabe, A comparison theorem for solutions of stochastic differential equations and its applications, Osaka J. Math. 14(1977)619-633
[25] N. El Karoui, personal communication
[26] H.J. Kushner, Necessary conditions for continuous time stochastic optimization problems, SICON 10(1972)550-565
[27] H. Kwakernaak, A minimum principle for stochastic control problems with output feedback, Systems & Control Letters 1(1981)74-77
[28] R.E. Mortensen, Stochastic optimal control with noisy observations, International J. Control 4(1966)455-464
[29] M. Nisio, Stochastic Control Theory, Indian Statistical Institute Lecture Notes No. 9, MacMillan, Delhi, 1981
[30] M. Nisio, On stochastic optimal controls and envelope of Markovian semigroups, in Stochastic Differential Equations, ed. K. Itô, Wiley, New York, 1977
[31] E. Pardoux, Équations du filtrage non linéaire, de la prédiction et du lissage, Stochastics 6(1982)193-232
[32] R.T. Rockafellar, Level sets and continuity of conjugate convex functions, Trans. Am. Math. Soc. 123(1966)46-63
[33] A.N. Shiryaev, Some new results in the theory of controlled stochastic processes [in Russian], Trans. 4th Prague Conference on Information Theory, Statistical Decision Functions and Random Processes, Czech Academy of Sciences, Praha, 1967
[34] S.E. Shreve, Reflected Brownian motion in the "bang-bang" control of Brownian drift, SICON 19(1981)469-478
[35] W.M. Wonham, On the separation theorem of stochastic control, SICON 6(1968)312-326
APPLICATIONS OF DUALITY TO MEASURE-VALUED DIFFUSION PROCESSES

Donald A. Dawson, Thomas G. Kurtz,
Department of Mathematics and Statistics, Department of Mathematics,
Carleton University, University of Wisconsin-Madison,
Ottawa, Canada K1S 5B6. Madison WI 53706, U.S.A.

1. INTRODUCTION TO MEASURE-VALUED DIFFUSION PROCESSES

Examples of measure-valued diffusion processes arise in nonlinear filtering


theory as solutions of stochastic partial differential equations which describe
the evolution of the normalized and unnormalized conditional distributions. In
other applications such as population genetics examples of measure-valued diffu-
sions have arisen which cannot be obtained as strong solutions of stochastic par-
tial differential equations. In these examples the probability law of the measure-
valued process is characterized as the unique solution of a martingale problem.
In this approach to measure-valued diffusion processes a key role is played by
the notion of a dual process which was originally introduced in the study of infi-
nite particle systems. The purpose of this paper is to describe the martingale
problem formulation of measure-valued diffusion processes with special emphasis
on the role of duality and to present a number of examples.

2. MARTINGALE PROBLEM FORMULATION OF MEASURE-VALUED DIFFUSION PROCESSES

Let S denote a complete separable metric space and M(S) denote the space
of bounded Radon measures on S. With an appropriate choice of metric M(S) is
also a complete separable metric space whose topology is equivalent to that of
weak convergence of measures. Let E denote a closed subset of M(S), B(E) the
σ-algebra of Borel subsets of E and L∞(E) the space of bounded measurable func-
tions on E. Let Ω_C^E := C([0,∞),E), Ω_D^E := D([0,∞),E), the spaces of functions
from [0,∞) into E which are continuous, respectively right continuous with left
limits. For s ≥ 0, ω ∈ Ω_C^E, let X(s,ω) := ω(s) and let

      F_t := ∩_{ε>0} σ{X(s): 0 ≤ s ≤ t + ε}.

For an F_t-stopping time τ, let F_τ := {A: for s ∈ [0,∞), A∩{τ≤s} ∈ F_s}.

An E-valued diffusion process is given by a family of probability measures
{P_μ : μ ∈ E} on Ω_C^E such that

(2.1) (Measurable) the mapping μ → P_μ is measurable from E into P(Ω_C^E),
      the space of probability measures on Ω_C^E furnished with the topology of
      weak convergence,

(2.2) P_μ({ω: X(0,ω) = μ}) = 1,

(2.3) (Strong Markov property) for μ ∈ E, t ≥ 0, f ∈ L∞(E), and all a.s. finite
      F_t-stopping times τ, E_μ(f(X(τ+t))|F_τ) = T(t)f(X(τ)), P_μ-a.s., where

      T(t)f(μ) = ∫ f(X(t,ω)) P_μ(dω).

Given a subset A ⊂ C(E)×L∞(E), a measurable E-valued stochastic process
{Y(t): t ≥ 0} defined on a probability space (Ω,G,P) with filtration G_t is
said to be a solution to the martingale problem for A if for every (f,g) ∈ A,

(2.4) f(Y(t)) - ∫_0^t g(Y(s))ds is a G_t-martingale.

A family of probability laws {P_μ: μ ∈ E} is said to be a P(Ω_D)[P(Ω_C)]-solution
to the initial value martingale problem for A if for every (f,g) ∈ A,

(2.5) for each μ ∈ E, P_μ ∈ P(Ω_D)[P(Ω_C)] and P_μ({ω: X(0,ω) = μ}) = 1,

(2.6) for each probability law P_μ the canonical process {X(t): t ≥ 0} is a so-
lution to the martingale problem for A.

2.1. THEOREM (Stroock and Varadhan)

Let Φ denote a subset of L∞(E) which generates L∞(E) under bounded point-
wise convergence (e.g. C_b(E)). Assume that for each t ≥ 0 and F ∈ Φ there is
a function A_{F,t} ∈ L∞(E) such that for μ ∈ E,

(2.7) E_μ(F(X(t))) = A_{F,t}(μ)

for any P(Ω_D)[P(Ω_C)]-solution {P_μ: μ ∈ E} of the initial-value problem for A.
Then there is at most one P(Ω_D)[P(Ω_C)]-solution to the initial-value problem for
A and if such a solution exists, then it is measurable and satisfies the strong
Markov property (2.3).
Proof. Refer to Stroock and Varadhan ([11]; section 6.2) or Ethier and Kurtz
([3]; section 5.4).

In the finite dimensional case a large class of diffusion processes can be
characterized as unique solutions of martingale problems associated with second
order elliptic differential operators (refer to Stroock and Varadhan ([11]; chapters
6,7)). By analogy a large class of measure-valued diffusion processes can be ob-
tained from martingale problems associated with second order differential operators
of the following form: for F ∈ D(L),

(2.8) LF(μ) := A(μ)(δF(μ)/δμ(x)) + ½ B(μ)(δ²F(μ)/δμ(x)δμ(y))

where

(2.9) δF(μ)/δμ(x) is the variational derivative defined by:
      δF(μ)/δμ(x) := lim_{ε↓0} [F(μ + εδ_x) - F(μ)]/ε, where δ_x ∈ M(S) consists of
      a unit atom at x ∈ S;

(2.10) for each μ ∈ E, A(μ) is a linear functional defined on D(A) ⊂ C(S) and
      B(μ) is a bilinear functional defined on D(B) ⊂ C(S×S),

(2.11) for F ∈ D(L), δF(μ)/δμ(·) ∈ D(A) and δ²F(μ)/δμ(·)δμ(·) ∈ D(B).

The set D(L) ⊂ C(E), the space of continuous functions on E, must be chosen
so that the first and second variational derivatives exist and can be computed, and
it must also be sufficiently rich so that it generates L∞(E) under bounded point-
wise convergence. With this motivation we now introduce a family of functions on
M(S) which has these properties.

In this development we assume that S is locally compact with one-point com-
pactification S*. Let S*^N denote the N-fold Cartesian product of S* and

      C := ∪_{N=0}^∞ C(S*^N) (disjoint union), where C(S*^0) := R^1.

For f ∈ C, N(f) := N if f ∈ C(S*^N). Similarly we define

      D := ∪_{N=0}^∞ D(S*^N), where for each N, D(S*^N) is assumed to be a dense
subset of C(S*^N).
For f ∈ C, a function on M(S*) of the form

(2.12) F_f(μ) := ∫_{S*}···∫_{S*} f(x_1,...,x_{N(f)}) μ^{(N(f))}(dx), where

      μ^{(N)}(dx) := μ(dx_1)···μ(dx_N),

is said to be a monomial on M(S*). For a compact subset E of M(S*) we denote by
H_1(E) the set of monomials restricted to E. The algebra of polynomials H(E) is
defined to be the smallest algebra of functions on E which contains H_1(E). The
sets H_{1,D}(E) [H_D(E)] denote the monomials [polynomials] on E with coefficients
f ∈ D.

2.2. LEMMA

(i) H_D(E) is dense in C(E).
(ii) H_D(E) generates L∞(E) under bounded pointwise convergence.

Proof: Let |||·|||_N denote the supremum norm on C(S*^N). If f, f_n ∈ C(S*^N) are
such that |||f - f_n|||_N → 0 as n → ∞, then |F_{f_n}(μ) - F_f(μ)| → 0 uniformly on
E. From this it follows that H_D(E) is dense in H(E). If μ and ν ∈ E and
∫f(x)μ(dx) = ∫f(x)ν(dx) for all f ∈ D(S*), then μ = ν. Therefore the algebra
H_D(E) separates points of E. The Stone-Weierstrass theorem then implies that
H_D(E) is dense in C(E). Part (ii) follows from (i) since C(E) generates L∞(E)
under bounded pointwise convergence.
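For a purely atomic measure μ the monomial (2.12) reduces to a finite sum, which makes the definition easy to check numerically. The sketch below is illustrative only (the atoms and weights are arbitrary choices): it evaluates F_f(μ) for f(x_1,x_2) = x_1·x_2 by brute force over index tuples and confirms that it equals (∫x μ(dx))², as the product structure of μ^{(N)} demands.

```python
import itertools

def F(f, atoms, weights, N):
    """Monomial F_f(mu) = ∫...∫ f(x_1,...,x_N) mu(dx_1)...mu(dx_N)
    for the atomic measure mu = sum_i weights[i] * delta_{atoms[i]}."""
    total = 0.0
    for idx in itertools.product(range(len(atoms)), repeat=N):
        w = 1.0
        for i in idx:
            w *= weights[i]               # product measure mu^(N) of the tuple
        total += f(*(atoms[i] for i in idx)) * w
    return total

atoms, weights = [0.0, 1.0, 2.0], [0.2, 0.5, 0.3]   # arbitrary atomic mu
f = lambda x1, x2: x1 * x2
m1 = sum(w * a for a, w in zip(atoms, weights))     # first moment ∫ x mu(dx)
assert abs(F(f, atoms, weights, 2) - m1**2) < 1e-12
```

The same brute-force evaluator works for any N and any coefficient function f of N variables, at cost (number of atoms)^N.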

For a function of the form (2.12) the variational derivatives are given by:

      δF_f(μ)/δμ(x) = Σ_{j=1}^{N(f)} ∫_{S*}···∫_{S*} f(x_1,...,x_{j-1},x,x_{j+1},...,x_{N(f)}) μ^{(N(f)/j)}(dx)

      δ²F_f(μ)/δμ(x)δμ(y)
        = Σ_{j=1}^{N(f)} Σ_{k=1,k≠j}^{N(f)} ∫_{S*}···∫_{S*} f(x_1,...,x_{j-1},x,x_{j+1},...,x_{k-1},y,x_{k+1},...,x_{N(f)}) μ^{(N(f)/jk)}(dx)

where

      μ^{(N/j)}(dx) = Π_{ℓ=1,ℓ≠j}^{N} μ(dx_ℓ);   μ^{(N/jk)}(dx) = Π_{ℓ=1,ℓ≠j,k}^{N} μ(dx_ℓ).
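The first of these formulas can be checked against the defining limit (2.9) by finite differences, again for an atomic measure. In the illustrative sketch below (arbitrary atoms and weights, not from the paper), F_f(μ) = (∫x μ(dx))² for f(x_1,x_2) = x_1·x_2, and the sum-over-slots formula gives δF_f(μ)/δμ(x) = 2x·∫y μ(dy); perturbing the weight of one atom approximates F_f(μ + εδ_x).

```python
import numpy as np

atoms = np.array([0.0, 1.0, 2.0])     # support of the atomic measure mu
w = np.array([0.2, 0.5, 0.3])         # weights of mu

def Ff(weights):
    """F_f(mu) = ∫∫ x1*x2 mu(dx1)mu(dx2) = (∫ x mu(dx))^2."""
    return float(np.dot(atoms, weights) ** 2)

x_idx, eps = 2, 1e-6                  # perturb mu at the atom x = 2.0
w_eps = w.copy()
w_eps[x_idx] += eps                   # weights of mu + eps * delta_x
numeric = (Ff(w_eps) - Ff(w)) / eps   # difference quotient from (2.9)
# formula: sum over slots j of ∫ f(..., x, ...) mu^{(N/j)}(dx) = 2 * x * ∫ y mu(dy)
exact = 2.0 * atoms[x_idx] * float(np.dot(atoms, w))
assert abs(numeric - exact) < 1e-4
```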

The operator (2.8) is said to be a differential operator with polynomial coef-
ficients of degree K if, acting on a function of the form (2.12), it can be written
in the form: for f ∈ D,

(2.13) LF_f(μ) = F_{G_{N(f)}f}(μ) + Σ_{ℓ=-1}^{K} Σ_{j=1}^{N(f)} (F_{A_ℓ^{(j)}f}(μ) - F_f(μ))

             + ½ Σ_{ℓ=-1}^{K} Σ_{j=1}^{N(f)} Σ_{k=1,k≠j}^{N(f)} (F_{B_ℓ^{(jk)}f}(μ) - F_f(μ)) + V(N(f))·F_f(μ)

where V(n) = c_1 n + c_2 n(n-1),

      G_N is assumed to be the infinitesimal generator of a strongly continuous
semigroup of operators on C(S*^N) whose domain contains D(S*^N),

      A_ℓ is a linear mapping from D(S*) to D(S*^(1+ℓ)), ℓ = -1,0,1,...,K,
and A_ℓ^{(j)}f denotes the action of A_ℓ on f as a function of the
jth variable,

      B_ℓ is a linear mapping from D(S*^2) to D(S*^(2+ℓ)), ℓ = -1,0,1,...,K,
and B_ℓ^{(jk)}f denotes the action of B_ℓ on f as a function of the
jth and kth variables.

Let Q denote the linear space of functions on D which contains functions
of the form:

(2.14) F(μ,f) = F_μ(f) := F_f(μ) for μ ∈ E, f ∈ D.

Then (2.13) can be rewritten as

(2.15) LF_f(μ) = L#F_μ(f) + V(N(f))F_μ(f)

where for F_μ ∈ Q,

(2.16) L#F_μ(f) = F_μ(G_{N(f)}f) + Σ_{ℓ=-1}^{K} Σ_{j=1}^{N(f)} a_ℓ[F_μ(A_ℓ^{(j)}f/a_ℓ) - F_μ(f)]

             + ½ Σ_{ℓ=-1}^{K} Σ_{j=1}^{N(f)} Σ_{k=1,k≠j}^{N(f)} b_ℓ[F_μ(B_ℓ^{(jk)}f/b_ℓ) - F_μ(f)]

where a_ℓ, b_ℓ are arbitrary positive real numbers and V(n) = c_1 n + c_2 n(n-1).

The application of the duality method grows out of the observation that L#
has the structure of an infinitesimal generator of a D-valued Markov process re-
stricted to functions in Q. The dynamics of the D-valued Markov process {Y(t): t ≥ 0}
(if it exists) with infinitesimal generator L# involves two basic mechanisms:

(2.17) (Jump mechanism) f → (A_ℓ^{(j)}f)/a_ℓ with jump rate a_ℓ, j = 1,2,...,N(f),

       f → (B_ℓ^{(jk)}f)/b_ℓ with jump rate ½b_ℓ, j,k = 1,2,...,N(f);

(2.18) (Spatial diffusion semigroup) between jumps Y(t) evolves in a continuous
and deterministic way, namely

       f → S_t^{N(f)} f

where {S_t^N : t ≥ 0} represents the semigroup of transformations on C(S*^N)
with infinitesimal generator G_N. We assume that D(S*^N) is invariant under S_t^N.

If V(·) ≡ 0, L# is the infinitesimal generator of a conservative right con-
tinuous D-valued Markov process {Y(t): t ≥ 0} and L is the infinitesimal genera-
tor of an E-valued diffusion {X(t): t ≥ 0}, then

(2.19) E_μ(F_f(X(t))) = E(F_μ(Y(t))|Y(0) = f).

Processes X and Y which satisfy the relation (2.19) are said to form a pair of
dual processes. The development of the theory of dual processes in a general set-
ting is the objective of the next section.

3. DUALITY RELATIONS FOR SOLUTIONS OF MARTINGALE PROBLEMS

The existence of a process Y dual to a process X can provide a powerful
tool to study the latter if it turns out that the dual process is in some sense
more tractable. Important applications of dual processes arise in the study of
infinite interacting systems (Spitzer [10], Holley and Liggett [4], and Holley and
Stroock [6]) as well as finite dimensional diffusions (Holley, Stroock and Wil-
liams [5] and Shiga [9]). In addition, applications to infinite dimensional dif-
fusions have recently been developed (Shiga [8], Dawson and Hochberg [2]).

Let E_1 and E_2 be metric spaces and C(E_1×E_2) denote the space of
continuous functions on E_1×E_2.

Let

(3.1.a) A_1 := {(f(·,y), g(·,y)): y ∈ E_2}, where f,g ∈ C(E_1×E_2), f(·,y), g(·,y) ∈ C(E_1);

(3.1.b) A_2 := {(f(x,·), h(x,·)): x ∈ E_1}, where f,h ∈ C(E_1×E_2).

Let {X(t): t ≥ 0} be an E_1-valued measurable process which is adapted to a filtra-
tion F_t and let τ be an F_t-stopping time. Let {Y(t): t ≥ 0} be an E_2-valued
measurable process which is adapted to a filtration G_t and let σ be a G_t-stop-
ping time. Assume that X and τ are independent of Y and σ. We also assume
that for each T > 0 there is an integrable random variable Γ_T such that

(3.2.a) sup_{r,s,t ≤ T} |{1 ∨ α(X(r∧τ))} f(X(s∧τ),Y(t∧σ))| ≤ Γ_T,  (x∧y) := min(x,y),

(3.2.b) sup_{r,s,t ≤ T} |{1 ∨ β(Y(r∧σ))} f(X(s∧τ),Y(t∧σ))| ≤ Γ_T,  (x∨y) := max(x,y),

(3.2.c) sup_{r,s,t ≤ T} |{1 ∨ α(X(r∧τ))} g(X(s∧τ),Y(t∧σ))| ≤ Γ_T,

(3.2.d) sup_{r,s,t ≤ T} |{1 ∨ β(Y(r∧σ))} h(X(s∧τ),Y(t∧σ))| ≤ Γ_T,

and there is a constant c_T such that

(3.2.e) sup_{t≤T} |∫_0^{t∧τ} α(X(u))du| + sup_{t≤T} |∫_0^{t∧σ} β(Y(u))du| ≤ c_T.

3.1. THEOREM

Assume that for each y ∈ E_2,

(3.3.a) f(X(t∧τ),y) - ∫_0^{t∧τ} g(X(s),y)ds is an F_t-martingale,

and for each x ∈ E_1,

(3.3.b) f(x,Y(t∧σ)) - ∫_0^{t∧σ} h(x,Y(s))ds is a G_t-martingale.

Then

(3.4) E[f(X(t∧τ),Y(0))·exp(∫_0^{t∧τ} α(X(u))du)] - E[f(X(0),Y(t∧σ))·exp(∫_0^{t∧σ} β(Y(u))du)]

    = ∫_0^t E[{χ_{s<τ}{g(X(s),Y((t-s)∧σ)) + α(X(s))f(X(s∧τ),Y((t-s)∧σ))}

      - χ_{t-s<σ}{h(X(s∧τ),Y(t-s)) + β(Y(t-s))f(X(s∧τ),Y((t-s)∧σ))}}

      ·exp(∫_0^{s∧τ} α(X(u))du + ∫_0^{(t-s)∧σ} β(Y(u))du)]ds,

where χ_{s<τ} denotes the indicator function of the event {s < τ}.

The proof of the theorem is based on the following three lemmas.

3.1.1. LEMMA

Suppose that the function f(s,t) is absolutely continuous in s for each
fixed t and absolutely continuous in t for each fixed s. Let f_1 = ∂f/∂s,
f_2 = ∂f/∂t, and assume that

(3.5) ∫_0^T ∫_0^T |f_i(s,t)| ds dt < ∞, i = 1,2, T > 0.

Then for almost every t,

(3.6) f(t,0) - f(0,t) = ∫_0^t [f_1(s,t-s) - f_2(s,t-s)]ds.

Proof. ∫_0^T ∫_0^t (f_1(s,t-s) - f_2(s,t-s)) ds dt

      = ∫_0^T ∫_0^t f_1(t-s,s) ds dt - ∫_0^T ∫_0^t f_2(s,t-s) ds dt

      = ∫_0^T ∫_s^T f_1(t-s,s) dt ds - ∫_0^T ∫_s^T f_2(s,t-s) dt ds

      = ∫_0^T (f(T-s,s) - f(0,s)) ds - ∫_0^T (f(s,T-s) - f(s,0)) ds

      = ∫_0^T (f(s,0) - f(0,s)) ds.

By differentiating with respect to T we obtain

      ∫_0^t (f_1(s,t-s) - f_2(s,t-s)) ds = f(t,0) - f(0,t),

and the proof is complete.
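The identity (3.6) is easy to verify numerically for a smooth test function. The sketch below (an arbitrary choice of f, purely illustrative) compares f(t,0) - f(0,t) with a trapezoid-rule quadrature of the right-hand side.

```python
import numpy as np

# An arbitrary smooth f(s, t) and its partial derivatives f1, f2.
f  = lambda s, t: s**2 * t + np.exp(s) - np.cos(t)
f1 = lambda s, t: 2.0 * s * t + np.exp(s)    # f1 = df/ds
f2 = lambda s, t: s**2 + np.sin(t)           # f2 = df/dt

t = 1.3
s = np.linspace(0.0, t, 20001)
g = f1(s, t - s) - f2(s, t - s)              # integrand of (3.6)
rhs = np.sum((g[:-1] + g[1:]) * 0.5) * (s[1] - s[0])   # trapezoid rule
lhs = f(t, 0.0) - f(0.0, t)
assert abs(lhs - rhs) < 1e-6
```

For this f the closed form of both sides is e^t - 2 + cos(t), so the agreement is not an accident of the quadrature.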

3.1.2. LEMMA

Let {γ(s): s ≥ 0} be a measurable process adapted to a filtration F_t,
M(t) be an F_t-martingale, and {α(s): s ≥ 0} be a measurable process adapted to F_t.
If φ(t) := ∫_0^t γ(s)ds + M(t), and all expressions are well-defined, then

(3.7) E[φ(t+h)·exp(∫_0^{t+h} α(u)du) - φ(t)·exp(∫_0^t α(u)du)]

    = E[∫_t^{t+h} γ(s)ds·exp(∫_0^t α(u)du) + (∫_t^{t+h} α(u)·exp(∫_0^u α(r)dr)du)·φ(t+h)].

Proof: E[φ(t+h)·exp(∫_0^{t+h} α(u)du) - φ(t)·exp(∫_0^t α(u)du)]

    = E[∫_0^{t+h} γ(s)ds·exp(∫_0^{t+h} α(u)du) + M(t+h)·exp(∫_0^{t+h} α(u)du)

      - ∫_0^t γ(s)ds·exp(∫_0^t α(u)du) - M(t)·exp(∫_0^t α(u)du)]

    = E[∫_t^{t+h} γ(s)ds·exp(∫_0^t α(u)du) + (∫_t^{t+h} α(u)·exp(∫_0^u α(r)dr)du)·∫_0^{t+h} γ(s)ds

      + M(t+h)·(∫_t^{t+h} α(u)·exp(∫_0^u α(r)dr)du) + (M(t+h) - M(t))·exp(∫_0^t α(u)du)]

    = E[∫_t^{t+h} γ(s)ds·exp(∫_0^t α(u)du) + (∫_t^{t+h} α(u)·exp(∫_0^u α(r)dr)du)·φ(t+h)]

(the last equality uses E[(M(t+h) - M(t))·exp(∫_0^t α(u)du)] = 0, since
exp(∫_0^t α(u)du) is F_t-measurable), and the proof is complete.

3.1.3. LEMMA

Let

(3.8) φ(s,t) := E[f(X(s∧τ),Y(t∧σ))·exp(∫_0^{s∧τ} α(X(u))du)·exp(∫_0^{t∧σ} β(Y(u))du)].

Then

(3.9) φ(s,t) - φ(0,t)

    = E[∫_0^{s∧τ} {f(X(r∧τ),Y(t∧σ))·α(X(r∧τ)) + g(X(r∧τ),Y(t∧σ))}

      ·exp(∫_0^r α(X(u))du)·exp(∫_0^{t∧σ} β(Y(u))du) dr].

A similar identity holds for (φ(t,s) - φ(t,0)).
Proof. Let h > 0. Using (3.3.a), the independence of X and Y, and Lemma 3.1.2
we obtain:

      φ(s+h,t) - φ(s,t)

    = E[f(X((s+h)∧τ),Y(t∧σ))·exp(∫_0^{(s+h)∧τ} α(X(u))du)·exp(∫_0^{t∧σ} β(Y(u))du)]

      - E[f(X(s∧τ),Y(t∧σ))·exp(∫_0^{s∧τ} α(X(u))du)·exp(∫_0^{t∧σ} β(Y(u))du)]
(3.10)
    = E[f(X((s+h)∧τ),Y(t∧σ))·{∫_{s∧τ}^{(s+h)∧τ} α(X(r))·exp(∫_0^r α(X(u))du)dr}·exp(∫_0^{t∧σ} β(Y(u))du)]

      + E[∫_{s∧τ}^{(s+h)∧τ} g(X(r∧τ),Y(t∧σ))dr·exp(∫_0^{s∧τ} α(X(u))du)·exp(∫_0^{t∧σ} β(Y(u))du)]

    = E[{∫_{s∧τ}^{(s+h)∧τ} f(X(r),Y(t∧σ))·α(X(r))·exp(∫_0^r α(X(u))du)dr}·exp(∫_0^{t∧σ} β(Y(u))du)

      + {∫_{s∧τ}^{(s+h)∧τ} ∫_r^{(s+h)∧τ} g(X(v),Y(t∧σ))dv·α(X(r))·exp(∫_0^r α(X(u))du)dr}·exp(∫_0^{t∧σ} β(Y(u))du)

      + ∫_{s∧τ}^{(s+h)∧τ} g(X(r),Y(t∧σ))·exp(∫_0^r α(X(u))du)dr·exp(∫_0^{t∧σ} β(Y(u))du)

      + ∫_{s∧τ}^{(s+h)∧τ} g(X(r),Y(t∧σ))·{exp(∫_0^{s∧τ} α(X(u))du) - exp(∫_0^{r∧τ} α(X(u))du)}dr·exp(∫_0^{t∧σ} β(Y(u))du)].

But for t, (s+h) ≤ T, the absolute values of the second and fourth terms are
bounded by

(3.11) C·h²·E(Γ_T)·exp(c_T).

Note that

(3.12) φ(s,t) - φ(0,t) = Σ_{j=1}^m (φ(s_j,t) - φ(s_{j-1},t))

where 0 = s_0 < s_1 < ... < s_m = s. If max_j (s_j - s_{j-1}) → 0, then (3.10), (3.11) and
(3.12) yield:

      φ(s,t) - φ(0,t) = E[∫_0^{s∧τ} {f(X(r),Y(t∧σ))·α(X(r)) + g(X(r),Y(t∧σ))}

      ·exp(∫_0^r α(X(u))du)·exp(∫_0^{t∧σ} β(Y(u))du) dr]

and the proof is complete.


Proof of the Theorem. To complete the proof of the theorem note that by Lemma 3.1.1,

      φ(t,0) - φ(0,t) = ∫_0^t (φ_1(s,t-s) - φ_2(s,t-s))ds.

From Lemma 3.1.3

      φ_1(s,t) = E[χ_{s<τ}{f(X(s∧τ),Y(t∧σ))·α(X(s∧τ)) + g(X(s∧τ),Y(t∧σ))}

      ·exp(∫_0^{s∧τ} α(X(u))du)·exp(∫_0^{t∧σ} β(Y(u))du)],

      φ_2(t,s) = E[χ_{s<σ}{f(X(t∧τ),Y(s∧σ))·β(Y(s∧σ)) + h(X(t∧τ),Y(s∧σ))}

      ·exp(∫_0^{t∧τ} α(X(u))du)·exp(∫_0^{s∧σ} β(Y(u))du)].

Therefore,

      φ(t,0) - φ(0,t)

    = E[∫_0^t {χ_{s<τ}{f(X(s∧τ),Y((t-s)∧σ))·α(X(s∧τ)) + g(X(s∧τ),Y((t-s)∧σ))}

      ·exp(∫_0^{s∧τ} α(X(u))du)·exp(∫_0^{(t-s)∧σ} β(Y(u))du)

      - χ_{t-s<σ}{f(X(s∧τ),Y(t-s))·β(Y(t-s)) + h(X(s∧τ),Y(t-s))}

      ·exp(∫_0^{s∧τ} α(X(u))du)·exp(∫_0^{(t-s)∧σ} β(Y(u))du)}ds]

and the proof of the theorem is complete.

3.2. COROLLARY. In addition to the assumptions of Theorem 3.1 assume that:

(3.13) g(x,y) + α(x)f(x,y) = h(x,y) + β(y)f(x,y). If a⁺ := (a∨0), then

      E[f(X(t∧τ),Y(0))·exp(∫_0^{t∧τ} α(X(u))du)] - E[f(X(0),Y(t∧σ))·exp(∫_0^{t∧σ} β(Y(u))du)]

    = E[{f(X((t-σ)⁺∧τ),Y(σ))·exp(∫_0^{(t-σ)⁺∧τ} α(X(u))du) - f(X(0),Y(σ))}·exp(∫_0^σ β(Y(u))du)]

      - E[{f(X(τ),Y((t-τ)⁺∧σ))·exp(∫_0^{(t-τ)⁺∧σ} β(Y(u))du) - f(X(τ),Y(0))}·exp(∫_0^τ α(X(u))du)]

3.3. COROLLARY. Assume that (3.2) and (3.3) are true with τ = σ = ∞ and (3.13).
Then

      E[f(X(t),Y(0))·exp(∫_0^t α(X(u))du)] = E[f(X(0),Y(t))·exp(∫_0^t β(Y(u))du)].
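Condition (3.13) with α = β = 0 is the usual generator identity g(x,y) = h(x,y) behind moment dualities. A standard instance of this type (not one of this paper's worked examples) is the Wright-Fisher diffusion with generator ½x(1-x)d²/dx² on [0,1], which is dual through f(x,n) = x^n to the pure-death chain that jumps n → n-1 at rate ½n(n-1); the symbolic check below confirms that the two generators agree on the duality function.

```python
import sympy as sp

x = sp.symbols('x')
for n in range(2, 8):
    f = x**n                            # duality function f(x, n) = x^n
    # Wright-Fisher generator acting on f in the x-variable:
    g = sp.simplify(sp.Rational(1, 2) * x * (1 - x) * sp.diff(f, x, 2))
    # pure-death-chain generator acting in n (rate n(n-1)/2, jump n -> n-1):
    h = sp.simplify(sp.Rational(n * (n - 1), 2) * (x**(n - 1) - x**n))
    assert sp.simplify(g - h) == 0      # identity (3.13) with alpha = beta = 0
```

Both sides reduce to ½n(n-1)(x^{n-1} - x^n), which is exactly what makes the moment functions closed under both dynamics.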

3.4. COROLLARY. Let E_{1,n} ↑ E_1, E_{2,m} ↑ E_2; τ_n := inf{t: X(t) ∉ E_{1,n}}, and
σ_m := inf{t: Y(t) ∉ E_{2,m}}. Assume that τ_n are F_t-stopping times, σ_m are G_t-
stopping times and that X(τ_n) ∈ E⁺_{1,n} (closed) ⊃ E_{1,n}, Y(σ_m) ∈ E⁺_{2,m} (closed) ⊃ E_{2,m}. Let

      γ(n,m) := sup_{x,y ∈ E⁺_{1,n}} sup_{u,v ∈ E⁺_{2,m}} {|f(x,u)|·exp((α(y) + β(v))t)}.

Assume that

(3.14.a) E[|f(X(t),y)|·exp(∫_0^t α(X(s))ds)] < ∞,  E[|f(x,Y(t))|·exp(∫_0^t β(Y(s))ds)] < ∞,

(3.14.b) for ε > 0, there exist n,m such that (P(τ_n < t) + P(σ_m < t))·γ(n,m) < ε.

Then

      E[f(X(t),y)·exp(∫_0^t α(X(s))ds)] = E[f(x,Y(t))·exp(∫_0^t β(Y(s))ds)].

4. UNIQUENESS RESULTS FOR MARTINGALE PROBLEMS

In this section the duality method is applied to establish the uniqueness of
solutions to a class of martingale problems for measure-valued diffusion processes.
Since the principal applications involve measure-valued processes on Euclidean spa-
ces and since it makes possible the application of Lemma 2.2, it is assumed that
S is a locally compact separable metric space with one point compactification S*.
It should be noted however that results of a similar nature can be obtained when
S is a Polish space.

For λ > 0, E_λ := {μ: μ(S*) ≤ λ}, which is a compact subset of M(S*), and τ_λ
is the F_t-stopping time defined by τ_λ := inf{t: X(t) ∉ E_λ}.

Let L be a differential operator with polynomial coefficients of degree K
as given by (2.13)-(2.16). It is assumed that L satisfies one of the following
three sets of hypotheses.

4.1. HYPOTHESES.
(a) (Conservative property) For every N ≥ 1, 1_{(N)} (the function of N variables
which assumes the constant value 1) ∈ D(S*^N) and LF_{1_{(N)}} = 0.

(b) B_ℓ = 0 if ℓ > 0.

(c) There is a norm |||·|||_N on D(S*^N) and constants c, c_{1,ℓ}, c_{2,ℓ} such that for
f ∈ D(S*^N),

(i) ||f||_N ≤ c·|||f|||_N,

(ii) |||B_ℓ^{(jk)}f/b_ℓ|||_{N(f)+ℓ} ≤ c_{2,ℓ}·|||f|||_{N(f)}, and

     |||A_ℓ^{(j)}f/a_ℓ|||_{N(f)+ℓ} ≤ c_{1,ℓ}·|||f|||_{N(f)}, ℓ = -1,0,1,...,K; j,k = 1,...,N(f),

where c_{2,0} ≤ 1 and c̄ := max_ℓ (c_{2,ℓ}, c_{1,ℓ}) < ∞.

(d) V⁺(n) ≤ c_2·n, where c_2 is a constant.

4.2. HYPOTHESES.
(a) (Conservative property) (As in 4.1.a.)

(b) Condition 4.1.c is satisfied with c_{j,ℓ} ≤ 1, j = 1,2 and ℓ = -1,0,1,...,K.

(c) There is a constant c_1 < ∞ such that V⁺(n) ≤ c_1.

(d) Consider the Markov jump process on the non-negative integers Z⁺ with infini-
tesimal transition rates q_{jk} for a transition j → k ≠ j, where

      q_{j,j+ℓ} = j·a_ℓ + ½j(j-1)·b_ℓ,  j ≥ 1, ℓ = -1,1,2,...,K,

      q_{0,ℓ} ≡ 0, ℓ = 1,...,K,

      q_{jj} = -Σ_{k≠j} q_{jk}.

It is assumed that {q_{jk}} form the infinitesimal transition matrix of a
conservative Markov chain {n(t): t ≥ 0} with E(n(t)) < ∞.
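The jump chain of 4.2(d) can be simulated directly by the Gillespie method. The sketch below is illustrative only: the rate constants a_ℓ, b_ℓ are hypothetical values with ℓ ∈ {-1, 1} and b_ℓ = 0 for ℓ > 0 (as in Hypotheses 4.1(b)), and it generates one trajectory of n(t) with rates q_{j,j+ℓ} = j·a_ℓ + ½j(j-1)·b_ℓ.

```python
import numpy as np

rng = np.random.default_rng(1)
a = {-1: 1.0, 1: 0.3}      # hypothetical a_l
b = {-1: 2.0, 1: 0.0}      # hypothetical b_l; b_l = 0 for l > 0

def simulate(n0, t_max):
    """Gillespie simulation: from state n, jump n -> n + l at rate
    n * a_l + n(n-1)/2 * b_l, holding an exponential time between jumps."""
    t, n, path = 0.0, n0, [n0]
    while t < t_max and n > 0:          # q_{0,l} = 0: state 0 is absorbing
        rates = {l: n * a[l] + 0.5 * n * (n - 1) * b[l] for l in a}
        total = sum(rates.values())
        if total == 0:
            break
        t += rng.exponential(1.0 / total)   # holding time
        u = rng.uniform(0.0, total)         # choose which jump fires
        acc = 0.0
        for l, r in rates.items():
            acc += r
            if u <= acc:
                n += l
                break
        path.append(n)
    return path

path = simulate(10, 5.0)
assert all(n >= 0 for n in path)
assert all(abs(p - q) == 1 for p, q in zip(path, path[1:]))  # only l = +/-1 jumps
```

With these rates the downward pull ½n(n-1)·b_{-1} dominates for large n, which is the mechanism that keeps E(n(t)) finite in the hypothesis.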

4.3. HYPOTHESES.
(a) (Pure death dual) a_ℓ = b_ℓ = 0 if ℓ > 0.

(b) For each N ≥ 1, 1_{(N)} ∈ D(S*^N), G_N 1_{(N)} = 0 and LF_{1_{(N)}} ≤ 0.

(c) Condition 4.1.c (with c_{2,0} ≤ 1) is satisfied.

(d) The moment problem for X(t,S*) is well posed for small t; e.g. 4.1.d is satisfied.

A family of probability measures {P_μ: μ ∈ M(S*)} ⊂ P(Ω_D^{M(S*)}) is a solution
to the initial value martingale problem for {(F_f, LF_f): f ∈ D} if (2.5) is satisfied
and for μ ∈ E_λ, f ∈ D,

(4.1) F_f(X(t)) - ∫_0^t LF_f(X(s))ds is an F_t-martingale with respect to P_μ.

4.4. THEOREM.
Under each of the sets of hypotheses 4.1, 4.2, 4.3, a P[Ω_∞(S*)]-valued solu-
tion {P_μ : μ ∈ M(S*)} of the initial value martingale problem associated with
{(F_f, LF_f) : f ∈ D} is unique, measurable and satisfies the strong Markov property.

Proof. By Lemma 2.2, 𝔐(E_κ) generates L(E_κ) under bounded pointwise conver-
gence. Then Theorem 2.1 implies that a P[Ω_∞^{E_κ}]-valued solution to the initial
value martingale problem is unique, measurable and satisfies the strong Markov pro-
perty provided that for each t >= 0 and f ∈ D there is a function A_{f,t} ∈ L(M(E_κ))
such that for μ ∈ M(E_κ),

(4.2) E_μ(F_f(X(t))) = A_{f,t}(μ) .

In fact, it suffices to establish (4.2) for all 0 <= t <= t_0 for some t_0 > 0 (in-
dependent of μ and f). To verify the latter assertion note that under this con-
dition the argument in the proof of Theorem 2.1 yields the uniqueness and strong
Markov property of P_μ on F_{t_0}. Then in the same way the conditional distribution
P_μ(·|F_{t_0}) = P_{X(t_0)}(·) is uniquely determined on σ{X(s) : t_0 <= s <= 2t_0} and the
proof is completed by induction.


In order to establish (4.2) we first construct a dual process {Y(t) : t >= 0},
adapted to a filtration G_t, which is a right continuous strong Markov process with
state space D and distribution {P_f : f ∈ D}. The evolution of Y(·) can be in-
formally described as follows. If N(Y(t_1)) = N, a set of N independent exponen-
tial clocks with means (1/a_ℓ) and a set of N(N-1) independent exponential clocks
with means (2/b_ℓ) are constructed for each ℓ = -1,0,1,...,K. If τ_1 denotes the
time at which the first of these clocks rings, then

(4.3) Y(t_1+t) = S_t^{N(Y(t_1))} Y(t_1) for 0 <= t < τ_1 , and

Y(t_1+τ_1) = A_ℓ^{(j)} Y((t_1+τ_1)-) or B_ℓ^{(jk)} Y((t_1+τ_1)-), depending on which of

the clocks rings at time τ_1, ℓ = -1,0,1,...,K; j,k = 1,2,...,N(Y(t_1)).

Under Hypotheses 4.1, N(Y(t)) is a Markov chain on Z+ such that the "birth rate",
that is, the rate of transitions from D(S*N) to D(S*(N+ℓ)) with ℓ > 0, is bounded by
a linear function of N(Y(t)). This implies that N(Y(t)) is a conservative Markov
chain. In addition a coupling argument can be used to verify that N_B(Y(t)), the
number of "births" in [0,t], is stochastically dominated by K·Z(t), where Z(t) is
a Yule process, that is, a pure birth process with linear birth rate on Z+. There-
fore
(4.4) P(N_B(Y(t)) = n | N(Y(0)) = k) <= c·n^k·z(t)^n , where z(t) → 0 as t → 0, and
c is a constant.
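The qualitative content of the Yule-process bound (4.4), namely that for small t the number of births is typically very small, can be illustrated numerically. The rate and horizon below are hypothetical choices for illustration only; a simulation sketch, not part of the proof.

```python
import random
import collections

def births_by(t, n0, rate, rng):
    """Births in [0, t] for a pure birth (Yule-type) process whose
    total birth rate is rate * n when the population size is n."""
    n, births, clock = n0, 0, 0.0
    while True:
        clock += rng.expovariate(rate * n)  # waiting time ~ Exp(rate * n)
        if clock > t:
            return births
        n += 1
        births += 1

rng = random.Random(0)
samples = [births_by(0.1, 5, 1.0, rng) for _ in range(2000)]
freq = collections.Counter(samples)
# Mean number of births is roughly n0 * rate * t = 0.5 here, so small
# counts dominate, mirroring the geometric-type tail n^k * z(t)^n.
print(freq[0] > freq.get(5, 0))
```

Shrinking t drives the frequency of any fixed positive birth count toward zero, which is the role of z(t) → 0 in (4.4).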
For K > 0, let σ_K := inf {t : |||Y(t)|||_{N(Y(t))} + N(Y(t)) > K}. Then for κ <= ∞, μ ∈ M(E_κ)
and f ∈ D,
(4.5) F_μ(Y(t∧σ_K)) - ∫_0^{t∧σ_K} L*F_μ(Y(s)) ds
is a bounded G_t-martingale with respect to P_f and conditions (3.3) are satisfied.
Conditions (3.2) are also satisfied for τ = τ_κ, σ = σ_K, so that Theorem 3.1 and
Corollary 3.2 can be applied. If |||Y(0)|||_{N(Y(0))} + N(Y(0)) < ∞, μ ∈ E_κ, this yields:
(4.6) E_μ[F(X(t∧τ_κ), Y(0))] - E_f[F(X(0), Y(t∧σ_K))·exp(∫_0^{t∧σ_K} V(Y(u)) du)]
 = E_f[ χ_{(t-σ_K)+ <= τ_κ < t} ∫_{(t-τ_κ)+}^{t∧σ_K} {L*F(X(s∧τ_κ), Y((t-s)∧σ_K))
 + V(Y((t-s)∧σ_K))·F(X(s∧τ_κ), Y((t-s)∧σ_K))}·exp(∫_0^{(t-s)∧σ_K} V(Y(u)) du) ds ].

Under the conservative hypothesis 4.1.a, E_μ[(X(t∧τ_κ, S*))^n] = E_μ[(X(0,S*))^n] for
all n ∈ Z+ and κ >= X(0,S*). Therefore X(t,S*) = X(0,S*) for all t and τ_κ = ∞.
Under Hypotheses 4.1 it follows from (4.4) that

(4.7) E_f[|F_μ(Y(t))|·exp(∫_0^t V^+(Y(u)) du)] <= Σ_{n=1}^∞ c_1^{n(N(f)+1)}·exp(c_2 n t)·(z(t))^n·c_1^{2n}

 < ∞ for t <= t_0, c_1 a constant,

provided that t_0 is chosen so that exp(c_2 t_0)·z(t_0)·c_1^2 < 1. In addition,

(4.8) lim_{K→∞} E_f[χ_{σ_K <= t}·sup_{s<=t} |L*F_μ(Y(s∧σ_K))|·exp(∫_0^{t∧σ_K} V^+(Y(u)) du)]

 <= lim_{K→∞} Σ_{n=K}^∞ c_1^{n(N(f)+1)}·exp(c_2 n t)·(z(t))^n·c_1^{2n} = 0 .

Under Hypotheses 4.2, results analogous to (4.7) and (4.8) follow immediately (with
t_0 = ∞) since N(Y(t)) is conservative and |||Y(t)||| <= c·|||Y(0)|||. Together,
(4.6), (4.7) and (4.8) imply that for t <= t_0,

(4.9) E_μ[F_f(X(t))] = E_f[F(μ, Y(t))·exp(∫_0^t V(Y(u)) du)] := A_{f,t}(μ).

Since A_{f,t}(·) ∈ L(E_κ), (4.2) has been established under either Hypotheses 4.1
or 4.2.
In the non-conservative case the state space is all of M(S*) and allowance
must be made for the fact that D(M(S*)) is not contained in the space of bounded
continuous functions on M(S*). In this case the uniqueness, measurability and
strong Markov property of a P[Ω_∞(S*)]-valued solution to the initial value martin-
gale problem can be obtained from a modification of Theorem 2.1 provided that:

(4.9) for 0 <= t <= t_0, f ∈ D, there is a function A_{f,t} which is a measurable
function on M(S*), bounded on E_κ for 0 < κ < ∞, and satisfies (4.2), and

(4.10) for 0 <= t <= t_0, the moment problem for X(t,S*) with distribution deter-
mined by {P_μ : μ ∈ M(S*)} has a unique solution.

Under Hypotheses 4.3, σ_K = ∞ if |||Y(0)|||_{N(Y(0))} + N(Y(0)) < K, and conditions (3.2), (3.3)

are satisfied with σ_K = ∞, τ = τ_κ, κ < ∞. Furthermore, by 4.3.b, (X(t,S*))^N is a

submartingale for each N >= 1. Since E_μ(|X(t,S*)|^N) < ∞, lim_{κ→∞} P_μ(τ_κ <= t) = 0. There-
fore,

(4.11) lim_{κ→∞} E_μ(|X(t∧τ_κ, S*)|^N·χ_{τ_κ <= t})

 <= lim_{κ→∞} E_μ(|X(t,S*)|^N·χ_{τ_κ <= t}) = 0 .

Then, for any f ∈ D, lim_{κ→∞} E_μ[F_f(X(t∧τ_κ))] = E_μ[F_f(X(t))]. Then (4.6) and (4.11)
yield:

(4.12) E_μ[F_f(X(t))] = E_f[F(μ, Y(t))·exp(∫_0^t V(Y(u)) du)] := A_{f,t}(μ).
Since A_{f,t}(·) is a measurable function which is bounded on the sets E_κ, it
remains to verify that the moment problem for X(t,S*) has a unique solution. But
condition 4.1.d implies that

(4.13) m_N := E_μ((X(t,S*))^N) <= c_0·exp(c_0 N(t+1)) , where c_0 is a constant,

which in turn yields the Carleman condition

Σ_{j=1}^∞ (m_j)^{-1/(2j)} = ∞ .

This completes the proof under Hypotheses 4.3.
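The step from the moment bound (4.13) to the Carleman condition is routine; the following display is a verification sketch (with δ a constant depending only on c_0 and t, introduced here for the computation):

```latex
% From m_j \le c_0 e^{c_0 j (t+1)}, each Carleman summand is bounded below:
\[
  m_j^{-1/(2j)} \;\ge\; c_0^{-1/(2j)}\, e^{-c_0 (t+1)/2}
  \;\ge\; \min\bigl(1,\, c_0^{-1/2}\bigr)\, e^{-c_0 (t+1)/2} \;=:\; \delta > 0,
  \qquad j \ge 1,
\]
\[
  \text{hence}\qquad
  \sum_{j=1}^{\infty} m_j^{-1/(2j)} \;\ge\; \sum_{j=1}^{\infty} \delta \;=\; \infty,
\]
% which is Carleman's sufficient condition for the moment problem
% to have a unique solution.
```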

4.5. EXAMPLE.
Fleming and Viot introduced a continuous state version of the Ohta-Kimura step-
wise mutation model for describing allelic frequencies within a population undergoing
mutation, random genetic drift and selection. This model is formulated as a probab-
ility-measure-valued diffusion process on R defined by a differential operator
with polynomial coefficients of degree 2. The differential operator is given by

(4.14) LF_f(μ) = F_{G_{N(f)}f}(μ) + Σ_{j=1}^{N(f)} F_{M^{(j)}f}(μ) + Σ_{j≠k} F_{B^{(jk)}f}(μ)

 + (N(f) - ½N(f)(N(f)-1))·F_f(μ) ,

where G_N denotes the Laplace operator in N dimensions, m(·,·) ∈ D(R*2),

M^{(j)}f(x_1,...,x_N) := (m(x_j, x_{N+1}) - m(x_{N+1}, x_{N+2}))·f(x_1,...,x_N) , and

B^{(jk)}f(x_1,...,x_N) := f(x_1,...,x_j,...,x_{k-1},x_j,x_{k+1},...,x_N) if j < k,

 f(x_1,...,x_{k-1},x_j,x_{k+1},...,x_j,...,x_N) if j > k,

and D(R*N) are the functions whose derivatives of order two or less belong to C(R*N).
The differential operator L given by (4.14) satisfies Hypotheses 4.1 and
therefore Theorem 4.4 yields the uniqueness, measurability and strong Markov pro-
perty of a solution to the associated initial value problem (if it exists). (The ex-
istence of a solution can be established by other methods.) The duality method has
also been used to study limit theorems for this model in the selectively neutral
case (Dawson and Hochberg [2]).

4.6. EXAMPLE.
In this example we consider branching diffusion together with turbulent trans-
port in R^3, which is an extension of the model of P.L. Chow [1] for molecular
diffusion with turbulent transport. The differential operator associated with this
model is given by:
(4.15) LF_f(μ) = F_{G_{N(f)}f}(μ) + Σ_{j=1}^{N(f)} Σ_{k=1,k≠j}^{N(f)} (F_{B^{(jk)}f}(μ) - F_f(μ)) + N(f)(N(f)-1)·F_f(μ),

where B^{(jk)} is as in Example 4.5. For α,β = 1,2,3, let ρ_{αβ}(x) be smooth
functions on R^3 with

Σ_{j=1}^N Σ_{k=1}^N Σ_{α,β=1}^3 ρ_{αβ}(x_j - x_k)·λ_{j,α}·λ_{k,β} >= 0

for any set of real numbers {λ_{j,α}}. If x_j ∈ R^3, x_j = (x_{j,1}, x_{j,2}, x_{j,3}), then

G_N f(x_1,...,x_N) := Σ_{j=1}^N Σ_{k=1,k≠j}^N Σ_{α,β=1}^3 ρ_{αβ}(x_j - x_k)·∂²f/∂x_{j,α}∂x_{k,β}(x_1,...,x_N)

 + κ Σ_{j=1}^N Σ_{α=1}^3 ∂²/∂x_{j,α}² f(x_1,...,x_N) .

For sufficiently large κ the operators G_N are uniformly elliptic and generate strong-
ly continuous semigroups on C_0((R^3)^N). Then the differential operator L given by
(4.15) satisfies Hypotheses 4.3 and Theorem 4.4 can be applied to establish the uni-
queness, measurability and strong Markov property of a solution to the associated
initial value martingale problem.

REFERENCES.

1. P.L. Chow, Function-space differential equations associated with a stochastic
partial differential equation, Indiana Univ. Math. J. 25 (1976), 609-627.

2. D.A. Dawson and K.J. Hochberg, Wandering random measures in the Fleming-Viot
model, Annals of Probability, to appear.

3. S.N. Ethier and T.G. Kurtz, Markov Processes: Characterization and Convergence,
to appear.

4. R. Holley and T. Liggett, Ergodic theorems for weakly interacting systems and
the voter model, Annals of Probability 3 (1975), 643-663.

5. R. Holley, D. Stroock and D. Williams, Applications of dual processes to dif-
fusion theory, Proc. Symp. Pure Math. 31 (1977), 23-36, American Mathematical
Society.

6. R. Holley and D. Stroock, Dual processes and their applications to infinite in-
teracting systems, Advances in Mathematics 32 (1979), 149-174.

7. E. Pardoux, Stochastic partial differential equations and filtering of diffusion
processes, Stochastics 3 (1979), 127-167.

8. T. Shiga, An interacting system in population genetics, J. Math. Kyoto Univ. 20
(1980), 213-242.

9. T. Shiga, Diffusion processes in population genetics, J. Math. Kyoto Univ. 21
(1981), 133-151.

10. F. Spitzer, Interaction of Markov processes, Advances in Mathematics 5 (1970),
246-290.

11. D.W. Stroock and S.R.S. Varadhan, Multidimensional Diffusion Processes, Grund-
lehren der mathematischen Wissenschaften 233, Springer-Verlag, Berlin,
Heidelberg, New York, 1979.
OPTIMAL STOPPING OF CONTROLLED MARKOV PROCESSES

Nicole EL KAROUI
with Jean-Pierre LEPELTIER and Bernard MARCHAL

I - INTRODUCTION
We are concerned with optimal stopping of Markov processes,
controlled by change of absolutely continuous probability measures.
We use the results on optimal stopping depending on a parameter
in order to construct by iteration the value function of this optimization
problem.
We then show that the separation principle is satisfied: we can first
choose the optimal stopping rule, and then study a classical problem
of instantaneous control, which we solve under appropriate assumptions.

II - THE MODEL
In free evolution, the state of the system is an E-valued Markov
process X = (Ω, F_t, X_t, P_μ ; μ ∈ M(E)), where E is a Lusin space.
We suppose that X is a "good" Markov process: a Feller process if E is
compact, or more generally a right process, roughly speaking
a Feller process (with branching points) for an adequate topo-
logy on E.
For example, X may be a diffusion process on R^n, a diffusion with jumps,
a killed diffusion, a reflected diffusion on a convex domain in R^n, ...

An admissible control is an adapted process u_t taking values in a
compact set U.
Let 𝒰 be the family of admissible controls, and 𝒯 the family of stopping
times.
(2.1) We suppose that 𝒰 is closed under concatenation:
for every u,v in 𝒰 and every stopping time S,
u∘_S v = u·1_{]0,S]} + v·1_{]S,∞]} belongs to 𝒰 .

(2.2) Under the policy u ∈ 𝒰, the law of X is the probability

P^u = L^u · P

where: - L^u is a uniformly integrable martingale > 0 ;

- the family {L^u ; u ∈ 𝒰} is compatible:

if u, v ∈ 𝒰 and u_t = v_t for t ∈ [S,T[ , S,T ∈ 𝒯, then

L^u_t / L^u_S = L^v_t / L^v_S for t ∈ [S,T[ a.s.

In particular, for t >= S, L^{u∘_S v}_t / L^{u∘_S v}_S does not depend on u a.s.

(2.3) If a is a constant policy, X^a = (Ω, F_t, X_t, P^a_μ)
is a Markov process, Feller or right as X is.
is a M a r k ~ Process, Feller or right as ~- .
Remark: Usually, L^u is the exponential martingale associated with a fami-
ly of stochastic integrals.

The reward process is C^u_t + exp(-H^u_t)·g(X_t) .

Here, C^u_t = ∫_0^t e^{-H^u_s} c(X_s,u_s) ds , and H^u_t = ∫_0^t h(X_s,u_s) ds .

We suppose that c(x,a), h(x,a), and g(x) are bounded functions;
g(x) is the terminal pay-off.
Our control problem is to find an optimal policy û ∈ 𝒰 and
a stopping time T̂ which maximize the reward functional:

J_x(u,T,g) = E^u_x( C^u_T + exp(-H^u_T)·g(X_T) )

We use the "martingale approach", remarkably explained by M.H.A.
DAVIS in [1] and R. ELLIOTT in [2], and the formulation of "Bellman's
principle of optimality" as a supermartingale inequality for the process
C^u_S + exp(-H^u_S)·W(S), where W(S) is the value process:

W(S) = P-esssup_{T>=S, v∈𝒰} E^v[ C^v_T - C^v_S + exp(-(H^v_T - H^v_S))·g(X_T) | F_S ]

More precisely we have:

THEOREM 1: PRINCIPLE OF OPTIMALITY
Let μ be a fixed initial law.
a) J(u,S) = C^u_S + exp(-H^u_S)·W(S) is a P^u-supermartingale;
i.e. for S and T ∈ 𝒯, S <= T : E^{P^u}[J(u,T)|F_S] <= J(u,S) P^u-a.s.

b) A policy (û, T̂) is optimal iff:

- W(T̂) = g(X_{T̂}) P^û a.s.

- C^û_{S∧T̂} + exp(-H^û_{S∧T̂})·W(S∧T̂) is a P^û-martingale.

The proof is similar to the proof of relations 5.3 and 5.11 of [1] because
we assume (2.1) and (2.2).

Our control problem is Markovian. We first show that the value
process is Markovian, i.e. W(S) = Wg(X_S) P^u-a.s.
Here Wg(x) = sup_{T>=0, u∈𝒰} E^u_x[ C^u_T + exp(-H^u_T)·g(X_T) ] is called the
𝒰-value function.
However the family 𝒰 is too complicated to study directly the properties
of Wg(x). Thus we work with the set 𝒰_e of admissible controls which are
step functions along stopping times, i.e.,

𝒰_e = { u ∈ 𝒰 ; u = Σ_{n>=1} u_n·1_{]T_n,T_{n+1}]} }

where (T_n) is an increasing sequence of stopping times, such that sup_n T_n = +∞,
and u_n is an F_{T_n}-random variable.
REMARK: 𝒰_e is the smallest set, closed under concatenation, which
contains the constant controls.

We construct the value function wg(x) on the set 𝒰_e of step controls
by impulse control techniques and verify that the dynamic programming prin-
ciple is satisfied for 𝒰_e. From this we easily deduce the separation
principle for the control problem.
Under appropriate assumptions on the continuity of a → L^a, c(x,a), h(x,a)
and t → g(X_t), we show that the 𝒰_e-value function is equal to Wg(x).

Let u ∈ 𝒰_e be a step control u = Σ_{i>=1} u_i·1_{]T_i,T_{i+1}]} .

Between T_i and T_{i+1}, the evolution of X is governed by the probability
P^{u_i(ω)}, i.e., if A ∈ F_{T_{i+1}}, P^u(A|F_{T_i}) = P^{u_i(·)}(A|F_{T_i}) a.s. by (2.3).
Each u belonging to 𝒰_e can be seen as a policy of impulse control
with impulse times (T_n). We use the methods of impulse control, [3], [4],
but without impulse cost, to solve our control problem.
To obtain wg by approximation, let
𝒰_n = { n-step controls of 𝒰_e, i.e. T_{n+1} = +∞ }
W_n g(x) = sup { J_x(u,T,g); T ∈ 𝒯 and u ∈ 𝒰_n } and

Rg(x) = W_1 g(x) = sup { J_x(a,T,g); T ∈ 𝒯 and a ∈ U }

The functions W_n g and the operator R have the following properties:
PROPOSITION 2:
a) For each n, W_n g = R^n g, where R^n is the n-th power of R.
b) The sequence W_n g increases to wg.
c) For u ∈ 𝒰_e and S ∈ 𝒯, if u^S := u_{t∧S} :
J_μ(u^S,S,wg) = sup { J_μ(v,T,g); v^S = u^S , v ∈ 𝒰_e , T >= S }
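The iteration W_n g = R^n g of Proposition 2 a)-b) can be carried out directly in a finite-state, discrete-time caricature: R applies, for each constant control, value iteration for the optimal stopping problem, and then takes the best control. All data below (transition matrices, rewards, and a discount factor standing in for exp(-h)) are hypothetical; a toy sketch of the approximation scheme, not the construction in the text.

```python
import numpy as np

rng = np.random.default_rng(1)
S, A = 5, 3                                   # toy state and control spaces
P = rng.dirichlet(np.ones(S), size=(A, S))    # P[a, x, y]: transitions under control a
c = rng.uniform(0.0, 0.2, size=(A, S))        # running reward c(x, a)
g = rng.uniform(0.0, 1.0, size=S)             # terminal pay-off g(x)
gamma = 0.9                                   # discount, playing the role of exp(-h)

def R(v):
    """One application of R: optimal stopping with pay-off v under the
    best constant control (discrete-time, finite-state caricature)."""
    best = np.full(S, -np.inf)
    for a in range(A):
        w = v.copy()
        for _ in range(500):  # value iteration for the stopping problem
            w = np.maximum(v, c[a] + gamma * P[a] @ w)
        best = np.maximum(best, w)
    return best

W = g.copy()
values = []
for n in range(1, 6):
    W = R(W)                 # W_n g = R^n g
    values.append(W.copy())

# The sequence W_n g is non-decreasing, as in Proposition 2 b): R is a
# monotone operator and R(v) >= v (stopping immediately is allowed).
for prev, nxt in zip(values, values[1:]):
    assert (nxt >= prev - 1e-9).all()
print("monotone:", True)
```

In this toy setting the increments between successive W_n g shrink quickly, illustrating why the limit wg is a sensible object to identify.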

PROOF: The principal difficulty is to interchange the supremum and

integration signs for the operator R.
Let R^a g(x) = sup { J_x(a,T,g) ; T ∈ 𝒯 } be the value function of the
optimal stopping problem for the parametrized process X^a, and
Rg(x) = M(R^· g)(x), where Mf(x) = sup_a f(x,a).
From the theory of analytic functions we obtain the following result:
LEMMA 3: Let f(x,a) be an analytic function on E×U, U compact.
a) Mf(x) = sup_a f(x,a) is analytic.
b) For each probability measure μ on E,
μ(Mf) = sup_{u(x)} ∫ μ(dx) f(x,u(x)), where u(x) is a Borel
function.
c) Let (Ω, G, Q) be a complete probability space and Z a G-measurable,
E-valued variable:
Mf(Z) = Q-esssup { f(Z,V) ; V G-measurable, U-valued variable }

This lemma applied to the optimal stopping problem leads to the following

result:
(3.1) If μ ∈ M(E), u ∈ 𝒰_e and S is a stopping time, then:
J_μ(u^S,S,Rg) = sup { J_μ(u∘_S δ, T, g); T >= S, δ an F_S-measurable random variable }
where (u∘_S δ)_t = u_t·1_{t<=S} + δ·1_{t>S} .
By iteration, it is clear that:
(3.2) J_μ(u^S,S,R^n g) = sup { J_μ(v,T,g); T >= S and v^S = u^S , v ∈ 𝒰_n(u,S) }
where v ∈ 𝒰_n(u,S) iff v has not more than n impulsions after S.
It remains to identify the limit w̃g of the non-decreasing sequence
W_n g. We write, for v ∈ 𝒰_e,
J_μ(v,T,g) = J_μ(v,T∧T_n,g) + E^v_μ[ 1_{T>T_n}( C^v_T - C^v_{T_n} + e^{-H^v_T} g(X_T) - e^{-H^v_{T_n}} g(X_{T_n}) ) ]
g and c(x,a) are bounded; therefore, by Lebesgue's theorem,
J_μ(v,T,g) - J_μ(v,T∧T_n,g) converges to 0.
On the other hand, the policy v^{T_n} belongs to 𝒰_{n+1}, and
J_μ(v,T∧T_n,g) = J_μ(v^{T_n},T∧T_n,g) <= μ(W_{n+1} g) <= μ(w̃g)
We deduce that sup { J_μ(v,T,g); v ∈ 𝒰_e } = μ(w̃g) and that w̃g = wg,
and also by passage to the limit in (3.2) that c) is true.

Thus the principle of optimality and the principle of dynamic programming

hold:
THEOREM 4: For each initial law μ:

a) C^u_t + exp(-H^u_t)·wg(X_t) is the 𝒰_e value process, and Theorem 1 holds for
𝒰_e.

b) For all stopping times S,

(4.1) wg(x) = sup_{T∈𝒯, u∈𝒰_e} E^u_x[ 1_{T<S}( C^u_T + e^{-H^u_T} g(X_T) ) + 1_{T>=S}( C^u_S + e^{-H^u_S} wg(X_S) ) ]

c) Suppose g >= 0, and let D_λ = inf { t >= 0 ; λ·wg(X_t) <= g(X_t) }, λ ∈ [0,1[.
(4.2) Then wg(x) = sup { J_x(u,D_λ,wg); u ∈ 𝒰_e } :
wg is the value function of the instantaneous control problem with pay-off
wg at time D_λ.
U
PROOF: From Proposition 2 c), it is clear that C^u_t + exp(-H^u_t)·wg(X_t) is the
value process of this optimality problem on 𝒰_e, and that:

sup_{T,u} E^u_x[ 1_{T<S}( C^u_T + e^{-H^u_T} g(X_T) ) + 1_{T>=S}( C^u_S + e^{-H^u_S} wg(X_S) ) ]

 >= wg(x), for we can put v = u and U = T∨S.

On the other hand, the left-hand side is always bounded, since g <= wg, by
sup_{T,u} E^u_x[ C^u_{T∧S} + exp(-H^u_{T∧S})·wg(X_{T∧S}) ] <= wg(x) by Proposition 2 c).
Equality (4.1) follows.
Let S = D_λ ; g(X_T) <= λ·wg(X_T) a.s. on { T < S }, and by (4.1):

wg(x) <= sup_{T,u} E^u_x[ 1_{T<S}( C^u_T + λ e^{-H^u_T} wg(X_T) ) + 1_{T>=S}( C^u_S + e^{-H^u_S} wg(X_S) ) ]

 <= λ sup_{T,u} E^u_x[ C^u_T + e^{-H^u_T} wg(X_T) ] + (1-λ) sup_u E^u_x[ C^u_S + e^{-H^u_S} wg(X_S) ]
 <= λ·wg(x) + (1-λ)·sup_u E^u_x[ C^u_S + e^{-H^u_S} wg(X_S) ]
Then wg(x) <= sup_u E^u_x[ C^u_S + exp(-H^u_S)·wg(X_S) ] .
By Proposition 2 c), the reverse inequality holds and (4.2) is proved.

REMARK: We extend this construction and these results in [5].

III - EXISTENCE OF OPTIMAL CONTROLS

To ensure the existence of an optimal stopping time and an optimal policy

we make the following assumptions:
(H1) g is non-negative and bounded, and t → g(X_t) is right-conti-
nuous and regular P^u a.s., or equivalently
lim_n E^u_μ(g(X_{T_n})) = E^u_μ(g(X_T)) for all u ∈ 𝒰_e,
for an increasing (or decreasing) sequence of stopping times
(T_n) with lim_n T_n = T.
(H2) There exists α > 0 such that h(x,a) >= α > 0.
(H3) The family { L^u_T ; u ∈ 𝒰_e } is uniformly integrable for T ∈ 𝒯.

Under (H1), there exists an ε-optimal stopping time.

More precisely we have:

THEOREM 5: Under the assumption (H1):
a) For all μ ∈ M(E), t → wg(X_t) is right continuous P_μ a.s.
b) The stopping time D_ε = inf { t >= 0 ; wg(X_t) <= g(X_t) + ε }
is ε-optimal.
c) The stopping time D = inf { t >= 0 ; g(X_t) = wg(X_t) } is optimal, and
(5.1) wg(x) = sup_u E^u_x[ C^u_D + e^{-H^u_D} g(X_D) ]
if the assumptions (H1), (H2), (H3) are satisfied.
PROOF:
a) Let w^+g(x) = lim_{t↓0} E_x[ wg(X_t) ]. This limit exists because wg(X·) is
a strong supermartingale.
Under (H1), wg >= w^+g >= g.
Using Theorem 4 a), C^u_t + exp(-H^u_t)·w^+g(X_t) is a right continuous supermartin-
gale, and:
wg(x) = sup { J_x(u,T,g) ; u ∈ 𝒰_e, T >= 0 }
 = sup { J_x(u,T,w^+g); u ∈ 𝒰_e, T >= 0 } = w^+g(x)
b) Using Theorem 4, we prove that, if C̃^u_t = C^u_t + λ‖g‖ ∫_0^t e^{-H^u_s} dH^u_s
and g̃ = g + λ‖g‖, then w̃g̃(x) = wg(x) + λ‖g‖.
Let B_λ be the stopping time inf { t ; λ(wg + ‖g‖)(X_t) <= g(X_t) + ‖g‖ },
D_ε the stopping time inf { t ; wg(X_t) <= g(X_t) + ε }.
Because wg is bounded, D_ε <= B_λ for ε >= (1-λ)( ‖g‖ + ‖wg‖ ), and

(5.2) wg(x) = sup_{u∈𝒰_e} E^u_x[ C^u_{B_λ} + exp(-H^u_{B_λ})·wg(X_{B_λ}) ]

g(X·) and wg(X·) are right continuous. The inequality wg(X_{D_ε}) <= g(X_{D_ε}) + ε
is true, and D_ε is ε-optimal.
c) In order to show (5.1), we are going to prove that J_x(u,D_λ,g) converges
uniformly to J_x(u,D̃,g) as λ ↑ 1, where D̃ = lim_{λ↑1} D_λ <= D.
Under (H2) and (H3),

the right-hand side has 0 as limit.

We take the limit in (5.2):
wg(x) = sup_u J_x(u,D̃,g).
D̃ is optimal, and by Theorem 1, D̃ >= D. Then D = D̃, and (5.1) is proved.

Notice that wg is the value function associated with the instantaneous

control problem with terminal time D.
In order to prove the existence of an optimal policy, we can apply the classi-
cal results ([6] p. 218).
Suppose, for simplicity, that L^u is the exponential continuous martingale
associated with:
N^u_t = ∫_0^t φ(X_s,u_s) dM_s, where M is a continuous martingale

which is an additive functional of the Markov process.

Under (H3), and if a → φ(x,a), c(x,a), and h(x,a) are continuous,
there exists a Borel function u(x) such that u(X_{t-}) is an optimal policy
([6], Theorem 3.59).

REFERENCES

[1] M.H.A. DAVIS: Martingale methods in stochastic control.
Stochastic Control Theory and Stochastic Differential
Systems, Bonn 1979.
Lect. Notes in Control and Inf. Sciences n° 16, Springer.

[2] R. ELLIOTT: The martingale calculus and applications.
Stochastic Control Theory and Stochastic Differential
Systems, Bonn 1979.

[3] M. ROBIN: Contrôle impulsionnel des Processus de Markov.
Thèse de Doctorat, Paris IX, 1978.

[4] J.P. LEPELTIER, B. MARCHAL: Théorie générale du Contrôle impulsionnel.
To appear in Stochastics.

[5] N. EL KAROUI, J.P. LEPELTIER, B. MARCHAL: Construction du semi-groupe de
Nisio associé au contrôle de Processus de Markov.
To appear, 1982.

[6] N. EL KAROUI: Les aspects probabilistes du Contrôle stochastique.
Ecole d'Eté de Probabilités de Saint-Flour 1979.
Lect. Notes in Math. n° 876, Springer-Verlag.

Nicole EL KAROUI
Ecole Normale Supérieure
5, rue Boucicaut
92260 Fontenay-aux-Roses
FRANCE
TWO PARAMETER FILTERING EQUATIONS FOR

JUMP PROCESS SEMIMARTINGALES

By

ATA AL-HUSSAINI, UNIVERSITY OF ALBERTA, CANADA

and

ROBERT J. ELLIOTT, UNIVERSITY OF HULL, ENGLAND

i. INTRODUCTION

A random event which occurs in the positive quadrant is considered. The

signal process is a two parameter semimartingale associated with this event. The

observation process is a second random event in the positive quadrant, and the

signal influences the observation because the distribution function of the observation

is a function of the signal. This description of the filtering problem is based on

the reference probability model of Zakai [7] . Using the calculus of two parameter

semimartingales ([2], [3]), horizontal, vertical and diagonal filtering equations

are obtained. Intuitively the problem can be considered as searching for a random

point in the quadrant using noisy observations.

Analogous formulae, for the filtering of a continuous two parameter semi-

martingale associated with the Brownian sheet, were obtained by Wong [6] and

Korezlioglu, Mazziotto and Szpirglas [4]. Filtering equations using two parameter

Poisson process observations of such a continuous semimartingale were obtained

under certain symmetry hypotheses by Mazziotto and Szpirglas in [5]. In the notation

of section 3 below the symmetry hypotheses of [5] would, for example, include the

assumptions that H = H̄ and that R = 0. Such hypotheses would considerably simplify

our calculations and give rise to results which superficially resemble those of [5] .

For the fundamental situation associated with the two parameter jump process we derive

filtering formulae without such hypotheses. The calculations are explicit and our

results new.

Professor Al-Hussaini wishes to thank the Department of Pure Mathematics

of the University of Hull for its hospitality in September and October 1981 when

work on this paper was commenced.



2. SIGNAL AND OBSERVATION PROCESS

THE SIGNAL PROCESS

The signal process is a semimartingale related to an event that occurs at

a random 'time' (S_1,S_2) in the positive quadrant [0,∞]². We can, therefore, take the

underlying probability space to be Ω_1 = [0,∞]² with a probability measure P_1 .

Consider the processes

p*_i = p*_i(t_i,ω) = I_{t_i >= S_i} , i = 1,2 .

The σ-field F^{i*}_{t_i} on the ith factor of Ω_1 is the completion of σ{I_{s_i >= S_i} : s_i <= t_i} ,

and F*_{t_1 t_2} is the product F^{1*}_{t_1} × F^{2*}_{t_2}. We shall assume the usual condition (F4)

is satisfied, that is, that F*_{t_1} and F*_{t_2} are conditionally independent given F*_{t_1 t_2} .

In [2] it is proved that in the present situation this is equivalent to the components

S_1 and S_2 of the event being independent.

Write F* for F*_{∞∞} and F^{i*}_{u_i} = P_1[S_i > u_i] , i = 1,2 .

Then

p̃*_i = p̃*_i(t_i,ω) = - ∫_{]0, t_i∧S_i]} (F^{i*}_{u_i-})^{-1} dF^{i*}_{u_i}

and q*_i = p*_i - p̃*_i , i = 1,2 .

NOTATION 2.1.

For suitable integrands θ(u_1,u_2) ∈ L^1(Ω_1,F*,P_1) we shall write, for example,

θ·p*_1 p̃*_2 = ∫_0^{t_2} ∫_0^{t_1} θ(u_1,u_2) dp*_{u_1} dp̃*_{u_2} . (See [1] and [3].)

DEFINITION 2.2.

The signal process is then the semimartingale

X_{t_1 t_2} = X_{00} + θ_1·p*_1 p*_2 + θ_2·p̃*_1 p*_2 + θ_3·p*_1 p̃*_2 + θ_4·p̃*_1 p̃*_2

where X_{00} is a constant and θ_i ∈ L^1(Ω_1,F*,P_1) , i = 1,2,3,4 .

REMARKS 2.3.

Here we have assumed that X is constant on the axes. Terms of the form

φ·p*_i and φ'·p̃*_i could be introduced into X. However, such one dimensional stochastic

integrals do not give rise to more interesting results, but only more complex

calculations.

For the same reason we shall suppose that F^{i*}_{u_i} is continuous for i = 1,2.

THE OBSERVATION PROCESS.

The process X cannot be observed directly, but a second random event which

occurs at a 'time' (T_1,T_2) in [0,∞]² can be observed. We shall describe below how the

distribution of (T_1,T_2) is a function of X.

The underlying probability space for (T_1,T_2) can be taken as Ω_2 = [0,∞]²

with a probability measure P_2. We assume that under the original measures T_1 and T_2

are independent and, furthermore, (T_1,T_2) is independent of (S_1,S_2). Therefore, the

signal and observation processes can be considered on the product space Ω = Ω_1 × Ω_2

with the product measure P = P_1 × P_2 .

Write

F^i_{u_i} = P_2[T_i > u_i] , i = 1, 2,

and again assume, to simplify the calculations, that F^i_{u_i} is continuous. On each

factor of Ω_2, F^i_{t_i} is the completion of σ{I_{s_i >= T_i} : s_i <= t_i}. F_{t_1 t_2} is the product

F^1_{t_1} × F^2_{t_2} and F = F_{∞∞}. Associated with (T_1,T_2) we have the processes, (to which

Notation 2.1 will also apply):

p_i = p_i(t_i,ω) = I_{t_i >= T_i} ,

p̃_i = - ∫_{]0, t_i∧T_i]} (F^i_{u_i-})^{-1} dF^i_{u_i}

and q_i = p_i - p̃_i , i = 1,2 .
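For a continuous survival function the compensator p̃_i is explicit; e.g. if T_i is exponential with rate λ (a hypothetical choice, for illustration only), then F^i_u = e^{-λu} and p̃_i(t) = λ(t ∧ T_i), so q_i = p_i - p̃_i has mean zero. A quick Monte Carlo sketch:

```python
import random

lam, t = 1.3, 0.7
rng = random.Random(42)

def q(t, T):
    # p(t) = 1_{t >= T}; for Fbar_u = exp(-lam * u) the compensator is
    # ptilde(t) = -int_{(0, t∧T]} (Fbar_u)^{-1} dFbar_u = lam * min(t, T).
    return (1.0 if t >= T else 0.0) - lam * min(t, T)

n = 200_000
est = sum(q(t, rng.expovariate(lam)) for _ in range(n)) / n
# q = p - ptilde is a mean-zero martingale, so the empirical mean of
# q(t) over independent copies of T should be near 0.
print(abs(est) < 0.01)
```

The same check applies verbatim to the signal-process compensators p̃*_i, with S_i in place of T_i.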

3. THE REFERENCE PROBABILITY MODEL.

Suppose Q_2 is a probability measure equivalent to P_2 on (Ω_2,F). There is

then a Radon-Nikodym derivative

dQ_2/dP_2 = L ∈ L^1(Ω_2, F) .

We can consider the two parameter martingale L_{t_1 t_2} = E[L|F_{t_1 t_2}]. Because Q_2 is

equivalent to P_2, L > 0 a.s. and L_{t_1 t_2} > 0 a.s. for each (t_1,t_2) ∈ [0,∞]². We shall

assume in the situations discussed below that P_2 and Q_2 have the same marginals, so

that

L_{t_1 0} = L_{0 t_2} = L_{00} = 1 for all t_1, t_2 .

Again, all our calculations go through without this hypothesis, but are then even

more complicated. Applying the martingale representation result of [1] to {L_{t_1 t_2} - 1}

we have that there is a function g ∈ L^1(Ω_2,F) such that

L_{t_1 t_2} = 1 + ∫_0^{t_1} ∫_0^{t_2} g(u_1,u_2) dq_{u_2} dq_{u_1}

THE FILTERING MODEL

With this representation result, and the reference probability model for

filtering of Zakai [7] in mind, we suppose there is a function g : R → R such that

for each sample path X of the signal process g(X_{u_1 u_2}) ∈ L^1(Ω_2,F). Furthermore, if

L_{t_1 t_2}(X) = L_{t_1 t_2} = 1 + ∫_0^{t_1} ∫_0^{t_2} g(X_{u_1 u_2}) dq_{u_2} dq_{u_1}

then we assume that L_{t_1 t_2} > 0 a.s. and the distribution of the observation event

(T_1,T_2), given the sample path X, is described by an equivalent probability

measure Q_2^X = Q_2(X), where

dQ_2(X)/dP_2 = L_{∞∞}(X) > 0 a.s.

Note that under Q_2(X) the components T_1 and T_2 are not, in general,

independent.

On the space (Ω, F* × F) the measure P = P_1 × P_2 is replaced by the

equivalent probability measure Q = P_1 × Q_2 .

NOTATION 3.1.

Write

X_{t_1 t_2} = X_{00} + α·p*_1 + β·p̃*_1   (3.1)

where

α = θ_1·p*_2 + θ_3·p̃*_2 and β = θ_2·p*_2 + θ_4·p̃*_2 .

Also

X_{t_1 t_2} = X_{00} + ᾱ·p*_2 + β̄·p̃*_2   (3.2)

where

ᾱ = θ_1·p*_1 + θ_2·p̃*_1 and β̄ = θ_3·p*_1 + θ_4·p̃*_1 .

Define

H(t_1,t_2) = L_{t_1-,t_2}^{-1} ∫_0^{t_2} g(X_{t_1 s_2}) dq_{s_2}

H̄(t_1,t_2) = L_{t_1,t_2-}^{-1} ∫_0^{t_1} g(X_{s_1 t_2}) dq_{s_1}

so that

L_{t_1 t_2}(X) = L_{t_1 t_2} = 1 + ∫_0^{t_1} L_{s_1-,t_2} H(s_1,t_2) dq_{s_1}   (3.3)

 = 1 + ∫_0^{t_2} L_{t_1,s_2-} H̄(t_1,s_2) dq_{s_2}   (3.4)

For any random variables

U, V, W ∈ L^1(Ω, F*_{t_1 t_2} × F_{t_1 t_2})

write

Û = E_Q[U | F_{t_1 t_2}] .

Conditional covariances are denoted by:

K(U,V) = E_Q[(U - Û)(V - V̂) | F_{t_1 t_2}]

and

K(U,V,W) = E_Q[(U - Û)(V - V̂)(W - Ŵ) | F_{t_1 t_2}] .

4. HORIZONTAL AND VERTICAL FILTERS.

Keeping t_2 fixed and using the Ito product rule for one parameter semi-

martingales we have from (3.1) and (3.3):

(LX)(t_1,t_2) = X_{00} + ∫_0^{t_1} L_{s_1-,t_2} ( α(s_1,t_2) dp*_{s_1} + β(s_1,t_2) dp̃*_{s_1} )

 + ∫_0^{t_1} X_{s_1-,t_2} L_{s_1-,t_2} H(s_1,t_2) dq_{s_1} .

Therefore,

(4.1) (L̂X)(t_1,t_2) = X_{00} - ∫_0^{t_1} L̂_{s_1-,t_2} ( α̃(s_1,t_2) + β̃(s_1,t_2) ) dF^{1*}_{s_1}

 + ∫_0^{t_1} L̂_{s_1-,t_2} (XH)~(s_1-,t_2) dq_{s_1}

where

α̃(s_1,t_2) = (L̂_{s_1-,t_2})^{-1} E_P[ α(s_1,t_2) L_{s_1-,t_2} | S_1 = s_1 and F_{t_1 t_2} ]

β̃(s_1,t_2) = (L̂_{s_1-,t_2})^{-1} E_P[ β(s_1,t_2) L_{s_1-,t_2} | S_1 = s_1 and F_{t_1 t_2} ]

and

(XH)~(s_1-,t_2) = (L̂_{s_1-,t_2})^{-1} ( X_{s_1-,t_2} L_{s_1-,t_2} H(s_1,t_2) )^ .

From (3.3) we have that

L̂_{t_1 t_2} = 1 + ∫_0^{t_1} L̂_{s_1-,t_2} H̃(s_1,t_2) dq_{s_1}

where

H̃(s_1,t_2) = (L̂_{s_1-,t_2})^{-1} ( L_{s_1-,t_2} H(s_1,t_2) )^ .

Therefore, keeping t_2 fixed,

(4.2) (L̂_{t_1,t_2})^{-1} = 1 - ∫_0^{t_1} (L̂_{s_1-,t_2})^{-1} H̃(s_1,t_2) dq_{s_1}

 + χ_{t_1 >= T_1} (L̂_{T_1-,t_2})^{-1} H̃(T_1,t_2)² / (1 + H̃(T_1,t_2)) .

Forming the product of (4.1) and (4.2) we have, after some algebra, the following

expression:

X̂_{t_1 t_2} = (L̂X)(t_1,t_2) / L̂_{t_1 t_2}

 = X_{00} - ∫_0^{t_1} ( α̃(s_1,t_2) + β̃(s_1,t_2) ) dF^{1*}_{s_1}

 + ∫_0^{t_1} ( ( (XH)~ - X̂H̃ ) / (1 + H̃) )(s_1-,t_2) ( dq_{s_1} - H̃(s_1-,t_2) dp̃_{s_1} ) .

REMARKS 4.1.
It is proved in [3] that the process

v¹_{t_1 t_2} = q_{t_1} - ∫_0^{t_1} H̃(s_1,t_2) dp̃_{s_1}

is a 1-martingale for the filtration {F_{t_1 t_2}} under the measure Q. Similarly the

process

v²_{t_1 t_2} = q_{t_2} - ∫_0^{t_2} H̄̃(t_1,s_2) dp̃_{s_2}

is a 2-martingale for the filtration {F_{t_1 t_2}} under the measure Q.

DEFINITION 4.2.

{v¹_{t_1 t_2}} is the horizontal innovations process.

{v²_{t_1 t_2}} is the vertical innovations process.

The horizontal filtering equation derived above can then be written in

the following form:

THEOREM 4.3.

Horizontal Filtering Equation:

X̂_{t_1 t_2} = X_{00} - ∫_0^{t_1} ( α̃(s_1,t_2) + β̃(s_1,t_2) ) dF^{1*}_{s_1} + ∫_0^{t_1} ( K(X,H) / (1 + H̃) )(s_1-,t_2) dv¹_{s_1 t_2}

Similarly we can establish the following formula:

THEOREM 4.4.

Vertical Filtering Equation:

X̂_{t_1 t_2} = X_{00} - ∫_0^{t_2} ( ᾱ̃(t_1,s_2) + β̄̃(t_1,s_2) ) dF^{2*}_{s_2} + ∫_0^{t_2} ( K(X,H̄) / (1 + H̄̃) )(t_1,s_2-) dv²_{t_1 s_2} .

Here

ᾱ̃(t_1,s_2) = (L̂_{t_1,s_2-})^{-1} E_P[ ᾱ(t_1,s_2) L_{t_1,s_2-} | S_2 = s_2 and F_{t_1 s_2} ]

β̄̃(t_1,s_2) = (L̂_{t_1,s_2-})^{-1} E_P[ β̄(t_1,s_2) L_{t_1,s_2-} | S_2 = s_2 and F_{t_1 s_2} ]

5. THE DIAGONAL FILTERING EQUATION

NOTATION 5.1

If Y_{t_1 t_2}, (t_1,t_2) ∈ [0,∞]², is a process, write

Δ_1 Y_{t_1 t_2} for the process Y_{t_1,t_2} - Y_{t_1-,t_2} ,

Δ_2 Y_{t_1 t_2} for the process Y_{t_1,t_2} - Y_{t_1,t_2-} ,

and Δ Y_{t_1 t_2} for the process Y_{t_1,t_2} + Y_{t_1-,t_2-} - Y_{t_1-,t_2} - Y_{t_1,t_2-} .

Write:

γ(s_1,s_2) = (L_{s_1-,s_2-})^{-1} g(X_{s_1-,s_2-}) ,

γ̃(s_1,s_2) = (L̂_{s_1-,s_2-})^{-1} ( L_{s_1-,s_2-} γ(s_1,s_2) )^ ,

R(s_1,s_2) = γ(s_1,s_2) - H(s_1,s_2-) H̄(s_1-,s_2) ,

R̃(s_1,s_2) = (L̂_{s_1-,s_2-})^{-1} ( L_{s_1-,s_2-} R(s_1,s_2) )^

 = γ̃(s_1,s_2) - (HH̄)~(s_1,s_2) ,

where

(HH̄)~(s_1,s_2) = (L̂_{s_1-,s_2-})^{-1} ( L_{s_1-,s_2-} H(s_1,s_2-) H̄(s_1-,s_2) )^ .

Define

Γ = K(R,X) + K(X,H,H̄)

and G = (1 + H̃)·K(X,H̄) + (1 + H̄̃)·K(X,H) .

Keeping t_1 fixed, and using the differentiation formula, we have the

following results. (See equation (5.3) of [2].)

LEMMA 5.2.

H̃(t_1,t_2) = (L̂_{t_1-,t_2})^{-1} ( L_{t_1-,t_2} H(t_1,t_2) )^

 = ∫_0^{t_2} R̃(t_1,s_2) dq_{s_2} - ∫_0^{t_2} R̃(t_1,s_2) ( H̄̃(t_1-,s_2) / (1 + H̄̃(t_1-,s_2)) ) dp_{s_2}

 = ( R̃ / (1 + H̄̃) )·p_2 - R̃·p̃_2 .

Similarly

H̄̃(t_1,t_2) = ( R̃ / (1 + H̃) )·p_1 - R̃·p̃_1 .

Using Theorem 3.3 of [3] we have the following two parameter expression

for (L̂)^{-1} :

LEMMA 5.3.

(L̂_{t_1 t_2})^{-1} = 1 + ( H̃ H̄̃ / ((1 + H̃)(1 + H̄̃)) )·p_1 p_2

 + Δ_1( H̃(s_1,s_2) )·p_1 p̃_2 + Δ_2( H̄̃(s_1,s_2) )·p̃_1 p_2

 + (L̂_{s_1-,s_2-})^{-1} ( Δ(L̂)(s_1,s_2)·H̄̃(s_1-,s_2) - Δ_2(L̂ H̃)(s_1,s_2) )·p̃_1 p̃_2 .

REMARKS 5.4.

To determine the two parameter filtering formula for X̂ = (L̂)^{-1}(L̂X) one

might try to discuss this product using the above expression for (L̂)^{-1}. However,

the calculations rapidly become very complex. Instead it is better to substitute

the vertical filtering formula in the horizontal formula.
For

i = 1,2,3,4 write

θ̃_i(s_1,s_2) = (L̂_{s_1-,s_2-})^{-1} E_P[ θ_i(s_1,s_2) L_{s_1-,s_2-} | S_1 = s_1, S_2 = s_2 and F_{s_1-,s_2-} ]

and

Π(s_1,s_2) = (1 + H̃(s_1,s_2-))·(1 + H̄̃(s_1-,s_2)) .

Then the following expressions are obtained:

ᾱ̃(s_1,t_2) = - ∫_0^{t_2} ( θ̃_1 + θ̃_2 )(s_1,s_2) dF^{2*}_{s_2} + ∫_0^{t_2} ( K(ᾱ,H̄) / (1 + H̄̃) )(s_1,s_2-) dv²_{s_1 s_2}

with an analogous formula for β̄̃(s_1,t_2) ,

(XH)~(s_1,t_2) = - ∫_0^{t_2} H̃·( ᾱ̃ + β̄̃ )(s_1,s_2) dF^{2*}_{s_2} + ( (Π + R̃) / ((1 + H̃)²(1 + H̄̃)) )·p̃_2

 + ( ( H̃·K(X,H̄) + X̂·R̃ ) / (1 + H̄̃)² )·v² ,

and

X̂(s_1,t_2) = X_{00} - ∫_0^{t_2} ( ᾱ̃ + β̄̃ )(s_1,s_2) dF^{2*}_{s_2} + ( K(X,H̄) / (1 + H̄̃) )·v² .

Using the differentiation rule and substituting in the formula of Theorem

4.3 we obtain the following expression:

THEOREM 5.5

Diagonal Filtering Equation:

X̂_{t_1 t_2} = X_{00} + ∫_0^{t_1}∫_0^{t_2} ( θ̃_1 + θ̃_2 + θ̃_3 + θ̃_4 ) dF^{2*}_{s_2} dF^{1*}_{s_1}

 - ∫_0^{t_1}∫_0^{t_2} ( K(X,ᾱ + β̄) / (1 + H̃) ) dF^{2*}_{s_2} dv¹_{s_1} - ∫_0^{t_1}∫_0^{t_2} ( K(X,α + β) / (1 + H̄̃) ) dv²_{s_2} dF^{1*}_{s_1}

 + ∫_0^{t_1}∫_0^{t_2} ( K(X,H)·(Γ + G) / ((1 + H̄̃)(Π + R̃)) ) dp̃_2 dv¹_{s_1}

 + ∫_0^{t_1}∫_0^{t_2} ( K(X,H̄)·(Γ + G) / ((1 + H̃)(Π + R̃)) ) dv²_{s_2} dp̃_1

 + ∫_0^{t_1}∫_0^{t_2} ( (Π Γ - R̃ G) / (Π + R̃) ) ( dv² dv¹ - R̃ dp̃_2 dp̃_1 ) .

REMARKS 5.6

From Theorem 4.5 of [3], for θ ∈ L^1(Ω,F),

θ·v²v¹ - (θR̃)·p̃_2 p̃_1 is a weak martingale for the filtration {F_{t_1 t_2}}

under the measure Q. Therefore, the process v²v¹ - R̃·p̃_2 p̃_1 can be thought of

as the diagonal innovations process. Note that when R = 0, R̃ = K(H,H̄),

Π + R̃ = E_Q[(1 + H)(1 + H̄) | F_{t_1 t_2}], and our formulae resemble those of [5].

REFERENCES

1. AL-HUSSAINI, A. and ELLIOTT, R. J. Weak martingales associated with a two parameter jump process. Lecture Notes in Control and Information Sciences 16, 253-263. Springer-Verlag, Berlin, Heidelberg, New York, 1979.

2. AL-HUSSAINI, A. and ELLIOTT, R. J. Martingales, potentials and exponentials associated with a two parameter jump process. Stochastics 5 (1981).

3. AL-HUSSAINI, A. and ELLIOTT, R. J. Stochastic calculus for a two parameter jump process. Lecture Notes in Mathematics 863, 233-244. Springer-Verlag, Berlin, Heidelberg, New York, 1981.

4. KOREZLIOGLU, H., MAZZIOTTO, G. and SZPIRGLAS, J. Nonlinear filtering equations for two parameter semimartingales. Preprint, Centre National d'Études des Télécommunications.

5. MAZZIOTTO, G. and SZPIRGLAS, J. Équations du filtrage pour un processus de Poisson mélangé à deux indices. Stochastics 4 (1980), 89-119.

6. WONG, E. Recursive filtering for two dimensional random fields. IEEE Trans. on Information Theory 21 (1975).

7. ZAKAI, M. On the optimal filtering of diffusion processes. Zeits. für Wahrs. 11 (1969), 230-249.


SPACE-TIME MIXING IN A BRANCHING MODEL

Klaus Fleischmann
Academy of Sciences of the GDR
Institute of Mathematics
DDR-1080 Berlin, Mohrenstrasse 39

Summary
For some stationary spatially homogeneous measure-valued branching processes we have asymptotic independence in the space-time diagram.

1. Introduction

Branching processes mostly describe mathematically the time evolution of finite populations. Under the usual assumptions these processes die out or explode. Thus, processes stationary in time can only be expected if infinite populations are considered. A good way to picture such an infinite population is to think of an unbounded space, say a Euclidean space Rd =: G, in which the population is situated. More precisely, we shall use locally finite measures μ defined on G, i.e. measures μ for which μ(B) is finite for all bounded Borel subsets B of G. In this way μ(B) describes the mass (e.g. number of particles) of the considered population in the region B.

In a natural way the space translations T_x act on the measures μ.

Let (μ_t), -∞ < t < +∞, be a stochastic process with locally finite measures as its states.

Definition 1.1. (μ_t) is called space-time mixing of all orders if for all finite sequences t_0,...,t_k of time points and x_0,...,x_k ∈ G the vector (T_{x_0} μ_{t_0},...,T_{x_k} μ_{t_k}) is asymptotically independent in distribution as [x_i,t_i] - [x_j,t_j] → ∞ for all i ≠ j.

Roughly speaking, space-time mixing means that in the space-time diagram the behaviour in finitely many regions will be asymptotically independent if all distances between the regions tend to infinity.
Our purpose is to formulate the space-time mixing of all orders for a relatively general stationary branching process. Strong formulations of the model, the results and their proofs are contained in [1].

2. Model Assumptions
As a matter of principle we assume that, for all x in G, a unit mass δ_x at position x generates a cluster χ_x, i.e. a random finite measure χ_x defined on G. Moreover we presuppose spatial homogeneity, i.e. χ_x coincides in distribution with T_x χ_0 for all x, where χ_0 =: χ is the cluster generated by a unit mass in the origin of Rd. Furthermore we shall use the basic assumption in branching theory, namely that different masses generate independent progenies. Consequently, in addition, we necessarily have to require that χ is an infinitely divisible random measure, i.e. for all natural numbers k it can be represented as a sum of k independent identically distributed random measures. Finally, we assume that χ is critical, i.e. its expected total mass E χ(G) equals 1.
Using these assumptions, we can define a branching process (μ_t), t ≥ 0, in the following way. At time 0 we start with the Lebesgue measure on G, i.e. μ_0 := ℓ. Then all 'small parts' of the initial population μ_0 are clustered independently as well as spatially homogeneously, and the superposition gives us the first generation μ_1. (A strong formulation of this intuitive explanation can be given in terms of Lévy-Khinchin representations, cf. [5] or [3].) Given μ_1 we continue in an analogous manner, getting μ_2 etc.
The following convergence lemma holds (cf. Hermann [3]).
Lemma 2.1. μ_t converges in distribution to some limit population μ_∞ which is cluster-invariant (in distribution) and infinitely divisible, and its intensity measure E μ_∞ equals ℓ or the zero measure o.
For the interesting problem of separating the extinction case μ_∞ = o we refer to [4] or, in the point process case, to [7], section 12.6. An open problem is the question whether in some particular 'non-stable' cases with μ_∞ = o there exist cluster-invariant populations (which would have infinite asymptotic densities).
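The clustering mechanism above can be sketched in a few lines of code. The concrete cluster law used here — a Poisson(1) offspring number with Gaussian displacements — is only an illustrative choice satisfying criticality; the text allows any critical, infinitely divisible cluster, and the finite window standing in for Lebesgue measure is likewise an assumption of the sketch.

```python
import math
import random

def poisson1():
    """Sample a Poisson(1) offspring number by inversion (critical: mean 1)."""
    L, k, p = math.exp(-1.0), 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

def cluster(x, sigma=1.0):
    """Cluster generated by a unit mass at x: spatial homogeneity means the
    law at x is just the law at the origin translated by x."""
    return [x + random.gauss(0.0, sigma) for _ in range(poisson1())]

def next_generation(points):
    """Different masses generate independent progenies; superpose the clusters."""
    out = []
    for x in points:
        out.extend(cluster(x))
    return out

random.seed(0)
# Approximate Lebesgue measure on [0, 100) by a unit-intensity Poisson field.
pop = [100.0 * random.random() for _ in range(100)]
for _ in range(5):
    pop = next_generation(pop)
# Criticality keeps the expected mass per unit volume equal to 1 in every generation.
```

Averaging the population count per unit length over many runs illustrates the criticality assumption E χ(G) = 1, while individual runs show the local extinction/clumping behaviour that the stationarity discussion is concerned with.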
3. The Result
From now on we restrict our considerations to the 'stable' case E μ_∞ = ℓ.
Let (μ_t), t ∈ Z, be a stationary branching process with clusters χ and μ_t = μ_∞ (in distribution), where Z is the set of all integers. Without any additional assumptions we have
Theorem 3.1. (μ_t) is space-time mixing of all orders. (Here we mention that the theorem remains valid if we only presuppose that G is a locally compact second countable Hausdorff Abelian topological group.)
Analogously to the point process case it can be expected that μ_∞ is always tail trivial, i.e. it has short range correlations (cf. [2]). An open problem is to find criteria for still higher mixing properties in time, e.g. the Bernoulli property (except in the very particular case of stochastic translations by some random position, cf. [9]).

4. Two Abstract Criteria

Starting from the stationary branching process (μ_t), let us consider the random measure λ := Σ_{t∈Z} μ_t × δ_t defined on the group Rd × Z, for which we have the space-time translations T_{x,s} λ (x ∈ Rd, s ∈ Z). λ is infinitely divisible, too. This leads us to the following abstraction.
Let A be a complete separable metric space, M be the set of all locally finite measures μ defined on A, and P be an infinitely divisible distribution defined on M. The Lévy measure (canonical measure) of P is denoted by P̃ (cf. [6] or [3]).
Let G be a locally compact second countable Hausdorff Abelian topological group and g ↦ S_g a homomorphism of G into the group of all one-to-one transformations of A onto itself. The mapping [a,g] ↦ S_g a is assumed to be continuous, and {S_g a : a ∈ B, g ∈ K} should be bounded for all bounded subsets B of A and compact subsets K of G.
The last property ensures that the set-up (T_g μ)(·) := μ(S_g^{-1} ·) transfers the continuous G-flow (S_g) in A into a continuous G-flow (T_g) in M. In this way the group G acts continuously in A and M. Now we assume in addition that P is invariant with respect to (T_g).

P is called mixing of order k > 0 if

P(T_{g_0}^{-1} Y_0 ∩ ... ∩ T_{g_k}^{-1} Y_k) − P(T_{g_0}^{-1} Y_0) ··· P(T_{g_k}^{-1} Y_k)

tends to zero as g_i − g_j → ∞ (i ≠ j), for all g_0,...,g_k ∈ G and measurable subsets Y_0,...,Y_k of M.
In generalization of results of Nawrotzki [8] (and earlier papers on point processes cited there) we have the following two theorems.

Theorem 4.1. P is mixing of order k if and only if

for all bounded Borel subsets B of A and all positive numbers ε.

Let us mention here that the condition in the theorem does not depend on the order k. In other words, for infinitely divisible random measures the properties of mixing and mixing of all orders coincide.
P is called ergodic if all measurable subsets of M which are almost invariant with respect to (T_g) are trivial with respect to P.
Let (γ_n) be a sequence of distributions defined on G such that the variational distance ‖ω_1 ∗ γ_n − ω_2 ∗ γ_n‖ tends to zero as n → ∞ for all distributions ω_1, ω_2 defined on G which are absolutely continuous with respect to a Haar measure on G. Such a (γ_n) is called weakly asymptotically uniformly distributed.

Theorem 4.2. P is ergodic if and only if

for all bounded measurable B ⊂ A and ε > 0.

5. To the Proofs
Let F be the set of all bounded non-negative functions f on A with bounded support. The Laplace functional L of P is defined by

L(f) := ∫ exp{−μ(f)} P(dμ),  f ∈ F.


Theorem 4.1 is based on some elementary inequalities:

Lemma 5.1. Let f_0,...,f_k ∈ F and t ≥ 0. Then

+ ,o ,t * o
c,,,4,.. >, )

where the constants only depend on k. On the other hand,

where the positive constant does not depend on g.


For Theorem 4.2 an abstract statistical ergodic theorem is additionally used.
In order to prove the space-time mixing of (μ_t), first of all time ergodicity is shown. Then we derive space ergodicity of μ_∞. Space-time mixing then follows by using Theorems 4.1 and 4.2, several tools from random measure theory, and properties of sequences of convolution powers of distributions.

References
[1] K. Fleischmann: Mixing properties of infinitely divisible random measures. Carleton Mathematical Lecture Note, Ottawa (in preparation)
[2] K. Fleischmann, K. Hermann and K. Matthes: Kritische Verzweigungsprozesse mit allgemeinem Phasenraum, VIII. Math. Nachr. (submitted)
[3] K. Hermann: Critical measure-valued branching processes in discrete time with arbitrary phase space, I. Math. Nachr. 103, 63-107 (1981)
[4] K. Hermann: Kritische maßwertige Verzweigungsprozesse in diskreter Zeit. Dissertation, Akad. Wiss. DDR, 1981
[5] M. Jiřina: Branching processes with measure-valued states. Trans. 3rd Prague Conf. Inf. Theory Stat. Dec. Functions Random Proc., 333-357 (1964)
[6] O. Kallenberg: Random Measures. Akademie-Verlag, Berlin 1975; Academic Press, London 1976
[7] K. Matthes, J. Kerstan and J. Mecke: Infinitely Divisible Point Processes. Wiley & Sons, Chichester 1978
[8] K. Nawrotzki: Mischungseigenschaften stationärer unbegrenzt teilbarer zufälliger Maße. Math. Nachr. 38, 97-114 (1968)
[9] T. Shiga and Y. Takahashi: Ergodic properties of the equilibrium process associated with infinitely many Markovian particles. Publ. Res. Inst. Math. Sci., Kyoto Univ. 9, 505-516 (1974)
LOGARITHMIC TRANSFORMATIONS AND STOCHASTIC CONTROL

Wendell H. Fleming

Lefschetz Center for Dynamical Systems
Division of Applied Mathematics
Brown University
Providence, Rhode Island 02912

1. Introduction. We are concerned with a class of problems described in a somewhat imprecise way as follows. Consider a linear operator of the form L + V(x), where L is the generator of a Markov process x_t and the "potential" V(x) is some real-valued function on the state space Σ of x_t. We are interested in probabilistic representations for solutions φ(s,x) to the backward equation

(1.1)  ∂φ/∂s + Lφ + V(x)φ = 0,  s < T,

with data φ(T,x) = Φ(x) at a final time T. It is well known that, under suitable assumptions,

(1.2)  φ(s,x) = E_sx{Φ(x_T) exp(∫_s^T V(x_t)dt)}

gives such a representation. For instance, if x_t = x_s + w_t − w_s, with w_t a Brownian motion, then (1.2) is just the Feynman-Kac formula. We seek a different kind of probabilistic representation for I = −log φ, if φ(s,x) is a positive solution to (1.1). In this representation the generator L is replaced by another generator L^u of a Markov process ξ_t (possibly time inhomogeneous). The operator L^u is chosen to solve an optimal stochastic control problem of the following kind.

The logarithmic transformation I = −log φ changes (1.1) into the nonlinear equation

(1.3)  ∂I/∂s + H(I) − V(x) = 0,  where

(1.4)  H(I) = −e^I L(e^{−I}).

The function H is concave. For a fairly wide class of Markov processes, we wish to write (1.3) as the dynamic programming equation associated with a suitable optimal stochastic control problem for Markov processes. The stochastic control problem is specified by giving: (a) a suitable control space U; (b) for each constant control u ∈ U, the generator L^u of a Markov process; and (c) a cost function k(x,u) associated with constant control u and state x. See [6, Chap. VI]. It is required that

(1.5)  H(I)(x) = min_{u∈U} [L^u I(x) + k(x,u)],  x ∈ Σ.

This research was supported in part by the National Science Foundation under contract MCS 79-03554 and in part by the Air Force Office of Scientific Research under contract AF-AFOSR 81-0116.
Then (1.3) becomes a dynamic programming equation:

(1.6)  ∂I/∂s + min_{u∈U} [L^u I + k(x,u) − V(x)] = 0.

Time and state dependent controls u(s,x), in feedback form, with values in the control space U are allowed. The stochastic control problem is to find a feedback u minimizing

(1.7)  J(s,x;u) = E_sx{∫_s^T [k(ξ_t,u_t) − V(ξ_t)]dt + Ψ(ξ_T)},

where ξ_t is the (controlled) Markov process with generator L^u, ξ_s = x, u_t = u(t,ξ_t), and Ψ = −log Φ.

The Verification Theorem of optimal stochastic control theory [6, p.159] asserts that if I is a "well behaved" solution to (1.3) with I(T,x) = Ψ(x) and if certain other technical conditions hold, then

(1.8)  I(s,x) = min_u J(s,x;u).

Moreover, an optimal feedback control u*(s,x) is found by minimizing L^u I(s,x) + k(x,u) over the control space U.
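The Brownian instance of (1.2) is easy to probe by straightforward Monte Carlo before any control enters the picture; the potential V and terminal data Φ below are illustrative choices, not taken from the text.

```python
import math
import random

def feynman_kac(s, x, T, V, Phi, n_paths=2000, dt=0.01):
    """Monte Carlo evaluation of (1.2) for x_t = x + (w_t - w_s), w a Brownian
    motion: phi(s,x) = E[ Phi(x_T) * exp( int_s^T V(x_t) dt ) ]."""
    n_steps = int(round((T - s) / dt))
    total = 0.0
    for _ in range(n_paths):
        xt, integral = x, 0.0
        for _ in range(n_steps):
            integral += V(xt) * dt          # left-endpoint Riemann sum
            xt += random.gauss(0.0, math.sqrt(dt))
        total += Phi(xt) * math.exp(integral)
    return total / n_paths

random.seed(1)
V = lambda x: -0.5 * x * x      # an illustrative potential
Phi = lambda x: 1.0             # terminal data
phi = feynman_kac(0.0, 0.0, 1.0, V, Phi)
I = -math.log(phi)              # the quantity represented by the control problem in (1.8)
```

With V ≡ 0 and Φ ≡ 1 the estimator returns exactly 1, which is a convenient sanity check; with V ≤ 0 the estimate stays in (0,1), so I = −log φ is positive.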

In this paper we take Σ = R^n, n-dimensional Euclidean space. In §2 we review the case when x_t is a diffusion process on R^n. For nondegenerate diffusions, an appropriate stochastic control problem is immediately suggested by the form of equation (1.3). In §3 we consider jump Markov processes x_t and associated stochastic control problems. The choice of an appropriate control problem is less immediate for jump processes than for diffusions. In his Ph.D. thesis, S.-J. Sheu [11] uses a different control formulation, valid for a wide class of generators L (§4). The optimal control in his sense leads to the change of probability measures described in (4.5). In §5 we give a formal derivation indicating why stochastic control methods can be used to obtain asymptotic estimates for exit probabilities for a family x^ε_t of nearly deterministic jump processes. The results are not new (see [1][12]); the interest is in the stochastic control method. Rigorous proofs are given in [11] using such methods.

In §6 we consider briefly the Donsker-Varadhan formula for the dominant eigenvalue λ_1 of L + V, from a control viewpoint. For nondegenerate diffusions the stochastic control representation obtained for λ_1 is the same as Holland's [9].

2. Diffusion processes. Let x_t be a diffusion in R^n, with generator

(2.1)  Lf = (1/2) tr a(x)f_xx + b(x)·f_x,

tr a(x)f_xx = Σ_{i,j=1}^n a_ij(x) ∂²f/∂x_i∂x_j,

and with f_x the gradient. In this case,

(2.2)  H(I) = (1/2) tr a(x)I_xx + b(x)·I_x − (1/2) I_x·a(x)I_x.

We may take U = R^n, u = (u_1,...,u_n),

(2.3)  L^u I = (1/2) tr a(x)I_xx + u·I_x,

(2.4)  k(x,u) = (1/2)(b(x)−u)·a^{−1}(x)(b(x)−u).

For a feedback control u, the drift coefficient b(x) in (2.1) is changed to the drift coefficient u(s,x) in the operator L^u.
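In one dimension the minimization behind (2.3)-(2.4) is just a quadratic in u, which can be checked numerically; b, a, and p = I_x below are arbitrary sample values.

```python
# First-order terms of (2.2)-(2.4) in one dimension: minimizing
#   u*p + (b - u)^2 / (2a)   over u
# should recover b*p - (1/2) a p^2; the common term (1/2) a I_xx cancels.

def k(b, a, u):                 # running cost (2.4), scalar case
    return (b - u) ** 2 / (2.0 * a)

def h_first_order(b, a, p):     # b*I_x - (1/2) I_x * a * I_x from (2.2)
    return b * p - 0.5 * a * p * p

b, a, p = 1.3, 0.7, -0.4        # arbitrary sample values
grid = [-5.0 + 1e-4 * i for i in range(100000)]
numeric_min = min(u * p + k(b, a, u) for u in grid)
exact = h_first_order(b, a, p)
# the minimizer is u = b - a*p: the optimal feedback shifts the drift by -a(x) I_x
```

The closed-form minimizer u = b − a p is exactly the controlled drift produced by the logarithmic transformation in the nondegenerate diffusion case.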

The stochastic control representation (1.8) was used in [3] to give a stochastic control proof of results of Ventsel-Freidlin type for some large deviations problems for nearly deterministic diffusions. In those results a(x) is replaced by εa(x), with ε small. In [4] the logarithmic transformation was used to obtain stochastic representations for positive solutions to the heat equation with a potential term, and to obtain the "classical mechanical limit." In [5][10] the same logarithmic transformation was applied to solutions of the pathwise equation of nonlinear filtering. Large deviations results for the nonlinear filter problem are obtained by Hijab [8] elsewhere in this volume.

In [7] Hernandez-Lerma obtained similar results for certain degenerate diffusions, for which the matrix (a_ij(x)), i,j = 1,...,m ≤ n, is positive definite and a_ij(x) = 0 if i > m or j > m.

3. Jump processes. To motivate our choice of stochastic control problem, let us begin with a simple special case in which the process x_t jumps only by a fixed increment y (as, for example, for a Poisson process). In this case the generator L takes the form

Lf(x) = a(x)[f(x+y) − f(x)].

From (1.4),

H(I)(x) = a(x){1 − exp[I(x) − I(x+y)]}.

The dual function to the convex function e^r is u − u log u (u > 0):

(3.1)  e^r = max_{u>0} [u − u log u + ur].

The max occurs when log u = r. Let

(3.2)  L^u I(x) = u a(x)[I(x+y) − I(x)],  u > 0,

(3.3)  k(x,u) = a(x)(u log u − u + 1).

By taking r = I(x) − I(x+y) in (3.1) and changing signs (to replace max by min), we get the required form (1.5) for H(I). In this special case the control u is scalar, with u > 0. A constant control u changes the jumping rate from a(x) to ua(x). A feedback control u(s,x) changes the rate at time s and state x from a(x) to u(s,x)a(x). If I(s,x) = −log φ(s,x) as in §1, then the optimal feedback control is u*(s,x) = φ(s,x)^{−1} φ(s,x+y).
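The duality (3.1) is elementary to verify numerically:

```python
import math

def dual_value(u, r):
    """The bracket in (3.1): u - u log u + u r, for u > 0."""
    return u - u * math.log(u) + u * r

for r in (-1.0, 0.0, 0.7, 2.0):
    u_star = math.exp(r)                       # the maximizer: log u = r
    assert abs(dual_value(u_star, r) - math.exp(r)) < 1e-9
    for u in (0.25, 0.5 * u_star, 1.5 * u_star, 4.0):
        assert dual_value(u, r) <= math.exp(r) + 1e-9
```

Changing signs turns the maximum into the minimum used in (1.5): a controlled rate u a(x) pays a(x)(u log u − u + 1) per unit time.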

Let us now consider a jump process x_t with generator of the form

(3.4)  Lf(x) = a(x) ∫_{R^n} [f(x+y) − f(x)] π(x,dy).

Here f ∈ B(R^n), the space of bounded Borel measurable functions on R^n. We assume that a ∈ B(R^n) and that π(x,·) is a probability measure with π(·,A) Borel measurable for each Borel set A and π(x,{0}) = 0. Additional conditions on a and π need to be imposed later. Motivated by the special case above, we control the jumping distribution, replacing a(x)π(x,dy) by a(x)u(s,x;y)π(x,dy). To formalize this idea, we introduce the control space

(3.5)  U = {u(·): u, u^{−1} ∈ B(R^n), u(y) > 0 for all y ∈ R^n}.

Suitable L^{u(·)} and k(x,u(·)) are obtained by integrating (3.2), (3.3) with respect to π(x,dy):

(3.6)  L^{u(·)} I(x) = a(x) ∫_{R^n} [I(x+y) − I(x)] u(y) π(x,dy),

(3.7)  k(x,u(·)) = a(x) ∫_{R^n} [u(y) log u(y) − u(y) + 1] π(x,dy).

We get, as in equation (1.5),

(3.8)  H(I)(x) = min_{u(·)∈U} [L^{u(·)} I(x) + k(x,u(·))].

If φ(s,x) is a positive solution to (1.1) and I = −log φ, then the optimal feedback control is

(3.9)  u*(s,x;y) = φ(s,x+y)/φ(s,x).

As outlined in the next section, it is sometimes more convenient to consider instead a related control problem. In particular, the formulation in §4 is the one used in [11] to give control method proofs of the results on the exit problem mentioned in §5.

4. The Sheu formulation. In [11] another kind of control problem is considered. Let L be a bounded linear operator on C(Σ), the space of continuous bounded functions on Σ, such that L obeys a positive maximum principle. (In particular, L may be of the form (3.4) above.) For w = w(·) a positive function with w, w^{−1} ∈ C(Σ), define the operator L^w by

(4.1)  L^w f = w^{−1}[L(wf) − f Lw].

In addition, define K^w(x) by

(4.2)  K^w = L^w(log w) − w^{−1} L(w).

For unbounded L, additional restrictions on w are needed in order that L^w and K^w be well defined.

From the duality (3.1) between e^r and u log u − u, it is not difficult to show [11] that for I ∈ C(Σ)

(4.3)  H(I) = min_w [L^w I + K^w].

The minimum is attained for w = exp(−I). For L the generator of a jump process, the two formulations are related by L^w = L^u, where u is the (stationary) feedback control defined by

(4.4)  u(x;y) = w(x+y)/w(x).

Moreover, K^w(x) = k(x,u(x;·)).
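On a finite state space, where L reduces to a rate matrix Q with rows summing to zero, (4.1)-(4.3) can be checked directly. The matrix Q and function I below are arbitrary sample data, and the sketch takes K^w = L^w(log w) − w^{−1}Lw, matching (4.2).

```python
import math
import random

def apply_L(Q, f):
    """(Lf)(i) = sum_j Q[i][j] f(j); rows of Q sum to 0."""
    n = len(f)
    return [sum(Q[i][j] * f[j] for j in range(n)) for i in range(n)]

def L_w(Q, w, f):
    """(4.1): L^w f = w^{-1} [ L(wf) - f Lw ]."""
    n = len(f)
    Lwf = apply_L(Q, [w[i] * f[i] for i in range(n)])
    Lw = apply_L(Q, w)
    return [(Lwf[i] - f[i] * Lw[i]) / w[i] for i in range(n)]

def K_w(Q, w):
    """(4.2): K^w = L^w(log w) - w^{-1} L w."""
    n = len(w)
    Lw = apply_L(Q, w)
    t = L_w(Q, w, [math.log(x) for x in w])
    return [t[i] - Lw[i] / w[i] for i in range(n)]

Q = [[-2.0, 1.5, 0.5], [0.3, -1.0, 0.7], [1.0, 1.0, -2.0]]   # sample generator
I = [0.2, -0.4, 1.1]                                          # sample function
expI = [math.exp(-x) for x in I]
H = [-math.exp(I[i]) * v for i, v in enumerate(apply_L(Q, expI))]   # H(I) = -e^I L e^{-I}

# (4.3): the minimum of L^w I + K^w over positive w equals H(I), at w = exp(-I) ...
w_star = expI
at_min = [a + b for a, b in zip(L_w(Q, w_star, I), K_w(Q, w_star))]
assert all(abs(at_min[i] - H[i]) < 1e-10 for i in range(3))

# ... and other positive w never go below H(I), componentwise
random.seed(2)
for _ in range(50):
    w = [math.exp(random.uniform(-1.0, 1.0)) for _ in range(3)]
    val = [a + b for a, b in zip(L_w(Q, w, I), K_w(Q, w))]
    assert all(val[i] >= H[i] - 1e-10 for i in range(3))
```

For this jump-type L, expanding K^w termwise reproduces the running cost a(x)∫(u log u − u + 1)π(x,dy) of §3 with u(x;y) = w(x+y)/w(x), which is how (4.4) ties the two formulations together.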

In Sheu's formulation, the control problem is to choose w_t(·) for s ≤ t ≤ T to minimize

J(s,x;w) = E_sx{∫_s^T [K^{w_t}(ξ_t) − V(ξ_t)]dt + Ψ(ξ_T)},

where ξ_t is a Markov process with generator L^{w_t} and with ξ_s = x. Here we assume that L is the generator of a Markov process x_t, which implies in particular L1 = 0.

Suppose that φ is a positive solution to (1.1), with φ(s,·), φ(s,·)^{−1} ∈ C(Σ) and with V ∈ C(Σ). We can use (4.3) together with the Verification Theorem in stochastic control to conclude that I(s,x) ≤ J(s,x;w), with equality when w*_t = φ(t,·). Thus the control w*_t = φ(t,·) is optimal in this sense. For jump processes this agrees with (3.9), according to (4.4).

The change of generator from L to L^φ corresponds to a change of probability measure, from P to P̃, as follows:

(4.5)  Ẽ_sx f(x_t) = E_sx[f(x_t)Φ(x_T)] / E_sx[Φ(x_T)],  s ≤ t ≤ T,  f ∈ C(Σ).

This is seen from the following argument. The denominator of the right side is φ(s,x). Let

ψ(s,x) = E_sx[f(x_t)Φ(x_T)] = E_sx[f(x_t)φ(t,x_t)].

Since ψ and φ both satisfy (1.1) with V = 0, the quotient v = ψφ^{−1} satisfies

φ ∂v/∂s = −[L(vφ) − vLφ],

(4.6)  ∂v/∂s + L^φ v = 0,  s < t,

with v(t,x) = f(x) as required.

The author wishes to thank M. Day for a helpful suggestion related to (4.5).

5. Asymptotic estimates for exit probabilities.

Let x^ε_t be a family of Markov processes, s ≤ t ≤ T, depending on a small parameter ε > 0, such that x^ε_t tends (in a suitable sense) to a deterministic limit x^0_t as ε → 0. Let φ^ε denote the probability that x^ε belongs to a set Γ of trajectories which does not include trajectories "near" x^0. Typically φ^ε is exponentially small. Its asymptotic rate of decay to 0 can be found from the theory of large deviations [1][12][13]. In the exponent a constant I^0 appears, which is the minimum of a certain action functional over a set of smooth paths.

In many instances these asymptotic estimates can also be obtained by introducing a stochastic control problem of the kind indicated in previous sections, for each ε > 0 [3][11]. With this method a (stochastic) optimization problem appears for each ε > 0, not just in the limit as ε → 0.
Let us consider the special case when φ^ε is an exit probability:

φ^ε(s,x) = P_sx(τ^ε ≤ T),

where τ^ε is the exit time of x^ε_t from a bounded, open set D ⊂ R^n, and where x^0_t ∈ D for s ≤ t ≤ T. We consider nearly deterministic jump processes, as follows. Nearly deterministic diffusions were considered in [3][7]. Following Ventsel [12], let us rescale the jump process in §3, replacing y by εy and a(x) by ε^{−1}a(x) to obtain the generator for x^ε_t:

(5.1)  L^ε f(x) = ε^{−1} a(x) ∫_{R^n} [f(x+εy) − f(x)] π(x,dy).

Fix x^ε_s = x. For s ≤ t ≤ T, the path x^ε tends in probability as ε → 0 (in the D-metric) to x^0, where x^0_t satisfies

(5.2)  dx^0_t/dt = a(x^0_t) ∫_{R^n} y π(x^0_t,dy),

with x^0_s = x. The exit probability φ^ε(s,x) is a positive solution to

(5.3)  ∂φ^ε/∂s + L^ε φ^ε = 0

in (−∞,T) × D. The logarithmic transformation I^ε = −ε log φ^ε changes (5.3) into

(5.4)  ∂I^ε/∂s + ε H^ε(ε^{−1} I^ε) = 0,

where H^ε(I) = −e^I L^ε(e^{−I}). Then

(5.5)  ε H^ε(ε^{−1} I)(x) = a(x) ∫_{R^n} (1 − exp{ε^{−1}[I(x) − I(x+εy)]}) π(x,dy).

For I(x) such that I, I_x are continuous and bounded,

lim_{ε→0} ε H^ε(ε^{−1} I) = H^0(x,I_x),

with I_x the gradient and

(5.6)  H^0(x,p) = a(x) ∫_{R^n} (1 − e^{−p·y}) π(x,dy).

This suggests (but certainly does not prove) that I^ε tends to a limit I^0 as ε → 0, where I^0 satisfies (perhaps in some generalized sense)
(5.7)  ∂I^0/∂s + H^0(x,I^0_x) = 0.

Now (5.7) is the dynamic programming equation for the deterministic control problem with control space U as in §3, with running cost k(ξ_t,u_t(·)), and with dynamics

(5.8)  dξ_t/dt = b(ξ_t,u_t(·)),  b(x,u(·)) = a(x) ∫_{R^n} y u(y) π(x,dy).
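The convergence of x^ε to the solution of (5.2) can be illustrated by simulation; the rate a(x), the jump law π, and the tolerances below are all illustrative assumptions of the sketch.

```python
import random

def jump_path_end(eps, T, a, ys, x0=0.0):
    """One path of the rescaled process (5.1): jumps of size eps*y at rate
    a(x)/eps, with y drawn uniformly from the finite list ys (a simple pi)."""
    t, x = 0.0, x0
    while True:
        t += random.expovariate(a(x) / eps)
        if t > T:
            return x
        x += eps * random.choice(ys)

random.seed(3)
a = lambda x: 1.0 + 0.2 * x * x      # positive, smooth rate (illustrative)
ys = [-1.0, 2.0]                     # mean displacement E[y] = 0.5
eps, T = 0.01, 1.0
mean_end = sum(jump_path_end(eps, T, a, ys) for _ in range(300)) / 300.0

# deterministic limit (5.2): dx/dt = a(x) E[y], integrated by Euler's method
x, dt = 0.0, 1e-3
for _ in range(int(T / dt)):
    x += a(x) * 0.5 * dt
# mean_end should be close to x, and the paths concentrate around x as eps -> 0
```

Tilting the jump distribution by a control u(y), as in (5.8), would steer the limiting path instead, at the entropy cost k of §3; that tilt is what produces the action functional in (5.9)-(5.10).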

Sheu [11] proved that indeed I^ε → I^0 as ε → 0 under the following hypotheses:

(i) a(·) is bounded, positive, and Lipschitz;

(ii) π(x,dy) = g(x,y)π_1(dy) with π_1 a probability measure, π_1({0}) = 0, g(·,y) uniformly Lipschitz, and 0 < c_1 ≤ g(x,y) ≤ c_2;

(iii) ∫_{R^n} exp(α|y|²) π_1(dy) < ∞ for some α > 0;

(iv) the convex hull of the support of π_1 contains a neighborhood of 0.

Condition (iv) ensures that H^0(x,p) is the dual of the usual "action integrand" A(ξ,ξ̇) in large deviation theory, where for ξ, ξ̇ ∈ R^n

(5.9)  A(ξ,ξ̇) = min_{u(·)} {k(ξ,u(·)): ξ̇ = b(ξ,u(·))}.

Then

(5.10)  I^0(s,x) = min ∫_s^θ A(ξ_t,ξ̇_t)dt,  x ∈ D.

The minimum is taken among C^1 paths ξ with ξ_s = x such that ξ_t first reaches ∂D at time θ ≤ T. The requirement in (5.10) that ξ_t exit from D by time T is suggested by the boundary condition I^ε(T,x) = +∞ for x ∈ D. This corresponds in the limit as ε → 0 to an infinite penalty for failure to reach ∂D by time T.

In both [3] and [11] the stochastic control method used to show that I^ε → I^0 depends on comparison arguments involving an optimal stochastic control process when ε > 0 and an optimal ξ^0 in (5.10) when ε = 0.

6. The dominant eigenvalue. In [2] Donsker and Varadhan gave a variational formula [(6.4) below] for the dominant eigenvalue λ_1 of L + V. Another derivation of this formula is given in [11], using the family of operators L^w mentioned in §4.

When L is the generator of a nondegenerate diffusion process, Holland [9] expressed λ_1 as the minimum average cost per unit time in a stochastic control problem. Let us first indicate formally how this idea extends to more general generators L. Then we impose strong restrictions on L, and give a short derivation of (6.4).

Assume that L + V has a positive eigenfunction φ_1 corresponding to λ_1: (L+V)φ_1 = λ_1 φ_1. Let I_1 = −log φ_1. Then

(6.1)  −H(I_1) + V = λ_1.

Assuming that there is a stochastic control representation (1.5) for H(I), equation (6.1) becomes

(6.2)  min_{u∈U} [L^u I_1(x) + k(x,u)] − V(x) = −λ_1.

Equation (6.2) is the dynamic programming equation for the following average cost per unit time control problem. We admit stationary controls u(·) such that the controlled process with generator L^u has an equilibrium distribution μ. The criterion to be minimized is

(6.3)  J(μ,u) = ∫_Σ [k(x,u(x)) − V(x)] dμ(x).

(If there is a unique equilibrium distribution μ = μ^u, then reference to μ on the left side of (6.3) is unnecessary.) The principle of optimality states that −λ_1 ≤ J(μ,u), with equality provided u*(x) gives the minimum over u ∈ U of L^u I_1(x) + k(x,u).
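A finite-state sketch of (6.1): power iteration on I + h(Q + diag(V)), with h small enough that this matrix is positive, yields the positive eigenfunction φ_1 and the dominant eigenvalue λ_1 of L + V. The generator Q, the potential V, and h are arbitrary sample data.

```python
Q = [[-1.0, 0.6, 0.4], [0.5, -0.8, 0.3], [0.2, 0.8, -1.0]]   # generator: rows sum to 0
V = [0.3, -0.1, 0.2]
n, h = 3, 0.1

# M = I + h(Q + diag(V)) has positive entries, so power iteration finds its
# Perron eigenvector phi_1 > 0 and eigenvalue rho = 1 + h*lambda_1
M = [[(1.0 if i == j else 0.0) + h * (Q[i][j] + (V[i] if i == j else 0.0))
     for j in range(n)] for i in range(n)]
phi = [1.0, 1.0, 1.0]
for _ in range(5000):
    phi = [sum(M[i][j] * phi[j] for j in range(n)) for i in range(n)]
    m = max(phi)
    phi = [x / m for x in phi]
rho = sum(M[0][j] * phi[j] for j in range(n)) / phi[0]
lam1 = (rho - 1.0) / h                       # dominant eigenvalue of L + V

# (6.1): -H(I_1) + V = lambda_1, where I_1 = -log phi_1 and -H(I_1) = (L phi_1)/phi_1
for i in range(n):
    Lphi = sum(Q[i][j] * phi[j] for j in range(n))
    assert abs(Lphi / phi[i] + V[i] - lam1) < 1e-8
```

With φ_1 in hand, minimizing L^u I_1 + k pointwise (as in §3 or §4) would produce the optimally controlled generator whose equilibrium distribution plays the role of μ_1 below.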

Let us now assume that Σ is compact, that the generator L is bounded on C(Σ), and that V ∈ C(Σ). As in [2], for any probability measure μ on Σ let

J(μ) = − inf_φ ∫_Σ (Lφ/φ) dμ,  φ = e^{−I},

where I, φ ∈ C(Σ). The Donsker-Varadhan formula is

(6.4)  λ_1 = sup_μ [∫_Σ V dμ − J(μ)].

Let

Γ(I,μ) = ∫_Σ [−H(I) + V] dμ.

The function Γ is convex in I and linear in μ. Formula (6.4) will follow if we can find I_1, μ_1 with the saddle point property:

(6.5)  Γ(I_1,μ) ≤ λ_1 ≤ Γ(I,μ_1) for all I, μ.

(This idea was known to Donsker and Varadhan a long time ago, and figures in their proof [2] of (6.4).) If there is a positive eigenfunction φ_1, then we take I_1 = −log φ_1.
From (6.1) we have in fact Γ(I_1,μ) = λ_1 for all probability measures μ on Σ.

To get the right hand inequality, choose u* as above and assume that L^{u*} is bounded on C(Σ). The corresponding Markov process ξ_t has an equilibrium distribution μ_1, and

(6.6)  ∫_Σ (L^{u*} I) dμ_1 = 0 for all I ∈ C(Σ).

(If L^{u*} is unbounded we need to assume the existence of μ_1, and to restrict I to the domain of L^{u*}.) By taking u = u*(x) in (1.5) we have for I ∈ C(Σ)

L^{u*} I + k(x,u*) − V ≥ H(I) − V.

By integrating both sides with respect to μ_1,

−λ_1 = J(μ_1,u*) ≥ −Γ(I,μ_1),  i.e.  λ_1 ≤ Γ(I,μ_1),

as required.

In order to derive (6.4) in this way we had to impose unnecessarily restrictive hypotheses. In particular, we assumed that λ_1 is a dominant eigenvalue in the strict sense that (L + V)φ_1 = λ_1 φ_1, with φ_1 > 0. Actually, (6.4) holds if L is the generator of a strongly continuous, nonnegative semigroup T_t on C(Σ), such that T_t 1 = 1, L has domain dense in C(Σ), and L satisfies the maximum principle [2]. With such assumptions λ_1 is a dominant eigenvalue in the sense that the spectrum of L + V is contained in {z: Re z ≤ λ_1} and λ_1 − (L + V) does not have an inverse.

REFERENCES

[1] R. Azencott, Springer Lecture Notes in Math. No. 774, 1978.
[2] M. D. Donsker and S. R. S. Varadhan, On a variational formula for the principal eigenvalue for operators with a maximum principle, Proc. Nat. Acad. Sci. USA 72 (1975), 780-783.
[3] W. H. Fleming, Exit probabilities and optimal stochastic control, Applied Math. Optimiz. 4 (1978), 329-346.
[4] W. H. Fleming, Stochastic calculus of variations and mechanics, to appear in J. Optimiz. Th. Appl.
[5] W. H. Fleming and S. K. Mitter, Optimal control and nonlinear filtering for nondegenerate diffusion processes, to appear in Stochastics.
[6] W. H. Fleming and R. W. Rishel, Deterministic and Stochastic Optimal Control, Springer-Verlag, 1975.
[7] O. Hernandez-Lerma, Exit probabilities for a class of perturbed degenerate systems, SIAM J. on Control and Optimiz. 19 (1981), 39-51.
[8] O. Hijab, Asymptotic nonlinear filtering and large deviations, this volume.
[9] C. J. Holland, A minimum principle for the principal eigenvalue of second order linear elliptic equations with natural boundary conditions, Comm. Pure Appl. Math. 31 (1978), 509-519.
[10] E. Pardoux, The solution of the nonlinear filter equation as a likelihood function, Proc. 20th Conf. on Decision and Control, Dec. 1981.
[11] S. J. Sheu, Ph.D. Thesis, Brown University, 1982.
[12] A. D. Ventsel, Rough limit theorems on large deviations for Markov stochastic processes, Theory of Probability and its Appl. 21 (1976), 227-242, 499-512.
[13] A. D. Ventsel and M. I. Freidlin, On small random perturbations of dynamical systems, Russian Math. Surveys 25 (1970), 1-55.
GENERALIZED GAUSSIAN RANDOM SOLUTIONS
OF CERTAIN EVOLUTION EQUATIONS

Luis G. Gorostiza*
Centro de Investigación y de Estudios Avanzados, IPN
and
Instituto de Investigación en Matemáticas Aplicadas y Sistemas, UNAM
México

Certain generalized Gaussian processes which arise as high density limits of supercritical branching random fields (see [1], [4]) possess interesting properties. In this note we prove some of these properties. We remark on the fact that the processes obey deterministic evolution equations with generalized random initial conditions.

Let S(R^d) denote the Schwartz space of infinitely differentiable rapidly decreasing real functions on R^d, topologized by the norms

‖φ‖_p = max_{0≤|k|≤p} sup_x Π_{j=1}^d (1+|x_j|)^p |D^k φ(x)|,  φ ∈ S(R^d),  p = 0,1,...,

where x = (x_1,...,x_d), k = (k_1,...,k_d), |k| = k_1+...+k_d, and D^k = ∂^{|k|}/∂x_1^{k_1}···∂x_d^{k_d}. Let S'(R^d) denote the topological dual of S(R^d), <·,·> the canonical bilinear form on S'(R^d) × S(R^d), and ‖·‖_{-p} the operator norm on the dual of the ‖·‖_p-completion of S(R^d). The Schwartz spaces S(R^d × R_+) and S'(R^d × R_+) are similarly defined. The standard Gaussian white noise on R^d will be written W; it is the S'(R^d)-valued random variable whose characteristic functional is E exp{i<W,φ>} = exp{-(1/2)∫_{R^d} φ²(x)dx}. (See [3].)

Let A be the infinitesimal generator, {Tt,t~O} the semigroup and Pt(x,dy) the
transition probability of a time-homogeneous Markov process {Xt,t~O} whose state space
is all of Rd. We assume that

Tt:s(Rd ) ÷ L2(Rd),
T:s(Rd× R+) ÷ L2(Rd)

,(x,t) ÷
s2 rt,(.,t)(x)dt

and

A:$(R d) ÷ S(R d)

and that these mappings are continuous.


We are interested in random $S'(\mathbb{R}^d)$-valued solutions of the deterministic evolution equation $\partial f/\partial t = A^*f$, where $A^*$ is the adjoint of $A$, with initial condition $f_0 = W$. Let $\{M_t,\,t\ge 0\}$ denote the centered $S'(\mathbb{R}^d)$-valued Gaussian process with covariance

* Research supported in part by CONACYT grant PCCBNA005167.



functional

$$\mathrm{Cov}(\langle M_s,\phi\rangle, \langle M_t,\psi\rangle) = \int_{\mathbb{R}^d}\int_{\mathbb{R}^d} \phi(y)\psi(z)\,K(s,dy;t,dz), \qquad \phi,\psi\in S(\mathbb{R}^d),\ s,t\ge 0,$$

where

$$K(s,dy;t,dz) = \int_{\mathbb{R}^d} P_s(x,dy)\,P_t(x,dz)\,dx, \qquad s,t\ge 0.$$

The existence of this process follows from its representation as a stochastic integral

$$\langle M_t,\phi\rangle = \int_{\mathbb{R}^d} T_t\phi(x)\,W(dx), \qquad \phi\in S(\mathbb{R}^d),\ t\ge 0;$$

indeed, the integral is well defined as a consequence of the first assumption on $T_t$, and by the properties of such integrals it determines $\{M_t\}$ as a centered Gaussian random field with covariance functional

$$\mathrm{Cov}(\langle M_s,\phi\rangle, \langle M_t,\psi\rangle) = \int_{\mathbb{R}^d} T_s\phi(x)\,T_t\psi(x)\,dx, \qquad \phi,\psi\in S(\mathbb{R}^d),\ s,t\ge 0$$

(see [3], [7]), which coincides with the previous expression. We note that $M_0 = W$ and $K(0,dy;0,dz)$ is the unit mass at the origin of $\mathbb{R}^d\times\mathbb{R}^d$.
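As a numerical illustration (not from the paper; $d=1$, Brownian semigroup, ad hoc grid and test functions), one can build $\langle M_s,\phi\rangle = \int T_s\phi\,dW$ and $\langle M_t,\psi\rangle = \int T_t\psi\,dW$ from the same white-noise realisation and check the covariance formula against $\int T_s\phi\,T_t\psi\,dx$.

```python
import numpy as np

# d = 1 with the heat semigroup: T_t f(x) = ∫ f(y) (2 pi t)^{-1/2} e^{-(x-y)^2/2t} dy.
rng = np.random.default_rng(1)
x = np.linspace(-8.0, 8.0, 400)
h = x[1] - x[0]
phi = np.exp(-x**2)
psi = np.exp(-(x - 1.0)**2)

def heat(f, t):
    # quadrature approximation of T_t f on the grid
    K = np.exp(-(x[:, None] - x[None, :])**2 / (2.0 * t)) / np.sqrt(2.0 * np.pi * t)
    return (K @ f) * h

s, t = 0.3, 0.7
Ts_phi, Tt_psi = heat(phi, s), heat(psi, t)

# <M_s, phi> and <M_t, psi> built from the SAME white-noise realisation
W = rng.standard_normal((20000, x.size)) * np.sqrt(h)
Ms, Mt = W @ Ts_phi, W @ Tt_psi

mc_cov = np.mean(Ms * Mt)               # both variables are centered
exact = np.sum(Ts_phi * Tt_psi) * h
print(mc_cov, exact)
```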

THEOREM. The process $\{M_t,\,t\ge 0\}$ is Markovian, it possesses a continuous version in the topology of $S'(\mathbb{R}^d)$ (i.e. for each $T>0$ there is an integer $p>0$ such that $\{M_t\}$ is $\|\cdot\|_{-p}$-continuous on $[0,T]$ almost surely), and viewed as a space-time process it satisfies the evolution equation

$$\partial M/\partial t = A^*M, \qquad t\ge 0,$$
$$M_0 = W.$$

REMARKS.
1. All the randomness of $\{M_t\}$ comes from the initial condition $M_0 = W$.
2. We recall that a Gaussian random field is "classical" (or continuous) if and only if its covariance measure has a density with respect to Lebesgue measure (covariance kernel) and this density is continuous (see [3]). In our case, for $K(s,dy;t,dz)$ to have a density $k(s,y;t,z)$ it is necessary that $P_t(x,dy)$ have a density $p_t(x,y)$, and the covariance kernel is

$$k(s,y;t,z) = \int_{\mathbb{R}^d} p_s(x,y)\,p_t(x,z)\,dx, \qquad y,z\in\mathbb{R}^d.$$

Hence for $t>0$, $M_t$ is classical if and only if there is a density $p_t(x,y)$ and $(y,z)\mapsto k(t,y;t,z)$ is continuous.

Independently of the assumptions on $T_t$ and $A$, a covariance measure $K$ of the type above need not have a density. For example, if $\{X_t\}$ is a pure-jump Markov process with holding time parameter $\lambda(x)$ and jump distribution $P(dy;x)$, i.e.

$$A\phi(x) = \lambda(x)\int_{\mathbb{R}^d}(\phi(y)-\phi(x))\,P(dy;x),$$

then $P_t(x,dy)$ has an atom at $x$.


For time-homogeneous diffusion operators $A$ the situation concerning the existence of a continuous covariance kernel $k$ does not seem simple at first sight, among other reasons because there are different definitions of diffusion. Under reasonable conditions on $a_{ij}(\cdot)$ and $b_i(\cdot)$ there corresponds to

$$A = \tfrac12\sum_{i,j=1}^d a_{ij}(\cdot)\,\partial^2/\partial x_i\partial x_j + \sum_{i=1}^d b_i(\cdot)\,\partial/\partial x_i$$

a transition density (see [6]), but the continuity of $(y,z)\mapsto k(t,y;t,z)$ is another question. We leave it open.
3. For Brownian motion things are nice, as they should be. In this case $A = \tfrac12\Delta$ and $p_t(x,y) = (2\pi t)^{-d/2}\exp\{-\|x-y\|^2/2t\}$. The assumption on $A$ is clearly met. We now verify the assumptions on $T_t$. For $\phi\in S(\mathbb{R}^d)$,

$$\|T_t\phi\|_{L^2}^2 = \int_{\mathbb{R}^d}(T_t\phi(x))^2dx = \int_{\mathbb{R}^d}(E_x\phi(X_t))^2dx \le \int_{\mathbb{R}^d}E_x\phi^2(X_t)\,dx$$
$$= \int_{\mathbb{R}^d}\int_{\mathbb{R}^d}\phi^2(y)\,e^{-\|y-x\|^2/2t}(2\pi t)^{-d/2}dy\,dx$$
$$\le \sup_y\Big(\phi(y)\prod_{j=1}^d(1+|y_j|)^2\Big)^2\int_{\mathbb{R}^d}\int_{\mathbb{R}^d}e^{-\|y-x\|^2/2t}(2\pi t)^{-d/2}\prod_{j=1}^d(1+|y_j|)^{-4}dy\,dx < \infty.$$

For $\phi\in S(\mathbb{R}^d\times\mathbb{R}_+)$,

$$\|T\phi\|_{L^2}^2 = \int_{\mathbb{R}^d}\Big(\int_0^\infty T_t\phi(\cdot,t)(x)\,dt\Big)^2dx = \int_{\mathbb{R}^d}\Big(\int_0^\infty\int_{\mathbb{R}^d}\phi(y,t)\,e^{-\|y-x\|^2/2t}(2\pi t)^{-d/2}dy\,dt\Big)^2dx$$
$$= \iiiint \phi(y,t)\phi(z,s)\int_{\mathbb{R}^d}e^{-\|y-x\|^2/2t}(2\pi t)^{-d/2}\,e^{-\|z-x\|^2/2s}(2\pi s)^{-d/2}dx\,dy\,dz\,dt\,ds$$
$$= \iiiint \phi(y,t)\phi(z,s)\,e^{-\|y-z\|^2/2(s+t)}(2\pi(s+t))^{-d/2}dy\,dz\,dt\,ds$$
$$\le \sup_{x,t\ge 0}\Big(\phi(x,t)\prod_{j=1}^d(1+|x_j|)^2(1+t)^2\Big)^2\Big[\int_{\mathbb{R}^d}\prod_{j=1}^d(1+|z_j|)^{-2}dz\,\int_0^\infty(1+t)^{-2}dt\Big]^2 < \infty,$$

since $\prod_{j=1}^d(1+|y_j|)^{-2}\le 1$.

Therefore the Gauss-Markov process $\{M_t\}$ in this case has covariance kernel

$$k(s,y;t,z) = (2\pi(s+t))^{-d/2}\,e^{-\|y-z\|^2/2(s+t)}, \qquad s,t\ge 0$$

(because $p_t(x,y) = p_t(y,x)$ and the Chapman-Kolmogorov equation), hence it is classical for $t>0$, and it satisfies the heat equation

$$\partial M/\partial t = \tfrac12\Delta M, \qquad t\ge 0,$$
$$M_0 = W.$$

In addition the process is self-similar: for any constant $\alpha>0$ the process $\{\langle\alpha^{-d/2}M_{\alpha^2 t},\phi(\alpha^{-1}\cdot)\rangle\}$ has the same distribution as $\{\langle M_t,\phi\rangle\}$, as can be seen from the covariance kernel.
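The kernel formula is just the Chapman-Kolmogorov equation in disguise: for $d=1$, $\int p_s(x,y)p_t(x,z)\,dx = p_{s+t}(y,z)$. A quick quadrature check (the times and points below are arbitrary illustrative values):

```python
import numpy as np

# Numerical check (d = 1): since p_t(x,y) = p_t(y,x), the covariance kernel
# k(s,y;t,z) = ∫ p_s(x,y) p_t(x,z) dx equals the heat kernel p_{s+t}(y,z).
def p(t, a, b):
    return np.exp(-(a - b)**2 / (2.0 * t)) / np.sqrt(2.0 * np.pi * t)

s, t, y, z = 0.4, 0.9, 0.5, -1.0
x = np.linspace(-12.0, 12.0, 20001)
h = x[1] - x[0]
k = np.sum(p(s, x, y) * p(t, x, z)) * h   # k(s,y;t,z) by quadrature
exact = p(s + t, y, z)
print(k, exact)
```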

PROOF. For the Markov property of $\{M_t\}$ it suffices to show that given $t_0<t$ and $\phi\in S(\mathbb{R}^d)$ there is a $\psi\in S(\mathbb{R}^d)$ such that

$$E\{(\langle M_t,\phi\rangle - \langle M_{t_0},\psi\rangle)\langle M_s,\chi\rangle\} = 0 \quad\text{for all } s<t_0 \text{ and } \chi\in S(\mathbb{R}^d),$$

and this can be done using the covariance of $\{M_t\}$ and the Chapman-Kolmogorov equation for $P_t(x,dy)$, with $\psi = T_{t-t_0}\phi$.

It follows from the covariance of $\{M_t\}$ (or more easily from the stochastic integral representation) that

$$E(\langle M_t,\phi\rangle - \langle M_s,\phi\rangle)^2 = \int_{\mathbb{R}^d}(T_t\phi(x)-T_s\phi(x))^2dx, \qquad \phi\in S(\mathbb{R}^d).$$

Then, using the semigroup property of $T_t$, the relation $\int_0^t T_sA\phi\,ds = T_t\phi-\phi$ and the Schwarz inequality, we have for $s<t$

$$E(\langle M_t,\phi\rangle-\langle M_s,\phi\rangle)^2 = \int_{\mathbb{R}^d}\big(T_s(T_{t-s}\phi(x)-\phi(x))\big)^2dx$$
$$= \int_{\mathbb{R}^d}\Big(\int_0^{t-s}T_{s+r}A\phi(x)\,dr\Big)^2dx \le (t-s)\int_{\mathbb{R}^d}\int_0^{t-s}(T_{s+r}A\phi(x))^2dr\,dx$$
$$= (t-s)\int_0^{t-s}\int_{\mathbb{R}^d}(T_{s+r}A\phi(x))^2dx\,dr \le K(t-s)^2,$$

where $K$ is a constant (depending on each bounded interval $[0,T]$ such that $0\le s<t\le T$). The Dudley-Fernique theorem ([2], Theorem 7.1) then implies that $\{\langle M_t,\phi\rangle\}$ is sample-continuous for each $\phi\in S(\mathbb{R}^d)$, and the stated norm-continuity of $\{M_t\}$ follows by Mitoma's extension [5].
To obtain the evolution equation we view $\{M_t\}$ as a space-time field. By the second assumption on $T_t$ (the definition of $T$),

$$\langle M,\phi\rangle = \int_0^\infty\langle M_t,\phi(\cdot,t)\rangle\,dt = \int_{\mathbb{R}^d}\Big(\int_0^\infty T_t\phi(\cdot,t)(x)\,dt\Big)W(dx), \qquad \phi\in S(\mathbb{R}^d\times\mathbb{R}_+),$$

determines $\{M_t,\,t\ge 0\}$ as a generalized space-time Gaussian random field $M$. Let

$$W_t = \partial M_t/\partial t - A^*M_t,$$

meaning $W = \partial M/\partial t - A^*M$ as a space-time field; hence for $\phi\in S(\mathbb{R}^d\times\mathbb{R}_+)$,

$$\langle W,\phi\rangle = \int_0^\infty\langle W_t,\phi(\cdot,t)\rangle\,dt,$$

and

$$\langle W,\phi\rangle = \langle\partial M/\partial t - A^*M,\phi\rangle = -\langle M,\partial\phi/\partial t + A\phi\rangle = -\int_0^\infty\langle M_t,(\partial\phi/\partial t + A\phi)(\cdot,t)\rangle\,dt.$$

Since $A$ is continuous, $W$ is a space-time Gaussian random field. Therefore to verify the evolution equation we need only show that $W$ has the right covariance. We have

$$\mathrm{Cov}(\langle W,\phi\rangle,\langle W,\psi\rangle) = \int_0^\infty\int_0^\infty \mathrm{Cov}(\langle W_s,\phi(\cdot,s)\rangle,\langle W_t,\psi(\cdot,t)\rangle)\,ds\,dt,$$

and on the other hand

$$\mathrm{Cov}(\langle W,\phi\rangle,\langle W,\psi\rangle) = \int_0^\infty\int_0^\infty\int_{\mathbb{R}^d} T_s(\partial\phi/\partial t + A\phi)(x,s)\,T_t(\partial\psi/\partial t + A\psi)(x,t)\,dx\,ds\,dt$$
$$= \int_{\mathbb{R}^d}\Big(E_x\int_0^\infty(\partial\phi/\partial t + A\phi)(X_s,s)\,ds\Big)\Big(E_x\int_0^\infty(\partial\psi/\partial t + A\psi)(X_t,t)\,dt\Big)dx,$$

where $E_x$ is the expectation under the distribution $P_x$ of $\{X_t\}$ starting from $x\in\mathbb{R}^d$. But

$$\phi(X_t,t) - \int_0^t(\partial\phi/\partial t + A\phi)(X_s,s)\,ds, \qquad t\ge 0,$$

is a $P_x$-martingale commencing from $\phi(x,0)$, and since martingales have constant expectation and $\phi(\cdot,\infty) = 0$, then

$$E_x\int_0^\infty(\partial\phi/\partial t + A\phi)(X_s,s)\,ds = -\phi(x,0).$$

Similarly for $\psi$. Substituting above we obtain

$$\int_0^\infty\int_0^\infty\mathrm{Cov}(\langle W_s,\phi(\cdot,s)\rangle,\langle W_t,\psi(\cdot,t)\rangle)\,ds\,dt = \int_{\mathbb{R}^d}\phi(x,0)\psi(x,0)\,dx,$$

and therefore

$$\mathrm{Cov}(\langle W_s,\phi(\cdot,s)\rangle,\langle W_t,\psi(\cdot,t)\rangle) = \delta_2(s,t)\int_{\mathbb{R}^d}\phi(x,0)\psi(x,0)\,dx,$$

where $\delta_2$ is the Dirac delta function on $\mathbb{R}^2$ centered at the origin. This means that

$$W_t = \begin{cases} W, & t = 0,\\ 0, & t > 0,\end{cases}$$

and hence $\{M_t\}$ satisfies the evolution equation.



REFERENCES

1. Dawson, D. and Gorostiza, L.G. Limit theorems for supercritical branching random fields. In preparation.
2. Dudley, R.M. (1967). The sizes of compact subsets of Hilbert space and continuity of Gaussian processes, J. Functional Analysis, Vol. 1, 290-330.
3. Gelfand, I.M. and Vilenkin, N.J. (1966). Generalized Functions, Vol. 4, Academic Press, New York.
4. Gorostiza, L.G. (1981). Limites gaussiennes pour les champs aléatoires ramifiés supercritiques, in "Aspects Statistiques et Aspects Physiques des Processus Gaussiens", Colloque International du C.N.R.S., Saint-Flour, France, 1980. Editions du C.N.R.S. No. 307, 385-398.
5. Mitoma, I. (1981). On the norm-continuity of S'-valued Gaussian processes, Nagoya Math. J., Vol. 82, 209-220.
6. Stroock, D.W. and Varadhan, S.R.S. (1979). Multidimensional Diffusion Processes, Springer-Verlag, Berlin.
7. Wong, E. and Zakai, M. (1974). Martingales and stochastic integrals for processes with a multi-dimensional parameter. Z. Wahrscheinlichkeitstheorie verw. Gebiete, Vol. 29, 109-122.
EXTREMAL CONTROLS FOR COMPLETELY OBSERVABLE DIFFUSIONS*

U.G. Haussmann
Department of Mathematics
University of British Columbia

1. Introduction. We consider the basic control problem:

(1.1) $\inf\{J[u] : u\in U\}$

where

(1.2) $J[u] = E\Big\{\int_0^T \ell(t,x(t),u(t))\,dt + c(x(T))\Big\}$

and $U$ is some class of admissible controls, and where the state $x$ satisfies the equation

(1.3) $dx = f(t,x(t),u(t))\,dt + \sigma(t,x(t))\,dw, \qquad x(0) = x_0.$

Here $x$ and $w$ are processes defined on some probability space which may depend on $u$, and $w$ is a standard Brownian motion. More details will be given in the next section, where several classes of controls will be considered. We begin with the following notation and assumptions (more will follow). $C_t$ is the Banach space of continuous functions $[0,t]\to\mathbb{R}_n$, $C_T = C$, $G$ is the Borel $\sigma$-algebra on $C$ and $\{G_t\}$ is the canonical filtration on $C$. Write $\|x\|_t = \sup\{|x(s)| : 0\le s\le t\}$. If $x:[0,T]\times\Omega\to\mathbb{R}_n$ is a continuous process, i.e. $x$ is a $C$-valued random variable, then $F^x_t = x^{-1}\circ G_t$ is the $\sigma$-algebra generated by the past of $x$. On $[0,T]$ we use the Borel $\sigma$-algebra. $U$ is a Borel set in some Euclidean space with Borel subsets $B_U$. $\mathbb{R}_n$ is $n$-dimensional Euclidean space and $\mathbb{R}_{n\times m}$ is the space of $n\times m$ matrices.

(1.4) (i) $f: [0,T]\times\mathbb{R}_n\times U\to\mathbb{R}_n$ is Borel measurable and continuous in $u$ uniformly in $t$ for each $x$.
(ii) $\sigma: [0,T]\times\mathbb{R}_n\to\mathbb{R}_{n\times m}$ is Borel measurable.
(iii) $|f(t,x,u)| + |\sigma(t,x)| \le K_1(1+|x|)$,
$|f(t,x,u)-f(t,y,u)| + |\sigma(t,x)-\sigma(t,y)| \le K_2|x-y|$,
for some $K_1<\infty$, $K_2<\infty$.

(1.5) (i) $\ell: [0,T]\times\mathbb{R}_n\times U\to\mathbb{R}$ is Borel measurable, and continuous in $(x,u)$ for each $t$, continuous in $x$ uniformly in $u$ for each $t$.
(ii) $c: \mathbb{R}_n\to\mathbb{R}$ is continuous.
(iii) $|\ell(t,x,u)| \le K_3(1+|x|^q)$, $|c(x)| \le K_3(1+|x|^q)$ for some $q<\infty$.

We show in section three that a control which satisfies the necessary condition of the maximum principle, i.e. an extremal control, is in fact 'optimal' provided that certain convexity hypotheses are satisfied. This result is proved along the lines of the corresponding deterministic case, [9]. Here 'optimal' means optimal within the class of controls which are adapted processes on a given probability space with filtration and Brownian motion. However usually the 'non-anticipative' controls are adapted processes defined on some probability space with filtration and Brownian motion which may change with $u$. In section two we see that $\inf J[u]$ is the same over all the usual control classes, so that any extremal control is optimal under the convexity hypothesis. In section four we apply this result to show that the extremal controls computed for several examples in [5] are in fact optimal. It is also useful in establishing $\varepsilon$-optimality for certain problems, c.f. section three and [6].

One final note. We shall always assume that the probability spaces are chosen so that the solution $x$ of (1.3) is not just continuous w.p.1, but in fact all trajectories are continuous.

2. Control Classes. Various sets of controls have been introduced in the literature. We shall consider several of these here and observe that they all lead to the same infimum in (1.1).

Definition 2.1: $U_L$, the set of control laws, consists of the $\{G_t\}$ progressively measurable functions $u:[0,T]\times C\to U$ for which there is a probability space $(\Omega,F,P)$ with filtration, which may depend on $u$ and which carries a continuous adapted process $x$ and a Brownian motion $w$ such that $(x(\cdot),u(\cdot,x(\cdot)),w(\cdot))$ satisfy (1.3) w.p.1.

Definition 2.2: $U_A$, the set of adapted controls (adapted to $x$), consists of all separable, measurable processes $u$ defined on a probability space $(\Omega,F,P)$, which may depend on $u$ and which carries two continuous processes $x$, $w$ such that $(w_t,F^x_t)$ is a standard Brownian motion, $u$ is $\{F^x_t\}$ progressively measurable, $U$-valued, and $(x(\cdot),u(\cdot),w(\cdot))$ satisfy (1.3) w.p.1.

Definition 2.3: $U_N$, the set of non-anticipative controls (non-anticipative with respect to $w$), consists of all separable, progressively measurable, and $U$-valued processes $u$ defined on $\pi$, a probability space with filtration and Brownian motion, i.e. $\pi = (\Omega,F,P,\{F_t\},w)$, which may depend on $u$.

Definition 2.4: $U_M$, the set of Markovian control laws, consists of all Borel measurable functions $u:[0,T]\times\mathbb{R}_n\to U$ for which there is a probability space $(\Omega,F,P)$ with filtration, which may depend on $u$ and which carries two continuous adapted processes $x$, $w$ such that $w$ is a Brownian motion and $(x(\cdot),u(\cdot,x(\cdot)),w(\cdot))$ satisfy (1.3).

Definition 2.5: For fixed $\pi = (\Omega,F,P,\{F_t\},w)$ write $U_N(\pi)$ for the corresponding controls in $U_N$. Similarly for $U_M(\pi)$, $U_L(\pi)$.

Remark 2.6. If $u\in U_L$, then according to [2], [10] there exists a Brownian motion $(\bar w_t,F^x_t)$ such that

$$dx = f(t,x(t),u(t,x(\cdot)))\,dt + \sigma(t,x(t))\,d\bar w,$$

so that $u(\cdot,x(\cdot))\in U_A$, i.e. $U_L\hookrightarrow U_A$. For $u\in U_A$ we can take $F_t = F^x_t$, i.e. $u\in U_N$, so

$$U_M \hookrightarrow U_L \hookrightarrow U_A \subset U_N.$$

We also have for each $\pi$

$$U_M(\pi) \hookrightarrow U_L(\pi) \hookrightarrow U_N(\pi).$$

Remark 2.7. As we are assuming (1.4), then for $u\in U_N$ or $u\in U_N(\pi)$, (1.3) has a unique, continuous adapted solution $x^u$. Gronwall's inequality, Burkholder's inequality and (1.4), (1.5) imply that there are constants $K_q$, $K_4$ such that

$$E\|x^u\|_T^q \le K_q\,E|x(0)|^q, \qquad \text{any } q<\infty,$$
$$E\|x^u - y^u\|_T \le K_4\,E|x(0)-y(0)|$$

for any $u\in U_N$. Here $y^u$ is the solution of (1.3) with initial condition $y(0)$. Now (1.5) implies

$$J[u] \le E\Big\{\int_0^T|\ell(t,x^u,u(t))|\,dt + |c(x^u(T))|\Big\} \le (T+1)K_3(1+K_q|x_0|^q).$$

Remark 2.8. From the above we have

$$\inf\{J[u]: u\in U_N(\pi)\} \le \inf\{J[u]: u\in U_M(\pi)\}.$$

In [7], chapter 3, theorem 1.7, it is shown in fact that equality holds.

We also have

(2.9) $\inf\{J[u]: u\in U_M\} = \inf\{J[u]: u\in U_A\}$

provided $U$ is compact and either

(2.10) $\sigma$ is bounded, continuous, invertible, and $f,\ell,c$ are bounded, c.f. [1], theorem IV-6; or

(2.11) (i) for each $(t,u)$, $\sigma,f,c,\ell$ are continuously differentiable in $x$, with uniformly bounded derivatives in the case of $\sigma,f$ and with

$$|c_x(x)| + |\ell_x(t,x,u)| \le K(1+|x|^q),$$

(ii) $f(t,x,u) = \phi(t,x) + \begin{pmatrix} g(t,x,u)\\ 0\end{pmatrix}$, $\quad \sigma(t,x) = \begin{pmatrix}\sigma_2(t,x)\\ 0\end{pmatrix}$,

with $\phi$ Borel, $g(t,x,u)\in\mathbb{R}_m$, $\sigma_2(t,x)\in\mathbb{R}_{m\times m}$, $g,\sigma,\ell$ continuous, $\sigma_2$ bounded, invertible with bounded inverse, and with $\phi$ continuously differentiable in $x$ with bounded derivative,

(iii) $dx = \phi(t,x)\,dt + \sigma(t,x)\,dw$

has a transition density $p(s,x,t,y)$ satisfying

$$\int_{s'}^T\int_{\mathbb{R}_n} p(s,x,t,y)^\beta\,dy\,dt < \infty$$

for some $\beta>1$ and all $s'>s$, c.f. [4], corollary 3.9, where it is shown that (2.9) holds with $U_A$ replaced by $U_N$.

We point out that if for each $u\in U_M$ the solution of (1.3) is unique in law (true under (2.11)) then for any $\pi$

$$\inf\{J[u]: u\in U_M\} = \inf\{J[u]: u\in U_M(\pi)\}$$

and consequently

$$\inf\{J[u]: u\in U_M\} = \inf\{J[u]: u\in U_N(\pi)\}.$$

Finally we remark that if $U$ is compact then the equality of the inf over $U_L(\pi)$ and $U_N(\pi)$ can be established even if $f$, $\sigma$, $\ell$, $c$ depend on the past of $x$ rather than just $x(t)$. Moreover $\inf J$ over the predictable controls also assumes this common value.

3. Extremal Controls. We now define the extremal controls as those which satisfy the necessary condition of the maximum principle. We assume (1.4), (1.5), (2.11)(i).

Definition 3.1. If $(x(t),u(t),w(t))$ satisfy (1.3) on $(\Omega,F,P,\{F_t\})$, then $p$, the adjoint process of $x$, is given by $p(t) = E\{\tilde p(t)\,|\,F_t\}$ where

(3.2) $\tilde p(t) = -c_x(x(T))\Phi(T,t) - \int_t^T \ell_x(s,x(s),u(s))\Phi(s,t)\,ds,$

where the $i$-th column of $\Phi(s,t)$ is the unique solution of

(3.3) $dy = f_x(s,x(s),u(s))\,y\,ds + \sum_{k=1}^m \sigma^k_x(s,x(s))\,y\,dw_k, \qquad y(t) = e_i,$

with $\sigma^k$ being the $k$-th column of $\sigma$, $w_k$ being the $k$-th component of $w$, and $e_i$ being the $i$-th column of the $n\times n$ identity $I$. Note that $\tilde p(t)$ is a row vector. If $u\in U_A$ or $U_L$ then $p(t) = E\{\tilde p(t)\,|\,F^x_t\}$ and if $u\in U_M$ then

$$p(t) = E\{\tilde p(t)\,|\,F_t\} = E\{\tilde p(t)\,|\,F[x(t)]\},$$

where $F[x(t)]$ is the $\sigma$-algebra generated by $x(t)$. In this case we can write $p(t) = \bar p(t,x(t))$ for a measurable function $\bar p$. We shall, from now on, use $p$ and $\tilde p$ interchangeably, and in fact drop the tilde.

Definition 3.4. $\hat u$ is extremal if for each $v\in U$ and $t$ not in some null set,

(3.4) $p(t)f(t,\hat x(t),\hat u(t)) - \ell(t,\hat x(t),\hat u(t)) \ge p(t)f(t,\hat x(t),v) - \ell(t,\hat x(t),v)$ w.p.1,

where $p$ is the adjoint process for $\hat u$, and $\hat x$ is the solution of (1.3) corresponding to $\hat u$.

The following results show that necessarily an optimal control is extremal in certain cases, i.e. a Pontryagin-type maximum principle holds.

Theorem 3.5. Assume (1.4), (1.5), (2.11)(i). If $\hat u$ is optimal in $U_N(\pi)$ for some $\pi$, and if $\hat u$ is $\{F_{t-}\}$ adapted, then $\hat u$ is extremal.

Proof: This follows from [8], with perturbed controls $u_{t,\delta}(s) = E\{\hat u(t)\,|\,F_{t-\delta}\}$.

Corollary 3.6. Assume $U$ is compact and (1.4), (1.5), (2.11)(i). If $\hat u$ is optimal in $U_A$, then $\hat u$ is extremal.

Proof: For the fixed $\pi$ corresponding to remark 2.6 we have $\hat u\in U_N(\pi)$. Note $F_t = F^x_t$. Now remarks 2.6 and 2.8 imply that $\hat u$ is optimal in $U_N(\pi)$. Since $x$ has continuous sample paths then $F^x_{t-} = F^x_t$, hence $\hat u$ is $F^x_{t-}$ adapted. By the theorem $\hat u$ is extremal.

Corollary 3.7. Assume $U$ is compact and (2.11). If $\hat u$ is optimal in $U_M$, then it is extremal.

Proof. This follows either from [7], [8] or from remark 2.8 and the above. In fact (2.9) implies that $\hat u$ is optimal in $U_A$ so the previous corollary gives the result.

Remark 3.8. We know that under the hypotheses of corollary 3.7, there exists $\hat u$ optimal in $U_M$, c.f. [8].

We are now interested in the converse question: if $u$ is extremal is it optimal?

We define the Hamiltonian

$$H(t,x,p,u) = p\,f(t,x,u) - \ell(t,x,u)$$

for $p\in\mathbb{R}_n$ (row vector), $x\in\mathbb{R}_n$, $u\in U$, $t\in[0,T]$; however the following function corresponds more closely to the Hamiltonian in the calculus of variations:

$$H^*(t,x,p) = \sup_{u\in U} H(t,x,p,u).$$

If $\hat u$ is extremal then $H(t,\hat x(t),p(t),\hat u(t)) = H^*(t,\hat x(t),p(t))$ a.e.

Theorem 3.9. Assume (1.4), (1.5), (2.11)(i). If (i) $\hat u\in U_N(\pi)$ is extremal, where $(\hat x,\hat u,w)$ is a solution of (1.3); if (ii)

$$\sigma^k(t,x) = D^k(t)x + e^k(t), \qquad k = 1,\dots,m,$$

where $D^k:[0,T]\to\mathbb{R}_{n\times n}$, $e^k:[0,T]\to\mathbb{R}_n$ are bounded and measurable; if (iii) $c(\cdot)$ is convex; and if (iv) $A\subset\mathbb{R}_n$ is an open convex set such that for each $t$ w.p.1 $H^*(t,x,p(t))$ is concave in $x$ for $x$ in $A$, then for any $\beta\in(0,1)$ and some constant $K_0$ depending only on $\beta$ and the bounds and growth constants of (1.4), (1.5), (2.11)(i),

$$J[\hat u] \le \inf\{J[u]: u\in U_N(\pi)\} + K_0\big[\Pr\{\inf\{\tau(u): u\in U_N(\pi)\} < T\}\big]^\beta,$$

where $\tau(u)$ is the first exit time of $x^u$ from $A$.

Proof: Write $\Phi(t)$ for $\Phi(t,0)$ and $\Psi(t) = \Phi(t)^{-1}$. Then

$$d\Psi = -\Psi\Big[f_x(t,\hat x(t),\hat u(t)) - \sum_{k=1}^m D^k(t)D^k(t)\Big]dt - \Psi\sum_{k=1}^m D^k(t)\,dw_k, \qquad \Psi(0) = I.$$

From the pathwise uniqueness of solutions of (3.3), it follows that

$$\Phi(s,t) = \Phi(s)\Psi(t).$$

From (3.2) we obtain

$$\tilde p(t) = \Big[\tilde p(0) + \int_0^t \ell_x(s,\hat x(s),\hat u(s))\Phi(s)\,ds\Big]\Psi(t).$$

If $(x,u,w)$ is any solution of (1.3) with $u\in U_N(\pi)$, then

$$d(\Psi(t)x(t)) = -\Psi(t)\Big[f_x(t,\hat x(t),\hat u(t))x(t) - f(t,x(t),u(t)) + \sum_k D^k(t)e^k(t)\Big]dt + \Psi(t)\sum_k e^k(t)\,dw_k,$$

and

$$\tilde p(T)x(T) - \tilde p(0)x(0) = \int_0^T\Big\{-\tilde p(t)\Big[f_x(t,\hat x(t),\hat u(t))x(t) - f(t,x(t),u(t)) + \sum_k D^k(t)e^k(t)\Big] + \ell_x(t,\hat x(t),\hat u(t))x(t)\Big\}dt$$
$$\qquad + \sum_k\int_0^T\Big[\tilde p(0) + \int_0^t \ell_x(s,\hat x(s),\hat u(s))\Phi(s)\,ds\Big]\Psi(t)e^k(t)\,dw_k,$$

so that (the stochastic integral does not involve $x$ and cancels when we subtract the same identity written for $\hat x$)

$$\tilde p(T)[x(T)-\hat x(T)] = \int_0^T\Big\{-\big[\tilde p(t)f_x(t,\hat x(t),\hat u(t)) - \ell_x(t,\hat x(t),\hat u(t))\big][x(t)-\hat x(t)] + \tilde p(t)\big[f(t,x(t),u(t)) - f(t,\hat x(t),\hat u(t))\big]\Big\}dt.$$

But $\tilde p(T) = -c_x(\hat x(T))$, so the convexity of $c$ implies

$$\tilde p(T)[x(T)-\hat x(T)] \ge c(\hat x(T)) - c(x(T)).$$

Hence

$$\Big[c(\hat x(T)) + \int_0^T \ell(t,\hat x(t),\hat u(t))\,dt\Big] - \Big[c(x(T)) + \int_0^T \ell(t,x(t),u(t))\,dt\Big]$$
$$\le \tilde p(T)[x(T)-\hat x(T)] + \int_0^T\big[\ell(t,\hat x(t),\hat u(t)) - \ell(t,x(t),u(t))\big]dt$$
$$= \int_0^T\Big\{-H_x(t,\hat x(t),\tilde p(t),\hat u(t))[x(t)-\hat x(t)] + \big[H(t,x(t),\tilde p(t),u(t)) - H(t,\hat x(t),\tilde p(t),\hat u(t))\big]\Big\}dt.$$

Since $H$ and $H_x$ are linear in $p$, and $x$, $u$ are $F_t$ adapted, $\tilde p$ can be replaced by $p$ when expectations are taken. Moreover, since $H_x(t,\hat x(t),p(t),\hat u(t))$ is a subgradient (in $x$) of $H^*(t,\cdot,p(t))$ at $\hat x(t)$ and $H^*(t,\cdot,p)$ is concave on $A$, then

$$J[\hat u] - J[u] \le E\,1_A\int_0^T\Big\{-H_x(t,\hat x(t),p(t),\hat u(t))[x(t)-\hat x(t)] + H^*(t,x(t),p(t)) - H^*(t,\hat x(t),p(t))\Big\}dt$$
$$\qquad + E(1-1_A)\int_0^T\Big\{-H_x(t,\hat x(t),p(t),\hat u(t))[x(t)-\hat x(t)] + \big[H(t,x(t),p(t),u(t)) - H(t,\hat x(t),p(t),\hat u(t))\big]\Big\}dt$$
$$\le E(1-1_A)\int_0^T\Big\{-H_x(t,\hat x(t),p(t),\hat u(t))[x(t)-\hat x(t)] + \big[H(t,x(t),p(t),u(t)) - H(t,\hat x(t),p(t),\hat u(t))\big]\Big\}dt,$$

where the first inequality follows by the definition of $H^*$ ($H(t,x,p,u)\le H^*(t,x,p)$, while $H(t,\hat x,p,\hat u) = H^*(t,\hat x,p)$ by extremality), the second because the concavity makes the first integrand nonpositive on $A$, and where $1_A(\omega) = 1$ if $x^u(t,\omega)\in A$ for almost all $t\in[0,T]$, all $u\in U_N(\pi)$, and $1_A(\omega) = 0$ otherwise.

Since $c_x$, $\ell_x$ satisfy a growth condition in $x$ and since $\Phi$ satisfies a linear equation with bounded coefficients, then $E\|\tilde p\|_T^q < \infty$ for any $q<\infty$. The growth conditions, remark 2.7, and Hölder's inequality now give the result.

Corollary 3.10. Under the assumptions of theorem 3.9, if $A = \mathbb{R}_n$ (i.e. $H^*$ is concave for all $x$) then $\hat u$ is optimal.

It is usually difficult to say when $H^*$ is concave, because for example if $n = 1$ we cannot usually say that $p$ has constant sign w.p.1; however we have the following result.

Corollary 3.11. Under the assumptions of theorem 3.9 except (iv), if

(3.12) $f(t,x,u) = A(t)x + b(t,u), \qquad \ell(t,x,u) = \ell_0(t,x) + \ell_1(t,u)$

with $\ell_0(t,\cdot)$ convex on $\mathbb{R}_n$, then $\hat u$ is optimal in $U_N(\pi)$.

Proof: The result follows from the theorem since

$$H^*(t,x,p) = pA(t)x - \ell_0(t,x) + k(t,p),$$
$$k(t,p) = \sup\{pb(t,u) - \ell_1(t,u): u\in U\},$$

so $H^*$ is concave in $x$.

Observe that if $\hat u\in U_M(\pi)$ (or one of the other classes) is extremal, then it is extremal as an element of $U_N(\pi)$, thus optimal in $U_N(\pi)$ and so optimal in $U_M(\pi)$. If we have law uniqueness then $\hat u$ is optimal in $U_M$.

Remark 3.13: The results of this section can be extended to the case when there are constraints of the form $E\,r(x(T)) = 0$ if $r$ satisfies the same conditions as $c$. We say that $u$ is extremal if $E\,r(x(T)) = 0$ and if (3.4) holds with $p(t)$ defined as

$$p(t) = E\{\theta\tilde p(t) + \lambda' r_x(x(T))\Phi(T,t)\,|\,F_t\}$$

for $\theta\ge 0$ and some constant vector $\lambda$. If $\theta = 0$ the problem is called abnormal. In the normal case $\theta$ is normalized to 1. Theorem 3.5 continues to hold but corollary 3.6 may not, since the results of section 2 may fail. On the other hand corollary 3.7 holds because of [3], [4]. Theorem 3.9 holds in the normal case if $r$ is linear, i.e. $r(x) = Rx$ for some matrix $R$.

Remark 3.14. Suppose (3.12) holds and $A$ is as in the theorem. In [6] we show how to construct (approximations to) controls $u^R$ which are extremal for solutions of (1.3) when $f$, $\sigma$ are altered on $\{|x|>R\}$ to be bounded with bounded $x$ derivative. From remark 2.7 it follows that given $\varepsilon>0$, we can choose $R$ such that for fixed $\beta\in(0,1)$

$$\Pr\{\sup\{\|x^u\|_T: u\in U_N\} \ge R\} < (\varepsilon/3K_0)^{1/\beta}.$$

But for $R$ sufficiently large we observe that (for all $u$) $|J[u] - J_R[u]| < \varepsilon/3$, where $J_R[u]$ is defined by (1.2) with $x$ given by the altered $f,\sigma$. Hence

$$J[u^R] \le \inf\{J[u]: u\in U_N(\pi)\} + \varepsilon,$$

i.e. $u^R$ is $\varepsilon$-optimal in $U_N(\pi)$ if $\ell_0$, $c$ are convex in $x$ for each $t$.

4. Some Examples. In [5] we considered eight simple examples and exhibited extremal controls in each case. Our aim here is to show that seven of these controls are optimal. Sometimes it is necessary to add a convexity hypothesis.

Example 4.1 (Linear Regulator)

(4.2) $f(t,x,u) = A(t)x + B(t)u, \qquad \sigma(t,x) = \sigma(t),$

(4.3) $\ell(t,x,u) = x'M(t)x + u'N(t)u, \qquad c(x) = x'Qx,$

$U = \mathbb{R}_p$.

$A,B,\sigma,M,N$ are bounded, measurable, and $N(t)>0$, $M(t)\ge 0$, $Q\ge 0$. As indicated in [5] we should guess $\hat u(t,x) = K(t)x\in U_M$ as extremal. Then

$$p(t,x) = -2E_{tx}\Big\{x(T)'Q\Phi(T,t) + \int_t^T x(s)'M(s)\Phi(s,t)\,ds\Big\},$$
$$d\Phi(s,t) = A(s)\Phi(s,t)\,ds,$$
$$dx = (A+BK)x\,dt + \sigma\,dw.$$

Hence $E_{tx}\{x(T)\} = \Phi_K(T,t)x$ where

$$d\Phi_K(s,t) = [A(s)+B(s)K(s)]\Phi_K(s,t)\,ds, \qquad \Phi_K(t,t) = I,$$

so that $\hat u$ is extremal if

$$K(t) = -N(t)^{-1}B(t)'P(t),$$
$$P(t) = \Phi(T,t)'Q\Phi_K(T,t) + \int_t^T \Phi(s,t)'M(s)\Phi_K(s,t)\,ds.$$

These last two equations imply that $P$ satisfies the usual Riccati equation

$$\dot P + A'P + PA - PBN^{-1}B'P + M = 0, \qquad P(T) = Q.$$

Now corollary 3.11 implies that $\hat u$ is optimal in $U_K = \{u: [0,T]\times\Omega\to\mathbb{R}_p : |u(t,\omega)| \le K(1+|x(t)|),\ \text{all } t\}$; specifically it is optimal in $U = \{u(t,x,\omega): u(t,x,\omega) = L(t)x + v(t,\omega),\ L,v \text{ bounded, measurable}\}$.
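The Riccati equation above can be integrated backward from $P(T)=Q$ with any ODE scheme. The sketch below (illustrative constant matrices for a double integrator, not taken from the paper) uses explicit Euler and recovers the extremal gain $K(t) = -N(t)^{-1}B(t)'P(t)$ at $t=0$.

```python
import numpy as np

# Backward Euler integration of P' + A'P + PA - P B N^{-1} B' P + M = 0, P(T) = Q.
A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator (illustrative)
B = np.array([[0.0], [1.0]])
M = np.eye(2); N = np.array([[1.0]]); Q = np.eye(2)
T, steps = 8.0, 80000
dt = T / steps
Ninv = np.linalg.inv(N)

P = Q.copy()
for _ in range(steps):
    dP = -(A.T @ P + P @ A - P @ B @ Ninv @ B.T @ P + M)   # dP/dt
    P = P - dt * dP                                        # step from t to t - dt

K0 = -Ninv @ B.T @ P                     # extremal feedback gain at t = 0
res = A.T @ P + P @ A - P @ B @ Ninv @ B.T @ P + M
print(P); print(K0)
```

For a horizon this long, $P(0)$ is essentially the stationary solution of the algebraic Riccati equation $A'P + PA - PBN^{-1}B'P + M = 0$, so the residual `res` is nearly zero.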

Example 4.4 (Predicted Miss). $f$ and $\sigma$ are still given by (4.2) (with, say, $B$, $\sigma$ continuous, $\sigma$ invertible), but $\ell \equiv 0$, $c(x) = k(v'x)$ where $v$ is a constant vector and $k$ is convex, even, non-negative and continuously differentiable, $k(y) \le K(1+|y|^q)$. $U = \{u\in\mathbb{R}_p: |u_i|\le 1\}$. In [5] we show that $\hat u(t,x) = -\mathrm{sgn}[B(t)'s(t)s(t)'x]$ is extremal with $s(t)$ given by

$$ds = -A(t)'s\,dt, \qquad s(T) = v.$$

Hence $\hat u$ is optimal in $U_M$ or any of the other classes. Examples 3.3, 3.4 and 3.5 of [5] are treated similarly, if we add the hypothesis that $\ell(t,\cdot)$, $k(\cdot)$ are convex.
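The feedback law only needs the backward vector $s(t)$, which is cheap to tabulate. A minimal sketch (constant illustrative matrices, $n=2$; none of the numbers come from the paper):

```python
import numpy as np

# Predicted-miss feedback: s solves ds = -A(t)'s dt with s(T) = v, and the
# extremal control is u(t,x) = -sgn[B(t)' s(t) s(t)' x] (componentwise sign).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
v = np.array([1.0, 0.0])
T, steps = 1.0, 1000
dt = T / steps

s = np.zeros((steps + 1, 2))
s[-1] = v
for i in range(steps, 0, -1):
    s[i - 1] = s[i] + dt * (A.T @ s[i])   # Euler step backward in time

def u_hat(i, x):
    # control on grid node i at state x
    return -np.sign(B.T @ np.outer(s[i], s[i]) @ x)

print(s[0], u_hat(0, np.array([1.0, 2.0])))
```

For this $A$ (with $(A')^2 = 0$) the Euler recursion reproduces $s(t) = e^{A'(T-t)}v$ exactly, so $s(0) = (1, T)$.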

Example 4.5.

$$dx = B(t)u(t)\,dt + \sigma(t)\,dw,$$
$$\min\ E\int_0^T u'N(t)u\,dt, \qquad |u_i(t)|\le 1,$$

where $B(t)$, $N(t)$ are bounded, measurable and $N(t)>0$. We add the constraint

$$E\,x(T) = \bar x.$$

Then $p(t) = \lambda'$ ($\lambda$ is a constant column vector), and in the normal case, i.e. when $\{J[u]: u\in U,\ Ex(T) = \bar x\}$ is more than just a singleton, $\hat u$ is extremal if

$$\hat u(t) = \mathrm{sat}\{N(t)^{-1}B(t)'\lambda\},$$

where $\lambda$ must be computed from

$$\bar x - x_0 = \int_0^T B(t)\hat u(t)\,dt.$$

Hence $\hat u$ is optimal, c.f. remark 3.13.

Example 3.8 of [5] follows similarly. Example 3.6 of [5] cannot be analysed in this fashion, i.e. we cannot show that the extremal control found in [5] is optimal.
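In the scalar case the multiplier $\lambda$ can be found by a one-dimensional root search on the mean-displacement constraint. A sketch with made-up constant scalars $B$, $N$, $T$, $x_0$, $\bar x$ (purely illustrative):

```python
import numpy as np

# Find lambda so that u(t) = sat(N^{-1} B lambda) satisfies
# xbar - x0 = ∫_0^T B u(t) dt  (scalar, constant-coefficient case).
def sat(y):
    return np.clip(y, -1.0, 1.0)

B, N, T = 2.0, 1.0, 1.0
x0, xbar = 0.0, 1.2

def gap(lam):
    # required displacement minus the displacement the control produces;
    # decreasing in lam since B > 0
    return (xbar - x0) - B * sat(B * lam / N) * T

lo, hi = -10.0, 10.0
for _ in range(100):                     # bisection
    mid = 0.5 * (lo + hi)
    if gap(mid) > 0:
        lo = mid
    else:
        hi = mid
lam = 0.5 * (lo + hi)
u = sat(B * lam / N)
print(lam, u, B * u * T)                 # displacement matches xbar - x0
```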

References

[1] J.M. Bismut, Théorie probabiliste du contrôle des diffusions, Mem. Amer. Math. Soc., 4(1976), No. 167.

[2] M.P. Ershov, Representations of Itô processes, Th. Probab. Applic., 17(1972), pp. 165-169.

[3] U.G. Haussmann, General necessary conditions for optimal control of stochastic systems, Math. Programming Stud., 9(1976), pp. 30-48.

[4] U.G. Haussmann, On the adjoint process for optimal control of diffusion processes, SIAM J. Control and Optimization, 19(1981), pp. 221-243.

[5] U.G. Haussmann, Some examples of optimal stochastic controls, SIAM Review, 23(1981), pp. 292-307.

[6] U.G. Haussmann, On the approximation of optimal stochastic controls, preprint, U.B.C.

[7] N.V. Krylov, Controlled Diffusion Processes, Springer-Verlag, New York, 1980.

[8] H.J. Kushner, Necessary conditions for continuous parameter stochastic optimization problems, SIAM J. Control, 10(1972), pp. 550-562.

[9] A. Seierstad, K. Sydsaeter, Sufficient conditions in optimal control theory, Int. Economic Rev., 18(1977), pp. 367-391.

[10] E. Wong, Representation of martingales, quadratic variation and applications, SIAM J. Control, 9(1971), pp. 621-633.
LÉVY'S STOCHASTIC AREA FORMULA IN HIGHER DIMENSIONS

K. Helmes and A. Schwane

Institut für Angewandte Mathematik
Universität Bonn

ABSTRACT

Let $(W_t) = (W_t^1, W_t^2, \dots, W_t^d)$, $d\ge 2$, be a $d$-dimensional standard Brownian motion and let $A(t)$ be a bounded measurable function from $\mathbb{R}_+$ into the space of $d\times d$ skew-symmetric matrices and $x(t)$ one into $\mathbb{R}^d$. We are concerned with a class of stochastic processes $(L_t^{A,x})$, a particular example of which is P. Lévy's 'stochastic area'

$$L_t = \tfrac12\int_0^t \big(W_s^1\,dW_s^2 - W_s^2\,dW_s^1\big).$$

We calculate the joint characteristic function of $W_t$ and $L_t^{A,x}$ and, based on this result, give a formula for fundamental solutions for the hypoelliptic operators which generate the diffusions $(W_t, L_t^{A,x})$.

1. INTRODUCTION

In [8] P. Lévy began studying what he called the 'stochastic area' of a 2-dimensional standard Brownian motion, i.e. the area enclosed by the trajectory of the Wiener process and its chord: the stochastic process $(L_t)$. In subsequent papers [9-12] he derived formulae for the characteristic function of $L_t$ and for the characteristic function of $L_t$ given the position of the process $(W_t)$ at time $t>0$; namely, for $\lambda\in\mathbb{R}$, $x\in\mathbb{R}^2$,

(1) $E[\exp(i\lambda L_t)] = \dfrac{1}{\cosh(\lambda t/2)}$

and

(2) $E[\exp(i\lambda L_t)\,|\,W_t = x] = \dfrac{\lambda t/2}{\sinh(\lambda t/2)}\exp\Big\{-\dfrac{|x|^2}{2t}\Big(\dfrac{\lambda t}{2}\coth\dfrac{\lambda t}{2} - 1\Big)\Big\},$

where $|x|$ denotes the Euclidean norm of $x$, cf. [12, p. 173]. Lévy's proof of Eq. 2 is based on the expansion of (standard) Brownian motions $(W_t^i)$, $i=1,2,\dots$, in a countable coordinate system, e.g.

$$W_t^i = \frac{t}{\sqrt\pi}\,Y_0^i + \sqrt{\frac{2}{\pi}}\sum_{m=1}^\infty \frac{\sin(mt)}{m}\,Y_m^i, \qquad 0\le t\le\pi,$$

where $Y_0^i, Y_1^i, \dots$ are independent normally distributed random variables with mean zero and variance 1, see [12] or [1, p. 261]. To obtain the characteristic function of $L_t$ one then simply has to integrate (2) with respect to the distribution of $\rho_t^2 := |W_t|^2$. Actually, Lévy gave two proofs of formula 1, the second one being based on the skew product representation of two-dimensional Brownian motion and the formula for the characteristic function of

Recently, M. Yor simplified Lévy's proof employing a result on Bessel processes and an elementary result used by D. Williams for the 'first time' to give an elegant proof of a stopped Brownian motion formula originally due to H.M. Taylor, [15, 16]. However, Yor's proof still rests on a non-elementary result in the theory of stochastic differential equations, a criterion due to T. Yamada and S. Watanabe on Itô-uniqueness for non-linear stochastic equations. A fine account of all these ideas can be found in [6, pp. 384-391].
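Formula (1) is also easy to confirm by direct simulation. The sketch below (an illustration with ad hoc step counts, sample sizes and $\lambda$; not part of the paper) approximates the Itô integral defining $L_t$ by left-endpoint sums and compares the Monte Carlo average of $\cos(\lambda L_t)$ with $1/\cosh(\lambda t/2)$.

```python
import numpy as np

# Monte Carlo check of (1): L_t = (1/2)∫(W^1 dW^2 - W^2 dW^1) via Ito left sums.
rng = np.random.default_rng(2)
t, n, lam = 1.0, 1000, 1.5
dt = t / n
batches, batch = 20, 2000

acc = 0.0
for _ in range(batches):
    dW = rng.standard_normal((batch, n, 2)) * np.sqrt(dt)
    W = np.cumsum(dW, axis=1) - dW      # left endpoints W_{s_k} of each step
    L = 0.5 * np.sum(W[:, :, 0] * dW[:, :, 1] - W[:, :, 1] * dW[:, :, 0], axis=1)
    acc += np.sum(np.cos(lam * L))      # imaginary part vanishes by symmetry

estimate = acc / (batches * batch)
exact = 1.0 / np.cosh(lam * t / 2.0)
print(estimate, exact)
```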
In this note we shall show that Eq. 2 is an immediate consequence of the Cameron-Martin-Girsanov theorem combined with the solution formula for linear stochastic differential equations. This kind of proof does not only work in the 2-dimensional set-up considered by Lévy but extends naturally to dimensions $d\ge 2$. Thus we shall consider stochastic processes $(L_t^{A,x})$ which are specified by a $d$-dimensional Wiener process $(W_t)_{t\ge 0}$, $W_0 = 0$, a deterministic function $x(t)$ from $\mathbb{R}_+$ into $\mathbb{R}^d$ and one, $A(t)$, into the space of skew-symmetric matrices, viz.

(3) $L_t^{A,x} = \int_0^t \langle A(s)[W(s)+x(s)],\,dW(s)\rangle, \qquad t\ge 0,$

where the stochastic integral in (3) is defined in the sense of K. Itô and $\langle\cdot,\cdot\rangle$ denotes the Euclidean scalar product in $\mathbb{R}^d$.

As a special case take $x(t)\equiv 0$ and $A(t)\equiv J$,

$$J = \begin{pmatrix} 0 & -1\\ +1 & 0\end{pmatrix};$$

then the process $(\tfrac12 L_t^{J,0})$ is the same as Lévy's stochastic area $(L_t)$.
We shall study the stochastic process $(W_t, L_t^{A,x})$ and, for fixed $t>0$, calculate its characteristic function. The proof of the result is inspired by work of R. Liptser and A. Shiryayev [13, vol. 2, p. 12] on filtering of random processes.

As has been noticed by B. Gaveau [4], formulae like (1) and (2) yield estimates as well as explicit expressions for fundamental solutions for the generators of the diffusions $(W_t, L_t^{A,x})$, a class of hypoelliptic operators which naturally arises in some problems in analysis and geometry. So, in particular, when we choose $x(t)\equiv 0$, $A(t)\equiv A$, $A$ skew-symmetric, we thus give new proofs of some of the results, e.g. Theorems 4.2.1 and 4.3.1, obtained in [4] using the expansion of Brownian motion described above. By the method used in [2] we can also give a formula for the fundamental solution of the standard sub-Laplacian of any simply connected nilpotent Lie group of order 2. The expression found is further exploited in the special case of 'generalised Heisenberg groups', a class of nilpotent groups introduced by A. Kaplan [7]. By an elegant method Kaplan has shown that the standard sub-Laplacians of these Lie groups admit fundamental solutions analogous to that known for the Heisenberg group.

The following notation will be adopted throughout:

$|y|^2 := \langle y,y\rangle := \sum_i y_i^2$, $y\in\mathbb{C}^d$.

$H_z(t)$, $z\in\mathbb{C}$, $t\ge 0$, denotes the matrix solution to the differential equation

$$\dot H_z(t) = zA(t)H_z(t), \qquad H_z(0) = I_d.$$

$\Gamma_\Lambda(t)$, $\Lambda\in\mathbb{R}$, $t\in[0,T]$, $T>0$ fixed, denotes the unique symmetric non-positive definite matrix defined by the Riccati equation

$$\dot\Gamma_\Lambda(t) = \Lambda^2 A(t)A^*(t) - \Gamma_\Lambda^2(t), \qquad 0\le t\le T, \quad \Gamma_\Lambda(T) = 0.$$

$*$ indicates transposition of real- and complex-valued matrices.

$\Phi(t)$ denotes the unique solution to the matrix differential equation

$$\dot\Phi(t) = \Gamma_\Lambda(t)\Phi(t), \qquad \Phi(0) = I_d.$$

$u_t(z)$ is defined as

$$u_t(z) = z\int_0^t H_z^{-1}(s)A(s)x(s)\,ds.$$

2. THE JOINT CHARACTERISTIC FUNCTION

The main result of this note is the following formula for the joint characteristic function of the two random variables $X_t := W_t + x(t)$ and $L_t^{A,x}$, $t>0$. The proof, which is rather lengthy and will thus appear elsewhere, is based on Girsanov's measure transformation technique and analytic continuation of the function

$$z\ \mapsto\ E\exp\Big\{z\int_0^T\langle A(s)X_s,\,dW_s\rangle + i\langle\gamma, X_T\rangle\Big\},$$

defined on $(-c,+c)$, $c>0$ 'small', to the domain $\{z\in\mathbb{C} : |\mathrm{Re}(z)|\le c/2\}$.

Theorem 1. Let $A: [0,T]\to S_d$, $S_d$ the space of $d\times d$ skew-symmetric matrices, and $x: [0,T]\to\mathbb{R}^d$, $T>0$ fixed, be bounded measurable functions. Assume that for every $t\in[0,T]$, $z\in\mathbb{C}$ the matrices $H_z(t)$ and $A^2(t)$ commute, i.e.

(4) $H_z(t)A^2(t) = A^2(t)H_z(t)$.

Then for $\Lambda\in\mathbb{R}$ and $\gamma\in\mathbb{R}^d$

(5) $F(\Lambda,\gamma) := E\big[\exp\{i\Lambda L_T^{A,x} + i\langle\gamma,X_T\rangle\}\big]$
$$= \exp\Big[\,i\big\langle\gamma,\ x(T) + H_{i\Lambda}(T)u_T^{(i\Lambda)}\big\rangle - \frac{\Lambda^2}{2}\int_0^T\big|A(s)x(s) + A(s)H_{i\Lambda}(s)u_s^{(i\Lambda)}\big|^2ds$$
$$\qquad + \frac12\int_0^T\Big|\int_s^T\Phi^*(r)\,H_{i\Lambda}(r)A^2(r)\big[x(r) + u_r^{(i\Lambda)}\big]\,dr\Big|^2ds + \frac12\int_0^T \mathrm{sp}(\Gamma_\Lambda(s))\,ds\Big].$$
Remark 1. Condition (4) is satisfied if, for instance, A(t) ≡ A, A ∈ S_d, or A(t) is skew-symmetric and orthogonal for all t ∈ [0,T]; more generally, A(t) ∈ S_d and A²(t) = a(t)Id, a(t) a real-valued bounded measurable function. Note that for d = 2 the matrices J and J* are the only ones which are skew-symmetric and orthogonal; in dimension 3 there are none, but for d = 4, for instance, any matrix A having the representation

    A = (  B   +cJ
          −cJ    B ),

where B = √(1−c²)·O, |c| < 1 and O is an orthogonal matrix such that det(O) = ±1 (−1 if +c is chosen and vice versa), possesses both properties.

Formula (5) can be exploited in some special cases. Since u_t(z) ≡ 0 if x(t) ≡ 0, the next result follows immediately from (5).

Corollary 1. If x(t) ≡ 0, then

    F(Λ,γ) = exp{ ½ ∫₀ᵀ sp(Γ_Λ(s))ds − ½ ∫₀ᵀ |Φ*(s)γ|² ds }

and

    F(Λ,0) = exp{ ½ ∫₀ᵀ sp(Γ_Λ(s))ds }.

Corollary 2. If x(t) ≡ 0 and A(t) ≡ A, then

(6)  F(Λ,γ) = Π_{k=1}^{[d/2]} (1/cosh(Λta_k)) exp[ −((Oγ)²_{2k−1} + (Oγ)²_{2k}) tanh(Λta_k)/(2Λa_k) ],

where O is an orthogonal matrix such that O*AO is a skew-symmetric matrix formed from [d/2] diagonal blocks

    (  0   a_k
     −a_k   0  )

and the numbers a_k are simple algebraic functions of the entries of the matrix A.

Proof. Since A is constant we know explicitly H_{iΛ}(t), Γ_Λ(t) and Φ(t) up to multiplication with the orthogonal matrix O. Then by some lengthy but easy computation formula (6) is derived from (5).

Corollary 5. If x(t) =- x, A ( t ) =- A t h e n

[~12] I 2 2
F(A,y) = k=]]T coSh(Atak) exp - ] { ( 0 [ 7 + A A x ] ) 2 k _ ] + ( 0 [ ~ + A A x ] ) 2 k }.

tan;, (Atak) I - £xp(i<7,x>)


" A
J

Proof. The a s s e r t i o n follows from formula (6) if there we substitute 7


by (y + AAx). The formula can also be d e r i v e d d i r e c t l y from (5) taking
into account the formulae for HiA(t), FA(t) and ~(t).

Remark 2. Lévy's formula (1) follows from (6) since for this example a₁ = 1. Formula (2) is also derived from (6) by taking conditional expectation on the left hand side of Eq. (6) first and doing Fourier transformation on both sides after; multiplication by (2πt)exp(−|x|²/2t) then yields the assertion.
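Corollary 2 with d = 2, A = J and γ = 0 reduces to Lévy's formula E[exp(iΛL_t)] = 1/cosh(Λt), and this is easy to check numerically. The sketch below (plain Python, not part of the paper; the step count, sample size and seed are arbitrary choices) simulates planar Brownian motion, accumulates the discretized stochastic area, and compares the empirical value of E[cos(ΛL_t)] with 1/cosh(Λt). Since L_t is symmetric, the characteristic function is real and equals E[cos(ΛL_t)].

```python
import math
import random

def levy_area_sample(t, steps, rng):
    """One sample of the stochastic area L_t = int_0^t (W1 dW2 - W2 dW1),
    via a left-point (Ito) discretization on a uniform grid."""
    dt = t / steps
    sd = math.sqrt(dt)
    w1 = w2 = area = 0.0
    for _ in range(steps):
        dw1 = rng.gauss(0.0, sd)
        dw2 = rng.gauss(0.0, sd)
        area += w1 * dw2 - w2 * dw1   # integrand evaluated at the left endpoint
        w1 += dw1
        w2 += dw2
    return area

rng = random.Random(0)
lam, t, n = 1.0, 1.0, 10000
est = sum(math.cos(lam * levy_area_sample(t, 100, rng)) for _ in range(n)) / n
exact = 1.0 / math.cosh(lam * t)      # Levy's formula for A = J, gamma = 0
print(round(est, 3), round(exact, 3))
```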

3. FUNDAMENTAL SOLUTIONS

In this section we shall give two applications of Theorem 1, cf. also [2,4,5 and 7]. Let x(t) ≡ 0 and A(t) ≡ A; from now on we shall write L_t^A instead of L_t^{A,0}.

The stochastic process (Z_t) := (W_t, L_t^A) is the unique solution to the stochastic differential equation

    dZ_t = G_A(Z_t)dW_t,  Z_0 = 0,

where G_A(z), z = (θ,ξ) ∈ ℝ^d × ℝ, denotes the (d+1)×d-dimensional matrix

    G_A(z) = (  Id
              (Aθ)* ).

Put a(z) = G_A(z)G_A*(z); the generator of (Z_t) is thus given by

    Δ_A = ½ Σ_{i,j=1}^{d+1} a_{ij}(z) ∂²/(∂z_i ∂z_j).

Let p₀(τ;z), τ > 0, denote the fundamental solution to the equation

    (∂p/∂τ − Δ_A p) = 0  with pole at zero.

Taking the Fourier transform of Eq. (6) with respect to γ first we obtain the following result.

Theorem 2. Put n = [d/2]; then

    p₀(τ;(θ,ξ)) = (2πτ)^{−(1+d/2)} ∫_ℝ e^{iξρ} Π_{k=1}^{n} (ρa_k/sinh(ρa_k)) exp[ −(ρa_k/2τ)((Oθ)²_{2k−1} + (Oθ)²_{2k}) coth(ρa_k) ] dρ.

Since in dimension d ≥ 3 Brownian motion is not recurrent, we can derive from Theorem 2 a formula for the fundamental solution for the operator Δ_A. For d = 2 the integral can be easily calculated.

Corollary 4. The fundamental solution for Δ_A is given by

    q₀(z) = ∫₀^∞ p₀(τ;z)dτ.

If A = J, then

    q₀(z) = (2π)^{−1} / ((θ₁² + θ₂²)² + ξ²)^{1/2}.

Finally, let us consider a more general situation: instead of just one matrix A we take m ≥ 1 matrices A⁽¹⁾,...,A⁽ᵐ⁾ having the properties that each matrix A⁽ⁱ⁾ is skew-symmetric and orthogonal and that for every pair of indices (i,j), i ≠ j, the relation

    A⁽ⁱ⁾A⁽ʲ⁾ = −A⁽ʲ⁾A⁽ⁱ⁾

holds. We consider the stochastic process Z_t := (W_t, L_t⁽¹⁾,...,L_t⁽ᵐ⁾) where

    L_t⁽ⁱ⁾ := L_t^{A⁽ⁱ⁾},  1 ≤ i ≤ m.

Theorem 3. The characteristic function of Z_t, t > 0, γ ∈ ℝ^d, Λ ∈ ℝᵐ, is

    E[exp{i⟨(γ,Λ), Z_t⟩}] = cosh(Λ̄t)^{−[d/2]} exp{ −(|γ|²/2) tanh(Λ̄t)/Λ̄ },

where Λ̄ := |Λ|. The fundamental solution to (∂/∂t − Δ_{A⁽¹⁾,...,A⁽ᵐ⁾}) = 0 with pole at zero is given by

    p₀(τ;(θ,ξ)) = (2πτ)^{−(1+m+d/2)} ∫_{ℝᵐ} (Λ̄/sinh(Λ̄))^{[d/2]} exp[ i⟨ξ,Λ⟩ − (Λ̄/2τ)|θ|² coth(Λ̄) ] dΛ.

Proof. Let 0 ≠ Λ ∈ ℝᵐ be given. Put

    Ā := Λ̄⁻¹ Σ_{i=1}^{m} Λᵢ A⁽ⁱ⁾.

Since i⟨(γ,Λ), Z_t⟩ = i⟨γ,W_t⟩ + iΛ̄L_t^{Ā} and Ā is a skew-symmetric and orthogonal matrix, the first assertion follows from Corollary 2 since, in that case, all the numbers a_k are +1 or −1 only. The second formula is derived from the first one by taking Fourier transform on both sides of the equation.
REFERENCES

[1] BREIMAN, L.: Probability, Addison-Wesley, New York, 1968.

[2] CYGAN, J.: Heat kernels for class 2 nilpotent groups, Studia Math. 64 (1979), pp. 227-238.

[3] FRIEDMAN, A.: Stochastic differential equations and applications, Academic Press, New York, 1976.

[4] GAVEAU, B.: Principe de moindre action, propagation de la chaleur et estimées sous-elliptiques sur certains groupes nilpotents, Acta Math. 139 (1977), pp. 95-153.

[5] HULANICKI, A.: The distribution of energy in the Brownian motion in the Gaussian field and analytic-hypoellipticity of certain subelliptic operators on the Heisenberg group, Studia Math. 56 (1976), pp. 165-173.

[6] IKEDA, N. and WATANABE, S.: Stochastic differential equations and diffusion processes, North-Holland Publ., Amsterdam, 1981.

[7] KAPLAN, A.: Fundamental solutions for a class of hypoelliptic PDE generated by composition of quadratic forms, Trans. Amer. Math. Soc. 258 (1980), pp. 147-153.

[8] LEVY, P.: Le mouvement Brownien plan, Amer. Jour. Math. 62 (1940), pp. 487-550.

[9] LEVY, P.: Processus stochastiques et mouvement Brownien, Gauthier-Villars, Paris, 1948.

[10] LEVY, P.: Calcul des probabilités - fonctions aléatoires Laplaciennes, C. R. Acad. Sci. 229 (1949), pp. 1057-1058.

[11] LEVY, P.: Calcul des probabilités - sur l'aire comprise entre un arc de la courbe du mouvement Brownien plan et sa corde, C. R. Acad. Sci. 230 (1950), pp. 432-434; errata p. 689.

[12] LEVY, P.: Wiener's random function, and other Laplacian random functions, Proc. 2nd Berkeley Symp., pp. 171-187, 1951.

[13] LIPTSER, R. and SHIRYAYEV, A.: Statistics of random processes, vols. 1, 2, Springer-Verlag, New York, 1977.

[14] STROOCK, D. and VARADHAN, S.: Multidimensional diffusion processes, Springer-Verlag, New York, 1979.

[15] WILLIAMS, D.: On a stopped Brownian motion formula of H. M. Taylor, Séminaire de Probabilités X, Lect. Notes in Maths. 511, pp. 235-239, Springer-Verlag, Berlin, 1976.

[16] YOR, M.: Remarques sur une formule de Paul Lévy, Séminaire de Probabilités XIV, Lect. Notes in Maths. 784, pp. 343-346, Springer-Verlag, Berlin, 1980.
ASYMPTOTIC NONLINEAR FILTERING
AND LARGE DEVIATIONS

Omar Hijab

Mathematics and Statistics


Case Western Reserve University
Cleveland, Ohio 44106

0. Introduction. Consider a diffusion t → x^ε(t) evolving on ℝⁿ and governed by a generator of the form

    A_ε = f + (ε/2)(g₁² + ... + g_m²)  (1)

corresponding to a given set of vector fields f, g₁,...,g_m on ℝⁿ. It is of interest to study the asymptotic behavior of the probability distributions P_ε on Ωⁿ ≡ C([0,T];ℝⁿ) of the diffusions t → x^ε(t) as ε → 0. It turns out that the asymptotic properties of P_ε depend strongly on properties of the associated control system

    ẋ = f(x) + g₁(x)u₁ + ... + g_m(x)u_m.  (2)

Indeed, it turns out that in some sense

    P_ε(dx(·)) = exp(−(1/2ε) ∫₀ᵀ u(t)²dt)dx(·)

as ε ↓ 0. More precisely, suppose that the diffusions t → x^ε(t) satisfy x^ε(0) = x₀ almost surely and suppose that to each u in L²([0,T];ℝᵐ) there is a well-defined solution x_u of (2) in Ωⁿ satisfying x_u(0) = x₀. Then the asymptotic behavior of P_ε is given by the following estimates: For any open set G in Ωⁿ and closed set C in Ωⁿ,

    lim inf_{ε→0} ε log P_ε(G) ≥ −inf{½ ∫₀ᵀ u²dt | x_u in G}
                                                              (3)
    lim sup_{ε→0} ε log P_ε(C) ≤ −inf{½ ∫₀ᵀ u²dt | x_u in C}.

In 1966 S.R.S. Varadhan set down a general framework [1] for dealing with the asymptotic behavior of families of measures and certain associated expectations, and in particular derived the above estimates for processes with independent increments [1]. Subsequently, he derived these estimates for the case of drift-free nondegenerate diffusions (i.e., f = 0) [2]. Later Glass [3] and Ventsel and Freidlin [4] established these estimates for nondegenerate diffusions with drift.

In 1978 Azencott [5] established these estimates in a general case; Azencott's results imply that if f, g₁,...,g_m are C², if for each ε > 0 there is a solution to the martingale problem on Ωⁿ corresponding to A_ε and if for each u in L²([0,T];ℝᵐ) the solution x_u of (2) starting at x₀ exists in Ωⁿ, then the above estimates hold.
Suppose that the diffusions t → x^ε(t) are observed in the presence of an independent Brownian motion t → b(t),

    y^ε(t) = ∫₀ᵗ h(x^ε(s))ds + √ε b(t),  t ≥ 0,

where h: ℝⁿ → ℝᵖ is a given map. Then the unnormalized conditional distribution Q^ε_{x|y} of t → x^ε(t) given t → y^ε(t) is well-defined. In this paper we show that if h is C³ and h, f(h), g₁(h),...,g_m(h), g₁²(h),...,g_m²(h) are all bounded on ℝⁿ then for any open set G in Ωⁿ and closed set C in Ωⁿ,

    lim inf_{ε→0} ε log Q^ε_{x|y}(G) ≥ −inf{½ ∫₀ᵀ u² + h(x_u)²dt − ∫₀ᵀ h(x_u)dy | x_u in G}
                                                                                            (4)
    lim sup_{ε→0} ε log Q^ε_{x|y}(C) ≤ −inf{½ ∫₀ᵀ u² + h(x_u)²dt − ∫₀ᵀ h(x_u)dy | x_u in C}

for almost all y in Ωᵖ ≡ C([0,T];ℝᵖ).

1. Large Deviations. Throughout, Ωⁿ will denote C([0,T];ℝⁿ) with Ωᵐ and Ωᵖ defined analogously, where [0,T] is a fixed time interval. The topology on Ωⁿ is that of uniform convergence on [0,T]. We suppose we are given

(i) C² vector fields g₁,...,g_m on ℝⁿ and a time-varying vector field f in C^{0,2}([0,T] × ℝⁿ, ℝⁿ).

If g is any vector field on ℝⁿ, let g(φ)(x) denote the directional derivative of φ in the direction of g at the point x. The vector field g can then be thought of as a first order differential operator taking φ in C^∞(ℝⁿ) to g(φ). If g²(φ) is short for g(g(φ)), then (1) defines a second order (possibly time-varying) differential operator A_ε. Let C₀^∞(ℝⁿ) denote the space of smooth compactly supported functions on ℝⁿ.

Let b(t): Ωᵐ → ℝᵐ be given by b(t,ω) = ω(t) and impose Wiener measure on Ωᵐ. Then t → b(t) = (b₁(t),...,b_m(t)) is an ℝᵐ-valued Brownian motion. One way to construct diffusions on ℝⁿ governed by A_ε is to pick a point x₀ in ℝⁿ and to let t → x^ε(t) be the unique process Ωᵐ → Ωⁿ satisfying

    φ(x^ε(t)) − φ(x^ε(s)) − ∫ₛᵗ A_ε(φ)(x^ε(r))dr = √ε ∫ₛᵗ g(φ)(x^ε(r))db(r)  (5)

for all φ in C₀^∞(ℝⁿ), 0 ≤ s ≤ t ≤ T, and x^ε(0) = x₀, almost surely on Ωᵐ. Here g(φ)db is short for g₁(φ)db₁ + ... + g_m(φ)db_m where g_i(φ) is defined above.
Using the standard existence and uniqueness theorem for stochastic differential equations and Ito's differential rule, it is easy to show that there is a unique such process defined up to an explosion time ζ_ε ≤ ∞ characterized by the fact that t → x^ε(t) leaves every compact subset of ℝⁿ as t → ζ_ε, almost surely on {ζ_ε < ∞}.

The merit of the above definition of t → x^ε(t) is that it makes sense on any manifold X. Indeed, the Whitney embedding theorem allows one to embed any such X into some ℝᴺ and by extending f, g₁,...,g_m to ℝᴺ one can derive the result described above on any manifold. Of course in ℝⁿ, t → x^ε(t) is the "Stratonovich solution". In any event, as ε → 0 the "correction factor" disappears and so estimates (3) are expected to hold just as well for the diffusions t → x^ε(t) constructed here. In what follows we are careful to state everything in such a way as to make sense on any manifold X.
If T < ζ_ε then the probability distribution P_ε of t → x^ε(t) exists on Ωⁿ and is the unique probability measure on Ωⁿ satisfying P_ε(x(0) = x₀) = 1 and

    E_ε(φ(x(t)) − φ(x(s)) − ∫ₛᵗ A_ε(φ)(x(r))dr | F_s) = 0  (6)

for all φ in C₀^∞(ℝⁿ) and 0 ≤ s ≤ t ≤ T. Here x(t): Ωⁿ → ℝⁿ is the canonical map and F_s is the σ-algebra generated by the maps x(r), 0 ≤ r ≤ s.

Conversely, if one assumes that

(ii) for each ε > 0 there is a probability measure P_ε on Ωⁿ satisfying P_ε(x(0) = x₀) = 1 and (6) for all φ in C₀^∞(ℝⁿ) and 0 ≤ s ≤ t ≤ T,

then one can show that the solution t → x^ε(t) of (5) explodes after time T, i.e., ζ_ε ≥ T, almost surely.

In what follows we shall assume (ii) and

(iii) to each u in L²([0,T];ℝᵐ) there is a path x_u in Ωⁿ satisfying (2) and x_u(0) = x₀. In other words, the solution of (2) starting at x₀ has escape time greater than or equal to T, for all u in L².

Under assumptions (i), (ii) and (iii), estimates (3) hold for the measures {P_ε} constructed here [5]. To understand these estimates from a more general perspective consider the following definition [1].

Definition. Let Ω be a completely regular topological space and let P_ε, ε > 0, be a family of probability measures on Ω. We say that {P_ε} admits large deviation if there is a function I on Ω satisfying

(i) 0 ≤ I ≤ +∞.
(ii) I is lower semicontinuous on Ω.
(iii) {ω | I(ω) ≤ M} is a compact subset of Ω for all finite M.
(iv) For any open set G in Ω, lim inf_{ε→0} ε log P_ε(G) ≥ −inf{I(ω) | ω in G}.
(v) For any closed set C in Ω, lim sup_{ε→0} ε log P_ε(C) ≤ −inf{I(ω) | ω in C}.

The function I is then referred to as the corresponding "I-functional".

Estimates (3) then state that (iv) and (v) hold for the probability distributions of the diffusions t → x^ε(t), where I is given by

    I(ω) = inf{½ ∫₀ᵀ u(t)²dt | x_u = ω}

for all ω in Ωⁿ, with the understanding that the infimum of an empty set of real numbers is +∞. Since (ii) is easy to derive and (iii) is the statement that u → x_u is a compact map from L²([0,T];ℝᵐ) into Ωⁿ, we have

Theorem 1.1. The probability distributions {P_ε} corresponding to A_ε admit large deviation as ε → 0.

A consequence of the above abstract definition is the following theorem, which is a summary of results appearing in section 3 of [1].

Theorem 1.2. Let {P_ε} admit large deviation with corresponding I-functional I and let Φ_ε be a bounded continuous function on Ω such that Φ_ε converges uniformly to Φ as ε → 0. Let Q_ε be given by

    dQ_ε = e^{−Φ_ε/ε} dP_ε.

Then {Q_ε} satisfies

    lim inf_{ε→0} ε log Q_ε(G) ≥ −inf{I(ω) + Φ(ω) | ω in G}

    lim sup_{ε→0} ε log Q_ε(C) ≤ −inf{I(ω) + Φ(ω) | ω in C}

for G open in Ω and C closed in Ω.

We note that for the results of theorem 1.2 to hold it is not necessary that Φ_ε be bounded: What is required is that the tail estimate

    lim_{R→∞} lim sup_{ε→0} ε log E_ε(1[−Φ_ε ≥ R] × exp(−Φ_ε/ε)) = −∞

holds [1].
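In the simplest case n = m = 1, f = 0 and g₁ = ∂/∂x, the control system (2) gives x_u(t) = ∫₀ᵗ u(s)ds, so I(ω) = ½∫₀ᵀ ω̇(t)²dt for absolutely continuous ω starting at x₀ = 0 (and +∞ otherwise). The sketch below (plain Python, not part of the paper; the two sample paths are illustrative choices) evaluates the discretized action for two paths with the same endpoint, and shows that the straight line has the smaller I, as the quadratic cost suggests. Paths of small action are the ones that dominate the estimates (3) as ε → 0.

```python
import math

def action(path, T=1.0):
    """Discretized I-functional I(w) = 0.5 * int_0^T (dw/dt)^2 dt
    for a path sampled at equally spaced times on [0, T]."""
    n = len(path) - 1
    dt = T / n
    return 0.5 * sum(((path[k + 1] - path[k]) / dt) ** 2 * dt for k in range(n))

n, T, endpoint = 100, 1.0, 1.0
line = [endpoint * k / n for k in range(n + 1)]   # straight path from 0 to 1
bump = [endpoint * k / n + 0.3 * math.sin(math.pi * k / n) for k in range(n + 1)]

print(action(line, T), action(bump, T))   # the straight line has smaller action
```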

2. Nonlinear Filtering. Let h: ℝⁿ → ℝᵖ be a locally bounded measurable map and let t → b(t) denote an ℝᵖ-valued Brownian motion independent of the given processes t → x^ε(t) on ℝⁿ. Let

    y^ε(t) = ∫₀ᵗ h(x^ε(s))ds + √ε b(t),  0 ≤ t ≤ T.  (7)

In this section, we study the conditional distribution P^ε_{x|y} on Ωⁿ of t → x^ε(t) given t → y^ε(t). We shall use Bayes' rule to compute P^ε_{x|y}.

Let W denote Wiener measure on Ωᵖ, let P^ε_x denote the probability distribution of t → x^ε(t) on Ωⁿ, let P^ε_y denote the probability distribution of t → y^ε(t) on Ωᵖ, let P^ε_{(x,y)} denote the probability distribution of t → (x^ε(t), y^ε(t)) on Ωⁿ × Ωᵖ and let P^ε_{x|y} denote the conditional distribution of t → x^ε(t) on Ωⁿ given t → y^ε(t). Let y(t): Ωᵖ → ℝᵖ denote the canonical map. Let W_ε denote the Wiener measure on Ωᵖ "of variance ε".

For 0 ≤ t ≤ T set

    Λ(t) = ½ ∫₀ᵗ h(x(s))²ds − ∫₀ᵗ h(x(s))dy(s).

Λ(t) is then a measurable function on Ωⁿ × Ωᵖ for each t. Using (7) and invoking the Cameron-Martin formula it is easy to see that

    dP^ε_{(x,y)} = e^{−Λ/ε} d(P^ε_x × W_ε)

where Λ = Λ(T). Here and elsewhere, h² = h₁² + ... + h_p², hdy = h₁dy₁ + ... + h_pdy_p, etc.

Using Bayes' rule, the conditional distribution is given by

    dP^ε_{x|y} = [dP^ε_{(x,y)} / d(P^ε_x × P^ε_y)] · dP^ε_x
              = e^{−Λ/ε} dP^ε_x / E^ε_x(e^{−Λ/ε}).  (8)

Equation (8) is the formula of Kallianpur-Striebel [6]. We rewrite it as

    dP^ε_{x|y} = dQ^ε_{x|y} / Q^ε_{x|y}(Ωⁿ),

and refer to Q^ε_{x|y} as the unnormalized conditional distribution.

So far equation (8) holds for any process t → x^ε(t). Now suppose that P^ε_x is governed by A_ε in the sense of equation (6), where A_ε is given by (1). For any bounded measurable φ let

    v^ε_t(φ) ≡ E^{P^ε_x}(φ(x(t))exp(−Λ(t)/ε)),  (9)

the "unnormalized conditional expectation of φ(x^ε(t)) given y^ε(s), 0 ≤ s ≤ t". We derive the equation governing the time evolution of v^ε_t(φ).

Ito's rule guarantees that z^ε(t) = exp(−Λ(t)/ε) satisfies

    z^ε(t) − z^ε(s) = (1/ε) ∫ₛᵗ h(x(r))z^ε(r)dy(r)

for 0 ≤ s ≤ t ≤ T. This last equation together with equations (6) and (8) and the Ito product rule then yield

Theorem 2.1. For all φ in C₀^∞(ℝⁿ) and 0 ≤ s ≤ t ≤ T,

    v^ε_t(φ) − v^ε_s(φ) = ∫ₛᵗ v^ε_r(A_ε(φ))dr + (1/ε) ∫ₛᵗ v^ε_r(hφ)dy(r).

We emphasize that this proof is valid for any locally bounded measurable h and any generator A_ε of the form (1). This equation is well-known and appears in various forms in the literature.

In the next section we study the asymptotic behaviour of Q^ε_{x|y} as ε → 0.
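The Kallianpur-Striebel formula (8) also suggests a direct Monte Carlo scheme: sample independent copies of the signal under P^ε_x, weight each by exp(−Λ/ε), and normalize. The sketch below (plain Python, not part of the paper; the scalar signal with f = 0 and g₁ = ∂/∂x, the observation function h(x) = x, and all numerical parameters are illustrative assumptions) estimates the normalized conditional mean of x^ε(T) from one simulated observation record. As ε shrinks, the weights concentrate on few paths, which is one way to see why the ε → 0 behaviour of Q^ε_{x|y} is governed by an extremal problem.

```python
import math
import random

random.seed(1)
eps, T, steps, npart = 0.05, 1.0, 100, 2000
dt = T / steps
sdt = math.sqrt(dt)

h = lambda x: x   # observation function (an illustrative choice)

# "True" signal and its observation increments dy = h(x)dt + sqrt(eps) db
x_true, dy = 0.0, []
for _ in range(steps):
    x_true += math.sqrt(eps) * random.gauss(0.0, sdt)
    dy.append(h(x_true) * dt + math.sqrt(eps) * random.gauss(0.0, sdt))

# Independent signal paths, each weighted by exp(-Lambda/eps) as in (8)
num = den = 0.0
for _ in range(npart):
    x, lam = 0.0, 0.0
    for k in range(steps):
        lam += 0.5 * h(x) ** 2 * dt - h(x) * dy[k]   # discretized Lambda(t)
        x += math.sqrt(eps) * random.gauss(0.0, sdt)
    w = math.exp(-lam / eps)
    num += w * x
    den += w

estimate = num / den   # normalized conditional mean of x(T)
print(round(estimate, 3), round(x_true, 3))
```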

3. Asymptotic Filtering. In this section, we shall assume (i), (ii), (iii) and

(iv) h is C³ and f(h), g_i(h), g_i²(h) and h, i = 1,...,m, are bounded on [0,T] × ℝⁿ.

Let I be the I-functional given by theorem 1.1,

    I(ω) = inf{½ ∫₀ᵀ u(t)²dt | x_u = ω}.

In this section we shall prove

Theorem 3.1. Let Q^ε_{x|y} denote the unnormalized conditional distribution on Ωⁿ. Then estimates (4) hold for almost all y in Ωᵖ.

Note that for h = 0 this theorem reduces to estimates (3). The idea of the proof is simple enough; apply theorem 1.2 to Q^ε_{x|y} using the representation given by equation (8). This however does not work directly because the exponent Λ is not a continuous function on Ωⁿ for each y in Ωᵖ. We therefore have to make a slight detour and integrate by parts the stochastic integral appearing in Λ.

For each ε ≥ 0 let Φ_ε on Ωⁿ be given by

    Φ_ε(ω) = −y(T)h(ω(T)) + y(0)h(ω(0)) + ∫₀ᵀ [y A_ε(h)(ω) + ½ h(ω)² − ½ y² g(h)(ω)²] dt.

Then Φ_ε → Φ₀ as ε → 0 uniformly on Ωⁿ for each y in Ωᵖ. Referring to (8) and performing an integration by parts in the stochastic integral appearing in Λ and invoking Girsanov's theorem we see that

    dP^ε_{x:y} ≡ e^{Φ_ε/ε} dQ^ε_{x|y}  (10)

satisfies equation (6) with A_ε replaced by

    A_ε − y g₁(h)g₁ − ... − y g_m(h)g_m.

We wish to apply theorem 1.1 to {P^ε_{x:y}}. To do so we must check that assumptions (i), (ii), (iii) of section 1 hold for the vector fields

    f_y = f − y g₁(h)g₁ − ... − y g_m(h)g_m,  g₁,...,g_m

for all y in Ωᵖ, given that they hold for y = 0. For (i) this is obvious. For (ii) this is also obvious and for (iii) this is so because g₁(h),...,g_m(h) are bounded feedback terms. Thus theorem 1.1 applies to {P^ε_{x:y}} and hence theorem 1.2 applies to Q^ε_{x|y} via equation (10).

Thus let x_{u:y} denote the unique path in Ωⁿ satisfying

    ẋ = f_y(x) + g₁(x)u₁ + ... + g_m(x)u_m  and  x(0) = x₀.

Let I_y be the I-functional corresponding to P^ε_{x:y} according to theorem 1.1:

    I_y(ω) = inf{½ ∫₀ᵀ u(t)²dt | x_{u:y} = ω}.

Then theorem 1.2 implies that for any G open and C closed in Ωⁿ

    lim inf_{ε→0} ε log Q^ε_{x|y}(G) ≥ −inf{I_y(ω) + Φ(ω) | ω in G}
                                                                     (11)
    lim sup_{ε→0} ε log Q^ε_{x|y}(C) ≤ −inf{I_y(ω) + Φ(ω) | ω in C}.

Now a little algebraic manipulation in (11) using the fact that

    I_y(ω) = inf{½ ∫₀ᵀ (u + y g(h)(ω))²dt | x_u = ω}

yields theorem 3.1. Applications of theorem 3.1 will appear elsewhere.

REFERENCES

[1] S.R.S. Varadhan, "Asymptotic Probabilities and Differential Equations," Comm. Pure & Applied Math., Vol. XIX, 261-286 (1966).

[2] S.R.S. Varadhan, "Diffusion Processes in a Small Time Interval," Comm. Pure & Applied Math., Vol. XX, 659-685 (1967).

[3] M. Glass, "Perturbation of a First Order Equation by a Small Diffusion," Ph.D. Dissertation, New York University, 1969.

[4] A.D. Ventsel and M.I. Freidlin, "Small Random Perturbations of Dynamical Systems," Russian Math. Surveys, 25 (1970) 1-56 [Uspehi Mat. Nauk. 25 (1970) 3-55].

[5] R. Azencott, Lecture Notes in Math. #774, Springer 1978.

[6] G. Kallianpur and C. Striebel, "Estimation of Stochastic Processes," Annals Math. Statistics, 39 (1968) 785-801.
Representation and approximation of counting processes

Thomas G. Kurtz
Department of Mathematics
University of Wisconsin-Madison
Madison, Wisconsin 53706 USA

1. Introduction

By a counting process we mean a stochastic process N whose sample paths are constant except for jumps of +1. The simplest example is, of course, the Poisson process. Recall that the distribution of the Poisson process is determined by specifying the intensity parameter λ which gives

(1.1) P{N(t + Δt) − N(t) > 0 | F_t} = λΔt + o(Δt)

where F_t is the history for the process up to time t, i.e. F_t = σ(N(s): s ≤ t). More general counting processes are determined by specifying an intensity function λ(t,N) which, as in (1.1), gives

(1.2) P{N(t + Δt) − N(t) > 0 | F_t} = λ(t,N)Δt + o(Δt).

Of course for (1.2) to make sense, λ(t,N) can depend only on the values of N up to time t. To be precise, let ℤ₊ be the nonnegative integers and ℤ₊^∞ = ℤ₊ ∪ {+∞}. (Topologically, think of ℤ₊^∞ as being the one-point compactification of ℤ₊.) Let J[0,∞) denote the right continuous, nondecreasing ℤ₊^∞-valued functions x such that x(0) = 0 and x(t) − x(t−) is 0 or 1. In particular if x(t) = ∞, then x(s) = ∞ for s > t. (We give J[0,∞) the Skorohod topology when a topology is needed.) Let τ_n(x) denote the time of the n-th jump of x and define x^t by x^t(s) = x(s ∧ t). A (Borel) measurable function λ: [0,∞) × J[0,∞) → [0,∞) is an intensity function if for all x ∈ J[0,∞)

(1.3) λ(t,x) = λ(t,x^t),  t ≥ 0,

and

(1.4) ∫₀^{τ_m(x)} λ(t,x)dt < ∞,  m = 1, 2, 3, ...
Given an intensity function λ, the problem then becomes to associate with it a counting process N satisfying (1.2). There are a variety of ways of accomplishing this. Here we will specify a stochastic equation for which N is the unique solution. For other approaches see the books by Bremaud (1981) and Snyder (1975). All these approaches are essentially equivalent. This equivalence is discussed in Kurtz (1982).

Let Y be a Poisson process with parameter 1. Then the equation for the counting process N corresponding to a given λ is

(1.5) N(t) = Y(∫₀ᵗ λ(s,N)ds).

Existence and uniqueness of the solution follows by using (1.3) and (1.4) to solve the equation "from one jump to the next". This is discussed in detail in Kurtz (1982). The uniqueness implies ∫₀ᵗ λ(s,N)ds is a stopping time for Y and, observing that on the set where N(t + Δt) = N(t)

(1.6) ∫₀^{t+Δt} λ(s,N)ds = ∫₀^{t+Δt} λ(s,N(· ∧ t))ds,

we have, on the event {N(t) < ∞},

(1.7) P{N(t + Δt) − N(t) > 0 | F_t}
    = 1 − P{N(t + Δt) − N(t) = 0 | F_t}
    = 1 − P{Y(∫₀^{t+Δt} λ(s,N(· ∧ t))ds) − Y(∫₀ᵗ λ(s,N(· ∧ t))ds) = 0 | F_t}
    = 1 − exp{−∫ₜ^{t+Δt} λ(s,N(· ∧ t))ds},

which is a precise version of (1.2). The fact that ∫₀ᵗ λ(s,N)ds is a stopping time also gives us the relation between the stochastic equation (1.5) and the martingale approach described in Bremaud (1981). Since Y(u) − u is a martingale, the optional sampling theorem implies

(1.8) N(t ∧ τ_m) − ∫₀^{t∧τ_m} λ(s,N)ds = Y(∫₀^{t∧τ_m} λ(s,N)ds) − ∫₀^{t∧τ_m} λ(s,N)ds

is a martingale, where τ_m = τ_m(N) is the time of the m-th jump of N.
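Solving (1.5) "from one jump to the next" is also an exact simulation recipe: the jumps of Y are the points of a unit Poisson process, so the next jump of N occurs when the accumulated intensity ∫₀ᵗ λ(s,N)ds reaches the next standard exponential variable. The sketch below (plain Python, not from the paper; the linear intensity λ(t,N) = a + bN(t−) and all parameter values are illustrative assumptions) simulates such a process and checks the empirical mean count against the ODE m′ = a + bm that this intensity implies for m(t) = E N(t).

```python
import math
import random

def simulate(a, b, t_max, rng):
    """Counting process N with intensity lam(t, N) = a + b*N(t-),
    built by solving N(t) = Y(int_0^t lam ds) one jump at a time:
    between jumps lam is constant, so each waiting time is Exp(lam)."""
    t, n = 0.0, 0
    while True:
        lam = a + b * n
        t += rng.expovariate(lam)   # time until the next jump of Y is reached
        if t > t_max:
            return n
        n += 1

rng = random.Random(2)
a, b, t_max, reps = 1.0, 0.5, 1.0, 20000
mean = sum(simulate(a, b, t_max, rng) for _ in range(reps)) / reps
exact = (a / b) * math.expm1(b * t_max)   # m(t) = (a/b)(e^{bt} - 1), m(0) = 0
print(round(mean, 2), round(exact, 2))
```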
Equations of the form (1.5) also can be specified for systems of counting processes. Here we require that Y₁, Y₂, ... are independent Poisson processes with unit intensity. Letting N = (N₁, N₂, ...), we have the system of equations

(1.9) N_k(t) = Y_k(∫₀ᵗ λ_k(s,N)ds)

where for each k, λ_k: [0,∞) × (J[0,∞))^d → [0,∞) (d may be finite or infinite) and for each x ∈ (J[0,∞))^d and each k

(1.10) λ_k(t,x) = λ_k(t,x^t)

and

(1.11) ∫₀^{τ_m(x)} Σ_k λ_k(t,x)dt < ∞.

Here

(1.12) τ_m(x) = inf{t: Σ_k x_k(t) ≥ m}.
Examples

(a) Counter model. Let p ≥ 0 and lim_{u→∞} p(u) = 0. The equation

    N(t) = Y(∫₀ᵗ λ exp{−∫₀^{s−} p(s−r)dN(r)} ds)

models the number of counts registered on a counter in a Poisson stream of particles, where the sensitivity of the counter is reduced by each count but recovers in time.

(b) Birth and death process in a random environment. We can easily introduce "external" randomness. Let A, B, and C be positive stochastic processes independent of the (independent) Poisson processes Y₁, Y₂. Then the equation

(1.13) Z(t) = Z(0) + Y₁(∫₀ᵗ A(s)Z(s)ds) − Y₂(∫₀ᵗ (B(s)Z(s) + C(s)Z(s)²)ds)

determines a birth and death process in a random environment. Note that the counting processes are just the number of births

(1.14) N₁(t) = Y₁(∫₀ᵗ A(s)Z(s)ds)

and the number of deaths

(1.15) N₂(t) = Y₂(∫₀ᵗ (B(s)Z(s) + C(s)Z(s)²)ds).

Since Z = Z(0) + N₁ − N₂, except for the additional complication of random coefficients, (1.14) and (1.15) form a system of the type in (1.9).
(c) Markov chain in ℤ^d. Let

(1.16) X(t) = X(0) + Σ_{ℓ∈ℤ^d} ℓ Y_ℓ(∫₀ᵗ β_ℓ(X(s))ds)

where the Y_ℓ are independent Poisson processes and Σ_ℓ β_ℓ(k) < ∞, k ∈ ℤ^d. Here the counting processes

(1.17) N_ℓ(t) = Y_ℓ(∫₀ᵗ β_ℓ(X(s))ds)

count the jumps of X of type ℓ, and X is the minimal Markov chain associated with the intensities q_{k,k+ℓ} = β_ℓ(k). See Karlin (1966), page 228.
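Representation (1.16) translates directly into jump-by-jump simulation: at state k the total rate is Σ_ℓ β_ℓ(k), the holding time is exponential with that rate, and the jump ℓ is chosen with probability β_ℓ(k)/Σ_ℓ β_ℓ(k). The sketch below (plain Python, not from the paper; the symmetric random walk rates β_{±1}(k) = 1/2 are an illustrative choice) simulates such a chain on ℤ and checks the mean and variance of X(t), which for this walk are 0 and t.

```python
import random

def simulate_chain(beta, x0, t_max, rng):
    """Minimal Markov chain with jump intensities q_{k,k+l} = beta_l(k),
    simulated jump by jump; `beta` maps a state k to a dict {jump l: rate}."""
    t, x = 0.0, x0
    while True:
        rates = beta(x)
        total = sum(rates.values())
        if total == 0.0:
            return x                    # absorbing state
        t += rng.expovariate(total)
        if t > t_max:
            return x
        u = rng.random() * total        # pick jump l with prob. beta_l(x)/total
        for l, r in rates.items():
            u -= r
            if u <= 0.0:
                x += l
                break

rng = random.Random(3)
walk = lambda k: {+1: 0.5, -1: 0.5}     # symmetric random walk on Z
samples = [simulate_chain(walk, 0, 4.0, rng) for _ in range(5000)]
mean = sum(samples) / len(samples)
var = sum(s * s for s in samples) / len(samples) - mean ** 2
print(round(mean, 2), round(var, 2))    # roughly 0 and t_max respectively
```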

(d) Controlled counting process. The intensity λ(s,x,u) may depend on a control parameter u. Then the controlled counting process is given by

(1.18) N(t) = Y(∫₀ᵗ λ(s,N,u(s,N))ds)

provided u(s,x) = u(s,x^s) (cf. (1.3)).

In Sections 2 and 3 we use the stochastic equations described above to prove limit theorems for counting processes. In Section 4, in order to give another example of the type of argument used in Section 3, we consider the asymptotics of a simple fiber bundle model. The model is expressed as a solution of a stochastic equation similar to (1.5) but using the empirical process rather than the Poisson process.

2. Continuous dependence on λ

In this section we use the stochastic equations to show that the distribution of a counting process depends continuously on its intensity in a very strong sense.

2.1 Theorem. Let λ, λ⁽ⁿ⁾, n = 1, 2, ..., be intensity functions, N the counting process corresponding to λ and N⁽ⁿ⁾ the counting process corresponding to λ⁽ⁿ⁾. If for each T > 0, m, and x ∈ J[0,∞)

(2.1) lim_{n→∞} ∫₀^{τ_m(x)∧T} |λ⁽ⁿ⁾(s,x) − λ(s,x)|ds = 0,

then N⁽ⁿ⁾ ⇒ N.

Proof. We actually prove a stronger result than that stated. Furthermore we use a different representation of N⁽ⁿ⁾ than that given by (1.5). Specifically, let Y₁, Y₂ be independent Poisson processes with intensity one and let N⁽ⁿ⁾ satisfy

(2.2) N⁽ⁿ⁾(t) = Y₁(∫₀ᵗ λ∧λ⁽ⁿ⁾(s,N⁽ⁿ⁾)ds) + Y₂(∫₀ᵗ (λ⁽ⁿ⁾ − λ∧λ⁽ⁿ⁾)(s,N⁽ⁿ⁾)ds)

and

(2.3) Ñ⁽ⁿ⁾(t) = Y₁(∫₀ᵗ λ∧λ⁽ⁿ⁾(s,Ñ⁽ⁿ⁾)ds) + Y₂(∫₀ᵗ (λ − λ∧λ⁽ⁿ⁾)(s,Ñ⁽ⁿ⁾)ds).

Note that λ⁽ⁿ⁾ = λ∧λ⁽ⁿ⁾ + (λ⁽ⁿ⁾ − λ∧λ⁽ⁿ⁾) and λ = λ∧λ⁽ⁿ⁾ + (λ − λ∧λ⁽ⁿ⁾), and it follows by the multiparameter optional sampling theorem, Kurtz (1980), that

(2.4) N⁽ⁿ⁾(t ∧ τ_m(N⁽ⁿ⁾)) − ∫₀^{t∧τ_m(N⁽ⁿ⁾)} λ⁽ⁿ⁾(s,N⁽ⁿ⁾)ds

and

(2.5) Ñ⁽ⁿ⁾(t ∧ τ_m(Ñ⁽ⁿ⁾)) − ∫₀^{t∧τ_m(Ñ⁽ⁿ⁾)} λ(s,Ñ⁽ⁿ⁾)ds
are martingales. Consequently, by the martingale characterization of counting processes (see Bremaud (1981)), N⁽ⁿ⁾ has intensity λ⁽ⁿ⁾ and Ñ⁽ⁿ⁾ has intensity λ. In particular all the Ñ⁽ⁿ⁾ have the same distribution, namely that of N satisfying

(2.6) N(t) = Y(∫₀ᵗ λ(s,N)ds).

Note that N⁽ⁿ⁾(t) = Ñ⁽ⁿ⁾(t) for

(2.7) t < γ_n ≡ inf{t: Y₂(∫₀ᵗ (λ⁽ⁿ⁾ − λ∧λ⁽ⁿ⁾)(s,N⁽ⁿ⁾)ds) > 0
          or Y₂(∫₀ᵗ (λ − λ∧λ⁽ⁿ⁾)(s,Ñ⁽ⁿ⁾)ds) > 0}.

Consequently for any Borel set Γ ⊂ J[0,∞)

(2.8) |P{N(·∧τ_m(N)∧T) ∈ Γ} − P{N⁽ⁿ⁾(·∧τ_m(N⁽ⁿ⁾)∧T) ∈ Γ}|
    = |P{Ñ⁽ⁿ⁾(·∧τ_m(Ñ⁽ⁿ⁾)∧T) ∈ Γ} − P{N⁽ⁿ⁾(·∧τ_m(N⁽ⁿ⁾)∧T) ∈ Γ}|
    ≤ P{Ñ⁽ⁿ⁾(t) ≠ N⁽ⁿ⁾(t), some t ≤ τ_m(N⁽ⁿ⁾)∧T}
    ≤ P{γ_n ≤ τ_m(N⁽ⁿ⁾)∧T}
    ≤ P{Y₂(∫₀^{T∧τ_m(N⁽ⁿ⁾)} |λ − λ⁽ⁿ⁾|(s,N⁽ⁿ⁾)ds) > 0}
    ≤ E[1 − exp{−∫₀^{T∧τ_m(N⁽ⁿ⁾)} |λ(s,N⁽ⁿ⁾) − λ⁽ⁿ⁾(s,N⁽ⁿ⁾)|ds}]
    = E[1 − exp{−∫₀^{T∧τ_m(N)} |λ(s,N) − λ⁽ⁿ⁾(s,N)|ds}].

Note that since Ñ⁽ⁿ⁾(t) = N⁽ⁿ⁾(t) for t < γ_n,

    Y₂(∫₀^{T∧τ_m(N⁽ⁿ⁾)} (λ⁽ⁿ⁾ − λ∧λ⁽ⁿ⁾)(s,N⁽ⁿ⁾)ds) > 0

if and only if

    Y₂(∫₀^{T∧τ_m(Ñ⁽ⁿ⁾)} (λ⁽ⁿ⁾ − λ∧λ⁽ⁿ⁾)(s,Ñ⁽ⁿ⁾)ds) > 0.

Finally, as n → ∞ the right side of (2.8) goes to zero by (2.1).

3. Density dependent Markov families

We now consider a sequence of processes X_n, with values in E_n = {n⁻¹k : k ∈ ℤ^d}, satisfying

(3.1) X_n(t) = X_n(0) + Σ_{ℓ∈ℤ^d} ℓn⁻¹ Y_ℓ(n ∫₀ᵗ β_ℓ(X_n(s))ds)

where the Y_ℓ are independent Poisson processes. As noted in example (c) in Section 1, X_n is a Markov chain. If we set Ỹ_ℓ(u) = Y_ℓ(u) − u and

(3.2) F(x) = Σ_ℓ ℓβ_ℓ(x),

then

(3.3) X_n(t) = X_n(0) + Σ_{ℓ∈ℤ^d} ℓn⁻¹ Ỹ_ℓ(n ∫₀ᵗ β_ℓ(X_n(s))ds) + ∫₀ᵗ F(X_n(s))ds.
Since the law of large numbers gives

(3.4) lim_{n→∞} sup_{u≤u₀} n⁻¹|Ỹ_ℓ(nu)| = 0,  u₀ > 0,

if X_n(0) → X(0) we expect that X_n(t) → X(t) where

(3.5) X(t) = X(0) + ∫₀ᵗ F(X(s))ds,

i.e. Ẋ = F(X).

i.e. = F(X). Similarly, if we define W(~! (u) = n-i/2 yl(nu) ,


then W ( 2 )= W 1 where Wl is a standard B r o w n i a n motion.
Defining Vn(t) = nl/2(Xn(t) - X(t)), we have

(3.6) V n ( t ) = Vn(0 ) + ~ ~ W ( ~ ) ( { t Bl(Xn(S))ds)

+ f t nl/2(F(Xn(S)) - F(X(s)))ds
0

= Vn(0) + iCXnCS))ds)
+ f t nl/2(F(X(s) + n -I/2 V (s)) - F(X(s)) ds
0 n

and if Vn(0) + V(0), we expect that Vn V where

(3.7) v(t) -- v(o) + ~ I w~({ t Bs(x(s))ds)

+ I t ~F(X(s))V(s)ds.
0

Both of the suggested limits hold in considerable generality. See Kurtz (1978) or Kurtz (1981) for details. In particular the limits hold if F is continuously differentiable, the solution of (3.5) exists for all t ≥ 0, the β_ℓ are continuous and Σ_ℓ |ℓ|² sup_{x∈K} β_ℓ(x) < ∞ for each compact K. We will assume these conditions in what follows.

In this section we would like to show how the above results can be used to obtain asymptotic results for hitting distributions of the X_n.
3.1 Theorem. Let φ be continuously differentiable on ℝ^d with φ(X(0)) > 0. Let

(3.8) τ_n = inf{t: φ(X_n(t)) ≤ 0}

and

(3.9) τ = inf{t: φ(X(t)) ≤ 0}.

Suppose τ < ∞ and ∂φ(X(τ))·F(X(τ)) < 0. Then

(3.10) √n(τ_n − τ) ⇒ −(∂φ(X(τ))·V(τ)) / (∂φ(X(τ))·F(X(τ)))

and

(3.11) √n(X_n(τ_n) − X(τ)) ⇒ V(τ) − (∂φ(X(τ))·V(τ) / (∂φ(X(τ))·F(X(τ)))) F(X(τ)).

Proof. Note that

(3.12) (d/dt) φ(X(t)) = ∂φ(X(t))·F(X(t)),

so the condition that ∂φ(X(τ))·F(X(τ)) < 0 implies that for ε > 0, φ(X(τ−ε)) > 0 and φ(X(τ+ε)) < 0. Since the convergence of X_n to X is uniform on bounded time intervals, φ(X_n(t)) → φ(X(t)) uniformly on bounded time intervals and it follows that τ_n → τ in probability.

Next note that φ(X(τ)) = 0 and, since φ(X_n(τ_n)) ≤ 0 and φ(X_n(τ_n−)) ≥ 0,

(3.13) |√n φ(X_n(τ_n))| ≤ |√n (φ(X_n(τ_n)) − φ(X_n(τ_n−)))| = |∂φ(θ_n)·(V_n(τ_n) − V_n(τ_n−))|

for some θ_n on the line between X_n(τ_n−) and X_n(τ_n). The right side of (3.13) goes to zero since τ_n → τ, V_n ⇒ V and V is continuous. (Note we are appealing to the continuous mapping theorem here, Billingsley (1968), page 30, but see the Remark below.) Consequently

(3.14) √n (φ(X(τ)) − φ(X(τ_n)))
    = √n (φ(X_n(τ_n)) − φ(X(τ_n))) − √n φ(X_n(τ_n))
    = √n (φ(X(τ_n) + n^{−1/2}V_n(τ_n)) − φ(X(τ_n))) − √n φ(X_n(τ_n))
    ⇒ ∂φ(X(τ))·V(τ).

Since the left side of (3.14) is asymptotic to

(3.15) -~(X(T ))-F(X(T~)) /n ( ~ n - T ) , (3.10) follows.

Finally, noting that v~(X(rn)-X(T~)) ~ F(X(T~))¢~ (Tn-T)

(3.16) /n (Xn(T n) - X(Y ))

= ¢~ (Xn(Tn) - X(Tn) ) + v~ (X(Tn) - X(T~))

~(X(~ )) -V(~ )
V(T ) - 8~(X(T ))-F(X(Y )) F(X(T )) .

Remark A l t h o u g h the c o n t i n u o u s m a p p i n g t h e o r e m is s u f f i c i e n t to justify


the above conclusions, it m a y be h e l p f u l to r e c a l l the S k o r o h o d repre-
s e n t a t i o n t h e o r e m w h i c h states that if V ~ V then there is a proba-
n
b i l i t y space and v e r s i o n s Vn and V of Vn and V such that
+ V a.s., w h i c h in our case m e a n s
n
(3.17) lim sup IVn(S) - V(s) I = 0 a.s.
n+~ s<t

See B i l l i n g s l e y (1971), page 7. In K u r t z (1978) a construction based


on work of Komlos, Mayor, and T u s n a d y (1975) is given w h i c h p r o v i d e s a
rate of c o n v e r g e n c e for (3.17).

Epidemic Model  We consider the simplest stochastic epidemic model. In this example, the parameter n can be thought of as the total population size. Two subclasses of the population are considered, the susceptibles and the infectives. Assuming that the probability that a particular susceptible is infected during a short interval of time is proportional to the fraction of the population that is infective, we have transition intensities

(3.18)  λ_{s,i} = nλ (s/n)(i/n)

for the infection of a susceptible and

(3.19)  μ_i = nμ (i/n)

for the recovery of an infective, where i is the number of infectives and s the number of susceptibles. Assuming recovered infectives are immune, we have the Markov chain given by

(3.20)  I_n(t) = I_n(0) + n^{-1} Y_1(n ∫_0^t λ S_n(s) I_n(s) ds) - n^{-1} Y_2(n ∫_0^t μ I_n(s) ds)

and

(3.21)  S_n(t) = S_n(0) - n^{-1} Y_1(n ∫_0^t λ S_n(s) I_n(s) ds),

where I_n and S_n are the fractions of the population that are infective and susceptible. Letting n → ∞, the limiting deterministic model is

(3.22)  I(t) = I(0) + ∫_0^t (λ S(s) I(s) - μ I(s)) ds

(3.23)  S(t) = S(0) - ∫_0^t λ S(s) I(s) ds.

Note that the Markov chain eventually absorbs on the boundary with I_n = 0. We want to obtain results of Nagaev and Startsev (1970) on the asymptotics of the number of susceptibles remaining after absorption. Unfortunately, Theorem 3.1 is not directly applicable since τ = ∞. However, since we are only interested in where the process absorbs, we can speed up the process without affecting where it hits the boundary. In particular, let γ_n satisfy

(3.24)  ∫_0^{γ_n(t)} I_n(s) ds = t,  0 ≤ t ≤ ∫_0^∞ I_n(s) ds,

and define Ĩ_n(t) = I_n(γ_n(t)) and S̃_n(t) = S_n(γ_n(t)). Then, substituting in (3.20) and (3.21),

(3.25)  Ĩ_n(t) = Ĩ_n(0) + n^{-1} Y_1(n ∫_0^t λ S̃_n(s) ds) - n^{-1} Y_2(nμt)

and

(3.26)  S̃_n(t) = S̃_n(0) - n^{-1} Y_1(n ∫_0^t λ S̃_n(s) ds)

for t ≤ ∫_0^∞ I_n(s) ds. Of course (3.25) and (3.26) have a unique solution for all t ≥ 0, so we will assume Ĩ_n and S̃_n are defined for all t ≥ 0. Let

(3.27)  τ_n = inf{t : Ĩ_n(t) = 0}

and note that S̃_n(τ_n) is the number of susceptibles remaining in the original epidemic model after absorption.

Theorem 3.1 is applicable to (Ĩ_n, S̃_n). The limiting deterministic model is now given by

(3.28)  I(t) = I(0) + ∫_0^t λ S(s) ds - μt

and

(3.29)  S(t) = S(0) - ∫_0^t λ S(s) ds,

so

(3.30)  S(t) = S(0) e^{-λt}

and

(3.31)  I(t) = I(0) + S(0)(1 - e^{-λt}) - μt.

Consequently τ is the solution of

(3.32)  I(0) + S(0)(1 - e^{-λτ}) - μτ = 0.
We also have (√n(Ĩ_n - I), √n(S̃_n - S)) ⇒ (V_I, V_S), where

(3.33)  V_I(t) = V_I(0) + W_1(S(0)(1 - e^{-λt})) - W_2(μt) + ∫_0^t λ V_S(s) ds

and

(3.34)  V_S(t) = V_S(0) - W_1(S(0)(1 - e^{-λt})) - ∫_0^t λ V_S(s) ds.

Then V_S and V_I can be written

(3.35)  V_S(t) = e^{-λt} V_S(0) + ∫_0^t e^{-λ(t-s)} (λ S(0) e^{-λs})^{1/2} dw(s)

and

(3.36)  V_I(t) = V_I(0) + V_S(0)(1 - e^{-λt}) - ∫_0^t e^{-λ(t-s)} (λ S(0) e^{-λs})^{1/2} dw(s) - W_2(μt),

where w is a standard Brownian motion.
Finally we have

(3.37)  √n (S̃_n(τ_n) - S(τ)) ⇒ V_S(τ) + λ S(τ) V_I(τ)/(λ S(τ) - μ).

The mean and variance for the limiting Gaussian distribution can be calculated easily in terms of τ.

The limit (3.37) is a special case of results of Nagaev and Startsev (1970). Because of the time substitution, the asymptotic behavior of τ_n tells essentially nothing about the duration of the epidemic. Some asymptotic results for the duration are given in Nagaev and Mukhomor (1975). For more information on epidemic models see Bailey (1975).
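The conclusion of (3.30)-(3.32) can be checked numerically. The sketch below (parameter values are our own) finds τ from (3.32) by bisection and compares S(0)e^{-λτ} with the fraction of susceptibles left in a simulation of the chain (3.20)-(3.21); since only the absorption point matters, the simulation uses the embedded jump chain, which mirrors the time-change argument in the text.

```python
import math
import random

def tau_root(i0, s0, lam, mu, tol=1e-12):
    # bisection for the root of (3.32): i0 + s0*(1 - exp(-lam*t)) - mu*t = 0
    # (g is positive at 0, unimodal, and eventually negative, so the sign
    # changes exactly once and bisection is valid)
    g = lambda t: i0 + s0 * (1.0 - math.exp(-lam * t)) - mu * t
    lo, hi = 0.0, 1.0
    while g(hi) > 0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def final_susceptibles(N, i0, s0, lam, mu, rng):
    # embedded jump chain of the SIR model (3.20)-(3.21);
    # returns the fraction of susceptibles at absorption (I = 0)
    s, i = int(s0 * N), int(i0 * N)
    while i > 0:
        rate_inf = lam * s * i / N
        rate_rec = mu * i
        if rng.random() < rate_inf / (rate_inf + rate_rec):
            s -= 1
            i += 1
        else:
            i -= 1
    return s / N

lam, mu, i0, s0 = 2.0, 1.0, 0.05, 0.95
tau = tau_root(i0, s0, lam, mu)
s_limit = s0 * math.exp(-lam * tau)          # (3.30) evaluated at tau
s_sim = final_susceptibles(10**4, i0, s0, lam, mu, random.Random(1))
print(tau, s_limit, s_sim)
```

By (3.37), the gap between s_sim and s_limit is of order N^{-1/2}, here a few percent.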

Repairman Model  We now consider a repair facility model of Iglehart and Lemoine (1974). (See also Lemoine (1978).) The model involves an operating facility and two repair facilities. When operating at full capacity, the operating facility requires n operating units. There are n + m_n operating units available if none are being repaired (i.e., there are m_n spare units). The time until failure of a unit is exponentially distributed with parameter λ; a failure is of type 1 and is sent to the first repair facility with probability p, and is of type 2 and sent to the second repair facility with probability 1-p. The repair facilities have s_n^1 and s_n^2 repairmen respectively, and the service times in each facility are exponentially distributed with parameters μ_1 and μ_2 respectively. Let X_n^1 and X_n^2 be the numbers of units in the repair facilities normalized by n. Then, setting λ_1 = λp and λ_2 = λ(1-p),

(3.38)  X_n^i(t) = X_n^i(0) + n^{-1} Y_{1i}(∫_0^t λ_i [n ∧ (n + m_n - nX_n^1(u) - nX_n^2(u))] du)

                          - n^{-1} Y_{2i}(∫_0^t μ_i [s_n^i ∧ nX_n^i(u)] du)

                = X_n^i(0) + n^{-1} Y_{1i}(n ∫_0^t λ_i [1 ∧ (1 + n^{-1}m_n - X_n^1(u) - X_n^2(u))] du)

                          - n^{-1} Y_{2i}(n ∫_0^t μ_i [(n^{-1}s_n^i) ∧ X_n^i(u)] du),

where Y_{11}, Y_{21}, Y_{12}, Y_{22} are independent Poisson processes. Assume that n^{-1}m_n = m + O(1/n) and n^{-1}s_n^i = s_i + O(1/n). Then the limiting deterministic model is

(3.39)  X_i(t) = X_i(0) + ∫_0^t (λ_i [1 ∧ (1 + m - X_1(u) - X_2(u))] - μ_i [s_i ∧ X_i(u)]) du.

Since F_i(x) = λ_i [1 ∧ (1 + m - x_1 - x_2)] - μ_i [s_i ∧ x_i] is Lipschitz, the "law of large numbers" (for example in Kurtz (1981)) applies, and assuming X_n^i(0) → X_i(0), i = 1,2, we have

(3.40)  sup_{u≤t} |X_n^i(u) - X_i(u)| → 0  in probability.

The "central limit theorem" does not quite apply since F_i is not continuously differentiable, but the same proof gives the following. Let

(3.41)  V_n^i(t) ≡ √n (X_n^i(t) - X_i(t))

       = V_n^i(0) + W_{1i}^{(n)}(∫_0^t λ_i [1 ∧ (1 + n^{-1}m_n - X_n^1(u) - X_n^2(u))] du)

                  - W_{2i}^{(n)}(∫_0^t μ_i [(n^{-1}s_n^i) ∧ X_n^i(u)] du)

                  + ∫_0^t λ_i √n ([1 ∧ (1 + n^{-1}m_n - X_n^1(u) - X_n^2(u))] - [1 ∧ (1 + m - X_1(u) - X_2(u))]) du

                  - ∫_0^t μ_i √n ([(n^{-1}s_n^i) ∧ X_n^i(u)] - [s_i ∧ X_i(u)]) du.

Assuming V_n^i(0) → V_i(0), i = 1,2, the weak convergence of W_{1i}^{(n)} and W_{2i}^{(n)} and (3.40) imply V_n^i ⇒ V_i, where

(3.42)  V_i(t) = V_i(0) + W_{1i}(∫_0^t λ_i [1 ∧ (1 + m - X_1(u) - X_2(u))] du) - W_{2i}(∫_0^t μ_i [s_i ∧ X_i(u)] du)

               - ∫_0^t λ_i (χ_{{m}}(X_1(u) + X_2(u)) [(V_1(u) + V_2(u)) ∨ 0] + χ_{(m,∞)}(X_1(u) + X_2(u)) (V_1(u) + V_2(u))) du

               - ∫_0^t μ_i (χ_{[0,s_i)}(X_i(u)) V_i(u) + χ_{{s_i}}(X_i(u)) [V_i(u) ∧ 0]) du

(χ_A(z) = 1 if z ∈ A and = 0 otherwise). Since the integral term is not linear in V, V may not be Gaussian. But, except for very special cases of the parameters, the Lebesgue measure of {u : X_1(u) + X_2(u) = m or X_i(u) = s_i} is zero and the nonlinear terms can be dropped. We do not want to go into an exhaustive case by case analysis here (see Lemoine (1978)), but if s_1 ≠ λ_1/μ_1, s_2 ≠ λ_2/μ_2 and s_2 μ_2/λ_2 ≠ s_1 μ_1/λ_1, then the deterministic model has a unique asymptotically stable fixed point (x̄_1, x̄_2) and, assuming x̄_i ≠ s_i, i = 1,2, a theorem of Norman (1974) (see Kurtz (1981)) implies that the convergence in distribution (V_n^1(t), V_n^2(t)) ⇒ (V_1(t), V_2(t)) is uniform for all t ≥ 0. In particular, the stationary distribution for (V_n^1, V_n^2) converges to the (Gaussian) stationary distribution for (V_1, V_2).
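A simulation makes the fixed-point behavior visible. In the sketch below (all parameter values are our own illustration), λ_i/μ_i < s_i and λ_1/μ_1 + λ_2/μ_2 < m, so both minima in (3.39) are achieved by their second arguments near equilibrium and the deterministic fixed point is x̄_i = λ_i/μ_i; a Gillespie simulation of (3.38) settles near (x̄_1, x̄_2) with fluctuations of order n^{-1/2}.

```python
import random

def simulate_repair(n, m, s, lam_i, mu, t_end, rng):
    """Gillespie simulation of the two-facility repairman model (3.38).
    lam_i = (lam_1, lam_2), mu = (mu_1, mu_2); m and s = (s_1, s_2) are the
    spare-unit and repairman levels as fractions of n."""
    k = [0, 0]                     # units in the repair facilities (counts)
    t = 0.0
    mn, sn = int(m * n), [int(s[0] * n), int(s[1] * n)]
    while True:
        operating = min(n, n + mn - k[0] - k[1])
        rates = [lam_i[0] * operating, lam_i[1] * operating,
                 mu[0] * min(sn[0], k[0]), mu[1] * min(sn[1], k[1])]
        total = sum(rates)
        t += rng.expovariate(total)
        if t > t_end:
            return [k[0] / n, k[1] / n]
        u = rng.random() * total
        if u < rates[0]:
            k[0] += 1              # type-1 failure
        elif u < rates[0] + rates[1]:
            k[1] += 1              # type-2 failure
        elif u < rates[0] + rates[1] + rates[2]:
            k[0] -= 1              # repair completed at facility 1
        else:
            k[1] -= 1              # repair completed at facility 2

x = simulate_repair(n=2000, m=2.0, s=(1.0, 1.0), lam_i=(0.5, 0.5),
                    mu=(1.0, 1.0), t_end=20.0, rng=random.Random(2))
print(x)  # both coordinates settle near lam_i/mu_i = 0.5
```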

4. Fiber Bundles

In order to give another example of an argument similar to those used in Section 3, we consider the simplest example of the fiber bundle models studied by Phoenix and Taylor (1973) and Phoenix (1979). In fact, the particular result we give goes back to Daniels (1945).

We consider a bundle of n fibers under a load nL. We assume that all fibers share the load equally (i.e., initially each fiber is subjected to a load L). Under this load a number of fibers N_n(L) will break, leaving n - N_n(L) fibers to support the load, and hence the load on each remaining fiber is nL/(n - N_n(L)) = L/(1 - X_n(L)), where X_n(L) = n^{-1}N_n(L) is the fraction of fibers that have broken. Finally, assume that a fiber subjected to a load ℓ breaks with probability F(ℓ) and that the fibers break independently of each other. We can construct this model by associating with each fiber an independent random variable ξ_k, uniformly distributed on [0,1]. If the k-th fiber is subjected to a load ℓ, then it breaks if ξ_k ≤ F(ℓ). Define the empirical process

(4.1)  Y^{(n)}(u) = #{k : ξ_k ≤ u}.

Then N_n(L) must satisfy

(4.2)  N_n(L) = Y^{(n)}(F(nL/(n - N_n(L))))

or, equivalently, X_n(L) satisfies

(4.3)  X_n(L) = n^{-1} Y^{(n)}(F(L/(1 - X_n(L)))).

Unfortunately (4.3) may not have a unique solution, so we must specify that X_n(L) is the smallest solution, if it exists, of (4.3). The analogy with the limit theorems of Section 3 is clear. We know that

(4.4)  lim_{n→∞} sup_{0≤u≤1} |n^{-1} Y^{(n)}(u) - u| = 0

and, defining W_n(u) = n^{-1/2}(Y^{(n)}(u) - nu), that W_n ⇒ W_B, where W_B is Brownian bridge (see for example Billingsley (1968), page 64). The limiting deterministic model is

(4.5)  X(L) = F(L/(1 - X(L)))

(again take X(L) to be the smallest solution, if one exists, of (4.5)). Assume that F is continuously differentiable. Then it is not difficult to see that X_n(L) → X(L) for each L such that X(L) exists and

(4.6)  (1 - X(L))² - L F'(L/(1 - X(L))) > 0.

Assuming (4.6), then setting Ỹ^{(n)}(u) = Y^{(n)}(u) - nu,

(4.7)  V_n(L) ≡ √n (X_n(L) - X(L))

       = n^{-1/2} Ỹ^{(n)}(F(L/(1 - X_n(L)))) + √n (F(L/(1 - X_n(L))) - F(L/(1 - X(L)))),

and it follows that V_n(L) ⇒ V(L), where V(L) satisfies

(4.8)  V(L) = W_B(F(L/(1 - X(L)))) + F'(L/(1 - X(L))) [L/(1 - X(L))²] V(L),

that is,

(4.9)  V(L) = (1 - X(L))² W_B(F(L/(1 - X(L)))) / [(1 - X(L))² - L F'(L/(1 - X(L)))].

Finally, consider the maximum load the bundle will support, that is, the maximum L for which (4.3) has a solution. Rewriting (4.3), we see that

(4.10)  L = (1 - n^{-1} Y^{(n)}(F(L/(1 - X_n(L))))) L/(1 - X_n(L)),

and if L_n* is the maximum load, then

(4.11)  L_n* = sup_u u(1 - n^{-1} Y^{(n)}(F(u))).

Similarly define

(4.12)  L* = sup_u u(1 - F(u)).

Then

(4.13)  √n (L_n* - L*) = sup_u [-u n^{-1/2} Ỹ^{(n)}(F(u)) - √n (L* - u(1 - F(u)))].

Noting that L* ≥ u(1 - F(u)), we conclude that

(4.14)  √n (L_n* - L*) ⇒ sup_{u∈Γ} (-u W_B(F(u))),

where Γ = {u : u(1 - F(u)) = L*}.
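Formula (4.11) is easy to evaluate on a sample: sorting the fiber strengths ℓ_(1) ≤ ... ≤ ℓ_(n), the supremum over u is attained just below an order statistic, so L_n* = max_k ℓ_(k)(n - k + 1)/n. The sketch below (the choice F(u) = 1 - e^{-u} is our own example, for which L* = sup_u u e^{-u} = 1/e, attained at u = 1) checks that L_n* is close to L* for large n, as (4.14) predicts.

```python
import math
import random

def bundle_strength(strengths):
    """L_n* = sup_u u(1 - n^{-1} Y^{(n)}(F(u))) of (4.11): with sorted fiber
    strengths the sup is attained just below an order statistic, giving
    max over k of strengths[k] * (n - k) / n (0-based k)."""
    n = len(strengths)
    s = sorted(strengths)
    return max(s[k] * (n - k) / n for k in range(n))

rng = random.Random(3)
n = 20000
# F(u) = 1 - exp(-u): fiber strengths are Exp(1), and L* = 1/e at u = 1
strengths = [rng.expovariate(1.0) for _ in range(n)]
ln_star = bundle_strength(strengths)
l_star = 1.0 / math.e
print(ln_star, l_star)
```

By (4.14), the error L_n* - L* is of order n^{-1/2} with limiting variance (at a unique maximizer u*) u*² F(u*)(1 - F(u*)), Daniels' (1945) result.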

References

1. Bailey, Norman T. J. (1975). The Mathematical Theory of Infectious Diseases. Griffin, London.

2. Billingsley, Patrick (1968). Convergence of Probability Measures. John Wiley, New York.

3. Billingsley, Patrick (1971). Weak Convergence of Measures: Applications in Probability. SIAM, Philadelphia.

4. Bremaud, Pierre (1981). Point Processes and Queues: Martingale Dynamics. Springer-Verlag, New York.

5. Daniels, H. E. (1945). The statistical theory of the strength of bundles of threads. Proc. R. Soc. London A 183, 405-435.

6. Iglehart, Donald L. and Austin J. Lemoine (1974). Approximations for the repairman problem with two repair facilities, II: Spares. Adv. Appl. Prob. 6, 147-158.

7. Karlin, Samuel (1966). A First Course in Stochastic Processes. Academic Press, New York.

8. Komlos, J., P. Major, and G. Tusnady (1975). An approximation of partial sums of independent random variables and the sample distribution function, I. Z. Wahr. und Verw. Gebiete 32, 111-131.

9. Kurtz, Thomas G. (1978). Strong approximation theorems for density dependent Markov chains. Stochastic Processes Appl. 6, 223-240.

10. Kurtz, Thomas G. (1980). The optional sampling theorem for martingales indexed by directed sets. Ann. Probability 8, 675-681.

11. Kurtz, Thomas G. (1981). Approximation of Population Processes. SIAM, Philadelphia.

12. Kurtz, Thomas G. (1982). Counting processes and multiple time changes. (In preparation.)

13. Lemoine, Austin J. (1978). Networks of queues - a survey of weak convergence results. Management Science 24, 1175-1193.

14. Nagaev, A. V. and T. P. Mukhomor (1975). A limit distribution of the duration of an epidemic. Theory Prob. Applications 20, 805-818.

15. Nagaev, A. V. and A. N. Startsev (1970). The asymptotic analysis of a stochastic model of an epidemic. Theory Prob. Applications 15, 98-107.

16. Norman, M. Frank (1974). A central limit theorem for Markov processes that move by small steps. Ann. Probability 2, 1065-1074.

17. Phoenix, S. Leigh (1979). The asymptotic distribution for the time to failure of a fiber bundle. Adv. Appl. Prob. 11, 153-187.

18. Phoenix, S. Leigh and Howard M. Taylor (1973). The asymptotic strength distribution of a general fiber bundle. Adv. Appl. Prob. 5, 200-216.

19. Snyder, Donald L. (1975). Random Point Processes. John Wiley, New York.
APPROXIMATE INVARIANT MEASURES FOR THE ASYMPTOTIC DISTRIBUTIONS
OF DIFFERENTIAL EQUATIONS WITH WIDE BAND NOISE INPUTS*

Harold J. Kushner
Division of Applied Mathematics
Brown University
Providence, Rhode Island 02912

ABSTRACT

Diffusion models are useful and of widespread use in many areas of control and communication theory. The models are frequently used as approximations to continuous or discrete parameter systems which are not quite diffusions but are, hopefully, close to a diffusion in some sense. For example, the input noise might be 'wide-band' but not 'white Gaussian'. Many approximation techniques have been developed, and the typical results are of a weak convergence nature. The physical process x^ε(·) is parameterized by ε, and one tries to show that {x^ε(·)} converges weakly to some diffusion x(·) as ε → 0. The limit process x(·) is then used to study various properties of x^ε(·) for small ε. Frequently, in applications, we are concerned with asymptotic properties as t → ∞ (for small ε), as well as with weak convergence. Such information is not normally provided by the weak convergence theory. We discuss the problem of approximating functionals on the 'tail' of x^ε(·) for small ε by such functionals on the 'tail' of x(·); e.g., approximating the measures of {x^ε(t), large t}, for small ε, by an invariant measure of x(·). This is particularly useful in problems in communication theory, where the (say) detection system is often supposed to be in operation for a very long time.

*This research was supported in part by the Air Force Office of Scientific Research under AFOSR-76-3063D, in part by the National Science Foundation under NSF-Eng. 77-12946-A02, and in part by the Office of Naval Research under N00014-76-C-2079-P0004.

1. Introduction

Let {x^ε(·)} be solutions to ordinary differential equations with random right hand sides, e.g., ẋ^ε = F^ε(x^ε, ξ^ε) for some function F^ε(·,·) and a "wide band" noise process ξ^ε(·). Many results are available concerning the weak convergence of {x^ε(·)} to a diffusion x(·). The weak convergence basically gives us information on the approximation of x^ε(·) by x(·) on arbitrarily large but still finite time intervals.

In applications to control and communication theory, information concerning the closeness of the distributions of the x^ε(t) for large t, small ε, to μ(·), the invariant measure of x(·), is of considerable interest. Some results along this line were obtained in [2, Section 6] for the system

(1.1)  ẋ^ε = F(x^ε, ξ^ε)/ε + G(x^ε, ξ^ε) + Ḡ(x^ε),  x^ε(t) ∈ R^r,

where EF(x,ξ) = EG(x,ξ) = 0, ξ^ε(t) = ξ(t/ε²), and ξ(·) is a Markov jump process. The weak limit of {x^ε(·)} is a diffusion with differential generator

(1.2)  𝒜f(x) = f'_x(x)Ḡ(x) + ∫_0^∞ E(f_x(x)F(x, ξ(t)))_x F(x, ξ(0)) dt.

Suppose that x(·) has a unique invariant measure μ(·), and let there be a smooth Liapunov function satisfying 0 ≤ V(x) → ∞ as |x| → ∞, and a γ > 0 such that

(1.3)  𝒜V(x) ≤ -γV(x), for large |x|.

Then, for small ε, (x^ε(·), ξ^ε(·)) has an invariant measure ν^ε(·) whose x-marginals μ^ε(·) converge weakly to μ(·) as ε → 0 [2, Section 6].

In this paper, ξ^ε(·) need not be Markov, F or G might not be smooth, and (1.3) is replaced by a weaker condition. Equation (1.3) would not normally hold if Ḡ were bounded, for example. The basic techniques used here are similar to those in [2]; both depend heavily on the use of "averaged Liapunov functions."

Section 2 contains the basic approximation theorem, using the condition (A5), which is not usually directly verifiable. A verifiable condition for (A5) is then given.

We present only some of the theorems and their assumptions. A fuller development is in [6], which contains all proofs, examples, and extensions to the cases of unbounded noise, discontinuous dynamical terms and state dependent noise.

2. The Basic Convergence Theorem

Assumptions

A1. For each x(0), x(·) is a Feller Markov diffusion process with continuous coefficients.

A2. x(·) has a unique invariant measure μ(·), and P(x,t,·), the measure of x(t) when x(0) = x, converges weakly to μ(·) for each x, as t → ∞.

A3. The convergence in (A2) is uniform in compact x-sets; i.e., for each f(·) ∈ C̄(R^r), the space of continuous bounded functions on R^r, E_x f(x(t)) → E_μ f(x(0)) uniformly in x in compact sets, where E_μ denotes expectation with respect to the stationary measure of x(·).

A4. If t_ε → t_0 ≤ ∞ and x^ε(t_ε) → x(0) weakly, then x^ε(t_ε + ·) → x(·) weakly. (The usual conditions which imply this for t_0 = 0 also imply it as stated.)

A5. There is an ε_0 > 0 such that {x^ε(t), 0 < ε ≤ ε_0, t ≥ 0} is tight. (See Theorem 3 for a verifiable criterion for (A5).)

Lemma 1. For any integer m, let f(·) ∈ C̄(R^{mr+r}). Assume (A1)-(A3) and let 0 = Δ_0 < Δ_1 < ··· < Δ_m. Let S denote a tight set of R^r-valued random variables. Then

E_{x(0)} f(x(t + Δ_i), i ≤ m) → E_μ f(x(Δ_i), i ≤ m)

uniformly for x(0) ∈ S, as t → ∞.

Theorem 1 is the basic convergence theorem.

Theorem 1. Assume (A1)-(A5). Then for each f(·) ∈ C̄(R^{mr+r}), T < ∞ and δ > 0, there are t_0(f,δ) < ∞ and ε_0(f,δ) > 0 such that for all t ≥ t_0(f,δ), and Δ_i ≤ T, and ε ≤ ε_0(f,δ), and any sequence {x^ε(·)} which converges weakly to x(·),

(2.1)  |Ef(x^ε(t + Δ_i), i ≤ m) - E_μ f(x(Δ_i), i ≤ m)| ≤ δ.

Let (x^ε(·), ξ^ε(·)) be Markov and have an invariant measure ν^ε(·). Replace (A5) by (A5'): there is a sequence T_ε → ∞ (T_ε can depend on the initial condition x^ε(0), ξ^ε(0)) such that (A5) holds for t ≥ T_ε. Then {μ^ε(·)}, the x-marginals of {ν^ε(·)}, are tight. In addition, let (A4) hold when (x^ε(0), ξ^ε(0)) has the distribution ν^ε(·). Then {μ^ε(·)} converges weakly to μ(·) as ε → 0.

Remark. The theorem implies that the convergence as t → ∞ in (2.1) is uniform in ε for small ε, a fact which is important in applications. In applications, it is often possible to prove results such as E_{x,ξ}|x^ε(t)| ≤ K, where for small ε, K does not depend on ε, ξ = ξ^ε(0), or x. Then the replacement (A5') for (A5) in the last paragraph holds. Under (A5'), (x^ε(·), ξ^ε(·)) has an invariant measure ν^ε(·) for small ε, if for some ξ^ε(0), x^ε(0), {ξ^ε(t), t ≥ 0} is tight for each small ε.

Proof. Suppose that (2.1) is false. Then there is a subsequence ε → 0 and a sequence {t_ε} → ∞ such that

(2.2)  |Ef(x^ε(t_ε + Δ_i), i ≤ m) - E_μ f(x(Δ_i), i ≤ m)| ≥ δ > 0.

We will find a further subsequence, also indexed by ε, which violates (2.2). Fix T > 0. By (A5), we can choose a further subsequence such that {x^ε(t_ε - T)} converges weakly to a random variable x(0). By (A4), {x^ε(t_ε - T + ·)} converges weakly to x(·) with initial condition x(0), and

(2.3)  Ef(x^ε(t_ε - T + T + Δ_i), i ≤ m) → E E_{x(0)} f(x(T + Δ_i), i ≤ m).

By (A5), the set S of all possible x(0) (over all T > 0 and weakly convergent subsequences) is tight. By Lemma 1, we can take T large enough such that

(2.4)  |E E_{x(0)} f(x(T + Δ_i), i ≤ m) - E_μ f(x(Δ_i), i ≤ m)| ≤ δ/2.

Equations (2.3) and (2.4) contradict (2.2).

The proof of the last assertion is similar to the last part of the proof of Theorem 4 of [6] and is omitted.  Q.E.D.

3. A Liapunov Function Criterion for Tightness of {x(t), t ≥ 0}

Here, we state conditions which guarantee (3.1):

(3.1)  {x(t), t ≥ 0, x(0) ∈ B} is tight for each compact B.

To prove (3.1), condition (A6) is required.

A6. There is a continuous Liapunov function 0 ≤ V(x) → ∞ as |x| → ∞, and λ_0 and γ_0 > 0 such that 𝒜V(x) ≤ -γ_0 for x ∉ Q_0 = {x : V(x) ≤ λ_0}. The partial derivatives of V(·) up to order 2 are continuous.

The proof of Theorem 2 is a prototype of the technique used to verify (A5). See [6].

Theorem 2. Under (A1) and (A6), condition (3.1) holds.

Comment on the proof. (A6) implies that Q_0 is a recurrence set for x(·). Let τ_0 and τ_1 be stopping times such that x(τ_0) ∈ ∂Q_1 = {x : V(x) = λ_1 > λ_0}, and let τ_1 > τ_0 be the next return time of x(·) to ∂Q_0. For x(t) ∉ Q_0, {V(x(t))} is a 'supermartingale', and probability estimates can be obtained of the maximum excursion of V(x(·)) on any such interval [τ_0, τ_1]. Given δ > 0, we find numbers k, λ (not depending on t) such that for each t there are k 'return intervals' satisfying

P{t ∉ these k return intervals} ≤ δ/2,

P{V(x(t)) ≥ λ on these k return intervals} ≤ δ/2.

This yields the desired tightness.

4. An Averaged Liapunov Function Criterion for (A5)

In this section, we use the model (1.1), smooth F, G, Ḡ and a strong mixing condition on ξ(·). The result and techniques should be viewed as an illustration of the general possibilities. The mixing condition is too strong for many applications, and other conditions on ξ^ε(·) and on F, G, Ḡ are considered in [6]. In order to get the necessary inequalities for any Liapunov function based approach, an assumption such as (B4) seems to be required. The conditions hold in numerous cases of interest.

B1. ξ(·) is a bounded, right continuous, stationary φ-mixing process [7] with ∫_0^∞ φ^{1/2}(t) dt < ∞.

B2. F(·,·), G(·,·) and Ḡ(·) are continuous, R^r valued functions whose growth (as |x| → ∞) is O(|x|). The partial derivatives of F(·,ξ) up to order 2 (and of G(·,ξ) up to order 1) are bounded uniformly in x, ξ, and EF(x,ξ) ≡ 0 ≡ EG(x,ξ).

B3. There is a diffusion process x(·) with differential generator defined by (1.2), and which satisfies (A1)-(A3). Also, (A6) holds, but the partial derivatives of V(·) up to order 3 are continuous.

B4. There are constants K such that, uniformly in x, ξ,

(4.1a)  |V_x(x)G(x,ξ)| + |V_x(x)F(x,ξ)| ≤ K(1 + V(x)),

(4.1b)  |(V_x(x)F(x,ξ))'_x F(x,ξ)| ≤ K(1 + V(x)),

(4.2)  |(V_x(x)G(x,ξ))'_x U(x,ξ)| ≤ K(1 + |𝒜V(x)|) for U = F, G, Ḡ,

(4.3)  |(V_x(x)F(x,ξ))'_x U(x,ξ)| ≤ K(1 + |𝒜V(x)|), U = G, Ḡ,

(4.4)  |((V_x(x)F(x,ξ))'_x F(x,ξ))'_x U(x,ξ)| ≤ K(1 + |𝒜V(x)|), U = F, G, Ḡ.

Theorem 3. Under (B1)-(B4) and the tightness of {x^ε(0)}, condition (A5) holds.

Remarks on Theorem 3. (B4) fits many examples. In a sense it is a prototype condition for two typical cases, where the orders would often be as required by (B4): (a) F, G, Ḡ increase roughly (at most) linearly in x and V increases roughly quadratically in x as |x| → ∞; (b) F, G, Ḡ are bounded and V increases roughly linearly in |x| for large x.

In the proof of Theorem 3, an 'averaged' Liapunov function V^ε(·) is obtained from V(·); for small ε > 0, V^ε(x^ε(t), t) is a 'supermartingale' on the time intervals during which x^ε(t) ∉ Q_0. This is used to prove recurrence of x^ε(·), and to get probability estimates on the path excursions of x^ε(·) on trips from ∂Q_1 = {x : V(x) = λ_1 > λ_0} to ∂Q_0. Then we apply a technique similar to that used to complete the tightness proof in Theorem 2.

REFERENCES

[1] R. Z. Khazminskii, "A limit theorem for solutions of differential equations with a random right hand side." Theory of Probability and Applic., 11, 1966, pp. 390-406.

[2] G. Blankenship, G. C. Papanicolaou, "Stability and control of stochastic systems with wide-band noise disturbances." SIAM J. on Appl. Math., 34, 1978, pp. 437-476.

[3] G. C. Papanicolaou, W. Kohler, "Asymptotic theory of mixing ordinary stochastic differential equations." Comm. Pure and Applied Math., 27, 1974, pp. 641-668.

[4] H. J. Kushner, "Jump-diffusion approximations for ordinary differential equations with wide-band random right hand sides." SIAM J. on Control and Optimization, 17, 1979, pp. 729-744.

[5] H. J. Kushner, "A martingale method for the convergence of a sequence of processes to a jump-diffusion process." Z. Wahrscheinlichkeitstheorie, 53, 1980, pp. 209-219.

[6] H. J. Kushner, "Asymptotic distributions of solutions of ordinary differential equations with wide band noise inputs; approximate invariant measures." To appear in Stochastics, early 1982.

[7] P. Billingsley, Convergence of Probability Measures, John Wiley, New York, 1968.
OPTIMAL STOCHASTIC CONTROL OF DIFFUSION TYPE PROCESSES

AND HAMILTON-JACOBI-BELLMAN EQUATIONS

P.L. Lions
Ceremade, Paris IX University
Place de Lattre de Tassigny
75775 Paris Cedex 16
France

I. Introduction:

In this paper we present a general approach and several results (obtained in particular by the author) concerning general optimal stochastic control problems and more precisely the associated Hamilton-Jacobi-Bellman equations (also called the dynamic programming equations).

Let us first describe briefly the type of problems we consider: the state of the system we want to control is given by the solution of the following stochastic differential equation:

(1)  dy_x(t) = σ(y_x(t), v(t,ω)) dW_t + b(y_x(t), v(t,ω)) dt,  y_x(0) = x ∈ Ō,

where O is a smooth domain in R^N; W_t is a Brownian motion in R^p; σ(x,v) is a matrix-valued function from R^N × V; b(x,v) is a vector-valued function from R^N × V; V is a separable metric space. We assume that (1) takes place in a probability space (Ω, F, F_t, P) having the usual properties. And v(t,ω) (called the control process) is any progressively measurable process with values in a compact subset of V (which may of course depend on v). We will call an admissible system the collection:

𝒜 = (Ω, F, F_t, P, W_t, v(·), (y_x(·))_{x ∈ Ō}).

For each admissible system 𝒜, we define a cost function:

(2)  J(x,𝒜) = E ∫_0^{τ_x} f(y_x(t), v(t)) exp(-∫_0^t c(y_x(s), v(s)) ds) dt,

where f(x,v), c(x,v) are real-valued functions from Ō × V and τ_x is the first exit time of the process y_x(t,ω) from Ō. To simplify the presentation, we will assume throughout this paper: for all φ = σ, b, c, f,

(3)  φ(·,v) ∈ W^{2,∞}(R^N),  sup_{v ∈ V} ‖φ(·,v)‖_{W^{2,∞}} < ∞;  ∀ x ∈ R^N, φ(x,·) ∈ C(V);

(4)  λ = inf {c(x,v) / x ∈ R^N, v ∈ V} > 0.

In particular (4) insures that J(x,𝒜) has a meaning.

We want to minimize J(x,𝒜) over all possible admissible systems 𝒜; that is, we consider the minimum cost function - also called the value function or the criterion:

(5)  u(x) = inf_𝒜 J(x,𝒜).

It is a well-known consequence of the dynamic programming principle (due to R. Bellman [2]) that u should be "related to the solution" of the following nonlinear second-order elliptic equation:

(6)  sup_{v ∈ V} [A^v u(x) - f(x,v)] = 0  in O,

and u should vanish on ∂O or on some portion of Γ = ∂O:

(7)  u = 0 on Γ.

Here and below A^v denotes the 2nd order elliptic operator (eventually degenerate) defined by:

(8)  A^v = - Σ_{i,j} a_{ij}(x,v) ∂_{ij} - Σ_i b_i(x,v) ∂_i + c(x,v),

and the matrix a(x,v) is given by: a = ½ σ σ^T.

The equation (6) is called the Hamilton-Jacobi-Bellman equation associated with the above optimal stochastic control problem: in some sense it is an extension of the classical first-order Hamilton-Jacobi equations occurring in the Calculus of Variations (see P.L. Lions [26]). Let us also point out that in the literature (6) is sometimes called the Bellman equation, or the dynamic programming equation.

A more precise relation between (5) and (6) is the following (see W.H. Fleming and R. Rishel [16], A. Bensoussan and J.L. Lions [3], N.V. Krylov [21]): i) if u ∈ C²(O) then u solves (6); ii) if ũ ∈ C²(O) ∩ C(Ō) satisfies (6) and (7), then ũ(x) = u(x) in Ō. Unfortunately this classical theory (consisting of verification theorems) is not convenient, since i) u is not in general C²; it may even happen in simple examples that u is not continuous!; ii) no classical tools can take care of (6), and this for several reasons: first, it is a fully nonlinear equation, that is, the nonlinearity acts on second derivatives of the unknown; and second, it is a degenerate equation since a may not be positive definite.

To solve these difficulties, we propose here a notion of weak solution of (6) that we call viscosity solution (since it is an extension of the notion introduced for first-order Hamilton-Jacobi equations by M.G. Crandall and P.L. Lions [7], [8] - see also M.G. Crandall, L.C. Evans and P.L. Lions [9]). This notion is briefly discussed in Section II. Since this notion requires continuity, we give in Section III a few results concerning, in particular, the continuity of u. In Section IV we give a general uniqueness result for viscosity solutions of (6). Next (Section V) we present various regularity results which, combined with the notion of viscosity solution, immediately yield that (6) holds in elementary ways (such as a.e., for example). Finally, in Section VI we mention several related topics that may be treated by the same methods.

Finally, we want to mention a problem that we do not consider here: we will not give any result concerning optimal controls. Let us just mention that, using results due to N.V. Krylov [21], [22] and a method due to S.R.S. Varadhan [48], it is possible to give under very general assumptions the existence of ε-optimal (or even optimal) markovian controls (i.e. controls in the so-called feedback form).
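To fix ideas, equation (6) can be approximated by a monotone upwind finite-difference scheme whose grid equations are themselves the dynamic programming equations of a controlled Markov chain; iterating them is value iteration. The one-dimensional example below is our own illustration (O = (0,1), a = 1/2, b(x,v) = v with V = {-1,+1}, c ≡ 1, f ≡ 1), not taken from the paper; with these data u(x) = inf E[1 - e^{-τ_x}] < 1.

```python
def solve_hjb_1d(M=50, a=0.5, controls=(-1.0, 1.0), c=1.0, f=1.0, sweeps=4000):
    """Monotone upwind finite differences for the 1-D HJB equation
    sup_v [ -a u'' - v u' + c u - f ] = 0 on (0,1), u(0) = u(1) = 0,
    solved by Gauss-Seidel value iteration: for each grid point,
    u_i = min over v of the per-control discrete equation solved for u_i."""
    h = 1.0 / M
    u = [0.0] * (M + 1)
    for _ in range(sweeps):
        for i in range(1, M):
            best = float('inf')
            for v in controls:
                up = u[i + 1] if v > 0 else u[i - 1]   # upwind neighbor
                num = f + a * (u[i + 1] + u[i - 1]) / h**2 + abs(v) * up / h
                den = 2 * a / h**2 + abs(v) / h + c
                best = min(best, num / den)
            u[i] = best
    return u

u = solve_hjb_1d()
print(max(u))  # strictly below f/c = 1, the cost of never exiting
```

The upwind choice keeps the scheme monotone (all neighbor coefficients positive), which is what makes the iteration a contraction and, in the viscosity framework discussed next, what guarantees convergence to the right weak solution.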
202

II. Viscosity solutions of Hamilton-Jacobi-Bellman equations.

Remarking that (6) may be rewritten:

(9)  H(D²u, Du, u, x) = 0  in O,

where H is specified to be:

H(A, p, t, x) = sup_{v ∈ V} [ −(1/2) Σ_{i,j} a_{ij}(x,v) A_{ij} − Σ_i b_i(x,v) p_i + c(x,v) t − f(x,v) ],

we are going to define weak solutions of (9), where H is a function on L^N_S (= {A : A is an N × N symmetric matrix}) × R^N × R × O satisfying:

(10)  H ∈ C(L^N_S × R^N × R × O), and H(A,p,t,x) ≤ H(B,p,t,x) for all (p,t,x) ∈ R^N × R × O, if A ≥ B (in L^N_S)

(the second part of (10) expresses the fact that (9) is elliptic).

For any continuous function φ ∈ C(O), we define generalized derivatives as follows: we denote by T_+, T_− the sets

T_+ = { x ∈ O / ∃ A ∈ L^N_S, ∃ ξ ∈ R^N : lim sup_{y→x} {φ(y) − φ(x) − (ξ, y−x) − (1/2)(y−x, A(y−x))} |y−x|^{−2} ≤ 0 },

T_− = { x ∈ O / ∃ A ∈ L^N_S, ∃ ξ ∈ R^N : lim inf_{y→x} {φ(y) − φ(x) − (ξ, y−x) − (1/2)(y−x, A(y−x))} |y−x|^{−2} ≥ 0 }.

And for x ∈ T_+ (resp. T_−) we set:

D_+φ(x) = { (A,ξ) ∈ L^N_S × R^N / lim sup_{y→x} {φ(y) − φ(x) − (ξ, y−x) − (1/2)(y−x, A(y−x))} |y−x|^{−2} ≤ 0 }

(resp. D_−φ(x) = { (A,ξ) ∈ L^N_S × R^N / lim inf_{y→x} {φ(y) − φ(x) − (ξ, y−x) − (1/2)(y−x, A(y−x))} |y−x|^{−2} ≥ 0 }).

Remark II.1:

T_+ is the set of points where φ has, roughly speaking, an upper expansion up to order 2; and D_+φ(x) consists of all possible couples (A,ξ) for which the expansion holds. Let us point out that, for a general continuous function φ, T_+ and T_− are dense in O, and that if (A,ξ) ∈ D_+φ(x) and B ≥ A then (B,ξ) ∈ D_+φ(x).

Definition II.1:

u ∈ C(O) is said to be a viscosity subsolution (resp. supersolution, resp. solution) of (9) if u satisfies:

(11)  ∀ x ∈ T_+, ∀ (A,ξ) ∈ D_+u(x):  H(A, ξ, u(x), x) ≤ 0

(resp. (12)  ∀ x ∈ T_−, ∀ (A,ξ) ∈ D_−u(x):  H(A, ξ, u(x), x) ≥ 0,

resp. (11) and (12)).

Remark II.2:

This definition is the extension to second-order equations of the notion of viscosity solutions of first-order Hamilton-Jacobi equations introduced in M.G. Crandall and P.L. Lions [7], [8]. It is also related to some notions of accretivity considered by L.C. Evans [10], [11] (see also M.G. Crandall, L.C. Evans and P.L. Lions [9]); and to a notion introduced for linear elliptic equations by E. Calabi [6].

Let us give without proof a few elementary results:

Proposition II.1:

The following are equivalent for any u ∈ C(O):

i) u is a viscosity solution of (9);

ii) u satisfies for all φ ∈ C²(O):

(13)  H(D²φ(x), Dφ(x), u(x), x) ≤ 0 at any local maximum x of u − φ,

(14)  H(D²φ(x), Dφ(x), u(x), x) ≥ 0 at any local minimum x of u − φ.
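The characterization above is easy to exercise numerically in the simplest first-order case, with the Hamiltonian H(p) = |p| − 1 from the Crandall-Lions theory cited above. The sketch below (grid, quadratic test functions and tolerances are our own choices, not from the paper) checks (13)-(14) at discrete extrema of u − φ: u(x) = −|x| passes both tests, while u(x) = |x| fails the supersolution test at its kink.

```python
# Numerical illustration of the test-function characterization
# (13)-(14) for H(p) = |p| - 1 (viscosity solutions of |u'| = 1).
# u1(x) = -|x| IS a viscosity solution; u2(x) = |x| is NOT.

def H(p):
    return abs(p) - 1.0

def check(u, xs, phis, gtol=0.05):
    """Do (13)/(14) hold at every discrete local extremum of u - phi?

    gtol absorbs the O(h) error of locating extrema on the grid."""
    sub_ok = super_ok = True
    for aa, bb in phis:                        # phi(x) = aa*x + bb*x^2
        vals = [u(x) - (aa * x + bb * x * x) for x in xs]
        for k in range(1, len(xs) - 1):
            dphi = aa + 2 * bb * xs[k]
            if vals[k] >= vals[k-1] and vals[k] >= vals[k+1]:
                sub_ok &= H(dphi) <= gtol      # (13) at a local max
            if vals[k] <= vals[k-1] and vals[k] <= vals[k+1]:
                super_ok &= H(dphi) >= -gtol   # (14) at a local min
    return sub_ok, super_ok

xs = [-1.0 + k / 100.0 for k in range(201)]
phis = [(a / 4.0, b / 4.0) for a in range(-8, 9) for b in range(-8, 9)]

print(check(lambda x: -abs(x), xs, phis))   # → (True, True)
print(check(lambda x: abs(x), xs, phis))    # → (True, False)
```

The failure for |x| occurs exactly as the theory predicts: the constant test function φ ≡ 0 touches |x| from below at the kink, where H(φ'(0)) = −1 < 0.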

Remark II.3:

A similar result holds for viscosity subsolutions or supersolutions: (11) is equivalent to (13) and (12) is equivalent to (14). In addition we may take φ ∈ C^∞(O) in (13)-(14), we may replace "any" by "some", and we may restrict attention to strict extrema.

Proposition II.2 (Consistency):

i) If u ∈ C²(O) is a solution of (9) then u is a viscosity solution of (9).

ii) If u ∈ C(O) is a viscosity solution of (9), u is differentiable near x₀ ∈ O and u is twice differentiable at x₀, then we have:

H(D²u(x₀), Du(x₀), u(x₀), x₀) = 0.

A fundamental application of this notion is the following easy:

Proposition II.3 (Stability):

Let H_n be a sequence of functions satisfying (10) and converging uniformly on compact sets of L^N_S × R^N × R × O to some function H (then satisfying (10)). Let u_n be a sequence in C(O) of viscosity solutions of: H_n(D²u_n, Du_n, u_n, x) = 0 in O. We assume that u_n converges uniformly on compact sets to u ∈ C(O). Then u is a viscosity solution of (9).

Finally, in the case when we specialize H to correspond to the Hamilton-Jacobi-Bellman equation (6), the connection between the optimal stochastic control problem and the above considerations is illustrated by:

Theorem II.1 (Dynamic Programming):

Let u be the minimum cost function given by (5). If u ∈ C(O) then u is a viscosity solution of (6).

We briefly sketch the proof since we believe it to be enlightening: let x₀ ∈ T_− (for example), let (A,ξ) ∈ D_−u(x₀). It is an easy exercise to build φ ∈ C²(O) such that: u(x₀) = φ(x₀), ξ = Dφ(x₀), A = D²φ(x₀) and u ≥ φ in O. Writing now the mathematical formulation of the dynamic programming principle (see K. Itô [19], or next section): ∀ T > 0,

u(x₀) = inf_A E [ ∫₀^{τ_{x₀} ∧ T} f(y_{x₀}(t), v(t)) exp( −∫₀^t c(y_{x₀}(s), v(s)) ds ) dt + u(y_{x₀}(τ_{x₀} ∧ T)) exp( −∫₀^{τ_{x₀} ∧ T} c(y_{x₀}(t), v(t)) dt ) ],

and this yields:

sup_A (1/T) [ u(x₀) − E[ φ(y_{x₀}(τ_{x₀} ∧ T)) exp( −∫₀^{τ_{x₀} ∧ T} c(y_{x₀}(t), v(t)) dt ) ] − E ∫₀^{τ_{x₀} ∧ T} f(y_{x₀}(t), v(t)) exp( −∫₀^t c(y_{x₀}(s), v(s)) ds ) dt ] ≥ 0.

Using Itô's formula, we deduce easily:

sup_A (1/T) E ∫₀^{τ_{x₀} ∧ T} { A^{v(t)} φ(x₀) − f(x₀, v(t)) } dt ≥ −ε(T),

where ε(T) → 0 as T → 0+. And we conclude remarking that:

sup_A P[τ_{x₀} < T] ≤ sup_A P[ sup_{0≤t≤T} |y_{x₀}(t) − x₀| ≥ δ ] ≤ δ^{−2} sup_A E[ sup_{0≤t≤T} |y_{x₀}(t) − x₀|² ] ≤ (C/δ²) T for T ≤ 1,

with δ = dist(x₀, Γ) (we used at that point that the coefficients are bounded independently of v).
Combining Theorem II.1 and Proposition II.2 we deduce the:

Corollary II.1:

Let u be given by (5):

i) We have: ∀ v ∈ V, A^v u ≤ f(·,v) in D'(O);

ii) If u ∈ W^{2,p}_{loc}(O) for some p > N then we have:

(6')  sup_{v∈V} {A^v u(x) − f(x,v)} = 0  a.e. in O.

iii) If u belongs to the vector space generated by the cone X defined by:

X = { u ∈ C(O) / ∃ φ ∈ W^{2,p}_{loc}(O) (p > N), D²u ≤ D²φ in D'(O) },

then u ∈ W^{1,∞}_{loc}(O), D²u ∈ M_{loc}(O), and sup_{v∈V} (A^v u − f(·,v)) is a negative measure on O, singular with respect to the Lebesgue measure.

iv) If u ∈ X then: ∀ v ∈ V, h ≤ A^v u ≤ C for some h ∈ L^p_{loc}(O) (p > N), and (6') holds.

Remark II.4:

An example due to I.L. Genis and N.V. Krylov [18] shows that sup_{v∈V} {A^v u − f(·,v)} may be a nonzero measure on O.

Remark II.5:

ii) is a consequence of well-known differentiability properties of functions in W^{2,p}_{loc}(O) spaces (see E. Stein [46]); while iii), iv) are deduced from a differentiability theorem due to A.D. Alexandrov [1], H. Busemann [5]; see for more details P.L. Lions [27].

All the results mentioned in this section are detailed and proved in P.L. Lions [27].

III. Continuity of the minimum cost function:

As we just saw, the notion of viscosity solutions requires that some continuity of the value function is known. On the other hand, since we are dealing with possibly degenerate diffusion processes, the question whether the value function is continuous can be extremely difficult to settle (even in the case without control; see D.W. Stroock and S.R.S. Varadhan [47]). It turns out that there exists a natural assumption which gives quite general results: we assume that "the problem has a subsolution":

(15) There exist w bounded measurable on Ō and Γ₀ ⊂ Γ such that:

i) ∀ A, w(y_x(t ∧ τ_x)) exp( −∫₀^{t∧τ_x} c ds ) + ∫₀^{t∧τ_x} f exp( −∫₀^s c dσ ) ds is an F_t strong submartingale, for all x ∈ Ō;

ii) w = 0 on Γ₀, 1_{(τ_x<∞)} y_x(τ_x) ∈ Γ₀ a.s. ∀ A, ∀ x ∈ Ō, and lim inf_{dist(y,Γ₀)→0} w(y) ≥ 0.

Remark III.1:

To explain and motivate this complicated condition, let us make a few remarks:

i) If O = R^N or if, more generally, τ_x = +∞ a.s. ∀ x, A, then we choose Γ₀ = ∅, w = inf_{x,v} (f/c), and (15) holds. This is the case when σ ≡ 0 on Γ and −b(x,v)·n(x) ≥ α > 0 for all (x,v) ∈ Γ × V (where n(x) is the unit outward normal to ∂O).

ii) If there exists w ∈ W^{1,∞}(O) satisfying:

(16)  A^v w ≤ f(·,v) in D'(O), ∀ v ∈ V;  w = 0 on Γ;

then (15) holds.

This is the case for example when f(x,v) ≥ 0 (∀ x,v): take Γ₀ = Γ, w ≡ 0. It is also the case when all the processes "cross the boundary", that is, when we assume:

(17)  ∃ α > 0, ∀ (x,v) ∈ Γ × V: either (n(x), a(x,v) n(x)) ≥ α or b(x,v)·n(x) − (1/2) Σ_{i,j} a_{ij}(x,v) ∂_{ij} φ(x) ≥ α

(where φ(x) = dist(x,Γ)).

It is also possible to combine the two cases i) and ii) above.

Under assumption (15), we have the following:

Theorem III.1:

Under assumption (15), we have:

i) J(·,A), u(x) are u.s.c. on Ō for all A; u ≥ w, u ≥ u_ in Ō, where

u_(x) = inf_A E ∫₀^{τ'_x} f(y_x(t), v(t)) exp( −∫₀^t c(y_x(s), v(s)) ds ) dt,

and τ'_x is the first exit time of y_x(t) from Ō. In particular u ≥ 0 on Γ;

ii) 1_{(τ_x<∞)} u(y_x(τ_x)) = 0 a.s. for all A, x ∈ Ō;

iii) For all A and x ∈ Ō, u(y_x(t ∧ τ_x)) exp( −∫₀^{t∧τ_x} c ds ) + ∫₀^{t∧τ_x} f exp( −∫₀^s c dσ ) ds is an F_t strong submartingale. In particular, we have:

(18)  u(x) = inf_A E [ ∫₀^{θ∧τ_x} f(y_x(t), v(t)) exp( −∫₀^t c(y_x(s), v(s)) ds ) dt + u(y_x(θ ∧ τ_x)) exp( −∫₀^{θ∧τ_x} c(y_x(t), v(t)) dt ) ],

where θ is any stopping time (which may depend on A);

iv) In (5), the infimum may be restricted to admissible systems where the probability space (Ω, F, F_t, P) and the Brownian motion w_t are prescribed;

v) If w̃ ∈ W^{1,∞}_{loc}(O) satisfies: lim sup_{y→x} (w̃ − u)(y) ≤ 0, ∀ x ∈ Γ; A^v w̃ ≤ f(·,v) in D'(O), ∀ v ∈ V; then w̃(x) ≤ u(x) in Ō.

Remark III.2:

Let us mention that assumption (15) is discussed in P.L. Lions [26] for deterministic control problems, and that if (15) (or some analogue of (15)) does not hold, everything may happen.

Remark III.3:

This result improves various previous results obtained by J.L. Menaldi and P.L. Lions [37].

Corollary III.1:

Under assumption (15), the following assertions are equivalent:

i) u ∈ C(Ō); ii) u|_Γ ∈ C(Γ); iii) Γ₁ = {x ∈ Γ / u(x) = 0} is closed.

In particular, if u = 0 on Γ then u ∈ C(Ō) and u = u_ in Ō.

Corollary III.2:

If (15) holds and if we assume: ∃ γ ∈ (0,1],

(19)  |w(x)| ≤ C dist(x, Γ₁)^γ;

then, denoting by λ₀ the constant determined by the Lipschitz norms in x of σ and b (uniformly in v; its precise expression is given in [27]), we have: u ∈ C^{0,β}(Ō) with β = min(γ, λ/λ₀) if 0 < λ < λ₀; β = γ if λ > λ₀; β = γ if γ < 1, λ = λ₀; and β any exponent < 1 if γ = 1, λ = λ₀.

All these results are detailed and proved in P.L. Lions [27]: they extend various results obtained by J.L. Menaldi and P.L. Lions [37]. Let us finally mention that some extensions of (15) are given in [27].

IV. Uniqueness of viscosity solutions.

Again, the boundary and the possible degeneracy of the processes create technical difficulties. A typical result that we may obtain is the following:

Theorem IV.1:

Let u be given by (5). We assume:

(20)  u ∈ C_b(Ō), u = 0 on Γ;

(21)  ∀ ε > 0, ∃ w_ε ∈ C¹(O) ∩ C(Ō): ∀ v ∈ V, A^v w_ε ≤ f(·,v) + ε in O, |w_ε| ≤ ε on Γ.

Then, if ũ is a viscosity solution of (6) satisfying (20), we have:

ũ(x) = u(x), ∀ x ∈ Ō.

Remark IV.1:

In P.L. Lions [27], the proof of this result is given together with several variants or extensions. Let us mention that if O = R^N then conditions (20)-(21) are vacuous; let us also indicate that it is possible to replace (21) by: there exist Γ₁, Γ₂ relatively open subsets of Γ such that Γ = Γ̄₁ ∪ Γ̄₂ and on Γ₂: σ(x,v) = 0 (∀ v) and either b(x,v) = 0 (∀ v) or b(x,v)·n(x) ≤ −α < 0 (∀ v); and for each ε > 0 there exists w_ε ∈ C¹(O) satisfying A^v w_ε ≤ f(·,v) + ε in O, |w_ε| ≤ ε on Γ₁. Then we may replace (20) by: u ∈ C_b(Ō), u = 0 on Γ₁.

V. Regularity results.

We will use below the following formulation of the fact that the controlled processes really cross the boundary (or some part of it):

(22) There exist Γ₁, Γ₂ relatively open in Γ, disjoint, possibly empty, such that Γ = Γ̄₁ ∪ Γ̄₂ and:

∀ (x,v) ∈ Γ₂ × V: σ(x,v) = b(x,v) = 0;

∀ (x,v) ∈ Γ₁ × V: either (a(x,v) n(x), n(x)) ≥ α > 0, or ∂^j_x σ(x,v) = 0 for |j| ≤ 1 and b(x,v)·n(x) ≥ α > 0.

Theorem V.1:

Under assumption (22), there exists λ₁ ≥ 0, depending only on bounds on the derivatives of the coefficients, such that for λ (= inf_{x,v} c(x,v)) > λ₁:

(23)  u ∈ W^{1,∞}(O), u = 0 on Γ₁;

(24)  A^v u ∈ L^∞(O) for all v ∈ V, and sup_{v∈V} ‖A^v u‖_{L^∞(O)} < ∞;

(25)  sup_{v∈V} [A^v u − f(x,v)] = 0 a.e. in O;

(26)  ∂²u/∂X² ≤ C in D'(O), ∀ X ∈ R^N with |X| = 1.
Theorem V.2:

Let ω be an open subset of O. Let p ∈ {1,...,N}.

i) Under the assumptions of Theorem V.1 and if we have:

(27)  ∃ ν > 0, ∀ x ∈ ω, ∃ m ≥ 1, ∃ v₁,...,v_m ∈ V, ∃ θ₁,...,θ_m ∈ ]0,1[ with Σ_{k=1}^m θ_k = 1, such that: Σ_{k=1}^m θ_k (a(x,v_k) ξ, ξ) ≥ ν Σ_{i=1}^p ξ_i² for all ξ ∈ R^N,

then ∂_{ij}u ∈ L^∞(ω) for 1 ≤ i, j ≤ p.

ii) Under assumption (15) and if we have:

(28)  inf { (a(x,v) ξ, ξ) / v ∈ V, ξ ∈ R^N, |ξ| = 1 } > 0, ∀ x ∈ ω,

then u ∈ C^{2,δ}(ω') for any open set ω' ⊂⊂ ω, where δ ∈ (0,1) depends only on ω' and {a(x,v)}_{x,v}.

Theorem V.3 (uniqueness):

i) Under assumption (15), if w satisfies: w ∈ W^{1,∞}_{loc}(O), A^v w ≤ f(·,v) in D'(O), ∀ v ∈ V; lim sup_{y→x} w(y) ≤ 0, ∀ x ∈ Γ; then w ≤ u in Ō.

ii) If u ∈ C(Ō) and u = 0 on Γ, then for any w satisfying:

(23')  w ∈ W^{1,∞}_{loc}(O) ∩ C(Ō), w = 0 on Γ;

(24')  A^v w ∈ L^∞_{loc}(O), ∀ v ∈ V; sup_{v∈V} ‖A^v w‖_{L^∞(ω)} < ∞ for any bounded open set ω ⊂⊂ O;

(25')  sup_{v∈V} [A^v w(x) − f(x,v)] = 0 a.e. in O;

(26')  ∃ g ∈ L^∞_{loc}(O), ∂²w/∂X² ≤ g in D'(O), ∀ X ∈ R^N with |X| = 1;

we have: w ≡ u in Ō.
Remark V.1:

It is easily seen on simple examples that those results are essentially optimal: in particular, the assumption λ > λ₁ is in general necessary (see I.L. Genis and N.V. Krylov [18]). Nevertheless, in the uniformly nondegenerate case, i.e. when the matrices a(x,v) are positive definite uniformly in (x,v), λ > 0 is enough (this is proved in P.L. Lions [28], L.C. Evans and P.L. Lions [15]); and if O is bounded, then the precise range of admissible λ in R such that u solves (25) is given in P.L. Lions [29].

Remark V.2:

Many variants of Theorems V.1-3 are given in [27]. Let us just mention that if O = R^N, (22) becomes vacuous.

Remark V.3:

Previous regularity results (included in Theorems V.1-3) are given in N.V. Krylov [23], [24], [25], [21]; H. Brezis and L.C. Evans [4]; L.C. Evans and A. Friedman [14]; M.V. Safonov [44], [45]. The nondegenerate case is treated in [28], [15] by purely p.d.e. methods. The case O = R^N was proved independently by N.V. Krylov [22] and P.L. Lions [30], [31] by slightly different probabilistic methods. Those results are taken from P.L. Lions [27] (see also [32], [33], [34]), and part ii) of Theorem V.2 is an easy consequence of deep regularity results due to L.C. Evans [12], [13].

VI. Related problems:

We indicate briefly a list of problems where the above results and methods are used.

1. Similar problems: Let us mention that similar results hold for problems where i) there is a pay-off when the process reaches Γ; or ii) the coefficients are unbounded in x; or iii) we add optimal stopping time problems; or iv) we consider time-dependent diffusion processes and time-dependent HJB equations. Analogous results hold and are proved by similar methods; see P.L. Lions [27] for more details. Using similar methods, the case of impulse control problems has been treated by B. Perthame [42].

2. Numerical approximation: We refer the interested reader to P.L. Lions and B. Mercier [38], J.P. Quadrat [43].
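As an illustration of what such numerical work looks like, here is a minimal monotone finite-difference sketch for a one-dimensional stationary HJB equation sup_v {A^v u − f(·,v)} = 0 on O = (0,1), u = 0 on the boundary, with A^v u = −(1/2) a(v) u'' − b(v) u' + λu. The model data and the generic upwind scheme are our own illustrative choices, not the methods of [38] or [43]; note that v = 0 is a degenerate control (a(0) = 0).

```python
# Monotone upwind finite differences + Gauss-Seidel fixed-point sweeps
# for sup_v {A^v u - f(.,v)} = 0 on (0,1) with u(0) = u(1) = 0.
# All model data below are invented for illustration.

N = 21                       # grid points
h = 1.0 / (N - 1)
lam = 1.0                    # discount (= inf c > 0)
controls = [0.0, 1.0]

def a(v): return 0.5 * v          # diffusion coefficient (0 for v = 0)
def b(v): return 1.0 - v          # drift
def f(x, v): return x + 0.2 * v   # running cost

def bellman(u, i):
    """Discrete Bellman operator at interior node i (upwind, monotone)."""
    x = i * h
    vals = []
    for v in controls:
        cp = 0.5 * a(v) / h**2 + max(b(v), 0.0) / h
        cm = 0.5 * a(v) / h**2 + max(-b(v), 0.0) / h
        vals.append((f(x, v) + cp * u[i+1] + cm * u[i-1]) / (lam + cp + cm))
    return min(vals)              # u is a minimum cost: min over v

u = [0.0] * N
for sweep in range(20000):        # Gauss-Seidel sweeps (contraction)
    diff = 0.0
    for i in range(1, N - 1):
        new = bellman(u, i)
        diff = max(diff, abs(new - u[i]))
        u[i] = new
    if diff < 1e-10:
        break

print(sweep, round(max(u), 4))
```

The discretization is a contraction because λ > 0, so the sweeps converge to the unique discrete solution; the `min` over controls reflects the value-function form of the equation.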

3. Bifurcation and optimal stochastic control: In P.L. Lions [29], some bifurcation phenomena are interpreted in terms of optimal stochastic control. In addition, the analogues of eigenvalues are introduced for HJB equations.

4. Differential games: The main open questions in stochastic differential games concern the uniqueness result and regularity results. We hope to return to these questions in a future study. Let us also mention that deterministic differential games are covered by the results in M.G. Crandall and P.L. Lions [7], [8]; P.L. Lions [26].

5. Nonlinear semi-groups: It is possible to adapt, with the methods of sections II-IV, the method of M. Nisio [41] to build nonlinear semi-groups associated with HJB equations. Then the results of section IV easily yield a very general uniqueness result concerning those nonlinear semi-groups (see P.L. Lions and M. Nisio [39]).

6. Asymptotic problems: The preceding results and methods can be used in the proofs of various asymptotic results such as the simplification of large-scale systems (see R. Jensen and P.L. Lions [20]) and the homogenization of optimal stochastic control problems (see P.L. Lions, G. Papanicolaou and S.R.S. Varadhan [40]).

7. Monge-Ampère equations: The results and methods given above have recently enabled us to solve the classical Monge-Ampère equations (arising in differential geometry); see P.L. Lions [35], [36]. The remark that Monge-Ampère equations are indeed HJB equations is due to B. Gaveau [17].

Bibliography:

[1] A.D. ALEXANDROV: Almost everywhere existence of the second differential of a convex function and some properties of convex surfaces connected with it. Ucen. Zap. Leningrad Gos. Univ. No. 37 (Ser. Mat., vyp. 6) (1939), p. 3-35 (in Russian).

[2] R. BELLMAN: Dynamic programming. Princeton Univ. Press, Princeton, N.J. (1957).

[3] A. BENSOUSSAN and J.L. LIONS: Applications des inéquations variationnelles en contrôle stochastique. Dunod, Paris (1978).

[4] H. BREZIS and L.C. EVANS: A variational approach to the Bellman-Dirichlet equation for two elliptic operators. Arch. Rat. Mech. Anal., 71 (1979), p. 1-14.

[5] H. BUSEMANN: Convex surfaces. Interscience, New York (1958).

[6] E. CALABI: Improper affine hyperspheres of convex type and a generalization of a theorem by K. Jörgens. Michigan Math. J., 5 (1958), p. 105-126.

[7] M.G. CRANDALL and P.L. LIONS: Condition d'unicité pour les solutions généralisées des équations de Hamilton-Jacobi du 1er ordre. Comptes-Rendus Paris, 292 (1981), p. 183-186.

[8] M.G. CRANDALL and P.L. LIONS: Viscosity solutions of Hamilton-Jacobi equations. To appear in Trans. Amer. Math. Soc.

[9] M.G. CRANDALL, L.C. EVANS and P.L. LIONS: Some properties of viscosity solutions of Hamilton-Jacobi equations. Preprint.

[10] L.C. EVANS: A convergence theorem for solutions of nonlinear second-order elliptic equations. Ind. Univ. Math. J., 27 (1978), p. 875-887.

[11] L.C. EVANS: On solving certain nonlinear partial differential equations by accretive operator methods. Israel J. Math., 36 (1981), p. 225-247.

[12] L.C. EVANS: Classical solutions of fully nonlinear, convex, second-order elliptic equations. To appear in Comm. Pure Appl. Math.

[13] L.C. EVANS: Classical solutions of the Hamilton-Jacobi-Bellman equation for uniformly elliptic operators. Preprint.

[14] L.C. EVANS and A. FRIEDMAN: Optimal stochastic switching and the Dirichlet problem for the Bellman equation. Trans. Amer. Math. Soc., 253 (1979), p. 365-389.

[15] L.C. EVANS and P.L. LIONS: Résolution des équations de Hamilton-Jacobi-Bellman pour des opérateurs uniformément elliptiques. Comptes-Rendus Paris, 290 (1980), p. 1049-1052.

[16] W.H. FLEMING and R. RISHEL: Deterministic and stochastic optimal control. Springer, Berlin (1975).

[17] B. GAVEAU: Méthodes de contrôle optimal en analyse complexe. I. Résolution d'équations de Monge-Ampère. J. Funct. Anal., 25 (1977), p. 391-411.

[18] I.L. GENIS and N.V. KRYLOV: An example of a one-dimensional controlled process. Th. Proba. Appl., 21 (1976), p. 148-152.

[19] K. ITÔ: personal communication.

[20] R. JENSEN and P.L. LIONS: Some asymptotic problems in optimal stochastic control and fully nonlinear elliptic equations. Preprint.

[21] N.V. KRYLOV: Control of diffusion type processes. Moscow (1979) (in Russian).

[22] N.V. KRYLOV: Control of the diffusion type processes. Proceedings of the International Congress of Mathematicians, Helsinki, 1978.

[23] N.V. KRYLOV: On control of the solution of a stochastic integral equation. Th. Proba. Appl., 17 (1972), p. 114-137.

[24] N.V. KRYLOV: On control of the solution of a stochastic integral equation with degeneration. Math. USSR Izv., 6 (1972), p. 249-262.

[25] N.V. KRYLOV: On equations of minimax type in the theory of elliptic and parabolic equations in the plane. Math. USSR Sbornik, 10 (1970), p. 1-19.

[26] P.L. LIONS: Generalized solutions of Hamilton-Jacobi equations. Pitman, London (1982).

[27] P.L. LIONS: Optimal control of diffusion processes and Hamilton-Jacobi-Bellman equations. To appear.

[28] P.L. LIONS: Résolution analytique des problèmes de Bellman-Dirichlet. Acta Math., 146 (1981), p. 151-166.

[29] P.L. LIONS: Bifurcation and optimal stochastic control. Preprint; see also MRC report, University of Wisconsin-Madison (1982).

[30] P.L. LIONS: Contrôle de diffusions dans R^N. Comptes-Rendus Paris, 288 (1979), p. 339-342.

[31] P.L. LIONS: Control of diffusion processes in R^N. Comm. Pure Appl. Math., 34 (1981), p. 121-147.

[32] P.L. LIONS: Equations de Hamilton-Jacobi-Bellman dégénérées. Comptes-Rendus Paris, 289 (1979), p. 329-332.

[33] P.L. LIONS: Equations de Hamilton-Jacobi-Bellman. In Séminaire Goulaouic-Schwartz 1979-1980, Ecole Polytechnique, Palaiseau.

[34] P.L. LIONS: Optimal stochastic control and Hamilton-Jacobi-Bellman equations. In Mathematical Control Theory, Banach Center Publications, Warsaw.

[35] P.L. LIONS: Une méthode nouvelle pour l'existence de solutions régulières de l'équation de Monge-Ampère. Comptes-Rendus Paris, 293 (1981), p. 589-592.

[36] P.L. LIONS: Sur les équations de Monge-Ampère. I, II and III. Preprint.

[37] P.L. LIONS and J.L. MENALDI: Optimal control of stochastic integrals and Hamilton-Jacobi-Bellman equations. I, II. SIAM J. Control Optim., 20 (1982), p. 58-95.

[38] P.L. LIONS and B. MERCIER: Approximation numérique des équations de Hamilton-Jacobi-Bellman. R.A.I.R.O., 14 (1980), p. 369-393.

[39] P.L. LIONS and M. NISIO: A uniqueness result for the semi-group associated with the Hamilton-Jacobi-Bellman operator. Preprint.

[40] P.L. LIONS, G. PAPANICOLAOU and S.R.S. VARADHAN: Work in preparation.

[41] M. NISIO: On stochastic optimal controls and envelope of Markovian semi-groups. Proc. Intern. Symp. SDE, Kyoto (1976).

[42] B. PERTHAME: Thèse de 3e cycle, Université P. et M. Curie, 1982-83.

[43] J.P. QUADRAT: Analyse numérique de l'équation de Bellman stochastique. Rapport Laboria no. 140 (1975), INRIA, Rocquencourt.

[44] M.V. SAFONOV: On the Dirichlet problem for Bellman's equation in a plane domain. Math. USSR Sbornik, 31 (1977), p. 231-248.

[45] M.V. SAFONOV: On the Dirichlet problem for Bellman's equation in a plane domain. II. Math. USSR Sbornik, 34 (1978), p. 521-526.

[46] E. STEIN: Singular integrals and differentiability properties of functions. Princeton Univ. Press, Princeton (1970).

[47] D.W. STROOCK and S.R.S. VARADHAN: On degenerate elliptic-parabolic operators of second order and their associated diffusions. Comm. Pure Appl. Math., 25 (1972), p. 651-713.

[48] S.R.S. VARADHAN: personal communication.


ON REDUCING THE DIMENSION OF CONTROL PROBLEMS
BY DIFFUSION APPROXIMATION

Petr Mandl
Department of Probability and Mathematical Statistics, Charles University
Sokolovská 83, 186 00 Prague 8, Czechoslovakia

1. The control problem

The topic was dealt with by several authors in Prague ([2] - [5]). They apply a martingale method, explained here on a specific model.

Let {X_t, t ≥ 0} be a controlled Markov process with finite state space I whose dynamics is defined by the transition rates

r q(i,j;z), i ≠ j,   q(i,i;z) = − Σ_{j≠i} q(i,j;z), i ∈ I.

z denotes the control parameter taking values in a set J; r is a time scale parameter. We shall study the limit behaviour of the model as r → ∞. Let the reward arise from the process in the following way. The reward rate in state i is a random variable with distribution function rF_i(y). The rate is constant as long as the system stays in i. After the transition to the next state, say j, a new reward rate realizes independently according to the distribution function rF_j(y), etc. We assume

∫ y d rF_i(y) = a_i,   ∫ y² d rF_i(y) = r b_i,   ∫ y⁴ d rF_i(y) = O(r²).

Let Y_t, Z_t denote the reward rate and the control parameter at time t, respectively. Further, let C_t be the capital accumulated up to time t,

C_t = C_0 + ∫₀^t Y_s ds, t ≥ 0.

The initial capital C_0 is nonrandom.
If the aim is to maximize EC_T, the expected capital at time T, 0 < T < ∞, the controller can limit himself to the controls

(1)  Z_t = u(t, X_t^-), t ∈ [0,T],

where u is a mapping from [0,T] × I to J. {X_t^-, t ≥ 0} denotes the left continuous version of {X_t, t ≥ 0}. However, if risk sensitivity is desirable, the optimality problem is usually formulated as

(2)  E g(C_T) = max,

where g(y) is a given function. For g(y) = −exp{−hy}, with constant h > 0, the controls (1) are a sufficiently broad class. For general g(y) the controls of the form

(3)  Z_t = ū(t, C_t, X_t^-), t ∈ [0,T],

are to be used.

Similarly, when maximizing the expected discounted reward

(4)  E ∫₀^∞ e^{−λt} d C_t,  λ > 0,

the controls

Z_t = u(X_t^-), t ≥ 0,

suffice. If the risk of ruin is to be taken into account, (4) is changed into

(5)  E ( ∫₀^τ e^{−λt} d C_t − h e^{−λτ} ),

where

τ = inf {t : C_t < 0}

is the time of the ruin, and h a positive constant. To maximize (5), more general controls

Z_t = ū(C_t, X_t^-), t ≥ 0,

are needed.

A simplification is achieved in the diffusion approximation as r → ∞. The original problem reduces to the optimization of a one-dimensional process {C̄_t, t ≥ 0} satisfying

d C̄_t = θ(U_t) dt + σ(U_t) d W_t, t ≥ 0,

where {W_t, t ≥ 0} is a Wiener process, and the control U_t takes values in the set U of stationary controls u, represented by mappings u(i) from I to J.
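To see this approximation at work, one can simulate the pre-limit process directly. The sketch below uses a two-state uncontrolled chain with invented rates and reward moments (not data from the paper); the Monte Carlo mean and variance of C_T should be close to θT and σ²T, where θ and σ² are the stationary averages of the systems (6) and (7) of the next section.

```python
import math
import random
import statistics

# Monte Carlo sketch of the diffusion approximation: two states
# I = {0,1}, transition rates r*q01 and r*q10, reward rate in state i
# drawn once per sojourn with mean a_i and second moment r*b_i.
# All numerical data are invented for illustration.

random.seed(1)
r = 200.0
q01, q10 = 1.0, 2.0
a = [0.0, 1.0]        # mean reward rates
b = [1.0, 1.0]        # second-moment coefficients
T = 1.0

# limiting drift and variance (stationary averages of (6), (7))
pi0, pi1 = q10 / (q01 + q10), q01 / (q01 + q10)
theta = pi0 * a[0] + pi1 * a[1]
sigma2 = (pi0 * q01 + pi1 * q10) * (b[0] / q01**2 + b[1] / q10**2)

def capital():
    t, i, C = 0.0, 0, 0.0
    while t < T:
        rate = r * (q01 if i == 0 else q10)
        tau = random.expovariate(rate)                # sojourn length
        y = random.gauss(a[i], math.sqrt(r * b[i] - a[i] ** 2))
        C += y * min(tau, T - t)                      # reward rate * time
        t += tau
        i = 1 - i                                     # jump to the other state
    return C

samples = [capital() for _ in range(1000)]
print(round(statistics.mean(samples), 2), round(statistics.variance(samples), 2))
# mean should be near theta*T, variance near sigma2*T
```

The reward-rate fluctuations (variance of order r) dominate the variance of C_T in this scaling, which is why σ² involves only the b_i and the sojourn rates.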

The model considered here is a simple generalization of the service system investigated in [5] (see Example 1). In [2], [3] discrete time controlled finite Markov chains with criterion (5) are treated. [3] introduces the aggregation of Markov chains (see Example 2). In [4] various kinds of the coupling of two Markov chains with different time scales are studied.

Example 1. V. Linský ([5]) considers {X_t, t ≥ 0} to be the model of an M/M/1/1 service system to which customers of n types arrive for service. The type i customers arrive with rate r q_i, and have the service completion rate r d_i. X_t = 0 means that the system is idle; X_t = i means that a customer of type i is in service. The random reward rates introduced above are the payments of the customers for service. The control consists in deciding which types of customers will be accepted. The blocking of the line by unprofitable customers is to be avoided. Thus, z = (z_1,...,z_n), where z_i = 1 or 0 says whether the type i customers are accepted or not. Hence, the rates are

q(0,i;z) = z_i q_i,   q(0,0;z) = − Σ_{j≠0} z_j q_j,

q(i,0;z) = d_i = −q(i,i;z), i ≠ 0,

q(i,j;z) = 0, 0 ≠ i ≠ j ≠ 0.

For the diffusion approximation one has

θ(z) = ( Σ_{j≠0} z_j q_j a_j / d_j ) / ( 1 + Σ_{j≠0} z_j q_j / d_j ),

σ(z)² = 2 ( Σ_{j≠0} z_j q_j b_j / d_j² ) / ( 1 + Σ_{j≠0} z_j q_j / d_j ).

Since the decisions are effective only in state 0, stationary controls u can be identified with control parameter values z. The goal is to maximize (5).
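For small n the best acceptance vector can be found by direct enumeration of the 2ⁿ values of z. The sketch below evaluates θ(z) and σ(z)² from the formulas above and picks the maximizer of the risk-averse criterion h θ − (h²/2) σ² of Example 3 below; all numerical data (q_i, d_i, a_i, b_i, h) are invented for illustration.

```python
from itertools import product

# Enumerate the acceptance vectors z in {0,1}^n for the M/M/1/1 system
# of Example 1; all numerical data are invented.

n = 3
q = [0.8, 1.0, 1.2]   # arrival rates of the n customer types
d = [1.0, 0.5, 2.0]   # service completion rates
a = [1.0, 2.0, 0.7]   # mean payment rates
b = [0.5, 4.0, 0.2]   # second moments of the payment rates

def theta(z):
    s = sum(zj * qj / dj for zj, qj, dj in zip(z, q, d))
    return sum(zj * qj * aj / dj
               for zj, qj, aj, dj in zip(z, q, a, d)) / (1 + s)

def sigma2(z):
    s = sum(zj * qj / dj for zj, qj, dj in zip(z, q, d))
    return 2 * sum(zj * qj * bj / dj**2
                   for zj, qj, bj, dj in zip(z, q, b, d)) / (1 + s)

h = 1.0   # risk aversion: maximize h*theta(z) - h^2/2 * sigma2(z)
best = max(product((0, 1), repeat=n),
           key=lambda z: h * theta(z) - 0.5 * h**2 * sigma2(z))
print(best, round(theta(best), 3), round(sigma2(best), 3))
# → (1, 0, 1) 0.508 0.383
```

With these data the type-2 customers, although they pay the most on average, are rejected because the variance of their payments (b_2 = 4) makes the risk-adjusted criterion negative.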

Example 2. Consider an aggregate of n independent processes of the kind defined in the introduction. Let its trajectory be

X_t = (1X_t, ..., nX_t), t ≥ 0.

The total capital at time t equals

C_t = Σ_k kC_t.

To maximize E g(C_T), one has to employ in general the controls

kZ_t = kū(t, C_t, X_t^-), t ∈ [0,T], k = 1,...,n.

Again, it is advantageous to simplify the problem by means of the diffusion approximation. This can be done provided that

na_i → a_i, nb_i → b_i, i ∈ I, as r → ∞.

2. Diffusion approximation

Theorem. Let J ⊂ R^m be closed and bounded, and let q(i,j;z), i,j ∈ I, be continuously differentiable. For each u ∈ U let the matrix

‖ q(i,j;u(i)) ‖_{i,j ∈ I}

be indecomposable. Assume (3) with ū(t,y,i) having bounded derivatives. Then, as r → ∞, the probability distribution of {C_t, t ∈ [0,T]} converges weakly to the probability distribution of a diffusion process {C̄_t, t ∈ [0,T]} satisfying

d C̄_t = θ(U_t) dt + σ(U_t) d W_t, t ∈ [0,T],

where

U_t = ū(t, C̄_t, ·), t ∈ [0,T].

The coefficients θ(u), σ(u) are obtained by solving

(6)  a_i + Σ_j q(i,j;u(i)) w(j,u) − θ(u) = 0, i ∈ I,

for the unknowns θ(u), w(j,u), j ∈ I, and

(7)  Σ_{j≠i} q(i,j;u(i)) { b_i / q(i,i;u(i))² + b_j / q(j,j;u(j))² + w₂(j,u) } + q(i,i;u(i)) w₂(i,u) − σ(u)² = 0, i ∈ I,

for the unknowns σ(u)², w₂(j,u), j ∈ I.

The proof of the Theorem will be briefly sketched. Let u be a stationary control. Solve (6), which is a known system of equations defining the average reward θ(u) and auxiliary constants w(j,u) for the stationary control u in a controlled Markov process with transition rates q(i,j;z) and reward rates a_i (see e.g. [1]). Further, let

(8)  w(i,y,u) = w(i,u)/r − (y − a_i) / ( r q(i,i;u(i)) ).

Then

(9)  y + r Σ_{j≠i} q(i,j;u(i)) ∫ w(j,s,u) d rF_j(s) + r q(i,i;u(i)) w(i,y,u) − θ(u) = 0, i ∈ I.

Let (3) hold. Set

w(t) = w(X_t, Y_t, ū(t, C_t, ·)),   θ_t = θ(ū(t, C_t, ·)).

Introduce the counting process of the state changes in X, {N_t, t ≥ 0}. Using (9) it can be shown that the process

(10)  M_t = C_t − ∫₀^t θ_s ds + ∫₀^t (w(s) − w(s−)) d N_s = C_t − ∫₀^t θ_s ds + w(t) − w(0) − ∫₀^t (d w(s)/ds) ds, t ≥ 0,

is a martingale. Namely, from (9) it follows that

∫₀^t θ_s ds − C_t, t ≥ 0,

is the compensator of

∫₀^t (w(s) − w(s−)) d N_s, t ≥ 0.

(8) indicates that

w(t) = O(r^{−1/2}).

Hence it is derived that

{M_t, t ≥ 0}  and  {C_t − ∫₀^t θ_s ds, t ≥ 0}

have identical limiting distributions.


The second part of the proof consists in verifying that the limiting quadratic variation of M is

<M>_t = ∫₀^t σ(ū(s, C_s, ·))² ds, t ≥ 0.

To do so, one in fact repeats the above reasoning. It holds

<M>_t = ∫₀^t (w(s) − w(s−))² d N_s, t ≥ 0,

because only the jumps, and not the absolutely continuous terms in (10), contribute to the quadratic variation.

Further, one solves for stationary controls u

(11)  r Σ_{j≠i} q(i,j;u(i)) ∫ [ (w(j,s,u) − w(i,y,u))² + w₂(j,s,u) ] d rF_j(s) + r q(i,i;u(i)) w₂(i,y,u) − σ(u)² = 0, i ∈ I.

w₂(j,y,u), j ∈ I, are a new set of auxiliary functions. Introducing

w₂(t) = w₂(X_t, Y_t, ū(t, C_t, ·)),   σ_t² = σ(ū(t, C_t, ·))²,

we define a martingale

L_t = ∫₀^t (w(s) − w(s−))² d N_s − ∫₀^t σ_s² ds + w₂(t) − w₂(0) − ∫₀^t (d w₂(s)/ds) ds, t ≥ 0.

The last three terms are negligible, since again

w₂(j,y,u) = O(r^{−1}), r → ∞.

Moreover,

E L_t² ≈ E ∫₀^t (w(s) − w(s−))⁴ d N_s → 0, r → ∞.

We conclude that

<M>_t → ∫₀^t σ(ū(s, C_s, ·))² ds, t ≥ 0.

We have established that in the limit

C_t − ∫₀^t θ(ū(s, C_s, ·)) ds, t ≥ 0,
is a martingale with quadratic variation
∫₀^t σ(ū(s, C_s, ·))² ds, t ≥ 0.

Consequently,

d C_t = θ(ū(t, C_t, ·)) dt + σ(ū(t, C_t, ·)) d W_t, t ≥ 0,

for a Wiener process {W_t, t ≥ 0}. Integrating (11) with respect to d rF_i(y) and letting r → ∞, the equations (7) for σ(u)² are obtained. The analogy between (6) and (7) is apparent.

Example 3 . L e t us e x a m i n e , what s o l u t i o n i s o b t a i n e d from the


diffusion a p p r o x i m a t i o n f o r the most common r i s k averting c r i t e r i o n

E exp ~ - h ~TJ = min,


where h ~ 0 . Assume t h a t u c o n s t a n t i s used on ~T-~ tT~ .
Then the i n c r e m e n t CT-CT. ~ i s Gausstan, and
h2
E exp ~-hCCT-~T. ~ )~ = exp C-h eCu)~ ÷ ~ ~(u) Z ~ o

From here i t is seen t h a t the o p t i m a l control is constant,

Ut = u, t O,T ,
where u is the m a x i m i z e r o f
h2
h 9(u) --,~-- (~(u) 2 = ~ (u).

Combining (6) and (7), we get for $\varphi(u)$ the system of equations

$$d(i,u(i)) + \sum_j q(i,j;u(i))\,w(j,u) - \varphi(u) = 0, \qquad i \in I,$$

with

$$d(i,z) = h\,a_i + \frac{h^2}{2} \sum_{j \neq i} q(i,j;z) \left(\frac{b_i}{q(i,i;z)} + \frac{b_j}{q(j,j;z)}\right)^2.$$

Thus, $u$ can be found by standard methods of dynamic programming, e.g. by Howard's iteration method.
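Howard's iteration for an average-reward problem of this type can be sketched as follows. The example below is our own illustration (not from the paper): for a hypothetical two-state chain with transition rates $q(i,j;z)$ and rewards $d(i,z)$, it alternates solving $d(i,u(i)) + \sum_j q(i,j;u(i))\,w(j) - \varphi = 0$ (policy evaluation, with the normalization $w(0)=0$) and maximizing $d(i,z) + \sum_j q(i,j;z)\,w(j)$ statewise (policy improvement).

```python
import numpy as np

# Hypothetical data: 2 states, 2 actions; q[z] is the generator matrix
# (rows sum to zero) under action z, d[i, z] the reward in state i under z.
q = [np.array([[-1.0, 1.0], [2.0, -2.0]]),
     np.array([[-3.0, 3.0], [0.5, -0.5]])]
d = np.array([[1.0, 0.8],
              [0.2, 0.9]])
n = 2

def evaluate(u):
    """Solve d(i,u(i)) + sum_j q(i,j;u(i)) w(j) - phi = 0 with w(0) = 0."""
    Q = np.array([q[u[i]][i] for i in range(n)])
    r = np.array([d[i, u[i]] for i in range(n)])
    A = np.zeros((n, n))
    A[:, 0] = 1.0          # coefficient of the unknown phi
    A[:, 1:] = -Q[:, 1:]   # coefficients of w(1), ..., w(n-1)
    sol = np.linalg.solve(A, r)
    return sol[0], np.concatenate([[0.0], sol[1:]])

u = [0] * n
while True:
    phi, w = evaluate(u)
    # Improvement step: maximize d(i,z) + sum_j q(i,j;z) w(j) in each state.
    u_new = [int(np.argmax([d[i, z] + q[z][i] @ w for z in range(2)]))
             for i in range(n)]
    if u_new == u:
        break
    u = u_new

print(u, round(phi, 4))  # [0, 1] 0.9333
```

With these (made-up) rates the iteration stops after one improvement step; the fixed policy and gain $\varphi$ are the solution of the stated system.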

3. Optimization

Using the diffusion approximation the original problem was transformed into the optimization of a controlled Markov process $\{C_t,\ t \geq 0\}$ with differential generator

$$L^u = \tfrac12\,\sigma(u)^2 \frac{d^2}{dy^2} + \theta(u)\,\frac{d}{dy}.$$
With regard to (2), (5) the aim is to achieve

$$\sup_u E_y\, g(C_T) = V_1(T,y), \tag{12}$$

or

$$\sup_u E_y \Big( \int_0^\tau e^{-\lambda t}\, dC_t - h\,e^{-\lambda\tau} \Big) = \sup_u E_y \Big( \int_0^\tau e^{-\lambda t}\, \theta(U_t)\, dt - h\,e^{-\lambda\tau} \Big) = V_2(y), \tag{13}$$

where $\tau$ denotes the time of ruin. The subscript $y$ refers to the starting position of the process.


The dynamic programming equations for (12) and (13) are

$$\frac{\partial V_1}{\partial T} = \sup_{u \in U} L^u V_1, \qquad V_1(y,0) = g(y), \tag{14}$$

and

$$0 = \sup_{u \in U} \{ L^u V_2 + \theta(u) - \lambda V_2 \}, \qquad V_2(0) = -h. \tag{15}$$

Let $\hat u(T,y)$ and $\tilde u(y)$ be the maximizers of the expression on the right-hand side of (14) and of (15), respectively. The corresponding optimal feedback controls are

$$U_t = \hat u(T-t, C_t), \quad t \in [0,T], \qquad \text{and} \qquad U_t = \tilde u(C_t), \quad t \geq 0. \tag{16}$$

It should however be noted that the controls (16) often do not fulfil the smoothness assumptions of the Theorem. Of course, they can be approximated arbitrarily closely by smooth controls. A method for solving (15) is presented in [3], [5].

For illustration let us return to Example 3. We have

$$V_1(T,y) = \inf_u E_y \exp\{-h\,C_T\} = \exp\big\{-h\,y - \big(h\,\theta(\hat u) - \tfrac{h^2}{2}\,\sigma(\hat u)^2\big)T\big\}.$$

Hence,

$$\frac{\partial V_1}{\partial T} = -\big(h\,\theta(\hat u) - \tfrac{h^2}{2}\,\sigma(\hat u)^2\big)V_1 = \inf_{u \in U} \big\{ -\big(h\,\theta(u) - \tfrac{h^2}{2}\,\sigma(u)^2\big)V_1 \big\} = \inf_{u \in U} L^u V_1, \qquad V_1(y,0) = \exp\{-h\,y\}.$$
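The last equality rests on the elementary identity $L^u e^{-hy} = (\tfrac{h^2}{2}\sigma(u)^2 - h\,\theta(u))\,e^{-hy}$, which a quick symbolic check confirms (our own sketch, with generic symbols standing in for $\theta(u)$ and $\sigma(u)$):

```python
import sympy as sp

y, h, theta, sigma = sp.symbols('y h theta sigma', positive=True)
V = sp.exp(-h * y)

# Generator L^u = (sigma^2/2) d^2/dy^2 + theta d/dy applied to e^{-h y}
LV = sigma**2 / 2 * sp.diff(V, y, 2) + theta * sp.diff(V, y)

factor = sp.simplify(LV / V)
# factor equals h^2 sigma^2 / 2 - h theta, as used in Example 3
print(sp.simplify(factor - (h**2 * sigma**2 / 2 - h * theta)) == 0)  # True
```

So $e^{-hy}$ is an eigenfunction of every $L^u$, and minimizing over $u$ reduces to maximizing $\varphi(u) = h\,\theta(u) - \tfrac{h^2}{2}\sigma(u)^2$.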

4. References

[1] R.A. Howard: Dynamic Programming and Markov Processes. Cambridge (Mass.) - New York, 1960.

[2] Pham Van Kieu: A diffusion approximation in the ruin problem for a controlled Markov chain. Kybernetika (Prague) 10 (1974), 125-132.

[3] P. Mandl: On aggregating controlled Markov chains. Contributions to Statistics (ed. J. Jurečková), pp. 137-156, Prague 1979.

[4] D. Kuklíková-Jarušková: Approximation of bivariate Markov chains by one-dimensional diffusion processes. Aplikace matematiky 23 (1978), 267-279.

[5] V. Lánský: Diffusion approximation for a controlled service system. Kybernetika (Prague) 18 (1982), to appear.
LIE ALGEBRAIC AND APPROXIMATION METHODS
FOR SOME NONLINEAR FILTERING PROBLEMS

S.I. Marcus*, C.-H. Liu**, and G.L. Blankenship***

ABSTRACT

State estimation problems for systems involving small parameters are treated by both Lie algebraic and analytical approximation techniques. An asymptotic expansion for the unnormalized conditional density corresponding to the case of observations of a Gauss-Markov process through a (weak) polynomial nonlinearity is computed and a convergence result is derived. The convergence result is based on arguments used recently to prove existence and uniqueness and to estimate the tail behavior of solutions to nonlinear filtering problems with unbounded coefficients. The expansion is related to certain approximations of the associated estimation Lie algebra. Lie algebraic methods are used to compute finite dimensional filters for the terms in the asymptotic expansion.

*Department of Electrical Engineering, University of Texas at Austin; Austin,


Texas. Supported in part by NSF Grant ECS-8022033.
**Department of Electrical Engineering, National Taiwan Institute of Technology,
Taipei, Taiwan. Supported in part by the Joint Services Electronics Program
at the University of Texas under Contract F49620-77-C-0101.

***Department of Electrical Engineering, University of Maryland, College Park,


Maryland. Supported in part by ONR Contract N00014-79-C-0808.
1. INTRODUCTION

This paper is concerned with the problem of estimating a signal $x_t$, $t \geq 0$, from noisy observations $y_s$, $s \leq t$, based on the model

$$dx_t = f(x_t)\,dt + g(x_t)\,dw_t$$
$$dy_t = h(x_t)\,dt + dv_t, \qquad 0 \leq t \leq T < \infty. \tag{1.1}$$

Here $f$, $g$, $h$ are smooth functions, $w_t$, $v_t$ are independent standard Wiener processes, and $x_0$ is a random variable independent of $(w_t, v_t)$ for all $t$. We assume that $x_t \in \mathbb{R}$, $y_t \in \mathbb{R}$ for convenience. It is well known that the problem of recursively estimating $x_t$ given $\mathcal{Y}_t = \sigma\{y(s),\ s \leq t\}$, the $\sigma$-algebra generated by the observations, involves in an essential way the (Ito) stochastic partial differential equation (the Duncan-Mortensen-Zakai (DMZ) equation) [1], [2], for the unnormalized conditional density $u(t,x)$:

$$du(t,x) = (\mathcal{A}^* u)(t,x)\,dt + h(x)\,u(t,x)\,dy_t$$
$$\mathcal{A}^* u = \tfrac12\,\partial_{xx}(g^2(x)u) - \partial_x(f(x)u) \tag{1.2}$$
$$u(0,x) = p_0(x), \text{ the density of } x_0.$$

Specifically, if (1.2) has a "nice" solution, then the conditional density of $x_t$ given $\mathcal{Y}_t$ is $p(t,x) = u(t,x)/\int u(t,z)\,dz$.

Over the past five years or so a considerable effort has been devoted to the search for (recursive) finite dimensional "representations" in terms of various "sufficient statistics" of either the solution of (1.2) or other conditional statistics (such as the conditional mean). This effort, which has produced few such estimators [3]-[6], has nevertheless led to a major improvement in the understanding of the estimation problem and to the tools available for its treatment. The latter include algebraic methods [7]-[10] and the use of certain transformations which simplify the analysis of the existence and uniqueness of solutions of (1.2) (see, e.g., [11]-[14] and the collection [1]). The algebraic methods are especially useful for classifying equivalences of finite dimensional filters, for indicating when no finite dimensional filters exist, and for facilitating the computation of conditional statistics. In those cases where no finite dimensional representations exist the available methods must be redirected to the construction of consistent and useful approximate filters. This is the central theme of this paper.

We consider in this paper the specific subclass of the problems (1.1) in which $f$ is linear, $g$ is constant, and $h$ is a polynomial of degree $k > 1$; for convenience of notation, we will usually consider the system:

$$dx_t = a\,x_t\,dt + dw_t$$
$$dy_t^\varepsilon = [x_t + \varepsilon (x_t)^k]\,dt + dv_t, \qquad k > 1 \tag{1.3}$$
$$y_0 = 0; \qquad p_0(x) \text{ Gaussian}$$

where $\varepsilon$ is a small positive parameter; as $\varepsilon \to 0$, we recover the Kalman filtering problem. As is well known, the Kalman filter computes, via a finite dimensional recursion, a sufficient statistic for the associated estimation problem.
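For the $\varepsilon = 0$ case the sufficient statistic is the conditional mean and variance, propagated by the Kalman recursion. A minimal discrete-time scalar sketch of this (our own illustration, not the continuous-time filter of the paper) is:

```python
import numpy as np

rng = np.random.default_rng(1)
A = 0.9          # state transition (discrete-time analogue of the drift a)
Q, R = 1.0, 1.0  # process and observation noise variances

x, m, P = 0.0, 0.0, 1.0   # true state, posterior mean, posterior variance
for _ in range(200):
    x = A * x + rng.standard_normal() * np.sqrt(Q)
    y = x + rng.standard_normal() * np.sqrt(R)
    # predict
    m, P = A * m, A * A * P + Q
    # update: the pair (m, P) is the finite dimensional sufficient statistic
    K = P / (P + R)
    m, P = m + K * (y - m), (1 - K) * P

# The posterior variance converges to the fixed point of the Riccati map.
riccati = lambda P: (A * A * P + Q) * R / (A * A * P + Q + R)
print(abs(riccati(P) - P) < 1e-6)  # True
```

The point of contrast with (1.3) is that here two numbers suffice at every time; for $\varepsilon \neq 0$ no such finite set of statistics exists in general.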

The filtering problem (1.3) does not in general admit a finite dimensional solution. However, it is reasonable to expect, as has been argued before [15]-[17], that its solution may be well approximated by the solution (conditional density, moments, etc.) of the Kalman problem when $\varepsilon$ is small. The structure of such an approximation is discussed first by means of the natural Lie algebraic structure associated with the operators $\mathcal{A}^* - \tfrac12 h^2$ and $h$ in (1.2) ($h$ acts as multiplication by $h(x)$), specialized to (1.3). The Lie algebra associated with (1.3) is infinite dimensional for $k \geq 2$. Approximation structures are defined by identifying certain finite dimensional projections of these Lie algebras by a "mod $\varepsilon^n$" device. This idea was used by Hazewinkel in [17], and we consider it further in Section 2.

This approximation structure corresponds to an analytical approach in which one simply assumes a formal power series in $\varepsilon$ for the unnormalized conditional density associated with (1.3), and obtains a formal asymptotic expansion by substituting this expansion into (1.2) and equating coefficients of like powers of $\varepsilon$ in the usual way. This simple procedure, used previously in [15], [16], reveals clearly that terms in the formal asymptotic expansion can be evaluated by computations not much more complex than those in the Kalman problem.

In Section 3, we use Lie algebraic methods (in particular, the Wei-Norman representation [18]-[20]) to derive recursive finite dimensional filters for the terms in the asymptotic expansion. This procedure shows clearly that the problem of whether the expansion is a true (not merely formal) asymptotic expansion is equivalent to that of existence, uniqueness, and regularity of the conditional density in the original problem. These points were also made in [15]; here, however, we are able to get a little further since existence and uniqueness questions for nonlinear filtering problems are now better understood than was the case four years ago (see, e.g., [11], [13], [14]). The existence of the asymptotic expansion in an appropriate norm is stated and discussed in Section 4.

A preliminary version of this work appeared in [27]. Related results have been obtained independently with totally different methods by Sussmann [28].
2. LIE ALGEBRAIC METHODS

A Lie algebra $L(\Sigma)$ can be associated with each filtering problem (1.1), (1.2), and it is now widely recognized that the realizability of $L(\Sigma)$ or its quotients with vector fields on a finite dimensional manifold is related to the existence of finite dimensional recursive filters (this idea is originally due to Brockett [7]). The problems (1.3) do not in general admit such realizations, suggesting that for these problems no statistic of the conditional density can be computed with a finite dimensional recursive filter.

Specifically, consider the problem (1.3). The unnormalized conditional density $u^\varepsilon$ satisfies (formally -- see [14] for results on existence and uniqueness)

$$du^\varepsilon(t,x) = \mathcal{A}^* u^\varepsilon(t,x)\,dt + (x + \varepsilon x^k)\,u^\varepsilon(t,x)\,dy_t^\varepsilon$$
$$u^\varepsilon(0,x) = p_0(x) \tag{2.1}$$
$$\mathcal{A}^* u = \tfrac12\,\partial_{xx} u - a\,\partial_x(xu)$$

or, in Fisk-Stratonovich form,

$$du^\varepsilon(t,x) = [\mathcal{A}^* - \tfrac12 (x + \varepsilon x^k)^2]\,u^\varepsilon(t,x)\,dt + (x + \varepsilon x^k)\,u^\varepsilon(t,x) \circ dy_t^\varepsilon$$
$$u^\varepsilon(0,x) = p_0(x) \tag{2.2}$$

where $\circ$ denotes the Fisk-Stratonovich differential. The associated Lie algebra

$$L_k^\varepsilon = \{\mathcal{A}^* - \tfrac12 (x + \varepsilon x^k)^2,\ x + \varepsilon x^k\}_{LA} \tag{2.3}$$

is the smallest vector space of differential operators containing the two operators in (2.3) and closed under the Lie bracket $[D_1, D_2] = D_1 D_2 - D_2 D_1$.

As a particular case, consider the "weak cubic sensor" with $k = 3$. In [17] it is shown that $L_3^\varepsilon$ is isomorphic to the Weyl algebra $W_1 = \mathbb{R}\langle x, \tfrac{d}{dx}\rangle$ of all differential operators (of any order) with polynomial coefficients (the same as the Lie algebra for the cubic sensor, with $h(x) = x^3$ [9]). A basis for $W_1$ consists of the operators $e_{ij} = x^i \tfrac{d^j}{dx^j}$, $i,j = 0,1,2,\ldots$. $W_1$ is an infinite dimensional Lie algebra under the commutation $[D_1, D_2] = D_1 D_2 - D_2 D_1$; its properties are examined in detail in [9], where the following is proved.

Proposition 1: Let $M$ be a finite dimensional manifold, and let $V(M)$ be the Lie algebra of smooth vector fields on $M$. There is no non-zero homomorphism of Lie algebras from $W_1$ into $V(M)$.
Combined with the results of [21], [22], this result implies that for $k = 3$ and $\varepsilon \neq 0$ fixed, no non-zero statistic of the conditional distribution for (2.1) can be computed (exactly) with a recursive finite dimensional filter. Of course, for $\varepsilon = 0$, the Lie algebra $L_3^0 \cong L_0$ has basis $\{\mathcal{A}^* - \tfrac12 x^2,\ x,\ d/dx,\ 1\}$, the Lie algebra for the Kalman problem; and in this case there is a two-dimensional sufficient statistic that can be evaluated recursively. Thus, as $\varepsilon$ passes from zero to $\varepsilon \neq 0$, the filtering problem moves from the simplest to the most difficult class. Hazewinkel [17] has shown similar behavior for $k = 2$; it is suspected that the behavior of the other problems (1.3) with $k \geq 4$ is similar.
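The closure of this four-dimensional Kalman algebra is easy to check by hand or symbolically: with $\mathcal{A}^* f = \tfrac12 f'' - a(xf)'$ one finds $[\mathcal{A}^* - \tfrac12 x^2,\ x] = \tfrac{d}{dx} - a\,x$, a combination of the remaining basis elements. A small sympy sketch (our own check, not from the paper) verifies this bracket on a generic test function:

```python
import sympy as sp

x, a = sp.symbols('x a')
f = sp.Function('f')(x)

def Astar(g):
    # A* g = 1/2 g'' - a (x g)'  (adjoint generator of the Kalman problem)
    return sp.Rational(1, 2) * sp.diff(g, x, 2) - a * sp.diff(x * g, x)

def B(g):
    # B = A* - x^2/2
    return Astar(g) - x**2 / 2 * g

# bracket [B, x] applied to a generic test function f
bracket = sp.expand(B(x * f) - x * B(f))
expected = sp.expand(sp.diff(f, x) - a * x * f)
print(sp.simplify(bracket - expected) == 0)  # True: [B, x] = d/dx - a x
```

Further brackets of $d/dx - ax$ with $x$ yield only the constant operator $1$, so the algebra stays four-dimensional; for $h(x) = x + \varepsilon x^3$ the analogous brackets generate operators of ever higher order, which is the content of the Weyl-algebra isomorphism above.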

To treat the case of $\varepsilon \neq 0$ small, it is natural to consider expansions of the conditional density and statistics in powers of $\varepsilon$. This was done for several problems in [15], [16]. Approximations from a Lie algebraic point of view were considered in [17]; this can be summarized as follows. Let $W_1(\varepsilon) = \mathbb{R}\langle x, \varepsilon, d/dx\rangle$ be the Lie algebra of differential operators with coefficients that are polynomials in $x$ and $\varepsilon$. Thus, $W_1(\varepsilon)$ has a basis $\{e_{ij\ell} = \varepsilon^i x^j d^\ell/dx^\ell;\ i,j,\ell = 0,1,\ldots\}$. (Here we regard $\varepsilon$ as a "variable".) The associated estimation algebra $L_k^\varepsilon$ may be regarded as a sub-algebra of $W_1(\varepsilon)$. Using the notation from [17], we define $L_k^\varepsilon \bmod \varepsilon^n$ as the Lie algebra obtained from $L_k^\varepsilon$ by setting $\varepsilon^i = 0$ for $i \geq n$. (A more precise definition is given in [17].) In [17] it is shown that $L_k^\varepsilon \bmod \varepsilon^n$ is finite dimensional for each $k$, $n$.

This development is related to asymptotic expansions in the following way. Assume that $u^\varepsilon$ in (2.1) has a formal expansion

$$u^\varepsilon(t,x) = u_0(t,x) + \varepsilon u_1(t,x) + \varepsilon^2 u_2(t,x) + \cdots. \tag{2.4}$$

Substituting this in (2.1) and equating coefficients of powers of $\varepsilon$ gives

$$du_0(t,x) = \mathcal{A}^* u_0(t,x)\,dt + x\,u_0(t,x)\,dy_t^\varepsilon, \qquad u_0(0,x) = p_0(x), \tag{2.5}$$

$$du_\ell(t,x) = \mathcal{A}^* u_\ell(t,x)\,dt + x\,u_\ell(t,x)\,dy_t^\varepsilon + x^k u_{\ell-1}(t,x)\,dy_t^\varepsilon, \qquad \ell = 1,2,\ldots. \tag{2.6}$$

Writing (2.5)-(2.6) in Fisk-Stratonovich form and truncating after $\ell = n$ gives

$$d\begin{bmatrix} u_0 \\ u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} = \begin{bmatrix} \mathcal{A}^* - \tfrac12 x^2 & & & & \\ -x^{k+1} & \mathcal{A}^* - \tfrac12 x^2 & & & \\ -\tfrac12 x^{2k} & -x^{k+1} & \mathcal{A}^* - \tfrac12 x^2 & & \\ & \ddots & \ddots & \ddots & \\ & & -\tfrac12 x^{2k} & -x^{k+1} & \mathcal{A}^* - \tfrac12 x^2 \end{bmatrix} \begin{bmatrix} u_0 \\ u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} dt + \begin{bmatrix} x & & & \\ x^k & x & & \\ & \ddots & \ddots & \\ & & x^k & x \end{bmatrix} \begin{bmatrix} u_0 \\ u_1 \\ \vdots \\ u_n \end{bmatrix} \circ dy_t^\varepsilon \tag{2.7}$$

Define, for $i = 1, \ldots, n+1$, the $(n+1) \times (n+1)$ matrix $E_i$ which has zero entries everywhere except for 1's in the sub-diagonal positions $(i,1), (i+1,2), \ldots, (n+1, n+2-i)$ (i.e., on a particular sub-diagonal); note that $E_1$ is the identity matrix. Also, denoting $U(t,x) = [u_0(t,x), \ldots, u_n(t,x)]'$, we can rewrite (2.7) as

$$dU(t,x) = [(\mathcal{A}^* - \tfrac12 x^2)E_1 - x^{k+1}E_2 - \tfrac12 x^{2k}E_3]\,U(t,x)\,dt + [xE_1 + x^k E_2]\,U(t,x) \circ dy_t^\varepsilon \tag{2.8}$$
(2.8)
The Lie algebra of (2.7) or (2.8) is isomorphic to $L_k^\varepsilon \bmod \varepsilon^{n+1}$. This can be seen either by a direct calculation (see the next section) or by the observation that there is an obvious isomorphism (with $E_2$ corresponding to $\varepsilon$) between the Lie algebra of (2.2) modulo $\varepsilon^{n+1}$ and that of (2.8), since $(E_2)^{n+1} = 0$. Thus the Lie algebraic construction of $L_k^\varepsilon \bmod \varepsilon^{n+1}$ corresponds to the equations for the approximation (2.4) to $n+1$ terms.
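The shift-matrix bookkeeping behind (2.8) is easy to verify numerically. The following sketch (our own illustration, using numpy) checks that $E_2 E_2 = E_3$ and $(E_2)^{n+1} = 0$, the nilpotency that makes the $\bmod\ \varepsilon^{n+1}$ identification work:

```python
import numpy as np

def E(i, n):
    """(n+1)x(n+1) matrix with 1's on the (i-1)-th sub-diagonal; E(1, n) is the identity."""
    return np.eye(n + 1, k=-(i - 1))

n = 4
E1, E2, E3 = E(1, n), E(2, n), E(3, n)

print(np.array_equal(E1, np.eye(n + 1)),                      # E_1 = I
      np.array_equal(E2 @ E2, E3),                            # products shift further down
      np.all(np.linalg.matrix_power(E2, n + 1) == 0))         # (E_2)^{n+1} = 0
```

So multiplication by $E_2$ behaves exactly like multiplication by $\varepsilon$ with all powers $\varepsilon^{n+1}$ and higher discarded.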

Note that the fundamental solutions of (2.5) and (2.6) are the same for every $\ell = 1, 2, \ldots$. The fundamental solution is just the unnormalized conditional transition density of the Kalman problem with $y_t$, $0 \leq t \leq T$, appearing as a parameter. The density $u_0(t,x)$ is, of course, Gaussian. Consider (2.6) in the case $\ell = 1$; clearly, $u_1(t,x)$ can be found by convolving a Gaussian density with a polynomial ($x^k$) times a Gaussian. This observation has been made elsewhere for related problems [15], [16]. This integral can be computed in closed form. Moreover, the several "moments" which appear in the expression can be evaluated recursively by finite dimensional equations. Such finite dimensional filters for the solution of (2.5)-(2.8) are derived by Lie algebraic methods in the next section. In the construction of formal asymptotic expansions, Lie algebraic methods serve to organize, simplify, and explicate the structure of the expansions, as well as to lead to finite dimensional filters. However, it is also important to prove that (2.4)-(2.6) is a true asymptotic expansion in an appropriate norm; we indicate how to do this in Section 4.

3. FINITE DIMENSIONAL FILTERS FOR THE TERMS
IN THE ASYMPTOTIC EXPANSION

Denote by $L(k, n+1, \varepsilon) = \{A_0,\ xE_1 + x^k E_2\}_{LA}$ the Lie algebra of (2.8), where $A_0 = (\mathcal{A}^* - \tfrac12 x^2)E_1 - x^{k+1}E_2 - \tfrac12 x^{2k}E_3$. The following theorem describes the structure of $L(k, n+1, \varepsilon)$; then we use Lie algebraic methods to solve (2.8) in terms of a finite number of recursively computable statistics.

Theorem 1: The Lie algebra $L(k, n+1, \varepsilon)$, for $\varepsilon \neq 0$, consists of $A_0$ and elements which are linear combinations of the matrix differential operators

$$x^r \frac{d^s}{dx^s}\,E_i; \qquad i = 1, 2, \ldots, n+1; \quad 0 \leq r + s \leq 1 + (i-1)(k-1).$$

The dimension of $L(k, n+1, \varepsilon)$ is less than or equal to $1 + \tfrac12 \sum_{i=0}^{n} [i(k-1)+2]\,[i(k-1)+3]$. In addition, $L(k, n+1, \varepsilon)$ is a solvable Lie algebra [5], [29] (a Lie algebra $L$ is solvable if the derived series of Lie algebras $L^{(0)} = L$, $L^{(m+1)} = [L^{(m)}, L^{(m)}] = \mathrm{span}\{[A,B] \mid A, B \in L^{(m)}\}$, $m \geq 0$, is the trivial Lie algebra $\{0\}$ for some $m$).

Since $L(k, n+1, \varepsilon)$ is solvable and finite dimensional, (2.8) can be solved in terms of a finite number of recursively computable statistics by the method of Wei and Norman [4], [18]-[20]. The calculation, which can be rigorously justified in this case as in [4], [20], proceeds as follows. Assume that the solution of (2.8) can be written in the form

$$U(t,x) = \big[e^{g_0(t)A_0}\,e^{g_1(t)A_1} \cdots e^{g_d(t)A_d}\,U_0\big](x) \tag{3.1}$$

where $\{A_i;\ i = 0, \ldots, d\}$ is a basis for $L(k, n+1, \varepsilon)$, $U_0(x) = (p_0(x), 0, \ldots, 0)'$, and $\{g_i;\ i = 0, \ldots, d\}$ are real-valued functions of $t$ and $y^\varepsilon$ to be determined. Substituting (3.1) into (2.8), using the identity

$$e^{tA_i} A_j = \Big(\sum_{k=0}^{\infty} \frac{t^k}{k!}\,\mathrm{ad}^k_{A_i} A_j\Big)\,e^{tA_i}, \qquad 0 \leq i, j \leq d \tag{3.2}$$

(where $\mathrm{ad}^0_A B = B$, $\mathrm{ad}^{k+1}_A B = [A, \mathrm{ad}^k_A B]$), and equating the coefficients of each basis element $A_i$, one obtains a set of stochastic differential equations driven by $y^\varepsilon$ for $\{g_i;\ i = 0, \ldots, d\}$. Since $L(k, n+1, \varepsilon)$ is solvable, there is an ordering of a basis for which the representation (3.1) is globally valid. This method was used to solve the DMZ equation for the Kalman filtering problem by Ocone [4], [20]; this is precisely equation (2.5) for $u_0$ (which corresponds to $L(k, 1, \varepsilon)$). Indeed, the representation (3.1) for the first component of $U(t,x)$ is

$$u_0(t,x) = \big[e^{g_0(t)(\mathcal{A}^* - \frac12 x^2)}\,e^{g_1(t)x}\,e^{g_2(t)\frac{d}{dx}}\,e^{g_d(t)}\,p_0\big](x), \tag{3.3}$$

which is the form obtained by Ocone. The other components of $U(t,x)$ are given by

$$u_i(t,x) = b_i(t,x)\,u_0(t,x) \tag{3.4}$$

where $b_i$ is a polynomial differential operator containing terms of the form $\beta_{rs}(t)\,x^r \frac{d^s}{dx^s}$, $r + s \leq i(k+1)$. The form of the equations is most easily seen by considering an example.
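Behind the identity (3.2) is the matrix fact $e^{tA} B e^{-tA} = \sum_k \frac{t^k}{k!}\,\mathrm{ad}^k_A B$. A quick numerical sanity check (our own, with arbitrary matrices standing in for the operators) is:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))
t = 0.3

# Left-hand side of (3.2): e^{tA} B
lhs = expm(t * A) @ B

# Right-hand side: (sum_k t^k/k! ad_A^k B) e^{tA}, truncated at K terms
K = 30
term, acc, fact = B.copy(), B.copy(), 1.0
for k in range(1, K):
    term = A @ term - term @ A        # one more application of ad_A
    fact *= k
    acc = acc + (t ** k / fact) * term
rhs = acc @ expm(t * A)

print(np.allclose(lhs, rhs))  # True
```

In the Wei-Norman calculation the sum is finite because the algebra is solvable and the iterated brackets eventually vanish, which is what makes the resulting equations for the $g_i$ closed and finite dimensional.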

Example 1 (the weak quadratic sensor): Consider the problem (1.3) with $k = 2$ and $a = 0$, and the first order approximation ($n = 1$ in (2.7)-(2.8)); thus (2.7)-(2.8) becomes

$$d\begin{bmatrix} u_0 \\ u_1 \end{bmatrix} = \begin{bmatrix} \mathcal{A}^* - \tfrac12 x^2 & 0 \\ -x^3 & \mathcal{A}^* - \tfrac12 x^2 \end{bmatrix} \begin{bmatrix} u_0 \\ u_1 \end{bmatrix} dt + \begin{bmatrix} x & 0 \\ x^2 & x \end{bmatrix} \begin{bmatrix} u_0 \\ u_1 \end{bmatrix} \circ dy_t^\varepsilon \tag{3.5}$$

The Lie algebra $L(2, 2, \varepsilon)$ is ten-dimensional, with basis elements

$$A_0 = (\mathcal{A}^* - \tfrac12 x^2)E_1 - x^3 E_2, \quad A_1 = xE_1, \quad A_2 = \frac{d}{dx}E_1,$$
$$A_3 = x^2 E_2, \quad A_4 = xE_2, \quad A_5 = x\frac{d}{dx}E_2, \quad A_6 = \frac{d}{dx}E_2,$$
$$A_7 = \frac{d^2}{dx^2}E_2, \quad A_8 = E_2, \quad A_9 = E_1.$$

Substituting the representation (3.1) into (3.5), and using (3.2) repeatedly, the following differential equations for $\{g_i;\ i = 0, \ldots, 9\}$ are obtained (the Lie algebraic manipulations, similar to those in [19], were performed with the aid of the MACSYMA symbolic manipulation program at M.I.T.; for details, see [30]):

$$g_0(t) = t$$
$$dg_1(t) = \cosh t \circ dy_t^\varepsilon$$
$$dg_2(t) = -\sinh t \circ dy_t^\varepsilon \tag{3.6}$$
$$dg_3(t) = c_1(t) \circ dy_t^\varepsilon$$
$$dg_4(t) = [2c_2(t)g_1(t) - 2c_1(t)g_2(t)] \circ dy_t^\varepsilon$$
$$dg_5(t) = 2c_2(t) \circ dy_t^\varepsilon$$
$$dg_6(t) = [2c_3(t)g_1(t) - 2c_2(t)g_2(t)] \circ dy_t^\varepsilon$$
$$dg_7(t) = c_3(t) \circ dy_t^\varepsilon$$
$$dg_8(t) = \big[c_1(t)g_2(t)^2 - 2c_2(t)g_1(t)g_2(t) + c_3(t)g_1(t)^2 + c_2(t)\big] \circ dy_t^\varepsilon$$
$$dg_9(t) = -(\sinh t)\,g_1(t) \circ dy_t^\varepsilon$$
$$g_0(0) = \cdots = g_9(0) = 0$$

where the deterministic coefficients $\{c_i(t);\ i = 1, 2, 3\}$ are given by

$$c_1(t) = \cosh 2t + \cosh t - 1$$
$$c_2(t) = \sinh t - \sinh 2t \tag{3.7}$$
$$c_3(t) = \cosh 2t - 2\cosh t + 1$$

The actual form of $u_0$ and $u_1$, as given by (3.1), is obtained by calculating $\exp(g_i(t)A_i)$, $i = 0, \ldots, 9$, and using (3.2); the solution (assuming the initial density is $p_0(x) = \delta(x - x_0)$) is

$$u_0(t,x) = \big[e^{t(\mathcal{A}^* - \frac12 x^2)}\,e^{g_1(t)x}\,e^{g_2(t)\frac{d}{dx}}\,e^{g_9(t)}\,p_0\big](x)$$
$$= \frac{1}{\sqrt{2\pi \sinh t}}\,\exp\Big\{-\tfrac12(\tanh t)(x_0 - g_2(t))^2 + g_1(t)(x_0 - g_2(t)) + g_9(t) - \tfrac12(\tanh t)^{-1}\big[x - (\cosh t)^{-1}(x_0 - g_2(t))\big]^2\Big\} \tag{3.8}$$

and $u_1(t,x)$ is $q(t,x)\,u_0(t,x)$, where $q$ is a polynomial in $x$ with coefficients expressed in terms of $g_0, \ldots, g_9$ (see [30] for the exact expression).

Thus the zero-th order approximation of $u^\varepsilon$ is just the unnormalized conditional density of the Kalman filter, and the first order approximation is $u_0(t,x) + \varepsilon u_1(t,x)$, where $u_0$ and $u_1$ are computed from the nine $y^\varepsilon$-dependent statistics $g_1, \ldots, g_9$. A straightforward but involved off-line calculation is required to obtain these equations (this is greatly facilitated by MACSYMA), but the only on-line calculations are the simple equations (3.6) for the $g_i$'s and the calculation of $u_0$ and $u_1$ as memoryless functions of the $g_i$'s; this constitutes a finite dimensional recursive filter for $u_0 + \varepsilon u_1$. It is also clear that finite dimensional filters for the zero-th and first order approximations of the normalization factor $\int u^\varepsilon(t,z)\,dz$, the normalized conditional density $p(t,x)$, and the conditional mean $\hat x(t|t)$ can be obtained as functionals of $g_1, \ldots, g_9$. The on-line computation is the same; only the off-line calculation of the form of the functionals is different. These calculations and some numerical experiments to test the performance of these filters will be reported elsewhere.

4. ASYMPTOTIC EXPANSION RESULTS

To justify the use of algebraic methods on equation (2.2) and its formal asymptotic expansion, it must be proved that (2.2) has a unique solution within an appropriate class of functions, and that the formal expansion (2.4)-(2.6) is in fact an asymptotic expansion in an appropriate norm. Existence and uniqueness results are proved in [13], [14] by first applying certain exponential transformations to (2.2), and then using the "classical" theory of PDE's [24], [25]. Here we concentrate on outlining the proof that (2.4)-(2.6) is an asymptotic expansion.

Consider the system (2.2), (2.4)-(2.6) and let $\theta^\varepsilon(t,x) = u^\varepsilon(t,x) - u_0(t,x)$. Then $\theta^\varepsilon$ satisfies the Stratonovich stochastic PDE

$$d\theta^\varepsilon(t,x) = [\mathcal{A}^* - \tfrac12 (h^\varepsilon(x))^2]\,\theta^\varepsilon(t,x)\,dt + h^\varepsilon(x)\,\theta^\varepsilon(t,x) \circ dy_t^\varepsilon + \varepsilon x^k u_0(t,x) \circ dy_t^\varepsilon - \varepsilon(x + \tfrac12 \varepsilon x^k)\,x^k u_0(t,x)\,dt \tag{4.1}$$
$$\theta^\varepsilon(0,x) = p_0^\varepsilon(x) - p_0^0(x), \qquad 0 \leq t \leq T,$$

where $h^\varepsilon(x) = x + \varepsilon x^k$. Here we have generalized (2.2) by letting $u^\varepsilon(0,x)$ depend on $\varepsilon$. Also,

$$du_0(t,x) = [\mathcal{A}^* - \tfrac12 x^2]\,u_0(t,x)\,dt + x\,u_0(t,x) \circ dy_t^\varepsilon, \qquad u_0(0,x) = p_0^0(x), \quad 0 \leq t \leq T. \tag{4.2}$$

If we form the vector $W^\varepsilon(t,x) = [\theta^\varepsilon(t,x),\ u_0(t,x)]^T$, then

$$dW^\varepsilon(t,x) = \begin{bmatrix} \mathcal{A}^* - \tfrac12 (h^\varepsilon)^2 & -\varepsilon(x + \tfrac12 \varepsilon x^k)x^k \\ 0 & \mathcal{A}^* - \tfrac12 x^2 \end{bmatrix} W^\varepsilon(t,x)\,dt + H^\varepsilon(x)\,W^\varepsilon(t,x) \circ dy_t^\varepsilon \tag{4.3}$$

where

$$H^\varepsilon(x) = \begin{bmatrix} h^\varepsilon(x) & \varepsilon x^k \\ 0 & x \end{bmatrix}. \tag{4.4}$$

Introducing

$$V^\varepsilon(t,x) = \exp[-H^\varepsilon(x)\,y_t^\varepsilon]\,W^\varepsilon(t,x), \tag{4.5}$$

the first component $V_1^\varepsilon$ of the 2-vector $V^\varepsilon$ satisfies a forced parabolic PDE (for each Hölder continuous path of $y^\varepsilon$)

$$\frac{\partial V_1^\varepsilon}{\partial t} = \tfrac12\,\partial_{xx} V_1^\varepsilon + m^\varepsilon(t,x)\,\partial_x V_1^\varepsilon + n^\varepsilon(t,x)\,V_1^\varepsilon + \varepsilon f^\varepsilon(t,x)\,u_0(t,x) \tag{4.6}$$
$$V_1^\varepsilon(0,x) = \theta^\varepsilon(0,x) = p_0^\varepsilon(x) - p_0^0(x) = \xi^\varepsilon(x),$$

where

$$m^\varepsilon(t,x) = h_x^\varepsilon\,y_t^\varepsilon - ax,$$
$$n^\varepsilon(t,x) = \tfrac12\big[h_{xx}^\varepsilon\,y_t^\varepsilon + (h_x^\varepsilon\,y_t^\varepsilon)^2 - 2ax\,h_x^\varepsilon\,y_t^\varepsilon - 2a - (h^\varepsilon)^2\big], \tag{4.7}$$
$$\varepsilon f^\varepsilon(t,x)\,u_0(t,x) = e^{-h^\varepsilon y_t^\varepsilon}\,[\mathcal{A}^* - \tfrac12 (h^\varepsilon)^2]\big[(e^{\varepsilon x^k y_t^\varepsilon} - 1)\,u_0\big] + (e^{-h^\varepsilon y_t^\varepsilon} - e^{-x y_t^\varepsilon})\,[\mathcal{A}^* - \tfrac12 x^2]\,u_0 - \varepsilon x^k(x + \tfrac12 \varepsilon x^k)\,e^{-h^\varepsilon y_t^\varepsilon}\,u_0.$$

The analysis of (4.6) proceeds as follows (see [14]). Note first that the term $-\tfrac12 (h^\varepsilon(x))^2$ dominates the potential $n^\varepsilon(t,x)$ as $|x| \to \infty$. Thus, the fundamental solution $\Gamma^\varepsilon(t,x;s,z)$ of (4.6) "falls off" like $\exp[-\beta|x|^{k+1}]$ for some $\beta \in (0,1)$ as $|x| \to \infty$, and similarly as $|z| \to \infty$. The forcing function in (4.6) has $f^\varepsilon(t,x) = p^\varepsilon(t,x)\,e^{-x y_t^\varepsilon}$, where $p^\varepsilon(t,x)$ is a polynomial in $x$ with bounded coefficients (on $\varepsilon > 0$, $t \in [0,T]$ for each fixed path $y_t^\varepsilon$). Thus, if the initial data $\xi^\varepsilon(x)$ is bounded, then (4.6) has a well-defined solution

$$V_1^\varepsilon(t,x) = \int_{-\infty}^{\infty} \Gamma^\varepsilon(t,x;0,z)\,[p_0^\varepsilon(z) - p_0^0(z)]\,dz + \varepsilon \int_0^t \int_{-\infty}^{\infty} \Gamma^\varepsilon(t,x;s,z)\,f^\varepsilon(s,z)\,u_0(s,z)\,dz\,ds \tag{4.8}$$

which approaches zero as $|x| \to \infty$. If $\xi^\varepsilon = p_0^\varepsilon - p_0^0 = O(\varepsilon)$, then $V_1^\varepsilon = O(\varepsilon)$. Precisely:

Theorem 2: Suppose (A1) $\xi^\varepsilon(x) = p_0^\varepsilon(x) - p_0^0(x)$ is bounded, continuous and approaches zero as $|x| \to \infty$. Suppose (A2) further that $\xi^\varepsilon = O(\varepsilon)$; i.e.,

$$\lim_{\varepsilon \to 0}\ \big[\sup_{x \in \mathbb{R}} |\xi^\varepsilon(x)|/\varepsilon\big] = c < \infty, \tag{4.9}$$

and that $p_0^0(x) \geq 0$ is bounded, continuous and approaches zero as $|x| \to \infty$ with $\int p_0^0(x)\,dx < \infty$. Then (for almost every path $y_t^\varepsilon$),

$$\lim_{\varepsilon \to 0}\ \Big[\sup_{0 \leq t \leq T,\ x \in \mathbb{R}} |V_1^\varepsilon(t,x)|/\varepsilon\Big] = c < \infty. \tag{4.10}$$

If, in addition, (A3) $\int_{\mathbb{R}} |\xi^\varepsilon(x)|\,dx < \infty$, then

$$\lim_{\varepsilon \to 0}\ \Big[\sup_{0 \leq t \leq T} \int_{\mathbb{R}} |V_1^\varepsilon(t,x)|\,dx \Big/ \varepsilon\Big] = c < \infty. \tag{4.11}$$

Proof: The assumptions (A1), (A2) guarantee that (4.6) has a unique solution $V_1^\varepsilon(t,x)$ (using the results of Besala [25] and Krzyżański [26]), and that $|V_1^\varepsilon(t,x)| \to 0$ as $|x| \to \infty$. (The assumption on $p_0^0(x)$ guarantees that $u_0(s,z)$ behaves like $\exp[-\gamma z^2]$ as $|z| \to \infty$.) The estimates (4.10), (4.11) follow from Cauchy's inequality applied to (4.8). QED.
From (4.5) we see that

$$\theta^\varepsilon(t,x) = \big(e^{h^\varepsilon(x)y_t^\varepsilon} - e^{x y_t^\varepsilon}\big)\,V_2^\varepsilon(t,x) + e^{h^\varepsilon(x)y_t^\varepsilon}\,V_1^\varepsilon(t,x)$$
$$= \Big[\sum_{i \geq 1} \frac{1}{i!}\,(\varepsilon x^k y_t^\varepsilon)^i\Big]\,u_0(t,x) + e^{h^\varepsilon(x)y_t^\varepsilon}\,V_1^\varepsilon(t,x). \tag{4.12}$$

Thus $\theta^\varepsilon$ is $O(\varepsilon)$ in the $L^1$-norm of (4.11), since the first term in (4.12) consists of powers of $\varepsilon$ multiplied by moments of a Gaussian distribution and $V_1^\varepsilon$ falls off like $\exp[-\beta|x|^{k+1}]$. Using a similar proof, it can also be shown that $u^\varepsilon(t,x) - \sum_{i=0}^{n} \varepsilon^i u_i(t,x) = O(\varepsilon^{n+1})$ in the same norm. In addition, it is straightforward to justify the asymptotic expansions of $p(t,x)$ and $\hat x(t|t)$, which are computed as indicated at the end of Section 3.

CONCLUSIONS

The asymptotic analysis of the simple filtering problem involving observations of a Gauss-Markov process through a "weak" polynomial nonlinearity carried out here using a combination of Lie algebraic and analytic methods can be readily extended to a large variety of nonlinear filtering problems containing small parameters. The conditions for use of the procedure are (i) that the limit problem (corresponding to $\varepsilon = 0$) have a simple form (e.g., a Kalman filtering problem), (ii) that the Lie algebraic structures associated with the perturbed and limit problems be of a known classified type, and (iii) that existence and uniqueness conditions and analytical bounds on the solutions of the perturbed and limit PDE's be readily available. Since each of these three points is now reasonably well understood in the theory of nonlinear filtering, the consolidation achieved here may be readily extended to many of the examples discussed in [15], [16], for example.

It remains to perform some numerical experiments to test the usefulness of the asymptotic expansions in specific contexts. Work on this is now under way and will be reported elsewhere. In this connection it is worth noting that the truncated asymptotic expansions (2.4) do not always produce densities. Starting from a different point, based on the fact that $u^\varepsilon(t,x)$ is a density, one can construct asymptotic expansions whose truncated versions are always positive. However, the resulting expansions may have very poor convergence properties. This work will be reported in [31].

ACKNOWLEDGEMENTS

The third author would like to thank W.E. Hopkins and J.S. Baras for discussions on the results in Section 4. The calculations in Section 3 were performed with the aid of the MACSYMA program of the Mathlab Group at M.I.T., which is supported, in part, by the Department of Energy and the National Aeronautics and Space Administration.

REFERENCES

1. Hazewinkel, M., Willems, J.C., eds., Stochastic Systems: The Mathematics of Filtering and Identification and Applications, NATO Advanced Study Institute Series, Reidel, Dordrecht, 1981.
2. Davis, M.H.A., Marcus, S.I., "An introduction to nonlinear filtering," in [1].
3. Benes, V.E., "Exact finite-dimensional filters for certain diffusions with nonlinear drift," Stochastics, 5 (1981), pp. 65-92.
4. Ocone, D., Ph.D. Thesis, Dept. of Mathematics, M.I.T., Cambridge, Mass., 1980.
5. Marcus, S.I. and Willsky, A.S., "Algebraic structure and finite dimensional nonlinear estimation," SIAM J. Math. Anal., 9 (1978), pp. 312-327.
6. Ocone, D.L., Baras, J.S., and Marcus, S.I., "Explicit filters for diffusions with certain nonlinear drifts," to appear in Stochastics.
7. Brockett, R., "Remarks on finite dimensional nonlinear estimation," in Analyse des Systèmes (Bordeaux, 1978), C. Lobry, ed., Astérisque, 75-76, Soc. Math. de France, 1980.
8. Mitter, S.K., "The analogy between mathematical problems of nonlinear filtering and quantum physics," Ricerche di Automatica, 10 (1979), pp. 163-216.
9. Hazewinkel, M., Marcus, S.I., "On Lie algebras and finite dimensional filtering," to appear in Stochastics.
10. Liu, C.-H. and Marcus, S.I., "The Lie algebraic structure of a class of finite dimensional nonlinear filters," in Algebraic and Geometric Methods in Linear Systems Theory, Lectures in Applied Math., vol. 18, C.I. Byrnes and C.F. Martin (eds.), Amer. Math. Soc., Providence, 1980, pp. 277-297.
11. Pardoux, E., "Stochastic partial differential equations and filtering of diffusion processes," Stochastics, 2 (1979), pp. 127-168.
12. Davis, M.H.A., "Pathwise nonlinear filtering," in [1].
13. Baras, J.S., Blankenship, G.L., Mitter, S.K., "Nonlinear filtering of diffusion processes," Proc. IFAC Triennial Congress, Kyoto, 1981.
14. Baras, J.S., Blankenship, G.L., Hopkins, W.E., "Existence, uniqueness, and asymptotic behavior of solutions to a class of Zakai equations with unbounded coefficients," to appear in IEEE Trans. Automatic Control.
15. Blankenship, G.L., Haddad, A.H., "Asymptotic analysis of a class of nonlinear filtering problems," Proc. IRIA-IFAC Workshop on Singular Perturbations in Control, Paris, 1978, pp. 1-15.
16. Blankenship, G.L., "Some approximation methods in nonlinear filtering," Proc. IEEE Conf. Decision and Control, Albuquerque, 1980, pp. 51-56.
17. Hazewinkel, M., "On deformations, approximations, and nonlinear filtering," to appear in Systems and Control Letters.
18. Wei, J. and Norman, E., "On global representations of the solutions of linear differential equations as a product of exponentials," Proc. Amer. Math. Soc., 16 (1964), pp. 327-334.
19. Steinberg, S., "Applications of the Lie algebraic formulas of Baker, Campbell, Hausdorff, and Zassenhaus to the calculation of explicit solutions of partial differential equations," J. Diff. Eq., 26 (1977), pp. 404-434.
20. Ocone, D., "Nonlinear filtering problems with finite dimensional Lie algebras," Proc. 1980 Joint Automatic Control Conf., San Francisco, 1980.
21. Hazewinkel, M., Marcus, S.I., Sussmann, H.J., "Nonexistence of exact finite dimensional filters for the cubic sensor problem," in preparation.
22. Sussmann, H.J., "Rigorous results on the cubic sensor problem," in [1].
23. Baras, J.S., Blankenship, G.L., "Nonlinear filtering of diffusion processes: a generic example," to appear.
24. Aronson, D.G., Besala, P., "Parabolic equations with unbounded coefficients," J. Diff. Eqns., 3 (1967), pp. 1-14.
25. Besala, P., "On the existence of a fundamental solution for a parabolic equation with unbounded coefficients," Ann. Polonici Math., 29 (1975), pp. 403-409.
26. Krzyżański, M., Partial Differential Equations of Second Order, Polish Scientific Publishers, Warsaw, 1971 (Theorem 1, p. 201).
27. Marcus, S.I., Liu, C.-H., and Blankenship, G.L., "Lie algebras and asymptotic expansions for some nonlinear filtering problems," Proc. 1981 Joint Automatic Control Conf., Charlottesville, June 1981.
28. Sussmann, H.J., "Approximate finite dimensional filters for some nonlinear problems," preprint, June 1981.
29. Sagle, A.A. and Walde, R.E., Introduction to Lie Groups and Lie Algebras, New York: Academic Press, 1973.
30. Liu, C.-H., Ph.D. Thesis, Dept. of Electrical Engineering, University of Texas at Austin, Austin, Texas, 1981.
31. Hopkins, W.E., Ph.D. Thesis, Dept. of Electrical Engineering, University of Maryland, College Park, Maryland, to appear.
32. Fleming, W.H., Mitter, S.K., "Optimal control and nonlinear filtering for nondegenerate diffusion processes," to appear.
OPTIMAL STOPPING FOR TWO-PARAMETER PROCESSES

G. MAZZIOTTO and J. SZPIRGLAS

Centre National d'Etudes des Télécommunications

92 131 - ISSY LES MOULINEAUX - FRANCE

Summary: A formalism for the optimal stopping of two-parameter processes is developed by analogy with the classical theory. An optimality criterion is established in terms of the conditional pay-off process. The problem of optimal stopping of a process indexed by $\mathbb{N}^2$ is completely solved, in probabilistic terms, by using the notion of tactics. The method consists of searching for an optimal stopping point among the maximal stopping points up to which the Snell envelope is a martingale. On $\mathbb{R}_+^2$ the difficulties arise from the lack of information about the behaviour of two-parameter supermartingales, and particularly the Snell envelope. The optimal stopping of a Brownian sheet is solved and we present the case of the bi-Brownian process. Associated systems of variational inequalities are proposed.

I-1 Preliminaries

The processes considered here are indexed by the set 𝕀, where 𝕀 = ℕ² or ℝ₊². They are extended to the one-point compactification 𝕀 ∪ {∞} by taking the null value at infinity. The partial ordering of 𝕀 is defined by:

s = (s₁,s₂) ≤ t = (t₁,t₂) iff s₁ ≤ t₁ and s₂ ≤ t₂.

For two points s and t, s ∧ t (resp. s ∨ t) denotes the point

(inf(s₁,t₁), inf(s₂,t₂)) (resp. (sup(s₁,t₁), sup(s₂,t₂))).

Let (Ω, 𝒜, P) be a complete probability space. A filtration F = (F_t; t ∈ 𝕀) is a family of sub-σ-algebras of 𝒜 such that F is increasing for the partial ordering, F₀ contains all P-negligible sets of (Ω, 𝒜) and, in case 𝕀 = ℝ₊², F is right-continuous.

A random variable (r.v.) T, taking its values in 𝕀 ∪ {∞}, is a stopping point (s.p.) iff:

∀ t ∈ 𝕀, {T ≤ t} ∈ F_t.

The set of s.p. is denoted by 𝕋. To a s.p. T, we associate the σ-field F_T of events A such that:

A ∩ {T ≤ t} ∈ F_t, ∀ t ∈ 𝕀.

Given two s.p. S and T, one can prove (12) - as in the classical theory - that:

{S = T} ∈ F_S ∩ F_T, {S ≤ T} ∈ F_T, but generally {S ≤ T} ∉ F_S.

S ∨ T is a s.p., but S ∧ T is generally not.

Two given processes X and Y are said to be pseudo-indistinguishable iff

∀ T ∈ 𝕋, X_T = Y_T a.s.

I-2 The optimal stopping problem

The pay-off Y is a non-negative optional process of class (D) - i.e. the set (Y_T; T ∈ 𝕋) is uniformly integrable. The problem consists of finding a s.p. T*, here called optimal, such that:

E(Y_{T*}) = sup_{T ∈ 𝕋} E(Y_T).

As in the classical theory - of (2) and (5) - the main tool is the conditional pay-off system of r.v., (J(T); T ∈ 𝕋), defined by

J(T) = ess sup { E(Y_S | F_T) : S ∈ 𝕋, S ≥ T }, ∀ T ∈ 𝕋.

By generalizing the Mertens theorem (2) to the bidimensional case, one can prove (3) that there exists one process J - up to pseudo-indistinguishability - which aggregates this system. J is the Snell envelope of Y, i.e. the smallest strong supermartingale greater than Y:

∀ T ∈ 𝕋: J_T ≥ Y_T a.s., and J_T is F_T-measurable;
∀ S, T ∈ 𝕋: S ≤ T implies E(J_T | F_S) ≤ J_S a.s.

The optimality criterion - Bellman principle - has been proved (9) to be:

Proposition: A stopping point T* is optimal if and only if:

(i) J_{T*} = Y_{T*} a.s.
(ii) E(J_{T*}) = E(J₀).

In the classical theory, the above criterion is used - (2), (5) - in the following manner. Under some appropriate regularity conditions the smallest s.p. such that (i) holds - the debut of {Y = J} - also verifies (ii) and is then optimal. We suggest in (9) that it seems more tractable, in the two-parameter context, to search for a s.p. which satisfies (i) among the maximal s.p. such that (ii) holds.

A s.p. T will be called maximal if it is a maximal element, with respect to the partial ordering on 𝕋, of the set:

{ T ∈ 𝕋 : E(J_T) = E(J₀) }.

As a matter of fact, the debut of the random set {Y = J} is a weak stopping line (10) which has no a priori reason to contain any s.p.. Moreover those s.p. could be either optimal or not, as in the following example:

[Figure: a two-parameter configuration with Debut{Y = J} = {S, T}; at S one has Y = J = 1, at T one has Y = J = 1/2, and J = 0 beyond. S is optimal, but T is not.]

The existence of maximal s.p. can be deduced from Zorn's lemma under quite general assumptions on the Snell envelope. In the following we assume that J is left continuous in expectation on s.p. (that is to say: lim_n E(J_{T_n}) = E(J_{lim_n T_n}) for every increasing sequence (T_n; n ∈ ℕ) of s.p.).

Then the main problem is to prove that a maximal s.p. is indeed an optimal s.p.. In the case of processes indexed by one-dimensional sets, ℕ or ℝ₊, this could be deduced from the Mertens decomposition (6) of the supermartingale J. But for two-parameter supermartingales we lack such a useful decomposition. For these reasons, we consider particular situations in which we can say more about the problem or the Snell envelope.

II- Optimal stopping on ℕ²

The optimal stopping problem, for a process indexed by ℕ², presented here generalizes in some ways that of (7), which is mainly concerned with sums of independent identically distributed r.v.. Moreover we solve by purely probabilistic methods the problem solved in (8) via dynamic programming. The main tool is due to (7) and (8). For that purpose we assume that the filtration satisfies the condition of conditional qualitative independence of (7) or, more particularly, verifies the F4 property of (13).

To t = (t₁,t₂) in ℕ² we associate the points t', t'' and t⁺ respectively defined by (t₁+1,t₂), (t₁,t₂+1) and (t₁+1,t₂+1).

A tactic is a random path in ℕ², increasing from 0 to ∞ - Z = (Z₀ = 0; Z_{p+1} = Z'_p or Z''_p, ∀ p; Z_∞ = ∞) - such that, for each p, Z_p is a s.p. and Z_{p+1} is measurable with respect to the σ-field F_{Z_p}.

The main advantage of tactics comes from the fact - proved in (7) and (12) - that, given two s.p. S and T such that S ≤ T, there exists one tactic which goes through S and T, a.s..
To solve the optimal stopping problem, we define a tactic Z*, and afterwards a s.p. T* on this tactic, by the following recurrence. Z*₀ = 0 and, ∀ p ≥ 0,

Z*_{p+1} = t'' on {Z*_p = t} ∩ {J_t = E(J_{t''} | F_t)},
Z*_{p+1} = t' on {Z*_p = t} ∩ {J_t > E(J_{t''} | F_t)},

T* = Z*_p on {Z*_p = t} ∩ {J_t > inf( E(J_{t''} | F_t), E(J_{t'} | F_t) )}, ∀ t ∈ ℕ².

If we assume that Y is a.s. continuous up to infinity - i.e. lim_t Y_t exists and equals Y_∞ - then T* is a maximal s.p.. Moreover, we prove in (9) the following result.

Proposition: Every maximal stopping point is optimal.
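On a finite grid this construction can be illustrated numerically. The sketch below is a deliberately simplified, deterministic illustration (hypothetical pay-off values, no randomness): conditional expectations reduce to the values themselves, the Snell envelope is obtained by the backward induction J_t = max(Y_t, J_{t'}, J_{t''}), and the tactic advances in a direction along which J is conserved, stopping when J decreases in both directions.

```python
# Deterministic toy illustration of tactics on a finite grid of N^2.
# With a deterministic pay-off Y, the Snell envelope is J_t = sup_{s >= t} Y_s,
# computed by backward induction over the grid.

def snell_envelope(Y):
    n, m = len(Y), len(Y[0])
    J = [[0.0] * m for _ in range(n)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            right = J[i + 1][j] if i + 1 < n else float("-inf")
            up = J[i][j + 1] if j + 1 < m else float("-inf")
            J[i][j] = max(Y[i][j], right, up)
    return J

def tactic_stop(Y, J):
    """Follow a tactic from (0,0): move in a direction along which J is
    conserved (the 'martingale' direction); stop when J decreases both
    ways, which in this deterministic case forces J = Y at the stop."""
    i, j = 0, 0
    n, m = len(Y), len(Y[0])
    while True:
        right = J[i + 1][j] if i + 1 < n else float("-inf")
        up = J[i][j + 1] if j + 1 < m else float("-inf")
        if J[i][j] == up:
            j += 1            # advance the second coordinate (t'')
        elif J[i][j] == right:
            i += 1            # advance the first coordinate (t')
        else:
            return (i, j)     # J drops in both directions: stop

Y = [[0.1, 0.4, 0.2],
     [0.3, 0.9, 0.5],
     [0.2, 0.6, 0.3]]
J = snell_envelope(Y)
print(tactic_stop(Y, J))   # stops at (1, 1), where Y attains its maximum 0.9
```

In this degenerate setting the stopping point found by the tactic is trivially optimal; the content of the Proposition is that the same mechanism works for genuinely random pay-offs, with E(J_{t'} | F_t), E(J_{t''} | F_t) in place of the neighbouring values.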

III- Optimal stopping on ℝ₊²

At present we cannot propose any general method to solve the optimal stopping problem of a process indexed by ℝ₊². The main reason is the lack of information about the behaviour of the supermartingale J, the Snell envelope. Consequently, we turn here towards particular cases in which J is better known, thanks to its functional features.

III-1 Optimal stopping of the Brownian sheet

Let W = (W_t; t ∈ ℝ₊²) be a Brownian sheet, defined on its canonical space (Ω, 𝒜, (F_t), P); let f be a positive, bounded, continuous function and α a positive constant. The pay-off process is given by:

Y_t = exp(-α t₁t₂) f(W_t), ∀ t = (t₁,t₂) ∈ ℝ₊².

The Snell envelope, J, is computed in (9). We obtain the following formula:

J_t = exp(-α t₁t₂) q(W_t)

where q is the Snell réduite of f with respect to the classical Brownian semigroup on ℝ, see (11). In addition, when q is sufficiently differentiable, it can be defined as the solution of a classical system of variational inequations, (1).
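The réduite q (the smallest α-excessive majorant of f) can be approximated numerically. A minimal sketch under illustrative assumptions (grid step, number of sweeps and the choice of f are all ours, not from the paper): iterate q ← max(f, e^{-αΔ} P_Δ q), replacing one step P_Δ of the Brownian semigroup by its nearest-neighbour random-walk approximation with time step Δ = (Δx)².

```python
import math

def reduite(f_vals, alpha, dx, n_sweeps=5000):
    """Approximate the alpha-reduite of f on a 1-D grid by iterating
    q <- max(f, discounted one-step average), the random-walk analogue
    of one step of the Brownian semigroup with Delta = dx^2."""
    disc = math.exp(-alpha * dx * dx)   # e^{-alpha * Delta}
    q = list(f_vals)
    n = len(q)
    for _ in range(n_sweeps):
        new = list(q)
        for i in range(1, n - 1):
            new[i] = max(f_vals[i], disc * 0.5 * (q[i - 1] + q[i + 1]))
        q = new
    return q

# illustrative pay-off: a positive bounded bump on [-2, 2]
xs = [0.1 * i - 2.0 for i in range(41)]
f_vals = [math.exp(-x * x) for x in xs]
q = reduite(f_vals, alpha=1.0, dx=0.1)
# q majorizes f and is (discretely) excessive:
# q[i] >= e^{-alpha*Delta} * (q[i-1] + q[i+1]) / 2
```

The iteration is monotone increasing from f and is a sup-norm contraction (factor e^{-αΔ}), so it converges to the discrete réduite; the two inequalities in the comment are the grid analogues of q ≥ f and the α-excessivity of q.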

The notion of optional increasing paths (o.i.p.), due to (12), generalizes that of tactics to ℝ₊². An o.i.p. is a family of s.p., Z = (Z_u; u ∈ ℝ₊), increasing for the partial ordering (with Z₀ = 0 and Z_∞ = ∞) and such that the application u → Z_u is continuous from ℝ₊ into ℝ₊² ∪ {∞}.

We prove in (9) the following result:

Proposition: On each optional increasing path Z, not identically equal to the coordinate axes, there exists an optimal stopping point, T*, given by:

T* = Z_{τ*}, with τ* = inf{ u : q(W_{Z_u}) = f(W_{Z_u}) }.
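For numerical experiments with such pay-offs one needs sample sheets. A standard construction (the grid size and mesh below are illustrative choices) simulates W on a grid as the double cumulative sum of independent centred Gaussian rectangle increments, so that Cov(W_s, W_t) = min(s₁,t₁)·min(s₂,t₂); the restriction of W to an increasing path such as the diagonal is then a one-parameter process to which the stopping rule above can be applied.

```python
import random

def brownian_sheet(n1, n2, dt1, dt2, seed=0):
    """Sample W on the grid {(i*dt1, j*dt2)}, 0 <= i <= n1, 0 <= j <= n2:
    the increment of W over each grid rectangle is N(0, dt1*dt2), and W is
    the double cumulative sum, with zero boundary W(0,.) = W(.,0) = 0."""
    rng = random.Random(seed)
    sd = (dt1 * dt2) ** 0.5
    W = [[0.0] * (n2 + 1) for _ in range(n1 + 1)]
    for i in range(1, n1 + 1):
        for j in range(1, n2 + 1):
            incr = rng.gauss(0.0, sd)
            # rectangle rule: W(t) = W(t1', t2) + W(t1, t2') - W(t1', t2') + incr
            W[i][j] = W[i - 1][j] + W[i][j - 1] - W[i - 1][j - 1] + incr
    return W

W = brownian_sheet(100, 100, 0.01, 0.01)
diagonal = [W[u][u] for u in range(101)]   # W along the o.i.p. u -> (u, u)
```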

III-2 Optimal stopping of a bi-Markov process

This example can be explained in the context of stochastic games theory. The evolutions of two independent players are modelled by two Markov processes defined on their canonical spaces; for i = 1,2: Xⁱ = (Ωⁱ, 𝒜ⁱ, Xⁱ_t, Fⁱ_t, (Pⁱ_{xⁱ}; xⁱ ∈ Eᵢ)). They must stop at - possibly different - times T¹ and T² such that the average of the common reward

E( exp(-α₁T¹ - α₂T²) f(X¹_{T¹}, X²_{T²}) )

is maximum. In that formula, f is a given positive bounded function on E₁×E₂ and α₁, α₂ are some positive constants. Times T¹ and T² have to be causally chosen; that is to say, knowing the sample paths of X¹ up to T¹ and X² up to T² only. In other words, for every given real t₁, t₂, the event {T¹ ≤ t₁, T² ≤ t₂} must depend only on the r.v. (X¹_{s₁}, X²_{s₂}; s₁ ≤ t₁, s₂ ≤ t₂).

The two-parameter model is defined on the following tensorial products:

Ω = Ω¹×Ω², 𝒜 = 𝒜¹⊗𝒜², P_x = P¹_{x₁}⊗P²_{x₂}, E = E₁×E₂, F_t = F¹_{t₁}⊗F²_{t₂},

X_t = (X¹_{t₁}, X²_{t₂}), Y_t = exp(-α·t) f(X_t), where t = (t₁,t₂) and α·t = α₁t₁ + α₂t₂.

The Snell envelope, J, for the optimal stopping of the process Y defined on the filtered probability space (Ω, 𝒜, (F_t; t ∈ ℝ₊²), P_x) is computed in (9). It satisfies

J_t = exp(-α·t) q(X_t)

where q is the smallest α-biexcessive function on E - i.e. for every fixed coordinate, the function q is αᵢ-excessive, as in (2), with respect to the other coordinate. The function q is called the α-réduite of f. Moreover, when q is sufficiently differentiable, it satisfies a system of variational inequalities. For example, in case X¹ and X² are standard Brownian motions in some open sets and α₁ = α₂ = 0, this system is the following:

q ≥ f,
Δ₁q ≤ 0, Δ₂q ≤ 0,
Δ₁Δ₂q = 0 on (A₁×A₂)ᶜ with A = {q = f},

where Δ₁ (resp. Δ₂) denotes the Laplacian in the first (resp. second) variable and, for any A ⊂ E, A₁ (resp. A₂) denotes its projection on E₁ (resp. E₂).

This system leads to several remarks. First, it is not closed, because nothing is said upon the state of q on A₁×A₂ \ A. Then we do not know under what appropriate conditions such a system admits a unique solution which is the Snell réduite of f. To our knowledge these problems are open.

Nevertheless, when the function Δ₁Δ₂q has a constant sign, Δ₁Δ₂q ≥ 0, one can show that there exists a unique maximal s.p. and this stopping point is optimal. However, until now, we cannot say anything about the general case.

Bibliography

(1) A. BENSOUSSAN and J.L. LIONS: "Applications des inéquations variationnelles en contrôle stochastique", Dunod, 1978.
(2) J.M. BISMUT and B. SKALLI: "Temps d'arrêt optimal, théorie générale des processus et processus de Markov", Z.f.Wahr.V.Geb. 39 (1977), 301-313.
(3) R. CAIROLI: "Enveloppe de Snell d'un processus à paramètre bidimensionnel", Ann. Inst. H. Poincaré, Vol. XVIII, No. 1 (1982), 47-53.
(4) C. DELLACHERIE and P.A. MEYER: "Probabilités et potentiel", vol. 1 and 2, Hermann, 1975 and 1980.
(5) E.B. DYNKIN: "Markov processes", Springer-Verlag, 1965.
(6) N. EL KAROUI: "Les aspects probabilistes du contrôle stochastique", Ecole d'été de St Flour 1979, Lect. Notes in Math. No. 876, Springer-Verlag (1981), 74-239.
(7) U. KRENGEL and L. SUCHESTON: "Stopping rules and tactics for processes indexed by directed sets", J. of Mult. Anal., Vol. 11, No. 2 (1981), 199-229.
(8) A. MANDELBAUM and R.J. VANDERBEI: "Optimal stopping and supermartingales over partially ordered sets", Z.f.Wahr.V.Geb. 57 (1981), 253-264.
(9) G. MAZZIOTTO and J. SZPIRGLAS: "Arrêt optimal sur le plan", C.R. Acad. Sc. Paris, t. 293 (1981), 87-90.
(10) P.A. MEYER: "Théorie élémentaire des processus à deux indices", Lect. Notes in Math. No. 863, Springer-Verlag (1981), 1-39.
(11) A.N. SHIRYAYEV: "Optimal stopping rules", Springer-Verlag, 1978.
(12) J.B. WALSH: "Optional increasing paths", Lect. Notes in Math. No. 863, Springer-Verlag (1981), 172-201.
(13) E. WONG and M. ZAKAI: "Martingales and stochastic integrals for processes with a multidimensional parameter", Z.f.Wahr.V.Geb. 29 (1974), 109-122.
STOCHASTIC CONTROL PROBLEM FOR REFLECTED DIFFUSIONS

IN A CONVEX B O U ~ E D DOMAIN

Jose Luis Menaldi (*)


Department of Mathematics
Wayne State University
Detroit, Michigan 48202, U.S.A.

I. INTRODUCTION

Let O be a convex bounded domain in ℝⁿ. The state of our stochastic control problem is a normal reflected diffusion (y(t), t ≥ 0) on Ō, starting at x and with drift coefficient g(y(t),v(t)) and diffusion term σ(y(t),v(t)). The functions g, σ are given and our control is the adapted process v(t) taking values in a compact convex set V in ℝᵐ.

The cost functional is defined by

(1.1) J_x(v) = E ∫₀^∞ f(y(t),v(t)) exp( -∫₀^t c(y(s),v(s)) ds ) dt,

where the functions f, c are known.

Our control problem is to minimize the cost functional (1.1) over all admissible controls v(·),

(1.2) u(x) = inf{ J_x(v) : v(·) }.

At least formally, by the argument of dynamic programming, we can derive the following equation to be satisfied by u:

(1.3) sup{ A(v)u - f(·,v) : v ∈ V } = 0 in O,
      ∂u/∂n = 0 on ∂O,

where A(v) = -(1/2) tr( σσ* ∇² ) - g·∇ + c, ∇ is the gradient operator, n denotes the outward unit normal to ∂O, tr(·) is the trace and * denotes the transpose.

In this way, the initial stochastic control problem (1.2) is connected to the Hamilton-Jacobi-Bellman equation with Neumann boundary conditions (1.3). We characterize the optimal cost function u(x) as the unique stationary function of the corresponding nonlinear semigroup. Also, under some conditions, we construct an optimal Markovian control.

(*) This research has been supported in part by the University of Paris-Dauphine and I.N.R.I.A., France.

On the other hand, we consider the diffusion (y^ε(t), t ≥ 0) in ℝⁿ, with drift coefficient g(y^ε(t),v(t)) - (1/ε)β(y^ε(t)) and diffusion term σ(y^ε(t),v(t)). The function β represents the penalization factor, which is exactly the gradient of the square of the distance to the set O.

We denote by J^ε_x(v), ε > 0, the cost functional (1.1) with y^ε(t) instead of y(t). The optimal cost is

(1.4) u^ε(x) = inf{ J^ε_x(v) : v(·) }.

We show that the function u^ε(x) converges uniformly in Ō to the initial optimal cost function u(x). Also, under suitable assumptions, the diffusion process (y^ε(t), t ≥ 0) converges to the reflected diffusion (y(t), t ≥ 0) in an appropriate sense.

The first rigorous results concerning the derivation of the HJB equation (1.3) in the whole space were obtained by Krylov [10]. The HJB equation (1.3) with Dirichlet boundary conditions was studied by Safonov [24] in a plane domain, by Evans and Friedman [5] and Lions and Menaldi [14] in an n-dimensional domain but with constant coefficient σ, and by Lions [13], Evans and Lions [7], Evans and Lenhart [6], Evans [4], Krylov [12] in a general smooth domain. A weak formulation of the HJB equation (1.3) in a general domain with Dirichlet boundary conditions was presented in Lions and Menaldi [14,15].

Let us mention that these kinds of problems were introduced in the book of Fleming and Rishel [8]. We also refer to the books of Bensoussan and Lions [1,2], Friedman [9], Krylov [11] for impulsive and continuous control problems. We remark that some results concerning impulsive control problems are given in [17,18] for degenerate diffusion processes, in [19,20] for degenerate reflected diffusion processes, and in Bensoussan and Menaldi [3] for diffusion processes with jumps.

In section 2, we give the main notations, assumptions and background to be used later on. Next, in section 3, we establish some results concerning the characterization and regularity of the optimal cost, and the approximation by the problem in the whole space.

2. PRELIMINARIES

Let O be an open, convex and bounded set in ℝⁿ, and let V be a compact convex set in ℝᵐ. We suppose that

g : ℝⁿ×V → ℝⁿ, σ : ℝⁿ×V → ℝⁿ⊗ℝⁿ,
f : ℝⁿ×V → ℝ₊, c : ℝⁿ×V → ℝ₊

satisfy, with g = (g_i, i = 1,…,n), σ = (σ_ij, i,j = 1,…,n),

(2.1) |φ(x,v) - φ(x',v')| ≤ C|x - x'| + ρ(|v - v'|), |φ(x,v)| ≤ C, ∀ x,v, ∀ φ = g_i, σ_ij, f, c;
      c(x,v) ≥ c₀ > 0, ∀ x,v;

where ρ is a given continuous function such that ρ(0) = 0 and C denotes generic constants.
We call an admissible system a set 𝒜 = (Ω, ℱ, ℱ_t, P, w(t), v(t), y(t)), where (Ω, ℱ, P) is a probability space, (ℱ_t, t ≥ 0) is a nondecreasing right continuous family of completed sub-σ-fields of ℱ, w(t) is a standard Wiener process in ℝⁿ with respect to ℱ_t, v(t) is a measurable adapted process taking values in V, and y(t) is the normal reflected diffusion defined by the following stochastic variational inequality (S.V.I.):

(2.2) y(t), η(t) are continuous, measurable and adapted processes in ℝⁿ such that

(i) y(t) takes values in the closure Ō and η(t) has bounded variation on [0,T], 0 < T < ∞, η(0) = 0;

(ii) y(t) + η(t) = x + ∫₀^t g(y(s),v(s)) ds + ∫₀^t σ(y(s),v(s)) dw(s), ∀ t ≥ 0;

(iii) for any z(t) continuous, measurable and adapted process taking values in the closure Ō, we have

∫₀^T (y(t) - z(t))·dη(t) ≥ 0, ∀ T ≥ 0.

Notice that if the convex set O has smooth boundary, the normal reflected diffusion defined by the S.V.I. (2.2) becomes the classical one. The existence of such a reflected diffusion in (2.2) is deduced from Tanaka [27], Lions, Menaldi and Sznitman [16], and [21].
Now, for any admissible system 𝒜 and any real measurable bounded function h(x), we define a functional

(2.3) J(x,𝒜,t,h) = E{ ∫₀^t f(y(s),v(s)) φ(s,𝒜) ds + h(y(t)) φ(t,𝒜) },

      φ(t,𝒜) = exp( -∫₀^t c(y(s),v(s)) ds ),

and an operator

(2.4) [Q(t)h](x) = inf{ J(x,𝒜,t,h) : 𝒜 }.

Denote by C_b(Ō) the space of all real functions on Ō which are uniformly continuous and bounded(1), endowed with the supremum norm ‖·‖. Then, as in Nisio [22,23], Bensoussan and Lions [1], Lions and Menaldi [14,15], we can prove that (Q(t), t ≥ 0) is a nonlinear semigroup acting on C_b(Ō); precisely, we have the following properties:

(2.5) Q(t) : C_b(Ō) → C_b(Ō),

(2.6) Q(t+s) = Q(t)∘Q(s), Q(0) = identity,

(2.7) ‖Q(t)h - Q(s)h‖ → 0 as t → s, ∀ h ∈ C_b(Ō),

(2.8) ‖Q(t)h₁ - Q(t)h₂‖ ≤ ‖h₁ - h₂‖, ∀ h₁, h₂ ∈ C_b(Ō),

(2.9) Q(t)h₁ ≤ Q(t)h₂ if h₁ ≤ h₂, h₁, h₂ ∈ C_b(Ō),

(2.10) If h is twice continuously differentiable in a compact subset K of Ō and the normal derivative to ∂O of h vanishes on ∂K ∩ ∂O, then we have

(1/t)[Q(t)h - h] → -sup{ A(v)h - f(·,v) : v ∈ V } as t ↓ 0, uniformly on K,

where the operator A(v) is defined by

(2.11) A(v) = -(1/2) tr( σσ* ∂²/∂x² ) - g·∂/∂x + c.

Under this formulation, we define the cost of each admissible system 𝒜,

(2.12) J_x(𝒜) = E{ ∫₀^∞ f(y(t),v(t)) exp( -∫₀^t c(y(s),v(s)) ds ) dt },

and the optimal cost

(2.13) u(x) = inf{ J_x(𝒜) : 𝒜 }.

On the other hand, we call an ε-admissible system a set 𝒜^ε = (Ω, ℱ, ℱ_t, P, w(t), v(t), y^ε(t)), ε > 0, satisfying the same conditions as 𝒜 except that y^ε(t) is the diffusion process given by the following Ito equation

(2.14) dy^ε(t) = g(y^ε(t),v(t)) dt + σ(y^ε(t),v(t)) dw(t) - (1/ε) β(y^ε(t)) dt, t ≥ 0,
       y^ε(0) = x,

where

(2.15) β(x) = (x - π(x))*, π : orthogonal projection on Ō.

(1) Since the domain is bounded, we just need continuous functions.

In a similar way to (2.3), (2.4) we define the operator

(2.16) [Q^ε(t)h](x) = inf{ J(x,𝒜^ε,t,h) : 𝒜^ε },

and as in (2.12), (2.13), we obtain the optimal cost

(2.17) u^ε(x) = inf{ J_x(𝒜^ε) : 𝒜^ε }.

Clearly, (Q^ε(t) : t ≥ 0) satisfies properties analogous to (2.5),...,(2.10).

We remark that the penalization (2.14) has been used in Shalaumov [25], in Lions, Menaldi and Sznitman [16], and in [21]. Also, similar kinds of nonlinear semigroups have been treated in Stettner and Zabczyk [26], Zabczyk [28].
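In one dimension the effect of the penalization term of (2.14) can be sketched with an Euler scheme. The sketch below is an illustration under our own assumptions, not the paper's construction: O = (-1,1), zero drift, unit diffusion, π the projection (clipping) onto [-1,1], and arbitrary step size and horizon. The restoring term -(1/ε)β(y)dt pulls the path back towards Ō, the more strongly the smaller ε.

```python
import random

def penalized_path(eps, increments, y0=0.0, dt=1e-3):
    """Euler scheme for dy = dw - (1/eps) * beta(y) dt on O = (-1, 1),
    where beta(x) = x - clip(x) is the 1-D analogue of (2.15) and clip
    is the orthogonal projection onto [-1, 1]."""
    clip = lambda x: max(-1.0, min(1.0, x))
    y, path = y0, [y0]
    for dw in increments:
        y = y + dw - (dt / eps) * (y - clip(y))
        path.append(y)
    return path

# random excursion: Brownian increments over [0, 2]; the weaker penalization
# typically allows larger excursions outside [-1, 1] than the stronger one
rng = random.Random(1)
dws = [rng.gauss(0.0, 1e-3 ** 0.5) for _ in range(2000)]
wild = penalized_path(eps=0.1, increments=dws)
tame = penalized_path(eps=0.001, increments=dws)

# deterministic relaxation from outside the domain (no noise): the path
# decays geometrically towards the boundary of O
relax = penalized_path(eps=0.01, increments=[0.0] * 100, y0=2.0)
```

With dt/ε = 1 (the `tame` path), each step removes the whole excess y - π(y), so the overshoot outside Ō never exceeds a single noise increment, a crude numerical counterpart of the convergence y^ε → y stated below as (3.7).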
In order to construct an optimal Markovian admissible system with respect to (2.13), we will deal with the existence of a weak solution of the S.V.I. (2.2) for coefficients g, σ which are only bounded and measurable.

We assume that there is no degeneracy, i.e.

(2.18) σ(x,v)σ*(x,v) ≥ μ I, ∀ x, v,

for some constant μ > 0, and we also suppose that we cannot exert any control near the boundary, i.e.,

(2.19) for every x ∈ Ō such that dist(x,∂O) ≤ δ we have φ(x,v) = φ(x), ∀ v ∈ V, ∀ φ = g_i, σ_ij, f, c,

where δ is some positive constant.

Lemma 2.1
Let v(x) be a measurable function from Ō into V. Then under assumptions (2.1), (2.18), (2.19)(1), there exists an admissible system 𝒜 = (Ω, ℱ, ℱ_t, P, w(t), v(t), y(t)) such that v(t) = v(y(t)) for all t ≥ 0.

Outline of the Proof.

First of all, combining Krylov's estimates for diffusion processes (cf. [11]) and classical estimates for reflected diffusions with smooth coefficients (cf. Bensoussan and Lions [2]), we obtain for any measurable function h(x) the estimate

(2.20) E{ ∫₀^∞ |h(y(t))| e^{-c₀t} dt } ≤ C ( ∫_O |h(x)|ⁿ dx )^{1/n},

where y(t) is the reflected diffusion associated to a smooth feedback v(x), and the constant C is independent of v, h.

(1) The boundary of O is supposed to be smooth.

Next, by a similar technique as in Krylov [11, p. 86], the lemma follows.

3. CHARACTERIZATIONS

We have

Theorem 3.1
We assume (2.1). Then the function u defined by (2.13) is the unique solution of the problem

(3.1) u ∈ C_b(Ō), Q(t)u = u, ∀ t ≥ 0.

Moreover, the equation of the dynamic programming is satisfied:

(3.2) u(x) = inf{ J(x,𝒜,θ,u) : (𝒜,θ) },

where J is the functional given by (2.3) and θ stands for any stopping time associated to the admissible system 𝒜.

Outline of the proof.

We just need to use properties (2.5),...,(2.9) in a similar way as in Nisio [22], Bensoussan and Lions [1], Lions and Menaldi [15].

We remark that if u is smooth, from Theorem 3.1 and property (2.10) we obtain equation (1.3). In this sense, problem (3.1) can be taken as an integral formulation of the HJB equation (1.3).
For the regular case, we have

Theorem 3.2
Let assumptions (2.1), (2.18), (2.19) hold. Suppose also that

(3.3) g_i, σ_ij, f, c have uniformly Lipschitz continuous first derivatives in x,

(3.4) O has smooth boundary.

Then the optimal cost u defined by (2.13) is the unique solution of problem (1.3) in the Sobolev space W^{2,∞}(O). Moreover, there exists an optimal admissible system which is a Markovian feedback.

Outline of the proof.

We just need to show the W^{2,∞} regularity for the optimal cost u. Since there is no control near the boundary ∂O, the function u solves a linear equation near ∂O. Hence u is smooth in a neighborhood of ∂O by using classical arguments. Next, similarly to Evans and Lions [7], we deduce the W^{2,∞} regularity in the interior of O.

The existence of an optimal control follows from the fact that u is W^{2,∞} together with Lemma 2.1.

Remark 3.1
Using the technique of Lions and Menaldi [15] and [19], the optimal cost u can be characterized as the maximum subsolution in an appropriate sense.

Remark 3.2
We can obtain the C^{2,α}(Ō)-regularity of the solution u by using a recent result of Evans [4].
On the other hand, we have

Theorem 3.3
Under assumption (2.1) we have the following convergence:

(3.5) u^ε(x) → u(x) as ε → 0, uniformly in x ∈ Ō,

where u^ε, u are given by (2.17), (2.13) respectively. Moreover, for any function h in C_b(ℝⁿ) we have, as ε → 0,

(3.6) Q^ε(t)h → Q(t)h in C_b(Ō), uniformly in t ≥ 0,

where Q^ε(t), Q(t) are the nonlinear semigroups defined by (2.16), (2.4).

Outline of the proof.

We use the following fact (cf. [21]): ∀ T > 0 we have

(3.7) E{ sup |y^ε(t) - y(t)| : 0 ≤ t ≤ T } → 0 as ε → 0,

the limit being uniform in v(·) and x in Ō. The processes y^ε(·), y(·) are given by the stochastic equations (2.14), (2.2).

Theorem 3.4
Let ε be any positive constant. Then under hypothesis (2.1) there exists a function u_ε such that

(3.8) u_ε ∈ C^{2,β}(ℝⁿ) for some β > 0,

      sup{ A(v)u_ε - f(·,v) : v ∈ V } ≤ ε in O,

      -ε ≤ u_ε(x) - u(x) ≤ ε, ∀ x ∈ Ō,

where u is the optimal cost (2.13). Moreover, we can construct an admissible system 𝒜^ε which is ε-optimal, i.e.

(3.9) J_x(𝒜^ε) ≤ u(x) + ε, 𝒜^ε = 𝒜_{ε,x}.

Outline of the proof.

First of all, we note that without loss of generality we can assume that the data g, σ, f, c satisfy (2.18), (3.3). Since the domain O is not supposed to be smooth, we cannot apply Theorem 3.2 directly. Then we use (3.5) to deduce (3.8). We remark that in (3.8) the function u_ε = u_{ε'}, with u_{ε'} given by (2.17) for some small ε'.

Finally, we construct 𝒜^ε by means of a feedback v_ε(x) which achieves the infimum in (2.17).

Remark 3.3
In almost all of this paper, assumptions (2.1), (2.18), (2.19) can be relaxed. We will have quite similar results.

Remark 3.4
We can extend all results to the parabolic case.

Remark 3.5
With analogous techniques, we can consider the case where we add an impulsive control to the system 𝒜.

REFERENCES

[1] A. BENSOUSSAN and J.L. LIONS, Applications des Inéquations Variationnelles en Contrôle Stochastique, Dunod, Paris, 1978.

[2] A. BENSOUSSAN and J.L. LIONS, Contrôle Impulsionnel et Inéquations Quasi-Variationnelles, Dunod, Paris, 1981 (to appear).

[3] A. BENSOUSSAN and J.L. MENALDI, Optimal Stochastic Control of Diffusion Processes with Jumps Stopped at the Exit of a Domain, Advances in Probability, Vol. 7, Stochastic Differential Equations, Ed. M.A. Pinsky, Marcel Dekker Inc., to appear.

[4] L.C. EVANS, Classical Solutions of the Hamilton-Jacobi-Bellman Equations for Uniformly Elliptic Operators, preprint.

[5] L.C. EVANS and A. FRIEDMAN, Optimal Stochastic Switching and the Dirichlet Problem for the Bellman Equation, Trans. Am. Math. Soc., 253 (1979), pp. 365-389.

[6] L.C. EVANS and S. LENHART, The Parabolic Bellman Equation, Nonlinear Analysis (1981), pp. 765-773.

[7] L.C. EVANS and P.L. LIONS, Résolution des Equations de Hamilton-Jacobi-Bellman pour des Opérateurs Uniformément Elliptiques, C. R. Acad. Sc. Paris, A-290 (1980), pp. 1049-1052.

[8] W.H. FLEMING and R. RISHEL, Optimal Deterministic and Stochastic Control, Springer-Verlag, New York, 1975.

[9] A. FRIEDMAN, Stochastic Differential Equations and Applications, Vol. I and II, Academic Press, New York, 1976.

[10] N.V. KRYLOV, Control of a Solution of a Stochastic Integral Equation, Theory Prob. Appl., 17 (1972), pp. 114-131.

[11] N.V. KRYLOV, Controlled Diffusion Processes, Springer-Verlag, New York, 1980.

[12] N.V. KRYLOV, Some New Results in the Theory of Controlled Diffusion Processes, Math. USSR Sbornik, 37 (1980), pp. 133-149.

[13] P.L. LIONS, Sur Quelques Classes d'Equations aux Dérivées Partielles Nonlinéaires et Leur Résolution Numérique, Thèse d'Etat, Université de Paris VI, 1979.

[14] P.L. LIONS and J.L. MENALDI, Problèmes de Bellman avec le Contrôle dans les Coefficients de Plus Haut Degré, C. R. Acad. Sc. Paris, A-287 (1978), pp. 409-412.

[15] P.L. LIONS and J.L. MENALDI, Control of Stochastic Integrals and Hamilton-Jacobi-Bellman Equation, Parts I and II, SIAM J. Control Optim., 20 (1982), pp. 58-95. See also Proc. 20th IEEE CDC, San Diego, 1981, pp. 1340-1344.

[16] P.L. LIONS, J.L. MENALDI and A.S. SZNITMAN, Construction de Processus de Diffusion Réfléchis par Pénalisation du Domaine, C. R. Acad. Sc. Paris, I-292 (1981), pp. 559-562.

[17] J.L. MENALDI, On the Optimal Stopping Time Problem for Degenerate Diffusions, SIAM J. Control Optim., 18 (1980), pp. 697-721. See also C. R. Acad. Sc. Paris, A-284 (1977), pp. 1443-1446.

[18] J.L. MENALDI, On the Optimal Impulse Control Problem for Degenerate Diffusions, SIAM J. Control Optim., 18 (1980), pp. 722-739. See also C. R. Acad. Sc. Paris, A-284 (1977), pp. 1449-1502.

[19] J.L. MENALDI, Sur le Problème de Temps d'Arrêt Optimal pour les Diffusions Réfléchies Dégénérées, C. R. Acad. Sc. Paris, A-289 (1979), pp. 779-782. See also J. Optim. Theory Appl., 36 (1982), to appear.

[20] J.L. MENALDI, Sur le Problème de Contrôle Impulsionnel Optimal pour les Diffusions Réfléchies Dégénérées, C. R. Acad. Sc. Paris, A-290 (1980), pp. 5-8. See also Mathematicae Notae, 28 (1982), to appear.

[21] J.L. MENALDI, Stochastic Variational Inequality for Reflected Diffusion, to appear.

[22] M. NISIO, On Stochastic Optimal Controls and Envelope of Markovian Semigroups, Proc. of Intern. Symp. SDE, Kyoto 1976, pp. 297-325.

[23] M. NISIO, On a Non-Linear Semi-Group Attached to Stochastic Optimal Control, Publ. RIMS, Kyoto Univ., 13 (1976), pp. 513-537.

[24] M.V. SAFONOV, On the Dirichlet Problem for the Bellman Equation in a Plane Domain, Math. USSR Sbornik, 31 (1977), pp. 231-284 and 34 (1978), pp. 521-526.

[25] V.A. SHALAUMOV, On the Behavior of a Diffusion Process with a Large Drift Coefficient in a Half Space, Theory Prob. Appl., 24 (1980), pp. 592-598.

[26] L. STETTNER and J. ZABCZYK, Strong Envelopes of Stochastic Processes and a Penalty Method, Stochastics (1981), pp. 267-280.

[27] H. TANAKA, Stochastic Differential Equations with Reflecting Boundary Condition in Convex Regions, Hiroshima Math. J., 9 (1979), pp. 163-177.

[28] J. ZABCZYK, Semigroup Methods in Stochastic Control Theory, preprint 1978, University of Montreal.
Nonlinear Filtering of Diffusion Processes
A Guided Tour
by
Sanjoy K. Mitter
Department of Electrical Engineering and Computer Science and
Laboratory for Information and Decision Systems
Massachusetts Institute of Technology
Cambridge, MA 02139

1. Introduction

In this paper we give a guided tour through the development of


nonlinear filtering of diffusion processes. The important topic of
filtering of point processes is not discussed in this paper.
There are two essentially different approaches to the nonlinear
filtering problem. The first is based on the important idea of in-
novations processes, originally introduced by Bode and Shannon (and
Kolmogoroff) in the context of Wiener Filtering problems and later de-
veloped by Kailath and his students in the late sixties for nonlinear
filtering problems. This approach reaches its culmination in the
seminal paper of FUJISAKI-KALLIANPUR-KUNITA [1972]. A detailed ac-
count of this approach is now available in book form cf. LIPSTER-
SHIRYAYEV [1977] and KALLIANPUR [1980]. The second approach can be
traced back to the doctoral dissertation of MORTENSEN [1966], DUNCAN
[1967] and the important paper of ZAKAI [1969]. In this approach
attention is focused on the unnormalized conditional density equation,
which is a bilinear stochastic partial differenti~l equation, and it
derives its inspiration from function space integration as originally
introduced by KAC [1951] and RAY [1954]. Mathematically, this view is
closely connected to the path integral formulation of Quantum Physics
due to FEYNMAN [1965]. For an exposition of this analogy see MITTER
[1980, 1981]. A detailed account of the second viewpoint can be found
in the lectures given by Kunita, Pardoux and Mitter in the CIME Lecture
Notes on Nonlinear Filtering and Stochastic Control [1982] and in
HAZEWINKEL-WILLEMS [1981].

2. Basic Problem Formulation

To simplify the exposition we consider the situation where all


processes are scalar-valued.
Let $(\Omega, \mathcal{A}, P)$ be a complete probability space and let $\mathcal{F}_t$, $t \in [0, T]$,
be an increasing family of sub-$\sigma$-fields of $\mathcal{A}$. Let $\xi_t$ be an $\mathcal{F}_t$-adapted
process, considered to be the signal process, and consider the observa-
tion process $y_t$ given by

(2.1) $y_t = \int_0^t h_s\,ds + n_t$,

where $n_t$ is an $\mathcal{F}_t$-Wiener process and it is assumed that $\sigma(n_t - n_s \mid t \geq s)$
is independent of the past of the joint signal-observation process
$\sigma(y_u, h_u \mid u \leq s)$. Information about the $\xi$-process is contained in $h$,
which satisfies

$E\int_0^t |h_s|^2\,ds < \infty \quad \forall t \in [0, T]$.

Let $\mathcal{F}_t^y = \sigma(y_s \mid s \leq t)$. Then the filtering problem consists of
computing

$E\left[\phi(\xi_t) \mid \mathcal{F}_t^y\right]$,

where $\phi$, say, is a bounded, continuous function (indeed any function such
that the conditional expectation makes sense).
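The conditional expectation above has no closed form in general, but it can be approximated by weighted Monte Carlo. The sketch below is a minimal illustration, not part of the paper: the function name, the particle count, and the Euler-Maruyama discretization are our own choices. Each independent copy of the signal is weighted by the likelihood functional that reappears as formula (5.3) later in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_estimate(phi, f, g, h, y_increments, dt, n_particles=2000):
    """Weighted Monte Carlo approximation of E[phi(xi_t) | F_t^y]:
    propagate independent copies of the signal d(xi) = f dt + g dB by
    Euler-Maruyama, and weight each copy by the likelihood
    exp(int h dy - (1/2) int h^2 ds)."""
    xi = np.zeros(n_particles)
    logw = np.zeros(n_particles)
    for dy in y_increments:
        hx = h(xi)
        logw += hx * dy - 0.5 * hx**2 * dt
        xi += f(xi) * dt + g(xi) * np.sqrt(dt) * rng.standard_normal(n_particles)
    w = np.exp(logw - logw.max())   # normalize weights stably
    w /= w.sum()
    return float(np.sum(w * phi(xi)))
```

When $h \equiv 0$ the weights are all equal and the estimate reduces to the plain Monte Carlo prior mean, which is a convenient sanity check.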

3. The Innovations Method

The fundamental paper of Fujisaki-Kallianpur-Kunita proceeds as


follows:
Define the innovations process:

(3.1) $\nu_t = y_t - \int_0^t \hat{h}_s\,ds$, where

(3.2) $\hat{h}_s \stackrel{\Delta}{=} E(h_s \mid \mathcal{F}_s^y)$.

Then it can be shown that

(i) $\nu_t$ is an $\mathcal{F}_t^y$-Wiener process;

(ii) $\sigma(\nu_u - \nu_t \mid u \geq t)$ is independent of $\mathcal{F}_t^y$.

In general it is not true that:
(Innovations Property) $\mathcal{F}_t^y = \mathcal{F}_t^\nu$ (Tsirelson counterexample).
If it is assumed that $h_t$ and $n_t$ are independent, then the innovations
property holds (cf. ALLINGER-MITTER [1981]).
However, even without the innovations property holding, it can be
proved:
Every square-integrable $\mathcal{F}_t^y$-martingale $m_t$ can be represented as:

(3.3) $m_t = m_0 + \int_0^t \eta_s\,d\nu_s$, where $\eta_s$ is jointly measurable, adapted to
$\mathcal{F}_s^y$ and $E\int_0^t |\eta_s|^2\,ds < \infty$.

To proceed further, let us assume that $\xi_t$ is a continuous semimartingale

(3.4) $\xi_t = \xi_0 + \int_0^t f_s\,ds + v_t$, where $v_t$ is a square-integrable
$\mathcal{F}_t$-martingale. Then (1)

(3.5) $m_t = \hat{\xi}_t - \hat{\xi}_0 - \int_0^t \hat{f}_s\,ds$ is an $\mathcal{F}_t^y$-martingale and from

the previous result

(3.6) $m_t = \int_0^t \eta_s\,d\nu_s$, where $\eta_s$ can be identified as

(3.7) $\eta_s = \widehat{\xi_s h_s} - \hat{\xi}_s \hat{h}_s + D_s$, where $D_s$ is an $\mathcal{F}_s^y$-predictable

process and
$\int_0^t D_s\,ds = \langle v, n \rangle_t$, where $\langle\,\cdot\,,\,\cdot\,\rangle_t$ denotes the quadratic
variation.
In case $h_t$ and $n_t$ are independent there is an essential simpli-
fication using the innovations property.
For example, one would have the representation

(3.8) $\hat{\xi}_t = E\xi_t + \int_0^t E(\xi_t \dot{\nu}_{s_1})\,d\nu_{s_1} + \int_0^t \int_0^{s_1} E(\xi_t \dot{\nu}_{s_1} \dot{\nu}_{s_2})\,d\nu_{s_1}\,d\nu_{s_2} + \cdots$

To proceed further let us assume that $\xi_t$ is a Markov diffusion
process satisfying the Ito equation

(3.9) $d\xi_t = f(\xi_t)\,dt + g(\xi_t)\,dB_t$, and

(3.10) $h_s = h(\xi_s)$.

Let us also assume that $B_t$ and $n_t$ are independent (assumed
throughout the rest of the paper). Then, using the Ito differential
rule, for $\phi \in C^2$ one gets

(1) $\hat{\ }$ always denotes conditional expectation with respect to $\mathcal{F}_t^y$.

(3.11) $\widehat{\phi(\xi_t)} = \widehat{\phi(\xi_0)} + \int_0^t \widehat{L\phi}(\xi_s)\,ds + \int_0^t \left[\widehat{\phi(\xi_s) h_s} - \widehat{\phi(\xi_s)}\,\hat{h}_s\right] d\nu_s$,

where $L$ is the generator of the diffusion process.

4. The Innovations Method Continued: Kushner-Stratonovich Equation. (1)

Let $\Pi_t(dy, \omega)$ denote the conditional distribution of $\xi_t$ given $\mathcal{F}_t^y$.
Let $\phi \in C_b^\infty$ and denote by

(4.1) $\pi_t(\phi) = \int \phi(y)\,\Pi_t(dy, \omega)$.

Then $\pi_t$ satisfies the nonlinear stochastic partial differential
equation (Kushner-Stratonovich equation)

(4.2) $\pi_t(\phi) = \pi_0(\phi) + \int_0^t \pi_s(L\phi)\,ds + \int_0^t \left[\pi_s(\phi h_s) - \pi_s(\phi)\,\pi_s(h_s)\right]\left(dy_s - \pi_s(h_s)\,ds\right)$.

We think of the filter as a dynamical system and we think of equation
(4.2) as the input-output equation of the filter, the input being
the observations $y_s$ and the output the conditional distribution $\pi_t$.

5. Zakai Equation

Let $p_t$ be a continuous stochastic process with values in the set
of finite positive measures on $\mathbb{R}$. Denote by

(5.1) $p_t(\phi) = \int \phi(y)\,p_t(dy)$.

Consider the equation

(5.2) $p_t(\phi) = p_0(\phi) + \int_0^t p_s(L\phi)\,ds + \int_0^t p_s(h_s \phi)\,dy_s$

(weak form of the Zakai equation).

Now it can be proved:

(1) The development here follows KUNITA [1982].

If $p_t$ is a solution of (5.2), then $\pi_t(\phi) = p_t(\phi)/p_t(1)$ is a solution
of equation (4.2).
Moreover we have the Feynman-Kac formula:

(5.3) $p_t(\phi) = E\left[\phi(\xi_t)\,\exp\left(\int_0^t h_s\,dy_s - \frac{1}{2}\int_0^t h_s^2\,ds\right)\right]$.

For later use (and throughout the rest of the paper) we shall
consider the weak form of the Zakai equation in Stratonovich form:

(5.4) $p_t(\phi) = p_0(\phi) + \int_0^t p_s\left[(L - \tfrac{1}{2} h^2)\phi\right] ds + \int_0^t p_s(h\phi) \circ dy_s$.

If the solution $p_t(dy)$ has a smooth density $p_t(y)$, it satisfies
the Zakai equation:

(5.5) $dp_t = (L^* - \tfrac{1}{2} h^2)\,p_t(y)\,dt + h\,p_t(y) \circ dy_t$,

where $*$ denotes formal adjoint.

$p_t(y)$ has the interpretation of an unnormalized density and is to
be thought of as the "state" of the filter.
To compute conditional statistics, we need the state-output
equation

(5.6) $\widehat{\phi(x_t)} = \frac{\int \phi(x)\,p_t(x; y)\,dx}{\int p_t(x; y)\,dx}$.

The fundamental problem of nonlinear filtering is the "invariant"
study of equation (5.5). The analytic difficulty of this problem
stems from the following:
(i) In most interesting situations the multiplication operator $x \mapsto h(x)$ is
unbounded.
(ii) The paths of the $y$-process are only Hölder continuous
of exponent $< \tfrac{1}{2}$.
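A common way to make equation (5.5) concrete is an operator-splitting discretization: propagate the density with the Fokker-Planck operator, then apply the multiplicative observation update. The sketch below is a minimal illustration and not from the paper (explicit finite differences on a periodic grid, Ito-form update; all names and coefficients are our own), with the state-output map (5.6) implemented as a normalization.

```python
import numpy as np

def zakai_step(p, x, dx, dt, f, g, h, dy):
    """One splitting step for the (Ito-form) Zakai equation
    dp = L* p dt + h p dy on a periodic grid.
    Prediction: explicit finite differences for the Fokker-Planck
    operator L* p = (1/2)(g^2 p)'' - (f p)'.
    Correction: pointwise multiplication by exp(h dy - (1/2) h^2 dt)."""
    a = 0.5 * g(x) ** 2 * p
    flux = f(x) * p
    diffusion = (np.roll(a, -1) - 2.0 * a + np.roll(a, 1)) / dx**2
    drift = (np.roll(flux, -1) - np.roll(flux, 1)) / (2.0 * dx)
    p_pred = p + dt * (diffusion - drift)
    hx = h(x)
    return p_pred * np.exp(hx * dy - 0.5 * hx**2 * dt)

def conditional_mean(p, x, dx):
    """State-output map (5.6): normalize the unnormalized density."""
    return float(np.sum(x * p) * dx / (np.sum(p) * dx))
```

With $h \equiv 0$ the correction is trivial and the scheme conserves total mass exactly on the periodic grid, which gives a useful consistency check; stability of the explicit prediction step requires $dt \lesssim dx^2/g^2$.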

6. Pathwise Nonlinear Filtering

The ideas of this section are due to CLARK [1978], DAVIS [1980],
and MITTER [1980].
There is as yet no theory of nonlinear filtering where the
observations are:

(6.1) $y_t = h(\xi_t) + \eta_t$,

where $\eta_t$ is physical wide-band noise and hence smooth. Define $y_t = \dot{Y}_t$
and $\eta_t = \dot{N}_t$, where $\dot{}$ denotes differentiation. Then (6.1) can be written
as:

(6.2) $dY_t = h(\xi_t)\,dt + dN_t$, or

(6.3) $Y_t = \int_0^t h(\xi_s)\,ds + N_t$.

Equation (6.3) is a mathematical model of the physical observation
(6.1), where the wide-band noise $\eta_t$ has been approximated as "white noise"
$\dot{N}_t$, and hence $N_t$ is a Wiener process.
Now, if we wish to compute

$E\left(\phi(\xi_t) \mid \mathcal{F}_t^Y\right) = \text{functional of } Y \text{ a.s. Wiener measure},$

then this filter does not accept the physical observation $y$. The idea
is to construct a suitable version of the conditional expectation so
that the performance of the filter, as measured by the mean-square
error, remains close to that obtained when the physical observation $y$ is replaced
by the mathematical model of the observation.
This is most easily done by eliminating the stochastic integral in
(5.5) by a suitable transformation (a gauge transformation in the
language of physicists).
Define $q_t(x; y)$ by

(6.4) $p_t(x; y) = \exp\left(y_t h(x)\right)\,q_t(x; y)$.

Then $q_t$ satisfies the parabolic partial differential equation

(6.5) $\frac{\partial q}{\partial t} = \frac{1}{2}\,a(x)\,\frac{\partial^2 q}{\partial x^2} + b^y(x, t)\,\frac{\partial q}{\partial x} + V^y(x, t)\,q$,

where $a(x) = g^2(x)$, $b^y = -f + y_t\,a\,\frac{dh}{dx} + \frac{da}{dx}$, and

$V^y = -\frac{1}{2}\,h^2 - \frac{df}{dx} + y_t\left(\frac{da}{dx}\frac{dh}{dx} + \frac{1}{2}\,a\,\frac{d^2 h}{dx^2} - f\,\frac{dh}{dx}\right) + \frac{1}{2}\,y_t^2\,a\left(\frac{dh}{dx}\right)^2 + \frac{1}{2}\,\frac{d^2 a}{dx^2}.$

Equation (6.5), the pathwise filter equation, now has to be
solved for each observation path $y$ (which can be taken to be a
physical observation).
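The gauge transformation can be checked numerically for a concrete choice of coefficients. In the sketch below the choice $f(x) = -x$, $g = 1$, $h(x) = x$ is ours, and the right-hand side is derived by hand from (6.4) applied to the Zakai operator; the point is that conjugating $L^* - \tfrac12 h^2$ by $e^{y h}$ yields a parabolic operator in which the observation enters only through its current value $y$, with no stochastic integral.

```python
import numpy as np

# Hypothetical coefficients: f(x) = -x, g = 1, h(x) = x.  Conjugating
# the Zakai operator gives (by hand):
#   e^{-y h}(L* - h^2/2)(e^{y h} q)
#     = (1/2) q'' + (x + y) q' + (-(x^2)/2 + 1 + x y + y^2/2) q.
x = np.linspace(-3.0, 3.0, 601)
dx = x[1] - x[0]
y = 0.7                         # a fixed value of the observation path
q = np.exp(-x**2)               # a smooth test function

def d1(u): return np.gradient(u, dx)
def d2(u): return np.gradient(np.gradient(u, dx), dx)

psi = np.exp(y * x) * q
# L* psi = (1/2) psi'' - (f psi)'  with f(x) = -x, so -(f psi)' = (x psi)'
lhs = np.exp(-y * x) * (0.5 * d2(psi) + d1(x * psi) - 0.5 * x**2 * psi)
rhs = 0.5 * d2(q) + (x + y) * d1(q) \
      + (-0.5 * x**2 + 1.0 + x * y + 0.5 * y**2) * q
err = np.max(np.abs(lhs[5:-5] - rhs[5:-5]))   # interior points only
```

The two sides agree up to finite-difference error, which is the numerical face of the statement that (6.5) contains the observation path only pointwise.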

7. Existence and Uniqueness Results for the Zakai Equation.

In case the observation $h$ is bounded or linear, existence and
uniqueness results for the Zakai equation have been given by PARDOUX
[1982] by studying equation (5.5) directly. Existence, uniqueness
and estimates of tail distributions for equation (5.5), including
unbounded observations (in the scalar case all polynomial $h$), have been
given by FLEMING-MITTER [1982], using stochastic control arguments.
This approach studies equation (6.5) by transforming it into a
Bellman-Hamilton-Jacobi equation (cf. the paper of FLEMING in this volume)
using an exponential transformation and then showing that the measure
constructed from $q$ coincides with the measure given by formula (5.3).
For other literature on this problem see the bibliography in
FLEMING-MITTER [1982]. See also PARDOUX [1981] and MITTER [1982] for
an interpretation of the exponential transformation in the context
of nonlinear filtering. For related variational considerations see
MITTER [1980], BISMUT [1981], HIJAB [1980].

8. Geometrical Theory of Nonlinear Filtering.

How can one answer the question when two filtering problems have
identical solutions? How can one decide whether the Zakai equation
admits a finite-dimensional statistic?
The starting point of this analysis is the Zakai equation (5.5)
in Stratonovich form. Consider the two operators:

$\mathcal{A}_0 = L^* - \tfrac{1}{2}\,h^2 \qquad \text{and} \qquad \mathcal{A}_1 = \text{multiplication by } h,$

where the two operators are considered as formal differential operators
on $C_0^\infty(\mathbb{R})$. Denote by $\mathcal{F}$ the Lie algebra of operators generated by
$\mathcal{A}_0$ and $\mathcal{A}_1$ under the standard bracket operation. This Lie algebra
is invariant under (i) smooth change of coordinates $x \mapsto \psi(x)$ and (ii)
gauge transformations $\phi \mapsto \psi\phi$, where $\psi$ is a $C^\infty$-function which is
invertible. Indeed, if the "invariance group" of the Zakai equation
is suitably defined, then the above constitutes
the largest invariance group of the Lie algebra $\mathcal{F}$. The insight here is
that two filtering problems with isomorphic Lie algebras are likely
to have the same filters. We say likely, since for a proof, analytic
considerations such as the existence of a common dense set of analytic
vectors must come into play. For a rigorous analysis in specific
situations see OCONE [1980].
By a finite-dimensional filter for the conditional statistic
$\widehat{\phi(x_t)}$ we mean a stochastic dynamical system

(8.1) $d\eta_t = \alpha(\eta_t)\,dt + \beta(\eta_t) \circ dy_t$,

where $\alpha$ and $\beta$ are smooth vector fields on some finite-dimensional
smooth manifold, and a state-output equation

(8.2) $\widehat{\phi(x_t)} = \gamma(\eta_t)$, with $\gamma$ a smooth real-valued function.

The idea of studying the Lie algebra is independently due to
BROCKETT [1981] (cf. the bibliography of earlier Brockett papers
cited there) and MITTER [1980] (cf. the bibliography of earlier
papers of Mitter cited there), and the idea of implementing the Lie
algebra $\mathcal{F}$ as a Lie algebra of vector fields is due to Brockett.
The first examples of finite-dimensional filters for nonlinear filter-
ing problems were constructed by BENES [1981] using functional integral
methods. For a generalisation of Benes' results see OCONE-BARAS-MARCUS
[1982].
In most situations the Lie algebra $\mathcal{F}$ is infinite-dimensional
(cf. IGUSA [1981]) and in many situations simple. If the Lie algebra
$\mathcal{F}$ is infinite-dimensional it does not necessarily mean that a finite-
dimensional filter does not exist. For a precise result in this
direction see HAZEWINKEL-MARCUS [1982], SUSSMANN [1981].
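For the linear-observation case the Lie algebra is indeed finite-dimensional: with $f = 0$, $g = 1$ and $h(x) = x$, the operators $\mathcal{A}_0 = \tfrac12\,d^2/dx^2 - \tfrac12 x^2$ and $\mathcal{A}_1 = x$ generate the four-dimensional oscillator algebra spanned by $\{\mathcal{A}_0, x, d/dx, 1\}$, which is the Lie-algebraic face of the Kalman-Bucy filter. The brackets can be checked mechanically on polynomials; the encoding below is our own illustration, not from the paper.

```python
import numpy as np
from numpy.polynomial import Polynomial as P

# Operators for the linear ("Kalman") case f = 0, g = 1, h(x) = x:
# A0 = (1/2) d^2/dx^2 - (1/2) x^2,  A1 = multiplication by x.
X = P([0.0, 1.0])

def A0(p): return 0.5 * p.deriv(2) - 0.5 * X**2 * p
def A1(p): return X * p
def Dx(p): return p.deriv()

def bracket(A, B):
    return lambda p: A(B(p)) - B(A(p))

def is_zero(p):
    return np.allclose(p.coef, 0.0, atol=1e-12)

polys = [P([1.0]), P([0.0, 1.0, 2.0]), P([3.0, 0.0, 0.0, 1.0, 1.0])]
# The brackets close: [A0, A1] = d/dx, [A0, d/dx] = x, [d/dx, A1] = 1,
# so repeated bracketing never leaves span{A0, x, d/dx, 1}.
for p in polys:
    assert is_zero(bracket(A0, A1)(p) - Dx(p))
    assert is_zero(bracket(A0, Dx)(p) - X * p)
    assert is_zero(bracket(Dx, A1)(p) - p)
```

By contrast, replacing $h(x) = x$ with $h(x) = x^3$ (the "cubic sensor") makes the generated algebra infinite-dimensional, which is the situation addressed by SUSSMANN [1981].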

9. Final Remarks.

There are other topics of interest, and the theory represented by
the second point of view is far from complete.
(i) Asymptotic expansions: see BLANKENSHIP-LIU-MARCUS [1982].
Much work remains to be done here.
(ii) Lower bounds on nonlinear filtering: see the article of
BOBROVSKY-ZAKAI in HAZEWINKEL-MARCUS [1981] (and the
bibliography cited there). The stochastic control interpre-
tation of FLEMING-MITTER [1982] should be important here.
(iii) Filtering on manifolds: see DUNCAN [1977] for a beginning.
(iv) Smoothness of densities: see KUNITA [1982].
(v) For a partial solution to the vector "cubic-sensor" problem,
see DELFOUR-MITTER [1982].

REFERENCES

1. Allinger, D., and Mitter, S.K., [1981] New Results on the Innovations
Problem of Nonlinear Filtering, Stochastics, vol. 4, pp. 339-348.

2. Benes, V.E., [1981] Exact Finite Dimensional Filters for Certain
Diffusions with Nonlinear Drift, Stochastics, vol. 5, pp. 65-92.

3. Bismut, J.M., [1981] Mecanique Aleatoire, Springer Lecture Notes
in Mathematics, vol. 866.

4. Blankenship, G.L., Liu, C., and Marcus, S., [1982] Asymptotic
Expansions and Lie Algebras for Some Nonlinear Filtering Problems,
to appear, IEEE Trans. on Auto. Control.

5. Brockett, R.W., [1981] Paper in Hazewinkel-Willems, loc. cit.

6. Clark, J.M.C., [1978] The Design of Robust Approximations to the
Stochastic Differential Equations of Nonlinear Filtering, in
Communication Systems and Random Process Theory, ed. J.K.
Skwirzynski, Sijthoff and Noordhoff.

7. CIME Lectures on Nonlinear Filtering and Stochastic Control [1982],
to be published as Springer Lecture Notes, eds. S.K. Mitter and A. Moro.

8. CIME Lectures on Nonlinear Filtering and Stochastic Control [1982],
to be published as Springer Lecture Notes, ed. A. Moro.

9. Davis, M.H.A., [1980] On a Multiplicative Functional Transformation
Arising in Nonlinear Filtering Theory, Z. Wahr. verw. Gebiete, 54,
pp. 125-139.

10. Delfour, M.C., and Mitter, S.K., [1982] to appear.

11. Duncan, T.E., [1967] Doctoral Dissertation, Department of Electrical
Engineering, Stanford University.

12. Duncan, T.E., [1977] Some Filtering Results on Riemannian Manifolds,
Information and Control, 35, pp. 182-195.

13. Feynman, R.P., and Hibbs, A.R., [1965] Quantum Mechanics and Path
Integrals, McGraw Hill, New York.

14. Fleming, W.H., and Mitter, S.K., [1982] Optimal Control and Nonlinear
Filtering for Nondegenerate Diffusion Processes, to appear,
Stochastics.

15. Fujisaki, M., Kallianpur, G., and Kunita, H., [1972] Stochastic
Differential Equations for the Nonlinear Filtering Problem, Osaka
J. of Mathematics, vol. 9, pp. 19-40.

16. Hazewinkel, M., and Marcus, S., [1982] On Lie Algebras and Finite
Dimensional Filtering, to appear in Stochastics.

17. Hazewinkel, M., and Willems, J.C., eds., [1982] Stochastic Systems:
The Mathematics of Filtering and Identification and Applications,
D. Reidel Publishing Company.

18. Hijab, O., [1980] Minimum Energy Estimation, Doctoral Dissertation,
University of California, Berkeley.

19. Igusa, J.I., [1981] On Lie Algebras Generated by Two Differential
Operators, in Lie Groups and Manifolds, pp. 187-195, Birkhauser,
Boston.

20. Kac, M., [1951] On Some Connections Between Probability Theory and
Differential and Integral Equations, Proc. 2nd Berkeley Symposium
Math. Stat. and Prob., pp. 189-215.

21. Kallianpur, G., [1980] Stochastic Filtering Theory, Applications of
Mathematics, vol. 13, Springer-Verlag.

22. Kunita, H., [1982] Stochastic Partial Differential Equations


Connected with Nonlinear Filtering, to appear in Proceedings on
CIME School on Nonlinear Filtering and Stochastic Control.

23. Liptser R.S., and Shiryayev, A.N., [1977] Statistics of Random


Processes, Springer-Verlag, New York.

24. Mitter, S.K., [1980] On the Analogy Between Mathematical Problems
of Nonlinear Filtering and Quantum Physics, Ricerche di Automatica,
vol. 10, no. 2, pp. 163-216.

25. Mitter, S.K., [1981] Non-linear Filtering and Stochastic Mechanics,
in Stochastic Systems: The Mathematics of Filtering and Identifica-
tion and Applications, eds. M. Hazewinkel and J.C. Willems, pp.
479-503, D. Reidel Publishing Company.

26. Mitter, S.K., [1982] Lectures Given at CIME School on Nonlinear
Filtering and Stochastic Control, Cortona, Italy, July 1981. Pro-
ceedings to be published by Springer-Verlag.

27. Mortensen, R.E., [1966] Doctoral Dissertation, Department of
Electrical Engineering, University of California, Berkeley, Calif.

28. Ocone, D., [1980] Topics in Nonlinear Filtering Theory, Doctoral
Dissertation, M.I.T.

29. Ocone, D., Baras, J.S., and Marcus, S., [1982] Explicit Filters for
Certain Diffusions with Nonlinear Drift, to appear, Stochastics.

30. Pardoux, E., [1981] The Solution of the Nonlinear Filtering Equation
as a Likelihood Function, Proceedings of 20th CDC Conference,
San Diego, California.

31. Pardoux, E., [1982] Equations du filtrage non lineaire, de la
prediction et du lissage, to appear in Stochastics.

32. Ray, D.B., [1954] On Spectra of Second Order Differential Operators,
Trans. Am. Math. Soc., 77, pp. 299-321.

33. Sussmann, H., [1981] Rigorous Results on the Cubic Sensor Problem
in WILLEMS-HAZEWINKEL, loc. cit.
Note on uniqueness of semigroup associated

with Bellman operator

Makiko Nisio

Department of Mathematics, Kobe U n i v e r s i t y

Kobe, 657, Japan

1. Introduction. As a mathematical formulation of the Bellman principle,
we define a nonlinear semigroup, using the value function of stochastic
optimal control [2, 4, 7]. The generator of this semigroup is the
Bellman operator. The purpose of this note is to consider the unique-
ness of the semigroup whose generator is the Bellman operator, appealing to re-
sults on integral solutions (Benilan solutions) [1] of the Cauchy problem.

First we recall our nonlinear semigroup. Let $\Gamma$ be a compact
convex subset of $\mathbb{R}^k$, called a control region. $B(t)$, $t \geq 0$, denotes an
$n$-dimensional Brownian motion on a probability space $(\Omega, F, P)$. Any
$\Gamma$-valued, $\sigma_t(B)$-progressively measurable process is called an admissible
control. $\mathcal{U}$ denotes the totality of admissible controls. For $U \in \mathcal{U}$,
we consider the following controlled stochastic differential equation

(1.1) $dX(t) = \alpha(X(t), U(t))\,dB(t) + \gamma(X(t), U(t))\,dt$,
$X(0) = x \ (\in \mathbb{R}^n)$.

We assume that the $n \times n$ symmetric non-negative definite matrix $\alpha(x, u)$
and the $n$-vector $\gamma(x, u)$ satisfy the following conditions (A1) and (A2),
with $h$ standing for $\alpha$ and $\gamma$:

(A1) $|h(x, u)| \leq b$ for any $x$, $u$;

(A2) $|h(x, u) - h(y, v)| \leq K|x - y| + \rho(|u - v|)$,

where $b$ and $K$ are constants and $\rho$ is continuous on $[0, \infty)$ with
$\rho(0) = 0$. Then there exists a unique solution $X(t) = X(t; x, U)$,

called the response for $U$. The value function is defined as follows:

(1.2) $S(t, x, \phi) = \sup_{U \in \mathcal{U}} E_x\left[\int_0^t e^{-\int_0^s c(X(\theta), U(\theta))\,d\theta}\,f(X(s), U(s))\,ds + e^{-\int_0^t c(X(s), U(s))\,ds}\,\phi(X(t))\right],$

where $X(t)$ is the response for $U$ and $x$ its starting point. Now
we assume (A3):

(A3) $c \geq 0$ and $f$ satisfy (A1) and (A2).

Let $C$ be the Banach lattice of bounded and uniformly continuous
functions on $\mathbb{R}^n$, with the supremum norm and the usual order. Then
we can easily see that $S(t, \cdot, \phi) \in C$ whenever $\phi \in C$. Moreover
the operator $S(t): C \to C$, defined by

(1.3) $S(t)\phi(x) = S(t, x, \phi)$,

has the following properties:

(i) order preserving: $\phi \leq \psi \Rightarrow S(t)\phi \leq S(t)\psi$;

(ii) contraction: $\|S(t)\phi - S(t)\psi\| \leq \|\phi - \psi\|$;

(iii) continuity: $\|S(t)\phi - S(\theta)\phi\| \to 0$ as $t - \theta \to 0$;

(iv) semigroup: $S(t + \theta) = S(t)S(\theta)$;

(v) generator: $\mathcal{D}(\mathcal{G}) = \{\phi \in C: \tfrac{1}{t}\,(S(t)\phi - \phi)$ converges in norm as $t \downarrow 0\} \supset C^2$,
where $C^2 = \{\phi \in C;\ \frac{\partial \phi}{\partial x_i},\ \frac{\partial^2 \phi}{\partial x_i \partial x_j} \in C,\ i, j = 1, \ldots, n\}$, and

$\mathcal{G}\phi = \sup_{u \in \Gamma}\,\left[L^u \phi + f(x, u)\right]$, for $\phi \in C^2$,

where $L^u = \sum_{ij} a_{ij}(x, u)\,\frac{\partial^2}{\partial x_i \partial x_j} + \sum_i \gamma_i(x, u)\,\frac{\partial}{\partial x_i} - c(x, u)I$,
$a = \tfrac{1}{2}\,\alpha\tilde{\alpha}$ ($\tilde{\alpha}$ the transpose of $\alpha$), and $I$ = identity;

(vi) define $G^u$ by $G^u\phi = L^u\phi + f(\cdot, u)$, and let $T_u(t)$, $t \geq 0$, be the
semigroup on $C$ with generator $G^u$; then

(1.4) $T_u(t)\phi \leq S(t)\phi$ for any $t$, $\phi$, $u$.

(Hereafter semigroup means strongly continuous semigroup.)

(vii) If a semigroup $A(t)$, $t \geq 0$, on $C$ satisfies (1.4) (with $A$ in
place of $S$), then $S(t)\phi \leq A(t)\phi$ for any $t$, $\phi$.
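Properties (i) and (ii) survive any reasonable discretization of the value function. The sketch below is our own illustration (a one-dimensional Markov-chain approximation of the controlled diffusion with a finite control grid; the function name and coefficients are not from the note); it performs one dynamic-programming step, from which order preservation and the contraction property can be observed numerically.

```python
import numpy as np

def semigroup_step(V, x, dx, dt, controls, alpha, gamma, c, f):
    """One explicit dynamic-programming step approximating S(dt)V:
    for each control u, propagate V through a trinomial (Markov chain)
    approximation of the controlled diffusion, discount by c, add the
    running reward f, then take the supremum over the control grid."""
    best = np.full_like(V, -np.inf)
    for u in controls:
        a = alpha(x, u) ** 2                      # diffusion coefficient
        g = gamma(x, u)                           # drift
        up = 0.5 * a * dt / dx**2 + np.maximum(g, 0.0) * dt / dx
        dn = 0.5 * a * dt / dx**2 + np.maximum(-g, 0.0) * dt / dx
        EV = (1.0 - up - dn) * V + up * np.roll(V, -1) + dn * np.roll(V, 1)
        best = np.maximum(best, f(x, u) * dt + (1.0 - c(x, u) * dt) * EV)
    return best

# Illustrative data: alpha = 0.5, drift equal to the control, c = f = 0.
x = np.arange(-2.0, 2.0, 0.1)
args = (x, 0.1, 0.005, [-1.0, 0.0, 1.0],
        lambda x, u: 0.5 + 0 * x, lambda x, u: u + 0 * x,
        lambda x, u: 0 * x, lambda x, u: 0 * x)
phi = np.cos(x)
psi = phi + 0.3
S_phi = semigroup_step(phi, *args)
S_psi = semigroup_step(psi, *args)
```

With $c = f = 0$ and $\psi = \phi + 0.3$, the transition probabilities sum to one for each control, so the step shifts by exactly $0.3$: order preservation and the contraction bound hold with equality.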

In §2 we investigate some properties of integral solutions, and
in §3 we apply these results to S(t) and show that, under some con-
ditions, S(t), t ≥ 0, is the unique contraction semigroup whose generator
is an extension of the Bellman operator [Theorem 3].

2. Integral solution. In this section we state some properties of

integral solution, which are useful in §3.

Let $A$ be a densely defined, single-valued, strictly dissipative op-
erator in $C$, namely $\overline{\mathcal{D}(A)} = C$ and $A: \mathcal{D}(A) \to C$ satisfies

(2.1) $\tau(A\phi - A\psi,\ \phi - \psi) \leq 0$ for any $\phi, \psi \in \mathcal{D}(A)$,

where $\tau$ is the tangential functional, that is,

(2.2) $\tau(g, h) = \lim_{\lambda \downarrow 0} \frac{1}{\lambda}\,\left(\|h + \lambda g\| - \|h\|\right) = \inf_{\lambda > 0} \frac{1}{\lambda}\,\left(\|h + \lambda g\| - \|h\|\right).$

Hereafter dissipative means strictly dissipative.

Definition. $V: [0, \infty) \to C$ is called an integral solution of the Cauchy
problem

(2.3) $\frac{dV(t)}{dt} = AV(t), \qquad V(0) = \phi\ (\in C),$

if $V(t)$ is continuous in $t$, $V(0) = \phi$, and $V$ satisfies the following
inequality

(2.4) $\|V(t) - \psi\| - \|V(s) - \psi\| \leq \int_s^t \tau(A\psi,\ V(\theta) - \psi)\,d\theta$

for any $t \geq s$ and $\psi \in \mathcal{D}(A)$.

Hence an integral solution may not be differentiable. An integral
solution is weaker than a solution (= $V$ is Lipschitz continuous, strong-
ly differentiable at a.a. $t$ and satisfies (2.3) at a.a. $t$).

Proposition 1. Let $A(t)$, $t \geq 0$, be a contraction semigroup on $C$
whose generator $\tilde{A}$ is an extension of $A$, that is, $\mathcal{D}(\tilde{A}) \supset \mathcal{D}(A)$
and $\tilde{A} = A$ on $\mathcal{D}(A)$. Then $A(t)\phi$ is an integral solution of (2.3).

Proof. Putting $V(t) = A(t)\phi$, we will show (2.4) according to [6].
Define $A_h$ by

(2.5) $A_h = \frac{1}{h}\,(A(h) - I), \qquad \mathcal{D}(A_h) = C.$

Since $A(h)$ is a contraction, $A_h$ is dissipative. Moreover there ex-
ists a unique semigroup $A_h(t)$ on $C$ with generator $A_h$. Further-
more $A_h(\cdot)\phi$ is continuously differentiable on $(0, \infty)$ and $A_h(t)\phi$
is the unique solution of the Cauchy problem

(2.6) $\frac{dW(t)}{dt} = A_h W(t), \qquad W(0) = \phi.$

Since $A_h(t)\phi$ turns out to be an integral solution of (2.6), we have

(2.7) $\|A_h(t)\phi - \psi\| - \|A_h(s)\phi - \psi\| \leq \int_s^t \tau(A_h\psi,\ A_h(\theta)\phi - \psi)\,d\theta$
$\leq \int_s^t \frac{1}{\lambda}\,\left(\|A_h(\theta)\phi - \psi + \lambda A_h\psi\| - \|A_h(\theta)\phi - \psi\|\right) d\theta$, for $\lambda > 0$.

For $\psi \in \mathcal{D}(A)$,

(2.8) $\|A_h\psi - A\psi\| \to 0$, as $h \downarrow 0$.

Hence we see

(2.9) $|\text{integrand of right side of (2.7)}| \leq \|A_h\psi\| \leq \|A\psi\| + 1$ for small $h$.

On the other hand, as $h \downarrow 0$,

(2.10) $\|A_h(\theta)\phi - A(\theta)\phi\| \to 0$

uniformly on any bounded time interval. Therefore, letting $h$ tend to $0$,
we have, from (2.7) - (2.10),

(2.11) $\|A(t)\phi - \psi\| - \|A(s)\phi - \psi\| \leq \int_s^t \frac{1}{\lambda}\,\left(\|A(\theta)\phi - \psi + \lambda A\psi\| - \|A(\theta)\phi - \psi\|\right) d\theta.$

Letting $\lambda$ tend to $0$, the bounded convergence theorem yields (2.4).
This completes the proof.

Next we recall the f o l l o w i n g proposition,

Proposition 2 [3, 6]. Suppose the condition (2.12):

(2.12) $\lim_{\lambda \downarrow 0} \frac{1}{\lambda}\,\mathrm{dist}\left(R(I - \lambda A),\ \phi\right) = 0$, for any $\phi \in C$,

where $R$ means the range. Then we have:

(i) for any $\phi \in C$, an integral solution $V(t; \phi)$ of (2.3) exists
uniquely;

(ii) the operator $W(t)$, defined by $W(t)\phi = V(t; \phi)$, becomes the unique
semigroup which provides an integral solution of (2.3).

From Propositions 1 and 2, we see

Proposition 3. Under condition (2.12), a semigroup whose generator
is an extension of $A$ is unique, if it exists.

3. Semigroup S(t). Put $G\phi = \sup_{u \in \Gamma}\,[L^u\phi + f(x, u)]$ and

(3.1) $D = \{\phi \in W_\infty^2 \cap C;\ G\phi$ has a version in $C\}.$

Since the coefficients $\alpha$, $\gamma$ and $c$ are continuous in $u$, $G\phi$ can be
defined for $\phi \in W_\infty^2$, if versions of $\frac{\partial \phi}{\partial x_i}$ and $\frac{\partial^2 \phi}{\partial x_i \partial x_j}$ are fixed inde-
pendent of $u$. Since $D \supset C^2$ and $C^2$ is dense in $C$, $D$ is dense in
$C$. $G = G|_D$ can be regarded as a mapping from $D$ into $C$.

Theorem 1. Under (A1) - (A3), the generator $\mathcal{G}$ of $S(t)$ satisfies
the following conditions:

(3.2) $\mathcal{D}(\mathcal{G}) \supset D$ and $\mathcal{G} = G$ on $D$,

namely $\mathcal{G}$ is an extension of $G$.

Lemma. Suppose that (A1) - (A3) and (A4) hold:

(A4) there exists $\mu > 0$ such that

(3.3) $\sum_{ij} a_{ij}(x, u)\,e_i e_j \geq \mu |e|^2$ for any $x$, $e$, $u$.

Let $\phi \in W_\infty^2 \cap C$. If $G\phi = g + h$ with $g \in C$ and $h \in L_\infty$, then, for
$\lambda > 0$, there exists $t_0 = t_0(\lambda, b, K, g) > 0$, independent of $\phi$ and
$h$, such that

(3.4) $\left\|\tfrac{1}{t}\,(S(t)\phi - \phi) - g\right\| \leq \|h\|_\infty + \lambda$, whenever $t < t_0$.

Proof. By (A4), Ito's formula [4] tells us

(3.5) $S(t)\phi(x) - \phi(x) = \sup_{U \in \mathcal{U}} E_x \int_0^t e^{-\int_0^s c(X(\theta), U(\theta))\,d\theta}\,\left(f(X(s), U(s)) + L^{U(s)}\phi(X(s))\right) ds$
$\leq \sup_{U \in \mathcal{U}} E_x \int_0^t e^{-\int_0^s c(X(\theta), U(\theta))\,d\theta}\,G\phi(X(s))\,ds$
$\leq \sup_{U \in \mathcal{U}} E_x \int_0^t e^{-\int_0^s c(X(\theta), U(\theta))\,d\theta}\,g(X(s))\,ds + t\,\|h\|_\infty.$

Since $g \in C$, we see that, as $t \downarrow 0$, $\frac{1}{t}\,E_x \int_0^t e^{-\int_0^s c(X(\theta), U(\theta))\,d\theta}\,g(X(s))\,ds$ con-
verges to $g(x)$ uniformly in $x$ and $U$. Moreover, by a routine method,
we can take $t_1 = t_1(\lambda, b, K, g) > 0$ which is independent of $\phi$ and
$h$, so that

(3.6) $\frac{1}{t}\,\left(S(t)\phi(x) - \phi(x)\right) - g(x) \leq \|h\|_\infty + \lambda$, for $x \in \mathbb{R}^n$, $t < t_1$.

On the other hand, by the measurable selection theorem we can take a $\Gamma$-
valued Borel function $u(\cdot)$ such that

(3.7) $G\phi(x) = L^{u(x)}\phi(x) + f(x, u(x))$ for any $x$.

The stochastic differential equation

$d\xi(t) = \alpha(\xi(t), u(\xi(t)))\,dB(t) + \gamma(\xi(t), u(\xi(t)))\,dt$, $\xi(0) = x$,

has a weak solution by (A4) [4]. Although $u(\xi(t))$ may not be $\sigma_t(B)$-
measurable, we can see [7]

$S(t)\phi(x) \geq E_x\left[\int_0^t e^{-\int_0^s c(\xi(\theta), u(\xi(\theta)))\,d\theta}\,f(\xi(s), u(\xi(s)))\,ds + e^{-\int_0^t c(\xi(\theta), u(\xi(\theta)))\,d\theta}\,\phi(\xi(t))\right].$

Hence Ito's formula implies

(3.8) $S(t)\phi(x) - \phi(x) \geq E_x \int_0^t e^{-\int_0^s c(\xi(\theta), u(\xi(\theta)))\,d\theta}\,\left(f(\xi(s), u(\xi(s))) + L^{u(\xi(s))}\phi(\xi(s))\right) ds$
$= E_x \int_0^t e^{-\int_0^s c(\xi(\theta), u(\xi(\theta)))\,d\theta}\,G\phi(\xi(s))\,ds$
$\geq E_x \int_0^t e^{-\int_0^s c(\xi(\theta), u(\xi(\theta)))\,d\theta}\,g(\xi(s))\,ds - t\,\|h\|_\infty.$

Therefore there exists $t_2 = t_2(\lambda, b, K, g)$, independent of $\phi$ and
$h$, such that

(3.9) $\frac{1}{t}\,\left(S(t)\phi(x) - \phi(x)\right) - g(x) \geq -\|h\|_\infty - \lambda$, for $x \in \mathbb{R}^n$, $t < t_2$.

Combining (3.6) and (3.9), we conclude the Lemma.

Proof of Theorem 1. $\alpha^\varepsilon = \alpha + \varepsilon I$ satisfies (A4). When we use $\alpha^\varepsilon$
instead of $\alpha$, we put a superscript $\varepsilon$. Consider the controlled stochastic
differential equations

(3.10) $dX(t) = \alpha(X(t), U(t))\,dB(t) + \gamma(X(t), U(t))\,dt$, $X(0) = x$,

and

(3.11) $dX^\varepsilon(t) = \alpha^\varepsilon(X^\varepsilon(t), U(t))\,dB(t) + \gamma(X^\varepsilon(t), U(t))\,dt$, $X^\varepsilon(0) = x$.

Applying the routine method, we can see that, for $T > 0$, there exists
$K_1 = K_1(T)$, independent of $\varepsilon$, $x$ and $U$, such that

$E|X^\varepsilon(t) - X(t)|^2 \leq \varepsilon K_1 t$ for $t < T$.

Therefore we have, as $\varepsilon \downarrow 0$,

(3.12) $\|S^\varepsilon(t)\phi - S(t)\phi\| \to 0$, uniformly on $[0, T]$, $(\forall T > 0)$.

For $\phi \in D$, $G\phi$ has a version in $C$ and

(3.13) $|G^\varepsilon\phi(x) - G\phi(x)| \leq \varepsilon\,\sup_{u \in \Gamma}\Big|\sum_{ij} \alpha_{ij}(x, u)\,\frac{\partial^2 \phi}{\partial x_i \partial x_j}(x)\Big| + \frac{\varepsilon^2}{2}\,|\Delta\phi(x)|$, a.e.

Hence, with some constant $d = d(\alpha, \phi) > 0$,

(3.14) $\|G^\varepsilon\phi - G\phi\|_\infty \leq d\varepsilon.$

Now we apply the Lemma to $G^\varepsilon\phi$. $\alpha^\varepsilon$ satisfies (A4) and we can suppose that
$G\phi \in C$. Hence, for $\varepsilon' > 0$, there exists $t_0 = t_0(\varepsilon', b, K, G\phi)$ such
that

(3.15) $\left\|\tfrac{1}{t}\,(S^\varepsilon(t)\phi - \phi) - G\phi\right\| \leq d\varepsilon + \varepsilon'$, for $t < t_0$ and $\varepsilon < 1$.

From (3.12) and (3.15), we get

(3.16) $\left\|\tfrac{1}{t}\,(S(t)\phi - \phi) - G\phi\right\| \leq \varepsilon'$, for $t < t_0$.

This means that $\phi \in \mathcal{D}(\mathcal{G})$ and $\mathcal{G}\phi = G\phi$.

For the condition (2.12) we introduce (A5) and (A6), according to
[5].

(A5) $\alpha_{ij}$, $\gamma_i$, $c$ and $f$ are twice continuously differentiable and,
putting $h = \alpha_{k\ell}, \gamma_k, c, f$,

$\sup_u \left[\Big|\frac{\partial^2 h(x, u)}{\partial x_i \partial x_j}\Big| + \Big|\frac{\partial h(x, u)}{\partial x_i}\Big|\right] < \infty, \quad i, j = 1, \ldots, n;$

(A6) there exists a constant $\nu > 0$ such that for all $x \in \mathbb{R}^n$ there
are an integer $m$ and
$u_1, \ldots, u_m \in \Gamma$, $e_1, \ldots, e_m \in [0, 1]$, $\sum_{i=1}^m e_i = 1$, such that

$\sum_{ij} \sum_k e_k\,a_{ij}(x, u_k)\,\xi_i \xi_j \geq \nu |\xi|^2$, for $\forall \xi \in \mathbb{R}^n$.

Theorem 2. Under (A1)(A2)(A3)(A5) and (A6), $R(I - \lambda G)$ is dense in $C$,
for small $\lambda > 0$.

Proof. Consider the Bellman equation, for $\phi \in C^2$:

(3.17) $\sup_{u \in \Gamma}\,[L^u W(x) + f(x, u)] - \tfrac{1}{\lambda}\,W(x) + \tfrac{1}{\lambda}\,\phi(x) = 0$, a.e.

For small $\lambda$ (depending on $\alpha$ and $\gamma$), (3.17) has a solution $W \in W_\infty^2 \cap C$
[5]. Therefore $GW = \tfrac{1}{\lambda}\,(W - \phi)$ a.e. Since $W - \phi \in C$, $GW$ has a ver-
sion in $C$. So $W \in D$. This means $\phi \in R(I - \lambda G)$. Since $C^2$ is
dense, this completes the proof.

Appealing to Propositions 1 - 3, we have

Theorem 3. (i) Under (A1) - (A3), $S(t)\phi$ is an integral solution of
the Cauchy problem in $C$,

$\frac{dV(t)}{dt} = GV(t), \qquad V(0) = \phi.$

(ii) Under (A1)(A2)(A3)(A5) and (A6), $S(t)$, $t \geq 0$, is the unique con-
traction semigroup on $C$ whose generator is an extension of $G\ (= G|_D)$.

4. Optimal stopping. Put $\mathfrak{M}(t) = \{B$-stopping times $\tau \leq t\}$. For $U \in \mathcal{U}$
and $\tau \in \mathfrak{M}(t)$ we have the motion $X$ of (1.1) up to the stopping time $\tau$.
So, the value function $S^*$ is defined as follows:

(4.1) $S^*(t, x, \phi) = \sup_{U \in \mathcal{U},\ \tau \in \mathfrak{M}(t)} E_x\left[\int_0^\tau e^{-\int_0^s c(X(\theta), U(\theta))\,d\theta}\,f(X(s), U(s))\,ds + e^{-\int_0^\tau c(X(s), U(s))\,ds}\,\phi(X(\tau))\right].$

We can easily see that $S^*(t, \cdot, \phi) \in C$ whenever $\phi \in C$. More-
over the operator $S^*(t): C \to C$, defined by

(4.2) $S^*(t)\phi(x) = S^*(t, x, \phi)$,

is an order-preserving contraction semigroup on $C$, whose generator
$\mathcal{G}^*$ is given by

(4.3) $\mathcal{G}^*\phi = \max\left\{0,\ \sup_{u \in \Gamma}\,[L^u\phi + f(x, u)]\right\}$, for $\phi \in C^2$.

Using the same method as in §3, we can get similar results. That
is, define $G^*$ by

(4.4) $G^*\phi = \max(0, G\phi)$ for $\phi \in D$.

Then $\mathcal{G}^*$ is an extension of $G^*$ under the conditions (A1) - (A3).

Theorem 4. (i) Under (A1) - (A3), $S^*(t)\phi$ is an integral solution
of the Cauchy problem in $C$,

(4.5) $\frac{dV(t)}{dt} = G^* V(t), \qquad V(0) = \phi.$

(ii) Under (A1)(A2)(A3)(A5) and (A6), $S^*(t)$ is the unique contraction
semigroup on $C$ whose generator is an extension of $G^*$.
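The discrete analogue of $G^* = \max(0, G)$ is a Snell-envelope iteration: at each step the controller either continues optimally or stops and collects $\phi$ immediately. The sketch below is our own minimal illustration (one-dimensional grid, $c = 0$ and $f = 0$ for brevity, finite control set; none of the names or coefficients come from the note).

```python
import numpy as np

def stopping_step(V, phi, x, dx, dt, controls, alpha, gamma):
    """One dynamic-programming step for the control-and-stopping problem
    (c = 0, f = 0): compute the one-step continuation value under the
    best control via a trinomial Markov-chain approximation, then take
    the maximum with stopping immediately and collecting phi."""
    best = np.full_like(V, -np.inf)
    for u in controls:
        a = alpha(x, u) ** 2
        g = gamma(x, u)
        up = 0.5 * a * dt / dx**2 + np.maximum(g, 0.0) * dt / dx
        dn = 0.5 * a * dt / dx**2 + np.maximum(-g, 0.0) * dt / dx
        EV = (1.0 - up - dn) * V + up * np.roll(V, -1) + dn * np.roll(V, 1)
        best = np.maximum(best, EV)
    return np.maximum(phi, best)          # option to stop now

x = np.arange(-2.0, 2.0, 0.1)
phi = np.maximum(1.0 - x**2, 0.0)         # reward collected on stopping
V1 = stopping_step(phi, phi, x, 0.1, 0.005, [-1.0, 1.0],
                   lambda x, u: 0.5 + 0 * x, lambda x, u: u + 0 * x)
V2 = stopping_step(V1, phi, x, 0.1, 0.005, [-1.0, 1.0],
                   lambda x, u: 0.5 + 0 * x, lambda x, u: u + 0 * x)
```

Two structural properties of $S^*$ are visible immediately: $S^*(t)\phi \geq \phi$ (one may always stop at once), and $t \mapsto S^*(t)\phi$ is nondecreasing, by the semigroup property combined with order preservation.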

References

1. Ph. Benilan, Equation d'evolution dans un espace de Banach quelconque
et applications, These, Orsay, 1972.

2. A. Bensoussan and J. L. Lions, Applications des Inequations Varia-
tionnelles en Controle Stochastique, Dunod, Paris, 1978.

3. Y. Kobayashi, Difference approximation of evolution equation and
generation of nonlinear semigroups, Proc. Japan Acad., 51 (1975),
408-410.

4. N. V. Krylov, Controlled Diffusion Processes, Appl. Math. 14,
Springer Verlag, 1980. (Orig. Russian 1977).

5. P. L. Lions, Control of diffusion processes in R^n, Comm. Pure &
Appl. Math., 34 (1981), 121-147.

6. I. Miyadera, Hisenkei Hangun (Nonlinear semigroups), Kinokuniya,
1977 (Japanese).

7. M. Nisio, Stochastic Control Theory, ISI Lect. Notes 9, Macmillan
India, 1981.
PDE with random coefficients:
asymptotic expansion for the moments

R. BOUC+ - E. PARDOUX++

Introduction

Suppose $Z_t$ is a diffusion process with values in $\mathbb{R}^d$, and $\forall z \in \mathbb{R}^d$ we are
given an unbounded linear operator $A(z)$ on some Hilbert space $H$ (in practical
examples, $A(z)$ will be a PDE operator), such that the following Cauchy problem is
well-posed:

(0.1) $\frac{du_t}{dt} + A(Z_t)\,u_t = 0, \qquad u_0$ given in $H$.

In [2] it is shown, under appropriate conditions, that if $q(t, z)$ denotes
the solution of:

(0.2) $\frac{\partial q}{\partial t}(t, z) + A(z)\,q(t, z) = L^* q(t, z),$
$q(0, z) = u_0\,p_0(z),$

where $L^*$ is the adjoint of the infinitesimal generator of $Z_t$ and $p_0$ the density of
the law of $Z_0$, then $E[u_t] = \int q(t, z)\,dz$.

Thus the computation of the first moment of $u_t$ is reduced to the computation of
the solution of (0.2). Similar results apply for higher-order moments.

+ LMA-CNRS, 31 chemin Joseph Aiguier, 13274 Marseille Cedex 2.
++ UER de Mathematique, Universite de Provence,
3 place V. Hugo, 13331 Marseille Cedex 3.



As noted in [2], this result is not completely satisfactory, since the
dimension of the space variable in equation (0.2) will often be too large for
computing a numerical approximation. Indeed, if the space variable $x$ implicit in
equation (0.1) is of dimension $n$, the space variable $(x, z)$ of equation (0.2) is of
dimension $n + d$. This is the price one has to pay for the fact that (0.1) is non-
linear in $(u_t, Z_t)$. The above result exploits the linearity with respect to $u_t$.
Unfortunately, one is typically interested in cases where $d$ is large - the
limiting case $d = +\infty$ would correspond to a random field input $Z(t, x)$.

On the other hand, if $u_t$ is the solution of a bilinear Ito equation:

(0.3) $du_t + A u_t\,dt = B u_t\,dw_t,$

then the first moment $\bar{u}_t = E[u_t]$ solves the deterministic evolution equation:

(0.4) $\frac{d\bar{u}_t}{dt} + A \bar{u}_t = 0, \qquad \bar{u}_0 = E[u_0],$

whose space variable is of the same dimension as that of (0.3).
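A scalar instance makes the contrast concrete: for $du = -a\,u\,dt + b\,u\,dw$ the exact solution is known in closed form, and its Monte Carlo mean agrees with the solution $\bar{u}_t = u_0 e^{-at}$ of the deterministic mean equation, with no increase in state dimension. The constants below are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Scalar instance of (0.3)-(0.4): for du = -a u dt + b u dw (Ito),
# the mean m(t) = E[u_t] solves dm/dt = -a m, an equation of the same
# state dimension as the stochastic one.
a, b, t, u0 = 1.0, 0.8, 0.5, 2.0
w = np.sqrt(t) * rng.standard_normal(200_000)        # w_t ~ N(0, t)
u_t = u0 * np.exp((-a - 0.5 * b**2) * t + b * w)     # exact SDE solution
mc_mean = u_t.mean()
ode_mean = u0 * np.exp(-a * t)
```

The Ito correction $-\tfrac12 b^2 t$ in the exponent is exactly what makes the expectation of the exponential martingale equal to one, so the noise drops out of the first moment.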

The result of this paper is that if in some sense the noise entering (0.1)
is "wide-band", then $u_t$ is "close in law" to the solution of an equation similar
to (0.3). This is a natural generalisation of known results in finite dimension -
see e.g. BLANKENSHIP-PAPANICOLAOU [1], KUSHNER [5] - with the additional difficulty
that $Z_t$ takes values in a (non-compact) Euclidean space. Our main concern here is
to show how arbitrarily accurate (1) approximations to $E[u_t]$ can be computed,
solving only PDEs with space variable of the same dimension as that of (0.1).
Similar results would hold for higher order moments. The details of proofs of the
results announced here will be published elsewhere.

(1) at least in principle, since the coefficients in the equation become more
and more complicated when one wants to increase the accuracy.



1. The PDE with coefficients perturbed by wide-band noise

We replace (0.1) by:

(1.1) $\frac{du_t^\varepsilon}{dt} + A u_t^\varepsilon + \frac{1}{\varepsilon}\,B(Z_{t/\varepsilon^2})\,u_t^\varepsilon = 0, \qquad u_0^\varepsilon = u_0,$

where $\varepsilon$ is a "small parameter", and we now make precise the assumptions on $Z_t$, $A$ and $B$.

1.a The process $Z_t$

We suppose that Z t is a Markov diffusion process with values in IRd , whose

infinitesimal generator is given by :

1 32
L = ~ ~ aij(z) +Y
i,j ~z.~z. ibi(z) ~z.
x j x

We suppose that the coefficients $a_{ij}(\cdot)$ and $b_i(\cdot)$ are measurable and bounded (1), and that the following ellipticity condition is satisfied:

(1.2)  $\exists\,\beta > 0$ s.t. $\sum_{i,j} a_{ij}(z)\,\xi_i\,\xi_j \ge \beta\,|\xi|^2, \qquad \forall\,z,\,\xi \in \mathbb{R}^d$

We suppose moreover that the process $Z_t$ has an invariant probability measure, which has a density p, which then satisfies:

$L^* p = 0$

where $L^*$ is the formal adjoint of L. We suppose moreover that:

(1.3)  $p^{-1}\sum_j \dfrac{\partial}{\partial z_j}\big(a_{ij}\,p\big) \in L^\infty(\mathbb{R}^d), \qquad i = 1 \ldots d$

We add the following hypothesis (which is not necessary, but does simplify the exposition):

(1.4)  $p^{-1}\,L^*(p\,\cdot) = L$

We choose p as initial density. $Z_\cdot$ is then a stationary process. (1.4) implies that $Z_\cdot$ is time-reversible; in other words: $\forall\,t > 0$, $\forall\,\varphi$ bounded measurable from $\mathbb{R}^d$ into $\mathbb{R}$,

(1) All the results that follow would be true, with minor changes, in the case where the $b_i$'s are allowed to grow at most linearly in z.

$E[\varphi(Z_0)\ |\ Z_t = z] = E[\varphi(Z_t)\ |\ Z_0 = z]$

In the sequel, we will write $E_z$ for $E[\ \cdot\ |\ Z_0 = z]$.

I. b  The operators A and B

We suppose that A and B are PDE operators acting on functions of x, where $x \in \mathbb{R}^n$ (we consider (1.1) as an equation in the whole space, without boundary conditions). We denote by $|\cdot|$ the norm of $L^2(\mathbb{R}^n)$, by $\|\cdot\|$ the norm of $H^1(\mathbb{R}^n)$, and by $(\cdot,\cdot)$ and $\langle\cdot,\cdot\rangle$ the scalar product in $L^2(\mathbb{R}^n)$ and the pairing between $H^1(\mathbb{R}^n)$ and $H^{-1}(\mathbb{R}^n)$.

Let us assume that A is a second order elliptic operator, which belongs to $\mathcal{L}(H^1(\mathbb{R}^n),\,H^{-1}(\mathbb{R}^n))$, and s.t. $\exists\,\gamma > 0$ and $\lambda$ with:

(1.5)  $\langle A u, u\rangle + \lambda\,|u|^2 \ge \gamma\,\|u\|^2, \qquad \forall\,u \in H^1(\mathbb{R}^n)$

Moreover, $\forall\,z \in \mathbb{R}^d$, B(z) is a first order differential operator, and $B(\cdot) \in L^\infty(\mathbb{R}^d;\,\mathcal{L}(H^1(\mathbb{R}^n);\,L^2(\mathbb{R}^n)))$.

We assume that:

$B(z) + B^*(z) = \ell(\cdot,z)$

i.e. $\forall\,z \in \mathbb{R}^d$, $B(z) + B^*(z)$ is the multiplication by $\ell(x,z)$ - this is the case as soon as the coefficients of B have some regularity in x - and that:

(1.6)  $\ell \in L^\infty(\mathbb{R}^{n+d})$

Now the crucial hypothesis on B is:

(1.7)  $E[B(Z_t)] = 0$, i.e. $\int B(z)\,p(z)\,dz = 0$  (1)

We finally suppose that each coefficient of A and B is $C^\infty$ in x, all derivatives being bounded functions of x - resp. of (x,z).

II. Convergence in law

Under some additional hypotheses, which are essentially those that we will introduce in the next sections, one can show that $u^\varepsilon$ converges in law to u, the
(1) This means in fact that the same equality holds for each coefficient of the

PDE operator B, a.e. in x.



solution of the following Ito-type stochastic PDE:

$du_t + \Big[A - \int_0^\infty E\big(B(Z_s)B(Z_0)\big)\,ds\Big]\,u_t\,dt = dM_t(u)$

where $M_t(u)$ is a continuous $L^2(\mathbb{R}^n)$-valued martingale, such that $\forall\,\varphi \in L^2(\mathbb{R}^n)$, the quadratic variation process of the real-valued martingale $(M_t(u),\varphi)$ is:

$\langle (M_\cdot(u),\varphi)\rangle_t = 2\int_0^t ds \int_0^\infty E\big[(B(Z_0)u_s,\varphi)\,(B(Z_\theta)u_s,\varphi)\big]\,d\theta$

This result is analogous to known results in finite dimensions, see e.g. BLANKENSHIP-PAPANICOLAOU [1]. The difficulty due to the PDE can be taken care of by using techniques from VIOT [8]. We also need here the results in § IV and V below.

III  Expansion of the first moment in powers of ε : formal derivation

It follows from the result quoted in the introduction that:

$E[u^\varepsilon_t] = \int q^\varepsilon(t,z)\,dz$

where $q^\varepsilon$ solves:

$\dfrac{\partial q^\varepsilon}{\partial t} + A q^\varepsilon + \dfrac{1}{\varepsilon}\,B q^\varepsilon = \dfrac{1}{\varepsilon^2}\,L^* q^\varepsilon, \qquad q^\varepsilon(0) = p\,u_0$

It is convenient to divide the above equation by p, i.e. to consider $\bar v^\varepsilon(t,z)$ s.t.

$q^\varepsilon(t,z) = p(z)\,\bar v^\varepsilon(t,z)$

Using (1.4), we get that $\bar v^\varepsilon$ solves:

(3.1)  $\dfrac{\partial \bar v^\varepsilon}{\partial t} + A \bar v^\varepsilon + \dfrac{1}{\varepsilon}\,B \bar v^\varepsilon = \dfrac{1}{\varepsilon^2}\,L \bar v^\varepsilon, \qquad \bar v^\varepsilon(0) = u_0$

$\bar v^\varepsilon(t,z)$ is also given by:

$\bar v^\varepsilon(t,z) = E[\,u^\varepsilon_t\ |\ Z^\varepsilon_t = z\,]$

At that point, we need to introduce one complication. There are two time scales in our problem: t and $t/\varepsilon^2$. We now have to introduce the "fast time variable" explicitly, in order to treat the initial layer. That is, we introduce a new function $v^\varepsilon(t,\tau,z)$, such that:

$\bar v^\varepsilon(t,z) = v^\varepsilon(t,\,t/\varepsilon^2,\,z)$

From (3.1), $v^\varepsilon$ must satisfy:

(3.2)  $\dfrac{\partial v^\varepsilon}{\partial t} + A v^\varepsilon + \dfrac{1}{\varepsilon}\,B v^\varepsilon + \dfrac{1}{\varepsilon^2}\Big(\dfrac{\partial v^\varepsilon}{\partial \tau} - L v^\varepsilon\Big) = 0, \qquad v^\varepsilon(0,0,z) = u_0$

We do not claim that $v^\varepsilon$ is characterised by (3.2) - part of the boundary conditions are missing - but this does not matter, since (3.2) will be used only as a guide to guess the correct expansion. We will then go back to (3.1) in order to justify it.

Let us now write $v^\varepsilon$ as:

$v^\varepsilon(t,\tau,z) = \bar u^\varepsilon(t,\tau) + \tilde v^\varepsilon(t,\tau,z)$

with the constraint:

$\int \tilde v^\varepsilon(t,\tau,z)\,p(z)\,dz = 0$, so that:

$E[u^\varepsilon_t] = \int v^\varepsilon(t,\,t/\varepsilon^2,\,z)\,p(z)\,dz = \bar u^\varepsilon(t,\,t/\varepsilon^2)$

$\bar u^\varepsilon$ is the quantity of interest. Our problem comes from the fact that we cannot compute $\bar u^\varepsilon$ without computing the whole function $v^\varepsilon(t,\tau,z)$.
We are now going to look for an expansion of the form :

(3.3) vE(t,T,Z)= v°(t,T,z)+evl(t,T,z)+ e2v2(t,T,z)+...

and we write each v i as :

(3.4) v1(t,T,z)= ~l(t,T)+ Ol(t,T,z)

with the constraint :

s@i(t,Y,z)p(z)dz = o

so that :

u--e(t,T)= ~ o ( t , T ) + E ~ l ( t , T ) + £2 ~2(t,T)+ ...

The reason why this expansion is useful is that we will get "explicit"
')82

expressions for the ~l's, and the u1's will be solutions of PDEs with state

variable x.

We now replace $v^\varepsilon$ in (3.2) by the right hand side of (3.3), and equate to zero the coefficient of each power of ε. Considering the coefficients of $\varepsilon^{-2}$, $\varepsilon^{-1}$, $\varepsilon^{0}$ and $\varepsilon^{1}$, we get the equations:

(3.4)  $\dfrac{\partial v^0}{\partial \tau} - L v^0 = 0$

(3.5)  $\dfrac{\partial v^1}{\partial \tau} - L v^1 + B v^0 = 0$

(3.6)  $\dfrac{\partial v^2}{\partial \tau} - L v^2 + B v^1 + \dfrac{\partial v^0}{\partial t} + A v^0 = 0$

(3.7)  $\dfrac{\partial v^3}{\partial \tau} - L v^3 + B v^2 + \dfrac{\partial v^1}{\partial t} + A v^1 = 0$

These equations will be enough for deriving the equations satisfied by $\bar u^0$ and $\bar u^1$. Let us recall that what follows is not a rigorous derivation. We are going to "guess" from (3.4)...(3.7) the equations that the $\bar u^i$'s and $\tilde v^i$'s should satisfy. The justification will be made later on.

The following condition implies that (3.4) is satisfied:

(3.8)  $v^0(t,\tau,z) \equiv \bar u^0(t)$

which implies in particular that $\tilde v^0 = 0$.

Using (3.8), (3.5) becomes:

$\dfrac{\partial v^1}{\partial \tau} - L v^1 + B\,\bar u^0(t) = 0$

We multiply the above equation by p, and integrate with respect to z. We then get, using $L^* p = 0$ and (1.7):

$\dfrac{\partial \bar u^1}{\partial \tau} = 0$, i.e. $\bar u^1(t,\tau) = \bar u^1(t)$

We then deduce:

$\dfrac{\partial \tilde v^1}{\partial \tau} - L \tilde v^1 + B\,\bar u^0(t) = 0$

to which we associate the boundary condition:

$\tilde v^1(t,0,z) = 0$

Then $\tilde v^1$ is given in terms of $\bar u^0$ by:

(3.9)  $\tilde v^1(t,\tau,z) = -\Big(\int_0^\tau E_z[B(Z_s)]\,ds\Big)\,\bar u^0(t)$
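As a reading aid (our own intermediate computation, not in the original), integrating (3.9) against B and p produces the covariance term that drives the limit equation:

```latex
\int B(z)\,\tilde v^1(t,\tau,z)\,p(z)\,dz
 = -\int_0^\tau \Big( \int p(z)\,B(z)\,E_z[B(Z_s)]\,dz \Big) ds \;\bar u^0(t)
 = -\int_0^\tau E\big(B(Z_0)B(Z_s)\big)\,ds \;\bar u^0(t).
```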

We now consider (3.6):

$\dfrac{\partial v^2}{\partial \tau} - L v^2 + B v^1 + \dfrac{d\bar u^0}{dt} + A\,\bar u^0 = 0$

Again, multiplying by p and integrating against dz, we get:

(3.10)  $\dfrac{\partial \bar u^2}{\partial \tau} + \int B\,\tilde v^1\,p\,dz + \dfrac{d\bar u^0}{dt} + A\,\bar u^0 = 0$

We expect that all quantities should have limits for $\tau \to +\infty$, and that:

$\dfrac{\partial \bar u^2}{\partial \tau}(t,\infty) = 0$

We then get from (3.10), using (3.9):

(3.11)  $\dfrac{d\bar u^0}{dt}(t) + \Big[A - \int_0^\infty E\big(B(Z_s)B(Z_0)\big)\,ds\Big]\,\bar u^0(t) = 0, \qquad \bar u^0(0) = u_0$

(3.11) is the equation for the limit of $E[u^\varepsilon_t]$, which is also the expectation of the limit in law of $u^\varepsilon_t$. The difference between A and the operator in (3.11) can be interpreted as an Ito-Stratonovitch correcting term.
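As a scalar sanity check of our own (not from the paper): take n = d = 1, A = a constant, $B(z)u = b\,z\,u$, with $Z_t$ a stationary Ornstein-Uhlenbeck process ($dZ = -Z\,dt + \sqrt{2}\,dW$, so $E(Z_0 Z_s) = e^{-s}$ and $\int_0^\infty E(B(Z_s)B(Z_0))\,ds = b^2$). Equation (1.1) can then be solved path by path, and a Monte Carlo average of $u^\varepsilon_t$ should be close to $e^{-(a-b^2)t}$, the solution of (3.11) in this case:

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, eps, t = 1.0, 0.5, 0.1, 1.0
n_paths, dtau = 5000, 0.02
n_steps = round(t / eps**2 / dtau)        # fast-time horizon t / eps^2

# exact stationary OU update: Z_{k+1} = e^{-dtau} Z_k + sqrt(1 - e^{-2 dtau}) N(0,1)
z = rng.standard_normal(n_paths)
decay = np.exp(-dtau)
noise_sd = np.sqrt(1.0 - decay**2)
I = np.zeros(n_paths)                     # I ~ integral of Z over [0, t/eps^2]
for _ in range(n_steps):
    I += z * dtau
    z = decay * z + noise_sd * rng.standard_normal(n_paths)

# path-wise solution of du/dt + a u + (b/eps) Z_{t/eps^2} u = 0, u_0 = 1
mc = np.exp(-a * t - b * eps * I).mean()
averaged = np.exp(-(a - b**2) * t)        # solution of (3.11) in the scalar case
print(mc, averaged)
```

With ε = 0.1 the two values agree closely; the residual discrepancy is of order ε, i.e. the $\bar u^1$ term of the expansion.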

It follows from (3.10), (3.9) and (3.11):

$\dfrac{\partial \bar u^2}{\partial \tau} + \Big[\int_\tau^\infty E\big(B(Z_0)B(Z_s)\big)\,ds\Big]\,\bar u^0 = 0$

We now choose the boundary condition:

$\bar u^2(t,0) = 0$

We can make this choice because we do not really want to compute $\bar u^2$, but only $\bar u^0$ and $\bar u^1$. By direct integration, we get:

$\bar u^2(t,\tau) = -\Big[\int_0^\tau d\theta \int_\theta^\infty E\big(B(Z_0)B(Z_s)\big)\,ds\Big]\,\bar u^0(t)$

Subtracting (3.10) from (3.6), we get:

$\dfrac{\partial \tilde v^2}{\partial \tau} - L \tilde v^2 + B\,\bar u^1 + B\,\tilde v^1 - \int B\,\tilde v^1\,p\,dz = 0, \qquad \tilde v^2(t,0,z) = 0$

Using (3.9), we get the following expression for $\tilde v^2$:

$\tilde v^2(t,\tau,z) = -\Big[\int_0^\tau E_z\big(B(Z_s)\big)\,ds\Big]\,\bar u^1(t) + \Big[\int_0^\tau\!\!\int_\theta^\infty (E_z - E)\big(B(Z_\theta)B(Z_s)\big)\,ds\,d\theta\Big]\,\bar u^0(t)$

By a similar argument, using the above expressions for $\tilde v^1$, $\tilde v^2$ and $\bar u^2$, we get from (3.7):

(3.12)  $\dfrac{d\bar u^1}{dt}(t) + \Big[A - \int_0^\infty E\big(B(Z_s)B(Z_0)\big)\,ds\Big]\,\bar u^1(t) + \Big[\int_0^\infty\!\!\int_\theta^\infty E\big(B(Z_s)B(Z_\theta)B(Z_0)\big)\,ds\,d\theta\Big]\,\bar u^0(t) = 0, \qquad \bar u^1(0) = 0$

and one can express $\bar u^3$ and $\tilde v^3$ in terms of $\bar u^0$ and $\bar u^1$. Again, we do not compute the "true" value of $\tilde v^3$, but just a quantity which will be necessary for estimating $\bar u^\varepsilon(t) - \bar u^0(t) - \varepsilon\,\bar u^1(t)$.

IV  Ergodic properties of $\{Z_t\}$

We need some conditions to guarantee that the integrals appearing in (3.11) and (3.12) do converge. What we need is to make sense of

$\int_0^\infty E_z[\varphi(Z_t)]\,dt$

for φ bounded measurable from $\mathbb{R}^d$ into $\mathbb{R}$, such that:

(4.1)  $\int \varphi(z)\,p(z)\,dz = 0$

and to show that it is the limit, as $T \to +\infty$, of:

$\int_0^T E_z[\varphi(Z_t)]\,dt$

This will be true if, for some norm $\|\cdot\|$, (4.1) implies that $\exists\,\alpha > 0$ s.t.:

(4.2)  $\|E_\cdot[\varphi(Z_t)]\| \le \|\varphi\|\,e^{-\alpha t}$

Doeblin's condition (see e.g. PAPANICOLAOU [6]) would imply (4.2) with $\|\cdot\|$ equal to the sup norm. But Doeblin's condition cannot be true here, essentially because $Z_t$ takes values in the non compact set $\mathbb{R}^d$.

Following CAUGHEY-PAYNE [3], we now define the Hilbert space $\hat L^2 = L^2(\mathbb{R}^d;\,p(z)\,dz)$. We denote by $|\cdot|_\wedge$ and $(\cdot,\cdot)_\wedge$ the norm and scalar product on $\hat L^2$. Consider L as an unbounded operator on $\hat L^2$, with domain D(L). For $u \in D(L)$, one gets:

(4.3)  $(Lu,u)_\wedge = -\dfrac{1}{2}\sum_{i,j}\Big(a_{ij}\,\dfrac{\partial u}{\partial z_i}\,,\,\dfrac{\partial u}{\partial z_j}\Big)_{\wedge} \le 0$

It then follows from (1.4) that L is a selfadjoint operator on $\hat L^2$. 0 is an eigenvalue of L, the corresponding eigenvectors being the constants.

Define $\hat L^2_0 = \{u \in \hat L^2;\ \int u(z)\,p(z)\,dz = 0\}$ and $L_0$ to be the operator L considered as an unbounded operator on $\hat L^2_0$.

From (4.3), the spectrum of L, Sp(L), is contained in $\mathbb{R}^-$. Suppose that 0 is an isolated point in Sp(L). Then $\exists\,\alpha > 0$ such that $Sp(L_0) \subset\ ]-\infty, -\alpha]$, and (4.2) follows from (4.1), with $\|\cdot\|$ replaced by $|\cdot|_\wedge$. We now state:

Theorem 4.1  Suppose in addition to (1.2) that $\exists\,M$ and $c > 0$ s.t.

(4.4)  $b(z)\cdot\dfrac{z}{|z|} \le -c, \qquad \forall\,z \in \mathbb{R}^d,\ |z| \ge M$

Then 0 is an isolated point of Sp(L).  []

The proof of the Theorem uses a criterion due to H. Weyl (see RIESZ-NAGY [7]), which says that the conclusion of the theorem is equivalent to the non-existence of sequences $\{u_n\} \subset D(L)$ s.t.

$|u_n|_\wedge = 1$, $\quad u_n \to 0$ in $\hat L^2$ weakly, $\quad L u_n \to 0$ in $\hat L^2$ strongly

The same criterion shows that (4.4) is close to being a necessary condition for 0 to be an isolated point of Sp(L).

Remark: Suppose for simplicity that d = 1, and define the following unbounded operator on $L^2(\mathbb{R})$:

$A v = -\sqrt{p}\;L\Big(\dfrac{v}{\sqrt{p}}\Big)$

A has the same spectrum as $-L$ on $\hat L^2$. In the particular case where $a(z) = 2$, A is the Schrödinger operator $-\dfrac{d^2}{dz^2} + V$, where

$V(z) = \dfrac{1}{4}\big(b^2 + 2b'\big)$

One can check that $Sp(A) = \mathbb{R}^+$ when $b^2(z) + 2b'(z) \to 0$ as $|z| \to +\infty$. A similar result in dimension d = 3 can be found in KATO [4], p. 304.
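As a numerical illustration of the Remark (our own check, with the drift b(z) = -z, for which (4.4) holds and $V(z) = (z^2 - 2)/4$): a finite-difference discretization of $-d^2/dz^2 + V$ recovers the spectrum $\{0, 1, 2, \ldots\}$ of the corresponding Ornstein-Uhlenbeck generator, with 0 isolated as Theorem 4.1 predicts:

```python
import numpy as np

# Schrodinger operator -d^2/dz^2 + V with V = (b^2 + 2 b')/4 and b(z) = -z
L_box, h = 8.0, 0.02
z = np.arange(-L_box, L_box + h / 2, h)
V = (z**2 - 2.0) / 4.0

n = len(z)
H = np.diag(2.0 / h**2 + V)                              # three-point Laplacian
H += np.diag(-np.ones(n - 1) / h**2, 1) + np.diag(-np.ones(n - 1) / h**2, -1)
evs = np.linalg.eigvalsh(H)[:3]
print(evs)   # close to [0, 1, 2]: the eigenvalues of -L for the OU process
```

The gap between the isolated eigenvalue 0 and the rest of the spectrum is exactly the decay rate α of (4.2) in this example.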

V  The Poisson equation

Under the hypothesis (4.4), one easily shows that if $f \in \hat L^2_0$, then the Poisson equation

(5.1)  $L W + f = 0$

has a unique solution $W \in \hat L^2_0$, which is precisely given by

$W(z) = \int_0^\infty E_z[f(Z_t)]\,dt$

It will prove crucial for us to have conditions on f under which the solution W is bounded. Again, Doeblin's condition would insure that f bounded implies W bounded. This is not true here, as the following trivial example shows:

$d = 1,\ f(z) = -b(z)$; then $W(z) = z$
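The example can be checked directly; here is a small numerical verification of our own, with the concrete choice a(z) = 2 and b(z) = -z:

```python
# d = 1, L = (a/2) d^2/dz^2 + b d/dz with a = 2, b(z) = -z (our own concrete choice);
# W(z) = z and f(z) = -b(z) should give L W + f = 0.
def b(z):
    return -z

def residual(z, h=1e-3):
    W = lambda y: y
    d1 = (W(z + h) - W(z - h)) / (2 * h)            # W'
    d2 = (W(z + h) - 2 * W(z) + W(z - h)) / h**2    # W''
    return (2.0 / 2.0) * d2 + b(z) * d1 + (-b(z))   # L W + f

worst = max(abs(residual(z)) for z in [-3.0, -1.0, 0.0, 0.7, 2.5])
print(worst)   # ~ 0, up to floating-point noise
```

Note that here f is unbounded, matching the point of the example: a bounded f need not yield a bounded W, and conversely unbounded data can appear even for W(z) = z.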

We now state a result, which is due to P.L. Lions:

Theorem 5.1  Suppose that (4.4) holds and that in addition there exist $p > \dfrac{d}{2}$ and $N > 0$ s.t.:

(i)  $f \in L^p_{loc}(\mathbb{R}^d)$

(ii)  $\int_N^\infty \operatorname{ess\,sup}_{|z| > t} |f(z)|\,dt < \infty$

Then $W \in L^\infty(\mathbb{R}^d)$, where W is the unique solution of (5.1) which belongs to $\hat L^2_0$.  []

It is worth noting that a condition like $f \in L^p(\mathbb{R}^d)$ for some large p would not imply the theorem. Indeed, a smooth function W equal to $\log|z|$ for $|z| > 1$ would satisfy (5.1) with $|f(z)| \le \dfrac{c}{|z|}$ for $|z|$ large.

VI  Justification of the expansion

We assume that (4.4) holds. Fix T > 0. It is then easy to prove that equations (3.11) and (3.12) have unique solutions belonging to:

$L^2(0,T;\,H^1(\mathbb{R}^n)) \cap C([0,T];\,L^2(\mathbb{R}^n))$

And from the regularity assumptions on the coefficients of A and B, one can check that any partial derivative of any order with respect to the $z_i$'s has the same regularity.
We now want to show:

(6.1)  $\big|\,E[u^\varepsilon_t] - \bar u^0(t) - \varepsilon\,\bar u^1(t)\,\big| \le c(t)\,\varepsilon^2$

where again $|\cdot|$ stands for the norm in $L^2(\mathbb{R}^n)$.

We will use the following notation: if H is a Hilbert space, $\hat L^2(H) = L^2(\mathbb{R}^d,\,p(z)\,dz;\,H)$.

Define:

$\rho^\varepsilon(t,z) = \bar v^\varepsilon(t,z) - \sum_{i=0}^{3} \varepsilon^i\,v^i(t,\,t/\varepsilon^2,\,z)$

where, for i = 1,2,3, $v^i(t,\tau) = \bar u^i(t,\tau) + \tilde v^i(t,\tau)$, and these are the quantities we have computed in § III.

Lemma 6.1  A sufficient condition for (6.1) to hold is that $\forall\,t > 0$, $\exists\,\tilde c(t) \in \mathbb{R}_+$ s.t.:

(6.2)  $\|\rho^\varepsilon(t)\|_{\hat L^2(L^2(\mathbb{R}^n))} \le \tilde c(t)\,\varepsilon^2$

Proof: From the Cauchy-Schwarz inequality,

$\Big|\int \rho^\varepsilon(t,z)\,p(z)\,dz\Big| \le \|\rho^\varepsilon(t)\|_{\hat L^2(H)}$

so that (6.2) implies:

$\big|\,E[u^\varepsilon_t] - \bar u^0(t) - \varepsilon\,\bar u^1(t) - \varepsilon^2\,\bar u^2(t,t/\varepsilon^2) - \varepsilon^3\,\bar u^3(t,t/\varepsilon^2)\,\big| \le \tilde c(t)\,\varepsilon^2$

But one can show that $\forall\,t > 0$, $\exists\,C'$ s.t.:

$\sup_{\tau > 0} |\bar u^2(t,\tau)| \le C', \qquad \sup_{\tau > 0} |\bar u^3(t,\tau)| \le C'$

(6.1) then follows.  []

Now $\rho^\varepsilon$ satisfies:

$\Big(\dfrac{\partial}{\partial t} + A + \dfrac{1}{\varepsilon}\,B - \dfrac{1}{\varepsilon^2}\,L\Big)\,\rho^\varepsilon(t) = \varepsilon^2\,g(t,\varepsilon)\,;\qquad \rho^\varepsilon(0) = 0$

where $g(t,\varepsilon)$ is an expression involving $v^2$ and $v^3$.

One can show, under the above hypotheses: $\forall\,T > 0$, $\exists\,C''$ s.t.:

(6.3)  $\int_0^T \|g(t,\varepsilon)\|^2_{\hat L^2(H^{-1}(\mathbb{R}^n))}\,dt \le C''$

Define $y^\varepsilon(t) = \varepsilon^{-2}\,\rho^\varepsilon(t)$; $y^\varepsilon$ solves:

(6.4)  $\Big(\dfrac{\partial}{\partial t} + A + \dfrac{1}{\varepsilon}\,B - \dfrac{1}{\varepsilon^2}\,L\Big)\,y^\varepsilon(t) = g(t,\varepsilon)\,;\qquad y^\varepsilon(0) = 0$

(6.2) is a consequence of (6.3) and:

Theorem 6.2  Suppose that the following holds for $h = \ell,\ \dfrac{\partial \ell}{\partial x_i},\ i = 1 \ldots n$: $\exists\,N$ and $K > 0$ s.t.:

(6.5)  $\int_N^\infty \operatorname{ess\,sup}_{|z| > t} |h(x,z)|\,dt \le K, \qquad \forall\,x \in \mathbb{R}^n$

and that $\dfrac{\partial \ell}{\partial x_i} \in L^\infty(\mathbb{R}^{n+d})$, $i = 1 \ldots n$. Then $\forall\,t > 0$, $\exists\,k(t)$ s.t.

$\|y^\varepsilon(t)\|^2_{\hat L^2(L^2(\mathbb{R}^n))} \le k(t)\int_0^t \|g(s,\varepsilon)\|^2_{\hat L^2(H^{-1}(\mathbb{R}^n))}\,ds$

Sketch of proof: It follows from the hypotheses of the Theorem, using Theorem 5.1, that χ, the unique solution of:

$L\,\chi(x,z) = \ell(x,z), \qquad \chi(x,\cdot) \in \hat L^2_0, \quad \forall\,x \in \mathbb{R}^n$

satisfies:

(6.6)  $\chi \in L^\infty(\mathbb{R}^{n+d})\,;\qquad \dfrac{\partial \chi}{\partial x_i} \in L^\infty(\mathbb{R}^{n+d}),\ i = 1 \ldots n$

One can then get, using (4.3) and standard PDE techniques:

$\|y^\varepsilon(t)\|^2_{\hat L^2(L^2(\mathbb{R}^n))} + \varepsilon\,\big(y^\varepsilon(t),\,\chi\,y^\varepsilon(t)\big)_{\hat L^2(L^2(\mathbb{R}^n))} \le k(t)\int_0^t \|g(s,\varepsilon)\|^2_{\hat L^2(H^{-1}(\mathbb{R}^n))}\,ds$

which yields the desired result, from (6.6).  []

Remark  We have supposed that B(z) is of the form:

$B(z) = B_1(x,z)\cdot\nabla + B_0(x,z)$

If $B_0 = 0$, then one can get estimates for $y^\varepsilon(t)$ uniformly in ε, using the maximum principle, and then avoid the restrictive assumption of Theorem 6.2.  []

Bibliography

[1] G. BLANKENSHIP, G. PAPANICOLAOU.- Stability and control of stochastic systems with wide-band noise disturbances. SIAM J. Appl. Math. 34, 3, pp. 437-476 (1978)

[2] R. BOUC, E. PARDOUX.- Moments of semilinear random evolutions. SIAM J. Appl. Math. 41, 2, pp. 370-399 (1981)

[3] T. CAUGHEY, H. PAYNE.- On the response of a class of self-excited oscillators to stochastic excitation. Int. J. of Nonlinear Mechanics 2, pp. 125-151 (1967)

[4] T. KATO.- Perturbation theory for linear operators. Springer (1976)

[5] H. KUSHNER.- article in this volume

[6] G. PAPANICOLAOU.- Asymptotic analysis of stochastic equations. In Studies in Probability Theory, M. Rosenblatt, ed. MAA Studies in Applied Mathematics, vol. 18

[7] F. RIESZ, B. SZ.-NAGY.- Leçons d'Analyse Fonctionnelle. Gauthier-Villars (1972)

[8] M. VIOT.- Solutions faibles d'équations aux dérivées partielles stochastiques non linéaires. Thèse, Univ. Paris VI (1976)


A DISCRETE TIME STOCHASTIC DECISION MODEL

Stanley R. Pliska
Northwestern University
Evanston, IL 60201/USA

Discrete time Markov decision chains are usually defined in terms of a Markov transition matrix. A less common approach, but one that is more useful for applications, is to formulate the model in terms of a state transition function, where the next state is a function of the current state, the current action, and an exogenous random variable.

For most applications these exogenous random variables (one for each period) have an explicit, physical interpretation. Moreover, in the case of Markov decision chains, they are independent and identically distributed. A natural and important generalization, therefore, and the subject of this paper, is the stochastic decision process that results when these exogenous random variables are not independent and identically distributed, but rather comprise a general stochastic process.

Upon making this generalization, the underlying process being controlled may become non-Markovian. A few authors have studied non-Markovian stochastic control models. Davis [2] and Rishel [6] studied very general, continuous-time models. Discrete-time models were presented by Dynkin [3] and Gihman and Skorohod [4]. The stochastic decision model studied here is considerably more structured and less abstract than any of these. Indeed, as mentioned above, it is only a modest, yet significant, generalization of the state-transition-function kind of Markov decision chain model.

After formulating the stochastic decision model in Section 1, its potential usefulness as a practical tool is illustrated with the brief presentation of five different applications. Section 3 provides a martingale type of optimality condition and explains how to use dynamic programming to solve the problem. An alternative solution technique that is sometimes more efficient than dynamic programming is sketched out in Section 4; this method involves stochastic calculus and convex optimization theory. Sections 5, 6, and 7 solve a fairly general example problem with both dynamic programming and the alternative solution technique. The paper ends with some concluding remarks.

1. Formulation of the Model

The basic elements of the model are a filtered probability space $(\Omega, F, \mathbb{F}, P)$ and a time horizon $T < \infty$. For technical reasons it is assumed the sample space Ω is discrete. However, most of what is done here is also true when the sample space is uncountable. Thus in the case of general sample spaces the reader should regard the results here as being formal but not rigorous.

The filtration $\mathbb{F} = \{F_t;\ t = 0,1,\ldots,T\}$, where each $F_t$ is a σ-algebra of subsets of Ω and $F_0 \subset F_1 \subset \ldots \subset F_T$. Without any real loss of generality, it is assumed that $F_0 = \{\emptyset, \Omega\}$ and $F_T = F$.
A stochastic process $Z = \{Z_t;\ t = 1,2,\ldots,T\}$ is specified and fixed. This will be called the environmental process. It is assumed that Z is real-valued and adapted, that is, the function $\omega \to Z_t(\omega)$ is measurable with respect to $F_t$ (written $Z_t \in F_t$) for each t.

The admissible controls are defined in terms of a predictable set-valued process $A = \{A_t;\ t = 1,2,\ldots,T\}$ called the constraint process. Here $\emptyset \ne A_t(\omega) \subset \mathbb{R}$ for each t and ω; for example, $A_t(\omega)$ is an interval. One should think of $A_t$ as defining the admissible actions or decisions. Predictable means $A_t \in F_{t-1}$ for each t.

Throughout this paper controls and policies will be called decision processes. An admissible decision process will be any predictable, real-valued stochastic process $D = \{D_t;\ t = 1,2,\ldots,T\}$ satisfying $D_t(\omega) \in A_t(\omega)$ for all ω and t. Let $\underline{D}$ denote the set of all such decision processes. Viewing the sequence of control actions as a predictable stochastic process is a crucial feature of this decision model. It will be seen that this approach is not really different from that taken with Markov decision chains, say, where the control is taken to be a function of the current state.
The process to be controlled is denoted $X = \{X_t;\ t = 1,2,\ldots,T\}$ and called the controlled process. It evolves according to a specified state transition function $f: \mathbb{R}^3 \to \mathbb{R}$. The way this works is very simple. The initial state $X_1$ is specified, that is, $X_1 \in F_0$, and then for any particular decision process D one has

$X_{t+1} = f(X_t, Z_t, D_t), \qquad t = 1,2,\ldots,T-1.$

Note that X is predictable.


The decision model generates rewards according to a specified reward function $r: \mathbb{R}^3 \to \mathbb{R}$. The reward

$R_t = r(X_t, Z_t, D_t)$

is generated at time t, and this sequence of rewards defines a reward process $R = \{R_t;\ t = 1,2,\ldots,T\}$. Note that R is adapted.

Corresponding to the reward process is another adapted stochastic process $W = \{W_t;\ t = 0,1,\ldots,T\}$ called the wealth process. The initial wealth $W_0 \in F_0$ is specified, and then one has

$W_t = R_t + W_{t-1}, \qquad t = 1,2,\ldots,T.$
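These recursions are easy to state in code. The sketch below is our own illustration (f, r, and the input paths are placeholders, not from the paper); it generates X, R, and W along one realized path:

```python
def simulate_path(f, r, x1, w0, z_path, d_path):
    """Run X_{t+1} = f(X_t, Z_t, D_t), R_t = r(X_t, Z_t, D_t), W_t = R_t + W_{t-1}
    along one realized Z-path and one decision path."""
    T = len(z_path)
    X, R, W = [x1], [], [w0]
    for t in range(T):
        R.append(r(X[t], z_path[t], d_path[t]))
        W.append(W[-1] + R[-1])
        if t < T - 1:
            X.append(f(X[t], z_path[t], d_path[t]))
    return X, R, W

# e.g. the optimal portfolio application of Section 2, with an (illustrative)
# bank rate of 0.02, D_t the fraction of wealth in stock, and X_{t+1} = X_t + R_t
reward = lambda x, z, d: x * (d * z + (1 - d) * 0.02)
trans = lambda x, z, d: x + reward(x, z, d)
X, R, W = simulate_path(trans, reward, 100.0, 100.0, [0.10, -0.05], [0.5, 0.5])
print(W[-1])
```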

To understand how the decision model operates it is useful to think of the time parameter t as the index for a sequence of periods. At the beginning of period t the decision maker observes the information $F_{t-1}$, which includes $X_1, X_2, \ldots, X_t$; $D_1, D_2, \ldots, D_{t-1}$; $Z_1, Z_2, \ldots, Z_{t-1}$; $A_1, A_2, \ldots, A_t$; and $W_0, W_1, \ldots, W_{t-1}$. In particular, one should think of $X_t$ as the current state and $W_{t-1}$ as the current wealth. The decision maker then uses this information to choose the action $D_t$, after which the next value $Z_t$ of the environmental process is observed, the reward $R_t$ for the period is generated, and the new wealth $W_t$ is realized. This sequence is repeated period-by-period until the terminal wealth $W_T$ is realized. The applications in the next section will give further insight into how this decision model functions.

The decision maker's objective may be to maximize the expected terminal wealth $W_T$. However, it will be useful for purposes of economic modeling to be more general than this. Let u be a specified real-valued function measuring the utility of the decision maker's terminal wealth. Then the problem is to choose a decision process D so as to maximize the expected utility $E[u(W_T)]$. Later sections will explain how to solve this problem.

It is important to recognize that if the random variables in the sequence $\{Z_t\}$ are independent and identically distributed and if the rest of the decision model is suitably defined, then the decision model becomes an ordinary Markov decision chain. Indeed, it becomes identical to the kind of Markov decision chain treated by Bertsekas [1] which, in turn, is equivalent to the conventional kind of Markov decision chain that is formulated in terms of a Markov transition matrix.

2. Some Applications

A primary reason for the importance of the decision model is its suitability and usefulness for many different applications. The following table presents five possible applications. These applications are meant to be suggestive, not definitive. The columns indicate the applications, while the rows specify the various elements of the model. Note that the constraint process A is sometimes specified in terms of the controlled process X; this is allowed, since X is predictable.

All of the applications involve an environmental process Z that has an explicit, physical interpretation. In the special case where Z is a sequence of independent and identically distributed random variables, all of these problems specialize to standard applications of Markov decision chains. But for all of these problems it is both natural and meaningful to allow the environmental process to be more general.

The first three applications are simple generalizations of classical problems from the operations research literature. For all three of these problems it may be important to take the environmental process Z to be more general than a sequence of independent and identically distributed random variables. For the controlled queueing problem the term $r_1$ of the reward function is meant to be the service cost, while $r_2$ is the waiting cost. In the production-inventory problem $r_1$ is the ordering cost, $r_2$ is the holding cost, $r_3$ is the shortage cost, and there is complete backlogging. In the replacement-maintenance problem $r_1$ is the cost of maintaining an item that has received quantity $X_t$ of shocks and now receives shock $Z_t$, while the scalar c is the replacement cost.

The fourth application is one example of an optimal portfolio problem, an important and well-studied problem in finance. The investor can buy a stock, with $1
Controlled queues: $Z_t$ = arrivals during period t; $X_t$ = number waiting; $D_t$ = number served; $A_t = \{0,1,\ldots,X_t\}$; $R_t = -r_1(D_t) - r_2(X_t - D_t)$; $X_{t+1} = X_t + Z_t - D_t$.

Production-inventory: $Z_t$ = demand during period t; $X_t$ = beginning inventory level; $D_t$ = inventory level after ordering; $A_t = \{X_t, X_t + 1, \ldots\}$; $R_t = -r_1(D_t - X_t) - r_2((D_t - Z_t) \vee 0) - r_3((Z_t - D_t) \vee 0)$; $X_{t+1} = D_t - Z_t$.

Replacement-maintenance: $Z_t$ = shocks and wear during period t; $X_t$ = cumulative wear; $D_t$ = action (0 = replace, 1 = keep); $A_t = \{0,1\}$; $R_t = -r(X_t, Z_t)\,D_t - c\,(1 - D_t)$; $X_{t+1} = D_t X_t + Z_t$.

Optimal portfolio: $Z_t$ = one period rate-of-return of stock; $X_t$ = current wealth; $D_t$ = fraction of wealth in stock (versus bank at rate $\bar r$); $A_t = [0,1]$; $R_t = X_t\,(D_t Z_t + (1 - D_t)\,\bar r)$; $X_{t+1} = X_t + R_t$.

Consumption-investment: $Z_t$ = one period rate-of-return of investment; $X_t$ = current wealth; $D_t$ = consumption during period t; $A_t = [0, X_t]$; $R_t = r(D_t)$; $X_{t+1} = (1 + Z_t)(X_t - D_t)$.

TABLE. Some Applications of the Decision Model

invested at time t becoming worth $(Z_t + 1) at time t+1, and/or put money in a bank at a fixed interest rate $\bar r$. The problem is to optimally divide his money between the two investments. Note that X = W.

The last application is a consumption-investment problem. Consumption-investment problems, as well as variations such as optimal capital accumulation under uncertainty and resource allocation under uncertainty, have been extensively studied in the economics literature. As with the optimal portfolio problem, the environmental process Z is the rate of return of an investment, and X is current wealth available for investment. However, now $W \ne X$. Each period the decision maker must consume the portion $D_t$ of his wealth and invest the balance $X_t - D_t$. The consumption generates immediate utility $r(D_t)$, while the investment yields wealth $(1 + Z_t)(X_t - D_t)$ next period. Now W should be thought of as the cumulative utility, so one should take u(w) = w. Incidentally, thinking of how one might model the prime interest rate, it may be appropriate for the environmental process Z of these last two applications to be a Markov chain.

3. Dynamic Programming and Martingale Optimality Conditions

Just as dynamic programming is used to solve Markov decision problems, so can it be used to solve the stochastic decision problem. This will be explained in this section, as will be a martingale type of necessary and sufficient condition for a decision process to be optimal.

Let $W^D$ denote a wealth process under decision process D, and similarly for $X^D$. One says "a" wealth process rather than "the" wealth process because there is no restriction on its initial value $W^D_0$. Similarly for $X^D$. For each $t = 0,1,\ldots,T$ and $D \in \underline{D}$, let $V^D_t$ be the real-valued function on $\mathbb{R}^2 \times \Omega$ defined by

$V^D_t(w,x,\omega) = E[u(W^D_T)\ |\ W^D_t = w,\ X^D_{t+1} = x,\ F_t] - u(w).$

In other words, $V^D_t(w,x,\omega)$ is the conditional expected change in utility from the end of period t, that is, from the beginning of period t+1, given the wealth then is w, the state then is x, and the information corresponding to $F_t$ has been observed. The problem, of course, is to choose $D \in \underline{D}$ so as to maximize $V^D_0(W_0, X_1, \cdot)$, where $W_0$ and $X_1$ are the specified initial values.

For each $t = 0,1,\ldots,T$, let $V_t$ be the real-valued function on $\mathbb{R}^2 \times \Omega$ defined by

$V_t(w,x,\omega) = \sup_{D \in \underline{D}} V^D_t(w,x,\omega).$

Thus $V_t$ is the maximum expected change in utility from the end of period t. If $D \in \underline{D}$ is such that $V^D_0(W_0,X_1,\cdot) = V_0(W_0,X_1,\cdot)$, then D will be called optimal, for clearly this D maximizes $E[u(W^D_T)]$ over $\underline{D}$ subject to $W^D_0 = W_0$ and $X^D_1 = X_1$. Note that $V_T = 0$.

The function V will be called the value function. In order to avoid annoying technicalities, it will be assumed that $V^D_t(w,x,\omega)$ and $V_t(w,x,\omega)$ are well-defined and finite for every D, t, w, x, and ω. The main result of this section is that V can be computed by solving a dynamic programming functional equation.


(1) Theorem. Suppose there exist real-valued functions $v_0, v_1, \ldots, v_T$, each with domain $\mathbb{R}^2 \times \Omega$, satisfying $v_T = 0$ and

(2)  $v_{t-1}(w,x,\cdot) = \sup_{D_t \in A_t}\ \big\{E[u(w + r(x,Z_t,D_t))\ |\ F_{t-1}] - u(w) + E[v_t(w + r(x,Z_t,D_t),\ f(x,Z_t,D_t),\ \cdot)\ |\ F_{t-1}]\big\}$

for $t = T, T-1, \ldots, 1$. Then $V_t = v_t$ for each t.

Remark. This dynamic programming equation says the maximum expected change in utility equals the maximum of the sum of the expected change in utility over the current period plus the expected remaining change. The notation here is somewhat confusing, so it deserves an explanation. For each fixed w, x, and $D_t$ the expression on the right hand side being taken a supremum of, that is, the expression within the braces { }, is an $F_{t-1}$ measurable function on Ω. Thus, for each fixed $\omega \in \Omega$, $v_{t-1}(w,x,\omega)$ equals the supremum of this expression as the scalar $D_t(\omega)$ varies over the set $A_t(\omega)$. Since Ω is discrete, it follows that $v_{t-1}(w,x,\cdot)$ is $F_{t-1}$ measurable and, moreover, if $D_t(\omega)$ attains the supremum for every $\omega \in \Omega$, then $D_t$ is $F_{t-1}$ measurable.
Proof. This induction proof is similar to that for conventional dynamic programming problems, so it will only be sketched. After easily showing $V_{T-1} = v_{T-1}$, one assumes $V_t = v_t$ and shows $V_{t-1} = v_{t-1}$ by carrying out the following computation:

$V_{t-1}(w,x,\cdot) = \sup_{D \in \underline{D}}\ E[u(W^D_T)\ |\ W^D_{t-1} = w,\ X^D_t = x,\ F_{t-1}] - u(w)$

$= \sup_{D \in \underline{D}}\ \big\{E[u(w + r(x,Z_t,D_t))\ |\ F_{t-1}] - u(w) + E[V^D_t(w + r(x,Z_t,D_t),\ f(x,Z_t,D_t),\ \cdot)\ |\ F_{t-1}]\big\}$

$= \sup_{D_t \in A_t}\ \big\{E[u(w + r(x,Z_t,D_t))\ |\ F_{t-1}] - u(w) + E[\sup_{D_{t+1},\ldots} V^D_t(w + r(x,Z_t,D_t),\ f(x,Z_t,D_t),\ \cdot)\ |\ F_{t-1}]\big\}$

$= \sup_{D_t \in A_t}\ \big\{E[u(w + r(x,Z_t,D_t))\ |\ F_{t-1}] - u(w) + E[v_t(w + r(x,Z_t,D_t),\ f(x,Z_t,D_t),\ \cdot)\ |\ F_{t-1}]\big\}$

$= v_{t-1}.$

Just as with conventional dynamic programming problems, if the supremum is attained in (2) then the corresponding decision process is optimal (the fact that Ω is discrete makes it easy to show this process is predictable). In other words, one has the following.


(3) Corollary. For $D \in \underline{D}$ to be optimal it is necessary and sufficient that, for $t = 1,2,\ldots,T$,

(4)  $V_{t-1}(W^D_{t-1},\,X^D_t,\,\cdot) = E[u(W^D_{t-1} + r(X^D_t, Z_t, D_t))\ |\ F_{t-1}] - u(W^D_{t-1}) + E[V_t(W^D_{t-1} + r(X^D_t, Z_t, D_t),\ f(X^D_t, Z_t, D_t),\ \cdot)\ |\ F_{t-1}].$

To use this Corollary to determine an optimal decision process, first set t = 1, $W^D_0 = W_0$, and $X^D_1 = X_1$. Then choose $D_1$ so as to satisfy (4). Next, for t = 2, set $W^D_1 = W^D_0 + r(X_1, Z_1, D_1)$ and $X^D_2 = f(X_1, Z_1, D_1)$, and then choose $D_2$ so as to satisfy (4). Continue in this way until $D_T$ has been determined. This procedure will be illustrated in Section 6.
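On a finite sample space the recursion (2) can be carried out by brute force. The sketch below is our own illustration (not the paper's Section 6 solution): it enumerates the tree of Z-histories, a decision at time t being allowed to depend only on the history $(z_1,\ldots,z_{t-1})$:

```python
import math

def solve(T, z_vals, prob, actions, f, r, u, x1, w0):
    """Enumerative backward recursion for equation (2) on the tree of
    Z-histories; prob(z, hist) is the conditional law of Z_t given the history."""
    def v(t, w, x, hist):
        # returns (v_{t-1}(w, x) on this history, maximizing action for period t)
        if t > T:
            return 0.0, None
        best, best_d = -math.inf, None
        for d in actions(x, hist):
            val = 0.0
            for z in z_vals:
                w2 = w + r(x, z, d)
                val += prob(z, hist) * (u(w2) - u(w)
                                        + v(t + 1, w2, f(x, z, d), hist + (z,))[0])
            if val > best:
                best, best_d = val, d
        return best, best_d
    return v(1, w0, x1, ())

# Toy instance (our own, not the paper's): R_t = D_t Z_t, Z_t = +-1 i.i.d. with
# P(Z_t = 1) = 0.6, no controlled process (f == 0), u(w) = -exp(-w), grid actions.
grid = [k / 100 for k in range(-100, 101)]
value, d1 = solve(
    T=2,
    z_vals=[1, -1],
    prob=lambda z, hist: 0.6 if z == 1 else 0.4,
    actions=lambda x, hist: grid,
    f=lambda x, z, d: 0.0,
    r=lambda x, z, d: d * z,
    u=lambda w: -math.exp(-w),
    x1=0.0,
    w0=0.0,
)
print(d1)
```

For this utility the one-period maximizer is $d^* = \frac{1}{2}\log(0.6/0.4) \approx 0.2027$, and the grid search returns the nearest grid point, 0.20, in the first period.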
The recent literature on stochastic control (see, e.g., Davis [2]) contains optimality conditions stated in terms of certain martingale properties. It turns out Theorem (1) and Corollary (3) can be transformed into such a statement.

For each $D \in \underline{D}$ let $M^D = \{M^D_t;\ t = 0,1,\ldots,T\}$ be the stochastic process defined by setting

$M^D_t = u(W^D_t) - u(W^D_0) + V_t(W^D_t,\,X^D_{t+1},\,\cdot)$

where $W^D_0 = W_0$ and $X^D_1 = X_1$. Thus $M^D_t$ equals the expected change in utility over all periods if D is used through the end of period t and then optimal decisions are made. Recall that a process such as $M^D$ is a supermartingale if $M^D_t \ge E[M^D_{t+1}\ |\ F_t]$ for all t, and that $M^D$ is a martingale if both $M^D$ and $-M^D$ are supermartingales. This leads to the following result.

(5) Theorem. The process $M^D$ is a supermartingale for every $D \in \underline{D}$. The decision process D is optimal if and only if $M^D$ is a martingale.

Proof. Let $D \in \underline{D}$ and t be arbitrary. Note that

$M^D_{t+1} - M^D_t = u(W^D_{t+1}) - u(W^D_t) + V_{t+1}(W^D_{t+1},\,X^D_{t+2},\,\cdot) - V_t(W^D_t,\,X^D_{t+1},\,\cdot)$

Also, by the dynamic programming results,

$V_t(W^D_t,\,X^D_{t+1},\,\cdot) \ge E[u(W^D_{t+1}) - u(W^D_t) + V_{t+1}(W^D_{t+1},\,X^D_{t+2},\,\cdot)\ |\ F_t]$

Hence $E[M^D_{t+1} - M^D_t\ |\ F_t] \le 0$, which means $M^D$ is a supermartingale.

For the second part of this theorem, one sees by Corollary (3) that D is optimal if and only if all of these inequalities are actually equalities. But, of course, $E[M^D_{t+1} - M^D_t\ |\ F_t] = 0$ for all t if and only if $M^D$ is a martingale.

4. An Alternative Solution Technique

The dynamic programming approach amounts to maximizing the real-valued function


D -- E[u(W~-)7
L T J
over the space D of admissible decision processes. Even with a finite
sample space, however, the "curse of dlmenslonallt~' may make this a formidable
undertaking from the computational standpoint.
An alternative approach would be to proceed with the following three steps.
First, identify in the space of random variables the set that is the range of the
function D ~ W DT. In other words, if =W denotes this set of attainable wealths, then
the random variable Y E W if and only if Y = W~ for some D E
Second, solve the optimization problem of maximizing E[u(Y)] subject to Y E E.
This problem may be easy to solve, particularly if the utility function u is concave,
for then this is a conventional convex optimization problem for which there is a
large literature.
Finally, having determined an optimal Y E W for the preceding optimization prob-
lem, determine the decision process that generates this terminal wealth, that is,
the D E ~ satisfying ~T = Y" Depending on the circumstances, this third step is
usually the easiest of the three to carry out.
The first step is the crucial one; it may not be feasible to identify the range
W. But the contention is that if W can be readily identified, then this alternative
solution procedure may be computationally more efficient than dynamic programming.
Unfortunately, there are no general results to support this statement. However, one
general class of stochastic decision models has been identified for which this alter-
native solution procedure works very well. This is where the wealth process W can
be expressed as a stochastic integral of the decision process with respect to a
martingale. In this case stochastic calculus theory can be applied to identify W.
Then if u is concave the second step can be carried out with convex analysis. Finally,
the optimal decision process is readily determined by solving a martingale represen-
tation problem. This all will be illustrated in Section 7 where an example of this
kind of problem will be solved in detail.

5. An Example

This section presents an example of a stochastic decision model that will be


solved with dynamic programming in Section 6 and with martingale methods and convex
analysis in Section 7.
Let the sample space Ω be all the T-dimensional vectors whose components are
either 1 or −1. Thus Ω has 2^T elements.
Let the probability measure P be arbitrary, subject only to the requirement that
P(ω) > 0 for all ω ∈ Ω.
Let the environmental process Z be defined by setting Z_t(ω) = ω_t, the t-th
component of ω.

Let the filtration F = {F_t; t = 0,1,...,T} be the one generated by Z, with
F_0 = {∅, Ω}. Note that F_T = F, the set of all subsets of Ω.
For the constraint process A, set A_t = R for all t. Thus a decision process
D ∈ D can be any scalar-valued, predictable stochastic process.
There is no controlled process X with this example. The reward associated
with period t is simply R_t = D_tZ_t. With initial wealth W_0 = 0, the wealth at the
end of period t is W_t = R_1 + R_2 + ... + R_t.
The utility function u is initially taken to be strictly concave and increasing
with u'(w) → 0 as w → ∞ and u'(w) → ∞ as w → −∞. Moreover, u has a contin-
uous second derivative. Later this will be specialized even further.

6. Solving the Example with Dynamic Programming

To solve the example problem of Section 5, one begins by computing the value
function V. Since there is no controlled process X with this example, the notation
will be modified accordingly. In particular, the dynamic programming equation (2)
becomes

(6) V_{t−1}(w,ω) = sup_{D_t ∈ F_{t−1}} E[u(w + D_tZ_t) − u(w) + V_t(w + D_tZ_t, ·) | F_{t−1}].

With V'_t denoting the partial derivative of V_t with respect to its first argument, one
has
(7) Proposition. There exists an optimal control. For each fixed ω ∈ Ω and each
t the function w → V_t(w,ω) + u(w) is strictly concave and increasing with a continuous
second derivative, and either V'_t(w,ω) + u'(w) → 0 as w → ∞ or V'_t(w,ω) + u'(w) → ∞
as w → −∞.
Proof. The proof is by induction. Since V_T = 0, the function V_T + u clearly has the
specified properties, so for the induction step it will be assumed that so does the
function h, which is defined by

h(w,ω) = V_t(w,ω) + u(w).

Substituting h into (6) yields

(8) V_{t−1}(w,ω) = sup_{D_t ∈ F_{t−1}} E[h(w + D_tZ_t, ω) | F_{t−1}] − u(w).

To analyze this, consider how the σ-algebra F_{t−1} is equivalent to a partition of Ω,
and then focus on an arbitrary cell of this partition, say B ⊂ Ω (thus B ∈ F_{t−1}, but
no proper subset of B is in F_{t−1}). The cell B will subdivide into two cells, say B_1
and B_{−1}, according to whether Z_t = 1 or Z_t = −1. Let ω_1 ∈ B_1 and ω_{−1} ∈ B_{−1} be arbitrary,
so that h(w+d, ω) = h(w+d, ω_1) for all ω ∈ B_1 and h(w−d, ω) = h(w−d, ω_{−1}) for all
ω ∈ B_{−1}. Then with p = P(Z_t = 1 | B), (8) becomes, for all ω ∈ B,

V_{t−1}(w,ω) = sup_d {p h(w+d, ω_1) + (1−p) h(w−d, ω_{−1})} − u(w).

Note that ω → V_{t−1}(w,ω) is constant on B. By the induction assumption about the



asymptotic properties of h and some elementary calculus, there exists a continuous
real-valued function f on the real line such that, for each w ∈ R, f(w) attains the
supremum. In particular, f satisfies the first-order optimality condition

(9) p h'(w + f(w), ω_1) = (1 − p) h'(w − f(w), ω_{−1}), w ∈ B,

where h' denotes the partial derivative of h with respect to its first argument.
Thus f(w) is the optimal decision for all ω ∈ B when the wealth is w, and

V_{t−1}(w,ω) + u(w) = p h(w + f(w), ω_1) + (1 − p) h(w − f(w), ω_{−1}).

Differentiating this last expression with respect to w one obtains, after using (9),

(10) V'_{t−1}(w,ω) + u'(w) = p h'(w + f(w), ω_1) + (1 − p) h'(w − f(w), ω_{−1}).

Meanwhile, differentiating (9) yields, for all w ∈ B,

f'(w) = [(1 − p) h''(w − f(w), ω_{−1}) − p h''(w + f(w), ω_1)] / [(1 − p) h''(w − f(w), ω_{−1}) + p h''(w + f(w), ω_1)],

where h'' denotes the second partial derivative of h with respect to its first argu-
ment. Thus f' is continuous with −1 < f'(w) < 1. Hence, by (10), not only is
V_{t−1}(w,ω) + u(w) increasing with respect to w, but V'_{t−1}(w,ω) + u'(w) either converges
to 0 as w → ∞ or converges to ∞ as w → −∞.

Finally, differentiating (10) with respect to w yields

V''_{t−1}(w,ω) + u''(w) = p h''(w + f(w), ω_1)[1 + f'(w)]
+ (1 − p) h''(w − f(w), ω_{−1})[1 − f'(w)],

where V''_{t−1} denotes the second partial derivative of V_{t−1} with respect to its first
argument. Hence w → V_{t−1}(w,ω) + u(w) is strictly concave and has a continuous second
derivative. □
What can be said about the computational effort that is required to obtain the
value function V? For each cell in the partition corresponding to each F_t one needs
to solve for the function f that was defined in the preceding proof. This amounts
to solving the first-order optimality condition (9). For example, if

u(w) = a − (b/c)exp(−cw)

for arbitrary scalars a, b > 0, and c > 0, then u'(w) = b exp(−cw). For the last
period, h = u and each cell in the partition corresponding to F_{T−1} consists of two
elements, say ω_1 and ω_{−1}. Hence (9) becomes p/(1−p) = exp(2c f(w)), where the con-
ditional probability p = P(ω_1)/(P(ω_1) + P(ω_{−1})). For ω ∈ {ω_1, ω_{−1}}, the optimal
value of D_T(ω) is then given by

f(w) = (1/2c) log(P(ω_1)/P(ω_{−1})).

Moreover, for the same ω,

V_{T−1}(w,ω) = a − 2(b/c)√(P(ω_1)P(ω_{−1})) exp(−cw)/(P(ω_1) + P(ω_{−1})) − u(w).



Note that f(w) is independent of w. This is a manifestation of a well known
property of the exponential utility function: the risk adjusted value of a lottery
is not a function of the wealth. Indeed, it is not difficult to show the optimal
decision D_t is independent of current wealth for earlier time periods as well. For
example, with h(w,ω) = V_{T−1}(w,ω) + u(w) and V_{T−1} as above one gets h'(w,ω) =
g̃ exp(−cw), where g̃ is a function that may depend upon ω but not w. Substituting this
into the first-order optimality condition (9) yields f(w) = (1/2c) log(ḡ), where ḡ
is a function that may depend upon ω but not w. Finally, substituting this into the
expression for V_{T−2} shows that the only dependence of this function on the wealth w
is through the factor exp(−cw).
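The last-period computation above can be checked numerically. The sketch below is our own illustration, not part of the paper: the values of a, b, c and the two cell probabilities are chosen arbitrarily. It grid-searches the cell supremum and compares the maximizer with the closed form for f, confirming also that the maximizer does not depend on the wealth level.

```python
import math

# Illustrative check (values chosen here, not from the text): for
# u(w) = a - (b/c)exp(-c*w), the last-period optimizer should be
# f = (1/(2c)) log(P(omega_1)/P(omega_{-1})), independent of the wealth w.
a, b, c = 1.0, 2.0, 0.5
P1, Pm1 = 0.3, 0.1                      # P(omega_1), P(omega_{-1}) on one cell
p = P1 / (P1 + Pm1)                     # conditional probability of Z_T = 1

def u(w):
    return a - (b / c) * math.exp(-c * w)

def cell_value(w, d):                   # p*u(w+d) + (1-p)*u(w-d)
    return p * u(w + d) + (1 - p) * u(w - d)

f_closed = math.log(P1 / Pm1) / (2 * c)

# grid search near the closed form, at two different wealth levels
grid = [f_closed - 0.05 + k * 1e-4 for k in range(1001)]
f_at_w0 = max(grid, key=lambda d: cell_value(0.7, d))
f_at_w1 = max(grid, key=lambda d: cell_value(-2.0, d))

# closed-form cell value a - 2(b/c)sqrt(P1*Pm1)exp(-cw)/(P1+Pm1)
v_expected = a - 2 * (b / c) * math.sqrt(P1 * Pm1) * math.exp(-c * 0.7) / (P1 + Pm1)
```

The grid maximizer coincides with the closed-form f at both wealth levels, which is exactly the wealth-independence property discussed above.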

7. Solving the Example with Convex Analysis

To carry out the alternative solution technique described in Section 4, first


define a stochastic process M = {M_t; t = 0,1,...,T} by putting M_0 = 0 and

M_t = Z_1 + Z_2 + ... + Z_t, t = 1,2,...,T.

Thus M is a random walk on the integers, and under the probability measure P'(ω) =
(1/2)^T, M is a martingale with respect to F. Let E' denote the expectation operator
corresponding to P'.
Since Z_t = ΔM_t, it is clear that the wealth process W^D under any decision process
D can be represented as the stochastic integral of D with respect to the martingale
M, that is,

W^D_t = Σ_{s=1}^t D_s ΔM_s, t = 1,2,...,T.
By standard results, each such wealth process will be a martingale under P'. Further-
more, since P' is the unique probability measure equivalent to P under which M is a
martingale, it follows (see, e.g., Jacod [5, Ch. XI]) that every martingale (under
P') can be represented as a stochastic integral of a decision process with respect
to M.
The implications of this are as follows. Let Y denote the space of random var-
iables Y on Ω, and let W be as in Section 4, that is, W consists of all Y ∈ Y such
that Y = W^D_T for some D ∈ D. Since W^D is a martingale under P' null at zero, it
follows that E'[Y] = 0 for all Y ∈ W. Conversely, if Y ∈ Y satisfies E'[Y] = 0, then
upon considering the martingale N defined by N_t = E'[Y | F_t] it follows from the martin-
gale representation property described above that there exists some decision policy
D ∈ D such that W^D_T = Y. Hence

W = {Y ∈ Y: E'[Y] = 0}.
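On a small instance this characterization of W can be confirmed by brute force. The sketch below is our own construction (T and the particular predictable D are arbitrary): it enumerates Ω = {−1,1}^T and checks that the terminal wealth of a randomly chosen predictable decision process has zero mean under P'.

```python
import itertools, random

# Toy verification (construction assumed here, not given in the text): with
# Omega = {-1,1}^T and P'(omega) = (1/2)^T, the terminal wealth of any
# predictable decision process D has zero P'-expectation.
T = 4
Omega = list(itertools.product([-1, 1], repeat=T))

random.seed(0)
# predictable D: the value of D_t may depend only on omega_1, ..., omega_{t-1}
tables = [{hist: random.uniform(-1.0, 1.0)
           for hist in itertools.product([-1, 1], repeat=t)} for t in range(T)]

def terminal_wealth(omega):
    # W_T = sum_t D_t * Z_t with Z_t(omega) = omega_t
    return sum(tables[t][omega[:t]] * omega[t] for t in range(T))

mean_under_Pprime = sum(terminal_wealth(om) for om in Omega) / len(Omega)
```

The cancellation is exact: for each history, the two one-step continuations ω_t = ±1 contribute ±D_t with equal P'-weight.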

This completes the first step in the alternative solution technique. The second
step is to find an optimal terminal wealth, that is, some Ŷ ∈ W such that E[u(Ŷ)] ≥
E[u(Y)] for all Y ∈ W. This will be done with some convex optimization theory (see,
e.g., Rockafellar [7]).

Let Y* denote the space that is dual to Y under the linear functional E'[YY*].
Let W* denote the orthogonal complement of W, that is,

W* = {Y* ∈ Y*: E'[YY*] = 0, all Y ∈ W}.

Since E'[Y] = 0 for all Y ∈ W, it is clear that W* contains all the constant functions
in Y*. If Y* ∈ Y* is not constant, then one can readily find some Y ∈ W such that
E'[YY*] ≠ 0, so actually

W* = {Y* ∈ Y*: Y* is constant}.

Denoting U(Y) = E[u(Y)], the fact that u is concave means U is a concave func-
tional on Y. Hence step 2 amounts to solving the concave optimization problem

(11) maximize U(Y) subject to Y ∈ W.

Let U* denote the concave conjugate functional of U, that is, for each Y* ∈ Y*,

U*(Y*) = inf_{Y ∈ Y} {E'[YY*] − U(Y)}.
(12) Proposition. Ŷ solves (11) if and only if there exists some Ŷ* ∈ W* such that

(13) U*(Ŷ*) = E'[ŶŶ*] − U(Ŷ).

Proof. To show sufficiency, since E'[ŶŶ*] = 0 one has U*(Ŷ*) = −U(Ŷ). But the
definition of U* means U*(Ŷ*) ≤ E'[YŶ*] − U(Y) for all Y ∈ Y, so in particular
−U(Ŷ) = U*(Ŷ*) ≤ E'[YŶ*] − U(Y) = −U(Y) for all Y ∈ W, that is, Ŷ is optimal.
Conversely, by a version of the Fenchel duality theorem there exists some
Ŷ* ∈ W* such that U*(Ŷ*) = −U(Ŷ). Since E'[ŶŶ*] = 0, this means (13) holds. □

With g(ω) = dP(ω)/dP'(ω), the Radon–Nikodym derivative, let u*: Ω × R → R be the
concave conjugate function

u*(ω,y) = inf_{w ∈ R} {wy − g(ω)u(w)}.

Since U(Y) = ∫ u(Y(ω))dP(ω) = ∫ u(Y(ω))g(ω)dP'(ω), by Rockafellar's [7] results on
integral functionals one has

(14) U*(Y*) = ∫ u*(ω, Y*(ω))dP'(ω).
This leads to the following.
(15) Proposition. Ŷ solves (11) if and only if there exists a positive, constant
function Ŷ* ∈ W* such that

(16) u*(ω, Ŷ*(ω)) = Ŷ(ω)Ŷ*(ω) − g(ω)u(Ŷ(ω)), all ω ∈ Ω.

Moreover,

(17) U(Ŷ) = −sup_{y ∈ R} ∫ u*(ω,y)dP'(ω).

Proof. By Rockafellar [7], (13) holds for Ŷ ∈ W and constant Ŷ* if and only if (16)
holds. Equation (14) and the fact that U(Ŷ) = −sup_{Y* ∈ W*} U*(Y*) imply

U(Ŷ) = −sup_{y ∈ R} ∫ u*(ω,y)dP'(ω). This supremum is attained by y = Ŷ*,
and Ŷ* ≥ 0 because u*(ω,y) = −∞ for y < 0. □


Proposition (15) can readily be used to solve for the optimal Ŷ, as will shortly
be illustrated. First, however, mention should be made of how to carry out the third
and final step of the alternative solution technique, namely, finding the D ∈ D such
that W^D_T = Ŷ.
The idea is very simple. Let Ŵ denote the wealth process under the optimal de-
cision process D̂. It has just been determined that Ŵ_T = Ŷ. As was stated previously,
Ŵ is a martingale under P', so

(18) Ŵ_t = E'[Ŷ | F_t], t = 0,1,...,T.

It is also known that Ŵ can be represented as a stochastic integral with respect to
M, that is,

Ŵ_t = Σ_{s=1}^t D̂_s ΔM_s

for some D̂ ∈ D. Hence to determine D̂ it remains to solve this easy representation
problem.
Returning to step 2, the use of Proposition (15) will be illustrated with the
specific utility function already studied in Section 6, namely u(w) = a - (b/c)exp(-cw).
One begins by computing

u*(ω,y) = −∞, y < 0;
u*(ω,y) = −ag(ω), y = 0;
u*(ω,y) = (y/c)log(bg(ω)/y) − ag(ω) + y/c, y > 0.

During this computation one notes that for y > 0 the argument in the definition
of u* is minimized by

(19) w = (1/c) log(bg(ω)/y).
Next, for any y > 0 one computes

E'[u*(ω,y)] = (y/c)[log b + E'[log g] − log y + 1] − a,

since E'[g] = 1. This expression is maximized by

ŷ = b exp(E'[log g]),

so Ŷ*(ω) = ŷ for all ω. Substituting this back into E'[u*(ω,y)] and using (17) gives

U(Ŷ) = a − ŷ/c = a − (b/c)exp(E'[log g])

for the optimal value of the objective function.


To compute Ŷ one can solve equation (16). It is apparent the solution is given
by (19) with y = ŷ, that is,

(20) Ŷ(ω) = (1/c)(log(g(ω)) − E'[log g]).
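On a toy instance, formula (20) can be verified against the optimal value a − (b/c)exp(E'[log g]) and against the optimality claim of Proposition (15). In the sketch below the probability measure P is randomly generated, purely for illustration; nothing here is prescribed by the text.

```python
import itertools, math, random

# Numerical check of (20) on a toy instance (the probabilities below are
# illustrative, not from the text): Y-hat has E'[Y-hat] = 0, attains the
# value a - (b/c)exp(E'[log g]), and dominates other attainable wealths.
a, b, c, T = 1.0, 2.0, 0.5, 3
Omega = list(itertools.product([-1, 1], repeat=T))
random.seed(1)
raw = [random.uniform(0.5, 1.5) for _ in Omega]
P = {om: r / sum(raw) for om, r in zip(Omega, raw)}     # P(omega) > 0
Pp = 2.0 ** (-T)                                        # P'(omega) = (1/2)^T
g = {om: P[om] / Pp for om in Omega}                    # g = dP/dP'
Elogg = sum(Pp * math.log(g[om]) for om in Omega)       # E'[log g]

Yhat = {om: (math.log(g[om]) - Elogg) / c for om in Omega}
EpYhat = sum(Pp * Yhat[om] for om in Omega)
value = sum(P[om] * (a - (b / c) * math.exp(-c * Yhat[om])) for om in Omega)
value_closed = a - (b / c) * math.exp(Elogg)

dominated = True
for _ in range(100):
    Z = [random.uniform(-1.0, 1.0) for _ in Omega]
    m = sum(Pp * z for z in Z)
    Y = [z - m for z in Z]                              # arbitrary Y with E'[Y] = 0
    EuY = sum(P[om] * (a - (b / c) * math.exp(-c * y))
              for om, y in zip(Omega, Y))
    dominated = dominated and EuY <= value + 1e-12
```

The dominance loop is a direct (if crude) check of step 2: every attainable terminal wealth is characterized by E'[Y] = 0, and none beats Ŷ.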



This completes step 2.


For step 3 one could use (18) to solve for the optimal wealth process Ŵ and then
use

(21) Ŵ_t(ω) = Ŵ_{t−1}(ω) + D̂_t(ω)ΔM_t(ω)
            = Ŵ_{t−1}(ω) + D̂_t(ω)Z_t(ω)
to solve for D̂. Alternatively, (21) can be used in a recursive manner to compute Ŵ
and D̂ simultaneously if one keeps track of the various partitions (corresponding to
the σ-algebras F_t, as explained in Section 6). For example, suppose {ω_1, ω_{−1}} is one
cell in the partition corresponding to F_{T−1} with Z_T(ω_1) = 1 and Z_T(ω_{−1}) = −1.
Substituting into (21) gives

(22) Ŵ_T(ω_1) = Ŵ_{T−1}(ω_1) + D̂_T(ω_1)
     Ŵ_T(ω_{−1}) = Ŵ_{T−1}(ω_{−1}) − D̂_T(ω_{−1}).

But Ŵ_{T−1} and D̂_T, being F_{T−1} measurable, are constant over the cell {ω_1, ω_{−1}}, so these
two equations suffice to solve for the two constant values. Indeed, Ŵ_{T−1} and D̂_T
can be determined by solving 2^T = |Ω| such equations, after which Ŵ_{T−2} and D̂_{T−1}
can be determined by solving 2^{T−1} equations, and so forth. Overall, to solve for Ŵ
and D̂ one needs to solve 2^{T+1} − 2 equations of the form (22). Note that with Ŵ_T = Ŷ
given by (20), the equations in (22) imply D̂_T(ω_1) = (1/2c)log(P(ω_1)/P(ω_{−1})), the same
as the answer computed by dynamic programming in Section 6.
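The backward recursion just described is easy to mechanize. The sketch below again uses an arbitrary, randomly generated P (our own toy data): starting from Ŵ_T = Ŷ of (20), each pass solves the pair of equations (22) on every cell, producing Ŵ_{t−1} and D̂_t, and the last-period decision is compared with the dynamic-programming answer of Section 6.

```python
import itertools, math, random

# Sketch of the recursion (22) on a toy instance (probabilities illustrative):
# starting from W-hat_T = Y-hat of (20), peel off one period at a time and
# check D-hat_T against the dynamic-programming answer of Section 6.
c, T = 0.5, 3
Omega = list(itertools.product([-1, 1], repeat=T))
random.seed(2)
raw = [random.uniform(0.5, 1.5) for _ in Omega]
P = {om: r / sum(raw) for om, r in zip(Omega, raw)}
Pp = 2.0 ** (-T)
Elogg = sum(Pp * math.log(P[om] / Pp) for om in Omega)
W = {om: (math.log(P[om] / Pp) - Elogg) / c for om in Omega}   # W-hat_T

D = {}                  # D[t][history omega_1..omega_{t-1}] = D-hat_t on that cell
for t in range(T, 0, -1):
    Wprev, Dt = {}, {}
    for hist in itertools.product([-1, 1], repeat=t - 1):
        up, dn = W[hist + (1,)], W[hist + (-1,)]   # the two equations (22)
        Dt[hist] = (up - dn) / 2.0
        Wprev[hist] = (up + dn) / 2.0              # = E'[Y-hat | F_{t-1}] on the cell
    D[t] = Dt
    W = Wprev

dp_answer = math.log(P[(1, 1, 1)] / P[(1, 1, -1)]) / (2 * c)
```

After the loop, W holds the single value Ŵ_0, which must be 0 since Ŵ_0 = E'[Ŷ] = 0.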

8. Concluding Remarks

It should be apparent the alternative solution technique can be applied to


considerably more general stochastic decision models. The essential feature is that
each wealth process can be represented as a stochastic integral of the decision pro-
cess with respect to an environmental process that, with a change of measure, is a
martingale. The example assumed this change of measure was unique, but this require-
ment can be relaxed without too much difficulty.
The alternative solution technique is successful when one is able to give the
environmental process special structure, namely, the martingale property. A research
topic deserving attention is to see what other kinds of simplifications result when
the environmental process is given other kinds of structure, such as the Markov
property.
Another topic, duality theory for the stochastic decision problem, will be
presented in a forthcoming paper.

References

1. Bertsekas, D. P., Dynamic Programming and Stochastic Control, Academic Press,


New York, 1976.

2. Davis, M. H. A., "Martingale Methods in Stochastic Control," Lecture Notes in


Control and Information Sciences 16, Springer-Verlag, New York - Berlin, 1979.

3. Dynkin, E. B., "Controlled Random Sequences," Theory of Probability and Its


Applications 10(1966), 1-14.

4. Gihman, I. I., and A. V. Skorohod, Controlled Stochastic Processes, Springer-


Verlag, New York - Berlin, 1979.

5. Jacod, J., Calcul Stochastique et Problèmes de Martingales, Lecture Notes in


Mathematics 714, Springer-Verlag, New York - Berlin, 1979.

6. Rishel, R., "Necessary and Sufficient Dynamic Programming Conditions for


Continuous Time Stochastic Optimal Control," SIAM J. Control 8(1970),
559-571.

7. Rockafellar, R. T., "Conjugate Duality and Optimization," Regional Conference


Series in Applied Mathematics 16, Society for Industrial and Applied Mathematics,
Philadelphia, 1974.
ON THE APPROXIMATION OF CONTROLLED
JUMP DIFFUSION PROCESSES

H. Pragarauskas
Institute of Mathematics and Cybernetics
Academy of Sciences of the Lithuanian SSR
Vilnius, K. Poželos 54, USSR

Let R^d be a d-dimensional Euclidean space, T ∈ (0,∞), H_T = [0,T] × R^d,
S_R = {x ∈ R^d: |x| < R}, (A, B(A)) be a separable metric space with Borel σ-algebra,
an integer d_1 ≥ 1 and L_2(R^d,Π) be the space of functions u: R^d → R^d such
that ||u||_{2,Π} = {∫|u(z)|² Π(dz)}^{1/2} < ∞, where Π(dz) = dz/|z|^{d+1}.
Suppose that for all α ∈ A, (t,x) ∈ H_T we are given: a matrix σ(α,t,x) of
dimension d × d_1, a d-dimensional vector b(α,t,x), an element c(α,t,x,·) of
L_2(R^d,Π) and a number g(x).
1. Condition. a) σ, b are continuous in α, continuous in t uniformly in
α; c is Borel in (α,t,x,z) and, in the sense of the norm ||·||_{2,Π}, continuous in
α, continuous in t uniformly in α; g is continuous. For all (t,x) ∈ H_T

sup_{α ∈ A} ∫_{|z| ≤ ε} |c(α,t,x,z)|² Π(dz) → 0

as ε → 0.

b) For some constants m, K ≥ 0 and all α ∈ A, (t,x) ∈ H_T, y ∈ R^d,

||σ(α,t,x)|| + |b(α,t,x)| + ||c(α,t,x,·)||_{2,Π} ≤ K(1 + |x|),

||σ(α,t,x) − σ(α,t,y)|| + |b(α,t,x) − b(α,t,y)| +
||c(α,t,x,·) − c(α,t,y,·)||_{2,Π} ≤ K|x − y|,

|g(x)| ≤ K(1 + |x|)^m.

Let (Ω,F,P) be a complete probability space with an increasing family
(F_t, t ≥ 0) of complete σ-algebras F_t ⊂ F, (w_t,F_t) be a d_1-dimensional Wiener pro-
cess, (z_t,F_t) be a d-dimensional Cauchy process independent of w_t with Lévy mea-
sure Π, and q(dt dz) be the Poisson martingale measure constructed from the jumps of
z_t.
Let 𝔄 be the set of all processes α_t(ω) progressively measurable w.r. to
(F_t) having values in A. To each strategy α ∈ 𝔄, (s,x) ∈ H_T we set into corres-
pondence a solution x_t^{α,s,x} of Itô's equation

x_t = x + ∫_s^t σ(α_u,u,x_u)dw_u + ∫_s^t b(α_u,u,x_u)du + ∫_s^t ∫ c(α_u,u,x_u,z)q(du dz).

For (s,x) ∈ H_T let

v(s,x) = sup_{α ∈ 𝔄} E g(x_T^{α,s,x}).
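In the special case of a single control value and no jump part, the payoff E g(x_T^{α,s,x}) can be computed by plain Monte Carlo. The sketch below uses our own illustrative choices (d = d_1 = 1, σ ≡ 1, b(t,x) = −x, c ≡ 0, g(x) = x², s = 0, T = 1) — none of them come from this paper — and compares a crude Euler–Maruyama estimate with the closed-form Ornstein–Uhlenbeck value.

```python
import math, random

# Illustrative special case (all modelling choices are ours, not the paper's):
# one control value, no jumps (c = 0), d = d_1 = 1, sigma = 1, b(t,x) = -x,
# g(x) = x^2.  Then v(0,x) = E g(x_1) is known in closed form for this
# Ornstein-Uhlenbeck process, and a crude Euler-Maruyama Monte Carlo
# estimate should reproduce it.
random.seed(3)
x0, T, n_steps, n_paths = 1.0, 1.0, 100, 20000
dt = T / n_steps

total = 0.0
for _ in range(n_paths):
    x = x0
    for _ in range(n_steps):
        x += -x * dt + math.sqrt(dt) * random.gauss(0.0, 1.0)
    total += x * x
v_estimate = total / n_paths

v_exact = x0 * x0 * math.exp(-2 * T) + (1 - math.exp(-2 * T)) / 2
```

The tolerance below allows for both the Monte Carlo noise and the O(Δt) weak discretization bias of the Euler scheme.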

Fix an arbitrary s ∈ [0,T). Let {I_n} be a sequence of embedded subdivisions
s = t_0^n < t_1^n < ... < t_n^n = T of the interval [s,T] with diam I_n → 0 as n → ∞.
Suppose that for all α ∈ A, n = 1,2,... we are given one-step Markov transition
functions p_n^α(t_i^n, x, t_{i+1}^n, Γ) ≡ p_{ni}^α(x,Γ), i = 0,1,...,n−1, x ∈ R^d, Γ ∈ B(R^d), which
are Borel in (α,x). Let B_n be the class of all families β(n) = (q_0^n,...,q_{n−1}^n)
of functions q_0^n(dα_0|x_0), q_i^n(dα_i|x_0,α_0,...,α_{i−1},x_i), 1 ≤ i ≤ n−1, which are
probability measures on B(A) in the first argument and Borel in the other argu-
ments. An initial point (s,x), a strategy β(n) ∈ B_n and a family of transition
functions define a controlled process (ξ_n(s) = x, α_n(s),...,α_n(t_{n−1}^n), ξ_n(T)) on the
probability space (Ω_n, B(Ω_n), Q_{s,x}^{β(n)}), where Ω_n = (R^d)^{n+1} × A^n (see §6, ch. 1 [1]).
2. Condition. For some constant m' > m and every x ∈ R^d

sup_n sup_{β(n) ∈ B_n} E_{s,x}^{β(n)} sup_{t ∈ I_n} |ξ_n(t)|^{m'∨2} < ∞.

For x ∈ R^d, n = 1,2,... let

v_n(s,x) = sup_{β(n) ∈ B_n} E_{s,x}^{β(n)} g(ξ_n(T)).

Let D[s,T] be the space of all right continuous functions x_t with left hand
limits, with values in R^d, defined on [s,T], endowed with the Skorokhod topology. Set
𝒟[s,T] = σ{x_t ∈ Γ; s ≤ t ≤ T, Γ ∈ B(R^d)}.
Let us define the process ξ_t^n by setting ξ_t^n = ξ_n(t_i^n) if t ∈ [t_i^n, t_{i+1}^n). Denote
by P_{s,x}^{β(n)} the measure on (D[s,T], 𝒟[s,T]) induced by ξ^n.
3. Condition. For an arbitrary sequence {β(n)} of strategies β(n) ∈ B_n the
sequence of measures {P_{s,x}^{β(n)}} is tight on (D[s,T], 𝒟[s,T]).
For all α ∈ A, (t,x) ∈ H_T define a measure π(α,t,x,·) on B(R^d) by the form-
ula: π(α,t,x,dy) = Π(z: x + c(α,t,x,z) ∈ dy\{x}), π(α,t,x,{x}) = 0. Set
Δ_i^n = t_{i+1}^n − t_i^n, a = σσ* (σ* is the transpose matrix of σ).

4. Condition. For some number ρ > 0, every R > 0 and every continuous
bounded function φ on R^d which vanishes in some neighborhood of the origin,

Σ_{i=0}^{n−1} sup_{|x| ≤ R} sup_{α ∈ A} Σ_{j=1}^d | ∫_{|y−x| ≤ ρ} (y − x)_j p_{ni}^α(x,dy) − Δ_i^n b_j(α,t_i^n,x) | → 0,

Σ_{i=0}^{n−1} sup_{|x| ≤ R} sup_{α ∈ A} Σ_{j,k=1}^d | ∫_{|y−x| ≤ ρ} (y − x)_j(y − x)_k p_{ni}^α(x,dy) − Δ_i^n [a_{jk}(α,t_i^n,x) +
+ ∫(y − x)_j(y − x)_k π(α,t_i^n,x,dy)] | → 0,

Σ_{i=0}^{n−1} sup_{|x| ≤ R} sup_{α ∈ A} | ∫ φ(y − x) p_{ni}^α(x,dy) − Δ_i^n ∫ φ(y − x) π(α,t_i^n,x,dy) | → 0

as n → ∞.
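For orientation, these local-moment conditions can be checked explicitly for the simplest approximating chains. The sketch below is our own one-dimensional illustration with no jump part (π = 0) and constant coefficients: it takes the Euler kernels p_{ni}(x,·) = N(x + bΔ_i^n, aΔ_i^n) and verifies, via closed-form truncated Gaussian moments, that the first two sums above vanish as the mesh is refined.

```python
import math

# Illustrative check (one dimension, no jump part, pi = 0; coefficients are
# our choices): the Euler kernels p_ni(x,.) = N(x + b*Delta, a*Delta) match
# the truncated local moments required by Condition 4 up to errors that
# vanish when summed over the subdivision.
b_coef, a_coef, rho, T_len, n = 1.0, 1.0, 0.5, 1.0, 1000
delta = T_len / n

def Phi(z):                      # standard normal cdf
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def phi(z):                      # standard normal density
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

mu, s = b_coef * delta, math.sqrt(a_coef * delta)
alpha, beta = (-rho - mu) / s, (rho - mu) / s
mass = Phi(beta) - Phi(alpha)
# closed-form truncated moments of N(mu, s^2) on [-rho, rho]
m1 = mu * mass - s * (phi(beta) - phi(alpha))
m2 = (mu * mu + s * s) * mass + 2 * mu * s * (phi(alpha) - phi(beta)) \
     + s * s * (alpha * phi(alpha) - beta * phi(beta))

err_mean_total = n * abs(m1 - b_coef * delta)   # first sum in Condition 4
err_var_total = n * abs(m2 - a_coef * delta)    # second sum (pi = 0)
```

The first-moment error is a Gaussian tail term and is negligible; the second-moment error per step is of order (bΔ)², so its sum over the subdivision is O(Δ) and vanishes with the mesh.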

5. Theorem. Let the conditions 1–4 hold. Then v_n(s,x) → v(s,x) as n → ∞
for all x ∈ R^d.

The particular cases of this problem were considered by H. J. Kushner [2],


H. J. Kushner and G. DiMasi [3] and J. P. Quadrat [7]. These authors did not use
the fact that under suitable assumptions the payoff function v satisfies the
Bellman equation.
Denote by C^{1,2} the set of functions u = u(t,x) on R^{d+1} twice continuously
differentiable in x and once continuously differentiable in t. Let

L^α(t,x)u(r,x) = u_r(r,x) + (1/2) Σ_{i,j=1}^d a_{ij}(α,t,x) u_{x_ix_j}(r,x) +
+ Σ_{i=1}^d b_i(α,t,x) u_{x_i}(r,x) + ∫[u(r,y) − u(r,x) − Σ_{i=1}^d u_{x_i}(r,x)(y − x)_i] π(α,t,x,dy),

L^α u(t,x) ≡ L^α(t,x)u(t,x),

F u(t,x) ≡ sup_{α ∈ A} L^α u(t,x). (1)
Fix x ∈ R^d. Let {β(n)} be an arbitrary sequence of strategies β(n) ∈ B_n,
(ξ_n(s), α_n(s),...,α_n(t_{n−1}^n), ξ_n(T)) the process controlled by strategy β(n), P_{s,x}^n
the measure on (D[s,T], 𝒟[s,T]) induced by the process ξ^n(t) = ξ_n(t_i^n),
t ∈ [t_i^n, t_{i+1}^n), i = 0,1,...,n−1.
By assumption 3 there exist a subsequence {m} ⊂ {n} and a measure P_{s,x} on
(D[s,T], 𝒟[s,T]) such that the measures P_{s,x}^m converge weakly to P_{s,x} as m → ∞.
By the lemma from §6, ch. 1 [5] there exist a probability space (Ω̃,F̃,P̃) and pro-
cesses ξ̃^m(·), ξ̃(·) defined on this space such that the processes (x(·), P_{s,x}^m) and
(ξ̃^m(·), P̃) as well as the processes (x(·), P_{s,x}) and (ξ̃(·), P̃) are equivalent and

ξ̃^m(t) → ξ̃(t) in probability as m → ∞ for every t ∈ [s,T].

Denote by F̃_t the completion (with respect to P̃) of σ{ξ̃(r), s ≤ r ≤ t}.

6. Lemma. Let the assumptions 1–4 hold, u ∈ C^{1,2}, R > 0, and let τ be the time of
first exit of (t, ξ̃(t)) from [s,T) × S_R.
Then

u(t∧τ, ξ̃(t∧τ)) − u(s,x) − ∫_s^{t∧τ} Fu(r, ξ̃(r))dr

is a (F̃_t, P̃) supermartingale.

Proof. Without loss of generality we can suppose that u(t,x) = 0 if
x ∉ S_{R+2}. Then Fu is a finite continuous function.
Denote by F_k^n the completion (with respect to Q_{s,x}^{β(n)}) of
σ{ξ_n(s), α_n(s),...,ξ_n(t_k^n), α_n(t_k^n)}.
The sequence

u(t_k^n, ξ_n(t_k^n)) − u(s,x) − Σ_{i=0}^{k−1} E_{s,x}^{β(n)} [u(t_{i+1}^n, ξ_n(t_{i+1}^n)) − u(t_i^n, ξ_n(t_i^n)) | F_i^n] =
= u(t_k^n, ξ_n(t_k^n)) − u(s,x) − Σ_{i=0}^{k−1} ∫ [u(t_{i+1}^n, y) − u(t_i^n, ξ_n(t_i^n))] p_{ni}^{α_n(t_i^n)}(ξ_n(t_i^n), dy)

is a (F_k^n, Q_{s,x}^{β(n)}) martingale. From this it follows that

η_t^n = u(t, x(t)) − u(s,x) − Σ_{i: t_{i+1}^n ≤ t} sup_{α ∈ A} ∫ [u(t_{i+1}^n, y) − u(t_i^n, x(t_i^n))] p_{ni}^α(x(t_i^n), dy)

is a (𝒟[s,t], P_{s,x}^n) supermartingale.
Let

η̃_t^m = u(t, ξ̃^m(t)) − u(s,x) − Σ_{i: t_{i+1}^m ≤ t} sup_{α ∈ A} ∫ [u(t_{i+1}^m, y) − u(t_i^m, ξ̃^m(t_i^m))] p_{mi}^α(ξ̃^m(t_i^m), dy).

Suppose that

(2) η̃_t^m → η̃_t ≡ u(t, ξ̃(t)) − u(s,x) − ∫_s^t Fu(r, ξ̃(r))dr

in probability as m → ∞ for every t ∈ [s,T]. Then it is easy to prove that η̃_t
is a (F̃_t, P̃) supermartingale. This implies the required result.
Therefore it suffices to prove (2). To this end it suffices to prove that for
every R > 0

(3) Σ_{i=0}^{m−1} sup_{|x| ≤ R} sup_{α ∈ A} | ∫ [u(t_{i+1}^m, y) − u(t_i^m, x)] p_{mi}^α(x,dy) − Δ_i^m L^α u(t_i^m, x) | → 0

as m → ∞.
Let ρ be the number from assumption 4, ε ∈ (0,1), and let ψ_ε, φ_ρ be continuous func-
tions on R^d with values in [0,1] such that ψ_ε(x) = 1 if |x| ≤ ε/2, ψ_ε(x) = 0
if |x| > ε, φ_ρ(x) = 1 if |x| ≤ ρ and φ_ρ(x) = 0 if |x| > ρ + ε.
Using Taylor's expansion, we obtain

∫[u(t,y) − u(r,x)]p^α(x,dy) − ΔL^α u(r,x) = u(t,x) − u(r,x) −
− Δu_t(t,x) + ΔL^α(r,x)[u(t,x) − u(r,x)] +
+ Σ_{k=1}^d u_{x_k}(t,x)[∫_{|y−x| ≤ ρ} (y − x)_k p^α(x,dy) − Δb_k(α,r,x)] +
+ (1/2) Σ_{k,ℓ=1}^d u_{x_kx_ℓ}(t,x)[∫_{|y−x| ≤ ρ} (y − x)_k(y − x)_ℓ p^α(x,dy) − Δa_{kℓ}(α,r,x) −
− Δ∫(y − x)_k(y − x)_ℓ π(α,r,x,dy)] −
− ∫[φ_ρ − ψ_ε](y − x) R_1(t,x,y)[p^α(x,dy) − Δπ(α,r,x,dy)] +
+ ∫[1 − ψ_ε](y − x)[u(t,y) − u(t,x)][p^α(x,dy) − Δπ(α,r,x,dy)] +
+ ∫_{|y−x| > ρ} φ_ρ(y − x) R_1(t,x,y) p^α(x,dy) +
+ ∫ψ_ε(y − x) R(t,x,y)[p^α(x,dy) − Δπ(α,r,x,dy)],

where

R_1(t,x,y) = Σ_{k=1}^d u_{x_k}(t,x)(y − x)_k + (1/2) Σ_{k,ℓ=1}^d u_{x_kx_ℓ}(t,x)(y − x)_k(y − x)_ℓ,

R(t,x,y) = u(t,y) − u(t,x) − R_1(t,x,y),

and for convenience we use the notation t_i^m = r, t_{i+1}^m = t, Δ_i^m = Δ, p_{mi}^α(x,dy) = p^α(x,dy).

Substituting these expressions in the left hand side of (3) and using the assump-
tions of the lemma it is easy to prove (3). The lemma is proved.
Let ε > 0. Denote by A_ε the set of all matrices a of dimension d × d_1
with elements a_{ij} ∈ [−ε,ε], B_ε the set of all d-dimensional vectors with compo-
nents b_i ∈ [−ε,ε], and C_ε the set of all elements c ∈ L_2(R^d,Π) such that ||c||_{2,Π} ≤ ε.
Let X_ε = A_ε × B_ε × C_ε, Ā = A × X_ε. We denote the elements of Ā by ᾱ = (α,χ),
where α ∈ A, χ ∈ X_ε.
Let ζ_1(x), ζ_2(t) be nonnegative infinitely differentiable functions of arguments
x ∈ R^d, t ∈ R^1 equal to zero for |x| ≥ 1, |t| ≥ 1 and such that ∫ζ_1(x)dx = ∫ζ_2(t)dt =
1. Let ζ_n(t,x) = n^{d+1} ζ_1(nx) ζ_2(nt), n = 1,2,...

Denote by σ^{(n)}, b^{(n)}, c^{(n)}, g^{(n)} the convolutions of the functions σ, b, c, g with the
function ζ_n with respect to (t,x) (in computing these convolutions we assume that
σ(α,t,x) = σ(α,0,x) for t ≤ 0, σ(α,t,x) = σ(α,T,x) for t ≥ T, etc.).
Let ᾱ ∈ Ā. Furthermore, let

σ_ε^n(ᾱ,t,x) = σ^{(n)}(α,t,x) + a^ε, a^ε ∈ A_ε,

b_ε^n(ᾱ,t,x) = b^{(n)}(α,t,x) + b^ε, b^ε ∈ B_ε,

c_ε^n(ᾱ,t,x,z) = c^{(n)}(α,t,x,z) + c^ε(z), c^ε(·) ∈ C_ε.

Replacing here σ^{(n)}, b^{(n)}, c^{(n)} by σ, b, c we construct functions σ_ε, b_ε, c_ε.
Using the collections (Ā, σ_ε^n, b_ε^n, c_ε^n, g^{(n)}) and (Ā, σ_ε, b_ε, c_ε, g) we construct controlled
processes x_t^{ᾱ,s,x}(n,ε), x_t^{ᾱ,s,x}(ε) and payoff functions v_{nε}, v_ε in the same way as
we constructed the above controlled process x_t^{α,s,x} and the payoff function v on
the basis of the collection (A, σ, b, c, g).

7. Lemma. Let assumption 1 hold. Then v_{nε} → v_ε as n → ∞ and v_ε → v as ε → 0,
uniformly on every bounded subset of H_T.
The proof of this lemma is analogous to the proof of Theorem III.1.12 [6].

8. Proof of Theorem 5. First we shall prove the inequality

(4) lim sup_{k→∞} v_k(s,x) ≤ v(s,x).

Let the subsequence {n} ⊂ {k} be such that lim sup_{k→∞} v_k(s,x) = lim_{n→∞} v_n(s,x). If
{β(n)} is a sequence of ε-optimal strategies, then for some subsequence {n'} ⊂ {n}
we have

v_{n'}(s,x) − ε ≤ E_{s,x}^{β(n')} g(ξ_{n'}(T)) = E g(ξ̃^{n'}(T)) → E g(ξ̃(T))

as n' → ∞. Here ε > 0 is an arbitrary positive number. Therefore, for (4) it
suffices to show that E g(ξ̃(T)) ≤ v(s,x).

Fix ε > 0, R > 0. Let

F(t,x) = {a(α,t,x), b(α,t,x), c(α,t,x,·); α ∈ A},

F_{nε}(t,x) = {a_ε^n(ᾱ,t,x), b_ε^n(ᾱ,t,x), c_ε^n(ᾱ,t,x,·); ᾱ ∈ Ā},

where a_ε^n = σ_ε^n[σ_ε^n]*. Using assumption 1 we derive that ||a^{(n)} − a|| → 0, |b^{(n)} − b| → 0,
||c^{(n)} − c||_{2,Π} → 0 as n → ∞ uniformly over A × [0,T] × S_R, where a^{(n)} = σ^{(n)}[σ^{(n)}]*.
Therefore, for sufficiently large n

(5) F(t,x) ⊂ F_{nε}(t,x)

for all (t,x) ∈ [0,T] × S_R.


The functions σ_ε^n, b_ε^n, c_ε^n, g^{(n)} are smooth in (t,x) and for all (t,x) ∈ H_T

inf_{|y|=1} sup_{ᾱ ∈ Ā} (a_ε^n(ᾱ,t,x)y, y) > 0.

By Theorem 1.4 [4], there exist locally bounded partial derivatives v_{nε,t}, v_{nε,x_i},
v_{nε,x_ix_j} in the Sobolev sense such that F_{nε} v_{nε} = 0 (a.e. on H_T) and v_{nε}(T,·) = g^{(n)}(·),
where F_{nε} is the operator defined by the formula (1) if we replace in this form-
ula A, a, b, c by Ā, a_ε^n, b_ε^n, c_ε^n. Therefore, using (5) we obtain that for suffi-
ciently large n

(6) F v_{nε} ≤ 0 a.e. on [0,T] × S_R.

Denote by v_{nε}^{(m)}, (L^α v_{nε})^{(m)} the convolutions with respect to (t,x) of the functions
v_{nε}, L^α v_{nε} with the function ζ_m. Fix a number h such that s + h < T − h.
From Lemma 6 and (6) we obtain that for m > 1/h and sufficiently large n

(7) E v_{nε}^{(m)}(γ_2, ξ̃(γ_2)) − E v_{nε}^{(m)}(γ_1, ξ̃(γ_1)) ≤
≤ E ∫_{γ_1}^{γ_2} F v_{nε}^{(m)}(r, ξ̃(r))dr ≤ ∫_{s+h}^{T−h} δ_m(t)dt,

where γ_1 = (s + h) ∧ τ_R, γ_2 = (T − h) ∧ τ_R, τ_R is the time of first exit of ξ̃(t)
from S_R and

δ_m(t) = sup_{|x| ≤ R} sup_{α ∈ A} |(L^α v_{nε}^{(m)} − (L^α v_{nε})^{(m)})(t,x)|.

Using assumption 1 it is easy to derive that the functions δ_m are bounded on
(s + h, T − h) uniformly with respect to m and δ_m(t) → 0 as m → ∞ for every
t ∈ (s + h, T − h). Letting m → ∞ in (7) we obtain

E v_{nε}(γ_2, ξ̃(γ_2)) ≤ E v_{nε}(γ_1, ξ̃(γ_1)).

From this inequality, in view of the arbitrariness of h and the stochastic continuity of ξ̃(·), we
obtain

E v_{nε}(γ_3, ξ̃(γ_3)) ≤ v_{nε}(s,x),

where γ_3 = T ∧ τ_R. Letting n → ∞, ε → 0, R → ∞ in this inequality and using Lemma

7 we obtain (4).
Now for the proof of the theorem it suffices to show that

(8) lim inf_{k→∞} v_k(s,x) ≥ v(s,x).

Fix ε > 0. Using the same arguments as in Corollary III.2.9 [6] we conclude
that there exist a subdivision I_k of an interval [s,T], a finite subset
A_0 = {α^1,...,α^N} ⊂ A and a strategy α^0 ∈ 𝔄 having values in A_0, α^0(t) = α^0(t_i^k) if
t ∈ [t_i^k, t_{i+1}^k), such that

E g(x_T^{α^0,s,x}) ≥ v(s,x) − ε.

Denote by ξ_n(t) the process controlled by the strategy β(n) = (q_0^n,...,q_{n−1}^n) ∈ B_n,
where

q_0^n(α_0 = β | x_0) = P(α^0(s) = β),

q_j^n(α_j = β | x_0, α_0,...,α_{j−1}, x_j) = P(α^0(t_j^n) = β | α^0(s) = α_0,...,α^0(t_{j−1}^n) = α_{j−1}),
j = 1,2,...,n, n ≥ k, β ∈ A_0.

By the assumption 3, for some subsequence {n'} ⊂ {n} the measures P_{s,x}^{n'} in-
duced by ξ^{n'}(·) on (D[s,T], 𝒟[s,T]) converge weakly to some measure P_{s,x}. It
is not difficult to show that the measure on (D[s,T], 𝒟[s,T]) induced by
x^{α^0,s,x} coincides with P_{s,x}. From this, in view of the arbitrariness of ε > 0, (8)
follows. (8) together with (4) proves the theorem.

9. Remark. The complete proof of the theorem and related results will be pub-
lished in Lith. Math. J., vol. XXIII (1983).

REFERENCES

1. I. I. Gichman, A. V. Skorochod, "Controlled stochastic processes", Kiev, Naukova


dumka, 1977 (in Russian).

2. H. J. Kushner, "Probability Methods for Approximations in Stochastic Control and


for Elliptic Equations", Academic Press, New York, 1977.

3. H. J. Kushner, G. DiMasi, Approximations for Functionals and Optimal Control


Problems on Jump Diffusion Processes, J. Math. Anal. Appl. 63 (1978), 772-800.

4. H. Pragarauskas, On Bellman equation for weakly nondegenerated general stochas-


tic processes, Liet. matem. rink., 20 (1980), 129-136 (in Russian).

5. A. V. Skorochod, "Studies in the Theory of Random Processes", Kiev, Naukova


dumka, 1961 (in Russian).

6. N. V. Krylov, "Controlled Diffusion Processes", Moscow, Nauka, 1977 (in Russian).

7. J. P. Quadrat, Existence de solution et algorithme de resolution numerique de


probleme de controle optimal de diffusion stochastique degeneree ou non, SIAM
J. Cont. Opt., vol. 18, N2, 1980, 199-226.
ON OPTIMAL STOCHASTIC CONTROL PROBLEM
OF LARGE SYSTEMS

J.P. QUADRAT
Domaine de Voluceau - B.P. 105
78153 LE CHESNAY Cédex

I - INTRODUCTION.

We discuss three different approaches, leading to numerical methods, for the


solution of optimal stochastic control problem of large dimension :

- Optimization in the class of local feedbacks,


- Monte Carlo and stochastic gradient techniques,
- Perturbation methods in the small intensity noise case.

We consider the stochastic control problem of diffusion processes in the
complete observation case

(1) dX_t = b(X_t, U_t)dt + dW_t, X_t ∈ R^n, U_t ∈ R^m,

    V(0,y) = Min_U E { ∫_0^{+∞} e^{−λt} C(X_t, U_t)dt | X(0) = y }.

The solution of the Hamilton–Jacobi equation

(2) Min_u { b(x,u)·grad V + C(x,u) } + ΔV − λV = 0

gives the optimal cost and the optimal strategies of (1).
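To make the discussion concrete, here is a one-dimensional instance of (2) solved numerically. The model data (b(x,u) = u, C(x,u) = x² + u², λ = 1, unit noise, a bounded grid with reflection) are entirely our own illustrative choices, and the discretization is a standard Markov-chain approximation in the spirit of Kushner's probability methods, not anything prescribed by this paper.

```python
# Sketch, not from the paper: a 1-D instance of (2) with b(x,u) = u,
# C(x,u) = x^2 + u^2, lambda = 1, unit noise, solved by a Markov-chain
# approximation on a grid plus value iteration (illustrative choices only).
N, h, lam = 41, 0.05, 1.0                  # grid on [-1, 1]
xs = [-1.0 + i * h for i in range(N)]
controls = [-1.0, 0.0, 1.0]

def bellman(V):
    Vn = []
    for i, x in enumerate(xs):
        best = float("inf")
        for u in controls:
            dt = h * h / (1.0 + h * abs(u))            # interpolation interval
            pu = (0.5 + h * max(u, 0.0)) / (1.0 + h * abs(u))
            pd = (0.5 + h * max(-u, 0.0)) / (1.0 + h * abs(u))
            vu = V[min(i + 1, N - 1)]                  # reflection at the edges
            vd = V[max(i - 1, 0)]
            q = (x * x + u * u) * dt + (pu * vu + pd * vd) / (1.0 + lam * dt)
            best = min(best, q)
        Vn.append(best)
    return Vn

V = [0.0] * N
for _ in range(30000):
    Vn = bellman(V)
    gap = max(abs(p - q) for p, q in zip(Vn, V))
    V = Vn
    if gap < 1e-10:
        break
```

In one dimension this converges in a fraction of a second; the point of the surrounding discussion is precisely that the same grid construction has a state count, and hence a cost, that explodes exponentially as the dimension n grows.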

The numerical solution of (2) is almost impossible in the general situation
when n is large. The difficulty is not a problem of numerical analysis but an
irreducible difficulty. To see that, consider the simpler problem

(3) ΔV − λV = C, x ∈ Ω = [0,1]^n,
    V|_{∂Ω} = 0,

where ∂Ω denotes the boundary of Ω.

For such a problem it is easy to show that the number of eigenvectors associated
to an eigenvalue smaller than a fixed value increases exponentially with the
dimension. But we need to have a good representation of the eigenvectors associated
to eigenvalues of small modulus in any good finite dimensional approximation of (2).
And thus, whatever the approximation could be, the obtention of a given precision
will be obtained at a cost which increases exponentially with the dimension.
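This eigenvalue-counting claim can be illustrated directly: for the Dirichlet Laplacian on [0,1]^n the spectrum is known in closed form, and the sketch below (threshold chosen arbitrarily by us) counts the eigenvalues below a fixed level for increasing n.

```python
import itertools

# Illustrative count (our construction): the Dirichlet Laplacian on [0,1]^n
# has eigenvalues pi^2 (k_1^2 + ... + k_n^2) with integers k_i >= 1, so the
# number of eigenvalues below a fixed threshold pi^2 * C grows quickly with n.
def count_eigenvalues(n, C):
    kmax = int(C ** 0.5) + 1
    return sum(1 for ks in itertools.product(range(1, kmax + 1), repeat=n)
               if sum(k * k for k in ks) <= C)

counts = [count_eigenvalues(n, 30.0) for n in range(1, 6)]
```

With the threshold 30π² the counts for n = 1,...,5 grow roughly geometrically, which is the combinatorial source of the "irreducible difficulty" described above.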

In the three following points of view we avoid this difficulty, but we have a
loss of optimality.

II - OPTIMIZATION IN THE CLASS OF LOCAL FEEDBACKS.

In this paragraph we give the optimality conditions in the class of local
feedbacks, and show that it is more difficult to solve these conditions than to
compute the solution of the Hamilton–Jacobi equation. Then we study two particular
cases:

- the case of the uncoupled dynamics,

- the case of systems having the product form property.

In these cases only it is possible to compute the optimal local feedbacks for
large systems. Finally we discuss briefly the decoupling point of view.

II-1. The general situation.
Given I the indexes of the subsystems, I = {1,2,...,k}, n_i [resp. m_i] denotes
the dimension of the states [resp. the controls] of the subsystem i ∈ I. The local
feedback S_i is a mapping of R_+ × R^{n_i} in 𝒰_i ⊂ R^{m_i}, the set of the admissible values
of the control i. 𝒮_L denotes the class of local feedbacks 𝒮_L = {S = (S_1,...,S_k)}.
Given the drift term of the system:

b: R_+ × R^n × 𝒰 → R^n
   (t, x, u) → b(t,x,u)

with n = Σ_{i ∈ I} n_i, 𝒰 = Π_{i ∈ I} 𝒰_i,
314

- the diffusion term:

σ : R_+ × R^n → M_n
    (t,x) ↦ σ(t,x)

with M_n the set of (n,n) matrices and a = ½ σσ*, where * denotes transposition;

- the instantaneous cost:

c : R_+ × R^n × U → R_+
    (t,x,u) ↦ c(t,x,u) ;

then b∘S [resp. c∘S] denotes the function R_+ × R^n → R^n [resp. R_+ × R^n → R_+] given by b(t,x,S(t,x)) [resp. c(t,x,S(t,x))].

Then, if X^S denotes the diffusion (b∘S,a) (drift b∘S and diffusion term σ) and P^S its measure defined on Ω = C(R_+,R^n), with μ the law of the initial condition, we want to solve

Min_{S∈S_L} E_{P^S} ∫_0^T c∘S(t,w_t) dt ,

where w ∈ Ω and T denotes the time horizon. We have here a team of k players working to optimize a single criterion.

A simple way to obtain the optimality conditions is to consider another formulation of this problem: the control of the Fokker-Planck equation, that is

Min_{S∈S_L} J^S = ∫_Q c∘S(t,x) p^S(t,x) dt dx

with p^S solution of

L^{S*} p^S = 0 ,   p^S(0,·) = μ ,

with Q = [0,T] × O , O = R^n ,

L^S = ∂/∂t + Σ_j b_j∘S ∂/∂x_j + Σ_{i,j} a_ij ∂²/∂x_i∂x_j ,

and μ the law of the initial condition.

Then we have:

Theorem 1

A necessary and sufficient condition for J^R ≤ J^S, ∀S ∈ S_L, is that:

(1)   H(t,R,p^R,V^S) ≤ H(t,S,p^R,V^S)   a.e. in t

with

H(t,R,p,V) = ∫ [ c∘R(t,x) + Σ_i b_i∘R(t,x) ∂V/∂x_i (t,x) ] p(t,x) dx

(2)   L^{R*} p^R = 0 , p^R(0,·) = μ ;   L^S V^S + c∘S = 0 , V^S(T,·) = 0 .

Remark 1. From this theorem the Pontryagin condition can be obtained; that is, a necessary condition of optimality of the strategy S is that p, V, S satisfy:

(3)   H(t,S,p^S,V^S) = Min_{R∈S_L} H(t,R,p^S,V^S) ;
      L^{S*} p^S = 0 , p^S(0,·) = μ ;
      L^S V^S + c∘S = 0 , V^S(T,·) = 0 .

A proof is given in J.L. Lions [13].

Remark 2. This theorem gives an algorithm to improve a given strategy R, that is:

Step 1: compute p^R;
Step 2: solve backward simultaneously

(4)   L^S V^S + c∘S = 0 , V^S(T,·) = 0 ;   S ∈ Arg Min_Z H(t,Z,p^R,V^S) .

In this way we obtain a better strategy S. A fixed point of the map R → S will satisfy the conditions (3). We see that one iteration (4) of this algorithm is more expensive than the computation of the solution of the H.J.B. equation.

II-2. Uncoupled dynamics.

This is the particular case where b_i is a function of x_i and u_i, ∀i ∈ I:

b_i : R_+ × R^{n_i} × U_i → R^{n_i}
      (t,x_i,u_i) ↦ b_i(t,x_i,u_i) ,

and the noises are not coupled between the subsystems, that is:

σ_i : R_+ × R^{n_i} → M_{n_i}
      (t,x_i) ↦ σ_i(t,x_i) .

In this situation we have

p^R = Π_{i∈I} p_i^{R_i}

with p_i^{R_i} solution of

(5)   L_{i,R_i}* p_i^{R_i} = 0 ,   p_i^{R_i}(0,·) = μ_i ,   with μ = Π_{i∈I} μ_i ,

and

L_{i,R_i} = ∂/∂t + Σ_{k∈I_i} b_k∘R_i(t,x) ∂/∂x_k + Σ_{k,l∈I_i} a_kl ∂²/∂x_k∂x_l ,

with I_i = { k : Σ_{j<i} n_j < k ≤ Σ_{j≤i} n_j }.

Let us denote by

(6)   C_i∘R_i : R_+ × R^{n_i} → R_+
      (t,x_i) ↦ ∫ c∘R(t,x) Π_{j≠i} p_j^{R_j}(t,x_j) dx_j .

That is the conditional expectation of the instantaneous cost knowing the information only on the local subsystem i.
We have the following sufficient conditions to be optimal player by player :

Theorem 2. A sufficient condition for a strategy S to be optimal player by player is that the following conditions are satisfied:

(7)   Min_{R_i} [ L_{i,R_i} V_i + C_i∘R_i ] = 0 ,   i ∈ I ,

with C_i∘R_i defined by (6) and (5).

The optimal cost is φ_1(V_1) = ... = φ_k(V_k) with φ_i(V_i) = ∫_{R^{n_i}} μ_i(dx_i) V_i(0,x_i).

Remark 3. Theorem 2 gives an algorithm to compute a feedback optimal player by player:

given ε, ν ∈ R_+,

Step 1: Choose i ∈ I and solve (7);
   if φ_i(V_i) ≤ ν − ε, then ν := φ_i(V_i) and R_i := Arg Min_{R_i} { L_{i,R_i} V_i + C_i∘R_i } ;
   if not, choose another i ∈ I, until φ_i(V_i) ≥ ν − ε, ∀i ∈ I.

Step 2: When φ_i(V_i) ≥ ν − ε, ∀i ∈ I, set ε := ε/2 and go to Step 1.

By this algorithm we obtain a decreasing sequence ν(n) which converges to a cost optimal player by player.

For a proof of a discrete version of this algorithm see Quadrat-Viot [16].

Remark 4. The interpretation of V_i(t,x_i), i ∈ I, in terms of the variables of Theorem 1 is:

V_i(t,x_i) = ∫ V(t,x) Π_{j≠i} p_j^{R_j}(t,x_j) dx_j .

Remark 5. In this situation we have to solve a coupled system of P.D.E., but each of them is on a space of small dimension. In this way we can optimize, in the class of local feedbacks, systems which are out of reach of the H.J.B. equation. An application to hydropower systems is given in Delebecque-Quadrat [4].
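The player-by-player improvement of Remark 3 is, in spirit, a block-coordinate descent: each player minimizes the common cost in its own variables while the others are frozen. A minimal sketch on a hypothetical quadratic team cost (none of the data below come from the paper):

```python
def coordinate_descent(Q, b, sweeps=200):
    """Minimize 0.5 x'Qx - b'x by exact minimization in one coordinate
    ('player') at a time, holding the other coordinates fixed."""
    n = len(b)
    x = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            # best response of player i: Q[i][i]*x_i = b[i] - sum_{j != i} Q[i][j]*x_j
            s = sum(Q[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / Q[i][i]
    return x

Q = [[2.0, 0.5], [0.5, 1.0]]   # positive definite, so the sweeps converge
b = [1.0, 1.0]
x = coordinate_descent(Q, b)   # approaches the global minimizer [2/7, 6/7]
```

For a strictly convex cost the player-by-player optimum coincides with the global one; in general, as in the paper, the scheme only guarantees optimality player by player.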

II-3. Systems having the product form property.

The property that a system has its dynamics uncoupled is very restrictive. In this paragraph we show systems which have their invariant measure uncoupled; they are limits of networks of queues of Jackson type. This property can be used to apply to them the results of II-2 for the corresponding ergodic control problem, that is:

Min_S lim_{T→∞} (1/T) ∫_0^T c∘S(x_t) dt .

Given B the generator of a Markov chain defined on E = {1,2,...,n}, a function u : E × R → R, (i,x) ↦ u_i(x), a matrix A ∈ M_n, and D a diagonal matrix satisfying:

(8)   DB* + BD + 2A = 0 ,

Theorem 3

The invariant probability measure p of the diffusion (b = Bu, a = A) such that (8) is true has the product form property, that is:

(9)   p(x) = C Π_{i=1}^n p_i(x_i) ,   i ∈ E ,

(10)  p_i(x_i) = exp( −(1/d_ii) ∫_0^{x_i} u_i(s) ds ) ,

where C is a constant of normalization.

Demonstration: The Fokker-Planck equation can be written:

(11)  −div [bp] + div [A grad p] = 0 .

Let us make the change of variables p = exp V in (11); we obtain

(grad V, b − A grad V) + div (b − A grad V) = 0 .

Using (10), we have:

(12)  (D^{-1}u, (B + AD^{-1})u) + tr [(B + AD^{-1}) grad u] = 0 .

The quadratic part in u of (12) is equal to 0 if and only if:

D^{-1}B + B*D^{-1} + 2D^{-1}AD^{-1} = 0 ,

which can be written:

BD + DB* + 2A = 0 ,

which is (8).

We have also tr [(B + AD^{-1}) grad u] = 0. Indeed grad u is diagonal because u_i is a function of x_i only, and the coefficient of ∂u_i/∂x_i is b_ii + a_ii/d_ii, which is equal to zero by (8).
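Formula (10) can be checked numerically; the linear rate u_i(s) = s and d_ii = 1 below are illustrative assumptions (they give a Gaussian marginal), not data from the paper.

```python
import math

def marginal_density(u_i, d_ii, x, n_steps=1000):
    """Unnormalized marginal p_i(x) = exp(-(1/d_ii) * integral_0^x u_i(s) ds),
    the integral being approximated by the trapezoidal rule."""
    h = x / n_steps
    pts = [u_i(k * h) for k in range(n_steps + 1)]
    integral = h * (sum(pts) - 0.5 * (pts[0] + pts[-1]))
    return math.exp(-integral / d_ii)

# Linear rate u_i(s) = s with d_ii = 1: integral_0^2 s ds = 2, so p_i(2) = exp(-2).
val = marginal_density(lambda s: s, 1.0, 2.0)
```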

Remark 6. This class of diffusion processes is quite natural if we see them as the limit process, when N → ∞, obtained from a Jackson network of queues by a diffusion scaling of space and time in N.

[Figure: a Jackson network of queues, where λ_i(x_i) is the output rate of queue i and m_ij is the probability that a customer leaving queue i goes to queue j.]

The correlation of the noise given by (8) corresponds to systems for which the noise satisfies a conservation law (for example the total number of customers in a closed network of queues).

Remark 7. We can now apply the results of II-2 to compute the optimal local feedback for systems having the product form property and an ergodic criterion. Indeed, the criterion is

∫ c∘S(x) p(x) dx ,   p(x) = Π_{i=1}^n p_i(x_i) ,

and p_i satisfies:

∂/∂x_i [ u_i p_i ] + ∂²/∂x_i² [ d_ii p_i ] = 0 ,   i ∈ E ,   ∫ p_i(x_i) dx_i = 1 .

II-4. Remark on decoupling.

Another way to use the results of II-2 when the dynamics are coupled is to make a change of feedback. Let us consider the simpler case

b : R^n × U → R^n ,   (x,u) ↦ b(x,u) ,   with u ∈ R^n ;

we use the feedback transformation v = b(x,u) to decouple the drift terms. Now v is the control, and we can apply the results of II-2 to compute the best local feedback v_i = S_i(x_i).
Then the solution in u of

(13)   b(x,u) = S(x)

gives the best feedback in the class that we may call "local decoupling feedbacks".

One difficulty with this approach concerns, for example, the constraints on the control: the image by b of a hypercube is not in general a hypercube, and if we take for constraints on the new control v ∈ V(x) ⊂ b(x,U), with V(x) a hypercube of R^n, the loss of optimality can become unacceptable.

This approach is well studied for deterministic linear and nonlinear systems, Wonham [17], Isidori [8], and in the dynamic programming literature, Larson [11].

III - OPTIMIZATION IN A PARAMETRIZED CLASS OF FEEDBACKS BY MONTE CARLO TECHNIQUES.

We have seen in §II that we are able to compute the optimal local feedback only in particular cases. Moreover, sometimes the local information is not good; we can have, a priori, an idea of a better one and would like to use this a priori information to solve a simpler problem than the general one. A way to do that is to parametrize the feedback and optimize the open-loop parameter by a Monte Carlo technique. More precisely, given the stochastic control problem

(1)   dx_t = b(t,x_t,u_t)dt + dw_t ,   x_t ∈ R^n , u_t ∈ R^m ,
      Min_u E ∫_0^T c(t,x_t,u_t) dt ,

we make the feedback transformation

(2)   u(t) = S(t,x_t,v_t) ,   v_t ∈ R^p ,

where S : R_+ × R^n × R^p → R^m is given.

We use for the approximation of the probability law of the noise the empirical distribution

(1/N) Σ_{i=1}^N δ_{ω_i} ,

where the ω_i are trajectories of the noise obtained by random generation, perhaps after a time discretization if we want to avoid the difficulties of the non-existence of a solution trajectory by trajectory to (1). And now we have to solve:

(3)   Min_v (1/N) Σ_{j=1}^N ∫_0^T c(t, x_t^j, S(t,x_t^j,v_t)) dt ,

where w_t^j denotes a particular trajectory of the noise. Thus, at the end, we have to solve a deterministic control problem. For that, we can use a gradient technique or the Pontryagin principle. For discrete-time systems the convergence properties of this approach have been studied in Quadrat-Viot [15]. An application to the French hydropower system is currently being carried out at EDF. Feedbacks on the demand of electricity and the level of water in the local dam are optimized with success by this technique (Lederer-Colleter [2]).
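A minimal sample-average sketch of this idea, with hypothetical scalar dynamics dx = u dt + dw, quadratic running cost, and a one-parameter feedback u = -v x (all these choices are illustrative, not the EDF application):

```python
import random

def sampled_cost(v, noise_paths, dt=0.1, x0=1.0):
    """Average the cost of the feedback u = -v*x over fixed noise trajectories."""
    total = 0.0
    for path in noise_paths:
        x, cost = x0, 0.0
        for dw in path:
            u = -v * x                      # parametrized feedback S(t,x,v)
            cost += (x * x + u * u) * dt    # running cost c = x^2 + u^2
            x += u * dt + dw                # Euler step of the dynamics
        total += cost
    return total / len(noise_paths)

random.seed(0)
paths = [[random.gauss(0.0, 0.1) for _ in range(50)] for _ in range(20)]
# Once the noise trajectories are frozen, the problem is deterministic in v:
best_v = min((k * 0.1 for k in range(31)), key=lambda v: sampled_cost(v, paths))
```

Here a crude grid search stands in for the gradient or Pontryagin techniques mentioned in the text.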

The idea of the stochastic gradient method is the same as the former one, but we optimize in a recursive way. The recursion is on the index of the generated noise trajectory. The problem (1),(2) can be reduced to the problem

Min_{v∈V} E J(v)

in a situation where we are able to compute ∂J/∂v by the adjoint-state technique; here

J(v) = ∫_0^T c(t, x_t, S(t,x_t,v_t)) dt .

Moreover, we can consider that after discretization v is finite dimensional.

Then the stochastic gradient algorithm is the following recursive way to improve the parameter v:

v_{r+1} = P_V { v_r − ρ_r ∂J/∂v (v_r,ω_r) } ,   ρ_r ∈ R_+ , v_r ∈ V ,

Σ_r ρ_r = ∞ ,   Σ_r ρ_r² < ∞ ,

where ω_r denotes a generated random realization of the stochastic parameter in the definition of J(v) (for our problem (1),(2), a realization of the Wiener process w_t), and P_V denotes the projection on the set V.

In a convex situation, which is not in general the case for the problem (1),(2), we can give some global convergence results.
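A minimal projected stochastic-gradient sketch; the quadratic objective and the interval V = [0,2] are hypothetical stand-ins chosen so that the optimum is known:

```python
import random

def projected_sgd(grad, project, v0, n_iter=5000):
    """Iterate v_{r+1} = P_V(v_r - rho_r * grad(v_r, w_r)) with rho_r = 1/(r+1),
    so that sum rho_r diverges while sum rho_r^2 converges."""
    v = v0
    for r in range(n_iter):
        rho = 1.0 / (r + 1)
        w = random.gauss(0.0, 1.0)          # noise realization omega_r
        v = project(v - rho * grad(v, w))
    return v

random.seed(1)
# E J(v) = E (v - 1 - w)^2 over V = [0, 2] is minimized at v* = 1.
grad = lambda v, w: 2.0 * (v - 1.0 - w)    # noisy gradient of J(w, v)
project = lambda v: min(2.0, max(0.0, v))  # projection P_V on [0, 2]
v_star = projected_sgd(grad, project, v0=0.0)
```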

Theorem 1

Under the hypotheses:

1) v ↦ J(ω,v) is convex, ∀ω ;

2) ω ↦ J(ω,v) is L¹, ∀v ;

3) sup_{v∈V} |∂J/∂v (v,ω)| ≤ q ;

4) E J(v) − J* ≥ c ℓ²(v), where J* denotes the optimal cost and ℓ(v) denotes the distance of v to the optimal set;

5) V a bounded convex set;

we have lim_n E ℓ²(v_n) = 0, and moreover, if

ρ_r = γ_0 c / (γ_0 c² r + q²)   with γ_0 = E ℓ²(v_0) ,

we have:

E ℓ²(v_r) ≤ 1 / ( (c²/q²) r + 1/γ_0 ) .

The proof of this theorem can be found in Dodu-Goursat-Hertz-Quadrat-Viot [5]; a lot of similar results can be found in Polyak [14] and in the references of that paper. In Kushner-Clark [12] local convergence results are proved in the non-convex case.

The following result shows that in some sense the stochastic gradient algorithm is optimal. We suppose that:

6) the noise is finite valued, and we denote v_p = Argmin E_p J(v) ;

7) v ↦ J(ω,v) is twice differentiable and uniformly convex, ∀ω ∈ Ω.

Then we have (D.G.H.Q.V. [5]):

Theorem 2

Under the hypotheses 6) and 7) we have:

E | √r ( v̂ − v_p ) |² ≥ tr [ H_p^{-1} Q_p H_p^{-1} ]

with

H_p = ∂²/∂v² E_p J (v_p) ,   Q_p = E_p [ (∂J/∂v)(∂J/∂v)* ] (v_p) ,

for all unbiased statistics v̂ of v_p defined on (Ω,μ)^r.

If we remark that c is an estimate of H_p and q² an estimate of Q_p, we see that in a certain sense the speed of convergence of the stochastic gradient technique is optimal.

We have applied this algorithm to the problem of the optimization of the investments in a transmission network of electricity, D.G.H.Q.V. [5]. The comparison with a sophisticated simplex approach shows that the stochastic gradient method is undoubtedly better.

IV - PERTURBATION METHODS.

By perturbation methods we can reduce a difficult problem to a simpler one. In this chapter we study the case of noise of small intensity. In this situation it is possible to build an affine control which leads to an ε⁴ error with respect to the optimal control, where ε denotes the intensity of the noise.

We consider the following stochastic control problem:

(1)   dx_t = f(x_t,u_t)dt + ε dw_t ,   x_t ∈ R^n , u_t ∈ R^m ,
      V^ε(0,y) = Min_u E [ ∫_0^T C(x_t,u_t) dt | X(0) = y ] ,

where ε belongs to R_+ and is small.

We denote by

(2)   H(x,u,p) = p f(x,u) + C(x,u) .

We suppose that:

(3)   u ↦ f(x,u) is linear;

(4)   (v, ∂²H/∂u² v) ≥ k|v|², where k is a positive constant, ∀x.

Let us consider the deterministic control problem

(5)   dx_t = f(x_t,u_t)dt ,
      V(0,y) = Min_u { ∫_0^T C(x_t,u_t) dt | X(0) = y } ,

and denote by u_0(t) the optimal open-loop deterministic control.

The second variation calculus around the optimal trajectory of (5), Cruz [3], gives the osculating quadratic form of the optimal cost V around the optimal trajectory. This quadratic form is defined by the (n,n) time-dependent matrix P, solution of the Riccati equation:

(6)   Ṗ + PA + A'P − PSP + Q = 0 ,   P(T) = 0 ,

where

(7)   A = f_x − f_u H_uu^{-1} H_ux ,

(8)   S = f_u H_uu^{-1} f_u' ,

(9)   Q = H_xx − H_ux' H_uu^{-1} H_ux ,

are evaluated along the optimal trajectory of (5), under the hypotheses that:

(10)  H_uu > 0 ,   H_xx − H_ux' H_uu^{-1} H_ux ≥ 0 .

Let us consider the following affine control:

(11)  u^f(t,X(t)) = u_0(t) + K(t) [ X(t) − x_0(t) ] ,

where x_0(t) denotes the optimal trajectory of the deterministic control problem (5), X(t) the actual trajectory of the diffusion process (1) when the control is (11), and K(t) is defined by

(12)  K(t) = −H_uu^{-1} ( H_ux + f_u' P )(t) ,

evaluated on the optimal trajectory x_0(t).
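The backward Riccati equation (6) and the gain (12) can be sketched on a scalar linear-quadratic example with f = a x + b u and C = ½(q x² + r u²), so that H_ux = 0 and K = −(b/r)P; these data are illustrative assumptions, not from the text.

```python
def riccati_gains(a, b, q, r, T, n_steps):
    """Integrate the scalar version of (6), P' = -(2aP - (bP)^2/r + q), P(T) = 0,
    backward in time by Euler steps, and return the gains K(t) = -(b/r)P(t) of (12)."""
    dt = T / n_steps
    P, gains = 0.0, []
    for _ in range(n_steps):
        gains.append(-(b / r) * P)
        P += dt * (2.0 * a * P - (b * P) ** 2 / r + q)  # one backward step
    gains.reverse()   # gains[0] ~ K(0), gains[-1] ~ K(T)
    return gains

# With a = 0 and b = q = r = 1 the exact solution is P(t) = tanh(T - t),
# so the gain near t = 0 approaches -1 for a long horizon.
gains = riccati_gains(a=0.0, b=1.0, q=1.0, r=1.0, T=5.0, n_steps=5000)
```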

We have:

Theorem

Under the hypotheses (3), (4), (10), and (f,C) twice differentiable, the affine control built on the deterministic control problem, used in the stochastic control problem, leads to a loss of optimality of order O(ε⁴).

Ideas of the proof: Fleming [6] has shown that the optimal deterministic feedback used in the stochastic control problem leads to an error of O(ε⁴). But in the estimates of the proof he does not need the optimal deterministic control, only a control which gives the exact V, ∂V/∂y, ∂²V/∂y² along the optimal trajectory of the deterministic control problem.

Using for example Cruz [3], we know that the affine feedback (11) has this property, and thus the result is proved.

REFERENCES.

[1] BASKETT - CHANDY - MUNTZ - PALACIOS : Open, Closed and Mixed Networks of Queues with Different Classes of Customers, JACM 22, pp. 248-260, 1975.

[2] COLLETER - LEDERER : Rapports internes EDF sur la gestion des réservoirs hydroélectriques français.

[3] CRUZ : Feedback Systems, McGraw-Hill, 1972.

[4] DELEBECQUE - QUADRAT : Contribution of Stochastic Control, Singular Perturbation, Averaging and Team Theories to an Example of Large Scale System: Management of Hydropower Production, IEEE Trans. Automatic Control, April 1978.

[5] DODU - GOURSAT - HERTZ - QUADRAT - VIOT : Méthodes de Gradient Stochastique pour l'Optimisation des Investissements dans un Réseau Electrique, EDF Bulletin Série C, n°2, 1981, pp. 133-164.

[6] FLEMING : Stochastic Control for Small Noise Intensities, Brown University Report, April 1970.

[7] HOLLAND : Small Noise Open Loop Control, SIAM J. Control, 12, August 1974; Gaussian Open Loop Control Problems, SIAM J. Control, 13, August 1975.

[8] ISIDORI - KRENER - GORI GIORGI - MONACO : Nonlinear Decoupling via Feedback: a Differential Geometric Approach, IEEE Trans. Automatic Control, AC-26, n°2, April 1981.

[9] JACKSON : Jobshop-Like Queueing Systems, Management Science, 10, pp. 131-142, 1963.

[10] KELLY : Reversibility and Stochastic Networks, J. Wiley and Sons, 1979.

[11] KORSAK - LARSON : A Dynamic Successive Approximation Technique with Convergence Proofs, Automatica, 1969.

[12] KUSHNER - CLARK : Stochastic Approximation Methods for Constrained and Unconstrained Systems, Springer-Verlag, 1978.

[13] LIONS : Contrôle Optimal des Systèmes Distribués, Dunod, 1968.

[14] POLYAK : Subgradient Methods: a Survey of Soviet Research, in Nonsmooth Optimization, eds. C. Lemaréchal & R. Mifflin, Pergamon Press, 1978.

[15] QUADRAT - VIOT : Méthodes de Simulation en Programmation Dynamique Stochastique, RAIRO, April 1973, pp. 3-22.

[16] QUADRAT - VIOT : Product Form and Optimal Local Feedback for Multiindex Markov Chains, Allerton Conference, 1980.

[17] WONHAM : Linear Multivariable Control: A Geometric Approach, New York, Springer-Verlag, 1979.
UNNORMALIZED CONDITIONAL PROBABILITIES AND OPTIMALITY
FOR PARTIALLY OBSERVED CONTROLLED
JUMP MARKOV PROCESSES

by
Raymond Rishel
Department of Mathematics
University of Kentucky
Lexington, KY 40506

I. INTRODUCTION

Optimality conditions for controlled partially observed jump Markov processes


have been given in a number of papers, for instance as in [1],[4],[5],[9]. The objective of this paper is to give a simple proof of necessary conditions for optimality
for partially observed controlled jump Markov processes similar to the simple proof
for completely observed controlled jump Markov processes given in [6]. By making use
of the concepts of unnormalized conditional probabilities and expectations of a par-
tially observed jump Markov process, it will be possible to formulate two intermediate
deterministic control problems whose optimal control is the optimal control for the
stochastic problem. The Pontryagin maximum principle applies to these deterministic
problems to give necessary conditions for optimality. The two problems are dual to
each other in that the adjoint variables for one problem are the negatives of the state
variables for the other and vice versa.
In order to give a short discussion of the control problem, the discussion of un-
normalized and normalized conditional expectations is given in an appendix. Thus the
paper really consists of two almost equal parts, the body discussing the control pro-
blem and the appendix discussing unnormalized conditional expectations.
Unnormalized conditional expectations are important in their own right. For in-
stance if y is a process obtained from x by aggregating the states of x together, and
if x is conditionally Markov given y, then the conditional distributions of the jump
times and locations of the aggregated process y may be expressed in terms of unnor-
malized conditional probabilities. This relationship is given in Corollary A1 of the
appendix.
Unnormalized conditional expectations for partially observed jump Markov pro-
cesses were used in [8], Theorem 2, and in [7]. However in both of these papers they
are used mainly as a vehicle to obtain normalized conditional expectations rather
than being studied for their own importance.

II. PROBLEM FORMULATION

A jump process x is a stochastic process whose paths are right-continuous step functions. For these processes there is a one-to-one correspondence

x ↔ (x_0,T_1,x_1,T_2,...,x_n,T_n,...)   (1)

between the process x and the sequence of its jump times and jump locations. If we use the convention that T_0 ≡ 0, this correspondence is defined by

x(t) = x_n   if T_n ≤ t < T_{n+1} .   (2)
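The correspondence (1)-(2) amounts to a piecewise-constant lookup; a small illustrative sketch (the jump times and locations below are made up):

```python
from bisect import bisect_right

def jump_path(jump_times, locations):
    """Build x(t) from (x_0, T_1, x_1, ...): x(t) = x_n for T_n <= t < T_{n+1},
    with the convention T_0 = 0 stored as jump_times[0]."""
    def x(t):
        n = bisect_right(jump_times, t) - 1   # largest n with T_n <= t
        return locations[n]
    return x

x = jump_path([0.0, 1.5, 3.0], [2, 5, 1])   # x_0 = 2, x_1 = 5, x_2 = 1
```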

If g is some function on the state space of x which is not one-to-one and y(t) = g(x(t)) is observed, call the process partially observed. Call x(t) the unobserved process and y(t) the observed process. The observed process y(t) will also be a jump process. Let

n(t) = {no. of jumps of x in [0,t]}

k(t) = {no. of jumps of y in [0,t]} .

When n(t)=n, x on [0,t] corresponds to

(x_0,T_1,x_1,...,T_n,x_n) .   (3)

If we also have k(t)=k, we must have k ≤ n, and y on [0,t] corresponds to

(y_0,τ_1,y_1,...,τ_k,y_k)   (4)

where y_0,...,y_k are the values of (g(x_0),...,g(x_n)) with repetitions suppressed and (τ_1,...,τ_k) is that subset of (T_1,...,T_n) at which there is a jump of g(x) at each τ. For brevity let

X_n = (x_0,T_1,x_1,...,T_n,x_n)   (5)

and let

Y_k = (y_0,τ_1,y_1,...,τ_k,y_k) .   (6)

Call X_n the history of x up to the n-th jump and Y_k the observed history up to the k-th jump. If τ_k ≤ T_n < τ_{k+1} then Y_k is the observed history corresponding to X_n.
We shall want to formulate the concept of a controlled family of partially observed jump Markov processes. For convenience we shall take these to be jump processes with values in the finite set of integers (1,...,N). The controlled processes of the family will be specified in terms of a controlled generator

A(t,u) = (a_ij(t,u)) ,   i,j = 1,...,N   (7)

which satisfies

a_ij(t,u) ≥ 0 ,   a_ii(t,u) = −Σ_{j≠i} a_ij(t,u)   (8)

and a family of control functions

{u(t,Y_k)}

with values in a closed set U of E^r.

The family {u(t,Y_k)} of functions specifies a control in the sense that if there is an observed process y and k(t)=k and the observed history is Y_k, then the control u(t,Y_k) is applied at time t.

Given a control {u(t,Y_k)}, the probability measure of the corresponding partially observed jump process x can be constructed as follows. Construct the conditional probability of the time and location of the next jump given the history up to the n-th jump by defining

P[T_{n+1} ≤ t , x_{n+1} = j | X_n]
   = Σ_i 1_{T_n ≤ t , x_n = i} ∫_{T_n}^t a_ij(s,u(s,Y_k)) exp[ ∫_{T_n}^s a_ii(r,u(r,Y_k)) dr ] ds   (9)

where in (9) Y_k is the observed history corresponding to X_n.¹

In terms of a given initial distribution

P[x_0 = i] = p_i   (10)

and the conditional distributions (9), construct the finite-dimensional distributions of the sequence

(x_0,T_1,x_1,...,T_n,x_n,...)   (11)

and then extend this to a probability measure in the usual way. Then define the process x corresponding to the control {u(t,Y_k)} by (2).
Since the generator (a_ij(t,u)) depends on the past only through the state i and the value of the control u, we call the process constructed a controlled partially observed jump Markov process. This is a misnomer, since the conditional distribution (9) of the next jump depends on the current state x_n = i and on the past measurements Y_k. Thus the process is not Markov. It would have been Markov had a control depending only on the current value y_k = g(i) of the measurements been used. Of course, since there are only partial observations, there is an advantage to be gained by using controls which depend on the past measurements. We shall continue to use the term partially observed jump Markov process, even though the process is not Markov.

For the process constructed it follows from (9) that the control applied at time t is u(t,Y_{k(t)}). We shall often abbreviate by writing u(t) for u(t,Y_{k(t)}).

To formulate an optimization problem let c(t,i,u) denote a cost rate and T a fixed terminal time. Formulate the control problem of finding the control in a given class of controls so that the corresponding expected total cost

E [ ∫_0^T c(s,x(s),u(s)) ds ]   (13)

is a minimum.
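For a time-homogeneous generator, the conditional law (9) of the next jump reduces to an exponential holding time in state i at rate −a_ii, followed by a move to j with probability a_ij/(−a_ii). A sketch of this special case (the two-state generator below is a made-up example):

```python
import random

def next_jump(i, A, t_now, rng):
    """Sample (T_{n+1}, x_{n+1}) from state i at time t_now for a constant
    generator A = (a_ij): exponential holding time, then a routing draw."""
    rate = -A[i][i]
    hold = rng.expovariate(rate)
    u, acc = rng.random() * rate, 0.0
    j = None
    for cand, a in enumerate(A[i]):
        if cand == i:
            continue
        acc += a
        if u <= acc:
            j = cand
            break
    if j is None:          # guard against floating-point round-off
        j = cand
    return t_now + hold, j

rng = random.Random(0)
A = [[-1.0, 1.0], [2.0, -2.0]]      # two-state chain: rows sum to zero
t1, x1 = next_jump(0, A, 0.0, rng)  # from state 0 the only target is state 1
```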

III. UNNORMALIZED AND NORMALIZED CONDITIONAL PROBABILITIES

Optimality conditions will be expressed in terms of conditional expectations and


conditional probabilities conditioned with respect to the past of the process x and
also in terms of conditional probabilities and expectations conditioned with respect

IThe notation 1A denotes the characteristic function of the set A, that is the func-
tion which is one on A and zero on the complement of A.

to the past of the measurements y. Let

F_t = σ[x(s) ; 0 ≤ s ≤ t]

G_t = σ[y(s) ; 0 ≤ s ≤ t]

σ(X_n) = σ[x_0,T_1,x_1,...,T_n,x_n]

σ(Y_k) = σ[y_0,τ_1,y_1,...,τ_k,y_k]

denote the σ-fields generated respectively by the past of the process x, by the past of the process y, by the jump times and locations up through the n-th jump of the process x, and by the jump times and jump locations up through the k-th jump of the observed process y.
A fundamental observation about any jump process, such as y, is that on the set τ_k ≤ t < τ_{k+1} every G_t-measurable random variable may be written as a σ(Y_k)-measurable random variable. Using this it easily follows, as is shown for instance in [2], Theorems 5 and 4, that for any random variable z for which E|z| < ∞,

E[z | G_t] = Σ_{k=0}^∞ 1_{τ_k ≤ t < τ_{k+1}} E[z 1_{τ_k ≤ t < τ_{k+1}} | Y_k] / E[1_{τ_k ≤ t < τ_{k+1}} | Y_k] .   (14)

Thus call

E[z 1_{τ_k ≤ t < τ_{k+1}} | Y_k]   (15)

the unnormalized conditional expectation of z on {τ_k ≤ t < τ_{k+1}}, and call

E[z 1_{τ_k ≤ t < τ_{k+1}} | Y_k] / E[1_{τ_k ≤ t < τ_{k+1}} | Y_k]   (16)

the normalized conditional expectation of z on {τ_k ≤ t < τ_{k+1}}.

Notice that both the normalized and unnormalized conditional expectations evaluated at t = τ_k agree with

E[z | Y_k] .   (17)

Define normalized and unnormalized conditional expectations for the x-process, conditioning with respect to F_t and σ(X_n), in a manner similar to that above for the y-process.

IV. ESTIMATION EQUATIONS (FILTERING FORMULAS)

In this section we will define various unnormalized and normalized conditional probabilities and expectations involved in deriving necessary conditions for optimality of the control problem. Proofs of the relations and properties stated in this section for these quantities are given in the appendix.

Let Q_i(t,Y_k) denote the unnormalized conditional probability that x(t)=i given the history Y_k of the observed process y. That is

Q_i(t,Y_k) = E[ 1_{x(t)=i} 1_{τ_k ≤ t < τ_{k+1}} | Y_k ] .   (18)

THEOREM 1: On the interval τ_k ≤ t < T, for each i for which g(i) = y_k, Q_i(t,Y_k) satisfies the system of differential equations

d/dt Q_i(t,Y_k) = a_ii(t,u(t,Y_k)) Q_i(t,Y_k) + Σ_{j≠i : g(j)=g(i)} a_ji(t,u(t,Y_k)) Q_j(t,Y_k)   (19)

and the boundary condition

Q_i(τ_k,Y_k) = P[x(τ_k)=i | Y_k]   (20)

is satisfied. If g(i) ≠ y_k, Q_i(t,Y_k) = 0.
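Between observed jumps, (19) is a linear ODE that can be integrated directly; the two-state example below, with both states mapped to the same observation and a constant symmetric generator, is an illustrative assumption:

```python
def propagate_filter(Q, A, same_obs, dt, n_steps):
    """Euler-integrate dQ_i/dt = a_ii Q_i + sum_{j != i, g(j)=g(i)} a_ji Q_j.

    same_obs[i] lists the states j with g(j) = g(i)."""
    Q = list(Q)
    for _ in range(n_steps):
        dQ = []
        for i in range(len(Q)):
            d = A[i][i] * Q[i]
            d += sum(A[j][i] * Q[j] for j in same_obs[i] if j != i)
            dQ.append(d)
        Q = [q + dt * d for q, d in zip(Q, dQ)]
    return Q

# Both states give the same observation, so no unnormalized mass leaves:
A = [[-1.0, 1.0], [1.0, -1.0]]
Q = propagate_filter([1.0, 0.0], A, same_obs=[[0, 1], [0, 1]], dt=0.001, n_steps=2000)
# Exact solution at t = 2: Q_0 = 0.5*(1 + e^{-4}) ~ 0.509
```

When some state j has g(j) different from the current observation, its column is simply dropped from the sum, and the total mass Σ_i Q_i decays, which is what makes these probabilities "unnormalized".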

Define J_u(t,X_n), the normalized conditional expected cost from time t onward, conditioning with respect to the entire past of x, by

J_u(t,X_n) = E[ ∫_t^T c(s,x(s),u(s)) ds 1_{T_n ≤ t < T_{n+1}} | X_n ] / E[ 1_{T_n ≤ t < T_{n+1}} | X_n ] .   (21)

THEOREM 2: If i denotes the value of x_n, the quantities (Y_k,i) are sufficient statistics for expressing the conditional remaining cost (21), in the sense that when x_n = i and Y_k is the observed history corresponding to X_n, then J_u(t,X_n) can be written

J_u(t,X_n) = J_u(t,Y_k,i)   (22)

as a function of (t,Y_k,i).

THEOREM 3: On the interval τ_k ≤ t ≤ T, J_u(t,Y_k,i) satisfies the system of differential equations, i = 1,...,N:

d/dt J_u(t,Y_k,i)
   = −c(t,i,u(t,Y_k)) − Σ_{j≠i : g(i)=g(j)} J_u(t,Y_k,j) a_ij(t,u(t,Y_k))   (23)
     − Σ_{j : g(i)≠g(j)} J_u(t,Y_k,t,g(j),j) a_ij(t,u(t,Y_k)) − J_u(t,Y_k,i) a_ii(t,u(t,Y_k)) .

At time T the boundary condition

J_u(T,Y_k,i) = 0   (24)

is satisfied.

In Theorem 3, Y_{k+1} = (Y_k,t,g(j)), so that J_u(t,Y_{k+1},j) = J_u(t,Y_k,t,g(j),j) is the quantity defined by (22) and (21).

THEOREM 4: The conditional remaining cost from an observed jump time τ_k onward (or from time τ_0 > 0 onward if k=0) conditioned with respect to the observed history Y_k satisfies

E[ ∫_{τ_k}^T c(s,x(s),u(s)) ds | Y_k ]   (25)
   = ∫_{τ_k}^T Σ_i [ c(t,i,u(t,Y_k)) + Σ_{j : g(j)≠g(i)} J_u(t,Y_k,t,g(j),j) a_ij(t,u(t,Y_k)) ] Q_i(t,Y_k) dt

and also satisfies

E[ ∫_{τ_k}^T c(s,x(s),u(s)) ds | Y_k ] = Σ_i J_u(τ_k,Y_k,i) P[x(τ_k)=i | Y_k] .   (26)

In Theorem 4, Y_{k+1} = (Y_k,t,g(j)), so that J_u(t,Y_{k+1},j) = J_u(t,Y_k,t,g(j),j) is the quantity defined through (22) and (21).

V. THE RELATED DETERMINISTIC CONTROL PROBLEMS

Fix an integer r. Let {u*(t,Y_k)} be an optimal control for the problem formulated in Section II. In this section we shall formulate two deterministic control problems whose solution agrees with u*(t,Y_r). These problems look at minimizing the conditional remaining cost from the r-th jump onward over a class of controls which differ from the optimal control only between the r-th and (r+1)-st observed jump.

To begin this discussion consider the class of controls {v(t,Y_k)} such that v(t,Y_k) = u*(t,Y_k) if k ≠ r. That is, these controls agree with the optimal control except perhaps between the r-th and (r+1)-st jump of the observed process. Since controls of the above class agree up to time τ_r with the optimal control, the following theorem follows from the standard argument used to obtain the principle of optimality.

THEOREM 5: If v is any control of the class above,

E[ ∫_{τ_r}^T c(s,x(s),v(s)) ds | Y_r ] ≥ E[ ∫_{τ_r}^T c(s,x(s),u*(s)) ds | Y_r ]   (27)

almost surely.

Since any control v of the class above agrees with the optimal control u* from the (r+1)-st observed jump onward, another standard argument implies that

J_v(τ_{r+1},Y_{r+1},j) = J_{u*}(τ_{r+1},Y_{r+1},j) .   (28)

If τ_{r+1} = t and then Y_{r+1} = (Y_r,t,g(j)), equation (28) can also be written as

J_v(t,Y_r,t,g(j),j) = J_{u*}(t,Y_r,t,g(j),j) .   (29)



Theorem 5 and formulas (25),(19),(20) suggest we fix r and Y_r, abbreviate by defining v(t) = v(t,Y_r), Q_i(t) = Q_i(t,Y_r), and define the deterministic optimal control problem with states Q_i and control v given by:

Problem A. Choose v(t) on [τ_r,T] with values in U to minimize

∫_{τ_r}^T Σ_i [ c(t,i,v(t)) + Σ_{j : g(j)≠g(i)} J_{u*}(t,Y_r,t,g(j),j) a_ij(t,v(t)) ] Q_i(t) dt   (30)

subject to Q_i(t) being solutions of

d/dt Q_i(t) = a_ii(t,v(t)) Q_i(t) + Σ_{j≠i : g(j)=g(i)} a_ji(t,v(t)) Q_j(t)   if g(i) = y_r   (31)

with boundary condition

Q_i(τ_r) = P[x(τ_r)=i | Y_r] .   (32)

Theorem 5 and formulas (26),(23),(24) suggest we again fix r and Y_r, abbreviate by defining v(t) = v(t,Y_r), J_i(t) = J_v(t,Y_r,i), and define the deterministic optimal control problem with states J_i and control v of:

Problem B. Choose v(t) on [τ_r,T] with values in U to minimize

Σ_{i=1}^N J_i(τ_r) P[x(τ_r)=i | Y_r]   (33)

subject to J_i(t) being solutions of

d/dt J_i(t) = −c(t,i,v(t)) − J_i(t) a_ii(t,v(t)) − Σ_{j≠i : g(i)=g(j)} a_ij(t,v(t)) J_j(t)
   − Σ_{j : g(i)≠g(j)} J_{u*}(t,Y_r,t,g(j),j) a_ij(t,v(t)) ,   i = 1,...,N   (34)

with boundary condition

J_i(T) = 0 ,   i = 1,...,N .   (35)

VI. OPTIMALITY CONDITIONS

If u*(t) is an optimal control for Problem A, a calculation using the Pontryagin principle shows that the adjoint equations of Problem A are

d/dt φ_i(t) = c(t,i,u*(t)) + Σ_{j : g(i)≠g(j)} J_{u*}(t,Y_r,t,g(j),j) a_ij(t,u*(t))
   − φ_i(t) a_ii(t,u*(t)) − Σ_{j≠i : g(i)=g(j)} φ_j(t) a_ij(t,u*(t))   (36)

with transversality conditions

φ_j(T) = 0 .   (37)

Thus (34),(35) and the uniqueness of solutions of systems of linear differential equations imply that the adjoint variables of Problem A are the negatives of the state variables of Problem B.

A similar calculation with Pontryagin's principle shows that the adjoint equations
of Problem B are the same as the state equations of Problem A, and that the transvers-
ality conditions are the negatives of the boundary conditions (32). Thus the adjoint
variables of Problem B are the negatives of the state variables of Problem A. Putting
these together with the maximum condition of Pontryagin's principle gives the follow-
ing type of duality for Problems A and B.

THEOREM 6: A necessary condition that u*(t) be an optimal control for either


Problem A or Problem B is that almost surely with respect to Lebesgue measure on
[Tk,T]

veU " {j : g ( i ) ~ g ( j ) } u r z3 ]

i s a t t a i n e d by the v a l u e o f the o p t i m a l c o n t r o l v = u * ( t ) , where Q i ( t ) i s t h e s o l u t i o n


of (31) with v ( t ) = u * ( t ) and with boundary c o n d i t i o n (32) and J i ( t ) is the soIution
of (34) with v ( t ) = u * ( t ) and with boundary c o n d i t i o n (35).

Putting Theorem 6 together with Theorem 5 gives:

THEOREM 7: A n e c e s s a r y c o n d i t i o n t h a t ( u * ( t , Y k ) } be an o p t i m a l c o n t r o l for t h e s t o -
c h a s t i c control problem of Section II is that almost surely with respect to the dis-
tribution of Yk and almost surely with respect to Lebesgue measure on [Tk,T] that

+ ~ ~ u*(t'Yk'i? aii(t'v) + {j:g(i)=g(j)}

is attained by v = u*(t,Y_k), where Q_i(t,Y_k) is the solution of (19) with
control u*(t,Y_k) and boundary condition (20), and J_{u*}(t,Y_k,i) is the solution
of (23) with control u*(t,Y_k) and boundary condition (24).

APPENDIX
VII. UNNORMALIZED CONDITIONAL PROBABILITIES
In this appendix, x is a jump process and y, defined by y(t) = g(x(t)), is the
observed process. To shorten the notation slightly we shall assume that the
conditional distributions P[T_{n+1} ≤ t, x_{n+1} = j | X_n] have the form

    P[T_{n+1} ≤ t, x_{n+1} = j | X_n]
        = Σ_i 1{T_n ≤ t, x_n = i} ∫_{T_n}^t a_ij(s,Y_k) exp[ ∫_{T_n}^s a_ii(r,Y_k) dr ] ds .   (40)

Thus (9) is of this form with

    a_ij(s,Y_k) = a_ij(s, u(s,Y_k)) .
The following four formulas, which are consequences of (40), are important for
later work.

    E[ 1{T_n ≤ t < T_{n+1}} | X_n ] = Σ_i 1{T_n ≤ t, x_n = i} exp[ ∫_{T_n}^t a_ii(r,Y_k) dr ]   (41)

    P[T_{n+1} ≤ t, x_{n+1} = j | X_n]
        = Σ_i 1{T_n ≤ t, x_n = i} ∫_{T_n}^t a_ij(s,Y_k) E[ 1{T_n ≤ s < T_{n+1}} | X_n ] ds   (42)

    1{τ_k ≤ T_n < τ_{k+1}} P[T_{n+1} ≤ t, x_{n+1} = j | X_n]
        = Σ_i 1{τ_k ≤ T_n < τ_{k+1}, x_n = i} ∫_{τ_k}^t a_ij(s,Y_k) E[ 1{T_n ≤ s < T_{n+1}} | X_n ] ds   (43)

    E[ 1{x_n = i, τ_k ≤ T_n < τ_{k+1}, T_n ≤ t < T_{n+1}} | X_n ]
        = 1{x_n = i, τ_k ≤ T_n < τ_{k+1}, T_n ≤ t}
          + 1{x_n = i, τ_k ≤ T_n < τ_{k+1}} ∫_{τ_k}^t a_ii(s,Y_k) E[ 1{T_n ≤ s < T_{n+1}} | X_n ] ds .   (44)

To obtain (41), since 1{T_n ≤ t} is σ(X_n) measurable,

    E[ 1{T_n ≤ t < T_{n+1}} | X_n ] = 1{T_n ≤ t} E[ 1{t < T_{n+1}} | X_n ]
        = 1{T_n ≤ t} ( 1 - P[T_{n+1} ≤ t | X_n] ) .   (45)

Using (40) and (8) and integrating gives

    P[T_{n+1} ≤ t | X_n] = Σ_i 1{T_n ≤ t, x_n = i} [ 1 - exp( ∫_{T_n}^t a_ii(r,Y_k) dr ) ] .   (46)

Substituting (46) in (45) now gives (41).


To obtain (42), notice from (41) that

    Σ_i 1{T_n ≤ t, x_n = i} ∫_{T_n}^t a_ij(s,Y_k) E[ 1{T_n ≤ s < T_{n+1}} | X_n ] ds

        = Σ_i 1{T_n ≤ t, x_n = i} ∫_{T_n}^t a_ij(s,Y_k) Σ_m 1{T_n ≤ s, x_n = m} exp[ ∫_{T_n}^s a_mm(r,Y_k) dr ] ds   (47)

        = Σ_i 1{T_n ≤ t, x_n = i} ∫_{T_n}^t a_ij(s,Y_k) 1{T_n ≤ s, x_n = i} exp[ ∫_{T_n}^s a_ii(r,Y_k) dr ] ds

        = Σ_i 1{T_n ≤ t, x_n = i} ∫_{T_n}^t a_ij(s,Y_k) exp[ ∫_{T_n}^s a_ii(r,Y_k) dr ] ds .

The next to the last equality in (47) follows because to have
1{T_n ≤ t, x_n = i} 1{T_n ≤ s, x_n = m} ≠ 0 we must have i = m. The last equality
follows because on the set {T_n ≤ t, x_n = i}, for s in the range of integration
T_n ≤ s ≤ t, we have 1{T_n ≤ s, x_n = i} = 1.
To obtain (43), notice from (42), since

    1{T_n ≤ s} E[ 1{T_n ≤ s < T_{n+1}} | X_n ] = E[ 1{T_n ≤ s < T_{n+1}} | X_n ] ,   (48)

that

    1{τ_k ≤ T_n < τ_{k+1}} P[T_{n+1} ≤ t, x_{n+1} = j | X_n]
        = 1{τ_k ≤ T_n < τ_{k+1}} Σ_i 1{T_n ≤ t, x_n = i} ∫_{T_n}^t 1{T_n ≤ s} a_ij(s,Y_k) E[ 1{T_n ≤ s < T_{n+1}} | X_n ] ds .   (49)

The lower limit T_n in (49) may be replaced by τ_k, because on the set {τ_k ≤ T_n},
1{T_n ≤ s} is zero for τ_k ≤ s < T_n. In addition,

    1{T_n ≤ t, x_n = i} 1{T_n ≤ s} = 1{T_n ≤ s} 1{x_n = i} .   (50)

Thus (43) follows from (49).
Since

    P[T_n ≤ t < T_{n+1} | X_n] = 1{T_n ≤ t} ( 1 - P[T_{n+1} ≤ t | X_n] ) ,   (51)

(44) follows from (43) and (8).
We shall often use a simple but very important generalization of the law of
iterated conditional expectations, stated by Makowski in [3], Theorem A1, which
applies in the case in which there may not be global inclusion between two
sigma-fields.

THEOREM: Let σ and σ' be two σ-fields. Let A be a set which is in both σ and σ'
and for which
    σ ∩ A ⊂ σ' ∩ A .   (52)
Then if z is a random variable for which
    E|z| < ∞ ,
it follows that
    E[ 1_A E[z | σ'] | σ ] = E[ 1_A z | σ ] .   (53)

In particular, the set {τ_k ≤ T_n} is both σ(X_n) and σ(Y_k) measurable, and

    1{τ_k ≤ T_n} σ(Y_k) ⊂ 1{τ_k ≤ T_n} σ(X_n) .   (54)

Thus

    E[ 1{τ_k ≤ T_n} E[z | X_n] | Y_k ] = E[ 1{τ_k ≤ T_n} z | Y_k ] .   (55)

Call (53) or (55) the generalized law of iterated conditional expectations.
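The identity (53) can be checked directly on a finite probability space. The sketch below (plain Python, with hypothetical outcomes, probabilities, and values of z not taken from the paper) builds σ and σ' from partitions, takes a set A lying in both with σ ∩ A = σ' ∩ A (so condition (52) holds), and verifies E[1_A E[z|σ']|σ] = E[1_A z|σ] pointwise.

```python
from fractions import Fraction as F

# Sample space with four outcomes and illustrative probabilities and values.
omega = [0, 1, 2, 3]
p = {0: F(1, 4), 1: F(1, 4), 2: F(1, 4), 3: F(1, 4)}
z = {0: F(3), 1: F(-1), 2: F(5), 3: F(2)}

# sigma is generated by {{0,1},{2,3}}; sigma' refines it only off A:
# {{0},{1},{2,3}}.  A = {2,3} belongs to both sigma-fields and
# sigma cap A = sigma' cap A, which is condition (52).
part_sigma = [{0, 1}, {2, 3}]
part_sigma_p = [{0}, {1}, {2, 3}]
A = {2, 3}

def cond_exp(f, partition):
    """E[f | partition], returned as a function (dict) on the sample space."""
    out = {}
    for cell in partition:
        mass = sum(p[w] for w in cell)
        val = sum(p[w] * f[w] for w in cell) / mass
        for w in cell:
            out[w] = val
    return out

ind_A = {w: F(1) if w in A else F(0) for w in omega}

# Left side of (53): E[ 1_A E[z | sigma'] | sigma ].
inner = cond_exp(z, part_sigma_p)
lhs = cond_exp({w: ind_A[w] * inner[w] for w in omega}, part_sigma)

# Right side of (53): E[ 1_A z | sigma ].
rhs = cond_exp({w: ind_A[w] * z[w] for w in omega}, part_sigma)
```

Here `lhs` and `rhs` agree outcome by outcome, which is exactly the conclusion (53) on this toy space.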

THEOREM A1: Given the history Y_k of the first k jumps of the observed process,
the conditional probability that the next observed jump τ_{k+1} occurs before t
and that at this time the unobserved process x takes on the value j is given by

    P[τ_{k+1} ≤ t, x(τ_{k+1}) = j | Y_k]
        = { ∫_{τ_k ∧ t}^t Σ_i Q_i(s,Y_k) a_ij(s,Y_k) ds   if g(j) ≠ y_k
          { 0                                             if g(j) = y_k .   (56)

The form of the conditional probability distribution of the observed process and
the conditional probabilities of the unobserved process at the time of an observed
jump are given in the following two corollaries, which follow immediately from
Theorem A1.

COROLLARY A1:
    P[τ_{k+1} ≤ t, y_{k+1} = ℓ | Y_k]
        = ∫_{τ_k ∧ t}^t Σ_{j:g(j)=ℓ} Σ_i Q_i(s,Y_k) a_ij(s,Y_k) ds .

COROLLARY A2:
    P[x(τ_{k+1}) = j | Y_{k+1}]
        = { Σ_i Q_i(τ_{k+1},Y_k) a_ij(τ_{k+1},Y_k)
            / Σ_{j:g(j)=y_{k+1}} Σ_i Q_i(τ_{k+1},Y_k) a_ij(τ_{k+1},Y_k)   if g(j) = y_{k+1}
          { 0                                                             if g(j) ≠ y_{k+1} .

PROOF OF THEOREMAI: Since Tk*l is a jump time of the observed process, the s t a t e j
t h a t the unobserved process jumps to must s a t i s f y g(J)¢Yk" Thus i t is impossible at
time ~k+l f o r the unobserved process to be in a s t a t e j f o r which g(J)=Yk' and the
conditional p r o b a b i l i t y (56) is 0 f o r these s t a t e s . In the remainder o f the proof
assume t h a t g(J)¢Yk"
337

Since g(j) ≠ y_k, and since {τ_k} are the jumps of the observed process y and
{T_n} the jumps of the unobserved process x, we have that

    {τ_{k+1} ≤ t, x(τ_{k+1}) = j} = ∪_n {τ_k ≤ T_n < τ_{k+1}, T_{n+1} ≤ t, x_{n+1} = j} .   (59)

Thus the monotone convergence theorem for conditional expectations implies

    P[τ_{k+1} ≤ t, x(τ_{k+1}) = j | Y_k]
        = Σ_n P[τ_k ≤ T_n < τ_{k+1}, T_{n+1} ≤ t, x_{n+1} = j | Y_k] .   (60)
The generalized law of iterated conditional expectations implies

    P[τ_k ≤ T_n < τ_{k+1}, T_{n+1} ≤ t, x_{n+1} = j | Y_k]
        = E[ 1{τ_k ≤ T_n} E[ 1{τ_k ≤ T_n < τ_{k+1}, T_{n+1} ≤ t, x_{n+1} = j} | X_n ] | Y_k ] .   (61)

The σ(X_n) measurability of 1{τ_k ≤ T_n < τ_{k+1}} and (42) imply

    E[ 1{τ_k ≤ T_n < τ_{k+1}, T_{n+1} ≤ t, x_{n+1} = j} | X_n ]
        = 1{τ_k ≤ T_n < τ_{k+1}} E[ 1{T_{n+1} ≤ t, x_{n+1} = j} | X_n ]   (62)
        = Σ_i 1{τ_k ≤ T_n < τ_{k+1}, x_n = i} ∫_{τ_k}^t a_ij(s,Y_k) E[ 1{T_n ≤ s < T_{n+1}} | X_n ] ds .

Thus substituting (62) in (61) and then (61) in (60), interchanging integration,
summation and conditional expectation, using the generalized law of iterated
conditional expectations, and using that

    ∪_n {τ_k ≤ T_n < τ_{k+1}, x_n = i, T_n ≤ s < T_{n+1}} = {τ_k ≤ s < τ_{k+1}, x(s) = i}   (63)

gives

    P[τ_{k+1} ≤ t, x(τ_{k+1}) = j | Y_k]
        = ∫_{τ_k}^t Σ_i E[ 1{τ_k ≤ s < τ_{k+1}, x(s) = i} | Y_k ] a_ij(s,Y_k) ds   (64)
        = ∫_{τ_k}^t Σ_i Q_i(s,Y_k) a_ij(s,Y_k) ds ,

which is the conclusion of Theorem A1.

THEOREM 1: If g(i) = y_k, then on the interval τ_k ≤ t < T, Q_i(t,Y_k) satisfies
the system of differential equations

    (d/dt) Q_i(t,Y_k) = a_ii(t,u(t,Y_k)) Q_i(t,Y_k)
        + Σ_{j:g(i)=g(j)} a_ji(t,u(t,Y_k)) Q_j(t,Y_k)   (19)

and the boundary condition

    Q_i(τ_k,Y_k) = P[x(τ_k) = i | Y_k]   (20)

is satisfied. If g(i) ≠ y_k, Q_i(t,Y_k) = 0.

PROOF OF THEOREM 1: If g(i) ≠ y_k, the sets involved in the definition of
Q_i(t,Y_k) have an empty intersection, so Q_i(t,Y_k) is the conditional expectation
of a zero random variable and hence is zero. Assume in the remainder of the proof
that g(i) = y_k.
The monotone convergence theorem for conditional expectations implies

    E[ 1{x(t) = i} 1{τ_k ≤ t < τ_{k+1}} | Y_k ]
        = Σ_n E[ 1{x(t) = i, τ_k ≤ t < τ_{k+1}, T_n ≤ t < T_{n+1}} | Y_k ] .   (65)

Since τ_k are the jumps of the observed process y and T_n the jumps of x, the
following set equality holds:

    {τ_k ≤ t < τ_{k+1}, T_n ≤ t < T_{n+1}} = {τ_k ≤ T_n < τ_{k+1}, T_n ≤ t < T_{n+1}} .   (66)
The generalized law of iterated conditional expectations, (66), and (44) imply

    E[ 1{x(t) = i, τ_k ≤ t < τ_{k+1}, T_n ≤ t < T_{n+1}} | Y_k ]
        = E[ 1{x_n = i, τ_k ≤ T_n < τ_{k+1}, T_n ≤ t < T_{n+1}} | Y_k ]
        = E[ 1{τ_k ≤ T_n} E[ 1{x_n = i, τ_k ≤ T_n < τ_{k+1}, T_n ≤ t < T_{n+1}} | X_n ] | Y_k ]   (67)
        = E[ 1{x_n = i, τ_k ≤ T_n < τ_{k+1}, T_n ≤ t} | Y_k ]
          + E[ 1{x_n = i, τ_k ≤ T_n < τ_{k+1}} ∫_{τ_k}^t a_ii(s,Y_k) E[ 1{T_n ≤ s < T_{n+1}} | X_n ] ds | Y_k ] .

Writing {τ_k ≤ T_n} = {τ_k = T_n} ∪ {τ_k ≤ T_{n-1}}, we have

    E[ 1{x_n = i, τ_k ≤ T_n < τ_{k+1}, T_n ≤ t} | Y_k ]
        = 1{τ_k ≤ t} E[ 1{x_n = i, τ_k = T_n} | Y_k ]   (68)
          + E[ 1{x_n = i, τ_k ≤ T_{n-1} < τ_{k+1}, T_n < τ_{k+1}, T_n ≤ t} | Y_k ] .

Notice, since we are assuming g(i) = y_k, that x_n = i and T_{n-1} < τ_{k+1} imply
that T_n < τ_{k+1}. Thus the condition T_n < τ_{k+1} may be dropped in the last
term of (68). Now by the generalized law of iterated conditional expectations and
(42),

    E[ 1{x_n = i, τ_k ≤ T_{n-1} < τ_{k+1}, T_n ≤ t} | Y_k ]
        = E[ 1{τ_k ≤ T_{n-1}} E[ 1{τ_k ≤ T_{n-1} < τ_{k+1}, T_n ≤ t, x_n = i} | X_{n-1} ] | Y_k ]   (69)
        = Σ_j E[ 1{τ_k ≤ T_{n-1} < τ_{k+1}, x_{n-1} = j} ∫_{τ_k}^t a_ji(s,Y_k) E[ 1{T_{n-1} ≤ s < T_n} | X_{n-1} ] ds | Y_k ] .

Combining (69), (68) and (67), using the generalized law of iterated conditional
expectations, and interchanging integration, summation and conditional expectation
gives

    E[ 1{x(t) = i, τ_k ≤ t < τ_{k+1}, T_n ≤ t < T_{n+1}} | Y_k ]
        = 1{τ_k ≤ t} E[ 1{x_n = i, τ_k = T_n} | Y_k ]   (70)
          + ∫_{τ_k}^t Σ_j a_ji(s,Y_k) E[ 1{τ_k ≤ T_{n-1} < τ_{k+1}, T_{n-1} ≤ s < T_n, x_{n-1} = j} | Y_k ] ds
          + ∫_{τ_k}^t a_ii(s,Y_k) E[ 1{τ_k ≤ T_n < τ_{k+1}, T_n ≤ s < T_{n+1}, x_n = i} | Y_k ] ds .

Using (66) and summing (70) over n gives

    E[ 1{x(t) = i, τ_k ≤ t < τ_{k+1}} | Y_k ]
        = 1{τ_k ≤ t} E[ 1{x(τ_k) = i} | Y_k ]   (71)
          + ∫_{τ_k}^t Σ_j a_ji(s,Y_k) E[ 1{τ_k ≤ s < τ_{k+1}, x(s) = j} | Y_k ] ds
          + ∫_{τ_k}^t a_ii(s,Y_k) E[ 1{τ_k ≤ s < τ_{k+1}, x(s) = i} | Y_k ] ds .

Using the definition (18) of Q_i(t,Y_k), we have from (71) that

    Q_i(t,Y_k) = 1{τ_k ≤ t} P[x(τ_k) = i | Y_k]
        + ∫_{τ_k}^t [ a_ii(s,Y_k) Q_i(s,Y_k) + Σ_j a_ji(s,Y_k) Q_j(s,Y_k) ] ds ,   (72)

from which Theorem 1 follows by differentiation.
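As an illustration only (not from the paper), the linear system of the form (19) can be integrated forward numerically between observed jumps. The sketch below assumes a two-state chain whose states produce the same observed value (so no observed jump intervenes) and a hypothetical constant rate matrix; in that case Q remains a probability vector and relaxes toward the stationary law of the generator.

```python
# Forward Euler for dQ_i/dt = a_ii Q_i + sum_{j != i} a_ji Q_j, i.e. dQ/dt = A^T Q.
# A is a hypothetical generator (rows sum to zero); both states are assumed to
# share the same observed value g(1) = g(2), so (19) couples them directly.
A = [[-1.0, 1.0],
     [2.0, -2.0]]
Q = [0.5, 0.5]              # boundary condition of type (20): law at the last jump
dt, steps = 1e-4, 20000     # integrate over a horizon of length 2

for _ in range(steps):
    dQ = [sum(A[j][i] * Q[j] for j in range(2)) for i in range(2)]
    Q = [Q[i] + dt * dQ[i] for i in range(2)]

# With no observed jump the total mass sum(Q) is conserved, and Q approaches
# the stationary law (2/3, 1/3) of this particular generator.
```

Conservation of sum(Q) here reflects that rows of the generator sum to zero; once an observed jump occurs, (20) resets the vector via the conditional law at the jump time.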

VIII. CONDITIONAL REMAINING COSTS

THEOREM A2: If z is any σ{T_{n+1},x_{n+1},...,T_{n+k},x_{n+k},...}-measurable
random variable for which E|z| < ∞, then E[z | X_n] can be expressed as a function
of x_n and Y_k, where Y_k is the observed history corresponding to X_n.

PROOF: It can be shown using (40) that the theorem is true for the characteristic
function of any σ(T_{n+1},x_{n+1},...,T_{n+k},x_{n+k})-measurable set. The class
of random variables for which Theorem A2 is true is closed under monotone limits
and products. Therefore the theorem follows from the monotone class theorem.
Theorem 2 follows by applying Theorem A2 to both the numerator and denominator
in the definition (21) of J_u(t,X_n).

THEOREM 3: On the interval τ_k ≤ t ≤ T, J_u(t,Y_k,i) satisfies the system of
differential equations

    -(d/dt) J_u(t,Y_k,i) = c(t,i,u(t,Y_k))
        + Σ_{j:g(i)=g(j)} J_u(t,Y_k,j) a_ij(t,u(t,Y_k))   (23)
        + Σ_{j:g(i)≠g(j)} J_u(t,Y_k,t,g(j),j) a_ij(t,u(t,Y_k))
        - J_u(t,Y_k,i) a_ii(t,u(t,Y_k)) .

At time T the boundary condition

    J_u(T,Y_k,i) = 0   (24)

is satisfied.

PROOF OF THEOREM 3: Breaking the integral in (73) into the integral between t and
the next jump T_{n+1} and the integral beyond T_{n+1}, using that between the n-th
and (n+1)-st jumps x(t) = x_n and u(t) = u(t,Y_k), and then using the formulas (41)
and (56) giving the distribution of T_{n+1} and x_{n+1}, we have

    E[ 1{T_n ≤ t < T_{n+1}} ∫_t^T c(s,x(s),u(s)) ds | X_n ]

        = E[ 1{T_n ≤ t < T_{n+1}} ( ∫_t^{T∧T_{n+1}} c(s,x(s),u(s)) ds
            + ∫_{T∧T_{n+1}}^T c(s,x(s),u(s)) ds ) | X_n ]

        = Σ_i 1{T_n ≤ t, x_n = i} ∫_t^T exp[ ∫_t^s a_ii(r,Y_k) dr ] c(s,i,u(s,Y_k)) ds   (73)
          + E[ 1{T_n ≤ t < T_{n+1}} J_u(T_{n+1},X_{n+1}) | X_n ]

        = Σ_i 1{T_n ≤ t, x_n = i} ∫_t^T exp[ ∫_t^s a_ii(r,Y_k) dr ] c(s,i,u(s,Y_k)) ds
          + Σ_i 1{T_n ≤ t, x_n = i} ∫_t^T Σ_j J_u(s,X_n,s,j) a_ij(s,Y_k) exp[ ∫_t^s a_ii(r,Y_k) dr ] ds .

Now since

    J_u(t,X_n) = E[ 1{T_n ≤ t < T_{n+1}} ∫_t^T c(s,x(s),u(s)) ds | X_n ]
                 / E[ 1{T_n ≤ t < T_{n+1}} | X_n ]

and from (41)

    E[ 1{T_n ≤ t < T_{n+1}} | X_n ] = Σ_i 1{T_n ≤ t, x_n = i} exp[ ∫_{T_n}^t a_ii(r,Y_k) dr ] ,   (41)

dividing (73) by (41) gives (74).

From Theorem 2 we have that if Y_k is the observed history corresponding to X_n
and x_n = i, then
    J_u(t,X_n) = J_u(t,Y_k,i) ;
if X_{n+1} = (X_n,t,j) and g(j) = y_k, then Y_k is the observed history
corresponding to X_{n+1} and
    J_u(t,X_{n+1}) = J_u(t,Y_k,j) ;
and if X_{n+1} = (X_n,t,j) and g(j) ≠ y_k, then Y_{k+1} = (Y_k,t,g(j)) is the
observed history corresponding to X_{n+1} and
    J_u(t,X_{n+1}) = J_u(t,Y_k,t,g(j),j) .
Making these substitutions in (74) gives, for t ≥ T_n, that

    J_u(t,Y_k,i) = ∫_t^T exp[ ∫_t^s a_ii(r,Y_k) dr ] c(s,i,u(s,Y_k)) ds   (75)

        + ∫_t^T [ Σ_{j:g(j)=y_k} J_u(s,Y_k,j) a_ij(s,Y_k)
            + Σ_{j:g(j)≠y_k} J_u(s,Y_k,s,g(j),j) a_ij(s,Y_k) ] exp[ ∫_t^s a_ii(r,Y_k) dr ] ds .

Theorem 3 follows by differentiating (75) with respect to t.
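The backward structure of the system (23) with terminal condition (24) can be seen in a one-state sketch (with illustrative constants, not data from the paper): with a single unobserved state the system reduces to -dJ/dt = c + aJ, J(T) = 0, whose closed-form solution J(t) = (c/a)(e^{a(T-t)} - 1) is recovered by stepping backward from T.

```python
import math

# One-state reduction of the backward system: -dJ/dt = c + a*J, J(T) = 0.
# Constants are illustrative: c = 1 (running cost rate), a = -1 (a_ii), T = 1.
c, a, T = 1.0, -1.0, 1.0
dt, steps = 1e-5, 100000

J = 0.0                      # boundary condition of type (24) at time T
for _ in range(steps):       # step backward from T toward 0
    J += dt * (c + a * J)    # J(t - dt) = J(t) + dt * (c + a*J(t))

exact = (c / a) * (math.exp(a * T) - 1.0)   # closed form evaluated at t = 0
```

The numerical value at t = 0 matches 1 - e^{-1} to the accuracy of the Euler step, illustrating why the integral form (75) and the differential form (23) are equivalent.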

THEOREM 4: The conditional remaining cost from an observed jump time τ_k onward
(or from time τ_0 = 0 onward if k = 0), conditioned with respect to the observed
history Y_k, satisfies

    E[ ∫_{τ_k}^T c(s,x(s),u(s)) ds | Y_k ]
        = ∫_{τ_k}^T Σ_i [ c(t,i,u(t,Y_k))
            + Σ_{j:g(j)≠g(i)} J_u(t,Y_k,t,g(j),j) a_ij(t,u(t,Y_k)) ] Q_i(t,Y_k) dt   (25)

and also satisfies

    E[ ∫_{τ_k}^T c(s,x(s),u(s)) ds | Y_k ] = Σ_i J_u(τ_k,Y_k,i) P[x(τ_k) = i | Y_k] .   (26)

PROOF OF THEOREM 4:

    E[ ∫_{τ_k}^T c(s,x(s),u(s)) ds | Y_k ]
        = E[ ∫_{τ_k}^{T∧τ_{k+1}} c(s,x(s),u(s)) ds + ∫_{T∧τ_{k+1}}^T c(s,x(s),u(s)) ds | Y_k ] .   (76)

Using the generalized law of iterated conditional expectations,

    E[ ∫_{T∧τ_{k+1}}^T c(s,x(s),u(s)) ds | Y_k ]
        = E[ Σ_n 1{τ_{k+1} = T_n} E[ ∫_{T∧τ_{k+1}}^T c(s,x(s),u(s)) ds | X_n ] | Y_k ] .   (77)

Now

    E[ ∫_{T∧τ_{k+1}}^T c(s,x(s),u(s)) ds | X_n ] = J_u(τ_{k+1},X_n)   (78)

and on the set {τ_{k+1} = T_n}, Y_{k+1} is the observed history corresponding to
X_n. Thus by Theorem 2

    Σ_n 1{τ_{k+1} = T_n} J_u(τ_{k+1},X_n)
        = Σ_n Σ_j 1{τ_{k+1} = T_n, x_n = j} J_u(τ_{k+1},Y_{k+1},j)
        = Σ_j 1{x(τ_{k+1}) = j} J_u(τ_{k+1},Y_{k+1},j) .   (79)

Hence from (76), (77), (78), (79),

    E[ ∫_{τ_k}^T c(s,x(s),u(s)) ds | Y_k ]
        = E[ ∫_{τ_k}^T 1{τ_k ≤ s < τ_{k+1}} c(s,x(s),u(s)) ds | Y_k ]
          + E[ Σ_j 1{x(τ_{k+1}) = j} J_u(τ_{k+1},Y_{k+1},j) | Y_k ]
        = ∫_{τ_k}^T Σ_i E[ 1{τ_k ≤ s < τ_{k+1}, x(s) = i} | Y_k ] c(s,i,u(s,Y_k)) ds   (80)
          + ∫_{τ_k}^T Σ_j J_u(s,Y_k,s,g(j),j) Σ_i Q_i(s,Y_k) a_ij(s,Y_k) ds
        = ∫_{τ_k}^T Σ_i [ c(s,i,u(s,Y_k))
            + Σ_{j:g(j)≠y_k} J_u(s,Y_k,s,g(j),j) a_ij(s,Y_k) ] Q_i(s,Y_k) ds ,

which gives (25).

To obtain (26), notice, using the generalized law of iterated conditional
expectations and Theorem 2, that

    E[ ∫_{τ_k}^T c(s,x(s),u(s)) ds | Y_k ]
        = E[ Σ_n 1{τ_k = T_n} E[ ∫_{τ_k}^T c(s,x(s),u(s)) ds | X_n ] | Y_k ]
        = E[ Σ_n Σ_i 1{τ_k = T_n, x_n = i} J_u(τ_k,Y_k,i) | Y_k ]   (81)
        = Σ_i P[x(τ_k) = i | Y_k] J_u(τ_k,Y_k,i) .
REFERENCES
[1] R. Boel, P. Varaiya, "Optimal Control of Jump Processes," SIAM Journal on
    Control and Optimization, Vol. 15 (1977).
[2] M. Kohlman, A. Makowski, R. Rishel, "Representation Results for Jump Processes
    with Application to Optimal Stopping," Stochastics, Vol. 4 (1980).
[3] A. Makowski, "Local Optimality Conditions for Optimal Stopping," Stochastics,
    1982.
[4] R. Rishel, "A Minimum Principle for Controlled Jump Processes," Springer
    Lecture Notes in Economics and Mathematical Systems, Vol. 107 (1975).
[5] R. Rishel, "The Minimum Principle, Separation Principle and Dynamic Programming
    for Partially Observed Jump Processes," IEEE Transactions on Automatic Control,
    Vol. AC-23 (1978).
[6] R. Rishel, "Optimality for Completely Observed Jump Processes," IEEE
    Transactions on Automatic Control, Vol. AC-22 (1977).
[7] R. Rishel, "State Estimation for Partially Observed Jump Processes," Journal of
    Mathematical Analysis and Applications, Vol. 65 (1978).
[8] M. Rudemo, "State Estimation for Partially Observed Markov Chains," Journal of
    Mathematical Analysis and Applications, Vol. 44 (1973).
[9] A. Segall, "Optimal Control of a Finite State Markov Process," IEEE
    Transactions on Automatic Control, Vol. AC-22 (1977).
ON NORMAL APPROXIMATION IN BANACH SPACES

V. V. Sazonov

Steklov Mathematical Institute

Moscow, U.S.S.R.

During recent years considerable progress was made in the study of the accuracy
of the normal approximation given by the central limit theorem in Banach spaces.
In this talk I intend to survey some important new results in the area. Consider
first the somewhat simpler case of Hilbert space.
Let X_1, X_2, ... be a sequence of independent identically distributed (i.i.d.)
random variables with values in a separable Hilbert space H. The inner product and
norm in H will be denoted by (·,·) and |·| respectively. Suppose that

    E X_1 = 0 ,  E |X_1|² < ∞ ,

and let V be the covariance operator of X_1, i.e.

    (V x, y) = E (X_1, x)(X_1, y) ,  x, y ∈ H ;

recall that V is a linear bounded nonnegative symmetric operator with finite trace
tr V. Denote

    S_n = n^{-1/2} (X_1 + ... + X_n) ,

and let Y be a Gaussian random variable with values in H with E Y = 0 and
covariance operator V.

The Central Limit Theorem states that the probability measures induced on H by
S_n converge weakly to the Gaussian probability measure induced on H by Y.

We are interested in the estimation of

    Δ_n(𝒜) = sup_{A ∈ 𝒜} | P(S_n ∈ A) - P(Y ∈ A) | ,

where 𝒜 is some class of Borel subsets of H. It is well known (see e.g. [15])
that, contrary to the finite dimensional case, Δ_n(𝒜) in general does not tend to
zero as n → ∞ if 𝒜 is rich enough, e.g. if 𝒜 contains the class of all halfspaces
or the class of all balls. In what follows we will be concerned mainly with the
case when 𝒜 = 𝒜_a, the class of all balls with a fixed centre a.
The first estimates of Δ_n(𝒜_a) (apart from the logarithmic speed obtained in
[17]) were of the type O(n^{-α}), α ≤ 1/6, and were valid when E|X_1|^s < ∞ for
some s ≥ 3 (depending on the estimate) (see [11], [16], and Yurinskii's theorem in
[15]). Under some additional (rather restrictive) conditions on the probability
measure induced by X_1, better speeds were obtained in [4] and [5].
Denote by λ_k the k-th eigenvalue of V (we assume that the eigenvalues of V are
numbered in decreasing order). In 1979 F. Götze [7] proved that if E|X_1|⁶ < ∞ and
λ_6 > 0 then Δ_n(𝒜_a) = O(n^{-1/2}); moreover, if E|X_1|⁸ < ∞ and λ_9 > 0 then

    Δ_n(𝒜_a) = o(n^{-1+ε})   (1)

for any ε > 0. Later F. Götze was able to show (see [10]) that (1) is true if
E|X_1|⁴ < ∞ and λ_j > 0, j ≥ 1, with λ_j = O(j^{-p}) for some p ≠ 0.
In 1981 V. V. Yurinskii [18], [19], under the assumption that X_j, j ≥ 1, are
independent (but not necessarily identically distributed), E X_j = 0, and all X_j
have the same covariance operator V, proved that

    Δ_n(𝒜_a) ≤ c n^{-3/2} Σ_{i=1}^n E|X_i|³ [ 1 + ( |a| (tr V)^{-1/2} )³ ] ,   (2)

where c depends only on the eigenvalues of V (tr V)^{-1}. Note that V. V.
Yurinskii also generalized this result to spaces L_{2k}, where k is a positive
integer. In this more general case the result has the same form, only the power 3
in square brackets in (2) is replaced by 3(2k-1).


Recently B. A. Zalesskii was able to extend and partially improve these results,
assuming E|X_i|^{3+δ} < ∞, 0 < δ ≤ 1; in particular, if E|X_i|⁴ < ∞ then for any
fixed ε > 0

    Δ_n(𝒜) = o(n^{-1+ε})

(see [20], [21]).
In the same paper B. A. Zalesskii pointed out that the speed in (2) is the best
possible; namely, he constructed a sequence {X_i} with E|X_i|^s < ∞ for every
s > 0 for which the speed n^{-1/2} in (2) cannot be improved, and he also proved a
corresponding lower bound for a sequence {X_i} with E|X_i|³ < ∞.
Two main methods were employed to obtain the above-mentioned estimates. One method
consists in approximating the indicator function of the ball
B_a(r) = {x ∈ H : |x - a| < r} by a smooth function f(x), thus replacing the
estimation of

    (P_n - Q)(B_a(r))

by the estimation of

    ∫ f d(P_n - Q) .   (3)

To estimate (3) the difference P_n - Q is represented as

    P_n - Q = Σ_{j=0}^{n-1} ( R_{j+1} - R_j ) ,

where R_j is the probability measure induced by
n^{-1/2}(X_1 + ... + X_j + Y_{j+1} + ... + Y_n), a convolution of the measure
induced by n^{-1/2}(X_1 + ... + X_j) with the normal (0, (n-j) n^{-1} V) measure;
thus R_0 = Q and R_n = P_n. One then has to estimate terms of the type
∫ f d(R_{j+1} - R_j). For this purpose f(x + y) is expanded by Taylor's formula
with respect to y. Since the probability measures of the exchanged summands have
the same first and second moments, the integration of the first three terms of the
expansion gives zero. These are the main lines of the first method. The best speed
obtained by applying it is n^{-1/6}. However the method makes it possible to
construct estimates with explicit dependence on the parameters of the probability
measure induced by X_1 (see [15]).
The second method consists in applying Fourier analysis to the measure on the real
line induced by |S_n - a|². Usually the random variables X_j are first truncated,
i.e. replaced by X_j' = X_j 1{|X_j| ≤ z_n} (1_E denotes the indicator function of
E), the truncation level z_n being chosen in such a way that the error committed
by replacing S_n by S_n' = n^{-1/2} Σ X_j' is small enough. The characteristic
function f_n of |S_n' - a|² is smooth, and one may successfully apply Fourier
analysis. By an Esseen inequality we have

    sup_x | P(|S_n' - a|² < x) - P(|Y - a|² < x) |
        ≤ c ∫_{-T}^{T} | ( f_n(t) - ψ(t) ) / t | dt + c / T ,

where ψ(t) is the characteristic function of |Y - a|². To estimate
|f_n(t) - ψ(t)| for small values of t, an expansion of f_n - ψ similar to the
expansion of P_n - Q in the first method described above is used. Then (and for
larger t) Götze-Yurinskii estimates of characteristic functions such as f_n (see
[7], [18]) are applied; these bound |f_n(t)| in terms of two independent centered
Gaussian H-valued random variables Y_1, Y_2 with the same covariance operator as
X_1, with constants independent of n and t.

In special cases and under additional assumptions better estimates may be
obtained. Consider, for example, H = L_2(0,1) and let

    X_j(t) = e(t - ξ_j) - t ,  t ∈ [0,1] ,

where ξ_j, j ≥ 1, are independent and uniformly distributed on [0,1], and
e(u) = 1 if u > 0, e(u) = 0 if u ≤ 0. Note that in this case

    |S_n|² = n ∫_0^1 ( U_n(t) - t )² dt = ω_n² ,

where U_n(t) is the empirical distribution function corresponding to
ξ_1, ..., ξ_n, and thus

    Δ_n(𝒜_0) = sup_x | P(ω_n² < x) - F(x) | ,

where F(x) is the limiting distribution function for P(ω_n² < x). In other words,
the estimation of Δ_n(𝒜_0) in this special case means the estimation of the speed
of convergence in the ω² criterion in statistics. Recently Yu. V. Borovskih [3]
proved that

    | P(ω_n² < x) - F(x) | ≤ c (1 + x)^{-1} n^{-1/2} .
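For concreteness, the statistic ω_n² above has the standard computational form ω_n² = 1/(12n) + Σ_{j=1}^n ((2j-1)/(2n) - ξ_(j))², where ξ_(1) ≤ ... ≤ ξ_(n) are the ordered observations. A small sketch with an illustrative sample (the sample values are hypothetical):

```python
def cramer_von_mises(sample):
    """omega_n^2 = n * int_0^1 (U_n(t) - t)^2 dt for a sample from [0,1],
    computed by the standard order-statistic formula."""
    t = sorted(sample)
    n = len(t)
    return 1.0 / (12 * n) + sum(
        ((2 * j - 1) / (2.0 * n) - t[j - 1]) ** 2 for j in range(1, n + 1)
    )

# Evenly spread sample: the order statistics coincide with the plotting
# positions (2j-1)/(2n), so only the 1/(12n) term survives.
w2 = cramer_von_mises([0.1, 0.3, 0.5, 0.7, 0.9])
```

With n = 5 and this particular sample the sum vanishes and w2 equals 1/60, the minimum possible value of the statistic for n = 5.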

There exist several generalizations of these results to the case of random
variables with values in a separable Banach space B (see e.g. [1], [2], [6], [8],
[9], [12]-[14]). Here we will mention two recent results. The first one belongs to
F. Götze [9] and is more general also in the sense that it deals with a larger
class of sets than balls. Suppose X_1, X_2, ... are i.i.d. random variables with
values in B and such that E‖X_1‖⁴ < ∞ (‖·‖ is the norm in B), E X_1 = 0. Assume
also that the probability measure P_n induced by n^{-1/2} Σ X_i converges weakly
to some Gaussian Borel measure Q on B. Let now F be a real functional on B such
that

(i) F is four times continuously Fréchet differentiable and
    ‖F^{(j)}(x)‖ ≤ c_F (1 + ‖x‖)^γ ,  1 ≤ j ≤ 4 ,
for any x ∈ B and some c_F > 0, γ ≥ 0;

(ii) Q{x : |F'(x)| ≤ ε} = O(ε^c) as ε → 0, where c is large enough;

then

    sup_x | (P_n - Q){y : F(y) < x} | = o(n^{-1/2}) .   (4)

Note that in Hilbert space the condition (i) on F(x) = |x - a|² is satisfied. If
B is an arbitrary Banach space this result can be applied to F(x) = ‖x‖ only if
‖·‖ is smooth, which is not always true. E.g. the supremum norm in B = C(S), the
space of continuous real functions on a compact metric space S, is not Fréchet
differentiable. We will mention here a recent result for C(S)-valued random
variables obtained

by V. Bentkus [1]. Let ρ be a continuous pseudometric on S, N_ρ(ε) be the smallest
number of open ρ-balls of radius ε covering S, and let H_ρ(ε) = ln N_ρ(ε) be the
metric entropy of S. For x ∈ C(S) denote

    ‖x‖_ρ = ‖x‖ + sup_{s≠t} |x(s) - x(t)| / ρ(s,t) ,  s, t ∈ S ,

and let

    I(ρ) = ∫_0^1 H_ρ^{1/2}(ε) dε .

As before, X_1, X_2, ... are i.i.d. random variables with values in C(S),
E X_1(s) = 0 for any s ∈ S, and S_n = n^{-1/2} Σ X_i. Then if I(ρ) < ∞,
Λ = E‖X_1‖_ρ³ < ∞, and for some γ > 0 a nondegeneracy condition on the limiting
Gaussian law holds, Y being a Gaussian C(S)-valued random variable with
E Y(s)Y(t) = E X_1(s)X_1(t) for any s, t ∈ S, we have

    Δ_n(𝒜_0) ≤ c Λ^{1/3} n^{-1/6} .   (5)


Inequality (4) is obtained by a method similar to the Fourier method in Hilbert
space outlined above. Estimate (5) is obtained by applying a finite dimensional
approximation and using appropriate finite dimensional estimates.

References

1. Bentkus, V., Estimates of the speed of convergence in the central limit theorem
   in C(S), to appear in Dokl. Akad. Nauk SSSR.
2. Bernotas, V. and Paulauskas, V., Non-uniform estimates in the central limit
   theorem in some Banach spaces, Litovsk. Mat. Sb., 19 (1979), 23-43.
3. Borovskih, Yu., Estimates of characteristic functions with applications to
   ω²-statistics, to appear in Teor. Verojatnost. i Primenen.
4. Borovskih, Yu. and Račkauskas, A., Asymptotic behaviour of distributions in
   Banach spaces, Litovsk. Mat. Sb., 19 (1979), 39-54.
5. Čebotarev, V.I., Estimates of the speed of convergence in the central limit
   theorem in l_2, Sibirsk. Mat. Ž., 20 (1974), 1099-1116.
6. Giné, E.M., Bounds of the speed of convergence in the central limit theorem in
   C(S), Z. Wahrsch. verw. Geb., 36 (1976), 317-331.
7. Götze, F., Asymptotic expansions for bivariate von Mises functionals,
   Z. Wahrsch. verw. Geb., 50 (1979), 333-355.
8. Götze, F., On Edgeworth expansions in Banach spaces, Ann. Probab., 9 (1981),
   852-859.
9. Götze, F., On the rate of convergence in the central limit theorem in Banach
   spaces, Preprints in Statistics, University of Cologne, 68 (1981), pp. 1-34.
10. Götze, F., Convergence rate in the central limit theorem in Hilbert space,
    14th European Meeting of Statisticians, Wroclaw 1981, Abstracts, p. 35.
11. Kuelbs, J. and Kurtz, T., Berry-Esseen estimates in Hilbert space and an
    application to the law of the iterated logarithm, Ann. Probab., 2 (1974),
    387-407.
12. Paulauskas, V., The estimate of the rate of convergence in the central limit
    theorem in C(S), Litovsk. Mat. Sb., 16 (1976), 167-201.
13. Paulauskas, V., On the rate of convergence in the central limit theorem in
    some Banach spaces, Teor. Verojatnost. i Primenen., 21 (1976), 775-791.
14. Paulauskas, V., On convergence of some functionals of sums of independent
    random variables in a real Banach space, Litovsk. Mat. Sb., 16 (1979),
    103-121.
15. Sazonov, V.V., Normal approximation - some recent advances, Lecture Notes in
    Math., Vol. 879, Springer, Berlin (1981).
16. Ulyanov, V.V., On the estimation of the rate of convergence in the central
    limit theorem in a real separable Hilbert space, Mat. Zametki, 28 (1980),
    465-473.
17. Vakhania, N.N. and Kandelaki, N.P., On the estimation of the speed of
    convergence in the central limit theorem in Hilbert space, Trans. Comput.
    Centre Acad. Sci. Georgian SSR, 10:1 (1969), 150-160.
18. Yurinskii, V.V., On the accuracy of Gaussian approximation for the probability
    of hitting a ball, Teor. Verojatnost. i Primenen., 2 (1982), 270-278.
19. Yurinskii, V.V., Estimate of the accuracy of normal approximation of the
    probability of hitting a ball, Dokl. Akad. Nauk SSSR, 258 (1981), 557-558.
20. Zalesskii, B.A., Estimates of the accuracy of normal approximation in a
    Hilbert space, Teor. Verojatnost. i Primenen., 2 (1982), 279-285.
21. Zalesskii, B.A., On the rate of convergence in the central limit theorem in
    Hilbert space, to appear in Dokl. Akad. Nauk SSSR.


A CLASS OF PROBLEMS IN THE OPTIMAL CONTROL OF DIFFUSIONS
WITH FINITELY MANY CONTROLS

Diane D. Sheng
Bell Laboratories
Holmdel, New Jersey 07733
USA

1. INTRODUCTION

In this paper we consider a class of problems in which the process to be
controlled is fundamentally a Brownian motion on the nonnegative real line with
either absorption or reflection at the origin. At each point in time one of a
finite number of controls must be employed, and the controls chosen influence the
evolution of the Brownian motion process. Operational costs and linear holding
costs are continuously incurred at a rate depending on the control in use and the
state of the Brownian motion process. Lump-sum switching costs are incurred
instantaneously each time there is a switch in controls, and these switching costs
are dependent on the two controls involved in the switch. Further, if the barrier
at the origin is absorbing there is a cost for hitting the boundary. We allow a
general set of control strategies where the current control may depend in an
arbitrary, measurable way on past states and past controls. The objective is to
determine a strategy for selecting a control at each point in time, so as to
minimize the total expected cost discounted over an infinite planning horizon.
These problems are related to those studied by Doshi [4,5], Mandl [10], and Pliska
[13], differences being in the form of the holding costs or switching costs; and
also are related to work by Arkin, Kolemaev, and Shiryaev [1], Beneš [2], Davis
and Varaiya [3], and Fleming [6], differences being in the controllable nature of
the drift coefficient and/or diffusion coefficient of the diffusion processes.
Additionally, our focus emphasizes explicit production of optimal control
strategies.

Diffusion models in general, and controlled Brownian motion in particular, have
been used by Foschini [7], Iglehart [8], and others in modeling water reservoirs,
cash management inventories and other input-output systems. As an application of
the work presented here, consider a service facility with infinite capacity that
can be operated at different service rates. (For example, the rate at which a
packet switch transports data might be varied by changing the number or speed of
outgoing transmission channels.) Suppose the costs associated with this service
facility are a linear customer-holding cost,

an operational cost charged at a rate dependent on the prevailing service rate,
and switching costs charged with each change in service rate. The problem is to
choose the service rate at each point in time so as to minimize the service
facility's total expected costs. The process to be controlled, then, is the number
of customers in the service facility, and indeed it is often reasonable to
represent the number of customers in a queueing system by a Brownian motion
process. (This is because, with sufficient congestion, the discreteness of
individual customers becomes relatively unimportant, and the approximation is
justified by limit theorems; see, for example, Rath [14].)

It is conjectured here that there always exists an optimal Markov control strategy
that is of a particular band form, but this has been shown only for two available
controls. Necessary and sufficient conditions are given for such a "band strategy"
to be optimal, and optimal band strategies are explicitly produced for a few
special cases.

2. FORMULATION

The data of our control problems are the set A = {1,2,...,N} of available actions
or control modes, and real constants r_i, μ_i, and σ_i for each i ∈ A. These
constants are the operational cost rate, drift parameter, and variance parameter,
respectively, for control mode i. We assume that σ_i > 0 for all i ∈ A. Also given
are nonnegative switching costs K_ij for i,j ∈ A which satisfy

(2.1) K_ii = 0 for each i ∈ A, and

(2.2) K_ij ≤ K_ia + K_aj for each i,j,a ∈ A.

Finally, there is a holding cost rate h, a boundary cost R, an interest rate
α > 0, and a boundary parameter λ ∈ {0,1}.

The state of the system at time t is denoted by X(t) for t ≥ 0, and the state
space by S = [0,∞). Whenever control mode i is employed, the state of the system
changes locally like a Brownian motion with infinitesimal drift μ_i and
infinitesimal variance σ_i. At the origin the process is absorbed if λ = 0, or
instantaneously reflected if λ = 1.

There are continuous costs incurred at each point in time which are a function of
the current state of the system and the current action. Denote these costs by
g: S × A → ℝ, where

    g(x,i) = { h x + r_i     if x > 0
             { (1-λ) α R     if x = 0.

Additionally, the lump sum switching costs K_ij are incurred instantaneously
whenever there is a change from control mode i to mode j. Condition (2.2)
guarantees that it is cheaper to switch directly from action i to j than to switch
from i to any other intermediary action and then to j. Both the continuous and
switching costs are discounted, so that a total cost of C incurred at time t is
equivalent to a cost of C e^{-αt} incurred at time zero.

Based on the switching costs, we can divide the action space into disjoint
equivalence classes. For each i ∈ A, let e(i) = {j ∈ A : K_ij = K_ji = 0} denote
the set of all control modes that can be reached from i, and from which i can be
reached, with zero switching costs. Denote the M disjoint equivalence classes by
A_1, A_2, ..., A_M. It costs nothing to switch in either direction among actions
in any one equivalence class, and there is a positive switching cost for switching
in some direction between actions in different equivalence classes. Let C_kℓ for
each k,ℓ ∈ {1,2,...,M} represent the switching costs between equivalence classes.
That is,

(2.3) C_kℓ = K_ij for each i ∈ A_k, j ∈ A_ℓ, k,ℓ ∈ {1,2,...,M},

(2.4) C_kk = 0 for each k ∈ {1,2,...,M}, and

(2.5) C_kℓ ≤ C_km + C_mℓ for each k,ℓ,m ∈ {1,2,...,M}.

Start with a probability space (Ω,F,P) on which is defined a standard Brownian motion B = {B(t); t ≥ 0}, and let {F_t; t ≥ 0} be the increasing family of sub-σ-fields generated by B. We allow the following class of control strategies.
Definition. An admissible strategy is a function π: Ω × (0,∞) → A s.t.
(2.6)  π(ω,t) is jointly measurable in ω and t,
(2.7)  π(·,t) is F_t-measurable for each t > 0, and
(2.8)  the function θ*(ω,t) = θ(π(ω,t)), where θ(i) denotes the index of the equivalence class containing action i, has only finitely many discontinuities in each finite interval of time.
Hereafter, we suppress the dependence of π on ω, and augment π by including at time zero an initial action π(0) = i.
We now associate with each admissible strategy π, initial state x, and initial action i, a controlled Itô process. For each admissible strategy π, x ∈ S, and i ∈ A, let X and Y be the unique pair of nonanticipating processes s.t.

(2.9)  X(t) = x + ∫_0^t μ_{π(u)} du + ∫_0^t σ_{π(u)} dB(u) + Y(t) for each t ≥ 0, and

(2.10) Y(·) is continuous and nondecreasing with Y(0) = 0, and grows only when X(t) = 0.



That is, X(t) = Z(t) + Y(t), where

    Z(t) = x + ∫_0^t μ_{π(u)} du + ∫_0^t σ_{π(u)} dB(u), and

    Y(t) = [inf_{0≤u≤t} Z(u)]⁻ for each t ≥ 0.

Denote the controlled process generated by π, x, and i by {X(t|x,i,π); t ≥ 0}. These controlled processes are precisely as follows: if λ = 1 then X(t|x,i,π) = X(t), and if λ = 0 then X(t|x,i,π) = X(t∧T), where X(t) is defined in (2.9) and T = inf{t ≥ 0: X(t) = 0}.
Remarks. (1) Even though an admissible strategy is a measurable function of the Brownian path, it can be viewed as a function of the history of the action process and the controlled process. However, if early actions were in some sense "randomized," later actions would not be contained in the Brownian σ-fields. This issue of action randomization is not considered here.
(2) We need not require a fixed initial state x. Instead, we could fix an initial distribution for X(0) such that X(0) has finite quadratic mean.
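The dynamics (2.9)-(2.10) are easy to exercise numerically. The following Euler-scheme sketch is not part of the paper: the crude mirror reflection at the origin and the illustrative two-mode threshold rule `band` are assumptions made only for this example.

```python
import math
import random

def simulate(x0, mode0, policy, mu, sigma, T=1.0, dt=1e-3, reflect=True, seed=0):
    """Euler scheme for the controlled diffusion of (2.9)-(2.10).

    policy(x, i) returns the control mode used in state x when the current
    mode is i (a band-function-style rule); mu[i] and sigma[i] are the drift
    and diffusion coefficients of mode i.  reflect=True mimics lambda = 1,
    reflect=False mimics absorption (lambda = 0).
    """
    rng = random.Random(seed)
    x, i, t = x0, mode0, 0.0
    path = [(t, x, i)]
    while t < T:
        i = policy(x, i)                       # switch (or continue) per the rule
        dw = rng.gauss(0.0, math.sqrt(dt))
        x = x + mu[i] * dt + sigma[i] * dw
        if x <= 0.0:
            if reflect:
                x = -x                         # crude mirror reflection at 0
            else:
                x = 0.0
                path.append((t + dt, x, i))
                break                          # absorbed at the origin
        t += dt
        path.append((t, x, i))
    return path

# Two modes: mode 1 drifts up, mode 2 drifts down; switch to 2 above level 1.0.
mu = {1: 0.5, 2: -0.5}
sigma = {1: 1.0, 2: 1.0}
band = lambda x, i: 2 if x > 1.0 else 1
path = simulate(1.0, 1, band, mu, sigma)
```

A run produces a discretized path that stays in S = [0,∞) and uses only the two modes of the hypothetical rule.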
We now associate with each admissible strategy π a value function V_π which, given initial state x ∈ S and initial control mode i ∈ A, represents the expected total discounted cost generated by π, x, and i. For each pair of equivalence classes k and ℓ in {1,2,...,M}, define {Q_kℓ(t|x,i,π); t ≥ 0} to be the counting process corresponding to the number of switches made under strategy π from within class k to class ℓ during [0,t]. The value function for strategy π is then defined on S × A by

    V_π(x,i) = E[ ∫_0^∞ e^{-αt} g(X(t|x,i,π), π(t)) dt
                + Σ_{k=1}^M Σ_{ℓ=1}^M ∫_0^∞ C_kℓ e^{-αt} dQ*_kℓ(t|x,i,π) ],

where Q*_kℓ(·|x,i,π) is the counting measure associated with counting process Q_kℓ(·|x,i,π), and where by convention 0 · ∞ = 0.
Lemma 1. For an admissible strategy π and for each i ∈ A, |V_π(x,i) - hx/α| is bounded for each x ∈ S if and only if

    E[ ∫_0^∞ e^{-αt} dQ*_kℓ(t|x,i,π) ] < ∞ for each k,ℓ in {1,2,...,M} where C_kℓ > 0.

Proof. In the case of absorption,

    | E[ ∫_0^∞ e^{-αt} hX(t|x,i,π) dt ] - hx/α | ≤ |h|μ*/α²,

where μ* = max_i{|μ_i|}; and in the case of reflection the same quantity is bounded by a constant depending only on α, μ*, and σ*² = max_i{σ_i²}; see Sheng [16]. Therefore

    E[ ∫_0^∞ e^{-αt} g(X(t|x,i,π), π(t)) dt ] - hx/α

is bounded for each x and i, and |V_π(x,i) - hx/α| is likewise bounded if and only if

    E[ Σ_{k=1}^M Σ_{ℓ=1}^M ∫_0^∞ C_kℓ e^{-αt} dQ*_kℓ(t|x,i,π) ]

is. □

The optimal value function V*: S × A → ℝ is defined by V*(x,i) = inf_π V_π(x,i), where the infimum is taken over all admissible strategies. The admissible strategy π is called (x,i)-optimal if V_π(x,i) = V*(x,i). Given initial state x ∈ S and initial action i ∈ A, our control problem is to construct an admissible strategy that is (x,i)-optimal.

3. O P T I M A L BAND S T R A T E G I E S

Of special interest is a subset of admissible strategies which we call band strategies. Each band strategy is based on a band function which dictates changes in control mode as a function of the current state of the system and the current control.
Definition. A band function is a function f: S × A → A s.t.
(2.11) for each i ∈ A, f(x,i) has finitely many discontinuities in x ∈ S and f(0,i) = lim_{x↓0} f(x,i),
(2.12) for each (x,i) ∈ S×A, f(x,f(x,i)) = f(x,i),
(2.13) for each i ∈ A, the class continuation set Ī_i = {x ∈ S: f(x,i) ∈ e(i)} is an open subset of S, and
(2.14) if y is a closed boundary point of the action continuation set I_i = {x ∈ S: f(x,i) = i} for some i ∈ A, then there exists j ∈ e(i) where j = lim_{x→y, x∉I_i} f(x,i) and f(y,j) = i.
The state space can be divided into a finite number of intervals such that, for each i ∈ A, f(x,i) is constant across any particular interval. Condition (2.12) guarantees that it will not be possible to make more than one change in control mode at any one point in time. For each i ∈ A, the action continuation set defined in (2.14) denotes those states for which f will continue with control mode i, and the class continuation set defined in (2.13) denotes those states for which f will continue with an action from equivalence class e(i). Set Ī_i is the finite union of intervals open in S, and set I_i is the finite union of intervals open or closed in Ī_i.
For example, suppose that we are given three control modes with K_12 = K_21 = 0, K_13 = K_23 > 0, and K_31 = K_32 > 0. Then A_1 = {1,2}, A_2 = {3}, C_12 = K_13 = K_23, and C_21 = K_31 = K_32. In Figure 1 we illustrate the following band function:

    mode 2
    ── s_5 ──
    mode 1 → mode 2, else continue
    ── s_4 ──
    mode 3
    ── s_3 ──
    mode 2 → mode 1, else continue
    ── s_2 ──
    mode 1 → mode 2, else continue
    ── s_1 ──
    mode 1
    ── 0 ──

FIGURE 1. AN ILLUSTRATIVE BAND FUNCTION

    f(x,1) = 1 if x ∈ [0,s_1], 2 if x ∈ (s_1,s_2], 1 if x ∈ (s_2,s_3), 3 if x ∈ [s_3,s_4], 2 if x ∈ (s_4,∞),

    f(x,2) = 1 if x ∈ [0,s_1], 2 if x ∈ (s_1,s_2], 1 if x ∈ (s_2,s_3), 3 if x ∈ [s_3,s_4], 2 if x ∈ (s_4,∞), and

    f(x,3) = 1 if x ∈ [0,s_1], 3 if x ∈ (s_1,s_5), 2 if x ∈ [s_5,∞).

Note that I_1 = [0,s_1] ∪ (s_2,s_3), I_2 = (s_1,s_2] ∪ (s_4,∞), I_3 = (s_1,s_5), Ī_1 = Ī_2 = [0,s_3) ∪ (s_4,∞), Ī_3 = (s_1,s_5), and that (2.14) holds at s_1 and s_2.
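As a sanity check on the definitions, the band function of Figure 1 can be coded directly, with hypothetical numerical values for the critical numbers s_1 < ... < s_5, and property (2.12) verified on a grid. This snippet is illustrative only and not part of the paper.

```python
# Hypothetical critical numbers for Figure 1 (illustrative values only).
s1, s2, s3, s4, s5 = 1.0, 2.0, 3.0, 4.0, 5.0

def f(x, i):
    """The band function of Figure 1: f(x, i) is the mode to use in state x
    when the current mode is i."""
    if i in (1, 2):
        if x <= s1:
            return 1
        if x <= s2:
            return 2
        if x < s3:
            return 1
        if x <= s4:
            return 3
        return 2
    # i == 3
    if x <= s1:
        return 1
    if x < s5:
        return 3
    return 2

# Property (2.12): applying f twice changes nothing.
grid = [k * 0.01 for k in range(0, 700)]
assert all(f(x, f(x, i)) == f(x, i) for x in grid for i in (1, 2, 3))

# Action continuation set of mode 1 matches the text: f(x,1) = 1 on [0,s1] u (s2,s3).
assert f(0.5, 1) == 1 and f(2.5, 1) == 1 and f(1.5, 1) == 2
```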

We now introduce some more notation concerning band functions. For each i ∈ A, let 0 = s_0^i < s_1^i < ... < s_{n(i)}^i < s_{n(i)+1}^i = +∞ be the discontinuity points in S of f(x,i). Further, denote by b_p^i, for p = 0,1,2,...,n(i), the constant value of f(x,i) on (s_p^i, s_{p+1}^i).
The following theorem shows that, given an initial state and control mode, each band function corresponds to a unique admissible strategy.
Theorem 1. If f is a band function and (x,i) ∈ S×A, then there exists a unique admissible strategy π s.t.

(3.1)  π(t) = f(X(t|x,i,π), π(t)) a.s. for each t > 0.

Proof. Starting from initial state x and action i, we construct admissible strategy π and its associated controlled process X(·|x,i,π), that together satisfy (3.1), by sequentially doing so on intervals of the sort [τ_n, τ_{n+1}), where {τ_n; n = 0,1,...} is a sequence of increasing Brownian stopping times.
Let a_1 = f(x,i) and p ∈ {0,1,...,n(a_1)} be s.t. x ∈ [s_p^{a_1}, s_{p+1}^{a_1}) and b_p^{a_1} = a_1. Define {ξ_1(t); t ≥ 0} to be the unique reflected Brownian motion process beginning in state x with infinitesimal drift μ_{a_1} and infinitesimal variance σ_{a_1}². Let τ_1 be the first hitting time of {s_p^{a_1}, s_{p+1}^{a_1}} by ξ_1, and begin construction of the action process π and state process X by specifying that π(t) = a_1 and X(t) = ξ_1(t) on [0,τ_1).
Suppose that ξ_1(τ_1) = s_p^{a_1}. Now either i) s_p^{a_1} is a closed boundary point of I_{a_1}, ii) s_p^{a_1} is an open boundary point of I_{a_1} but not a boundary point of Ī_{a_1}, or iii) s_p^{a_1} is a boundary point of Ī_{a_1}.

Cases i) and ii) differ very slightly and will now be handled together. Let a_2 ∈ e(a_1) and q ∈ {0,1,2,...,n(a_2)} be s.t. b_q^{a_2} = a_2, s_{q+1}^{a_2} = s_p^{a_1}, f(s_{q+1}^{a_2}, a_2) = a_1 for i), and f(s_p^{a_1}, a_1) = a_2 for ii). A result of Nakao [12] guarantees the pathwise unique existence of the process {φ_2(t); t ≥ τ_1} s.t.

    φ_2(t) = s_p^{a_1} + ∫_{τ_1}^t μ(φ_2(u)) du + ∫_{τ_1}^t σ(φ_2(u)) dB(u) for t ≥ τ_1,

where

    μ(y) = μ_{a_2} if y < s_p^{a_1},   μ(y) = μ_{a_1} if y > s_p^{a_1},
    σ(y) = σ_{a_2} if y < s_p^{a_1},   σ(y) = σ_{a_1} if y > s_p^{a_1},

and μ(s_p^{a_1}) = μ_{a_1} and σ(s_p^{a_1}) = σ_{a_1} for i), while μ(s_p^{a_1}) = μ_{a_2} and σ(s_p^{a_1}) = σ_{a_2} for ii). This is to be distinguished from uniqueness in distribution; the importance lies in the fact that the solution process φ_2 can be constructed on our Brownian motion process, so that φ_2(t) is a unique measurable function of {B(u); τ_1 ≤ u ≤ t}; see Watanabe and Yamada [17]. Let {ξ_2(t); t ≥ τ_1} be the uniquely reflected φ_2 process, and define τ_2 to be the first hitting time of {s_q^{a_2}, s_{p+1}^{a_1}} by the process ξ_2. We further extend our definitions of π and X by

    π(t) = a_2 if ξ_2(t) < s_p^{a_1},
    π(t) = a_1 if ξ_2(t) > s_p^{a_1},

π(t) = a_1 if ξ_2(t) = s_p^{a_1} for i), and π(t) = a_2 if ξ_2(t) = s_p^{a_1} for ii), and X(t) = ξ_2(t) on [τ_1,τ_2).


For iii), let a_2 = f(s_p^{a_1}, a_1) and q ∈ {0,1,2,...,n(a_2)} be as in the second paragraph of this proof, thereby further defining π and X on [τ_1,τ_2), where τ_2 > τ_1 a.s. If ξ_2(τ_2) = s_q^{a_2} or s_{q+1}^{a_2}, we proceed as in the previous paragraph.
Similarly, we handle all of the possibilities where ξ_1(τ_1) = s_{p+1}^{a_1}. Processes π and X will then have been defined from time zero until stopping time τ_2, with (3.1) satisfied on (0,τ_2). Following the same procedures, we next define the process ξ_3, a stopping time τ_3 > τ_2 a.s., and π(t) = f(X(t), π(t)) on [τ_2,τ_3), where X(t) = ξ_3(t).
In this way we continue to build the action process π and state process X, and if the sequence of stopping times {τ_n; n ≥ 1} approaches infinity almost surely, then π(t) and X(t) are well-defined for all t > 0. Observing that there exists a minimal bandwidth

    min_{i∈A} min_{p∈{0,1,...,n(i)-1}} (s_{p+1}^i - s_p^i)

of positive width, and using the arguments of Appendix 4 in [15], we conclude that E[τ_{n+1} - τ_n] ≥ b > 0 for all n and some constant b. This implies that τ_n → ∞ a.s., and furthermore, that the function θ*: (0,∞) → {1,2,...,M} defined by θ*(t) = θ(π(t)) has (a.s.) finitely many discontinuities in each finite interval of time. Therefore, π is an admissible strategy, and the controlled process uniquely generated by π, x, and i, {X(t|x,i,π); t ≥ 0}, is precisely X(t|x,i,π) = X(t) if λ = 1 and X(t|x,i,π) = X(t∧T) if λ = 0, where T = inf{t ≥ 0: X(t) = 0}. □
We define the band strategies generated by a band function f (one strategy for each initial state and action pair) as the unique solutions constructed in Theorem 1. In order to characterize the value function corresponding to a band strategy, we will need the following extension of Itô's lemma, which is proved in [16].
Lemma 2. Let X and Y be a pair of processes s.t.

(3.2)  X(t) = X(0) + ∫_0^t σ(u) dB(u) + Y(t) for each t ≥ 0,

where X(0) is F_0-measurable, σ is a nonzero, bounded and non-anticipating process, and Y is a continuous, non-anticipating process of bounded variation. Let f: [0,∞) × ℝ → ℝ satisfy

(3.3)  f(t,·) ∈ Ĉ²(ℝ) for each t ≥ 0, and

(3.4)  f(·,x) is continuously differentiable on [0,∞) for each x ∈ ℝ.

Then the process {f(t,X(t)); t ≥ 0} satisfies

(3.5)  f(t,X(t)) = f(0,X(0)) + ∫_0^t [f_t(u,X(u)) + ½σ²(u) f_xx(u,X(u))] du
       + ∫_0^t f_x(u,X(u)) σ(u) dB(u) + ∫_0^t f_x(u,X(u)) dY(u) for each t ≥ 0.

Remarks. (1) We adopt the notation Ĉ²(ℝ) to denote the set of continuous functions f: ℝ → ℝ such that, in each finite interval of ℝ, f' and f'' exist at all but a finite number of points, and such that f''(x-) and f''(x+) exist for each x ∈ ℝ.
(2) The proof of Lemma 2 relies heavily on a similar result, Theorem 2.2 in Kunita and Watanabe [9], the difference being that our class of functions includes those that misbehave at a finite number of points in each finite spatial interval.
We now show the value function of a band strategy to be the solution to a standard differential equation.
Theorem 2. For each band function f and state-action pair (x,i) ∈ S×A, let π_{f,x,i} denote the admissible strategy uniquely corresponding to f, x, and i. The function V_f: S×A → ℝ defined as V_f(x,i) = V_{π_{f,x,i}}(x,i) is the unique function V: S×A → ℝ satisfying the following for each i ∈ A:

(3.6)  V(·,i) ∈ Ĉ²(S),

(3.7)  |V(x,i) - hx/α| is bounded for all x ∈ S,

(3.8)  D_i V(x,i) - αV(x,i) + g(x,i) = 0 for all x in the interior of I_i, where D_i denotes the differential operator D_i = μ_i ∂/∂x + ½σ_i² ∂²/∂x²,

(3.9)  V(x,i) = K_{i,f(x,i)} + V(x,f(x,i)) for all x ∉ I_i, and

(3.10) λV'(0,i) - (1-λ)V(0,i) + (1-λ)R = 0.

Proof. We begin by verifying that V_f satisfies (3.6)-(3.10). By Lemma 1, condition (3.7) holds if

(3.7')  E[ ∫_0^∞ e^{-αt} dQ*_kℓ(t|x,i,π_{f,x,i}) ] < ∞

for each (x,i) ∈ S×A and distinct k,ℓ ∈ {1,2,...,M}. Fix (x,i) ∈ S×A and define {Y_n; n ≥ 0} by Y_0 = 0 and Y_n = τ_n - τ_{n-1} for n ≥ 1, where the sequence of stopping times {τ_n; n ≥ 0} associated with f, x, i is defined as in the proof of Theorem 1. We have already established that the Y_n's are independent and that E[Y_n] ≥ b > 0 for each n and some constant b. Define {S_n; n ≥ 0} by S_n = Σ_{j=0}^n Y_j for n ≥ 0, and {N(t); t ≥ 0} by N(t) = sup{n: S_n ≤ t} for t ≥ 0. Since

    S_{N(t)+1} = Σ_{k=0}^∞ Σ_{n=0}^k Y_n 1{N(t)+1=k} = Σ_{n=0}^∞ Y_n 1{N(t)+1≥n},

we have

    E[S_{N(t)+1}] = Σ_{n=0}^∞ E[Y_n 1{N(t)+1≥n}] ≥ bE[N(t) + 1].

Since S_{N(t)+1} ≤ t + Y_{N(t)+1}, we get bE[N(t)] ≤ t - b + E[Y_{N(t)+1}]. Hence lim sup_{t→∞} E[N(t)]/t ≤ 1/b. Now fix distinct k,ℓ ∈ {1,2,...,M}.

Since every switch from an action in A_k to an action in A_ℓ occurs only at some time point τ_n, we have Q_kℓ(t|x,i,π_{f,x,i}) ≤ N(t), and therefore (3.7') holds.
Fix i ∈ A and consider x ∈ I_i. Let p be s.t. x ∈ [s_p^i, s_{p+1}^i], and let {Z_i(t); t ≥ 0} be the Brownian motion starting in state x with drift parameter μ_i, variance parameter σ_i², and absorption at s_p^i and s_{p+1}^i. Assign to process Z_i linear holding costs at rate h, operational costs at rate r_i, and absorption costs K_{i,f(s_p^i,i)} + V_f(s_p^i, f(s_p^i,i)) and K_{i,f(s_{p+1}^i,i)} + V_f(s_{p+1}^i, f(s_{p+1}^i,i)) at points s_p^i and s_{p+1}^i, respectively. Let V_i(x) denote the conditional expectation of the total discounted cost generated by Z_i, and define F_i: [0,∞) × [s_p^i, s_{p+1}^i] → ℝ by F_i(t,x) = e^{-αt}V_i(x). Since

    V_i(x) = E[ ∫_0^T e^{-αt}(hZ_i(t) + r_i) dt + e^{-αT}V_i(Z_i(T)) ],

where T = inf{t ≥ 0: Z_i(t) ∉ (s_p^i, s_{p+1}^i)}, we apply Lemma 2 and get

    e^{-αT}V_i(Z_i(T)) = V_i(x)
        + ∫_0^T e^{-αt}[-αV_i(Z_i(t)) + μ_i V_i'(Z_i(t)) + ½σ_i² V_i''(Z_i(t))] dt
        + ∫_0^T e^{-αt} V_i'(Z_i(t)) σ_i dB(t).

Taking expectations,

    E[e^{-αT}V_i(Z_i(T))] = V_i(x) + E[ ∫_0^T e^{-αt}[D_iV_i(Z_i(t)) - αV_i(Z_i(t))] dt ].

This implies that

    E[ ∫_0^T e^{-αt}[αV_i(Z_i(t)) - D_iV_i(Z_i(t))] dt ] = E[ ∫_0^T e^{-αt}[hZ_i(t) + r_i] dt ],

and since V_i(x) = V_f(x,i) on [s_p^i, s_{p+1}^i], function V_f satisfies (3.8).
Clearly, if x ∉ I_i, then V_f(x,i) = K_{i,f(x,i)} + V_f(x,f(x,i)); hence (3.9) also holds. For state zero and the case of absorption, V_f(0,i) = R for each i ∈ A, thereby validating (3.10) when λ = 0. In the case of reflection, if 0 ∈ I_i then V_f'(0,i) = 0 by (3.8), and if 0 ∉ I_i then 0 ∈ I_{f(0,i)} and V_f'(0,i) = V_f'(0,f(0,i)) = 0, thereby validating (3.10) when λ = 1.

For every (x,i) ∈ S×A, we have x ∈ I_{f(x,i)}. If x is an interior point of I_{f(x,i)}, then as seen above V_f'(x,f(x,i)) and V_f''(x,f(x,i)) exist, where V_f' and V_f'' denote the first and second partial spatial derivatives. Suppose now that x is a closed boundary point of I_{f(x,i)}. Let f(x,i) = a and p be s.t. x ∈ [s_p^a, s_{p+1}^a]. Assume that x = s_p^a. (The case x = s_{p+1}^a is similar.) Then there exist j ∈ e(a) and q s.t. s_{q+1}^j = s_p^a, b_q^j = j, and f(s_{q+1}^j, j) = a. As in the proof of Theorem 1, let {φ(t); t ≥ 0} be the unique process defined for each y ∈ (s_q^j, s_{p+1}^a) by

    φ(t) = y + ∫_0^t μ(φ(u)) du + ∫_0^t σ(φ(u)) dB(u),

where

    μ(y) = μ_j if y < s_{q+1}^j,   μ(y) = μ_a if y ≥ s_{q+1}^j,

and

    σ(y) = σ_j if y < s_{q+1}^j,   σ(y) = σ_a if y ≥ s_{q+1}^j.

Impose on process φ the linear holding costs, operational costs, and switching costs associated with process X(·|x,i,π_{f,x,i}), and absorption upon hitting s_q^j or s_{p+1}^a at costs K_{j,f(s_q^j,j)} + V_f(s_q^j, f(s_q^j,j)) and K_{a,f(s_{p+1}^a,a)} + V_f(s_{p+1}^a, f(s_{p+1}^a,a)), respectively. Let V_aj(y) denote the conditional expectation of the total discounted cost generated by φ in this setting, and observe that V_aj(y) is continuously differentiable on (s_q^j, s_{p+1}^a), with a second derivative except possibly at the point s_p^a.
Thus, if x is not a boundary point of I_i, then V_f'(x,i) and V_f''(x,i) exist. If x is a closed boundary point of I_i, then V_f'(x,i) exists; and if x is an open boundary point of I_i but not a boundary point of Ī_i, then again V_f'(x,i) exists. Finally, if x is a boundary point of Ī_i, then V_f''(x,i) does not necessarily exist. So (3.6) holds, and V_f is indeed a solution to (3.6)-(3.10).
Suppose now that the function V: S×A → ℝ also satisfies (3.6)-(3.10). Then, letting Δ(x,i) = V(x,i) - V_f(x,i) for each (x,i) ∈ S×A, we would find that the function Δ satisfies these conditions for each i ∈ A:

(3.11) |Δ(x,i)| is bounded in x ∈ S,

(3.12) D_iΔ(x,i) - αΔ(x,i) = 0 for all x ∈ I_i,

(3.13) Δ(x,i) = Δ(x,f(x,i)) for all x ∉ I_i, and

(3.14) λΔ'(0,i) - (1-λ)Δ(0,i) = 0.

The second-order differential equation (3.12) implies that

    Δ(x,i) = γ_1 e^{β_1 x} + γ_2 e^{β_2 x} for each x ∈ I_i,

where β_1 is the positive root and β_2 the negative root of the quadratic equation μ_i β + ½σ_i²β² - α = 0. But by (3.11), (3.13), and (3.14), it must be that γ_1 = γ_2 = 0. Hence Δ(x,i) = 0 for each (x,i) ∈ S×A, and V_f is the unique solution to (3.6)-(3.10). □
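The characterization (3.6)-(3.10) can be checked numerically in the simplest setting: a single control mode, reflection at the origin, h = 1, using the closed form that reappears in Section 4. The sketch below is illustrative only (the parameter values are our own assumptions); it verifies by finite differences that the candidate value function satisfies (3.8) and the reflecting boundary condition of (3.10).

```python
import math

# Illustrative parameters for a single control mode (not from the paper).
mu, sig, alpha, r = -0.3, 1.0, 0.1, 0.2

# beta is the positive root of (sig^2/2) b^2 - mu b - alpha = 0, so that
# exp(-beta x) is the bounded homogeneous solution of (3.8).
beta = (mu + math.sqrt(mu * mu + 2.0 * alpha * sig * sig)) / (sig * sig)

def V(x):
    # Candidate value of "always use this mode" under reflection with h = 1;
    # the same closed form reappears in Section 4.
    return x / alpha + mu / alpha**2 + r / alpha + math.exp(-beta * x) / (alpha * beta)

h = 1e-4
def residual(x):
    # Left side of (3.8) by central differences: D_i V - alpha V + g, g(x) = x + r.
    V1 = (V(x + h) - V(x - h)) / (2.0 * h)
    V2 = (V(x + h) - 2.0 * V(x) + V(x - h)) / (h * h)
    return mu * V1 + 0.5 * sig * sig * V2 - alpha * V(x) + x + r

assert all(abs(residual(x)) < 1e-4 for x in (0.5, 1.0, 3.0))   # (3.8) holds
assert abs((V(h) - V(0.0)) / h) < 1e-3                          # (3.10): V'(0) = 0
```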
A band function f, then, generates an (x,i)-optimal strategy if V_f(x,i) = V*(x,i). We call band function f (everywhere) optimal if, for each (x,i) ∈ S×A, its corresponding band strategy is (x,i)-optimal. After proving a verification lemma, we will derive necessary and sufficient conditions for a band function to be everywhere optimal.
Lemma 3. Suppose that for each i ∈ A, V: S × A → ℝ satisfies (3.6), (3.7), (3.10),

(3.15) V(x,i) ≤ K_ij + V(x,j) for each (x,j) ∈ S×A, and

(3.16) D_jV(x,i) - αV(x,i) + g(x,j) ≥ 0 for each (x,j) ∈ S×e(i), where we further define D_j for those x without a second partial spatial derivative by

    D_jV(x,i) = μ_j V'(x,i) + ½σ_j² [V''(x-,i) + V''(x+,i)]/2.

Then V(x,i) ≤ V*(x,i) for each (x,i) ∈ S×A.


Proof. First, since V(x,j) = V(x,i) for each (x,i) ∈ S×A and j ∈ e(i), let V̄(x,k) be the common value of V(x,j) for j ∈ A_k. Conditions (3.6), (3.7), (3.10), (3.15), and (3.16) can be restated in terms of V̄(x,k) for (x,k) ∈ S × {1,2,...,M} and class switching costs C_kℓ for k,ℓ ∈ {1,2,...,M}.
Fix (x,i) ∈ S × A and let π be an arbitrary admissible strategy. We want to show that V_π(x,i) ≥ V(x,i). Let 0 = T_0 < T_1 < T_2 < ... denote the discontinuity points of the function θ*: [0,∞) → {1,2,...,M}, where θ*(t) = θ(π(t)) for each t > 0. Then, using (3.16),

(3.17)  V_π(x,i) = E[ ∫_0^∞ e^{-αt} g(X(t|x,i,π), π(t)) dt
                  + Σ_{n=0}^∞ e^{-αT_{n+1}} C_{θ*(T_{n+1}-), θ*(T_{n+1}+)} ]

              ≥ E[ Σ_{n=0}^∞ ∫_{T_n}^{T_{n+1}} e^{-αt} [αV̄(X(t|x,i,π), θ*(t))
                  - D_{π(t)}V̄(X(t|x,i,π), θ*(t))] dt

                  + Σ_{n=0}^∞ e^{-αT_{n+1}} C_{θ*(T_{n+1}-), θ*(T_{n+1}+)} ].

Fix n and define the function F_n on (T_n, T_{n+1}) × S by F_n(t,x) = e^{-αt}V̄(x, θ*(t)). Applying Lemma 2, we get

(3.18)  e^{-αt}V̄(X(t|x,i,π), θ*(t)) = e^{-αT_n}V̄(X(T_n|x,i,π), θ*(T_n+))
        + ∫_{T_n}^t e^{-αu}[D_{π(u)}V̄ - αV̄](X(u|x,i,π), θ*(u)) du
        + ∫_{T_n}^t e^{-αu} V̄'(X(u|x,i,π), θ*(u)) σ_{π(u)} dB(u)
        + ∫_{T_n}^t e^{-αu} V̄'(X(u|x,i,π), θ*(u)) dY(u)

for each t ∈ (T_n, T_{n+1}), where X and Y uniquely satisfy

    X(t|x,i,π) = X(T_n|x,i,π) + ∫_{T_n}^t μ_{π(u)} du + ∫_{T_n}^t σ_{π(u)} dB(u) + Y(t)

and (2.10). The last integral in (3.18) has value zero, since if λ = 0 then Y ≡ 0, and if λ = 1 then Y grows only when X(·|x,i,π) = 0 and V̄'(0, θ*(u)) = 0 for each u ∈ (T_n, T_{n+1}). Taking expectations in (3.18) and substituting into (3.17), we have

    V_π(x,i) ≥ E[ V̄(x,θ(i)) + Σ_{n=1}^∞ ( e^{-αT_n}V̄(X(T_n|x,i,π), θ*(T_n+))
              - e^{-αT_n}V̄(X(T_n|x,i,π), θ*(T_n-)) + e^{-αT_n}C_{θ*(T_n-),θ*(T_n+)} ) ]

            = V̄(x,θ(i)) + E[ Σ_{n=1}^∞ e^{-αT_n} ( C_{θ*(T_n-),θ*(T_n+)}
              - V̄(X(T_n|x,i,π), θ*(T_n-)) + V̄(X(T_n|x,i,π), θ*(T_n+)) ) ]

            ≥ V̄(x,θ(i)),

the last inequality following from the class restatement of (3.15). Since V̄(x,θ(i)) = V(x,i), we have as desired V_π(x,i) ≥ V(x,i), and therefore V*(x,i) ≥ V(x,i). □
We now present our necessary and sufficient conditions for a given band function to be optimal.
Theorem 3. A band function f is optimal if and only if its value function V_f satisfies the following for each i ∈ A:

(3.19) V_f(x,i) = min_{j∈A} {K_ij + V_f(x,j)} for each x ∈ S, and

(3.20) min_{j∈L(x,i)} {D_jV_f(x,j) - αV_f(x,j) + g(x,j)} = 0 for each x ∈ S,

where L(x,i) = {j ∈ A: V_f(x,i) = K_ij + V_f(x,j)} and D_j is as in (3.16).
Proof. Suppose that V_f satisfies (3.19) and (3.20). By Theorem 2, V_f also satisfies (3.6), (3.7), and (3.10). Since e(i) ⊆ L(x,i) for each (x,i) ∈ S×A, (3.20) implies (3.16), and (3.19) implies (3.15). Therefore, by Lemma 3, V_f(x,i) = V*(x,i) everywhere.
Now suppose that band function f is optimal. If (3.19) fails, then there exist x ∈ S and i,j ∈ A s.t. V_f(x,i) > K_ij + V_f(x,j). By the continuity of V_f(·,i) and the optimality of f, there also exists ε > 0 s.t. V_f(y,i) > K_ij + V_f(y,j) for each y ∈ [x-ε, x+ε], and s.t.

    K_ij + E[ ∫_0^T e^{-αt}(hφ(t) + r_j) dt ] + E[ e^{-αT}(V_f(φ(T),i) - K_ij) ] < V_f(x,i),

where {φ(t); t ≥ 0} is the process φ(t) = x + μ_j t + σ_j B(t) and T = inf{t ≥ 0: φ(t) ∉ [x-ε, x+ε]}.
Now define admissible strategy π by π(t) = j for t ∈ (0,T], and π(t) = f(X(t|x,i,π), π(t)) for t > T, where X(·|x,i,π) is the controlled process generated by π, x, and i. Hence

    V_π(x,i) = K_ij + E[ ∫_0^T e^{-αt}(hφ(t) + r_j) dt ] + E[ e^{-αT} V_f(φ(T),j) ]
             ≤ K_ij + E[ ∫_0^T e^{-αt}(hφ(t) + r_j) dt ] + E[ e^{-αT}(V_f(φ(T),i) - K_ij) ]
             < V_f(x,i).

This contradicts the optimality of f; so f optimal implies (3.19).
Now suppose that f is optimal and (3.20) fails. Then there exist x ∈ S\{0}, i,j ∈ A, and ε > 0 s.t. V_f(x,i) = K_ij + V_f(x,j) and D_jV_f(y,j) - αV_f(y,j) + g(y,j) < 0 for each y ∈ [x-ε, x+ε]. Defining the process φ, the stopping time T, and the admissible strategy π exactly as above, we have

    V_π(x,i) = K_ij + E[ ∫_0^T e^{-αt} g(φ(t),j) dt ] + E[ e^{-αT} V_f(φ(T),j) ].

Lemma 2 further implies that

    E[ e^{-αT} V_f(φ(T),j) ] = V_f(x,j) + E[ ∫_0^T e^{-αt}[D_jV_f(φ(t),j) - αV_f(φ(t),j)] dt ],

and hence V_π(x,i) < K_ij + V_f(x,j) = V_f(x,i). Again we have contradicted the optimality of f, thereby proving that (3.20) is also necessary for f to be optimal. □
We can summarize the optimality conditions (3.19) and (3.20) by demanding that the optimal value function V_f satisfy the following single condition for each (x,i) ∈ S×A:

(3.21)  min_{j∈A} { [K_ij + V_f(x,j) - V_f(x,i)] + t[D_jV_f(x,j) - αV_f(x,j) + g(x,j)] } = 0
        for all small enough t > 0.

Condition (3.21) is a lexicographic minimum condition, since it requires first that (3.19) hold and second that (3.20) hold. Note that (3.21) is the appropriate Bellman-Hamilton-Jacobi equation for our control problem.
We conjecture that there always exists an optimal band function and, moreover, that it has a special "finite-critical-number" form. Hereafter, we label the control modes so that μ_1 ≥ μ_2 ≥ ... ≥ μ_N.
Conjecture. There exists an optimal band function f* s.t.
(3.21) for each i ∈ A, the class continuation set Ī_i is an open interval of S,
(3.22) if λ = 1 and h > 0, then for each i ∈ A, f*(x,i) is increasing in x on Ī_i,
(3.23) if λ = 1 and h < 0, then for each i ∈ A, f*(x,i) is decreasing in x on Ī_i,
(3.24) if λ = 0, or λ = 1 and h = 0, then for each i ∈ A, r_{f*(x,i)} is decreasing in x on Ī_i, and
(3.25) for each i ∈ A, V_{f*}'(·,i) is continuous on S.

Remarks. (1) Our cost structure can be simplified to one where each of the operational costs r_i is nonnegative. This is accomplished by redefining the cost function g: S×A → ℝ as

    g(x,i) = { hx + r̃_i    if x > 0,
             { (1-λ)αR̃    if x = 0,

where r̃_i = r_i - r* for i = 1,2,...,N, R̃ = R - r*/α, and r* = min_i {r_i}.
(2) In the case of absorption, the cost structure can be simplified further so that there are zero holding costs as well. To see this, observe that for any admissible strategy π and (x,i) ∈ S×A,

    V_π(x,i) = E[ ∫_0^T e^{-αt}[hX(t|x,i,π) + r_{π(t)}] dt + Re^{-αT}
             + Σ_{k=1}^M Σ_{ℓ=1}^M ∫_0^T C_kℓ e^{-αt} dQ*_kℓ(t|x,i,π) ],

where T is the time of absorption; and if we change the order of integration,

    V_π(x,i) = hx/α + E[ ∫_0^∞ e^{-αt}[ hμ̂(t)/α + r̂(t) ] dt + Re^{-αT}
             + Σ_{k=1}^M Σ_{ℓ=1}^M ∫_0^T C_kℓ e^{-αt} dQ*_kℓ(t|x,i,π) ],

where

    {μ̂(t), r̂(t)} = {μ_{π(t)}, r_{π(t)}} if t ≤ T, and {μ̂(t), r̂(t)} = {0,0} if t > T.

Therefore we have an equivalent control problem if we define g̃: S×A → ℝ as

    g̃(x,i) = { r̃_i    if x > 0,
             { αR̃    if x = 0,

where r̃_i = hμ_i/α + r_i - r̃* for i = 1,2,...,N, R̃ = R - r̃*/α, and r̃* = min_i { hμ_i/α + r_i }.

(3) Together, conditions (3.21), (3.22), and (3.23) imply that for the optimal band function, the class continuation set Ī_i is at most one open interval in S and the action continuation set I_i is at most one interval contained in Ī_i, for each i ∈ A. Moreover, within a class continuation set, as the state of the controlled process increases it is optimal to use a control mode with faster drift downwards in the case of positive holding costs (h > 0), and a control mode with faster drift upwards in the case of negative holding costs.
(4) As a consequence of remark (3) and the properties of band strategies, the total number of discontinuities in x ∈ S for the optimal band function is at most M(M-1) + Σ_{k=1}^M (N_k - 1) = M² - 2M + N, where N_k is the number of actions in equivalence class A_k. The M(M-1) term accounts for possible switching between action equivalence classes. (There are M(M-1)/2 pairs of classes, and for each pair A_k and A_ℓ two switching numbers are involved: at one, switching out of A_k into A_ℓ occurs, and at the other, switching out of A_ℓ into A_k occurs.) The (N_k - 1) terms account for possible switching between actions within class A_k. (To partition the A_k-class continuation interval into its possible action continuation intervals, N_k - 1 switching numbers are involved.)
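The bound of remark (4) can be evaluated for the running example; a small illustrative computation (not from the paper) is:

```python
# N = 3 actions partitioned into classes A1 = {1,2} and A2 = {3}, so M = 2.
N, classes = 3, [{1, 2}, {3}]
M = len(classes)

# M(M-1) cross-class switching numbers plus (N_k - 1) within each class.
bound = M * (M - 1) + sum(len(Ak) - 1 for Ak in classes)
assert bound == M * M - 2 * M + N == 3
```

For the example, the optimal band function therefore has at most three critical numbers, matching the "three-critical-number" strategy discussed below.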
For example, Figure 1 depicts an admissible band function for data N = 3, K_12 = K_21 = 0, K_13 = K_23 > 0, and K_31 = K_32 > 0. Suppose further that λ = 1 and h > 0. The optimal band function might be as follows:

    f*(x,1) = 1 if x ∈ [0,s_1], 2 if x ∈ (s_1,s_3), 3 if x ∈ [s_3,∞),

    f*(x,2) = 1 if x ∈ [0,s_1], 2 if x ∈ (s_1,s_3), 3 if x ∈ [s_3,∞), and

    f*(x,3) = 1 if x ∈ [0,s_1], 2 if x ∈ (s_1,s_2], 3 if x ∈ (s_2,∞).

    mode 3
    ── s_3 ──
    mode 1 → mode 2, else continue
    ── s_2 ──
    mode 2
    ── s_1 ──
    mode 1
    ── 0 ──

FIGURE 2. AN ILLUSTRATIVE OPTIMAL BAND FUNCTION

In Figure 2 we illustrate f*. Note that f* is a "three-critical-number" strategy.

4. EXPLICIT SOLUTIONS

The above conjecture has previously been proven when there are two available control modes; see Sheng [15] for the solutions to absorbing barrier problems, and Sheng [16] for the solutions to reflecting barrier problems. It has not yet been proven in all its generality for large N. As an illustration, we explicitly produce here optimal band strategies for some specific cases.
(1) Consider our control problem for two available control modes, reflection at the boundary, and zero switching costs. We use the following notation:

    β = (μ_1 + √(μ_1² + 2ασ_1²)) / σ_1²,   ρ = (μ_2 + √(μ_2² + 2ασ_2²)) / σ_2²,
    ν = (-μ_1 + √(μ_1² + 2ασ_1²)) / σ_1²,

    A_1 = (σ_2²/2)β² - μ_2β - α,   and   A_2 = (σ_1²/2)ρ² - μ_1ρ - α.

Let f_1 denote the single band function of always using control mode 1; f_2, the single band function of always using control mode 2; and f_z, the two-band function of always using mode 2 whenever the state of the system is above z and always using mode 1 whenever the state is below z. The solutions are indicated in the two tables below.

    σ_1² ≥ σ_2²:                                       use strategy f_2
    σ_1² < σ_2²:  A_2 ≥ (μ_2-μ_1)ρ - αρr_1             use strategy f_2
                  A_2 < (μ_2-μ_1)ρ - αρr_1             use strategy f_z

TABLE 1. OPTIMAL BAND STRATEGIES WHEN h = 1, r_1 > 0, AND r_2 = 0

    σ_1² ≥ σ_2²:  r_2 < (μ_1-μ_2)/α                    use strategy f_2
                  r_2 > (μ_1-μ_2)/α:
                      A_1 ≥ (μ_1-μ_2)β - αβr_2         use strategy f_1
                      A_1 < (μ_1-μ_2)β - αβr_2         use strategy f_z
    σ_1² < σ_2²:  r_2 < (μ_1-μ_2)/α:
                      A_2 < (μ_2-μ_1)ρ + αρr_2         use strategy f_z
                      A_2 ≥ (μ_2-μ_1)ρ + αρr_2         use strategy f_2
                  r_2 > (μ_1-μ_2)/α                    use strategy f_1

TABLE 2. OPTIMAL BAND STRATEGIES WHEN h = 1, r_1 = 0, AND r_2 > 0

In each of the two tables, the single critical number z characterizing the optimal band function f_z is the unique positive solution to a transcendental equation corresponding to the condition that the optimal reward function has a continuous second derivative everywhere.

(2) Now consider the special case of N available control modes, reflection at the boundary, positive holding costs at rate h = 1, general switching costs, and such that μ_1 ≥ μ_2 ≥ ... ≥ μ_N, σ_1² ≥ σ_2² ≥ ... ≥ σ_N², and r_1 ≥ r_2 ≥ ... ≥ r_N. Let f_N denote the single band function of always using control mode N. The corresponding value function V_{f_N} on S×A is

    V_{f_N}(x,i) = K_iN + x/α + μ_N/α² + r_N/α + e^{-β_N x}/(αβ_N),

where

    β_N = (μ_N + √(μ_N² + 2ασ_N²)) / σ_N².

Checking the optimality conditions (3.19) and (3.20), we find that f_N is optimal if and only if K_iN ≤ (μ_i - μ_N)/α² + (r_i - r_N)/α for each i ≠ N.
(3) Finally, consider the case of two available control modes, reflection at the boundary, positive holding costs at rate h = 1, and symmetric switching costs K_12 = K_21 = K > 0. For the single band function f_1 we have

    V_{f_1}(x,1) = x/α + μ_1/α² + r_1/α + e^{-βx}/(αβ) for all x ≥ 0, and

    V_{f_1}(x,2) = K + x/α + μ_1/α² + r_1/α + e^{-βx}/(αβ) for all x ≥ 0.

By Theorem 3, then, f_1 is optimal if and only if

(4.1)  (μ_2-μ_1)/α + (r_2-r_1) + (A_1/(αβ)) e^{-βx} ≥ αK for all x ≥ 0.

Similarly, for the single band function f_2:

    V_{f_2}(x,1) = K + x/α + μ_2/α² + r_2/α + e^{-ρx}/(αρ) for all x ≥ 0,

    V_{f_2}(x,2) = x/α + μ_2/α² + r_2/α + e^{-ρx}/(αρ) for all x ≥ 0,

and f_2 is optimal if and only if

(4.2)  (μ_1-μ_2)/α + (r_1-r_2) + (A_2/(αρ)) e^{-ρx} ≥ αK for all x ≥ 0.
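Since the left side of (4.1) is a constant plus a monotone exponential term, checking it at x = 0 and in the limit x → ∞ suffices. A small sketch of this check, with illustrative parameters and a hypothetical helper `f1_optimal` of our own (not from the paper), is:

```python
import math

def root(mu, sig, alpha):
    # Positive root of (sig^2/2) b^2 - mu b - alpha = 0.
    return (mu + math.sqrt(mu * mu + 2.0 * alpha * sig * sig)) / (sig * sig)

def f1_optimal(mu1, mu2, sig1, sig2, r1, r2, alpha, K):
    """Test condition (4.1): 'always use mode 1' is optimal."""
    beta = root(mu1, sig1, alpha)
    A1 = 0.5 * sig2**2 * beta**2 - mu2 * beta - alpha
    lhs0 = (mu2 - mu1) / alpha + (r2 - r1) + A1 / (alpha * beta)  # value at x = 0
    lhs_inf = (mu2 - mu1) / alpha + (r2 - r1)                      # limit x -> infinity
    # The exponential term is monotone in x, so these two checks cover all x >= 0.
    return lhs0 >= alpha * K and lhs_inf >= alpha * K

# Mode 2 is expensive to run; for a small symmetric switching cost K,
# never leaving mode 1 passes the test (illustrative parameters).
assert f1_optimal(mu1=0.0, mu2=-0.1, sig1=1.0, sig2=1.0,
                  r1=0.0, r2=5.0, alpha=0.5, K=0.1)
```

Condition (4.2) can be checked the same way with the roles of the two modes exchanged.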
Suppose instead we use the strategy of never switching control modes. Let f_c denote the corresponding band function; that is, f_c(x,i) = i for each (x,i). Then

    V_{f_c}(x,1) = x/α + μ_1/α² + r_1/α + e^{-βx}/(αβ) for all x ≥ 0,

    V_{f_c}(x,2) = x/α + μ_2/α² + r_2/α + e^{-ρx}/(αρ) for all x ≥ 0,

and optimality conditions (3.19) and (3.20) reduce to

(4.3)  -αK ≤ (μ_1-μ_2)/α + r_1 - r_2 ≤ αK, and

       -αK ≤ (μ_1-μ_2)/α + r_1 - r_2 + 1/β - 1/ρ ≤ αK.

The parameter combinations corresponding to (4.1), (4.2), and (4.3), however, do not exhaust the possible range of diffusion and cost parameters. For the remaining parameter combinations, we conjecture that the optimal band function has the following two-critical-number form:

    f*(x,1) = 1 if x ∈ [0,Z), 2 if x ∈ [Z,∞), and

    f*(x,2) = 1 if x ∈ [0,z), 2 if x ∈ [z,∞),

where 0 ≤ z ≤ Z ≤ +∞. The optimal pair of switching numbers, z* and Z*, is chosen so as to satisfy the optimality conditions of Theorem 3. The optimality conditions for f* are

    -K ≤ V_{f*}(x,2) - V_{f*}(x,1) ≤ K for x ∈ [z,Z],

    D_1V_{f*}(x,2) - αV_{f*}(x,2) + x + r_1 ≥ αK for x ∈ [Z,∞), and

    D_2V_{f*}(x,1) - αV_{f*}(x,1) + x + r_2 ≥ αK for x ∈ [0,z].

Therefore our two-mode symmetric-switching-cost problem is completely solved if there exist positive z* and Z* (z* < Z*) such that, for the appropriate parameter restrictions (i.e., none of (4.1) through (4.3) holding), the following are true:

    V_{f*}(x,1) - V_{f*}(x,2) is increasing in x ∈ [z*,Z*],

    V_{f*}(z*,1) - V_{f*}(z*,2) = V_{f*}(Z*,2) - V_{f*}(Z*,1) = -K,

    ½(σ_1² - σ_2²)V''_{f*}(x,1) + (μ_1-μ_2)V'_{f*}(x,1) is increasing in x ∈ [0,z*],

    ½(σ_1² - σ_2²)V''_{f*}(z*,1) + (μ_1-μ_2)V'_{f*}(z*,1) = r_2 - r_1 - αK,

    ½(σ_1² - σ_2²)V''_{f*}(x,2) + (μ_1-μ_2)V'_{f*}(x,2) is increasing in x ∈ [Z*,∞), and

    ½(σ_1² - σ_2²)V''_{f*}(Z*,2) + (μ_1-μ_2)V'_{f*}(Z*,2) = r_2 - r_1 + αK.

REFERENCES

[1] Arkin, V. I., Kolemaev, V. A. and Shiryaev, A. N. (1964) On Finding Optimal Controls. Trudy Steklov Math. Institute LXXI, 21-25.

[2] Beneš, V. (1971) Existence of Optimal Stochastic Control Laws. SIAM J. Control 9, No. 3, 446-472.

[3] Davis, M. H. A. and Varaiya, P. (1973) Dynamic Programming Conditions for Partially Observable Stochastic Systems. SIAM J. Control 11, No. 2, 226-261.

[4] Doshi, B. T. (1978) Two Mode Control of Brownian Motion with Quadratic Loss and Switching Costs. Stoch. Proc. Appl. 6, 277-289.

[5] Doshi, B. T. (1979) Optimal Control of a Diffusion Process with Reflecting Boundaries and Both Continuous and Lump Costs. Dynamic Programming and Its Applications, ed. M. Puterman, Academic Press, New York, 269-258.

[6] Fleming, W. (1969) Optimal Continuous-Parameter Stochastic Control. SIAM Rev. 11, No. 4, 470-509.

[7] Foschini, G. J. (1982) Equilibria for Diffusion Models of Pairs of Communicating Computers - Symmetric Case. IEEE Trans. Infor. Th. 28, No. 2, 273-284.

[8] Iglehart, D. L. (1969) Diffusion Approximations in Collective Risk Theory. J. Appl. Prob. 6, 285-292.

[9] Kunita, H. and Watanabe, S. (1967) On Square Integrable Martingales. Nagoya Math. J. 30, 209-245.

[10] Mandl, P. (1968) Analytical Treatment of One-Dimensional Markov Processes. Springer-Verlag, New York.

[11] McKean, H. P., Jr. (1969) Stochastic Integrals. Academic Press, New York.

[12] Nakao, S. (1972) On the Pathwise Uniqueness of Solutions of One-Dimensional Stochastic Differential Equations. Osaka J. Math. 9, 513-518.

[13] Pliska, S. R. (1973) Single-Person Controlled Diffusions with Discounted Costs. J. Opt. Th. Appl. 12, No. 3, 248-255.

[14] Rath, J. H. (1975) Controlled Queues in Heavy Traffic. Adv. Appl. Prob. 7, 656-671.

[15] Sheng, D. (1980) Two-Mode Control of Absorbing Brownian Motion, submitted for publication.

[16] Sheng, D. (1980) Two-Mode Control of Reflecting Brownian Motion, submitted for publication.

[17] Watanabe, S. and Yamada, T. (1971) On the Uniqueness of Solutions of Stochastic Differential Equations. J. Math. Kyoto Univ. 11, 155-167.
A Resumé of Some of the Applications of
Malliavin's Calculus

Daniel W. Stroock

This research was supported in part by N.S.F. Grant MCS 80-07300.

0. Introduction:

This brief note is intended to introduce the reader to the Malliavin calculus.
However, rather than attempting to explain the intricacies of Malliavin's calculus,
I have decided to only sketch the ideas underlying his calculus and to concentrate
on describing several of the problems to which the calculus has been successfully
applied. Of course, I hope that, having seen its applications, the reader's
appetite will be whetted and that to satisfy his appetite the reader will seek more
information about this subject.

1. The Basic Setting:

Denote by Θ the space of θ ∈ C([0,∞); Rᵈ) such that θ(0) = 0 and let 𝒲 be Wiener measure on Θ. Given a mapping Φ : Θ → R^D, the purpose of Malliavin's calculus is to provide a mechanism for studying the regularity properties of the induced measure μ_Φ = 𝒲 ∘ Φ⁻¹ on R^D. In particular, Malliavin's calculus gives one a way of testing for the absolute continuity of μ_Φ and examining the smoothness of f_Φ = dμ_Φ/dx when it exists. At the same time, it is often possible to obtain regularity results about the behavior of μ_Φ as a function of Φ. Probabilists who are familiar with diffusion theory and related subjects are all too well aware that, heretofore, the only way to attack such problems has been to identify μ_Φ as the solution to some functional equation and then invoke the regularity theory associated with that equation (cf. the discussion in the second paragraph of section 2) below).

Malliavin's idea is to work right in Wiener space (Θ,𝒲). In brief, his technique is to introduce a certain self-adjoint diffusion operator ℒ on L²(𝒲). The importance of taking ℒ to be a diffusion generator is that the associated bilinear map

(1.1)  ⟨Φ,Ψ⟩_ℒ = ℒ(ΦΨ) − Φℒ(Ψ) − Ψℒ(Φ)

will then satisfy

(1.2)  ⟨φ∘Φ,Ψ⟩_ℒ = (φ′∘Φ)⟨Φ,Ψ⟩_ℒ

for smooth φ : R → R. (Equation (1.2) follows from the Itô calculus for continuous martingales; and it is in order to be dealing with continuous martingales that one needs ℒ to generate a diffusion.) Given such an ℒ, one
can integrate by parts. Namely, given Φ, Ψ ∈ Dom(ℒ) such that Ψ/⟨Φ,Φ⟩_ℒ ∈ Dom(ℒ), one has from (1.1), (1.2) and the symmetry of ℒ:

E^𝒲[(φ′∘Φ)Ψ] = E^𝒲[⟨φ∘Φ,Φ⟩_ℒ (Ψ/⟨Φ,Φ⟩_ℒ)]
  = E^𝒲[(φ∘Φ)(Φ ℒ(Ψ/⟨Φ,Φ⟩_ℒ) − (Ψ/⟨Φ,Φ⟩_ℒ) ℒΦ − ℒ(ΦΨ/⟨Φ,Φ⟩_ℒ))]
  = −E^𝒲[(φ∘Φ)(2(Ψ/⟨Φ,Φ⟩_ℒ) ℒΦ + ⟨Φ, Ψ/⟨Φ,Φ⟩_ℒ⟩_ℒ)].

In particular, with Ψ ≡ 1, one concludes from this that there is a g ∈ L¹(𝒲) such that

(1.3)  E^𝒲[φ′∘Φ] = E^𝒲[(φ∘Φ)g],  φ ∈ C₀^∞(R¹).

It is an easy step to go from (1.3) to regularity results about μ_Φ.


There are two ingredients required before this technique can be used. First, one must know that Φ ∈ Dom(ℒ). Second, one must show that 1/⟨Φ,Φ⟩_ℒ ∈ Dom(ℒ). As one would expect, it is the existence of the second ingredient which is difficult to check, because it is this ingredient which contains the "non-degeneracy" of the map Φ.

Although it may appear that there is considerable latitude in one's choice of ℒ, it turns out that ease of computation forces one to choose the simplest ℒ. The ℒ chosen by Malliavin is the one known to quantum field theorists as the "number operator". For probabilists it is more illuminating to describe Malliavin's ℒ as the "Ornstein-Uhlenbeck operator" associated with Wiener measure. For details, the reader is referred to the original paper by Malliavin [6], my paper [10] in which I expand Malliavin's ideas, or my recent articles [11] and [12] in which I introduce a quite different approach to understanding ℒ and its associated calculus. References to other articles on this subject can be found in [11] and [12].

2. Some Applications to Diffusions:

Let

(2.1)  L = (1/2) Σ_{i,j=1}^D a^{ij}(x) ∂²/∂xⁱ∂xʲ + Σ_{i=1}^D bⁱ(x) ∂/∂xⁱ,

where a : R^D → R^D ⊗ R^D and b : R^D → R^D are smooth functions having bounded derivatives and a(x) is non-negative and symmetric for each x ∈ R^D. Suppose that σ : R^D → R^D ⊗ Rᵈ is a smooth function satisfying a = σσ*, and consider the Itô stochastic integral equation

(2.2)  X(T,x) = x + ∫₀^T σ(X(t,x)) dθ(t) + ∫₀^T b(X(t,x)) dt,  T ≥ 0.

The cornerstone on which much of modern diffusion theory rests is the observation that the measure P(T,x,·) given by

(2.3)  P(T,x,·) = 𝒲 ∘ X(T,x)⁻¹

is the fundamental solution to the Cauchy initial value problem

(2.4)  ∂u/∂t = Lu,  t > 0,
       u(0,·) = f.
To be more precise (by an elementary application of Itô's formula), if u is a reasonably smooth solution to (2.4) and u does not grow too fast as |x| → ∞, then

(2.5)  u(T,x) = ∫ f(y) P(T,x,dy).

Conversely, if f is smooth and has moderate growth at infinity, then one can show that the function u given by (2.5) is a solution to the Cauchy problem in (2.4). (A probabilistic proof of this latter statement can be based on the observation that X(T,x) is a smooth function of x. See [7] or [13].)

Having identified the measure P(T,x,·) in (2.3) as the fundamental solution to (2.4), it has been customary for probabilists to read (2.3) "from right to left". That is, the theory of partial differential equations enables one to say a great deal about the fundamental solution to (2.4). (For example, in the case when a(·) ≥ εI for some ε > 0, it is a well-known consequence of the classical parametrix method that P(T,x,dy) = p(T,x,y)dy, where p(T,x,y) is a smooth function of (T,x,y) so long as T + |y−x|² > 0 (cf. [2]).) Thus, because P(T,x,·) is the fundamental solution to (2.4), (2.3) allows us to conclude that, for T > 0, the distribution 𝒲 ∘ X(T,x)⁻¹ of X(T,x) under 𝒲 will have regularity properties which, before the introduction of Malliavin's calculus, were not evident from the description of X(T,x) given by (2.2).
Using Malliavin's calculus and reading (2.3) "from left to right", it is possible not only to recover some of the familiar regularity results about P(T,x,·) but also to obtain some new results which do not seem to follow easily from the theory of partial differential equations. As we mentioned in section 1), successful application of Malliavin's calculus depends on one's proving two properties of the map X(T,x) : Θ → R^D. First, one must show that X(T,x) is in the domain of the operator ℒ. It turns out that this step is quite simple and depends only on smoothness of the functions a(·) and b(·). The second, and difficult, step is to prove that X(T,x) has the necessary non-degeneracy properties. Obviously, the origin of the non-degeneracy of X(T,x) must be the non-degeneracy of a(·). The problem is to figure out how to relate the non-degeneracy of a(·) to that of X(T,x). If a(·) ≥ εI for some ε > 0, the required non-degeneracy of X(T,x) is relatively easy to prove. Thus the classical elliptic theory can be recovered from Malliavin's calculus. Not so easy, but nonetheless possible, is the proof that X(T,x) is non-degenerate when L satisfies Hörmander's conditions for the hypoellipticity of ∂/∂t + L. To be precise, rewrite L in Hörmander's form:

(2.6)  L = (1/2) Σ_{k=1}^d (V^{(k)})² + V^{(0)},

where V^{(k)} = Σ_{j=1}^D v_j^{(k)}(·) ∂/∂xʲ is a smooth vector field on R^D. Define 𝒜(V^{(0)}; V^{(1)}, ..., V^{(d)}) to be the Lie algebra generated by (adⁿV^{(0)})V^{(k)}, n ≥ 0 and 1 ≤ k ≤ d (here (ad^{n+1}V^{(0)})V = [V^{(0)}, (adⁿV^{(0)})V] and (ad⁰V^{(0)})(V) = V). Combining Hörmander's theorem (cf. [4]) with the Schwartz kernel theorem, one can show that if dim(𝒜(V^{(0)}; V^{(1)}, ..., V^{(d)})(y)) = D for all y ∈ R^D, then P(T,x,dy) = p(T,x,y)dy with p(T,x,·) ∈ C^∞(R^D). Using Malliavin's calculus, one can prove the same result under the assumption that dim(𝒜(V^{(0)}; V^{(1)}, ..., V^{(d)})(x)) = D (i.e. one only needs Hörmander's condition at the initial point x). Furthermore, one can get some (admittedly crude) estimates on p(T,x,y) as T ↓ 0 or y → x. The proof of this result can be found in section (8) of [12]. (The proof there is based on joint work with S. Kusuoka. The bibliography of [12] contains references to earlier versions of this and related results.)
Gratifying as the preceding successes of Malliavin's calculus may be, they are too close to known results to be considered real victories. To get a feeling for the sort of application in which Malliavin's calculus really comes into its own, consider the following situation. Assume that, for some 1 ≤ N < D, the principal N×N-minor a_{(N)}(·) of a(·) satisfies a_{(N)}(·) ≥ εI_{(N)}, and set X_{(N)}(T,x) = (X¹(T,x), ..., X^N(T,x)). Denote by P_{(N)}(T,x,·) the marginal distribution of X_{(N)}(T,·) under P(T,x,·). Using Malliavin's calculus, one can show that for T > 0, P_{(N)}(T,x,dy_{(N)}) = p_{(N)}(T,x,y_{(N)}) dy_{(N)}, where p_{(N)}(T,x,y_{(N)}) is a C^∞ function so long as T + |x_{(N)} − y_{(N)}|² > 0. (A proof of this result can be found in [11]. In a forthcoming article, S. Kusuoka and I will discuss various extensions and refinements of the result.) Observe that it is highly unlikely that the regularity of P_{(N)}(T,x,dy_{(N)}) could be easily derived from the theory of partial differential equations. Indeed, there is no obvious equation for p_{(N)}(T,x,y_{(N)}) to satisfy as a function of y_{(N)}. Considerations of this sort have made Malliavin's calculus a powerful tool in the study of infinite dimensional diffusions of the sort which arise in statistical mechanics. The interested reader is referred to [3], where Malliavin's calculus is applied to a continuous state Ising model.
A related application was made by D. Michel in [8]. Her idea was to use Malliavin's calculus to derive regularity properties of conditional transition functions arising in non-linear filtering theory. More recently, she and J.M. Bismut [1] have generalized her work.

3. Applications to Some Non-Markovian Situations:

At the end of section 2) we saw some examples of situations to which Malliavin's calculus applies but the theory of partial differential equations apparently does not. In this section we mention a source of examples about which the theory of partial differential equations has even less to say.

Let σ : [0,∞) × Θ → R^D ⊗ Rᵈ and b : [0,∞) × Θ → R^D be bounded progressively measurable functions which are "smooth" (in the sense of Fréchet differentiability). Consider the Itô stochastic integral equation

(3.1)  X(T) = ∫₀^T σ(t,X(·)) dθ(t) + ∫₀^T b(t,X(·)) dt,  T ≥ 0.

Obviously X(·) is not necessarily a Markov process; and, in general, it cannot be embedded in any finite dimensional Markov process. Thus it is difficult to imagine what sort of functional equation μ_T ≡ 𝒲 ∘ X(T)⁻¹ might satisfy. In particular, it seems very unlikely that one could invoke theorems from partial differential equations theory to prove regularity properties for μ_T. Nonetheless, if one assumes that σσ*(·) ≥ εI, there is no probabilistic reason to suppose that μ_T is not just as smooth as in the diffusion case. Furthermore, a proof based on the Malliavin calculus should run along very much the same lines as it does in the diffusion case. In [12], I showed that, at least in a very special case, one can indeed carry out this program. More recently, joint work with S. Kusuoka indicates that we can do the same for a much wider range of examples.
In this connection, the work of Shigekawa [9] must be mentioned. Shigekawa's interest is in regularity results for the distribution of Wiener functionals arising in Wiener's theory of homogeneous chaos. It turns out that this line of research leads quite quickly to problems best handled by algebraic geometry. For the latest progress in this direction, see the paper by S. Kusuoka [5].

References

[1] Bismut, J.M., and Michel, D., "Diffusions conditionnelles, I. Hypoellipticité partielle," J. Fnal. Anal., vol. 44 #2, pp. 174-211 (1981).
[2] Friedman, A., Partial Differential Equations of Parabolic Type, Englewood Cliffs, N.J., Prentice-Hall (1964).
[3] Holley, R., and Stroock, D., "Diffusions on an infinite dimensional torus," J. Fnal. Anal., vol. 42 #1, pp. 29-63 (1981).
[4] Hörmander, L., "Hypoelliptic second order differential equations," Acta Math. 119, pp. 147-171 (1967).
[5] Kusuoka, S., "On absolute continuity of the law of a system of multiple Wiener integrals," to appear in J. Fac. Sci. Univ. of Tokyo.
[6] Malliavin, P., "Stochastic calculus of variation and hypoelliptic operators," Proc. Intern. Symp. on S.D.E.'s, Kyoto, ed. by K. Itô, Kinokuniya, Tokyo (1978).
[7] McKean, H.P., Stochastic Integrals, Academic Press (1969).
[8] Michel, D., "Régularité des lois conditionnelles en théorie du filtrage non linéaire et calcul des variations stochastique," J. Fnal. Anal. 41 #1, pp. 8-36 (1981).
[9] Shigekawa, I., "Derivatives of Wiener functionals and absolute continuity of induced measures," J. Math. Kyoto Univ., 20, pp. 263-289 (1980).
[10] Stroock, D., "The Malliavin calculus and its application to second order parabolic differential equations, Part I," Math. Systems Th., 14, pp. 25-65 (1981).
[11] Stroock, D., "The Malliavin calculus, a functional analytic approach," J. Fnal. Anal., vol. 44 #2, pp. 212-257 (1981).
[12] Stroock, D., "Some applications of stochastic calculus to partial differential equations," to appear in lecture notes from the 1981 Ecole d'Eté at Saint Flour, Springer Lecture Notes in Math.
[13] Stroock, D., "Topics in stochastic differential equations," to appear in Tata Inst. Lec. Notes Series, Springer-Verlag.
LARGE DEVIATIONS

S.R.S. Varadhan
Courant Institute of Mathematical Sciences
New York University
New York, NY 10012/USA

1. What are Large Deviations?

Let {P_n} be a sequence of probability measures on a Polish space X such that P_n ⇒ δ_{x₀} for some x₀ ∈ X as n → ∞. If A is a closed set with A ∩ {x₀} = ∅, then P_n(A) → 0. The question of large deviations is concerned with the rate of convergence of P_n(A) to zero as n → ∞. We will be concerned here only with the case when P_n(A) → 0 at an exponential rate, and the problem will be to identify the precise exponential rate in several concrete situations.
Definition. We will say that the large deviation results hold for a sequence {P_n} on X with a functional I(·) : X → [0,∞] if

(i) 0 ≤ I(·) ≤ ∞ and I(·) is a lower semicontinuous function of x;

(ii) {x : I(x) ≤ ℓ} is a compact subset of X for each ℓ < ∞;

(iii) for each set A that is closed in X,

lim sup_{n→∞} (1/n) log P_n(A) ≤ − inf_{x∈A} I(x);

(iv) for each set G that is open in X,

lim inf_{n→∞} (1/n) log P_n(G) ≥ − inf_{x∈G} I(x).

Whenever the large deviation results hold for {P_n} with a functional I(·), for every set A whose interior A° and closure Ā satisfy

inf_{x∈A°} I(x) = inf_{x∈Ā} I(x)

we have

lim_{n→∞} (1/n) log P_n(A) = − inf_{x∈A} I(x).

Moreover, for every function F(x) on X which is bounded and continuous we have the following.

Theorem.

lim_{n→∞} (1/n) log ∫ exp[n F(x)] P_n(dx) = sup_x [F(x) − I(x)].

This theorem is the motivation for studying large deviations, because it then provides us a method for evaluating certain integrals asymptotically for large n.
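The theorem can be sanity-checked numerically in a toy case (my illustration, not part of the lecture): take P_n to be the law of the mean of n fair {0,1}-valued coin flips and F(x) = x. The left side is then computable in closed form for every n, and the right side uses the Cramér rate function of example 1 below.

```python
import math

# Cramér rate function for fair coin flips in {0,1}:
# I(x) = x*log(2x) + (1-x)*log(2(1-x)), with I(0) = I(1) = log 2.
def I(x):
    if x in (0.0, 1.0):
        return math.log(2)
    return x * math.log(2 * x) + (1 - x) * math.log(2 * (1 - x))

F = lambda x: x  # a bounded continuous test function on [0, 1]

# Right-hand side of the theorem: sup_x [F(x) - I(x)] over a fine grid.
grid = [k / 10000 for k in range(10001)]
rhs = max(F(x) - I(x) for x in grid)

# Left-hand side: (1/n) log E[exp(n * F(S_n/n))] = (1/n) log E[e^{S_n}]
# = log((1 + e)/2), exactly, for every n, by independence of the flips.
lhs = math.log((1 + math.e) / 2)

print(lhs, rhs)  # the two sides agree to grid accuracy
assert abs(lhs - rhs) < 1e-4
```

The supremum is attained at x = e/(1+e), and its value log((1+e)/2) is exactly log M(1), as Legendre duality predicts.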

2. Finite Dimensional Examples.

We will provide some examples where the large deviation results hold. We start with some fairly elementary examples in finite dimensional spaces.

Example 1. Let x₁, x₂, ..., x_n, ... be a sequence of independent identically distributed random variables on the line with M(θ) = E{e^{θx}} < ∞ for all θ ∈ R. If we define

P_n(A) = Prob[ (x₁ + ··· + x_n)/n ∈ A ],

then the large deviation results hold with

I(x) = sup_θ [θx − log M(θ)].

This result can be found in [1].
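For instance (an illustration I am adding, assuming standard normal summands): M(θ) = e^{θ²/2}, so the supremum defining I(x) is attained at θ = x and I(x) = x²/2. A brute-force grid maximization of the Legendre transform reproduces this:

```python
import math

# For standard normal increments, M(theta) = exp(theta^2 / 2), so the
# Legendre transform I(x) = sup_theta [theta*x - log M(theta)] should
# come out to x^2 / 2 (attained at theta = x).
log_M = lambda th: th * th / 2

def I(x, lo=-10.0, hi=10.0, steps=200001):
    # maximize theta*x - log M(theta) over a uniform theta-grid
    return max(th * x - log_M(th)
               for th in (lo + k * (hi - lo) / (steps - 1) for k in range(steps)))

for x in (0.0, 0.5, 1.0, 2.0):
    assert abs(I(x) - x * x / 2) < 1e-6
print("I(1.0) =", I(1.0))
```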

Example 2. Let us take the same situation as in example 1, but with the change that x₁, ..., x_n, ... take values in Rᵈ. Then

M(θ) = E{exp⟨θ,x⟩} for θ ∈ Rᵈ.

Again the large deviation results hold with

I(x) = sup_θ [⟨θ,x⟩ − log M(θ)].

This, for instance, can be found in [10].

3. Infinite Dimensional Examples.

Example 3. Suppose in example 2 we replace Rᵈ by a Banach space X and add the hypothesis that

E{exp[α‖x‖]} < ∞ for all α > 0;

then we have again the same results with

I(x) = sup_{θ∈X*} [⟨θ,x⟩ − log M(θ)],

where X* is the dual of the Banach space X. See for instance [2].

Example 4. We now specialize the Banach space X to C[0,1], the space of continuous functions on [0,1], and the common distribution of our X-valued random variables to a Gaussian process x(t) with mean zero and covariance ρ(s,t) which has almost surely continuous trajectories. We then obtain the following corollary:

lim_{ℓ→∞} (1/ℓ²) log Prob[ sup_{0≤t≤1} |x(t)| ≥ ℓ ] = −(2ρ)⁻¹,

where

ρ = sup_{0≤t≤1} ρ(t,t).

See in this context [3], [7] and [8].

Example 5. If we take X to be C₀[0,1], i.e. the space of continuous functions on [0,1] which vanish at the origin, and P_n to be the Wiener measure with covariance (1/n) min(s,t), i.e. the measure corresponding to (1/√n) β(·) where β(·) is the standard Brownian motion, then again the large deviation results hold with

I(x) = (1/2) ∫₀¹ [ẋ(t)]² dt.

This is proved in [9].
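As a quick consistency check (a standard computation, added here for illustration), the rate of the straight-line path agrees with the Gaussian tail of the endpoint:

```latex
% Schilder rate of the path x(t) = ct:
I(x) \;=\; \tfrac12 \int_0^1 \dot x(t)^2 \, dt \;=\; \tfrac{c^2}{2}.
% This matches the endpoint tail: n^{-1/2}\beta(1) is N(0,1/n), so
% P\bigl(n^{-1/2}\beta(1) \approx c\bigr) \approx e^{-n c^2/2}.
```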

Example 6. If we take X = C_x([0,1], Rᵈ), the space of Rᵈ-valued continuous functions x(t) on [0,1] with x(0) = x, and take for P_n the measure corresponding to the solution x(t) of the stochastic differential equation

dx(t) = (1/√n) σ(x(t)) dβ(t) + b(x(t)) dt,
x(0) = x,

then again the large deviation results hold with

I(x) = (1/2) ∫₀¹ ⟨ẋ(t) − b(x(t)), a⁻¹(x(t)) (ẋ(t) − b(x(t)))⟩ dt,

where a = σσ*. This was shown in [11] when b ≡ 0 and in [6] and [12] for the general case.

Example 7. Let x₁, x₂, ..., x_n, ... be a sequence of independent identically distributed random variables on a Polish space Y with common distribution α. We take for X the space M_Y of all probability distributions on Y. We take for P_n the distribution of the sample distribution based on n observations, i.e.

P_n(A) = Prob[ (δ_{x₁} + ··· + δ_{x_n})/n ∈ A ].

Then again the large deviation results hold with I(·) defined for μ ∈ M_Y by

I(μ) = ∫ log (dμ/dα) dμ.

See [2] for details.

Remark 1. If μ ≪ α and log(dμ/dα) ∈ L¹(μ), then I(μ) is given as above. Otherwise I(μ) = ∞.

Remark 2. Example 1 is a special case of example 7. If V : Y → R is a real function, then

(V(x₁) + ··· + V(x_n))/n = ∫ V(y) [(δ_{x₁} + ··· + δ_{x_n})/n](dy).

Therefore

I₁(a) = inf { I₇(μ) : ∫ V(y) μ(dy) = a }.

When we want to refer to the I(·) function in a specific example, we will use the subscript denoting the number of that example.

Example 8. Let x₁, x₂, ..., x_n, ... be a "good" Markov chain on a state space Y with transition operator (πf)(x) = ∫ f(y) π(x,dy). We take X = M_Y and consider again

P_n(A) = Prob[ (δ_{x₁} + ··· + δ_{x_n})/n ∈ A | x₀ = x ].

Then again we have the large deviation results with

I(μ) = − inf_{u>0} ∫ log (πu/u)(y) μ(dy).

Application. If e^V denotes the operator of multiplication by the function exp[V(y)], then the spectral radius of the operator πe^V is given by

log s(πe^V) = sup_μ [ ∫ V(y) μ(dy) − I(μ) ].

Conditions. The following conditions are sufficient to yield the result:

(i) Either Y is compact or the process is "strongly" positive recurrent.
(ii) π has the Feller property.
(iii) There is a reference measure with respect to which π(x,dy) has an almost everywhere positive density π(x,y) for each x ∈ Y.

See [2] for details.

If we take π(x,dy) = α(dy), then we are back in example 7 and therefore

I₇(μ) = I₈(μ) with π(x,dy) = α(dy).
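In the degenerate case π(x,dy) = α(dy), the rate function is the relative entropy ∫ log(dμ/dα) dμ, and the spectral-radius formula becomes the classical Gibbs variational principle log ∫ e^V dα = sup_μ [∫V dμ − ∫ log(dμ/dα) dμ]. A small numerical check on a two-point space (my toy example, not from the lecture):

```python
import math

# Two-point state space Y = {0, 1} with reference law alpha and a
# potential V; with pi(x, dy) = alpha(dy) the rate function is relative
# entropy, and the spectral-radius formula reads
#   log sum_y alpha_y e^{V_y} = sup_mu [ <V, mu> - KL(mu || alpha) ].
alpha = (0.3, 0.7)
V = (1.0, -0.5)

lhs = math.log(sum(a * math.exp(v) for a, v in zip(alpha, V)))

def objective(p):                      # mu = (p, 1 - p)
    mu = (p, 1 - p)
    kl = sum(m * math.log(m / a) for m, a in zip(mu, alpha) if m > 0)
    return sum(m * v for m, v in zip(mu, V)) - kl

# maximize over a fine grid of interior mu's
rhs = max(objective(k / 100000) for k in range(1, 100000))

print(lhs, rhs)
assert abs(lhs - rhs) < 1e-6
```

The supremum is attained at the Gibbs measure μ* ∝ α e^V, which is how the tilted chain enters the general Markov case as well.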

Example 9. We change over to continuous time Markov processes. Let T_t be a semigroup of transition operators on a state space Y and let L be the infinitesimal generator of T_t acting on bounded continuous functions on Y. Let M_Y = X be as before. Let x₀ be any starting point. Define

P_t(A) = Prob[ L_t ∈ A | x(0) = x₀ ],

where

L_t(E) = (1/t) ∫₀^t χ_E(x(s)) ds for E ∈ ℬ(Y).

Here again the large deviation results hold with

I(μ) = − inf_{u>0, u∈𝒟(L)} ∫ (Lu/u)(x) μ(dx),

where 𝒟(L) is the domain of L.

Application. If we consider the operator L + V, where V is multiplication by the function V(·), then by the maximum principle the point in the spectrum of L + V with the largest real part is real, and denoting this by λ(L+V) we get the variational formula

λ(L+V) = sup_μ [ ∫ V(x) μ(dx) − I(μ) ].

Remark. This example also needs conditions similar to those in example 8. See [2] for details.

Remark. If T_t happens to be self-adjoint with respect to a reference measure λ(dy), then under mild regularity assumptions

I(μ) = −⟨ (dμ/dλ)^{1/2}, L (dμ/dλ)^{1/2} ⟩.

In particular, if

L = (1/2) ∇·a∇ on Rᵈ,

then

I(μ) = (1/8) ∫ ⟨a∇f, ∇f⟩ / f dx = (1/2) ∫ ⟨a∇√f, ∇√f⟩ dx,

where dμ = f(x) dx.

Example 10. If Y is the unit circle and L = (1/2) d²/dx² + b(x) d/dx with ∮ b(y) dy ≠ 0, then the Markov process is not reversible with respect to any measure, and if dμ = f(x) dx, then

I(μ) = (1/8) ∫ (f′(x))²/f(x) dx + (1/2) ∫ [b′(x) + b²(x)] f(x) dx − (∮ b(x) dx)² / (2 ∫ dx/f(x)).

Example 11. Let x₁, x₂, ..., x_n, ... be a sequence of independent identically distributed random variables taking values ±1 with probability 1/2 each. Consider the point ω in the space of doubly infinite sequences defined by

ω = (..., x₁, ..., x_n, x₁, ..., x_n, x₁, ..., x_n, ...).

If T is the shift operator, then ω, Tω, ..., T^{n−1}ω is a periodic orbit of period n. Then the measure

(1/n)[δ_ω + δ_{Tω} + ··· + δ_{T^{n−1}ω}] = R_n(x₁, ..., x_n)

is a random stationary stochastic process. If we take X to be the space of all stationary stochastic processes on the space of sequences of ±1, and denote by P₀ the product Bernoulli measure, then the distribution of R_n(x₁,...,x_n), which is a measure Q_n on X, satisfies

Q_n ⇒ δ_{P₀}.

Note that P₀ ∈ X is a point of X. This is just the ergodic theorem. Again the large deviation results hold in this context and

I(P) = log 2 − h(P),

where h(P) is the Kolmogorov-Sinai entropy of the stationary process P. This result is essentially a restatement of the Shannon-McMillan-Breiman theorem in information theory.
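Concretely (my illustration, using the standard entropy formula for product measures): if P is the product Bernoulli(p) process, then h(P) = −p log p − (1−p) log(1−p), and I(P) = log 2 − h(P) is exactly the relative entropy of Bernoulli(p) with respect to the fair coin:

```python
import math

# For a product Bernoulli(p) process P, the Kolmogorov-Sinai entropy is
# h(P) = -p log p - (1-p) log(1-p), so the rate I(P) = log 2 - h(P)
# equals the relative entropy of Bernoulli(p) w.r.t. Bernoulli(1/2).
def h(p):
    return -p * math.log(p) - (1 - p) * math.log(1 - p)

def kl_vs_fair(p):
    return p * math.log(2 * p) + (1 - p) * math.log(2 * (1 - p))

for p in (0.1, 0.3, 0.5, 0.9):
    assert abs((math.log(2) - h(p)) - kl_vs_fair(p)) < 1e-12
print("I(Bernoulli(0.3)) =", math.log(2) - h(0.3))
```

In particular I(P₀) = 0, as it must be, since h is maximized (at log 2) by the fair coin.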

Example 12. Let us take Ω = Y^Z, i.e. the space of doubly infinite sequences with values in Y. Let us take a Markov chain in Y with transition probabilities π and replace the independent random variables of example 11 by the Markov chain. Then, analogous to example 11, we have large deviation results with an I(·) functional defined by

I₁₂(P) = E^P{ I₇(P_ω, π_ω) },

where

P_ω = P[ x₁ ∈ · | ..., x₋₁, x₀ ]

is the conditional distribution of x₁ given the past under P. π_ω is the transition distribution π(x₀,·), where x₀ is viewed as a function of ω, i.e. the position at time 0 dictated by ω. I₇(μ,α) is the I-functional of example 7:

I₇(μ,α) = ∫ log (dμ/dα)(y) dμ(y).

I₇(P_ω, π_ω) then depends on ω and is integrated with respect to P. The details will appear in [4].

Remark. Comparing this to example 11, if we take π(x,dy) = α(dy), then π_ω is independent of ω and then

I₁₂(P) = I₁₁(P).

Remark. Since the one dimensional marginals of R_n(x₁,...,x_n) are given by (1/n)[δ_{x₁} + ··· + δ_{x_n}], example 8 is a special case of example 12 and one has the contraction principle

I₈(μ) = inf_{P : Marg P = μ} I₁₂(P).

The infimum is taken over all stationary processes P with marginal distribution μ.

Remark. Let F(x₁,...,x_N) be a function depending on a finite number of coordinates. Then if x₁, ..., x_n, ... is a Markov chain with transition probability π, we have

lim_{n→∞} (1/n) log E exp[ F(x₁,...,x_N) + F(x₂,...,x_{N+1}) + ··· + F(x_n,...,x_{n+N−1}) ]
  = sup_P { E^P[F(x₁,...,x_N)] − I₁₂(P) }.
Example 13. We can carry out the analogue of example 12 in the continuous time case. Ω will be the space of cadlag functions with values in Y and X is the space of stationary measures on Ω. We have the family R_x of Markov measures on Ω⁺ corresponding to the semigroup T_t with generator L. For any T > 0 and any stationary process P we have the r.c.p.d. P_ω of P given the past and the Markov process R_{ω(0)} starting where ω ended at time 0. Considering both of them on the time interval [0,T], we define

h(T,P) = E^P[ I₇(P_ω^T, R_{ω(0)}^T) ],

where P_ω^T and R_{ω(0)}^T are restrictions of P_ω and R_{ω(0)} to the time interval [0,T]. It turns out that h(T,P) is linear in T and

h(T,P) = T I₁₃(P),

and the large deviation results hold with this I(·) function. There is again a contraction principle connecting I₁₃ and I₉ identical to the one connecting I₈ and I₁₂.

Remark. It turns out that in examples 8 and 9 I(·) is a convex functional, but the ergodicity implies that in examples 12 and 13, I(·) is linear in its argument.

Analogous to the third remark following example 12 we have a similar formula:

lim_{T→∞} (1/T) log E exp[ ∫₀^T F(ω_s) ds ] = sup_P [ E^P{F(ω)} − I₁₃(P) ].

Here F is a tame function on Ω and ω_s is the path ω shifted in time by an amount s.

4. Applications.

We will look at some illustrations of these ideas through specific examples. We will not try to be precise, but just try to give the flavor.

1. Study the behavior of

G(t) = E exp{ − ∫ [ℓ(t,x)]^α dx }

for 0 < α < 1. Here ℓ(t,x) is the local time at x for the standard Brownian motion. By Brownian scaling,

G(t) = E exp{ −λ ∫ [ℓ̄(τ,x)]^α dx },

where ℓ̄(τ,x) = (1/λ) ℓ(τ,x) and λ = t^{(1+α)/(3−α)}. Then

lim_{t→∞} (1/t^{(1+α)/(3−α)}) log G(t) = − inf_{f ≥ 0, ∫ f dx = 1} [ ∫ f(x)^α dx + (1/8) ∫ (f′(x))²/f(x) dx ].
2. Consider

G(α) = lim_{t→∞} (1/t) log E exp{ α ∫₀^t ∫₀^t e^{−|σ−s|} / |β(σ)−β(s)| dσ ds },

where β(·) is 3-dimensional Brownian motion. We will show that

lim_{α→∞} G(α)/α² = sup_{‖ψ‖₂=1} [ 2 ∬ ψ²(x)ψ²(y)/|x−y| dx dy − (1/2) ∫ |∇ψ(y)|² dy ].

To see this, rewrite

∫₀^t ∫₀^t e^{−|σ−s|}/|β(σ)−β(s)| dσ ds = 2 ∫₀^t ∫₀^σ e^{−(σ−s)}/|β(σ)−β(s)| ds dσ ≈ ∫₀^t F(ω_s) ds as t → ∞,

where

F(ω) = 2 ∫₀^∞ e^{−s}/|ω(s)−ω(0)| ds.

Then clearly we can expect

G(α) = sup_P [ α E^P{ 2 ∫₀^∞ e^{−s}/|ω(s)−ω(0)| ds } − I₁₃(P) ].

By Brownian scaling,

G(α)/α² = sup_P [ E^P{ (2/α²) ∫₀^∞ e^{−s/α²}/|ω(s)−ω(0)| ds } − I₁₃(P) ].

By ergodicity one sees that

lim_{α→∞} E^P[ (1/α²) ∫₀^∞ e^{−s/α²}/|ω(s)−ω(0)| ds ] = ∬ f(x)f(y)/|x−y| dx dy,

where f(·) is the marginal density of P. Finally it is not unreasonable to expect

lim_{α→∞} G(α)/α² = sup_f [ 2 ∬ f(x)f(y)/|x−y| dx dy − I₉(f) ]

using the contraction principle. We can set f = ψ² for ψ ∈ L²(R³) and we get the final formula. See [5] for details.

5. Counterexample.

If we take x(t) to be Brownian motion on R¹ with a drift, i.e.

L = (1/2) d²/dx² + d/dx,

then direct elementary calculation shows that

I₉(μ) = (1/8) ∫ (f′(x))²/f(x) dx + 1/2,

where μ(dx) = f(x) dx. In any case I₉(μ) ≥ 1/2 for all μ. The large deviation results cannot hold, because if we take A = M_Y, the whole space, the probability is 1, and the statement

lim_{t→∞} (1/t) log 1 = 0 = − inf_μ I₉(μ) ≤ −1/2

is clearly false. That is why we need the strong positive recurrence condition.

Acknowledgments. This work was supported by NSF Grant MCS-8109183.

References

[1] Cramér, H., On a new limit theorem in the theory of probability, Colloquium on the Theory of Probability, Hermann, Paris, 1937.
[2] Donsker, M. D. and Varadhan, S.R.S., Asymptotic evaluation of certain Markov process expectations for large time, III, Comm. Pure Appl. Math. 29 (1977), 389-461.
[3] Donsker, M. D. and Varadhan, S.R.S., Some problems of large deviations, Istituto Nazionale di Alta Matematica, Symposia Mathematica, Vol. XXI (1977), pp. 313-318.
[4] Donsker, M. D. and Varadhan, S.R.S., Asymptotic evaluation of certain Markov process expectations for large time, IV (to appear).
[5] Donsker, M. D. and Varadhan, S.R.S., Asymptotics of the polaron problem.
[6] Glass, M., Perturbation of a first order equation by a small diffusion, Thesis, New York Univ. (1970).
[7] Marcus, M. B. and Shepp, L. A., Sample behavior of Gaussian processes, Sixth Berkeley Symposium on Mathematical Statistics and Probability, Vol. II (1972), pp. 423-439.
[8] Pincus, M., Gaussian processes and Hammerstein integral equations, Trans. Amer. Math. Soc., Vol. 134 (1968), pp. 193-216.
[9] Schilder, M., Some asymptotic formulae for Wiener integrals, Trans. Amer. Math. Soc., Vol. 125 (1966), pp. 63-85.
[10] Varadhan, S.R.S., Asymptotic probabilities and differential equations, Comm. Pure Appl. Math., Vol. 19 (1966), pp. 261-286.
[11] Varadhan, S.R.S., Diffusion processes in a small time interval, Comm. Pure Appl. Math. 20 (1967), pp. 659-685.
[12] Ventcel, A. D. and Freidlin, M. I., On small random perturbations of dynamical systems, Russian Math. Surveys 25 (1970), 1-55.
