Lecture Notes in Control and Information Sciences
Control and
Information Sciences
Edited by A.V. Balakrishnan and M.Thoma
42
Advances in Filtering
and Optimal Stochastic Control
Proceedings of the IFIP-WG 7/1 Working Conference
Cocoyoc, Mexico, February 1-6, 1982
Edited by
W.H. Fleming and L.G. Gorostiza
Springer-Verlag
Berlin Heidelberg NewYork 1982
Series Editors
A. V. Balakrishnan • M. Thoma
Advisory Board
L. D. Davisson • A. G. J. MacFarlane • H. Kwakernaak
J. L. Massey • Ya. Z. Tsypkin • A. J. Viterbi
Editors
Wendell H. Fleming
Division of Applied Mathematics
Brown University
Providence, Rhode Island 02912
USA
Luis G. Gorostiza
Departamento de Matemáticas
Centro de Investigación y de Estudios Avanzados del IPN
Apartado Postal 14-740
México 07000, D.F.
México
This work is subject to copyright. All rights are reserved, whether the whole
or part of the material is concerned, specifically those of translation, re-
printing, re-use of illustrations, broadcasting, reproduction by photocopying
machine or similar means, and storage in data banks.
Under § 54 of the German Copyright Law where copies are made for other
than private use, a fee is payable to 'Verwertungsgesellschaft Wort', Munich.
© Springer-Verlag Berlin Heidelberg 1982
Printed in Germany
Printing and binding: Beltz Offsetdruck, Hemsbach/Bergstr.
2061/3020-543210
PREFACE
Wendell H. Fleming
Luis G. Gorostiza
* This simile was in part suggested by the fact that Cocoyoc means "place of the
coyotes" in the Náhuatl language, and in part by the name of one young speaker.
ADDRESSES OF CONTRIBUTORS
BARAS, J. S.
Department of Electrical Engineering
University of Maryland
College Park, MD 20742
U.S.A.

BENES, V. E.
Bell Laboratories
Murray Hill, NJ 07974
U.S.A.

BENSOUSSAN, A.
INRIA
Domaine de Voluceau - Rocquencourt
B. P. 105
78150 Le Chesnay
FRANCE

BLANKENSHIP, G. L.
Department of Electrical Engineering
University of Maryland
College Park, MD 20742
U.S.A.

CLARK, J. M. C.
Department of Electrical Engineering
Imperial College
London SW7 2BT
ENGLAND

DAVIS, M. H. A.
Department of Electrical Engineering
Imperial College
London SW7 2BT
ENGLAND

DAWSON, D. A.
Department of Mathematics and Statistics
Carleton University
Ottawa K1S 5B6
CANADA

EL KAROUI, N.
École Normale Supérieure
3, rue Boucicaut
92260 Fontenay-aux-Roses
FRANCE

ELLIOTT, R. J.
Department of Pure Mathematics
The University of Hull
Hull HU5 2DW
ENGLAND

FLEISCHMANN, K.
Akademie der Wissenschaften der DDR
Zentralinstitut für Mathematik und Mechanik
DDR-1080 Berlin, Mohrenstrasse 39
GERMAN DEMOCRATIC REPUBLIC

FLEMING, W. H.
Division of Applied Mathematics
Brown University
Providence, RI 02912
U.S.A.

GOROSTIZA, L. G.
Departamento de Matemáticas
Centro de Investigación y de Estudios Avanzados, IPN
Apartado Postal 14-740
México 14, D. F.
MÉXICO

HAUSSMANN, U. G.
Department of Mathematics
University of British Columbia
Vancouver, B. C. V6T 1W5
CANADA

HELMES, K.
Institut für Angewandte Mathematik
Universität Bonn
5300 Bonn, Wegelerstr. 6-10
FEDERAL REPUBLIC OF GERMANY

HIJAB, O.
Department of Mathematics and Statistics
Case Western Reserve University
Cleveland, OH 44106
U.S.A.

KURTZ, T. G.
Department of Mathematics
University of Wisconsin
Madison, WI 53706
U.S.A.

KUSHNER, H. J.
Division of Applied Mathematics
Brown University
Providence, RI 02912
U.S.A.

LIONS, P-L.
Ceremade, Paris IX University
Place de Lattre de Tassigny
75775 Paris Cedex 16
FRANCE

MANDL, P.
Department of Probability and Mathematical Statistics
Charles University
Sokolovská 83
186 Prague 8
CZECHOSLOVAKIA

MARCUS, S. I.
Department of Electrical Engineering
University of Texas at Austin
Austin, TX 78712
U.S.A.

MAZZIOTTO, G.
Centre National d'Études des Télécommunications
92131 Issy-les-Moulineaux
FRANCE

MENALDI, J-L.
Department of Mathematics
Wayne State University
Detroit, MI 48202
U.S.A.

MITTER, S. K.
Department of Electrical Engineering and Computer Science
and Laboratory for Information and Decision Systems
Massachusetts Institute of Technology
Cambridge, MA 02139
U.S.A.

NISIO, M.
Department of Mathematics
Faculty of Sciences
Kobe University
Rokkodai-machi, Nada-ku
Kobe 657
JAPAN

PARDOUX, E.
U.E.R. de Mathématiques
Université de Provence
3 Place Victor-Hugo
13331 Marseille Cedex 3
FRANCE

PLISKA, S. R.
Department of Industrial Engineering and Management Science
Northwestern University
Evanston, IL 60201
U.S.A.

PRAGARAUSKAS, H.
Institute of Mathematics and Cybernetics
Academy of Sciences of the Lithuanian SSR
232 600 Vilnius 54, K. Pozelos Str.
U.S.S.R.

QUADRAT, J-P.
Domaine de Voluceau - Rocquencourt
B. P. 105
78150 Le Chesnay
FRANCE

RISHEL, R. W.
Department of Mathematics
University of Kentucky
Lexington, KY 40506
U.S.A.

SAZONOV, V. V.
Steklov Mathematical Institute
Academy of Sciences of the USSR
42 Vavilova Street
Moscow B-333
U.S.S.R.

SHENG, D. D.
Bell Laboratories
Holmdel, NJ 07733
U.S.A.

STROOCK, D. W.
Mathematics Department
University of Colorado
Boulder, CO 80309
U.S.A.

VARADHAN, S. R. S.
Courant Institute of Mathematical Sciences
New York University
251 Mercer Street
New York, NY 10012
U.S.A.
CONTENTS
ELLIOTT, R. J., AL-HUSSAINI, A. Two parameter filtering equations for jump process semimartingales 113

HELMES, K., SCHWANE, A. Lévy's stochastic area formula in higher dimensions 161

MARCUS, S. I., LIU, C-H., BLANKENSHIP, G. L. Lie algebraic and approximation methods for some nonlinear filtering problems 225

MENALDI, J-L. Stochastic control problem for reflected diffusions in a convex bounded domain 246

PARDOUX, E., BOUC, R. PDE with random coefficients: asymptotic expansion for the moments 276
ABSTRACT
denotes the space of functions having locally Hölder continuous derivatives of order i in x and j in t. Furthermore the generator L of the diffusion process x(·) is assumed to be uniformly elliptic; that is, there exist continuous functions θ_i(x,t), i = 1,2, and a constant θ_0 > 0 such that for all ξ ∈ R^n and (x,t),

    \theta_0 |\xi|^2 \le \theta_1(x,t)|\xi|^2 \le \sum_{i,j=1}^n a^{ij}(x,t)\,\xi_i \xi_j \le \theta_2(x,t)|\xi|^2 ,

where (a^{ij}) = \tfrac{1}{2} g g^T. The filtering problem for (1) is to estimate statistics of x(t) given the σ-algebra F_t^y = σ{y(s) : 0 ≤ s ≤ t}. Equivalently we can compute the conditional distribution of x(t) given F_t^y.
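As a concrete sanity check (our illustration, not part of the paper), the ellipticity sandwich above can be verified numerically for a sample coefficient matrix; the matrix g below is a hypothetical choice, and a = ½gg^T as in the text, with θ_1, θ_2 taken as the extreme eigenvalues of a.

```python
import numpy as np

# Numeric check of uniform ellipticity (illustrative): for a sample g,
# the quadratic form xi^T a xi with a = (1/2) g g^T lies between
# theta1*|xi|^2 and theta2*|xi|^2, theta1/theta2 = extreme eigenvalues of a.
rng = np.random.default_rng(0)
g = np.array([[2.0, 0.5], [0.0, 1.0]])   # hypothetical diffusion coefficient
a = 0.5 * g @ g.T                         # a^{ij} = (1/2) g g^T, symmetric

theta1, theta2 = np.linalg.eigvalsh(a)[[0, -1]]   # ascending eigenvalues
for _ in range(1000):
    xi = rng.standard_normal(2)
    q = xi @ a @ xi
    assert theta1 * (xi @ xi) - 1e-12 <= q <= theta2 * (xi @ xi) + 1e-12
# uniform ellipticity (theta1 > 0) holds exactly when g has full rank
```

Since g above is nonsingular, θ_1 > 0 and the lower bound is strictly positive.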
2. OUTLINE OF THE METHOD

    u^k(x,t_k) =
      \begin{cases}
        p_0(x)\exp(\varphi^0(x,0)), & k = 0 \\
        u^{k-1}(x,t_k)\exp(\varphi^k(x,t_k) - \varphi^{k-1}(x,t_k)), & 1 \le k \le N-1
      \end{cases}
    \qquad (7)

where

    \mathcal{L}^k u^k = \sum_{i,j=1}^n a^{ij} u^k_{x_i x_j} + \sum_{i=1}^n b_i^k u^k_{x_i} + c^k u^k ,

with coefficients b^k and c^k determined by f, g, h, the weight functions φ^k, and the sampled observation path (y_{t_k}).
In this section we describe our results for the scalar case including polynomial nonlinearities. Our assumptions on the coefficients in (1.6) are stated in terms of the original functions f, g, h. To state these succinctly, we will use the relative order notation:

Definition. Let F, G : R → R and

    L = \limsup_{|x|\to\infty} |F(x)/G(x)| \in [0,\infty] .

    g^2 h_{xx},\ (g^2 h_x)_x = o(g^2 h_x^2)

(B3) either g h_x = O(h) or g h_x = o(f/g);

    \lim_{|x|\to+\infty} \Big| \int_0^x (h/g)(\xi)\,d\xi \Big| = +\infty
where

    \varphi^k(x) = \gamma^k \varphi_1(x) + \delta_1 \varphi_2(x) + \delta_2 \big[1 + \varphi_2^2(x)\big]^{1/2} .

Assumptions (A3) and (B6) together with the constraints ε > 0, δ_2 > 0, δ_2 > |δ_1|, imply that the weight functions φ^k(x) diverge to +∞ as |x| → ∞. The remaining growth conditions serve to identify the dominant terms (as |x| → ∞) in the potential c^k(t,x) in (7) and in the potential of the adjoint of (7). Assumption (B3) permits us to select the functions φ^k and the constants γ^k so that these potentials are nonpositive. This in turn permits the use of a maximum principle.
and some M < ∞. Then for any constants θ̃_i, 0 < θ̃_i < θ_i, i = 1,2, there exists a

    u^k(x,0) = p_0(x), \qquad k = 1,2,\ldots

where

Then for any T < ∞, there exist positive constants M_1, M_2, K_1, K_2, which may depend on the path {y(t), 0 ≤ t ≤ T}, such that the solution of the DMZ equation given by (13) satisfies

    \forall (x,t) \in \mathbb{R} \times [0,T]
To illustrate our results and make contact with other recent work on nonlinear filtering (e.g., [6], [12]), we consider a class of systems with polynomial f, h. Then from Theorem 2, for any 0 ≤ t_1 ≤ t_2, there exist constants M_i, K_i depending on the observation path such that

where

    M_1 + K_1 |z h_z(z)| \le |h(z)| \le M_2 + K_2 |z h_z(z)|

Note that these conditions are satisfied when f(z) is affine and h(z) is a non-constant polynomial; we do not consider the case h(z) constant. The assumptions that f, h and their derivatives are bounded at the origin are made for convenience only. They can be relaxed by introducing more complex growth restrictions. The other assumptions (C2)-(C5) are essential to our method.
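The recursion (7) suggests a simple numerical treatment of the scalar case. The sketch below is an illustrative splitting-up approximation of the scalar Zakai equation, not the authors' scheme: an explicit finite-difference prediction step alternates with a multiplicative observation update. The drift f, squared diffusion g², polynomial observation function h, hidden state, and all grid parameters are hypothetical choices.

```python
import numpy as np

# Splitting-up sketch for the scalar Zakai equation (illustration only):
# prediction du/dt = -(f u)_x + (g2/2) u_xx by explicit finite differences,
# then the multiplicative Bayes update exp(h*dy - 0.5*h^2*dt).
rng = np.random.default_rng(1)
x = np.linspace(-4, 4, 201); dx = x[1] - x[0]
f = lambda z: -z            # hypothetical drift
g2 = 1.0                    # hypothetical squared diffusion coefficient
h = lambda z: z**3          # polynomial observation function, as in the text

dt, steps = 1e-4, 2000
u = np.exp(-x**2 / 2); u /= u.sum() * dx   # initial density p_0
xt = 0.5                                    # hidden state (frozen, for brevity)
for _ in range(steps):
    fu = f(x) * u
    u[1:-1] += dt * (-(fu[2:] - fu[:-2]) / (2 * dx)
                     + 0.5 * g2 * (u[2:] - 2 * u[1:-1] + u[:-2]) / dx**2)
    dy = h(xt) * dt + np.sqrt(dt) * rng.standard_normal()  # simulated data
    u *= np.exp(h(x) * dy - 0.5 * h(x)**2 * dt)            # Bayes update
cond_mean = (x * u).sum() / u.sum()   # normalized conditional mean
```

The explicit step size satisfies the usual stability bound g²·dt/(2·dx²) < 1/2 for this grid; larger polynomial h would require a finer time step in the update factor.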
    W(x,0) = p_0(e^x)

The methods of Besala [10] as used in the proofs of Theorems 1 and 2 can be directly applied to the robust version of this equation. The proofs of Theorems 1-2 go through when the weight functions φ^k(x) are defined by (9) with x = log z and

    \varphi_1(z) = -\int_1^z f(\xi)/\xi^2\, d\xi \qquad (26)

    \varphi_2(z) = h(z)

Similarly as before define

for all z ∈ (0,∞) and some M_1 > 0. Then for any θ̃_i ≤ θ_i the DMZ equation (24) has a unique solution in the class of functions satisfying, for all t > 0 and z ∈ (0,∞),

    \exp\Big[\int_1^z \big(f(\xi)/\xi^2\big)\, d\xi - |h(z)|\Big] \qquad (32)

in the sense of (16) in Theorem 2.
5. THE MULTIVARIABLE CASE
    |f(x,t)| \le \varepsilon\, g(x,t) + K(\varepsilon)

Definition. A nonnegative function r ∈ H²_loc(R^n) is said to be a scale function if

(i) there exist positive constants D_1, R such that |∇r(x)|² ≥ D_1 for all |x| ≥ R;

(ii) lim_{R→∞} min_{|x|=R} r(x) = +∞.

We shall use on occasion the notation

    Ar(x,t) \triangleq 2 \sum_{i,j=1}^n a^{ij}(x,t)\, r_{x_i}(x)\, r_{x_j}(x) = |g(x,t)\nabla r(x)|^2 \ge 0 ,
Definition. Let F denote the collection of pairs of functions (F(z,t), r(x)) satisfying F ∈ H^{2,1}_loc(R×[0,T]), r ∈ H²_loc(R^n), and

F(i)  F_z(z,t) = O_b(F²(z,t)), and

    \int_0^{r(x)} F(z,t)\,dz,\ \int_0^{r(x)} F_t(z,t)\,dz = O_b(F^2 Ar)

F(ii)

    F \sum_{i,j=1}^n a^{ij} r_{x_i x_j},\quad F \sum_{i,j=1}^n a^{ij}_{x_i} r_{x_j} = O_b(F^2 Ar) .

Definition. Two time-varying vector fields f_1(x,t), f_2(x,t) are said to be compatible if there exists a constant R > 0 such that

    \sum_{i,j=1}^n a^{ij}(x,t)\, f_1^i(x,t)\, f_2^j(x,t) \ge 0

for all |x| ≥ R, t ∈ [0,T]. If we now let (ã_{ij}) denote the inverse of (a^{ij}), we can state the assumptions on the coefficients f, g of the diffusion as follows.
    K_1 |x|^{2-\nu} \le \theta_1(x,t), \qquad |a^{ij}(x,t)| \le K_0 |x|^{2-\nu},

    |a^{ij}_{x_i}(x,t)| \le K_0 |x|^{1-\nu}, \qquad |a^{ij}_{x_i x_j}(x,t)| \le K_0 .
Theorem 4. If either F(i), (iii) and G hold, or F(i) and L hold, then the martingale problem for the pair (f,g) is well-posed.

Finally we need the following hypothesis, which helps control the growth of the potentials.

Hypothesis I.

We can now state the following existence result. For a proof see [9].
Let

    \varphi_0(x,t) =
      \begin{cases}
        \log|x|, & \text{if } \nu = 0 \\
        |x|^\nu/\nu, & \text{if } \nu \in (0,2]
      \end{cases}
    \quad \text{for } |x| \ge R,

with any C² time-invariant extension for |x| ≤ R,

    \varphi_1(x,t) = \int_0^{r(x)} F(z,t)\,dz, \qquad \varphi_2(x,t) = \int_0^{s(x)} G(z,t)\,dz
    \varphi(x,t) = \gamma \varphi_0(x) + \varepsilon\, \varphi_1(x,t) + (1-\varepsilon)\, \varphi_2(x,t)

for all ε ∈ (0,1). If instead of hypothesis G, hypothesis L holds, the result remains valid with ε_0 = γ_0 = 0 and ε_i chosen arbitrarily small.
dx(t) = dw(t)
REFERENCES
[1] M.H.A. Davis, S.I. Marcus, "An introduction to nonlinear filtering", in Stochastic Systems: The Mathematics of Filtering and Identification and Applications, M. Hazewinkel, J.C. Willems, eds., NATO Adv. Study Inst. Ser., Reidel, Dordrecht, 1981, pp. 53-76.

[2] M. Hazewinkel and J.C. Willems, eds., The Mathematics of Filtering and Identification and Applications, NATO Adv. Study Inst. Ser., Reidel, Dordrecht, 1981.

[3] N.V. Krylov, B.L. Rozovskii, "On the conditional distribution of diffusion processes", Izvestia Akad. Nauk SSSR, Math. Ser., 42, 1978, pp. 356-378.

[4] E. Pardoux, "Stochastic partial diff. eqs. and filtering of diff.
1. Introduction
The area of stochastic control, though long ago supplied by Richard Bellman with the
powerful methodology of dynamic programming, is still strewn with unsolved problems.
Yes, problems, because although the theory of stochastic control is well and even elegantly
formulated, the gritty task of finding or describing optimal control laws languishes, and
problems continue to stay generically unsolved, except on a set of empty interior by
(shudder!) numerical methods. In the difficult and currently active field of control with
incomplete noisy information, solutions are even more sparse; for good Lie algebraic reasons
they are limited with a few exceptions to linear dynamics and Gauss-Markov processes.
It is therefore important to note that a special but significant subclass of control problems
with incomplete noisy information turns out to be more readily approachable than one might
expect: these are the problems of optimally stopping a process (say a diffusion x_t) while
observing only a function h(x_t) muddied up by white noise, so as to maximize Ek(x_τ) for
some return function k.
Now stopping a process to maximize an expected return depending on the final value at
the stopping time is a problem that has attracted much theoretical interest, and found
applications in economics, finance, and inventory theory, inter alia. Here the admissible
stopping times are those of the process itself, because one has perfect information. Thus in
the problem's usual formulation it is assumed that the process to be stopped is Markov and
that it is exactly observed; the task then becomes one of finding a continuation region in
which the optimal return is harmonic (in the generalized sense appropriate to the process)
while everywhere dominating the return function k, and fitting it smoothly at the
(unknown) boundary. These optimality conditions can also be reformulated more
analytically as a quasi-variational inequality, convenient for proofs but obscuring the "free
boundary" problem.
We pose a version of the stopping problem in which only noisy incomplete information
about the past of the process x t to be stopped is available: the decision to stop must be based
on a given initial distribution for x_0 and on a noisy observation dy_t = h(x_t)dt + db_t
consisting of a function h of x_t corrupted by white Gaussian noise. This kind of problem
was posed first by Grigelionis [1], and studied extensively by Mazziotto and Szpirglas [2-5].
It turns out, not surprisingly, that our form of the problem can be formulated and
attacked by means of nonlinear filtering, and that the resulting structure is a kind of infinite-
dimensional version of the original problem: the conditional distribution of the state is used
as a new state, and constitutes a measure-valued diffusion; again one looks for a function
(defined now on the positive cone of a Banach space of measures) which is "harmonic" in a
certain continuation region, and dominates a given function. The meaning of 'harmonic' is
clarified by the interpretation in terms of filtering: we "Markovize" the problem by taking
the conditional distribution of x t as an infinite-dimensional state variable, with dynamics
given by Zakai's equation. It is natural, if not almost obvious, that this conditional measure
is a sufficient statistic in the sense that by using only it one can do as well as by using all the
available data; so we can expect that a particular optimal stopping time can be defined as the
hitting time of a suitable boundary by the conditional distribution. The "harmonic"
functions in this setting are those which are annihilated by a natural infinite-dimensional
"Laplacian" operator obtained from the Zakai equation. In this way we inject, into the
problem of stopping under partial observations, a more analytical method than had been
present in the previous work [2-4] of Szpirglas and Mazziotto.
We end this introduction with a summary of the rest of the work. Sections 2 and 3
describe a formulation and interpretation of the problem whose neatness owes much to
Fleming and Pardoux [7] and to Davis and Kohlmann [8,9]. In Section 4 we advance some
heuristics on the appropriate meaning of 'harmonic' to be used in an analytical approach,
leading, in Section 5, to an Itô differential rule. In Section 6 we describe a quasi-variational
inequality proposed as equivalent to the stopping problem; the inequality is stated in terms of
an infinite-dimensional "Laplacian" advanced in Section 4 as expressing the appropriate
meaning of 'harmonic.' There follow a verification lemma in Section 7, a discussion of the
regularity of the optimal return in Section 8, a reformulation of the whole problem as a
fixed point problem in Section 9, and a characterization of the optimal return as the least
superharmonic majorizer of the linear functional (k,μ). Section 10 studies the optimal
return as a fixed point in a complete lattice, while Section 11 suggests and calculates some
examples. The final Section 13 gives the proof of the differential formula.
2. Formulations
It is convenient to give our problem a series of formulations of increasing tractability.
We start with a diffusion process observed through noise, described by stochastic DEs
    dx_t = b\,dt + \sigma\,dw_t \qquad \text{(state)}
    dy_t = h(x_t)\,dt + db_t \qquad \text{(observation)}

and the problem is to maximize the expected return Ek(x_τ) over stopping times τ of y, for a given initial measure μ for x_0. The first reformulation consists in eliminating all reference to x_t. Let Y_t = σ{y_s, s ≤ t}, and for a stopping time τ of Y_t, let Y_τ = σ-algebra of events prior to τ. Since

    E\{k(x_\tau)\,|\,Y_\tau\} = E\{k(x_t)\,|\,Y_t\}\big|_{t=\tau} \quad \text{a.s.}
The notation ( , ) refers to the pairing between the set C_b of bounded continuous functions and the set M of finite regular Borel measures, with the weak topology for M. σ_tμ is a measure-valued diffusion driven by the observation process. In terms of it we can write

    \hat f_t = \frac{(f,\sigma_t\mu)}{(1,\sigma_t\mu)}

and state a first reformulation of the original problem as the search for

    \sup_\tau E\,\frac{(k,\sigma_\tau\mu)}{(1,\sigma_\tau\mu)}, \qquad \tau \text{ a s.t. of } Y_t

subject to the Zakai dynamics for σ_tμ, the expectation E now being purely over the Y_t-process.

Finally, on the suggestion of M. H. A. Davis [6,8] we can use an idea due to W. H. Fleming and E. Pardoux [7] to eliminate the troublesome-looking normalizing denominator (1,σ_tμ) if we replace the measure induced by Y_t by Wiener measure. The final formulation is then this: Find, for w_t Brownian, a stopping time τ* of w_t that achieves

    S(\mu) = \sup_\tau E\,(k,\sigma_\tau\mu)
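To make the formulation concrete, here is a minimal finite-state stand-in (our illustration, not from the paper): for a two-state Markov chain the unnormalized conditional law obeys a linear equation driven by the observations, and any fixed stopping rule yields a Monte Carlo lower bound on the value of the stopping problem. The generator Q, observation function h, return k, and the threshold rule are all hypothetical choices.

```python
import numpy as np

# Finite-state stand-in: x_t is a two-state chain, dy = h(x)dt + db, and
# the unnormalized conditional law sigma_t follows the linear equation
#   d sigma = Q^T sigma dt + H sigma dy,  H = diag(h)   (Zakai analog).
# The fixed rule "stop when P(state 1 | Y_t) > 0.9, else at T" gives a
# lower bound on the optimal expected return.
rng = np.random.default_rng(2)
Q = np.array([[-1.0, 1.0], [1.0, -1.0]])   # hypothetical generator
h = np.array([0.0, 2.0]); k = np.array([0.0, 1.0])
dt, T, mu = 1e-3, 2.0, np.array([0.5, 0.5])

def one_run():
    xi, sig = rng.integers(2), mu.copy()   # x_0 ~ mu, filter starts at mu
    for _ in range(int(T / dt)):
        if sig[1] / sig.sum() > 0.9:       # stopping rule
            break
        if rng.random() < dt:              # chain jumps at rate 1
            xi = 1 - xi
        dy = h[xi] * dt + np.sqrt(dt) * rng.standard_normal()
        sig = sig + dt * (Q.T @ sig) + dy * h * sig   # Euler step
        sig /= sig.sum()                   # renormalize for stability
    return (k * sig).sum() / sig.sum()     # normalized expected return

est = np.mean([one_run() for _ in range(200)])
```

Because the rule is admissible but not optimized, `est` underestimates the value; improving the threshold (or the stopping surface) can only raise it.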
3. Interpretation via convex analysis

The following explanation of S(μ) is adapted from Davis and Kohlmann [8]. The functions of the form

    S(\mu) = \sup_\tau E\,(k,\sigma_\tau\mu)

are convex, positive homogeneous, and bounded in the sense (||·|| = variation)

    \sup_{\|\mu\|=1} |S(\mu)| < \infty \quad \text{for bounded } k

be the "reachable" set, we see that S(μ) is the support function of K, in the language of convex analysis. S(μ) is not changed if the bounded but possibly nonconvex set K is replaced by its convex hull co K, and in terms of S(·), co K is given by

    \mathrm{co}\,K = \{f \in C_b : (f,\mu) \le S(\mu) \text{ for all } \mu\}

This has a nice control-theoretic interpretation: the set of which S is the support function consists of those f such that for any μ, given the choice of (f,μ) pesos now or of taking the maximum expected payoff from the stopping problem, you would go for the latter.
We provide here an informal motivation for the mathematical ideas used in the sequel.
It is tempting to conjecture that our problem of stopping with partial information, when
reformulated as a complete information problem on a new state space, will have the same
kind of structure as the well-known stopping problems governed by free boundary problems
or quasi-variational inequalities. This structure is revealed when we view the M⁺-valued
diffusion σ_tμ as having a drift associated with the generator A, and a degenerate kind of
diffusion proportional to h.
To fix ideas, let us use the density formulation of the Zakai equation. For t > 0, the unnormalized measure σ_tμ has a density p_t which, according to the Zakai equation, moves with "drift" A*p, and diffusion "hp" when driven by dy_t. This suggests that our problem might also be governed analytically by a second-order operator based on A and h.

For guidance, let us see how a smooth functional U(t,p_t) would develop in time under the Zakai dynamics: expanding in Taylor's series up to terms of order 2, we find

    U(t+dt, p_{t+dt}) = U(t,p_t) + U_1(t,p_t)\,dt + U_2(t,p_t)[A^*p_t\,dt + h\,p_t\,dy_t]
                        + \tfrac{1}{2}\, U_{22}(t,p_t)[dp_t, dp_t] + \cdots
Here U_1, U_2, U_22 are the ordinary and Fréchet derivatives of U; U_2 is a linear functional acting on a function f to produce U_2(t,p_t)[f], while U_22 is a bilinear functional acting on a pair f, g to produce U_22(t,p_t)[f,g]. Considerations of quadratic variation indicate that U_22 will contribute only through a term U_22(t,p_t)[h p_t, h p_t](dy)². Thus in p-space the analog of a backward operator on a smooth U is

    LU(p) = \tfrac{1}{2}\, U_{22}(p)[h\,p, h\,p] + U_2(p)[A^* p] .

We can expect that under reasonable circumstances the second term can be expressed in an "adjoint" manner: U_2(p) is represented by a function u = u(p) in the domain of A, such that Au is a bounded continuous function, and with ( , ) the natural pairing,

    U_2(p)[A^* p] = (Au, p) .
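In a finite-state analog the adjoint relation above is just the matrix transpose identity; a quick numeric check (with arbitrary stand-in data, our illustration) is:

```python
import numpy as np

# Finite-state check of the "adjoint" pairing: if the derivative U_2(p)
# is represented by a vector u, then U_2(p)[A* p] is the pairing
# u . (A^T p), which equals (A u) . p -- the transpose identity behind
# U_2(p)[A* p] = (Au, p).
rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))   # stand-in generator (arbitrary matrix)
p = rng.random(4)                 # stand-in density / measure
u = rng.standard_normal(4)        # representer of the derivative U_2(p)

lhs = u @ (A.T @ p)               # U_2(p)[A* p]
rhs = (A @ u) @ p                 # (Au, p)
assert np.isclose(lhs, rhs)
```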
is a martingale when wt is Brownian; finally, (iii) use of L leads to rather complete analogs
of classical results for optimal stopping.
(1) Itô lemma: Let F : R₊ × M⁺ → R be C¹ in t and Fréchet C² in μ, such that its first μ-derivative F_2 is represented by a function f_2 in the domain of A such that Af_2 ∈ C_b. Then

    F(t,\sigma_t\mu) = F(0,\mu) + \int_0^t (Af_2(\sigma_s\mu), \sigma_s\mu)\,ds
                       + \tfrac{1}{2}\int_0^t F_{22}(s,\sigma_s\mu)[h\,\sigma_s\mu,\, h\,\sigma_s\mu]\,ds
                       + \int_0^t (h f_2(\sigma_s\mu), \sigma_s\mu)\,dw_s ,
    \qquad f_2 = f_2(s,\sigma_s\mu)

Proof: The proof is long, not quite standard, and in the Appendix, Section 13.
6. Quasi-variational inequality

We guess, in analogy with the finite-dimensional case, that an appropriate value function for our problem will be a positive homogeneous function V : M⁺ → R in the domain of L, such that

    V(\mu) \ge (k,\mu) = \int k\,d\mu

    LV(\mu) \le 0

    LV(\mu)\,\big[V(\mu) - (k,\mu)\big] = 0

is an optimal stopping time. The continuation region R is such that V(μ) > (k,μ) and LV = 0, and it is bounded by a surface V(μ) = (k,μ). Starting inside R we should calculate σ_tμ and stop when V(σ_tμ) = (k,σ_tμ); when starting outside R, we should stop at once.
7. Verification lemma

With all these preliminaries behind us, our first result is that a smooth solution of the QVI must be the value function of the problem.

Proof: Let τ be any stopping time of the Brownian process w_t that drives σ_tμ. By the Itô lemma and the QVI,

    (k, \sigma_\tau\mu) \le V(\sigma_\tau\mu)

Taking expectations, we find that the return E(k,σ_τμ) obtained by stopping at τ is not greater than V(μ), so V ≥ S. For the reverse inequality, consider the special stopping time τ* = inf{t : V(σ_tμ) = (k,σ_tμ)}. Writing

    E(k, \sigma_{\tau^*}\mu) = E\,V(\sigma_{\tau^*}\mu) = V(\mu)

so τ* achieves the upper bound V(μ), and hence is optimal. Only one smooth V can satisfy the conditions of Theorem (2); if there were two, each would equal S, and so the other.
8. Regularity of S

(3) Remarks: It is of course of prime importance here, as in any other problem governed by a QVI or a Bellman equation, to determine when there exist solutions having the regularity requisite for applicability of these analytical methods. This is already a difficult task in the well-known cases, and it is correspondingly harder in the present problem, if only because of the singularity of L. However, some results on regularity can be obtained directly from the definition of the optimal return S(μ) = sup_τ E(k,σ_τμ).

This result is essentially the observation that the upper envelope of a family of linear functions is l.s.c. "Stronger" continuity can be obtained by strengthening the topology to that of the variational norm ||·||.

    E\|\sigma_\tau\mu\| = \|\mu\| .

Similarly, −K||μ₁−μ₂|| is a lower bound. This argument is from Davis and Kohlmann [9].
If we can interpret V(μ) as an achievable average return starting from "knowledge" μ, then the first property says that the policy leading to V is at least as good as stopping at once, while the second says that if you wait for any stopping time τ (while the initial μ moves stochastically under its driving Wiener process to σ_τμ) and then from σ_τμ follow the policy yielding V, then the average return starting from σ_τμ is no larger: you can still do as well as if you had followed the V-policy in the first place. Thus the second property expresses a kind of optimality, since the null stopping time τ = 0 is admissible.

Intuitively, then, we expect (7) to act as a kind of Bellman equation for the noisy stopping problem, with the value function satisfying the two conditions V = ΓV and V(μ) ≥ (k,μ). This guess will be borne out in part by theorems to follow. However, it turns out that these two conditions are only necessary and seemingly not sufficient: such a function V might not be achievable by a stopping policy. Still, we shall show that the smallest such function is the right one, and so characterize the value function (in a manner similar to using the Snell envelope) as the least fixed point of Γ that majorizes (k,μ).
(8) Remark: An analogous equation (and cognate results) can be formulated for the ordinary stopping problems in R^n: v(x) ≥ k(x) and

    v(x) = \sup_\tau E\,v(x_\tau), \qquad \tau \text{ a s.t. of } x_t
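For a finite chain, the value in remark (8) can be computed as the least excessive majorant of k by iterating v ← max(k, βPv). The discount β < 1 and the transition matrix below are our own illustrative assumptions, added so the iteration is a contraction; the undiscounted case needs an absorbing structure instead.

```python
import numpy as np

# Least excessive majorant by value iteration (illustration): iterate
# v <- max(k, beta * P v) from v = k; the limit majorizes k and is
# beta-excessive, i.e. v >= beta * P v, mirroring v(x) = sup E v(x_tau).
P = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])    # hypothetical transition matrix
k = np.array([0.0, 1.0, 3.0])         # hypothetical return function
beta = 0.9                            # discount (added assumption)

v = k.copy()
for _ in range(500):                  # contraction: geometric convergence
    v = np.maximum(k, beta * P @ v)
# at the fixed point: v >= k and v >= beta * P v
```

Wherever v(x) = k(x) it is optimal to stop; the strict-majorization set is the continuation region, the finite analog of R in Section 6.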
(9) Remark: From an analytical viewpoint, introduction of Γ begs all the questions, because so much is hidden in the definition of Γ itself, and you cannot easily calculate with it, or iterate it. This is one reason for trying to use the prima facie more explicit operator L. However, it will appear that the inequalities V ≥ ΓV and LV ≤ 0 are both analogous to the superharmonic (excessive) property. We have

If LV ≤ 0, taking E and sup on τ gives V ≥ ΓV; the reverse inequality ΓV ≥ V is trivially
Let τ be a fixed s.t. The function S is the upper envelope of a family of linear functions: S(μ) = sup_τ E(k,σ_τμ). So S is lower semi-continuous in the weak topology of M, and hence measurable. Let ε > 0 be fixed. By Lusin's theorem, there is a compact subset K of M such that S is continuous on K and Pr{σ_τμ ∈ K} ≥ 1 − ε. For each μ ∈ K there is a neighborhood N_μ such that

    \nu \in K \cap N_\mu \ \text{implies}\ S(\nu) \le S(\mu) + \varepsilon

On N_μ ∩ K then

    E(k, \sigma_{\tau(\mu)}\nu) \ge S(\nu) - 2\varepsilon

From the relative cover {N_μ ∩ K, μ ∈ K} one can pick a finite subcover N_{μ_i} = N_i with corresponding s.t.s τ(μ_i) = τ_i. Define a partition {A_i} of K by A_1 = N_1 ∩ K,

    A_{n+1} = N_{n+1} \cap K - \bigcup_{i=1}^n A_i

Now we can mimic each τ_i starting at τ; that is, there is a stopping time τ̄_i of w_{τ+·} − w_τ with the same law as τ_i, and of course independent of F_τ = events prior to τ, such that τ + τ̄_i is a s.t. Now set τ' = τ + τ̄_i on {σ_τμ ∈ A_i} and τ' = 0 on {σ_τμ ∉ K}; then

    \ge E\, 1_{\{\sigma_\tau\mu \in K\}}\, (k, \sigma_{\tau'}\mu)
(12) Lemma: Let τ be a stopping time, F_τ the σ-algebra of events prior to τ, and τ_i stopping times of w_{τ+·} − w_τ. Let B_i be a countable system of disjoint sets from F_τ, of total probability 1. Then the r.v. τ_0 equal to τ + τ_i on B_i is a stopping time.
then S ≥ V.

Proof: From

it follows that

Similarly

    V(\sigma_{t\wedge\tau}\mu) = V(\mu) + \int_0^{t\wedge\tau} LV(\sigma_s\mu)\,ds + \text{martingale at } \tau

    \int_0^{t\wedge\tau} LV(\sigma_s\mu)\,ds = 0 \quad \text{a.s.}
Proof:

The converse is obvious. Note that the domination condition V(μ) ≥ (k,μ), required by the optimal return S, plays no role here.
We can now put together the following multiple characterization of the optimal return S:

(15) Proposition: For a smooth fixed point V of Γ, satisfying V(μ) ≥ (k,μ), the following conditions are equivalent:

(i) For τ* = inf{t : V(σ_tμ) = (k,σ_tμ)}, V(σ_{t∧τ*}μ) is a martingale

(ii) LV · (V − (k,μ)) ≡ 0

(iii) V(μ) = inf_{U ∈ E} U(μ), E = excessive functions, i.e. U ≥ ΓU.
The equation V = ΓV can be studied directly in an abstract context by several methods of functional analysis. Needless to say, the degrees of difficulty of such approaches are directly related to the degree of regularity sought for the solution. We have seen that S is l.s.c. in the weak topology, and Lipschitz in the variational norm, of M; differentiability of S remains an open question, with a negative answer in general, we suspect.

It is convenient to replace the weak topology of M⁺ by the strong topology induced by the variational norm ||·||, in which it is harder for functions to be continuous. We consider the Banach lattice of bounded uniformly continuous functions φ : M⁺ → R with uniform norm, and pointwise order. The subclass Lip_a, a > 0, is defined by the condition

    |\phi(\mu_1) - \phi(\mu_2)| \le a\,\|\mu_1 - \mu_2\| .
Proof: If φ ∈ Lip_a,

    (\Gamma\phi)(\mu_1) - (\Gamma\phi)(\mu_2)
      \le \sup_\tau E\{\phi(\sigma_\tau\mu_1) - \phi(\sigma_\tau\mu_2)\}
      \le \sup_\tau E\,|\phi(\sigma_\tau\mu_1) - \phi(\sigma_\tau\mu_2)|
      \le a \sup_\tau E\,\|\sigma_\tau\mu_1 - \sigma_\tau\mu_2\|
      \le a\,\|\mu_1 - \mu_2\| .
11. Examples

    dy_t = x_t\,dt + db_t

where

    dm_t = -(\tanh t)\,m_t\,dt + (\tanh t)\,dy_t , \qquad m_0 = 0

    \dot m = Am, \qquad m_0 = (x, 0, 0)

    \int \ell(t,\omega,x)\,\nu_t(dx) = (k,\sigma_t\mu)

    d\nu_t = F_t\,\nu_t\,dt + g_t(\nu_t)\,dy_t

    B = \frac{\partial}{\partial t}
        + \frac{1}{2}\sum_{i,j} g^i g^j\,\frac{\partial^2}{\partial\nu_i\,\partial\nu_j}
        + \sum_i (F\nu)_i\,\frac{\partial}{\partial\nu_i}
    Bu \le 0

    Bu\,\big[u - (k,\nu)\big] = 0

It can be verified that this QVI is exactly equivalent to finding the best admissible stopping time for (17). In particular, the time t functions here as another nonrandom statistic, and

    S(\mu) = \sup_\tau E(k,\sigma_\tau\mu) = u(0,0,\mu)

    x + e^{x^2/2} \int_{-\infty}^{x} e^{-u^2/2}\,du .
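For the example dx = dw, dy = x dt + db, the filter variance solves the Riccati equation p' = 1 − p², p(0) = 0, whose solution p(t) = tanh t is the source of the tanh t gains in the conditional-mean equation above. A short numerical check (ours, not the paper's; step sizes and horizon are illustrative):

```python
import numpy as np

# Verify p(t) = tanh(t) solves p' = 1 - p^2, p(0) = 0, and simulate the
# conditional mean dm = -(tanh t) m dt + (tanh t) dy for dx = dw,
# dy = x dt + db (Kalman-Bucy filter for this example).
rng = np.random.default_rng(4)
dt, T = 1e-3, 3.0
n = int(T / dt)

p = 0.0
for _ in range(n):
    p += dt * (1.0 - p * p)          # Euler step for the Riccati equation
assert abs(p - np.tanh(T)) < 1e-2    # agrees with the closed form tanh t

x, m = 0.0, 0.0
for i in range(n):
    t = i * dt
    x += np.sqrt(dt) * rng.standard_normal()            # dx = dw
    dy = x * dt + np.sqrt(dt) * rng.standard_normal()   # dy = x dt + db
    m += -np.tanh(t) * m * dt + np.tanh(t) * dy         # filter recursion
err2 = (x - m) ** 2   # one-sample squared error; its mean is about tanh(T)
```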
12. Acknowledgements
We note with pleasure the value of discussions with M. H. A. Davis, G. Mazziotto, and L. A. Shepp, and the stimulating influence of the preprint [9] by Davis and Kohlmann. W. H. Fleming suggested the invitation that led to presentation of this material at the IFIP Working Conference on Recent Advances in Filtering and Optimization, Cocoyoc, Mexico, February 1-6, 1982.
with {w_n} a sequence of weights to be chosen. There is no loss of generality in supposing that the φ_n are all C² functions of norm 1 in the domain of A. Furthermore we can choose the weights w_n to decrease so fast that

    \sum_{n=1}^\infty w_n^2\, 4^n < \infty

When the coefficients in A are bounded, we can further adjust the w_n so that Σ_n w_n |Aφ_n(x)| defines a bounded function.

Put ĥ_t = (h,σ_tμ)/(1,σ_tμ). If w_t were really the observed function y_t, then ĥ would be E{h(x_t)|Y_t}. We have the relation
The first E does not involve w, the second is bounded on t-intervals, and the third has mean t.

Thus on A_N the orbit {σ_sμ, 0 ≤ s ≤ t} remains in a weakly compact set K_N^t of positive measures.
As usual, we consider a partition π = {0 = t_0 < t_1 < … < t_n = t} of the interval [0,t]; setting mesh(π) = max |t_{i+1} − t_i|, we start with the identity

    F(t,\sigma_t\mu) - F(0,\mu) = \sum_{i=0}^{n-1} \big[F(t_{i+1},\mu_{i+1}) - F(t_i,\mu_i)\big] = \sum_{i=0}^{n-1} \Delta F_i

Except on sets of arbitrarily low probability the orbit {σ_sμ} stays in a compact set, so a standard uniform continuity argument shows that the sum over t_i ∈ π of the second term on the right is o(1) in pr. We turn then to the third term.
    = F_2(t_i,p_i)[\Delta p_i] + \int_0^1 F_{22}(t_i, p_i + u\,\Delta p_i)[\Delta p_i, \Delta p_i]\,du
      + \int\!\!\int \{F_{22}(t_i, p_i + u\,\Delta p_i) - F_{22}(t_i, p_i)\}[\Delta p_i, \Delta p_i]\,du\,ds

For some s_i ∈ [0,1] the last term is

    s_i \int \{F_{22}(t_i, p_i + s_i u\,\Delta p_i) - F_{22}(t_i, p_i)\}[\Delta p_i, \Delta p_i]\,du

which in turn is equal, for some u_i ∈ [0, s_i], to

    s_i \{F_{22}(t_i, p_i + u_i\,\Delta p_i) - F_{22}(t_i, p_i)\}[\Delta p_i, \Delta p_i] .

Thus we have

    F(t_i, p_{i+1}) - F(t_i, p_i) = F_2(t_i,p_i)[\Delta p_i] + \tfrac{1}{2} F_{22}(t_i,p_i)[\Delta p_i, \Delta p_i]
    (18)\qquad + s_i\{F_{22}(t_i, p_i + u_i\,\Delta p_i) - F_{22}(t_i,p_i)\}[\Delta p_i, \Delta p_i]
We claim that the sum over t_i ∈ π of the third term on the right is o(1) in pr. Thinking of F_22 as a map of [0,1]×M⁺ into the linear vector space of bilinear forms B on M×M, such that

    |B[\mu,\nu]| \le \|B\|\, d(\mu,0)\, d(\nu,0)

Hence

    E\|\sigma_s\mu\|^4 \le 8\|\mu\|^4 + \mathrm{const}\cdot s\,(e^{Ks} - 1)

Thus d_i is the distance, in our special metric for the weak topology of measures on R^d, from σ_{t_{i+1}}μ to σ_{t_i}μ, t_{i+1} > t_i.
Proof: We have

    d_i = \sum_{n=1}^\infty w_n\, |(\phi_n, \Delta\sigma_i\mu)|
        = \sum_{n=1}^\infty w_n \Big| \int_{t_i}^{t_{i+1}} (A\phi_n, \sigma_s\mu)\,ds
          + \int_{t_i}^{t_{i+1}} (h\phi_n, \sigma_s\mu)\,dw_s \Big|

The Lebesgue integral has quadratic variation zero, and the last sum has mean at most

    t\, \sup |h|^2 .
Proof: Write d~ = Xj + M~ where xs is the Lebesgue integral part, and Mr the sum of stochastic
integrals. Evidently, under our choice of ~,., EX 2 _< conn. (tr+l-tD ~, and
Pr{l×,l>oI ~ ~-z ~onn. (tt+l-tt) 2
whence on A_N
and
By (22) and (21) resp., the maximum converges to zero in probability, and the sum Σ d_i² is
bounded in the mean uniformly in π. Since ω_N(u) decreases to zero with u it follows that the sum
goes to zero in pr.; but P{A_N} → 1, so we can drop the 1_{A_N}. This completes the proof that the
third term on the right contributes o(1) in pr. as mesh(π) → 0.
It remains to deal with the first two terms on the right of (18). If F_2 is represented by
the function f_2 in the domain of A with Af_2 ∈ C_b as assumed, then with f^s = f_2(s, σ_s μ),
f_i = f^{t_i}, the first of these terms can be written as
F_2(t_i, μ_i)[Δμ_i] = ∫_{t_i}^{t_{i+1}} (Af^s, σ_s μ) ds + ∫_{t_i}^{t_{i+1}} (hf^s, σ_s μ) dw_s
− ∫_{t_i}^{t_{i+1}} (Af^s − Af_i, σ_s μ) ds − ∫_{t_i}^{t_{i+1}} (hf^s − hf_i, σ_s μ) dw_s
Methods exactly like those already used show that the sums on t_i ∈ π of the absolute values
of the third and fourth terms on the right go to zero in pr. as mesh(π) → 0. The second term
can be written out in the form:
F_22(t_i, μ_i)[Δμ_i, Δμ_i] = F_22(t_i, μ_i)[h σ_{t_i} μ, h σ_{t_i} μ] Δt_i
(23) + F_22(t_i, μ_i)[h σ_{t_i} μ, h σ_{t_i} μ]((Δw_i)² − Δt_i)
+ F_22(t_i, μ_i)[∫_{t_i}^{t_{i+1}} h(σ_s μ − σ_{t_i} μ) dw_s , ∫_{t_i}^{t_{i+1}} h(σ_s μ − σ_{t_i} μ) dw_s]
f I ~ hj(xl(...)(d~)aw,(~l n ~ dim h
II xEA J-I
That (24) defines it as a signed measure rather than just a linear functional on a dense
subspace can be seen from
Δμ_i = ∫_{t_i}^{t_{i+1}} h σ_s μ dw_s .
Again, methods like those used before show that the sums on t_i ∈ π of the absolute values of
all the terms of F_22(t_i, μ_i)[Δμ_i, Δμ_i] in (23) except the first go to zero in pr. as mesh(π) → 0.
References
A. Bensoussan
University Paris-Dauphine and INRIA
INTRODUCTION.
We discuss in this paper the stochastic maximum principle and the dynamic
programming approaches to the problem of stochastic control under partial observation.
We formulate the problem as a problem of stochastic control for a stochastic partial
differential equation, with full observation. The stochastic PDE is the Zakai
equation. A formal approach to the stochastic maximum principle has already been
given by H. Kwakernaak [7]. Our approach is rigorous. We also present an analytic
problem which is a surrogate of Dynamic Programming. The solution of the analytic
problem coincides with the value function of the problem. We present here the
results and some ideas of the proofs. A detailed version of the paper will appear
elsewhere.
1.1. Notation.
Let :
U_ad be a compact nonempty subset of R^k (hereafter the set of control
values)
I *
Setting a_ij = ½(σσ*)_ij we assume that :
ij~ aij ~ + ai
where
(1.6) A* = − Σ_{i,j} ∂²/∂x_i∂x_j (a_ij ·) + Σ_i ∂/∂x_i (g_i ·)
Av A*v .
y^t := (y(s), s ≤ t)
We write :
In (1.9) we can take T = ∞, and replace V by any Hilbert space (in particular
we will use it with V replaced by R^k).
Let
(1.11) ξ ∈ L²(Rⁿ), ξ ≥ 0
(1.12) dp + A*^{v(t)} p dt = p h · dy
p(0) = ξ ,
which we will call the Zakai equation, following the common practice. It is
convenient to define :
~ ai(x,v )
p(o) :
Theorem 1.1 : We assume (1.1), (1.2), (1.3), (1.11); then for any admissible control
v(·) (cf. (1.10)), there exists one and only one solution of (1.15) in the
following functional space :
(1.16) p ∈ L²_Y(0,T;V) ∩ L²(Ω,M,P;C(0,T;H)), ∀T finite,
(1.17) p ≥ 0
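For orientation, a toy numerical sketch of a Zakai equation of the form (1.12) may help. Everything below (the one-dimensional signal taken as a standard Brownian motion, the observation function h(x) = x, the grid and step sizes, and the splitting scheme itself) is our illustrative choice, not the paper's construction:

```python
# Toy splitting scheme for a 1-D Zakai equation dp = (1/2) p_xx dt + x p dy:
# an explicit diffusion step followed by a multiplicative observation
# update exp(h dy - h^2 dt / 2).  Grid and step sizes are illustrative.
import math
import random

random.seed(3)
L, m = 4.0, 81                  # grid on [-L, L] with m points
dx = 2 * L / (m - 1)
xs = [-L + i * dx for i in range(m)]
dt, steps = 0.001, 200
p = [math.exp(-x * x / 2) for x in xs]   # unnormalized initial density

signal = 0.0
for _ in range(steps):
    signal += random.gauss(0.0, math.sqrt(dt))
    dy = signal * dt + random.gauss(0.0, math.sqrt(dt))
    # diffusion part of the generator: p <- p + (dt/2) p_xx (zero at edges)
    lap = [0.0] + [(p[i-1] - 2*p[i] + p[i+1]) / dx**2 for i in range(1, m-1)] + [0.0]
    p = [p[i] + 0.5 * dt * lap[i] for i in range(m)]
    # observation update
    p = [p[i] * math.exp(xs[i] * dy - 0.5 * xs[i]**2 * dt) for i in range(m)]

mass = sum(p) * dx
assert mass > 0 and all(pi >= 0 for pi in p)
```

The point of the sketch is only that p(t) remains a nonnegative, unnormalized density; normalizing by its mass gives the conditional density.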
Bai
Setting :
(1.19) E |p(T)|² + 2E ∫₀ᵀ <A₀p,p> dt = E ∫₀ᵀ ∫_{Rⁿ} a(x,v(t)) p²(x,t) dx dt + |ξ|²
also
(1.20) E e^{−2γT} |p(T)|² + 2E ∫₀ᵀ e^{−2γt} <A₀p,p> dt
Consider next the case of a constant control v(·) ≡ v. Let us write p^v(t) for
the solution of (1.15) at time t, emphasizing the dependence on v and on
the initial condition. The map :
z(o) :
z(t) : TV(t}
E |p(t)|² ≤ c_T ∫₀ᵗ E |p(s)|² ds
and by Gronwall's inequality, it follows that p = 0.
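The Gronwall step can be spelled out; with φ(t) := E|p(t)|², the inequality above reads:

```latex
\[
\varphi(t) := E|p(t)|^2 \;\le\; c_T \int_0^t \varphi(s)\,ds ,
\qquad \varphi \ge 0 ,
\]
\[
\Longrightarrow\quad \varphi(t) \;\le\; 0 \cdot e^{c_T t} \;=\; 0
\quad\text{(Gronwall's lemma with zero inhomogeneous term),}
\]
```

so p(t) = 0 for every t, which is the claimed uniqueness.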
Let us set :
which is a standardised Wiener process with respect to yS+e = ~s. In addition, the
process yS(s) is independent of ye.
which has a unique solution in, say, C(0,T;L²(Ω,M,P;H)). We can claim that the
random variable q^v(t;ξ) with values in H is Y^s-independent. Moreover we have the
following
setting :
for F ∈ B, then we have from Proposition 1.1 the semigroup property
(1.30) Φ^v(t) : C → C.
This follows easily from (1.22), (1.23). Now, since p^v(t) depends linearly on
ξ, it is useful to consider functionals F which are not bounded, but rather have
linear growth.
||F|| = sup
and B₁ is a Banach space.
Similarly, we define :
For F ∈ B₁ or C₁, we have :
2. A STOCHASTIC CONTROL PROBLEM.
(2.2) ~c H
Consider an admissible control v(·) (cf. (1.10)) and let p^{v(·)}(t) be the
corresponding solution of (1.15). We define :
(2.3) J(v(·)) = E [ ∫₀ᵀ (f^{v(t)}, p^{v(·)}(t)) dt + (g, p^{v(·)}(T)) ]
2.2. Preliminaries.
∂g_i/∂v_j , ∂²g_i/∂v_j∂v_l are bounded, i = 1,…,n, j,l = 1,…,k;
∫_{Rⁿ} (∂²f/∂v_i∂v_j (x,v))² dx ≤ constant independent of v
We will set
f′_v = ∂f/∂v (x,v)
and f′_v can be considered as an element of H^k. We start with some preliminaries. Let
u(.) be an optimal control. We denote by p(.) the corresponding state, which is the
solution of
p(o) :
p ∈ L²_{γ,Y}(0,∞;V) ∩ L²(Ω,M,P;C(0,T;H)) ∀T
where we have denoted
The fact that p ∈ L²_{γ,Y}(0,∞;V) follows from the energy inequality (derived from
(1.20))
z(o) = 0
z ∈ L²_Y(0,T;V) ∩ L²(Ω,M,P;C(0,T;H)).
Provided that v(t) is bounded by a deterministic constant, (2.7) has one
and only one solution. In fact, we have the energy equality :
(2.8) E e^{−2γT} |z(T)|² + 2E ∫₀ᵀ e^{−2γt} <A₀z,z> dt + E ∫₀ᵀ …
= −2E ∫₀ᵀ ∫_{Rⁿ} e^{−2γt} Σ_i a_{i,v}(x,u(t)) v(t) p … dx dt
(2.9) d/dε J(u(·) + εv(·))|_{ε=0} = E { ∫₀ᵀ [ (f′_{u(t)} v(t), p(t)) +
+ (f^{u(t)}, z(t)) ] dt + (g, z(T)) }
D
2.3. Maximum Principle.
p(o) = o
p ∈ L²_Y(0,T;V) ∩ C(0,T;L²(Ω,M,P))
= 2E ∫₀ᵀ <p(t), φ(t)> e^{−2γt} dt
It follows that the map φ → p,
φ → E { ∫₀ᵀ (f^{u(t)}, p(t)) dt + (g, p(T)) }
is linear and continuous on L²_Y(0,T;V′).
Therefore, there exists one and only one element λ ∈ L²_Y(0,T;V) such that :
(2.13) E { ∫₀ᵀ (f^{u(t)}, p(t)) dt + (g, p(T)) } = E ∫₀ᵀ <λ(t), φ(t)> dt
@
+ (~(t), ! ~ (ai,v(X,u(t~v(t)p))]dt =
Rn (~j(X,u(t)) +
+ ! @~(x;t),@xiBHi-
3v--j (x,u(t)))p(t,x)dx] dt
where we have used the fact that :
∂a_i/∂v_j = ∂g_i/∂v_j
We can then state the following
Theorem 2.1 : We make the assumptions of Theorem 1.1 as well as (2.1), (2.2), (2.4),
(2.5). We also assume that
Let u(·) be an optimal control for problem (1.15), (2.3) and p the corresponding
trajectory, solution of (2.6). Then there exists λ in L²_Y(0,T;V) such that the
following condition holds :
(2.16) k j - uj(t)) [ i
jZ1(v (~-~-,(,u(t))
~'J
~f x + ~ B~(x;t) ~ . (x,u(t~)p(t;x)dx]
= Rn i~i Bxi
= f ( x , u ( t ) ) exp y ( t ) . h(x)
For a.s. ω, (2.17) has one and only one solution in the functional space :
We will also set :
(2.24) ~ L~
@xk •
and
(2.25) ~ E V.
(2.26) E llG(s)ll ~ ~ c
u ( x , t ) = ~ ( x , t ) exp y ( t ) . h(x)
(2.27) EII;(t)ll~ ~ C
Theorem 2.2 : We make the assumptions of Theorem 2.1, (2.24), (2.25). Then the
adjoint process λ defined in Theorem 2.1 satisfies :
dλ + (A₀λ + Σ_i a_i(x,u(t)) ∂λ/∂x_i + λ|h|²) dt = λh · dy − exp(−y(t)·h) r(t) · dy
+ (f(x,u(t)) + exp(−y(t)·h) r(t)·h) dt
λ(x,T) = g(x)
Moreover there exists one and only one pair (λ,r) with r ∈ L²_Y(0,T;V′)
such that (2.30) holds.
0
We take
(3.2) ~>y
(3.3) S E C1
3.2. Preliminaries.
0
We assume
Let us consider the functional (2.3) over an infinite horizon, with discount β,
namely :
where v(·) is an admissible control and p^{v(·)}(t) is the solution of (1.15).
and vn is Y n measurable.
Lemma 3.5 : Let Φ ∈ B and v(·) ∈ W. Then for τ_n ≤ t < τ_{n+1}, one has
0
Define for v(·) ∈ W
= E(f • v 'p )
~n+l- n' n n
hence finally the formula :
W_h = {v(·) ∈ W | τ_n = nh}
Theorem 3.2. : The assumptions are those of Theorem 3.1. We then have :
We have :
Theorem 3.3. : The assumptions are those of Theorem 3.1 and (2.15). Then we have :
REFERENCES.
ABSTRACT
1. THE PROBLEM
written in the Itô calculus. Here f,g,h are smooth functions mapping
R into R; v(t), w(t) are independent, R-valued standard Wiener processes
mutually independent of the initial condition x₀, which has a smooth,
bounded density p₀(x). We call x the signal and y the observation
process. The filtering problem is to determine an estimate of x(t)
given observations y(s), 0≤s≤t, or, more precisely, the σ-algebra Y_t
generated by {y(s), 0≤s≤t}. For the estimate one usually takes the
conditional mean x̂(t) = E[x(t)|Y_t] since this produces the minimum
mean square error.
I = ∫_{C₀} F(x) dW(x) (1.5)
(1.6) I = lim_{max_{1≤j≤n}|t_j − t_{j−1}| → 0} ∫_R ⋯ ∫_R da₁ ⋯ da_n F(ℓx)
Π_{j=1}^{n} exp[−(a_j − a_{j−1})²/2(t_j − t_{j−1})] / [2π(t_j − t_{j−1})]^{1/2}
I = ∫_{C₀} exp[ ∫₀ᵗ V(x(s)) dy(s) ] dW(x)
(1.10) = (2π)^{−n/2} ∫_{Rⁿ} exp[ Σ_{i=1}^{n} V(x_{i−1} + …) Δy_{i−1} ]
· exp[ −½(u₁² + ⋯ + u_n²) ] du₁ ⋯ du_n + e_n
where
The approximation error is
process, it is necessary to make a change of coordinates or a change
of measure (i.e., a Girsanov transformation) in (1.1) to take advantage
of this structure.
2. COORDINATE TRANSFORMATIONS
If the coefficients in (1.1a) are sufficiently smooth, we can change
coordinates (pathwise) so that the resulting diffusion is a Wiener
process. When the coefficients are not smooth or when x(t) is a multi-
variable process, this procedure may not work, and a change of measure,
i.e., a Girsanov transformation, may be required to implement our compu-
tational algorithms.
Suppose
(A1) g(x) ≥ g₀ > 0 for some g₀ and all x ∈ R
∫₀^∞ dx/g(x) = ∫_{−∞}^0 dx/g(x) = +∞
+ h(~-I ( z ) ) u ~ Z )
where
(2.6)
R(z) = h(φ⁻¹(z))
In situations where f and g are not smooth, i.e., (A3) does not hold,
we must consider some more general type of weak transformation to
accomplish the reduction of the Zakai equation. Also, when x(t) is an
R^d-valued process with d ≥ 2, it will be necessary for the integrand
in (2.4) to be a gradient. That is, with g(x) d×d and non-singular,
g⁻¹(x)f(x) − ½ g′(x) will have to be a gradient for the exponential
transformation (2.4) to have a meaning.
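For the smooth scalar case the coordinate change can be made explicit; this is a standard Itô-formula computation, reconstructed here under the assumption that φ is the scale map with φ′ = 1/g:

```latex
\[
z = \varphi(x), \qquad \varphi'(x) = \frac{1}{g(x)}, \qquad
dx = f(x)\,dt + g(x)\,dv ,
\]
\[
dz = \varphi'(x)\,dx + \tfrac12 \varphi''(x)\,g^2(x)\,dt
   = \Big[\, g^{-1}(x) f(x) - \tfrac12 g'(x) \,\Big]\,dt + dv ,
\]
```

so the transformed process is a unit-diffusion process whose drift is exactly the quantity g⁻¹f − ½g′ evaluated at φ⁻¹(z); this is the expression that must be a gradient in the multivariable case.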
and <·,·> is the Euclidean inner product. The error in (3.2) is inter-
preted as in the Lemma (even though it has a deterministic component
which is O(n⁻²)). Efficient evaluation of the n-fold integral (3.3a)
is our main objective here.
where
K̂_n(t,z,w,ŷ) = exp{ Σ_{i=0}^{n−1} [H(w_i)ŷ_i − ½H²(w_i)t/n]
− (n/2t)[(w₀−z)² + (w₁−w₀)² + ⋯ + (w_{n−1}−w_{n−2})²] }
If we now define
g(t,w,y) = H(w)y − ½H²(w)t/n − V(w)t/n (3.8)
(3•9) k
• I k _ l ( t , Z , W k _ l ) d W k ~ I, k = 2,3 .... ,n-l,
then
"In_l(t,Z,Wn_i)dWn_ 1 (3•9) n
where C n ( t ) ~ (21/2n/-ffn)/(t 2 / ~ ) .
Remarks 1. The integrals I₁, I₂, …, I_{n−1} are independent of the initial
data; and so, (3.9) has the form of a Green's representation
3.2 Implementation
Suppose we require the evaluation of v(t,z) = I_n(t,z) + O(n⁻²) at points
z = a₁, a₂, …, a_m for some m ≥ 1, and t = t₁, t₂, …, t_N = T. The t_j are pre-
specified observation times. Let n₁, n₂, …, n_N be integers with n_j ≥ 1,
j = 1,2,…,N. Consider the discretization of the time interval shown
below
jl = ( t / / n ) d i a g [ ¢ ~ 7 ~ , 1 ..... 1,/~]
(4.i)
j2 = d i a g [ ( l _ a i ) - ½ ]
In(t,z) = Cn(t,z)I
n-i
[~ e
_~2/¢ []Kn(t,z,i,~l)d (4.3)
Rn i=0
where
n-i ½ _nz2/t2/2n/2
Cn(t,z ) = 2/2/3 ~ (l-al) e
i=0
n-i
K(t,z,_~,y ) = e x p { [ ~ 0 - $ l ( ~ - l ( W n _ l )1 + Z g(t,wi,Yi)]%r=j~
i=0
n-i 2 n-2
- i=0
Z aii~ i - 2Zi=0 a i ' i + l E i ~ i + l - b 0 ~ o ~
2 ~7~/Jn (4.4)
b0 = - ~ z
a_{ii} = α_i/(1−α_i)
=
[ -(i/¢~)//(i-%)(ii%) , i = 0
<i n-3
ai,i+ l , _ _
[ -(i/~)//(l-an_2)(l-an_ I) , i=n-2
r¹_i(ξ_i,t) = exp[H(J_i ξ_i)]
r²_i(ξ_i,t) = exp[−½H²(J_i ξ_i)t/n − V(J_i ξ_i)t/n]
s_k(ξ_k, ξ_{k+1}) = exp[a_{k,k+1} ξ_k ξ_{k+1} − a_{kk} ξ_k²] , 0 ≤ k ≤ n−2
s_{n−1}(ξ_{n−1}) = exp[(Φ₀ − Φ̂)(φ⁻¹(J_{n−1} ξ_{n−1}))]
• S n _ l(En_l~n ) In_l(~n_l,Z)d~n_ i
where
en(t,z) = [Cn(t,z)]i/n (4.8)
Note that the functions r¹_k, r²_k, s_k do not depend on k, for k = 1,…,n−2.
(4.9) I ≈ S + e_M , S = Σ_{i=1}^{M} A_i f(a_i)
where A_i > 0 are weights and a_i are the (real) zeros of the Mth Hermite
polynomial
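The quadrature rule (4.9) is the standard Gauss–Hermite rule; a minimal check with NumPy (our code, not the paper's) illustrates the nodes, the positive weights, and the exactness of the rule on low-degree polynomials:

```python
# Gauss-Hermite rule (4.9): nodes a_i are zeros of the M-th Hermite
# polynomial, weights A_i > 0, approximating int f(x) exp(-x^2) dx.
import numpy as np
from numpy.polynomial.hermite import hermgauss

def gauss_hermite(f, M):
    """S = sum_i A_i f(a_i), approximating the weighted integral."""
    a, A = hermgauss(M)          # nodes and positive weights
    return np.dot(A, f(a))

# int exp(-x^2) dx = sqrt(pi)
S = gauss_hermite(lambda x: np.ones_like(x), 8)
assert abs(S - np.sqrt(np.pi)) < 1e-12

# int x^2 exp(-x^2) dx = sqrt(pi)/2  (rule is exact up to degree 2M-1)
S2 = gauss_hermite(lambda x: x**2, 8)
assert abs(S2 - np.sqrt(np.pi)/2) < 1e-12
```

The exactness up to degree 2M−1 is what makes the error term e_M depend only on the (2M)-th and higher derivatives of the integrand, as used in the error analysis below.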
Ik+l(aj,a i) = C n ( t , a i ) [ S k ( a j , a i ) + e k] (4.11)
where
M i ]Ykr2
Sk(aj,a i) = Z A~[r (a~,t) (a£,t)
4=1
" s ( a ~ , a j ) I k ( a ~ , a i)
M
(4.12)
-" Z A~ [rl (a~, t) ] Ykr2 (a£, t)
4=i
M
~ k = {$k(~aj'ai)}i,J=l (4.13)
and let
Then
(4.15) S_k(a_j,a_i) = Σ_{ℓ=1}^{M} A_ℓ [r¹(a_ℓ,t)]^{y_k} R(t,i,j,ℓ) S_{k−1}(a_ℓ,a_i)
The terms A_ℓ, r¹(a_ℓ,t), R(t,i,j,ℓ) can be precomputed "off-line". In
this case, calculation of the matrix S̄_k consists of
(i) raising the elements of the vector r¹(a_ℓ,t),
ℓ = 1,2,…,M, to the y_k power;
(ii) performing M² vector products of
ρ_ij = {A_ℓ [r¹(a_ℓ,t)]^{y_k} R(t,i,j,ℓ), ℓ = 1,…,M}
and S̄_{k−1} = {S_{k−1}(a_ℓ,a_i), ℓ = 1,…,M}
for i,j = 1,2,…,M.
To compute I_{k+1}(a_j,a_i), it is necessary to multiply S_k(a_j,a_i) by
C_n(t,a_i).
Thus, a total of 2M³ + M² + M elementary operations is required to
compute the M² entries I_{k+1}(a_i,a_j), i,j = 1,…,M. This is, for
k = 1,2,…,n−2, a total of (n−2)(2M³+M²+M) operations. The operation
count for I₁(a_ℓ,a_i) is the same, adding (2M³+M²+M) operations. Only
2M² + 2M operations are required to compute the M-vector
G(ξ,α) = exp[−( Σ_{i=0}^{n−1} a_{ii}(α_i)ξ_i² + Σ_{i=0}^{n−2} a_{i,i+1}(α_i,α_{i+1})ξ_i ξ_{i+1} + b₀(α₀)ξ₀ )]
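The cost accounting above can be made concrete: each step of the recursion (4.15) forms the M×M matrix S_k from S_{k−1} by an inner product over the index ℓ, an O(M³) contraction. A small sketch, where all array names are ours and R is a random stand-in for the precomputed terms:

```python
# Per-step recursion cost sketch: S_k(j,i) is an O(M^3) contraction
# over ell, matching the (n-2)(2M^3 + M^2 + M) operation count above.
import numpy as np

def step(S_prev, A, r1_pow, R):
    """S_k(j,i) = sum_ell A[ell] * r1_pow[ell] * R[i,j,ell] * S_prev[ell,i]."""
    M = S_prev.shape[0]
    S = np.zeros((M, M))
    for i in range(M):
        for j in range(M):
            S[j, i] = np.sum(A * r1_pow * R[i, j, :] * S_prev[:, i])
    return S

M = 4
rng = np.random.default_rng(0)
A, r1_pow = rng.random(M), rng.random(M)
R, S0 = rng.random((M, M, M)), rng.random((M, M))
S1 = step(S0, A, r1_pow, R)
# the same contraction in one einsum call:
S1_fast = np.einsum('l,l,ijl,li->ji', A, r1_pow, R, S0)
assert np.allclose(S1, S1_fast)
```

The "off-line" precomputation in the text corresponds to building A, r¹ and R once; only the power y_k and the contraction are done per step.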
assume F has continuous derivatives up to order 2M+1, bounded by a
constant F₀ possibly depending on the observation path y(t). By
completing the square with respect to ξ_i in G(ξ,α) and employing a
bound for Hermite polynomials [12],
H_j(x) e^{−x²} ≤ c₀ 2^{j/2} √(j!) e^{−x²/2} , c₀ ≈ 1.086
and dʲe^{−x²}/dxʲ = e^{−x²} H_j(x)·(−1)ʲ, it may be shown that
I
n- 1
EM(t,z, E) -< CIe(M) Z Pi(~) sup e-Qi (i, -~) (5.6)
i=0
-n/2
Cl(n,t,z) ~ C0F n0 ~ (2) t-n e x p ( _ n z 2 / t 2)
n-i -1/2 )=m 2m 2m , 1 ~/2
Pi(~) = (j=o(l-~j)n ) (J~)-m(l-~i j=0Z ( j ) j~-~.'(2Ji ~i )
i 2 i T -i
exp[~ bi/aii+ ~ BiA i Bi] (a_)
B0= , Bi=
(5.7) |e_M(t,z,α*)| ≤ E_M(t,z,α*) ≤ C₁ ε(M) Σ_{i=0}^{n−1} P̄_i(α*)
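The Hermite-polynomial bound quoted from [12] (Cramér's inequality, with constant ≈ 1.086) can be checked numerically; this snippet is only a sanity check of the inequality, not part of the error analysis:

```python
# Numerical check of Cramer's inequality for physicists' Hermite
# polynomials: |H_j(x)| exp(-x^2/2) <= c0 * 2^(j/2) * sqrt(j!), c0 ~ 1.086.
import math
import numpy as np
from numpy.polynomial.hermite import hermval

x = np.linspace(-10.0, 10.0, 20001)
c0 = 1.086
for j in range(10):
    coeffs = [0.0] * j + [1.0]            # coefficient vector selecting H_j
    lhs = np.abs(hermval(x, coeffs)) * np.exp(-x**2 / 2)
    bound = c0 * 2**(j / 2) * math.sqrt(math.factorial(j))
    assert lhs.max() <= bound
```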
It is robust in the sense that α* does not depend on the data f,g,h,
or the observation process y(t). Of course, the dependence of
(5.7) on (m,n) is crucial. At present, the asymptotic behavior of
α* as m → ∞ has not been determined, but analysis of similar integrals
with A_i diagonal suggests that the performance should be good.
Remark : The assumption that F(w) have bounded derivatives is not
very restrictive. From (2.6), (3.8), (5.4) it is clear that we include
the cases
(i) f,g,h and their derivatives bounded;
(ii) g constant, f,h together with their derivatives of
"polynomial" growth, and lim_{|x|→∞} sgn(x) f(x) < ∞.
More general classes of f,g,h may be identified from (2.6), (3.8), (5.4).
REFERENCES
J.M.C. Clark
INTRODUCTION
there are two or more forcing Brownian motions driving non-commutative vector fields
then the best order is again n^{−½} [1].
Newton [6], in an analysis of approximations for bilinear equations with
Brownian motions driving commuting vector fields, has shown that a truncated Taylor-
series scheme is efficient, in the sense of possessing the best rate of convergence,
only if most terms of third order are retained. This suggests that the schemes just
mentioned, which are basically second-order methods, would generally be inefficient
when applied to equations of the type (1).
This paper considers the efficiency of a "prototypical" approximation scheme
for (1). It is shown that the solutions of a sequence of relatively simple differ-
ential equations driven by piecewise linear approximations of the process w form
a scheme that is efficient for all sufficiently regular distributions of w. This
is done by means of a central limit theorem for the conditional distributions of the
normalized error. The technique of representing the solution of a differential
equation as the composition of the solutions of two simpler equations will be used
repeatedly in the proofs; this technique is at the heart of the work of Doss [2]
and Sussmann [10] and has also been used by Kunita [3] in a more general context.
AN ODE APPROXIMATION SCHEME
[g,f]_i = Σ_j (g_j ∂f_i/∂x_j − f_j ∂g_i/∂x_j). Then (η^n_t) is our scheme of approximations for X_t. The
presence of the double Lie bracket in (2) distinguishes it from the familiar Wong–
Zakai piecewise linear scheme [11],[4]. It also has an incidental geometric virtue not
possessed by the simpler difference schemes : suppose f and g are C_b^∞ and N is
a maximal integral manifold generated by f and g with a dimension d₁ possibly less
than d. It follows from the special structure of (1) ([2], [10], also Lemmas 1 and
2) that if x ∈ N then X_t(x,·) stays in N for all t. However the vector field
generating η^n_t depends only on f and g and their brackets and so η^n_t also stays
in N.
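The idea behind the scheme — replace w by its piecewise linear interpolant on each subinterval and solve the resulting ODE — can be illustrated on the scalar Stratonovich equation dx = x∘dw, whose exact solution is x₀ exp(w(t)). Names and step sizes below are illustrative, not the paper's:

```python
# Piecewise-linear (Wong-Zakai type) approximation for dx = x o dw:
# on each subinterval the slope dw/dt is constant, so we solve the
# linear ODE x' = x * (dw/dt) by small Euler substeps.  The result
# should approach x0 * exp(w(T)) as the discretization is refined.
import math
import random

def ode_step(x, dwdt, dt, substeps=200):
    # solve x' = x * dwdt over an interval of length dt
    h = dt / substeps
    for _ in range(substeps):
        x += x * dwdt * h
    return x

random.seed(1)
T, n = 1.0, 2000
dt = T / n
x, w = 1.0, 0.0
for _ in range(n):
    dw = random.gauss(0.0, math.sqrt(dt))
    x = ode_step(x, dw / dt, dt)
    w += dw
assert abs(x - math.exp(w)) / math.exp(w) < 1e-2
```

For this commuting (scalar) example the ODE solutions track the Stratonovich solution directly; the double-bracket correction in (2) is what is needed to achieve the efficient rate in the general case.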
It turns out that the most natural way of studying the asymptotic behaviour
of η^n_t (= η^n_t(·,w)) is by the action of its inverse map; roughly speaking, if x lies
in a maximal integral manifold N of f and g, then in the limit n((η^n_T)^{-1}∘X_T(x) − x)
behaves like a normal random vector in the tangent space of N at x. Let P_n be
the completion of P_{D_n} by the P-null sets in B_T.
Theorem 1: Suppose f and g are of class C³_b and [g,f] and [g,[g,f]] are of class
C²_b.
i) If P = P₀, Wiener measure, then for almost all w, any version of the P_n-
conditional distribution of n((η^n_T)^{-1}∘X_T(x) − x) converges weakly to a normal distribution
in R^d with mean at the origin and covariance matrix
(3) V₀(x,w) := (T²/12) ∫₀ᵀ {X_t*^{-1}[g,f](X_t(x,w))}{X_t*^{-1}[g,f](X_t(x,w))}′ dt.
EFFICIENCY
of approximations of c′X_T in the sense that for all infinite subsequences of integers
I for which (P_n : n ∈ I) is increasing,
liN~i--Var[c,X~ipnj -l
for almost all w in the set {w : c′V_T(x,w)c > 0}.
It is of interest to determine conditions under which the covariance matrix
V_T(x,w) is a.s. positive definite. It is instructive to look at the elementary
Gaussian case where
dX_t = AX_t dt + b dw(t), X₀(x,w) = x.
Then (6) reduces to
dU_t = AU_t dt + (T/√12) Ab dv(t), U₀ = 0,
and V_T(x,w) becomes the "controllability" Gramian
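In this Gaussian case positive definiteness of V_T amounts to controllability of the pair (A, Ab). A small numerical illustration (matrices chosen by us, not from the paper) computes the Gramian from the Lyapunov ODE W′ = AW + WA′ + (Ab)(Ab)′:

```python
# Controllability Gramian W(T) = int_0^T e^{As} (Ab)(Ab)' e^{A's} ds,
# computed by Euler-integrating the Lyapunov ODE
# W' = A W + W A' + (Ab)(Ab)',  W(0) = 0.
import numpy as np

def gramian(A, q, T, steps=20000):
    Q = np.outer(q, q)
    W = np.zeros_like(Q)
    h = T / steps
    for _ in range(steps):
        W = W + h * (A @ W + W @ A.T + Q)
    return W

A = np.array([[0.0, 1.0], [-1.0, 0.0]])
b = np.array([0.0, 1.0])
W = gramian(A, A @ b, 1.0)
# (A, Ab) is a controllable pair here, so W is positive definite:
assert np.all(np.linalg.eigvalsh(W) > 0)
```

Replacing A by a matrix for which [Ab, A²b, …] is rank-deficient would make the smallest eigenvalue of W collapse to zero, i.e. V_T degenerate.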
PROOFS
We begin with some lemmas. In what follows, rather than saying that a
sequence of continuous functions (q_n) converges uniformly on bounded sets to a
continuous limit q_∞, we shall say that (x,n) → q_n(x) is "continuous on R^d × N̄", where
N̄ is the usual one-point compactification N ∪ {∞} of the integers. |·| will denote
Euclidean norms in various finite dimensional spaces, and also the uniform norm in
C[0,T].
Suppose g : R^d → R^d is a vector field of class C³_b and q : R^d × N̄ → R^d an indexed vector
field with bounded first and second derivatives continuous in (x,n). Let y_t(x)
solve
ẏ_t = g(y_t), y₀(x) = x ∈ R^d, t ∈ R (7)
and solves the corresponding ordinary differential equation for all w that are piece-
wise C¹.
ii) The x-derivatives of y_t(x) up to third order are continuous in t and
x and bounded by ke^{kt} for some constant k.
iii) The first and second x-derivatives of ỹ_t := ỹ_t(·,w,n) and those of its
inverse ỹ_t^{-1} are all continuous and bounded uniformly in t, x, and n for bound-
ed w.
Proof: (i) is simply an amalgam of theorems in [2] and [10]. A standard theorem
of ODEs shows that y_t and its x-derivatives exist and are continuous in t and x.
Furthermore, for d = 1,
Dẏ_t = Dg·Dy_t , Dy₀ = 1,
bounded w. But it is easy to verify that y_{t,0} = and so the proof of (iii) is
complete.
To simplify the notation, from now on we shall generally suppress the argu-
ments n and w.
Lemma 2: Suppose X̃_t is defined by (2) and f, g and [g,[g,f]] are of class C¹_b. The
Jacobian matrices DX_t, DX_t^{-1} and D(X_t*^{-1}k), where k is a C¹_b vector field on R^d, are
continuous on R^d × N̄, bounded for bounded w and P_n-measurable for fixed t, x and n.
because w̃(ih) = 0. Now let L_g f denote [g,f] regarded as the Lie derivative of the
vector field f in the direction of g. The first term of (10) can be expanded as
a Lie series. So
|X_t| ≤ K exp K|w̃(t)|·|X_t|, and so E|X_t|^r < ∞ also for all r ≥ 1. Now it follows
from Lemma 1 (ii) and the Lipschitz nature of L³_g f that |R₁(t)| ≤ k₁ exp k₂|w̃(t)|·|X_t|·
|w̃(t)|³. Since |w̃(t)| is O(h^½) in L_r for all r ≥ 1, uniformly in t, Hölder's
inequality implies that |R₁(t)| is O(h^{3/2}) in L_r for all r uniformly in t. Simil-
arly by Lemma 3, |R₂(t)| is uniformly O(h).
It is convenient at this point to introduce a special notation for orders.
We shall say a parameterized process U(t,x,n,w) is "O_c(h^α)" if for all integers
r ≥ 1, sup_{t,x,n}(E|U/(b_n h^α)|^r) < ∞ where, as before, h = T/n, and where (b_n(w)) is a
positive (P_n)-adapted sequence bounded uniformly in n. It is clear from Lemma 2
that DX_t, DX_t^{-1} and DX_t^{-1}[g,f] are all bounded by such a (P_n)-adapted sequence b_n and
so are O_c(1) in this sense. Now consider the "error" process Z_t := X̃_t^{-1}∘X_t. We have
Z₀(x) = x. Ordinary calculus yields
(11) Ż_t = X̃_t*^{-1}[g,f](Z_t) w̃(t) + DX̃_t^{-1}(X_t)(R₁(t) + R₂(t))
|X̃_t*^{-1} L^i_g f(t) − X̃_s*^{-1} L^i_g f(s)| ≤ C_n |x| h^{½−ε} , i = 1 or 2
for all |t−s| ≤ h, 0 ≤ s ≤ t ≤ T, and for 0 < ε < ½. These bounds and the bounds
on the derivatives of the vector fields of (11) given by Lemma 2 show that it can be
transformed into the discrete form
ih+h~ 2 1 2
where ~i~ = fih+h ~(£)dt and ~i u = ~h w(t) dt - ~ .
ih
The remainder R_{3,i} is O_c(h^{5/2−ε}). Both J₃(i) and J₄(i) are P_n ∨ B_{ih}-measurable.
Δ_i w̃ and Δ_i u are independent of this σ-field and have zero mean. It follows from
the Brownian bridge properties of w̃ and the usual moment inequalities for martingale
transforms that for some (P_n)-adapted sequence d_n, bounded in n,
(E|Σ_{i=0}^{n−1} J₄(i)Δ_i u|²)^{½} = O(h^{3/2})
(12) Z_T(x) − x = Σ_{i=0}^{n−1} X̃_{ih*}^{-1}[g,f](x)Δ_i w̃ + (Σ_i J₅(i)·(Z_{ih}−x)·Δ_i w̃ + R₄)
=: J₆ + R₅
M_n → M a.s. (P₀). Since E₀[M²] = E[M] < ∞, we have that the tail of the quadratic
variation
E₀[(M−M_n)²|P_n] = E₀[M²|P_n] − M_n²
→ 0 a.s. (P₀).
To prove (ii) it is sufficient to establish convergence to the corresponding limits of
the moments E[|ζ_n|² | P_n], and the values of the characteristic function E[exp ic′ζ_n | P_n]
for rational c ∈ R^d. We consider just the first; the argument for the others is
the same. Now M > 0 a.s. (P₀) implies that M_n > 0 a.s. (P₀). We have, by
routine arguments,
E[|ζ_n|² | P_n] = M_n^{−1} E₀[M|ζ_n|² | P_n]
|E[|ζ_n|² | P_n] − E₀[|ζ_n|² | P_n]| ≤ M_n^{−1} E₀[(M−M_n)² | P_n]^{½} E₀[|ζ_n|⁴ | P_n]^{½} ;
so it follows from (i) and the properties just established for M_n that this con-
verges to zero a.s. (P₀) and that E[|ζ_n|² | P_n] converges to the correct limit.
For cases where P₀(M = 0) > 0, we restrict our attention to those w for which
M > 0; since {M > 0} is P_n-measurable and P(M > 0) = 1, the results still hold a.s.
(P). Similarly, we have that
P(E[exp ic′ζ_n | P_n] → exp(−½c′V₀c), all rational c ∈ R^d) = 1
X_T(x) = X̃_T(x) + DX̃_T(Z_T(x)−x) + O_c(n^{−2})
The limits of the conditional distribution and moments of n(X̃_T−X_T) can be established
in exactly the same way as in the proof of Theorem 1 and it is easy to see that these
correspond to the conditionally normal distribution of the Itô (or Stratonovich)
integral
(13) U_T := DX_T(x)·(T/√12) ∫₀ᵀ X_t*^{-1}[g,f](x) dv(t)
where v(t) is a Brownian motion independent of w(t). But it follows from Lemma 2 that
d(DX_t) = Df(X_t)·DX_t dt + Dg(X_t)·DX_t ∘dw(t),
REFERENCES
[1] J.M.C. Clark, R.J. Cameron. The maximum rate of convergence of discrete
approximations for stochastic differential equations. B. Grigelionis (Ed.),
Stochastic Differential Systems, Proc. IFIP-WG 7/1 Working Conference, Vilnius
1978, Springer-Verlag, Berlin 1980, pp. 162-171.
[4] E.J. McShane. Stochastic Calculus and Stochastic Models. Academic Press, NY 1974.
[6] N.J. Newton. PhD Thesis, E.E. Dept., Imperial College, Univ. of London 1982.
[10] Hector J. Sussmann. On the gap between deterministic and stochastic ordinary
differential equations. Ann. Prob. 1978, 6(1), pp. 19-41.
M.H.A. Davis
Department of Electrical Engineering
Imperial College, London,
ENGLAND
I. PROBLEM FORMULATION
dx_t = b(x_t, u_t) dt + g(x_t) dv_t (1)
dy_t = h(x_t) dt + dw_t (2)
dx_t = g(x_t) dv_t
dP_u/dP = exp( ∫₀ᵀ g^{-1}(x_s) b(x_s,u_s) dv_s° − ½ ∫₀ᵀ |g^{-1}(x_s) b(x_s,u_s)|² ds
+ ∫₀ᵀ h(x_s) dy_s − ½ ∫₀ᵀ h²(x_s) ds )
P(dx,dy,du) = P^u(dx) ν(du,dy)
and finally the measure P^ũ is given by the exponential transformation
(5) as before. The cost corresponding to ũ ∈ A is
J(ũ) = E^ũ[Φ(x_T)] = ∫ Φ(x_T) dP^ũ
E^u[Φ(x_T)|Y_T] = E^{0u}[Φ(x_T)Λ_T|Y_T] / E^{0u}[Λ_T|Y_T] =: σ_T(Φ)/σ_T(1) (6)
σ_T(Φ) = ∫_{R^d} Φ(x) σ_T(dx)
Returning now to (6) we have
J(u) = E^u[σ_T(Φ)/σ_T(1)]
= E^{0u}[Λ_T σ_T(Φ)/σ_T(1)]
= E^{0u}[E^{0u}[Λ_T|Y_T] σ_T(Φ)/σ_T(1)]
= E^{0u}[σ_T(Φ)] (7)
A(u)f(x) = ½ Σ_{i,j} (g(x)g′(x))_{ij} ∂²f/∂x_i∂x_j + Σ_i b_i(x,u) ∂f/∂x_i
Recall that under measure P^{0u}, (y_t) is a BM. Equations (7), (8) give the
control problem in "separated" form. The new "state" is the un-
normalized conditional distribution σ_t whose "dynamics" are given by
the Zakai equation (8), and the cost is the linear functional (7).
The initial condition for (6) is σ₀ = π where π is the given probabil-
ity distribution of the initial state x₀. If this has a density p₀(x)
in a suitable function space then σ_t has a density p(t,x) which satis-
fies the "forward" version of equation (6), namely
dp(t,x) = A*(u_t) p(t,x) dt + h(x) p(t,x) dy_t (9)
p(0,x) = p₀(x)
dy_t = h(x_t) dt + dw_t
where (w_t) is a BM independent of (x_t). The unnormalized conditional
distribution is given by
dq_t = A′(u_t) q_t dt + H q_t dy_t (10)
Here A(u) is the matrix with i,jth entry a_ij(u) and H is the diagonal
matrix with i,ith entry h(i). This problem is formulated for Y_t-
adapted or wide-sense controls in a way exactly similar to the diffusion
case. By conditioning on Y_t the cost function (3) becomes
where Φ′ := (Φ(1),…, Φ(N)) and <·,·> denotes the inner product in R^N.
Thus the "separated" problem (10)–(11) concerns control of the degen-
erate process (q_t). This is an instructive case to study as it is a
problem of partial observations where the conditional distribution is
finite-dimensional. It has been studied in detail by Bismut [8] but
his methods do not generalize to the diffusion case.
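A toy simulation of the finite-dimensional Zakai equation (10) for a 2-state signal shows the process (q_t) directly; the generator, observation vector and step sizes are illustrative choices, not from the paper:

```python
# Finite-state Zakai filter dq = A' q dt + H q dy for a 2-state chain
# (no control dependence here).  q_t stays unnormalized; the conditional
# distribution is q_t / sum(q_t).
import math
import random

random.seed(0)
N, T, steps = 2, 1.0, 1000
dt = T / steps
A = [[-1.0, 1.0], [1.0, -1.0]]          # generator of the signal chain
h = [0.0, 2.0]                           # observation drift h(i)

state = 0
q = [0.5, 0.5]                           # prior
for _ in range(steps):
    # signal transition over a small step
    if random.random() < -A[state][state] * dt:
        state = 1 - state
    dy = h[state] * dt + random.gauss(0.0, math.sqrt(dt))
    # Euler-Maruyama step for dq = A' q dt + H q dy
    q = [q[i] + sum(A[j][i] * q[j] for j in range(N)) * dt + h[i] * q[i] * dy
         for i in range(N)]

p = [qi / sum(q) for qi in q]            # normalized filter
assert abs(sum(p) - 1.0) < 1e-12 and all(pi > 0 for pi in p)
```

This is exactly the degenerate diffusion (q_t) of the separated problem: the noise enters only through the scalar observation process y.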
IV EXISTENCE OF O P T I M A L CONTROLS
j(~) = / ~T(~) d~
G
V DYNAMIC PROGRAMMING
<A(u_t)p_t, q_t> = max_{a∈U} <A(a)p_t, q_t> (13)
X. REFERENCES
Let S denote a complete separable metric space and M(S) denote the space
of bounded Radon measures on S. With an appropriate choice of metric M(S) is
also a complete separable metric space whose topology is equivalent to that of
weak convergence of measures. Let E denote a closed subset of M(S), B(E) the
σ-algebra of Borel subsets of E and L∞(E) the space of bounded measurable func-
tions on E. Let Ω_C := C([0,∞),E), Ω_D := D([0,∞),E), the spaces of functions
from [0,∞) into E which are continuous, right continuous with left limits, res-
pectively. For s ≥ 0, ω ∈ Ω_C, let X(s,ω) := ω(s) and let
weak convergence,
(2.2) P_μ({ω : X(0,ω) = μ}) = 1,
(2.3) (strong Markov property) for μ ∈ E, t ≥ 0, f ∈ L∞(E), and all a.s. finite
F_t-stopping times τ, E(f(X(τ+t))|F_τ) = T(t) f(X(τ)), P_μ-a.s., where
(2.6) for each probability law P_μ the canonical process {X(t) : t ≥ 0} is a so-
lution to the martingale problem for A.
Let Φ denote a subset of L∞(E) which generates L∞(E) under bounded point-
wise convergence (e.g. C_b(E)). Assume that for each t ≥ 0 and F ∈ Φ there is
a function A_{F,t} ∈ L∞(E) such that for μ ∈ E,
where
The set D(L) ⊂ C(E), the space of continuous functions on E, must be chosen
so that the first and second variational derivatives exist and can be computed, and
it must also be sufficiently rich so that it generates L∞(E) under bounded point-
wise convergence. With this motivation we now introduce a family of functions on
M(S) which has these properties.
D := ∪_{N=0}^{∞} D(S^{*N}), where for each N, D(S^{*N}) is assumed to be a dense sub-
set of C(S^{*N}).
For f ( C, a function on M(S) of the form
feD.
2.2. LEMMA
such that Ill f - fnllIN "~ 0 as n ÷ ~, then IFf (~) - Ff(~) I -> 0 uniformly on
n
E. From this it follows that ~(E) is dense in H(E). If B and ~ e E and
ff(x)~(dx) = ff(x)~(dx) for all f E D(S*), then ~ = ~. Therefore the algebra
~(E) separates points of E. The Stone Welerstrass theorem then implies that
94
HD(E) is dense in C(E). Part (ii) follows from (1) since C(E) generates L (E)
under bounded pointwise convergence.
For a function of the form (2.12) the variational derivatives are given by:
δF_f(μ)/δμ(x) = Σ_{j=1}^{N(f)} ∫_{S*} ⋯ ∫_{S*} f(x₁,…,x_{j−1},x,x_{j+1},…,x_{N(f)}) Π_{i≠j} μ(dx_i)
+ ½ ~ (FBfJk)(~)i
f - Ff(~)) + V(N(f)).Ff(H)
£=-i j=l k=l
j#k
where V(n) = c₁n + c₂n(n−1),
GN is assumed to be the infinitesimal generator of a strongly continuous
where for F ∈ Q,
(2.16) L#F_μ(f) = F_μ(G_{N(f)}f) + Σ_{ℓ=−1}^{K} Σ_{j=1}^{N(f)} a_ℓ [F_μ(A_ℓ^{(j)}f/a_ℓ) − F_μ(f)]
+ ½ Σ_{ℓ=−1}^{K} Σ_{j=1}^{N(f)} Σ_{k=1, k≠j}^{N(f)} b_ℓ [F_μ(B_ℓ^{(jk)}f/b_ℓ) − F_μ(f)]
where a_ℓ, b_ℓ are arbitrary positive real numbers and
V(n) = c₁n + c₂n(n−1).
The application of the duality method grows out of the observation that L#
has the structure of an infinitesimal generator of a D-valued Markov process re-
stricted to functions in Q. The dynamics of the D-valued Markov process {Y(t):t≥0}
(if it exists) with infinitesimal generator L# involves two basic mechanisms:
(2.17) (Jump mechanism) f → (A_ℓ^{(j)}f)/a_ℓ with jump rate a_ℓ, j = 1,2,…,N(f),
f → S_t^{N(f)} f
where {S_t^N : t ≥ 0} represents the semigroup of transformations on C(S^{*N})
with infinitesimal generator G_N. We assume that D(S^{*N}) is invariant under S_t^N.
Processes X and Y which satisfy the relation (2.19) are said to form a pair of
dual processes. The development of the theory of dual processes in a general set-
ting is the objective of the next section.
Let
(3.1.a) A₁ ⊂ C(E₁);
(3.1.b) A₂ := {(f(x,·),h(x,·)) : x ∈ E₁}, where f,h ∈ C(E₁×E₂),
ping time. Assume that X and τ are independent of Y and σ. We also assume
that for each T > 0 there is an integrable random variable Γ_T such that
3.1. THEOREM
Then
(3.4) E[f(X(t∧τ),Y(0))·exp(∫₀^{t∧τ} α(X(u))du)] − E[f(X(0),Y(t∧σ))·exp(∫₀^{t∧σ} β(Y(u))du)]
= ∫₀ᵗ E[{χ_{s<τ}[g(X(s),Y((t−s)∧σ)) + α(X(s))f(X(s∧τ),Y((t−s)∧σ))]
− χ_{t−s≤σ}[h(X(s∧τ),Y(t−s)) + β(Y(t−s))f(X(s∧τ),Y((t−s)∧σ))]}
· exp(∫₀^{s∧τ} α(X(u))du + ∫₀^{(t−s)∧σ} β(Y(u))du)] ds ,
where χ_{s<τ} denotes the indicator function of the event {s < τ}.
3.1.1. LEMMA
Proof. ∫₀ᵀ ∫₀ᵗ (f₁(s,t−s) − f₂(s,t−s)) ds dt
= ∫₀ᵀ ∫₀ᵗ f₁(t−s,s) ds dt − ∫₀ᵀ ∫₀ᵗ f₂(s,t−s) ds dt
= ∫₀ᵀ ∫ₛᵀ f₁(t−s,s) dt ds − ∫₀ᵀ ∫ₛᵀ f₂(s,t−s) dt ds
= ∫₀ᵀ (f(T−s,s) − f(0,s)) ds − ∫₀ᵀ (f(s,T−s) − f(s,0)) ds
= ∫₀ᵀ (f(s,0) − f(0,s)) ds .
∫₀ᵗ (f₁(s,t−s) − f₂(s,t−s)) ds = f(t,0) − f(0,t),
3.1.2. LEMMA
3.1.3. LEMMA
Let
(3.8) φ(s,t) := E[f(X(s∧τ),Y(t∧σ))·exp(∫₀^{s∧τ} α(X(u))du)·exp(∫₀^{t∧σ} β(Y(u))du)] .
0 0
Then
φ(s+h,t) − φ(s,t)
= E[f(X((s+h)∧τ),Y(t∧σ))·exp(∫₀^{(s+h)∧τ} α(X(u))du)·exp(∫₀^{t∧σ} β(Y(u))du)]
(3.10) − E[f(X(s∧τ),Y(t∧σ))·exp(∫₀^{s∧τ} α(X(u))du)·exp(∫₀^{t∧σ} β(Y(u))du)]
= E[f(X((s+h)∧τ),Y(t∧σ))·{∫_{s∧τ}^{(s+h)∧τ} α(X(r))·exp(∫₀^{r∧τ} α(X(u))du) dr}·exp(∫₀^{t∧σ} β(Y(u))du)]
But for t, (s+h) ≤ T, the absolute values of the second and fourth terms are
bounded by
where 0 = s₀ < s₁ < ... < s_m = s. If max_j (s_j − s_{j−1}) → 0, then (3.10), (3.11) and
(3.12) yield:
φ(s,t) − φ(0,t) = E[∫₀^{s∧τ} {f(X(r),Y(t∧σ))·α(X(r)) + g(X(r),Y(t∧σ))}
· exp(∫₀^{r} α(X(u))du)·exp(∫₀^{t∧σ} β(Y(u))du) dr]
− E[∫₀^{s} χ_{t−s≤σ}{f(X(s∧τ),Y(t−s))·β(Y(t−s)) + h(X(s∧τ),Y(t−s))}
· exp(∫₀^{s∧τ} α(X(u))du)·exp(∫₀^{(t−s)∧σ} β(Y(u))du) ds]
and the proof of the theorem is complete.
= E[{f(X((t−σ)⁺∧τ),Y(σ))·exp(∫₀^{(t−σ)⁺∧τ} α(X(u))du) − f(X(0),Y(σ))}·exp(∫₀^{σ} β(Y(u))du)]
− E[{f(X(τ),Y((t−τ)⁺∧σ))·exp(∫₀^{(t−τ)⁺∧σ} β(Y(u))du) − f(X(τ),Y(0))}·exp(∫₀^{τ} α(X(u))du)]
3.3. COROLLARY. Assume that (3.2) and (3.3) are true with τ = σ = ∞ and (3.13).
Then
E[f(X(t),Y(0))·exp(∫₀ᵗ α(X(u))du)] = E[f(X(0),Y(t))·exp(∫₀ᵗ β(Y(u))du)].
3.4. COROLLARY. Let E₁,ₙ ↑ E₁, E₂,ₘ ↑ E₂; τₙ := inf{t : X(t) ∉ E₁,ₙ}, and
Then
E[f(X(t),y)·exp(∫₀ᵗ α(X(s))ds)] = E[f(x,Y(t))·exp(∫₀ᵗ β(Y(s))ds)] .
4.1. HYPOTHESES.
(a) (Conservative Property) For every N ≥ 1, 1_{(N)} (the function of N variables which assumes the constant value 1) ∈ D(S^{*N}) and L_F 1_{(N)} = 0.
(b) b_ℓ = 0 if ℓ > 0.
(c) There is a norm |||·|||_ℓ on D(S^{*N}) and constants c_{1,ℓ}, c_{2,ℓ} such that for f ∈ D(S^{*N}),
(i) |||f|||_ℓ ≤ c_{1,ℓ}·|||f||| ,
(ii) |||∂f/∂z|||_{N(f)+ℓ} ≤ c_{2,ℓ}·|||f|||_{N(f)} , and
4.2. HYPOTHESES.
(a) (Conservative Property) (As in 4.1.a.)
(b) Condition 4.1.c is satisfied with c_{j,ℓ} ≡ 1, j = 1,2 and ℓ = −1,0,1,…,K.
(d) Consider the Markov jump process on the non-negative integers Z₊ with infinitesimal transition rates q_{jk} for a transition j → k ≠ j, where
    q_{0,ℓ} = 0 ,  ℓ = 1,…,K.
4.3. HYPOTHESES.
(a) (Pure Death Dual) a_ℓ ≡ b_ℓ ≡ 0 if ℓ > 0.
(b) For each N ≥ 1, 1_{(N)} ∈ D(S^{*N}), G_N 1_{(N)} = 0 and L_F 1_{(N)} ≥ 0.
(d) The moment problem for X(t,S^*) is well posed for small t; e.g. 4.1.d is satisfied.
4.4. THEOREM.
Under each of the sets of Hypotheses 4.1, 4.2, 4.3, a P[Ω^{(S^*)}]-valued solution to the initial value martingale problem is unique, measurable and strong Markov.
Proof. By Lemma 2.2, Φ(E^∞_K) generates L(E^∞_K) under bounded pointwise convergence.
Then Theorem 2.1 implies that a P[Ω^{E∞}]-valued solution to the initial value martingale problem is unique, measurable and satisfies the strong Markov property provided that for each t ≥ 0 and f ∈ D there is a function A_{f,t} ∈ L(M(E^∞)) such that for μ ∈ M(E^∞),
In fact, it suffices to establish (4.2) for all 0 < t < t₀ for some t₀ > 0 (independent of μ and f). To verify the latter assertion note that under this condition the argument in the proof of Theorem 2.1 yields the uniqueness and strong Markov property of P_μ on F_{t₀}. Then in the same way the conditional distribution
Under Hypotheses 4.1, N(Y(t)) is a Markov chain on Z₊ such that the "birth rate", that is, the rate of transitions from D(S^{*N}) → D(S^{*(N+ℓ)}) with ℓ > 0, is bounded by a linear function of N(Y(t)). This implies that N(Y(t)) is a conservative Markov chain. In addition, a coupling argument can be used to verify that N_B(Y(t)), the number of "births" in [0,t], is stochastically dominated by K·Z(t), where Z(t) is a Yule process, that is, a pure birth process with linear birth rate on Z₊. Therefore
(4.4)  P(N_B(Y(t)) = n | N(Y(0)) = k) ≤ c·n^k·z(t)^n , where z(t) → 0 as t → 0, and
c is a constant.
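A Yule process grows by one at total rate λn from state n (each of the n individuals independently gives birth at rate λ), so E Z(t) = k·e^{λt} when Z(0) = k; this exponential but finite growth is what makes a bound of the form (4.4) usable. A minimal Gillespie-style simulation, with illustrative parameter values not taken from the paper:

```python
import random, math

def yule(k, lam, t_end, rng):
    """Gillespie simulation of a Yule (linear pure-birth) process started at k."""
    n, t = k, 0.0
    while True:
        t += rng.expovariate(lam * n)   # next birth after an Exp(lam*n) waiting time
        if t > t_end:
            return n
        n += 1

rng = random.Random(0)
k, lam, t_end = 3, 1.0, 1.0
mean = sum(yule(k, lam, t_end, rng) for _ in range(20000)) / 20000
# linear birth rate gives exponential mean growth: E Z(t) = k * exp(lam * t)
assert abs(mean - k * math.exp(lam * t_end)) < 0.2
```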
For K > 0, let σ_K := inf{t : |||Y(t)|||_{N(Y(t))} + N(Y(t)) > K}. Then for K < ∞, μ ∈ M(E^∞) and f ∈ D,
(4.5)  F(Y(t∧σ_K)) − ∫₀^{t∧σ_K} L_F F(Y(s)) ds
is a bounded G_t-martingale with respect to P_f and conditions (3.3) are satisfied. Conditions (3.2) are also satisfied for τ = τ_K, σ = σ_K, so that Theorem 3.1 and Corollary 3.2 can be applied. If |||Y(0)|||_{N(Y(0))} + N(Y(0)) < ∞, μ ∈ E^∞, this yields:
(4.6)  E_μ[F(X(t∧τ_K),Y(0))] − E_μ[F(X(0),Y(t∧σ_K))·exp(∫₀^{t∧σ_K} V(Y(u))du)]
.(t-~<)~T~ t < . 0
0 ~v(t-~K)+
    + V(Y((t−s)∧σ_K))F(X(s∧τ_K),Y((t−s)∧σ_K))}·exp(∫₀^{(t−s)∧σ_K} V(Y(u))du))ds] .
Under the conservative hypothesis 4.1.a, E_μ[(X(t∧τ_K,S^*))^n] = E_μ[(X(0,S^*))^n] for all n ∈ Z₊ and μ ≥ X(0,S^*). Therefore X(t,S^*) = X(0,S^*) for all t and τ_K ≡ ∞.
Under Hypotheses 4.1 it follows from (4.4) that
Under Hypotheses 4.2, results analogous to (4.7) and (4.8) follow immediately (with t₀ = ∞) since N(Y(t)) is conservative and |||Y(t)||| ≤ c·|||Y(0)|||. Together, (4.6), (4.7) and (4.8) imply that for t < t₀,
(4.9) for 0 ≤ t < t₀, f ∈ D, there is a function A_{f,t} which is a measurable function on M(S^*), bounded on E^∞_K for 0 < K < ∞, and satisfies (4.2), and
(4.10) for 0 ≤ t < t₀, the moment problem for X(t,S^*) with distribution determined by {P_μ : μ ∈ M(S^*)} has a unique solution.
Then, for any f ∈ D, lim_{K→∞} E_μ[F_f(X(t∧τ_K))] = E_μ[F_f(X(t))]. Then (4.6) and (4.11) yield:
(4.12)  E_μ[F_f(X(t))] = E_f[F(μ,Y(t))·exp(∫₀ᵗ V(Y(u))du)] := A_{f,t}(μ) .
Since A_{f,t}(·) is a measurable function which is bounded on the sets E^∞_K, it
remains to verify that the moment problem for X(t,S^*) has a unique solution. But condition 4.1.d implies that
4.5. EXAMPLE.
Fleming and Viot introduced a continuous state version of the Ohta-Kimura stepwise mutation model for describing allelic frequencies within a population undergoing mutation, random genetic drift and selection. This model is formulated as a probability-measure-valued diffusion process on R defined by a differential operator with polynomial coefficients of degree 2. The differential operator is given by
(4.14)  L_F f(μ)
        (j ≠ k)
    + (N(f) − ½N(f)(N(f)−1))F_f(μ) ,
and D(R^{*N}) are the functions whose derivatives of order two or less belong to C(R^{*N}).
The differential operator L given by (4.14) satisfies Hypotheses 4.1 and therefore Theorem 4.4 yields the uniqueness, measurability and strong Markov property of a solution to the associated initial value problem (if it exists). (The existence of a solution can be established by other methods.) The duality method has also been used to study limit theorems for this model in the selectively neutral case (Dawson and Hochberg [2]).
4.6. EXAMPLE.
In this example we consider branching diffusion together with turbulent transport in R³, which is an extension of the model of P.L. Chow [1] for molecular diffusion with turbulent transport. The differential operator associated with this model is given by:
    Σ_{j=1}^N Σ_{k=1}^N Σ_{α,β=1}^3 ρ_{α,β}(x_j − x_k) ξ_{j,α} ξ_{k,β} ≥ 0
for any set of real numbers {ξ_{j,α}}. If x_j = (x_{j,1}, x_{j,2}, x_{j,3}) ∈ R³, then
    G_N f(x₁,…,x_N) := Σ_{j=1}^N Σ_{k≠j} Σ_{α,β=1}^3 ρ_{α,β}(x_j − x_k) ∂²/∂x_{j,α}∂x_{k,β} f(x₁,…,x_N)
        + (K/2)·Σ_{j=1}^N Σ_{α=1}^3 ∂²/∂x_{j,α}² f(x₁,…,x_N) .
For sufficiently large K the operators G_N are uniformly elliptic and generate strongly continuous semigroups on C₀((R³)^N). Then the differential operator L given by (4.15) satisfies Hypotheses 4.3 and Theorem 4.4 can be applied to establish the uniqueness, measurability and strong Markov property of a solution to the associated initial value martingale problem.
REFERENCES.
2. D.A. Dawson and K.J. Hochberg, Wandering random measures in the Fleming-Viot model, Annals of Probability, to appear.
4. R. Holley and T. Liggett, Ergodic theorems for weakly interacting systems and the voter model, Annals of Probability 3(1975), 643-663.
6. R. Holley and D. Stroock, Dual processes and their applications to infinite interacting systems, Advances in Mathematics 32(1979), 149-174.
11. D.W. Stroock and S.R.S. Varadhan, Multidimensional diffusion processes, Grundlehren der mathematischen Wissenschaften 233(1979), Springer-Verlag, Berlin, Heidelberg, New York.
OPTIMAL STOPPING OF CONTROLLED MARKOV PROCESSES
Nicole EL KAROUI
with Jean-Pierre LEPELTIER and Bernard MARCHAL
I - INTRODUCTION
We are concerned with optimal stopping of Markov processes,
controlled by change of absolutely continuous probability measures.
We use the results on optimal stopping depending on a parameter, in
order to construct by iteration the value-function of this optimization
problem.
We then show that the separation principle is satisfied: we can first
choose the optimal stopping rule, and then study a classical problem
of instantaneous control, that we solve under appropriate assumptions.
II - THE MODEL
In free evolution, the state of the system is an E-valued Markov process, 𝒳 = (Ω, F_t, X_t, P_x ; x ∈ E), where E is a Lusinian space. We suppose that 𝒳 is a "good" Markov process: a Feller process if E is compact, or more generally a right process, roughly speaking a Feller process (with branching points) for an adequate topology on E.
For example, 𝒳 is a diffusion process on Rⁿ, a diffusion with jumps, a killed diffusion, a reflected diffusion on a convex domain in Rⁿ, …
In particular, for t ≥ S, L_t^S does not depend on u a.s.
(2.3) If a is a constant policy, 𝒳^a = (Ω, F_t, X_t, P_x^a ; x ∈ E) is a Markov process, Feller or right as 𝒳.
Remark: Usually, L^u is the exponential martingale associated with a family of stochastic integrals.
Here, C_t^u = ∫₀ᵗ e^{−H_s^u} c(X_s,u_s) ds, and H_t^u = ∫₀ᵗ h(X_s,u_s) ds.
The proof is similar to the proof of relations 5.3 and 5.11 of [1], because we assume (2.1) and (2.2).
(5.2)  = sup_u E_x^u [C_{D_λ} + exp(−λ)·wg(X_{D_λ})]
g(X.) and wg(X.) are right continuous. The inequality wg(X_{D_ε}) ≤ g(X_{D_ε}) + ε is true, and D_ε is ε-optimal.
c) In order to show (5.1), we are going to prove that J_x(u,D_λ,g) converges uniformly to J_x(u,D_∞,g) as λ → ∞, where D_∞ = lim_{λ→∞} D_λ ≤ D.
Under (H2) and (H3),
Nicole EL KAROUI
Ecole Normale Supérieure
5, rue Boucicaut
92260 Fontenay-aux-Roses
FRANCE
TWO PARAMETER FILTERING EQUATIONS FOR
By
and
i. INTRODUCTION
signal process is a two parameter semimartingale associated with this event. The
observation process is a second random event in the positive quadrant, and the
signal influences the observation because the distribution function of the observation
the reference probability model of Zakai [7] . Using the calculus of two parameter
are obtained. Intuitively the problem can be considered as searching for a random
martingale associated with the Brownian sheet, were obtained by Wong [6] and
Korezlioglu, Mazziotto and Szpirglas [8]. Filtering equations using two parameter
under certain symmetry hypotheses by Mazziotto and Szpirglas in [5]. In the notation
of section 3 below the symmetry hypotheses of [5] would, for example, include the
our calculations and give rise to results which superficially resemble those of [5].
For the fundamental situation associated with the two parameter jump process we derive
filtering formulae without such hypotheses. The calculations are explicit and our
results new.
of the University of Hull for its hospitality in September and October 1981 when
a random 'time' (S₁,S₂) in the positive quadrant [0,∞]². We can, therefore, take the
: " i : 1,2
"/... 7.
Write F* for F*
NOTATION 2.1.
    θ·p₁*p₂*(t₁,t₂) = ∫₀^{t₂}∫₀^{t₁} θ(u₁,u₂) dp¹_{u₁} dp²_{u₂} .  (See [1] and [3].)
DEFINITION 2.2.
where X₀₀ is a constant and θᵢ ∈ L¹(Ω₁,F*,P₁), i = 1,2,3,4.
REMARKS 2.3.
Here we have assumed that X is constant on the axes. Terms of the form
θ·p¹* and θ′·p²* could be introduced into X. However, such one dimensional stochastic
calculations.
The process X cannot be observed directly, but a second random event which
occurs at a 'time' (T₁,T₂) in [0,∞]² can be observed. We shall describe below how the
F^{t₁}_1 × F^{t₂}_2 and F = F₁^∞ ⊗ F₂^∞. Associated with (T₁,T₂) we have the processes, (to which
o o_ cFi
]0, t .AT.] Ui- Ui
    dQ₂/dP₂ = L ∈ L¹(Ω₂,F) .
equivalent to P₂, L > 0 a.s. and L_{t₁t₂} > 0 a.s. for each (t₁,t₂) ∈ [0,∞]². We shall assume in the situations discussed below that P₂ and Q₂ have the same marginals, so that
Again, all our calculations go through without this hypothesis, but are then even
    L_{t₁t₂} = 1 + ∫₀^{t₁}∫₀^{t₂} g(u₁,u₂) dq_{u₂} dq_{u₁} .
With this representation result, and the reference probability model for
for each sample path X of the signal process g(X_{u₁u₂}) ∈ L¹(Ω₂,F). Furthermore, if
    L_{t₁t₂}(X) = L_{t₁t₂} = 1 + ∫₀^{t₁}∫₀^{t₂} g(X_{u₁u₂}) dq_{u₂} dq_{u₁} ,
then we assume that L_{t₁t₂} > 0 a.s. and the distribution of the observation event
    dQ₂(X)/dP₂ = L(X) > 0 a.s.
Note that under Q₂(X) the components T₁ and T₂ are not, in general,
independent.
NOTATION 3.1.
Write
Xtlt 2 ~(3.1)
where
where
= t I, t2- fo g(Xslt2)dqsl
so that
(3.3)  L_{t₁t₂}(X) = L_{t₁t₂} = 1 + ∫₀^{t₁} L_{s₁−,t₂} H(s₁,t₂) dq_{s₁}
(3.4)       = 1 + ∫₀^{t₂} L_{t₁,s₂−} H̄(t₁,s₂) dq_{s₂}
= Ftl t2 ]
and
Keeping t₂ fixed and using the Ito product rule for one parameter semimartingales,
    (LX)(t₁,t₂) = X₀₀ + ∫₀^{t₁} L_{s₁−,t₂}(α(s₁,t₂) dp¹*_{s₁} + γ(s₁,t₂) dq_{s₁})
        + ∫₀^{t₁} X_{s₁−,t₂} L_{s₁−,t₂} H(s₁,t₂) dq_{s₁} .
Therefore,
(4.1)  (L̂X)(t₁,t₂) = X₀₀ − ∫₀^{t₁} L̂_{s₁−,t₂} (α̂(s₁,t₂) + γ̂(s₁,t₂)) dp¹*_{s₁}
        + ∫₀^{t₁} L̂_{s₁−,t₂} (X̂H)(s₁−,t₂) dq_{s₁} ,
where
(4.2)  L̂_{t₁t₂} = 1 + ∫₀^{t₁} L̂_{s₁−,t₂} Ĥ(s₁,t₂) dq_{s₁} ,
where
Forming the product of (4.1) and (4.2) we have, after some algebra, the following
expression:
    X̂_{t₁t₂} = (L̂X)(t₁,t₂) / L̂_{t₁t₂}
    = X₀₀ − ∫₀^{t₁} (α̂(s₁,t₂) + γ̂(s₁,t₂)) dp¹*_{s₁}
    + ∫₀^{t₁} (K(X,H)/(1+Ĥ))(s₁−,t₂)·(dq_{s₁} − Ĥ(s₁,t₂) dp¹*_{s₁}) .
REMARKS 4.1.
It is proved in [3] that the process
    v¹_{t₁t₂} = q_{t₁} − ∫₀^{t₁} Ĥ(s₁,t₂) dp¹*_{s₁}
is a 1-martingale for the filtration {F_{t₁t₂}} under the measure Q. Similarly the process
    v²_{t₁t₂} = q_{t₂} − ∫₀^{t₂} H̄̂(t₁,s₂) dp²*_{s₂}
is a 2-martingale for the filtration {F_{t₁t₂}} under the measure Q.
DEFINITION 4.2.
THEOREM 4.3.
    X̂_{t₁t₂} = X₀₀ − ∫₀^{t₁} (α̂(s₁,t₂) + γ̂(s₁,t₂)) dp¹*_{s₁} + ∫₀^{t₁} (K(X,H)/(1+Ĥ))(s₁−,t₂) dv¹_{s₁t₂}
THEOREM 4.4.
    X̂_{t₁t₂} = X₀₀ − ∫₀^{t₂} (α̂(t₁,s₂) + γ̂(t₁,s₂)) dp²*_{s₂} + ∫₀^{t₂} (K(X,H̄)/(1+H̄̂))(t₁,s₂−) dv²_{t₁s₂} .
Here
NOTATION 5.1
= (82,82) - HH(bl,s 2) ,
where
~) =L81-e2-)-I (Lsl-,s2_rY(81,
s2-)H(81-,s2)),
and
121
Define
    F = K(H̄,X) + K(X,H,Y)
and G = (1 + Ĥ)·K(X,H̄) + (1 + H̄̂)·K(X,Ĥ) .
LEMMA 5.2.
- )-2 t2 -
= (Ltl-,t2 fo ~(t1"s2)dqs2
= for2RK(tl"S2)dqs2 - lot2RK(t1"s2)~(tl-'S2)l
+ ~(tl-,S2) dPs2
Similarly
Using Theorem 3.3 of [3] we have the following two parameter expression for (L̂)⁻¹:
LEMMA 5.3.
(Ltlt2)-I = I + C~-s~sJ.plp2
+ J(H(Sr82)).p1~x +
L
~2(G
,' (81"82),~Ip2_
L
s 1, s 2- si-, 82
REMARKS 5.4.
One might try to discuss this product using the above expression for (L̂)⁻¹. However,
For i = 1,2,3,4 write
_ _ . s~ s2- uI u2
: {Lsl-,a J 1{8~i12(1 + Io fo g(Xoo- IO Io 04d~2~Pl)dquldqu2 )} "
and
t2
~Cs1"tJ = - ~ot2 (°I
^ + ~ s )dF~* + I ° -K(a,H)
- (s1,z2_j dr2
8112 I" 2 2 I+H 81, s2"
=
• P2"
(1 + ~)2 (1 + [1)(n + )
(HK(X,~) + XRK) 2
+ ^ .V ,
(1 +~)
.t2 ~ ~
so C a+
12 I+H
THEOREM 5.5
+ tl t2^ K(X,H) (F + G)
Io Io d~ I
REMARKS 5.6
REFERENCES
de Poisson Mélange à Deux Indices. Stochastics 4 (1980), 89-119.
6. WONG, E. Recursive Filtering for Two Dimensional Random Fields. IEEE Trans.
Klaus Fleischmann
Academy of Sciences of the GDR
Institute of Mathematics
DDR-1080 Berlin, Mohrenstrasse 39
Summary
For some stationary spatially homogeneous measure-valued branching processes we have asymptotic independence in the space-time diagram.
1. Introduction
for all i ≠ j.
Roughly speaking, space-time mixing means that in the space-time diagram the behaviour in finitely many regions will be asymptotically independent if all distances between the regions tend to infinity.
2. Model Assumptions
In principle we assume that, for all x in G, a unit mass δ_x at position x generates a cluster Π_x, i.e. a random finite measure Π_x defined on G. Moreover we presuppose spatial homogeneity, i.e. Π_x coincides in distribution with T_x Π₀ for all x, where Π₀ =: Π is the cluster generated by a unit mass in the origin of R^d. Furthermore we shall use the basic assumption in branching theory, namely that different masses generate independent progenies. Consequently, in addition, we have necessarily to require that Π is an infinitely divisible random measure, i.e. for all natural numbers k it can be represented as a sum of k independent identically distributed random measures. Finally, we assume that Π is critical, i.e. its expected total mass E Π(G) equals 1.
Using these assumptions, we can define a branching process (ρ_t)_{t≥0} in the following way. At time 0 we start with the Lebesgue measure ℓ on G, i.e. ρ₀ := ℓ. Then all 'small parts ρ₀(dx)' of the initial population ρ₀ are clustered independently as well as spatially homogeneously, and the superposition gives us the first generation ρ₁. (A strong formulation for this intuitive explanation can be given in terms of Lévy-Khinchin representations, cf. [5] or [3].) Given ρ₁ we continue in an analogous manner getting ρ₂ etc.
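As a hedged one-dimensional sketch of the clustering mechanism (an illustration, not the author's construction): replace each particle of the current population by a Poisson(1) number of Gaussian-displaced offspring. The Poisson(1) choice makes the branching critical, so the expected total mass of each generation is preserved:

```python
import random

def sample_poisson1(rng):
    """Sample Poisson(1) by counting unit-rate exponential interarrivals before time 1."""
    n, t = 0, rng.expovariate(1.0)
    while t < 1.0:
        n += 1
        t += rng.expovariate(1.0)
    return n

def cluster_step(points, rng, sigma=1.0):
    """One generation: each particle generates a critical Poisson(1) cluster of
    offspring, displaced spatially homogeneously (Gaussian steps) around the parent."""
    children = []
    for x in points:
        for _ in range(sample_poisson1(rng)):   # mean one offspring: criticality
            children.append(x + rng.gauss(0.0, sigma))
    return children

rng = random.Random(1)
pop = [float(i) for i in range(2000)]    # discrete stand-in for the initial Lebesgue mass
avg = sum(len(cluster_step(pop, rng)) for _ in range(20)) / 20
assert abs(avg / len(pop) - 1.0) < 0.05  # expected total mass is preserved
```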
The following convergence lemma holds (cf. Hermann [3]).
Lemma 2.1. ρ_t converges in distribution to some limit population ρ_∞ which is cluster-invariant (in distribution) and infinitely divisible, and its intensity measure E ρ_∞ equals ℓ or the zero measure o.
For the interesting problem of separating the extinction case E ρ_∞ = o we refer to [4] or, in the point process case, to [7], section 12.6. An open problem is the question whether in some particular 'non-stable' cases with E ρ_∞ = o there exist cluster-invariant populations (which would have infinite asymptotic densities).
3. The Result
From now on we restrict our considerations to the 'stable' case
defined on the group R^d × Z, for which we have the space-time translations
    T_{(x,s)} ,  x ∈ R^d, s ∈ Z .
Let us mention here that the condition in the theorem does not
depend on the order k. In other words, for infinitely divisible ran-
dom measures the properties mixing and mixing of all orders coincide.
P is called ergodic if all measurable subsets of M which are almost-
invariant with respect to (Tg) are trivial with respect to P.
Let (ψ_n) be a sequence of distributions defined on G such that the variational distance ||ψ_n * ω₁ − ψ_n * ω₂|| tends to zero as n → ∞ for all distributions ω₁, ω₂ defined on G which are absolutely continuous with respect to a Haar measure on G. Such (ψ_n) is called weakly asymptotically uniformly distributed.
5. On the Proofs
Let F be the set of all bounded non-negative functions f on A with
bounded support. The Laplace functional L of P is defined by
+ ,o ,t * o
c,,,4,.. >, )
References
[1] K. Fleischmann: Mixing properties of infinitely divisible random measures. Carleton Mathematical Lecture Note, Ottawa (in preparation)
[2] K. Fleischmann, K. Hermann and K. Matthes: Kritische Verzweigungsprozesse mit allgemeinem Phasenraum, VIII., Math. Nachr. (submitted)
[3] K. Hermann: Critical measure-valued branching processes in discrete time with arbitrary phase space, I., Math. Nachr. 106, 63-107 (1981)
Wendell H. Fleming
(1.1)  dφ/ds + Lφ + V(x)φ = 0 ,  s < T ,
with data φ(T,x) = Φ(x) at a final time T. It is well known that, under suitable
assumptions,
(1.2)  φ(s,x) = E_sx {Φ(x_T)·exp(∫_s^T V(x_t) dt)}
gives such a representation. For instance, if x_t = x_s + w_t − w_s, with w_t a
(1.3)  dI/ds + H(I) − V(x) = 0 , where
(1.4)  H(I) = −e^I L(e^{−I}) .
The function H is concave. For a fairly wide class of Markov processes, we wish to write (1.3) as the dynamic programming equation associated with a suitable optimal stochastic control problem for Markov processes. The stochastic control problem is specified by giving: (a) a suitable control space U; (b) for each constant control u ∈ U, the generator L^u of a Markov process; and (c) a cost function k(x,u) associated with constant control u and state x. See [6, Chap. VI].

This research was supported in part by the National Science Foundation under contract MCS 79-03554 and in part by the Air Force Office of Scientific Research under contract AF-AFOSR 81-0116.

It is required that
(1.7)
minimizing
    J(s,x;u) = E_sx { ∫_s^T [k(ξ_t,u_t) − V(ξ_t)] dt + Ψ(ξ_T) } ,
    u_t = u(t,ξ_t) ,  Ψ = −log Φ .
The Verification Theorem of optimal stochastic control theory [6, p.159] asserts that if I is a "well behaved" solution to (1.3) with I(T,x) = Ψ(x) and if certain other technical conditions hold, then
(2.1)  Lf = ½ tr a(x) f_xx + b(x)·f_x ,
    tr a(x) f_xx = Σ_{i,j=1}^n a_{ij}(x) ∂²f/∂x_i∂x_j ,
(2.3)  L^u I = ½ tr a(x) I_xx + u·I_x .
The stochastic control representation (1.8) was used in [3] to give a stochastic control proof of results of Ventsel-Freidlin type for some large deviations problems for nearly deterministic diffusions. In those results a(x) is replaced by εa(x), with ε small. In [4] the logarithmic transformation was used to obtain stochastic representations for positive solutions to the heat equation with a potential term, and to obtain the "classical mechanical limit." In [5] [10] the same logarithmic transformation was applied to solutions to the pathwise equation of nonlinear filtering. Large deviations results for the nonlinear filter problem are obtained by Hijab [8] elsewhere in this volume.
    Lf(x) = a(x)[f(x+y) − f(x)] .
From (1.4),
(3.1)  e^r = max_{u>0} [u − u log u + ur] .
By taking r = I(x) − I(x+y) in (3.1) and changing signs (to replace max by min), we get the required form (1.5) for H(I). In this special case the control u is scalar, with u > 0. A constant control u changes the jumping rate from a(x) to ua(x). A feedback control u(s,x) changes the rate at time s and state x from a(x) to u(s,x)a(x). If I(s,x) = −log φ(s,x) as in §1, then the optimal feedback control is u*(s,x) = φ(s,x)⁻¹φ(s,x+y).
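The variational identity (3.1) can be checked directly: u ↦ u − u log u + ur is concave with derivative r − log u, so the maximum over u > 0 is attained at u = e^r, where the value is e^r − e^r·r + e^r·r = e^r. A small numerical confirmation:

```python
import math

def objective(u, r):
    # the function maximized in (3.1): u - u*log(u) + u*r
    return u - u * math.log(u) + u * r

for r in (-1.0, 0.0, 0.7, 2.0):
    u_star = math.exp(r)                      # stationary point: d/du = r - log u = 0
    assert abs(objective(u_star, r) - math.exp(r)) < 1e-12
    # brute-force maximum over a fine grid stays at (and below) e^r
    best = max(objective(1e-4 * i, r) for i in range(1, 200001))
    assert best <= math.exp(r) + 1e-9
    assert abs(best - math.exp(r)) < 1e-3
```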
Suitable L^{u(·)} and k(x,u(·)) are obtained by integrating (3.2), (3.3) with respect to π(x,dy):
We get as in equation (1.5), and the optimal feedback control is
(2.9)  u*(s,x;·) = φ(s,x+·)/φ(s,x) .
4. The Sheu formulation. In [11] another kind of control problem is considered. Let L be a bounded linear operator on C(Σ), the space of continuous bounded functions on Σ, such that L obeys a positive maximum principle. (In particular, L may be of the form (3.4) above.) For w = w(·) a positive function with w, w⁻¹ ∈ C(Σ),
define the operator L^w by
(4.1)  L^w f = w⁻¹[L(wf) − f Lw] .
In addition, define k^w(x) by
(4.2)  k^w = L^w(log w) − w⁻¹L(w) .
In Sheu's formulation, the control problem is to choose w_t(·) for s ≤ t ≤ T to minimize
    J(s,x;w) = E_sx { ∫_s^T [k^{w_t}(ξ_t) − V(ξ_t)] dt + Ψ(ξ_T) } ,
(4.5)  Ê_sx f(ξ_t) = E_sx [f(x_t) Φ(x_T)] / E_sx [Φ(x_T)] ,  s ≤ t ≤ T ,  f ∈ C(Σ) .
This is seen from the following argument. The denominator of the right side is φ(s,x). Let
Since φ̄ and φ both satisfy (1.1) with V = 0, the quotient v = φ̄φ⁻¹ satisfies
    ∂v/∂s = −φ⁻¹[L(vφ) − v Lφ] ,
~n
with x₀ = x. The exit probability φ^ε(s,x) is a positive solution to
(5.3)  ∂φ^ε/∂s + L^ε φ^ε = 0 ,
(5.4)  ∂I^ε/∂s + εH^ε(ε⁻¹I^ε) = 0 ,
    lim_{ε→0} εH^ε(ε⁻¹I) = H₀(x,I_x) ,
with I_x the gradient, and
(5.7)  ∂I⁰/∂s + H₀(x,I⁰_x) = 0 .
Now (5.7) is the dynamic programming equation for the deterministic control problem with control space U as in §3, with running cost k(ξ_t,u_t(·)), and with dynamics
(5.8)  dξ_t/dt = b(ξ_t,u_t(·)) ,
    b(x,u(·)) = a(x) ∫_{R^n} y u(y) π(x,dy) .
Condition (iv) insures that H₀(x,p) is the dual of the usual "action integrand" A(ξ,ξ̇) in large deviation theory, where for ξ, ξ̇ ∈ R^n
In both [3] and [11] the stochastic control method used to show that I^ε → I⁰ depends on comparison arguments involving an optimal stochastic control process when ε > 0 and an optimal ξ⁰ in (5.10) when ε = 0.
6. The dominant eigenvalue. In [2] Donsker and Varadhan gave a variational formula [(6.4) below] for the dominant eigenvalue λ₁ of L + V. Another derivation of this formula is given in [11], using the family of operators L^w mentioned in §4.
Equation (6.2) is the dynamic programming equation for the following average cost per unit time control problem. We admit stationary controls u(·) such that the controlled process with generator L^u has an equilibrium distribution. The criterion to be minimized is
    L^u I₁(x) + k(x,u) .
(6.4)  λ₁ = sup_μ [ ∫ V dμ − J(μ) ] .
Let
    Γ(I,μ) = ∫ [−H(I) + V] dμ .
The function Γ is convex in I and linear in μ. Formula (6.4) will follow if we can find I₁, μ₁ with the saddle point property:
as required.
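For a two-state chain the formula (6.4) can be verified concretely. With generator Q = [[−a,a],[b,−b]], the Donsker-Varadhan functional J(μ) = sup_{u>0} {−∫(Lu/u)dμ} works out in closed form to μ₁a + μ₂b − 2√(μ₁μ₂ab), and λ₁ is the principal eigenvalue of Q + diag(V). The following sketch (all numbers hypothetical) compares the two sides:

```python
import math

# hypothetical two-state generator Q = [[-a, a], [b, -b]] and potential V
a, b = 1.0, 2.0
V1, V2 = 0.5, -0.3

# left side: principal eigenvalue of Q + diag(V) (2x2, real eigenvalues here)
m11, m22 = -a + V1, -b + V2
tr, det = m11 + m22, m11 * m22 - a * b
lam1 = (tr + math.sqrt(tr * tr - 4.0 * det)) / 2.0

# right side: sup over mu of  int(V dmu) - J(mu), where for this chain
# J(mu) = mu1*a + mu2*b - 2*sqrt(mu1*mu2*a*b)
best = -float("inf")
n = 20000
for i in range(1, n):
    mu1 = i / n
    mu2 = 1.0 - mu1
    J = mu1 * a + mu2 * b - 2.0 * math.sqrt(mu1 * mu2 * a * b)
    best = max(best, mu1 * V1 + mu2 * V2 - J)

assert abs(best - lam1) < 1e-3   # the variational formula matches the eigenvalue
```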
REFERENCES
References (cont.)
Certain generalized Gaussian processes which arise as high density limits of supercritical branching random fields (see [1], [4]) possess interesting properties. In this note we prove some of these properties. We remark that the processes obey deterministic evolution equations with generalized random initial conditions.
Let S(Rd) denote the Schwartz space of infinitely differentiable rapidly decreasing
real functions on Rd topologized by the norms
    D^k φ = ∂^{|k|} φ / ∂x₁^{k₁} … ∂x_d^{k_d} ,
where x = (x₁,…,x_d), k = (k₁,…,k_d), |k| = k₁+…+k_d. Let
S′(R^d) denote the topological dual of S(R^d), ⟨·,·⟩ the canonical bilinear form on S′(R^d)×S(R^d), and ||·||_{−p} the operator norm on the dual of the ||·||_p-completion of S(R^d). The Schwartz spaces S(R^d×R₊) and S′(R^d×R₊) are similarly defined. The standard Gaussian white noise on R^d will be written W; it is the S′(R^d)-valued random variable whose characteristic functional is E exp{i⟨W,φ⟩} = exp{−½ ∫_{R^d} φ²(x) dx}. (See [3]).
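The characteristic functional says that ⟨W,φ⟩ is a centered Gaussian with variance ∫φ². On a grid, white noise can be realized as independent N(0,1) weights scaled by √Δx, which gives a quick Monte Carlo check (the grid, test function and sample sizes are arbitrary illustrative choices):

```python
import random, math

def sample_pairing(phi_vals, dx, rng):
    # <W, phi> on a grid: sqrt(dx) * sum_i phi(x_i) * xi_i with xi_i iid N(0,1)
    return math.sqrt(dx) * sum(p * rng.gauss(0.0, 1.0) for p in phi_vals)

rng = random.Random(42)
dx = 0.02
xs = [-5.0 + dx * i for i in range(500)]
phi = [math.exp(-x * x) for x in xs]        # test function phi(x) = exp(-x^2)

target = sum(p * p for p in phi) * dx        # Riemann sum for int phi^2 = sqrt(pi/2)
samples = [sample_pairing(phi, dx, rng) for _ in range(5000)]
var = sum(s * s for s in samples) / len(samples)

assert abs(target - math.sqrt(math.pi / 2.0)) < 1e-3
assert abs(var - target) < 0.12              # within sampling error of int phi^2
```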
Let A be the infinitesimal generator, {T_t, t≥0} the semigroup and p_t(x,dy) the transition probability of a time-homogeneous Markov process {X_t, t≥0} whose state space is all of R^d. We assume that
    T_t : S(R^d) → L²(R^d) ,
    T : S(R^d×R₊) → L²(R^d) ,
    φ(x,t) → ∫₀^∞ T_t φ(·,t)(x) dt ,
and
    A : S(R^d) → S(R^d)
functional
The existence of this process follows from its representation as a stochastic integral
REMARKS.
I. All the randomness of {Mt} comes from the initial condition Mo = W.
2. We recall that a Gaussian random field is "classical" (or continuous) if and only if its covariance measure has a density with respect to Lebesgue measure (covariance kernel) and it is continuous (see [3]). In our case, for K(s,dy;t,dz) to have a density k(s,y;t,z) it is necessary that p_t(x,dy) have a density p_t(x,y), and the covariance kernel is
    k(s,y;t,z) = ∫_{R^d} p_s(x,y) p_t(x,z) dx ,  y,z ∈ R^d .
Hence for t > 0, M_t is classical if and only if there is a density p_t(x,y) and (y,z) → k(t,y;t,z) is continuous.
    A = ½ Σ_{i,j=1}^d a_{ij}(·) ∂²/∂x_i∂x_j + Σ_{i=1}^d b_i(·) ∂/∂x_i
a transition density (see [6]), but the continuity of (y,z) → k(t,y;t,z) is another question. We leave it open.
3. For Brownian motion things are nice, as they should be. In this case A = ½Δ and p_t(x,y) = exp{−||x−y||²/2t}(2πt)^{−d/2}. The assumption on A is clearly met. We now verify the assumptions on T_t. For φ ∈ S(R^d),
    ||Tφ||² = ∫_{R^d} ( ∫₀^∞ T_t φ(·,t)(x) dt )² dx
        = ∫∫∫∫ φ(y,t) φ(z,s) e^{−||y−z||²/2(s+t)} (2π(s+t))^{−d/2} dy dz dt ds
sup
x,tZO - Jo~oJRd
since Π_{j=1}^d (1+|y_j|)^{−2} is integrable on R^d.
Therefore the Gauss-Markov process {M_t} in this case has covariance kernel
In addition the process is self-similar, i.e. for any constant λ > 0 the distribution of {⟨M_t,φ⟩} is invariant under the transformation ⟨λ^{−d/2} M_{λ²t}, φ(λ^{−1}·)⟩, as can be seen from the covariance kernel.
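In the Brownian case the covariance kernel collapses: by the symmetry of p_t and the Chapman-Kolmogorov equation, k(s,y;t,z) = ∫p_s(x,y)p_t(x,z)dx = p_{s+t}(y,z). A numerical check in dimension d = 1 (the integration window and step size are arbitrary choices):

```python
import math

def p(t, x, y):
    """Brownian transition density p_t(x, y) in dimension d = 1."""
    return math.exp(-(x - y) ** 2 / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def k(s, y, t, z, lo=-30.0, hi=30.0, n=60000):
    # covariance kernel: integral over x of p_s(x, y) * p_t(x, z), midpoint rule
    h = (hi - lo) / n
    return sum(p(s, lo + (i + 0.5) * h, y) * p(t, lo + (i + 0.5) * h, z)
               for i in range(n)) * h

s, t, y, z = 0.7, 1.4, 0.3, -1.1
# symmetry of p plus Chapman-Kolmogorov give k(s,y;t,z) = p_{s+t}(y,z)
assert abs(k(s, y, t, z) - p(s + t, y, z)) < 1e-6
```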
PROOF. For the Markov property of {M_t} it suffices to show that given t₀ < t and φ ∈ S(R^d) there is a ψ ∈ S(R^d) such that
and this can be done using the covariance of {M_t} and the Chapman-Kolmogorov equation for p_t(x,dy), with ψ = T_{t−t₀}φ.
It follows from the covariance of {M_t} (or more easily from the stochastic integral representation) that
Then, using the semigroup property of T_t, the relation ∫₀^t T_s Aφ ds = T_t φ − φ and the Schwarz inequality, we have for s < t
    ∫_{R^d} ( ∫₀^{t−s} T_{s+r} Aφ(x) dr )² dx ≤ (t−s) ∫_{R^d} ∫₀^{t−s} (T_{s+r} Aφ(x))² dr dx
where K is a constant (depending on each bounded interval [0,T] such that 0 ≤ s < t ≤ T). The Dudley-Fernique theorem ([2], Theorem 7.1) then implies that {⟨M_t,φ⟩} is sample-continuous for each φ ∈ S(R^d), and the stated norm-continuity of {M_t} follows by Mitoma's extension [5].
To obtain the evolution equation we view {M_t} as a space-time field. By the second assumption on T_t (definition of T),
    W_t = ∂M_t/∂t − A*M_t ,
    ∫₀^∞∫₀^∞∫_{R^d} T_s(∂φ/∂t + Aφ)(x,s) · T_t(∂ψ/∂t + Aψ)(x,t) dx ds dt ,
hence
    ∫∫ cov(⟨W_s,φ(·,s)⟩, ⟨W_t,ψ(·,t)⟩) ds dt
where E_x is the expectation under the distribution P_x of {X_t} starting from x ∈ R^d. But
    φ(X_t,t) − ∫₀^t (∂φ/∂t + Aφ)(X_s,s) ds ,  t ≥ 0 ,
    E_x ∫₀^∞ (∂φ/∂t + Aφ)(X_s,s) ds = −φ(x,0) .
Similarly for ψ. Substituting above we obtain
    ∫₀^∞∫₀^∞ cov(⟨W_s,φ(·,s)⟩, ⟨W_t,ψ(·,t)⟩) ds dt = ∫_{R^d} φ(x,0) ψ(x,0) dx ,
and therefore
where δ₂ is the Dirac delta function on R² centered at the origin. This means that
    W_t = W for t = 0 ,  and  W_t = 0 for t > 0 .
REFERENCES
1. Dawson, D. and Gorostiza, L.G. Limit theorems for supercritical branching random
fields. In preparation.
2. Dudley, R.M. (1967). The sizes of compact subsets of Hilbert space and continuity
of Gaussian processes, J. Functional Analysis, Vol. 1, 290-330.
3. Gelfand, I.M. and Vilenkin, N.J. (1966). Generalized Functions, Vol. 4, Academic
Press, New York.
4. Gorostiza, L.G. (1981). Limites gaussiennes pour les champs aléatoires ramifiés supercritiques, in "Aspects Statistiques et Aspects Physiques des Processus Gaussiens", Colloque International du C.N.R.S., Saint-Flour, France, 1980. Editions du C.N.R.S. No. 307, 385-398.
5. Mitoma, I. (1981). On the norm-continuity of S′-valued Gaussian processes, Nagoya Math. J., Vol. 82, 209-220.
6. Stroock, D.W. and Varadhan, S.R.S. (1979). Multidimensional Diffusion Processes,
Springer-Verlag, Berlin.
7. Wong, E. and Zakai, M. (1974). Martingales and stochastic integrals for processes
with a multi-dimensional parameter. Z. Wahrscheinlichkeitstheorie verw. Gebiete,
Vol. 29, 109-122.
EXTREMAL CONTROLS FOR COMPLETELY OBSERVABLE DIFFUSIONS*
U.G. Haussmann
Department of Mathematics
University of British Columbia
(1.1)  inf{J[u] : u ∈ U}
where
and U is some class of admissible controls, and where the state x satisfies the
equation
Here x and w are processes defined on some probability space which may depend on
u, and w is a standard Brownian motion. More details will be given in the next
section where several classes of controls will be considered. We begin with the following notation and assumptions (more will follow). C_t is the Banach space of continuous functions [0,t] → R^n, C_T = C, G is the Borel σ-algebra on C and {G_t} is the canonical filtration on C. Write ||x||_t = sup{|x(s)| : 0 ≤ s ≤ t}. If x : [0,T]×Ω → R^n is a continuous process, i.e. x is a C-valued random variable, then F_t^x = x⁻¹∘G_t is the σ-algebra generated by the past of x. On [0,T] we use the Borel σ-algebra. U is a Borel set in some Euclidean space with Borel subsets B_U. R^n is n-dimensional Euclidean space and R^{n×m} is the space of n×m matrices.
We show in section three that a control which satisfies the necessary condi-
tion of the maximum principle, i.e. an extremal control, is in fact 'optimal' pro-
vided that certain convexity hypotheses are satisfied. This result is proved along
the lines of the corresponding deterministic case, [9]. Here 'optimal' means optimal
within the class of controls which are adapted processes on a given probability space
with filtration and Brownian motion. However usually the 'non-antieipative' controls
are adapted processes defined on some probability space with filtration and Brownian
motion which may change with u . In section two we see that inf J[u] is the same
over all the usual control classes, so that any extremal control is optimal under the
convexity hypothesis. In section four we apply this result to show that the extremal controls computed for several examples in [5] are in fact optimal. It is also useful in establishing ε-optimality for certain problems, c.f. section three and [6].
One final note. We shall always assume that the probability spaces are chosen so that the solution x of (1.3) is not just continuous w.p.1, but in fact all trajectories are continuous.
2. Control Classes. Various sets of controls have been introduced in the literature. We shall consider several of these here and observe that they all lead to the same infimum in (1.1).
Definition 2.4: U_M, the set of Markovian control laws, consists of all Borel measurable functions u : [0,T]×R^n → U for which there is a probability space (Ω,F,P) with filtration, which may depend on u and which carries two continuous adapted processes x, w such that w is a Brownian motion and (x(·), u(·,x(·)), w(·)) satisfy (1.3).
Definition 2.5: For fixed ν = (Ω,F,P,{F_t},w) write U_N(ν) for the corresponding controls in U_N. Similarly for U_M(ν), U_L(ν).
Remark 2.6. If u ∈ U_L, then according to [2], [10] there exists a Brownian motion (w̄_t, F_t^x) such that
    U_M ⊂ U_L ⊂ U_A ⊂ U_N ,
    U_M(ν) ⊂ U_L(ν) ⊂ U_N(ν) .
Remark 2.7. As we are assuming (1.4), then for u ∈ U_N or u ∈ U_N(ν), (1.3) has a unique, continuous adapted solution x^u. Gronwall's inequality, Burkholder's inequality and (1.4), (1.5) imply that there are constants K_q, K₄ such that
for any u ∈ U_N. Here y is the solution of (1.3) with initial condition y(0). Now (1.5) implies
    J[u] ≤ E { ∫ |ℓ(t,x^u,u(t))| dt + |c(x^u(T))| } ≤ (T+1)·K·(1 + K_q|x₀|^q) .
We also have
(ii)  f(t,x,u) = φ(t,x) + g(t,x,u) ,  σ(t,x) = σ₂(t,x) ,
with φ Borel, g(t,x,u) ∈ R^m, σ₂(t,x) ∈ R^{m×m}, g, σ, ℓ continuous, σ₂ bounded, invertible with bounded inverse, and with φ continuously differentiable in x with bounded derivative,
(iii)  dx = φ(t,x) dt + σ(t,x) dw
    ∫_T ∫_{R^n} p(s,x,t,y)^β dy dt < ∞
for some β > 1 and all s′ > s, c.f. [4], Corollary 3.9, where it is shown that (2.9) holds with U_A replaced by U_N.
We point out that if for each u ∈ U_N the solution of (1.3) is unique in law
(true under (2.11)), then for any π
and consequently
inf{J[u] : u ∈ U_N} = inf{J[u] : u ∈ U_N(π)} .
Finally we remark that if U is compact, then the equality of the inf over
U_M(π) and U_N(π) can be established even if f, σ, ℓ, c depend on the past of x
rather than just x(t). Moreover, inf J over the predictable controls also assumes
this common value.
3. Extremal Controls. We now define the extremal controls as those which satisfy
the condition of the maximum principle. We assume (1.4), (1.5), (2.11)(i).
where F[x(t)] is the σ-algebra generated by x(t). In this case we can write
p(t) = p̂(t, x(t)) for a measurable function p̂. We shall, from now on, use p and p̂
interchangeably, and in fact drop the ˆ,
where p is the adjoint process for û, and x̂ is the solution of (1.3) corresponding
to û.
The following results show that an optimal control is necessarily extremal in
certain cases, i.e. a Pontryagin-type maximum principle holds.
Theorem 3.5. Assume (1.4), (1.5), (2.11)(i). If û is optimal in U_N(π) for
some π, and if û is {F_t} adapted, then û is extremal.
Proof: This follows from [8], with perturbed controls u_{t,δ}(s) = E{û(t) | F_{t−δ}}.
Proof. This follows either from [7], [8] or from Remark 2.8 and the above. In
fact (2.9) implies that û is optimal in U_A, so the previous corollary gives the
result.
Remark 3.8. We know that under the hypotheses of Corollary 3.7, there exists û
optimal in U_N, cf. [8].
We are now interested in the converse question: if u is extremal, is it
optimal?
We define the Hamiltonian
H*(t,x,p) = sup_{u∈U} H(t,x,p,u) .
Theorem 3.9. Assume (1.4), (1.5), (2.11)(i). If (i) û ∈ U_N(π) is extremal,
where (x̂, û, w) is a solution of (1.3); if (ii)
where D_k : [0,T] → R^{n×n}, e_k : [0,T] → R^n are bounded and measurable; if (iii) c(·)
is convex; and if (iv) A ⊂ R^n is an open convex set such that for each t, w.p.1,
H*(t,x,p(t)) is concave in x for x in A; then for any α ∈ (0,1) and some
constant K_0 depending only on β and the bounds and growth constants of (1.4),
(1.5), (2.11)(i),
Φ(s,t) = Φ(s)Φ(t)^{-1} ,
+ [Σ_k D_k(t)e_k(t)]dt + σ(t) Σ_k e_k(t)dw^k ,
and
− f(t,x(t),u(t)) + Σ_k D_k(t)e_k(t)] + ℓ_x(t, x̂(t), û(t))x(t)dt
+ p(0)' Σ_k ∫_0^T Φ(t)e_k(t)dw^k
so that
Hence
= ∫_0^T −H_x(t, x̂(t), p(t), û(t))[x(t) − x̂(t)]
+ [H(t,x(t),p̂(t),u(t)) − H(t,x̂(t),p̂(t),û(t))] dt .
Since H and H_x are linear in p, and x, u are F_t adapted, p̂ can be
replaced by p when expectations are taken. Moreover, since H_x(t,x̂(t),p(t),û(t))
is a subgradient (in x) of H*(t,·,p(t)) at x̂(t) and H*(t,·,p) is concave on
A, then
J[û] − J[u] ≤ E 1_A ∫_0^T [−H*(t,x(t),p(t)) + H*(t,x̂(t),p(t))] dt
+ E(1 − 1_A) ∫_0^T −H_x(t,x̂(t),p(t),û(t))[x(t) − x̂(t)] dt
+ E ∫_0^T [H(t,x(t),p(t),u(t)) − H(t,x̂(t),p(t),û(t))] dt
≤ E(1 − 1_A) ∫_0^T { −H_x(t,x̂(t),p(t),û(t))[x(t) − x̂(t)]
+ [H(t,x(t),p(t),u(t)) − H(t,x̂(t),p(t),û(t))] } dt
where the last inequality follows by the definition of H*, and where 1_A(ω) = 1
if x^u(t,ω) ∈ A for almost all t ∈ [0,T], all u ∈ U_N(π), and 1_A(ω) = 0 otherwise.
Since c_x, ℓ_x satisfy a growth condition in x and since p̂ satisfies a linear
equation with bounded coefficients, E‖p‖^q < ∞ for any q < ∞. The growth
conditions, Remark 2.7, and Hölder's inequality now give the result.
Corollary 3.10. Under the assumptions of Theorem 3.9, if A = R^n (i.e. H* is
concave for all x) then û is optimal.
It is usually difficult to say when H* is concave, because for example if
n = 1 we cannot usually say that p has constant sign w.p.1; however we have the
following result.
so H* is concave in x.
Observe that if û ∈ U_M(π) (or one of the other classes) is extremal, then
it is extremal as an element of U_N(π), thus optimal in U_N(π) and so optimal in
U_M(π). If we have law uniqueness, then û is optimal in U_M.
Remark 3.13. The results of this section can be extended to the case where there
are constraints of the form E r(x(T)) = 0, if r satisfies the same conditions as
c. We say that u is extremal if E r(x(T)) = 0 and if (3.4) holds with p(t)
defined as
Remark 3.14. Suppose (3.12) holds and û is as in the theorem. In [6] we show how
to construct (approximations to) controls u^R which are extremal for solutions of
(1.3) when f, σ are altered on {|x| > R} to be bounded with bounded x derivative.
From Remark 2.7 it follows that given ε > 0, we can choose R such that for fixed
α ∈ (0,1)
But for R sufficiently large we observe that (for all u) |J[u] − J_R[u]| < ε/3,
where J_R[u] is defined by (1.2) with x given by the altered f, σ. Hence
J[u^R] ≤ inf{J[u] : u ∈ U_N(π)} + ε ,
i.e. u^R is ε-optimal in U_N(π) if ℓ, c are convex in x for each t.
4. Some Examples. In [5] we considered eight simple examples and exhibited
extremal controls in each case. Our aim here is to show that seven of these controls
are optimal. Sometimes it is necessary to add a convexity hypothesis.
U = R^p .
p(t,x) = −2E_{tx}{ x(T)'Q Φ(T,t) + ∫_t^T x(s)'M(s)Φ(s,t) ds } ,
dΦ(s,t) = A(s)Φ(s,t) ds ,
dx = (A + BK)x dt + σ dw .
so that û is extremal if
K(t) = −N(t)^{-1}B(t)'P(t) ,
P(t) = Φ_K(T,t)'Q Φ_K(T,t) + ∫_t^T Φ_K(s,t)'M(s)Φ_K(s,t) ds .
These last two equations imply that P satisfies the usual Riccati equation
Ṗ + A'P + PA − PBN^{-1}B'P + M = 0 ,
P(T) = Q .
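The Riccati equation can be integrated numerically backwards from the terminal condition P(T) = Q. A minimal sketch for a scalar system; the data A, B, M, N, Q and the horizon below are arbitrary demo choices, not taken from the examples of [5]:

```python
import numpy as np

# Backward Euler integration of the matrix Riccati equation
#   P' + A'P + PA - P B N^{-1} B' P + M = 0,  P(T) = Q,
# for an illustrative scalar system (all coefficients are demo values).
# The extremal feedback gain is then K(t) = -N^{-1} B' P(t).

A, B, M, N, Q = 1.0, 1.0, 1.0, 1.0, 1.0
T, steps = 1.0, 10000
dt = T / steps

P = Q  # terminal condition P(T) = Q
for _ in range(steps):
    # P' = -(A'P + PA - P B N^{-1} B' P + M); step from T toward 0
    dP = -(2 * A * P - P * B * (1 / N) * B * P + M)
    P -= dP * dt

K0 = -(1 / N) * B * P  # gain at t = 0
```

As the horizon grows, P(t) approaches the stabilizing root of the algebraic Riccati equation 2P − P² + 1 = 0, i.e. 1 + √2.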
Example 4.4. (Predicted Miss) f and σ are still given by (4.2) (with say, B,
σ continuous, σ invertible), but ℓ = 0, c(x) = k(v'x) where v is a constant
vector and k is convex, even, non-negative and continuously differentiable,
k(y) ≤ (1 + |y|^q). U = {u ∈ R^p : |u_i| ≤ 1}. In [5] we show that
û(t,x) = −sgn[B(t)'s(t)s(t)'x] is extremal, with s(t) given by
ds = −A(t)'s dt , s(T) = v .
Hence û is optimal in U_M or any of the other classes. Examples 3.3, 3.4 and 3.5
of [5] are treated similarly, if we add the hypothesis that ℓ(t,·), k(·) are
convex.
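Purely as an illustration (not part of [5]), the bang-bang predicted-miss law can be simulated for a hypothetical scalar system; every parameter below (A, B, σ, v, the horizon and the initial state) is invented for the demo:

```python
import numpy as np

# Illustrative sketch of the "predicted miss" control of Example 4.4:
#   u(t,x) = -sgn(B(t)' s(t) s(t)' x),  ds = -A(t)' s dt,  s(T) = v.
# Scalar demo data; with A = 0 the adjoint is simply s(t) = v.

rng = np.random.default_rng(0)
A, B, sigma, v, T, steps = 0.0, 1.0, 0.5, 1.0, 1.0, 200
dt = T / steps
s = np.full(steps + 1, v)  # s(t) = v * exp(A (T - t)) = v here

def run(controlled: bool) -> float:
    """Simulate dx = B u dt + sigma dw and return the miss |v' x(T)|."""
    x = 2.0
    for k in range(steps):
        u = -np.sign(B * s[k] * s[k] * x) if controlled else 0.0
        x += B * u * dt + sigma * np.sqrt(dt) * rng.standard_normal()
    return abs(v * x)

miss_ctrl = np.mean([run(True) for _ in range(500)])
miss_free = np.mean([run(False) for _ in range(500)])
# the bang-bang law should reduce the expected miss relative to u = 0
```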
Example 4.5.
dx = B(t)u(t)dt + σ(t)dw ,
min E ∫_0^T u'N(t)u dt , |u_i(t)| ≤ 1 ,
where B(t), N(t) are bounded, measurable and N(t) > 0. We add the constraint
E x(T) = α .
Then p(t) = λ' (λ is a constant column vector), and in the normal case, i.e. when
{J[u] : u ∈ U, E x(T) = α} is more than just a singleton, û is extremal if
û(t) = sat{N(t)^{-1}B(t)'λ} ,
α − x_0 = ∫_0^T B(t)û(t)dt .
References
[1] J.M. Bismut, Théorie probabiliste du contrôle des diffusions, Mem. Amer.
Math. Soc., 4 (1976), No. 167.
[3] U.G. Haussmann, General necessary conditions for optimal control of sto-
chastic systems, Math. Programming Stud., 9 (1976), pp. 30-48.
[7] N.V. Krylov, Controlled Diffusion Processes, Springer-Verlag, New York, 1980.
[8] H.J. Kushner, Necessary conditions for continuous parameter stochastic op-
timization problems, SIAM J. Control, 10 (1972), pp. 550-562.
Institut für Angewandte Mathematik
Universität Bonn
ABSTRACT
L_t = ½ ∫_0^t (W_s^1 dW_s^2 − W_s^2 dW_s^1)
1. INTRODUCTION
(1) E[exp(iξL_t)] = 1 / cosh(ξt/2)
and
(2) E[exp(iξL_t) | W_t = x] = (ξt/2)/sinh(ξt/2) · exp{ (|x|²/2t)(1 − (ξt/2)coth(ξt/2)) } ,
where |x| denotes the Euclidean norm of x, cf. [12, p. 173]. Lévy's proof
of Eq. (2) is based on the expansion of (standard) Brownian motions (W^i),
i = 1, 2, …, in a countable coordinate system, e.g.
W_t^i = (t/√π) Y_0^i + √(2/π) Σ_{m=1}^∞ (sin(mt)/m) Y_m^i , 0 < t ≤ π ,
where Y_0^i, Y_1^i, … are independent normally distributed random variables
with mean zero and variance 1, see [12] or [1, p. 261]. To obtain the
characteristic function of L_t, one then simply has to integrate (2) with
respect to the distribution of ρ_t := |W_t|². Actually, Lévy gave two
proofs of formula (1), the second one being based on the skew product re-
presentation of two-dimensional Brownian motion and the formula for the
characteristic function of
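Lévy's formula (1), E[exp(iξL_t)] = 1/cosh(ξt/2), is easy to check by Monte Carlo: approximate the stochastic area by its Itô sums on a grid and compare the empirical characteristic function with the closed form. A small sketch; t, ξ and the sample sizes are arbitrary demo choices:

```python
import numpy as np

# Monte Carlo check of Levy's formula (1): E[exp(i xi L_t)] = 1/cosh(xi t / 2),
# with L_t = (1/2) int_0^t (W^1 dW^2 - W^2 dW^1) approximated by Ito sums.

rng = np.random.default_rng(42)
t, xi = 1.0, 1.0
steps, paths = 200, 8000
dt = t / steps

dW = rng.standard_normal((paths, steps, 2)) * np.sqrt(dt)
W = np.cumsum(dW, axis=1) - dW  # left endpoints of each increment (Ito sums)
# stochastic area: (1/2) sum (W^1 dW^2 - W^2 dW^1)
L = 0.5 * np.sum(W[:, :, 0] * dW[:, :, 1] - W[:, :, 1] * dW[:, :, 0], axis=1)

estimate = np.mean(np.cos(xi * L))   # E[exp(i xi L)] is real by symmetry of L
exact = 1.0 / np.cosh(xi * t / 2.0)  # Levy's formula (1)
```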
then the process (L_t^{A,0}/2) is the same as Lévy's stochastic area (L_t).
We shall study the stochastic process (W_t, L_t^{A,x}) and, for fixed t > 0,
calculate its characteristic function. The proof of the result is in-
spired by work of R. Liptser and A. Shiryayev [13, vol. 2, p. 12] on
filtering of random processes.
As has been noticed by B. Gaveau [4], formulae like (1) and (2) yield
estimates as well as explicit expressions for fundamental solutions for
the generators of the diffusions (X_t, L_t^{A,x}), a class of hypoelliptic op-
erators which naturally arises in some problems in analysis and geometry.
So, in particular, when we choose x(t) ≡ 0, A(t) ≡ A, A skew-symmetric,
we thus give new proofs of some of the results, e.g. Theorems 4.2.1 and
4.3.1, obtained in [4] using the expansion of Brownian motion described
above. By the method used in [2] we can also give a formula for the fun-
damental solution of the standard sub-Laplacian of any simply connected
nilpotent Lie group of order 2. The expression found is further exploited
in the special case of 'generalised Heisenberg groups', a class of nil-
potent groups introduced by A. Kaplan [7]. By an elegant method Kaplan
has shown that the standard sub-Laplacian of these Lie groups admits fun-
damental solutions analogous to that known for the Heisenberg group.
u_t(z) is defined as
u_t(z) = z ∫_0^t H_z^{-1}(s)A(s)x(s) ds .
The main result of this note is the following formula for the joint
characteristic function of the two random variables X_t := W_t + x(t) and
L_t^{A,x}, t > 0. The proof, which is rather lengthy and will thus appear else-
where, is based on Girsanov's measure transformation technique and ana-
lytic continuation of the function
F(A,γ) = exp{ −½ ∫_0^T | ∫_s^T Φ*(r)[H_{iA}(r)A²(r)x(r)u_r^{(iA)}] dr |² ds
+ ½ ∫_0^T sp(Γ_A(s)) ds } .
Corollary 1. If x(t) ≡ 0, then
F(A,0) = exp{ ½ ∫_0^T sp(Γ_A(s)) ds } .
Corollary 5. If x(t) ≡ x, A(t) ≡ A, then
F(A,γ) = Π_{k=1}^{[d/2]} (cosh(λta_k))^{-1} exp{ −½ [ (O[γ + λAx])_{2k-1}² + (O[γ + λAx])_{2k}² ] } .
Remark 2. Lévy's formula (1) follows from (6) since for this example
a_1 = 1. Formula (2) is also derived from (6) by taking conditional ex-
pectations.
3. FUNDAMENTAL SOLUTIONS
dZ_t = σ_A(Z_t)dW_t , Z_0 = 0 ,
σ_A(z) = ( id ; (Az)' ) ,
with generator
Δ_A = ½ Σ_{i,j=1}^{d+1} (σ_A σ_A')_{ij}(z) ∂²/∂z_i∂z_j .
Corollary 4. The fundamental solution for Δ_A is given by
p_0(T;ω,0) = (2πT)^{-(1+d/2)} ( A / sinh(A) )^{[d/2]} … ,
Λ := λ^{-1} Σ_{i=1}^m λ_i A^{(i)} .
REFERENCES
[2] CYGAN, J.: Heat kernels for class 2 nilpotent groups, Studia Math.
64 (1979), pp. 227-238.
[4] GAVEAU, B.: Principe de moindre action, propagation de la chaleur
et estimées sous-elliptiques sur certains groupes nilpotents, Acta
Math. 139 (1977), pp. 95-153.
[8] LÉVY, P.: Le mouvement Brownien plan, Amer. Jour. Math. 62 (1940),
pp. 487-550.
[16] YOR, M.: Remarques sur une formule de Paul Lévy, Séminaire de
Probabilités XIV, Lect. Notes in Math. 784, pp. 343-346, Springer-
Verlag, Berlin, 1980.
ASYMPTOTIC NONLINEAR FILTERING
AND LARGE DEVIATIONS
Omar Hijab
(1) A^ε = f + (ε/2)(g_1² + ⋯ + g_m²)
(3) lim_{ε→0} ε log P^ε(G) ≥ −inf{ ½ ∫_0^T u² dt | x_u in G }
and
lim_{ε→0} ε log P^ε(C) ≤ −inf{ ½ ∫_0^T u² dt | x_u in C } .
In 1966 S.R.S. Varadhan set down a general framework [1] for dealing with the
asymptotic behavior of families of measures and certain associated expectations,
and in particular derived the above estimates for processes with independent in-
crements [1]. Subsequently, he derived these estimates for the case of drift-free
nondegenerate diffusions (i.e., f = 0) [2]. Later Glass [3] and Ventsel and
Freidlin [4] established these estimates for nondegenerate diffusions with drift.
lim_{ε→0} ε log Q^ε_{x|y}(C) ≤ −inf{ ½ ∫_0^T (u² + h(x_u)²) dt − ∫_0^T h(x_u) dy | x_u in C }
for almost all y in Ω^p ≡ C([0,T];R^p).
Let b(t) : Ω^m → R^m be given by b(t,ω) = ω(t) and impose Wiener measure on
Ω^m. Then t → b(t) = (b^1(t), …, b^m(t)) is an R^m-valued Brownian motion.
One way to construct diffusions on R^n governed by A^ε is to pick a point x_0
in R^n and to let t → x^ε(t) be the unique process Ω^m → Ω^n satisfying
for all φ in C_0^∞(R^n), 0 ≤ s ≤ t ≤ T, and x^ε(0) = x_0, almost surely on Ω^m. Here
g(φ)db is short for g_1(φ)db^1 + ⋯ + g_m(φ)db^m, where g_i(φ) is defined above.
Using the standard existence and uniqueness theorem for stochastic differential equa-
tions and Ito's differential rule, it is easy to show that there is a unique such pro-
cess defined up to an explosion time ζ^ε ≤ ∞, characterized by the fact that t → x^ε(t)
leaves every compact subset of R^n as t → ζ^ε, almost surely on {ζ^ε < ∞}.
The merit of the above definition of t → x^ε(t) is that it makes sense on any
manifold X. Indeed, the Whitney embedding theorem allows one to embed any such X
into some R^N and, by extending f, g_1, …, g_m to R^N, one can derive the result de-
scribed above on any manifold. Of course in R^n t → x^ε(t) is the "Stratonovich so-
lution". In any event, as ε → 0 the "correction factor" disappears and so estimates
(3) are expected to hold just as well for the diffusions t → x^ε(t) constructed here.
In what follows we are careful to state everything in such a way as to make sense on
any manifold X.
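The flavor of estimates (3) can be seen numerically on a hypothetical one-dimensional example, dx = −x dt + √ε db: the probability of a rare excursion collapses as ε shrinks, while ε log P^ε varies slowly. A sketch; the drift, the level a and the discretization are all demo assumptions:

```python
import numpy as np

# Small-noise diffusion dx = -x dt + sqrt(eps) db, x(0) = 0.
# Estimate P^eps(max_t x(t) >= a) for several eps and form eps * log(p):
# the probabilities decay exponentially in 1/eps, the rescaled logs do not.

rng = np.random.default_rng(1)
T, steps, paths, a = 1.0, 200, 20000, 0.5
dt = T / steps

def exit_prob(eps: float) -> float:
    x = np.zeros(paths)
    hit = np.zeros(paths, dtype=bool)
    for _ in range(steps):
        x += -x * dt + np.sqrt(eps * dt) * rng.standard_normal(paths)
        hit |= x >= a
    return float(hit.mean())

probs = {eps: exit_prob(eps) for eps in (0.4, 0.2, 0.1)}
rates = {eps: eps * np.log(p) for eps, p in probs.items() if p > 0}
```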
If T < ζ^ε then the probability distribution P^ε of t → x^ε(t) exists on Ω^n
and is the unique probability measure on Ω^n satisfying P^ε(x(0) = x_0) = 1 and
for all φ in C_0^∞(R^n) and 0 ≤ s ≤ t ≤ T. Here x(t) : Ω^n → R^n is the canonical
map and F_s is the σ-algebra generated by the maps x(r), 0 ≤ r ≤ s.
then one can show that the solution t → x^ε(t) of (5) explodes by time T, i.e.,
ζ^ε ≤ T, almost surely.
In what follows we shall assume (ii) and
Under assumptions (i), (ii) and (iii), estimates (3) hold for the measures {P^ε} con-
structed here [5]. To understand these estimates from a more general perspective,
consider the following definition [1].
Definition. Let Ω be a completely regular topological space and let P^ε, ε > 0, be
a family of probability measures on Ω. We say that {P^ε} admits large deviation if
there is a function I on Ω satisfying
(i) 0 ≤ I ≤ +∞.
(ii) I is lower semicontinuous on Ω.
(iii) {ω | I(ω) ≤ M} is a compact subset of Ω for all finite M.
Estimates (3) then state that (iv) and (v) hold for the probability distributions
of the diffusions t → x^ε(t), where I is given by
for all ω in Ω^n, with the understanding that the infimum of an empty set of real
numbers is +∞. Since (ii) is easy to derive and (iii) is the statement that u → x_u
Theorem 1.2. Let {P^ε} admit large deviation with corresponding I-functional I
and let φ^ε be a bounded continuous function on Ω such that φ^ε converges uniform-
ly to φ as ε → 0. Let Q^ε be given by
dQ^ε = e^{−φ^ε/ε} dP^ε .
lim_{ε→0} ε log Q^ε(G) ≥ −inf{ I(ω) + φ(ω) | ω in G } .
We note that for the results of Theorem 1.2 to hold it is not necessary that
holds [1].
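Theorem 1.2 is a Varadhan-type Laplace asymptotic: ε log ∫ e^{−φ^ε/ε} dP^ε → −inf(I + φ). This can be checked deterministically in one dimension with P^ε = N(0, ε), whose rate function is I(x) = x²/2; the function φ below is an arbitrary smooth demo choice:

```python
import numpy as np

# Numeric sketch of the Laplace asymptotic behind Theorem 1.2:
# for P^eps = N(0, eps) (rate function I(x) = x^2/2),
#   eps * log E[exp(-phi(X)/eps)]  ->  -inf_x { I(x) + phi(x) }.

phi = lambda x: (x - 1.0) ** 2          # demo choice of phi
x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]

def lhs(eps: float) -> float:
    # eps * log of the Gaussian integral, computed on a grid
    dens = np.exp(-x**2 / (2 * eps)) / np.sqrt(2 * np.pi * eps)
    return eps * np.log(np.sum(np.exp(-phi(x) / eps) * dens) * dx)

limit = -np.min(x**2 / 2 + phi(x))      # here: -1/3, attained at x = 2/3
vals = [lhs(e) for e in (0.1, 0.01, 0.001)]
```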
A(t) is then a measurable function on Ω^n × Ω^p for each t. Using (7) and invoking
the Cameron-Martin formula it is easy to see that
dP^ε_{(x,y)} = e^{−A/ε} d(P^ε_x × W^ε) ,
so that
(8) dP^ε_{x|y} = e^{−A/ε} dP^ε_x / E^ε_x(e^{−A/ε})
= dQ^ε_{x|y} / Q^ε_{x|y}(Ω^n) ,
and we refer to Q^ε_{x|y} as the unnormalized conditional distribution.
So far equation (8) holds for any process t → x^ε(t). Now suppose that P^ε_x
is governed by A^ε in the sense of equation (6), where A^ε is given by (1). For
any bounded measurable φ let
the "unnormalized conditional expectation of φ(x^ε(t)) given y^ε(s), 0 ≤ s ≤ t".
for 0 ≤ s ≤ t ≤ T. This last equation together with equations (6) and (8) and the
Ito product rule then yields
(1/ε) ∫_0^t (hφ) dy(r) .
We emphasize that this proof is valid for any locally bounded measurable h and any
generator A^ε of the form (1). This equation is well-known and appears in various
forms in the literature.
In the next section we study the asymptotic behaviour of Q^ε_{x|y} as ε → 0.
Note that for h = 0 this theorem reduces to estimates (3). The idea of the
proof is simple enough: apply Theorem 1.2 to Q^ε_{x|y} using the representation given
by equation (8). This however does not work directly, because the exponent A is not
a continuous function on Ω^n for each y in Ω^p. We therefore have to make a
slight detour and integrate by parts the stochastic integral appearing in A.
For each ε > 0 let φ^ε on Ω^n be given by
φ^ε(ω) = −y(T)h(ω(T)) + y(0)h(ω(0)) + ∫_0^T [ y A^ε(h)(ω) + ½ h(ω)² − ½ y g(h)(ω)² ] dt .
Then φ^ε → φ as ε → 0 uniformly on Ω^n for each y in Ω^p. Referring to
(8), performing an integration by parts in the stochastic integral appearing in
A, and invoking Girsanov's theorem, we see that
A^ε − y g_1(h)g_1 − ⋯ − y g_m(h)g_m .
We wish to apply Theorem 1.1 to {P^ε_{x:y}}. To do so we must check that assumptions (i),
(ii), (iii) of Section 1 hold for the vector fields
for all y in Ω^p, given that they hold for y = 0. For (i) this is obvious. For
(ii) this is also obvious, and for (iii) this is so because g_1(h), …, g_m(h) are
bounded feedback terms. Thus Theorem 1.1 applies to {P^ε_{x:y}} and hence Theorem 1.2
applies to Q^ε_{x|y} via equation (10).
Thus let x_{u:y} denote the unique path in Ω^n satisfying
Theorem 1.2 then implies that for any G open and C closed in Ω^n
REFERENCES
[4] A.D. Ventsel and M.I. Freidlin, "Small Random Perturbations of Dynamical Systems,"
Russian Math. Surveys, 25 (1970), pp. 1-56 [Uspehi Mat. Nauk 25 (1970), pp. 3-55].
Thomas G. Kurtz
Department of Mathematics
University of Wisconsin-Madison
Madison, Wisconsin 53706 USA
1. Introduction
By a counting process we mean a stochastic process N whose
sample paths are constant except for jumps of +1. The simplest
example is, of course, the Poisson process. Recall that the distribution
of the Poisson process is determined by specifying the intensity param-
eter λ which gives
and
(1.4) ∫_0^{τ_m(x)} λ(t,x) dt < ∞ , m = 1, 2, 3, …
Given an intensity function λ, the problem then becomes to
associate with it a counting process N satisfying (1.2). There
are a variety of ways of accomplishing this. Here we will specify a
stochastic equation for which N is the unique solution. For other
approaches see the books by Bremaud (1981) and Snyder (1975). All
= 1 − exp{ −∫_t^{t+Δt} λ(s, N(· ∧ t)) ds } ,
which is a precise version of (1.2). The fact that ∫_0^t λ(s,N)ds is
a stopping time also gives us the relation between the stochastic equation
(1.5) and the martingale approach described in Bremaud (1981). Since
Y(u) − u is a martingale, the optional sampling theorem implies
and
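The stochastic equation (1.5), N(t) = Y(∫_0^t λ(s,N)ds) with Y a unit-rate Poisson process, suggests a direct simulation: advance the internal clock ∫λ dt and insert a jump each time it passes an Exp(1) waiting time of Y. A sketch; the Euler step and the demo intensity are my assumptions, not part of the paper:

```python
import numpy as np

# Simulation sketch of N(t) = Y(int_0^t lambda(s, N) ds), Y a unit Poisson
# process: accumulate the internal clock and jump when it crosses the next
# Exp(1) arrival of Y.  lam may depend on the current count n.

rng = np.random.default_rng(7)

def simulate(lam, T: float, dt: float = 1e-3) -> int:
    """Euler discretization of the internal clock; returns N(T)."""
    t, n, clock = 0.0, 0, 0.0
    next_jump = rng.exponential(1.0)      # first arrival of Y
    while t < T:
        clock += lam(t, n) * dt           # d(clock) = lambda(t, N(t)) dt
        if clock >= next_jump:
            n += 1
            next_jump += rng.exponential(1.0)
        t += dt
    return n

# sanity check: constant intensity gives a Poisson(lam0 * T) count
lam0, T = 3.0, 2.0
counts = [simulate(lambda t, n: lam0, T) for _ in range(1000)]
mean_count = float(np.mean(counts))       # should be near lam0 * T = 6
```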
(a) Counter model. Let ρ > 0 and lim_{u→∞} ρ(u) = 0. The equation
then N^(n) ⇒ N.
and
(2.3) Ñ^(n)(t) = Y_1( ∫_0^t (λ ∧ λ^(n))(s, N^(n))ds ) + Y_2( ∫_0^t (λ − λ ∧ λ^(n))(s, N^(n))ds ) .
Note that λ^(n) = λ ∧ λ^(n) + λ^(n) − λ ∧ λ^(n) and λ = λ ∧ λ^(n) + λ −
λ ∧ λ^(n), and it follows by the multiparameter optional sampling the-
orem, Kurtz (1980), that
(2.4) N^(n)( t ∧ τ_m(N^(n)) ) − ∫_0^{t ∧ τ_m(N^(n))} λ^(n)(s, N^(n))ds
and
P{ Y_2( ∫_0^{T ∧ τ_m(N^(n))} |λ − λ^(n)|(s, N^(n))ds ) > 0 }
≤ E[ 1 − exp{ −∫_0^{T ∧ τ_m(N^(n))} |λ(s,N^(n)) − λ^(n)(s,N^(n))| ds } ] ,
Y_2( ∫_0^{T ∧ τ_m(N^(n))} (λ^(n) − λ ∧ λ^(n))(s, N^(n))ds ) > 0
if and only if
Y_2( ∫_0^{T ∧ τ_m(Ñ^(n))} (λ^(n) − λ ∧ λ^(n))(s, N^(n))ds ) > 0 .
and
V_n(t) = V_n(0) + … + ∫_0^t n^{1/2}( F(X_n(s)) − F(X(s)) )ds
= V_n(0) + … + ∫_0^t n^{1/2}( F(X(s) + n^{-1/2}V_n(s)) − F(X(s)) )ds
→ V(0) + … + ∫_0^t ∂F(X(s))V(s)ds .
(3.8) τ_n = inf{ t : φ(X_n(t)) ≤ 0 }
and
(3.9) τ = inf{ t : φ(X(t)) ≤ 0 } .
Then
and
(3.11) √n( X_n(τ_n) − X(τ) ) ⇒ V(τ) − ( ∂φ(X(τ))·V(τ) / ∂φ(X(τ))·F(X(τ)) ) F(X(τ)) .
(3.12) (d/dt) φ(X(t)) = ∂φ(X(t))·F(X(t))
implies that for ε > 0, φ(X(τ−ε)) > 0 and φ(X(τ+ε)) < 0.
Since the convergence of X_n to X is uniform on bounded time
intervals, φ(X_n(t)) → φ(X(t)) uniformly on bounded time intervals,
and it follows that τ_n → τ in probability.
Next note that φ(X(τ)) = 0 and, since φ(X_n(τ_n)) ≤ 0 and
φ(X_n(τ_n−)) ≥ 0,
(3.14) √n( φ(X(τ)) − φ(X(τ_n)) ) → ∂φ(X(τ))·V(τ) ,
V(τ) − ( ∂φ(X(τ))·V(τ) / ∂φ(X(τ))·F(X(τ)) ) F(X(τ)) .
(3.18) Is ! = nl ~ !
n nn
(3.19) ~i = n ~ i
(3.24) ∫_0^{Y_n(t)} I_n(s)ds = t , 0 ≤ t ≤ ∫_0^∞ I_n(s)ds
(3.27) τ_n = inf{ t : I_n(t) = 0 }
(3.30) S(t) = S(0)e^{−λt}
and
Consequently, τ is the solution of
X_n^i(t) = X^i(0) + n^{-1} Y_{1i}( n ∫_0^t λ_i ( 1 ∧ (1 + n^{-1}m_n − X_n^1(u) − X_n^2(u)) ) du )
− n^{-1} Y_{2i}( ∫_0^t μ_i S_n ∧ (nX_n^i(u)) du ) ,
where Y_{11}, Y_{21}, Y_{12}, Y_{22} are independent Poisson processes.
Assume that n^{-1}m_n = m + O(1/n) and n^{-1}S_n^i = S^i + O(1/n). Then
(3.40) sup_{u≤t} | X_n^i(u) − X^i(u) | → 0 in probability.
The "central limit theorem" does not quite apply since F_i is not
continuously differentiable. But the same proof gives the following.
Let
(3.41) V_n^i(t) = √n( X_n^i(t) − X^i(t) )
+ ∫_0^t λ_i √n ( 1 ∧ (1 + n^{-1}m_n − X_n^1(u) − X_n^2(u)) − 1 ∧ (1 + m − X^1(u) − X^2(u)) ) du
− ∫_0^t μ_i √n ( (n^{-1}S_n^i) ∧ X_n^i(u) − S^i ∧ X^i(u) ) du
− W_{2i}( ∫_0^t μ_i S^i ∧ X^i(u) du )
∫_0^t λ_i χ_{…}( X^1(u) + X^2(u) )( V^1(u) + V^2(u) ) ∨ 0
4. Fiber Bundles
In order to give another example of an argument similar to those
used in Section 3, we consider the simplest example of the fiber bundle
models studied by Phoenix and Taylor (1973) and Phoenix (1979). In
fact the particular result we give goes back to Daniels (1945).
We consider a bundle of n fibers under a load nL. We assume
that all fibers share the load equally (i.e. initially each fiber is
subjected to a load L). Under this load a number of fibers Nn(L)
will break leaving n - Nn(L) fibers to support the load, and hence
the load on each remaining fiber is nL/(n-Nn(L)) = L/(I-Xn(L)) where
Xn(L) = n-INn(L) is the fraction of fibers that have broken. Finally
assume that a fiber subjected to a load Z breaks with probability
F(Z) and that the fibers break independently of each other. We can
construct this model by associating with each fiber an independent random
variable ξ_k, uniformly distributed on [0,1]. If the k-th
fiber is subjected to a load ℓ, then it breaks if ξ_k ≤ F(ℓ).
Define the empirical process
(4.5) X(L) = F( L / (1 − X(L)) )
+ √n ( F( L / (1 − X_n(L)) ) − F( L / (1 − X(L)) ) )
and it follows that V_n(L) ⇒ V(L), where V(L) satisfies
(4.8) V(L) = W_B( F( L / (1 − X(L)) ) )
+ F'( L / (1 − X(L)) ) ( L / (1 − X(L))² ) V(L) ,
that is
(4.9) V(L) = (1 − X(L))² W_B( F( L / (1 − X(L)) ) ) / [ (1 − X(L))² − L F'( L / (1 − X(L)) ) ] .
Finally consider the maximum load the bundle will support, that is, the
maximum L for which (4.3) has a solution. Rewriting (4.3) we see
that
(4.11) L_n* = sup_u u( 1 − n^{-1}Y^{(n)}(F(u)) ) .
Similarly define
(4.13) √n( L_n* − L* )
− √n( L* − u(1 − F(u)) ) ] .
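Daniels' limit is easy to see by simulation: if s_(1) ≤ … ≤ s_(n) are the sorted fiber strengths drawn from F, the realized bundle strength is the maximum of s_(k) times the surviving fraction, the empirical version of (4.11), and it concentrates near L* = sup_u u(1 − F(u)). The choice F = Uniform[0,1], for which L* = 1/4, and the sample sizes are demo assumptions:

```python
import numpy as np

# Monte Carlo sketch of the fiber bundle maximum load (4.11):
#   L_n* = sup_u u (1 - n^{-1} Y^(n)(F(u))),
# computed via order statistics of the fiber strengths.
# Demo choice F = Uniform[0,1]: L* = sup_u u(1 - u) = 1/4 at u = 1/2.

rng = np.random.default_rng(3)

def bundle_strength(n: int) -> float:
    s = np.sort(rng.uniform(0.0, 1.0, n))        # sorted fiber strengths ~ F
    load_share = s * (n - np.arange(n)) / n       # u * (surviving fraction)
    return float(load_share.max())

strengths = [bundle_strength(2000) for _ in range(200)]
mean_strength = float(np.mean(strengths))         # Daniels: -> L* = 0.25
```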
References
16. Norman, M. Frank (1974). A central limit theorem for Markov pro-
cesses that move by small steps. Ann. Probability, 2, 1065-1074.
Harold J. Kushner
Division of Applied Mathematics
Brown University
Providence, Rhode Island 02912
ABSTRACT
I. Introduction
Assumptions
REFERENCES
P.L. Lions
Ceremade, Paris IX University
Place de Lattre de Tassigny
75775 Paris Cedex 16
France
I. Introduction :
Let us first describe briefly the type of problems we consider : the state
of the system we want to control is given by the solution of the following stochas-
tic differential equation :
y_x(0) = x ∈ Ō ,
(2) J(x,A) = E ∫_0^{τ_x} f(y_x(t),v(t)) exp[ −∫_0^t c(y_x(s),v(s))ds ] dt ,
where f(x,v), c(x,v) are real-valued functions on Ō × V and τ_x is the first
exit time of the process y_x(t,ω) from O. To simplify the presentation, we will
assume throughout this paper : for all φ = σ, b, c, f
(7) u = 0 on Γ .
Here and below A_v denotes the second-order elliptic operator (possibly degenerate)
defined by :
and the matrix a(x,v) is given by : a = ½ σσ^T .
A more precise relation between (5) and (6) is the following (see W.H. Fleming
and R. Rishel [16], A. Bensoussan and J.L. Lions [3], N.V. Krylov [21]) :
i) If u ∈ C²(O) then u solves (6) ; ii) If ũ ∈ C²(O) ∩ C(Ō) satisfies (6) and (7)
then ũ(x) = u(x) in Ō. Unfortunately this classical theory (consisting of verifi-
cation theorems) is not convenient since i) u is not in general C², and it may even
happen in simple examples that u is not continuous ! ; ii) no classical tools can
take care of (6), and this for several reasons : first, it is a fully nonlinear
equation, that is, the nonlinearity acts on the second derivatives of the unknown,
and second, it is a degenerate equation since a may not be positive definite.
(9) H(D²u, Du, u, x) = 0 in O ,
where H is specified to be :
H(u^{ij}, u^i, u, x) = sup_{v∈V} [ −Σ_{i,j} a_{ij}(x,v)u^{ij} − Σ_i b_i(x,v)u^i + c(x,v)u − f(x,v) ] ,
(10) H ∈ C(S_N × R^N × R × Ō) , and H(A,p,t,x) ≤ H(B,p,t,x) whenever A ≥ B ,
for all p, t, x
(the second part of (10) expresses the fact that (9) is elliptic).
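For intuition only (this is not from the paper), a one-dimensional analogue of (9) with two admissible drifts can be solved by a standard monotone finite-difference scheme combined with policy iteration; all data below (a, b(v), c, f, the grid) are demo choices:

```python
import numpy as np

# Finite-difference sketch of a 1-d HJB equation of the form (9):
#   sup_v [ -a u'' - b(v) u' + c u - f ] = 0  on (0,1),  u = 0 at both ends,
# with demo data a = 1/2, b(v) = v in {-1,+1}, c = 1, f(x) = x^2.
# Policy iteration: solve the linear equation for a fixed control, then
# update the control pointwise by the argmax of -v u'.

m = 201
x = np.linspace(0.0, 1.0, m)
h = x[1] - x[0]
a, c = 0.5, 1.0
f = x**2

def solve_policy(v: np.ndarray) -> np.ndarray:
    """Solve -a u'' - v u' + c u = f, u(0) = u(1) = 0, with upwinding."""
    A = np.zeros((m, m))
    rhs = f.copy()
    A[0, 0] = A[-1, -1] = 1.0
    rhs[0] = rhs[-1] = 0.0
    for i in range(1, m - 1):
        A[i, i - 1] = -a / h**2
        A[i, i + 1] = -a / h**2
        A[i, i] = 2 * a / h**2 + c
        if v[i] >= 0:                 # upwind discretization of -v u'
            A[i, i] += v[i] / h
            A[i, i + 1] -= v[i] / h
        else:
            A[i, i] -= v[i] / h
            A[i, i - 1] += v[i] / h
    return np.linalg.solve(A, rhs)

v = np.ones(m)
for _ in range(20):                   # policy iteration
    u = solve_policy(v)
    du = np.gradient(u, h)
    v = np.where(du > 0.0, -1.0, 1.0) # argmax of -v u' is v = -sign(u')
u_hjb = u
```

Because the scheme is monotone, the iterates decrease toward the value function, which lies below the cost of any fixed control.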
For x ∈ O , A ∈ S_N , ξ ∈ R^N , define
D^{2,+}φ(x) = { (A,ξ) ∈ S_N × R^N :
lim sup_{y→x} { φ(y) − φ(x) − (ξ, y−x) − ½(y−x, A(y−x)) } |y−x|^{-2} ≤ 0 } ,
D^{2,−}φ(x) = { (A,ξ) ∈ S_N × R^N :
lim inf_{y→x} { φ(y) − φ(x) − (ξ, y−x) − ½(y−x, A(y−x)) } |y−x|^{-2} ≥ 0 } .
Remark II.1 :
Definition II.1 :
Remark II.2 :
Proposition II.1 :
Remark II.3 :
H(D²u(x_0), Du(x_0), u(x_0), x_0) = 0 ,
u(x_0) = inf_A E{ ∫_0^{τ_{x_0} ∧ T} f(y_{x_0}(t), v(t)) exp[ −∫_0^t c(y_{x_0}(s), v(s))ds ] dt
+ u( y_{x_0}(τ_{x_0} ∧ T) ) exp[ −∫_0^{τ_{x_0} ∧ T} c(y_{x_0}(t), v(t)) dt ] } .
(1/T) E ∫_0^{τ_{x_0} ∧ T} f(y_{x_0}(t), v(t)) exp[ −∫_0^t c(y_{x_0}(s), v(s))ds ] dt → 0 ,
sup_A E (1/T) ∫_0^{τ_{x_0} ∧ T} { A_{v(t)} φ(x_0) − f(x_0, v(t)) } dt ≥ −ε(T) ,
where ε(T) → 0 as T → 0+. And we conclude by remarking that :
Corollary II.1 :
Let u be given by (5) :
i) We have : ∀ v ∈ V , A^v u ≤ f(·,v) in D'(O) ;
ii) If u ∈ W^{2,p}_{loc}(O) for some p > N then we have :
Remark II.4 :
An example due to I.L. Genis and N.V. Krylov [18] shows that {A^v u − f(·,v)}
may be a nonzero measure on O.
Remark II.5 :
All the results mentioned in this section are detailed and proved in P.L.
Lions [27].
As we just saw, the notion of viscosity solutions requires that some conti-
nuity of the value function is known. On the other hand, since we are dealing with
possibly degenerate diffusion processes, the question whether the value function
is continuous can be extremely difficult to settle (even in the case without
control ; see D.W. Stroock and S.R.S. Varadhan [47]). It turns out that there
exists a natural assumption which gives quite general results : we assume that
"the problem has a subsolution" :
i) ∀ A , w( y_x(t ∧ τ_x) ) exp[ −∫_0^{t∧τ_x} c ds ] + ∫_0^{t∧τ_x} f exp[ −∫_0^s c dσ ] ds
Remark III.1 :
Theorem III.1 :
i) J(·,A), u(x) are u.s.c. on Ō for all A ; u ≤ w , u ≥ u_ in Ō , where
Remark III.2 :
Let us mention that assumption (15) is discussed in P.L. Lions [26] for
deterministic control problems, and that if (15) (or some analogue of (15))
does not hold, everything may happen.
Remark III.3 :
This result improves various previous results obtained by J.L. Menaldi and
P.L. Lions [37].
Corollary III.1 :
In particular, if u = 0 on Γ then u ∈ C(Ō) and u = ũ in Ō.
Corollary III.2 :
If (15) holds and if we assume : λ ∈ (0,1] ,
∃ ν ∈ R / (x,v,ξ) ∈ R^N × R^N ,
β < 1 if α = 1 , λ = λ_0 ; β = α if λ > λ_0 ;
β = αλ/λ_0 if 0 < λ < λ_0 ; β = α if α < 1 , λ = λ_0 ;
All these results are detailed and proved in P.L. Lions [27] : they extend
various results obtained by J.L. Menaldi and P.L. Lions [37]. Let us finally men-
tion that some extensions of (15) are given in [27].
Again, the boundary and the possible degeneracy of the processes create
technical difficulties. A typical result that we may obtain is the following :
Theorem IV.1 :
Let u be given by (5). We assume :
(20) u ∈ C_b(Ō) , u = 0 on Γ ;
|w| ≤ ε on Γ .
Then if ũ is a viscosity solution of (6) satisfying (20), we have :
ũ(x) = u(x) in Ō .
Remark IV.1 :
In P.L. Lions [27], the proof of this result is given together with several
variants and extensions. Let us mention that if O = R^N then conditions (20)-(21)
are vacuous ; let us also indicate that it is possible to replace (21) by : there
exist Γ_1, Γ_2 relatively open subsets of Γ such that Γ = Γ_1 ∪ Γ_2 and on Γ_2,
σ(x,v) = 0 (∀ v) and either b(x,v) = 0 (∀ v) or b(x,v)·n(x) ≤ −ν < 0 (∀ v) ; and
for each ε > 0 there exists w_ε ∈ C¹(Ō) satisfying A^v w_ε ≤ f(·,v) + ε in O ,
|w_ε| ≤ ε on Γ_1. Then we may replace (20) by : u ∈ C_b(Ō) , u = 0 on
Γ_1.
V. Regularity results.
We will use below the following formulation of the fact that the controlled
processes really cross the boundary (or some part of it) :
Theorem V.1 :
Let ω be an open subset of O. Let p ∈ {1, …, N} ,
(27) such that :
for all ξ ∈ R^N ,
then u ∈ C^{2,δ}(ω') for any open set ω' ⊂⊂ ω , where δ ∈ (0,1) depends only on ω'
and {a(x,v)}_{x,v} ,
open set ω ⊂⊂ O ;
We have : w ≡ u in ω .
Remark V.1 :
Remark V.2 :
Many variants of Theorems V.1-3 are given in [27]. Let us just mention that
if O = R^N, (22) becomes vacuous.
Remark V.3 :
We indicate briefly a list of problems where the above results and methods
are used.
1. Similar problems : Let us mention that similar results hold for problems where
i) there is a pay-off when the process reaches Γ ; or ii) the coefficients are
unbounded in x ; or iii) we add optimal stopping time problems ; or iv) we consider
time-dependent diffusion processes and time-dependent HJB equations. Analogous
results hold and are proved by similar methods ; see P.L. Lions [27] for more
details.
2. Impulse control : Using similar methods, the case of impulse control problems
has been treated by B. Perthame [42].
3. Bifurcation and optimal stochastic control : In P.L. Lions [29], some bifur-
cation phenomena are interpreted in terms of optimal stochastic control. In
addition, the analogues of eigenvalues are introduced for HJB equations.
6. Asymptotic problems : The preceding results and methods can be used in the
proofs of various asymptotic results such as the simplification of large-scale
systems (see R. Jensen and P.L. Lions [20]), and the homogenization of optimal
stochastic control problems (see P.L. Lions, G. Papanicolaou and S.R.S. Varadhan
[40]).
7. Monge-Ampère equations : The results and methods given above have enabled us
recently to solve the classical Monge-Ampère equations (arising in Differential
Geometry) ; see P.L. Lions [35], [36]. The remark that Monge-Ampère equations are
indeed HJB equations is due to B. Gaveau [17].
Bibliography :
[7] M.G. CRANDALL and P.L. LIONS : Condition d'unicité pour les solutions
généralisées des équations de Hamilton-Jacobi du 1er ordre. Comptes-
Rendus Paris, 292 (1981), p. 183-186.
[9] M.G. CRANDALL, L.C. EVANS and P.L. LIONS : Some properties of viscosity
solutions of Hamilton-Jacobi equations. Preprint.
[14] L.C. EVANS and A. FRIEDMAN : Optimal stochastic switching and the Dirichlet
problem for the Bellman equation. Trans. Amer. Math. Soc., 253 (1979),
p. 365-389.
[15] L.C. EVANS and P.L. LIONS : Résolution des équations de Hamilton-Jacobi-
Bellman pour des opérateurs uniformément elliptiques. Comptes-Rendus
Paris, 290 (1980), p. 1049-1052.
[16] W.H. FLEMING and R. RISHEL : Deterministic and stochastic optimal control.
Springer, Berlin (1975).
[18] I.L. GENIS and N.V. KRYLOV : An example of a one dimensional controlled
process. Th. Proba. Appl., 21 (1976), p. 148-152.
[20] R. JENSEN and P.L. LIONS : Some asymptotic problems in optimal stochastic
control and fully nonlinear elliptic equations. Preprint.
[22] N.V. KRYLOV : Control of the diffusion type processes. Proceedings of the
International Congress of Mathematicians, Helsinki, 1978.
[23] N.V. KRYLOV : On control of the solution of a stochastic integral equation.
Th. Proba. Appl., 17 (1972), p. 114-137.
[25] N.V. KRYLOV : On equations of minimax type in the theory of elliptic and
parabolic equations in the plane. Math. USSR Sbornik, 10 (1970), p. 1-19.
[26] P.L. LIONS : Generalized solutions of Hamilton-Jacobi equations. Pitman,
London (1982).
[29] P.L. LIONS : Bifurcation and optimal stochastic control. Preprint ; see
also MRC report, University of Wisconsin-Madison (1982).
[30] P.L. LIONS : Contrôle de diffusions dans R^N. Comptes-Rendus Paris, 288
(1979), p. 339-342.
[31] P.L. LIONS : Control of diffusion processes in R^N. Comm. Pure Appl. Math.,
34 (1981), p. 121-147.
[35] P.L. LIONS : Une méthode nouvelle pour l'existence de solutions régulières
de l'équation de Monge-Ampère. Comptes-Rendus Paris, 293 (1981),
p. 589-592.
[36] P.L. LIONS : Sur les équations de Monge-Ampère. I, II and III. Preprint.
[37] P.L. LIONS and J.L. MENALDI : Optimal control of stochastic integrals and
Hamilton-Jacobi-Bellman equations. I, II. SIAM J. Control Optim., 20
(1982), p. 58-95.
[38] P.L. LIONS and B. MERCIER : Approximation numérique des équations de
Hamilton-Jacobi-Bellman. R.A.I.R.O., 14 (1980), p. 369-393.
[39] P.L. LIONS and M. NISIO : A uniqueness result for the semi-group associa-
ted with the Hamilton-Jacobi-Bellman operator. Preprint.
Petr Mandl
Department of Probability and Mathematical Statistics, Charles University
Sokolovská 83, 186 00 Prague 8, Czechoslovakia
1. The control problem
The topic was dealt with by several authors in Prague ([2] - [5]). They apply a martingale method, explained here on a specific model.
are to be u s e d .
the controls
zt: u(x;)• t~o•
suffice. If the risk of ruin is to be taken into account, (4) is changed into
where
τ = inf{t : C_t < 0} ,
z_t = ū(C_t, X_t) ,  t ≥ 0 ,
are needed.
q(i,j;z) = O, O~i~j~O.
z.q.a. Z~ zjqj
= j~o ) / (1 + j~o--A--)~j ,
= kct "
k
z_t^k = ū^k(t, C_t, X_t) ,  t ∈ [0,T] ,  k = 1,...,n.
2. Diffusion approximation
~q(i,J)u(i)) I[ i,j 6 I
Then
y + r ~ q(i,j;u(i))lw4j,s.u) d rFj4s) +
49)
+ rq4i,i;u41))w4i,y,u) - @4u) = O, iel.
410) t t d w4 s) d s , t-~O,
= Ct - ~ @s ds + w 4 t ) - w ( O ) - ~
O O
is the compensator of
t
4 w(s) -w4 s-) ).d N s , t~-O.
O
I'~'-q(i,j;u(i))~(w(j,s,u)-w(j,y,u))
2 + wz(j,s,u)]d rFj(s) +
jfi
(11)
+ rq(i,i;u(i))w2(i,y,u) - ~(u) = O, i ~ I.
we define a martingale
t t
Lt = I (W(S) - W(s-)lZd Ns - ~ ~*s ds +
0 0
t
+ w2(t} - u2(O) - ~ ~s w2(s)ds* t->O"
0
wz(J,y,u) : 0(~), r - - 9 ~0 .
Moreover,
t
E L~ • E S (u(s) - w(s-))4d Ns ~ O, rm>~O .
0
We conclude that
t
<M>t ~ I ~(;(S,Cs,.)) 2ds . t~O,
0
where
t
~ (~ ( ~ { S , C s , . ) ) 2ds, t-~O.
0
Consequently,
U_t = u ,  t ∈ [0,T] ,
where u is the maximizer of
h2
h 9(u) --,~-- (~(u) 2 = ~ (u).
j~i q(j,j;z) Z
3. Optimization
Using the diffusion approximation, the original problem was transformed into the optimization of a controlled Markov process Y_t, t ≥ 0, with differential generator

L_u = ½ σ(u)² d²/dy² + e(u) d/dy .

With regard to (2), (5) the aim is to achieve
(is) o = sup{L u V
u~ ~ 2 + e4u) - L V z} , Vz(O) = -h
4. References
P. Mandl: On aggregating controlled Markov chains. Contributions to Statistics (ed. J. Jurečková), pp. 137-156, Prague 1979.
ABSTRACT
State estimation problems for systems involving small parameters are treated by both
Lie algebraic and analytical approximation techniques. An asymptotic expansion for
the unnormalized conditional density corresponding to the case of observations of a
Gauss-Markov process through a (weak) polynomial nonlinearity is computed and a
convergence result is derived. The convergence result is based on arguments used
recently to prove existence and uniqueness and to estimate the tail behavior of
solutions to nonlinear filtering problems with unbounded coefficients. The expansion is related to certain approximations of the associated estimation Lie algebra. Lie algebraic methods are used to compute finite dimensional filters for the terms in the asymptotic expansion.
I. INTRODUCTION
dx_t = f(x_t) dt + g(x_t) dw_t
(1.1)
dy_t = h(x_t) dt + dv_t ,  0 ≤ t ≤ T < ∞
(1.2)
du(t,x) = (L*u)(t,x) dt + h(x) u(t,x) dy_t
Over the past five years or so a considerable effort has been devoted to the search for (recursive) finite dimensional "representations" in terms of various "sufficient statistics" of either the solution of (1.2) or other conditional statistics (such as the conditional mean). This effort, which has produced few such estimators [3]-[6], has nevertheless led to a major improvement in the understanding of the estimation problem and to the tools available for its treatment. The latter include algebraic methods [7]-[10] and the use of certain transformations which simplify the analysis of the existence and uniqueness of solutions of (1.2) (see, e.g., [11]-[14] and the collection [1]). The algebraic methods are especially useful for classifying equivalences of finite dimensional filters, for indicating when no finite dimensional filters exist, and for facilitating the computation of conditional statistics. In those cases where no finite dimensional representations exist the available methods must be redirected to the construction of consistent and useful approximate filters. This is the central theme of this paper.
We consider in this paper the specific subclass of the problems (1.1) in which f is linear, g is constant, and h is a polynomial of degree k > 1; for convenience of
In Section 3, we use Lie algebraic methods (in particular, the Wei-Norman representation [18]-[20]) to derive recursive finite dimensional filters for the terms in the asymptotic expansion. This procedure shows clearly that the problem of whether the expansion is a true (not merely formal) asymptotic expansion is equivalent to that of existence, uniqueness, and regularity of the conditional density in the original problem. These points were also made in [15]; here, however, we are able to get a little further since existence and uniqueness questions for nonlinear filtering problems are now better understood than was the case four years ago (see, e.g., [11], [13], [14]). The existence of the asymptotic expansion in an appropriate norm is stated and discussed in Section 4.
A preliminary version of this work appeared in [27]. Related results have been obtained independently with totally different methods by Sussmann [28].
2. LIE ALGEBRAIC METHODS
A Lie algebra L(Σ) can be associated with each filtering problem (1.1), (1.2), and it is now widely recognized that the realizability of L(Σ) or its quotients with vector fields on a finite dimensional manifold is related to the existence of finite dimensional recursive filters (this idea is originally due to Brockett [7]). The problems (1.3) do not in general admit such realizations, suggesting that for these problems no statistic of the conditional density can be computed with a finite dimensional recursive filter.
or in Fisk-Stratonovich form,
is the smallest vector space of differential operators containing the two operators in (2.3) and closed under the Lie bracket [D₁, D₂] = D₁D₂ - D₂D₁.
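As a concrete illustration of this bracket, the commutator of d/dx and multiplication by x is the identity operator. The sketch below (our own illustration, not from the paper) represents polynomials as coefficient lists and applies the two operators directly:

```python
# Commutator of two differential operators, [D1, D2] = D1 D2 - D2 D1,
# acting on polynomials stored as coefficient lists (p[k] = coeff of x^k).
def d_dx(p):  # derivative operator
    return [k * p[k] for k in range(1, len(p))] or [0.0]

def mul_x(p):  # multiplication-by-x operator
    return [0.0] + list(p)

def bracket(D1, D2, p):  # apply [D1, D2] to the polynomial p
    a, b = D1(D2(p)), D2(D1(p))
    n = max(len(a), len(b))
    a += [0.0] * (n - len(a))
    b += [0.0] * (n - len(b))
    return [x - y for x, y in zip(a, b)]

p = [3.0, 0.0, 2.0, 5.0]            # 3 + 2x^2 + 5x^3
result = bracket(d_dx, mul_x, p)
print(result)                        # → [3.0, 0.0, 2.0, 5.0], i.e. p itself
```

The output equals p: [d/dx, x] acts as the identity operator, the Heisenberg relation that generates the algebra W₁ appearing in Proposition 1.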
Proposition 1
Let M be a f i n i t e dimensional manifold, and let V(M) be the Lie algebra of smooth
vector fields on M. There is no non-zero homomorphism of Lie algebras from Wl into
V(M).
Combined with the results of [21], [22], this result implies that for k=3 and ε ≠ 0, fixed, no non-zero statistic of the conditional distribution for (2.1) can be computed (exactly) with a recursive finite dimensional filter. Of course, for ε = 0, the Lie algebra L_{3,0} = L₀ has basis {L* - ½x², x, d/dx, 1}, the Lie algebra for the Kalman problem; and in this case there is a two-dimensional sufficient statistic that can be evaluated recursively. Thus, as ε passes from zero to ε ≠ 0, the filtering problem moves from the simplest to the most difficult class. Hazewinkel [17] has shown similar behavior for k=2; it is suspected that the behavior of the other problems (1.3) with k ≥ 4 is similar.
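For the ε = 0 case just described, the two recursively computable statistics are the conditional mean and variance of the Kalman filter. A minimal sketch (model and parameters our own, not from the paper): for dx = dw, dy = x dt + dv the conditional variance P(t) solves the Riccati equation dP/dt = 1 - P², which is driven to its steady state P = 1 from any initial condition:

```python
# Conditional variance of the scalar Kalman-Bucy filter for the toy model
# dx = dw, dy = x dt + dv (our choice): dP/dt = 1 - P^2, steady state P = 1.
n_steps, dt = 20000, 1e-3
P = 2.0                       # arbitrary initial variance
for _ in range(n_steps):
    P += (1.0 - P * P) * dt   # Euler step of the Riccati equation
print(abs(P - 1.0) < 1e-6)    # → True: P has reached the steady state
```

The conditional mean then evolves as dm = P (dy - m dt); together (m, P) form the two-dimensional sufficient statistic mentioned above.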
To treat the case ε ≠ 0 small, it is natural to consider expansions of the conditional density and statistics in powers of ε. This was done for several problems in [15], [16]. Approximations from a Lie algebraic point of view were considered in [17]; this can be summarized as follows. Let W₁(ε) = R⟨x, ε, d/dx⟩ be the Lie algebra of differential operators with coefficients that are polynomials in x and ε. Thus, W₁(ε) has a basis {e_{ijl} = ε^i x^j d^l/dx^l ; i, j, l = 0, 1, ...}. (Here we regard ε as a "variable".) The associated estimation algebra L_{kε} may be regarded as a sub-algebra of W₁(ε). Using the notation from [17], we define L_{kε} mod ε^n as the Lie algebra obtained from L_{kε} by setting ε^i = 0 for i ≥ n. (A more precise definition is given in [17].) In [17] it is shown that L_{kε} mod ε^n is finite dimensional for each k, n.
du₀(t,x) = L*u₀(t,x) dt + x u₀(t,x) dy_t^ε
(2.5)
u₀(0,x) = p₀(x)
[(2.6) and the matrix display (2.7) are illegible in the source; (2.7) couples u₀, ..., u_n through the operator L* - ½x² on the diagonal, terms -x^{k+1} and -½x^{2k} below it, and observation terms x and x^k multiplying ∘dy_t^ε]
Define, for i = 1,...,n+1, the (n+1) × (n+1) matrix E_i which has zero entries everywhere except for 1's in the sub-diagonal positions (i,1), (i+1,2), ..., (n+1, n+2-i) (i.e., on a particular sub-diagonal); note that E₁ is the identity matrix. Also, denoting U(t,x) = [u₀(t,x), ..., u_n(t,x)]', we can rewrite (2.7) as
Note that the fundamental solutions of (2.5) and (2.6) are the same for every l = 1,2, .... The fundamental solution is just the unnormalized conditional transition density of the Kalman problem with y_t^ε, 0 ≤ t ≤ T, appearing as a parameter. The density u₀(t,x) is, of course, Gaussian. Consider (2.6) in the case l = 1; clearly, u₁(t,x) can be found by convolving a Gaussian density with a polynomial (x^k) times a Gaussian. This observation has been made elsewhere for related problems [15], [16]. This integral can be computed in closed form. Moreover, the several "moments" which appear in the expression can be evaluated recursively by finite dimensional equations. Such finite dimensional filters for the solution of (2.5)-(2.8) are derived by Lie algebraic methods in the next section. In the construction of formal asymptotic expansions,
Lie algebraic methods serve to organize, simplify, and explicate the structure of the expansions, as well as to lead to finite dimensional filters. However, it is also important to prove that (2.4)-(2.6) is a true asymptotic expansion in an appropriate norm; we indicate how to do this in Section 4.
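The closed-form convolution of a Gaussian with a polynomial times a Gaussian, noted above for u₁, can be sanity-checked numerically. In the sketch below (all parameters and names our own), completing the square in the product of the two Gaussian densities yields a Gaussian in x of variance a² + b² multiplied by a second moment:

```python
# Check that  int gauss(x - z, a) * z^2 * gauss(z, b) dz  has the closed form
# gauss(x, sqrt(a^2 + b^2)) * (mu^2 + s2), where mu = b^2 x / (a^2 + b^2)
# and s2 = a^2 b^2 / (a^2 + b^2)  (completing the square in z).
from math import exp, pi, sqrt

def gauss(x, s):  # centered Gaussian density with standard deviation s
    return exp(-x * x / (2 * s * s)) / (s * sqrt(2 * pi))

a, b = 1.0, 0.8          # our test parameters
c2 = a * a + b * b       # variances add under convolution

def conv_num(x, h=1e-3, lim=8.0):
    # trapezoidal approximation of the convolution integral
    n = int(2 * lim / h)
    total = 0.0
    for i in range(n + 1):
        z = -lim + i * h
        w = 0.5 if i in (0, n) else 1.0
        total += w * gauss(x - z, a) * z * z * gauss(z, b)
    return total * h

def conv_closed(x):
    mu = b * b * x / c2        # mean of the z-Gaussian after completing
    s2 = a * a * b * b / c2    # the square; s2 is its variance
    return gauss(x, sqrt(c2)) * (mu * mu + s2)

err = abs(conv_num(0.6) - conv_closed(0.6))
print(err < 1e-9)    # → True
```

For higher powers x^k the same device reduces the integral to moments of a Gaussian, which is what makes the terms of the expansion finitely computable.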
Denote by L(k, n+1, ε) = {A₀, xE₁ + x^k E₂}_LA the Lie algebra of (2.8), where A₀ = (L* - ½x²)E₁ - x^{k+1}E₂ - ½x^{2k}E₃. The following theorem describes the structure of L(k, n+1, ε); then we use Lie algebraic methods to solve (2.8) in terms of a finite number of recursively computable statistics.
Theorem 1: The Lie algebra L(k, n+1, ε), for ε ≠ 0, consists of A₀ and elements which are linear combinations of the matrix differential operators
Since L(k, n+1, ε) is solvable and finite dimensional, (2.8) can be solved in terms of a finite number of recursively computable statistics by the method of Wei and Norman [4], [18]-[20]. The calculation, which can be rigorously justified in this case as in [4], [20], proceeds as follows. Assume that the solution of (2.8) can be written in the form
where {A_i; i = 0,...,d} is a basis for L(k, n+1, ε), U₀(x) = (p₀(x), 0, ..., 0)', and {g_i; i = 0,...,d} are real-valued functions of t and y^ε to be determined. Substituting (3.1) into (2.8), using the identity

(3.2)   e^{tA_i} A_j = Σ_{k≥0} (t^k / k!) (ad_{A_i}^k A_j) e^{tA_i} ,   0 ≤ i, j ≤ d

(where ad⁰_A B = B, ad^{k+1}_A B = [A, ad^k_A B]), and equating the coefficients of each basis element A_i, one obtains a set of stochastic differential equations driven by y^ε for {g_i; i = 0,...,d}. Since L(k, n+1, ε) is solvable, there is an ordering of a basis for which the representation (3.1) is globally valid. This method was used to solve the
DMZ equation for the Kalman filtering problem by Ocone [4], [20]; this is precisely equation (2.5) for u₀ (which corresponds to L(k, 1, ε)). Indeed, the representation (3.1) for the first component of U(t,x) is
which is the form obtained by Ocone. The other components of U(t,x) are given by
u_i(t,x) = b_i(t,x) u₀(t,x)   (3.4)
B_{rs}(t) x^r d^s/dx^s ,  r + s ≤ i(k+1). The form of the equations is most easily seen by considering an example.
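The identity (3.2) underlying these manipulations can be checked directly. The sketch below (our own verification, not from the paper) uses a nilpotent matrix A, for which both the exponential and the ad-series are exact finite sums:

```python
# Verify  e^{tA} B = (sum_k t^k/k! ad_A^k B) e^{tA}  for a nilpotent A,
# where ad_A B = [A, B] = AB - BA.  With A^3 = 0 (3x3 strictly lower
# triangular), exp(tA) = I + tA + (tA)^2/2 and ad_A^5 = 0, so both sides
# are exact finite sums.  Matrices are nested lists; A, B are our choices.
from math import factorial

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def mat_scale(c, X):
    return [[c * x for x in row] for row in X]

def bracket(X, Y):  # Lie bracket [X, Y]
    return mat_add(mat_mul(X, Y), mat_scale(-1.0, mat_mul(Y, X)))

def exp_nilpotent(X):  # exact exponential for X with X^3 = 0
    I = [[1.0 * (i == j) for j in range(len(X))] for i in range(len(X))]
    return mat_add(I, mat_add(X, mat_scale(0.5, mat_mul(X, X))))

t = 0.7
A = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # nilpotent
B = [[0.5, 1.0, 0.0], [0.0, 0.3, 2.0], [1.0, 0.0, 0.1]]

lhs = mat_mul(exp_nilpotent(mat_scale(t, A)), B)

series = [[0.0] * 3 for _ in range(3)]
ad_k = B                                   # ad_A^0 B
for k in range(5):                         # ad_A^k B vanishes for k >= 5
    series = mat_add(series, mat_scale(t ** k / factorial(k), ad_k))
    ad_k = bracket(A, ad_k)
rhs = mat_mul(series, exp_nilpotent(mat_scale(t, A)))

err = max(abs(l - r) for lr, rr in zip(lhs, rhs) for l, r in zip(lr, rr))
print(err < 1e-12)    # → True
```

In the Wei-Norman calculation the identity is applied to basis elements A_i of the finite dimensional solvable algebra L(k, n+1, ε), so all the brackets stay inside that algebra.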
Example 1 (the weak quadratic sensor): Consider the problem (1.3) with k=2 and a=0, and the first order approximation (n=1 in (2.7)-(2.8)); thus (2.7)-(2.8) becomes
Substituting the representation (3.1) into (3.5), and using (3.2) repeatedly, the following differential equations for {g_i; i = 0,...,9} are obtained (the Lie algebraic manipulations, similar to those in [19], were performed with the aid of the MACSYMA symbolic manipulation program at M.I.T.; for details, see [30]):
g₀(t) = t
dg₁(t) = cosh t ∘ dy_t^ε
Thus the zeroth order approximation of u^ε is just the unnormalized conditional density of the Kalman filter, and the first order approximation is u₀(t,x) + εu₁(t,x), where u₀ and u₁ are computed from the nine y^ε-dependent statistics g₁, ..., g₉. A straightforward but involved off-line calculation is required to obtain these equations (this is greatly facilitated by MACSYMA), but the only on-line calculations are the simple equations (3.6) for the g_i's and the calculation of u₀ and u₁ as memoryless functions of the g_i's; this constitutes a finite dimensional recursive filter for u₀ + εu₁. It is also clear that finite dimensional filters for the zeroth and first order approximations of the normalization factor ∫u^ε(t,z)dz, the normalized conditional density p(t,x), and the conditional mean x̂(t|t) can be obtained as functionals of g₁, ..., g₉. The on-line computation is the same; only the off-line calculation of the form of the functionals is different. These calculations and some numerical experiments to test the performance of these filters will be reported elsewhere.
4. ASYMPTOTIC EXPANSION RESULTS
To justify the use of algebraic methods on equation (2.2) and its formal asymptotic expansion, it must be proved that (2.2) has a unique solution within an appropriate class of functions, and that the formal expansion (2.4)-(2.6) is in fact an asymptotic expansion in an appropriate norm. Existence and uniqueness results are proved in [13], [14] by first applying certain exponential transformations to (2.2), and then using the "classical" theory of PDE's [24], [25]. Here we concentrate on outlining the proof that (2.4)-(2.6) is an asymptotic expansion.
Consider the system (2.2), (2.4)-(2.6) and let θ^ε(t,x) = u^ε(t,x) - u₀(t,x). Then θ^ε satisfies the Stratonovich stochastic PDE
(4.3)
+ H^ε(x) W^ε(t,x) ∘ dy_t^ε
where
(4.4)
Introducing
(4.5)   V^ε(t,x) = exp[-H^ε(x) y_t^ε] W^ε(t,x)
the first component V₁^ε of the 2-vector V^ε satisfies a forced parabolic PDE (for each Hölder continuous path of y^ε)
(4.6)   ∂_t V₁^ε = ½ ∂_xx V₁^ε + m^ε(t,x) ∂_x V₁^ε + n^ε(t,x) V₁^ε + ε f^ε(t,x) u₀(t,x)
(4.7)
[formulas for n^ε(t,x) and ε f^ε(t,x) u₀(t,x): garbled in the source; they involve h^ε_x, h^ε_{xx}, y_t^ε, (h^ε)², the gauge factors e^{-h^ε y_t^ε} and e^{-x y_t^ε}, and the operator L* - ½x²]
The analysis of (4.6) proceeds as follows (see [14]). Note first that the term -½(h^ε(x))² dominates the potential n^ε(t,x) as |x| → ∞. Thus, the fundamental solution Γ^ε(t,x;s,z) of (4.6) "falls off" like exp[-β|x|^{k+1}] for some β ∈ (0,1) as |x| → ∞, and similarly as |z| → ∞. The forcing function in (4.6) has f^ε(t,x) = p^ε(t,x) e^{-x y_t^ε}, where p^ε(t,x) is a polynomial in x with bounded coefficients (on ε > 0, t ∈ [0,T] for each fixed path y_t^ε). Thus, if the initial data p₀^ε(x) is bounded, then (4.6) has a well-defined solution

(4.8)   V₁^ε(t,x) = ∫ Γ^ε(t,x;0,z) [p₀^ε(z) - p₀⁰(z)] dz + ∫₀ᵗ ∫ Γ^ε(t,x;s,z) ε f^ε(s,z) u₀(s,z) dz ds

Proof: The assumptions (A1), (A2) guarantee that (4.6) has a unique solution V₁^ε(t,x) (using the results of Besala [25] and Kryzanskii [26]), and that |V₁^ε(t,x)| → 0 as |x| → ∞. (The assumption on p₀⁰(x) guarantees that u₀(s,z) behaves like a Gaussian exp[-δz²] as |z| → ∞.) The estimates (4.10), (4.11) follow from Cauchy's inequality applied to (4.8). QED.
From (4.5) we see that

(4.12)   θ^ε(t,x) = (e^{h^ε(x) y_t^ε} - e^{x y_t^ε}) V₂^ε(t,x) + e^{h^ε(x) y_t^ε} V₁^ε(t,x)

Thus θ^ε is O(ε) in the L¹-norm of (4.11), since the first term in (4.12) consists of powers of ε multiplied by moments of a Gaussian distribution and V₂^ε falls off like exp[-β x^{k+1}]. Using a similar proof, it can also be shown that u^ε(t,x) - Σ_{i=0}^n ε^i u_i(t,x) = O(ε^{n+1}) in the same norm. In addition, it is straightforward to justify the asymptotic expansions of p(t,x) and x̂(t|t), which are computed as indicated at the end of Section 3.
CONCLUSIONS
It remains to perform some numerical experiments to test the usefulness of the asymptotic expansions in specific contexts. Work on this is now under way and will be reported elsewhere. In this connection it is worth noting that the truncated asymptotic expansions (2.4) do not always produce densities. Starting from a different point, based on the fact that u^ε(t,x) is a density, one can construct asymptotic expansions whose truncated versions are always positive. However, the resulting expansions may have very poor convergence properties. This work will be reported in [31].
ACKNOWLEDGEMENTS
The third author would like to thank W.E. Hopkins and J.S. Baras for discussions on
the results in Section 4. The calculations in Section 3 were performed with the aid
of the MACSYMA program of the Mathlab Group at M.I.T., which is supported, in part,
by the Department of Energy and the National Aeronautics and Space Administration.
REFERENCES
G. M A Z Z I O T T O and J. S Z P I R G L A S
I-i Preliminaries
J=I/2t
II- O p t i m a l s t o p p i n g on ~ 2
Zp+
1. = t" on {Z~ = t} N {Jt = E(Jt"/~t) }
III- O p t i m a l s t o p p i n g on ~ 2
Bibliography
IN A CONVEX BOUNDED DOMAIN
I. INTRODUCTION
sup[A(v)u - f(·,v) : v ∈ V] = 0 in O ,
(1.3)
∂u/∂n = 0 on ∂O ,
where A(v) = -½ tr(σσ* ∇²) - g·∇ + c , ∇ is the gradient operator, n denotes the outward unit normal to ∂O, tr(·) is the trace and * denotes the transpose.
In this way, the initial stochastic control problem (1.2) is connected to the
Hamilton-Jacobi-Bellman equation with Neumann boundary conditions (1.3).
We characterize the optimal cost function u(x) as the unique stationary
function of the corresponding nonlinear semigroup. Also, under some conditions,
we construct an optimal Markovian control.
(*) This research has been supported in part by the University of Paris-Dauphine and I.N.R.I.A., France.
2. PRELIMINARIES
Let O be an open, convex and bounded set in ℝⁿ, and V be a compact convex set in ℝᵐ. We suppose that
f : ℝⁿ × V → ℝ ,   c : ℝⁿ × V → ℝ₊
Notice that if the convex set O has smooth boundary, the normal reflected diffusion defined by the S.V.I. (2.2) becomes the classical one. The existence of such a reflected diffusion in (2.2) is deduced from Tanaka [27], Lions, Menaldi and Sznitman [16], and [21].
Now, for any admissible system 𝒜 and any real measurable bounded function h(x), we define a functional

(2.3)   J(x,𝒜,t,h) = E[ ∫₀ᵗ f(y(s),v(s)) φ(s) ds + h(y(t)) φ(t) ] ,
        φ(t) = exp( - ∫₀ᵗ c(y(s),v(s)) ds ) ,

Denote by C_b(Ō) the space of all real functions on Ō which are uniformly continuous and bounded, endowed with the supremum norm ‖·‖.(1) Then as in Nisio [22,23], Bensoussan and Lions [1], Lions and Menaldi [14,15], we can prove that (Q(t), t ≥ 0) is a nonlinear semigroup acting on C_b(Ō); precisely, we have the following properties:
uniformly on K ,
(2.11)   A(v) = -½ tr( σσ* ∂²/∂x² ) - g ∂/∂x + c .
where
(1) Since the domain is bounded, we just need continuous functions.
(2.16)   [Q^ε(t)h](x) = inf[ J(x,𝒜,t,h) : 𝒜 ] ,
(2.17)   u^ε(x) = inf[ J_x(𝒜) : 𝒜 ] .
for some constant δ > 0, and we also suppose that we cannot exert any control near the boundary, i.e.,
Lemma 2.1
Let v(x) be a measurable function from Ō into V. Then under assumptions (2.1), (2.18), (2.19), there exists an admissible system 𝒜 = (Ω, F, F_t, P, w(t), v(t), y(t)) such that v(t) = v(y(t)) for all t ≥ 0.
3. CHARACTERIZATIONS
We have
.Theorem 3.1
We asstmle (2.1). Then the function u defined b_y (2.13) is the unique
solution of the problem
where J is the functional given b y (2.3) and 8 stands for any stoppin~ time
associated to the admissible system ~ .
Theorem 3.2
Let assumptions (2.1), (2.18), (2.19) hold. Suppose also that
Then the optimal cost u defined by (2.13) is the unique solution of problem (1.3)
in the Sobolev space W2'~(O) . Moreover, there exists an optimal admissible
system which is Markovian feed-back.
The existence of an optimal control follows from the fact that u is W 2'=
and the Lemma 2.1.
Remark 3.1
Using the technique of Lions and Menaldi [15] and [19], the optimal cost u can be characterized as the maximum subsolution in an appropriate sense.
Remark 3.2
We can obtain the C^{2,α}(Ō)-regularity of the solution u by using a recent result of Evans [4].
On the other hand, we have
Theorem 3.3
Under assumption (2.1) we have the following convergence
where u^ε, u are given by (2.17), (2.13) respectively. Moreover, for any function h in C_b(ℝⁿ) we have, as ε → 0,
where Q^ε(t), Q(t) are the nonlinear semigroups defined by (2.16), (2.4),
the limit being uniform in v(·) and x in Ō. The processes y^ε(·), y(·) are given by the stochastic equations (2.14), (2.4).
Theorem 3.4
Let δ be any positive constant. Then under hypothesis (2.1) there exists a function u such that
Remark 3.3
In almost all of this paper, assumptions (2.1), (2.18), (2.19) can be relaxed. We will have quite similar results.
Remark 3.4
We can extend all results to the parabolic case.
Remark 3.5
With analogous techniques, we can consider the case where we add an impulsive control in the system 𝒜.
REFERENCES
[5] L.C. EVANS and A. FRIEDMAN, Optimal Stochastic Switching and the Dirichlet Problem for the Bellman Equation, Trans. Am. Math. Soc., 253 (1979), pp. 365-389.
[6] L.C. EVANS and S. LENHART, The Parabolic Bellman Equation, Nonlinear Analysis, (1981), pp. 765-773.
[7] L.C. EVANS and P.L. LIONS, Résolution des Equations de Hamilton-Jacobi-Bellman pour des Opérateurs Uniformément Elliptiques, C. R. Acad. Sc. Paris, A-290 (1980), pp. 1049-1052.
[8] W.H. FLEMING and R. RISHEL, Deterministic and Stochastic Optimal Control, Springer-Verlag, New York, 1975.
[11] N.V. KRYLOV, Controlled Diffusion Processes, Springer-Verlag, New York, 1980.
[12] N.V. KRYLOV, Some New Results in the Theory of Controlled Diffusion Processes, Math. USSR Sbornik, 37 (1980), pp. 133-149.
[13] P.L. LIONS, Sur Quelques Classes d'Equations aux Dérivées Partielles Non Linéaires et leur Résolution Numérique, Thèse d'Etat, Université de Paris VI, 1979.
[14] P.L. LIONS and J.L. MENALDI, Problèmes de Bellman avec le Contrôle dans les Coefficients de Plus Haut Degré, C. R. Acad. Sc. Paris, A-287 (1978), pp. 409-412.
[15] P.L. LIONS and J.L. MENALDI, Control of Stochastic Integrals and Hamilton-Jacobi-Bellman Equation, Parts I and II, SIAM J. Control Optim., 20 (1982), pp. 58-95. See also Proc. 20th IEEE CDC, San Diego, 1981, pp. 1340-1344.
[16] P.L. LIONS, J.L. MENALDI and A.S. SZNITMAN, Construction de Processus de Diffusion Réfléchis par Pénalisation du Domaine, C. R. Acad. Sc. Paris, A-292 (1981), pp. 559-562.
[17] J.L. MENALDI, On the Optimal Stopping Time Problem for Degenerate Diffusions, SIAM J. Control Optim., 18 (1980), pp. 697-721. See also C. R. Acad. Sc. Paris, A-284 (1977), pp. 1443-1446.
[18] J.L. MENALDI, On the Optimal Impulse Control Problem for Degenerate Diffusions, SIAM J. Control Optim., 18 (1980), pp. 722-739. See also C. R. Acad. Sc. Paris, A-284 (1977), pp. 1499-1502.
[19] J.L. MENALDI, Sur le Problème de Temps d'Arrêt Optimal pour les Diffusions Réfléchies Dégénérées, C. R. Acad. Sc. Paris, A-289 (1979), pp. 779-782. See also J. Optim. Theory Appl., 36 (1982), to appear.
[20] J.L. MENALDI, Sur le Problème de Contrôle Impulsionnel Optimal pour les Diffusions Réfléchies Dégénérées, C. R. Acad. Sc. Paris, A-290 (1980), pp. 5-8. See also Mathematicae Notae, 28 (1982), to appear.
[24] M.V. SAFONOV, On the Dirichlet Problem for the Bellman Equation in a Plane Domain, Math. USSR Sbornik, 31 (1977), pp. 231-284 and 34 (1978), pp. 521-526.
[25] V.A. SHALAUMOV, On the Behavior of a Diffusion Process with a Large Drift Coefficient in a Half Space, Theory Prob. Appl., 24 (1980), pp. 592-598.
E ~(~t ) Y = ~t'
•. ~ t - m a r t i n g a l e . Then
(1)
(3.5) ^ - ~0 - ~0 ~ fs ds
mt = ~t is a ~ u y- martingale and from
process and
∫₀ᵗ D_s ds = ⟨v,n⟩_t , where ⟨·,·⟩_t denotes the quadratic variation.
In case h_t and n_t are independent there is an essential simplification using the innovations property.
For example, one would have the representation
(3.10)   h_s = h(ξ_s)
5. Zakai Equation
For later use (and throughout the rest of the paper) we shall
consider the weak form of the Zakai equation in Stratonovich form:

(5.5)   dp_t = (L* - ½ h²) p_t(y) dt + h p_t(y) ∘ dY_t ,

The ideas of this section are due to CLARK [1978], DAVIS [1980], and MITTER [1980].
There is as yet no theory of nonlinear filtering where the
observations are:
(6.1)   y_t = h(ξ_t) + η_t ,
(6.2)   dY_t = h(ξ_t) dt + dN_t , or
(6.3)   Y_t = ∫₀ᵗ h(ξ_s) ds + N_t .
then this filter does not accept the physical observation y. The idea is to construct a suitable version of the conditional expectation so that the performance of the filter, as measured by the mean-square error, remains nearly unchanged when the physical observation 'y' is replaced by the mathematical model of the observation.
This is most easily done by eliminating the stochastic integral in
(5.5) by a suitable transformation (gauge transformation in the
language of physicists).
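In outline (a standard computation, sketched here in scalar notation for definiteness), the transformation works because Stratonovich calculus obeys the ordinary chain rule: setting q(t,x) = e^{-y_t h(x)} p(t,x) in (5.5) cancels the ∘dy_t term, leaving a PDE in which the observation path enters only through the coefficients.

```latex
% Gauge transformation removing the stochastic integral from (5.5):
% let q(t,x) = e^{-y_t h(x)} p(t,x).  Stratonovich differentials obey the
% ordinary chain rule, so
\begin{aligned}
  dq &= e^{-y_t h}\,dp - h\,e^{-y_t h}\,p \circ dy_t \\
     &= e^{-y_t h}\Big[(L^* - \tfrac12 h^2)p\,dt + h\,p \circ dy_t\Big]
        - h\,e^{-y_t h}\,p \circ dy_t \\
     &= e^{-y_t h}\,(L^* - \tfrac12 h^2)\big(e^{y_t h} q\big)\,dt .
\end{aligned}
```

Thus q satisfies an ordinary parabolic PDE, of the form (6.5) below, for each continuous observation path y.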
Define q_t(x;y) by

(6.5)   ∂q/∂t = ½ a(x) ∂²q/∂x² + b^y(x,t) ∂q/∂x + v^y(x,t) q

where a(x) = g²(x) ,  b^y = -f + y_t a (dh/dx) + ½ (da/dx) ,
1 2 dh )2 df dh 1 d2a
vY = ½ h2 - Yt Lh + 2 Yt a(~-~ dx + Yta~ + 2 dx 2
How can one answer the question when two filtering problems have
identical solutions? How can one decide whether the Zakai equation
admits a finite-dimensional statistic?
The starting point of this analysis is the Zakai equation (5.5) in Stratonovich form. Consider the two operators:
A = L* - ½ h²
9. Final Remarks.
REFERENCES
1. Allinger, D., and Mitter, S.K., [1981] New Results on the Innovations Problem of Nonlinear Filtering, Stochastics, vol. 4, pp. 339-348.
13. Feynman, R.P., and Hibbs, A.R., [1965] Quantum Mechanics and Path Integrals, McGraw Hill, New York.
14. Fleming, W.H., and Mitter, S.K., [1982] Optimal Control and Nonlinear Filtering for Nondegenerate Diffusion Processes, to appear, Stochastics.
15. Fujisaki, M., Kallianpur, G., and Kunita, H., [1972] Stochastic Differential Equations for the Nonlinear Filtering Problem, Osaka J. of Mathematics, Vol. 9, pp. 19-40.
16. Hazewinkel, M., and Marcus, S., [1982] On Lie Algebras and Finite Dimensional Filtering, to appear in Stochastics.
20. Kac, M., [1951] On Some Connections Between Probability Theory and Differential and Integral Equations, Proc. 2nd Berkeley Symposium Math. Stat. and Prob., pp. 189-215.
29. Ocone, D., Baras, J.S., and Marcus, S., [1982] Explicit Filters for Certain Diffusions with Nonlinear Drift, to appear, Stochastics.
33. Sussmann, H., [1981] Rigorous Results on the Cubic Sensor Problem, in WILLEMS-HAZEWINKEL, loc. cit.
Note on uniqueness of semigroup associated
Makiko Nisio
and n-vector γ(x, u) satisfy the following conditions (A1) and (A2).
(1.2)   S(t,x,φ) = sup_U E_x[ ∫₀ᵗ e^{-∫₀ˢ c(X(θ),U(θ))dθ} f(X(s),U(s)) ds + e^{-∫₀ᵗ c(X(θ),U(θ))dθ} φ(X(t)) ] ,
where X(t) is the response for U and x its starting point. Now we assume (A3),
(v) generator: D(G) = {φ ∈ C: (1/t)(S(t)φ - φ) converges in norm, as t → 0} = C² , where C² = {φ ∈ C; ∂φ/∂x_i , ∂²φ/∂x_i∂x_j ∈ C} ,
Gφ = sup_{u∈Γ} [ L^u φ + f(x,u) ] , for φ ∈ C²
where L^u = Σ a_ij(x,u) ∂²/∂x_i∂x_j + Σ γ_i(x,u) ∂/∂x_i - c(x,u) I and a = ½ σσ* , I = identity.
In §3 we apply these results to S(t) and show that under some con-
problem,
inequality
Define Ah by
(2.6)   dW(t)/dt = A_h W(t) ,   W(0) = φ .
For φ ∈ D(A),
Hence we see
uniquely.
is an extension of A, is unique, if it exists.
the following conditions.
h, such that
-< sup
- ~
UC U L
E.~ e-
.~j~
o
I c(X(O),
o
U(O))de
G~(X(s))ds
h, so that
ξ(0) = x
has a weak solution by (A4) [4]. Although u(ξ(t)) may not be F_t(B)-measurable.
S(t)~(x) -
>Ejj ~- c(~(e), u(~(e))de c(~(o) u(~(O))dO
f(~(8), u(~C~))ds + e- ~(~(t))'
o
h, such that
(3.9)   (1/t)(S(t)φ(x) - φ(x)) - g(x) ≥ -‖h‖ - 1 , for x ∈ ℝⁿ , t < t₀ ,
differential equations
(3.10)   dX(t) = σ(X(t), U(t)) dB(t) + γ(X(t), U(t)) dt ,   X(0) = x
and
Therefore we have, as ε → 0,
that
(3.15)   ‖ (1/t)(S^ε(t)φ - φ) - Gφ ‖ < δε + ε' , for t < t₀ and ε < 1.
[5].
sup_u [ |∂²h(x,u)/∂x_i∂x_j| + |∂h(x,u)/∂x_i| ] < ∞ ,   i, j = 1,...,n,
are an integer m, and
u₁, ..., u_m ∈ Γ ,  θ₁, ..., θ_m ∈ [0,1] ,  Σ_{i=1}^m θ_i = 1 , such that
Appealing to Propositions 1 - 3, we have
Cauchy problem in C,
+ e^{-∫₀ᵗ c(X(θ),U(θ))dθ} φ(X(t)) .
~* is given by
Using the same method as in §3, we can get similar results. That
is, define G* by
of Cauchy problem in C,
References
408-410.
1977 (Japanese).
India, 1981.
PDE with random coefficients :
R. BOUC - E. PARDOUX
Introduction
given an unbounded linear operator A(z) on some Hilbert space H (in practical
examples, A(z) will be a PDE operator), such that the following Cauchy problem is
well - posed :
+ UER de Mathématique, Université de Provence,
dimension of the space variable in equation (0.2) will often be too large for
dimension n + d. This is the price one has to pay for the fact that (O.I) is non-
U
linear in (zt). The above result exploits the linearity with respect to U t t
t
Unfortunately, one is typically interested in cases where d is large - the
(0.3)   du_t + A u_t dt = B u_t dw_t
(0.4)   dū_t/dt + A ū_t = 0 ,   ū₀ = E[u₀]
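The passage from (0.3) to (0.4) can be illustrated on a scalar toy version (all parameters ours): for du + a u dt = b u dw in the Itô sense, the dw term has zero mean, so m(t) = E[u_t] solves dm/dt + a m = 0, i.e. m(t) = e^{-at} u₀. A Monte Carlo sketch:

```python
# Scalar analogue of (0.3)-(0.4):  du + a u dt = b u dw  (Ito).  Taking
# expectations kills the martingale term, so E[u_t] = e^{-a t} u_0.
# Euler-Maruyama simulation with our own toy parameters.
import random
from math import exp

random.seed(0)
a, b, u0, T = 1.0, 0.5, 1.0, 1.0
n_steps, n_paths = 200, 10000
dt = T / n_steps

total = 0.0
for _ in range(n_paths):
    u = u0
    for _ in range(n_steps):
        dw = random.gauss(0.0, dt ** 0.5)
        u += -a * u * dt + b * u * dw    # one Euler-Maruyama step
    total += u
mc_mean = total / n_paths

print(abs(mc_mean - exp(-a * T) * u0) < 0.02)
```

For (0.1) itself the noise enters through the coefficient B(Z_t) rather than a martingale increment, so E[u_t^ε] cannot be obtained this simply; hence the averaging analysis that follows.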
The result of this paper is that if in some sense the noise entering (O.I)
that Z t takes values in a (non compact) euclldean space. Our main concern here is
solving only PDEs with space variable of the same dimension as that of (0.I).
Similar results would hold for higher order moments. The details of proofs of the
(I) at least in principle, since the coefficients in the equation become more
We replace (0.1) by :

(1.1)   du_t^ε/dt + A u_t^ε + (1/ε) B(Z_{t/ε²}) u_t^ε = 0 ,   u₀^ε = u₀

1.a  The process Z_t
L = ½ Σ_{i,j} a_ij(z) ∂²/∂z_i∂z_j + Σ_i b_i(z) ∂/∂z_i

we suppose that the coefficients a_ij(·) and b_i(·) are measurable and bounded (1),
L*p = 0 ,
(1.3)   p^{-1} Σ_j ∂/∂z_j (a_ij p) ∈ L^∞(ℝ^d) ,   i = 1,...,d
We add the following hypothesis (which is not necessary, but does simplify
the exposition):
(1.4)   p^{-1} L*(p ·) = L
(1) All the results that follow would be true with minor changes, in the case
E[ φ(Z₀) | Z_t = z ] = E[ φ(Z_t) | Z₀ = z ]
x ∈ ℝⁿ (we consider (1.1) as an equation in the whole space, without boundary conditions). We denote by |·| the norm of L²(ℝⁿ), ‖·‖ the norm of H¹(ℝⁿ), (·,·) and ⟨·,·⟩ the scalar product in L²(ℝⁿ) and pairing between H¹(ℝⁿ) and H⁻¹(ℝⁿ). We assume that :
Under some additional hypothesis, which are essentially those that we will
introduce in the next sections, one can show that u E converges in law to u, the
(1) This means in fact that the same equality holds for each coefficient of the
du_t + [ A - ∫₀^∞ E( B(Z₀) B(Z_s) ) ds ] u_t dt = dM_t(u)

where M_t(u) is a continuous L²(ℝⁿ)-valued martingale, such that ∀φ ∈ L²(ℝⁿ),
e.g. BLANKENSHIP - PAPANICOLAOU [1]. The difficulty due to the PDE can be taken care of by using techniques from VIOT [8]. We also need here the results in §IV and V below.
E[u_t^ε] = ∫ q^ε(t,z) dz

where q^ε solves :

∂q^ε/∂t + A q^ε + (1/ε) B q^ε = (1/ε²) L* q^ε ,   q^ε(0) = p u₀

v̄^ε(t,z) s.t.
q^ε(t,z) = p(z) v̄^ε(t,z)

(3.1)   ∂v̄^ε/∂t + A v̄^ε + (1/ε) B v̄^ε = (1/ε²) L v̄^ε ,   v̄^ε(0) = u₀

v̄^ε(t,z) = E[ u_t^ε | Z_{t/ε²} = z ]
At that point, we need to introduce one complication. There are two time scales in our problem : t and t/ε². We now have to introduce the "fast time variable" explicitly, in order to treat the initial layer. That is, we introduce

(3.2)   ∂v^ε/∂t + A v^ε + (1/ε) B v^ε + (1/ε²)( ∂v^ε/∂τ - L v^ε ) = 0 ,   v^ε(0,0,z) = u₀
conditions are missing - but this does not matter, since (3.2) will be used only
∫ θ^ε(t,τ,z) p(z) dz = 0 , so that :

E[u^ε_t] = ∫ v^ε(t, t/ε², z) p(z) dz = ū^ε(t, t/ε²)

ū^ε is the quantity of interest. Our problem comes from the fact that we cannot compute ū^ε without computing the whole function v^ε(t,τ,z).
∫ θⁱ(t,τ,z) p(z) dz = 0 , so that :
The reason why this expansion is useful is that we will get "explicit" expressions for the θⁱ's, and the ūⁱ's will be solutions of PDEs with state variable x.
We now replace v^ε in (3.2) by the right hand side of (3.3), and equate the coefficients of the successive powers of ε to zero. Let us recall that what follows is not a rigorous derivation. We are going to "guess" from (3.4)...(3.7) the equations that the ūⁱ's should satisfy
∂v¹/∂τ − L v¹ + B ū⁰(t) = 0

we then deduce :

∂θ¹/∂τ − L θ¹ + B ū⁰(t) = 0 ,   θ¹(t,0,z) = 0

(3.9)   θ¹(t,τ,z) = − ( ∫₀^τ E_z[B(Z_s)] ds ) ū⁰(t)
(3.10)   ∂ū²/∂τ + ∫ B θ¹ p dz + dū⁰/dt + A ū⁰ = 0
we expect that all quantities should have limits as τ → +∞ , and that :

∂ū²/∂τ (t,∞) = 0

(3.11)   dū⁰/dt (t) + [ A − ∫₀^∞ E(B(Z_0)B(Z_s)) ds ] ū⁰(t) = 0 ,   ū⁰(0) = u_0
expectation of the limit in law of u^ε_t. The difference between A and the operator
∂ū²/∂τ + [ ∫_τ^∞ E(B(Z_0)B(Z_s)) ds ] ū⁰ = 0 ,   ū²(t,0) = 0

we can make this choice because we do not really want to compute ū², but only ū⁰

ū²(t,τ) = − [ ∫₀^τ dσ ∫_σ^∞ E(B(Z_0)B(Z_s)) ds ] ū⁰(t)

θ²(t,0,z) = 0
By similar arguments, using the above expressions for θ¹, ū² and θ², we obtain :

(3.12)   … + [ ∫∫ E(B(Z_s)B(Z_σ)B(Z_0)) ds dσ ] ū⁰(t) = 0 ,   ū¹(0) = 0
and one can express ū³ and θ³ in terms of ū⁰ and ū¹. Again, we do not compute the "true" value of ū³, but just a quantity which will be necessary for estimating

ū^ε(t) − ū⁰(t) − ε ū¹(t) .
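The passage from (1.1) to the averaged equation (3.11) can be illustrated numerically in the simplest scalar case. In the sketch below all concrete choices are ours (A = a scalar, B(z) = ±b driven by a symmetric two-state Markov chain with unit jump rates, so that ∫₀^∞ E(B(Z_0)B(Z_s)) ds = b²/2); with this structure E[u^ε_t] solves a closed 2×2 linear system:

```python
import numpy as np

def mean_u(eps, a=1.0, b=0.5, t=1.0, u0=1.0):
    """E[u^eps_t] for du/dt + a u + (1/eps) B(Z_{t/eps^2}) u = 0, B = +/-b,
    Z a symmetric two-state chain: m_k(t) = E[u^eps_t ; Z_{t/eps^2} = k]
    satisfies a closed 2x2 linear ODE, solved here by eigendecomposition."""
    Q = np.array([[-1.0, 1.0], [1.0, -1.0]])          # generator of Z
    M = -a * np.eye(2) - (b / eps) * np.diag([1.0, -1.0]) + Q.T / eps**2
    w, V = np.linalg.eig(M)
    m0 = np.array([u0 / 2, u0 / 2])                   # Z_0 uniform
    m_t = V @ (np.exp(w * t) * np.linalg.solve(V, m0))
    return float(np.real(m_t.sum()))

# Averaged equation (3.11) here reads d u0/dt + (a - b^2/2) u0 = 0.
u_bar = np.exp(-(1.0 - 0.5**2 / 2) * 1.0)
errs = [abs(mean_u(eps) - u_bar) for eps in (0.2, 0.1, 0.05)]
```

As ε decreases, E[u^ε_1] approaches the solution u₀ e^{−(a − b²/2)t} of (3.11), consistent with ū⁰ being the limit of E[u^ε_t].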
∫₀^∞ E_z[φ(Z_t)] dt

(4.1)   ∫ φ(z) p(z) dz = 0

∫₀^∞ E_z[φ(Z_t)] dt
This will be true if, for some norm ‖·‖, (4.1) implies that ∃ ε > 0 s.t. :
Doeblin's condition (see e.g. PAPANICOLAOU [6]) would imply (4.2) with ‖·‖ equal
But Doeblin's condition cannot be true here, essentially because Z_t takes values
L̃² ≝ L²(ℝ^d ; p(z)dz). We denote by |·|~ and (·,·)~ the norm and scalar product on L̃². Consider L as an unbounded operator on L̃², with domain D(L). For u ∈ D(L), one gets :
unbounded operator on L̃².

∀ z ∈ ℝ^d with |z| ≥ M ,

(4.4)   b(z)·z / |z| ≤ − c
Then 0 is an isolated point of Sp(L).
[]
The proof of the Theorem uses a criterion due to H. Weyl (see RIESZ-NAGY [7]), which says that the conclusion of the theorem is equivalent to the non-existence of a sequence (u_n) with :

|u_n|~ = 1 ,   u_n → 0 in L̃² weakly

L u_n → 0 in L̃² strongly
Remark : Suppose for simplicity that d = 1, and define the following unbounded operator on L²(ℝ) :

A v = − √p L( v/√p )

A has the same spectrum as −L on L̃², and A = −(1/2) d²/dz² + V, where

V(z) = (1/2)( b² + b' )
Under the hypothesis (4.4), one easily shows that if f ∈ L̃²₀, then the Poisson equation

(5.1)   L W + f = 0

has the solution :

W(z) = ∫₀^∞ E_z[f(Z_t)] dt

It will prove crucial for us to have conditions on f under which the solution W is bounded.
Again, Doeblin's condition would ensure that f bounded implies W bounded. This is

(i)   f ∈ L^∞_loc(ℝ^d)

belongs to L̃²₀
[]
not imply the theorem. Indeed, a smooth function equal to log|z| for |z| > 1
It is then easy to prove that equations (3.11) and (3.12) have unique solutions belonging to :

L²(0,T ; H¹(ℝⁿ)) ∩ C([0,T] ; L²(ℝⁿ))
And from the regularity assumptions on the coefficients of A and B, one can check
that any partial derivative of any order with respect to the zi's has the same
Define :
vI JJ(t,%2)
where, for i = 1,2,3, vⁱ(t,τ) = ūⁱ(t,τ) + θⁱ(t,τ), and these are the quantities
Lemma 6.1   A sufficient condition for (6.1) to hold is that ∀ t > 0, ∃ C(t) ∈ ℝ₊ s.t. :
Now O E satisfies :
Sketch of proof : It follows from the hypotheses of the Theorem, using Theorem
L χ(x,z) = ℓ(x,z)

χ(x,·) ∈ L̃²₀ ,   ∀ x ∈ ℝⁿ

satisfies :

(6.6)   χ ∈ L^∞(ℝ^{n+d}) ;   ∂χ/∂x_i ∈ L^∞(ℝ^{n+d}) ,   i = 1,…,n
One can then get, using (4.3) and standard PDE techniques :
J]Ye (t)'~9 ~ + ~ (YC (t)'X Ye (t))~2 (L2 (iRd))~k( t)!'~g(s, E)"22(H- | (~d)) ds
L- (L-(]R~))9
If B = 0, then one can get estimates for y^ε(t) uniformly in ε, using the maximum principle, thus avoiding the restrictive assumption of Theorem 6.2.  □
Bibliography
Gauthier-Villars (1972)
Stanley R. Pliska
Northwestern University
Evanston, IL 60201/USA
Discrete time Markov decision chains are usually defined in terms of a Markov
transition matrix. A less common approach, but one that is more useful for appli-
cations, is to formulate the model in terms of a state transition function, where
the next state is a function of the current state, the current action, and an exogenous random variable.
For most applications these exogenous random variables (one for each period)
have an explicit, physical interpretation. Moreover, in the case of Markov decision
chains, they are independent and identically distributed. A natural and important
generalization, therefore, and the subject of this paper, is the stochastic decision
process that results when these exogenous random variables are not independent and
identically distributed, but rather they comprise a general stochastic process.
Upon making this generalization, the underlying process being controlled may
become non-Markovian. A few authors have studied non-Markovian stochastic control
models. Davis [2] and Rishel [6] studied very general, continuous-time models.
Discrete-time models were presented by Dynkin [3] and Gihman and Skorohod [4]. The
stochastic decision model studied here is considerably more structured and less ab-
stract than any of these. Indeed, as mentioned above, it is only a modest, yet significant, generalization of the state-transition-function kind of Markov decision chain model.
After formulating the stochastic decision model in Section 1, its potential usefulness as a practical tool is illustrated with the brief presentation of five different applications. Section 3 provides a martingale type of optimality condition and explains how to use dynamic programming to solve the problem. An alternative solution technique that is sometimes more efficient than dynamic programming is sketched out in Section 4; this method involves stochastic calculus and convex optimization theory. Sections 5, 6, and 7 solve a fairly general example problem with both dynamic programming and the alternative solution technique. The paper ends with some concluding remarks.
The basic elements of the model are a filtered probability space (Ω, F, 𝔽, P) and a time horizon T < ∞. For technical reasons it is assumed the sample space Ω is discrete. However, most of what is done here is also true when the sample space is uncountable. Thus in the case of general sample spaces the reader should regard the results here as being formal but not rigorous.
The filtration 𝔽 = {F_t ; t = 0,1,…,T}, where each F_t is a σ-algebra of subsets
R_t = r(X_t, Z_t, D_t)

W_t = R_t + W_{t−1} ,   t = 1,2,…,T.
To understand how the decision model operates it is useful to think of the time
parameter t as the index for a sequence of periods. At the beginning of period t
the decision maker observes the information F_{t−1}, which includes X₁,X₂,…,X_t; D₁,D₂,…,D_{t−1}; Z₀,Z₁,…,Z_{t−1}; A₁,A₂,…,A_t; and W₀,W₁,…,W_{t−1}. In particular, one should think of X_t as the current state and W_{t−1} as the current wealth. The decision maker then uses this information to choose the action D_t, after which the next
state Z_t of the environmental process is observed, the reward R_t for the period is generated, and the new wealth W_t is realized. This sequence is repeated period-by-period until the terminal wealth W_T is realized. The applications in the next section will give further insight into how this decision model functions.
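This period-by-period loop is easy to sketch in code; the particular transition function, reward function and order-up-to rule below are illustrative choices of ours, not taken from the paper:

```python
def run_path(f, r, policy, Z, x1, w0=0.0):
    """One sample path of the decision model: D_t is chosen from current
    information, then R_t = r(X_t, Z_t, D_t), W_t = R_t + W_{t-1}, and the
    state transition function gives the next state X_{t+1} = f(X_t, Z_t, D_t)."""
    x, w, history = x1, w0, []
    for t, z in enumerate(Z, start=1):
        d = policy(t, x, w)                       # action D_t
        reward = r(x, z, d)                       # reward R_t
        w = reward + w                            # wealth W_t = R_t + W_{t-1}
        x = f(x, z, d)                            # next state X_{t+1}
        history.append((t, d, reward, w))
    return w, history

# Illustrative production-inventory instance: order up to level 4, demand Z_t.
wT, hist = run_path(
    f=lambda x, z, d: max(x + d - z, 0),          # leftover inventory
    r=lambda x, z, d: min(x + d, z) - 0.5 * d,    # sales minus ordering cost
    policy=lambda t, x, w: max(4 - x, 0),
    Z=[3, 5, 2, 4],
    x1=0,
)
```

The terminal wealth is simply the sum of the period rewards, exactly as in the recursion W_t = R_t + W_{t−1}.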
The decision maker's objective may be to maximize the expected terminal wealth
W T. However, it will be useful for purposes of economic modeling to be more general
than this. Let u be a specified real-valued function measuring the utility of the
decision maker's terminal wealth. Then the problem is to choose a decision process D so as to maximize the expected utility E[u(W_T)]. Later sections will explain how
to solve this problem.
It is important to recognize that if the random variables in the sequence [Zt}
are independent and identically distributed and if the rest of the decision model is
suitably defined, then the decision model becomes an ordinary Markov decision chain. Indeed, it becomes identical to the kind of Markov decision chain treated by Bertsekas [1] which, in turn, is equivalent to the conventional kind of Markov decision chain that is formulated in terms of a Markov transition matrix.
2. Some Applications
A primary reason for the importance of the decision model is its suitability
and usefulness for many different applications. The following table presents five
possible applications. These applications are meant to be suggestive, not defini-
tive. The columns indicate the applications, while the rows specify the various
elements of the model. Note that the constraint process A is sometimes specified
in terms of the controlled process X; this is allowed, since X is predictable.
All of the applications involve an environmental process Z that has an explicit,
physical interpretation. In the special case where Z is a sequence of independent
and identically distributed random variables, all of these problems specialize to
standard applications of Markov decision chains. But for all of these problems it is both natural and meaningful to allow the environmental process to be more general.

The first three applications are simple generalizations of classical problems
from the operations research literature. For all three of these problems it may be
important to take the environmental process Z to be more general than a sequence of
independent and identically distributed random variables. For the controlled queue-
ing problem the term r I of the reward function is meant to be the service cost, while
r 2 is the waiting cost. In the production-inventory problem r I is the ordering cost,
r 2 is the holding cost, r 3 is the shortage cost, and there is complete backlogging.
In the replacement-maintenance problem r I is the cost of maintaining an item that
has received quantity X t of shocks and now receives shock Zt, while the scalar c is
the replacement cost.
The fourth application is one example of an optimal portfolio problem, an important and well-studied problem in finance. The investor can buy a stock, with $1 invested at time t becoming worth $(Z_t + 1) at time t + 1, and/or put money in a bank at a fixed interest rate. The problem is to optimally divide his money between the two investments. Note that X = W.
The last application is a consumption-investment problem. Consumption-investment problems, as well as variations such as optimal capital accumulation under uncertainty and resource allocation under uncertainty, have been extensively studied in the economics literature. As with the optimal portfolio problem, the environmental process Z is the rate of return of an investment, and X is current wealth available for investment. However, now W ≠ X. Each period the decision maker must consume the portion D_t of his wealth and invest the balance X_t − D_t. The consumption generates immediate utility r(D_t), while the investment yields wealth (1 + Z_t)(X_t − D_t) next period. Now W should be thought of as the cumulative utility, so one should take u(w) = w. Incidentally, thinking of how one might model the prime interest rate, it may be appropriate for the environmental process Z of these last two applications to be a Markov chain.
Thus V_t is the maximum expected change in utility from the end of period t. If D̂ is such that V^D̂(W₀,X₁,·) = V₀(W₀,X₁,·), then D̂ will be called optimal, for clearly this D̂ maximizes E[u(W_T)] over D subject to the given W₀ and X₁. Note that V_T = 0.

The function V will be called the value function. In order to avoid annoying technicalities, it will be assumed that V^D_t(w,x,ω) and V_t(w,x,ω) are well-defined and finite for every D, t, w, x, and ω. The main result of this section is that V can
+ E[ V_t( w + r(x,Z_t,D_t), f(x,Z_t,D_t), · ) | F_{t−1} ] ]

= V_{t−1} .
Just as with conventional dynamic programming problems, if the supremum is attained in (2) then the corresponding decision process is optimal (the fact that Ω is discrete makes it easy to show this process is predictable). In other words, one
= . Xt+l, ._,
)
D
where the initial values W₀ and X₁ are as given. Thus M^D_t equals the expected change in utility over all periods if D is used through the end of period t and then optimal decisions are made.
Recall that a process such as M^D is a supermartingale if E[M^D_{t+1} | F_t] ≤ M^D_t for all t, and that M^D is a martingale if both M^D and −M^D are supermartingales. This leads to the following result.
(5) Theorem. The process M^D is a supermartingale for every D ∈ 𝔻. The decision process D is optimal if and only if M^D is a martingale.
Proof. Let D ∈ 𝔻 and t be arbitrary. Note that
- + vt+ l - v t
v 'tl.- + l,
eo
5. An Example
To solve the example problem of Section 5, one begins by computing the value
function V. Since there is no controlled process X with this example, the notation
will be modified accordingly. In particular, the dynamic programming equation (2)
becomes
Proof. The proof is by induction. Since V_T = 0, the function V_T + u clearly has the specified properties, so for the induction step it will be assumed that the function h has them as well, where h is defined by
To analyze this, consider how the σ-algebra F_{t−1} is equivalent to a partition of Ω, and then focus on an arbitrary cell of this partition, say B ⊂ Ω (thus B ∈ F_{t−1}, but no proper subset of B is in F_{t−1}). The cell B will subdivide into two cells, say B₁ and B₋₁, according to whether Z_t = 1 or Z_t = −1. Let ω₁ ∈ B₁ and ω₋₁ ∈ B₋₁ be arbitrary, so that h(w+d, ω) = h(w+d, ω₁) for all ω ∈ B₁ and h(w−d, ω) = h(w−d, ω₋₁) for all ω ∈ B₋₁. Then with p = P(Z_t = 1 | B), (8) becomes, for all ω ∈ B,
V_{t−1}(w,ω) = sup_d { p h(w+d, ω₁) + (1−p) h(w−d, ω₋₁) } − u(w) .
Differentiating this last expression with respect to w one obtains, after using (9),

(10)   V'_{t−1}(w,ω) + u'(w) = p h'(w+f(w), ω₁) + (1−p) h'(w−f(w), ω₋₁)
Meanwhile, differentiating (9) yields, for all ω ∈ B,

f'(w) = [ (1−p) h''(w−f(w), ω₋₁) − p h''(w+f(w), ω₁) ] / [ (1−p) h''(w−f(w), ω₋₁) + p h''(w+f(w), ω₁) ]

where h'' denotes the second partial derivative of h with respect to its first argument. Thus f' is continuous with −1 < f'(w) < 1. Hence, by (10), not only is
V_{t−1}(w,ω) + u(w) increasing with respect to w, but V'_{t−1}(w,ω) + u'(w) tends to 0 as w → +∞ and to +∞ as w → −∞.
where V''_{t−1} denotes the second partial derivative of V_{t−1} with respect to its first argument. Hence w → V_{t−1}(w,ω) is strictly concave and has a continuous second derivative.  □
What can be said about the computational effort that is required to obtain the value function V? For each cell in the partition corresponding to each F_t one needs to solve for the function f that was defined in the preceding proof. This amounts to solving the first-order optimality condition (9). For example, if

u(w) = a − (b/c) exp(−c w)

for arbitrary scalars a, b > 0, and c > 0, then u'(w) = b exp(−c w). For the last period, h = u and each cell in the partition corresponding to F_{T−1} consists of two elements, say ω₁ and ω₋₁. Hence (9) becomes p/(1−p) = exp(2c f(w)), where the conditional probability p = P(ω₁)/(P(ω₁) + P(ω₋₁)). For ω ∈ {ω₁, ω₋₁}, the optimal value of D_T(ω) is then given by

f(w) = (1/2c) log( P(ω₁)/P(ω₋₁) ) .
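The closed form can be checked numerically: maximizing d → p·u(w+d) + (1−p)·u(w−d) with a golden-section search recovers (1/2c)·log(p/(1−p)) and, as the formula indicates, the maximizer does not depend on w. All numbers below are our own illustrative choices:

```python
import math

def last_period_decision(p, c):
    # First-order condition (9): p/(1-p) = exp(2 c d).
    return math.log(p / (1 - p)) / (2 * c)

def best_d_numeric(p, a, b, c, w, lo=-10.0, hi=10.0):
    """Golden-section maximization of d -> p u(w+d) + (1-p) u(w-d),
    with u(w) = a - (b/c) exp(-c w); the objective is strictly concave."""
    u = lambda x: a - (b / c) * math.exp(-c * x)
    f = lambda d: p * u(w + d) + (1 - p) * u(w - d)
    invphi = (math.sqrt(5) - 1) / 2
    while hi - lo > 1e-11:
        x1 = hi - invphi * (hi - lo)
        x2 = lo + invphi * (hi - lo)
        if f(x1) < f(x2):
            lo = x1                  # maximum lies in [x1, hi]
        else:
            hi = x2                  # maximum lies in [lo, x2]
    return (lo + hi) / 2

d_formula = last_period_decision(p=0.7, c=1.0)
d_numeric = best_d_numeric(p=0.7, a=0.0, b=1.0, c=1.0, w=2.0)
```

Running the search at a different wealth w gives the same maximizer, reflecting the well-known wealth-independence of the exponential-utility decision.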
M_t = Z₁ + Z₂ + … + Z_t ,   t = 1,2,…,T.

Thus M is a random walk on the integers, and under the probability measure P'(ω) = (1/2)^T, M is a martingale with respect to 𝔽. Let E' denote the expectation operator corresponding to P'.
Since Z_t = ΔM_t, it is clear that the wealth process W^D under any decision process D can be represented as the stochastic integral of D with respect to the martingale M, that is,

W^D_t = Σ_{s=1}^t D_s ΔM_s ,   t = 1,2,…,T.
By standard results, each such wealth process will be a martingale under P'. Furthermore, since P' is the unique probability measure equivalent to P under which M is a martingale, it follows (see, e.g., Jacod [5, Ch. XI]) that every martingale (under P') can be represented as a stochastic integral of a decision process with respect to M.
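Because Ω has only 2^T equally likely points under P', the martingale property of W^D = Σ D_s ΔM_s can be checked exhaustively on a small horizon. The predictable decision rule below is our own choice:

```python
from itertools import product

T = 4
paths = list(product([1, -1], repeat=T))      # Omega: 2^T outcomes, P' uniform

D = lambda past: 1.0 + sum(past)              # predictable: uses only Z_1..Z_{s-1}

def wealth(path, t):
    # W_t = sum_{s<=t} D_s * Delta M_s, with Delta M_s = Z_s
    return sum(D(path[:s - 1]) * path[s - 1] for s in range(1, t + 1))

# E'[W_t | F_{t-1}] = W_{t-1}: on every cell of the partition generated by
# (Z_1,..,Z_{t-1}) the conditional mean of the increment D_t Z_t is zero.
max_dev = 0.0
for t in range(1, T + 1):
    for cell in product([1, -1], repeat=t - 1):
        members = [p for p in paths if p[:t - 1] == cell]
        inc = sum(wealth(p, t) - wealth(p, t - 1) for p in members) / len(members)
        max_dev = max(max_dev, abs(inc))
```

Since Ω is finite and 𝔽 is generated by the Z's, conditional expectations reduce to averages over partition cells, exactly the structure used in the dynamic programming proof of Section 6.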
The implications of this are as follows. Let Y denote the space of random variables Y on Ω, and let W be as in Section 5, that is, W consists of all Y ∈ Y such that Y = W^D_T for some D ∈ 𝔻. Since W^D is a martingale under P' null at zero, it follows that E'[Y] = 0 for all Y ∈ W. Conversely, if Y ∈ Y satisfies E'[Y] = 0, then upon considering the martingale N defined by N_t = E'[Y | F_t] it follows from the martingale representation property described above that there exists some decision policy D ∈ 𝔻 such that W^D_T = Y. Hence
This completes the first step in the alternative solution technique. The second step is to find an optimal terminal wealth, that is, some Ŷ ∈ W such that E[u(Ŷ)] ≥ E[u(Y)] for all Y ∈ W. This will be done with some convex optimization theory (see, e.g., Rockafellar [7]).
Let Y* denote the space that is dual to Y under the linear functional E'[YY*]. Let W* denote the orthogonal complement of W, that is

Since E'[Y] = 0 for all Y ∈ W, it is clear that W* contains all the constant functions in Y*. If Y* ∈ Y* is not constant, then one can readily find some Y ∈ W such that E'[YY*] ≠ 0, so actually

W* = { Y* ∈ Y* : Y* is constant } .
Denoting U(Y) = E[u(Y)], the fact that u is concave means U is a concave functional on Y. Hence step 2 amounts to solving the concave optimization problem

(11)   maximize U(Y)
       subject to Y ∈ W
Let U* denote the concave conjugate functional of U, that is, for each Y* E Y*,
Proof. To show sufficiency, since E'[ŶŶ*] = 0 one has U*(Ŷ*) = −U(Ŷ). But the definition of U* means U*(Y*) ≤ E'[YY*] − U(Y) for all Y ∈ Y, so in particular U*(Ŷ*) = −U(Ŷ) ≤ E'[YŶ*] − U(Y) = −U(Y) for all Y ∈ W, that is, Ŷ is optimal.

Conversely, by a version of the Fenchel duality theorem there exists some Ŷ* ∈ W* such that U*(Ŷ*) = −U(Ŷ). Since E'[YŶ*] = 0, this means (13) holds.
With g(ω) = dP/dP'(ω), the Radon-Nikodym derivative, let u* : Ω × ℝ → ℝ be the concave conjugate functional

U(Ŷ) = − sup_{y∈ℝ} ∫_Ω u*(ω,y) dP'(ω) .  This supremum is attained by y = Ŷ*.
decision process D. It has just been determined that Ŵ_T = Ŷ. As was stated previously, Ŵ is a martingale under P', so
u*(ω,y) = −∞ ,   y < 0

u*(ω,y) = − a g(ω) ,   y = 0

u*(ω,y) = (y/c) log( b g(ω)/y ) − a g(ω) + y/c ,   y > 0 .

During this computation one notes that for y > 0 the argument in the definition of u* is minimized by
E'[u*(ω,y)] = (y/c)[ log b + E'[log g] − log y + 1 ] − a ,

which is maximized over y by y = b exp(E'[log g]), so Ŷ*(ω) = y for all ω. Substituting this back into E'[u*(ω,y)] and using (17) gives

U(Ŷ) = a − y/c = a − (b/c) exp(E'[log g]) .
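These computations can be sanity-checked numerically on a toy four-point Ω; the values of g below are our own, normalized so that E'[g] = 1:

```python
import math

a, b, c = 1.0, 2.0, 0.5
pprime = [0.25] * 4                          # P' uniform on 4 states
g = [1.5, 0.5, 1.2, 0.8]                     # g = dP/dP', with E'[g] = 1

def u_star(gw, y):                           # y > 0 branch of the conjugate
    return (y / c) * math.log(b * gw / y) - a * gw + y / c

def F(y):                                    # E'[u*(., y)]
    return sum(p * u_star(gw, y) for p, gw in zip(pprime, g))

E_log_g = sum(p * math.log(gw) for p, gw in zip(pprime, g))
y_star = b * math.exp(E_log_g)               # maximizer found in the text
U_hat = a - (b / c) * math.exp(E_log_g)      # claimed optimal utility
```

One checks that −F(y*) equals a − (b/c)·exp(E'[log g]) up to rounding, and that F is indeed maximized at y*.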
But W_{T−1} and D_T, being F_{T−1} measurable, are constant over the cell {ω₁, ω₋₁}, so these two equations suffice to solve for the two constant values. Indeed, Ŵ_{T−1} and D̂_T can be determined by solving 2^T = |Ω| such equations, after which Ŵ_{T−2} and D̂_{T−1} can be determined by solving 2^{T−1} equations, and so forth. Overall, to solve for Ŵ and D̂ one needs to solve 2^{T+1} − 2 equations of the form (22). Note that with Ŵ_T = Ŷ given by (20), the equations in (22) imply D_T(ω₁) = (1/2c) log( P(ω₁)/P(ω₋₁) ), the same as the answer computed by dynamic programming in Section 6.
8. Concluding Remarks
References
H. Pragarauskas
Institute of Mathematics and Cybernetics
Academy of Sciences of the Lithuanian SSR
Vilnius, K. Požėlos 54, USSR
as g + 0.
b) For some constants m, K ≥ 0 and all α ∈ A, (t,x) ∈ H_T, y ∈ R^d
functions P_n(t_iⁿ, x, t_{i+1}ⁿ, Γ) ≡ P^α_{ni}(x,Γ), i = 0,1,…,n−1, x ∈ R^d, Γ ∈ B(R^d), which are Borel in (α,x). Let B_n be the class of all families β(n) = (q₀ⁿ, …, q_{n−1}ⁿ) of functions q₀ⁿ(dα₀ | x₀), q_iⁿ(dα_i | x₀, α₀, …, α_{i−1}, x_i), 1 ≤ i ≤ n−1, which are probability measures on B(A) in the first argument and Borel in the other arguments. An initial point (s,x), a strategy β(n) ∈ B_n and a family of transition functions define a controlled process ξⁿ(s) = x, αⁿ(s), …, ξⁿ(t_{n−1}ⁿ), ξⁿ(T) on the probability space (Ωⁿ, B(Ωⁿ), Q^{β(n)}_{s,x}), where Ωⁿ = (R^d)^{n+1} × Aⁿ (see §6, ch. 1 [1]).
2. Condition. For some constant m' > m and every x ∈ R^d
Let D[s,T] be the space of all right-continuous functions x_t with left-hand limits, defined on [s,T] with values in R^d, equipped with the Skorokhod topology. Set

𝒟[s,T] = σ{ x_t ∈ Γ ; s ≤ t ≤ T, Γ ∈ B(R^d) }.
Define the process ξ_tⁿ by setting ξ_tⁿ = ξⁿ(t_iⁿ) for t ∈ [t_iⁿ, t_{i+1}ⁿ). Denote by P^{β(n)}_{s,x} the measure on (D[s,T], 𝒟[s,T]) induced by ξⁿ.
3. Condition. For an arbitrary sequence {β(n)} of strategies β(n) ∈ B_n the sequence of measures {P^{β(n)}_{s,x}} is tight on (D[s,T], 𝒟[s,T]).
For all α ∈ A, (t,x) ∈ H_T define a measure π(α,t,x,·) on B(R^d) by the formula : π(α,t,x,dy) = μ( z : x + c(α,t,x,z) ∈ dy∖{x} ), π(α,t,x,{x}) = 0. Set Δ_iⁿ = t_{i+1}ⁿ − t_iⁿ, a = σσ* (σ* is the transpose matrix of σ).
4. Condition. For some number p > 1, every R > 0 and every continuous bounded function φ on R^d which vanishes in some neighborhood of the origin,

Σ_{i=0}^{n−1} sup_{|x|≤R} sup_{α∈A} Σ_{j=1}^d | ∫_{|y−x|≤p} (y−x)_j P^α_{ni}(x,dy) − Δ_iⁿ b_j(α,t_iⁿ,x) | → 0 ,

Σ_{i=0}^{n−1} sup_{|x|≤R} sup_{α∈A} Σ_{j,k=1}^d | ∫_{|y−x|≤p} (y−x)_j (y−x)_k P^α_{ni}(x,dy) − Δ_iⁿ [ a_{jk}(α,t_iⁿ,x) + ∫ (y−x)_j (y−x)_k π(α,t_iⁿ,x,dy) ] | → 0 ,

Σ_{i=0}^{n−1} sup_{|x|≤R} sup_{α∈A} | ∫ φ(y−x) P^α_{ni}(x,dy) − Δ_iⁿ ∫ φ(y−x) π(α,t_iⁿ,x,dy) | → 0

as n → ∞.
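In the pure-diffusion case (no jump part, π = 0), the simplest chains satisfying these local-consistency requirements are weak Euler chains: from x the chain moves to x + b·Δ ± σ·√Δ with probability ½ each, so the first-moment condition holds exactly and the second-moment condition up to O(Δ²). A one-dimensional spot check, with coefficients of our own choosing:

```python
import math

def euler_moments(x, b, sigma, dt):
    """First two increment moments of the two-point Euler chain
    y = x + b(x) dt +/- sigma(x) sqrt(dt), each with probability 1/2."""
    ys = [x + b(x) * dt + s * sigma(x) * math.sqrt(dt) for s in (+1, -1)]
    m1 = sum(0.5 * (y - x) for y in ys)
    m2 = sum(0.5 * (y - x) ** 2 for y in ys)
    return m1, m2

b = lambda x: -x                              # drift b(x)
sigma = lambda x: 1.0 + 0.1 * x * x           # diffusion coefficient
dt, x = 1e-3, 0.7
m1, m2 = euler_moments(x, b, sigma, dt)
# Condition 4 with pi = 0: E(y-x) ~ dt b(x) and E(y-x)^2 ~ dt a(x), a = sigma^2.
```

The first-moment error is zero (up to rounding), and the second-moment error is exactly b(x)²Δ², which vanishes after the summation over i faster than the Δ_iⁿ terms.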
L^{α(t,x)} u(t,x) ≡ L^α u(t,x) ,

F u(t,x) = sup_{α∈A} L^α u(t,x) ,   (1)
Fix x ∈ R^d. Let {β(n)} be an arbitrary sequence of strategies β(n) ∈ B_n, (ξⁿ(s), αⁿ(s), …, ξⁿ(t_{n−1}ⁿ), ξⁿ(T)) the process controlled by the strategy β(n), Pⁿ_{s,x} the measure on (D[s,T], 𝒟[s,T]) induced by the process ξⁿ(t) = ξⁿ(t_iⁿ), t ∈ [t_iⁿ, t_{i+1}ⁿ), i = 0,1,…,n−1.
By assumption 3 there exist a subsequence {m} ⊂ {n} and a measure P_{s,x} on (D[s,T], 𝒟[s,T]) such that the measures P^m_{s,x} converge weakly to P_{s,x} as m → ∞. By the lemma from §6, ch. 1 [5] there exist a probability space (Ω̃, F̃, P̃) and processes ξ̃ᵐ(·), ξ̃(·) defined on this space such that the processes (x(·), P^m_{s,x}) and (ξ̃ᵐ(·), P̃), as well as the processes (x(·), P_{s,x}) and (ξ̃(·), P̃), are equivalent and
6. Lemma. Let the assumptions 1-4 hold, u ∈ C^{1,2}, R > 0, and let τ be the time of first exit of (t, ξ̃(t)) from [s,T) × S_R. Then

u(t∧τ, ξ̃(t∧τ)) − u(s,x) − ∫_s^{t∧τ} F u(r, ξ̃(r)) dr

is an (F̃_t, P̃)-supermartingale.
η_tⁿ = u(t, x(t)) − u(s,x) − Σ_{i : t_{i+1}ⁿ ≤ t} sup_{α∈A} ∫ [ u(t_{i+1}ⁿ, y) − u(t_iⁿ, x(t_iⁿ)) ] P^α_{ni}(x(t_iⁿ), dy)

is a (𝒟[s,t], Pⁿ_{s,x})-supermartingale.
Let

η̃_tᵐ = u(t, ξ̃ᵐ(t)) − u(s,x) − Σ_{i : t_{i+1}ᵐ ≤ t} sup_{α∈A} ∫ [ u(t_{i+1}ᵐ, y) − u(t_iᵐ, ξ̃ᵐ(t_iᵐ)) ] P^α_{mi}(ξ̃ᵐ(t_iᵐ), dy) .
Suppose that
+ ∫_{|y−x|>p} φ_p(y−x) R₁(t,x,y) P^α_n(x,dy) +

+ ∫ φ_p(y−x) R(t,x,y) [ P^α_n(x,dy) − Δ π(α,t,x,dy) ] ,

where

R(t,x,y) = Σ_{k=1}^d u_{x_k}(t,x)(y−x)_k + (1/2) Σ_{k,ℓ=1}^d u_{x_k x_ℓ}(t,x)(y−x)_k(y−x)_ℓ .

Substituting these expressions in the left-hand side of (3) and using the assumptions of the lemma, it is easy to prove (3). The lemma is proved.
Let ε > 0. Denote by A_ε the set of all matrices σ̃ of dimension d × d₁ with elements σ̃_ij ∈ [−ε,ε], by B_ε the set of all d-dimensional vectors with components b̃_i ∈ [−ε,ε], and by C_ε the set of all elements c̃ ∈ L₂(R^d, μ) such that ‖c̃‖ ≤ ε. Let X_ε = A_ε × B_ε × C_ε, Ã = A × X_ε. We denote the elements of Ã by α̃ = (α, x̃), where α ∈ A, x̃ ∈ X_ε.
Let ζ₁(x), ζ₂(t) be nonnegative infinitely differentiable functions of the arguments x ∈ R^d, t ∈ R¹, equal to zero for |x| ≥ 1, |t| ≥ 1, and such that ∫ζ₁(x)dx = ∫ζ₂(t)dt = 1. Let ζₙ(t,x) = n^{d+1} ζ₁(nx) ζ₂(nt), n = 1,2,…
b_εⁿ(α̃,t,x) = b⁽ⁿ⁾(α,t,x) + b̃ ,   b̃ ∈ B_ε ,

Replacing here σ⁽ⁿ⁾, b⁽ⁿ⁾, c⁽ⁿ⁾ by σ, b, c we construct functions σ_ε, b_ε, c_ε. Using the collections (Ã, σ_εⁿ, b_εⁿ, c_εⁿ, g⁽ⁿ⁾) we construct controlled processes x^{α̃,s,x}(n,ε), x^{α̃,s,x}(ε) and payoff functions v_{nε}, v_ε in the same way as we constructed the above controlled process x^{α,s,x} and the payoff function v on the basis of the collection (A, σ, b, c, g).
If {β(n)} is a sequence of ε-optimal strategies, then for some subsequence {n'} ⊂ {n} we have

as n → ∞. Here ε > 0 is an arbitrary positive number. Therefore, for (4) it suffices to show that E g(ξ̃(T)) ≤ v(s,x).
v_{nε} has second derivatives v_{nε,x_i x_j} in the Sobolev sense such that F_{nε} v_{nε} = 0 (a.e. on H_T) and v_{nε}(T,·) = g⁽ⁿ⁾(·), where F_{nε} is the operator defined by formula (1) if we replace in this formula A, σ, b, c by Ã, σ_εⁿ, b_εⁿ, c_εⁿ. Therefore, using (5) we obtain that for sufficiently large n

F v_{nε} ≤ 0   a.e. on [0,T] × S_R .   (6)
E v_{nε}(γ₃, ξ̃(γ₃)) ≤ v_{nε}(s,x) ,

where γ₃ = τ ∧ τ_R. Letting n → ∞, ε → 0, R → ∞ in this inequality and using lemma 7 we obtain (4).
Now for the proof of the theorem it suffices to show that
Denote by ξ̃ⁿ(t) the process controlled by the strategy β(n) = (q₀ⁿ, …, q_{n−1}ⁿ) ∈ B_n, where

q₀ⁿ(α₀ = δ | x₀) = P(α(s) = δ) ,

q_jⁿ(α_j = δ | x₀, α₀, …, α_{j−1}, x_j) = P( α(t_jⁿ) = δ | α(s) = α₀, …, α(t_{j−1}ⁿ) = α_{j−1} ) ,   j = 1,2,…,n ,  n ≥ k ,  δ ∈ Ã .
By the assumption 3, for some subsequence {n'} ⊂ {n} the measures P^{n'}_{s,x} induced by ξ̃^{n'}(·) on (D[s,T], 𝒟[s,T]) converge weakly to some measure P_{s,x}. It is not difficult to show that the measure on (D[s,T], 𝒟[s,T]) induced by x^{α,s,x} coincides with P_{s,x}. From this, in view of the arbitrariness of ε > 0, (8) follows. (8) together with (4) proves the theorem.

9. Remark. The complete proof of the theorem and related results will be published in Lith. Math. J., vol. XXIII (1983).
REFERENCES
J.P. QUADRAT
Domaine de Voluceau - B.P. 105
78153 Le Chesnay Cédex
I - INTRODUCTION.
A V − λ V = C ,   x ∈ 𝒞 = [0,1]ⁿ

(3)

V|_{∂𝒞} = 0
For such a problem it is easy to show that the number of eigenvectors associated to an eigenvalue smaller than a fixed value increases exponentially with the dimension. But we need a good representation of the eigenvector associated to the eigenvalue of small modulus in any good finite-dimensional approximation of (2). And thus, whatever the approximation may be, a given precision will be obtained at a cost which increases exponentially with the dimension.

In these cases only it is possible to compute the optimal local feedbacks for large systems. Finally we discuss briefly the decoupling point of view.
II-i. !b~_g~!_~!~u~!eg.
Given I the index set of the subsystems, I = {1,2,…,k}, n_i [resp. m_i] denotes the dimension of the states [resp. the controls] of the subsystem i ∈ I. The local feedback S_i is a mapping of ℝ₊ × ℝ^{n_i} in 𝒰_i ⊂ ℝ^{m_i}, the set of the admissible values of the control i. S_L denotes the class of local feedbacks S_L = {S = (S₁, …, S_k)}.

Given the drift term of the system :

with n = Σ_i n_i ,   𝒰 = Π_{i∈I} 𝒰_i .
- the diffusion term :

σ : ℝ₊ × ℝⁿ → Mₙ ,   (t,x) → σ(t,x)

with Mₙ the set of (n,n) matrices and a = (1/2) σσ*, where * denotes transposition ;

- the instantaneous cost :

c : ℝ₊ × ℝⁿ × 𝒰 → ℝ₊ ,   (t,x,u) → c(t,x,u) .

Then b∘S [resp. c∘S] denotes the function ℝ₊ × ℝⁿ → ℝⁿ [resp. ℝ₊ × ℝⁿ → ℝ₊] : (t,x) → b(t,x,S(t,x)) [resp. c(t,x,S(t,x))].
Then if X^S denotes the diffusion (b∘S, a) (drift b∘S and diffusion term σ) and p^S its measure defined on Ω = C(ℝ₊, ℝⁿ), with μ the law of the initial condition, we want to solve :

Min_{S∈S_L} E_{p^S} ∫₀^T c∘S(t, ω_t) dt
where ω ∈ Ω and T denotes the time horizon. We have here a team of I players working to optimize a single criterion :

Min_{S∈S_L} J^S = ∫_Q c∘S(t,x) p^S(t,x) dt dx

with p^S solution of :

L^{S*} p^S = 0 ,   p^S(0,·) = μ

L^S = ∂/∂t + Σ_j b_j∘S ∂/∂x_j + Σ_{i,j} a_ij ∂²/∂x_i∂x_j

μ the law of the initial condition.
Then we have :

Theorem 1

with

L^S v^S + c∘S = 0 ,   v^S(T,·) = 0 .

Step 1 : compute p^S

Step 2 : solve backward simultaneously

L^S v^S + c∘S = 0 ,   v^S(T,·) = 0
ZZ-2. U ~ e ~ ! ~ - ~ ! ~ - ~ "
b_i : ℝ₊ × ℝ^{n_i} × 𝒰_i → ℝ^{n_i} ,   (t, x_i, u_i) → b_i(t, x_i, u_i)

and the noises are not coupled between the subsystems, that is :

(5)   L*_{i,R_i} p_i = 0 ,   p_i(0,·) = μ_i   with μ = Π_{i∈I} μ_i

and

L_{i,R_i} = ∂/∂t + Σ_{k∈I_i} b_k∘R_i(t,x) ∂/∂x_k + Σ_{k,l∈I_i} a_kl ∂²/∂x_k∂x_l
Let us denote by

(6)   c_i∘R_i : ℝ₊ × ℝ^{n_i} → ℝ₊ ,   (t, x_i) → ∫ c∘R(t,x) Π_{j≠i} p_j(t, x_j) dx_j

That is the conditional expectation of the instantaneous cost knowing the information only on the local subsystem i.
We have the following sufficient condition for optimality player by player :

(7)   Min_{R_i} [ L_{i,R_i} v_i + c_i∘R_i ] = 0 ,   i ∈ I ;
given ε, v̄ ∈ ℝ₊ :

Step 1) Choose i ∈ I
Solve (7)
if ṽ_i(v_i) ≤ v̄ − ε then v̄ := ṽ_i(v_i) and
R_i := Arg Min_{R_i} { L_{i,R_i} v_i + c_i∘R_i }

Step 2) When ṽ_i(v_i) ≥ v̄ − ε, ∀ i ∈ I, update v̄ and go to step 1.
Remark 5. In this situation we have to solve a coupled system of P.D.E., but each of them is on a space of small dimension. In this way we can optimize, in the class of local feedbacks, systems which are not reachable by the H.J.B. equation. An application to hydropower systems is given in Delebecque-Quadrat [4].
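The player-by-player optimization behind (7) behaves like a Gauss-Seidel sweep. A toy finite-dimensional analogue (entirely our own construction, with each local feedback reduced to a single scalar parameter) shows the mechanics: each player minimizes the common cost over its own parameter with the other frozen, until a sweep improves the cost by less than ε:

```python
def J(r1, r2):
    # common team cost, jointly convex in (r1, r2)
    return (r1 - 1) ** 2 + (r2 + 2) ** 2 + 0.5 * (r1 - r2) ** 2

argmin_r1 = lambda r2: (2 + r2) / 3     # solves dJ/dr1 = 0 with r2 frozen
argmin_r2 = lambda r1: (r1 - 4) / 3     # solves dJ/dr2 = 0 with r1 frozen

r1 = r2 = 0.0
v, eps = J(r1, r2), 1e-12
while True:                             # one sweep = each player optimizes once
    r1 = argmin_r1(r2)
    r2 = argmin_r2(r1)
    v_new = J(r1, r2)
    if v - v_new <= eps:                # no player improved by more than eps
        break
    v = v_new
```

Here the cost is jointly convex, so the player-by-player fixed point (r1, r2) = (1/4, −5/4) is also the global team optimum; in general the iteration only guarantees person-by-person optimality.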
Min_S lim_{T→∞} (1/T) E ∫₀^T c∘S(ω_t) dt
Theorem 3

(9)   p(x) = C Π_{i=1}^n p_i(x_i) ,   i ∈ E

(10)   p_i(x_i) = exp( − (1/d_ii) ∫₀^{x_i} u_i(s) ds )

B D + D B* + 2A = 0
which is (8).
Remark 6. This class of diffusion processes is quite natural if we see them as the limit process, when N → ∞, obtained from a Jackson network of queues by the scaling x → x/N, t → Nt.
where μ_i(x_i) is the output rate of the queue i, and m_ij is the probability of a customer leaving the queue i to go to the queue j.
The correlation of the noise given by (8) corresponds to systems for which the noise satisfies a conservation law (for example the total number of customers in a closed network of queues).
Remark 7. We can now apply the result of II-2 to compute the optimal local feedback for systems having the product form property and an ergodic criterion. Indeed the ergodic cost is :

∫ c∘S(x) p(x) dx   with

p(x) = Π_{i=1}^n p_i(x_i)

and p_i satisfies :

∂/∂x_i [ u_i p_i ] + ∂²/∂x_i² [ d_ii p_i ] = 0 ,   i ∈ E

∫ p_i(x_i) dx_i = 1
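Formula (10) and the stationarity equation are easy to check against each other: one integration of ∂/∂x[u_i p_i] + ∂²/∂x²[d_ii p_i] = 0 gives u_i p_i + d_ii p_i' = 0, which the exponential form satisfies identically. A numerical spot check, with an illustrative u_i of our own choosing:

```python
import math

d_ii = 0.5
u = lambda x: 2.0 * x - 1.0                   # illustrative u_i
U = lambda x: x * x - x                       # int_0^x u(s) ds
p = lambda x: math.exp(-U(x) / d_ii)          # formula (10), up to normalization

# One integration of d/dx [u p] + d^2/dx^2 [d_ii p] = 0 gives u p + d_ii p' = 0.
h = 1e-6
residuals = []
for x in (-1.0, 0.3, 1.7):
    dp = (p(x + h) - p(x - h)) / (2 * h)      # central difference for p'
    residuals.append(u(x) * p(x) + d_ii * dp)
```

The residuals vanish up to finite-difference error, confirming that (10) solves the one-dimensional stationarity equation for each factor of the product form.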
ZI-4. B~m~_~_~g~B~!!D~-
gives the best feedback among the class that we can call "local decoupling feedbacks". This approach is well studied for deterministic linear and nonlinear systems, Wonham [8], Isidori [17], and in the dynamic programming literature, Larson [11].
We have seen in §2 that we are able to compute the optimal local feedback only in particular cases. Moreover, sometimes the local information is not good ; we can have, a priori, an idea of a better one and would like to use this a priori information to solve a simpler problem than the general one. A way to do that is to parametrize the feedback and to optimize the open-loop parameter by a Monte Carlo technique. More precisely, given the stochastic control problem

(2)   u(t) = S(t, x_t, v_t) ,   v_t ∈ ℝ^p

where ωⁱ are trajectories of the noise obtained by random generation, perhaps after a time discretization if we want to avoid the difficulties of the non-existence of a
(3)   Min_v Σ_{j=1}^N ∫₀^T C(t, x^j_t, S(t, x^j_t, v_t)) dt

ω^j_t denotes here a particular trajectory of the noise. Thus, at the end, we have to solve a deterministic control problem. For that, we can use a gradient technique or the Pontryagin principle. For discrete-time systems the convergence properties of this approach have been studied in Quadrat-Viot [15]. An application to the French hydropower system is currently being done at EDF. Feedbacks on the demand of electricity and the level of water in the local dam are optimized with success by this technique (Lederer-Colleter [2]).
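A minimal sketch of this Monte Carlo idea, with scalar dynamics, the feedback form u = −v·x and all constants being our own illustrative choices: the noise trajectories are drawn once, so the sample-average cost becomes a deterministic function of the parameter v, which a crude finite-difference gradient descent then minimizes:

```python
import random

random.seed(0)
N, T, dt, sigma = 100, 40, 0.05, 0.3
noises = [[random.gauss(0.0, 1.0) for _ in range(T)] for _ in range(N)]  # drawn once

def sample_cost(v):
    """Deterministic sample average of int (x^2 + u^2) dt under feedback u = -v x."""
    total = 0.0
    for w in noises:
        x = 1.0
        for t in range(T):
            u = -v * x
            total += (x * x + u * u) * dt
            x += u * dt + sigma * dt ** 0.5 * w[t]   # Euler step of the dynamics
    return total / N

v, step, h = 0.0, 0.2, 1e-4
for _ in range(60):                                  # finite-difference descent
    grad = (sample_cost(v + h) - sample_cost(v - h)) / (2 * h)
    v -= step * grad
best_v = v
```

Here best_v plays the role of the optimized open-loop parameter; in the paper's setting v would itself be time-dependent and the gradient would come from an adjoint computation rather than finite differences.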
The idea of the stochastic gradient method is the same as the former one, but we use a recursive way to optimize. The recursion is on the index of the trajectory of the noise generated. The problem (1), (2) can be reduced to the problem

Min_{v∈V} E J(v)

in a situation where we are able to compute ∂J/∂v by an adjoint state technique ; here

J(v) = ∫₀^T C(t, x_t, S(t, x_t, v_t)) dt .

The algorithm is :

v_{r+1} = P_V { v_r − ρ_r (∂J/∂v)(v_r, ω_r) } ,   ρ_r ∈ ℝ₊ ,  v_r ∈ V ,

Σ_r ρ_r = ∞ ,   Σ_r ρ_r² < ∞ .
In a convex situation, which is not in general the case for the problem (1), (2), we can give some global convergence results.
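A runnable toy instance of the recursive scheme (problem and constants are ours): J(ω,v) = (v − ω)² with ω ~ N(2,1), so ∂J/∂v is exact, E J(v) is convex with minimizer v = Eω = 2, and ρ_r = 1/(2r) satisfies Σ ρ_r = ∞, Σ ρ_r² < ∞:

```python
import random

random.seed(1)
proj = lambda v: min(max(v, -10.0), 10.0)    # projection P_V on V = [-10, 10]

v = 5.0
for r in range(1, 20001):
    omega = random.gauss(2.0, 1.0)           # r-th noise sample omega_r
    grad = 2.0 * (v - omega)                 # exact dJ/dv(v_r, omega_r)
    rho = 0.5 / r                            # sum rho_r = inf, sum rho_r^2 < inf
    v = proj(v - rho * grad)
v_final = v
```

With this step size the iteration essentially reduces to the running average of the ω_r, and v converges to the minimizer of E J, one noise sample per iteration.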
Theorem 1

On the hypotheses :

1) v → J(ω,v) convex ∀ ω ;

2) ω → J(ω,v) is L¹ , ∀ v
where ν* denotes the optimal cost and ℓ(v) denotes the distance of v to the optimal set,

E ℓ²(v_r) ≤ c² / ( q² r + 1/γ₀ )
The proof of this theorem can be found in Dodu-Goursat-Hertz-Quadrat-Viot [5]; a lot of similar results can be found in Polyak [14] and in the references of that paper. In Kushner-Clark [12] local convergence results are proved in the non-convex case.
The following result shows that in some sense the stochastic gradient algorithm is
optimal. We suppose that :
Theorem 2
with
32
Hp
: ~v
- - 2 zp J(v)
:I~ t~--JIv ~@2
IV - PERTURBATION METHODS.
We denote by
We suppose that :
(5) dX_t = f(x_t, u_t) dt
V(0,y) = Min_u { ∫_0^T C(x_t, u_t) dt | X(0) = y } ;
The second-variation calculus around the optimal trajectory of (6) (Cruz [3]) gives the osculating quadratic form of the optimal cost V around the optimal trajectory. This quadratic form is defined by the (n,n) time-dependent matrix P, solution of the Riccati equation:
(6) Ṗ + PA + A'P − PSP + Q = 0 ,  P(T) = 0
where
(7) A = f_x − f_u H_uu^{-1} H_ux
(8) S = f_u H_uu^{-1} f_u'
(9) Q = H_xx − H_ux' H_uu^{-1} H_ux
are evaluated along the optimal trajectory of (5), on the hypotheses that:
where x_0(t) denotes the optimal trajectory of the deterministic control problem (5), X(t) the actual trajectory of the diffusion process under the control (11), and K(t) is defined by
(12) K(t) = −H_uu^{-1} (H_ux + f_u' P)(t)
We have :
Theorem
On the hypotheses (3), (4), (10) and (f,c) twice differentiable, the affine control built on the deterministic control problem, used in the stochastic control problem, leads to a loss of optimality of order O(ε⁴).
Ideas of the proof: Fleming [6] has shown that the optimal deterministic feedback used in the stochastic control problem leads to an error of O(ε⁴). But in the estimates of the proof he does not need the optimal deterministic control, only a control which gives the exact V, ∂V/∂y, ∂²V/∂y² along the optimal trajectory of the deterministic control problem.
Using for example Cruz [3] we know that the affine feedback (11) has this property, and thus the result is proved.
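For a scalar linear-quadratic example (an assumption for illustration, in which A, S, Q and the gain all reduce to scalars), the Riccati equation can be integrated backward from P(T) = 0:

```python
import math

# Scalar linear-quadratic data (assumed for illustration):
#   f(x,u) = a*x + b*u,  C(x,u) = (q*x^2 + r*u^2)/2,
# so that A = a, S = b^2/r, Q = q, H_uu = r, H_ux = 0.
a, b, q, r, T = 0.0, 1.0, 1.0, 1.0, 5.0

# Integrate  P' + P*A + A'*P - P*S*P + Q = 0,  P(T) = 0  backward in time
# with explicit Euler in tau = T - t (dP/dtau = -P').
n = 5000
dtau = T / n
P = 0.0
for _ in range(n):
    P += dtau * (2 * a * P - (b * b / r) * P * P + q)

# Affine feedback gain K(t) = -H_uu^{-1} (H_ux + f_u' P).
K = -(1.0 / r) * (0.0 + b * P)
print(P, K)   # with these numbers P(0) = tanh(T), nearly 1
```

With a = 0 and b = q = r = 1 the backward equation is dP/dτ = 1 − P², whose solution is tanh(τ), which gives a convenient closed-form check on the integration.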
REFERENCES.
[1] BASKETT - CHANDY - MUNTZ - PALACIOS : Open, Closed and Mixed Networks of Queues with Different Classes of Customers, JACM 22, pp. 248-260.
[2] COLLETER - LEDERER : Internal EDF reports on the management of the French hydroelectric reservoirs.
[6] FLEMING : Stochastic Control for Small Noise Intensities, Brown University Report, April 1970.
[7] HOLLAND : - Small Noise Open Loop Control, SIAM J. Control, 12, August 1974.
- Gaussian Open Loop Control Problems, SIAM J. Control, 13, August 1975.
[8] ISIDORI - KRENER - GORI-GIORGI - MONACO : Nonlinear Decoupling via Feedback: a Differential Geometric Approach, IEEE Trans. Automatic Control, AC-26, no. 2, April 1981.
[10] KELLY : Reversibility and Stochastic Networks, J. Wiley and Sons, 1979.
" " : Product Form and Optimal Local Feedback for Multiindex Markov Chains, Allerton Conference, 1980.
by
Raymond Rishel
Department of Mathematics
University of Kentucky
Lexington, KY 40506
I. INTRODUCTION
x ↔ (x_0, T_1, x_1, T_2, ..., x_n, T_n, ...)  (1)
between the process x and the sequence of its jump times and jump locations. If we use the convention that T_0 ≡ 0, this correspondence is defined by
P[T_{n+1} ≤ t, x_{n+1} = j | X_n]
= Σ_i 1_{T_n ≤ t, x_n = i} ∫_{T_n}^t a_ij(s, u(s,Y_k)) exp[∫_{T_n}^s a_ii(r, u(r,Y_k)) dr] ds  (9)
E[∫_0^T c(s, x(s), u(s)) ds]  (13)
is a minimum.
¹The notation 1_A denotes the characteristic function of the set A, that is the function which is one on A and zero on the complement of A.
u_t = u(y(s); 0 ≤ s ≤ t)
σ(X_n) = σ[x_0, T_1, x_1, ..., T_n, x_n]
THEOREM 1: On the interval τ_k ≤ t < T, for each i for which g(i) = y_k, Q_i(t,Y_k) satisfies the system of differential equations
(d/dt) Q_i(t,Y_k) = a_ii(t, u(t,Y_k)) Q_i(t,Y_k)
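Between observed jumps this is a linear ODE that can be propagated directly. A minimal numerical sketch, with hypothetical two-state rates and initial values (not from the paper):

```python
import numpy as np

# Hypothetical two-state illustration: between observed jumps each
# unnormalized conditional probability satisfies dQ_i/dt = a_ii Q_i;
# here the diagonal rates a_ii = -lam_i are constant in t.
lam = np.array([0.5, 2.0])     # -a_ii for the states i with g(i) = y_k
Q0 = np.array([0.7, 0.3])      # values at the last observed jump time tau_k

def Q(t):
    """Solve the linear ODE for an elapsed time t since tau_k."""
    return Q0 * np.exp(-lam * t)

# Normalizing recovers the conditional state distribution given that no new
# observed jump has occurred up to time t.
post = Q(1.0) / Q(1.0).sum()
print(post)
```

The unnormalized quantities Q_i decay at their own exponential rates, and only the final normalization couples the states.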
J_u(t,X_n) = E[1_{T_n ≤ t < T_{n+1}} ∫_t^{T∧T_{n+1}} c(s,x(s),u(s)) ds | X_n] / E[1_{T_n ≤ t < T_{n+1}} | X_n]  (21)
as a function of (t, Y_k, i).
(d/dt) J_u(t, Y_k, i) ,  i = 1, ..., N
THEOREM 4: The conditional remaining cost from an observed jump time τ_k onward (or from time τ_0 = 0 onward if k = 0) conditioned with respect to the observed history Y_k satisfies
E[∫_{τ_k}^T c(s,x(s),u(s)) ds | Y_k]  (25)
Fix an integer r. Let {u*(t,Y_k)} be an optimal control for the problem formulated in Section II. In this section we shall formulate two deterministic control problems whose solutions agree with u*(t,Y_r). These problems look at minimizing the conditional remaining cost from the r-th jump onward over a class of controls which differ from the optimal control only between the r-th and (r+1)-st observed jump.
To begin this discussion consider the class of controls {v(t,Y_k)} such that v(t,Y_k) = u*(t,Y_k) if k ≠ r. That is, these controls agree with the optimal control except perhaps between the r-th and (r+1)-st jump of the observed process. Since controls of the above class agree up to time τ_r with the optimal control, the following theorem follows from the standard argument used to obtain the principle of optimality.
Since any control v of the class above agrees with the optimal control u* from the (r+1)-st observed jump onward, another standard argument implies that
Theorem 5 and formulas (26), (23), (24) suggest we again fix r and Y_r, abbreviate by defining v(t) = v(t,Y_r), J_i(t) = J_v(t,Y_r,i), and define the deterministic optimal control problem with states J_i and control v of:
with transversality conditions
λ_j(T) = 0 .  (37)
A similar calculation with Pontryagin's principle shows that the adjoint equations
of Problem B are the same as the state equations of Problem A, and that the transvers-
ality conditions are the negatives of the boundary conditions (32). Thus the adjoint
variables of Problem B are the negatives of the state variables of Problem A. Putting
these together with the maximum condition of Pontryagin's principle gives the follow-
ing type of duality for Problems A and B.
veU " {j : g ( i ) ~ g ( j ) } u r z3 ]
THEOREM 7: A necessary condition that {u*(t,Y_k)} be an optimal control for the stochastic control problem of Section II is that almost surely with respect to the distribution of Y_k and almost surely with respect to Lebesgue measure on [τ_k,T] that
APPENDIX
VII. UNNORMALIZED CONDITIONAL PROBABILITIES
In this appendix, x is a jump process and y defined by y(t) = g(x(t)) is the observed process. To shorten the notation slightly we shall assume that the conditional distributions P[T_{n+1} ≤ t, x_{n+1} = j | X_n] have the form
P[T_{n+1} ≤ t, x_{n+1} = j | X_n]
= Σ_i 1_{T_n ≤ t, x_n = i} ∫_{T_n}^t a_ij(s,Y_k) E[1_{T_n ≤ s < T_{n+1}} | X_n] ds  (42)
= Σ_i 1_{τ_k ≤ T_n < τ_{k+1}, x_n = i} ∫_{τ_k}^t a_ij(s,Y_k) E[1_{T_n ≤ s < T_{n+1}} | X_n] ds .  (43)
E[1_{x_n = i, τ_k ≤ T_n < τ_{k+1}, T_n ≤ t < T_{n+1}} | X_n]
= 1_{x_n = i, τ_k ≤ T_n < τ_{k+1}, T_n ≤ t}
+ 1_{x_n = i, τ_k ≤ T_n < τ_{k+1}} ∫_{τ_k}^t a_ii(s,Y_k) E[1_{T_n ≤ s < T_{n+1}} | X_n] ds  (44)
P[T_{n+1} ≤ t | X_n] = Σ_i 1_{T_n ≤ t, x_n = i} [1 − exp ∫_{T_n}^t a_ii(r,Y_k) dr]  (46)
E[1_{T_n ≤ t < T_{n+1}} | X_n] = Σ_i 1_{T_n ≤ t, x_n = i} exp ∫_{T_n}^t a_ii(r,Y_k) dr  (47)
The lower limit T_n in (49) may be replaced by τ_k, because on the set {τ_k ≤ T_n}, 1_{T_n ≤ s} is zero for τ_k ≤ s < T_n. In addition
THEOREM: Let σ and σ' be two σ-fields. Let A be a set which is in both σ and σ' and for which
σ ∩ A ⊂ σ' ∩ A .  (52)
Then if z is a random variable for which
E|z| < ∞
it follows that
E[1_A E[z | σ'] | σ] = E[1_A z | σ] .  (53)
In particular the set {τ_k ≤ T_n} is both σ(X_n) and σ(Y_k) measurable and
1_{τ_k ≤ T_n} σ(Y_k) ⊂ 1_{τ_k ≤ T_n} σ(X_n) ,  (54)
Thus
E[1_{τ_k ≤ T_n} E[z | X_n] | Y_k] = E[1_{τ_k ≤ T_n} z | Y_k] .  (55)
THEOREM A1: Given the history Y_k of the first k jumps of the observed process, the conditional probability that the next observed jump τ_{k+1} occurs before t and that at this time the unobserved process x takes on the value j is given by
P[τ_{k+1} ≤ t, x(τ_{k+1}) = j | Y_k]
= 0  if g(j) = y_k ,
= ∫_{τ_k}^t Σ_i Q_i(s,Y_k) a_ij(s,Y_k) ds  if g(j) ≠ y_k .  (56)
The form of the conditional probability distribution of the observed process and the conditional probabilities of the unobserved process at the time of an observed jump are given in the following two corollaries, which follow immediately from Theorem A1.
COROLLARY A1:
P[τ_{k+1} ≤ t, y_{k+1} = ℓ | Y_k]
= ∫_{τ_k}^t Σ_{j: g(j)=ℓ} Σ_i Q_i(s,Y_k) a_ij(s,Y_k) ds .
COROLLARY A2:
P[x(τ_{k+1}) = j | Y_{k+1}]
= Σ_i Q_i(τ_{k+1},Y_k) a_ij(τ_{k+1},Y_k) / Σ_{j: g(j)=y_{k+1}} Σ_i Q_i(τ_{k+1},Y_k) a_ij(τ_{k+1},Y_k)  if g(j) = y_{k+1} ,
= 0  if g(j) ≠ y_{k+1} .
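Corollary A2 can be illustrated with a small hypothetical example; all numbers below are assumptions for illustration only:

```python
import numpy as np

# Hypothetical data: three hidden states with observation g(1) = g(2) = A,
# g(3) = B, just before an observed jump to level y_{k+1} = B.
Qi = np.array([0.4, 0.2, 0.0])        # Q_i just before the jump
aij = np.array([[-1.0, 0.3, 0.7],     # intensities a_ij(tau_{k+1}, Y_k)
                [0.5, -1.5, 1.0],
                [0.2, 0.8, -1.0]])
compatible = np.array([False, False, True])   # states j with g(j) = y_{k+1}

w = Qi @ aij                          # sum_i Q_i a_ij for every j
w = np.where(compatible, w, 0.0)      # the probability is 0 if g(j) != y_{k+1}
post = w / w.sum()
print(post)   # all mass on the only compatible state here
```

Each compatible landing state j is weighted by the total intensity of jumping into it from the pre-jump distribution, and the denominator simply renormalizes over the compatible states.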
PROOF OF THEOREM A1: Since τ_{k+1} is a jump time of the observed process, the state j that the unobserved process jumps to must satisfy g(j) ≠ y_k. Thus it is impossible at time τ_{k+1} for the unobserved process to be in a state j for which g(j) = y_k, and the conditional probability (56) is 0 for these states. In the remainder of the proof assume that g(j) ≠ y_k.
Since g(j) ≠ y_k and since {τ_k} are the jumps of the observed process y and {T_n} the jumps of the unobserved process x, we have that
P[τ_k ≤ T_n < τ_{k+1}, T_{n+1} ≤ t, x_{n+1} = j | Y_k]
= E[1_{τ_k ≤ T_n} E[1_{τ_k ≤ T_n < τ_{k+1}, T_{n+1} ≤ t, x_{n+1} = j} | X_n] | Y_k] .  (61)
The σ(X_n) measurability of 1_{τ_k ≤ T_n < τ_{k+1}} and (42) imply
E[1_{τ_k ≤ T_n < τ_{k+1}, T_{n+1} ≤ t, x_{n+1} = j} | X_n]
= Σ_i 1_{τ_k ≤ T_n < τ_{k+1}, x_n = i} ∫_{τ_k}^t a_ij(s,Y_k) E[1_{T_n ≤ s < T_{n+1}} | X_n] ds .  (62)
Thus substituting (62) in (61) and then (61) in (60), interchanging integration, summation and conditional expectation; using the generalized law of iterated conditional expectations; and using that
= ∫_{τ_k}^t Σ_i E[1_{τ_k ≤ s < τ_{k+1}, x(s) = i} | Y_k] a_ij(s,Y_k) ds  (64)
= ∫_{τ_k}^t Σ_i Q_i(s,Y_k) a_ij(s,Y_k) ds ,
which is the conclusion of Theorem A1.
PROOF OF THEOREM 1: If g(i) ≠ y_k, the sets involved in the definition of Q_i(t,Y_k) have an empty intersection, and Q_i(t,Y_k) is the conditional expectation of a zero random variable and hence is zero. Assume in the remainder of the proof that g(i) = y_k.
The monotone convergence theorem for conditional expectations implies
Since {τ_k} are the jumps of the observed process y and {T_n} the jumps of x, the following set equality holds:
{τ_k ≤ t < τ_{k+1}, T_n ≤ t < T_{n+1}} = {τ_k ≤ T_n < τ_{k+1}, T_n ≤ t < T_{n+1}} .  (66)
The generalized law of iterated conditional expectations, (66) and (44) imply
E[1_{x(t)=i, τ_k ≤ t < τ_{k+1}, T_n ≤ t < T_{n+1}} | Y_k]
= E[1_{x_n=i, τ_k ≤ T_n < τ_{k+1}, T_n ≤ t} | Y_k]
+ E[1_{x_n=i, τ_k ≤ T_n < τ_{k+1}} ∫_{τ_k}^t a_ii(s,Y_k) E[1_{T_n ≤ s < T_{n+1}} | X_n] ds | Y_k] .
Combining (69), (68) and (67), using the generalized law of iterated conditional expectations, and interchanging integration, summation and conditional expectation gives
E[1_{x(t)=i, τ_k ≤ t < τ_{k+1}, T_n ≤ t < T_{n+1}} | Y_k]
E[1_{x(t)=i, τ_k ≤ t < τ_{k+1}} | Y_k]
THEOREM A2: If z is any σ{T_{n+1}, x_{n+1}, ..., T_{n+k}, x_{n+k}, ...}-measurable random variable for which E|z| < ∞, then E[z | X_n] can be expressed as a function of x_n and Y_k, where Y_k is the observed history corresponding to X_n.
PROOF: It can be shown using (40) that the theorem is true for the characteristic function of any σ(T_{n+1}, x_{n+1}, ..., T_{n+k}, x_{n+k})-measurable set. The class of random variables for which Theorem A2 is true is closed under monotone limits and products. Therefore, the theorem follows from the monotone class theorem.
Theorem 2 follows by applying Theorem A2 to both the numerator and denominator
in the definition (21) of Ju(t,Xn).
Σ_{j: g(j)=g(i)} J_u(t,Y_k,j) a_ij(t,u(t,Y_k))  (23)
+ Σ_{j: g(j)≠g(i)} J_u(t,Y_k,t,g(j),j) a_ij(t,u(t,Y_k))
− J_u(t,Y_k,i) a_ii(t,u(t,Y_k)) .
PROOF OF THEOREM 3: Breaking the integral in (73) into the integral between t and the next jump T_{n+1}, using that between the n-th and (n+1)-st jump x(t) = x_n and u(t) = u(t,Y_k), and then using the formulas (41) and (56) giving the distribution of T_{n+1} and x_{n+1}, we have
E[1_{T_n ≤ t < T_{n+1}} ∫_t^{T∧T_{n+1}} c(s,x(s),u(s)) ds | X_n]
= E[1_{T_n ≤ t < T_{n+1}} ∫_t^{T∧T_{n+1}} c(s,i,u(s,Y_k)) ds | X_n]  (73)
Now since
J_u(t,X_n) = E[1_{T_n ≤ t < T_{n+1}} ∫_t^{T∧T_{n+1}} c(s,x(s),u(s)) ds | X_n] / E[1_{T_n ≤ t < T_{n+1}} | X_n]
and from (41)
E[1_{T_n ≤ t < T_{n+1}} | X_n] = Σ_i 1_{T_n ≤ t, x_n = i} exp ∫_{T_n}^t a_ii(r,Y_k) dr ,  (41)
+ ∫_t^T [ Σ_{j: g(j)=y_k} J_u(s,Y_k,j) a_ij(s,Y_k) + Σ_{j: g(j)≠y_k} J_u(s,Y_k,s,g(j),j) a_ij(s,Y_k) ] exp[∫_t^s a_ii(r,Y_k) dr] ds .
THEOREM 4: The conditional remaining cost from an observed jump time τ_k onward (or from time τ_0 = 0 onward if k = 0) conditioned with respect to the observed history Y_k satisfies
E[∫_{τ_k}^T c(s,x(s),u(s)) ds | Y_k]  (25)
PROOF OF THEOREM 4:
E[∫_{τ_k}^T c(s,x(s),u(s)) ds | Y_k]
= E[∫_{τ_k}^{T∧τ_{k+1}} c(s,x(s),u(s)) ds + ∫_{T∧τ_{k+1}}^T c(s,x(s),u(s)) ds | Y_k] .  (76)
Using the generalized law of iterated conditional expectations,
E[∫_{T∧τ_{k+1}}^T c(s,x(s),u(s)) ds | Y_k]
Now
= E[E[Σ_n 1_{τ_{k+1}=T_n} E[∫_{T∧τ_{k+1}}^T c(s,x(s),u(s)) ds | X_n] | Y_{k+1}] | Y_k] .  (77)
E[∫_{τ_k}^T c(s,x(s),u(s)) ds | Y_k]
= E[∫_{τ_k}^{T∧τ_{k+1}} c(s,x(s),u(s)) ds | Y_k] + E[Σ_j 1_{x(τ_{k+1})=j} J_u(τ_{k+1},Y_{k+1},j) | Y_k]
= ∫_{τ_k}^T Σ_i c(s,i,u(s,Y_k)) Q_i(s,Y_k) ds + ∫_{τ_k}^T Σ_j J_u(s,Y_k,s,g(j),j) Σ_i Q_i(s,Y_k) a_ij(s,Y_k) ds
= ∫_{τ_k}^T Σ_i [ c(s,i,u(s,Y_k)) + Σ_{j: g(j)≠y_k} J_u(s,Y_k,s,g(j),j) a_ij(s,Y_k) ] Q_i(s,Y_k) ds ,
which gives (25).
To obtain (26), notice using the generalized law of iterated conditional expec-
tations and Theorem 2 that
REFERENCES
[1] R. Boel, P. Varaiya, "Optimal Control of Jump Processes," SIAM Journal on Control and Optimization, Vol. 15 (1977).
[2] M. Kohlmann, A. Makowski, R. Rishel, "Representation Results for Jump Processes with Application to Optimal Stopping," Stochastics, Vol. 4 (1980).
[3] A. Makowski, "Local Optimality Conditions for Optimal Stopping," Stochastics, 1982.
[4] R. Rishel, "A Minimum Principle for Controlled Jump Processes," Springer Lecture Notes in Economics and Mathematical Systems, Vol. 107 (1975).
[5] R. Rishel, "The Minimum Principle, Separation Principle and Dynamic Programming for Partially Observed Jump Processes," IEEE Transactions on Automatic Control, Vol. AC-23 (1978).
[6] R. Rishel, "Optimality for Completely Observed Jump Processes," IEEE Transactions on Automatic Control, Vol. AC-22 (1977).
[7] R. Rishel, "State Estimation for Partially Observed Jump Processes," Journal of Mathematical Analysis and Applications, Vol. 65 (1978).
[8] M. Rudemo, "State Estimation for Partially Observed Markov Chains," Journal of Mathematical Analysis and Applications, Vol. 44 (1973).
[9] A. Segall, "Optimal Control of a Finite State Markov Process," IEEE Transactions on Automatic Control, Vol. AC-22 (1977).
ON NORMAL APPROXIMATION IN BANACH SPACES
V. V. Sazonov
Moscow, U.S.S.R.
tant new results in the area. Consider first the somewhat simpler case of Hilbert space.
Let XoXz, be a sequence of independent identically distribu-
induced on ~ by Y .
: e {l A +,xJ,
where 𝒜 is some class of Borel subsets of H. It is well known (see e.g. [15]) that contrary to the finite-dimensional case Δ_n(𝒜) in general does not tend to zero as n → ∞ if 𝒜 is rich enough, e.g. if 𝒜 contains the class of all halfspaces or the class of all balls. In what follows we will be concerned mainly with the case when 𝒜 = 𝒜_a, the class of all balls with a fixed centre a.
The first estimates of Δ_n(𝒜_a) (apart from the logarithmic speed obtained in [17]) were of the type O(n^{−α}), α < 1/6, and were valid when E|X_j|^s < ∞ for some s ≥ 3 (depending on the estimate) (see [11], [13], and Yurinskii's theorem in [15]). Under some additional (rather restrictive) conditions on the probability measure induced by X_1 better speeds were obtained in [4] and [5].
Denote by λ_k the k-th eigenvalue of V (we assume that the eigenvalues of V are numbered in decreasing order). In 1979 F. Götze [7] proved that if E|X_1|⁴ < ∞ and λ_6 > 0 then
Δ_n(𝒜_a) = O(n^{−1+ε})  (1)
for any ε > 0. Later F. Götze was able to show (see [10]) that (1) is true if E|X_1|⁴ < ∞ and λ_j > 0, j ≥ 1, with λ_j = O(j^{−p}) for some p > 0.
In 1981 V. V. Yurinskii [18] under the assumption that the X_j, j ≥ 1, are independent (but not necessarily identically distributed), EX_j = 0, and all X_j have the same covariance matrix V, proved that
sitive integer. In this more general case the result has the same form (see [20], [21]).
In the same paper B. A. Zalesskii pointed out that the speed in (2) is the best possible; namely, for any ε > 0 he constructed a sequence
He also proved that there exists a sequence {X_j} with E|X_j|⁴ < ∞ such that
by the estimation of
red Gaussian ~ -valued random variables with the same covariance ope
e.g. [1], [2], [6], [8], [9], [12]-[14]. Here we will mention two recent results. The first one belongs to F. Götze [9] and is more general also in the sense that it deals with a larger class of sets than balls. Sup-
-c4} F(¢)(~)ii cr
plied to F(x) = ||x|| only if ||·|| is smooth, which is not always true; e.g. the supremum norm in B = C(S), the space of continuous real functions
A =EIx,l
References
19(1979), 23-43.
3. Borovskih, Yu., Estimates of characteristic functions with applica-
9(1981), 552-559.
9. , On the rate of convergence in the .central limit theorem
in Banach spaces, Preprints in Statistics, University of
Cologne, 68(1981), pp. 1-34.
lO. , Convergence rate in the central limit theorem in Hilbert
Abstracts, p. 35.
ll. Kuelbs, J., Kurtz, T., Berry-Esseen estimates in Hilbert space
and an application to the law of the iterated logarithm,
(1976) , 775-791.
14. , Paulauskas, V., On convergence of some functionals of
sums of independent random variables in a real Banach space,
Litovsk. Mat. Sb., 16(1979), 103-121.
15. Sazonov, V.V., Normal approximation - some recent advances. Lec-
(1981), 557-558.
20. Zalesskii, B.A., Estimates of the accuracy of normal approximation in a Hilbert space, Teor. Verojatnost. i Primenen., 2(1982), 279-285.
21. , On the rate of convergence in the central limit theorem
1. INTRODUCTION
In this paper we consider a class of problems in which the process to be controlled is fundamentally a Brownian motion on the nonnegative real line with either absorption or reflection at the origin. At each point in time one of a finite number of controls must be employed, and the controls chosen influence the evolution of the Brownian motion process. Operational costs and linear holding costs are continuously incurred at a rate depending on the control in use and the state of the Brownian motion process. Lump-sum switching costs are incurred instantaneously each time there is a switch in controls, and these switching costs are dependent on the two controls involved in the switch. Further, if the barrier at the origin is absorbing there is a cost for hitting the boundary. We allow a general set of control strategies where the current control may depend in an arbitrary, measurable way on past states and past controls. The objective is to determine a strategy for selecting a control at each point in time, so as to minimize the total expected cost discounted over an infinite planning horizon. These problems are related to those studied by Doshi [4,5], Mandl [10], and Pliska [13], differences being in the form of the holding costs or switching costs; and also are related to work by Arkin, Kolemaev, and Shiryaev [1], Benes [2], Davis and Varaiya [3], and Fleming [6], differences being in the controllable nature of the drift coefficient and/or diffusion coefficient of the diffusion processes. Additionally, our focus emphasizes explicit production of optimal control strategies.
Diffusion models in general, and controlled Brownian motion in particular, have been used by Foschini [7], Iglehart [8], and others in modeling water reservoirs, cash management inventories and other input-output systems. As an application of the work presented here, consider a service facility with infinite capacity that can be operated at different service rates. (For example, the rate at which a packet switch transports data might be varied by changing the number or speed of outgoing transmission channels.) Suppose the costs associated with this service facility are a linear customer-holding cost,
2. FORMULATION
g(x,i) = hx + r_i  if x > 0 ,
         (1−λ)αR  if x = 0 .
+ Σ_{k=1}^M Σ_{ℓ=1}^M E[∫_0^∞ C_kℓ e^{−αt} dQ_kℓ(t|x,i,ω)]
Lemma 1. For an admissible strategy ω and for each i ∈ A, |V_ω(x,i) − hx/α| is bounded for each x ∈ S if and only if
E[∫_0^∞ e^{−αt} dQ_kℓ(t|x,i,ω)] < ∞ for each k,ℓ in {1,2,...,M} where C_kℓ > 0.
|E[∫_0^∞ e^{−αt} g(·|x,i,ω) dt] − hx/α| < ∞ if and only if
Σ_{k=1}^M Σ_{ℓ=1}^M E[∫_0^∞ C_kℓ e^{−αt} dQ_kℓ(t|x,i,ω)] is finite. []
3. OPTIMAL BAND STRATEGIES
mode 2
s5
mode i ~ mode 2, else continue
s4
mode 3
s3
f(x,1) = 1 if x ∈ [0,s₁]
         if x ∈ (s₁,s₂]
         if x ∈ (s₂,s₃)
         if x ∈ [s₃,s₄]
         if x ∈ (s₄,∞) ,
f(x,2) = 1 if x ∈ [0,s₁]
         if x ∈ (s₁,s₂]
         if x ∈ (s₂,s₃)
         if x ∈ [s₃,s₄]
         if x ∈ (s₄,∞) , and
f(x,3) = 1 if x ∈ [0,s₁]
         if x ∈ (s₁,s₅)
         if x ∈ [s₅,∞)
(0,TI)" aI &
Suppose that θ₁(T₁) = s_p^{a₁}. Now either i) s_p^{a₁} is a closed boundary point of I^{a₁}, ii) s_p^{a₁} is an open boundary point of I^{a₁} but not a boundary point of I^{a₂}, or iii) s_p^{a₁} is a boundary point of I^{a₁} and of I^{a₂}, with f(s_p^{a₁}+, a₂) = a₁ for i), f(s_p^{a₁}−, a₁) = a₂ for ii). A result of
Nakao [12] guarantees the pathwise unique existence of the process {θ₂(t); t ≥ T₁} s.t.
where
μ(y) = μ_{a₂} if y < s_p^{a₁} ,  μ_{a₁} if y > s_p^{a₁} ,
σ(y) = σ_{a₂} if y < s_p^{a₁} ,  σ_{a₁} if y > s_p^{a₁} ,
and μ(s_p^{a₁}) = μ_{a₁}, σ(s_p^{a₁}) = σ_{a₁} for i), and μ(s_p^{a₁}) = μ_{a₂}, σ(s_p^{a₁}) = σ_{a₂} for ii).
i ∈ A, p ∈ {0,1,...,n(i)}
conclude that E[T_{n+1} − T_n] ≥ b > 0 for all n and some constant b. This follows as in Theorem 2.2 in Kunita and Watanabe [9], the difference being that our class of functions includes those that misbehave at a finite number of points in each finite spatial interval.
We now show the value function of a band strategy to be the solution to a standard differential equation.
Theorem 2. For each band function f and state-action pair (x,i) ∈ S×A, let ω_{f,x,i} denote the admissible strategy uniquely corresponding to f, x, and i. The function V_f: S×A → ℝ defined as V_f(x,i) = V_{ω_{f,x,i}}(x,i) is the unique function V: S×A → ℝ satisfying the following for each i ∈ A:
(3.6) V(·,i) ∈ C¹(S),
for each (x,i) ∈ S×A and distinct k,ℓ ∈ {1,2,...,M}. Fix (x,i) ∈ S×A and define {Y_n: n ≥ 0} by Y_0 = 0 and Y_n = T_n − T_{n−1} for n ≥ 1, where the sequence of stopping times {T_n; n ≥ 0} associated with f, x, i is defined as in the proof of Theorem 1. We have already established that the Y_n's are independent and that E[Y_n] ≥ b > 0 for each n and some constant b. Define {S_n; n ≥ 0} by S_n = Σ_{j=1}^n Y_j for n > 0 and {N(t); t ≥ 0} by N(t) = sup{n: S_n ≤ t} for t ≥ 0. Since S_0 = 0,
E[S_{N(t)+1}] = Σ_{n=0}^∞ E[Y_n 1_{N(t)+1 ≥ n}] ≥ b E[N(t) + 1].
Hence lim sup_{t→∞} E[N(t)]/t ≤ 1/b. Now fix distinct k,ℓ ∈ {1,2,...,M}.
e^{−αT} V_i(Z_i(T)) = V_i(x)
+ ∫_0^T [−α e^{−αt} V_i(Z_i(t)) + e^{−αt} V_i'(Z_i(t)) μ_i + ½ e^{−αt} V_i''(Z_i(t)) σ_i²] dt
+ ∫_0^T e^{−αt} V_i'(Z_i(t)) σ_i dB(t).
Taking expectations,
and since V_i(x) = V_f(x,i) on [s_p^i, s_{p+1}^i], function V_f satisfies (3.8).
where
μ̃(y) = μ_j if y < s_{p+1}^j ,  μ_a if y ≥ s_{p+1}^j ,
and
σ̃(y) = σ_j if y < s_{p+1}^j ,  σ_a if y ≥ s_{p+1}^j ,
K_{a,f(s_{p+1}^a,a)} + V_f[s_{p+1}^a, f(s_{p+1}^a,a)], respectively. Let V_{a_j}(y) denote the
and s_{p+1}^a.
Thus, if x is not a boundary point of I^i then V_f'(x,i) and V_f''(x,i) exist. If x is a closed boundary point of I^i, then V_f''(x,i) exists; and if x is an open boundary point of I^i but not a boundary point of I^{a(i)}, then again V_f''(x,i) exists. Finally, if x is a boundary point of I^{a(i)}, then V_f''(x,i) does not necessarily exist. So, (3.6) holds, and V_f is indeed a solution to (3.6)-(3.10).
Suppose now that function V: S×A → ℝ also satisfies (3.6)-(3.10). Then letting Δ(x,i) = V(x,i) − V_f(x,i) for each (x,i) ∈ S×A, we would find that the function Δ satisfies these conditions for each i ∈ A:
where β₁ is the positive root and β₂ the negative root of the quadratic equation μ_i β + ½ σ_i² β² − α = 0. But by (3.11), (3.13), and (3.14), it must be that γ₁ = γ₂ = 0. Hence, Δ(x,i) = 0 for each (x,i) ∈ S×A, and V_f is the unique solution to (3.6)-(3.10). []
Band function f, then, generates an (x,i)-optimal strategy if V_f(x,i) = V*(x,i). We call band function f (everywhere) optimal if for each (x,i) ∈ S×A its corresponding band strategy is (x,i)-optimal. After proving a verification lemma, we will derive necessary and sufficient conditions for a band function to be everywhere optimal.
Lemma 3. Suppose that for each i ∈ A, V: S×A → ℝ satisfies (3.6), (3.7), (3.10),
(3.15) V(x,i) ≤ K_ij + V(x,j) for each (x,j) ∈ S×A, and
(3.16) D_j V(x,i) − αV(x,i) + g(x,j) ≥ 0 for each (x,j) ∈ S×θ(i), where we further define D_j for those x without a second partial spatial
+ e-aTn+Ic~ ~I
(Tn+ !- ), 8 (Tn+l+
+ e-C~Tn+lc~ )!J.
(Tn+ I- ), 0 (Tn+l +
ii rl
-sT
v (x,i) > E IV[x,6(i)) + ~ ~ nV(X(Tn 1x,i,w),8(Tn+) )
n=l
-sT m~T ~]
- e n~(X(TnlX,i,~),~(Tn-) ) + e nC~(Tn ),~(Tn+
-sT
e ~nv(X(TnlX,i,w),8(Tn+))
n=l
there exists x ∈ S\{0}, i,j ∈ A and ε > 0 s.t. V_f(x,i) = K_ij + V_f(x,j) and D_j V_f(y,j) − αV_f(y,j) + g(y,j) < 0 for each y ∈ [x−ε, x+ε]. Defining the process, the stopping time T, and the admissible strategy ω exactly as above, we have
and hence V_ω(x,i) < K_ij + V_f(x,j) = V_f(x,i). Again we have contradicted the optimality of f, thereby proving that (3.20) is also necessary for f to be optimal. []
We can summarize the optimality conditions (3.19) and (3.20) by demanding that the optimal value function V_f satisfy the following single condition for each (x,i) ∈ S×A:
(3.25) for each i ∈ A, V_f'(·,i) is continuous on S.
Remarks. (i) Our cost structure can be simplified to one where each
of the operational costs r i is nonnegative. This is accomplished by
redefining cost function g: S×A → ℝ as
g(x,i) = hx + r̃_i  if x > 0 ,
         (1−λ)αR̃  if x = 0 ,
where r̃_i = r_i − r_* for i = 1,2,...,N, R̃ = R − r_*/α, and r_* = min_i {r_i}.
(2) In the case of absorption the cost structure can be simplified further so that there are zero holding costs as well. To see this observe that for any admissible strategy ω and (x,i) ∈ S×A,
+ Σ_{k=1}^M Σ_{ℓ=1}^M E[∫_0^∞ C_kℓ e^{−αt} dQ*_kℓ(t|x,i,ω)]
where T is the time of absorption; and if we change the order of integration,
+ Σ_{k=1}^M Σ_{ℓ=1}^M E[∫_0^∞ C_kℓ e^{−αt} dQ*_kℓ(t|x,i,ω)]
where
{μ_ω(t), r_ω(t)} = {μ(t), r(t)}  if t ≤ T ,
                   {0, 0}        if t > T .
Therefore we have an equivalent control problem if we define g: S×A → ℝ as
g(x,i) = r̃_i  if x > 0 ,
         αR̃  if x = 0 ,
where r̃_i = hμ_i/α + r_i − r̃_* for i = 1,2,...,N, R̃ = R − r̃_*/α, and r̃_* = min_i {hμ_i/α + r_i}.
(3) Together, conditions (3.21), (3.22), and (3.23) imply that for the optimal band function, the class continuation set I^i is at most
(There are M(M−1)/2 pairs of classes, and for each pair A_k and A_ℓ two switching numbers are involved. At one, switching out of A_k into A_ℓ occurs and at the other, switching out of A_ℓ into A_k occurs.) The (N_k − 1) term accounts for possible switching between actions within class A_k. (To partition the A_k-class continuation interval into its possible action continuation intervals, (N_k − 1) switching numbers are involved.)
For example, Figure 1 depicts an admissible band function for data N = 3, K₁₂ = K₂₁ = 0, K₁₃ = K₂₃ > 0, and K₃₁ = K₃₂ > 0. Suppose further that λ = 1 and h > 0. The optimal band function might be as follows:
f*(x,1) = 1 if x ∈ [0,s₁], 2 if x ∈ (s₁,s₃), 3 if x ∈ [s₃,∞),
f*(x,2) = 1 if x ∈ [0,s₁], 2 if x ∈ (s₁,s₃), 3 if x ∈ [s₃,∞), and
f*(x,3) = 1 if x ∈ [0,s₁], 2 if x ∈ (s₁,s₂], 3 if x ∈ (s₂,∞).
mode 3
s3~
mode l--)mode 2, else continue
s2
mode 2
mode I
0
4. EXPLICIT SOLUTIONS
β = [μ₁ + (μ₁² + 2ασ₁²)^{1/2}] / σ₁² ,  ρ = [μ₂ + (μ₂² + 2ασ₂²)^{1/2}] / σ₂² ,  ν = [−μ₁ + (μ₁² + 2ασ₁²)^{1/2}] / σ₁² ,
A₁ = (σ₂²/2) β² − μ₂ β − α ,  and  A₂ = (σ₁²/2) ρ² − μ₁ ρ − α .
372
2 2
oI > o2 use strategy f2
2 2
d'l --> °2 ALl ~ (pl-P2)B - eSr 2 use strategy fl
Pl-P2
r2 > - -
AI < (pl-~2)~ - ~ r 2 use strategy fz
~z £ ~2
pl-~2 a2 < (p2-Pl)p - ~pr 2 use strategy fz
r2 <
2 2 A2 > (~2-Pl)p - ~pr 2 use strategy f2
c I < 02
(2) Now consider the special case of N available control modes, reflection at the boundary, positive holding costs at rate h = 1, general switching costs, and such that μ₁ ≥ μ₂ ≥ ... ≥ μ_N and σ₁² ≥ σ₂² ≥ ... ≥ σ_N². Then
V_{f_N}(x,i) = K_{iN} + x/α + μ_N/α² + r_N/α + (1/(αβ̂)) e^{−β̂x}
where
β̂ = [μ_N + (μ_N² + 2ασ_N²)^{1/2}] / σ_N² .
V_{f₁}(x,2) = K + x/α + μ₁/α² + r₁/α + (1/(αβ)) e^{−βx} for all x ≥ 0,
(4.1) (μ₂−μ₁)/α + (r₂−r₁) + (1/(αβ)) A₁ e^{−βx} > αK for all x > 0.
Similarly for single band function f₂:
V_{f₂}(x,1) = K + x/α + μ₂/α² + r₂/α + (1/(αρ)) e^{−ρx} for all x ≥ 0,
(4.2) (μ₁−μ₂)/α + (r₁−r₂) + (1/(αρ)) A₂ e^{−ρx} > αK for all x ≥ 0.
Suppose instead we use the strategy of never switching control modes. Let f_c denote the corresponding band function; that is, f_c(x,i) = i for each (x,i). Then
V_{f_c}(x,1) = x/α + μ₁/α² + r₁/α + (1/(αβ)) e^{−βx} for all x > 0,
V_{f_c}(x,2) = x/α + μ₂/α² + r₂/α + (1/(αρ)) e^{−ρx} for all x > 0,
and optimality conditions (3.19) and (3.20) reduce to
−αK < (μ₁−μ₂)/α + r₁ − r₂ < αK
and
−αK < (μ₁−μ₂)/α + r₁ − r₂ + 1/β − 1/ρ < αK .
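As a rough numerical sketch of these checks (all parameter values below are assumptions, and the exponents follow the standard quadratic (σ²/2)β² − μβ − α = 0 for the decaying homogeneous solutions):

```python
import math

# Assumed parameter values for the two-mode problem (illustration only).
alpha, K = 0.1, 1.0
mu1, sig1 = -0.5, 1.0
mu2, sig2 = -1.0, 1.5
r1, r2 = 0.2, 0.2

# Positive exponents of the decaying homogeneous solutions e^{-beta x},
# e^{-rho x} of (sigma^2/2) V'' + mu V' - alpha V = 0 in modes 1 and 2.
beta = (mu1 + math.sqrt(mu1**2 + 2 * alpha * sig1**2)) / sig1**2
rho = (mu2 + math.sqrt(mu2**2 + 2 * alpha * sig2**2)) / sig2**2

# Never-switch window: both two-sided inequalities must hold.
lhs = (mu1 - mu2) / alpha + r1 - r2
never_switch = (-alpha * K < lhs < alpha * K
                and -alpha * K < lhs + 1 / beta - 1 / rho < alpha * K)
print(beta, rho, never_switch)
```

With these numbers the drift difference term (μ₁−μ₂)/α dominates the switching-cost window, so the never-switch strategy fails the test and a band strategy with switching is preferable.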
and
f*(x,2) = 1 if x ∈ [0,z), 2 if x ∈ (Z,∞),
where 0 ≤ z ≤ Z ≤ +∞. The optimal pair of switching numbers, z* and Z*, is chosen so as to satisfy the optimality conditions of Theorem 3. The optimality conditions for f* are
½(σ₁² − σ₂²) V_{f*}''(x,2) + (μ₁−μ₂) V_{f*}'(x,2) increasing in x ∈ [Z*,∞), and
½(σ₁² − σ₂²) V_{f*}''(z*,2) + (μ₁−μ₂) V_{f*}'(z*,2) = r₂ − r₁ + αK .
REFERENCES
[4] Doshi, B. T. (1978) Two Mode Control of Brownian Motion with Quad-
ratic Loss and Switching Costs. Stoch. Proc. Appl. 6, 277-289.
Daniel W. Stroock
O. Introduction:
This brief note is intended to introduce the reader to the Malliavin calculus.
However, rather than attempting to explain the intricacies of Malliavin's calculus,
I have decided to only sketch the ideas underlying his calculus and to concentrate
on describing several of the problems to which the calculus has been successfully
applied. Of course, I hope that, having seen its applications, the reader's
appetite will be whetted and that to satisfy his appetite the reader will seek more
information about this subject.
Denote by Θ the space of θ ∈ C([0,∞), R^d) such that θ(0) = 0 and let 𝒲 be Wiener measure on Θ. Given a mapping Φ: Θ → R^D, the purpose of Malliavin's calculus is to provide a mechanism for studying the regularity properties of the induced measure μ_Φ = 𝒲 ∘ Φ^{-1} on R^D. In particular, Malliavin's calculus gives one a way of testing for the absolute continuity of μ_Φ and examining the smoothness of f_Φ = dμ_Φ/dx when it exists. At the same time, it is often possible to obtain regularity results about the behavior of μ_Φ as a function of Φ. Probabilists who are familiar with diffusion theory and related subjects are all too well aware that, heretofore, the only way to attack such problems has been to identify μ_Φ as the solution to some functional equation and then invoke the regularity theory associated with that equation (cf. the discussion in the second paragraph of section 2) below).
Malliavin's idea is to work right in Wiener space (Θ,𝒲). In brief, his technique is to introduce a certain self-adjoint diffusion operator ℒ on L²(𝒲). The importance of taking ℒ to be a diffusion generator is that the associated bilinear map:
(1.1) ⟨φ,ψ⟩_ℒ ≡ ℒ(φψ) − φℒψ − ψℒφ
will then satisfy:
can integrate by parts. Namely, given Φ,ψ ∈ Dom(ℒ) such that ψ/⟨Φ,Φ⟩_ℒ ∈ Dom(ℒ), one has from (1.1), (1.2) and the symmetry of ℒ:
E^𝒲[(φ'∘Φ)ψ] = E^𝒲[⟨φ∘Φ,Φ⟩_ℒ (ψ/⟨Φ,Φ⟩_ℒ)]
= E^𝒲[φ∘Φ (Φ ℒ(ψ/⟨Φ,Φ⟩_ℒ) − (ψ/⟨Φ,Φ⟩_ℒ) ℒΦ − ℒ(Φ ψ/⟨Φ,Φ⟩_ℒ))]
= −E^𝒲[φ∘Φ (2 (ψ/⟨Φ,Φ⟩_ℒ) ℒΦ + ⟨Φ, ψ/⟨Φ,Φ⟩_ℒ⟩_ℒ)] .
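A one-dimensional analogue of this integration-by-parts mechanism can be checked numerically; the choice of ℒ as the Ornstein-Uhlenbeck generator on the real line is an illustration, not the Wiener-space construction:

```python
import numpy as np

# One-dimensional analogue (illustration, not the Wiener-space construction):
# take L phi = phi'' - x phi', the Ornstein-Uhlenbeck generator, which is
# self-adjoint for the standard Gaussian measure; then
#   <phi, psi>_L = L(phi psi) - phi L psi - psi L phi = 2 phi' psi'.
# With Phi(x) = x and psi = 1 the integration by parts collapses to Stein's
# identity E[f'(Phi)] = E[Phi f(Phi)].
t, wts = np.polynomial.hermite.hermgauss(60)     # Gauss-Hermite rule
x = np.sqrt(2.0) * t                             # nodes for X ~ N(0, 1)
mean = lambda g: np.sum(wts * g(x)) / np.sqrt(np.pi)

f, fprime = np.sin, np.cos
lhs = mean(fprime)                    # E[f'(Phi) psi]
rhs = mean(lambda y: y * f(y))        # the right-hand side of the identity
print(lhs, rhs)   # both equal E[cos X] = exp(-1/2)
```

The point of the identity is the same as in the Wiener-space version: a derivative of the test function f is traded for a multiplication, which is what allows regularity of the induced law to be read off from moment estimates.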
Let
(2.1) L = ½ Σ_{i,j=1}^D a^{ij}(x) ∂²/(∂x^i ∂x^j) + Σ_{i=1}^D b^i(x) ∂/∂x^i
where a: R^D → R^D ⊗ R^D and b: R^D → R^D are smooth functions having bounded derivatives and a(x) is non-negative and symmetric for each x ∈ R^D. Suppose that σ: R^D → R^D ⊗ R^D is a smooth function satisfying a = σσ* and consider the Itô stochastic integral equation
(2.2) X(T,x) = x + ∫_0^T σ(X(t,x)) dθ(t) + ∫_0^T b(X(t,x)) dt ,  T ≥ 0 .
The cornerstone on which much of modern diffusion theory rests is the observation
that the measure P(T,x,*) given by:
(2.4) ∂u/∂t = Lu ,  t > 0
u(0,·) = f .
To be more precise (by an elementary application of Itô's formula), if u is a reasonably smooth solution to (2.4) and u does not grow too fast as |x| → ∞, then
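This probabilistic representation u(T,x) = E[f(X(T,x))] can be sanity-checked by simulation in the simplest assumed special case σ ≡ 1, b ≡ 0, D = 1, where L is half the Laplacian and the heat equation is solvable in closed form:

```python
import numpy as np

# Sanity check of u(T, x) = E[f(X(T, x))] in the simplest assumed case
# sigma = 1, b = 0, D = 1, where L = (1/2) d^2/dx^2 and the heat equation
# gives u(T, x) = x^2 + T for f(y) = y^2.
rng = np.random.default_rng(7)
x0, T, nsteps, npaths = 1.0, 1.0, 50, 100_000
dt = T / nsteps

X = np.full(npaths, x0)
for _ in range(nsteps):          # Euler-Maruyama for dX = d(theta)
    X += np.sqrt(dt) * rng.standard_normal(npaths)

f = lambda y: y**2
estimate = f(X).mean()           # Monte Carlo estimate of u(T, x0)
exact = x0**2 + T
print(estimate, exact)
```

The match, up to Monte Carlo error, is exactly the observation that P(T,x,·) solves the Cauchy problem (2.4).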
progressively measurable functions which are "smooth" (in the sense of Fréchet differentiability). Consider the Itô stochastic integral equation
References
[1] Bismut, J.M., and Michel, D., "Diffusions conditionnelles, I. Hypoellipticité
partielle," J. Fnal. Anal., vol. 44 #2, pp. 174-211 (1981).
[2] Friedman, A., Partial Differential Equations of Parabolic Type, Englewood
Cliffs, N.J., Prentice-Hall (1964).
[3] Holley, R., and Stroock, D., "Diffusions on an infinite dimensional torus,"
J. Fnal. Anal., vol. 42 #1, pp. 29-63 (1981).
[4] Hörmander, L., "Hypoelliptic second order differential equations," Acta Math.
119, pp. 147-171 (1967).
[5] Kusuoka, S., "On absolute continuity of the law of a system of multiple Wiener
integrals," to appear in J. Fac. Sci. Univ. of Tokyo.
[6] Malliavin, P., "Stochastic calculus of variation and hypoelliptic operators,"
Proc. Intern. Symp. on S.D.E.'s, Kyoto, ed. by K. Itô, Kinokuniya, Tokyo
(1978).
[7] McKean, H.P., Stochastic Integrals, Academic Press (1969).
[8] Michel, D., "Régularité des lois conditionnelles en théorie du filtrage non
linéaire et calcul des variations stochastique," J. Fnal. Anal., vol. 41 #1, pp. 8-36
(1981).
[9] Shigekawa, I., "Derivatives of Wiener functionals and absolute continuity of
induced measures," J. Math. Kyoto Univ., 20, pp. 263-289 (1980).
[10] Stroock, D., "The Malliavin calculus and its application to second order
parabolic differential equations, Part I," Math. Systems Th., 14, pp. 25-65
(1981).
S.R.S. Varadhan
Courant Institute of Mathematical Sciences
New York University
New York, NY 10012/USA
$\limsup_{n\to\infty} \frac{1}{n}\log P_n(A) \le -\inf_{x\in A} I(x)$ for closed sets $A$, and

$\liminf_{n\to\infty} \frac{1}{n}\log P_n(G) \ge -\inf_{x\in G} I(x)$ for open sets $G$.

If

$\inf_{x\in A^{\circ}} I(x) = \inf_{x\in \bar{A}} I(x)$

we have

$\lim_{n\to\infty} \frac{1}{n}\log P_n(A) = -\inf_{x\in A} I(x)\,.$
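A concrete instance of these bounds can be computed exactly; the Bernoulli example below is our illustration and is not taken from the text. For averages of i.i.d. fair coin flips the rate function is $I(x) = x\log 2x + (1-x)\log 2(1-x)$, and $\frac{1}{n}\log P_n([a,1])$ can be evaluated from the binomial distribution and compared with $-\inf_{x \ge a} I(x) = -I(a)$.

```python
# Exact evaluation of (1/n) log P_n([a,1]) for Bernoulli(1/2) averages,
# compared against the Cramer rate function (our illustrative example).
import math

def log_binom_tail(n, k0):
    """log P(S_n >= k0) for S_n ~ Binomial(n, 1/2), via log-sum-exp."""
    logs = [math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            - n * math.log(2.0) for k in range(k0, n + 1)]
    m = max(logs)
    return m + math.log(sum(math.exp(v - m) for v in logs))

def rate(a):
    """I(a) = a log 2a + (1-a) log 2(1-a) for fair coin flips."""
    return a * math.log(2 * a) + (1 - a) * math.log(2 * (1 - a))

n, a = 2000, 0.6
lhs = log_binom_tail(n, math.ceil(n * a)) / n   # (1/n) log P_n([a,1])
print(lhs, -rate(a))                            # nearly equal
```

The discrepancy is of order $(\log n)/n$, consistent with the limit statement.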
Theorem.
where $X^{*}$ is the dual of the Banach space $X$. See for instance [2].
$\lim_{\ell\to+\infty} \frac{1}{\ell^{2}} \log \mathrm{Prob}\Big[\sup_{0\le t\le 1} |x(t)| \ge \ell\Big] = -(2\rho)^{-1}$

where

$\rho = \sup_{0\le t\le 1} \rho(t,t)\,.$
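For standard Brownian motion on $[0,1]$ (an illustrative special case chosen here), $\rho(t,t) = t$, so $\rho = 1$ and the limit is $-1/2$. Using the classical reflection-principle asymptotic $\mathrm{Prob}[\sup_{0\le t\le 1}|x(t)| \ge \ell] \sim 4(1-\Phi(\ell))$ as $\ell \to \infty$ (a standard fact, assumed here), the convergence can be observed numerically.

```python
# Numerical check of the Gaussian sup-tail limit for Brownian motion,
# via the reflection-principle asymptotic 4(1 - Phi(l)) = 2 erfc(l/sqrt 2).
import math

def log_sup_tail(ell):
    """log of 4*(1 - Phi(ell)) = 2*erfc(ell/sqrt(2)), computed stably."""
    return math.log(2.0) + math.log(math.erfc(ell / math.sqrt(2.0)))

for ell in (5.0, 10.0, 30.0):
    print(ell, log_sup_tail(ell) / ell**2)   # tends to -0.5 as ell grows
```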
x(0) = x
$P_n(A) = \mathrm{Prob}\Big[\frac{X_1 + \cdots + X_n}{n} \in A\Big]$

Then again the large deviation results hold with $I(\cdot)$ defined for $\mu \in M_Y$ by
Therefore

$I_1(a) = \inf_{\{\mu\,:\,\int V(y)\,\mu(dy) = a\}} I_7(\mu)\,.$
result

(i) Either $Y$ is compact or the process is "strongly" positive recurrent.
(ii) $\pi$ has the Feller property.
(iii) There is a reference measure with respect to which $\pi(x,dy)$ has an almost everywhere positive density $\pi(x,y)$ for each $x \in Y$.
$I(\mu) = -\inf_{\substack{u>0\\ u\in\mathcal{D}(L)}} \int \Big(\frac{Lu}{u}\Big)(x)\,\mu(dx)$

In particular, if

$L = \tfrac{1}{2}\,\nabla\cdot a\nabla \quad \text{on } R^d$

then

$I(\mu) = \frac{1}{8}\int \frac{\langle a\nabla f,\nabla f\rangle}{f}\,dx = \frac{1}{2}\int \big\langle a\nabla\sqrt{f},\,\nabla\sqrt{f}\big\rangle\,dx$

where $d\mu = f(x)\,dx$.
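The equality of the two expressions for $I(\mu)$, and the variational formula they come from, can be checked symbolically in a simple case. The choices below are ours, for illustration: $d = 1$, $a = $ identity, and $f$ the standard Gaussian density; for this symmetric $L$ the infimum is attained at $u = \sqrt{f}$, which we use to evaluate the variational expression.

```python
# Symbolic check that (1/8) int (f')^2/f dx, (1/2) int ((sqrt f)')^2 dx,
# and the variational expression at u = sqrt(f) all agree (our Gaussian
# example; L = (1/2) d^2/dx^2).
import sympy as sp

x = sp.symbols('x', real=True)
f = sp.exp(-x**2/2) / sp.sqrt(2*sp.pi)   # standard Gaussian density
rf = sp.sqrt(f)

expr1 = sp.Rational(1, 8) * sp.integrate(sp.simplify(sp.diff(f, x)**2 / f),
                                         (x, -sp.oo, sp.oo))
expr2 = sp.Rational(1, 2) * sp.integrate(sp.simplify(sp.diff(rf, x)**2),
                                         (x, -sp.oo, sp.oo))

# variational expression  -int (Lu/u) dmu  evaluated at u = sqrt(f)
Lu_over_u = sp.simplify(sp.diff(rf, x, 2) / (2 * rf))
expr3 = -sp.integrate(Lu_over_u * f, (x, -sp.oo, sp.oo))
print(expr1, expr2, expr3)               # all equal 1/8
```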
$= (\ldots, x_1,\ldots,x_n,\, x_1,\ldots,x_n,\, x_1,\ldots,x_n, \ldots)$

$Q_n \Rightarrow \delta_{P_0}$

Note that $P_0 \in X$ is a point of $X$. This is just the ergodic theorem.
Again the large deviation results hold in this context and

$I_{12}(P) = E^{P}\big\{I_7\big(P_\omega,\,\pi(X_0,\cdot)\big)\big\}$

where

$P_\omega = P\big[X_1 \in \cdot \mid \ldots, X_{-1}, X_0\big]$

$I_{12}(P) = I_{11}(P)$

$h(T,P) = E^{P}\big[I_7\big(P^{T}_\omega,\, R^{T}(0)\big)\big]$

$h(T,P) = T\,I_{13}(P)\,,$

and the large deviation results hold with this $I(\cdot)$ function. There is
again a contraction principle connecting $I_{13}$ and $I_9$ identical to the one
connecting $I_8$ and $I_{12}$.
4. Applications.
for $0 < \alpha < 1$. Here $\ell(t,x)$ is the local time at $x$ for the standard
Brownian motion.

By Brownian scaling

$G(t) = E\,\exp\Big\{-t^{(1+\alpha)/2}\int \ell(1,y)^{\alpha}\,dy\Big\}$

and

$\lim_{t\to\infty} \frac{1}{t^{(1+\alpha)/(3-\alpha)}}\,\log G(t) = -\inf_{\substack{f\ge 0\\ \int f\,dx = 1}}\Big[\int f^{\alpha}\,dx + \frac{1}{8}\int \frac{(f')^{2}}{f}\,dx\Big]\,.$
2. Consider

$G(\varepsilon) = \lim_{t\to\infty} \frac{1}{t}\,\log E\,\exp\Big\{\varepsilon\int_0^t\!\!\int_0^t \frac{e^{-|\sigma-s|}}{|\beta(\sigma)-\beta(s)|}\,d\sigma\,ds\Big\}$

where $\beta(\cdot)$ is 3-dimensional Brownian motion.
We will show that

$\int_0^t\!\!\int_0^t \frac{e^{-|\sigma-s|}}{|\beta(\sigma)-\beta(s)|}\,d\sigma\,ds = 2\int_0^t ds \int_s^t \frac{e^{-(\sigma-s)}}{|\beta(\sigma)-\beta(s)|}\,d\sigma \approx 2\int_0^t ds \int_s^{\infty} \frac{e^{-(\sigma-s)}}{|\beta(\sigma)-\beta(s)|}\,d\sigma \quad \text{as } t\to\infty$

$= \int_0^t F(\xi_s)\,ds$

where

$F(\xi) = 2\int_0^{\infty} \frac{e^{-s}}{|\xi(s)-\xi(0)|}\,ds\,.$
5. Counterexample.

$L = \frac{1}{2}\frac{d^2}{dx^2} + \frac{d}{dx}$

$I_9(\mu) = \frac{1}{8}\int \frac{(f')^{2}}{f}\,dx + \frac{1}{2}$

$\lim_{n\to\infty} \frac{1}{n}\log 1 = 0 > -\frac{1}{2} \ge -\inf I_9(\mu)\,,$

so the large deviation upper bound fails.
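The value of $I_9$ can be cross-checked symbolically; the computation below is ours, for illustration. For $L = \frac{1}{2}\frac{d^2}{dx^2} + \frac{d}{dx}$, evaluating $\int (Lu/u)\,d\mu$ at $u = \sqrt{f}\,e^{-x}$ (which minimizes the expression for this drifted $L$, by the same completion-of-squares argument as in the symmetric case) should reproduce $-\big[\frac{1}{8}\int (f')^2/f\,dx + \frac{1}{2}\big]$. We take $f$ to be the standard Gaussian density so that both quantities are exactly computable.

```python
# Cross-check of I9 for L = (1/2) d^2/dx^2 + d/dx at a Gaussian density
# (our choice of f and of the minimizing u, for illustration).
import sympy as sp

x = sp.symbols('x', real=True)
f = sp.exp(-x**2/2) / sp.sqrt(2*sp.pi)   # standard Gaussian density
u = sp.sqrt(f) * sp.exp(-x)              # candidate minimizer for this L

Lu_over_u = sp.simplify((sp.diff(u, x, 2)/2 + sp.diff(u, x)) / u)
val = sp.integrate(Lu_over_u * f, (x, -sp.oo, sp.oo))

i9 = sp.Rational(1, 8) * sp.integrate(sp.simplify(sp.diff(f, x)**2 / f),
                                      (x, -sp.oo, sp.oo)) + sp.Rational(1, 2)
print(val, i9)                           # prints -5/8 and 5/8
```

In particular $\inf_\mu I_9(\mu) \ge \tfrac{1}{2} > 0$, which is the source of the failure of the upper bound above.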
References