
1450 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 50, NO. 9, SEPTEMBER 2005

Optimal Selling Rules in a Regime Switching Model

X. Guo and Q. Zhang

Abstract—In this note, we derive optimal selling rules under a regime switching model. The optimal stopping rule is of a threshold type for each state and is derived via a "modified smooth fit." The proof of optimality is based on martingale theory. Numerical examples are reported to demonstrate the dependence of the threshold levels on various parameters and to compare our result with some suboptimal selling rules.

Index Terms—Markov process, optimal stopping, regime switching, smooth fit.

Manuscript received November 3, 2003; revised August 13, 2004 and April 28, 2005. Recommended by Associate Editor D. Li.
X. Guo is with the School of Operations Research and Industrial Engineering, Cornell University, Ithaca, NY 14853 USA.
Q. Zhang is with the Department of Mathematics, University of Georgia, Athens, GA 30602 USA (e-mail: qingz@math.uga.edu).
Digital Object Identifier 10.1109/TAC.2005.854657

I. INTRODUCTION

Consider an investor who holds a share of stock whose price at time t is X(t). The investor's goal is to find the "best" selling time τ that maximizes the discounted expected payoff; i.e., the value function is max_{0≤τ≤∞} E_x[e^{−rτ}(X(τ) − K)], where K is the amount to be paid back when the investor sells the stock, X(0) = x, and r is the discount factor (e.g., due to inflation). Assume also that the investor is not clairvoyant: this is an optimal stopping time problem in which, for each t, the event {τ ≤ t} must be measurable with respect to the "information" available up to time t (characterized by the σ-algebra generated by X(s), 0 ≤ s ≤ t). Clearly, the choice of an optimal stopping rule will be dictated by the underlying model for X(·).

This problem was first studied by McKean [6] in the 1960s. Assuming that X(t) follows a geometric Brownian motion (now well known as the Black–Scholes model), so that dX(t) = μX(t) dt + σX(t) dW(t), and with (X − K) equivalently replaced by (X − K)⁺, he solved the problem explicitly. He showed that if μ > r, then one can patiently "wait and see" and the value function is infinite; if μ < r, then there exists an x* = x*(μ, r, σ, K) such that the optimal stopping time is τ* = inf{t > 0: X(t) ≥ x*}; μ = r is a degenerate case for which the value function is x, the initial value of X(t), with the corresponding τ* = ∞, a.s.

If we assume, more realistically, that the market fluctuates between "bull" and "bear" states and that μ and σ take different values in different market states, then the problem of this investor can be cast in a regime switching model, or the Markov-modulated Brownian motion model. More precisely, we consider the following switching diffusion process:

  dX(t) = X(t)μ_{ε(t)} dt + X(t)σ_{ε(t)} dW(t),   X(0) = x                      (1)

where ε(t) ∈ {1, 2, ..., S} is a finite-state continuous-time Markov chain and W(t) is a standard Wiener process defined on a probability space (Ω, F, P). Here, we assume that W(·) and ε(·) are independent and that μ_i and σ_i are known parameters for any given ε(t) = i.

Note that McKean's solution corresponds to the special case in which the parameters μ and σ, respectively, are identical in the different states. This regime switching model has been studied by many researchers in various contexts; see, for example, [1], [3], and [9]. The work in [3] dealt with perpetual lookback option pricing and solved a related optimal stopping time problem by extending the well-known technique of smooth fit. The work in [9] used a two-point-boundary-value approach and provided a suboptimal selling rule involving target and cut-loss levels, which is optimal in that particular context.

In this note, we exploit the techniques developed in [3] to derive an optimal selling rule for this investor, assuming (1). For explicitness, we assume that S = 2 and that the generator of ε(t) is of the form

  ( −λ1   λ1 )
  (  λ2  −λ2 )

with λ1, λ2 > 0. We derive explicitly a general optimal selling rule and the corresponding value function in closed form. We show that when r < B1 (= {μ1 − λ1 + μ2 − λ2 + √([(μ1 − λ1) − (μ2 − λ2)]² + 4λ1λ2)}/2), it is optimal to wait and get an infinite return; when r > max(μ1, μ2) ≥ B1, the optimal stopping rule is of a threshold type for each state; r = B1 is the degenerate case for which the value is x = X(0) and the waiting time is infinite. (The case of r ∈ (B1, max(μ1, μ2)) remains unsolved.) The proof of optimality is via Dynkin's formula and local martingales. Finally, we numerically illustrate the dependence of our optimal threshold levels on various parameters and the difference between our solution and that in [9].

0018-9286/$20.00 © 2005 IEEE
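The switching diffusion (1) is easy to experiment with numerically. The sketch below simulates one path with a two-state chain using an Euler–Maruyama discretization; all parameter values (drifts, volatilities, intensities, step size) are illustrative assumptions, not taken from the note:

```python
import math
import random

random.seed(7)

# Illustrative two-state parameters (assumed for this sketch only)
mu = {1: 0.10, 2: -0.05}      # drifts mu_1, mu_2
sigma = {1: 0.20, 2: 0.40}    # volatilities sigma_1, sigma_2
lam = {1: 1.0, 2: 2.0}        # jump intensities lambda_1, lambda_2

n, T = 1000, 1.0
dt = T / n
x, state = 1.0, 1
path = [x]
for _ in range(n):
    # two-state chain: leave state i during dt with probability ~ lambda_i * dt
    if random.random() < lam[state] * dt:
        state = 3 - state
    # Euler-Maruyama step for dX = X mu dt + X sigma dW
    dw = random.gauss(0.0, math.sqrt(dt))
    x += x * mu[state] * dt + x * sigma[state] * dw
    path.append(x)

print(f"X(T) = {x:.4f} after {len(path) - 1} steps")
```

The chain is advanced with the first-order approximation P{jump during dt} ≈ λ_i dt, which is adequate when λ_i·dt ≪ 1.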



II. MAIN RESULTS

Consider the following optimal stopping problem with state space (x, i):

  V_i*(x) = sup_{0≤τ≤∞} E[e^{−rτ}(X(τ) − K) | X(0) = x, ε(0) = i]              (2)

where τ is an F_t = σ{(W(s), ε(s)) | s ≤ t}-stopping time. Here, we assume x > 0 and exclude the case x = 0, for which V* = 0 and τ = ∞.

Intuitively, if r is very small, so that e^{rt} < E[X_t], then it does not hurt to wait and the payoff is infinite. This intuition can be formalized. First, note that X(t) in (1) can be written as

  X(t) = X(0) exp( ∫₀ᵗ (μ_{ε(s)} − σ²_{ε(s)}/2) ds + ∫₀ᵗ σ_{ε(s)} dW(s) ).

Lemma 1: When λ_i > 0 (i = 1, 2)

  M_i(t) = E[ exp(∫₀ᵗ μ_{ε(s)} ds) | ε(0) = i ]
         = ((μ_i − B2)/(B1 − B2)) e^{B1 t} + ((B1 − μ_i)/(B1 − B2)) e^{B2 t}.   (3)

If λ_i = 0, then M_i(t) = exp(μ_i t), and, for j ≠ i (j = 1, 2)

  M_j(t) = ( (μ_i − μ_j) exp((μ_j − λ_j)t) + λ_j exp(μ_i t) ) / (μ_i − μ_j + λ_j).

Here, B1 > B2 are the roots of

  x² − (μ1 − λ1 + μ2 − λ2)x + (μ1 − λ1)(μ2 − λ2) − λ1λ2 = 0

such that

  B_{1,2} = ( μ1 − λ1 + μ2 − λ2 ± √([(μ1 − λ1) − (μ2 − λ2)]² + 4λ1λ2) ) / 2.    (4)

Proof: Starting from time 0 and state i, between time 0 and Δt there is either at least one regime shift (RS) of ε(t) from state i to j ≠ i, with probability 1 − e^{−λ_i Δt}, or no RS, in which case ε remains in state i with probability e^{−λ_i Δt} and the process starts afresh. It is clear that

  M_i(t) = E[ exp(∫₀ᵗ μ_{ε(s)} ds)(1(at least one RS) + 1(no RS)) ]
         = P{no RS} exp(μ_i Δt) M_i(t − Δt) + P{at least one RS} M_j(t + ξΔt) + O((Δt)²)
         = exp(−λ_i Δt) exp(μ_i Δt) M_i(t − Δt) + (1 − e^{−λ_i Δt}) M_j(t + ξΔt) + O((Δt)²)

with i ≠ j and |ξ| ≤ 1. By a first-order Taylor expansion of e^x and some algebra, we obtain the corresponding ordinary differential equations (ODEs) with initial conditions for M_i(t):

  λ_i M_j(t) = −(μ_i − λ_i) M_i(t) + M_i′(t)

with M_i(0) = 1, M_i′(0) = μ_i, i ≠ j, i, j = 1, 2. This implies that

  M_i″(t) − (μ_i − λ_i + μ_j − λ_j) M_i′(t) + ((μ_i − λ_i)(μ_j − λ_j) − λ_i λ_j) M_i(t) = 0

with the initial conditions M_i(0) = 1, M_i′(0) = μ_i. Solving this second-order ODE with trial solutions of the form M_i(t) = exp(Bt) gives

  M_i(t) = A_{i1} exp(B1 t) + A_{i2} exp(B2 t)

where the coefficients are uniquely determined by the initial conditions. The degenerate case λ1λ2 = 0 can be solved in a similar fashion. This completes the proof.

A. Case 1: r ≤ B1

In view of the representation of X(t), exp(∫₀ᵗ σ_{ε(s)} dW(s) − (1/2)∫₀ᵗ σ²_{ε(s)} ds) defines a martingale. Moreover, it is clear from Lemma 1 that {e^{−rt} X_t}_{t≥0} is a submartingale if B1 > r and a supermartingale if B1 < r. Furthermore

  lim_{t→∞} E[e^{−rt} X_t] = lim_{t→∞} e^{(B1 − r)t} X(0)
                           = { ∞,  if B1 > r
                             { x,  if B1 = r                                    (5)
                             { 0,  if B1 < r

and, for T < ∞ and any stopping time τ,

  E[ e^{−r(τ∧T)} X_{τ∧T} − e^{−r(τ∧T)} K ] ≤ E[ e^{−r(τ∧T)} X_{τ∧T} ].          (6)

Therefore, it is clear that if B1 > r, then τ* = ∞ and the value function is infinite. When B1 = r, τ* = ∞ and the value function is x, following from the optional sampling theorem.

Theorem 1: For λ1 > 0, λ2 > 0, V_i*(x) < ∞ if and only if r ≥ B1. In particular, when r = B1, V_i*(x) = x.

Remark 1: Note that the necessary and sufficient condition of Theorem 1 holds only when λ1λ2 ≠ 0. When λ1λ2 = 0, it is possible that min(V1(x), V2(x)) < ∞ = max(V1(x), V2(x)). One can easily construct an example with λ1 = 0 and r ∈ (μ1, μ2 − λ2).

B. Case 2: r > max(μ1, μ2) ≥ B1

In this case, the value function is finite and is determined by the optimal stopping time τ. If we focus our attention on threshold-type stopping rules (whose optimality is to be verified), then it is reasonable to anticipate that the thresholds should differ according to the state of ε.

Denoting by x_i the threshold for state i, we first provide an intuitive derivation of the x_i and then prove their optimality via martingale theory.

1) Case x1 < x2: We first consider the case x1 < x2. Suppose that, at time 0, ε = i. The very definition of x_i implies that if x > x_i, then one should stop if the chain is in state i and, therefore, V_i(x) = x − K; otherwise, one should hold the stock. Moreover, between (0, t), ε may change to state j with probability λ_i t, for which the value function is V_j(X(t)); with probability 1 − λ_i t, ε remains at i, for which the value function is V_i(X(t)). That is

  V_i(x) ≥ e^{−rt} E{ λ_i t V_j(X(t)) + (1 − λ_i t) V_i(X(t)) }.

This inequality becomes an equality if the strategy is optimal.
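The closed form in Lemma 1 can be sanity-checked numerically. The sketch below uses the parameter values from the numerical section (μ1 = 2, μ2 = 1, λ1 = λ2 = 5) and verifies the initial conditions M_i(0) = 1 and M_i′(0) = μ_i, as well as the coupling λ_i M_j(t) = M_i′(t) − (μ_i − λ_i)M_i(t), by central finite differences:

```python
import math

# Parameter values from the numerical section: mu1 = 2, mu2 = 1, lam1 = lam2 = 5
mu = [2.0, 1.0]
lam = [5.0, 5.0]
b = [mu[0] - lam[0], mu[1] - lam[1]]       # b_i = mu_i - lambda_i

# B1 > B2 are the roots of x^2 - (b_1 + b_2)x + (b_1 b_2 - lam1 lam2) = 0, cf. (4)
s = b[0] + b[1]
p = b[0] * b[1] - lam[0] * lam[1]
disc = math.sqrt(s * s - 4 * p)
B1, B2 = (s + disc) / 2, (s - disc) / 2

def M(i, t):
    # Closed form (3): M_i(t) = E[exp(int_0^t mu_eps ds) | eps(0) = i]
    return ((mu[i] - B2) * math.exp(B1 * t) + (B1 - mu[i]) * math.exp(B2 * t)) / (B1 - B2)

h = 1e-6
for i in (0, 1):
    j = 1 - i
    assert abs(M(i, 0.0) - 1.0) < 1e-12                  # M_i(0) = 1
    dMi0 = (M(i, h) - M(i, -h)) / (2 * h)                # central difference at t = 0
    assert abs(dMi0 - mu[i]) < 1e-6                      # M_i'(0) = mu_i
    t = 0.7
    dMi = (M(i, t + h) - M(i, t - h)) / (2 * h)
    # coupling: lam_i M_j(t) = M_i'(t) - (mu_i - lam_i) M_i(t)
    assert abs(lam[i] * M(j, t) - (dMi - b[i] * M(i, t))) < 1e-5

print("B1 =", round(B1, 4), " B2 =", round(B2, 4))
```

The assertions pass because the exponents B1, B2 are exactly the roots of the quadratic in (4), so the trial-solution form solves the coupled first-order system.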

Now, assuming that the value functions are sufficiently smooth, Itô's rule yields, for x ∈ [x1, x2],

  (r + λ2) V2(x) = x μ2 V2′(x) + (1/2) x² σ2² V2″(x) + λ2 (x − K)
  V1(x) = x − K                                                                 (7)

for x ∈ [0, x1),

  (r + λ1) V1(x) = x μ1 V1′(x) + (1/2) x² σ1² V1″(x) + λ1 V2(x)
  (r + λ2) V2(x) = x μ2 V2′(x) + (1/2) x² σ2² V2″(x) + λ2 V1(x)                 (8)

and, for x ∈ [x2, ∞), V1(x) = V2(x) = x − K.

Solving for V2 as a function of V1 from the first equation in (8) and substituting it into the second equation leads to a fourth-order differential equation with constant coefficients after a logarithmic transformation. In view of this, the characteristic equation associated with (8) is

  g1(β) g2(β) = λ1 λ2                                                           (9)

where g_i(β) = λ_i + r − (μ_i − (1/2)σ_i²)β − (1/2)σ_i² β², (i = 1, 2). It is not difficult to see that this equation has four distinct roots β1 > β2 > 0 > β3 > β4. Therefore, the general form of the solution to (8) is

  V1(x) = Σ_{i=1}^{4} A_i x^{β_i}
  V2(x) = Σ_{i=1}^{4} l_i A_i x^{β_i}                                           (10)

with l_i = l(β_i) = g1(β_i)/λ1 = λ2/g2(β_i), for i = 1, 2, 3, 4. Note that V1(0) = V2(0) = 0, implying that the negative powers of x should be eliminated, so that

  V1(x) = A1 x^{β1} + A2 x^{β2}
  V2(x) = l1 A1 x^{β1} + l2 A2 x^{β2}.                                          (11)

Next, converting

  (r + λ2) V(x) = x μ2 V′(x) + (1/2) x² σ2² V″(x) + λ2 (x − K)                  (12)

to a linear ODE via the change of variables x = e^y leads to

  V2(x) = C1 x^{γ1} + C2 x^{γ2} + η(x)                                          (13)

where η(x) is a special solution to (12) and γ_i (i = 1, 2) are the real roots of

  μ2 γ + (1/2) σ2² γ(γ − 1) = r + λ2.                                           (14)

In particular, if r + λ2 − μ2 ≠ 0, one can take

  η(x) = λ2 x/(r + λ2 − μ2) − λ2 K/(r + λ2).                                    (15)

In order to determine V1(x) and V2(x) uniquely, we must solve for A1, A2, C1, C2, x1, and x2. To this end, we apply the smooth fit principle. Smooth fit along the boundaries (i.e., at x = x1 and x = x2) yields V1(x1+) = V1(x1−) and V1′(x1+) = V1′(x1−), so that

  A1 x1^{β1} + A2 x1^{β2} = x1 − K
  β1 A1 x1^{β1} + β2 A2 x1^{β2} = x1.                                           (16)

Smoothness of V2(x) at x1 and x2 (i.e., the "modified smooth fit," as in [3]) suggests

  l1 A1 x1^{β1} + l2 A2 x1^{β2} = C1 x1^{γ1} + C2 x1^{γ2} + η(x1)
  l1 β1 A1 x1^{β1} + l2 β2 A2 x1^{β2} = γ1 C1 x1^{γ1} + γ2 C2 x1^{γ2} + x1 η′(x1)   (17)

and

  C1 x2^{γ1} + C2 x2^{γ2} + η(x2) = x2 − K
  γ1 C1 x2^{γ1} + γ2 C2 x2^{γ2} + x2 η′(x2) = x2.                               (18)

Combining these equations with some algebraic manipulation, we obtain

  ( x1^{−γ1}  0        )                    ( x2^{−γ1}  0        )
  ( 0         x1^{−γ2} ) F1(x1, η(x1))  =  ( 0         x2^{−γ2} ) F2(x2, η(x2))    (19)

where

  F1(x, g(x)) = ( 1   1  )⁻¹ ( ( l1     l2    ) ( 1   1  )⁻¹ ( x − K )   ( g(x)    ) )
                ( γ1  γ2 )    ( ( l1β1  l2β2  ) ( β1  β2 )    ( x     ) − ( xg′(x) ) )   (20)

and

  F2(x, g(x)) = ( 1   1  )⁻¹ ( x − K − g(x) )
                ( γ1  γ2 )    ( x − xg′(x)   ).                                  (21)

2) Case x1 > x2: The analysis is identical to the case x1 < x2, with the corresponding notational changes. We denote by η̃(x) a special solution to

  (r + λ1) V(x) = x μ1 V′(x) + (1/2) x² σ1² V″(x) + λ1 (x − K)                  (22)

and by γ̃_i (i = 1, 2) the real roots of

  μ1 γ + (1/2) σ1² γ(γ − 1) = r + λ1.                                           (23)

Accordingly, F1 and F2 are replaced, respectively, by

  F̃1(x, g(x)) = ( 1    1   )⁻¹ ( ( l̃1     l̃2    ) ( 1   1  )⁻¹ ( x − K )   ( g(x)    ) )
                ( γ̃1   γ̃2  )    ( ( l̃1β1  l̃2β2  ) ( β1  β2 )    ( x     ) − ( xg′(x) ) )   (24)

with l̃_i = 1/l_i, and

  F̃2(x, g(x)) = ( 1    1   )⁻¹ ( x − K − g(x) )
                ( γ̃1   γ̃2  )    ( x − xg′(x)   ).                               (25)

3) Case x1 = x2 = x3: It is not difficult to show that the value function reduces to that of McKean's problem in this case. Recall that

  V1(x) = A1 x^{β1} + A2 x^{β2}
  V2(x) = l1 A1 x^{β1} + l2 A2 x^{β2}

for x ∈ [0, x3] and V1(x) = V2(x) = x − K for x ≥ x3. The smooth fit conditions lead to

  A1 (x3)^{β1} + A2 (x3)^{β2} = x3 − K
  β1 A1 (x3)^{β1} + β2 A2 (x3)^{β2} = x3                                        (26)

and

  l1 A1 (x3)^{β1} + l2 A2 (x3)^{β2} = x3 − K
  l1 β1 A1 (x3)^{β1} + l2 β2 A2 (x3)^{β2} = x3.                                 (27)

Necessarily, we have A1 = l1 A1 and A2 = l2 A2, meaning that V1 = V2 and, from (8), μ1 = μ2 and σ1 = σ2. The value function can then be easily rederived via smooth fit; it is exactly the solution of McKean.

We summarize our results as follows. For ease of exposition, we use Y1, Y2 to denote column vectors and define

  H(x1, x2, Y1, Y2) = ( x1^{−γ1}  0        )      ( x2^{−γ1}  0        )
                      ( 0         x1^{−γ2} ) Y1 − ( 0         x2^{−γ2} ) Y2      (28)

and

  H̃(x1, x2, Y1, Y2) = ( x1^{−γ̃1}  0        )      ( x2^{−γ̃1}  0        )
                      ( 0         x1^{−γ̃2} ) Y1 − ( 0         x2^{−γ̃2} ) Y2.    (29)

Theorem 2: Assume r > max(μ1, μ2), and μ1 ≠ μ2 or σ1 ≠ σ2.

Case 1: If there exists x1 < x2 such that H(x1, x2, F1(x1, η(x1)), F2(x2, η(x2))) = 0 and x2 ≥ rK/(r − max(μ1, μ2)), then V_i*(x) = V_i(x), provided V_i(x) ≥ x − K, with

  V1(x) = { A1 x^{β1} + A2 x^{β2},          if x < x1
          { x − K,                          if x ≥ x1

  V2(x) = { l1 A1 x^{β1} + l2 A2 x^{β2},    if x < x1
          { C1 x^{γ1} + C2 x^{γ2} + η(x),   if x1 ≤ x < x2                      (30)
          { x − K,                          if x ≥ x2

where β1 > β2 > 0 are the roots of (9) and

  ( A1 )   ( x1^{β1}     x1^{β2}    )⁻¹ ( x1 − K )
  ( A2 ) = ( β1 x1^{β1}  β2 x1^{β2} )   ( x1     )

  ( C1 )   ( x2^{γ1}     x2^{γ2}    )⁻¹ ( x2 − K − η(x2)  )
  ( C2 ) = ( γ1 x2^{γ1}  γ2 x2^{γ2} )   ( x2 − x2 η′(x2)  ).

Case 2: If there exists x1 > x2 such that H̃(x1, x2, F̃2(x1, η̃(x1)), F̃1(x2, η̃(x2))) = 0 and x1 ≥ rK/(r − max(μ1, μ2)), then V_i*(x) = V_i(x), provided V_i(x) ≥ x − K, with

  V1(x) = { Ã1 x^{β1} + Ã2 x^{β2},            if x < x2
          { C̃1 x^{γ̃1} + C̃2 x^{γ̃2} + η̃(x),    if x2 ≤ x < x1
          { x − K,                            if x ≥ x1

  V2(x) = { l1 Ã1 x^{β1} + l2 Ã2 x^{β2},      if x < x2                         (31)
          { x − K,                            if x ≥ x2

where

  ( Ã1 )   ( l1 x2^{β1}     l2 x2^{β2}    )⁻¹ ( x2 − K )
  ( Ã2 ) = ( l1 β1 x2^{β1}  l2 β2 x2^{β2} )   ( x2     )

  ( C̃1 )   ( x1^{γ̃1}      x1^{γ̃2}     )⁻¹ ( x1 − K − η̃(x1)  )
  ( C̃2 ) = ( γ̃1 x1^{γ̃1}   γ̃2 x1^{γ̃2}  )   ( x1 − x1 η̃′(x1)  ).

Furthermore, let D = {(x, i) | V_i(x) > x − K}. The optimal stopping rule for both cases is given by τ* = inf{t > 0 | (X(t), ε(t)) ∉ D}.

In particular, if μ1 = μ2 and σ1 = σ2, then x1 = x2 = x3 and

  V*(x) = V1(x) = V2(x) = { (x3)^{1−β̄} x^{β̄}/β̄,   if x < x3
                          { x − K,                 if x ≥ x3

where β̄ > 0 satisfies r − (μ_i − (1/2)σ_i²)β̄ − (1/2)σ_i² β̄² = 0 and x3 = β̄K/(β̄ − 1) (> rK/(r − μ_i)); here γ_i, γ̃_i, η, and η̃ are given by (14), (23), (15), and (22), respectively.

One can prove the optimality of the value function by modifying the verification theorem argument in [3]. Here, we adopt a slightly different approach.

Proof of Theorem 2: It is easy to see that V_i(∞) = ∞, i = 1, 2, and

  D = {(x, 1) | x ∈ (0, x1)} ∪ {(x, 2) | x ∈ (0, x2)}.

For any v(x, i) ∈ C², define

  Lv(x, i) = x μ_i ∂v(x, i)/∂x + (1/2) x² σ_i² ∂²v(x, i)/∂x² + λ_i (v(x, 3 − i) − v(x, i)) − r v(x, i).

Take v(x, i) = V_i(x). Then it is easy to check that the following dynamic programming principle holds:

  max(Lv, x − K − v) = 0.

Note that V_i(x) is C¹ but not necessarily C² at x1; this is remedied by smooth approximation. Let

  ζ0(x) = { e^{−1/(1−x²)},  if |x| < 1
          { 0,              if |x| ≥ 1.

Define ζ(x) = ζ0(x)/∫_{−1}^{1} ζ0(x) dx. Then it is easy to check that ζ(x) ∈ C0^∞ and ∫_{−1}^{1} ζ(x) dx = 1. Denote by ζ_ε(x) the kernel function ζ_ε(x) = (1/ε)ζ(x/ε) and let v^ε(x, i) = (ζ_ε ∗ v)(x, i), with v(x, i) = V_i(x), be the corresponding convolution. Then v^ε(x, i) ∈ C^∞, ε > 0. Moreover, as ε → 0, we can show, as in Øksendal [7, Th. D.1], that: 1) v^ε → v uniformly on compact sets in R; 2) Lv^ε → Lv uniformly on compact sets in (∂D)^c; and 3) {Lv^ε} is locally bounded on R. With these properties, one can show, as in [7], that for any stopping time τ

  v(x, i) ≥ E[ e^{−r(t∧τ)} v(X(t∧τ), ε(t∧τ)) ] ≥ E[ e^{−r(t∧τ)} (X(t∧τ) − K) ].   (32)

Therefore,

  v(x, i) ≥ E[ e^{−rτ} v(X(τ), ε(τ)) ] ≥ E[ e^{−rτ} (X(τ) − K) ]                 (33)

from the uniform integrability of e^{−rt} v(X(t), ε(t)). To show the optimality of τ*, note that if τ* < ∞, a.s., then v(X(τ*), ε(τ*)) = X(τ*) − K. In this case, Dynkin's formula yields v(x, i) = E[e^{−rτ*}(X(τ*) − K)]. Otherwise, let D_k = D ∩ {(x, i): 1/k < x < k, i = 1, 2}, for k = 1, 2, ..., and let τ_k = inf{t ≥ 0 | (X(t), ε(t)) ∉ D_k}. Then clearly τ_k → τ* a.s. Moreover, as in [9, Ths. 4.5 and 4.6], we see that, for each k, τ_k < ∞ a.s. Next, by the definition of τ_k, we have v(X(τ_k), ε(τ_k)) = v(X(τ_k), ε(τ_k)) I_{X(τ_k)=k} + v(X(τ_k), ε(τ_k)) I_{X(τ_k)<k}. Note that v(X(τ_k), ε(τ_k)) I_{X(τ_k)<k} = (X(τ_k) − K) I_{X(τ_k)<k} ≤ X(τ_k) − K. Note also that e^{−rτ_k} I_{X(τ_k)=k} → 0, as k → ∞, a.s. It follows that E[e^{−rτ_k} v(X(τ_k), ε(τ_k)) I_{X(τ_k)=k}] → 0. Therefore, we have v(x, i) = E[e^{−rτ_k} v(X(τ_k), ε(τ_k))] → E[e^{−rτ*}(X(τ*) − K)] as

k → ∞. This, combined with (33), establishes the optimality of the value function; i.e., v(x, i) = E[e^{−rτ*}(X(τ*) − K)].

Remark 2: In order to apply Theorem 2, one needs to check whether the corresponding algebraic equations H = 0 or H̃ = 0 admit solutions. This is easy to verify numerically, as shown in Section III.

Remark 3: It is nontrivial to prove a priori the modified smooth fit principle (and, hence, the regularity of the value function) for the regime switching model. This question is beyond the scope of this note.

Remark 4: The case of r ∈ (B1, max(μ1, μ2)) remains open, and it is not clear to us whether the value function is finite in this case.

III. NUMERICAL SIMULATION

As mentioned earlier, in [9] a two-point boundary value differential equation (TPBVDE) approach was exploited to derive an "optimal" selling rule within the class of threshold-type stopping rules. This, however, is a suboptimal rule for our problem, since the feasible solution was chosen from a constrained and smaller set of stopping rules. In this section, we report numerical experiments comparing our analytical solutions with the TPBVDE solutions.

First, we take r = 3, μ1 = 2, μ2 = 1, K = 1, λ1 = λ2 = 5, σ1 = 4, σ2 = 2, and examine our closed-form solutions and those obtained by the TPBVDE method. (These numbers are chosen to demonstrate the threshold levels; for related examples with real market data, see [8].) In this case, the threshold levels are given by (x1, x2) = (9.97, 3.88). In Fig. 1, V^e(x, i) and V^{2ptbd}(x, i) denote the value functions from our optimal stopping rule and from the TPBVDE approach, respectively. The difference between V^e(x, i) and (x − K) in Fig. 1 validates the basic structure of the optimal stopping policy in terms of the threshold levels (x1, x2).

Fig. 1. Value functions and comparisons.

Next, we examine the monotonicity of these threshold levels with respect to μ1, λ1, and K.

We first vary μ1 and keep all other parameters fixed. The resulting (x1, x2) are listed in Table I. Both threshold levels x1 and x2 increase with μ1: a larger μ1 leads to a higher expected reward and, therefore, to higher threshold levels.

TABLE I
DEPENDENCY ON μ1

We then vary λ1. The results in Table II imply that both x1 and x2 decrease as λ1 increases: a larger λ1 leads to a shorter period for ε(t) to stay at ε(t) = 1 and a smaller weight on σ1 = 4 (> σ2 = 2), which leads to a smaller average volatility.

TABLE II
DEPENDENCY ON λ1

We finally vary K. Table III suggests that both x1 and x2 increase in K: a larger K implies a higher transaction cost, which in turn must be compensated by a higher sample-wise return level.

TABLE III
DEPENDENCY ON K
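The hypotheses of Theorem 2 can be checked numerically for the parameter set above. The sketch below expands the characteristic equation (9) into a quartic, confirms the root pattern β1 > β2 > 0 > β3 > β4, and evaluates B1 from (4) together with the bound rK/(r − max(μ1, μ2)); numpy is assumed to be available:

```python
import numpy as np

# Parameter set from the numerical experiments
r, K = 3.0, 1.0
mu1, mu2 = 2.0, 1.0
lam1, lam2 = 5.0, 5.0
sig1, sig2 = 4.0, 2.0

# g_i(beta) = lam_i + r - (mu_i - sig_i^2/2) beta - (sig_i^2/2) beta^2,
# written highest-degree-first for numpy's polynomial routines
g1 = [-sig1**2 / 2, -(mu1 - sig1**2 / 2), lam1 + r]
g2 = [-sig2**2 / 2, -(mu2 - sig2**2 / 2), lam2 + r]

# Characteristic equation (9): g1(beta) * g2(beta) - lam1 * lam2 = 0
quartic = np.polymul(g1, g2)
quartic[-1] -= lam1 * lam2
betas = np.sort(np.roots(quartic).real)[::-1]    # expect beta1 > beta2 > 0 > beta3 > beta4

# B1 from (4); Theorem 2 operates under r > max(mu1, mu2) >= B1
b1, b2 = mu1 - lam1, mu2 - lam2
B1 = (b1 + b2 + np.sqrt((b1 - b2) ** 2 + 4 * lam1 * lam2)) / 2

# Lower bound rK/(r - max(mu1, mu2)) appearing in Theorem 2
bound = r * K / (r - max(mu1, mu2))

print("betas =", np.round(betas, 4), " B1 =", round(B1, 4), " bound =", bound)
```

For these values, B1 ≈ 1.52 < r = 3 and the bound equals 3, consistent with the reported threshold x1 = 9.97 ≥ 3 in Case 2 of Theorem 2.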

ACKNOWLEDGMENT

The authors thank the referees for their very careful reading of the manuscript and for their many insightful and constructive suggestions, which led to a significant improvement of the manuscript.

REFERENCES

[1] G. B. Di Masi, Y. M. Kabanov, and W. J. Runggaldier, "Mean-variance hedging of options on stocks with Markov volatility," Theory Probab. Appl., vol. 39, pp. 173–181, 1994.
[2] X. Guo, "Inside information and option pricings," Quant. Finance, vol. 1, pp. 38–44, 2001.
[3] X. Guo, "An explicit solution to an optimal stopping problem with regime switching," J. Appl. Probab., vol. 38, pp. 464–481, 2001.
[4] I. Karatzas, "On the pricing of American options," Appl. Math. Optim., vol. 17, pp. 37–60, 1988.
[5] S. D. Jacka, "Optimal stopping and the American put," Math. Finance, vol. 1, pp. 1–14, 1991.
[6] H. P. McKean, "A free boundary problem for the heat equation arising from a problem in mathematical economics," Indust. Manage. Rev., vol. 6, pp. 32–39, 1965.
[7] B. Øksendal, Stochastic Differential Equations, 4th ed. New York: Springer-Verlag, 1995.
[8] G. Yin, R. H. Liu, and Q. Zhang, "Recursive algorithms for stock liquidation: A stochastic optimization approach," SIAM J. Optim., vol. 13, pp. 240–263, 2002.
[9] Q. Zhang, "Stock trading: An optimal selling rule," SIAM J. Control Optim., vol. 40, pp. 67–84, 2001.


Iterative Learning Control for Systems With Input Deadzone

Jian-Xin Xu, Jing Xu, and Tong Heng Lee

Abstract—Most iterative learning control (ILC) schemes proposed hitherto were designed and analyzed without taking the input deadzone into account. Input deadzone is a kind of nonsmooth and nonaffine-in-input factor widely existing in actuators or mechatronic devices. It gives rise to extra difficulty due to the presence of singularity in the input channels. In this note, we disclose that the ILC methodology remains effective for systems with an input deadzone that could be nonlinear, unknown, and state-dependent. Through rigorous proof, it is shown that, despite the presence of the input deadzone, the simplest ILC scheme retains its ability to achieve satisfactory performance.

Index Terms—Convergence analysis, input deadzone, iterative learning control, nonlinear dynamics.

Manuscript received July 7, 2004; revised December 20, 2004 and May 9, 2005. Recommended by Associate Editor Z.-P. Jiang.
The authors are with the Department of Electrical and Computer Engineering, National University of Singapore, Singapore 117576, Singapore (e-mail: elexujx@nus.edu.sg).
Digital Object Identifier 10.1109/TAC.2005.854658

NOMENCLATURE

R^n denotes the Euclidean space of dimension n; R denotes the set of real numbers; Z+ denotes the set of nonnegative integers; |g| denotes the absolute value of g; ||g|| denotes the Euclidean norm of g; K ≜ {0, 1, ..., N}, where N is a finite integer; k ∈ K denotes the time instance; i ∈ Z+ denotes the iteration number; l_g denotes the Lipschitz constant of g(x, k); c̄ ≜ |c|; b̄ ≜ max_{x,k} |b(x, k)|; ū ≜ max_k |u_d(k)|; v̄ ≜ max_k |v_d(k)|; r̄ ≜ max_{x,k} |r(x, k)|; m̄ ≜ max_{x,k} {m_l(x, k), m_r(x, k)}. In all notations, a quantity associated with the subscripts l, D, r, i, and d belongs, respectively, to the left region of the deadzone, the deadzone, the right region of the deadzone, the real system at the i-th iteration, and the desired system. Let ⋆ ∈ {i, d}; Δ_{l,⋆}(k) ≜ Δ_l(x_⋆, k) and Δ_{r,⋆}(k) ≜ Δ_r(x_⋆, k) are the left and right boundary points of a deadzone; m_{l,⋆}(k) ≜ m_l(x_⋆, k) and m_{r,⋆}(k) ≜ m_r(x_⋆, k) are the left and right slopes of a deadzone; I_{D,⋆} ≜ [Δ_{l,⋆}(k), Δ_{r,⋆}(k)] is the deadzone; I_{l,⋆} ≜ (−∞, Δ_{l,⋆}(k)] is the left region of the deadzone; I_{r,⋆} ≜ [Δ_{r,⋆}(k), +∞) is the right region of the deadzone; θ_{l,⋆} ≜ 1 − cb(x_⋆, k)m_{l,⋆}(k); θ_{r,⋆} ≜ 1 − cb(x_⋆, k)m_{r,⋆}(k); f_i(k) ≜ f(x_i, k); and b_i(k) ≜ b(x_i, k).

I. INTRODUCTION

In many industrial processes, a task is often repeated over a finite operation cycle, and perfect tracking is required from the very beginning. Iterative learning control (ILC) has been proposed to deal with this class of control problems. However, most ILC schemes proposed hitherto were designed and analyzed without taking the input deadzone into account. Input deadzone is a kind of nonsmooth and nonaffine-in-input factor widely existing in actuators or mechatronic devices. It gives rise to extra difficulty in control due to the presence of singularity in the input channels. The input deadzone problem has been studied by many researchers. Some useful techniques for overcoming deadzone are variable structure control and dithering. Motivated by the pursuit of better control performance, several adaptive inverse approaches were proposed [1]. Recently, soft computing, such as fuzzy logic and neural-network-based control algorithms, has also been applied to handle problems relevant to deadzones [2], [3]. In these works, the input deadzone was assumed to be independent of the system operating conditions. Such an assumption does not hold when we deal with a control process with high precision requirements.

In this work, we disclose a finding that ILC schemes [4], [5], originally designed for systems without input deadzone, can effectively compensate the nonlinear deadzone through control repetitions. In a rigorous mathematical manner, we prove that, despite the presence of the input deadzone, the simplest ILC scheme retains its ability to achieve satisfactory performance in tracking control. The nonlinear state-dependent deadzone presents a new and challenging problem to control theory in general, and to ILC convergence analysis in particular. To address this issue, the learning convergence is derived in two phases. First, the learning convergence to the desired regions is derived: in this phase, we prove that the actual control input sequence will enter the correct region after a finite number of iterations and then stay there forever. In the second phase, we use the mathematical induction principle to prove the uniform convergence of the learning control input sequence.

This note is organized as follows. The nomenclature of this note is listed first. In Section II, the control problem is formulated. The regional learning convergence is proven in Section III, and the asymptotic tracking convergence is analyzed in Section IV.

II. PROBLEM FORMULATION

Consider the following discrete-time dynamic system:

  x_i(k + 1) = f(x_i(k), k) + b(x_i(k), k) u_i(k)
  u_i(k) = DZ[v_i(k)]
  y_i(k + 1) = c x_i(k + 1)                                                     (1)

