'Wald's Lemma' for Sums of Order Statistics of I.I.D. Random Variables
Applied Probability Trust is collaborating with JSTOR to digitize, preserve and extend access to Advances
in Applied Probability
This content downloaded from 164.15.133.2 on Tue, 23 Aug 2016 13:32:09 UTC
All use subject to http://about.jstor.org/terms
Adv. Appl. Prob. 23, 612-623 (1991)
Printed in N. Ireland
Abstract
Let $X_1, X_2, \dots, X_n$ be positive i.i.d. random variables with known distribution
function having a finite mean. For a given $s > 0$ we define $N_n = N(n, s)$ to be the
largest number $k$ such that the sum of the smallest $k$ $X$'s does not exceed $s$, and
$M_n = M(n, s)$ to be the largest number $k$ such that the sum of the largest $k$ $X$'s does
not exceed $s$. This paper studies the precise and asymptotic behaviour of $E(N_n)$,
$E(M_n)$, $N_n$, $M_n$, and the corresponding 'stopped' order statistics $X_{(N_n)}$ and
$X_{(n-M_n+1)}$ as $n \to \infty$, both for fixed $s$, and where $s = s_n$ is an increasing function of $n$.
STOPPING TIMES; EMPIRICAL DISTRIBUTION FUNCTION; PROPHET INEQUALITY
1. Introduction
The baker's problem. As a baker opens his shop in the morning he finds n clients in
front of the door, all waiting for fresh breakfast rolls. The baker has a supply of s
rolls. If the customers' demands $X_1, \dots, X_n$ are i.i.d. with common distribution
function F, what is the expected number of demands he can serve completely(!) if
he serves the smallest demands first, the demands in order of arrival (FCFS), or the
largest demands first?
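As an illustration of the three serving orders, the following minimal Python sketch counts the demands served completely under each policy. The function names `served_in_order` and `baker_counts` are hypothetical, and the sketch assumes that under each policy the baker stops at the first demand he cannot meet in full, which is how the 'largest number k' in the abstract is read here.

```python
from typing import Dict, Sequence


def served_in_order(demands: Sequence[float], supply: float) -> int:
    """Number of leading demands that fit entirely within the supply.

    Assumption: serving stops at the first demand that cannot be met in full.
    """
    total = 0.0
    count = 0
    for d in demands:
        if total + d > supply:
            break
        total += d
        count += 1
    return count


def baker_counts(demands: Sequence[float], supply: float) -> Dict[str, int]:
    """Customers served completely under the three policies."""
    ascending = sorted(demands)
    return {
        "smallest_first": served_in_order(ascending, supply),       # N_n
        "fcfs": served_in_order(demands, supply),                   # arrival order
        "largest_first": served_in_order(ascending[::-1], supply),  # M_n
    }


counts = baker_counts([3.0, 1.0, 4.0, 1.5, 2.0], supply=5.0)
print(counts)
```

Serving the smallest demands first can never serve fewer customers than the other two orders, so the first count is an upper bound for the other two on any sample.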
This formulation is easy to remember and catches the essential nature of the
problem, although we confine our attention to continuous $X_i$'s. More interesting
forms of such questions of 'stopping times' arise, for example, in connection with
resource-dependent branching processes and their extinction probabilities as a
function of the social behaviour of particles (Bruss (1983)). Other examples where
these problems become important include strategies to select intervals under a sum
constraint and a no-recall condition as studied by Samuels and Steele (1981) and
Coffman et al. (1987). However, in what follows we shall confine our attention to
the theoretical part of the problem.
Received 17 July 1989; revision received 29 May 1990.
* Present address: Departement Wiskunde, F 733, Vrije Universiteit Brussel, B-1050 Brussels,
Belgium.
** Postal address: Department of Statistics and Applied Probability, University of California, Santa
Barbara, CA 93106, USA.
Research supported by the grant 'Automatic Decision Strategies' (No. ST2J-0227-C) of the European
Community.
is to define them via the empirical distribution function $F_n$ associated with the
sample $X_1, X_2, \dots, X_n$. Indeed if $F$ is continuous, then they are determined by
(1.3) $N_n = nF_n(X_{N_n,n})$,
(1.4) $M_n = n(1 - F_n(X_{n-M_n,n}))$.
Here we set $X_{0,n} = 0$. Moreover, for general $F$ we have
xkn S
(1.7) $\int_0^{\tau} x\,dF(x) = \alpha$,
(1.8) $\nu = F(\tau)$.
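To make the quantities in (1.7) and (1.8) concrete, the following sketch solves for the threshold numerically. It assumes that (1.7) reads $\int_0^{\tau} x\,dF(x) = \alpha$ and (1.8) reads $\nu = F(\tau)$, and it takes the standard exponential distribution $F(x) = 1 - e^{-x}$ purely as an example; the helper names are hypothetical.

```python
import math


def partial_mean_exp(t: float) -> float:
    """Integral of x dF(x) over [0, t] for F(x) = 1 - exp(-x)."""
    return 1.0 - (1.0 + t) * math.exp(-t)


def solve_tau(alpha: float, hi: float = 50.0, tol: float = 1e-12) -> float:
    """Bisection for tau with partial_mean_exp(tau) = alpha.

    Assumes 0 < alpha < 1, i.e. alpha lies below the mean of the distribution.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and the mean (1)")
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if partial_mean_exp(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


alpha = 0.25
tau = solve_tau(alpha)
nu = 1.0 - math.exp(-tau)  # nu = F(tau)
print(tau, nu)
```

Bisection is used because the partial mean is continuous and increasing in the threshold, so no derivative information is needed.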
Our aim is to find conditions on the distribution function F under which this
614 F. THOMAS BRUSS AND JAMES B. ROBERTSON
$X_k$'s, i.e. $\gamma = \int_0^{\infty} x\,dF(x)$. Each of the cases [...] a different treatment.
(2.1) n N, >=n - - - 1. Yn
The first inequality is clear by the definition of $N_n$. The second inequality is also
clear if $S_n \le s_n$, since then $N_n = n$ and the right-hand side is less than $n$. Therefore we
suppose that $S_n > s_n$. Then $N_n + 1 \le n$ and since the $X_{k,n}$ are increasing in $k$ we have
$$n \le N_n + 1 + \frac{S_n - S_{N_n+1}}{X_{N_n+1,n}} < N_n + 1 + \frac{S_n - s_n}{X_{N_n+1,n}},$$
thus
$$N_n + 1 > n - \frac{S_n - s_n}{X_{N_n+1,n}}.$$
XNn+l,n
N, +1 > > Y + o(1).
Sn Sn
N, n n 1
1~ =21-
n yn n - - ----- 1 a
$\gamma = \int x\,dF(x)$, let $\tau \in (a, b)$ be defined by the equation $\alpha = \int_a^{\tau} x\,dF(x)$. Let $s_1, s_2, \dots$
be a sequence such that $\lim_{n\to\infty} s_n/n = \alpha$, and let $X_1, X_2, \dots$ be i.i.d. with
distribution function $F$. Let $N_n$ be defined by (1.1).
Then
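$$\lim_{n\to\infty} X_{N_n,n} = \tau \ \text{a.s.}, \qquad \lim_{n\to\infty} \frac{1}{n} N_n = F(\tau) \ \text{a.s.}, \qquad \lim_{n\to\infty} \frac{1}{n} E(N_n) = F(\tau).$$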
n
1n m
(2.3)
n k=1
- E Xkl{X
limsup-= x dF(x).
n ---noo fl a
Here the third equality above uses the Glivenko-Cantelli theorem, which asserts that
$F_n$ converges uniformly to $F$ with probability 1, and we have used the continuity of $F$
throughout. The above inequality clearly implies that $\limsup_{n\to\infty} X_{N_n,n} \le \tau$. In a
similar way we can show that $\tau \le \liminf_{n\to\infty} X_{N_n+1,n}$.
We next show that $\lim_{n\to\infty} (X_{N_n+1,n} - X_{N_n,n}) = 0$ a.s., which will complete the
proof of the first equation of the theorem. Let $\varepsilon > 0$ and let $A_1, A_2, \dots, A_r$ be a
partition of $[a, b]$ into intervals of positive lengths less than $\varepsilon/2$. Since $F$ is strictly
increasing on each $A_j$, we have $P(X \in A_j) > 0$ for all $1 \le j \le r$. Then by the
Borel-Cantelli lemma, $P(X_n \in A_j \text{ i.o.}) = 1$. Thus for almost all $\omega$ with respect to $P$
there exists an integer $I(\omega)$ such that for all $n \ge I(\omega)$ and all $1 \le k < n$,
$X_{k+1,n}(\omega) - X_{k,n}(\omega) < \varepsilon$ (since $X_{k,n}(\omega)$ and $X_{k+1,n}(\omega)$ are in the same or adjacent
intervals). Hence
$(X_{N_n+1,n} - X_{N_n,n})(\omega) = X_{N_n(\omega)+1,n}(\omega) - X_{N_n(\omega),n}(\omega) < \varepsilon$, implying
$\lim_{n\to\infty} (X_{N_n+1,n} - X_{N_n,n})(\omega) = 0$ as desired.
The second equation of the theorem follows from the first equation, Equation
(1.3), and the Glivenko-Cantelli theorem.
The third equation of the theorem follows at once from the second equation since
$N_n/n$ is uniformly bounded by 1. This completes the proof.
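The limit of $N_n/n$ can be checked by simulation. The sketch below is only an illustration under stated assumptions: it reads the second equation of the theorem as $N_n/n \to F(\tau)$ a.s. (as suggested by (1.8) and the proof above), takes $F$ uniform on $[0, 1]$ so that $\alpha = \tau^2/2$ and $F(\tau) = \sqrt{2\alpha}$, and sets $s_n = n\alpha$. The function name `n_smallest_served` is hypothetical.

```python
import math
import random
from bisect import bisect_right
from itertools import accumulate


def n_smallest_served(sample, s):
    """N_n: largest k such that the k smallest values sum to at most s."""
    # Partial sums of the order statistics are non-decreasing for positive data,
    # so the number of partial sums <= s can be found by binary search.
    partial_sums = list(accumulate(sorted(sample)))
    return bisect_right(partial_sums, s)


rng = random.Random(0)       # fixed seed so the run is reproducible
alpha = 0.125
tau = math.sqrt(2 * alpha)   # solves tau^2 / 2 = alpha; F(tau) = tau for Uniform(0, 1)

for n in (1_000, 10_000, 100_000):
    sample = [rng.random() for _ in range(n)]
    ratio = n_smallest_served(sample, n * alpha) / n
    print(n, round(ratio, 4), "limit:", tau)
```

With $\alpha = 0.125$ the assumed limit is $F(\tau) = 0.5$, and the printed ratios should approach it as $n$ grows.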
For bounded random variables the $M_n$-problem can be treated in the same way as
the $N_n$-problem. The role of $\tau$ is now played by $\theta$, defined by $\int_{\theta}^{b} x\,dF(x) = \alpha$. Thus
under the same conditions as in Theorem 2.2 (except that $b < \infty$), we have the
following result.
Theorem 2.3. We have
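$$\lim_{n\to\infty} X_{n-M_n+1,n} = \theta \ \text{a.s.}, \qquad \lim_{n\to\infty} \frac{1}{n} M_n = 1 - F(\theta) \ \text{a.s.}, \qquad \lim_{n\to\infty} \frac{1}{n} E(M_n) = 1 - F(\theta).$$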
The proof goes throug
unlike the $N_n$-problem,
restriction, and we have
( 2
(3.3) P[,t - nF(t)l z a] <- 2 exp 4nF(t)) if 0 5 a 5 2n
t[a 2
(3.4) P $ - nf xdF(x) a 2 exp
We next state some preliminary results
Further we have
(3.10) nti,,,F(ti,,n) - si
Proof. If nF(t,,,) > 1 and r < 1, we have
- 1 (F(t2,n)- F(t-l,))
F(t2,n)
= S1l S
F(t1,,)
nF(ti,,) = (t)
fix dF(x)
F(ti,,)si
f x dF(x)
S F(tin)Si
Si
- Att,,,(1 - F(At;,,)/F(ti,,))
This proves Equation (3.10) and completes the proof of the lemma.
nf x dF(x) = s, N, = nF,(X,,n).
(3.11) $\limsup_{x \to 0+} \dfrac{F(\lambda x)}{F(x)} < 1$.
Then
Proof. Equation (3.5) implies that ti,,--- 0 as n-- oo. By (3.11) and
nti,,F(t~,,) is bounded in n. Thus since r < 1, we have ti,n(nF(ti,n))r' 0 as n -,
us choose n sufficiently large so that tl,n(nF(t,,))r <l(s - sl). Then
complement of E1 we have S1 <s. This implies that Nt, - 5Nn. In the complem
s
= (1 - (nF(tI,n))r- ) s.
Equation (3.8) implies that $nF(t_{1,n}) \to \infty$ as $n \to \infty$. We may therefore take $n$
sufficiently large so that
(1 - (nF(tI,))r-n) s 1 - E.
Thus
N. >
nF(t,) -
Similarly by considering $D_2$ and $E_2$ we may take $n$ sufficiently large so that in the
= (1 + (nF(t2,n)r-f) s
=1+e.
(3.13) N_ - An +-. tn
N, =wenFn(X.,,,)
From this and Equation (1.5) conclude _5 N,. + n[F,(X.,,,) - F,(t,)].
XNn~n
nsn x dF,,(x)
X Nn,n
ntn A XNn,n
n ntn[Fn,(XNu,,) - F,(t,)]+
=t.(Nn - NJt).
This proves Equation (3.13). Consequently
N,_ _, s
nF(t,) = nF(t,) nt,F(t,)
Thus it is sufficient to prove that $Z_n = N_{t_n}/(nF(t_n))$ is uniformly integrable. Since $N_{t_n}$
is the sum of $n$ i.i.d. indicator functions with 'success' probability $p = F(t_n)$, it
follows that $E(Z_n) = 1$ and
$$\operatorname{Var}(Z_n) = \frac{nF(t_n)(1 - F(t_n))}{(nF(t_n))^2} \le \frac{1}{nF(t_n)}.$$
Thus using (3.8) and Chebychev's inequality we have
$$P(Z_n > a + 1) \le P(|Z_n - 1| > a)$$
t1
as a2
2M
F(Ax) F(Ax)I(Ax)
F(x) F(x)/x
F(x) - -A g dt
Numerical calculations lead us to conjecture that in these cases we still have
4. Remarks
SN - St s-St
tn tn
see that the first inequality holds, suppose first that $S_{N_n} \ge S_{t_n}$. This implies that
$X_{N_n,n} \ge t_n$, and $S_{N_n} - S_{t_n}$ is therefore a sum of terms which are not less than $t_n$.
Consequently
$$(N_n - N_{t_n})\,t_n \le S_{N_n} - S_{t_n},$$
which implies the first inequality. On the other hand if $S_{N_n} < S_{t_n}$, then $(N_{t_n} - N_n)\,t_n \ge
S_{t_n} - S_{N_n}$ because each term in the difference on the right-hand side is at most $t_n$.
Thus the final inequality always holds. Finally, taking expectations in (4.1) yields
_ s E(S,)
E (Nn) 5 E( Nt,) +^ = E(,t) = nF(t,).
tn tn
o 2
fxdF(x)= - a 2 r = 2-a.
Corresponding to Section 2, we take $s_n = n\alpha$ with $\alpha$ fixed. Then Theo
is a stopping time on an i.i.d. sequence $X, X_1, X_2, \dots$ such that $E(N) < \infty$, then
$$E\Big(\sum_{k=1}^{N} X_k\Big) = E(N)E(X).$$
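Wald's identity $E(\sum_{k=1}^{N} X_k) = E(N)E(X)$ can be checked numerically for a genuine stopping time. The sketch below is an illustration only: it assumes Uniform(0, 1) demands and takes $N$ to be the first index at which the running sum in arrival order exceeds $s$; the function name `stopped_sum` and the parameter choices are hypothetical.

```python
import random
from typing import Tuple


def stopped_sum(rng: random.Random, s: float) -> Tuple[int, float]:
    """Draw Uniform(0, 1) values until the running sum first exceeds s.

    Returns (N, S_N). N is a stopping time because the decision to stop
    after k draws depends only on the first k values.
    """
    total, n = 0.0, 0
    while total <= s:
        total += rng.random()
        n += 1
    return n, total


rng = random.Random(1)       # fixed seed so the run is reproducible
trials = 100_000
results = [stopped_sum(rng, 3.0) for _ in range(trials)]
mean_n = sum(n for n, _ in results) / trials
mean_sum = sum(t for _, t in results) / trials
wald_gap = mean_sum - mean_n * 0.5   # E(X) = 1/2 for Uniform(0, 1)
print(mean_n, mean_sum, wald_gap)
```

The gap between the two sides should be close to zero up to Monte Carlo error. The same check is not available for $N_n$, since the order statistics depend on the whole sample.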
have $S_{N_n} \le s < S_{N_n+1}$. If $\tau'$ were not random, we could apply Wald's lemma to
obtain $E(S_{N_n+1}) = E(N_n + 1)E(X \mid X \le \tau')$. This heuristic result agrees with the
results in Section 2. However, in Section 3 the factor $E(X \mid X \le X_{N_n+1,n})$ is not easy to
calculate, and so we derived an expression for $E(N_n)$ directly.
Acknowledgements
Part of the research leading to this paper, supported by the European Community,
was carried out by the authors at the University of Arizona where the first
author was holding a research fellowship of the Alexander-von-Humboldt Foundation.
We thank the foundations for their support and the University of Arizona for
its hospitality. We are also grateful to the referee for many constructive comments.
References
BRUSS, F. T. (1983) Resource dependent branching processes (abstract). Stoch. Proc. Appl. 16, 36.
COFFMAN, E. G., FLATTO, L. AND WEBER, R. R. (1987) Optimal selection of stochastic intervals
under a sum constraint. Adv. Appl. Prob. 19, 454-473.
SAMUELS, S. M. AND STEELE, J. M. (1981) Optimal sequential selection of a monotone sequence from a
random sample. Ann. Prob. 9, 937-947.
WALD, A. (1947) Sequential Analysis. Dover, New York.