
Chernoff Bounds for the Tail of the Binomial Distribution

Lenwood S. Heath
Virginia Polytechnic Institute and State University
June 7, 1988

This note proves a result (see Corollaries 2 and 3) of Chernoff [C] that bounds the probability of the tails of the binomial distribution. This is a direct, self-contained proof that avoids the generality of the original. The Chernoff bounds are frequently used in combinatorial arguments (e.g., Valiant [V]) and in the probabilistic method (e.g., Erdős and Spencer [ES]). Related results can be found in Feller [F2], page 525.

Definitions and Notation.


Let $X$ be a real random variable with cumulative distribution function $F_X(x) = P(X \le x)$. If $g$ is a real function of a real variable, then the expected value of $g(X)$ is
\[ E(g(X)) = \int_{-\infty}^{+\infty} g(x)\, dF_X(x), \]
if the integral exists; integrals are, in general, Lebesgue integrals. The moment generating function of $X$ is $M_X(t) = E(e^{tX})$. Formal calculation with power series shows that
\[ M_X(t) = \sum_{i=0}^{\infty} \frac{E(X^i)\, t^i}{i!} \]
(see Feller [F1], page 285). Define the shifted infimum of $M_X$ to be
\[ m_X(a) = \inf_t M_{X-a}(t) = \inf_t e^{-at} M_X(t) = \inf_t E\big(e^{t(X-a)}\big). \]
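
As an informal illustration (an addition to this note, not part of the original), the shifted infimum can be approximated numerically. The Python sketch below takes $X$ to be a Bernoulli($p$) variable, so that $M_X(t) = pe^t + q$, and minimizes $e^{-at} M_X(t)$ over a grid of values of $t$; the closed form it is compared against is the $n = 1$ case of the binomial calculation later in the note. The function name, the parameter values, and the grid range are choices made only for this sketch.

    import math

    def m_bernoulli(a, p, t_lo=-20.0, t_hi=20.0, steps=200001):
        """Approximate m_X(a) = inf_t e^(-a*t) * M_X(t) for X ~ Bernoulli(p)
        by a brute-force grid search over t in [t_lo, t_hi]."""
        q = 1.0 - p
        return min(
            math.exp(-a * t) * (p * math.exp(t) + q)
            for t in (t_lo + (t_hi - t_lo) * i / (steps - 1) for i in range(steps))
        )

    p, a = 0.5, 0.25        # arbitrary choices for the illustration, with 0 < a < 1
    q = 1.0 - p
    approx = m_bernoulli(a, p)
    exact = (q / (1 - a)) ** (1 - a) * (p / a) ** a   # n = 1 case of the calculation below
    print(approx, exact)    # the two values agree to several decimal places

The grid search is only a crude stand-in for the exact minimization over $t$ carried out analytically in the Binomial Distribution section.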

Tchebycheff Inequality.
The following result is from Kolmogorov [K], page 42.

Lemma 1. Let $Y$ be a real random variable. If $g$ is a non-negative real function of a real variable such that $g(y) \ge b > 0$ for $y \le a$, then
\[ P(Y \le a) \le \frac{E(g(Y))}{b}, \]
if $E(g(Y))$ exists.

Proof:
\[ E(g(Y)) = \int_{-\infty}^{+\infty} g(y)\, dF_Y(y) \ge \int_{-\infty}^{a} g(y)\, dF_Y(y) \ge \int_{-\infty}^{a} b\, dF_Y(y) = b\, P(Y \le a). \]
The Lemma follows.
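
As a concrete instance of the Lemma (added here for illustration; the numbers are arbitrary), the sketch below takes $Y$ to be a Binomial($10, 1/2$) variable, $a = 3$, and $g(y) = 2^{a-y}$, so that $g$ is non-negative and $g(y) \ge 1 = b$ whenever $y \le a$, and checks that $P(Y \le a) \le E(g(Y))/b$.

    from math import comb

    n, p, a = 10, 0.5, 3
    q = 1.0 - p
    pmf = [comb(n, j) * p**j * q**(n - j) for j in range(n + 1)]

    def g(y):
        """g(y) = 2^(a - y): non-negative, and g(y) >= 1 = b whenever y <= a."""
        return 2.0 ** (a - y)

    b = 1.0
    prob_tail = sum(pmf[j] for j in range(a + 1))            # P(Y <= a)
    expectation = sum(g(j) * pmf[j] for j in range(n + 1))   # E(g(Y))

    print(prob_tail, expectation / b)
    assert prob_tail <= expectation / b                      # the inequality of Lemma 1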

Chernoff's Results.

Theorem 1. (Chernoff [C]) If $E(X) > -\infty$ and $a \le E(X)$, then
\[ P(X \le a) \le m_X(a). \]

Proof: Let
\[ g(y,t) = e^{t(y-a)}. \]
Then, for $t \le 0$ and $y \le a$, $g(y,t) \ge 1$. By the Tchebycheff inequality,
\[ P(X \le a) \le \frac{E(g(X,t))}{1} = E(g(X,t)), \]
whenever $t \le 0$. But $E(g(X,t)) = M_{X-a}(t)$. Thus,
\[ P(X \le a) \le \inf_{t \le 0} M_{X-a}(t). \]

For all real $x$, $1 + x \le e^x$. For all $t$,
\[ M_{X-a}(t) = \int_{-\infty}^{+\infty} e^{t(x-a)}\, dF_X(x) \ge \int_{-\infty}^{+\infty} \big(1 + t(x-a)\big)\, dF_X(x) = 1 + t(E(X) - a). \]
Since $a \le E(X)$, $M_{X-a}(t) \ge 1 = M_{X-a}(0)$ whenever $t \ge 0$. Thus,
\[ \inf_{t \le 0} M_{X-a}(t) = \inf_t M_{X-a}(t) = m_X(a). \]
Finally,
\[ P(X \le a) \le m_X(a). \]
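
A quick numerical sanity check of the theorem (an added illustration, not from the original note): take $X$ uniform on $\{1, \ldots, 6\}$, so $E(X) = 3.5$, and $a = 3 \le E(X)$. The Python sketch below approximates $m_X(a)$ by a grid search over $t$ and confirms that $P(X \le 3) = 1/2$ does not exceed it; the die example and the grid range are choices made only for this sketch.

    import math

    outcomes = range(1, 7)                      # a fair die: uniform on {1, ..., 6}
    pmf = {x: 1.0 / 6.0 for x in outcomes}
    a = 3                                       # a <= E(X) = 3.5

    def shifted_mgf(t):
        """E(e^(t(X - a))) = e^(-a*t) * M_X(t) for the die."""
        return sum(pmf[x] * math.exp(t * (x - a)) for x in outcomes)

    # Approximate m_X(a) = inf_t E(e^(t(X - a))) by a grid search over t.
    m_approx = min(shifted_mgf(-10.0 + 20.0 * i / 100000) for i in range(100001))

    p_tail = sum(pmf[x] for x in outcomes if x <= a)    # P(X <= a) = 1/2
    print(p_tail, m_approx)
    assert p_tail <= m_approx                           # the bound of Theorem 1

Since the grid minimum can only overestimate the true infimum, the assertion is implied by the theorem itself.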

Corollary 1. If $E(X) < +\infty$ and $a \ge E(X)$, then
\[ P(X \ge a) \le m_X(a). \]

Proof: $E(-X) > -\infty$ and $-a \le E(-X)$. By Theorem 1,
\[ P(X \ge a) = P(-X \le -a) \le m_{-X}(-a). \]
But,
\[ m_{-X}(-a) = \inf_t E\big(e^{t(-X+a)}\big) = \inf_t E\big(e^{-t(X-a)}\big) = \inf_t E\big(e^{t(X-a)}\big) = m_X(a). \]
The Corollary follows.

Binomial Distribution.
Let $X_1, X_2, \ldots, X_n$ be identically distributed, independent random variables with distribution
\[ P(X_i = 1) = p, \qquad P(X_i = 0) = q = 1 - p. \]
Then $E(X_i) = p$. Let the sum of the random variables be
\[ S_n = \sum_{i=1}^{n} X_i. \]
Then,
\[ P(S_n = j) = \binom{n}{j} p^j q^{n-j}, \qquad 0 \le j \le n, \]
and $E(S_n) = np$.

We will now calculate $m_{S_n}(a)$ and apply Theorem 1 to $S_n$.
\[ M_{X_i}(t) = E(e^{tX_i}) = pe^t + q. \]
\begin{align*}
M_{S_n}(t) = E(e^{tS_n}) &= \prod_{i=1}^{n} E(e^{tX_i}) \qquad \text{by independence} \\
&= (pe^t + q)^n.
\end{align*}
Define $h(a,t) = e^{-at}(pe^t + q)^n$. We wish to minimize $h(a,t)$, for fixed $a < n$. Take the derivative of $h$ with respect to $t$:
\[ h'(a,t) = -a e^{-at}(pe^t + q)^n + e^{-at}\, pe^t\, n (pe^t + q)^{n-1}. \]
Assume $h'(a,t) = 0$, and solve for $t$. Then
\[ a(pe^t + q) = pe^t n, \qquad e^t = \frac{qa}{p(n-a)}. \]
Thus $h(a,t)$ is minimized when $t = \ln\big(qa/(p(n-a))\big)$. We obtain
\[ m_{S_n}(a) = \inf_t h(a,t) = h\Big(a, \ln\frac{qa}{p(n-a)}\Big). \]

Corollary 2. Let $S_n$ be a random variable having a binomial distribution with parameters $n$ and $p$. If $k \le E(S_n) = np$ and $0 < k < n$, then
\[ \sum_{j=0}^{k} \binom{n}{j} p^j q^{n-j} = P(S_n \le k) \le \left(\frac{n - np}{n-k}\right)^{n-k} \left(\frac{np}{k}\right)^{k}. \]

Proof: By the Theorem,
\[ P(S_n \le k) \le m_{S_n}(k). \]
By the above calculations,
\begin{align*}
m_{S_n}(k) &= h\Big(k, \ln\frac{kq}{p(n-k)}\Big) \\
&= e^{-k \ln\frac{kq}{p(n-k)}} \Big( p\, e^{\ln\frac{kq}{p(n-k)}} + q \Big)^n \\
&= \left(\frac{kq}{p(n-k)}\right)^{-k} \left(\frac{kq}{n-k} + q\right)^n \\
&= \left(\frac{p(n-k)}{kq}\right)^{k} \left(\frac{nq}{n-k}\right)^n \\
&= \left(\frac{p(n-k)}{kq} \cdot \frac{nq}{n-k}\right)^{k} \left(\frac{nq}{n-k}\right)^{n-k} \\
&= \left(\frac{nq}{n-k}\right)^{n-k} \left(\frac{np}{k}\right)^{k} \\
&= \left(\frac{n - np}{n-k}\right)^{n-k} \left(\frac{np}{k}\right)^{k}.
\end{align*}
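
The algebra above is easy to spot-check numerically. The Python sketch below (an illustrative addition; the values of $n$, $p$, and $k$ are arbitrary, and the grid range is a choice made for the sketch) compares the exact lower tail $P(S_n \le k)$, a grid-search approximation of $\inf_t h(k,t)$, and the closed form of Corollary 2.

    import math
    from math import comb

    n, p = 50, 0.4
    q = 1.0 - p
    k = 12                                   # k <= np = 20

    # Exact lower tail of the binomial distribution.
    tail = sum(comb(n, j) * p**j * q**(n - j) for j in range(k + 1))

    def h(a, t):
        """h(a, t) = e^(-a*t) * (p*e^t + q)^n, as defined above."""
        return math.exp(-a * t) * (p * math.exp(t) + q) ** n

    # Grid-search approximation of m_{S_n}(k) = inf_t h(k, t).
    grid_inf = min(h(k, -10.0 + 20.0 * i / 100000) for i in range(100001))

    # Closed form from Corollary 2 (note that nq = n - np).
    bound = (n * q / (n - k)) ** (n - k) * (n * p / k) ** k

    print(tail, grid_inf, bound)
    assert tail <= bound and abs(grid_inf - bound) < 1e-6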

Corollary 3. Let $S_n$ be a random variable having a binomial distribution with parameters $n$ and $p$. If $k \ge E(S_n) = np$ and $0 < k < n$, then
\[ \sum_{j=k}^{n} \binom{n}{j} p^j q^{n-j} = P(S_n \ge k) \le \left(\frac{n - np}{n-k}\right)^{n-k} \left(\frac{np}{k}\right)^{k}. \]

Proof: Combine Corollary 1 with the calculation in the proof of Corollary 2.
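
As with Corollary 2, this upper-tail bound can be spot-checked numerically. The short sketch below (an added illustration; $n$ and $p$ are arbitrary choices) computes $P(S_n \ge k)$ exactly for every integer $k$ with $np \le k < n$ and verifies that the bound of Corollary 3 holds at each.

    from math import ceil, comb

    n, p = 40, 0.25
    q = 1.0 - p
    pmf = [comb(n, j) * p**j * q**(n - j) for j in range(n + 1)]

    for k in range(ceil(n * p), n):                        # np <= k < n
        upper_tail = sum(pmf[j] for j in range(k, n + 1))  # P(S_n >= k)
        bound = ((n - n * p) / (n - k)) ** (n - k) * (n * p / k) ** k
        assert upper_tail <= bound, (k, upper_tail, bound)
    print("Corollary 3 holds for every admissible k when n = 40, p = 0.25")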


In Erdős and Spencer [ES], the bound is written in exponential form:
\[ m_{S_n}(k) = \exp\Big( (n-k) \ln\frac{nq}{n-k} + k \ln\frac{np}{k} \Big). \]

References
[C] Chernoff, H. A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Annals of Mathematical Statistics 23 (1952), 493-507.

[ES] Erdős, P., and Spencer, J. Probabilistic Methods in Combinatorics. Academic Press, New York (1974).

[F1] Feller, W. An Introduction to Probability Theory and Its Applications, Volume I. John Wiley & Sons, Inc., New York (1968).

[F2] Feller, W. An Introduction to Probability Theory and Its Applications, Volume II. John Wiley & Sons, Inc., New York (1966).

[K] Kolmogorov, A. N. Foundations of the Theory of Probability. Chelsea Publishing Company, New York (1956).

[V] Valiant, L. G. A theory of the learnable. Communications of the ACM 27 (1984), 1134-1142.
