by Howard G. Tucker
CHAPTER 3. EXPECTATION
Now one notices that the center of gravity given by the above formula is nothing other than the sum of the areas in the rectangles denoted by 3 and 4 minus the sum of the areas in the rectangles marked off as 1 and 2. In other words, the center of gravity is the area between the graph of $y = F_X(x)$, the line $y = 1$ and the $y$-axis, minus the area below the graph of $y = F_X(x)$, above the $x$-axis and to the left of the $y$-axis. This is the case for this and, indeed, for any discrete distribution function. Since any distribution function can be approximated quite closely by a discrete distribution function, it seems almost natural to define the centroid in the general case by such a difference of areas. Accordingly, we arrive at the following definition.
Definition. Let $X$ be a random variable with distribution function $F_X(x)$. The expectation $E(X)$ or $EX$ of $X$ is defined by
$$E(X) = \int_0^\infty \big(1 - F_X(x)\big)\,dx - \int_{-\infty}^0 F_X(x)\,dx = \int_0^\infty P([X > x])\,dx - \int_{-\infty}^0 P([X \le x])\,dx\,,$$
and is said to exist if the two integrals are finite. In cases where at least one of the integrals is infinite, we shall say that the expectation of $X$ does not exist.
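To see the definition in action numerically, the following minimal sketch (assuming SciPy is available; the normal distribution with mean 1.5 is an arbitrary choice) evaluates the two tail integrals by quadrature and recovers the known mean:

```python
# Numerical check of E(X) = ∫_0^∞ (1 - F(x)) dx - ∫_{-∞}^0 F(x) dx
# for X ~ Normal(mu, sigma), whose mean is known to be mu.
from scipy.integrate import quad
from scipy.stats import norm

mu, sigma = 1.5, 2.0
F = lambda x: norm.cdf(x, loc=mu, scale=sigma)

upper, _ = quad(lambda x: 1 - F(x), 0, float("inf"))   # ∫_0^∞ (1 - F) dx
lower, _ = quad(F, float("-inf"), 0)                   # ∫_{-∞}^0 F dx

print(upper - lower)  # ≈ 1.5 = mu
```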
In other words, we say that the expectation of $X$ exists if and only if the two integrals in the above definition are finite. For technical reasons later on, we shall need the following proposition.
Lemma 1. If $X$ is a random variable that satisfies
$$\int_0^\infty x^k P([X > x])\,dx < \infty \quad\text{and}\quad \int_{-\infty}^0 |x|^k P([X \le x])\,dx < \infty\,,$$
then
$$\int_0^\infty x^k P([X > x])\,dx = \int_0^\infty x^k P([X \ge x])\,dx \quad\text{and}\quad \int_{-\infty}^0 |x|^k P([X < x])\,dx = \int_{-\infty}^0 |x|^k P([X \le x])\,dx\,.$$
Proof: We shall only prove the second equality; the first is proved similarly. We first prove this in the case where $k = 0$. Let $\varepsilon > 0$ be arbitrary. Since $[X < x] \subset [X \le x] \subset [X < x + \varepsilon]$, we have
$$\int_{-\infty}^0 P([X < x])\,dx \le \int_{-\infty}^0 P([X \le x])\,dx \le \int_{-\infty}^0 P([X < x + \varepsilon])\,dx\,.$$
Let us make the change of variable $y = x + \varepsilon$ in this last integral. Then this last integral becomes
$$\int_{-\infty}^{\varepsilon} P([X < y])\,dy = \int_{-\infty}^0 P([X < x])\,dx + \int_0^{\varepsilon} P([X < x])\,dx \le \int_{-\infty}^0 P([X < x])\,dx + \varepsilon\,,$$
and the case $k = 0$ follows upon letting $\varepsilon \to 0$.

For general $k$, the same sandwich and the same change of variable $y = x + \varepsilon$, with $0 < \varepsilon \le 1$, give
$$\int_{-\infty}^0 |x|^k P([X < x])\,dx \le \int_{-\infty}^0 |x|^k P([X \le x])\,dx \le \int_{-\infty}^{\varepsilon} |y - \varepsilon|^k P([X < y])\,dy\,,$$
and we must bound the amount by which this last integral can exceed the first. The piece over $[0, \varepsilon]$ is at most $\varepsilon$, since $|y - \varepsilon| \le \varepsilon \le 1$ there. For the piece over $(-\infty, 0]$, note that by the triangle inequality $|y - \varepsilon| \le |y| + \varepsilon$, and so by the binomial theorem,
$$|y - \varepsilon|^k \le (|y| + \varepsilon)^k = \sum_{j=0}^{k} \binom{k}{j} |y|^j \varepsilon^{k-j}\,.$$
Thus
$$\big|\,|y - \varepsilon|^k - |y|^k\,\big| \le \sum_{j=0}^{k-1} \binom{k}{j} |y|^j \varepsilon^{k-j}\,.$$
Now
$$\int_{-\infty}^0 \big|\,|y-\varepsilon|^k - |y|^k\,\big|\, P([X < y])\,dy = \int_{-\infty}^{-1} \big|\,|y-\varepsilon|^k - |y|^k\,\big|\, P([X < y])\,dy + \int_{-1}^{0} \big|\,|y-\varepsilon|^k - |y|^k\,\big|\, P([X < y])\,dy\,.$$
For the first term, since $|y| \ge 1$ implies $|y|^j \le |y|^k$ for $0 \le j \le k-1$, and since $\varepsilon^{k-j} \le \varepsilon$,
$$\int_{-\infty}^{-1} \big|\,|y-\varepsilon|^k - |y|^k\,\big|\, P([X < y])\,dy \le \sum_{j=0}^{k-1} \binom{k}{j} \varepsilon^{k-j} \int_{-\infty}^{-1} |y|^j P([X < y])\,dy \le \varepsilon\,(2^k - 1) \int_{-\infty}^{-1} |y|^k P([X < y])\,dy\,.$$
For the second term, since $|y|^j \le 1$ on $[-1, 0]$,
$$\int_{-1}^{0} \big|\,|y-\varepsilon|^k - |y|^k\,\big|\, P([X < y])\,dy \le \varepsilon\,(2^k - 1) \int_{-1}^{0} P([X < y])\,dy \le \varepsilon\,(2^k - 1)\,.$$
Combining these bounds,
$$\int_{-\infty}^0 |x|^k P([X < x])\,dx \le \int_{-\infty}^0 |x|^k P([X \le x])\,dx \le \int_{-\infty}^0 |x|^k P([X < x])\,dx + \varepsilon\Big\{1 + (2^k - 1)\Big(1 + \int_{-\infty}^{-1} |y|^k P([X < y])\,dy\Big)\Big\}\,.$$
Since by hypothesis $\int_{-\infty}^0 |x|^k P([X \le x])\,dx < \infty$, so that the integral in the braces is finite as well, we obtain the conclusion by taking the limit as $\varepsilon \to 0$.
This proposition is of great use to us. It allows us to replace $\le$ by $<$ and $>$ by $\ge$ inside the probabilities in such integrals without changing their values.
Lemma 2. If $\{a_n\}$ is a denumerable sequence of nonnegative real numbers, and if $Q_n = \sum_{j=n}^\infty a_j$, then
$$\sum_{n=1}^\infty Q_n = \sum_{n=1}^\infty n\, a_n\,.$$
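Since the identity is a rearrangement of a double sum of nonnegative terms, it can be checked directly on any sequence with finite support; a minimal sketch in plain Python (the particular sequence is an arbitrary choice):

```python
# Check of Lemma 2: sum_n Q_n = sum_n n*a_n, where Q_n = sum_{j>=n} a_j.
a = {1: 0.5, 2: 0.25, 3: 0.125, 4: 0.125}      # a_n = 0 for all other n

Q = {n: sum(a.get(j, 0.0) for j in range(n, 5)) for n in range(1, 5)}

lhs = sum(Q.values())                       # sum_n Q_n
rhs = sum(n * an for n, an in a.items())    # sum_n n*a_n
print(lhs, rhs)                             # both 1.875
```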
Theorem 1. If $X$ is a discrete random variable whose distinct values are $x_1, x_2, \ldots$, where $p_j = P([X = x_j])$, then $E(X)$ exists if and only if $\sum_j |x_j|\, p_j < \infty$, in which case $E(X) = \sum_j x_j\, p_j$.
Proof: Denote
$$I_1 = \int_0^\infty P([X > x])\,dx \quad\text{and}\quad I_2 = \int_{-\infty}^0 P([X \le x])\,dx\,.$$
We shall use the fact that $P([X > x])$ is nonincreasing in $x$, i.e., if $x_1 < x_2$, then $P([X > x_1]) \ge P([X > x_2])$. Because of the definition of the improper
Riemann integral, we may write for every positive integer $n$,
$$I_1 = \sum_{k=1}^\infty \int_{(k-1)/n}^{k/n} P([X > x])\,dx\,.$$
Because of the nonincreasing property of $P([X > x])$ in $x$ that was mentioned above, we have the following inequality:
$$\frac{1}{n}\, P\Big(\Big[X > \frac{k}{n}\Big]\Big) \le \int_{(k-1)/n}^{k/n} P([X > x])\,dx \le \frac{1}{n}\, P\Big(\Big[X > \frac{k-1}{n}\Big]\Big)\,.$$
Thus, if we denote
$$L_n = \sum_{k \ge 1} \frac{1}{n}\, P\Big(\Big[X > \frac{k}{n}\Big]\Big)
\quad\text{and}\quad
U_n = \sum_{k \ge 1} \frac{1}{n}\, P\Big(\Big[X > \frac{k-1}{n}\Big]\Big)\,,$$
we easily see because of the above inequality that for every positive integer $n$,
$$L_n \le I_1 \le U_n\,.$$
However, we may write
$$L_n = \frac{1}{n} \sum_{k \ge 1}\, \sum_{j \ge k} P\Big(\Big[\frac{j}{n} < X \le \frac{j+1}{n}\Big]\Big)\,.$$
However,
$$\frac{k}{n}\, P\Big(\Big[\frac{k-1}{n} < X \le \frac{k}{n}\Big]\Big)
= \frac{k}{n} \sum\Big\{p_j : \frac{k-1}{n} < x_j \le \frac{k}{n}\Big\}
\ge \sum\Big\{x_j\, p_j : \frac{k-1}{n} < x_j \le \frac{k}{n}\Big\}\,.$$
Hence
$$L_n \ge \sum_{k \ge 1} \sum\Big\{x_j\, p_j : \frac{k-1}{n} < x_j \le \frac{k}{n}\Big\} - \frac{1}{n}\, P([X > 0])
\ge \sum\{x_m\, p_m : x_m > 0\} - \frac{1}{n}\,.$$
Theorem 2. If $X$ is a random variable with an absolutely continuous distribution function whose density $f_X(x)$ is bounded over every bounded interval and is continuous everywhere except at a countable set of points, then $E(X)$ exists if and only if both of the improper Riemann integrals $\int_0^\infty x\, f_X(x)\,dx$ and $\int_{-\infty}^0 |x|\, f_X(x)\,dx$ exist and are finite, in which case
$$E(X) = \int_{-\infty}^\infty x\, f_X(x)\,dx\,.$$
Proof: For $A > 0$, and making use of integration by parts, the hypothesis that $f_X(x)$ is bounded over every bounded interval and is continuous everywhere except at a countable set of points, and the fundamental theorem of calculus, we obtain
$$\int_0^A P([X > x])\,dx = \int_0^A \Big(\int_x^\infty f_X(t)\,dt\Big)\,dx
= \Big[\,x \int_x^\infty f_X(t)\,dt\,\Big]_0^A + \int_0^A x\, f_X(x)\,dx
= A \int_A^\infty f_X(t)\,dt + \int_0^A x\, f_X(x)\,dx\,.$$
Both quantities on the right-hand side are nonnegative. First let us suppose that $E(X)$ exists and is finite. Then as $A \to \infty$, we have $\int_0^A P([X > x])\,dx \to \int_0^\infty P([X > x])\,dx < \infty$. Since $0 \le \int_0^A x f_X(x)\,dx \le \int_0^A P([X > x])\,dx$, we have, taking $A \to \infty$, $\int_0^\infty x f_X(x)\,dx \le \int_0^\infty P([X > x])\,dx < \infty$, i.e., $\int_0^\infty x f_X(x)\,dx < \infty$. Since $A \int_A^\infty f_X(t)\,dt \le \int_A^\infty t f_X(t)\,dt$, and the right side tends to $0$ as $A \to \infty$, being the tail of a convergent integral, it follows that $A \int_A^\infty f_X(t)\,dt \to 0$. Thus, by the displayed equation above, $\int_0^\infty x f_X(x)\,dx = \int_0^\infty P([X > x])\,dx < \infty$. Now let us assume that $\int_0^\infty x f_X(x)\,dx < \infty$. Then, because $A \int_A^\infty f_X(t)\,dt \le \int_A^\infty t f_X(t)\,dt$, we have $A \int_A^\infty f_X(t)\,dt \to 0$ as $A \to \infty$. By the displayed equation above, it follows that $\int_0^\infty P([X > x])\,dx < \infty$ and $\int_0^\infty P([X > x])\,dx = \int_0^\infty x f_X(x)\,dx$. Similarly, $\int_{-\infty}^0 P([X \le x])\,dx < \infty$ if and only if $\int_{-\infty}^0 |x|\, f_X(x)\,dx < \infty$, in which case $\int_{-\infty}^0 P([X \le x])\,dx = -\int_{-\infty}^0 x f_X(x)\,dx$, which proves the theorem.
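As a numerical illustration of Theorem 2, the tail-probability form and the density form of the expectation can be compared by quadrature; a minimal sketch, assuming SciPy (the exponential density with rate 2 is an arbitrary choice):

```python
# E(X) via the tail formula ∫_0^∞ P([X > x]) dx and via ∫_0^∞ x f(x) dx
# for the exponential density f(x) = lam * exp(-lam*x), x >= 0.
import math
from scipy.integrate import quad

lam = 2.0
tail = quad(lambda x: math.exp(-lam * x), 0, float("inf"))[0]        # P([X > x]) = e^{-lam x}
dens = quad(lambda x: x * lam * math.exp(-lam * x), 0, float("inf"))[0]
print(tail, dens)  # both ≈ 0.5 = 1/lam
```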
We now give some examples of expectations of random variables.
Example 1. The binomial distribution. Suppose $X$ is a random variable with the binomial distribution, Bin$(n, p)$, i.e.,
$$P([X = x]) = \begin{cases} \dbinom{n}{x} p^x (1-p)^{n-x} & \text{if } x = 0, 1, 2, \ldots, n \\[4pt] 0 & \text{otherwise.} \end{cases}$$
Then
$$E(X) = \sum_{x=1}^n x \binom{n}{x} p^x (1-p)^{n-x}
= np \sum_{x=1}^n \binom{n-1}{x-1} p^{x-1} (1-p)^{(n-1)-(x-1)} = np\,.$$
Example 2. The geometric distribution. Suppose $X$ has the geometric distribution, geom$(p)$, i.e., $P([X = n]) = (1-p)^{n-1} p$ for $n = 1, 2, \ldots$, where $0 < p < 1$. Then
$$E(X) = \sum_{n=1}^\infty n (1-p)^{n-1} p
= -p\, \frac{d}{dp} \sum_{n=0}^\infty (1-p)^n
= -p\, \frac{d}{dp}\, \frac{1}{1-(1-p)} = \frac{1}{p}\,.$$
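Both closed forms are easy to confirm numerically; the following sketch uses exact sums for the binomial and a truncated series for the geometric (plain Python; the parameter values are arbitrary):

```python
from math import comb

# Binomial Bin(n, p): E(X) = sum_x x*C(n,x)*p^x*(1-p)^(n-x) should equal n*p.
n, p = 10, 0.3
mean_binom = sum(x * comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1))
print(mean_binom, n * p)                      # both 3.0

# Geometric geom(p): E(X) = sum_n n*(1-p)^(n-1)*p should equal 1/p.
# The series converges fast, so a modest truncation suffices.
mean_geom = sum(k * (1 - p)**(k - 1) * p for k in range(1, 400))
print(mean_geom, 1 / p)                       # both ≈ 3.3333
```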
Lemma 3. If $x_1, x_2, \ldots, x_N$ are any numbers, and if $1 \le n \le N$, then
$$\sum_{1 \le i_1 < i_2 < \cdots < i_n \le N} \; \sum_{k=1}^n x_{i_k} = \binom{N-1}{n-1} \sum_{j=1}^N x_j\,.$$
Example 4. The sample survey sum. Let $X$ denote the sum of $n$ numbers selected at random without replacement from $x_1, x_2, \ldots, x_N$, each of the $\binom{N}{n}$ possible samples being equally likely. By Lemma 3,
$$E(X) = \binom{N}{n}^{-1} \sum_{1 \le i_1 < \cdots < i_n \le N} \; \sum_{k=1}^n x_{i_k} = \binom{N}{n}^{-1} \binom{N-1}{n-1} \sum_{j=1}^N x_j = n \bar{x}\,,$$
where $\bar{x} = \frac{1}{N} \sum_{j=1}^N x_j$.
Example 5. The Wilcoxon distribution. In this case, a random variable $W$ is the sum of $n$ numbers selected at random without replacement from $1, 2, \ldots, N$. We recall that
$$\sum_{j=1}^N j = \frac{N(N+1)}{2}\,,$$
and so, applying the result of Example 4, if $W$ has the Wilcoxon$(n, N)$ distribution, then $E(W) = \frac{n(N+1)}{2}$.
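For small $N$ and $n$ one can enumerate every equally likely sample and average the sums directly; a minimal sketch using only the standard library:

```python
# Exhaustive check that E(W) = n*(N+1)/2 for the Wilcoxon(n, N) distribution.
from itertools import combinations

N, n = 8, 3
sums = [sum(s) for s in combinations(range(1, N + 1), n)]  # all C(N,n) equally likely samples
print(sum(sums) / len(sums), n * (N + 1) / 2)              # both 13.5
```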
Example 6. The hypergeometric distribution. Recall that $X$ is said to have the hypergeometric distribution if it has the same distribution as the number of red balls in a sample of size $n$ taken at random without replacement from an urn that contains $r$ red balls and $b$ black balls, where $1 \le n \le r + b$. In this case, we assign the number 1 to each of the $r$ red balls and the number 0 to each of the black balls. Now the number $X$ of red balls in the sample of size $n$ is equal to the sum of the numbers in the sample. Thus, taking $N = r + b$ in Example 4, we have
$$E(X) = n\, \frac{r}{r+b}\,.$$
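A direct check of this formula against the hypergeometric probabilities (standard library only; the urn sizes are arbitrary choices):

```python
# E(X) = sum_k k * P([X = k]) for the hypergeometric distribution,
# P([X = k]) = C(r,k)*C(b,n-k)/C(r+b,n), compared with n*r/(r+b).
from math import comb

r, b, n = 5, 7, 4
total = comb(r + b, n)
mean = sum(k * comb(r, k) * comb(b, n - k) / total
           for k in range(max(0, n - b), min(n, r) + 1))
print(mean, n * r / (r + b))   # both ≈ 1.6667
```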
Example 7. The normal distribution, $N(\mu, \sigma^2)$. This absolutely continuous distribution has a density given by
$$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2/2\sigma^2}, \quad -\infty < x < \infty\,,$$
where $\mu \in (-\infty, \infty)$ and $\sigma^2 > 0$ are constants. Using Theorem 2, we obtain
$$E(X) = \int_{-\infty}^\infty x\, f_X(x)\,dx = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^\infty x\, e^{-(x-\mu)^2/2\sigma^2}\,dx = \mu\,.$$
Example 8. The exponential distribution. This absolutely continuous distribution has a density given by
$$f_X(x) = \begin{cases} \lambda e^{-\lambda x} & \text{if } x \ge 0 \\ 0 & \text{if } x < 0, \end{cases}$$
where $\lambda > 0$ is a constant; an integration by parts gives $E(X) = \int_0^\infty \lambda x e^{-\lambda x}\,dx = 1/\lambda$.
An example of an absolutely continuous distribution for which the expectation does not exist is given by the Cauchy density $f_X(x) = \frac{1}{\pi(1+x^2)}$; here
$$\int_0^\infty x\, \frac{1}{\pi(1+x^2)}\,dx = \infty \quad\text{and}\quad \int_{-\infty}^0 |x|\, \frac{1}{\pi(1+x^2)}\,dx = \infty\,.$$
Thus the condition given by Theorem 2 is violated. For an example of a random variable with a discrete distribution for which the expectation does not exist, consider a random variable $Y$ with discrete density given by $P([Y = 2^n]) = \frac{1}{2^n}$ for $n = 1, 2, \ldots$. In this case,
$$\sum_{n=1}^\infty 2^n\, P([Y = 2^n]) = \infty\,,$$
thus violating the condition of Theorem 1.
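The divergence in this last example is immediate numerically, since every term of the series equals 1; a two-line illustration:

```python
# Partial sums of sum_n 2^n * P([Y = 2^n]) with P([Y = 2^n]) = 2^-n: each term is 1.
for N in (10, 100, 1000):
    print(N, sum(2**n * 2.0**-n for n in range(1, N + 1)))  # partial sum equals N
```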
EXERCISES
(i) Find the distribution function of X.
(ii) Compute E(X).
and
$$\int_{-\infty}^0 P([X \le x])\,dx = \int_{-K}^0 P([X \le x])\,dx \le K\,,$$
Theorem 3. If $X$, $Y$ and $Z$ are random variables, if expectations exist for $X$ and $Z$, and if $X \le Y \le Z$ (i.e., if $P([X \le Y \le Z]) = 1$), then $E(Y)$ exists and $E(X) \le E(Y) \le E(Z)$.
Proof: For every $x$, the hypothesis implies
$$P([X > x]) \le P([Y > x]) \le P([Z > x])$$
and
$$P([Z \le x]) \le P([Y \le x]) \le P([X \le x])\,.$$
The hypothesis that $X$ and $Z$ have expectations, the definition of expectation and the above inequalities easily imply the conclusions.
Corollary. If $X$ and $Z$ are random variables whose expectations exist, and if $X \le Z$, then $E(X) \le E(Z)$.
Proof: This follows from Theorem 3 by taking $Y = X$.
Lemma 1. If $X$ and $Y$ are random variables, then
$$\lim_{K\to\infty} \int_0^\infty P([X > x] \cap [\,|Y| \le K\,])\,dx = \int_0^\infty P([X > x])\,dx$$
and
$$\lim_{K\to\infty} \int_{-\infty}^0 P([X \le x] \cap [\,|Y| \le K\,])\,dx = \int_{-\infty}^0 P([X \le x])\,dx\,,$$
no matter whether in each case the right-hand side is finite or infinite.
Proof: We shall only prove the first conclusion; a similar proof holds for the second conclusion. We first observe that $[X > x] \cap [\,|Y| \le K\,] \subset [X > x]$, so that $P([X > x] \cap [\,|Y| \le K\,]) \le P([X > x])$. From this we obtain that
$$\int_0^\infty P([X > x] \cap [\,|Y| \le K\,])\,dx \le \int_0^\infty P([X > x])\,dx$$
for all $K > 0$. Since the right side does not depend on $K$, and since the left side is nondecreasing in $K$ (prove this!), it follows that the following limit exists and
$$\lim_{K\to\infty} \int_0^\infty P([X > x] \cap [\,|Y| \le K\,])\,dx \le \int_0^\infty P([X > x])\,dx\,.$$
For the reverse inequality, observe that
$$P([X > x]) = P([X > x] \cap [\,|Y| \le K\,]) + P([X > x] \cap [\,|Y| > K\,])\,.$$
Let $A > 0$ and $\varepsilon > 0$ be fixed (with $A$ as large as one wishes, and $\varepsilon > 0$ as small as one wishes). Then, since $P([\,|Y| \le K\,]) \to 1$ as $K \to \infty$, it follows that $P([\,|Y| > K\,]) \to 0$ as $K \to \infty$. Hence there exists a $K_0 > 0$ such that for all $K \ge K_0$, the inequality $P([\,|Y| > K\,]) < \varepsilon/A$ is true. Hence for all $K \ge K_0$,
$$\int_0^\infty P([X > x] \cap [\,|Y| \le K\,])\,dx \ge \int_0^A \big(P([X > x]) - P([\,|Y| > K\,])\big)\,dx \ge \int_0^A P([X > x])\,dx - \varepsilon\,.$$
Since this holds for all $A$, we can take the limit of both sides as $A \to \infty$ to obtain
$$\lim_{K\to\infty} \int_0^\infty P([X > x] \cap [\,|Y| \le K\,])\,dx \ge \int_0^\infty P([X > x])\,dx - \varepsilon\,.$$
Next, take the limit of both sides as $\varepsilon \to 0$, and the reverse inequality is established, which concludes the proof of the first conclusion.
Corollary. If X is a random variable whose expectation exists, then
Proof: We first observe the following easily proved identities: for $x > 0$, $K > 0$,
We shall prove only the second inequality. We observe that for every $x \le 0$,
$$[X + Y \le x] \subset \Big[X \le \frac{x}{2}\Big] \cup \Big[Y \le \frac{x}{2}\Big]\,.$$
Thus
$$P([X + Y \le x]) \le P\Big(\Big[X \le \frac{x}{2}\Big]\Big) + P\Big(\Big[Y \le \frac{x}{2}\Big]\Big)\,,$$
and, by properties of integration,
$$\int_{-\infty}^0 P([X + Y \le x])\,dx \le \int_{-\infty}^0 P\Big(\Big[X \le \frac{x}{2}\Big]\Big)\,dx + \int_{-\infty}^0 P\Big(\Big[Y \le \frac{x}{2}\Big]\Big)\,dx\,.$$
By the change of variable $y = x/2$, this last inequality becomes
$$\int_{-\infty}^0 P([X + Y \le x])\,dx \le 2 \int_{-\infty}^0 P([X \le y])\,dy + 2 \int_{-\infty}^0 P([Y \le y])\,dy\,,$$
which proves the theorem in case (ii).
Case (iii). Now suppose $X$ is as in the hypothesis, and suppose $Y$ is a bounded random variable, i.e., suppose there is an integer $N > 0$ such that $|Y(\omega)| < N$ for all $\omega \in \Omega$. The expectation of $Y$ exists since $Y$ is bounded. In this case we define two random variables, $Y_n^+$ and $Y_n^-$, for every positive integer $n$ by
$$Y_n^+ = \sum_{j=-N2^n+1}^{N2^n} \frac{j}{2^n}\, I_{[(j-1)/2^n < Y \le j/2^n]}$$
and
$$Y_n^- = \sum_{j=-N2^n+1}^{N2^n} \frac{j-1}{2^n}\, I_{[(j-1)/2^n < Y \le j/2^n]}\,.$$
It is easy to verify that $Y_n^-(\omega) \le Y(\omega) \le Y_n^+(\omega)$ over $\Omega$, and $0 \le Y_n^+(\omega) - Y_n^-(\omega) \le 1/2^n$ for all $n$ and for all $\omega \in \Omega$. Thus by Theorem 3 and case (ii) we have
$$E(Y_n^+ - Y_n^-) \le 1/2^n, \qquad E(Y_n^+ - Y_n^-) = E(Y_n^+) - E(Y_n^-) \quad\text{and}\quad E(Y_n^-) \le E(Y) \le E(Y_n^+)\,,$$
which imply
$$0 \le E(Y) - E(Y_n^-) \le 1/2^n \quad\text{and}\quad 0 \le E(Y_n^+) - E(Y) \le 1/2^n\,.$$
Since $X + Y_n^- \le X + Y \le X + Y_n^+$, we have by case (ii) and Theorem 3 that $X + Y$ has finite expectation, and hence by the inequalities above and Theorem 5, one obtains
$$E(X + Y) \le E(X + Y_n^+) = E(X) + E(Y_n^+) \le E(X) + E(Y) + 1/2^n$$
for every positive integer $n$. Similarly, $E(X + Y) \ge E(X) + E(Y) - 1/2^n$, and thus for every positive integer $n$,
$$-1/2^n \le E(X + Y) - E(X) - E(Y) \le 1/2^n\,.$$
Letting $n \to \infty$, we obtain $E(X + Y) = E(X) + E(Y)$ for case (iii).
Case (iv). We now prove the theorem in the general case. By Lemma 1, we obtain
$$\int_0^\infty P([X + Y > x])\,dx = \lim_{K\to\infty} \int_0^\infty P([X + Y > x] \cap [\,|Y| < K\,])\,dx = \lim_{K\to\infty} \int_0^\infty P([X + Y I_{[|Y|<K]} > x] \cap [\,|Y| < K\,])\,dx\,,$$
no matter whether the left side is finite or infinite. Similarly, finite or infinite, we have by Lemma 1,
$$\int_{-\infty}^0 P([X + Y \le x])\,dx = \lim_{K\to\infty} \int_{-\infty}^0 P([X + Y I_{[|Y|<K]} \le x] \cap [\,|Y| < K\,])\,dx\,.$$
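Notably, the additivity established here requires no independence between $X$ and $Y$; a short Monte Carlo sketch (assuming NumPy; the dependent pair below is an arbitrary construction) illustrates this:

```python
# E(X + Y) = E(X) + E(Y) even for strongly dependent X and Y.
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=10**6)          # E(X) = 2
y = -0.5 * x + rng.normal(1.0, 1.0, size=x.size)    # Y built from X, hence dependent; E(Y) = 0

print((x + y).mean())   # ≈ 2.0 = E(X) + E(Y)
```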
EXERCISES
Evaluate E(X).
6. Prove: if $K$ and $x$ are positive constants, and if $X$ is a random variable, then
$$[X > x] \cap [\,|X| < K\,] = [X I_{[|X|<K]} > x]\,.$$
7. Verify the following: if $c$ and $x$ are real numbers, if $A$ is an event, and if $X$ is a random variable, then
(i) $[X + cI_A > x] \cap A = [X + c > x] \cap A$,
(ii) $[X + cI_A > x] \cap A^c = [X > x] \cap A^c$, and
(iii) for $c > 0$, $\int_{-c}^0 P([X \le y] \cap A)\,dy + \int_{-c}^0 P([X > y] \cap A)\,dy = cP(A)$.
8. Prove: if $Y$ is a random variable that satisfies $0 \le Y(\omega) \le n$ for all $\omega \in \Omega$, where $n$ is a positive integer, then
$$\sum_{j=1}^n (j-1)\, I_{[j-1 < Y \le j]}(\omega) \le Y(\omega) \le \sum_{j=1}^n j\, I_{[j-1 < Y \le j]}(\omega)$$
for all $\omega \in \Omega$.
9. In problem 5 above, prove that the random variable $X$ is symmetrically distributed about $2.3 + 2$.
3.3 Moments and Central Moments. The purpose of this section is to define the various kinds of moments used in statistics and to derive some of their properties. We note first that powers of random variables are also random variables; we obtained this in Theorem 5 in Section 2.1.
Definition. If $r$ is a positive integer, and if $X$ is a random variable, we define the $r$th moment of $X$ to be $m_r = E(X^r)$, provided the expectation exists. We define the $r$th central moment to be $\mu_r = E((X - E(X))^r)$, provided $E(X)$ and $E((X - E(X))^r)$ both exist. The second central moment, $\mu_2$, of $X$ is called the variance of $X$ and is denoted by $\mathrm{Var}\,X$ or $\mathrm{Var}(X)$.
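For a concrete instance of these definitions, the following sketch computes the first two moments and the variance of a fair die roll exactly (standard library fractions; the die is an arbitrary choice):

```python
# m1 = E(X), m2 = E(X^2), and Var(X) = E((X - E(X))^2) for a fair die.
from fractions import Fraction

values = range(1, 7)
p = Fraction(1, 6)

m1 = sum(x * p for x in values)              # 7/2
m2 = sum(x**2 * p for x in values)           # 91/6
var = sum((x - m1)**2 * p for x in values)   # 35/12, also equals m2 - m1**2
print(m1, m2, var, m2 - m1**2)
```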
Theorem 1. If $X$ is a random variable, if $r < s$ are positive integers, and if $E(X^s)$ exists, then $E(X^r)$ exists.
Proof: Since by hypothesis $E(X^s)$ exists, then by definition,
$$E(X^s) = \int_0^\infty P([X^s > x])\,dx - \int_{-\infty}^0 P([X^s \le x])\,dx\,,$$
and both integrals are finite. We now prove that $\int_0^\infty P([X^r > x])\,dx < \infty$. Indeed, by the same argument used to prove Proposition 1 in Section 2.3, we have
$$0 \le \int_0^1 P([X^r > x])\,dx \le 1 \quad\text{and}\quad 0 \le \int_{-1}^0 P([X^r \le x])\,dx \le 1\,.$$
Now for $1 \le x < \infty$, $x^{1/r} \ge x^{1/s}$, and thus $P([X > x^{1/r}]) \le P([X > x^{1/s}])$ and $P([X < -x^{1/r}]) \le P([X < -x^{1/s}])$. Hence if $r$ is odd,
$$\int_0^\infty P([X^r > x])\,dx \le 1 + \int_1^\infty P([X > x^{1/r}])\,dx \le 1 + \int_1^\infty P([X > x^{1/s}])\,dx \le 1 + \int_0^\infty P([X^s > x])\,dx < \infty\,.$$
But if $r$ is even, then
$$\int_0^\infty P([X^r > x])\,dx \le 1 + \int_1^\infty \big(P([X < -x^{1/r}]) + P([X > x^{1/r}])\big)\,dx
\le 1 + \int_1^\infty \big(P([X < -x^{1/s}]) + P([X > x^{1/s}])\big)\,dx\,.$$
If $s$ is odd, then by Lemma 1 in Section 3.1, we have
$$\int_1^\infty P([X < -x^{1/s}])\,dx = \int_1^\infty P([X^s < -x])\,dx = \int_1^\infty P([X^s \le -x])\,dx < \infty\,,$$
and if $s$ is even, then
$$\int_1^\infty P([X < -x^{1/s}])\,dx \le \int_1^\infty P([X^s > x])\,dx < \infty\,.$$
Hence, by Theorem 1 in Section 3.2, $E(X^r)$ exists if and only if the series $\sum_n |y_n|\, q_n < \infty$. But
$$\sum_n |y_n|\, q_n = \sum_n |y_n| \sum\{p_j : x_j^r = y_n\} = \sum_n \sum\{\,|x_j^r|\, p_j : x_j^r = y_n\} = \sum_j |x_j|^r p_j\,,$$
Lemma 1. If $Z$ is a random variable with an absolutely continuous distribution, and if $h : [a, b] \to R^1$ is a nondecreasing function over the interval $[a, b]$, then
$$h(a)\, P([a \le Z < b]) \le \int_a^b h(x)\, f_Z(x)\,dx \le h(b)\, P([a \le Z < b])\,.$$
Proof: We have
$$\int_a^b h(x)\, f_Z(x)\,dx \ge h(a) \int_a^b f_Z(x)\,dx = h(a)\, P([a \le Z < b])\,,$$
the last inequality following from the fact that the density is nonnegative and $h$ is nondecreasing. The second inequality follows by similar reasoning.
Lemma 2. If $Z$ is a random variable, then for every positive integer $n$,
$$\sum_{k=1}^\infty \frac{1}{n}\, P\Big(\Big[Z \ge \frac{k}{n}\Big]\Big) \le \int_0^\infty P([Z > x])\,dx\,.$$
Proof: Define
$$g(x) = P\Big(\Big[Z \ge \frac{k}{n}\Big]\Big) \quad\text{whenever } \frac{k-1}{n} \le x < \frac{k}{n}\,, \text{ for all } x \in [0, \infty)\,.$$
Then for every positive integer $N$,
$$\int_0^N g(x)\,dx = \sum_{k=1}^{nN} \frac{1}{n}\, P\Big(\Big[Z \ge \frac{k}{n}\Big]\Big)\,.$$
But $g(x) \le P([Z > x])$ for every $x \ge 0$, so $\int_0^N g(x)\,dx \le \int_0^N P([Z > x])\,dx$, and by the definition of the improper integral, the conclusion follows.
Theorem 3. If $X$ is a random variable with absolutely continuous distribution function, and if $r$ is a positive integer, then $E(X^r)$ exists if and only if $\int_{-\infty}^\infty |x|^r f_X(x)\,dx < \infty$, in which case
$$E(X^r) = \int_{-\infty}^\infty x^r f_X(x)\,dx\,.$$
Proof: Case (i): $r$ is even. In this case, $X^r(\omega) \ge 0$ for every $\omega \in \Omega$, and $h(x)$ defined by $h(x) = x^r$ is increasing over $[0, \infty)$ and decreasing over $(-\infty, 0]$. Thus, if we denote $I = \int_0^\infty P([X^r > x])\,dx$, it follows from the definition of expectation that $E(X^r)$ exists if and only if $I < \infty$, in which case $E(X^r) = I$. Let us denote
$$\Delta(x) = \begin{cases} (-\infty, -x^{1/r}) \cup (x^{1/r}, \infty) & \text{if } x \ge 0 \\ (-\infty, \infty) & \text{if } x < 0\,. \end{cases}$$
In a similar manner, $U_n \le \int_{-\infty}^{\infty} t^r f_X(t)\,dt + \frac{1}{n}$. Thus, for every positive integer $n$,
$$\int_{-\infty}^{\infty} t^r f_X(t)\,dt - \frac{1}{n} \le I \le \int_{-\infty}^{\infty} t^r f_X(t)\,dt + \frac{1}{n}\,,$$
which yields the conclusions of the theorem in case $r$ is even.
Case (ii): $r$ is odd. In this case, define
$$I_1 = \int_0^\infty P([X^r > x])\,dx \quad\text{and}\quad I_2 = \int_{-\infty}^0 P([X^r \le x])\,dx\,.$$
An argument like that of case (i), applied to $I_1$ and $I_2$ separately, yields the conclusion of the theorem in this case, thus concluding the proof of the theorem.
EXERCISES
5. Prove: if $Y$ and $Z_n$ are random variables as in problems 2 and 4, prove that $0 \le F_Y(y) - F_{Z_n}(y) < \frac{1}{n}$ for all real values of $y$.
6. Prove: if $Y$ and $Z_n$ are random variables as in problems 2 and 4, then $F_{Z_n}(y) \to F_Y(y)$ uniformly in $y$ as $n \to \infty$, i.e., prove that for every $\varepsilon > 0$ there exists a positive integer $N$ such that $|F_Y(y) - F_{Z_n}(y)| < \varepsilon$ for all values of $y$ and for all $n > N$.
7. Definition. If $X$ is a nonnegative integer-valued random variable, its probability generating function $g_X(u)$ is defined by
$$g_X(u) = \sum_{n=0}^\infty P([X = n])\, u^n = E(u^X)\,.$$
Theorem 2. If $X$ is a random variable with finite second moment, and if $C$ is any constant, then $\mathrm{Var}(CX) = C^2\, \mathrm{Var}(X)$.
Proof: By already established properties of expectation, we see that $E((CX)^2) = E(C^2 X^2) = C^2 E(X^2)$ and $E(CX) = C E(X)$. By these results and by Theorem 4,
$$\mathrm{Var}(CX) = E((CX)^2) - (E(CX))^2 = C^2 E(X^2) - C^2 (E(X))^2 = C^2\, \mathrm{Var}(X)\,.$$
Theorem 3. If $X$ is a random variable with finite second moment, and if $C$ is any constant, then $X + C$ has a finite second moment, and $\mathrm{Var}(X + C) = \mathrm{Var}(X)$.
Proof: If $X$ has a finite second moment, then $E(X^2) + 2CE(X) + C^2$ is a finite, well-defined number. But $E(X^2) + 2CE(X) + C^2 = E(X^2 + 2CX + C^2) = E((X + C)^2)$, so $X + C$ has a finite second moment. The second conclusion follows directly from the definition of variance and the fact that $E(X + C) = E(X) + C$.
Now let us derive formulas for the variances of the six important discrete distributions.
Example 1. The binomial distribution. Suppose $X$ is a random variable with the binomial distribution, i.e.,
$$P([X = k]) = \binom{n}{k} p^k (1-p)^{n-k} \quad\text{for } k = 0, 1, 2, \ldots, n\,.$$
We shall compute $\mathrm{Var}(X)$. In Section 3.1 we found that $E(X) = np$. Now
$$E(X^2) = \sum_{k=0}^n k^2 \binom{n}{k} p^k (1-p)^{n-k}
= \sum_{k=1}^n \big(k(k-1) + k\big) \binom{n}{k} p^k (1-p)^{n-k}$$
$$= np \sum_{k=1}^{n} (k-1) \binom{n-1}{k-1} p^{k-1} (1-p)^{(n-1)-(k-1)}
+ np \sum_{k=1}^{n} \binom{n-1}{k-1} p^{k-1} (1-p)^{(n-1)-(k-1)}\,.$$
Now the first of the two sums in this last expression is the expectation of a random variable whose distribution is Bin$(n-1, p)$, whose expectation is $(n-1)p$. The second sum is just $(p + (1-p))^{n-1} = 1$. Thus
$$E(X^2) = np(n-1)p + np = (np)^2 - np^2 + np\,.$$
Using Theorem 1, we obtain $\mathrm{Var}(X) = (np)^2 - np^2 + np - (np)^2 = np(1-p)$.
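A direct numerical check of this variance formula (exact summation over the binomial probabilities; the parameters are arbitrary):

```python
# Var(X) = E(X^2) - (E(X))^2 for Bin(n, p), compared with n*p*(1-p).
from math import comb

n, p = 12, 0.4
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
m1 = sum(k * pk for k, pk in enumerate(pmf))
m2 = sum(k**2 * pk for k, pk in enumerate(pmf))
print(m2 - m1**2, n * p * (1 - p))   # both 2.88
```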
Example 2. The geometric distribution. We now find the variance of a random variable $X$ whose distribution is geom$(p)$, i.e., $P([X = n]) = (1-p)^{n-1} p$ for $n = 1, 2, \ldots$, where $0 < p < 1$. By Theorem 2,
$$E(X^2) = \sum_{n=1}^\infty n^2 (1-p)^{n-1} p = \sum_{n=1}^\infty n\big((n-1) + 1\big)(1-p)^{n-1} p
= (1-p)p \sum_{n=2}^\infty n(n-1)(1-p)^{n-2} + p \sum_{n=1}^\infty n(1-p)^{n-1}\,.$$
Since $0 < 1-p < 1$, the two series in this last expression can be expressed as term-by-term second and first derivatives respectively, and thus
$$E(X^2) = (1-p)p\, \frac{d^2}{dp^2} \sum_{n=0}^\infty (1-p)^n - p\, \frac{d}{dp} \sum_{n=0}^\infty (1-p)^n
= (1-p)p\, \frac{d^2}{dp^2}\, \frac{1}{1-(1-p)} - p\, \frac{d}{dp}\, \frac{1}{1-(1-p)}
= (1-p)\frac{2}{p^2} + \frac{1}{p} = \frac{2-p}{p^2}\,.$$
By Example 2 in Section 3.1, $E(X) = \frac{1}{p}$. Thus, by Theorem 1,
$$\mathrm{Var}(X) = E(X^2) - (E(X))^2 = \frac{2-p}{p^2} - \frac{1}{p^2}\,, \quad\text{or}\quad \mathrm{Var}(X) = \frac{1-p}{p^2}\,.$$
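A simulation-based check of this formula, assuming NumPy (seed, sample size, and $p$ are arbitrary choices):

```python
# Sample variance of geometric(p) draws, compared with (1-p)/p**2.
import numpy as np

p = 0.3
rng = np.random.default_rng(1)
sample = rng.geometric(p, size=10**6)    # values 1, 2, 3, ... as in the text
print(sample.var(), (1 - p) / p**2)      # both ≈ 7.78
```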
Example 3. The Poisson distribution. The discrete density of the Poisson distribution is given by
$$P([X = n]) = e^{-\lambda}\, \frac{\lambda^n}{n!} \quad\text{for } n = 0, 1, 2, \ldots.$$
In Example 3 in Section 3.1, we found that $E(X) = \lambda$. In order to find the formula for $\mathrm{Var}(X)$, we shall find $E(X^2)$ and then apply Theorem 1. By Theorem 2, by the same differentiating techniques as in the previous example, and remembering the expansion $e^z = \sum_{n=0}^\infty z^n/n!$, we have
$$E(X^2) = \sum_{n=0}^\infty n^2 e^{-\lambda} \frac{\lambda^n}{n!}
= e^{-\lambda} \sum_{n=0}^\infty n\big((n-1)+1\big) \frac{\lambda^n}{n!}
= e^{-\lambda} \sum_{n=2}^\infty n(n-1) \frac{\lambda^n}{n!} + e^{-\lambda} \sum_{n=1}^\infty n\, \frac{\lambda^n}{n!}
= e^{-\lambda} \lambda^2\, \frac{d^2}{d\lambda^2}\, e^{\lambda} + e^{-\lambda} \lambda\, \frac{d}{d\lambda}\, e^{\lambda} = \lambda^2 + \lambda\,.$$
Thus, by Theorem 1, we have $\mathrm{Var}(X) = \lambda^2 + \lambda - \lambda^2 = \lambda$.
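A truncated-series check of the Poisson mean and variance (plain Python; $\lambda$ is an arbitrary choice; the recursion avoids large factorials):

```python
# E(X) and Var(X) for Poisson(lam) from a truncated series.
from math import exp

lam = 3.5
pmf, term = [], exp(-lam)         # P([X = 0]) = e^{-lam}
for n in range(120):
    pmf.append(term)
    term *= lam / (n + 1)         # P([X = n+1]) from P([X = n])

m1 = sum(n * p for n, p in enumerate(pmf))
m2 = sum(n**2 * p for n, p in enumerate(pmf))
print(m1, m2 - m1**2)             # both ≈ 3.5
```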
We next wish to derive formulas for the variances of the sample survey sum distribution and its special cases. For this we need the following lemma.
Lemma 1. If $x_1, x_2, \ldots, x_N$ is a finite sequence of not necessarily distinct numbers, if $2 \le n \le N$, and if $f(x, y)$ is any function of two variables, then
$$\sum_{1 \le i_1 < \cdots < i_n \le N} \; \sum_{1 \le u < v \le n} f(x_{i_u}, x_{i_v}) = \binom{N-2}{n-2} \sum_{1 \le u < v \le N} f(x_u, x_v)\,.$$
Returning to Example 4, where $X$ is the sum of a sample of size $n$ drawn at random without replacement from $x_1, \ldots, x_N$, each sample being equally likely, we compute
$$E(X^2) = \sum_{1 \le i_1 < \cdots < i_n \le N} \Big(\sum_{j=1}^n x_{i_j}\Big)^{\!2} \binom{N}{n}^{-1}
= \sum_{1 \le i_1 < \cdots < i_n \le N} \sum_{j=1}^n x_{i_j}^2 \binom{N}{n}^{-1}
+ 2 \sum_{1 \le i_1 < \cdots < i_n \le N} \; \sum_{1 \le u < v \le n} x_{i_u} x_{i_v} \binom{N}{n}^{-1}$$
$$= \frac{\binom{N-1}{n-1}}{\binom{N}{n}} \sum_{j=1}^N x_j^2 + \frac{2\binom{N-2}{n-2}}{\binom{N}{n}} \sum_{1 \le u < v \le N} x_u x_v
= \frac{n}{N} \sum_{j=1}^N x_j^2 + \frac{2n(n-1)}{N(N-1)} \sum_{1 \le u < v \le N} x_u x_v\,.$$
Now
$$\mathrm{Var}(X) = E(X^2) - (E(X))^2
= \frac{n}{N} \sum_{j=1}^N x_j^2 + \frac{n(n-1)}{N(N-1)} \Big\{\Big(\sum_{j=1}^N x_j\Big)^{\!2} - \sum_{j=1}^N x_j^2\Big\} - n^2 \bar{x}^2
= \frac{n(N-n)}{N(N-1)} \sum_{j=1}^N x_j^2 - \bar{x}^2\, \frac{n(N-n)}{N-1}\,.$$
Thus
$$\mathrm{Var}(X) = n\, \frac{N-n}{N}\, \frac{1}{N-1} \Big\{\sum_{j=1}^N x_j^2 - N \bar{x}^2\Big\}\,.$$
Recalling that
$$\sum_{j=1}^N x_j^2 - N \bar{x}^2 = \sum_{i=1}^N (x_i - \bar{x})^2\,,$$
we also have
$$\mathrm{Var}(X) = n\Big(1 - \frac{n}{N}\Big) \frac{1}{N-1} \sum_{i=1}^N (x_i - \bar{x})^2\,.$$
We now apply the formula for the variance obtained in Example 4, where we take $x_i = i$ for $1 \le i \le N$, to obtain
$$\mathrm{Var}(W) = n\, \frac{N-n}{N}\, \frac{1}{N-1} \Big\{\frac{N(N+1)(2N+1)}{6} - N\, \frac{(N+1)^2}{4}\Big\}\,.$$
After a little algebra, we arrive at
$$\mathrm{Var}(W) = \frac{1}{12}\, n(N-n)(N+1)\,.$$
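The same enumeration used earlier for $E(W)$ also confirms this variance formula (standard library only):

```python
# Exhaustive check that Var(W) = n*(N-n)*(N+1)/12 for the Wilcoxon(n, N) distribution.
from itertools import combinations

N, n = 8, 3
sums = [sum(s) for s in combinations(range(1, N + 1), n)]
mean = sum(sums) / len(sums)
var = sum((w - mean)**2 for w in sums) / len(sums)
print(var, n * (N - n) * (N + 1) / 12)   # both 11.25
```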
Example 6. The hypergeometric distribution. Let $H$ denote the number of red balls in a simple random sample of size $n$ selected without replacement from an urn that contains $N$ balls, of which $R$ are red. The problem is to find the formula for $\mathrm{Var}(H)$. We determined in Section 3.1 that $E(H) = n\frac{R}{N}$. We now apply the result of Example 4, where we let $R$ of the $x_i$'s equal 1 and the remaining $N - R$ of them equal 0. Then $\sum_{j=1}^N x_j = \sum_{j=1}^N x_j^2 = R$, and
$$\mathrm{Var}(H) = n\, \frac{N-n}{N}\, \frac{1}{N-1} \Big\{R - N\, \frac{R^2}{N^2}\Big\}\,.$$
Again, after a little algebra, we obtain the formula
$$\mathrm{Var}(H) = n\, \frac{N-n}{N^2 (N-1)}\, R(N-R)\,.$$
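A simulation check of the formula, assuming NumPy, whose hypergeometric sampler draws from exactly this urn model (the parameters are arbitrary):

```python
# Var(H) for a sample of n from N balls with R red, compared with
# n*(N-n)*R*(N-R) / (N**2 * (N-1)).
import numpy as np

N, R, n = 20, 8, 5
rng = np.random.default_rng(2)
h = rng.hypergeometric(ngood=R, nbad=N - R, nsample=n, size=10**6)
print(h.var(), n * (N - n) * R * (N - R) / (N**2 * (N - 1)))  # both ≈ 0.947
```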
Example 7. The uniform $[0, 1]$ distribution. Let $X$ be a random variable that is uniformly distributed over $[0, 1]$, i.e.,
$$f_X(x) = \begin{cases} 1 & \text{if } 0 \le x \le 1 \\ 0 & \text{if } x < 0 \text{ or } x > 1. \end{cases}$$
Then
$$E(X) = \int_{-\infty}^\infty x f_X(x)\,dx = \int_0^1 x\,dx = \frac{1}{2}\,.$$
Also,
$$E(X^2) = \int_{-\infty}^\infty x^2 f_X(x)\,dx = \int_0^1 x^2\,dx = \frac{1}{3}\,.$$
Thus
$$\mathrm{Var}(X) = E(X^2) - (E(X))^2 = \frac{1}{12}\,.$$
Example 8. The exponential distribution. This absolutely continuous distribution function has a density given by
$$f_X(x) = \begin{cases} \lambda e^{-\lambda x} & \text{if } x \ge 0 \\ 0 & \text{if } x < 0, \end{cases}$$
where $\lambda > 0$ is a constant. By Example 8 in Section 3.1, $E(X) = 1/\lambda$. Now
$$E(X^2) = \int_{-\infty}^\infty x^2 f_X(x)\,dx = \int_0^\infty x^2 \lambda e^{-\lambda x}\,dx
= \frac{1}{\lambda^2} \int_0^\infty y^2 e^{-y}\,dy = \frac{2}{\lambda^2}\,.$$
Thus,
$$\mathrm{Var}(X) = \frac{2}{\lambda^2} - \Big(\frac{1}{\lambda}\Big)^{\!2} = \frac{1}{\lambda^2}\,.$$
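A symbolic verification of these two computations, assuming SymPy is available:

```python
# Symbolic check that E(X^2) = 2/lambda^2 and Var(X) = 1/lambda^2 for the
# exponential density lambda*exp(-lambda*x) on [0, ∞).
import sympy as sp

x = sp.symbols("x", nonnegative=True)
lam = sp.symbols("lambda", positive=True)
f = lam * sp.exp(-lam * x)

m1 = sp.integrate(x * f, (x, 0, sp.oo))       # 1/lambda
m2 = sp.integrate(x**2 * f, (x, 0, sp.oo))    # 2/lambda**2
print(sp.simplify(m2 - m1**2))                # 1/lambda**2
```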
EXERCISES
8. Find the expectation and variance of a random variable W , the sum
of the upturned faces resulting from tossing a pair of fair dice.
9. Prove that if X is a non-constant random variable whose expectation
exists, then P ([X < E(X)]) > 0.