
MATH 230

Introduction to Probability
Theory

Mathematical Expectation

Copyright © 2010 Pearson Addison-Wesley. All rights reserved.

TEDU
Chapter 4 - Mathematical
Expectation

4.1 Mean of a Random Variable
4.2 Variance and Covariance of Random Variables
4.3 Means and Variances of Linear Combinations of Random Variables
4.4 Chebyshev’s Theorem
Mean of a Random Variable



Mean of a Random Variable

 This method of relative frequencies is used to calculate the average amount earned that we might expect in the long run.
 We shall refer to this average value as the mean of the random variable X, or the mean of the probability distribution of X, and write it as μ_X or simply as μ when it is clear to which random variable we refer.
 It is also common among statisticians to refer to this mean as the mathematical expectation, or the expected value, of the random variable X, and denote it as E(X).



Mean of a Random Variable

Definition 4.1
Let X be a random variable with probability distribution f(x). The mean, or expected value, of X is

μ = E(X) = \sum_x x\,f(x)   if X is discrete, and

μ = E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx   if X is continuous.
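As a quick illustration (not from the slides), here is a minimal Python sketch that computes E(X) for a discrete distribution directly from Definition 4.1; the support and pmf values are made up for the example.

```python
# Mean of a discrete random variable: E(X) = sum of x * f(x).
# The support and pmf below are hypothetical illustration values.
support = [0, 1, 2, 3]
pmf = [0.1, 0.3, 0.4, 0.2]

assert abs(sum(pmf) - 1.0) < 1e-12  # a pmf must sum to 1

mean = sum(x * p for x, p in zip(support, pmf))
print(mean)  # 0*0.1 + 1*0.3 + 2*0.4 + 3*0.2 = 1.7
```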





Mean of a Random Variable

Example:
If a dealer’s profit, in millions of dollars, on a new automobile can be looked upon as a random variable X having the density function:

find the average profit per automobile in millions of dollars.





Mean of a Random Variable

Theorem 4.1
Let X be a random variable with probability distribution f(x). The expected value of the random variable g(X) is

μ_{g(X)} = E[g(X)] = \sum_x g(x)\,f(x)   if X is discrete, and

μ_{g(X)} = E[g(X)] = \int_{-\infty}^{\infty} g(x)\,f(x)\,dx   if X is continuous.
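To make Theorem 4.1 concrete, here is a small sketch (my own illustration, using an assumed density, not one from the slides) that evaluates E[g(X)] by numerical integration:

```python
from scipy.integrate import quad

# Assumed density for illustration: f(x) = 2(1 - x) on 0 < x < 1.
f = lambda x: 2 * (1 - x)
g = lambda x: x**2                 # the function whose expectation we want

# Theorem 4.1: E[g(X)] = integral of g(x) * f(x) over the support of X
expectation, _ = quad(lambda x: g(x) * f(x), 0, 1)
print(expectation)  # 2*(1/3 - 1/4) = 1/6 ≈ 0.1667
```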

We extend our concept of mathematical expectation to the case of two random variables X and Y with joint probability distribution f(x, y).

Definition 4.2
Let X and Y be random variables with joint probability distribution f(x, y). The mean, or expected value, of the random variable g(X, Y) is

μ_{g(X,Y)} = E[g(X, Y)] = \sum_x \sum_y g(x, y)\,f(x, y)   if X and Y are discrete, and

μ_{g(X,Y)} = E[g(X, Y)] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} g(x, y)\,f(x, y)\,dx\,dy   if X and Y are continuous.
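A minimal sketch of Definition 4.2 for the discrete case, using a small made-up joint pmf (illustration values of my own, not from the slides):

```python
# Hypothetical joint pmf f(x, y) stored as a dict: E[g(X,Y)] = sum g(x,y) * f(x,y).
joint_pmf = {(0, 0): 0.2, (0, 1): 0.3, (1, 0): 0.1, (1, 1): 0.4}

g = lambda x, y: x * y  # choosing g(x, y) = xy gives E(XY)

e_g = sum(g(x, y) * p for (x, y), p in joint_pmf.items())
print(e_g)  # only the point (1, 1) contributes: 1 * 0.4 = 0.4
```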



Variance and Covariance of
Random Variables

 The mean, or expected value, of a random variable X is of special importance in statistics because it describes where the probability distribution is centered.
 By itself, however, the mean does not give an adequate description of the shape of the distribution.
 We also need to characterize the variability in the distribution.



Variance and Covariance of
Random Variables
The histograms of two discrete probability distributions that have
the same mean, μ = 2, but differ considerably in variability, or the
dispersion of their observations about the mean:

Figure 4.1 Distributions with equal means and unequal dispersions


Variance and Covariance of
Random Variables

Definition 4.3
Let X be a random variable with probability distribution f(x) and mean μ. The variance of X is

σ² = E[(X − μ)²] = \sum_x (x − μ)²\,f(x)   if X is discrete, and

σ² = E[(X − μ)²] = \int_{-\infty}^{\infty} (x − μ)²\,f(x)\,dx   if X is continuous.

The positive square root of the variance, σ, is called the standard deviation of X.
Variance and Covariance of
Random Variables

Theorem 4.2
The variance of a random variable X is

σ² = E(X²) − μ².
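A quick sketch (again with made-up illustration values) checking that the defining formula of Definition 4.3 and the computational formula of Theorem 4.2 agree:

```python
# Hypothetical pmf for illustration.
support = [0, 1, 2, 3]
pmf = [0.1, 0.3, 0.4, 0.2]

mu = sum(x * p for x, p in zip(support, pmf))

# Definition 4.3: sigma^2 = E[(X - mu)^2]
var_def = sum((x - mu) ** 2 * p for x, p in zip(support, pmf))

# Theorem 4.2: sigma^2 = E(X^2) - mu^2
var_thm = sum(x**2 * p for x, p in zip(support, pmf)) - mu**2

print(var_def, var_thm)  # both 0.81
```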
Variance and Covariance of
Random Variables

Theorem 4.3
Let X be a random variable with probability distribution f(x). The variance of the random variable g(X) is

σ²_{g(X)} = E{[g(X) − μ_{g(X)}]²} = \sum_x [g(x) − μ_{g(X)}]²\,f(x)   if X is discrete, and

σ²_{g(X)} = E{[g(X) − μ_{g(X)}]²} = \int_{-\infty}^{\infty} [g(x) − μ_{g(X)}]²\,f(x)\,dx   if X is continuous.
Variance and Covariance of
Random Variables
Example:
The length of time, in minutes, for an airplane to obtain clearance for takeoff at a certain airport is a random variable Y = 3X − 2, where X has the density function:

Find the mean and variance of the random variable Y.


Review ...

f_R(r) = 6r(1 − r) for 0 ≤ r ≤ 1, and f_R(r) = 0 otherwise.

E(R) = \int_0^1 r\,f_R(r)\,dr = \int_0^1 6r^2(1 − r)\,dr = \int_0^1 (6r^2 − 6r^3)\,dr = \left[2r^3 − \frac{3r^4}{2}\right]_0^1 = 2 − \frac{3}{2} = \frac{1}{2}
Review ...

f_R(r) = 6r(1 − r) for 0 ≤ r ≤ 1, and f_R(r) = 0 otherwise.

V(R) = E[(R − E(R))²] = \int_0^1 \left(r − \frac{1}{2}\right)^2 6r(1 − r)\,dr = \frac{1}{20}

OR, using V(R) = E(R²) − [E(R)]²:

E(R²) = \int_0^1 r^2\,f_R(r)\,dr = \int_0^1 (6r^3 − 6r^4)\,dr = \frac{3}{2} − \frac{6}{5} = \frac{3}{10}

V(R) = \frac{3}{10} − \left(\frac{1}{2}\right)^2 = \frac{1}{20}
Means and Variances of Linear
Combinations of Random Variables

Var(b) = 0

Var(aX) = a²\,Var(X)

With f_R(r) = 6r(1 − r) for 0 ≤ r ≤ 1 (and 0 otherwise), let P = 3R + 1. Then:

E(P) = E(3R + 1) = \int_0^1 (3r + 1)\,6r(1 − r)\,dr = \frac{5}{2}

OR E(P) = E(3R + 1) = 3E(R) + 1 = \frac{3}{2} + 1 = \frac{5}{2}

E(P²) = \int_0^1 (3r + 1)^2\,6r(1 − r)\,dr = \frac{67}{10}

V(P) = E(P²) − [E(P)]² = \frac{67}{10} − \left(\frac{5}{2}\right)^2 = \frac{9}{20}

OR V(P) = V(3R + 1) = 9\,V(R) = 9 \cdot \frac{1}{20} = \frac{9}{20}
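The slide numbers above can be verified numerically. Here is a short sketch (my own check, using scipy's quad integrator) that recomputes E(R), V(R), E(P), and V(P) from the density given on the slides:

```python
from scipy.integrate import quad

f = lambda r: 6 * r * (1 - r)           # density of R on [0, 1], from the slides

E_R,  _ = quad(lambda r: r * f(r), 0, 1)       # 0.5  = 1/2
E_R2, _ = quad(lambda r: r**2 * f(r), 0, 1)    # 0.3  = 3/10
V_R = E_R2 - E_R**2                             # 0.05 = 1/20

# P = 3R + 1: linearity of expectation and Var(aX + b) = a^2 Var(X)
E_P = 3 * E_R + 1                               # 2.5  = 5/2
V_P = 9 * V_R                                   # 0.45 = 9/20
print(E_R, V_R, E_P, V_P)
```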
Variance and Covariance of
Random Variables
 Covariance indicates how two variables are related.
 A positive covariance means the variables are positively related, while a negative covariance means the variables are inversely related.
 When X and Y are statistically independent, it can be shown that the covariance is zero. The converse is not generally true: two variables may have a nonlinear relationship and still have zero covariance.
Variance and Covariance of
Random Variables
Definition 4.4
Let X and Y be random variables with joint probability distribution f(x, y). The covariance of X and Y is

σ_{XY} = E[(X − μ_X)(Y − μ_Y)] = \sum_x \sum_y (x − μ_X)(y − μ_Y)\,f(x, y)   if X and Y are discrete, and

σ_{XY} = E[(X − μ_X)(Y − μ_Y)] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x − μ_X)(y − μ_Y)\,f(x, y)\,dx\,dy   if X and Y are continuous.

Theorem 4.4
The covariance of two random variables X and Y with means μ_X and μ_Y, respectively, is

σ_{XY} = E(XY) − μ_X μ_Y.
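A minimal sketch (with a made-up joint pmf of my own) that computes the covariance via Theorem 4.4, along with the correlation coefficient ρ_XY discussed on a later slide:

```python
from math import sqrt

# Hypothetical joint pmf of (X, Y) for illustration.
joint = {(0, 0): 0.2, (0, 1): 0.3, (1, 0): 0.1, (1, 1): 0.4}

mu_x = sum(x * p for (x, y), p in joint.items())          # 0.5
mu_y = sum(y * p for (x, y), p in joint.items())          # 0.7
e_xy = sum(x * y * p for (x, y), p in joint.items())      # 0.4

cov = e_xy - mu_x * mu_y                                   # Theorem 4.4: 0.05

var_x = sum(x**2 * p for (x, y), p in joint.items()) - mu_x**2  # 0.25
var_y = sum(y**2 * p for (x, y), p in joint.items()) - mu_y**2  # 0.21

rho = cov / (sqrt(var_x) * sqrt(var_y))  # correlation, here ≈ 0.218
print(cov, rho)
```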
Variance and Covariance of
Random Variables

Example:
For the random variables X and Y whose joint density function is
given below, find the covariance.
Variance and Covariance of
Random Variables
 The magnitude of the covariance does not indicate anything about the strength of the relationship.
 Correlation tells you the degree to which the variables tend to move together, in addition to telling you whether they are positively or inversely related. The correlation coefficient is ρ_XY = σ_XY / (σ_X σ_Y).
 −1 ≤ ρ_XY ≤ 1, and ρ_XY = 0 when σ_XY = 0.
 When there is an exact linear dependency Y = a + bX, then ρ_XY = 1 if b > 0 and ρ_XY = −1 if b < 0.
Means and Variances of Linear
Combinations of Random Variables

If a and b are constants, then E(aX + b) = a\,E(X) + b.

The expected value of the sum or difference of functions of X and Y is the sum or difference of the expected values:

E[g(X, Y) ± h(X, Y)] = E[g(X, Y)] ± E[h(X, Y)].

If X and Y are random variables with joint probability distribution f(x, y), and a, b, and c are constants, then

σ²_{aX + bY + c} = a²σ²_X + b²σ²_Y + 2ab\,σ_{XY}.


Means and Variances of Linear
Combinations of Random Variables
Independent Random Variables
Let X and Y be two independent random variables. Then:

E(XY) = E(X)\,E(Y)

σ²_{aX + bY} = a²σ²_X + b²σ²_Y,   since σ_{XY} = 0.
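A small simulation sketch (my own illustration, with arbitrarily chosen independent distributions) checking these independence rules empirically:

```python
import random

random.seed(0)
n = 200_000

# Independent draws: X uniform on {0, 1, 2}, Y uniform on (0, 1).
xs = [random.choice([0, 1, 2]) for _ in range(n)]
ys = [random.random() for _ in range(n)]

mean = lambda v: sum(v) / len(v)
var = lambda v: mean([t**2 for t in v]) - mean(v) ** 2

# E(XY) should approximate E(X)E(Y) = 1.0 * 0.5 = 0.5
print(mean([x * y for x, y in zip(xs, ys)]))

# Var(X + Y) should approximate Var(X) + Var(Y) = 2/3 + 1/12 ≈ 0.75
print(var([x + y for x, y in zip(xs, ys)]), var(xs) + var(ys))
```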


Conditional Expectation

 Let X and Y be random variables such that the mean of Y exists and is finite.
 The conditional expectation (or conditional mean) of Y given X = x is denoted by E(Y | X = x) and is defined to be the expectation of the conditional distribution of Y given X = x.





Conditional Expectation
DISCRETE RANDOM VARIABLES

E(X \mid Y = y) = \sum_x x\,P(X = x \mid Y = y),   where   P(X = x \mid Y = y) = \frac{P(X = x,\,Y = y)}{P(Y = y)}

E(Y \mid X = x) = \sum_y y\,P(Y = y \mid X = x),   where   P(Y = y \mid X = x) = \frac{P(X = x,\,Y = y)}{P(X = x)}

CONTINUOUS RANDOM VARIABLES

E(X \mid Y = y) = \int_{-\infty}^{\infty} x\,f_{X|Y}(x \mid y)\,dx,   where   f_{X|Y}(x \mid y) = \frac{f(x, y)}{h(y)}

E(Y \mid X = x) = \int_{-\infty}^{\infty} y\,f_{Y|X}(y \mid x)\,dy,   where   f_{Y|X}(y \mid x) = \frac{f(x, y)}{g(x)}

Here h(y) and g(x) are the marginal distributions of Y and X, respectively.




Conditional Expectation

 EXAMPLE: Suppose the joint probability mass function of X and Y is

            X
            0     1     2
    Y  0  0.10  0.15  0.05
       1  0.17  0.22  0.12
       2  0.02  0.10  0.07

a) What is the conditional distribution of X given that Y = 0?

b) What is the expected value of X given that Y = 0?



Conditional Expectation

            X
            0     1     2
    Y  0  0.10  0.15  0.05
       1  0.17  0.22  0.12
       2  0.02  0.10  0.07

a) What is the conditional distribution of X given that Y = 0?

P(X = 0 \mid Y = 0) = \frac{P(X = 0,\,Y = 0)}{P(Y = 0)} = \frac{0.10}{0.30} = \frac{1}{3}

P(X = 1 \mid Y = 0) = \frac{P(X = 1,\,Y = 0)}{P(Y = 0)} = \frac{0.15}{0.30} = \frac{1}{2}

P(X = 2 \mid Y = 0) = \frac{P(X = 2,\,Y = 0)}{P(Y = 0)} = \frac{0.05}{0.30} = \frac{1}{6}

Note that \sum_{i=0}^{2} P(X = i \mid Y = 0) = \frac{1}{3} + \frac{1}{2} + \frac{1}{6} = 1.


Conditional Expectation

b) What is the expected value of X given that Y = 0?

E(X \mid Y = 0) = \sum_{x=0}^{2} x\,P(X = x \mid Y = 0) = 0 \cdot \frac{1}{3} + 1 \cdot \frac{1}{2} + 2 \cdot \frac{1}{6} = \frac{5}{6}



Law of Total Probability for
Expectations

E(Y \mid X = x) = \sum_y y\,P(Y = y \mid X = x) = \sum_y y\,\frac{P(X = x,\,Y = y)}{P(X = x)}

E[E(Y \mid X)] = \sum_x E(Y \mid X = x)\,P(X = x)
             = \sum_x \sum_y y\,\frac{P(X = x,\,Y = y)}{P(X = x)}\,P(X = x)
             = \sum_y y \sum_x P(X = x,\,Y = y) = \sum_y y\,h(y) = E(Y)
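A short numerical check of this identity (my own sketch, applying it with the roles of X and Y swapped to match part (b) of the example): on the example pmf, E[E(X | Y)] should equal E(X).

```python
joint = {(0, 0): 0.10, (1, 0): 0.15, (2, 0): 0.05,
         (0, 1): 0.17, (1, 1): 0.22, (2, 1): 0.12,
         (0, 2): 0.02, (1, 2): 0.10, (2, 2): 0.07}

ys = {y for (_, y) in joint}

def e_x_given(y0):
    """E(X | Y = y0) computed from the joint pmf."""
    p_y0 = sum(p for (x, y), p in joint.items() if y == y0)
    return sum(x * p for (x, y), p in joint.items() if y == y0) / p_y0

# Tower property: E[E(X | Y)] = sum over y of E(X | Y=y) * P(Y=y) = E(X)
lhs = sum(e_x_given(y) * sum(p for (x, yy), p in joint.items() if yy == y)
          for y in ys)
rhs = sum(x * p for (x, y), p in joint.items())
print(lhs, rhs)  # both 0.95
```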


Chebyshev’s Theorem

We stated that the variance of a random variable tells us something about the variability of the observations about the mean.

If a random variable has a small variance or standard deviation, we would expect most of the values to be grouped around the mean.

Therefore, the probability that the random variable assumes a value within a certain interval about the mean is greater than for a similar random variable with a larger standard deviation.



Chebyshev’s Theorem
If we think of probability in terms of area, we would expect a
continuous distribution with a large value of σ to indicate a
greater variability, and therefore we should expect the area to be
more spread out, as in Figure 4.2(a). A distribution with a small
standard deviation should have most of its area close to μ, as in
Figure 4.2(b).

Figure 4.2 Variability of continuous observations about the mean


Chebyshev’s Theorem
The area in the probability histogram in Figure 4.3(b) is spread out much more than that in Figure 4.3(a), indicating a more variable distribution of measurements or outcomes.

Figure 4.3 Variability of discrete observations about the mean


Chebyshev’s Theorem

Theorem 4.10 (Chebyshev’s Theorem)
The probability that any random variable X will assume a value within k standard deviations of the mean is at least 1 − 1/k². That is,

P(μ − kσ < X < μ + kσ) ≥ 1 − \frac{1}{k²}.

 For k = 2, the theorem states that the random variable X has a probability of at least 1 − 1/2² = 3/4 of falling within two standard deviations of the mean.
 That is, three-fourths or more of the observations of any distribution lie in the interval μ ± 2σ. (See the numerical sketch below.)
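To see how conservative the bound can be, here is a sketch (my own illustration, using a normal distribution as the assumed example) comparing the Chebyshev lower bound with the exact two-standard-deviation probability:

```python
from math import erf, sqrt

k = 2.0
chebyshev_bound = 1 - 1 / k**2          # 0.75, valid for ANY distribution

# Exact P(mu - 2*sigma < X < mu + 2*sigma) for a normal distribution:
# Phi(k) - Phi(-k), with Phi the standard normal CDF, equals erf(k / sqrt(2)).
normal_exact = erf(k / sqrt(2))          # ≈ 0.9545

print(chebyshev_bound, normal_exact)     # the bound holds, with room to spare
```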



Chebyshev’s Theorem

Chebyshev’s theorem holds for any distribution of observations. The value given by the theorem is a lower bound only. That is, we know that the probability of a random variable falling within two standard deviations of the mean can be no less than 3/4, but we never know how much more it might actually be. Only when the probability distribution is known can we determine exact probabilities. For this reason we call the theorem a distribution-free result.



Chebyshev’s Theorem

Example 4.27

Solution:

