
MAT205T: Probability Theory

S Vijayakumar
Indian Institute of Information Technology,
Design & Manufacturing, Kancheepuram
Module 7: Expectations

- Expectations and Variance of Standard Random Variables
- Linearity of Expectation
- Independence and Expectation
- Markov and Chebyshev Inequalities
- The Central Limit Theorem
- Laws of Large Numbers
Note

The lectures are mostly based on the textbook:

Sheldon Ross: A First Course in Probability, Pearson.

For details on homework problems and for any further clarifications you may consult this book.
Expectation

Definition
If X is a discrete random variable with probability mass function p(x), then the expectation or
the expected value of X , denoted E (X ), is given by
E(X) = Σ_{x: p(x)>0} x p(x).

That is, the expectation of X is the weighted average of the possible values that X assumes.

Note: The expected value of a random variable is also called the mean or the first moment.
Example
If the pmf of X is given by
p(0) = 1/2 = p(1),

then

E(X) = 0 × 1/2 + 1 × 1/2 = 1/2.

Example
If the pmf of X is given by
p(0) = 1/3 and p(1) = 2/3,

then

E(X) = 0 × 1/3 + 1 × 2/3 = 2/3.
Example: The Indicator Random Variable

Let A be an event and let I be a random variable defined as follows:



I = 1 if A occurs,
    0 if A^c occurs.
Then I is called the indicator random variable of the event A. Its expectation is P(A):

E (I ) = 0 × P(I = 0) + 1 × P(I = 1) = 0 × (1 − P(A)) + 1 × P(A) = P(A).


Example

Let X be the outcome when we roll a fair die. Find E (X ).

Solution:
p(1) = p(2) = p(3) = p(4) = p(5) = p(6) = 1/6.

So,

E(X) = 1 × 1/6 + 2 × 1/6 + 3 × 1/6 + 4 × 1/6 + 5 × 1/6 + 6 × 1/6 = 7/2.
Note

If the experiment is repeated independently n times (for n large) and results in x1, x2, ..., xn, then

(x1 + x2 + ... + xn)/n ≈ E(X).
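As an aside (not part of the slides), a short Python sketch can check this empirically for the fair-die example above: the sample average of many simulated rolls should be close to E(X) = 7/2.

```python
import random

random.seed(0)
n = 100_000                                      # number of independent rolls
rolls = [random.randint(1, 6) for _ in range(n)]
sample_mean = sum(rolls) / n                     # (x1 + x2 + ... + xn) / n

print(sample_mean)                               # close to E(X) = 3.5
```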
Expectation is the Same as the Center of Gravity

- Let X be a random variable with pmf p(xi), i ≥ 1.
- Consider a weightless rod on which weights p(xi) are attached at the locations xi.
- The point about which the rod balances is called the center of gravity of the rod.
- This point also turns out to be the expectation of the random variable.
- For example, for a random variable with pmf

  p(−1) = 0.10, p(0) = 0.25, p(1) = 0.30, p(2) = 0.35,

  the center of gravity of the rod described is at 0.9. It is also the expectation of the random variable.
Expectation of a Function of a Random Variable

Let X denote a random variable that takes on any of the values −1, 0, 1 with respective
probabilities

P(X = −1) = 0.2, P(X = 0) = 0.5, P(X = 1) = 0.3.


Compute E(X²).

Solution: Let Y = X². Then the pmf of Y is given by

P(Y = 0) = P(X = 0) = 0.5

P(Y = 1) = P(X = −1) + P(X = 1) = 0.5

Hence

E(X²) = E(Y) = 0 × 0.5 + 1 × 0.5 = 0.5.

Note: There is a simpler method for computing this expectation!
Expectation of a Function of a Random Variable

Proposition
If X is a discrete random variable that takes on one of the values xi , i ≥ 1, with respective
probabilities p(xi ), then for any real-valued function g
E[g(X)] = Σ_i g(xi) p(xi).

Example
Applying the above proposition to the example in the previous slide, we get

E(X²) = (−1)² × 0.2 + 0² × 0.5 + 1² × 0.3 = 0.5.
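As a small aside, a direct Python rendering of this computation (the pmf below is the one from the example; the helper name is just illustrative):

```python
def expectation_of_g(pmf, g):
    """Compute E[g(X)] = sum of g(x) * p(x) over the support of X."""
    return sum(g(x) * p for x, p in pmf.items())

pmf = {-1: 0.2, 0: 0.5, 1: 0.3}                  # pmf from the example above
print(expectation_of_g(pmf, lambda x: x ** 2))   # 0.5
```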


Expectation of a Function of a Random Variable

Proposition
If X is a discrete random variable that takes on one of the values xi , i ≥ 1, with respective
probabilities p(xi ), then for any real-valued function g
E[g(X)] = Σ_i g(xi) p(xi).
Proof:

Suppose that yj , j ≥ 1, represent the different values of g (xi ), i ≥ 1. Then grouping all the
g (xi ) having the same value gives

Σ_i g(xi) p(xi) = Σ_j Σ_{i: g(xi)=yj} g(xi) p(xi)
               = Σ_j Σ_{i: g(xi)=yj} yj p(xi)
               = Σ_j yj Σ_{i: g(xi)=yj} p(xi)
               = Σ_j yj P[g(X) = yj]
               = E[g(X)]
Linearity of Expectation I

Corollary
If a and b are constants, then
E (aX + b) = aE (X ) + b.

Proof.
E(aX + b) = Σ_{x: p(x)>0} (ax + b) p(x)
          = a Σ_{x: p(x)>0} x p(x) + b Σ_{x: p(x)>0} p(x)
          = aE(X) + b
The nth Moment E[X^n]

Definition
For any random variable X, the quantity E[X^n], n ≥ 1, is called the nth moment of X.

Corollary

E[X^n] = Σ_{x: p(x)>0} x^n p(x).
Generalizations

Proposition
If X and Y have a joint probability mass function p(x, y ), then
E[g(X, Y)] = Σ_y Σ_x g(x, y) p(x, y).
Linearity of Expectation II

Theorem
If X and Y are any random variables, then

E [X + Y ] = E [X ] + E [Y ].

Theorem
If X1 , X2 , . . . , Xn are any random variables, then

E [X1 + X2 + . . . + Xn ] = E [X1 ] + E [X2 ] + . . . + E [Xn ].


Variance

Definition
If X is a random variable with mean µ, then the variance of X , denoted Var(X ), is given by

Var(X) = E[(X − µ)²].


Variance: Alternative Formula

Var(X) = E[(X − µ)²]
       = E[X² − 2µX + µ²]
       = E[X²] − 2µE[X] + µ²
       = E[X²] − 2µ² + µ²
       = E[X²] − µ²

That is,

Var(X) = E[X²] − (E(X))².
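As an aside, a minimal Python check that the defining formula and the alternative formula agree; the pmf used is just an illustrative example.

```python
def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def var_by_definition(pmf):
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())             # E[(X - mu)^2]

def var_by_moments(pmf):
    return sum(x ** 2 * p for x, p in pmf.items()) - mean(pmf) ** 2   # E[X^2] - (E[X])^2

pmf = {-1: 0.2, 0: 0.5, 1: 0.3}
print(var_by_definition(pmf), var_by_moments(pmf))   # both 0.49
```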
The Standard Deviation

Definition
The standard deviation of a random variable X , denoted SD(X ), is given by
SD(X) = √Var(X).
Independence and Expectation

If X and Y are independent random variables, then

E [XY ] = E [X ] E [Y ].

More generally, in this case,

E [g (X ) h(Y )] = E [g (X )] E [h(Y )].
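A Monte Carlo sketch of the first identity (an aside; the distributions of X and Y below are arbitrary illustrative choices): for independent samples, the average of the products should be close to the product of the averages.

```python
import random

random.seed(1)
n = 200_000
xs = [random.gauss(2.0, 1.0) for _ in range(n)]     # X ~ Normal(2, 1)
ys = [random.uniform(0.0, 1.0) for _ in range(n)]   # Y ~ Uniform[0, 1], independent of X

e_xy = sum(x * y for x, y in zip(xs, ys)) / n
e_x, e_y = sum(xs) / n, sum(ys) / n
print(e_xy, e_x * e_y)                              # both close to 2 × 0.5 = 1
```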


Properties of Variance

- Var(aX + b) = a² Var(X).
- If X1, X2, ..., Xn are independent random variables, then

  Var(X1 + X2 + ... + Xn) = Var(X1) + Var(X2) + ... + Var(Xn).


Important Note: The Continuous Case
All of the above concepts and results are defined analogously, and continue to hold, for continuous (and jointly continuous) random variables. For instance:
Definition
The expected value or the expectation of a continuous random variable X with pdf f(x) is given by

E(X) = ∫_{−∞}^{∞} x f(x) dx.

Corollary
The nth moment of a continuous random variable X with pdf f(x) is given by

E[X^n] = ∫_{−∞}^{∞} x^n f(x) dx.
Expectation and Variance of Standard Distributions:
Bernoulli Random Variables

Let X be a Bernoulli random variable with parameter p.


Then its pmf is p(0) = P(X = 0) = 1 − p and p(1) = P(X = 1) = p.

Hence its expectation is

E [X ] = 0 × (1 − p) + 1 × p = p.
Also

E[X²] = 0² × (1 − p) + 1² × p = p.

Hence its variance is

Var(X) = E[X²] − (E(X))² = p − p² = p(1 − p).


Expectation and Variance of Binomial Random Variables

Let X be a binomial random variable with parameters (n, p).


Then X = X1 + X2 + . . . + Xn , where X1 , X2 , . . . , Xn are independent and identically
distributed Bernoulli random variables with parameter p.

Hence, by linearity of expectation,

E [X ] = E [X1 + X2 + . . . + Xn ]
= E [X1 ] + E [X2 ] + . . . + E [Xn ]
= p + p + ... + p
= np
Expectation and Variance of Binomial Random Variables...

And, by independence of X1 , X2 , . . . , Xn ,

Var(X ) = Var(X1 + X2 + . . . + Xn )
= Var(X1 ) + Var(X2 ) + . . . + Var(Xn )
= p(1 − p) + p(1 − p) + . . . + p(1 − p)
= np(1 − p)
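As an aside, a short Python check that the exact binomial pmf yields mean np and variance np(1 − p); n = 10 and p = 0.3 are illustrative choices.

```python
from math import comb

n, p = 10, 0.3
pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

mean = sum(k * pk for k, pk in pmf.items())
var = sum(k**2 * pk for k, pk in pmf.items()) - mean**2
print(mean, n * p)             # 3.0 and 3.0
print(var, n * p * (1 - p))    # 2.1 and 2.1
```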
Expectation and Variance of Poisson Distribution

Let X be a Poisson random variable with parameter λ. Then


E[X] = Σ_{i=0}^{∞} i e^{−λ} λ^i / i!
     = λ e^{−λ} Σ_{i=1}^{∞} λ^{i−1} / (i − 1)!
     = λ e^{−λ} Σ_{j=0}^{∞} λ^j / j!
     = λ
Expectation and Variance of Poisson Random Variables...


E[X²] = Σ_{i=0}^{∞} i² e^{−λ} λ^i / i!
      = λ Σ_{i=1}^{∞} i e^{−λ} λ^{i−1} / (i − 1)!
      = λ Σ_{j=0}^{∞} (j + 1) e^{−λ} λ^j / j!
      = λ [ Σ_{j=0}^{∞} j e^{−λ} λ^j / j! + Σ_{j=0}^{∞} e^{−λ} λ^j / j! ]
      = λ(λ + 1) = λ² + λ

Hence
Var(X) = E[X²] − (E[X])² = λ² + λ − λ² = λ.
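As an aside, a quick numerical check using a truncated Poisson pmf; λ = 4 is an arbitrary illustrative value and the truncation point leaves a negligible tail.

```python
from math import exp, factorial

lam = 4.0
pmf = {i: exp(-lam) * lam**i / factorial(i) for i in range(100)}   # truncated support

mean = sum(i * p for i, p in pmf.items())
var = sum(i**2 * p for i, p in pmf.items()) - mean**2
print(mean, var)   # both close to lambda = 4
```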
Homework

Compute the Expectation and Variance of

(a) the geometric random variable with parameter p (E(X) = 1/p and Var(X) = (1 − p)/p²);

(b) the negative binomial random variable with parameters (r, p) (E(X) = r/p and Var(X) = r(1 − p)/p²).
Expectation and Variance of the Gamma Distribution

Let X be a gamma random variable with parameters (α, λ). Then

E(X) = (1/Γ(α)) ∫_0^∞ x λe^{−λx} (λx)^{α−1} dx
     = (1/(λΓ(α))) ∫_0^∞ e^{−λx} (λx)^α d(λx)
     = Γ(α + 1) / (λΓ(α))
     = α/λ

Homework: Prove that Var(X) = α/λ².
Expectation and Variance of Normal Random Variables

Let X be a normal random variable with parameters (µ, σ²).

Then Z = (X − µ)/σ is the standard normal random variable and hence has parameters (0, 1).
Now,

E[Z] = 0 (Prove!).
So,
Var(Z) = E[Z²] = 1 (Prove!).
Hence
E[X] = E[σZ + µ] = σE[Z] + µ = µ
and
Var(X) = Var(σZ + µ) = σ² Var(Z) = σ².
Linearity of Expectation II: More Applications
Suppose that N people throw their hats into the center of a room. If the hats are mixed up
and each person selects a hat at random, find the expected number of people that select their
own hat.
Solution: Let X denote the number of matches. Then

X = X1 + X2 + . . . + XN ,
where

Xi = 1 if the ith person selects his own hat,
     0 otherwise.
Since, for each i, the ith person is equally likely to select any of the N hats,

P(Xi = 1) = 1/N.

Thus, as each Xi is a Bernoulli random variable,

E[Xi] = P(Xi = 1) = 1/N.

Thus

E[X] = E[X1 + X2 + ... + XN] = E[X1] + ... + E[XN] = N × (1/N) = 1.

Hence, on the average, exactly one person selects his own hat.
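A simulation sketch of this result (an aside, not from the slides): shuffle the hats uniformly at random and count fixed points; the average count stays close to 1 regardless of N.

```python
import random

random.seed(2)
N, trials = 20, 50_000
total_matches = 0
for _ in range(trials):
    hats = list(range(N))
    random.shuffle(hats)      # a uniformly random assignment of hats to people
    total_matches += sum(1 for person, hat in enumerate(hats) if person == hat)

print(total_matches / trials)  # close to 1
```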
Homework: Coupon-Collecting Problem

Suppose that there are N different types of coupons and that each time one obtains a new
coupon it is equally likely to be any of the N types. Find the expected number of coupons one
needs to amass in order to obtain a complete set containing all N types.

Note: The answer is not 1.


Homework: A Random Walk in the Plane

Suppose that a particle, initially at the origin, undergoes a sequence of n steps, each of unit
length and in a completely random direction. Compute E[D²], where D is the distance of the
particle from the origin after the n steps.
Moment Generating Functions

Definition
The moment generating function (mgf) M(t) of a random variable X is defined for all real
values of t by

M(t) = E[e^{tX}]

     = Σ_x e^{tx} p(x)                 if X is discrete with mass function p(x)

     = ∫_{−∞}^{∞} e^{tx} f(x) dx       if X is continuous with density function f(x).
Note

Moment generating functions are so called because all the moments of X can be obtained by
successively differentiating M(t) and evaluating the resulting functions at t = 0:

M′(t) = (d/dt) E[e^{tX}]
      = E[(d/dt) e^{tX}]
      = E[X e^{tX}]

Hence
M′(0) = E[X].
Similarly,
M″(t) = E[X² e^{tX}] and M″(0) = E[X²].
In general,
M^(n)(t) = E[X^n e^{tX}] and M^(n)(0) = E[X^n] (n ≥ 1).
Example

Let X be a Bernoulli random variable with parameter p. Then its mgf is

M(t) = E[e^{tX}] = e^{t·0} p(0) + e^{t·1} p(1) = (1 − p) + pe^t.

So,
M′(t) = pe^t and M″(t) = pe^t.
Hence
E[X] = M′(0) = p and E[X²] = M″(0) = p.
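As an aside, a symbolic sketch of the same computation, assuming SymPy is available: differentiate the Bernoulli mgf and evaluate at t = 0.

```python
import sympy as sp

t, p = sp.symbols('t p')
M = 1 - p + p * sp.exp(t)            # Bernoulli mgf

EX = sp.diff(M, t).subs(t, 0)        # M'(0)
EX2 = sp.diff(M, t, 2).subs(t, 0)    # M''(0)
print(EX, EX2)                       # p p
```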
Example

Let X be a binomial random variable with parameters (n, p). Then

M(t) = E[e^{tX}]
     = Σ_{k=0}^{n} e^{tk} C(n, k) p^k (1 − p)^{n−k}
     = Σ_{k=0}^{n} C(n, k) (pe^t)^k (1 − p)^{n−k}
     = (pe^t + 1 − p)^n
     = (1 − p + pe^t)^n
Independence and Moment Generating Functions

Proposition
If X and Y are independent random variables, then
M_{X+Y}(t) = M_X(t) M_Y(t).

Proof.

M_{X+Y}(t) = E[e^{t(X+Y)}]
           = E[e^{tX + tY}]
           = E[e^{tX} e^{tY}]
           = E[e^{tX}] E[e^{tY}]
           = M_X(t) M_Y(t)
Note

Let X be a binomial random variable with parameters (n, p). Then

X = X1 + . . . + Xn ,
where X1 , . . . , Xn are independent and identically distributed (iid) Bernoulli random variables
with parameter p.
Then the mgf of X is

M_X(t) = M_{X1}(t) M_{X2}(t) ... M_{Xn}(t) = (1 − p + pe^t)^n.

Homework: Compute E [X ] and E [X 2 ] using the mgf above. Hence compute the variance of X .
Moment Generating Function of Standard Distributions

Derive the moment generating functions of the following random variables. Hence compute
their mean and variance.

- Poisson random variable with parameter λ. (M(t) = e^{λ(e^t − 1)}.)
- Geometric random variable with parameter p. (M(t) = pe^t / (1 − (1 − p)e^t).)
- Negative binomial random variable with parameters (r, p). (M(t) = [pe^t / (1 − (1 − p)e^t)]^r.)
Moment Generating Function of Exponential Random Variable

Let X be exponential with parameter λ. Find its mgf.

Solution:

M(t) = E[e^{tX}]
     = ∫_0^∞ e^{tx} λe^{−λx} dx
     = λ ∫_0^∞ e^{−(λ−t)x} dx
     = λ/(λ − t)   for t < λ.
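As an aside, a crude numerical sanity check: approximate E[e^{tX}] by a Riemann sum over the exponential density and compare with λ/(λ − t); λ = 2 and t = 0.5 are illustrative values.

```python
from math import exp

lam, t = 2.0, 0.5          # requires t < lam
dx, upper = 1e-4, 40.0     # step size and truncation point for the Riemann sum
xs = (i * dx for i in range(int(upper / dx)))
mgf_numeric = sum(exp(t * x) * lam * exp(-lam * x) * dx for x in xs)

print(mgf_numeric, lam / (lam - t))   # both close to 1.333...
```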
Homework

Let X be an exponential random variable with parameter λ. Compute E[X] and E[X²] using
the mgf of X. Hence compute the variance of X.
MGFs of the Standard Normal and the Normal Random Variables

MGF of the standard normal:

M_Z(t) = E(e^{tZ}) = ∫_{−∞}^{∞} e^{tz} (1/√(2π)) e^{−z²/2} dz = e^{t²/2} ∫_{−∞}^{∞} (1/√(2π)) e^{−(z−t)²/2} dz = e^{t²/2}.

MGF of the normal random variable with parameters (µ, σ²):

M_X(t) = E(e^{tX}) = E[e^{t(σZ+µ)}] = E[e^{µt} e^{σtZ}] = e^{µt} E[e^{σtZ}] = e^{µt} e^{σ²t²/2} = e^{µt + σ²t²/2}.
Homework

1. Find the mean and variance of the standard normal random variable using its moment
generating function. (Mean: 0. Variance: 1.)
2. Find the mean and variance of the normal random variable with parameters (µ, σ²) using
its moment generating function. (Mean: µ. Variance: σ².)
Moment Generating Function of Standard Distributions

Find the moment generating functions of the following random variables. Hence compute their
mean and variance.

- Uniform random variable over the interval [a, b]. (M(t) = (e^{tb} − e^{ta}) / (t(b − a)).)
- Gamma random variable with parameters (α, λ). (M(t) = (λ/(λ − t))^α.)
Fact

The moment generating function of a random variable uniquely determines the distribution.

Examples
- If the mgf of a random variable X is 1 − p + pe^t, then X must be Bernoulli with parameter p.
- If the mgf of a random variable X is λ/(λ − t), then X must be exponential with parameter λ.
- If the mgf of a random variable X is e^{t²/2}, then X must be the standard normal random variable.
Example

Show that if X and Y are independent normal random variables with respective parameters
(µ1, σ1²) and (µ2, σ2²), then X + Y is normal with parameters (µ1 + µ2, σ1² + σ2²).

Solution:

M_{X+Y}(t) = M_X(t) M_Y(t)
           = e^{µ1 t + σ1² t²/2} e^{µ2 t + σ2² t²/2}
           = e^{(µ1 + µ2)t + (σ1² + σ2²)t²/2}

We recognize the above as the mgf of a normal random variable with parameters
(µ1 + µ2, σ1² + σ2²). From the uniqueness property of mgfs, it now follows that X + Y is
normal with parameters (µ1 + µ2, σ1² + σ2²).
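A simulation sketch of this example (an aside, with illustrative parameter values): the sample mean and variance of X + Y should be close to µ1 + µ2 and σ1² + σ2².

```python
import random, statistics

random.seed(3)
mu1, s1 = 1.0, 2.0      # X ~ Normal(1, 4)
mu2, s2 = -0.5, 1.5     # Y ~ Normal(-0.5, 2.25), independent of X

sums = [random.gauss(mu1, s1) + random.gauss(mu2, s2) for _ in range(200_000)]
print(statistics.mean(sums), mu1 + mu2)            # both close to 0.5
print(statistics.variance(sums), s1**2 + s2**2)    # both close to 6.25
```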
Markov’s Inequality

Proposition
Let X be a non-negative random variable. Then for any value a > 0

P(X ≥ a) ≤ E[X]/a.
Proof

E[X] = Σ_x x P(X = x)
     ≥ Σ_{x ≥ a} x P(X = x)
     ≥ Σ_{x ≥ a} a P(X = x)
     = a P(X ≥ a)

∴ P(X ≥ a) ≤ E[X]/a.
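As an aside, an empirical look at how tight the bound is, using an exponential random variable with mean 1 as an illustrative choice.

```python
import random

random.seed(4)
n = 200_000
xs = [random.expovariate(1.0) for _ in range(n)]    # non-negative, E[X] = 1

for a in (1, 2, 4):
    empirical = sum(x >= a for x in xs) / n
    print(a, empirical, 1 / a)    # empirical P(X >= a) never exceeds the Markov bound E[X]/a
```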
Chebyshev’s Inequality

Proposition
Let X be a random variable with mean µ and variance σ². Then for any value k > 0

P(|X − µ| ≥ k) ≤ σ²/k².
Proof:

Note that (X − µ)² is a non-negative random variable. Hence applying Markov's inequality
with a = k², we obtain

P((X − µ)² ≥ k²) ≤ E[(X − µ)²]/k² = σ²/k².

But (X − µ)² ≥ k² if and only if |X − µ| ≥ k (> 0). Hence the above inequality is equivalent to

P(|X − µ| ≥ k) ≤ σ²/k².
Example

The number of items produced in a factory during a week is a random variable with mean 50.
1. What can be said about the probability that this week’s production will exceed 75?
2. If the variance of a week's production is known to equal 25, what can be said about the
probability that this week’s production will be between 40 and 60?
Solution:

Let X be the number of items that are produced in a week.


1. By Markov's inequality,

   P(X > 75) ≤ E[X]/75 = 50/75 = 2/3.

2. By Chebyshev's inequality,

   P(|X − 50| ≥ 10) ≤ σ²/10² = 25/100 = 1/4.

   Hence

   P(40 < X < 60) = P(|X − 50| < 10) ≥ 1 − 1/4 = 3/4.
Homework

Let X be a normal random variable with parameters (µ, σ²). Using Chebyshev's inequality, find
an upper bound for the probability P(|X − µ| ≥ 2σ). (Answer: 0.25.) Also approximate this
probability using the normal table. (Answer: 0.0456.)

Note: Chebyshev’s inequality is often used as a theoretical tool in proving results.

Homework: If Var(X ) = 0, prove that

P(X = E [X ]) = 1.
Note

Let X1, X2, X3, ..., Xn be a sequence of independent and identically distributed (iid) random
variables, each having a finite mean E[Xi] = µ and a finite variance Var(Xi) = σ². Then

E((X1 + X2 + ... + Xn)/n) = (1/n) × E(X1 + X2 + ... + Xn) = (1/n) × n × µ = µ

and

Var((X1 + X2 + ... + Xn)/n) = (1/n²) × Var(X1 + X2 + ... + Xn) = (1/n²) × n × σ² = σ²/n.
The Weak Law of Large Numbers

Theorem
Let X1, X2, X3, ... be a sequence of independent and identically distributed (iid) random
variables, each having finite mean E[Xi] = µ. Then for any ε > 0

P( |(X1 + X2 + ... + Xn)/n − µ| ≥ ε ) → 0 as n → ∞.
Proof:

Assume that the random variables have a finite variance σ². Note that

E[(X1 + X2 + ... + Xn)/n] = µ and Var((X1 + X2 + ... + Xn)/n) = σ²/n.

Hence, by Chebyshev's inequality,

P( |(X1 + X2 + ... + Xn)/n − µ| ≥ ε ) ≤ σ²/(nε²) → 0 as n → ∞.
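A simulation sketch of the weak law (an aside), using exponential(1) summands as an illustrative choice: the empirical probability that the sample mean deviates from µ = 1 by at least ε shrinks as n grows.

```python
import random

random.seed(5)
mu, eps, trials = 1.0, 0.1, 2_000

for n in (10, 100, 1000):
    deviations = 0
    for _ in range(trials):
        sample_mean = sum(random.expovariate(1.0) for _ in range(n)) / n
        deviations += abs(sample_mean - mu) >= eps
    print(n, deviations / trials)    # decreases toward 0 as n grows
```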
Note

Let X be a random variable with mean µ and variance σ². (X need not be a normal random
variable.)

Consider the random variable

Z = (X − µ)/σ.

We have
E(Z) = 0 and Var(Z) = 1.

(Z need not be the standard normal random variable.)


The Central Limit Theorem

Theorem
Let X1, X2, X3, ... be a sequence of independent and identically distributed (iid) random
variables, each having mean µ and variance σ². Then the distribution of

((X1 + X2 + ... + Xn)/n − µ) / (σ/√n) = (X1 + X2 + ... + Xn − nµ) / (σ√n)

tends to the standard normal as n → ∞. That is, for −∞ < a < ∞,

P( (X1 + X2 + ... + Xn − nµ) / (σ√n) ≤ a ) → (1/√(2π)) ∫_{−∞}^{a} e^{−z²/2} dz as n → ∞.
Proof:

We will prove that the mgf (moment generating function) of the random variable

(X1 + X2 + ... + Xn − nµ) / (σ√n)

tends to the mgf of the standard normal as n → ∞. This implies the theorem (by a lemma).

Let Zi = (Xi − µ)/σ for i = 1, 2, .... Then

Z1 + Z2 + ... + Zn = (X1 + X2 + ... + Xn − nµ)/σ.

Note also that E[Zi] = 0 and Var(Zi) = 1 for all i. Hence E[Zi²] = 1.

Moreover, Z1 , Z2 , . . . are independent.


M_{Zi}(t) = E[e^{tZi}]
          = E[1 + tZi + (t²/2)Zi² + (t³/6)Zi³ + ...]
          = 1 + tE[Zi] + (t²/2)E[Zi²] + (t³/6)E[Zi³] + ...
          = 1 + t²/2 + (t³/6)E[Zi³] + ...

Hence

M_{Zi}(t/√n) = 1 + t²/(2n) + (t³/(6n^{3/2}))E[Zi³] + ...
             ≈ 1 + t²/(2n)   for n large.
Finally,

M_{(Z1+Z2+...+Zn)/√n}(t) = M_{Z1+Z2+...+Zn}(t/√n)
                         = [M_{Z1}(t/√n)]^n
                         ≈ (1 + t²/(2n))^n
                         → e^{t²/2} as n → ∞.
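As an aside, a simulation sketch of the theorem for Uniform[0, 1] summands: the empirical probability that the standardized sum is at most a is compared with the standard normal cdf Φ(a), computed here via math.erf.

```python
import math, random

def phi(a):
    """Standard normal cdf via the error function."""
    return 0.5 * (1 + math.erf(a / math.sqrt(2)))

random.seed(6)
n, trials = 30, 20_000
mu, sigma = 0.5, math.sqrt(1 / 12)              # mean and sd of Uniform[0, 1]
a = 1.0

count = 0
for _ in range(trials):
    s = sum(random.random() for _ in range(n))
    z = (s - n * mu) / (sigma * math.sqrt(n))   # standardized sum
    count += z <= a

print(count / trials, phi(a))    # both close to 0.841
```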
The Strong Law of Large Numbers

Theorem
Let X1 , X2 , X3 , . . . be a sequence of independent and identically distributed (iid) random
variables each having finite mean E [Xi ] = µ. Then, with probability 1,

(X1 + X2 + ... + Xn)/n → µ as n → ∞.
Proof:

We prove the theorem assuming that E[Xi⁴] = K < ∞. Let us first assume that E[Xi] = 0.

Let Sn = X1 + ... + Xn and consider

E[Sn⁴] = E[(X1 + ... + Xn)(X1 + ... + Xn)(X1 + ... + Xn)(X1 + ... + Xn)].

Expanding the RHS, we obtain terms of the forms Xi⁴, Xi³Xj, Xi²Xj², Xi²XjXk, XiXjXkXl.

Since E[Xi] = 0 for all i, we have

E[Xi³Xj] = E[Xi³]E[Xj] = 0

E[Xi²XjXk] = E[Xi²]E[Xj]E[Xk] = 0

E[XiXjXkXl] = 0

Now, for a given pair i and j, there are C(4, 2) = 6 terms that equal Xi²Xj². Hence

E[Sn⁴] = nE[Xi⁴] + 6 C(n, 2) E[Xi²Xj²]
       = nK + 3n(n − 1)E[Xi²]E[Xj²]

Also
0 ≤ Var(Xi²) = E[Xi⁴] − (E[Xi²])².
Hence
(E[Xi²])² ≤ E[Xi⁴] = K.
Hence
E[Sn⁴] ≤ nK + 3n(n − 1)K.
This implies that
E[Sn⁴/n⁴] ≤ K/n³ + 3K/n².
Hence
Σ_{n=1}^{∞} E[Sn⁴/n⁴] ≤ K Σ_{n=1}^{∞} 1/n³ + 3K Σ_{n=1}^{∞} 1/n² < ∞.

Thus
E[ Σ_{n=1}^{∞} Sn⁴/n⁴ ] = Σ_{n=1}^{∞} E[Sn⁴/n⁴] < ∞.

This implies that, with probability 1, Σ_{n=1}^{∞} Sn⁴/n⁴ < ∞.
This implies that Sn⁴/n⁴ → 0 and hence that Sn/n → 0.
This means that (X1 + X2 + ... + Xn)/n → 0.

When the mean µ ≠ 0, we apply the above arguments to Yi = Xi − µ and conclude that
Tn/n → 0 with probability 1, where Tn = Y1 + ... + Yn.
This implies that (X1 + X2 + ... + Xn)/n → µ with probability 1.
Applications of the SLLN

Theorem
Let X, X1, X2, X3, ... be a sequence of independent and identically distributed (iid) random
variables, each having a finite kth moment E[X^k] = µ_k. Then, with probability 1,

(X1^k + X2^k + ... + Xn^k)/n → E[X^k] = µ_k as n → ∞.
Applications of the SLLN

Theorem
Consider an event A with P(A) unknown. Perform the underlying experiment repeatedly and
independently and set Xi = 1 if A occurs in the ith trial and set Xi = 0 otherwise. Then, with
probability 1,
(X1 + X2 + ... + Xn)/n → P(A) as n → ∞.
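This is the principle behind Monte Carlo estimation. As an aside, a small sketch with an illustrative event A (the sum of two fair dice equals 7, so that P(A) = 1/6): the relative frequency of A converges to P(A).

```python
import random

random.seed(7)
n = 500_000
hits = sum(1 for _ in range(n)
           if random.randint(1, 6) + random.randint(1, 6) == 7)   # indicator of the event A

print(hits / n, 1 / 6)    # relative frequency close to P(A) ≈ 0.1667
```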
