
Advanced Statistics

Discrete Random Variables

Economics
University of Manchester

Discrete Probability Distributions
The axioms (rules) of probability tell us how to combine
and use probabilities; but where do probabilities come from?

We develop MODELS of PROBABILITY: statistical models

Discrete random variables: this session
Continuous random variables: subsequent sessions
Random Variables
Example: Flipping a coin 3 times
The SAMPLE SPACE is
Ω = {(H,H,H), (H,H,T), (H,T,H), (T,H,H), etc...}
Eight possible outcomes
Each EQUALLY LIKELY, with probability of 1/8

But may be interested in the NUMBER OF HEADS

By convention, a random variable is indicated by
an UPPER CASE letter (X, Y, Z and so forth)
A specific outcome is denoted by the corresponding
LOWER CASE letter (x, y, z, etc)
Statistical models
(mathematical functions)

(1) sample outcomes x in the sample space S
(2) X, the random variable of interest
(3) p(x) assigns probabilities to each value x
Discrete random variables I
X = number of HEADS obtained

X(H,H,H) = 3
X(H,H,T)=X(H,T,H)=X(T,H,H) = 2
and so forth . . .
X(T,T,T) = 0

Values taken by X are: x = 0, 1, 2 or 3

Example of a DISCRETE RANDOM VARIABLE

Discrete random variables II
The possible outcomes for a discrete random
variable, X, are isolated values

Often only integer values (…, -2, -1, 0, 1, 2, …)

A specific value is denoted x

Probabilities are assigned to possible x values


p(x) = Pr(X=x), for any value x
e.g., p(1) = Pr(X=1)
p(x) is a mathematical function
p(x) must be bounded between 0 and 1

Probability mass function (pmf)
Function p(x) that assigns probabilities to each
possible (discrete) outcome x is a PROBABILITY
MASS FUNCTION
Or “probability distribution”
Each value p(x) must be a valid probability
[Figure: bar chart of the probability mass function for the number of HEADS in 3 tosses: p(0) = 0.125, p(1) = 0.375, p(2) = 0.375, p(3) = 0.125]
Properties of a pmf
Properties of p(x):

p(x) ≥ 0, for all values of x

Σx p(x) = 1,

where the sum is taken over all possible values of X

p(r) = 0 ⟺ the value r is NOT one of the possible outcomes
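These properties are easy to verify numerically. A minimal Python sketch for the three-coin pmf (the dictionary simply encodes the probabilities 1/8, 3/8, 3/8, 1/8 from the bar chart):

```python
# pmf for X = number of HEADS in 3 fair coin tosses
pmf = {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}

# Property 1: every p(x) is a valid probability, 0 <= p(x) <= 1
assert all(0 <= p <= 1 for p in pmf.values())

# Property 2: the probabilities sum to 1 over all possible values of X
assert abs(sum(pmf.values()) - 1.0) < 1e-12

# p(r) = 0 for any r outside the support, e.g. r = 5
print(pmf.get(5, 0))  # → 0
```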

Example
A company claims 95% of its light-bulbs will last
longer than 10,000 hours
From a large consignment of light-bulbs, 3 are
randomly chosen
Of these 3 light-bulbs, what is the probability that
only ONE will last longer than 10,000 hours?
Let X = “number of light-bulbs which last longer
than 10,000 hours”
Possible x = 0, 1, 2, 3
Require Pr(X=1).

Example contd.
The probability that any one light bulb lasts longer than
10,000 hours is 0.95
Pr(Failure)=0.05 Pr(Success)=0.95
The three light-bulbs are chosen independently

x = 1 if we observe any of the following mutually
exclusive events:
(F, F, S) or (F, S, F) or (S, F, F)
Pr(F ∩ F ∩ S) = Pr(F) × Pr(F) × Pr(S), by independence
Pr(X=1) = Pr(F ∩ F ∩ S) + Pr(F ∩ S ∩ F) + Pr(S ∩ F ∩ F)
        = 3 × (0.05 × 0.05 × 0.95)
        = 0.007125
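The calculation can be reproduced directly in Python (a minimal sketch; the variable names are illustrative):

```python
p_success = 0.95           # Pr(one bulb lasts longer than 10,000 hours)
p_fail = 1 - p_success     # Pr(one bulb fails to last that long)

# Three mutually exclusive orderings give exactly one success:
# (F,F,S), (F,S,F), (S,F,F), each with probability 0.05 * 0.05 * 0.95
pr_one_success = 3 * (p_fail * p_fail * p_success)
print(round(pr_one_success, 6))  # → 0.007125
```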

Example pmf
Pr(X=1) = 0.007125
Pr(X=2) = 3 x (0.05 x 0.95 x 0.95)
= 0.135375
Pr(X=3) = 0.95 x 0.95 x 0.95 = 0.857375
Pr(X=0) = 0.05 x 0.05 x 0.05 = 0.000125
these FOUR probabilities add to 1, and they define
the pmf
example of a BINOMIAL RANDOM VARIABLE
A binomial random variable is the “number of
successes in n independent experiments”

Tabulating the pmf
If entire pmf is required, often best to present in
tabular form:
x Pr (X = x)
0 0.0001
1 0.0071
2 0.1354
3 0.8574
1.0000

Cumulative distribution
function (cdf)
The cdf is defined as P(x) = Pr(X ≤ x)
Previous example:
Pr(X ≤ 1) = Pr(X=0) + Pr(X=1)
          = 0.000125 + 0.007125
          = 0.00725

Pr(X ≤ 2) = Pr(X=0) + Pr(X=1) + Pr(X=2)
          = 0.142625
          = 1 − Pr(X=3)

Pr(X > 1) = 1 − Pr(X ≤ 1)
          = 0.99275
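The cdf can be built by accumulating the pmf. A sketch using the light-bulb probabilities:

```python
# pmf of X = number of bulbs lasting > 10,000 hours (from the example)
pmf = {0: 0.000125, 1: 0.007125, 2: 0.135375, 3: 0.857375}

def cdf(x):
    """P(x) = Pr(X <= x): sum the pmf over all outcomes k <= x."""
    return sum(p for k, p in pmf.items() if k <= x)

print(round(cdf(1), 6))      # Pr(X <= 1) → 0.00725
print(round(cdf(2), 6))      # Pr(X <= 2) → 0.142625
print(round(1 - cdf(1), 6))  # Pr(X > 1)  → 0.99275
```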

Mean of a random variable
The mean of a random variable is analogous to
the sample mean
But computed using probabilities of outcomes
Refer to the “mean of X” or “mean of the
distribution” or “population mean”
Notation: the mean of a random variable is μ = E[X],
the EXPECTED VALUE of X
Note upper case X in E[X]
Population mean is value anticipated “on
average”

Mean of a discrete random
variable

E[X] = μ = Σx x Pr(X = x), where the sum is taken over all possible values of x

Example: heads in 3 coin tosses (the BINOMIAL DISTRIBUTION)

E[X] = Σ_{x=0}^{3} x Pr(X = x)
     = (0 × 1/8) + (1 × 3/8) + (2 × 3/8) + (3 × 1/8)
     = (1/8) × (0 + 3 + 6 + 3)
     = 1½

[Figure: pmf of X = number of heads in 3 tosses]
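The same sum is straightforward to compute in Python (a minimal sketch for the coin-toss pmf):

```python
# pmf of X = number of heads in 3 fair coin tosses
pmf = {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}

# E[X] = sum over x of x * Pr(X = x)
mean = sum(x * p for x, p in pmf.items())
print(mean)  # → 1.5
```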
Variance of a random variable
Variance of the random variable X, var[X],
characterises the SPREAD of the distribution

Variance of a random variable is a non-negative value

Analogous to the sample variance, but relates to all


possible outcomes

var[X] = σ² is the expected squared distance of X
from the mean μ = E[X]:

var[X] = σ² = E[(X − μ)²] ≥ 0

Variance of a discrete random
variable
var[X] = σ² = Σx (x − μ)² Pr(X = x)

Example: number of heads in 3 coin tosses
E[X] = μ = 1½

var[X] = Σ_{x=0}^{3} (x − 1.5)² p(x)
       = (0 − 1.5)² × 1/8 + (1 − 1.5)² × 3/8
       + (2 − 1.5)² × 3/8 + (3 − 1.5)² × 1/8
       = 3/4
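A sketch of the same calculation, reusing the coin-toss pmf:

```python
# pmf of X = number of heads in 3 fair coin tosses
pmf = {0: 1/8, 1: 3/8, 2: 3/8, 3: 1/8}

mu = sum(x * p for x, p in pmf.items())               # E[X] = 1.5
var = sum((x - mu) ** 2 * p for x, p in pmf.items())  # expected squared distance
print(var)  # → 0.75
```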
Standard Deviation
Standard deviation defined as:
 =+  2

Also a measure of spread


Always positive
Same units of measurement as original
variable
easier to interpret than variance
 = (3/4)=0.866 for coin tossing example

Calculating the variance

Var[X] can be calculated as:

Var[X] = E[(X − μ)²]
       = E[X²] − μ²
       = E[X²] − (E[X])² ≥ 0
This identity will be shown later

Often easier to compute the variance using the latter form

Calculating variance: example
Number of heads in 3 coin tosses
 x      p(x)    x·p(x)   x²·p(x)
 0      0.125   0.000    0.000
 1      0.375   0.375    0.375
 2      0.375   0.750    1.500
 3      0.125   0.375    1.125
Total   1.000   1.500    3.000

Var[X] = E[X²] − μ²
       = 3.0 − (1.5)² = 0.75, as before
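The shortcut can be checked against the table (a minimal sketch):

```python
# pmf of X = number of heads in 3 fair coin tosses
pmf = {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}

e_x  = sum(x * p for x, p in pmf.items())     # E[X]   = 1.5
e_x2 = sum(x**2 * p for x, p in pmf.items())  # E[X^2] = 3.0
var = e_x2 - e_x**2                           # Var[X] = E[X^2] - mu^2
print(var)  # → 0.75
```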
Bernoulli distribution
A single experiment, outcomes Y=1 (success) &
Y=0 (failure)
probability of “success” π & “failure” (1 – π)
For example: Toss a coin once
Success = head; π = 0.5
pmf can be written: p(y) = π^y (1 − π)^(1−y)

E[Y] = Σ_{y=0}^{1} y p(y)
     = 1 × π + 0 × (1 − π) = π
Bernoulli distribution: variance
var[Y] = σ² = E[(Y − π)²]
       = Σ_{y=0}^{1} (y − π)² p(y)
       = (0 − π)² × (1 − π) + (1 − π)² × π
       = (1 − π) × [π² + π(1 − π)]
       = π(1 − π)

Variance is smaller as π → 1 or π → 0
Largest variance when π = 0.5
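The behaviour of the variance as π varies can be seen with a short sketch (the function name is illustrative):

```python
def bernoulli_mean_var(pi):
    """Mean pi and variance pi * (1 - pi) of a Bernoulli(pi) variable."""
    return pi, pi * (1 - pi)

# Variance shrinks as pi -> 0 or pi -> 1, and peaks at pi = 0.5
for pi in (0.1, 0.3, 0.5, 0.7, 0.9):
    mean, var = bernoulli_mean_var(pi)
    print(pi, mean, round(var, 2))
```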

Binomial distribution
Perform a Bernoulli experiment n times and record
X = TOTAL NUMBER of SUCCESSES

The probability of success in any one experiment is π. Then in n experiments:

p(x) = Pr(X = x) = (n choose x) × π^x × (1 − π)^(n−x);  0 ≤ π ≤ 1;  x = 0, 1, ..., n

where

(n choose x) = n! / (x! (n − x)!),  with n! = n × (n − 1) × (n − 2) × ... × 2 × 1 and 0! ≡ 1
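The formula can be implemented directly with Python's `math.comb`. Applied to the earlier light-bulb example (n = 3, π = 0.95), it reproduces the tabulated pmf (a minimal sketch):

```python
from math import comb

def binom_pmf(x, n, pi):
    """Pr(X = x) = C(n, x) * pi**x * (1 - pi)**(n - x)."""
    return comb(n, x) * pi**x * (1 - pi)**(n - x)

# Light-bulb example: n = 3 bulbs, success probability pi = 0.95
for x in range(4):
    print(x, round(binom_pmf(x, 3, 0.95), 6))
```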
Binomial distribution:
mean & variance
A binomial variable X has
E[X] = nπ
var[X] = nπ(1 − π)

For example, with n = 5 & π = 0.2:
E[X] = nπ = 5 × 0.2 = 1.0
var[X] = nπ(1 − π) = 5 × 0.2 × 0.8 = 0.8

Mean & variance depend on the parameters n & π
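Both formulas can be cross-checked against the definitions of the mean and variance, by rebuilding the pmf with `math.comb` and summing directly (a quick numerical sketch):

```python
from math import comb

n, pi = 5, 0.2
# Binomial pmf: Pr(X = x) = C(n, x) * pi**x * (1 - pi)**(n - x)
pmf = {x: comb(n, x) * pi**x * (1 - pi)**(n - x) for x in range(n + 1)}

mean = sum(x * p for x, p in pmf.items())               # should equal n*pi
var = sum((x - mean) ** 2 * p for x, p in pmf.items())  # should equal n*pi*(1-pi)

print(round(mean, 6))  # → 1.0
print(round(var, 6))   # → 0.8
```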

