
Statistics Ch 3: Random Variables and Probability Distributions Page 3-1

RANDOM VARIABLES
<def> A random variable X is a function that associates a real number with each
      element in sample space S.   X : S → R
• A random variable is a numerical description of the outcome of an experiment.
• A random variable can be classified as being either discrete or continuous
depending on the numerical values it assumes.
• A discrete random variable (離散隨機變數)may assume either a finite
number of values or an infinite sequence of values.
• A continuous random variable (連續隨機變數) may assume any numerical
value in an interval or collection of intervals.
• Example: X = the sum of two fair dice
      S                 X : S → R
      (1,1)             2
      (1,2), (2,1)      3
      ...               ...
      (6,6)             12

Discrete Random Variable:


If a sample space contains a finite number of possibilities or an unending sequence
with as many elements as there are whole numbers, then it is called a discrete
sample space. (可數有限,可數無限)
A random variable is called a discrete random variable if its set of possible
outcomes is countable.
Discrete random variable with a finite number of values
• Example: X = number of TV sets sold at the store in one day,
  where x can take on 5 values (0, 1, 2, 3, 4)
• Example: Tossing 3 fair coins
  S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}
  Random variable: X = the # of heads
  Realization or outcome: x = 0, 1, 2, 3


Discrete random variable with an infinite sequence of values


• Example: X = number of customers arriving in one day,
  where x can take on the values 0, 1, 2, . . .
  We can count the customers arriving, but there is no finite upper limit on the
  number that might arrive.
• Example: Tossing a die until the number 1 appears
  S = {1, *1, **1, ***1, …}   (* denotes any outcome other than 1)
  Random variable: X = the # of tosses until the number 1 appears
  Realization or outcome: x = 1, 2, 3, 4, …

Continuous Random Variable


If a sample space contains infinitely many points, as many as there are points on a
line segment, then it is called a continuous sample space. (不可數無限)
A random variable is called a continuous random variable if it can take on values
on a continuous scale.
• Example: X = Travel time from Taipei to Tainan

DISCRETE PROBABILITY DISTRIBUTIONS


<def> The set of ordered pairs (x, f(x)) is called the probability function, probability
      distribution, or probability mass function (pmf) of the discrete random
      variable X if, for each outcome x,
      f(x) ≥ 0,   Σ_x f(x) = 1,   and   P(X = x) = f(x).

<def> The cumulative distribution function (CDF), F(x), of a discrete random
      variable X with probability distribution f(x) is
      F(x) = P(X ≤ x) = Σ_{t ≤ x} f(t)   for −∞ < x < ∞.

(a) F(x) is a nondecreasing function of x.
(b) lim_{x→∞} F(x) = F(∞) = 1
(c) lim_{x→−∞} F(x) = F(−∞) = 0


• The probability distribution for a random variable describes how probabilities


are distributed over the values of the random variable.
• The probability distribution is defined by a probability function, denoted by
f(x), which provides the probability for each value of the random variable.

 i 1
 i  2,3,,7
 Example: X = the sum of two fair dice => P{ X  i}   36
13  i
 i  8,9,,12
 36
• Example: X = the # of flips required to have the first head (p = the prob. of a coin
  coming up heads)

  => P{X = x} = (1 − p)^(x−1) p,  which is a Geometric distribution.

     Σ_{x=1}^{∞} P{X = x} = Σ_{x=1}^{∞} (1 − p)^(x−1) p = p · 1/(1 − (1 − p)) = 1

     F(x) = P(X ≤ x) = Σ_{t=1}^{x} f(t) = Σ_{t=1}^{x} (1 − p)^(t−1) p
          = p + (1 − p)p + (1 − p)²p + … + (1 − p)^(x−1) p = 1 − (1 − p)^x

For a fair die (p = 1/6), we have

  P(X = x) = f(x) = (1 − p)^(x−1) p = (1 − 1/6)^(x−1) (1/6) > 0,   for x = 1, 2, 3, …

  Σ_{x=1}^{∞} f(x) = Σ_{x=1}^{∞} (1 − p)^(x−1) p = 1/6 + (5/6)(1/6) + (5/6)²(1/6) + … = 1

  F(x) = 1 − (1 − 1/6)^x = 1 − (5/6)^x
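The closed-form CDF above can be checked numerically against partial sums of the pmf; a small sketch with p = 1/6 (waiting for the first 1 on a fair die):

```python
# Numeric check of the geometric derivation, with p = 1/6.
p = 1 / 6

def f(x):               # pmf: f(x) = (1 - p)^(x-1) p
    return (1 - p) ** (x - 1) * p

def F(x):               # closed-form CDF derived above: 1 - (1 - p)^x
    return 1 - (1 - p) ** x

# The partial sums of the pmf reproduce the closed-form CDF.
for x in range(1, 50):
    assert abs(sum(f(t) for t in range(1, x + 1)) - F(x)) < 1e-12

# Total probability tends to 1.
assert abs(sum(f(x) for x in range(1, 2000)) - 1) < 1e-9
```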
• Example: X = the # of flips required to have the k-th head (p = the prob. of a coin
  coming up heads)

  => P{X = i} = C(i−1, k−1) (1 − p)^(i−k) p^k,   i = k, k+1, …

     Σ_{i=k}^{∞} P{X = i} = Σ_{i=k}^{∞} C(i−1, k−1) (1 − p)^(i−k) p^k = 1   (verify)


• Example: In a trial, there are m possible outcomes with probabilities p_1, p_2, …, p_m.
  X = the # of trials required until each outcome has occurred at least once.
  P(X = n) = ?
  Let A_i = the event that outcome i has not occurred in the first n trials, and
  B_i = the event that outcome i has not occurred in the first n − 1 trials. Then

      P(A_i) = (1 − p_i)^n,             P(B_i) = (1 − p_i)^(n−1)
      P(A_i A_j) = (1 − p_i − p_j)^n,   P(B_i B_j) = (1 − p_i − p_j)^(n−1), …

  P{X > n} = P{at least one outcome has not occurred in the first n trials} = P(∪_{i=1}^{m} A_i)
  P{X > n − 1} = P{at least one outcome has not occurred in the first n − 1 trials} = P(∪_{i=1}^{m} B_i)

  P{X = n} = P{X > n − 1} − P{X > n} = P(∪_{i=1}^{m} B_i) − P(∪_{i=1}^{m} A_i)
           = Σ_{i=1}^{m} p_i (1 − p_i)^(n−1) − Σ_{i<j} (p_i + p_j)(1 − p_i − p_j)^(n−1) + …
  (by inclusion-exclusion)

Graphic form
• Bar Chart / Probability Histogram (frequency on the vertical axis)
• CDF (cumulative frequency)


CONTINUOUS PROBABILITY DISTRIBUTIONS

<def> The function f(x) is called the probability density function (pdf) of the
      continuous random variable X defined over the real numbers R if, for each
      outcome x,
      f(x) ≥ 0,   ∫_{−∞}^{∞} f(x) dx = 1,   and   P(a < X < b) = ∫_a^b f(x) dx.

*NOTE: P(X = x) = 0 for the continuous case.

<def> The cumulative distribution function (CDF), F(x), of a continuous random
      variable X with density function f(x) is
      F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt   for −∞ < x < ∞.

(a) F(x) is a nondecreasing function of x.
(b) lim_{x→∞} F(x) = F(∞) = 1
(c) lim_{x→−∞} F(x) = F(−∞) = 0

• A continuous random variable can assume any value in an interval on the real
line or in a collection of intervals.
• It is not possible to talk about the probability of the random variable assuming a
particular value. That is, P{X=x}=0.
• Given a particular value, we only know the probability density(機率密度).
• We talk about the probability of the random variable assuming a value within a
given interval.
• The probability of the random variable assuming a value within some given
interval from x1 to x2 is defined to be the area under the graph of the probability

density function between x1 and x2 .


Graphic form: P(x_1 ≤ X ≤ x_2) is the area under the graph of f(x) between x_1 and x_2.

• Example:  f(x) = x²/3  for −1 < x < 2;  f(x) = 0 elsewhere.

  ∫_{−∞}^{∞} f(x) dx = ∫_{−1}^{2} (x²/3) dx = [x³/9] from −1 to 2 = 8/9 + 1/9 = 1

  P(0 < X ≤ 1) = ∫_0^1 (x²/3) dx = [x³/9] from 0 to 1 = 1/9

  F(x) = ∫_{−∞}^{x} f(t) dt = ∫_{−1}^{x} (t²/3) dt = (x³ + 1)/9   for −1 ≤ x < 2

  => F(x) = 0 for x < −1;   F(x) = (x³ + 1)/9 for −1 ≤ x < 2;   F(x) = 1 for x ≥ 2.

NOTES:

Discrete case:
1. P(X = x) = f(x) = P(x − 1 < X ≤ x) = P(x − 1 < X < x + 1)  (for integer-valued X).
2. P(X = x) = P(X ≤ x) − P(X ≤ x − 1) = F(x) − F(x − 1).

Continuous case:
1. P(X = x) = 0 ≠ f(x).
2. f(x) is a density, not a probability.
3. P(a < X < b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a ≤ X ≤ b).
4. f(x) = dF(x)/dx for all x at which F(x) is differentiable.

Both cases:
1. F(−∞) = 0, i.e. lim_{x→−∞} F(x) = 0
2. F(∞) = 1, i.e. lim_{x→∞} F(x) = 1
3. F(x) is nondecreasing in x, i.e. if x_1 ≤ x_2, then F(x_1) ≤ F(x_2).
4. P(a < X ≤ b) = P(X ≤ b) − P(X ≤ a) = F(b) − F(a)
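The "both cases" identity P(a < X ≤ b) = F(b) − F(a) can be checked directly in the discrete case; this sketch reuses the two-dice pmf from earlier in the chapter, with exact fractions:

```python
from fractions import Fraction

def f(i):   # pmf of the sum of two fair dice, from the earlier example
    return Fraction(i - 1, 36) if i <= 7 else Fraction(13 - i, 36)

def F(x):   # CDF: F(x) = sum of f(t) over t <= x
    return sum(f(t) for t in range(2, x + 1))

assert F(12) == 1
assert F(7) - F(6) == f(7)                              # discrete note 2
assert F(9) - F(4) == sum(f(t) for t in range(5, 10))   # P(4 < X <= 9) = F(9) - F(4)
```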

EMPIRICAL DISTRIBUTIONS
Sometimes we cannot generate sufficient information or experimental data to fully
characterize a distribution. Usually the function f(x) is unknown and its form must
be assumed. We need good judgment, based on all available information, to select a
valid f(x).
• Example: Car Battery Life (n = 40)
2.2 4.1 3.5 4.5 3.2 3.7 3.0 2.6 3.4 1.6 3.1 3.3 3.8 3.1 4.7
3.7 2.5 4.3 3.4 3.6 2.9 3.3 3.9 3.1 3.3 3.1 3.7 4.4 3.2 4.1
1.9 3.4 4.7 3.8 3.2 2.6 3.9 3.0 4.2 3.5


Stem and Leaf Plot


Stem Leaf Frequency
1 69 2
2 25669 5
3 001111222333444556777889 25
4 11234577 8

Double-Stem and Leaf Plot


Stem Leaf Frequency
1b 69 2
2a 2 1
2b 5669 4
3a 001111222333444 15
3b 556777889 10
4a 11234 5
4b 577 3

Frequency Distribution
Class Interval Class Midpoint Frequency, f Relative Frequency
1.5-1.9 1.7 2 0.05
2.0-2.4 2.2 1 0.025
2.5-2.9 2.7 4 0.1
3.0-3.4 3.2 15 0.375
3.5-3.9 3.7 10 0.25
4.0-4.4 4.2 5 0.125
4.5-4.9 4.7 3 0.075
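The frequency column above can be reproduced directly from the 40 battery lifetimes (a quick sketch for checking the table):

```python
# Rebuild the frequency table from the 40 battery lifetimes listed above.
data = [2.2, 4.1, 3.5, 4.5, 3.2, 3.7, 3.0, 2.6, 3.4, 1.6, 3.1, 3.3, 3.8,
        3.1, 4.7, 3.7, 2.5, 4.3, 3.4, 3.6, 2.9, 3.3, 3.9, 3.1, 3.3, 3.1,
        3.7, 4.4, 3.2, 4.1, 1.9, 3.4, 4.7, 3.8, 3.2, 2.6, 3.9, 3.0, 4.2, 3.5]
edges = [1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]   # class intervals 1.5-1.9, 2.0-2.4, ...
freq = [sum(lo <= x < hi for x in data) for lo, hi in zip(edges, edges[1:])]
rel = [f / len(data) for f in freq]                 # relative frequencies
assert freq == [2, 1, 4, 15, 10, 5, 3]
assert abs(sum(rel) - 1) < 1e-12
```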

Relative Frequency Histogram (used to estimate f(x))


Skewness:
• Left-Skewed   • Symmetric   • Right-Skewed
(each panel marks Q1, Median, and Q3)

Relative Cumulative Frequency Distribution

JOINT PROBABILITY DISTRIBUTIONS

<def> The function f(x, y) is a joint probability distribution or joint probability
      mass function of the discrete random variables X and Y if
      1. f(x, y) ≥ 0 for all (x, y)
      2. Σ_x Σ_y f(x, y) = 1
      3. P(X = x, Y = y) = f(x, y)
      For any region A in the xy-plane, P[(X, Y) ∈ A] = ΣΣ_{(x,y)∈A} f(x, y).


<def> The function f(x, y) is a joint density function of the continuous random
      variables X and Y if
      1. f(x, y) ≥ 0 for all (x, y)
      2. ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = 1
      3. for any region A in the xy-plane, P[(X, Y) ∈ A] = ∫∫_A f(x, y) dx dy,
         or P{X ∈ A, Y ∈ B} = ∫_B ∫_A f(x, y) dx dy

<def> The joint cumulative probability function of the random variables X and Y:
      F(x, y) = P{X ≤ x, Y ≤ y}

<def> The marginal distributions of the discrete random variables X and Y:
      g(x) = P{X = x, Y < ∞} = Σ_y f(x, y)
      h(y) = P{X < ∞, Y = y} = Σ_x f(x, y)

<def> The marginal distributions of the continuous random variables X and Y:
      g(x) = ∫_{−∞}^{∞} f(x, y) dy
      h(y) = ∫_{−∞}^{∞} f(x, y) dx

<def> The marginal cumulative functions of the random variables X and Y:
      G(x) = F(x, ∞) = P{X ≤ x, Y < ∞}
      H(y) = F(∞, y) = P{X < ∞, Y ≤ y}

Conditional Probability Distributions:
<def> The conditional probability distribution of X given Y = y is
      f(x | y) = f(x, y) / h(y)   for h(y) > 0.
<def> The conditional probability distribution of Y given X = x is
      f(y | x) = f(x, y) / g(x)   for g(x) > 0.


P(a < X < b | Y = y) = Σ_{x ∈ (a,b)} f(x | y) = Σ_{x ∈ (a,b)} f(x, y)/h(y)   (discrete case)
P(a < X < b | Y = y) = ∫_a^b f(x | y) dx = ∫_a^b f(x, y)/h(y) dx             (continuous case)

• Example: 3 blue, 2 red, and 3 green balls are in a bag. Two balls are selected at
  random. Let X be the number of blue and Y the number of red balls selected.
  Sol:
      f(x, y) = C(3, x) C(2, y) C(3, 2 − x − y) / C(8, 2),
      for x, y = 0, 1, 2 and 0 ≤ x + y ≤ 2.

      g(x) = Σ_{y=0}^{2−x} f(x, y):  g(0) = 5/14, g(1) = 15/28, g(2) = 3/28,  and Σ_{x=0}^{2} g(x) = 1
      h(y) = Σ_{x=0}^{2−y} f(x, y):  h(0) = 15/28, h(1) = 3/7, h(2) = 1/28,  and Σ_{y=0}^{2} h(y) = 1

      f(x | 1) = f(x, 1)/h(1) = (7/3) f(x, 1):  f(0 | 1) = 1/2,  f(1 | 1) = 1/2,  f(2 | 1) = 0
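The joint pmf and both marginals in this example can be enumerated exactly (a sketch with Python's `fractions` and `math.comb`):

```python
from fractions import Fraction
from math import comb

# Joint pmf f(x, y) = C(3,x) C(2,y) C(3, 2-x-y) / C(8,2) from the example.
def f(x, y):
    if 2 - x - y < 0:
        return Fraction(0)   # impossible: more than 2 balls drawn
    return Fraction(comb(3, x) * comb(2, y) * comb(3, 2 - x - y), comb(8, 2))

g = [sum(f(x, y) for y in range(3)) for x in range(3)]   # marginal of X
h = [sum(f(x, y) for x in range(3)) for y in range(3)]   # marginal of Y
assert g == [Fraction(5, 14), Fraction(15, 28), Fraction(3, 28)]
assert h == [Fraction(15, 28), Fraction(3, 7), Fraction(1, 28)]
assert f(0, 1) / h[1] == Fraction(1, 2)    # conditional f(0 | y=1)
```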

 2
(2 x  3 y ) 0  x , y  1
 Example: f ( x , y )   5
 0 elsewhere

  1
11
2 1 2 6
2 3 2
  f ( x , y ) dxdy   ( 2 x  3 y ) dxdy   (  y)dy   y  y   1
  00 5 0 5 5 5 5 0
1
1 2 4 3  4 3
g( x )   (2 x  3 y )dy   xy  y 2   x  , 0 x 1
0 5 5 5  y 0 5 5

1
21
2 6  2 6
h( y)   (2 x  3 y)dx   x 2  xy   y, 0 y 1
0 5 5 5  x 0 5 5

Ch 3-11
Statistics Ch 3: Random Variables and Probability Distributions Page 3-12

2
(2 x  3 y )
f ( x, y) 5 2 x  3y
f ( x| y )   
h( y ) 2 6 1  3y
 y
5 5
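The two marginals can be spot-checked numerically with a midpoint rule (exact up to rounding here, since the integrand is linear in the integrated variable):

```python
# Numeric check of the marginals of f(x, y) = (2/5)(2x + 3y) on the unit square.
def f(x, y):
    return 0.4 * (2 * x + 3 * y) if 0 < x < 1 and 0 < y < 1 else 0.0

N = 2000
step = 1 / N
mid = [(i + 0.5) * step for i in range(N)]   # midpoints of N subintervals of (0, 1)

# Compare against g(x) = (4x + 3)/5 and h(y) = (2 + 6y)/5 from the example.
for x in (0.25, 0.5, 0.75):
    gx = sum(f(x, y) for y in mid) * step
    assert abs(gx - (4 * x + 3) / 5) < 1e-6
for y in (0.25, 0.5, 0.75):
    hy = sum(f(x, y) for x in mid) * step
    assert abs(hy - (2 + 6 * y) / 5) < 1e-6
```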
Statistical Independence:
We know that f(x, y) = f(x | y) h(y) = f(y | x) g(x).
The random variables X and Y are said to be statistically independent if and only
if f(x, y) = g(x) h(y) for all (x, y) within their range.
That is, f(x | y) = g(x) and f(y | x) = h(y).

MULTI-VARIATE DISTRIBUTIONS
The function f(x_1, x_2, …, x_n) is a joint probability function of the random
variables X_1, X_2, …, X_n.

Marginal Distributions:
  g(x_1) = Σ_{x_2} Σ_{x_3} … Σ_{x_n} f(x_1, x_2, …, x_n)              (discrete)
  g(x_1) = ∫…∫ f(x_1, x_2, …, x_n) dx_2 dx_3 … dx_n                   (continuous)

Joint Marginal Distributions:
  g(x_1, x_2) = Σ_{x_3} Σ_{x_4} … Σ_{x_n} f(x_1, x_2, …, x_n)         (discrete)
  g(x_1, x_2) = ∫…∫ f(x_1, x_2, …, x_n) dx_3 dx_4 … dx_n              (continuous)

Joint Conditional Distributions:
  f(x_1, x_2, x_3 | x_4, …, x_n) = f(x_1, x_2, …, x_n) / g(x_4, x_5, …, x_n)

Mutual Statistical Independence:
The random variables X_1, X_2, …, X_n are said to be mutually statistically
independent if and only if f(x_1, x_2, …, x_n) = f_1(x_1) f_2(x_2) … f_n(x_n) for all
(x_1, x_2, …, x_n) within their range. (The f_i(x_i) are the marginal distributions.)
