ST03 ProbDistributions


Probability Distributions

Basic Examples (Chapters 6, 7) & Theory (Chapter 8)

Kyung Sam Park


Professor of LSOM
Korea University Business School
sampark@korea.ac.kr
Contents

• Basic Examples
  • Discrete Probability Distributions
    • Binomial Distribution
    • Poisson Distribution
  • Continuous Probability Distributions
    • Normal Distribution
    • Standard Normal Distribution

• Theory
  • Central Limit Theorem
    • Sample Mean ~ Normal Distribution
  • Student t-distribution
    • Degrees of freedom (df)

Discrete Probability Distributions

• Example: Rolling a single die
• Possible outcomes = random variable X (the number of spots showing)
• Table:

Outcome (X)    Probability (P)
1              1/6
2              1/6
3              1/6
4              1/6
5              1/6
6              1/6
Total          1

• Graph: bar chart of Probability (vertical axis, 0 to 0.3) against Number of Spots (horizontal axis, 1 to 6).
• Probability distribution function (pdf): $P(x) = 1/6$ for every possible value x of the random variable X.

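As a small check on the die example, the table can be written out and summed in code. The sketch below is only an illustration: the mean and variance it computes are not stated on the slide, and the names `pmf`, `mean`, and `variance` are chosen here for convenience.

```python
from fractions import Fraction

# Probability distribution of a fair six-sided die: P(x) = 1/6 for x = 1..6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

total = sum(pmf.values())                                    # probabilities must sum to 1
mean = sum(x * p for x, p in pmf.items())                    # expected number of spots
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())  # spread around the mean

print(total, mean, variance)  # prints: 1 7/2 35/12
```

Exact fractions are used so that the total comes out as exactly 1 rather than a rounded float.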
Discrete Probability Distributions

• Binomial distribution
• Example: n = 5 flights arrive daily from Seoul at Jeju Airport, and the probability that any given flight arrives late is p = 0.2.

Number of late flights (X)    Probability
0                             0.3277
1                             0.4096
2                             0.2048
3                             0.0512
4                             0.0064
5                             0.0003
Total                         1

• $P(0) = {}_5C_0 (0.2)^0 (1 - 0.2)^5 = (1)(1)(0.3277) = 0.3277$
• $P(1) = {}_5C_1 (0.2)^1 (1 - 0.2)^4 = (5)(0.2)(0.4096) = 0.4096$
• Generalization of the binomial distribution: $P(x) = {}_nC_x \, p^x (1 - p)^{n-x}$
• where n = number of trials, p = probability of success, x = value of the random variable.
• Mean: $\mu = np$
• Variance: $\sigma^2 = np(1 - p)$

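The flight table can be reproduced directly from the general formula. This is a minimal sketch using Python's standard library; the function name `binomial_pmf` is a label chosen here, not something from the slides.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Binomial(n, p): nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 5, 0.2  # five daily flights, each late with probability 0.2

# Probabilities for 0..5 late flights, rounded to 4 decimals as in the table.
table = [round(binomial_pmf(x, n, p), 4) for x in range(n + 1)]
mean = n * p                # mu = np
variance = n * p * (1 - p)  # sigma^2 = np(1 - p)

for x, prob in enumerate(table):
    print(x, prob)
print("mean =", mean, "variance =", round(variance, 4))
```

The rounded values should match the six table rows, and the mean and variance follow from the two closing formulas.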
Discrete Probability Distributions

• Poisson distribution
• Assume that the probability that a person dies of lung cancer during one year is 0.001. A life insurance company has 1,000 subscribers and pays $100,000 for each subscriber who dies of lung cancer.
• We know n = 1000 and p = 0.001, so $\mu = np = 1.0$.
• Let X = number of deaths from lung cancer in one year.

X      Probability    Payout ($)
0      0.3679         0
1      0.3679         100,000
2      0.1839         200,000
3      0.0613         300,000
4      0.0153         400,000
5      0.0031         500,000
6      0.0005         600,000
7      0.0001         700,000
8      ≈ 0            800,000
...    ≈ 0            ...

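The probabilities in this table are consistent with the standard Poisson formula $P(x) = \mu^x e^{-\mu}/x!$ evaluated at $\mu = 1$. The sketch below recomputes them; the expected-payout line is an extra step not shown on the slide, and `poisson_pmf` and `payout_per_death` are names introduced here for illustration.

```python
from math import exp, factorial

def poisson_pmf(x: int, mu: float) -> float:
    """P(X = x) for a Poisson random variable with mean mu."""
    return mu**x * exp(-mu) / factorial(x)

n, p = 1000, 0.001          # subscribers and yearly death probability
mu = n * p                  # mean number of deaths, 1.0
payout_per_death = 100_000  # dollars, from the example

# (x, probability rounded to 4 decimals, total payout) for x = 0..8
rows = [(x, round(poisson_pmf(x, mu), 4), x * payout_per_death) for x in range(9)]
for x, prob, payout in rows:
    print(f"{x}  {prob:.4f}  {payout:,}")

# Expected total payout: payout_per_death * E[X] = payout_per_death * mu
expected_payout = payout_per_death * mu
print("expected payout:", expected_payout)
```

Under these assumptions the company should expect to pay about $100,000 per year on average, since one death is expected.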
Discrete Probability Distributions

• Poisson distribution
• $P(x) = \dfrac{\mu^x e^{-\mu}}{x!}$
• where $\mu = np$ is the mean number of successes and $e \approx 2.71828$ is the mathematical constant.
• Mean: $\mu = np$
• Variance: $\sigma^2 = np$
• Examples of Poisson-distributed variables:
  • X = number of deaths from car accidents in a year.
  • Y = number of typos per page in a report (or manuscript).

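Setting $\mu = np$ suggests that the Poisson distribution is being used as an approximation to a binomial with many trials and a small success probability. The following sketch compares the two for the insurance figures (n = 1000, p = 0.001); the helper names are chosen here, and the comparison itself is an illustration rather than part of the slides.

```python
from math import comb, exp, factorial

n, p = 1000, 0.001
mu = n * p  # Poisson mean matched to the binomial mean

def binomial_pmf(x: int) -> float:
    """Exact binomial probability of x successes in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x: int) -> float:
    """Poisson probability of x successes with mean mu."""
    return mu**x * exp(-mu) / factorial(x)

# Largest absolute difference between the two distributions over x = 0..8
max_gap = max(abs(binomial_pmf(x) - poisson_pmf(x)) for x in range(9))

for x in range(4):
    print(x, round(binomial_pmf(x), 4), round(poisson_pmf(x), 4))
print("largest gap:", max_gap)
```

For these values the two sets of probabilities agree to about three decimal places, which is why the simpler Poisson formula is convenient here.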
Continuous Probability Distributions

• Normal distribution
• For a random variable X with mean $\mu$ and standard deviation $\sigma$:

$$P(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad \text{where } \pi \approx 3.14159,\ e \approx 2.718$$

• $X \sim N(\mu, \sigma^2)$

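The density formula can be evaluated numerically to see that it behaves as a probability distribution should. In this sketch the values `mu = 36` and `sigma = 3` are illustrative assumptions, and the area under the curve is approximated with a simple midpoint sum over eight standard deviations on each side.

```python
from math import exp, pi, sqrt

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Normal density: exp(-(x - mu)^2 / (2 sigma^2)) / (sigma * sqrt(2 pi))."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

mu, sigma = 36.0, 3.0  # illustrative values (assumption)

# Midpoint-rule approximation of the area under the curve on [mu - 8s, mu + 8s].
n_steps = 48_000
lo = mu - 8 * sigma
step = 16 * sigma / n_steps
area = sum(normal_pdf(lo + (i + 0.5) * step, mu, sigma) for i in range(n_steps)) * step

peak = normal_pdf(mu, mu, sigma)  # the density is highest at the mean

print("area ~", round(area, 6))
print("peak height:", round(peak, 5))
```

The area comes out very close to 1, and the peak height equals $1/(\sigma\sqrt{2\pi})$, so a larger $\sigma$ gives a lower, wider curve.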
Continuous Probability Distributions

• Normal distribution
Major characteristics of the normal distribution:
1. The normal distribution is bell-shaped; its mean, median, and mode are all equal and lie at the center of the distribution.
2. The distribution is symmetrical about the mean. A vertical line drawn at the mean divides the distribution into two halves with exactly the same shape.
3. It is asymptotic. That is, the tails of the curve approach the X-axis but never actually touch it.
4. A normal distribution is completely described by its mean and standard deviation. If both are known, the distribution can be constructed and its curve drawn.

• There are many different normal distributions, because the mean (center) and the variance (spread, and hence peak height) can both differ.
• How, then, do we calculate probabilities for intervals under a normal distribution?

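Continuous Probability Distributions

• Standard normal distribution
• If $X \sim N(\mu, \sigma^2)$, define $Z = \dfrac{X - \mu}{\sigma}$.
• Then $Z \sim N(0, 1)$, the standard normal distribution (or z-distribution).
• Example: if the battery life data follow $X \sim N(36, 3^2)$, then

$$P\{36 \le X \le 40\} = P\left\{\frac{36 - \mu}{\sigma} \le \frac{X - \mu}{\sigma} \le \frac{40 - \mu}{\sigma}\right\} = P\left\{0 \le Z \le \frac{40 - 36}{3}\right\} = P\{0 \le Z \le 1.33\} = 0.9082 - 0.5 = 0.4082$$

• Figure: standard normal curve with the area 0.4082 shaded between z = 0 and z = 1.33.
• Excel functions:
  • = norm.s.dist(z, TRUE): returns the cumulative probability p = P{Z ≤ z}
  • = norm.s.inv(p): returns the location z such that P{Z ≤ z} = p
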
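For readers working outside Excel, the battery example can be checked with Python's `statistics.NormalDist`, whose `cdf` and `inv_cdf` methods play roughly the roles of the cumulative norm.s.dist and norm.s.inv. This is a sketch, and the variable names are chosen here.

```python
from statistics import NormalDist

battery = NormalDist(mu=36, sigma=3)  # X ~ N(36, 3^2)
std = NormalDist()                    # Z ~ N(0, 1)

# Direct calculation on the original scale.
p_exact = battery.cdf(40) - battery.cdf(36)

# Same calculation after standardizing, with z rounded to 2 decimals as in a z-table.
z = round((40 - 36) / 3, 2)          # 1.33
p_table = std.cdf(z) - std.cdf(0)    # P{0 <= Z <= 1.33}

# inv_cdf reverses cdf: it returns the location for a given cumulative probability.
z_back = std.inv_cdf(std.cdf(z))

print(round(p_exact, 4), round(p_table, 4), round(z_back, 2))
```

The small difference between the two probabilities comes only from rounding z to 1.33 before looking it up.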
Empirical Rule (revisited)

• Proof:
  • $P\{\mu - \sigma \le X \le \mu + \sigma\} = P\{-1 \le Z \le 1\}$
    = norm.s.dist(1, TRUE) - norm.s.dist(-1, TRUE)
    = 0.841 - 0.159 ≈ 68%
  • $P\{\mu - 2\sigma \le X \le \mu + 2\sigma\} = P\{-2 \le Z \le 2\}$ ≈ 95%
  • $P\{\mu - 3\sigma \le X \le \mu + 3\sigma\} = P\{-3 \le Z \le 3\}$ ≈ 99.7%

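The three percentages can be recomputed from the standard normal cumulative distribution. The sketch below uses `statistics.NormalDist` from Python's standard library in place of the Excel function; the dictionary name `coverage` is chosen here.

```python
from statistics import NormalDist

std = NormalDist()  # standard normal, Z ~ N(0, 1)

# P{-k <= Z <= k} for k = 1, 2, 3 standard deviations
coverage = {k: std.cdf(k) - std.cdf(-k) for k in (1, 2, 3)}

for k, prob in coverage.items():
    print(f"within {k} sd: {prob:.4f}")
```

Because the calculation is done on Z, the same percentages hold for any normal distribution, whatever its mean and standard deviation.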
Theory: Definitions

• Population
  • Mean ($\mu$)
  • Standard deviation ($\sigma$)

• Sample: the n observations gathered, where n is called the "sample size."
  • Mean ($\bar{X}$)
  • Standard deviation (s); sample variance: $s^2 = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$

• Central Limit Theorem
  • If samples of a specified size n are selected from any population, the distribution of the sample mean is approximately normal. The approximation improves as the sample size grows.

$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right), \qquad E(\bar{X}) = \mu, \quad V(\bar{X}) = \sigma^2 / n$$

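Illustration: $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$

• Decomposition (or partitioning): Population → Sample → Sample mean ($\bar{X}$)
• Data: 2, 4, 6, 4, 3, 5
  • Mean = (2 + 4 + 6 + 4 + 3 + 5)/6 = 4 = $\mu$
  • Variance = $\sigma^2$
• Sample means: 3, 5, 4
  • Overall mean = (3 + 5 + 4)/3 = 4, equal to $\mu$
• Figure: variance of $\bar{X}$ plotted against sample size n, starting at $\sigma^2$ for n = 1 and decreasing as $\sigma^2/n$.
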
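The illustration can be retraced in a few lines. The sketch assumes that the three sample means come from splitting the data into consecutive pairs (2, 4), (6, 4), (3, 5), which matches the values 3, 5, 4. The last step, which averages over every possible ordered pair drawn with replacement, is an addition made here to show $V(\bar{X}) = \sigma^2/n$ exactly for n = 2.

```python
from itertools import product
from statistics import mean, pvariance

population = [2, 4, 6, 4, 3, 5]

# Assumption: the samples are consecutive pairs of the data.
samples = [population[i:i + 2] for i in range(0, len(population), 2)]
sample_means = [mean(s) for s in samples]  # one mean per sample

mu = mean(population)           # population mean
sigma2 = pvariance(population)  # population variance (divisor N)
overall_mean = mean(sample_means)

# Every ordered sample of size 2 drawn with replacement (36 of them).
all_means = [(a + b) / 2 for a, b in product(population, repeat=2)]
var_of_means = pvariance(all_means)  # should equal sigma2 / 2

print(sample_means, overall_mean, mu)
print(round(sigma2, 4), round(var_of_means, 4))
```

The overall mean of the sample means equals the population mean, and the variance of the sample mean is half the population variance when n = 2.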
Theory: Computer Simulation Results
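The simulation results from this slide are not reproduced here. As a rough stand-in, the sketch below runs a small simulation of the Central Limit Theorem under assumptions chosen purely for illustration: an exponential population with mean 1 and variance 1 (clearly not normal), samples of size n = 30, and 5,000 repetitions with a fixed random seed.

```python
import random
from math import sqrt
from statistics import fmean, pvariance

random.seed(0)   # fixed seed so the run is repeatable
n = 30           # sample size (assumption)
reps = 5_000     # number of simulated samples (assumption)
mu, sigma2 = 1.0, 1.0  # mean and variance of the exponential(1) population

# One sample mean per repetition.
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(reps)
]

mean_of_means = fmean(sample_means)    # should be close to mu
var_of_means = pvariance(sample_means)  # should be close to sigma2 / n

# Share of sample means within 1.96 standard errors of mu (about 95% if normal).
half_width = 1.96 * sqrt(sigma2 / n)
share = sum(abs(m - mu) <= half_width for m in sample_means) / reps

print(round(mean_of_means, 3), round(var_of_means, 4), round(share, 3))
```

Even though individual exponential values are strongly skewed, the sample means cluster around $\mu$ with variance near $\sigma^2/n$, in line with the theorem.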

Standardization

• Standard normal distribution (z-distribution)
  • $\bar{X} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$
  • $Z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$, where $\sigma^2 = \dfrac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$
  • $Z \sim N(0, 1)$: z-distribution

• t-distribution
  • Replacing $\sigma$ with s:
  • $T = \dfrac{\bar{X} - \mu}{s / \sqrt{n}}$, where $s^2 = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$
  • $T \sim t(n - 1)$: t-distribution with n - 1 degrees of freedom.
  • The df determines the shape (height) of the t-distribution.
  • How should the df (degrees of freedom) concept be understood?

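To make the T formula concrete, the sketch below computes it for a small made-up sample. The six data values and the hypothesized mean `mu0 = 36` are hypothetical numbers invented for this illustration; `statistics.stdev` uses the n - 1 divisor, matching the definition of s.

```python
from math import sqrt
from statistics import mean, stdev

sample = [35.2, 37.1, 33.8, 38.4, 36.9, 37.6]  # hypothetical battery lives
mu0 = 36.0                                     # hypothesized population mean (assumption)

n = len(sample)
x_bar = mean(sample)  # sample mean
s = stdev(sample)     # sample standard deviation, divisor n - 1
t_stat = (x_bar - mu0) / (s / sqrt(n))
df = n - 1            # degrees of freedom of the t-distribution

print("x_bar =", round(x_bar, 3), "s =", round(s, 4))
print("T =", round(t_stat, 4), "with df =", df)
```

One common way to read the n - 1: once the sample mean is fixed, only n - 1 of the deviations from it can vary freely, because all n deviations must sum to zero.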
Relationships between Z & T

• The variance of the t-distribution is greater than the variance of the standard normal distribution.
• Excel functions:
  • = t.dist(x, n - 1, TRUE): returns the cumulative probability P{T ≤ x}
  • = t.inv(p, n - 1): returns the location x such that P{T ≤ x} = p

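The claim about the larger variance can be checked by simulation. The sketch below draws many samples of size n = 10 from a standard normal population, computes T for each, and compares the variance of the simulated T values with 1 (the variance of Z) and with df/(df - 2), the known variance of a t-distribution when df > 2. The sample size, repetition count, and seed are arbitrary choices made here.

```python
import random
from math import sqrt
from statistics import pvariance

random.seed(1)  # fixed seed so the run is repeatable
n = 10          # sample size, giving df = 9 (assumption)
reps = 20_000   # number of simulated samples (assumption)

def t_statistic(sample: list[float], mu: float) -> float:
    """T = (x_bar - mu) / (s / sqrt(n)), with s^2 using the n - 1 divisor."""
    size = len(sample)
    x_bar = sum(sample) / size
    s2 = sum((x - x_bar) ** 2 for x in sample) / (size - 1)
    return (x_bar - mu) / sqrt(s2 / size)

t_values = [
    t_statistic([random.gauss(0, 1) for _ in range(n)], 0.0)
    for _ in range(reps)
]

df = n - 1
var_t = pvariance(t_values)  # simulated variance of T
theory = df / (df - 2)       # variance of t(df) for df > 2

print(round(var_t, 3), round(theory, 3))
```

The extra spread comes from estimating $\sigma$ with s; as n grows, s becomes more precise and the t-distribution approaches the standard normal.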
