Dr. Rajeev Saha
Population vs Sample
Population Variance
Sample Variance
Standard Deviation - Population
Population Variance = $\sigma^2$
Standard Deviation - Sample
Sample Variance = $S^2$
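The difference between the two variances can be sketched in Python. This assumes the usual definitions, $\sigma^2 = \frac{1}{N}\sum (x_i - \mu)^2$ for a population and $S^2 = \frac{1}{n-1}\sum (x_i - \bar{x})^2$ for a sample; the data values are hypothetical and chosen only for illustration.

```python
import statistics

# Hypothetical data set, used only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Treating the data as the whole population: divide by N.
pop_var = statistics.pvariance(data)
pop_sd = statistics.pstdev(data)

# Treating the data as a sample from a larger population: divide by n - 1.
samp_var = statistics.variance(data)
samp_sd = statistics.stdev(data)

print("population:", pop_var, pop_sd)
print("sample:    ", samp_var, samp_sd)
```

The sample variance is always the larger of the two for the same data, because it divides the same sum of squares by n − 1 instead of N.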
Random Variable
• A random variable is a function that associates a real
number with each element in the sample space.
• If a sample space contains a finite number of possibilities
or an unending sequence with as many elements as there
are whole numbers, it is called a discrete sample space.
• If a sample space contains an infinite number of
possibilities equal to the number of points on a line
segment, it is called a continuous sample space.
• A random variable is called a discrete random variable if
its set of possible outcomes is countable.
• When a random variable can take on values on a
continuous scale, it is called a continuous random
variable.
Discrete Probability Distributions
• The set of ordered pairs (x, f(x)) is a probability function,
probability mass function, or probability distribution
of the discrete random variable X if, for each possible
outcome x,
1. $f(x) \ge 0$,
2. $\sum_x f(x) = 1$,
3. $P(X = x) = f(x)$.
• Question: A shipment of 20 similar laptop computers to a retail outlet
contains 3 that are defective. If a school makes a random purchase of
2 of these computers, find the probability distribution for the number
of defectives.
• Solution: Let X be a random variable representing defective
computers whose values x are the possible numbers of defective
computers purchased by the school. Then x can only take the
numbers 0, 1, and 2.
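One way to finish this calculation is sketched below. It assumes the purchase is a random draw of 2 computers without replacement, so each probability is a ratio of combinations; the function name `defective_pmf` is illustrative and does not come from the slides.

```python
from math import comb

def defective_pmf(x, total=20, defective=3, drawn=2):
    """P(X = x): x defectives among `drawn` items chosen without replacement."""
    good = total - defective
    return comb(defective, x) * comb(good, drawn - x) / comb(total, drawn)

# Probability distribution of the number of defectives purchased.
distribution = {x: defective_pmf(x) for x in (0, 1, 2)}
for x, p in distribution.items():
    print(x, round(p, 4))
```

As fractions the three values are 68/95, 51/190 and 3/190, which sum to 1 as a probability distribution must.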
Continuous Probability Distributions
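• The function f(x) is a probability density function (pdf)
for the continuous random variable X, defined over the set
of real numbers, if
1. $f(x) \ge 0$ for all $x \in \mathbb{R}$,
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$,
3. $P(a < X < b) = \int_a^b f(x)\,dx$.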
Some Probability Distributions
Discrete Probability Distributions
• Binomial
• Poisson
Continuous Probability Distributions
• Normal
• Exponential
• Chi Square
Normal Distribution
$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty$
where π = 3.14159 . . . and e = 2.71828 . . . .

When μ = 0 and σ = 1,
$f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$
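As a rough check, the normal density $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/(2\sigma^2)}$ can be coded directly and compared with Python's `statistics.NormalDist`. The helper name `normal_pdf` is an illustrative choice, not part of the slides.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density evaluated at x for mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

# Standard normal case (mu = 0, sigma = 1) at the peak of the curve.
print(normal_pdf(0.0))
print(NormalDist(0.0, 1.0).pdf(0.0))  # library value for comparison
```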
The Normal Curve
The area under the curve = 1 and the curve is symmetrical.

Standard Normal Distribution
Normal distribution with μ = 0 and σ = 1
Whenever X assumes a value x, the corresponding value
of Z is given by z = (x − μ)/σ.
Therefore, if X falls between the values x = x1 and x = x2,
the random variable Z will fall between the corresponding
values z1 = (x1 −μ)/σ and z2 = (x2 − μ)/σ.
Figure: The original and transformed normal distributions

Cumulative Distribution Function
$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$

Useful MS Excel worksheet functions

Figure: Area under the normal curve
Cumulative Distribution Curve
z       area
-3.000  0.001
-2.500  0.006
-2.000  0.023
-1.500  0.067
-1.000  0.159
-0.500  0.309
 0.000  0.500
 0.500  0.691
 1.000  0.841
 1.500  0.933
 2.000  0.977
 2.500  0.994
 3.000  0.999
• The daily demand for a product is normally distributed
with a mean of 350 and a standard deviation of 20. What
is the probability that the demand on any day will be less
than 320?
• NORMDIST(320,350,20,1) = 0.0668
Using the N(0,1) Distribution
• When X is normally distributed with a mean of μ and a
standard deviation of σ,
• (X − μ)/σ is normally distributed with a mean of 0 and
an SD of 1
• (x − μ)/σ is denoted by the letter z
• For the demand example, z = (320 − 350)/20 = −1.5
• NORMSDIST(-1.5) gives the same result as NORMDIST(320,350,20,1)
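The equivalence of the two routes can be checked with a short Python sketch. `statistics.NormalDist` is used here as a stand-in for the worksheet functions; that pairing is an assumption about matching behaviour, not a reimplementation of Excel.

```python
from statistics import NormalDist

mu, sigma = 350, 20   # mean and SD of daily demand
x = 320               # demand threshold

# Route 1: use the original distribution directly (like NORMDIST(320,350,20,1)).
p_direct = NormalDist(mu, sigma).cdf(x)

# Route 2: standardise first, then use N(0,1) (like NORMSDIST(-1.5)).
z = (x - mu) / sigma
p_standard = NormalDist().cdf(z)

print(round(p_direct, 4), z, round(p_standard, 4))
```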
Area under the normal curve
z area z area z area z area
-3.000 0.001 -1.400 0.081 0.200 0.579 1.800 0.964
-2.800 0.003 -1.200 0.115 0.400 0.655 2.000 0.977
-2.600 0.005 -1.000 0.159 0.600 0.726 2.200 0.986
-2.400 0.008 -0.800 0.212 0.800 0.788 2.400 0.992
-2.200 0.014 -0.600 0.274 1.000 0.841 2.600 0.995
-2.000 0.023 -0.400 0.345 1.200 0.885 2.800 0.997
-1.800 0.036 -0.200 0.421 1.400 0.919 3.000 0.999
-1.600 0.055 0.000 0.500 1.600 0.945 3.200 0.999
Sampling Distribution of the mean
• The sampling distribution of the mean refers to the
pattern of sample means that will occur as samples are
drawn from the population at large

• Example
• A study is performed to determine the number of
kilometres the average person in India drives a car in one
day. It is not possible to measure the number of
kilometres driven by every person in the population, so
a researcher randomly chooses a sample of 10 people and
records how far they have driven.
• The researcher then randomly samples another 10 drivers in
India and records the same information. This is done a total
of 5 times. The results are displayed in the table below.
• Each sample has its own mean value, and each value is
different
• We can continue this experiment by selecting and
measuring more samples and observing the pattern of
sample means
• This pattern of sample means represents the sampling
distribution for the number of kilometres the average
person in India drives in a day.
• The distribution from this example represents the
sampling distribution of the mean because the mean of
each sample was the measurement of interest
• What happens to the sampling distribution if we
increase the sample size?
• Don’t confuse sample size (n) and the number of
samples. In the previous example, the sample size equals
10 and the number of samples was 5.
The Central Limit Theorem
• What happens to the sampling distribution if we increase
the sample size?
• As the sample size (n) gets larger, the sample means
tend to follow a normal probability distribution
• As the sample size (n) gets larger, the sample means
tend to cluster around the true population mean
• Holds true, regardless of the distribution of the population
from which the sample was drawn
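A small simulation can illustrate these statements. The sketch below assumes a deliberately skewed population (exponential with mean 1) and compares sample means for two sample sizes; the sample sizes, the number of samples and the seed are arbitrary illustrative choices.

```python
import random
import statistics

def sample_means(sample_size, num_samples, rng):
    """Means of `num_samples` samples, each of `sample_size` draws from an
    exponential population with mean 1 (a skewed, non-normal population)."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(num_samples)
    ]

rng = random.Random(0)  # fixed seed so the run is repeatable
small = sample_means(sample_size=5, num_samples=2000, rng=rng)
large = sample_means(sample_size=50, num_samples=2000, rng=rng)

# Both sets of means centre near the population mean of 1,
# but the means from the larger samples are spread far less.
print(statistics.fmean(small), statistics.stdev(small))
print(statistics.fmean(large), statistics.stdev(large))
```

Plotting a histogram of `large` would show a roughly bell-shaped curve even though the population itself is strongly skewed.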
Standard Error of the Mean
• As the sample size increases, the sample means cluster
more tightly around the true population mean
• Therefore, as the sample size increases, the standard
deviation of the sample means decreases
• The standard error of the mean is the standard deviation
of the sample means
• Standard error can be calculated as follows:
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
• In many applications, the true value of σ (the SD of the
population) is unknown
• SE can be estimated using the sample SD:
$\hat{\sigma}_{\bar{x}} = \frac{S}{\sqrt{n}}$
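A minimal sketch of the estimate, assuming the standard error is computed as the sample SD divided by $\sqrt{n}$. The ten distances below are hypothetical values invented for illustration; they are not the study data referred to in the slides.

```python
import math
import statistics

# Hypothetical sample of daily kilometres driven (n = 10), for illustration only.
sample = [32, 45, 28, 51, 39, 42, 36, 47, 30, 40]

n = len(sample)
sample_sd = statistics.stdev(sample)        # S, computed with the n - 1 divisor
standard_error = sample_sd / math.sqrt(n)   # estimated SE = S / sqrt(n)

print(round(sample_sd, 3), round(standard_error, 3))
```

Quadrupling the sample size would halve the standard error, since n enters through its square root.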

Using the Central Limit Theorem
• If we know that the sample means follow the normal
probability distribution
• And we can calculate the mean and standard deviation of
that distribution
• We can predict the likelihood that the sample means will
be more or less than certain values
Exponential Distribution
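• The exponential distribution is used in reliability analysis
to model the time to failure of a component or a system
• The probability density function is given by
$f(x) = \lambda e^{-\lambda x}, \quad x \ge 0$, where λ denotes the average failure rate
• The cumulative distribution function is
$F(k) = P(X < k) = 1 - e^{-\lambda k}$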
• An electronic component has an average life of 500
hours. What is the probability that a component will last
for 600 hours or more?
• X is the life of the component, so λ = 1/500 per hour
$P(X > 600) = 1 - F(600) = e^{-600/500} = e^{-1.2} = 0.301$
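The survival probability can be checked numerically. The sketch assumes the failure rate is the reciprocal of the average life (λ = 1/500 per hour) and that the cumulative distribution is $1 - e^{-\lambda k}$; the function name `exponential_cdf` is illustrative.

```python
import math

mean_life = 500.0        # average life in hours
rate = 1.0 / mean_life   # lambda, failures per hour

def exponential_cdf(k, rate):
    """P(X < k) for an exponential distribution with the given failure rate."""
    return 1.0 - math.exp(-rate * k)

# Probability of lasting 600 hours or more is the complement of the CDF.
p_survive_600 = 1.0 - exponential_cdf(600.0, rate)
print(round(p_survive_600, 3))
```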
Poisson Distribution
• When time between failures has an exponential
distribution, the number of failures in unit time will have a
Poisson Distribution
• Suppose a random variable X has a Poisson distribution
with a mean of λ. The probability that X takes the
value r is given by
$P(r) = \frac{\lambda^r e^{-\lambda}}{r!}$
Poisson Distribution (Cont)
• Suppose a machine breaks down, on average,
once in 10 days. What is the probability that it will
break down four times in a month of 30 days?
• λ = 30/10 = 3 breakdowns per month, r = 4
$P(X = 4) = \frac{\lambda^r e^{-\lambda}}{r!} = \frac{3^4 e^{-3}}{4!} = 0.168$

Worksheet function
POISSON(x, MEAN, CUM) returns Poisson probabilities
Poisson Probabilities (mean = 3)
r    P(r)   P(≤ r)
0    0.050  0.050
1    0.149  0.199
2    0.224  0.423
3    0.224  0.647
4    0.168  0.815
5    0.101  0.916
6    0.050  0.966
7    0.022  0.988
8    0.008  0.996
9    0.003  0.999
10   0.001  1.000
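The tabulated Poisson probabilities can be regenerated with a short Python sketch. The function `poisson` below copies the argument order of the worksheet function for readability; it is an illustrative stand-in and makes no claim to match Excel in every edge case. A mean of 3 is assumed, as in the machine breakdown example.

```python
import math

def poisson(x, mean, cumulative):
    """Poisson probability of exactly x events, or of x or fewer events
    when `cumulative` is true."""
    def pmf(r):
        return mean ** r * math.exp(-mean) / math.factorial(r)
    if cumulative:
        return sum(pmf(r) for r in range(x + 1))
    return pmf(x)

# Individual and cumulative probabilities for 0 to 10 events, mean = 3.
for r in range(11):
    print(r, round(poisson(r, 3, False), 3), round(poisson(r, 3, True), 3))
```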
Binomial Distribution

Probability of r successes in n
trials is given by
$P(r) = {}^{n}C_{r}\, p^{r} (1-p)^{n-r}$
Worksheet function
BINOMDIST(x, n, p, CUM) returns binomial probabilities
• A coin is tossed 10 times. What is the
probability of 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10 heads?
$P(r) = {}^{10}C_{r}\, (0.5)^{r} (1-0.5)^{10-r}$
Probability of r heads when a coin is
tossed 10 times
r    P(r)   P(≤ r)
0    0.001  0.001
1    0.010  0.011
2    0.044  0.055
3    0.117  0.172
4    0.205  0.377
5    0.246  0.623
6    0.205  0.828
7    0.117  0.945
8    0.044  0.989
9    0.010  0.999
10   0.001  1.000
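The coin-toss probabilities can be reproduced from the binomial formula with a few lines of Python. The function `binomdist` borrows the worksheet function's argument order for readability; it is an illustrative sketch, not Excel's implementation.

```python
from math import comb

def binomdist(x, n, p, cumulative):
    """Binomial probability of exactly x successes in n trials, or of
    x or fewer successes when `cumulative` is true."""
    def pmf(r):
        return comb(n, r) * p ** r * (1 - p) ** (n - r)
    if cumulative:
        return sum(pmf(r) for r in range(x + 1))
    return pmf(x)

# Individual and cumulative probabilities of r heads in 10 tosses of a fair coin.
for r in range(11):
    print(r, round(binomdist(r, 10, 0.5, False), 3),
          round(binomdist(r, 10, 0.5, True), 3))
```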
• A lot contains 50 products, of which 5 are
defective. What is the probability that a random
sample of 10 will not contain any defective units?
• n = 10, p = 5/50 = 0.10 (treating each draw as an independent trial)
$P(r) = {}^{10}C_{r}\, (0.1)^{r} (1-0.1)^{10-r}$
$P(0)$ = BINOMDIST(0, 10, 0.1, 0) = 0.3487

Poisson Approximation to Binomial Distribution
• A binomial distribution may be approximated by
a Poisson distribution when n is large and p is
small so that np is a small quantity (< 10)
• n = 10, p = 0.1, np = 1
• P(0) = POISSON(0,1,0) = 0.3679
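To see how close the approximation is for this case, the binomial and Poisson values of P(0) can be computed side by side. The sketch assumes n = 10 and p = 0.1, so the Poisson mean is np = 1.

```python
import math

n, p = 10, 0.1
mean = n * p                   # Poisson mean, np

binomial_p0 = (1 - p) ** n     # P(0) under the binomial model
poisson_p0 = math.exp(-mean)   # P(0) under the Poisson approximation

print(round(binomial_p0, 4), round(poisson_p0, 4))
```

The two values differ by about 0.02 here; the agreement generally improves as n grows and p shrinks with np held fixed.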
Normal Approximation to Binomial Distribution
• A binomial distribution with parameters n and p has:
• Mean of np
• Variance of np(1-p)
• It can be approximated by a normal distribution with the
same mean and variance
An Example
• A machine produces 10% defectives. What is the
probability that a sample of 100 products will show 15
defectives or fewer?
• Binomial distribution with n = 100, p = 0.1
• Mean = 100*0.1 = 10
• Variance = 100*0.1*0.9 = 9
• SD = 3
• Assuming a normal distribution with the same parameters
• Z = (15-10)/3 = 5/3 = 1.67
• P(No. of defectives ≤ 15) = NORMSDIST(5/3) = 0.952
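The normal estimate can be compared with the exact binomial sum in Python. This sketch evaluates the normal CDF at z = 5/3 without rounding z and applies no continuity correction, following the approach in the example; the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, limit = 100, 0.1, 15
mean = n * p                   # np
sd = sqrt(n * p * (1 - p))     # sqrt(np(1 - p))

# Normal approximation: P(X <= 15) estimated by the standard normal CDF at z.
z = (limit - mean) / sd
normal_estimate = NormalDist().cdf(z)

# Exact binomial probability of 15 or fewer defectives, for comparison.
exact = sum(comb(n, r) * p ** r * (1 - p) ** (n - r) for r in range(limit + 1))

print(round(z, 4), round(normal_estimate, 4), round(exact, 4))
```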
• The daily demand for a product is normally distributed
with a mean of 400 and a standard deviation of 20 units.
What is the minimum opening inventory required to meet
the demand with a probability of 90%?
• For a service rate of 90%
• Z = NORMSINV(0.90) = 1.2816
• (X-400)/20 = 1.2816
• X = 425.6, so stock 426 units
• The time to complete an order is normally distributed
with a mean of 100 days and an SD of 10 days. What
delivery time should you quote if the risk of
failing to meet the due date is to be limited to 5%?
• Survival function S = 0.05
• Cumulative probability function Φ = 0.95
• Delivery period = NORMINV(0.95,100,10)
• = 116.45, i.e., about 116.5 days
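Both inverse-normal examples (opening inventory and delivery period) can be checked with `statistics.NormalDist.inv_cdf`, used here as a stand-in for the worksheet inverse functions. Rounding the inventory up to a whole unit is an assumption about how the answer would be applied in practice.

```python
import math
from statistics import NormalDist

# Inventory example: demand ~ N(400, 20), service level 90%.
stock = NormalDist(400, 20).inv_cdf(0.90)
stock_units = math.ceil(stock)   # round up so the 90% level is still met

# Delivery example: completion time ~ N(100, 10), risk of lateness 5%,
# so the quoted period is the 95th percentile.
due_date = NormalDist(100, 10).inv_cdf(1 - 0.05)

print(round(stock, 1), stock_units, round(due_date, 2))
```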
