
PROBABILITY DISTRIBUTION

1. Understanding the concepts of probability


Probability is the study of how likely it is that an event will or will not happen. Before turning to
probability distributions, let us first define a few terms that are used in the definition.

a. Experiment
An experiment is defined as an activity that leads to an outcome. For example, tossing a single coin.

b. Outcome
The result of an experiment is called an outcome. For example, in coin-tossing, the two outcomes are
head and tail.

c. Sample Space
Sample space (S) refers to the collection of all possible outcomes of an experiment. For example, in a
coin-tossing experiment, the sample space contains the possible outcomes head and tail: S = {H, T}

d. Event
An event consists of one or more of the possible outcomes of an experiment. In throwing a die, the
sample space is S = {1, 2, 3, 4, 5, 6}; obtaining the face 1 is an event.

• Equally Likely Events

Events are said to be equally likely if, in a sample space containing at least two events,
each event has the same chance of occurring. For example, in a coin-tossing
experiment, the chance of obtaining a head or a tail in a trial is ½ each.

• Mutually Exclusive Events

Events are said to be mutually exclusive if only one of them can occur at a time; two or more of
them cannot happen together. For example, in a coin-tossing experiment, the outcome is either a
head or a tail. The occurrence of a head rules out the occurrence of a tail and vice versa.
This implies that the two events are mutually exclusive.

2. What is probability?
Probability is a measure of how likely an event is to happen. When all outcomes are equally likely,
it is the number of outcomes favourable to the event divided by the total number of possible outcomes.

For example, while throwing a die, the sample space is S = {1, 2, 3, 4, 5, 6}. There are six possible
outcomes, and they are equally likely. The probability of obtaining the face 2 on throwing a die is
therefore 1/6. Similarly, the probability that a tossed coin shows a head is ½.
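
The two values quoted above (1/6 for a die face and ½ for a coin) can be checked by counting favourable outcomes among equally likely ones. The following is a minimal sketch; the helper name `classical_probability` is an illustrative assumption and the approach only applies when every outcome is equally likely.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """Favourable outcomes / all outcomes; valid only for equally likely outcomes."""
    favourable = sum(1 for outcome in sample_space if outcome in event)
    return Fraction(favourable, len(sample_space))

die = [1, 2, 3, 4, 5, 6]
coin = ["H", "T"]

p_face_two = classical_probability({2}, die)   # face 2 on one throw of a die
p_head = classical_probability({"H"}, coin)    # head on one toss of a coin
print(p_face_two, p_head)                      # prints: 1/6 1/2
```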

The probability of an event A is denoted by P[A]. The value of P[A] lies in the range 0 ≤ P[A] ≤ 1.
If P[A] = 1, then the event A is said to be a sure event.
If P[A] = 0, then the event A is said to be a null (impossible) event.

If A’ denotes the negation (complement) of the event A, then its probability is written P[A’]. Clearly,
0 ≤ P[A’] ≤ 1.

Since exactly one of A and A’ must occur, P[A] + P[A’] = 1. Hence P[A] = 1 – P[A’] and P[A’] = 1 – P[A].
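
As a small illustration of the complement relation, take A to be "a die shows the face 2" (an assumed example reusing the die from the previous section). Exact fractions are used so the identity holds without rounding.

```python
from fractions import Fraction

p_a = Fraction(1, 6)       # A: a die shows the face 2
p_not_a = 1 - p_a          # A': a die shows any other face
print(p_not_a)             # prints: 5/6
```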

3. Probability Distribution
Probability distribution is a listing of all the possible outcomes that can emerge out of an experiment,
along with their respective probabilities of occurrence.

a. Variable
A variable is a symbol (A, B, x, y, etc.) that can take on any of a specified set of values.

b. Random variable
A random variable is a variable that can assume different values depending upon the experiment
outcome. It is usually denoted by X or Y. It can be classified as follows:

• Discrete Random Variable

A discrete random variable is one that can take only certain separate values along an interval. For
example, in throwing a die, the outcome can only be 1, 2, 3, 4, 5 or 6; the values are discrete.

• Continuous Random Variable

A continuous random variable is one that can take any value along a given interval. An example is the
temperature of a location measured at a specific point in time: within its range, the temperature can
take any value.

3.1 Types of probability distribution


Probability distributions can be classified into two types: discrete and continuous probability distributions.

3.1.1 Discrete Probability Distribution


If a random variable X takes n different values, say X1, X2, …, Xn, with respective probabilities
p1, p2, …, pn (where p1 + p2 + … + pn = 1), then the listing of the values Xi together with their
probabilities pi (i = 1, 2, …, n) is called the discrete probability distribution of X.

The same can be represented in the following tabular form:

X        X1    X2    …    Xn
P(X)     p1    p2    …    pn

Example: 1
A fair coin is tossed twice. Let X be the random variable which represents the number of times the heads
come up.

The sample space for this experiment can be defined as S = {TT, TH, HT, HH}; n(S) = 4

Events:
A – Having no head; A = {TT}; n(A) = 1
Therefore, P(A) = n(A)/n(S) = 1/4 = 0.25

B – Having exactly one head; B = {TH, HT}; n(B) = 2
Therefore, P(B) = n(B)/n(S) = 2/4 = 0.50

C – Having exactly two heads; C = {HH}; n(C) = 1
Therefore, P(C) = n(C)/n(S) = 1/4 = 0.25

Therefore, the discrete probability distribution of the random variable X can be given as

X        0       1       2
P(X)     0.25    0.50    0.25

Characteristics of a Discrete Probability Distribution

1. 0 ≤ P(X) ≤ 1 for every value of X.

2. The values of X are exhaustive and mutually exclusive.

3. The sum of their probabilities is one, i.e., ΣP(X) = 1.
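
The two-toss example and the characteristics listed above can be reproduced by enumerating the sample space directly. This is a minimal sketch; the variable names are illustrative, and the fair-coin assumption is what makes every outcome equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for two tosses of a fair coin: HH, HT, TH, TT.
sample_space = ["".join(tosses) for tosses in product("HT", repeat=2)]

# X = number of heads in each outcome; count how many outcomes give each value.
counts = Counter(outcome.count("H") for outcome in sample_space)
distribution = {x: Fraction(c, len(sample_space)) for x, c in sorted(counts.items())}

for x, p in distribution.items():
    print(f"P[X = {x}] = {p} = {float(p):.2f}")

# Characteristics 1 and 3: every probability lies in [0, 1] and they sum to one.
assert all(0 <= p <= 1 for p in distribution.values())
assert sum(distribution.values()) == 1
```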

4. Binomial, Poisson and Normal Distributions

4.1 BINOMIAL DISTRIBUTION

The binomial distribution is a discrete distribution. As the name suggests, it deals with consecutive
trials, each of which can have only two possible outcomes.

4.1.1 Definition
The binomial distribution is given by the following:

P[X = x] = nCx p^x q^(n−x); x = 0, 1, 2, …, n

where n = number of trials;

x = number of successes;

p = the probability of success;

q = the probability of failure [q = 1 – p];

nCx is the number of combinations of selecting x objects out of n objects, given as

$$ {}^{n}C_{x} = \frac{n!}{x!\,(n - x)!} $$
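
The binomial formula P[X = x] = nCx p^x q^(n−x) translates almost literally into code. The sketch below assumes Python's standard `math.comb` for nCx; the function name `binomial_pmf` and the parameter values n = 5, p = 0.5 are illustrative choices, not values from these notes.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P[X = x] = nCx * p^x * q^(n-x), with q = 1 - p."""
    q = 1.0 - p
    return comb(n, x) * p**x * q**(n - x)

# Illustrative values: n = 5 trials with success probability p = 0.5.
table = {x: binomial_pmf(x, 5, 0.5) for x in range(6)}
for x, prob in table.items():
    print(f"P[X = {x}] = {prob:.5f}")
```

Summing the printed probabilities over x = 0, 1, …, n should give one, which is a quick sanity check on any binomial table.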

The same can be expressed in a tabular form:

x           0      1                2                  …    n
P[X = x]    q^n    nC1 p q^(n−1)    nC2 p^2 q^(n−2)    …    p^n

4.1.2 Conditions of Binomial Distribution


1. Trials are independent.
2. Trials are carried out under identical conditions for a fixed number of times.
3. There are only two possible outcomes, success and failure.
4. The probability of success is constant for all trials.

4.1.3 Properties of Binomial Distributions

1. The random variable X takes the values 0, 1, 2, …, n, where n is finite.
2. Mean = np
3. Variance = npq
4. Standard deviation = √(npq)
5. The mode is the value of x for which P[X = x] is maximum.
6. If X and Y are two independent random variables that follow binomial distributions with parameters
[n1, p] and [n2, p] respectively, then [X + Y] also follows a binomial distribution, with parameters
[n1 + n2, p].
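
These properties can be verified numerically from the probabilities themselves. The sketch below uses illustrative parameters (n = 10, p = 0.3, and n1 = 3, n2 = 4 for the additivity check); none of these values come from the notes, and the additivity check assumes X and Y are independent.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.3                      # illustrative parameters
q = 1 - p
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))
mode = max(range(n + 1), key=lambda x: pmf[x])

print(mean, n * p)                  # both close to 3.0
print(variance, n * p * q)          # both close to 2.1
print(sqrt(variance), mode)         # standard deviation and most probable x

# Property 6: the distribution of X + Y (independent, same p) is B(n1 + n2, p).
n1, n2 = 3, 4
summed = [
    sum(binomial_pmf(k, n1, p) * binomial_pmf(s - k, n2, p)
        for k in range(max(0, s - n2), min(n1, s) + 1))
    for s in range(n1 + n2 + 1)
]
direct = [binomial_pmf(s, n1 + n2, p) for s in range(n1 + n2 + 1)]
```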

4.1.4 Example

1. Consider two pea plants whose seeds are either green (G) or yellow (Y). A cross between these two
produces progeny in the ratio 3 Y : 1 G. If 4 such randomly chosen progeny are examined, determine the
probability that (a) 3 are yellow and 1 is green, (b) all 4 are yellow, (c) all 4 have the same colour.

Given:
p = P[the seed of the garden pea is yellow] = 0.75; n = 4; q = 1 – p = 0.25.
By definition, P[X = x] = nCx p^x q^(n−x)

a. P[three are yellow and one is green] = P[X = 3] = 4C3 [0.75]^3 [0.25]^1 = 0.42188.
The chance that three of the peas are yellow and one is green is 42.188%.
b. P[all four are yellow] = P[X = 4] = 4C4 [0.75]^4 = 0.31641.
The chance that all four are yellow is 31.641%.
c. All four have the same colour if they are all yellow [X = 4] or all green [X = 0].
P[X = 0] = 4C0 [0.25]^4 = 0.00391

P[all four are the same colour] = P[X = 4] + P[X = 0] = 0.31641 + 0.00391 ≈ 0.3203

The chance that all four are the same colour is about 32.03%.
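
The three answers of the pea example can be recomputed as a check. This sketch takes yellow as "success" with p = 0.75 and n = 4, as in the example; the variable names are illustrative. Part (c) adds the all-yellow and all-green cases because they cannot happen together.

```python
from math import comb

def binomial_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 4, 0.75                           # 4 progeny, P[yellow] = 0.75
p_three_yellow = binomial_pmf(3, n, p)   # (a) 3 yellow, 1 green
p_all_yellow = binomial_pmf(4, n, p)     # (b) all 4 yellow
p_all_green = binomial_pmf(0, n, p)      # no yellow at all
p_same_colour = p_all_yellow + p_all_green   # (c) mutually exclusive cases

print(f"{p_three_yellow:.5f} {p_all_yellow:.5f} {p_same_colour:.5f}")
```

Values computed at full precision may differ in the last printed digit from a hand calculation that adds already-rounded terms.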

2. A drug successfully treats 90% of cases of diarrhea in children. If 20 children suffering from diarrhea
are to be treated, find the probability that
1. all will be cured.
2. all except one will be cured of diarrhea.
3. exactly 18 will be cured of diarrhea.
4. exactly 90% will be cured of diarrhea.

Consider the children as a sample from the population.

Given:

p = P[the drug cures diarrhea in a child] = 0.9

n = 20; q = 1 – p = 1 – 0.9 = 0.1.

By definition, P[X = x] = nCx p^x q^(n−x)

1. P[all 20 will be cured] = P[X = 20] = 20C20 [0.9]^20 [0.1]^0 = 0.12158.
The chance that all of them will be cured is 12.158%.
2. P[all but one will be cured] = P[X = 19] = 20C19 [0.9]^19 [0.1]^1 = 0.27017.
The chance that all but one will be cured is 27.017%.
3. P[exactly 18 will be cured] = P[X = 18] = 20C18 [0.9]^18 [0.1]^2 = 0.28518.
The chance that exactly 18 will be cured is 28.518%.
4. P[exactly 90% will be cured] = P[exactly 18 will be cured] = P[X = 18] = 20C18 [0.9]^18 [0.1]^2 =
0.28518
[since n = 20; 90% of 20 = 18].
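
As a check on the arithmetic of this example, the sketch below evaluates the same binomial probabilities with n = 20 and p = 0.9. The conversion of "exactly 90%" into a count (x = 18) is done explicitly; the names used are illustrative.

```python
from math import comb

def binomial_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 20, 0.9                            # 20 children, cure probability 0.9
results = {x: binomial_pmf(x, n, p) for x in (20, 19, 18)}
for x, prob in results.items():
    print(f"P[X = {x}] = {prob:.5f}")

# "Exactly 90% cured" corresponds to a whole number of children: 0.9 * 20 = 18.
x_ninety_percent = round(0.9 * n)
p_ninety_percent = binomial_pmf(x_ninety_percent, n, p)
```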

4.2 POISSON DISTRIBUTION

A discrete probability distribution that is applicable to events having an extremely small
probability of occurrence over a given time period is called a Poisson distribution. In other words, it
gives the probability that an event occurs exactly x times in a given time period. It is given as:

$$ P[X = x] = \frac{e^{-\lambda}\,\lambda^{x}}{x!}; \quad x = 0, 1, 2, \ldots $$

where λ is the parameter (expected number of events) and must be a positive constant, and e = 2.71828
[approximately].
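
A minimal sketch of the standard Poisson probability formula, P[X = x] = e^(−λ) λ^x / x!, is given below. The function name `poisson_pmf` and the value λ = 3 are illustrative assumptions. The sum is truncated at x = 49, beyond which the remaining probability is negligible for this λ.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P[X = x] = e^(-lam) * lam^x / x!  for x = 0, 1, 2, ..."""
    return exp(-lam) * lam**x / factorial(x)

# Illustrative parameter: lam = 3 expected events per time period.
probs = [poisson_pmf(x, 3.0) for x in range(50)]
total = sum(probs)
print(total)    # very close to 1
```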

4.2.1 Properties of Poisson Distribution

1. The random variable x assumes the values x = 0, 1, 2, … (no upper bound).
2. Mean = λ, where λ is the parameter of the distribution, i.e., the expected number of events.
3. Variance = λ
4. σ = √λ
5. The value of x corresponding to the maximum probability is taken to be the mode. The distribution
can have one or two modes, depending on the value of λ. If λ is an integer, the two modes are λ – 1
and λ. If λ is not an integer, the whole number lying between λ – 1 and λ is the mode.
6. If x and y are two independent Poisson variates, then their sum [x + y] is also a Poisson variate.
7. In a binomial distribution, if n → ∞ and p becomes small [with np remaining finite], then it tends
to a Poisson distribution.
8. Whenever the value of λ is not given for a Poisson distribution, it can be approximately
evaluated using the relation λ = np, where n is the number of trials [n ≥ 20] and p is the
probability of success [p ≤ 0.05].
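
Properties 5, 7 and 8 can be illustrated numerically. The sketch below compares a binomial distribution with n = 100 and p = 0.02 (illustrative values chosen to satisfy n ≥ 20 and p ≤ 0.05) against a Poisson distribution with λ = np, and finds the mode or modes for an integer and a non-integer λ. The helper `poisson_modes` is a hypothetical name introduced for this example.

```python
from math import comb, exp, factorial, isclose

def binomial_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

# Properties 7 and 8: large n, small p, lam = np.
n, p = 100, 0.02
lam = n * p
max_gap = max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(11))
print(max_gap)    # small: the two distributions nearly coincide

def poisson_modes(lam, upto=50):
    """Return every x in 0..upto-1 whose probability ties for the maximum."""
    probs = [poisson_pmf(x, lam) for x in range(upto)]
    top = max(probs)
    return [x for x, px in enumerate(probs) if isclose(px, top, rel_tol=1e-12)]

print(poisson_modes(2.0))   # integer lam: modes at lam - 1 and lam
print(poisson_modes(2.5))   # non-integer lam: a single mode
```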

4.2.3 Example
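1. Consider a discrete random variable X that is Poisson distributed with λ = 2. Evaluate:

(a) P[X = 0] (b) P[X ≤ 2] (c) P[X > 2].

Given, λ = 2.

By definition,

$$ P[X = x] = \frac{e^{-\lambda}\,\lambda^{x}}{x!} = \frac{e^{-2}\,2^{x}}{x!} $$

a) When x = 0,

P[X = 0] = e^(−2) 2^0 / 0! = e^(−2) = 0.1353

b) When X ≤ 2,

P[X ≤ 2] = P[X = 0] + P[X = 1] + P[X = 2] = e^(−2) [1 + 2 + 2] = 5 e^(−2) = 0.6767

c) When X > 2,

P[X > 2] = P[X = 3] + P[X = 4] + …

We know that the probabilities of all the values of X add up to one, so

P[X > 2] = 1 – P[X ≤ 2] = 1 – 0.6767 = 0.3233

Hence, P[X = 0] = 0.135; P[X ≤ 2] = 0.677 and P[X > 2] = 0.323.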
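
The three quantities of this example can be recomputed with λ = 2. This is a sketch with illustrative names; part (c) uses the complement 1 − P[X ≤ 2] rather than the infinite sum. Carrying e^(−2) at full precision can change the third decimal place compared with a hand calculation that first rounds e^(−2) to 0.135.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

lam = 2
p_zero = poisson_pmf(0, lam)                                # (a) P[X = 0]
p_at_most_two = sum(poisson_pmf(x, lam) for x in range(3))  # (b) P[X <= 2]
p_more_than_two = 1 - p_at_most_two                         # (c) P[X > 2]

print(f"{p_zero:.4f} {p_at_most_two:.4f} {p_more_than_two:.4f}")
```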

2. For a batch of wheat seeds, the probability that a seed does not germinate is 0.1. If the total
number of seeds is 10,000, find σ for the number of non-germinating seeds.
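
A hedged sketch of how σ could be evaluated for this exercise, using the two formulas stated earlier: √(npq) from the binomial properties and √λ with λ = np from the Poisson properties. Which of the two the exercise intends is an assumption either way; note that p = 0.1 lies above the p ≤ 0.05 guideline quoted for the Poisson approximation.

```python
from math import sqrt

n, p = 10_000, 0.1                  # seeds, P[a seed fails to germinate]
q = 1 - p

sigma_binomial = sqrt(n * p * q)    # sqrt(npq)
sigma_poisson = sqrt(n * p)         # sqrt(lambda), with lambda = np

print(f"{sigma_binomial:.2f} {sigma_poisson:.2f}")
```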

4.3 NORMAL DISTRIBUTION

A continuous probability distribution that is expressed as a smooth curve is a normal distribution. The
probabilities are expressed as areas under the curve. It is defined by the probability density function
given below:

$$ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^{2}}{2\sigma^{2}}}; \quad -\infty < x < \infty $$

where μ = mean and σ = standard deviation. The curve representing this function is referred to as the
normal curve.

Different normal distributions are obtained by varying the parameters μ and σ.

4.3.1 Properties of normal distribution


1. Mean = μ
2. Variance = σ^2
3. Mean = median = mode = μ.
4. The curve is bell shaped and symmetrical about the mean. The vertical line at the mean passes
through the peak of the curve and separates the area into two equal parts. Each part is a mirror
image of the other.
5. The total area between the curve and the x-axis is one:

$$ \int_{-\infty}^{\infty} f(x)\,dx = 1 $$

6. The random variable has an infinite theoretical range: −∞ to +∞.
7. The area under the normal curve between x = c and x = d (c < d) gives the probability that x lies
between c and d, i.e., P[c < x < d].
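
Properties 4, 5 and 7 can be checked numerically. The sketch below writes the normal density as f(x) = e^(−(x−μ)²/(2σ²)) / (σ√(2π)), approximates the total area with a simple trapezoidal sum over μ ± 8σ, and obtains P[c < x < d] from the cumulative probability expressed through `math.erf`. The function names, the integration width and the parameters μ = 0, σ = 1 are illustrative assumptions.

```python
from math import erf, exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

def normal_cdf(x, mu, sigma):
    """P[X <= x] for a normal distribution with mean mu and SD sigma."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

def trapezoid_area(mu, sigma, width=8.0, steps=4000):
    """Approximate the area under the curve between mu - width*sigma and mu + width*sigma."""
    a, b = mu - width * sigma, mu + width * sigma
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, mu, sigma) + normal_pdf(b, mu, sigma))
    total += sum(normal_pdf(a + i * h, mu, sigma) for i in range(1, steps))
    return total * h

mu, sigma = 0.0, 1.0
area = trapezoid_area(mu, sigma)                      # property 5: close to 1
left_half = normal_cdf(mu, mu, sigma)                 # property 4: exactly half
within_one_sd = normal_cdf(mu + sigma, mu, sigma) - normal_cdf(mu - sigma, mu, sigma)
print(area, left_half, within_one_sd)
```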

4.3.2 Standardized normal distribution

• The standardized normal variable (Z) is given as:

$$ Z = \frac{X - \mu}{\sigma} $$

• The standardized normal probability density function is given as:

$$ f(Z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}Z^{2}} $$

• Mean = 0
• Standard deviation = 1
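
Standardizing with Z = (X − μ)/σ leaves probabilities unchanged: P[X ≤ x] for the original distribution equals P[Z ≤ z] for the standard one. The sketch below shows this with Python's standard `statistics.NormalDist`; the values μ = 50, σ = 10 and x = 65 are illustrative and do not come from the notes.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardize x: Z = (X - mu) / sigma."""
    return (x - mu) / sigma

mu, sigma, x = 50.0, 10.0, 65.0          # illustrative values
z = z_score(x, mu, sigma)                # 1.5 standard deviations above the mean

p_original = NormalDist(mu, sigma).cdf(x)   # P[X <= 65] on the original scale
p_standard = NormalDist().cdf(z)            # P[Z <= 1.5] on the standard scale
print(z, p_original, p_standard)
```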

4.3.4 Example

1. Calculating proportions of a normal distribution of sucrose concentrations, where μ = 65 mg/100 ml
and σ = 25 mg/100 ml.
a) What proportion of the population is greater than 85 mg/100 ml?
b) What proportion of the population is less than 45 mg/100 ml?
c) What proportion of the population lies between 45 and 85 mg/100 ml?

Given: μ = 65 mg/100 ml; σ = 25 mg/100 ml

a) To find
P[population is greater than 85 mg/100 ml] = P[X > 85]
Given X = 85
We know that Z = [x − μ]/σ = [85 − 65]/25 = 0.8

P[X > 85] = P[Z > 0.8] = 0.2119.

b) P[population is less than 45 mg/100 ml] = P[X < 45]
Given X = 45
We know that Z = [x − μ]/σ = [45 − 65]/25 = −0.8

P[X < 45] = P[Z < −0.8] = 0.2119.

c) P[population lies between 45 and 85 mg/100 ml] = P[45 < X < 85]
= 1 − P[X < 45] − P[X > 85] = 1 − 0.2119 − 0.2119 = 0.5762.
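
The three proportions of the sucrose example can be checked without a Z table. This sketch uses `statistics.NormalDist` from the Python standard library with μ = 65 and σ = 25; the variable names are illustrative, and results may differ from table values in the fourth decimal place.

```python
from statistics import NormalDist

sucrose = NormalDist(mu=65, sigma=25)        # concentrations in mg/100 ml

p_above_85 = 1 - sucrose.cdf(85)             # (a) P[X > 85]
p_below_45 = sucrose.cdf(45)                 # (b) P[X < 45]
p_between = sucrose.cdf(85) - sucrose.cdf(45)    # (c) P[45 < X < 85]

print(f"{p_above_85:.4f} {p_below_45:.4f} {p_between:.4f}")
```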

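2. In a college, the average score on the biology portion was 511, and 21.77% of the students secured
more than 600. Find σ.

Given:

μ = 511; P[X ≥ 600] = 0.2177.

To find σ:

P[X ≥ 600] = P[Z ≥ (600 − 511)/σ] = 0.2177

From the standard normal table, P[Z ≥ 0.78] = 0.2177, so

(600 − 511)/σ = 0.78, i.e., σ = 89/0.78 = 114.1

Hence the required value of the SD is approximately 114.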
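
The value of σ can be recovered by inverting the standard normal distribution instead of reading a table. The sketch below takes the 21.77% stated in the problem as the upper-tail probability, finds the corresponding z with `NormalDist().inv_cdf`, and solves (600 − μ)/σ = z for σ. The variable names are illustrative.

```python
from statistics import NormalDist

mu, cutoff, upper_tail = 511, 600, 0.2177

# z such that P[Z > z] = 0.2177, i.e. P[Z <= z] = 1 - 0.2177.
z = NormalDist().inv_cdf(1 - upper_tail)

# (cutoff - mu) / sigma = z  =>  sigma = (cutoff - mu) / z
sigma = (cutoff - mu) / z
print(f"z = {z:.2f}, sigma = {sigma:.1f}")
```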
