
Probability and Statistics

April 2023

Contents
1 Mathematical or a priori definition

2 Statistical or empirical or a posteriori definition

3 Compound probability

4 Conditional probability

5 Probability distributions
  5.1 Binomial distribution
    5.1.1 Practical examples
    5.1.2 Criteria
    5.1.3 Distribution formula
    5.1.4 Mean, variance and standard deviation
    5.1.5 Historical note
  5.2 Poisson distribution
    5.2.1 Practical examples
    5.2.2 Criteria
    5.2.3 Distribution formula
    5.2.4 Mean, variance and standard deviation
    5.2.5 Deduction from binomial distribution
    5.2.6 Historical note
  5.3 Gaussian (or normal) distribution
    5.3.1 Distribution formula
    5.3.2 Mean, variance and standard deviation
    5.3.3 Deduction from binomial distribution
    5.3.4 Historical note

6 Problems

1 Mathematical or a priori definition
If there are q exhaustive, mutually exclusive and equally likely cases of an experiment, and p of them are favourable to the happening of an event A under a given set of conditions, then the mathematical probability of the event A is given by

$$P(A) = \frac{p}{q} \qquad (1)$$
• A collection of possible outcomes of an experiment is said to be an event; the collection of all possible outcomes is the sample space.

• The term exhaustive ensures that in every trial the event happens either in favour or against, and rules out the possibility of neither happening in any trial.

• The term mutually exclusive is a safeguard against two simultaneous happenings in a trial; e.g., in tossing a coin, the head and tail cannot fall together: the falling of one excludes the other.

• The term equally likely means that no outcome is biased or favoured to occur over the others.

• An event is said to be simple or compound according as it cannot or can be decomposed. The simultaneous occurrence of two or more events in connection with each other is said to be a compound event.

• Two or more events are said to be dependent or independent according as the occurrence of one does or does not affect the occurrence of the other(s). Dependent events are sometimes called contingent.
Alternatively, if the odds in favour of the event A are m to n (or n to m against A), the probability of the event A happening is defined as

$$P(A) = \frac{m}{m+n} \qquad (2)$$
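For concreteness, equations (1) and (2) can be evaluated numerically. The following Python sketch is illustrative only; the function names and example numbers are our own choices:

```python
from fractions import Fraction

def prob_classical(favorable: int, exhaustive: int) -> Fraction:
    """Equation (1): P(A) = p / q, with p favourable out of q equally likely cases."""
    return Fraction(favorable, exhaustive)

def prob_from_odds(m: int, n: int) -> Fraction:
    """Equation (2): odds of m to n in favour of A give P(A) = m / (m + n)."""
    return Fraction(m, m + n)

# Rolling one fair die: 3 of the 6 equally likely faces are even.
print(prob_classical(3, 6))   # 1/2
# Odds of 3 to 2 in favour of an event correspond to probability 3/5.
print(prob_from_odds(3, 2))   # 3/5
```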

2 Statistical or empirical or a posteriori definition


If, in a large number of trials performed under the same set of conditions, the ratio of the number of happenings of an event A to the total number of trials tends to a unique and finite limit as the number of trials tends to infinity, then that limit measures the probability of the happening of the event A.
Thus if, in a large number of trials performed under the same set of conditions, p is the probability of the happening of an event A and q that of its failure, then the probability of its happening in the next trial is p/(p+q); this empirical probability is determined under the assumption that there is no information relative to the happening of the event other than the past trials.
In other words, if an event A happens on pN occasions when a large number N of trials is taken out of a series, then the probability P(A) of the event A is defined as

$$P(A) = \lim_{N \to \infty} \frac{pN}{N} = p \qquad (3)$$

More precisely, if m is the number of times the event A occurs in a series of n trials, then

$$P(A) = \lim_{n \to \infty} \frac{m}{n} \qquad (4)$$
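A minimal simulation sketch of this frequency definition, assuming a fair die and an arbitrarily chosen (large but finite) number of trials:

```python
import random

# Empirical estimate of P(six) for a fair die, following equation (4):
# P(A) = lim m/n, approximated here with a large but finite n.
n = 1_000_000
m = sum(1 for _ in range(n) if random.randint(1, 6) == 6)
print(m / n)  # close to 1/6 ≈ 0.1667 for large n
```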
• If p is the probability of the happening of an event A, i.e. P(A) = p, and q that of its not happening, denoted by P(Ā), then P(Ā) = q = 1 − p, so that P(A) + P(Ā) = 1.

• The probability P(A) of an event A lies between 0 and 1, i.e. 0 ≤ P(A) ≤ 1. The probability of an impossible event is zero, i.e. P(O) = 0, while the probability of a certain event is one, i.e. P(E) = 1.

3 Compound probability

If there are n mutually exclusive events A1 , A2 , ..., An whose probabilities are P (A1 ), P (A2 ), ...,
P (An ), respectively, then the probability that one of them will happen is the sum of their separate
probabilities i.e. P (A1 + A2 + ... + An ) = P (A1 ) + P (A2 ) + ... + P (An )

Proof. Suppose there are N exhaustive, mutually exclusive and equally likely cases, of which m1, m2, ..., mn are favourable to the events A1, A2, ..., An, respectively. Then the total number of cases favourable to either A1 or A2 or ... or An is m1 + m2 + ... + mn, so that the probability of the happening of at least one of these events is

$$P(A_1 + A_2 + \dots + A_n) = \frac{m_1 + m_2 + \dots + m_n}{N} = P(A_1) + P(A_2) + \dots + P(A_n)$$

• If an event A is comprised of n mutually exclusive forms A1, A2, ..., An, i.e. A = A1 + A2 + ... + An, then the probability of A, i.e. P(A), is the sum of the probabilities of A1, A2, ..., An separately, i.e. P(A) = P(A1) + P(A2) + ... + P(An).

• If the n mutually exclusive events are also exhaustive, there is certainty of the happening of at least one of them, i.e. P(A1 + A2 + ... + An) = 1, so that P(A1) + P(A2) + ... + P(An) = 1 (see the numerical sketch below).
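The two statements above are easy to check numerically; a small Python sketch for the mutually exclusive and exhaustive faces of a fair die:

```python
from fractions import Fraction

# Faces of a fair die: the events {1}, {2}, ..., {6} are mutually
# exclusive and exhaustive, so their probabilities must sum to 1.
probs = [Fraction(1, 6)] * 6
assert sum(probs) == 1

# P(odd) = P(1) + P(3) + P(5) by the theorem above.
p_odd = sum(probs[i] for i in (0, 2, 4))
print(p_odd)  # 1/2
```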

Let us clearly define the following notations:


• P (A): Probability for an event A to happen

• P (Ā): Probability for an event A not to happen

• P (A + B): Probability for occurrence of at least one of the events A and B.

• P (AB): Probability for occurrence of both the events A and B.

• P (AB̄): Probability for happening of the event A but not of the event B.

• P (ĀB): Probability for happening of the event B but not of the event A.

• P (ĀB̄): Probability for the happening of neither the event A nor the event B.


If A and B are two events such that AB and AB̄ are two exhaustive and mutually exclusive forms in
which A can occur, then we have
P (A) = P (AB) + P (AB̄) (5)
and similarly
P (B) = P (AB) + P (ĀB) (6)
so that

$$P(A) + P(B) = P(AB) + \left\{ P(AB) + P(A\bar{B}) + P(\bar{A}B) \right\} \qquad (7)$$
But from the theorem of total probability, we can write

P (A + B) = P (AB) + P (AB̄) + P (ĀB) (8)

i.e. the probability that at least one of A and B happens is equal to the sum of the probabilities that both A and B happen, that A happens but B does not, and that B happens but A does not. Thus,

P (A + B) = P (A) + P (B) − P (AB) (9)
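Equation (9) can be verified directly on a small sample space; a sketch using one roll of a fair die (the choice of events A and B is arbitrary):

```python
from fractions import Fraction

# One roll of a fair die. A = "even", B = "greater than 3".
outcomes = range(1, 7)
A = {x for x in outcomes if x % 2 == 0}   # {2, 4, 6}
B = {x for x in outcomes if x > 3}        # {4, 5, 6}

P = lambda E: Fraction(len(E), 6)
# Equation (9): P(A + B) = P(A) + P(B) - P(AB)
assert P(A | B) == P(A) + P(B) - P(A & B)
print(P(A | B))  # 2/3
```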

4 Conditional probability
If there are two events A and B, probabilities of their happening being P (A) and P (B) respectively, then
the probability P (AB) of the simultaneous occurrence of the events A and B is equal to the probability
of A multiplied by the conditional probability of B (i.e. the probability of B when A has occurred) or
the probability of B multiplied by the conditional probability of A i.e.

P (AB) = P (A)P (B|A) = P (B)P (A|B) (10)

where P(B|A) denotes the conditional probability of B given A and P(A|B) that of A given B. If the two events are independent, the theorem of compound probability becomes

P (AB) = P (A)P (B) (11)

Suppose there are N exhaustive, mutually exclusive and equally likely cases, of which m are favourable to A. Let m1 (included in m) be the number of cases favourable to both A and B. Thus

$$P(B|A) = \frac{m_1}{m} \quad \text{and} \quad P(A) = \frac{m}{N} \qquad (12)$$

Now, the probability of the happening of both A and B is given by

$$P(AB) = \frac{m_1}{N} = \frac{m_1}{m} \cdot \frac{m}{N} = P(A)\,P(B|A) \qquad (13)$$
Interchange of A and B in the above equation yields

P (BA) = P (B)P (A|B) (14)

As P(AB) = P(BA), we can write

$$P(A|B) = \frac{P(A)\,P(B|A)}{P(B)} \qquad (15)$$

This is called Bayes' theorem. Now, in case the two events are independent, i.e. the occurrence of one does not affect the other, P(B|A) is the same as P(B) and P(A|B) is the same as P(A), so that we can write

P (AB) = P (A)P (B) (16)
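As a numerical illustration of Bayes' theorem, equation (15), the following sketch uses made-up figures for a diagnostic test; the prevalence, sensitivity and false-positive rate are arbitrary choices, not from the text:

```python
# Bayes' theorem, equation (15): P(A|B) = P(A) P(B|A) / P(B).
# Illustrative numbers: a test that is 99% sensitive and 95% specific,
# for a condition with 1% prevalence.
p_A = 0.01                 # P(A): condition present
p_B_given_A = 0.99         # P(B|A): test positive given condition
p_B_given_notA = 0.05      # P(B|Ā): false-positive rate

# Total probability: P(B) = P(A)P(B|A) + P(Ā)P(B|Ā)
p_B = p_A * p_B_given_A + (1 - p_A) * p_B_given_notA
p_A_given_B = p_A * p_B_given_A / p_B
print(round(p_A_given_B, 4))  # ≈ 0.1667
```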

• If p is the chance that an event will happen in one trial, the chance that it will happen in any assigned succession of r trials is p^r, as we find P(A1A2...Ar) = P(A1)P(A2)...P(Ar) = p·p···p = p^r.
• If p1, p2, ..., pn are the probabilities that n independent events happen, then the probability that all the events fail is (1 − p1)(1 − p2)...(1 − pn). Hence the chance that at least one of these events happens is 1 − (1 − p1)(1 − p2)...(1 − pn).

5 Probability distributions
• We use probability distributions because they work: they fit a great deal of real-world data.
• A random variable is a mathematical rule (or function) that assigns a numerical value to each possible outcome of an experiment in the sample space of interest. There are two different types of random variables: discrete random variables and continuous random variables.

5.1 Binomial distribution


A binomial distribution can be thought of as simply the probability of a SUCCESS or FAILURE outcome in an experiment or survey that is repeated multiple times. The binomial is a type of distribution that has two possible outcomes (the prefix “bi” means two, or twice).

5.1.1 Practical examples
Daily-life examples of the binomial distribution include the following:

• The number of heads/tails in a sequence of coin flips

• The sex of a newborn (male or female)

• The survival of an organism in a region (live or die)

• The number of male/female employees in a company

• The number of defective products in a production run

5.1.2 Criteria
Binomial distributions must also meet the following three criteria:

• The number of observations or trials is fixed. In other words, you can only figure out the probability of something happening if you do it a certain number of times. This is common sense: if you toss a coin once, your probability of getting a tails is 50%. If you toss a coin 20 times, your probability of getting at least one tails is very, very close to 100%.

• Each observation or trial is independent. In other words, none of your trials have an effect on the
probability of the next trial.

• The probability of success (tails, heads, fail or pass) is exactly the same from one trial to another.

5.1.3 Distribution formula

The binomial distribution formula can be written as

$$P(x; n, p) = \binom{n}{x} p^x (1-p)^{n-x} = \binom{n}{x} p^x q^{n-x} \qquad (17)$$

where P is the binomial probability, x is the total number of “successes” (pass or fail, heads or tails etc.), p is the probability of a success on an individual trial (q = 1 − p being the probability of a failure on an individual trial) and n denotes the number of trials.
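Equation (17) is straightforward to evaluate with standard-library tools; a minimal Python sketch (the coin example is our own):

```python
from math import comb

def binom_pmf(x: int, n: int, p: float) -> float:
    """Equation (17): P(x; n, p) = C(n, x) p^x (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Probability of exactly 3 heads in 5 tosses of a fair coin.
print(binom_pmf(3, 5, 0.5))                          # 0.3125
# The PMF sums to 1 over x = 0, ..., n.
print(sum(binom_pmf(x, 5, 0.5) for x in range(6)))   # 1.0
```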

5.1.4 Mean, variance and standard deviation

$$\bar{x} = \sum_{x=0}^{n} \binom{n}{x} p^x q^{n-x}\, x = \binom{n}{0} p^0 q^{n} \cdot 0 + \binom{n}{1} p\, q^{n-1} \cdot 1 + \binom{n}{2} p^2 q^{n-2} \cdot 2 + \dots + \binom{n}{n} p^n q^0 \cdot n \qquad (18)$$

$$\bar{x} = np \left[ q^{n-1} + (n-1)\, p\, q^{n-2} + \dots + p^{n-1} \right] = np\, (q+p)^{n-1} = np \qquad (19)$$

$$\overline{x^2} = \sum_{x=0}^{n} \binom{n}{x} p^x q^{n-x}\, x^2 = \sum_{x=0}^{n} \binom{n}{x} p^x q^{n-x} \left[ x(x-1) + x \right] = n(n-1)p^2 \sum_{x=2}^{n} \binom{n-2}{x-2} p^{x-2} q^{n-x} + np \sum_{x=1}^{n} \binom{n-1}{x-1} p^{x-1} q^{n-x} \qquad (20)$$

$$\overline{x^2} = n(n-1)p^2 (p+q)^{n-2} + np\, (p+q)^{n-1} = n(n-1)p^2 + np \qquad (21)$$

$$\sigma^2 = \overline{x^2} - \bar{x}^2 = n(n-1)p^2 + np - (np)^2 = np(1-p) = npq \qquad (22)$$

$$\Delta = \sqrt{\sigma^2} = \sqrt{npq} \qquad (23)$$
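These results can be verified numerically; the sketch below computes the first two moments of a binomial PMF directly (the values n = 20, p = 0.3 are arbitrary):

```python
from math import comb, sqrt

n, p = 20, 0.3
q = 1 - p
pmf = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

mean = sum(x * pmf[x] for x in range(n + 1))
second_moment = sum(x * x * pmf[x] for x in range(n + 1))
var = second_moment - mean**2

print(mean, n * p)     # both ≈ 6.0, up to floating-point rounding
print(var, n * p * q)  # both ≈ 4.2
print(sqrt(var))       # standard deviation ≈ √(npq) ≈ 2.049
```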

5.1.5 Historical note


The binomial distribution is one of the oldest known probability distributions. It was introduced by Jacob Bernoulli in his work entitled Ars Conjectandi (1713). This work is divided into four parts: in the first, the author comments on the treatise by Huygens; the second part is dedicated to the theory of permutations and combinations; the third is devoted to solving various problems related to games of chance; finally, in the fourth part, he proposes applying probability theory to moral questions and to the science of economics.

5.2 Poisson distribution


When there is a large number of trials but a small probability of success, the binomial calculation becomes impractical (for example, the number of deaths from horse kicks in the Prussian Army in different years).
A Poisson distribution is a tool that helps to predict the probability of certain events happening when you know how often the event occurs on average. It gives us the probability of a given number of events happening in a fixed interval of time.

5.2.1 Practical examples


Daily-life examples of the Poisson distribution include the following:
• The hourly number of customers arriving at a bank
• The daily number of accidents on a particular stretch of highway
• The hourly number of accesses to a particular web server
• The daily number of emergency calls in Dallas
• The number of typos in a book
• The monthly number of employees who had an absence in a large company
• Monthly demands for a particular product

5.2.2 Criteria
Poisson distributions must also meet the following criteria:
• The number of events that occur in any time interval is independent of the number of events in any
other disjoint interval. Here, “time interval” is the standard example of an “exposure variable” and
other interpretations are possible. Example: Error rate per page in a book.
• The distribution of number of events in an interval is the same for all intervals of the same size.
• For a “small” time interval, the probability of observing an event is proportional to the length of
the interval. The proportionality constant corresponds to the “rate” at which events occur.
• The probability of observing two or more events in an interval approaches zero as the interval
becomes smaller.

5.2.3 Distribution formula

The Poisson distribution formula can be written as

$$P(x; \lambda) = \frac{\lambda^x e^{-\lambda}}{x!} \qquad (24)$$

where P is the Poisson probability, x is the random variable (the number of events) and λ = np is the mean number of successes from n trials.
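A minimal sketch evaluating equation (24); the rate λ = 2 (say, typos per page) is an arbitrary illustrative choice:

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """Equation (24): P(x; λ) = λ^x e^(-λ) / x!."""
    return lam**x * exp(-lam) / factorial(x)

# Illustrative rate: on average 2 typos per page.
lam = 2.0
print(poisson_pmf(0, lam))                         # ≈ 0.1353, no typos on a page
print(sum(poisson_pmf(x, lam) for x in range(3)))  # P(x ≤ 2) ≈ 0.6767
```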

5.2.4 Mean, variance and standard deviation

$$\bar{x} = \sum_{x=0}^{\infty} x\, \frac{\lambda^x e^{-\lambda}}{x!} = e^{-\lambda} \sum_{x=1}^{\infty} \frac{\lambda^x}{(x-1)!} = \lambda e^{-\lambda} \left[ 1 + \frac{\lambda}{1!} + \frac{\lambda^2}{2!} + \dots \right] = \lambda \qquad (25)$$

$$\begin{aligned}
\overline{x^2} &= \sum_{x=0}^{\infty} \frac{\lambda^x e^{-\lambda}}{x!}\, x^2 \\
&= \lambda e^{-\lambda} \left[ 1 + \frac{\lambda}{1!} \times 2 + \frac{\lambda^2}{2!} \times 3 + \dots \right] \\
&= \lambda e^{-\lambda} \left[ 1 + \frac{\lambda}{1!}(1+1) + \frac{\lambda^2}{2!}(2+1) + \dots \right] \\
&= \lambda e^{-\lambda} \left[ \left( 1 + \frac{\lambda}{1!} + \frac{\lambda^2}{2!} + \dots \right) + \lambda \left( 1 + \frac{\lambda}{1!} + \frac{\lambda^2}{2!} + \dots \right) \right] \\
&= \lambda e^{-\lambda} \left[ e^{\lambda} + \lambda e^{\lambda} \right] \\
&= \lambda(\lambda + 1)
\end{aligned} \qquad (26)$$

$$\sigma^2 = \overline{x^2} - \bar{x}^2 = \lambda(\lambda+1) - \lambda^2 = \lambda \qquad (27)$$

$$\Delta = \sqrt{\sigma^2} = \sqrt{\lambda} \qquad (28)$$

5.2.5 Deduction from binomial distribution


The binomial distribution formula reads as

$$P(x; n, p) = \binom{n}{x} p^x (1-p)^{n-x} \qquad (29)$$

Let us take the limit n → ∞ and p → 0 in such a way that np = λ remains finite. Under these conditions, we may write

$$\lim_{n \to \infty} \frac{n!}{(n-x)!} = \lim_{n \to \infty} n(n-1)(n-2)\dots(n-x+1) \simeq n^x \qquad (30)$$

Again, using n = λ/p, we have

$$\lim_{\substack{p \to 0 \\ n \to \infty}} (1-p)^{n-x} = \lim_{\substack{p \to 0 \\ n \to \infty}} \frac{(1-p)^n}{(1-p)^x} = \lim_{\substack{p \to 0 \\ n \to \infty}} \frac{(1-p)^{\lambda/p}}{(1-p)^x} = e^{-\lambda} \qquad (31)$$

Thus finally the binomial distribution formula reduces to

$$\lim_{\substack{p \to 0 \\ n \to \infty}} \binom{n}{x} p^x (1-p)^{n-x} = \frac{(np)^x e^{-\lambda}}{x!} = \frac{\lambda^x e^{-\lambda}}{x!} \qquad (32)$$

Let us recall the trick to evaluate the limit

$$y = \lim_{x \to 0} (1-x)^{\frac{1}{x}} \qquad (33)$$

so that

$$\ln y = \lim_{x \to 0} \frac{\ln(1-x)}{x} = \lim_{x \to 0} \frac{1}{x-1} = -1 \quad \Rightarrow \quad y = e^{-1} \qquad (34)$$

It is to be noted that here we have used L'Hôpital's rule.
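This limit can be observed numerically: holding λ = np fixed while n grows, the binomial probability approaches the Poisson value. A sketch with arbitrary choices of λ and x:

```python
from math import comb, exp, factorial

lam, x = 3.0, 2
for n in (10, 100, 1000, 10000):
    p = lam / n                      # keep np = λ fixed as n grows
    binom = comb(n, x) * p**x * (1 - p)**(n - x)
    print(n, binom)                  # converges toward the Poisson value
print("poisson", lam**x * exp(-lam) / factorial(x))  # ≈ 0.2240
```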

5.2.6 Historical note


The distribution was first introduced by Siméon Denis Poisson (1781–1840) and published together with
his probability theory in his work Recherches sur la probabilité des jugements en matière criminelle et en
matière civile (1837). The work theorized about the number of wrongful convictions in a given country by
focusing on certain random variables N that count, among other things, the number of discrete occurrences
(sometimes called “events” or “arrivals”) that take place during a time-interval of given length. The
result had already been given in 1711 by Abraham de Moivre in De Mensura Sortis, seu, de Probabilitate Eventuum in Ludis a Casu Fortuito Pendentibus. This makes it an example of Stigler's law, and it has prompted some authors to argue that the Poisson distribution should bear the name of de Moivre.
In 1860, Simon Newcomb fitted the Poisson distribution to the number of stars found in a unit of space.
A further practical application of this distribution was made by Ladislaus Bortkiewicz in 1898 when he
was given the task of investigating the number of soldiers in the Prussian army killed accidentally by
horse kicks; this experiment introduced the Poisson distribution to the field of reliability engineering.

5.3 Gaussian (or normal) distribution


5.3.1 Distribution formula

A continuous random variable (−∞ < x < +∞) is said to be normally distributed with mean µ and variance σ² if its probability density function is

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{ -\frac{(x-\mu)^2}{2\sigma^2} \right\} \qquad (35)$$

so that the probability distribution formula P(x; µ, σ) within a certain region of x reads as

$$P(x_1 \le x \le x_2;\, \mu, \sigma) = \int_{x_1}^{x_2} f(x)\, dx \qquad (36)$$

For µ = 0 and σ = 1, the probability density function becomes

$$f(x) = \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{x^2}{2} \right) \qquad (37)$$
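Equation (36) can be evaluated in closed form through the error function; a short sketch using the standard-library erf (the one-sigma interval is our illustrative choice):

```python
from math import sqrt, pi, exp, erf

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Equation (35): the normal probability density function."""
    return exp(-((x - mu)**2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

def normal_prob(x1: float, x2: float, mu: float, sigma: float) -> float:
    """Equation (36) evaluated in closed form via the error function."""
    z1 = (x1 - mu) / (sigma * sqrt(2))
    z2 = (x2 - mu) / (sigma * sqrt(2))
    return 0.5 * (erf(z2) - erf(z1))

# For the standard normal (µ = 0, σ = 1), about 68.27% of the
# probability lies within one standard deviation of the mean.
print(normal_prob(-1, 1, 0, 1))  # ≈ 0.6827
```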

5.3.2 Mean, variance and standard deviation


$$\bar{x} = \int_{-\infty}^{+\infty} x f(x)\, dx = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{+\infty} x \exp\left\{ -\frac{(x-\mu)^2}{2\sigma^2} \right\} dx \qquad (38)$$

Taking the variable transformation $z = \frac{x-\mu}{\sqrt{2}\,\sigma}$, we can modify the above integral as

$$\bar{x} = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{+\infty} (\mu + \sqrt{2}\,\sigma z)\, e^{-z^2} (\sqrt{2}\,\sigma)\, dz = \frac{\mu}{\sqrt{\pi}} \int_{-\infty}^{+\infty} e^{-z^2}\, dz = \mu \qquad (39)$$

Here note that the second integral (the term linear in z) vanishes due to the odd nature of the integrand. In a similar fashion, the integral

$$\overline{x^2} = \int_{-\infty}^{+\infty} x^2 f(x)\, dx \qquad (40)$$

reduces to the form

$$\begin{aligned}
\overline{x^2} &= \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{+\infty} (\mu + \sqrt{2}\,\sigma z)^2 e^{-z^2} (\sqrt{2}\,\sigma)\, dz \\
&= \frac{1}{\sqrt{\pi}} \left[ \mu^2 \int_{-\infty}^{+\infty} e^{-z^2}\, dz + 2\sigma^2 \int_{-\infty}^{+\infty} z^2 e^{-z^2}\, dz \right] \\
&= \mu^2 + \sigma^2
\end{aligned} \qquad (41)$$

$$\sigma^2 = \overline{x^2} - \bar{x}^2 = \sigma^2 \qquad (42)$$

$$\Delta = \sqrt{\sigma^2} = \sigma \qquad (43)$$

5.3.3 Deduction from binomial distribution


The binomial distribution formula reads as

$$P(x; n, p) = \binom{n}{x} p^x (1-p)^{n-x} \qquad (44)$$

For n → ∞, we can use Stirling's formula as

$$n! = \sqrt{2\pi}\, e^{-n} n^{n+\frac{1}{2}}; \quad x! = \sqrt{2\pi}\, e^{-x} x^{x+\frac{1}{2}}; \quad (n-x)! = \sqrt{2\pi}\, e^{-(n-x)} (n-x)^{n-x+\frac{1}{2}} \qquad (45)$$

so that

$$\lim_{n \to \infty} \binom{n}{x} p^x q^{n-x} = \lim_{n \to \infty} \frac{1}{B\sqrt{2\pi npq}} \qquad (46)$$

where

$$B = \left( \frac{x}{np} \right)^{x+\frac{1}{2}} \left( \frac{n-x}{nq} \right)^{n-x+\frac{1}{2}} \;\Rightarrow\; \ln B = \left( x + \frac{1}{2} \right) \ln\left( \frac{x}{np} \right) + \left( n - x + \frac{1}{2} \right) \ln\left( \frac{n-x}{nq} \right) \qquad (47)$$

Let us take the variable transformation

$$z = \frac{x - np}{\sqrt{npq}} \qquad (48)$$

which implies

$$\frac{x}{np} = 1 + z\sqrt{\frac{q}{np}}\,; \qquad \frac{n-x}{nq} = 1 - z\sqrt{\frac{p}{nq}} \qquad (49)$$

Substituting in equation (47), we find

$$\ln B = \frac{z}{2\sqrt{n}} \left( \sqrt{\frac{q}{p}} - \sqrt{\frac{p}{q}} \right) + \frac{z^2}{2} - \frac{z^2}{4n} \left( \frac{p}{q} + \frac{q}{p} \right) + O\left( \frac{1}{n^2} \right) \qquad (50)$$

For n → ∞, ln B → z²/2, so that B → e^{z²/2}, and therefore, with µ = np and σ = √(npq),

$$dp = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\, dx \qquad (51)$$
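The approximation derived above can be checked numerically; the sketch below compares the exact binomial probability with the normal density at a point near the mean (the values of n, p and x are arbitrary choices):

```python
from math import comb, sqrt, pi, exp

n, p = 1000, 0.4
q = 1 - p
mu, sigma = n * p, sqrt(n * p * q)   # µ = np, σ = √(npq)

x = 410  # a value near the mean
binom = comb(n, x) * p**x * q**(n - x)
normal = exp(-((x - mu)**2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))
print(binom, normal)  # the two values agree closely, both ≈ 0.0209
```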

5.3.4 Historical note
The normal probability distribution was discovered by Abraham De Moivre in 1733 as a way of approxi-
mating the binomial probability distribution when the number of trials in a given experiment is very large.
In 1774, Laplace studied the mathematical properties of the normal probability distribution. Through
a historical error, the discovery of the normal distribution was attributed to Gauss who first referred to
it in a paper in 1809. In the nineteenth century, many scientists noted that measurement errors in a
given experiment followed a pattern (the normal curve of errors) that was closely approximated by this
probability distribution.

6 Problems
1. An unbiased die is cast twice. Find the probability that the positive difference (bigger minus smaller) between the two numbers is 2.
2. A box contains 100 coins, out of which 99 are fair coins and 1 is a double-headed coin. Suppose you choose a coin at random and toss it 3 times. It turns out that the results of all 3 tosses are heads. What is the probability that the coin you have drawn is the double-headed one?
3. There are on average 20 buses per hour at a point, but at random times. Find the probability that
there are no buses in five minutes.
4. If the distribution function of x is f(x) = x e^{−x/λ} over the interval 0 < x < ∞, find the mean value of x.
5. If two ideal dice are rolled once, what is the probability of getting at least one 6?
6. Find the mean value of the random variable x with probability density

$$p(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ -\frac{(x^2 + \mu x)}{2\sigma^2} \right]$$

7. Suppose that we toss two fair coins hundred times each. Find the probability that the same number
of heads occur for both coins at the end of the experiment.
8. An electronic circuit with 10000 components performs its intended function successfully with a probability 0.99 if there are no faulty components in the circuit. The probability that there is a faulty component is 0.05. If there are faulty components, the circuit performs successfully with a probability 0.3. The probability that the circuit performs successfully is x/10000. What is x?
9. Ten persons named A, B, C, D, E, F, G, H, I, J have come for an interview. They are called one by one before the interview panel at random. What is the probability that C gives the interview before A, and A gives it before F?
10. An unbiased die is thrown three times successively. Find the probability that the numbers of dots on the uppermost surface add up to 16.
11. A ball is picked at random from one of two boxes that contain 2 black, 3 white and 3 black, 4 white balls respectively. What is the probability that it is white?
12. A bag contains many balls, each with a number painted on it. There are exactly n balls which have the number n (namely one ball with 1, two balls with 2, and so on until N). An experiment consists of choosing a ball at random, noting the number on it and returning it to the bag. If the experiment is repeated a large number of times, find the average value of the number.

13. In a series of five cricket matches, one of the captains calls ‘heads’ every time the toss is taken. Find the probability that he wins 3 times and loses 2 times.

14. Two independent random variables m and n, which can take the integer values 0, 1, 2, ..., ∞, follow Poisson distributions with distinct mean values µ and ν respectively. Comment on the probability distribution of the random variable r = (m − n).

15. Let X and Y be two independent random variables, each of which follows a normal distribution with the standard deviation σ, but with means +µ and −µ, respectively. Find the distribution that X + Y follows.

16. The random variable x (−∞ < x < +∞) is distributed according to the normal distribution

$$P(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{x^2}{2\sigma^2}}$$

Find the probability density of the random variable y = x².

17. A random variable n obeys Poisson’s statistics. The probability of finding n = 0 is 10−6 . Find the
expectation value of n.

18. 12 balls, 3 each of the colors red, green, blue and yellow are put in a box and mixed. If three balls
are picked at random, without replacement, find the probability that all three are of the same color.

19. A multiple choice exam has 4 questions, each with 4 answer choices. Every question has only one
correct answer. Find the probability of getting all answers correct by independent random guesses
for each one.

20. A box contains 5 white and 4 black balls. Two balls are picked together at random from the box.
What is the probability these two balls are of different colors?

21. Ten glass vases were to be packed one each in 10 boxes marked glass. Twelve brass vases were to be
packed one each in 12 boxes marked brass. Four vases and boxes got mixed up. A customer orders
1 glass and 1 brass vase and is sent appropriately marked boxes. Find the chance that the customer
does not get the ordered vases in correctly marked boxes.

22. A marksman has four successes in six attempts. What is the probability that he had three consec-
utive successes?

23. A basket consists of an infinite number of red and black balls in the proportion p : (1 − p). Three
balls are drawn at random without replacement. Find the value of p for which the probability of
their being two red and one black is maximum.

24. Find the number of distinct ways of placing four indistinguishable balls into five distinguishable
boxes.

25. Two identical cube-shaped dice, each with faces numbered 1 to 6, are rolled simultaneously. Find the probability that an even number is rolled on each die.
