
Mathematics of Binomial Distribution

Final Year Project


by
Hijab Fatima

CIIT/FA17-BSM-022/ISB

Supervised by
 
Dr. Mansoor Shaukat Khan
Outline

▪ History and Background
▪ Binomial Probability Distribution
▪ Properties of Binomial Distribution
▪ Mean, Variance and Moments
▪ Skewness and Kurtosis
▪ Some basics of R and RStudio
▪ Real-life application of Binomial Distribution
▪ Limitations
▪ Conclusion
▪ References
History and Background
Discovery
•Jacob Bernoulli

Background

• Jacob Bernoulli, in a proof published in 1713, determined that the
probability of k successes in n independent repetitions is equal to the
term in $p^k q^{n-k}$ in the expansion of the binomial expression
$(p + q)^n$, where p is the probability of success and q = 1 - p, hence
the name Binomial Distribution.

Binomial Probability Distribution

• Definition

▪ A binomial distribution gives the probability of each possible number of
SUCCESS outcomes in an experiment or survey that is repeated a fixed number
of times, where each repetition has only two possible outcomes: SUCCESS or
FAILURE.
• Probability Mass Function
▪ $P(X = k) = \binom{n}{k} p^k q^{n-k}$ for $k = 0, 1, \dots, n$, where $q = 1 - p$
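As a rough illustration of the definition, the sketch below evaluates the binomial probabilities by hand, assuming the standard form P(X = k) = choose(n, k) p^k (1 - p)^(n - k), and compares them with R's built-in dbinom(). The helper name binom_pmf and the values n = 10, p = 0.3 are illustrative assumptions, not taken from the slides.

```r
# Hand-computed binomial probabilities; binom_pmf is a hypothetical helper name.
binom_pmf <- function(k, n, p) {
  choose(n, k) * p^k * (1 - p)^(n - k)
}

n <- 10    # number of trials (illustrative value)
p <- 0.3   # probability of success (illustrative value)
k <- 0:n   # every possible number of successes

manual  <- binom_pmf(k, n, p)
builtin <- dbinom(k, size = n, prob = p)

# The two vectors should agree up to floating-point error,
# and the probabilities should sum to 1.
print(all.equal(manual, builtin))
print(sum(manual))
```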
Properties of Binomial Distribution

• The number of observations n is fixed.


• Each observation is independent.
• Each observation represents one of two
outcomes (“success” or “failure”).
• The probability of “success” p is the same for each
observation.
• Mean of Binomial Probability Distribution: $E(X) = np$

• Variance of Binomial Probability Distribution: $\operatorname{Var}(X) = npq$, where $q = 1 - p$


Moments of Binomial Distribution

• Moments about origin.


▪ $\mu_1' = np$
▪ $\mu_2' = n(n-1)p^2 + np$
▪ $\mu_3' = n(n-1)(n-2)p^3 + 3n(n-1)p^2 + np$
▪ $\mu_4' = n(n-1)(n-2)(n-3)p^4 + 6n(n-1)(n-2)p^3 + 7n(n-1)p^2 + np$

• Moments about the mean.


▪ $\mu_1 = 0$
▪ $\mu_2 = npq$
▪ $\mu_3 = npq(1-2p)$
▪ $\mu_4 = npq\,(1 + 3pq(n-2))$
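The moment formulas can be checked numerically. The sketch below sums over the whole distribution with dbinom() and compares the results with the closed forms, taking the first moment about the mean as 0 (deviations from the mean average out). The values n = 12 and p = 0.35 and the helper names are illustrative assumptions.

```r
n <- 12          # number of trials (illustrative value)
p <- 0.35        # probability of success (illustrative value)
q <- 1 - p
x <- 0:n
f <- dbinom(x, size = n, prob = p)

# r-th moment about the origin and about the mean, by direct summation.
# Both helper names are hypothetical.
raw_moment     <- function(r) sum(x^r * f)
central_moment <- function(r) sum((x - n * p)^r * f)

# Closed-form moments about the origin.
raw_formula <- c(
  n * p,
  n * (n - 1) * p^2 + n * p,
  n * (n - 1) * (n - 2) * p^3 + 3 * n * (n - 1) * p^2 + n * p,
  n * (n - 1) * (n - 2) * (n - 3) * p^4 +
    6 * n * (n - 1) * (n - 2) * p^3 + 7 * n * (n - 1) * p^2 + n * p
)

# Closed-form moments about the mean (the first one is 0).
central_formula <- c(
  0,
  n * p * q,
  n * p * q * (1 - 2 * p),
  n * p * q * (1 + 3 * p * q * (n - 2))
)

raw_numeric     <- sapply(1:4, raw_moment)
central_numeric <- sapply(1:4, central_moment)

print(all.equal(raw_numeric, raw_formula))
print(all.equal(central_numeric, central_formula))
```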
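• Skewness
▪ A measure of the asymmetry of the distribution.
▪ The skewness value can be positive, zero, negative, or
undefined.

• Moment coefficient of skewness.

For p ∈ (0, 1), the skewness of X ~ B(n, p) is
$\gamma_1 = \dfrac{\mu_3}{\mu_2^{3/2}} = \dfrac{1 - 2p}{\sqrt{npq}}$

• Kurtosis

▪ A measure of peakedness.

• Moment coefficient of kurtosis.

For p ∈ (0, 1), the kurtosis of X ~ B(n, p) is
$\beta_2 = \dfrac{\mu_4}{\mu_2^2} = 3 + \dfrac{1 - 6pq}{npq}$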


Skewness and Kurtosis Interpretation in Binomial distribution.

Skewness (measure of asymmetry) | Kurtosis (measure of peakedness)
If skewness = 0, then symmetric | If k > 3, then leptokurtic (high peak)
If skewness < 0, then left-skewed | If k < 3, then platykurtic (flat-topped)
If skewness > 0, then right-skewed | If k = 3, then mesokurtic (neither peaked nor flat)
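To make the interpretation concrete, the sketch below evaluates skewness and kurtosis for three values of p, assuming the standard closed forms (1 - 2p)/sqrt(npq) for skewness and 3 + (1 - 6pq)/(npq) for kurtosis. The helper name binom_shape and the choice n = 20 are illustrative assumptions.

```r
# Skewness and kurtosis of a binomial distribution from the closed forms.
# binom_shape is a hypothetical helper name.
binom_shape <- function(n, p) {
  q <- 1 - p
  c(skewness = (1 - 2 * p) / sqrt(n * p * q),
    kurtosis = 3 + (1 - 6 * p * q) / (n * p * q))
}

shape_low  <- binom_shape(20, 0.2)  # p < 1/2: positive skewness (right-skewed)
shape_half <- binom_shape(20, 0.5)  # p = 1/2: zero skewness (symmetric)
shape_high <- binom_shape(20, 0.8)  # p > 1/2: negative skewness (left-skewed)

print(rbind(shape_low, shape_half, shape_high))
```

With these inputs the symmetric case p = 1/2 has kurtosis slightly below 3 (platykurtic), while p = 0.2 and p = 0.8 give values slightly above 3.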
Interpretation through graph
R and R studio
• What is R?
R is a free, open-source programming language and software environment for
statistical computing and graphics, developed at the University of Auckland
in the 1990s.
▪ A Software Environment:
• Statistics
• Graphics
• Programming
• Calculator
• GIS
• Etc…
Binomial Distribution in R

R has a number of built-in functions for calculations involving
probability distributions, both discrete and continuous. For the binomial
distribution, the four functions in use are:
• dbinom
• pbinom
• qbinom
• rbinom
• dbinom

▪ Used to find values of the probability mass function of X, f(x).
▪ dbinom(x, size, prob) is the probability of exactly x successes in size
trials when the probability of success is prob.

• pbinom
▪ Calculates the c.d.f. of the binomial distribution.
▪ pbinom(q, size, prob, lower.tail) is the cumulative probability of at
most q successes when lower.tail = TRUE (left tail), or of more than q
successes when lower.tail = FALSE (right tail).
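A small sketch of how dbinom() and pbinom() relate, using n = 10 fair-coin trials as an assumed example (these values are not from the slides). Note that lower.tail = FALSE gives P(X > q), which excludes q itself.

```r
n <- 10    # number of trials (illustrative value)
p <- 0.5   # probability of success (illustrative value)

# P(X = 3): exactly three successes. Equals 120/1024 = 0.1171875.
p_exact <- dbinom(3, size = n, prob = p)

# P(X <= 3): at most three successes. Equals 176/1024 = 0.171875.
p_lower <- pbinom(3, size = n, prob = p)

# P(X > 3): more than three successes (right tail, excluding 3).
p_upper <- pbinom(3, size = n, prob = p, lower.tail = FALSE)

# The c.d.f. is the running sum of the point probabilities,
# and the two tails together cover every outcome.
print(all.equal(p_lower, sum(dbinom(0:3, size = n, prob = p))))
print(p_lower + p_upper)
```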
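• rbinom
▪ Simulates binomial experiments and returns the results: each returned
value is the number of successes in one series of Bernoulli trials.
▪ The expected syntax is rbinom(n, size, prob), where n is the number of
observations, size is the number of trials per observation, and prob is the
probability of success.

• qbinom
▪ Returns the value of the inverse cumulative distribution function
(the quantile function) of the binomial distribution.
▪ The syntax for using qbinom is as follows:
▪ qbinom(p, size, prob), where p is a cumulative probability.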
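The following sketch shows rbinom() and qbinom() in use. The seed, the sample size of 1000, and the parameters size = 20, prob = 0.3 are arbitrary assumptions chosen for illustration.

```r
set.seed(42)  # arbitrary seed so that the simulation is reproducible

# 1000 simulated experiments, each counting successes in 20 trials.
draws <- rbinom(1000, size = 20, prob = 0.3)
print(mean(draws))  # should be close to n * p = 6

# qbinom returns the smallest value x whose cumulative probability
# P(X <= x) reaches the requested level, so it inverts pbinom.
med <- qbinom(0.5, size = 20, prob = 0.3)
print(pbinom(med, size = 20, prob = 0.3) >= 0.5)
print(pbinom(med - 1, size = 20, prob = 0.3) < 0.5)
```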
So what will all of these be useful for?

• The performance of a machine learning model.
• Number of patients responding to a treatment.
• Think about a hospital emergency station.
• If you are running a web server.
• Number of people who answered ‘yes’ to a survey question.
• Vote counts for a candidate in an election.
• Number of defective products in a production run.
An intuitive real life example of a binomial distribution and
how to simulate it in R

• Let’s use some real-life data to apply our knowledge so far.
• The dataset contains data about horror movies released since 2012. The
question I asked was whether horror movies are more likely to be
released on the 13th of each month.
• Is this significant?

We can do this with the qbinom() function in R. For example, qbinom(0.975, size,
p) returns the cutoff at or below which 97.5% of the probability lies. Our
confidence interval is then the interval

qbinom(0.025, size, p) < Confidence Interval < qbinom(0.975, size, p)

lower <- qbinom(0.025, 2782, 1/30)
75
upper <- qbinom(0.975, 2782, 1/30)
112
75 < Confidence Interval < 112
• Conclusion

• About 95% of the time, a specific day of the month will have between 75
and 112 movie releases. Values above or below this range are unlikely to
occur by random chance alone according to our probability distribution.
• 124 movies were released on the 13th of a month. This value is
above the 97.5th percentile (112), so it is significant.
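The significance check can also be expressed as a tail probability. The sketch below reuses the figures from the slides (2782 trials, probability 1/30, 124 observed releases) and asks how likely a count of 124 or more would be under the binomial model. The variable names are illustrative assumptions.

```r
n_movies <- 2782    # number of trials, as used on the slides
p_day    <- 1 / 30  # per-day probability, as used on the slides
observed <- 124     # releases on the 13th, as stated on the slides

# Central 95% range of counts expected for a single day of the month.
lower <- qbinom(0.025, size = n_movies, prob = p_day)
upper <- qbinom(0.975, size = n_movies, prob = p_day)

# P(X >= 124) = P(X > 123): the chance of a count at least this large.
p_upper_tail <- pbinom(observed - 1, size = n_movies, prob = p_day,
                       lower.tail = FALSE)

print(c(lower = lower, upper = upper))
print(observed > upper)
print(p_upper_tail)
```

A tail probability below 0.025 corresponds to the observed count lying above the upper cutoff.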
Limitations
• Binomial to Poisson Distribution

▪ If p → 0
▪ If n → ∞, with np = λ remaining finite

• Binomial to Normal Distribution


▪ If p → 1/2
▪ If n → ∞
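Both limiting cases can be inspected numerically. The sketch below compares binomial probabilities with a Poisson distribution for large n and small p, and the binomial c.d.f. with a normal c.d.f. for large n and p = 1/2. The parameter values and the continuity correction of 0.5 are assumptions made for illustration.

```r
# Binomial versus Poisson: large n, small p, lambda = n * p.
n_pois <- 1000
p_pois <- 0.003
lambda <- n_pois * p_pois
k <- 0:20
poisson_gap <- max(abs(dbinom(k, size = n_pois, prob = p_pois) -
                       dpois(k, lambda = lambda)))

# Binomial versus normal: large n, p = 1/2.
# The + 0.5 is a continuity correction for the discrete counts.
n_norm <- 1000
p_norm <- 0.5
x <- 0:n_norm
normal_gap <- max(abs(
  pbinom(x, size = n_norm, prob = p_norm) -
    pnorm(x + 0.5, mean = n_norm * p_norm,
          sd = sqrt(n_norm * p_norm * (1 - p_norm)))
))

# Both gaps should be small if the approximations are adequate.
print(poisson_gap)
print(normal_gap)
```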
• Closing Remarks
References
• Bernoulli, Jacques, and Jacob Bernoulli. The art of conjecturing, together
with letter to a friend on sets in court tennis. JHU Press, 2006.
• Peiffer J. Jacob Bernoulli, teacher and rival of his brother Johann. Journal
Électronique d'Histoire des Probabilités et de la Statistique [electronic
only]. 2006 Nov;2(1b):Article-1
• Weisstein EW. Bernoulli distribution. https://mathworld.wolfram.com/.
2002 Jan 20.
• Edwards AW. The meaning of binomial distribution. Nature. 1960 Jun;
186(4730):1074-.
• Feldman, Dorian, and Martin Fox. "Estimation of the parameter n in the
binomial distribution." Journal of the American Statistical
Association 63.321 (1968): 150-158
