Summary Measures
1 3/14/2024
Learning objectives
❑ After completing this chapter, a student will be able to:
Measures of Central Tendency or location
The most commonly used measures of central tendency are:
1. Arithmetic Mean
2. Median
3. Mode
1. The Arithmetic Mean or Simple Mean ($\bar{X}$)
Is the sum of all observations divided by the number of observations.
where
k=the number of observations
Example 1
Solution:
Grouped data
In calculating the mean from grouped data, we assume that
Example 2
Compute the mean age of 169 subjects from the grouped data.
Mean = 5810.5/169 = 34.48 years
Characteristics
✓ The value of the arithmetic mean is determined by every item in the
series.
✓ It is greatly affected by extreme values.
✓ The mean can be used as a summary measure for both discrete and
continuous data, in general however, it is not appropriate for either
nominal or ordinal data.
Advantages
1) It is based on all values given in the distribution.
2) It is most easily understood.
3) It is most amenable to algebraic treatment.
Disadvantages
1) It may be greatly affected by extreme items, and its usefulness as a summary may then be
considerably reduced.
2. Median
Is an alternative measure of central tendency, perhaps second in
popularity to arithmetic mean.
1. Ungrouped data
Suppose that there are n observations in a sample.
If the observations are arranged in increasing or decreasing order, the
median is defined as the middle observation of the data set.
Example
The number of children with asthma during a specific year in a
certain local district clinic is shown below.
Find the median for this data set.
253, 125, 328, 417, 201, 70, 90
Solution:
First we must arrange the data in ascending order
70, 90, 125, 201, 253, 328, 417
Therefore, the fourth observation is the median of the data, i.e. the
value 201 is the median value.
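As a quick check of the example above, the median can be reproduced with Python's standard `statistics` module. This is only a minimal sketch; the variable names are illustrative and are not part of the original slides.

```python
import statistics

# Number of children with asthma, as listed in the example above
asthma_cases = [253, 125, 328, 417, 201, 70, 90]

# Sorting shows the ordered series; statistics.median picks the middle value
ordered = sorted(asthma_cases)
median_cases = statistics.median(asthma_cases)

print(ordered)        # [70, 90, 125, 201, 253, 328, 417]
print(median_cases)   # 201
```

With an even number of observations, `statistics.median` returns the average of the two middle values.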
2. Median for grouped data
$Median = L_{med} + \frac{W}{f_{med}}\left(\frac{n}{2} - f_c\right)$
Where: $L_{med}$ = lower class boundary of the median class, $f_{med}$ = the
frequency of the median class, W = the size (width) of the median class, n =
total number of observations, $f_c$ = the less-than cumulative frequency
of the class preceding the median class.
Note: The median class is the class with the smallest less-than
cumulative frequency that is greater than or equal to n/2.
Example: Find the median for the following distribution.
Solution
We can compute the median value as follows: n = 76, so n/2 = 76/2 = 38.
✓ The less-than cumulative frequencies greater than or equal to n/2 = 38 are 39, 54,
66, 72 and 76.
✓ The smallest of these cumulative frequencies is 39.
✓ So the median class is the third class (49.5-54.5).
Characteristics
➢ It is an average of position.
Disadvantages
1. The median is not as well suited to algebraic treatment as the mean.
2. It is not as generally familiar as the arithmetic mean.
3. Mode
It is the most frequently occurring value.
There may be more than one mode such as when dealing with
Examples
1. Find the mode of 5, 3, 5, 8, and 9 ; Mode = 5
2. Find the mode of 8, 9, 9, 7, 8, 2, 5; Mode =8 and 9
3. Find the mode of 4, 12, 3, 6, and 7; no mode exists.
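The three examples above can be reproduced with `statistics.multimode` from the Python standard library. The helper `modes_or_none` is a hypothetical name introduced here; it follows the convention used in the examples that a series in which every value occurs once has no mode.

```python
from statistics import multimode


def modes_or_none(values):
    """Return the list of modes, or an empty list if no value repeats."""
    # When every value occurs exactly once, the examples treat this as "no mode"
    if len(set(values)) == len(values):
        return []
    # multimode returns all most-frequent values in order of first appearance
    return multimode(values)


print(modes_or_none([5, 3, 5, 8, 9]))        # [5]
print(modes_or_none([8, 9, 9, 7, 8, 2, 5]))  # [8, 9]
print(modes_or_none([4, 12, 3, 6, 7]))       # []
```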
Mode for Grouped data
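The mode for grouped data can be computed using the following formula:
$Mode = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times W$
where L = the lower class boundary of the modal class, $\Delta_1 = f_{mod} - f_1$ (the modal class frequency minus the frequency of the preceding class), $\Delta_2 = f_{mod} - f_2$ (the modal class frequency minus the frequency of the following class), and W = the width of the modal class.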
Example: Calculate the modal age for the age distribution of 228 patients
below.
Solution
By inspection (simply looking at the frequencies), the mode lies in
the fourth class, where L = 29.5, fmod = 57, f1 = 50, f2 = 48, W = 5,
∆1 = 57 − 50 = 7 and ∆2 = 57 − 48 = 9.
Therefore, the modal age is $29.5 + \frac{7}{7 + 9} \times 5 = 29.5 + 2.2 = 31.7$ years.
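The grouped-data mode calculation above can be sketched as a small function. The function name and parameter names are assumptions made for illustration; the numbers are the ones given in the solution.

```python
def grouped_mode(lower_boundary, f_modal, f_before, f_after, width):
    """Mode of grouped data: L + delta1 / (delta1 + delta2) * W."""
    delta1 = f_modal - f_before   # modal frequency minus preceding class frequency
    delta2 = f_modal - f_after    # modal frequency minus following class frequency
    return lower_boundary + delta1 / (delta1 + delta2) * width


modal_age = grouped_mode(lower_boundary=29.5, f_modal=57, f_before=50, f_after=48, width=5)
print(round(modal_age, 1))  # 31.7
```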
Characteristics
Advantages
It is not affected by extreme observations.
Easy to calculate and simple to understand
It can be calculated for distribution with open ended class
Disadvantages
1. It is not suitable for further mathematical treatment.
Measures of non-central tendency
These are quartiles, deciles and percentiles.
1. Quartiles
For grouped data
2. Percentiles
❑ Simply divide the data into 100 pieces
❑ Shows the percentage of values that fall below the particular value in a
✓ Arrange the numbers in ascending order.
$P_i = L + \frac{w}{f_{P_i}}\left(\frac{i n}{100} - CF\right)$, i = 1, 2, ..., 99.
For example: suppose that 50% of a cohort survived at least
4 years.
Example
The marks of 50 students (out of 85) are given below. Based on the data, find
$Q_1$ and $P_7$.
Marks    46-50  51-55  56-60  61-65  66-70  71-75  76-80
$f_i$    4      8      15     5      9      5      4
Solution:
First find the class boundaries (CB) and the cumulative frequency (CF) distribution.
n/4 = 50/4 = 12.5. The CFs ≥ 12.5 are 27, 32, 41, 46 and 50, and the smallest of these is
27, so the quartile class is the third class (55.5-60.5).
$Q_1 = L + \frac{w}{f_{Q_1}}\left(\frac{n}{4} - CF\right) = 55.5 + \frac{5}{15}(12.5 - 12) = 55.7$
For percentiles
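The grouped-data quantile formula used in this example can be sketched in Python as follows. The function `grouped_quantile` and its parameters are hypothetical names; the sketch assumes integer class limits, so that class boundaries lie half a unit outside the limits. The same function serves quartiles and percentiles, because only the cumulative position changes (n/4 for Q1, 7n/100 for P7).

```python
from itertools import accumulate


def grouped_quantile(lower_limits, upper_limits, freqs, position):
    """Quantile of grouped data at a cumulative position (e.g. n/4 for Q1)."""
    cum = list(accumulate(freqs))  # less-than cumulative frequencies
    for i, cf in enumerate(cum):
        if cf >= position:
            # Class boundaries sit half a unit outside the integer class limits
            lower_boundary = lower_limits[i] - 0.5
            width = (upper_limits[i] + 0.5) - lower_boundary
            cf_before = cum[i - 1] if i > 0 else 0
            return lower_boundary + width / freqs[i] * (position - cf_before)
    raise ValueError("position exceeds the total frequency")


lower = [46, 51, 56, 61, 66, 71, 76]
upper = [50, 55, 60, 65, 70, 75, 80]
freq = [4, 8, 15, 5, 9, 5, 4]
n = sum(freq)

q1 = grouped_quantile(lower, upper, freq, n / 4)        # position 12.5
p7 = grouped_quantile(lower, upper, freq, 7 * n / 100)  # position 3.5

print(round(q1, 1))  # 55.7
print(p7)            # 49.875
```

Under these assumptions the 7th percentile falls in the first class (45.5-50.5), since its cumulative frequency of 4 is the smallest one that reaches 3.5.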
1. Calculate 𝑄1 , 𝑄2 , 𝑄3, 𝐷4, 𝑃40 & 𝑃90 for the following data
given on the table below.
x 10 11 12 13 14 15 16 17 18
f 2 8 25 48 65 40 20 9 2
Measures of Dispersion (Variation)
• The scatter or spread of items of a distribution is known as
dispersion or variation.
• In other words the degree to which numerical data tend to
spread about an average value is called dispersion or variation
of the data.
The most commonly used measures of dispersions are:
1) Range and relative range
2) Quartile deviation and coefficient of Quartile deviation
3) Mean deviation and coefficient of Mean deviation
4) Variance
5) Standard deviation and coefficient of variation.
Range
The range is the largest score minus the smallest score.
It is a quick and dirty measure of variability, because it is greatly
affected by extreme scores and depends on only two observations.
R = L − S, where L is the largest and S the smallest observation.
Example: Suppose the first and third quartile for weights of
girls 12 months of age are 8.8 Kg and 10.2 Kg respectively.
The IQR = 10.2 kg − 8.8 kg = 1.4 kg
Variance
Variance is the "average squared deviation from the mean".
A good measure of dispersion should make use of all the data.
For ungrouped data, the population variance is computed as:
For the case of frequency distribution it is expressed as:
A drawback of the variance is that, because the deviations are squared,
its units are also squared. To return to the original unit of
measurement, we take the square root of the variance, which gives the standard deviation.
Example 1
Consider the following three datasets
Special properties of variance
• The main drawback of variance => unit is squared, so it is
difficult to interpret .
• Variance gives more weight to extreme values than to those near the
mean, because the differences are squared.
• Variance will be zero when all the values in the distribution are equal.
• The greater the differences between the values, the greater the variance,
and vice versa.
Why do we use n − 1?
SD Vs. Standard Error (SE)
SD describes the variability among individual values in a given data
set.
SE describes the variability of the sample mean, and is interpreted as the range
(mean ± SE) within which the mean may lie.
o The SD has the advantage of being expressed in the same units
Coefficient of Variation (C.V)
The standard deviation is an absolute measure of deviation of
observations around their mean and is expressed in the same
unit as the data.
$C.V = \frac{S}{\bar{X}} \times 100\%$
When to use coefficient of variation
When two data sets have different units of measurements, or
their means differ sufficiently in size, the CV should be used as
a measure of dispersion.
When different units of measurements are involved, e.g.,
comparison.
Standard score (Z-scores)
❑ It is obtained by subtracting the mean of the data set from
Z-score computed from the population:
$Z = \frac{X - \mu}{\sigma}$
Z-score computed from the sample:
$Z = \frac{X - \bar{X}}{S}$
Example: Suppose that a student scored 66 in Biostatistics and 80 in
Epidemiology. A summary of the scores in the two courses is given below.
Solution:
Z-score of the student in Biostatistics: $Z = \frac{X - \mu}{\sigma} = \frac{66 - 51}{12} = \frac{15}{12} = 1.25$
Z-score of the student in Epidemiology: $Z = \frac{X - \mu}{\sigma} = \frac{80 - 72}{16} = \frac{8}{16} = 0.5$
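The two z-scores in the solution can be checked with a few lines of Python. The function name `z_score` is an illustrative assumption; the means (51 and 72) and standard deviations (12 and 16) are the values used in the solution.

```python
def z_score(x, mean, sd):
    """Standard score: how many standard deviations x lies from the mean."""
    return (x - mean) / sd


z_biostat = z_score(66, mean=51, sd=12)
z_epi = z_score(80, mean=72, sd=16)

print(z_biostat)  # 1.25
print(z_epi)      # 0.5
```

Because 1.25 > 0.5, the student's relative standing is higher in Biostatistics than in Epidemiology, even though the raw Epidemiology score is larger.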
Moments about the mean (central moments)
The rth moment about the mean (the rth central moment) is defined
as
$M_r = \frac{\sum (X_i - \bar{X})^r}{n}$, r = 0, 1, 2, …
For continuous grouped data
$M_r = \frac{\sum f_i (X_i - \bar{X})^r}{n}$
where the $X_i$'s are the class marks.
Example: Find the first three central moments of the
individual series 2, 3 and 7.
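A minimal sketch of the central-moment definition applied to the series 2, 3 and 7 is shown below. The function name `central_moment` is an assumption introduced for illustration.

```python
def central_moment(values, r):
    """r-th moment about the mean: sum((x - mean)**r) / n."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** r for x in values) / n


data = [2, 3, 7]  # mean is 4
m1, m2, m3 = (central_moment(data, r) for r in (1, 2, 3))

print(m1)            # 0.0
print(round(m2, 3))  # 4.667
print(m3)            # 6.0
```

The first central moment is always zero, and the second central moment is the population variance.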
Measure of shape
I. Skewness
II. Kurtosis
1. Skewness
o Measures of central tendency and variation do not reveal the
shape of a frequency distribution.
o The skewness of a distribution is defined as the lack of symmetry.
• For moderately skewed distribution, the following relation holds
among the three commonly used measures of central tendency.
➢ Mean − Mode = 3 × (Mean − Median)
Skewed to the right (positively skewed): Mode < Median < Mean
Negatively (left) skewed: smaller observations are less frequent; Mean < Median < Mode
Measures of Skewness
1. Karl Pearson’s Coefficient of Skewness (SK):
$SK = \frac{\text{Mean} - \text{Mode}}{\text{Standard deviation}}$ or $SK = \frac{3(\text{Mean} - \text{Median})}{\text{Standard deviation}}$
2. Moment Coefficient of Skewness
Moment coefficient of skewness is based on moments. The
formula for calculating the coefficient of skewness is:
$\alpha_3 = \frac{M_3}{M_2^{3/2}} = \frac{M_3}{\sigma^3}$
Kurtosis
o Kurtosis is a measure of peakedness of a distribution.
o The degree of kurtosis of a distribution is measured relative
to the peakedness of a normal curve.
o The peakedness of a distribution can be classified into three:
o Leptokurtic: -
- A distribution having relatively high peak.
- A curve is more peaked than the normal curve .
o Mesokurtic: -
- Normal peak
- The curve is properly peaked
o Platykurtic:
▪ Flat-topped
Measures of kurtosis
The moment coefficient of kurtosis $\alpha_4$:
$\alpha_4 = \frac{M_4}{M_2^2}$
Example:
Based on the following data:
𝑀0 = 1, 𝑀1 = -0.6, 𝑀2 = 1.6, 𝑀3 = -2.4, 𝑀4 = 5.8
a) Find the coefficient of skewness and discuss the distribution
type.
b) Find the coefficient of kurtosis and discuss the distribution type.
Solution
a) $\alpha_3 = \frac{M_3}{M_2^{3/2}} = \frac{-2.4}{1.6^{3/2}} = -1.19 < 0$, so the distribution is negatively
skewed.
b) $\alpha_4 = \frac{M_4}{M_2^2} = \frac{5.8}{1.6^2} = 2.27 < 3$, so the curve is platykurtic.
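The two moment coefficients in this example can be verified numerically. This is a small sketch using the moments given in the example; the variable names are illustrative.

```python
# Central moments given in the example
m2, m3, m4 = 1.6, -2.4, 5.8

alpha3 = m3 / m2 ** 1.5   # moment coefficient of skewness
alpha4 = m4 / m2 ** 2     # moment coefficient of kurtosis

print(round(alpha3, 2))   # -1.19 -> negative, so negatively skewed
print(round(alpha4, 2))   # 2.27  -> below 3, so platykurtic
```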
Which measures to use?
When the distribution of the data is symmetric and unimodal (i.e. the data
are approximately normally distributed), it is usual to summarize the data
using means and standard deviations.
However when the data are skewed, it is preferable to use the median and
quartiles as summary statistics.
Remark:
o The mean and median of symmetric distribution coincide.
o When the distribution is skewed to the right, its mean is larger than its
mode.
o When the distribution is skewed to the left, its mean is smaller than its
mode.
Probability and probability distribution
Objectives
❑ After completing this chapter, learners should be able to :-
Introduction of probability
Many medical decisions are made on a statistical basis since
individuals differ in their reactions to medications or surgery in an
unpredictable way.
Introduction of probability….
▪ Understanding of probability is fundamental for
▪ quantifying the uncertainty in the decision-making process.
▪ drawing conclusions about a population of patients based on
known information about a sample of patients drawn from
that population.
Mutually exclusive events: two events are mutually exclusive if they cannot occur at the same time.
Basic Terms Cont…
• Sample space: The set of all possible outcomes of a statistical
experiment is called the sample space and is represented by the
symbol 'S’.
Example: The sample space for the sex of newborns when two mothers are
in the gynecology ward to give birth is: S= {MM, MF, FM, FF}
• An event: consists of one or more outcomes and is a subset of the sample
space or a collection of sample points.
Example: From the above experiment, an event consisting of at least one
female is E = {MF, FM, FF}
• Random variable: a function that associates a unique
numerical value with every outcome of an experiment.
Basic terms cont.…
Two different broad classes of random variables:
1. A continuous random variable: can take any value in
an interval or collection of intervals.
Basic concepts cont’d…..
Some sample spaces for various probability experiments are.
Probability is a number between 0 and 1 that expresses how likely an event is to occur.
Example: Find the sample space for the gender of the children if a family
has three children. Use B for boy and G for girl.
Solution: There are two genders, male and female, and each child could
be either gender. Hence, there are eight possibilities, as shown here.
S= {BBB, BBG, BGB, GBB, GGG, GGB, GBG, BGG}
Note: all possible outcomes of a probability experiment
(the sample space) can be found:
• by observation and reasoning; or
• by using a tree diagram (a device consisting of line segments emanating from
a starting point and also from the outcome points).
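The sample space for three children can also be enumerated mechanically, which mirrors what a tree diagram does branch by branch. The sketch below uses `itertools.product` from the Python standard library; the variable names are illustrative.

```python
from itertools import product

# Each child is either B (boy) or G (girl); three children give 2 * 2 * 2 outcomes
sample_space = ["".join(outcome) for outcome in product("BG", repeat=3)]

# An event is a subset of the sample space, e.g. "at least one girl"
at_least_one_girl = [s for s in sample_space if "G" in s]

print(sample_space)            # ['BBB', 'BBG', 'BGB', 'BGG', 'GBB', 'GBG', 'GGB', 'GGG']
print(len(at_least_one_girl))  # 7
```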
Tree diagram of the above example
Types of probability
1. Classical (or theoretical) probability
▪ It is used when each outcome in a sample space is equally
likely to occur.
▪ That is if an experiment has 'n' equally likely outcomes, then
each possible outcome must have probability of 1/n to occur
Or, equivalently, the probability of event E is P(E) = n(E)/n(S), the number of outcomes in E divided by the number of outcomes in S.
Example:
• A medical doctor found that, out of 100,000 patients who visited the
hospital, 50 were cancer cases. What is the probability that a patient
to be examined will be positive for cancer?
P(+ve for cancer) = 50/100,000 = 0.0005
Example 2
In a sample of 50 people, 21 had type O blood, 22 had type A blood, 5
had type B blood, and 2 had type AB blood. Set up a frequency
distribution and find the following probabilities
a. A person has type O blood
b. A person has type A or type B blood
c. A person does not have type AB blood
Solution
Blood type Frequency
A 22
B 5
AB 2
O 21
Total 50
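From the frequency distribution above, the three requested probabilities follow as relative frequencies. The sketch below uses exact fractions; the helper `prob` is a hypothetical name introduced for illustration.

```python
from fractions import Fraction

blood_counts = {"A": 22, "B": 5, "AB": 2, "O": 21}
total = sum(blood_counts.values())  # 50 people


def prob(*types):
    """Relative-frequency probability of having any of the given blood types."""
    return Fraction(sum(blood_counts[t] for t in types), total)


p_o = prob("O")            # a. type O
p_a_or_b = prob("A", "B")  # b. type A or type B (mutually exclusive, so counts add)
p_not_ab = 1 - prob("AB")  # c. complement rule

print(p_o, p_a_or_b, p_not_ab)  # 21/50 27/50 24/25
```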
3. Axioms of probability
Let “E” be a random experiment and “S” be a sample space
0 ≤ P (A) ≤1
P (S) =1
P (A') = 1- P (A)
Mutually exclusive events/Disjoint events:
o Two events E1 and E2 are said to be mutually exclusive if no sample
point is common to both events, i.e. E1 ∩ E2 = ∅ (the empty set).
E.g. One die is rolled. Sample space: S = {1, 2, 3, 4, 5, 6}
Let A = the event that an odd number turns up; A = {1, 3, 5}
Let B = the event that a 1, 2 or 3 turns up; B = {1, 2, 3}
A. Find P(A) and P(B)
P(A) = P(1) + P(3) + P(5) = 1/6 + 1/6 + 1/6 = 3/6 = 1/2
P(B) = P(1) + P(2) + P(3) = 1/6 + 1/6 + 1/6 = 3/6 = 1/2
B. Are A and B mutually exclusive?
o A and B are not mutually exclusive, because they have the elements 1 and 3 in
common.
The Venn diagram to show two disjoint events A and B might
look like this one:
Union of events: The union of two events A and B, denoted by A ∪ B,
consists of all outcomes that are in A or in B or in both A and B.
❖ If A and B are any two events, then
▪ P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
❖ If A and B are mutually exclusive, then
▪ P(A ∪ B) = P(A) + P(B)
Example: In a hospital unit there are 8 nurses and 5 physicians; 7 nurses and
3 physicians are females. If a staff person is selected, find the probability that
the subject is a nurse or a male.
Staff        Male   Female   Total
Physician    2      3        5
Nurse        1      7        8
Total        3      10       13
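The addition rule applied to the staff table can be sketched as follows. The counts are taken from the table; the variable names are illustrative.

```python
from fractions import Fraction

total_staff = 13
p_nurse = Fraction(8, total_staff)           # 8 nurses
p_male = Fraction(3, total_staff)            # 3 males
p_nurse_and_male = Fraction(1, total_staff)  # 1 male nurse

# Addition rule: subtract the overlap so the male nurse is not counted twice
p_nurse_or_male = p_nurse + p_male - p_nurse_and_male

print(p_nurse_or_male)  # 10/13
```

The events "nurse" and "male" are not mutually exclusive here, so the simpler rule P(A) + P(B) would overcount.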
Intersection of events
If A and B are events, then the intersection of A and B, denoted by A ∩ B,
represents the event composed of all basic outcomes in both A and B
P(A ∩ B) = P(A)*P(B/A) if the two events are dependent or related
P(A ∩ B) = P(A)*P(B) if the two events are independent
Conditional probability
Conditional probability of A given B means the probability of occurrence
of A when the event B has already happened, and it is defined as
follow:- P(A/B) = P(A ∩ B)/P(B), P(B) ≠ 0
Special case: when both events are independent then,
P(A/B) = P(A), and P(B/A) = P(B),
P(A ∩ B) = P(A)*P(B)
Independent Events
Two events are independent if the occurrence of one of the events
does not in any way affect the probability of the other event.
That is, A and B are independent if :P (B |A) = P (B) or if P (A |B)
= P (A)
Example: Let event A stands for “the sex of the first child from a
mother is female”; and event B stands for “the sex of the second child
from the same mother is female”
Are A and B independent?
Solution
P(B/A) = P(B) = 0.5
The occurrence of A does not affect the probability of B, so the
events are independent.
Example
The following data shows the association between aspirin use and heart
attack.
Table: Data for treatment versus Myocardial Infarction
Myocardial Infarction
Treatment Yes No Total
Placebo 100 500 600
Aspirin 60 900 960
Total 160 1400 1560
Let us define A and B as, positive for Myocardial Infarction and Aspirin
used respectively.
Find;
A. P(A/B), B. P(B/A)
C. Are A and B independent?
Solution:
A. P(A/B) = P(A ∩ B)/P(B) = (60/1560) ÷ (960/1560) = 0.0625
B. P(B/A) = P(B ∩ A)/P(A) = (60/1560) ÷ (160/1560) = 0.375
C. A and B are independent if P(A/B) = P(A), or equivalently P(A ∩ B) = P(A) × P(B).
Here P(A/B) = 0.0625 whereas P(A) = 160/1560 = 0.103. Since P(A/B) ≠ P(A), A and B are not independent.
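The conditional probabilities and the independence check for the aspirin table can be reproduced with exact fractions. The variable names are illustrative; A is "positive for myocardial infarction" and B is "aspirin used", as defined in the example.

```python
from fractions import Fraction

n_total = 1560
p_a = Fraction(160, n_total)       # P(A): myocardial infarction
p_b = Fraction(960, n_total)       # P(B): aspirin used
p_a_and_b = Fraction(60, n_total)  # P(A ∩ B): infarction among aspirin users

p_a_given_b = p_a_and_b / p_b      # P(A/B)
p_b_given_a = p_a_and_b / p_a      # P(B/A)

# Independence holds only if P(A ∩ B) equals P(A) * P(B)
independent = p_a_and_b == p_a * p_b

print(float(p_a_given_b))  # 0.0625
print(float(p_b_given_a))  # 0.375
print(independent)         # False
```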
Counting rules of probability
We have three different counting rules.
✓ Basic multiplication rule
✓ Permutation
✓ Combinations
Multiplication Rule (for counting techniques)
If an operation can be described as a sequence of k steps, and if the number of
ways of completing step 1 is n1, the number of ways of completing step 2 is n2,
and so forth until step k, then the total number of ways
of completing the operation is n1 × n2 × … × nk.
E.g. Assume we have a coin and a die. If we toss the coin first and then
roll the die, how many possible outcomes does the experiment have?
Solution: n1 × n2 = 2 × 6 = 12 possible outcomes.
Permutations
The number of possible permutations is the number of different
orders in which particular events can occur. The number of possible
permutations of r objects taken from n distinct objects is
$_nP_r = \frac{n!}{(n - r)!}$
Combination
The number of ways r objects can be chosen from a set of n objects without
considering the order of selection is called the number of combinations of
n objects taken r at a time, denoted by
$_nC_r = \binom{n}{r} = \frac{n!}{r!(n - r)!}$
E.g. 1. C(8, 2) = ?
2. In a club containing 7 members, a committee of 3 people is to be formed.
In how many ways can the committee be formed?
$_7C_3 = \binom{7}{3} = \frac{7!}{3!(7 - 3)!} = 35$
Example:
Given the letters A, B, C, and D list the permutation and
combination for selecting two letters.
Solution:
Permutations (12): AB, AC, AD, BA, BC, BD, CA, CB, CD, DA, DB, DC
Combinations (6): AB, AC, AD, BC, BD, CD
In the case of combinations AB = BA, but not for permutations, since order matters in a permutation.
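The listing of two-letter permutations and combinations, and the counts in the preceding examples, can be checked with the Python standard library. This is only an illustrative sketch; `math.comb` and `math.perm` require Python 3.8 or later.

```python
from itertools import combinations, permutations
from math import comb, perm

letters = "ABCD"

# Ordered selections of two letters (AB and BA are different)
perms = ["".join(p) for p in permutations(letters, 2)]
# Unordered selections of two letters (AB and BA are the same)
combs = ["".join(c) for c in combinations(letters, 2)]

print(len(perms), perm(4, 2))  # 12 12
print(len(combs), comb(4, 2))  # 6 6
print(comb(8, 2))              # 28
print(comb(7, 3))              # 35
```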
Random variable and Probability Distribution
Definition of random variables and probability distribution
A random variable: is a variable whose values are determined by chance.
1. Discrete random variables: have a finite number of possible values or an
infinite number of values that can be counted.
The word counted means that they can be enumerated using the numbers
1, 2, 3, etc
2. Continuous random variables: variables that can assume all values in the
interval between any two given values.
Examples of discrete random variable:
• Toss a coin n times and count the number of heads.
• Number of experimental rats in specific study.
• Number of defective items in a given company.
• Number of bacteria per two cubic centimeter of water
Examples of continuous random variable:
• Height of students at certain college.
• Mark of students.
• Life time of a certain disease .
• Length of time required to complete a given training
The probability distribution of a discrete random variable is a table, graph,
formula, or other device used to specify all possible values of a random
variable along with their respective probabilities.
✓ Since the values of a probability distribution are probabilities, they must be
numbers in the interval from 0 to 1.
Example: Consider the experiment of tossing a coin three times. Let X be the
number of heads. Construct the probability distribution of X. F (X) = Pr (X =
Xi) , i = 0, 1, 2, 3.
Pr(X = 0) = 1/8 (TTT)
Pr(X = 1) = 3/8 (HTT, THT, TTH)
Pr(X = 2) = 3/8 (HHT, THH, HTH)
Pr(X = 3) = 1/8 (HHH)
X      0    1    2    3
P(X)   1/8  3/8  3/8  1/8
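The distribution of the number of heads in three tosses can be built by enumerating the eight equally likely outcomes. The following is a minimal sketch with illustrative variable names.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2**3 equally likely outcomes of tossing a coin three times
outcomes = list(product("HT", repeat=3))

# Count how many outcomes give each number of heads
head_counts = Counter(outcome.count("H") for outcome in outcomes)

distribution = {
    x: Fraction(count, len(outcomes)) for x, count in sorted(head_counts.items())
}

for x, p in distribution.items():
    print(x, p)  # 0 1/8, 1 3/8, 2 3/8, 3 1/8

print(sum(distribution.values()))  # 1
```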
Example 2:
Construct a probability distribution for rolling a single die.
Solution
Since the sample space is {1, 2, 3, 4, 5, 6} and each outcome has a probability
of 1/6, the distribution is:
X      1    2    3    4    5    6
p(x)   1/6  1/6  1/6  1/6  1/6  1/6
Two requirements for probability distribution
1. The sum of the probabilities of all events in the sample space must be equal
to 1; i.e. $\sum_i P(X = x_i) = 1$
2. The probability of each event must be between 0 and 1; i.e. $0 \le P(X = x_i) \le 1$
Properties of continuous probability distribution
1. $\int_{-\infty}^{\infty} f(x)\,dx = 1$
2. P(a ≤ X ≤ b) = the area under the curve between the points a and b.
3. P(X) ≥ 0
4. P(X = a) = 0
5. P(a < X < b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a ≤ X ≤ b) = $\int_a^b f(x)\,dx$
Introduction to expectation
Definition: the expected value (also known as the mean) of a
random variable is a measure of the central location of the
random variable.
1. Discrete R.V: $E(X) = \sum_{i=1}^{n} X_i P(X_i)$
2. Continuous R.V: $E(X) = \int_a^b x f(x)\,dx$
Variance of a probability distribution
The expected value of X is its mean:
Mean of X = E(X)
The variance of X is given by:
Var(X) = $E(X^2) - (E(X))^2$
where
$E(X^2) = \sum_{i=1}^{n} X_i^2 P(X_i)$ if X is discrete
$E(X^2) = \int_x x^2 f(x)\,dx$ if X is continuous
Example
Let X be a continuous R.V with density
$f(x) = \begin{cases} \frac{1}{2}x, & 0 \le x \le 2 \\ 0, & \text{otherwise} \end{cases}$
Then find
a) P(1 < X < 1.5)
b) E(X)
c) Var(X)
d) E(3X² − 2X)
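One possible worked solution is sketched below, assuming the density is f(x) = x/2 on the interval from 0 to 2 (the inequality signs are not legible in the source) and using the expectation and variance formulas given earlier.

```latex
\begin{aligned}
P(1 < X < 1.5) &= \int_{1}^{1.5} \tfrac{x}{2}\,dx = \left[\tfrac{x^2}{4}\right]_{1}^{1.5} = \tfrac{2.25 - 1}{4} = 0.3125 \\
E(X) &= \int_{0}^{2} x \cdot \tfrac{x}{2}\,dx = \left[\tfrac{x^3}{6}\right]_{0}^{2} = \tfrac{4}{3} \\
E(X^2) &= \int_{0}^{2} x^2 \cdot \tfrac{x}{2}\,dx = \left[\tfrac{x^4}{8}\right]_{0}^{2} = 2 \\
\mathrm{Var}(X) &= E(X^2) - \bigl(E(X)\bigr)^2 = 2 - \tfrac{16}{9} = \tfrac{2}{9} \\
E(3X^2 - 2X) &= 3E(X^2) - 2E(X) = 6 - \tfrac{8}{3} = \tfrac{10}{3}
\end{aligned}
```

Part d) uses the linearity of expectation, so no further integration is needed once E(X) and E(X²) are known.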
Example 2: Two dice are rolled. Let X be a random variable denoting the sum of
the numbers on the two dice.
i) Give the probability distribution of X
ii) Compute the expected value of X and its variance
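For the two-dice example, the probability distribution, mean and variance can be obtained by enumerating all 36 equally likely outcomes. The sketch below is illustrative and uses exact fractions; the variable names are assumptions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how often each sum occurs among the 36 outcomes of two dice
sum_counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# i) probability distribution of X
pmf = {s: Fraction(c, 36) for s, c in sorted(sum_counts.items())}

# ii) E(X) = sum of x * P(x); Var(X) = E(X^2) - (E(X))^2
mean = sum(s * p for s, p in pmf.items())
variance = sum(s**2 * p for s, p in pmf.items()) - mean**2

print(pmf[7])    # 1/6
print(mean)      # 7
print(variance)  # 35/6
```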
Discrete probability distributions …….
✓ Bernoulli
✓ Binomial
✓ Poisson
✓ Negative binomial
✓ Geometric
1. Bernoulli Distribution:
• The random variable X takes two values 1 or 0.
• Ω = {0, 1}, P(X = 1) = p, P(X = 0) = 1 − p
• Then, the probability function is: $P(X = x) = p^x (1 - p)^{1 - x}$, x = 0, 1
– Each trial has only one of the two possible mutually exclusive
outcomes, success or a failure.
– The probability of success does not change from trial to trial, and
Binomial distribution (cont.)
When using the binomial formula to solve problems, we have to identify three
things:
▪ The number of trials (n)
Example: Suppose that an examination consists of six true and false
questions, and assume that a student has no knowledge of the subject
matter. The probability that the student will guess the correct answer to
the first question is 30%. Likewise, the probability of guessing each of the
remaining questions correctly is also 30%.
a) What is the probability of getting exactly three correct
answers?
b) What is the probability of getting at least two correct answers?
c) What is the probability of getting at most two correct answers?
d) What is the probability of getting less than five correct answers?
e) Find expected value and standard deviation?
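The five parts of this example can be worked through with the binomial formula, taking n = 6 and p = 0.3 as stated in the example. The helper `binom_pmf` is a hypothetical name introduced here; it uses only `math.comb` from the standard library.

```python
from math import comb, sqrt


def binom_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 6, 0.3

p_exactly_3 = binom_pmf(3, n, p)                               # a) P(X = 3)
p_at_least_2 = 1 - binom_pmf(0, n, p) - binom_pmf(1, n, p)     # b) P(X >= 2)
p_at_most_2 = sum(binom_pmf(k, n, p) for k in range(3))        # c) P(X <= 2)
p_less_than_5 = sum(binom_pmf(k, n, p) for k in range(5))      # d) P(X < 5)
expected = n * p                                               # e) mean = np
std_dev = sqrt(n * p * (1 - p))                                # e) sd = sqrt(npq)

for value in (p_exactly_3, p_at_least_2, p_at_most_2, p_less_than_5, expected, std_dev):
    print(round(value, 4))
```

Parts b) and d) are easiest through the complement rule, which avoids summing many terms.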
Exercise
1. Suppose 14 percent of mothers admitted to smoking one or more
cigarettes per day during pregnancy. If a random sample of size 10 is
selected from this population, what is the probability that it will contain
exactly four mothers who admitted to smoking during pregnancy?
2. Suppose that 80% of adults with allergies report symptomatic relief with a
specific medication. If the medication is given to 10 new patients with
allergies, what is the probability that it is effective in exactly seven? Assume
that the replications are independent.
3. Poisson distribution
• The probability distribution of a Poisson random variable X,
representing the number of successes occurring in a given time interval
or a specified region of space, is given by the formula:
$P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}$, k = 0, 1, 2, …
Where
• k = the number of successes per unit time
• λ = the mean number of successes per unit time
✓ Accidents.
✓ Hereditary.
✓ Arrivals
The Poisson distribution differs from the binomial distribution in these
fundamental ways:
Example:
In a study of drug-induced anaphylaxis among patients taking rocuronium
bromide as part of their anesthesia, the occurrence of anaphylaxis followed a
Poisson distribution with λ =12 incidents per year in Norway. Find the
probability that in the next year, among patients receiving rocuronium,
a. exactly three will experience anaphylaxis.
b. At least two will experience anaphylaxis
123 3/14/2024
Solution
𝜆 = 12 incidents per year
𝑒 −12 ∗123
a. P(x=3)= = 0.00177
3!
=0.9
124 3/14/2024
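The Poisson probabilities in this example can be checked numerically. The helper `poisson_pmf` is a hypothetical name; the rate of 12 incidents per year is the value given in the example.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**k / factorial(k)


lam = 12  # incidents per year

p_exactly_3 = poisson_pmf(3, lam)
# "At least two" is the complement of "zero or one"
p_at_least_2 = 1 - poisson_pmf(0, lam) - poisson_pmf(1, lam)

print(round(p_exactly_3, 5))   # 0.00177
print(round(p_at_least_2, 4))  # 0.9999
```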
Exercise
In a certain population an average of 13 new cases of esophageal cancer are
diagnosed each year. If the annual incidence of esophageal cancer follows a
Poisson distribution, find the probability that in a given year the number of
newly diagnosed cases of esophageal cancer will be:
A. Exactly 10 cases
B. At least three cases
C. No more than 3
D. Between nine and 12, inclusive
E. Fewer than two
4. Negative Binomial Distribution
Consider a Bernoulli trial with two outcomes: S and F.
Repeat this trial identically and independently.
Count the number of trials until the kth success is observed.
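If repeated independent trials can result in a success with probability p
and a failure with probability q = 1 − p, then the number of trials X needed
to observe the kth success has a negative binomial
distribution with discrete probability function given by
$P(X = x) = \binom{x - 1}{k - 1} p^k q^{x - k}$, x = k, k + 1, …
For k = 1 (the number of trials until the first success) we have that
$\mu = \frac{1}{p}$ and $\sigma^2 = \frac{q}{p^2}$
Example:
Tossing a balanced coin until a head appears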
Continuous probability distributions
• If a random variable is a continuous variable, its probability
distribution is called a continuous probability distribution
• A continuous probability distribution differs from a discrete
probability distribution in several ways by:
• Under different circumstances, the outcome of a random variable
may not be limited to categories or counts.
Some common continuous probability distribution
✓ Normal distribution
✓ Chi-square distribution
✓ Student’s t-distribution
1. Normal distribution
❑ The normal distribution refers to a family of continuous probability distributions
described by the normal equation:
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$, for −∞ < x < ∞
Where:
• e ≈ 2.7183 and π ≈ 3.1416
• μ and σ are the population mean and standard deviation.
• The random variable X in the normal equation is called the normal random
variable.
Characteristics of Normal Distribution
• It links frequency distribution to probability distribution
• Has a Bell Shape Curve and is Symmetric
• It is Symmetric around the mean: Two halves of the curve are the
same (mirror images)
• Hence Mean = Median=mode
• The total area under the curve is 1 (or 100%)
• Normal Distribution has the same shape as Standard normal
distribution
Normal Curve
The graph of the normal distribution depends on two factors:
✓the mean and the standard deviation.
The mean of the distribution determines the location of the center of the graph,
and the standard deviation determines the height and width of the graph.
When the standard deviation is large, the curve is short and wide; when the
standard deviation is small, the curve is tall and narrow.
All normal distributions look like a symmetric, bell-shaped curve.
Standard Normal Distribution
• It makes life a lot easier for us if we standardize our normal curve, with
a mean of zero and a standard deviation of 1 unit.
• We can transform all the observations of any normal random variable X
with mean μ and variance σ² to a new set of observations of another
normal random variable Z with mean 0 and variance 1 using the following
transformation:
$Z = \frac{X - \mu}{\sigma}$
About 68% of the area under the curve falls within 1 standard deviation of
the mean
About 95% of the area under the curve falls within 2 standard deviations of
the mean
About 99.7% of the area under the curve falls within 3 standard deviations
of the mean
A graph of this standardized (mean 0 and variance 1) normal curve is given
in Graph:
Probability and Normal Distributions
❑ We know that the area under any normal curve is 1 unit
➢ Therefore, we can link these areas with probability
i.e. if a random variable, x, is normally distributed, the probability that x will fall in
a given interval is the area under the normal curve for that interval.
➢ Or P(a < x < b) = area under the curve between a and b.
Table of normal distribution
Example 1: Suppose we want to compute the area under the
normal curve to the left of 1.45
• This area can be computed by finding the probability under the normal
curve
• The probability can be read from the standard normal table by combining the value
1.4 in the first column with 0.05 in the first row.
• The shaded area in the diagram is the area under the curve to the left of
z = 1.45.
• The area of this shaded portion is 0.9265 (or 92.65% of the total area
under the curve).
Example:
Find the area to the left of z = 2.06
Solution
Step 1: Draw the figure
Step 2: We are looking for the area under the standard normal distribution
to the left of z = 2.06. It is 0.9803. Hence, 98.03% of the area is less than z
= 2.06.
Find the area between z = 1.68 and z =-1.37.
Solution
Step 1: Draw the figure as shown.
Step 2: Since the area desired is between two given z values, look up
the areas corresponding to the two z values and subtract the smaller area from
the larger area. (Do not subtract the z values.) The area for z = 1.68 is 0.9535,
and the area for z = −1.37 is 0.0853. The area between the two z values is
0.9535 − 0.0853 = 0.8682, or 86.82%.
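The table look-ups in the three examples above can be cross-checked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is a sketch for verification only; it does not replace reading the table.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

area_left_145 = std_normal.cdf(1.45)                         # area to the left of z = 1.45
area_left_206 = std_normal.cdf(2.06)                         # area to the left of z = 2.06
area_between = std_normal.cdf(1.68) - std_normal.cdf(-1.37)  # area between the two z values

print(round(area_left_145, 4))  # 0.9265
print(round(area_left_206, 4))  # 0.9803
print(round(area_between, 4))   # 0.8682
```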
Example:
For subject A, a 27-year-old female, the ammonia concentration in parts per
billion (ppb) followed a normal distribution over 30 days with mean 491 and
standard deviation 119. What is the probability that on a random day, the
subject’s ammonia concentration is between 292 and 649 ppb?
Solution:
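We find the z value corresponding to an x of 292 by $z = \frac{292 - 491}{119} = -1.67$, and similarly for x = 649, $z = \frac{649 - 491}{119} = 1.33$.
From the standard normal table, the area to the left of z = −1.67 is 0.0475 and the area to the left of z = 1.33 is 0.9082.
The area desired is the difference between these, 0.9082 − 0.0475 = 0.8607.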
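The ammonia example can be checked in two ways: directly on the original scale, and through z values rounded to two decimals as a printed table requires. The sketch below uses `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

ammonia = NormalDist(mu=491, sigma=119)

# Direct calculation on the original scale (no rounding of z)
p_direct = ammonia.cdf(649) - ammonia.cdf(292)

# Table-style calculation: standardize, round z to two decimals, then look up
z_low = round((292 - 491) / 119, 2)    # -1.67
z_high = round((649 - 491) / 119, 2)   # 1.33
p_table_style = NormalDist().cdf(z_high) - NormalDist().cdf(z_low)

print(z_low, z_high)
print(round(p_direct, 3), round(p_table_style, 3))
```

The two results differ only slightly; the small gap comes from rounding the z values before the look-up.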
Exercise:
1. For another subject (a 29-year-old male), the acetone levels were
normally distributed with a mean of 870 and a standard deviation of 211
ppb. Find the probability that on a given day the subject’s acetone level is:
a. Between 600 and 1000 ppb
b. Over 900 ppb
c. Under 500 ppb
d. Between 900 and 1100 ppb
2. If the total cholesterol values for a certain population are approximately
normally distributed with a mean of 200 mg/100 ml and a standard
deviation of 20 mg/100 ml, find the probability that an individual picked at
random from this population will have a cholesterol value:
a. Between 180 and 200 mg/100 ml
b. Greater than 225 mg/100 ml
c. Less than 150 mg/100 ml
d. Between 190 and 210 mg/100 ml
2. Student t-distribution
• It is often the case that one wants to calculate the size of sample
needed to obtain a certain level of confidence in survey results
• Unfortunately, this calculation requires prior knowledge of
the population standard deviation σ.
• Realistically, σ is unknown
• Often a preliminary sample will be conducted so that a reasonable
estimate of this critical population parameter can be made
• If such a preliminary sample is not made, but confidence intervals
for the population mean are to be constructed using an unknown
σ, then the distribution known as the Student t distribution can
be used.
Student’s t-distribution cont…
Suppose we have a simple random sample of size n drawn from a
Normal population with mean μ and standard deviation σ. Let us
denote the sample mean by $\bar{x}$ and the sample standard deviation by s;
then the quantity
$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
has a t distribution with n-1 degrees of freedom.
The degrees of freedom are the number of values that are free to vary
after a sample statistic has been computed.
Some properties of t-distribution are;
The t distribution shares some characteristics of the normal distribution
and differs from it in others
The t distribution is similar to the standard normal distribution in these
ways:
1. It is bell-shaped
2. It is symmetric about the mean
3. The mean, median, and mode are equal to 0 and are located at the center
of the distribution
4. It converges to the normal distribution as the sample size gets large
5. The curve never touches the x-axis.
The t distribution differs from the standard normal distribution in the
following ways:
▪ The variance is greater than 1.
▪ The t distribution is actually a family of curves based on the concept
of degrees of freedom, which is related to sample size.
Assumption of student’s t-distribution
❑ The parent population from which the sample is drawn is normal.
❑ The sample is random.
Student’s t Distribution…….
The t distribution has a (slightly) different shape for each possible
sample size.
What happens as sample gets larger?
[Figure: density curves of the t-distribution and the standard normal Z distribution; x-axis: value, y-axis: density]
As the df gets larger, the Student’s t-distribution looks more and more like the
standard normal distribution (SND), with mean = 0 and variance = 1.
Student’s t Table
Look up the df.
Note: the values tabled for df = ∞ are the same as the values for the standard
normal distribution, $z_\alpha$.
3. Chi-square distribution
The chi-square distribution with v degrees of freedom is the distribution
of a random variable that is the sum of the squares of v independent
standard normal random variables.
A continuous random variable Y (Y ~ $\chi^2(v)$) has a chi-square distribution
with v degrees of freedom if the density function is given by
$f(y) = \frac{1}{2^{v/2}\,\Gamma(v/2)} y^{v/2 - 1} e^{-y/2}$, y > 0
where:
v = n − 1 is the degrees of freedom, and we have that μ = v and σ² = 2v.
Properties of Chi-square distribution
The exact shape of the distribution depends upon the number of
degrees of freedom v.
The mean and variance of the χ² distribution are v and 2v
respectively.
Chi-square values are never negative, and the chi-square curve is
positively skewed.
As n → ∞, the χ² distribution approaches a normal distribution.