Math2101Stat 2 2
Probability
COURSE CODE: MATH-2101
Associate Professor
Department of Statistics, DU
Mode
• Mode: The mode is simply that value which has the highest frequency.
• Example: The scores obtained by 5 students in a statistics test are 10, 7, 7, 7, and 0. The value “7” has the highest frequency; therefore the mode is “7”.
• Find the measure of central tendency from the following frequency distribution showing the religion of DU students.
Religion Frequency
Islam 400
Hinduism 36
Buddhism 6
Christianity 5
Others 3
The category “Islam” has the highest frequency; therefore the mode is “Islam”.
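The two examples above can be reproduced with a short Python sketch. The helper name `mode_of` is illustrative only, and ties are resolved by taking the first value encountered, which is an assumption rather than anything stated in the slides.

```python
from collections import Counter

def mode_of(values):
    """Return the most frequent value (the first one seen if there is a tie)."""
    return Counter(values).most_common(1)[0][0]

# Raw scores from the statistics test example
scores = [10, 7, 7, 7, 0]

# Frequency distribution from the table (category -> frequency)
religion_freq = {
    "Islam": 400,
    "Hinduism": 36,
    "Buddhism": 6,
    "Christianity": 5,
    "Others": 3,
}

score_mode = mode_of(scores)
# For an already tabulated distribution, the mode is the category
# with the largest frequency.
category_mode = max(religion_freq, key=religion_freq.get)

print(score_mode)     # 7
print(category_mode)  # Islam
```

Because the mode only counts occurrences, the same approach works for both quantitative scores and qualitative categories.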
Mode for grouped data
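• For grouped data, the mode is obtained by using the following formula:
$$M_o = L + \frac{f_0 - f_{-1}}{(f_0 - f_{-1}) + (f_0 - f_1)} \times h$$
where $L$ is the lower boundary of the modal class, $h$ is the class width, $f_0$ is the frequency of the modal class, and $f_{-1}$ and $f_1$ are the frequencies of the classes immediately before and after the modal class.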
• The class boundaries of the modal class are 30.5-40.5, since the highest frequency belongs to this class.
• Mode $= 30.5 + \frac{50 - 15}{(50 - 15) + (50 - 45)} \times 10 = 30.5 + 8.75 = 39.25$
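The grouped-data calculation above can be checked with a minimal Python sketch. The function name `grouped_mode` and its parameter names are illustrative; the numbers are the ones used in the worked example (lower boundary 30.5, class width 10, frequencies 15, 50 and 45).

```python
def grouped_mode(lower, width, f_modal, f_prev, f_next):
    """Mode for grouped data: L + (f0 - f_-1) / ((f0 - f_-1) + (f0 - f1)) * h."""
    d_prev = f_modal - f_prev   # f0 - f_-1
    d_next = f_modal - f_next   # f0 - f1
    return lower + d_prev / (d_prev + d_next) * width

# Values from the worked example: modal class 30.5-40.5
mode_value = grouped_mode(lower=30.5, width=10, f_modal=50, f_prev=15, f_next=45)
print(mode_value)  # 39.25
```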
Choosing measures of central tendency
• The mean is suitable only for quantitative data. For this type of data, the median is preferred as a measure of central tendency when unusual (extreme) values arise.
• The mode may be the only measure available where it is not possible to do arithmetic operations on the data, as in the case of a qualitative variable.
✓ When there are very large and very small values among the observations, the median can be used.
✓ When the distribution is unevenly spread, with the concentration being small or large at irregular points (see Figure-2), the median can be used.
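A small Python sketch can illustrate why the median is preferred when unusual values arise. It uses the eight IQ scores from the dispersion examples in this document; the added value 300 is a hypothetical extreme observation introduced only for illustration.

```python
from statistics import mean, median

def summarize(values):
    """Return (mean, median) of a list of numbers."""
    return mean(values), median(values)

iq_scores = [95, 103, 105, 110, 104, 105, 112, 90]
with_extreme = iq_scores + [300]  # 300 is a hypothetical extreme value

mean_before, median_before = summarize(iq_scores)
mean_after, median_after = summarize(with_extreme)

# The mean moves a lot; the median barely changes.
print(mean_before, median_before)
print(mean_after, median_after)
```

With the extreme value included, the mean rises from 103 to roughly 124.9, while the median only moves from 104.5 to 105.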
Skewness
• The term skewness refers to a lack of symmetry. A distribution is symmetric if it has the same shape on either side of the center.
• The lack of symmetry in a distribution is always determined with reference to a normal (bell-shaped) distribution.
• Any departure of a distribution from symmetry leads to an asymmetric distribution, and in such a case we call the distribution skewed.
• Skewness is said to be positive if the right tail is longer than the left tail.
• When the skewness is positive, the associated distribution is positively skewed. For a positively skewed distribution, the following relationship holds:
Mode < Median < Mean.
Negatively skewed distribution
• Skewness is said to be negative if the left tail is longer than the right tail.
• When the skewness is negative, we call the distribution negatively skewed. For a negatively skewed distribution, the following relationship holds:
Mean < Median < Mode.
How to Detect the Shape of a Distribution
• The shape of a distribution can be detected graphically by plotting a histogram, a frequency polygon or frequency curve, a box plot, or a stem-and-leaf plot.
• The shape of a distribution can also be detected numerically by computing mean, median,
mode, quartiles, deciles, percentiles, or by some measures of skewness.
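As a rough numerical check, the ordering of mean, median and mode described in the skewness section can be tested in Python. This is only a sketch: the function name `skew_direction` is illustrative, the ordering rule is a rule of thumb rather than a formal measure of skewness, and the data are the eight IQ scores used in the dispersion examples of this document.

```python
from statistics import mean, median, mode

def skew_direction(values):
    """Classify shape by comparing mode, median and mean (rule of thumb only)."""
    m, md, mo = mean(values), median(values), mode(values)
    if mo < md < m:
        return "positively skewed"
    if m < md < mo:
        return "negatively skewed"
    return "no clear ordering"

iq_scores = [95, 103, 105, 110, 104, 105, 112, 90]

# mean = 103, median = 104.5, mode = 105, so Mean < Median < Mode
direction = skew_direction(iq_scores)
print(direction)  # negatively skewed
```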
Measures of Dispersion
• Necessity: A measure of central value (average) alone cannot adequately describe a set of observations. It gives no idea of how the observations are spread around that value. For this reason, it is necessary to study the dispersion (variability) along with the average when describing a dataset.
Different measures of dispersion
• Absolute measures:
✓ Range
✓ Quartile deviation
✓ Mean deviation
✓ Standard deviation
• Relative measures:
✓ Coefficient of range
✓ Coefficient of variation
Measure of Dispersion: Mean deviation
• Mean deviation: The arithmetic mean of the absolute values of the deviations from a typical value of a distribution is called the mean deviation. The typical value may be the arithmetic mean or the median. If the typical value is the mean, then the mean deviation is called the mean deviation about the mean. It is denoted by $MD(\bar{x})$. By definition,
$$MD(\bar{x}) = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$$
• Example: Consider the following sampled data of IQ scores: 95, 103, 105, 110, 104, 105,
112, 90. Compute the mean deviation.
Measure of Dispersion: Mean deviation (cont…)
$x_i$    $\bar{x}$    $x_i - \bar{x}$    $|x_i - \bar{x}|$
95       103          -8                 8
103                   0                  0
105                   2                  2
110                   7                  7
104                   1                  1
105                   2                  2
112                   9                  9
90                    -13                13
$\sum x_i = 824$                         $\sum |x_i - \bar{x}| = 42$
• Here $n = 8$, so $MD(\bar{x}) = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n} = \frac{42}{8} = 5.25$.
• The mean deviation of the scores is 5.25. The IQ scores deviate, on average, by 5.25 from the mean.
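The mean deviation example can be verified with a few lines of Python. The function name `mean_deviation` is illustrative; the data are the eight IQ scores from the example.

```python
def mean_deviation(values):
    """Mean deviation about the mean: average of |x_i - x_bar|."""
    n = len(values)
    x_bar = sum(values) / n
    return sum(abs(x - x_bar) for x in values) / n

iq_scores = [95, 103, 105, 110, 104, 105, 112, 90]

total = sum(iq_scores)           # 824
md = mean_deviation(iq_scores)   # 42 / 8
print(total, md)                 # 824 5.25
```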
Different Measures of Dispersion: Standard deviation
• The arithmetic mean of the squared deviations from the mean is called the variance. The positive square root of the variance is known as the standard deviation.
• The sample variance of $n$ values $x_1, x_2, \dots, x_n$ can be calculated by
$$S^2 = \frac{\sum (x_i - \bar{x})^2}{n-1} = \frac{1}{n-1}\left(\sum x_i^2 - n\bar{x}^2\right).$$
• Example: Consider the following sampled data of IQ scores: 95, 103, 105, 110, 104, 105,
112, 90. Compute the standard deviation.
Standard deviation (cont…)
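• Here $n = 8$, so $S^2 = \frac{\sum (x_i - \bar{x})^2}{n-1} = \frac{372}{7} = 53.14286$, and the standard deviation is $S = \sqrt{53.14286} \approx 7.29$.
• Equivalently,
$$S^2 = \frac{1}{n-1}\left(\sum x_i^2 - n\bar{x}^2\right) = \frac{1}{7}\left(95^2 + \cdots + 90^2 - 8 \times 103^2\right) = \frac{1}{7}\left(85244 - 84872\right) = \frac{372}{7} = 53.14286$$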
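Both forms of the variance calculation can be checked with a short Python sketch. The function names are illustrative; the divisor $n - 1$ follows the formula for $S^2$ given in this section.

```python
import math

def variance_definition(values):
    """S^2 = sum((x_i - x_bar)^2) / (n - 1)."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)

def variance_shortcut(values):
    """S^2 = (sum(x_i^2) - n * x_bar^2) / (n - 1)."""
    n = len(values)
    x_bar = sum(values) / n
    return (sum(x * x for x in values) - n * x_bar ** 2) / (n - 1)

iq_scores = [95, 103, 105, 110, 104, 105, 112, 90]

s2 = variance_definition(iq_scores)    # 372 / 7
s2_alt = variance_shortcut(iq_scores)  # (85244 - 84872) / 7
s = math.sqrt(s2)                      # standard deviation

print(round(s2, 5), round(s2_alt, 5), round(s, 2))
```

The two functions agree on this data, and the square root gives a standard deviation of about 7.29.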
• Every set of interval- or ratio-level data has a variance. That is, the variance can be computed only for quantitative variables.
• The variance is unique, and all the values are included in computing it.
• If a set consists of $n_1$ observations of the form $x_{11}, x_{12}, \dots, x_{1n_1}$ with variance $S_{x_1}^2$ and a second set consists of $n_2$ observations of the form $x_{21}, x_{22}, \dots, x_{2n_2}$ with variance $S_{x_2}^2$, then the variance of all the $n_1 + n_2$ observations, called the combined variance, is given by
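As a hedged sketch only: assuming the same $n - 1$ divisor as in the formula for $S^2$ above, one standard form of the combined variance adds the within-group sums of squares $(n_1 - 1)S_{x_1}^2 + (n_2 - 1)S_{x_2}^2$ to the between-group terms $n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2$ and divides by $n_1 + n_2 - 1$, where $\bar{x}$ is the combined mean. The form used in the course may be written differently. The function name and the split of the IQ scores into two groups of four are illustrative assumptions.

```python
from statistics import mean, variance

def combined_variance(n1, mean1, var1, n2, mean2, var2):
    """Combined sample variance (n - 1 divisor) from two groups' summaries."""
    n = n1 + n2
    grand_mean = (n1 * mean1 + n2 * mean2) / n
    within = (n1 - 1) * var1 + (n2 - 1) * var2
    between = n1 * (mean1 - grand_mean) ** 2 + n2 * (mean2 - grand_mean) ** 2
    return (within + between) / (n - 1)

# Illustrative split of the eight IQ scores into two groups (an assumption)
group_a = [95, 103, 105, 110]
group_b = [104, 105, 112, 90]

combined = combined_variance(
    len(group_a), mean(group_a), variance(group_a),
    len(group_b), mean(group_b), variance(group_b),
)
direct = variance(group_a + group_b)  # variance of all eight scores at once

print(round(combined, 5), round(direct, 5))
```

Computing the variance directly on the pooled data gives the same value, which is a convenient sanity check on the identity.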
• All relative measures of dispersion are computed using the absolute measures along with the measures of central tendency/location. Relative measures are usually expressed as percentages. We will discuss only one measure based on the mean and the standard deviation.
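The measure based on the mean and standard deviation is taken here to be the coefficient of variation listed among the relative measures. Its usual definition, $CV = \frac{S}{\bar{x}} \times 100\%$, is not spelled out in this text, so the sketch below should be read as an assumption. It uses the eight IQ scores from the earlier examples and the $n - 1$ divisor for $S$.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean, times 100."""
    return stdev(values) / mean(values) * 100

iq_scores = [95, 103, 105, 110, 104, 105, 112, 90]

cv = coefficient_of_variation(iq_scores)
print(round(cv, 2))  # about 7.08 (percent)
```

Because the result is unit-free, it can be used to compare the variability of datasets measured on different scales.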
100, 87, 84, 100, 53, 54, 98, 89, 67, 115, 80, 76, 72, 70, 91, 110, 94, 79, 86, 91, 93, 105, 83,
89, 92, 84, 100, 81, 105, 86, 95, 80, 69, 77, 74, 79, 64, 61