
Measures of Dispersion, Skewness, and Kurtosis

Measures of Dispersion

A measure of dispersion is a descriptive summary measure that helps us characterize the data set in terms of how varied the observations are from each other.

● A small value indicates that the observations are not too different from each other; that is, there is a concentration of observations about the center of the distribution.
● On the other hand, a large value indicates that the observations are very different from each other or they are widely spread out from the center.
● The smallest possible value of a measure of dispersion should be 0. A zero measure should indicate the absence of variation.

Measures of Absolute Dispersion

● A measure of absolute dispersion has the same unit as the observations. Examples: range, interquartile range, standard deviation

Range = Maximum - Minimum

Sample interpretation: We can also say that the weights of the rabbits range from 8 to 15 pounds.

Interquartile Range = Q3 - Q1

Properties of Standard Deviation

Property 1. If each observation of a set of data is transformed by the addition (or subtraction) of a constant c to each observation, the standard deviation of the new set of data is the same as the standard deviation of the original data set.

Property 2. If each observation of a set of data is transformed by the multiplication (or division) of a constant c to each observation, the standard deviation of the new set of data is equal to the standard deviation of the original data set multiplied (or divided) by |c|.

Comparing the Variation of Two or More Distributions

We cannot use the measures of absolute dispersion to compare the variation of the observations of two or more collections when
(i) the units are different, or
(ii) the means are very different from each other.

Measures of Relative Dispersion

● A measure of relative dispersion has no unit and is therefore useful in comparing the variability of one distribution with another distribution. Example: coefficient of variation
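
Properties 1 and 2 can be checked numerically, and Property 2 also shows why condition (i) matters: re-expressing the same weights in another unit rescales the standard deviation. The sketch below is only an illustration and is not part of the original notes. The rabbit weights, the helper names `add_constant` and `multiply_constant`, and the use of the population standard deviation (`statistics.pstdev`) are all assumptions made for the example.

```python
import math
import statistics

# Hypothetical rabbit weights in pounds (illustrative data, not from the notes).
weights_lb = [8, 10, 11, 12, 15]


def add_constant(data, c):
    """Return a new list with the constant c added to every observation."""
    return [x + c for x in data]


def multiply_constant(data, c):
    """Return a new list with every observation multiplied by the constant c."""
    return [x * c for x in data]


sd_lb = statistics.pstdev(weights_lb)

# Property 1: adding a constant shifts every value but leaves the spread unchanged.
sd_shifted = statistics.pstdev(add_constant(weights_lb, 2))

# Property 2: multiplying by c multiplies the standard deviation by |c|.
LB_TO_KG = 0.45359237  # pounds to kilograms
sd_kg = statistics.pstdev(multiply_constant(weights_lb, LB_TO_KG))
sd_negated = statistics.pstdev(multiply_constant(weights_lb, -3))

print("Property 1 holds:", math.isclose(sd_shifted, sd_lb))
print("Property 2 holds (unit change):", math.isclose(sd_kg, LB_TO_KG * sd_lb))
print("Property 2 holds (c = -3):", math.isclose(sd_negated, 3 * sd_lb))
```

The same rabbits give a different standard deviation in kilograms than in pounds, which is why an absolute measure cannot be compared across units.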
The coefficient of variation (CV) is a measure of relative dispersion and is defined as

Population CV (population standard deviation / population mean):
$$\frac{\sigma}{\mu} \times 100\%$$

Sample CV (sample standard deviation / sample mean):
$$\frac{s}{\bar{X}} \times 100\%$$

Population Variance:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$

Sample Variance:
$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$$

Population Standard Deviation:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$
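
The variance, standard deviation, and CV definitions above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the data set and the function names (`population_variance`, `sample_variance`, `population_cv`) are hypothetical and chosen only for this example.

```python
import math

# Hypothetical rabbit weights in pounds (illustrative data, not from the notes).
weights = [8, 10, 11, 12, 15]


def population_variance(data):
    """Average squared deviation from the mean, dividing by N."""
    n = len(data)
    mu = sum(data) / n
    return sum((x - mu) ** 2 for x in data) / n


def sample_variance(data):
    """Sum of squared deviations from the sample mean, dividing by n - 1."""
    n = len(data)
    x_bar = sum(data) / n
    return sum((x - x_bar) ** 2 for x in data) / (n - 1)


def population_cv(data):
    """Population standard deviation as a percentage of the population mean."""
    mu = sum(data) / len(data)
    sigma = math.sqrt(population_variance(data))
    return sigma / mu * 100


pop_var = population_variance(weights)
samp_var = sample_variance(weights)
pop_sd = math.sqrt(pop_var)
cv_lb = population_cv(weights)

# The CV has no unit: converting pounds to kilograms leaves it unchanged.
cv_kg = population_cv([x * 0.45359237 for x in weights])

print("population variance:", pop_var)
print("sample variance:", samp_var)
print("population SD:", pop_sd)
print("CV in lb and kg agree:", math.isclose(cv_lb, cv_kg))
```

Dividing by n - 1 instead of N makes the sample variance slightly larger than the population variance computed from the same numbers.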
Sample Standard Deviation:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}}$$

Computational Formula of the Variance

Population Variance:
$$\sigma^2 = \frac{N \sum_{i=1}^{N} X_i^2 - \left(\sum_{i=1}^{N} X_i\right)^2}{N^2}$$

Sample Variance:
$$s^2 = \frac{n \sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}{n(n-1)}$$

Z-score

The standard score or z-score measures how many standard deviations an observed value is above or below the mean.

Population Z-score:
$$Z = \frac{X - \mu}{\sigma}$$

Sample Z-score:
$$Z = \frac{X - \bar{X}}{s}$$

A positive z-score measures the number of standard deviations an observation is above the mean, and a
negative z-score measures the number of standard deviations an observation is below the mean. A z-score of 0 means that the observation is equal to the mean.

Measures of Skewness and Kurtosis

A measure of skewness indicates whether the density of the data set looks just the same to the left as to the right of the center point. It is a single value that indicates the degree and direction of asymmetry.

Symmetric vs Skewed Distribution
If it is possible to divide the histogram at the center into two identical halves, wherein each half is a mirror image of the other, then it is called a symmetric distribution. Otherwise, it is called a skewed distribution.

Two Types of Skewness
● Positively Skewed or Skewed to the Right
If the concentration of the values is at the left-end of the distribution and the upper tail of the distribution stretches out more than the lower tail, then the distribution is said to be positively skewed or skewed to the right.
● Negatively Skewed or Skewed to the Left
If the concentration of the values is at the right-end of the distribution and the lower tail of the distribution stretches out more than the upper tail, then the distribution is said to be negatively skewed or skewed to the left.

Direction of Skewness
● Sk = 0: symmetric
● Sk > 0: positively skewed
● Sk < 0: negatively skewed

Degree of Skewness
The farther |Sk| is from 0, the more skewed the distribution.

Pearson’s First and Second Coefficient of Skewness

● Pearson’s first coefficient of skewness for a sample is
$$Sk_1 = \frac{\bar{X} - Mo}{s}$$
where Mo is the mode.
● Pearson’s second coefficient of skewness for a sample is
$$Sk_2 = \frac{3(\bar{X} - Md)}{s}$$
where Md is the median.

Coefficient of Skewness Based on the Third Moment
The population coefficient of skewness based on the third moment is
$$Sk_3 = \frac{\mu_3}{\sigma^3} = \frac{\sum_{i=1}^{N} (X_i - \mu)^3 / N}{\sigma^3}$$

The sample coefficient of skewness based on the third moment is
$$Sk_3 = \frac{m_3}{\left(\sqrt{m_2}\right)^3} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^3 / n}{\left(s\sqrt{(n-1)/n}\right)^3}$$

Measures of Kurtosis
A measure of kurtosis indicates the concentration of data around the peak, whether it is flat or peaked.

Types of Kurtosis
Mesokurtic
● The hump is the same as the normal curve.
● It is neither too flat nor too peaked.
Leptokurtic
● The curve is more peaked about the mean and the hump is narrower than the normal curve with the same variance.
● The prefix “lepto” came from the Greek word leptos meaning small or thin.
- The sharper peak implies a higher concentration of values around the mode compared to a normal distribution of the same variance.
Platykurtic
● The curve is less peaked about the mean and the hump is flatter than the normal curve with the same variance.
● The prefix “platy” came from the Greek word platus meaning wide or flat.
- The flatter peak implies a lower concentration of values around the mode compared to a normal distribution of the same variance.

Population Coefficient of Kurtosis Based on the Fourth Moment
$$K = \frac{\mu_4}{\sigma^4} = \frac{\sum_{i=1}^{N} (X_i - \mu)^4 / N}{\sigma^4}$$

Interpretation
In general,
$\frac{\mu_4}{\sigma^4} - 3 < 0$ → platykurtic
$\frac{\mu_4}{\sigma^4} - 3 > 0$ → leptokurtic
$\frac{\mu_4}{\sigma^4} - 3 = 0$ → mesokurtic
$\frac{\mu_4}{\sigma^4} - 3$ is called “excess of kurtosis”
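
The moment-based coefficients above share one building block, the k-th central moment (the average of the k-th powers of the deviations from the mean). The sketch below is an illustration under stated assumptions, not a definitive implementation: the data sets, the function names, and the tolerance `tol` used to decide when the excess counts as zero are all hypothetical. Because the sample skewness formula in these notes also divides by n, the same `central_moment` helper serves both the population and the sample version.

```python
import math


def central_moment(data, k):
    """k-th central moment: mean of (x - mean)**k, dividing by the number of values."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** k for x in data) / n


def moment_skewness(data):
    """Third central moment divided by the cube of the square root of the second."""
    m2 = central_moment(data, 2)
    m3 = central_moment(data, 3)
    return m3 / math.sqrt(m2) ** 3


def excess_kurtosis(data):
    """Fourth central moment over the squared second central moment, minus 3."""
    m2 = central_moment(data, 2)
    m4 = central_moment(data, 4)
    return m4 / m2 ** 2 - 3


def kurtosis_type(excess, tol=1e-9):
    """Classify by the sign of the excess; tol is an assumed floating-point tolerance."""
    if abs(excess) <= tol:
        return "mesokurtic"
    return "leptokurtic" if excess > 0 else "platykurtic"


# Hypothetical data sets chosen to show each direction of skewness.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 9]
left_skewed = [-x for x in right_skewed]

# Hypothetical data with most values at the center and two extreme values.
heavy_tailed = [-10, 0, 0, 0, 0, 0, 0, 0, 0, 10]

for name, data in [("symmetric", symmetric),
                   ("right_skewed", right_skewed),
                   ("left_skewed", left_skewed)]:
    print(name, "Sk3 =", moment_skewness(data))

for name, data in [("symmetric", symmetric), ("heavy_tailed", heavy_tailed)]:
    excess = excess_kurtosis(data)
    print(name, "excess =", excess, "->", kurtosis_type(excess))
```

The sign of `moment_skewness` gives the direction of skewness, and the sign of `excess_kurtosis` selects among platykurtic, mesokurtic, and leptokurtic as in the interpretation above.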

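Pearson's first and second coefficients of skewness, defined earlier in these notes, need only the mean, the mode or median, and the sample standard deviation. The following is a minimal sketch using Python's standard `statistics` module. The data set and the function names `pearson_first` and `pearson_second` are hypothetical, and the first coefficient assumes the data have a single mode.

```python
import statistics

# Hypothetical scores with a single mode (illustrative data, not from the notes).
scores = [2, 3, 3, 3, 4, 5, 8]


def pearson_first(data):
    """(mean - mode) / sample standard deviation; assumes a unique mode."""
    return (statistics.mean(data) - statistics.mode(data)) / statistics.stdev(data)


def pearson_second(data):
    """3 * (mean - median) / sample standard deviation."""
    return 3 * (statistics.mean(data) - statistics.median(data)) / statistics.stdev(data)


sk1 = pearson_first(scores)
sk2 = pearson_second(scores)

print("Pearson's first coefficient:", sk1)
print("Pearson's second coefficient:", sk2)
```

For these scores the mean exceeds both the mode and the median, so both coefficients come out positive, which matches the rule that Sk > 0 indicates a positively skewed distribution.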