S6 Skewness2

Measures of Shape
Learning Objectives
➢ Illustrate the measures of shape.
➢ Solve problems involving measures of shape.
➢ Use appropriate measures of position and other statistical methods in
analyzing and interpreting research data.
➢ Imbibe endurance to understand lessons.
James 1: 2-3 “Count it all joy, my brothers, when you meet trials of various
kinds, for you know that the testing of your faith produces steadfastness.”
Activity
"How was your last test, John?"
"It was alright. However, our teacher mentioned that the test scores of our class are negatively skewed. What did she mean? Is that good or bad?"
Measures of Shape
Measures of shape describe the distribution (or pattern) of the data within a dataset.
What are the shapes of the dataset?
1. Symmetrical – Normal Distribution

2. Asymmetrical – Skewed Distribution
Symmetrical Distribution
❑ Two sides of the distribution are a mirror image of each other.

❑ A normal distribution is a true symmetric distribution of
observed values.
Example of a Normal Distribution
❑ When a histogram is constructed from values that are normally distributed, the columns form a symmetrical bell shape. This is why this distribution is also known as a 'normal curve' or 'bell curve'.
Key Features of Normal Distribution
❑ Symmetrical shape
❑ Mode, median and mean are the same and are together in the center of the curve
❑ There can only be one mode.
❑ Most of the data are clustered around the center.
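The second and third features can be checked on a small example. The sketch below uses a made-up, perfectly symmetric data set (the values are illustrative and not part of the lesson) and Python's standard `statistics` module.

```python
from statistics import mean, median, multimode

# Hypothetical, perfectly symmetric data set (illustration only).
data = [1, 2, 2, 3, 3, 3, 4, 4, 5]

center_mean = mean(data)      # arithmetic mean
center_median = median(data)  # middle value of the sorted data
modes = multimode(data)       # list of all most-frequent values

print(center_mean, center_median, modes)
```

For this data set the mean, median, and single mode all coincide at the center value.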
Asymmetrical Distribution
❑ Two sides will not be mirror images of each other.
SKEWNESS
❑ It gives the amount and direction of the skew of a data set or
the degree of symmetry of the distribution.
Skewed to the Right or Positively Skewed
❑ The majority of the data is on the left side, and the right tail is longer.
❑ Hence, the right tail of the data set is longer than the left and is more stretched on the side above the mean.
❑ For positively skewed data sets, the mode is less than the median and the median is less than the mean.
Skewed to the Left or Negatively Skewed
❑ The majority of the data is on the right side, and the left tail is longer.
❑ The left tail is more stretched on the side below the mean.
❑ For negatively skewed data sets, the mode is greater than the median and the median is greater than the mean.
Key Features of Skewed Distribution
❑ Asymmetrical shape
❑ Mean, median, and mode have different values and do not all lie at the center of the curve.
❑ There can be more than one mode.
❑ The distribution of the data tends towards the high or low end of the dataset.
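The ordering of mode, median, and mean in skewed data can be seen on two small, made-up data sets. This is only a sketch: the values are hypothetical, chosen so that one set has a long right tail and the other is its mirror image.

```python
from statistics import mean, median, mode

# Hypothetical data sets (illustration only).
right_skewed = [1, 2, 2, 2, 3, 3, 4, 6, 10]   # long right tail
left_skewed = [1, 5, 7, 8, 8, 9, 9, 9, 10]    # long left tail (mirror image)

for name, data in [("right", right_skewed), ("left", left_skewed)]:
    print(name, mode(data), median(data), round(mean(data), 2))
```

In the right-skewed set the mode is smallest and the mean is largest; in the left-skewed set the order is reversed.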
Example 1
Determine whether the data set illustrated is positively
skewed, negatively skewed, or approximately symmetric.
[Figure: data sets a, b, and c]
a. Positively skewed; the majority of the data are on the left side.
b. Approximately symmetric; the data set has approximately the same appearance on the left and on the right of the center line.
c. Negatively skewed; the majority of the data are on the right side.
Skewness is a pure number without a unit. Karl Pearson gave this formula to estimate skewness $sk$:

$$sk = \frac{3(\text{mean} - \text{median})}{\text{standard deviation}} = \frac{3(\bar{x} - \tilde{x})}{s}$$
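As a minimal sketch, Pearson's formula can be written as a small Python function. The function name `pearson_skewness` and the sample inputs are illustrative assumptions, not part of the lesson.

```python
def pearson_skewness(mean, median, std_dev):
    """Pearson's estimate of skewness: 3 * (mean - median) / standard deviation."""
    return 3 * (mean - median) / std_dev

# Hypothetical summary values (illustration only).
print(pearson_skewness(52, 50, 4))   # mean above the median gives a positive value
print(pearson_skewness(48, 50, 4))   # mean below the median gives a negative value
```

The sign of the result follows the sign of (mean − median), since the standard deviation is positive.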
Example 2
a. Determine the skewness of the following data on the height (in cm) of
15 preschool students.
80 82 88 88 90
91 92 92 93 94
94 95 96 96 96
Solution
a. The problem is an example of ungrouped data.
b. First, determine the mean, median, and standard deviation.

Solving for the mean,

$$\bar{x} = \frac{\text{sum of all given data}}{\text{total number of data}} = \frac{\sum x}{N}$$

$$\bar{x} = \frac{\sum x}{N} = \frac{80+82+88+88+90+91+92+92+93+94+94+95+96+96+96}{15} = \frac{1\,367}{15} \approx 91.13$$

The mean of the data is equal to 91.13.
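The same arithmetic can be checked with a few lines of Python. This is a sketch using the 15 heights from Example 2; the variable names are illustrative.

```python
# Heights (in cm) of the 15 preschool students in Example 2.
heights = [80, 82, 88, 88, 90, 91, 92, 92, 93, 94, 94, 95, 96, 96, 96]

total = sum(heights)      # sum of all given data
n = len(heights)          # total number of data
mean_height = total / n   # mean = sum / count

print(total, n, round(mean_height, 2))
```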

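Solving for the median: when the $n$ values are arranged in ascending order and $n$ is odd, the median $\tilde{x}$ is the value in position $\frac{n+1}{2}$.

Arrange the data in ascending order.

80, 82, 88, 88, 90, 91, 92, 92, 93, 94, 94, 95, 96, 96, 96

Applying the formula, the median is in position

$$\frac{n+1}{2} = \frac{15+1}{2} = \frac{16}{2} = 8\text{th position}$$

From the data, the height in the 8th position is 92.
Thus, the median is 92.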
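The position rule can be sketched in Python as follows. The sketch assumes an odd number of values, as in this example; for an even count the two middle values would have to be averaged instead.

```python
# Heights (in cm) from Example 2.
heights = [80, 82, 88, 88, 90, 91, 92, 92, 93, 94, 94, 95, 96, 96, 96]

ordered = sorted(heights)               # arrange in ascending order
n = len(ordered)
position = (n + 1) // 2                 # 1-based position of the median (n is odd)
median_height = ordered[position - 1]   # Python lists are 0-based

print(position, median_height)
```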

Solving for the standard deviation, let's create a table.

x (Height)   x − x̄ (Height − mean)      (x − x̄)² (Height − mean)²
80           80 − 91.13 = −11.13         (−11.13)² = 123.88
82           82 − 91.13 = −9.13          (−9.13)² = 83.36
88           88 − 91.13 = −3.13          (−3.13)² = 9.80
88           88 − 91.13 = −3.13          (−3.13)² = 9.80
90           90 − 91.13 = −1.13          (−1.13)² = 1.28
91           91 − 91.13 = −0.13          (−0.13)² = 0.02
92           92 − 91.13 = 0.87           (0.87)² = 0.76
92           92 − 91.13 = 0.87           (0.87)² = 0.76
93           93 − 91.13 = 1.87           (1.87)² = 3.50
94           94 − 91.13 = 2.87           (2.87)² = 8.24
94           94 − 91.13 = 2.87           (2.87)² = 8.24
95           95 − 91.13 = 3.87           (3.87)² = 14.98
96           96 − 91.13 = 4.87           (4.87)² = 23.72
96           96 − 91.13 = 4.87           (4.87)² = 23.72
96           96 − 91.13 = 4.87           (4.87)² = 23.72

$$\sum (x - \bar{x})^2 = 335.78$$

Dividing by $n - 1 = 14$ and taking the square root gives $s = \sqrt{335.78 / 14} = \sqrt{23.98} \approx 4.9$.
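The table can be reproduced with a short Python sketch. It assumes, as the table does, the rounded mean 91.13 and two-decimal rounding of each squared deviation, and it divides by n − 1 (the sample standard deviation); the variable names are illustrative.

```python
from math import sqrt

# Heights (in cm) from Example 2.
heights = [80, 82, 88, 88, 90, 91, 92, 92, 93, 94, 94, 95, 96, 96, 96]
mean = 91.13  # rounded mean, as used in the table

total = 0.0
for x in heights:
    deviation = round(x - mean, 2)      # height - mean
    squared = round(deviation ** 2, 2)  # (height - mean)^2, two decimals
    total += squared
    print(f"{x:>3}  {deviation:>7.2f}  {squared:>7.2f}")

total = round(total, 2)                 # sum of squared deviations
s = sqrt(total / (len(heights) - 1))    # sample standard deviation
print(total, round(s, 1))
```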
Alternative Way to Solve for Standard Deviation

x (Height)   x²
80           (80)² = 6 400
82           (82)² = 6 724
88           (88)² = 7 744
88           (88)² = 7 744
90           (90)² = 8 100
91           (91)² = 8 281
92           (92)² = 8 464
92           (92)² = 8 464
93           (93)² = 8 649
94           (94)² = 8 836
94           (94)² = 8 836
95           (95)² = 9 025
96           (96)² = 9 216
96           (96)² = 9 216
96           (96)² = 9 216
Σx = 1 367   Σx² = 124 915

Solving for the standard deviation,

$$s = \sqrt{\frac{n\sum x^2 - (\sum x)^2}{n(n-1)}}$$

Substitute the values,

$$s = \sqrt{\frac{15(124\,915) - (1\,367)^2}{15(15-1)}} = \sqrt{\frac{5\,036}{210}} = \sqrt{23.98} \approx 4.9$$

The standard deviation is 4.9.
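A minimal Python sketch of this shortcut formula is shown below. The comparison with `statistics.stdev` is an added cross-check, not part of the lesson; both compute the sample standard deviation (division by n − 1).

```python
from math import sqrt
from statistics import stdev

# Heights (in cm) from Example 2.
heights = [80, 82, 88, 88, 90, 91, 92, 92, 93, 94, 94, 95, 96, 96, 96]

n = len(heights)
sum_x = sum(heights)                   # sum of x
sum_x2 = sum(x * x for x in heights)   # sum of x squared

numerator = n * sum_x2 - sum_x ** 2    # n * sum(x^2) - (sum x)^2
s = sqrt(numerator / (n * (n - 1)))

print(sum_x, sum_x2, numerator, round(s, 1))
print(round(stdev(heights), 1))        # cross-check with the standard library
```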
Solving for the skewness, apply the formula

$$sk = \frac{3(\text{mean} - \text{median})}{\text{standard deviation}} = \frac{3(\bar{x} - \tilde{x})}{s}$$

where $\bar{x} = 91.13$, $\tilde{x} = 92$, and $s = 4.9$. Substituting,

$$sk = \frac{3(\bar{x} - \tilde{x})}{s} = \frac{3(91.13 - 92)}{4.9} = \frac{3(-0.87)}{4.9} = -\frac{2.61}{4.9} \approx -0.53$$

Note that the value of the skewness is a negative number. This implies that the data set is skewed to the left. That is, the majority of the heights of the students are greater than the mean height of the group.
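The whole of Example 2 can be checked end to end with Python's standard `statistics` module. This sketch uses unrounded intermediate values, so it is a cross-check on the hand computation rather than a copy of it.

```python
from statistics import mean, median, stdev

# Heights (in cm) from Example 2.
heights = [80, 82, 88, 88, 90, 91, 92, 92, 93, 94, 94, 95, 96, 96, 96]

# Pearson's estimate: 3 * (mean - median) / sample standard deviation.
sk = 3 * (mean(heights) - median(heights)) / stdev(heights)

# Count how many heights lie above the mean.
above_mean = sum(1 for x in heights if x > mean(heights))

print(round(sk, 2), above_mean, len(heights))
```

Rounded to two decimals, the result agrees with the hand computation, and more than half of the heights lie above the mean.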
Example 3
a. Determine the skewness of the data below.
The table below shows the weight (in pounds) of 100 randomly
selected dogs in a dog shelter.
Weight (in pounds) Frequency (f)
60 – 62 5
63 – 65 17
66 – 68 42
69 – 71 27
72 – 74 9
Solution
a. The problem is an example of grouped data.
b. First, find the class mark, then determine the mean, median, and standard deviation (s) using the formulas for grouped data.

Weight        Class mark   Frequency   Cumulative       f·x_m    x_m²     f·x_m²
(in pounds)   (x_m)        (f)         frequency (cf)
60 – 62       61           5           5                305      3 721    18 605
63 – 65       64           17          22               1 088    4 096    69 632
66 – 68       67           42          64               2 814    4 489    188 538
69 – 71       70           27          91               1 890    4 900    132 300
72 – 74       73           9           100              657      5 329    47 961
Total                      N = 100                      Σf·x_m = 6 754    Σf·x_m² = 457 036
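The columns of this table can be generated with a short Python sketch. The tuple layout `(lower limit, upper limit, frequency)` and the variable names are illustrative choices.

```python
# (lower limit, upper limit, frequency) for each class in Example 3.
classes = [(60, 62, 5), (63, 65, 17), (66, 68, 42), (69, 71, 27), (72, 74, 9)]

rows = []
cf = 0  # running cumulative frequency
for lower, upper, f in classes:
    xm = (lower + upper) / 2   # class mark = midpoint of the class limits
    cf += f
    rows.append((xm, f, cf, f * xm, xm ** 2, f * xm ** 2))

N = cf                                  # total frequency
sum_fx = sum(row[3] for row in rows)    # sum of f * x_m
sum_fx2 = sum(row[5] for row in rows)   # sum of f * x_m^2

for row in rows:
    print(row)
print(N, sum_fx, sum_fx2)
```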

Solving for the mean,

$$\bar{x} = \frac{\sum f x_m}{N} = \frac{6\,754}{100} = 67.54$$

The mean of the data is equal to 67.54.
Solving for the median,

Locate the median class. Since $\frac{N}{2} = \frac{100}{2} = 50$, look at the cumulative frequency column: the 50th value belongs to the class interval 66 – 68. From this, you can get the following values:
$L = 65.5$ (the lower class boundary: since the lower limit is 66, subtract 0.5, so 66 − 0.5 = 65.5)
$\frac{N}{2} = \frac{100}{2} = 50$
$cf = 22$ (cumulative frequency before the median class 66 – 68)
$f = 42$ (frequency of the median class 66 – 68)
$i = 3$ (class width, found by subtracting two consecutive lower limits or upper limits, e.g. 63 − 60 = 3)
Use the formula to compute the median.

$$\tilde{x} = L + \frac{\frac{N}{2} - cf}{f} \times i$$

$$\tilde{x} = 65.5 + \frac{50 - 22}{42} \times 3 = 65.5 + \frac{28}{42} \times 3 = 65.5 + 2 = 67.5$$

Therefore, the median is 67.5.
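The steps for the grouped median (locate the median class, then apply the formula) can be sketched in Python. The sketch assumes integer class limits, so that the lower class boundary is the lower limit minus 0.5, and equal class widths; the variable names are illustrative.

```python
# (lower limit, upper limit, frequency) for each class in Example 3.
classes = [(60, 62, 5), (63, 65, 17), (66, 68, 42), (69, 71, 27), (72, 74, 9)]

N = sum(f for _, _, f in classes)
i = classes[1][0] - classes[0][0]   # class width: difference of consecutive lower limits

# Locate the median class: the first class whose cumulative frequency reaches N/2.
cf_before = 0
for lower, upper, f in classes:
    if cf_before + f >= N / 2:
        break
    cf_before += f

L = lower - 0.5                     # lower class boundary of the median class
grouped_median = L + (N / 2 - cf_before) * i / f

print(L, cf_before, f, i, grouped_median)
```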
Solving for the standard deviation, refer to the table above and substitute the values in the formula

$$s = \sqrt{\frac{n\sum f x_m^2 - (\sum f x_m)^2}{n(n-1)}}$$

$$s = \sqrt{\frac{100(457\,036) - (6\,754)^2}{100(100-1)}} = \sqrt{\frac{87\,084}{9\,900}} = \sqrt{8.7964} \approx 2.97$$

Thus, the standard deviation of the grouped data is 2.97.
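A minimal Python sketch of the grouped-data formula, using the class marks and frequencies from the table (variable names are illustrative):

```python
from math import sqrt

# Class marks and frequencies from the Example 3 table.
class_marks = [61, 64, 67, 70, 73]
freqs = [5, 17, 42, 27, 9]

n = sum(freqs)
sum_fx = sum(f * x for f, x in zip(freqs, class_marks))        # sum of f * x_m
sum_fx2 = sum(f * x * x for f, x in zip(freqs, class_marks))   # sum of f * x_m^2

numerator = n * sum_fx2 - sum_fx ** 2
s = sqrt(numerator / (n * (n - 1)))

print(numerator, round(numerator / (n * (n - 1)), 4), round(s, 2))
```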

Solving for the skewness, apply the formula

$$sk = \frac{3(\text{mean} - \text{median})}{\text{standard deviation}} = \frac{3(\bar{x} - \tilde{x})}{s}$$

where $\bar{x} = 67.54$, $\tilde{x} = 67.5$, and $s = 2.97$. Substituting,

$$sk = \frac{3(\bar{x} - \tilde{x})}{s} = \frac{3(67.54 - 67.5)}{2.97} = \frac{3(0.04)}{2.97} = \frac{0.12}{2.97} \approx 0.04$$

The skewness of the given data set is 0.04. Because the value is positive, the data set is slightly positively skewed: the mean lies just above the median, so a little over half of the dogs weigh less than the mean weight.
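The substitution can be verified in a few lines of Python. This sketch plugs in the rounded values from the solution (mean 67.54, median 67.5, standard deviation 2.97).

```python
# Rounded summary values from the Example 3 solution.
mean, median, s = 67.54, 67.5, 2.97

sk = 3 * (mean - median) / s   # Pearson's estimate of skewness
print(round(sk, 2))
```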
When interpreting skewness, you may refer to the following:
a. If the skewness value is less than – 1 or greater than 1, then the data set is highly skewed.
b. If the skewness value is between – 1 and – 0.5 or between 0.5 and 1, then the data set is
moderately skewed.
c. If the skewness value is between – 0.5 and 0.5, then the data set is approximately symmetric.
A skewness value that is closer to zero indicates that the data set is approximately symmetric.
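These rules of thumb can be written as a small Python function. The guide above does not say which category the exact boundary values ±0.5 and ±1 belong to; this sketch assumes they count as moderately skewed, and the function name is illustrative.

```python
def interpret_skewness(sk):
    """Classify a skewness value using the rule-of-thumb ranges above."""
    if sk < -1 or sk > 1:
        return "highly skewed"
    if abs(sk) >= 0.5:   # assumption: exactly 0.5 or 1 counts as moderately skewed
        return "moderately skewed"
    return "approximately symmetric"

print(interpret_skewness(-0.53))  # skewness from Example 2
print(interpret_skewness(0.04))   # skewness from Example 3
```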
Example 4
Is the data in example 3 highly skewed, moderately skewed, or
approximately symmetric?
Solution
Since the skewness of the data set is 0.04, you can conclude that the distribution of
the weights of the dogs is approximately symmetric.
ACTIVITY
