2jane - Frequency Dist and Graphs FINAL March 2022
Objectives:
At the end of the chapter, the student should be able to:
1. Summarize and present the data collected;
2. State the rules for constructing a frequency distribution;
3. Make a grouped frequency distribution; and
4. Identify the three main types of graphs used to represent data
in a frequency distribution.
Introduction
The problem most decision makers must resolve is how to deal with the uncertainty that is
inherent in almost all aspects of their jobs. Raw data provide little, if any, information to the decision
makers. Thus, they need a means of converting the raw data into useful information.
The two main functions of descriptive statistics are summarizing and presenting data. The most
common way to summarize data is in a frequency distribution. Charts and graphs are also used to
present data.
Frequency Distribution
The easiest method of organizing data is a frequency distribution, which converts raw data into a
meaningful pattern for statistical analysis. A frequency distribution is a table used to describe a data
set. It summarizes the data by showing how many observations fall in each group or class. A categorical
frequency distribution is used for nominal data; it lists the categories and the number of observations in
each category. Numerical data can be presented in an ungrouped or a grouped frequency distribution. An
ungrouped frequency distribution lists each distinct value and the frequency of that value. A grouped
frequency distribution lists several classes and the frequency of each class. To decide whether to
use an ungrouped or a grouped frequency distribution, find the range, which is the highest value minus
the lowest value in the data set. If the range is small, use an ungrouped frequency distribution; if it is
large, use a grouped one.
Examples:
a. Categorical Data
Ungrouped data:
Table 2. Ages of Freshman Students
Age Frequency
15 5
16 17
17 3
Total 25
b. Grouped data:
Table 3. Family Income
Income (thousand Php)    Frequency
10 – 14                  25
15 – 19                  17
20 – 24                  13
25 – 29                  9
30 – 34                  6
Total                    70
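As a minimal sketch of the ungrouped case, the following Python snippet rebuilds Table 2 from a list of ages. The list `ages` is expanded from the frequencies in the table, and the variable names are illustrative assumptions, not part of the original notes.

```python
from collections import Counter

# Ages expanded from Table 2 (5 students aged 15, 17 aged 16, 3 aged 17).
ages = [15] * 5 + [16] * 17 + [17] * 3

# Range = highest value minus lowest value; a small range suggests
# an ungrouped frequency distribution.
data_range = max(ages) - min(ages)

# Count how many times each distinct age occurs.
frequency = Counter(ages)

print("Range:", data_range)
for age in sorted(frequency):
    print(age, frequency[age])
print("Total", sum(frequency.values()))
```

Because the range here is only 2, listing each age separately keeps the table short, which is why an ungrouped distribution fits this data.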
Upper class limits: 180, 205, 230, 255, 280, 305
Step 8:
Example: Statistics exam grades. Suppose that 20 statistics students’ scores on an exam are as
follows:
97, 92, 88, 75, 83, 67, 89, 55, 72, 78, 81, 91, 57, 63, 67, 74, 87, 84, 98, 46
We select $k = 6$. The range is $r = 98 - 46 = 52$, thus $cw = \frac{r}{k} = \frac{52}{6} = 8.67 \approx 9$. The frequency table is as
follows:
Table 4. Examination Grades in Statistics 11
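The class width calculation in this example can be sketched in Python as follows. Rounding the quotient up to the next whole number with `math.ceil` is an assumption made here; the variable names mirror the symbols r, k, and cw used in the example.

```python
import math

# The 20 statistics exam scores from the example.
scores = [97, 92, 88, 75, 83, 67, 89, 55, 72, 78,
          81, 91, 57, 63, 67, 74, 87, 84, 98, 46]

k = 6                            # chosen number of classes
r = max(scores) - min(scores)    # range: 98 - 46
cw = math.ceil(r / k)            # class width, rounded up (assumed convention)

print(r, round(r / k, 2), cw)
```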
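Grouped frequency distributions have parts besides the class limits and frequencies that can
be found if the limits and the frequencies are given. The class boundaries are obtained by taking half
of the distance from the upper class limit of one class to the lower class limit of the next class. Subtract
this amount from each Lower Class Limit (LL) and add this amount to each Upper Class Limit (UL).
For example, for the consecutive classes 40-49 and 50-59 the distance is 1, so half of it is 0.5 and the
boundaries of the first class are 39.5 and 49.5.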
Range (r) – The difference between the highest data value (l) and the lowest data value (s): $r = l - s$.
Lower Class Limit (LL) – The least value that can belong to a class.
Upper Class Limit (UL) – The greatest value that can belong to a class.
Class Width (CW) – The difference between the upper (or lower) class limits of consecutive
classes. All classes should have the same class width.
Class Midpoint (CM) – The middle value of each data class. To find the class midpoint, average
the upper and lower class limits, that is, $CM = \frac{LL + UL}{2}$.
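The definitions above can be illustrated with a short Python sketch. It assumes the classes 40-49 through 90-99 that these notes use for the examination grades; the names `class_limits`, `half_gap`, `boundaries`, and `midpoints` are illustrative only.

```python
# Class limits (LL, UL) for the examination grades.
class_limits = [(40, 49), (50, 59), (60, 69), (70, 79), (80, 89), (90, 99)]

# Half the gap between the upper limit of one class and the lower limit
# of the next class: (50 - 49) / 2 = 0.5.
half_gap = (class_limits[1][0] - class_limits[0][1]) / 2

# Class boundaries: subtract half the gap from LL, add it to UL.
boundaries = [(ll - half_gap, ul + half_gap) for ll, ul in class_limits]

# Class midpoint: average of the lower and upper class limits.
midpoints = [(ll + ul) / 2 for ll, ul in class_limits]

for (ll, ul), (lb, ub), cm in zip(class_limits, boundaries, midpoints):
    print(f"{ll}-{ul}  boundaries {lb}-{ub}  midpoint {cm}")
```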
From Table 4, the frequency distribution with class boundaries and midpoints is:
Table 4. Examination Grades in Statistics 11
NOTE: The frequency distribution shows how the observations cluster around a central value and
the degree of difference between observations.
MATHEMATICAL NOTATION
The following symbols and variables will have the meanings given below (unless otherwise
specified).
Variables
x = data value
n = number of values in a sample data set
N = number of values in a population data set
f = frequency of a data class
Symbol
$\sum$ (capital sigma) indicates the sum of all values for the following variable or expression.
Example: Using our notation, we can write the statement that the sum of the frequencies in a frequency
table should equal the number of values in the data set as follows:
$\sum f = n$
CUMULATIVE FREQUENCY
The cumulative frequency of a data class is the number of data elements in that class and all
previous classes. The classes can be listed in either ascending or descending order.
Example:
Class    Frequency (f)    Cumulative Frequency
90-99    4                4
80-89    6                10
70-79    4                14
60-69    3                17
50-59    2                19
40-49    1                20
Notice that the last entry in the cumulative frequency column is $n = 20$.
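A minimal Python sketch of this table, assuming the 20 exam scores from the earlier example and the classes listed above: it tallies each class and then accumulates the frequencies with `itertools.accumulate`. The variable names are illustrative.

```python
from itertools import accumulate

# The 20 statistics exam scores from the earlier example.
scores = [97, 92, 88, 75, 83, 67, 89, 55, 72, 78,
          81, 91, 57, 63, 67, 74, 87, 84, 98, 46]

# Classes in descending order, as in the table above.
classes = [(90, 99), (80, 89), (70, 79), (60, 69), (50, 59), (40, 49)]

# Frequency of each class: how many scores lie between LL and UL inclusive.
frequencies = [sum(ll <= x <= ul for x in scores) for ll, ul in classes]

# Cumulative frequency: running total of the class frequencies.
cumulative = list(accumulate(frequencies))

for (ll, ul), f, cf in zip(classes, frequencies, cumulative):
    print(f"{ll}-{ul}  {f}  {cf}")
```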
RELATIVE FREQUENCY
The relative frequency of a data class is the proportion of data elements in that class. We can
calculate the relative frequency for each class as follows:
$\text{relative frequency} = \frac{f}{n}$
Example:
Class    Frequency (f)    Cumulative Frequency    Relative Frequency (f / n)
90-99    4                4                       0.20
80-89    6                10                      0.30
70-79    4                14                      0.20
60-69    3                17                      0.15
50-59    2                19                      0.10
40-49    1                20                      0.05
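The relative frequency column can be reproduced with a few lines of Python. This is only a sketch; the dictionary `frequencies` copies the class frequencies from the table, and the names are illustrative.

```python
# Class frequencies copied from the table above.
frequencies = {"90-99": 4, "80-89": 6, "70-79": 4,
               "60-69": 3, "50-59": 2, "40-49": 1}

# n is the sum of the frequencies.
n = sum(frequencies.values())

# Relative frequency of each class is f / n.
relative = {cls: f / n for cls, f in frequencies.items()}

for cls, rf in relative.items():
    print(f"{cls}  {rf:.2f}")
```

The relative frequencies of all classes add up to 1 (or 100%), which is a quick check on the arithmetic.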
the bars represents the frequencies and the bottom of the bars is
Example: Created in Excel from the data used in Table 4:
[Figure: histogram of the examination grades; bars at 94.5, 84.5, 74.5, 64.5, 54.5, and 44.5 along the x-axis, labeled "Class limits"]
Notice that the bar for each class is centered at the class
midpoint, and the bars for successive classes touch.
added to the right, keeping the same distance along the x-axis.
to the side and the upper class limits are labeled along the
bottom. An extra space is added to the left of the x-axis and the
frequency of this extra one is zero.
Relative frequency graphs use the proportion for each
group instead of the frequencies. The proportion for each group
Pareto charts represent data of a categorical frequency distribution in a bar graph. The bars
need to be the same width and the same space should be used between each bar.
[Figure: Pareto chart with frequency bars (11, 9, 0) and a cumulative percentage line (55.00%, 100.00%, 100.00%)]
[Figure: bar graph of counts by class level (Freshman, Sophomore, Junior, Senior)]
connects the dots with a line to show the trend over time.
[Figure: time series graph with years beginning at 2003 along the x-axis]
[Figure: pie chart of the examination grade classes: 40-49, 5%; 50-59, 10%; 60-69, 15%; 70-79, 20%; 80-89, 30%; 90-99, 20%]
A STEM-AND-LEAF PLOT uses the first digit (or digits) of each value as the stem and the last digit as the
leaf to form groups or classes.
Example 1: A 100 item test was given to 25 statistics students. The result is shown below:
55 32 20 22 43 14 17 48 24
31 21 22 35 23 36 23 18 25
13 28 12 29 13 18 19
Make a stem-and-leaf plot of the above data.
Solution:
Step 1. Arrange the data in ascending order (from lowest to highest):
12 13 13 14 17 18 18 19 20
21 22 22 23 23 24 25 28 29
31 32 35 36 43 48 55
Step 2. Separate the data according to classes. Using the first digit to separate the classes, we have:
12 13 13 14 17 18 18 19
20 21 22 22 23 23 24 25 28 29
31 32 35 36
43 48
55
Step 3. Use the first digit for the leading digit (or stem) and list all the last digits in order for the
trailing digit (or leaf):
Stem Leaf
1 2 3 3 4 7 8 8 9
2 0 1 2 2 3 3 4 5 8 9
3 1 2 5 6
4 3 8
5 5
Interpretation:
The stem-and-leaf plot shows that the largest group of students (10 out of 25) obtained scores from 20 to 29.
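The three steps of this example can be sketched in Python. The snippet assumes two-digit scores, so the stem is the tens digit (`x // 10`) and the leaf is the units digit (`x % 10`); the variable names are illustrative.

```python
from collections import defaultdict

# The 25 test scores from Example 1.
scores = [55, 32, 20, 22, 43, 14, 17, 48, 24,
          31, 21, 22, 35, 23, 36, 23, 18, 25,
          13, 28, 12, 29, 13, 18, 19]

# Steps 1 and 2: sort the data and group the values by their first digit.
stems = defaultdict(list)
for x in sorted(scores):
    stems[x // 10].append(x % 10)   # stem = tens digit, leaf = units digit

# Step 3: print each stem followed by its leaves in order.
for stem in sorted(stems):
    print(stem, "|", " ".join(str(leaf) for leaf in stems[stem]))
```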
Solution:
212 213 215 218 221 223 225 226 228 232
236 236 237 238 239 242 245 246 247 248
Prepared by MARIANNE JANE ANTOINETTE D. PUA, M.S. -Page- 8 -
Bio Statistics 2022
Step 2. Separate the data according to classes. Since all of the first digits in the given data are the same
(2), use the second digit to separate the classes.
Step 3. Use the first 2 digits for the leading digits (or stem) and list all the last digits in order for the
trailing digit (or leaf):
Stem Leaf
21 2 3 5 8
22 1 3 5 6 8
23 2 6 6 7 8 9
24 2 5 6 7 8
Interpretation:
The stem-and-leaf plot shows that the largest group of scores (6 out of 20) falls from 230 to 239.
A BOX-AND-WHISKER PLOT graphs five values of the set of data on a number line. The
five values are:
1. The lowest value in the set of data.
2. The lower hinge.
3. The median.
4. The upper hinge.
5. The highest value of the set of data.
A box is drawn from the lower hinge to the upper hinge, and lines (whiskers) are drawn from the box to the
highest and lowest values. The lower hinge is the median of all the values less than or equal to the
median when the data set has an odd number of values, or the median of all values less than the
median when the data set has an even number of values. The upper hinge is the median of all values
greater than or equal to the median when the data set has an odd number of values, or the median of all
values greater than the median when the data set has an even number of values.
Example 1: A 100 item test was given to 25 statistics students. The result is shown below:
55 32 20 22 43 14 17 48 24
31 21 22 35 23 36 23 18 25
13 28 12 29 13 18 19
Solution:
Step 1. Arrange the data in ascending order (from lowest to highest):
12 13 13 14 17 18 18 19 20
21 22 22 23 23 24 25 28 29
31 32 35 36 43 48 55
Interpretation: The box-and-whisker plot shows that the data is not symmetrical and that the data is
positively skewed since the whisker is longer on the right.
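The five values for this example can be computed with a short Python sketch that follows the hinge definition given earlier. The function name `five_values` is hypothetical, and the data is split by position in the sorted list, which is an assumption about how ties with the median are handled.

```python
from statistics import median

def five_values(data):
    """Return (lowest, lower hinge, median, upper hinge, highest)."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    if n % 2:
        # Odd count: the median belongs to both halves.
        lower, upper = xs[:half + 1], xs[half:]
    else:
        # Even count: the halves exclude the median position.
        lower, upper = xs[:half], xs[half:]
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

# The 25 test scores from Example 1.
scores = [55, 32, 20, 22, 43, 14, 17, 48, 24,
          31, 21, 22, 35, 23, 36, 23, 18, 25,
          13, 28, 12, 29, 13, 18, 19]

summary = five_values(scores)
print(summary)
```

With these scores the right whisker (upper hinge to highest value) is longer than the left whisker (lowest value to lower hinge), which agrees with the positive skew noted in the interpretation.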
Example 2.
Interpretation: The box-and-whisker plot shows that the data is not symmetrical and that the
data is negatively skewed since the whisker is longer on the left.
Worksheet no. 2
1. From the given data, construct the corresponding frequency distribution, showing Steps 1
to 7.
a. Ages of 30 ISU students
18 17 31 36 30 35 23 33 22 24
19 27 21 22 24 33 19 26 28 26
29 18 21 25 23 25 28 29 27 18
Class Limits f
40-49 2
50-59 3
60-69 4
70-79 8
80-89 5
90-99 2
Total 24