
Descriptive Statistics: Measures Of Frequency

Measures of frequency are essentially counts: the number of times something occurs. For
example, 20 students are asked to choose one animal from dog, cat, goldfish, or parrot as
their favorite. The results are tabulated below.

Student Number   Animal
1                Dog
2                Dog
3                Goldfish
4                Cat
5                Parrot
6                Dog
7                Cat
8                Cat
9                Cat
10               Goldfish
11               Dog
12               Parrot
13               Parrot
14               Dog
15               Cat
16               Dog
17               Parrot
18               Goldfish
19               Cat
20               Dog

The frequency at which dog occurs is 7 because there were 7 students that chose dog.
*The students that chose dog were student numbers 1, 2, 6, 11, 14, 16, and 20.
The frequency at which cat occurs is 6 because there were 6 students that chose cat.
*The students that chose cat were student numbers 4, 7, 8, 9, 15, and 19.

The frequency at which goldfish occurs is 3 because there were 3 students that chose
goldfish.
*The students that chose goldfish were student numbers 3, 10, and 18.

The frequency at which parrot occurs is 4 because there were 4 students that chose parrot.
*The students that chose parrot were student numbers 5, 12, 13, and 17.
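
The tally above can be reproduced with a few lines of Python. This is only a minimal sketch: the variable names (`choices`, `frequency`, `dog_students`) are illustrative assumptions, not part of the original exercise.

```python
from collections import Counter

# Favorite animal of students 1 through 20, in the order tabulated above.
choices = [
    "Dog", "Dog", "Goldfish", "Cat", "Parrot",
    "Dog", "Cat", "Cat", "Cat", "Goldfish",
    "Dog", "Parrot", "Parrot", "Dog", "Cat",
    "Dog", "Parrot", "Goldfish", "Cat", "Dog",
]

# Counter tallies how many times each animal appears in the list.
frequency = Counter(choices)

# Student numbers (counting from 1) of the students that chose dog.
dog_students = [number for number, animal in enumerate(choices, start=1)
                if animal == "Dog"]

for animal in ["Dog", "Cat", "Goldfish", "Parrot"]:
    print(animal, frequency[animal])
print("Dog was chosen by students", dog_students)
```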

The frequency may be presented as an absolute frequency or a relative frequency. Absolute
frequency is simply the count, while relative frequency is the proportion out of the total. The
relative frequency may be expressed as a fraction, a decimal (approximated if necessary), or
a percentage (approximated if necessary). Another way of presenting the frequency is
cumulative frequency, but another example will be used for this.
Frequency Table
Animal     Frequency
Dog        7
Cat        6
Goldfish   3
Parrot     4
The above table is an example of the frequencies presented as absolute frequencies.

**For frequency in statistics, the choices of dog, cat, goldfish, and parrot are usually called
classes.
(A) Relative Frequency Table
Animal     Relative Frequency
Cat        3/10
Dog        7/20
Goldfish   3/20
Parrot     1/5

(B) Relative Frequency Table
Animal     Relative Frequency
Dog        0.35
Cat        0.3
Parrot     0.2
Goldfish   0.15

(C) Relative Frequency Table
Animal     Relative Frequency
Goldfish   15%
Parrot     20%
Cat        30%
Dog        35%

The above tables are examples of frequencies presented as relative frequencies. Since
there were a total of 20 students, the absolute frequencies are divided by the total of 20.
Table (A) shows the frequencies as simplified fractions. Please note that sometimes fractions
are not simplified so that they have the same denominator. In fact, depending on what you
are presenting, it is possible to justify choosing a particular denominator. Remember, when
presenting information, present it in such a way that is most useful. Table (B) shows the
frequencies as decimal numbers, while table (C) shows the frequencies as percentages.
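
As a rough illustration, the three forms of relative frequency can be computed from the absolute frequencies as sketched here. The names `absolute` and `relative` are assumptions made for this example; `Fraction` comes from Python's standard library and simplifies fractions automatically.

```python
from fractions import Fraction

# Absolute frequencies from the frequency table.
absolute = {"Dog": 7, "Cat": 6, "Goldfish": 3, "Parrot": 4}
total = sum(absolute.values())  # 20 students in total

# Relative frequency = absolute frequency divided by the total.
# Fraction reduces automatically, so 6/20 is stored as 3/10.
relative = {animal: Fraction(count, total) for animal, count in absolute.items()}

for animal, count in absolute.items():
    fraction = relative[animal]       # simplified fraction
    decimal = count / total           # decimal number
    percent = 100 * count / total     # percentage
    print(f"{animal:<10}{str(fraction):<7}{decimal:<6}{percent:.0f}%")
```

The relative frequencies of all classes should add up to 1 (or 100%), which is a quick way to check the arithmetic.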

**Note that “absolute frequency” is sometimes referred to as just “frequency”.

Notice that the rows have been ordered differently in each table. This is to highlight that
you need to take into consideration how you present information. In table (A), the rows are
ordered alphabetically by the left column, from A to Z (another option would be from Z to A).
In table (B), the rows are ordered so that the right column is decreasing, while in table (C),
the right column is increasing. The table for the absolute frequency does not seem to be in
any order, but the left column may have been arranged in the same order that the
choices were presented to the students.
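
The three orderings can be sketched with Python's built-in `sorted`. This is just one possible way to do it, and the variable names are assumptions made for this illustration.

```python
# Absolute frequencies, in the order the choices were presented to the students.
frequency = {"Dog": 7, "Cat": 6, "Goldfish": 3, "Parrot": 4}

# Like table (A): alphabetical by animal name (the left column).
alphabetical = sorted(frequency.items())

# Like table (B): decreasing by frequency (the right column).
decreasing = sorted(frequency.items(), key=lambda row: row[1], reverse=True)

# Like table (C): increasing by frequency (the right column).
increasing = sorted(frequency.items(), key=lambda row: row[1])

for name, rows in [("A", alphabetical), ("B", decreasing), ("C", increasing)]:
    print(name, [animal for animal, count in rows])
```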

Absolute frequencies are usually used when the actual count is more important than how it
compares to the others. Relative frequencies are usually used when it is desirable to
compare the counts. Sometimes, you may want to show both, so you could use the following
table:
           Frequency
Animal     Count   Percent
Dog        7       35%
Cat        6       30%
Goldfish   3       15%
Parrot     4       20%

If the data were collected to find the most popular animal, then showing either absolute or
relative frequency would be acceptable, although the relative frequency sometimes gives a
better idea of how much more popular that animal is than the others. For example, if the two
largest counts differ by 10,000 (ten thousand) out of a total of 10,000,000 (ten million), then
the difference is large as a count but small relative to the total (only 0.1%).
If, for some reason, you need at least five of those particular students to like one of the
animals, then the absolute frequency would be more appropriate.

Cumulative frequency adds the frequencies of all previous classes to the frequency of the
current class. Using the above example, this table may be generated:
Animal     Frequency   Cumulative Frequency
Dog        7           7
Cat        6           13
Goldfish   3           16
Parrot     4           20

There are no entries before dog, so its cumulative frequency is also its frequency. The
cumulative frequency for cat is the frequency of cat plus all previous frequencies, so 6 + 7 =
13. The cumulative frequency for goldfish is the frequency of goldfish plus all previous
frequencies, so 3 + 6 + 7 = 16. The cumulative frequency for parrot is the frequency of parrot
plus all previous frequencies, so 4 + 3 + 6 + 7 = 20.
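
The running totals described above can be sketched with `itertools.accumulate` from Python's standard library. The list names are assumptions made for this example.

```python
from itertools import accumulate

animals = ["Dog", "Cat", "Goldfish", "Parrot"]
frequencies = [7, 6, 3, 4]

# accumulate yields running totals: 7, 7 + 6, 7 + 6 + 3, 7 + 6 + 3 + 4.
cumulative = list(accumulate(frequencies))

for animal, frequency, running_total in zip(animals, frequencies, cumulative):
    print(f"{animal:<10}{frequency:<4}{running_total}")
```

The last cumulative frequency always equals the total number of data values, here 20.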

Please note that the cumulative frequency is not appropriate for the above example: the
animals have no natural order, so the running totals have no meaning. The cumulative
frequency can be used if the classes (i.e. groups) can be ordered in a meaningful way (such
as if your data are actual numbers).

For example, the heights of certain plants were measured in centimeters. The measurements
were recorded as follows:

107, 139, 197, 209, 281, 254, 163, 150, 127, 308, 206, 187, 169, 83, 127, 133, 140, 143,
130, 144, 91, 113, 153, 255, 252, 200, 117, 167, 148, 184, 123, 153, 155, 154, 100, 117,
101, 138, 186, 196, 146, 90, 144, 119, 135, 151, 197, 171, 190, 169

Clearly, presenting the data this way is not helpful. Before we make a cumulative frequency
table, we need the frequency table.

If we follow what we have been doing and make a frequency table, the following is the result.
Height (cm) Frequency
83 1
90 1
91 1
100 1
101 1
107 1
113 1
117 2
119 1
123 1
127 2
130 1
133 1
135 1
138 1
139 1
140 1
143 1
144 2
146 1
148 1
150 1
151 1
153 2
154 1
155 1
163 1
167 1
169 2
171 1
184 1
186 1
187 1
190 1
196 1
197 2
200 1
206 1
209 1
252 1
254 1
255 1
281 1
308 1
Technically, there is nothing wrong with this frequency table, but neither is it very helpful. A
better frequency table would be:
Height (cm) Frequency
75 – 124 11
125 – 174 24
175 – 224 10
225 – 274 3
275 – 324 2

In this table, it is easier to get a sense of how the heights are distributed. From this table, you
can see that nearly half of the plants (24 of 50) are between 125 cm and 174 cm.

To make such a table, we need some definitions.

Lower class limits: The smallest value that can belong to each class (from the example, the
lower class limits are 75, 125, 175, 225, and 275).

Upper class limits: The largest value that can belong to each class (from the example, the
upper class limits are 124, 174, 224, 274, and 324).

Class boundaries: The numbers used to separate adjacent classes without leaving gaps. Each
boundary is the number in the middle of the gap between the upper class limit of one class
and the lower class limit of the next. In the example, the class boundaries are 124.5, 174.5,
224.5, and 274.5.

Class midpoints: The numbers in the middle of classes. These are found by adding the upper
and lower class limits, then dividing by two, for each class. The class midpoints in the
example are 99.5, 149.5, 199.5, 249.5, and 299.5.

Class width: The difference between two consecutive lower class limits (or two consecutive
upper class limits). The class width should be the same for every pair of consecutive
classes. In the example, the class width is 125 − 75 = 50. **Do NOT make the common MISTAKE
of computing the class width as the difference between the upper and lower class limits of a
single class (in the example, 124 − 75 = 49, which is not the class width).
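
The definitions above can be checked against the example with a short sketch. The variable names are assumptions made for this illustration; the class limits are the ones from the example table.

```python
lower_limits = [75, 125, 175, 225, 275]
upper_limits = [124, 174, 224, 274, 324]

# Class boundaries: midway between one class's upper limit
# and the next class's lower limit.
boundaries = [(upper + next_lower) / 2
              for upper, next_lower in zip(upper_limits, lower_limits[1:])]

# Class midpoints: average of the lower and upper limit of each class.
midpoints = [(lower + upper) / 2
             for lower, upper in zip(lower_limits, upper_limits)]

# Class width: difference between consecutive lower class limits.
widths = [following - current
          for current, following in zip(lower_limits, lower_limits[1:])]

# The common mistake: upper limit minus lower limit of a single class.
not_the_width = upper_limits[0] - lower_limits[0]

print(boundaries)     # [124.5, 174.5, 224.5, 274.5]
print(midpoints)      # [99.5, 149.5, 199.5, 249.5, 299.5]
print(widths)         # [50, 50, 50, 50]
print(not_the_width)  # 49
```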
To make the frequency table:
1) Decide on how many classes (you may decide to change this value during the next step).
Typically, this will be between 5 and 20, but the study might also influence the number of
classes. Another option is to go by Sturges’ guideline (this is just a guideline, not a law/rule),
where the number of classes can be approximated by
$$1 + \frac{\log(n)}{\log(2)}$$
where “n” is the number of data values. In the example, there are 50 data entries, so by
Sturges’ guideline, the suggested number of classes will be around
$$1 + \frac{\log(50)}{\log(2)} \approx 6.643856$$
You will note that the guideline recommends about 7 classes, but the example uses 5.

2) Choose a class width. The following formula can guide you:

$$W \approx \frac{Z - A}{c}$$
where “W” is the class width, “Z” is the largest data value, “A” is the smallest data value, and
“c” is the number of classes you decided on in the first step. The formula is just a guide, and
you will usually round up. Sometimes, you may also change the number of classes.

If we use 7 classes for the example, the suggested class width is around
$$W \approx \frac{308 - 83}{7} \approx 32.142857$$
The result is not a convenient number for computation, so you may choose a more
convenient number. In this case, a more convenient choice for the class width would be 30,
but this would increase the number of classes to 8. Choosing 40 for the class width would
change the number of classes to 6. I chose 50 because it is more convenient, and the
resulting table still shows how the heights are distributed.

3) Choose the first lower class limit, and make sure that it has the same accuracy as the
data. This may be the smallest data value, or a more convenient number that is smaller than
the smallest data value.
For the example, I chose 75; you might have chosen 80 or 83. The reason I chose 75 was
simply aesthetics. You’ll notice that the difference between the largest and smallest data
value is 225, so multiples of 25 would be aesthetically pleasing, hence the class width of 50
and the starting value of 75. You do not need to think this way when making your frequency
table. The most important thing is that the resulting frequency table can show how the data
are distributed.

Regarding accuracy, if for example, the data gathered was measured to two decimal places,
then my chosen class limit would have been 75.00.

4) Calculate the lower class limits for the rest of the classes. To get the next lower class limit,
add the class width to the lower class limit of the class that is right before the class that
you are computing for.

For the example, the starting lower class limit was 75, the class width was 50. The next lower
class limit will be 75 + 50 = 125. After that will be 125 + 50 = 175, and so on until you have 5
lower class limits. Hence, the lower class limits for the example are 75, 125, 175, 225, 275.

5) Choose an appropriate upper class limit for all the classes. To do this, subtract one
accuracy value (the smallest step to which the data are recorded) from the class width, then
add the result to each lower class limit.

In the example, the accuracy is to the nearest whole number, so 1 is the accuracy value.
Hence, one less than the chosen class width of 50 is 49. This is the number that will be
added to the lower class limits to get the upper class limits. Thus, the first upper class limit is
75 + 49 = 124. The next upper class limit will be 125 + 49 = 174. After that is 175 + 49 = 224,
and so on.

If the data gathered was measured to two decimal places, then the accuracy value would be
0.01. Hence, if the class width was 50, the value to add to the lower class limits would be 50
– 0.01 = 49.99. Thus, if 75.00 was the first lower class limit, then the upper class limit would
be 75.00 + 49.99 = 124.99.
6) Count the number of data values that fall in each class. These will be the frequency for
each class. It is okay for classes to have a frequency of zero.

In the example, you will note that there are 11 values from 75 to 124, hence the
frequency for that class is 11. The 11 values are 83, 90, 91, 100, 101, 107, 113, 117, 117,
119, and 123.

Once you’ve entered the frequencies, your table is complete.
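
The six steps above can be sketched in Python as follows. This is only an illustration under the choices made in the text (class width 50, first lower class limit 75, accuracy of 1 cm, 5 classes); the variable names are assumptions, and the estimate of how many classes a given width needs is a rough one based on the range of the data.

```python
import math

heights = [
    107, 139, 197, 209, 281, 254, 163, 150, 127, 308, 206, 187, 169, 83, 127,
    133, 140, 143, 130, 144, 91, 113, 153, 255, 252, 200, 117, 167, 148, 184,
    123, 153, 155, 154, 100, 117, 101, 138, 186, 196, 146, 90, 144, 119, 135,
    151, 197, 171, 190, 169,
]

# Step 1: Sturges' guideline, 1 + log(n) / log(2). A suggestion, not a rule.
suggested_classes = 1 + math.log(len(heights)) / math.log(2)  # about 6.64

# Step 2: width guide W = (Z - A) / c, using 7 classes as in the text.
data_range = max(heights) - min(heights)                      # 308 - 83 = 225
guide_width = data_range / 7                                  # about 32.14

# Rough number of classes needed to cover the range for some candidate widths.
classes_for_width = {w: math.ceil(data_range / w) for w in (30, 40, 50)}

# Choices made in the text.
width, first_lower, accuracy, num_classes = 50, 75, 1, 5

# Steps 3 and 4: each lower limit is the previous lower limit plus the width.
lower_limits = [first_lower + i * width for i in range(num_classes)]

# Step 5: upper limit = lower limit + (class width - one accuracy value).
upper_limits = [lower + width - accuracy for lower in lower_limits]

# Step 6: count the data values that fall in each class.
frequencies = [
    sum(1 for h in heights if lower <= h <= upper)
    for lower, upper in zip(lower_limits, upper_limits)
]

for lower, upper, frequency in zip(lower_limits, upper_limits, frequencies):
    print(f"{lower} to {upper}: {frequency}")
```

Changing `width` and `first_lower` (and `num_classes` to match) is an easy way to see how different choices affect the resulting table.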

Some things to note:


1) Class widths should be equal, but sometimes it is more convenient to adjust the class
width for the first and/or last class. For example, the first class might be “124 or less”, or your
last class might be open ended, such as “274 cm or more”. This depends on the data and
what you are trying to show, and usually occurs if some of your data values are extreme (i.e.
very small or very large compared to the other data values).

2) One quick check is the sum of the frequencies should equal the number of data values. If
that is not the case, there was an error in counting.

3) The accuracy described here is simplified, but works in general. Sometimes accuracy may
be determined by the instrument that is used to collect the measurements/data. In this class,
we will assume the simple accuracy.
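
As an illustration of notes 1 and 2, here is a sketch of a table with open-ended first and last classes, followed by the sum check. The cut-offs "124 or less" and "225 or more" are hypothetical choices made only for this example, as are the variable names.

```python
heights = [
    107, 139, 197, 209, 281, 254, 163, 150, 127, 308, 206, 187, 169, 83, 127,
    133, 140, 143, 130, 144, 91, 113, 153, 255, 252, 200, 117, 167, 148, 184,
    123, 153, 155, 154, 100, 117, 101, 138, 186, 196, 146, 90, 144, 119, 135,
    151, 197, 171, 190, 169,
]

# Each class is a label paired with a rule deciding whether a height belongs to it.
# The first and last classes are open ended (hypothetical cut-offs).
classes = {
    "124 or less": lambda h: h <= 124,
    "125 to 174": lambda h: 125 <= h <= 174,
    "175 to 224": lambda h: 175 <= h <= 224,
    "225 or more": lambda h: h >= 225,
}

frequencies = {
    label: sum(1 for h in heights if belongs(h))
    for label, belongs in classes.items()
}

# Note 2: the frequencies should add up to the number of data values.
counting_is_consistent = sum(frequencies.values()) == len(heights)

for label, frequency in frequencies.items():
    print(f"{label:<14}{frequency}")
print("Sum check passed:", counting_is_consistent)
```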

Now that the frequency table has been made, it is relatively simple to make the cumulative
frequency table. If you follow the previous example with the animals, the cumulative
frequency table for this example will be:

Height (cm)   Frequency   Cumulative Frequency
75 – 124      11          11
125 – 174     24          35
175 – 224     10          45
225 – 274     3           48
275 – 324     2           50

Note that in these examples, I have shown the frequency alongside the cumulative frequency.
Sometimes, only the cumulative frequency is shown, so the class limits column will be
modified. Our example would look like this:

Height (cm), at most   Cumulative Frequency
124                    11
174                    35
224                    45
274                    48
324                    50
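
A cumulative table like this one can be sketched from the class frequencies as follows. The names `at_most` and `above_224` are assumptions made for this example; each cumulative frequency counts the plants whose height is no greater than the listed upper class limit.

```python
from itertools import accumulate

upper_limits = [124, 174, 224, 274, 324]
frequencies = [11, 24, 10, 3, 2]

# Running totals of the class frequencies.
cumulative = list(accumulate(frequencies))

# Map each upper class limit to the number of plants at or below it.
at_most = dict(zip(upper_limits, cumulative))

# The number of plants above a limit is the total minus the cumulative frequency.
total = cumulative[-1]
above_224 = total - at_most[224]

print(at_most)
print("Plants taller than 224 cm:", above_224)
```

This shows the kind of question a cumulative table answers directly: how many data values lie at or below (or above) a particular value.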

Lastly, there are many other modifications that may be made. Whatever you decide to do,
remember that the main purpose is to make the data more presentable so that more
information can be learned from it (usually the distribution, or the number of data values
above or below a particular value).
