
7.0 Frequency Distribution and Descriptive Measures:

When we have collected data, we can describe the data by graphing it
on a histogram (or frequency bar chart) and by using measures called
statistics. Statistics are numerical measures that describe how a
collection (or sample) of data behaves. When such statistics are
correctly derived, they go a long way in describing the population
that the sample came from.

Let us use a product manufacturing example:

A company that manufactures hand lotion fills large containers up to
300 mL of lotion. Containers may be overfilled by up to 10 mL but
never under-filled. An under-filled container is considered a reject
and needs to be brought back to the production line for filling. On a
certain day, 28 containers were sampled and the following volumes of
lotion (in mL) were recorded per container:

301 302 300 300 305
310 297 301 301 304
298 317 299 295 310
290 288 305 300 299
300 310 306 308 298
303 310 304

Required: Construct a frequency distribution table having 6 classes
to summarize the data above.

7.1 Frequency Distribution Table:

Let us show the completed table and discuss how it was arrived at.
The completed frequency distribution table should look like this:

Class Interval  Class Midpoint  Class Boundaries  Frequency  Cumulative Frequency
288-292         290             287.5-292.5       2          2
293-297         295             292.5-297.5       2          4
298-302         300             297.5-302.5       12         16
303-307         305             302.5-307.5       6          22
308-312         310             307.5-312.5       5          27
313-317         315             312.5-317.5       1          28

Frequency Distributions, Descriptive Measures, Excel page 103

We were required to create 6 classes. This means that the whole
range of values from the minimum 288 to the maximum 317 must be
divided into 6 intervals.

These 6 class intervals each have a width of

$$\text{Class width} = \frac{\max_i X_i - \min_i X_i + 1}{k\ \text{classes}} = \frac{317 - 288 + 1}{6} = 5\ \text{mL}$$

The numerator has a +1 so as to include all numbers in the sample.
288 up to 317 is exactly 30 units inclusive (and not 29).

We will now create the first column of the table, labelled “Class
Interval”. Beginning from the smallest value 288, we count 5 units,
including 288 itself, to arrive at the upper value 292. This means that
the first class interval should include the 5 numbers 288, 289, 290, 291
and 292. Like so:

Class Interval
288-292

The second interval begins with the next number after the first
interval’s upper value: 293 and then we add 4 units to get 297. We
can do this process over and over until we get to the last interval.

Class Interval
288-292
293-297
298-302
303-307
308-312
313-317
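The width calculation and the interval construction described above can be sketched in a few lines of Python. This is only an illustrative sketch: the variable names are our own, and the integer division assumes (as happens in this example) that the 30 values from 288 to 317 divide evenly into the 6 classes.

```python
# Lotion fill volumes (mL) from the example above.
volumes = [
    301, 302, 300, 300, 305, 310, 297, 301, 301, 304,
    298, 317, 299, 295, 310, 290, 288, 305, 300, 299,
    300, 310, 306, 308, 298, 303, 310, 304,
]
num_classes = 6  # required by the exercise

lowest, highest = min(volumes), max(volumes)

# The +1 counts both end points: 288 up to 317 is 30 values, not 29.
# Assumption: the count divides evenly by the number of classes.
class_width = (highest - lowest + 1) // num_classes

# Each class interval holds `class_width` consecutive whole numbers.
intervals = [
    (lowest + i * class_width, lowest + (i + 1) * class_width - 1)
    for i in range(num_classes)
]

for lower, upper in intervals:
    print(f"{lower}-{upper}")
```

Running it should list the same six intervals as the column above, from 288-292 up to 313-317.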
Class midpoint is exactly what its name denotes—it is the exact
middle in the interval. For 5 sorted numbers inclusive, the 3rd number
is the midpoint. This is also incidentally the concept of the measure
called median. The median is the exact midpoint among a series of
sorted numbers.

Class Interval  Class Midpoint
288-292         290
293-297         295
298-302         300
303-307         305
308-312         310
313-317         315

Class boundaries represent the exact upper and lower limits (or
boundaries) of the interval. This is the shared limit between two
adjacent intervals. They are usually just the midpoint between the
upper value of one interval and the lowest value of the next higher
interval. So between the first and the second interval, we find that
their boundary is between 292 (of the first class interval) and 293 (of
the 2nd class interval). The midpoint of these two points is exactly
292.5. If we extend this line of reasoning for all the other intervals, we
could easily get the column called class boundaries.

Class Interval  Class Midpoint  Class Boundaries  Frequency  Cumulative Frequency
288-292         290             287.5-292.5
293-297         295             292.5-297.5
298-302         300             297.5-302.5
303-307         305             302.5-307.5
308-312         310             307.5-312.5
313-317         315             312.5-317.5

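As a rough check on the midpoint and boundary columns, the same reasoning can be written out in Python. The sketch below assumes whole-number class limits that are 1 mL apart between adjacent classes, so each boundary lies half a unit outside the class limits; the names are illustrative only.

```python
# Class intervals from the table above, as (lower limit, upper limit) pairs.
intervals = [(288, 292), (293, 297), (298, 302),
             (303, 307), (308, 312), (313, 317)]

rows = []
for lower, upper in intervals:
    midpoint = (lower + upper) / 2
    # Adjacent class limits differ by 1, so the shared boundary is
    # halfway between them: 0.5 below the lower limit, 0.5 above the upper.
    lower_boundary = lower - 0.5
    upper_boundary = upper + 0.5
    rows.append((lower, upper, midpoint, lower_boundary, upper_boundary))

for lower, upper, midpoint, lb, ub in rows:
    print(f"{lower}-{upper}  midpoint {midpoint:.0f}  boundaries {lb} to {ub}")
```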
Once we have succeeded in dividing up the sample range into
neat intervals, it is now time to simply collect the frequency counts in
each interval. This should be straight-forward, though tedious.

Fortunately, we can use Microsoft™ Excel for this.

On an IBM-compatible PC, you may open the Excel program bundled with
Microsoft Office.

Once in Excel, you may enter the data on each cell and use the
arrow keys to move to the next cells when you finish entering each
data point. Try doing this. Try to see as well what the <ENTER> key
does after you encode a number in Excel.



[Screenshot: the worksheet after encoding all 28 data points of the
lotion fill volume example]

You may now copy the details of the frequency table immediately
below the data. Begin the table headings at cell A8. Try writing the
class boundaries as separate column values in columns C and D.



You should have something like this:

[Screenshot: the worksheet with the frequency table headings entered;
the Formula Edit box is labelled]

We will now use an Excel function to create the frequency entries:

1. Highlight cells E9 to E14 by click-dragging the mouse across those
cells.
2. Release the mouse and you should find the cells still highlighted.
3. Click on the Formula Edit box (see screenshot above).
4. Type an equal sign followed by the function: =FREQUENCY(A1:E6,D9:D14).
a. You may do this by typing “=frequency(“
b. then use the mouse to highlight the data range A1 to E6
c. press comma (,)
d. then use the mouse to highlight the bin range upper limits
cells D9 to D14.
e. Press close parenthesis “)”
f. Press CTRL-SHIFT-Enter.
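For readers who want to check the spreadsheet result by hand, the counting that the steps above ask Excel to do can be sketched in Python. This is our own illustration, not part of the Excel procedure: each value is placed in the first class whose upper boundary it does not exceed, using the same upper boundaries as cells D9 to D14.

```python
# Lotion fill volumes (mL), in the order they were recorded.
volumes = [
    301, 302, 300, 300, 305, 310, 297, 301, 301, 304,
    298, 317, 299, 295, 310, 290, 288, 305, 300, 299,
    300, 310, 306, 308, 298, 303, 310, 304,
]

# Upper class boundaries, as entered in cells D9:D14.
upper_boundaries = [292.5, 297.5, 302.5, 307.5, 312.5, 317.5]

frequencies = [0] * len(upper_boundaries)
for value in volumes:
    # Count the value in the first class whose upper boundary it does not exceed.
    for i, boundary in enumerate(upper_boundaries):
        if value <= boundary:
            frequencies[i] += 1
            break

print(frequencies)  # [2, 2, 12, 6, 5, 1]
```

The six counts add up to 28, the number of containers sampled.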



You should now see the frequencies automatically placed in the array
formula cells we have just written (E9 to E14).

[Screenshot: the worksheet with the Frequency column filled in]

To fill up the cumulative frequency column, we could simply enter
the following formulas on each corresponding cell.

Cell   Formula (type beginning with an equal sign “=”)
F9     =E9
F10    =F9+E10
F11    =F10+E11
F12    =F11+E12
F13    =F12+E13
F14    =F13+E14

You may also use the shorter way of typing the formulas on cells
F9 and F10 as above, and then “copy-drag” the cell F10 down to cell
F14. You “copy-drag” cell formulas by clicking on the original cell
(F10, in this case) first, then pointing the mouse at the lower
right-hand corner of the highlighted cell until the pointer becomes a
“+” cross. When the cross appears, click and drag the highlighted cells
down to the last cell you want the copied formula to be on.

You should now have the completed frequency table.

301 302 300 300 305
310 297 301 301 304
298 317 299 295 310
290 288 305 300 299
300 310 306 308 298
303 310 304

Interval  Midpoint  LowerBnd  UpperBnd  Frequency  CumFreq
288-292   290       287.5     292.5     2          2
293-297   295       292.5     297.5     2          4
298-302   300       297.5     302.5     12         16
303-307   305       302.5     307.5     6          22
308-312   310       307.5     312.5     5          27
313-317   315       312.5     317.5     1          28
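The cumulative frequency column is simply a running total of the frequency column, which is what the chain of formulas in F9 to F14 computes. A minimal Python sketch of the same idea, using the standard library's `itertools.accumulate` (the variable names are our own):

```python
from itertools import accumulate

# Frequency column of the completed table.
frequencies = [2, 2, 12, 6, 5, 1]

# Running total: each entry is the previous total plus the current frequency,
# mirroring =F9+E10, =F10+E11, and so on.
cumulative = list(accumulate(frequencies))

print(cumulative)  # [2, 4, 16, 22, 27, 28]
```

The last cumulative frequency should always equal the sample size, here 28.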

We can use the Graph wizard to create a column chart:

1. Copy the Interval column to a clear part of the worksheet.
2. Copy the Frequency column to be beside the Interval column.

   Interval  Frequency
   288-292   2
   293-297   2
   298-302   12
   303-307   6
   308-312   5
   313-317   1

3. Highlight the two columns.
4. Click on the menu item Insert > Chart...
5. Select Standard Chart > Column, then press NEXT>
6. You should be able to see a histogram on Step 2 of 4 of the
   Graphics Wizard. Press NEXT> again.
7. Fill in “Fill Volume” as the Chart title, “mL” as the X-axis title,
   and “Frequency” as the Y-axis title. Then press NEXT> and Finish.

You could delete the “Frequency” legend box at the right side of the
graph and end up with a histogram like this:

[Figure: column chart titled “Fill Volume”. X-axis: mL, with classes
288-292, 293-297, 298-302, 303-307, 308-312 and 313-317. Y-axis:
Frequency, scaled from 0 to 14.]
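If a spreadsheet is not at hand, a rough text version of the same chart can be printed with a short Python sketch. This is only an illustration of the idea (one bar per class, bar length equal to the frequency) and not a replacement for the Excel chart; the bar symbol and layout are our own choices.

```python
# Interval labels and frequencies from the completed frequency table.
intervals = ["288-292", "293-297", "298-302", "303-307", "308-312", "313-317"]
frequencies = [2, 2, 12, 6, 5, 1]

# One text bar per class: one '#' per container counted in that class.
bars = [
    f"{label}  {'#' * count} ({count})"
    for label, count in zip(intervals, frequencies)
]

print("Fill Volume (mL)")
print("\n".join(bars))
```

The longest bar, for the class 298-302, marks where the data are most concentrated.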

7.2 Measures of Central Tendency (Mean, Median, Mode)

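1. Mean $\bar{X}$ = arithmetic average of data

For raw data:

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$

For grouped data:

$$\bar{X} = \frac{\sum_{i=1}^{k} f_i X_i}{n}$$

where f_i = frequency of the ith class,
X_i = midpoint of the ith class,
k = number of classes, and n = total frequency.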
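To see the two mean formulas side by side, the sketch below applies them to the lotion example: the raw formula to the 28 recorded volumes, and the grouped formula to the midpoints and frequencies of the completed table. The variable names are illustrative.

```python
# Raw data: the 28 recorded fill volumes (mL).
volumes = [
    301, 302, 300, 300, 305, 310, 297, 301, 301, 304,
    298, 317, 299, 295, 310, 290, 288, 305, 300, 299,
    300, 310, 306, 308, 298, 303, 310, 304,
]

# Mean for raw data: sum of the observations over their count.
raw_mean = sum(volumes) / len(volumes)

# Grouped data: class midpoints and frequencies from the frequency table.
midpoints = [290, 295, 300, 305, 310, 315]
frequencies = [2, 2, 12, 6, 5, 1]

# Mean for grouped data: each midpoint weighted by its class frequency.
n = sum(frequencies)
grouped_mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n

print(f"raw mean     = {raw_mean:.4f}")
print(f"grouped mean = {grouped_mean:.4f}")
```

The two results are close but not identical (about 302.18 against about 302.32), because the grouped formula treats every value in a class as if it sat at the class midpoint.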

2. Median XM

For raw data:


Arrange data in ascending or descending order. The median is
determined by getting the middlemost value among the arranged data.
Ex. 1.) 4 8 3 1 6 10 5 2 3
In ascending order: 1 2 3 3 4 5 6 8 10 XM = 4
2.) 4 8 3 1 6 10 5 2 3 12
In ascending order: 1 2 3 3 4 5 6 8 10 12
XM = midpoint between the 5th and 6th ordered items on the list:

$$X_M = \frac{4 + 5}{2} = 4.5$$

Median for Grouped data

$$X_M = L + c\left(\frac{\frac{n}{2} - F_{m-1}}{f_m}\right)$$

where: L - lower class boundary of the median class
n - total frequency
F_{m-1} - cumulative frequency of the class preceding the median class
f_m - frequency of the median class
c - class width

Note: The median class is the class whose cumulative frequency first
exceeds 50% of total frequency.
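The following sketch checks the two raw-data examples with Python's standard `statistics.median`, and then applies the grouped-data formula to the lotion frequency table. It is an illustration under the note's rule (the median class is the first whose cumulative frequency exceeds half the total); the variable names are our own.

```python
import statistics

# Raw data: the two examples above (odd and even number of items).
median_odd = statistics.median([4, 8, 3, 1, 6, 10, 5, 2, 3])
median_even = statistics.median([4, 8, 3, 1, 6, 10, 5, 2, 3, 12])

# Grouped data: lower class boundaries and frequencies of the lotion table.
lower_boundaries = [287.5, 292.5, 297.5, 302.5, 307.5, 312.5]
frequencies = [2, 2, 12, 6, 5, 1]
class_width = 5

n = sum(frequencies)
cumulative_before = 0  # cumulative frequency of the classes already passed
for lower_boundary, freq in zip(lower_boundaries, frequencies):
    # Median class: first class whose cumulative frequency exceeds n / 2.
    if cumulative_before + freq > n / 2:
        grouped_median = (
            lower_boundary + class_width * (n / 2 - cumulative_before) / freq
        )
        break
    cumulative_before += freq

print(median_odd, median_even)
print(f"grouped median = {grouped_median:.2f}")
```

For the lotion table the median class is 298-302 (L = 297.5, F_{m-1} = 4, f_m = 12), which gives a grouped median of about 301.67.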

3. Mode X̂ - the most frequently occurring value

For raw data: it is as defined.

For grouped data: the midpoint of the class interval with the highest
frequency (on a bar chart, it is where the highest bar is).
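Both definitions of the mode can be tried on the lotion example with a short Python sketch. It uses `statistics.multimode` from the standard library (Python 3.8 or later), which returns every value tied for the highest count. Note that 300 and 310 each occur four times in the lotion data, so the raw data has two modes; tools that report a single mode will show only one of them.

```python
import statistics

# Raw data: the 28 recorded fill volumes (mL).
volumes = [
    301, 302, 300, 300, 305, 310, 297, 301, 301, 304,
    298, 317, 299, 295, 310, 290, 288, 305, 300, 299,
    300, 310, 306, 308, 298, 303, 310, 304,
]

# Mode for raw data: all values sharing the highest count.
raw_modes = statistics.multimode(volumes)

# Mode for grouped data: midpoint of the class with the highest frequency.
midpoints = [290, 295, 300, 305, 310, 315]
frequencies = [2, 2, 12, 6, 5, 1]
modal_class = frequencies.index(max(frequencies))
grouped_mode = midpoints[modal_class]

print(raw_modes)     # [300, 310]
print(grouped_mode)  # 300
```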

7.3 Measures of Dispersion (Standard deviation, Range)

1. Variance ($s^2$) or Standard Deviation (s)

For raw data:

$$s^2 = \frac{\sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2}{n - 1} = \frac{n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}{n(n - 1)}$$

Standard deviation: $s = \sqrt{s^2}$

For Casio calculators, standard deviation (s) is represented as xσn-1.
For Sharp calculators, standard deviation (s) is represented as Sx.

For grouped data:

$$s^2 = \frac{n\sum f_i X_i^2 - \left(\sum f_i X_i\right)^2}{n(n - 1)}$$

where f_i is the frequency and X_i the midpoint of the ith class.
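The variance formulas above can be checked against each other with a small Python sketch on the lotion example. It computes the raw-data variance in both forms (squared deviations from the mean, and the shortcut that needs only the sum and the sum of squares), and then the grouped-data variance from the midpoints and frequencies. The names are illustrative only.

```python
import math

# Raw data: the 28 recorded fill volumes (mL).
volumes = [
    301, 302, 300, 300, 305, 310, 297, 301, 301, 304,
    298, 317, 299, 295, 310, 290, 288, 305, 300, 299,
    300, 310, 306, 308, 298, 303, 310, 304,
]
n = len(volumes)
mean = sum(volumes) / n

# Definition form: squared deviations from the mean, divided by n - 1.
var_definition = sum((x - mean) ** 2 for x in volumes) / (n - 1)

# Shortcut form: needs only the sum and the sum of squares.
sum_x = sum(volumes)
sum_x2 = sum(x * x for x in volumes)
var_shortcut = (n * sum_x2 - sum_x ** 2) / (n * (n - 1))

std_dev = math.sqrt(var_shortcut)

# Grouped data: midpoints and frequencies from the frequency table.
midpoints = [290, 295, 300, 305, 310, 315]
frequencies = [2, 2, 12, 6, 5, 1]
n_grouped = sum(frequencies)
sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
sum_fx2 = sum(f * x * x for f, x in zip(frequencies, midpoints))
var_grouped = (n_grouped * sum_fx2 - sum_fx ** 2) / (n_grouped * (n_grouped - 1))

print(f"raw variance     = {var_shortcut:.4f}")
print(f"raw std dev      = {std_dev:.4f}")
print(f"grouped variance = {var_grouped:.4f}")
```

The two raw-data forms agree, as they should. The grouped variance (about 36.08) differs somewhat from the raw one (about 38.60) because grouping replaces each value by its class midpoint.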

2. Range (R) = Maximum (Xi) – Minimum(Xi)

When reporting range, we usually report the maximum and minimum
values and not the exact difference. As in weather temperature
reports: “temperature ranges from a high of 31°C down to a low of
23°C.”

With Microsoft™ Excel, we can easily calculate these descriptive
measures by doing the following:

1. Copy the data into a single column.
2. Click on Tools > Data Analysis. (If your Excel program does not
   have this menu item under Tools, you may have to reinstall Excel
   with the Analysis ToolPak option on.)
3. The Data Analysis dialog box should come out. Choose
   Descriptive Statistics.
4. The Descriptive Statistics dialog box should now come out. On
   the Input Range edit bar, type in A1:A28 (or click on the
   range-selection button at the rightmost edge of the edit bar and
   click-and-drag on the worksheet’s A1 to A28 data range, then press
   Enter to go back to the dialog box).
5. Press OK on the last window, and another worksheet should
   include the following results.

Column1

Mean                 302.1785714
Standard Error       1.174073508
Median               301
Mode                 310
Standard Deviation   6.212613045
Sample Variance      38.59656085
Kurtosis             0.717531273
Skewness             0.009742376
Range                29
Minimum              288
Maximum              317
Sum                  8461
Count                28
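Most of the figures in this output can be cross-checked outside Excel. The sketch below uses Python's standard `statistics` module to reproduce the mean, standard error, median, standard deviation, sample variance, range, minimum, maximum, sum and count for the same 28 volumes. Kurtosis, skewness and mode are left out of this sketch; the standard error is taken here as the sample standard deviation divided by the square root of n.

```python
import math
import statistics

# The 28 recorded fill volumes (mL).
volumes = [
    301, 302, 300, 300, 305, 310, 297, 301, 301, 304,
    298, 317, 299, 295, 310, 290, 288, 305, 300, 299,
    300, 310, 306, 308, 298, 303, 310, 304,
]

n = len(volumes)
std_dev = statistics.stdev(volumes)  # sample standard deviation (n - 1)

summary = {
    "Mean": statistics.mean(volumes),
    "Standard Error": std_dev / math.sqrt(n),
    "Median": statistics.median(volumes),
    "Standard Deviation": std_dev,
    "Sample Variance": statistics.variance(volumes),
    "Range": max(volumes) - min(volumes),
    "Minimum": min(volumes),
    "Maximum": max(volumes),
    "Sum": sum(volumes),
    "Count": n,
}

for name, value in summary.items():
    print(f"{name:<20}{value}")
```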

Quite Easily done, Mate?

Practice Example:

An employee of the personnel department studied the number of
absences made by 50 employees of a large company in one year. The
following are the results of his survey:
10 15 12 19 9 12 7 16 11 11
13 5 18 14 3 12 8 11 12 12
13 15 16 21 9 12 9 8 12 18
10 2 9 10 11 13 6 5 5 17
11 15 16 18 14 13 15 9 6 9

a. Set up the frequency distribution table. Group the data into seven
classes.
b. Determine the class boundaries, class mark and the cumulative
frequency distribution.
c. Draw the histogram and the ogive for this distribution.
d. Calculate the mean, median, mode, average deviation and standard
deviation using:
i. raw data
ii. frequency distribution
