
HANDOUTS IN AEC 221 - STATISTICAL ANALYSIS AND SOFTWARE APPLICATION

ORGANIZING DATA

Week 3

Learning Competencies:

Construct a complete frequency distribution table for qualitative and quantitative data sets

RAW DATA
Raw data is information obtained by observing values of a variable. Data obtained by
observing values of a qualitative variable are referred to as qualitative data. Data obtained by
observing values of a quantitative variable are referred to as quantitative data. Quantitative data
obtained from a discrete variable are also referred to as discrete data and quantitative data
obtained from a continuous variable are called continuous data.

EXAMPLE 2.1 A study is conducted in which individuals are classified into one of sixteen
personality types using the Myers-Briggs type indicator. The resulting raw data would be
classified as qualitative data.

EXAMPLE 2.2 The cardiac output in liters per minute is measured for the participants in a
medical study. The resulting data would be classified as quantitative data and continuous data.

EXAMPLE 2.3 The number of murders per 100,000 inhabitants is recorded for each of several
large cities for the year 1994. The resulting data would be classified as quantitative data and
discrete data.

FREQUENCY DISTRIBUTION FOR QUALITATIVE DATA


A frequency distribution for qualitative data lists all categories and the number of
elements that belong to each of the categories.

EXAMPLE 3.1 A sample of 25 students who attended the orientation is classified by
department. The departments recorded are as follows:

Bus. Ed        ETD        ETD        Nursing    LA-Ed
Criminology    GS         LA-Ed      LA-Ed      LA-Ed
Nursing        Nursing    ETD        HS         GS
LA-Ed          Bus. Ed    LA-Ed      HS         GS
ETD            HS         LA-Ed      LA-Ed      Criminology

The variable, department, is classified into seven categories: Bus. Ed, Nursing, Criminology,
ETD, GS, LA-Ed, and HS. As shown in Table 3.1, the seven categories are listed under the
column entitled Department, and each occurrence of a category is recorded with the symbol /
in order to tally the number of times each department occurs. The number of tallies for each
department is then counted and listed under the column entitled Frequency. Occasionally the term
absolute frequency is used rather than frequency.
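
The tallying procedure described above can be sketched in Python. This is only an illustration, not part of the original handout: it assumes the 25 responses of Example 3.1 are typed in row by row, and it uses the standard-library Counter class to do the counting. The variable names are hypothetical.

```python
from collections import Counter

# Responses as listed in Example 3.1, read row by row (25 students).
responses = [
    "Bus. Ed", "ETD", "ETD", "Nursing", "LA-Ed",
    "Criminology", "GS", "LA-Ed", "LA-Ed", "LA-Ed",
    "Nursing", "Nursing", "ETD", "HS", "GS",
    "LA-Ed", "Bus. Ed", "LA-Ed", "HS", "GS",
    "ETD", "HS", "LA-Ed", "LA-Ed", "Criminology",
]

# Counter adds one to a department's count for each occurrence in the list.
frequency = Counter(responses)

# Print one row per department: name, tally marks, and frequency.
print(f"{'DEPARTMENT':<14}{'TALLY':<12}{'FREQUENCY':>9}")
for department, count in sorted(frequency.items()):
    print(f"{department:<14}{'/' * count:<12}{count:>9}")
print(f"{'Total':<14}{'':<12}{sum(frequency.values()):>9}")
```

Sorting the departments alphabetically is a presentation choice only; a frequency distribution for qualitative data may list the categories in any order.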

Table 3.1 (Frequency Distribution Table)

DEPARTMENT     TALLY        FREQUENCY
Bus. Ed        //           2
HS             ///          3
LA-Ed          ///// ///    8
GS             ///          3
ETD            ////         4
Criminology    //           2
Nursing        ///          3
Total                       25

RELATIVE FREQUENCY OF A CATEGORY

The relative frequency of a category is obtained by dividing the frequency for a category
by the sum of all the frequencies. The relative frequencies for the seven categories in Table 3.1
are shown in Table 3.2. The sum of the relative frequencies will always equal one.

PERCENTAGE
The percentage for a category is obtained by multiplying the relative frequency for that
category by 100. The percentages for the seven categories in Table 3.1 are shown in Table 3.2.
The sum of the percentages for all the categories will always equal 100 percent.

Table 3.2
DEPARTMENT     RELATIVE FREQUENCY    PERCENTAGE
Bus. Ed        2/25 = .08            .08 x 100 = 8%
HS             3/25 = .12            .12 x 100 = 12%
LA-Ed          8/25 = .32            .32 x 100 = 32%
GS             3/25 = .12            .12 x 100 = 12%
ETD            4/25 = .16            .16 x 100 = 16%
Criminology    2/25 = .08            .08 x 100 = 8%
Nursing        3/25 = .12            .12 x 100 = 12%
Total          25/25 = 1.00          100%
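
The two definitions above (relative frequency and percentage) can be checked with a short Python sketch. It is an illustration only. The frequencies in it are the counts obtained by tallying the 25 responses listed in Example 3.1, and the variable names are hypothetical.

```python
# Frequencies obtained by tallying the 25 responses listed in Example 3.1.
frequency = {
    "Bus. Ed": 2, "HS": 3, "LA-Ed": 8, "GS": 3,
    "ETD": 4, "Criminology": 2, "Nursing": 3,
}

total = sum(frequency.values())  # sum of all the frequencies

# relative frequency = frequency / total; percentage = relative frequency x 100
relative = {dept: count / total for dept, count in frequency.items()}
percentage = {dept: 100 * rel for dept, rel in relative.items()}

print(f"{'DEPARTMENT':<14}{'REL. FREQ.':>10}{'PERCENT':>10}")
for dept in frequency:
    print(f"{dept:<14}{relative[dept]:>10.2f}{percentage[dept]:>9.0f}%")

# The two columns should add up to 1 and 100 (up to floating-point rounding).
print(f"{'Total':<14}{sum(relative.values()):>10.2f}{sum(percentage.values()):>9.0f}%")
```

Checking that the relative frequencies sum to one is a quick way to catch a category that was miscounted or entered on the wrong row.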

FREQUENCY DISTRIBUTION FOR QUANTITATIVE DATA

There are many similarities between frequency distributions for qualitative data and
frequency distributions for quantitative data. Terminology for frequency distributions of
quantitative data is discussed first, and then examples illustrating the construction of frequency
distributions for quantitative data are given. Table 2.5 gives a frequency distribution of the
Stanford-Binet intelligence test scores for 75 adults.

IQ score is a quantitative variable and according to Table 2.5, eight of the individuals
have an IQ score between 80 and 94, fourteen have scores between 95 and 109, twenty-four have
scores between 110 and 124, sixteen have scores between 125 and 139, and thirteen have scores
between 140 and 154.

CLASS LIMITS, CLASS BOUNDARIES, CLASS MARKS, AND CLASS WIDTH


The frequency distribution given in Table 2.5 is composed of five classes. The classes
are: 80–94, 95–109, 110–124, 125–139, and 140–154. Each class has a lower class limit and an
upper class limit. The lower class limits for this distribution are 80, 95, 110, 125, and 140. The
upper class limits are 94, 109, 124, 139, and 154.

If the lower class limit for the second class, 95, is added to the upper class limit for the
first class, 94, and the sum is divided by 2, the result, (94 + 95)/2 = 94.5, is both the upper
boundary for the first class and the lower boundary for the second class. Table 2.6 gives all
the boundaries for Table 2.5.

If the lower class limit is added to the upper class limit for any class and the sum is divided
by 2, the class mark for that class is obtained. For the first class, the class mark is
(80 + 94)/2 = 87. The class mark for a class is the midpoint of the class and is sometimes
called the class midpoint rather than the class mark. The class marks for Table 2.5 are shown
in Table 2.6.

The difference between the upper and lower boundaries of any class gives the class width for a
distribution. For the second class, the width is 109.5 - 94.5 = 15, so the class width for the
distribution in Table 2.5 is 15.
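
The calculations for class boundaries, class marks, and class width can be sketched in Python as follows. This is an illustration built only from the class limits quoted above. It assumes that the lowest and highest boundaries extend half a unit beyond the first lower limit and the last upper limit, which is the usual convention but is not stated in the text. The variable names are hypothetical.

```python
# Class limits (lower, upper) of the IQ distribution described for Table 2.5.
class_limits = [(80, 94), (95, 109), (110, 124), (125, 139), (140, 154)]

# Gap between one class's upper limit and the next class's lower limit.
gap = class_limits[1][0] - class_limits[0][1]   # 95 - 94 = 1

rows = []
for lower, upper in class_limits:
    lower_boundary = lower - gap / 2   # halfway to the previous upper limit
    upper_boundary = upper + gap / 2   # halfway to the next lower limit
    mark = (lower + upper) / 2         # class mark (midpoint of the class)
    width = upper_boundary - lower_boundary
    rows.append((lower_boundary, upper_boundary, mark, width))

print(f"{'LIMITS':<10}{'BOUNDARIES':<14}{'MARK':>6}{'WIDTH':>7}")
for (lower, upper), (lb, ub, mark, width) in zip(class_limits, rows):
    limits = f"{lower}-{upper}"
    boundaries = f"{lb}-{ub}"
    print(f"{limits:<10}{boundaries:<14}{mark:>6}{width:>7}")
```

Note that the class width is computed from the boundaries, not from the limits: 94 - 80 is 14, while 94.5 - 79.5 is 15.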

When forming a frequency distribution, the following general guidelines should be followed:

1. The number of classes should be between 5 and 15.

2. Each data value must belong to one, and only one, class.

3. When possible, all classes should be of equal width.

EXAMPLE 2.8 The price for 500 aspirin tablets is determined for each of twenty randomly
selected stores as part of a larger consumer study. The prices are as follows:

Suppose we wish to group these data into seven classes. Since the maximum price is 3.15
and the minimum price is 2.50, the spread in prices is 3.15 - 2.50 = 0.65. Each class should
then have a width equal to approximately 1/7 of 0.65, or about 0.093. There is a lot of
flexibility in choosing the classes while following the guidelines given above. Table 2.8 shows
the results if a class width equal to 0.10 is selected and the first class begins at the minimum
price.
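
The class construction described in Example 2.8 can be sketched in Python. This is an illustration under stated assumptions, not a reproduction of Table 2.8: it uses only the minimum, maximum, number of classes, and class width given in the text, and it assumes that prices are recorded to the nearest cent, so each upper class limit is 0.01 below the next lower class limit. The decimal module is used so that the cents are computed exactly.

```python
from decimal import Decimal

minimum, maximum = Decimal("2.50"), Decimal("3.15")
number_of_classes = 7

spread = maximum - minimum                  # 0.65
approx_width = spread / number_of_classes   # about 0.093
width = Decimal("0.10")                     # rounded up to a convenient width
unit = Decimal("0.01")                      # assumption: prices recorded to the cent

# Build the class limits, starting the first class at the minimum price.
classes = []
for i in range(number_of_classes):
    lower = minimum + i * width
    upper = lower + width - unit
    classes.append((lower, upper))

print(f"spread = {spread}, spread / 7 = {round(approx_width, 3)}")
for lower, upper in classes:
    print(f"{lower} to {upper}")

# The classes must cover the whole range of the data.
covers_range = classes[0][0] <= minimum and classes[-1][1] >= maximum
print("classes cover the data range:", covers_range)
```

Rounding the width up, never down, keeps the last class from ending below the maximum price.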

CUMULATIVE FREQUENCY DISTRIBUTIONS

A cumulative frequency distribution gives the total number of values that fall below
various class boundaries of a frequency distribution.

EXAMPLE 2.11 Table 2.11 shows the frequency distribution of the contents in milliliters of a
sample of 25 one-liter bottles of soda. Table 2.12 shows how to construct the cumulative
frequency distribution that corresponds to the distribution in Table 2.11.
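
A cumulative frequency distribution is a running total of the class frequencies. The sketch below illustrates the idea in Python. Because the values of Tables 2.11 and 2.12 are not reproduced in this handout, it uses the IQ distribution described for Table 2.5 instead, with upper class boundaries obtained by the averaging rule given earlier. The variable names are hypothetical.

```python
from itertools import accumulate

# Upper class boundaries and frequencies of the IQ distribution (Table 2.5).
upper_boundaries = [94.5, 109.5, 124.5, 139.5, 154.5]
frequencies = [8, 14, 24, 16, 13]

# Running total: the number of scores that fall below each upper boundary.
cumulative = list(accumulate(frequencies))

print("Less than  79.5: 0")  # no score lies below the lowest class boundary
for boundary, running_total in zip(upper_boundaries, cumulative):
    print(f"Less than {boundary:>5}: {running_total}")
```

The last cumulative frequency always equals the total number of observations, which here is 75.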

CUMULATIVE RELATIVE FREQUENCY DISTRIBUTIONS

A cumulative relative frequency is obtained by dividing a cumulative frequency by the
total number of observations in the data set. The cumulative relative frequencies for the
frequency distribution given in Table 2.11 are shown in Table 2.12. Cumulative percentages are
obtained by multiplying cumulative relative frequencies by 100. The cumulative percentages for
the distribution given in Table 2.11 are shown in Table 2.12.
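
The two calculations above can be sketched in Python. As the values of Tables 2.11 and 2.12 are not reproduced in this handout, the sketch again uses the IQ distribution described for Table 2.5 as a stand-in. It is an illustration only, and the variable names are hypothetical.

```python
from itertools import accumulate

# Upper class boundaries and frequencies of the IQ distribution (Table 2.5).
upper_boundaries = [94.5, 109.5, 124.5, 139.5, 154.5]
frequencies = [8, 14, 24, 16, 13]

n = sum(frequencies)                         # total number of observations
cumulative = list(accumulate(frequencies))   # cumulative frequencies

# cumulative relative frequency = cumulative frequency / n
cum_relative = [c / n for c in cumulative]
# cumulative percentage = cumulative relative frequency x 100
cum_percent = [100 * r for r in cum_relative]

print(f"{'LESS THAN':<11}{'CUM. FREQ.':>10}{'CUM. REL.':>11}{'CUM. %':>9}")
for boundary, c, r, p in zip(upper_boundaries, cumulative, cum_relative, cum_percent):
    print(f"{boundary:<11}{c:>10}{r:>11.3f}{p:>8.1f}%")
```

The final cumulative relative frequency is always 1 and the final cumulative percentage is always 100, since every observation lies below the highest class boundary.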

References:

Galera, Orpha Josefn M. Handouts in Organizing Data. June 2020.

Stephens, Larry J. Beginning Statistics, 2nd Edition. McGraw-Hill, New York, USA (e-),
pp. 16-17.
