
Lesson 2: DATA COLLECTION AND PRESENTATION

Learning Objectives

 State the different methods of collecting and presenting data
 Differentiate probability from non-probability sampling
 Construct a frequency distribution table
 Enumerate the different graphical presentations

Data collected are useless and meaningless unless they are properly
presented for analysis and interpretation. All statistical procedures help to describe
data.

In this lesson, you will learn the different ways of presenting data, either tabular or
graphical. These methods are important because they show the characteristics of the
data in a more direct manner than is possible with statistical analysis alone.

A. Data Collection

 Methods of Data Collection

 Direct Method. Also referred to as the interview method, which may be
structured or unstructured. It is mainly used for a small sample size.
It involves a person-to-person exchange of ideas between the one
soliciting information (the interviewer) and the one supplying the
information (the interviewee).

 Indirect Method. Popularly known as the paper-and-pencil method or the
questionnaire method. The researcher has to prepare questions relevant
to the subject of the study.

 Registration Method. Also referred to as documentary analysis, where the
researcher makes use of data, facts, or information already on file. These
documents are kept as required by a certain law or policy. Examples
include birth and death records, licenses, and other records.

 Observation Method. Data pertaining to the behavior of an individual or a
group of individuals at the time a given situation occurs are best
obtained by direct observation. Subjects may be observed individually or
collectively, depending on the target of the investigator. This method is
also used when the objects of the study cannot talk or write, such as
plants and animals.

 Experimental Method. This method examines the cause and effect of
certain phenomena. Data are obtained through a series of experiments
that produce laboratory results.

B. Sampling Techniques

 Probability Sampling.

It is a sampling procedure wherein every element of the population is given a
nonzero chance of being selected as a sample. This means that everyone in
the population has a chance of being included in the sample. It is also known as
Random Sampling.

 Simple Random Sampling. Selection is done fairly and without bias. The
researcher applies no criteria of his own and remains objective in selecting
samples. Examples: drawing the winning stub from a tambiolo (raffle drum);
selecting numbers from a table of random numbers.

 Systematic Sampling. The researcher obtains samples by taking every nth
element from a list, or by following some other fixed pattern. The starting
point can be chosen through random selection.

 Stratified Sampling. The population is divided into strata (subgroups), and
samples are drawn from each stratum in equal or proportional numbers. This
technique is commonly used when there are several sources of data.

 Cluster Sampling. This technique chooses samples in groups. The groups
(clusters) are selected at random. When a group is chosen, everyone in the
group, regardless of who they are, is included in the sample.

 Multistage Sampling. This technique selects samples in several successive
stages of sampling.
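
The probability sampling techniques above can be illustrated with a short Python sketch. This is only one possible reading of each technique: the population of 100 ID numbers, the two strata, the ten sections, and all function names are hypothetical, and the stratified version assumes proportional allocation.

```python
import random

def simple_random_sample(population, n, rng):
    """Every element has the same chance of selection (like drawing stubs)."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Pick a random start, then take every k-th element."""
    k = len(population) // n      # sampling interval
    start = rng.randrange(k)      # random start within the first interval
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, n, rng):
    """Proportional allocation: each stratum contributes according to its size.
    Rounding may make the total differ slightly from n for other inputs."""
    total = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        share = round(n * len(members) / total)
        sample.extend(rng.sample(members, share))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Choose whole groups at random; everyone in a chosen group is sampled."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(2)                    # fixed seed so the run is repeatable
population = list(range(1, 101))          # hypothetical population of 100 IDs
strata = {"first_year": population[:60], "second_year": population[60:]}
clusters = {f"section_{s}": population[s * 10:(s + 1) * 10] for s in range(10)}

srs = simple_random_sample(population, 10, rng)
sys_s = systematic_sample(population, 10, rng)
strat = stratified_sample(strata, 10, rng)
clus = cluster_sample(clusters, 2, rng)

print("Simple random:", srs)
print("Systematic:   ", sys_s)
print("Stratified:   ", strat)
print("Cluster:      ", clus)
```

Multistage sampling could be sketched by chaining these functions, for example choosing clusters first and then drawing a simple random sample within each chosen cluster.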

 Non-Probability Sampling.

It is a sampling technique wherein not every element of the population is
given a chance of being selected as a sample. The researcher's own preference
for certain samples influences the selection. It is otherwise known as
non-random sampling.

 Purposive Sampling. It is a non-random sampling technique in which the
researcher chooses samples according to criteria and rules he has defined.

 Quota Sampling. The researcher or investigator limits the number of
samples to the number required for the subject of his study.

 Convenience Sampling. The researcher chooses the location or venue most
convenient for conducting his study, and specifies the place and time
where he can collect his data.

C. Data Presentation

 Textual Presentation

Data collected are presented in paragraph form if they are purely qualitative or when
there are very few numbers involved. This method is commonly adopted by
researchers conducting qualitative research.

 Tabular Presentation

A more effective way of presenting data is by means of a table, which
arranges the data in rows and columns. Data presented in tabular form can be
easily used for comparison and emphasis. One can easily draw relationships from
the presented table.

A statistical table has four components: table heading, body, stubs, and box heads.

Table 1

Frequency Distribution of Respondents in terms of Sex

Sex       Frequency   Percentage
Male      20          29
Female    50          71
Total     70          100
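
The percentages in Table 1 come from dividing each frequency by the total and multiplying by 100. A minimal Python sketch, assuming the table rounds to the nearest whole percent (the variable names are illustrative only):

```python
# Frequencies taken from Table 1
frequencies = {"Male": 20, "Female": 50}

total = sum(frequencies.values())

# Percentage = frequency / total x 100, rounded to the nearest whole number
percentages = {sex: round(100 * f / total) for sex, f in frequencies.items()}

print(f"{'Sex':<8}{'Frequency':>10}{'Percentage':>12}")
for sex, f in frequencies.items():
    print(f"{sex:<8}{f:>10}{percentages[sex]:>12}")
print(f"{'Total':<8}{total:>10}{sum(percentages.values()):>12}")
```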

 Graphical Presentation of Data


Statistics often uses graphs for better analysis of variables. Two basic types
of graphs for analyzing variables are:

- Histogram (bar chart)

- Pie Chart

 Histogram is a standard graph in which the variants (values) of the
variable are represented on one axis and their frequencies on the other
axis. The individual frequencies are then displayed as bars (boxes,
vectors, logs, cones, etc.).

 Pie Chart represents the relative frequencies of the individual variants of a
variable. Each frequency is presented as a proportional sector of a circle.

 Bar Graph is used to represent discrete data, so the bars are separated
instead of being joined as in the histogram. The length of each bar
represents the frequency within the given class. The width of the bars is
arbitrary, but all bars must have the same width, as in the histogram.

 Frequency Polygon is a line chart. The frequency is placed along the
vertical axis and the individual variants are placed along the horizontal
axis. The plotted points are connected by a line.

 Ogive is a graphical presentation of cumulative frequencies or relative
cumulative frequencies. The vertical axis is the cumulative frequency or
relative cumulative frequency. The horizontal axis represents the variants.
The graph always starts at zero at the lowest variant and ends at the
total frequency.

 Pareto Graph is a bar chart for a qualitative variable with the bars
arranged by frequency. The variants are on the horizontal axis and are
sorted from the highest frequency (importance) to the lowest.

 Stem-and-Leaf Plot is a device for presenting quantitative data in a
graphical format, similar to a histogram, to assist in visualizing the shape
of a distribution.

To construct a stem-and-leaf display, the observations must first be sorted
in ascending order. When working by hand, this is most easily done by constructing
a draft of the stem-and-leaf display with the leaves unsorted, then sorting the
leaves to produce the final stem-and-leaf display.

Here is the sorted set of data values that will be used in the following
example:

44 46 47 49 63 64 66 68 68 72 72
75 76 81 84 88 106

In this example the leaf represents the ones place and the stem represents the
rest of the number. The stem-and-leaf display is drawn with two columns separated
by a vertical line. The stems are listed to the left of the vertical line. It is important
that each stem is listed only once and that no numbers are skipped, even if it means
that some stems have no leaves. The leaves are listed in increasing order in a row to
the right of each stem.

Stem | Leaf
   4 | 4 6 7 9
   5 |
   6 | 3 4 6 8 8
   7 | 2 2 5 6
   8 | 1 4 8
   9 |
  10 | 6
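
The construction just described can be sketched in Python. This is a minimal illustration, assuming whole-number data with the ones digit as the leaf; the function name `stem_and_leaf` is hypothetical.

```python
def stem_and_leaf(values):
    """Return the lines of a stem-and-leaf display (leaf = ones digit)."""
    data = sorted(values)                 # observations in ascending order
    leaves_by_stem = {}
    for v in data:
        leaves_by_stem.setdefault(v // 10, []).append(v % 10)

    lines = []
    # List every stem from lowest to highest, even those with no leaves
    for stem in range(data[0] // 10, data[-1] // 10 + 1):
        leaves = " ".join(str(leaf) for leaf in leaves_by_stem.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}".rstrip())
    return lines

values = [44, 46, 47, 49, 63, 64, 66, 68, 68, 72, 72,
          75, 76, 81, 84, 88, 106]

display = stem_and_leaf(values)
for line in display:
    print(line)
```

Because the loop runs over every stem in the range, stems 5 and 9 appear with no leaves, matching the rule that no stem may be skipped.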

D. Frequency Distribution

Frequency Distribution is an arrangement of data showing the frequency of
occurrence of the different values of the variable.

Frequency Distribution Table is the tabular arrangement of data by classes or
categories together with their corresponding frequencies.

Constructing Frequency Distribution Table

Suppose we have collected the raw data shown below:

Given: 70 83 87 76 80 87 75 84 85 76
       81 82 89 77 84 86 71 80 80 79
       84 86 93 83 85 88 72 84 84 92

Steps

1. Find the range (R) of the values. Get the difference between the highest
value (HV) and the lowest value (LV).

R = HV - LV

R = 93 - 70

R = 23

2. Determine the desired number of class intervals (CI). The ideal number of
class intervals is somewhere between 5 and 15, preferably an odd number.
A more systematic way is to apply the formula:

CI = 3.33 + log n

= 3.33 + log 30

= 3.33 + 1.4771

= 4.81, or approximately 5

3. Compute the class size (i). Divide the computed range (R) by the
desired number of class intervals (CI).

i = R / CI

= 23 / 5 = 4.6, rounded to 5

4. Construct a frequency table by making class intervals, starting with the
lowest value as the lower limit of the first class interval. Add the
computed class size (i) to obtain the lower limit of the next class
interval. Continue adding the class size to the lower limits until you
reach the desired number of class intervals (CI). Get the upper limit of
each class interval by subtracting one from the lower limit of the next
class interval.

5. Determine the number of data values (frequency) in every class interval by
tallying the raw data.

6. Write the frequency (f) of each class interval by counting the tally
marks.

7. Determine the class mark (X) of each class interval. Add the lower
limit (LL) and the upper limit (UL), then divide the sum by 2 to get the
midpoint.

8. Determine the class boundaries (CB), or true class limits, by subtracting
0.5 from every lower limit and adding 0.5 to every upper limit.

9. Determine the less than cumulative frequency (<F) and the greater
than cumulative frequency (>F). To determine the less than
cumulative frequencies, write the first class frequency (f) under the
column <F, then add the frequency of the next class interval to obtain
the 2nd <F. To that cumulative sum, add the third class frequency
to obtain the 3rd <F. Continue the process until you reach
the last class interval. To determine the greater than cumulative
frequencies, write the total number of data collected (n) in the first
row under the column >F. Subtract the first class frequency from it to
obtain the 2nd >F, then subtract the second class frequency to obtain the
3rd >F. Continue the operation until the last class interval is
reached.

10. Obtain the relative frequencies (RF) to determine the percentage
distribution of frequencies. Divide the class frequency (f) of each class
interval by the total number of data (n), then multiply by 100.
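
Steps 1 to 4 can be checked with a short Python sketch. It follows the lesson's formula CI = 3.33 + log n exactly as written; note that Sturges' rule is more commonly stated as 1 + 3.322 log n, which would suggest about 6 classes for n = 30. Rounding to the nearest whole number and the variable names are assumptions of this sketch.

```python
import math

# Raw data from the lesson (n = 30)
scores = [70, 83, 87, 76, 80, 87, 75, 84, 85, 76,
          81, 82, 89, 77, 84, 86, 71, 80, 80, 79,
          84, 86, 93, 83, 85, 88, 72, 84, 84, 92]

n = len(scores)

# Step 1: range = highest value - lowest value
value_range = max(scores) - min(scores)

# Step 2: number of class intervals, using the lesson's formula
ci_exact = 3.33 + math.log10(n)
ci = round(ci_exact)

# Step 3: class size = range / number of class intervals
size_exact = value_range / ci
class_size = round(size_exact)

# Step 4: lower limits start at the lowest value and go up by the class size;
# each upper limit is one less than the next lower limit
lower_limits = [min(scores) + k * class_size for k in range(ci)]
class_limits = [(ll, ll + class_size - 1) for ll in lower_limits]

print("R  =", value_range)
print("CI =", round(ci_exact, 2), "->", ci)
print("i  =", size_exact, "->", class_size)
print("Class intervals:", class_limits)
```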

Frequency Distribution Table

Class Interval   f    X    CB            <F   >F   RF
70 - 74          3    72   69.5 - 74.5   3    30   3/30 x 100 = 10
75 - 79          5    77   74.5 - 79.5   8    27   5/30 x 100 = 16.67
80 - 84          12   82   79.5 - 84.5   20   22   12/30 x 100 = 40
85 - 89          8    87   84.5 - 89.5   28   10   8/30 x 100 = 26.67
90 - 94          2    92   89.5 - 94.5   30   2    2/30 x 100 = 6.67

Based on the table above, notice that 70, 75, 80, 85, and 90 are called
lower limits (LL), and 74, 79, 84, 89, and 94 are called upper limits (UL).
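
The remaining columns of the table (f, X, CB, <F, >F, RF) can be reproduced with a short Python sketch. The class limits are copied from the table above; the dictionary layout and variable names are illustrative assumptions, not part of the lesson.

```python
# Raw data from the lesson (n = 30)
scores = [70, 83, 87, 76, 80, 87, 75, 84, 85, 76,
          81, 82, 89, 77, 84, 86, 71, 80, 80, 79,
          84, 86, 93, 83, 85, 88, 72, 84, 84, 92]

n = len(scores)
class_limits = [(70, 74), (75, 79), (80, 84), (85, 89), (90, 94)]

rows = []
less_than = 0        # running total for the <F column
greater_than = n     # the >F column starts at n
for ll, ul in class_limits:
    f = sum(1 for s in scores if ll <= s <= ul)    # tally the class
    less_than += f
    rows.append({
        "class": f"{ll} - {ul}",
        "f": f,
        "X": (ll + ul) / 2,                        # class mark (midpoint)
        "CB": (ll - 0.5, ul + 0.5),                # class boundaries
        "<F": less_than,
        ">F": greater_than,
        "RF": round(100 * f / n, 2),               # relative frequency in %
    })
    greater_than -= f                              # value for the next row

for row in rows:
    print(row)
```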

Try to answer the following :

1. Which class has the greatest frequency ?

2. Which class has the least frequency?

3. What limits does the 85 - 89 class interval have?

4. How many respondents got 80 and above?

5. How many respondents got 89 and below?

6. About what percentage of the respondents belongs to the 75 - 79 class?

7. What is the midpoint of 80 - 84?


 Definition of Terms

 Range (R). It is determined by the difference between the highest and
lowest values.

 Class Interval (CI). It is a grouping or category defined by a
lower limit and an upper limit.

 Class Size (i). It refers to the quotient of the computed range
and the desired number of class intervals.

 Class Frequency (f). It refers to the number of observations
belonging to a class interval, or the number of items within a
category.

 Class Boundaries (CB). These are the true limits, each situated
midway between the upper limit of one interval and the lower limit of
the next interval. They are more precise expressions of the class
limits, extending each limit by 0.5.

 Class Mark (X). It refers to the midpoint of a class interval.
It is obtained by adding the lower and upper limits and dividing the
sum by 2.

 Cumulative Frequency. It is the total number of observations that
have values less than or equal to a specified amount.

 Relative Frequency (RF). It is the percentage of the total
observations that falls in each class interval.

Exercise 2

1. Construct a complete frequency distribution table for the following data.

35 58 43 80 48 85 42 39 63 44 35

54 38 63 62 65 37 76 46 34 34 45

36 44 42 47 51 40 31 80 54 50 50

34 50

Find the following

1.1 Class size
1.2 Number of classes
1.3 Class mark of the 3rd class
1.4 Lower limit of the 4th class
1.5 Upper class boundary of the 3rd class
1.6 Total frequency
1.7 Highest frequency
1.8 Class that comprises 30% of the distribution
1.9 Class with the highest frequency
1.10 Class boundaries of the class with the lowest frequency

2. The grades given to you are the following:

84 81 74 92 80 88 98 79

82 85 97 82 89 84 86 91

85 87 95 90 90 84 93 92

88 85 86 90 86 89 88 91

88 98 96 94 83 92 95 87

From this data, prepare the following:


1. Stem-and-leaf display
2. Complete frequency distribution table using 5 class intervals
3. Histogram
