ADA-Module-Chapter-2


Chapter 2: Data Collection and Presentation

Learning Objectives

At the end of this chapter, the learners should be able to:


Identify the different methods in collecting and presenting data
Distinguish among the various types of frequency distribution tables
Distinguish between probability and non-probability sampling techniques and
identify the various sampling methodologies
Differentiate primary from secondary sources of data

Lesson 2.1: Methods of Collecting and Presenting Data

In Statistics, data collection refers to the process by which information is
gathered from relevant sources in order to obtain data that will ultimately answer a research
question or solve the problem at hand. Various data collection methods are employed to make
inferences about future probabilities or trends. Such methods allow a person to arrive at
an answer to a relevant question. Data collection is a very important part of the statistical
process. In this phase, the researcher gathers all the information needed in the study.

Data Collection Methods


Data collection methods are the ways and means by which a researcher or even a
layman tries to obtain information or data relative to a study. There are 5 commonly used
data collection schemes, namely: direct, indirect, registration, observation, and
experimentation.

a. Direct Method. Popularly known as interview, the direct method gets the
needed data or information directly from the source or respondent. The information is
collected by direct personal interview.
b. Indirect Method. This is a very commonly used method of collecting primary
data. The information is collected through a questionnaire. A questionnaire is a
document prepared by the researcher containing a set of questions given out to obtain the
needed data or information.
c. Registration Method. It refers to continuous, permanent, compulsory
recording of occurrence of vital events together with certain identifying or descriptive
characteristics concerning them, as provided through the civil code, laws or regulations.
Examples of registration method are the records of birth, marriages, and deaths
at the Philippine Statistics Authority. Another example is the registration record of all
Filipinos of voting age at the ComElec.

d. Observation Method. It involves human or mechanical observation of what
people actually do or what events take place. The information is collected by observing
the process at work.
e. Experimentation Method. An experiment is a study of cause and effect. It
involves the deliberate manipulation of one variable while trying to keep all the other
variables constant or the same.

Methods of Data Presentation


In order for readers to better appreciate the statistical data included in an article
or research, the author should provide visuals to better aid the readers in understanding
the information. Presenting the data helps the users to study and explain the statistics
thoroughly. Data presentation is one of the most important aspects of Statistics.
You must present your findings in such a way that the readers can go through them
quickly and understand every point that you wanted to showcase. Data presentation is
defined as the process of using various graphical formats to visually represent the
relationship between two or more data sets so that an informed decision can be made
based on them.
a. Textual – is the simplest one out of the different methods of data presentation. All
you need to do is to write your findings in a coherent manner and your job is done.

The downside of presenting your data in a textual manner is that it makes it
hard for readers to understand the data, especially if your presentation includes
many observations, values, percentages, and the like. As the simplest among the data
presentation methods, it is often used for small-scale data, typically those that are too few
to make a table for, such as a report on the percentage of males and females, employed
or unemployed, etc.

b. Tabular – to avoid the complexities involved in the textual way of data presentation,
people use tables and charts to present data. In this method, data is presented in rows
and columns. Each row and column has an attribute (name, year, sex, age, etc.). It is
against these attributes that data is written within a cell.

Presenting your data in tables saves you time from writing and your readers time
for reading, all while giving you a better visual of the data. However, to effectively present
your data in tabular form, you should keep in mind that the elements of a table should be
displayed very well as shown in the figure above.
c. Graphical – is an attractive method of showcasing numerical data that helps in
analyzing and representing quantitative data visually. A graph is a kind of chart where
data are plotted as variables along coordinate axes. It becomes easy to analyze the extent
of change of one variable based on the change of other variables. Graphical representation
of data can also be done through different mediums such as lines, plots, diagrams, etc.
Below are some of the commonly used types of graphical data presentation methods.

A bar graph presents data with rectangular bars whose lengths are proportional to
their values and which can be placed either horizontally or vertically.

The pie chart is the type of graph in which a circle is divided into sectors where each
sector represents a portion of the whole or a particular percentage of the total 100%.

The line graph represents the data as a series of points that are connected by straight
line segments. These points are called markers.

Data shown in the form of pictures is called a pictograph. Pictorial symbols stand for words,
objects, or phrases, and each symbol can represent a different number of items.

The histogram is a type of graph that consists of adjacent rectangles, where the area of each
rectangle is proportional to the frequency of a variable and its width is equal to the class interval.

The stem-and-leaf plot is a way to present quantitative data in a form that also shows
its frequency distribution. It is a graph that shows numerical data arranged in
order, where each data value is broken into a stem (the leading digit or digits) and a
leaf (the last digit).
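As a rough illustration of how values are split into stems and leaves, the sketch below builds a text-only stem-and-leaf plot in Python. It borrows the 15 English scores that appear in Lesson 2.2 of this chapter; the function name `stem_and_leaf` is only an illustrative choice, and the sketch assumes two-digit whole numbers so that the tens digit is the stem and the ones digit is the leaf.

```python
from collections import defaultdict

# The 15 English scores used later in Lesson 2.2.
scores = [45, 34, 39, 23, 36, 47, 48, 34, 28, 44, 45, 43, 32, 39, 41]

def stem_and_leaf(values):
    """Group sorted values by stem (tens digit) and list their leaves (ones digit)."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)

plot = stem_and_leaf(scores)
for stem in sorted(plot):
    leaves = " ".join(str(leaf) for leaf in plot[stem])
    print(f"{stem} | {leaves}")
```

Each printed row is one stem, and the number of leaves in a row is the frequency of scores in that range.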

Scatter plot or scatter diagram is a way of graphical representation using Cartesian
coordinates (x-y axis) of two variables. The plot shows the relationship between two
variables.

Lesson 2.2: Types of Frequency Distribution Tables
A frequency distribution table is a chart that represents values of any given sample
and their frequency, i.e., the number of times the values have occurred. Through a
frequency distribution table, you can easily handle the outcome of a sample through a
proper organization of data.
A frequency distribution table consists of two columns: Column A and Column B.
Column A lists the different values of outcomes in a given sample. Column B states the
frequency of the outcomes.

Example: Suppose you had veggies for lunch on the 1st, 2nd, 4th, 6th, 7th, 8th, 11th, 13th,
14th, 17th, 19th, 20th, 22nd, 25th, 27th, 29th, and 30th of a month. On the
3rd, 9th, 12th, 16th, and 23rd, you had a hamburger. On the rest of the days, i.e., the 5th,
10th, 15th, and 18th, you had chicken dumpling, and on the 21st, 24th, 26th, and 28th, you had
eggs.
Instead of writing all of those dates, you can present the scenario this
way:
Lunch               Frequency (f)
Veggies             17
Hamburger           5
Chicken dumpling    4
Eggs                4
N = 30
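The same tally can be produced programmatically. The following is a minimal Python sketch, not part of the original module, that counts how many days each lunch was eaten using the dates listed in the example; the variable names are illustrative.

```python
from collections import Counter

# Days of the month for each lunch, copied from the example.
lunch_days = {
    "Veggies": [1, 2, 4, 6, 7, 8, 11, 13, 14, 17, 19, 20, 22, 25, 27, 29, 30],
    "Hamburger": [3, 9, 12, 16, 23],
    "Chicken dumpling": [5, 10, 15, 18],
    "Eggs": [21, 24, 26, 28],
}

# Build one record per day, then count how often each lunch occurs.
lunch_by_day = {day: lunch for lunch, days in lunch_days.items() for day in days}
frequency = Counter(lunch_by_day.values())

for lunch, count in frequency.most_common():
    print(f"{lunch:<18}{count}")
print(f"{'N':<18}{sum(frequency.values())}")
```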

1. Ungrouped Frequency Distribution Table


For example, consider the marks that 15 students scored in English,
out of a total of 50 marks. Here are the scores: 45, 34, 39, 23, 36, 47, 48, 34,
28, 44, 45, 43, 32, 39, 41. Now let us make a table and see how many students got each of
these marks.
An ungrouped frequency distribution illustrates the number of occurrences of each
outcome or score.
Scores    Frequency (f)
23        1
28        1
32        1
34        2
36        1
39        2
41        1
43        1
44        1
45        2
47        1
48        1
N = 15
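As a sketch of how such a table could be built from the raw scores, the Python snippet below counts each distinct score and lists the counts in ascending order. The helper name `ungrouped_frequency` is hypothetical and chosen only for this illustration.

```python
from collections import Counter

scores = [45, 34, 39, 23, 36, 47, 48, 34, 28, 44, 45, 43, 32, 39, 41]

def ungrouped_frequency(values):
    """Return (value, frequency) pairs sorted by value."""
    return sorted(Counter(values).items())

table = ungrouped_frequency(scores)

print("Score  f")
for score, f in table:
    print(f"{score:>5}  {f}")
print(f"N = {len(scores)}")
```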

2. Grouped Frequency Distribution Table
The previous data can be represented in groups as well. The next table
is therefore a grouped frequency distribution table. The groups are commonly known as class
intervals. The class intervals may be given in the problem, or you may have to determine them
yourself.
A grouped frequency distribution illustrates the number of occurrences for each class
interval.

Class Interval    Frequency (f)
1-5               0
6-10              0
11-15             0
16-20             0
21-25             1
26-30             1
31-35             3
36-40             3
41-45             5
46-50             2
N = 15
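The grouping step can be sketched in Python as follows. This is an illustrative helper, not a prescribed procedure: it assumes equal-width classes of 5 starting at 1 and ending at 50, matching the table, and the function name `grouped_frequency` and its parameters are assumptions made for this example.

```python
scores = [45, 34, 39, 23, 36, 47, 48, 34, 28, 44, 45, 43, 32, 39, 41]

def grouped_frequency(values, lowest=1, highest=50, width=5):
    """Count values in equal-width classes: lowest..lowest+width-1, and so on."""
    counts = {(s, s + width - 1): 0 for s in range(lowest, highest + 1, width)}
    for v in values:
        if not lowest <= v <= highest:
            raise ValueError(f"{v} is outside the range {lowest}-{highest}")
        # Lower limit of the class that contains v.
        start = lowest + ((v - lowest) // width) * width
        counts[(start, start + width - 1)] += 1
    return counts

counts = grouped_frequency(scores)
for (lower, upper), f in counts.items():
    print(f"{lower}-{upper}".ljust(10), f)
print("N =", sum(counts.values()))
```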

3. Cumulative Frequency Distribution Table


The cumulative frequency distribution is one of the most important
frequency distributions. In this form of frequency distribution table, the frequencies are
given in cumulative form. Here is how to define and calculate the cumulative frequency
distribution of a given set of data.
The cumulative frequency for each class interval is the
frequency for that interval added to the preceding cumulative total. In other words,
the cumulative frequency is the sum of all frequencies up to and including the current class.
The less-than cumulative frequency (<) accumulates from the lowest class upward, while the
greater-than cumulative frequency (>) accumulates from the highest class downward.
A cumulative frequency distribution illustrates the number of occurrences less than
or greater than each class interval.

Class Interval    Frequency (f)    < Cumulative Frequency    > Cumulative Frequency
1-5               0                0                         15
6-10              0                0                         15
11-15             0                0                         15
16-20             0                0                         15
21-25             1                1                         15
26-30             1                2                         14
31-35             3                5                         13
36-40             3                8                         10
41-45             5                13                        7
46-50             2                15                        2
N = 15
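The two cumulative columns can be computed from the frequency column with a running sum. The Python sketch below uses the ten class frequencies from the grouped table in item 2, listed from the lowest class to the highest; the variable names are illustrative.

```python
from itertools import accumulate

# Frequencies of the ten classes, from the lowest class up to 46-50.
frequencies = [0, 0, 0, 0, 1, 1, 3, 3, 5, 2]

# Less-than cumulative frequency: running total from the lowest class upward.
less_than_cf = list(accumulate(frequencies))

# Greater-than cumulative frequency: running total from the highest class downward.
greater_than_cf = list(accumulate(reversed(frequencies)))[::-1]

for f, lt, gt in zip(frequencies, less_than_cf, greater_than_cf):
    print(f"{f:>2} {lt:>3} {gt:>3}")
```

The last entry of the less-than column and the first entry of the greater-than column both equal N, which is a quick check on the arithmetic.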

4. Relative Frequency Distribution Table
A relative frequency distribution table is a chart that displays the proportion of
observations that fall under each value or class, based on the sample. The table will help you to
develop an idea of how often a particular event occurs compared to the
entire count of events. Note that determining the relative frequency
distribution of a particular set of data is all about the percentages, rather than the counts.

A relative frequency distribution illustrates the percentage of the occurrences in each
class interval.

Class Interval    Frequency (f)    Relative Frequency
1-5               0                0/15 = 0 = 0%
6-10              0                0/15 = 0 = 0%
11-15             0                0/15 = 0 = 0%
16-20             0                0/15 = 0 = 0%
21-25             1                1/15 = 0.0667 = 6.67%
26-30             1                1/15 = 0.0667 = 6.67%
31-35             3                3/15 = 0.2 = 20%
36-40             3                3/15 = 0.2 = 20%
41-45             5                5/15 = 0.3333 = 33.33%
46-50             2                2/15 = 0.1333 = 13.33%
N = 15                             100%
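The relative frequency column is each class frequency divided by N. A short Python sketch of that arithmetic follows; the class labels follow the grouped table in item 2 and the variable names are illustrative.

```python
# Class frequencies, labeled as in the grouped frequency table.
frequencies = {
    "1-5": 0, "6-10": 0, "11-15": 0, "16-20": 0, "21-25": 1,
    "26-30": 1, "31-35": 3, "36-40": 3, "41-45": 5, "46-50": 2,
}

n = sum(frequencies.values())  # total number of observations

# Relative frequency = class frequency divided by N.
relative = {label: f / n for label, f in frequencies.items()}

for label, rf in relative.items():
    print(f"{label:<8}{frequencies[label]:>3}  {rf:.4f}  {rf:.2%}")
```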

Lesson 2.3: Types of Data and Sampling Techniques


It is almost impossible to gather data from every member of a group of individuals
when conducting research on them. For example, suppose you want to know the average
monthly income of a Filipino nurse. Instead of gathering data from all nurses in the
country, you may just gather data from a sample. To choose a sample, you may employ
various sampling techniques.
Sampling techniques are methods you employ in order to choose a sample from a
population. For instance, you could select every 3rd person in the group, or everyone in a
particular age group, and so on. You must carefully consider your study before choosing
a sampling technique because it has a significant effect on your results. For example, some
sampling techniques are more prone to bias than others.

Data Sampling Techniques


A. Probability Sampling Techniques
Probability Sampling (random sampling) is a sampling technique in which a researcher
uses a set of predetermined criteria and a random selection of population members.
With this selection procedure, each member has a known, nonzero chance of being included
in the sample (an equal chance in the case of simple random sampling). Probability sampling
is our best shot at producing a sample that is accurately representative of the population
and that enables us to draw robust statistical conclusions about the entire group.

a. Simple Random Sampling Technique
Every person in the population has an equal probability of being chosen in
simple random sampling. The entire population should be part of your sampling frame.
Simple random sampling is one of the probability sampling approaches
that help conserve time and resources. It is a reliable way to gather information.
The fact that this method is the most straightforward form of probability sampling is a
significant benefit. It does, however, come with a caveat: it might not select enough
people who have the characteristics we are interested in. We use it when we do not know
anything about the target population beforehand.
Example: A company has decided to give a bonus to 10 of its employees. These
employees will be selected through any random method from the whole company.
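The bonus example can be sketched in Python with the standard `random` module. The roster of 200 employees below is hypothetical (the example does not state the company's size), and `simple_random_sample` is an illustrative helper name.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members, each with an equal chance of selection."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(population, n)

# Hypothetical roster; the company size is an assumption for this sketch.
employees = [f"Employee {i}" for i in range(1, 201)]

bonus_recipients = simple_random_sample(employees, 10, seed=42)
print(bonus_recipients)
```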
b. Systematic Sampling Technique
In systematic sampling, the first person is chosen randomly, and the others are
selected according to a predetermined sampling interval. Arrange the people in the
population in some kind of order and, from a random starting point, select every nth
member to be in the sample.
Example: Suppose you need to choose a sample of 50 people from a population of
100. The sampling interval is 100 / 50 = 2, so you randomly pick either the 1st or the 2nd
person on the list as the starting point and then select every 2nd person after that.
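A minimal Python sketch of this procedure is shown below, assuming the population is already arranged in a list. The interval is taken as the population size divided by the sample size (rounded down), and the helper name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(population, n, seed=None):
    """Pick a random start, then take every k-th member, where k = N // n."""
    if not 0 < n <= len(population):
        raise ValueError("sample size must be between 1 and the population size")
    k = len(population) // n           # sampling interval
    start = random.Random(seed).randrange(k)  # random start within the first interval
    return population[start::k][:n]

# The example: a sample of 50 from an ordered list of 100 people.
people = list(range(1, 101))
sample = systematic_sample(people, 50, seed=1)
print(sample[:5], "...", len(sample), "selected")
```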
c. Stratified Sampling Technique
Stratified sampling entails breaking the population up into smaller groups that
might have significant differences. Ensuring that each subgroup is fairly represented in
the sample enables you to reach more accurate findings. You can employ this sampling
technique by dividing the population into smaller groups or strata according to the
pertinent property (e.g., gender, age, residence area, etc.). You determine the appropriate
number of individuals to sample from each subgroup based on the population's overall
proportions. Then you choose a sample from each subgroup using random or systematic
sampling.
We employ this sort of sampling when seeking representation from all of the
population's subgroups. However, stratified sampling requires thorough familiarity
with the population's demographic characteristics.
Example: A researcher wants to know the number of people in a country who went
to college. He/she would divide the country into cities and then further divide the cities
into age groups. He/she would then randomly select a sample from each age group to get
information about the topic.
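Proportional allocation, as described above, can be sketched in Python as follows. The strata and their sizes are hypothetical, the helper name `stratified_sample` is illustrative, and simple random sampling is assumed within each stratum. Because each stratum's share is rounded, the total may differ slightly from the requested size when the proportions do not divide evenly.

```python
import random

def stratified_sample(strata, n, seed=None):
    """Sample from each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        share = round(n * len(members) / total)  # proportional allocation
        sample[name] = rng.sample(members, share)
    return sample

# Hypothetical age-group strata with 500, 300, and 200 members.
strata = {
    "18-29": [f"A{i}" for i in range(500)],
    "30-49": [f"B{i}" for i in range(300)],
    "50 and above": [f"C{i}" for i in range(200)],
}

sample = stratified_sample(strata, n=100, seed=7)
for name, members in sample.items():
    print(name, len(members))
```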
d. Cluster Sampling Technique
In cluster sampling, we break down the overall population into smaller groups,
each of which should share the features of the population as a whole. We then choose entire
subgroups randomly rather than merely picking individual people. You might include every
person from each sampled group if it is practically feasible. If the clusters are large, you
can also sample people from each cluster using one of the methods mentioned above.
The sample has a higher chance of error because there may be significant
differences between clusters, but the method is very helpful for handling large and dispersed
populations. It is challenging to ensure that the sampled clusters accurately reflect the
entire population.
Example: A mobile company is looking to survey people in a country about their
usage of phones. It would divide the country into cities, known as clusters, and could
further divide the more populated cities into areas (smaller clusters). It would then
randomly select some of these clusters and survey the people in them.
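A one-stage version of cluster sampling, in which whole clusters are drawn at random and everyone in a chosen cluster is surveyed, might look like the Python sketch below. The areas and residents are hypothetical placeholders, and `cluster_sample` is an illustrative helper name.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # draw cluster names
    return {name: clusters[name] for name in chosen}

# Hypothetical areas (clusters) and the residents in each.
clusters = {
    "Area A": ["a1", "a2", "a3"],
    "Area B": ["b1", "b2"],
    "Area C": ["c1", "c2", "c3", "c4"],
    "Area D": ["d1"],
}

surveyed = cluster_sample(clusters, 2, seed=3)
for area, residents in surveyed.items():
    print(area, residents)
```

If the chosen clusters are large, a second stage could sample people within each one using simple random or systematic sampling.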
B. Non-Probability Sampling Techniques
Non-Probability Sampling (non-random sampling) is when participants
are chosen by the researcher's judgment or convenience rather than at random. This type of
sampling does not follow a set or predetermined random selection procedure. Due to this, it
is challenging to ensure that every member of a population has an equal chance of being
represented in a sample. It enables simple data collection, but it carries a considerable
risk of producing a non-representative sample that cannot yield generalizable conclusions.
a. Convenience Sampling Technique
Because participants are chosen based on their availability and willingness to
participate, this sampling technique may be the simplest. Only those people who are easily
accessible and available to participate in the study are included in convenience sampling.
Although it is quick and affordable, this method cannot yield generalizable
conclusions because it is impossible to determine whether the sample reflects the
population. It is called convenience sampling because of how simple it is for the
researcher to conduct the study and contact the subjects. Researchers
have almost no control over the selection of sample elements, which are chosen entirely
based on accessibility rather than representativeness.
This non-probability sampling technique is used when gathering feedback is constrained
by time and money.
Example: If a person is conducting a study about the use of shampoo,
they would go to the people they know instead of the general public.
b. Purposive Sampling Technique
In the purposive sampling technique, the researcher uses their knowledge to
choose a sample that will be most helpful to the research's objectives. This sort of
sampling is also known as selective or judgment sampling. It is frequently employed when
the researcher prefers to learn in-depth information about a particular occurrence rather than
draw general conclusions from statistics, or when the population is relatively small and
focused.
Example: If a researcher wants to gather information about a
particular religion, they would go to the area where it is practiced the most.

c. Snowball Sampling Technique
When subjects are challenging to trace, researchers use the snowball sampling
technique. To find people who are willing to participate in the study, the
researcher asks initial participants to refer other people they know. Through this chain of
referrals, researchers can reach people to interview and gather data in situations where it
is challenging to survey people on a particular topic.
This sampling strategy is also used by researchers when the subject is highly
sensitive or taboo. The sample expands like a snowball as a result of this referral
strategy. This sampling technique works well when it is challenging to pinpoint a sampling
frame. Snowball sampling carries a considerable risk of selection bias because the people
who are referred will have characteristics in common with the person who refers them.
Example: If a researcher is conducting a study about the
psychological effects of STDs, the snowball sampling technique would be useful as STDs
are considered taboo in most areas.
d. Quota Sampling Technique
The quota sampling technique is conducted based on predetermined criteria.
A sample that is intended to be representative is taken from the entire population. This
approach divides the population into groups based on traits and then interviews members of
each group until a set quota is reached. The sample should reflect the
population regarding the proportion of traits and attributes. The researcher stops
collecting data once each group has adequate sample units.
This sampling technique has numerous benefits, including its ability to compare
groups within the population, quick and uncomplicated execution, and lack of need for a
sampling frame. However, the division of the groups may not be correct, and there is a
possibility of some bias.
Example: If our population is composed of 50% women and 50%
men, our sample should be composed of the same proportion of males and females.
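The stopping rule described above (keep interviewing until each group's quota is filled) can be sketched in Python. The respondents, group labels, and quota sizes below are hypothetical, and `quota_sample` is an illustrative helper name. Note that people are taken in the order they become available, not at random, which is what makes this a non-probability method.

```python
def quota_sample(respondents, quotas):
    """Accept respondents in arrival order until every group's quota is filled."""
    remaining = dict(quotas)
    sample = []
    for name, group in respondents:
        if remaining.get(group, 0) > 0:
            sample.append((name, group))
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas are filled, so data collection stops
    return sample

# Hypothetical respondents in the order they were approached.
respondents = [
    ("R1", "female"), ("R2", "female"), ("R3", "male"),
    ("R4", "female"), ("R5", "male"), ("R6", "male"),
]

# A 50% / 50% split for a sample of four people.
sample = quota_sample(respondents, {"female": 2, "male": 2})
print(sample)
```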

Lesson 2.4: Primary vs. Secondary Sources of Data


There are two types of sources of data: primary and secondary. Primary sources
are direct sources of data or other information provided by an individual who actually
conducted a study or witnessed an event. Secondary sources are produced by individuals who
did not directly observe or participate in the events described or who were not the
originators of the information gathered.
a. Primary Sources. These refer to data observed or collected from firsthand
experience which is gathered directly from an original source. Best examples of primary
data sources are interviews and questionnaires.
Advantage: The information you get from a primary source is generally more accurate and
reliable.
Disadvantage: Collection of primary data can be costly and time consuming.

Examples of primary sources include:
diaries, correspondence, ships' logs
original documents, e.g., birth certificates, trial transcripts
biographies, autobiographies, manuscripts
interviews, speeches, oral histories
case law, legislation, regulations, constitutions
government documents, statistical data, research reports
a journal article reporting new research or findings
creative art works, literature
newspaper advertisements and reportage and editorial/opinion pieces

b. Secondary Sources. These refer to information that was previously collected by other
parties, such as individuals or agencies. Some examples of
secondary sources are journals and magazines.
Advantage: Data can be obtained more quickly and less expensively, as it can be done
through books and the internet.
Disadvantage: The available information sometimes does not meet one's specific
needs.
Examples of secondary sources include:
journal articles that comment on or analyze research
textbooks
dictionaries and encyclopedias
books that interpret, analyze
political commentary
biographies
dissertations
newspaper editorial/opinion pieces
criticism of literature, art works or music

Exercises

Do as indicated. Provide answers to questions on the blanks provided and supply
examples whenever necessary. Write legibly and answer in a precise yet comprehensive
manner.
1. Refer to the discussion regarding the methods of data collection. Identify what type of
data collection technique should be used in the following scenarios.
________________1. The teacher wants to describe the participation level of students
by writing down how frequently her students recite.
________________2. A government employee was tasked to coordinate with the
Statistics Authority to find out the mortality rate for the year
2023.
________________3. A social worker talks to indigenous people regarding the
difficulties they face in their relocation site so that he may
report the findings to the office.
________________4. Volunteers go out on the streets to administer a survey
regarding the community’s need for a clean-up drive.
________________5. In a seminar, a researcher wants to experiment how many
males and females are willing to finish a 2-hour workshop
through the attendance form given out before and after the
seminar.
________________6. A teacher wants to determine if taking a nap in between his
subject period can help improve the scores of his students, so
he recorded scores of students in one section which does not
take a nap, and another section which takes a nap.
________________7. Through a ticketing system, a store records how much a client
spends in a single purchase to study the purchasing behavior
of clients in order to determine peak seasons and trends in
sales.
________________8. A researcher talks to women to explore their sentiments
regarding a newly-passed law concerning women and their
children and takes note of them for her future online
publication.
________________9. Doctors have discovered another plant-based medicine that is
believed to be a preventive agent of cancer. In order to test its
effectiveness, they conducted a series of clinical trials and
recorded the findings each time.
________________10. Maria has asked her colleagues to look for records in their
office that could be of use on their ongoing report about the
division of stocks in their company for the recent decade.

2. Refer to the grouped frequency distribution below and supply the remaining columns
with the required values as indicated in the column headers.
Class Interval    f     <cf    >cf    rf
51 – 55           8
56 – 60           19
61 – 65           14
66 – 70           25
71 – 75           16
76 – 80           22
81 – 85           15
86 – 90           15
91 – 95           14
96 – 100          20
101 – 105         9
106 – 110         13
N = 190

3. Give examples of situations where each sampling technique must be used.

Simple Random

Systematic

Stratified

Cluster

Purposive

Convenience

Snowball

Quota
