
Data Collection

Data collection is a systematic process of gathering observations or measurements. Whether
you are performing research for business, governmental or academic purposes, data
collection allows you to gain first-hand knowledge and original insights into your research
problem.

While methods and aims may differ between fields, the overall process of data collection
remains largely the same. Before you begin collecting data, you need to consider:

• The aim of the research
• The type of data that you will collect
• The methods and procedures you will use to collect, store, and process the data

To collect high-quality data that is relevant to your purposes, follow these four steps.

Step 1: Define the aim of your research

Before you start the process of data collection, you need to identify exactly what you want to
achieve. You can start by writing a problem statement: what is the practical or scientific issue
that you want to address and why does it matter?

Next, formulate one or more research questions that precisely define what you want to find
out. Depending on your research questions, you might need to collect quantitative or
qualitative data:

• Quantitative data is expressed in numbers and graphs and is analyzed through
statistical methods.
• Qualitative data is expressed in words and analyzed through interpretations and
categorizations.

If your aim is to test a hypothesis, measure something precisely, or gain large-scale statistical
insights, collect quantitative data. If your aim is to explore ideas, understand experiences, or
gain detailed insights into a specific context, collect qualitative data. If you have several
aims, you can use a mixed methods approach that collects both types of data.

Examples of quantitative and qualitative research aims

You are researching employee perceptions of their direct managers in a large organization.

• Your first aim is to assess whether there are significant differences in perceptions of
managers across different departments and office locations.
• Your second aim is to gather meaningful feedback from employees to explore new
ideas for how managers can improve.

You decide to use a mixed-methods approach to collect both quantitative and qualitative data.

Step 2: Choose your data collection method

Based on the data you want to collect, decide which method is best suited for your research.

• Experimental research is primarily a quantitative method.
• Interviews/focus groups are qualitative methods.
• Surveys, observations, archival research and secondary data collection can be
quantitative or qualitative methods.

Carefully consider what method you will use to gather data that helps you directly answer
your research questions.

Data Collection methods

| Method | When to use | How to collect data |
|---|---|---|
| Experiment | To test a causal relationship. | Manipulate variables and measure their effects on others. |
| Survey | To understand the general characteristics or opinions of a group of people. | Distribute a list of questions to a sample online, in person or over the phone. |
| Interview/focus group | To gain an in-depth understanding of perceptions or opinions on a topic. | Verbally ask participants open-ended questions in individual interviews or focus group discussions. |
| Observation | To understand something in its natural setting. | Measure or survey a sample without trying to affect them. |
| Ethnography | To study the culture of a community or organization first-hand. | Join and participate in a community and record your observations and reflections. |
| Archival research | To understand current or historical events, conditions or practices. | Access manuscripts, documents or records from libraries, depositories or the internet. |
| Secondary data collection | To analyze data from populations that you can’t access first-hand. | Find existing datasets that have already been collected, from sources such as government agencies or research organizations. |

Step 3: Plan your data collection procedures

When you know which method(s) you are using, you need to plan exactly how you will
implement them. What procedures will you follow to make accurate observations or
measurements of the variables you are interested in?

For instance, if you’re conducting surveys or interviews, decide what form the questions will
take; if you’re conducting an experiment, make decisions about your experimental design.

Operationalization

Sometimes your variables can be measured directly: for example, you can collect data on the
average age of employees simply by asking for dates of birth. However, often you’ll be
interested in collecting data on more abstract concepts or variables that can’t be directly
observed.

Operationalization means turning abstract conceptual ideas into measurable observations.
When planning how you will collect data, you need to translate the conceptual definition of
what you want to study into the operational definition of what you will actually measure.

Example of operationalization

You have decided to use surveys to collect quantitative data. The concept you want to
measure is the leadership of managers. You operationalize this concept in two ways:

• You ask managers to rate their own leadership skills on 5-point scales assessing the
ability to delegate, decisiveness and dependability.
• You ask their direct employees to provide anonymous feedback on the managers
regarding the same topics.

Using multiple ratings of a single concept can help you cross-check your data and assess the
test validity of your measures.
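
As a rough illustration of cross-checking the two operationalizations above, the following Python sketch compares a manager's self-ratings with the mean of the employees' ratings on each dimension. The rating values, the dimension names used as dictionary keys, and the helper `rating_gaps` are hypothetical and exist only for this example.

```python
from statistics import mean

# Hypothetical 5-point ratings for a single manager (illustrative values only).
self_rating = {"delegation": 4, "decisiveness": 5, "dependability": 4}
employee_ratings = {
    "delegation": [3, 4, 2, 3],
    "decisiveness": [4, 5, 4, 4],
    "dependability": [4, 4, 5, 3],
}


def rating_gaps(self_scores, employee_scores):
    """Return self-rating minus mean employee rating for each dimension."""
    return {
        dimension: self_scores[dimension] - mean(employee_scores[dimension])
        for dimension in self_scores
    }


gaps = rating_gaps(self_rating, employee_ratings)
for dimension, gap in gaps.items():
    print(f"{dimension}: self-rating exceeds employee mean by {gap}")
```

A large positive gap on a dimension would suggest that the self-rating and the employee feedback are not measuring the concept in the same way, which is worth investigating before relying on either measure.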

Sampling

You may need to develop a sampling plan to obtain data systematically. This involves
defining a population, the group you want to draw conclusions about, and a sample, the group
you will actually collect data from.

Your sampling method will determine how you recruit participants or obtain measurements
for your study. To decide on a sampling method you will need to consider factors like the
required sample size, accessibility of the sample, and timeframe of the data collection.
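
One possible sampling plan is a stratified random sample, in which the same fraction of units is drawn from each subgroup of the population. The sketch below is only an illustration under assumed inputs: the employee list, the `dept` field, the 10% fraction and the function name `stratified_sample` are all hypothetical.

```python
import random


def stratified_sample(population, key, fraction, seed=0):
    """Draw the same fraction from each stratum (at least one unit per stratum)."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    strata = {}
    for unit in population:
        strata.setdefault(key(unit), []).append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: 100 employees in two departments.
employees = [{"id": i, "dept": "sales" if i < 60 else "ops"} for i in range(100)]
sample = stratified_sample(employees, key=lambda e: e["dept"], fraction=0.1)
print(len(sample), "employees sampled")
```

Fixing the random seed is a design choice that makes the selection repeatable, which can help when the sampling procedure has to be documented or audited.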

Standardizing procedures

If multiple researchers are involved, write a detailed manual to standardize data collection
procedures in your study.

This means laying out specific step-by-step instructions so that everyone in your research
team collects data in a consistent way – for example, by conducting experiments under the
same conditions and using objective criteria to record and categorize observations.

This helps ensure the reliability of your data, and you can also use it to replicate the study in
the future.

Creating a data management plan

Before beginning data collection, you should also decide how you will organize and store
your data.

• If you are collecting data from people, you will likely need to anonymize and
safeguard the data to prevent leaks of sensitive information (e.g. names or identity
numbers).
• If you are collecting data via interviews or pencil-and-paper formats, you will need to
perform transcriptions or data entry in systematic ways to minimize distortion.
• You can prevent loss of data by having an organization system that is routinely
backed up.
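
As one way to safeguard identifying information before storage, the sketch below replaces an identifier with a keyed hash and drops the name field. This is pseudonymization rather than full anonymization, and it is only an assumption about how a data management plan might handle it. The field names, the secret key and the helper `pseudonymize` are hypothetical.

```python
import hashlib
import hmac


def pseudonymize(records, secret_key, id_field="employee_id", drop_fields=("name",)):
    """Replace the identifier with a keyed hash and drop direct identifiers."""
    cleaned = []
    for record in records:
        token = hmac.new(
            secret_key, str(record[id_field]).encode("utf-8"), hashlib.sha256
        ).hexdigest()
        row = {
            field: value
            for field, value in record.items()
            if field != id_field and field not in drop_fields
        }
        row["pseudonym"] = token[:16]  # shortened token, for readability only
        cleaned.append(row)
    return cleaned


# Hypothetical raw survey records.
raw = [
    {"employee_id": 101, "name": "A. Example", "rating": 4},
    {"employee_id": 102, "name": "B. Example", "rating": 2},
]
safe = pseudonymize(raw, secret_key=b"replace-with-a-real-secret")
print(safe)
```

The same key always produces the same pseudonym, so responses from one person can still be linked across files, while the key itself must be stored separately from the data.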

Step 4: Collect the data

Finally, you can implement your chosen methods to measure or observe the variables you are
interested in.

Examples of collecting qualitative and quantitative data

To collect data about perceptions of managers, you administer a survey with closed- and
open-ended questions to a sample of 300 company employees across different departments and
locations.

The closed-ended questions ask participants to rate their manager’s leadership skills on scales
from 1–5. The data produced is numerical and can be statistically analyzed for averages and
patterns.

The open-ended questions ask participants for examples of what the manager is doing well
now and what they can do better in the future. The data produced is qualitative and can be
categorized through content analysis for further insights.
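
For the closed-ended ratings in this example, a first statistical summary could be the average rating per department. The following sketch assumes the responses have been entered as simple records; the department names, the ratings and the function `mean_rating_by_group` are made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical closed-ended responses on a 1-5 scale.
responses = [
    {"department": "Sales", "rating": 4},
    {"department": "Sales", "rating": 2},
    {"department": "Support", "rating": 5},
    {"department": "Support", "rating": 4},
    {"department": "Support", "rating": 3},
]


def mean_rating_by_group(rows, group_field="department", value_field="rating"):
    """Group the rows by one field and average another field within each group."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[group_field]].append(row[value_field])
    return {group: mean(values) for group, values in grouped.items()}


averages = mean_rating_by_group(responses)
print(averages)
```

Averages alone do not show whether differences between departments are statistically significant; that would require an appropriate test on the full sample.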

To ensure that high-quality data is recorded in a systematic way, here are some best practices:

• Record all relevant information as and when you obtain data. For example, note down
whether or how lab equipment is recalibrated during an experimental study.
• Double-check manual data entry for errors.
• If you collect quantitative data, you can assess the reliability and validity to get an
indication of your data quality.
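
One common way to double-check manual data entry is to enter the data twice and compare the two passes, while also checking that each value falls within the allowed range. The sketch below assumes 1–5 ratings entered as two lists of equal length; the data and the function `find_entry_problems` are hypothetical.

```python
def find_entry_problems(first_pass, second_pass, low=1, high=5):
    """Flag positions where two entry passes disagree or a value is out of range."""
    if len(first_pass) != len(second_pass):
        raise ValueError("both passes must contain the same number of entries")
    problems = []
    for index, (a, b) in enumerate(zip(first_pass, second_pass)):
        if a != b:
            problems.append((index, "mismatch between entries"))
        elif not (low <= a <= high):
            problems.append((index, "value out of range"))
    return problems


# Hypothetical ratings typed in twice from paper forms.
first = [4, 3, 55, 2, 1]
second = [4, 5, 55, 2, 1]
problems = find_entry_problems(first, second)
print(problems)
```

Flagged positions still need to be checked against the original paper forms; the code only points to where to look.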

Sources of data

Broadly speaking, there are two sources of statistical data: internal and external. An
internal source is information collected from within the organization. For example,
organizations and government departments generate large volumes of information relating to
production, sales, purchases, profits, wages, salaries, etc. These internal data are compiled
in the basic records of the institution. Compiling internal data supports smooth management
and appropriate policy formulation in the organization. Data collected from outside the
organization, on the other hand, are called external data. External data can be collected
either from the primary (original) source or from secondary sources. Such data are termed
primary and secondary data, respectively.

Primary data:

Primary data are first-hand information, collected directly from the source by means of
field studies. Primary data are original and are like raw materials: they are the crudest
form of information. The investigator either collects primary data personally or supervises
their collection. They may be collected on a sample or census basis, or from case studies.

Secondary data:

Secondary data are second-hand information. Data that have already been collected and
processed by some agency or person, and are not being used for the first time, are termed
secondary data. According to M. M. Blair, “Secondary data are those already in
existence and which have been collected for some other purpose.” Secondary data may be
abstracted from existing records, published sources or unpublished sources.

The distinction between primary and secondary data is a matter of degree only. Data that are
primary in the hands of one party become secondary for all others. Generally, data are
primary to the source that collects and processes them for the first time, and secondary for
all other sources that use them later. For example, the population census report is primary
for the Registrar General of India, and the information from the report is secondary for all
of us.

Both primary and secondary data have their respective merits and demerits. Primary data are
original, as they are collected from the source, so they are generally considered more
accurate than secondary data. But collecting primary data requires more money, time and
energy than using secondary data. In an enquiry, a proper choice between the two forms of
information should be made. The choice to a large extent depends on the “preliminaries to
data collection”.

Difference between Primary and Secondary Data

| Aspect | Primary Data | Secondary Data |
|---|---|---|
| Definition | Primary data are those which are collected for the first time. | Secondary data are those which have already been collected by some other person. |
| Originality | Primary data are original because they are collected by the investigator for the first time. | Secondary data are not original because someone else has collected them for their own purpose. |
| Nature of data | Primary data are in the form of raw materials. | Secondary data are in finished form. |
| Reliability and suitability | Primary data are more reliable and suitable for the enquiry because they are collected for a particular purpose. | Secondary data are less reliable and less suitable because someone else collected the data, which may not perfectly match our purpose. |
| Time and money | Collecting primary data is quite expensive in terms of both time and money. | Secondary data require less time and money, so they are economical. |
| Precaution and editing | No particular precaution or editing is required while using primary data, as these have been collected with a definite purpose. | Both precaution and editing are essential, as secondary data were collected by someone else for their own purpose. |

Methods of collecting primary data in statistics

Statistical data, as we have seen, can be either primary or secondary. Primary data are those
collected for the first time and so are in crude form, whereas secondary data are those that
have already been collected.

Primary data are always collected from the source, either by the investigator personally or
through agents. There are different methods of collecting primary data, each with its
relative merits and demerits, and the investigator has to choose a particular method to
collect the information. The choice depends to a large extent on the preliminaries to data
collection. Some of the commonly used methods are discussed below.

1. Direct Personal Observation:

This is a very general method of collecting primary data. Here the investigator directly
contacts the informants, solicits their cooperation and enumerates the data. The information
is collected by direct personal interviews.

The merit of this method is its simplicity. It is difficult for neither the enumerator nor
the informants, because both are present at the spot of data collection. The method can
provide very accurate information, as the investigator collects it personally. But because
the investigator alone is involved in the process, personal bias may affect the accuracy of
the data, so the investigator should be honest, unbiased and experienced. In such cases the
data collected may be fairly accurate. However, the method is quite costly and
time-consuming, so it should be used when the scope of enquiry is small.

2. Indirect Oral Interviews:

This is an indirect method of collecting primary data. Here information is not collected
directly from the source but by interviewing persons closely connected with the problem. The
method is used, for example, to apprehend culprits in cases of theft, murder, etc.
Information relating to a person’s private life, or which the informant hesitates to reveal,
is better collected by this method. The investigator prepares a small list of questions
relating to the enquiry, and the answers are collected by interviewing persons well connected
with the incident. The investigator should cross-examine the informants to get correct
information.

This method saves time and involves relatively less cost. The accuracy of the information
largely depends upon the integrity of the investigator. The investigator should be
experienced and capable enough to inspire confidence in the informants in order to collect
accurate data.

3. Mailed Questionnaire Method:

This is a very commonly used method of collecting primary data. Here information is
collected through a questionnaire, a document prepared by the investigator containing a set
of questions that relate, directly or indirectly, to the problem of enquiry. The
questionnaires are first mailed to the informants with a formal request to answer the
questions and send them back. For a better response, the investigator should bear the postal
charges. The questionnaire should carry a polite note explaining the aims and objectives of
the enquiry and defining the various terms and concepts used in it. In addition, the
investigator should assure informants that the information, and their names if required,
will be kept confidential.

Success of this method greatly depends upon the way in which the questionnaire is drafted.
So the investigator must be very careful while framing the questions. The questions should be:

(i) Short and clear

(ii) Few in number

(iii) Simple and intelligible

(iv) Corroboratory in nature or there should be provision for cross check

(v) Impersonal, non-aggressive type

(vi) Simple alternative, multiple-choice or open-end type

(a) In the simple alternative question type, the respondent has to choose between alternatives
such as ‘Yes or No’, ‘right or wrong’ etc.

For example: Is Adam Smith called the father of Statistics? Yes/No

(b) In the multiple choice type, the respondent has to answer from any of the given
alternatives.

Example: To which sector do you belong?

(i) Primary Sector

(ii) Secondary Sector

(iii) Tertiary or Service Sector

(c) In open-ended or free-answer questions, the respondents are given complete freedom in
answering. For example:

What are the defects of our educational system?

The questionnaire method is very economical in terms of time, energy and money. It is widely
used when the scope of enquiry is large. Data collected by this method are not affected by
the personal bias of the investigator. However, the accuracy of the information depends on
the cooperation and honesty of the informants. The method can be used only if the informants
are cooperative, conscientious and educated, which limits its scope.

4. Schedule Method:

When the informants are largely uneducated or non-responsive, data cannot be collected by
the mailed questionnaire method. In such cases, the schedule method is used. Here the
questionnaires are carried to the informants by enumerators, who are persons appointed by
the investigator for the purpose. They meet the informants directly, explain the scope and
objective of the enquiry and solicit their cooperation. The enumerators put the questions to
the informants, record their answers in the questionnaire and compile them. The success of
this method depends on the sincerity and efficiency of the enumerators, so they should be
sweet-tempered, good-natured, trained and well-behaved.

The schedule method is widely used in extensive studies. It gives fairly correct results, as
the enumerators collect the information directly. The accuracy of the information depends
upon the honesty of the enumerators, who should be unbiased. This method is more costly and
time-consuming than the mailed questionnaire method.

5. From Local Agents:

Sometimes primary data are collected from local agents or correspondents. These agents are
appointed by the sponsoring authorities. They are well conversant with local conditions such
as language, communication, food habits, traditions, etc. Being on the spot and well
acquainted with the nature of the enquiry, they are capable of furnishing reliable
information.

The accuracy of the data collected by this method depends on the honesty and sincerity of
the agents, because they actually collect the information on the spot. Information from a
wide area can be collected by this method at less cost and in less time. The method is
generally used by government agencies, newspapers, periodicals, etc. to collect data.

Information is like raw material or input in an enquiry. The result of the enquiry basically
depends on the type of information used. Primary data can be collected by employing any of
the above methods. The investigator should make a rational choice of the methods to be used,
because the collection of data forms the beginning of the statistical enquiry.

Secondary Data Collection Methods

Definition: Data that have been collected by someone else for a purpose other than the
researcher’s current project, and that have already undergone statistical analysis, are
called secondary data.

Secondary data are readily available from other sources and, as such, there are no specific
collection methods. The researcher can obtain data from sources both internal and external
to the organization. The internal sources of secondary data are:

• Sales Report
• Financial Statements
• Customer details, like name, age, contact details, etc.
• Company information
• Reports and feedback from a dealer, retailer, and distributor
• Management information system

There are several external sources from where the secondary data can be collected. These are:

• Government censuses, like the population census, agriculture census, etc.
• Information from other government departments, like social security, tax records, etc.
• Business journals
• Social Books
• Business magazines
• Libraries
• Internet, where wide knowledge about different areas is easily available.

Secondary data can be both qualitative and quantitative. Qualitative data can be obtained
through newspapers, diaries, interviews, transcripts, etc., while quantitative data can be
obtained through surveys, financial statements and statistics.

One advantage of secondary data is that they are easily available, so less time is required
to gather all the relevant information. They are also less expensive than primary data.
However, the data might not be specific to the researcher’s needs and may be too incomplete
to support a conclusion. Also, the authenticity of research results based on such data may
be open to doubt.
