Research Methodology

Sample Design

A sample design is a definite plan for obtaining a sample from a given population. It refers to the technique or procedure the researcher would adopt in selecting items for the sample. Sample design also lays down the procedure for deciding the number of items to be included in the sample, i.e., the size of the sample. Hence, sample design is determined before the collection of data. Among the various sample design techniques, the researcher should choose one that yields samples that are reliable and appropriate for the research study.

Sample and Sampling


A sample is a part of the total population. It can be an individual element or a group of elements selected from the population. Although it is a subset, it is representative of the population and suitable for research in terms of cost, convenience, and time. The sample group can be selected based on a probability or a non-probability approach. A sample usually consists of various units of the population. The size of the sample is represented by “n”.

A good sample is one which satisfies all, or at least most, of the following conditions:

1. Representativeness: When a sampling method is adopted by the researcher, the basic assumption is that the samples so selected out of the population are the best representatives of the population under study. Thus, good samples are those that accurately represent the population. Probability sampling techniques yield representative samples. In measurement terms, the sample must be valid. The validity of a sample depends upon its accuracy.
2. Accuracy: Accuracy is defined as the degree to which bias is absent from the sample. An
accurate (unbiased) sample is one which exactly represents the population. It is free from any
influence that causes any differences between sample value and population value.
3. Size: A good sample must be adequate in size and reliable. The sample size should be such that
the inferences drawn from the sample are accurate to a given level of confidence to represent the
entire population under study.
4. Homogeneity or Heterogeneity of the universe: Selection of the sample depends on the nature of the universe. If the nature of the universe is homogeneous, then a small sample will represent the behavior of the entire universe, which leads to the selection of a small sample size rather than a large one. On the other hand, if the universe is heterogeneous in nature, then samples are to be chosen from each heterogeneous unit.
5. Number of classes proposed: If a large number of class intervals is to be formed, then the sample size should be larger, because the sample has to represent the entire universe. With a small sample there is the possibility that some classes may not be represented at all.
6. Nature of study: The size of the sample also depends on the nature of the study. For an intensive study that may run for a long time, large samples are to be chosen. Similarly, for general studies a large number of respondents may be appropriate, but if the study is technical in nature then selecting a large number of respondents may cause difficulty while gathering information.

Characteristics of a Good Sample Design

1. Sample design must result in a truly representative sample,
2. Sample design must be such that it results in a small sampling error,
3. Sampling design must be viable in the context of funds available for the research study,
4. Sample design must be such that systematic bias can be controlled in a better way, and
5. Sample design should be such that the results of the sample study can be applied, in general, to the universe with a reasonable level of confidence.

Steps in Sampling Process

1. Defining the target population.
2. Specifying the sampling frame.
3. Specifying the sampling unit.
4. Selection of the sampling method.
5. Determination of sample size.
6. Specifying the sampling plan.
7. Selecting the sample.

1. Defining the Target Population:


Defining the population of interest for business research is the first step in the sampling process. In general, the target population is defined in terms of element, sampling unit, extent, and time frame. The definition should be in line with the objectives of the research study. For example, if a kitchen appliances firm wants to conduct a survey to ascertain the demand for its micro ovens, it may define the population as ‘all women above the age of 20 who cook (assuming that very few men cook)’. However, this definition is too broad and would include nearly every household in the country in the population to be covered by the survey. Therefore, the definition can be further refined at the sampling unit level, that is, all women above the age of 20 who cook and whose monthly household income exceeds Rs. 20,000. This reduces the target population size and makes the research more focused. The population definition can be refined further by specifying the area from which the researcher has to draw his sample, that is, households located in Hyderabad.

A well-defined population reduces the probability of including respondents who do not fit the research objective of the company. For example, if the population is defined as all women above the age of 20, the researcher may end up taking the opinions of a large number of women who cannot afford to buy a micro oven.

2. Specifying the Sampling Frame:

Once the definition of the population is clear, a researcher should decide on the sampling frame. A sampling frame is the list of elements from which the sample may be drawn. Continuing with the micro oven example, an ideal sampling frame would be a database that contains all the households that have a monthly income above Rs. 20,000. However, in practice it is difficult to get an exhaustive sampling frame that exactly fits the requirements of a particular research study. In general, researchers use easily available sampling frames like telephone directories and lists of credit card and mobile phone users. Various private players provide databases developed along various demographic and economic variables. Sometimes, maps and aerial pictures are also used as sampling frames. Whatever the case, an ideal sampling frame is one that covers the entire population and lists the names of its elements only once.
A sampling frame error pops up when the sampling frame does not accurately represent the total population or when some elements of the population are missing. Another drawback in the sampling frame is over-representation: a telephone directory can over-represent households that have two or more connections.

3. Specifying the Sampling Unit:

A sampling unit is a basic unit that contains a single element or a group of elements of the
population to be sampled. In this case, a household becomes a sampling unit and all women
above the age of 20 years living in that particular house become the sampling elements. If it is
possible to identify the exact target audience of the business research, every individual element
would be a sampling unit. This would present a case of primary sampling unit. However, a
convenient and better means of sampling would be to select households as the sampling unit and
interview all females above 20 years, who cook. This would present a case of secondary
sampling unit.

4. Selection of the Sampling Method:

The sampling method outlines the way in which the sample units are to be selected. The choice
of the sampling method is influenced by the objectives of the business research, availability of
financial resources, time constraints, and the nature of the problem to be investigated. All
sampling methods can be grouped under two distinct heads, that is, probability and non-
probability sampling.

5. Determination of Sample Size:

The sample size plays a crucial role in the sampling process. There are various ways of classifying the techniques used in determining the sample size. Two that hold primary importance and are worth mentioning are whether the technique deals with fixed or sequential sampling and whether its logic is based on traditional or Bayesian methods. In non-probability sampling procedures, the allocation of budget, rules of thumb, the number of sub-groups to be analyzed, the importance of the decision, the number of variables, the nature of the analysis, incidence rates, and completion rates play a major role in sample size determination. In the case of probability sampling, however, formulas are used to calculate the sample size after the acceptable level of error and the level of confidence are specified. The details of the various techniques used to determine the sample size are explained at the end of the chapter.

6. Specifying the Sampling Plan:

In this step, the specifications and decisions regarding the implementation of the research process are outlined. Suppose blocks in a city are the sampling units and households are the sampling elements. This step outlines the modus operandi of the sampling plan in identifying houses based on specified characteristics. It covers issues such as how the interviewer is to take a systematic sample of houses, what the interviewer should do when a house is vacant, and what the recontact procedure is for respondents who were unavailable. All these and many other questions need to be answered for the smooth functioning of the research process. These are guidelines that help the researcher in every step of the process. As the interviewers and their co-workers will be on field duty most of the time, a proper specification of the sampling plan makes their work easy, and they do not have to revert to their seniors when faced with operational problems.

7. Selecting the Sample:

This is the final step in the sampling process, where the actual selection of the sample elements is
carried out. At this stage, it is necessary that the interviewers stick to the rules outlined for the
smooth implementation of the business research. This step involves implementing the sampling plan to select the sample required for the survey.

There are two types of sampling methods:

 Probability sampling involves random selection, allowing you to make strong statistical inferences about the whole group.
 Non-probability sampling involves non-random selection based on convenience or
other criteria, allowing you to easily collect data.
Population vs sample

First, you need to understand the difference between a population and a sample, and identify the
target population of your research.

 The population is the entire group that you want to draw conclusions about.
 The sample is the specific group of individuals that you will collect data from.

The population can be defined in terms of geographical location, age, income, and many other
characteristics.

It can be very broad or quite narrow: maybe you want to make inferences about the whole adult
population of your country; maybe your research focuses on customers of a certain company,
patients with a specific health condition, or students in a single school.

It is important to carefully define your target population according to the purpose and practicalities of your project.

If the population is very large, demographically mixed, and geographically dispersed, it might be
difficult to gain access to a representative sample.

Sampling frame
The sampling frame is the actual list of individuals that the sample will be drawn from. Ideally, it
should include the entire target population (and nobody who is not part of that population).

Example (sampling frame): You are doing research on working conditions at Company X. Your population is all 1000 employees of the company. Your sampling frame is the company’s HR database, which lists the names and contact details of every employee.

Sample size

The number of individuals you should include in your sample depends on various factors,
including the size and variability of the population and your research design. There are
different sample size calculators and formulas depending on what you want to achieve
with statistical analysis.

 Simple random sampling: One of the best probability sampling techniques for saving time and resources is the simple random sampling method. It is a reliable method of obtaining information where every single member of a population is chosen randomly, merely by chance. Each individual has the same probability of being chosen to be a part of the sample.
For example, in an organization of 500 employees, if the HR team decides on conducting team
building activities, it is highly likely that they would prefer picking chits out of a bowl. In this
case, each of the 500 employees has an equal opportunity of being selected.
 Cluster sampling: Cluster sampling is a method where the researchers divide the entire
population into sections or clusters that represent a population. Clusters are identified and
included in a sample based on demographic parameters like age, sex, location, etc. This makes
it very simple for a survey creator to derive effective inference from the feedback.
For example, if the United States government wishes to evaluate the number of immigrants
living in the Mainland US, they can divide it into clusters based on states such as California,
Texas, Florida, Massachusetts, Colorado, Hawaii, etc. This way of conducting a survey will be
more effective as the results will be organized into states and provide insightful immigration
data.
 Systematic sampling: Researchers use the systematic sampling method to choose the sample members of a population at regular intervals. It requires selecting a random starting point and a fixed sampling interval that is then repeated through the list. Because the selection pattern is predefined, this sampling technique is the least time-consuming.
For example, a researcher intends to collect a systematic sample of 500 people in a population
of 5000. He/she numbers each element of the population from 1-5000 and will choose every
10th individual to be a part of the sample (Total population/ Sample Size = 5000/500 = 10).
 Stratified random sampling: Stratified random sampling is a method in which the researcher divides the population into smaller groups (strata) that don’t overlap but together represent the entire population. While sampling, these groups can be organized, and a sample can then be drawn from each group separately; a short code sketch after this list illustrates these probability methods.
For example, a researcher looking to analyze the characteristics of people belonging to
different annual income divisions will create strata (groups) according to the annual family
income. Eg – less than $20,000, $21,000 – $30,000, $31,000 to $40,000, $41,000 to $50,000,
etc. By doing this, the researcher concludes the characteristics of people belonging to different
income groups. Marketers can analyze which income groups to target and which ones to
eliminate to create a roadmap that would bear fruitful results.
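
To make the arithmetic in the systematic and stratified examples above concrete, the following is a minimal Python sketch (not from the original text; the population of 5,000 element IDs and the income strata are hypothetical) showing how simple random, systematic, and stratified selection differ in code:

```python
# Illustrative sketch: drawing simple random, systematic, and stratified samples
# from a hypothetical population of 5,000 element IDs using only the standard library.
import random

population = list(range(1, 5001))   # hypothetical element IDs 1..5000
sample_size = 500

# Simple random sampling: every element has an equal chance of selection.
simple_random = random.sample(population, sample_size)

# Systematic sampling: random start, then every k-th element,
# where k = population size / sample size = 5000 / 500 = 10.
k = len(population) // sample_size
start = random.randint(0, k - 1)
systematic = population[start::k]

# Stratified random sampling: split the population into non-overlapping strata
# (hypothetical income bands here) and sample from each in proportion to its size.
strata = {
    "low":    population[:2000],
    "middle": population[2000:4000],
    "high":   population[4000:],
}
stratified = []
for name, stratum in strata.items():
    n_stratum = round(sample_size * len(stratum) / len(population))
    stratified.extend(random.sample(stratum, n_stratum))

print(len(simple_random), len(systematic), len(stratified))   # 500 500 500
```

Each draw above is random, so repeated runs give different samples; cluster sampling would follow the same pattern, except that whole clusters (for example, states) would be selected rather than individual elements.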

Uses of probability sampling


There are multiple uses of probability sampling:

 Reduce Sample Bias: Using the probability sampling method, the bias in the sample derived
from a population is negligible to non-existent. The selection of the sample mainly depicts the
understanding and the inference of the researcher. Probability sampling leads to higher
quality data collection as the sample appropriately represents the population.
 Diverse Population: When the population is vast and diverse, it is essential to have adequate
representation so that the data is not skewed towards one demographic. For example, if Square
would like to understand the people that could use their point-of-sale devices, a survey conducted on a sample of people across the US from different industries and socio-economic backgrounds helps.
 Create an Accurate Sample: Probability sampling helps the researchers plan and create an
accurate sample. This helps to obtain well-defined data.
Types of non-probability sampling with examples
The non-probability method is a sampling method that involves a collection of feedback based
on a researcher or statistician’s sample selection capabilities and not on a fixed selection process.
In most situations, the output of a survey conducted with a non-probability sample leads to skewed results, which may not represent the desired target population. But there are situations, such as the preliminary stages of research or cost constraints in conducting research, where non-probability sampling will be much more useful than the other type.

Four types of non-probability sampling explain the purpose of this sampling method in a better
manner:

 Convenience sampling: This method is dependent on the ease of access to subjects, such as surveying customers at a mall or passers-by on a busy street. It is usually termed
as convenience sampling, because of the researcher’s ease of carrying it out and getting in
touch with the subjects. Researchers have nearly no authority to select the sample elements,
and it’s purely done based on proximity and not representativeness. This non-probability
sampling method is used when there are time and cost limitations in collecting feedback. In
situations where there are resource limitations such as the initial stages of research,
convenience sampling is used.
For example, startups and NGOs usually conduct convenience sampling at a mall to distribute
leaflets of upcoming events or promotion of a cause – they do that by standing at the mall
entrance and giving out pamphlets randomly.
 Judgmental or purposive sampling: Judgemental or purposive samples are formed by the
discretion of the researcher. Researchers purely consider the purpose of the study, along with
the understanding of the target audience. For instance, suppose researchers want to understand the thought process of people interested in studying for their master’s degree. The selection criterion will be: “Are you interested in doing your masters in …?” and those who respond with a “No” are excluded from the sample.
 Snowball sampling: Snowball sampling is a sampling method that researchers apply when the
subjects are difficult to trace. For example, it will be extremely challenging to survey
shelterless people or illegal immigrants. In such cases, using the snowball theory, researchers
can track a few categories to interview and derive results. Researchers also implement this
sampling method in situations where the topic is highly sensitive and not openly discussed—
for example, surveys to gather information about HIV Aids. Not many victims will readily
respond to the questions. Still, researchers can contact people they might know or volunteers
associated with the cause to get in touch with the victims and collect information.
 Quota sampling: In quota sampling, the selection of members happens based on a pre-set standard. In this case, as the sample is formed based on specific attributes, the created sample will have the same qualities found in the total population. It is a rapid method of collecting samples; the short sketch below illustrates the idea.
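
The following is a minimal Python sketch of the quota idea (the quotas, the respondent stream, and the field names are hypothetical, not taken from the text): respondents are accepted only while the quota for their group is still open.

```python
# Illustrative sketch of quota sampling with hypothetical quotas:
# accept respondents only until each pre-set quota is filled.
import random

quotas = {"female": 60, "male": 40}          # pre-set standard for the sample
accepted = {"female": [], "male": []}

def next_respondent():
    """Stand-in for whoever the field worker happens to meet next."""
    return {"id": random.randint(1, 10_000),
            "gender": random.choice(["female", "male"])}

while any(len(accepted[g]) < quotas[g] for g in quotas):
    person = next_respondent()
    g = person["gender"]
    if len(accepted[g]) < quotas[g]:         # quota still open for this group
        accepted[g].append(person)

print({g: len(v) for g, v in accepted.items()})   # {'female': 60, 'male': 40}
```

Note that, unlike stratified random sampling, who fills each quota is left to the interviewer’s discretion, which is why quota sampling remains a non-probability method.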

Uses of non-probability sampling


Non-probability sampling is used for the following:

 Create a hypothesis: Researchers use the non-probability sampling method to create an assumption when limited or no prior information is available. This method helps with the immediate return of data and builds a base for further research.
 Exploratory research: Researchers use this sampling technique widely when conducting
qualitative research, pilot studies, or exploratory research.
 Budget and time constraints: The non-probability method is used when there are budget and time constraints and some preliminary data must be collected. Since the survey design is not rigid, it is easier to pick respondents at random and have them take the survey or questionnaire.

What Is a Sampling Error?


A sampling error is a statistical error that occurs when an analyst does not select a sample that
represents the entire population of data. As a result, the results found in the sample do not
represent the results that would be obtained from the entire population.

Sampling is an analysis performed by selecting a number of observations from a larger population. The method of selection can produce both sampling errors and non-sampling errors.

For example, suppose a company XYZ wants to gauge demand for a video streaming service. A population specification error would occur if XYZ does not understand the specific types of consumers who should be included in the sample. If XYZ creates a population of people between the ages of 15 and 25 years old, many of those consumers do not make the purchasing decision about a video streaming service because they may not work full-time. On the other hand, if XYZ puts together a sample of working adults who make purchase decisions, the consumers in this group may not watch 10 hours of video programming each week.

Categories of Sampling Errors

 Population Specification Error – Happens when the analysts do not understand who to survey. For example, for a survey of breakfast cereals, the population can be the mother, children, or the entire family.
 Selection Error – Occurs when the respondents’ survey participation is self-selected,
implying only those who are interested respond. Selection errors can be reduced by
encouraging participation.
 Sample Frame Error – Occurs when a sample is selected from the
wrong population data.
 Non-Response Error – Occurs when a useful response is not obtained from the surveys.
It may happen due to the inability to contact potential respondents or their refusal to
respond.

The Criteria of a good sample are:

 An ideal sample must be representative of the population, corresponding to its properties. It should not lack any characteristic of the population.
 It must be unbiased and must be obtained by a probability process or random method.
 It must make the research work more feasible and has the practicability for the research
situation.
 It must yield an accurate result and does not involve errors. The probability of error can
be estimated.
 Sample must be adequate to ensure reliability. A sample having 10% of the whole
population is generally adequate.
 The sample must be comprehensive. It is a quality of sample which is controlled by the
specific purpose of the investigation.
 Sample units must be chosen systematically and objectively.

DETERMINING SAMPLE SIZE

What are the terms used around the sample size?

Before we jump into sample size determination, let’s take a look at the terms you should know:

1. Population size: Population size is how many people fit your demographic. For example, you
want to get information on doctors residing in North America. Your population size is the total
number of doctors in North America. Don’t worry! Your population size doesn’t always have
to be that big. Smaller population sizes can still give you accurate results as long as you know
who you’re trying to represent.
2. Confidence level: The confidence level tells you how sure you can be that your data reflect the population. It is expressed as a percentage and is tied to the confidence interval. For example, a 90% confidence level means that if you repeated the survey many times, the interval you compute would contain the true population value about 90% of the time.
3. The margin of error (confidence interval): When it comes to surveys, there’s no way to be 100% accurate. The margin of error (confidence interval) tells you how far from the population mean you are willing to allow your data to fall. It describes how close you can reasonably expect a survey result to fall relative to the real population value. A margin of error calculator can help with this.
4. Standard deviation: Standard deviation is a measure of the dispersion of a data set from its mean. It measures the absolute variability of a distribution: the higher the dispersion or variability, the greater the standard deviation. For example, once you have sent out your survey, how much variance do you expect in the responses? That variation in responses is the standard deviation. A short sketch after this list shows how these terms combine into a commonly used sample size formula.
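
The following is a minimal Python sketch of one widely used formula (Cochran’s formula with a finite population correction). The z-scores and the example numbers are illustrative assumptions, not values from the text:

```python
# Illustrative sketch: required sample size from confidence level, margin of
# error, and expected proportion, with an optional finite population correction.
import math

Z_SCORES = {90: 1.645, 95: 1.96, 99: 2.576}   # standard normal critical values

def sample_size(confidence_level, margin_of_error, p=0.5, population_size=None):
    """confidence_level: 90, 95 or 99 (per cent); margin_of_error: e.g. 0.05;
    p: expected proportion (0.5 is most conservative);
    population_size: if given, apply the finite population correction."""
    z = Z_SCORES[confidence_level]
    n0 = (z ** 2) * p * (1 - p) / margin_of_error ** 2     # Cochran's formula
    if population_size is not None:
        n0 = n0 / (1 + (n0 - 1) / population_size)         # finite population correction
    return math.ceil(n0)

print(sample_size(95, 0.05))                         # about 385 for a very large population
print(sample_size(95, 0.05, population_size=1000))   # about 278 for, say, 1,000 employees
```

For instance, for the earlier Company X example with 1,000 employees, a 95% confidence level and a 5% margin of error would call for roughly 278 respondents under these assumptions.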

Measurement: Measurement is the process of observing and recording the observations that are collected as part of research. The recording of the observations may be in terms of numbers or other symbols assigned to characteristics of objects according to certain prescribed rules. The respondents’ characteristics may be feelings, attitudes, opinions, etc. For example, you may assign ‘1’ for Male and ‘2’ for Female respondents. In response to a question on whether he/she is using the ATM provided by a particular bank branch, the respondent may say ‘yes’ or ‘no’. You may wish to assign the number ‘1’ for the response yes and ‘2’ for the response no. We assign numbers to these characteristics for two reasons. First, the numbers facilitate further statistical analysis of the data obtained.

Second, numbers facilitate the communication of measurement rules and results.

The most important aspect of measurement is the specification of rules for assigning numbers to characteristics. The rules for assigning numbers should be standardised and applied uniformly, and they must not change over time or across objects; a brief sketch below illustrates such a coding rule.
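
As an illustration, here is a minimal Python sketch of such a uniform coding rule using the assignments mentioned above (the response records themselves are made up for the example):

```python
# Illustrative sketch: applying the same prescribed coding rule to every respondent.
GENDER_CODES = {"Male": 1, "Female": 2}     # 1 = Male, 2 = Female, as in the text
ATM_USE_CODES = {"yes": 1, "no": 2}         # 1 = yes, 2 = no

raw_responses = [                           # hypothetical raw answers
    {"gender": "Female", "uses_atm": "yes"},
    {"gender": "Male",   "uses_atm": "no"},
]

coded = [
    {"gender": GENDER_CODES[r["gender"]], "uses_atm": ATM_USE_CODES[r["uses_atm"]]}
    for r in raw_responses
]
print(coded)   # [{'gender': 2, 'uses_atm': 1}, {'gender': 1, 'uses_atm': 2}]
```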

Scaling: Scaling is the assignment of objects to numbers or semantics according to a rule. In


scaling, the objects are text statements, usually statements of attitude, opinion, or feeling. For
example, consider a scale locating customers of a bank according to the characteristic
“agreement to the satisfactory quality of service provided by the branch”. Each customer
interviewed may respond with a semantic like ‘strongly agree’, or ‘somewhat agree’, or
‘somewhat disagree’, or ‘strongly disagree’. We may even assign each of the responses a
number.

For example, we may assign strongly agree as ‘1’, somewhat agree as ‘2’, somewhat disagree as ‘3’, and strongly disagree as ‘4’. Therefore, each of the respondents may be assigned 1, 2, 3 or 4. There are four levels of measurement scales or methods of assigning numbers: (a) Nominal scale, (b) Ordinal scale, (c) Interval scale, and (d) Ratio scale.

a) Nominal Scale is the crudest among all measurement scales but it is also the simplest scale.
In this scale the different scores on a measurement simply indicate different categories. The
nominal scale does not express any values or relationships between variables. For example,
labelling men as ‘1’ and women as ‘2’ which is the most common way of labelling gender for
data recording purposes, does not mean women are ‘twice something or other’ than men. Nor does it suggest that men are somehow ‘better’ than women. Another example of a nominal scale is to classify the respondents’ income into three groups: the highest income as group 1, the middle income as group 2, and the low income as group 3. The nominal scale is often referred to as a categorical scale. The assigned numbers have no arithmetic properties and act only as labels. The only statistical operation that can be performed on nominal scales is a frequency count; we cannot determine an average except the mode.

In designing and developing a questionnaire, it is important that the response categories must
include all possible responses. In order to have an exhaustive number of responses, you might
have to include a category such as ‘others’, ‘uncertain’, ‘don’t know’, or ‘can’t remember’ so
that the respondents will not distort their information by forcing their responses in one of the
categories provided. Also, you should be careful and be sure that the categories provided are
mutually exclusive so that they do not overlap or get duplicated in any way.

b) Ordinal Scale involves the ranking of items along the continuum of the characteristic being
scaled. In this scale, the items are classified according to whether they have more or less of a
characteristic. For example, you may wish to ask the TV viewers to rank the TV channels
according to their preference and the responses may look like this as given below:

TV Channel       Viewer's preference

Doordarshan-1    1
Star Plus        2
NDTV News        3
Aaj Tak TV       4

The main characteristic of the ordinal scale is that the categories have a logical or ordered
relationship. This type of scale permits the measurement of degrees of difference, (that is, ‘more’
or ‘less’) but not the specific amount of differences (that is, how much ‘more’ or ‘less’). This
scale is very common in marketing, satisfaction and attitudinal research.

Another example is that a fast food home delivery shop may wish to ask its customers:

How would you rate the service of our staff?


(1) Excellent   (2) Very Good   (3) Good   (4) Poor   (5) Worst

Suppose respondent X gave the response ‘Excellent’ and respondent Y gave the response ‘Good’. We may say that respondent X rated the service higher than respondent Y did, but we don’t know how much better, and we can’t even say that both respondents have the same understanding of what constitutes ‘good service’.

In marketing research, ordinal scales are used to measure relative attitudes, opinions, and
preferences. Here we rank the attitudes, opinions and preferences from best to worst or from
worst to best. However, the amount of difference between the ranks cannot be found out. Using
ordinal scale data, we can perform statistical analysis like Median and Mode, but not the Mean.

c) Interval Scale is a scale in which the numbers are used to rank attributes such that
numerically equal distances on the scale represent equal distance in the characteristic being
measured. An interval scale contains all the information of an ordinal scale, but it also allows one to compare the difference/distance between attributes. For example, the difference between ‘1’ and ‘2’ is equal to the difference between ‘3’ and ‘4’, and the difference between ‘2’ and ‘4’ is twice the difference between ‘1’ and ‘2’. However, in an interval scale the zero point is arbitrary and is not a true zero. This, of course, has implications for the type of data manipulation and analysis we can carry out on data collected in this form. It is possible to add or subtract a constant to all of the scale values without affecting the form of the scale, but one cannot multiply or divide the values. Measuring temperature is an example of an interval scale. We cannot say 40°C is twice as hot as 20°C. The reason for this is that 0°C does not mean that there is no temperature; it is a relative point on the Centigrade scale. Due to the lack of an absolute zero point, the interval scale does not allow the conclusion that 40°C is twice as hot as 20°C.

Interval scales may be in either numeric or semantic formats. The following are two more examples of interval scales, one in a numeric format and the other in a semantic format.

Ratio Scale is the highest level of measurement scales. This has the properties of an interval
scale together with a fixed (absolute) zero point. The absolute zero point allows us to construct
a meaningful ratio. Examples of ratio scales include weights, lengths and times. In the marketing
research, most counts are ratio scales. For example, the number of customers of a bank’s ATM
in the last three months is a ratio scale, because you can compare it with the count for the previous three months. Ratio scales permit the researcher to compare both differences in scores and the relative
magnitude of scores. For example, the difference between 10 and 15 minutes is the same as the
difference between 25 and 30 minutes and 30 minutes is twice as long as 15 minutes. Most
financial research that deals with rupee values utilizes ratio scales. However, for most
behavioural research, interval scales are typically the highest form of measurement. Most
statistical data analysis procedures do not distinguish between the interval and ratio properties of
the measurement scales and it is sufficient to say that all the statistical operations that can be
performed on interval scale can also be performed on ratio scales.

Now you must be wondering why you should know the level of measurement. Knowing the level
of measurement helps you to decide on how to interpret the data. For example, when you know
that a measure is nominal then you know that the numerical values are just short codes for longer
textual names. Also, knowing the level of measurement helps you to decide what statistical
analysis is appropriate for the values that were assigned. For example, if you know that a measure is nominal, then you would not compute the mean of the data values or perform a t-test on the data (the t-test will be discussed in Unit 16 of the course). It is important to recognise that there is a
hierarchy implied in the levels of measurement. At lower levels of measurement, assumptions
tend to be less restrictive and data analyses tend to be less sensitive. At each level up the
hierarchy, the current level includes all the qualities of the one below it and adds something new.
In general, it is desirable to have a higher level of measurement (that is, interval or ratio) rather
than a lower one (that is, nominal or ordinal).
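
To tie the four levels together, here is a minimal Python sketch (all response values are made up for illustration) of which summary statistic is meaningful at each level of measurement:

```python
# Illustrative sketch: the level of measurement limits which statistics make sense.
from statistics import mode, median, mean

gender      = [1, 2, 2, 1, 2]            # nominal: 1 = male, 2 = female
service     = [1, 3, 2, 2, 4, 1]         # ordinal: 1 = Excellent ... 5 = Worst
temperature = [20.0, 25.5, 40.0, 18.2]   # interval: degrees Celsius, arbitrary zero
wait_time   = [5, 10, 15, 30]            # ratio: minutes, true zero

print(mode(gender))        # nominal  -> only frequency counts / the mode are valid
print(median(service))     # ordinal  -> median and mode, but not the mean
print(mean(temperature))   # interval -> means and differences are meaningful
print(mean(wait_time))     # ratio    -> ratios too: 30 minutes is twice 15 minutes
```
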
Comparative Scales

For comparing two or more variables, a comparative scale is used by the respondents. Following
are the different types of comparative scaling techniques:

Paired Comparison

A paired comparison presents two variables, from which the respondent needs to select one. This technique is mainly used at the time of product testing, to provide consumers with a comparative analysis of the two major products in the market.

To compare more than two objects, say P, Q and R, one can first compare P with Q and then compare the superior one (i.e., the one with the higher percentage) with R.

For example, A market survey was conducted to find out consumer’s preference for the network
service provider brands, A and B. The outcome of the survey was as follows:
Brand ‘A’ = 57%
Brand ‘B’ = 43%
Thus, it is visible that the consumers prefer brand ‘A’, over brand ‘B’.
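
A minimal Python sketch of how such a paired comparison is tallied (the seven respondent choices are hypothetical, but chosen so the shares round to the 57%/43% split above):

```python
# Illustrative sketch: tallying a paired comparison between two brands.
choices = ["A", "B", "A", "A", "B", "A", "B"]   # one entry per respondent's pick

total = len(choices)
share = {brand: round(100 * choices.count(brand) / total) for brand in ("A", "B")}
print(share)   # {'A': 57, 'B': 43} -> brand 'A' is preferred
```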

Rank Order

In rank order scaling the respondent needs to rank or arrange the given objects according to his
or her preference.

For example, A soap manufacturing company conducted a rank order scaling to find out the
orderly preference of the consumers. It asked the respondents to rank the following brands in the
sequence of their choice:

SOAP BRANDS RANK

Brand V 4

Brand X 2

Brand Y 1

Brand Z 3

The above scaling shows that soap ‘Y’ is the most preferred brand, followed by soap ‘X’, then
soap ‘Z’ and the least preferred one is the soap ‘V’.

Constant Sum

It is a scaling technique in which a constant sum of units such as dollars, points, chits, or chips is allocated by the respondents across the features, attributes, or importance of a particular product or service.
For example, The respondents belonging to 3 different segments were asked to allocate 50 points
to the following attributes of a cosmetic product ‘P’:

ATTRIBUTES       SEGMENT 1   SEGMENT 2   SEGMENT 3

Finish           11          8           9
Skin Friendly    11          12          12
Fragrance        7           11          8
Packaging        9           8           10
Price            12          11          11

From the above constant sum scaling analysis, we can see that:

 Segment 1 considers the competitive price of product ‘P’ as the major factor.
 Segments 2 and 3, however, prefer the product because it is skin-friendly (see the short sketch below).
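
The following is a minimal Python sketch that checks the allocations in the table above sum to the constant (50 points) and picks out the highest-weighted attribute per segment; the figures are copied from the table, everything else is illustrative:

```python
# Illustrative sketch: analysing constant-sum allocations per segment.
allocations = {
    "Segment 1": {"Finish": 11, "Skin Friendly": 11, "Fragrance": 7,  "Packaging": 9,  "Price": 12},
    "Segment 2": {"Finish": 8,  "Skin Friendly": 12, "Fragrance": 11, "Packaging": 8,  "Price": 11},
    "Segment 3": {"Finish": 9,  "Skin Friendly": 12, "Fragrance": 8,  "Packaging": 10, "Price": 11},
}

for segment, points in allocations.items():
    assert sum(points.values()) == 50                # each segment allocated exactly 50 points
    top_attribute = max(points, key=points.get)      # highest-weighted attribute
    print(segment, "->", top_attribute, points[top_attribute])
# Segment 1 -> Price 12; Segment 2 -> Skin Friendly 12; Segment 3 -> Skin Friendly 12
```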

Q-Sort Scaling

Q-sort scaling is a technique used for sorting the most appropriate objects out of a large number of given variables. It emphasizes ranking the given objects in descending order to form similar piles based on specific attributes.

It is suitable where the number of objects is not less than 60 and not more than 140; a range of 60 to 90 objects is considered most appropriate.

For example, The marketing manager of a garment manufacturing company sorts the most
efficient marketing executives based on their past performance, sales revenue generation,
dedication and growth.
The Q-sort scaling was performed on 60 executives, and the marketing head creates three piles
based on their efficiency as follows:

In the above diagram, the initials of the employees are used to denote their names.

Non-Comparative Scales

A non-comparative scale is used to analyse the performance of an individual product or object on different parameters. Following are some of its most common types:

Continuous Rating Scales

It is a graphical rating scale where the respondents are free to place the object at a position of
their choice. It is done by selecting and marking a point along the vertical or horizontal line
which ranges between two extreme criteria.

For example, A mattress manufacturing company used a continuous rating scale to find out the
level of customer satisfaction for its new comfy bedding. The response can be taken in the
following different ways (stated as versions here):
The above diagram shows a non-comparative analysis of one particular product, i.e. the comfy bedding, making it clear that the customers are quite satisfied with the product and its features.

Itemized Rating Scale

The itemized scale is another essential technique under the non-comparative scales. It requires the respondents to choose a particular category from among the various given categories. Each category is briefly defined by the researchers to facilitate such a selection.

The three most commonly used itemized rating scales are as follows:

 Likert Scale: In the Likert scale, the researcher provides some statements and asks the respondents to mark their level of agreement or disagreement with these statements by selecting one of the five given alternatives.
For example, a shoe manufacturing company adopted the Likert scale technique for its new sports shoe range named Z sports shoes. The purpose is to know the agreement or disagreement of the respondents.
For this, the researcher asked the respondents to circle a number representing the most
suitable answer according to them, in the following representation:

 1 – Strongly Disagree
 2 – Disagree
 3 – Neither Agree Nor Disagree
 4 – Agree
 5 – Strongly Agree

STATEMENT                                      STRONGLY    DISAGREE   NEITHER AGREE   AGREE   STRONGLY
                                               DISAGREE               NOR DISAGREE            AGREE

Z sports shoes are very light weight               1           2            3            4        5
Z sports shoes are extremely comfortable           1           2            3            4        5
Z sports shoes look too trendy                     1           2            3            4        5
I will definitely recommend Z sports shoes
to friends, family and colleagues                  1           2            3            4        5

The above illustration will help the company understand what the customers think about its products, and whether there is any need for improvement; a short sketch below shows one way to summarise such responses.
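
Because Likert items are ordinal, the median and the response distribution are the safest summaries (reporting a mean is a common but debated shortcut). The following is a minimal Python sketch with hypothetical responses to one of the statements above:

```python
# Illustrative sketch: summarising Likert-scale responses for one statement.
from statistics import median
from collections import Counter

# 1 = Strongly Disagree ... 5 = Strongly Agree (hypothetical responses)
responses = {"Z sports shoes are very light weight": [4, 5, 3, 4, 2, 5, 4]}

for statement, scores in responses.items():
    print(statement)
    print("  median:", median(scores))                           # 4 -> 'Agree'
    print("  distribution:", dict(sorted(Counter(scores).items())))
```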

 Semantic Differential Scale: A bipolar seven-point non-comparative rating scale on which the respondent can mark any of the seven points for each given attribute of the object, as per personal choice, thus depicting the respondent’s attitude or perception towards the object.

For example, a well-known watch brand carried out semantic differential scaling to understand the customers’ attitude towards its product. The pictorial representation of this technique is as follows:

From the above diagram, we can analyze that the customer finds the product of superior quality;
however, the brand needs to focus more on the styling of its watches.

 Stapel Scale: A Stapel scale is an itemized rating scale that measures the response, perception or attitude of the respondents towards a particular object through a unipolar rating. The range of a Stapel scale is from -5 to +5, excluding 0, thus confining it to 10 units.

For example, A tours and travel company asked the respondent to rank their holiday
package in terms of value for money and user-friendly interface as follows:
With the help of the above scale, we can say that the company needs to improve its package in
terms of value for money. However, the decisive point is that the interface is quite user-friendly
for the customers.

Comparison Chart

BASIS FOR COMPARISON    PRIMARY DATA                                  SECONDARY DATA

Meaning                 First-hand data gathered by the               Data collected by someone
                        researcher himself                            else earlier
Data                    Real-time data                                Past data
Process                 Very involved                                 Quick and easy
Source                  Surveys, observations, experiments,           Government publications, websites,
                        questionnaires, personal interviews, etc.     books, journal articles, internal records, etc.
Cost effectiveness      Expensive                                     Economical
Collection time         Long                                          Short
Specific                Always specific to the researcher's needs     May or may not be specific to the researcher's need
Available in            Crude form                                    Refined form
Accuracy and
reliability             More                                          Relatively less

Definition of Primary Data

Primary data is data originated for the first time by the researcher through direct efforts and experience, specifically for the purpose of addressing his research problem. It is also known as first-hand or raw data. Primary data collection is quite expensive, as the research is conducted by the organisation or agency itself, which requires resources like investment and manpower. The data collection is under the direct control and supervision of the investigator.
The data can be collected through various methods like surveys, observations, physical testing,
mailed questionnaires, questionnaire filled and sent by enumerators, personal interviews,
telephonic interviews, focus groups, case studies, etc.

Definition of Secondary Data

Secondary data implies second-hand information which is already collected and recorded by any
person other than the user for a purpose, not relating to the current research problem. It is the
readily available form of data collected from various sources like censuses, government
publications, internal records of the organisation, reports, books, journal articles, websites and so
on.

Secondary data offers several advantages: it is easily available and saves the researcher’s time and cost. But there are some disadvantages associated with it; as the data is gathered for purposes other than the problem in mind, its usefulness may be limited in a number of ways, such as relevance and accuracy.

Moreover, the objective and the method adopted for acquiring data may not be suitable to the
current situation. Therefore, before using secondary data, these factors should be kept in mind.

Key Differences Between Primary and Secondary Data

The fundamental differences between primary and secondary data are discussed in the following
points:

1. The term primary data refers to the data originated by the researcher for the first time. Secondary data is already existing data, collected earlier by investigator agencies and organisations.
2. Primary data is real-time data, whereas secondary data relates to the past.
3. Primary data is collected for addressing the problem at hand while secondary data is
collected for purposes other than the problem at hand.
4. Primary data collection is a very involved process. On the other hand, secondary data
collection process is rapid and easy.
5. Primary data collection sources include surveys, observations, experiments,
questionnaire, personal interview, etc. On the contrary, secondary data collection sources
are government publications, websites, books, journal articles, internal records etc.
6. Primary data collection requires a large amount of resources like time, cost and
manpower. Conversely, secondary data is relatively inexpensive and quickly available.
7. Primary data is always specific to the researcher’s needs, and he controls the quality of research. In contrast, secondary data is neither specific to the researcher’s need, nor does he have control over the data quality.
8. Primary data is available in the raw form whereas secondary data is the refined form of
primary data. It can also be said that secondary data is obtained when statistical methods
are applied to the primary data.
9. Data collected through primary sources are more reliable and accurate as compared to the
secondary sources.

Primary Data Collection Methods

Primary data collection methods are different ways in which primary data can be collected. It
explains the tools used in collecting primary data, some of which are highlighted below:

1. Interviews

An interview is a method of data collection that involves two parties: the interviewer (the researcher(s) asking questions and collecting data) and the interviewee (the subject or respondent being asked questions). The questions and responses during an interview may be oral or written, as the case may be.

Interviews can be carried out in 2 ways, namely; in-person interviews and telephonic interviews.
An in-person interview requires an interviewer or a group of interviewers to ask questions from
the interviewee in a face-to-face fashion. 
It can be direct or indirect, structured or unstructured, focused or unfocused, etc. Some of the tools used in carrying out in-person interviews include a notepad or a recording device to take note of the conversation—very important given how forgetful humans can be.

On the other hand, telephonic interviews are carried out over the phone through ordinary voice
calls or video calls. The 2 parties involved may decide to use video calls like Skype to carry out
interviews.

A mobile phone, Laptop, Tablet, or desktop computer with an internet connection is required for
this.

Pros

 In-depth information can be collected.


 Non-response and response bias can be detected.
 The samples can be controlled.

Cons

 It is more time-consuming.
 It is expensive.
 The interviewer may be biased.

2. Surveys & Questionnaires

Surveys and questionnaires are 2 similar tools used in collecting primary data. They are a group
of questions typed or written down and sent to the sample of study to give responses.

After the required responses have been given, the survey is returned to the researcher for recording. It is advisable to conduct a pilot study in which the questionnaire is filled in by experts, with the aim of assessing the weaknesses of the questions or the techniques used.
There are 2 main types of surveys used for data collection, namely; online and offline
surveys. Online surveys are carried out using internet-enabled devices like mobile phones, PCs,
Tablets, etc.

They can be shared with respondents through email, websites, or social media. Offline surveys,
on the other hand, do not require an internet connection for them to be carried out.

The most common type of offline survey is a paper-based survey. However, there are also offline survey tools, such as Formplus, that can be filled in on a mobile device without access to an internet connection.

These are called online-offline surveys because they can be filled in offline but require an internet connection to be submitted.

Pros

 Respondents have adequate time to give responses.


 It is free from the bias of the interviewer.
 They are cheaper compared to interviews.

Cons

 A high rate of non-response bias.


 It is inflexible and can’t be changed once sent.
 It is a slow process.

3. Observation

The observation method is mostly used in studies related to behavioral science. The researcher
uses observation as a scientific tool and method of data collection. Observation as a data
collection tool is usually systematically planned and subjected to checks and controls.

There are different approaches to the observation method—structured or unstructured, controlled or uncontrolled, and participant, non-participant, or disguised.
The structured approach is characterized by a careful definition of the subjects of observation, the style of the observer, the conditions, and the selection of data. An observation process that satisfies these is said to be structured, and vice versa.

A controlled and uncontrolled approach signifies whether the research took place in a natural
setting or according to some pre-arranged plans. If an observation is done in a natural setting, it
is uncontrolled but becomes controlled if done in a laboratory.

Before employing a new teacher, academic institutions sometimes ask for a sample teaching
class to test the teacher’s ability. The evaluator joins the class and observes the teaching, making
him or her a participant.

The evaluator may also decide to observe from outside the class, becoming a non-participant. An evaluator may also be asked to stay in class disguised as a student, to carry out a disguised observation.

Pros

 The data is usually objective.


 Data is not affected by past or future events.

Cons

 The information is limited.


 It is expensive 

4. Focus Groups

Focus groups are gatherings of 2 or more people with similar characteristics or who possess common traits. They seek open-ended thoughts and contributions from participants.

A focus group is a primary source of data collection because the data is collected directly from
the participant. It is commonly used for market research, where a group of market consumers
engages in a discussion with a research moderator.
It is slightly similar to interviews, but this involves discussions and interactions rather than
questions and answers. Focus groups are less formal and the participants are the ones who do
most of the talking, with moderators there to oversee the process.

Pros

 It incurs a low cost compared to interviews. This is because the interviewer does not have
to discuss with each participant individually.
 It also takes less time.

Cons

 Response bias is a problem in this case because a participant may hold back a sincere opinion out of concern for what other people will think.
 Group thinking does not clearly mirror individual opinions.

5. Experiments

An experiment is a structured study where the researchers attempt to understand the causes,
effects, and processes involved in a particular process. This data collection method is usually
controlled by the researcher, who determines which subject is used, how they are grouped, and
the treatment they receive.

During the first stage of the experiment, the researcher selects the subjects to be considered. Then, some actions are carried out on these subjects, and the primary data, consisting of the actions and reactions, are recorded by the researcher.

The data are then analyzed, and a conclusion is drawn from the result of the analysis. Although experiments can be used to collect different types of primary data, the method is mostly used for data collection in the laboratory.

Pros

 It is usually objective since the data recorded are the results of a process.
 Non-response bias is eliminated.

Cons 

 Incorrect data may be recorded due to human error.


 It is expensive.

Qualitative data-collection methods

One-on-one interviews

Interviews are one of the most common qualitative data-collection methods, and they’re a
great approach when you need to gather highly personalized information. Informal,
conversational interviews are ideal for open-ended questions that allow you to gain rich,
detailed context.

Open-ended surveys and questionnaires 

Open-ended surveys and questionnaires allow participants to answer freely at length, rather
than choosing from a set number of responses. For example, you might ask an open-ended
question like “Why don’t you eat ABC brand pizza?” 

You would then provide space for people to answer narratively, rather than simply giving
them a specific selection of responses to choose from — like “I’m a vegan,” “It’s too
expensive,” or “I don’t like pizza.” 

Focus groups

Focus groups are similar to interviews, except that you conduct them in a group format. You
might use a focus group when one-on-one interviews are too difficult or time-consuming to
schedule.  

They’re also helpful when you need to gather data on a specific group of people. For example,
if you want to get feedback on a new marketing campaign from a number of demographically
similar people in your target market or allow people to share their views on a new product,
focus groups are a good way to go.

Observation

Observation is a method in which a data collector observes subjects in the course of their
regular routines, takes detailed field notes, and/or records subjects via video or audio.  

Case studies

In the case study method, you analyze a combination of multiple qualitative data sources to
draw inferences and come to conclusions. 

What is a Questionnaire?
A questionnaire is a research instrument that consists of a set of questions or other types of
prompts that aims to collect information from a respondent. A research questionnaire is typically
a mix of close-ended questions and open-ended questions.

Open-ended, long-form questions offer the respondent the ability to elaborate on their thoughts.
Research questionnaires were developed in 1838 by the Statistical Society of London.

The data collected from a data collection questionnaire can be both qualitative as well
as quantitative in nature. A questionnaire may or may not be delivered in the form of a survey,
but a survey always consists of a questionnaire.

Advantages of a good questionnaire design

 With a survey questionnaire, you can gather a lot of data in less time.
 There is less chance of any bias creeping in if you have a standard set of questions to be used for
your target audience. You can apply logic to questions based on the respondents’ answers, but
the questionnaire will remain standard for a group of respondents that fall in the same
segment.
 Surveying with online survey software is quick and cost-effective. It offers you a rich set of features to design, distribute, and analyze the response data.
 It can be customized to reflect your brand voice. Thus, it can be used to reinforce your brand
image.
 The responses can be compared with historical data to understand the shift in respondents’ choices and experiences.
 Respondents can answer the questionnaire without revealing their identity. Also, many survey software tools comply with major data security and privacy regulations.

Characteristics of a good questionnaire


Your survey design depends on the type of information you need to collect from respondents.
Qualitative questionnaires are used when there is a need to collect exploratory information to
help prove or disprove a hypothesis. Quantitative questionnaires are used to validate or test a
previously generated hypothesis. However, most questionnaires follow some essential
characteristics:

 Uniformity: Questionnaires are very useful to collect demographic information, personal opinions, facts, or attitudes from respondents. One of the most significant attributes of a
research form is uniform design and standardization. Every respondent sees the same
questions. This helps in data collection and statistical analysis of this data. For example,
the retail store evaluation questionnaire template contains questions for evaluating retail store
experiences. Questions relate to purchase value, range of options for product selections, and
quality of merchandise. These questions are uniform for all customers.
 Exploratory: It should be exploratory to collect qualitative data. There is no restriction on
questions that can be included in your questionnaire. For example, you may use a data collection questionnaire and send it to the woman of the household to understand her spending and saving habits relative to the household income. Open-ended questions give you more insight and
allow the respondents to explain their practices. A very structured question list could limit the
data collection.
 Question Sequence: It typically follows a structured flow of questions to increase the number
of responses. This sequence of questions is screening questions, warm-up questions, transition
questions, skip questions, challenging questions, and classification questions. For example,
our motivation and buying experience questionnaire template covers initial demographic
questions and then asks for time spent in sections of the store and the rationale behind
purchases.

Types & Definitions

As we explored before, questionnaires can be either structured or free-flowing. Let’s take a closer look at what that entails for your surveys.

 Structured Questionnaires: Structured questionnaires collect quantitative data. The questionnaire is planned and designed to gather precise information. It also initiates a formal
inquiry, supplements data, checks previously accumulated data, and helps validate any prior
hypothesis.
 Unstructured Questionnaires: Unstructured questionnaires collect qualitative data. They use
a basic structure and some branching questions but nothing that limits the responses of a
respondent. The questions are more open-ended to collect specific data from participants.

Types of questions in a questionnaire

You can use multiple question types in a questionnaire. Using various question types can help increase responses to your research questionnaire, as they tend to keep participants more engaged. Customer satisfaction survey templates, for example, are among the most commonly used because they yield useful insights for decision-making.

Some of the widely used types of questions are:

 Open-Ended Questions: Open-ended questions help collect qualitative data in a questionnaire where the respondent can answer in a free form with little to no restrictions.
 Dichotomous Questions: The dichotomous question is generally a “yes/no” close-ended question. It is usually used when a simple validation is needed and is the most basic form of question in a questionnaire.
 Multiple-Choice Questions: Multiple-choice questions are a close-ended question type in which a respondent has to select one (single-select) or many (multi-select) responses from a given list of options. A multiple-choice question consists of an incomplete stem (the question), the right answer or answers, incorrect answers, close alternatives, and distractors. Of course, not all multiple-choice questions use all of these answer types; for example, there are no right or wrong answers if you are asking for customer opinion.
 Scaling Questions: These questions are based on the principles of the four measurement scales – nominal, ordinal, interval, and ratio. Question types that draw on these scales’ fundamental properties include rank order questions, Likert scale questions, semantic differential scale questions, and Stapel scale questions. (A small scoring sketch follows this list.)
 Pictorial Questions: This question type is easy to use and encourages respondents to answer.
It works similarly to a multiple-choice question. Respondents are asked a question, and the
answer choices are images. This helps respondents choose an answer quickly without over-
thinking their answers, giving you more accurate data.
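As a rough illustration of how these question types translate into data, the Python sketch below represents a dichotomous item, a single-select multiple-choice item, and a Likert-scale item, and computes simple summaries. The field names, the 5-point agreement labels, and the example responses are assumptions made for demonstration, not taken from any particular survey platform.

```python
# Illustrative sketch only: shows how common question types might be
# represented and how responses can be summarized numerically.

from statistics import mean

LIKERT_5 = {          # ordinal agreement scale mapped to numeric codes
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly agree": 5,
}

questions = [
    {"id": "q1", "type": "dichotomous", "text": "Did you find what you were looking for?",
     "options": ["Yes", "No"]},
    {"id": "q2", "type": "multiple_choice", "text": "Which section did you visit first?",
     "options": ["Clothing", "Electronics", "Groceries", "Other"]},
    {"id": "q3", "type": "likert", "text": "The staff were helpful.",
     "options": list(LIKERT_5)},
]

responses = [                      # hypothetical answers from three respondents
    {"q1": "Yes", "q2": "Groceries", "q3": "Agree"},
    {"q1": "No",  "q2": "Clothing",  "q3": "Neutral"},
    {"q1": "Yes", "q2": "Other",     "q3": "Strongly agree"},
]

# Likert items are ordinal, so a mean score is only a rough summary.
likert_scores = [LIKERT_5[r["q3"]] for r in responses]
print(f"Mean agreement for q3: {mean(likert_scores):.2f}")    # 4.00

# Dichotomous and multiple-choice items are usually summarized as frequencies.
yes_share = sum(r["q1"] == "Yes" for r in responses) / len(responses)
print(f"Share answering 'Yes' to q1: {yes_share:.0%}")        # 67%
```

The design point is that closed-ended types (dichotomous, multiple-choice, scaling) map directly onto counts and scores, whereas open-ended answers would need qualitative coding before any such analysis.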
Types of Questionnaires

Questionnaires can be administered or distributed in the following forms:

 Online Questionnaire: In this type, respondents are sent the questionnaire via email or other
online mediums. This method is generally cost-effective and time-efficient. Respondents can
also answer at leisure. Without the pressure to respond immediately, responses may be more
accurate. The disadvantage, however, is that respondents can easily ignore these questionnaires.
 Telephone Questionnaire: A researcher makes a phone call to a respondent to collect
responses directly. Responses are quick once you have a respondent on the phone. However, a
lot of times, the respondents hesitate to give out much information over the phone. It is also an
expensive way of conducting research. You’re usually not able to collect as many responses as
other types of questionnaires, so your sample may not represent the broader population.
 In-House Questionnaire: This type is used by a researcher who visits the respondent’s home
or workplace. The advantage of this method is that the respondent is in a comfortable and
natural environment, and in-depth data can be collected. The disadvantage, though, is that it is
expensive and slow to conduct.
 Mail Questionnaire: These are becoming obsolete but are still used in some market research studies. This method involves a researcher sending a physical questionnaire to a respondent, who fills it in and mails it back. The advantage of this method is that respondents can complete it in their own time and answer truthfully and completely. The disadvantages are that it is expensive and time-consuming, and there is a high risk of not collecting enough responses to draw actionable insights from the data.

How to design a Questionnaire

Questionnaire design is a multistep process that requires attention to detail at every step.

Researchers always hope that the responses received for a survey questionnaire yield usable data. If the questionnaire is too complicated, there is a fair chance that the respondent might get confused and will drop out or answer inaccurately.
As a survey creator, you may want to pre-test the survey by administering it to a focus group during development. You can try out a few different questionnaire designs to determine which resonates best with your target audience. Pre-testing is good practice because it lets the survey creator identify early on whether any changes to the survey are required.

Steps Involved in Questionnaire Design

1. Identify the scope of your research:


Think about what your questionnaire is going to include before you start designing the look of it.
The clarity of the topic is of utmost importance as this is the primary step in creating the
questionnaire. Once you are clear on the purpose of the questionnaire, you can begin the design
process.

2. Keep it simple:
The words or phrases you use while writing the questionnaire must be easy to understand. If the
questions are unclear, the respondents may simply choose any answer and skew the data you
collect.

3. Ask only one question at a time:


At times, a researcher may be tempted to add two similar questions. This might seem like an
excellent way to consolidate answers to related issues, but it can confuse your respondents or
lead to inaccurate data. If any of your questions contain the word “and,” take another look. This
question likely has two parts, which can affect the quality of your data.
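The “and” rule of thumb above can even be checked mechanically before a survey goes out. The short Python sketch below is a simplistic, illustrative heuristic, not a substitute for human review; the draft questions in it are hypothetical.

```python
import re

def flag_double_barreled(question: str) -> bool:
    """Crude heuristic: flag a question that joins clauses with 'and'/'or'.

    A flagged question still needs a human to decide whether it really
    asks two things at once.
    """
    return re.search(r"\b(and|or)\b", question, flags=re.IGNORECASE) is not None

draft_questions = [  # hypothetical draft items
    "How satisfied are you with our pricing and delivery speed?",
    "How satisfied are you with our delivery speed?",
]

for q in draft_questions:
    status = "REVIEW" if flag_double_barreled(q) else "ok"
    print(f"[{status}] {q}")
```

In this example the first question would be flagged and could then be split into one question about pricing and one about delivery speed.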

4. Be flexible with your options:


While designing, the survey creator needs to be flexible in terms of “option choice” for the
respondents. Sometimes the respondents may not necessarily want to choose from the answer
options provided by the survey creator. An “other” option often helps keep respondents engaged
in the survey.
5. The open-ended or closed-ended question is a tough choice:
The survey creator may need to make a deliberate choice between open-ended and closed-ended questions. The question type should be chosen carefully, as it defines the tone and importance of asking the question in the first place.

If the questionnaire requires the respondents to elaborate on their thoughts, an open-ended question is the best choice. If the surveyor wants a specific response, then closed-ended questions should be their primary choice. The key to asking closed-ended questions is to generate data that is easy to analyze and in which trends are easy to spot.

6. It is essential to know your audience:


A researcher should know their target audience. For example, if the target audience speaks
mostly Spanish, sending the questionnaire in any other language would lower the response rate
and accuracy of data. Something that may seem clear to you may be confusing to your
respondents. Use simple language and terminology that your respondents will understand, and
avoid technical jargon and industry-specific language that might confuse your respondents.

For efficient market research, researchers need a representative sample collected using one of the many sampling techniques. It is imperative to plan and define these target respondents based on the demographics required.

7. Choosing the right tool is essential: 


QuestionPro is a simple yet advanced survey software platform that the surveyors can use to
create a questionnaire or choose from the already existing 300+ questionnaire templates.

Always save personal questions for last. Sensitive questions may cause respondents to drop off before completing the questionnaire. If these questions come at the end, the respondent has had time to become more comfortable with the interview and is more likely to answer personal or demographic questions.

Differences between a Questionnaire and a Survey


Meaning
Questionnaire: A research instrument that consists of a set of questions to collect information from a respondent.
Survey: A research method used for collecting data from a pre-defined group of respondents to gain information and insights on various topics of interest.

What is it?
Questionnaire: The instrument of data collection.
Survey: The process of collecting and analyzing that data.

Characteristic
Questionnaire: A subset of a survey.
Survey: Consists of questionnaire and survey design, logic, and data collection.

Time and cost
Questionnaire: Fast and cost-effective.
Survey: Much slower and more expensive.

Use
Questionnaire: Conducted on the target audience.
Survey: Distributed or conducted on respondents.

Questions
Questionnaire: Close-ended and very rarely open-ended.
Survey: Close-ended and open-ended.

Answers
Questionnaire: Objective.
Survey: Subjective or objective.

Sources of Secondary Data

Sources of secondary data include books, personal sources, journals, newspapers, websites, government records, etc. Secondary data are known to be readily available compared to primary data, and using these sources requires very little additional research or manpower.

With the advent of electronic media and the internet, secondary data sources have become more
easily accessible. Some of these sources are highlighted below.

 Books
Books are one of the most traditional ways of collecting data. Today, there are books available on virtually any topic you can think of. When carrying out research, all you have to do is look for a book on the topic being researched and then select from the available repository of books in that area. Books, when carefully chosen, are a reliable source of authentic data and can be useful in preparing a literature review.

 Published Sources

There are a variety of published sources available for different research topics. The authenticity of the data generated from these sources depends largely on the writer and the publishing company.

Published sources may be printed or electronic as the case may be. They may be paid or free
depending on the writer and publishing company’s decision.

 Unpublished Personal Sources

These may not be readily available or as easily accessible as published sources. They typically become accessible only when a researcher shares them with another researcher, who may not be allowed to pass them on to a third party.

For example, the product management team of an organization may need data on customer
feedback to assess what customers think about their product and improvement suggestions. They
will need to collect the data from the customer service department, which primarily collected the
data to improve customer service.

 Journal

Journals are gradually becoming more important than books these days where data collection is concerned. This is because journals are updated regularly with new publications, therefore giving up-to-date information.

Also, journals are usually more specific when it comes to research. For example, we can have a
journal on, “Secondary data collection for quantitative data” while a book will simply be titled,
“Secondary data collection”.
 Newspapers

In most cases, the information passed through a newspaper is very reliable, making it one of the most authentic sources for collecting secondary data.

The kind of data commonly shared in newspapers is usually more political, economic, and
educational than scientific. Therefore, newspapers may not be the best source for scientific data
collection.

 Websites

The information shared on websites is mostly not regulated and as such may not be trusted
compared to other sources. However, there are some regulated websites that only share authentic
data and can be trusted by researchers.

Most of these are government websites or the websites of private organizations that act as paid data collectors.

 Blogs

Blogs are one of the most common online sources for data and may even be less authentic than
websites. These days, practically everyone owns a blog, and a lot of people use these blogs to
drive traffic to their website or make money through paid ads.

Therefore, they cannot always be trusted. For example, a blogger may write good things about a
product because he or she was paid to do so by the manufacturer even though these things are not
true.

 Diaries

They are personal records and as such are rarely used for data collection by researchers. Diaries are usually private, although these days some people share public diaries documenting specific events in their lives.
A common example is Anne Frank’s diary, which contains a firsthand record of life in hiding during the Nazi occupation.

 Government Records

Government records are a very important and authentic source of secondary data. They contain
information useful in marketing, management, humanities, and social science research.

Some of these records include census data, health records, educational institution records, etc. They are usually collected to aid proper planning, allocation of funds, and prioritization of projects.

 Podcasts

Podcasts are gradually becoming very common these days, and a lot of people listen to them as an alternative to radio. They are more or less like online radio stations and are growing in popularity.

Information is usually shared during podcasts, and listeners can use it as a source of data
collection. 

Advantages of Secondary Data:


 Ease of access
Secondary data sources are very easy to access. The Internet has changed the way secondary research works; nowadays, a great deal of information is available within a few clicks.
 Low cost or free
The majority of secondary sources are free to use or available at very low cost. This saves not only your money but also your effort. Compared with primary research, where you have to design and conduct a whole study from the beginning, secondary research allows you to gather data without putting any money on the table.
 Time-saving 
As the above advantage suggests, you can perform secondary research in no time. Sometimes it
is a matter of a few Google searches to find a source of data.
 Allow you to generate new insights from previous analysis
Reanalyzing old data can bring unexpected new understandings and points of view or even new
relevant conclusions.
 Longitudinal analysis
Secondary data allows you to perform longitudinal analysis, meaning studies that span a long period of time. This can help you identify trends. In addition, you can find secondary data from many years back up to a couple of hours ago, which allows you to compare data over time (a small sketch of such a year-over-year comparison follows this list).
 Anyone can collect the data
Secondary data research can be performed by people who aren’t familiar with the different data collection methods. Practically anyone can collect it.
 A huge amount of secondary data with a wide variety of sources
It is the richest type of data available to you in a wide variety of sources and topics.
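As a rough illustration of the longitudinal-analysis advantage above, the Python sketch below compares a secondary-data series across years to reveal a trend. The column names and figures are made up for demonstration; in practice the numbers would come from a published report, government record, or similar secondary source.

```python
# Illustrative sketch: spotting a trend in a secondary-data series across
# years. The figures below are hypothetical, standing in for numbers that
# would normally be taken from a published statistical report.

import pandas as pd

data = pd.DataFrame({
    "year": [2019, 2020, 2021, 2022, 2023],
    "internet_users_millions": [410, 450, 500, 560, 620],
})

# Year-over-year change reveals the trend without any new data collection.
data["yoy_growth_pct"] = data["internet_users_millions"].pct_change() * 100
print(data.round(1))
```

Because the series already exists, the analysis costs nothing beyond the few lines needed to compute the comparison, which is exactly the appeal of longitudinal work on secondary data.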
Disadvantages:
 Might not be specific to your needs
Secondary data is not specific to the researcher’s needs because it was collected in the past for another purpose. That is why secondary data might be unreliable for your current needs. Secondary data sources can give you a huge amount of information, but quantity does not always mean appropriateness.
 You have no control over data quality
The secondary data might lack quality. The source of the information may be questionable, especially when you gather the data via the Internet. If you rely on secondary data for your data-driven decision-making, you must evaluate the reliability of the information by finding out how it was collected and analyzed.
 Bias
Because the secondary data was collected by someone other than you, it is typically biased in favor of the person who gathered it. It might not cover your requirements as a researcher or marketer.
 Not timely
Secondary data is collected in the past which means it might be out-of-date. This issue can be
crucial in many different situations.
 You are not the owner of the information
Generally, secondary data is not collected specifically for your company. Instead, it is available to many companies and people, either for free or for a small fee. So this is not exactly a “competitive advantage” for you: your current and potential competitors also have access to the data.
