21MST18 M2

Module - 02
Data collection is the process of gathering and measuring information on variables of
interest, in an established systematic fashion that enables one to answer stated research
questions, test hypotheses, and evaluate outcomes. The data collection component of research
is common to all fields of study including physical and social sciences, humanities, business,
etc. While methods vary by discipline, the emphasis on ensuring accurate and honest
collection remains the same.
The importance of ensuring accurate and appropriate data collection

Regardless of the field of study or preference for defining data (quantitative, qualitative),
accurate data collection is essential to maintaining the integrity of research. Both the selection
of appropriate data collection instruments (existing, modified, or newly developed) and
clearly delineated instructions for their correct use reduce the likelihood of errors occurring.
Consequences from improperly collected data include
1. inability to answer research questions accurately
2. inability to repeat and validate the study
3. distorted findings resulting in wasted resources
4. misleading other researchers to pursue fruitless avenues of investigation
5. compromising decisions for public policy
6. causing harm to human participants and animal subjects
While the degree of impact from faulty data collection may vary by discipline and the nature
of the investigation, there is the potential to cause disproportionate harm when these research
results are used to support public policy recommendations.
Data types
There are two main approaches for data collection about a problem, person or phenomenon
viz., Primary data and secondary data.
1. Primary Data –
In this type, the information must be collected by the researcher in person. Sources used
in this approach are called primary sources.
Examples of information collected through primary sources are:
• finding out the attitude of a community towards health services.
• estimating the health needs of a community.
• evaluating a social program.
• determining the job satisfaction of the employees of an organization.
• assessing the quality of services provided by a worker.
In summary, primary sources provide first-hand data, whereas secondary sources provide
second-hand data.
Primary sources of data collection are:
• Observations, which may be participant or non-participant.
• Interviews, which may be structured or unstructured.
• Questionnaires, which may be mailed or administered collectively.
Methods of primary data collection –
Primary research can be done through various methods, but this type of research is often
based on the principles of the scientific method. This means that in the process of doing
primary research, researchers develop research questions or hypotheses, collect and analyze
measurable, empirical data, and draw evidence-based conclusions.
a. Surveys – This is a data-collection approach where individuals are asked to provide
answers to particular questions, such as about their emotions, beliefs, attitudes, and
behaviour. This form of questioning tends to be less flexible than interviews due to
the fixed nature of the questions. However, surveys are useful for collecting
information from large groups of people.
b. Interviews – Interviews are a convenient way of collecting information from
individuals or small groups of people. Researchers can also use interviews to get
expert opinions on their fields of study.
c. Observation – This primary research method involves observing people, occurrences,
and other variables important to the research or study. Observation entails measuring
and recording quantitative or qualitative data. This research method is useful for
gaining knowledge without the biased viewpoint sometimes present in interviews.
d. Data analysis – Data analysis requires collecting data and organizing them according
to criteria developed by the researcher. This primary research method is useful for
discovering trends or patterns in data.
e. Focus groups – Researchers can also gather information through focus groups, which
typically comprise up to 12 people. Focus groups participate in a guided discussion of
the topic, usually facilitated by the researcher. This qualitative data-gathering method
is often used to gain a deeper appreciation of social problems.
2. Secondary Data - Sometimes information required is already available and needs only to
be extracted. Information gathered using this approach is said to be collected from
secondary sources.
Examples of secondary sources of data collection are:
• the use of census data to obtain information on the age-sex structure of a population.
• the use of hospital records to find out the mortality patterns of a particular population.
• the use of an organization's records to assess its activities.
• data collected from articles, journals, magazines and books.
• documents such as government publications, earlier research, census forms or
personal records.
Classification of secondary data
1. Internal Secondary data - Internal secondary sources include databases containing
reports from individuals or prior research. This is an often overlooked resource, as
many researchers seldom check the reports and publications produced by their peers. This
prior research is still considered secondary even if it was performed internally,
because it was conducted for a different purpose.
2. External Secondary data - A wide range of information can be obtained from
external secondary sources. Reliable sources include government
sources, academic peer-reviewed journals, published books and articles, and commercial
information sources. This data is generated by others but can be useful
when the research moves into a new area of study. It also means less work for a
not-for-profit organization, which does not have to create its own data and
can instead build on the data of others.
Sources of Secondary data
Researchers have plenty of options to explore when it comes to doing secondary research.
The following sources can assist researchers in doing secondary research:
• Academic peer-reviewed journals – These often include original research
undertaken by authors or researchers themselves.
• Published books and articles – Many books reference primary-source materials,
along with an analysis from the author.
• Government agencies – Many government agencies maintain archives or databases
of documents and reports, which contain data that can prove to be useful to
researchers.
• Educational institutions – Colleges and universities do a significant amount of
research and produce data that can be requested by researchers.
• Commercial information sources – Information sources such as newspapers,
magazines, and TV shows can also prove to be useful sources for secondary research.
These sources provide firsthand information and insights into, for instance, political
agendas, market research, and economic developments (Bhat, 2020).
The Internet makes secondary research significantly easier for researchers today. Many
government agencies and educational institutions, for instance, make their data available
online so researchers can easily download information for their use. There are even web
applications for creating word clouds to visualize the frequency of keywords for topics in
databases.
Limitations while using data from secondary sources –
When using data from secondary sources you need to be careful as there may be certain
problems with the availability, format and quality of data. The extent of these problems
varies from source to source. While using such data some issues you should keep in mind are:
1. Validity and reliability - The validity and reliability of information may vary from
source to source. For example, data collection through census may be more valid than
data collection through personal diaries.
2. Personal bias - The information obtained from personal diaries, magazines and
newspapers may be subject to personal bias, as these writers are likely to exhibit less
rigour and objectivity than one would expect in a research report.
3. Availability of data - It is common for beginning researchers to assume that the
required data will be available. But you cannot and should not make this assumption.
Therefore, it is important to make sure that the required data is available before you
proceed further with your study.
4. Format - Before deciding to use the data from secondary sources, it is equally
important to make sure that the data are available in the required format. For example,
you might need to analyse the age in the categories 23-33, 34-48 etc., but in your
sources age may be categorized differently, e.g., 21-24, 25-29 etc.
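The format check described in point 4 can be sketched in a few lines of Python. This is only an illustration: the function name find_target_band is hypothetical, and only the age bands actually named above are used (the bands hidden behind "etc." are not filled in). A source band is usable only if it fits entirely inside one of the bands required for the analysis.

```python
def find_target_band(source_band, target_bands):
    """Return the target band that fully contains source_band, or None."""
    low, high = source_band
    for t_low, t_high in target_bands:
        if t_low <= low and high <= t_high:
            return (t_low, t_high)
    return None

# Bands named in the text; the "etc." bands are deliberately left out.
target_bands = [(23, 33), (34, 48)]   # categories needed for the analysis
source_bands = [(21, 24), (25, 29)]   # categories used by the secondary source

for band in source_bands:
    match = find_target_band(band, target_bands)
    status = f"fits inside {match}" if match else "cannot be regrouped cleanly"
    print(band, status)
```

Here the 21-24 band straddles the lower boundary of 23-33, so records in it cannot be assigned to a required category without the original ages.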
Designing questionnaires and schedules
A questionnaire is a data collection technique that consists of a series of written
questions along with alternative answers.
A schedule is a formalized set of questions, statements and spaces for answers, provided to
the enumerators who ask questions to the respondents and note down the answers.
Difference between questionnaire and schedule
The important points of difference between the questionnaire and schedule are as under:
1. A questionnaire is a data collection technique that consists of a series of
written questions along with alternative answers. A schedule is a formalised set of
questions, statements, and spaces for answers, provided to the enumerators who ask
questions to the respondents and note down the answers.
2. Questionnaires are delivered to the informants by post or mail and answered as
specified in the cover letter. On the other hand, schedules are filled by the research
workers, who interpret the questions to the respondents if necessary.
3. The response rate is low in the case of questionnaires, as many people do not respond or
return them without answering all the questions. By contrast, the response rate for schedules
is high, as they are filled in by the enumerators, who can get answers to all the questions.
4. Questionnaires can be distributed to a large number of people at the same time, and
even respondents who are not easily approachable can be reached.
Conversely, in the schedule method the reach is relatively small, as the enumerators
cannot be sent to a large area.
5. Data collection by questionnaire method is comparatively cheaper and economical as
the money is invested only in the preparation and posting of the questionnaire. As
against this, a large amount is spent on the appointment and training of the
enumerators and also on the preparation of schedules.
6. In the questionnaire method, it is not known who answers the questions, whereas
in the case of a schedule, the respondent’s identity is known.
7. The success of the questionnaire lies in the quality of the questionnaire while the
honesty and competency of the enumerator determine the success of a schedule.
8. The questionnaire is usually employed only when the respondents are literate and
cooperative, whereas schedules can be used for data collection from all classes
of people.
Developing a questionnaire
The design of questionnaires and schedules is similar, and the flowchart below gives an
overall idea of how to formulate them.
The following steps are involved in the development of a questionnaire:
1. Deciding on the information required - It should be noted that one does not start by
writing questions. The first step is to decide 'what are the things one needs to know
from the respondent in order to meet the survey's objectives?' These, as has been
indicated in the opening chapter of this textbook, should appear in the research brief
and the research proposal. One may already have an idea about the kind of
information to be collected, but additional help can be obtained from secondary data,
previous rapid rural appraisals and exploratory research. In respect of secondary data,
the researcher should be aware of what work has been done on the same or similar
problems in the past, what factors have not yet been examined, and how the present
survey questionnaire can build on what has already been discovered. Further, a small
number of preliminary informal interviews with target respondents will give a
glimpse of reality that may help clarify ideas about what information is required.
2. Define the target respondents - At the outset, the researcher must define the
population about which he/she wishes to generalise from the sample data to be
collected. For example, in marketing research, researchers often have to decide
whether they should cover only existing users of the generic product type or whether
to also include non-users. Secondly, researchers have to draw up a sampling frame.
Thirdly, in designing the questionnaire we must take into account factors such as the
age, education, etc. of the target respondents.
3. Choose the method(s) of reaching target respondents - It may seem strange to be
suggesting that the method of reaching the intended respondents should constitute part
of the questionnaire design process. However, a moment's reflection is sufficient to
conclude that the method of contact will influence not only the questions the
researcher is able to ask but the phrasing of those questions. The main methods
available in survey research are:
1. personal interviews
2. group or focus interviews
3. mailed questionnaires
4. telephone interviews.
Within this region, the first two mentioned are used much more extensively than the
second pair. However, each has its advantages and disadvantages. A general rule is
that the more sensitive or personal the information, the more personal the form of data
collection should be.
4. Decide on question content - Researchers must always be prepared to ask, "Is this
question really needed?" The temptation to include questions without critically
evaluating their contribution to the achievement of the research objectives, as they are
specified in the research proposal, is surprisingly strong. No question should be included
unless the data it gives rise to is directly of use in testing one or more of the hypotheses
established during the research design.
There are only two occasions when seemingly "redundant" questions might be included:
• Opening questions that are easy to answer and which are not perceived as being
"threatening", and/or are perceived as being interesting, can greatly assist in
gaining the respondent's involvement in the survey and help to establish a rapport.
This approach, however, should not be overused. It is almost
always the case that questions which are of use in testing hypotheses can also serve
the same functions.
• "Dummy" questions can disguise the purpose of the survey and/or the sponsorship
of a study. For example, if a manufacturer wanted to find out whether its
distributors were giving the consumers or end-users of its products a reasonable
level of service, the researcher would want to disguise the fact that the distributors'
service level was being investigated. If he/she did not, then rumours would abound
that there was something wrong with the distributor.
5. Develop the question wording - Survey questions can be classified into three forms,
i.e. closed, open-ended and open response-option questions. Closed questions, the first of
these, offer the respondent a fixed set of answers to choose from.
This type of questioning has a number of important advantages:
o It provides the respondent with an easy method of indicating his answer - he does
not have to think about how to articulate his answer.
o It 'prompts' the respondent so that the respondent has to rely less on memory in
answering a question.
o Responses can be easily classified, making analysis very straightforward.
o It permits the respondent to specify the answer categories most suitable for their
purposes.
6. Putting questions into a meaningful order and format
• Opening questions: Opening questions should be easy to answer and not in any
way threatening to the respondents. The first question is crucial because it is the
respondent's first exposure to the interview and sets the tone for the nature of the
task to be performed. If they find the first question difficult to understand, or
beyond their knowledge and experience, or embarrassing in some way, they are
likely to break off immediately. If, on the other hand, they find the opening
question easy and pleasant to answer, they are encouraged to continue.
• Question flow: Questions should flow in some kind of psychological order, so that
one leads easily and naturally to the next. Questions on one subject, or one
particular aspect of a subject, should be grouped together. Respondents may feel it
disconcerting to keep shifting from one topic to another, or to be asked to return to
some subject they thought they gave their opinions about earlier.
• Question variety: Respondents quickly become bored and restless when asked
similar questions for half an hour or so. It usually improves response, therefore, to
vary the respondent's task from time to time. An open-ended question here and
there (even if it is not analysed) may provide much-needed relief from a long series
of questions in which respondents have been forced to limit their replies to pre-
coded categories. Questions involving showing cards/pictures to respondents can
help vary the pace and increase interest.
7. Piloting the questionnaires
Even after the researcher has proceeded along the lines suggested, the draft questionnaire
is a product evolved by one or two minds only. Until it has actually been used in
interviews and with respondents, it is impossible to say whether it is going to achieve the
desired results. For this reason, it is necessary to pre-test the questionnaire before it is
used in a full-scale survey, to identify any mistakes that need correcting.
The purpose of pretesting the questionnaire is to determine:
o whether the questions as they are worded will achieve the desired results
o whether the questions have been placed in the best order
o whether the questions are understood by all classes of respondent
o whether additional or specifying questions are needed or whether some
questions should be eliminated
o whether the instructions to interviewers are adequate.
Usually, a small number of respondents are selected for the pre-test. The respondents
selected for the pilot survey should be broadly representative of the type of respondents
to be interviewed in the main survey.
If the questionnaire has been subjected to a thorough pilot test, the questions and the
questionnaire will have evolved into their final form. All that remains to be
done is the mechanical process of laying out and setting up the questionnaire in its final
form. This will involve grouping and sequencing questions into an appropriate order,
numbering questions, and inserting interviewer instructions.
Sampling Methods
There are two main methods of sampling:
1. Probability sampling and
2. Non-probability sampling.
Probability Sampling - In probability sampling, respondents are randomly selected to take
part in a survey or other mode of research. For a sample to qualify as a probability sample,
each person in a population must have a known, non-zero chance of being selected for a study
(an equal chance, in the case of simple random sampling), and the
researcher must know the probability that an individual will be selected. Probability sampling
is the most common form of sampling for public opinion studies, election polling, and other
studies in which results will be applied to a wider population. This is the case whether
the wider population is very large, such as the population of an entire country, or small, such
as young females living in a specific town.
Types of Probability Sampling - There are several sampling methods that fall under
probability sampling. In each method, those who are within the sample frame have some
chance of being selected to participate in a study. Four of the common types of probability
sampling are:
a. Simple Random Sample: This is the most basic form of probability sampling. In a simple
random sample, each member of a population is assigned an identifier such as a number,
and those selected to be within the sample are picked at random, often using an
automated software program.
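As a minimal sketch of the procedure just described, the Python below numbers a population and lets the standard library pick the sample. The population size of 1,000, the sample size of 50, the seed, and the function name simple_random_sample are all assumptions made for illustration.

```python
import random

def simple_random_sample(population_ids, sample_size, seed=None):
    """Pick sample_size distinct identifiers, each equally likely to be chosen."""
    rng = random.Random(seed)  # a fixed seed only makes the draw repeatable
    return rng.sample(population_ids, sample_size)

# Hypothetical population: members numbered 1 to 1000.
population_ids = list(range(1, 1001))
sample = simple_random_sample(population_ids, 50, seed=42)
print(sorted(sample)[:10])
```

Sampling is done without replacement, so no member can appear in the sample twice.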
Advantages
• Lack of Bias - The use of simple random sampling removes all hints of bias, or at
least it should. Because individuals who make up the subset of the larger group are
chosen at random, each individual in the large population set has the same
probability of being selected. In most cases, this creates a balanced subset that carries
the greatest potential for representing the larger group as a whole.
• Simplicity - There are no special skills involved in using this method, which can
result in a fairly reliable outcome. This is in contrast to methods such as stratified random
sampling, which divides larger groups into smaller subgroups called strata based on
the attributes members share. In simple random sampling, individuals in the subset are
selected randomly and there are no additional steps.
• Less Knowledge Required - It requires little to no special knowledge. This means
that the individual conducting the research doesn't need to have any information or
knowledge about the larger population in order to effectively do their job.
Limitations
• Difficulty Accessing Lists of the Full Population - An accurate statistical measure of
a large population can only be obtained in simple random sampling when a full list of
the entire population to be studied is available. Think of a list of students at a
university or a group of employees at a specific company. The problem lies in the
accessibility of these lists. As such, getting access to the whole list can present
challenges. Some universities or colleges may not want to provide a complete list of
students or faculty for research. Similarly, specific companies may not be willing or
able to hand over information about employee groups due to privacy policies.
• Time Consuming - When a full list of a larger population is not available, individuals
attempting to conduct simple random sampling must gather information from other
sources. If publicly available, smaller subset lists can be used to recreate a full list of
a larger population, but this strategy takes time to complete. Organizations that keep
data on students, employees, and individual consumers often impose lengthy
retrieval processes that can stall a researcher's ability to obtain the most accurate
information on the entire population set.
• Costs - In addition to the time it takes to gather information from various sources, the
process may cost a company or individual a substantial amount of capital. Retrieving
a full list of a population or smaller subset lists from a third-party data provider may
require payment each time data is provided. If the sample is not large enough to
represent the views of the entire population during the first round of simple random
sampling, purchasing additional lists or databases to avoid a sampling error can be
prohibitive.
• Sample Selection Bias - Although simple random sampling is intended to be an
unbiased approach to surveying, sample selection bias can occur. When a sample set
of the larger population is not inclusive enough, representation of the full population
is skewed and requires additional sampling techniques.
b. Stratified Random Sample: A stratified random sample is a step up in complexity
from a simple random sample. In this method, the population is divided into sub-groups,
such as male and female, and within each sub-group a simple random sample is
drawn. This yields a random sample that is representative of a larger population
and its specific makeup, such as a country’s population.
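The idea can be sketched as follows, assuming proportional allocation (the same fraction is drawn from every sub-group), which is one common choice rather than the only one. The function name stratified_sample, the 600/400 split and the 10% fraction are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(members, stratum_of, fraction, seed=None):
    """Draw the same fraction from every stratum by simple random sampling."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in members:
        strata[stratum_of(member)].append(member)  # each member goes to one stratum
    sample = []
    for label in sorted(strata):
        group = strata[label]
        k = round(len(group) * fraction)           # proportional allocation
        sample.extend(rng.sample(group, k))
    return sample

# Hypothetical population: 600 records labelled "female" and 400 labelled "male".
members = [("female", i) for i in range(600)] + [("male", i) for i in range(400)]
sample = stratified_sample(members, lambda m: m[0], 0.1, seed=1)
print(len(sample))
```

Because each sub-group is sampled separately, the 60/40 split of the population is reproduced in the sample by construction, which a simple random sample does not guarantee.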
Advantages
Stratified random sampling accurately reflects the population being studied because
researchers are stratifying the entire population before applying random sampling
methods. In short, it ensures each subgroup within the population receives proper
representation within the sample. As a result, stratified random sampling provides better
coverage of the population since the researchers have control over the subgroups to
ensure all of them are represented in the sampling. With simple random sampling, there
isn't any guarantee that any particular subgroup or type of person is chosen. For
example, using simple random sampling to draw a sample of
100 from a population of university students might result in the selection of only 25 male
undergraduates, or only 25% of the sample. Also, 35 female graduate students might be selected
(35% of the sample). If these shares differ from those in the population, male undergraduates
are under-represented and female graduate students are over-represented. Any errors in the
representation of the population have the potential to diminish the accuracy of the study.
Limitations
Unfortunately, this method of research cannot be used in every study. The method's
disadvantage is that several conditions must be met for it to be used properly.
Researchers must identify every member of a population being studied and classify each
of them into one, and only one, subpopulation. As a result, stratified random sampling is
disadvantageous when researchers can't confidently classify every member of the
population into a subgroup. Also, finding an exhaustive and definitive list of an entire
population can be challenging.
Overlapping can be an issue if there are subjects that fall into multiple subgroups. When
simple random sampling is performed, those who are in multiple subgroups are more
likely to be chosen. The result could be a misrepresentation or inaccurate reflection of
the population.
The above example makes it easy: Undergraduate, graduate, male, and female are clearly
defined groups. In other situations, however, it might be far more difficult. Imagine
incorporating characteristics such as race, ethnicity, or religion. The sorting process
becomes more difficult, rendering stratified random sampling an ineffective and less than
ideal method.
c. Cluster Sample: In cluster sampling, a population is divided into clusters which are
distinct, yet each represents a diverse group – for example, cities are often used as clusters.
From the list of clusters, a number are randomly selected to take part in a study.
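A minimal sketch of this selection step is given below. It assumes one-stage cluster sampling, where everyone in a chosen cluster takes part; the text does not specify this. The city names, resident identifiers and the function name cluster_sample are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and return their names and all their members."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample cluster names
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical clusters: city names mapped to resident identifiers.
clusters = {
    "City A": ["a1", "a2", "a3"],
    "City B": ["b1", "b2"],
    "City C": ["c1", "c2", "c3", "c4"],
    "City D": ["d1"],
}
chosen, members = cluster_sample(clusters, 2, seed=7)
print(chosen, len(members))
```

Note that randomness enters only at the level of clusters, not individuals, which is why the list of clusters is all the researcher needs at the outset.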
Advantages
1. Requires fewer resources - Since cluster sampling selects only certain groups from
the entire population, the method requires fewer resources for the sampling process.
Therefore, it is generally cheaper than simple random or stratified sampling as it
requires fewer administrative and travel expenses.
2. More feasible - The division of the entire population into homogenous groups
increases the feasibility of the sampling. Additionally, since each cluster represents
the entire population, more subjects can be included in the study.
Limitations
1. Biased samples - The method is prone to biases. If the clusters representing the
entire population were formed under a biased opinion, the inferences about the entire
population would be biased as well.
2. High sampling error - Generally, the samples drawn using the cluster method are
prone to higher sampling error than the samples formed using other sampling
methods.
d. Systematic Sample: In a systematic sample, participants are selected to be part of a
sample using a fixed interval. For example, with an interval of 5, the sample may
consist of the 5th, 10th, 15th and 20th person on a list, and so forth.
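The interval rule can be sketched with list slicing. The list of 100 people and the function name systematic_sample are hypothetical. Choosing the starting position at random is a common practice that is assumed here, not something stated above; the fixed-start line reproduces the 5th, 10th, 15th pattern from the example.

```python
import random

def systematic_sample(items, interval, seed=None):
    """Take every interval-th item, beginning at a randomly chosen start."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # index between 0 and interval - 1
    return items[start::interval]

# Hypothetical list of 100 people.
people = [f"person_{i}" for i in range(1, 101)]

fixed = people[4::5]                              # 5th, 10th, 15th, ... as in the text
sample = systematic_sample(people, 5, seed=3)     # same interval, random start
print(fixed[:4])
print(len(sample))
```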
Advantages
1. It’s extremely simple and convenient for researchers to create, conduct and analyze
samples.
2. As there’s no need to number each member of a sample, it is better for representing
a population in a faster and simpler manner.
3. The samples created are based on precision in member selection and free from
favoritism.
4. In other probability sampling methods such as cluster sampling and
stratified sampling, or non-probability methods such as convenience sampling, there
is a chance that the groups created are highly biased. Systematic
sampling avoids this, as the members are at a fixed distance from one another.
5. The factor of risk involved in this sampling method is extremely minimal.
6. In case there are diverse members of a population, this sampling technique can be
beneficial because of the even distribution of members to form a sample.
Limitations
1. Assumes Size of Population Can Be Determined - The systematic method assumes
the size of the population is available or can be reasonably approximated. For
instance, suppose researchers want to study the size of rats in a given area. If they
don't have any idea how many rats there are, they cannot systematically select a
starting point or interval size.
2. Need for Natural Degree of Randomness - A population needs to exhibit a natural
degree of randomness along the chosen metric. If the population has a type of
standardized pattern, the risk of accidentally choosing very common cases is more
apparent. For a simple hypothetical situation, consider a list of favorite dog breeds
where (intentionally or by accident) every even-numbered dog on the list was
small and every odd-numbered dog was large. If the systematic sampler began with the fourth
dog and chose an interval of six, the survey would skip the large dogs.
3. Greater Risk of Data Manipulation - There is a greater risk of data manipulation with
systematic sampling because researchers might be able to construct their systems to
increase the likelihood of achieving a targeted outcome rather than letting the
random data produce a representative answer. Any resulting statistics could not be
trusted.
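The dog example in limitation 2 can be checked directly. The sketch below assumes a hypothetical list of 60 dogs; the start position of 4 and the interval of 6 come from the example itself.

```python
# Hypothetical list of 60 dogs: even positions are small, odd positions are large.
dogs = {position: ("small" if position % 2 == 0 else "large")
        for position in range(1, 61)}

start, interval = 4, 6
picked_positions = list(range(start, 61, interval))   # 4, 10, 16, ...
picked_sizes = {dogs[p] for p in picked_positions}

print(picked_positions)
print(picked_sizes)
```

An even start combined with an even interval only ever lands on even positions, so every dog picked is small and the large dogs are never sampled.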
Example of probability sampling
Let us take an example to understand this sampling technique. The population of the US
alone is 330 million. It is practically impossible to send a survey to every individual to gather
information. Use probability sampling to collect data, even if you collect it from a smaller
population.
For example, an organization has 500,000 employees sitting at different geographic locations.
The organization wishes to make certain amendments in its human resource policy, but
before they roll out the change, they want to know if the employees will be happy with the
change or not. However, it’s a tedious task to reach out to all 500,000 employees. This is
where probability sampling comes in handy. A sample is chosen from the larger population, i.e., from
the 500,000 employees. This sample will represent the population. A survey is then
deployed to the sample.
From the responses received, management will now be able to know whether employees in
that organization are happy or not about the amendment.
Steps involved in probability sampling
Follow these steps to conduct probability sampling:
o Choose your population of interest carefully: Think carefully about whose opinions
should be collected and define the population accordingly; the sample will be drawn
from it.
o Determine a suitable sample frame: Your frame should include only members of your
population of interest, and no one from outside it, so that the data collected is accurate.
o Select your sample and start your survey: It can sometimes be challenging to find the
right sample and determine a suitable sample frame. Even if all factors are in your favor,
there still might be unforeseen issues like cost factor, quality of respondents, and
quickness to respond. Getting a sample to respond to a probability survey accurately might
be difficult but not impossible.
But, in most cases, drawing a probability sample will save time, money, and a lot of
frustration.
When to use probability sampling?
Use probability sampling in these instances:
1. When you want to reduce the sampling bias: This sampling method is used when
bias has to be kept to a minimum. How researchers select their sample largely determines
the quality of their findings. Probability sampling leads to higher quality findings because
it provides an unbiased representation of the population.
2. When the population is usually diverse: Researchers use this method extensively as it

helps them create samples that fully represent the population. Say we want to find out
how many people prefer medical tourism over getting treated in their own country. This
sampling method will help pick samples from various socio-economic strata,
background, etc. to represent the broader population.
3. To create an accurate sample: Probability sampling help researchers create accurate

samples of their population. Researchers use proven statistical methods to draw a precise
sample size to obtained well-defined data.
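The notes do not name a specific method for choosing the sample size. One common choice, assumed here purely for illustration, is Cochran's formula with a finite population correction. The default margin of error (5%), confidence level (about 95%, z = 1.96), and expected proportion (0.5) are also assumptions.

```python
import math


def cochran_sample_size(population_size, margin_of_error=0.05,
                        z_score=1.96, proportion=0.5):
    """Sample size from Cochran's formula with finite population correction.

    z_score=1.96 corresponds to roughly 95% confidence; proportion=0.5 is
    the most conservative guess when the true proportion is unknown.
    """
    # Sample size for an effectively infinite population.
    n0 = (z_score ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    # Shrink it to account for the finite population.
    n = n0 / (1 + (n0 - 1) / population_size)
    return math.ceil(n)


print(cochran_sample_size(500_000))
```

With these defaults, a population of 500,000 needs roughly 384 respondents, which shows why surveying a sample is far cheaper than surveying everyone.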
Non-probability sampling
Non-probability sampling is when a sample is created through a non-random process. This
could include a researcher sending a survey link to their friends or stopping people on the
street. This type of sampling would also include any targeted research that intentionally
samples from specific lists such as aid beneficiaries, or participants in a specific training
course. Non-probability samples are often used during the exploratory stage of a research
project, and in qualitative research, which is more subjective than quantitative research, but
are also used for research with specific target populations in mind, such as farmers that grow
maize.
Generally speaking, non-probability sampling can be a more cost-effective and faster
approach than probability sampling, but this depends on a number of variables including the
target population being studied. Certain types of non-probability sampling can also introduce
bias into the sample and results. For general population studies intended to represent the
entire population of a country or state, probability sampling is usually the preferred method.
Types of Non-Probability Sample
In non-probability sampling, those who participate in a research study are selected not at
random, but due to some factor that gives them a chance of participating in the study that
others in the population do not have. Types of non-probability sample include:
a. Convenience Sample: As its name implies, this method uses people who are convenient
to access to complete a study. This could include friends, people walking down a street,
or those enrolled in a university course. Convenience sampling is quick and easy, but
will not yield results that can be applied to a broader population.
Advantages
1. Simplicity of sampling and the ease of research
2. Helpful for pilot studies and for hypothesis generation
3. Data collection can be facilitated in short duration of time
4. Cheaper to implement than alternative sampling methods
Disadvantages
1. Highly vulnerable to selection bias and influences beyond the control of the
researcher
2. High level of sampling error
3. Studies that use convenience sampling have little credibility for the reasons above
b. Snowball Sample: A snowball sample works by recruiting some sample members who
in turn recruit people they know to join a sample. This method works well for reaching
very specific populations who are likely to know others who meet the selection criteria.
Advantages
1. The ability to recruit hidden populations
2. The possibility to collect primary data in a cost-effective manner
3. Studies with snowball sampling can be completed in a short duration of time
4. Very little planning is required to start the primary data collection process
Disadvantages
1. Oversampling a particular network of peers can lead to bias
2. Respondents may be hesitant to provide names of peers and asking them to do so
may raise ethical concerns
3. There is no guarantee about the representativeness of samples. It is not possible to
determine the actual pattern of distribution of the population.
4. It is not possible to determine the sampling error and make statistical inferences
from the sample to the population due to the absence of a random selection of
samples.
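The referral process behind snowball sampling can be sketched as a walk over a network of acquaintances. The referral network, the names in it, and the size limit below are all hypothetical.

```python
from collections import deque

# Hypothetical referral network: each person lists the peers they can refer.
referrals = {
    "seed_1": ["p1", "p2"],
    "p1": ["p3"],
    "p2": ["p3", "p4"],
    "p3": [],
    "p4": ["p5"],
    "p5": [],
    "outsider": [],  # meets the criteria but is known to nobody in the network
}


def snowball_sample(network, seeds, max_size):
    """Recruit in waves: each recruit refers the peers they know."""
    recruited = []
    seen = set()
    queue = deque(seeds)
    while queue and len(recruited) < max_size:
        person = queue.popleft()
        if person in seen:
            continue  # already recruited through another referral
        seen.add(person)
        recruited.append(person)
        queue.extend(network.get(person, []))
    return recruited


sample = snowball_sample(referrals, ["seed_1"], max_size=4)
everyone_reachable = snowball_sample(referrals, ["seed_1"], max_size=10)
print(sample)
```

Note that "outsider" can never be recruited from this seed, however large the sample grows. This is the representativeness problem listed among the disadvantages.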
c. Quota Sample: In quota sampling, a population is divided into subgroups by
characteristics such as age or location, and targets are set for the number of respondents
needed from each subgroup. The main difference between quota sampling and stratified
random sampling is that a random sampling technique is not used in quota sampling. For
example, a researcher could conduct a convenience sample with specific quotas to ensure
an equal number of males and females are included, but this technique would still not
give every member of the population a chance of being selected and thus would not be a
probability sample.
Advantages
1. Quota sampling is an attractive choice when you are pressed for time,
because primary data collection can be done in a shorter time.
2. The application of quota sampling can be cost-effective.
3. Quota sampling does not depend on the presence of a sampling frame. On
occasions where a suitable sampling frame is absent, quota sampling may be the only
appropriate choice available.
Disadvantages
1. As with other non-probability sampling methods, in quota sampling it is not possible
to calculate the sampling error, and projecting the research findings to the total
population is risky.
2. While the sample might be very representative of the quota-defining
characteristics, other important characteristics may be disproportionately represented
in the final sample group.
3. There is great potential for researcher bias, and the quality of work may suffer due
to researcher incompetency and/or lack of experience.
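Quota sampling as described above can be sketched as filling each subgroup's target in the order respondents happen to arrive, with no random selection. The respondent list and the quotas of two per gender are hypothetical.

```python
def quota_sample(respondents, quotas, key):
    """Fill each subgroup's quota in arrival order (no random selection)."""
    remaining = dict(quotas)
    selected = []
    for person in respondents:
        group = person[key]
        if remaining.get(group, 0) > 0:
            selected.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # every quota has been filled
    return selected


# Hypothetical respondents, in the order the researcher happened to meet them.
respondents = [
    {"name": "A", "gender": "female"},
    {"name": "B", "gender": "female"},
    {"name": "C", "gender": "male"},
    {"name": "D", "gender": "female"},
    {"name": "E", "gender": "male"},
    {"name": "F", "gender": "male"},
]

sample = quota_sample(respondents, {"female": 2, "male": 2}, key="gender")
selected_names = [p["name"] for p in sample]
print(selected_names)
```

Respondents D and F are never considered once their group's quota is full, so who ends up in the sample depends on arrival order rather than chance.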
d. Purposive or Judgmental Sample: Using a purposive or judgmental sampling
technique, the sample selection is left up to the researcher and their knowledge of who
will fit the study criteria. For example, a purposive sample may include only PhD
candidates in a specific subject matter. This selection method may be used when studying
specific characteristics. However, as the researcher can influence who is
selected to take part in the study, bias may be introduced.
Advantages
1. Purposive sampling is one of the most cost-effective and time-effective sampling
methods available
2. Purposive sampling may be the only appropriate method available if there are only
a limited number of primary data sources who can contribute to the study
3. This sampling technique can be effective in exploring anthropological situations
where the discovery of meaning can benefit from an intuitive approach
Disadvantages
1. Vulnerability to errors in judgment by researcher
2. Low level of reliability and high levels of bias.
3. Inability to generalize research findings
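The PhD-candidate example of purposive sampling amounts to filtering a pool by criteria the researcher chooses. In the sketch below, the candidate pool, the field names, and the chosen subject (statistics) are hypothetical.

```python
# Hypothetical candidate pool; names and fields are illustrative only.
candidates = [
    {"name": "Asha", "level": "PhD", "subject": "statistics"},
    {"name": "Ben", "level": "MSc", "subject": "statistics"},
    {"name": "Chen", "level": "PhD", "subject": "history"},
    {"name": "Dina", "level": "PhD", "subject": "statistics"},
]


def purposive_sample(pool, fits_criteria):
    """Keep only the people the researcher judges to fit the study."""
    return [person for person in pool if fits_criteria(person)]


def is_statistics_phd(person):
    """The researcher's own judgment of who belongs in the study."""
    return person["level"] == "PhD" and person["subject"] == "statistics"


sample = purposive_sample(candidates, is_statistics_phd)
selected_names = [p["name"] for p in sample]
print(selected_names)
```

Everything depends on the criteria function, which is where the researcher's judgment, and therefore any bias, enters the sample.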
Non-probability sampling examples
An example of convenience sampling would be using student volunteers known to the
researcher. Researchers can send the survey to students belonging to a particular school,
college, or university, who then act as the sample.
In an organization, a study of the career goals of 500 employees should, strictly speaking,
use a sample with proportionate numbers of males and females. This means there should
be 250 males and 250 females. Since this is unlikely to happen by chance, the researcher
fills each group using quota sampling.
Researchers also use non-probability sampling to study patients with a particular illness
or a rare disease. They can ask subjects to refer other subjects suffering from the same
ailment, as in snowball sampling, to form the sample for the study.
When to use non-probability sampling?
Use this type of sampling to indicate whether a particular trait or characteristic exists in a
population. Researchers widely use the non-probability sampling method when they aim at
conducting qualitative research, pilot studies, or exploratory research. Researchers use it
when they have limited time to conduct research or have budget constraints. When
researchers need to check whether a particular issue needs in-depth analysis, they apply this
method. Use it when you do not intend to generate results that will generalize to the entire
population.
