Lesson 2 - Methods of Data Collection

CHAPTER 1 - OBTAINING DATA

LESSON 2 – METHODS OF DATA COLLECTION

Lesson Objectives:
At the end of the lesson, students should:
1. Define and differentiate the different methods of data collection.
2. Explain the advantages and disadvantages of different methods of data collection.
3. Explain the importance/purpose of different methods of data collection in research work.
4. Identify the type of study described in the situation.
5. Discuss the type of inference that can and cannot be drawn from the study.
6. Identify the experimental units and the treatments from the study or situation.

Introduction
“So far, we have learned how to explore, summarize, display, and describe patterns in data. But
how did we get that data in the first place? This chapter provides an introduction to five ways of
obtaining data: questionnaire, interview, experiment, sample survey, and observational study/direct
observation. You will learn about how to obtain data using these methods, about different types of biases
that can get introduced due to inaccurately applying these methods, and about different types of
conclusions that can be drawn from data obtained using these different methods.” (Richard L. Scheaffer
et al., 2012, p. 130).

“The key to decision making is objective data; the key to good decision making is good objective
data. It is not enough just to have data; the data must be valid and reliable in that they actually measure
what they are supposed to measure and do so to a reasonable degree of accuracy.” (Richard L. Scheaffer
et al., 2012, p. 131).

Lesson Proper
There are many methods of gathering information and a wide variety of information sources. The
following are a few methods of collecting information for research projects:
1. Questionnaire
2. Interview
3. Experimental study
4. Survey
5. Observational study

Surveys, interviews and focus groups are primary instruments for collecting information. Today, with
help from Web and analytics tools, organizations are also able to collect data from mobile devices,
website traffic, server activity and other relevant sources, depending on the project.

The choice of data collection methods depends on the research problem under study, the research design
and the information gathered about the variable. Broadly, the data collection methods can be classified
into two categories:
• Primary Data Collection Methods: Primary data are first-hand data, collected by the researcher
for the first time and original in nature. The researcher collects fresh data when the research
problem is unique and no related research has been done by anyone else. Results tend to be more
accurate when the data are collected directly by the researcher, but collecting them is costly and
time-consuming.
• Secondary Data Collection Methods: Data that were collected by someone else for their own research
and have already passed through statistical analysis are called secondary data. Secondary data are
thus second-hand data, readily available from other sources. They are less expensive and easily
available, but the authenticity of the findings can be questioned.
Thus, the researcher can obtain data from either source, depending on the nature of the study and
the research objective pursued.

2.1 Questionnaire
Questionnaires are a popular means of collecting data, but designing one is difficult because it often
requires many rewrites before it is final. The most important issue in data collection is choosing
the most appropriate information or evidence to answer the research questions. To plan data
collection, the researcher has to think about the questions to be answered and the information
sources available, and also about how the data will be organized, interpreted, and then reported to
various audiences, before finalizing the questionnaire.

Questionnaires have several advantages:
▪ Can be used as a method in its own right or as a basis for interviewing or a telephone survey
▪ Can be posted, e-mailed or faxed
▪ Can cover a large number of people and organizations
▪ Wide geographical coverage
▪ Relatively cheap
▪ No prior arrangements are needed
▪ Avoids embarrassment on the part of the respondent
▪ No interviewer bias
▪ Possible anonymity of respondent

Questionnaires also have disadvantages:
▪ Design problems
▪ Questions have to be relatively simple
▪ Time delay while waiting for responses to be returned
▪ Assumes no literacy problems
▪ No control over who completes it
▪ Problems with incomplete questionnaires

The target group of respondents has to be selected carefully to limit these disadvantages.

2.2 Interview
• An interview is generally a qualitative research technique that involves asking open-ended
questions to converse with respondents and elicit data about a subject.
• Interviews are similar to focus groups and surveys when it comes to garnering information from
the target market but are entirely different in their operation – focus groups are restricted to a
small group of 6-10 individuals whereas surveys are quantitative in nature.
• Interviews are conducted with a sample from a population and the key characteristic they exhibit
is their conversational tone.
• The interviewer in most cases is the subject matter expert who intends to understand respondent
opinions in a well-planned and executed series of questions and answers.
• Interviewing is a great way to learn detailed information from a single individual or a small number
of individuals. It is one of the main data collection methods used in research. It is very useful when
someone wants to gain expert opinions on a subject or talk to someone knowledgeable about
a topic.

Methods of Research Interviews:


There are several methods of conducting research interviews, each distinct in its application and
usable according to the requirements of the research study. The researcher has to select an
interviewing method by considering the technology available, the availability of the person being
interviewed, and how comfortable the researcher feels talking to people.

These are the methods of interviews which are very popular among the researchers:
1. Face to face Interviews (in-person interviews)
2. Phone Interviews
3. Email Interviews
4. Chat/Messaging Interviews (Online)

Face to face Interviews/Personal Interviews: (in-person interviews)


✓ Personal interviews are one of the most used types of interviews, in which the questions are put
directly to the respondent in person. For this, a researcher can use an online survey as a guide
for noting the answers. A researcher can design the survey so that it has room for notes on the
comments or points of view that stand out from the interviewee.
✓ When the researcher sits down and talks with someone, it is a face-to-face interview. It is very
important that the researcher can adapt questions to the answers of the person being
interviewed, and a recording device should be brought to the interview.

Advantages:
• Higher response rate.
• When the interviewer and respondent are face-to-face, the questions can be adapted if they are
not understood.
• More complete answers can be obtained if there is doubt on either side or if some remarkable
piece of information is detected.
• The researcher has an opportunity to observe and analyze the interviewee’s body language
while asking the questions and to take notes about it.
• In-depth data and a high degree of confidence in the data

Disadvantages:
✓ They are time-consuming and extremely expensive.
✓ They can generate distrust on the part of the interviewee, since they may be self-conscious and
not answer truthfully.
✓ Contacting the interviewees can be a real headache, either scheduling an appointment in
workplaces or going from house to house and not finding anyone.
✓ For this reason, many interviews are conducted in public places, such as shopping centers or parks.
Some consumer studies take advantage of these sites to conduct interviews or surveys and offer
incentives, gifts, or coupons; in short, there are great opportunities for online research in
shopping centers.
✓ An advantage of conducting interviews in such places is that respondents have fresher
information when the interview takes place in context and with the appropriate stimuli, so that
researchers can collect data about the respondents’ experience at the scene of the events,
immediately and first-hand. The interviewer can use an online survey on a mobile device to
facilitate the entire process.

Telephonic Interviews/ Phone Interviews:


✓ Telephonic interviews are widely used and easy to combine with online surveys to carry out
research effectively.

✓ If the researcher needs to interview someone who is geographically far away, too busy to meet
in person, or without internet connectivity, the phone interview method is very convenient.

Advantages:
• To find the interviewees it is enough to have their telephone numbers on hand.
• They are usually lower cost.
• The information is collected quickly.
• Having a personal contact can also clarify doubts, or give more details of the questions.
• High degree of confidence in the data collected; can reach almost anyone

Disadvantages:
• Researchers often find that people do not answer calls from an unknown number, or have changed
their place of residence and cannot be located, which introduces bias into the interview.
• Researchers also encounter people who simply do not want to answer and resort to pretexts: they
are too busy, they are sick, they do not have the authority to answer the questions asked, they
have no interest in answering, or they are afraid of putting their security at risk.
• One aspect that needs care in these interviews is the courtesy with which interviewers address
respondents, so that they cooperate more readily with their answers. Good communication is vital
for getting better answers.
• Expensive, cannot self-administer, need to hire an agency

Email or Web Page Interviews:
✓ Online research is growing more and more because consumers are migrating to a more virtual
world and it is best for each researcher to adapt to this change.
✓ The increase in people with Internet access has made interviews via email or web page one of
the most used types of interviews today. An online survey is well suited to this.
✓ More and more consumers are turning to online shopping, which makes them a good group to
interview for information that supports sound decision making.
✓ This method can be used to clarify information received from a questionnaire.
✓ This method is highly convenient for most individuals who are used to emailing frequently.
✓ It is also less personal than face-to-face or phone interviews. The researcher may get less
information from an individual in an email interview because it is harder to ask follow-up
questions or play off the interviewee’s responses. However, email interviews are useful because
the answers are already in a digital format.

Advantages of email interviews:


• Speed in obtaining data
• The respondents respond according to their time, at the time they want and in the place they
decide.
• Online surveys can be mixed with other research methods or using some of the previous
interview models. They are tools that can perfectly complement and pay for the project.
• A researcher can use a variety of question types and logic, and create graphs and reports immediately.
• Can reach anyone and everyone – no barrier

Disadvantages of email interviews:


• Expensive, data collection errors, lag time

Chat/Messaging Interviews (Online)


✓ Using instant messaging services such as MSN Messenger, Google Talk, and Skype, or SMS messages
on mobile phones, a researcher can collect the necessary information for the research project.
✓ These interview methods make it possible to get information from people who live or work far
away and have internet connectivity, for whom chat or messaging is also convenient.

Advantages of Chat/Messaging Interviews:


• Cheap
• Can self-administer
• Very low probability of data errors

Disadvantages of Chat/Messaging Interviews:


• Not all respondents have an email address or internet access
• Respondents may be wary of divulging information online

When setting up an interview, the researcher should be courteous and professional. Before starting
the interview, the researcher should explain the reason for the interview, what will be discussed,
and what the research project is about. With permission from the people being interviewed, the
researcher can use video recorders to record the conversations.

When conducting interviews, the researcher should adhere to the following rules:
• Select the questions carefully.
• Start the interview with some small talk.
• Bring an extra recording device (for example, another video recorder).
• Pay close attention while the interview is going on.
• Come to the interview prepared.
• Do not pester or push the interviewee. If he or she does not want to talk about an issue,
respect that and do not push.
• Keep to the prepared questions during the interview.
• Do not let the interviewee drift off topic; ask follow-up questions to redirect the
conversation to the subject.

Conclusion
✓ Undoubtedly, the objective of the research sets the pattern for which types of interviews are
best for data collection. Based on the research design, a researcher can plan and test the questions,
for instance checking whether the questions are the right ones and whether the survey flows well.
✓ In addition, there are other types of research that can be used under specific circumstances, for
example where there is no connection or where adverse conditions prevent surveys from being
carried out. On such occasions it is necessary to conduct field research, which cannot be
considered an interview but is rather a completely different methodology.
✓ To summarize, an effective interview is one that provides researchers with the data needed to
understand the object of study, and whose information is applicable to the decisions researchers
make.

2.3 Survey

Sample Surveys and Inference about Populations


Some studies are designed for the purpose of estimating population characteristics, such as means or
proportions. Before planning such a data collection activity, we need to identify the population
involved.

Population - the entire group of individuals or items in a study.
Sample - a part of a population that is actually studied.
Frame - a list (or comparable form of identification) of all members of a population.
Example:
• a list of all students in the college of engineering
• a list of all equipment owned by a company
• a list of possible errors that can occur when a program is run
• a list of all addresses served by a power supplier
• a list of all bidders for a construction project
• a list of all the trees in a particular plot

Any of these could serve as a frame for various studies. For many populations, such as the residents
of a state, a frame is not readily available.
A sample is the part of a population that is actually studied. For example,
• All the fish in Mobile Bay constitute a population, but the fish caught to measure mercury levels
make up a sample.
• All the items produced in one run of an assembly line make up a population, but the items inspected
for defects make up a sample.
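
The relationship between a frame and a sample can be sketched in a few lines of Python. This is only an illustration: the frame of 1,200 student IDs, the sample size of 50, and the function name `draw_simple_random_sample` are all hypothetical, and simple random sampling is just one possible sampling scheme.

```python
import random

# Hypothetical frame: an ID for every student in a college of engineering.
frame = [f"student_{i:04d}" for i in range(1, 1201)]

def draw_simple_random_sample(frame, n, seed=None):
    """Return n distinct members of the frame, every subset of size n equally likely."""
    rng = random.Random(seed)       # a seed makes the draw repeatable
    return rng.sample(frame, n)     # sampling without replacement

sample = draw_simple_random_sample(frame, n=50, seed=1)
print(len(frame), "units in the frame;", len(sample), "units in the sample")
```

Every unit in the sample comes from the frame, so a unit missing from the frame can never be selected.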

Sample Survey - A sample is collected and studied to gain information about a population.
For example:
• During the process of negotiating an annual contract with a parking facility, the manager of a large
company wants to know how many employees will need parking spaces next year. How can we
get reliable information? One way is to question all employees, but this procedure would be
somewhat inaccurate and very time-consuming. We could take the number of spaces in use this year
and assume that the need for next year will be about the same, but this method would have
inaccuracies as well. A simple technique that works well in many cases is to select a sample of
employees not planning to retire at the end of this year and ask each selected employee whether he
or she will be requesting a parking space. From the proportion of “yes” answers, an estimate of the
number of parking spaces required by the entire population of employees can be obtained.

• When Alabama was planning to offer tax incentives to Mercedes for building a plant in Alabama,
the Mobile Register conducted a telephone survey of about 400 adult Alabama residents and asked
them, “Should Alabama offer tax incentives to industries to relocate in the state?” People
responded by saying “agree,” “disagree,” or “don’t know.” From the proportion who agreed, the
percent of adult Alabama residents in favor of offering tax incentives to industries for
relocating to the state was estimated.
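
The estimate in the parking example is a sample proportion scaled up to the population. The sketch below shows the arithmetic; the figures (2,000 employees not planning to retire, 100 sampled, 64 answering "yes") and the function name `estimate_total` are made up for illustration and do not come from the text.

```python
def estimate_total(population_size, sample_size, yes_count):
    """Scale the sample proportion of 'yes' answers up to the whole population."""
    p_hat = yes_count / sample_size              # sample proportion
    return p_hat, round(p_hat * population_size)

# Hypothetical figures: 2,000 employees, 100 of them sampled, 64 say yes.
p_hat, spaces = estimate_total(population_size=2000, sample_size=100, yes_count=64)
print(f"sample proportion = {p_hat:.2f}, estimated spaces = {spaces}")
```

The estimate is only as good as the sample: it describes the population well only if the sampled employees were chosen so that they represent all employees.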

The scenarios outlined above have all the elements of a typical sample survey. There is a question of
“How many?” or “How much?” to be determined for a specific target population, the population to
which we intend to apply the results of the study. The population from which the data are collected
is known as the sampled population. It is desirable to have the target population the same as the
sampled population, but in some circumstances they might differ. For example, random-digit-dialing
telephone polls systematically leave out those without telephones and may miss those with cell phones.
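
A small numerical sketch shows why a gap between the target population and the sampled population matters. All counts below are hypothetical: a target population of 1,000 adults in which the 200 people a telephone poll cannot reach hold a different opinion, on average, from the 800 it can reach.

```python
# Hypothetical counts for a target population of 1,000 adults.
groups = {
    "has_landline": {"size": 800, "agree": 480},   # reachable by the poll
    "no_landline":  {"size": 200, "agree": 60},    # systematically left out
}

def proportion_agree(group_names):
    """Proportion who agree among the listed groups."""
    size = sum(groups[g]["size"] for g in group_names)
    agree = sum(groups[g]["agree"] for g in group_names)
    return agree / size

target = proportion_agree(["has_landline", "no_landline"])   # target population
sampled = proportion_agree(["has_landline"])                 # sampled population
print(f"target population: {target:.2f}, sampled population: {sampled:.2f}")
```

Under these assumed counts, even a perfectly random sample of the reachable group would estimate the sampled-population value, not the target-population value.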

An approximate answer for a population is derived from a sample of data extracted from the population
of interest. Of key importance is the fact that the approximate answer will be a good approximation
only if the sample truly represents the population under study. Randomization plays a vital role in the
selection of samples that represent the population and hence produce good approximations. Virtually
any sampling scheme that depends upon subjective judgments rather than randomization as to who
(or which item) should be included in the sample will suffer from judgmental bias. As you will see in
later chapters, randomization also forms the probabilistic basis for statistical inference.
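
The contrast between judgment sampling and random sampling can be illustrated with a small simulation. The population of 100 measurements and the rule "hand-pick the 10 largest units" are hypothetical, chosen only to make the bias easy to see; real judgmental bias is usually subtler.

```python
import random
from statistics import mean

population = list(range(1, 101))   # hypothetical measurements, true mean 50.5

# Judgment sample: the investigator hand-picks the 10 largest units.
judgment_sample = sorted(population)[-10:]

# Random samples: repeat simple random sampling of 10 units many times.
rng = random.Random(0)
random_means = [mean(rng.sample(population, 10)) for _ in range(2000)]

print("population mean:      ", mean(population))
print("judgment sample mean: ", mean(judgment_sample))
print("average of SRS means: ", round(mean(random_means), 1))
```

Individual random samples still vary, but on average they center on the population mean, while the hand-picked sample is off in one direction every time.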
Example
The Tennessee State Board of Architectural and Engineering Examiners asked the Tennessee Society of
Professional Engineers (TSPE) and the Consulting Engineers of Tennessee (CET) to look into various
issues related to professional registration. One of the issues was the professional registration of
engineering faculty in Tennessee. Between 1999 and 2003, they sent a survey to engineering deans to
determine the registration rate for administrators (deans and department chairs) and full-time faculty.
Also of interest were the deans’ opinions about the need for maintaining professional registration and
whether they provide incentives to faculty to obtain their PE certification. The survey questionnaire
and results were reported by Madhaban and Malasri (Journal of Professional Issues in Engineering
Education and Practice, 2003). The deans who received the survey questionnaire were not selected
randomly from all available deans of engineering colleges. In fact, no specific scheme was used to select
deans. Repeated rounds of mailing were used. What effect (if any) do you think this nonrandom selection
will have on the outcome?

Census
The United States conducts a census every 10 years; in other words, the government attempts to count
everyone living in the United States and to measure various other features of the population. The
information collected is used for future planning in such areas as taxation, building of schools,
planning retirement centers, and forecasting energy needs. A census means a complete enumeration. It is
a process of collecting information from every unit in a target population. In other words, a census is
a big sample survey, one that includes every unit of the population. Making a list of all the music CDs
you own is taking a census of music CDs. If a firm takes inventory, it is taking a census of everything
in stock. The computerized record of all the employees of a firm is in fact a census of employees. So,
the target population might be your CDs, the stock of the firm, or the employees of the company, but the
key feature that identifies a census is that information is available on each element of that
population. No randomization is used in the census data collected from all the residents of the
United States, but random sampling is used to augment these data on selected issues.

Example
U.S. News and World Report (September 2003) reported the 50 top-rated doctoral universities in the
country. They collected information on several important factors, such as ACT or SAT scores, percent of
freshmen in the top 10% of their high school class, student/faculty ratio, graduation rate, freshman
retention rate, alumni giving rate, and so on, for all the doctoral universities in the country. Then,
using statistical techniques, they ranked the universities. In this study, U.S. News and World Report
collected information from all doctoral institutions in the country; no randomization was used in
collecting these data. In other words, they conducted a census.

It is feasible to conduct a census if the population is small and the process of getting information
does not destroy or modify units of the population. For example, the owner of a manufacturing firm
might be interested in getting information about the stores to which his business supplies the items
it produces. It is possible to gather this information even if there are 2,500 area stores to which he
supplies items. However, in many situations a census is not the method of choice for gathering
information. For large populations, a census can be a costly and time-consuming process of data
gathering. Sometimes the process of measurement is destructive, as in testing an appliance for life
length.
• A political advisor to a candidate for governor is interested in determining how much
support the candidate has in the state. Suppose the state has 4,000,000 eligible voters. It would be
too time-consuming to contact each voter to determine the amount of support. By the time the
census is finished, the support level might have changed, and the information collected may be useless.
• Suppose a Department of Fisheries is interested in determining the mercury level in the fish in
Mobile Bay. Using a census would mean capturing all the fish in Mobile Bay and testing them for
mercury level, which is not an advisable (or even possible) method of gathering information.
• A manufacturer of suspension cable is interested in determining the strength of the cable produced
by his factory. The strength test involves applying force until the cable breaks. Obviously, a census
would leave no cable to use. So, a census would not be a practical method of gathering information
in this situation.

2.4 Experiment and Inference about Cause and Effect


An experiment is a planned activity designed to compare “treatments.” In an experiment, the
experimenter creates differences in the experimental units involved by subjecting them to different
treatments and then observing the effect of such treatments on the measure of outcome.
For example,
• In laboratory testing, engineers at one car manufacturing facility run crash tests that involve
running cars at different speeds (predetermined and controlled) and crashing them at a specific
site. Then they measure the damage to the bumpers. In this example, the team of engineers
creates the differences in environment by running cars at different speeds. (The cars are the
experimental units, and the speeds are the treatments.)

• Engineers interested in studying heat transfer use pipes of different sizes and control the
direction in which water is flowing. In one study, the engineers create different environments
by controlling the size of the pipes and the direction of the water flow to determine the percent
of heat transfer in those different environments. (The pipes are the experimental units, and the
size-direction combinations are the treatments.)

As in sample surveys, randomization plays a vital role in designed experiments. By randomizing
the assignment of different environments (treatments) to experimental units, biases that might
result from learning effects or specific orders can be avoided. Designed experiments are
conducted not only to establish differences in outcome among environments but also to support
conclusions about cause and effect. In sample surveys, a sample is selected randomly from a
population of interest to estimate some population characteristic; in designed experiments,
different experimental units are assigned randomly to different treatments to study the
treatment effects.
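
Random assignment of treatments to experimental units can be sketched as follows. The setup is hypothetical: 12 cars and three test speeds are invented to echo the crash-test example, and `randomize_assignment` is an illustrative helper that produces equal-sized groups, which is one common design rather than the only one.

```python
import random

def randomize_assignment(units, treatments, seed=None):
    """Shuffle the units, then deal them out to the treatments in equal-sized groups."""
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    rng = random.Random(seed)
    shuffled = units[:]                 # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    per_group = len(units) // len(treatments)
    return {
        t: shuffled[i * per_group:(i + 1) * per_group]
        for i, t in enumerate(treatments)
    }

# Hypothetical crash-test setup: 12 cars (units), three speeds (treatments).
cars = [f"car_{i}" for i in range(1, 13)]
speeds = ["20 mph", "30 mph", "40 mph"]
assignment = randomize_assignment(cars, speeds, seed=42)
for speed, group in assignment.items():
    print(speed, group)
```

Because chance, not the experimenter, decides which car gets which speed, systematic differences between the groups other than the treatment are unlikely.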

Example

Guo and Uea (Trans IChemE, 2003) conducted experiments to study the effects of impregnation conditions
on the textural and chemical characteristics of the prepared absorbents. They used three different
concentrations (20%, 30%, and 40%) of three different solutions, zinc chloride (ZnCl2), phosphoric
acid (H3PO4), and potassium hydroxide (KOH), and recorded the amount of nitrogen dioxide (NO2)
and ammonia (NH3) absorbed onto the oil-palm-shell absorbents. In this experiment, different
treatments were created by using different concentrations of the solutions, and the effects of these
different treatments were measured in the amount of nitrogen dioxide and ammonia absorbed.
The nine different treatments created in the experiment can be listed as follows:
(1) 20% of ZnCl2 (2) 30% of ZnCl2 (3) 40% of ZnCl2
(4) 20% of H3PO4 (5) 30% of H3PO4 (6) 40% of H3PO4
(7) 20% of KOH (8) 30% of KOH (9) 40% of KOH
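
The nine treatments are every combination of three solutions and three concentrations. As a short sketch, the same list can be generated with the standard-library `itertools.product`; the variable names are illustrative only.

```python
from itertools import product

solutions = ["ZnCl2", "H3PO4", "KOH"]
concentrations = [20, 30, 40]   # percent

# One treatment per (solution, concentration) pair: 3 x 3 = 9 treatments.
treatments = [f"{c}% of {s}" for s, c in product(solutions, concentrations)]
for number, treatment in enumerate(treatments, start=1):
    print(f"({number}) {treatment}")
```

Listing treatments this way makes it easy to confirm that no combination of factor levels has been left out.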

2.5 Observational Study


An observational study is a data collection activity in which the experimenter merely plays the role
of an observer. The experimenter observes the differences in the conditions of units and observes the
effects of these conditions on measurements taken on these units. The experimenter does not interject
any treatment and does not contribute to the creation of observed differences.

For example:
• One researcher collected information about the speed at which a car was travelling when a
crash occurred and the amount of damage to the bumper from the accident reports filed by
the local Police Department. In this example, the researcher has no control over the speed of
the car. He did not contribute to the creation of the differences among the speeds. The
researcher merely observed the differences in speeds and their result, measured by the amount
of damage to the bumper.

• The weather station at the Mobile/Pascagoula Regional Airport recorded the wind speed, wind
direction, and eye radius of the storm when Hurricane Danny stayed over Mobile Bay for three
days. The meteorologists studied the relation among different factors to investigate reasons
behind the fluctuations in the eye of the storm. Changing values of the wind speed and wind
direction had created different environments in the storm, and such environments could be
evaluated using differences in wind speeds and directions. However, the meteorologists did not
control those scenarios; they merely observed the conditions created by nature.

Example
Wolmuth and Surters (Proceedings of the ICE, 2003) studied crowd-related failures of bridges around the
world. They collected information on bridge failures from the years 1825-2000. For each failed bridge,
they collected information on the age, use (road, footbridge, other), form (aluminum, chain, cable
supports, concrete, deck structure, iron, steel, timber), span, width, occasion (cavalry or soldiers,
sports gathering, religious gathering, river spectacle, toll dispute, other), crowd size, crowd action
(walking from one end of the bridge to the other, procession, crowd concentrated at one parapet, crowd
going from one parapet to the other, queue, cavalry, soldiers or other military), number of deaths,
number injured, and so forth. This was an observational study because the authors collected data from
existing scenarios (they did not create differences in them) and analyzed the collected data to answer
specific questions about the bridge failures. An observational study such as this one provides very
valuable information to engineers about planning bridge construction and proper use of bridges, but
such studies do not allow cause-and-effect conclusions.

Although we might like to, it is not possible to conduct an experiment in all investigations that
involve a comparison of treatments. Sometimes we must use an observational study instead of an
experiment.
• To study the effect of asbestos on the health of workers in a certain industry that makes use of that
product, an experiment would require one group of workers to be exposed to products containing
asbestos while another group is not. It is unethical to expose somebody intentionally to possibly
harmful chemicals so that damage to health can be measured.
• Certain inherited traits affect a worker’s ability to perform certain tasks. It is not possible to
randomly assign genetic traits to different workers; they are born with those traits.

In observational studies, results often cannot be generalized to a population because such studies
often use volunteers or samples of convenience, such as workers in the first shift, instead of a random
sample selected from all workers. However, we can sometimes check whether the result can reasonably
be explained by chance alone.
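
One way to check whether a difference could reasonably be explained by chance alone is a permutation test. The text does not name a specific method, so the sketch below is an assumption: it uses made-up measurements for two groups of four units and enumerates every possible regrouping to see how often a difference at least as large as the observed one arises.

```python
from itertools import combinations

# Hypothetical measurements for two groups of four units each.
group_a = [12, 15, 14, 16]
group_b = [9, 10, 11, 8]

def exact_permutation_p_value(a, b):
    """One-sided p-value: share of all regroupings whose mean difference
    (first group minus second) is at least as large as the observed one."""
    pooled = a + b
    total = sum(pooled)
    observed = sum(a) / len(a) - sum(b) / len(b)
    extreme = 0
    count = 0
    for idx in combinations(range(len(pooled)), len(a)):
        sum_a = sum(pooled[i] for i in idx)
        diff = sum_a / len(a) - (total - sum_a) / len(b)
        count += 1
        if diff >= observed - 1e-12:    # small tolerance for float rounding
            extreme += 1
    return extreme / count

p_value = exact_permutation_p_value(group_a, group_b)
print(f"one-sided p-value = {p_value:.4f}")
```

A small p-value suggests the difference is hard to attribute to chance, but in an observational study it still does not justify a cause-and-effect conclusion or generalization beyond the units studied.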

Review Exercises
1. Engineers are interested in comparing the mean hydrogen production rates per day for three
different heliostat sizes. From the past week’s records, the engineers obtained the amount of
hydrogen produced per day for each of the three heliostat sizes. Then they computed and
compared the sample means, which showed that the mean production rate per day increased
with the heliostat sizes.
a. Identify the type of study described here.
b. Discuss the type of inference that can and cannot be drawn from this study.

2. To investigate reasons why people do not work, the Census Bureau interviewed a group of
randomly selected individuals. From April to July 1996, in four separate rotation groups,
respondents were asked to select 1 of 11 categories consisting of economic and noneconomic
reasons for not working, in response to the question, “What is the main reason you did not work
at a job or business [in the last four months]?”
a. Identify the type of the study described here.
b. Identify the population of interest.

3. Ariatnam, Najafi, and Morones (Journal of Professional Issues in Engineering Education and
Practice, 2002) describe an overview of academies in horizontal directional drilling conducted to
train engineers and inspectors for the California Department of Transportation. A pretest was
administered on the first day prior to any instruction. Instruction and field experience were
provided over a 3-day period, followed by a final test administered at the end of the last day.
Although the average final test score of 75.27% was higher than the average pretest score of
55.61%, the difference was not significant.
a. Identify the type of the study described here.
b. What is the purpose of administering the pretest?

4. A materials engineer wants to study the effects of two different processes for sintering copper (a
process by which copper powder coalesces into a solid but porous copper) on two different types
of copper powders. From each type of copper powder, she randomly selects two samples and
then randomly assigns one of the two sintering processes to each sample by the flip of a coin.
The response of interest measured is the porosity of the resulting copper. Explain what type of
study this is and why.

5. A textile engineer is interested in measuring the heat resistance of four different types of threads
used in making fire-resistant clothing for firefighters. A random sample of 20 threads of each type
was taken and subjected to a heat test to determine resistance (the length of time the fibers survive
before starting to burn). Explain what type of study this is and why.

6. A manufacturer of “Keep it Warm” bags is interested in comparing the heat retention of bags
when used at five different temperatures (100 °F, 125 °F, 150 °F, 175 °F, and 200 °F). Thirty bags
are selected randomly from last week’s production and randomly assigned, six each, to five
different groups. Items from group 1 at beginning temperature 100 °F were kept in bags for an
hour, and the temperatures of those items were recorded after an hour. Similarly, groups 2 to 5
were assigned items at 125 °F, 150 °F, 175 °F, and 200 °F, respectively.
a. Identify the type of study used here.
b. What type of inference is possible from this study?
