Michael Gregg - Field Epidemiology Tahap Awal Investigasi
level (and to a lesser degree the national level), surveillance data are used to detect
epidemics that lead to control and prevention activities.
Test Hypotheses
Surveillance data are used to quantify the impact of intervention programs. For
example, decreases in poliomyelitis rates occurred following the introduction
of both the inactivated and oral polio vaccines (Fig. 3–3).12 In other examples,
Figure 3–2. Estimated rate (cases per 100,000 population) of measles, by age
group—United States, 1980–1982. Rates were estimated by extrapolating age
from the records of case-patients. [Source: CDC, 1983.]
60 Background
[Figure 3–3 plot: y-axis, rate (logarithmic scale); x-axis, year; annotations mark the inactivated vaccine and the oral vaccine.]
Figure 3–3. Rate (reported cases per 100,000 population) of paralytic poliomyelitis, United States, 1951–2004. [Source: CDC, 2006.]
[Figure: percent (y-axis) by year (x-axis), 1980–1990.]
Surveillance has been used to monitor health practices such as vaccinations, hys-
terectomy, caesarean delivery, mammography, and tubal sterilization. The surveil-
lance of such practices and health-care technologies has been increasing in public
health practice in recent years.14
This chapter describes the essentials of a surveillance system and how to start and
maintain one. It is important to understand the basic levels of evaluation of such
systems. They should be evaluated at three levels: (1) the public health importance
of the health event; (2) the usefulness and cost of the surveillance system (e.g.,
whether it is meeting its goals and at what cost); and (3) the explicit attributes of
the quality of the surveillance system, including sensitivity, specificity, represent-
ativeness, timeliness, simplicity, flexibility, and acceptability.15, 16
The decision to establish, maintain, or deemphasize a surveillance system
should be guided by assessments based on these criteria. Ultimately, the decision
rests on whether a health event under surveillance is a public health priority and
whether the surveillance system is useful and cost-effective.
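Two of the quality attributes listed above, sensitivity and specificity, can be computed directly once the system's reports have been compared with some reference standard. The following is a minimal sketch under that assumption; the class name, field names, and counts are hypothetical and are not taken from the chapter or from any guideline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SurveillanceCounts:
    """Hypothetical 2x2 counts: system reports versus a reference standard."""
    true_pos: int   # true cases that the system reported
    false_neg: int  # true cases that the system missed
    false_pos: int  # non-cases that the system reported
    true_neg: int   # non-cases that the system did not report


def sensitivity(c: SurveillanceCounts) -> float:
    """Proportion of true cases that the system detected."""
    return c.true_pos / (c.true_pos + c.false_neg)


def specificity(c: SurveillanceCounts) -> float:
    """Proportion of non-cases that the system correctly did not report."""
    return c.true_neg / (c.true_neg + c.false_pos)


if __name__ == "__main__":
    # Illustrative counts only.
    counts = SurveillanceCounts(true_pos=80, false_neg=20, false_pos=30, true_neg=870)
    print(f"sensitivity = {sensitivity(counts):.2f}")
    print(f"specificity = {specificity(counts):.2f}")
```

The other attributes (representativeness, timeliness, simplicity, flexibility, acceptability) are largely qualitative and are not captured by a calculation of this kind.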
SUMMARY
Public health surveillance is a basic tool of the field epidemiologist, providing the
scientific and factual database essential to informed decision-making and to the
conduct of public health prevention and control programs. Surveillance is based
on morbidity, mortality, and risk factor data, often from multiple sources. Some
data, such as vital statistics, are collected primarily for other uses; other data, such
as behavioral risk factors, are collected specifically for the surveillance system.
Surveillance systems are established by the field epidemiologist for specific
outcomes, such as a disease or injury, and must have clearly expressed goals.
Explicit case definitions are at the core of a surveillance system. The initiation and
maintenance of any successful surveillance system will reflect recognition of the
human element in surveillance practice: data collection, analysis, and data dis-
semination. Insensitivity to the persons involved in such a system dooms it to
failure.
Surveillance data have many uses but, in general, are needed to assess the
health status of a population, to set public health priorities, and to determine
appropriate actions. Effective systems of public health surveillance are evaluated
regularly on the basis of their usefulness in public health practice.
REFERENCES
1. Thacker, S.B., Berkelman, R.L. (1988). Public health surveillance in the United States.
Epidemiol Rev 10, 164–90.
2. Thacker, S.B., Stroup, D.F. (1992). Future directions for comprehensive public health
surveillance and health information systems in the United States. Am J Epidemiol 140,
383–97.
3. Langmuir, A.D. (1963). The surveillance of communicable diseases of national impor-
tance. N Engl J Med 268, 182–92.
4. Koo, D., Parrish, R.G., II. (2000). The changing health-care information infrastructure
in the United States: opportunities for a new approach to public health surveillance. In
S.M. Teutsch, R.E. Churchill (eds.), Principles and Practice of Public Health
Surveillance, vol. 1, pp. 76–94. Oxford University Press, New York.
5. Centers for Disease Control and Prevention (1997). Case definitions for public health
surveillance. Morb Mortal Wkly Rep 46, (RR-10) 1–55.
6. Centers for Disease Control and Prevention (2005). Progress in improving state and
local disease surveillance—United States, 2000–2005. Morb Mortal Wkly Rep 54,
822–5.
7. Bender, J.B., Hedberg, C.W., Besser, J.M., et al. (1997). Surveillance for Escherichia
coli O157:H7 infections in Minnesota by molecular subtyping. N Engl J Med 337(6),
388–94.
8. Roush, S., Birkhead, G., Koo, D., et al. (1999). Mandatory reporting of diseases and
conditions by healthcare professionals and laboratories. JAMA, 282, 164–70.
Surveillance 63
9. Centers for Disease Control and Prevention (2006). Assessing capacity for surveillance,
prevention, and control of West Nile virus infection—United States, 1999 and
2004. Morb Mortal Wkly Rep 55, 150–3.
10. Centers for Disease Control and Prevention (1999). Summary of notifiable diseases,
United States, 1998. Morb Mortal Wkly Rep 47, 47.
11. Centers for Disease Control and Prevention (1983). Annual Summary 1983: Reported
morbidity and mortality in the United States. Morb Mortal Wkly Rep 32, 33.
12. Centers for Disease Control and Prevention (1999). Summary of notifiable diseases,
United States, 1998. Morb Mortal Wkly Rep 47, 54.
13. Centers for Disease Control and Prevention (1990). Sexually Transmitted Disease
Surveillance Morbidity Report.
14. Thacker, S.B., Berkelman, R.L. (1986). Surveillance of medical technologies. J Public
Health Policy 7, 353–77.
15. Romaguera, R.A., German, R.R., Klaucke, D.N. (2000). Evaluating public health
surveillance. In S.M. Teutsch, R.E. Churchill (eds.), Principles and Practice
of Public Health Surveillance, 2nd ed., pp. 176–93. Oxford University Press,
New York.
16. Centers for Disease Control and Prevention (2001). Updated guidelines for evaluating
public health surveillance systems: recommendations from the guidelines working
group. Morb Mortal Wkly Rep 50(No. RR-13).
DATA SETS
Richard A. Goodman
James L. Hadler
Duc J. Vugia
68 The Field Investigation
jurisdiction personnel in positions to call for assistance are able to understand the
perspective of those they invite in to assist.
In the United States, the responsibility for public health rests primarily with the
state and local agencies. While many investigations may be carried out with local
resources, sometimes they require additional assistance. Additional staff and/or
expertise may be necessary to fully investigate and respond to complex, large or
multijurisdictional problems. When additional help is needed, a process and
discussion follow to extend an invitation and define the relative roles of all
involved.
THE INVITATION
An essential consideration is the need to have a formal request for assistance from
an official who is authorized to request help. In the United States, usually the state
epidemiologist has the authority and responsibility for major epidemiologic field
investigations of acute public health problems and for making decisions about
whether to investigate independently or to seek help elsewhere. Other persons or
organizations may be involved in generating a request, including persons in insti-
tutional settings (e.g., nursing homes, hospitals, and businesses), as well as organ-
izational entities with special jurisdiction or authority (e.g., prisons, military
facilities, cruise ships, and reservations for Native Americans). For international
problems, determining who possesses authority to extend a request may be
considerably more complicated and may involve, for example, ministries of
health or multinational organizations such as the World Health Organization
(see Chapter 21).
The relations between larger and smaller health jurisdictions vary from
state to state (or province to province) within countries, as well as from country
to country. In general, the larger health jurisdictions help serve the smaller in
time of need. Yet the sensitivities between these jurisdictional levels are often
delicate, particularly as they relate to perceived competence, local scope of
responsibility, and ultimate authority. The health officials of the jurisdiction
providing assistance must decide—on the basis of prevailing local–state
agreements, as well as their best judgment—what is the most appropriate
response.
At the time of the initial request for assistance, the invited field epidemiolo-
gists should attempt to determine the answers to three questions.
Operational Aspects of Epidemiologic Field Investigations 69
As noted in Chapter 1, there are several reasons why field investigations should be
done, if not encouraged. These include, but are not limited to, the need to:
During preliminary discussions with the public health official(s) who are
requesting assistance with an investigation, the assignment of the following roles
and responsibilities should be addressed:
These are extremely critical issues, some of which cannot be totally resolved
before the investigative team arrives on the scene. However, they must be
addressed, discussed openly, and agreed upon as soon as possible.
PREPARATION
Once the field team has been chosen, certain key measures should be taken.
• Identify the team leader and the senior staff to whom s/he should report
regularly at the “home base.”
• Hold a meeting of all proposed field team members with home-based staff
to review known details of the public health problem, the nature of the
request for assistance, current knowledge of any suspected pathogen or
disease, goals and objectives for the field investigation, and preliminarily
agreed-upon roles and responsibilities.
• Try to arrange in advance an initial meeting with the requestor (e.g., the
state epidemiologist) or persons either designated or identified by the
requestor (e.g., local disease-control director or other official). This will
ensure that local authorities are not surprised by an unexpected arrival. In
addition, this step underscores for all parties the need for advance planning
and orderliness in the investigation—in essence, it sets a tone for the
conduct of the investigation.
• Before leaving for the field, a senior member of the team should write a
memorandum to the record and/or to relevant key officials. It should summarize
how and when the request was made, what information was provided
by the requesting health agency, the agreed-upon purpose of
the investigation, the commitments of both the visiting team and
the requesting health officials, who is on the field team, and when the team
is expected to arrive in the field. This memorandum should be distributed
to key personnel in the offices of the visiting team and the requesting host
agency, and to others who need to know. This kind of communication
will serve not only as notification to all concerned but also as a method of preventing
redundant responses (i.e., to avoid “crossing wires”). It will also
identify expertise and resources from other programs that may contribute
to the investigation. Basic programmatic jurisdictions and interests must
also be respected, and some programs and staff simply want to or need to
know as a courtesy. Even when a problem does not directly involve a state
(for example, as in the case of a prison or military facility), a wide array of
state and local officials are generally notified because of possible ramifications
of the problem for populations in surrounding communities.
• Lastly, before departing for the field, each member of the investigative
team should review a basic checklist to ensure they have materials and aids
essential for field operations, and have covered fundamental travel and
logistical considerations. Such items include, for example, background
journal articles, statistical references, laptop computers (already loaded
with essential software, such as Epi Info), digital cameras, portable voice
recorders, credit cards, and travel and lodging reservations. Finally, each
member also should review the need for any necessary personal protective
measures, including vaccinations, anti-malarial prophylaxis, antimicrobi-
als or antivirals for post-exposure prophylaxis, and personal protective
equipment such as face masks, gloves, and gowns.
A key concept and philosophy for the field investigation team to keep in mind is
the importance of the role of “consultant/collaborator,” and what that implies. In
general, the guiding principle should be that the team is there to provide help, not
simply to “take charge.” Equally important is the need to balance the focus of the
investigation with the competing priorities in the locality of the requesting juris-
diction. While the immediate problem is the team’s sole concern, local health
officials must continue to address myriad other priorities and ongoing problems.
This dichotomy can be appreciated if the team tries to take the local point of view
early in the investigation.
Once on site, the team should meet promptly with the official who requested
assistance—usually the state, provincial, or local epidemiologist or a program
director. At this meeting, essential steps include the need to:
• Create a method and schedule for providing updates to local officials and
headquarters;
• Review sensitivities, including potential problems with institutions and
individuals (e.g., hospitals, administrators, practitioners, and local public
health staff) likely to be encountered during the investigation. Ideally, the
team initially should take the time to meet the requesting official—so that
key “doors” will be opened—rather than spend valuable time later in the
investigation mending bridges.
During the initial meeting between local staff and the visiting team, an appro-
priate local person should be identified to speak for the entire investigative team.
In general, the visiting team should try to avoid direct contact with the news media
and should always defer to local health officials (see Chapter 13). The field team
is essentially working at the request and under the aegis of the local health offi-
cials. Therefore, it is the local officials who not only know and appreciate the local
situation but also are the appropriate persons to comment on the investigation. In
the most practical sense, the less the media make contact with the visiting team,
the more the team can do at its own pace and discretion.
The work required to organize an investigation through this stage (starting
travel and convening the initial meeting) is relatively straightforward and uncom-
plicated. In contrast, however, at least three factors will probably complicate the
start of the scientific investigation: (1) the effects of a new setting (i.e., visiting
team members are outsiders and unfamiliar with the environment); (2) the often
intense pressure to solve the problem immediately and end the outbreak; and
(3) the media’s queries and other demands for the team’s time. Thus, in short
order, circumstances may change from tranquility and orderliness to a situation of
pressure and confusion. To overcome the myriad potential distractions, team
members must maintain the proper perspective by adhering to the basics: focusing
the mission to collect data systematically; verifying the diagnosis; and then
proceeding through case identification, orientation of data, and development and
testing of hypotheses (see Chapters 8 and 10). Therefore, at the conclusion of the
initial meeting, some team members should try to visit patients to verify the
diagnosis through interviews, review of laboratory data, and, if necessary,
physical examination.
MANAGEMENT
progression of the investigation. First, maintain lists of necessary tasks, check off
the actions that have been completed, and update the list at least twice daily.
Second, communicate frequently with coworkers, the requesting official, and the
person designated to be the media contact; hold a team meeting each day at a
regularly scheduled time. The team leader also should communicate with the
home base senior staff as frequently as needed. Third, never hesitate to request
additional help if required by the circumstances. Fourth, to ensure the investiga-
tion will be completed, avoid setting a departure date in advance or succumbing
to the pressure of family members to return earlier.
Investigations of large and complex problems may be particularly challeng-
ing for field teams and require even more rigorous organization of field opera-
tions. The following practical pointers are offered to assist in managing key
aspects of the investigation:
• Record the team’s decisions as the team makes them—this will help ensure
consistency and make the study reproducible, a consideration particularly
important in regard to case definitions and why certain criteria were used.
• Remember the need for quality-control measures such as training and
monitoring of data collectors and abstractors, conducting error checks
and validating data independently, and evaluating non-respondents and
missing records.
• Resist collecting more data than are needed (e.g., excessive clinical details).
• Write down the reason the team went to investigate and what was there
when the team arrived (i.e., a background section).
• Write while the investigation is ongoing—months later, team members
will have forgotten what they did.
• Write the methods while they are being defined and developed by team
members—a decision log helps.
• Maintain and retain an inventory of data files.
• Protect the information privacy interests of subjects.
• Because field investigations are difficult, associated with long hours and
great stress, field team managers must make a special effort to maintain
morale and should find ways to provide encouragement, positive reinforce-
ment, and appreciation to those who participate.
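The quality-control pointer above (error checks and attention to missing records) can be illustrated with a simple automated pass over abstracted records. This is a minimal sketch only: the field names, the plausible age range, and the function `find_problems` are hypothetical and would need to match the team's own data-collection form.

```python
def find_problems(records, required=("case_id", "onset_date", "age")):
    """Return (record index, message) pairs for missing or implausible values."""
    problems = []
    for i, rec in enumerate(records):
        # Flag required fields that are absent, None, or empty strings.
        for name in required:
            if rec.get(name) in (None, ""):
                problems.append((i, f"missing {name}"))
        # Flag ages outside an assumed plausible range (0 to 120 years).
        age = rec.get("age")
        if isinstance(age, (int, float)) and not 0 <= age <= 120:
            problems.append((i, "implausible age"))
    return problems


if __name__ == "__main__":
    # Illustrative records only.
    sample = [
        {"case_id": "c1", "onset_date": "2024-01-01", "age": 34},
        {"case_id": "c2", "onset_date": "", "age": 150},
    ]
    for index, message in find_problems(sample):
        print(f"record {index}: {message}")
```

Running such a check at the end of each day in the field lets the team return to the source records while they are still accessible.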
Occasionally, after some time in the field, an investigation does not yield
definite results or it identifies additional questions that require one or more inves-
tigations to address. At that point, the team leader, in consultation with the home
base senior staff, should assess the team’s morale and capacity to persevere to
determine whether the team should continue its efforts or a fresh team should
prepare to come in to extend the investigation.
DEPARTURE
REPORTS
Written summaries of the investigation include both preliminary and final reports.
The preliminary report fulfills the immediate obligation to the requesting official
and agency. It should include a summary of methods used to conduct the investi-
gation, preliminary epidemiologic and laboratory findings, recommendations, a
clear delineation of tasks and activities that must be completed, and appropriate
“thank you’s.” In addition to the preliminary report, which optimally should be
delivered to the requestor on departure or from the field within one to two weeks
of completion of the investigation, the team should prepare follow-up letters to
other principals (e.g., local health officials, co-investigators) to inform them and
to reinforce long-term relations.
The final report should be written as quickly as possible, before team
members are called out to another epidemiologic field investigation. The final
report should include complete and final data. In addition to a written final report,
field investigation team members should consider other methods or forums for
communicating the investigation’s findings. Options include formal seminars—in
person, by teleconference, or by videoconference—where an oral presentation
will promote critical feedback; reports for public health bulletins intended for
public health practitioners (e.g., CDC’s Morbidity and Mortality Weekly Report);
EMERGING DEVELOPMENTS
This chapter has outlined and described operational principles for epidemiologic
field investigations. These principles represent a combination of standard man-
agement and administrative concepts adapted to the settings and needs of field
epidemiology. However, in the closing decades of the 20th century and the begin-
ning of the 21st century, major epidemiologic developments and other events, as
well as shifts in societal norms and governmental policies, have had an impact on
the operational aspects of field investigations. These emerging developments
encompass, but are not limited to, health information privacy, field investigations
spanning multiple jurisdictions, investigations of suspected bioterrorism, and use
of the incident-command system. This chapter concludes by briefly discussing these
selected emerging developments.
Privacy Concerns
Multi-jurisdiction Investigations
As global transport of foods, goods, and humans has become rapid and routine,
multi-jurisdictional disease outbreaks have also become more frequent.
Contaminated produce from a farm or deli meat from a processing plant can cause
a multi-county, multistate, or international food-borne disease outbreak. In the
United States, improved surveillance and detection of multi-jurisdictional out-
breaks have been facilitated by new tools, such as PulseNet, a national molecular
subtyping network,2 and investigations of multistate food-borne disease outbreaks
have become better coordinated across local, state, and federal agencies. The key
features of a well-coordinated investigation of a multistate food-borne disease
outbreak include rapid communication between local, state, and federal public
health officials as soon as possible after recognition of the outbreak, led by a
single central group, such as the Foodborne and Diarrheal Diseases Branch at the
CDC; participation in regular weekly teleconference calls to share proposed case
definition(s), evolving case counts, laboratory findings, hypotheses, and study
instruments; coordination of epidemiologic and laboratory investigations and sup-
port, whether provided in the field by some states or centrally by CDC; involve-
ment of food safety regulatory agencies in tracebacks or traceforwards of
implicated food(s); and frequent and consistent risk communication to the public
and media.3
For an acute public health problem involving many jurisdictions, rapid, centralized
communication and coordination of field investigations involving key
public health officials from all levels are necessary to define and control
the source of the problem efficiently. For example, in the multistate outbreak of
E. coli O157:H7 infections associated with consumption of fresh spinach in the United States
in 2006, CDC held a multistate teleconference within a few days after being
alerted by at least two states to small clusters of E. coli O157:H7 infections and
after CDC PulseNet confirmed a rare matching pulsed-field gel electrophoresis (PFGE)
pattern in patients from some of these states.4 Based on preliminary information shared
on this call, the U.S. Food and Drug Administration (FDA) immediately advised consumers
not to eat bagged fresh spinach, effectively intervening in the ongoing nationwide outbreak.
Days later, parallel laboratory investigations in several states detected the
outbreak strain of E. coli O157:H7 in some bagged spinach, and weeks later, two
epidemiologic investigations, one led by a state enrolling its own infected residents
and the other led centrally by CDC enrolling patients from other states,
implicated the same food vehicle.
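The detection step described above, in which matching subtype patterns reported from more than one state signal a possible multistate outbreak, can be sketched as a simple grouping operation. This is an illustration under stated assumptions: the function `multistate_patterns`, the `(state, pattern)` input format, and the labels are hypothetical and do not represent PulseNet's actual data structures or procedures.

```python
from collections import defaultdict


def multistate_patterns(isolates, min_states=2):
    """Map each subtype pattern to the sorted states reporting it,
    keeping only patterns seen in at least `min_states` distinct states."""
    states_by_pattern = defaultdict(set)
    for state, pattern in isolates:
        states_by_pattern[pattern].add(state)
    return {
        pattern: sorted(states)
        for pattern, states in states_by_pattern.items()
        if len(states) >= min_states
    }


if __name__ == "__main__":
    # Illustrative (state, pattern) pairs only.
    reports = [
        ("State A", "pattern-1"),
        ("State B", "pattern-1"),
        ("State A", "pattern-2"),
        ("State A", "pattern-1"),
    ]
    print(multistate_patterns(reports))
```

A flagged pattern is only a lead. As the spinach example shows, epidemiologic and laboratory investigation is still needed to implicate a vehicle.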
Suspected Bioterrorism
several factors, including the extent to which information and evidence suggest
that the site is a “crime scene,” as well as the extent to which findings indicate that
the site has public health intervention implications in terms of preventing further
exposures and identifying and managing persons who may have been exposed
prior to the initiation of the investigation.
In addition to the issue of determining who is “in charge” of the site, exam-
ples of related and key operational considerations are the approaches to collecting
samples from the site of the concurrent investigation and to interviewing persons
who may be affected—either with active cases of disease or who may have been
exposed—and/or who may be targeted by the criminal investigation as potential
suspects. From a public health perspective, the routine epidemiologic investiga-
tion may necessitate the collection of biologic specimens from persons (e.g.,
blood and sputum samples) and environmental samples (e.g., swabs). However,
during a concurrent investigation of suspected bioterrorism, when such specimens
and samples are being obtained in the setting of a possible crime scene, epidemi-
ology field team members must recognize that physical information-gathering
steps taken by law enforcement officials as part of an ongoing criminal investiga-
tion must adhere strictly to the process of establishing a “chain of custody of
evidence.” A key purpose for this process is to ensure that specimens and samples
presented as evidence during a criminal prosecution and trial in court can with-
stand challenge by the defense. The standards required for establishing a chain of
custody of evidence for any given sample are rigorous in relation to documenting
its precise source, who obtained it, who handled and maintained it, and other
factors.
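The documentation elements named above (a sample's precise source, who obtained it, and who handled and maintained it) suggest a simple record structure. The sketch below is illustrative only: the class names, fields, and the chronological-order check are assumptions, and it does not represent any legal standard or any agency's actual chain-of-custody procedure.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class CustodyEvent:
    """One hand-off or action in a sample's history (hypothetical fields)."""
    handler: str
    action: str  # e.g., "collected", "transferred", "received"
    timestamp: datetime


@dataclass
class SampleRecord:
    """A sample, its source, and an ordered list of custody events."""
    sample_id: str
    source: str
    events: list[CustodyEvent] = field(default_factory=list)

    def log(self, handler: str, action: str, timestamp: datetime) -> None:
        # Refuse entries that would break chronological order.
        if self.events and timestamp < self.events[-1].timestamp:
            raise ValueError("custody events must be logged in chronological order")
        self.events.append(CustodyEvent(handler, action, timestamp))

    def handlers(self) -> list[str]:
        """Everyone who has handled the sample, in order."""
        return [event.handler for event in self.events]


if __name__ == "__main__":
    record = SampleRecord("S-001", "environmental swab, hypothetical site")
    record.log("Investigator A", "collected", datetime(2024, 1, 1, 9, 0))
    record.log("Laboratory B", "received", datetime(2024, 1, 1, 12, 0))
    print(record.handlers())
```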
Conducting interviews jointly by public health and law enforcement officials
is another example of emerging operational developments that pose new chal-
lenges for the epidemiology field team. Under the circumstances of an investiga-
tion of suspected bioterrorism or some other problem in which criminal activity
may have contributed to the public health problem, epidemiologists and criminal
investigators must adhere to procedures that both serve the interests of public
health and safety, and respect laws that safeguard the rights of individuals, includ-
ing persons who are or may become suspects in a criminal investigation. These
requirements have prompted some jurisdictions to develop agreements between
public health and law enforcement agencies that provide a basis for crafting
protocols for joint interviews conducted by representatives of both sectors.6
REFERENCES
1. Hodge, J.G., Hoffman, R.E., Tress, D.W., et al. (2007). Identifiable health information
and the public’s health: practice, research, and policy. In: R.A. Goodman,
R.E. Hoffman, et al. (eds.), Law in Public Health Practice, 2nd ed., Oxford University
Press, New York. pp. 238–61.
2. Swaminathan, B., Barrett, T.J., Hunter, S.B., et al. (2001). CDC PulseNet Task Force.
PulseNet: the molecular subtyping network for foodborne bacterial disease surveillance,
United States. Emerg Infect Dis 7(3), 382–9.
3. Sobel, J., Griffin, P.M., Slutsker, L., et al. (2002). Investigation of multistate foodborne
disease outbreaks. Public Health Rep 117, 8–19.
4. CDC (2006). Ongoing multistate outbreak of Escherichia coli serotype O157:H7 infections
associated with consumption of fresh spinach—United States, September. Morb
Mortal Wkly Rep 55, 1045–6.
5. Goodman, R.A., Munson, J.W., Dammers, K., et al. (2003). Forensic epidemiology: law
at the intersection of public health and criminal investigations. J Law Med Ethics 31,
684–700.
6. CDC (2004). NYC protocol: http://www2a.cdc.gov/phlp/docs/BTProtocolCover.PDF;
http://www2a.cdc.gov/phlp/docs/Investigations.PDF.
7. Perkins, B.A., Popovic, T., Yeskey, K. (2002). Public health in the time of bioterrorism.
Emerg Infect Dis 8, 1015–8.
8. Rendin, R.W., Welch, N.M., Kaplowitz, L.G. (2005). Leveraging bioterrorism prepar-
edness for non-bioterrorism events: a public health example. Biosecur Bioterror 3,
309–15.
5
CONDUCTING A FIELD INVESTIGATION
Michael B. Gregg
BACKGROUND CONSIDERATIONS
and/or laboratory findings are unclear, the task becomes much more difficult.
It requires more careful consideration of the clinical presentation of disease in an
effort to determine the source, mode of spread, and population(s) at risk. For
example, bacterial contamination of food or water is usually manifested by signs
and symptoms referable to the gastrointestinal tract. Pathogenic agents transmit-
ted in air often affect the respiratory tract and sometimes the skin, eyes, or mucous
membranes. Skin abrasions or lesions may suggest animal or insect transmission.
So the clinical manifestations of disease may serve as critical leads.
Regardless of how secure the clinical diagnosis may be, your thought process
must include clinical, laboratory, and epidemiologic evidence. Together these
provide leads and pathways to take or reject to discover the natural history of
the epidemic.
Although you will perform several separate operations, in broad strokes you
will really do two things. First, you will collect information that describes the
setting of the outbreak; namely, when people became sick, where they acquired
disease, and what the characteristics of the ill people were. These are the descrip-
tive aspects of the investigation. Often, simply by knowing these facts (and the
diagnosis), you can determine the source and mode of spread of the agent and
can identify those primarily at risk of developing disease. Common sense will
often give you these answers, and relatively little, if any, further analysis is
required.
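The "when" of the descriptive step is commonly summarized by counting cases by date of onset. The following is a minimal sketch of such a tabulation; the function names and the onset dates are hypothetical, and real investigations would choose the time interval to suit the suspected incubation period.

```python
from collections import Counter
from datetime import date, timedelta


def epidemic_curve(onset_dates: list[date]) -> list[tuple[date, int]]:
    """Count cases per calendar day from first to last onset,
    including days on which no case began."""
    if not onset_dates:
        return []
    counts = Counter(onset_dates)
    day, last = min(onset_dates), max(onset_dates)
    curve = []
    while day <= last:
        curve.append((day, counts[day]))  # Counter returns 0 for absent days
        day += timedelta(days=1)
    return curve


def render(curve: list[tuple[date, int]]) -> str:
    """Draw the curve as a text histogram, one row per day."""
    return "\n".join(f"{d.isoformat()} {'#' * n}" for d, n in curve)


if __name__ == "__main__":
    # Illustrative onset dates only.
    onsets = [date(2024, 1, 1), date(2024, 1, 1), date(2024, 1, 3)]
    print(render(epidemic_curve(onsets)))
```

Including the zero-count days matters: gaps and clustering in the curve are part of what suggests the source and mode of spread.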
On occasion, however, it will not be readily apparent where the agent resided,
how it was transmitted, who was at risk of disease, and what the exposure was.
Under these circumstances, you will have to use a second operation, analytic
epidemiology, to provide the answers. The critical operations here are determining
rates and comparing them. Virtually all epidemiologic analyses
require comparisons, usually between groups of persons: ill and well, or exposed and not
exposed (see Chapter 8). In epidemic situations you will usually compare ill and
well people—both believed at risk of disease—to determine what exposures ill
people had that well people did not have. These comparisons are made by using
appropriate statistical techniques (see Chapters 8 and 10). If the differences
between ill and well persons are greater than one would expect by chance, you can
draw certain inferences about why the epidemic occurred. In some situations,
comparisons can be made between exposed persons and those not exposed to see
if there are significant differences in rates of illness between the two groups.
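The comparison of illness rates between exposed and unexposed persons can be made concrete with a small calculation. The sketch below computes attack rates, their ratio, and a Pearson chi-square test for a 2×2 table. It is one possible choice of technique rather than the chapter's prescribed method; the function names and counts are hypothetical. With small cell counts the chi-square approximation is unreliable, and an exact test would be more appropriate.

```python
from math import erfc, sqrt


def attack_rate(ill: int, total: int) -> float:
    """Proportion of a group that became ill."""
    return ill / total


def risk_ratio(ill_exp: int, n_exp: int, ill_unexp: int, n_unexp: int) -> float:
    """Attack rate in the exposed divided by attack rate in the unexposed."""
    return attack_rate(ill_exp, n_exp) / attack_rate(ill_unexp, n_unexp)


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) and p-value for the table
    [[a, b], [c, d]]: rows are exposed/unexposed, columns are ill/well."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return chi2, erfc(sqrt(chi2 / 2))


if __name__ == "__main__":
    # Illustrative counts: 30 of 40 exposed ill, 10 of 60 unexposed ill.
    rr = risk_ratio(30, 40, 10, 60)
    chi2, p = chi_square_2x2(30, 10, 10, 50)
    print(f"risk ratio = {rr:.2f}, chi-square = {chi2:.2f}, p = {p:.2g}")
```

A difference larger than chance would explain supports an inference about the exposure. It does not by itself establish the cause of the epidemic.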
An underlying theme throughout this chapter is the need to act quickly, establish
clear operational priorities, and perform the investigation responsibly. This should
not imply haphazard collection and inappropriate analysis of data, but rather the
use of simple and workable case definitions, case-finding methods, and analyses
(see Chapters 4 and 11).
Data collection, analyses, and recommendations should be performed in the
field. There is a strong tendency to collect what you think is the essential informa-
tion in the field and then retreat to “home base” for analysis—particularly now with
the availability of personal computers. Avoid this reflex at all costs. Such action
will probably be viewed as lack of interest or concern or even possessiveness by the
local constituency. A premature departure also makes any further collection of data
or direct contact with study populations and local health officials difficult, if not
impossible. Once home, you lose the urgency and momentum to perform, the sense
of relevancy of the epidemic, and, most of all, the totally committed time for the
investigation. Every field investigation should be completed, not only to the field
team’s satisfaction, but particularly to that of the local health department as well.
THE INVESTIGATION
Introduction
Ten basic tasks will be described in a logical order (Table 5–1). However, you may
perform several of these functions simultaneously or in different order during the
investigation. Control and prevention measures may even be recommended soon
after beginning the investigation simply on the basis of intuitive reasoning or
common sense. Sometimes the local officials know why the epidemic occurred,
and you are there simply to supply a scientific basis for their conclusion.
No two epidemiologists will take the exact same pathway of investigation.
Yet, in general, the data they collect, the analyses they apply, and the control and
prevention measures they recommend are likely to be similar.
Since, by definition, our example epidemic has resulted from a point source
and may be nearly over before the field team arrives, the investigation will in all
likelihood be retrospective in nature. This should alert you to some fundamental
aspects of any investigation that occurs “after the fact.” First of all, because many
illnesses and critical events have already occurred, virtually all information
acquired and related to the epidemic will be based upon memory. Health officers,
physicians, and patients will probably have different recollections, views, or
perceptions of what transpired. Information may conflict, may not be accurate,
and certainly cannot be expected to reflect the precise occurrence of past events.
Like the clinician, you may have to ask patients what they think made them sick,
what they think caused the epidemic. Most critically, in parallel with medical
practice, action may have to be taken without the benefit of all the desired data
(see Chapter 11).
For the young, inexperienced medical epidemiologist steeped in the tradition
of molecule and millimole determinations, the “more-or-less” measurements of
the field epidemiologist can initially be major hurdles to a successful field inves-
tigation. However lacking in accuracy these data may be, they are often the only
data you have; and they must be collected, analyzed, and interpreted with care,
imagination, and caution. Furthermore, you have not seen the epidemiologic
method work in real life. Unlike clinical medicine, where in a matter of minutes to
a few hours the physical examination usually reinforces the history, and the labo-
ratory results usually reinforce both, there often is no immediate reinforcement of
the thought processes and activities in the field. It usually takes several days or a
week before data start coming in that begin to reassure you that you are on the
right track.
Local health officials usually will know if more disease is occurring than would
normally be expected. Since most local health departments have ongoing records
of communicable diseases and certain noninfectious conditions, by comparing
weekly, monthly, or yearly data, you can easily determine if the observed numbers
exceed the expected level. Although there may not be laboratory confirmation at
this time, an increase in reported cases by local physicians is enough evidence
to investigate. However, at this time avoid the use of the terms “epidemic” or
“outbreak.” These words are quite subjective. Local health officials take different
views of the normal rise and fall in cases, and whether changes in the pattern merit
investigation.
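
The comparison of observed with expected numbers can be sketched as follows. This is only one simple way to do it: the rule used here (baseline mean plus two standard deviations) is an assumption made for illustration and is not a method prescribed by this chapter, and the counts are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def exceeds_expected(observed: int, baseline: Sequence[int],
                     n_sd: float = 2.0) -> bool:
    """True when observed is above the baseline mean + n_sd standard deviations.

    The mean + 2 SD rule is an illustrative assumption, not a standard
    taken from the text.
    """
    if len(baseline) < 2:
        raise ValueError("need at least two baseline counts")
    threshold = mean(baseline) + n_sd * stdev(baseline)
    return observed > threshold


# Hypothetical counts for the same reporting week in five earlier years.
baseline_counts = [4, 6, 5, 3, 7]

high_week = exceeds_expected(15, baseline_counts)
usual_week = exceeds_expected(6, baseline_counts)
print(high_week, usual_week)
```

A count above the threshold is a reason to look further, not proof of an epidemic, since reporting changes can produce the same signal.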
You must be aware of artifactual causes of increases or decreases in reported
cases such as changes in local reporting practices, increased interest in certain
diseases because of local or national awareness, a new physician or clinic in town,
or changes in diagnostic methods. An excellent example of artifactual reporting
occurred in southwest Florida in 1977 when a new physician in the community
reported many cases of encephalitis in his practice. After extensive field work by
local, state, and federal epidemiologists, it was clear there was no epidemic, but
simply misdiagnoses by the physician.1
Sometimes, however, it may be difficult to document the existence of an
epidemic rapidly. You may need to acquire absentee records from schools or fac-
tories, records of outpatient clinic visits or hospitalizations, laboratory records, or
death certificates. A simple telephone survey of practicing physicians may strongly
support the existence of an epidemic, as may a similar rapid survey of households
in the community. In such quick assessments, you could ask about signs and
symptoms rather than about specific diagnoses. Ask physicians or clinics if they
are treating more people than usual with sore throats, gastroenteritis, or fever with
rash, as examples, in order to obtain an index of disease incidence. Although not
specific for any given disease, such surveys can often establish the existence of an
epidemic. Sometimes it is extremely difficult to determine if there is an epidemic.
Yet because of local pressures, the team may have to continue the investigation
even if they believe no significant health problem exists.
Now try to create a workable case definition, decide how to find cases, and count
them. The simplest and most objective criteria for a case definition are usually the best
(e.g., fever, X-ray evidence of pneumonia, white blood cells in the spinal fluid, number
of bowel movements per day, blood in the stool, or skin rash). However, be guided by
the accepted, usual presentation of the disease with or without standard laboratory
confirmation in the case definition. Where time may be a critical factor in a rapidly
unfolding field investigation, use a simple, easily applicable definition—recognizing
that some cases will be missed and some non-cases included. For example, in an
epidemic of hepatitis A, a history of jaundice, fever, and an abnormal liver enzyme
test should be quite adequate to start with. Later you can refine the definition.
Some factors that can help determine the levels of sensitivity and specificity
of the case definition are the following:
No matter what criteria are used, you must apply the case definition equally
and without bias to all persons under investigation.
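
One way to apply a case definition equally to everyone is to write it down once as a single rule and run every record through it. The sketch below uses the hepatitis A starting definition mentioned above. It assumes that all three criteria must be present, which the text does not state explicitly, and the record fields and identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PersonRecord:
    person_id: str
    jaundice: bool
    fever: bool
    abnormal_liver_enzymes: bool


def meets_case_definition(person: PersonRecord) -> bool:
    """Apply the same rule to every person under investigation.

    Assumption: all three criteria are required. A looser rule would
    count more persons as cases and a stricter one fewer.
    """
    return (person.jaundice
            and person.fever
            and person.abnormal_liver_enzymes)


# Hypothetical records, for illustration only.
people = [
    PersonRecord("P01", jaundice=True, fever=True, abnormal_liver_enzymes=True),
    PersonRecord("P02", jaundice=False, fever=True, abnormal_liver_enzymes=True),
    PersonRecord("P03", jaundice=True, fever=True, abnormal_liver_enzymes=True),
]

cases = [p.person_id for p in people if meets_case_definition(p)]
print(cases)
```

Refining the definition later then means changing one function and reclassifying everyone, rather than re-judging records one at a time.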
Methods for case finding will vary considerably according to the disease in
question and the community setting. Most outbreaks involve certain clearly iden-
tifiable groups at risk; therefore, finding cases will be relatively self-evident and
easy. Directly contacting selected physicians, hospitals, laboratories,
schools, or industries, or making some form of public announcement, will find
most of the remaining unreported cases. However, sometimes more intensive
efforts—such as physician, telephone, door-to-door, or culture or serologic
surveys—may be necessary to find cases. Regardless of the method, you must
establish some system(s) of case finding during the investigation and perhaps
afterwards (see Chapter 3).
Simply knowing the number of cases does not provide adequate informa-
tion. Control and prevention measures depend upon knowing the source and
mode of spread of an agent as well as the characteristics of ill patients. Therefore,
case finding should include collecting pertinent information likely to provide
clues or leads to the natural history of the epidemic and, particularly, relevant
characteristics of the ill. First, collect basic information about each patient’s
age, gender, residence, occupation, and date of onset, for example, to define the
basic descriptive aspects of the epidemic. Next, get pertinent signs, symptoms,
and laboratory data. If the disease under investigation is usually water- or food-
borne, ask questions about exposure to various water and food sources; if trans-
mitted by person-to-person contact, ask about the frequency, duration, and
nature of personal contacts. If the nature of the disease is not known or cannot
be comfortably presumed, you will need to ask a variety of questions covering
all possible aspects of disease transmission and risk. Also be mentally prepared
for the possibility of having to apply a second questionnaire if the first analysis
does not help.
Now the team should have a reasonably accurate number of cases to view descrip-
tively. So it is time to characterize the epidemic in terms of when patients became
ill, where they lived or became ill, and what special attributes the patients had (see
Chapter 9 for greater detail). You may want to wait until the epidemic is over or
until all likely cases have been reported before performing such an analysis. Don’t.
The earlier you develop ideas of why the epidemic started, the more pertinent data
you can collect. The addition of a proportionately small number of cases later on
will usually not affect the analysis or recommendations.
Time
Characterize the cases by plotting a graph that shows the number of cases
(y axis) over the time of onset of illness using an appropriate time interval (x axis)
(Fig. 5–1). This “epidemic curve” gives a much deeper appreciation of the
magnitude of the outbreak, its possible mode of spread, and the possible duration
of the epidemic—much more than would a simple “line-listing” of cases. One can
often infer a remarkable amount of information from a simple picture of times of
onset of disease. If the incubation period of the disease is known, relatively firm
inferences can be made regarding the likelihood of a point source exposure,
person-to-person spread, or a mixture of the two. And just the opposite: if you
know when the exposure occurred, you may be able to determine the incubation
period. This is particularly important if you do not know what the disease is.
Also, if the epidemic is still in progress and you have a good idea of the disease,
88 The Field Investigation
you may be able to predict how many more cases are likely to occur. Finally, an
epidemic curve provides an excellent “prop” for ready communication to nonepi-
demiologists, administrators, and the like who need to grasp in some fashion the
nature and magnitude of the epidemic.
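
A rough epidemic curve can be produced in the field with very little tooling. The sketch below counts cases by day of onset and prints a text histogram, with time running down the page rather than along the x axis. The one-day interval and the onset dates are hypothetical; the appropriate interval depends on the disease.

```python
from collections import Counter
from datetime import date, timedelta


def epidemic_curve(onsets):
    """Return (day, case count) pairs for every day from the first to the
    last onset, including days on which no case began."""
    if not onsets:
        return []
    counts = Counter(onsets)
    day, last = min(onsets), max(onsets)
    curve = []
    while day <= last:
        curve.append((day, counts[day]))  # Counter gives 0 for missing days
        day += timedelta(days=1)
    return curve


# Hypothetical dates of onset, for illustration only.
onset_dates = [
    date(2024, 7, 1),
    date(2024, 7, 2), date(2024, 7, 2), date(2024, 7, 2),
    date(2024, 7, 3), date(2024, 7, 3),
    date(2024, 7, 5),
]

curve = epidemic_curve(onset_dates)
for day, n in curve:
    print(f"{day.isoformat()} {'#' * n} ({n})")
```

Keeping the zero-case days matters: gaps in the curve are part of what suggests point-source exposure versus continuing spread.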
The epidemic curve in Figure 5–1 shows cases of Pontiac fever (subsequently
confirmed as Legionnaires’ disease) that occurred in Pontiac, Michigan, July and
August 1968, by day of onset.2 The epidemic was explosive in onset, suggesting
(1) a virtually simultaneous point-source exposure of many persons; (2) a disease
with a short incubation period (because of the very tight clustering of cases over a
very narrow time frame); and (3) a continuing exposure spanning several weeks,
all of which were subsequently confirmed.
Place
Sometimes diseases occur or are acquired in unique locations in the com-
munity, which, if you can visualize, may provide major clues or evidence regard-
ing the source of the agent and/or the nature of exposure. Water supplies, milk
distribution routes, sewage disposal outflows, prevailing wind currents, air-flow
patterns in buildings, and ecological habitats of vectors may play important roles
in disseminating microbial or environmental pathogens and determining who is at
risk of acquiring disease. If one plots cases geographically, a distribution pattern
may appear that approximates these known sources and routes of potential expo-
sure. This, in turn, may help identify the vehicle or mode of transmission.
Figure 5–2. Culture-positive cases of Shigella, by sites along the Mississippi River
where each case swam within three days of onset of illness. [Source: Rosenberg
et al., 1976.3]
Person
Lastly, you must examine the characteristics of the patients themselves in
terms of a variety of attributes, such as age, gender, race, occupation, or virtually
any other characteristic that may be useful in portraying the uniqueness of the
case population. If a singular or special attribute emerges, this frequently suggests
a strong lead as to the group at risk and even an idea of the specific exposure.
Some diseases primarily affect certain age groups or races; frequently, occupation
is the key attribute of people with certain diseases. The list of human characteris-
tics—really potential risks and exposures—is nearly endless. However, the more
you know about the disease in question (the agent’s reservoir, mode(s) of spread,
persons usually at greatest risk), the more specific and pertinent information you
should seek to determine whether any of these risks or exposures predispose to
illness.
You now know the number of ill people, when and where they were when they
became ill, what their general characteristics are, and, usually, have a firm diagno-
sis or a good “working” diagnosis. These data frequently provide enough infor-
mation to determine with reasonable assurance how and why the epidemic started.
For example, a time, place, and person description of the epidemic may strongly
suggest that only people in a particular community supplied by a specific water
system were at risk of getting sick, or that only certain students in a school or
workers in a single factory became ill. Perhaps it was only a group of people who
attended a local restaurant who reported illness. However, no matter how obvious
it might appear that only a single group of persons was at risk, one should look
carefully at the entire community to be sure there are not other affected persons.
Sometimes it is very difficult to know who is at risk, particularly in epidem-
ics that cover large geographic areas and involve many age groups with initially
no obvious unique characteristics. Under these circumstances the team may have
to do a survey of some kind to get more specific information about the ill persons
and some idea of who is at risk.
(because of the signs and symptoms) and because no other cluster of similar dis-
ease had occurred elsewhere in the community, the epidemiologists focused atten-
tion only on those who bought food from the pizzeria. The logical hypothesis then
was that the exposure necessary to develop nausea, vomiting, and diarrhea was
consumption of some food(s) contaminated with a microbial or chemical agent.
Therefore, those who bought and ate food from the pizzeria on the presumed day
of exposure were given a questionnaire asking what beverages and kinds of foods
and pizza they had eaten; that is, what foods had they been exposed to. Early
analysis showed that 100% of the ill persons (cases) had eaten mushrooms on
pizza. Because so many ill people had eaten these pizzas, one might quickly
assume it was the contaminated food. Yet the 100% simply represents how popu-
lar the mushroom pizza was among the ill attendees. Alone, the 100% does not
give adequate valid support to the hypothesis that exposure to the pizza (i.e.,
eating the mushroom-topped pizza) caused illness. What had to be done was to
determine the food histories of the well pizza eaters (controls) and compare their
histories to the ill persons. When this comparison was done, the food histories
were very similar between the two groups except for one food, the mushroom
pizzas: only 33% of the well attendees ate the mushroom-topped pizza. The
hypothesis, then, was that the difference in exposure rates—100% among the ill
and 33% among the well—was because the mushroom pizza was contaminated.
Statistical testing of these rates showed that, assuming that eating the
particular pizza had no relation to getting ill, such a difference would occur less
than one time in 10,000 such instances. Therefore, the statistical evidence as well
as other information (isolation of S. aureus from cans of mushrooms) supported
the hypothesis that eating the mushroom pizza was the exposure that caused the
outbreak.
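
The comparison of exposure rates among ill and well persons can be illustrated with a small calculation. The text gives only the percentages (100% and 33%) and does not say which statistical test was used, so the sketch below makes two assumptions: it uses a one-sided Fisher exact test, and it uses hypothetical counts chosen to match those percentages (20 ill persons, all exposed; 30 well persons, 10 exposed).

```python
from math import comb


def fisher_exact_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for a 2x2 table.

    a = ill and exposed      b = ill and not exposed
    c = well and exposed     d = well and not exposed

    Returns the probability of seeing at least `a` exposed persons among
    the ill if exposure had no relation to illness (all margins fixed).
    """
    n_ill = a + b
    n_exposed = a + c
    n_total = a + b + c + d
    denominator = comb(n_total, n_ill)
    p_value = 0.0
    for k in range(a, min(n_ill, n_exposed) + 1):
        p_value += (comb(n_exposed, k)
                    * comb(n_total - n_exposed, n_ill - k)
                    / denominator)
    return p_value


# Hypothetical counts matching 100% exposed among the ill
# and 33% exposed among the well.
p = fisher_exact_one_sided(a=20, b=0, c=10, d=20)
print(f"one-sided p-value: {p:.2e}")
```

With these invented counts the p-value is far below 1 in 10,000, which is in line with the text's description, but the actual figure in the investigation depended on the real counts.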
Again, this phase of the investigation clearly will pose the greatest challenge.
Field epidemiologists must review the findings carefully; weigh the clinical, labo-
ratory, and epidemiologic features of the disease; and hypothesize possible expo-
sures that could plausibly cause disease. In other words, you must seek from the
patients’ histories exposures that could conceivably predispose them to illness. If
exposure histories for ill and well are not significantly different, a new hypothesis
must be developed. This will require imagination, perseverance, and sometimes
resurveying those at risk to obtain more pertinent information.
spread, and population affected fit well with the known facts of the disease? For
example, if, in the gastroenteritis outbreak referred to above, the analysis incrimi-
nated a food of high protein and low acid content that supports growth of staphy-
lococcal organisms and production of enterotoxin (as is the case with mushrooms),
the hypothesis fits well with our understanding of staphylococcal food poisoning.
However, if the analysis incriminated coffee or water—highly unlikely sources of
staphylococcal enterotoxin—you must then reassess the findings, perhaps secure
more information, reconsider the clinical diagnosis, and certainly pose and test
new hypotheses. Unfortunately, on rare occasions this reassessment is necessary,
and you should be prepared.
The following investigations illustrate the uses of simple descriptive and ana-
lytic epidemiology, how some analyses may not prove helpful, how posing new
hypotheses may be necessary, how the facts must fit logically, and how important
persistence is in arriving at a defensible conclusion.
Thirty-four cases of perinatal listeriosis and seven cases of adult disease
occurred between March 1 and September 1, 1981, in several maritime provinces
of Canada.5 These cases represented a manifold increase over the number of cases
diagnosed in previous years, suggesting some common exposure. Although
L. monocytogenes is a common cause of abortion and nervous system disease in
cattle, sheep, and goats, the source of human infection has been obscure. Cases
could not be linked together by person-to-person contact; they shared no common
water source; and food exposures, as determined from a general food history,
were not different between cases and controls. However, a second, more detailed
food history and subsequent intensive interrogation of cases and controls revealed
that there was a statistically significant difference between cases and controls
regarding exposure to coleslaw. Even though this food had never been previously
incriminated as a source of listeria, it was the only food item positively associated
with disease and essentially the only lead the investigators had at the time. Armed
with this clue, the team subsequently found a specimen of coleslaw in the refrig-
erator of one of the patients that grew out the same serotype of listeria isolated
from the epidemic cases. No other food items in the refrigerator were positive for
listeria.
The coleslaw had been prepared by a regional manufacturer who had obtained
cabbages and carrots from several wholesale dealers and many local farmers.
Although environmental cultures from the coleslaw plant failed to reveal listeria
organisms, two unopened packages of coleslaw from the plant subsequently grew
L. monocytogenes of the same epidemic serotype. A review of the sources of the
vegetable ingredients was made, and a single farmer was identified who had grown
cabbages and also maintained a flock of sheep. Two of his sheep had previously
died of listeriosis in 1979 and 1981. Also, he was in the habit of using sheep
manure to fertilize his cabbage.
This information does not prove this farm was the source of the listeria organ-
isms that caused the epidemic. However, the hypothesis that coleslaw was the
source and the statistical test that supported this hypothesis provided the neces-
sary impetus to continue the investigation. And, ultimately, a single, highly likely
source of the bacteria was discovered. These findings strongly suggest listeriosis
is a zoonotic infection transmitted from infected animals via contaminated
vegetables to humans.
In January and February 1980, an epidemic of 85 cases of salmonellosis in
Ohio prompted an extensive field investigation by Taylor et al.6 All cases were
caused by an uncommon serotype of salmonella, S. muenchen. This finding plus
the fact that all cases were among teenagers and young adults strongly suggested
a common source of exposure. Knowing that the natural reservoirs for almost all
serotypes are poultry, chicken eggs, and other domestic farm animals and that the
majority of salmonella epidemics can be traced to eating meat or poultry products
or having contact with these animals, Taylor and colleagues questioned the cases
and appropriate controls. Their questions included food histories and contact with
farm animals. Not too surprisingly, the investigators found that significantly more
cases than controls gave a history of eating ham. On the surface this evidence
strongly incriminated ham as the vehicle of infection. However, in trying to define
the source of the contaminated ham, Taylor learned that the ham eaten by the
patients came from five different distributors. How likely is it that one uncommon
serotype would come from five different distributors who, in turn, secured their
ham from different producers? The logic was overwhelming: despite a reasonable
food source of the salmonella and persuasive statistics, the ham was not the source
of the salmonella, and more questioning had to be done.
At this same time, another, identical, epidemic of salmonella was reported in
Michigan. Having more cases to work with and focusing on possible unique char-
acteristics of the teenage/young adult population, the team asked many more
questions of cases and controls, including questions about the use of drugs. To
their great surprise, they found a highly significant association between illness
and smoking marijuana. Although this association seemed as implausible as that
with ham, samples of marijuana smoked by the cases were culture positive for
S. muenchen, strongly incriminating the marijuana as the vehicle of infection.
The actual field investigation and analyses have now been completed, requiring
only a written report (see below). However, because there may be a need to find
more cases, to define better the extent of the epidemic, or to evaluate a new labora-
tory method or case-finding technique, you may want to perform more detailed
and carefully executed studies. With the pressure of the investigation somewhat
A Record of Performance
In this day of input and output measurements, program planning, program
justifications, and performance evaluations, there is often no better record of
accomplishment than a well-written report of a completed field investigation. The
number of investigations performed and the time and resources expended not only
document the magnitude of health problems, changes in disease trends, and the
results of control and prevention efforts, but also serve as concrete evidence of
program justification and needs.
SUMMARY
REFERENCES
Joan M. Herold
1. Write a protocol.
2. Select a survey mode.
3. Develop a questionnaire.
4. Design and select the sample.
5. Train interviewers (or prepare for mail-out).
6. Collect data (fieldwork).
7. Enter data into a computer, edit, and process the data.
8. Analyze the data.
9. Write a survey report.
Writing a Protocol
Perhaps more than any other type of data collection, a survey cannot be under-
taken without a protocol or detailed plan. The various steps in survey design are
all interconnected and, therefore, must be thought out carefully prior to beginning
any steps of the survey itself. For example, the budget influences the choice of the
survey mode, sample design, sample size, and length of questionnaire. The analy-
sis plan affects the format of questions, sample design, and sample size. The
sample size influences the mode, analysis, and interpretation of results. The mode
dictates the length and format of the questionnaire. A protocol allows all these
interrelationships to be considered—and they must be considered—before actu-
ally choosing a mode, designing a questionnaire, or selecting a sample. The pro-
tocol should include: Study Objectives; Methodology, including a list of
information to be sought from the survey, survey design, sampling plan, and data
editing plan; Analysis Plan, including the computer software necessary to analyze
the data; Logistics for implementing the survey, including personnel and equip-
ment needed; Budget; and Time Line. Once all these steps have been thoroughly
thought through, you are ready to begin work on the survey.
Surveys can be classified by their mode of data collection. There are mail and
Internet surveys, telephone surveys, and face-to-face interview surveys. Prior to
developing a questionnaire and choosing a sample, the survey mode must be
decided. Each of these modes has its advantages and disadvantages.
Mail, or self-administered, surveys are seldom used to collect information
from the general public because names and addresses usually are not available,
and the response rate tends to be low. However, the method may be highly effec-
tive with members of particular groups, such as members of an HMO or members
Surveys and Sampling 99
to respond to the initial method. The most popular of the mixed mode surveys is
the combined mail and telephone mode. Caution is advised, however, in the use of
mixed mode surveys. Questionnaire construction is usually different for different
survey modes, and this difference creates a measurement problem with mixed
mode approaches. Moreover, potential biases vary depending on the mode, and
this can cause problems in interpreting the results. Therefore, it is highly recom-
mended that you seek the assistance of a survey expert if you choose to use a
mixed mode approach.
Developing a Questionnaire
Writing Questions
The first step in developing a questionnaire is to define the research question
and to list the information that you wish to obtain. A list of information or varia-
bles that you want will help you create the questions that will give that informa-
tion, as well as help you avoid writing unnecessary questions. You will also need
to know the type of analysis you plan to use and the mode of data collection. Your
analysis plan will determine the form of responses to questions. For example, if
you plan to work with proportions, you will want to collect categorical responses.
On the other hand, if you can work with means, you may collect scaled responses.
It is also possible that you want to do a content analysis on qualitative data, in
which case you may seek open-ended responses (see below). The mode of data
collection (mail/Internet, telephone, or face-to-face interview) will affect how the
question is written, what response categories are shown, the length of the ques-
tionnaire, and the overall format of the questionnaire. Here are some general rules
on question writing and questionnaire format.
The question format can be of three types:
Questionnaire Format
The order of questions in the questionnaire varies depending on the mode of
data collection. A questionnaire administered by an interviewer should have some
easy and non-threatening questions to start. Often demographic questions are
asked first to give the respondent an opportunity to get comfortable with the inter-
viewer prior to being asked more difficult or sensitive questions. On the other
hand, a mail/Internet survey should get right to the point, with questions specific
to the principal purpose of the survey at the beginning of the questionnaire. In this
way the respondent’s attention is immediately engaged, and he or she is more
likely to complete and return the questionnaire.
removed from the questionnaire after fieldwork and data entry are completed. In
this way, the respondent cannot be linked to the answers he or she has provided.
The Pretest
A newly developed questionnaire that has not been used previously in a pop-
ulation must be pretested. A pretest is a form of pilot study to check on the validity
and reliability of the individual questions and the instrument as a whole. In a pre-
test, the final questionnaire is administered, using the mode that will be used in the
actual survey, to a small (usually purposive) sample taken from the same popula-
tion that the final sample is drawn from. It is very important, however, that the
respondents for the pretest not be included in the final sample. Thus the pretest is
usually not conducted until after the final sample has been drawn. Depending on
the length and complexity of the questionnaire, the sample size for a pretest can
vary from 25 to 100 individuals. The pretest should reveal any problem respond-
ents may have in understanding the questions and whether or not there are any
difficulties in following the questionnaire. If the pretest is of sufficient size, it may
also be used to identify questions that may be removed from the questionnaire
because of lack of variability in responses to those items. The pretest of the ques-
tionnaire also affords an opportunity to test other aspects of survey procedures,
such as ability to locate respondents, willingness of prospective respondents to
consent to the interview, the duration of the interview, and so forth.
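
Checking pretest responses for lack of variability can be sketched as below. The rule used here, flagging a question when its most common answer accounts for at least 95% of responses, is an assumed cutoff for illustration only; the text does not give one. The question identifiers and answers are hypothetical.

```python
from collections import Counter


def low_variability_items(responses, threshold=0.95):
    """Return the ids of questions whose most common answer makes up at
    least `threshold` of all answers given in the pretest.

    `responses` maps a question id to the list of answers collected.
    The 0.95 default is an illustrative assumption.
    """
    flagged = []
    for question_id, answers in responses.items():
        if not answers:
            continue  # nothing collected, so nothing to judge
        top_count = Counter(answers).most_common(1)[0][1]
        if top_count / len(answers) >= threshold:
            flagged.append(question_id)
    return flagged


# Hypothetical pretest of 25 respondents.
pretest = {
    "q1": ["yes"] * 25,
    "q2": ["yes"] * 15 + ["no"] * 10,
}

flagged = low_variability_items(pretest)
print(flagged)
```

A flagged question is only a candidate for removal; with a small pretest, an answer that looks universal may still vary in the full sample.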
Selecting a Sample
through the size and design of the sample. Sampling errors, even for small sam-
ples, are often the least of the errors present in a survey. (See below for a listing of
other sources of error.)
Nevertheless, it is important to keep in mind that, for a small, geographically
concentrated population, it may be possible to do a complete or census-type
survey. For example, an outbreak in a small village or at a church supper may be
best studied by surveying everyone in those small populations and thus completely
avoiding sampling error.
Types of Samples
Sampling error cannot be calculated for all samples. Our ability to calculate
sampling error depends on the type of sampling employed. There are two broad
types of sampling used in sample surveys: probability sampling and nonprobability
sampling. Probability sampling uses statistical theory in the design of the sample
and, consequently, permits the calculation of sampling error. Probability sampling
is the selection of a sample such that every member in the population has a known
and nonzero probability of being included. This type of sampling is unbiased and
enables you to draw valid conclusions about the population from which your
sample is drawn. Nonprobability sampling is not based on statistical theory. It is a
type of sampling that is inherently biased and does not permit the calculation of
sampling error. Nonprobability samples include purposive samples, convenience
samples, and the like.
Purposive (or judgment) sampling is the selection of a sample based on some-
one’s judgment and knowledge of the subject matter. This type of sampling is
biased and generally used only when there is no time to define a probability sample.
An example might be the selection of community leaders in a refugee camp in such
a way as to try to get maximum representation of the various ethnic groups residing
there when the target population is all refugees in the camp. The community
leaders would then be asked questions about the members of their community.
Convenience sampling is the use of a sample that is near at hand. Such a
sample is inherently biased by the fact that it includes only persons that happen to
be out and about or taking a specific route or engaged in a specific activity at the
time of the survey. Route samples, street-corner political surveys, or a sample
based on persons coming into a clinic are convenience samples when the target
population is the resident population of a given area.
On occasions when the goal of a study is not to make statistical estimates
about a population but rather to explore ideas and opinions of people about a new
topic that may not be ready for a quantitative investigation, convenience samples
may be very useful. They can provide ideas about people’s thoughts and opinions
and can be used to generate hypotheses for further study. But any study that wishes
to produce statistics about the total population must use probability sampling.
Surveys and Sampling 105
Probability Sampling
As stated before, a probability sample is one in which every member of the
population has a known and nonzero probability of being selected into the sample.
A special case of the probability sample is the random sample, where each member
of the population has an equal chance of being selected. The vast majority of the
statistical tests we use carry with them the assumption that the sample has been
randomly selected. While the selection of a random sample may not always be
feasible, if we know the probability of selection of every member of the popula-
tion, we can make adjustments to the data (through computer programs) to account
for the differences between our probability sample and a strictly random sample.
If we do not meet the criteria of a probability sample, we cannot draw conclusions
about the population using standard significance tests and confidence intervals.
To draw a random sample you need to have a list of all members of the
population from which the sample is to be drawn. This list is called a sampling
frame. It is important that this frame be current and accurate. For example, when
drawing a sample of college students, you will want a list of current students and
not a list of those attending the college a year ago. When drawing a sample of
housing units in a town, you will not want to use a list of housing units identified
at the previous census eight years ago. We often fail to realize that our sample is
only as good as the sampling frame from which it is selected. It is, therefore, very
important to obtain an updated sampling frame that lists all elements of the popu-
lation as close to the survey date as possible. If none is available, it is up to you,
the survey implementer, to add to your budget the personnel and materials neces-
sary to produce a current sampling frame. If you have to create a sampling frame
for your sample, pay close attention to cluster sampling, described below.
randomly, one determines a selection interval (n), by dividing the total population
listed by the sample size. The sampler then chooses a random starting point on the
population list, and selects every nth person (the length of the selection interval).
Good geographic or strata distribution can be assured if the population is listed
according to geographic area or other stratifying characteristic. It is an easy
method to apply and a popular one among public health professionals.
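As an illustration only, the systematic selection just described can be sketched in a few lines of Python. The function name, the frame of 200 numbered households, and the sample size of 20 are hypothetical. The sketch assumes the interval is the whole-number part of population size divided by sample size, so when the division is not exact, the last few units on the list can never be selected.

```python
import random

def systematic_sample(frame, sample_size, rng=None):
    """Return a systematic sample: a random start, then every k-th listed unit."""
    rng = rng or random.Random()
    if not 0 < sample_size <= len(frame):
        raise ValueError("sample_size must be between 1 and len(frame)")
    interval = len(frame) // sample_size   # selection interval
    start = rng.randrange(interval)        # random start inside the first interval
    return [frame[start + i * interval] for i in range(sample_size)]

# Hypothetical frame of 200 listed households; draw 20 of them.
frame = list(range(1, 201))
sample = systematic_sample(frame, 20, random.Random(7))
```

If the frame is first sorted by geographic area, the selected units are spread across areas in proportion to their share of the list, which is the property noted above.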
In stratified sampling, the target population is divided into suitable, non-
overlapping subpopulations or strata. Each stratum should be homogeneous within
and heterogeneous between other strata. A random sample can then be selected
within each stratum. In this way, each stratum is more accurately represented, and
since members are more alike within each stratum, the overall sampling error is
reduced. Separate estimates can be obtained from each stratum, and an overall
estimate can be obtained for the entire population defined by the strata. The sample
selection for each stratum is further defined by whether it is proportionate or
disproportionate across strata. Proportionate stratified sampling uses the same
sampling fraction that is calculated for the total population (sample size divided
by the total population size) for selecting a sample from each stratum.
Disproportionate stratified sampling uses different sampling fractions across strata
in an attempt to get sufficient numbers of elements to make separate statistical
estimates for each stratum. Disproportionate sampling is the method used when
“oversampling” of a particular stratum is done. This method frequently is used to
get sufficient representation of minority ethnic groups and to enable independent
estimates of characteristics by ethnicity. While disproportionate stratified sampling
allows for sufficient sample size within each stratum to make stratum-specific esti-
mates with relatively equal precision, it requires the extra step of weighting the
data when estimates are made for the total population, or all strata combined.
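The difference between proportionate and disproportionate allocation, and the weights the latter requires, may be easier to see in a small sketch. Everything named here is an assumption made for illustration: the two strata, their sizes (900 and 100), and the function names. The weight for each stratum is taken as the stratum population divided by the number sampled from it, that is, the inverse of the sampling fraction.

```python
import random

def proportionate_sizes(strata, total_n):
    """Allocate total_n across strata using one overall sampling fraction."""
    total_pop = sum(len(members) for members in strata.values())
    fraction = total_n / total_pop
    return {name: round(len(members) * fraction) for name, members in strata.items()}

def stratified_sample(strata, sizes, rng=None):
    """Draw a simple random sample in each stratum; weight = N_h / n_h."""
    rng = rng or random.Random()
    drawn = {}
    for name, members in strata.items():
        n_h = sizes[name]
        drawn[name] = {
            "units": rng.sample(members, n_h),
            "weight": len(members) / n_h,  # each sampled unit stands for this many
        }
    return drawn

# Hypothetical population: 900 majority-group and 100 minority-group members.
strata = {"majority": list(range(900)), "minority": list(range(900, 1000))}

# Proportionate: the same 10% fraction in each stratum (90 and 10).
prop = proportionate_sizes(strata, 100)

# Disproportionate: oversample the minority stratum (50 and 50).
oversampled = stratified_sample(
    strata, {"majority": 50, "minority": 50}, random.Random(3)
)
```

In the oversampled design each majority respondent carries a weight of 18 and each minority respondent a weight of 2. Ignoring these weights when combining strata would overstate the minority group's share of the total.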
Cluster sampling is of particular value to save resources in surveys of human
populations when the population is geographically dispersed or when a sampling
frame for the elements of the population you wish to study is not available. In this
type of sampling, the units first sampled are not the individual elements we are ulti-
mately interested in, but, rather, clusters or aggregates of those elements. For exam-
ple, in sampling the population of a rural area that is widely dispersed and difficult
to reach, a sample of villages may be selected, and then all of the households in the
sampled villages may be included in the survey. It is apparent that such a sample
involves less traveling than a simple random sample of households throughout the
rural area. Another typical example is in a study of schoolchildren, when a list of all
children attending schools in an area is not available. One would then use a list of
schools, and select a sample of schools from the list. In this case, schools are the
sampling unit or clusters. Once the sample of schools has been selected, all of the
students in the sampled schools may be surveyed. This type of sampling almost
always loses some degree of precision. To maintain the same degree of precision as
in simple random sampling, one would have to approximately double the sample
size that would be calculated for a simple random sample.
Multistage sampling involves sampling at different levels of population
groupings. It is most often used in surveys where cluster sampling is necessary. In
the cases given in the previous paragraph, for example, if one were to sample the
households in the selected villages, or sample the children in the selected schools,
there would be two levels of sampling involved, and they would demonstrate a
two-stage sample design. The first stage would be the selection of a sample of vil-
lages or schools, and the second stage would be the selection of a sample of
households or students. Both of these examples, however, may be converted to
three-stage designs. If, in the example of the population of rural areas, we were
first to select a sample of villages (clusters), then select a sample of households
(clusters) in each selected village, and finally select one adult (final sampling unit)
in each of the sampled households, we would have a three-stage sample design.
With respect to the children attending school, our first stage would be the selec-
tion of schools (clusters), the second stage might be the selection of a sample of
classes or grades (clusters) in the selected schools, and the third stage would be
the selection of a sample of students (final sampling unit) in the classes. Again, we
would have a three-stage design. An obvious advantage in the school example is
that you need only a listing of schools, classes within the selected schools,
and students in the selected classes. You have bypassed the need for a list of all
children attending school!
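A two-stage version of the school example might be sketched as follows. The school names, enrollment figures, and function name are hypothetical, and the sketch assumes simple random sampling without replacement at both stages. It also records each selected student's overall probability of selection (the product of the two stage probabilities), since a known selection probability is what makes the design a probability sample.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, rng=None):
    """Select clusters at random, then units at random within each selected cluster.

    Returns a list of (cluster, unit, selection_probability) tuples.
    """
    rng = rng or random.Random()
    chosen = rng.sample(sorted(clusters), n_clusters)       # stage 1: clusters
    p_cluster = n_clusters / len(clusters)
    selected = []
    for name in chosen:
        members = clusters[name]
        take = min(n_per_cluster, len(members))
        p_within = take / len(members)
        for unit in rng.sample(members, take):              # stage 2: units
            selected.append((name, unit, p_cluster * p_within))
    return selected

# Hypothetical frame: 10 schools with 40 to 130 pupils each.
schools = {
    f"school-{s}": [f"s{s}-pupil-{i}" for i in range(40 + 10 * s)]
    for s in range(10)
}
sample = two_stage_sample(schools, 4, 15, random.Random(11))
```

Only the list of schools and the pupil lists for the four selected schools are needed. Because school sizes differ, the selection probabilities differ too, and their inverses would serve as weights in the analysis.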
Most national surveys involve multistage sampling, with geographically
defined clusters being the first stage or the primary sampling unit. This is, in part,
because of the prohibitive cost of dispersing interviewers across a broad geo-
graphic area, but more important, it is due to many countries’ not having a list of
all members of their population. The United States, for example, does not have
such a list. Instead it has a list of
census tracts. To conduct a sample survey for the U.S. population, one would have
to first select census tracts from a list provided by the Census Bureau, then update
the sampling frame for the selected tracts by mapping out the number and location
of housing units in those tracts (unless that had been done with a recent census or
survey), then select housing units from the updated frame, and finally select people
living in the selected housing units.
Sample Size
Ideally, the sample size chosen for a survey should be based on how reliable
the final estimates must be. In practice, usually a trade-off is made between the
ideal sample size and the expected cost of the survey. The size of the sample must
be sufficient to accomplish the purpose but should not be larger than necessary,
because it will draw resources from other aspects of the survey process.
108 The Field Investigation
Sample size is determined by the desired confidence level and precision of your
estimates and the variability of the characteristic being measured for the population.
The formula for calculating the sample size needed to estimate a proportion is:
$$n = \frac{z^{2}pq}{d^{2}}$$
where,
If the total population from which the sample is to be drawn is less than
10,000, then the size of the population must also be taken into account. Thus,
for a population of size 10,000 or greater, n, above, is the final sample size; for
populations less than 10,000, the following adjustment must be made:
$$n_f = \frac{n}{1 + n/N}$$
where,
The above formulas may be easily modified when the estimates of interest
are means rather than proportions. As stated, these formulas are the basic ones for
simple random sampling. However, we know that in reality we often use stratified
or multistage sampling methods. Adjustments to these formulas are necessary for
these more complex designs or for more complex analysis than estimating propor-
tions and means. Under such circumstances it is best to consult a statistician. If
one is not easily available, some rules of thumb that may come in handy are:
(1) for cluster designs, you will want to double the calculated sample size for a
simple random sample; (2) for disproportionate stratified samples, you will want
to calculate a sample size for each stratum.
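The two formulas and the cluster rule of thumb can be combined in a short sketch. The symbol meanings used here are the conventional ones and are assumed rather than taken from the text: z is the standard normal deviate for the chosen confidence level (1.96 for 95%), p is the anticipated proportion, q = 1 - p, d is the desired precision (the half-width of the confidence interval), and N is the population size. The function name and the `design_effect` parameter are hypothetical. Applying the design effect after the population adjustment is one reasonable ordering, not the only one.

```python
import math

def sample_size_proportion(p, d, z=1.96, population=None, design_effect=1.0):
    """Sample size for estimating a proportion under simple random sampling.

    p: anticipated proportion; d: desired precision; z: normal deviate.
    population: if given and below 10,000, the finite-population adjustment applies.
    design_effect: multiplier for complex designs (2.0 is the cluster rule of thumb).
    """
    q = 1.0 - p
    n = (z ** 2) * p * q / d ** 2
    if population is not None and population < 10_000:
        n = n / (1 + n / population)
    return math.ceil(n * design_effect)   # round up to whole respondents

n_large = sample_size_proportion(p=0.5, d=0.05)                     # 385
n_small = sample_size_proportion(p=0.5, d=0.05, population=2000)    # 323
n_cluster = sample_size_proportion(p=0.5, d=0.05, design_effect=2)  # 769
```

Using p = 0.5 gives the largest value of pq and therefore the most conservative sample size when no prior estimate of the proportion is available.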
When fieldwork is about to begin, the following supplies will be needed for
the interviewers:
Response Rate
A response rate needs to be calculated at the end of every survey. The response
rate is simply the number of completed questionnaires divided by the number of
people in your original sample. Often survey directors will report other rates,
using a smaller denominator, such as the number of people from the original
sample who were actually located, or the number of people who answered their
telephone, or the number of mailed questionnaires that actually reached a correct
address. These rates are not true response rates. They are simply ways of avoid-
ing having to report a response rate that may not have been as high as desired.
Note that nonresponse is not simply a function of the number of refusals obtained,
but also the number of sample points that were never located. Therefore, the
response rate reflects not only the compliance of the population contacted, but
also the skill of the survey personnel in finding the members of the selected
sample.
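A brief numerical sketch shows why the choice of denominator matters. The counts are invented for illustration: 1,000 people in the original sample, 850 located, and 680 completed questionnaires.

```python
def response_rate(completed, original_sample):
    """Completed questionnaires divided by the size of the original sample."""
    if original_sample <= 0:
        raise ValueError("original_sample must be positive")
    return completed / original_sample

# Hypothetical survey counts.
original_sample = 1000
located = 850
completed = 680

true_rate = response_rate(completed, original_sample)  # 0.68
rate_among_located = completed / located               # 0.80, not a response rate
```

The second figure looks better only because the 150 sample members who were never located have been dropped from the denominator.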
Depending on the duration of fieldwork, the collected data may be entered into
a computer while data collection is taking place or at the completion of data
collection.
Data entry is a stage of survey work when many errors can occur. In the past,
the standard way of checking for data entry error was to require double entry.
Double entry simply means the data are entered twice, by different enterers. Then
the two sets of data are compared (by computer), and if disagreement in the data
is found between the two data sets, the original questionnaire is consulted and the
incorrect code is corrected. Double data entry is both costly and time-consuming.
Fortunately, there is less need for it today because of the proliferation of data
entry/edit programs.
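The comparison step in double entry is simple to express in code. The sketch below is a generic illustration, not the procedure of any particular program. It assumes each data set is a dictionary keyed by questionnaire number, and the field names and values are hypothetical.

```python
def compare_entries(first, second):
    """List disagreements between two independently keyed data sets.

    first, second: dict mapping record id -> dict of field -> value.
    Returns (record_id, field, first_value, second_value) tuples; field is None
    when a whole record is missing from one of the data sets.
    """
    discrepancies = []
    for rec_id in sorted(set(first) | set(second)):
        a, b = first.get(rec_id), second.get(rec_id)
        if a is None or b is None:
            discrepancies.append((rec_id, None, a, b))
            continue
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                discrepancies.append((rec_id, field, a.get(field), b.get(field)))
    return discrepancies

# Hypothetical entries for two questionnaires keyed by two different people.
entry_1 = {101: {"age": 24, "education": 3}, 102: {"age": 31, "education": 2}}
entry_2 = {101: {"age": 24, "education": 3}, 102: {"age": 13, "education": 2}}

to_check = compare_entries(entry_1, entry_2)
```

Each tuple in `to_check` points to a questionnaire and item that must be looked up on paper. The comparison shows only that the two entries disagree, not which one is correct.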
As can be deduced from the previous paragraph, in the recent past, data entry
and data editing were two separate activities. Today, there are numerous computer
programs that accomplish editing of data entry error concurrent with data entry.
One such program is Epi Info (see Chapter 7). These programs usually reproduce
the questions from the questionnaire on the computer screen and allow movement
from one question to the next only if an acceptable entry is made. An acceptable
entry may be defined by allowable codes or by following the appropriate skip pat-
tern. An example of allowable codes may be an age range of 15 to 49 for a repro-
ductive health survey. If the data entry person attempts to enter a 52 as the response
for age, the computer will not accept it. Or, if there are five response categories for
educational level, coded 1 to 5, the computer will only accept an entry in the 1 to 5
range. If the data entry person mistakenly enters a 6, the computer will not
move on to the next question until it is corrected. Such a data entry/edit program
can identify the majority of data entry errors and consequently obviate the need
for double data entry. In the absence of an entry/edit program, double data entry
may be called for.
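The allowable-code check described above amounts to testing each keyed value against a list of permitted codes. The following sketch uses the two examples from the text (ages 15 to 49, education codes 1 to 5). The table and function names are hypothetical, and this is not how Epi Info or any other specific program is implemented.

```python
# Allowable codes for each item, taken from the examples in the text.
ALLOWED_CODES = {
    "age": range(15, 50),       # 15 through 49 inclusive
    "education": range(1, 6),   # codes 1 through 5
}

def is_acceptable(field, value):
    """Return True only if the keyed value is an allowable code for the field."""
    return value in ALLOWED_CODES[field]

age_52_accepted = is_acceptable("age", 52)              # rejected: out of range
education_6_accepted = is_acceptable("education", 6)    # rejected: not a valid code
age_25_accepted = is_acceptable("age", 25)              # accepted
```

A check of this kind accepts any value inside the range, so an age of 25 keyed in place of 35 passes without complaint.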
The data entry/edit program, however, does not identify all errors, and it
cannot correct all errors. A mis-keyed code that falls within the range of accepta-
ble codes will not be caught. Furthermore, any errors that are in the questionnaire,
and not due to data entry, cannot be corrected by the data entry person. The latter
errors can only be corrected by the interviewer and often not without returning to
the individual respondent. This usually is only done if data entry is timed close to
the original interview.
Once sufficient data have been entered into the computer, other edits can be
done. Consistency checks that may not have been easily programmed into the data
entry/edit program may be written and run on the data (see Chapter 7). And when
all the data have been entered, a set of frequencies for all the items in the question-
naire should be produced for further scrutiny for previously undetected errors.
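Both kinds of post-entry edit can be sketched briefly. The records, field names, and codes below are invented for illustration (1 = yes and 2 = no for the hypothetical item `ever_pregnant`), and the consistency rule shown is only one example of the cross-item checks a survey might need.

```python
from collections import Counter

def frequencies(records, field):
    """Count how often each code occurs for one questionnaire item."""
    return Counter(rec.get(field) for rec in records)

def inconsistent_births(records):
    """Flag respondents coded as never pregnant who report one or more live births."""
    return [
        rec["id"]
        for rec in records
        if rec["ever_pregnant"] == 2 and rec["live_births"] > 0
    ]

# Hypothetical entered data for three respondents.
records = [
    {"id": 1, "education": 3, "ever_pregnant": 1, "live_births": 2},
    {"id": 2, "education": 3, "ever_pregnant": 2, "live_births": 0},
    {"id": 3, "education": 5, "ever_pregnant": 2, "live_births": 1},
]

education_counts = frequencies(records, "education")
flagged = inconsistent_births(records)
```

Here respondent 3 is flagged. Each value is acceptable on its own, so only a check across items reveals the problem.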
After all the data have been entered and edited and there is no possibility of
returning to the field, the cover sheets with identifying information on the respond-
ents should be destroyed, as well as any other material (such as lists) that can
identify participants in the survey. The questionnaires themselves should be
kept—at least through the initial stages of analysis. Consistency errors may still
be discovered during the analysis phase, and it is always helpful to be able to
return to the original questionnaires.
Principles of data analysis are described elsewhere in this book (see Chapter 10).
The analysis consideration unique to survey data, however, is the importance of
using the appropriate statistical techniques to adjust for the sample design and to
SOURCES OF ERROR
All along the route of survey implementation there are places for error to
contaminate the data. While it is not possible to eliminate all error, you must be
cognizant of potential error, must attempt to keep it at a minimum, and when all
else fails, be prepared to define the bias that error may cause in the data. When
conducting a survey, the following broad areas for error must be foremost in your
mind as the survey planner and implementer.
1. Coverage error. Coverage error occurs when the population from which
the sample is drawn is not equivalent to the target population. This usually
results from outdated or poorly constructed sampling frames. It can also
SHORTCUTS TO AVOID
been pretested. The pretest not only demonstrates whether or not the
target population understands the questions in the same way as the person
who created them but also offers an opportunity to identify errors in the
questionnaire.
3. Failure to train the interviewers thoroughly. Interviewers not only need to
learn to conduct the interviews in a standardized fashion but they must
also be taught the importance of locating the correct respondent and
persuading that person to be interviewed. A week or two of interviewer
training is not unreasonable for a large-scale survey.
4. Failure to use adequate quality control procedures. You should build into
your survey necessary checks on its different facets at all stages—review
of sample selection procedures, supervision of interviewing, random
checks that interviews have actually taken place, and oversight of editing
and coding decisions, among other things. Insisting on proper standards
in recruitment and training of survey personnel helps a great deal, but
equally important are proper review, verification, and evaluation to ensure
that the execution of the survey corresponds to its design. Without proper
quality control of all steps in the survey process, errors can occur that are
irreversible, costly, and damaging.
7
USING A COMPUTER FOR FIELD
INVESTIGATIONS
• Tasks that are clearly defined and that will be done many times in the
same way
• Rapid computation or counting involving large numbers of similar records
• Tasks matching the capabilities of existing software
• Numerically intensive calculations
• Accurate retention of details
• Investigators who have used the same system before
MICROCOMPUTERS
Progress in the miniaturization of computers has been nearly miraculous in the past
three decades, and a description of microcomputer hardware is sure to be outdated
as soon as it is printed. At present a portable computer and a printer can be carried
to the field in a briefcase and operated either from batteries or standard electrical
power. Palmtop computers that fit in a pocket and do not require a keyboard are
becoming popular, although they are still limited compared with laptop models.
A laptop or desktop computer may have a hard disk capable of storing millions of
records. Portable modems make it technically possible to send files, access biblio-
graphical databases, or search the Internet from any area with cable or telephone
Internet service, although some countries place restrictions on modem use. Wireless
connections are rapidly becoming available in some areas, and useful work can be
performed from “Internet cafés” if private connections are not available.
The most common type of microcomputer is the Intel-compatible computer
with a Microsoft Windows® operating system. There are more than 500 million
copies of Microsoft Windows in the world, at least 22% of which are pirated.1
Since microcomputers running some form of Microsoft Windows are ubiquitous
and also permit fairly easy development of software, most epidemiologic software
is available for these models. Macintosh and Linux computers require different
software, but browsers in all three systems can access the Internet for searching,
communication, and calculation. Windows programs can be run within or beside
the other operating systems using Windows emulators or “dual boot” systems.
Documents and spreadsheets can be created, edited, and shared on the Internet using
only a browser, thus allowing those with different types of computer operating
systems to participate.
Laptop computers are more expensive than desktop models of the same
capacity but are fairly rugged and light enough to carry, and most models easily