A Dialogue With The Data The Bayesian Foundations of Iterative Research in Qualitative Social Science
We advance efforts to explicate and improve inference in qualitative research that iterates between theory development, data
collection, and data analysis, rather than proceeding linearly from hypothesizing to testing. We draw on the school of Bayesian
“probability as extended logic,” where probabilities represent rational degrees of belief in propositions given limited information, to
provide a solid foundation for iterative research that has been lacking in the qualitative methods literature. We argue that
mechanisms for distinguishing exploratory from confirmatory stages of analysis that have been suggested in the context of APSA’s
DA-RT transparency initiative are unnecessary for qualitative research that is guided by logical Bayesianism, because new evidence
has no special status relative to old evidence for testing hypotheses within this inferential framework. Bayesian probability not only
fits naturally with how we intuitively move back and forth between theory and data, but also provides a framework for rational
reasoning that mitigates confirmation bias and ad-hoc hypothesizing—two common problems associated with iterative research.
Moreover, logical Bayesianism facilitates scrutiny of findings by the academic community for signs of sloppy or motivated
reasoning. We illustrate these points with an application to recent research on state building.
In the context of the replication crisis, APSA’s transparency initiative, and surrounding debates, scholars have sought to revalue, explicate, and improve inference in qualitative research that proceeds in an inherently iterative manner, where prior knowledge informs hypotheses and data gathering strategies, evidence inspires new or refined hypotheses along the way, and there is continual feedback between theory and data. This iterative style of research, which is common in process tracing and comparative historical analysis, diverges from norms that
A list of permanent links to Supplemental Materials provided by the authors precedes the References section.
Tasha Fairfield is Associate Professor at the London School of Economics and 2017–2018 Mellon Foundation Fellow at
Stanford’s Center for Advanced Study in the Behavioral Sciences (T.A.Fairfield@lse.ac.uk). Her article with Andrew Charman,
“Explicit Bayesian Analysis for Process Tracing” (Political Analysis, 2017), won APSA’s Sage Paper Award for Qualitative and
Multi-Method Research. She is the author of Private Wealth and Public Revenue in Latin America: Business Power and Tax
Politics (Cambridge University Press, 2015, Donna Lee Van Cott Award, Latin American Studies Association).
They thank Andy Bennett, Ruth B. Collier, David Collier, Justin Grimmer, Macartan Humphreys, Alan Jacobs, Jack Levy,
James Mahoney, Jason Sharman, Hillel Soifer, and Elisabeth Wood for detailed comments and intellectual engagement. They
are also grateful to journal editor Michael Bernhard, Devin Caughey, Christopher Darnton, Steven Goodman, Jacob Hacker,
Antoine Maillet, James Mahon, Richard Nielsen, Craig Parsons, Jessica Rich, and Ken Shadlen, as well as seminar
participants at the Center for Advanced Study in the Behavioral Sciences, the Syracuse Institute for Qualitative and Multi-
Method Research, Rutgers, Princeton, Yale, University of Texas–Austin, University of California–Berkeley, and the University
of Oregon.
doi:10.1017/S1537592718002177
154 Perspectives on Politics © American Political Science Association 2019
or obtain that evidence. Prior/posterior describe idealized states of knowledge without/with specific pieces of evidence included. Hypotheses can contain temporal structuring, and evidence can contain temporal information. However, probabilities themselves carry no intrinsic time stamps.

These points merit expounding. Recall that within logical Bayesianism, only the data at hand and the background knowledge are relevant for assessing the degree of belief that a hypothesis merits. Nothing else about our state of mind should influence our probabilities. The relative timing of when we stated the hypothesis, worked out its implications, and gathered data falls into this latter category of logical irrelevance.

To further stress the logical irrelevance of keeping track of what we knew when, the rules of conditional probability mandate that we can incorporate evidence into our analysis in any order without affecting the posterior probabilities. Using the product rule (2) and commutativity, the joint likelihood of two pieces of evidence can be written in any of the following equivalent ways:

$$P(E_1 E_2 \mid H I) = P(E_2 E_1 \mid H I) = P(E_1 \mid E_2 H I)\, P(E_2 \mid H I) = P(E_2 \mid E_1 H I)\, P(E_1 \mid H I). \tag{4}$$

Evidence learned at time one (E1) may thus be treated as logically posterior to evidence learned at time two (E2). If in practice conclusions differ depending on the order in which evidence was incorporated, there is an error in our reasoning that should be corrected. Otherwise we have violated the fundamental notion of rationality that lies at the heart of logical Bayesianism (refer to Bayesian Foundations): information incorporated in equivalent ways should lead to the same conclusions.

Once we recognize that timing is irrelevant in probability theory, it follows that each step below is logically distinct:

• drawing on evidence E to inspire hypotheses;32
• assigning prior probabilities to those hypotheses given background information I that does not include E;
• assessing the likelihood of E under alternative hypotheses to derive posterior probabilities.

Information is neither “exhausted” nor “double-counted” in this process (online appendix A expounds). All relevant knowledge can be sorted as convenient into background information on which all probabilities are conditioned and into evidence that we use to update probabilities.

Psychological/subjective approaches to Bayesianism often diverge from logical Bayesianism on these points, because the former focus on individuals’ personal degrees of belief and how their psychological states evolve over time. Jeffrey’s “probability kinematics” is a prominent example;33 his approach introduces non-standard rules for updating that violate the laws of probability and imply that the order in which evidence is analyzed can matter.34

In sum, probability theory requires keeping track of what information has been incorporated into our analysis, not when that information was acquired.35 Time-stamps indicating when hypotheses were composed or when evidence was observed or incorporated are not relevant to scientific inference.36

Curtailing Confirmation Bias and Ad-Hoc Theorizing

Careful application of Bayesian logic helps guard against confirmation bias and ad-hoc hypothesizing in iterative research. We consider these dual pitfalls in turn.

Two common variants of confirmation bias entail overfocusing on data that fit a particular hypothesis or overlooking data that undermine it, and focusing on a single favored hypothesis while forgetting to consider whether data consistent with that hypothesis might be as or more supportive of a rival hypothesis. A common recommendation for precluding such biases entails identifying observable implications of rivals as well as the main working hypothesis before gathering data.37 However, this advice can be problematic for two reasons.

First, deducing observable implications beforehand may be infeasible, because any hypothesis may be compatible with a huge number of possible evidentiary findings, just with varying probabilities of occurrence. For qualitative research on complex socio-political phenomena, there is essentially no limit to the different kinds of evidence we might encounter, and there is no way to exhaustively catalogue these infinite possibilities in advance.

Second, anticipating observable implications may foster even greater bias. If we have already elaborated hypotheses to be considered and evidence expected under each, we are now better situated to seek out the sorts of evidence that will support our pet theory, compared to a situation where we collect evidence without necessarily anticipating what will support which hypothesis. This caveat embodies classic advice from Doyle’s Sherlock Holmes: “It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts” (A Scandal in Bohemia).

Risks of confirmation bias can be better controlled by conscientiously endeavoring to follow Bayesian reasoning. Tendencies to seek evidence that supports a favored hypothesis, interpret evidence as overly favorable to that hypothesis, and underweight evidence that runs against that hypothesis are counteracted by following Bayesian prescriptions to condition probabilities on all relevant information available, without presuming anything
The mayor of Lima openly hoped for a prompt Chilean occupation for fear that subalterns might rebel. The agrarian upper class not only refused to support General Cáceres’ efforts to fight back, but actively collaborated with Chilean occupiers because of Cáceres’ reliance on armed peasant guerillas.42

This evidence might inspire a new hypothesis:

HLRA = Labor-repressive agriculture is the central factor hindering institutional development. Elites resist taxation and centralized control over coercive institutions, because they fear greater vulnerability to local rebellions.43

To assess which hypothesis better explains the evidence acquired thus far, we must return to our background information and reassign priors across the new hypothesis set: HR, HW, and the inductively-inspired HLRA. We then assess likelihood ratios for the aggregate evidence E1E2.

For priors, strictly speaking we should assess the plausibility of each hypothesis taking into account all information accumulated in previous state-building literature. However, systematically incorporating all of our background information is infeasible in social science. Given practical limitations, one reasonable approach keeps equal odds on HR versus HW but gives HLRA a moderate penalty relative to each rival, thereby acknowledging the novelty of this hypothesis with respect to existing state-building research and anticipating skepticism among readers. Another reasonable option places equal odds on all three hypotheses, considering that HLRA is grounded in a longstanding research tradition originated by Barrington Moore.44 While HLRA is not discussed in state-building literature, labor-repressive agriculture has been identified as a crucial factor affecting other macro-political outcomes including regime type, so a priori we might expect this factor to be salient for state-building as well. Furthermore, although HLRA was introduced post-hoc (in light of E2), it is no more or less ad-hoc compared to the rivals: upon inspection, none of the three hypotheses seems appreciably more complex than the others. Each identifies a single structural cause that operates by shaping actors’ incentives.45

Turning to the evidence, the easiest way to proceed entails assessing likelihood ratios for HLRA vs. HR and HLRA vs. HW. Since the overall likelihood ratio factorizes:

$$\frac{P(E_1 E_2 \mid H_i I)}{P(E_1 E_2 \mid H_j I)} = \frac{P(E_1 \mid H_i I)}{P(E_1 \mid H_j I)} \, \frac{P(E_2 \mid E_1 H_i I)}{P(E_2 \mid E_1 H_j I)} \tag{5}$$

we first consider E1 and then E2.

E1 moderately favors HR over HLRA. As explained in the Bayesian Inference sub-section, E1 fits quite well with the resource-curse hypothesis. However, E1 is not surprising under HLRA; a weak state with mineral resources would still be an easy and attractive target for invasion if labor-repressive agriculture were the true cause of state weakness. Nevertheless, resource dependence in conjunction with state weakness makes E1 more expected under HR. In contrast, E1 strongly favors HLRA over HW: whereas this evidence is unsurprising under HLRA, it is highly unlikely under HW (refer to the Bayesian Inference sub-section).

E2 very strongly favors HLRA over each alternative. Neither HW nor HR speaks to the nature of agricultural relations, whereas in the world of HLRA, semi-servile labor is highly expected given that Peru has a weak state (E1).46 Furthermore, under either HW or HR, the behavior of Peruvian elites described in E2 would be extremely surprising; we would instead expect them to resist the Chilean incursion (however ineffectively, given state weakness) in an effort to retain control over their territory and mineral resources. In contrast, their behavior fits quite well with HLRA in showing that elites’ concern over maintaining subjugation of the labor force undermined the most basic function of the state: national defense. Of course, we know E2 fits well with HLRA since the former inspired the latter; however, the critical inferential point is that E2 is much more plausible under HLRA relative to the alternatives. Accordingly, this evidence very strongly increases the odds in favor of HLRA.

Overall, the likelihood ratio (5) strongly favors HLRA over both alternatives. E2 overwhelms the moderate support that E1 provides for HR. And all of the evidence weighs strongly against HW. Accordingly, HLRA emerges as the best explanation given the evidence acquired thus far. If we begin with a moderate penalty on HLRA, the posterior still favors that hypothesis, although the higher the prior penalty, the more decisive the overall evidence needed to boost the plausibility of HLRA above its competitors.

In essence, we have now “tested” an inductively-inspired hypothesis with “old evidence.” What matters is not when HLRA came to mind or which evidence was known before versus after that moment of inspiration, but simply which hypothesis is most plausible given our background information and all the evidence. Imagine that a colleague is familiar with all three hypotheses from the outset and shares essentially the same background knowledge, but has not seen E1E2. She would follow a logically identical inferential process in evaluating which hypothesis best explains the Peruvian case: assessing the likelihood of E1E2 under these rival hypotheses. It would be irrational for two scholars with the same knowledge to reach different conclusions merely because of when they learned the evidence.

To further emphasize the irrelevance of relative timing, we do not know from reading Kurtz’s article whether he invented HLRA before or after finding E2, but that chronological information would not make E2 any more or less cogent. Our goal is not to reproduce the order in which the neurons fired inside the author’s brain; it is to independently assess which hypothesis is most plausible in light of the evidence and arguments presented.

Of course “new evidence” is often valuable for improving inferences by providing additional weight of evidence.
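The order-independence property in equation (4) and the effect of a prior penalty on HLRA can be checked numerically. The following is a minimal sketch, not the authors' procedure: the article assigns qualitative weights of evidence, so every probability in `JOINT` and both prior assignments are hypothetical values chosen only to respect the qualitative judgments in the text (E1 moderately favors HR over HLRA and strongly favors HLRA over HW; E2 very strongly favors HLRA). All function and variable names are likewise illustrative.

```python
from math import isclose

# HYPOTHETICAL joint likelihoods P(E1=a, E2=b | H, I), keyed by (a, b).
# These numbers are illustrative assumptions, not values from the article.
JOINT = {
    "H_R":   {(1, 1): 0.02,  (1, 0): 0.58,  (0, 1): 0.01, (0, 0): 0.39},
    "H_W":   {(1, 1): 0.005, (1, 0): 0.095, (0, 1): 0.02, (0, 0): 0.88},
    "H_LRA": {(1, 1): 0.30,  (1, 0): 0.10,  (0, 1): 0.20, (0, 0): 0.40},
}

# Two HYPOTHETICAL prior assignments mirroring the options discussed in the text.
EQUAL_PRIORS = {"H_R": 1 / 3, "H_W": 1 / 3, "H_LRA": 1 / 3}
PENALIZED_PRIORS = {"H_R": 0.4, "H_W": 0.4, "H_LRA": 0.2}


def marginal(joint, index, value):
    """P(E_index = value | H, I), summing the joint over the other clue."""
    return sum(p for outcome, p in joint.items() if outcome[index] == value)


def likelihood_e1_then_e2(joint):
    """P(E1 | H I) * P(E2 | E1 H I): incorporate E1 first, then E2."""
    p_e1 = marginal(joint, 0, 1)
    p_e2_given_e1 = joint[(1, 1)] / p_e1
    return p_e1 * p_e2_given_e1


def likelihood_e2_then_e1(joint):
    """P(E2 | H I) * P(E1 | E2 H I): incorporate E2 first, then E1."""
    p_e2 = marginal(joint, 1, 1)
    p_e1_given_e2 = joint[(1, 1)] / p_e2
    return p_e2 * p_e1_given_e2


def posterior(priors, likelihood_fn):
    """Normalized posterior over the hypothesis set via Bayes' rule."""
    unnormalized = {h: priors[h] * likelihood_fn(JOINT[h]) for h in priors}
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}


if __name__ == "__main__":
    for name, priors in [("equal", EQUAL_PRIORS), ("penalized", PENALIZED_PRIORS)]:
        forward = posterior(priors, likelihood_e1_then_e2)
        reverse = posterior(priors, likelihood_e2_then_e1)
        same = all(isclose(forward[h], reverse[h]) for h in priors)
        print(name, {h: round(p, 3) for h, p in forward.items()}, "order-independent:", same)
```

Under these assumed numbers, either ordering of the evidence yields the same posterior, and a moderate prior penalty lowers the posterior on HLRA without displacing it as the leading hypothesis. A harsher penalty or weaker likelihood ratios could change that ranking, which is the sensitivity the text describes.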
Regarding concern b, scholarly dialogue again serves as a corrective to sloppy analysis. If an inductive hypothesis manifesting multiple fine-tuned variables or inordinate complexity is granted too much initial credence, readers should notice and demand additional evidence to overcome an unacknowledged or underestimated Occam penalty. Beyond the simple advice to treat inductively-devised hypotheses with healthy skepticism, three suggestions can help curtail ad-hoc hypothesizing: start with reasonably simple theories and add complexity incrementally as needed; critically assess whether all causal factors in the theory actually improve explanatory leverage; and ask whether the explanation might apply more broadly.

In contrast, reporting the temporal sequencing of the research process in and of itself does not help ascertain how severe an Occam penalty a hypothesis should suffer. The critical point is that a hypothesis that is post-hoc (devised after the evidence) is not necessarily ad-hoc (arbitrary or overly complex). These are distinct concepts. As argued in the Iteration in Practice section, HLRA is post-hoc (relative to E2), but not ad-hoc, because it is no more arbitrary or complex than its rivals.

Biased Likelihoods

Concern: We may succumb to confirmation bias in overstating how strongly evidence favors an inductively-derived hypothesis.

Suggestions for pre-registration and time-stamping in qualitative research47 aim to address these concerns, on the premise that differentiating exploratory from confirmatory analysis allows us to more credibly evaluate inductively-inspired hypotheses. Importing this prescription into a Bayesian framework would entail assigning likelihoods to clues we might encounter before gathering data.

Even in light of human cognitive limitations, we find this approach unhelpful. Although a scholar’s prospective assessment of likelihoods for “new evidence” might be less prone to confirmation bias than retrospective analysis of “old evidence,” confirmation bias could just as easily intrude when gathering additional evidence, by subconsciously looking harder for clues that favor the working hypothesis or overlooking those that do not (refer to Curtailing Confirmation Bias and Ad-Hoc Theorizing).

Moreover, we reiterate the impossibility of foreseeing all potential evidentiary observations in the complex world of social science. Anticipating coarse-grained categories of observations is not adequate for specifying likelihoods for any actual, concrete evidence that might fit within that class, because specific details of evidence obtained can matter immensely to likelihoods under different hypotheses. Consider the example Bowers et al. present in their discussion of pre-analysis plans for qualitative research: a government has cut taxes, and we wish to assess hypothesis HK = Tax cuts were motivated by an interest in Keynesian demand management.48 The authors delineate evidence E = Records of deliberations among cabinet officials about the tax cut show “prominent mention of . . . Keynesian stimulus,” and they judge the probability of finding such evidence if HK is true to be very high. However, E as stated above is too vague to assign a meaningful likelihood in advance. Here are two different clues we might encounter in the records:

E′ = The Finance Minister invokes Keynesian stimulus when explaining the tax cuts to other cabinet members.

E″ = One of the cabinet members comments that tax cuts are consistent with Keynesian stimulus, whereafter discussion is interrupted by derisive jokes about Keynesian economics.

Suppose further that the time and attention devoted to these mentions of Keynesianism are similar for E′ and E″, such that both qualify as instances of E as articulated above, even though they carry very different import. Whereas the likelihood of E′ might well be high if HK is true, the likelihood of E″ certainly is not; E″ would be extremely surprising in a world where HK is correct.

Bowers et al. recognize this “problem of precision,” noting that E as defined earlier “still leaves some things open. Just how prominent do mentions of Keynesian logic have to be . . . ? How many actors have to mention it? What forms of words will count as the use of Keynesian logic?”49 However, they underestimate the problem. The issue is not just how many mentions or how many actors or what terms we associate with Keynesianism, but an endless array of other possibilities and nuances that depend on the context and manner in which Keynesianism is discussed. However much additional detail we specify before gathering data, we can always invent (and the real world may well produce) another twist or tweak that matters non-trivially. Despite efforts to anticipate what might surprise us ahead of time, science advances most when evidence surprises us in unforeseen ways.

Jaynes, an outspoken advocate of logical Bayesianism in the physical sciences, reinforces these key points:

The orthodox line of thought [holds] that before seeing the data one will plan in advance for every possible contingency and list the decision to be made after getting every conceivable data set. The problem . . . is that the number of such data sets is usually astronomical; no worker has the computing facilities needed . . . . We take exactly the opposite view: it is only by delaying a decision until we know the actual data that it is possible to deal with complex problems at all. The defensible inferences are the post-data inferences.50

What matters is how sound the inferences are in light of the arguments and evidence presented, not in comparison to every twist and turn of analysis before the author arrived at the final conclusions, or what the author would have thought had the data turned out differently.

Returning to the core concern of mitigating bias when assessing likelihoods, first, recall that inference always