Thickening Thin Concepts and Theories - Combining Large N and Small in Comparative Politics (M. Coppedge)
Ph.D. Program in Political Science of the City University of New York is collaborating with JSTOR to digitize,
preserve and extend access to Comparative Politics.
Research Note
Michael Coppedge
A Perspective on Methods
Comparative Politics July 1999
Thick Concepts
bility of the hypothesis that different researchers are in fact explaining different things in different ways and therefore favors (but does not guarantee) cumulation.
However, the thin concepts implied by the construction of some of the variables often introduce uncertainty about the validity of these measures. Quantitative researchers in effect use the bait-and-switch tactic of announcing that they are testing hypotheses about the impact of, for example, economic development and then by sleight of hand substitute an indicator of per capita energy consumption and assert that it measures development well enough. The problem with such substitutions is not necessarily that they do not measure the concept of interest at all, but that a single narrow indicator cannot capture all the relevant aspects of a thick concept.
It is tempting to hide behind the excuse that we often test only the implications of the hypothesis rather than the hypothesis itself. But this fig leaf offers no real protection, because a valid test of the full hypothesis would still require the testing of all of its manifold implications. Often these tests would be the same as tests involving an equivalent complex multidimensional indicator. For example, if theory tells us that X causes Y, it makes little difference whether we treat Y's thin indicators Y1 and Y2 as observable implications of Y or as component dimensions of Y; either way, we end up testing for associations between X, on the one hand, and Y1 and Y2, on the other. Unpacking a thick concept, exploring its dimensionality, and translating it into quantitative indicators can be seen as a process of discovering more of the observable implications of a theory and therefore of rendering it more testable.
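The equivalence described above can be sketched with a toy simulation (all variable names and coefficients here are hypothetical, not from the article): whether Y1 and Y2 are labeled observable implications of Y or component dimensions of Y, the test is the same pair of associations between X and each thin indicator.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)                # X, the hypothesized cause
y1 = 0.6 * x + rng.normal(size=n)     # thin indicator Y1 of the thick concept Y
y2 = 0.5 * x + rng.normal(size=n)     # thin indicator Y2 of the thick concept Y

# Under either framing, the empirical test reduces to the same two
# associations: corr(X, Y1) and corr(X, Y2).
for label, y in (("Y1", y1), ("Y2", y2)):
    r = np.corrcoef(x, y)[0, 1]
    print(f"corr(X, {label}) = {r:.2f}")
```

The labels attached to Y1 and Y2 change nothing in the computation; only the interpretation of a confirmed association differs.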
Is there any way to assemble large datasets with valid indicators? It would be easier if concepts were not thick. But some thick concepts are too old and too central to our thinking to be reduced in this way. In such cases the alternative is to explore empirically how to measure them validly and reliably, which requires recognition of their complexity. A basic procedure in measuring any complex concept has four steps. First, the analyst breaks the mother concept up into as many simple and relatively objective components as possible. Second, each of these components is measured separately. Third, the analyst examines the strength of association among the components to discover how many dimensions are represented among them and in the mother concept. Fourth, components that are very strongly associated with one another are treated as unidimensional, that is, as all measuring the same underlying dimension, and may be combined. Any other components or clusters of components are treated as indicators of different dimensions. If the mother concept turns out to be multidimensional, the analyst then has two or more unidimensional indicators that together can capture its complexity. If the mother concept turns out to be unidimensional, then the analyst has several closely associated component indicators that may be combined into a single indicator that captures all the aspects of that dimension better than any one component would.8
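As an illustration only (the article prescribes no software), the four-step procedure might be sketched in Python with simulated component scores and a simple pairwise-correlation cutoff standing in for a fuller dimensionality analysis; the 0.7 threshold, the data, and the decision to average standardized components within a cluster are all assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Steps 1-2: the mother concept has been broken into components, each
# measured separately (rows = cases, columns = components). Here c1-c3
# are built to share one underlying dimension; c4 taps a second one.
n_cases = 200
base = rng.normal(size=n_cases)
components = {
    "c1": base + 0.2 * rng.normal(size=n_cases),
    "c2": base + 0.2 * rng.normal(size=n_cases),
    "c3": base + 0.2 * rng.normal(size=n_cases),
    "c4": rng.normal(size=n_cases),
}
names = list(components)
X = np.column_stack([components[n] for n in names])

# Step 3: examine the strength of association among the components.
corr = np.corrcoef(X, rowvar=False)

# Step 4: treat components whose pairwise correlation exceeds a threshold
# as measuring the same underlying dimension; any leftover component is an
# indicator of a different dimension.
THRESHOLD = 0.7
clusters, assigned = [], set()
for i, name in enumerate(names):
    if name in assigned:
        continue
    cluster = [name]
    assigned.add(name)
    for j in range(i + 1, len(names)):
        if names[j] not in assigned and corr[i, j] > THRESHOLD:
            cluster.append(names[j])
            assigned.add(names[j])
    clusters.append(cluster)

# Combine each cluster into one indicator by averaging its standardized
# components, so that no single component dominates through its scale.
z = (X - X.mean(axis=0)) / X.std(axis=0)
indicators = {
    "+".join(c): z[:, [names.index(n) for n in c]].mean(axis=1)
    for c in clusters
}
print(len(clusters), "dimension(s) found:", list(indicators))
```

With these simulated data the sketch recovers two dimensions: one combined indicator for the strongly associated components c1-c3 and a separate indicator for c4, mirroring the multidimensional outcome described in the text.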
Controversy has always surrounded quantitative indicators. One basic objection holds that, when the theoretical concept is categorical, not continuous, then attempts
no reason not to develop and use thick continuous measures as long as one threshold of their continuous concept is faithful to all the facets of the original categorical concept.
This defense of the superiority of continuous indicators rests entirely on the premise that the hypothetical continuous and categorical indicators measure exactly the same concept. In theory they could, but in practice they often do not. The problem with many quantitative indicators is not that they are quantitative, but that they are qualitatively different from the categorical concepts they purport to measure. The real problem with continuous indicators is that they measure only thin, reductionist versions of the thicker concepts that interest nonquantitative scholars.
Thick Theory
Allende claimed to be a socialist, that he proposed socialist policies, that these policies became law, that these laws adversely affected the economic interests of certain powerful actors, that some of these actors moved into opposition immediately after certain quintessentially socialist policies were announced or enacted, that Allende's rhetoric disturbed other actors, that these actors issued explicit public and private complaints about the socialist government and its policies, that representatives of some of these actors conspired together to overthrow the government, that actors who shared the president's socialist orientation did not participate in the conspiracy, and that the opponents publicly and privately cheered the defeat of socialism after the overthrow. Much of this evidence could also disconfirm alternative hypotheses, for example, that Allende was overthrown because of U.S. pressure despite strong domestic support. If it turns out that all of these observable implications are true, then the scholar could be quite confident of the hypothesis. In fact, she would be justified in remaining confident of the hypothesis even if a macro comparison showed that most elected socialist governments have not been overthrown, because she has already gathered superior evidence that failed to disconfirm the hypothesis in this case.
The longitudinal case study is simply the best research design available for testing hypotheses about the causes of specific events. In addition to maximizing opportunities to disconfirm observable implications, it does the best job of documenting the sequence of events, which is crucial in establishing the direction of causal influence. Moreover, it is the ultimate "most similar systems" design, because conditions that do not change from time 1 to time 2 are held constant and every case is always far more similar to itself at a different time than it is to any other case. The closer together the time periods are, the tighter the control is. In a study of a single case that examines change from month to month or day to day, almost everything is held constant, and scholars can often have great confidence in inferring causation between the small number of conditions that do change around the same time. Competent small-N comparativists have every reason to be skeptical of conclusions from macro comparisons that clash with their more solid understanding of a case.
This approach has two severe limitations, however. First, it is extremely difficult to use it to generalize to other cases. Every additional case requires a repetition of the same meticulous process-tracing and data collection. To complicate matters further, the researcher usually becomes aware of other conditions that were taken for granted in the first case and now must be examined systematically in it and all additional cases. Generalization therefore introduces new complexity and increases the data demands exponentially, making comparative case studies unwieldy. Second, the case study does not provide the leverage necessary to test counterfactual hypotheses, for which a single case can supply little data (beyond interviews in which actors speculate about what they would have done under other conditions). For example, would the Chilean military have intervened if Allende had been elected in 1993
results from the fact that most of the latter are either cast at the subnational (group or individual) level or move easily between all levels, from individual to international. Small-N comparison is more flexible in this respect. Case studies routinely mix together national structural factors such as industrialization and constitutional provisions, group factors such as party or union characteristics, individual factors such as particular leaders' decisions, and international factors such as IMF influence. Quantitative researchers are caught flat-footed when faced with shifting levels of analysis because they go to great pains to build a dataset at one level. Incorporation of explanatory factors from a lower level requires their building a completely different dataset from scratch. Their units of analysis are countries and years, at best. For example, to test the hypotheses on regime transitions from the O'Donnell-Schmitter-Whitehead project, one would have to collect data about strategic actors rather than countries and resample at intervals of weeks or months rather than years.
In view of the difficulty of bridging levels of analysis, it is tempting to conclude that the effort is not necessary, that the choice of a level of analysis is a matter of taste, and that those working at the individual and national levels may eat at Almond's separate tables and need never reconcile their theories. But from the perspective of methodological perfection outlined in this article, the level of analysis is not a matter of taste, because no level of analysis by itself can yield a complete picture of all the causal relationships that lead to a macro outcome. All levels of analysis are, by themselves, incomplete. Rational choice advocates are right to insist that a political theory tell us what is going on at the individual level. However, it is wrong to argue that associations at a macro level do not qualify as theory until one can explain the causal mechanisms at the individual level. A theory of structural causation is a theory, but an incomplete one, just as theory at the individual level is incomplete until it tells us what process determined the identities and number of players, why these players value the ends they pursue rationally and which variety of rationality guides their choices, how the institutional arena for the game evolved, where the payoffs come from, why the rules sometimes change in mid-game, and how the power distribution among actors determines the macro outcome. And both microtheories and macrotheories are incomplete until we understand them in their slowly but constantly evolving historical-structural context.
This insistence on bridging levels of analysis is not mere methodological prudery. Empirical questions of great theoretical, even paradigmatic, import, such as whether individuals affect democratization, depend on it. Rational choice theory assumes that they do; Linz and Stepan and O'Donnell, Schmitter, and Whitehead asserted that they do.14 Yet, despite all the eloquent theorizing that led to "tentative conclusions about uncertain transitions," all the cases covered by Transitions from Authoritarian Rule underwent successful transitions that have lasted remarkably long. There are many possible explanations for this genuinely surprising outcome, but one that is plausible enough to require disconfirmation is the idea that these transitions were
driven by structural conditions. Even if it is true that elites and groups had choices and made consequential decisions at key moments, their goals, perceptions, and choices may have been decisively shaped by the context in which they were acting. If so, they may have had a lot of proximate influence but very little independent influence after controlling for the context. I do not assert this interpretation as fact but merely suggest that it has some plausibility and theoretical importance; we will never know how seriously to take it until we bridge these levels of analysis with methods that permit testing of complex multivariate hypotheses. This effort would require collecting a lot of new data on smaller units of analysis at shorter time intervals.
Conclusion
NOTES
I am grateful to David Collier, Ruth Berins Collier, and three anonymous reviewers for their thoughtful and constructive suggestions.
1. Robert H. Bates, "Letter from the President: Area Studies and the Discipline," APSA-CP: Newsletter of the APSA Organized Section in Comparative Politics, 7 (Winter 1997), 1-2; Barbara Geddes, "Paradigms and Sandcastles: Research Design in Comparative Politics," APSA-CP: Newsletter of the APSA Organized Section in Comparative Politics, 8 (Winter 1997), 18-20.
2. Karl R. Popper, The Logic of Scientific Discovery (New York: Harper and Row, 1968). Of course, we also need to demonstrate that our hypotheses fit the evidence. Though not always easy, this demonstration is easier than ruling out all the other possibilities, so I emphasize disconfirmation here. One of the alternative hypotheses is that any association we discover is due to chance. Some scholars therefore encourage us to avoid procedures that increase the probability of false positives, such as testing a hypothesis with the same sample that suggested it or engaging in exploratory specification searches or "mere curve-fitting." Some even find methodological virtue in procedures that are more likely to generate hypotheses that are wrong, that is, logical deduction of the implications of simplistic assumptions. I consider this stance an overreaction to the danger of chance associations. The counterintuitiveness of a
hypothesis should increase our skepticism and our insistence on thorough testing, not our confidence in thinly documented associations. There are better ways to guard against false positives: enlarging the sample, replicating with different indicators, and testing other observable implications of the hypothesis.
3. Imre Lakatos, "Falsification and the Methodology of Scientific Research Programmes," in John Worrall and Gregory Currie, eds., The Methodology of Scientific Research Programmes (Cambridge: Cambridge University Press, 1978), pp. 8-101.
4. Terry Lynn Karl, The Paradox of Plenty: Oil Booms and Petro-States (Berkeley: University of California Press, 1997).
5. Samuel Huntington, The Third Wave: Democratization in the Late Twentieth Century (Norman: University of Oklahoma Press, 1991).
6. Rational choice is primarily a method for generating theory. Rational choice theorists who test their models have no choice but to use existing empirical research methods and are subject to the same methodological standards as quantitative or small-N researchers. I do not take the naive falsificationist stand that any single disconfirmation invalidates the entire approach; such invalidation comes only after the accumulation of many disconfirmations of an approach's predictions. However, the accumulation of consistencies or inconsistencies requires much prior testing of specific hypotheses. In this process every hypothesis must be evaluated with respect to at least one competing hypothesis, perhaps drawn from the same approach, perhaps from others.
7. David Collier and Steven Levitsky, "Democracy with Adjectives: Conceptual Innovation in Comparative Research," World Politics, 49 (April 1997), 430-51.
8. It is sometimes possible to combine multidimensional components into a single indicator. However, a theory that tells one how to combine them properly is required. In geometry, for example, volume is a single indicator of a multidimensional quality, but it cannot be calculated unless one knows the appropriate formula for the shape of the object in question.
9. Adam Przeworski, Michael Alvarez, Jose Antonio Cheibub, and Fernando Limongi, "What Makes Democracies Endure?," Journal of Democracy, 7 (January 1996), 39-55; Giovanni Sartori, The Theory of Democracy Revisited (Chatham: Chatham House, 1987).
10. Gary King, Robert O. Keohane, and Sidney Verba, Designing Social Inquiry: Scientific Inference in Qualitative Research (Princeton: Princeton University Press, 1994), p. 24.
11. As Fearon argues, both large- and small-N tests involve accepting some counterfactual propositions. In large-N regression analyses it is assumed that the explanatory variables are not correlated with any omitted variables. James D. Fearon, "Counterfactuals and Hypothesis Testing in Political Science," World Politics, 43 (January 1991), 169-95. Only careful refutation of the plausible alternatives can make this counterfactual plausible. A large-N analysis has the potential to deal with the inevitable counterfactuals more satisfactorily, and we should improve measurement and control so that large-N analysis can better realize its potential.
12. Harry Eckstein, "Case Study and Theory in Political Science," in Fred Greenstein and Nelson Polsby, eds., Strategies of Inquiry (Reading: Addison-Wesley, 1975), pp. 79-138.
13. Charles Ragin, The Comparative Method: Moving beyond Qualitative and Quantitative Strategies (Berkeley: University of California Press, 1987).
14. Juan J. Linz and Alfred Stepan, eds., The Breakdown of Democratic Regimes (Baltimore: The Johns Hopkins University Press, 1978); Guillermo O'Donnell, Philippe Schmitter, and Laurence Whitehead, eds., Transitions from Authoritarian Rule (Baltimore: The Johns Hopkins University Press, 1986).