Measuring Reasoning Ability: January 2005
net/publication/270585231
1 author:
Oliver Wilhelm
Ulm University
All content following this page was uploaded by Oliver Wilhelm on 12 January 2015.
21
MEASURING REASONING ABILITY
OLIVER WILHELM
21-Wilhelm.qxd 9/8/2004 5:09 PM Page 374
and how profound the proposed explanations are. They are also different with respect to how much experimental research was done to investigate them and how much supportive evidence was collected. The theory of mental models (Johnson-Laird & Byrne, 1991) is one outstanding effort in describing and explaining what people do when they reason, and this theory will be described in more detail after briefly reviewing more specific accounts of deductive and inductive reasoning, respectively.

Besides many more specific accounts of reasoning, the mental logic approach to reasoning has many adherents and was applied to a broad range of reasoning problems (Rips, 1994). According to mental logic theories, individuals apply schemata of inference when they reason. Errors in reasoning occur when inference schemata are unavailable, corrupted, or cannot be applied. More complex inferences are accomplished by compiling several elemental schemata. The inference schemata in various mental logic theories are different from each other, from logical terms in natural language, and from logical terms in formal logic. The “psychology of proof” by Rips (1994) is the most elaborated and sophisticated theory of mental logic. However, the mental model theory covers a broader range of phenomena than mental logic accounts do. In addition, the experimental support seems to be in favor of the mental models theory. Finally, both sets of theories are closely related with each other, the major difference being that the mental model approach deals with reasoning on the semantic level, whereas mental logic theories investigate reasoning on the syntactic level.

Analogical reasoning is a subset of inductive thinking that has received considerable attention in cognitive psychology. For example, Holyoak and Thagard (1997) developed a multiconstraint theory of analogical reasoning. Three constraints are claimed to create coherence in analogical thought: similarity between the concepts involved; structural parallels (specifically, isomorphism) between the functions in the source and target domains; and guidance by the reasoner’s goals. This work was recently extended. Hummel and Holyoak (2003) developed a symbolic connectionist model of relational inference. The theory suggests that distributed symbolic representations are the basis of relational reasoning in working memory. There is no doubt substantial promise in extending these accounts of inductive thinking to available reasoning measures. So far, there is not enough experimental evidence available allowing derivation of predictions of item difficulties (but see Andrews & Halford, 2002), and there is not enough variability in the application of the theories to allow a broad application in predicting psychometric properties of reasoning tests in general. To illustrate the character and promise of theories of reasoning processes, I will limit the exposition to the mental model theory. It is hoped that the future will bring an integration of theories of inductive and deductive reasoning along with strong links to theories of working memory.

The mental model theory has been extensively applied to deductive reasoning (Johnson-Laird, 2001; Johnson-Laird & Byrne, 1991) and inductive thinking (Johnson-Laird, 1994b). Briefly, mental model theory views thinking as the manipulation of models (Craik, 1943). These models are analogous representations, meaning that the structure of the models corresponds to what they represent. Each entity is represented by an individual token in a model. Properties of and relations between entities are represented by properties of and relations between tokens, respectively. Negations of atomic propositions are represented as annotations of tokens. Information can be represented implicitly, and the implicit status of a model is part of the representation. If necessary, implicit representations can be fleshed out by simple mechanisms. The epistemic status of a model is represented as a propositional annotation in the model.

A major determinant of the difficulty of reasoning tasks is the number of mental models that are compatible with the premises. The premises “A is left of B. B is left of C. C is left of D. D is left of E.” can be easily integrated into one mental model:

A B C D E

This mental model supports conclusions such as “C is left of E.” However, the premises “A is left of B. B is left of C. C is left of E. D is
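The relation between a set of premises and the number of compatible models can be illustrated with a small brute-force sketch. It is not an implementation of mental model theory: it only enumerates left-to-right arrangements of tokens and keeps those consistent with a list of "X is left of Y" premises. The function name `compatible_models` and the second premise set (tokens P, Q, R, S) are assumptions introduced for illustration and are not taken from the chapter.

```python
from itertools import permutations

def compatible_models(tokens, premises):
    """Return all left-to-right orders of tokens consistent with premises.

    premises is a list of (x, y) pairs, each meaning "x is left of y".
    """
    models = []
    for order in permutations(tokens):
        position = {tok: i for i, tok in enumerate(order)}
        if all(position[x] < position[y] for x, y in premises):
            models.append(" ".join(order))
    return models

# Determinate description from the text: every token has a fixed place.
determinate = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
one_model = compatible_models("ABCDE", determinate)

# Hypothetical indeterminate description (illustrative assumption):
# S is only known to be left of R, so its place among P and Q is open.
indeterminate = [("P", "Q"), ("Q", "R"), ("S", "R")]
several_models = compatible_models("PQRS", indeterminate)

print(one_model)
print(several_models)
```

A description that fixes every token yields a single arrangement, whereas leaving one relation open multiplies the arrangements a reasoner would have to keep in mind.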
However, errors in answering reasoning problems can be located at any of the three stages. The relevance of the third stage as a primary source of errors can be debated. Johnson-Laird (1985) argues that the search for counterexamples is crucial for individual differences, yet Handley, Dennis, Evans, and Capon (2000) argue that individuals rarely engage in a search for counterexamples. Psychometrically, syllogisms and spatial relational tasks that do not rely on a search for counterexamples are as good as, or better than, items that require such a search as measures of reasoning ability (Wilhelm & Conrad, 1998).

Theories about reasoning processes in general and the mental model theory in particular have been widely and successfully applied to reasoning problems. Few of these applications have considered problems from psychometric reasoning tasks (but see Yang & Johnson-Laird, 2001). We will now discuss the status of reasoning ability in various models of the structure of intelligence, as assessed by psychometric reasoning tasks, and then turn to formal and empirical classifications of reasoning measures. Ideally, a general theory of reasoning processes should govern test construction and confirmatory data analysis. In practice, theories of reasoning processes have rarely been considered when creating and using psychometric reasoning tasks.

REASONING IN VARIOUS MODELS OF THE STRUCTURE OF INTELLIGENCE

Binet’s original definition of intelligence focused on abilities of sensation, perception, and reasoning, but this definition was modified several times and ended up defining intelligence as the ability to adapt to novel situations (Binet, 1903, 1905, 1907). Structurally, Binet’s as well as Ebbinghaus’s (1895) earlier investigations do not fall within the realm of factor-analytic work, and consequently, they have been rarely discussed in this context.

Spearman’s invention of tetrad analysis as a means to assess the rank of correlation matrices was the starting point of factor-analytic work (Krueger & Spearman, 1906; Spearman, 1904).

Spearman’s definition of general intelligence (g) focuses on the role of educing correlates and relations. The ability to educe relations and correlates is best reflected in reasoning measures. Other intelligence measures are characterized by varying proximity to the general factor. Reasoning measures are expected to have high g loadings and low proportions of specific variance. The g factor is said to be precisely defined and the core construct of human abilities (Jensen, 1998; but see Chapter 16, this volume). There are several more or less strict interpretations of the g factor theory (Horn & Noll, 1997). In its strictest form, one core process is causal for all communalities in individual differences. In a much more relaxed form of the theory, a general factor is supposed to capture the correlations between oblique first- or second-order factors. With respect to reasoning, Spearman (1923) considered inductive and deductive reasoning to be forms of syllogisms. Although Spearman (1927) did not exclude the option of a reasoning-specific group factor besides g, performance on reasoning measures was assumed to be primarily limited by mental energy, or g.

The controversy around Spearman’s theory was initially focused on statistical and methodological issues, and it was in the context of new statistical developments that Thurstone contributed his theory of primary mental abilities. Thurstone’s initial work on the structure of intelligence (1938) was substantially modified and improved by Thurstone and Thurstone (1941). In the later work, the primary factors of Space, Number, Verbal Comprehension, Verbal Fluency, Memory, Perceptual Speed, and Reasoning are distinguished. The initial distinction between inductive and deductive reasoning was abandoned, and the associated variances were allocated to Reasoning, Verbal Comprehension, Number, and Space. The Reasoning factor is marked mostly by inductive tasks. Several of the other factors have substantial loadings from reasoning tasks. In a sample of eighth-grade students, the Reasoning factor is the factor with the highest loading on a second-order factor. Further elaboration of deductive measures by creating better indicators, as suggested by the Thurstones, was attempted only by the research groups surrounding Colberg (Colberg, Nester, & Cormier, 1982; Colberg, Nester, & Trattner, 1985) and Guilford.

Guilford’s contribution to the measurement of reasoning ability is mostly in constructing and
[Figure not recoverable from the extraction. Surviving node labels: Categor. Syllog.; Quantit. Tasks; Sequ. Reason.; Quantitat.; Linear Syllog.; Multiple Exempl.; Rule Discover; Odd Elements.]
approach the problems in the same way. Some individuals are more successful than others because they have “more” of the required ability. Consequently, it is implicitly assumed that individuals at the very top of the ability distribution proceed roughly in the same way through a reasoning test as individuals at the very bottom of the distribution. If a subgroup of participants chooses a different approach to work on a given test, the consequence is that the test is measuring different abilities for different subgroups. For syllogistic reasoning, it is known that there are two or three subgroups of individuals who approach syllogistic reasoning tests differently. Depending on which strategy is chosen, different items are easy and hard, respectively (Ford, 1995). Knowledge about strategies in reasoning is limited (but see Schaeken, de Vooght, Vandierendonck, & d’Ydewalle, 2000), and the role of strategies in established reasoning measures has been barely investigated.

The actual reasoning tasks that have been used in experimental investigations of reasoning processes and psychometric studies of reasoning ability have little to no overlap in surface features. However, there is now good evidence (Stanovich, 1999) that reasoning problems, as they have been used in cognitive psychology, are moderately correlated with reasoning measures as they have been used in individual-differences research. The experimentally used tasks have been thoroughly investigated, and we now know a lot about the ongoing thought processes involved in these tasks. One important conclusion from this research is that the instantiations of reasoning problems are appropriate to elicit the intended reasoning processes for the most part (Shafir & Le Boeuf, 2002; Stanovich, 1999). However, there are pervasive reliability issues because frequently, only a few such problems are used in any given experiment. Conversely, we do not know a lot about ongoing thought processes in established measures of reasoning ability as used in psychometric research. However, we do know a lot about their structure (Carroll, 1993), their relations with other measures of maximal behavior (Carroll, 1993; Jäger et al., 1997; Kyllonen & Christal, 1990), and their validity for the prediction of real-life criteria (Schmidt & Hunter, 1998). Both sets of reasoning tasks can and should be used when studying reasoning ability. The benefits would be mutual. For example, differences in correlations between various individual reasoning items as used in cognitive research and latent variables from reasoning ability tests might reveal important differences between the experimental tasks. Similarly, variability in the difficulties of items from standard psychometric reasoning tests can possibly be explained by application of various theories of reasoning processes, such as the mental model theory that was sketched above.

EMPIRICAL CLASSIFICATIONS OF REASONING MEASURES

In psychology, inductive reasoning has frequently been equated with proceeding from specific premises to general conclusions. Conversely, deductive reasoning has frequently been equated with proceeding from general premises to specific conclusions. This definition can still be found in textbooks, but it is outdated. There are inductive arguments proceeding from general premises to specific conclusions, and there are deductive arguments proceeding from specific premises to general conclusions. For example, the argument “Almost all Swedes are blond. Jan is a Swede. Therefore Jan is blond.” is an inductive argument that violates the above definition, and the argument “Jan is a Swede. Jan is blond. Therefore some Swedes are blond.” is a deductive argument that also violates the above definition.

According to Colberg et al. (1982), most established reasoning tests confound the direction of inference (general or specific premises and general or specific conclusions) with deductive and inductive reasoning tasks. By constructing specific deductive and inductive reasoning tasks (Colberg et al., 1985), they present correlational evidence that seems to support the unity of inductive and deductive reasoning tasks. However, reliability of the measures is very low; the applied method of disattenuating correlations is not satisfying; and, most important, Shye (1988) reclassifies their tasks and finds support for a distinction between rule-inferring and rule-applying tasks (see Chapter 18, this volume). In the initial classification and
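Since the argument above turns on low reliabilities and disattenuated correlations, a sketch of the classical correction for attenuation may help. It uses the standard textbook formula (observed correlation divided by the square root of the product of the two reliabilities); this is not necessarily the procedure Colberg et al. (1985) applied, and the numeric values are invented for illustration.

```python
import math

def disattenuate(r_xy, rel_x, rel_y):
    """Classical correction for attenuation: r_xy / sqrt(rel_x * rel_y)."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)

# Hypothetical values: a modest observed correlation between two
# scales with low reliabilities.
observed = 0.45
corrected = disattenuate(observed, rel_x=0.40, rel_y=0.45)
print(round(corrected, 2))
```

With reliabilities this low, the corrected estimate exceeds 1 and is very sensitive to small errors in the reliability estimates, which is one reason to treat apparent unity of two constructs under such conditions with caution.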
intelligence. Finally, deductive reasoning appears 8 times, with an average loading of .70 on a factor labeled 2H, reflecting a mixture of fluid and crystallized intelligence. Induction, on the other hand, appeared only twice, with an average loading of .41.

Given these considerations, the proposal of reasoning ability as being composed of inductive, deductive, and quantitative reasoning is competing with a proposal of verbal, figural-spatial, and quantitative reasoning. To investigate possible structures of reasoning ability, one should include tasks that allow for comparison between several competing theories. There are basically five theories competing as explanations for the structure of reasoning ability.

1. a general reasoning factor accounting for the communality of reasoning tasks varying with respect to content (verbal, quantitative, figural-spatial) and operation (inductive, deductive);
2. two correlated factors for inductive and deductive reasoning, respectively, without the specification of any content factors;
3. three correlated factors for verbal, quantitative, and figural-spatial reasoning, without distinguishing inductive and deductive reasoning processes;
4. a general reasoning factor along with nested and completely orthogonal factors for verbal and quantitative reasoning but no figural-spatial factor; and
5. two correlated factors for inductive and deductive reasoning along with completely orthogonal content factors for verbal and quantitative reasoning and again no figural-spatial factor.

For the evaluation of these models, it is important to avoid a confound between content and process on the task level. A second crucial aspect for exploring the structure of reasoning ability is to select appropriate tasks to measure the intended constructs. This is particularly hard in the domain of deductive reasoning. Following the above-presented definition of inductive and deductive reasoning, it is very difficult to find adequate measures of figural-spatial deductive reasoning. In fact, only 7 of all the tasks described in Carroll (1993) can be classified as deductive figural-spatial tasks. However, these tasks frequently represent a mixture with other demands. For example, “ship-destination” has quantitative demands; “match problems,” “plotting,” and “route planning” have visualization demands. In classifying 90 German intelligence tasks, Wilhelm (2000) could not find a single deductive figural-spatial measure.

To test the structure of reasoning ability, Wilhelm (2000) selected reasoning measures based on their cognitive demands and the content involved. In addressing the above-mentioned criticisms of existent reasoning tasks, several reasoning tasks were newly constructed. The following 12 measures were included in the study (D and I denote deductive and inductive reasoning; F, N, and V stand for figural, numerical, and verbal content, respectively).

DF1 (Electric Circuits): Positive and negative signals travel through various switches. The resulting signal has to be indicated. The number and kind of switches and the number of signals are varied (Gitomer, 1988; Kyllonen & Stephens, 1990).

DF2 (Spatial Relations): Spatial orientation of symbols is presented pairwise. The spatial orientation of two symbols that were not presented together can be derived from the pairwise presentations (Byrne & Johnson-Laird, 1989).

DN1 (Solving Equations): A series of equations is presented. Participants can derive values of variables deductively. Items vary by the number of variables and the difficulty of relation. A difficult sample item is “A plus B is C plus D. B plus C is 2*A. A plus D is 2*B. A + B is 11. A + C is 9.”

DN2 (Arithmetic Reasoning): Participants provide free responses to short verbally stated arithmetic problems from a real-life context.

DV1 (Propositions): Acts of a hypothetical machine are described, and the correct conclusion has to be deduced. The number of mental models, logical relation, and negation are varied in this multiple-choice test (Wilhelm & McKnight, 2002). A simple sample item is as follows: “If the lever moves and the valve closes, then the interrupter is switched. The lever moves. The valve closes.”

DV2 (Syllogisms): Verbally phrased quantitative premises are presented in which the number of
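The DN1 sample item in the list above can be checked mechanically. The sketch below searches small non-negative integers for values of A, B, C, and D that satisfy all five equations. The search range (0 to 20) and the function name `solve_dn1_item` are assumptions made for illustration; participants in the test are of course expected to derive the values by reasoning, not by search.

```python
from itertools import product

def solve_dn1_item(max_value=20):
    """Brute-force search for integer solutions of the DN1 sample item."""
    solutions = []
    for a, b, c, d in product(range(max_value + 1), repeat=4):
        if (a + b == c + d          # A plus B is C plus D
                and b + c == 2 * a  # B plus C is 2*A
                and a + d == 2 * b  # A plus D is 2*B
                and a + b == 11     # A + B is 11
                and a + c == 9):    # A + C is 9
            solutions.append({"A": a, "B": b, "C": c, "D": d})
    return solutions

solutions = solve_dn1_item()
print(solutions)
```

The same result follows by substitution: the last two equations give B = 11 - A and C = 9 - A; inserting both into "B plus C is 2*A" yields A = 5, hence B = 6 and C = 4; "A plus D is 2*B" then gives D = 7, and the first equation holds as a check (11 = 11).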
Table 21.1 Fit Statistics of Five Competing Structural Explanations of Reasoning Ability
Note: Ind. Ded. = inductive and deductive; Cont. = contents; CFI = comparative fit index; RMSEA = root mean square error
of approximation; BIC = Bayesian information criterion; CAIC = consistent Akaike’s information criterion.
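For readers unfamiliar with two of the indices named in the note, the following sketch computes RMSEA and CFI from chi-square statistics using their common textbook definitions. All numbers are hypothetical and are not the values of Table 21.1; software packages differ in details such as whether N or N - 1 enters the RMSEA denominator.

```python
import math

def rmsea(chi2, df, n):
    """Root mean square error of approximation (N - 1 variant)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def cfi(chi2_model, df_model, chi2_null, df_null):
    """Comparative fit index relative to a baseline (null) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_null = max(chi2_null - df_null, d_model)
    return 1.0 if d_null == 0 else 1.0 - d_model / d_null

# Hypothetical statistics for a model with 12 indicators and N = 300.
example_rmsea = rmsea(chi2=120.0, df=51, n=300)
example_cfi = cfi(120.0, 51, 1500.0, 66)
print(round(example_rmsea, 3), round(example_cfi, 3))
```

Both indices reward models whose chi-square is close to its degrees of freedom, so a more parsimonious model with a similar chi-square will look better on them than a less constrained competitor.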
general factor model is the better explanation of the data because it is more parsimonious than the two-factor model. However, neither model provides acceptable fit.

A model specifying three correlated group factors for content does substantially better in explaining the data. Although there is still room to improve fit, the model represents an acceptable explanation of the data. Given that the model is completely derived from theory, it can serve as a good starting point for future investigations. Comparing the two models with completely orthogonal content factors again demonstrates the superiority of the model that postulates the unity of inductive and deductive reasoning. In this data set, inductive and deductive reasoning are perfectly correlated. Introducing a distinction between both factors is unnecessary and consequently does not improve model fit. Both models are substantially better than the initial one- and two-factor models. However, one of the loadings on the verbal factor is not significant and negative in sign. Given this departure from the theoretical expectation of positive and significant loadings, and keeping in mind interpretative issues with group factors in nested factor models (see Chapter 14, this volume), the best solution seems to be accepting the model based on the content factors. In this model, there are three content-related reasoning factors, each one of them subsuming inductive and deductive reasoning tasks. In the current study, the model with correlated group factors is equivalent to a second-order factor model. In this model, the correlations between factors are captured by a higher-order factor. This model is presented in Figure 21.2. The two content factors (Verbal and Quantitative Reasoning) reflect deductive and inductive reasoning with verbal and quantitative material, respectively. Due to the relevance of task content, it can be expected that the Verbal and the Quantitative Reasoning factors predict different aspects of criteria such as school grades, achievement, and the like. The loading of the Figural Reasoning factor on fluid intelligence is freely estimated to be 1. Not only are g and Gf very highly or perfectly correlated (Gustafsson, 1983), but the same is true between figural-spatial reasoning and fluid intelligence. Consequently, the current analysis extends Undheim and Gustafsson’s (1987) work to a lower stratum. It is a replicated finding that Gf is the Stratum 2 factor with the highest loading on g (Carroll, 1993). It has also been argued that this relation might be perfect (Gustafsson, 1983; Undheim & Gustafsson, 1987, but see Chapter 18, this volume). Figural-spatial reasoning, in turn, has the highest loading on fluid intelligence, and in the data presented in this chapter, the relation between figural-spatial reasoning and the factor labeled fluid intelligence is perfect. Hence, if we do want to measure g with a single task, we should select a task of figural-spatial reasoning. Matrices tasks have been considered particularly good measures of Gf and g. Spearman (1938) suggested the Matrices test from Penrose and Raven (1936), as well as the inductive figural measure from Line (1931), as the single best indicators of g. The latter test is less prominent than the Matrices test, but variants of it can be found in various intelligence
tests. Although it is not good practice to measure rather general constructs with single tasks, there is certainly evidence suggesting that, if need be, this sole task should be a figural-spatial reasoning measure. Whether such a task is classified as inductive or deductive is not important for that purpose.

Frequently, the composition of intelligence batteries is not well balanced in the sense that there are many indicators for one intelligence construct but few or no tests for other intelligence constructs. In such cases (e.g., Roberts et al., 2000), the overall solution can be dominated by tasks other than fluid intelligence tasks. As a result, figural-spatial reasoning tasks might not be the best selection in these cases to reflect the g factor of such a battery.

When interpreting the results from this study, it is important to keep in mind that the differences between various models were not that big. With different tasks and different participants, it is possible that different results emerge. The present results are preliminary and in need of replication and extension. The most important result from the study reported above is that in a critical test aimed at assessing a distinction between inductive and deductive reasoning, no such distinction could be found. Latent factors of inductive and deductive reasoning are perfectly correlated in several models. The result of a unity of inductive and deductive reasoning was also obtained with multidimensional scaling, exploratory factor analysis, and tetrad analysis. It is important to note that this result emerged considering the desiderata for future research provided by Carroll (1993, p. 232). Specifically, the present tasks have been selected or constructed based on a careful review of the individual-differences and cognitive literature on the topic, the items were analyzed by latent item response theory, and the scales were analyzed by confirmatory factor analyses. The current tests include several new reasoning measures that are based on and informed through cognitive psychology.

WORKING MEMORY AND REASONING

There have been several attempts to explain reasoning ability in terms of other abilities that are considered more basic and tractable. Specifically, working memory has been proposed as the major limiting factor for human reasoning (Kyllonen & Christal, 1990; Süß, Oberauer, Wittmann, Wilhelm, & Schulze, 2002). The working definition of working memory has been that any task that requires individuals to simultaneously store and process information can be considered a working memory task (Kyllonen & Christal, 1990). This definition has been criticized because it seems to include all reasoning measures. The definition has also been criticized because its notions of “storage” and “processing” are imprecise and fuzzy (see Chapter 22, this volume). A critique of the “working memory = reasoning” hypothesis can also focus on the problem of the reduction of one construct
in need of explanation through another one (Deary, 2001) that is not doing any better. However, this critique is unjustified for several reasons.

1. It is easy to construct and create working memory tasks. Many tasks that satisfy the above definition work in the sense that they correlate highly with other working memory measures, reasoning, Gf, and g. In addition, it is easy and straightforward to manipulate the difficulty of a working memory item by manipulating the storage demand, the process demand, or the time available to do storage, processing, or both. Those manipulations account for a large amount of variance of task difficulty in almost all cases.
2. There is an enormous corpus of research on working memory and processes in working memory in cognitive psychology (Conway, Jarrold, Kane, Miyake, & Towse, in press; Miyake & Shah, 1999). It is fruitful to derive knowledge and hypotheses about individual differences in cognition from this body of research.
3. In the sense of a reduction of working memory to biological substrates, intensive and very productive research has linked working memory functioning to the frontal lobes and investigated the role of various physiological parameters in cognitive functioning (Kane & Engle, 2002; see Chapter 9, this volume, for a review of research linking reasoning to various neuropsychological parameters). Hence, the equation of working memory with reasoning is complemented by relating working memory to the frontal lobes and other characteristics and features of the brain.

The strength of the relation found between latent factors of working memory and reasoning varies substantially, fluctuating between a low of .6 (Engle, 2002; Engle, Tuholski, Laughlin, & Conway, 1999; Kane et al., 2004) and a high of nearly 1 (Kyllonen, 1996). In the discussion of the strength of the relation, several sources that could cause an underestimation or an overestimation should be kept in mind.

1. The relation should be assessed on the level of latent factors because this is the level of major interest when it comes to assessing psychological constructs. There should be more than three indicators of sufficient psychometric quality for each construct to allow an evaluation of the measurement models on both sides.
2. Depending on the task selection and the breadth of the definition of both constructs, the specification of more than one factor on both sides might be necessary (Oberauer, Süß, Wilhelm, & Wittmann, 2003).
3. The definition of constructs and task classes is a difficult issue. Classifying anything as a working memory task that requires simultaneous storage and processing could turn out to be overinclusive. Restricting fluid intelligence to figural-spatial reasoning measures is likely to be underinclusive. The comments on tasks of reasoning ability presented in this chapter, as well as similar comments on what constitutes a good working memory task (see Chapters 5 and 22, this volume), might be a good starting point for definition of task classes.
4. Content variation in the operationalization for both constructs can have an influence on the magnitude of the relation. When assessing reasoning ability, one is well advised to use several tasks with verbal, figural, and quantitative content. The same is true for working memory. This chapter provided some evidence for the content distinction on the reasoning side. Similar evidence for the working memory side is evident in structural models that posit content-specific factors of working memory (Kane et al., 2004; Kyllonen, 1996; Oberauer, Süß, Schulze, Wilhelm, & Wittmann, 2000). Relating working memory tasks of one content with reasoning tasks of another content causes one to underestimate the true relation.
5. A mono-operation bias should be avoided in assessing both constructs. Using only complex span tasks or only dual-tasks to assess working memory functioning does not do justice to the much more general nature of the construct (Oberauer et al., 2000). Task class-specific factors or task-specific strategies might have an effect on the estimated relation.
6. Reasoning measures, like other intelligence tasks, are frequently administered under time constraints. Timed and untimed reasoning
Boole, G. (1847). The mathematical analysis of logic: Being an essay towards a calculus of deductive reasoning. Cambridge, UK: Macmillan, Barclay, and Macmillan.
Byrne, R. M. J., & Johnson-Laird, P. N. (1989). Spatial reasoning. Journal of Memory and Language, 28, 564–575.
Carnap, R. (1971). Logical foundations of probability. Chicago: University of Chicago Press.
Carroll, J. B. (1989). Factor analysis since Spearman: Where do we stand? What do we know? In R. Kanfer, P. L. Ackerman, & R. Cudeck (Eds.), Abilities, motivation, and methodology: The Minnesota symposium on learning and individual differences (Vol. 10, pp. 43–70). Hillsdale, NJ: Lawrence Erlbaum.
Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic studies. Cambridge, MA: Cambridge University Press.
Colberg, M., Nester, M. A., & Cormier, S. M. (1982). Inductive reasoning in psychometrics: A philosophical corrective. Intelligence, 6, 139–164.
Colberg, M., Nester, M. A., & Trattner, M. H. (1985). Convergence of the inductive and deductive models in the measurement of reasoning abilities. Journal of Applied Psychology, 70, 681–694.
Conway, A. R. A., Jarrold, C., Kane, M., Miyake, A., & Towse, J. (in press). Variation in working memory. Oxford, UK: Oxford University Press.
Craik, K. (1943). The nature of explanation. Cambridge, MA: Cambridge University Press.
Deary, I. J. (2001). Human intelligence differences: Towards a combined experimental-differential approach. Trends in Cognitive Science, 5, 164–170.
Ebbinghaus, H. (1895). Über eine neue Methode zur Prüfung geistiger Fähigkeiten und ihre Anwendung bei Schulkindern [On a new method to test mental abilities and its application with schoolchildren]. Zeitschrift für Psychologie und Physiologie der Sinnesorgane, 13, 401–459.
Ekstrom, R. B., French, J. W., & Harman, H. H. (1976). Manual for kit of factor-reference cognitive tests. Princeton, NJ: Educational Testing Service.
Engle, R. W. (2002). Working memory capacity as executive attention. Current Directions in Psychological Science, 11, 19–23.
Engle, R. W., Tuholski, S. W., Laughlin, J. E., & Conway, A. R. A. (1999). Working memory, short-term memory and general fluid intelligence: A latent variable approach. Journal of Experimental Psychology: General, 128, 309–331.
Epstein, S. (1994). Integration of the cognitive and the psychodynamic unconscious. American Psychologist, 49, 709–724.
Evans, J. St. B. T. (1989). Bias in human reasoning: Causes and consequences. Hove, UK: Lawrence Erlbaum.
Ford, M. (1995). Two modes of mental representation and problem solution in syllogistic reasoning. Cognition, 51, 1–71.
Frege, G. (1879). Begriffsschrift: Eine der arithmetischen nachgebildete Formelsprache des reinen Denkens [Begriffsschrift: A formula language modeled upon that of arithmetic, for pure thought]. Halle a.S.: L. Nebert.
Gilinsky, A. S., & Judd, B. B. (1993). Working memory and bias in reasoning across the life span. Psychology and Aging, 9, 356–371.
Gitomer, D. H. (1988). Individual differences in technical troubleshooting. Human Performance, 1, 111–131.
Guilford, J. P. (1956). The structure of intellect. Psychological Bulletin, 53, 267–293.
Guilford, J. P. (1967). The nature of human intelligence. New York: McGraw-Hill.
Guilford, J. P., Christensen, P. R., Kettner, N. W., Green, R. F., & Hertzka, A. F. (1954). A factor analytic study of Navy reasoning tests with the Air Force Aircrew Classification Battery. Educational and Psychological Measurement, 14, 301–325.
Guilford, J. P., Comrey, A. L., Green, R. F., & Christensen, P. R. (1950). A factor-analytic study on reasoning abilities: I. Hypotheses and description of tests. Reports from the Psychological Laboratory, University of Southern California, Los Angeles.
Guilford, J. P., Green, R. F., & Christensen, P. R. (1951). A factor-analytic study on reasoning abilities: II. Administration of tests and analysis of results. Reports from the Psychological Laboratory, University of Southern California, Los Angeles.
Gustafsson, J.-E. (1983). A unifying model for the structure of intellectual abilities. Intelligence, 8, 179–203.
Hammond, K. R. (1996). Human judgment and social policy: Irreducible uncertainty, inevitable error, unavoidable injustice. Oxford, UK: Oxford University Press.
Their nature and measurement (pp. 97–116). Mahwah, NJ: Lawrence Erlbaum.
Magnani, L. (2001). Abduction, reason, and science: Processes of discovery and explanation. Dordrecht, the Netherlands: Kluwer Academic.
McDonald, R. P. (1985). Factor analysis and related methods. Hillsdale, NJ: Lawrence Erlbaum.
Miyake, A., & Shah, P. (1999). Models of working memory: Mechanisms of active maintenance and executive control. New York: Cambridge University Press.
Oberauer, K., Süß, H.-M., Schulze, R., Wilhelm, O., & Wittmann, W. W. (2000). Working memory capacity: Facets of a cognitive ability construct. Personality and Individual Differences, 29, 1017–1045.
Oberauer, K., Süß, H.-M., Wilhelm, O., & Wittmann, W. W. (2003). The multiple faces of working memory: Storage, processing, supervision, and coordination. Intelligence, 31, 167–193.
Penrose, L. S., & Raven, J. C. (1936). A new series of perceptual tests: Preliminary communication. British Journal of Medical Psychology, 16, 97–104.
Rips, L. J. (1994). The psychology of proof: Deductive reasoning in human thinking. Cambridge: MIT Press.
Roberts, R. D., Goff, G. N., Anjoul, F., Kyllonen, P. C., Pallier, G., & Stankov, L. (2000). The Armed Services Vocational Aptitude Battery: Not much more than acculturated learning (Gc)? Learning and Individual Differences, 12, 81–103.
Schaeken, W., de Vooght, G., Vandierendonck, A., & d’Ydewalle, G. (Eds.). (2000). Deductive reasoning and strategies. New York: Lawrence Erlbaum.
Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124, 262–274.
Shafir, E., & Le Boeuf, R. A. (2002). Rationality. Annual Review of Psychology, 53, 491–517.
Shye, S. (1988). Inductive and deductive reasoning: A structural reanalysis of ability tests. Journal of Applied Psychology, 73, 308–311.
Sloman, S. A. (1996). The empirical case for two systems of reasoning. Psychological Bulletin, 119, 3–22.
Spearman, C. (1904). “General intelligence” objectively determined and measured. American Journal of Psychology, 15, 201–293.
Spearman, C. (1923). The nature of ‘intelligence’ and the principles of cognition. London: Macmillan.
Spearman, C. (1927). The abilities of man: Their nature and measurement. New York: AMS.
Spearman, C. (1938). Measurement of intelligence. Scientia, 64, 75–82.
Stanovich, K. E. (1999). Who is rational: Studies of individual differences in reasoning. Mahwah, NJ: Lawrence Erlbaum.
Stegmüller, W. (1996). Das Problem der Induktion: Humes Herausforderung und moderne Antworten [The problem of induction: Hume’s challenge and modern answers]. Darmstadt: Wissenschaftliche Buchgesellschaft.
Stenning, K., & Oberlander, J. (1995). A cognitive theory of graphical and linguistic reasoning: Logic and implementation. Cognitive Science, 19, 97–140.
Sternberg, R. J., & Turner, M. E. (1981). Components of syllogistic reasoning. Acta Psychologica, 47, 245–265.
Störing, G. (1908). Experimentelle Untersuchungen über einfache Schlussprozesse [Experimental studies on simple inference processes]. Archiv für die gesamte Psychologie, 11, 1–27.
Süß, H.-M., Oberauer, K., Wittmann, W. W., Wilhelm, O., & Schulze, R. (2002). Working memory capacity explains reasoning ability—and a little bit more. Intelligence, 30, 261–288.
Thurstone, L. L. (1938). Primary mental abilities. Chicago: University of Chicago Press.
Thurstone, L. L., & Thurstone, T. G. (1941). Factorial studies of intelligence. Chicago: University of Chicago Press.
Undheim, J. O., & Gustafsson, J.-E. (1987). The hierarchical organization of cognitive abilities: Restoring general intelligence through the use of linear structural relations. Multivariate Behavior Research, 22, 149–171.
Wilhelm, O. (2000). Psychologie des schlussfolgernden Denkens: Differentialpsychologische Prüfung von Strukturüberlegungen [Psychology of reasoning: Testing structural theories]. Hamburg: Dr. Kovac.
Wilhelm, O., & Conrad, W. (1998). Entwicklung und Erprobung von Tests zur Erfassung des logischen Denkens [Development and evaluation of deductive reasoning tests]. Diagnostica, 44, 71–83.
Wilhelm, O., & McKnight, P. E. (2002). Ability and achievement testing on the World Wide Web. In