
In: Kokinov, B., Karmiloff-Smith, A., Nersessian, N. J. (eds.) European Perspectives on Cognitive Science.

New Bulgarian University Press, 2011


ISBN 978-954-535-660-5

Cognitive Complexity in Matrix Reasoning Tasks


Marco Ragni (ragni@cognition.uni-freiburg.de)
Department of Cognitive Science
University of Freiburg, Germany

Philip Stahl (stahl@informatik.uni-freiburg.de)


Department of Computer Science
University of Freiburg, Germany

Thomas Fangmeier (fangmeier@cognition.uni-freiburg.de)


Department of Cognitive Science
University of Freiburg, Germany
Abstract
Reasoning difficulty for items in IQ-tests is generally determined empirically: the item difficulty is measured by the number of reasoners who are able to solve the problem. Although this method has proven successful (nearly all IQ-tests are designed this way), it is desirable to have an inherent formal measure reflecting the reasoning complexity involved. In this article, we analyze and classify geometrical analogy reasoning problems. Based on the types of functions necessary to solve these problems, a complexity measure is introduced which reliably captures human reasoning difficulty. Finally, our complexity measure is compared to the empirical difficulty ranking as determined by Cattell's Culture Fair Test and Evans' Analogy problems.
Keywords: Cognitive Complexity; Analogical Reasoning;
Geometric Analogies


Introduction
For the past hundred years, human intelligence has mostly been tested by means of IQ-tests (Binet & Simon, 1905). Geometrical analogy problems (cf. Fig. 1) are part of a number of IQ-tests, for example the Hamburg-Wechsler Intelligence Test (Wechsler, Hardesty, Lauber, & Bondy, 1961). A significant number of IQ-tests even consist exclusively of such geometrical reasoning problems, e.g., Cattell's Culture Fair Test (K. Cattell & Cattell, 1959) or Raven's Standard Progressive Matrices (Raven, Raven, & Court, 2000) and Advanced Progressive Matrices (Raven, 1962). Such problems are sometimes classified as culture fair (R. Cattell, 1968) as they require less declarative knowledge than, for instance, word analogy problems. While success in solving word analogy problems can depend on additional knowledge, geometrical reasoning problems can be modeled using mathematical functions exclusively. For this reason these problems are more accessible in formal terms than other analogy problems. An individual's intelligence is always measured by determining the deviation of his or her performance on a given set of reasoning problems from that of a particular group (specific age, educational status, etc.). Problems in turn are classified empirically as simple or challenging, based on whether a given population is able to solve most or only a limited number of similar problems. While it is possible to empirically capture human reasoning difficulty, it seems more desirable to identify the characteristics of such problems formally. Such a formal characterization may inform future test development and can then form a formal foundation of reasoning complexity. An analysis of the IQ-test problems of Raven has been conducted by Carpenter, Just, and Shell (1990). Figure 1 is an example, variations of which can be found in popular literature (e.g., Eysenck, 1962; Russell & Carter, 1994).

Figure 1: An example of a geometric reasoning problem. The task is to fill in one of the four answers below. The correct solution is the third one. All figures and problems in the following were designed by the authors to protect the security of IQ-tests.
A side effect of such a formal classification is that mental operations or functions must be classified as easier or more difficult for the human reasoner to apply. Another aspect is the creation of new, different reasoning problems with the same reasoning difficulty. A formal classification in turn requires a computational model. This approach, cognition as computation, was introduced by Newell and Simon (1972). Once problems are formally represented and functionally classified, the computational requirements necessary to solve them can be calculated. In this article we put this idea to the test for one of the best analyzed sample sets, namely Cattell's Culture Fair Test (CFT). This test, along with the aforementioned Raven's SPM and APM, is given an empirical difficulty rating in the manual, i.e., the percentage of persons who have correctly solved each problem. Our formal complexity measure is evaluated against these empirical difficulty measures. Another empirical investigation, recently performed for Evans' Analogy problems (Evans, 1964; Lovett, Tomai, Forbus, & Usher, 2009), serves as an additional benchmark. This article is structured as follows: In the next section, we briefly review the literature on cognitive complexity, especially regarding explanations of human reasoning difficulty. In Section 3, we analyze typical reasoning problems and develop a classification. Thereafter, we introduce a functional complexity measure and compare it, in Section 5, to empirical findings.

State-of-the-Art
Solving matrix problems requires the recognition and computation of similarities between the presented matrix objects and their attributes. According to Representational Distortion (RD) theory, the similarity of object representations is defined by the number of basic transformations necessary to transform one representation into another (Chater & Hahn, 1997). Hahn and colleagues defined the complexity of the similarity computation as a special case of Kolmogorov complexity (e.g., Chater, 2000; Li & Vitanyi, 1997). Instead of using the length of the shortest program as a complexity measure, they used transformational similarity, i.e., the length of the sequence of basic transformations (Hahn, Chater, & Richardson, 2003).
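To make the notion of transformational similarity concrete, the following sketch is our own illustration (not the model of Hahn and colleagues): objects are represented as attribute dictionaries, and dissimilarity is counted as the number of attribute changes needed to turn one representation into the other. The attribute names are illustrative assumptions.

# Minimal sketch of transformational similarity: dissimilarity is the number
# of basic transformations (here: attribute changes) between two objects.
# Attribute names (shape, size, fill, rotation) are illustrative assumptions.

def transformational_distance(obj_a, obj_b):
    """Count attribute changes needed to turn obj_a into obj_b."""
    keys = set(obj_a) | set(obj_b)
    return sum(1 for k in keys if obj_a.get(k) != obj_b.get(k))

square  = {"shape": "square", "size": 1, "fill": "white", "rotation": 0}
rotated = {"shape": "square", "size": 1, "fill": "white", "rotation": 45}

print(transformational_distance(square, rotated))  # 1: a single rotation suffices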
The problem with such a formal measure is the tractability constraint (van Rooij, 2008). A transformation between two object representations can be represented as a binary string. The similarity can then be computed by a Boolean circuit, but this computation has super-polynomial complexity (Muller, van Rooij, & Wareham, 2009). Consequently, it would be intractable, and RD models would be psychologically implausible. This motivated Muller and colleagues to argue for an analysis of restricted problem parameters to avoid intractability in classical RD models.
The question of which factors determine the subjective difficulty of concepts has been studied since the 1960s but has not found a sufficient answer (Feldman, 2000). Feldman undertook a series of experiments testing a wide range of Boolean functions with respect to the question: Why are some concepts psychologically simple and easy to learn, while others seem difficult, complex, or incoherent? (Feldman, 2000). The data revealed a surprisingly simple empirical law: the subjective difficulty of a concept is directly proportional to its Boolean complexity. The influence of Boolean complexity has, however, recently been questioned by Vigo (2006).
Evans (1964) wrote a program called ANALOGY to solve geometric analogy problems frequently encountered in intelligence tests. The program can solve problems that can be described as: Figure A is to Figure B as Figure C is to which of the given (answer) figures? The program uses an algorithm which first decomposes each problem figure into objects. Then, it calculates a set of properties for these objects and the relationships between them. Next, the properties and relationships between A and B are compared with the properties and relations between C and the various answer figures. Finally, the answer figure which leads to the most similar set is chosen as the answer (Evans, 1964).
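The pipeline just described can be sketched in a highly simplified form; the following is our own illustration, not Evans' program. Each figure is reduced to a set of properties, the A-to-B change is the symmetric difference of these sets, and the answer whose C-to-answer change best matches it is chosen.

# Simplified sketch of the "A is to B as C is to ?" comparison described above
# (our illustration, not Evans' code). Figures are reduced to property sets.

def change(fig_from, fig_to):
    return fig_from ^ fig_to                 # properties that appear or disappear

def pick_answer(fig_a, fig_b, fig_c, answers):
    target = change(fig_a, fig_b)
    # choose the answer whose change relative to C is closest to the A->B change
    return min(answers, key=lambda ans: len(target ^ change(fig_c, ans)))

A = frozenset({"square", "small"})
B = frozenset({"square", "large"})
C = frozenset({"circle", "small"})
answers = [frozenset({"circle", "large"}), frozenset({"triangle", "small"})]
print(pick_answer(A, B, C, answers))         # the enlarged circle is chosen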

Classification of Problems
We will now characterize the problems by various properties. The first and simplest distinction is made according to characteristic changes of functions. It is possible to extract functions for object changes by a horizontal and vertical analysis of this sort of problem. Fig. 2 depicts an example of a horizontal rotation transformation of a rectangle. Objects can be decomposed into simple squares, triangles, dots, or lines. As shown in Fig. 2, all objects consist of triangles, rectangles, or hexagons.
The second category, topological characteristic changes, specifies the relationships between certain shapes or objects. Topological characteristics are mathematically classified by determining whether two objects are (i) distinct, (ii) overlapping, or (iii) contained within one another.
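As a concrete illustration (our own, using circles as the simplest case), these three relations can be decided from the distance between the centers and the radii:

# Sketch: classify the topological relation of two circles as
# "distinct", "overlapping", or "contained" (illustrative example only).
import math

def topological_relation(c1, r1, c2, r2):
    d = math.dist(c1, c2)                    # distance between the centers
    if d >= r1 + r2:
        return "distinct"                    # no common area
    if d + min(r1, r2) <= max(r1, r2):
        return "contained"                   # one circle lies inside the other
    return "overlapping"

print(topological_relation((0, 0), 2, (5, 0), 2))   # distinct
print(topological_relation((0, 0), 2, (1, 0), 2))   # overlapping
print(topological_relation((0, 0), 3, (1, 0), 1))   # contained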
Category three specifies changes of pattern. It is determined whether there are any transformations concerning the shapes of objects, that is, whether an object (e.g., a circle) is transformed into another object; e.g., in Fig. 6 the square in the upper left cell can be associated with the triangle in the lower left cell.
The fourth distinction concerns the alteration of the number of elements in a problem. Take, for example, Fig. 3, where the number of objects increases. In other words, if there is an implicit sequence of numbers, it must be determined whether the number of objects increases or decreases from cell to cell and whether this change happens horizontally, vertically, or in both directions.
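Such a check for a numerical progression can be made explicit; the sketch below is our own simplification and only inspects the fully known rows and columns of a 3 x 3 problem.

# Sketch (our own simplification): given the number of objects in each cell of a
# 3 x 3 matrix, with the answer cell still unknown, report whether the counts
# increase or decrease along the fully known rows and columns.

def direction(triple):
    if triple[0] < triple[1] < triple[2]:
        return "increasing"
    if triple[0] > triple[1] > triple[2]:
        return "decreasing"
    return None

def count_change(counts):
    rows = {direction(row) for row in counts[:2]}                            # first two rows are complete
    cols = {direction([counts[r][c] for r in range(3)]) for c in range(2)}   # first two columns are complete
    return {"horizontal": rows.pop() if len(rows) == 1 else None,
            "vertical":   cols.pop() if len(cols) == 1 else None}

# In the spirit of Fig. 3, one object is added from cell to cell; the last
# entry is the answer cell and therefore unknown.
counts = [[1, 2, 3],
          [2, 3, 4],
          [3, 4, None]]
print(count_change(counts))   # {'horizontal': 'increasing', 'vertical': 'increasing'}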
A further distinction concerns the characteristics of different shapes. It has to be analyzed whether lines, dots, dashes, or other characteristics act in compliance with Boolean functions such as AND, OR, and XOR. Category 6, number of parts, characterizes how many parts the objects are composed of.
The following categories, horizontal function, vertical function, and horizontal-vertical function, classify the problems according to the horizontal/vertical succession of object changes. This refers to whether the objects in the cells depend on addition and/or subtraction or only on succession. One aspect is rotation, indicating whether at least one of the objects is rotated between cells. Another aspect takes moving objects into account; it applies if at least one object moves (e.g., clockwise or counter-clockwise, up, down, left, or right). Finally, the overall dimension characterizes whether the underlying problem can be solved horizontally or vertically alone (one-dimensionally) or requires both (two-dimensionally).

Figure 2: An example of a problem in which the functions of change can be extracted horizontally.

General Categories
The classification items for matrix reasoning problems, as
discussed above, can be grouped into three main types, which
we denote as (i) geometric operation problems, (ii) Boolean
operation problems, and (iii) grouping problems.

Figure 3: An example of a problem in which objects are added vertically and horizontally.

Type 1: Geometric Operation Problems. Problems of this type can be solved by the application of geometric functions which have to be applied to each successive cell (horizontally, vertically, or in both directions). Examples of such functions are:
- Continuous changes by affine transformations, for example rotations (e.g., the rectangle is rotated in Fig. 2) or changes in size (scaling).
- Addition or removal of objects. An example is the star in the leftmost column in Fig. 3; in each successive cell an additional star is inserted (linear).
- Movement of objects (e.g., the triangle and the hexagon in Fig. 2).
Type 2: Boolean Operation Problems. To solve problems of this type it is necessary to consider information from at least two cells in relation to a resulting cell. An example of such an underlying function is the Boolean function XOR (see Fig. 1 and Fig. 7).
Type 3: Grouping Problems. This type of problem requires the identification of groups of objects. The pattern changes cannot be identified solely by examining the horizontal direction, the vertical direction, or both. One must consider all cells in order to group features and figure out what is absent or does not occur as frequently as other objects. An example of such a problem is Fig. 4. There are three groups of objects consisting of L, N, and X with differences in color, size, and shape. The reasoner has to identify the missing object and its characteristic property.

Figure 4: An example of a grouping problem. The reasoner's task is to identify three groups of objects.

Mathematical Foundation
In theoretical computer science the difficulty of a problem is determined with respect to the resources necessary to solve it. A central aspect is the use of computational models (e.g., the Turing machine) to determine the amount of resources necessary to solve a problem. Central to matrix problems are functions mapping one cell to another cell. For this reason, we introduce a complexity measure based on functions, which we correlate, in the next section, with empirical findings.

Mappings between matrices


Most common geometric transformations are linear, including rotations, scaling, shearing, and reflection. In two dimensions, linear transformations can be represented by a 2 × 2 transformation matrix. These transformations (rotation, scaling, i.e., enlarging or shrinking, etc.) can, along with other basic mappings, be defined as basic transformations p_i. The cost for a single problem is computed by summing up all transformations which must be applied and the associations which must be built between the stimuli.

The complexity of a single association depends on the number of stimuli in each field and the number of transformations between the associated stimuli. See Fig. 2 for an example of computing the costs for rotating the rectangle.

The costs of a basic transformation can be weighted according to the different types of transformation. For example, a translation of a stimulus might be easier to apply than a rotation, so the translation gets a basic cost of 1 unit and the rotation a cost of 2.

For the problems listed in Table 1 nine basic transformations were used: the identity function (with a cost of 0), translation (1), scaling (1), change of the fill-in/background of a stimulus (1), addition and removal of a stimulus (1 each), rotation (2), rotation of the fill-in/background of a stimulus (3), and a change of the type of a stimulus (for example, transforming a triangle into a square; 4). Without weights, all transformations receive a basic cost of 1.
Figure 6: An example of a more complex transformation
problem. The reasoner has to keep track of changes both horizontally and vertically.

Figure 5: A simple transformation problem, as it requires only the horizontal identification of each association between the objects.
To solve the problem in Fig. 5, three horizontal associations are required. The different types of the stimuli (line, rectangle, triangle) make it easy to identify the associations. The building of associations follows a hierarchy based on the attributes of a stimulus. In this problem the preferred direction is horizontal, by associating the rectangle in field one with the rectangle rotated by 45 degrees in the field to the right, and not vertical, where the rectangle could be associated with the triangle below, which has the same degree of rotation and differs only in type.

The costs of the three associations are computed by summing up the costs of the different transformations between the members of each association (cell 1 compared with cell 2, cell 2 compared with cell 3, and finally cell 1 compared with cell 3). In this case each association gets a cost of 2 (1 without weights) for applying one rotation. To determine the stimulus in the answer box, one additional 45-degree rotation must be applied to the lower central triangle. The total costs for this problem are six (costs for all associations) plus two (costs for the applied rotation), making eight.
The problem in Fig. 6 shows the difficulty of correct association building. Since each field contains two stimuli, the stimuli have to be compared with each other to identify the associations. Taking, for example, the white square in the first field and proceeding horizontally, we have to decide whether to associate it with the black square or with the shaded rectangle. Therefore the attributes of both stimuli have to be compared with those of the white square. Again, the hierarchy of attributes plays an important role: which stimuli are associated with each other depends on which share the highest-ranked attribute, not on which share the most attributes. The number of comparisons which have to be performed to identify an association is added to the transformation costs of that association.

Once the stimuli are associated with each other, the transformations between them have to be applied to the unassociated stimuli (in this case the white triangle and the shaded rectangle) to determine the stimuli in the answer field. There it is necessary to find the right pairs of stimuli, so that the white triangle changes its color to black and not the shaded circle. Overall, it becomes obvious that the difficulty in this problem arises from the complex relations between the stimuli, and these relations can be described as transformations and comparisons on the attributes of the stimuli.

Complexity Measure (Type 1 Problems)


We now introduce a more formal notation of this complexity measure. Here p_i represents the basic transformation mapping an object o_n(i) in cell i to o_n(j) in cell j. The unweighted difference between cell i and cell j is then the sum over all basic transformations. The weighted difference is obtained by adding the weight a_i of each basic function p_i, i.e., by forming the sum of the weights over all basic functions p_i:

    Σ_{p_i} a_i                                                        (1)

Complexity Measure (Type 2 Problems)


Is it possible to apply the same cognitive complexity functions used for Type 1 problems to problems involving Boolean functions?

Table 1: Evaluation of our cognitive complexity measure for a sample of CFT3 problems. Index P denotes the empirical measure from the manual (Weiss, 1971) and gives the mean correctness in percent. Cw are the cognitive complexity costs with weights and C0 are the costs without weights.

Problem          Index P    Cw    C0
CFT3-A-1-3-1        76       8     7
CFT3-A-1-3-6        93      13    10
CFT3-A-1-3-7        78      18    10
CFT3-A-2-1-1        96       5     3
CFT3-A-2-1-3        95       6     3
CFT3-A-2-1-4        90       4     4
CFT3-A-2-3-2        94      10     6
CFT3-A-2-3-3        90       4     4
CFT3-A-2-3-5        47      26    18
CFT3-A-2-3-6        85      21    17

Figure 7: An example of a simple XOR problem.


A Boolean function like the exclusive or (XOR for short) can be represented by a truth table:
P   Q   P XOR Q
0   0      0
0   1      1
1   0      1
1   1      0

The solution of the problem in Fig. 7 requires the identification of a Boolean XOR function. If the middle cell in the leftmost column (identified by the matrix coordinates (1, 2)) is compared with the middle cell of the same row (2, 2), and both are compared with the middle cell in the rightmost column (3, 2), one can deduce that if the same object (the triangle) is found in both the left and the middle cell, it is not found in the rightmost cell (so it will be deleted). And whenever an object appears only in the leftmost or the middle column (e.g., the star in the same row), it is found in the rightmost cell. These kinds of functions are certainly not as easily recognizable as rotations or scaling and must be defined as such. Feldman (2000) argues that the number of combinations of Boolean functions determines the reasoning difficulty, a position which has recently been questioned (Vigo, 2006; Goodwin, 2006).

Evaluating the Complexity Measure


We compare the performance of our complexity measure to experimental findings with human subjects. First, the complexity measure is able to explain the reasoning difficulty of Type 1 problems. In particular, rotations, scaling, movements, position changes, and linear transformations of objects can be well described with our complexity measure. The empirical classification of the CFT problems (Weiss, 1971) can be seen in Table 1. The correlation of this complexity function with the empirical data is significant in both cases: with weights, r = -.74, p < .01, and without weights, r = -.72, p < .05.
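For transparency, the weighted correlation can be recomputed directly from the values in Table 1; the sketch below assumes that scipy is available.

# Recompute the correlation between the empirical index P and the weighted
# complexity costs Cw from Table 1 (scipy assumed to be installed).
from scipy.stats import pearsonr

index_p = [76, 93, 78, 96, 95, 90, 94, 90, 47, 85]   # mean correctness in percent
c_w     = [ 8, 13, 18,  5,  6,  4, 10,  4, 26, 21]   # weighted complexity costs

r, p = pearsonr(index_p, c_w)
print(round(r, 2), p)   # roughly -0.74: harder problems carry higher costs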

Table 2: Evaluation of our cognitive complexity measure for problems from Lovett et al. (2009). Nr. = problem number, Correct = correctness in percent, Time = mean solving time in seconds, C0 = complexity measure (cf. sum in Equation 1).

Nr.   Correct   Time   C0      Nr.   Correct   Time   C0
 1      100     10.8    2      11      100      5.8    1
 2      100      7.3    1      12       97      4.5    1
 3      100      6.7    1      13       91      5.7    2
 4       76      8.7    1      14       94     12.3    1
 5      100     13.4    1      15       97      6.4    1
 6       97      8.5    1      16      100      5.4    1
 7      100      6.1    1      17       56     26.7    3
 8      100      6.1    1      18       91     10.5    1
 9      100      6.0    1      19       82      6.1    1
10       82     23.6    2      20       97     10.8    2

Problems from Evans (1964) were recently examined empirically by Lovett and colleagues (2009). The comparison with our measure is shown in Table 2. The correlation between the correctness scores and our complexity measure is r = -.62, p < .01; the correlation is negative because the smaller our complexity measure is, the easier the problem is. The correlation with the solution time is r = .67, p < .001.

General Discussion
The presented cognitive complexity measure for matrix problems allows the classification of problems with respect to the number of basic functions needed to solve them. The measure is based on abstract units that might have a cognitive counterpart; in any case, it can predict problem difficulty for Type 1 problems and provides a framework general enough for Type 2 and Type 3 problems. What are the differences from existing measures? Our approach is a combination of techniques from theoretical computer science with ideas going back to Evans' ANALOGY program (Evans, 1964). It integrates ideas from Kolmogorov complexity (Chater, 2000; Li & Vitanyi, 1997), as it determines complexity based on the minimum number of functions necessary to describe the reasoning difference.

Feature extraction and object recognition are basic operations required for solving matrix problems. Humans are amazingly accurate and fast at this. However, it is still unknown how humans decide whether to decompose the images of matrix sequences into smaller compounds or to treat these images as single objects. Representational Distortion theory can help answer this question by providing a measure for total dissimilarity. Humans might compute these decompositions only for objects that are at least somewhat similar (Muller et al., 2009). The task of feature extraction is difficult because the exact number and set of features that must be extracted is not known in advance. Furthermore, the salient features important for the transformation vary heavily from problem to problem. The object transformations result in feature differences that can be classified into two categories. First, objects can change their visibility during the sequence, i.e., they either exist in a matrix cell or not. Second, geometric transformations result in changes of position, rotation, and scale. Objects for which no transformation can be computed must be treated as separate objects.

Nevertheless, humans do not simply check the cells line by line or cell by cell; they can easily shift their reference frame from global to local features and back. One can assume that for an overall first impression the whole problem is inspected globally, which triggers the next step. This first phase (inspection phase) gives a hint as to how to proceed with the problem and which functions should be chosen in order to solve it successfully. A simple local feature extraction cell by cell can, in many cases, lead to the solution, especially for Type 1 problems (see Fig. 3). However, this strategy fails for Type 3 problems. For this type it is more efficient to examine the problem more globally to discover the required features (see Fig. 4). While Feldman's (2000) approach focuses on which (Boolean) functions to use, we concentrate on the number of function applications. Both approaches together may form a more complete notion of cognitive complexity.

Acknowledgement
This research was partially supported by the DFG (German Research Foundation) in the Transregional Collaborative Research Center SFB/TR 8 within project R8-[CSPACE] and by a Steinbuch stipend to the second author.

References
Binet, A., & Simon, T. (1905). The development of intelligence in children. L'Année Psychologique, 12, 191-244.
Carpenter, P. A., Just, M. A., & Shell, P. (1990). What one intelligence test measures: A theoretical account of the processing in the Raven Progressive Matrices Test. Psychological Review, 97(3), 404-431.
Cattell, K., & Cattell, A. (1959). Culture Fair Intelligence Test. Institute for Personality and Ability Testing.
Cattell, R. (1968). Are IQ tests intelligent? Psychology Today, 1(10), 56-62.
Chater, N. (2000). The logic of human learning. Nature, 407, 572-573.
Chater, N., & Hahn, U. (1997). Representational distortion, similarity and the universal law of generalization. In Proceedings of the Similarity and Categorization Workshop '97 (pp. 31-36). University of Edinburgh.
Evans, T. G. (1964). A heuristic program to solve geometric-analogy problems. Air Force Cambridge Research Laboratories (OAR), Bedford, Massachusetts.
Eysenck, H. J. (1962). Know your own I.Q. Harmondsworth, Middlesex: Penguin.
Feldman, J. (2000). Minimization of Boolean complexity in human concept learning. Nature, 407, 630-633.
Goodwin, G. P. (2006). How individuals learn simple Boolean systems and diagnose their faults. Princeton University.
Hahn, U., Chater, N., & Richardson, L. B. (2003). Similarity as transformation. Cognition, 87(1), 1-32.
Li, M., & Vitanyi, P. (1997). An introduction to Kolmogorov complexity and its applications. Springer.
Lovett, A., Tomai, E., Forbus, K., & Usher, J. (2009). Solving geometric analogy problems through two-stage analogical mapping. Cognitive Science, 33(7), 1192-1231.
Muller, M., van Rooij, I., & Wareham, T. (2009). Similarity as tractable transformation. In N. Taatgen & H. van Rijn (Eds.), Proceedings of the 31st Annual Meeting of the Cognitive Science Society (pp. 49-55).
Newell, A., & Simon, H. A. (1972). Human problem solving. Englewood Cliffs, NJ: Prentice-Hall.
Raven, J. C. (1962). Advanced Progressive Matrices, Set II. London: H. K. Lewis (distributed in the United States by The Psychological Corporation, San Antonio, TX).
Raven, J. C., Raven, J. C., & Court, J. H. (2000). Manual for Raven's Progressive Matrices and Vocabulary Scales. Oxford: Oxford Psychologists Press.
Russell, K., & Carter, P. J. (1994). IQ Firepower. London: Robinson Publishing.
van Rooij, I. (2008). The tractable cognition thesis. Cognitive Science, 32, 939-984.
Vigo, R. (2006). A note on the complexity of Boolean concepts. Journal of Mathematical Psychology, 50(5), 501-510.
Wechsler, D., Hardesty, A., Lauber, H., & Bondy, C. (1961). Die Messung der Intelligenz Erwachsener: Textband zum Hamburg-Wechsler-Intelligenztest für Erwachsene (HAWIE). H. Huber.
Weiss, R. H. (1971). Grundintelligenztest Skala 3 (CFT 3). Stuttgart.
