A Defense of Platonic Realism in Mathematics

The conflict between Platonic realism and Constructivism marks a watershed in philosophy
of mathematics. Among other things, the controversy over the Axiom of Choice is typical
of the conflict. Platonists accept the Axiom of Choice, which allows a set consisting of
the members resulting from infinitely many arbitrary choices, while Constructivists reject
the Axiom of Choice and confine themselves to sets consisting of effectively specifiable
members. Indeed there are seemingly unpleasant consequences of the Axiom of Choice.
The non-constructive nature of the Axiom of Choice leads to the existence of non-Lebesgue
measurable sets, which in turn yields the Banach-Tarski Paradox. But the Banach-Tarski
Paradox is so called in the sense that it is a counter-intuitive theorem. To corroborate my
view that mathematical truths are of non-constructive nature, I shall draw upon Gödel's
Incompleteness Theorems. This also shows the limitations inherent in formal methods.
Indeed the Löwenheim-Skolem Theorem and the Skolem Paradox seem to pose a threat to
Platonists. In this light, Quine's and Putnam's arguments come to take on a clear meaning.
According to the model-theoretic arguments, the Axiom of Choice depends for its
truth-value upon the model in which it is placed. In my view, however, this is another
limitation inherent in formal methods, not a defect for Platonists. To see this, we shall
examine how mathematical models have been developed in the actual practice of
mathematics. I argue that most mathematicians accept the Axiom of Choice because the
existence of non-Lebesgue measurable sets and the Well-Ordering of reals open the
possibility of more fruitful mathematics. Finally, after responding to Benacerraf's
challenge to Platonism, I conclude that in mathematics, as distinct from natural sciences,
there is a close connection between essence and existence. Actual mathematical theories
are the parts of the maximally logically consistent theory that describes mathematical
reality.



A fundamental problem of philosophy of mathematics boils down to the conflict between
Platonic realism and Constructivism, a conflict that marks a watershed in the field.
By Platonic realism I mean the philosophical view that
posits mathematical entities, such as numbers, sets, functions and so on, as
super-spatio-temporal ones. Indeed, owing to this view, mathematical knowledge was
extended further and further. At the turn of the last century, however, a variety of
paradoxes, such as Russell's Paradox, were discovered by mathematicians and logicians in
the wake of the attempts to base the whole of mathematics on set theory, and gave rise to
the so-called crisis in the foundations of mathematics.
Against Platonic realism, there arose an anti-realistic doctrine called
Constructivism. Constructivism avoids positing mathematical entities dogmatically and
restricts them to those that are legitimately constructible in space and time. But this view
conceals in itself the danger that we have to pay a high price: the sacrifice of many
productive results of classical mathematics. This is the reason why philosophers of
mathematics take pains to seek some middle ground between the two extreme camps. At
this point the problem of how to deal with the Law of Excluded Middle, impredicative
definition, the Axiom of Choice, actual or potential infinity and so on becomes a
controversial issue.
Among other things, the controversy over the Axiom of Choice is typical of the
conflict between Platonic realism and Constructivism. Not only is the Axiom of Choice
the most interesting axiom in axiomatic set theory, but it also plays an important role in
many other areas of mathematics. So the problem of the Axiom of Choice is one of the
significant topics in philosophy of mathematics.
First of all, we shall see what the Axiom of Choice is and where the problem with the
Axiom lies. In particular, we shall focus on what we can do in the presence of the Axiom of
Choice that we could not do otherwise. Platonists accept the Axiom of Choice, which allows a
set consisting of the members resulting from infinitely many arbitrary choices, while
Constructivists reject the Axiom of Choice and confine themselves to sets consisting of
effectively specifiable members (Chapter 1).

I will use the word Constructivism in a broader sense than Brouwer's Constructivism. In Brouwer's
Constructivism mathematical entities are constructible in our mind. But I will use the word
Constructivism in a narrower sense than Gödel's Axiom of Constructibility. Gödel's Axiom of
Constructibility is a much stronger assumption than Constructivism as I call it.

Lebesgue's theory of measure will set the stage for discussing the Banach-Tarski
Paradox and the existence of measurable cardinals in later chapters. Also, since Lebesgue
is one of the French Constructivists, it is interesting to see that the non-constructive nature of
Lebesgue measure creates an irreconcilable tension with Lebesgue's skeptical attitude
toward the Axiom of Choice (Chapter 2).
The Hausdorff Paradox is the prototype of the Banach-Tarski Paradox. Informally,
the Hausdorff Paradox states that a sphere can be decomposed into a finite number of pieces and
reassembled by rigid motions to form two copies of almost the same size as the original.
Here "almost" means except on a countable subset. Banach and Tarski improved
on the Hausdorff Paradox by eliminating the need to exclude a countable
subset from a sphere. Informally, the Banach-Tarski Paradox states that a sphere can be
decomposed into a finite number of pieces and reassembled by rigid motions to form two
copies of exactly the same size as the original. The Banach-Tarski Paradox deepened the
skepticism about the Axiom of Choice. But the Banach-Tarski Paradox is so called in the
sense that it is a counter-intuitive theorem, as distinct from a logical contradiction or a
fallacious piece of reasoning. I argue that we should accept the Banach-Tarski Paradox as a
Platonic truth and reject an epistemology based on mathematical intuition (Chapter 3).
Next, from a slightly different perspective, I corroborate my view that mathematical
truths are of non-constructive nature. Once we have the undecidability of Peano Arithmetic
(PA), Gödel's First Incompleteness Theorem is immediate. The set of true sentences in PA
is not recursively enumerable. But the set of theorems (provable sentences) in PA is
recursively enumerable. So it is easy to see that there is a sentence that is true but
unprovable. This implies that there are some arithmetical truths we cannot get access to in
an effective way. We also have to note that Gödel's Incompleteness Theorems show that there
are limitations inherent in formal methods (Chapter 4).
The Löwenheim-Skolem Theorem and the Skolem Paradox seem to pose a threat to
Platonists. In the light of the Löwenheim-Skolem results, both Quine's thesis of the
indeterminacy of translation and Putnam's model-theoretic arguments against metaphysical
realism come to take on a clear meaning. According to the model-theoretic arguments, the
Axiom of Choice depends for its truth-value upon the model in which it is placed. In my
view, however, this is another limitation inherent in formal methods, not a defect for
Platonists (Chapter 5).
Finally, I meet Benacerraf's epistemological and ontological challenges to Platonism
by examining how mathematical models have been developed in the actual practice of
mathematics. Most mathematicians prefer the Axiom of Choice to the Axiom of
Determinacy in favor of the existence of non-Lebesgue measurable sets and the
Well-Ordering of reals. Also, most mathematicians reject the Axiom of Constructibility in
favor of the existence of a measurable cardinal. In both cases, working mathematicians
are driven by Platonic realism rather than Constructivism. I conclude that in mathematics,
as distinct from natural sciences, there is a close connection between essence and existence,
actuality and possibility. The actual mathematical theories are the parts of the maximally
logically consistent theory that describes mathematical reality (Chapter 6).

I shall give some credit to the sources from which I got mathematical technicalities. Throughout the
process of writing the dissertation, I referred to Cameron (1998), Hamilton (1988), Jech (1978), Kunen (1980),
and Levy (1979). They offer a panoramic view of set theory overall. The former two are concise but useful
introductions, whereas the latter three provide detailed and exhaustive information. For Lebesgue's theory
of measure, Hawkins (1975) is a good guide to the historical background. We have seen how the
Lebesgue integral overcomes the difficulties of the Riemann integral. For this, see e.g. Weir (1973), Wilcox
and Myers (1978). For the Banach-Tarski Paradox, one can find technical details in Wagon (1985).
Wapner (2005) gives a more informal presentation of the Banach-Tarski Paradox. When discussing Gödel's
First Incompleteness Theorem, I focus on the approach from the theory of computability. For this
approach, Boolos and Jeffrey (1974) is a classic, although a wholesale revision has been made in the 4th
edition of the same title (2002). Also, I consulted Cohen (1987), Cutland (1980), Ebbinghaus, Flum and
Thomas (1994). Franzén (2005) warns against a prevalent misconception of Gödel's First Incompleteness
Theorem and a conflation of distinct senses of completeness and undecidability. Manin (1977) and
Mendelson (1997) are good guides for the Skolem Paradox.



In this chapter, first of all, I recapitulate the Axioms of Zermelo-Fraenkel (ZF) set theory
(Section 1). Then, I state the Axiom of Choice and give a couple of its equivalents: the
Well-Ordering Theorem and the Multiplicative Axiom (Section 2). Next, I shall show that
the Axiom of Choice has some useful consequences, e.g., the Aleph Theorem. At the
same time, we shall see that there were many opponents of the Axiom of Choice, and that it
has some unpleasant consequences as well (Section 3). Also, I shall discuss a weaker
form of the Axiom of Choice: the Denumerable Axiom of Choice, and some of its
consequences (Section 4). Moreover, I shall examine the relation of the Axiom of Choice
and the Continuum Hypothesis (Section 5). Finally, I provisionally conclude that the
debate over the Axiom of Choice favors Platonic realism.
1.1 The Axioms of ZF Set Theory
Before I state the Axiom of Choice, I shall review the Axioms of ZF set theory.
In 1930 Zermelo proposed ZF set theory in a form closely related to that used today, which
consisted of the following seven Axioms.
(i) Axiom of Extensionality: If two sets x, y have exactly the same members, then they
are equal.
(ii) Power Set Axiom: For any set x, the power set of x is a set.
Here the power set is the set of all subsets of x.
(iii) Axiom of Union: For any set x, the union of x is a set.
The union, denoted by $\bigcup x$, is the set of all members of the members of a set x.
(iv) Axiom of Pairing: For any sets x, y, {x, y} is a set.
(v) Axiom of Separation: If a propositional function P(x) is definite for a set z, there is a set
y containing exactly the members of z for which P(x) holds.
The Axiom allows us to separate the members with some property from a set and form a set
consisting of these members.
(vi) Axiom of Replacement: If F is a function, then for every set x, F[x] is a set.

F[x] is called the image of x under the mapping F.
(vii) Axiom of Foundation: If $x \neq 0$ then there exists $y \in x$ such that $y \cap x = 0$.
This means that there is no infinite descending $\in$-sequence.
Also, there were two Axioms that were not included in this system but had occurred
in his system of 1908: the Axiom of Infinity and the Axiom of Choice. Since I shall
discuss the Axiom of Choice in detail below, I shall mention just the Axiom of Infinity here.
Axiom of Infinity: There exists a set x such that $0 \in x$ and whenever $y \in x$ then $y \cup \{y\} \in x$.
This means that if we pick any member y in a set x, then the immediate successor of y is
also in x.
Zermelo did not include the Axiom of Infinity in his system of 1930 because he
believed that it did not belong to the general theory of sets. He did not include the
Axiom of Choice on the ground that it differed in nature from the other Axioms.
Contemporary ZFC set theory includes the seven Axioms postulated above, the
Axiom of Infinity, and the Axiom of Choice.
1.2 The Axiom of Choice and its Equivalents
The Axiom of Choice
First of all, we shall see what the Axiom of Choice says:
For every family F of disjoint nonempty sets S, there exists a set C containing
exactly one member from each member S of F (i.e., for each $S \in F$ the set $S \cap C$
is a singleton).
Using the notion of a function we can paraphrase this as follows:
For every family F of disjoint nonempty sets S, there exists a choice function f
on F such that $f(S) \in S$ for each set S in the family F.
For instance, we can classify all natural numbers by the residues that result when they are
divided by 3 (i.e., form the set T of the sets S of numbers congruent to each other modulo 3):
$T = \{S_0, S_1, S_2\}$, where
$S_0 = \{0, 3, 6, \ldots\}$,
$S_1 = \{1, 4, 7, \ldots\}$,
$S_2 = \{2, 5, 8, \ldots\}$.
Then it is easy to see that there exists a set C containing exactly one member from each
member $S_0$, $S_1$, $S_2$ of T (e.g., $C = \{0, 4, 8\}$). In fact, the use of the Axiom of Choice is
dispensable in the case of a family of finitely many disjoint non-empty sets, and even in the
case of a family of infinitely many disjoint non-empty sets if we can specify the rule by
which to perform the choices. In our case, we can make sure that there exists such a set
without appealing to the Axiom of Choice, for instance, by following the rule of choosing the
least member of each $S_i$ (i.e., $C = \{0, 1, 2\}$). The problem of the Axiom of
Choice is concerned only with infinitely many arbitrary choices.
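
The rule-governed case can be made concrete in a short Python sketch. It assumes that each residue class is generated in increasing order, so that "the least member" is simply the first one produced; the function names are illustrative and are not part of the text.

```python
from itertools import count

def residue_class(r, modulus=3):
    """Lazily enumerate {n in N : n % modulus == r} in increasing order."""
    return (n for n in count() if n % modulus == r)

def least_member(r, modulus=3):
    """An explicit choice rule: pick the least member of the class of r."""
    return next(residue_class(r, modulus))

def is_selector(candidate, modulus=3):
    """Check that candidate contains exactly one member of each residue class."""
    residues = sorted(c % modulus for c in candidate)
    return residues == list(range(modulus))

# The choice set obtained by the rule, and the other example set from the text.
C_by_rule = {least_member(r) for r in range(3)}
C_example = {0, 4, 8}

print(is_selector(C_by_rule))   # True
print(is_selector(C_example))   # True
```

Because the rule "take the least member" is stated explicitly, the choice set is defined without any appeal to the Axiom of Choice; the sketch has nothing to say about families for which no such rule is available.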

Figure 1: The Axiom of Choice.

The Well-Ordering Theorem
The most useful form of the Axiom of Choice is the Well-Ordering Theorem: Every
set can be well-ordered. Actually, the Axiom of Choice is equivalent to the Well-Ordering
Theorem. But since this requires proof, we cannot regard the Well-Ordering Theorem
itself as an axiom despite its usefulness. So it is important to show the equivalence of the
Axiom of Choice and the Well-Ordering Theorem. But first we have to define a
well-ordering.

In order to define a well-ordering exactly, we need to define the notion of an
R-minimal member:
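x is an R-minimal member of A if and only if $x \in A \wedge (\forall y)(y \in A \rightarrow \neg(yRx))$.
Also, we need to define the notion of connected:
R is connected in A if and only if $(\forall x)(\forall y)(x, y \in A \wedge x \neq y \rightarrow xRy \vee yRx)$.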
We shall next define a well-ordering:
R well-orders A if and only if every nonempty subset of A has an R-minimal
member & R is connected in A.
Roughly speaking, the notion of an R-minimal member guarantees the existence of a
least member of every nonempty subset of A under the relation R. The notion of connected
guarantees that there is a linear ordering on A excluding the possibility of circularity. In
Appendix (I), I shall show that the Axiom of Choice is equivalent to the Well-Ordering
Theorem.
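
For small finite sets the three definitions can be checked mechanically. The following Python sketch is only an illustration under that finiteness assumption (it enumerates every nonempty subset, which is impossible for infinite sets), and the names `is_minimal`, `is_connected` and `well_orders` are hypothetical.

```python
from itertools import combinations

def is_minimal(x, subset, R):
    """x is an R-minimal member of subset: x is in it and no y in it has y R x."""
    return x in subset and not any(R(y, x) for y in subset)

def is_connected(A, R):
    """For all distinct x, y in A, either x R y or y R x."""
    return all(R(x, y) or R(y, x) for x in A for y in A if x != y)

def nonempty_subsets(A):
    """Yield every nonempty subset of the finite set A."""
    items = list(A)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield set(combo)

def well_orders(R, A):
    """R well-orders A: R is connected in A and every nonempty subset has an R-minimal member."""
    return is_connected(A, R) and all(
        any(is_minimal(x, B, R) for x in B) for B in nonempty_subsets(A)
    )

def less_than(x, y):
    return x < y

def cyclic(x, y):
    """0 R 1, 1 R 2, 2 R 0: connected on {0, 1, 2} but circular."""
    return (y - x) % 3 == 1

print(well_orders(less_than, {0, 1, 2, 3}))  # True
print(is_connected({0, 1, 2}, cyclic))       # True
print(well_orders(cyclic, {0, 1, 2}))        # False
```

The circular relation is connected, yet the whole set {0, 1, 2} has no minimal member under it, so it fails to be a well-ordering.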
In 1904 Zermelo explicitly formulated the Axiom of Choice and proved the Axiom of
Choice is equivalent to the Well-Ordering Theorem. As we shall see in Section 3, there
arose much controversy over the non-constructive nature of the Axiom of Choice. In
response to his critics, in 1908 Zermelo reformulated the Axiom of Choice and his proof.
There Zermelo attempted to deprive the Axiom of Choice of all the constructivist
appearances by replacing a system of successive choices by a system of simultaneous ones
and put more emphasis on its super-temporality. We can clearly see the figure of Zermelo
as a Platonic realist here. In the same year Zermelo launched the axiomatization of set
theory. It is often said that the discovery of set-theoretic paradoxes motivated Zermelo to
axiomatize set theory. Under these circumstances, however, we could safely conclude that
Zermelo wanted to secure the status of the Axiom of Choice by creating a rigorous system
of axioms for set theory and lay down firm foundations of set theory and mathematics in
general.
The Multiplicative Axiom
We also have to notice that there are many other equivalents of the Axiom of Choice.
For instance, in abstract algebra one of the equivalents of the Axiom of Choice, Zorn's
Lemma, is applied earlier than the Well-Ordering Theorem. This means that the Axiom of
Choice is not an ad hoc principle formed in the development of mathematics, but a stable
principle which is widely applicable in many branches of mathematics. But here in
connection with axiomatic set theory I shall confine my attention to Russell's
Multiplicative Axiom. In Principia Mathematica Russell introduces the Axiom of Choice
in the following way: "If $\kappa$ is a class of mutually exclusive classes, no one of which is null,
there is at least one class which takes one and only one member from each member of $\kappa$."

Russell calls it the Multiplicative Axiom, probably because of the Axiom's connection
with cardinal multiplication, i.e., the construction of a set for the product of a denumerable
infinity of cardinals.
Russell takes as an example the millionaire who bought $\aleph_0$ pairs of boots and
$\aleph_0$ pairs of socks.
The question is how many boots and how many socks the millionaire had in all.
Although it is natural to suppose that he had $2\aleph_0$ boots and $2\aleph_0$ socks, we know that
$\aleph_0$ is not increased by doubling it, that is, $2\aleph_0 = \aleph_0$. So the answer is that he had
$\aleph_0$ boots and $\aleph_0$ socks. In general, the sum of $\aleph_0$ pairs must have $\aleph_0$ members. But we have to
notice that this result presupposes the existence of a set that consists of one member of each pair.
In some cases we can have such a set without the Multiplicative Axiom, whereas in other
cases we cannot unless we assume the Axiom. In our case, among a pair of boots we can
distinguish left from right and thus choose all the right boots and then all the left boots.
Since there are no such distinguishing features among a pair of socks, however, we have no
specific rule by which to choose one of each pair of socks. Therefore, in the case of
socks the use of the Multiplicative Axiom is essential to show that there exists a set
consisting of one sock from each pair.
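
The case of the boots can be sketched in Python. The sketch assumes, as an illustration only, that the pairs are numbered 0, 1, 2, ... and that each boot is labelled "left" or "right"; the labelling is exactly the distinguishing feature that socks lack, so no analogous function can be written for them.

```python
from itertools import count, islice

def boots():
    """Enumerate every boot as (pair_number, side), left boot first in each pair."""
    for n in count():
        yield (n, "left")
        yield (n, "right")

def boot_index(pair_number, side):
    """Position of a boot in the enumeration above: a one-to-one map onto N."""
    return 2 * pair_number + (0 if side == "left" else 1)

first_boots = list(islice(boots(), 6))
print(first_boots)
print([boot_index(n, side) for n, side in first_boots])  # [0, 1, 2, 3, 4, 5]
```

The map `boot_index` pairs the boots off with the natural numbers, which is the content of the equality $2\aleph_0 = \aleph_0$ in this case.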
1.3 The Consequences of the Axiom of Choice
As we have seen above, if we assume the Axiom of Choice, then, by the Well-Ordering
Theorem, every set can be well-ordered. So the set R of all real numbers can be
well-ordered. This is one of the most significant consequences of the Axiom of Choice.
This does not mean that in the absence of the Axiom of Choice we know little about the set
R. Actually, we know that the cardinality of the set R is greater than that of the set N of all
natural numbers by the Cantorian diagonal argument, and that the cardinality of the set R or
of the continuum is that of the power set of the set N, i.e., $2^{\aleph_0}$. Based on ZF set theory
without the Axiom of Choice, however, we cannot prove whether or not the set R can be
well-ordered; therefore we do not even know whether or not the cardinality of the set R is an
aleph.
Only in the presence of the Axiom of Choice do we know that the set R can be
well-ordered, and that the cardinality of the set R is an aleph. And only then can we ask which aleph is its
cardinal.

Russell, B. and Whitehead, A. N. [1910], vol. I, p. 536.

The set N of all natural numbers can be well-ordered by the less-than relation.
Using the terminology of ZF set theory, the set N can be well-ordered by the membership
relation. One of the strengths of ZF set theory is that the less-than relation can be replaced
by the membership relation. The set N can be well-ordered by the less-than relation
because every nonempty subset of the set N has a least member. On the other hand, the set
N cannot be well-ordered by the greater-than relation because there are many subsets
(N itself, for instance) that do not have a greatest member. The set Q of all rational numbers cannot be
well-ordered by magnitude. But it is easy to see how the set Q of all rational numbers can
be well-ordered. The proof that the cardinality of the rational numbers is the same as that of
the natural numbers puts the set Q into one-to-one correspondence with the set of all
natural numbers, and the ordering of the natural numbers can be carried over to Q along this
correspondence.
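
One such ordering of Q can be written down explicitly. The Python sketch below uses one particular enumeration, chosen here purely for illustration (0 first, then fractions in lowest terms by increasing sum of numerator and denominator); any other one-to-one enumeration of Q would serve equally well.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd

def rationals():
    """Yield every rational exactly once: 0, then p/q and -p/q by increasing p + q."""
    yield Fraction(0)
    for total in count(2):              # total = numerator + denominator
        for q in range(1, total):
            p = total - q
            if gcd(p, q) == 1:          # lowest terms only, so nothing repeats
                yield Fraction(p, q)
                yield Fraction(-p, q)

def index(x):
    """Position of the rational x in the enumeration (a natural number)."""
    x = Fraction(x)
    for i, r in enumerate(rationals()):
        if r == x:
            return i

def precedes(x, y):
    """The induced well-ordering of Q: x comes before y in the enumeration."""
    return index(x) < index(y)

print(list(islice(rationals(), 7)))
print(index(Fraction(1, 2)))             # 5
print(precedes(2, Fraction(1, 2)))       # True
```

Under `precedes` every nonempty set of rationals has a first member, namely the one with the smallest index, even though the ordering has nothing to do with magnitude.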
But the situation is quite different with the set R of all real numbers. Intuitively
speaking, we do not know how the set R can be well-ordered. Even so, by the
Well-Ordering Theorem, which states that every set can be well-ordered, the set R can be
well-ordered. It is obvious that the set R cannot be well-ordered by
magnitude, for the same reason as the set Q. But unlike the set Q, there is no obvious
ordering at hand that does the trick. However, the Well-Ordering Theorem tells us that
there is some relation by which the set R can be well-ordered, though we do not know what
it is specifically. We can see even from this that the Well-Ordering Theorem indeed makes
a very strong and powerful claim.

Russell, B. [1919], p. 126.
Alephs are the infinite well-ordered cardinals.

The Aleph Theorem, The Trichotomy of Cardinals
Moreover, since the Well-Ordering Theorem claims that every set can be well-ordered, it is
not just the set R that can be well-ordered. So it follows from the Well-Ordering Theorem
that all the cardinals are ordinals, which leads us to the Aleph Theorem that every infinite
cardinal is an aleph. Thus the Well-Ordering Theorem simplifies addition and
multiplication of infinite cardinal numbers, which would be more complicated otherwise.
Also, all cardinals are taken to be initial ordinals. In particular, any two sets are
comparable in terms of cardinality. Therefore the Trichotomy of Cardinals is true:
For every cardinal m and n, either $m < n$, or $m = n$, or $m > n$.
Furthermore, as a corollary of the Aleph Theorem, the following equalities hold:
In this fashion the fundamental propositions true for alephs are extended to all infinite
cardinals.
If we assume the Axiom of Choice, then by the Well-Ordering Theorem, we do not
have to worry about the existence of sets that cannot be well-ordered. We know much
more about the cardinals of well-orderable sets than about the cardinals of sets that cannot
be well-ordered. As a consequence, once we assume the Axiom of Choice, which implies
the Well-Ordering Theorem, the theory of cardinals is considerably simplified.
But in fact there arose much controversy over the Axiom of Choice and Zermelo's
proof of its equivalence to the Well-Ordering Theorem. Hadamard, Hausdorff, and
Keyser defended the proof in full generality. Roughly speaking, however, German critics
such as Bernstein and Schoenflies disputed the proof on the ground that the Burali-Forti
paradox lies hidden in the proof, while French Constructivists such as Lebesgue, Borel, and
Baire opposed the Axiom of Choice itself on the ground that it does not provide the specific
rule by which to perform the choices. Though Zermelo met the first criticism by
rejecting the assertion that the collection W of all ordinals is a set, the second one was more
serious because of the stark philosophical difference underlying that criticism.

Poincaré, who is often said to be a conventionalist, accepted the Axiom of Choice, so he did not reject the
Well-Ordering Theorem but rather Zermelo's proof of it, because it makes use of impredicative definition.

The fundamental opposition in philosophy of mathematics is that between Platonic
realism, which posits mathematical entities outside of space and time, on the one hand, and
anti-realism, which restricts them to those which are legitimately constructible in space and
time, on the other. Against a philosophical background of this sort, the Platonic realists
accept the Axiom of Choice, which allows a set consisting of the members resulting from
infinitely many arbitrary choices, while the anti-realists reject the Axiom of Choice and
confine themselves to sets consisting of the effectively specifiable members. Hence some
mathematicians have claimed that we should avoid the Axiom of Choice wherever possible,
treating it just as a heuristic device for finding a new theorem, which is then to be proved
without appeal to the Axiom.
Though, as we have seen above, the legitimacy of the Axiom of Choice was already
controversial, skepticism about the Axiom of Choice was deepened when in 1914
Hausdorff discovered an unpleasant consequence of it, which is called Hausdorff's
paradox: half of a sphere is congruent to a third of the same sphere. Later Banach and
Tarski strengthened this result to the Banach-Tarski paradox: any sphere S can be
decomposed into a finite number of pieces and reassembled into two spheres with the same
radius as S. In fact, Borel believed Hausdorff's paradox to show that contradictions
follow from the Axiom of Choice and that as a result the Axiom of Choice should be
rejected.
1.4 A Weaker Form of the Axiom of Choice
Given the controversial character of the Axiom of Choice, it is natural to attempt to weaken
it in some way acceptable to its opponents. We can then save some of its consequences,
although we have to sacrifice others. Strictly speaking, I have thus far confined myself
to the so-called full Axiom of Choice in distinction from its weaker form. Since the full
Axiom of Choice is independent of ZF, the weaker form of the Axiom of Choice should be
too strong to be a theorem of ZF, but too weak to imply the full Axiom of Choice. In other words,
the weaker form of the Axiom of Choice should be a theorem T of ZFC. More
specifically, when we ask firstly whether or not it is a theorem of ZF and then whether or
not it is equivalent to the full Axiom of Choice, both of the questions should be answered in
the negative. For it is supposed to have a power intermediate between the theorems of
ZF and the full Axiom of Choice.

The following two objections against the Axiom of Choice can be expected:
(1) The Axiom of Choice should be constructibly justifiable.
(2) Even if the Axiom of Choice cannot be justified constructibly, we should be able to justify constructibly
the Well-Ordering of reals which is most wanted.
I doubt that both are legitimate criticisms.

The Denumerable Axiom of Choice, The Principle of Dependent Choices
An example in point is the Denumerable Axiom of Choice, which restricts infinitely many
arbitrary choices to the cases of denumerably many sets. To put it precisely, the
Denumerable Axiom of Choice runs as follows:
Every family of denumerably many nonempty sets has a choice function.
The Denumerable Axiom of Choice is closely related to the Principle of Dependent
Choices:
If R is a relation on a nonempty set S such that for every $x \in S$ there exists $y \in S$ such that
$xRy$, then there is a sequence $x_0, x_1, x_2, \ldots$ of members of S such that
$x_0 R x_1, x_1 R x_2, \ldots, x_n R x_{n+1}, \ldots$
This principle enables us to make a countable number of consecutive choices. In
Appendix (II), I shall show that the Principle of Dependent Choices implies the
Denumerable Axiom of Choice.
The Countable Union Axiom
If we assume the Denumerable Axiom of Choice, then we can get the Countable Union
Axiom:
The union of countably many countable sets is countable.
In Appendix (III), I shall show this.
Every infinite set has a countable subset, Every Dedekind-finite set is finite, The
restricted form of Trichotomy of Cardinals

A set is called denumerable if it is equinumerous with $\omega$. A set is called countable if it is either
equinumerous with $\omega$ or finite.


Also, if we assume the Denumerable Axiom of Choice, then we can prove that every
infinite set has a denumerable subset. In Appendix (IV), I shall show this.
In sum, I have shown above that the Principle of Dependent Choices implies the
Denumerable Axiom of Choice and this in turn implies that every infinite set has a
countable subset. Incidentally, neither of these implications can be reversed. The last
fact means that every Dedekind-finite set is finite. A set S is Dedekind-finite if and only if
there is no proper subset of S equipollent to S. It is a matter of significance that if we
do not assume the Denumerable Axiom of Choice we cannot prove the equivalence of the
notions of Dedekind-finite set and finite set. For this means that in the absence of the
Denumerable Axiom of Choice there might exist sets which were infinite in one sense but
were finite in another. Russell and Whitehead were seriously concerned that there might
exist "mediate cardinals" which were too large to be finite but too small to be
Dedekind-infinite. At the same time, it is worth noting that the Denumerable Axiom of
Choice, instead of the full Axiom of Choice, suffices to reject such a possibility. Thus,
every cardinal number is comparable with $\aleph_0$, and the restricted form of the Trichotomy of
Cardinals does hold, i.e., $|x| < \aleph_0$, or $|x| = \aleph_0$, or $|x| > \aleph_0$ for any x. But the Principle of
Dependent Choices has its limitations; it does not, for instance, imply the existence of a
well-ordering of the set R of all real numbers. Historically speaking, Borel, who rejected
the full Axiom of Choice, accepted only the Denumerable Axiom of Choice, while unlike
Borel, Hobson rejected even denumerably many arbitrary choices, though he was
sympathetic to Borel's critique.
1.5 The Axiom of Choice and the Continuum Hypothesis
In Section 3, we have seen that the cardinality of the set R of all real numbers is greater
than that of the set N of all natural numbers, and that it is that of the power set of the set N,
i.e., $2^{\aleph_0}$. But there we have also seen that only in the presence of the Axiom of Choice, which
implies the Well-Ordering Theorem, do we know that the set R can be well-ordered and that the
cardinality of the set R is thus an aleph, and only then can we ask which aleph is its cardinal.
That is, we can ask whether the cardinality of the set R is the successor cardinal $\aleph_1$ of that of
the set N, or whether the successor cardinal $\aleph_1$ lies strictly between the cardinality of the set N and that
of the set R. The Continuum Hypothesis claims that the cardinality of the set R is the
successor cardinal $\aleph_1$ of that of the set N, i.e., $2^{\aleph_0} = \aleph_1$. This means that the Continuum
Hypothesis presupposes the Axiom of Choice. To generalize this, the Generalized
Continuum Hypothesis claims that the cardinality of the power set of an infinite set S is the successor cardinal of
that of the set S, i.e., $2^{\aleph_\alpha} = \aleph_{\alpha+1}$ for every ordinal $\alpha$.
In this connection, it is interesting to see that Brouwer claims that for the intuitionists
the Continuum Hypothesis does not make sense. $\aleph_0$ is the only infinite cardinality
whose existence the intuitionists can accept. For the intuitionists real numbers are the
rule-governed sequences constructed by a finite number of steps. Therefore for the
intuitionists the set of all real numbers which contains free choice sequences is meaningless.
So Brouwer claims that for the intuitionists it has no meaning to ask whether or not the
cardinality of the set of all real numbers is greater than $\aleph_0$, and whether or not the
cardinality of the set of all real numbers is the second smallest infinite cardinality. Given
that the Continuum Hypothesis presupposes the Axiom of Choice, it comes as no surprise
that Brouwer believes that for the intuitionists the Continuum Hypothesis does not make
sense. But it is interesting to see that Brouwer admits that a set S is infinite if and only if S
is equipollent to one of its proper subsets. As we have seen in Section 4, this definition is exactly
that of a Dedekind-infinite set. This means that even Brouwer uses the Denumerable Axiom of
Choice implicitly.
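To see how the Generalized Continuum Hypothesis works, we shall introduce the
function $\beth$. The letter $\beth$ (beth) is the second letter of the Hebrew alphabet.
$\beth_0 = \aleph_0$,
$\beth_{\alpha+1} = 2^{\beth_\alpha}$,
$\beth_\lambda = \sup\{\beth_\alpha : \alpha < \lambda\}$, where $\lambda$ is a limit ordinal.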

This definition makes sense only if we assume the Axiom of Choice because only in the
presence of the Axiom of Choice, which implies the Well-Ordering Theorem, every set can
be well-ordered and all cardinals are ordinals. For all $\alpha$, $\aleph_\alpha \le \beth_\alpha$, since
$\aleph_{\alpha+1} \le 2^{\aleph_\alpha}$. In particular, if the Generalized Continuum Hypothesis holds, then
$\beth_\alpha = \aleph_\alpha$ for all $\alpha$.

Brouwer [1999], in Jacquette (ed) (2002), p. 271-4.

Also, we shall see that under the Generalized Continuum Hypothesis an inaccessible
cardinal is the first weakly inaccessible ordinal. An ordinal $\kappa$ is called weakly
inaccessible if $\kappa$ is a regular limit cardinal $\aleph_\lambda$ for a limit ordinal $\lambda$. We can get the concept of an
inaccessible cardinal stronger than that of a weakly inaccessible cardinal by replacing the
moderately increasing sequence from $\aleph_\alpha$ to $\aleph_{\alpha+1}$ by the exponentially and thus more rapidly
increasing sequence from $\aleph_\alpha$ to $2^{\aleph_\alpha}$. Then we can ask how big an inaccessible cardinal is.
Since an inaccessible cardinal is stronger than a weakly inaccessible cardinal, it is at least
as big as the first weakly inaccessible ordinal. If we assume the Generalized Continuum
Hypothesis, since $2^{\aleph_\alpha} = \aleph_{\alpha+1}$, an inaccessible cardinal is the first weakly inaccessible
ordinal.

1.6 Fictionalism or Instrumentalism
I believe that a deep-rooted and far-reaching topic in philosophy of mathematics is the
debate between those who claim that mathematical objects exist over and above space and
time (Platonic realism) and those who take them to be constructed within space and time
(Constructivism). There is, indeed, a fictionalist or instrumentalist account of mathematical
objects, but I don't believe that fictionalism or instrumentalism is a good account of
mathematical objects. According to fictionalism, mathematical statements are simply
false, whereas according to instrumentalism, mathematical statements are neither true nor
false. The difference between fictionalism and instrumentalism is only in the letter, not
in the spirit. Philosophers of this sort attempt to explain the usefulness of mathematics in
natural science by means of the conservation theorem. This means that the mathematical
theory preserves the truth of the scientific theory, but facilitates deductions which could
otherwise be made only at greater length and with greater difficulty. A mathematical object
plays a role like that of a catalyst in chemistry, a substance that facilitates a chemical
reaction while itself remaining unchanged.
Fictionalists make their claim by refuting the indispensability argument. Roughly
speaking, indispensabilists accept the existence of mathematical objects insofar as those
mathematical objects are indispensable for explaining natural sciences. So fictionalists
attempt to show that mathematical objects are dispensable for explaining natural sciences.
But we have to note that one can reach the fictionalist conclusion by refuting the
indispensability argument only if the indispensability argument is the most promising
argument for mathematical realism. If there is a better argument for mathematical realism
than the indispensability argument, fictionalists will have much more work to do in order to
deny the existence of mathematical objects. So we shall examine the Quine/Putnam
indispensability argument in detail.
The Quine/Putnam indispensability argument aims to establish a realm of
mathematical objects by showing that if a scientific theory is accepted as true, then any
mathematical theory which is indispensable for formulating that scientific theory must also
be accepted as true.
"[Q]uantification over mathematical entities is indispensable for science, both formal and
physical; therefore we should accept such quantification; but this commits us to accepting the
existence of the mathematical entities in question. This type of argument stems, of course,
from Quine, who has for years stressed both the indispensability of quantification over
mathematical entities and the intellectual dishonesty of denying the existence of what one daily
presupposes."
We must notice that Quine and Putnam are claiming not only that the truth of mathematics
is presupposed by its use in science, but also that the mathematics employed in our best
scientific theories enjoys empirical support. The upshot of the Quine/Putnam
indispensability argument is that the mathematics employed in a scientific theory is
confirmed indirectly by the confirming evidence for the scientific theory in which it is
employed.
According to Quine, just as we accept the existence of molecules, atoms, and quarks
if by so doing we have the best scientific theory that organizes and explains our experience,
so we accept the existence of mathematical objects. Putnam stresses that scientific
theories cannot even be formulated without the use of mathematics. Physical laws, such
as Newton's law of gravitation, are formulated using equations. Thus, Putnam claims that
they cannot be stated in a nominalistic language, that is, one in which no reference is made
to numbers, functions, sets, etc. If this is the case, scientific theories refer to mathematical
objects and so we cannot accept our best scientific theories without accepting the existence
of mathematical objects. Putnam says, "mathematics and physics are integrated in such a
way that it is not possible to be a realist with respect to physical theory and a nominalist
with respect to mathematical theory."

The indispensability argument is supported by the idea that mathematical objects are
on a par with physical objects. Quine in "Two Dogmas of Empiricism" maintains that our
statements about the external world face the tribunal of sense experience not individually
but only as a corporate body. This means that logical reflection and sense experience
together shape the total theory. Quine introduces a policy of minimum mutilation,
according to which we revise less central beliefs rather than more central ones. The logical
and mathematical beliefs are the most central beliefs. This is why we are seldom tempted to
revise them in the light of experience. According to Quine, however, the logical and
mathematical beliefs do not enjoy some special sort of nonempirical justification. The
logical and mathematical beliefs differ only in degree from the empirical beliefs. Thus, our
belief in the validity of modus tollens is just more central than our everyday beliefs.
Quine claims that the logical and mathematical beliefs are central because they apply
to a lot of situations and play an important role in organizing how we think about these
situations. Even if modus tollens is central, however, I don't think that we could say that
every theorem in pure mathematics is like this. A result in some recondite area of
algebraic topology, for instance, might play little or no general role in organizing how we
think about the world. On this view, then, the parts of mathematics that go beyond this
role, such as advanced set theory, are not accepted as true. The drawback of the
indispensability argument is that it conflicts with the actual practice of mathematics. The
history of mathematics after the nineteenth century shows how mathematics separated itself
from natural sciences, developed independently, and took its own course.
I think that, when we say that mathematics is indispensable, it is very important to specify
whether mathematics is indispensable to natural sciences or to mathematics itself, especially
considering the autonomous development of the actual practice of mathematics after the
nineteenth century. If we interpret it as indispensable to mathematics itself, such as the
self-organization of mathematics, it amounts to much the same thing as the Platonistic
claim that a consistent mathematical theory describes at least a part of the mathematical
universe. But in this case it seems to me that indispensability is not the best way to
represent the characteristic feature of mathematical objects. So if indispensabilists have
something to say different from Platonists, we have to interpret it as indispensable to
natural sciences. For instance, if indispensabilists accepted the existence of
mathematical objects which are indispensable to mathematics itself, then, since the Axiom of
Choice is indispensable to mathematics itself, especially to the axiomatization of Cantorian
set theory, they would have to accept the Axiom of Choice. Against Platonists, however,
indispensabilists reject the Axiom of Choice in its own right. If indispensabilists accept
the existence of mathematical objects which are indispensable to natural sciences, then,
since the Axiom of Choice is dispensable to natural sciences, they can reject the Axiom of
Choice as required. So when we use the indispensability argument to justify the ontological
status of mathematical objects, we have to make it clear that mathematics is indispensable
to natural sciences.

Putnam, Mathematics, Matter and Method, p. 347.
Putnam, Mathematics, Matter and Method, p. 74.

It might be objected that, even if indispensabilists accepted the existence of
mathematical objects which are indispensable to mathematics itself, they could differentiate
themselves from Platonists in the sense that, as we shall see later, the Axiom of Choice is
dispensable for proving the Banach-Tarski Paradox. But I shall claim that, even if the
Banach-Tarski Paradox can be reformulated without the Axiom of Choice, it does not
necessarily deal a blow to the Platonists. I shall ask whether or not the proof without the
Axiom of Choice depends on extremely complex or ad hoc principles, compared with the
proof with the Axiom of Choice. If by invoking the Axiom of Choice the Banach-Tarski
Paradox can be proved in a simpler, more systematic and more unified way, I believe the
proof with the Axiom of Choice reflects the fact of the matter better than the proof without
the Axiom of Choice. In any case, I don't believe that the indispensability argument is the
most promising argument for mathematical realism. If we believe mathematical theories
applicable to natural sciences, by extension we should believe mathematical theories not
applicable to natural sciences. For instance, if we believe a weaker form of the Axiom of
Choice, there is no good reason to disbelieve the full Axiom of Choice. In Chapter 6, I
shall submit the argument for mathematical realism that I believe is the best. So even if
fictionalists succeed in the nominalization of mathematical objects in natural sciences, since
there is a better argument for mathematical realism, I don't believe that fictionalism or
instrumentalism is a correct account of mathematical objects.
The main problem with the Axiom of Choice concerns the issue of whether or not infinitely
many arbitrary choices should be accepted in mathematics. The Platonists admit the
possibility of forming a set consisting of indefinable members of a certain kind. On the
other hand, the constructivists allow only the existence of sets consisting of members that
are specifiable in a finite number of steps. But the Axiom of Choice contributed greatly
to the systematization of Cantorian set theory. It is interesting to note that even some of
the opponents of the Axiom of Choice used it implicitly. For instance, though Russell was
skeptical of the Well-Ordering Theorem and the Trichotomy of Cardinals, he used the
proposition that every infinite set has a denumerable subset in order to prove that a set is
Dedekind-finite if and only if it is finite. But the proof of this proposition makes essential
use of the Denumerable Axiom of Choice. This is a good example of the deductive power
of the Axiom of Choice. Mathematics, then, is severely curtailed if we reject the Axiom
of Choice. In light of this, I provisionally conclude that the debate over the Axiom of
Choice favors Platonic realism.




In this chapter, I shall trace the theory of large cardinals back to its origin, Lebesgue's
theory of measure, and claim that the non-constructive nature of the Lebesgue measure lies
in the notion of $\sigma$-additivity (or, more generally, $\kappa$-additivity). To that end, in
the first half of the chapter, we shall see that the Lebesgue integral based upon the Lebesgue
measure was devised in attempts to solve the problems with the Riemann integral based upon
Jordan's content. The Lebesgue integral enabled the integration of functions that are not
Riemann integrable, and also greatly improved on the convergence properties of Riemann's
theory.
In the second half of this chapter, we shall investigate how Lebesgue's theory of
measure is applied to the theory of large cardinals. The close relationship between
Lebesgue's theory of measure and the theory of large cardinals can be detected in the
theorem to the effect that if there exist measurable cardinals, they are (strongly)
inaccessible. Finally, I shall point out that the non-constructive nature of the Lebesgue
measure as shown above creates an irreconcilable tension with Lebesgue's skeptical attitude
toward the Axiom of Choice.
2.1 Lebesgue's Theory of Integration
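We can see the nature of the Riemann integral in the method of finding the area bounded by a
continuous function f(x) and the x-axis. The Riemann integral involves partitioning the
domain of f(x) and approximating f(x) by means of the upper and lower step functions
bracketing f(x) from without and within respectively. A partition P of [a, b] is a set
$\{a_0, a_1, \ldots, a_n\}$ such that
$a = a_0 < a_1 < \cdots < a_n = b$.
Let $S_i = \{x \mid a_{i-1} \le x < a_i\}$ $(i = 1, 2, \ldots, n)$.
$\chi_{S_i}(x)$ is the characteristic function of the set $S_i$, defined by
$\chi_{S_i}(x) = 1$ if $x \in S_i$
$\chi_{S_i}(x) = 0$ if $x \notin S_i$
Also, let $M_i = \sup_{x \in S_i} f(x)$ and $m_i = \inf_{x \in S_i} f(x)$.
Then, the upper step function is
$\overline{\varphi}(x) = \sum_{i=1}^{n} M_i \chi_{S_i}(x)$.
Similarly, the lower step function is
$\underline{\varphi}(x) = \sum_{i=1}^{n} m_i \chi_{S_i}(x)$.
Now, the upper sum U(f, P) is the area bounded by the upper step function
$\overline{\varphi}(x)$ and the x-axis:
$U(f, P) = \sum_{i=1}^{n} M_i (a_i - a_{i-1})$.
In the same way, the lower sum L(f, P) is the area bounded by the lower step function
$\underline{\varphi}(x)$ and the x-axis:
$L(f, P) = \sum_{i=1}^{n} m_i (a_i - a_{i-1})$.
As $n \to \infty$, we have the upper integral
$\overline{\int_a^b} f = \inf\{U(f, P)\}$
and the lower integral
$\underline{\int_a^b} f = \sup\{L(f, P)\}$.
Finally, f is Riemann integrable iff $\overline{\int_a^b} f = \underline{\int_a^b} f$.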
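The bracketing of f(x) by upper and lower sums can be checked numerically. The following is only a sketch under stated assumptions: the function name riemann_sums is hypothetical, the partition is taken to be equally spaced, and f is assumed to be continuous and increasing on [a, b], so that the infimum and supremum on each cell are the values at its end points. Exact rational arithmetic is used so that the sums are not blurred by rounding.

```python
from fractions import Fraction


def riemann_sums(f, a, b, n):
    """Return (L(f, P), U(f, P)) for an increasing f on [a, b] and n equal cells.

    Assumption: f is continuous and increasing, so on each cell
    m_i = f(left end point) and M_i = f(right end point).
    """
    width = Fraction(b - a, n)
    lower = Fraction(0)
    upper = Fraction(0)
    for i in range(1, n + 1):
        left = a + (i - 1) * width
        right = a + i * width
        lower += f(left) * width    # m_i * (a_i - a_{i-1})
        upper += f(right) * width   # M_i * (a_i - a_{i-1})
    return lower, upper


def square(x):
    return x * x


for cells in (10, 100, 1000):
    low, high = riemann_sums(square, 0, 1, cells)
    print(cells, float(low), float(high))
```

For f(x) = x^2 on [0, 1] the two sums differ by exactly 1/n, so they squeeze the common value 1/3 as the partition is refined.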
Although the Riemann integral will do for most practical purposes, there are some
problems with it when it comes to advanced fields of mathematics.
First of all, there exist a lot of functions that are not Riemann integrable. Secondly, the
convergence theorems for the Riemann integral hold only under overly strict conditions. The
Lebesgue integral extends the range of integrable functions while retaining the nice
properties of the Riemann integral. The turn from the Riemann to the Lebesgue integral could
be characterized as a turn from constructive to non-constructive mathematics, as it were.
In order to overcome the difficulties with the Riemann integral, Lebesgue substituted
the Lebesgue measure for Jordan's content, which had provided the foundation for the Riemann
integral. Earlier concepts of measure such as Jordan's content were only finitely additive in
the sense that
$m(A \cup B) = m(A) + m(B)$
for any two disjoint measurable sets A, B, and these led to more limited theories of
integration. On the other hand, the Lebesgue measure is $\sigma$-additive (countably
additive) in the sense that
$m(\bigcup_{n=1}^{\infty} X_n) = \sum_{n=1}^{\infty} m(X_n)$
for any pairwise disjoint measurable sets $X_n$. The measure defined in this way fits our
intuition that the measure should be a length in one dimension, an area in two dimensions,
and a volume in three dimensions. The upshot of this definition is that though the
countable union of sets with measure zero is again of measure zero, the uncountable union
of sets with measure zero can have positive measure. Due to the $\sigma$-additivity of the
Lebesgue measure, the Lebesgue integral based on the Lebesgue measure is more powerful
than the Riemann integral based on Jordan's content.
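The claim that a countable union of sets of measure zero is again of measure zero can be made concrete by the standard covering argument, which is not spelled out in the text and is offered here only as an illustration: cover the nth point of a countable set by an interval of length epsilon/2^n, so that the total length of the cover never exceeds epsilon. The function name cover_length is hypothetical.

```python
from fractions import Fraction


def cover_length(epsilon, count):
    """Total length of intervals of length epsilon / 2**n for n = 1..count.

    Each interval is meant to cover the nth point of some enumerated
    countable set; only the lengths matter for the argument.
    """
    return sum(Fraction(epsilon) / 2**n for n in range(1, count + 1))


eps = Fraction(1, 100)
for points in (1, 10, 50):
    print(points, float(cover_length(eps, points)))
```

However many points are covered, the total stays below epsilon, and since epsilon is arbitrary the countable set has measure zero.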
In order to see how significant the notion of $\sigma$-additivity is, it is useful to reconsider
Zeno's argument and the fifth-century Atomists' reaction to it.
According to Zeno's
argument, if finite extension is infinitely divisible, either the resulting least parts have no
size or they have some positive size. If they have no size, however, when put together
they result in something with no size. If they have a positive size, no matter how small it
may be, when an infinite number of them are put together, the result is something of infinite
size. Either way, we cannot form the original object by reassembling its parts. So, the
Atomists avoid this argument by claiming that bodies are ultimately composed of indivisibles.
To Zeno's argument, Lebesgue would reply that a countable set of points with measure zero
remains of measure zero, and only an uncountable set of points with measure zero can have
positive measure.
We now define the Lebesgue measure more precisely. Let E be the unit interval [0, 1].
The outer measure can be obtained by approximating the set from without by open sets.
That is, the outer measure of A is the infimum of the measures of the open sets containing A.
In symbols,
$m^*(A) = \inf\{m(G) \mid A \subseteq G,\ G \text{ an open set}\}$
On the other hand, the inner measure can be obtained by approximating the set from within
by compact (i.e., closed and bounded) sets.
That is, the inner measure of A is the
supremum of the measures of the compact sets contained in A. In symbols,
$m_*(A) = \sup\{m(K) \mid A \supseteq K,\ K \text{ a compact set}\} = 1 - m^*(E \setminus A)$
$E \setminus A$ is the difference of E and A. Most importantly, a set $A \subseteq E$ is
Lebesgue measurable if $m^*(A) = m_*(A)$.

For this, see McKirahan, Philosophy before Socrates: An Introduction with Texts and Commentary, p. 310.

A set with Lebesgue measure zero is called a null set. Here we must be careful not
to confuse the empty set with a null set in this sense. Actually, since we define the empty
set to be of measure zero, the empty set is a null set. But a set containing just a single
point, that is, a singleton, is a null set as well. Due to the $\sigma$-additivity of the
Lebesgue measure, any countable set of points is also a null set. Only an uncountable set can
have positive measure. But we have to note that some uncountable sets of points can be null
sets. A case in point is the Cantor set. The Cantor set is constructed as follows:
Take the unit interval [0, 1]. Divide it into three equal intervals and remove the middle
open third, leaving the set
$C_1 = [0, 1/3] \cup [2/3, 1]$
Then, divide each of the two intervals into three equal intervals and remove the middle
open third of each interval, leaving the set
$C_2 = [0, 1/9] \cup [2/9, 1/3] \cup [2/3, 7/9] \cup [8/9, 1]$
At the nth step, we get the set
$C_n = [0, 1/3^n] \cup [2/3^n, 1/3^{n-1}] \cup \cdots \cup [1 - 1/3^{n-1}, 1 - 2/3^n] \cup [1 - 1/3^n, 1]$
Repeat this process again and again. After $\omega$ steps, we get the Cantor set
$C_\omega = \bigcap_{n=1}^{\infty} C_n$

A set S is open if, for any $x \in S$, S contains an open ball with center x. In symbols,
$\forall x \in S\ \exists \varepsilon > 0\ ((x - \varepsilon, x + \varepsilon) \subseteq S)$.
A set S is closed if its complement $S^c$ is open. A set S is bounded if it is contained in
some ball. Note that a closed set is not necessarily bounded. According to this definition,
there are closed and unbounded sets. For instance, an infinite half-open interval
$(-\infty, 1]$ is closed because its complement $(1, \infty)$ is open,
but unbounded because it is an infinite interval.


Now, we prove that the Cantor set is of measure zero. As we can clearly see from
the process of constructing the Cantor set, at the nth step we get $2^n$ intervals of
length $1/3^n$. Therefore, $C_n$ has the total length $2^n/3^n = (2/3)^n$. Since
$(2/3)^n \to 0$ as $n \to \infty$, the Cantor set is of measure zero. To see the unique
nature of the Cantor set, suppose, for instance, that we repeatedly divide the unit interval
[0, 1] into two equal intervals and remove one half, leaving the other half. Indeed, after
$\omega$ steps, we get a set $A_\omega$ of measure zero since $(1/2)^n \to 0$ as
$n \to \infty$. Unlike the Cantor set $C_\omega$, however, $A_\omega$ converges to a single
point, so it is no wonder that $A_\omega$ is of measure zero.
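The stages of the construction and their total lengths can be computed exactly. What follows is a minimal sketch; the function names cantor_stage and total_length are hypothetical, and each stage is represented simply as a list of closed intervals with rational end points.

```python
from fractions import Fraction


def cantor_stage(n):
    """Return the closed intervals making up C_n as (left, right) pairs."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        next_intervals = []
        for left, right in intervals:
            third = (right - left) / 3
            next_intervals.append((left, left + third))     # keep the left third
            next_intervals.append((right - third, right))   # keep the right third
        intervals = next_intervals
    return intervals


def total_length(intervals):
    """Sum of the lengths of a list of disjoint intervals."""
    return sum(right - left for left, right in intervals)


for n in range(6):
    stage = cantor_stage(n)
    print(n, len(stage), total_length(stage))
```

The number of intervals doubles at each step while the total length is multiplied by 2/3, which is the computation behind the limit (2/3)^n -> 0.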
It remains to show that the Cantor set is uncountable. The key to this proof is to
see that the Cantor set is the set of reals in [0, 1] that can be expressed, in the ternary
system, using only 0 and 2 (i.e., without 1). This is the reason why the Cantor set is often
called the Cantor ternary set.
In general, when an integer N in the decimal system has a ternary expansion
$N = a_k 3^k + \cdots + a_1 3 + a_0$ $(a_i = 0, 1, 2)$,
N is written, in the ternary system, as
$a_k \ldots a_1 a_0$
Applying this notation to a real number $x \in [0, 1]$, when x has a ternary expansion
$x = a_1/3 + a_2/3^2 + a_3/3^3 + \cdots$ $(a_i = 0, 1, 2)$,
x is written, in the ternary system, as
$0.a_1 a_2 a_3 \ldots$
For instance, the number expressed by 34 in the decimal system is expressed by 1021 in the
ternary system because 34 has a ternary expansion: $1 \cdot 3^3 + 0 \cdot 3^2 + 2 \cdot 3 + 1$.
Likewise, the number expressed by 0.5 in the decimal system is expressed by 0.111... in the
ternary system because 0.5 has a ternary expansion: $1/3 + 1/3^2 + 1/3^3 + \cdots$
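The two worked examples can be reproduced mechanically. The sketch below uses hypothetical helper names (int_to_ternary, frac_to_ternary): integers are converted by repeated division by 3, and numbers in [0, 1) by repeated multiplication by 3, reading off the integer part as the next digit. For end points such as 1/3 this procedure yields the terminating expression.

```python
from fractions import Fraction


def int_to_ternary(n):
    """Ternary digit string of a non-negative integer."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, 3)
        digits.append(str(remainder))
    return "".join(reversed(digits))


def frac_to_ternary(x, places):
    """First `places` ternary digits of a rational x with 0 <= x < 1."""
    digits = []
    for _ in range(places):
        x *= 3
        digit = int(x)       # the integer part is the next ternary digit
        digits.append(digit)
        x -= digit
    return digits


print(int_to_ternary(34))
print(frac_to_ternary(Fraction(1, 2), 6))
```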
A possible ambiguity is that the end points can be written in two ways. For instance,
1/3 can be written as both 0.1 and 0.0222... and 2/3 can be written as both 0.2 and 0.1222...
But this does not cause much trouble, considering similar cases encountered in the
decimal system, such as 1 = 0.999... We shall adopt the rule according to which:
If the last non-zero place is 1, we choose the non-terminating expression.
Otherwise (i.e., if the last non-zero place is 2), we choose the terminating expression.
Then, 1/3 is written as 0.0222... and 2/3 is written as 0.2.
Using the terminology of the ternary system, we can put the process of constructing
the Cantor set in a simpler way. Every number x in [0, 1], expressed in the ternary
system, is of the form:
$x = 0.a_1 a_2 a_3 \ldots$ $(a_i = 0, 1, 2)$
The construction of the Cantor set in the ternary system is as follows:
$x \in C_1$ iff $a_1 = 0, 2$
$x \in C_2$ iff $a_1 = 0, 2$ & $a_2 = 0, 2$
$x \in C_n$ iff $a_1 = 0, 2$ & $a_2 = 0, 2$ & ... & $a_n = 0, 2$
Therefore, the Cantor set consists of the reals in [0, 1] expressed, in the ternary system,
by 0 and 2 (i.e., without 1), as stated above.
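The digit criterion can be tested on rational numbers without writing out expansions. The following sketch, with the hypothetical name in_cantor_stage, decides membership in C_n by reading off one admissible digit at a time: a point in [0, 1/3] is given the digit 0, a point in [2/3, 1] the digit 2, and the remainder is rescaled to [0, 1]. Using closed intervals at each step is an assumption that builds in the end-point rule, so that 1/3 is accepted through its expression 0.0222...

```python
from fractions import Fraction


def in_cantor_stage(x, n):
    """True if x lies in C_n, i.e. x has a ternary expression with a_1..a_n in {0, 2}."""
    for _ in range(n):
        if 0 <= x <= Fraction(1, 3):
            x = 3 * x            # digit 0, rescale the left third to [0, 1]
        elif Fraction(2, 3) <= x <= 1:
            x = 3 * x - 2        # digit 2, rescale the right third to [0, 1]
        else:
            return False         # x lies in a removed open middle third
    return 0 <= x <= 1


for candidate in (Fraction(1, 3), Fraction(1, 4), Fraction(1, 2), Fraction(1, 6)):
    print(candidate, in_cantor_stage(candidate, 20))
```

For example, 1/4 = 0.020202... passes every stage, whereas 1/2 = 0.111... is excluded at the first step and 1/6 at the second.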
Now, by the diagonal argument, we can show that the Cantor set is uncountable.
Suppose that the Cantor set is put in one-to-one correspondence with the natural numbers as
follows:
$1 \leftrightarrow 0.a_{11} a_{12} a_{13} \ldots$
$2 \leftrightarrow 0.a_{21} a_{22} a_{23} \ldots$
$n \leftrightarrow 0.a_{n1} a_{n2} a_{n3} \ldots$
As we have seen, each $a_{nk}$ is 0 or 2. So, let $b_n = 2$ if $a_{nn} = 0$ and $b_n = 0$ if
$a_{nn} = 2$. Then, we get a number
$0.b_1 b_2 b_3 \ldots$
This is exactly a number that differs in the nth place from the number corresponding to n,
and therefore cannot be found in the list above. Hence the Cantor set is uncountable. This
shows that even though it is uncountable, the
Cantor set is so scattered that it is negligible from the Lebesgue measure-theoretical point
of view.
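The diagonal construction can be illustrated on a finite fragment of such a list. This is only a sketch: the name diagonal_escape and the sample listing are hypothetical, and a finite table can of course only show the mechanism, not the uncountability itself.

```python
def diagonal_escape(rows):
    """Given rows[n][k] in {0, 2}, return digits b with b[n] != rows[n][n]."""
    return [2 if rows[n][n] == 0 else 0 for n in range(len(rows))]


# A hypothetical initial fragment of a purported enumeration of the Cantor set,
# each row giving the first four ternary digits of one member.
listing = [
    [0, 0, 0, 0],
    [2, 2, 2, 2],
    [0, 2, 0, 2],
    [2, 0, 0, 2],
]

b = diagonal_escape(listing)
print(b)
```

Since the new digits are again only 0 and 2, the number 0.b_1 b_2 b_3 ... belongs to the Cantor set, yet it disagrees with the nth listed number in the nth place.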
Now, we shall discuss what impact the Lebesgue measure has on the Lebesgue
integral. The upshot of the Lebesgue integral is that the value of the integral
$\int f(x)\,dx$ is not affected by the values of the function f(x) at points x that form a
null set. The Lebesgue integral can give an explicit answer to the question of how many points
can be removed without altering the value of the integral. The answer is as many points as
form a null set. In other words, points that do not form a null set determine the value of
the integral. We already know that due to the $\sigma$-additivity of the Lebesgue measure, a
countable set of points forms a null set. Therefore, most importantly, even if the values of a
function f(x) are altered at a countable set of points, the value of the integral remains the
same. This is another way to show that the set of reals is not countable. This also tells us
that the boundaries make no difference to the area.
If some property holds except on a null set, the property is said to hold almost
everywhere (abbreviated a.e.), or presque partout (abbreviated p.p.). If two
different functions satisfy f(x) = g(x) almost everywhere, we cannot distinguish f(x) from g(x).
Then, the Lebesgue theory tells us that we can regard these two functions as being virtually
the same.
The Lebesgue integral made possible the integration of functions that are not
Riemann integrable. The characteristic function $\chi_Q(x)$ of the set Q of rationals is an
example of functions that are not Riemann integrable but Lebesgue integrable. The reason
$\chi_Q(x)$ is not Riemann integrable is as follows. No matter how small the cells
$(x_0, x_1), (x_1, x_2), \ldots, (x_{n-1}, x_n)$ of a partition of [0, 1] are, rationals and
irrationals coexist in the same cell. Therefore, it's possible to choose a rational from any
cell and then an irrational from any cell in such a way that the upper and lower step
functions of $\chi_Q(x)$ cannot coincide with each other.
The stock-in-trade of the Lebesgue integral is to approximate f(x) not by means of
vertical strips, i.e., the upper and lower step functions, but by means of horizontal strips.
The Lebesgue integral involves a partition of the range of f(x) rather than a partition of the
domain as for the Riemann integral. Thus, in the Lebesgue integral we partition the range
of f(x) and approximate f(x) by means of the upper and lower functions bracketing f(x)
from without and within respectively. A partition P of an interval [a, b] containing the
range of f(x) is a set $\{a_0, a_1, \ldots, a_n\}$ such that
$a = a_0 < a_1 < \cdots < a_n = b$.
Let $S_i = \{x \mid a_{i-1} \le f(x) < a_i\}$ $(i = 1, 2, \ldots, n)$.
Then, the upper function is
$\overline{\psi}(x) = \sum_{i=1}^{n} a_i \chi_{S_i}(x)$.
This function is called a simple function (or generalized step function).
Similarly, the lower function is
$\underline{\psi}(x) = \sum_{i=1}^{n} a_{i-1} \chi_{S_i}(x)$.
Now, the upper sum is
$U(f, P) = \sum_{i=1}^{n} a_i\, m(S_i)$.
In the same way, the lower sum is
$L(f, P) = \sum_{i=1}^{n} a_{i-1}\, m(S_i)$.
As with the Riemann integral, f is Lebesgue integrable iff the infimum of the upper sums
equals the supremum of the lower sums. Therefore, $\chi_Q(x)$ is Lebesgue integrable on R, and
$\int_R \chi_Q(x)\,dx = 1 \cdot m(Q) + 0 \cdot m(R \setminus Q) = 0$
because Q is a null set, so m(Q) = 0.

$m(S_i)$ stands for the Lebesgue measure of $S_i$.

Figure 2: Riemann Integral vs. Lebesgue Integral.
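The horizontal-strip construction can be tried on a simple case. The following sketch is an illustration under stated assumptions, not part of the original argument: the function name lebesgue_sums is hypothetical, f(x) = x^2 on [0, 1] is chosen because the set S_i is then the interval from sqrt(a_{i-1}) to sqrt(a_i), whose measure is just its length, and the range [0, 1] is cut into n equal levels.

```python
import math


def lebesgue_sums(n):
    """Lower and upper Lebesgue sums of f(x) = x**2 on [0, 1] for n equal range cells."""
    lower = 0.0
    upper = 0.0
    for i in range(1, n + 1):
        low_level = (i - 1) / n
        high_level = i / n
        # S_i = {x in [0, 1] : low_level <= x**2 < high_level}
        #     = [sqrt(low_level), sqrt(high_level)), so m(S_i) is its length.
        measure = math.sqrt(high_level) - math.sqrt(low_level)
        lower += low_level * measure
        upper += high_level * measure
    return lower, upper


for levels in (10, 100, 1000):
    print(levels, lebesgue_sums(levels))
```

Here the upper and lower sums differ by 1/n, because the measures m(S_i) add up to the length of [0, 1], and both approach 1/3, the same value as the Riemann integral of this function.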


Also, there are two major convergence theorems involving the Lebesgue integral: the
Monotone Convergence Theorem and the Lebesgue Dominated Convergence Theorem,
neither of which is true with regard to Riemann integrable functions. Both are concerned
with when the limit of the integrals is the integral of the limit, that is, when we can
interchange the order of the limit and the integral.

(Monotone Convergence Theorem)
Let $\{f_n(x)\}$ be a monotonically increasing sequence of non-negative measurable functions
such that $f_n(x)$ converges (pointwise) to f(x).
Then, $\lim_{n \to \infty} \int f_n(x)\,dx = \int f(x)\,dx$.

(Lebesgue Dominated Convergence Theorem)
Let $\{f_n(x)\}$ be a sequence of measurable functions such that
$\lim_{n \to \infty} f_n(x) = f(x)$.
If the sequence is dominated by an integrable function g(x) in the sense that
$|f_n(x)| \le g(x)$,
then $\lim_{n \to \infty} \int f_n(x)\,dx = \int f(x)\,dx$.


Here I need to clarify the distinction between uniform convergence and pointwise
convergence. The sequence of functions $f_1(x), f_2(x), \ldots$ is said to converge uniformly
to f(x) if
$\forall \varepsilon > 0\ \exists N\ \forall x$ (if $n \ge N$, then $|f(x) - f_n(x)| < \varepsilon$).
On the other hand, the sequence of functions $f_1(x), f_2(x), \ldots$ is said to converge
pointwise to f(x) if
$\forall \varepsilon > 0\ \forall x\ \exists N$ (if $n \ge N$, then $|f(x) - f_n(x)| < \varepsilon$).
We note that the order of the quantifiers over x and N is reversed. In uniform convergence
the sequence converges to f(x) simultaneously for all x, whereas in pointwise convergence at
each x the sequence could converge to f(x) in different ways. This is the reason why N
precedes x in uniform convergence while x precedes N in pointwise convergence.

This would be easier to understand if interpreted in such a way that
$\forall \varepsilon > 0\ \exists N\ \forall x$ ($n \ge N$ is large enough so that
$|f(x) - f_n(x)| < \varepsilon$).

Obviously, uniform convergence implies pointwise convergence, but this implication
cannot be reversed. Uniform convergence is nice in the sense that if each function of the
sequence $f_1(x), f_2(x), \ldots$ has some property such as continuity or measurability, so
does the limit function f(x).
We shall take a couple of examples to see in concreto the difference between uniform
convergence and pointwise convergence. The sequence of functions $f_n(x) = (1 - 1/n)x$
($x \in [0, 1]$) converges uniformly to the function f(x) = x on [0, 1]. To see this, for a
given $\varepsilon$ choose $N > 1/\varepsilon$. Then, for all $n \ge N$ and for all
$x \in [0, 1]$, $|f(x) - f_n(x)| = x/n \le 1/n < \varepsilon$, as
required. On the other hand, the sequence of functions $f_n(x) = x^n$ ($x \in [0, 1]$)
converges pointwise to the function f(x) = 0 if $0 \le x < 1$ and f(x) = 1 if x = 1. This
convergence is not uniform because when $\varepsilon = 1/2$, say, then, no matter how large
n is, $|f(x) - f_n(x)| \ge 1/2$ for $x \in [1/\sqrt[n]{2}, 1)$. Notice that in the former case
each $f_n(x)$ and f(x) are continuous on [0, 1], while in the latter case each $f_n(x)$ is
indeed continuous on [0, 1] but f(x) is discontinuous at x = 1. These examples precisely show
that in uniform convergence f(x) inherits a nice property of each $f_n(x)$, i.e., continuity.
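The contrast between the two examples can be checked numerically. In the sketch below the function names gap_scaled and gap_power are hypothetical. For the first sequence the gap x/n is largest at x = 1, so evaluating it there gives the worst case; for the second sequence the witness point 1/2^(1/n) lies in [0, 1) and always leaves a gap of 1/2. Floating-point arithmetic is used, so the values are approximate.

```python
def gap_scaled(n, x):
    """|f(x) - f_n(x)| for f_n(x) = (1 - 1/n) x and limit f(x) = x."""
    return abs(x - (1 - 1 / n) * x)


def gap_power(n, x):
    """|f(x) - f_n(x)| for f_n(x) = x**n and its pointwise limit on [0, 1]."""
    limit = 1.0 if x == 1 else 0.0
    return abs(limit - x ** n)


for n in (10, 100, 1000):
    witness = 0.5 ** (1 / n)     # the point 1 / (nth root of 2), still below 1
    print(n, gap_scaled(n, 1.0), witness, gap_power(n, witness))
```

The worst-case gap of the first sequence shrinks like 1/n, whereas for the second sequence a point with gap 1/2 can be found for every n, which is exactly the failure of uniform convergence described above.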
The significance of the Lebesgue integral, not least of Lebesgue's Convergence
Theorems, is that it provides the foundation for functional analysis. Functional analysis is a
branch of mathematics that discusses Banach spaces and Hilbert spaces in a rigorous manner,
and is applied to the theory of integral equations or, beyond mathematics, to quantum
physics. Since functional analysis has to deal with discontinuous functions, the sequence
of functions does not necessarily converge uniformly. Therefore, Lebesgue's Convergence
Theorems, which make it possible to interchange the order of the limit and the integral not
only in uniform convergence but also in pointwise convergence, play a significant role in
functional analysis.


2.2 Measurable Cardinals
Now we consider how Lebesgue's theory of measure is applied to the theory of large
cardinals. Large cardinals are uncountable cardinals that cannot be reached from below.
Actually, mathematicians assume various kinds of large cardinals, e.g., inaccessible, Mahlo,
weakly compact, ineffable, and measurable cardinals, in increasing order of magnitude. That
is, the least weakly compact cardinal, if any, is a lot bigger than the least Mahlo cardinal,
which is, in turn, a lot bigger than the least inaccessible cardinal. Measurable cardinals
are very large cardinals, and the least measurable cardinal, if any, is greater than many
weakly compact and even ineffable cardinals. We have the proof that $2^{\aleph_0}$ is not a
measurable cardinal. The concept of a measurable cardinal plays a much greater role in the
theory of large cardinals than the weakly compact and the ineffable cardinals.
Now we shall see a measurable cardinal in connection with Lebesgue's theory of
measure. We begin with the definition of measure on a set S. A measure on a set S is a
map m from P(S) to [0, 1] such that
(i) $m(\emptyset) = 0$ and $m(S) = 1$
(ii) Monotonicity: If $A \subseteq B$, then $m(A) \le m(B)$
(iii) Non-triviality: $m(\{a\}) = 0$ for $a \in S$
(iv) $\sigma$-additivity: If the $X_n$'s are pairwise disjoint, then
$m(\bigcup_{n=1}^{\infty} X_n) = \sum_{n=1}^{\infty} m(X_n)$
As to the Lebesgue measure, it is natural to ask whether or not there is a non-trivial
translation-invariant countably additive measure on all subsets of reals. In 1905 Vitali
showed the existence of non-Lebesgue measurable sets of reals by using the Axiom of
Choice. The concept of a measurable cardinal arose in response to Vitali's construction of
a non-Lebesgue measurable set of real numbers.
(1) If the measure does not need to be translation-invariant, is there a non-trivial countably
additive measure on all sets of real numbers?
(2) Is there such a measure for all subsets of some set S?
These questions led to the theory of large cardinal numbers, which had a great impact on
both pure set theory and descriptive set theory.
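We define the notion of $\kappa$-additivity by generalizing the notion of $\sigma$-additivity:
If $\kappa$ is regular, $\lambda < \kappa$, and the $X_\alpha$'s are pairwise disjoint for any
$\alpha < \lambda$, then
$m(\bigcup_{\alpha < \lambda} X_\alpha) = \sum_{\alpha < \lambda} m(X_\alpha)$
Though $m(\bigcup_{\alpha < \kappa} \{\alpha\}) \ne \sum_{\alpha < \kappa} m(\{\alpha\})$
because measure zero is assigned to singletons, the
upshot of this definition is that the union of fewer than $\kappa$ sets with measure zero
remains of measure zero. We are now ready to define measurable cardinals:
$\kappa$ is measurable if and only if $\kappa > \aleph_0$ and $\kappa$ has a two-valued,
$\kappa$-additive measure.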

Intuitively, a cardinal $\kappa$ is measurable iff $\kappa$ is an uncountable cardinal and the
union of fewer than $\kappa$ sets of measure zero is of measure zero. It is worth noting that
we put $\kappa > \aleph_0$ in the definition. If it were not for this condition, $\aleph_0$
would be measurable. The cardinal 2 would also be measurable. In this case, according to the
conditions that a measure must satisfy, $m(\emptyset) = 0$, $m(\{0\}) = 0$, $m(\{1\}) = 0$,
and $m(\{0, 1\}) = 1$. Obviously, m is a 2-additive measure on 2. But the core of the
definition of measurable cardinals is that the union of countably many sets of measure zero is
again of measure zero. Therefore, cardinals $\kappa \le \aleph_0$ should not be considered as
measurable. This is why we put $\kappa > \aleph_0$ in the definition.
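The example of the cardinal 2 can be verified by listing all four subsets. This sketch is an illustration under stated assumptions: the helper powerset and the variable names are hypothetical, and 2 is represented as the set {0, 1}. It checks conditions (i) to (iii) for the map m described above, and then shows that additivity already fails for the two disjoint singletons, which is the reason the text gives for excluding such cardinals.

```python
from itertools import combinations


def powerset(elements):
    """All subsets of a finite collection, as frozensets."""
    items = list(elements)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


two = frozenset({0, 1})                  # the cardinal 2 = {0, 1}
m = {subset: 0 for subset in powerset(two)}
m[two] = 1                               # m({0, 1}) = 1, every other subset gets 0

normalized = m[frozenset()] == 0 and m[two] == 1                  # condition (i)
monotone = all(m[a] <= m[b] for a in m for b in m if a <= b)      # condition (ii)
nontrivial = all(m[frozenset({a})] == 0 for a in two)             # condition (iii)

# {0} and {1} are disjoint sets of measure zero whose union has measure 1.
singletons_add_up = m[frozenset({0}) | frozenset({1})] == m[frozenset({0})] + m[frozenset({1})]

print(normalized, monotone, nontrivial, singletons_add_up)
```

The map is 2-additive only in the trivial sense that a union of fewer than two sets is a single set; as soon as two null sets are combined, additivity breaks down.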
We have to note that there is a slippage between the naming of $\sigma$-additivity and the
naming of $\kappa$-additivity. For $\sigma$-additivity is so called by paying attention to the
point up to which the union of sets with measure zero remains of measure zero (therefore, the
uncountable union of sets with measure zero can have positive measure), whereas
$\kappa$-additivity is so called by paying attention to the point beyond which the union of
sets with measure zero can have positive measure (therefore, the union of fewer than $\kappa$
sets with measure zero remains of measure zero). Thus, $\sigma$-additivity is the same as
$\aleph_1$-additivity.
As we have seen, $\aleph_0$ is a non-measurable cardinal. Then, a question arises: Is
$2^{\aleph_0}$ also a
non-measurable cardinal? We can show that the answer is yes by reductio ad absurdum:

Since we put non-triviality into the definition of measure above, we dropped it from the
definition of measurable cardinals here. But we could separate non-triviality from the
definition of measure, and put it into the definition of measurable cardinals: $\kappa$ is
measurable if and only if $\kappa > \aleph_0$ and $\kappa$ has a two-valued,
$\kappa$-additive, non-trivial measure.
Fortunately, there arises no ambiguity here because $\aleph_0$ is the first aleph.


If we suppose that $2^{\aleph_0}$ is a measurable cardinal, we shall reach a contradiction.
The proof runs as follows:
Suppose that $2^{\aleph_0}$ is a measurable cardinal, that is, we can assign a two-valued,
$\kappa$-additive measure (with $\kappa = 2^{\aleph_0}$) to all the subsets of $2^{\aleph_0}$.
Therefore, there are $2^{2^{\aleph_0}}$ many subsets. Now
we take a function $f: \omega \to \{0, 1\}$. But note that this function is still not the one
that assigns a measure to all the subsets of $2^{\aleph_0}$. And we have the set of such
functions f: ${}^{\omega}2 = \{f_0, f_1, f_2, \ldots\}$. Assigning a measure to all the subsets
of $2^{\aleph_0}$ boils down to assigning a measure to all
the subsets of ${}^{\omega}2$. When we discuss the set of functions, we denote it by
${}^{\omega}2$ in distinction from cardinal exponentiation $2^{\omega}$ in order to avoid
confusion.
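According to the definition (i) of measure above, $m({}^{\omega}2) = 1$. Here we can divide ${}^{\omega}2$
into two disjoint sets: the set of functions satisfying $f(0) = 0$ and the set of functions
satisfying $f(0) = 1$. That is,
${}^{\omega}2 = \{f \in {}^{\omega}2 \mid f(0) = 0\} \cup \{f \in {}^{\omega}2 \mid f(0) = 1\}$
Then, we have to assign measure 1 to either $\{f \in {}^{\omega}2 \mid f(0) = 0\}$ or
$\{f \in {}^{\omega}2 \mid f(0) = 1\}$, no matter which it may be, because if both of them were of measure
zero, they would not add up to $m({}^{\omega}2) = 1$, contrary to the definition (iv) of measure above.
Just for the sake of argument, we shall assume that $m(\{f \in {}^{\omega}2 \mid f(0) = 1\}) = 1$. Again, we
can divide the set $\{f \in {}^{\omega}2 \mid f(0) = 1\}$ into two disjoint sets: the set of functions
satisfying $f(0) = 1 \,\&\, f(1) = 0$ and the set of functions satisfying $f(0) = 1 \,\&\, f(1) = 1$. That is,
$\{f \in {}^{\omega}2 \mid f(0) = 1\} = \{f \in {}^{\omega}2 \mid f(0) = 1 \,\&\, f(1) = 0\} \cup \{f \in {}^{\omega}2 \mid f(0) = 1 \,\&\, f(1) = 1\}$.
For the same reason as before, we have to assign measure 1 to either
$\{f \in {}^{\omega}2 \mid f(0) = 1 \,\&\, f(1) = 0\}$ or $\{f \in {}^{\omega}2 \mid f(0) = 1 \,\&\, f(1) = 1\}$. Just for
the sake of argument, we shall assume that $m(\{f \in {}^{\omega}2 \mid f(0) = 1 \,\&\, f(1) = 1\}) = 1$. We let
this process go on and on.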

Here is the upshot of this proof. The Axiom of Choice guarantees that after $\omega$ steps,
we get the set which consists of the unique function:
$\{f^{*}\} = \{f \in {}^{\omega}2 \mid f(0) = 1 \,\&\, f(1) = 1 \,\&\, \ldots \,\&\, f(n) = 1 \,\&\, \ldots\}$
Then we ask: Is $m(\{f^{*}\}) = 0$ or $1$? According to $2^{\aleph_0}$-additivity, all the sets to which
we have assigned measure zero so far cannot add up to measure 1, but $m({}^{\omega}2) = 1$, so
$m(\{f^{*}\}) = 1$. According to the definition (iii) of measure above, however, the singleton is of
measure zero and the set $\{f^{*}\}$ is a singleton, so $m(\{f^{*}\}) = 0$. A contradiction. This
completes the proof. This proof also tells us that if ${}^{\omega}2$ has a two-valued, $2^{\aleph_0}$-additive
measure, it should be trivial.
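The halving argument can be illustrated by a finite analogue. The following is only a sketch under stated assumptions, not the transfinite proof itself: the functions are 0-1 sequences of a fixed finite length n, and the two-valued measure is a hypothetical one that gives measure 1 exactly to the sets containing a chosen sequence `hidden`. Following the half of measure 1 at each step ends on a singleton of measure 1, which is what triviality of the measure amounts to in the finite case.

```python
from itertools import product

def bisect_to_point(n, measure):
    """Follow the halving process: at each coordinate keep the half of measure 1."""
    prefix = []
    for i in range(n):
        # All sequences that extend the current prefix by the value 0 at coordinate i.
        half0 = {f for f in product((0, 1), repeat=n) if list(f[:i + 1]) == prefix + [0]}
        prefix.append(0 if measure(half0) == 1 else 1)
    return tuple(prefix)

# Hypothetical two-valued measure: a set has measure 1 iff it contains `hidden`.
n = 4
hidden = (1, 1, 0, 1)
measure = lambda subset: 1 if hidden in subset else 0

found = bisect_to_point(n, measure)
print(found, measure({found}))
```

In the finite case the surviving singleton has measure 1, so the measure violates condition (iii); the proof above shows that the same thing happens after $\omega$ steps.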
It is interesting to capture measurable cardinals in connection with the notion of an
ultrafilter. But first we have to define the notion of a filter:
A filter on a non-empty set $S$ is a collection $F$ of subsets of $S$ such that for any $A, B \subseteq S$,
(i) $S \in F$ and $\emptyset \notin F$.
(ii) If $A, B \in F$, then $A \cap B \in F$.
(iii) If $A \in F$ and $A \subseteq B$, then $B \in F$ (In words, any set $B$ that contains a set $A$ being a
member of a filter $F$ is also in that filter).
A trivial filter is $F = \{S\}$. We define a principal filter. Let $X_0$ be a non-empty subset of $S$.
A principal filter is $F = \{X \subseteq S \mid X_0 \subseteq X\}$. This means that there is no infinite regress
in that filter. Thus every filter on a finite set is a principal filter. Take as an example the filters
on the set $\{0, 1, 2, 3\}$. A trivial filter is $F = \{\{0, 1, 2, 3\}\}$. A filter is
$F = \{\{0, 1, 2\}, \{0, 1, 2, 3\}\}$. Another filter is $F = \{\{0, 1\}, \{0, 1, 2\}, \{0, 1, 3\}, \{0, 1, 2, 3\}\}$.
I shall also define the dual notion of a filter, i.e., an ideal:
An ideal on a non-empty set $S$ is a collection $I$ of subsets of $S$ such that for any $A, B \subseteq S$,
(i) $\emptyset \in I$ and $S \notin I$.
(ii) If $A, B \in I$, then $A \cup B \in I$.
(iii) If $A \in I$ and $B \subseteq A$, then $B \in I$ (In words, any set $B$ that is contained in a set $A$
being a member of an ideal $I$ is also in that ideal).
There is a remarkable relationship between a filter $F$ and an ideal $I$ on $S$:
$I = \{S \setminus X \mid X \in F\}$ or equivalently, $F = \{S \setminus X \mid X \in I\}$. So, for the set
$\{0, 1, 2, 3\}$ mentioned above, an ideal is $I = \{\emptyset, \{3\}\}$. Another ideal is
$I = \{\emptyset, \{2\}, \{3\}, \{2, 3\}\}$.
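The three filter conditions and the duality with ideals can be checked mechanically on the four-element example. The sketch below is illustrative only; the helper names `powerset`, `is_filter` and `dual_ideal` are hypothetical and not taken from any source.

```python
from itertools import combinations

def powerset(s):
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def is_filter(family, S):
    """Check conditions (i)-(iii) of the definition of a filter on S."""
    family = set(family)
    if frozenset(S) not in family or frozenset() in family:
        return False                      # (i) S is in F and the empty set is not
    for A in family:
        for B in family:
            if A & B not in family:
                return False              # (ii) closed under intersection
        for B in powerset(S):
            if A <= B and B not in family:
                return False              # (iii) closed under supersets
    return True

def dual_ideal(family, S):
    """The ideal of complements S \\ X for X in the filter."""
    return {frozenset(S) - X for X in family}

S = {0, 1, 2, 3}
F = {frozenset(x) for x in [{0, 1}, {0, 1, 2}, {0, 1, 3}, {0, 1, 2, 3}]}
print(is_filter(F, S))
print(sorted(sorted(x) for x in dual_ideal(F, S)))
```

The dual of the filter generated by {0, 1} comes out as the ideal whose members are the empty set, {2}, {3} and {2, 3}, in agreement with the second ideal listed above.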

We now turn to the notion of an ultrafilter, which is closely related to measurable cardinals.
An ultrafilter is a filter $F$ on a set $S$ such that for any $X \subseteq S$, either $X \in F$ or
$X^{c} \in F$. A set $X^{c}$ is the complement of a set $X$. Again, for the set $\{0, 1, 2, 3\}$ mentioned
above, an ultrafilter is
$U = \{\{0\}, \{0, 1\}, \{0, 2\}, \{0, 3\}, \{0, 1, 2\}, \{0, 1, 3\}, \{0, 2, 3\}, \{0, 1, 2, 3\}\}$.
Tarski's Theorem tells us that every filter can be extended to an ultrafilter. It is known that
the proof of Tarski's Theorem uses the Axiom of Choice. A prime ideal is the dual notion
of an ultrafilter. A prime ideal on the set $\{0, 1, 2, 3\}$ is
$\{\emptyset, \{1\}, \{2\}, \{3\}, \{1, 2\}, \{1, 3\}, \{2, 3\}, \{1, 2, 3\}\}$. The Prime Ideal Theorem, which is
the counterpart of Tarski's Theorem for an ultrafilter, tells us that every ideal can be extended to a
prime ideal. It is also known that we need the Axiom of Choice in order to prove the Prime Ideal Theorem.
Every subset of a set thus belongs either to a given ultrafilter or to its dual prime ideal.
Measure-theoretically, if $m$ is a two-valued measure, we have to note that an ultrafilter is a
collection $U$ of sets to which measure 1 is assigned. That is,
$U = \{X \subseteq S \mid m(X) = 1\}$.
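The correspondence between a two-valued measure and an ultrafilter can be seen on the finite set {0, 1, 2, 3}. The sketch below assumes a hypothetical two-valued measure `m` concentrated on the point 0; on a finite set every such measure is of this principal kind, which is why the interesting cases only arise for infinite sets.

```python
from itertools import combinations

S = frozenset({0, 1, 2, 3})
subsets = [frozenset(c) for r in range(len(S) + 1) for c in combinations(sorted(S), r)]

def m(X):
    # Hypothetical two-valued measure concentrated on the point 0.
    return 1 if 0 in X else 0

U = {X for X in subsets if m(X) == 1}            # sets of measure 1
prime_ideal = {X for X in subsets if m(X) == 0}  # sets of measure 0

# Ultrafilter property: exactly one of X and its complement lies in U.
is_ultra = all((X in U) != ((S - X) in U) for X in subsets)
print(len(U), len(prime_ideal), is_ultra)
```

The sets of measure 1 are the eight subsets containing 0, and the sets of measure 0 are exactly their complements.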
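To put measurable cardinals using the notion of an ultrafilter,
$\kappa$ is measurable if and only if there exists a $\kappa$-complete non-principal ultrafilter on $\kappa$.
In analogy with $\kappa$-additivity, we can give a definition of a $\kappa$-complete filter $F$ on $S$:
if $\kappa$ is regular, $\lambda < \kappa$, and $X_{\alpha} \in F$ for all $\alpha < \lambda$, then
$\bigcap_{\alpha < \lambda} X_{\alpha} \in F$.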
Assuming that there exist measurable cardinals, the following theorem clearly shows
the nature of measurable cardinals:
If $\kappa$ is measurable, then $\kappa$ is (strongly) inaccessible.
In what follows, omitting "strongly", we shall call it just inaccessible.
$\kappa$ is inaccessible if and only if $\kappa > \aleph_0$ is regular and a strong limit.
In contrast, $\kappa$ is weakly inaccessible if and only if $\kappa > \aleph_0$ is regular and a limit.
We can show this theorem by just modifying the proof that $2^{\aleph_0}$ is a non-measurable
cardinal.

We need to define the notion of regular cardinals. But first we have to define the
notion of the cofinality of a set. Let $\langle S, \le \rangle$ be a well-ordered set. A subset $A$ of $S$ is
called cofinal in $S$ if every element of $S$ is less than or equal to some element of $A$; in
particular, if $S$ has a greatest element, a cofinal subset must contain it. Also, the cofinality of a
set $S$, denoted by cf($S$), is the least cardinality of a cofinal subset. For an ordinal $\alpha$, if
cf($\alpha$) $< \alpha$, $\alpha$ is called singular, whereas if cf($\alpha$) $= \alpha$, $\alpha$ is called regular. Therefore,
(i) cf(0) $= 0$, so 0 is regular.
(ii) For any successor ordinal $\alpha + 1$, cf($\alpha + 1$) $= 1$, so 1 is regular and every other successor
ordinal is singular.
(iii) cf($\omega$) $= \omega$, so $\omega$ is regular.
Take the ordinal 3 as an example. The cofinal subsets of the ordinal 3 are {2}, {0, 2}, {1, 2},
{0, 1, 2} and cf(3) $= 1$ (the number of elements of the cofinal subset {2}). Thus 3 is singular,
as desired. The upshot of this definition is that a set of which $\omega$ is the supremum
must have $\omega$ many elements, as we can see from the fact that any cofinal subset of $\omega$, such as
$\{0, 1, 2, \ldots\}$, $\{1, 2, 3, \ldots\}$, or $\{0, 2, 4, \ldots\}$, has $\omega$ many elements.
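For finite ordinals the cofinal subsets and the cofinality can be enumerated directly. The sketch below identifies the ordinal n with the set {0, ..., n-1} and uses the usual definition of a cofinal subset (every element lies below or at some element of the subset); the function names are hypothetical.

```python
from itertools import combinations

def cofinal_subsets(n):
    """Subsets A of n = {0, ..., n-1} such that every s in n satisfies s <= a for some a in A."""
    elems = range(n)
    result = []
    for r in range(n + 1):
        for A in combinations(elems, r):
            if all(any(s <= a for a in A) for s in elems):
                result.append(set(A))
    return result

def cofinality(n):
    """Least size of a cofinal subset of the finite ordinal n."""
    return min(len(A) for A in cofinal_subsets(n))

print(cofinal_subsets(3))
print([cofinality(n) for n in range(5)])
```

The enumeration for 3 returns the four subsets containing 2, and the cofinalities of 0, 1, 2, 3, 4 come out as 0, 1, 1, 1, 1, so among finite ordinals only 0 and 1 are regular.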
We can get the concept of an inaccessible cardinal, stronger than that of a weakly
inaccessible cardinal, by replacing the moderately increasing sequence from $\aleph_{\alpha}$ to
$\aleph_{\alpha+1}$ with the exponentially and thus more rapidly increasing sequence from $\aleph_{\alpha}$
to $2^{\aleph_{\alpha}}$.
A limit cardinal is $\sup\{\aleph_{\alpha}, \aleph_{\alpha+1}, \aleph_{\alpha+2}, \ldots\}$
A strong limit cardinal is $\sup\{\aleph_{\alpha}, 2^{\aleph_{\alpha}}, 2^{2^{\aleph_{\alpha}}}, \ldots\}$
Interestingly enough, however, if the Generalized Continuum Hypothesis holds, the first
weakly inaccessible cardinal is identical with the first inaccessible cardinal. $\aleph_0$ is an
example of a strong limit cardinal, though $\aleph_0$ is of course not inaccessible because an
inaccessible cardinal has to be $> \aleph_0$. Also, we can say that $\aleph_0$ is to finite cardinals what
an inaccessible cardinal is to smaller cardinals. To put it more precisely, the theorem that
every measurable cardinal is inaccessible means that measurable cardinals (if any) are not

We prove first of all that $\kappa$ is a regular cardinal and then that $\kappa$ is a strong limit. For the
latter, we need only replace $\omega$ by $\lambda < \kappa$ in the proof that $2^{\aleph_0}$ is a non-measurable cardinal.

constructible from below by the set-theoretic operations. Therefore, every cardinal that is
constructible by the set-theoretic operations is non-measurable. Another way to see this
is Dana Scott's simple result in 1960 to the effect that if the Axiom of Constructibility does
hold, there are no measurable cardinals. Since the existence of measurable cardinals
opens up a more fruitful mathematical universe, most mathematicians reject the Axiom of
Constructibility.
Tait claims that despite the fact that both the Law of Excluded Middle and the Axiom of
Choice are non-constructive principles, it seems strange that there is a remarkable
difference in the attitudes of mathematicians toward them. More specifically, the so-called
French Constructivists, i.e., Borel, Baire, and Lebesgue did not challenge the Law of the
Excluded Middle, but only the Axiom of Choice. According to Tait, however, the
non-constructive nature of the Axiom of Choice ought to be attributed to the Law of
Excluded Middle, which should therefore be rejected. Tait formulates the Axiom of
Choice by using the term "type" instead of "set" in order to express that mathematical
objects are to be constructed as objects of some type A. In addition, if we understand the
existential quantifier correctly, Tait claims, the Axiom of Choice is constructively valid.
In my view, however, as DeVidi precisely points out, the Axiom of Choice contains
more information than can be saved by the constructive understanding of the existential
quantifier. In order to justify the Axiom of Choice in such a way as Tait suggests, we are
required to change the standard notion of a set drastically. The restricted form of the
Axiom of Choice, whatever form it may be, must be strictly distinguished from the Axiom
of Choice in its original form. In light of this, I point out that the non-constructive nature
of Lebesgue measure as shown above is incompatible with Lebesgue's skeptical attitude toward
the Axiom of Choice. In 1905 Vitali showed the existence of non-Lebesgue measurable sets
by using the Axiom of Choice. Nevertheless, Lebesgue was convinced that the Axiom of
Choice was false. But we have seen that the theory of large cardinals is based on the
notion of $\kappa$-additivity, which is the generalization of the notion of $\sigma$-additivity, which in
turn requires the Axiom of Choice in its proof.
Hence, I provisionally conclude that
Lebesgue should have accepted the Axiom of Choice with no restrictions.
In the next chapter we shall go on to consider the existence of non-Lebesgue
measurable sets in connection with the Banach-Tarski paradox in more detail.

Lebesgue implicitly used the Axiom of Choice in order to prove that the Lebesgue measure on the real line
is $\sigma$-additive. See Tait (1994), p. 47 and p. 64, n.6.



In this chapter, we discuss the Banach-Tarski Paradox in a mathematically rigorous way by
tracing back to its original form, the Hausdorff Paradox. The basic ideas of the former can
already be detected in the latter. So I believe that the Hausdorff Paradox holds the key to
correctly understanding the Banach-Tarski Paradox. In Section 1, we set the stage. In
Section 2, I shall give a concrete example of non-Lebesgue measurable sets derived using
the Axiom of Choice. In Section 3, I shall introduce two interconnected notions,
G-paradoxical and G-equidecomposable. Using these notions we can formulate the
Hausdorff Paradox and the Banach-Tarski Paradox in a rigorous manner. Then, we shall
discuss the Hausdorff Paradox. I put more emphasis on a concrete example of a
paradoxical decomposition than on the formal proof of the Paradox. We find out that
showing the existence of a non-Lebesgue measurable set by appeal to the Axiom of Choice
plays a crucial role in a paradoxical decomposition. In Section 4, we shall discuss the
Banach-Tarski Paradox. The Banach-Tarski Paradox improved on the
Hausdorff Paradox. So we shall show how the Hausdorff Paradox was refined by the
Banach-Tarski Paradox. In Section 5, we shall distinguish three different senses of
paradox. The Banach-Tarski Paradox is so called in the sense that it is a counterintuitive
theorem, as distinct from a logical contradiction or fallacious reasoning. In Section 6, we
shall discuss what cannot happen in the Banach-Tarski Paradox. That is, the Paradox
does not hold in $\mathbb{R}^1$ and $\mathbb{R}^2$, and a paradoxical decomposition cannot be performed using
fewer than five pieces. In Section 7, we shall see a paradox without invoking the Axiom
of Choice. This result is in favor of the Platonists. For even if the use of the Axiom of
Choice yields a paradox, it does not follow from this that we should reject the Axiom of
Choice. In Section 8, however, I mention a version of the Banach-Tarski Paradox that
does not require the Axiom of Choice. Although it seems that this result strikes a blow to
the Platonists, I claim that the ontological status of the Paradox should be determined by
carefully examining which of the Platonists or the Constructivists can explain the nature of
the Paradox more systematically. Finally, I point out that the Banach-Tarski Paradox
represents a characteristic nature of mathematics as distinct from natural science.
3.1 Preliminaries
First of all, we shall define an upper bound and the least upper bound or supremum. An
upper bound $x$ of a set $S$ is such that, for any element $s$ in $S$, $x$ is equal to or greater than $s$.
In symbols, $\forall s \in S\,(s \le x)$. Let $U$ be the set of all upper bounds of the set $S$. $U$ may be
empty, but otherwise the least element of $U$, if there is one, is called the least upper bound or
supremum of $S$. A lower bound and the greatest lower bound or infimum of $S$ are defined analogously.
The point is that $S$ does not necessarily attain the least upper bound or the greatest lower
bound in $S$. Take the open unit interval (0, 1) as an example. According to the definition,
0 and 1 are the greatest lower bound and the least upper bound of (0, 1) respectively, but are
not in (0, 1).
Note that we need to make a distinction between a maximal element and the greatest
element. $a$ is a maximal element in $S$ if there exist no elements greater than $a$ in $S$. On
the other hand, $b$ is the greatest element if $b$ is greater than any other element in $S$.
Let $(S, \le)$ be a partially ordered set as below.

Figure 3: Example of a partially ordered set.

Since $x$ and $y$ are not comparable, there is no greatest element in $S$. But both $x$ and $y$
are maximal elements in $S$ because there is no element greater than $x$ or $y$. The
distinction between a minimal element and the least element is the analogue of this
distinction.
We shall define relations R on a set S as follows:

R is reflexive if $xRx$
R is irreflexive if $\neg(xRx)$
R is symmetric if $xRy \to yRx$
R is anti-symmetric if $(xRy \,\&\, yRx) \to x = y$
R is transitive if $(xRy \,\&\, yRz) \to xRz$
R is connected if $\forall x \forall y\,(xRy \lor yRx)$
Also, R is an equivalence relation iff it is reflexive, symmetric and transitive. Let R
be an equivalence relation on a set S. Then, we shall define the set of x such that x is in S
and x is in the relation R to a. This set is called the equivalence class of a under the
equivalence relation R, denoted by R[a]. In symbols, $R[a] = \{x \mid x \in S \,\&\, xRa\}$. It is an
important fact that the equivalence relation R on a set S partitions S, that is, divides S into
equivalence classes so that any two distinct equivalence classes are disjoint. This is
known as the Equivalence Relation Theorem. For instance, $x \equiv y \pmod{n}$ is an
equivalence relation R on the set Z of all integers, and partitions Z, that is, divides Z so that
any two distinct residue classes are disjoint. This means nothing other than that we can
classify Z by the remainders when divided by n (i.e., the remainders $0, 1, 2, \ldots, n-1$).
Later we shall use the Equivalence Relation Theorem to prove the existence of
non-Lebesgue measurable sets.
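The partition induced by congruence modulo n can be computed for a finite stretch of integers. The following sketch is illustrative; the helper names and the choice of n = 3 on the range from -6 to 6 are assumptions made for the example.

```python
def is_equivalence(S, R):
    """Check reflexivity, symmetry and transitivity of R on the finite set S."""
    reflexive = all(R(x, x) for x in S)
    symmetric = all(R(y, x) for x in S for y in S if R(x, y))
    transitive = all(R(x, z) for x in S for y in S for z in S if R(x, y) and R(y, z))
    return reflexive and symmetric and transitive

def equivalence_classes(S, R):
    """Collect the distinct classes R[a] = {x in S : x R a}."""
    classes = []
    for a in S:
        cls = frozenset(x for x in S if R(x, a))
        if cls not in classes:
            classes.append(cls)
    return classes

n = 3
S = range(-6, 7)
same_remainder = lambda x, y: (x - y) % n == 0   # x is congruent to y modulo n

classes = equivalence_classes(S, same_remainder)
print(is_equivalence(S, same_remainder))
print([sorted(c) for c in classes])
```

Three pairwise disjoint classes result, one for each remainder 0, 1, 2, and together they exhaust the chosen range.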
In order to prove the Equivalence Relation Theorem, we first have to show that any
$a \in S$ belongs to some equivalence class. For any $a \in S$, since R is reflexive, $aRa$.
Therefore, for any $a \in S$, $a \in R[a]$. We next show that any two distinct equivalence classes
are disjoint. This means that for any $a, b \in S$, either $R[a] = R[b]$ or $R[a] \cap R[b] = \emptyset$. Now

In a logically correct order, the relation R is denoted by
$(x, y) \in R$, $R(x, y)$, $xRy$
So precisely a reflexive relation $xRx$ should be denoted by $(x, x) \in R$.


we suppose that $R[a] \cap R[b] \neq \emptyset$, and then show that $R[a] = R[b]$. Let $c \in R[a] \cap R[b]$.
Then $c \in R[a] \,\&\, c \in R[b]$. So $cRa \,\&\, cRb$. But, since R is symmetric, $aRc \,\&\, cRb$. Since
R is transitive, $aRb$. Now we prove that if $aRb$, then $R[a] = R[b]$. Let $x \in R[a]$. Then
$xRa$. By assumption $aRb$. Since R is transitive, $xRb$. So $x \in R[b]$. Therefore $x \in R[a]$
implies $x \in R[b]$. This means $R[a] \subseteq R[b]$. Let $x \in R[b]$. Then $xRb$. By assumption
$aRb$. Since R is symmetric, $bRa$. Since R is transitive, $xRa$. So $x \in R[a]$. Therefore
$x \in R[b]$ implies $x \in R[a]$. This means $R[b] \subseteq R[a]$. From $R[a] \subseteq R[b] \,\&\, R[b] \subseteq R[a]$, we
get $R[a] = R[b]$. This completes the proof.
We shall define some important ordering relations, using the relations mentioned above.
A strict partial ordering of a set S is a relation on S which is irreflexive, anti-symmetric,
and transitive.
A non-strict partial ordering of a set S is a relation on S which is reflexive, anti-symmetric,
and transitive.
In extension of a relation of magnitude between numbers, a strict ordering is denoted by $<$,
representing an irreflexive relation, and a non-strict ordering is denoted by $\le$, representing a
reflexive relation.
A total ordering of a set S is a connected partial ordering of S. A connected ordering is
also expressed as trichotomy: for any $x, y \in S$, either $x < y$, or $x = y$, or $x > y$. In a totally
ordered set, a maximal element is identical with the greatest element.
A well-ordering of a set S is a well-founded total ordering of S. A partially ordered set S is
well-founded if every non-empty subset of S has a minimal element. Since, as we have
seen above, in a total ordering a minimal element is identical with a least element, a
well-ordering of a set S amounts to a total ordering of S with the property that every
non-empty subset of S has a least element. Note that in a well-ordered set S every
non-empty subset of S, not just the set S itself, has a least element. Take the closed unit
interval [0, 1]. We have seen before that the Axiom of Choice is equivalent to the
Well-Ordering Theorem: Every set is well-orderable. So [0, 1] is well-orderable. But it
is not well-ordered by the order of magnitude because, for instance, the open interval (1/3, 2/3)
is a subset of [0, 1] but does not have a least element in the order of magnitude.
Roughly speaking, a partially ordered set is organized in a network of a number of one-way
tracks, while a totally ordered set is arranged along a single, one-way track. Also, a
well-ordered set is aligned one by one, next to next, with a beginning.
For instance, the power set of a set $S = \{0, 1, 2\}$ is partially ordered by inclusion $\subseteq$.
To see this, the power set of S is $\{\emptyset, \{0\}, \{1\}, \{2\}, \{0, 1\}, \{0, 2\}, \{1, 2\}, \{0, 1, 2\}\}$.

Figure 4: Partial order by inclusion of the power set of a set $S = \{0, 1, 2\}$.

A totally ordered subset of a partially ordered set is called a chain. In this example, since
$\emptyset \subseteq \{0\} \subseteq \{0, 2\} \subseteq \{0, 1, 2\}$, the subset $\{\emptyset, \{0\}, \{0, 2\}, \{0, 1, 2\}\}$ of the power
set of S constitutes a chain.
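The power-set example lends itself to a direct check of what counts as a chain and of the difference between maximal and greatest elements. The sketch below is illustrative only and the function names are hypothetical.

```python
from itertools import combinations

S = {0, 1, 2}
P = [frozenset(c) for r in range(len(S) + 1) for c in combinations(sorted(S), r)]

def is_chain(family):
    """A family is a chain if any two members are comparable under inclusion."""
    return all(A <= B or B <= A for A in family for B in family)

def maximal_elements(family):
    """Members that are not properly contained in any other member."""
    return [A for A in family if not any(A < B for B in family)]

chain = [frozenset(), frozenset({0}), frozenset({0, 2}), frozenset({0, 1, 2})]
not_a_chain = [frozenset({0}), frozenset({1})]
print(is_chain(chain), is_chain(not_a_chain))
print([sorted(A) for A in maximal_elements(P)])
print([sorted(A) for A in maximal_elements(P[:-1])])  # power set without {0, 1, 2}
```

With the full power set there is a single maximal element, which is also the greatest; once {0, 1, 2} is removed, the three two-element sets are all maximal and none of them is greatest.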
Now we are ready to state what Zorn's Lemma is.
Zorn's Lemma: Let $(S, \le)$ be a partially ordered set. If every chain C in S has an upper
bound in S, then S has a maximal element.

It is known that Zorn's Lemma is equivalent to the Axiom of Choice. This means that
even if every chain C in S has an upper bound in S, without the Axiom of Choice, we

In regard to a minimal element, the analogue of Zorn's Lemma does hold: Let $(S, \le)$ be a partially
ordered set. If every chain C in S has a lower bound in S, then S has a minimal element.


cannot say that S has a maximal element. Also, note that Zorn's Lemma does not concern
how many maximal elements there exist in S. We shall show how to use the Axiom of
Choice in order to prove Zorn's Lemma. The Axiom of Choice guarantees us that there is
a choice function $f$ on the power set of S. By assumption, for any element $c \in C$, we can
define a set $U_c = \{x \mid x \in C \,\&\, c < x\}$. Consider, for some element $c_0 \in C$, $U_{c_0}$. We can
find $U_{c_0}$ in the power set of S. Note that $U_{c_0} \neq \emptyset$ because every chain C in S has an upper
bound in S. So $f(U_{c_0}) \in C$. Let $c_1 = f(U_{c_0})$. Consider, for the element $c_1 \in C$, $U_{c_1}$.
We can find $U_{c_1}$ in the power set of S. $U_{c_1} \neq \emptyset$, so $f(U_{c_1}) \in C$. Let $c_2 = f(U_{c_1})$. We
continue this process on and on until $U_{c_n}$ is a singleton. Then $f(U_{c_n})$ is a maximal
element.

Figure 5: The use of the Axiom of Choice in the proof of Zorn's Lemma.


It is obviously false to claim that there is a maximal element in the set Z of all
integers. But does Zorn's Lemma prove this? It is incorrect to apply Zorn's Lemma to
this claim. For there is a chain that does not have an upper bound in Z, for example
$\{0, 1, 2, \ldots\} \subseteq Z$, or Z itself. So Z does not satisfy the assumption of Zorn's Lemma.
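In a finite partially ordered set the climbing process of the proof sketch terminates without any appeal to the Axiom of Choice, which makes it easy to illustrate. The following is a minimal sketch under stated assumptions: the order is strict divisibility on a hypothetical finite set of integers, and the fixed rule "take the least number above the current one" stands in for the choice function.

```python
def climb_to_maximal(elements, less_than, start):
    """Replace the current element by some strictly greater one until none exists."""
    current = start
    while True:
        above = [x for x in elements if less_than(current, x)]  # elements strictly above
        if not above:
            return current          # nothing lies above: current is maximal
        current = min(above)        # a fixed rule stands in for the choice function

# Hypothetical example: strict divisibility on a finite set of integers.
elements = [1, 2, 3, 4, 6, 8, 9, 12]
divides_strictly = lambda a, b: a != b and b % a == 0

print(climb_to_maximal(elements, divides_strictly, 1))
print(climb_to_maximal(elements, divides_strictly, 3))
```

Starting from 1 the climb ends at 8, while starting from 3 it ends at 12. Both are maximal and neither is greatest, which illustrates the remark that Zorn's Lemma does not concern how many maximal elements there are.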
3.2 Non-Lebesgue Measurable Sets
In 1905 Vitali showed that not all sets of real numbers are Lebesgue measurable. But he
used the Axiom of Choice in order to derive a non-Lebesgue measurable set. Therefore,
Lebesgue was convinced that the Axiom of Choice was false. But Lebesgue himself
implicitly used the Axiom of Choice to derive the $\sigma$-additive nature of Lebesgue measure.
Indeed, Lebesgue used a weaker form of the Axiom of Choice that restricts its application
to countable sets. As we shall see, in order to derive a non-Lebesgue measurable set, we
need a stronger form of the Axiom of Choice, which extends its application to uncountable
sets. But then naturally there arises a question of why we should restrict the use of the
Axiom of Choice to countable sets.
The existence of non-Lebesgue measurable sets is surprising and counter-intuitive.
In what follows, I shall derive a non-Lebesgue measurable set using the Axiom of Choice in
concreto. We define a relation R on the closed unit interval [0, 1]: $xRy$ if $y - x$ is a
rational. Note that $y - x$ is in $[-1, 1]$. For instance, $\frac{1}{3} R \frac{1}{2}$ because
$\frac{1}{2} - \frac{1}{3} = \frac{1}{6}$, which is a rational. Also, $(\frac{\pi}{10} + \frac{1}{4}) R (\frac{\pi}{10} + \frac{1}{3})$
because $(\frac{\pi}{10} + \frac{1}{3}) - (\frac{\pi}{10} + \frac{1}{4}) = \frac{1}{12}$, which is a rational. But
$\neg[\frac{\pi}{6} R \frac{\pi}{5}]$ because $\frac{\pi}{5} - \frac{\pi}{6} = \frac{\pi}{30}$, which is an irrational.
Since the relation R is reflexive, symmetric and transitive, it is an equivalence relation. As
we have seen above, the Equivalence Relation Theorem tells us that an equivalence relation
R on a set S partitions S, that is, divides S into equivalence classes so that any two distinct
equivalence classes are disjoint.
The difference of any two numbers that belong to distinct equivalence classes is an
irrational. But the difference of any two numbers that belong to the same equivalence
class is a rational. Therefore, each equivalence class is countable. Moreover, since the
relation R partitions [0, 1], which is uncountable, there are uncountably many distinct
equivalence classes. Note that the countable union of countable sets is countable. Now,
the Axiom of Choice guarantees us that there exists a set S containing exactly one element
from each equivalence class. We shall show that S is non-Lebesgue measurable.
Consider the translates of S, $S_i = S + r_i$, where $r_i$ is a rational in $[-1, 1]$. The $S_i$ are
pair-wise disjoint. To see this, suppose for reductio that $x \in S_i \cap S_j$ with $r_i \neq r_j$. So for
some $s_i, s_j \in S$, $s_i + r_i = s_j + r_j = x$, and $s_i \neq s_j$ because $r_i \neq r_j$. Then
$s_i - s_j = r_j - r_i$, which is a rational. As we have seen above, however, the difference of two
distinct elements of S is an irrational. A contradiction. So the $S_i$ are pair-wise disjoint.
Consider $S' = \bigcup_i S_i$. Since $r_i$ is a rational in $[-1, 1]$ and the set of the $r_i$ is countable,
$[0, 1] \subseteq S' \subseteq [-1, 2]$
By the translation-invariant nature of Lebesgue measure, the Lebesgue measures of the $S_i$ are all
the same as that of S. So
$m([0, 1]) = 1 \le m(S') = \sum_i m(S_i) = \sum_i m(S) \le m([-1, 2]) = 3$
Now we ask: $m(S) = 0$ or $m(S) > 0$? If $m(S) = 0$, then, by the $\sigma$-additive nature of Lebesgue
measure, $m(S') = 0$. On the other hand, if $m(S) > 0$, then $m(S') = \infty$. Either way the inequality
above fails. Thus S is non-Lebesgue measurable.
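The relation R itself, though not the choice set S, can be explored computationally on a restricted family of numbers. The sketch below assumes a hypothetical representation: only numbers of the form q + p·π with rational q and p are handled, and since π is irrational such a number is rational exactly when p = 0. It reproduces the three examples given for R.

```python
from fractions import Fraction
from math import pi

class RatPi:
    """A number of the form q + p*pi with rational q and p (hypothetical representation)."""
    def __init__(self, q, p=0):
        self.q, self.p = Fraction(q), Fraction(p)
    def __sub__(self, other):
        return RatPi(self.q - other.q, self.p - other.p)
    def is_rational(self):
        return self.p == 0   # pi is irrational, so q + p*pi is rational iff p == 0
    def __float__(self):
        return float(self.q) + float(self.p) * pi

def related(x, y):
    """x R y iff y - x is rational."""
    return (y - x).is_rational()

third, half = RatPi(Fraction(1, 3)), RatPi(Fraction(1, 2))
a = RatPi(Fraction(1, 4), Fraction(1, 10))                  # pi/10 + 1/4
b = RatPi(Fraction(1, 3), Fraction(1, 10))                  # pi/10 + 1/3
c, d = RatPi(0, Fraction(1, 6)), RatPi(0, Fraction(1, 5))   # pi/6 and pi/5

print(related(third, half), related(a, b), related(c, d))
print((b - a).q)
```

No program can list the set S, since its existence rests on the Axiom of Choice; the sketch only shows how membership in the same equivalence class is decided.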

The translation-invariant property means that the distance between two points a, b remains the same even
if each of them is shifted by t along the real line. That is,
$d(a, b) = d(a + t, b + t) = |b - a|$
In this way, the translation-invariant property is essential to derive a non-Lebesgue measurable set. If the
measure does not need to be translation-invariant, is there a non-trivial countably additive measure on all sets
of real numbers? This is Lebesgue's Measure Problem.

Figure 6: Example of a non-Lebesgue measurable set.

3.3 The Hausdorff Paradox
In order to discuss the Hausdorff Paradox in a rigorous manner, we need to define two
technical notions first.
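Let X be a set and G be a group that acts on X.
X is G-paradoxical if there are pairwise disjoint subsets $A_1, A_2, \ldots, A_i, B_1, B_2, \ldots, B_j$
of X and $g_1, \ldots, g_i, h_1, \ldots, h_j \in G$ such that
$\bigcup_{k=1}^{i} g_k(A_k) = X$ and $\bigcup_{k=1}^{j} h_k(B_k) = X$.
The figure below shows the case where $i = 4$, $j = 3$. Note that $i$ could be different from $j$.
Also, note that as the figure shows, $\bigcup_{k} A_k \cup \bigcup_{k} B_k$ is not necessarily X itself but could
be a proper subset of X.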

Figure 7: G-paradoxical.

In order to define G-equidecomposable, we need to define G-congruent first.
A is G-congruent to B (i.e., $A \cong_G B$) if $B = g(A)$ for some $g \in G$.
A is G-equidecomposable to B (i.e., $A \sim_G B$) if A and B are decomposed into the same,
finite number of pieces such that $A_k \cong_G B_k$ for each $k$.

Figure 8: G-equidecomposable.

We are ready to formulate the Hausdorff Paradox.

The Hausdorff Paradox claims that there is a countable subset $\Delta$ of $S^2$ such that
$S^2 \setminus \Delta$ is $SO_3$-paradoxical. Here $S^2$ is the unit sphere centered at the origin. The
difference $S^2 \setminus \Delta$ is the set of all elements which belong to $S^2$ but not to $\Delta$:
$S^2 \setminus \Delta = \{x \mid x \in S^2 \,\&\, x \notin \Delta\}$. The special orthogonal
group in three dimensions, denoted by $SO_3$, represents the group of rotations of $\mathbb{R}^3$. The
following is the flow chart of the proof of the Hausdorff Paradox.
1. If a group G is paradoxical and acts on X freely, X is G-paradoxical.
2. A free group of rank 2 is paradoxical.
3. $SO_3$ has a free subgroup of rank 2.
4. This free subgroup acts freely on $S^2 \setminus \Delta$.
5. Therefore, $S^2 \setminus \Delta$ is $SO_3$-paradoxical.
The Axiom of Choice is indispensable in Step 1. Since the Axiom of Choice is used
so generally in the major premise, however, we cannot clearly see into which pieces the
sphere is actually decomposed and how these pieces are reassembled. So, in what follows,
we shall consider a concrete example of Hausdorff's paradoxical decomposition.
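Step 2 of the flow chart can be made tangible with reduced words. The sketch below follows the standard textbook argument rather than anything specific to this chapter: in the free group on generators a and b, let W(x) be the set of reduced words beginning with x; then the whole group equals W(a) together with a·W(a⁻¹), and likewise for b. Writing inverses as capital letters is an assumption of the sketch, and the check is carried out only for words up to a fixed length N.

```python
from itertools import product

INV = {"a": "A", "A": "a", "b": "B", "B": "b"}  # capital letters denote inverses

def reduced_words(max_len):
    """All reduced words of length at most max_len (the empty word is the identity)."""
    words = [""]
    for n in range(1, max_len + 1):
        for letters in product("aAbB", repeat=n):
            if all(INV[x] != y for x, y in zip(letters, letters[1:])):
                words.append("".join(letters))
    return words

def multiply(u, v):
    """Concatenate two reduced words and cancel adjacent inverse pairs."""
    out = list(u)
    for x in v:
        if out and out[-1] == INV[x]:
            out.pop()
        else:
            out.append(x)
    return "".join(out)

def in_W(word, letter):
    return word.startswith(letter)

N = 6
ok = True
for w in reduced_words(N):
    # w lies in W(a), or w = a * v with v = a^-1 * w lying in W(a^-1); similarly for b.
    covered_a = in_W(w, "a") or in_W(multiply("A", w), "A")
    covered_b = in_W(w, "b") or in_W(multiply("B", w), "B")
    ok = ok and covered_a and covered_b
print(ok)
```

Two of the four disjoint sets W(a), W(a⁻¹), W(b), W(b⁻¹) thus suffice to cover the whole group after one of them is shifted, and the other two do the same, which is the group-level form of a paradoxical decomposition.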
We shall start off with the definition of a group. A group G is a set, with a binary
operation, satisfying the following axioms:
(1) Closure
(2) Associative
(3) Identity e (One under multiplication)
(4) Inverse (Every element in G has an inverse)
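Let G be a group generated by $\phi$ and $\psi$. Let $\phi$ be a counterclockwise rotation by
120° around the z-axis and let $\psi$ be a counterclockwise rotation by 180° around another line
through the origin. But it is not the case that any line through the origin does the job.
From the angles of the rotations, we have $\phi^3 = \psi^2 = e$ (where e is the identity). Using
this equality, any product of $\phi$, $\psi$ and their inverses can be reduced to a product in which only
$\phi$, $\phi^2$ and $\psi$ occur (for instance, $\phi^{-1} = \phi^2$ and $\psi^{-1} = \psi$). Then we get two kinds of
reduced products of $\phi$ and $\psi$:
$\psi\,\phi^{n_1}\,\psi\,\phi^{n_2} \cdots$ ($n_k = 1$ or $2$)
$\phi^{n_1}\,\psi\,\phi^{n_2}\,\psi \cdots$ ($n_k = 1$ or $2$)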

But still different reduced products could be substantially the same. For instance, suppose
that is a counterclockwise rotation by 120 around the y-axis. A quick thought
experiment shows that . Bracing for the following argument, we want to
avoid a situation like this. By imposing constraints on $\psi$ we can make arrangements so
that different reduced products represent different rotations. Now, suppose that the axis of $\psi$
is a line $\ell$ in the xz-plane which makes an angle $\theta$ with the z-axis. If different reduced
products are actually equal, an algebraic equation involving $\cos 2\theta$ can be solved, so
$\cos 2\theta$ is a number that can be a solution of an algebraic equation (i.e., an algebraic number).
Conversely, if $\cos 2\theta$ is a number that cannot be a solution of an algebraic equation (i.e., a
transcendental number), different reduced products are certainly different. Thus, we take $\theta$ so
that $\cos 2\theta$ is a transcendental number.

Figure 9: The Hausdorff Paradox.

G is called a free group if $\phi^3 = \psi^2 = e$ is the only relation between $\phi$ and $\psi$. In other
words, no relation other than $\phi^3 = \psi^2 = e$ holds between $\phi$ and $\psi$. In that sense, $\phi$ and
$\psi$ are free of relations. The upshot of this definition is that two different reduced products
represent two different transformations. For if any two reduced products were the same, it would
mean that there is some relation between $\phi$ and $\psi$ other than $\phi^3 = \psi^2 = e$. So, a
combination of $\phi$ and $\psi$ as presented in the Hausdorff Paradox forms a free group generated
by $\phi$ and $\psi$.
Here we define "G acts on X freely". It is trivial that any point is fixed by the
identity. So, a point that is fixed by an element in G other than the identity, if any, is
called a non-trivial fixed point. G acts on X freely if there are no non-trivial fixed points
in X, that is, every point in X is moved to another point in X by every element in G except
for the identity. We have to note that a free group G does not necessarily act on X freely.
In the Hausdorff Paradox, G does not act on $S^2$ freely. We notice that an element in G is a
rotation around some axis. So any element in G other than the identity fixes
exactly two points (i.e., the intersections of the sphere and the axis of rotation). Since
there are at most countably many combinations of $\phi$ and $\psi$, the number of fixed points is also
countable. Let $\Delta$ be the set of all points of $S^2$ fixed by some element in G other than the
identity. And we can find an interesting feature of $S^2 \setminus \Delta$. A point in $S^2 \setminus \Delta$ is fixed
only by the identity. The upshot is, an element in $S^2 \setminus \Delta$ is moved to another element in
there by any element in G other than the identity. Technically speaking, G acts on
$S^2 \setminus \Delta$ freely (i.e., with no nontrivial fixed points).
Now we consider the set of all points to which $x_0$ is moved by any element in G. It
is called the G-orbit of $x_0$: $O_{x_0} = \{x \mid x = g x_0, g \in G\}$. We can prove that for two points
$x_0, x_1$ in $S^2 \setminus \Delta$, either $O_{x_0} = O_{x_1}$ or $O_{x_0} \cap O_{x_1} = \emptyset$ by showing that if
$O_{x_0} \cap O_{x_1} \neq \emptyset$ then $O_{x_0} = O_{x_1}$. Note that $S^2 \setminus \Delta$ is uncountable because $S^2$
is uncountable and $\Delta$ is countable.
So the G-orbit of $x_0$ is the equivalence class of $x_0$. The Equivalence Relation Theorem
tells us that the equivalence relation R on a set X partitions X, that is, divides X into
equivalence classes so that any two distinct equivalence classes are disjoint. Indeed, since
the elements in G are distinct, a point $x_0$ in $S^2 \setminus \Delta$ is moved to different points by different
elements in G. Since there are countably many combinations of $\phi$ and $\psi$, however, the number of
elements in G is at most countable. So $S^2 \setminus \Delta$ is partitioned into uncountably many
equivalence classes. Then the Axiom of Choice guarantees the existence of a set T containing exactly
one element from each equivalence class.

Figure 10: The existence of a set T containing exactly one element from each G-orbit.

But we note that the T itself is not a piece into which S
is decomposed,
although we use T to derive a paradoxical decomposition. Hausdorff decomposed S

into three pieces, S
, S
, S
, depending on the last word multiplied from T (Hausdorff
{x tT; x . (t)}
{x tT; x. (t)}
{x tT; x
. (t)}
Thus, (S
, (S
Hence, S

Figure 11: Hausdorffs paradoxical decomposition.

In order to make this agree with the definition of G-paradoxical, by dividing the S

further into two pieces, we decomposed the sphere into four pieces, A
, A
, B
, B
classified by the last rotation multiplied from T as follows:
{x tT; xt or x . (t)}
{x tT; xt or x
. (t)}
{x tT; xt or x. (t)}
{x tT; x
t or x
. (t)}
These four pieces are the ones we use in a paradoxical decomposition.
Next we consider the subsets, A
', A
', B
', B
', of these pieces.
'{x tT; x . (t)}
'{x tT; x
. (t)}

'{x tT; x

. (t)}
'{x tT; x

. (t)}
Note that (A
). Also,

This shows that S
is G-paradoxical.
We can show that these four pieces used in a paradoxical decomposition are
non-Lebesgue measurable. We have (A
, B
, B
). So,
. Measure is preserved by rotations. Hence,
But we also have (A
) T(B
). For the same reason,
This is impossible. Therefore, we know that $A_1$, $A_2$, $B_1$, $B_2$ are all non-Lebesgue
measurable.
We have to determine how to count the number of pieces into which we decompose
to derive a paradox. First of all, we need to define G-equidecomposable using n pieces.
Let X be a set and G a group that acts on X. For AX, A is G-equidecomposable with X
using n pieces if

Suppose that $X = A \cup B$ with A and B disjoint. If A is equidecomposable with X using $i$ pieces
and B is equidecomposable with X using $j$ pieces (in symbols, $A \sim_i X \sim_j B$), then X is
G-paradoxical using $i + j = k$ pieces.
For instance, suppose that
, BB

Then A
B, so X is G-paradoxical using five pieces.

This is exactly what we mean below when we refer to the minimum number of pieces
required in a paradoxical decomposition. In this sense, X cannot be G-paradoxical using
fewer than four pieces. This definition is important because in some cases the number of
pieces differs depending on the order of rigid motions. In the Hausdorff Paradox, it seems
that we manage with only the three pieces into which Hausdorff decomposed the sphere, but
according to this definition we need four pieces, $A_1$, $A_2$, $B_1$, $B_2$.
3.4 The Banach-Tarski Paradox
The Banach-Tarski Paradox claims that $S^2$ is $SO_3$-paradoxical. The Banach-Tarski
Paradox improved on the Hausdorff Paradox by eliminating the need to exclude a
countable subset $\Delta$ from $S^2$. The rough sketch of the proof runs as follows:
1. If $X \sim_G Y$ & X is G-paradoxical, then Y is also G-paradoxical.
2. For any countable subset $\Delta$ of $S^2$, $S^2 \setminus \Delta \sim_{SO_3} S^2$.
3. $S^2 \setminus \Delta$ is $SO_3$-paradoxical. (The Hausdorff Paradox)
4. Therefore, $S^2$ is $SO_3$-paradoxical. (The Banach-Tarski Paradox)
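To see how Banach and Tarski eliminate the need to exclude the countable subset $\Delta$ from $S^2$, we shall
take a look at how to form the set $A = [0, \infty)$ from $A \setminus \{0\}$ only by a translation. First, we
divide $A \setminus \{0\}$ into the following two sets:
$A \setminus \{0, 1, 2, 3, \ldots\}$
$\{1, 2, 3, \ldots\}$
Let $\tau$ be a translation by 1. Then
$\tau^{-1}(\{1, 2, 3, \ldots\}) = \{0, 1, 2, 3, \ldots\}$
Therefore, $A \setminus \{0\} \sim A$.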
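The shift that absorbs the missing point can be checked on a finite window. The following is only a sketch: the cutoff `N` and the function names are assumptions made for illustration, and the real argument uses all of the natural numbers, which no finite computation can do.

```python
# Finite-window illustration of regaining the point 0 by a translation.
# N is an arbitrary cutoff chosen for this sketch (an assumption, not from the text).
N = 10

def tau(x):
    """Translation by 1."""
    return x + 1

def tau_inv(x):
    """Inverse translation, i.e. translation by -1."""
    return x - 1

# A window of the piece {1, 2, 3, ...}.
shifted_piece = set(range(1, N + 1))

# Moving that piece back by one step yields a window of {0, 1, 2, ...}.
image = {tau_inv(x) for x in shifted_piece}

# The image contains the point 0 that was missing from the original set.
assert 0 in image
assert image == set(range(0, N))

# Applying tau again returns the original piece, so no point is lost or doubled.
assert {tau(x) for x in image} == shifted_piece
```

The other piece, the part of the half-line containing no natural number, is moved by the identity, so two pieces suffice.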


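Using the same technique, we want to divide $S^2 \setminus \Delta$, where $\Delta$ is the countable subset to be absorbed, into the following two sets. For
some rotation $\rho$,
$S^2 \setminus (\Delta \cup \rho(\Delta) \cup \rho^2(\Delta) \cup \cdots)$
$\rho(\Delta) \cup \rho^2(\Delta) \cup \cdots$
But can we define such a rotation? Suppose that $\Delta = \{x_1, x_2, x_3, \ldots\}$. We want to avoid
a $\rho$ such that $\rho^n(x_i) = x_j$ for some $n \geq 1$. That is, a rotation such that a point in $\Delta$ can be transformed to a
point in $\Delta$ again by repeated applications of the rotation. While there are at most countably many
bad choices for the rotation because the set $\Delta$ is countable and the number of
applications is countable, there are uncountably many rotations (recall that a countable
union of countable sets is countable). So there is a $\rho$ such that $S^2 \setminus \Delta$ is divided into the
two sets $D_1$, $D_2$:
$D_1 = S^2 \setminus (\Delta \cup \rho(\Delta) \cup \rho^2(\Delta) \cup \cdots)$
$D_2 = \rho(\Delta) \cup \rho^2(\Delta) \cup \cdots$
Since $\rho^{-1}(D_2) = \Delta \cup \rho(\Delta) \cup \rho^2(\Delta) \cup \cdots$, the sets $D_1$ and $\rho^{-1}(D_2)$ together make up $S^2$.
Hence, $S^2 \setminus \Delta \sim S^2$.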
The Hausdorff Paradox is the prototype of the Banach-Tarski Paradox. The
Hausdorff Paradox states that $S^2 \setminus \Delta$ is $SO_3$-paradoxical for a countable subset $\Delta$. Informally speaking, a sphere is
decomposed into a finite number of pieces and reassembled by rigid motions to form two
copies of almost the same size as the original. I'm using the word "almost" in a slightly
different way from that in which it is used measure-theoretically. In the theory of measure,
"almost everywhere" means "except on a set of measure zero", while here "almost" means
"except on a countable subset". On the other hand, the Banach-Tarski Paradox states that
$S^2$ is $SO_3$-paradoxical. A sphere is decomposed into a finite number of pieces and
reassembled by rigid motions to form two copies of exactly the same size as the original.
This is a Weak Form of the Banach-Tarski Paradox (Two Spheres from One Version).
There is another Weak Form of the Banach-Tarski Paradox (The Pea and The Sun Version):
Any solid ball is $M_3$-paradoxical ($M_3$ is the group of all isometries of $\mathbb{R}^3$). Informally, a
ball the size of a pea is decomposed into a finite number of pieces and reassembled by rigid
motions to form a ball the size of the sun. The Strong Form of the Banach-Tarski Paradox
says that if A and B are any two bounded subsets of $\mathbb{R}^3$ with non-empty interior, then A and
B are equidecomposable.


Figure 12: The Weak Form of the Banach-Tarski Paradox (Two Spheres from One Version).

Figure 13: The Weak Form of the Banach-Tarski Paradox (The Pea and the Sun Version).


We shall introduce a new relation symbol, $A \preceq B$. This notation can be seen as the
abbreviation of $A \sim B' \subseteq B$. In words, A is G-equidecomposable to a subset B' of B.
The relation $\sim$ is an equivalence relation, so reflexive, symmetric and transitive.
The Banach-Schröder-Bernstein Theorem claims that if $A \preceq B$ & $B \preceq A$, then $A \sim B$. So, in
order to show that $A \sim B$, we shall show $A \preceq B$ & $B \preceq A$. Actually we have only to show
$A \preceq B$ because $B \preceq A$ follows by the same argument. Since A and B are bounded and have
non-empty interior, there
exists a ball K containing A and a ball L contained in B. Without loss of generality we
may assume K is larger than L. Then K is covered by n many balls $L_1'$, $L_2'$, $\ldots$, $L_n'$ of the
same size as L. Note that $L_1'$, $L_2'$, $\ldots$, $L_n'$ are allowed to partially overlap each other.
Let S be a set of pairwise disjoint balls $L_1$, $L_2$, $\ldots$, $L_n$ of the same size as L. If we keep
the parts of $L_1'$, $L_2'$, $\ldots$, $L_n'$ that do not overlap each other and assign each overlapping part
to just one of them, K is $M_3$-equidecomposable to a subset S' of S. That is, K is
decomposed into n many pieces and reassembled only by the identity and translations to
form S'. So, $K \preceq S$. Then, in order to show $S \preceq L$, we need to prove that L is
$M_3$-paradoxical (using the Axiom of Choice) and that S is then equidecomposable with a subset of L.
From here on, the proof proceeds as follows. Again, we omit the details and show
the schema of the proof.
1. $K \preceq S$
2. $S \preceq L$
3. $K \preceq S$ & $S \preceq L$ $\rightarrow$ $K \preceq L$ (because $\preceq$ is transitive)
4. Therefore, $K \preceq L$.
Thus, $A \subseteq K \preceq L \subseteq B$. Therefore, $A \preceq B$.

It's not trivial. The proof is necessary, although we leave it out here.

3.5 What is a Paradox?
In order to consider what a paradox is, we shall start off with the definition of the word
"paradox" from the Oxford English Dictionary. The following is the part deemed to be
relevant here.
1.a A statement or tenet contrary to received opinion or belief; often with the implication that it
is marvelous or incredible; sometimes with unfavorable connotation, as being discordant with
what is held to be established truth, and hence absurd or fantastic; sometimes with favorable
connotation, as a correction of vulgar error.
2.a A statement or proposition which on the face of it seems self-contradictory, absurd, or at
variance with common sense, though, on investigation or when explained, it may prove to be
well-founded.
b Often applied to a proposition or statement that is actually self-contradictory, or contradictory
to reason or ascertained truth, and so, essentially absurd and false.
c Logic. A statement or proposition which, from an acceptable premise and despite sound
reasoning, leads to a conclusion that is against sense, logically unacceptable, or
self-contradictory.
Except for a paradox in logic, what all the definitions seem to have in common is that
a statement called a paradox is self-contradictory or contrary to common sense. But the
definition is divided into three parts as to whether the statement or common sense is true.
Definition 1.a doesn't mention whether the statement or common sense is true. In 2.a, the statement is
true, so common sense is false, while in 2.b, the statement is false, so common sense is true.
But a paradox in the sense of 1.a could eventually be assigned to either 2.a or 2.b; so I
would distinguish two senses of the word "paradox", distinct from a paradox in logic. The
following indicates how I believe the word "paradox" should be classified.
(I) A logical contradiction from Logic and Mathematics
(Logic, Mathematics) $\vdash A \,\&\, \neg A$ (ex.) Russell's Paradox
(II) Logic and Mathematics conflict with our intuition
(Logic, Mathematics) $\vdash A$ & (Our Intuition) $\vdash \neg A$
(i) (Logic, Mathematics) $\vdash A$ is false $\rightarrow$ fallacious reasoning (ex.) Zeno's Paradox
(ii) (Our Intuition) $\vdash \neg A$ is false $\rightarrow$ counterintuitive theorem (ex.) The
Hausdorff Paradox, The Banach-Tarski Paradox, The Skolem Paradox
A paradox in sense (I) is a logical contradiction. A case in point is Russell's
Paradox. Is the set of all sets that are not members of themselves a member of itself? If
it is, then it isn't. If it isn't, then it is. Russell attempted to resolve this paradox by
appealing to the simple theory of types. The simple theory of types distinguishes among
the individuals (type 0), the properties of these individuals (type 1), the properties of these
properties (type 2), and so on, and restricts the application of any property to the next lower
type. But Russell believed that the simple theory of types is not sufficient to construct
mathematics from logic and must be ramified for the following two closely related reasons.
Firstly, there are some paradoxes which cannot be resolved by the simple theory of types.
Secondly, it is necessary to eliminate a vicious circle in definition.
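The self-application at the heart of Russell's Paradox can be imitated in a programming language. What follows is only an analogy under a stated assumption: sets are replaced by predicates (functions returning a truth value), and "x is a member of y" by "y holds of x". The name `russell` is hypothetical.

```python
def russell(predicate):
    """Holds of a predicate exactly when that predicate does not hold of itself."""
    return not predicate(predicate)

# An ordinary case: the built-in predicate `callable` holds of itself,
# so `russell` does not hold of it.
assert russell(callable) is False

# The paradoxical case: asking whether `russell` holds of itself never settles,
# because each answer depends on the negation of the same question.
try:
    russell(russell)
    outcome = "returned a truth value"
except RecursionError:
    outcome = "no consistent truth value"

assert outcome == "no consistent truth value"
```

A typed language would reject the expression `russell(russell)` before running it, which is loosely the move the simple theory of types makes when it restricts a property to arguments of the next lower type.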
Take Grelling's Paradox as an example of the first reason. Some adjectives have
the property that they denote, e.g., the adjective "English" is English, and others do not,
e.g., the adjective "German". If we call the adjectives of the second kind "heterological", it
is easy to see that the adjective "heterological" is heterological if and only if it is not
heterological. Take the concept "inductive number" as an example of the second reason.
A number is called "inductive" if it possesses all the hereditary properties of zero. But
since the property "inductive" itself is a hereditary property, this definition is seen as an
impredicative definition, i.e., a definition of the part by the whole to which it itself belongs,
and thus viciously circular.
The ramified theory of types subdivides the properties of type 1 into the properties in
whose definition the phrase "all properties" does not occur (order 0), the properties in whose definition
"all properties of the first order" occurs (order 1), the properties in whose definition "all properties
of the second order" occurs (order 2) and so on. When we define a certain property by
reference to "all properties", all the properties that occur in the definition of a property of
order n have to be restricted to those of order n − 1.
But the ramified theory of types stumbles on the definition of the real numbers and
thus leads to the destruction of real analysis. According to the ramified theory of types,
we cannot use the expression "for all real numbers" without reference to a determinate
order. So we have to say that all real numbers that occur in the definition of a real number
of order n are restricted to those of order n − 1. Whether practical or not, this would
certainly be extremely inconvenient and probably intolerable. So Russell endeavored to
resolve this difficulty by devising the Axiom of Reducibility: Any higher-order sentence is
reducible to an order-0 sentence which is its equivalent in extension.

Ramsey criticized the Axiom of Reducibility on the ground that it is too artificial.
Ramsey divided all paradoxes into two kinds: logical paradoxes, such as Cantor's,
Burali-Forti's and Russell's Paradoxes, on the one hand, and semantical paradoxes, such as
the Liar, Richard's and Grelling's Paradoxes, on the other. A paradox of the first kind is
already eliminated by the simple theory of types. A paradox of the second kind is due to
a defect of our ordinary language, and we need not take it into account in the
construction of mathematics from logic. Therefore, for Ramsey, the ramified theory of
types and the Axiom of Reducibility were redundant for the logicist program. According
to Ramsey, an impredicative definition is admissible insofar as it does not create a new entity
and just defines an entity which already exists. For instance, it is innocuous to define a
person by the description "the tallest man in the room".
A paradox so called in the sense (II)(i) conceals fallacious reasoning in itself. An
example in point is the so-called Zeno's Paradox. Zeno's Paradox is the one that Zeno
presented as an argument against motion. Aristotle in Physics introduces four arguments
of Zeno's and rejects them as fallacious.

Aristotle, Physics, 239b5ff. For commentaries on Zeno's Paradox, see McKirahan, Philosophy before
Socrates, p. 310ff. Also, Russell discusses Zeno's Paradox in Our Knowledge of the External World, p.

The first argument is that you cannot reach the end of the stadium because you must
pass midpoints beforehand.
The second argument is that Achilles cannot overtake the tortoise. Suppose that the
tortoise started off ahead of Achilles. Although Achilles runs faster than the tortoise, the
tortoise is ahead when Achilles has reached the point where the tortoise started. When Achilles
has reached the point the tortoise had reached by that time, the tortoise is again ahead.
The third argument is that the flying arrow is at rest.
The fourth argument is that half of a given time is double the time. Suppose that
there are three rows A, B and C, each of which consists of the same four members. A is at
rest, centered at the middle of the stadium. B extends from the beginning to the middle of
the stadium. C extends from the end to the middle of the stadium. Then B and C move
in opposite directions at the same speed. Eventually A, B and C lie centered at the middle
of the stadium at the same time. This argument assumes that members of the rows pass
each other in succession. So the time it takes for a member of a row to pass members of
another row is the number of the members passed. But the assumption also forbids a
situation in which a member of B passes a member of C at some point between two
successive moments. So the time it takes for members of B and C, moving in opposite
directions, to pass members of A at rest is twice the number of the members. Therefore, a
given time is twice the time. On the other hand, the number of members of A that the
rightmost B passed is half the number of members of C that it passed. Therefore, a given
time is half the time. A, B and C start and finish the movement at the same time. Hence
half a given time is twice the time.
Zeno's Paradox is based on the fallacious reasoning of seeing time as the analogue of
the sequence of natural numbers. As Aristotle suggests, however, time is not composed of
"nows" but continuous. If time is seen as a continuum like the set of real numbers, the
Paradox can be solved. Just as it does not make sense to speak of the very next real number in
the order of magnitude (because there are infinitely many real numbers in between), so it
doesn't make sense either to speak of the very next moment in the order of events. Also, just like a
mathematical point, a moment has no size.
If time were like the sequence of natural numbers, the end of the stadium or the point
where Achilles overtakes the tortoise would be recognized as a limit of a convergent
sequence. So it would take an infinite time to reach the limit. But since time is a
continuum, any spatial point can be put in one-one correspondence with a moment in a
finite time. So you can reach the end of the stadium and Achilles can overtake the tortoise
in a finite time. Also, we can admit of some point between two moments. So the fourth
paradox can be solved. According to the theory of measure, an uncountable number of
points with measure zero can yield a finite measure. So also uncountably many spatial points
as moments in a finite time yield a finite space. So the flying arrow moves a finite
distance.
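That the infinitely many stages of the Achilles argument fit into a finite time can be checked with exact arithmetic. This is a sketch of the standard geometric-series gloss, not of Aristotle's own argument; the head start of 100 units and the speeds 10 and 1 are assumptions chosen only for illustration.

```python
from fractions import Fraction

def catch_up_stages(head_start, v_achilles, v_tortoise, stages):
    """Durations of the successive stages in Zeno's description of the race."""
    gap = Fraction(head_start)
    times = []
    for _ in range(stages):
        t = gap / v_achilles      # time for Achilles to reach where the tortoise was
        times.append(t)
        gap = v_tortoise * t      # distance the tortoise has moved ahead meanwhile
    return times

# Illustrative values (assumptions): head start 100, speeds 10 and 1.
times = catch_up_stages(100, 10, 1, 20)
total = sum(times)

# The moment of overtaking: head_start / (v_achilles - v_tortoise).
limit = Fraction(100, 10 - 1)

# Every partial sum stays below the limit, and the shortfall shrinks
# by a factor of 10 with each further stage.
assert total < limit
assert limit - total == Fraction(100, 9) / 10**20
```

However many stages are added, the elapsed time never exceeds 100/9 units, so the point of overtaking corresponds to a definite finite moment.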
A paradox belonging to the two senses mentioned above should be avoided and
dismissed. The Banach-Tarski Paradox is not a paradox in the sense that Russell's
Paradox is a paradox or Zeno's Paradox is a paradox. Rather, it is a counterintuitive
theorem; it is a paradox only in the sense of being contrary to our intuition. The situation
is similar to that in which the one-to-one correspondence between the natural numbers and the even
numbers had been considered a paradox of the infinite before Cantor gave the definition
of the infinite using that very fact. The rest of this dissertation is devoted to the study of
paradoxes belonging to this category.
3.6 What cannot Happen?
We shall notice the following two things:
(1) The Banach-Tarski Paradox holds in $\mathbb{R}^n$ ($n \geq 3$). In other words, the analogue of the
Banach-Tarski Paradox breaks down in $\mathbb{R}^1$ and $\mathbb{R}^2$.
(2) $S^2$ is $SO_3$-paradoxical using four pieces, the minimum number possible. Any solid ball
is $M_3$-paradoxical using at least five pieces.
These show that what happens in mathematics is still regulated by rigorous reasoning. A
paradox does not tell us that anything goes. The proof reveals not only what can
happen but also what cannot happen.
The reason why the analogue of the Banach-Tarski Paradox doesn't exist in $\mathbb{R}^1$ and $\mathbb{R}^2$
is that the isometry groups of these spaces do not have a free subgroup of rank 2. But
there is another way to see that the analogue of the Banach-Tarski Paradox breaks down in
$\mathbb{R}^2$. Actually, we have the Bolyai-Gerwien Theorem:
Two polygons are congruent by dissection if and only if they have the same area.
The claim that the Banach-Tarski Paradox does not have an analogue in $\mathbb{R}^2$
amounts to this:
If two polygons are congruent by dissection, they can't have different areas, i.e., they have
to have the same area. This is intuitively acceptable. So the crucial part is to show that if two
polygons have the same area, they are congruent by dissection. But this is difficult to
prove. Considering that it was not shown until the nineteenth century, we realize
the recalcitrant nature of problems of this sort.
Then it is natural to ask whether this theorem has an analogue in $\mathbb{R}^3$:
Two polyhedra are congruent by dissection if and only if they have the same volume.
Specifically, Hilbert's third problem was whether or not we can cube a regular tetrahedron
by dissection into polyhedra. Dehn had already shown that a regular tetrahedron is not
congruent by dissection with any cube. Therefore, the three-dimensional analogue of the
Bolyai-Gerwien Theorem is rejected. This result seems to contradict the
Banach-Tarski Paradox, which should hold in $\mathbb{R}^3$. For the Banach-Tarski Paradox implies
that a regular tetrahedron is equidecomposable with a cube of the same volume. But we
have to note that in this analogue a decomposition is restricted to a dissection
into polyhedra. So if we are more generous about the nature of the pieces used in a
decomposition, the Banach-Tarski Paradox does hold.
To see that the minimum four pieces are necessary in a paradoxical decomposition,
let X be a set and G be a free group of rank 2 (i.e., generated by $\sigma$ and $\tau$) that acts on X.
Suppose that some relation R divides X into equivalence classes. The Axiom of Choice
guarantees the existence of a set T containing exactly one element from each equivalence
class. Then X is decomposed into four pieces, depending on the last letter of the word applied
to an element of T:
$A_1 = \{x : \exists t \in T,\ x = \sigma \cdots (t)\}$
$A_2 = \{x : \exists t \in T,\ x = \sigma^{-1} \cdots (t)\}$
$B_1 = \{x : \exists t \in T,\ x = \tau \cdots (t)\}$
$B_2 = \{x : \exists t \in T,\ x = \tau^{-1} \cdots (t)\}$
It follows from this that
$X = A_1 \cup \sigma(A_2)$
$X = B_1 \cup \tau(B_2)$
Thus we can see that X is G-paradoxical using four pieces and that four pieces cannot be
improved upon.
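The combinatorial fact behind the four pieces can be checked on the free group itself. The sketch below rests on stated assumptions: reduced words are written as strings over `s`, `t` for the two generators and `S`, `T` for their inverses (this encoding and all function names are hypothetical), and only words up to a finite length are examined. It verifies that the words beginning with `s`, together with the `s`-translate of the words beginning with `S`, already exhaust the group.

```python
# Capital letters stand for inverse generators (an encoding assumed for this sketch).
INV = {"s": "S", "S": "s", "t": "T", "T": "t"}

def reduced_words(max_len):
    """All reduced words of length <= max_len; the empty string is the identity."""
    words = {""}
    frontier = {""}
    for _ in range(max_len):
        nxt = set()
        for w in frontier:
            for a in INV:
                # Prepending a keeps the word reduced unless a cancels the first letter.
                if not w or INV[a] != w[0]:
                    nxt.add(a + w)
        words |= nxt
        frontier = nxt
    return words

def mul_left(a, w):
    """Multiply the reduced word w on the left by the generator a, then reduce."""
    if w and w[0] == INV[a]:
        return w[1:]
    return a + w

n = 5
F = reduced_words(n)
W = {a: {w for w in F if w.startswith(a)} for a in INV}   # pieces by leftmost letter
short = reduced_words(n - 1)

sigma_W_inv = {mul_left("s", w) for w in W["S"]}   # s applied to words starting with S
tau_W_inv = {mul_left("t", w) for w in W["T"]}     # t applied to words starting with T

# Two of the four pieces already rebuild the group (up to the length cutoff) ...
assert W["s"].isdisjoint(sigma_W_inv)
assert short <= W["s"] | sigma_W_inv
# ... and the other two rebuild it a second time.
assert W["t"].isdisjoint(tau_W_inv)
assert short <= W["t"] | tau_W_inv
```

The leftmost letter of a word is the one applied last, which is why the pieces are sorted by it. Carrying this decomposition from the group over to X is the step that requires the choice set T.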
3.7 A Paradox without the Axiom of Choice?
In Section 5, we have seen that the Banach-Tarski Paradox is a paradox in the sense that it
is a counterintuitive theorem, so it is not a logical paradox and does not contain fallacious
reasoning. But the critics of the Axiom of Choice would say that even if the
Banach-Tarski Paradox is a counterintuitive theorem, the non-constructive nature of the
Axiom of Choice leads to the existence of non-Lebesgue measurable sets, which in turn
yields the Banach-Tarski Paradox, which is surprising and counterintuitive. Therefore, the
Axiom of Choice should be rejected. Rather, this is a feature that distinguishes
mathematics from natural science. The fact that we can get a counterintuitive theorem as
a result of logical and mathematical reasoning is not its weakness but its strength. But we
have to note that a paradox (or a counterintuitive theorem) can be derived even without the
Axiom of Choice. A good example of this is the Sierpinski-Mazurkiewicz Paradox:
There is a subset of $\mathbb{R}^2$ that can be decomposed into two pieces and reassembled by rigid
motions to form two copies of itself.
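Let $\tau$ be a translation by 1 and $\rho$ be a counterclockwise rotation by one radian
about the origin O. Actually the set of points obtained by repeated applications of $\tau$ and $\rho$ from
the origin O, which is a subset of $\mathbb{R}^2$, is paradoxical using two pieces. Call this set E. Note
that the combinations of $\tau$ and $\rho$ do not form the group generated by $\tau$ and $\rho$ because they
do not contain the inverses $\tau^{-1}$, $\rho^{-1}$. This is the reason why E is paradoxical using
just two pieces. E is decomposed into two sets, depending on the last letter applied
from the origin O. To see this, write each word for the point to which it sends O:
$E_\tau = \{\tau, \tau\tau, \tau\rho, \tau\tau\tau, \tau\tau\rho, \tau\rho\tau, \ldots\}$
$E_\rho = \{\rho, \rho\tau, \rho\rho, \rho\tau\tau, \rho\tau\rho, \rho\rho\tau, \ldots\}$
Then $E = E_\tau \cup E_\rho$, while $\tau^{-1}(E_\tau) = E$ and $\rho^{-1}(E_\rho) = E$.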
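The two-piece decomposition can be explored computationally. The sketch below makes its assumptions explicit: the plane is identified with the complex numbers, so that the translation by 1 is z -> z + 1 and the rotation by one radian is z -> u*z with u = e^i; each point of E is then a polynomial in u with non-negative integer coefficients, stored exactly as a tuple of coefficients. That distinct coefficient tuples give distinct points relies on u being transcendental, a known fact that the code does not check. All names are hypothetical.

```python
import cmath

U = cmath.exp(1j)   # the rotation by one radian, viewed as a complex number

def tau(p):
    """Translation by 1: adds 1 to the constant coefficient."""
    if not p:
        return (1,)
    return (p[0] + 1,) + p[1:]

def rho(p):
    """Rotation by one radian: multiplies by u, shifting all coefficients up."""
    if not p:
        return ()   # the origin is fixed by the rotation
    return (0,) + p

def generate(depth):
    """Points reachable from the origin () by words of length <= depth."""
    points = {()}
    frontier = {()}
    for _ in range(depth):
        frontier = {f(p) for p in frontier for f in (tau, rho)}
        points |= frontier
    return points

def to_complex(p):
    """Numerical position of the point with coefficient tuple p."""
    return sum(c * U**k for k, c in enumerate(p))

E = generate(6)
E_tau = {tau(p) for p in E}   # points whose last step was the translation
E_rho = {rho(p) for p in E}   # points whose last step was the rotation

# The two images never meet: one has constant coefficient >= 1, the other 0.
assert E_tau.isdisjoint(E_rho)
# Every generated point lies in one of the two images.
assert E <= E_tau | E_rho
# The exact coefficient moves agree numerically with the geometric motions.
assert all(abs(to_complex(tau(p)) - (to_complex(p) + 1)) < 1e-9 for p in E)
assert all(abs(to_complex(rho(p)) - U * to_complex(p)) < 1e-9 for p in E)
```

Since each image is a rigid copy of E, the set consists of two disjoint pieces each congruent to the whole, and no choice function was needed to describe them.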



The fact that there is a paradox (in the sense of a counterintuitive theorem) without
invoking the Axiom of Choice is in favor of the defenders of the Axiom of Choice. For
even if the Axiom of Choice yields a paradox like the Banach-Tarski Paradox, it does not
constitute a good reason why we should reject the Axiom of Choice.
But it is worth mentioning that there is a version of the Banach-Tarski Paradox that
does not use the Axiom of Choice. For although it is good news for the Platonists that the
Axiom of Choice is not the culprit for yielding the Banach-Tarski Paradox, at the same time
this means that the Paradox does not necessarily express the Platonic (non-constructive)
truth. Actually, in one version we need a countable number of pieces in a paradoxical
decomposition rather than a finite number of (the minimum five, for that matter) pieces.
More surprisingly, however, Dougherty and Foreman found another version in which
we only use a finite number of pieces. This is quite a significant result because the
Platonists can no longer claim that we can get the Banach-Tarski Paradox only in a
non-constructive way. The Banach-Tarski Paradox can be derived in a constructive way
as well. So I need to review the proof very carefully, especially in connection with the
property of Baire. But at this point I would just say that this result does not necessarily
deal a death blow to the Platonists. For the Platonists don't have to change their minds
about the truth of the Banach-Tarski Paradox. This result is disconcerting to the
Constructivists as well. For according to their methodological principle, they are forced to
accept the Banach-Tarski Paradox that they have been refusing to accept.
In any case, even if the Banach-Tarski Paradox can be reformulated without the
Axiom of Choice, the Platonists may ask the Constructivists before making a concession to
them:
(1) Doesn't the proof have a lower capability than the proof with the Axiom of Choice (as
to the minimum number of pieces, for instance)?
(2) Doesn't the proof contain some non-constructive principle other than the Axiom of
Choice?
(3) Doesn't the proof depend on extremely complex or ad hoc principles, compared with
the proof with the Axiom of Choice?
If any of these questions is answered in the affirmative, momentum is on the side of
the Platonists. If by assuming the Axiom of Choice the Banach-Tarski Paradox can be
proved in a simpler, more systematic and more unified way, I believe the proof with the
Axiom of Choice reflects reality. We should make a judgment on the ontological status of
the Banach-Tarski Paradox comprehensively, not just based on the result but also including
the process of the proof, considering which proof fits better with the essence of the
matter. The Platonists are still not losing ground in this respect.
3.8 Some Philosophical Implications
We have discussed the Hausdorff Paradox, especially focusing on how the sphere is
decomposed into pieces and these pieces are reassembled. We have seen that the pieces of
a paradoxical decomposition inherit non-measurability from a non-Lebesgue measurable
set derived using the Axiom of Choice, and the non-measurability of these pieces causes the
paradoxical nature of the decomposition. The Banach-Tarski Paradox represents a
characteristic feature of mathematics as distinct from natural science. The fact that we can
get a counter-intuitive theorem as a result of logical and mathematical reasoning is not its
weakness but its strength. The difficulty with the Banach-Tarski Paradox is due to the fact
that not only is it not intuitive but it also cannot be effectively carried out. There are a lot of
things going on out there that are true but counter-intuitive. But we are convinced of their
truth if they are actually realized. So the idiosyncratic nature of the Banach-Tarski
Paradox lies in the fact that we are forced to change our philosophy in order to accept it. In other
words, we accept the Banach-Tarski Paradox only on the basis of the Platonic assumption,
that is, by assuming the Platonic ideal world over and above the spatio-temporal world.
Also, the Banach-Tarski Paradox casts light on an epistemological question about
Platonism: How can we get access to mathematical entities that are supposed to be
non-spatio-temporal and thus causally inert? I suspect that the pathologies concerning
non-Lebesgue measurable sets or the Banach-Tarski Paradox deal a blow to an epistemology
based on a mathematical intuition such as Gödel's. Gödel claims that just as we have a
sensible intuition in physical sciences, so we have a mathematical intuition in mathematical
sciences. Indeed we could say that we have a mathematical intuition about lower
mathematical objects such as natural numbers. But we could hardly say that we have an
intuition about non-Lebesgue measurable sets or the Banach-Tarski Paradox. Since
Gödel's mathematical intuition develops from the Husserlian intuition of essences,
phenomenological epistemology would have to be brought into question as a whole.
The conflict between Platonism and Constructivism forms the watershed in the philosophy
of mathematics. The Platonists posit mathematical entities as super-spatio-temporal ones.
By contrast, the Constructivists restrict them to those that are legitimately constructible in
space and time. The dispute over the use of the Axiom of Choice is the most symbolic of
the opposition between Platonism and Constructivism. The Platonists accept the Axiom
of Choice and allow for the existence of a set of members formed by infinitely many
arbitrary choices, while the Constructivists reject the Axiom of Choice and allow for only
a set of members selected by a specific rule.
One of the reasons the Constructivists reject the Axiom of Choice is that the
existence of non-Lebesgue measurable sets, which is counterintuitive and surprising, can be
proved using the Axiom of Choice. In this way Lebesgue's theory of measure is very
closely related to the use of the Axiom of Choice. We have seen the distinctive feature of
Lebesgue measure in the notion of $\sigma$-additivity. The Lebesgue integral based upon the
Lebesgue measure made much improvement on the convergence properties of the Riemann
integral based upon Jordan's content. Lebesgue's theory of measure is applied to the
theory of large cardinals.
The use of the Axiom of Choice yields the Banach-Tarski Paradox. But this is not a
good reason why we should reject the Axiom of Choice. For the Banach-Tarski Paradox
is a paradox only in the sense that it is a counterintuitive theorem. In this sense, the
Banach-Tarski Paradox casts doubt on Gödel's mathematical intuition and Husserl's
intuition of essences. The Banach-Tarski Paradox is not just a mathematical figment. It
reflects reality. But since the Banach-Tarski Paradox cannot effectively be carried out,
what kind of reality is it? A contribution philosophy can make to the Banach-Tarski
Paradox is to provide a solid foundation for the Paradox by claiming that it reflects the
reality of the Platonic world over and above the natural world.




The aim of this chapter is to clarify the nature of arithmetical truths based on Gödel's First
and Second Incompleteness Theorems. We approach Gödel's First Incompleteness
Theorem from two different perspectives: Gödel's original paper and the theory of
computability. We begin with the former (Section 1) and then move on to the latter.
First of all, we make a distinction between the two senses of undecidability. An intuitive
way to see if there is a decision method is to check whether or not a Turing machine halts.
(Section 2) Church's thesis claims that the set of Turing computable functions coincides
with the set of partial recursive functions. By these functions we define recursively
enumerable sets and recursive sets, bracing for the different nature of the set of the true
sentences and the set of theorems (provable sentences) in arithmetic. (Section 3) We show
that the Halting problem is undecidable. (Section 4) Then, based on the undecidability of
the Halting problem, we show that first-order logic and arithmetic are undecidable
respectively. (Section 5) From the undecidability of arithmetic, Gödel's First
Incompleteness Theorem is immediate. (Section 6) Finally, we shall see Gödel's Second
Incompleteness Theorem to the effect that the consistency of arithmetic is unprovable
within arithmetic itself. (Section 7) As a consequence, the consistency proof is not
absolute but relative. (Section 8) I conclude that Gödel's Incompleteness Theorems, which
imply that there are arithmetical truths we cannot get access to in an effective way,
strengthen my claim that mathematical truths are of a non-constructive nature.
4.1 Gödel's First Incompleteness Theorem
Before we go into Gödel's First Incompleteness Theorem, we shall take a quick look at the
characteristic features of modern logic. Modern logic consists of propositional logic and
predicate logic. The latter is further divided into first-order and second-order (or
higher-order). In first-order logic, (universal and existential) quantifiers range over only
individuals, but in second-order logic they range over predicates as well. Propositional logic
and first-order logic behave in a similar way. Both have the Completeness, Soundness and Compactness
Theorems. These theorems are based on the distinction between syntax (theory of proof)
and semantics (theory of truth). But first-order and second-order logic behave in a very
different manner. The central mathematical notions, such as finitude and countability, can
be defined in second-order logic, not in first-order logic. Also, the Completeness,
Compactness and (upward / downward) Löwenheim-Skolem Theorems don't hold in
second-order logic. They hold in first-order logic owing to the weakness of its expressive
power. For this reason they are called limitative theorems.
Introducing (universal and existential) quantifiers marked an epoch in Fregean logic.
For instance, both the sentence "Socrates is mortal." and the sentence "All human beings are
mortal." share the same grammatical form by virtue of being of the subject-predicate
structure. But Frege shows that the former can be analyzed into $M(S)$ but the latter can be
analyzed into $\forall x(H(x) \rightarrow M(x))$. That is to interpret universally quantified sentences as
conditionals. So in some cases two sentences share the same grammatical form but their
logical forms are different. We could say that the distinction between grammatical and
logical form is one of the fruitful results of Fregean Logic. The aim of Frege's
Begriffsschrift ("Concept writing" or "Concept notation") is to create an ideal language
rigorous enough to express mathematics by eliminating the ambiguity of ordinary
language.

Kant in Critique of Pure Reason says "until now it has also been unable to take a single step forward, and
therefore seems to all appearance to be finished and complete." (B.viii) But I don't want to rush to the
conclusion that Kant couldn't anticipate the revolution to come in logic. For some would say that Kant's
distinction between intuition and concept resembles Frege's distinction between argument and function. (For
this, see e.g. Redding, Analytic Philosophy and the Return of Hegelian Thought, p. 93.)

Now we shall see Gödel's First Incompleteness Theorem in his original approach.
Gödel's First Incompleteness Theorem tells us that:
Assume that a first-order formal system S is strong enough to express
arithmetic and consistent. Then, S is incomplete, that is, there are undecidable
sentences in the sense that they are neither provable nor disprovable in S.
In other words, Gödel's First Incompleteness Theorem says that a formal system that has
enough strength to express arithmetic is either inconsistent or incomplete. In this
connection, we have to note that the First Incompleteness Theorem does not say that every
formal system is either inconsistent or incomplete. There are many formal systems that
are both consistent and complete. The First Incompleteness Theorem only applies to formal
systems in which arithmetic can be carried out.
More specifically, Gödel's First Incompleteness Theorem tells us that:
Let the assumptions of a formal system S be the same as above. Then, there
are true but unprovable sentences in S.
Although we have already distinguished syntax (theory of proof) from semantics (theory of
truth), Gödel's First Incompleteness Theorem shows that there actually is a gap between
them. Using the method of Gödel numbering, which assigns a Gödel number to each
arithmetical sentence, Gödel formulated a true but unprovable sentence called the Gödel
sentence G in Peano Arithmetic PA. The Gödel sentence G is self-referential, meaning
"I'm not provable in PA." In symbols, $G$: $\mathrm{PA} \nvdash G$.
Pr(x, y) expresses that x is a proof of a formula with Gödel number y. Let z be
the Gödel number of a formula that has exactly one free variable, and let sub(z, z) be (the
Gödel number of) the formula obtained by substituting the numeral for z for the free
variable of the formula with Gödel number z.
Then, Pr(x, sub(z, z)) means that x is a proof of the formula sub(z, z) obtained by
substituting the numeral for z for the variable of the formula with Gödel number z, which
has exactly one free variable. Take ¬∃xPr(x, sub(z, z)). Since this is also an arithmetical
formula, it has a Gödel number, say, g. Now, consider what ¬∃xPr(x, sub(g, g)) exactly
means. This means that there is no proof of the formula (i.e., ¬∃xPr(x, sub(g, g))
itself!) obtained by substituting the numeral for g for the variable (i.e., z) of the formula
with Gödel number g (i.e., ¬∃xPr(x, sub(z, z))). Therefore, ¬∃xPr(x, sub(g, g)) is an
arithmetical translation of the sentence "I am not provable in PA."
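The method of Gödel numbering mentioned above can be illustrated with a minimal sketch. The symbol table and the prime-power coding below are assumptions chosen for illustration; they are not the coding of Gödel's original paper, and the function names are hypothetical. The sketch only shows that a formula can be turned into a single natural number and recovered from it.

```python
# Minimal sketch of a Gödel numbering. The symbol codes are illustrative assumptions.
SYMBOL_CODES = {"0": 1, "s": 2, "+": 3, "*": 4, "=": 5, "(": 6, ")": 7, "x": 8}
CODE_SYMBOLS = {code: symbol for symbol, code in SYMBOL_CODES.items()}


def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for short formulas)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


def godel_number(formula: str) -> int:
    """Encode the i-th symbol as the exponent of the i-th prime."""
    number = 1
    for p, symbol in zip(primes(), formula):
        number *= p ** SYMBOL_CODES[symbol]
    return number


def decode(number: int) -> str:
    """Recover the formula by reading off the prime exponents in order."""
    symbols = []
    for p in primes():
        if number == 1:
            break
        exponent = 0
        while number % p == 0:
            number //= p
            exponent += 1
        symbols.append(CODE_SYMBOLS[exponent])
    return "".join(symbols)
```

Under these assumed codes, the formula 0=0 receives the number 2^1 · 3^5 · 5^1 = 2430, and decoding 2430 gives back 0=0. Because coding and decoding are mechanical, statements about formulas and proofs can be restated as statements about numbers, which is what Pr(x, y) and sub(z, z) rely on.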
Now we shall prove that, assuming the ω-consistency of Peano Arithmetic PA, the
Gödel sentence G is a true but unprovable sentence in PA.
We have an important

Note that x is not a proof of just a formula with Gödel number z.
In order to prove that the Gödel sentence G is true but unprovable in PA, Gödel actually used a stronger
assumption, ω-consistency, than consistency.
ω-consistency means that:

theorem as follows:
If a relation R(x_1, x_2, …, x_n) is recursive, then it is representable in PA. That is, there
is a formula of PA, also written R(x_1, x_2, …, x_n), such that
If R(a_1, a_2, …, a_n) does hold, then PA ⊢ R(a_1, a_2, …, a_n).
If R(a_1, a_2, …, a_n) does not hold, then PA ⊢ ¬R(a_1, a_2, …, a_n).
The relation Pr(x, y) is recursive, so representable in PA. Therefore there is a formula
Pr(x, y) such that
If Pr(m, n) does hold, then PA ⊢ Pr(m, n).
If Pr(m, n) does not hold, then PA ⊢ ¬Pr(m, n).
Now we show that G is neither provable nor disprovable.
(i) G is not provable in PA.
Suppose that G were provable in PA.
Thus, ∃xPr(x, sub(g, g)) does hold, that is, Pr(m, sub(g, g)) holds for some m.
PA ⊢ ∃xPr(x, sub(g, g)) by the theorem above.
PA ⊢ ¬∃xPr(x, sub(g, g)), since this sentence is G itself.
A contradiction with the consistency of PA. Therefore, G is not provable in PA.
(ii) ¬G is not provable in PA.
We now know that G is not provable in PA by (i).
Thus, ∃xPr(x, sub(g, g)) does not hold.

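for every formula A(x), either not all of A(0), A(1), … are provable, or ∃x¬A(x) is not provable.
ω-inconsistency means that:
for some formula A(x), both A(0), A(1), … are all provable, and ∃x¬A(x) is provable.
If a formal system is ω-consistent, then it is consistent. But the converse does not hold.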

PA ⊢ ¬Pr(m, sub(g, g)) for every m, by the theorem above.

PA ⊬ ∃xPr(x, sub(g, g)) by ω-consistency.
Therefore, ¬G is not provable in PA.
But undecidable sentences are not confined to self-referential sentences such as the
Gödel sentence G. Even though a self-referential sentence can be translated into the
language of arithmetic, the sentence itself is of no mathematical interest. This does
not reduce the importance of the First Incompleteness Theorem because, as we shall see
soon, the Gödel sentence G plays an essential role in the Second Incompleteness Theorem.
More significantly, however, undecidable sentences in Gödel's First Incompleteness
Theorem could be not only self-referential but also of mathematical interest. Actually, the
Four-Color Conjecture, which is now confirmed, was once suspected to be such an
undecidable sentence in the sense of Gödel's First Incompleteness Theorem.
It is indeed not the case that all undecidable sentences are true, but it is important that
an undecidable sentence could be true. An undecidable sentence is undecidable not in an
absolute sense but in a relative sense. An undecidable sentence in some formal system is
undecidable because the formal system is too weak to prove that sentence. So an
undecidable sentence in some formal system is always decidable in a stronger system than
the system concerned. In particular, the Gödel sentence G in Peano Arithmetic PA is
decidable in the formal system PA + G, in which G is added as a new axiom to PA. But in
the new formal system there arises another undecidable sentence G′ claiming its own
unprovability in PA + G. In symbols, G′: PA + G ⊬ G′.
4.2 Turing Machines
In what follows, we approach Gödel's First Incompleteness Theorem from the theory of
computability. But in order to do that, we have to show that arithmetic is undecidable. It
is worth noting that there are two distinct senses of undecidability. We have seen
above in Gödel's First Incompleteness Theorem that a sentence is said to be undecidable if

Or, as follows: suppose that ¬G were provable in PA.
Then PA ⊢ ∃xPr(x, sub(g, g)).
This contradicts ω-consistency.

it is neither provable nor disprovable. But there is another sense of undecidability in
which a set A is said to be undecidable if there is no mechanical method to decide whether or
not x is a member of A. This is exactly what we mean by saying that first-order logic or
arithmetic is undecidable. As we shall see later, these two senses are indeed connected in
some way, but we should be careful not to confuse one with the other.
An intuitive way to understand what constitutes a mechanical method to decide
whether or not x is a member of A is to check whether or not Turing machines halt.
Roughly speaking, there is a mechanical method to decide whether or not x is a member of
A if and only if there is a computer program by which a Turing machine halts if x is a
member of A and there is another computer program by which a Turing machine halts if x is
a member of Ā (the complement of A).
Now we shall see how Turing machines work. A Turing machine can be seen as a
black box into which we input a tape which is divided into equal squares and infinite at
both ends. The simplest Turing machines have two tape symbols, 0 and 1. When a
Turing machine is in the state q_i and scanning the symbol x, the instruction of the Turing
machine is given by a quadruple I = ⟨q_i, x, d, q_j⟩:
(1) The current state q_i
(2) The scanned symbol x
(3) The action taken d
(4) The next state q_j
There are only three actions that can be taken:
(a) Print a symbol (first erase the symbol in the square being scanned and then
write the other symbol in the same square).
(b) Move one square to the right.
(c) Move one square to the left.
The combination of q_i and x has to be unique, because otherwise the next instruction cannot
be specified. The set of instructions {I_1, I_2, I_3, …, I_n} constitutes a computer program
of a Turing machine. A Turing machine halts if there is no instruction I for the current state
and scanned symbol. For instance, a set of four instructions of the forms ⟨q, 1, R, q⟩,
⟨q, 1, R, q⟩, ⟨q, 0, R, q⟩, ⟨q, 0, R, q⟩, with the state subscripts as in Figure 14,
is an example of a computer program to decide whether or not x is even. That is, if x is
even, the Turing machine halts, and if x is odd, it doesn't halt. Another set of four
instructions of the same forms, with the state subscripts as in Figure 15,
is an example of a computer program to decide whether or not x is odd. That is, if x is
odd, the Turing machine halts, and if x is even, it doesn't halt. I shall also give an
example of a computer program to compute xy.
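The quadruple format described above can be made concrete with a small simulator. This is a sketch under stated assumptions: the input x is written in unary as a block of x ones on an otherwise blank (all-0) tape, the state names q1, q2, q3 and the particular parity program are hypothetical and are not claimed to be the program of Figure 14, and non-halting is only approximated by a step budget.

```python
# Sketch of a quadruple-based Turing machine as described in the text.
# The parity program and the unary input convention are assumptions.
from collections import defaultdict


def run(program, tape_input, start="q1", max_steps=1000):
    """Run quadruples (state, scanned, action, next_state) on a 0/1 tape.

    Returns True if the machine halts (no applicable instruction) within
    max_steps, and False if the step budget is exhausted first.
    """
    table = {(q, x): (d, q_next) for q, x, d, q_next in program}
    tape = defaultdict(int, enumerate(tape_input))  # unwritten squares hold 0
    state, head = start, 0
    for _ in range(max_steps):
        key = (state, tape[head])
        if key not in table:
            return True  # no instruction for this state and symbol: halt
        action, state = table[key]
        if action == "R":
            head += 1
        elif action == "L":
            head -= 1
        else:
            tape[head] = action  # print the symbol 0 or 1
    return False


# Hypothetical program: halts on an even number of 1s, runs forever on an odd number.
HALTS_IF_EVEN = [
    ("q1", 1, "R", "q2"),  # q1: even count so far; read a 1, count becomes odd
    ("q2", 1, "R", "q1"),  # q2: odd count so far; read a 1, count becomes even
    ("q2", 0, "R", "q3"),  # input ended with an odd count: start moving right
    ("q3", 0, "R", "q3"),  # keep moving right forever
]


def unary(x):
    """Represent x as a block of x ones."""
    return [1] * x
```

With these assumptions, run(HALTS_IF_EVEN, unary(4)) halts because no instruction exists for state q1 scanning 0, while run(HALTS_IF_EVEN, unary(3)) exhausts its step budget. The budget is a practical stand-in only: in general no finite budget can tell us that a machine will never halt.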
Since there is a one-one correspondence between the set of computer programs and the
set of natural numbers, we can enumerate the computer programs of Turing machines as T_1,
T_2, …. Also, when we input n into a Turing machine T_m, we denote it by T_m(n).


Figure 14: Example of a computer program to decide whether x is an even.

Figure 15: Example of a computer program to decide whether x is an odd.


Figure 16: Example of a computer program to compute xy.

4.3 Recursive Functions and Recursive Sets
Church's thesis claims that the set of Turing computable functions coincides with the set of
partial recursive functions. We say that f(x) is undefined if x ∉ Dom(f). A function is
said to be partial if it is not defined on the whole domain of natural numbers. A function
is said to be total otherwise. We shall define primitive recursive and recursive functions.
To that aim, we begin with the definition of the initial functions and the three rules.
The initial functions are:
(1) The zero function: z(n) = 0
(2) The successor function: s(n) = n + 1
(3) The projection function: p_i^n(x_1, x_2, …, x_n) = x_i

Especially, p_1^1(x) = x. So this is the identity function.
The three rules are:

(i) Composition: f(x) = g(h(x))
(ii) Recursion:
f(1) = h(1, f(0))
f(2) = h(2, f(1))
…
f(n) = h(n, f(n−1))
(iii) Minimization: f(x) = μy[g(x, y) = 0] (f(x) is the least number y such that g(x, y) = 0)
A function is said to be recursive if it is obtained by applying the three rules to the initial
functions. A function is said to be primitive recursive if it is obtained by applying the
rules (i) and (ii) only to the initial functions.
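The initial functions and the three rules can be sketched as higher-order functions. This is only an illustration: the names are hypothetical, the explicit base value f(0) in the recursion rule is an assumption (the scheme as stated above starts from f(1)), and the examples use ordinary Python arithmetic inside the helper functions rather than building everything from the initial functions.

```python
# Sketch of the initial functions and the three rules as higher-order functions.
def zero(n):
    """The zero function z(n) = 0."""
    return 0


def succ(n):
    """The successor function s(n) = n + 1."""
    return n + 1


def proj(i):
    """The projection function: pick the i-th argument (1-indexed)."""
    return lambda *xs: xs[i - 1]


def compose(g, h):
    """Rule (i): f(x) = g(h(x))."""
    return lambda x: g(h(x))


def recursion(base, h):
    """Rule (ii): f(0) = base (assumed), f(n) = h(n, f(n - 1))."""
    def f(n):
        value = base
        for k in range(1, n + 1):
            value = h(k, value)
        return value
    return f


def minimize(g):
    """Rule (iii): f(x) = least y with g(x, y) == 0; never returns if no such y exists."""
    def f(x):
        y = 0
        while g(x, y) != 0:
            y += 1
        return y
    return f


# Primitive recursive example: doubling, built with recursion and successor.
double = recursion(0, lambda n, previous: succ(succ(previous)))

# Example of minimization: the integer square root of x.
integer_sqrt = minimize(lambda x, y: 0 if (y + 1) * (y + 1) > x else 1)
```

The unbounded while loop in minimize is what separates recursive from primitive recursive functions here: recursion always terminates after n steps, whereas minimization may search forever, which is how partial functions arise.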
We now move from recursive functions to recursive sets. Informally, the set S is
said to be recursively enumerable if there is a computer program that halts whenever a
member of S is input. The set S is said to be recursive (decidable) if, in addition, there is
a complementary computer program that halts whenever a member of the complement S̄ of S
is input. So, we have to note that for a recursively enumerable
set there is not always a complementary computer program that halts on the inputs for which
the first computer program doesn't halt. In what follows, we shall give a few formal definitions of
recursively enumerable and recursive sets.
(1) The set S is recursively enumerable if
(i) S is the domain of a partial recursive function.
(ii) The partial characteristic function
χ_S(x) = 1 if x ∈ S
         undefined if x ∉ S
is Turing-computable.


Figure 17: Recursively enumerable (r.e.).

(2) The set S is recursive (decidable) if
(i) S is recursively enumerable and S̄ (the complement of S) is also recursively
enumerable.
(ii) The characteristic function
χ_S(x) = 1 if x ∈ S
         0 if x ∉ S
is Turing-computable.
The upshot of these definitions is this:
In a recursive (decidable) set, there is a computer program that halts for every member in
the set and there is a complementary computer program that halts for every member in the
complement of the set. So, by running both programs simultaneously on the same input,
we can decide whether a computer program that is still running will never halt or will halt
at some time in the future: exactly one of the two programs eventually halts, and which
one it is settles the question.
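The idea of running a program and its complementary program side by side can be sketched as follows. The two semi-deciding programs are modelled, as an assumption made for illustration, by Python generators in which each yield is one step without halting and returning counts as halting; the even and odd numbers serve as the example set and its complement, and all names are hypothetical.

```python
# Sketch: deciding membership by alternating the steps of two semi-deciders.
def halts_on_even(x):
    """Halts on even x; runs forever on odd x."""
    while x % 2 != 0:
        yield  # one more step without halting


def halts_on_odd(x):
    """Complementary program: halts on odd x; runs forever on even x."""
    while x % 2 == 0:
        yield


def decide(x, semi_decider, co_semi_decider):
    """Alternate single steps of both programs until one of them halts."""
    runs = [(True, semi_decider(x)), (False, co_semi_decider(x))]
    while True:
        for answer, run in runs:
            try:
                next(run)  # advance this program by one step
            except StopIteration:
                return answer  # this program halted, so membership is settled
```

For example, decide(4, halts_on_even, halts_on_odd) returns True and decide(7, halts_on_even, halts_on_odd) returns False. The loop in decide terminates only because one of the two programs is guaranteed to halt on every input; with a merely recursively enumerable set there is no such complementary program, and the loop could run forever.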
In a non-recursive (undecidable) set, there is no computer program that halts for every
member in the set and/or there is no computer program that halts for every member in the
complement of the set. So, we cannot decide whether, for a member for which a computer

It is shown that the set S is recursively enumerable if and only if x ∈ S is expressed by ∃yφ(x, y) using a
decidable predicate φ. So x ∉ S is expressed by ∀y¬φ(x, y). Then S̄ is called co-recursively enumerable.
As a result, the set S is recursive if and only if S is recursively enumerable and S is co-recursively enumerable.

program is still running, the computer program does not halt forever or it does halt at some
time in the future.

Figure 18: Decidable (recursive).

Figure 19: Undecidable.

For instance, the set S of even numbers is recursively enumerable. The set S̄ (the
complement of S, i.e., the set of odd numbers) is also recursively enumerable. So the set S of
even numbers is recursive. The same applies to the set of odd numbers. The crucial
result is that all recursive sets are recursively enumerable, but not vice versa. Actually,
there is a set that is recursively enumerable but not recursive. As we shall see later, a
couple of examples are K in the Halting problem and the set of theorems (provable
sentences) in arithmetic.

Figure 20: Example of a recursive set.

Figure 21: Example of a recursively enumerable but not recursive set.

4.4 The Halting Problem
We shall begin this section by giving some fundamental decidability and undecidability
results:
(1) Propositional logic is decidable.
(2) Pure first-order logic (monadic logic) is decidable.
(3) The Halting problem is undecidable.
(4) First-order logic is undecidable.
(5) Arithmetic is undecidable.
The Halting problem is important because the undecidability of the Halting problem
provides a foundation for other undecidability proofs, in the sense that it is a logician's stock
in trade to establish the latter by reducing the Halting problem to the problem at hand. So we
have to state what the
Halting problem is and why the Halting problem is undecidable. In order to prove that the

Halting problem is undecidable, we appeal to a diagonal argument. The proof is
analogous to that of Cantor's theorem to the effect that the cardinality of the power set of a
set is strictly greater than the cardinality of the original set.
We show that K = {n | T_n(n)↓} is undecidable, where T_x(x)↓ means that T_x halts on input x
and T_x(x)↑ means that it does not.

Suppose, for reductio ad absurdum, K were decidable.
By definition, the characteristic function
χ_K(x) = 1 if T_x(x)↓
         0 if T_x(x)↑
is Turing-computable.
Consider a function g such that
x ∈ Dom(g) ⟺ T_x(x)↑ (i)
Let g be
g(x) = 0 if T_x(x)↑
       undefined if T_x(x)↓
Since χ_K is Turing-computable, g is also Turing-computable.
So there is a computer program T_n that computes g, so
T_n(n)↓ ⟺ n ∈ Dom(g) (ii)
If T_n(n)↓, then
by (i), n ∉ Dom(g)
by (ii), n ∈ Dom(g). A contradiction.
If T_n(n)↑, then
by (i), n ∈ Dom(g)
by (ii), n ∉ Dom(g). A contradiction.
Therefore, K is undecidable.
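The diagonal construction in this proof can be mimicked in a short sketch. Two simplifications are assumptions of the sketch and not part of the proof: programs are represented by Python functions rather than by their indices n, and a computation that never halts is modelled by raising an exception, since a test cannot literally wait forever. All names are hypothetical.

```python
# Sketch of the diagonal argument against a would-be halting decider.
class Diverges(Exception):
    """Stands in for a computation that never halts."""


def make_diagonal(claims_to_halt):
    """Build g: g(x) is defined exactly when the decider says x(x) does not halt."""
    def g(x):
        if claims_to_halt(x, x):
            raise Diverges  # g(x) is undefined
        return 0
    return g


def decider_is_wrong_on_diagonal(claims_to_halt):
    """Compare the decider's verdict on g(g) with what g(g) actually does."""
    g = make_diagonal(claims_to_halt)
    predicted = claims_to_halt(g, g)
    try:
        g(g)
        actual = True   # g(g) returned, so it halts
    except Diverges:
        actual = False  # g(g) is undefined
    return predicted != actual
```

Whatever total candidate is passed in, for instance one that always answers True or one that always answers False, decider_is_wrong_on_diagonal returns True: the candidate misjudges the program built from it. This mirrors the two contradictory cases of the proof.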

If we denote the set of inputs y on which a Turing machine T_x halts
by W_x = {y | T_x(y)↓},
then K = {x | x ∈ W_x}.


Figure 22: The analogy between the Halting Problem and Cantor's Theorem.

We can go further than that. K is obviously recursively enumerable. So K̄ (the
complement of K) is not recursively enumerable, since otherwise K would be recursive. K is
called a creative set and K̄ is called a productive set.
4.5 The Undecidability of First-order Logic and Arithmetic
Based on this result, we shall show the undecidability of first-order logic. Let I_1, I_2, …, I_m
be the set of instructions of a Turing machine whose states are q_i, q_j (i, j ≤ k). So each
instruction consists of a quadruple: I = ⟨q_i, x, d, q_j⟩. To each instruction I we assign a
sentence φ_I of the form
∀x_1 … ∀y … ∀x_n [R(x_1, …, y, x, …, x_n; i) → R(x_1, …, y, x, …, x_n; j)]
in which the arguments of R on the right-hand side are adjusted according to the action d:
(i) If I prints a symbol, the scanned symbol x becomes 1 if x = 0, and 0 if x = 1.
(ii) If I moves one square to the right, the scanned square shifts one place to the right.
(iii) If I moves one square to the left, the scanned square shifts one place to the left.
Now let Δ be the sentence
φ_{I_1} & … & φ_{I_m} & R(0, 0, …, 0; 1)
That is, Δ is a set of instructions and the initial condition of the tape.
Then let H be the sentence
H: ∃x_1 … ∃y … ∃x_n [R(x_1, …, y, …, x_n; k)]
Assuming a finite part of the Peano axioms, we have the following equivalence:
⊢ Δ → H if and only if T(0)↓.
In order to see this, if T(0)↓, then under the set of instructions and the initial condition of the
tape there are a_1, …, b, …, a_n such that R(a_1, …, b, …, a_n; k), so
∃x_1 … ∃y … ∃x_n [R(x_1, …, y, …, x_n; k)] does hold. Conversely, if under the set of
instructions and the initial condition of the tape there are a_1, …, b, …, a_n such that
R(a_1, …, b, …, a_n; k), so that ∃x_1 … ∃y … ∃x_n [R(x_1, …, y, …, x_n; k)] does hold,
then T(0)↓. Thus we actually formulate a sentence Δ → H
that is equivalent to T(0)↓ in the first-order language.
It remains to show that T(0)↓ is undecidable. Once we get this, by the equivalence
above ⊢ Δ → H is also undecidable. We complete the proof by showing that the Halting
problem can be reduced to the problem whether T(0)↓.
(i) First, we input a blank tape into a Turing machine.
(ii) Then, we construct n by the following computer program.
1: ⟨q_1, 0, 1, q_2⟩, ⟨q_2, 1, R, q_3⟩
2: ⟨q_3, 0, 1, q_4⟩, ⟨q_4, 1, R, q_5⟩
…
n: ⟨q_{2n−1}, 0, 1, q_{2n}⟩
(iii) Finally, we set up a computer program that continues in the same way that T_n
processes n.
So there is some computer program T_m that combines both programs, so that T_m(0)↓ if and
only if T_n(n)↓.
We have already seen that the Halting problem T_n(n)↓ is undecidable. Therefore, T(0)↓
is also undecidable.


Figure 23: T(0) is undecidable.

Indeed we are actually assuming a finite part of the Peano axioms, but there is a
theorem to the effect that if a system S′ is a finite extension of S and S′ is undecidable, then
S is also undecidable. Here first-order logic together with that finite set of axioms is a
finite extension of first-order logic, so it is shown that
first-order logic is undecidable. We can show the undecidability of arithmetic using the
same technique.
4.6 Undecidability and Incompleteness
Once we have the undecidability of arithmetic, Gödel's First Incompleteness Theorem is
immediate from this as follows. The set of true sentences is not recursively enumerable.
Suppose, for reductio ad absurdum, that the set of true sentences in arithmetic were
recursively enumerable. Then the set of false sentences in arithmetic would also be
recursively enumerable. In order to see this, we have only to point out that for a false
sentence its negation is true. In symbols,

F = {φ | φ is false} = {φ | ¬φ is true}
By definition, a set S is recursive if and only if S is
recursively enumerable and S̄ (the complement of S) is also recursively enumerable. So
arithmetic would be decidable. This contradicts the undecidability of arithmetic. So, the
set of true sentences in arithmetic is not recursively enumerable. But the set of theorems
(provable sentences) in arithmetic is recursively enumerable. Note that in any
axiomatizable theory the set of theorems is recursively enumerable and PA is an
axiomatizable theory. So it is easy to see that there is a sentence that is true but
unprovable in PA, so there is some arithmetical truth we cannot get access to in an
effective way. This is the approach to Gödel's First Incompleteness Theorem from the
theory of computability. This approach shows the non-constructive nature of
arithmetical truths more clearly than Gödel's original approach.
PA is axiomatizable (so, the set of theorems in PA is recursively enumerable),
whereas arithmetic is not axiomatizable (so, the set of theorems in arithmetic is not
recursively enumerable). But PA is not complete, whereas arithmetic is complete. Here
we can see that there is a tension between axiomatizability and completeness.
The incompleteness proof doesn't show that the system is undecidable. So, Gödel's
First Incompleteness Theorem doesn't constitute a proof that arithmetic is undecidable.
Consider propositional logic. There is a sentence that is neither provable nor disprovable,
yet propositional logic is decidable. But a system in which some sets are undecidable is
incomplete. In order to see the relation between decidability and completeness, the
following theorem is useful:

Any axiomatizable complete theory is decidable.
A theory is complete in the sense that for every sentence in the theory either it or its
negation can be proved. First-order logic is axiomatizable but incomplete in this sense.
So it comes as no surprise that first-order logic is undecidable. Also, every complete
decidable theory is axiomatizable. It is trivially true. But it is false that every decidable
axiomatizable theory is complete.
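The reason why an axiomatizable complete theory is decidable can be sketched in a few lines: enumerate the theorems one by one and wait until either the sentence or its negation appears. The toy theory below, whose theorems say of each natural number whether it is even, is an assumption made purely for illustration, as are the sentence format and the function names.

```python
# Sketch: deciding a complete, effectively enumerable theory by searching its theorems.
from itertools import count


def theorems():
    """Toy effective enumeration of a complete theory (illustrative assumption)."""
    for n in count():
        yield f"even({n})" if n % 2 == 0 else f"not even({n})"


def negate(sentence):
    """Form the negation, treating 'not not A' as 'A'."""
    return sentence[4:] if sentence.startswith("not ") else "not " + sentence


def is_theorem(sentence):
    """Search the enumeration until the sentence or its negation turns up.

    The search terminates for every sentence of the toy language only
    because the theory is complete: one of the two must eventually appear.
    """
    negation = negate(sentence)
    for theorem in theorems():
        if theorem == sentence:
            return True
        if theorem == negation:
            return False
```

Here is_theorem("even(4)") returns True and is_theorem("even(3)") returns False. If the theory were incomplete, a sentence that is neither provable nor disprovable would keep the search running forever, which is why completeness is essential to the theorem.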

For this theorem, see Boolos and Jeffrey, Computability and Logic (1974), pp. 177-8, Cohen, Computability
and Logic, p. 204.

Figure 24: Approach to Gödel's First Incompleteness Theorem from the theory of
computability.

In propositional logic we have a decision procedure based on the truth table. In
propositional logic, no matter how complicated the sentence is, we can mechanically decide
whether or not the sentence is a tautology. That is, we assign the truth-values to every
symbol of that sentence and pay attention to the last column of the truth table. If the
entries are all true, the sentence is logically valid, otherwise not logically valid. We have
already seen that there is a kind of analogy between propositional logic and first-order logic.
In this vein, it is natural to conjecture that first-order logic might be decidable. But
Church's undecidability theorem proves the opposite.
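The truth-table decision procedure for propositional logic is short enough to sketch directly. The representation of a formula as a Python function from an assignment to a truth-value, and the names used, are assumptions of this sketch.

```python
# Sketch of the truth-table method: a formula is a tautology if every row is true.
from itertools import product


def is_tautology(formula, variables):
    """Evaluate the formula under every assignment of truth-values to the variables."""
    return all(
        formula(dict(zip(variables, row)))
        for row in product([True, False], repeat=len(variables))
    )


def implies(a, b):
    """Material implication."""
    return (not a) or b


# ((p -> q) -> p) -> p, known as Peirce's law
peirce = lambda v: implies(implies(implies(v["p"], v["q"]), v["p"]), v["p"])

# (p -> q) -> (q -> p), which fails when p is false and q is true
converse = lambda v: implies(implies(v["p"], v["q"]), implies(v["q"], v["p"]))
```

is_tautology(peirce, ["p", "q"]) is True, while is_tautology(converse, ["p", "q"]) is False. The procedure always terminates because a sentence with n symbols has only 2^n rows; first-order logic has no analogous finite table, since its interpretations range over arbitrary domains.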
We wish there were a mechanical method to decide whether or not some sentence is
provable, especially when the sentence is very complicated and seems to be very difficult to
prove. For, if there were such a method, we could know whether or not the sentence has a
proof, even if we can't get the concrete proof. An undecidability result dashes that hope.
It tells us that there is no such decision method so we have to come up with the proof case
by case or wait until a Turing machine halts with the suspicion that the sentence might not

be a theorem.
It is often said that Gödel's First Incompleteness Theorem shows that the human mind
is not a machine. But we have to make a distinction between the effectiveness of proofs
(formal methods) and the effectiveness of decision (mechanical methods). In order to be
able to decide whether or not there is a proof in a mechanical procedure, both the set of
theorems and its complement must be recursively enumerable. But if only the set of
theorems is recursively enumerable, the proofs are formal. So mechanical
effectiveness is stricter than formal effectiveness. Church's Undecidability Theorem
shows the limitations of mechanical methods, whereas Gödel's First Incompleteness
Theorem shows those of formal methods.
4.7 Gödel's Second Incompleteness Theorem
Gödel's Second Incompleteness Theorem tells us that the consistency of a formal system
that is strong enough to express arithmetic is unprovable within the system itself.
In particular, the consistency of Peano Arithmetic PA is unprovable within PA itself.
Suppose that the consistency of PA were provable in PA. In symbols,
PA ⊢ Con(PA)
Gödel's First Incompleteness Theorem implies we can prove that if arithmetic is consistent,
then we have the Gödel sentence G. In symbols,
PA ⊢ Con(PA) → G
So by modus ponens we get PA ⊢ G, which contradicts the fact that, assuming the
consistency of PA, G is unprovable in PA (Gödel's First Incompleteness Theorem). Thus
the consistency of PA is unprovable in PA (Gödel's Second Incompleteness Theorem).
But we have to note that Gödel's Second Incompleteness Theorem does not say that the
consistency of arithmetic is unprovable by any means whatsoever. The Second
Incompleteness Theorem just rejects the consistency proof of Peano Arithmetic within PA
itself. Actually, Gentzen later proved the consistency of arithmetic, no longer finitistically
but using a stronger method, transfinite induction. Just by replacing natural
numbers in the complete (ordinary) induction with transfinite ordinals, we obtain
transfinite induction.

It is worth noting that Gentzen claims that a stronger method is indeed required to
prove the consistency of arithmetic than axiomatic methods as dictated by Hilbert's
programme but it is still in harmony with the constructivist interpretation of infinity. But
it seems to me that even if the consistency of a formal system is proved by a stronger
system, the consistency of the stronger system must in turn be proved by a much stronger
system. Therefore, there will be a need to secure its own consistency outside the formal
system in one way or another at some point. For could the consistency of a formal system
be proved by a stronger one that is more likely to be inconsistent?
As a consequence of Gödel's Second Incompleteness Theorem, we have learned that
there is no such thing as an absolute consistency proof: the consistency proof of the system
within the system itself. The appeal to the existence of a model outside the formal system
enables a relative consistency proof: the consistency proof of one formal system on the
assumption of the consistency of another formal system. Gödel's Incompleteness
Theorems show that in order to secure mathematical truths we have to appeal to the
existence of a model outside the formal system. Gödel believed that the Incompleteness
Theorems support Platonic realism in this sense.
4.8 Relative Consistency Proofs
A model M of a first-order system S is an interpretation that makes every theorem of S true.
There are multiple models (interpretations) that satisfy a formal system S. The idea of a
relative consistency proof runs as follows. Our goal is to prove the consistency of a
formal system S + A assuming the consistency of the formal system S, that is,
Con(S) → Con(S + A).
All we have to do is to construct a model M of S that makes A true. If we can find such a
model M, M satisfies S and M also satisfies A. And we may assume Con(S). Then,
suppose ¬Con(S + A). Since S is consistent, S proves ¬A. Since M is a model that

The transfinite ordinals are the extension of the sequence of natural numbers (n): 1, 2, …, ω, ω+1,
ω+2, …, ω·2, …, ω², …, ω^ω, ….


satisfies S, M is a model that also satisfies ¬A. But we have already seen that M satisfies
A. A contradiction. Therefore, we get Con(S + A).
We shall discuss the reason why a relative consistency proof is so important. There
is a tension between completeness and consistency. For if a formal system is strengthened
for the sake of its completeness, then it will be more likely to be inconsistent. On the
contrary, if a formal system is weakened for the sake of its consistency, then it will be more
likely to be incomplete. This is the reason why there is a need to adjust the axioms of a
formal system. The more axioms are added to a formal system, the more chance there is
that the system concerned is inconsistent. So, a formal system S + A is more likely to be
inconsistent than a formal system S. This likelihood seems to constitute a good reason
why we should adopt a formal system S instead of a formal system S + A. But if it can be
shown that:
Con(S) → Con(S + A)
then such a justification loses its force. Rather, it would be more rational to say that
once we accept a formal system S, we should accept a formal system S + A as well.
For instance, ZF set theory is more likely to be inconsistent than ZF set theory minus
the Axiom of Foundation (ZF⁻). So, at this point ZF is more dubious than ZF⁻. But
suppose that we can prove that:
Con(ZF⁻) → Con(ZF)
This is a relative consistency proof because we prove the consistency of one formal system
(i.e., ZF) assuming the consistency of another formal system (i.e., ZF⁻). Then, if we
accept ZF⁻, there will be no good reason why we reject ZF. This line of argument
directly leads to the consistency proof of the Axiom of Choice with ZF set theory.
The consistency proof of the Axiom of Choice with ZF set theory is a relative
consistency proof in the sense that we prove the consistency of ZF + AC (ZFC) assuming
that ZF is consistent. That is, we prove that:
(1) Con(ZF) → Con(ZFC)
Actually Gödel showed a slightly stronger result. Gödel used constructible sets in order to
construct a model of the axioms of ZF set theory that satisfies the Axiom of Choice. The
Axiom of Constructibility says that every set is constructible. In symbols, V = L. Gödel
proved that:
(1′) Con(ZF) → Con(ZF + V = L)
This is also a relative consistency proof of ZF + V = L on the assumption that ZF is
consistent. Now V = L implies the Axiom of Choice.
So, as a corollary of (1′) Con(ZF) → Con(ZF + V = L), we get (1) Con(ZF) → Con(ZF + AC).
In a later chapter we shall see that many mathematicians reject the Axiom of
Constructibility. For Dana Scott showed that if we assume the Axiom of Constructibility,
there exist no measurable cardinals. Since we know that measurable cardinals play a large
role in the theory of large cardinals, we don't want to reject them. Gödel himself was
skeptical about the Axiom of Constructibility. Here is a worry. Gödel did not show that
the Axiom of Choice is consistent with the axioms of ZF set theory irrespective of V = L.
In other words, Gödel's relative consistency proof of the Axiom of Choice with the axioms
of ZF set theory depends on the assumption V = L. It is a separate issue whether or not the
Axiom of Choice is consistent with the axioms of ZF set theory if V ≠ L. But this doesn't
mean that if we are seeking the consistency of the Axiom of Choice, we have to adopt
the model V = L.
What Gödel did show is that the model V = L satisfies the Axiom of Choice and the
axioms of ZF set theory, so the Axiom of Choice is consistent with the axioms of ZF set theory,
not that the model V = L is the only model that can satisfy them. So a relative consistency
proof of ZFC + V ≠ L with ZFC becomes a matter of urgency, that is,
(3) Con(ZFC) → Con(ZFC + V ≠ L)
To that aim, we have only to show that there is a model that denies V = L and satisfies the
Axiom of Choice and the axioms of ZF set theory. But I can't pursue this here because, in
order to do so, we need Cohen's forcing method.
In general, if A is an undecidable sentence in a formal system S, we can assume
two new formal systems in extension of S: the formal system S + A and the formal system
S + ¬A. If both of the two new systems are consistent with the old system, A is
independent from S. Cohen proved that the negation of the Axiom of Choice is consistent
with the axioms of ZF set theory, that is,
(2) Con(ZF) → Con(ZF + ¬AC)
In order to show this, Cohen used the method of forcing and generic sets. Putting both (1)
and (2) together, the Axiom of Choice was proved to be independent from ZF set theory.
We have seen Gödel's First Incompleteness Theorem in two different approaches: Gödel's
original paper and the theory of computability. In his original paper, Gödel constructed
the Gödel sentence "I am not provable in PA" in the first-order language and showed that it
is neither provable nor disprovable in PA. Since this sentence claims its own
unprovability in PA, it is actually true but unprovable. Gödel's First Incompleteness
Theorem can be shown from the theory of computability as well. The set of true sentences
in arithmetic is not recursively enumerable. If it were recursively enumerable, the set of
false sentences would also be recursively enumerable. But this contradicts the
undecidability of arithmetic. On the other hand, the set of theorems (provable sentences)
in arithmetic is recursively enumerable. So there is a sentence in arithmetic that is neither
provable nor disprovable. Also, we have seen, as a consequence of the First
Incompleteness Theorem, Gödel's Second Incompleteness Theorem to the effect that the
consistency of arithmetic is unprovable within arithmetic itself. As a result, there is no
such thing as an absolute consistency proof and a consistency proof is necessarily relative.
A relative consistency proof is possible only by appeal to the existence of a model outside
the formal system. Gödel's Incompleteness Theorems imply that there are arithmetical
truths we cannot get access to in an effective way, and strengthen my claim that
mathematical truths are of non-constructive nature.




In the last chapter, by drawing upon Gödel's Incompleteness Theorems, we have seen that
there are limitations of formal methods. That is, Gödel's First Incompleteness
Theorem tells us that there is a sentence in a first-order formal system that is true but
unprovable. And, as a corollary of that, Gödel's Second Incompleteness Theorem tells us
that the consistency of a formal system cannot be proved within that system itself. Also,
we have seen that Platonists don't have to be bothered by either limitation because they
claim that we have to go outside of the formal system and resort to informal methods at
some point.
In this chapter, by drawing upon the Löwenheim-Skolem Theorem we shall see
another limitation inherent in formal methods. The Löwenheim-Skolem Theorem shows
that if a formal system has a model, then it has multiple models. And, as a consequence of
that, the Skolem Paradox shows that some mathematically important notions are relative to
a model. This seems to pose a threat to Platonists. For since these results imply that a
reference relation depends on a model, one is inclined to hold that there is no unique
complete description of the world independently of our conceptual schemes. But again I
claim that this is a defect for formalists, not for Platonists.
To that aim, first of all, we state what a model is and consider the need to assign a
model to a formal system (Section 1). Then, we discuss in detail the Löwenheim-Skolem
Theorem and the Skolem Paradox (Section 2). Moreover, we shall argue that when
reconsidered in the light of the Löwenheim-Skolem results, both Quine's thesis of the
indeterminacy of translation and Putnam's model-theoretic arguments against metaphysical
realism come to take on a clear meaning. Actually, we can regard the former as a
precursor of the latter (Section 3 and 4). Finally, we take note of the positive and negative
lessons from the model-theoretic arguments (Section 5). In spite of some important
insights of the model-theoretic arguments, as a Platonist I conclude that there are still left
some informal methods for us to overcome the challenges from the model-theoretic
arguments.
5.1 What is a Model?
A formal system consists of a set of sentences with empty symbols stripped of meaning,
except for logical constants. It is an interpretation that is assigned to that sentence. We
shall see what it means to assign an interpretation to a formal system in detail.
Let x, y, z be variables, P be a two-place predicate symbol and f be a two-place
function symbol. Consider the formula:
∀x∀y∃z P(f(x, z), y)
First, let the domain of the interpretation be the natural numbers, let P stand for = and
let f stand for multiplication. Under this interpretation the formula reads:
∀x∀y∃z (x · z = y)
This is false, because for the formula to hold the domain of the interpretation has
to contain the rational numbers (a counterexample: for x = 3 and y = 2 we need z = 2/3 ∉ N).
But if we extend the domain of the interpretation to the set of rational numbers, leaving the
rest of the interpretation the same as above, then the formula becomes true, setting aside
the degenerate case x = 0, because division by a nonzero number is exactly the
characteristic property of the rational numbers.
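To make the dependence on the domain concrete, here is a minimal sketch in Python. It is my own illustration, not part of the formal development, and the function names natural_witness and rational_witness are hypothetical. It assumes that P is read as equality and f as multiplication, and it searches for the witness z that the existential quantifier demands.

```python
from fractions import Fraction


def natural_witness(x, y):
    """Return a natural number z with x * z == y, or None if none exists.

    For x >= 1 any solution satisfies z <= y, so searching 0..y is exhaustive.
    For x == 0 the product is always 0, so the same bounded search still decides it.
    """
    for z in range(y + 1):
        if x * z == y:
            return z
    return None


def rational_witness(x, y):
    """Return a rational z with x * z == y, or None if none exists."""
    x, y = Fraction(x), Fraction(y)
    if x == 0:
        # 0 * z is always 0, so a witness exists only when y is 0 as well.
        return Fraction(0) if y == 0 else None
    return y / x


print(natural_witness(3, 2))   # no natural-number witness for x = 3, y = 2
print(rational_witness(3, 2))  # the rational witness for the same pair
print(rational_witness(0, 5))  # the caveat: x = 0 with y != 0 has no witness
```

The same formula thus receives different truth-values once the domain changes, and the case x = 0 shows why the claim about the rationals is best read as restricted to nonzero x.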
In general, the truth or falsehood of a formula depends on the interpretation given to
the symbols in the formula. But there is a class of formulas that are true no matter what
interpretation is given to the symbols in them. These formulas, which are true solely in
virtue of their logical structure, are called logically valid. According to Gödel's
Completeness Theorem, if a formula of first-order logic is logically valid, it can be proved
in a formal system of first-order logic. A logically valid formula of first-order logic
corresponds to a tautology of propositional logic.
To take the simplest example, let x be a variable and P be a one-place predicate symbol.
Consider the sentence:
∀xP(x) → ∃xP(x)
Assume that we interpret the sentence by taking the predicate P(x) to be "x is a
philosopher" and taking the universe of discourse to be the set {Socrates, Plato, Kant}.
Then, since all individuals in this domain are philosophers, and there is an individual who is
a philosopher in this domain, this interpretation makes the sentence true. Even if we take
the universe of discourse to be the set {Socrates, Plato, George Washington}, since the
antecedent is false, the sentence turns out to be true regardless of whether the consequent is
true or false. Both interpretations make the sentence true. This comes as no surprise
because the sentence is logically valid.
Now consider the converse:
∃xP(x) → ∀xP(x)
Assume that we give this sentence the same interpretations as above. If we take the
universe of discourse to be the set {Socrates, Plato, Kant}, then there is an individual who is a
philosopher in this domain, and every member of this domain is a
philosopher. So this interpretation makes the sentence true. But assume that we interpret
the same sentence by taking the universe of discourse to be the set {Socrates, Plato, George
Washington}. The antecedent is then true and the consequent false, so the sentence is false
under this interpretation. Hence the converse is not logically valid.
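The two interpretations above can be checked mechanically. The following Python sketch is only an illustration under my own assumptions: the sentence ∀xP(x) → ∃xP(x) and its converse are hard-coded as functions, P is interpreted as membership in a set called extension, and the helper true_in_all_small_interpretations (a hypothetical name) merely tries every interpretation of P over nonempty domains of up to four elements. Such a finite search can refute validity but cannot prove it.

```python
from itertools import product


def forall_implies_exists(domain, extension):
    """Truth-value of 'for all x P(x) -> there is x with P(x)' in one interpretation."""
    antecedent = all(x in extension for x in domain)
    consequent = any(x in extension for x in domain)
    return (not antecedent) or consequent


def exists_implies_forall(domain, extension):
    """Truth-value of the converse 'there is x with P(x) -> for all x P(x)'."""
    antecedent = any(x in extension for x in domain)
    consequent = all(x in extension for x in domain)
    return (not antecedent) or consequent


def true_in_all_small_interpretations(sentence, max_size=4):
    """Try every extension of P over the domains {0, ..., n-1} for n = 1..max_size."""
    for n in range(1, max_size + 1):
        domain = list(range(n))
        for bits in product([False, True], repeat=n):
            extension = {x for x, b in zip(domain, bits) if b}
            if not sentence(domain, extension):
                return False
    return True


philosophers = {"Socrates", "Plato", "Kant"}
domain_1 = ["Socrates", "Plato", "Kant"]
domain_2 = ["Socrates", "Plato", "George Washington"]

print(forall_implies_exists(domain_1, philosophers))
print(forall_implies_exists(domain_2, philosophers))
print(exists_implies_forall(domain_1, philosophers))
print(exists_implies_forall(domain_2, philosophers))
```

Only nonempty domains are tried, in keeping with the standard convention of first-order logic. On an empty domain the first sentence would come out false, since its antecedent is vacuously true.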
One could hope that, since mathematical truths hold indiscriminately with no regard
to material objects, all mathematical truths can be proved from some appropriate axioms
by the rules of inference in a finite number of steps. But the hope was dashed
by Gödel's First Incompleteness Theorem, and it became clear that there is a gap between
the truth of arithmetical sentences (semantics) and their provability (syntax). Therefore, it
becomes significant to assign an interpretation or model to a formal system. A formal
system is said to have a model if there is an interpretation that satisfies every sentence in
that system. The problem is that a first-order formal system cannot uniquely fix the
model.
5.2 The Löwenheim-Skolem Theorem and the Skolem Paradox
We have already seen that a first-order formal system is incomplete in the sense shown
by Gödel's First Incompleteness Theorem: provided the system is consistent and strong
enough to express arithmetic, there is a sentence in it that is neither provable nor
disprovable. More specifically, there is a sentence in the system that is true but
unprovable. In this section, we shall see another
sense in which a first-order formal system is incomplete by drawing upon the
Löwenheim-Skolem Theorem. The incompleteness of a formal system in the latter sense
is no less important than that in the former sense for shedding light on the limitations of
formal methods.

As a corollary of Gödel's Completeness Theorem, we get Gödel's Model Existence Lemma:
If a first-order formal system is consistent, then it has a model.

Next, we state the Löwenheim-Skolem Theorem. It consists of the Downward
and Upward Löwenheim-Skolem Theorems.

The Downward Löwenheim-Skolem Theorem:
If a first-order formal system has a model, then it has a countable model, i.e., a model
whose universe of discourse is countable.

The Upward Löwenheim-Skolem Theorem:
If a first-order formal system has an infinite model, then it has models of arbitrarily large
infinite cardinality.

We shall not get into the technical proof of the Löwenheim-Skolem Theorem. I just
mention that we need the Axiom of Choice in order to prove the Downward
Löwenheim-Skolem Theorem. Now, the Löwenheim-Skolem Theorem, combined with
Gödel's Model Existence Lemma, says that if a first-order formal system is consistent, then
it has multiple models. This means that a first-order formal system cannot uniquely
determine the model.
But despite appearances multiple models might be essentially the same and different
only in notation. Mathematically speaking, two models are notational variants of each
other if one is isomorphic to the other. A function f is said to be an isomorphism if f is
bijective and structure-preserving, i.e., it preserves the relations and functions of the
model. Since mathematics is not concerned with merely notational differences, it would be
enough if a first-order formal system had only one
model up to isomorphism. Indeed, a formal system is said to be categorical if it has
only one model up to isomorphism. Since the Löwenheim-Skolem Theorem claims that
there exist multiple models with different cardinalities, however, a fortiori they are not
isomorphic to each other. Therefore the Löwenheim-Skolem Theorem shows that a
first-order formal system is not categorical. Here arises another sense of incompleteness,
distinct from that in Gödel's First Incompleteness Theorem.
It seems that one way to resolve the incompleteness is to construct a new formal
system by adding new axioms. But this cannot help, because the Löwenheim-Skolem
Theorem tells us that the new formal system in turn has multiple models. So the
Löwenheim-Skolem Theorem shows that the incompleteness is inherent in a first-order
formal system. The problem of whether or not there is a complete formal system was
already noticed in the early twentieth century. Veblen was the first to realize the problem
and coined the word "categorical" in this context.
This is a serious problem for formalism. For the Löwenheim-Skolem Theorem
shows that a first-order formal system has a variety of unintended or non-standard models
as by-products along with the intended or standard model, so by formal methods alone
we cannot fix the model uniquely. For instance, assume that we formalize arithmetic in a
first-order formal system. The intended model of arithmetic is countable. By the
Löwenheim-Skolem Theorem, however, we have an uncountable model as well. This
means that a first-order formal system has various unintended or non-standard models, so
we cannot pinpoint the intended or standard model of arithmetic.
Now, in order to avoid confusion, we need to distinguish at least three senses of
completeness:
(1) A first-order formal system is complete in the sense that there is a complete set of rules
of inference to derive all logically valid sentences (Gödel's Completeness Theorem).
(2) A first-order formal system is not complete (negation-complete) in the sense that for
every sentence in the system, either it or its negation can be proved (Gödel's First
Incompleteness Theorem).
(3) A first-order formal system is not complete (categorical) in the sense that a formal
system has only one model up to isomorphism (the Löwenheim-Skolem Theorem).
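The three senses can be set side by side in symbols. The following LaTeX rendering is my own shorthand rather than a quotation from any of the theorems. It assumes that T stands for a consistent first-order formal system which, for (2), is effectively axiomatized and strong enough to express arithmetic and, for (3), has an infinite model. The symbols \mathcal{M} and \mathcal{N} range over models, and the block assumes the amsmath and amssymb packages.

```latex
\begin{align*}
\text{(1) Completeness of the rules of inference:}\quad
  & \models \varphi \;\Longrightarrow\; \vdash \varphi \\
\text{(2) Failure of negation-completeness of } T\text{:}\quad
  & \exists \varphi\, \bigl( T \nvdash \varphi \ \text{ and } \ T \nvdash \neg\varphi \bigr) \\
\text{(3) Failure of categoricity of } T\text{:}\quad
  & \exists \mathcal{M}, \mathcal{N}\, \bigl( \mathcal{M} \models T,\ \mathcal{N} \models T,\
    \mathcal{M} \not\cong \mathcal{N} \bigr)
\end{align*}
```

Sense (1) is a property of the logic, whereas (2) and (3) are properties of a particular system T formulated in that logic, which is why (1) does not conflict with (2) or (3).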
The Skolem Paradox is based on the apparent conflict between the
Löwenheim-Skolem Theorem and the Cantor Theorem. The Downward
Löwenheim-Skolem Theorem tells us that if a first-order formal system has a model, then it
has a countable model (i.e., a model in which the universe of discourse is countable). The
Power Set Axiom says that there exists the power set P(ω) of ω, which consists of all the
subsets of ω, and the Cantor Theorem says that P(ω) is uncountable. But how is it
possible that a countable model makes true a sentence that claims the existence of an
uncountable set?
This is the Skolem Paradox. In order to resolve the Skolem Paradox,
we have to deepen our understanding of what the Cantor Theorem really states.
There are two things to note here: one concerns the definition of an
uncountable set and the other concerns the Power Set Axiom. If a set S is
uncountable, then there is no one-to-one correspondence between the members of S and ω.
But note that this means that there exists no one-to-one correspondence in the model M. A
one-to-one correspondence is formally expressed by an enumerating set of ordered pairs.
Recall that we define an ordered pair in set-theoretical terms: ⟨a, b⟩ = {{a}, {a, b}}.
So, more specifically, there being no one-to-one correspondence in the model M means that
there exists no enumerating set of ordered pairs in the model M. But there may be an
enumerating set of ordered pairs outside M, so if we add such ordered pairs to M, there is a
one-to-one correspondence between the members of S and ω outside M. Therefore it is
perfectly possible that a certain set, which is uncountable from inside M, is countable from
outside M.
In regard to the Power Set Axiom, it says that for any set S, there is a set P(S) which
consists of all the subsets of S. In symbols, P(S) = {x | x ⊆ S}. As we shall see in the next
chapter, however, what is considered to be the power set P(ω) of ω may differ in
accordance with the model concerned. This is exactly the reason why different models
give us different answers to the Continuum Hypothesis. That is, according to Gödel's
V = L model |P(ω)| = ℵ₁, but according to Cohen's generic extension model |P(ω)| > ℵ₁.

To see this more clearly, let M be a countable transitive model. A model M is transitive if for any x, y, if
x ∈ y ∈ M, then x ∈ M. If by the Power Set Axiom there exists the power set P(ω) of ω in M, by transitivity
M contains all the subsets of ω in M. By the Cantor Theorem they are uncountable. But how is it possible
that a countable model contains uncountably many sets?

So, precisely speaking, the Power Set Axiom states that given any set S, there is a set P(S)
which consists of all the subsets of S that belong to the model M.
We can resolve the Skolem Paradox by restating the Cantor Theorem based on these
two facts. The Cantor Theorem just states that there is no enumerating set of ordered pairs
in M between all the subsets of ω in M and ω. So, P(ω) is uncountable from inside
the model M (the Cantor Theorem) but countable from outside M (the Löwenheim-Skolem
Theorem). The argument goes as follows. There is no one-to-one correspondence in M
between the power set P(ω) of ω (i.e., all the subsets of ω in the model M) and ω,
because no enumerating set of ordered pairs between them exists in M. But since there may
exist an enumerating set of ordered pairs outside M, if we add such ordered pairs to M, then
there is a one-to-one correspondence between them outside M. Thus the apparent tension
between the Löwenheim-Skolem Theorem and the Cantor Theorem is overcome.
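The resolution can be restated compactly. In the following LaTeX sketch, which is my own formulation under the assumption that M is a countable transitive model of set theory, \mathcal{P}(\omega)^{M} denotes the power set of ω as computed inside M, that is, the collection of those subsets of ω that belong to M. The block assumes the amsmath package.

```latex
\begin{align*}
M \models \text{``}\mathcal{P}(\omega)\text{ is uncountable''}
  &\iff \neg\exists f \in M\ \bigl( f : \omega \to \mathcal{P}(\omega)^{M}
        \text{ is a bijection} \bigr), \\
\text{while outside } M\text{:}\quad
  &\exists g \notin M\ \bigl( g : \omega \to \mathcal{P}(\omega)^{M}
        \text{ is a bijection} \bigr).
\end{align*}
```

The first line is what the Cantor Theorem amounts to when it is evaluated in M. The second line holds because M itself is countable, so the infinite collection \mathcal{P}(\omega)^{M} can be enumerated from outside. The two lines do not contradict each other, since the quantifiers range over different collections of functions.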

Figure 25: The Skolem Paradox.

The lesson we can learn from the Skolem Paradox is that cardinality (countability,
uncountability), one of the most important mathematical concepts, is a notion relative to
a model. That is to say, there is a set which is uncountable from inside the model but is
countable from outside the model. So, technically speaking, an uncountable cardinal in
the model M could collapse into a countable cardinal in an extension of M. Since
we usually assume that a mathematical concept has an absolute meaning regardless of the
background in which it is placed, it is quite surprising to learn that such an important
mathematical notion as cardinality is relative to a model.
But the problem is whether or not we can press this argument by claiming that every
uncountable set is only relatively uncountable and is countable outside the model. Can't we
bite the bullet
and claim that there is a set which is absolutely uncountable? Some people say that the
Skolem Paradox is not a genuine paradox. For the Skolem Paradox is prima facie a
paradox, but not in the sense that Russell's paradox is a paradox. In this respect it may be
similar to the Banach-Tarski Paradox, which is called a paradox though it should have been
named a theorem. But I believe that the situation is more serious than that. Even if we
advocate relativism at one point, there is still a possibility that absolutism raises its head.
Consider the set of real numbers. If we follow the relativist line of thought, the set
of real numbers is relatively uncountable, so we can reconstruct the model proper to real
numbers in a first-order formal system which has a countable model. But we can take
Cantor's diagonal argument as showing not just that there is no enumerating set of ordered
pairs in M between the set of real numbers in M and ω, but that there is no such thing anywhere.
If, as the Skolemite actually did, one claims that the cardinality of any set is relative to the
model, then he or she at least has to show that there is a model in which the set of real
numbers is countable, though it may not be a standard model. Otherwise, the set of real
numbers is uncountable in whatever model, so we cannot construct the model proper to real
numbers in a first-order formal system which has a countable model.


5.3 Quine's Thesis of the Indeterminacy of Translation
Quine in Word and Object does not mention the Löwenheim-Skolem Theorem and the
Skolem Paradox when discussing the thesis of the indeterminacy of translation. But I
believe that Quine's thesis appears afresh in this perspective. Indeed, we can regard the
thesis of the indeterminacy of translation as a precursor of Putnam's model-theoretic
arguments against metaphysical realism. First of all, by drawing upon Word and Object
we shall see what Quine means by the indeterminacy of translation.
Quine supposes a somewhat unusual situation, which he calls radical translation, in
which we translate the language of people with whom we have never previously made
contact. Not only is our language so different from theirs syntactically and semantically,
but our culture is also so different from theirs that we have few clues as to what they are
talking about. So, for instance, stimulus meaning cannot decide whether the native's
"gavagai" refers to rabbit, rabbit stage, undetached rabbit part, rabbit fusion, or rabbithood.
Indeed, we have multiple translation manuals or analytical hypotheses, as he calls them,
which are all internally consistent with the totality of speech behavior and the totality of
dispositions to speech but incompatible with each other. So, it is perfectly possible to assume
that an analytical hypothesis which interprets the native's "gavagai" as rabbit is no less
internally consistent than one which interprets it as rabbit stage, though the two hypotheses
contradict each other. This is what the indeterminacy of translation is all about.
The reason why I put stress on Quine's thesis of the indeterminacy of translation is
that his argument goes deeper than the gavagai example. As Quine himself says in "On
the Reasons for Indeterminacy of Translation," his real intention in the indeterminacy of
translation does not lie in the gavagai example. Actually, Quine in Word and Object
considers the indeterminacy of translation by analogy with the underdetermination of
scientific theory, and his point was in the latter. Quine believes that we know external
things only through impacts on our nerve endings. But our surface irritations cannot
completely determine the behaviors of invisible physical particles such as electrons, protons,
neutrons and neutrinos. In fact, there are multiple scientific hypotheses about their
behaviors, which are all internally consistent with the totality of data available to us but
incompatible with each other. So it is perfectly possible to assume that one hypothesis which
claims that, for instance, neutrinos have mass is no less internally consistent than another
hypothesis which claims that neutrinos lack mass, though the two hypotheses
contradict each other.

Figure 26: The comparison of the indeterminacy of translation and the underdetermination
of scientific theory.

As we shall see later, Quine mentions the Löwenheim-Skolem Theorem elsewhere.
When Quine says that translation manuals or analytical hypotheses are internally consistent with the
totality of speech behavior and the totality of dispositions to speech, he means that just as we have our speech
behavior or dispositions to speech in the sense that we use a brief general term for rabbits but no brief general
term for rabbit stages or parts, so, too, the natives have their own.
In an informal discussion with Dr. Dancy, I was informed that Quine thinks that the indeterminacy of
translation is worse off than the underdetermination of scientific theory because in the former, unlike in the
latter, there is no fact of the matter, so Quine is pleased to accept the possible revisability of scientific theory.
For this, see Quine, "Indeterminacy of Translation Again," pp. 9-10. Indeed we now know that the neutrino
is a massive particle as a result of the late 1990s neutrino revolution, but for Quine it does not hurt the
underdetermination of scientific theory.

But we should note that for Quine consistency is not the only criterion
scientific hypotheses should satisfy. Quine lists the following three criteria which he
believes we should adopt in formulating scientific hypotheses.
(1) Simplicity: Simplicity is a guide when scientists generalize from sample data to laws.
(2) Familiarity of principle: A new theory conserves the truths of the older theory. In
other words, we favor minimum revision.
(3) Sufficient reason: A sufficient reason for positing invisible physical particles is that a
theory which contains these particles can explain physical phenomena more simply than
others which do not.
Quine emphasizes simplicity above all. According to Quine, whenever
simplicity and familiarity of principle conflict, the verdict should be on the side of
simplicity, and sufficient reason may be subsumed under simplicity.
Though, as I said earlier, Quine does not mention the Löwenheim-Skolem results when
discussing the indeterminacy of translation, the situation is more precise in mathematics
and logic. The Löwenheim-Skolem Theorem, combined with Gödel's Model Existence
Lemma, tells us that if a first-order formal system is consistent, then it has multiple models.
So it is possible to interpret the symbols of a first-order formal system in so many
ways that we cannot uniquely fix their references. This fact is all the more
important since Quine believes in the priority of the first-order formal language over
ordinary language. Quine says, "we can see that paraphrasing into logical symbols is after
all not unlike what we all do every day in paraphrasing sentences to avoid ambiguity."
But Quine's point is of course that the indeterminacy of translation cannot be completely
resolved even by paraphrasing into logical symbols because it is inherent in our language
itself. A sentence has a meaning only relative to the frame of reference in which it is
placed.
Based on this argument, Quine is very skeptical of the possibility of the unique true
scientific theory. Indeed, as we shall see later, Putnam's model-theoretic arguments
against metaphysical realism push this skepticism further. But it is interesting to see that Quine
says, "Vagueness, ambiguity, fugacity of reference, are traits of verbal forms and do not
extend to the objects referred to." Also, Quine admits numbers as objects because of
their efficacy in organizing and expediting the sciences. We can see here Quine's
indispensability argument for mathematical objects. Hence I believe that Quine's point is
not so much the impossibility of the unique true scientific theory as the inexpressibility of
the unique true scientific theory, if there is one at all, even in terms of our formal language.
If there were (contrary to what we just concluded) an unknown but unique best total
systematization of science conformable to the past, present, and future nerve-hits of mankind,
so that we might define the whole truth as that unknown θ, still we should not thereby have
defined truth for actual single sentences. We could not say, derivatively, that any single
sentence S is true if it or a translation belongs to θ, for there is in general no sense in equating a
sentence of a theory θ with a sentence S given apart from θ. Unless pretty firmly and directly
conditioned to sensory stimulation, a sentence S is meaningless except relative to its own
theory; meaningless intertheoretically.

Quine actually mentions the Löwenheim-Skolem Theorem in "Ontological Relativity"
and "Ontological Reduction and the World of Numbers," even though in a backhanded way.
His point is that there is no tension between the Löwenheim-Skolem Theorem and the
thesis of the indeterminacy of translation. The Downward Löwenheim-Skolem Theorem
says that if a first-order formal system has a model, it has a countable model. So, at first
sight, the Löwenheim-Skolem Theorem gives the impression that an
uncountable ontology is reducible to a countable ontology. But Quine argues that this
really does not follow from the Löwenheim-Skolem Theorem. By contrast, Carnap successfully
reduced impure numbers of temperature to pure numbers, and Zermelo and von Neumann
succeeded in reducing natural numbers to sets. What is the difference between the failure
and the success of reduction?
Quine claims that the reason for the success of a reduction is the existence of a proxy
function that correlates the objects of the reduced domain with those of the reducing domain. So
if there is a proxy function between two models, they are just notational variants of each
other. As we have seen earlier, however, the Löwenheim-Skolem Theorem tells us not
just that there are multiple models, but that there are multiple non-isomorphic models.

Word and Object, p. 159.
Word and Object, p. 193.
Word and Object, pp. 23-4.

Two models with different cardinalities are non-isomorphic, so there is no proxy function
between them. Hence Quine holds that the reduction of an uncountable ontology to a
Pythagorean ontology is doomed to fail. Since the Löwenheim-Skolem Theorem does not
claim that one ontology is reducible to the other ontology, it is compatible with the thesis of
the indeterminacy of translation.
The thesis of the indeterminacy of translation is closely related to the thesis of the
underdetermination of scientific theory. In "On the Reasons for Indeterminacy of
Translation" Quine himself says that when he is talking about the former, his real intention
is in the latter. In "On Empirically Equivalent Systems of the World" Quine explains the
thesis of the underdetermination of scientific theory in detail. The thesis of the
underdetermination of scientific theory says that for a given theory there is an empirically
equivalent but logically incompatible theory. I use the term "logically incompatible" in a
strong sense. There is a weaker sense of logical incompatibility in which the incompatibility
is just apparent and the theories are rendered
logically equivalent by a reinterpretation of predicates. This is parallel to the case in
which one ontology is reducible to the other ontology by a proxy function. But Quine's
point is rather that just as an uncountable model cannot be reduced to a countable model for
lack of a proxy function, so one theory cannot be rendered logically equivalent to
another by a reinterpretation of predicates. This is exactly what the thesis of the
underdetermination of scientific theory means, and Quine goes in a direction where he
claims that we can talk about the truth or falsity of sentences only relative to the
background theory.
5.4 Putnam's Model-Theoretic Arguments against Metaphysical Realism
It is well known that the model-theoretic arguments based on the Löwenheim-Skolem
Theorem caused Putnam to change his mind from metaphysical realism to internal
realism. In this section, we shall see exactly how that happened by drawing upon his
Realism and Reason. As we have already seen, the Löwenheim-Skolem Theorem tells us
that if a first-order formal system has a model, then it has multiple models. This means
that theoretical and operational constraints cannot fix the unique intended model for a
first-order formal system. Also, Putnam claims that there is nothing other than theoretical
and operational constraints with which to fix the intended model. Extending the formal
system by adding new axioms does not work either, because the Löwenheim-Skolem
Theorem says that the new formal system in turn has multiple models. It is just adding
more theory. So the impossibility of fixing the intended model is believed to be inherent in
first-order formal systems.
According to Putnam, the model-theoretic arguments deal a fatal blow to metaphysical
realism. Metaphysical realism is the doctrine that there is a uniquely true and complete
description of the world apart from our conceptual imposition. If we have two theories
which are true and complete descriptions of the world, they are not different in substance
but are just notational variants of each other. Putnam plays devil's advocate and shows
that there are two ways to save metaphysical realism from the model-theoretic arguments
against it. One is to assume, along the lines of Gödel or Kripke, that we have a non-natural
mental power like an intellektuelle Anschauung in order to decide which of the empirically
equivalent but incompatible theories is true. The other, which is called natural
metaphysics, is to get the job done by the scientific method instead of an intellektuelle
Anschauung. But natural metaphysics goes against a scientific spirit that pursues its
own activities within the confines of empirically significant claims. So, according to
Putnam, the revival of metaphysics is far more likely to be along the lines of those
who believe that we have an intellektuelle Anschauung than along the lines of natural
metaphysics, but it seems highly unlikely that metaphysics will succeed in its revival along
either line.

Against metaphysical realism, Putnam bills his own view as internal realism. In this
view, there is no such thing as the uniquely true and complete description of the world
regardless of our conceptual schemes. A statement has a meaning only relative to the
interpretation or model in which it is placed. According to Putnam, the world does not
pick models or interpret languages. We interpret our languages or nothing does. The
world is not a furnished room. This is the reason why Putnam believes that there isn't
a ready-made world. Our understanding of the meaning of a sentence consists not in
knowing its truth conditions but in mastering its verification procedures. Thus Putnam
identifies truth with idealized justification. Here idealized justification, as opposed
to tensed justification or justification-on-present-evidence, indicates that some of the
statements which are now justified may turn out not to be true. Putnam holds that
explanation is epistemic or intentional because it is interest-relative and context-sensitive.

Realism and Reason, p. 228.
Realism and Reason, p. 24.

Even though a statement has a meaning only relative to the model in which it is placed,
however, we have to note that Putnam does not believe that the truth or falsity of a statement is
subjective. He says, "Urging this relativism is not advocating unbridled relativism."

Putnam also claims that reason can't be naturalized. For we can't eliminate the normative,
such as the notions of rightness and wrongness, because if we did, we would stop being
thinkers and our statements would be nothing but noise-makings. He goes as far as to say
that the elimination of the normative is attempted mental suicide. Putnam quotes Nelson
Goodman's remark that "relativity of rightness and the admissibility of conflicting right
renderings in no way precludes rigorous standards for distinguishing right from
wrong." In that vein, though Putnam agrees with Quine that even logical or
mathematical statements are revisable, he tries to differentiate himself from Quine by
saying that our notion of rationality cannot be so flexible as Quine believes.
In this respect, it is interesting to see Putnam's view on the laws of classical logic
(tautologies or logically valid sentences). Frege and Russell believe that the laws
of logic are truths which hold in the actual world, albeit in the more general and abstract
aspects of the actual world. But according to possible-world theorists, the laws of
logic are truths which hold in every possible world. Indeed, it seems to me that there
was a great advance from the former to the latter in the understanding of the
ontological status of logical laws. Putnam follows in Quine's footsteps and shows that
even logical laws are revisable by pointing out that there are some laws of classical logic
which do not hold in quantum logic:

(1) The law of conjunction introduction p, q ⊢ p & q
has to be restricted to pairs of compatible propositions p, q.
(2) The distributive law p & (q ∨ r) ≡ (p & q) ∨ (p & r)
has to be restricted to the case in which all three propositions p, q, r are totally
compatible.

Realism and Reason, p. 23.
Realism and Reason, p. 85.
Realism and Reason, p. 297.
Realism and Reason, p. 10.
Realism and Reason, p. 169.

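The failure of the distributive law for incompatible propositions can be exhibited in a very small lattice. The following Python sketch is my own illustration and is not taken from Putnam. It assumes the six-element lattice often called MO2, which can be pictured as two pairs of perpendicular lines through the origin of a plane: a and a' form one compatible pair, b and b' another, while a and b are incompatible. The names meet, join and distributive are hypothetical helpers.

```python
# Elements of the assumed six-element lattice: bottom, top, and four atoms.
BOTTOM, TOP = "0", "1"
ATOMS = ["a", "a'", "b", "b'"]


def meet(x, y):
    """Greatest lower bound: two distinct atoms have only the bottom in common."""
    if x == y:
        return x
    if x == TOP:
        return y
    if y == TOP:
        return x
    return BOTTOM


def join(x, y):
    """Least upper bound: two distinct atoms together span the top."""
    if x == y:
        return x
    if x == BOTTOM:
        return y
    if y == BOTTOM:
        return x
    return TOP


def distributive(p, q, r):
    """Check p & (q v r) == (p & q) v (p & r) for one triple of propositions."""
    return meet(p, join(q, r)) == join(meet(p, q), meet(p, r))


# Incompatible triple: the left side is a, the right side collapses to the bottom.
print(distributive("a", "b", "b'"))
# Compatible triple: both sides are a.
print(distributive("a", "a", "a'"))
```

In the first case a & (b ∨ b') equals a, because b ∨ b' is the top element, whereas (a & b) ∨ (a & b') is the bottom element. This is the sense in which the law has to be restricted to compatible propositions.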
But his emphasis is on the fact that every statement is revisable but not in every
way. Putnam says that we must not jump from the fact that it is
dangerous to claim that any statement is absolutely a priori to the absolute claim that there
are no a priori truths. It could be that not all, but some of the laws of classical logic are
a priori. For instance, Putnam does not take the law of contradiction ¬(p & ¬p) as an a
priori truth. For there is room left for the possibility that p & ¬p does hold. Nevertheless,
there is indeed a statement p such that ¬(p & ¬p) does hold a priori. Therefore,
according to Putnam, there exists at least one a priori truth: Not every statement is both
true and false. Putnam accepts the law of contradiction only in such a weak form.

Putnam claims that, insofar as empirically equivalent but incompatible theories make
different claims, some facts are soft in the sense that they depend for their truth value on
the speaker, the circumstances of utterance, etc. According to Putnam, whether V = L or
not is a soft fact. So is whether the Axiom of Choice is accepted or
not. Following Putnam, as a thought experiment, let us assume that there are intelligent
extraterrestrials who reject the Axiom of Choice because of some of its counter-intuitive
consequences, like the Banach-Tarski Paradox. Keep in mind that most mathematicians
and logicians here on the earth accept the Axiom of Choice because of its pleasant
consequences. Then could we say that we are right and they are wrong? Of course, our
acceptance of the Axiom of Choice is not arbitrary. But the fruitful results from the Axiom
of Choice alone are not good enough to say that acceptance of the Axiom of Choice is so
rational that rejection of it is irrational. The Axiom of Choice depends for its truth-value
upon the model in which it is placed. This seems to me to be a foregone conclusion from
the model-theoretic arguments. In the next section, we shall consider what positive
and negative lessons can be learnt from the model-theoretic arguments.

Realism and Reason, p. 48, p. 96, p. 100.
Realism and Reason, p. 114.
Realism and Reason, p. 100, p. 112, p. 129, p. 131.
Realism and Reason, p. 19.
Realism and Reason, p. 23.
Realism and Reason, p. 14.
5.5 The Lessons from the Model-Theoretic Arguments
We have learned from the model-theoretic arguments that there are multiple internally
consistent but mutually inconsistent models. The truth-value of some mathematical
statements depends on the model we have in mind. Therefore, in some cases different
models give us different answers to the same question. When we are confronted with
multiple conflicting models, we should not think that only one of them is true and the others
are not if they are all internally consistent. We must be sufficiently tolerant to admit the
possibility that each of them is true and acceptable. Indeed the Skolem Paradox suggests
that the cardinality of a set is relative to a model. But does it follow from this that all
truths depend on a model?
Let me explain what I'm trying to say here. I grant the following two points from the
model-theoretic arguments:
(1) A consistent first-order formal system has multiple models. (The Löwenheim-Skolem Theorem)
(2) There are some truths that depend on a model. (The Skolem Paradox)
But it doesn't necessarily follow from this that:
(3) All truths depend on a model.
It seems to me that we cannot move from (1) and (2) to (3) without a surreptitious
assumption that we have to treat every model in an egalitarian manner insofar as it is
consistent. Even if there are multiple models, however, if one of the models is preferred
to the others in light of other criteria (if there are any), then we couldn't say that all truths
depend on a model. Admittedly, sometimes none of the models is preferred to the others.
Even so, it is one thing to say that, although one of the models is actually true and the
others are not, we cannot decide which one is true due to a lack of evidence at this point,
and another thing to say that, since the models are all equally true, we cannot tell which one
is true. In the former case, the models are competing with each other for the truth and
one of them is actually true. Only in the latter case could we say that the models should
coexist peacefully and that the truth depends on a model.
Even though there are multiple reductions of the natural numbers to sets, we have to
pay attention to the fact that von Neumann's system is entrenched in set theory. For von
Neumann's system has several advantages over Zermelo's. One of them is that,
since in von Neumann's system each natural number is the set of all smaller natural
numbers, the larger-than relation can be replaced by the membership relation.
Indeed the Skolem Paradox suggests that the cardinality of a set is relative to a model.
But is the cardinality of every set relative to a model? Is there a model in which the set of
real numbers is countable? Also, regarding mathematically important statements such
as the Axiom of Choice and the Continuum Hypothesis, could we say that their validity also
depends on a model?
The affirmative answer to this question is a consequence of the model-theoretic
arguments. That is, the Axiom of Choice and the Continuum Hypothesis are only true or
false relative to a model, and they are neither true nor false by themselves. More
specifically, in Gödel's model of $V = L$ both the Axiom of Choice and the Continuum
Hypothesis are true, but by Cohen's generic extensions we can obtain one model in which
the Axiom of Choice is false and another model in which the Axiom of Choice is true but
the Continuum Hypothesis is false. Note that in order for the Continuum Hypothesis
to make sense, the Axiom of Choice has to be accepted.
Recall Putnam's thought experiment on the Axiom of Choice. His point is that the
pleasant consequences of the Axiom of Choice alone (e.g., the well-ordering of the set of real
numbers) are not enough to show that accepting it is so rational that rejecting it is
irrational. So Putnam's identification of truth with idealized justification does not give us
reason enough to justify the Axiom of Choice. Also, it seems to me that the criterion of
rational acceptability differs from person to person: for me, the fruitful
results of the Axiom of Choice make it fully rational to accept the Axiom of Choice.
What makes the Löwenheim-Skolem Theorem possible is the limited
expressive power of first-order logic. When it comes to second-order logic, the
Löwenheim-Skolem Theorem doesn't hold any longer. So my concern is whether or not
Putnam's model-theoretic argument, based on the Löwenheim-Skolem Theorem, still holds in
second-order logic. One could argue that first-order logic is the only real logic, whereas
second-order logic is not. Quine is a case in point. But since central
mathematical notions, such as finitude and countability, can be defined only in
second-order logic, not in first-order logic, we cannot ignore the significance of
second-order logic. But even if we limit ourselves to first-order logic, we can find a way
to save Platonism.
There are multiple models of a first-order formal system. The Skolem Paradox
shows that a mathematically important notion is relative to a model. So, one is inclined to
say that the Axiom of Choice and the Continuum Hypothesis depend on a model. But we
should not give up the investigation there. As a Platonist, I propose to examine the
interrelationships among models as follows. An apparent contradiction may be resolved if
one makes a transition from a lower to a higher level. Let me explain what I mean by this
by giving examples from mathematics/logic and science.
(1) Narrowing down the models.
We obtain the intended or standard model by eliminating unintended or non-standard
models.
Mathematics/Logic: We can get the model of arithmetic by eliminating uncountable models
because the model of arithmetic is countable. Also, most mathematicians reject Gödel's
$V = L$ in favor of the existence of measurable cardinals.
Science: The geocentric model of the universe was replaced by the heliocentric model.
Moreover, both the phlogiston hypothesis and the ether hypothesis were abandoned in the
course of scientific development.
(2) Extending the models.
There are some cases in which one model is an extension of the other.
Mathematics/Logic: Gödel's $V = L$ is an extension of ZF set theory. Cohen's generic
extension is another one.
Science: We can explain the relation between Newtonian mechanics and Einstein's theory
of relativity by a correspondence principle. That is, the former is a limiting case of the
latter for velocities much less than the speed of light.

(3) Overlapping the models.
This case is the most intractable because it is difficult for us to see how the models are
Mathematics/Logic: The whole picture of the Skolem Paradox becomes clearer when
viewed from both inside and outside the model simultaneously.
Science: It is well-known that light has the wave-particle duality. Also, the behavior of a
photon in the double-slit experiment can be explained by a superposition of states.
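The correspondence principle cited under (2) can be made concrete with a standard textbook expansion, offered here only as an illustrative sketch. The symbols $m$ (rest mass), $v$ (speed), $c$ (speed of light), $\gamma$ (Lorentz factor) and $E_k$ (kinetic energy) are notation assumed for this illustration and do not appear in the argument above.

```latex
\[
  \gamma = \frac{1}{\sqrt{1 - v^2/c^2}}
         = 1 + \frac{1}{2}\frac{v^2}{c^2} + \frac{3}{8}\frac{v^4}{c^4} + \cdots
\]
\[
  E_k = (\gamma - 1)\,mc^2
      = \frac{1}{2}mv^2 + \frac{3}{8}\frac{mv^4}{c^2} + \cdots
      \;\approx\; \frac{1}{2}mv^2 \qquad \text{for } v \ll c .
\]
```

On this reading, the Newtonian expression for kinetic energy is what remains of the relativistic one when the terms of order $v^4/c^2$ and higher are negligible.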
Epistemologically speaking, then, how can we make comparisons among models?
There are at least two major theories of truth: the correspondence theory of truth and the
coherence theory of truth. The pith of the model-theoretic arguments is that insofar as we
hold the coherence theory of truth, since there are numerous coherent models out there, we
are haunted by the indeterminacy of reference. So Platonists can no longer claim, in
accordance with the coherence theory of truth, that we can get access to mathematical
objects implicitly through the consistency of the model. Hence, Platonists are forced to claim,
in accordance with the correspondence theory of truth, that we can get access to them
explicitly through a causal connection. Since Platonists claim that mathematical objects are
super-spatio-temporal entities, however, we have no choice but to assume that we have a
special epistemic faculty such as a mathematical intuition. In an earlier chapter, however, I
denied that we have a mathematical intuition by invoking the Banach-Tarski Paradox. So, it
seems, Platonism without a mathematical intuition is endangered. But can we really not be
Platonists if we reject a mathematical intuition?
One way out is to claim that the problem of the indeterminacy
of reference poses no threat to Platonism. Are there any theoretical terms whose
references are fixed once and for all? To see that there need not be, it is enough to recall
that there were multiple physical models of subatomic structure at the dawn of modern
physics and chemistry. Just as the indeterminacy of reference of a theoretical term is compatible with
scientific realism, so the indeterminacy of reference of a mathematical entity is compatible
with Platonism. But I think we should take the problem of the indeterminacy of reference
raised by the model-theoretic arguments seriously because it doesn't make sense even to
say that we are talking about something unless we can reidentify what we are talking about.

Thus we are caught on the horns of a dilemma: on the one hand we have multiple internally
consistent models and on the other hand we cannot rely on our mathematical intuition to fix
the model. I believe, however, that there are still good candidates which serve as criteria
to measure the excellence of the model, such as simplicity, comprehensiveness and
Combined with the results from the last chapter, it becomes clear that there are
limitations inherent in formal methods in the following three respects:
(1) As Gödel's First Incompleteness Theorem shows, there is a sentence in a consistent
first-order formal system containing arithmetic that is neither provable nor disprovable.
(2) As Gödel's Second Incompleteness Theorem shows, the consistency of such a first-order
formal system cannot be proven within that system.
(3) As the Löwenheim-Skolem Theorem shows, a first-order formal system cannot uniquely
fix the model.
All these are defects for formalists who rely only on formal methods, not for
Platonists who claim that we need to appeal to informal methods by going outside
of a formal system at some point. In light of the Löwenheim-Skolem results, we have
seen both Quine's thesis of the indeterminacy of translation and Putnam's model-theoretic
arguments against metaphysical realism. We learn from the model-theoretic arguments
the dependence of truth on a model. Indeed the Skolem Paradox suggests that the
cardinality of a set is relative to a model.
Nevertheless I cannot accept that the truth or falsity of the Axiom of Choice or the
Continuum Hypothesis just depends on a model. We should not give up the investigation
there. As a Platonist, I propose to examine the interrelationships among models. As I
argued in an earlier chapter by invoking the Banach-Tarski Paradox, it is implausible that
we have a special epistemic faculty such as a mathematical intuition. But I believe that
we still have the principles which serve as criteria to measure the excellence of the model.




The aim of this chapter is to meet Benacerraf's challenges to Platonism by claiming that in
mathematics essence amounts to existence. In Section 1, we shall see Benacerraf's
epistemological and ontological challenges to Platonism. In Section 2, in spite of
Benacerraf's challenges, I defend Platonic realism in mathematics by showing the
deductive power and stability of the Axiom of Choice through a variety of its applications
in many branches of mathematics. In Sections 3 and 4, we shall examine how
mathematical models are developed in the actual practice of mathematics. As a result, I
claim that true mathematical theories belong to the maximally consistent theory that
describes mathematical reality. The theoretical ground for my claim is that in
mathematics essence amounts to existence. In Sections 5 and 6, I seek the
philosophical foundation for my view by drawing upon Anselm's argument for the
existence of God and Locke's doctrine that nominal and real essences coincide in
mathematics. In Section 7, I shall show that there are two conditions to be satisfied in
order for us to derive existence from essence alone. Given the fact that in the natural sciences
existence cannot be derived from essence alone, this shows the unique nature of
mathematical existence.
6.1 Benacerraf's Challenges to Platonism
Benacerraf's challenges to Platonism are twofold: one is epistemological and the other
ontological. First of all, we shall examine Benacerraf's epistemological challenge.
Benacerraf, in "Mathematical Truth," a paper with an immense influence on the philosophy
of mathematics in recent decades, rejects the Platonists' standard view of mathematical truth.
The account Benacerraf calls "the standard view" treats a sentence like (1) and a sentence
like (2) as straightforward instances of the logical form of (3).
(1) There are at least three large cities older than New York.
(2) There are at least three perfect numbers greater than 17.

(3) There are at least three FG's that bear R to a.
This account attempts to draw a parallel between mathematics and the natural sciences.
According to Benacerraf, we can find such an account only in Tarski's theory of truth,
whose characteristic feature is to define truth in terms of reference, denotation, or
satisfaction.
On this account, there has to be some causal connection between ourselves and
mathematical objects. On the Platonists' account, however, mathematical objects are
supposed to be super-spatio-temporal and thus causally inert. So if Platonism is true, then
the causal theory of knowledge is false. And if the causal theory of knowledge is true,
then Platonism is false. Hence, Platonism stands in irreconcilable tension with the causal
theory of knowledge. To meet this challenge, Platonists have to explain how we can get
access to mathematical objects that are supposed to be super-spatio-temporal and thus
causally inert. Let me call this problem the Access Problem.
To meet Benacerraf's challenge, one could simply stick with the correspondence
theory of truth and claim that in mathematics we have a mathematical intuition just as we
have an empirical intuition in the natural sciences. As already explained, however, the
Banach-Tarski Paradox raises a doubt about whether we have a special epistemic faculty
such as a mathematical intuition.
Next, we shall consider Benacerraf's ontological challenge to Platonism in detail.
Benacerraf, in "What Numbers Could Not Be," spells out the conditions that a correct
account of numbers must satisfy. That is, it is necessary
(1) to give definitions of "1," "number," and "successor," and "+," "×," and so forth, on
the basis of which the laws of arithmetic could be derived; and
(2) to explain the "extramathematical" uses of numbers, the principal one being
counting, thereby introducing the concept of cardinality and cardinal number.
If numbers are sets, then we must be able to know which sets numbers are. But it is
well-known that there are several different reductions of numbers to sets. Frege defines
the number 3 as the class of all classes consisting of triples, that is, the extension of the
concept "equivalent with some 3-membered set" in his terminology. Also, in 1908 Zermelo
proposed to use
\[ 0 = \emptyset,\quad 1 = \{\emptyset\},\quad 2 = \{\{\emptyset\}\},\quad 3 = \{\{\{\emptyset\}\}\},\ \ldots \]
Later von Neumann proposed an alternative:
\[ 0 = \emptyset,\quad 1 = \{0\} = \{\emptyset\}, \]
\[ 2 = \{0, 1\} = \{\emptyset, \{\emptyset\}\}, \]
\[ 3 = \{0, 1, 2\} = \{\emptyset, \{\emptyset\}, \{\emptyset, \{\emptyset\}\}\},\ \ldots \]
Therefore, for Zermelo, $3 \notin 17$, whereas for von Neumann, $3 \in 17$. Their cardinality
relations are also different. On the former, every number other than 0 is single-membered,
whereas on the latter, the set for the number $n$ has $n$ members. Therefore, for the former 17
has only one member, while for the latter it has 17 members.
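The contrast between the two reductions can be checked mechanically for small numbers. The following is a minimal sketch, assuming that hereditarily finite sets are modelled as nested Python frozensets; the function names zermelo and von_neumann are hypothetical labels introduced only for this illustration.

```python
# Illustrative sketch: hereditarily finite sets modelled as nested frozensets.
# The names `zermelo` and `von_neumann` are hypothetical labels.

def zermelo(n):
    """Zermelo numeral: 0 is the empty set and n + 1 is {n}."""
    s = frozenset()
    for _ in range(n):
        s = frozenset([s])
    return s


def von_neumann(n):
    """von Neumann numeral: 0 is the empty set and n + 1 is n ∪ {n}."""
    s = frozenset()
    for _ in range(n):
        s = s | frozenset([s])
    return s


# Membership differs: 3 is a member of 17 only on von Neumann's account.
three_in_seventeen_zermelo = zermelo(3) in zermelo(17)
three_in_seventeen_von_neumann = von_neumann(3) in von_neumann(17)

# Cardinality differs: Zermelo's 17 has one member, von Neumann's has 17.
size_zermelo_17 = len(zermelo(17))
size_von_neumann_17 = len(von_neumann(17))
```

Both encodings support the successor function, yet they disagree on membership and on cardinality. That disagreement is what drives the Multiple Reduction Problem.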
Although there are differences between the two systems, the fact remains that
both can satisfy the conditions for a correct account of numbers as stated above. So we
have no principled way of deciding between these reductions. But if a number is a set,
then this set has to be unique. So Benacerraf concludes that numbers could not be sets at
all. Let me call this problem the Multiple Reduction Problem. This shows that any set
could contain some superfluous conditions irrelevant to arithmetic, so a number is too weak
to exist as a set.
In the extension of his argument, he argues that numbers could not be objects of any
sort. For there is no more reason to identify any individual number with any one
particular object than with any other (not already known to be a number). To be the
number 3 is no more and no less than to be preceded by 2, 1, and possibly 0, and to be
followed by 4, 5, and so forth. Any object can play the role of 3; that is, any object can be
the third element in some progression. Hence, according to Benacerraf, the essence of
numbers exhausts itself in the relative positions they occupy in over-all structure. Any
object could contain some superfluous conditions irrelevant to arithmetic, so a number is
not such as can be given self-subsistent existence. Arithmetic is therefore the science that

We have already seen that even though there are multiple reductions of the natural numbers to sets, von
Neumann's system is entrenched in set theory because von Neumann's system has several advantages
over Zermelo's. See p. 107.


elaborates the abstract structure that all progressions have in common merely in virtue of
being progressions. Arithmetic is not a science concerned with particular objects, the
numbers.
It might be objected to Benacerraf's ontological challenge that even if it is granted
that numbers are not objects, it does not follow from this that numbers are not anything
unique. Also, it might be objected that whether or not numbers are objects depends on
what it is to be an object. Indeed, as Benacerraf himself recognizes, his opponent may
simply bite the metaphysical bullet, affirming the possibility of objects that lack any inner
intrinsic nature and whose essence consists entirely in relations to other objects. In any
case, it seems to me that Benacerraf's point is that we don't have to be committed to the
existence of numbers in order for arithmetical propositions to be true.
6.2 Some Applications of the Axiom of Choice
In spite of Benacerraf's challenges, I believe that Platonic realism fits in well with the
actual practice of mathematicians. Also, we can actually gain a more fruitful picture of
mathematics by adopting working hypotheses based on Platonic realism. Despite the early
criticisms, not only did the Axiom of Choice play a major role in the systematization of
Cantor's set theory, but it also had a tremendous impact on many branches of mathematics
outside set theory. This means that the Axiom of Choice is not a freakish ad hoc principle
formed in the development of mathematics, but a stable principle which is widely
applicable in many branches of mathematics. The deductive power of the Axiom of
Choice is borne out by the fact that even some of the opponents of the Axiom of Choice
used it implicitly.
Zorn's Lemma is the key to the various applications of the Axiom of Choice.
Indeed, we can prove numerous theorems in mathematics by using Zorn's Lemma. For
instance, using Zorn's Lemma, we can prove in linear algebra that every vector space $V$ has
a basis. A basis is a maximal independent subset of $V$. For example, when $V$ is
$\mathbb{R}^n$, the subset
$\{(1, 0, \ldots, 0, 0), (0, 1, \ldots, 0, 0), \ldots, (0, 0, \ldots, 1, 0), (0, 0, \ldots, 0, 1)\}$
is called the standard basis. So the theorem amounts to the claim that every vector space $V$
has a maximal independent subset. Let $S = \{X : X \text{ is an independent subset of } V\}$.
Partially order $S$ by inclusion. Let $C$ be any chain of $S$. Let $C = \{X_i : i \in I\}$.
Consider the set
\[ U = \bigcup_{i \in I} X_i. \]
$U$ is a subset of $V$. $U$ is an upper bound of $C$ because, for every $X_i \in C$,
$X_i \subseteq U$. In order to apply Zorn's Lemma, we now show that $U \in S$. For reductio,
suppose that $U \notin S$. That is, $U$ is dependent. So there is some element of $U$ that is
expressed as a linear combination of finitely many other elements of $U$. But since $U$ is just
the union of the independent subsets of $V$ in $C$, and $C$ is a chain, we can find a single set
$X$ in $C$ that contains all of these linearly dependent elements. By assumption, however, $X$
is an independent subset of $V$. A contradiction. Therefore $U \in S$. Applying Zorn's Lemma,
$S$ has a maximal element. So we conclude that $V$ has a maximal independent subset, i.e., a
basis. This seems to be an intuitive consequence of Zorn's Lemma. In the two-dimensional
space $\mathbb{R}^2$, $\{(1, 0), (0, 1)\}$ is the standard basis. In the three-dimensional space
$\mathbb{R}^3$, it is $\{(1, 0, 0), (0, 1, 0), (0, 0, 1)\}$. Extending this argument, the claim that
every vector space $V$ has a basis is not so surprising, though it's harder to prove when it
comes to infinite-dimensional vector spaces.
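In the finite-dimensional case the maximal independent subset can be found by a simple greedy search, with no appeal to Zorn's Lemma. The sketch below is only a finite analogue of the proof above, assuming vectors with rational coordinates; the helper names rank, is_independent and extend_to_maximal are hypothetical. It does not replace Zorn's Lemma, which is needed precisely where such a step-by-step search cannot be completed.

```python
from fractions import Fraction

# Illustrative sketch for finite-dimensional spaces over the rationals.
# All function names here are hypothetical, chosen for this example only.

def rank(vectors):
    """Rank of a list of vectors, computed by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivot rows found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def is_independent(vectors):
    """A finite list of vectors is independent iff its rank equals its length."""
    return rank(vectors) == len(vectors)


def extend_to_maximal(independent, candidates):
    """Greedily add candidates while the set stays independent."""
    basis = list(independent)
    for v in candidates:
        if is_independent(basis + [v]):
            basis.append(v)
    return basis


candidates = [(1, 1, 0), (2, 2, 0), (0, 1, 0), (0, 0, 1), (1, 2, 3)]
basis = extend_to_maximal([], candidates)
```

Here basis is maximal among the independent subsets of candidates: adding any remaining candidate makes the set dependent, which mirrors the maximality condition in the proof.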
Also, in topology Zorn's Lemma is useful for proving the important Tychonoff Product
Theorem for compact spaces: if all the factors are compact, then the product is also compact.
6.3 The Axiom of Determinacy
In the next two sections, we shall examine how mathematical models have been developed
in the actual practice of mathematics. Two Polish mathematicians, Mycielski and
Steinhaus, introduced an axiom that contradicts the Axiom of Choice: the Axiom of
Determinacy. Imagine a game in which two players alternate in choosing natural numbers.
Player I starts and chooses $a_0$, then Player II chooses $a_1$, then Player I chooses $a_2$, and so
forth. After $\omega$ moves, the players have constructed an infinite sequence
$a = (a_0, a_1, a_2, \ldots)$. Let $\omega^\omega$ be the set of all infinite sequences of natural
numbers and let $A \subseteq \omega^\omega$. Player I wins the game $G_A$ associated with $A$ if
$a \in A$, and Player II wins if $a \notin A$. We say that $A$ is determined if one of the two
players has a winning strategy in the associated game $G_A$. The Axiom of Determinacy tells us
that for every $A \subseteq \omega^\omega$, the game $G_A$ is determined.
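For games of finite length, determinacy is not an axiom but a fact that can be verified by backward induction. The following sketch illustrates this finite analogue only, assuming a fixed number of moves drawn from a finite set; the names has_winning_strategy and every_game_is_determined are hypothetical. It says nothing about the infinite games $G_A$ themselves, where no such exhaustive search is available.

```python
from itertools import combinations, product

# Illustrative sketch only: a FINITE analogue of the game G_A.
# The function names below are hypothetical and chosen for this example.

def has_winning_strategy(player, prefix, length, moves, payoff):
    """Backward induction: can `player` ("I" or "II") force a win from `prefix`?"""
    if len(prefix) == length:
        in_payoff = prefix in payoff
        return in_payoff if player == "I" else not in_payoff
    # Player I moves at even positions, Player II at odd positions.
    to_move = "I" if len(prefix) % 2 == 0 else "II"
    results = [
        has_winning_strategy(player, prefix + (m,), length, moves, payoff)
        for m in moves
    ]
    # On one's own turn a single good move suffices; on the opponent's
    # turn every possible move must still leave `player` winning.
    return any(results) if to_move == player else all(results)


def every_game_is_determined(length, moves):
    """Check that for every payoff set exactly one player has a winning strategy."""
    plays = list(product(moves, repeat=length))
    for size in range(len(plays) + 1):
        for chosen in combinations(plays, size):
            payoff = set(chosen)
            wins_one = has_winning_strategy("I", (), length, moves, payoff)
            wins_two = has_winning_strategy("II", (), length, moves, payoff)
            if wins_one == wins_two:
                return False
    return True
```

For example, every_game_is_determined(3, (0, 1)) examines all 256 payoff sets for three-move games over two possible moves. The Axiom of Determinacy extends this pattern to all infinite games, and it is that extension which conflicts with the Axiom of Choice.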
Here is a set of two competing theories without inner contradictions (Theory-Choice I):
(1) ZF + the Axiom of Choice
(2) ZF + the Axiom of Determinacy
Most mathematicians reject the Axiom of Determinacy and accept the Axiom of Choice.
For the Axiom of Determinacy implies that every set of real numbers is measurable and has
the Baire property, and also that the set of real numbers cannot be well-ordered, whereas the
Axiom of Choice implies the existence of non-Lebesgue measurable sets and the
well-ordering of real numbers, which provide us with a more fruitful picture of
mathematics. This fact shows that since there are some cases in which we could have
multiple logically consistent competing theories, it is not sufficient to say that every
logically consistent theory is a mathematical theory. More precisely speaking, the theories
mathematicians are actually working on are the parts of the maximally logically consistent
theory that describes mathematical reality.
6.4 The Axiom of Constructibility
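In 1960 Dana Scott proved a very simple result that the existence of a measurable cardinal
implies $V \neq L$; equivalently, if $V = L$, there are no measurable cardinals. Before we
evaluate this result, however, we shall see what the Axiom of Constructibility is. In ZFC
the universe $V$ of all sets is divided into a hierarchy of sets $V_\alpha$ by transfinite
induction on $\alpha$. At successor ordinals we take the power set of the previous stage, and at
limit ordinals the union of the preceding stages. Each $V_\alpha$ is a transitive set, and
$V_\alpha \subseteq V_\beta$ whenever $\alpha \leq \beta$. In symbols, the hierarchy is given by
\[ V_0 = \emptyset, \qquad V_{\alpha+1} = \mathcal{P}(V_\alpha), \qquad V_\lambda = \bigcup_{\alpha < \lambda} V_\alpha \ \text{ for limit } \lambda. \]
By the Axiom of Foundation, every set in the universe $V$ is a member of some $V_\alpha$. So,
\[ V = \bigcup_{\alpha \in \mathrm{Ord}} V_\alpha. \]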
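The first few stages of this hierarchy are small enough to compute directly. The sketch below builds only the finite stages, assuming that sets are modelled as nested Python frozensets; the names power_set, V and is_transitive are hypothetical labels for this illustration, and nothing here reaches the limit stages.

```python
from itertools import combinations

# Illustrative sketch: finite stages of the cumulative hierarchy only.
# The names `power_set`, `V` and `is_transitive` are hypothetical.

def power_set(s):
    """The set of all subsets of the finite set s."""
    items = list(s)
    return frozenset(
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    )


def V(n):
    """Stage n for finite n: start from the empty set, apply power set n times."""
    stage = frozenset()
    for _ in range(n):
        stage = power_set(stage)
    return stage


def is_transitive(s):
    """s is transitive if every member of s is also a subset of s."""
    return all(x <= s for x in s)


stage_sizes = [len(V(n)) for n in range(5)]
```

The sizes grow as 0, 1, 2, 4, 16, and the next stage already has 65536 members. Each finite stage is transitive and included in the next, matching the two properties stated above.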


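We shall modify the above definition by using the function $S \mapsto \mathrm{Def}(S)$ instead of the
function $S \mapsto \mathcal{P}(S)$.

A set is definable over $S$ if there is a formula $\varphi(x)$, with some members $a_1, a_2, \ldots, a_n$ of $S$
as parameters, such that the members of the set are exactly the elements $x$ of $S$ that satisfy $\varphi(x)$.
$\mathrm{Def}(S)$ is the set of sets which are definable over $S$. A set is ordinal-definable if
$a_1, a_2, \ldots, a_n$ are ordinals.

\[ L_0 = \emptyset, \qquad L_{\alpha+1} = \mathrm{Def}(L_\alpha), \qquad L_\lambda = \bigcup_{\alpha < \lambda} L_\alpha \ \text{ for limit } \lambda. \]
Now we define
\[ L = \bigcup_{\alpha \in \mathrm{Ord}} L_\alpha. \]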
The Axiom of Constructibility tells us that every set is constructible, i.e., $V = L$. In
short, the Axiom of Constructibility does not admit the full universe of sets, but only a
restricted universe of the constructible sets. In other words, the Axiom of Constructibility
rejects the existence of a non-constructible set. But we have to note that the term
"constructible" here is used in too broad a sense for Constructivists to accept. The Axiom
of Constructibility is so strong a hypothesis that it implies the Axiom of Global Choice,
which is the strongest form of the Axiom of Choice, and the Generalized Continuum
Hypothesis. Also, the Axiom of Ordinal-definability is a consequence of the Axiom of
Constructibility.
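Now, we shall define the Borel sets:

$\Sigma^0_1$ = the open sets
$\Pi^0_1$ = the closed sets
$\Sigma^0_2$ = countable unions of closed sets
$\Pi^0_2$ = complements of $\Sigma^0_2$ sets
$\Sigma^0_\alpha$ = countable unions of $\Pi^0_\beta$ sets for $\beta < \alpha$
$\Pi^0_\alpha$ = complements of $\Sigma^0_\alpha$ sets
$\Delta^0_\alpha$ = sets that are both $\Sigma^0_\alpha$ and $\Pi^0_\alpha$
Borel sets = the union of the $\Sigma^0_\alpha$ for countable ordinals $\alpha$
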
The Borel sets are constructed from below, as it were, and well-behaved. By contrast,
the non-measurable sets are built from above and pathological. The Axiom of Choice
ensures that there are non-measurable sets of real numbers, while the Borel sets of real
numbers are Lebesgue measurable.

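Then, we shall define the projective sets:

$\Sigma^1_0$ = the open sets
$\Pi^1_0$ = the complements of $\Sigma^1_0$ sets
$\Sigma^1_1$ = the projections of $\Pi^1_0$ sets
$\Pi^1_1$ = the complements of $\Sigma^1_1$ sets
$\Sigma^1_{n+1}$ = the projections of $\Pi^1_n$ sets
$\Pi^1_{n+1}$ = the complements of the $\Sigma^1_{n+1}$ sets
$\Delta^1_n$ = sets that are both $\Sigma^1_n$ and $\Pi^1_n$
Projective sets = the union of the $\Sigma^1_n$ for finite $n$
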
$\Sigma^0_n$ and $\Pi^0_n$ are formulas with only first-order quantifiers, and formulas with only
first-order quantifiers are called arithmetical. So the hierarchy of Borel sets is a
counterpart of the arithmetical hierarchy. We also have seen that a $\Sigma^0_1$ set is a recursively
enumerable set and a $\Pi^0_1$ set is a co-recursively enumerable set.
$\Delta^0_1 = \Sigma^0_1 \cap \Pi^0_1$. So a $\Delta^0_1$ set is a recursive set.
$\Sigma^1_n$ and $\Pi^1_n$ are formulas with second-order quantifiers as
well, and formulas with first-order and second-order quantifiers are called analytical. So
the hierarchy of projective sets is a counterpart of the analytical hierarchy.
Here is another set of two competing theories without inner contradictions
(Theory-Choice II):
(1) ZFC + there exists a measurable cardinal
(2) ZFC + the Axiom of Constructibility ($V = L$)
Many mathematicians reject the Axiom of Constructibility and claim $V \neq L$. For, though
the Axiom of Constructibility implies that there are no measurable cardinals, the
assumption of the existence of a measurable cardinal opens up more possibilities of fruitful

For the Borel sets and the projective sets, Maddy gives an informal presentation in Realism in Mathematics
(p. 112, 113). For formal details, see e.g. Hinman, Recursion-Theoretic Hierarchies, p. 84, Jech, Set Theory,
p. 140, 144, Soare, Recursively Enumerable Sets and Degrees, p.70.
See p. 74.

mathematics. We have to note that the two theories give opposite answers to the question
about the measurability of $\Sigma^1_2$ sets, which ZFC alone cannot answer. On the one hand, the
Axiom of Constructibility implies that there exist $\Sigma^1_2$ sets which are non-measurable and
don't have the Baire property. On the other hand, as Solovay showed in 1967, the
existence of measurable cardinals implies that every $\Sigma^1_2$ set of reals is measurable, has
the Baire property, and has the perfect set property. We could say that contemporary
mathematics moves in a direction that opens the possibility of more fruitful mathematics
guaranteed by the Platonic assumption, insofar as it is internally consistent. From all the
above, I shall meet Benacerraf's challenge by claiming that true mathematical theories
belong to the maximally consistent theory that describes mathematical reality.

Figure 27: Example of actual practice of mathematics.

6.5 Anselm's Argument for the Existence of God
The theoretical ground for my claim is provided by the idea that in mathematics essence
amounts to existence. In general, we cannot derive existence from essence alone. For
instance, although Sherlock Holmes has the properties of being a character created by
Conan Doyle and living in Baker Street, he is a fictional character. As far as mathematical
objects are concerned, however, essence and existence coincide. We can find the
prototype of this argument in Anselm's proof of the existence of God. My aim here is not
to discuss whether or not Anselm's argument for the existence of God is sound. Rather, my
point is that this argument applies to the existence of mathematical objects mutatis
mutandis. We can see the core of Anselm's argument for the existence of God in his
Proslogion, Chapters II and III. God is supposed to be
that-than-which-a-greater-cannot-be-thought. Anselm says even the Fool agrees that
that-than-which-a-greater-cannot-be-thought exists in the mind at least. For otherwise it
would not make sense for the Fool to deny that
that-than-which-nothing-greater-can-be-thought actually exists. But it may be objected
that some things exist only in the mind and do not actually exist. To forestall an objection like
this, Anselm takes as an example the picture in the mind of a painter. In this
example, the painter's having the picture in mind is not sufficient for saying that the picture
actually exists. We can say that the picture actually exists only after the painter has executed
the painting. But this is not the case with the existence of God.
Now we shall show that that-than-which-a-greater-cannot-be-thought also exists in
reality. Assume for reductio ad absurdum that
that-than-which-a-greater-cannot-be-thought exists in the mind alone. Even so
that-than-which-a-greater-cannot-be-thought can be thought to exist in reality. The latter
would be greater than the former. So that-than-which-a-greater-cannot-be-thought is
that-than-which-a-greater-can-be-thought. A contradiction. Therefore,
that-than-which-a-greater-cannot-be-thought also exists in reality. We have to note that
the matter at stake here is that-than-which-a-greater-cannot-be-thought, not
Also, the existence of God is of a unique nature such that God cannot be thought not
to exist. Again, assume for reductio ad absurdum that
that-than-which-a-greater-cannot-be-thought can be thought not to exist.
Something-that-cannot-be-thought-not-to-exist is greater than
something-that-can-be-thought-not-to-exist. So
that-than-which-a-greater-cannot-be-thought is that-than-which-a-greater-can-be-thought.
A contradiction. Therefore, that-than-which-a-greater-cannot-be-thought cannot be
thought not to exist. This means that that-than-which-a-greater-cannot-be-thought
necessarily exists.
The upshot of Anselm's argument is that in God essence and existence coincide.
This means that we can derive the existence of God from the concept alone insofar as it has
no inner contradictions. My claim is that in mathematics essence and existence coincide.
I shall argue that we can derive the existence of mathematical objects from the concept
alone insofar as it has no inner contradictions. Although the object of his intention was
different from mine, I attend to Anselm's proof only as a primary source of
the argument that essence amounts to existence. It would be an overreaction to flatly reject
the argument itself just because it concerns the proof of the existence of
God. The success or failure of Anselm's proof depends more on what Anselm means by
"God" than on the argument itself. We have seen that by "God" Anselm means
that-than-which-a-greater-cannot-be-thought. It seems to me that this notion of God is
quite different from what we normally call "God." But it would be too great a digression
to discuss here whether Anselm's notion of God should be taken to be God or not.
I'm not totally committed to Anselm's proof of the existence of God, still less to the
existence of God by his proof. I'm committed to Anselm's proof only in the sense that his
argument gives us a clue about how we get access to an object to be known when there is
no causal connection between ourselves and the object. So Anselm's proof contains an
important insight into the existence of mathematical objects as the paradigmatic example of
that, although he might not have intended it by his proof.
6.6 Locke on Essences
In this connection, it is worth mentioning Locke's philosophy of mathematics. Although
Locke is often referred to as the father of British Empiricism, it is worth noting that he has
a unique philosophy of mathematics. Given that Locke gives mathematics a special
status, it would be rash to criticize his philosophy of mathematics for having the defects
often attributed to empiricist philosophers of mathematics (e.g., J. S. Mill),
that is, not being able to explain higher-order mathematical objects, such as sets, large
numbers, and the infinite divisibility of space. What makes Locke's philosophy of
mathematics unique is the distinction between nominal and real essences and his claim that
in mathematics nominal and real essences coincide. By contrast Locke claims that in

Locke also claims that in morality as well nominal and real essences coincide. But I want to focus on
mathematics alone in this paper.

substances we have no idea of the real essence at all.
The significance of Locke's theory of essences can be seen more clearly when
considered in light of the Aristotelian doctrine of substantial forms. I grant that it is an
oversimplification to identify the proto-Aristotelian notion of a substantial form with the notion
of a substantial form which modern philosophers reject as Aristotelian. The notion of a
substantial form took on new life among scholastic Aristotelians, and was developed in
ways that Aristotle himself never suggested. But the notion of a substantial form has its
roots in Aristotle's physical conception of form as one of the four causes, along with his
metaphysical condition that form, above all else, is substance in the primary sense. So it
seems to me that there is good reason for modern philosophers to ascribe the doctrine of
substantial forms to Aristotle.
Aristotle rejected the atomic theory of matter and saw matter as extending in a
continuum throughout the universe, leaving no void. Aristotle thinks that everything
under the moon is compounded from the four elements which he borrowed from
Empedocles: earth, water, air, and fire. But they are all composed of the same prime
matter, and are differentiated by their simple qualities, one from each pair of opposites
hot/cold and wet/dry. Here it is worth noting that after the quantitative, mathematical
theories of the atomists and Plato, Aristotle appeals to a qualitative doctrine. Thus earth is
cold and dry, water wet and cold, air hot and wet, fire hot and dry.

The Greek terms hygron and xeron are wider than "wet" and "dry" in English, for hygron refers to both
liquids and gases, and xeron especially, but not exclusively, to solids. (See Lloyd, Early Greek Science:
Thales to Aristotle, p. 107)

When the four elements are combined to form more complex substances, the
qualities which are innate in the matter composing the elements are also combined,
producing the properties of the mixture. Aristotle makes an absolute distinction between
up and down. This leads him to distinguish heavy bodies, which naturally tend to
move down toward the center, from light bodies, which tend to move up away from the center. The
heaviest elements tend to be gathered together nearest the center, the lightest to be furthest
from it. Each element thus has its natural place, that of water being immediately above
earth, that of air next, and that of fire furthest from the center and nearest to the regions
occupied by celestial matter. This line of thought bears fruit in the doctrine of substantial
forms and real qualities. That is, heat, color, and a host of other physical properties are
thought to be real, innate, and intrinsic in bodies.
Locke assumes that things are particular, whereas most words are general, and
abstraction is what he takes to make ideas general. The point is that we get abstract
ideas by separating particular ideas from all other existences, such as space and time. For
example, we can get the abstract idea of man by leaving out that which is peculiar to Peter
and James, Mary and Jane, and retaining only what is common to them all. This procedure
can be repeated to yield the still more abstract idea of animal. According to Locke,
however, the categorization or classification of natural kinds is the workmanship of the
understanding, and in this sense the essences of natural kinds are nominal essences. The
real essence on which the sensible properties depend is unknown to us.
Take gold as an example. We have the complex idea of gold, e.g., yellowness,
weight, fusibility, malleableness, and solubility in aqua regia, but it is just the nominal
essence of gold. Since we have no idea at all of the real essence of gold, it is impossible
for us to know with certainty whether or not these properties are universally true of gold and
how they are connected to each other. Their connection can be ascribed to an arbitrary will
of God, so we have only experimental knowledge of it.
But, Locke says, there are some cases in which the ideas contain natural connections
among themselves, and in these cases alone we have certain and universal knowledge.
Here Locke has in mind the ideas which are the nominal as well as real essences: "Three
angles of a triangle are equal to two right ones." Locke claims that in mathematics
nominal and real essences coincide. According to Locke, the ignorance of mathematical
truths is not due to any imperfection of our faculties or uncertainty in the things themselves,
but to a failure in acquiring, examining, and comparing our own ideas. (E, IV, iii, 30)
When Locke says that in mathematics nominal and real essences coincide, I don't
take it that Locke claims that mathematical truths are fictional or analytic. According to
Locke, mathematical truths are not only certain but real knowledge, and not "the bare empty
vision of vain, insignificant chimeras of the brain." (E, IV, iv, 6) Also, Locke actually
classifies truth or knowledge into two kinds: verbal/trifling and real/instructive. Indeed
Locke acknowledges that there are general propositions which are true but do not increase
our knowledge. For example, propositions such as "the whole is equal to all its
parts" or "if you take equals from equals, the remainder will be equal" do not help us
increase our knowledge. When it comes to mathematics, however, Locke claims that the
mathematical truth "the external angle of all triangles is bigger than either of the opposite
internal angles" is not contained in the complex idea "triangle." He goes on to say that
this is a real truth and conveys with it instructive real knowledge. (E, IV, viii, 8)
Although Locke argues that we cannot discover the real essence of natural kinds, this
does not necessarily mean that Locke flatly denies the existence of real essences. The
following two are quite different claims:
(1) There are no real essences of natural kinds.
(2) We do not even know whether or not there are real essences of natural kinds. If there are any,
they are unknown to us.
Locke just argues against taking our ideas of natural kinds for real essences, but he doesn't
deny the existence of real essences of natural kinds. In this respect, I agree with Bolton's
claim that Locke's anti-essentialist doctrine of nominal essences holds without denying the
existence of real essences. Although Locke's theory of essences is often construed as
anti-essentialism, this construal should apply only to nominal essences. To put it
dramatically, Locke's anti-essentialist view of nominal essences is compatible with
essentialist metaphysics.
Although there is no doubt that Locke's doctrine of the coincidence of nominal and
real essences in mathematics is significant, I believe that it falls short of the Platonic view
of mathematical existence. Since for Locke all there is in mathematics is the ideas
inside our minds, we don't have to care about external existence corresponding to them (E,
IV, iv, 8). But since Platonists literally believe in mathematical objects outside our minds,
some explanation to link essence with existence in mathematics is still needed for them.

Locke's definition of certain and real knowledge is as follows: "Wherever we perceive the agreement or
disagreement of any of our ideas, there is certain knowledge: and wherever we are sure those ideas agree with
the reality of things, there is certain real knowledge." (E, IV, iv, 18)
Bolton, "The Relevance of Locke's Theory of Ideas to His Doctrine of Nominal Essence and
Anti-Essentialist Semantic Theory," in Chappell, Locke, pp. 214-225.

6.7 Essence and Existence in Mathematics
Even though the main interest of the contemporary debate on essentialism lies not in
mathematical or logical essence but rather in natural kinds, when the latter is seen from the
perspective of the former, we can clearly see the characteristic features of natural kinds. We
should make a distinction between two cases in which a truth holds:
(1) Since it is impossible that there does not exist an object which the truth is about, the
truth holds in every possible world.
(2) Since it is possible that there does not exist an object which the truth is about, the truth
holds only in every possible world where the object exists.
The former is concerned with mathematical objects and the latter with natural kinds (if any).
This distinction also coincides with that between the necessary a priori and the necessary a
posteriori. In this light, we can clearly see why in mathematics nominal and real essences
coincide, whereas in empirical sciences they diverge. The world has a hierarchical
structure organized by how nominal and real essences are interwoven.
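One way to make the two cases distinguished above explicit is to write them in quantified modal notation. This is only a sketch under assumptions that are not in the text: $a$ names the object the truth is about, $\varphi(a)$ is the truth in question, $E(a)$ abbreviates $\exists x\,(x = a)$, and $\Box$ and $\Diamond$ are read as "in every possible world" and "in some possible world."

```latex
% Case (1): the object exists necessarily, so the truth holds in every world.
\[
  \Box E(a) \;\wedge\; \Box\bigl(E(a) \rightarrow \varphi(a)\bigr)
  \;\Longrightarrow\; \Box\,\varphi(a)
\]
% Case (2): the object might not exist, so the truth is necessary only
% conditionally on the existence of the object.
\[
  \Diamond\,\neg E(a) \;\wedge\; \Box\bigl(E(a) \rightarrow \varphi(a)\bigr)
\]
```

On this reading the two cases share the conditional necessity $\Box(E(a) \rightarrow \varphi(a))$ and differ only in whether the existence of the object is itself necessary.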
Now, with the aid of the arguments above, I shall claim that there are two conditions to
be satisfied in order for us to derive existence from essence alone.
(1) Insofar as we think, we could think existence without contradiction. In other words,
except when we don't think, we could not think non-existence.
(2) Existence contains no empirical content.
This applies only to mathematical objects. I mean that, by (1), actual mathematical
theories belong to the maximally consistent theory, and by (2), mathematical truths are a
priori.
Even in natural sciences we can find some examples in which existence was
successfully derived from essence. In 1846 Leverrier and Adams predicted the existence
of Neptune on the ground that its gravitational force would explain the observed anomalies
in the motion of Uranus. Based on this conjecture, Galle actually discovered the planet.
In this case, we could say that there were indirect causal connections with the planet. So
the discovery of germanium would be a more appropriate example here. Before Winkler discovered
germanium in 1886, Mendeleev had predicted many properties of the metal from the gap in the
periodic table. In this case, there were no causal connections at all with the metal.
From the history of science, however, we also know there were some failed attempts
to derive existence from essence. The ether and phlogiston hypotheses are two typical
examples. In natural sciences, on the one hand, even if a theory which predicts the existence
of an object is internally consistent, the theory is not confirmed until we can establish the
real existence of the object. On the other hand, even if the existence of an object is
empirically verified, we need to corroborate the existence by formulating a theory that
explains it.
6.8 What is a Maximally Consistent Theory?
If consistency is the only criterion that mathematical theories must satisfy, there are
multiple internally consistent but mutually inconsistent theories. By arguing that in
mathematics essence amounts to existence, I mean that mathematical theories should satisfy
maximal consistency. In the last chapter, as a Platonist, I proposed to examine the
interrelationships among the models. Actually, the independence proofs give us clues as
to how the models are interrelated. In Gödel's inner model L, where V = L holds, both the
Axiom of Choice and the Continuum Hypothesis hold. L is the smallest proper class that is a
first-order universe. But the inner model method doesn't work in proving the
independence of the Continuum Hypothesis. Let M be a countable transitive model for
ZFC. Cohen's method of forcing shows that in a suitable generic extension of M (i.e.,
M[G]), the Axiom of Choice is true but the Continuum Hypothesis is false. But the
forcing method can also be used to construct a model in which the Axiom of Choice doesn't
hold. This means that there is a submodel N, in which the Axiom of Choice fails, such that
$M \subseteq N \subseteq M[G]$. From this
perspective, I'm inclined to think that the Axiom of Choice is true, but the Continuum
Hypothesis is false.
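The relationships among the models just described can be collected in one display. This is a summary sketch, not a reproduction of the author's figure: the satisfaction symbol $\models$ and the abbreviations ZF, ZFC, CH, and AC are notation assumed here, and each statement about M[G] and N holds only for a suitable choice of the generic set G.

```latex
% L: Goedel's constructible universe; M: a countable transitive model of ZFC;
% M[G]: a generic extension of M; N: an intermediate submodel.
\[
  L \models \mathrm{ZFC} + \mathrm{CH}
\]
\[
  M \;\subseteq\; N \;\subseteq\; M[G], \qquad
  N \models \mathrm{ZF} + \neg\mathrm{AC}, \qquad
  M[G] \models \mathrm{ZFC} + \neg\mathrm{CH}
\]
```

Read this way, the model in which the Axiom of Choice fails sits inside a larger model in which it holds, which is the sense in which the Axiom of Choice is favored by maximality.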


Figure 28: Example of the interrelationships among the models.

Since two models with different cardinalities are enough to prevent isomorphism, all
we can hope is that all models with the same cardinality are isomorphic. A first-order
formal system S is $\kappa$-categorical if all models of S of the infinite cardinality $\kappa$ are
isomorphic. Vaught's test tells us an interesting fact about the relation between
$\kappa$-categoricity and completeness: If all models of S are infinite and S is $\kappa$-categorical for
some infinite cardinal $\kappa$, then S is complete. A consistent set theory such as ZFC, however,
has only infinite models and is incomplete, so by contraposition it is not $\kappa$-categorical for
any infinite cardinal $\kappa$. This means that there are
non-isomorphic models with the same cardinality. Which model should we choose in this
case? Since the two models are of the same cardinality, whether or not there is a one-one
correspondence between them is not good enough to compare the size of the
models. For instance, there is a one-one correspondence between the set N of natural
numbers and the set Q of rational numbers. Should we choose the set of natural numbers
on the strength of Kronecker's dictum, "God made the natural numbers; all else is the work of man"?
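Vaught's test, together with its contrapositive, can be stated compactly. The notation below is assumed for illustration and is not the author's: $\mathcal{A}$ and $\mathcal{B}$ range over models of S, $|\mathcal{A}|$ is the cardinality of the domain of $\mathcal{A}$, $\cong$ denotes isomorphism, and the language of S is taken to be countable.

```latex
% kappa-categoricity: any two models of cardinality kappa are isomorphic.
\[
  S \text{ is } \kappa\text{-categorical} \iff
  \forall \mathcal{A}, \mathcal{B} \models S\;
  \bigl(|\mathcal{A}| = |\mathcal{B}| = \kappa
        \rightarrow \mathcal{A} \cong \mathcal{B}\bigr)
\]
% Vaught's test (countable language, S has only infinite models).
\[
  \exists \kappa \ge \aleph_0\; (S \text{ is } \kappa\text{-categorical})
  \;\Longrightarrow\; S \text{ is complete}
\]
% Contrapositive: an incomplete S is categorical in no infinite cardinal.
\[
  S \text{ is incomplete}
  \;\Longrightarrow\;
  \forall \kappa \ge \aleph_0\;\exists \mathcal{A}, \mathcal{B} \models S\;
  \bigl(|\mathcal{A}| = |\mathcal{B}| = \kappa
        \wedge \mathcal{A} \not\cong \mathcal{B}\bigr)
\]
```

A standard instance of the test is the theory of dense linear orders without endpoints: it has no finite models and is $\aleph_0$-categorical, so it is complete.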

A good epistemology must be able to answer the question of which theory we should
choose when facing multiple internally consistent but mutually inconsistent theories. In
my view, the problem with an appeal to a non-natural mental faculty, such as a
mathematical intuition, is that it cannot give an objective criterion for answering this question. Here
maybe we should be reminded of the three criteria, other than consistency, that Quine believes
we should adopt in formulating scientific hypotheses: simplicity, familiarity,
and sufficient reason. In the above example, however, the set of natural numbers is a
subset of the set of rational numbers. So we should favor the set of rational numbers over
the set of natural numbers. So in the case of two models with the same cardinality as well,
we can maintain the criterion of maximality. Admittedly, there are cases in which we cannot
tell for sure which theory maximizes the realm of mathematical objects. But considering
that mathematical theories are not finished but still developing, some theories are
more fruitful than others in the sense that they lead to further research by the method of trial
and error.
We cannot say that physical objects actually exist on the ground that we can think of
them. As Kant says, "in my financial condition there is more with a hundred actual
dollars than with the mere concept of them." In regard to mathematical objects, however,
if they were to exist only in the mind and not actually exist, it wouldn't even make sense to say
that we can think of them. In mathematics as well as in natural sciences we are thinking
not about objects inside our minds but about objects outside our minds. But physical
objects need empirical materials to be realized, whereas mathematical objects contain no
empirical content. So in mathematics alone, there is no problem of whether or not there
is an object corresponding to a concept; a concept turns into an object by default. For
instance, a unicorn does not actually exist because it needs a horn, head, nose, etc., whereas
a measurable cardinal actually exists because it does not need such empirical materials.

Kant, Critique of Pure Reason (A599/B627).

It seems to me that the relation between essence and existence has some implications
for the contemporary debates on possible worlds. My concern is to what degree an
analogy holds between the existence of possible worlds and that of mathematical
objects. I shall draw upon Lewis's possibilism. On the one hand, Lewis attempts to justify
possible worlds in parallel with mathematical objects, in that the existence of both
can be accepted even without a causal connection between ourselves and the objects.
On the other hand, he claims that mathematical objects are different from possible
worlds in that the former are abstract objects, while the latter are concrete objects.
Considering the events outside the light cone, I grant that there are many concrete
objects to which we cannot get access. But they are concrete objects, which contain
empirical content. Since the existence of possible worlds does not satisfy condition
(2) stated in the last section, essence alone cannot guarantee existence. In the
case of mathematics, since the existence of mathematical objects does not contain empirical
content, essence alone can guarantee existence. But my view is not that there are no
possible worlds, but rather that we just cannot show whether or not possible worlds exist.
We have seen that most mathematicians prefer the Axiom of Choice to the Axiom of
Determinacy in favor of the existence of non-Lebesgue measurable sets and the
Well-Ordering of reals. We have also seen that most mathematicians reject the Axiom of
Constructibility in favor of the existence of a measurable cardinal. In both cases, working
mathematicians are driven by Platonic realism rather than Constructivism. Not only does
Platonic realism fit in well with the actual practice of mathematicians; more importantly,
we can actually gain a more fruitful picture of mathematics by adopting working hypotheses based
on Platonic realism.
The fact that there are some cases in which we could have multiple logically
consistent competing theories shows that it is not appropriate to say that every logically
consistent theory is a mathematical theory. More precisely, we should say that the actual
mathematical theories are the parts of the maximally logically consistent theory that
describes mathematical reality. Platonists put emphasis on the difference in existential
nature between concrete and abstract objects. Also, as far as abstract objects are
concerned, Platonists see a close connection between essence and existence, or possibilities
and actualities. The problem with this argument is that it is somewhat dogmatic and
heavily depends on the rationalist tradition. But this argument is destined to be of a
self-foundational nature.
We should of course seek to explain the established world as efficiently as possible.
But we also have to investigate the new world. If the mathematical objects should be
restricted to those that are indispensable to empirical sciences, mathematical objects such as
measurable cardinals cannot be accepted. If we admit the significance of mathematical
activities themselves, however, we can believe in such mathematical objects. What
working mathematicians are doing is not blindly increasing the number of mathematical
entities, but rather introducing new potential objects into the mathematical universe.



The Axiom of Choice has been a paradigm case for the debate between Platonic realism and
Anti-Platonism since its appearance early in the last century. Despite the early criticisms,
not only did the Axiom of Choice play a major role in the systematization of Cantor's set
theory, but it also had a tremendous impact on many branches of mathematics outside set
theory. A variety of applications show the deductive power and stability of the Axiom of
Choice. This also suggests that there are a lot of things we can do in the presence of the
Axiom of Choice that we couldn't do otherwise.
Against the actual practice of mathematics, the trend of philosophy of mathematics
in the last few centuries has moved in a direction that eliminates Platonic entities from
mathematics. This leads to the rejection of the Kantian doctrine that mathematical truths
are synthetic a priori. Coffa suggests that the most fruitful anti-Kantian line was what he
calls the semantic tradition, culminating in Logical Positivism. The main insight was to
locate the source of necessity and a priori knowledge in the use of language. A priori
knowledge is truth by definition. Dummett calls this approach the linguistic turn.
Fictionalism is of the same ilk. According to fictionalism, all of mathematics is
simply false. The usefulness of mathematics in natural sciences can be explained by
means of the conservation theorem. This means that the mathematical theory preserves
the truth of the scientific theory, but facilitates deductions that could otherwise be made
only at greater length and with greater difficulty.
A more plausible view is the indispensability argument. The indispensabilists do
not believe that such a fictionalist attempt to nominalize mathematics in its entirety would
be successful. According to this view, we may accept the existence of mathematical
entities insofar as they are indispensable in explaining natural sciences. Considering that
mathematics since the nineteenth century has separated from natural sciences and developed
autonomously and independently of them, however, the drawback of the
indispensability argument is that we have to pay a high price: the sacrifice of many fruitful
results of contemporary mathematics. Maybe the Axiom of Choice is one of them. I
doubt that the Axiom of Choice is indispensable to natural sciences dealing with the finite
domain of the universe. In my view, however, the development of contemporary
mathematics is too well organized to be "mathematical recreation," as Quine calls it. We
should believe that it reflects mathematical reality of some kind.
Also, Russell refuses to posit the Axiom of Choice as an axiom. The Axiom of
Choice is formulated in an existential sentence. And, according to Russell, mathematics is
simply a development of logic. But logic is not committed to the existence of any object
whatsoever. Therefore mathematics is also not committed to existence as such. Russell
claims that mathematics is only concerned with conditional statements with regard to
existence. That is to say, mathematics claims nothing more than this: "If the Axiom of
Choice is true, then such and such a statement follows." Thus Russell interprets the Axiom
of Choice as investigated in an "if-then" spirit, and claims that conditional statements are
the laws of logic.
A lesson from the model-theoretic arguments is that there are multiple internally
consistent but mutually inconsistent theories. So Platonists can no longer claim that every
consistent theory belongs to the true mathematical theory. In fact, model-theorists argue
that some mathematically important problems depend for their truth-value upon the model in
which they are placed. In my view, however, although it is doubtful that we have a
mathematical intuition to fix the intended model, we still have principles which serve as
criteria to measure the excellence of a model. We should say that the true mathematical
theories are those that belong to the maximally consistent theory that describes
mathematical reality. As far as mathematical objects are concerned, essence and existence
coincide. By contrast, in natural sciences we cannot derive existence from essence alone.
Therefore, the coincidence of essence and existence shows the unique nature of
mathematical truths.




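(I) The Axiom of Choice is equivalent to the Well-Ordering Theorem.
Proof. (i) The Axiom of Choice $\Rightarrow$ the Well-Ordering Theorem:
Let S be a set. In order to find a well-ordering of S, we need only find an ordinal number
$\alpha$ and a one-to-one $\alpha$-sequence
\[ a_0, a_1, \ldots, a_\xi, \ldots \quad (\xi < \alpha) \]
which enumerates S. By the Axiom of Choice, there is a choice function f on the family of
nonempty subsets of S, $P(S) \setminus \{\emptyset\}$. Now we construct the sequence by transfinite recursion:
\[ a_\xi = f\bigl(S \setminus \{a_\eta : \eta < \xi\}\bigr) \quad \text{if } S \setminus \{a_\eta : \eta < \xi\} \neq \emptyset. \]
The construction stops as soon as we exhaust all members of S.
(ii) The Well-Ordering Theorem $\Rightarrow$ the Axiom of Choice:
Let F be a family of nonempty sets S. By the Well-Ordering Theorem, there is a
well-ordering of $\bigcup F$. Then we can define a choice function f on F:
\[ f(S) = \text{the least element of } S \text{ in this well-ordering.} \]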
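(II) The Principle of Dependent Choices implies the Denumerable Axiom of Choice.
Proof. Let $\{S_n : n \in \omega\}$ be a family of denumerably many nonempty sets. Assuming the Principle
of Dependent Choices, we shall find a choice function f on $\{S_n\}$. Let A be the set of all
finite sequences $z = \langle x_0, \ldots, x_k \rangle$ such that $x_i \in S_i$ for every $i \le k$. Let R be the relation on A defined by
\[ z \mathrel{R} z' \iff z = \langle x_0, \ldots, x_k \rangle \text{ and } z' = \langle x_0, \ldots, x_k, x_{k+1} \rangle \text{ for some } x_{k+1} \in S_{k+1}. \]
By the Principle of Dependent Choices, since R is a relation on A such that for every $z \in A$
there exists $z' \in A$ with $z \mathrel{R} z'$, there is a sequence $z_0, z_1, z_2, \ldots$ of members of A such that
\[ z_0 \mathrel{R} z_1, \quad z_1 \mathrel{R} z_2, \quad \ldots, \quad z_n \mathrel{R} z_{n+1}, \quad \ldots \]
Each $z_n$ has at least $n + 1$ terms, and its term with index n belongs to $S_n$.
Therefore we can define a choice function f on $\{S_n\}$: $f(S_n)$ is the term of $z_n$ with index n
(taking the least such n if a set occurs more than once in the family).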
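(III) The union of countably many countable sets is countable (The Countable Union
Theorem).
Proof. Let $\{S_n : n \in \omega\}$ be a family of countably many countable sets. So $|S_n| \le \aleph_0$
for every $n \in \omega$. We may assume that each $S_n$ is nonempty, so that the set $E_n$ of all
enumerations of $S_n$ (maps from $\omega$ onto $S_n$) is nonempty.
By the Denumerable Axiom of Choice, there exists a choice function f on $\{E_n\}$, which picks
one enumeration $e_n = f(E_n)$ of each $S_n$. Therefore $(n, k) \mapsto e_n(k)$ maps $\omega \times \omega$ onto $\bigcup_n S_n$, and
\[ \Bigl|\bigcup_{n \in \omega} S_n\Bigr| \le |\omega \times \omega| = \aleph_0 \cdot \aleph_0 = \aleph_0. \]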
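The Countable Union argument ultimately rests on the fact that $\omega \times \omega$ is countable, which can be witnessed by an explicit bijection. The sketch below uses the Cantor pairing function; the symbol $\pi$ and this particular choice of bijection are illustrative assumptions, not part of the original proof.

```latex
% Cantor pairing function: a bijection from omega x omega onto omega.
\[
  \pi(n, k) \;=\; \frac{(n + k)(n + k + 1)}{2} + k
\]
% First few values, enumerating pairs along the diagonals n + k = 0, 1, 2.
\[
  \pi(0,0) = 0, \quad \pi(1,0) = 1, \quad \pi(0,1) = 2, \quad
  \pi(2,0) = 3, \quad \pi(1,1) = 4, \quad \pi(0,2) = 5
\]
```

No choice is needed for this step. The Denumerable Axiom of Choice enters only in selecting, all at once, an enumeration of each of the countably many sets.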


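(IV) Every infinite set has a denumerable subset.
Proof. Let T be an infinite set. Since T is infinite, for every $n \in \omega$ the set $[T]^n$ of all
subsets of T with exactly n elements is nonempty.
By the Denumerable Axiom of Choice, there exists a choice function f on $\{[T]^n : n \in \omega\}$;
put $S_n = f([T]^n)$, so that $S_n \subseteq T$ and $|S_n| = n$.
Therefore there exists a set C such that $C = \bigcup_n S_n = \{x : x \in S_n \text{ for some } n\}$.
By (III), C is countable, and C is infinite because it includes a subset with n elements
for every n. This set C is a denumerable
subset of T.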



