Preferences

Mathematical foundations of microeconomic theory: Preference, utility, choice
Mark Voorneveld September 6, 2010
Contents

Preface
1 Preference
  1.1 Preference relations
  1.2 Preference over commodity bundles
2 Utility
  2.1 Utility functions
  2.2 From preference to utility: finite or countable sets
  2.3 Preference, but no utility
  2.4 In no-man's-land: A necessary and sufficient condition for utility representation
  2.5 Continuous utility
  2.6 Some special functional forms
3 Choice
  3.1 Existence of most preferred elements
  3.2 Revealed preference
  3.3 Exercises
4 Choices of a consumer: classical demand theory
  4.1 The preference/utility maximization problem
  4.2 Properties of the demand correspondence and indirect utility
  4.3 Relations between UMP and EMP
  4.4 The expenditure minimization problem
  4.5 Welfare analysis for the consumer
  4.6 Welfare and Hicksian demand
5 Choices of a producer: classical supply theory
  5.1 Production sets
  5.2 Properties of production sets
  5.3 The profit maximization problem
  5.4 Solving the PMP
  5.5 The cost minimization problem
  5.6 Efficiency
  5.7 Linking the PMP and the CMP
6 General equilibrium
  6.1 What is an equilibrium?
  6.2 Pure exchange economies
  6.3 Welfare analysis
  6.4 Private ownership economies
  6.5 Exercises
7 Expected utility theory
  7.1 Simple and compound gambles
  7.2 Preferences over gambles
  7.3 von Neumann-Morgenstern utility functions
  7.4 Exercises
8 Risk attitudes
  8.1 In for a gamble?
  8.2 Certainty equivalent and risk premium
  8.3 Arrow-Pratt measure of absolute risk aversion
  8.4 A derivation of the Arrow-Pratt measure
9 Some critique on expected utility theory
  9.1 Problems with unbounded utility: a variant of the St. Petersburg paradox
  9.2 Allais' paradox
  9.3 Probability matching
  9.4 Rabin's calibration theorem
10 Time preference
  10.1 Stationarity and exponential discounting
  10.2 Preference reversal and hyperbolic discounting
  10.3 Limit-of-means and overtaking
  10.4 Better may be worse
11 Probabilistic choice
  11.1 The Luce model
  11.2 The logit model
  11.3 The linear probability model
  11.4 Exercises
Full circle: overview
Notation
References
Suggested solutions
Preface
Overview
The purpose of these notes is to introduce you to some mathematical foundations of economic theory. These are building blocks of economics that hopefully contribute to your understanding of formal modeling in your other courses and in the research papers you will read and eventually write. The typical model of the behavior of an economic agent requires careful answers to the following questions:

(Q1) What can the agent choose from, i.e., what is the set of feasible alternatives?
(Q2) What does the agent like, i.e., what are the preferences over alternatives?
(Q3) How are the former two combined to make a choice, i.e., to select among alternatives?

Although we make some brief excursions into bounded rationality, the main building block of traditional economics is rational choice: choose from your set of feasible alternatives a most preferred one. This raises important related questions:

(Q4) When do most preferred elements exist?
(Q5) How are they affected when the agent's environment changes?

The fourth question is extremely important: you'd be surprised how many people simply skip over the existence issue and write papers about how solutions to economic problems are affected by parameter changes, without ever wondering whether there even is a solution. The fifth question concerns things like how a consumer's demand is affected by price changes, wage increases, etc. Try to keep this in mind, because this is what will occupy us most of the time and constitutes the common thread of the course: regardless of the setting, we first have to answer (Q1) to (Q3) to provide a meaningful microfounded model of an economic agent's behavior.

Sections 1 to 3 provide a general framework for modeling preferences over and choice from a feasible set of alternatives. This general framework is then applied to a number of specific cases: traditional models of consumer choice (Section 4), producer choice (Section 5), choice over outcomes that are no longer deterministic, but occur with certain probabilities (Section 7), choice over outcomes occurring over time (Section 10), and even the modeling of seemingly suboptimal choices (Section 11).
Special features
Every course reflects some of the teacher's own preferences. Although the material covered here is pretty standard for a first PhD course in microeconomic theory, what distinguishes these notes from other graduate texts is:

Focus on preferences: The notes have a relatively strong focus on preferences, rather than utility functions. Utility functions are practical in the sense that they allow you to use standard calculus tools, but this tends to blur the picture by making economics into an exercise in advanced differentiation. I try to avoid this. Although people make statements like "I like coffee more than tea", you hardly ever see them in a supermarket with a calculator and their utility function written on a piece of paper. This allows us to give a much more general answer with a remarkably simple proof to the question when most preferred elements exist; see Proposition 3.1.

From preferences to utility: Not all preferences can be represented by means of a utility function. Graduate texts typically give exactly one example, lexicographic preferences, as if it concerns an exotic phenomenon. These notes try to give some counterweight by providing several economically relevant examples, all arising from the same general principle; see Section 2.3. So the question remains, when does a utility function exist? Section 2.4 provides necessary and sufficient conditions. As an important special case, when does a continuous utility function exist? Proposition 2.6 provides a detailed proof. Remarkably, not even Fishburn (1970a), the standard reference on utility theory, contains such a proof, and neither does any of the standard textbooks in microeconomic theory. I don't actually expect you to know the proof, I just wanted to fill a gap and make sure you have access to it.

Miscellanea: Other things not commonly found in standard texts include:
- An existence result for Walrasian equilibria in terms of excess demand correspondences, rather than excess demand functions; see Proposition 6.5.
- Some excursions into the realm of bounded rationality, with brief discussions of hyperbolic discounting (Section 10.2), probabilistic choice (Section 11), and some exotic preferences (Section 3.3).

Solutions manual: Like any textbook, these notes contain exercises. They also contain solutions to all of the exercises, in the hope of facilitating self-study: if you have time to do some exercises, you can immediately check your solutions. If you're pressed for time, you can treat the worked exercises as a collection of a few dozen (cleverly disguised) examples and applications.

[Footnote to "solutions to all": Well, almost all exercises, as a couple of them will be used as this year's home assignments...]

Recommended reading
The lecture notes are the reading material for the course. You may omit the proof of Propositions 2.5 and 2.6, as well as the more mathematical exercises in Section 10.3. For the interested reader, the following table refers to related material in Mas-Colell, Whinston, and Green (1995, MWG), which is by no means obligatory.

Lecture notes                  See also MWG
1. Preference                  1.A-B, 2.A-C, 3.B
2. Utility                     3.C
3. Choice                      1.C-D, 2.D
4. Choices of a consumer       2.E, 3.D-E, G, I
5. Choices of a producer       5.A-C, F-G
6. General equilibrium         15.A-C, 16.A-D, 17.A-C, 18.A-B
7. Expected utility theory     6.A-B
8. Risk attitudes              6.C
9. Some critique               6.B
10. Time preference            20.A-B
11. Probabilistic choice       none
Terminology
In economics, there is little consensus on terminology. For instance, following Arrow (1959) and Fishburn (1970b), I refer to a complete transitive binary relation that models an economic agent's preferences as a `weak order'. Other names include `rational preference relation' (Mas-Colell et al., 1995), a very loaded term, simply `preference relation' (Rubinstein, 2006), `complete preordering' (Debreu, 1959), `complete weak order' (Fishburn, 1979), and `complete ordering' (Debreu, 1954). The Micro I course and its exam use the definitions from these lecture notes.
1. Preference

1.1. Preference relations

Rational choice essentially means choosing from a set of feasible options a most preferred alternative. Let $X$ be a set of alternatives. A preference relation $\succsim$ is a binary relation on $X$, allowing the comparison between pairs of alternatives. For each $x, y \in X$, read $x \succsim y$ as "$x$ is at least as good as/weakly preferred to/weakly better than $y$". A binary relation $\succsim$ is a weak order if it satisfies:

Completeness: for all $x, y \in X$, $x \succsim y$ or $y \succsim x$ (or both).
Transitivity: for all $x, y, z \in X$, if $x \succsim y$ and $y \succsim z$, then $x \succsim z$.

Exercise 1.1 Are the following binary relations $\succsim$ necessarily complete, transitive?
(a) $X$ consists of the items in an English dictionary, $\succsim$ is the alphabetical order in which they are listed.
(b) $X$ is a group of people and for $x, y \in X$: $x \succsim y$ if and only if $x$ knows $y$.

From preference relation $\succsim$, one can derive two other binary relations:

Strict preference: $x \succ y$ if $x \succsim y$, but not $y \succsim x$ ("$x$ is better than/strictly preferred to $y$").
Indifference: $x \sim y$ if $x \succsim y$ and $y \succsim x$ ("$x$ and $y$ are equally good/equivalent").

We sometimes write $y \precsim x$ instead of $x \succsim y$ and $y \prec x$ instead of $x \succ y$.
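On a finite set, the definitions above can be checked mechanically. The following is a minimal sketch, not part of the original notes: the relation is stored as a set of ordered pairs, and the three-element example (tea, coffee, water) as well as all function names are hypothetical.

```python
from itertools import product

def is_complete(X, R):
    """R is a set of pairs (x, y), read as 'x is at least as good as y'."""
    return all((x, y) in R or (y, x) in R for x, y in product(X, repeat=2))

def is_transitive(X, R):
    """Whenever x R y and y R z, we also need x R z."""
    return all((x, z) in R
               for x, y, z in product(X, repeat=3)
               if (x, y) in R and (y, z) in R)

def strict_part(R):
    """Strict preference: x R y, but not y R x."""
    return {(x, y) for (x, y) in R if (y, x) not in R}

def indifference_part(R):
    """Indifference: x R y and y R x."""
    return {(x, y) for (x, y) in R if (y, x) in R}

# Hypothetical example: tea and coffee are equally good, both better than water.
X = {"tea", "coffee", "water"}
R = {(x, x) for x in X} | {
    ("tea", "coffee"), ("coffee", "tea"),
    ("tea", "water"), ("coffee", "water"),
}

is_weak_order = is_complete(X, R) and is_transitive(X, R)
```

Note that completeness, applied with both arguments equal, forces every element to be weakly preferred to itself, which is why the pairs $(x, x)$ are included in the example.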
Economic theory relies heavily on preferences. You should be aware of some hidden assumptions:

- Preferences are deterministic: they are not susceptible to a change of mind or mood shocks. Statements like "I like coffee more than tea at any time, but today I prefer a cup of tea." are ruled out.
- Preferences are ordinal: the intensity of preferences as in "I'm rather fond of the 6 o'clock news, but detest soap operas." plays no role.
- Preference is a binary relation: it compares pairs of alternatives, independently of external factors. Conditional statements like "If there are twenty types of coffee to choose from, I prefer tea to any type of coffee. Otherwise, I take an espresso." are ruled out.

Also completeness and transitivity deserve scrutiny. Completeness rules out the existence of incomparable alternatives. Transitivity is violated in a number of plausible situations:

Majority rule voting: Consider three agents with strict preferences over three alternatives $a, b, c$ as follows:

$$a \succ_1 b \succ_1 c, \qquad c \succ_2 a \succ_2 b, \qquad b \succ_3 c \succ_3 a.$$

[Footnote: This involves a slight but common abuse of notation: although this was not stated explicitly, the notation above is taken to suggest, for instance, that each agent also strictly prefers the first alternative in the chain to the last.]

Define a new preference relation $\succ$ via majority rule voting: $a \succ b$, because a majority (namely the agents 1 and 2) strictly prefers $a$ over $b$. Similarly, $b \succ c$ and $c \succ a$, in violation of transitivity. This example is sometimes referred to as the Condorcet paradox.
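The cycle can be reproduced in a few lines. This sketch assumes the standard Condorcet profile in which agent 1 ranks a over b over c, agent 2 ranks c over a over b, and agent 3 ranks b over c over a; this profile is chosen to match the statement that agents 1 and 2 form the majority for a over b, and the function name is hypothetical.

```python
# Assumed profile: each list ranks the alternatives from best to worst.
profile = {
    1: ["a", "b", "c"],
    2: ["c", "a", "b"],
    3: ["b", "c", "a"],
}

def majority_prefers(x, y, profile):
    """True if strictly more than half of the agents rank x above y."""
    votes = sum(1 for ranking in profile.values()
                if ranking.index(x) < ranking.index(y))
    return votes > len(profile) / 2

alternatives = ["a", "b", "c"]

# Strict majority relation as a set of ordered pairs (winner, loser).
majority = {(x, y) for x in alternatives for y in alternatives
            if x != y and majority_prefers(x, y, profile)}
```

Under these assumptions the majority relation consists of exactly the three pairs (a, b), (b, c), (c, a): every alternative is beaten by another one, so no transitive ranking is consistent with it.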
Nonperceivable differences and similarity: The human body cannot perceive differences in stimuli unless they exceed a certain threshold. For instance, you will typically not sense the difference between a cup of tea with $n \in \mathbb{N}$ grains of sugar and $n + 1$ grains of sugar. Therefore, you will be indifferent between them. If preferences are transitive, you will be indifferent between a cup of tea with 1 grain of sugar, 2 grains of sugar, 3 grains of sugar, ..., one kilo of sugar. Are you? This example is related to the more general issue of similarity: nearby alternatives may be perceived similar and therefore equally good. But with a long chain of nearby alternatives, you can create a huge change between alternatives, so that you may no longer be indifferent between them.

Properties of relation $\succsim$ imply some properties of the indifference relation $\sim$ and the strict preference $\succ$. The proofs involve only simple manipulations of the definitions of $\sim$ and $\succ$; check that you can do this. I only prove part (d).
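Proposition 1.1 Let $\succsim$ be a weak order on $X$.
(a) The indifference relation $\sim$ is an equivalence relation, i.e., it satisfies:
    reflexivity: $\forall x \in X$: $x \sim x$.
    symmetry: $\forall x, y \in X$: if $x \sim y$, then $y \sim x$.
    transitivity: $\forall x, y, z \in X$: if $x \sim y$ and $y \sim z$, then $x \sim z$.
(b) The strict preference relation $\succ$ satisfies:
    irreflexivity: $\forall x \in X$: not $x \succ x$.
    asymmetry: $\forall x, y \in X$: if $x \succ y$, then not $y \succ x$.
    transitivity: $\forall x, y, z \in X$: if $x \succ y$ and $y \succ z$, then $x \succ z$.
(c) $\forall x, y, z \in X$: if $x \sim y$ and $y \succsim z$, then $x \succsim z$.
(d) $\forall x, y, z \in X$: if $x \succ y$ and $y \succsim z$, then $x \succ z$.

Proof. (d): Let $x, y, z \in X$ with $x \succ y$ and $y \succsim z$. By definition of $\succ$, $x \succ y$ implies $x \succsim y$. With $y \succsim z$ and transitivity of $\succsim$, this implies $x \succsim z$. It is not true that $z \succsim x$: if it were, it would imply with $y \succsim z$ and transitivity of $\succsim$ that $y \succsim x$, contradicting that $x \succ y$. Since $x \succsim z$, but not $z \succsim x$: $x \succ z$.

Exercise 1.2 Complete the proof of the proposition.

1.2. Preference over commodity bundles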
In the standard microeconomic model of consumer choice, the set of alternatives $X$ is usually taken to be $\mathbb{R}^L$ or $\mathbb{R}^L_+$ for some $L \in \mathbb{N}$. The interpretation is that there are $L$ commodities, in the latter case to be consumed in nonnegative amounts. An element $x = (x_1, \ldots, x_L) \in X$ is called a (commodity) bundle; its $k$-th coordinate $x_k$ indicates the quantity of commodity $k$.

The additional structure obtained this way allows us to introduce a number of new properties; throughout this subsection, assume therefore that $X$ equals $\mathbb{R}^L_+$ or $\mathbb{R}^L$. These properties are typically illustrated using indifference curves. The indifference curve containing $x \in X$ is the set $\{y \in X : x \sim y\}$ of points equivalent with $x$.

Recall that the (Euclidean) distance between vectors $x, y \in \mathbb{R}^L$ is defined as

$$\|x - y\| = \sqrt{\sum_{\ell=1}^{L} (x_\ell - y_\ell)^2}.$$

The preference relation $\succsim$ over $X$ satisfies local nonsatiation if, for every alternative $x$, there is an alternative arbitrarily close to $x$ that is better: for each $x \in X$ and each $\varepsilon > 0$ there is a $y \in X$ with $\|x - y\| < \varepsilon$ and $y \succ x$.

Monotonicity properties come in different varieties, all reflecting the intuition that "more is better". Let $k \in \{1, \ldots, L\}$ and let $e_k \in \mathbb{R}^L$ denote the $k$-th standard basis vector with $k$-th coordinate equal to one and all other coordinates equal to zero. For $x, y \in \mathbb{R}^L$, write

    $x \ge y$ if $x_i \ge y_i$ for all coordinates $i = 1, \ldots, L$,
    $x > y$ if $x_i > y_i$ for all coordinates $i = 1, \ldots, L$.

The preference relation $\succsim$ is:

- strongly monotonic in coordinate $k$ if increasing this coordinate gives better alternatives: for each $x \in X$ and each $\varepsilon > 0$: $x + \varepsilon e_k \succ x$.
- strongly monotonic if an increase in at least one coordinate gives better alternatives: for all $x, y \in X$, if $x \ge y$ and $x \ne y$, then $x \succ y$.
- monotonic if, for all $x, y \in X$: $x \ge y$ implies $x \succsim y$, and $x > y$ implies $x \succ y$.

For instance, a strongly monotonic preference relation is strongly monotonic in each coordinate. The converse holds if $\succsim$ is transitive.
Exercise 1.3
(a) Prove the previous two sentences.
(b) Give an example of a preference relation over $\mathbb{R}^2_+$ that is strongly monotonic in coordinate $k$, for both $k = 1$ and $k = 2$, but not strongly monotonic.
(c) Let $X = \mathbb{R}^L_+$ and assume that according to the preference relation $\succsim$, less is better (think of the coordinates as measures of pollution, unhealthy commodities, etc.) in the sense that $x \le y$ and $x \ne y$ imply that $x \succ y$. Is this preference relation locally nonsatiated?
(d) Answer the same question as in (c), but with $X = \mathbb{R}^L$.
Each of the three monotonicity properties implies local nonsatiation. On the other hand, local nonsatiation has no implications for monotonicity: the preference relation on $\mathbb{R}^2_+$ with

$$(x_1, x_2) \succsim (y_1, y_2) \iff (x_1 - x_2)^2 + x_1 \ge (y_1 - y_2)^2 + y_1$$

is locally nonsatiated, but satisfies none of the monotonicity properties. Figure 1 contains three indifference curves of this preference relation with the better ones further away from the origin and shows that small increases in one or both of the coordinates may lead into areas with strictly worse alternatives.

[Figure 1: Local nonsatiation has no implications for monotonicity. Horizontal axis: $x_1$; vertical axis: $x_2$.]

Figure 2 summarizes the relations between the three monotonicity relations and local nonsatiation. An arrow from strong monotonicity to monotonicity means that the former implies the latter; the absence of an arrow in the opposite direction means that the converse is not true.

[Figure 2: Relation between monotonicity properties and local nonsatiation on $\mathbb{R}^L$ or $\mathbb{R}^L_+$. Nodes: strongly monotonic in coordinate $k$, strongly monotonic, monotonic, locally nonsatiated.]
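The example of a locally nonsatiated but non-monotonic preference relation can be explored numerically. The sketch below assumes the relation is represented by $u(x_1, x_2) = (x_1 - x_2)^2 + x_1$ (our reading of the definition above); the search routine and its parameters are hypothetical, and a successful search at a few points is an illustration, not a proof of local nonsatiation.

```python
import math

def u(x1, x2):
    # Assumed representation of the example preference relation.
    return (x1 - x2) ** 2 + x1

def better_point_nearby(x, eps, steps=360):
    """Search the circle of radius eps/2 around x, restricted to the
    nonnegative quadrant, for a bundle with strictly higher u."""
    x1, x2 = x
    for i in range(steps):
        angle = 2 * math.pi * i / steps
        y1 = x1 + 0.5 * eps * math.cos(angle)
        y2 = x2 + 0.5 * eps * math.sin(angle)
        if y1 >= 0 and y2 >= 0 and u(y1, y2) > u(x1, x2):
            return (y1, y2)
    return None

# More of good 1 makes things worse at (0, 2): monotonicity fails there.
worse_after_increase = u(0.1, 2.0) < u(0.0, 2.0)

# Yet a strictly better bundle exists within distance 0.01 of (0, 2).
nearby = better_point_nearby((0.0, 2.0), eps=0.01)
```

The design choice here is to sample directions on a small circle rather than to use calculus: it mirrors the definition of local nonsatiation, which only asks for some better bundle within distance $\varepsilon$.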
A preference relation $\succsim$ is continuous if for every $y \in X$, the set $\{x \in X : x \succsim y\}$ of alternatives weakly better than $y$ and the set $\{x \in X : x \precsim y\}$ of alternatives weakly worse than $y$ are closed. The literature contains some alternative definitions as well:

Proposition 1.2 Let $\succsim$ be a weak order on $X$. The following properties are equivalent:
(a) $\succsim$ is continuous, i.e., for every $y \in X$, the sets $\{x \in X : x \succsim y\}$ and $\{x \in X : x \precsim y\}$ are closed.
(b) For every $y \in X$, the sets $\{x \in X : x \succ y\}$ and $\{x \in X : x \prec y\}$ are open.
(c) The graph $\{(x, y) \in X \times X : x \succsim y\}$ of $\succsim$ is closed.
(d) For all sequences $(x_n)_{n \in \mathbb{N}}$ and $(y_n)_{n \in \mathbb{N}}$ in $X$, if $x_n \to x$, $y_n \to y$, and $x_n \succsim y_n$ for all $n \in \mathbb{N}$, then also $x \succsim y$.
(e) For all $x, y \in X$, if $x \succ y$, then there is a neighborhood $U_x$ of $x$ (i.e., an open set $U_x$ containing $x$) and a neighborhood $U_y$ of $y$ such that $x' \succ y'$ for all $x' \in U_x$, $y' \in U_y$.
Proof. Statements (a) and (b) are equivalent: by completeness, the sets in (b) are the complements of the sets in (a), and the complement of an open set is closed, and vice versa. Also the equivalence of (c) and (d) is a matter of definition: $(x_n, y_n) \in X \times X$ is an element of the graph of $\succsim$ if and only if $x_n \succsim y_n$. Proving three implications suffices to close the circle and make sure that all five statements are equivalent:

[(b) implies (e):] Assume (b) holds. Let $x, y \in X$ with $x \succ y$. Distinguish two cases:

Case 1: There is an $m \in X$ with $x \succ m \succ y$. Define $U_x = \{z \in X : z \succ m\}$ and $U_y = \{z \in X : m \succ z\}$. These sets are open by (b). Moreover, $x \in U_x$ and $y \in U_y$ by assumption. Let $x' \in U_x$, $y' \in U_y$. Then $x' \succ m$ and $m \succ y'$. By Proposition 1.1, $\succ$ is transitive, so $x' \succ y'$, as we had to show.

Case 2: There is no $m \in X$ with $x \succ m \succ y$. Define $U_x = \{z \in X : z \succ y\}$ and $U_y = \{z \in X : x \succ z\}$. These sets are open by (b). Moreover, $x \in U_x$ and $y \in U_y$ by assumption. Let $x' \in U_x$, $y' \in U_y$. Then $x' \succ y$ and $x \succ y'$. It cannot be that $x \succ x'$, otherwise we would have $x \succ x' \succ y$, contradicting the assumption of Case 2. By completeness, $x' \succsim x$. Similarly, $y \succsim y'$. So $x' \succsim x \succ y \succsim y'$. By Proposition 1.1, $x' \succ y'$, as we had to show.

Conclude from cases 1 and 2 that (e) holds.

[(e) implies (c):] Assume (e) holds. To establish (c), we need to show that the complement of the graph $\{(x, y) \in X \times X : x \succsim y\}$ is open. By completeness of $\succsim$, this complement is the set

$$S = \{(x, y) \in X \times X : x \prec y\}.$$

For each $(x, y) \in S$, fix, using (e), neighborhoods $U_x$ of $x$ and $U_y$ of $y$ such that $x' \prec y'$ for all $x' \in U_x$, $y' \in U_y$. Conclude that

$$\forall (x, y) \in S: \quad (x, y) \in U_x \times U_y \subseteq S.$$

Taking the union over all $(x, y) \in S$, one obtains $S = \bigcup_{(x, y) \in S} U_x \times U_y$. As the union of open sets, $S$ is open, as we had to show.

[(c) implies (a):] Assume (c) holds. Let $y \in X$. We show that the set $\{x \in X : x \succsim y\}$ is closed; establishing that also the set $\{x \in X : x \precsim y\}$ is closed is analogous. Let $(x_n)_{n \in \mathbb{N}}$ be a sequence in $\{x \in X : x \succsim y\}$ with limit $x$. We need to show that $x$ also lies in this set. By definition, $(x_n, y)_{n \in \mathbb{N}}$ is a sequence in the graph of $\succsim$, which is closed by assumption. Therefore, it contains the limit $(x, y)$, i.e., $x \succsim y$, as we had to show.

[Footnote to the proof that (c) implies (a): The following proof is more general, but requires some knowledge of topology. In case of emergency, don't worry. Simply forget that this footnote even exists! [(c) implies (b)]: Assume (c) holds, i.e., the set $S$ defined above is open. Let $y \in X$. We show that $L(y) = \{x \in X : x \prec y\}$ is open; establishing that also the set $\{x \in X : x \succ y\}$ is open is analogous. Let $x \in L(y)$. Then $(x, y) \in S$. Since $S$ is open in the product topology generated by Cartesian products of open sets in $X$, we can fix neighborhoods $U_x$ of $x$ and $U_y$ of $y$ such that $U_x \times U_y \subseteq S$. In particular, for each $x' \in U_x$, it follows that $(x', y) \in U_x \times U_y$, so $x' \prec y$. Conclude that $\forall x \in L(y)$: $x \in U_x \subseteq L(y)$. Taking the union over all $x \in L(y)$, one obtains $L(y) = \bigcup_{x \in L(y)} U_x$. As the union of open sets, $L(y)$ is open, as we had to show.]

Very roughly speaking, continuity of preferences requires that the strict preference relation is unaffected by small changes in the alternatives: if $x$ is better than $y$, the same holds for nearby alternatives.
A subtlety about open sets: Continuity properties are typically defined in terms of open subsets of the feasible set $X$. We often consider commodity spaces like $X = \mathbb{R}^L_+$. Open sets are defined using the usual distance between vectors $x$ and $y$:

$$\|x - y\| = \sqrt{\sum_{\ell=1}^{L} (x_\ell - y_\ell)^2}.$$

A subset $Y \subseteq X$ is open if each $y \in Y$ is an interior point of $Y$, i.e., if for each $y \in Y$, all points $x \in X$ sufficiently close to $y$ lie in $Y$ as well:

    there is an $\varepsilon > 0$ such that for all $x \in X$ with $\|x - y\| < \varepsilon$: $x \in Y$.    (1)

Many people overlook a slight subtlety, namely the statement ". . . for all $x \in X$ . . ." in (1). This looks innocuous: if you want to define whether a subset of $X$ is open, then obviously you're not interested in stuff that is outside of $X$. But it does matter in identifying open subsets! Notice, for instance, that as subsets of $X = \mathbb{R}^2_+$, but not as subsets of $X = \mathbb{R}^2$, sets like

$$Y^1 = \mathbb{R}^2_+, \qquad Y^2 = \{y \in \mathbb{R}^2_+ : y_1 < 1\}, \qquad Y^3 = \{y \in \mathbb{R}^2_+ : y_1 + 2y_2 < 4\}$$

are open. You might want to draw their pictures. In topological language, $X = \mathbb{R}^L_+$ is endowed with the relative topology that it inherits from the larger set $\mathbb{R}^L$: a set $Y \subseteq X$ is open if and only if $Y = X \cap O$, where $O$ is an open set in the larger space $\mathbb{R}^L$. This provides quick proofs that the sets $Y^1, Y^2, Y^3$ are open subsets of $X = \mathbb{R}^2_+$:

$$Y^1 = X \cap \mathbb{R}^2, \qquad Y^2 = X \cap \{y \in \mathbb{R}^2 : y_1 < 1\}, \qquad Y^3 = X \cap \{y \in \mathbb{R}^2 : y_1 + 2y_2 < 4\},$$

and the sets $\mathbb{R}^2$, $\{y \in \mathbb{R}^2 : y_1 < 1\}$, $\{y \in \mathbb{R}^2 : y_1 + 2y_2 < 4\}$ are open in $\mathbb{R}^2$.
The next two properties are related to other changes, namely shifts in or rescaling of the coordinates. The preference relation $\succsim$ is:

- quasilinear in coordinate $k$ if the preference relation is insensitive to parallel shifts in the sense that adding the same positive amount of commodity $k$ to both alternatives does not affect the preference over them: for all $x, y \in X$ and all $\varepsilon > 0$, $x \succsim y$ implies that $x + \varepsilon e_k \succsim y + \varepsilon e_k$.
- homothetic if rescaling the coordinates does not affect the preferences: for all $x, y \in X$ and all $\alpha > 0$, if $x \succsim y$, then $\alpha x \succsim \alpha y$.

[Footnote on open subsets of the feasible set $X$: Of course, this requires knowing which subsets of $X$ are open. In general (as you will recall from the math course) this requires $X$ to be a topological space, i.e., it comes equipped with a definition of open sets, subject to three restrictions: (1) the empty set and $X$ are open, (2) unions of open sets are open, (3) intersections of finitely many open sets are open.]

For instance, any preference relation where only the difference between the first coordinates matters, like

$$(x_1, x_2) \succsim (y_1, y_2) \iff 3x_1 + \exp x_2 \ge 3y_1 + \exp y_2,$$

is quasilinear in the first coordinate. Often, such a coordinate is referred to as "numeraire" or "money" and the economic idea is that not the exact amounts of money associated with two alternatives matter, but the difference between them.

A simple example of homothetic preferences arises in most linear production processes: let alternatives $x$ and $y$ denote vectors of ingredients and let $x$ be weakly preferable to $y$ if the ingredients of $x$ suffice to make at least as much of your favorite cake as $y$. Then also $\alpha x$ yields at least as much cake as $\alpha y$. More generally, any preference relation defined in terms of a homogeneous function is homothetic. Recall that a function $f : \mathbb{R}^L_+ \to \mathbb{R}$ is homogeneous of degree $k \in \mathbb{R}$ if for each $x \in \mathbb{R}^L_+$ and each $\alpha > 0$: $f(\alpha x) = \alpha^k f(x)$. Suppose that $x \succsim y$ if and only if $f(x) \ge f(y)$. Then $\succsim$ is homothetic:

$$x \succsim y \;\Rightarrow\; f(x) \ge f(y) \;\Rightarrow\; f(\alpha x) = \alpha^k f(x) \ge \alpha^k f(y) = f(\alpha y) \;\Rightarrow\; \alpha x \succsim \alpha y.$$

Therefore, functions defined by $f(x_1, x_2) = \min\{x_1, x_2\}$ and $f(x_1, x_2) = x_1 x_2^3$ generate homothetic preferences on $\mathbb{R}^2_+$.
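The link between homogeneous functions and homothetic preferences can be spot-checked numerically. This is a small sketch under stated assumptions: the two functions are taken to be $\min\{x_1, x_2\}$ and $x_1 x_2^3$ as read from the text, the helper names and test bundles are hypothetical, and scaling factors are powers of two so that floating-point arithmetic is exact.

```python
def f_min(x1, x2):
    return min(x1, x2)      # homogeneous of degree 1

def f_prod(x1, x2):
    return x1 * x2 ** 3     # homogeneous of degree 4

def weakly_prefers(f, x, y):
    """x is weakly preferred to y when f(x) >= f(y)."""
    return f(*x) >= f(*y)

def ranking_preserved(f, x, y, alphas=(0.5, 2.0, 4.0)):
    """Check that rescaling both bundles by alpha leaves the comparison unchanged."""
    base = weakly_prefers(f, x, y)
    return all(
        weakly_prefers(f, tuple(a * c for c in x), tuple(a * c for c in y)) == base
        for a in alphas
    )

x, y = (2.0, 1.0), (1.0, 3.0)
min_ok = ranking_preserved(f_min, x, y) and ranking_preserved(f_min, y, x)
prod_ok = ranking_preserved(f_prod, x, y) and ranking_preserved(f_prod, y, x)
```

A check on a handful of bundles cannot replace the one-line argument via $f(\alpha x) = \alpha^k f(x)$; it only illustrates what that argument guarantees.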
Exercise 1.4 Give an example of a weak order $\succsim$ on $\mathbb{R}^2_+$ that satisfies:
(a) strong monotonicity in coordinate 1, but not quasilinearity in coordinate 1.
(b) quasilinearity in coordinate 1, but not strong monotonicity in coordinate 1.
(c) homotheticity, but none of the three monotonicity properties.
(d) all three monotonicity properties, but not homotheticity.

Exercise 1.5 Consider a weak order $\succsim$ on $X = \mathbb{R}^L_+$ with $x \succ y$ if $x > y$.
(a) Prove: if $\succsim$ is continuous, then $\succsim$ is monotonic. That is, also $x \succsim y$ if $x \ge y$.
(b) Not a drop too much: Your favorite drink requires mixing its two ingredients in the same amount: if $x_1, x_2 \ge 0$ indicate the two amounts, you can mix $\min\{x_1, x_2\}$ of your drink, whereas the amount of $\max\{x_1, x_2\} - \min\{x_1, x_2\}$ goes to waste. If you are primarily concerned about the drink, but also feel it is unfortunate to waste ingredients, the following weak order on $\mathbb{R}^2_+$ may reflect your preferences: for all $x, y \in \mathbb{R}^2_+$, $x \succsim y$ if and only if
    - $x$ yields more of the drink than $y$: $\min\{x_1, x_2\} > \min\{y_1, y_2\}$, or
    - $x$ gives the same amount of the drink as $y$, but not more waste: $\min\{x_1, x_2\} = \min\{y_1, y_2\}$ and $\max\{x_1, x_2\} - \min\{x_1, x_2\} \le \max\{y_1, y_2\} - \min\{y_1, y_2\}$.
Show that $x \succ y$ whenever $x > y$, but not necessarily $x \succsim y$ if $x \ge y$.
A preference relation $\succsim$ is convex if for each $y \in X$, the set $\{x \in X : x \succsim y\}$ of weakly better alternatives is convex.

Proposition 1.3 Let $\succsim$ be a weak order on $X$. Then $\succsim$ is convex if and only if for all $x, y \in X$ with $x \succsim y$ and all $\alpha \in [0, 1]$, also $\alpha x + (1 - \alpha) y \succsim y$. Informally, if $x$ is at least as good as $y$, just walking part of the way from $y$ to $x$ is a weak improvement.

Exercise 1.6
(a) Prove this proposition.
(b) Give an example to show that the proposition is false if $\succsim$ is not a weak order.

A somewhat stronger version: a preference relation $\succsim$ is strictly convex if for all $x, y \in X$ with $x \succsim y$ and $x \ne y$ and all $\alpha \in (0, 1)$, it holds that $\alpha x + (1 - \alpha) y \succ y$. This property implies that if you are indifferent between two distinct alternatives $x, y \in X$, you can still improve upon them: by strict convexity, the alternative $\frac{1}{2} x + \frac{1}{2} y$ is strictly better.
2. Utility

2.1. Utility functions

In many cases, preferences over alternatives can be evaluated by some numerical assessment: "I prefer the alternative with the higher percentage of alcohol" or "I prefer the alternative yielding the higher profit." In that case, we say that these functions (in the latter case the function assigning to each alternative its associated profit) represent the decision maker's preferences. Formally, a function $u : X \to \mathbb{R}$ is a utility function representing $\succsim$ if for all $x, y \in X$:

$$x \succsim y \iff u(x) \ge u(y). \qquad (2)$$

One often uses the following simple result to verify that $u$ represents a complete preference relation $\succsim$.

Proposition 2.1 Let $\succsim$ be a complete preference relation on a set $X$ and let $u : X \to \mathbb{R}$ be a function. The following two claims are equivalent:
(a) $u$ represents $\succsim$.
(b) For all $x, y \in X$:
    if $x \succ y$, then $u(x) > u(y)$,
    if $x \sim y$, then $u(x) = u(y)$.

Proof. (a) $\Rightarrow$ (b): Assume (a) holds. Let $x, y \in X$. If $x \succ y$, by definition of $\succ$: $x \succsim y$ and not $y \succsim x$. Hence, by definition of a utility function, $u(x) \ge u(y)$ and not $u(y) \ge u(x)$. Conclude that $u(x) > u(y)$. Similarly, if $x \sim y$, $u(x) = u(y)$.

(b) $\Rightarrow$ (a): Assume (b) holds. Let $x, y \in X$. To show: $x \succsim y \iff u(x) \ge u(y)$. One direction is easy: if $x \succsim y$, then $x \succ y$ or $x \sim y$, so by (b), either $u(x) > u(y)$ or $u(x) = u(y)$. Hence $u(x) \ge u(y)$. Conversely, assume that $u(x) \ge u(y)$. By completeness, $x \succsim y$ or $y \succsim x$. Suppose $x \succsim y$ is not true. Then $y \succ x$, so by (b), $u(y) > u(x)$, a contradiction.
Exercise 2.1 The completeness condition in Proposition 2.1 cannot be omitted. Indeed, consider the preference relation $\succsim$ on $\mathbb{R}$ with

$$\forall x, y \in \mathbb{R}: \quad x \succsim y \iff x \ge y + 1$$

and the function $u : \mathbb{R} \to \mathbb{R}$ with $u(x) = x$ for all $x \in \mathbb{R}$. Show that:
(a) $\succsim$ is transitive, but not complete.
(b) $u$ satisfies Proposition 2.1(b), but not Proposition 2.1(a).
If one function represents a preference relation, then many others do as well: if preferences are represented by a profit function, then also twice the profit or profit to the power three represent the same preference relation. In general:

Proposition 2.2 If $u : X \to \mathbb{R}$ represents $\succsim$ and $f : \mathbb{R} \to \mathbb{R}$ is strictly increasing, then also the function $v : X \to \mathbb{R}$ defined by $v(x) = f(u(x))$ represents $\succsim$.

Proof. By (2) and the definition of strictly increasing, we find for all $x, y \in X$:

$$x \succsim y \iff u(x) \ge u(y) \iff v(x) = f(u(x)) \ge f(u(y)) = v(y),$$

so $v$ represents $\succsim$.

Since the ordering $\ge$ of the real numbers is complete and transitive, a preference relation that can be represented by a utility function is necessarily complete and transitive: it must be a weak order. But is being a weak order enough to guarantee the existence of a utility function? The answer is positive for finite or countable sets.
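Proposition 2.2 is easy to illustrate on a small example. The sketch below uses hypothetical profit levels for four alternatives and compares the rankings induced by twice the profit, the profit cubed, and (as a contrast) the profit squared, which is not strictly increasing on all of $\mathbb{R}$; all names and numbers are assumptions for illustration.

```python
# Hypothetical profit levels u(x) for four alternatives.
profits = {"x": 1.0, "y": 2.5, "z": 2.5, "w": -1.0}

def represents_same_preferences(u, v):
    """True if u and v rank every ordered pair of alternatives identically."""
    return all((u[a] >= u[b]) == (v[a] >= v[b]) for a in u for b in u)

twice = {a: 2 * p for a, p in profits.items()}      # f(t) = 2t, strictly increasing
cubed = {a: p ** 3 for a, p in profits.items()}     # f(t) = t^3, strictly increasing
squared = {a: p ** 2 for a, p in profits.items()}   # f(t) = t^2, not increasing on R

same_twice = represents_same_preferences(profits, twice)
same_cubed = represents_same_preferences(profits, cubed)
same_squared = represents_same_preferences(profits, squared)
```

With these numbers the squared profit treats the alternatives with profit 1 and -1 as equally good, so it no longer represents the original preferences.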
2.2. From preference to utility: finite or countable sets

Representing a weak order on a finite set by means of a utility function is easy: the more preferred an alternative $x \in X$ is, the larger is the set of elements weakly worse than $x$. Therefore, counting how many elements are weakly worse than $x$ measures its utility.

Proposition 2.3 Assume: $X$ is finite; $\succsim$ is a weak order on $X$. Then there is a utility function representing $\succsim$.

Proof. For each $x \in X$, define $u(x) = |\{z \in X : x \succsim z\}|$. Then $u : X \to \mathbb{R}$ represents $\succsim$: let $x, y \in X$. If $x \sim y$, then for each $z \in X$ with $y \succsim z$, Proposition 1.1(c) gives that $x \succsim z$. So $\{z \in X : y \succsim z\} \subseteq \{z \in X : x \succsim z\}$. Similarly, the converse inclusion holds, so

$$\{z \in X : x \succsim z\} = \{z \in X : y \succsim z\}. \qquad (3)$$

Hence $u(x) = u(y)$. If $x \succ y$, Proposition 1.1(d) and the fact that $x$ lies in the former set, but not in the latter, imply:

$$\{z \in X : x \succsim z\} \supsetneq \{z \in X : y \succsim z\}. \qquad (4)$$

Hence $u(x) > u(y)$. By Proposition 2.1, $u$ represents $\succsim$.
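The counting construction from the proof translates directly into code. A minimal sketch, assuming a hypothetical weak order on five words in which longer words are weakly preferred; the function name is ours.

```python
def counting_utility(X, weakly_prefers):
    """u(x) = number of elements z of X with x weakly preferred to z."""
    return {x: sum(1 for z in X if weakly_prefers(x, z)) for x in X}

# Hypothetical weak order: longer words are weakly preferred.
X = ["tea", "milk", "water", "juice", "espresso"]

def weakly_prefers(x, y):
    return len(x) >= len(y)

u = counting_utility(X, weakly_prefers)

# u represents the weak order if both rankings agree on every pair.
is_representation = all(
    (u[x] >= u[y]) == weakly_prefers(x, y) for x in X for y in X
)
```

The two words of equal length receive the same utility, in line with (3), while a strictly preferred word counts at least one more weakly worse element, in line with (4).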
If $X$ is countable, simply counting the number of weakly worse alternatives does not work: there may be infinitely many of them. But we can give each element a positive weight, make sure that the weights have a well-defined sum even if we add infinitely many of them, and use the total weight of the elements weakly worse than $x$ as a measure of the utility of $x$. For instance, label $X = \{x_1, x_2, \ldots\}$ and divide a bar of chocolate by giving half (weight $2^{-1}$) to $x_1$, then half of the remainder (weight $2^{-2}$) to $x_2$, then half of the remainder (weight $2^{-3}$) to $x_3$, and so on.
Proposition 2.4 Assume:
$X$ is countable;
$\succsim$ is a weak order on $X$.
Then there is a utility function representing $\succsim$.

Proof. Since $X$ is countable, there is an injective function $n : X \to \mathbb{N}$. For each $x \in X$, define
$$u(x) = \sum_{z \in X :\, x \succsim z} 2^{-n(z)}.$$
The sequence $(2^{-n})_{n \in \mathbb{N}}$ has a finite sum $\sum_{n \in \mathbb{N}} 2^{-n} = 1$, so $u$ is well-defined. To see that $u$ represents $\succsim$, let $x, y \in X$. If $x \sim y$, (3) holds, so $u(x) = u(y)$. If $x \succ y$, (4) holds, so $u(x) - u(y) \geq 2^{-n(x)} > 0$.
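The weighted sum from this proof can be tried out on a finite prefix of a countable set. The sketch below is an assumption-laden illustration: the six alternatives, the enumeration $n(z) = z$, and the preference "larger remainder modulo 3 is better" are made up, and truncating to finitely many elements only illustrates the weights, not the infinite case.

```python
from fractions import Fraction


def weighted_utility(alternatives, index, weakly_prefers):
    """u(x) = sum of 2**(-index(z)) over all z that x is weakly preferred to."""
    return {
        x: sum(
            (Fraction(1, 2 ** index(z)) for z in alternatives if weakly_prefers(x, z)),
            Fraction(0),
        )
        for x in alternatives
    }


# Hypothetical data: alternatives 1..6, enumerated by n(z) = z.
alternatives = range(1, 7)


def weakly_prefers(x, z):
    return x % 3 >= z % 3


u = weighted_utility(alternatives, lambda z: z, weakly_prefers)

# Exact rational arithmetic lets us check the representation without rounding.
represents = all(
    weakly_prefers(x, z) == (u[x] >= u[z])
    for x in alternatives
    for z in alternatives
)
```

Using `Fraction` keeps the dyadic weights exact, so the check of the representation property is not disturbed by floating-point error.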
2.3. Preference, but no utility

Not all preference relations, not even weak orders, can be represented by means of a utility function. Graduate textbooks usually give exactly one example (lexicographic preferences), as if it concerns an exotic phenomenon. This section gives some counterweight by providing several economically relevant examples, all arising from the following general principle. Fix a set of alternatives $X$. Suppose you can associate with each number $z$ in some uncountable set $I \subseteq \mathbb{R}$, one bad alternative $b(z) \in X$ and one good alternative $g(z) \in X$ with the following two properties. Firstly, for each $z \in I$, the good alternative is strictly preferred to the bad one:
$$g(z) \succ b(z). \tag{5}$$
Secondly, if $z < z'$, then the good alternative associated with $z$ is worse than the bad alternative associated with $z'$:
$$\forall z, z' \in I: \quad z < z' \implies b(z') \succ g(z). \tag{6}$$
Combining (5) and (6), representing such preferences by a utility function requires, for $z < z'$:
$$u(b(z)) < u(g(z)) < u(b(z')) < u(g(z')).$$
So for each $z \in I$, the interval $[u(b(z)), u(g(z))]$ has positive length and if $z, z' \in I$ have $z \neq z'$, the intervals $[u(b(z)), u(g(z))]$ and $[u(b(z')), u(g(z'))]$ are disjoint: one of them lies entirely to the left of the other on the real axis. So uncountably many intervals $[u(b(z)), u(g(z))]$ of positive length must somehow be placed on the real line without any two of them intersecting. This is impossible: we simply run out of space! Formally, each interval $[u(b(z)), u(g(z))]$ contains a rational number $r(z) \in \mathbb{Q}$. Since the intervals associated with different values of $z$ are disjoint, $z \neq z'$ implies $r(z) \neq r(z')$, i.e., the function $r : I \to \mathbb{Q}$ is injective. But $I$ is uncountable and $\mathbb{Q}$ is countable, a contradiction. Some examples:
Lexicographic preferences. (Debreu, 1954) Let $X = \mathbb{R}^2$. Define $\succsim$ as follows:
$$(x_1, x_2) \succsim (y_1, y_2) \iff x_1 > y_1 \ \text{ or } \ (x_1 = y_1 \text{ and } x_2 \geq y_2).$$
Alternatives are compared according to their first coordinates; if these happen to be equal, they are compared according to their second coordinates. Think of the way words are ordered in a dictionary. For each $z \in \mathbb{R}$, let $b(z) = (z, 0)$ and $g(z) = (z, 1)$. Then $g(z) \succ b(z)$ and, if $z, z' \in \mathbb{R}$, $z < z'$, then $g(z) = (z, 1) \prec (z', 0) = b(z')$. So (5) and (6) hold: this preference relation cannot be represented by a utility function.

Preferences over information. (Dubra and Echenique, 2001) It is common in economics to model information by means of partitions of a state space. Let $z \in \mathbb{R}$ be a certain threshold. Suppose you get the following information about a number $x \in \mathbb{R}$: you are told the exact value of $x$ if $x < z$, otherwise you are told that $x$ lies in the interval $[z, \infty)$. That means you can perfectly distinguish between all real numbers $x$ with $x < z$, but cannot distinguish between the numbers in the interval $[z, \infty)$. Therefore, information is summarized by the partition
$$b(z) = \{\{x\} : x < z\} \cup \{[z, \infty)\}$$
of $\mathbb{R}$. Similarly, define the information partition
$$g(z) = \{\{x\} : x \leq z\} \cup \{(z, \infty)\}$$
that arises if you are told the exact value of $x$ also in the case where $x = z$: all numbers $x \leq z$ can be perfectly distinguished, but larger ones not. Assume it is preferable to have more precise information, i.e., finer information partitions (partition $P$ is finer than partition $Q$ if every set from $P$ is contained in a set from $Q$). Partition $g(z)$ is finer than $b(z)$, so $g(z) \succ b(z)$. Also, if $z < z'$, partition $b(z')$ is finer than $g(z)$, so $b(z') \succ g(z)$. So (5) and (6) hold: this preference relation cannot be represented by a utility function.
Preferences over utility flows. At every moment in time $t \in [0, \infty)$, an agent receives payoff zero or one: an alternative $x$ is simply a function $x : [0, \infty) \to \{0, 1\}$. Suppose preferences satisfy the following monotonicity condition: if $x(t) \geq y(t)$ at all times $t$, with strict inequality for at least one time period, then $x \succ y$. Define, for each $z \in [0, \infty)$, the alternative $b(z)$ giving payoff one before time $z$ and payoff zero afterwards:
$$b(z)(t) = \begin{cases} 1 & \text{if } t < z, \\ 0 & \text{otherwise.} \end{cases}$$
Similarly, alternative $g(z)$ gives payoff one at/before time $z$ and payoff zero afterwards:
$$g(z)(t) = \begin{cases} 1 & \text{if } t \leq z, \\ 0 & \text{otherwise.} \end{cases}$$
By the monotonicity requirement, $g(z) \succ b(z)$ and, if $z < z'$, $b(z') \succ g(z)$. So (5) and (6) hold: this preference relation cannot be represented by a utility function.
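Conditions (5) and (6) are easy to spot-check for the lexicographic example of this section. The following sketch does so on a handful of sample thresholds; the function names and the sample values are illustrative assumptions, and a finite check of course proves nothing about the uncountable family.

```python
def lex_weakly_prefers(x, y):
    """Lexicographic order on pairs: first coordinate decides, second breaks ties."""
    return x[0] > y[0] or (x[0] == y[0] and x[1] >= y[1])


def lex_strictly_prefers(x, y):
    return lex_weakly_prefers(x, y) and not lex_weakly_prefers(y, x)


def b(z):
    """Bad alternative associated with threshold z."""
    return (z, 0)


def g(z):
    """Good alternative associated with threshold z."""
    return (z, 1)


sample = [-1.5, 0.0, 0.25, 3.0]  # hypothetical thresholds

# (5): g(z) is strictly better than b(z).
cond5 = all(lex_strictly_prefers(g(z), b(z)) for z in sample)

# (6): if z1 < z2, then b(z2) is strictly better than g(z1).
cond6 = all(
    lex_strictly_prefers(b(z2), g(z1))
    for z1 in sample
    for z2 in sample
    if z1 < z2
)
```

The same two checks apply to the other examples once a comparison function for partitions or payoff streams is supplied.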
2.4. In no-man's-land: A necessary and sufficient condition for utility representation

We saw above that preference relations where there are uncountably many disjoint intervals between bad and good alternatives cannot be represented by means of a utility function. On the other hand, complete and transitive preferences on a countable set do have a utility representation. Is there something in between these two cases that allows uncountably many alternatives, but still has enough of a countable character that it allows a utility representation? Let $\succsim$ be a complete, transitive preference relation over a set $X$. The pair $(X, \succsim)$ or, with a minor abuse of notation, the set $X$ is Jaffray order-separable if there is a countable subset $C \subseteq X$ such that for all $x, y \in X$:
$$\text{if } x \succ y, \text{ then there exist } c_1, c_2 \in C \text{ s.t. } x \succsim c_1 \succ c_2 \succsim y.$$
The condition roughly says that countably many alternatives suffice to keep all pairs $x, y \in X$ with $x \succ y$ apart: $x$ lies on one side of the no-man's-land between $c_1$ and $c_2$, whereas $y$ lies on the other. This condition is both necessary and sufficient for the existence of a utility representation:
Proposition 2.5 Let $\succsim$ be a weak order on a set $X$. There is a utility function representing $\succsim$ if and only if $X$ is Jaffray order-separable.
Exercise 2.2 This exercise guides you through the steps of the proof. Assume that $u$ represents $\succsim$. Let $U = \{u(x) : x \in X\}$ be the range of $u$. A jump in $U$ is a pair $(u_1, u_2) \in U \times U$ where $u_1 < u_2$ and the open interval $(u_1, u_2)$ contains no elements of $U$: $(u_1, u_2) \cap U = \emptyset$.

(a) Prove that $U$ contains at most countably many jumps. (Suppose not. Use the idea behind (5) and (6) to find a contradiction.)

For each jump $(u_1, u_2)$, fix a point $x(u_1, u_2)$ with utility $u_1$ and a point $y(u_1, u_2)$ with utility $u_2$. Let $J = \bigcup \{x(u_1, u_2), y(u_1, u_2)\}$ be the union (over all jumps $(u_1, u_2)$) of these points. By (a), $J$ is countable. Next, for each pair of rational numbers $r_1, r_2 \in \mathbb{Q}$ with $r_1 < r_2$ and $(r_1, r_2) \cap U \neq \emptyset$, fix an element $x(r_1, r_2) \in X$ with utility in $(r_1, r_2) \cap U$. Let $R$ be the union of all such points $x(r_1, r_2)$. Since there are only countably many pairs $(r_1, r_2)$ as above, $R$ is countable. Let $C = J \cup R$.

(b) Show that $C$ makes $X$ Jaffray order-separable.

Conversely, assume $X$ is Jaffray order-separable via the set $C$. Let $n : C \to \mathbb{N}$ be injective. Define $u$ by
$$u(x) = \sum_{c \in C :\, c \precsim x} 2^{-n(c)}.$$

(c) Show that $u$ represents $\succsim$.

For finite or countable sets $X$, simply let $C = X$ to show that $X$ is Jaffray order-separable. For preferences over uncountable sets, additional restrictions are required. We will see in Proposition 2.8, for instance, that on $\mathbb{R}^L_+$, adding continuity to our list of requirements works.
2.5. Continuous utility

Economists usually work with continuous utility functions. Establishing existence of a continuous utility function is troublesome: not even Fishburn (1970a), the standard reference in the field, bothers to give the proof. A well-known continuity result is often wrongly attributed to Debreu (1954). However, his proof is flawed (Debreu, 1964) and a more general continuity result was already known from much older research on order types in the classical theory of sets, due to Georg Cantor. See, for instance, Kamke (1950). The proof of Proposition 2.6 is not obligatory reading; it follows Jaffray (1975).
Proposition 2.6 Assume:
$\succsim$ is a weak order on $X$;
$X$ is Jaffray order-separable;
$X$ is endowed with a topology where, for all $y \in X$, the sets $\{x \in X : x \prec y\}$ and $\{x \in X : x \succ y\}$ are open, i.e., $\succsim$ is continuous.
Then there exists a continuous utility function representing $\succsim$.
Proof. Let $C \subseteq X$ make $X$ Jaffray order-separable. Omitting redundant elements from $C$ if necessary, one may assume that no two distinct elements of $C$ are equivalent: for all $c, c' \in C$ with $c \neq c'$, either $c \succ c'$ or $c' \succ c$.

[Define utility on $C$:] Since $C$ is countable, label $C = \{c_1, c_2, \ldots\}$. Since the set $Q' = (0, 1) \cap \mathbb{Q}$ of rationals in $(0, 1)$ is countable, label $Q' = \{q_1, q_2, \ldots\}$. Define a utility function $f : C \to Q'$ by induction: $f(c_1) := q_1$. Let $n \in \mathbb{N}$, $n \geq 2$, and assume $f$ was defined on $\{c_1, \ldots, c_{n-1}\}$. To extend the utility function to $\{c_1, \ldots, c_n\}$, define $f(c_n)$ to be the first element (defined as the element with smallest index; see footnote 4) among those elements $q \in Q'$ that give the desired extension:
$$\forall k \in \{1, \ldots, n-1\}: \quad q > f(c_k) \iff c_n \succ c_k. \tag{7}$$
A useful implication: let $a, b \in C$ with $a \prec b$. If the set of points in $C$ between $a$ and $b$,
$$(a, b) = \{c \in C : a \prec c \prec b\},$$
is nonempty, it has a first element (Why?), say $c_m$. By construction, $c_m$ is the first element in $(a, b)$ to be assigned its value by $f$ and therefore its image $f(c_m)$ is the first element in $(f(a), f(b)) \cap Q'$.

[Extend utility to $X$:] For each $x \in X$, define $u(x) = \sup \{f(c) : c \in C, c \precsim x\}$. The set over which the supremum is taken is nonempty (it contains $x$) and bounded from above (by 1), so this supremum exists. Moreover, $u$ represents $\succsim$. Let $x, y \in X$. If $x \sim y$, the supremum is taken over the same set, so $u(x) = u(y)$. If $x \succ y$, there exist, by Jaffray order-separability, elements $a, b \in C$ with $x \succsim a \succ b \succsim y$, so that $u(x) \geq f(a) > f(b) \geq u(y)$.

[Establish continuity of utility:] The usual topology on $\mathbb{R}$ is generated by the intervals $(-\infty, r)$ and $(r, \infty)$, with $r$ rational. Therefore, it suffices to prove that $u^{-1}((-\infty, r))$ and $u^{-1}((r, \infty))$ are open for all $r \in \mathbb{Q}$. Let's do the former; the latter is similar. Now $u^{-1}((-\infty, r))$ equals (i) $\emptyset$ if $r \leq \inf f(C)$, (ii) $X$ if $r > \sup f(C)$ or if $r = \sup f(C)$ and $r \notin f(C)$, (iii) $\{x \in X : x \prec f^{-1}(r)\}$ if $r \in f(C)$. By assumption, all these sets are open.

The only remaining case is when $r \notin f(C)$ and $\inf f(C) < r < \sup f(C)$. We show that $r$ belongs to a jump of $f(C)$. Recall from Exercise 2.2 that a jump in $f(C)$ is a pair of points $(f_1, f_2) \in f(C) \times f(C)$ with $f_1 < f_2$ and $(f_1, f_2) \cap f(C) = \emptyset$. Suppose not. Since $\inf f(C) < r < \sup f(C)$, there exist $a, b \in C$ with $f(a) < r < f(b)$. Let $m \in \mathbb{N}$ be the maximum of the indices of $f(a), r, f(b) \in Q'$. Then $\{q_1, \ldots, q_m\}$ contains $r$ and elements $p, p' \in f(C)$ with $p < r < p'$. Let $n \in \mathbb{N}$ be the smallest index for which $\{q_1, \ldots, q_n\}$ has this property. Let
$$p_1 = \max f(C) \cap \{q_1, \ldots, q_n\} \cap (-\infty, r), \quad \text{so } (p_1, r) \cap \{q_1, \ldots, q_n\} = \emptyset,$$
$$p_2 = \min f(C) \cap \{q_1, \ldots, q_n\} \cap (r, \infty), \quad \text{so } (r, p_2) \cap \{q_1, \ldots, q_n\} = \emptyset.$$
So $r$ is the first element of $(p_1, p_2)$. Since it contains $r$, the interval $(p_1, p_2)$ cannot be a jump, i.e., it contains elements from $f(C)$. We show that this yields a contradiction. Since $p_1, p_2 \in f(C)$, there exist $b_1, b_2 \in C$ with $p_1 = f(b_1)$, $p_2 = f(b_2)$. Since $(p_1, p_2) \cap f(C) \neq \emptyset$, there is a $p \in C$ with $f(b_1) < f(p) < f(b_2)$, i.e., the set $(b_1, b_2)$ of points in $C$ between $b_1$ and $b_2$ is nonempty. Let $b'$ be its first element. By the implication following (7), its image $f(b')$ must be the first element of $(p_1, p_2)$, which was $r$. But $r \notin f(C)$, a contradiction.

This shows that $r$ belongs to a jump $(f_1, f_2)$ of $f(C)$. But then $u^{-1}((-\infty, r)) = \{x \in X : x \prec f^{-1}(f_2)\}$, which is open by assumption.

Footnote 4. Caveat: `first element' is defined in terms of the chosen enumerations of $C$ and $Q'$. This allows us to speak, for instance, of the first element in $(0, 1)$, which makes absolutely no sense if one mistakenly were to believe it was defined in terms of the usual order on $\mathbb{R}$.

Let us apply this result to show that continuous weak orders on $\mathbb{R}^L_+$ can be represented by a continuous utility function. We first establish an auxiliary result that is of interest in its own right whenever we want to find alternatives in between two others.
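The inductive definition of $f$ on the countable set $C$ in the proof of Proposition 2.6 can be sketched as follows, for a finite prefix of an enumerated chain. Several things here are assumptions made for illustration: the particular enumeration of the rationals in $(0, 1)$ by increasing denominator, the names `rationals_in_unit_interval` and `embed`, the integer example, and the explicit requirement that a new value differ from all earlier ones (added so that distinct elements receive distinct values).

```python
from fractions import Fraction
from itertools import count
from math import gcd


def rationals_in_unit_interval():
    """Enumerate the rationals in (0, 1) once each: 1/2, 1/3, 2/3, 1/4, 3/4, ..."""
    for den in count(2):
        for num in range(1, den):
            if gcd(num, den) == 1:
                yield Fraction(num, den)


def embed(chain, strictly_prefers):
    """Give each element, in enumeration order, the first rational that is
    ordered consistently with all values assigned so far."""
    f = {}
    for c in chain:
        for q in rationals_in_unit_interval():
            consistent = all(
                (q > f[k]) == strictly_prefers(c, k) and q != f[k] for k in f
            )
            if consistent:
                f[c] = q
                break
    return f


# Hypothetical chain: integers enumerated as 5, 2, 8, 3; larger is better.
f = embed([5, 2, 8, 3], lambda a, b: a > b)

order_preserving = all((f[a] > f[b]) == (a > b) for a in f for b in f)
```

Note how the value of 3 is the first rational, in enumeration order, strictly between the values of 2 and 5; this mirrors the "useful implication" following (7).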
Proposition 2.7 (Intermediate Value Theorem for preferences) Assume:
$X = \mathbb{R}^L_+$ for some $L \in \mathbb{N}$;
$\succsim$ is a continuous weak order on $X$;
$Y$ is a connected subset of $X$.
The following two results hold:
(a) If $x \in X$ and $y, y' \in Y$ are such that $y \succsim x \succsim y'$, then there is a $y'' \in Y$ with $x \sim y''$.
(b) If $y, y' \in Y$ are such that $y \succ y'$, then there is a $y'' \in Y$ with $y \succ y'' \succ y'$.

Proof. (a): Suppose not: all elements of $Y$ are strictly better/worse than $x$. That is, each element of $Y$ belongs to exactly one of the sets $A = \{z \in X : z \succ x\}$ and $B = \{z \in X : z \prec x\}$. The former contains $y$, the latter $y'$. As $A$ and $B$ are open by continuity, they separate the connected set $Y$, a contradiction.

(b): Suppose not. Then each element of $Y$ belongs to exactly one of the sets $A = \{z \in X : z \succ y'\}$ and $B = \{z \in X : z \prec y\}$. The former contains $y$, the latter $y'$. As $A$ and $B$ are open by continuity, they separate the connected set $Y$, a contradiction.

In typical applications of this proposition, one takes $Y$ to be equal to the entire set $X$, as in Proposition 2.8, or to a suitably chosen convex set like the diagonal $\{x \in \mathbb{R}^L_+ : x_1 = \cdots = x_L\}$ in Proposition 2.9.
Proposition 2.8 Assume:
$X = \mathbb{R}^L_+$ for some $L \in \mathbb{N}$;
$\succsim$ is a continuous weak order on $X$.
Then there is a continuous utility function representing $\succsim$.

Proof. The countable set $C = \mathbb{Q}^L_+$ makes $X$ Jaffray order-separable: let $x, y \in X$ with $x \succ y$. By Proposition 2.7, there is a $z \in X$ with $x \succ z \succ y$. By continuity, the set
$$\{a \in X : x \succ a \succ z\} = \{a \in X : x \succ a\} \cap \{a \in X : a \succ z\}$$
is the intersection of two open sets, hence open itself. It is nonempty by Proposition 2.7. The set $C$ is dense in $X$: every nonempty, open set in $X$ has a nonempty intersection with $C$. Hence, there is a $c_1 \in C$ with $x \succ c_1 \succ z$. Similarly, there is a $c_2 \in C$ with $z \succ c_2 \succ y$. Conclude that $x \succ c_1 \succ c_2 \succ y$, in correspondence with the requirement for Jaffray order-separability. Now all conditions of Proposition 2.6 are satisfied.

Below we present a special case of Proposition 2.8 with a particularly simple proof.
Proposition 2.9 Assume:
$X = \mathbb{R}^L_+$ for some $L \in \mathbb{N}$;
$\succsim$ is a continuous, monotonic weak order on $X$.
Then there is a continuous utility function representing $\succsim$.

Proof. Let $e = (1, \ldots, 1) \in \mathbb{R}^L_+$ denote the vector of ones.

Step 1: For each $x \in X$, there is a unique $\alpha_x \geq 0$ with $x \sim \alpha_x e$. Let $x \in X$. Choose $\alpha \geq \max\{x_1, \ldots, x_L\}$. By monotonicity, $\alpha e \succsim x \succsim 0e$. By Proposition 2.7, the diagonal $\{x \in \mathbb{R}^L_+ : x_1 = \cdots = x_L\}$, being connected, contains an element equivalent to $x$: there is an $\alpha_x \geq 0$ with $x \sim \alpha_x e$. Unicity follows from monotonicity: increasing $\alpha$ gives better alternatives, decreasing $\alpha$ worse.

Step 2: Define $u(x) = \alpha_x$. Then $u$ represents $\succsim$. Let $x, y \in X$. Then $x \succsim y \iff \alpha_x e \succsim \alpha_y e \iff u(x) = \alpha_x \geq \alpha_y = u(y)$.

Step 3: $u$ is continuous. It suffices to show that the preimage $u^{-1}((\alpha, \beta))$ of every open interval $(\alpha, \beta)$ is open. Now
$$u^{-1}((\alpha, \beta)) = \{x \in X : x \succ \alpha e\} \cap \{x \in X : x \prec \beta e\}$$
is the intersection of two open sets by continuity, and therefore open.

As a simple application, suppose that preferences are also homothetic. Then, for each $\lambda > 0$, $x \sim \alpha_x e$ implies that $\lambda x \sim \lambda \alpha_x e$, so $u(\lambda x) = \lambda \alpha_x = \lambda u(x)$. This proves:

Corollary 2.10 If in addition to the assumptions in Proposition 2.9 the preference relation $\succsim$ is homothetic, there is a utility function homogeneous of degree one representing $\succsim$.
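The utility of Proposition 2.9 measures each bundle by the equivalent point on the diagonal. Given only a comparison function, that point can be approximated by bisection. The sketch below is a hedged illustration, not part of the text: the name `diagonal_utility`, the tolerance, and the example preferences (generated from a hidden function `v` so that the answer can be checked) are assumptions.

```python
import math


def diagonal_utility(x, weakly_prefers, tol=1e-9):
    """Approximate the alpha >= 0 with x ~ alpha * e by bisection on the diagonal."""
    dim = len(x)
    lo, hi = 0.0, max(x)  # by monotonicity, hi*e is at least as good as x
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if weakly_prefers(x, (mid,) * dim):
            lo = mid  # x is at least as good as mid*e
        else:
            hi = mid  # mid*e is strictly better than x
    return (lo + hi) / 2


# Hypothetical continuous, monotonic, homothetic preferences on the plane.
def v(x):
    return math.sqrt(x[0] * x[1])


def weakly_prefers(x, y):
    return v(x) >= v(y)


alpha = diagonal_utility((1.0, 4.0), weakly_prefers)
alpha_scaled = diagonal_utility((2.0, 8.0), weakly_prefers)
```

Since these example preferences are homothetic, doubling the bundle doubles the computed value, in line with Corollary 2.10.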
The next exercise studies the connection between continuous preferences and continuous utility. The fact that statement (a) in that exercise is true is useful: you will have relatively little trouble recognizing continuous functions, and continuous utility implies continuous preferences!

Exercise 2.3 Consider a weak order $\succsim$ on topological space $X$ represented by utility function $u : X \to \mathbb{R}$. Are the following claims true or false?
(a) If $u$ is continuous, then $\succsim$ is continuous.
(b) If $\succsim$ is continuous, then $u$ is continuous.
2.6. Some special functional forms

Recall that if a preference relation over commodity bundles is quasilinear in some coordinate, this coordinate is often referred to by economists as `money' or a `numeraire'. Under mild additional assumptions, such quasilinear preferences can be represented by means of a utility function of the form `money plus whatever utility I get from the other commodities'.

Proposition 2.11 Assume:
$X = \mathbb{R}^L_+$ for some $L \in \mathbb{N}$;
$\succsim$ is a weak order on $X$;
$\succsim$ is quasilinear and strongly monotonic in the first coordinate;
Getting something is at least as good as getting nothing: $x \succsim (0, \ldots, 0)$ for every $x \in X$;
Any difference can be compensated for by money: for all $x, y \in X$: if $x \succsim y$, there is a $v \geq 0$ s.t. $x \sim (y_1 + v, y_2, \ldots, y_L)$.
Then there is a utility function of the form $u(x) = x_1 + v(x_2, \ldots, x_L)$ representing $\succsim$.

Proof. Let $x \in X$. By assumption:
$$(0, x_2, \ldots, x_L) \succsim (0, \ldots, 0).$$
Hence there is a number $v(x_2, \ldots, x_L) \geq 0$ s.t.
$$(0, x_2, \ldots, x_L) \sim (v(x_2, \ldots, x_L), 0, \ldots, 0).$$
This number is unique, since $\succsim$ is strongly monotonic in the first coordinate. Adding $x_1 \geq 0$ to the first coordinate, quasilinearity implies that
$$(x_1, x_2, \ldots, x_L) \sim (x_1 + v(x_2, \ldots, x_L), 0, \ldots, 0).$$
The utility function $u : X \to \mathbb{R}$ with $u(x) = x_1 + v(x_2, \ldots, x_L)$ represents $\succsim$: for all $x, y \in X$,
$$x \succsim y \iff (x_1 + v(x_2, \ldots, x_L), 0, \ldots, 0) \succsim (y_1 + v(y_2, \ldots, y_L), 0, \ldots, 0) \iff x_1 + v(x_2, \ldots, x_L) \geq y_1 + v(y_2, \ldots, y_L),$$
where the second equivalence follows from strong monotonicity of $\succsim$ in the first coordinate.

The proof establishes that each alternative is equivalent with receiving a sufficiently large amount of just the first commodity: utility can be measured in units of commodity 1. This explains the frequent use of quasilinear preferences: only if they are measured on the same scale can one do meaningful comparisons between, say, your utility and mine.
Exercise 2.4 Is the final property
$$\forall x, y \in X: \text{ if } x \succsim y, \text{ there is a } v \geq 0 \text{ s.t. } x \sim y + v e_1 \tag{8}$$
in Proposition 2.11 implied by the others?
Exercise 2.5 Preferences with money (Kaneko, 1976): Let $A$ be a nonempty set. Let $X = A \times \mathbb{R}_+$, where an element $(a, m) \in X$ is interpreted as receiving $a \in A$ and an amount of money $m \in \mathbb{R}_+$. A decision maker has a weak order $\succsim$ on $X$ with the following three properties:

strict preference can be compensated for by money: for all alternatives $(a, m)$ and $(a', m')$ in $X$: if $(a, m) \succ (a', m')$, there is a number $m'' \geq 0$ such that $(a, m) \sim (a', m'')$.
$\succsim$ is strongly monotonic in money: for all $a \in A$ and $m, m' \in \mathbb{R}_+$: if $m > m'$, then $(a, m) \succ (a, m')$.
indifference is insensitive to shifts in money: for all alternatives $(a, m)$ and $(a', m')$ in $X$ and all $c \geq 0$: if $(a, m) \sim (a', m')$, then $(a, m + c) \sim (a', m' + c)$.

We construct a utility function assigning to each $(a, m) \in X$ a utility of the form `money plus utility from $a$'.

(a) Let $a, a' \in A$. Show that there exist amounts of money $m, m' \in \mathbb{R}_+$ such that $(a, m) \sim (a', m')$.

(b) Let $a, a' \in A$ and $m, m', w, w' \in \mathbb{R}_+$ satisfy $(a, m) \sim (a', m')$ and $(a, w) \sim (a', w')$. Show that $m - m' = w - w'$.

Fix an arbitrary element $a_0 \in A$. Define the function $v : A \to \mathbb{R}$ by taking, for each $a \in A$:
$$v(a) = m' - m,$$
where $m, m'$ are chosen such that $(a_0, m') \sim (a, m)$. Such $m, m'$ exist by (a) and the function $v$ is independent of the particular choices of $m, m'$ by (b), so this function is well-defined.

(c) Show that the function $u : X \to \mathbb{R}$ with $u(a, m) = v(a) + m$ is a utility function representing $\succsim$.
Also convexity and strict convexity of preferences have implications for the form of the utility function. Recall that a real-valued function $u$ on a convex domain $X$ (Why convex?) is

quasiconcave if for all $x, y \in X$ and all $\lambda \in (0, 1)$:
$$u(\lambda x + (1 - \lambda) y) \geq \min\{u(x), u(y)\}.$$
strictly quasiconcave if for all $x, y \in X$ with $x \neq y$ and all $\lambda \in (0, 1)$:
$$u(\lambda x + (1 - \lambda) y) > \min\{u(x), u(y)\}.$$
Proposition 2.12 Assume:
$X = \mathbb{R}^L_+$ for some $L \in \mathbb{N}$;
$\succsim$ is a convex weak order on $X$;
$u : X \to \mathbb{R}$ represents $\succsim$.
Then $u$ is quasiconcave. If $\succsim$ is strictly convex, $u$ is strictly quasiconcave.

Proof. Let $x, y \in X$ and $\lambda \in (0, 1)$. Assume without loss of generality that $x \succsim y$. Then $u(x) \geq u(y)$, so $\min\{u(x), u(y)\} = u(y)$. By convexity of $\succsim$: $\lambda x + (1 - \lambda) y \succsim y$, so
$$u(\lambda x + (1 - \lambda) y) \geq u(y) = \min\{u(x), u(y)\},$$
as we had to show. The proof for strict quasiconcavity is analogous.
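The quasiconcavity inequality is easy to test numerically on sample points. The sketch below searches a small grid for a violation; a failed search proves nothing, but a hit is a genuine counterexample. The function name, the grid, the mixing weights, the tolerance, and the two example utilities are illustrative assumptions.

```python
from itertools import product


def violates_quasiconcavity(u, points, lambdas=(0.25, 0.5, 0.75), tol=1e-12):
    """Return (x, y, lam) with u(lam*x + (1-lam)*y) < min(u(x), u(y)), else None."""
    for x, y in product(points, repeat=2):
        for lam in lambdas:
            mix = tuple(lam * a + (1 - lam) * b for a, b in zip(x, y))
            if u(mix) < min(u(x), u(y)) - tol:
                return x, y, lam
    return None


# Grid of bundles with coordinates 0, 0.5, ..., 2.
grid = [(i / 2, j / 2) for i in range(5) for j in range(5)]


def cobb_douglas(x):
    return (x[0] ** 0.5) * (x[1] ** 0.5)


def sum_of_squares(x):
    return x[0] ** 2 + x[1] ** 2


no_violation = violates_quasiconcavity(cobb_douglas, grid)
witness = violates_quasiconcavity(sum_of_squares, grid)
```

The second example has non-convex upper contour sets on the nonnegative orthant, so mixing two bundles on the axes yields something worse than both.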
Exercise 2.6
(a) An equivalent way of defining a quasiconcave function $u$ on a convex domain $X$ is that for all $r \in \mathbb{R}$, the upper contour set $X_u(r) = \{x \in X : u(x) \geq r\}$ is convex. Provide a second proof of Proposition 2.12, using this definition.
(b) As a converse to Proposition 2.12, prove that if $u : X \to \mathbb{R}$ is a (strictly) quasiconcave utility function on a convex set $X$, the corresponding preference relation $\succsim$ is (strictly) convex.
(c) Give an example of a convex weak order that can be represented by a utility function, but not by a concave one.

Next, we provide conditions for a weak order to be representable by a linear utility function. Although we go into more detail, the proof follows Diecidue and Wakker (2002). A convenient mathematical tool is treated in the following exercise.
Exercise 2.7 Cauchy's functional equation: On two domains, we show that, under mild assumptions, additive functions are linear. Let $f : \mathbb{R} \to \mathbb{R}$ be additive: $f(x + y) = f(x) + f(y)$ for all $x, y \in \mathbb{R}$.

(a) Let $u \in \mathbb{R}$. Show that $f(xu) = x f(u)$ for all rational $x$. Hint: First establish the claim for $x \in \mathbb{N}$, then for $x \in \mathbb{Z}$, then for $x \in \mathbb{Q}$.

Setting $u = 1$ and $c = f(1)$, it follows that $f(x) = cx$ for all rational $x$, i.e., $f$ is linear on the field $\mathbb{Q}$. Approximating real numbers by rational ones and taking limits, it follows that continuous additive functions $f : \mathbb{R} \to \mathbb{R}$ are linear. But much weaker conditions than continuity suffice:

(b) Suppose $f$ is not linear on $\mathbb{R}$. Show that its graph $\{(x, y) \in \mathbb{R}^2 \mid y = f(x)\}$ is dense.

So any assumption that prevents the graph of $f$ being dense implies that $f$ must be linear! Such conditions include continuity in a single point, boundedness/sign restrictions on small intervals, monotonicity, etc.

We now extend the domain to $n$-dimensional real vectors. Let $F : \mathbb{R}^n \to \mathbb{R}$ be additive: $F(x + y) = F(x) + F(y)$ for all $x, y \in \mathbb{R}^n$.

(c) Reduce this to the previously solved case by showing that there exist additive functions $f_i : \mathbb{R} \to \mathbb{R}$ for $i = 1, \ldots, n$ such that, for all $x \in \mathbb{R}^n$, $F(x) = f_1(x_1) + \cdots + f_n(x_n)$.

With this tool in our baggage, we can prove the linear representation result:
Proposition 2.13 Assume:
$X = \mathbb{R}^L$ for some $L \in \mathbb{N}$;
$\succsim$ is a weak order on $X$;
$\succsim$ is strongly monotonic;
$\succsim$ is additive: for all $x, y, z \in X$, if $x \succsim y$, then $x + z \succsim y + z$;
For each $x \in X$ there is a constant $\alpha \in \mathbb{R}$ such that $x \sim \alpha (1, \ldots, 1)$.
Then there are $\alpha_1, \ldots, \alpha_L \in \mathbb{R}_{++}$ such that the function $u : X \to \mathbb{R}$ with $u(x) = \alpha_1 x_1 + \cdots + \alpha_L x_L$ represents $\succsim$.

Proof. By assumption, there is, for each $x \in X$, a number $u(x) \in \mathbb{R}$ such that $x \sim u(x) e$. By strong monotonicity, this number is unique. So the function $u : \mathbb{R}^L \to \mathbb{R}$ is well-defined and represents preferences $\succsim$.

Moreover, $u$ is additive. Let $x, y \in X$. Using additivity of $\succsim$ twice (for $\succsim$ and $\precsim$), $x \sim u(x) e$ implies that $x + y \sim u(x) e + y$. Similarly, $y \sim u(y) e$ implies that $u(x) e + y \sim u(x) e + u(y) e = (u(x) + u(y)) e$. By transitivity, $x + y \sim (u(x) + u(y)) e$. Hence $u(x + y) = u(x) + u(y)$.

As $u : \mathbb{R}^L \to \mathbb{R}$ satisfies Cauchy's functional equation, Exercise 2.7 implies that there are additive functions $u_i : \mathbb{R} \to \mathbb{R}$ ($i = 1, \ldots, L$) with $u(x) = \sum_{i=1}^{L} u_i(x_i)$. By strong monotonicity, each $u_i$ is strictly increasing: its graph cannot be dense. Hence, each $u_i$ is linear: there are $\alpha_1, \ldots, \alpha_L \in \mathbb{R}$ such that $u(x) = \sum_{i=1}^{L} \alpha_i x_i$. The constants $\alpha_1, \ldots, \alpha_L$ are positive by strong monotonicity.

Most assumptions are familiar. Strong monotonicity assures that all the $\alpha_i$ are positive; with milder monotonicity requirements, one can only assure that some of them are. If you don't like the final assumption, recall from Proposition 2.9 that it can be replaced by continuity. Additivity of preferences is obviously the key assumption. It essentially states that in evaluating two alternatives $x, y \in X$, only their difference $x - y$ matters: preferences are insensitive to translations. With later applications in mind (see Proposition 2.14), there is no nonnegativity assumption on the vectors over which preferences were defined: $X = \mathbb{R}^L$, not $\mathbb{R}^L_+$. If this makes you nervous, notice that the proof hinges on the linearity of the function satisfying Cauchy's functional equation. Fortunately, linearity can be derived even if additivity holds only on the nonnegative orthant.

The remainder of this section is based on Voorneveld (2008), which contains more general results. Due to its analytical tractability, the Cobb-Douglas utility function $u : \mathbb{R}^L_+ \to \mathbb{R}$ with
$$u(x) = x_1^{a_1} \cdots x_L^{a_L} = \prod_{i=1}^{L} x_i^{a_i} \qquad (L \in \mathbb{N},\ a_1, \ldots, a_L > 0)$$
is among the most commonly used in economics; see also Exercise which?. Its name credits Cobb and Douglas (1928), who used it in the context of production theory. What properties of an agent's preferences assure that they can be represented by a Cobb-Douglas utility function? Part of the trick is in exploiting the fact that this function also goes under the name of log-linear utility: taking logarithms, we have that for all $x, y \in \mathbb{R}^L_{++}$:
$$x \succsim y \iff \sum_{i=1}^{L} a_i \ln x_i \geq \sum_{i=1}^{L} a_i \ln y_i.$$
This reduces preferences to a linear utility function in the logarithm of the variables, allowing us to exploit Proposition 2.13. Of course, this trick goes only part of the way, as one cannot take logarithms on the boundary of $\mathbb{R}^L_+$, where some coordinates equal zero.
Proposition 2.14 Assume:
$X = \mathbb{R}^L_+$ for some $L \in \mathbb{N}$;
$\succsim$ is a weak order on $X$;
$\succsim$ is strongly monotonic;
$\succsim$ is homothetic in each coordinate: for each $i \in \{1, \ldots, L\}$, all $x, y \in X$, and each $t > 0$: if $x \succsim y$, then $(x_1, \ldots, x_{i-1}, t x_i, x_{i+1}, \ldots, x_L) \succsim (y_1, \ldots, y_{i-1}, t y_i, y_{i+1}, \ldots, y_L)$;
For each $x \in X$ there is a constant $\alpha \in \mathbb{R}_+$ such that $x \sim \alpha (1, \ldots, 1)$.
Then $\succsim$ can be represented by a Cobb-Douglas utility function.

Proof. We use Proposition 2.13 to show that $\succsim$ can be represented by a Cobb-Douglas utility function on $\mathbb{R}^L_{++}$. The domain is then extended to $\mathbb{R}^L_+$.

Step 1, domain $\mathbb{R}^L_{++}$: Define $f : \mathbb{R}^L \to \mathbb{R}^L_{++}$ for each $x \in \mathbb{R}^L$ by $f(x) = (\exp x_1, \ldots, \exp x_L)$. Notice that $f$ and its inverse $f^{-1} : \mathbb{R}^L_{++} \to \mathbb{R}^L$ with $f^{-1}(y) = (\ln y_1, \ldots, \ln y_L)$ are continuous. Given the weak order $\succsim$ on $\mathbb{R}^L_{++}$, define a weak order $\succsim_f$ on $\mathbb{R}^L$ as follows:
$$\forall x, y \in \mathbb{R}^L: \quad x \succsim_f y \iff f(x) \succsim f(y). \tag{9}$$
The exponential function is strictly increasing, so by substitution in (9), properties imposed on $\succsim$ carry over in a straightforward way to properties of $\succsim_f$: one easily verifies that it is a weak order satisfying strong monotonicity, and there exists, for each $x \in \mathbb{R}^L$, a scalar $\alpha$ such that $x \sim_f \alpha (1, \ldots, 1)$. Applying coordinatewise homotheticity $L$ times, it follows that
$$\forall x, y, t \in \mathbb{R}^L_{++}: \quad x \succsim y \implies (t_1 x_1, \ldots, t_L x_L) \succsim (t_1 y_1, \ldots, t_L y_L).$$
Hence, by definition (9), $(\ln x_1, \ldots, \ln x_L) \succsim_f (\ln y_1, \ldots, \ln y_L)$ implies that
$$(\ln x_1, \ldots, \ln x_L) + (\ln t_1, \ldots, \ln t_L) \succsim_f (\ln y_1, \ldots, \ln y_L) + (\ln t_1, \ldots, \ln t_L).$$
As $f$ is bijective, it follows that $\succsim_f$ is additive. Conclude that $\succsim_f$ on $\mathbb{R}^L$ satisfies all assumptions of Proposition 2.13: there are $a_1, \ldots, a_L > 0$ such that $\succsim_f$ is represented by the utility function $x \mapsto \sum_{i=1}^{L} a_i x_i$. By (9), for all $x, y \in \mathbb{R}^L_{++}$:
$$x \succsim y \iff (\ln x_1, \ldots, \ln x_L) \succsim_f (\ln y_1, \ldots, \ln y_L) \iff \sum_{i=1}^{L} a_i \ln x_i \geq \sum_{i=1}^{L} a_i \ln y_i.$$
Taking exponentials, $\succsim$ is represented by utility function $u$ with $u(x) = \prod_{i=1}^{L} x_i^{a_i}$ on $\mathbb{R}^L_{++}$.

Step 2, domain $\mathbb{R}^L_+$: To see that $u$ represents $\succsim$ on the entire domain $\mathbb{R}^L_+$, we must establish that $x \sim (0, \ldots, 0)$ for each $x \in \mathbb{R}^L_+$ with some, but not all, coordinates equal to zero. Pick such an $x$. As $x + (1/n) e \in \mathbb{R}^L_{++}$ for each $n \in \mathbb{N}$, strong monotonicity implies $(0, \ldots, 0) \prec x + (1/n) e$. Hence, there is an $\alpha_n > 0$ with $x + (1/n) e \sim \alpha_n e$. As at least one coordinate of $x + (1/n) e$ goes to zero:
$$0 = \lim_{n \to \infty} u(x + (1/n) e) = \lim_{n \to \infty} u(\alpha_n e) = \lim_{n \to \infty} \alpha_n^{a_1 + \cdots + a_L}.$$
As $a_1 + \cdots + a_L > 0$, it follows that $\lim_{n \to \infty} \alpha_n = 0$. By assumption, $x \sim \alpha e$ for some $\alpha \geq 0$. Positive $\alpha$ are ruled out: $\alpha e \sim x \prec x + (1/n) e \sim \alpha_n e$ for all $n \in \mathbb{N}$ and $\lim_{n \to \infty} \alpha_n = 0$. So $\alpha$ must be zero.

Again, most assumptions are familiar. The homotheticity requirement says that rescaling of specific coordinates does not affect preferences.
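The log-linear trick behind the Cobb-Douglas representation can be illustrated numerically: on strictly positive bundles, the Cobb-Douglas function and the weighted sum of logarithms rank alternatives identically, while on the boundary only the product form is defined. The exponents and bundles below are hypothetical values chosen for illustration.

```python
import math


def cobb_douglas(x, a):
    """u(x) = product of x_i ** a_i; defined on the whole nonnegative orthant."""
    return math.prod(xi ** ai for xi, ai in zip(x, a))


def log_linear(x, a):
    """Sum of a_i * ln(x_i); only defined for strictly positive bundles."""
    return sum(ai * math.log(xi) for xi, ai in zip(x, a))


a = (0.3, 0.7)  # hypothetical exponents
bundles = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0), (3.0, 3.0)]

# Sorting from worst to best gives the same order under both representations.
rank_cd = sorted(bundles, key=lambda x: cobb_douglas(x, a))
rank_log = sorted(bundles, key=lambda x: log_linear(x, a))

# On the boundary, Cobb-Douglas utility is zero, as in Step 2 of the proof.
boundary_utility = cobb_douglas((0.0, 5.0), a)
```

This agrees with Proposition 2.2: the exponential is strictly increasing, so both functions represent the same preferences wherever both are defined.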
3. Choice

3.1. Existence of most preferred elements

Hitherto, we discussed how microeconomists usually model what economic agents want. The obvious next step is to consider what they actually do. The rationality paradigm underlying the classical microeconomic theory requires that given (1) a set of mutually exclusive alternatives and (2) a nicely behaved preference relation/utility function over the alternatives, the agent will choose a most preferred alternative. This sounds pretty obvious, but an abundance of economic terminology sometimes blurs the picture: most of traditional microeconomics is plain and simple constrained optimization. This raises the question: when do most preferred alternatives exist? This is not straightforward: if you have strongly monotonic preferences over apples and face no consumption constraints whatsoever, there is no optimal amount of apples. Here is a very general existence result:
Proposition 3.1 Assume:
$\succsim$ is a weak order on a set $X$;
$\succsim$ is upper semicontinuous: for all $x \in X$, the lower contour set $L(x) = \{y \in X \mid y \prec x\}$ is open;
$Y$ is a nonempty, compact subset of $X$.
Then $Y$ contains a most preferred element: there is a $y^* \in Y$ with $y^* \succsim y$ for all $y \in Y$.

Proof. Suppose not: for every $y \in Y$ there is a $y' \in Y$ with $y' \succ y$. Then the lower contour sets $\{L(y) : y \in Y\}$ are an open covering of the compact set $Y$. By compactness, there is a finite subcovering, i.e., a finite subset $Y' \subseteq Y$ such that $\{L(y') : y' \in Y'\}$ covers $Y$. Since $Y'$ is finite, it contains a most preferred element $y^*$. But then $L(y^*)$ covers $Y$, i.e., $y^*$ is a best element of $Y$, contradicting our assumption.
Application to consumer model: Let $X = \mathbb{R}^L_+$. Suppose a consumer has a continuous (or upper semicontinuous) weak order $\succsim$ on $X$ reflecting his preferences and an amount of money $w > 0$ in his pocket ($w$ for wealth). Suppose the price vector is $p \in \mathbb{R}^L_{++}$. The budget set $B(p, w)$ at prices $p$ and wealth $w$ consists of all affordable, feasible commodity bundles:
$$B(p, w) = \{x \in \mathbb{R}^L_+ \mid p \cdot x \leq w\}.$$
This set is:
nonempty: it contains the zero vector,
closed: it is the intersection of finitely many closed halfspaces (see footnote 5):
$$B(p, w) = \bigcap_{i=1}^{L} \{x \in \mathbb{R}^L \mid x_i \geq 0\} \cap \{x \in \mathbb{R}^L \mid p \cdot x \leq w\}, \tag{10}$$
bounded: $0 \leq x_i \leq w / p_i$ for all commodities $i$,
compact: it is a closed and bounded subset of $\mathbb{R}^L$ and therefore compact by the Heine-Borel theorem,
convex: by (10), it is the intersection of convex halfspaces.
Since $B(p, w)$ is nonempty and compact and $\succsim$ is assumed to be an upper semicontinuous weak order, the budget set contains at least one most preferred alternative.

Footnote 5. Recall that a halfspace in $\mathbb{R}^n$ is a set of the type $\{x \in \mathbb{R}^n : a \cdot x \leq c\}$ or $\{x \in \mathbb{R}^n : a \cdot x \geq c\}$, where $a \in \mathbb{R}^n$, $a \neq 0$, and $c \in \mathbb{R}$.
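The existence result says a most preferred bundle exists in the budget set; it does not say how to find it. As a rough numerical illustration, one can search a finite grid inside the box $0 \leq x_i \leq w/p_i$. Everything specific below is an assumption made for the sketch: the function names, the grid resolution, the prices and wealth, and the utility $u(x) = x_1 x_2$. A grid search only approximates the optimum in general; here the grid happens to contain it.

```python
from itertools import product


def in_budget_set(x, p, w):
    """Membership test for B(p, w): nonnegative bundle costing at most w."""
    return all(xi >= 0 for xi in x) and sum(pi * xi for pi, xi in zip(p, x)) <= w


def best_on_grid(u, p, w, steps=40):
    """Return a utility-maximizing bundle among the feasible grid points."""
    axes = [[(w / pi) * k / steps for k in range(steps + 1)] for pi in p]
    feasible = [x for x in product(*axes) if in_budget_set(x, p, w)]
    return max(feasible, key=u)


def u(x):
    return x[0] * x[1]  # hypothetical utility function


p, w = (1.0, 2.0), 8.0  # hypothetical prices and wealth
best = best_on_grid(u, p, w)
```

With these numbers the consumer spends half of the wealth on each good, and the chosen bundle exhausts the budget.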
Exercise 3.1 A decision maker has lexicographic preferences $\succsim$ over $\mathbb{R}^2$:
$$(x_1, x_2) \succsim (y_1, y_2) \iff x_1 > y_1 \ \text{ or } \ (x_1 = y_1 \text{ and } x_2 \geq y_2).$$
(a) Is $\succsim$ upper semicontinuous?
(b) Does each nonempty, compact subset $Y \subset \mathbb{R}^2$ contain a most preferred element?
3.2. Revealed preference

Rather than going from preferences to choices, this subsection, based on Arrow (1959), tries to move in the opposite direction: can we, under suitable assumptions, explain observed choices by constructing a preference relation that makes such choices rational? Formally, a choice structure is a tuple $(X, \mathcal{B}, C)$, where

$X$ is a nonempty set of alternatives.
$\mathcal{B}$ is a nonempty collection of choice sets. Each element of $\mathcal{B}$ is a nonempty subset $B \subseteq X$, interpreted as a potential problem for a decision-maker: `Please choose from $B$.'
$C$ is a choice rule, assigning to each choice set $B \in \mathcal{B}$ a nonempty set $C(B) \subseteq B$, interpreted as those elements from $B$ that the decision maker finds acceptable.

The choice structure $(X, \mathcal{B}, C)$ is rationalizable if there is a weak order $\succsim$ on $X$ such that for each choice set $B \in \mathcal{B}$, the associated choices $C(B)$ are the most preferred ones under $\succsim$:
$$\forall B \in \mathcal{B}: \quad C(B) = \{x \in B \mid x \succsim y \text{ for all } y \in B\}. \tag{11}$$
Consider two properties one might expect from revealed preferences:
Weak axiom of revealed preference (WARP)

satises WARP if
The choice structure
(X, B, C)
A, B B, x, y A B :
if
x C(A), y C(B), A
and
then
x C(B). x
and
The idea behind WARP is this: in both choice problems available. If Similarly, if equivalent
B,
alternatives
are
x C(A), this reveals x to be at least good as y ; otherwise x wouldn't be acceptable. y C(B), then y must be at least as good as x. But then x and y ought to be and you should nd x acceptable also in B .
Independence of irrelevant alternatives (IIA): The choice structure $(X, \mathcal{B}, C)$ satisfies IIA if

$$\forall A, B \in \mathcal{B}: \quad \text{if } A \subseteq B \text{ and } C(B) \cap A \neq \emptyset, \text{ then } C(A) = C(B) \cap A.$$

Intuitively, suppose that some items on menu $B$ are not feasible after all and choice is restricted to $A$. If $A$ still contains some acceptable elements from $B$, choice should remain unaffected: an element is acceptable in the smaller set $A$ if and only if it was acceptable in the larger set $B$.
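For a finite choice structure, WARP and IIA can be checked mechanically. The following is a minimal sketch, not part of the original notes: it assumes a choice rule is stored as a Python dictionary mapping each choice set (a `frozenset`) to its set of chosen elements, and the names `all_menus`, `satisfies_warp`, `satisfies_iia`, `pick_largest` and `odd_rule` are purely illustrative.

```python
from itertools import combinations

def all_menus(X):
    """All nonempty subsets of the finite set X, as frozensets."""
    items = sorted(X)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def satisfies_warp(choice):
    """WARP: for x, y in A∩B, x in C(A) and y in C(B) imply x in C(B)."""
    for A, CA in choice.items():
        for B, CB in choice.items():
            common = A & B
            some_y_chosen_in_B = bool(common & CB)
            for x in common & CA:
                if some_y_chosen_in_B and x not in CB:
                    return False
    return True

def satisfies_iia(choice):
    """IIA: A ⊆ B and C(B)∩A nonempty imply C(A) = C(B)∩A."""
    for A, CA in choice.items():
        for B, CB in choice.items():
            if A <= B and (CB & A) and CA != (CB & A):
                return False
    return True

menus = all_menus({1, 2, 3})

# Rule 1 (illustrative): always choose the largest number on the menu.
pick_largest = {B: frozenset({max(B)}) for B in menus}

# Rule 2 (illustrative): same, except that 1 is chosen from the full menu.
odd_rule = dict(pick_largest)
odd_rule[frozenset({1, 2, 3})] = frozenset({1})

print(satisfies_warp(pick_largest), satisfies_iia(pick_largest))
print(satisfies_warp(odd_rule), satisfies_iia(odd_rule))
```

The first rule is rationalized by the usual order on the integers and passes both checks. The second fails both: with $A = \{1, 2\}$ and $B = \{1, 2, 3\}$, alternative 2 is chosen from $A$ and 1 from $B$, but 2 is not chosen from $B$.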
Proposition 3.2 Consider a choice structure $(X, \mathcal{B}, C)$.
(a) If it satisfies WARP, then it satisfies IIA.
(b) If it satisfies IIA and all choice sets with at most three elements are contained in $\mathcal{B}$, then $(X, \mathcal{B}, C)$ is rationalizable.

Proof. (a): Assume WARP holds. Let $A, B$ be as in the definition of IIA. Let $a \in C(A)$ and $b \in C(B) \cap A$. To show: $a \in C(B) \cap A$, $b \in C(A)$. Since $C(A) \subseteq A \subseteq B$, we have $a, b \in A \cap B$, $a \in C(A)$, $b \in C(B)$. By WARP, $a \in C(B)$, $b \in C(A)$.

(b): For all $x, y \in X$, the set $\{x, y\}$ lies in $\mathcal{B}$ by the assumption on $\mathcal{B}$. Hence, we may define $x \succsim y$ if $x \in C(\{x, y\})$. We need to check three things:

[$\succsim$ is complete:] Let $x, y \in X$. By nonemptiness, either $x \in C(\{x, y\})$ or $y \in C(\{x, y\})$, i.e., $x \succsim y$ or $y \succsim x$.

[$\succsim$ is transitive:] Let $x, y, z \in X$ and assume that $x \succsim y$ and $y \succsim z$. By definition of $\succsim$: $x \in C(\{x, y\})$ and $y \in C(\{y, z\})$. To show: $x \succsim z$, i.e., $x \in C(\{x, z\})$. If $x = y$ or $y = z$, this follows immediately. If $x = z$, then $x \succsim z$ is the same as $x \succsim x$, which follows from completeness. So let $x, y, z$ be distinct and consider the set $\{x, y, z\} \in \mathcal{B}$. It suffices to show that $x \in C(\{x, y, z\})$, because then $x \in C(\{x, z\})$ by IIA. Suppose, to the contrary, that $x \notin C(\{x, y, z\})$. By nonemptiness of $C$, $C(\{x, y, z\}) \cap \{y, z\} \neq \emptyset$. By IIA and $y \succsim z$: $y \in C(\{y, z\}) = C(\{x, y, z\}) \cap \{y, z\}$. So $C(\{x, y, z\}) \cap \{x, y\} \neq \emptyset$. By IIA and $x \succsim y$: $x \in C(\{x, y\}) = C(\{x, y, z\}) \cap \{x, y\}$, contradicting the assumption that $x \notin C(\{x, y, z\})$.

[$\succsim$ rationalizes $(X, \mathcal{B}, C)$:] To show that (11) holds, let $B \in \mathcal{B}$. Firstly, let $z \in C(B)$. To show: $z \succsim y$ for all $y \in B$. So let $y \in B$. Then $\{y, z\} \subseteq B$, $\{y, z\} \in \mathcal{B}$, and $z \in C(B) \cap \{y, z\} \neq \emptyset$. By IIA, $z \in C(\{y, z\})$. So $z \succsim y$. Secondly, let $z \in B$ satisfy $z \succsim y$ for all $y \in B$. To show: $z \in C(B)$. By nonemptiness, there is a $y \in C(B)$. Then $\{y, z\} \subseteq B$, $\{y, z\} \in \mathcal{B}$, and $y \in C(B) \cap \{y, z\} \neq \emptyset$. By $z \succsim y$ and IIA: $z \in C(\{y, z\}) = C(B) \cap \{y, z\}$, so $z \in C(B)$.
Exercise 3.3 investigates the other relations between rationalizability, WARP, and IIA.
3.3. Exercises

Exercise 3.2 Weierstrass' Maximum Theorem: Use Proposition 3.1 to prove that a continuous function $f : X \to \mathbb{R}$ on a nonempty, compact set $X$ achieves a maximum and a minimum.
Exercise 3.3
(a) Show that if $(X, \mathcal{B}, C)$ is rationalizable, it satisfies WARP.
(b) Does IIA imply WARP?
(c) Can the restriction on $\mathcal{B}$ in Proposition 3.2 be omitted?
(d) Does WARP imply rationalizability?
Exercise 3.4 Let $X = \{1, 2, \dots, n\}$ for some $n \in \mathbb{N}$, $n \geq 3$, and let $\mathcal{B}$ consist of all nonempty subsets of $X$. For each of the following choice rules $C$, prove whether the choice structure $(X, \mathcal{B}, C)$ satisfies WARP and/or IIA. If possible, construct a weak order rationalizing it.
(a) Satisficing (Simon, 1955): A function $v : X \to \mathbb{R}$ assigns to each alternative $x \in X$ a value $v(x) \in \mathbb{R}$. Those with a value at/above a given threshold $r \in \mathbb{R}$ are deemed `satisfactory'. For each $B \in \mathcal{B}$, the choice $C(B)$ is defined as follows: go through the elements of $B$ in increasing order and choose the first satisfactory one. If no such element exists, choose the final (i.e., largest) element of $B$.
(b) Madly in love: Assume your partner has a weak order $\succsim$ on $X$ in which no two distinct elements are equivalent. For each choice set $B \in \mathcal{B}$ with two/more elements, you politely abstain from choosing your partner's favorite: $C(B) = \{x \in B \mid \exists y \in B : y \succ x\}$.
Exercise 3.5 A taste for precious metals: A consumer faces two luxury goods, the first is gold, the second platinum, and spends the entire wealth on the good with the highest price. If prices are equal, half of the wealth is spent on each good. To investigate the rationality of such behavior, consider a choice structure $(X, \mathcal{B}, C)$, where $X = \mathbb{R}^2_+$, the commodity space, and $\mathcal{B}$ consists of two choice sets: $B_1 = B((2, 1), 2)$, the budget set at prices $p = (2, 1)$ and wealth $w = 2$, and $B_2 = B((1, 2), 2)$.
(a) Draw the choice sets $B_1$ and $B_2$ in the same figure. Given the assumptions above, find $C(B_1)$ and $C(B_2)$ and also draw these in your figure.
(b) Does the choice structure $(X, \mathcal{B}, C)$ satisfy IIA?
(c) Does the choice structure $(X, \mathcal{B}, C)$ satisfy WARP?
(d) Is the choice structure $(X, \mathcal{B}, C)$ rationalizable?
Economic models of luxury goods often allow price-dependent preferences.
(e) Give an example of a utility function, depending both on the commodity bundle $x$ and the price vector $p$ and denoted $u(x, p)$, that makes the consumer's behavior utility maximizing for every $(p, w) \in \mathbb{R}^3_{++}$.
4. Choices of a consumer: classical demand theory

4.1. The preference/utility maximization problem
Section 3.1 set the stage for the classical model of consumer behavior. This model consists of a specification of:
(i) what the consumer wants: a preference relation or utility function;
(ii) what the consumer finds feasible: a budget set indicating the commodity bundles that he can choose from;
(iii) what the consumer, putting these two together, finds the most preferable commodity bundles.

Formally:
there are $L \in \mathbb{N}$ commodities that can be consumed in nonnegative quantities, so the commodity space is $X = \mathbb{R}^L_+$;
a price vector $p \in \mathbb{R}^L_{++}$ assigns to each commodity $i \in \{1, \dots, L\}$ a price $p_i > 0$;
the consumer has a given income/`wealth' $w > 0$, i.e., an amount of money to spend on buying a commodity bundle;
the consumer has a preference relation $\succsim$ on $X$ or even a utility function $u : X \to \mathbb{R}$ representing these preferences.

Typically, no additional restrictions are imposed on consumption, so the budget set

$$B(p, w) = \{x \in \mathbb{R}^L_+ : p \cdot x \leq w\}$$

specifies the commodity bundles the consumer can afford. At this stage, it would be a good idea to look back at Section 3.1 to recapitulate some properties of this budget set. The consumer solves the following preference maximization problem ($\succsim$-MP):

$\succsim$-MP: Find the set of most preferable commodity bundles according to $\succsim$ in the budget set $B(p, w)$.

Given utility function $u$, this yields the utility maximization problem (UMP):

UMP: Solve $\max u(x)$ s.t. $x \in B(p, w)$.

It is common economic practice to assign special names to the set of solutions and, in case a utility function is given, the corresponding optimal value of such optimization problems. The (Walrasian) demand correspondence assigns to each price vector $p \in \mathbb{R}^L_{++}$ and wealth $w > 0$ the associated set $x(p, w)$ of optimal commodity bundles:

$$x(p, w) = \{x \in B(p, w) : x \succsim y \text{ for all } y \in B(p, w)\} = \{x \in B(p, w) : u(x) = \max_{y \in B(p, w)} u(y)\}.$$

Given a utility function $u$, the indirect utility function $v : \mathbb{R}^{L+1}_{++} \to \mathbb{R}$ assigns to each price vector $p \in \mathbb{R}^L_{++}$ and wealth $w > 0$ the maximal utility the consumer can achieve. To compute it is easy:

$$v(p, w) = u(x^*), \quad \text{where } x^* \in x(p, w),$$

is the utility of an arbitrary vector in the demand at $(p, w)$. This is independent of the particular choice of $x^* \in x(p, w)$: since all such vectors are utility maximizers, their utility is the same.
Remark 4.1 If the utility function $u$ is a $C^1$-function (its partial derivatives exist and are continuous on an open set containing $X$), the UMP

$$\max u(x) \quad \text{s.t.} \quad p \cdot x \leq w,\ x_1 \geq 0,\ \dots,\ x_L \geq 0,$$

is usually solved using the associated Kuhn-Tucker conditions.

Remark 4.2 If the Walrasian demand correspondence is single-valued, i.e., if $x(p, w)$ consists of a single element for each $(p, w) \in \mathbb{R}^{L+1}_{++}$, it is common to treat demand as a function, rather than a correspondence.
Let us conclude this subsection with an example involving a well-known type of utility function.

Leontiev utility: Baking your favorite cake requires fixed proportions of its $L \geq 2$ ingredients: one unit of cake takes a vector $(a_1, \dots, a_L) \in \mathbb{R}^L_{++}$ of ingredients. Given ingredient vector $x \in \mathbb{R}^L_+$, how much cake can you produce? Well, looking at the $i$-th ingredient, your guess will be at most $x_i / a_i$ units. What constrains you are those ingredients $i$ where this fraction is the smallest. Therefore, a suitable utility function would be

$$u(x) = \min\{x_1 / a_1, \dots, x_L / a_L\}, \tag{12}$$

specifying how many units of cake you can make from $x$. This utility function is not differentiable, so the Kuhn-Tucker conditions are not applicable.
Exercise 4.1
Check that the associated preference relation is continuous, monotonic (but not strongly),
convex (but not strictly), and homothetic.
Let prices and wealth be $(p, w) \in \mathbb{R}^{L+1}_{++}$. Since preferences are continuous and the budget set $B(p, w)$ nonempty and compact, there is at least one solution to the UMP (see Section 3.1): $x(p, w) \neq \emptyset$. Let's compute it. Firstly, if $x^*$ solves the UMP, it must be that

$$x^*_1 / a_1 = \cdots = x^*_L / a_L. \tag{13}$$

Why? Well, suppose this were not true:

$$\min\{x^*_1 / a_1, \dots, x^*_L / a_L\} < \max\{x^*_1 / a_1, \dots, x^*_L / a_L\}.$$

Then you're using the ingredients in the wrong proportions: you can only make $u(x^*) = \min\{x^*_1 / a_1, \dots, x^*_L / a_L\}$ units of cake, but there are commodities $i$ where you have enough for $x^*_i / a_i = \max\{x^*_1 / a_1, \dots, x^*_L / a_L\}$ units, an utter waste. If you were to trade a small amount of these wasted ingredients for the non-wasted ones, you would still be in your budget set, but able to make more cake. Hurray! Secondly, preferences are monotonic, so you will use your entire budget on ingredients: $p \cdot x^* = w$. Combining this with (13) gives us that there is a unique solution to the UMP at $(p, w)$, namely

$$x^* = \left( \frac{a_1 w}{\sum_{i=1}^L a_i p_i}, \dots, \frac{a_L w}{\sum_{i=1}^L a_i p_i} \right).$$

By Remark 4.2, it is common to write this result down as a demand function:

$$\forall (p, w) \in \mathbb{R}^{L+1}_{++}: \quad x(p, w) = \left( \frac{a_1 w}{\sum_{i=1}^L a_i p_i}, \dots, \frac{a_L w}{\sum_{i=1}^L a_i p_i} \right)$$

instead of a single-valued demand correspondence:

$$\forall (p, w) \in \mathbb{R}^{L+1}_{++}: \quad x(p, w) = \left\{ \left( \frac{a_1 w}{\sum_{i=1}^L a_i p_i}, \dots, \frac{a_L w}{\sum_{i=1}^L a_i p_i} \right) \right\}.$$

Substituting the demand vector in the utility function, we find the indirect utility function:

$$\forall (p, w) \in \mathbb{R}^{L+1}_{++}: \quad v(p, w) = u\left( \frac{a_1 w}{\sum_{i=1}^L a_i p_i}, \dots, \frac{a_L w}{\sum_{i=1}^L a_i p_i} \right) = \frac{w}{\sum_{i=1}^L a_i p_i}.$$
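The closed forms for Leontiev demand and indirect utility are easy to evaluate numerically. The sketch below is an illustration only: the function names and the numbers ($a = (2, 1)$, $p = (3, 4)$, $w = 20$) are assumptions made for the example, and the final loop is a crude grid search along the budget line, not a proof of optimality.

```python
def leontief_utility(x, a):
    """u(x) = min_i x_i / a_i, as in (12)."""
    return min(xi / ai for xi, ai in zip(x, a))

def unit_cost(p, a):
    """Cost of the ingredients for one unit of cake: sum_i a_i p_i."""
    return sum(ai * pi for ai, pi in zip(a, p))

def leontief_demand(p, w, a):
    """Walrasian demand x(p, w) = (a_1 w / sum a_i p_i, ..., a_L w / sum a_i p_i)."""
    k = unit_cost(p, a)
    return [ai * w / k for ai in a]

def leontief_indirect_utility(p, w, a):
    """v(p, w) = w / sum_i a_i p_i."""
    return w / unit_cost(p, a)

# Illustrative numbers (assumed for this example only).
a = [2.0, 1.0]
p = [3.0, 4.0]
w = 20.0

x_star = leontief_demand(p, w, a)
spent = sum(pi * xi for pi, xi in zip(p, x_star))
v = leontief_indirect_utility(p, w, a)

# Crude sanity check: no bundle on a grid along the budget line beats x_star.
steps = 200
best_on_line = max(
    leontief_utility([(k / steps) * w / p[0], (1 - k / steps) * w / p[1]], a)
    for k in range(steps + 1)
)

print(x_star, spent, leontief_utility(x_star, a), v, best_on_line)
```

With these numbers one unit of cake costs $2 \cdot 3 + 1 \cdot 4 = 10$, so the consumer buys ingredients for two cakes: demand is $(4, 2)$, the whole budget of 20 is spent, and $u(x^*) = v(p, w) = 2$.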
Exercise 4.2 Our definition of the budget set is standard, but other realistic restrictions can be modeled just as easily. In the commodity space $X = \mathbb{R}^2_+$, let the price vector be $p = (8, 4)$. The consumer has wealth $w = 40$ and an upper semicontinuous weak order $\succsim$ on $X$. In each of the following cases separately, specify the budget set given the additional information. Does the new budget set necessarily contain at least one most preferred bundle?
(a) Indivisibilities: The commodities cannot be cut into ever smaller pieces. Only integer quantities are feasible.
(b) Rationing: The consumer is not allowed to buy more than three units of the first commodity.
(c) Rebates 1: If the consumer buys more than five units of the second commodity, these additional units in excess of the first five have a lower price, namely two.
(d) Rebates 2: If the consumer buys more than five units of the second commodity, the price of this commodity (also the first five units) is decreased to two.
(e) Initial endowment: Instead of having wealth $w$, suppose the consumer has an initial endowment $\omega = (1, 1)$ of one unit of both commodities. He can sell (parts of) his initial endowment to generate income to purchase other commodity bundles.
(f) Package deal: The consumer has to buy the same quantity of both commodities.
(g) Gift certificate: The consumer has received a gift certificate of one monetary unit, which he can spend in its entirety on commodity one.
4.2. Properties of the demand correspondence and indirect utility
Section 1 listed a lot of properties that can be imposed on the consumer's preferences. The next result indicates the consequences of such restrictions on the demand correspondence.
Proposition 4.3 Let $X = \mathbb{R}^L_+$ for some $L \in \mathbb{N}$ and let $\succsim$ be a weak order on $X$. The Walrasian demand correspondence has the following properties:
(a) If $\succsim$ is upper semicontinuous, then $x(p, w)$ is nonempty for all $(p, w) \in \mathbb{R}^{L+1}_{++}$.
(b) If $\succsim$ is continuous, the Walrasian demand correspondence has a closed graph: for each sequence $(p^n, w^n, x^n)_{n \in \mathbb{N}}$ in $\mathbb{R}^{L+1}_{++} \times X$ with limit $(p, w, x) \in \mathbb{R}^{L+1}_{++} \times X$: if $x^n \in x(p^n, w^n)$ for all $n \in \mathbb{N}$, then also $x \in x(p, w)$.
(c) Homogeneity of degree zero: $\forall (p, w) \in \mathbb{R}^{L+1}_{++}, \forall \lambda > 0 : x(\lambda p, \lambda w) = x(p, w)$.
(d) If $\succsim$ is convex, or equivalently, if $u$ is quasiconcave, then $x(p, w)$ is a convex set for all $(p, w) \in \mathbb{R}^{L+1}_{++}$.
(e) If $\succsim$ is strictly convex, or equivalently, if $u$ is strictly quasiconcave, then $x(p, w)$ contains at most one element for all $(p, w) \in \mathbb{R}^{L+1}_{++}$.
(f) Walras' law: All money is spent: If $\succsim$ is locally nonsatiated, then $p \cdot x = w$ for all $(p, w) \in \mathbb{R}^{L+1}_{++}$ and all $x \in x(p, w)$.
Proof. (a): See Section 3.1.

(b): Let $(p^n, w^n, x^n)_{n \in \mathbb{N}}$ be a sequence in $\mathbb{R}^{L+1}_{++} \times X$ with limit $(p, w, x) \in \mathbb{R}^{L+1}_{++} \times X$. Assume that $x^n \in x(p^n, w^n)$ for all $n \in \mathbb{N}$. To show: $x \in x(p, w)$. Firstly, $x^n \in X$ and $p^n \cdot x^n \leq w^n$, so taking limits: $p \cdot x \leq w$. Conclude that $x \in B(p, w)$. Suppose that $x \notin x(p, w)$: there is a $y \in B(p, w)$ with $y \succ x$. By continuity of $\succsim$ and Proposition 1.2, there are neighborhoods $U_x$ of $x$ and $U_y$ of $y$ such that $y' \succ x'$ for all $(x', y') \in U_x \times U_y$. Choose $y' \in U_y$ with $p \cdot y' < w$. This is possible: $y \in B(p, w)$ implies that $p \cdot y \leq w$. In case of strict inequality, take $y' = y$. In case of equality, small decreases in the positive coordinates of $y$ will give the desired $y'$. As $(p^n, w^n) \to (p, w)$, it follows that $p^n \cdot y' \leq w^n$ for $n$ sufficiently large, so $y' \in B(p^n, w^n) \cap U_y$. As $x^n \to x$, $x^n \in U_x$ for $n$ sufficiently large. Hence, for large $n$, $x^n \in U_x$ and $y' \in B(p^n, w^n) \cap U_y$. But then $y' \succ x^n$, contradicting that $x^n$ was optimal at prices $p^n$ and wealth $w^n$.

(c): Since $B(\lambda p, \lambda w) = \{x \in \mathbb{R}^L_+ : (\lambda p) \cdot x \leq \lambda w\} = \{x \in \mathbb{R}^L_+ : p \cdot x \leq w\} = B(p, w)$, the $\succsim$-MP has the same domain before and after rescaling and therefore the same set of solutions.

(d): Assume $\succsim$ is convex. If $x(p, w) = \emptyset$, it is convex. If $x(p, w) \neq \emptyset$, let $x^* \in x(p, w)$. Then $x(p, w) = B(p, w) \cap \{x \in X : x \succsim x^*\}$ is the intersection of two convex sets and therefore convex.

(e): Assume $\succsim$ is strictly convex. Suppose there are $x, y \in x(p, w)$, $x \neq y$. Then $\frac{1}{2} x + \frac{1}{2} y$ lies in $B(p, w)$ by convexity of $B(p, w)$. By strict convexity of $\succsim$, this bundle is strictly better than $x$ and $y$, contradicting that these were most preferred bundles in $B(p, w)$.

(f): Assume $\succsim$ is locally nonsatiated. Let $x \in x(p, w)$. Then $p \cdot x \leq w$, since $x \in B(p, w)$. Suppose $p \cdot x < w$. For $\varepsilon > 0$ sufficiently small, the entire neighborhood $\{y \in X : \|x - y\| < \varepsilon\}$ is contained in the budget set. By local nonsatiation, this neighborhood contains a point $y$ with $y \succ x$, contradicting that $x$ is a most preferred bundle in the budget set.

An important consequence of the closed-graph property is that if Walrasian demand is single-valued, the Walrasian demand function is continuous!
Exercise 4.3 If $\succsim$ is homothetic, then... what can you conclude about Walrasian demand?
To formulate properties of indirect utility, we will need to assume (Surprise!) that preferences are represented by means of a utility function and that the demand correspondence is non-empty valued: otherwise, indirect utility is undened.
Proposition 4.4 Assume:
$X = \mathbb{R}^L_+$ for some $L \in \mathbb{N}$;
The consumer's preference relation $\succsim$ is represented by a utility function $u : X \to \mathbb{R}$;
Walrasian demand is nonempty-valued: $\forall (p, w) \in \mathbb{R}^{L+1}_{++} : x(p, w) \neq \emptyset$.
Then the indirect utility function has the following properties:
(a) Homogeneity of degree zero: $\forall (p, w) \in \mathbb{R}^{L+1}_{++}, \forall \lambda > 0 : v(\lambda p, \lambda w) = v(p, w)$.
(b) For each commodity $i$, $v$ is nonincreasing in the price of $i$ (higher prices cannot make you better off).
(c) $v$ is nondecreasing in wealth; if $\succsim$ is locally nonsatiated, $v$ is even strictly increasing in wealth.
(d) $v$ is quasiconvex: $\forall r \in \mathbb{R} : \{(p, w) \in \mathbb{R}^{L+1}_{++} : v(p, w) \leq r\}$ is a convex set.
(e) If $\succsim$ is represented by a continuous utility function $u$, then $v$ is continuous.
Proof. (a): Follows from Proposition 4.3(c).

(b): Let $(p, w) \in \mathbb{R}^{L+1}_{++}$ and let $i \in \{1, \dots, L\}$ be a commodity. Let $p'$ be obtained from $p$ by a strict increase in the price of commodity $i$. Then $B(p', w) \subseteq B(p, w)$, so

$$v(p', w) = \max_{y \in B(p', w)} u(y) \leq \max_{y \in B(p, w)} u(y) = v(p, w),$$

since the second maximum is taken over a larger set.

(c): The nondecreasing part is similar to (b), so we will only do the strictly-increasing part. Assume $\succsim$ is locally nonsatiated. Let $p \in \mathbb{R}^L_{++}$ and $0 < w < w'$. To show: $v(p, w) < v(p, w')$. Let $x \in x(p, w)$. Then $x \in B(p, w)$, so $p \cdot x \leq w < w'$. Since $p \cdot x < w'$, for $\varepsilon > 0$ sufficiently small, the entire neighborhood $\{y \in X : \|x - y\| < \varepsilon\}$ is contained in the budget set $B(p, w')$. By local nonsatiation, this neighborhood contains a point $y$ with $y \succ x$. Conclude that

$$v(p, w) = u(x) < u(y) \leq \max_{z \in B(p, w')} u(z) = v(p, w').$$

(d): Let $r \in \mathbb{R}$. If $\{(p, w) \in \mathbb{R}^{L+1}_{++} : v(p, w) \leq r\} = \emptyset$, it is convex. If it is nonempty, let $(p, w), (p', w')$ lie in this set and let $\lambda \in [0, 1]$. Write $(p'', w'') = \lambda (p, w) + (1 - \lambda)(p', w')$. To show: $v(p'', w'') \leq r$, i.e., $u(x) \leq r$ for all $x \in B(p'', w'')$. Let $x \in B(p'', w'')$. Then $x \in \mathbb{R}^L_+$ and $\lambda (p \cdot x) + (1 - \lambda)(p' \cdot x) \leq \lambda w + (1 - \lambda) w'$. Therefore, $p \cdot x \leq w$ or $p' \cdot x \leq w'$ (or both). W.l.o.g., $p \cdot x \leq w$. Then $x \in B(p, w)$, so $u(x) \leq v(p, w) \leq r$.

(e): Follows from Proposition 4.3(b).
Exercise 4.4
(a) Proposition 4.4(c) might suggest that also (b) can be strengthened a bit: If $\succsim$ is locally nonsatiated, indirect utility is strictly decreasing in the price of commodity $i$. But this is wrong. Why?
(b) Write out the proof of Proposition 4.4(e) in detail.
(c) Why not just write `If $\succsim$ is continuous, $v$ is continuous'?
4.3. The expenditure minimization problem

Consider a consumer with utility function $u : \mathbb{R}^L_+ \to \mathbb{R}$, prices $p \in \mathbb{R}^L_{++}$, and a utility level $u \in \mathbb{R}$. What is the minimal amount the consumer has to pay, i.e., the minimal level of wealth needed to reach utility level $u$? The answer is given by the expenditure minimization problem (EMP):

$$\min p \cdot x \quad \text{s.t.} \quad x \in \mathbb{R}^L_+,\ u(x) \geq u.$$

The Hicksian or compensated demand correspondence assigns to each price vector $p \in \mathbb{R}^L_{++}$ and each utility level $u$ the associated set $h(p, u)$ of solutions to the EMP:

$$h(p, u) = \{x \in \mathbb{R}^L_+ : u(x) \geq u \text{ and } p \cdot x \leq p \cdot y \text{ for all } y \in \mathbb{R}^L_+ \text{ with } u(y) \geq u\}.$$

The Hicksian demand correspondence specifies the set of consumption bundles solving the EMP, the expenditure function $e(p, u)$ indicates its value:

$$e(p, u) = \min_{x \in \mathbb{R}^L_+,\ u(x) \geq u} p \cdot x = p \cdot x^* \quad \text{for all } x^* \in h(p, u).$$
Similar to our earlier approach to Walrasian demand and indirect utility, one can derive properties of Hicksian demand and the expenditure function. To make the proposition at all sensible, one needs to restrict attention to utility levels that are actually reachable; therefore, let

$$U = \{u(x) : x \in \mathbb{R}^L_+\}$$

be the range of the utility function $u$.

Proposition 4.5 Let $X = \mathbb{R}^L_+$ for some $L \in \mathbb{N}$ and let $u : X \to \mathbb{R}$ represent a consumer's weak order $\succsim$. The Hicksian demand correspondence has the following properties:
(a) If $\succsim$ is upper semicontinuous, then $h(p, u)$ is nonempty for all $(p, u) \in \mathbb{R}^L_{++} \times U$.
(b) Homogeneity of degree zero in prices: $\forall (p, u) \in \mathbb{R}^L_{++} \times U, \forall \lambda > 0 : h(\lambda p, u) = h(p, u)$.
(c) If $\succsim$ is convex, or equivalently, if utility is quasiconcave, then $h(p, u)$ is convex for all $(p, u) \in \mathbb{R}^L_{++} \times U$.
(d) If utility is continuous and $\succsim$ is strictly convex, or equivalently, if utility is strictly quasiconcave, then $h(p, u)$ contains at most one element for all $(p, u) \in \mathbb{R}^L_{++} \times U$.
(e) No excess utility: If utility is continuous, then $u(x) = u$ for all $(p, u) \in \mathbb{R}^L_{++} \times U$ with $u \geq u(0, \dots, 0)$ (see footnote 6) and all $x \in h(p, u)$.
(f) Compensated law of demand: let $p', p'' \in \mathbb{R}^L_{++}$ and $u \in U$. If $x' \in h(p', u)$ and $x'' \in h(p'', u)$, then $(p'' - p') \cdot (x'' - x') \leq 0$.

Footnote 6: Why this restriction? Well, suppose that $u < u(0, \dots, 0)$. Since $p \cdot x \geq 0$ for all $x \in \mathbb{R}^L_+$, it follows that $h(p, u) = \{(0, \dots, 0)\}$: expenditure is not minimal at utility $u$, because the zero vector, with higher utility, is the cheapest option. Under suitable monotonicity restrictions, however, this will turn out to be an exotic case: the zero vector will often give you the lowest utility in $\mathbb{R}^L_+$, so that this footnote becomes irrelevant.
Proof. (a): Let $(p, u) \in \mathbb{R}^L_{++} \times U$. By feasibility, $u(y) = u$ for some $y \in X$. By upper semicontinuity of preferences, the set $\{x \in X : u(x) \geq u\} = \{x \in X : x \succsim y\}$ is closed. Therefore, the solution of the EMP lies in the nonempty set $\{x \in \mathbb{R}^L_+ : u(x) \geq u\} \cap \{x \in \mathbb{R}^L_+ : p \cdot x \leq p \cdot y\}$, which is the intersection of a closed and a compact set and therefore compact. The goal function $x \mapsto p \cdot x$ is continuous. A continuous function on a nonempty, compact set achieves a minimum; see Section 3.1.

(b): Minimizing $x \mapsto (\lambda p) \cdot x$ gives the same solutions as minimizing $x \mapsto p \cdot x$.

(c): Let $(p, u) \in \mathbb{R}^L_{++} \times U$. If $h(p, u) = \emptyset$, it is convex. If $h(p, u) \neq \emptyset$, let $y \in h(p, u)$. By definition,

$$h(p, u) = \{x \in \mathbb{R}^L_+ : u(x) \geq u\} \cap \{x \in \mathbb{R}^L_+ : p \cdot x \leq p \cdot y\}$$

is the intersection of convex sets, hence convex.

(d): The result is true if $u \leq u(0, \dots, 0)$, as $h(p, u) = \{(0, \dots, 0)\}$ in those cases. So let $u > u(0, \dots, 0)$. Suppose $h(p, u)$ contains two distinct alternatives, $x, x'$. By strict convexity, $(x + x')/2$ is strictly better, yet causes the same expenses. As $(x + x')/2 \succ x \succ (0, \dots, 0)$, $(x + x')/2 \neq (0, \dots, 0)$: some of its coordinates are positive. By continuity, slight decreases in these coordinates still yield alternatives at least as good as $x$, i.e., they remain feasible in the EMP at $(p, u)$, but cheaper than $x$, a contradiction.

(e): Assume the utility function is continuous. Let $(p, u) \in \mathbb{R}^L_{++} \times U$ with $u \geq u(0, \dots, 0)$ and $x \in h(p, u)$. If $u = u(0, \dots, 0)$, then $h(p, u) = \{(0, \dots, 0)\}$, so the result is true: $u(x) = u(0, \dots, 0) = u$. Next, let $u > u(0, \dots, 0)$. Suppose $u(x) > u$. Then $x \neq (0, \dots, 0)$, so that at least some coordinates of $x$ exceed zero. By continuity, $u(y) > u$ for all $y$ in a neighborhood of $x$. By continuity, $\lim_{\lambda \to 1} u(\lambda x) = u(x) > u$, so $u(\lambda x) > u$ for $\lambda \in (0, 1)$ close to one. But $p \cdot (\lambda x) = \lambda (p \cdot x) < p \cdot x$, contradicting that $x \in h(p, u)$.

(f): Since $x'$ is optimal and $x''$ feasible in the EMP at $(p', u)$, it follows that $p' \cdot x' \leq p' \cdot x''$. Similarly, $p'' \cdot x'' \leq p'' \cdot x'$. Adding these inequalities and rewriting gives the compensated law of demand.

If $h$ is single-valued, we will treat it as a function, rather than a correspondence, just as we did for Walrasian demand (see Remark 4.2). The compensated law of demand implies that if you raise the price of one of the goods, then the Hicksian demand for this good will not increase. The next proposition states some properties of the expenditure function. Given the similarity with earlier results, proofs are left as an exercise.
Proposition 4.6 Assume:
$X = \mathbb{R}^L_+$ for some $L \in \mathbb{N}$;
The consumer's preference relation $\succsim$ is represented by a utility function $u : X \to \mathbb{R}$;
Hicksian demand is nonempty-valued: $\forall (p, u) \in \mathbb{R}^L_{++} \times U : h(p, u) \neq \emptyset$.
Then the expenditure function $e : \mathbb{R}^L_{++} \times U \to \mathbb{R}$ has the following properties:
(a) Homogeneity of degree one in prices: $\forall (p, u) \in \mathbb{R}^L_{++} \times U, \forall \lambda > 0 : e(\lambda p, u) = \lambda e(p, u)$.
(b) Monotonicity in $u$: If utility is continuous, then for all $p \in \mathbb{R}^L_{++}$ and all $u', u'' \in U$ with $u(0, \dots, 0) \leq u' < u''$: $e(p, u') < e(p, u'')$.
(c) For each commodity $i$, expenditure is nondecreasing in the price of $i$.
(d) For all $u \in U$, $e(\cdot, u)$ is concave in $p$.

Exercise 4.5 Prove this proposition.

Remark 4.7 (see footnote 7) Establishing continuity properties for Hicksian demand and expenditure is less straightforward than for Walrasian demand and indirect utility. Concave functions are continuous, so Proposition 4.6(d) implies that expenditure is continuous in prices. The utility function $u : \mathbb{R}_+ \to \mathbb{R}$ with $u(x) = \max\{0, x - 1\}$ shows that expenditure is not necessarily continuous in utility levels. Letting $p > 0$ be the price of the only commodity, one finds

$$e(p, u) = \begin{cases} 0 & \text{if } u = 0, \\ p(u + 1) & \text{if } u > 0. \end{cases}$$

Since $p > 0$, $e(p, \cdot)$ has a discontinuity at $u = 0$. However, if the utility function is both continuous and locally nonsatiated, continuity of the expenditure function $e : \mathbb{R}^L_{++} \times U \to \mathbb{R}$ can be established using a result known as Berge's Maximum Theorem. Contrary to what most textbooks (which do not provide the proof) suggest, the proof is not straightforward. To establish continuity at an arbitrary $(p^0, u^0) \in \mathbb{R}^L_{++} \times U$, local nonsatiation is used to establish existence of a $y \in \mathbb{R}^L_+$ with $u(y) > u^0$. Next, on a neighborhood of $(p^0, u^0)$, the EMP reduces to minimizing $p \cdot x$ subject to $x \in \{z \in \mathbb{R}^L_+ : u(z) \geq u,\ p \cdot z \leq p \cdot y\}$. This final condition assures that the conditions of the Maximum Theorem are satisfied.

Let us proceed with the example on Leontiev utility functions.
Leontiev utility (Continued): The Leontiev utility function in (12) has range $U = \mathbb{R}_+$. In order not to waste resources, a solution $x^*$ to the EMP at $(p, u) \in \mathbb{R}^L_{++} \times U$ must satisfy (13). Moreover, by continuity, it satisfies $u(x^*) = u$. Combining these two conditions gives us that there is a unique solution to the EMP at $(p, u)$, namely $x^* = (a_1 u, \dots, a_L u)$. Since the solution is unique, it is common to write the result as a Hicksian demand function, rather than a correspondence:

$$h(p, u) = (a_1 u, \dots, a_L u) \quad \text{and} \quad e(p, u) = p \cdot (a_1 u, \dots, a_L u) = u \sum_{i=1}^L a_i p_i.$$
The following result gives a relation between $h(p, u)$ and $e(p, u)$ in a particularly simple case.

Proposition 4.8 Assume the utility function $u : \mathbb{R}^L_+ \to \mathbb{R}$ is continuous and represents locally nonsatiated, strictly convex preferences. Then for all $p \in \mathbb{R}^L_{++}$ and all $u > u(0, \dots, 0)$, Hicksian demand for each good $\ell = 1, \dots, L$ can be found by differentiating the expenditure function with respect to the price $p_\ell$:

$$\forall \ell = 1, \dots, L: \quad h_\ell(p, u) = \frac{\partial e(p, u)}{\partial p_\ell}. \tag{14}$$

Footnote 7: Requires some knowledge of topology. Can be omitted.
Proof. We will not prove that the expenditure function is differentiable (see footnote 8). The remainder of the proof proceeds as follows. By strict convexity of preferences, Hicksian demand is single-valued, so we treat $h(\cdot)$ as a function. Fix $p \in \mathbb{R}^L_{++}$ and $u \in U$ and let $x^* = h(p, u)$ denote Hicksian demand at prices $p$ and utility level $u$. For every price vector $p' \in \mathbb{R}^L_{++}$,

$$e(p', u) = \min_{x \in \mathbb{R}^L_+,\ u(x) \geq u} p' \cdot x \leq p' \cdot x^*,$$

with equality if $p' = p$. Hence, the function $f : \mathbb{R}^L_{++} \to \mathbb{R}$ with $f(p') = e(p', u) - p' \cdot x^*$ is maximized at $p' = p$. By the first order conditions, its partial derivatives at $p$ must be zero:

$$\forall \ell = 1, \dots, L: \quad \frac{\partial f(p)}{\partial p_\ell} = \frac{\partial e(p, u)}{\partial p_\ell} - x^*_\ell = \frac{\partial e(p, u)}{\partial p_\ell} - h_\ell(p, u) = 0,$$

proving the result.
Exercise 4.6 Roy's identity: Similarly, one can prove: Assume the utility function $u : \mathbb{R}^L_+ \to \mathbb{R}$ is continuous and represents locally nonsatiated, strictly convex preferences. Assume that the indirect utility function $v(\cdot)$ is differentiable at a point $(p, w)$ with $p \in \mathbb{R}^L_{++}$ and $w > 0$. Then the Walrasian demand for each good $\ell = 1, \dots, L$ can be found as follows:

$$\forall \ell = 1, \dots, L: \quad x_\ell(p, w) = -\frac{\partial v(p, w) / \partial p_\ell}{\partial v(p, w) / \partial w}.$$

Do this by showing that the function $f : \mathbb{R}^L_{++} \to \mathbb{R}$ with $f(p') = v(p', p' \cdot x^*)$, where $x^* = x(p, w)$, achieves its minimum at $p' = p$.
4.4. Relations between UMP and EMP

Proposition 4.9 Assume the utility function $u : \mathbb{R}^L_+ \to \mathbb{R}$ is continuous and represents locally nonsatiated preferences. Fix a price vector $p \in \mathbb{R}^L_{++}$. Then:
(a) If $x^*$ is optimal in the UMP with wealth $w > 0$, then $x^*$ is optimal in the EMP with utility level $u = u(x^*)$. Moreover, the expenditure level in this EMP is exactly $p \cdot x^* = w$:

$$x^* \in x(p, w) \ \Rightarrow\ x^* \in h(p, u(x^*)) \text{ and } e(p, u(x^*)) = w.$$

(b) If $x^*$ is optimal in the EMP with utility level $u \in U$, $u > u(0, \dots, 0)$, then $x^*$ is optimal in the UMP with wealth $w = p \cdot x^*$. Moreover, the indirect utility level in this UMP is exactly $u$:

$$x^* \in h(p, u) \ \Rightarrow\ x^* \in x(p, p \cdot x^*) \text{ and } v(p, p \cdot x^*) = u.$$
Proof. (a): Let $x^* \in x(p, w)$. By Walras' law, $p \cdot x^* = w$. Bundle $x^*$ is feasible in the EMP with prices $p$ and utility level $u(x^*)$. Let $x \in h(p, u(x^*))$. By definition,

$$e(p, u(x^*)) = p \cdot x \leq p \cdot x^* = w \quad \text{and} \quad u(x) \geq u(x^*).$$

The first inequality means that $x \in B(p, w)$. But then its utility cannot exceed that of the utility maximizing bundle $x^* \in x(p, w)$. So $u(x) = u(x^*)$ and by Walras' law: $e(p, u(x^*)) = p \cdot x = p \cdot x^* = w$. Conclude that $x^* \in h(p, u(x^*))$ and $e(p, u(x^*)) = w$.

(b): Let $x^* \in h(p, u)$. By Proposition 4.5(e), $u(x^*) = u$. Bundle $x^*$ is feasible in the UMP at prices $p$ and wealth $p \cdot x^*$. Let $x \in x(p, p \cdot x^*)$. By definition,

$$v(p, p \cdot x^*) = u(x) \geq u(x^*) = u \quad \text{and} \quad p \cdot x \leq p \cdot x^*.$$

The first claim shows that $x$ is feasible in the EMP at $(p, u)$. But then the inequality in the second claim cannot be strict: $p \cdot x = p \cdot x^*$. By Proposition 4.5(e), $v(p, p \cdot x^*) = u(x) = u(x^*) = u$. Conclude that $x^* \in x(p, p \cdot x^*)$ and $v(p, p \cdot x^*) = u$.

Footnote 8: It follows from a duality result in convex analysis: for fixed $u$, $e(\cdot, u)$ is the support function of the strictly convex set $\{x \in X : u(x) \geq u\}$.
Under the assumptions above, we obtain important relations between the UMP and EMP:

$$e(p, v(p, w)) = w \tag{15}$$
$$v(p, e(p, u)) = u \tag{16}$$
$$x(p, w) = h(p, v(p, w)) \tag{17}$$
$$h(p, u) = x(p, e(p, u)) \tag{18}$$

Proof. (15): Let $x^* \in x(p, w)$. By definition, $v(p, w) = u(x^*)$. By Proposition 4.9(a), $e(p, v(p, w)) = e(p, u(x^*)) = w$.

(17): We first show that $x(p, w) \subseteq h(p, v(p, w))$. Let $x \in x(p, w)$. Then $u(x) = v(p, w)$, so $x \in h(p, u(x)) = h(p, v(p, w))$ by Proposition 4.9(a). Secondly, we show that $h(p, v(p, w)) \subseteq x(p, w)$. Let $x \in h(p, v(p, w))$. By Proposition 4.9(b), $x \in x(p, p \cdot x)$. Moreover, $x \in h(p, v(p, w))$ and (15) imply that $p \cdot x = e(p, v(p, w)) = w$. Conclude that $x \in x(p, p \cdot x) = x(p, w)$.

(16), (18): Similar.
These results give convenient ways to find solutions to the UMP from those of the EMP and vice versa. Let us illustrate this in our Leontiev example. Recall that

$$v(p, w) = \frac{w}{\sum_{i=1}^L a_i p_i} \quad \text{and} \quad x(p, w) = \left( \frac{a_1 w}{\sum_{i=1}^L a_i p_i}, \dots, \frac{a_L w}{\sum_{i=1}^L a_i p_i} \right).$$

By (16), expenditure $e(p, u)$ solves $u = v(p, e(p, u)) = \frac{e(p, u)}{\sum_{i=1}^L a_i p_i}$, so $e(p, u) = u \sum_{i=1}^L a_i p_i$, exactly (Good news, isn't it!) as we saw before. Hicksian demand can now be found in different ways. Firstly, using Proposition 4.8:

$$\forall \ell = 1, \dots, L: \quad h_\ell(p, u) = \frac{\partial e(p, u)}{\partial p_\ell} = a_\ell u,$$

and, secondly, using (18): $h(p, u)$ solves

$$h(p, u) = x(p, e(p, u)) = \left( \frac{a_1 e(p, u)}{\sum_{i=1}^L a_i p_i}, \dots, \frac{a_L e(p, u)}{\sum_{i=1}^L a_i p_i} \right) = (a_1 u, \dots, a_L u).$$
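The identities (15) to (18) and the derivative formula (14) can be checked numerically for the Leontiev example. This is a small sketch under assumed illustrative values ($a = (2, 1)$, $p = (3, 4)$, $w = 20$, $u = 1.5$); all function names are hypothetical, and the derivative in (14) is approximated by a central finite difference rather than computed symbolically.

```python
import math

def unit_cost(p, a):
    """sum_i a_i p_i: the cost of one unit of cake."""
    return sum(ai * pi for ai, pi in zip(a, p))

def walrasian_demand(p, w, a):
    return [ai * w / unit_cost(p, a) for ai in a]

def indirect_utility(p, w, a):
    return w / unit_cost(p, a)

def hicksian_demand(p, u, a):
    return [ai * u for ai in a]

def expenditure(p, u, a):
    return u * unit_cost(p, a)

def close_vectors(xs, ys):
    return all(math.isclose(x, y) for x, y in zip(xs, ys))

# Illustrative numbers (assumed for this example only).
a, p, w, u = [2.0, 1.0], [3.0, 4.0], 20.0, 1.5

id15 = math.isclose(expenditure(p, indirect_utility(p, w, a), a), w)
id16 = math.isclose(indirect_utility(p, expenditure(p, u, a), a), u)
id17 = close_vectors(walrasian_demand(p, w, a),
                     hicksian_demand(p, indirect_utility(p, w, a), a))
id18 = close_vectors(hicksian_demand(p, u, a),
                     walrasian_demand(p, expenditure(p, u, a), a))

def price_derivative(l, step=1e-6):
    """Central finite-difference approximation of de(p, u)/dp_l."""
    up = list(p)
    down = list(p)
    up[l] += step
    down[l] -= step
    return (expenditure(up, u, a) - expenditure(down, u, a)) / (2 * step)

id14 = all(
    math.isclose(price_derivative(l), h_l, rel_tol=1e-6)
    for l, h_l in enumerate(hicksian_demand(p, u, a))
)

print(id14, id15, id16, id17, id18)
```

All five checks should print `True`. Because $e(p, u)$ is linear in prices for Leontiev utility, the finite difference agrees with $a_\ell u$ up to rounding error.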
Exercise 4.7 For Leontiev utility, use (15) and (17) to find Walrasian demand and indirect utility from the solutions of the EMP.
Exercise 4.8 Slutsky equation: The so-called Slutsky equation provides a relation between the sensitivity to price changes of the Walrasian and Hicksian demand functions. Assume the utility function $u : \mathbb{R}^L_+ \to \mathbb{R}$ is continuous and represents locally nonsatiated, strictly convex preferences. We know that in this case there are unique solutions to the UMP and EMP: we can consider Walrasian and Hicksian demand functions. If these functions are differentiable, the following holds. Fix $(p, w) \in \mathbb{R}^{L+1}_{++}$ and utility level $u = v(p, w) > u(0, \dots, 0)$ (see footnote 9). Then for all commodities $k, \ell \in \{1, \dots, L\}$:

$$\frac{\partial h_\ell(p, u)}{\partial p_k} = \frac{\partial x_\ell(p, w)}{\partial p_k} + \frac{\partial x_\ell(p, w)}{\partial w}\, x_k(p, w). \tag{19}$$

Prove (19) as follows: You know from (18) that $h_\ell(p, u) = x_\ell(p, e(p, u))$. Differentiate this equation w.r.t. $p_k$, using the Chain rule. Continue by substituting (14), (15), and (18).
4.5. Welfare analysis for the consumer

Welfare analysis studies how changes in the consumer's environment (in our case: the budget set) affect his well-being. Let $B^0$ be the budget set before, and $B^1$ the budget set after the change. Assuming that optimal bundles exist, the consumer is better off after the change if and only if whatever is optimal in $B^1$ is strictly preferred to whatever is optimal in $B^0$. This is welfare analysis in a nutshell. Some obvious ways of detecting changes that are (at least weakly) welfare improving are:
The budget set has grown: $B^0 \subseteq B^1$.
An optimal bundle in $B^0$ remains feasible in $B^1$.
Exercise 4.9 How is the consumer's welfare affected by the changes described in Exercise 4.2?
Whereas the above describes the idea behind welfare analysis in its full generality and simplicity, economic textbooks tend to restrict attention to changes only in prices and wealth. The initial vector of prices and wealth is denoted $(p^0, w^0) \in \mathbb{R}^{L+1}_{++}$ and the vector of prices and wealth after the change is denoted $(p^1, w^1) \in \mathbb{R}^{L+1}_{++}$. This allows changes in prices only, keeping wealth constant ($p^0 \neq p^1$, $w^0 = w^1$), changes in wealth only, keeping prices constant ($p^0 = p^1$, $w^0 \neq w^1$), or simultaneous changes in prices and wealth ($p^0 \neq p^1$, $w^0 \neq w^1$).
Exercise 4.10 Let $\succsim$ be a locally nonsatiated weak order on $\mathbb{R}^L_+$. Consider a change from $(p^0, w^0)$ to $(p^1, w^1)$. Let $x^0 \in x(p^0, w^0)$. Show that if $p^1 \cdot x^0 < w^1$, the consumer is strictly better off under $(p^1, w^1)$ than under $(p^0, w^0)$.
Assume that the consumer's continuous, locally nonsatiated preference relation $\succsim$ can be represented by means of a utility function. We can derive the consumer's indirect utility function and conclude that the consumer is better off after the change if and only if $v(p^1, w^1) > v(p^0, w^0)$. However, since the indirect utility function depends on which utility function is chosen to represent $\succsim$, this does not tell us how much better off the consumer is. To express welfare changes unambiguously in monetary units, one constructs a so-called money metric indirect utility function using the expenditure function.

Footnote 9: This inequality holds because the zero vector cannot solve the utility maximization problem: by local nonsatiation and strict positivity of prices and wealth, there is an affordable bundle preferred to the zero vector.

Fix an arbitrary price vector $\bar{p} \in \mathbb{R}^L_{++}$. Consider the real-valued function $e(\bar{p}, \cdot)$. By Proposition 4.6, this function is strictly increasing, so

$$e(\bar{p}, v(p^1, w^1)) > e(\bar{p}, v(p^0, w^0)) \ \Leftrightarrow\ v(p^1, w^1) > v(p^0, w^0).$$

Moreover, since the expenditure function is expressed in monetary units,

$$e(\bar{p}, v(p^1, w^1)) - e(\bar{p}, v(p^0, w^0)) \tag{20}$$

can be used as a monetary measure of welfare change: if it is positive, the welfare of the consumer increases as a consequence of the change from $(p^0, w^0)$ to $(p^1, w^1)$; if it is negative, the welfare of the consumer has decreased. It remains to prove that this money metric does not depend on the choice of utility function representing the consumer's preferences. This follows from the fact that expenditure can be expressed in a form independent of the utility function: for all $(p, u) \in \mathbb{R}^L_{++} \times U$, there is a $y \in \mathbb{R}^L_+$ with $u(y) = u$, so

$$e(p, u) = \min\{p \cdot x : x \in \mathbb{R}^L_+,\ u(x) \geq u\} = \min\{p \cdot x : x \in \mathbb{R}^L_+,\ x \succsim y\}.$$
In (20), two natural choices for $\bar{p}$ would be the initial vector of prices $p^0$ and the new vector of prices $p^1$. These choices give rise to two well-known measures of welfare change: equivalent variation (EV) and compensating variation (CV). Let $u^0 = v(p^0, w^0)$ and $u^1 = v(p^1, w^1)$. Notice that $e(p^0, u^0) = w^0$ and $e(p^1, u^1) = w^1$ by local nonsatiation. We define
\[ EV((p^0, w^0), (p^1, w^1)) = e(p^0, u^1) - e(p^0, u^0) = e(p^0, u^1) - w^0, \]
\[ CV((p^0, w^0), (p^1, w^1)) = e(p^1, u^1) - e(p^1, u^0) = w^1 - e(p^1, u^0). \]
There is no obvious way to say that one of the measures is better than the other, although the equivalent variation has an advantage when comparing alternative changes: suppose $(p^0, w^0)$ changes either to $(p^1, w^1)$ or $(p^2, w^2)$. Both $EV((p^0, w^0), (p^1, w^1))$ and $EV((p^0, w^0), (p^2, w^2))$ are expressed in terms of wealth at prices $p^0$ and can consequently be compared. However, $CV((p^0, w^0), (p^1, w^1))$ is expressed in wealth at prices $p^1$ and $CV((p^0, w^0), (p^2, w^2))$ in wealth at prices $p^2$, so they are incomparable.
The equivalent and compensating variation for Leontiev utility follow immediately from the indirect utility function and expenditure function computed earlier:
\[ u^0 = v(p^0, w^0) = \frac{w^0}{\sum_{i=1}^L a_i p^0_i} \quad \text{and} \quad u^1 = v(p^1, w^1) = \frac{w^1}{\sum_{i=1}^L a_i p^1_i}, \]
so
\[ EV((p^0, w^0), (p^1, w^1)) = e(p^0, u^1) - e(p^0, u^0) = w^1 \frac{\sum_{i=1}^L a_i p^0_i}{\sum_{i=1}^L a_i p^1_i} - w^0, \]
\[ CV((p^0, w^0), (p^1, w^1)) = e(p^1, u^1) - e(p^1, u^0) = w^1 - w^0 \frac{\sum_{i=1}^L a_i p^1_i}{\sum_{i=1}^L a_i p^0_i}. \]
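The Leontiev formulas can be checked numerically. The following is a minimal sketch, assuming the indirect utility $v(p, w) = w / \sum_i a_i p_i$ stated in the text and the matching expenditure function $e(p, u) = u \sum_i a_i p_i$; the function names and the numbers are illustrative choices, not part of the notes.

```python
def leontief_indirect_utility(a, p, w):
    """v(p, w) = w / sum_i a_i p_i."""
    return w / sum(ai * pi for ai, pi in zip(a, p))


def leontief_expenditure(a, p, u):
    """e(p, u) = u * sum_i a_i p_i (the inverse of v in the wealth argument)."""
    return u * sum(ai * pi for ai, pi in zip(a, p))


def equivalent_variation(a, p0, w0, p1, w1):
    """EV = e(p0, u1) - w0: welfare change measured at the old prices."""
    u1 = leontief_indirect_utility(a, p1, w1)
    return leontief_expenditure(a, p0, u1) - w0


def compensating_variation(a, p0, w0, p1, w1):
    """CV = w1 - e(p1, u0): welfare change measured at the new prices."""
    u0 = leontief_indirect_utility(a, p0, w0)
    return w1 - leontief_expenditure(a, p1, u0)


# Hypothetical numbers: the price of good 1 doubles, wealth is unchanged.
a = (1.0, 2.0)
p0, w0 = (1.0, 1.0), 12.0
p1, w1 = (2.0, 1.0), 12.0
ev = equivalent_variation(a, p0, w0, p1, w1)
cv = compensating_variation(a, p0, w0, p1, w1)
print(ev, cv)  # -3.0 -4.0
```

Both measures are negative because the price increase hurts the consumer, but they differ in size because they value the same utility change at different price vectors.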

Lump-sum tax: Given initial prices and wealth $(p^0, w^0)$, suppose that the government levies a lump-sum tax $T \in (0, w^0)$ on the consumer's wealth, keeping prices unchanged. Then $(p^1, w^1) = (p^0, w^0 - T)$. Hence $e(p^0, u^0) = e(p^1, u^0) = w^0$ and $e(p^1, u^1) = e(p^0, u^1) = w^1 = w^0 - T$, so
\[ EV((p^0, w^0), (p^1, w^1)) = CV((p^0, w^0), (p^1, w^1)) = -T. \]
This is intuitive: since the prices remain unchanged, the monetary measure of welfare change as a consequence of a decrease of $T$ in the consumer's wealth should equal $-T$.
Deadweight loss: Let the preference relation $\succsim$ be a continuous, locally nonsatiated, strictly convex weak order on $\mathbb{R}^L_+$. Fix a price vector $p^0 \in \mathbb{R}^L_{++}$ and wealth $w > 0$. Suppose the government levies a commodity tax $t > 0$ on the price of good $\ell$. Thus, the new price vector is $p^1 = p^0 + t e_\ell$, where $e_\ell = (0, \ldots, 0, 1, 0, \ldots, 0)$ is the $\ell$-th standard basis vector of $\mathbb{R}^L$ with $\ell$-th coordinate 1 and all other coordinates 0. The total tax revenue is $T = t x_\ell(p^1, w)$ and
\[ EV((p^0, w), (p^1, w)) = e(p^0, u^1) - w \leq 0, \]
where $u^1 = v(p^1, w)$ as before. Alternatively, to raise the same amount, the government can levy a lump-sum tax $T$ directly on the wealth of the consumer, keeping prices fixed, yielding an equivalent variation $-T$.
The consumer is at least weakly better off under lump-sum taxation. Let $x$ solve the UMP under commodity taxation. Then $x \in B(p^0 + t e_\ell, w)$, so $p^0 \cdot x + t x_\ell \leq w$, i.e., $p^0 \cdot x \leq w - t x_\ell = w - T$. So $x \in B(p^0, w - T)$, i.e., $x$ is feasible in the UMP under lump-sum taxation: the consumer cannot be worse off under lump-sum taxation than under commodity taxation. Therefore, $e(p^0, u^1) \leq w - T$. The difference
\[ w - T - e(p^0, u^1) \geq 0 \]
is called the deadweight loss of commodity taxation.
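A small numerical illustration of the deadweight loss follows. It is only a sketch under an assumed example that does not appear in the notes: two goods, quasilinear utility $u(x) = 2\sqrt{x_1} + x_2$, the price of good 2 fixed at one, and wealth large enough for an interior solution. Under these assumptions demand for good 1 is $1/p_1^2$, indirect utility is $w + 1/p_1$ and expenditure is $u - 1/p_1$.

```python
def demand_good1(p1):
    """Walrasian demand for good 1 (interior solution, price of good 2 equal to 1)."""
    return 1.0 / p1 ** 2


def indirect_utility(p1, w):
    """v(p, w) = w + 1/p1 for u(x) = 2*sqrt(x1) + x2."""
    return w + 1.0 / p1


def expenditure(p1, u):
    """e(p, u) = u - 1/p1, the inverse of v in the wealth argument."""
    return u - 1.0 / p1


w = 10.0          # wealth (hypothetical)
p0 = 1.0          # initial price of good 1
t = 1.0           # commodity tax on good 1
p1 = p0 + t       # price of good 1 after the tax

tax_revenue = t * demand_good1(p1)                        # T = t * x_1(p1, w)
u1 = indirect_utility(p1, w)                              # utility under the commodity tax
ev_commodity_tax = expenditure(p0, u1) - w                # e(p0, u1) - w
ev_lump_sum = -tax_revenue                                # equivalent variation of a lump-sum tax T
deadweight_loss = w - tax_revenue - expenditure(p0, u1)   # w - T - e(p0, u1)

print(tax_revenue, ev_commodity_tax, ev_lump_sum, deadweight_loss)
# 0.25 -0.5 -0.25 0.25
```

In this example the commodity tax costs the consumer 0.5 in money-metric terms while raising only 0.25 of revenue; the gap of 0.25 is the deadweight loss.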

Exercise 4.11 Cobb-Douglas utility: In Section 4, the Leontiev utility function was used as a running example to illustrate all definitions. Go through the same steps, now using the Cobb-Douglas utility function $u : \mathbb{R}^L_+ \to \mathbb{R}$ defined for all $x \in \mathbb{R}^L_+$ by
\[ u(x) = x_1^{a_1} x_2^{a_2} \cdots x_L^{a_L}, \]
where $a_1, \ldots, a_L > 0$.
4.6. Welfare and Hicksian demand
Assume that the preferences of the consumer are continuous, locally nonsatiated and strictly convex. If the only change is in the price of a single good, equivalent and compensating variation can simply be expressed in terms of the Hicksian demand function. To somewhat simplify notation, we denote for an arbitrary price vector $p$ and an arbitrary $\hat{p}_\ell > 0$ the price vector obtained from $p$ by changing the price of good $\ell$ to $\hat{p}_\ell$ by $(\hat{p}_\ell, p_{-\ell})$.
So given initial prices and wealth $(p^0, w)$, suppose that only the price of good $\ell \in \{1, \ldots, L\}$ is changed to $p^1_\ell \neq p^0_\ell$, giving rise to $(p^1, w) = ((p^1_\ell, p^0_{-\ell}), w)$. Recall that
\[ \frac{\partial e(p, u)}{\partial p_\ell} = h_\ell(p, u) \quad \text{and} \quad e(p^1, u^1) = w. \]
Hence
\[ EV((p^0, w), (p^1, w)) = e(p^0, u^1) - w = e(p^0, u^1) - e(p^1, u^1) = \int_{p^1_\ell}^{p^0_\ell} \frac{\partial e(p_\ell, p^0_{-\ell}, u^1)}{\partial p_\ell} \, dp_\ell = \int_{p^1_\ell}^{p^0_\ell} h_\ell(p_\ell, p^0_{-\ell}, u^1) \, dp_\ell. \tag{21} \]
Similarly,
\[ CV((p^0, w), (p^1, w)) = \int_{p^1_\ell}^{p^0_\ell} h_\ell(p_\ell, p^0_{-\ell}, u^0) \, dp_\ell. \tag{22} \]
This means that the equivalent and compensating variation due to such a simple price change can be represented by areas to the left of the Hicksian demand curve.
Normal goods: Suppose good $\ell$ is a normal good (i.e., its Walrasian demand is weakly increasing in income) and that its price is decreased: $p^0_\ell > p^1_\ell$. We claim that $EV((p^0, w), (p^1, w)) \geq CV((p^0, w), (p^1, w))$. To see this, write $u^0 = v(p^0, w)$ and $u^1 = v(p^1, w)$. Since $v$ is nonincreasing in $p_\ell$, $u^0 \leq u^1$. Since $e$ is increasing in $u$, this implies that $e(p_\ell, p^0_{-\ell}, u^0) \leq e(p_\ell, p^0_{-\ell}, u^1)$ for all $p_\ell > 0$. Since good $\ell$ is normal and $x_\ell(p, e(p, u)) = h_\ell(p, u)$, it follows that
\[ h_\ell(p_\ell, p^0_{-\ell}, u^0) = x_\ell(p_\ell, p^0_{-\ell}, e(p_\ell, p^0_{-\ell}, u^0)) \leq x_\ell(p_\ell, p^0_{-\ell}, e(p_\ell, p^0_{-\ell}, u^1)) = h_\ell(p_\ell, p^0_{-\ell}, u^1) \]
for all $p_\ell > 0$. Combining this with (21) and (22), it follows that
\[ EV((p^0, w), (p^1, w)) - CV((p^0, w), (p^1, w)) = \int_{p^1_\ell}^{p^0_\ell} h_\ell(p_\ell, p^0_{-\ell}, u^1) \, dp_\ell - \int_{p^1_\ell}^{p^0_\ell} h_\ell(p_\ell, p^0_{-\ell}, u^0) \, dp_\ell = \int_{p^1_\ell}^{p^0_\ell} \left[ h_\ell(p_\ell, p^0_{-\ell}, u^1) - h_\ell(p_\ell, p^0_{-\ell}, u^0) \right] dp_\ell \geq 0. \]
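The integral representations (21) and (22) and the inequality $EV \geq CV$ for a price decrease of a normal good can be spot-checked numerically. The sketch below assumes Leontiev utility with $v(p, w) = w / \sum_i a_i p_i$, for which Hicksian demand is $h_\ell(p, u) = a_\ell u$ (an inference from $e(p, u) = u \sum_i a_i p_i$). Leontiev preferences are convex but not strictly convex, so this is an illustration of the formulas rather than an instance of the exact assumptions of this section. All names and numbers are hypothetical.

```python
def indirect_utility(a, p, w):
    """Leontiev indirect utility v(p, w) = w / sum_i a_i p_i."""
    return w / sum(ai * pi for ai, pi in zip(a, p))


def hicksian_demand(a, good, u):
    """Leontiev Hicksian demand a_good * u; it does not depend on prices."""
    return a[good] * u


def integrate(fn, lo, hi, steps=1000):
    """Midpoint rule for the integral of fn over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(fn(lo + (k + 0.5) * width) for k in range(steps)) * width


a = (1.0, 2.0)
w = 12.0
good = 0                      # the good whose price changes
p0 = (2.0, 1.0)               # initial prices
p1 = (1.0, 1.0)               # price of good 0 decreases from 2 to 1

u0 = indirect_utility(a, p0, w)
u1 = indirect_utility(a, p1, w)

# Areas to the left of the Hicksian demand curves, as in (21) and (22).
ev_area = integrate(lambda price: hicksian_demand(a, good, u1), p1[good], p0[good])
cv_area = integrate(lambda price: hicksian_demand(a, good, u0), p1[good], p0[good])

# Direct definitions: EV = e(p0, u1) - w and CV = w - e(p1, u0).
ev_direct = u1 * sum(ai * pi for ai, pi in zip(a, p0)) - w
cv_direct = w - u0 * sum(ai * pi for ai, pi in zip(a, p1))

print(round(ev_area, 6), round(cv_area, 6))  # 4.0 3.0
```

The two ways of computing each measure agree, and the equivalent variation exceeds the compensating variation because the Hicksian demand at the higher utility level $u^1$ lies to the right of the one at $u^0$.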
5. Choices of a producer: classical supply theory
5.1. Production sets
Having treated the demand side of the economy in detail, we now turn to the supply side. The supply side consists of firms that use a technology to convert one set of commodities (inputs) to another (outputs). Just as for consumers, it is assumed that firms take prices as given and that all commodities are traded at the market at publicly quoted prices.
Consider an economy with $L \in \mathbb{N}$ commodities. The firm's production can be described by a production plan or production vector $y = (y_1, \ldots, y_L) \in \mathbb{R}^L$ which gives the net amount produced of each of the $L$ commodities. If $y_\ell < 0$, we say that good $\ell$ is used as an input in the production plan $y$; if $y_\ell > 0$, we say that good $\ell$ is used as an output in $y$. For instance, if $L = 2$, the production plan $y = (-2, 6)$ indicates that two units of the first commodity are used as an input to produce an output of 6 units of the second commodity. The production set of technologically feasible production vectors is denoted by $Y \subseteq \mathbb{R}^L$. This general description allows that a commodity is used as an input in some production vectors, but as an output in others. You may come across the following special cases:
Transformation functions: Sometimes the production set can conveniently be described using a function $F : \mathbb{R}^L \to \mathbb{R}$ called the transformation function as follows:
\[ Y = \{ y \in \mathbb{R}^L : F(y) \leq 0 \} \quad \text{and} \quad F(y) = 0 \text{ if } y \text{ lies on the boundary of } Y. \]
The set of boundary points $\{ y \in \mathbb{R}^L : F(y) = 0 \}$ is called the transformation frontier.
Single-output technologies: In many examples, one of the goods, say good $L$, is an output that is produced using the remaining goods, say $1, \ldots, L-1$, as inputs. These are single-output technologies, typically summarized using a production function $f : \mathbb{R}^{L-1}_+ \to \mathbb{R}$ that assigns to each vector of input quantities $z$ the maximal amount $f(z)$ of output that can be produced from it. One can then write
\[ Y = \{ (-z_1, \ldots, -z_{L-1}, q) \in \mathbb{R}^L : q \leq f(z) \text{ and } z \in \mathbb{R}^{L-1}_+ \}. \]
Consider, for instance, a Cobb-Douglas production function $f : \mathbb{R}^2_+ \to \mathbb{R}$ given by $f(z) = z_1^\alpha z_2^\beta$, where $\alpha, \beta > 0$. Then
\[ Y = \{ (-z_1, -z_2, q) \in \mathbb{R}^3 : q \leq z_1^\alpha z_2^\beta \text{ and } z_1, z_2 \geq 0 \}. \]
5.2. Properties of production sets
Properties that are often imposed on the production set $Y \subseteq \mathbb{R}^L$ include:
$Y$ is nonempty: there is at least one feasible production vector.
Possibility of inaction: $0 \in Y$. It is possible to do nothing, i.e., produce zero outputs from zero inputs.
$Y$ is closed. This assumption is mainly for mathematical convenience.
No free lunch: if $y \in Y \cap \mathbb{R}^L_+$, then $y = 0$. It is not possible to produce positive amounts of output without using inputs.
Free disposal: if $y \in Y$ and $y' \leq y$, then $y' \in Y$. If $y$ is feasible and $y'$ uses at least as much of each input, yet gives no more of the outputs, then also $y'$ is feasible.
Irreversibility: if $y \in Y$ and $y \neq 0$, then $-y \notin Y$. It is impossible to reverse a feasible production vector, i.e., to turn the outputs into the same amount of inputs used to produce it.
Nonincreasing returns to scale: if $y \in Y$ and $\alpha \in [0, 1]$, then $\alpha y \in Y$. This means that feasible production plans can be scaled down.
Nondecreasing returns to scale: if $y \in Y$ and $\alpha \geq 1$, then $\alpha y \in Y$. This means that feasible production plans can be scaled up.
Constant returns to scale (CRS): if $y \in Y$ and $\alpha \geq 0$, then $\alpha y \in Y$. This is the conjunction of the previous two properties.
Additivity/free entry: if $y, y' \in Y$, then $y + y' \in Y$. If both $y$ and $y'$ are feasible, then it is feasible to set up two independent plants, one producing $y$, the other $y'$, together yielding $y + y'$.
$Y$ is convex: if $y, y' \in Y$ and $\alpha \in [0, 1]$, then $\alpha y + (1 - \alpha) y' \in Y$.
$Y$ is a convex cone: if $y, y' \in Y$ and $\alpha, \beta \geq 0$, then $\alpha y + \beta y' \in Y$.
One easily establishes relations between these properties. Possibility of inaction implies nonemptiness. Nondecreasing and nonincreasing returns to scale imply constant returns to scale. Some less trivial ones are:
Proposition 5.1 Let $Y \subseteq \mathbb{R}^L$ be a production set.
(a) If $Y$ is convex and $0 \in Y$, then $Y$ has nonincreasing returns to scale.
(b) $Y$ is a convex cone if and only if $Y$ is convex and has constant returns to scale.
(c) $Y$ is a convex cone if and only if $Y$ is additive and has nonincreasing returns to scale.
(d) If $Y$ satisfies no free lunch and for all $x, y \in Y$ and $\alpha \in (0, 1)$, there is a $z \in Y$ with $z \geq \alpha x + (1 - \alpha) y$, $z \neq \alpha x + (1 - \alpha) y$, then $Y$ satisfies irreversibility.
Proof. (a): Let $y \in Y$ and $\alpha \in [0, 1]$. By convexity, $\alpha y + (1 - \alpha) 0 = \alpha y \in Y$.
(b): Assume $Y$ is a convex cone. Then $Y$ is trivially convex. To establish CRS, let $y \in Y$ and $\alpha \geq 0$. Since $Y$ is a convex cone, $\alpha y = \frac{1}{2} \alpha y + \frac{1}{2} \alpha y \in Y$. Conversely, assume $Y$ is convex and has CRS. To show that $Y$ is a convex cone, let $y, y' \in Y$ and $\alpha, \beta \geq 0$. By CRS, $2 \alpha y \in Y$ and $2 \beta y' \in Y$. By convexity, $\frac{1}{2}(2 \alpha y) + \frac{1}{2}(2 \beta y') = \alpha y + \beta y' \in Y$.
(c): If $Y$ is a convex cone, it is additive (take $\alpha = \beta = 1$) and has nonincreasing returns to scale (similar to the proof of CRS above). Conversely, assume that $Y$ is additive and has nonincreasing returns to scale. Let $y, y' \in Y$ and $\alpha, \beta \geq 0$. By additivity, $k y \in Y$ and $k y' \in Y$ for all $k \in \mathbb{N}$. Choose $k \in \mathbb{N}$ such that $\alpha / k \leq 1$ and $\beta / k \leq 1$. Since $Y$ has nonincreasing returns to scale, $(\alpha / k) y \in Y$ and $(\beta / k) y' \in Y$. By additivity: $\alpha y = k (\alpha / k) y \in Y$ and $\beta y' \in Y$. Again by additivity, $\alpha y + \beta y' \in Y$.
(d): Let $y \in Y$, $y \neq 0$, and suppose $-y \in Y$. By assumption, there is a $z \in Y$ such that $z \geq \frac{1}{2} y + \frac{1}{2} (-y) = 0$, $z \neq 0$, contradicting no free lunch.
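In the special case of a production function, properties of the production set are related to properties of the production function. For instance:
Proposition 5.2 Consider a single-output technology with production function $f : \mathbb{R}^{L-1}_+ \to \mathbb{R}$ with $f(0, \ldots, 0) = 0$, so that
\[ Y = \{ (-z, q) \in \mathbb{R}^L : q \leq f(z) \text{ and } z \in \mathbb{R}^{L-1}_+ \}. \]
(a) $Y$ has constant returns to scale if and only if $f$ is homogeneous of degree one.
(b) $Y$ is convex if and only if $f$ is concave.
Proof. (a): First, assume that $Y$ satisfies constant returns to scale. We show that for each $z \in \mathbb{R}^{L-1}_+$ and $\alpha > 0$: $f(\alpha z) \geq \alpha f(z)$. So let $z \in \mathbb{R}^{L-1}_+$ and $\alpha > 0$. By definition of the production set $Y$, $(-z, f(z)) \in Y$. By CRS, this implies $(-\alpha z, \alpha f(z)) \in Y$. By definition of the production set $Y$, $(-\alpha z, \alpha f(z)) \in Y$ means that $\alpha f(z) \leq f(\alpha z)$.
Next, we show that for each $z \in \mathbb{R}^{L-1}_+$ and $\alpha > 0$: $f(\alpha z) \leq \alpha f(z)$. So let $z \in \mathbb{R}^{L-1}_+$ and $\alpha > 0$. Fix $z' = \alpha z \in \mathbb{R}^{L-1}_+$ and $\alpha' = 1/\alpha > 0$. Apply the result above to $z'$ and $\alpha'$: $\alpha' f(z') \leq f(\alpha' z')$. Substitute $z' = \alpha z$ and $\alpha' = 1/\alpha$: $(1/\alpha) f(\alpha z) \leq f((1/\alpha) \alpha z) = f(z)$. Multiply both sides with $\alpha$: $f(\alpha z) \leq \alpha f(z)$.
Using the above, it follows that if $Y$ has CRS, then for each $z \in \mathbb{R}^{L-1}_+$ and each $\alpha > 0$: $f(\alpha z) = \alpha f(z)$. So $f$ is homogeneous of degree one.
Conversely, assume that $f$ is homogeneous of degree one: for each $z \in \mathbb{R}^{L-1}_+$ and each $\alpha > 0$: $f(\alpha z) = \alpha f(z)$. To show: $Y$ has CRS, i.e., if $(-z, q) \in Y$ and $\alpha \geq 0$, then $(-\alpha z, \alpha q) \in Y$. This follows from the assumption that $f(0, \ldots, 0) = 0$ if $\alpha = 0$. So let $(-z, q) \in Y$ and $\alpha > 0$. By definition of the production set $Y$, $q \leq f(z)$. Multiply both sides with $\alpha > 0$: $\alpha q \leq \alpha f(z)$. Since $f$ is homogeneous of degree one, $\alpha f(z) = f(\alpha z)$, so $\alpha q \leq \alpha f(z) = f(\alpha z)$. By definition of the production set $Y$, $\alpha q \leq f(\alpha z)$ together with $\alpha z \in \mathbb{R}^{L-1}_+$ implies that $(-\alpha z, \alpha q) \in Y$.
(b): The function $f : \mathbb{R}^{L-1}_+ \to \mathbb{R}$ is concave if and only if its subgraph
\[ \{ (z, q) \in \mathbb{R}^{L-1}_+ \times \mathbb{R} : q \leq f(z) \} = \{ (z, q) \in \mathbb{R}^L : q \leq f(z) \text{ and } z \in \mathbb{R}^{L-1}_+ \} \]
is convex. Multiplying the first $L - 1$ coordinates with $-1$ maintains convexity, so this is equivalent with
\[ \{ (-z, q) \in \mathbb{R}^L : q \leq f(z) \text{ and } z \in \mathbb{R}^{L-1}_+ \} = Y \]
being convex.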
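Proposition 5.2(a) can be illustrated with a numerical spot check. The sketch below assumes a two-input Cobb-Douglas production function with hypothetical exponents and tests membership of scaled production vectors $(-z_1, -z_2, q)$ in $Y$. It checks a handful of points only and proves nothing in general.

```python
def cobb_douglas(z, exponents):
    """f(z) = prod_i z_i ** exponents_i."""
    out = 1.0
    for zi, ei in zip(z, exponents):
        out *= zi ** ei
    return out


def in_production_set(y, exponents, tol=1e-9):
    """y = (-z_1, ..., -z_{L-1}, q) is feasible if z >= 0 and q <= f(z)."""
    *neg_z, q = y
    z = [-v for v in neg_z]
    return all(zi >= 0 for zi in z) and q <= cobb_douglas(z, exponents) + tol


def scale(y, alpha):
    """Scale a production vector by alpha."""
    return tuple(alpha * v for v in y)


# Exponents summing to one: f is homogeneous of degree one, so Y has CRS.
crs = (0.5, 0.5)
y_crs = (-4.0, -1.0, 2.0)            # on the frontier: sqrt(4 * 1) = 2
print(in_production_set(scale(y_crs, 3.0), crs))     # True
print(in_production_set(scale(y_crs, 0.25), crs))    # True

# Exponents summing to less than one: scaling down works, scaling up fails.
drs = (0.25, 0.25)
y_drs = (-16.0, -1.0, 2.0)           # on the frontier: 16 ** 0.25 = 2
print(in_production_set(scale(y_drs, 0.5), drs))     # True
print(in_production_set(scale(y_drs, 2.0), drs))     # False
```

The small tolerance `tol` only guards against floating point rounding on frontier points; it is not part of the definition of $Y$.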
5.3. The profit maximization problem
The production set $Y$ specifies a firm's set of feasible options. To make the choice problem of the firm complete, we have to endow it with preferences. These preferences are particularly simple. It is assumed that firms maximize profits given the commodity prices and the firm's production set: given production set $Y \subseteq \mathbb{R}^L$ and a price vector $p \in \mathbb{R}^L_{++}$, the profit maximization problem (PMP) is
\[ \max p \cdot y \quad \text{s.t. } y \in Y. \]
The profit function $\pi$ assigns to every price vector $p \in \mathbb{R}^L_{++}$ the maximal profit
\[ \pi(p) = \max \{ p \cdot y : y \in Y \}. \]
The supply correspondence $y(\cdot)$ assigns to every price vector $p \in \mathbb{R}^L_{++}$ the set of profit-maximizing production vectors:
\[ y(p) = \{ y \in Y : p \cdot y = \pi(p) \}. \]
As opposed to the utility maximization problem, which has a solution under mild conditions (like continuity of the utility function), there may not be a solution to the PMP: profits may be unbounded. In that case, we set $\pi(p) = +\infty$. Indeed, we may have the following:
Proposition 5.3 Let $Y \subseteq \mathbb{R}^L$ be nonempty and satisfy nondecreasing returns to scale. For each price vector $p \in \mathbb{R}^L_{++}$, either $p \cdot y \leq 0$ for all $y \in Y$, which means that no positive profit can be made, or $\pi(p) = +\infty$.
Proof. Consider a price vector $p \in \mathbb{R}^L_{++}$. Suppose that $p \cdot y > 0$ for some $y \in Y$. Since $Y$ has nondecreasing returns to scale, $\alpha y \in Y$ for all $\alpha \geq 1$, so $p \cdot (\alpha y) = \alpha (p \cdot y)$ can be made arbitrarily large by letting $\alpha$ go to infinity.
This makes the existence of solutions to the PMP a nontrivial issue. The following two results provide sufficient conditions.
Proposition 5.4 Assume that the production set $Y \subseteq \mathbb{R}^L$ is:
nonempty,
closed,
bounded above: there is an $r \in \mathbb{R}$ such that $y_\ell \leq r$ for all $y \in Y$ and all $\ell \in \{1, \ldots, L\}$.
Then the profit maximization problem has at least one solution for each price vector $p \in \mathbb{R}^L_{++}$.
Proof. Let $p \in \mathbb{R}^L_{++}$. By nonemptiness, there is a $y' \in Y$. A solution to the PMP must lie in the set $P = Y \cap \{ y \in \mathbb{R}^L : p \cdot y \geq p \cdot y' \}$.
$P$ is closed: $Y$ is closed by assumption and the second set in the intersection is closed, since it is the upper contour set of a continuous function. The intersection of two closed sets is closed.
$P$ is bounded: By assumption, the coordinates of vectors in $P$ are bounded above by $r$. Moreover, all coordinates are bounded from below as well: let $y \in P$ and consider an arbitrary coordinate $\ell \in \{1, \ldots, L\}$. Since $p \cdot y \geq p \cdot y'$, it follows that
\[ p_\ell y_\ell \geq p \cdot y' - \sum_{k \neq \ell} p_k y_k \geq p \cdot y' - \sum_{k \neq \ell} p_k r, \]
so $y_\ell$ is bounded from below by $\left( p \cdot y' - \sum_{k \neq \ell} p_k r \right) / p_\ell$.
Hence, $P$ is compact. Since we maximize a continuous profit function over the compact set $P$, there is at least one solution.
The following result establishes existence of solutions to the profit maximization problem under resource constraints.
Proposition 5.5 Assume that the production set $Y \subseteq \mathbb{R}^L$:
satisfies possibility of inaction,
satisfies no free lunch,
is closed,
is convex,
has a resource constraint: there is a nonzero vector $\omega \in \mathbb{R}^L_+$ of inputs restricting feasible production to vectors $y \in Y$ with $y \geq -\omega$.
Then the profit maximization problem has at least one solution for each price vector $p \in \mathbb{R}^L_{++}$.
Exercise 5.1 This exercise guides you through the proof of Proposition 5.5.
(a) Show that $Y_\omega = Y \cap \{ y \in \mathbb{R}^L : y \geq -\omega \}$ is nonempty and closed.
To show that $Y_\omega = Y \cap \{ y \in \mathbb{R}^L : y \geq -\omega \}$ is bounded, suppose it were not: there is a sequence $(y^n)_{n \in \mathbb{N}}$ of vectors in $Y_\omega$ whose increasing length $\| y^n \|$ diverges to infinity. Define $z^n = y^n / \| y^n \|$.
(b) Show that for $n \in \mathbb{N}$ large enough, $z^n$ lies in $Y$ and satisfies $z^n + \omega / \| y^n \| \geq 0$.
(c) Show that $(z^n)_{n \in \mathbb{N}}$ has a convergent subsequence with limit $z \neq 0$ in $Y$.
(d) Combine this with (b) to derive a contradiction.
As $Y_\omega$ is nonempty and compact and the profit function is continuous, a maximum exists!
Thus, whenever we talk about properties of the profit function and the supply correspondence, we implicitly assume that the PMP has a solution, so that $y(p) \neq \emptyset$ and $\pi(p) < \infty$.
Proposition 5.6 Consider a firm with production set $Y \subseteq \mathbb{R}^L$.
(a) The profit function is homogeneous of degree one, the supply correspondence is homogeneous of degree zero.
(b) The profit function is convex.
(c) If $Y$ is convex, $y(p)$ is a convex set for all $p \in \mathbb{R}^L_{++}$.
(d) Hotelling's lemma: Let $p \in \mathbb{R}^L_{++}$. If $y(p)$ consists of a single point $y$, then the profit function is differentiable at $p$ and $\partial \pi(p) / \partial p_\ell = y_\ell$ for all goods $\ell = 1, \ldots, L$.
(e) Law of supply: for all $p, p' \in \mathbb{R}^L_{++}$ and all $y \in y(p)$ and $y' \in y(p')$: $(p - p') \cdot (y - y') \geq 0$.
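Proof. (a): Do this yourself.
(b): We give two proofs. First proof: we show that the epigraph $\mathrm{epi}(\pi) = \{ (p, v) \in \mathbb{R}^L_{++} \times \mathbb{R} : v \geq \pi(p) \}$ is a convex set. Let $(p^1, v^1), (p^2, v^2) \in \mathrm{epi}(\pi)$ and $\alpha \in [0, 1]$. To show: $\alpha v^1 + (1 - \alpha) v^2 \geq \pi(\alpha p^1 + (1 - \alpha) p^2)$. So let $y \in y(\alpha p^1 + (1 - \alpha) p^2)$. Then $p^i \cdot y \leq \pi(p^i) \leq v^i$ for both $i = 1, 2$, so
\[ \pi(\alpha p^1 + (1 - \alpha) p^2) = \alpha p^1 \cdot y + (1 - \alpha) p^2 \cdot y \leq \alpha v^1 + (1 - \alpha) v^2. \]
Second proof: we show that for all $p^1, p^2 \in \mathbb{R}^L_{++}$ and all $\alpha \in [0, 1]$: $\pi(\alpha p^1 + (1 - \alpha) p^2) \leq \alpha \pi(p^1) + (1 - \alpha) \pi(p^2)$. So let $p^1, p^2 \in \mathbb{R}^L_{++}$ and $\alpha \in [0, 1]$. Let $y \in y(\alpha p^1 + (1 - \alpha) p^2)$. Then $p^i \cdot y \leq \pi(p^i)$ for both $i = 1, 2$, so
\[ \pi(\alpha p^1 + (1 - \alpha) p^2) = \alpha p^1 \cdot y + (1 - \alpha) p^2 \cdot y \leq \alpha \pi(p^1) + (1 - \alpha) \pi(p^2). \]
(c): Let $p \in \mathbb{R}^L_{++}$. Then $y(p) = Y \cap \{ y \in \mathbb{R}^L : p \cdot y = \pi(p) \}$ is the intersection of $Y$ and a hyperplane. Since both are convex, so is $y(p)$.
(d): We prove Hotelling's lemma, assuming that $\pi$ is differentiable at $p$. By definition of the profit function we know that for all $p' \in \mathbb{R}^L_{++}$: $p' \cdot y \leq \pi(p')$, with equality if $p' = p$. So the function $h : \mathbb{R}^L_{++} \to \mathbb{R}$ with $h(p') = \pi(p') - p' \cdot y$ achieves its minimum at $p$. But then its partial derivatives at $p$ must be zero:
\[ \forall \ell = 1, \ldots, L : \quad \frac{\partial h(p)}{\partial p_\ell} = \frac{\partial \pi(p)}{\partial p_\ell} - y_\ell = 0, \]
proving Hotelling's lemma.
(e): Notice that
\[ (p - p') \cdot (y - y') = (p \cdot y - p \cdot y') + (p' \cdot y' - p' \cdot y) \geq 0, \]
where the inequality follows from the definition of profit maximizers: $p \cdot y = \pi(p) \geq p \cdot y'$ and $p' \cdot y' = \pi(p') \geq p' \cdot y$.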
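Hotelling's lemma and the law of supply can be checked numerically on a simple technology. The sketch below assumes the single-input square-root technology $f(z) = \sqrt{z}$ worked out in the example of Section 5.6, whose supply is $(-(p/2w)^2, p/2w)$; the price vector is ordered as (input price $w$, output price $p$), and derivatives are approximated by central differences. Function names and numbers are illustrative.

```python
def supply(p, w):
    """Profit-maximizing production vector (-z, q) for f(z) = sqrt(z)."""
    q = p / (2.0 * w)
    return (-q * q, q)


def profit(p, w):
    """Maximal profit: the price vector (w, p) times the supply vector."""
    neg_z, q = supply(p, w)
    return w * neg_z + p * q


def central_difference(fn, x, h=1e-6):
    """Two-sided finite-difference approximation of fn'(x)."""
    return (fn(x + h) - fn(x - h)) / (2.0 * h)


p, w = 3.0, 2.0
y = supply(p, w)

# Hotelling's lemma: the gradient of the profit function equals the supply vector.
dprofit_dw = central_difference(lambda w_: profit(p, w_), w)   # should be close to y[0]
dprofit_dp = central_difference(lambda p_: profit(p_, w), p)   # should be close to y[1]

# Law of supply: (price change) . (supply change) >= 0.
p_alt, w_alt = 2.0, 1.0
y_alt = supply(p_alt, w_alt)
law_of_supply = (w - w_alt) * (y[0] - y_alt[0]) + (p - p_alt) * (y[1] - y_alt[1])

print(y, law_of_supply)  # (-0.5625, 0.75) 0.1875
```

The derivative with respect to the input price is the negative of the input use, which is exactly the first coordinate of the net production vector.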
5.4. Solving the PMP
Just like in the utility maximization problem UMP, the Kuhn-Tucker conditions can be used to find necessary first order conditions for the profit maximization problem PMP: if the production set is $Y = \{ y \in \mathbb{R}^L : F(y) \leq 0 \}$, where $F$ is continuously differentiable and the price vector is $p \in \mathbb{R}^L_{++}$, a necessary first order condition for $y^*$ to be a solution to the PMP
\[ \max p \cdot y \quad \text{s.t. } F(y) \leq 0 \]
is that there exists a Lagrange multiplier $\lambda \geq 0$ such that for each good $\ell = 1, \ldots, L$:
\[ p_\ell = \lambda \frac{\partial F(y^*)}{\partial y_\ell}. \tag{23} \]
If we divide the first order condition for good $\ell$ with that for good $k$, we find that for all pairs of goods $\ell, k$:
\[ \frac{p_\ell}{p_k} = \frac{\partial F(y^*) / \partial y_\ell}{\partial F(y^*) / \partial y_k}, \]
i.e., in an optimal production plan $y^*$, the price ratio between two goods equals its so-called marginal rate of transformation. If the set $Y$ is convex, the first order conditions in (23) are also sufficient for a solution to the PMP.
In the single-output case, assume the production function $f$ is differentiable and that the price of input $\ell = 1, \ldots, L - 1$ equals $w_\ell > 0$ and the price of the output equals $p > 0$.
Remark 5.7 I don't know the reason for this sudden change of notation from a price vector $p$ to an output-input price vector $(p, w)$. Do not confuse the vector $w$ of input prices with the wealth level $w$ of the consumer. This choice of notation is unfortunate, but widespread in economics.
The PMP can be rewritten as
\[ \max p f(z) - w \cdot z \quad \text{s.t. } z \in \mathbb{R}^{L-1}_+. \]
If $z^*$ is optimal, the Kuhn-Tucker conditions imply the existence of Lagrange multipliers $\mu_\ell \geq 0$ for each of the conditions $z_\ell \geq 0$ such that for all inputs $\ell = 1, \ldots, L - 1$:
\[ p \frac{\partial f(z^*)}{\partial z_\ell} - w_\ell = -\mu_\ell \quad \text{and} \quad \mu_\ell z^*_\ell = 0. \tag{24} \]
Assuming an interior solution ($z^*_\ell > 0$ for all $\ell$), this implies that $\mu_\ell = 0$ for all $\ell$, so the first order conditions become
\[ \forall \ell = 1, \ldots, L - 1 : \quad p \frac{\partial f(z^*)}{\partial z_\ell} = w_\ell, \tag{25} \]
so that for all inputs $\ell, k$:
\[ \frac{w_\ell}{w_k} = \frac{\partial f(z^*) / \partial z_\ell}{\partial f(z^*) / \partial z_k}, \]
which has the interpretation that the price ratio between two goods has to equal their so-called marginal rate of technical substitution. Again, if the set $Y$ is convex, the first order conditions in (24) are also sufficient for a solution to the PMP.
5.5. The cost minimization problem
In a profit maximizing production plan, there is no way to produce the same amount of outputs at a lower total input cost. This motivates a study of the cost minimization problem (CMP), which we consider only in the single-output case. Assume the production function is $f : \mathbb{R}^{L-1}_+ \to \mathbb{R}$ and the input price vector is $w \in \mathbb{R}^{L-1}_{++}$. We want to produce at least an amount $q$ of the output. What is the minimal amount we have to spend on inputs to achieve this? The answer is given by the CMP:
\[ \min w \cdot z \quad \text{s.t. } z \in \mathbb{R}^{L-1}_+,\ f(z) \geq q. \]
The conditional factor demand correspondence assigns to each vector $w \in \mathbb{R}^{L-1}_{++}$ of input prices and each output level $q$ the associated set $z(w, q)$ of solutions to the CMP:
\[ z(w, q) = \{ z \in \mathbb{R}^{L-1}_+ : f(z) \geq q \text{ and } w \cdot z \leq w \cdot z' \text{ for all } z' \in \mathbb{R}^{L-1}_+ \text{ with } f(z') \geq q \}. \]
The conditional factor demand correspondence specifies the set of input vectors solving the CMP, the cost function $c(w, q)$ indicates its value:
\[ c(w, q) = \min_{z \in \mathbb{R}^{L-1}_+,\ f(z) \geq q} w \cdot z = w \cdot z^* \quad \text{for all } z^* \in z(w, q). \]
The cost minimization problem and the expenditure minimization problem
\[ \min p \cdot x \quad \text{s.t. } x \in \mathbb{R}^L_+,\ u(x) \geq u \]
are identical, up to a relabeling of the involved functions. Therefore, rewriting Propositions 4.5, 4.6, and 4.8 provides a long list of properties for conditional factor demand and the cost function.
If the production function $f$ is continuously differentiable, the Kuhn-Tucker conditions can be used to show that at a solution $z^*$ of the CMP, there must be a Lagrange multiplier $\lambda \geq 0$ associated with the condition $q - f(z) \leq 0$ and Lagrange multipliers $\mu_\ell \geq 0$ associated with the conditions $z_\ell \geq 0$ such that for all $\ell = 1, \ldots, L - 1$:
\[ w_\ell = \lambda \frac{\partial f(z^*)}{\partial z_\ell} + \mu_\ell \quad \text{and} \quad \mu_\ell z^*_\ell = 0. \]
If the solution uses positive amounts of all inputs ($z^*_\ell > 0$ for all $\ell$), this implies that $\mu_\ell = 0$ for all $\ell$, so
\[ w_\ell = \lambda \frac{\partial f(z^*)}{\partial z_\ell} \]
for all $\ell$ and consequently
\[ \frac{w_\ell}{w_k} = \frac{\partial f(z^*) / \partial z_\ell}{\partial f(z^*) / \partial z_k}, \]
as in (25)!
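The interior first order condition for the CMP can be illustrated on a two-input Cobb-Douglas production function $f(z) = z_1^a z_2^b$. The sketch below uses the closed-form conditional factor demands that follow from combining the condition $w_1 / w_2 = (a z_2) / (b z_1)$ with $f(z) = q$, and compares the resulting cost with a brute-force search along the isoquant. The exponents, prices and grid are assumptions made for the illustration.

```python
def conditional_factor_demand(w1, w2, q, a, b):
    """Interior CMP solution for f(z) = z1**a * z2**b, with q > 0."""
    s = a + b
    z1 = q ** (1.0 / s) * (a * w2 / (b * w1)) ** (b / s)
    z2 = q ** (1.0 / s) * (b * w1 / (a * w2)) ** (a / s)
    return z1, z2


def cost(w1, w2, q, a, b):
    """Cost function c(w, q) = w . z(w, q)."""
    z1, z2 = conditional_factor_demand(w1, w2, q, a, b)
    return w1 * z1 + w2 * z2


a, b = 0.5, 0.5
w1, w2, q = 1.0, 4.0, 2.0

z1, z2 = conditional_factor_demand(w1, w2, q, a, b)
mrts = (a * z2) / (b * z1)            # ratio of marginal products at the optimum

# Brute force: walk along the isoquant f(z) = q and record the cheapest bundle.
grid = [k / 100.0 for k in range(1, 2001)]                 # candidate z1 values
brute_force_cost = min(
    w1 * g + w2 * (q / g ** a) ** (1.0 / b) for g in grid  # z2 chosen so that f(z) = q
)

print(z1, z2, cost(w1, w2, q, a, b))  # 4.0 1.0 8.0
```

At the optimum the marginal rate of technical substitution equals the input price ratio $w_1 / w_2 = 0.25$, and no bundle on the isoquant is cheaper than the closed-form solution.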
5.6. Linking the PMP and the CMP
In the case of a single-output economy with production function $f : \mathbb{R}^{L-1}_+ \to \mathbb{R}_+$, input price vector $w \in \mathbb{R}^{L-1}_{++}$, and output price $p > 0$, the PMP becomes
\[ \max pq - w \cdot z \quad \text{s.t. } q \leq f(z),\ z \in \mathbb{R}^{L-1}_+. \]
The set of solutions is commonly denoted as $y(p, w)$ and the maximal profit as $\pi(p, w)$. In a solution $(z, q)$, positivity of the output price ($p > 0$) implies that $q = f(z)$, otherwise the profit can be increased: $pq - w \cdot z < p f(z) - w \cdot z$. Consequently, the PMP simplifies to
\[ \max_{z \in \mathbb{R}^{L-1}_+} p f(z) - w \cdot z. \]
Moreover, production has to be as cheap as possible, so there is a link with the CMP:
Proposition 5.8 Consider a production function $f : \mathbb{R}^{L-1}_+ \to \mathbb{R}_+$ such that for each $q \geq 0$, the set $\{ z \in \mathbb{R}^{L-1}_+ : f(z) \geq q \}$ is nonempty and closed. Let $w \in \mathbb{R}^{L-1}_{++}$ be the vector of input prices, $p > 0$ the output price. Consider the optimization problems
(P1) $\max_{z \in \mathbb{R}^{L-1}_+} p f(z) - w \cdot z$,
(P2) $\max_{q \geq 0} pq - c(w, q)$.
The following claims are true:
(a) For each $z \in \mathbb{R}^{L-1}_+$, there is a $q_z \geq 0$ with $p f(z) - w \cdot z \leq p q_z - c(w, q_z)$.
(b) For each $q \geq 0$, there is a $z_q \in \mathbb{R}^{L-1}_+$ with $p f(z_q) - w \cdot z_q \geq pq - c(w, q)$.
(c) If one of the problems (P1) and (P2) has a solution, so does the other and the corresponding maximum values coincide:
\[ \max_{z \in \mathbb{R}^{L-1}_+} p f(z) - w \cdot z = \max_{q \geq 0} pq - c(w, q). \]
Exercise 5.2 Prove Proposition 5.8.
The PMP as formulated in (P2) is particularly easy: given the cost function, the PMP reduces to a single-variable maximization problem. In practice, this is often the easiest way to solve the PMP. Under suitable differentiability assumptions, the necessary Kuhn-Tucker condition at an optimum $q^*$ is that there exists a Lagrange multiplier $\mu \geq 0$ associated with the condition $q \geq 0$ such that:
\[ p - \frac{\partial c(w, q^*)}{\partial q} = -\mu \quad \text{and} \quad \mu q^* = 0. \]
Assuming $q^* > 0$, this means that $\mu = 0$ and hence that price equals marginal costs at a profit maximizing quantity. If the cost function is convex in $q$, this condition is also sufficient.
Example: some calculations in a single-output economy: Consider a technology using a single input to produce a single output via the production function $f : \mathbb{R}_+ \to \mathbb{R}$ with $f(z) = \sqrt{z}$ for all $z \geq 0$. The production set is
\[ Y = \{ (-z, q) \in \mathbb{R}^2 : q \leq f(z),\ z \geq 0 \} = \{ y \in \mathbb{R}^2 : y_1 \leq 0,\ y_2 \leq \sqrt{-y_1} \}. \]
Assume that the input price is $w > 0$ and the output price is $p > 0$. The profit maximization problem (P1) becomes
\[ \max_{z \geq 0} p \sqrt{z} - w z. \]
At $z = 0$, the profit is zero. At an interior solution $z > 0$, the following first order condition must be satisfied:
\[ \frac{p}{2 \sqrt{z}} - w = 0, \]
so $z = \left( \frac{p}{2w} \right)^2$, yielding output $\sqrt{z} = \frac{p}{2w}$ and profit $\frac{p^2}{2w} - \frac{p^2}{4w} = \frac{p^2}{4w} > 0$. Conclude that the supply function is
\[ y(p, w) = \left( - \left( \frac{p}{2w} \right)^2, \frac{p}{2w} \right) \tag{26} \]
and the profit function $\pi(p, w) = \frac{p^2}{4w}$.
The cost minimization problem for production level $q \geq 0$ is
\[ \min w z \quad \text{s.t. } z \geq 0,\ \sqrt{z} \geq q. \]
At an optimum $z^*$, it is clear that $\sqrt{z^*} = q$: no inputs are wasted. Hence the conditional factor demand is $z^* = z(w, q) = q^2$ and the cost function is $c(w, q) = w q^2$. This allows us to rewrite the profit maximization problem as in (P2):
\[ \max_{q \geq 0} pq - c(w, q) = \max_{q \geq 0} pq - w q^2. \]
Solving this optimization problem yields an optimal output quantity $q = \frac{p}{2w}$ as in (26).
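The equality of the values of (P1) and (P2) in this example can be confirmed by brute force. The following sketch evaluates both objective functions on a grid and compares the maxima with the closed form $p^2 / (4w)$; the prices and the grid are arbitrary illustrative choices.

```python
import math


def profit_p1(z, p, w):
    """Objective of (P1): revenue p * sqrt(z) minus input cost w * z."""
    return p * math.sqrt(z) - w * z


def cost(w, q):
    """Cost function c(w, q) = w * q**2 of the square-root technology."""
    return w * q * q


def profit_p2(q, p, w):
    """Objective of (P2): revenue p * q minus minimal cost c(w, q)."""
    return p * q - cost(w, q)


p, w = 4.0, 1.0
grid = [k / 1000.0 for k in range(0, 10001)]       # 0.000, 0.001, ..., 10.000

best_p1 = max(profit_p1(z, p, w) for z in grid)    # maximize over input levels z
best_p2 = max(profit_p2(q, p, w) for q in grid)    # maximize over output levels q
closed_form = p * p / (4.0 * w)                    # profit function from the text

print(best_p1, best_p2, closed_form)
```

With these prices the optimal input is $z = 4$ and the optimal output is $q = 2$, both of which lie on the grid, so the three numbers agree up to floating point rounding.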
5.7. Efficiency
A production plan $y \in Y$ is efficient if there is no $y' \in Y$ with $y' \geq y$ and $y' \neq y$. In words, there is no different production plan producing at least as much output while using at most as much input. There is a close connection between profit maximization and efficiency:
Proposition 5.9 Consider a production set $Y \subseteq \mathbb{R}^L$.
(a) If $y \in Y$ maximizes profits at prices $p \in \mathbb{R}^L_{++}$, i.e., if $y \in y(p)$, then $y$ is efficient.
(b) If $Y$ is convex, then for every efficient $y^* \in Y$ there is a nonzero price vector $p \in \mathbb{R}^L_+$ such that $y^*$ is profit maximizing at prices $p$.
Proof. (a): Suppose $y$ is not efficient: there is a $y' \in Y$ with $y' \geq y$, $y' \neq y$. Then $p \cdot y' > p \cdot y$: the profit from $y'$ exceeds that from the profit-maximizing $y$, a contradiction.
(b): Let $Z = \{ y \in \mathbb{R}^L : y > y^* \}$. Since $y^*$ is efficient: $Z \cap Y = \emptyset$. By the separating hyperplane theorem, there is a vector $p \in \mathbb{R}^L$, $p \neq 0$ such that $p \cdot y \geq p \cdot y'$ for all $y \in Z$ and $y' \in Y$. Two things remain to be shown:
Firstly, that $p \in \mathbb{R}^L_+$. Suppose, to the contrary, that $p_\ell < 0$ for some coordinate $\ell$. Then $p \cdot y < p \cdot y^*$ for some $y \in Z$ with $y_\ell - y^*_\ell > 0$ sufficiently large. A contradiction.
Secondly, that $y^*$ is profit maximizing at prices $p$. Let $y' \in Y$. To show: $p \cdot y^* \geq p \cdot y'$. For each $n \in \mathbb{N}$, define the vector $y^n = (y^*_1 + 1/n, \ldots, y^*_L + 1/n) \in Z$. Then $p \cdot y^n \geq p \cdot y'$. Since $y^n \to y^*$, it follows that also in the limit $p \cdot y^* \geq p \cdot y'$.
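Part (a) of Proposition 5.9 is easy to check on a finite production set. The sketch below is only an illustration: the set `production_set` and the price vector are hypothetical, and a finite set is used purely so that efficiency and profit maximization can be verified by enumeration.

```python
def dominates(y_other, y):
    """True if y_other >= y in every coordinate and y_other differs from y."""
    return all(a >= b for a, b in zip(y_other, y)) and tuple(y_other) != tuple(y)


def is_efficient(y, production_set):
    """y is efficient if no feasible plan dominates it."""
    return not any(dominates(other, y) for other in production_set)


def profit_of(p, y):
    """Profit p . y of production plan y at prices p."""
    return sum(pi * yi for pi, yi in zip(p, y))


def profit_maximizers(p, production_set):
    """All plans in the (finite) production set with maximal profit at prices p."""
    best = max(profit_of(p, y) for y in production_set)
    return [y for y in production_set if profit_of(p, y) == best]


# Hypothetical finite production set: (net input, net output) pairs.
production_set = [(0, 0), (-1, 1), (-2, 1), (-4, 2), (-4, 1)]
prices = (1, 3)                       # strictly positive prices

maximizers = profit_maximizers(prices, production_set)
print(maximizers)                                            # [(-1, 1), (-4, 2)]
print(all(is_efficient(y, production_set) for y in maximizers))   # True
print(is_efficient((-2, 1), production_set))                 # False: (-1, 1) dominates it
```

Integer coordinates are used so that the equality test on profits is exact.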
Exercise 5.3 This exercise investigates the need for the different assumptions in Proposition 5.9.
(a) Give an example of a production set $Y \subseteq \mathbb{R}^2$, a point $y \in Y$ and a price vector $p \in \mathbb{R}^2_+$, $p \neq (0, 0)$, such that $y$ maximizes profits at prices $p$, but $y$ is not efficient.
(b) Give an example of a convex production set $Y \subseteq \mathbb{R}^2$ and a point $y \in Y$ which is efficient but not profit maximizing for any $p \in \mathbb{R}^2_{++}$.
(c) Give an example of a production set $Y \subseteq \mathbb{R}^2$ which is not convex and a point $y \in Y$ which is efficient, but not profit maximizing for any nonzero price vector $p \in \mathbb{R}^2_+$.
6. General equilibrium
6.1. What is an equilibrium?
Earlier, we studied how consumers choose optimal consumption bundles given their preferences, wealth, and the price vector and how firms choose optimal production plans given their technology and the price vector. Are there price vectors where all these optimal choices are actually feasible? You don't, for instance, want people demanding ten apples if there only are five. Such a price vector and the corresponding demand and supply constitute a Walrasian equilibrium. Its definition follows the central idea behind any economic equilibrium concept with decent microfoundations: it is a description of something feasible, where each involved agent, taking as given those things beyond his control, makes a choice that makes him as happy as possible.
Notice, in particular, that it involves no statements like "markets clear" or "supply equals demand". Economic agents quite frankly couldn't care less: they have their preferences, some constraints, and all they wish for is to choose optimally. Nevertheless, some people become very nervous when one doesn't assume that markets clear (excess demand equal to zero) in equilibrium. I want to take this concern seriously, so let me briefly explain this. Market clearing is an assumption about aggregate behavior that is not in line with the microeconomic idea behind equilibrium that combines feasibility with optimal behavior of individual agents; Kreps (1990, p. 6), for instance, states:
"Generally speaking, an equilibrium is a situation in which each individual agent is doing as well as it can for itself, given the array of actions taken by others and given the institutional framework that defines the options of individuals and links their actions."
Sometimes, it is downright silly to insist on market clearing. Suppose agents in an economy are endowed with a positive quantity of a commodity that is undesirable and of no use whatsoever as an input. Why would you insist on supply and demand for this commodity being equal? What are you going to do? Stuff the good down people's throat? Or what if agents only want to consume gloves in matching pairs? If there happen to be more left- than right-hand gloves, simply leave excess gloves to gather dust somewhere. Consequently, market clearing is often not a part of the definition of equilibrium. See, for instance, Arrow and Hahn (1971, p. 107), Kreps (1990, p. 190), Mas-Colell (1985, p. 169), and Varian (1992, p. 316). Market clearing in equilibrium, however, turns out to be a consequence of commonly imposed restrictions. You may find Exercise 6.2(c) helpful.
To illustrate the main ideas behind general equilibrium analysis, we start by studying a pure exchange economy where there is no production, but where consumers are initially endowed with certain amounts of the different goods. This entails no real loss of generality: our main tool will be to study excess demand, regardless of whether it involves producers or not. Walrasian equilibrium is defined and shown to exist in a particularly simple case. Also, we study some of its welfare properties. After introducing producers into the model, a more general existence result is provided in Section 6.4.
6.2. Pure exchange economies
A pure exchange economy is a tuple $E = (\succsim^h, \omega^h)_{h \in H}$, where:
$H$ is a nonempty, finite set of consumers/households,
each consumer $h \in H$ has a weak order $\succsim^h$ over $\mathbb{R}^L_+$, where $L \in \mathbb{N}$, and an initial endowment $\omega^h \in \mathbb{R}^L_+$ of the $L$ commodities.
The total endowment is denoted $\omega = \sum_{h \in H} \omega^h$. An allocation $x = (x^h)_{h \in H}$ assigns to each consumer $h$ a commodity bundle $x^h \in \mathbb{R}^L_+$. Allocation $x$ is:
feasible if $\sum_{h \in H} x^h \leq \omega$,
nonwasteful if $\sum_{h \in H} x^h = \omega$.
If the price vector is $p$, the initial endowment of consumer $h \in H$ is worth $p \cdot \omega^h$, so consumer $h$ can afford bundles $x \in \mathbb{R}^L_+$ with $p \cdot x \leq p \cdot \omega^h$, i.e., consumer $h$'s budget set is $B^h(p, p \cdot \omega^h)$. Let $x^h(\cdot)$ denote this consumer's demand correspondence.
The basic idea behind equilibria (feasibility and optimal choices) leads to the following definition. A Walrasian equilibrium of a pure exchange economy $E = (\succsim^h, \omega^h)_{h \in H}$ is a pair $(p, x)$, where:
$p \in \mathbb{R}^L_+$, $p \neq 0$, is a price vector,
$x = (x^h)_{h \in H}$ is a feasible allocation,
for each consumer $h \in H$, $x^h$ is a most preferred bundle at prices $p$, i.e., $x^h \in x^h(p, p \cdot \omega^h)$.
Properties of Walrasian equilibrium are often studied using the excess demand correspondence $z$ assigning to each price vector $p$ the difference between total demand for and the total availability of the commodities:
\[ z(p) = \sum_{h \in H} x^h(p, p \cdot \omega^h) - \Big\{ \sum_{h \in H} \omega^h \Big\} = \sum_{h \in H} x^h(p, p \cdot \omega^h) - \{ \omega \}. \]
By definition of Walrasian equilibrium, $p$ is an equilibrium price vector if and only if there is a corresponding excess demand vector $z \in z(p)$ where no commodity has positive excess demand, i.e., a $z \in z(p) \cap \mathbb{R}^L_-$.
Budget sets are homogeneous of degree zero in prices:
\[ \text{for all } h \in H,\ p \in \mathbb{R}^L_+,\ \alpha > 0 : \quad B^h(p, p \cdot \omega^h) = B^h(\alpha p, (\alpha p) \cdot \omega^h). \]
Therefore, if $p$ is an equilibrium price vector, then so is $\alpha p$ for all $\alpha > 0$. In the computation of Walrasian equilibria, this allows some simplifications, for instance by assuming that the equilibrium price of one of the goods is equal to one, or that the sum of the prices is equal to one, i.e., they lie in the unit simplex $\Delta = \{ p \in \mathbb{R}^L_+ : \sum_{\ell=1}^L p_\ell = 1 \}$ (also written with an explicit index if we want to stress the dimension of the vectors). To illustrate the idea behind existence proofs of Walrasian equilibria, the next result makes a lot of simplifications.
Proposition 6.1 Assume that excess demand $z$:
is a well-defined function (rather than a correspondence) $z: \Delta \to \mathbb{R}^L$,
is continuous,
satisfies Walras' Law: $p \cdot z(p) = 0$ for all $p \in \Delta$.
Then there is a price vector $p \in \Delta$ with $z(p) \le 0$.
Proof. The idea is to change prices by making goods in excess demand relatively more expensive and hope that demand for them goes down. If there are no more changes, there is no excess demand, and we found an equilibrium price vector. Define $f: \Delta \to \Delta$ by
$$f(p) = \left( \frac{p_i + \max\{z_i(p), 0\}}{1 + \sum_{j=1}^{L} \max\{z_j(p), 0\}} \right)_{i = 1, \ldots, L}.$$
Function $f$ increases the price of commodities for which excess demand is positive and then rescales the resulting price vector so that its coordinates add up to one. As the composition of continuous functions, $f$ is continuous. By Brouwer's fixed point theorem, there is a $p \in \Delta$ with $f(p) = p$. We show that $z(p) \le 0$. By Walras' Law:
$$0 = p \cdot z(p) = f(p) \cdot z(p) = \frac{1}{1 + \sum_{j=1}^{L} \max\{z_j(p), 0\}} \Big( \underbrace{p \cdot z(p)}_{=0} + \sum_{i=1}^{L} \max\{z_i(p), 0\}\, z_i(p) \Big).$$
Therefore,
$$\sum_{i=1}^{L} \max\{z_i(p), 0\}\, z_i(p) = 0. \qquad (27)$$
Notice:
$$\max\{z_i(p), 0\}\, z_i(p) = \begin{cases} 0 & \text{if } z_i(p) \le 0, \\ z_i(p)^2 > 0 & \text{if } z_i(p) > 0. \end{cases}$$
So (27) is the sum of nonnegative terms. The only way in which it can be zero is if all its terms are zero, i.e., if $z_i(p) \le 0$ for all $i$, as we had to show.
The price vector $p$ with $z(p) \le 0$ together with the allocation $x = (x^h(p, p \cdot \omega^h))_{h \in H}$ is a Walrasian equilibrium. Using $z(p) \le 0$ and Walras' Law ($p \cdot z(p) = 0$), it follows that excess demand is zero for commodities $i$ with $p_i > 0$: a good can be in excess supply in equilibrium, but only if its price equals zero. The desired properties of excess demand are usually derived from conditions on consumer preferences, using Proposition 4.3.
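The price-adjustment map $f$ from the proof of Proposition 6.1 can be sketched numerically. The economy below is an illustrative assumption, not taken from the text: two consumers with Cobb-Douglas demand (expenditure shares `shares`) and endowments `endowments`; the function names are hypothetical. Cobb-Douglas demand is only defined for strictly positive prices, so the sketch stays in the interior of the simplex. Brouwer's theorem only asserts that a fixed point exists; nothing in the proposition guarantees that iterating $f$ converges, so the code merely evaluates $f$ and checks the fixed point property at a known equilibrium price.

```python
def excess_demand(p, shares, endowments):
    """Excess demand z(p) for Cobb-Douglas consumers (illustrative economy).

    Consumer h spends the fraction shares[h][i] of wealth p.w on good i.
    Prices must be strictly positive.
    """
    z = [0.0] * len(p)
    for a, w in zip(shares, endowments):
        wealth = sum(pi * wi for pi, wi in zip(p, w))
        for i in range(len(p)):
            z[i] += a[i] * wealth / p[i] - w[i]
    return z


def price_map(p, z):
    """The map f: raise prices of goods in excess demand, then rescale to sum to one."""
    positive_part = [max(zi, 0.0) for zi in z]
    scale = 1.0 + sum(positive_part)
    return [(pi + qi) / scale for pi, qi in zip(p, positive_part)]


# Assumed example: consumer 1 owns good 1, consumer 2 owns good 2,
# both spend half of their wealth on each good.
shares = [[0.5, 0.5], [0.5, 0.5]]
endowments = [[1.0, 0.0], [0.0, 1.0]]

p = [0.2, 0.8]
z = excess_demand(p, shares, endowments)
walras = sum(pi * zi for pi, zi in zip(p, z))   # Walras' Law: should be zero
p_next = price_map(p, z)                        # good 1 is in excess demand

p_star = [0.5, 0.5]
z_star = excess_demand(p_star, shares, endowments)
p_star_next = price_map(p_star, z_star)         # fixed point: f(p*) = p*
```

At `p = [0.2, 0.8]` good 1 is in excess demand, so `price_map` raises its relative price; at `p_star` excess demand vanishes and the map leaves the price vector unchanged.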
6.3. Welfare analysis
A feasible allocation $x = (x^h)_{h \in H}$ is:
Pareto dominated if there is another feasible allocation $\tilde{x}$ with $\tilde{x}^h \succsim^h x^h$ for all $h \in H$ and $\tilde{x}^h \succ^h x^h$ for some $h \in H$, i.e., if all consumers are at least as well off in $\tilde{x}$ as in $x$ and at least one of them is strictly better off.
Pareto optimal if it is not Pareto dominated.
Call a nonempty collection $S \subseteq H$ of consumers a coalition. Coalition $S$ can improve upon a feasible allocation $x$ if there are commodity bundles $\tilde{x}^h$ for all $h \in S$ such that
these bundles simply redistribute initial endowments: $\sum_{h \in S} \tilde{x}^h = \sum_{h \in S} \omega^h$,
all members of $S$ are better off: $\tilde{x}^h \succ^h x^h$ for all $h \in S$.
The core of $E$ is the set of feasible allocations that no coalition can improve upon. The requirement that no one-agent coalition can improve upon allocation $x$ simply requires that $x^h \succsim^h \omega^h$ for all $h \in H$. This condition is often referred to as individual rationality.
Proposition 6.2 If $(p, x)$ is a Walrasian equilibrium of $E$, then $x$ lies in the core.
Proof. Suppose coalition $S \subseteq H$ can improve upon $x$ via commodity bundles $(\tilde{x}^h)_{h \in S}$. Then $\tilde{x}^h \succ^h x^h$ for each $h \in S$. By definition, $x^h$ is a most preferred bundle at prices $p$, so $\tilde{x}^h$ cannot lie in the budget set $B^h(p, p \cdot \omega^h)$, i.e., $p \cdot \tilde{x}^h > p \cdot \omega^h$. Summing over all $h \in S$ gives $p \cdot \sum_{h \in S} \tilde{x}^h > p \cdot \sum_{h \in S} \omega^h$. This contradicts that $(\tilde{x}^h)_{h \in S}$ redistributes initial endowments: $\sum_{h \in S} \tilde{x}^h = \sum_{h \in S} \omega^h$.
Under weak assumptions, Walrasian equilibrium allocations are Pareto optimal:
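Proposition 6.3 First fundamental welfare theorem: If $(p, x)$ is a Walrasian equilibrium of $E$ and consumers have locally nonsatiated preferences, then $x$ is Pareto optimal.
Proof. Suppose $x$ is Pareto dominated by feasible allocation $\tilde{x}$: $\tilde{x}^h \succsim^h x^h$ for all $h \in H$ and $\tilde{x}^k \succ^k x^k$ for some $k \in H$. By local nonsatiation, $p \cdot \tilde{x}^h \ge p \cdot x^h = p \cdot \omega^h$ for all $h \in H$ and $p \cdot \tilde{x}^k > p \cdot x^k = p \cdot \omega^k$. So $p \cdot \sum_{h \in H} \tilde{x}^h > p \cdot \sum_{h \in H} \omega^h$, contradicting feasibility of $\tilde{x}$: $\sum_{h \in H} \tilde{x}^h \le \sum_{h \in H} \omega^h$.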
As a partial converse to the previous result, some additional assumptions guarantee that anything that is Pareto optimal can be sustained as a Walrasian equilibrium allocation, at least if initial endowments can somehow be redistributed.
Proposition 6.4 Second fundamental welfare theorem: Assume:
for each redistribution of initial endowments in the pure exchange economy $E$, a Walrasian equilibrium exists,
consumers have strictly convex preferences.
If $x$ is a Pareto optimal allocation, redistribute initial endowments such that $\omega^h = x^h$ for all $h \in H$. Then $x$ is a Walrasian equilibrium allocation for the resulting pure exchange economy.
Proof. By assumption, the resulting pure exchange economy has a Walrasian equilibrium $(\tilde{p}, \tilde{x})$. For each $h \in H$, $\tilde{x}^h$ is optimal and $x^h$ is feasible in the budget set $B^h(\tilde{p}, \tilde{p} \cdot x^h)$, so $\tilde{x}^h \succsim^h x^h$ for all $h \in H$. By Pareto optimality of $x$, none of these preferences can be strict, so $\tilde{x}^h \sim^h x^h$ for all $h \in H$. To see that $\tilde{x}^h = x^h$ for all $h \in H$, suppose there is an $h \in H$ with $\tilde{x}^h \neq x^h$. Consumer $h$ can afford $(\tilde{x}^h + x^h)/2$. By strict convexity of preferences, this bundle is strictly preferred to $\tilde{x}^h$, contradicting that $\tilde{x}^h$ is an optimal bundle for the consumer in the Walrasian equilibrium.
6.4. Private ownership economies
Let us extend the pure exchange economy by adding firms, owned by the households: each household is entitled to a share (possibly zero) of each firm's profit. Formally, a private ownership economy is a tuple
$$E = \left( (\succsim^h, \omega^h)_{h \in H}, (Y^f)_{f \in F}, (\theta^{hf})_{h \in H, f \in F} \right),$$
where:
$H$ is a nonempty, finite set of consumers/households,
$F$ a nonempty, finite set of firms,
each firm $f \in F$ has a production set $Y^f \subseteq \mathbb{R}^L$, where $L \in \mathbb{N}$,
and each consumer $h \in H$ has
a weak order $\succsim^h$ over $\mathbb{R}^L_+$,
an initial endowment $\omega^h \in \mathbb{R}^L_+$ of the $L$ commodities,
a claim to a share $\theta^{hf} \in [0, 1]$ of the profit of firm $f \in F$ (where $\sum_{h \in H} \theta^{hf} = 1$ for all $f \in F$).
An allocation $(x, y) = ((x^h)_{h \in H}, (y^f)_{f \in F})$ assigns to each consumer $h \in H$ a commodity bundle $x^h \in \mathbb{R}^L_+$ and to each firm $f \in F$ a production plan $y^f \in Y^f$. Allocation $(x, y)$ is feasible if
$$\sum_{h \in H} x^h \le \sum_{h \in H} \omega^h + \sum_{f \in F} y^f.$$
If the price vector is $p$ and firms decide on production plans $(y^f)_{f \in F}$, consumer $h \in H$ has budget set
$$\Big\{ x \in \mathbb{R}^L_+ : p \cdot x \le p \cdot \omega^h + \sum_{f \in F} \theta^{hf}\, p \cdot y^f \Big\},$$
because the initial endowment is worth $p \cdot \omega^h$ and $h$ receives share $\theta^{hf}$ of the profit $p \cdot y^f$ of firm $f \in F$.
Let $x^h(\cdot)$ denote the demand correspondence of consumer $h \in H$, $y^f(\cdot)$ the supply correspondence of firm $f \in F$, and $\pi^f(\cdot)$ its profit function. The basic idea behind equilibria (feasibility and optimal choices) leads to the following definition. A Walrasian equilibrium of a private ownership economy $E$ is a triple $(p, x, y)$, where
$p \in \mathbb{R}^L_+$, $p \neq 0$, is a price vector,
$(x, y) = ((x^h)_{h \in H}, (y^f)_{f \in F})$ is a feasible allocation,
for each consumer $h \in H$, $x^h$ is a most preferred bundle at prices $p$: $x^h \in x^h\big(p, p \cdot \omega^h + \sum_{f \in F} \theta^{hf}\, p \cdot y^f\big)$,
for each firm $f \in F$, $y^f$ maximizes profits at prices $p$: $y^f \in y^f(p)$ and $\pi^f(p) = p \cdot y^f$.
Once again, existence of Walrasian equilibrium is usually established by looking at the excess demand correspondence $z$ assigning to each price vector $p$ the difference between total demand for and total availability of the commodities:
$$z(p) = \sum_{h \in H} x^h\Big(p, p \cdot \omega^h + \sum_{f \in F} \theta^{hf} \pi^f(p)\Big) - \sum_{f \in F} y^f(p) - \{\omega\},$$
and the interest is in finding a price vector $p$ where $z(p) \cap \mathbb{R}^L_- \neq \emptyset$. The following result (Debreu, 1959, Section 5.6) establishes existence of such a price vector; as before, one may restrict attention to prices in the unit simplex $\Delta$.
Proposition 6.5 Assume that excess demand $z: \Delta \rightrightarrows \mathbb{R}^L$
achieves values in some convex, compact set $Z \subset \mathbb{R}^L$: $z(p) \subseteq Z$ for all $p \in \Delta$,
is nonempty-valued: $z(p) \neq \emptyset$ for all $p \in \Delta$,
is convex-valued: $z(p)$ is a convex set for all $p \in \Delta$,
has a closed graph: $\{(p, z) \in \Delta \times Z : z \in z(p)\}$ is a closed set,
satisfies a weak form of Walras' Law: $p \cdot z \le 0$ for all $p \in \Delta$ and all $z \in z(p)$.
Then there is a price vector $p \in \Delta$ with $z(p) \cap \mathbb{R}^L_- \neq \emptyset$.
Proof. Once again, the idea is to make goods with large excess demand expensive in the hope of decreasing it. This is achieved by maximizing, for a given excess demand vector $z$, the expression $p \cdot z$, which requires putting all weight of $p$ on the largest coordinate(s) of $z$. Define the correspondence $\mu$ from $Z$ to $\Delta$ by
$$\mu(z) = \{ p \in \Delta : p \cdot z = \max_{p' \in \Delta} p' \cdot z \}.$$
As it maximizes a continuous function $p \mapsto p \cdot z$ over a nonempty, compact set $\Delta$, $\mu$ is nonempty-valued. Let $z \in Z$ and $p_0 \in \mu(z)$. Then $\mu(z) = \{p \in \mathbb{R}^L : p \cdot z = p_0 \cdot z\} \cap \Delta$ is the intersection of convex sets, so $\mu$ is convex-valued. A standard continuity argument shows that $\mu$ has a closed graph. The correspondence $\varphi$ from $\Delta \times Z$ to $\Delta \times Z$ with $\varphi(p, z) = \mu(z) \times z(p)$ is nonempty-valued, convex-valued, and has a closed graph because $\mu$ and $z$ have these properties. By Kakutani's fixed point theorem, there is a $(p, z) \in \Delta \times Z$ with $(p, z) \in \varphi(p, z) = \mu(z) \times z(p)$. As $z \in z(p)$, the weak Walras' Law implies that $p \cdot z \le 0$. As $p \in \mu(z)$, $p \cdot z \ge p' \cdot z$ for all $p' \in \Delta$. For each $\ell \in \{1, \ldots, L\}$, taking $p' = e_\ell$ gives that $z_\ell = e_\ell \cdot z \le p \cdot z \le 0$, so $z \le 0$.
The trick, of course, is to derive the desired properties of the excess demand correspondence by imposing properties on the components of the private ownership economy $E$. Given the results of Sections 4 and 5, most of them should not come as a surprise. Only the first is somewhat complicated: what allows us to restrict attention to such a convex, compact set $Z$? Convexity of $Z$ is not the issue: if you can find a compact set containing all the images $z(p)$, they also lie in a sufficiently large (convex) ball. Without going into details, compactness of $Z$ is established by realizing that the relevant production plans, by feasibility, must satisfy $\sum_{f \in F} y^f + \omega \ge 0$. Following the lines of Proposition 5.5, this set of attainable production plans can be shown to be compact.
Appropriate modifications of the fundamental welfare theorems continue to hold for private ownership economies. As this section was meant only as a short introduction to the topic, the interested reader is referred to Debreu (1959) for a more comprehensive treatment. Textbooks on general equilibrium theory include Hildenbrand and Kirman (1988) and Starr (1997).
6.5. Exercises
Exercise 6.1
(a) What is wrong with the following argument: Proposition 6.2 implies Proposition 6.3: if $x$ lies in the core, the coalition $S = H$ of all consumers cannot improve upon it. So $x$ is Pareto optimal.
(b) Give an example of a pure exchange economy $E$ and a Walrasian equilibrium $(p, x)$ such that $x$ lies in the core, but is not Pareto optimal.
Exercise 6.2 Market clearing: Consider a (pure exchange/private ownership) economy where Walras' Law holds: $p \cdot z = 0$ for all price vectors $p$ and all $z \in z(p)$. Prove:
(a) In equilibrium, markets with a positive price clear:
$$\forall p \in \mathbb{R}^L_+,\ \forall z \in z(p) \cap \mathbb{R}^L_-,\ \forall \ell \in \{1, \ldots, L\}: \text{ if } p_\ell > 0, \text{ then } z_\ell = 0.$$
(b) If prices are positive and $L - 1$ markets clear, then so does the final one:
$$\forall p \in \mathbb{R}^L_{++},\ \forall z \in z(p),\ \forall \ell \in \{1, \ldots, L\}: \text{ if } z_k = 0 \text{ for all } k \neq \ell, \text{ then } z_\ell = 0.$$
Markets clear in most standard applications:
(c) Consider an equilibrium. Suppose (c1) or (c2) is true for at least one consumer $h \in H$:
(c1) $\succsim^h$ is strongly monotonic on $X = \mathbb{R}^L_+$.
(c2) $h$ has a positive amount of money to spend, $h$'s least preferred alternatives are on the axes:
$$\forall x, y \in \mathbb{R}^L_+: \quad x \notin \mathbb{R}^L_{++} \text{ and } y \in \mathbb{R}^L_{++} \ \Rightarrow\ x \prec^h y,$$
and $\succsim^h$ is strongly monotonic on $X = \mathbb{R}^L_{++}$.
Prove that all markets clear. Cobb-Douglas preferences, for instance, satisfy the requirements in (c2), not those in (c1).
Exercise 6.3 Consider a private ownership economy $E$. A feasible allocation $(x, y)$ is
Pareto dominated if there is another feasible allocation $(\tilde{x}, \tilde{y})$ with $\tilde{x}^h \succsim^h x^h$ for all $h \in H$ and $\tilde{x}^h \succ^h x^h$ for some $h \in H$,
Pareto optimal if it is not Pareto dominated.
(a) Why do you think Pareto dominance is defined in terms of consumer preferences, ignoring those of producers?
(b) Prove the First fundamental welfare theorem: If $(p, x, y)$ is a Walrasian equilibrium of $E$ and consumers have locally nonsatiated preferences, then $(x, y)$ is Pareto optimal.
Exercise 6.4 Restricting attention to prices in the unit simplex $\Delta$ (to avoid trivialities), give an example of a pure exchange economy $E$ with two consumers, two commodities, and
(a) no Walrasian equilibrium.
(b) exactly one Walrasian equilibrium.
(c) exactly two Walrasian equilibria.
(d) infinitely many Walrasian equilibria.
Answer the same question for a private ownership economy by adding 714 producers (yes, seven hundred and fourteen... You don't seriously believe I'd ask this if the answer weren't trivial, do you?).
Exercise 6.5 King Solomon's problem: In a well-known parable, king Solomon settles a dispute between two women, each claiming that a certain baby is hers, by suggesting to cut it in two with his sword: the true mother is revealed as she is willing to give up her child to the liar, rather than have it killed. Swords make babies divisible commodities, so consider a pure exchange economy with two consumers (the two women), one commodity (the baby). Let $x \in [0, 1]$ be a share of a baby. The true mother has utility function $u^T: [0, 1] \to \mathbb{R}$ with $u^T(x) = x$ if $x \in \{0, 1\}$ and $u^T(x) = -1$ otherwise. The liar has utility function $u^L: [0, 1] \to \mathbb{R}$ with $u^L(x) = x$. Determine for each initial allocation $(\omega^T, \omega^L) \in \{z \in \mathbb{R}^2_+ : z_1 + z_2 = 1\}$ the set of feasible allocations, the set of Pareto optimal allocations, the core, and the set of Walrasian equilibria.
7. Expected utility theory
Hitherto, we assumed that decision makers act in a world of absolute certainty; typically, however, the consequences of decisions entail some stochastic elements. This section treats the development of expected utility theory, using the axiomatic approach of von Neumann and Morgenstern.
7.1. Simple and compound gambles
We maintain the notion of preferences, but instead of assuming that a decision maker (DM) has preferences over certain outcomes, we consider preferences over lotteries or gambles, which are probability distributions over outcomes. Formally, let $A = \{a_1, \ldots, a_n\}$ be a nonempty, finite set of (deterministic) outcomes. A simple gamble assigns a probability $p_i$ to each outcome $a_i \in A$. We denote a simple gamble by $g = (p_1 \circ a_1, \ldots, p_n \circ a_n)$. Probabilities should be nonnegative and add up to one, so the set of simple gambles is
$$G_1 = \Big\{ (p_1 \circ a_1, \ldots, p_n \circ a_n) : p_1, \ldots, p_n \ge 0,\ \sum_{i=1}^{n} p_i = 1 \Big\}. \qquad (28)$$
For instance, when tossing a coin, the outcome will be heads $H$ or tails $T$, so $A = \{H, T\}$. A fair coin corresponds with the simple gamble $(\tfrac{1}{2} \circ H, \tfrac{1}{2} \circ T)$.
Some notational conventions:
one often omits outcomes with probability zero from the notation of a simple gamble: $(\tfrac{1}{2} \circ a_1, \tfrac{1}{2} \circ a_n)$ is an abbreviation for the simple gamble $(\tfrac{1}{2} \circ a_1, 0 \circ a_2, \ldots, 0 \circ a_{n-1}, \tfrac{1}{2} \circ a_n)$.
one often writes $a_i$ for the simple gamble $(1 \circ a_i)$ whose outcome is $a_i$ with probability one.
Not all gambles are simple. Perhaps you decided to bet one dollar on your favorite number in a roulette game, but toss a coin to decide which of two roulette wheels you want to play in a casino: the outcome of the first gamble (the coin toss) is another gamble (the roulette game). This is an example of a compound gamble. In principle, we can have any level of compound gambles. For convenience, we will assume that a compound gamble ends in a deterministic outcome after only finitely many steps. Formally, the set of compound gambles is defined as follows. Let $G_0 = A$ and, inductively, for each $m \in \mathbb{N}$, let $G_m$ be the set of gambles whose outcomes are gambles from the lower levels $G_0, \ldots, G_{m-1}$:
$$G_m = \Big\{ (p_1 \circ g_1, \ldots, p_k \circ g_k) : k \in \mathbb{N},\ p_1, \ldots, p_k \ge 0,\ \sum_{i=1}^{k} p_i = 1, \text{ and } g_1, \ldots, g_k \in \bigcup_{\ell=0}^{m-1} G_\ell \Big\}.$$
The set of compound gambles is $G = \bigcup_{m=0}^{\infty} G_m$.
Associated with each compound gamble is a simple one, specifying the effective probabilities with which the outcomes in $A$ occur. For instance, suppose that $A = \{a_1, a_2\}$ and consider the compound gamble $g$ yielding $a_1$ with probability $\alpha$ and a lottery ticket with probability $1 - \alpha$. The lottery ticket is a simple gamble yielding $a_1$ with probability $\beta$ and $a_2$ with probability $1 - \beta$. Eventually, this implies that $a_1$ occurs with probability $\alpha + (1 - \alpha)\beta$ and $a_2$ occurs with probability $(1 - \alpha)(1 - \beta)$. Thus, $g$ gives rise to the simple gamble
$$\big( (\alpha + (1 - \alpha)\beta) \circ a_1,\ (1 - \alpha)(1 - \beta) \circ a_2 \big).$$
Similarly, for every gamble $g \in G$, let $p_i$ be the effective probability assigned to $a_i \in A$ by $g$. We say that $g$ induces the simple gamble $(p_1 \circ a_1, \ldots, p_n \circ a_n) \in G_1$ or that the latter is the reduced simple gamble associated with $g$. Notice that this reduced simple gamble is unique.
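The reduction of a compound gamble to its induced simple gamble is a short recursion: multiply probabilities along each branch and add them up per deterministic outcome. The sketch below is only an illustration; the representation (a gamble as a Python list of `(probability, outcome_or_gamble)` pairs, outcomes as strings) and the name `reduce_gamble` are assumptions made here, not notation from the text.

```python
from collections import defaultdict


def reduce_gamble(gamble):
    """Return the effective probability of each deterministic outcome.

    A gamble is a list of (probability, item) pairs, where item is either
    a deterministic outcome (any non-list value) or another gamble (a list).
    """
    effective = defaultdict(float)

    def walk(node, weight):
        if isinstance(node, list):          # node is itself a gamble
            for prob, item in node:
                walk(item, weight * prob)
        else:                               # node is a deterministic outcome
            effective[node] += weight

    walk(gamble, 1.0)
    return dict(effective)


# The two-outcome example with alpha = 0.5 and beta = 0.25:
alpha, beta = 0.5, 0.25
ticket = [(beta, "a1"), (1 - beta, "a2")]
g = [(alpha, "a1"), (1 - alpha, ticket)]
reduced = reduce_gamble(g)
```

Here `reduced["a1"]` equals $\alpha + (1 - \alpha)\beta$ and `reduced["a2"]` equals $(1 - \alpha)(1 - \beta)$, in line with the formula in the text.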
7.2. Preferences over gambles
Assume the DM has a preference relation $\succsim$ over the set $G$ of compound gambles. Impose the following properties:
(G1) $\succsim$ is a weak order.
Given the set of deterministic outcomes $A = \{a_1, \ldots, a_n\}$, every simple gamble $g \in G_1$ is fully described by its vector $(p_1, \ldots, p_n) \in \mathbb{R}^n$ of probabilities, i.e., we can interpret $G_1$ simply as the unit simplex $\Delta_n = \{p \in \mathbb{R}^n_+ : \sum_i p_i = 1\}$. And in $\mathbb{R}^n$, we know what continuity means, so we can state:
(G2) Continuity on $G_1$: $\succsim$ restricted to $G_1$ is continuous.
Continuous weak orders have played an extensive role also in our earlier sections; the following properties explicitly exploit the specific structure of our gambling framework. Our next property requires that in considering a gamble, the DM cares only about the effective probabilities assigned to each outcome in $A$: it suffices to restrict attention to simple gambles:
(G3) Reduction to simple gambles: for each $g \in G$, if $(p_1 \circ a_1, \ldots, p_n \circ a_n)$ is the simple gamble induced by $g$, then $g \sim (p_1 \circ a_1, \ldots, p_n \circ a_n)$.
This is a strong assumption. It rules out, for instance, any preference relation that takes into account the complexity of compound gambles: a DM may strictly prefer the associated reduced simple gamble to some $g \in G_{2562}$, since it involves a much less intricate chain of events leading to eventual deterministic outcomes.
Our next property, independence, says that if we mix two gambles $g$ and $g'$ with a third one, $g''$, then the preference between the mixtures should be independent of the particular choice of the third gamble. It essentially requires some form of independence of irrelevant alternatives: in the two gambles
$$(\alpha \circ g, (1 - \alpha) \circ g'') \quad \text{and} \quad (\alpha \circ g', (1 - \alpha) \circ g''),$$
the gamble $g''$ occurs with the same probability $1 - \alpha$. According to independence, this means that the preference should depend only on the part where the two gambles are different, i.e., on $g$ and $g'$.
(G4) Independence: for all $g, g', g'' \in G$ and all $\alpha \in (0, 1)$:
$$g \succsim g' \iff (\alpha \circ g, (1 - \alpha) \circ g'') \succsim (\alpha \circ g', (1 - \alpha) \circ g'').$$
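These four properties have a number of intuitive consequences:
Proposition 7.1 Assume the preference relation $\succsim$ on $G$ satisfies (G1) to (G4).
(a) There is a best element $\bar{g}$ and a worst element $\underline{g}$ in $G_1$, i.e., for all $g \in G_1$: $\bar{g} \succsim g \succsim \underline{g}$.
(b) For each $g \in G$, there is a number $\alpha_g \in [0, 1]$ such that $g \sim (\alpha_g \circ \bar{g}, (1 - \alpha_g) \circ \underline{g})$.
(c) Substitution: let $k \in \mathbb{N}$ and let $p_1, \ldots, p_k > 0$ add up to one. Let $g_1, \ldots, g_k, h_1, \ldots, h_k \in G$ be such that $g_i \sim h_i$ for all $i = 1, \ldots, k$. Then
$$(p_1 \circ g_1, \ldots, p_k \circ g_k) \sim (p_1 \circ h_1, \ldots, p_k \circ h_k).$$
Finally, let us assume that $\bar{g} \succ \underline{g}$, to avoid trivial cases.
(d) Monotonicity: for all $\alpha, \beta \in [0, 1]$, if $\alpha > \beta$, then
$$(\alpha \circ \bar{g}, (1 - \alpha) \circ \underline{g}) \succ (\beta \circ \bar{g}, (1 - \beta) \circ \underline{g}).$$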
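Proof. (a): Immediate from continuity (G2) of the weak order (G1) $\succsim$ on the compact unit simplex $\Delta_n$.
(b): Let $g \in G$ and let $g_s \in G_1$ be its reduced simple gamble. Since $g \sim g_s$ by (G3) and $\bar{g} \succsim g_s \succsim \underline{g}$, it follows from transitivity (G1) that $\bar{g} \succsim g \succsim \underline{g}$. Let $\bar{p}, \underline{p} \in \Delta_n$ be the associated probabilities of $\bar{g}$ and $\underline{g}$. By connectedness of the set of convex combinations of these best and worst gambles in the unit simplex, Proposition 2.7 implies that there is a gamble with probabilities $\alpha_g \bar{p} + (1 - \alpha_g) \underline{p}$ equivalent with $g$. By reduction to simple gambles (G3), this means
$$g \sim (\alpha_g \circ \bar{g}, (1 - \alpha_g) \circ \underline{g}).$$
(c): By induction on $k \in \mathbb{N}$. The claim is trivially true if $k = 1$. Let $k \in \mathbb{N}$, $k \ge 2$, and suppose the claim is true for mixtures of less than $k$ gambles. To prove the case with mixtures of $k$ gambles, notice that
$$\begin{aligned}
(p_1 \circ g_1, \ldots, p_k \circ g_k) &\sim \Big(p_1 \circ g_1, (1 - p_1) \circ \big(\tfrac{p_2}{1 - p_1} \circ g_2, \ldots, \tfrac{p_k}{1 - p_1} \circ g_k\big)\Big) && \text{by (G1) and (G3)} \\
&\sim \Big(p_1 \circ h_1, (1 - p_1) \circ \big(\tfrac{p_2}{1 - p_1} \circ h_2, \ldots, \tfrac{p_k}{1 - p_1} \circ h_k\big)\Big) && \text{by induction} \\
&\sim (p_1 \circ h_1, \ldots, p_k \circ h_k) && \text{by (G1) and (G3),}
\end{aligned}$$
so the claim holds by transitivity of $\succsim$.
(d): Assume $\bar{g} \succ \underline{g}$ and let $\alpha, \beta \in [0, 1]$ satisfy $\alpha > \beta$. If $\alpha = 1$ or $\beta = 0$, the result follows easily from reduction (G3) and independence (G4), so assume that $1 > \alpha > \beta > 0$. Then
$$\begin{aligned}
(\alpha \circ \bar{g}, (1 - \alpha) \circ \underline{g}) &\succ (\alpha \circ \underline{g}, (1 - \alpha) \circ \underline{g}) && \text{by (G4)} \\
&\sim \underline{g} && \text{by (G1) and (G3).}
\end{aligned}$$
Since $\succsim$ is a weak order (G1):
$$(\alpha \circ \bar{g}, (1 - \alpha) \circ \underline{g}) \succ \underline{g}.$$
Denote the left gamble by $g_\alpha$. Then
$$\begin{aligned}
(\alpha \circ \bar{g}, (1 - \alpha) \circ \underline{g}) = g_\alpha &\sim \big((1 - \tfrac{\beta}{\alpha}) \circ g_\alpha, \tfrac{\beta}{\alpha} \circ g_\alpha\big) && \text{by (G1) and (G3)} \\
&\succ \big((1 - \tfrac{\beta}{\alpha}) \circ \underline{g}, \tfrac{\beta}{\alpha} \circ g_\alpha\big) && \text{by (G4)} \\
&\sim (\beta \circ \bar{g}, (1 - \beta) \circ \underline{g}) && \text{by (G1) and (G3).}
\end{aligned}$$
Since $\succsim$ is a weak order (G1):
$$(\alpha \circ \bar{g}, (1 - \alpha) \circ \underline{g}) \succ (\beta \circ \bar{g}, (1 - \beta) \circ \underline{g}),$$
as we had to show.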
7.3. von Neumann-Morgenstern utility functions
Equipped with these results, one can show that properties (G1) to (G4) imply the existence of a utility function $u: G \to \mathbb{R}$ that represents the preference relation $\succsim$ on $G$ and that is linear in the effective probabilities over the outcomes. Formally, a von Neumann-Morgenstern (vNM) utility function is a function $u: G \to \mathbb{R}$ that represents $\succsim$:
$$\forall g, h \in G: \quad g \succsim h \iff u(g) \ge u(h),$$
and does so in a way that for every gamble $g \in G$:
$$u(g) = \sum_{i=1}^{n} p_i u(a_i),$$
where $(p_1 \circ a_1, \ldots, p_n \circ a_n)$ is the simple gamble induced by $g$.
In words: a vNM utility function represents the preferences of the DM and the utility assigned to a gamble equals the expected utility of the induced simple gamble.
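Proposition 7.2 If $\succsim$ is a preference relation over $G$ satisfying (G1) to (G4), there exists a vNM utility function representing $\succsim$.
Proof. By Proposition 7.1(a), there exists a best gamble $\bar{g}$ and a worst gamble $\underline{g}$ in $G_1$. In the trivial case where $\bar{g} \sim \underline{g}$, any constant function is a vNM utility function. So assume, w.l.o.g., that $\bar{g} \succ \underline{g}$. For each $g \in G$, Proposition 7.1 implies the existence of a unique number $\alpha_g \in [0, 1]$ such that $g \sim (\alpha_g \circ \bar{g}, (1 - \alpha_g) \circ \underline{g})$. Define
$$u(g) = \alpha_g. \qquad (29)$$
This utility function represents $\succsim$: let $g, h \in G$. Then
$$g \succsim h \iff (\alpha_g \circ \bar{g}, (1 - \alpha_g) \circ \underline{g}) \succsim (\alpha_h \circ \bar{g}, (1 - \alpha_h) \circ \underline{g}) \iff u(g) = \alpha_g \ge \alpha_h = u(h),$$
where the first equivalence follows from transitivity (G1) of $\succsim$ and the second equivalence from monotonicity and the definition of $u$.
To obtain the expected utility expression, let $g \in G$ and let $g_s = (p_1 \circ a_1, \ldots, p_n \circ a_n)$ be the simple gamble induced by $g$. By (G3), $g \sim g_s$, so $u(g) = u(g_s)$. For each $a_i \in A$, we know from Proposition 7.1 and the definition of $u(a_i)$ that
$$a_i \sim (u(a_i) \circ \bar{g}, (1 - u(a_i)) \circ \underline{g}).$$
For each $i = 1, \ldots, n$, define $h_i = (u(a_i) \circ \bar{g}, (1 - u(a_i)) \circ \underline{g})$. By substitution:
$$g_s = (p_1 \circ a_1, \ldots, p_n \circ a_n) \sim (p_1 \circ h_1, \ldots, p_n \circ h_n).$$
Notice that $h_1, \ldots, h_n$ are gambles over the best and worst gambles only. By computing the probability for the best gamble $\bar{g}$ and using reduction to simple gambles (G3), one finds that $(p_1 \circ h_1, \ldots, p_n \circ h_n)$ is equivalent with
$$\Big( \sum_{i=1}^{n} p_i u(a_i) \circ \bar{g},\ \Big(1 - \sum_{i=1}^{n} p_i u(a_i)\Big) \circ \underline{g} \Big).$$
Combining the above with transitivity of $\succsim$, we find:
$$g \sim g_s \sim (p_1 \circ h_1, \ldots, p_n \circ h_n) \sim \Big( \sum_{i=1}^{n} p_i u(a_i) \circ \bar{g},\ \Big(1 - \sum_{i=1}^{n} p_i u(a_i)\Big) \circ \underline{g} \Big). \qquad (30)$$
By definition, $u(g)$ is the unique number in $[0, 1]$ satisfying
$$g \sim (u(g) \circ \bar{g}, (1 - u(g)) \circ \underline{g}).$$
Combining this with (30) yields $u(g) = \sum_{i=1}^{n} p_i u(a_i)$.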
Remark 7.3 Conversely, it is straightforward to verify that if a preference relation $\succsim$ on $G$ can be represented by a vNM utility function, it must satisfy properties (G1) to (G4).
The linearity requirement on vNM utility implies that the earlier result from utility theory (any strictly increasing transformation of the utility function of the consumer still represents the same preferences) no longer holds. Indeed, the only transformations of a vNM utility function that remain vNM utility functions are positive affine transformations:
Proposition 7.4 Consider the vNM utility function $u: G \to \mathbb{R}$ defined in (29). For all $a, b \in \mathbb{R}$ with $a > 0$, also $au + b$ is a vNM utility function representing $\succsim$. Conversely, if $v: G \to \mathbb{R}$ is a vNM utility function representing $\succsim$ on $G$, there exist $a, b \in \mathbb{R}$ with $a > 0$ such that $v = au + b$.
Proof. To avoid trivialities, assume that $\bar{g} \succ \underline{g}$. The first claim is simple. To establish the second claim, let $a > 0$ and $b$ be the unique solution (do you understand why a solution exists and why it is unique?) to
$$v(\bar{g}) = a u(\bar{g}) + b, \qquad v(\underline{g}) = a u(\underline{g}) + b.$$
Let $g \in G$. By construction, see (29), $g \sim (u(g) \circ \bar{g}, (1 - u(g)) \circ \underline{g})$, so
$$u(g) = u(g) u(\bar{g}) + (1 - u(g)) u(\underline{g}), \qquad (31)$$
and, similarly,
$$\begin{aligned}
v(g) &= u(g) v(\bar{g}) + (1 - u(g)) v(\underline{g}) \\
&= u(g) [a u(\bar{g}) + b] + (1 - u(g)) [a u(\underline{g}) + b] \\
&= a [u(g) u(\bar{g}) + (1 - u(g)) u(\underline{g})] + b \\
&= a u(g) + b,
\end{aligned}$$
where the last equation follows from (31).
Our development of vNM utilities involved a finite set $A$ of deterministic outcomes and compound gambles of finite length. These assumptions can be relaxed, but at the cost of increased topological and measure-theoretic complexity.
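The role of positive affine transformations in Proposition 7.4 can be illustrated with a small numerical check. The utility numbers and gambles below are made up for the illustration, and `expected_utility` is a hypothetical helper: a positive affine transformation of the outcome utilities keeps the ranking of two simple gambles, whereas a strictly increasing but nonlinear transformation (here, cubing) can reverse it.

```python
def expected_utility(probs, utils):
    """Expected utility of a simple gamble given outcome utilities."""
    return sum(p * u for p, u in zip(probs, utils))


# Assumed utilities of three outcomes a1, a2, a3 (illustrative numbers).
u = [0.0, 0.6, 1.0]

g = [0.0, 1.0, 0.0]   # a2 for sure
h = [0.5, 0.0, 0.5]   # fifty-fifty between a1 and a3

v = [2.0 * x + 3.0 for x in u]   # positive affine transformation of u
w = [x ** 3 for x in u]          # strictly increasing, but not affine

prefers_g_under_u = expected_utility(g, u) > expected_utility(h, u)
prefers_g_under_v = expected_utility(g, v) > expected_utility(h, v)
prefers_g_under_w = expected_utility(g, w) > expected_utility(h, w)
```

Under `u` and `v` the sure outcome `g` is ranked above `h`; under `w` the ranking flips, so `w` is not a vNM utility function for the same preferences.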
7.4. Exercises
Exercise 7.1 Throughout this exercise, let $G = \bigcup_{n=0}^{\infty} G_n$ be the set of compound gambles over a finite set $\{a_1, \ldots, a_k\} \subset \mathbb{R}$ of $k \ge 2$ different deterministic outcomes. Recall: $G_n$ is the set of $n$-th level gambles. For each of the preference relations $\succsim$ over $G$ defined below, answer the following questions:
If possible, find the best and the worst elements of $G$.
For each of the four properties (G1) to (G4) guaranteeing the existence of a vNM utility function, check whether $\succsim$ satisfies it.
If (G1) to (G4) are satisfied, find a vNM utility function representing $\succsim$.
(a) Most likely outcomes: A decision maker bases preferences on the average of the deterministic outcomes that are most likely to occur. Let $g \in G$ and let $(p_1 \circ a_1, \ldots, p_k \circ a_k)$ be its induced simple gamble. Let
$$L(g) = \{a_i : p_i \ge p_j \text{ for all } j = 1, \ldots, k\}$$
be the set of most likely deterministic outcomes and $|L(g)|$ its number of elements. The preference relation $\succsim$ on $G$ is defined as follows: for all $g, h \in G$:
$$g \succsim h \iff \frac{1}{|L(g)|} \sum_{a_i \in L(g)} a_i \ge \frac{1}{|L(h)|} \sum_{a_i \in L(h)} a_i.$$
(b) Keeping it simple: A decision maker dislikes complex alternatives and has preferences $\succsim$ over $G$ represented by the following utility function: for each $g \in G$, there is a unique $n$ with $g \in G_n$. Let $(p_1 \circ a_1, \ldots, p_k \circ a_k)$ be its induced simple gamble. Then $u(g) = \sum_{m=1}^{k} p_m a_m - n$.
(c) Satisficing: A decision maker is content with all deterministic outcomes larger than 5. The preference relation $\succsim$ on $G$ is represented by the following utility function: for each $g \in G$, let $(p_1 \circ a_1, \ldots, p_k \circ a_k)$ be its induced simple gamble. Then $u(g) = \sum_{i : a_i > 5} p_i$.
8. Risk attitudes
8.1. In for a gamble?
Let us confine attention to cases where the outcomes of the gambles are amounts of money: $A$ is a convex set in $\mathbb{R}$. Despite the fact that we now allow an infinite set of outcomes, we will assume that every gamble assigns positive probability to only finitely many outcomes. The existence theorem of vNM utility functions can be adjusted to this case by modifying the properties (G1) to (G4) to infinite sets. We assume that the vNM utility function $u$ is increasing in money and investigate the relation between this function and the DM's attitude towards risk.
Consider a nontrivial (i.e., at least two different deterministic outcomes have positive probability) simple gamble $g = (p_1 \circ w_1, \ldots, p_n \circ w_n)$ and suppose the DM is offered two scenarios:
1. Accept the gamble; this yields utility $u(g) = \sum_{i=1}^{n} p_i u(w_i)$.
2. Accept the outcome that gives the expected value of the gamble with certainty (this is where we need convexity of $A$!). The expected value of the gamble is equal to $E(g) = \sum_{i=1}^{n} p_i w_i$. This alternative has utility $u(E(g)) = u\big(\sum_{i=1}^{n} p_i w_i\big)$.
The DM is said to be:
risk averse at $g$ if $u(g) < u(E(g))$,
risk neutral at $g$ if $u(g) = u(E(g))$,
risk loving at $g$ if $u(g) > u(E(g))$.
The DM is said to be risk averse (on $G$) if he is risk averse at every nontrivial simple gamble $g$ over outcomes in $A$. Risk neutral and risk loving behavior are defined analogously. These risk attitudes directly translate to properties of the associated vNM utility function over money:
Proposition 8.1 Let $A \subseteq \mathbb{R}$ be nonempty and convex. Assume the DM has a vNM utility function $u$. Then the DM is:
(a) risk averse if and only if $u$ is strictly concave on $A$,
(b) risk neutral if and only if $u$ is linear on $A$,
(c) risk loving if and only if $u$ is strictly convex on $A$.
Proof. We only prove the first claim; the others are similar. Risk aversion means that for every nontrivial gamble $g = (p_1 \circ w_1, \ldots, p_n \circ w_n)$,
$$u(p_1 \circ w_1, \ldots, p_n \circ w_n) = \sum_{i=1}^{n} p_i u(w_i) < u(E(g)) = u\Big(\sum_{i=1}^{n} p_i w_i\Big).$$
But this is equivalent with strict concavity: by induction it follows that the function $u$ is strictly concave on $A$ if and only if for all different $w_1, \ldots, w_n \in A$ and all $p_1, \ldots, p_n > 0$ with $\sum_{i=1}^{n} p_i = 1$:
$$\sum_{i=1}^{n} p_i u(w_i) < u\Big(\sum_{i=1}^{n} p_i w_i\Big).$$
Although we can always check whether a DM is risk averse/neutral/loving at a specific gamble $g$, he does not have to be risk averse/neutral/loving over the entire collection of lotteries. It may well be, for instance, that he is risk averse at high-stake lotteries and risk loving at low-stake lotteries.
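Checking the risk attitude at one specific gamble amounts to comparing $u(g)$ with $u(E(g))$. The sketch below does this for three utility functions chosen purely for illustration (square root, identity, square); the name `risk_attitude`, the list-of-pairs representation of a gamble, and the numerical tolerance are assumptions of this example.

```python
import math


def risk_attitude(gamble, u, tol=1e-12):
    """Classify the attitude at a simple gamble given as (probability, wealth) pairs."""
    expected_utility = sum(p * u(w) for p, w in gamble)
    expected_value = sum(p * w for p, w in gamble)
    diff = expected_utility - u(expected_value)   # u(g) - u(E(g))
    if diff < -tol:
        return "risk averse"
    if diff > tol:
        return "risk loving"
    return "risk neutral"


g = [(0.5, 4.0), (0.5, 16.0)]   # fifty-fifty between 4 and 16, E(g) = 10

attitudes = {
    "sqrt": risk_attitude(g, math.sqrt),              # strictly concave
    "identity": risk_attitude(g, lambda w: w),        # linear
    "square": risk_attitude(g, lambda w: w * w),      # strictly convex on positive wealth
}
```

The three classifications match Proposition 8.1: strictly concave, linear, and strictly convex utility give risk averse, risk neutral, and risk loving behavior at this gamble.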
8.2. Certainty equivalent and risk premium
The certainty equivalent of a simple gamble $g$ is an amount of money $CE(g)$ offered with certainty such that the DM is indifferent between the gamble $g$ and accepting $CE(g)$:
$$u(g) = u(CE(g)).$$
Remark 8.2 For topologists (can be omitted): generalizing the continuity requirement (G2) to the case of an infinite set $A \subseteq \mathbb{R}$ of deterministic outcomes entails in particular that preferences on $A$ are continuous. So for each simple gamble $g$, there is a $\bar{w} \in A$ (say, weight one on the best deterministic outcome in $g$) and a $\underline{w} \in A$ with $\bar{w} \succsim g \succsim \underline{w}$. By the Intermediate value theorem for preferences, Proposition 2.7, there is a $CE(g) \in A$ with $g \sim CE(g)$. By monotonicity of preferences in money, $CE(g)$ is unique: the certainty equivalent is a well-defined notion.
The risk premium of a simple gamble $g$ is an amount of money $P(g)$ such that
$$u(g) = u(E(g) - P(g)).$$
Clearly, $P(g) = E(g) - CE(g)$.
Intuitively, a risk averse DM prefers receiving $E(g)$ with certainty to the gamble $g$. But there will be some amount that makes him indifferent between accepting that amount with certainty and accepting $g$. This amount is called the certainty equivalent. It is easy to show (see below) that for a risk averse DM who strictly prefers more money to less, the certainty equivalent is less than the expected value $E(g)$ of the gamble: a risk averse person is willing to pay a positive amount of money to avoid the gamble's inherent risk. This willingness to pay is the risk premium.
Proposition 8.3 Consider a DM with vNM utility function $u$ which is increasing in wealth. The following three statements are equivalent:
1. DM is risk averse,
2. $CE(g) < E(g)$ for all nontrivial gambles $g \in S$,
3. $P(g) > 0$ for all nontrivial gambles $g \in S$.
Proof. Since $P(g) = E(g) - CE(g)$, statements 2 and 3 are equivalent, so it suffices to show that statements 1 and 2 are equivalent. The DM is risk averse if and only if for every nontrivial $g \in S$, $u(g) < u(E(g))$. By definition of $CE(g)$, this is equivalent with $u(CE(g)) < u(E(g))$, which is equivalent with $CE(g) < E(g)$, since $u$ is increasing.
As a simple exercise, try to formulate similar characterizations of risk neutral and risk loving behavior.
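For a continuous, increasing utility function the certainty equivalent can be located numerically by bisection between the smallest and largest outcome of the gamble, since $u(CE(g)) = u(g)$ lies between the utilities of those two outcomes. The following is a minimal sketch under those assumptions; the function names, the square-root utility, and the gamble are illustrative choices, not part of the text.

```python
import math


def expected_utility(gamble, u):
    """u(g) for a simple gamble given as (probability, wealth) pairs."""
    return sum(p * u(w) for p, w in gamble)


def certainty_equivalent(gamble, u, tol=1e-12):
    """Bisection for CE(g) with u(CE(g)) = u(g); assumes u continuous and increasing."""
    target = expected_utility(gamble, u)
    lo = min(w for _, w in gamble)
    hi = max(w for _, w in gamble)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if u(mid) < target:
            lo = mid        # CE lies above mid
        else:
            hi = mid        # CE lies at or below mid
    return 0.5 * (lo + hi)


g = [(0.5, 4.0), (0.5, 16.0)]
expected_value = sum(p * w for p, w in g)     # E(g) = 10
ce = certainty_equivalent(g, math.sqrt)       # u(g) = 3, so CE(g) = 9
premium = expected_value - ce                 # P(g) = E(g) - CE(g) = 1
```

With the strictly concave square-root utility the certainty equivalent falls below the expected value and the risk premium is positive, as statements 2 and 3 of Proposition 8.3 require.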
Example. Take $A = \mathbb{R}_{++}$ and assume that $u(w) = \ln(w)$ for all $w \in A$. This DM is risk averse, since $u$ is strictly concave. Assume DM's initial wealth is $w_0$ and DM faces a gamble $g$ offering 50-50 odds of winning or losing an amount $h \in (0, w_0)$:
$$g = \big( (1/2) \circ (w_0 - h),\ (1/2) \circ (w_0 + h) \big).$$
Hence $E(g) = \tfrac{1}{2}(w_0 - h) + \tfrac{1}{2}(w_0 + h) = w_0$. The certainty equivalent $CE(g)$ must satisfy
$$u(CE(g)) = u(g) = \tfrac{1}{2} \ln(w_0 - h) + \tfrac{1}{2} \ln(w_0 + h) = \ln \sqrt{w_0^2 - h^2},$$
where the final equation follows from the properties of the natural logarithm. Hence
$$CE(g) = \sqrt{w_0^2 - h^2} < w_0 = E(g) \quad \text{and} \quad P(g) = w_0 - \sqrt{w_0^2 - h^2} > 0.$$
8.3. Arrow-Pratt measure of absolute risk aversion
Arrow and Pratt considered the problem of measuring the extent of risk aversion. They assumed that the vNM utility function $u$ is an increasing, strictly concave function of wealth levels that is twice differentiable. In particular, they assume:
$$\forall w: \quad u'(w) > 0 \quad \text{and} \quad u''(w) < 0. \qquad (32)$$
Using this, the Arrow-Pratt measure of absolute risk aversion at wealth $w$ is defined as
$$R_a(w) = -\frac{u''(w)}{u'(w)}.$$
Why is this a sensible measure of risk aversion? A heuristic derivation is provided in the next subsection. The intuition is as follows: the more risk averse a DM is, the more he is willing to pay to avoid certain gambles. Thus, the size of the risk premium in some way measures risk aversion. It turns out that the Arrow-Pratt measure of absolute risk aversion is roughly proportional to the risk premium the DM is willing to pay to avoid actuarially fair bets (a bet is actuarially fair if its expected value equals initial wealth: the expected loss/gain is zero). Thus, if DM 1 is more risk averse than DM 2, his risk premium for every nontrivial gamble exceeds that of DM 2, so the same should hold (due to proportionality) for the Arrow-Pratt measures of absolute risk aversion. The actual proof is somewhat more complicated; we omit it.
Proposition 8.4
1.
Consider two DMs with vNM utility functions
and
respectively, both sat-
isfying (32). The following two claims are equivalent:
(w) (w) 1 2 Ra (w) = u (w) > v (w) = Ra (w) u v
for all wealth levels
w,
1 2. The risk premium P (g) of the DM with utility function u is strictly larger than the risk 2 premium P (g) of the DM with utility function v for every nontrivial gamble g S.
Notice that positive affine transformations of the utility functions do not affect $R_a(w)$: it does not depend on the choice of vNM utility function. It is common in the literature on, for instance, portfolio choice to assume that risk aversion decreases with wealth. This is the DARA assumption (Decreasing Absolute Risk Aversion): $R_a(\cdot)$ is a decreasing function of $w$.
8.4. A derivation of the Arrow-Pratt measure
The argument in this section is due to Pratt (1964). Assume (32) and let the DM's initial wealth be $w_0$. Consider the gamble with 50-50 odds of winning or losing an amount $h$:

$g = \left(\tfrac{1}{2} \circ (w_0 - h), \; \tfrac{1}{2} \circ (w_0 + h)\right).$

The gamble is fair: $E(g) = w_0$. Let $P = P(g) > 0$ be the risk premium of $g$:

$u(g) = \tfrac{1}{2} u(w_0 - h) + \tfrac{1}{2} u(w_0 + h) = u(E(g) - P) = u(w_0 - P).$ (33)

Take a first order Taylor approximation of $u(w_0 - P)$ around $w_0$:

$u(w_0 - P) \approx u(w_0) - u'(w_0) P.$ (34)

Take a second order Taylor approximation of $u(w_0 - h)$ and $u(w_0 + h)$ around $w_0$:

$u(w_0 - h) \approx u(w_0) - u'(w_0) h + \tfrac{1}{2} u''(w_0) h^2,$
$u(w_0 + h) \approx u(w_0) + u'(w_0) h + \tfrac{1}{2} u''(w_0) h^2.$

Consequently,

$\tfrac{1}{2} u(w_0 - h) + \tfrac{1}{2} u(w_0 + h) \approx u(w_0) + \tfrac{1}{2} u''(w_0) h^2.$ (35)

Using (33), (34), and (35), it follows that

$u(w_0) + \tfrac{1}{2} u''(w_0) h^2 \approx u(w_0) - u'(w_0) P.$

Rearranging terms, one finds

$P \approx -\tfrac{1}{2} \frac{u''(w_0)}{u'(w_0)} h^2.$

Conclude that the Arrow-Pratt measure of absolute risk aversion is approximately proportional to the risk premium $P$, the willingness to pay in order to avoid the 50-50 odds of winning or losing an amount $h$.
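As a quick numerical sanity check of this approximation, the following sketch compares the exact risk premium of the 50-50 gamble with the value $\frac{1}{2} R_a(w_0) h^2$. It assumes logarithmic utility $u(w) = \ln w$, an illustrative choice that is not made in the text; for this utility $R_a(w) = 1/w$, and the certainty equivalent of the gamble is the geometric mean of the two wealth levels. The function names are hypothetical.

```python
import math


def exact_risk_premium(w0, h):
    """Exact risk premium P for u = ln (an assumed utility, for illustration).

    P solves ln(w0 - P) = 0.5*ln(w0 - h) + 0.5*ln(w0 + h), so the certainty
    equivalent w0 - P is the geometric mean of w0 - h and w0 + h.
    """
    certainty_equivalent = math.sqrt((w0 - h) * (w0 + h))
    return w0 - certainty_equivalent


def approx_risk_premium(w0, h):
    """Second-order approximation P ~ 0.5 * R_a(w0) * h^2 with R_a(w) = 1/w."""
    absolute_risk_aversion = 1.0 / w0
    return 0.5 * absolute_risk_aversion * h ** 2


if __name__ == "__main__":
    for h in (1.0, 10.0, 50.0):
        print(h, exact_risk_premium(100.0, h), approx_risk_premium(100.0, h))
```

With $w_0 = 100$ the two numbers agree closely for small $h$ and drift apart as $h$ grows, as one expects from a Taylor approximation.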
9. Some critique on expected utility theory
Expected utility theory is the main tool in economic models involving uncertainty. Nevertheless, it has been under constant attack from behavioral economists and psychologists, who show that subjects in experiments or real-life situations systematically violate the properties (G1) to (G4), or that mindless application of expected utility theory leads to counterintuitive conclusions. For this reason, many alternative models for decision making under risk and uncertainty have been developed. Perhaps the most well-known, especially since Daniel Kahneman was awarded the 2002 Nobel Prize in economics, is Kahneman and Tversky's prospect theory (Kahneman and Tversky, 1979). Although we lack time to go into such alternative models, we pause for a while and consider a number of blows to the expected utility model.
9.1. Problems with unbounded utility: a variant of the St. Petersburg paradox
Nothing in the development of our expected utility model required the utility function to be bounded. Unbounded utility functions, however, make decision-makers susceptible to cunning exploitation. Suppose a DM with initial wealth $w_0 > 0$ has a vNM utility function $u$ over money which is not bounded from above. By assumption, there is some wealth $w_1$ with

$u(w_0) < \tfrac{1}{2} \left(u(0) + u(w_1)\right).$

Smile and offer your victim the gamble $\left(\tfrac{1}{2} \circ 0, \; \tfrac{1}{2} \circ w_1\right)$, which he will accept by construction. If he loses, he ends up with wealth zero. If he wins, hold out $w_1$ to him and, just before he takes the money from your hand, retract it, turn your smile back on, and offer him a gamble $\left(\tfrac{1}{2} \circ 0, \; \tfrac{1}{2} \circ w_2\right)$, where $w_2$ is chosen such that $u(w_1) < \tfrac{1}{2} \left(u(0) + u(w_2)\right)$. Again, by construction, the DM will accept. As long as the DM goes on winning, keep offering such 50-50 odds gambles. The probability that he is still winning after $n$ rounds is $(1/2)^n$, so the DM will end up with wealth zero with probability one!
9.2. Allais' paradox
Consider the following four simple gambles:
$g_1 = (1 \circ \$1,000,000),$
$g_2 = (0.10 \circ \$5,000,000, \; 0.89 \circ \$1,000,000, \; 0.01 \circ \$0),$
$g_3 = (0.11 \circ \$1,000,000, \; 0.89 \circ \$0),$
$g_4 = (0.10 \circ \$5,000,000, \; 0.90 \circ \$0).$

It turns out that in different experiments, most people prefer $g_1$ to $g_2$, but $g_4$ to $g_3$. This violates expected utility theory. Suppose a DM has vNM utility function $u$. Then

$g_1 \succ g_2 \iff u(\$1,000,000) > 0.10\,u(\$5,000,000) + 0.89\,u(\$1,000,000) + 0.01\,u(\$0).$

Rearranging terms, we find

$g_1 \succ g_2 \iff 0.11\,u(\$1,000,000) > 0.10\,u(\$5,000,000) + 0.01\,u(\$0)$
$\iff 0.11\,u(\$1,000,000) + 0.89\,u(\$0) > 0.10\,u(\$5,000,000) + 0.90\,u(\$0)$
$\iff g_3 \succ g_4,$

where the last equivalence follows from computing the expected utility of $g_3$ and $g_4$.
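The rearrangement above says that, whatever the utility function, the expected utility gap between $g_1$ and $g_2$ equals the gap between $g_3$ and $g_4$. A minimal sketch that checks this with exact rational arithmetic is given below; the helper names and the sample utility values are hypothetical and serve only as an illustration.

```python
from fractions import Fraction as F

# Each gamble is a list of (probability, prize in dollars) pairs.
G1 = [(F(1), 1_000_000)]
G2 = [(F(10, 100), 5_000_000), (F(89, 100), 1_000_000), (F(1, 100), 0)]
G3 = [(F(11, 100), 1_000_000), (F(89, 100), 0)]
G4 = [(F(10, 100), 5_000_000), (F(90, 100), 0)]


def expected_utility(gamble, u):
    """Expected utility of a simple gamble under the utility function u."""
    return sum(p * u(prize) for p, prize in gamble)


def eu_gaps(u):
    """Return (EU(g1) - EU(g2), EU(g3) - EU(g4)) for the utility function u."""
    gap_12 = expected_utility(G1, u) - expected_utility(G2, u)
    gap_34 = expected_utility(G3, u) - expected_utility(G4, u)
    return gap_12, gap_34


if __name__ == "__main__":
    # A made-up concave utility, given as a lookup table on the three prizes.
    table = {0: F(0), 1_000_000: F(10), 5_000_000: F(11)}
    print(eu_gaps(table.get))
```

Because the two gaps coincide for every $u$, an expected utility maximizer who strictly prefers $g_1$ to $g_2$ must also strictly prefer $g_3$ to $g_4$; the commonly observed pattern is therefore inconsistent with any choice of $u$.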
9.3. Probability matching
You are paid one dollar each time you guess correctly whether a red or a green light will flash. The lights flash randomly, but the red is set to turn on three times as often as the green. It has been found that many subjects in experiments of this type try to imitate the chance mechanism: they choose red about three quarters of the time and green one quarter. Obviously it would be more profitable to always choose red. Formally, choosing red with probability $3/4$ yields a compound lottery that gives you a one dollar payoff with probability

$(3/4)^2 + (1/4)^2 = 10/16,$

corresponding with the simple gamble

$\left((10/16) \circ \$1, \; (6/16) \circ \$0\right),$

while choosing red with probability one corresponds with the simple gamble

$\left((3/4) \circ \$1, \; (1/4) \circ \$0\right).$

Since $3/4 > 10/16$, the second gamble should be strictly preferred over the first.
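The computation generalizes to any guessing frequency. The sketch below (with a hypothetical function name, and assuming guesses are independent of the lights) returns the probability of a correct guess when red is guessed with probability $q$ and the red light flashes with probability $p$.

```python
from fractions import Fraction


def success_probability(q, p):
    """Probability of a correct guess.

    q: probability with which the subject guesses red,
    p: probability with which the red light flashes,
    with the guess assumed independent of the light.
    """
    return q * p + (1 - q) * (1 - p)


if __name__ == "__main__":
    p_red = Fraction(3, 4)
    print("matching:  ", success_probability(p_red, p_red))
    print("always red:", success_probability(Fraction(1), p_red))
```

Since the success probability is linear in $q$ with slope $2p - 1$, it is maximized at $q = 1$ whenever $p > 1/2$.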
This type of matching behavior has been frequently observed in real life, as well as laboratory experiments, using both humans and animals as subjects. In an experiment with animals, for instance, foraging behavior of pigeons was studied, using two food patches (call them red and green, as above) with food being dispatched at the red location three quarters of the time and at the green location one quarter of the time. The pigeons tried to match this probability distribution. A small personal anecdote: jointly with two colleagues, I published two papers on a game theoretic model of bounded rationality in which players are assumed to display matching behavior. To explain the type of behavior to laymen and motivate that it is observed in real life, we used different examples, among them the pigeon example mentioned above. This led the Dutch Foundation for Mathematical Research, which at that time was financing my work, to publish a press statement proudly proclaiming: "People behave like pigeons when dealing with probability", a press statement that gave us extensive media coverage but where we desperately tried to qualify our employers' overzealous interpretation. So in case you sometimes wonder what you are doing... you may just be behaving like a pigeon!
9.4. Rabin's calibration theorem
Matthew Rabin, one of the world's leading behavioral economists, published a remarkable article (Rabin, 2000) on the consequences of risk aversion with respect to small-stake gambles. Let us start with an example to illustrate the result. Consider a risk averse DM who for each initial wealth level rejects a 50-50 odds gamble of winning 11 dollars or losing 10 dollars: certainly a rather unremarkable level of risk aversion. What does this imply about his preferences for other gambles? Consider, for instance, the following statements:
1. For each level of wealth, the DM will reject the lottery with a 50 percent chance of losing 100 dollars and a 50 percent chance of gaining 150 dollars.
2. For each level of wealth, the DM will reject the lottery with a 50 percent chance of losing 100 dollars and a 50 percent chance of gaining 1,500 dollars.
3. For each level of wealth, the DM will reject the lottery with a 50 percent chance of losing 100 dollars and a 50 percent chance of gaining 1,000,000 dollars.
4. For each level of wealth, the DM will reject the lottery with a 50 percent chance of losing 100 dollars and a 50 percent chance of gaining 1,000,000,000,000,000,000 dollars.
5. You can proceed with gains $G$ as high as you want, but the DM will always reject the lottery with a 50 percent chance of losing 100 dollars and a 50 percent chance of gaining $G$ dollars.
Which of these statements are true? The first and the second may perhaps not be so surprising and I probably wouldn't be asking you if the question was trivial, so even the third could be true. On the other hand, one would certainly doubt the sanity of a DM rejecting the bet in the fourth claim and lingering doubt turns to certainty in the fifth case. Yet this is exactly what the DM will do: no amount of money in the world will make him accept a gamble with a 50 percent chance of losing 100 dollars. Clearly, such behavior is absurd.
Let us try to establish some intuition. The fact that the DM at each wealth level $w$ rejects the gamble

$\left(\tfrac{1}{2} \circ (w - 10), \; \tfrac{1}{2} \circ (w + 11)\right)$

implies that

$\forall w: \quad \tfrac{1}{2} u(w - 10) + \tfrac{1}{2} u(w + 11) < u(w),$

or, rewriting the expression, that

$\forall w: \quad u(w + 11) - u(w) < u(w) - u(w - 10).$

Hence, on average, the DM values each dollar between $w$ and $w + 11$ by at most $10/11$ times as much as he, on average, values each dollar between $w - 10$ and $w$:

$\forall w: \quad \frac{u(w + 11) - u(w)}{11} < \frac{10}{11} \cdot \frac{u(w) - u(w - 10)}{10}.$

By concavity of the utility function, this means that the marginal utility of the $(w + 11)$-th dollar is at most $10/11$ times the marginal utility of the $(w - 10)$-th dollar:

$\forall w: \quad u'(w + 11) < \frac{10}{11} u'(w - 10).$ (36)

Repeated application of (36) implies an enormous decrease in marginal utility of money: the marginal utility of dollar $w + 32$ is at most $10/11$ times the marginal utility of dollar $w + 11$, which is at most $10/11$ times the marginal utility of dollar $w - 10$, so the marginal utility of dollar $w + 32$ is at most $(10/11)^2 \approx 0.83$ times the value of dollar $w - 10$. Similarly, the DM values dollar $w + 53$ by at most $(10/11)^3 \approx 0.75$ times the value of dollar $w - 10$. More generally, the DM values dollar $w + 11 + 21k$, where $k \in \mathbb{N}$, by at most $(10/11)^{k+1}$ times the value of dollar $w - 10$, which is an extremely high rate of deterioration for the value of money.
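To get a feeling for how fast this bound shrinks, the following sketch tabulates $(10/11)^{k+1}$ against the dollar offset $11 + 21k$ above $w$. The function names are hypothetical; the numbers follow directly from repeated application of (36).

```python
def dollar_offset(k):
    """Offset above w of the dollar reached after k further applications of (36)."""
    return 11 + 21 * k


def marginal_utility_bound(k):
    """Upper bound on u'(w + 11 + 21k) relative to u'(w - 10)."""
    return (10 / 11) ** (k + 1)


if __name__ == "__main__":
    for k in (0, 1, 2, 10, 50):
        print(f"dollar w + {dollar_offset(k)}: at most "
              f"{marginal_utility_bound(k):.4f} times the value of dollar w - 10")
```

Roughly a thousand dollars above $w$ (the case $k = 50$), a marginal dollar is already worth less than one percent of dollar $w - 10$, which conveys why even enormous gains cannot compensate for a modest loss.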
10. Time preference
Discounting essentially means that a given benefit is valued higher when it is received immediately than when it is received with a delay. A common economic motivation for discounting is that, say, one dollar today is worth more than one dollar next year, as the immediate reward can be put into a bank at an annual interest rate $r > 0$, making the dollar today worth $1 + r$ dollars next year. Another motivation, common in evolutionary models, is the risk that a delayed benefit may not be realized: you may die before receiving it (or be interrupted in achieving it, or be cheated in the promise of receiving it). In addition to the question how to model discounting in an appropriate way, decision theory in the presence of time involves a number of careful considerations:
- Choice of horizon: should one look finitely or infinitely far into the future? Keynes' famous quote "In the long run, we are all dead" could be an argument in favor of a finite horizon. Many economic models involve just two time periods as an abstraction of "now" and "the future". On the other hand, many decisions have no clearly defined final period: you, or in an evolutionary sense (as in overlapping generations models) your genes, may live to see another day. In such cases, an infinite horizon makes sense.
- Choice of time as a discrete or continuous variable: also here, common sense, the appropriate level of abstraction, and (not rarely) the modeler's choice of mathematical tools are decisive.
Unless specified otherwise, this section takes time as being discrete and uses an infinite horizon. We derive the standard exponential discounting model from a stationarity assumption on preferences and briefly discuss a violation of stationarity and hyperbolic discounting. Section 10.3, based on Osborne and Rubinstein (1994, Sec. 8.3), considers two criteria for evaluating outcomes over time without discounting. The final section, based on Voorneveld (2007), illustrates the somewhat paradoxical statement that a sequence of utility-maximizing choices can minimize utility.
10.1. Stationarity and exponential discounting
The standard model of preferences over time assumes that:
- the set of alternatives consists of sequences of outcomes $c = (c_0, c_1, \ldots) = (c_t)_{t=0}^{\infty}$ in some arbitrary set, where $c_t$ denotes the outcome at time $t$.
- preferences $\succsim$ over such sequences are represented by a utility function $U$ of the form

$U(c) = \delta(0) u(c_0) + \delta(1) u(c_1) + \delta(2) u(c_2) + \cdots = \sum_{t=0}^{\infty} \delta(t) u(c_t),$ (37)

with $\delta(0) = 1$.
The function in (37) is often interpreted as a sum of discounted instantaneous utilities: the outcome $c_t$ at time $t \in \mathbb{N}$ gives utility $u(c_t)$, but is discounted by a factor $\delta(t) \in (0, 1)$ as it lies in the future. The discount factor $\delta(0) = 1$ for current outcomes is mostly cosmetic, facilitating the notation involving an infinite sum.
Exercise 10.1 The expression in (37) involves an infinite sum, which may not be well-defined.
(a) Give an example to show this.
(b) Prove that the sum is well-defined if the sequence of discount factors is summable ($\sum_{t=0}^{\infty} \delta(t) < \infty$) and the instantaneous utility function $u$ is bounded.
The most common form of (37) involves exponential discounting: there is a $\delta \in (0, 1)$ such that $\delta(t) = \delta^t$ for all $t$, turning utility into

$U(c) = \sum_{t=0}^{\infty} \delta^t u(c_t).$ (38)

Recall the earlier motivation for discounting of money: given a fixed interest rate $r > 0$ per period, one dollar tomorrow is worth only $(1 + r)^{-1}$ dollars today, so it makes sense to discount future money by powers of $\delta = (1 + r)^{-1}$. Following Koopmans (1960), exponential discounting can also be derived by imposing a stationarity requirement on preferences. Preferences satisfy stationarity if they are not affected if a common first outcome is dropped, and the timing of all other outcomes is advanced by one period. By repeated application, it implies that for a comparison between two sequences all initial periods with common outcomes can be dropped, and the first period of different outcomes can be taken as the initial period. Formally, the preference relation $\succsim$ is stationary if for all pairs $(c_t)_{t=0}^{\infty}$ and $(d_t)_{t=0}^{\infty}$ with $c_0 = d_0$:

$(c_0, c_1, c_2, \ldots) \succsim (d_0, d_1, d_2, \ldots) \iff (c_1, c_2, \ldots) \succsim (d_1, d_2, \ldots).$
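Deriving exponential discounting usually proceeds along the following lines:

Proposition 10.1 For notational convenience, let 0 be a feasible outcome. Assume that:
- preferences $\succsim$ can be represented by a utility function as in (37),
- preferences $\succsim$ satisfy stationarity,
- the decision-maker is indifferent between:
  option 1: getting $\alpha$ today and $\beta$ tomorrow (i.e., the sequence $(\alpha, \beta, 0, 0, \ldots)$),
  option 2: getting $\alpha'$ today and $\beta'$ tomorrow,
  where $u(\beta) \neq u(\beta')$.
Then the discount factor is exponential:

$\delta(t) = \left( \frac{u(\alpha) - u(\alpha')}{u(\beta') - u(\beta)} \right)^{t}.$

Proof. By induction on $t$. The result is trivial if $t = 0$. Let $t \in \mathbb{N}$ and assume that

$\delta(\tau) = \left( \frac{u(\alpha) - u(\alpha')}{u(\beta') - u(\beta)} \right)^{\tau}$ for all $\tau < t$.

Repeated application of stationarity implies that

$(\underbrace{0, \ldots, 0}_{t-1 \text{ times}}, \alpha, \beta, 0, 0, \ldots) \sim (\underbrace{0, \ldots, 0}_{t-1 \text{ times}}, \alpha', \beta', 0, 0, \ldots),$

i.e., their utility must be the same. Substitution in (37) gives

$\delta(t - 1) \left(u(\alpha) - u(\alpha')\right) + \delta(t) \left(u(\beta) - u(\beta')\right) = 0,$

so

$\delta(t) = \delta(t - 1) \, \frac{u(\alpha) - u(\alpha')}{u(\beta') - u(\beta)} = \left( \frac{u(\alpha) - u(\alpha')}{u(\beta') - u(\beta)} \right)^{t},$

where the final equality uses the induction hypothesis.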
Exercise 10.2 Rational suicide: A decision maker (DM) lives for at most two periods, $t = 0$ and $t = 1$. At each time $t \in \{0, 1\}$ that he is alive, he must decide, depending on his mood, whether or not to commit suicide. Regardless of his initial mood, at time $t = 1$ he will be depressed with probability $1/2$ or happy with probability $1/2$. His instantaneous utility is state-dependent, i.e., it depends not only on his action but also on the state of the world at time $t$. The set of states is $S = \{h, d, D\}$, where $h$ is "alive and happy", $d$ is "alive and depressed", and $D$ is "dead". The set of actions is $A = \{k, \ell\}$, where $k$ is "commit suicide" and $\ell$ is "go on living". The instantaneous utility function $u : S \times A \to \mathbb{R}$ is defined as follows:

$u(s, a) = \begin{cases} 1 & \text{if } (s, a) = (h, \ell), \\ -1 & \text{if } (s, a) = (h, k), \\ -\alpha & \text{if } (s, a) = (d, \ell), \\ 0 & \text{otherwise,} \end{cases}$

where $\alpha > 0$ is the intensity of the depression. Thus, given that you're happy, killing yourself appears silly, but if you're depressed, it may seem less so. State $D$ is irreversible: should the DM decide to kill himself at time $t = 0$, then he receives utility 0 at time $t = 1$. The DM discounts the future exponentially at rate $0 < \delta < 1$, and maximizes expected lifetime utility (of the standard additive form). We solve the decision problem by backward induction, starting with optimal behavior in the final period $t = 1$. Assume the DM is alive at time $t = 1$.
(a) What is the optimal action and the resulting instantaneous utility if the DM at $t = 1$ is (a1) happy? (a2) depressed?
Now consider the initial period: assume the DM is depressed at time $t = 0$.
(b) Assuming optimal behavior at time $t = 1$, what is the optimal action at time $t = 0$? Note: (i) if the DM does not kill himself immediately, there is uncertainty about his mood at time $t = 1$; (ii) the answer depends on $\alpha$ and $\delta$.
(c) A psychologist claims that the option of future suicide might prevent depressed people from killing themselves straight away. Explain this claim using the answers above.
10.2. Preference reversal and hyperbolic discounting
Stationarity requires that if you prefer one apple today over two apples tomorrow, then shifting this choice by one year (one apple next year versus two apples one year and a day from now) doesn't change that you'd still rather have the single apple. On the other hand, empirical evidence (Thaler, 1981) seems to suggest that people are much more sensitive to a waiting time of one day when it occurs right now than to a waiting time in the far future: if you anyway have to wait an entire year for a lousy apple, you might as well wait one day more and double the booty. Different attempts to capture such a preference reversal go under the heading of hyperbolic discounting. It simply involves discount factors that are not exponential. Arguably the simplest approach is the so-called $(\beta, \delta)$-model of Phelps and Pollak (1968). They define, for $\beta, \delta \in (0, 1)$, the discount factors as $\delta(0) = 1$, $\delta(1) = \beta\delta$, $\delta(2) = \beta\delta^2$, $\delta(3) = \beta\delta^3$, ..., turning the utility function (37) into:

$U(c) = u(c_0) + \beta \sum_{t=1}^{\infty} \delta^t u(c_t).$

To see that this model can explain the preference reversal for the apples, assume that utility $u$ satisfies $u(0) = 0$ and is strictly increasing in apples. Preferring one apple today over two apples tomorrow means that

$u(1) > \beta\delta \, u(2).$ (39)

Preferring two apples one year and a day from now to one apple a year from now (and assuming we're not in a leap year) means that

$\beta\delta^{366} u(2) > \beta\delta^{365} u(1).$ (40)

For (39) and (40) to hold simultaneously, we simply need $(\beta, \delta) \in (0, 1) \times (0, 1)$ to satisfy

$\beta\delta < \frac{u(1)}{u(2)} < \delta.$

Taking $\delta$ sufficiently close to one and $\beta$ sufficiently close to zero will do the trick.
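The two inequalities are easy to check numerically. The sketch below assumes linear utility $u(a) = a$ in apples, so that $u(1)/u(2) = 1/2$, and the parameter values $\beta = 0.4$, $\delta = 0.999$; both are illustrative assumptions rather than choices made in the text, and the function names are hypothetical.

```python
def discount(t, beta, delta):
    """Phelps-Pollak discount factor: 1 at t = 0, beta * delta**t afterwards."""
    return 1.0 if t == 0 else beta * delta ** t


def prefers_early(t, beta, delta, u=lambda apples: float(apples)):
    """True if one apple at time t is strictly preferred to two apples at t + 1."""
    return discount(t, beta, delta) * u(1) > discount(t + 1, beta, delta) * u(2)


def has_preference_reversal(beta, delta):
    """Inequalities (39) and (40): impatient today, patient a year from now."""
    return prefers_early(0, beta, delta) and not prefers_early(365, beta, delta)


if __name__ == "__main__":
    print(has_preference_reversal(beta=0.4, delta=0.999))  # quasi-hyperbolic case
    print(has_preference_reversal(beta=1.0, delta=0.4))    # exponential, impatient
    print(has_preference_reversal(beta=1.0, delta=0.999))  # exponential, patient
```

Setting $\beta = 1$ recovers exponential discounting, for which the comparison between $t$ and $t + 1$ does not depend on $t$, so no reversal can occur.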

Exercise 10.3 (Loewenstein and Prelec, 1992) Discount factors $\delta(t) = (1 + \alpha t)^{-\beta/\alpha}$, with $\alpha, \beta > 0$, fit experimental data well. Show that this model also captures the preference reversal described above.
Exercise 10.4 (Wärneryd, 2007) Sex and time preference: In some evolutionary models of intertemporal consumption, time periods represent generations and people care about future consumption to the extent that it is exercised by their offspring (children, grandchildren, etc.). To simplify matters, assume that (i) a DM cares about consumption of its offspring only if it has a specific gene; (ii) mates are selected at random and have the relevant gene with probability $\pi \in [0, 1]$; (iii) offspring gets in expectation half of its genes from each parent; (iv) we consider one unit of offspring per time period.
(a) Let $t \in \mathbb{N}$. Show, for instance by conditioning on the giver of the gene, that the probability $p_t$ of the DM's $t$-th period offspring carrying the gene satisfies the recurrence relation

$p_t = \tfrac{1}{2} p_{t-1} + \tfrac{1}{2} \pi.$

(b) Set $p_0 = 1$ and show that the solution to the recurrence relation is

$p_t = \pi + (1 - \pi) \left(\tfrac{1}{2}\right)^{t}$ for all $t = 0, 1, 2, \ldots$

With these kinship parameters $(p_t)_{t=0}^{\infty}$ in the place of discount factors, the standard separable utility function becomes $U(c) = \sum_{t=0}^{\infty} p_t u(c_t)$. Assume that consumption is in units of apples, $u$ is strictly increasing, and $u(0) = 0$. Let's investigate the opportunity of preference reversal.
(c) Let $\pi$ and $u$ be such that the DM prefers 1 apple now ($t = 0$) to 2 apples next generation ($t = 1$). Prove: for $T \in \mathbb{N}$ sufficiently large, the DM prefers 2 apples at time $T + 1$ to 1 apple at time $T$.
10.3. Limit-of-means and overtaking
By discounting, less weight is assigned to future utilities. This section introduces two other ways of evaluating sequences of utilities, attaching equal weight to all periods. To save on notation, we will denote a sequence of utilities simply by a sequence $(x_t)_{t=0}^{\infty}$ of real numbers, rather than using the more elaborate $(u(c_t))_{t=0}^{\infty}$. Probably the first thing that comes to mind is to value a sequence of utilities $(x_t)_{t=0}^{\infty}$ using the long-term average of the utilities:

$\lim_{T \to \infty} \frac{x_0 + x_1 + \cdots + x_{T-1}}{T}.$

However, even if the sequence is bounded, this limit may not exist: the average may continue to oscillate. We verify this statement with a binary (zero-one) sequence. The idea is to append enough ones to increase the average until it achieves a fixed high value, then to append enough zeroes to decrease the average until it reaches a fixed low value, and continue this process.
An oscillating average: Consider the binary sequence

$(0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, \ldots)$

obtained by starting with a zero and two ones, and then after each block of zeroes or ones, double the length of the sequence obtained so far with a block of the other number: after the first block of ones, we have three coordinates, so we double the length to six coordinates by appending some zeroes. Then we double the length to twelve coordinates by adding some ones, etc. A simple inductive proof shows that after the $k$-th block of ones, the sequence has $3 \cdot 2^{2k-2}$ coordinates, $2^{2k-1}$ of them equal to one, and therefore an average of $2/3$. Doubling the length to $3 \cdot 2^{2k-1}$ coordinates by appending zeroes decreases the average by a factor $1/2$ to $1/3$. As appending zeroes decreases, and appending ones increases the average, it follows that the average continues to oscillate between $1/3$ and $2/3$.
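The construction can be reproduced directly. The following sketch (with hypothetical function names) builds the sequence by repeated doubling and reports the running average at the end of each block, using exact fractions.

```python
from fractions import Fraction


def oscillating_sequence(doublings):
    """Start with (0, 1, 1) and double the length `doublings` times,
    alternately appending a block of zeroes and a block of ones."""
    seq = [0, 1, 1]
    symbol = 0  # the first appended block consists of zeroes
    for _ in range(doublings):
        seq.extend([symbol] * len(seq))
        symbol = 1 - symbol
    return seq


def averages_at_block_ends(doublings):
    """Average of the first 3, 6, 12, ... coordinates of the sequence."""
    seq = oscillating_sequence(doublings)
    averages = []
    length = 3
    while length <= len(seq):
        averages.append(Fraction(sum(seq[:length]), length))
        length *= 2
    return averages


if __name__ == "__main__":
    print(oscillating_sequence(2))
    print(averages_at_block_ends(6))
```

At the block ends the averages alternate between $2/3$ and $1/3$ no matter how far the sequence is extended, so the long-term average does not converge.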
Taking, instead, a pessimistic view of how the average utility changes over time will give us a well-defined criterion. This requires some mathematical preliminaries. Consider a bounded sequence $(x_t)_{t=0}^{\infty}$ of real numbers. For each $t$, $s_t = \inf\{x_s : s \geq t\}$ indicates the infimum (in somewhat colloquial terms, the worst value) of the tail of the sequence from time $t$ onwards. This infimum is well-defined, as the sequence is bounded. Notice also that the sequence $(s_t)_{t=0}^{\infty}$ is weakly increasing: increasing $t$ implies taking the infimum over a smaller set. As $(s_t)_{t=0}^{\infty}$ is a monotonic, bounded sequence, it converges. Its limit is called the lower limit or limes inferior (liminf) of the original sequence $(x_t)_{t=0}^{\infty}$:

$\liminf_{t \to \infty} x_t = \lim_{t \to \infty} \left( \inf\{x_s : s \geq t\} \right).$

By convention, $\liminf_{t \to \infty} x_t = -\infty$ if $(x_t)_{t=0}^{\infty}$ is not bounded from below. If it is bounded from below, but not from above, the sequence of infima may diverge, in which case one sets $\liminf_{t \to \infty} x_t = +\infty$.
The following characterization of the lower limit may come in handy. Let $(x_t)_{t=0}^{\infty}$ be a sequence and let $c \in \mathbb{R}$. Then $\liminf_{t \to \infty} x_t = c$ if and only if:
[L1] for each $\varepsilon > 0$, there is a $T \in \mathbb{N}$ such that $c - \varepsilon < x_t$ for all $t \geq T$,
[L2] for each $\varepsilon > 0$ and each $T \in \mathbb{N}$, there is a $t \geq T$ with $x_t < c + \varepsilon$.
In words, the sequence eventually remains above $c - \varepsilon$, but dives below $c + \varepsilon$ infinitely often, no matter how small $\varepsilon > 0$.

Exercise 10.5 Prove this.
The limit-of-means criterion evaluates utility streams by means of the lower limit of the average utility:

Limit of means: Let $x = (x_t)_{t=0}^{\infty}$ and $y = (y_t)_{t=0}^{\infty}$ be sequences in $\mathbb{R}$. Then $x$ is preferred to $y$ by the limit-of-means criterion, denoted $x \succ_L y$, if and only if

$\liminf_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} (x_t - y_t) > 0.$ (41)

Inequality (41) is equivalent with the statement that for some $\varepsilon > 0$, the average difference between sequences $x$ and $y$ eventually exceeds $\varepsilon$:

$\frac{1}{T} \sum_{t=0}^{T-1} (x_t - y_t) > \varepsilon$ for all but finitely many periods $T$.
Exercise 10.6 Prove this.
Changes in a single coordinate of a sequence become negligible once the average is taken over a long time, so under the limit-of-means criterion, changes in any finite number of periods do not matter. In particular, these preferences are stationary.
Exercise 10.7 Some authors refer to the limit-of-means criterion as the preference relation represented by the utility function assigning to each bounded sequence $x = (x_t)_{t=0}^{\infty}$ the number

$U(x) = \liminf_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} x_t.$

(a) Why must the sequences be bounded? (b) Aside from this, are the two definitions really the same?
The following criterion also assigns equal weight to periods, but remains sensitive to changes in single coordinates:

Overtaking: Let $x = (x_t)_{t=0}^{\infty}$ and $y = (y_t)_{t=0}^{\infty}$ be sequences in $\mathbb{R}$. Then $x$ is preferred to $y$ by the overtaking criterion, denoted $x \succ_O y$, if and only if

$\liminf_{T \to \infty} \sum_{t=0}^{T} (x_t - y_t) > 0.$

Let us compare exponential discounting, and the limit-of-means and overtaking criteria. The latter were defined in terms of strict preferences. Define the corresponding indifference relation $\sim_L$ as follows: $x \sim_L y$ if neither $x \succ_L y$ nor $y \succ_L x$. Of course, $\sim_O$ is defined similarly.
Comparison:
- The sequence $(1, -1, 0, 0, \ldots)$ is preferred to the sequence $(0, 0, \ldots)$ under exponential discounting for all $\delta \in (0, 1)$. Under the other two criteria, they are equivalent.
- The sequence $(-1, 2, 0, 0, \ldots)$ is preferred to the sequence $(0, 0, \ldots)$ under the overtaking criterion. Under the limit-of-means criterion, they are equivalent.
- For every $n \in \mathbb{N}$, the sequence $(\underbrace{0, \ldots, 0}_{n \text{ times}}, 1, 1, \ldots)$ is preferred to $(1, 0, 0, \ldots)$ under the limit-of-means criterion. However, for each $\delta \in (0, 1)$, a large enough delay in a constant stream of ones makes the instant gratification of getting 1 immediately the preferable option.
10.4. Better may be worse
Consider an alcoholic who has to decide at each moment in discrete time whether to take a drink (action 1) or not (action 0). Given his uncertain life-length, common modeling practice is to treat this as an infinite horizon problem, discounting the impact of future decisions if so desired. A drinking pattern is a sequence $x = (x_t)_{t=0}^{\infty}$ of zeroes and ones, with $x_t = 1$ if the alcoholic takes a drink at time $t$ and $x_t = 0$ otherwise. With a minor abuse of notation, $(0, x_{-t})$ denotes the drinking pattern obtained from $x$ by not drinking at time $t$. The pattern $(1, x_{-t})$ is defined likewise.
The philosophy of Alcoholics Anonymous is to fight the temptations of alcohol by forgetting about the past or the future and concentrate exclusively on the present: stay away from a drink one day at a time. Let us investigate the possibility of having a utility function $U$ that simultaneously models:

Temptation: at any given day, the alcoholic is at least as well off, and sometimes better off, by choosing to drink:

$U(1, x_{-t}) \geq U(0, x_{-t})$ for all $t, x_{-t}$,
$U(1, x_{-t}) > U(0, x_{-t})$ for some $t, x_{-t}$.

Health concerns: nevertheless, the best thing is never to drink and the worst thing is to drink at all times: $(0, 0, \ldots)$ maximizes, and $(1, 1, \ldots)$ minimizes $U$.
This sounds paradoxical and is indeed impossible under a finite horizon: suppose there are only $T \in \mathbb{N}$ periods. Start with an arbitrary drinking pattern and switch, one period at a time, any abstention (0) to drinking (1). By temptation, each such switch weakly increases the utility function. So drinking at all times maximizes utility, in conflict with health concerns, which would require that all these weak increases in utility eventually lead to a plunge in utility: it is like climbing a stairway, but ending up lower than before (Figure 3).

Figure 3: An impossible stairway

The next example shows that temptation and health concerns can be reconciled under an infinite horizon.
Drinking paradox: Define the utility for each drinking pattern $x$ as follows:

$U(x) = \begin{cases} 3 & \text{if } x_t = 1 \text{ for only finitely many } t \text{ (a rare drinker)}, \\ 0 & \text{if } x_t = 0 \text{ for only finitely many } t \text{ (a heavy addict)}, \\ \sum_{t=0}^{\infty} 2^{-t} x_t & \text{otherwise.} \end{cases}$

As a switch from 0 to 1 at time $t$ leaves the utility unaffected in the first two cases and increases it by $2^{-t} > 0$ otherwise, the temptation assumption is satisfied. However,

$U(0, 0, \ldots) = 3 = \max_x U(x) > \min_x U(x) = 0 = U(1, 1, \ldots),$

in conformance with health concerns.
11. Probabilistic choice
Consider a DM with a finite set $A$ of alternatives. Earlier, we saw that if the DM has a weak order $\succsim$ over these alternatives, there is a utility function $u : A \to \mathbb{R}$ representing these preferences and making an optimal choice reduces to choosing an alternative $a \in \arg\max_{b \in A} u(b)$, a utility maximizing alternative. However, in numerous experiments, it turns out that DMs:
- do not always make the same choice under seemingly identical circumstances,
- sometimes choose seemingly suboptimal alternatives.
Such apparently irrational behavior has led to the development of so-called probabilistic choice models, where the main idea is that:
- each alternative is chosen with some probability,
- if $a$ and $b$ are feasible choices and $a \succsim b$, then the probability of choosing $a$ should be at least as large as the probability of choosing $b$.
This section gives a very short introduction to three probabilistic choice models: the Luce model, the logit model, and the linear probability model. Often, probabilistic choice models are derived in a random utility framework, where the true utility of each alternative consists of a deterministic component plus a random component. Depending on the realization of the random utility component, a feasible choice will look good under some circumstances and bad under others, thus motivating that observed choice is probabilistic: an alternative is only chosen in circumstances where it looks optimal. We will not consider such random utility models: they are (or should be) treated in detail in the econometrics courses. The development of these models was one of the main causes for awarding Daniel McFadden the Nobel Prize in 2000. Instead, we derive the models either axiomatically or via the introduction of control costs: DMs want to choose optimally, but incur costs to precisely implement their choices. A good introduction to probabilistic choice models can be found in Anderson et al. (1992, Ch. 2) and Ben-Akiva and Lerman (1985, Ch. 3). On the content of this section: The Luce model is due to Luce (1959). The derivation of the logit choice probabilities using the entropy cost function can be found in Mattsson and Weibull (2002). The derivation of the linear probability model using the Euclidean distance as cost function is due to Voorneveld (2006). It is based on an early contribution to the literature on bounded rationality in games by Rosenthal (1989).
11.1. The Luce model
Consider a finite set $A$ of alternatives. Some notation:
- In the remainder of this section, we assume that the DM has to choose from a subset of alternatives in $A$ containing at least two elements: choosing from a set with only one alternative is trivial. We will typically denote such sets by $S \subseteq A$ or $T \subseteq A$.
- If the DM has to make a choice from a set $S \subseteq A$, we denote the probability that the DM chooses $a \in S$ by $P_S(a) \in [0, 1]$. Obviously, we require that $\sum_{a \in S} P_S(a) = 1$.
- If $S \subseteq T \subseteq A$, we denote the probability that an element from $S$ is chosen when the choice set is $T$ by

$P_T(S) = \sum_{a \in S} P_T(a).$

  By the assumption above: $P_T(T) = 1$ for all $T \subseteq A$.
- The set obtained from $S$ by removing an element $a \in A$ is denoted by $S \setminus \{a\}$.
With this notation, the following two properties should be intuitive. The first property states that if some alternative $a \in T$ is never chosen in a pairwise comparison with some other $b \in T$, i.e., $P_{\{a,b\}}(a) = 0$, then $a$ can be deleted from $T$ without affecting the choice probabilities of the remaining alternatives:

(L1) Let $T \subseteq A$ and $a \in T$. If there exists a $b \in T$ with $P_{\{a,b\}}(a) = 0$, then

$P_T(S) = P_{T \setminus \{a\}}(S \setminus \{a\})$ for all $S \subseteq T$.

Taking $S = T \setminus \{a\}$ in (L1), we get $P_T(T \setminus \{a\}) = P_{T \setminus \{a\}}(T \setminus \{a\}) = 1$, so $P_T(a) = 0$.
What about cases where no alternative $a$ is always rejected in pairwise comparisons? In that case it is reasonable to assume the following path independence condition: if $a \in S \subseteq T$, then the probability of choosing $a$ from $T$ should be equal to the probability of (i) first selecting the subset $S$ and (ii) from $S$ choosing the element $a$. Formally:

(L2) Let $S \subseteq T \subseteq A$ and $a \in S$. If $P_{\{a,b\}}(a) \notin \{0, 1\}$ for all $b \in T$, then

$P_T(a) = P_T(S) P_S(a).$

When making a choice from a set $T \subseteq A$, (L1) allows us to restrict attention to the alternatives for which there is imperfect discriminatory power: $P_{\{a,b\}}(a) \notin \{0, 1\}$ for all $a, b \in T$, $a \neq b$. The path independence condition then yields the following result:
Proposition 11.1 Assume that $P_{\{a,b\}}(a) \notin \{0, 1\}$ for all different $a, b \in A$. Path independence (L2) holds if and only if there is a function $u : A \to \mathbb{R}_{++}$ such that

$P_S(a) = \frac{u(a)}{\sum_{b \in S} u(b)}$ (42)

for every $S \subseteq A$ and every $a \in S$. Moreover, the function $u$ is unique up to multiplication by a positive scalar.
Proof.
Step 1: Assume path independence (L2) holds. We first prove that $P_A(a) > 0$ for all $a \in A$. Suppose, to the contrary, that $P_A(a) = 0$ for some $a \in A$. By (L2), we know that for every $b \in A \setminus \{a\}$:
$$0 = P_A(a) = P_A(\{a, b\})\,P_{\{a,b\}}(a).$$
Since $P_{\{a,b\}}(a) \neq 0$, it follows that $P_A(\{a, b\}) = P_A(a) + P_A(b) = 0$ for all $b \in A \setminus \{a\}$. Probabilities are nonnegative, so it must be that $P_A(b) = 0$ for all $b \in A$, contradicting $\sum_{b \in A} P_A(b) = 1$.

Having shown that $P_A(a) > 0$ for all $a \in A$, define $u(a) = P_A(a)$. Path independence (L2) implies that for every $S \subseteq A$ and $a \in S$:
$$P_S(a) = \frac{P_A(a)}{P_A(S)} = \frac{P_A(a)}{\sum_{b \in S} P_A(b)} = \frac{u(a)}{\sum_{b \in S} u(b)}.$$
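Step 2: Conversely, suppose that there is a function $u : A \to \mathbb{R}_{++}$ such that
$$P_S(a) = \frac{u(a)}{\sum_{b \in S} u(b)}$$
for every $S \subseteq A$ and $a \in S$. To show: (L2) holds. So let $S \subseteq T \subseteq A$ and $a \in S$. Then
$$P_T(a) = \frac{u(a)}{\sum_{b \in T} u(b)} = \frac{\sum_{b \in S} u(b)}{\sum_{b \in T} u(b)} \cdot \frac{u(a)}{\sum_{b \in S} u(b)} = P_T(S)\,P_S(a).$$

Step 3: To show that the function $u$ in (42) is unique up to multiplication with a positive constant, suppose there are two such functions $u$ and $u'$. It follows that for every $a \in A$:
$$P_A(a) = \frac{u(a)}{\sum_{b \in A} u(b)} = \frac{u'(a)}{\sum_{b \in A} u'(b)}.$$
Hence $u(a) = \alpha\, u'(a)$, where $\alpha = \sum_{b \in A} u(b) \big/ \sum_{b \in A} u'(b) > 0$.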
In words: In Luce's choice model, each alternative can be assigned a positive value such that the probability of choosing a given alternative from a choice set is proportional to its value.

Debreu (1960) showed that path independence, although reasonable at first sight, can lead to counterintuitive conclusions. Consider, for instance, the following well-known variant of Debreu's argument:
The blue bus/red bus paradox. A DM has to make a traveling mode decision: he can either go to his destination by car or by bus. Assume the DM assigns the same probability to both alternatives:
$$P_{\{\text{car},\,\text{bus}\}}(\text{car}) = P_{\{\text{car},\,\text{bus}\}}(\text{bus}) = 1/2. \tag{43}$$
Suppose now that two buses can be used, which are completely identical, except in their colors: one of them is red, the other is blue. So the choice set is $A = \{\text{car}, \text{blue bus}, \text{red bus}\}$. Assume that the DM pays no attention to color:
$$P_{\{\text{blue bus},\,\text{red bus}\}}(\text{blue bus}) = P_{\{\text{blue bus},\,\text{red bus}\}}(\text{red bus}). \tag{44}$$
Intuitively, since the DM according to (43) doesn't seem to care whether he goes by car or by bus, it would seem reasonable to expect that he will choose to go by car with probability $1/2$ and to go by bus with probability $1/2$, choosing randomly between the blue and the red bus:
$$P_A(\text{car}) = 1/2 \quad \text{and} \quad P_A(\text{blue bus}) = P_A(\text{red bus}) = 1/4,$$
or at least that the probability of taking the car should be larger than the probability of taking any of the two buses. However, path independence (L2) implies
$$P_A(\text{car}) = P_A(\text{blue bus}) = P_A(\text{red bus}) = 1/3.$$
To see this, notice that
$$\begin{aligned}
P_A(\text{car}) &\overset{\text{(L2)}}{=} P_A(\{\text{blue bus}, \text{car}\})\,P_{\{\text{blue bus},\,\text{car}\}}(\text{car}) \\
&\overset{(43)}{=} \tfrac{1}{2}\,P_A(\{\text{blue bus}, \text{car}\}) \\
&\overset{\text{def}}{=} \tfrac{1}{2}\,P_A(\text{blue bus}) + \tfrac{1}{2}\,P_A(\text{car}),
\end{aligned}$$
so $P_A(\text{car}) = P_A(\text{blue bus})$ and, similarly, $P_A(\text{car}) = P_A(\text{red bus})$. As the probabilities must add up to one:
$$P_A(\text{car}) = P_A(\text{blue bus}) = P_A(\text{red bus}) = 1/3.$$
So: in the choice problem with only one bus, the DM will choose to go by car or by bus with equal probability, but when faced with the choice between going by car or going by bus in case there are two virtually identical buses, the probability of choosing the car decreases from $1/2$ to $1/3$.
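The paradox can be checked with exact arithmetic. The following is a minimal sketch, assuming Luce's representation (42) with the illustrative values $u(\text{car}) = u(\text{blue bus}) = u(\text{red bus}) = 1$, which are one choice consistent with (43) and (44). The function name `luce_probabilities` is hypothetical and not part of the notes.

```python
from fractions import Fraction


def luce_probabilities(u, choice_set):
    """Choice probabilities P_S(a) = u(a) / sum_{b in S} u(b), as in (42)."""
    total = sum(u[b] for b in choice_set)
    return {a: Fraction(u[a]) / total for a in choice_set}


# Illustrative Luce values (an assumption): equal values reproduce (43) and (44).
u = {"car": 1, "blue bus": 1, "red bus": 1}

one_bus = luce_probabilities({"car": 1, "bus": 1}, ["car", "bus"])
two_buses = luce_probabilities(u, ["car", "blue bus", "red bus"])

print(one_bus["car"])    # probability of taking the car with a single bus
print(two_buses["car"])  # probability of taking the car with two identical buses

# Path independence (L2) for S = {car, blue bus}: P_T(a) = P_T(S) * P_S(a).
S = ["car", "blue bus"]
p_S = luce_probabilities(u, S)
p_T_of_S = sum(two_buses[b] for b in S)
path_independent = two_buses["car"] == p_T_of_S * p_S["car"]
```

Using `Fraction` keeps the probabilities exact, so the drop in the probability of the car can be compared without rounding noise.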
11.2. The logit model
Again, consider a choice set $A = \{1, \ldots, n\}$ with at least two distinct elements. Assume that each alternative $i \in A$ gives some utility or payoff $\pi(i)$. In the logit model with parameter $\lambda > 0$, the probability of choosing alternative $i$ from $A$ is equal to
$$P_A(i) = \frac{e^{\pi(i)/\lambda}}{\sum_{j \in A} e^{\pi(j)/\lambda}} = \frac{\exp(\pi(i)/\lambda)}{\sum_{j \in A} \exp(\pi(j)/\lambda)}. \tag{45}$$
Notice from (42) that this is just a special case of Luce's model, where the utility assigned to each alternative $i \in A$ is equal to $u(i) = \exp(\pi(i)/\lambda) > 0$.

Our goal will be two-fold:
1. motivating these choice probabilities by introducing control costs,
2. studying the role of the parameter $\lambda > 0$.
Control costs. We allow the DM to choose each of the alternatives with a certain probability, so the DM chooses a probability distribution from the set
$$\Delta_n = \Big\{ p \in \mathbb{R}^n_+ : \sum_{i=1}^n p_i = 1 \Big\}.$$
Of course, if the DM is faced with choice set $A$ and has preferences over the outcomes such that $i \succsim j$ if and only if $\pi(i) \geq \pi(j)$, the optimal thing to do is to choose only elements from $\arg\max_{i \in A} \pi(i)$ with positive probability. In most real-life situations, the DM cannot guarantee the exact implementation of his choices: a careless driver may drive off the road, an absentminded shopper may by mistake buy the wrong item. To model this, we assume that it requires effort to implement choices: associated with each choice $p \in \Delta_n$ will be a disutility or control cost $c(p) \in \mathbb{R}$. The (expected total) utility associated with each choice $p \in \Delta_n$ is defined as the difference between the expected payoff $\sum_{i=1}^n p_i \pi(i)$ and $\lambda > 0$ times the control cost $c(p)$, where $\lambda$ is a positive scalar representing the relative weight assigned to the effort of implementing choice $p$. Hence, the DM aims to solve
$$\max_{p \in \Delta_n} \; \sum_{i=1}^n p_i \pi(i) - \lambda c(p).$$
Different cost functions give rise to different choice probabilities. A common control cost function that appears in many branches of science (physics, chemistry, information science, to name but a few) is the following entropy function:
$$c(p) = \sum_{i=1}^n p_i \ln(p_i), \tag{46}$$
where we use the convention that $0 \ln 0 = 0$. One can show (we will not do so) that this is a strictly convex function achieving its minimum at the vector $(1/n, \ldots, 1/n)$, where all alternatives are chosen with equal probability.
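Proposition 11.2 The optimization problem
$$\max_{p \in \Delta_n} \; \sum_{i=1}^n p_i \pi(i) - \lambda c(p), \tag{47}$$
with the control cost function from (46) has a unique maximum location $p \in \Delta_n$ with
$$\forall i \in A: \quad p_i = \frac{\exp(\pi(i)/\lambda)}{\sum_{j \in A} \exp(\pi(j)/\lambda)},$$
the logit choice probabilities from (45).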
Proof. The cost function $c$ is strictly convex, so the function $p \mapsto \sum_{i=1}^n p_i \pi(i) - \lambda c(p)$ is strictly concave. Since we maximize a strictly concave, continuous function over a compact set, a maximum exists and is unique. Since the feasible set is entirely defined by linear (in)equalities, the Kuhn-Tucker conditions give necessary and sufficient conditions for a solution to be a maximum. The condition for an interior solution $p \in \Delta_n$, i.e., a solution where $p_i > 0$ for all $i$, is that there exists a Lagrange multiplier $\mu$ associated with the constraint $\sum_{i=1}^n p_i = 1$, such that
$$\forall i = 1, \ldots, n: \quad \pi(i) - \lambda(\ln p_i + 1) + \mu = 0, \tag{48}$$
since the gradient at $p$ of the goal function $\sum_{i=1}^n p_i \pi(i) - \lambda c(p)$ has $i$-th coordinate
$$\pi(i) - \lambda \frac{\partial c(p)}{\partial p_i} = \pi(i) - \lambda(\ln p_i + 1).$$
Rewriting (48) gives, for each $i = 1, \ldots, n$:
$$p_i = c \exp(\pi(i)/\lambda),$$
with $c = \exp((\mu - \lambda)/\lambda)$ a constant. As $\sum_{j=1}^n p_j = 1$, it follows that
$$\forall i = 1, \ldots, n: \quad p_i = \frac{\exp(\pi(i)/\lambda)}{\sum_{j \in A} \exp(\pi(j)/\lambda)},$$
as we had to show.
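Proposition 11.2 can be sanity-checked numerically. The sketch below is not a proof: it compares the goal function in (47) at the logit probabilities with its value at many random points of the simplex. The payoffs, the parameter value `lam`, and all function names are illustrative assumptions.

```python
import math
import random


def logit_probabilities(payoffs, lam):
    """Logit choice probabilities exp(pi(i)/lam) / sum_j exp(pi(j)/lam), as in (45)."""
    top = max(payoffs)  # subtracting the maximum avoids overflow; the ratios are unchanged
    weights = [math.exp((x - top) / lam) for x in payoffs]
    total = sum(weights)
    return [w / total for w in weights]


def objective(p, payoffs, lam):
    """Expected payoff minus lam times the entropy cost c(p) = sum_i p_i ln p_i."""
    expected = sum(p_i * x for p_i, x in zip(p, payoffs))
    cost = sum(p_i * math.log(p_i) for p_i in p if p_i > 0)  # convention: 0 ln 0 = 0
    return expected - lam * cost


def random_simplex_point(n, rng):
    """Draw a random probability vector by normalizing exponential draws."""
    draws = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


payoffs, lam = [1.0, 2.0, 4.0], 1.5  # illustrative numbers
p_star = logit_probabilities(payoffs, lam)
best = objective(p_star, payoffs, lam)

rng = random.Random(0)
never_beaten = all(
    objective(random_simplex_point(len(payoffs), rng), payoffs, lam) <= best + 1e-12
    for _ in range(10_000)
)
```

Substituting the logit probabilities into the goal function gives the closed form $\lambda \ln \sum_{j} \exp(\pi(j)/\lambda)$ for the maximum value, which offers a second check on `best`.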
The role of $\lambda$. Let us investigate what happens with the logit choice probabilities in (45) as $\lambda \to \infty$ and as $\lambda \to 0$. Consider two alternatives $i, j \in A$, $i \neq j$. Notice that the ratio of their logit choice probabilities equals
$$\frac{P_A(i)}{P_A(j)} = \frac{\exp(\pi(i)/\lambda)}{\exp(\pi(j)/\lambda)} = \exp\left(\frac{\pi(i) - \pi(j)}{\lambda}\right), \tag{49}$$
which converges to one as $\lambda \to \infty$. But if the ratios of any two choice probabilities converge to one, their limits must be equal; together with the fact that probabilities add up to one, we conclude that the choice probabilities converge to $1/n$ as $\lambda \to \infty$.

To consider the limit behavior as $\lambda \to 0$, suppose that $\pi(i) > \pi(j)$. But then ratio (49) goes to infinity as $\lambda \to 0$. Since we are dealing with probabilities here, which are bounded below by zero and above by one, it must be that $P_A(j) \to 0$. If we let $i$ be the alternative with maximal payoff $\pi(i)$, it follows that the probability of choosing an alternative with less than maximal payoff converges to zero. So in the limit, all probability is restricted to optimal alternatives and it is clear from the definition of the choice probabilities that all of these will be chosen with equal probability.

In summary, the parameter $\lambda$ can be interpreted as a measure of irrationality of the DM: for large values of $\lambda$ the DM chooses by more or less blindly picking any of the alternatives, while for small values of $\lambda$ the choice of the DM is more or less optimal.
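Both limits are easy to observe numerically. The sketch below evaluates the logit probabilities from (45) for a large, a moderate, and a small parameter value; the payoffs (with two optimal alternatives) and the name `logit_probabilities` are illustrative assumptions.

```python
import math


def logit_probabilities(payoffs, lam):
    """Logit choice probabilities exp(pi(i)/lam) / sum_j exp(pi(j)/lam), as in (45)."""
    top = max(payoffs)  # subtracting the maximum avoids overflow for small lam
    weights = [math.exp((x - top) / lam) for x in payoffs]
    total = sum(weights)
    return [w / total for w in weights]


payoffs = [1.0, 3.0, 3.0]  # illustrative payoffs; alternatives 2 and 3 are both optimal

for lam in (1000.0, 1.0, 0.01):
    probs = logit_probabilities(payoffs, lam)
    print(lam, [round(p, 4) for p in probs])

nearly_uniform = logit_probabilities(payoffs, 1000.0)  # large lam: close to 1/n each
nearly_optimal = logit_probabilities(payoffs, 0.01)    # small lam: mass on the optimal ones
```

With two optimal alternatives, the small-parameter case also illustrates that the optimal alternatives share the probability mass equally in the limit.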
11.3. The linear probability model
The idea behind the linear probability model is the same as behind Luce's model and the logit model: the probability of choosing an alternative should be (weakly) increasing in the payoff associated to the alternative:
$$\pi(i) \geq \pi(j) \;\Rightarrow\; P_A(i) \geq P_A(j). \tag{50}$$
The adjective "linear" indicates that the difference between these two probabilities should be linear in the payoff difference: for a parameter $\lambda > 0$, we require that
$$P_A(i) - P_A(j) = \lambda(\pi(i) - \pi(j)). \tag{51}$$
Unfortunately, it is not always possible to combine these two properties for large values of $\lambda$. Let's consider a simple example with two alternatives: $A = \{1, 2\}$ and respective payoffs $\pi(1) = 4$, $\pi(2) = 0$. By (50), we want $P_A(1) \geq P_A(2)$ and by (51), we want $P_A(1) - P_A(2) = \lambda(\pi(1) - \pi(2)) = 4\lambda$. If we take $\lambda = 1/8$, this gives $P_A(1) - P_A(2) = 1/2$. The probabilities have to add up to one, so the unique solution is that $P_A(1) = 3/4$ and $P_A(2) = 1/4$. So far, so good. Now take $\lambda = 100$: $P_A(1) - P_A(2) = 4\lambda = 400$. Since $P_A(1)$ and $P_A(2)$ are probabilities between zero and one, making their difference equal to 400 (or for that matter any number larger than 1) is simply impossible.
So we have to relax our requirements (50) and (51) somewhat. Unwilling to change (50), let us adapt (51). Indeed, we require the linearity condition whenever possible, but when we run into problems like the one in the example above, we simply require that alternatives with low payoff are chosen with probability zero. Formally, choice probabilities $P_A(i)$ for all alternatives $i \in A$ satisfy the linear probability model with parameter $\lambda > 0$ if the following holds:
$$\text{if } P_A(i) > 0, \text{ then } P_A(i) - P_A(j) \leq \lambda(\pi(i) - \pi(j)) \text{ for all } j \in A. \tag{52}$$
Let us check to see that (52) gives us what we want: If both $i$ and $j$ are chosen with positive probability, we find from (52) that
$$P_A(i) - P_A(j) \leq \lambda(\pi(i) - \pi(j)) \quad \text{and} \quad P_A(j) - P_A(i) \leq \lambda(\pi(j) - \pi(i)).$$
This implies
$$P_A(i) - P_A(j) = \lambda(\pi(i) - \pi(j)),$$
in correspondence with the linearity requirement (51).

The choice probabilities also satisfy (50): take $i, j \in A$ with $\pi(i) \geq \pi(j)$. We need to show that the choice probabilities in the linear probability model satisfy $P_A(i) \geq P_A(j)$. Discern two cases. First, if $P_A(j) = 0$, it automatically follows that $P_A(i) \geq 0 = P_A(j)$. If $P_A(j) > 0$, application of (52) yields
$$P_A(j) - P_A(i) \leq \lambda(\pi(j) - \pi(i)) \leq 0,$$
since $\pi(i) \geq \pi(j)$ and $\lambda > 0$.

Combining the two points above, we see that the choice probabilities are weakly increasing in the associated payoffs. By necessity, we had to set the probability of choosing low-payoff alternatives equal to zero, but those that are chosen with positive probability still satisfy the linearity requirement.
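One way to construct probabilities satisfying (52) is sketched below. It assumes that the probabilities take the thresholded form $p_i = \max\{0, \lambda\pi(i) - \theta\}$, which is suggested by the equality derived above for alternatives chosen with positive probability; the threshold $\theta$ is found by a sort-based search. The function names and this construction are our assumptions, not the notes' method.

```python
def linear_probability_model(payoffs, lam):
    """Probabilities of the form p_i = max(0, lam * pi(i) - theta) that sum to one."""
    scaled = sorted((lam * x for x in payoffs), reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, value in enumerate(scaled, start=1):
        cumulative += value
        candidate = (cumulative - 1.0) / k  # threshold if exactly the top k are chosen
        if value > candidate:               # the k-th best alternative stays positive
            theta = candidate
    return [max(0.0, lam * x - theta) for x in payoffs]


def satisfies_52(probs, payoffs, lam, tol=1e-9):
    """Check (52): if p_i > 0, then p_i - p_j <= lam * (pi(i) - pi(j)) for all j."""
    n = len(probs)
    return all(
        probs[i] - probs[j] <= lam * (payoffs[i] - payoffs[j]) + tol
        for i in range(n) if probs[i] > 0
        for j in range(n)
    )


payoffs = [4.0, 0.0]  # the two-alternative example from the text
small = linear_probability_model(payoffs, 1 / 8)   # linearity (51) is feasible
large = linear_probability_model(payoffs, 100.0)   # low-payoff alternative is dropped
```

For the example in the text, `small` reproduces the probabilities $3/4$ and $1/4$, while `large` puts all probability on alternative 1.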
Control costs. The choice probabilities can be derived in the same way as before by making a clever choice of the cost function. Consider the cost function that assigns to every probability vector $p \in \Delta_n$ the squared Euclidean distance to the vector $(1/n, \ldots, 1/n)$ that chooses each of the $n$ alternatives with equal probability:
$$c(p) = \sum_{i=1}^n \left(p_i - \frac{1}{n}\right)^2. \tag{53}$$
So choosing all alternatives with equal probability gives zero costs and costs increase the further away you go from the vector $(1/n, \ldots, 1/n)$. As in the proof of Proposition 11.2, it follows that:

Proposition 11.3 For each $\lambda > 0$, there is a unique solution to the maximization problem
$$\max_{p \in \Delta_n} \; \sum_{i=1}^n p_i \pi(i) - \frac{1}{2\lambda} c(p) \tag{54}$$
with the cost function given in (53). The solution coincides with the choice probabilities in the linear probability model with parameter $\lambda$.
The role of $\lambda$. Comparing the parameter $\lambda$ in the two optimization problems with control costs in (47) and (54), you will notice that they switched roles: large values of $\lambda$ correspond with a large weight assigned to the control cost function in the logit model, but with a small weight assigned to the control cost function in the linear probability model. This change was necessary because I wanted to follow the standard definition of the linear probability model in (52). But the intuition remains the same: $\lambda$ measures (ir)rationality. In the case of the linear probability model: for large values, (52) indicates that the difference in the probability of choosing an optimal alternative (highest $\pi(i)$) and a suboptimal alternative must be large. In the limit, this forces the probability of choosing suboptimal alternatives to zero. Conversely, for small values of $\lambda$, (52) indicates that the difference in the probability of choosing any two alternatives must be small. Combining this with the fact that probabilities add up to one, this implies that in the limit, all alternatives will be chosen with equal probability.
11.4. Exercises

Exercise 11.1 Prove Proposition 11.3.

Exercise 11.2 Let $A = \{1, 2\}$, $\pi(1) = 4$, $\pi(2) = 0$.
(a) Compute for every $\lambda > 0$ the choice probabilities satisfying the linear probability model.
(b) What happens with the choice probabilities as $\lambda \to 0$? Interpret.
(c) What happens with the choice probabilities as $\lambda \to \infty$? Interpret.

Exercise 11.3 Let $A = \{1, 2, 3\}$, $\pi(1) = 0$, $\pi(2) = 2$, $\pi(3) = 8$.
(a) Compute for each $\lambda > 0$ the choice probabilities in the logit model. Do these choice probabilities, for each $\lambda > 0$, satisfy path independence? What happens with the choice probabilities as $\lambda \to 0$ and as $\lambda \to \infty$?
(b) Answer the same questions for the linear probability model.

Exercise 11.4 The penalty function approach: Two of the probabilistic choice models considered above could be rationalized using control cost functions giving a penalty to deviations from uniform randomization. This exercise gives the general argument behind such rationalizations. A penalty function on $\mathbb{R}^n$ is a function $c : \mathbb{R}^n \to \mathbb{R}_+$. A symmetric penalty function is independent of rearranging the coordinates: for each bijection $r : \{1, \ldots, n\} \to \{1, \ldots, n\}$ and each $x \in \mathbb{R}^n$, it follows that
$$c(x_1, \ldots, x_n) = c(x_{r(1)}, \ldots, x_{r(n)}).$$
Consider a probabilistic choice model over a finite set $A = \{1, \ldots, n\}$ with $n \geq 2$ elements and payoff function $\pi : A \to \mathbb{R}$. Suppose a decision maker's choice probabilities can be rationalized using a symmetric penalty function: given parameter $\lambda \geq 0$, they solve the problem
$$P(\lambda): \quad \max_{p \in \Delta_n} \; \sum_{i=1}^n p_i \pi(i) - \lambda\, c\big(p - (1/n, \ldots, 1/n)\big).$$
Show that the resulting choice probabilities satisfy the desired monotonicity requirement: if $p$ solves $P(\lambda)$ and $\pi(i) > \pi(j)$, then $p_i \geq p_j$.
Full circle
To make sure you get the big picture, let us at the end of this course turn back to where we started, the overview of the course goals in the preface, and briefly summarize how we achieved them.
The general framework

A meaningful microfounded model in any branch of economics derives its conclusions from assumptions about the behavior of individual economic agents. It requires careful answers to the following questions:
(Q1) What can the agent choose from, i.e., what is the set of feasible alternatives?
(Q2) What does the agent like, i.e., what are the preferences over alternatives?
(Q3) How are the former two combined to make a choice, i.e., to select among alternatives?
We mostly stuck to rational choice: choose from your set of feasible alternatives a most preferred one. Sections 1 to 3 provided a general framework for modeling preferences over and choice from arbitrary sets of alternatives. Important stops along the way included:
Utility theory: utility functions are convenient tools to summarize an agent's preferences. Nevertheless, in relevant cases, no utility function exists (Section 2.3). We provided an exact answer to when preferences can be represented by a utility function (Section 2.4). Moreover, we provided conditions under which utility functions had some additional nice structure. For instance, continuity was studied in Section 2.5, cases where preferences could be expressed in terms of a numeraire in Section 2.6.

Existence of solutions: Proposition 3.1 gave a general answer to a fourth central question:

(Q4) When do most preferred elements exist?

If the weak order reflecting the agent's preferences is upper semicontinuous, the agent can find a most preferred alternative in any nonempty, compact set of options. We regularly appealed to this result to establish that problems faced by economic agents actually have a solution; sometimes (as in Propositions 4.3(a) and 7.1(a)) the result could be applied immediately, but sometimes (as in Propositions 4.5(a), 5.4, and 5.5) a little more caution was needed.
Applications of the general framework

In many of the remaining sections, this general framework was applied to specific economic problems. This required giving the set of alternatives as well as the preferences a specific meaning that seems relevant to the problem under consideration. Moreover, this allowed us to study a fifth central question:

(Q5) How are most preferred elements affected by changes in the agent's environment?

Below, I will go through these applications, summarize how feasible sets and preferences were defined, and, if applicable, indicate where we studied the answer to (Q5).
Application 1: consumer facing budget constraint.
Feasible alternatives: commodity bundles $x \in \mathbb{R}^L_+$ in a budget set $B(p, w)$.
Preferences: an arbitrary weak order $\succsim$ over the commodity space $X = \mathbb{R}^L_+$.
Changes in agent's environment: see Sections 4.2 and 4.5.

Application 2: consumer minimizing expenditure.
Feasible alternatives: commodity bundles $x \in \mathbb{R}^L_+$ achieving a desired utility level.
Preferences: defined in terms of the expenses $p \cdot x$ at price vector $p \in \mathbb{R}^L_{++}$.
Changes in agent's environment: see Section 4.3.

Application 3: producer maximizing profit.
Feasible alternatives: production plans $y$ in a production set $Y \subseteq \mathbb{R}^L$.
Preferences: defined in terms of the profit $p \cdot y$ at price vector $p \in \mathbb{R}^L_{++}$.

Application 4: producer minimizing costs.
Feasible alternatives: input vectors $z \in \mathbb{R}^{L-1}_+$ achieving a desired output level.
Preferences: defined in terms of the costs $w \cdot z$ at input price vector $w \in \mathbb{R}^{L-1}_{++}$.

Application 5: expected utility theory.
Feasible alternatives: compound gambles over a set of deterministic outcomes.
Preferences: an arbitrary weak order $\succsim$ over the set of compound gambles $G$, under some assumptions resulting in a von Neumann-Morgenstern utility function.
Changes in agent's environment: see Section 8 on risk attitudes.

Application 6: time preference.
Feasible alternatives: sequences $c = (c_t)_{t=0}^{\infty}$ of outcomes occurring over time $t$.
Preferences: come in different forms, for instance:
1. represented by a utility function of the form $U(c) = \sum_{t=0}^{\infty} \delta(t) u(c_t)$,
2. in terms of the limit of means criterion,
3. in terms of the overtaking criterion.

Application 7: probabilistic choice. Although slightly outside the general framework, in some probabilistic choice models like the logit and linear probability model, agents choose probabilities as if they maximize expected payoffs subject to implementation costs:
Feasible alternatives: choice probabilities assigned to a finite set $A$ of alternatives.
Preferences: represented by a utility function of the form "expected payoff minus control costs"; see Propositions 11.2 and 11.3.
Beyond these notes

Applications of the general framework abound also in other branches of economics. In macroeconomics, a government may evaluate alternative policies in terms of some social welfare function summarizing the well-being of its citizens. In game theory (the mathematical toolbox used to study interaction between agents, used in many branches of microeconomics, industrial organization, and political economics) players have different strategies to choose from and evaluate them in terms of a preference relation that incorporates the uncertainty they face about, for instance, the choices of the other players.

And what if we leave the realm of rational decision making? Parts of these notes (see, for instance, Exercises 3.4, 3.5, and Section 11) illustrate that as long as we can write down formal postulates about agents' behavior, our mathematical tools allow us to study their consequences in a rigorous and consistent way. This is just the right amount of rationality we need:

"Behavior is procedurally rational when it is the outcome of appropriate deliberation. Its procedural rationality depends on the process that generated it." (Simon, 1976, p. 131)

Behavior is procedurally rational if there is a procedure (a recipe, if you wish) that translates a decision problem to a well-defined choice. Procedurally rational decision makers are not wild maniacs choosing without any logic whatsoever. Paraphrasing Shakespeare: "Though this be madnesse/Yet there is Method in't." (Hamlet, 1603, Act 2, Sc. 2)

I hope that the tools you acquired during this course will help you to address also other economic problems in a structured way.
Notation
If $X$ is a finite set, $|X|$ denotes its cardinality, i.e., its number of elements.
Weak set inclusion (each element of $A$ is also an element of $B$): $A \subseteq B$.
Strict/proper set inclusion ($A \subseteq B$, but $A \neq B$): $A \subset B$.
Set of positive integers: $\mathbb{N} = \{1, 2, 3, \ldots\}$.
Set of integers: $\mathbb{Z} = \{\ldots, -2, -1, 0, 1, 2, \ldots\}$.
Set of rational numbers: $\mathbb{Q} = \{p/q : p, q \in \mathbb{Z}, q \neq 0\}$.
Set of real numbers: $\mathbb{R}$.
For arbitrary $L \in \mathbb{N}$:
Set of vectors in $\mathbb{R}^L$ with nonnegative coordinates: $\mathbb{R}^L_+ = \{x \in \mathbb{R}^L : x_1, \ldots, x_L \geq 0\}$.
Set of vectors in $\mathbb{R}^L$ with positive coordinates: $\mathbb{R}^L_{++} = \{x \in \mathbb{R}^L : x_1, \ldots, x_L > 0\}$.
Sets like $\mathbb{Q}^L_{++}$ are defined analogously.
For two vectors $x, y \in \mathbb{R}^L$, their inner product is denoted by $x \cdot y = x_1 y_1 + \cdots + x_L y_L$. Moreover, write
$x \geq y$ if $x_i \geq y_i$ for all coordinates $i = 1, \ldots, L$,
$x > y$ if $x_i > y_i$ for all coordinates $i = 1, \ldots, L$.
Relations $\leq$ and $<$ are defined analogously.
For $k \in \{1, \ldots, L\}$, $e_k \in \mathbb{R}^L$ denotes the $k$-th standard basis vector with $k$-th coordinate equal to one and all other coordinates equal to zero: $e_k = (0, \ldots, 0, 1, 0, \ldots, 0)$, with the one in the $k$-th coordinate.
The vector of ones is denoted by $e = (1, \ldots, 1) \in \mathbb{R}^L$.
References
Anderson, S.P., de Palma, A., Thisse, J.-F., 1992. Discrete choice theory of product differentiation. MIT Press.
Arrow, K.J., 1959. Rational choice functions and orderings. Economica 26, 121-126.
Arrow, K.J., Hahn, F.H., 1971. General competitive analysis. Amsterdam: North-Holland.
Ben-Akiva, M., Lerman, S.R., 1985. Discrete choice analysis. MIT Press.
Cobb, C.W., Douglas, P.H., 1928. A theory of production. American Economic Review (supplement) 18, 139-165.
Debreu, G., 1954. Representation of a preference ordering by a numerical function. In: Decision Processes. Thrall, Davis, Coombs (eds.), John Wiley, pp. 159-165.
Debreu, G., 1959. Theory of value. Yale University Press.
Debreu, G., 1960. Review of R.D. Luce, Individual Choice Behavior: A Theoretical Analysis. American Economic Review 50, 186-188.
Debreu, G., 1964. Continuity properties of Paretian utility. International Economic Review 5, 285-293.
Diecidue, E., Wakker, P.P., 2002. Dutch books: avoiding strategic and dynamic complications, and a comonotonic extension. Mathematical Social Sciences 43, 135-149.
Dubra, J., Echenique, F., 2001. Monotone preferences over information. Topics in Theoretical Economics 1, article 1. http://www.bepress.com/bejte/topics/vol1/iss1/art1
Fishburn, P.C., 1970a. Utility theory for decision making. New York: John Wiley & Sons.
Fishburn, P.C., 1970b. Intransitive individual indifference and transitive majorities. Econometrica 38, 482-489.
Fishburn, P.C., 1979. Transitivity. Review of Economic Studies 46, 163-173.
Hildenbrand, W., Kirman, A.P., 1988. Equilibrium analysis. North-Holland.
Jaffray, J.-Y., 1975. Existence of a continuous utility function: An elementary proof. Econometrica 43, 981-983.
Kahneman, D., Tversky, A., 1979. Prospect theory: an analysis of decision under risk. Econometrica 47, 263-291.
Kamke, E., 1950. Theory of sets. New York: Dover Publications.
Kaneko, M., 1976. Note on transferable utility. International Journal of Game Theory 5, 183-185.
Koopmans, T.C., 1960. Stationary ordinal utility and impatience. Econometrica 28, 287-309.
Kreps, D.M., 1990. A course in microeconomic theory. Hertfordshire: Harvester Wheatsheaf.
Loewenstein, G., Prelec, D., 1992. Anomalies in intertemporal choice: evidence and interpretation. Quarterly Journal of Economics 107, 573-597.
Luce, R.D., 1959. Individual choice behavior: A theoretical analysis. Wiley.
Mas-Colell, A., 1985. The theory of general economic equilibrium: A differentiable approach. Cambridge: Cambridge University Press.
Mas-Colell, A., Whinston, M.D., Green, J.R., 1995. Microeconomic theory. Oxford: Oxford University Press.
Mattsson, L.-G., Weibull, J.W., 2002. Probabilistic choice and procedurally bounded rationality. Games and Economic Behavior 41, 61-78.
Osborne, M.J., Rubinstein, A., 1994. A course in game theory. Cambridge, MA: MIT Press.
Phelps, E.S., Pollak, R.A., 1968. On second-best national saving and game-equilibrium growth. Review of Economic Studies 35, 201-208.
Pratt, J.W., 1964. Risk aversion in the small and in the large. Econometrica 32, 122-136.
Rabin, M., 2000. Risk aversion and expected-utility theory: a calibration theorem. Econometrica 68, 1281-1292.
Rosenthal, R.W., 1989. A bounded-rationality approach to the study of noncooperative games. International Journal of Game Theory 18, 273-292.
Rubinstein, A., 2006. Lecture notes in microeconomic theory. Princeton NJ: Princeton University Press. http://arielrubinstein.tau.ac.il/Rubinstein2007.pdf
Simon, H., 1955. A behavioral model of rational choice. Quarterly Journal of Economics 69, 99-118.
Simon, H.A., 1976. From substantive to procedural rationality. In: Method and Appraisal in Economics. Latsis, S.J. (ed.), Cambridge University Press, pp. 129-146.
Starr, R.M., 1997. General equilibrium theory. Cambridge University Press.
Thaler, R., 1981. Some empirical evidence on dynamic inconsistency. Economics Letters 8, 201-207.
Varian, H.R., 1992. Microeconomic analysis. New York: W.W. Norton & Company, 3rd edition.
Voorneveld, M., 2006. Probabilistic choice in games: properties of Rosenthal's t-solutions. International Journal of Game Theory 34, 105-121.
Voorneveld, M., 2007. The possibility of impossible stairways: Tail events and countable player sets. To appear in Games and Economic Behavior.
Voorneveld, M., 2008. From preferences to Cobb-Douglas utility. SSE/EFI Working Paper Series in Economics and Finance, No. 701.
Wärneryd, K., 2007. Sexual reproduction and time-inconsistent preferences. Economics Letters 95, 14-16.
Suggested solutions
These are (sometimes short) solutions to most exercises in the lecture notes. In solutions to the home assignments and exam questions, you are expected to start from relevant definitions, and clearly deduce and motivate your answers. Suggestions for improvements (and corrections of potential mistakes?) are welcome!
Section 1

Exercise 1.1
(a): Each pair of words can be arranged in alphabetical order, so $\succsim$ is complete. Moreover, if word $x$ is found before or at the same place as (in case the words are identical) word $y$ in the dictionary, and word $y$ is found before or at the same place as word $z$ in the dictionary, then word $x$ is found before or at the same place as word $z$ in the dictionary. Conclude that $\succsim$ is transitive.
(b): The binary relation $\succsim$ defined by "knows" is not necessarily complete or transitive. A violation of completeness occurs if there exist people who are unfamiliar with each other. Also violations of transitivity are common: I know my wife, my wife knows her boss, but I do not know my wife's boss.
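Exercise 1.2
(a): [Reflexivity of $\sim$] Let $x \in X$. By completeness of $\succsim$: $x \succsim x$ and (simply changing the order of writing) $x \succsim x$. By definition of $\sim$: $x \sim x$. Conclude that $\sim$ is reflexive.
[Symmetry of $\sim$] Let $x, y \in X$ with $x \sim y$. By definition of $\sim$, $x \succsim y$ and $y \succsim x$. But this is also the definition of $y \sim x$. Conclude that $\sim$ is symmetric.
[Transitivity of $\sim$] Let $x, y, z \in X$ have $x \sim y$ and $y \sim z$. By definition of $\sim$, this means that $x \succsim y$, $y \succsim x$, $y \succsim z$, $z \succsim y$. By transitivity of $\succsim$, $x \succsim y$ and $y \succsim z$ give $x \succsim z$. Similarly, $z \succsim y$ and $y \succsim x$ give $z \succsim x$. Since $x \succsim z$ and $z \succsim x$: $x \sim z$. Conclude that $\sim$ is transitive.
(b): [Irreflexivity of $\succ$] Let $x \in X$. By definition of $\succ$, $x \succ x$ would require that $x \succsim x$ but not $x \succsim x$, a contradiction. Conclude that $\succ$ is irreflexive.
[Asymmetry of $\succ$] Let $x, y \in X$ with $x \succ y$. By definition of $\succ$, $x \succsim y$ but not $y \succsim x$. By definition of $\succ$, not $y \succ x$. Conclude that $\succ$ is asymmetric.
[Transitivity of $\succ$] Let $x, y, z \in X$ have $x \succ y$ and $y \succ z$. By definition of $\succ$, this means that $x \succsim y$ but not $y \succsim x$ and that $y \succsim z$, but not $z \succsim y$. By transitivity of $\succsim$, $x \succsim y$ and $y \succsim z$ give $x \succsim z$. It is not true that $z \succsim x$. If it were, transitivity of $\succsim$ with $z \succsim x$ and $x \succsim y$ would imply $z \succsim y$, contradicting $y \succ z$. Since $x \succsim z$, but not $z \succsim x$: $x \succ z$. Conclude that $\succ$ is transitive.
(c): Let $x, y, z \in X$ have $x \succ y$ and $y \succsim z$. By definition of $\succ$, this implies that $x \succsim y$. As $x \succsim y$ and $y \succsim z$, transitivity of $\succsim$ gives $x \succsim z$.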
Exercise 1.3
(a): Assume $\succsim$ is strongly monotonic. Let $k \in \{1, \ldots, L\}$ be one of the coordinates, let $x \in X$, and $\varepsilon > 0$. Then $x + \varepsilon e_k \geq x$ and $x + \varepsilon e_k \neq x$, so by strong monotonicity, $x + \varepsilon e_k \succ x$. Conclude that $\succsim$ is strongly monotonic in coordinate $k$.
Now assume that $\succsim$ is strongly monotonic in each of its coordinates and transitive. Let $x, y \in X$ with $x \geq y$ and $x \neq y$. To show: $x \succ y$. Starting with $x$, change the coordinates one by one to those of $y$. Formally, let $z(0) = x$ and, for each $k \in \{1, \ldots, L\}$, define $z(k) = x + \sum_{\ell=1}^{k} (y_\ell - x_\ell) e_\ell$. Then either $z(k) = z(k-1)$ if the $k$-th coordinates of $x$ and $y$ are the same, or $z(k-1) \succ z(k)$ by strong monotonicity in the $k$-th coordinate. By transitivity, we find that $x = z(0) \succ z(L) = y$.
(b): The preference relation on $\mathbb{R}^2_+$ with, for all $x, y \in \mathbb{R}^2_+$:
$$x \succ y \;\Leftrightarrow\; x_k > y_k \text{ for exactly one coordinate } k \in \{1, 2\}$$
is strongly monotonic in coordinate $k$ for both $k = 1$ and $k = 2$, but not strongly monotonic: $(2, 2)$ is not strictly preferred to $(1, 1)$. Notice: in line with (a), the relation is not transitive.
(c): No. The point $(0, \ldots, 0) \in \mathbb{R}^L_+$ cannot be improved upon: since less is better, $(0, \ldots, 0) \succ x$ for every $x \in \mathbb{R}^L_+$ with $x \neq (0, \ldots, 0)$.
(d): Yes. Notice that the issue above, that improvements beyond the zero vector are impossible if one is constrained to vectors with nonnegative coordinates, disappears. Let $x \in \mathbb{R}^L$ and $\varepsilon > 0$. Define $y = x - \frac{\varepsilon}{2} e_1 \in \mathbb{R}^L$. Then $\|x - y\| = \varepsilon/2 < \varepsilon$ and $y \leq x$, with strict inequality in the first coordinate. Since less is better, $y \succ x$. Conclude that $\succsim$ is locally nonsatiated.
Exercise 1.4
(a): The preference relation $\succsim$ on $\mathbb{R}^2_+$ with
$$x \succsim y \;\Leftrightarrow\; (x_1 + 1)(x_2 + 1) \geq (y_1 + 1)(y_2 + 1)$$
is strongly monotonic in coordinate 1, but not quasilinear in coordinate 1: let $x = (1, 2)$ and $y = (2, 1)$. Then
$$(x_1 + 1)(x_2 + 1) = (y_1 + 1)(y_2 + 1) = 6,$$
so $x \sim y$. Increase the first coordinate of $x$ and $y$ by $\varepsilon > 0$. Then
$$(x_1 + \varepsilon + 1)(x_2 + 1) = 3(2 + \varepsilon) > 2(3 + \varepsilon) = (y_1 + \varepsilon + 1)(y_2 + 1),$$
so $x + \varepsilon e_1 \succ y + \varepsilon e_1$. Quasilinearity would require that the indifference remains unaffected.
(b): The preference relation $\succsim$ on $\mathbb{R}^2_+$ where all alternatives are equivalent with each other ($x \sim y$ for all $x, y$, represented by a constant utility function) is trivially quasilinear but not strongly monotonic in coordinate 1.
(c): Same preference relation as in (b).
(d): The preference relation $\succsim$ on $\mathbb{R}^2_+$ with
$$x \succsim y \;\Leftrightarrow\; 4x_1 + 3x_2^2 \geq 4y_1 + 3y_2^2$$
satisfies all three monotonicity properties, but is not homothetic. For instance, $(1, 0) \succ (0, 1)$, as $4 \cdot 1 + 3 \cdot 0^2 > 4 \cdot 0 + 3 \cdot 1^2$, but $2(1, 0) \prec 2(0, 1)$, as $4 \cdot 2 + 3 \cdot 0^2 < 4 \cdot 0 + 3 \cdot 2^2$.
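The two counterexamples in (a) and (d) can be verified by direct computation. The sketch below assumes the utility representations $(x_1 + 1)(x_2 + 1)$ and $4x_1 + 3x_2^2$ read off from the definitions above; the function names and the value of `eps` are illustrative.

```python
def u_a(x):
    """Utility representing the relation in (a): (x1 + 1)(x2 + 1)."""
    return (x[0] + 1) * (x[1] + 1)


def u_d(x):
    """Utility representing the relation in (d): 4 x1 + 3 x2^2."""
    return 4 * x[0] + 3 * x[1] ** 2


# (a): x ~ y, but adding eps to the first coordinate of both breaks the indifference.
x, y, eps = (1, 2), (2, 1), 0.5
indifferent_before = u_a(x) == u_a(y)
strict_after = u_a((x[0] + eps, x[1])) > u_a((y[0] + eps, y[1]))

# (d): (1, 0) is strictly preferred to (0, 1), but doubling both reverses the ranking.
prefers_before = u_d((1, 0)) > u_d((0, 1))
reverses_after_scaling = u_d((2, 0)) < u_d((0, 2))
```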
Exercise 1.5
(a): Let $x, y \in \mathbb{R}^L_+$ have $x \geq y$. For each $n \in \mathbb{N}$, $x_n = x + (1/n, \ldots, 1/n) \in \mathbb{R}^L_+$ satisfies $x_n > y$, so $x_n \succsim y$ (in fact, even $x_n \succ y$). Letting $n \to \infty$, continuity implies that $\lim_{n \to \infty} x_n = x \succsim y$.
(b): Let $x, y \in \mathbb{R}^L_+$ have $x > y$. Then $\min\{x_1, x_2\} > \min\{y_1, y_2\}$, so $x \succsim y$, but not $y \succsim x$, i.e., $x \succ y$. Let $x = (2, 1)$ and $y = (1, 1)$. In both cases, you can only mix one unit of drink, but $x$ wastes one unit of the first ingredient, so even though $x \geq y$, $x \sim y$.
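Exercise 1.6
(a): Assume the first definition of convexity holds. Let $y \in X$. To show: $\{x \in X : x \succsim y\}$ is a convex set. Let $z, z' \in \{x \in X : x \succsim y\}$ and $\alpha \in [0, 1]$. To show: $\alpha z + (1 - \alpha) z' \succsim y$. Using completeness of $\succsim$, we may assume w.l.o.g. that $z \succsim z'$. By convexity, $\alpha z + (1 - \alpha) z' \succsim z' \succsim y$, so $\alpha z + (1 - \alpha) z' \succsim y$ by transitivity of $\succsim$.
Conversely, assume the second definition of convexity holds. Let $x, y \in X$ with $x \succsim y$ and $\alpha \in [0, 1]$. To show: $\alpha x + (1 - \alpha) y \succsim y$. Elements $x$ and (by completeness) $y$ both lie in the set $\{x' \in X : x' \succsim y\}$, which is convex by assumption, so it also contains $\alpha x + (1 - \alpha) y$. Conclude that $\alpha x + (1 - \alpha) y \succsim y$.
(b): Consider the preference relation $\succsim$ on $\mathbb{R}$ with, for all $x, y \in \mathbb{R}$: $x \succsim y \Leftrightarrow x \geq 0 > y$. For each $y \in \mathbb{R}$:
$$\{x \in \mathbb{R} : x \succsim y\} = \begin{cases} \emptyset & \text{if } y \geq 0, \\ \mathbb{R}_+ & \text{if } 0 > y, \end{cases}$$
is convex. Therefore, the relation satisfies the convexity condition stated in terms of these sets (the second definition in (a)). However, if $x = 1$, $y = -3$, $\alpha = 1/2$, then $x \succsim y$, but not $\alpha x + (1 - \alpha) y \succsim y$, in violation of the other convexity definition (the first definition in (a)).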
Section 2

Exercise 2.1
(a): [Transitivity] Let $x, y, z \in \mathbb{R}$ satisfy $x \succsim y$, $y \succsim z$. By definition, $x \geq y + 1$ and $y \geq z + 1$, so $x \geq y + 1 \geq z + 2 \geq z + 1$, so $x \succsim z$.
[Violation of completeness] Completeness requires in particular that for each $x \in \mathbb{R}$: $x \succsim x$, i.e., that $x \geq x + 1$. Clearly, this is not true.
(b): [Prop. 2.1(b) satisfied] Let $x, y \in \mathbb{R}$. If $x \succ y$, then $x \succsim y$, so $x \geq y + 1$. Therefore, $u(x) = x \geq y + 1 > y = u(y)$. Moreover, there are no $x, y \in X$ with $x \sim y$ (as this would require $x \geq y + 1$ and $y \geq x + 1$), so the second condition is vacuous.
[Prop. 2.1(a) violated] $u$ does not represent $\succsim$, since $\succsim$ is not complete and the order induced by $u$ is.
Exercise 2.2
(a): Suppose the collection of jumps in $U$ is uncountable. Consider two distinct jumps $(u_1, u_2)$ and $(v_1, v_2)$. The intervals $(u_1, u_2)$ and $(v_1, v_2)$ are disjoint by definition of a jump. Moreover, each such interval contains a rational number, necessarily distinct from the one in the other interval, since these intervals are disjoint. Therefore, there is an injective function from the uncountable set of jumps to the countable set of rational numbers, a contradiction.
(b): $C$ is the union of two countable sets $J$ and $R$ and therefore countable itself. Let $x, y \in X$ with $x \succ y$. To show: there are $c_1, c_2 \in C$ with $x \succsim c_1 \succ c_2 \succsim y$.
Case 1: $(u(y), u(x))$ is a jump in $U$. By definition of $J$, there are points $c_1, c_2 \in J \subseteq C$ with utility $u(c_1) = u(x)$, $u(c_2) = u(y)$. Hence $x \sim c_1 \succ c_2 \sim y$, as in the requirement for Jaffray order-separability.
Case 2: $(u(y), u(x))$ is not a jump in $U$. Then $(u(y), u(x)) \cap U \neq \emptyset$. By definition of $R$, there is a $c \in R \subseteq C$ with $u(c) \in (u(y), u(x))$. Now apply the reasoning so far to $(u(c), u(x))$. If it is a jump in $U$, Case 1 says that there are $c_1, c_2 \in C$ with $x \sim c_1 \succ c_2 \sim c \succ y$, as in the requirement for Jaffray order-separability. If it is not a jump, repeating the construction of Case 2 says that there is a $c' \in C$ with $u(c') \in (u(c), u(x))$, so that $x \succ c' \succ c \succ y$, as in the requirement for Jaffray order-separability.
(c): Let $x, y \in X$. If $x \succ y$, there exist, by Jaffray order-separability, $c_1, c_2 \in C$ with $x \succsim c_1 \succ c_2 \succsim y$. Therefore, $\{c \in C : c \precsim x\} \supsetneq \{c \in C : c \precsim y\}$, as the former set includes $c_1$, whereas the latter doesn't. Conclude that $u(x) - u(y) \geq 2^{-n(c_1)} > 0$. If $x \sim y$, then $\{c \in C : c \precsim x\} = \{c \in C : c \precsim y\}$, so $u(x) = u(y)$.
Exercise 2.3
(a): True. By definition of a continuous function, pre-images of open sets are open sets. Consequently, for each $x \in X$, the sets
$\{y \in X : y \prec x\} = u^{-1}((-\infty, u(x)))$ and $\{y \in X : y \succ x\} = u^{-1}((u(x), \infty))$
are pre-images of open intervals and hence open sets.
(b): False. The usual "greater than or equal to" order $\geq$ on $\mathbb{R}$ is represented by the continuous utility function $u : \mathbb{R} \to \mathbb{R}$ with $u(x) = x$ and hence, by (a), continuous. However, any strictly increasing function $u : \mathbb{R} \to \mathbb{R}$ represents $\geq$, including the discontinuous function
$u(x) = x$ if $x < 0$, and $u(x) = x + 1$ if $x \geq 0$.
Exercise 2.4
No. Lexicographic preferences (modified in such a way that you start comparing the second coordinates, then the first) on $\mathbb{R}^2_+$ constitute an example where preferences cannot even be represented by a utility function. Let $x, y \in \mathbb{R}^2_+$ have $x_2 > y_2$. The modified lexicographic preference starts by looking at these second coordinates, so no matter how much money you add to the first coordinate of $y$, you will strictly prefer $x$.
Here is an example where preferences can be represented by a utility function. It makes having a second coordinate below one so bad that you can never compensate this with money and make it look as nice as an alternative whose second coordinate is at least one. The preference relation on $\mathbb{R}^2_+$ represented by the utility function
$u(x) = \Phi(x_1) + 1$ if $x_2 \geq 1$, and $u(x) = \Phi(x_1)$ if $x_2 < 1$,
where $\Phi : \mathbb{R} \to (0, 1)$ is strictly increasing (like the cdf of a standard normal distribution), satisfies all properties in Proposition 2.11, except (8). Under additional assumptions (like continuity, monotonicity), the answer is yes. See Rubinstein (2006, Lecture 4).
Exercise 2.5
(a): Consider $(a, 0)$ and $(a', 0)$ in $X$. Either $(a, 0) \sim (a', 0)$, in which case we take $m = m' = 0$, or one of the alternatives is strictly preferred over the other, w.l.o.g. $(a, 0) \succ (a', 0)$. In the latter case, invoke the first property to conclude that there is an amount of money $m^*$ such that $(a, 0) \sim (a', m^*)$. Take $m = 0$, $m' = m^*$.
(b): W.l.o.g., $m \leq w$. By the third property with $c = w - m$:
$(a', w') \sim (a, w) = (a, m + (w - m)) \sim (a', m' + (w - m)),$
so $(a', w') \sim (a', m' + (w - m))$ by transitivity of $\sim$. But then $w' = m' + (w - m)$ by strong monotonicity in money.
(c): Let $(a, m), (a', m') \in X$. To show: $(a, m) \succsim (a', m')$ if and only if $u(a, m) \geq u(a', m')$. By the first two properties, there are unique amounts of money $m_1, m_2 \geq 0$ such that $(a, m) \sim (a^*, m_1)$ and $(a', m') \sim (a^*, m_2)$. By definition of $v$, we find that
$u(a, m) = (m_1 - m) + m = m_1$ and, similarly, that $u(a', m') = m_2$. (55)
Therefore,
$(a, m) \succsim (a', m') \iff (a^*, m_1) \succsim (a^*, m_2) \iff m_1 \geq m_2 \iff u(a, m) \geq u(a', m'),$
where the first equivalence follows from the fact that $(a, m) \sim (a^*, m_1)$ and $(a', m') \sim (a^*, m_2)$, the second equivalence from strong monotonicity in money, and the final one from (55).
Exercise 2.6
(a): Let $r \in \mathbb{R}$. If $X_u(r)$ contains at most one element, it is convex. If it contains two or more, let $x, y \in X_u(r)$ and let $\lambda \in (0, 1)$. To show: $\lambda x + (1 - \lambda) y \in X_u(r)$. Without loss of generality, assume that $x \succsim y$, so that $u(x) \geq u(y) \geq r$. By convexity of $\succsim$: $\lambda x + (1 - \lambda) y \succsim y$, so $u(\lambda x + (1 - \lambda) y) \geq u(y) \geq r$, i.e., $\lambda x + (1 - \lambda) y \in X_u(r)$.
(b): Let's do the quasiconcavity part; strict quasiconcavity proceeds similarly. Assume $u : X \to \mathbb{R}$ is quasiconcave. Let $y \in X$. To show: $\{x \in X : x \succsim y\}$ is a convex set. By definition, $\{x \in X : x \succsim y\} = \{x \in X : u(x) \geq u(y)\} = X_u(r)$, with $r = u(y)$. The latter set is convex by the definition of a quasiconcave function under (a).
(c): A function $u$ on a convex domain $X$ is concave if its subgraph
$\mathrm{subgraph}(u) = \{(x, y) \in X \times \mathbb{R} : y \leq u(x)\}$
is a convex set. Consider the weak order $\succsim$ on $X = \mathbb{R}$ represented by the utility function
$u(x) = 0$ if $x \leq 0$, and $u(x) = 1$ if $x > 0$.
This preference relation is convex, as, for each $y \in X$, the upper contour sets are convex:
$\{x \in X : x \succsim y\} = \mathbb{R}$ if $y \leq 0$, and $= (0, \infty)$ if $y > 0$.
Suppose $v : X \to \mathbb{R}$ were a concave utility function representing $\succsim$. By definition, $(-1, v(-1))$ and $(1, v(1))$ are elements of $\mathrm{subgraph}(v)$. Take $\lambda = 1/2$ and consider the convex combination
$\tfrac{1}{2}(-1, v(-1)) + \tfrac{1}{2}(1, v(1)) = (0, \tfrac{1}{2} v(-1) + \tfrac{1}{2} v(1)).$
Since $v(-1) < v(1)$, this point does not lie in the subgraph of $v$:
$v(0) = v(-1) < \tfrac{1}{2} v(-1) + \tfrac{1}{2} v(1).$
Exercise 2.7
(a): For all $n \in \mathbb{N}$, $f(nu) = n f(u)$ by additivity and induction on $n$. $f(0) = f(0 + 0) = f(0) + f(0)$, so $f(0) = 0$. Hence $f(0u) = 0 f(u)$. For all $n \in \mathbb{N}$, $f(-nu) = -n f(u)$: indeed, $0 = f(0) = f(nu + (-nu)) = f(nu) + f(-nu)$, so $f(-nu) = -f(nu) = -n f(u)$. So $f(xu) = x f(u)$ for all $x \in \mathbb{Z}$. For $x \in \mathbb{Q}$, write $x = p/q$ for some $p, q \in \mathbb{Z}$, $q \neq 0$. Rewriting $xu = (p/q)u$ gives $q(xu) = pu$. Hence $f(q(xu)) = f(pu)$. By the above, $q f(xu) = p f(u)$, so $f(xu) = (p/q) f(u) = x f(u)$.
(b): If $f$ is not linear, there are $x, y \in \mathbb{R} \setminus \{0\}$ with $f(x)/x \neq f(y)/y$. Hence, vectors $a = (x, f(x))$ and $b = (y, f(y))$ are linearly independent: vectors $\alpha a + \beta b$ with $\alpha, \beta \in \mathbb{R}$ span $\mathbb{R}^2$. So vectors $\alpha a + \beta b$ with $\alpha, \beta \in \mathbb{Q}$ are dense in $\mathbb{R}^2$. The latter vectors are in the graph of $f$: for $\alpha, \beta \in \mathbb{Q}$, (a) implies that
$(\alpha x + \beta y, f(\alpha x + \beta y)) = (\alpha x + \beta y, f(\alpha x) + f(\beta y)) = (\alpha x + \beta y, \alpha f(x) + \beta f(y)) = \alpha a + \beta b.$
(c): For each $i \in \{1, \ldots, n\}$, define $f_i : \mathbb{R} \to \mathbb{R}$ as follows:
$\forall x_i \in \mathbb{R}: \quad f_i(x_i) = F(x_i e_i).$
Applying additivity of $F$ $(n - 1)$ times gives, for each $x \in \mathbb{R}^n$, that
$F(x) = F\left(\sum_{i=1}^{n} x_i e_i\right) = \sum_{i=1}^{n} F(x_i e_i) = \sum_{i=1}^{n} f_i(x_i).$
To see that each $f_i$ must be additive, let $x_i, y_i \in \mathbb{R}$. By additivity of $F$:
$f_i(x_i + y_i) = F(x_i e_i + y_i e_i) = F(x_i e_i) + F(y_i e_i) = f_i(x_i) + f_i(y_i).$
Section 3

Exercise 3.2
By continuity of $f$, the weak order $\succsim$ on $X$ with
$\forall x, y \in X: \quad x \succsim y \iff f(x) \geq f(y)$
has open lower contour sets: for each $x \in X$,
$L(x) = \{y \in X : y \prec x\} = \{y \in X : f(y) < f(x)\} = f^{-1}((-\infty, f(x)))$
is the pre-image of an open interval. By Proposition 3.1, $X$ contains a best element. By definition of $\succsim$, this best element is a maximum of $f$. Existence of a minimum can be established by applying the proposition to the weak order $\succsim'$ with
$\forall x, y \in X: \quad x \succsim' y \iff f(x) \leq f(y).$
Exercise 3.3
(a): Assume $(X, \mathcal{B}, C)$ is rationalizable by the weak order $\succsim$ on $X$. Let $A, B \in \mathcal{B}$, $x, y \in A \cap B$, $x \in C(A)$, $y \in C(B)$. To show: $x \in C(B) = \{z \in B : z \succsim z' \text{ for all } z' \in B\}$. Since $y \in A$ and $x \in C(A) = \{z \in A : z \succsim z' \text{ for all } z' \in A\}$: $x \succsim y$. Let $z \in B$. Since $y \in C(B)$: $y \succsim z$. Using $x \succsim y$ and transitivity of $\succsim$: $x \succsim z$. So $x \succsim z$ for all $z \in B$, i.e., $x \in C(B)$.
(b): No. Consider the choice structure with
$X = \{a, b, c, d\}$, $\mathcal{B} = \{\{a, b, c\}, \{b, c, d\}\}$, $C(\{a, b, c\}) = \{b\}$, $C(\{b, c, d\}) = \{c\}$.
It trivially satisfies IIA: there are no distinct sets $A, B \in \mathcal{B}$ with $A \subseteq B$. It does not satisfy WARP: in the first problem, $b$ is revealed at least as good as $c$, in the second $c$ is revealed at least as good as $b$. So $b$ should have been contained in $C(\{b, c, d\})$.
(c): No. The choice structure in (b) satisfies IIA, but is not rationalizable. Suppose, to the contrary, that $\succsim$ rationalizes it. Since $C(\{a, b, c\}) = \{b\}$, we must have that $b \succsim c$ and $b \succsim a$. Since $C(\{b, c, d\}) = \{c\}$, we must have that $c \succsim b$ and $c \succsim d$. But then $b \sim c$, so $c \succsim b \succsim a$ implies $c \succsim a$. But then $c \succsim y$ for all $y \in \{a, b, c\}$, so $c$ should have been included in $C(\{a, b, c\})$.
(d): No. Consider the choice structure with $X = \{a, b, c\}$, $\mathcal{B} = \{\{a, b\}, \{b, c\}, \{a, c\}\}$, $C(\{a, b\}) = \{a\}$, $C(\{b, c\}) = \{b\}$, $C(\{a, c\}) = \{c\}$. As distinct choice sets have only one point in common, WARP is trivially satisfied. It is not rationalizable, as a rationalizing $\succsim$ should satisfy $a \succ b$, $b \succ c$, $c \succ a$, in violation of transitivity.
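For small finite choice structures such as those in (b) and (d), the three properties can be checked by brute force. The sketch below is illustrative only: the function names are hypothetical, IIA is taken in the form "if $A \subseteq B$ and $C(B) \cap A \neq \emptyset$, then $C(A) = C(B) \cap A$", and rationalizability is tested by enumerating rank functions, which suffices because every weak order on a finite set can be described by ranks.

```python
from itertools import product

def satisfies_warp(budgets, choice):
    """WARP: x, y in A & B, x in C(A), y in C(B) together imply x in C(B)."""
    for A, B in product(budgets, repeat=2):
        common = A & B
        chosen_from_A = common & choice[A]
        chosen_from_B = common & choice[B]
        if chosen_from_A and chosen_from_B and not chosen_from_A <= choice[B]:
            return False
    return True

def satisfies_iia(budgets, choice):
    """IIA: if A is a subset of B and C(B) meets A, then C(A) = C(B) & A."""
    for A, B in product(budgets, repeat=2):
        if A <= B and (choice[B] & A) and choice[A] != (choice[B] & A):
            return False
    return True

def is_rationalizable(X, budgets, choice):
    """Search all rank assignments (weak orders) for one whose best elements match C."""
    items = sorted(X)
    for ranks in product(range(len(items)), repeat=len(items)):
        rank = dict(zip(items, ranks))
        if all(
            choice[B] == frozenset(x for x in B if rank[x] == max(rank[z] for z in B))
            for B in budgets
        ):
            return True
    return False

# Choice structure from part (b).
budgets_b = [frozenset("abc"), frozenset("bcd")]
choice_b = {frozenset("abc"): frozenset("b"), frozenset("bcd"): frozenset("c")}

# Choice structure from part (d).
budgets_d = [frozenset("ab"), frozenset("bc"), frozenset("ac")]
choice_d = {
    frozenset("ab"): frozenset("a"),
    frozenset("bc"): frozenset("b"),
    frozenset("ac"): frozenset("c"),
}

print(satisfies_iia(budgets_b, choice_b), satisfies_warp(budgets_b, choice_b))
print(satisfies_warp(budgets_d, choice_d), is_rationalizable("abc", budgets_d, choice_d))
```

The enumeration is exponential in the number of alternatives, so it is only meant for examples of this size.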
Exercise 3.4
(a): [WARP satisfied] Let $A, B \in \mathcal{B}$, $x, y \in A \cap B$, $x \in C(A)$, and $y \in C(B)$. To show: $x \in C(B)$. We will simply show that $x = y$. By definition of $C$:
$\forall B \in \mathcal{B}$: if $B$ contains a satisfactory alternative, $C(B)$ selects one of them. (56)
Distinguish two cases:
Case 1: $v(x) < r$. Then also $v(y) < r$ by (56). Now $x \in C(A)$ implies that $x$ is the largest element of $A$. In particular, since $y \in A$: $y \leq x$. Similarly, $y \in C(B)$ implies that $x \leq y$. So $x = y \in C(B)$.
Case 2: $v(x) \geq r$. Then also $v(y) \geq r$ by (56). Now $x \in C(A)$ implies that $x$ is the smallest satisfactory element of $A$. In particular, since $y \in A$: $x \leq y$. Similarly, $y \in C(B)$ implies that $y \leq x$. So $x = y \in C(B)$.
[IIA satisfied] WARP implies IIA.
[A rationalizing weak order] Some conditions need to be satisfied:
A satisfactory element is always preferred to a nonsatisfactory one;
Among nonsatisfactory alternatives, the largest is chosen, so there having a high index is preferable;
Among satisfactory alternatives, the smallest is chosen, so there having a low index is preferable.
One weak order (verify!) rationalizing the choice structure is obtained by writing down (from worst to best) all nonsatisfactory alternatives from smallest to largest, then all satisfactory alternatives from largest to smallest.
(b): [IIA violated] For each $B \in \mathcal{B}$, let $x^*(B)$ be your partner's most preferred element of $B$. By definition of $C$: $C(B) = B \setminus \{x^*(B)\}$ for each $B \in \mathcal{B}$ with more than one element. Take $B = X$, $A = C(B)$. Both sets lie in $\mathcal{B}$ and $A \subseteq B$. Moreover, $C(B) \cap A = C(B) \neq \emptyset$. IIA would imply that $C(A) = C(B) \cap A = C(B)$, but $C(A) = A \setminus \{x^*(A)\} \subsetneq A = C(B)$, a contradiction.
[WARP violated] WARP implies IIA and is therefore violated as well.
[Rationalizability] As WARP is violated, the choice structure is not rationalizable.
Exercise 3.5
(a): In $B_1$, the first commodity has the highest price $p_1 = 2$, so spending wealth $w = 2$ on the first commodity gives $C(B_1) = \{(1, 0)\}$. Similarly, $C(B_2) = \{(0, 1)\}$.
(b): Yes, there is no set-inclusion between the two choice sets, so IIA holds vacuously.
(c): No, bundles $x = (1, 0)$ and $y = (0, 1)$ lie in $B_1 \cap B_2$. Since $x \in C(B_1)$ and $y \in C(B_2)$, WARP would require $x \in C(B_2)$.
(d): No: $C(B_1) = \{x\}$ would require $x \succ y$, whereas $C(B_2) = \{y\}$ would require $y \succ x$.
(e): For instance:
$u(x, p) = x_1$ if $p_1 > p_2$, $\quad u(x, p) = x_2$ if $p_2 > p_1$, $\quad u(x, p) = x_1 x_2$ if $p_1 = p_2$.
Section 4

Exercise 4.1
[Continuity:] As $\succsim$ is represented by the continuous utility function $u$, it is continuous. Formally, for each $y \in X$,
$\{x \in X : x \succsim y\} = \{x \in X : u(x) \geq u(y)\} = u^{-1}([u(y), \infty))$
is the preimage of a closed set under the continuous function $u$ and therefore closed. Similarly, the set $\{x \in X : x \precsim y\}$ is closed.
[Monotonicity, but not strong:] Take $x, y \in \mathbb{R}^L_+$ with $x \geq y$. There is an $i \in \{1, \ldots, L\}$ such that $u(x) = \min\{x_1/a_1, \ldots, x_L/a_L\} = x_i/a_i$. As $x \geq y$, it follows that
$u(x) = x_i/a_i \geq y_i/a_i \geq \min\{y_1/a_1, \ldots, y_L/a_L\} = u(y),$
so $x \succsim y$. Similarly, if $x > y$, then $x \succ y$. For a violation of strong monotonicity, notice that
$u(0, \ldots, 0) = u(1, 0, \ldots, 0) = 0,$
i.e., if you start with nothing, but get one unit of the first ingredient, you still cannot bake a cake due to lack of all the other ingredients!
[Convexity, but not strict:] Let $y \in \mathbb{R}^L_+$ and let $u(y) = \alpha$. Then
$\{x \in \mathbb{R}^L_+ : x \succsim y\} = \{x \in \mathbb{R}^L_+ : \min\{x_1/a_1, \ldots, x_L/a_L\} \geq \alpha\} = \bigcap_{\ell=1}^{L} \{x \in \mathbb{R}^L_+ : x_\ell/a_\ell \geq \alpha\}$
is the intersection of convex halfspaces and therefore convex. For a violation of strict convexity, take $x = (a_1 + 1, a_2, \ldots, a_L)$, $y = (a_1, \ldots, a_L)$. Both vectors (and any convex combination) suffice to make one cake: for each $\lambda \in (0, 1)$:
$x \sim y \sim \lambda x + (1 - \lambda) y,$
in contradiction with strict convexity.
[Homotheticity:] $u$ is homogeneous of degree one.

Exercise 4.2

With the additional restrictions, the budget sets become:
Indivisibilities: $B(p, w) \cap \mathbb{Z}^2_+$.
Rationing: $B(p, w) \cap \{x \in \mathbb{R}^2_+ : x_1 \leq 3\}$.
Rebates 1: $\{x \in \mathbb{R}^2_+ : p_1 x_1 + 4 \min\{x_2, 5\} + 2 \max\{x_2 - 5, 0\} \leq w\}$, as the first five units of commodity two cost $p_2 = 4$ and any additional ones only 2.
Rebates 2: $\{x \in \mathbb{R}^2_+ : x_2 \leq 5, \ p \cdot x \leq w\} \cup \{x \in \mathbb{R}^2_+ : x_2 > 5, \ 8x_1 + 2x_2 \leq 40\}$.
Initial endowment: $\{x \in \mathbb{R}^2_+ : p \cdot x \leq p \cdot \omega\}$.
Package deal: $B(p, w) \cap \{x \in \mathbb{R}^2_+ : x_1 = x_2\}$.
Gift certificate: $B(p, w) \cup \{x \in \mathbb{R}^2_+ : x_1 \geq 1/p_1, \ p_1(x_1 - 1/p_1) + p_2 x_2 \leq w\}$, the first set being the budget set if he does not use the gift certificate, the second one if he does and therefore acquires $1/p_1$ units of the first commodity without needing to address his budget.
Except for Rebates 2, the budget sets are nonempty, compact, so a most preferred bundle exists. Under Rebates 2, the budget set is not closed (it doesn't contain the boundary point $(30/8, 5)$) and a most preferred bundle need not exist. For instance, if the utility function of the consumer is $u(x) = \min\{4x_1, 3x_2\}$, there is no optimal bundle in the budget set. Drawing the budget set and some indifference curves will help you to verify this.
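The nonexistence claim under Rebates 2 can be illustrated numerically. The sketch below assumes the parameter values suggested by the formulas above ($p_1 = 8$, $p_2 = 4$ for the first five units, $w = 40$, and a price of 2 on all units of commodity two once more than five are bought); the function names are hypothetical. It shows utility approaching 15 as $x_2$ decreases towards 5 from above, while the limiting bundle $(30/8, 5)$ itself is not affordable.

```python
def in_budget(x1, x2):
    """Rebates 2 budget set, assuming p1 = 8, p2 = 4, w = 40 (rebate price 2 if x2 > 5)."""
    if x1 < 0 or x2 < 0:
        return False
    if x2 <= 5:
        return 8 * x1 + 4 * x2 <= 40
    return 8 * x1 + 2 * x2 <= 40

def u(x1, x2):
    """Utility function from the example in the text."""
    return min(4 * x1, 3 * x2)

def best_utility_given_x2(x2):
    """Best utility when buying exactly x2 > 5 units: spend the rest on commodity one."""
    x1 = (40 - 2 * x2) / 8
    return u(x1, x2)

# Utility along bundles with x2 slightly above 5: it equals 15 - eps.
approach = [best_utility_given_x2(5 + eps) for eps in (1.0, 0.1, 0.01, 0.001)]
print(approach)

# The limit bundle is outside the budget set, so the supremum 15 is not attained.
print(in_budget(30 / 8, 5))

# Without the rebate (x2 <= 5) the best bundle is (3, 4), with utility 12.
print(in_budget(3, 4), u(3, 4))
```

Every feasible bundle with $x_2 > 5$ can thus be improved upon by one with $x_2$ a little closer to 5.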
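Exercise 4.3
Walrasian demand is homogeneous of degree one in wealth: for all $(p, w) \in \mathbb{R}^{L+1}_{++}$ and all $\lambda > 0$, if $x \in x(p, w)$, then $\lambda x \in x(p, \lambda w)$.
Proof. Suppose not: there is a $z \in B(p, \lambda w)$ with $z \succ \lambda x$. Then $y := (1/\lambda) z \in B(p, w)$. As $x \in x(p, w)$, $x \succsim y$. As $\succsim$ is homothetic, also $\lambda x \succsim \lambda y = z$, contradicting that $z \succ \lambda x$.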
Exercise 4.4
(a): Consider a consumer with utility function $u(x) = x_1 + x_2$. Local nonsatiation is obvious. If $p_1 > p_2$, the consumer spends the entire income on the second commodity, so
$v(p, w) = w/p_2$ if $p_1 > p_2$.
Increasing $p_1$ even further does not affect indirect utility, i.e., indirect utility is not strictly decreasing in the price of commodity 1.
(b): To show: for each sequence $(p^n, w^n)_{n \in \mathbb{N}}$ in $\mathbb{R}^{L+1}_{++}$ with limit $(p, w) \in \mathbb{R}^{L+1}_{++}$, $v(p^n, w^n) \to v(p, w)$.
Proof. For each $n \in \mathbb{N}$, let $x^n \in x(p^n, w^n)$, which is possible by the assumptions in Proposition 4.4. As $x^n \in B(p^n, w^n)$ for all $n$ and $(p^n, w^n) \to (p, w)$, the sequence $(x^n)_{n \in \mathbb{N}}$ eventually lies in the slightly enhanced budget set $B(p, w + 1)$, which is compact: taking a subsequence if necessary, we may assume w.l.o.g. that the sequence $(x^n)_{n \in \mathbb{N}}$ is convergent, with limit $x \in X$. The sequence $(p^n, w^n, x^n)_{n \in \mathbb{N}}$ satisfies the properties of Proposition 4.3(b). In particular, $x \in x(p, w)$, i.e., $\lim_{n \to \infty} v(p^n, w^n) = \lim_{n \to \infty} u(x^n) = u(x) = v(p, w)$, by continuity of $u$.
(c): Roughly speaking, because continuous preferences may be represented by discontinuous utility functions, which may cause jumps in the indirect utility function as well. For instance, suppose a consumer has continuous utility function $U : \mathbb{R}_+ \to \mathbb{R}$ with $U(x) = \min\{x, 1\}$ and hence continuous preferences. These preferences can also be represented by the discontinuous utility function $u : \mathbb{R}_+ \to \mathbb{R}$ with
$u(x) = x$ if $x < 1$, and $u(x) = 2$ if $x \geq 1$.
Notice that
$x(p, w) = \{w/p\}$ if $w \leq p$, and $x(p, w) = [1, w/p]$ if $w > p$.
The indirect utility function given $u$ is
$v(p, w) = w/p$ if $w < p$, and $v(p, w) = 2$ if $w \geq p$,
with discontinuities at all points where $p = w$.
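The jump in part (c) is easy to see numerically. The sketch below fixes one concrete convention for the discontinuous representation, namely $u(x) = x$ for $x < 1$ and $u(x) = 2$ on the satiation region $x \geq 1$ (all bundles with $x \geq 1$ are indifferent under $U(x) = \min\{x, 1\}$, so they must receive the same utility); the function names are illustrative.

```python
def u(x):
    """Discontinuous representation of the preferences behind U(x) = min(x, 1)."""
    return x if x < 1 else 2.0

def v(p, w):
    """Indirect utility with one good: u is nondecreasing, so buying w / p is optimal."""
    return u(w / p)

# Approach the line w = p from below at price p = 1: utility tends to 1, then jumps to 2.
below = [v(1.0, w) for w in (0.9, 0.99, 0.999)]
at_boundary = v(1.0, 1.0)
print(below, at_boundary)
```

The same preferences represented by $U$ would give the continuous indirect utility $\min\{w/p, 1\}$, so the discontinuity is an artifact of the representation, not of the preferences.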
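Exercise 4.5
(a): Follows since
$e(\lambda p, u) = \min \{(\lambda p) \cdot x : x \in \mathbb{R}^L_+, \ u(x) \geq u\} = \lambda \min \{p \cdot x : x \in \mathbb{R}^L_+, \ u(x) \geq u\} = \lambda e(p, u).$
(b): Let $p \in \mathbb{R}^L_{++}$. Suppose there are $u', u'' \in U$ with $u(0, \ldots, 0) \leq u' < u''$ and $e(p, u') \geq e(p, u'')$. Let $x' \in h(p, u')$ and $x'' \in h(p, u'')$. Then $x'' \neq (0, \ldots, 0)$ and $p \cdot x' \geq p \cdot x''$. By continuity, $\lim_{\lambda \to 1} u(\lambda x'') = u(x'') \geq u'' > u'$, so $u(\lambda x'') > u'$ for $\lambda \in (0, 1)$ close to one. But then $p \cdot (\lambda x'') = \lambda (p \cdot x'') \leq \lambda (p \cdot x') < p \cdot x'$, contradicting that $x' \in h(p, u')$.
(c): Let $(p, u) \in \mathbb{R}^L_{++} \times U$, $i \in \{1, \ldots, L\}$, and $\varepsilon > 0$. For each $x \in \mathbb{R}^L_+$ with $u(x) \geq u$, $(p + \varepsilon e_i) \cdot x \geq p \cdot x$, so $e(p + \varepsilon e_i, u) \geq e(p, u)$.
(d): Let $u \in U$. To show: the set $\{(p, r) \in \mathbb{R}^L_{++} \times \mathbb{R} : r \leq e(p, u)\}$ is convex. Let $(p^1, r^1), (p^2, r^2)$ lie in this set, let $\lambda \in [0, 1]$, and define $(p, r) = \lambda (p^1, r^1) + (1 - \lambda)(p^2, r^2)$. Let $x \in h(p, u)$. Then $x$ is feasible in the EMP at $(p^1, u)$ and at $(p^2, u)$, so
$e(p, u) = p \cdot x = \lambda (p^1 \cdot x) + (1 - \lambda)(p^2 \cdot x) \geq \lambda e(p^1, u) + (1 - \lambda) e(p^2, u) \geq \lambda r^1 + (1 - \lambda) r^2 = r.$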
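Exercise 4.6
Let $(p, w) \in \mathbb{R}^{L+1}_{++}$ and $x = x(p, w)$. By Walras' Law, $p \cdot x = w$. For each $p' \in \mathbb{R}^L_{++}$, $x$ is feasible in the UMP at prices $p'$ and wealth $p' \cdot x$:
$v(p', p' \cdot x) \geq u(x) = v(p, w) = v(p, p \cdot x).$
So the function $f : \mathbb{R}^L_{++} \to \mathbb{R}$ with $f(p') = v(p', p' \cdot x)$ achieves its minimum at $p' = p$. By the first order conditions, its partial derivatives must be zero at $p$:
$\forall \ell = 1, \ldots, L: \quad \frac{\partial f(p)}{\partial p_\ell} = \frac{\partial v(p, p \cdot x)}{\partial p_\ell} + \frac{\partial v(p, p \cdot x)}{\partial w} x_\ell = 0.$
As $x = x(p, w)$ and $p \cdot x = w$, the result now follows.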
Exercise 4.7
By (15), indirect utility solves
$w = e(p, v(p, w)) = v(p, w) \sum_{i=1}^{L} a_i p_i,$ so $v(p, w) = w \Big/ \sum_{i=1}^{L} a_i p_i.$
By (17),
$x(p, w) = h(p, v(p, w)) = (a_1 v(p, w), \ldots, a_L v(p, w)) = \left( \frac{a_1 w}{\sum_{i=1}^{L} a_i p_i}, \ldots, \frac{a_L w}{\sum_{i=1}^{L} a_i p_i} \right).$
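As a sanity check on these formulas, the following sketch computes the demand and indirect utility for the Leontief utility $u(x) = \min\{x_1/a_1, \ldots, x_L/a_L\}$ of Exercise 4.1 and confirms that the bundle exhausts the budget and yields exactly the indirect utility. The numbers and function names are hypothetical, chosen only for illustration.

```python
def leontief_utility(a, x):
    """u(x) = min_i x_i / a_i."""
    return min(xi / ai for xi, ai in zip(x, a))

def leontief_demand(a, p, w):
    """Return (bundle, indirect utility) using v = w / sum(a_i p_i) and x_i = a_i v."""
    v = w / sum(ai * pi for ai, pi in zip(a, p))
    return [ai * v for ai in a], v

# Hypothetical recipe, prices and wealth.
a, p, w = (1.0, 2.0), (3.0, 1.0), 10.0
bundle, v = leontief_demand(a, p, w)
spending = sum(pi * xi for pi, xi in zip(p, bundle))

print(bundle, v, spending, leontief_utility(a, bundle))
```

With these values the consumer can bake $v = 2$ cakes, buying the ingredients in the fixed proportions $(1, 2)$.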
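Exercise 4.8
We know from (18) that $h_\ell(p, u) = x_\ell(p, e(p, u))$. Differentiating this equation w.r.t. $p_k$ and using the Chain rule gives
$\frac{\partial h_\ell(p, u)}{\partial p_k} = \frac{\partial x_\ell(p, e(p, u))}{\partial p_k} + \frac{\partial x_\ell(p, e(p, u))}{\partial w} \frac{\partial e(p, u)}{\partial p_k}.$
Recall from (14) that $\frac{\partial e(p, u)}{\partial p_k} = h_k(p, u)$:
$\frac{\partial h_\ell(p, u)}{\partial p_k} = \frac{\partial x_\ell(p, e(p, u))}{\partial p_k} + \frac{\partial x_\ell(p, e(p, u))}{\partial w} h_k(p, u).$
It follows from (15) and $u = v(p, w)$ that $e(p, u) = e(p, v(p, w)) = w$ and it follows from (18) and $u = v(p, w)$ that $h(p, u) = h(p, v(p, w)) = x(p, w)$, so:
$\frac{\partial h_\ell(p, u)}{\partial p_k} = \frac{\partial x_\ell(p, w)}{\partial p_k} + \frac{\partial x_\ell(p, w)}{\partial w} x_k(p, w).$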
Exercise 4.9
Indivisibilities, rationing, package deals, as well as the specific initial endowment $\omega = (1, 1)$ imply smaller budget sets and therefore a (weakly) lower welfare. Rebates 1 and 2 and the gift certificate imply a larger budget set and therefore a (weakly) higher welfare.
Exercise 4.10
As $p^1 \cdot x^0 < w^1$, $x^0 \in B(p^1, w^1)$. As $x^0$ does not exhaust the budget, $p^1 \cdot y \leq w^1$ for all $y$ with $\|x^0 - y\|$ sufficiently close to zero. By local nonsatiation, this neighborhood contains a $y$ strictly preferred to $x^0$.
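Exercise 4.11
Write $A = \sum_{i=1}^{L} a_i$. Standard calculations give:
$x(p, w) = \left( \frac{a_1 w}{A p_1}, \ldots, \frac{a_L w}{A p_L} \right),$
$v(p, w) = \left( \frac{w}{A} \right)^{A} \prod_{i=1}^{L} \left( \frac{a_i}{p_i} \right)^{a_i},$
$h(p, u) = u^{1/A} \prod_{i=1}^{L} \left( \frac{p_i}{a_i} \right)^{a_i/A} \left( \frac{a_1}{p_1}, \ldots, \frac{a_L}{p_L} \right),$
$e(p, u) = A u^{1/A} \prod_{i=1}^{L} \left( \frac{p_i}{a_i} \right)^{a_i/A},$
$EV((p^0, w^0), (p^1, w^1)) = A (u^1)^{1/A} \prod_{i=1}^{L} \left( \frac{p^0_i}{a_i} \right)^{a_i/A} - w^0,$
$CV((p^0, w^0), (p^1, w^1)) = w^1 - A (u^0)^{1/A} \prod_{i=1}^{L} \left( \frac{p^1_i}{a_i} \right)^{a_i/A}.$
It is commonly assumed (w.l.o.g., as this is just a monotonic transformation of the utility) that $A = 1$, which yields slightly more sympathetic expressions.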
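The Cobb-Douglas expressions can be cross-checked against the duality identities $e(p, v(p, w)) = w$ and $h(p, v(p, w)) = x(p, w)$. The sketch below assumes the utility function $u(x) = \prod_i x_i^{a_i}$ and uses hypothetical parameter values and function names; it is a numerical check, not a derivation.

```python
import math

def demand(a, p, w):
    """Walrasian demand x_i = a_i w / (A p_i), with A = sum(a)."""
    A = sum(a)
    return [ai * w / (A * pi) for ai, pi in zip(a, p)]

def utility(a, x):
    """Cobb-Douglas utility u(x) = prod x_i^{a_i} (assumed functional form)."""
    return math.prod(xi ** ai for xi, ai in zip(x, a))

def indirect_utility(a, p, w):
    A = sum(a)
    return (w / A) ** A * math.prod((ai / pi) ** ai for ai, pi in zip(a, p))

def expenditure(a, p, u):
    A = sum(a)
    return A * u ** (1 / A) * math.prod((pi / ai) ** (ai / A) for ai, pi in zip(a, p))

def hicksian(a, p, u):
    A = sum(a)
    scale = u ** (1 / A) * math.prod((pi / ai) ** (ai / A) for ai, pi in zip(a, p))
    return [scale * ai / pi for ai, pi in zip(a, p)]

# Hypothetical parameters with A = 2, so the exponents a_i / A matter.
a, p, w = (0.5, 1.5), (2.0, 3.0), 12.0
x = demand(a, p, w)
v = indirect_utility(a, p, w)

print(x, v)
print(expenditure(a, p, v), hicksian(a, p, v))
```

Setting $A = 1$ removes the outer powers and reproduces the more familiar textbook expressions.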
Section 5

Exercise 5.1
(a): $Y \cap \{y \in \mathbb{R}^L : y \geq -\omega\}$ is the intersection of closed sets, hence closed. It contains the zero vector $0$.
(b): Let $z_n = y_n / \|y_n\|$. By assumption, $y_n + \omega \geq 0$, so dividing by $\|y_n\|$ gives $z_n + \omega / \|y_n\| \geq 0$.
(c): As the length of the vectors $(y_n)_{n \in \mathbb{N}}$ diverges to infinity, $\|y_n\| \geq 1$ for $n$ sufficiently large. By convexity and possibility of inaction,
$z_n = \frac{1}{\|y_n\|} y_n + \left(1 - \frac{1}{\|y_n\|}\right) 0 \in Y.$
All vectors $z_n$ have length one. A bounded sequence contains a convergent subsequence. Let $z$ be its limit. Firstly, $z \neq 0$, as it is the limit of a sequence of vectors of length one. Secondly, as $z_n$ lies in $Y$ for $n$ large, and $Y$ is closed, also the limit $z$ lies in $Y$.
(d): Letting $n \to \infty$, and realizing that $\omega / \|y_n\| \to 0$, (b) implies that $z \geq 0$. As $z \neq 0$, this contradicts no free lunch.
Exercise 5.2
Reasoning as in the EMP, the assumptions on $f$ guarantee that the CMP is solvable.
(a): Define $q_z = f(z) \geq 0$. The CMP at $(w, q_z)$ has a solution and $z$ is feasible in this CMP, so $c(w, q_z) \leq w \cdot z$. Conclude that $p f(z) - w \cdot z \leq p q_z - c(w, q_z)$.
(b): Let $z_q$ solve the CMP at $(w, q)$, i.e., $z_q \in \mathbb{R}^{L-1}_+$, $f(z_q) \geq q$, and $c(w, q) = w \cdot z_q$. Conclude that $p f(z_q) - w \cdot z_q \geq p q - c(w, q)$.
(c): Assume (P1) has a solution $z$ (the case where (P2) has a solution is similar). By (a), there is a feasible $q_z$ in (P2) with equal or higher profit. It cannot be higher. Otherwise, by (b), there is a feasible $z_{q_z}$ in (P1) yielding at least the profit of $q_z$ and therefore a higher profit than the profit maximizing $z$, a contradiction. Conclude that $q_z$ solves (P2) and yields the same profit as $z$ in (P1).
Exercise 5.3
(a), (b): Consider the convex production set $Y = \{y \in \mathbb{R}^2 : y_1 \leq 0, \ y_2 \leq \sqrt{-y_1}\}$. The point $(0, -1) \in Y$ maximizes profit at price vector $p = (1, 0) \in \mathbb{R}^2_+$, but is not efficient, as also $(0, 0) \in Y$. The point $(0, 0) \in Y$ is efficient, but does not maximize profit at strictly positive prices.
(c): Consider the production set $Y = \{y \in \mathbb{R}^2 : y \leq (1, 1), \ (y_1 - 1)^2 + (y_2 - 1)^2 \geq 2\}$. The point $(0, 0)$ is efficient, but not profit maximizing for any nonzero vector $p \in \mathbb{R}^2_+$: if $p_1 \geq p_2$, then $(1, 1 - \sqrt{2}) \in Y$ yields a positive profit, and if $p_1 \leq p_2$, then $(1 - \sqrt{2}, 1) \in Y$ yields a positive profit, whereas $(0, 0) \in Y$ yields only zero profit.
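The claim in (c) that the origin never maximizes profit can be spot-checked on a grid of price vectors. The sketch assumes the production set $Y = \{y \in \mathbb{R}^2 : y \leq (1, 1), (y_1 - 1)^2 + (y_2 - 1)^2 \geq 2\}$; the grid, the rounding tolerance and the function names are illustrative choices.

```python
import math

def in_Y(y, tol=1e-12):
    """Membership in Y = {y <= (1, 1), (y1 - 1)^2 + (y2 - 1)^2 >= 2}, up to rounding."""
    return y[0] <= 1 and y[1] <= 1 and (y[0] - 1) ** 2 + (y[1] - 1) ** 2 >= 2 - tol

def profit(p, y):
    return p[0] * y[0] + p[1] * y[1]

# The two production plans used in the solution.
candidates = [(1.0, 1.0 - math.sqrt(2)), (1.0 - math.sqrt(2), 1.0)]

# Nonzero price vectors on a coarse grid in the nonnegative quadrant.
prices = [(i / 4, j / 4) for i in range(5) for j in range(5) if (i, j) != (0, 0)]

# For each price vector, the better of the two candidates earns a positive profit.
best_profits = [max(profit(p, y) for y in candidates) for p in prices]
print(min(best_profits))
```

Since the origin earns zero profit at every price vector, a strictly positive minimum over the grid is consistent with the argument in the text.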
Section 6

Exercise 6.1
(a): Look at the definitions of improvements and Pareto optimality: the fact that the coalition $S = H$ of all consumers cannot improve upon an allocation means that there is nothing feasible that makes everybody better off. But there may still be room for improvement for some if not all consumers: it may still be Pareto dominated.
(b): Consider a pure exchange economy with two consumers and two commodities. The first consumer's preferences are represented by the utility function $u_1(x) = x_1 x_2$, the second consumer's preferences by a constant utility function: he is indifferent between all commodity bundles. If $\omega^1 = \omega^2 = (1, 1)$, then $(p, x) = ((1, 1), (1, 1), (1, 1))$ (i.e., prices are equal and each consumer sticks to the initial endowment) is a Walrasian equilibrium. By Proposition 6.2, the allocation lies in the core. But the allocation is not Pareto optimal: giving the total endowment to the first consumer makes him better off, while not affecting the happiness of the second consumer.
Exercise 6.2
(a): Let $p \in \mathbb{R}^L_+$, $z \in z(p)$ with $z \leq 0$. By Walras' Law, $p \cdot z = \sum_{k : p_k > 0} p_k z_k = 0$. As the sum of nonpositive terms, it can be zero only if $z_\ell = 0$ whenever $p_\ell > 0$.
(b): Let $p, z, \ell$ be as in the statement of the exercise. As $z_k = 0$ for $k \neq \ell$, Walras' Law implies $p \cdot z = p_\ell z_\ell = 0$. As $p_\ell > 0$, this implies $z_\ell = 0$.
(c): If in equilibrium the market for good $\ell \in \{1, \ldots, L\}$ does not clear, its price is zero by (a). So consumer $h$ is not constrained in his consumption of $\ell$. In equilibrium, $h$ must choose a most preferred bundle from the budget set, but there is none: under (c1), each bundle can be improved upon by adding more of good $\ell$; under (c2), a most preferred bundle can't lie on the axes, as $h$ can afford a better alternative in $\mathbb{R}^L_{++}$; the latter can be improved upon by adding more of good $\ell$.
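Exercise 6.3
(a): Pareto dominance tries to compare allocations regardless of prices. Preferences of firms (profit) are functions of prices.
(b): Let $(p, x, y)$ be a Walrasian equilibrium of $E$. Suppose there is a feasible allocation $(\tilde{x}, \tilde{y})$ Pareto dominating $(x, y)$. Local nonsatiation implies
$\forall h \in H: \quad \tilde{x}^h \succsim^h x^h \implies p \cdot \tilde{x}^h \geq p \cdot x^h = p \cdot \omega^h + \sum_{f \in F} \theta^{hf} \, p \cdot y^f,$
$\forall h \in H: \quad \tilde{x}^h \succ^h x^h \implies p \cdot \tilde{x}^h > p \cdot x^h = p \cdot \omega^h + \sum_{f \in F} \theta^{hf} \, p \cdot y^f.$
By Pareto dominance, such a weak preference holds for all, and strict preference for some $h \in H$. Summing over $h \in H$ and using that equilibrium production plans $(y^f)_{f \in F}$ are profit maximizing at prices $p$ gives
$p \cdot \sum_{h \in H} \tilde{x}^h > p \cdot \sum_{h \in H} x^h = \sum_{h \in H} \left( p \cdot \omega^h + \sum_{f \in F} \theta^{hf} \, p \cdot y^f \right) = p \cdot \omega + p \cdot \sum_{f \in F} y^f \geq p \cdot \omega + p \cdot \sum_{f \in F} \tilde{y}^f.$
But $p \cdot \sum_{h \in H} \tilde{x}^h > p \cdot \omega + p \cdot \sum_{f \in F} \tilde{y}^f$ contradicts feasibility of $(\tilde{x}, \tilde{y})$.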
Exercise 6.4
Pure exchange economies: You may verify that the following pure exchange economies $E = (\succsim^1, \succsim^2, \omega^1, \omega^2)$ have the desired property:
(a): Let $\succsim^1$ and $\succsim^2$ be lexicographic preferences over $\mathbb{R}^2_+$, $\omega^1 = (1, 0)$, and $\omega^2 = (0, 1)$. There is no Walrasian equilibrium:
if $p$ has both prices positive, then consumer 1 demands $\omega^1$ and consumer 2 demands $(p_2/p_1, 0)$, so there is excess demand for the first commodity;
if one of the commodities has price zero, demand for this commodity is unbounded.
(b): The standard Cobb-Douglas case.
(c): Let $\succsim^1$ be represented by the utility function $u_1(x) = \max\{\min\{2x_1, x_2\}, \min\{x_1, 2x_2\}\}$ and $\succsim^2$ by the utility function $u_2(x) = x_1 + x_2$. Let $\omega^1 = \omega^2 = (1, 1)$.
if one of the commodities has price zero, demand for this commodity is unbounded: there are no Walrasian equilibria at such prices;
if both prices are positive and $p_1 > p_2$, the first consumer demands a bundle with $2x_1 = x_2$, i.e., the bundle $(p \cdot \omega^1/(p_1 + 2p_2), \ 2 p \cdot \omega^1/(p_1 + 2p_2))$ and the second consumer spends the entire income on the second commodity, i.e., demands the bundle $(0, p \cdot \omega^2/p_2)$. In particular, demand for the second commodity is at least twice the demand for the first commodity. As the total endowment of both commodities is equal, not both markets can clear at the same time, contradicting the fact that (given local nonsatiation) markets with a positive price must clear. There are no Walrasian equilibria at such prices;
similarly, Walrasian equilibria with positive prices and $p_2 > p_1$ are ruled out;
if both prices are positive and equal, i.e., $p = (1/2, 1/2)$, the first consumer's demand is $\{(2/3, 4/3), (4/3, 2/3)\}$ and the second consumer's demand is $\{x \in \mathbb{R}^2_+ : x_1 + x_2 = 2\}$. There are two (equilibrium/market clearing) allocations: $((2/3, 4/3), (4/3, 2/3))$ and $((4/3, 2/3), (2/3, 4/3))$.
(d): Preferences $\succsim^1$, $\succsim^2$ are such that the consumers are indifferent between all commodity bundles; $\omega^1 = \omega^2 = (1, 1)$. Every $(p, x)$ with $p \in \mathbb{R}^2_+$, $p \neq 0$, and $x = (x^1, x^2) \in \mathbb{R}^2_+ \times \mathbb{R}^2_+$ with $x^h \in B^h(p, p \cdot \omega^h)$ for both $h = 1, 2$ is a Walrasian equilibrium.
Private ownership economies: Take the examples above and give the producers the trivial production set $\{0\}$ consisting of the remarkable feat of producing absolutely nothing using absolutely nothing. If you prefer slightly larger production sets, you may want to choose them equal to $-\mathbb{R}^2_+$, containing all production plans producing absolutely nothing, possibly using something.
Exercise 6.5
Feasible allocations: $\{(x^T, x^L) \in \mathbb{R}^2_+ : x^T + x^L \leq 1\}$.
Pareto optimal allocations: Must be nonwasteful, otherwise the remainder can be given to the liar, who becomes happier, while the true mother is not harmed. Moreover, $x^T \notin (0, 1)$: otherwise the true mother can be made happier by giving her 0, while not harming the liar. Only allocations $(0, 1)$ and $(1, 0)$ are Pareto optimal.
Core: The core depends on the initial allocation $(\omega^T, \omega^L)$. Denote an allocation by a vector $x = (x^T, x^L)$. The liar can improve upon any allocation with $x^L < \omega^L$, so $x^L \geq \omega^L$ in the core. For the true mother:
if $\omega^T = 0$, individual rationality and feasibility require that $x^T \in \{0, 1\}$,
if $\omega^T \in (0, 1)$, individual rationality has no bite: everything is at least as good as her initial allocation,
if $\omega^T = 1$, individual rationality and feasibility require that $x^T = 1$.
The coalition of both women can improve upon any feasible allocation with $x^T \in (0, 1)$ by giving the liar the entire baby, so $x^T \in \{0, 1\}$ in the core.
Combining the above gives that the core is
$\{(1, 0)\}$ if the initial endowment is $(\omega^T, \omega^L) = (1, 0)$,
$\{(0, x) : \omega^L \leq x \leq 1\}$ if the initial endowment has $\omega^T \in (0, 1)$,
$\{(0, 1)\}$ if the initial endowment is $(\omega^T, \omega^L) = (0, 1)$.
Notice that if $\omega^T \in (0, 1)$, there are wasteful core allocations.
Walrasian equilibria: The Walrasian equilibria depend on the initial allocation $(\omega^T, \omega^L)$. As equilibrium involves a nonzero price vector, we may assume w.l.o.g. that the equilibrium price is $p > 0$. The true mother demands 0 if $\omega^T \in [0, 1)$ and 1 if $\omega^T = 1$. The liar demands $\omega^L$. Therefore, the set of Walrasian equilibria is
$\{(p, x^T, x^L) \in \mathbb{R}^3 : p > 0, \ x^T = 0, \ x^L = \omega^L\}$ if the initial endowment has $\omega^T \in [0, 1)$,
$\{(p, x^T, x^L) \in \mathbb{R}^3 : p > 0, \ x^T = 1, \ x^L = 0\}$ if $\omega^T = 1$.
Section 7

Exercise 7.1
(a): Best elements of $G$: those whose reduced simple gambles put largest probability on $\max\{a_1, \ldots, a_k\}$. Worst elements of $G$: those whose reduced simple gambles put largest probability on $\min\{a_1, \ldots, a_k\}$.
(G1) satisfied: preferences represented by utility function $u(g) = \frac{1}{|L(g)|} \sum_{a_i \in L(g)} a_i$.
(G2) violated: assume w.l.o.g. that $a_1 > a_2$ and consider the gambles $a_1$ and $(p \circ a_1, (1 - p) \circ a_2)$. If $p > 1/2$, $a_1$ is the most likely outcome in both gambles, so the DM is indifferent between them. Continuity would require $a_1 \sim (\tfrac{1}{2} \circ a_1, \tfrac{1}{2} \circ a_2)$. However, at $p = 1/2$, the DM assigns value $\frac{a_1 + a_2}{2} < a_1$ to the second gamble, so he strictly prefers the gamble giving $a_1$ for sure.
(G3) satisfied: preferences are defined in terms of reduced simple gambles: $u(g) = u(g_s)$.
(G4) violated: assume w.l.o.g. that $a_1 > a_2$. Then the DM strictly prefers $g = a_1$ to $g' = a_2$. Independence requires that also
$(\alpha \circ g, (1 - \alpha) \circ a_1) \succ (\alpha \circ g', (1 - \alpha) \circ a_1)$ for all $\alpha \in (0, 1)$.
However, for $\alpha$ close to zero, $a_1$ is the most likely outcome in both gambles, so the DM is indifferent between them.
As (G2) and (G4) are violated, Remark 7.3 implies that $\succsim$ cannot be represented by a vNM utility function.
(b): Best element of $G$: deterministic outcome $\max\{a_1, \ldots, a_k\}$. Worst elements of $G$ do not exist: for each $g \in G$, the gamble $(\tfrac{1}{2} \circ g, \tfrac{1}{2} \circ g)$ has higher complexity and is therefore worse than $g$.
(G1) satisfied: preferences represented by a utility function.
(G2) satisfied: on $G_1$, the DM's utility function $u(g) = \sum_{m=1}^{k} p_m a_m$ is continuous.
(G3) violated: the gambles $a_1 \in G_0$ and $(\tfrac{1}{2} \circ a_1, \tfrac{1}{2} \circ a_1) \in G_1$ both have reduced simple gamble $(1 \circ a_1)$, yet the former lies in $G_0$ and is therefore strictly preferred to the latter in $G_1$.
(G4) violated: Let
$g = a_1 \in G_0$, $\quad g' = (\tfrac{1}{2} \circ a_1, \tfrac{1}{2} \circ a_1) \in G_1$, $\quad g'' = (\tfrac{1}{2} \circ g', \tfrac{1}{2} \circ g') \in G_2$.
Let $\alpha \in (0, 1)$. By construction, $(\alpha \circ g, (1 - \alpha) \circ g''), (\alpha \circ g', (1 - \alpha) \circ g'') \in G_3$. Hence
$u(g) = a_1 - 0$, $\quad u(g') = a_1 - 1$, $\quad u(\alpha \circ g, (1 - \alpha) \circ g'') = a_1 - 3$, $\quad u(\alpha \circ g', (1 - \alpha) \circ g'') = a_1 - 3$,
in violation of (G4).
As (G3) and (G4) are violated, Remark 7.3 implies that $\succsim$ cannot be represented by a vNM utility function.
(c): To characterize the best and worst elements of $G$, distinguish two cases:
1. $\min\{a_1, \ldots, a_k\} \leq 5 < \max\{a_1, \ldots, a_k\}$. Best elements of $G$: those putting probability one on outcomes $a_m > 5$ (utility equal to its maximum, one). Worst elements of $G$: those putting probability one on outcomes $a_m \leq 5$ (utility equal to its minimum, zero).
2. Otherwise, if all $a_k$ exceed 5 or all $a_k$ are at most five, the utility function is constant (one in the former case, zero in the latter), so all gambles are equivalent (and hence both best and worst elements of $G$).
Shortcut: for each $i = 1, \ldots, k$, define $u(a_i) = 0$ if $a_i \leq 5$ and $u(a_i) = 1$ otherwise. Then for every $g \in G$ with reduced simple gamble $(p_1 \circ a_1, \ldots, p_k \circ a_k)$, we have $u(g) = \sum_{i : a_i > 5} p_i = \sum_{i=1}^{k} p_i u(a_i)$, i.e., this defines a vNM utility function. By Remark 7.3, $\succsim$ must satisfy (G1) to (G4).
Section 10

Exercise 10.1
(a): If $u$ has no upper bound, construct a sequence of instantaneous utilities $(u(c_t))_{t=0}^{\infty}$ with $u(c_t) > 1/\delta(t)$ for each time $t$. Then $\delta(t) u(c_t) > 1$ at each time $t$ and $\sum_{t=0}^{\infty} \delta(t) u(c_t)$ diverges.
(b): Let $u$ be bounded by $B \in \mathbb{R}$ and let $c = (c_t)_{t=0}^{\infty}$ be an arbitrary stream of choices. For each $t$, $|\delta(t) u(c_t)| \leq B \delta(t)$ and $\sum_{t=0}^{\infty} B \delta(t) = B \sum_{t=0}^{\infty} \delta(t)$ converges. By the comparison test for summable sequences, also $\sum_{t=0}^{\infty} \delta(t) u(c_t)$ converges.
Exercise 10.2 (a1) k gives instantaneous

optimal action is
utility
u(h, k) = 1, u(d, k) = 0,
gives instantaneous utility gives instantaneous utility
u(h, ) = 1,
so the so the
(a2) k (b) k
with instantaneous utility 1.
gives instantaneous utility
u(d, ) = ,
optimal action is
with instantaneous utility 0.
gives expected discounted utility
u(d, ) +
1 2 (u(h,
) + u(d, k)) = + k k
1 2 (1
u(d, k) + 0 = 0, 1 + 0) = 2 , so
if
gives expected discounted utility the optimal action is
and
1 2 1 if 2 1 if 2
< 0, = 0, > 0.
1 (c) If the severity of the depression is relatively small ( 2 > 0), an initially depressed person
may decide not to take his life in the hope of becoming happy later while still having the option of suicide in case of continued depression.
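Exercise 10.3
Preferring one apple today over two apples tomorrow means that
$u(1) > (1 + \alpha)^{-\beta/\alpha} u(2).$
Preferring two apples one year and a day from now to one apple a year from now (and assuming we're not in a leap year) means that
$(1 + 366\alpha)^{-\beta/\alpha} u(2) > (1 + 365\alpha)^{-\beta/\alpha} u(1).$
These two inequalities hold simultaneously if
$\left( \frac{1}{1 + \alpha} \right)^{\beta/\alpha} < \frac{u(1)}{u(2)} < \left( \frac{1 + 365\alpha}{1 + 366\alpha} \right)^{\beta/\alpha}.$
Given $\alpha > 0$, it remains possible to choose the exponent $\beta/\alpha$ arbitrarily: having it equal to $\gamma$ simply means choosing $\beta = \alpha\gamma$. So we can simplify the problem and show that there are $\alpha, \gamma > 0$ solving
$\left( \frac{1}{1 + \alpha} \right)^{\gamma} < \frac{u(1)}{u(2)} < \left( \frac{1 + 365\alpha}{1 + 366\alpha} \right)^{\gamma}$
or similarly
$\frac{1}{1 + \alpha} < \left( \frac{u(1)}{u(2)} \right)^{1/\gamma} < \frac{1 + 365\alpha}{1 + 366\alpha}.$
Notice that $\alpha > 0$ implies that
$0 < \frac{1}{1 + \alpha} < \frac{1 + 365\alpha}{1 + 366\alpha} < 1.$
The expression $(u(1)/u(2))^{1/\gamma}$ is a continuous function of $\gamma > 0$. As $u(1)/u(2) \in (0, 1)$, it goes to zero as $\gamma \to 0$ and to one as $\gamma \to \infty$. By the Intermediate Value Theorem, there exists, for each $\alpha > 0$, a $\gamma > 0$ such that $(u(1)/u(2))^{1/\gamma}$ lies between the two desired bounds.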
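The Intermediate Value Theorem argument can be turned into a small bisection search. The sketch below is illustrative: the names `alpha` (the curvature parameter of the hyperbolic discount function) and `exponent` (the power to which the discount factor is raised), the target midpoint, and the sample values are all assumptions made for this example. It finds an exponent for which the ratio $u(1)/u(2)$, raised to one over that exponent, lies strictly between the two bounds, and then verifies the two original preference statements.

```python
def find_exponent(ratio, alpha, tol=1e-12):
    """Bisection for an exponent e > 0 with lower < ratio**(1/e) < upper.

    ratio = u(1)/u(2) must lie in (0, 1); alpha > 0 is given.
    """
    lower = 1 / (1 + alpha)
    upper = (1 + 365 * alpha) / (1 + 366 * alpha)
    target = (lower + upper) / 2  # any point strictly between the bounds works

    def f(e):
        return ratio ** (1 / e)  # continuous and increasing in e for ratio in (0, 1)

    lo = hi = 1.0
    while f(lo) >= target:  # shrink until f(lo) is below the target
        lo /= 2
    while f(hi) <= target:  # grow until f(hi) is above the target
        hi *= 2
    while hi - lo > tol * hi:
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical values: two apples are worth twice one apple, alpha = 1 per day.
ratio, alpha = 0.5, 1.0
exponent = find_exponent(ratio, alpha)

one_today_beats_two_tomorrow = ratio > (1 / (1 + alpha)) ** exponent
two_later_beats_one_later = ((1 + 365 * alpha) / (1 + 366 * alpha)) ** exponent > ratio
print(exponent, one_today_beats_two_tomorrow, two_later_beats_one_later)
```

Both checks hold for the exponent found, which reproduces the preference reversal described in the exercise for this particular choice of values.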
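Exercise 10.5
$\liminf_{t \to \infty} x_t = c$ implies [L1] and [L2]: Let $\varepsilon > 0$. As $\lim_{t \to \infty} \inf\{x_s : s \ge t\} = c$, there is a $T \in \mathbb{N}$ such that
\[ c - \varepsilon < \inf\{x_s : s \ge t\} < c + \varepsilon \quad \text{for all } t \ge T. \]
Apply the first inequality in the special case of $t = T$:
\[ c - \varepsilon < \inf\{x_s : s \ge T\}, \]
so $c - \varepsilon < x_t$ for all $t \ge T$, proving [L1]. Let $T' \in \mathbb{N}$ and apply the second inequality to the special case of $t = \max\{T, T'\}$:
\[ \inf\{x_s : s \ge \max\{T, T'\}\} < c + \varepsilon, \]
i.e., there is a $t \ge T'$ with $x_t < c + \varepsilon$, proving [L2].

[L1] and [L2] imply $\liminf_{t \to \infty} x_t = c$: Let $\varepsilon > 0$. By [L1] there is a $T \in \mathbb{N}$ such that $c - \varepsilon/2 < x_t$ for all $t \ge T$. Hence,
\[ c - \varepsilon < c - \varepsilon/2 \le \inf\{x_s : s \ge T\}. \]
As the infimum increases weakly if the bound does, it follows that, for each $t \ge T$:
\[ c - \varepsilon < \inf\{x_s : s \ge t\}. \tag{57} \]
By [L2] applied to an arbitrary $t \ge T$, there is an $s \ge t$ such that $x_s < c + \varepsilon/2 < c + \varepsilon$, i.e.,
\[ \inf\{x_s : s \ge t\} \le c + \varepsilon/2 < c + \varepsilon. \tag{58} \]
Combining (57) and (58) gives that for each $\varepsilon > 0$ there is a $T \in \mathbb{N}$ such that
\[ c - \varepsilon < \inf\{x_s : s \ge t\} < c + \varepsilon \quad \text{for all } t \ge T, \]
i.e., $\liminf_{t \to \infty} x_t = c$.

Exercise 10.6
It suffices to show, for an arbitrary sequence $(x_t)_{t=0}^{\infty}$:
\[ \liminf_{t \to \infty} x_t > 0 \iff \exists\, \varepsilon > 0 : x_t > \varepsilon \text{ for all but finitely many } t. \]
($\Rightarrow$): Assume $\liminf_{t \to \infty} x_t > 0$. If the liminf is infinite, the weakly increasing sequence of infima $\inf\{x_s : s \ge t\}$ diverges, so there is a $T \in \mathbb{N}$ with $\inf\{x_s : s \ge T\} \ge 1$. In particular, $x_t \ge 1$ for all $t \ge T$, so the claim holds with $\varepsilon = 1/2$. If the liminf is finite, say equal to $c > 0$, [L1] with $\varepsilon = c/2$ implies that there is a $T \in \mathbb{N}$ with $x_t > c - \varepsilon = c/2$ for all $t \ge T$.
($\Leftarrow$): Assume there is an $\varepsilon > 0$ such that $x_t > \varepsilon$ for all but finitely many $t$: there is a $T \in \mathbb{N}$ such that $x_t > \varepsilon$ for $t \ge T$. Then $\inf\{x_s : s \ge t\} \ge \varepsilon$ for $t \ge T$, so the limit of the infima is also at least $\varepsilon$: it must be positive!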
Exercise 10.7
(a): If a sequence is unbounded, the liminf of average payoffs need not converge. For instance, the unbounded sequence $x = (x_t)_{t=0}^{\infty}$ defined recursively by $x_0 = 1$ and, for all $t \in \mathbb{N}$,
\[ x_t = (t + 1)^2 - \sum_{k=0}^{t-1} x_k, \]
has time average $\frac{1}{T} \sum_{t=0}^{T-1} x_t = T$, so its liminf diverges to infinity.
(b): Let $x = (x_t)_{t=0}^{\infty}$ and $y = (y_t)_{t=0}^{\infty}$ be two bounded sequences. We need to investigate whether
\[ \liminf_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} (x_t - y_t) > 0 \iff \liminf_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} x_t > \liminf_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} y_t. \tag{59} \]
To see that this is not the case, let $x = (0, 0, \ldots)$ be the zero sequence. Substitution in (59) and using, for any sequence $z = (z_t)_{t=0}^{\infty}$, that $\liminf_{t \to \infty} (-z_t) = -\limsup_{t \to \infty} z_t$, where the limes superior is defined analogously to liminf as $\limsup_{t \to \infty} z_t = \lim_{t \to \infty} (\sup\{z_s : s \ge t\})$, yields
\[ \limsup_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} y_t < 0 \iff \liminf_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} y_t < 0. \]
This is obviously false. For an explicit example, take the sequence from page 73 with the oscillating average and subtract $1/2$ from each entry to obtain a sequence of averages with liminf equal to $1/3 - 1/2 = -1/6 < 0$, but limsup equal to $2/3 - 1/2 = 1/6 > 0$.
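The recursion in part (a) is easy to check by direct computation. The sketch below uses hypothetical helper names (`unbounded_sequence`, `time_averages`); it builds the first few entries and their running averages. The recursion works out to $x_t = 2t + 1$, so the partial sums are perfect squares and the time average over the first $T$ entries equals $T$.

```python
def unbounded_sequence(n):
    """First n entries of x_0 = 1, x_t = (t + 1)^2 - sum_{k < t} x_k."""
    entries = []
    running_sum = 0
    for t in range(n):
        x_t = (t + 1) ** 2 - running_sum  # for t = 0 this gives x_0 = 1
        entries.append(x_t)
        running_sum += x_t
    return entries


def time_averages(entries):
    """Averages (1/T) * sum_{t < T} x_t for T = 1, ..., len(entries)."""
    averages = []
    running_sum = 0
    for horizon, value in enumerate(entries, start=1):
        running_sum += value
        averages.append(running_sum / horizon)
    return averages


xs = unbounded_sequence(6)
avgs = time_averages(xs)
```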
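Section 11

Exercise 11.1
The cost function is strictly convex, so the function
\[ \sum_{i=1}^{n} p_i \pi(i) - \frac{1}{2\lambda} c(p) \]
is strictly concave. Since we maximize a strictly concave, continuous function over a compact set, a maximum exists and is unique. Notice that the gradient of the goal function has $i$-th coordinate
\[ \pi(i) - \frac{1}{2\lambda} \frac{\partial c(p)}{\partial p_i} = \pi(i) - \frac{1}{2\lambda} \cdot 2\left(p_i - \frac{1}{n}\right) = \pi(i) - \frac{1}{\lambda}\left(p_i - \frac{1}{n}\right). \]
Since the feasible set is entirely defined by linear (in)equalities, the Kuhn-Tucker conditions give necessary and sufficient conditions for a solution to be a maximum. So $p^*$ solves the maximization problem if and only if there are Lagrange multipliers $\mu_i \ge 0$ associated with the inequality constraints $p \ge 0$ and $\nu \in \mathbb{R}$ associated with the equality constraint $\sum_{i=1}^{n} p_i = 1$ such that for each $i = 1, \ldots, n$:
\[ \pi(i) - \frac{1}{\lambda}\left(p_i^* - \frac{1}{n}\right) + \mu_i + \nu = 0 \quad \text{and} \quad \mu_i p_i^* = 0. \]
Rewriting we find
\[ \forall i = 1, \ldots, n : \quad p_i^* = \lambda \pi(i) + \lambda(\mu_i + \nu) + \frac{1}{n}. \tag{60} \]
Assume that $p^*$ solves the maximization problem. We check that it satisfies the linear probability model with parameter $\lambda$. If $p_i^* > 0$, then $\mu_i = 0$ by complementary slackness. Hence for every $j \in A$, we find, using (60):
\[ p_i^* - p_j^* = \lambda \pi(i) + \lambda \nu + \frac{1}{n} - \left( \lambda \pi(j) + \lambda(\mu_j + \nu) + \frac{1}{n} \right) = \lambda(\pi(i) - \pi(j)) - \lambda \mu_j \le \lambda(\pi(i) - \pi(j)), \]
where the inequality follows from the fact that $\lambda > 0$ and $\mu_j \ge 0$. This is exactly requirement (52).

Conversely, if $p^*$ satisfies requirement (52), one can easily show that it satisfies the Kuhn-Tucker conditions. Recall that if $p_i^* > 0$ and $p_j^* > 0$, then $p_i^* - p_j^* = \lambda(\pi(i) - \pi(j))$, so
\[ \frac{1}{\lambda}\left(p_i^* - \frac{1}{n}\right) - \pi(i) = \frac{1}{\lambda}\left(p_j^* - \frac{1}{n}\right) - \pi(j). \tag{61} \]
Hence if we choose $i \in \{1, \ldots, n\}$ with $p_i^* > 0$ and define
\[ \nu = \frac{1}{\lambda}\left(p_i^* - \frac{1}{n}\right) - \pi(i) \in \mathbb{R}, \]
we have from (61) that
\[ \nu = \frac{1}{\lambda}\left(p_j^* - \frac{1}{n}\right) - \pi(j) \quad \text{for all } j \text{ with } p_j^* > 0. \]
Now define for each $k$:
\[ \mu_k = \begin{cases} 0 & \text{if } p_k^* > 0, \\ \frac{1}{\lambda}\left(p_k^* - \frac{1}{n}\right) - \pi(k) - \nu & \text{if } p_k^* = 0. \end{cases} \]
To see that $\mu_k \ge 0$ if $p_k^* = 0$, choose an alternative $j$ with $p_j^* > 0$. By definition of the linear probability model,
\[ p_j^* - p_k^* \le \lambda(\pi(j) - \pi(k)), \]
which implies
\[ (\pi(j) - \pi(k)) + \frac{1}{\lambda}\left(p_k^* - p_j^*\right) \ge 0. \]
Hence
\[ \mu_k = \frac{1}{\lambda}\left(p_k^* - \frac{1}{n}\right) - \pi(k) - \nu = \frac{1}{\lambda}\left(p_k^* - \frac{1}{n}\right) - \pi(k) - \frac{1}{\lambda}\left(p_j^* - \frac{1}{n}\right) + \pi(j) = (\pi(j) - \pi(k)) + \frac{1}{\lambda}\left(p_k^* - p_j^*\right) \ge 0, \]
as we had to show. Substituting the definition of the Lagrange multipliers in (60) shows that the Kuhn-Tucker conditions are satisfied.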
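The characterization can be checked numerically. The sketch below rests on an assumption that is only inferred from the gradient computed in this solution, namely that the control cost is $c(p) = \sum_i (p_i - 1/n)^2$. Under that assumption, completing the square shows that maximizing $\sum_i p_i \pi(i) - \frac{1}{2\lambda} c(p)$ over the simplex is the same as projecting the vector with entries $1/n + \lambda \pi(i)$ onto the simplex in Euclidean distance. The projection routine is the standard sort-based one, not a method taken from the notes, and the name `linear_probability_choice` is hypothetical.

```python
def linear_probability_choice(payoffs, lam):
    """Choice probabilities under the assumed quadratic control cost.

    Computes the Euclidean projection of (1/n + lam * payoff_i)_i onto the
    probability simplex with the usual sort-and-threshold procedure.
    """
    n = len(payoffs)
    shifted = [1.0 / n + lam * x for x in payoffs]
    ordered = sorted(shifted, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        candidate = (cumulative - 1.0) / k
        if value - candidate > 0:
            theta = candidate  # the last k passing this test fixes the threshold
    return [max(x - theta, 0.0) for x in shifted]


# Payoffs (0, 2, 8) as in Exercise 11.3, for three values of the parameter.
interior = linear_probability_choice([0.0, 2.0, 8.0], 1.0 / 20.0)
two_active = linear_probability_choice([0.0, 2.0, 8.0], 1.0 / 8.0)
degenerate = linear_probability_choice([0.0, 2.0, 8.0], 1.0 / 4.0)
```

Alternatives that receive probability zero are exactly those whose multiplier $\mu_k$ may be strictly positive in the Kuhn-Tucker conditions.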
Exercise 11.2
(a): Choice probabilities are weakly increasing in payoffs, so the probability of choosing 1 must be positive. If also the probability of choosing 2 is positive, the linearity requirement implies
\[ P_A(1) - P_A(2) = \lambda(\pi(1) - \pi(2)) = 4\lambda. \]
Together with $P_A(1) + P_A(2) = 1$, this gives
\[ P_A(1) = \frac{1 + 4\lambda}{2}, \quad P_A(2) = \frac{1 - 4\lambda}{2}. \tag{62} \]
Obviously, this is possible if and only if both these probabilities are nonnegative, i.e., if and only if $\lambda \le 1/4$. So for $\lambda \in (0, 1/4]$, the choice probabilities in (62) satisfy the linear probability model and we know that for every $\lambda$ there is only one such vector of choice probabilities. For $\lambda > 1/4$, we find
\[ P_A(1) = 1, \quad P_A(2) = 0. \tag{63} \]
(b), (c): Answered in the notes on the role of $\lambda$.
Exercise 11.3
(a): In the logit model with parameter $\lambda > 0$, the choice probability for each alternative $i \in A$ is
\[ P_A(i) = \frac{\exp(\pi(i)/\lambda)}{\sum_{j \in A} \exp(\pi(j)/\lambda)}. \tag{64} \]
Substituting the payoffs, we find:
\[ P_A(1) = \frac{\exp(0/\lambda)}{\exp(0/\lambda) + \exp(2/\lambda) + \exp(8/\lambda)} = \frac{1}{1 + \exp(2/\lambda) + \exp(8/\lambda)}, \]
\[ P_A(2) = \frac{\exp(2/\lambda)}{1 + \exp(2/\lambda) + \exp(8/\lambda)}, \]
\[ P_A(3) = \frac{\exp(8/\lambda)}{1 + \exp(2/\lambda) + \exp(8/\lambda)}. \]
Since the exponential function takes strictly positive values, all choice probabilities lie in $(0, 1)$. The logit model is a special case of Luce's choice model (see (42) and (45)), which satisfies path independence. Hence the logit model satisfies path independence. As $\lambda \to \infty$, the choice probabilities converge to $1/3$. See the motivation in Section 11.2.
(b): Choice probabilities $P_A(i)$ for all alternatives $i \in A$ satisfy the linear probability model with parameter $\lambda > 0$ if the following holds:
\[ \text{if } P_A(i) > 0, \text{ then } P_A(i) - P_A(j) \le \lambda(\pi(i) - \pi(j)) \text{ for all } j \in A. \tag{65} \]
Since choice probabilities are weakly increasing in payoffs and $\pi(3) > \pi(2) > \pi(1)$, there are three cases to consider:
Case 1: $P_A(i) > 0$ for all $i \in A$.
Case 2: $P_A(3), P_A(2) > 0$, $P_A(1) = 0$.
Case 3: $P_A(3) > 0$, $P_A(2) = P_A(1) = 0$, or equivalently, $P_A(3) = 1$.

Using (65), the first case requires:
\[ P_A(3) - P_A(2) = \lambda(\pi(3) - \pi(2)) = 6\lambda, \quad P_A(3) - P_A(1) = \lambda(\pi(3) - \pi(1)) = 8\lambda, \quad P_A(1) + P_A(2) + P_A(3) = 1. \]
So:
\[ P_A(2) = P_A(3) - 6\lambda, \quad P_A(1) = P_A(3) - 8\lambda, \quad 3P_A(3) - 14\lambda = 1. \]
Conclude that
\[ P_A(3) = \frac{1 + 14\lambda}{3}, \quad P_A(2) = \frac{1 + 14\lambda}{3} - 6\lambda = \frac{1 - 4\lambda}{3}, \quad P_A(1) = \frac{1 + 14\lambda}{3} - 8\lambda = \frac{1 - 10\lambda}{3}. \tag{66} \]
To make sure that all probabilities are positive, this requires that $\lambda \in (0, 1/10)$. So the probabilities in (66) satisfy the linear probability model for $\lambda \in (0, 1/10)$.

Using (65), the second case requires:
\[ P_A(3) - P_A(2) = \lambda(\pi(3) - \pi(2)) = 6\lambda, \quad P_A(3) - P_A(1) = P_A(3) \le \lambda(\pi(3) - \pi(1)) = 8\lambda, \quad P_A(1) = 0, \quad P_A(1) + P_A(2) + P_A(3) = 1. \]
Rewrite:
\[ P_A(2) = P_A(3) - 6\lambda, \quad P_A(1) = 0, \quad P_A(3) \le 8\lambda, \quad 2P_A(3) - 6\lambda = 1. \]
Conclude that
\[ P_A(3) = \frac{1 + 6\lambda}{2}, \quad P_A(2) = \frac{1 + 6\lambda}{2} - 6\lambda = \frac{1 - 6\lambda}{2}, \quad P_A(1) = 0. \tag{67} \]
To make sure that $P_A(2)$ and $P_A(3)$ are positive and $P_A(3) = \frac{1 + 6\lambda}{2} \le 8\lambda$, this requires that $\lambda \in [1/10, 1/6)$. Conclude that the choice probabilities in (67) satisfy the linear probability model for $\lambda \in [1/10, 1/6)$.

Using (65), the third case requires:
\[ P_A(3) - P_A(2) = 1 \le \lambda(\pi(3) - \pi(2)) = 6\lambda, \quad P_A(3) - P_A(1) = 1 \le \lambda(\pi(3) - \pi(1)) = 8\lambda. \]
So choice probabilities $P_A(1) = P_A(2) = 0$, $P_A(3) = 1$ satisfy the linear probability model as long as $\lambda \ge 1/6$.

The linear probability model does not satisfy path independence for every $\lambda > 0$. In particular, we will show that for a specific value of $\lambda > 0$,
\[ P_A(1) \ne P_A(\{1, 2\}) P_{\{1,2\}}(1). \]
This means that we have to consider choice probabilities in the smaller problem with only alternatives 1 and 2. Let us assume that both $P_{\{1,2\}}(1)$ and $P_{\{1,2\}}(2)$ are positive. This requires that
\[ P_{\{1,2\}}(2) - P_{\{1,2\}}(1) = \lambda(\pi(2) - \pi(1)) = 2\lambda, \quad P_{\{1,2\}}(1) + P_{\{1,2\}}(2) = 1, \]
so
\[ P_{\{1,2\}}(1) = \frac{1 - 2\lambda}{2}, \quad P_{\{1,2\}}(2) = \frac{1 + 2\lambda}{2}. \]
These choice probabilities satisfy the linear probability model as long as $\lambda \in (0, 1/2)$. Now let us choose $\lambda = 1/20$. Then
\[ P_A(1) = \frac{1 - 10\lambda}{3} = \frac{1}{6}, \]
but
\[ P_A(\{1, 2\}) P_{\{1,2\}}(1) = \left( \frac{1 - 10\lambda}{3} + \frac{1 - 4\lambda}{3} \right) \frac{1 - 2\lambda}{2} = \frac{2 - 14\lambda}{3} \cdot \frac{1 - 2\lambda}{2} = \frac{(1 - 7\lambda)(1 - 2\lambda)}{3} = \frac{39}{200} \ne \frac{1}{6}. \]
As $\lambda \to \infty$, it follows from our earlier analysis that Case 3 is the only feasible one: the decision maker rationally chooses alternative 3 with probability one.
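The contrast between the two models on path independence can be verified with a few lines of arithmetic. This is a sketch only: it assumes the logit form $\exp(\pi(i)/\lambda)$ normalized over the choice set, uses the payoffs $(0, 2, 8)$ of this exercise, and reuses the closed forms derived for Case 1 and for the two-alternative problem. The function names are hypothetical, and the value $\lambda = 4$ in the logit check is arbitrary.

```python
import math
from fractions import Fraction


def logit_probabilities(payoffs, lam):
    """Logit choice probabilities, proportional to exp(payoff / lam)."""
    top = max(payoffs)  # subtracting the maximum cancels in the ratio and avoids overflow
    weights = [math.exp((x - top) / lam) for x in payoffs]
    total = sum(weights)
    return [w / total for w in weights]


def linear_model_path_check(lam):
    """Return (P_A(1), P_A({1,2}) * P_{1,2}(1)) for 0 < lam < 1/10."""
    p1 = (1 - 10 * lam) / 3          # P_A(1) from Case 1
    p2 = (1 - 4 * lam) / 3           # P_A(2) from Case 1
    p1_given_12 = (1 - 2 * lam) / 2  # P_{1,2}(1) in the two-alternative problem
    return p1, (p1 + p2) * p1_given_12


# Exact arithmetic for the linear probability model at lam = 1/20.
direct, via_subset = linear_model_path_check(Fraction(1, 20))

# Floating-point check that the logit model passes the same test.
p_all = logit_probabilities([0, 2, 8], 4.0)
p_sub = logit_probabilities([0, 2], 4.0)
logit_gap = abs(p_all[0] - (p_all[0] + p_all[1]) * p_sub[0])
```

Using `Fraction` keeps the linear-model comparison exact, so the mismatch between the two values cannot be attributed to rounding.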
Exercise 11.4
Suppose $\pi(i) > \pi(j)$, but $p_i < p_j$. Exchange the probabilities assigned to the $i$-th and $j$-th alternative to obtain a vector $p'$. By construction, $\sum_{i=1}^{n} p'_i \pi(i) > \sum_{i=1}^{n} p_i \pi(i)$, and by symmetry, the control cost term is unaffected, contradicting that $p$ solves the maximization problem.
