Tomasini 1971

Zeitschrift ftir National/Skonomie 31 (1971), 1--12
9 by Springer-Verlag 1971
Economic Aspects of Information

By
Luigi M. Tomasini*, Torino, Italy
(Received April 15, 1971)
1.1. Let us start by giving a definition of expectations. In

Georgescu-Roegen's [3] words "expectation is the state of the
mind of a given individual with respect to an assertion, a coming
event, or other matter on which absolute knowledge does not neces-
sarily exist". In other words, we are in a situation where the indi-
vidual is in a state such that the consequences of a given action are
partially known. Drinking my coke when I am thirsty can give me
a pleasant feeling of satisfaction or can cause me problems (stomach-
ache, for example). In this way, whenever I am offered a coke,
I will have to face the problem of a pleasant feeling or a stomach-
ache. This last proposition suggests the possibility of comparing
expectations and therefore of measuring them.
M a n y economists continue to use the distinction between risky
and uncertain expectations, the distinction being based on whether
or not we have a past "history" which enables us to establish or
not frequencies of occurrence of given events. According to this
difference, well established since K n i g h t ' s work, it is possible to
assigne mathematical probabilities to risk but not to uncertainty. In
the case of uncertainty, the decision maker should choose a "prin-
cipium" which reflects his personal "degree of belief" on the occur-
rence of a given event.
* My indebtness to Professor Jacob M a r s c h a k's stimulating work

on the subject will be clear to anyone reading the paper. I wish to thank
Bruno de F i n e t t i and Herbert S c a r f who, at different stages, contri-
buted to improve a first draft of the paper. Responsibility is, of course,
entirely mine.
The research described in this paper was carried out under grants
from the National Science Foundation and from the Ford Foundation,
while I was associated with the Cowles Foundation for Research in
Economics at Yale.
Zeitschr. f. NatlonalSkonomie, 31. Bd., H e ~ t-2 1
2 L.M. Tomasini:
Several criteria have been proposed in the literature (cfr. O z ga

[12]) such as: (i) the safety margin (Feller, 1948); (ii) the index of
pessimism and optimism ( H u r w i c z , 1951); (iii) the minimum sub-
jective loss ( S a v a g e , 1951); (iv) the minimum chance of disaster
(Roy, 1952 and 1956). It seems worth emphasizing that once
a principium is chosen, any pay-off matrix will be the same in form
as that under risk.
All these criteria have been amply discussed and criticised in
the literature. The problem is not to find wether a given princi-
plum is better than any other; it deals with the very concept of
probability and hence on the irrelevance of the distinction between
risky and uncertain situations.
1.2. One of the many achievements of the subjective probability
theory ( R a m s e y , de F i n e t t i , S a v a g e ) is the possibility of for-
mulating a more general theory in which risk is a special case of
uncertainty.
To understand the nature of this "revolution" consider, for
example, two distinct events. These, for many circumstances, are
different. However, it is possible that an individual thinks they are
equally probable, in the sense that he considers irrelevant the differ-
ences between the two events; i.e., they do not affect his judge-
ment. If an individual is sufficiently interested in a given event, he
will associate a probability to such an event trying to evaluate it
with due care. Therefore, any event is associated with a probability
that reflects the decision makers behavior. This, obviously, depends
upon the degree of ignorance of the individual and it is, therefore,
related to his status of information. Probability is not a thing that
it is possible to know or to ignore in itself, it exists because it is
useful to express what the decision maker chooses when not fully
informed 1. (cfr. de F i n e t t i [2]).
In order to assess the probability of a given event the decision
maker has: (i) a decision schema and, (ii) a consistent condition.
The forecast of an event x will be a value ~ which could be chosen
randomly as an estimate of x. The first criterion says that the deci-
sion maker should accept a wager proportional to the random
Statements such as "I do not know" or "for me there is not such

a thing as probability" are meaningless and cannot be considered as
answers. " . . . To immagine the possibility of a higher degree of ignorance
so to justify the refusal to answer should be as to think that in statistics
it is meaningful to indicate, beyond individuals whose sex is unknown,
also those for whom we do not know whether the sex is unknown or
not." Cfr. de F i n e t t i [2], p. 104.
Economic Aspects of Information 3
value x - - ~ chosen by a competitor; so that it is not convenient for

him to move away from that value which makes him indifferent
between the two wagers. The second criterion tells us that the
individual will be penalized by a value which is positively propor-
tional to ( x - - ~ ) 2 which increases as he moves away from that
value (cfr. de F i n e t t i [2], p. 106).
1.3. In our analysis we will accept entirely the argument sup-
ported by the subjectivist school, i.e. that probability is given by
a numerical coefficient which measures the subjective degree of
belief that a given event will happen. The numerical coefficient is
built with the aid of the betting ratio w that any subject is will-
ing to accept on a given uncertain event. More precisely, putting
p = probability, we have, w = p/(l--p). Probabality will be de-
fined as p = w/(w + 1).
In this formulation, different individuals will have different
betting ratios, i.e. they will attribute different coefficients to the
probability of a given event. In real life the betting ratio will be
the result of the quantity of data (information drawn from past
experience and the ability to assign to the latter a weight based
on an intuitive judgement), available to the decision maker.
The difference between risky expectations and uncertain expec-
tations is, therefore, unnecessary as it is always possible to obtain
a betting ratio from any individual for any outcome. This implies
that all expectations are measurable and that no meaningful distinc-
tion can in real life be made between risk and uncertainty. From
now on when we talk about uncertainty we will consider this term
to include the concept of risk as well.
1.4. An information system is defined, fi la M a r s c h a k, as a set
of potential messages which the decision maker will use by respond-
ing to each message with some action. An action is considered
optimal with respect to given information, if the result of the action
could not be, on the average, improved by choosing any other
available action.
Information can be considered as a good for which there exist
demand and supply prices. The'former is given by the highest price
an individual is willing to pay for it; the latter by the lowest the
seller is willing to offer it. The demand price depends, obviously,
on the utility the buyer attaches to those messages he will obtain
from a given information source. The value of information can,
then, be defined as the average amount of utility (money) earned
with the help of that information. We will consider later the value
of information; here we will briefly discuss the cost of information.
4 L.M. Tomasini:
Information has a price which for the buyer represents a cost.

This has to be considered by the decision maker before formulating
a strategy. If, for example, I do not have a television to look at
the weather forecast and I have to buy one in order to decide if
tomorrow I have to take or not my umbrella, even if the gross
value of information given by the television forecast is higher than
my looking at the sky, the net value will, in general, be less. It is
this latter that the decision maker should consider before formulat-
ing decisions.
Another cost is that which is associated to the decision making
process or, in M a r s c h a k terminology, the cost of thinking. This
last is given by the cost, in terms of efforts, that the decision maker
faces selecting an action which is optimal given the information
received. The net value of an action is given by subtracting from
the gross value of information the cost of obtaining it plus the cost
of decision making.
2.1. In order to have a better understanding of the problem
sketched in the previous sections, it is useful to formalize the con-
cepts we have discussed. Consider:
A = {a} a set of feasible actions;
Z = {z) a set of states of the world;
X - - {x} a partition set of Z of payoff-relevant events 2, i. e. of
those states of the world which form equivalence sets;
(x, a ) - ) - R a payoff function 3 (an utility function which respects
the axioms of the theory of choices);
p (x) a subjective probability function such that p (x)_> 0,
all x; p (x U x') -----p (x) § p (x') if x f"l x" are disjoint;
p (z) = 1;
H = {h} a partition set of Z of available information.
In the case of certainty the decision maker knows {x}. By the
rationality assumption he will then choose an action a* e A so as to
max ~ (x,a) ->~ (x,a*), all a~A. (2.1)
a~A
2 Consider, for example, the payoff matrix 0 (z, a):

Zl Z~ g:3
al 1 2 2
a2 2 1 1
Obviously the distinction between states z2 and za is due to details not
relevant to the decision, z~ and za are payoff equivalent.
3 It is to be pointed out that ~ (x,a) ----0 (z,a) if z e x c X.
Definition 1. A decision function or strategy is a mapping d

that associates each event x r X with some action a e A, i.e. d:
X-+A.
Definition 2. An optimal decision function d': associates each
x e X with an optimal a* e A.
In the case of uncertainty, the rational decision maker will
choose an action a * e A only if
X p (x) ~ (x, a*) >- X p (x) ~ (x, a), all a ~ A,

xEX x~X
and will behave so as to
max 2' p (x) ~ (x, a). (2.2)

aEA x ~ X
2.2. If we rule out the case of complete information which is, of

course, indeed an extreme situation, we are left with the problem
of choosing an action and of basing such a choice on the know-
ledge received from some "informative source".
Consider the partition H of Z which defines the information
set. The decision maker, because he is ignorant of which x r X will
occur, bases his decision on the knowledge of which h e H will
occur. Of course, in the case of complete information H and Z
coincide.
Definition 3. The information set H is noiseless with respect to
the set of events X if X is finer than H (X f H) 4.
In such a case there exists a single-valued function W: X - ~ H;
i. e. each event x c X is associated with exactly one message h E H.
Any information set H lies somewhere between the set of com-
plete information (Hm~x)and that of zero information (Hmin)-More
formally,
X=Hm~xfHfHmin
Consider now a noisy information set. In such a case we do
not have a single-valued function k~. It is however possible to
associate each value x r X with a probability distribution on H. In
such a case the random function ~V takes x e X into the matrix of
conditional probabilities p (h Ix) of h a H, given x.
It is obvious that the noiseless case is a special case of the noisy
one. In the case of complete information, for example, ~v is the
identity matrix.
4 Given two partitions H and H' of a set Z, if H is a sub-partition

of H' we say that H is finer than H' or HfH'.
6 L.M. Tomasini:
2.3. The probability function p (x) depends on the information

set available to the decision maker. His problem [(2.2)] can, there-
fore, be restated as
max 2 ~ p (x [h 0) Q (x, a) (2.3)
a~A x ~ X
where h ~ e H indicates a fixed value of the message h e H.

Given the set H, a decision maker who responds to each obser-
vation h e H with an optimal action, will obtain, averaging over
all such observations, a higher (or, at least, not lower) expected
utility than by using any other way of responding to those obser-
vations by actions. M a r s c h a k calls this (maximum) expected
utility the gross value o[ the in[ormation set H, i.e. U (H). There-
fore, we have,
U (H) =h:up (h) maxa~x~xZp (xlh) e (x,a), (2.4)
where U (H) could be written as U(llv) if we neglect observations

costs and consider all the messages with the same ~.
Definition 1'. A decision function d is a mapping which asso-
ciates each value h e H with some action a e A, i. e. d: H - + A.
Definition 2". An optimal decision function d* is a mapping
which associates each observation h e H with an optimal action
a* eA.
The problem for the decision maker is to find an optimal deci-
sion rule d*, given the information set H. If different information
sets (H, H', H " , . . . ) are available, we have to compare - - neglecting
for the moment the costs of these different information sets - - the
values U(H), U(H') . . . More formally, we state that H is more
informative than H' with respect to X, i.e. ( H > H' IX) if
U (H; ~Xx~, px) > U (H'; Qx• Pc), (25)

for all Q defined in the cartesian product X • A, all probability
functions p on X.
M a r s c h a k [4] proves that, in the case of noiseless information
set the Eq. (2.5) coincides with the concept of "finer than". More
formally, let X f H, X f H', then
H >H'IX~HfH'. (2.6)
2.4. Up to this point the decision maker's problem has been that
of choosing an information set H with the greatest gross value, i. e.
he will act so as to maximize the expected payoff simultaneously

with respect to X and to the decision function ds. However, not
all information sets are feasible. In fact, there will be some which
will not be available to the decision maker. The same applies to
the decision functions. The unavailability depends, in general, on
the larger costs which the decision maker faces.
An important question concerns the measurements of the obser-
vation costs and the payoff (gross value of information). The
problem is somewhat complicated by the fact that, in reality, we
are interested in measuring the net value of information.
An alternative is to measure gross payoff, observation costs
and decision costs, in the same unit, for example, util. The net
value of information V could, therefore, be
V(H;o, px, t~,~,)=U(H;~,px)-k(H)-~(d~), (2.7)

where k represents the total cost function, y is a function from the
set du of feasible decision functions on H to the set R of utilities.
Sometimes such a decomposition of the set value is not possible;
in such a case V must be evaluated directly.
2.5. In order to formulate an economic theory of information,
we must investigate the relationship between information value-cost
and the concept of quantity of information. The quantity of infor-
mation does not depend on the needs of any decision maker who
buys information. Therefore, it is, in general, different from the
value of information.
Consider, for example, a decision maker who is informed about
two different future events: sunshine tomorrow and a crisis in the
stock market such that he will lose all his wealth. In such a case,
although the amount of information received by him (in the form
yes/no for the two events) is the same, the "value" of it will be,
assuming his rationality, quite different. At the same time, it is
conceivable that the price he is willing to pay to receive the infor-
mation regarding the two events is different.
The simple consideration should make clear that what is meant
by the "value" of information in communication theory (S h a n n o n,
1948) is in fact the cost of transmitting a message. As it is well
known, such a theory studies, considering H = X , the optimal
choice of a communication function F: X--~ H, usually a stochastic
function,
/ ' = (yxh) ; yxh = p (h Ix), (2.8)
where x e X is the message sent and h e H the one received.

8 L.M. Tomasini:
The payoff function considered is
0, if h = x
(2.9)
(x,h)= -1, ifh4=x,
that is, every error in identifying the message sent has the same
negative payoff (penalty).
The problem is to choose among the set (iV') of all possible
communication systems a / ' that minimizes the probability of error
for a fixed cost o f / ~ and obtain for each cost, the most economical
communication system.
2.6. The communication f u n c t i o n / ' depends on the channel and
on the code. The event x e X is encoded into a message-input; the
channel is characterized by a transmission function ~ so that
a message-output is received and decoded at the other end of the
channel. The problem is then to select, given known costs, the
efficient pair (r, k).
The amount of messages can be carried by the channel in
a given unit of time is called the capacity of the channel. Assuming
there is no transmission noise, we can increase the capacity of the
channel by increasing the length of a "block", i. e., instead of trans-
mitting a message every unit of time we transmit one (in the same
unit of time) about a sequence (block) of two events. As the length
of the block increases, the needed channel capacity decreases and
converges towards the function Px
px = - X p (x) log~ p (x), (2.10)

xGX
where, for convenience, r = 2. Usually, Px is denoted by H (X).

If the event xi e X is the only possible, p~ = 1 and all the other
pj = 0 (i 4 = j). In such a case Px = 0, i. e. it gets its minimum value.
For p~ = 1/n, for all i = 1 , . . . , n, PXm~x= logan, i.e. it gets maxi-
mum value. In other words, if we have n potential messages with
equal probability, the (2.10) can be written as
H (X) = n. (l/n) [log2 (l/n)] = log2 n. (2.11)

In general, the larger the value of logn, the more precise the
information source; the more precise the latter, the more expensive
it will be. The increase in costs with an increase of exactness is
due to the fact that higher exactness means a larger number of
symbols required to distinguish messages. The required minimum
number of symbols increases in proportion to the log of the num-

ber of potential messages, so that the cost of information is pro-
portional to the quantity of information.
The economic value of information depends on the decision
maker's payoff function. In default of its knowledge, we can hardly
assess its value.
If we partition the set of all states of the world, we obtain
different information structures. Their ability to produce different
expected gains depends on the payoff function. It is possible, hence,
to rank different information structures according to their value.
Of course, the ranking is subjective as it depends on the decision
maker's information utility.
To sum up this section, we can say that in case of uncertainty
the decision maker chooses an action and formulates such a choice
on the available information. His problem is that of selecting simul-
taneously, an information set, a communication system, and a de-
cision rule. This triple is optimal if there exists no other triple
which gives, on the average, a better result than that obtainable
by the. one considered. It is important to note that if the choice
of each component of the triple is made independently, the decision
maker may suffer losses.
3.1. In order to have a better understanding of the decision

maker's problem in the face of uncertainty, we have to specify the
system in which he operates.
Two variables are of fundamental importance for the decision
maker: resources and information. Until recently, economic theory
has dealt with the problem of allocation of resources among com-
peting uses, assuming that information is equally distributed among
all the individuals of an economic system. In fact, one usually makes
an even stronger assumption, i.e. that all individuals have perfect
information.
For more realistic models of the economic system we recognize
that such an assumption is rather restrictive. Moreover, it is our
contention that information plays a more important role than the
one economists are willing to attribute to it. In particular, the distri-
bution of the stock of information at a given time determines the
flow of information from the environment to the system (whether
it is an economy, a firm or an individual) thereby, increasing the
stock of information. The resulting process would be cumulative
and have a tendency to maintain and eventually, increase differen-
ces in the stock of information among the components of the
system as well as differences in the flows of information.
10 L.M. Tomasini:
A decision maker can, in general, be described by an ordered

n. tuple of n economic variables or characteristics (in our case
n = 2, resources and information). If we define a function for each
n. tuple, we can identify the number of decision makers who pos-
sess, at a given time, the same amount of the two characteristics.
The state of the system can be, therefore, represented by n such
functions.
The state is not stable. In fact, if we define an ordered pair of
n. tuples, the first representing the initial state and the second the
subsequent state, we can define a transition probability function
which tells us what the probability is that a given person in a given
state will move to a different state in the next period. Of course,
in a deterministic process (transition probability = 1), given the
initial state, the process is completely determined.
3.2. As we have seen, all decision makers can be classified
according to the amount of information and resources they possess
at a given time. For a given state of information, we can find
a number N of individuals who have the same amount of infor-
mation.
Consider now the entropy of a system H (X) which measures
the rate at which information is transmitted by the environment to
the system. The stock of information possessed by the i-th deci-
sion maker at the stage s is measured by
H (X max) - H i ( X s ) . (3.1)
Of course, if he knows H ( X m a x ) then the stock of information is

maximal. If H(Xmax)-----H~(Xs), the stock of information is zero.
In a given economy neither the resources nor the information is
equally distributed among the individuals. Each decision maker has
the possibility to move from one state of information to another;
the same is true for resources. The change depends on losses of
information (due to forgetting) and on losses of resources (choice
of the wrong decisions). It is also conceivable that the choice among
all the feasible actions is such that the decision maker increases at
the same time the resources and the stock of information.
The differences in the stock of information and of resources
explain the differences in behavior which can be found among the
decision makers. If all of them have the same stock of information
and different quantity of resources, the optimality of an action will
be different for different decision makers who face the same event.
In the literature, such behavior is associated with the theory of
expected utility and the distinction between risk-takers and risk-

averters. The same sort of reasoning applies in case of inequality
in the stock of information and equality of resources among indi-
viduals.
3.3. A decision maker operates by formulating decisions. Let us
assume that he has to formulate some decisions at time zero with
a zero amount of information. Although such an assumption is
open to discussion, as it is clear if one considers the Laplace prin-
ciple, the decision maker will choose an action which, assuming
there are n feasible actions, has probability 1/n of being optimal:
At time one a new decision has to be taken. This time however,
the previous information (experience, knowledge) will determine an
increase in the probability of choosing an optimal action. After
t trials, the decision maker will reach some kind of statistical equi-
librium with his environment. The passage of time being associated
with the increase in information.
The amount of information which flows from the environment
to the decision maker can be increased or decreased according to
the quantity of resources which any decision maker allocates to
its aquisition. This proposition has its "dual". In fact, it is still true
that the increase in the amount of resources depends on the choice
of a decision, whose probability of being optimal increases with the
increase in information.
Consider two decision makers who start with the same amount
of resources but with different stocks of information. Let us further
assume that the rate of growth in resources is a given constant
common to both and that they want to reach a given target (a given
stock of resources and information) which we will consider as
a steady state.
The length of the path to this target will be different. In fact,
it will be a function of the initial stock of information and of the
flow of information which the decision makers receive with the pas-
sage of time. This implies that the one who starts with a larger stock
of information will reach, coeteris paribus, the steady state in a shor-
ter period of time.
The cost, in terms of loss of resources, to reach the steady state
will be greater the less information the decision maker possesses.
Moreover, as we implicitly assumed that both face the same
sequence of events in reaching the target, the two paths to the
steady state cannot, in general, cross each other. This rules out the
possibility of overtaking, the hypothesis which is the basis of the
modern theory of growth.
12 L.M. Tomasini: Economic Aspects of Information
References
[1] N. A b r a m s o n , Information Theory and Coding, New York:
McGraw-Hill, 1963.
[2] B. de E i n e t t i , Teoria delle ProbabilitY, 2 voll., Torino: Einaudi,
1970.
[3] N. G e o r g e s c u- R o e g e n, The Nature of Expectation and Un-
certainty, in: J. M. B o w m a n (ed.), Expectations, Uncertainty and Busi-
ness Behavior, New York; reprinted in: N. G e o r g e s c u - R o e g e n , Ana-
lytical Economics, Harvard University Press, 1966.
[4] J. M a r s c h a k , Remarks on the Economics of Information, in:
Contributions to Scientific Research in Management, Los Angeles: Western
Data Processing Center, University of California, 1959.
[5] J. M a r s c h a k , The Payoff-Relevant Description of States and
Acts, Econometrica 31 (1963), 719--726.
[6] J. M a r s c h a k , Problems in Information Economics, in: C.P.
B o n i n i et al. (eds.), Management Controls: New Directions in Basic
Research, New York: McGraw-Hill, 1964.
[7] J. M a r s c h a k , Economics of Planning and the Cost of Think-
ing, Social Research 33 (1966), 151--159.
[8] J. M a r s c h a k , Decision Making: Economic Aspects, in: D.L.
S ills (ed.), International Encyclopedia of the Social Sciences 4, New York:
Crowell Collier and McMillan, 1968.
[9] J. M a r s c h a k , Economics of Inquiring, Communicating, Decid-
ing, The American Economic Review, Papers and Proceedings 68 (1968),
1--18.
[10] J. M a r s c h a k and K. M i y a s a w a, Economic Comparability of
Information Systems, International Economic Review 9 (1968), 137--174.
[11] E. R. M u r p h y , Adaptive Processes in Economic Systems, New
York: Academic Press, 1965.
[12] S. A. O zga, Expectations in Economic Theory, London: Wein-
denfeld and Nicolson, 1965.
[13] L. J. S a v a g e, The Foundations of Statistics, New York: J. Wiley
and Sons, 1954.
[14] H. A. Simon, Models of Man, New York: J. Wiley and Sons,
1957.
Address of author: Prof. Dr. Luigi M. T o m a s i n i , Fondazione Luigi

Einaudi, Palazzo d'Azeglio, via Principe Amedeo, 34, Torino, Italy.

Tomasini 1971

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Tomasini 1971

Uploaded by

Copyright:

Available Formats

Zeitschrift ftir National/Skonomie 31 (1971), 1--12

Economic Aspects of Information

1.1. Let us start by giving a definition of expectations. In

* My indebtness to Professor Jacob M a r s c h a k's stimulating work

Several criteria have been proposed in the literature (cfr. O z ga

Statements such as "I do not know" or "for me there is not such

value x - - ~ chosen by a competitor; so that it is not convenient for

Information has a price which for the buyer represents a cost.

2 Consider, for example, the payoff matrix 0 (z, a):

Definition 1. A decision function or strategy is a mapping d

X p (x) ~ (x, a*) >- X p (x) ~ (x, a), all a ~ A,

and will behave so as to

max 2' p (x) ~ (x, a). (2.2)

2.2. If we rule out the case of complete information which is, of

4 Given two partitions H and H' of a set Z, if H is a sub-partition

2.3. The probability function p (x) depends on the information

where h ~ e H indicates a fixed value of the message h e H.

U (H) =h:up (h) maxa~x~xZp (xlh) e (x,a), (2.4)

where U (H) could be written as U(llv) if we neglect observations

U (H; ~Xx~, px) > U (H'; Qx• Pc), (25)

he will act so as to maximize the expected payoff simultaneously

V(H;o, px, t~,~,)=U(H;~,px)-k(H)-~(d~), (2.7)

where x e X is the message sent and h e H the one received.

The payoff function considered is

px = - X p (x) log~ p (x), (2.10)

where, for convenience, r = 2. Usually, Px is denoted by H (X).

H (X) = n. (l/n) [log2 (l/n)] = log2 n. (2.11)

number of symbols increases in proportion to the log of the num-

3.1. In order to have a better understanding of the decision

A decision maker can, in general, be described by an ordered

Of course, if he knows H ( X m a x ) then the stock of information is

expected utility and the distinction between risk-takers and risk-

Address of author: Prof. Dr. Luigi M. T o m a s i n i , Fondazione Luigi

You might also like