Chapter 5 - Game Theory: History

AMA484 Decision Analysis 1
Chapter 5 - Game Theory
History
• The study of game theory dates back to 1944, when John von Neumann and Oskar
Morgenstern published their classic book, Theory of Games and Economic Behavior. Since
then, game theory has been used by army generals to plan war strategies, by union
negotiators and managers in collective bargaining, and by businesses of all types to
determine the best strategies given a competitive business environment.
• Game theory continues to be important today. In 1994, John Harsanyi, John Nash, and
Reinhard Selten jointly received the Nobel Prize in Economics from the Royal Swedish
Academy of Sciences. In their classic work, these individuals developed the notion of
noncooperative game theory. Building on the work of von Neumann and Morgenstern, Nash
developed the concepts of Nash equilibrium and the Nash bargaining problem, which are
cornerstones of modern game theory.
Reference: H.A. Taha, Operations Research: An Introduction, 8th Edition, Prentice Hall, 2007.
What Is a Game?
In a game, there are three elements,
• alternation of moves, which can be either personal or random (chance) moves,
• a possible lack of knowledge, and
• a payoff function.
Suppose we have
(1) a topological tree Γ with a distinguished vertex A (called the starting point); and
(2) a function, called the payoff function, which assigns an n-vector (p1 , p2 , · · · , pn )
to each terminal vertex of Γ for an n-player game.
Example
• Player 1 chooses heads (H) or tails (T );
• Player 2, not knowing player 1’s choice, chooses heads or tails;
• if the two choose alike, then Player 2 wins a cent from Player 1; otherwise, Player
1 wins a cent from Player 2;
• in the tree, vectors at the terminal vertices represent the payoff function;
• numbers near the other vertices denote the player to whom the move corresponds.
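[Figure: game tree for matching pennies. Player 1 moves first (H or T), then Player 2 moves (H or T). Terminal payoffs (p1 , p2 ): H then H gives (−1, 1); H then T gives (1, −1); T then H gives (1, −1); T then T gives (−1, 1).]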
Zero-sum Game
The payoff function is
(p1 , p2 ) ∈ {(−1, 1), (1, −1)}.
A game Γ is said to be zero-sum if, at each terminal vertex, the payoff function
(p1 , · · · , pn ) satisfies
\[ \sum_{i=1}^{n} p_i = 0, \]
where n is the number of players (an n-person game).
Each terminal vertex corresponds to one combination of strategies chosen by all the players.
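The zero-sum condition can be checked mechanically. The following is only a minimal sketch: the function name is_zero_sum and the representation of the terminal payoffs as a list of tuples are assumptions made for illustration, not part of the notes.

```python
def is_zero_sum(terminal_payoffs):
    """Return True if the payoff vector at every terminal vertex sums to zero."""
    return all(sum(p) == 0 for p in terminal_payoffs)

# Terminal payoffs (p1, p2) of the matching-pennies tree: HH, HT, TH, TT.
matching_pennies = [(-1, 1), (1, -1), (1, -1), (-1, 1)]
print(is_zero_sum(matching_pennies))  # True
```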
Normal Form
Here we consider a game with two players only.
Normal form:
The normal form is a matrix with
no. of rows = no. of Player I’s strategies;
no. of columns = no. of Player II’s strategies;
(i, j)th element aij = the expected payoff.
That is, aij is the amount which Player I receives from Player II.
Let
A = [aij ],
where A is called a payoff matrix.
Example
In the game of matching pennies, each player has two strategies, heads and tails. The
normal form of the game is
\[ A = \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}. \]
This is a zero-sum game, since one player’s loss is the other player’s gain.
Example
• Two countries, I and II, are at war.
• Country II has two airfields and can defend one but not both:
βj = defend airfield j, j = 1, 2.
• Country I can attack only one of the airfields:
θi = attack airfield i, i = 1, 2.
• If I attacks a defended one, it withdraws immediately with no loss.
• If I attacks an undefended airfield, the airfield will be destroyed.
• Airfield 1 has value 1 and airfield 2 has value 2. Then the payoff matrix (the value
destroyed by Country I) is
            β1    β2
      θ1     0     1
      θ2     2     0
This is a zero-sum game, provided the values to I of the destruction of the airfields are
the same as the values of the airfields to II.
Example
A manufacturer is playing a game against Nature (or call it fate). Each of the
players has the choice of two moves: the manufacturer chooses between actions
β1 (expand his plant now) and β2 (delay expansion), and Nature controls the choice
between θ1 (the economy remains good) and θ2 (recession). Depending on the choice of
moves, the payoffs are shown in the following table:
                      Player II (Manufacturer)
                         β1            β2
Player I     θ1      L(β1 , θ1 )   L(β2 , θ1 )
(Nature)     θ2      L(β1 , θ2 )   L(β2 , θ2 )
The amounts L(β1 , θ1 ), · · · , L(β2 , θ2 ) are referred to as the values of the loss function that
characterizes the particular game.
In other words, L(βj , θi ) is the loss of Player II (the amount he has to pay Player I)
when he chooses alternative βj and Player I chooses alternative θi .
Although it does not really matter, we shall assume here that these amounts are in
dollars. In actual practice, they can also be expressed in terms of any goods or services,
in units of utility (desirability or satisfaction), and even in terms of life or death.
Let us also assume that each player must choose his strategy without knowing what
his opponent is going to do and that once a player has made his choice it cannot be
changed.
The objectives of the theory of games are to determine optimum strategies (i.e.,
strategies that are most profitable to the respective players) and the corresponding
payoff, which is called the value of the game.
Example 1
Given the 2 × 2 zero-sum two-person game
Player II
β1 β2
Player I θ1 7 −4
θ2 8 10
Find the optimum strategies of Players I and II and the value of the game.
SOLUTION.
For Player I, Strategy θ2 yields more than Strategy θ1 regardless of the choice made
by Player II (8 > 7 and 10 > −4).
In a situation like this we say that Strategy θ2 dominates Strategy θ1 , and a dominated
strategy can be discarded.
If we do this here, we find that Player I’s optimum strategy is Strategy θ2 , the only one
left, and that Player II’s optimum strategy is Strategy β1 , since a loss of $8 is obviously
preferable to a loss of $10.
Also, the value of the game, the payoff corresponding to Strategies θ2 and β1 , is $8.
Example 2
Given the 2 × 3 zero-sum two-person game
                    Player II
                 β1    β2    β3
Player I    θ1   −4     1     7
            θ2    4     3     5
Find the optimum strategies of Players I and II and the value of the game.
SOLUTION.
In this game, Player I does not have a dominant strategy, but the third strategy of Player II is
dominated by each of the other two strategies. (Why?)
Clearly, for Player II, a profit of $4 or a loss of $1 is preferable to a loss of $7, and a
loss of $4 or a loss of $3 is preferable to a loss of $5. Thus, we can discard the third
column of the payoff matrix and study the 2 × 2 game
                    Player II
                 β1    β2
Player I    θ1   −4     1
            θ2    4     3
where Strategy θ2 of Player I is now the dominant strategy. Thus, the optimum choice of Player
I is Strategy θ2 , the optimum choice of Player II is Strategy β2 , and the value of the
game is $3.
Dominance may not exist at all in a two-person game:
                    Player II
                 β1    β2    β3
Player I    θ1   −1     6    −2
            θ2    2     4     6
            θ3   −2    −6    12
First, if Player II chooses Strategy β1 , β2 , or β3 , the worst that can happen is that he loses
$2, $6, or $12, respectively. Thus, he could minimize the maximum loss by choosing Strategy β1 .
For Player I, if she chooses Strategy θ1 , θ2 , or θ3 , the worst that can happen is that she
loses $2, wins $2, or loses $6, respectively. Thus, she could minimize the maximum loss
(that is, maximize her minimum gain) by choosing Strategy θ2 .
Strategies β1 and θ2 are called minimax strategies.
By choosing Strategy β1 , Player II makes sure that his opponent can win at most $2,
and by choosing Strategy θ2 , Player I makes sure that she will actually win this amount.
In our example, even if Player I (II) announced publicly that she (he) will choose Strategy
θ2 (β1 ), it would still be best for Player II (I) to choose Strategy β1 (θ2 ).
Saddle point
A strategy pair (i, j) is in equilibrium (a saddle point) if the element aij is
both the largest in its column and the smallest in its row (LCSR, in short).
The value aij is then called the optimal payoff.
For example,
(i) The game matrix
\[ \begin{pmatrix} 5 & 1 & 3 \\ 3 & 2 & 4 \\ -3 & 0 & 1 \end{pmatrix} \]
has a saddle point in the second row and second column, a22 = 2.
(ii) The game matrix below does not have a saddle point:
\[ \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}. \]
The saddle point may not be unique, but the optimal payoff is unique. Consider
\[ \begin{pmatrix} 4 & 3 & 5 \\ 0 & 1 & 0 \\ 6 & 3 & 9 \end{pmatrix}. \]
Both (1, 2) and (3, 2) are saddle points with the optimal payoff 3. How do we find a saddle
point, supposing that one exists?
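One direct way to answer this question is to test every entry against the LCSR condition. The sketch below does exactly that; the function name saddle_points and the 1-based (row, column, value) output format are illustrative choices, not taken from the notes.

```python
def saddle_points(A):
    """Return all (i, j, a_ij), 1-indexed, where a_ij is the largest entry
    in its column and the smallest entry in its row (LCSR)."""
    points = []
    for i, row in enumerate(A):
        for j, a in enumerate(row):
            column = [r[j] for r in A]
            if a == min(row) and a == max(column):
                points.append((i + 1, j + 1, a))
    return points

print(saddle_points([[5, 1, 3], [3, 2, 4], [-3, 0, 1]]))  # [(2, 2, 2)]
print(saddle_points([[-1, 1], [1, -1]]))                  # []
print(saddle_points([[4, 3, 5], [0, 1, 0], [6, 3, 9]]))   # [(1, 2, 3), (3, 2, 3)]
```

The three calls reproduce the three matrices discussed above: a unique saddle point, none, and two saddle points with the same payoff.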
Gain Floor
Let us consider the following game without a saddle point, in which the rows are
Player I’s strategies:
\[ \begin{array}{l} \text{Strategy } \theta_1 \to \\ \text{Strategy } \theta_2 \to \end{array} \begin{pmatrix} 4 & 2 \\ 1 & 3 \end{pmatrix}. \]
• Suppose Player II is not only unpredictable but omniscient (he will guess correctly
whatever Player I decides).
• Then Player I wins at least 2 units with Strategy θ1 , and at least 1 unit with Strategy θ2 .
• This certain win of at least two units is Player I’s gain-floor, and we shall
denote it by vI :
\[ v_I = \max_i \min_j a_{ij}. \]
That is, vI = 2.
Loss Ceiling
\[ \begin{pmatrix} 4 & 2 \\ 1 & 3 \end{pmatrix}. \]
• Player II loses at most 4 units with Strategy β1 ;
• Player II loses at most 3 units with Strategy β2 ;
• the minimum of these two values is defined to be the loss-ceiling of Player II,
which is 3 units.
This loss-ceiling will be denoted by vII :
\[ v_{II} = \min_j \max_i a_{ij} = 3. \]
Minimax Inequality
It can easily be shown that
\[ v_I \le v_{II}, \]
or equivalently
\[ \max_i \min_j a_{ij} \le \min_j \max_i a_{ij}. \]
This inequality states that:
Player I should not win less than vI ; Player II should not lose more than vII .
If equality holds, we have a saddle point; if not, we have a game without a saddle
point.
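The gain-floor and loss-ceiling are simple to compute for any payoff matrix. The following sketch (function names gain_floor and loss_ceiling are illustrative) evaluates both for the 2 × 2 game above and for a matrix with a saddle point, where the two values coincide.

```python
def gain_floor(A):
    """v_I = max over rows i of (min over columns j of a_ij)."""
    return max(min(row) for row in A)

def loss_ceiling(A):
    """v_II = min over columns j of (max over rows i of a_ij)."""
    return min(max(col) for col in zip(*A))

A = [[4, 2], [1, 3]]                       # no saddle point
print(gain_floor(A), loss_ceiling(A))      # 2 3

B = [[5, 1, 3], [3, 2, 4], [-3, 0, 1]]     # saddle point a22 = 2
print(gain_floor(B), loss_ceiling(B))      # 2 2
```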
Show that the minimax strategies of Players I and II are not spyproof in the following
game:
Player II
β1 β2
Player I θ1 8 −5
θ2 2 6
SOLUTION.
Player I can minimize her maximum loss by choosing Strategy θ2 .
Player II can minimize his maximum loss by choosing Strategy β2 .
However, if Player II knew that Player I was going to base her choice on the minimax
criterion, he could switch to Strategy β1 and thus reduce his loss from $6 to $2.
Of course, if Player I discovered that Player II would try to outsmart her, she could in
turn switch to Strategy θ1 and increase her gain to $8.
In any case, the minimax strategies of the two players are not spyproof.
Clearly, there is no saddle point in this example, since the smallest value of each row is
also the smallest value of its column and therefore cannot be the largest value of its
column. Thus not all games are spyproof.
In general, if a game has a saddle point, it is said to be strictly determined, and the
strategies corresponding to the saddle point are spyproof (optimum) minimax
strategies.
If a game does not have a saddle point, minimax strategies are not spyproof, and each
player can outsmart the other if he knows how the opponent will react in a given
situation. To avoid this possibility, each player should somehow mix up his or her
behavior patterns intentionally, and the best way of doing this is by introducing an
element of chance into the selection of strategies.
Mixed Strategy
A mixed strategy for a player is a probability distribution on the set of his pure
strategies.
If a player has only a finite number, m, of pure strategies, a mixed strategy
reduces to an m-vector, x = (x1 , · · · , xm ), satisfying
\[ x_i \ge 0, \qquad \sum_{i=1}^{m} x_i = 1. \]
We shall denote the set of all mixed strategies for Player I by X, and the set of all
mixed strategies for Player II (who has n pure strategies) by Y :
\[ X = \Big\{ x = (x_1, \dots, x_m) : x_i \ge 0,\ \sum_{i=1}^{m} x_i = 1 \Big\}, \]
\[ Y = \Big\{ y = (y_1, \dots, y_n) : y_j \ge 0,\ \sum_{j=1}^{n} y_j = 1 \Big\}. \]
Expected Payoff
Let us suppose that Players I and II are playing the matrix game A. If Player I chooses
the mixed strategy x, and Player II chooses y, then the expected payoff is
computed by weighting each entry aij by the probabilities xi and yj :
\[ \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \longrightarrow \begin{pmatrix} x_1 a_{11} y_1 & \cdots & x_1 a_{1n} y_n \\ \vdots & \ddots & \vdots \\ x_m a_{m1} y_1 & \cdots & x_m a_{mn} y_n \end{pmatrix}. \]
That is,
\[ A(x, y) = \sum_{i=1}^{m} \sum_{j=1}^{n} x_i a_{ij} y_j . \]
Or, in matrix form,
\[ A(x, y) = x A y^T . \]
This can be thought of as a weighted average of the payoffs aij , with weights xi yj .
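The double sum can be written out directly in code. The sketch below uses exact fractions; the function name expected_payoff and the choice of matching pennies as test data are illustrative assumptions.

```python
from fractions import Fraction

def expected_payoff(x, A, y):
    """A(x, y) = sum over i and j of x_i * a_ij * y_j."""
    return sum(x[i] * A[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

half = Fraction(1, 2)
A = [[-1, 1], [1, -1]]                                  # matching pennies
print(expected_payoff([half, half], A, [half, half]))   # 0
print(expected_payoff([1, 0], A, [0, 1]))               # 1 (pure strategies pick out a12)
```

With pure strategies the formula simply returns the corresponding matrix entry, which is a quick sanity check on the indices.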
Player I’s Maximin Strategy

Assume that Player I uses x. Then Player II will certainly choose y so as to minimize
A(x, y), i.e., Player I’s expected gain-floor will be
\[ v(x) = \min_{y \in Y} x A y^T = \min_j x A_{\bullet j} \]
(the minimum is attained by a pure strategy j; A•j is the jth column of the matrix
A). Hence Player I should choose x so as to maximize v(x):
\[ v_I = \max_{x \in X} \min_j x A_{\bullet j} . \]
Such a strategy x is Player I’s maximin strategy.

Player II’s Minimax Strategy

Similarly, if Player II chooses y, he will obtain the expected loss-ceiling
\[ v(y) = \max_i A_{i \bullet}\, y^T , \]
where Ai• is the ith row of A, and he should choose y so as to obtain
\[ v_{II} = \min_{y \in Y} \max_i A_{i \bullet}\, y^T . \]
Such a strategy y is Player II’s minimax strategy.
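For a given mixed strategy, v(x) and v(y) only require checking the opponent's pure strategies. The sketch below evaluates them for the 2 × 2 game with rows (4, 2) and (1, 3); the particular mixed strategies x = (1/2, 1/2) and y = (1/4, 3/4) and the function names are assumptions chosen for illustration.

```python
from fractions import Fraction

def gain_floor_of(x, A):
    """v(x) = min_j x A_{.j}: worst case over Player II's pure strategies."""
    return min(sum(x[i] * A[i][j] for i in range(len(A)))
               for j in range(len(A[0])))

def loss_ceiling_of(y, A):
    """v(y) = max_i A_{i.} y^T: worst case over Player I's pure strategies."""
    return max(sum(A[i][j] * y[j] for j in range(len(A[0])))
               for i in range(len(A)))

A = [[4, 2], [1, 3]]
x = [Fraction(1, 2), Fraction(1, 2)]
y = [Fraction(1, 4), Fraction(3, 4)]
print(gain_floor_of(x, A))      # 5/2
print(loss_ceiling_of(y, A))    # 5/2
print(gain_floor_of([1, 0], A), loss_ceiling_of([0, 1], A))  # 2 3
```

Here the mixed strategies close the gap between the pure-strategy gain-floor 2 and loss-ceiling 3: both players can guarantee 5/2.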

Minimax Theorem
Thus we obtain the two numbers vI and vII . These numbers are called the values of
the game to Players I and II, respectively.
Theorem 1 (Minimax Theorem)
vI = vII
This theorem is the most important in game theory. It says that every two-person
zero-sum matrix game has a value and optimal (possibly mixed) strategies for both players.
Computing Optimal Strategies

The simplest case:
If a saddle point exists (that is, there exists an entry aij which is both the largest entry in its
column and the smallest entry in its row), then the pure strategies i and j, or
equivalently, the mixed strategies x and y with xi = 1 and yj = 1 and all other
components equal to zero,
\[ x = (0, \dots, 1, \dots, 0) \quad \text{(1 in the ith position)}, \]
\[ y = (0, \dots, 1, \dots, 0) \quad \text{(1 in the jth position)}, \]
will be optimal strategies for Player I and Player II, respectively.

Row Domination
General case:
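In a matrix A, we say the ith row dominates the kth row if
\[ a_{ij} \ge a_{kj} \ \text{for all } j \quad \text{and} \quad a_{ij} > a_{kj} \ \text{for at least one } j. \]
Example:
\[ \begin{array}{c} \\ \to \\ \to \end{array} \begin{pmatrix} 2 & 0 & 1 & 4 \\ 6 & 2 & 5 & 3 \\ 4 & 1 & 3 & 2 \end{pmatrix}. \]
The second row dominates the third row, so the third row is removed.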
Column Domination
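Similarly, we say that the jth column dominates the lth column if
\[ a_{ij} \le a_{il} \ \text{for all } i \]
(opposite inequality direction) and
\[ a_{ij} < a_{il} \ \text{for at least one } i. \]
Example (the arrows mark columns 2 and 4):
\[ \begin{array}{cccc} & \downarrow & & \downarrow \end{array} \qquad \begin{pmatrix} 2 & 0 & 1 & 4 \\ 1 & 2 & 5 & 3 \\ 4 & 1 & 3 & 2 \end{pmatrix}. \]
The second column dominates the fourth column.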
Theorem
Theorem 2 Let A be a matrix game, and assume that rows i1 , i2 , · · · , ik of A are

dominated. Then Player I has an optimal strategy x such that
xi1 = xi2 = · · · = xik = 0; moreover, any optimal strategy for the game obtained by
removing the dominated rows will also be an optimal strategy for the original game.
A similar theorem will hold for the column domination.

Example
Consider the game with matrix
\[ \begin{pmatrix} 2 & 0 & 1 & 4 \\ 1 & 2 & 5 & 3 \\ 4 & 1 & 3 & 2 \end{pmatrix}. \]
The second column dominates the fourth column, so column 4 is deleted:
\[ \begin{pmatrix} 2 & 0 & 1 \\ 1 & 2 & 5 \\ 4 & 1 & 3 \end{pmatrix}. \]
In this new matrix, the third row dominates the first row, so row 1 is deleted:
\[ \begin{pmatrix} 1 & 2 & 5 \\ 4 & 1 & 3 \end{pmatrix}, \]
and in this new matrix, the third column is dominated by the second column. Hence
the matrix is reduced to
\[ \begin{pmatrix} 1 & 2 \\ 4 & 1 \end{pmatrix}, \]
and we now look for optimal strategies for this small 2 × 2 matrix game.
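The reduction can be automated by repeatedly deleting any dominated row or column until none remains. The sketch below is one possible implementation under the definitions of row and column domination given earlier; the function names are illustrative, and the order in which it deletes rows and columns may differ from the order used in the text, although for this matrix it ends at the same 2 × 2 game.

```python
def row_dominates(A, i, k, cols):
    """Row i dominates row k on the surviving columns (entries >=, one >)."""
    return (all(A[i][j] >= A[k][j] for j in cols)
            and any(A[i][j] > A[k][j] for j in cols))

def col_dominates(A, j, l, rows):
    """Column j dominates column l on the surviving rows (entries <=, one <)."""
    return (all(A[i][j] <= A[i][l] for i in rows)
            and any(A[i][j] < A[i][l] for i in rows))

def reduce_by_dominance(A):
    """Return the 0-based indices of the rows and columns that survive."""
    rows, cols = list(range(len(A))), list(range(len(A[0])))
    changed = True
    while changed:
        changed = False
        for k in rows:
            if any(row_dominates(A, i, k, cols) for i in rows if i != k):
                rows.remove(k)      # safe: we leave the loop immediately
                changed = True
                break
        if changed:
            continue
        for l in cols:
            if any(col_dominates(A, j, l, rows) for j in cols if j != l):
                cols.remove(l)
                changed = True
                break
    return rows, cols

A = [[2, 0, 1, 4], [1, 2, 5, 3], [4, 1, 3, 2]]
rows, cols = reduce_by_dominance(A)
reduced = [[A[i][j] for j in cols] for i in rows]
print(rows, cols)   # [1, 2] [0, 1]
print(reduced)      # [[1, 2], [4, 1]]
```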
2 × 2 games
Theorem 3 Let A be a 2 × 2 matrix game. Then if A does not have a saddle point, its
unique optimal strategies and optimal expected payoff will be given by
\[ x = \frac{J A^\star}{J A^\star J^T}, \qquad y = \frac{J (A^\star)^T}{J A^\star J^T}, \qquad v = \frac{|A|}{J A^\star J^T}, \]
where A⋆ is the adjoint of A, |A| the determinant of A, and J is the vector (1, 1).
Adjoint
Let
\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \]
Then the adjoint of A is
\[ A^\star = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}. \]
Counterexample
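If there is a saddle point, the denominator may be zero:
\[ J A^\star J^T = 0. \]
A saddle point exists for the matrix
\[ A = \begin{pmatrix} 1 & 3 \\ 2 & 4 \end{pmatrix}, \]
where a21 = 2 is a saddle point, and
\[ J A^\star J^T = (1, 1) \begin{pmatrix} 4 & -3 \\ -2 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = (2, -2) \begin{pmatrix} 1 \\ 1 \end{pmatrix} = 0. \]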
Example Continued
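We have
\[ A = \begin{pmatrix} 1 & 2 \\ 4 & 1 \end{pmatrix}. \]
Thus,
\[ A^\star = \begin{pmatrix} 1 & -2 \\ -4 & 1 \end{pmatrix}, \qquad |A| = -7, \]
\[ J A^\star J^T = (-3, -1) \begin{pmatrix} 1 \\ 1 \end{pmatrix} = -4, \]
\[ x = \frac{J A^\star}{J A^\star J^T} = \frac{(-3, -1)}{-4} = \left( \frac{3}{4}, \frac{1}{4} \right), \]
\[ y = \frac{J (A^\star)^T}{J A^\star J^T} = \frac{(1, 1) \begin{pmatrix} 1 & -4 \\ -2 & 1 \end{pmatrix}}{-4} = \frac{(-1, -3)}{-4} = \left( \frac{1}{4}, \frac{3}{4} \right). \]
The conclusion is that the optimal mixed strategy for Player I in the original 3 × 4 game is
\[ x^\star = \left( 0, \frac{3}{4}, \frac{1}{4} \right), \]
and that the optimal mixed strategy for Player II is
\[ y^\star = \left( \frac{1}{4}, \frac{3}{4}, 0, 0 \right). \]
The optimal expected payoff is
\[ v = \frac{|A|}{J A^\star J^T} = \frac{-7}{-4} = \frac{7}{4}. \]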
Example
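Solve the matrix game
\[ A = \begin{pmatrix} 1 & 0 \\ -1 & 2 \end{pmatrix}. \]
It is easy to check that the game doesn’t have a saddle point. Now the adjoint matrix
A⋆ of A is
\[ A^\star = \begin{pmatrix} 2 & 0 \\ 1 & 1 \end{pmatrix}, \]
and |A| = 2, JA⋆ = (3, 1), J(A⋆ )T = (2, 2) and JA⋆ J T = 4. Thus we have
\[ x = \left( \frac{3}{4}, \frac{1}{4} \right), \qquad y = \left( \frac{1}{2}, \frac{1}{2} \right), \qquad v = \frac{1}{2}. \]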
Using Game Theory to Shape Strategy at General Motors

Game theory often assumes that one player or company must lose for another to win.
In the auto industry, car companies typically compete by offering rebates and price
cuts. This allows one company to gain market share at the expense of other car
companies. Although this win-lose strategy works in the short term, competitors
quickly follow the same strategy. The result is lower margins and profitability. Indeed,
many customers wait until a rebate or price cut is offered before buying a new car. The
short-term win-lose strategy turns into a long-term lose-lose result.
By changing the game itself, it is possible to find strategies that can benefit all
competitors. This was the case when General Motors (GM) developed a new credit
card that allowed people to apply 5% of their purchases to a new GM vehicle, up to
$500 per year with a maximum of $3500. The credit card program replaced other
incentive programs offered by GM. Changing the game helped bring profitability back
to GM. In addition, it also helped other car manufacturers who no longer had to
compete on price cuts and rebates. In this case, the new game resulted in a win-win
situation for GM and its rivals. Prices, margins, and profitability increased for GM and some of its
competitors.
Statistical Games
In statistical inference, decisions about populations are based on sample data, and it is
natural to look upon such an inference as a game between Nature (which controls the
relevant feature(s) of the population) and the person who must arrive at some decision
about Nature’s choice.
For instance, if we want to estimate the mean, µ, of a normal population on the basis of
a random sample of size n, we could say that Nature has control over the true value of
µ.
On the other hand, we might estimate µ in terms of the value of the sample mean or
median, and presumably there is some penalty that depends on the size of our error.
There are essentially two distinguishing features of statistical games:
(1) Statistical games treat Nature as a rational opponent rather than a malevolent
opponent.
(2) In a statistical game, the statistician is supplied with sample data that provide him
with some information about Nature’s choice. This also complicates matters, but it
merely amounts to the fact that we are dealing with more complicated kinds of
games.
For example, we are told that a coin is either balanced with heads on one side and tails
on the other or two-headed. We cannot inspect the coin, but we can flip it once and
observe whether it comes up heads or tails. Then we must decide whether or not it is
two-headed, keeping in mind that there is a penalty of $1 if our decision is wrong and
no penalty if our decision is right. If we ignored the fact that we can observe one flip of
the coin, we could treat the problem as the following game:
Player II (Statistician)
β1 β2
Player I θ1 L(β1 , θ1 ) = 0 L(β2 , θ1 ) = 1
(Nature) θ2 L(β1 , θ2 ) = 1 L(β2 , θ2 ) = 0
Now θ1 and θ2 are the ‘states of Nature’ that the coin is two-headed and balanced
(one head and one tail), respectively; β1 and β2 are the statistician’s decisions that the coin is
two-headed and balanced, respectively. The entries in the table are the corresponding
values of the given loss function.
Player II knows the result of the flip of the coin, i.e., that a random variable X has taken on
the value x = 0 (heads) or x = 1 (tails). Since we shall want to make use of this
information in choosing between β1 and β2 , we need a function, a decision function,
that tells us what action to take when x = 0 and when x = 1. We can express this by writing
\[ d_1(x) = \begin{cases} \beta_1, & \text{if } x = 0, \\ \beta_2, & \text{if } x = 1. \end{cases} \]
The purpose of the subscript is to distinguish this decision function from the others.
Altogether there are four possible decision functions here; the other three are
d2 (0) = β1 , d2 (1) = β1 ,
d3 (0) = β2 , d3 (1) = β2 ,
d4 (0) = β2 , d4 (1) = β1 .
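To compare the merits of all these decision functions, let us first determine the
expected losses to which they lead for the various strategies of Nature, that is, the
values of the risk function
\[ R(d_i, \theta_j) = E[L(d_i(X), \theta_j)] = \sum_{x} L(d_i(x), \theta_j)\, P(X = x \mid \theta_j), \]
where the expectation is taken with respect to the random variable X.
Since the probabilities for x = 0 and x = 1 are, respectively, 1 and 0 for θ1 , and 1/2 and
1/2 for θ2 , we get
\begin{align*}
R(d_1, \theta_1) &= 1 \times L(\beta_1, \theta_1) + 0 \times L(\beta_2, \theta_1) = 1 \times 0 + 0 \times 1 = 0, \\
R(d_1, \theta_2) &= \tfrac{1}{2} \times L(\beta_1, \theta_2) + \tfrac{1}{2} \times L(\beta_2, \theta_2) = \tfrac{1}{2} \times 1 + \tfrac{1}{2} \times 0 = \tfrac{1}{2}, \\
R(d_2, \theta_1) &= 1 \times L(\beta_1, \theta_1) + 0 \times L(\beta_1, \theta_1) = 1 \times 0 + 0 \times 0 = 0, \\
R(d_2, \theta_2) &= \tfrac{1}{2} \times L(\beta_1, \theta_2) + \tfrac{1}{2} \times L(\beta_1, \theta_2) = \tfrac{1}{2} \times 1 + \tfrac{1}{2} \times 1 = 1, \\
R(d_3, \theta_1) &= 1 \times L(\beta_2, \theta_1) + 0 \times L(\beta_2, \theta_1) = 1 \times 1 + 0 \times 1 = 1, \\
R(d_3, \theta_2) &= \tfrac{1}{2} \times L(\beta_2, \theta_2) + \tfrac{1}{2} \times L(\beta_2, \theta_2) = \tfrac{1}{2} \times 0 + \tfrac{1}{2} \times 0 = 0, \\
R(d_4, \theta_1) &= 1 \times L(\beta_2, \theta_1) + 0 \times L(\beta_1, \theta_1) = 1 \times 1 + 0 \times 0 = 1, \\
R(d_4, \theta_2) &= \tfrac{1}{2} \times L(\beta_2, \theta_2) + \tfrac{1}{2} \times L(\beta_1, \theta_2) = \tfrac{1}{2} \times 0 + \tfrac{1}{2} \times 1 = \tfrac{1}{2}.
\end{align*}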
We have thus arrived at the following 2 × 4 zero-sum two-person game, in which the
payoffs are the corresponding values of the risk function:
                   d1     d2     d3     d4
Player I    θ1      0      0      1      1
(Nature)    θ2     1/2     1      0     1/2
As can be seen by inspection, d2 is dominated by d1 and d4 is dominated by d3 , so that
d2 and d4 can be discarded; we say that they are inadmissible.
                   d1     d3
Player I    θ1      0      1
(Nature)    θ2     1/2     0
This leaves us with a 2 × 2 zero-sum two-person game.
It can be verified that if Nature is looked upon as a malevolent opponent, the optimum
strategy is to randomize between d1 and d3 with respective probabilities of 2/3 and 1/3,
and the value of the game is 1/3 of a dollar.
If Nature is a rational opponent, note that we formulated this problem with reference to a
two-headed coin and an ordinary coin; stated more abstractly, we must decide on the basis
of a single observation whether the random variable has the Bernoulli distribution with the
parameter θ = 0 or the parameter θ = 1/2.
The minimax criterion

If we apply the minimax criterion to the previous coin problem (the coin is either
two-headed or balanced with heads on one side and tails on the other),
we know that d2 and d4 can be deleted and that the maximum risks of d1 and d3 are 1/2 and 1,
respectively.
Hence, the one that minimizes the maximum risk is d1 .
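The risk table and the minimax choice among the pure decision functions can be reproduced in a few lines. In the sketch below the labels "b1", "b2", "t1", "t2" stand for β1 , β2 , θ1 , θ2 ; these names and the dictionary layout are assumptions made for illustration.

```python
from fractions import Fraction

half = Fraction(1, 2)
# loss[(decision, state)]: 0 for a correct decision, 1 for a wrong one.
loss = {("b1", "t1"): 0, ("b2", "t1"): 1,
        ("b1", "t2"): 1, ("b2", "t2"): 0}
# prob[state][x]: probability of observing x (0 = heads, 1 = tails).
prob = {"t1": {0: 1, 1: 0}, "t2": {0: half, 1: half}}
# Each decision function maps the observation x to a decision.
decision_functions = {
    "d1": {0: "b1", 1: "b2"},
    "d2": {0: "b1", 1: "b1"},
    "d3": {0: "b2", 1: "b2"},
    "d4": {0: "b2", 1: "b1"},
}

def risk(d, state):
    """R(d, theta) = sum over x of L(d(x), theta) * P(X = x | theta)."""
    return sum(loss[(d[x], state)] * prob[state][x] for x in (0, 1))

max_risk = {name: max(risk(d, s) for s in ("t1", "t2"))
            for name, d in decision_functions.items()}
best = min(max_risk, key=max_risk.get)
print(max_risk["d1"], max_risk["d3"])   # 1/2 1
print(best)                             # d1
```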
Example Continued
A random variable has the uniform density
\[ f(x) = \begin{cases} \dfrac{1}{\theta}, & \text{for } 0 < x < \theta, \\ 0, & \text{otherwise,} \end{cases} \]
and we want to estimate the parameter θ (the move of Nature) on the basis of a single
observation. If the decision function is to be of the form d(x) = kx, where k ≥ 1, and
the losses are proportional to the absolute value of the errors, that is,
\[ L(kx, \theta) = c\,|kx - \theta|, \quad c > 0, \]
find the value of k that will minimize the risk.

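SOLUTION.
Since L(kx, θ) = c|kx − θ| with c > 0, and kx < θ exactly when x < θ/k,
for the risk function we get
\[ R(d, \theta) = \int_{0}^{\theta/k} c(\theta - kx) \cdot \frac{1}{\theta}\, dx + \int_{\theta/k}^{\theta} c(kx - \theta) \cdot \frac{1}{\theta}\, dx = c\theta \left( \frac{k}{2} - 1 + \frac{1}{k} \right), \]
and there is nothing we can do about the factor θ, but it can easily be verified that
k = √2 will minimize k/2 − 1 + 1/k (setting the derivative 1/2 − 1/k² equal to zero gives k² = 2).
Thus, if we actually took the observation and got x = 5, our estimate of θ would be
5√2 ≈ 7.07.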
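The minimizing value of k can also be confirmed numerically. The sketch below is a crude grid search over k in [1, 3]; the grid range and step are arbitrary choices for illustration, and the function name risk_factor is hypothetical.

```python
import math

def risk_factor(k):
    """The k-dependent factor of the risk: k/2 - 1 + 1/k."""
    return k / 2 - 1 + 1 / k

# Grid of k values from 1 to 3 in steps of 0.0001.
grid = [1 + i / 10000 for i in range(20001)]
k_best = min(grid, key=risk_factor)

print(round(k_best, 3), round(math.sqrt(2), 3))   # 1.414 1.414
print(round(5 * math.sqrt(2), 2))                 # 7.07
```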
Decision Criteria
In the above example, we were able to find a decision function that minimized the risk
regardless of the true state of Nature, but this is the exception rather than the rule. Had
we not limited ourselves to decision functions of the form d(x) = kx, then the decision
function given by d(x) = θi would be best when θ happens to equal θi and it is
obvious that there can be no decision function that is best for all values of θ.
In general, we thus have to be satisfied with decision functions that are best only with
respect to some criterion, such as
(1) the minimax criterion, according to which we choose the decision function d for
which R(d, θ), maximized with respect to θ, is a minimum;
(2) the Bayes criterion, according to which we choose the decision function d for
which the Bayes risk E[R(d, θ)] is a minimum, where the expectation is taken
with respect to θ. This requires that we look upon θ as a random variable having a
given distribution.
Example
Use the minimax criterion to estimate the parameter θ of a binomial distribution on the
basis of the random variable X, the observed number of successes in n trials, when the
decision function is of the form
\[ d(x) = \frac{x + a}{n + b}, \]
where a and b are constants, and the loss function is given by
\[ L\left( \frac{x + a}{n + b}, \theta \right) = c \left( \frac{x + a}{n + b} - \theta \right)^2, \]
where c is a positive constant.
SOLUTION.
The problem is to find the values of a and b that will minimize the corresponding risk
function after it has been maximized with respect to θ.
After all, we have control over the choice of a and b, while Nature (our presumed
opponent) has control over the choice of θ.
Since E(X) = nθ and E(X²) = nθ(1 − θ + nθ), it follows that
\[ R(d, \theta) = E\left[ c \left( \frac{X + a}{n + b} - \theta \right)^2 \right] = \frac{c}{(n + b)^2} \left[ \theta^2 (b^2 - n) + \theta (n - 2ab) + a^2 \right], \]
and we could find the value of θ that maximizes R(d, θ) and then the values
of a and b that minimize this maximum.
To simplify the work in a problem of this kind, we can often use the equalizer
principle, according to which (under fairly general conditions) the risk function of a
minimax decision rule is a constant; here, it tells us that the risk function should
not depend on the value of θ.
To make the risk function independent of θ, the coefficients of θ and θ² must both
equal 0 in the expression for R(d, θ). This yields b² − n = 0 and n − 2ab = 0, and,
hence, a = √n/2 and b = √n. Thus, the minimax decision function is given by
\[ d(x) = \frac{x + \frac{1}{2}\sqrt{n}}{n + \sqrt{n}}, \]
and if we actually obtained 39 successes in 100 trials, we would estimate the parameter
θ of this binomial distribution as
\[ d(39) = \frac{39 + \frac{1}{2}\sqrt{100}}{100 + \sqrt{100}} = \frac{44}{110} = 0.40. \]
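As a quick check of the equalizer argument, the risk can be evaluated on a range of θ values with a = √n/2 and b = √n. The sketch below takes c = 1 and a grid of eleven θ values purely for illustration; the function names are hypothetical.

```python
import math

def risk(theta, n, a, b, c=1.0):
    """R(d, theta) = c [theta^2 (b^2 - n) + theta (n - 2ab) + a^2] / (n + b)^2."""
    return c * (theta**2 * (b**2 - n) + theta * (n - 2 * a * b) + a**2) / (n + b)**2

def minimax_estimate(x, n):
    """d(x) = (x + sqrt(n)/2) / (n + sqrt(n))."""
    return (x + math.sqrt(n) / 2) / (n + math.sqrt(n))

n = 100
a, b = math.sqrt(n) / 2, math.sqrt(n)
risks = [risk(t / 10, n, a, b) for t in range(11)]   # theta = 0.0, 0.1, ..., 1.0

print(max(risks) - min(risks))    # spread of the risk over theta
print(minimax_estimate(39, n))    # 0.4
```

With these values of a and b the spread is zero (up to rounding), that is, the risk does not depend on θ, whereas other choices such as a = 0, b = 0 give a risk that varies with θ.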
Linear Programming Formulation

Given the m × n payoff matrix
\[ \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}, \]
Player I’s optimal probabilities x1 , x2 , · · · , xm can be determined by solving the
following maximin problem:
\[ \max_{x_i} \min \left\{ \sum_{i=1}^{m} a_{i1} x_i,\ \sum_{i=1}^{m} a_{i2} x_i,\ \dots,\ \sum_{i=1}^{m} a_{in} x_i \right\}, \]
\[ x_1 + x_2 + \cdots + x_m = 1, \]
\[ x_i \ge 0, \quad i = 1, 2, \dots, m. \]
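Now let
\[ v = \min \left\{ \sum_{i=1}^{m} a_{i1} x_i,\ \sum_{i=1}^{m} a_{i2} x_i,\ \dots,\ \sum_{i=1}^{m} a_{in} x_i \right\}. \]
This equation implies that
\[ \sum_{i=1}^{m} a_{ij} x_i \ge v, \quad j = 1, 2, \dots, n. \]
Player I’s problem thus can be written as
\begin{align*}
\text{maximize } & z = v, \\
\text{subject to } & v - \sum_{i=1}^{m} a_{ij} x_i \le 0, \quad j = 1, 2, \dots, n, \\
& x_1 + x_2 + \cdots + x_m = 1, \\
& x_i \ge 0, \quad i = 1, 2, \dots, m, \\
& v \text{ unrestricted.}
\end{align*}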
Player II’s optimal strategies, y1 , y2 , · · · , yn , are determined by solving the
problem
\[ \min_{y_j} \max \left\{ \sum_{j=1}^{n} a_{1j} y_j,\ \sum_{j=1}^{n} a_{2j} y_j,\ \dots,\ \sum_{j=1}^{n} a_{mj} y_j \right\}, \]
\[ y_1 + y_2 + \cdots + y_n = 1, \]
\[ y_j \ge 0, \quad j = 1, 2, \dots, n. \]
Using a procedure similar to that of Player I, Player II’s problem reduces to
\begin{align*}
\text{minimize } & w = v, \\
\text{subject to } & v - \sum_{j=1}^{n} a_{ij} y_j \ge 0, \quad i = 1, 2, \dots, m, \\
& y_1 + y_2 + \cdots + y_n = 1, \\
& y_j \ge 0, \quad j = 1, 2, \dots, n, \\
& v \text{ unrestricted.}
\end{align*}
Example
Solve the following game by linear programming.
            β1    β2    β3
      θ1     3    −1    −3
      θ2    −2     4    −1
      θ3    −5    −6     2
Solving by a Computer
Player I’s Linear Program
maximize z = v,
subject to v − 3x1 + 2x2 + 5x3 ≤ 0,
v + x1 − 4x2 + 6x3 ≤ 0,
v + 3x1 + x2 − 2x3 ≤ 0,
x1 + x2 + x3 = 1,
x1 , x2 , x3 ≥ 0,
v unrestricted.
The optimum solution is x1 = 0.39, x2 = 0.31, x3 = 0.29 and v = −0.91.

Player II’s Linear Program
minimize w = v,
subject to v − 3y1 + y2 + 3y3 ≥ 0,
v + 2y1 − 4y2 + y3 ≥ 0,
v + 5y1 + 6y2 − 2y3 ≥ 0,
y1 + y2 + y3 = 1,
y1 , y2 , y3 ≥ 0,
v unrestricted.
The optimum solution is y1 = 0.32, y2 = 0.08, y3 = 0.60 and v = −0.91.
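The reported solutions can be cross-checked without an LP solver. The sketch below is not the linear programming method itself: it assumes that every strategy is used with positive probability, so that all constraints hold with equality, and solves the resulting linear system exactly. That assumption must be verified afterwards (all probabilities non-negative); when it holds for both players with the same v, the strategies are optimal. The function names solve_linear and equalizing_strategy are hypothetical.

```python
from fractions import Fraction

def solve_linear(M, rhs):
    """Gauss-Jordan elimination in exact arithmetic (assumes a unique solution)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

def equalizing_strategy(A):
    """Solve sum_i a_ij x_i = v for every column j, together with sum_i x_i = 1.
    Unknowns are (x_1, ..., x_m, v); A must be square."""
    m = len(A)
    M = [[A[i][j] for i in range(m)] + [-1] for j in range(m)]
    M.append([1] * m + [0])
    sol = solve_linear(M, [0] * m + [1])
    return sol[:m], sol[m]

A = [[3, -1, -3], [-2, 4, -1], [-5, -6, 2]]
x, v = equalizing_strategy(A)
# Player II's equalizing system uses the rows of A, i.e. the transpose.
y, w = equalizing_strategy([list(col) for col in zip(*A)])

print([round(float(t), 2) for t in x], round(float(v), 2))
print([round(float(t), 2) for t in y], round(float(w), 2))
```

The exact results are x = (43/109, 34/109, 32/109), y = (35/109, 9/109, 65/109) and v = −99/109, which round to the values quoted above.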
