We consider a highly simplified game between two players.
- Player
has pure strategies
. Similarly, Player
has pure strategies
.
- If
chooses strategy
and
chooses strategy
then
receives reward
and
receives reward
.1 We call the game a two-person zero-sum game because the rewards sum to zero.
- We can think of the outcome of the game being decided by player
choosing a row
of the matrix
and
choosing the column
. Hence we may refer to
as the row player and
as the column player.
- We can allow players to randomize and choose a mixed strategy. We let
be the set of probability distributions on
. If
,
is the probability that player
plays strategy
.
- If player
chooses mixed strategy
and
chooses
then the probability of pure strategy
with pure strategy
is
thus the expect reward for
is
As noted, we simplify the notion of a game. We do not explicitly consider the rules, movements or explicit structure of the game; we just consider strategies and their outcomes. Such a representation is often called the strategic form of the game. For this section, we consider a non-cooperative game, which means that players will seek to maximize there own rewards regardless of potential costs to others. Given a player cannot guarantee her opponent will act in her favor, what strategy might a rational player choose?
- A player plays a maximin criterion if the player chooses the action with the highest guaranteed reward.If the player can only choose amongst pure strategies it chooses
that maximizes
.If the player can choose mixed strategies it chooses
that maximizes
.
A maximin criterion is somewhat pessimistic; an opponent need not choose the strategy that gives the worst outcome. This said, one setting where it is in an player’s interest to take its opponents reward is a zero-sum game. Even so, it may still be the case that a player and his opponent choose maximin pure strategies but the opponents choice gives a reward above that anticipated by the players pessimistic maximin strategy. If we now allow players to choose mixed strategies, we see in following theorem that this is no longer the case and the player was correct to be pessimistic.
Thrm [Minimax Theorem] For any two-person zero-sum game
Moreover, there exist ,
and
such that
and
achieve the respective maximizations and minimizations in and
.
- If player
observes that on average
plays actions with distribution
then his best strategy is to play
that maximizes
. Given this,
’s best response is to minimize
over
. This leads to the
statement in equality . Conversely,
observing
and
responding give the
statement in . Equation asserts that these incentives are aligned.
- The fact there exists a
and
satisfying equality suggests there is a notion of equilibrium for this game: it would make no sense for
(or
) to deviate from
(or
). If
did and
’s strategy remain the the same demonstrates that
either be worse off or, at best, be as well off as before: i.e.
,
.
- We say that
is the value of the game and
,
is a solution of the game.
- The equality is, by construction2, remarkably similar to the statement of .
Proof. Player wishes to minimizes
’s reward. So let’s consider the optimization problem
Let’s calculate the dual of this linear program. Minimizing its Lagrangian gives
This gives us a dual
which we interpret as attempting to maximize his reward. As Strong Duality holds for the feasible linear programs and , we can find a
,
and
where
,
solves and
,
solves .
Because and
, we can say for all
and
Thus
Above, the two middle inequalities are exactly our statement , the left-most and right-most inequalities replace and
with an optimization over
and
, respectively. So we know
.
We can demonstrate the reverse inequality
for all and
. Minimizing both sides over
Finally maximizing the right-hand expression over gives
. Thus
as required. Finally, from we see that
,
and
achieves this minimax saddle point.
Ex 1. [Matching Pennies] Suppose two players each have a coin, say euro. Without showing their coin to their opponent, each player can choose to place the coin heads up or tails up. The players then reveal their coins to each other. If the coins faces match then the first player wins the coin. If the coins do not match the second player wins the coin. This leads to the game matrix
Clearly, no pure strategy equilibrium can exist e.g. if the first player always chooses heads then the second always chooses tails and so it is in the first player’s advantage to play tails more. By this same logic, intuitively, no player would want to bias towards heads or tails as their opponent could exploit this. Thus one might expect the solution to be a mixed strategy playing head and tails with equal probability and thus the expected value is zero. Let’s validate this.
Minimizing over gives
. Thus
and is achieved by
. By the symmetric argument we see that
also.
Ex 2. Find the solution and value of the two-person zero-sum game whose row player rewards are given by the matrix
Ex 3. Consider a two-player zero-sum game where players must play pure strategies under a maximin criterion. 1) Show that
2) Construct a matrix where this inequality is strict.
Ex 4. We consider a two-person zero-sum game with the row player ’s rewards given by matrix
.
- Strategy
is weakly dominated if there exists
such that
, i.e. the rewards of
are better than
.
Suppose is a weakly dominated strategy and consider
the game where the
th row is removed from matrix
, i.e.
is the game where player can no long select strategy
. Prove that the game
has the same value as
. Explain how a solution to the game
can be constructed from a solution of
.
Ex 5. [Paper, Scissors, Stone] We consider a game of paper, scissors, stone. As a reminder, in this game there are two players. simultaneously the players reveal to each other the sign paper, scissors or stone. Paper is beaten by scissors; scissors is beaten by stone and stone is beaten by paper.
Ex 6. We assume for our game that if a player wins with stone the losing player pays euro to the winner; similarly, if a player wins with scissors the loser pays
euro to the winner, and if a player wins with paper he pays
euro to the winner. We assume that for a draw no money is exchanged.
a) Write down , the game matrix for this two-person zero sum game.
b) Prove that, for any game with , the value of this game is
and there exist a solution where both players use the same (mixed) strategy which we call
.
c) Argue, for the solution from part b), that if
then the
-th entry of
equals zero.
d) Use the equalities from part c) to find a solution to this game.
Ex 7. We say that are the solution of a two-person zero-sum game
if they respectively solve the maximizations and minimizations in the expression
We say that are an equilibrium pair if they satisfy
a) Show that is the same for all equilibrium pairs.
b) Show that is an equilibrium pair iff it is a solution.
Ex 8. [A Blotto Game] Two armies battle over two areas of land. Each army must distribute their battalions over these two land regions. Army has
battalions. Army
has
battalions. Battalions are indivisible and each army must place at least one battalion on each land region.
The army with most battalions depolyed on an area land wins that region of land. If an army places battalions on a land region and the other army places
battalions on that same region, and without loss of generality
, then the army with
battalions wins and recieves a reward of
and the other army losses recieves a loss of
:
for the battalions defeated and
for the land won/lost. If
, the battle results in a draw and the reward/loss to both armies is zero. Rewards are added and losses are subtracted over the land regions to give the total gain of each army.
Write out the game matrix for the total gain of army A in this two-person zero-sum game. Find the value and solutions of this game.
Ex 9. a) State the Minimax Theorem and, for a reward matrix , define what is meant by the value and solution of a two-person zero-sum game.
Strategy is weakly dominated if there exists
such that
, i.e. the rewards of
are better than
.
b) Suppose is a weakly dominated strategy and consider
the game where the
th row is removed from matrix
, i.e.
is the game where player can no long select strategy
. Prove that the game
has the same value as
. Explain how a solution to the game
can be constructed from a solution of
.
c) Find the value and a solution of the following two-person zero-sum game where row player rewards are given by the matrix