We consider a highly simplified game between two players.
- Player has pure strategies . Similarly, Player has pure strategies .
- If chooses strategy and chooses strategy then receives reward and receives reward .1 We call the game a two-person zero-sum game because the rewards sum to zero.
- We can think of the outcome of the game being decided by player choosing a row of the matrix and choosing the column . Hence we may refer to as the row player and as the column player.
- We can allow players to randomize and choose a mixed strategy. We let be the set of probability distributions on . If , is the probability that player plays strategy .
- If player chooses mixed strategy and chooses then the probability of pure strategy with pure strategy is thus the expect reward for is
As noted, we simplify the notion of a game. We do not explicitly consider the rules, movements or explicit structure of the game; we just consider strategies and their outcomes. Such a representation is often called the strategic form of the game. For this section, we consider a non-cooperative game, which means that players will seek to maximize there own rewards regardless of potential costs to others. Given a player cannot guarantee her opponent will act in her favor, what strategy might a rational player choose?
- A player plays a maximin criterion if the player chooses the action with the highest guaranteed reward.If the player can only choose amongst pure strategies it chooses that maximizes .If the player can choose mixed strategies it chooses that maximizes .
A maximin criterion is somewhat pessimistic; an opponent need not choose the strategy that gives the worst outcome. This said, one setting where it is in an player’s interest to take its opponents reward is a zero-sum game. Even so, it may still be the case that a player and his opponent choose maximin pure strategies but the opponents choice gives a reward above that anticipated by the players pessimistic maximin strategy. If we now allow players to choose mixed strategies, we see in following theorem that this is no longer the case and the player was correct to be pessimistic.
Thrm [Minimax Theorem] For any two-person zero-sum game
Moreover, there exist , and such that and achieve the respective maximizations and minimizations in and .
- If player observes that on average plays actions with distribution then his best strategy is to play that maximizes . Given this, ’s best response is to minimize over . This leads to the statement in equality . Conversely, observing and responding give the statement in . Equation asserts that these incentives are aligned.
- The fact there exists a and satisfying equality suggests there is a notion of equilibrium for this game: it would make no sense for (or ) to deviate from (or ). If did and ’s strategy remain the the same demonstrates that either be worse off or, at best, be as well off as before: i.e. , .
- We say that is the value of the game and , is a solution of the game.
- The equality is, by construction2, remarkably similar to the statement of .
Proof. Player wishes to minimizes ’s reward. So let’s consider the optimization problem
Let’s calculate the dual of this linear program. Minimizing its Lagrangian gives
This gives us a dual
which we interpret as attempting to maximize his reward. As Strong Duality holds for the feasible linear programs and , we can find a , and where , solves and , solves .
Because and , we can say for all and
Above, the two middle inequalities are exactly our statement , the left-most and right-most inequalities replace and with an optimization over and , respectively. So we know .
We can demonstrate the reverse inequality
for all and . Minimizing both sides over
Finally maximizing the right-hand expression over gives . Thus as required. Finally, from we see that , and achieves this minimax saddle point.
Ex 1. [Matching Pennies] Suppose two players each have a coin, say euro. Without showing their coin to their opponent, each player can choose to place the coin heads up or tails up. The players then reveal their coins to each other. If the coins faces match then the first player wins the coin. If the coins do not match the second player wins the coin. This leads to the game matrix
Clearly, no pure strategy equilibrium can exist e.g. if the first player always chooses heads then the second always chooses tails and so it is in the first player’s advantage to play tails more. By this same logic, intuitively, no player would want to bias towards heads or tails as their opponent could exploit this. Thus one might expect the solution to be a mixed strategy playing head and tails with equal probability and thus the expected value is zero. Let’s validate this.
Minimizing over gives . Thus and is achieved by . By the symmetric argument we see that also.
Ex 2. Find the solution and value of the two-person zero-sum game whose row player rewards are given by the matrix
Ex 3. Consider a two-player zero-sum game where players must play pure strategies under a maximin criterion. 1) Show that
2) Construct a matrix where this inequality is strict.
Ex 4. We consider a two-person zero-sum game with the row player ’s rewards given by matrix .
- Strategy is weakly dominated if there exists such that , i.e. the rewards of are better than .
Suppose is a weakly dominated strategy and consider the game where the th row is removed from matrix , i.e. is the game where player can no long select strategy . Prove that the game has the same value as . Explain how a solution to the game can be constructed from a solution of .
Ex 5. [Paper, Scissors, Stone] We consider a game of paper, scissors, stone. As a reminder, in this game there are two players. simultaneously the players reveal to each other the sign paper, scissors or stone. Paper is beaten by scissors; scissors is beaten by stone and stone is beaten by paper.
Ex 6. We assume for our game that if a player wins with stone the losing player pays euro to the winner; similarly, if a player wins with scissors the loser pays euro to the winner, and if a player wins with paper he pays euro to the winner. We assume that for a draw no money is exchanged.
a) Write down , the game matrix for this two-person zero sum game.
b) Prove that, for any game with , the value of this game is and there exist a solution where both players use the same (mixed) strategy which we call .
c) Argue, for the solution from part b), that if then the -th entry of equals zero.
d) Use the equalities from part c) to find a solution to this game.
Ex 7. We say that are the solution of a two-person zero-sum game if they respectively solve the maximizations and minimizations in the expression
We say that are an equilibrium pair if they satisfy
a) Show that is the same for all equilibrium pairs.
b) Show that is an equilibrium pair iff it is a solution.
Ex 8. [A Blotto Game] Two armies battle over two areas of land. Each army must distribute their battalions over these two land regions. Army has battalions. Army has battalions. Battalions are indivisible and each army must place at least one battalion on each land region.
The army with most battalions depolyed on an area land wins that region of land. If an army places battalions on a land region and the other army places battalions on that same region, and without loss of generality , then the army with battalions wins and recieves a reward of and the other army losses recieves a loss of : for the battalions defeated and for the land won/lost. If , the battle results in a draw and the reward/loss to both armies is zero. Rewards are added and losses are subtracted over the land regions to give the total gain of each army.
Write out the game matrix for the total gain of army A in this two-person zero-sum game. Find the value and solutions of this game.
Ex 9. a) State the Minimax Theorem and, for a reward matrix , define what is meant by the value and solution of a two-person zero-sum game.
Strategy is weakly dominated if there exists such that , i.e. the rewards of are better than .
b) Suppose is a weakly dominated strategy and consider the game where the th row is removed from matrix , i.e. is the game where player can no long select strategy . Prove that the game has the same value as . Explain how a solution to the game can be constructed from a solution of .
c) Find the value and a solution of the following two-person zero-sum game where row player rewards are given by the matrix