We now consider games with more that players, so called
-person games. We continue to consider non-cooperative games i.e. where players act solely in their own interests. From zero-sum games we extend our notation.
- We consider
to be a finite set of players. As before each player
has a finite set of pure strategies
. As before
will be
’s set of mixed strategies, i.e. the set of probability distributions over
.
- We let
and we call each vector
a strategy profile. Similarly for mixed strategies, we also call
a strategy profile.
- Because each player
is not explicitly aware of his opponents choices, it will be convenient consider the strategies of players excluding
, namely, either
or
. We can then consider the strategy profile with
’s strategy included, namely,
or
.
- When each player has selected a mixed strategy, we assume the pure strategy of each player is selected independently at random according to the players mixed strategy. So given the mixed strategies of each player
is
, where
denotes the probability of player
choosing pure strategy
, the probability of a strategy profile
is
- Given a strategy profile
is chosen by each player, the reward to player
is given by
. If we allow mixed strategies then for a strategy profile
the reward to player
is
- We assume that each player is allowed to chose their strategy as a function of the games parameters:
,
and A. In other words, the player has perfect information about the game.
To clarify further, when we say random here we imagine each player rolling a biased dice and then according to the outcome of that throw they select their (pure) strategy. The shape and way the dice is biased is decided by the player and is certainly dependent on information about their opponents and themselves i.e. their pure strategies, their opponent’s pure strategies and the rewards given for each strategy profile. So when we say that the random choices of players are independent, we mean that when each player has their own dice which they throw separately and each player is not aware of the outcome from their opponents dice throw. Finally, when we talk about the reward of a mixed strategy profile to a player, we mean the expected reward to the player from the procedure just described.
We now extend our notion a game’s equilibrium.
- A Nash equilibrium is a strategy profile where, given your opponents do not change their strategies, there is no advantage for you to change your strategy. Writen mathematically, a (pure strategy) Nash equilibrium is an
such that for each
and each $latex \sigma^\alpha \in {\mathcal S}^\alpha
- If players can choose mixed strategies then the notion of a Nash equilibrium defined above is extended by replacing pure strategies with mixed strategies. So
is a (mixed strategy) Nash equilibrium if for all
and for all
One can find simple examples of games where there is no pure strategy Nash equilibrium. An key result is that, provided there are a finite number of pure strategies, a mixed strategy Nash Equilibrium exists. We will assume is the following
Theorem [Brouwer Fixed Point Theorem] Let be a closed bounded subset of
. If
is a continuous function then there exists
such that
.
Theorem [Nash] For a -person game where each player has a finite set of pure strategies, there exists a mixed strategy Nash equilibrium.
Proof. Each is a closed, bounded subset of
. So,
is a closed, bounded subset of
. Thus, we can apply Brouwer’s Fixed Point Theorem to any continuous function from the set of mixed strategies to itself.
Consider a mixed strategy profile, . For each
, we define
Note, quantifies the additional reward that
could get by playing pure strategy
instead of mixed strategy
. A player just played
, one might imagine that a player would next place more emphasis on strategies
for which
. Hence, we define function
where for each player
the components of
are given by
One can check that is a continuous function from
to itself. Thus by the Brouwer Fixed Point Theorem there exists some vector
such that
. From now on we let
denote this fixed point.
We now attempt to show that for all
,
. Suppose for some
that
for all
with
then
for all
with
. Thus
This is clearly a contradiction. So for each player there exist some strategy
such that
and
, but then for this
In the first implication above, we just rearrange our equation. In the second implication we note that for each
.
Since for each we have shown
, it must be that
for all
. Thus for all
So as the above expression holds for each player ,
is a Nash equilibrium.
Ex. [Prisoner’s Dilemma] Two men are taken prisoner by the authorities. They are interviewed separately and asked to confess to the other prisoners involvement in a crime. A prisoner that does not confess receives year in prison. A prisoner that confesses adds
years to the other prisoner’s sentence. This game can be expressed by the matrix
For each entry in the above matrix the left entry
gives the row players sentence and
gives the column player’s sentence.
It is clear the the best outcome for both players is to not confess. Let’s think about what might happen. If row player confesses then column player may either confess and get years in prison or not confess and get
years. So it is better for him to confess. Similarly, if row player does not confess then column player either gets
year for not confessing or
years for confessing. So again it is better for him to confess. So column player decides to confess. By symmetry row player must do the same and thus both get
years in prison!
A little more mathematically, if row player does not confess with probability and does confess with probability
then it is in column players interest to choose a strategy minimizing
So it is better to confess. The prisoner’s dilemma is a strikingly simple example of non-cooperation giving an inefficient outcome.
Ex. [Battle of the Sexes] Two students decide to go on a date. One wants to go see the Ajax play the other wants to see a show the National Ballet at the Stopera. If they can’t agree then the date will not happen. The rewards to this game are
Find the Nash equilibria for this game.
Ex. [Symmetric Nash Equilibrium] We consider an -person game which is symmetric for each player: each player can choose amongst the same set of pure strategies
and given other players choices the reward for each player is the same i.e. for all players
, the rewards are equal
whenever
and
.
Show that there exists a symmetric Nash Equilibrium that is a Nash Equilibrium where each player has the same mixed strategy
for all
.
Ex. [Repeated Prisoner’s Dilemma] We assume the criminals from the Prisoner’s Dilemma (Example [nGame:Prisoner]) are repeat offenders. The are repeatedly arrested by the police and interviewed. Each time they can chose whether to confess or not and afterwards they find out if their fellow prisoner confessed or not. Recall the costs of the prisoners dilemma were
Now the payoffs are the same in our previous example, except each time they meet their payoffs are discounted by a multiplicative factor , for some
i.e. the
-th time the prisoners are interviewed, we multiply the above matrix by
. We now consider two questions one where time is finite
and one where time is infinite
.
a) Show that if this repeated game is played for a finite set of times then it is in the interested of each player to confess at each time. [Hint: Argue for time
and work backwards].
A punishing strategy is a strategy where the prisoner will not confess at every round unless his fellow prisoner confesses. If his fellow prisoner confesses at one time instance then the prisoner will confesses for all subsequent time. I.e. the strategy places a heavy penalty on the opponent for confessing.
b) If this repeated game is played for an infinite set of times and if
is suitably small then show that both players playing a punishing strategy is a Nash Equilibrium.
Find the Nash Equilibria of the game with rewards
For each entry in the above matrix the left entry
gives the row player’s reward and
gives column player’s reward.
Ex. A traveller must commute to work. He can choose to buy a ticket for a cost of 20 euros or he can chose not to buy a ticket and thus pay nothing. The railway company employs some (expensive) conductors. It can choose to either to put a conductor on the commuters train or not. It costs 10 euros for a conductor to inspect the passenger’s ticket. If the commuter is caught by the conductor, then the commuter must pay a fine of 60 euros.
We view this as a two player game between the railway company and the commuter.
a) Write the rewards of this game in a matrix?
b) For a game with two players, define what is meant by a Nash equilibrium for both pure and mixed strategies?
c) Does this game have any pure strategy equilibria? Does this game have any mixed strategy equilibria? If the answer to either question is yes then calculate all corresponding equilibria.
d) Suppose that the train company decides to change its fines. How much should the fine be increased by in order to guarantee 99% of commuters pay their fares?