n-Person Games

We now consider games with more that $2$ players, so called $n$ -person games. We continue to consider non-cooperative games i.e. where players act solely in their own interests. From zero-sum games we extend our notation.

We consider $\mathcal{P}$ to be a finite set of players. As before each player $\alpha\in\mathcal{P}$ has a finite set of pure strategies ${\mathcal S}_\alpha$ . As before $<\!{\mathcal S}_\alpha\!>$ will be $\alpha$ ’s set of mixed strategies, i.e. the set of probability distributions over ${\mathcal S}^\alpha$ .
We let ${\mathcal S}=\prod_{\alpha\in\mathcal{P}} {\mathcal S}_{\alpha}$ and we call each vector $s=(s_\alpha: \alpha\in\mathcal{P})\in {\mathcal S}$ a strategy profile. Similarly for mixed strategies, we also call $p=(p^\alpha: \alpha\in\mathcal{P})\in \prod_{\alpha\in\mathcal{P}} <\!S_{\alpha}\!>$ a strategy profile.
Because each player $\alpha$ is not explicitly aware of his opponents choices, it will be convenient consider the strategies of players excluding $\alpha$ , namely, either $s^{-\alpha}=(s^\beta: \beta \in \mathcal{P}\backslash \{\alpha\})$ or $p^{-\alpha}=(p^\beta: \beta \in \mathcal{P}\backslash \{\alpha\})$ . We can then consider the strategy profile with $\alpha$ ’s strategy included, namely, $s=(s^\alpha , s^{-\alpha})\in{\mathcal S}$ or $p=(p^\alpha, p^{-\alpha})\in \prod_{\alpha\in\mathcal{P}} <\!S_{\alpha}\!>$ .
When each player has selected a mixed strategy, we assume the pure strategy of each player is selected independently at random according to the players mixed strategy. So given the mixed strategies of each player $\alpha$ is $p^{\alpha}$ , where $p^\alpha_{s^\alpha}$ denotes the probability of player $\alpha$ choosing pure strategy $s^\alpha$ , the probability of a strategy profile $s=(s^\alpha:\alpha\in\mathcal{P})\in{\mathcal S}$ is

$\prod_{\alpha\in\mathcal{P}} p^\alpha_{s^\alpha}.$

Given a strategy profile $s\in{\mathcal S}$ is chosen by each player, the reward to player $\alpha\in \mathcal{P}$ is given by $A^\alpha(s)\in{\mathbb R}$ . If we allow mixed strategies then for a strategy profile $p\in \prod_{\alpha\in\mathcal{P}} <\! {\mathcal S}^\alpha \! >$ the reward to player $\alpha$ is

$A^\alpha(p)=\sum_{s\in {\mathcal S}} A^\alpha(s) \prod_{\alpha\in\mathcal{P}} p^{\alpha}_{s^\alpha}$

We assume that each player is allowed to chose their strategy as a function of the games parameters: $\mathcal{P}$ , ${\mathcal S}$ and A. In other words, the player has perfect information about the game.

To clarify further, when we say random here we imagine each player rolling a biased dice and then according to the outcome of that throw they select their (pure) strategy. The shape and way the dice is biased is decided by the player and is certainly dependent on information about their opponents and themselves i.e. their pure strategies, their opponent’s pure strategies and the rewards given for each strategy profile. So when we say that the random choices of players are independent, we mean that when each player has their own dice which they throw separately and each player is not aware of the outcome from their opponents dice throw. Finally, when we talk about the reward of a mixed strategy profile to a player, we mean the expected reward to the player from the procedure just described.

We now extend our notion a game’s equilibrium.

A Nash equilibrium is a strategy profile where, given your opponents do not change their strategies, there is no advantage for you to change your strategy. Writen mathematically, a (pure strategy) Nash equilibrium is an $s=(s^\beta:\beta \in\mathcal{P})\in {\mathcal S}$ such that for each $\alpha\in\mathcal{P}$ and each $latex \sigma^\alpha \in {\mathcal S}^\alpha

$A^\alpha(s)\geq A^\alpha(\sigma^\alpha,s^{-\alpha}).$

If players can choose mixed strategies then the notion of a Nash equilibrium defined above is extended by replacing pure strategies with mixed strategies. So $p=(p_\beta: \beta\in \mathcal{P}) \in \prod_{\beta\in\mathcal{P}} <\! {\mathcal S}^\beta\! >$ is a (mixed strategy) Nash equilibrium if for all $\alpha\in\mathcal{P}$ and for all $q^\alpha\in <\! S^\alpha\! >$

$A^\alpha(p)\geq A^\alpha(q^\alpha,p^{-\alpha}).$

One can find simple examples of games where there is no pure strategy Nash equilibrium. An key result is that, provided there are a finite number of pure strategies, a mixed strategy Nash Equilibrium exists. We will assume is the following

Theorem [Brouwer Fixed Point Theorem] Let ${\mathcal X}$ be a closed bounded subset of ${\mathbb R}^d$ . If $f:{\mathcal X}\rightarrow {\mathcal X}$ is a continuous function then there exists $x\in{\mathcal X}$ such that $f(x)=x$ .

Theorem [Nash] For a $n$ -person game where each player has a finite set of pure strategies, there exists a mixed strategy Nash equilibrium.

Proof. Each $<\! S^\alpha \!>$ is a closed, bounded subset of ${\mathbb R}^{|{\mathcal S}^\alpha|}$ . So, $\prod_{\alpha\in\mathcal{P}}<\! S^\alpha \!>$ is a closed, bounded subset of ${\mathbb R}^{|{\mathcal S}|}$ . Thus, we can apply Brouwer’s Fixed Point Theorem to any continuous function from the set of mixed strategies to itself.

Consider a mixed strategy profile, $p=(p^\alpha:\alpha\in\mathcal{P})\in\prod_{\alpha\in\mathcal{P}}<\! S^\alpha \!>$ . For each $\alpha\in\mathcal{P}$ , we define

$r^\alpha_{\sigma}= \max\{0,A^\alpha(\sigma,p^{-\alpha})- A^\alpha(p)\},\qquad \forall \sigma\in{\mathcal S}^\alpha.$

Note, $r^\alpha_\sigma$ quantifies the additional reward that $\alpha$ could get by playing pure strategy $\sigma$ instead of mixed strategy $p^\alpha$ . A player just played $p^\alpha$ , one might imagine that a player would next place more emphasis on strategies $\sigma$ for which $r^\alpha_{\sigma}>0$ . Hence, we define function $P:p\mapsto q$ where for each player $\alpha\in\mathcal{P}$ the components of $q^\alpha$ are given by

$q_{\sigma}^\alpha=\frac{p^\alpha_{\sigma}+r^\alpha_{\sigma}}{\sum_{\sigma'\in{\mathcal S}^\alpha} \{ p^\alpha_{\sigma'}+r^\alpha_{\sigma'}\} },\qquad \forall \alpha\in\mathcal{P},\quad \sigma \in {\mathcal S}^\alpha.$

One can check that $P$ is a continuous function from $\prod_{\alpha\in\mathcal{P}}<\! S^\alpha \!>$ to itself. Thus by the Brouwer Fixed Point Theorem there exists some vector $p$ such that $P(p)=p$ . From now on we let $p$ denote this fixed point.

We now attempt to show that $r^\alpha_\sigma=0$ for all $\alpha\in\mathcal{P}$ , $\sigma\in{\mathcal S}^\alpha$ . Suppose for some $\alpha\in\mathcal{P}$ that $r^\alpha_\sigma>0$ for all $\sigma\in{\mathcal S}^\alpha$ with $p^\alpha_\sigma>0$ then $A^\alpha(\sigma,p^{-\alpha}) > A^\alpha(p)$ for all $\sigma\in{\mathcal S}^\alpha$ with $p^\alpha_\sigma>0$ . Thus

$A^\alpha(p)=\sum_{\sigma\in{\mathcal S}^\alpha} p^\alpha_\sigma A^\alpha(\sigma,p^{-\alpha})> \sum_{\sigma\in{\mathcal S}^\alpha} p^\alpha_\sigma A^\alpha(p)=A^\alpha(p)$

This is clearly a contradiction. So for each player $\alpha$ there exist some strategy $\sigma$ such that $r^\alpha_\sigma=0$ and $p^\alpha_\sigma>0$ , but then for this $\sigma$

$p^\alpha_\sigma = \frac{p^\alpha_{\sigma}+r^\alpha_{\sigma}}{\sum_{\sigma'\in{\mathcal S}^\alpha} \{ p^\alpha_{\sigma'}+r^\alpha_{\sigma'}\} } = \frac{p^\alpha_{\sigma}}{1+\sum_{\sigma'\in{\mathcal S}^\alpha} r^\alpha_{\sigma'} } \implies \sum_{\sigma'\in{\mathcal S}^\alpha}r^\alpha_{\sigma'} =0 \implies r^\alpha_{\sigma'} =0,\quad \forall \sigma'\in {\mathcal S}^\alpha$

In the first implication above, we just rearrange our equation. In the second implication we note that $r^\alpha_{\sigma'}\geq 0$ for each $\sigma'$ .

Since for each $\alpha\in\mathcal{P}$ we have shown $r^\alpha_{\sigma} =0,\quad \forall \sigma\in {\mathcal S}^\alpha$ , it must be that $A^\alpha(\sigma,p^{-\alpha}) \leq A^\alpha(p)$ for all $\sigma\in{\mathcal S}^\alpha$ . Thus for all $q^\alpha\in <\! {\mathcal S}^\alpha \! >$

$A^\alpha(q^\alpha,p^{-\alpha})= \sum_{\sigma\in{\mathcal S}^\alpha} q^\alpha_{\sigma} A(\sigma,p^{-\alpha})\leq \sum_{\sigma\in{\mathcal S}^\alpha} q^\alpha_{\sigma} A^\alpha(p)=A^\alpha(p).$

So as the above expression holds for each player $\alpha$ , $p$ is a Nash equilibrium. $\square$

Ex. [Prisoner’s Dilemma] Two men are taken prisoner by the authorities. They are interviewed separately and asked to confess to the other prisoners involvement in a crime. A prisoner that does not confess receives $1$ year in prison. A prisoner that confesses adds $6$ years to the other prisoner’s sentence. This game can be expressed by the matrix

$\begin{aligned} &\text{don't confess} \quad \text{confess} \\ \begin{array}{c} \text{don't confess} \\ \text{confess} \end{array} &\left( \begin{array}{ccc} (1,1) &\quad & (7,0) \\ (0,7) &\quad & (6, 6) \end{array} \right)\end{aligned}$

For each entry $(a,b)$ in the above matrix the left entry $a$ gives the row players sentence and $b$ gives the column player’s sentence.

It is clear the the best outcome for both players is to not confess. Let’s think about what might happen. If row player confesses then column player may either confess and get $6$ years in prison or not confess and get $7$ years. So it is better for him to confess. Similarly, if row player does not confess then column player either gets $1$ year for not confessing or $0$ years for confessing. So again it is better for him to confess. So column player decides to confess. By symmetry row player must do the same and thus both get $6$ years in prison!

A little more mathematically, if row player does not confess with probability $p$ and does confess with probability $1-p$ then it is in column players interest to choose a strategy minimizing

$\min\{p+7(1-p), 0p+ 6(1-p)\} = \min \{\underbrace{1+ 6(1-p)}_{\text{don't confess}},\; \underbrace{6(1-p)}_{\text{confess}}\}.$

So it is better to confess. The prisoner’s dilemma is a strikingly simple example of non-cooperation giving an inefficient outcome.

Ex. [Battle of the Sexes] Two students decide to go on a date. One wants to go see the Ajax play the other wants to see a show the National Ballet at the Stopera. If they can’t agree then the date will not happen. The rewards to this game are $\begin{aligned} &\quad\text{Ajax} \qquad \text{Stopera} \\ \begin{array}{c} \text{Ajax} \\ \text{Stopera} \end{array} &\left( \begin{array}{ccc} (3,1) &\quad & (0,0) \\ (0,0) &\quad & (1, 3) \end{array} \right)\end{aligned}$

Find the Nash equilibria for this game.

Ex. [Symmetric Nash Equilibrium] We consider an $n$ -person game which is symmetric for each player: each player can choose amongst the same set of pure strategies $\tilde{{\mathcal S}}$ and given other players choices the reward for each player is the same i.e. for all players $\alpha,\beta\in\mathcal{P}$ , the rewards are equal $A^\alpha(s^\alpha,s^{-\alpha})=A^\beta(t^\beta,t^{-\beta})$ whenever $s^\alpha=t^\beta$ and $s^{-\alpha}=t^{-\beta}$ .

Show that there exists a symmetric Nash Equilibrium that is a Nash Equilibrium $p=(p^\alpha:\alpha\in\mathcal{P})$ where each player has the same mixed strategy $p^\alpha=p^\beta$ for all $\alpha, \beta\in\mathcal{P}$ .

Ex. [Repeated Prisoner’s Dilemma] We assume the criminals from the Prisoner’s Dilemma (Example [nGame:Prisoner]) are repeat offenders. The are repeatedly arrested by the police and interviewed. Each time they can chose whether to confess or not and afterwards they find out if their fellow prisoner confessed or not. Recall the costs of the prisoners dilemma were

Now the payoffs are the same in our previous example, except each time they meet their payoffs are discounted by a multiplicative factor $(1+r)$ , for some $r>0$ i.e. the $t$ -th time the prisoners are interviewed, we multiply the above matrix by $(1+r)^{-t}$ . We now consider two questions one where time is finite $t=1,...,T$ and one where time is infinite $t=1,2,3,...$ .

a) Show that if this repeated game is played for a finite set of times $t=0,1,...,T$ then it is in the interested of each player to confess at each time. [Hint: Argue for time $T$ and work backwards].

A punishing strategy is a strategy where the prisoner will not confess at every round unless his fellow prisoner confesses. If his fellow prisoner confesses at one time instance then the prisoner will confesses for all subsequent time. I.e. the strategy places a heavy penalty on the opponent for confessing.

b) If this repeated game is played for an infinite set of times $t=0,1,2,3,...$ and if $r$ is suitably small then show that both players playing a punishing strategy is a Nash Equilibrium.

Find the Nash Equilibria of the game with rewards

$\begin{aligned} \left( \begin{array}{cc} (2,4) & (3,5) \\ (1,1) & (4, 0) \end{array} \right)\end{aligned}$

For each entry $(a,b)$ in the above matrix the left entry $a$ gives the row player’s reward and $b$ gives column player’s reward.

Ex. A traveller must commute to work. He can choose to buy a ticket for a cost of 20 euros or he can chose not to buy a ticket and thus pay nothing. The railway company employs some (expensive) conductors. It can choose to either to put a conductor on the commuters train or not. It costs 10 euros for a conductor to inspect the passenger’s ticket. If the commuter is caught by the conductor, then the commuter must pay a fine of 60 euros.

We view this as a two player game between the railway company and the commuter.

a) Write the rewards of this game in a matrix?

b) For a game with two players, define what is meant by a Nash equilibrium for both pure and mixed strategies?

c) Does this game have any pure strategy equilibria? Does this game have any mixed strategy equilibria? If the answer to either question is yes then calculate all corresponding equilibria.

d) Suppose that the train company decides to change its fines. How much should the fine be increased by in order to guarantee 99% of commuters pay their fares?

Share this:

Leave a comment Cancel reply