Sequentially a player decides to play and his adversary decides . At time , a decision results in a vector payoff . Given is the average vector payoff at time , Blackwell’s Approachability Theorem is a necessary and sufficient condition so that, regardless of the adversary’s decisions, the player makes the sequence of vectors approach a convex set .
- We consider a sequence . Here each is a probability distribution on a set of (pure) decisions and similarly is a probability distribution on .
- For each pair , there is a payoff vector and average payoff vector . Here
- At each time , we assume is chosen as a function of and the past decisions .
- We say that approaches a set if the distance between and converges to zero, namely,
Theorem (Blackwell’s Approachability Theorem): For closed convex set , the following are equivalent
- is approachable.
- For every there exists such that .
- Every half space containing is approachable.1
Before a proof we comment on how Blackwell’s Approachability Theorem relates to the Minimax Theorem.
- For , holds iff the following are equivalent for every i. s.t. ,[“without knowing , we can think of a good ”]
ii. s.t. .
[“if we know , we can think of a good ”]
- Blackwell’s Approachability Theorem now makes the assumption that is a vector.
- The equivalence between (1) and (2) found in Blackwell’s Approachability Theorem is analogous to the equivalence between (i) and the seemingly weaker statement (ii) found in the Minimax Theorem.
- However, we note that Approachability (1) is weaker than the statement s.t. . Approachability states (chosen sequentially) so that on average converges to as .
- Finally, we observe that (3) approaching a half-space corresponds to applying the Minimax Theorem to game matrix whose -th component is .
Proof: : For every containing ,
: For every containing ,
: so is immediate.
: Firstly note
At time , let the projection be closest point to in and let be the half-space containing defined by normal and . In other words, is defined by the hyperplane through perpendicular to . Since is approachable, we can see from there must exist a such that for all distributions . Using this .
As , we now extract from the term :
Substituting this expression, we have
Thus multiplying both sides by and rearranging gives
Summing these interpolating terms gives
Thus, as required,
- Note that our proof is constructive. We define a policy for choosing such thatwhere from , the projection of onto we define and .
We now consider one interesting consequence of this result.
Theorem (Hannan-Gaddum Theorem)
Suppose as defined in is such that . There exists a playing strategy such that for any
In other words, our performance in the game is asymptotically as good as the best fixed action.
We verify Blackwell’s Approachability Theorem for the vector payoff and convex region . For all there exists such that component-wise , in particular we choose where . This verifies condition 2. of Blackwell’s Approachability Theorem. Thus there exists a strategy such that for all
which is equivalent to the required expression above.
- Recall a half-space in is a set for some and .↩