Sequentially a player decides to play and his adversary decides . At time , a decision results in a vector payoff . Given is the average vector payoff at time , Blackwell’s Approachability Theorem is a necessary and sufficient condition so that, regardless of the adversary’s decisions, the player makes the sequence of vectors approach a convex set .

- We consider a sequence . Here each is a probability distribution on a set of (pure) decisions and similarly is a probability distribution on .
- For each pair , there is a payoff vector and average payoff vector . Here
- At each time , we assume is chosen as a function of and the past decisions .
- We say that approaches a set if the distance between and converges to zero, namely,

**Theorem** (Blackwell’s Approachability Theorem): For closed convex set , the following are equivalent

- is approachable.
- For every there exists such that .
- Every half space containing is approachable.
^{1}

Before a proof we comment on how Blackwell’s Approachability Theorem relates to the Minimax Theorem.

- For , holds iff the following are equivalent for every i. s.t. ,[“without knowing , we can think of a good ”]ii. s.t. .
[“if we know , we can think of a good ”]

- Blackwell’s Approachability Theorem now makes the assumption that is a vector.
- The equivalence between (1) and (2) found in Blackwell’s Approachability Theorem is analogous to the equivalence between (i) and the seemingly weaker statement (ii) found in the Minimax Theorem.
- However, we note that Approachability (1) is weaker than the statement s.t. . Approachability states (chosen sequentially) so that on average converges to as .
- Finally, we observe that (3) approaching a half-space corresponds to applying the Minimax Theorem to game matrix whose -th component is .

**Proof**: : For every containing ,

: For every containing ,

: so is immediate.

: Firstly note

At time , let the projection be closest point to in and let be the half-space containing defined by normal and . In other words, is defined by the hyperplane through perpendicular to . Since is approachable, we can see from there must exist a such that for all distributions . Using this .

As , we now extract from the term :

Substituting this expression, we have

Thus multiplying both sides by and rearranging gives

Summing these interpolating terms gives

Thus, as required,

(Typo: the term 2A_{\max}^2 in the last 4 displays above should be 4A_{\max}^2.)

- Note that our proof is constructive. We define a policy for choosing such thatwhere from , the projection of onto we define and .

We now consider one interesting consequence of this result.

**Theorem** (Hannan-Gaddum Theorem)

Suppose as defined in is such that . There exists a playing strategy such that for any

In other words, our performance in the game is asymptotically as good as the best fixed action.

**Proof:**

We verify Blackwell’s Approachability Theorem for the vector payoff and convex region . For all there exists such that component-wise , in particular we choose where . This verifies condition 2. of Blackwell’s Approachability Theorem. Thus there exists a strategy such that for all

which is equivalent to the required expression above.

- Recall a half-space in is a set for some and .↩

Excellent notes! One question: in the proof of the Blackwell’s Approachability Theorem, (3) -> (1): I wonder why ||A(p_t, q_t), a_t||^2 can be uniformly bounded from above by a constant 2A_{max}^2?

LikeLike

Excellent notes! For the proof of the Blackwell’s Approachability Theorem, (3) to (1): Can I ask why ||A(p_t, q_t) – a_t||^2 can be uniformly bounded from above by a constant 2 A_{max}^2?

LikeLike