Sequentially a player decides to play and his adversary decides
. At time
, a decision
results in a vector payoff
. Given
is the average vector payoff at time
, Blackwell’s Approachability Theorem is a necessary and sufficient condition so that, regardless of the adversary’s decisions, the player makes the sequence of vectors
approach a convex set
.
- We consider a sequence
. Here each
is a probability distribution on a set of (pure) decisions
and similarly
is a probability distribution on
.
- For each pair
, there is a payoff vector
and average payoff vector
. Here

- At each time
, we assume
is chosen as a function of
and the past decisions
.
- We say that
approaches a set
if the distance between
and
converges to zero, namely,
![d(a_t,\mA):=\inf_{\alpha\in\mA} ||a_t-\alpha||_2 \xrightarrow[t\rightarrow\infty]{} 0.](https://appliedprobability.blog/wp-content/uploads/2017/05/e890e818ae0f8f2617b71b2319387dd2.png?w=840)
Theorem (Blackwell’s Approachability Theorem): For closed convex set , the following are equivalent
is approachable.
- For every
there exists
such that
.
- Every half space containing
is approachable.1
Before a proof we comment on how Blackwell’s Approachability Theorem relates to the Minimax Theorem.
- For
,
holds iff the following are equivalent for every
i.
s.t.
,
[“without knowing
, we can think of a good
”]ii.
s.t.
.
[“if we know
, we can think of a good
”]
- Blackwell’s Approachability Theorem now makes the assumption that
is a vector.
- The equivalence between (1) and (2) found in Blackwell’s Approachability Theorem is analogous to the equivalence between (i) and the seemingly weaker statement (ii) found in the Minimax Theorem.
- However, we note that Approachability (1) is weaker than the statement
s.t.
. Approachability states
(chosen sequentially) so that
on average
converges to
as
.
- Finally, we observe that (3) approaching a half-space
corresponds to applying the Minimax Theorem to game matrix
whose
-th component is
.
Proof: : For every
containing
,

: For every
containing
,

:
so is immediate.
: Firstly note

At time , let the projection
be closest point to
in
and let
be the half-space containing
defined by normal
and
. In other words,
is defined by the hyperplane through
perpendicular to
. Since
is approachable, we can see from there must exist a
such that
for all distributions
. Using this
.

As , we now extract
from the term
:
![\begin{aligned} (a_{t+1}-a_t)\cdot (a_t-P(a_t)) &= \frac{A(p_{t+1},q_{t+1})}{t+1}\cdot ( a_t-P(a_t)) - \frac{a_{t}}{t+1}\cdot (a_t-P(a_t)) \notag\\ &= \frac{1}{t+1}\Big[ (A(p_{t+1},q_{t+1}) - P(a_t) )\cdot ( a_t-P(a_t)) - ({a_{t}}-P(a_t)\cdot (a_t-P(a_t)) \Big] \label{Blackwell:sub} \end{aligned}](https://appliedprobability.blog/wp-content/uploads/2017/05/cc7484b2c17e3f38ffcf902da200a863.png?w=840)
Substituting this expression, we have

Thus multiplying both sides by and rearranging gives

Summing these interpolating terms gives

Thus, as required,
![\label{blackwell: conv} || a_t - P(a_t) || \leq A_{max}\sqrt{\frac{2}{t}}\xrightarrow[t\rightarrow\infty]{} 0.](https://appliedprobability.blog/wp-content/uploads/2017/05/6a80e5ee82ad2a45be5c6dc19a448c56.png?w=840)
(Typo: the term 2A_{\max}^2 in the last 4 displays above should be 4A_{\max}^2.)
- Note that our proof is constructive. We define a policy for choosing
such that
where from , the projection of
onto
we define
and
.
We now consider one interesting consequence of this result.
Theorem (Hannan-Gaddum Theorem)
Suppose as defined in is such that
. There exists a playing strategy
such that for any

In other words, our performance in the game is asymptotically as good as the best fixed action.
Proof:
We verify Blackwell’s Approachability Theorem for the vector payoff and convex region
. For all
there exists
such that component-wise
, in particular we choose
where
. This verifies condition 2. of Blackwell’s Approachability Theorem. Thus there exists a strategy
such that for all
![\left\{\frac{1}{t}\sum_{\tau=1}^t A(p_t,q_t) - \frac{1}{t}\sum_{\tau=1}^t A(i,q_t)\right\} \vee 0 \xrightarrow[t\rightarrow\infty]{} 0](https://appliedprobability.blog/wp-content/uploads/2017/05/be273ee37a7620168a82ac31cfdaa4ea.png?w=840)
which is equivalent to the required expression above.
- Recall a half-space in
is a set
for some
and
.↩
Excellent notes! One question: in the proof of the Blackwell’s Approachability Theorem, (3) -> (1): I wonder why ||A(p_t, q_t), a_t||^2 can be uniformly bounded from above by a constant 2A_{max}^2?
LikeLike
Excellent notes! For the proof of the Blackwell’s Approachability Theorem, (3) to (1): Can I ask why ||A(p_t, q_t) – a_t||^2 can be uniformly bounded from above by a constant 2 A_{max}^2?
LikeLike