We prove a powerful inequality which provides very tight gaussian tail bounds “” for probabilities on product state spaces . Talagrand’s Inequality has found lots of applications in probability and combinatorial optimization and, if one can apply it, it generally outperforms inequalities like Azzuma-Hoeffding.

**Thrm** [Talagrand’ Concentration Inequality] For a measurable subset of the product space ,

and, consequently,

Here,

where, .

**Proof.** The duality between the two characterizations of are given in the appendix to this section.

We will prove the inequality holds by induction on . Note for the case , so holds since .^{1}

Now assuming our induction hypothesis up to , we prove the result for the case. We do so by proving the following claim

**Claim:** For , and ,

where,

- In equality , we over estimate by two distances on : the first, a refined distance using information about , ; and the second, a cruder distance using only information about and , . for rectangular sets only.

**Proof of claim**: We prove the bound by considering the convex combination of points in and . Note if is a point in then there exists a such that for each , also for the case. So, if then . By a similar easy argument, we see that if then . It is also clear that these inclusions extend on convex combinations and . Thus we see that if and then

Also, using convexity of ,

Optimizing over and , as required, we get

This completes the proof of our claim and we continue with the proof of Talagrand’s Inequality.

Continuing to write and applying of inequality , we have for the conditional expectation that For second inequality, above, we apply Holder’s Inequality; in the third inequality, we apply our induction hypothesis; in the fourth inequality, we use the fact that , which we prove in the appendix to this section.

Now taking expectations over , and observing , we gain the required result:

## Applications

Let’s informally discuss the application of Talagrand’s Inequality and then give a few examples of how to apply the result. First we recall that the Hamming distance is , so Talagrand considers a weighted Hamming distance from the set of , . A result of a similar form was known to hold previously to Talagrand’s, namely,^{2}

However the bound is vastly improved by allowing us to optimize over given : from an event it is much more possible to deduce that for some , . In particular, we are in a strong position if we can show a bound of the form

Note by multiplying by both bounds are equivalent. In both cases, we imagine that, if making a change so some of the compenents of gives , then we have a way of estimating the change in for each component changed. Note the (left) bound above implies

Thus if we can bound above by , (the smaller the better) and we take an event then . Thus, for this choice of , implies . Thus applying Talagrand gives

For median, , inspecting the above bound with and and then with and yields a bound

At this point, we have a pretty convincing concentration bound about the median of the distribution of .

One might prefer work with the mean of F(x), , rather than the median, . Note, a Cantelli’s inequality implies that the the mean is withing one standard deviation (see Lemma [Ineqs:MeanMedian]). We can estimate the variance of from the above concentration inequality: the variance about the mean is always minimal so

So, which gives and so

So once again we get a tail-bound with the same concentration behaviour.

### Longest Increasing Subsequence

Given independent random variables are independent real valued random variables, we consider the problem of finding the following Longest Increasing Subsequence:

Comparing two (similar) sequences and notice a longest subsequence of , could be used as a (long) subsequence for . That is

We take such that if and otherwise. If we consider the distance , then we have

which, on the event , gives

Note given we are working with an event of the form we look for an event that implies is of a certian size. Since the function is increasing for , then implies . Applying this then Talagrands inequality gives

For the median of this distibution, then setting and setting gives two tail bounds in the following result.

**Theorem**

Note the bounds do not look not terribly symmetric but, in order to achieve a guaranteed probability in both bounds, we must choose of the same order, , in both bounds.

# Appendix

We first show the equality of our primal-dual characterization for .

For

It is that this is equivalent to showing that

The left-hand side is minimized by . Thus, after substituting, we are required to show

We can equivalently expressed this inequality as an integral

The term in the square brackets is negative: is concave and is convex, both terms are equal when so the rest of the time . This completes the lemma.