In the Cross Entropy Method, we wish to estimate the likelihood
Here is a random variable whose distribution is known and belongs to a parametrized family of densities
. Further
is often a solution to an optimization problem.
It is assumed that is a relatively rare event, say of order
. One way to estimate
is to take
IID samples
and then average
This gives us an unbiased estimate. However, because the event is rare, we may require a large number of samples in order to provide a good estimate of
. Importance sampling is a straightforward method that can mitigate these effects. Here sample each
from a different density
and perform the estimate
where W(x) = f(x)/g(x). Such a change can improve the estimation of . For instance, the choice
gives the correct estimate of
with certainty. Clearly
cannot work in practice; it requires knowledge of
, the parameter which we are estimating. Nonetheless a good estimate of
may be possible. The point of the cross entropy method is to find a reasonable choice for
so that importance sampling can be applied.
In particular, the cross-entropy method advocates estimating by
according to a parameter choice that minimizes the relative entropy between
and
. Namely,
With a little rearrangement, we see that this is equivalent to the optimization
With importance sampling in mind, we apply a change of measure to the above integral to get
where . The above optimization may not be explicitly solvable so we replace it with the (importance sampled) optimization
where are IID samples from density
.
In many examples the above optimization in concave and solvable either by descent methods or by differentiating and solving for the condition
The reason for incorporating importance sampling into the estimate is because the event could be too unlikely to form an accurate estimate under the choice
. The idea behind the cross entropy method is to apply a multi-level approach where we sequentially increase
and
as follows
- We estimate the
-th percentile of
by taking by ordering the samples
and calculating the empirical
-th percentile, which gives the update
The event is chosen to be reasonably likely, e.g.
. Note that since, in principle, the parameter
is such that less likely
under
when compared with the previous choice of
.
- Given this
, we find
as the best estimate according to our cross-entropy rule . This, in principle, should increase the likelihood of the event
when compared with prior choice of
.
- When
passes the required target level
then the have found a parameter choice that gives reasonable importance to the event
and we can calculate
with importance sampling rule .