Sequential Monte-Carlo is a general method of sampling from a sequence of probability distributions .
Here, we have the update equations
Notice if we take then any sequence of distributions can be realized by the above recursion.1 Except special cases, we cannot calculate the integrals above when the state space is infinite (or simply large).
However, we can approximately sample from the required distribution and use that as a proxy. We can then update with the same equations as above replacing the continuous with the discrete which is achieved by taking samples. This is called Sequential Monte-Carlo (SMC).
More precisely, SMC does the following:
1) Resample. For
It is worth noting that the resampled RVs from the distribution and thus are a random subset of . Since we then apply transitions we should, in general, recover distinct points. We assume that it is easy to sample from . Further notice, the algorithm is always dealing in ratios of the distributions . Thus we do not have to specify normalizing constants for these distributions if we do not know them (which is often the case in the world of Bayesian statistics). A final important point, is that although we assume to be a state in the set . We could allow it to be the set of previous states . This is what we do in the hidden Markov models (discussed below).
Roughly why this works. Although we can say more precisely why things work out for this algorithm, let’s sketch out why it works. Suppose that is a good approximation of . So much so that we assume that (and thus ). Notice that from infinitely many resamples from and gives:
We can apply the strong law of large numbers to each of the two sums above. Notice that
This holds for and . Thus applying to , we see that
Thus we see that in the limit where is large, we should be sampling from the correct distribution.
Hidden Markov Models
As an example SMC can be used for Hidden Markov models. Suppose that us a Markov chain with and transition distributions . Suppose that we receive observations as a function of . That is I.e. the distribution of is conditionally independent of when we condition on .
Like with the Kalman filter we wish to calculate
Letting and and and , holds that
If we are given and , and if the state space of our Markov chain is finite, then we can calculate all the distributions above. However, this is not possible when is not finite and instead we can for instance use Sequential Monte-Carlo which we define next.
We now prove the convergence of SMC (in norm). Just like Monte-Carlo the standard deviation error goes down at a rate . We require the weights to remain bounded by some value .
Theorem. For all bounded continuous it holds that
where is a positive constant.
Proof. The proof proceeds by induction applying the same bounds that we proved from standard Monte-Carlo and self-normalized importance sampling (cf. and Proposition [MC:SNI]).
Notice at time , . Thus by standard Monte-Carlo bounds:
So the result holds at .
We now suppose the induction hypothesis that the result holds at time and we prove that it holds at time .
First notice that, similar to self-normalized importance sampling, we can replace the self-normalized sum with a non-self-normalized sum. Specifically,
Note that the term in square bracket is bounded by . Thus, recalling that the norm is defined by , In the final inequality above, we now that .
We analyze the term , noting that the term is the same when we take . Now note that
Thus for we have
The first term in follows by the same argument that we used to derive . The second term follows by our induction hypothesis applied to the function . Thus, applying the above inequality to (and ), we see that
Thus we see that the require bound holds at time with This completes the induction step and the proof.
Here we have the caveat that we require for any .↩