Multi-Level Monte Carlo (MLMC)

Multi-Level Monte Carlo is a Monte Carlo method for calculating numerically accurate estimates when fine-grained estimates are expensive but cheap coarse-grained estimates can be used to supplement them. We consider the simulation of stochastic differential equations, which is the application for which the method was first proposed, but we note that the approach applies in a variety of other settings.
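To make the idea concrete, here is a minimal sketch of the telescoping MLMC estimator (the function names and the geometric-Brownian-motion example are my own, not taken from the post), where each level's correction couples a fine and a coarse Euler discretisation:

```python
import math
import random

def mlmc_estimate(sampler, L, n_per_level):
    """Telescoping MLMC estimator:
        E[P_L] = E[P_0] + sum_{l=1}^{L} E[P_l - P_{l-1}].
    sampler(l) returns a coupled pair (P_l, P_{l-1}) built from the same
    randomness; at level 0 the second component is taken to be 0.
    """
    est = 0.0
    for l in range(L + 1):
        pairs = [sampler(l) for _ in range(n_per_level[l])]
        est += sum(fine - coarse for fine, coarse in pairs) / len(pairs)
    return est

def coupled_euler_pair(level, T=1.0, x0=1.0, mu=0.05, sigma=0.2):
    """One coupled sample for geometric Brownian motion via Euler-Maruyama.
    The coarse path (half as many steps) reuses the fine path's Brownian
    increments, which is what makes the level-l correction low-variance."""
    n = 2 ** level
    dt = T / n
    x_fine = x_coarse = x0
    dw_sum = 0.0
    for k in range(n):
        dw = random.gauss(0.0, math.sqrt(dt))
        x_fine += mu * x_fine * dt + sigma * x_fine * dw
        dw_sum += dw
        if level > 0 and k % 2 == 1:  # coarse step: twice the step size
            x_coarse += mu * x_coarse * (2 * dt) + sigma * x_coarse * dw_sum
            dw_sum = 0.0
    return (x_fine, 0.0) if level == 0 else (x_fine, x_coarse)

random.seed(1)
est = mlmc_estimate(coupled_euler_pair, L=3, n_per_level=[20000, 10000, 5000, 2000])
# est should be close to E[X_T] = x0 * exp(mu * T) = exp(0.05) ≈ 1.0513
```

The point of the decreasing sample counts is that most samples are drawn at the cheap coarse levels, while the coupled corrections at the finer, more expensive levels have small variance and so need only a few samples.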

Continue reading “Multi-Level Monte Carlo (MLMC)”

Markov Chain Monte Carlo (MCMC)

Markov chain Monte Carlo is a variant of Monte Carlo in which samples are no longer independent but are instead drawn from a Markov chain. This can be useful in Bayesian statistics, or when we sequentially adjust a small number of parameters of a more complex joint distribution.

We cover MCMC, its use in Bayesian statistics, Metropolis-Hastings, Gibbs Sampling, and Simulated Annealing.
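As a small taste of one of the topics listed, here is a minimal random-walk Metropolis sketch (a special case of Metropolis-Hastings with a symmetric Gaussian proposal); this is my own illustration, not code from the post:

```python
import math
import random

def metropolis_hastings(log_target, x0, n_samples, step=1.0):
    """Random-walk Metropolis: propose x' = x + N(0, step^2) and accept
    with probability min(1, pi(x') / pi(x)). The target density pi only
    needs to be known up to a normalizing constant."""
    chain = [x0]
    x, lp = x0, log_target(x0)
    for _ in range(n_samples):
        prop = x + random.gauss(0.0, step)  # symmetric proposal
        lp_prop = log_target(prop)
        # accept with probability min(1, exp(lp_prop - lp))
        if lp_prop >= lp or random.random() < math.exp(lp_prop - lp):
            x, lp = prop, lp_prop
        chain.append(x)
    return chain

# Target: standard normal; an unnormalized log-density suffices.
random.seed(0)
chain = metropolis_hastings(lambda x: -0.5 * x * x, x0=0.0, n_samples=20000)
```

After a burn-in period, the (correlated) samples in `chain` have approximately the target distribution, so sample averages over the chain approximate expectations under the target.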

Continue reading “Markov Chain Monte Carlo (MCMC)”

Monte-Carlo (MC)

Due to some projects, I’ve decided to start a set of posts on Monte-Carlo and its variants. These include Monte-Carlo (MC), Markov chain Monte-Carlo (MCMC), Sequential Monte-Carlo (SMC) and Multi-Level Monte-Carlo (MLMC). I’ll probably expand these posts further at a later point.

Here we cover “vanilla” Monte-Carlo, importance sampling and self-normalized importance sampling:
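For the importance-sampling part, a minimal sketch (my own illustration, with hypothetical function names) of the identity E_p[f(X)] = E_q[f(X) p(X)/q(X)]:

```python
import math
import random

def importance_sampling(f, log_w, sample_q, n):
    """Estimate E_p[f(X)] = E_q[f(X) w(X)] with w = p/q, sampling X ~ q."""
    return sum(f(x) * math.exp(log_w(x)) for x in (sample_q() for _ in range(n))) / n

# Example: tail probability P(X > 3) for X ~ N(0,1), with proposal q = N(3,1).
# The Gaussian normalizing constants cancel, so log w(x) = -x^2/2 + (x-3)^2/2.
random.seed(0)
est = importance_sampling(
    f=lambda x: 1.0 if x > 3 else 0.0,
    log_w=lambda x: -0.5 * x * x + 0.5 * (x - 3) ** 2,
    sample_q=lambda: random.gauss(3.0, 1.0),
    n=50_000,
)
# True value is Phi(-3) ≈ 1.35e-3; vanilla Monte Carlo would need millions
# of samples to estimate this rare event with comparable accuracy.
```

Shifting the proposal towards the rare event is what makes the estimator efficient here; self-normalized importance sampling additionally divides by the average weight, which is useful when p is only known up to a constant.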

Continue reading “Monte-Carlo (MC)”

Neural Tangent Kernel

The Neural Tangent Kernel is a way of understanding the training behaviour of neural networks by relating them to kernel methods. Here we give an overview of the results of the paper by Jacot et al. The paper considers a deep neural network with a fixed dataset and a fixed depth. The weights are initially independent and normally distributed, and we take the limit where the width of each layer tends to infinity.
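At finite width one can compute the empirical tangent kernel directly, namely the inner product of parameter gradients of the network output at two inputs. The sketch below is my own construction (not code from the post), using the paper's 1/sqrt(width) scaling of the weights and a crude finite-difference gradient; the paper's result is that as the widths grow this random kernel becomes deterministic and stays constant during training:

```python
import numpy as np

rng = np.random.default_rng(0)

def init_params(widths):
    """Weights are i.i.d. standard normal; the 1/sqrt(fan_in) scaling
    happens in the forward pass (the NTK parameterization)."""
    return [rng.standard_normal((m, n)) for m, n in zip(widths[1:], widths[:-1])]

def forward(params, x):
    """Fully connected network with tanh hidden layers, scalar output."""
    h = x
    for i, W in enumerate(params):
        h = W @ h / np.sqrt(W.shape[1])
        if i < len(params) - 1:
            h = np.tanh(h)
    return h.item()

def grad(params, x, eps=1e-5):
    """Central finite-difference gradient of the output w.r.t. all weights."""
    parts = []
    for W in params:
        gW = np.zeros_like(W)
        it = np.nditer(W, flags=["multi_index"])
        for _ in it:
            idx = it.multi_index
            old = W[idx]
            W[idx] = old + eps
            fp = forward(params, x)
            W[idx] = old - eps
            fm = forward(params, x)
            W[idx] = old
            gW[idx] = (fp - fm) / (2 * eps)
        parts.append(gW.ravel())
    return np.concatenate(parts)

def empirical_ntk(params, x1, x2):
    """Theta(x1, x2) = <grad_theta f(x1), grad_theta f(x2)>."""
    return float(grad(params, x1) @ grad(params, x2))

x1, x2 = np.array([1.0, 0.5]), np.array([-0.3, 0.8])
params = init_params([2, 32, 1])
k12 = empirical_ntk(params, x1, x2)  # symmetric: equals empirical_ntk(params, x2, x1)
```

Being a Gram matrix of gradients, the empirical kernel is symmetric and positive semi-definite, which is what allows the kernel-methods viewpoint in the first place.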

Continue reading “Neural Tangent Kernel”

Stochastic Control Notes Update – 2021

I’ve updated the notes for this year’s stochastic control course, here:

Aside from general tidying, new material includes:

  • Equilibrium distributions of Markov chains
  • Occupancy measure of infinite time horizon MDPs
  • Linear programming as an algorithm for solving MDPs
  • Convergence of Asynchronous Value Iteration
  • (s,S) Inventory Control
  • POMDP (though there is still more to add)
  • Calculus of Variations (though there is still more to add)
  • Pontryagin’s Maximum Principle
  • Linear-Quadratic Lyapunov functions (Sylvester’s equation and Hurwitz matrices)
  • (some) Online Convex Optimization
  • Stochastic Bandits (UCB and Lai-Robbins Lower bound)
  • Gittins’ Index Theorem
  • Sequential/Stochastic Linear Regression (Lai and Wei)
  • More discussion on TD methods
  • Discussion on double Q-learning and Dueling/Advantage updating
  • Convergence proof for SARSA
  • Policy Gradients (some convergence arguments from Bhandari and Russo, but still more to do)
  • Cross Entropy Method (but still more to do)
  • Several new appendices (but mostly from old notes)
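For readers new to the MDP material in the list above, value iteration is a natural entry point. Here is a minimal sketch of it for a finite MDP (my own illustration, not code from the notes):

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Value iteration for a finite MDP.

    P[a][s][t]: probability of moving from state s to t under action a.
    R[a][s]:    expected reward for taking action a in state s.
    Repeatedly applies the Bellman optimality operator, a gamma-contraction,
    and returns the (approximately) optimal values and a greedy policy.
    """
    n_states, n_actions = len(R[0]), len(R)
    V = [0.0] * n_states
    while True:
        Q = [[R[a][s] + gamma * sum(P[a][s][t] * V[t] for t in range(n_states))
              for s in range(n_states)] for a in range(n_actions)]
        V_new = [max(Q[a][s] for a in range(n_actions)) for s in range(n_states)]
        if max(abs(V_new[s] - V[s]) for s in range(n_states)) < tol:
            policy = [max(range(n_actions), key=lambda a: Q[a][s])
                      for s in range(n_states)]
            return V_new, policy
        V = V_new

# One state, two actions: action 0 pays reward 1, action 1 pays 0; both stay put.
V, policy = value_iteration(P=[[[1.0]], [[1.0]]], R=[[1.0], [0.0]], gamma=0.9)
# Optimal value is 1 / (1 - 0.9) = 10, achieved by always taking action 0.
```

The asynchronous variant discussed in the notes updates one state at a time rather than sweeping all states, and converges under similar contraction arguments.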

Like last year, I will likely update the notes further (and correct typos) towards the end of the course.