Stochastic Control Notes Update – 2021

I’ve updated the notes for this year’s stochastic control course, here:

Aside from general tidying, new material includes:

  • Equilibrium distributions of Markov chains
  • Occupancy measure of infinite time horizon MDPs
  • Linear programming as an algorithm for solving MDPs
  • Convergence of Asynchronous Value Iteration
  • (s,S) Inventory Control
  • POMDP (though there is still more to add)
  • Calculus of Variations (though there is still more to add)
  • Pontryagin’s Maximum Principle
  • Linear-Quadratic Lyapunov functions (Sylvester’s equation and Hurwitz matrices)
  • (some) Online Convex Optimization
  • Stochastic Bandits (UCB and Lai-Robbins lower bound)
  • Gittins’ Index Theorem
  • Sequential/Stochastic Linear Regression (Lai and Wei)
  • More discussion on TD methods
  • Discussion on double Q-learning and Dueling/Advantage updating
  • Convergence proof for SARSA
  • Policy Gradients (some convergence arguments from Bhandari and Russo, but still more to do)
  • Cross Entropy Method (but still more to do)
  • Several new appendices (but mostly from old notes)

Like last year, I will likely update the notes further (and correct typos) towards the end of the course.

Stochastic Control Notes Update

I’ve updated the stochastic control notes here:

Stochastic_Control_2020_May.pdf

These still remain a work in progress.

(Reports of any typos, errors, or mishaps are welcome.)

A quick summary of what is new:

  • I’ve updated a section on the ODE method for stochastic approximation.
  • I’ve improved the discussion around Temporal Difference methods and included some proofs.
  • I’ve added a proof of convergence for SARSA. 
  • I’ve added a section on Lyapunov functions in continuous time
    • LaSalle, exponential convergence, and online convex optimization…
  • I’ve started a section on Policy Gradients, but there are more recent proofs to include.
  • I’ve started a section on Deep Learning for RL.

Stochastic Control 2020

Another year of MATH69122! — aka Stochastic Control.

This year, I will try to keep updating PDFs with slides and notes for each lecture. I’ll keep notes for the course in the “PDF” tab above. These are also here:

Stochastic Control 2020 [pdf]

Here is a rough plan for each week of lectures:
