I’ve updated the stochastic control notes here:
These remain a work in progress.
(Reports of any typos, errors, or mishaps are welcome.)
A quick summary of what is new:
- I’ve updated a section on the ODE method for stochastic approximation.
- I’ve improved the discussion around Temporal Difference methods and included some proofs.
- I’ve added a proof of convergence for SARSA.
- I’ve added a section on Lyapunov functions in continuous time.
- LaSalle, exponential convergence, and online convex optimization…
- I’ve started a section on Policy Gradients, but there are more recent proofs to include.
- I’ve started a section on Deep Learning for RL.