Stochastic Control Notes Update – 2021 – Applied Probability Notes

I’ve updated the notes for this year’s stochastic control course, here:

Stochastic_Control_2021_Jan.pdf

Aside from general tidying, new material includes:

Equilibrium distributions of Markov chains
Occupancy measure of infinite time horizon MDPs
Linear programming as an algorithm for solving MDPs
Convergence of Asynchronous Value Iteration
(s,S) Inventory Control
POMDP (though there is still more to add)
Calculus of Variations (though there is still more to add)
Pontryagin’s Maximum Principle
Linear-Quadratic Lyapunov functions (Sylvester’s equation and Hurwitz matrices)
(some) Online Convex Optimization
Stochastic Bandits (UCB and Lai-Robbins Lower bound)
Gittins’ Index Theorem.
Sequential/Stochastic Linear Regression (Lai and Wei)
More discussion on TD methods
Discussion on double Q-learning and Dueling/Advantage updating
Convergence proof for SARSA
Policy Gradients (some convergence arguments from Bhandari and Russo, but still more to do)
Cross Entropy Method (but still more to do)
Several new appendices (but mostly from old notes)

Like last year, I will likely update the notes further (and correct typos) towards the end of the course.

2 thoughts on “Stochastic Control Notes Update – 2021”

Very nice notes, I like the probability appendix especially (for being concise yet wide-ranging). I was looking at the Robbins-Monro proofs, and a few typos I found:

– p. 153, “Doob’s Martingale Convergence Theorem” has “Convergnce” by mistake
– The final line in the proof of Thm 120 uses a_n instead of alpha_n
– p. 149, “An Easy Robbin’s Monro Proof”, no apostrophe needed (and elsewhere, replace “Robbin” with “Robbins”)
– In the next paragraph, instead of assuming sum_n E[y_n] is bounded, I think you need y_n^2 instead of y_n, as in the proof you need the term e_n bounded.
– Need a search-and-replace to replace “Munro” with “Monro”

Typo in the notes on page 8: the transition function f is defined as $f: X \times A \rightarrow A$, but it should be $f: X \times A \rightarrow X$.
