Stochastic Control 2019 – Applied Probability Notes

Q-learning

Q-learning is an algorithm that contains many of the basic structures required for reinforcement learning and acts as the basis for many more sophisticated algorithms. The Q-learning algorithm can be seen as an (asynchronous) implementation of the Robbins-Monro procedure for finding fixed points. For this reason, we will require results from Robbins-Monro when proving convergence.
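
To make the update concrete, here is a minimal sketch of tabular Q-learning. It is an illustration under assumptions, not the construction from the post: the reward convention, the discount factor `beta`, the epsilon-greedy exploration, the step size $1/n$ per state-action pair, and the two-state example `toy_step` are all hypothetical choices.

```python
import random

def q_learning(step, n_states, n_actions, n_steps=20_000, beta=0.5,
               epsilon=0.1, seed=0):
    """Tabular Q-learning along a single trajectory (illustrative sketch).

    `step(x, a, rng)` is an assumed interface returning (reward, next_state).
    """
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    visits = [[0] * n_actions for _ in range(n_states)]
    x = 0
    for _ in range(n_steps):
        # Epsilon-greedy behaviour: explore with probability epsilon.
        if rng.random() < epsilon:
            a = rng.randrange(n_actions)
        else:
            a = max(range(n_actions), key=lambda b: Q[x][b])
        reward, x_next = step(x, a, rng)
        # Asynchronous update: only the visited pair (x, a) changes,
        # with a Robbins-Monro step size 1/n for its n-th visit.
        visits[x][a] += 1
        alpha = 1.0 / visits[x][a]
        target = reward + beta * max(Q[x_next])
        Q[x][a] += alpha * (target - Q[x][a])
        x = x_next
    return Q

def toy_step(x, a, rng):
    # Hypothetical example: action a moves to state a;
    # landing in state 1 pays reward 1, landing in state 0 pays 0.
    x_next = a
    reward = 1.0 if x_next == 1 else 0.0
    return reward, x_next

Q = q_learning(toy_step, n_states=2, n_actions=2)
```

For this toy problem with `beta=0.5`, solving the Bellman equation by hand gives $Q^*(x,1)=2$ and $Q^*(x,0)=1$ for both states. With step sizes $1/n$ the iterates approach these values slowly, but the ordering $Q(x,1) > Q(x,0)$ appears as soon as each pair has been visited.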

Robbins-Monro

We review a method for finding fixed points, then extend it to slightly more general, modern proofs. This is a much more developed version of an earlier post. We now cover the basic Robbins-Monro proof, the Robbins-Siegmund Theorem, Stochastic Gradient Descent and asynchronous updates (as required for Q-learning).
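
As a rough illustration of the fixed-point form of the procedure, the sketch below iterates $x_{n+1} = x_n + \alpha_n (F(x_n) + \epsilon_n - x_n)$ with $\alpha_n = 1/n$. The map $F(x) = 0.5x + 1$ (fixed point $x^* = 2$), the Gaussian noise and the function names are assumptions chosen for the example, not taken from the post.

```python
import random

def robbins_monro(noisy_F, x0, n_steps=20_000, seed=0):
    """Stochastic approximation of a fixed point x = F(x) (illustrative sketch)."""
    rng = random.Random(seed)
    x = x0
    for n in range(1, n_steps + 1):
        # Step sizes satisfy sum(alpha) = infinity and sum(alpha^2) < infinity.
        alpha = 1.0 / n
        # Move a fraction alpha towards a noisy evaluation of F at x.
        x += alpha * (noisy_F(x, rng) - x)
    return x

def noisy_F(x, rng):
    # Hypothetical example: F(x) = 0.5 * x + 1 observed with N(0, 1) noise.
    return 0.5 * x + 1.0 + rng.gauss(0.0, 1.0)

x_hat = robbins_monro(noisy_F, x0=0.0)
```

Only noisy evaluations of $F$ are used, and the decreasing step sizes average the noise out, so `x_hat` should end up close to 2.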

Merton Portfolio Optimization

HJB equation for Merton Problem; CRRA utility solution; Proof of Optimality.
Multiple Assets; Dual Value function Approach.
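
For orientation, here is the classical form of the single-asset result, in notation that is assumed here rather than taken from the post: with a risk-free rate $r$, a risky asset with drift $\mu$ and volatility $\sigma$, and CRRA utility $u(c) = c^{1-R}/(1-R)$ with $R>0$, $R \neq 1$, the optimal fraction of wealth held in the risky asset is the constant

$\pi^\star = \frac{\mu - r}{R\,\sigma^2}.$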

Diffusion Control Problems

We consider a continuous time analogue of Markov Decision Processes.

Stochastic Integration: A Quick Summary

What follows is a heuristic derivation of the Stochastic Integral, Stochastic Differential Equations and Itô’s Formula.
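
As a reminder of where the derivation ends up, in notation assumed for this summary: if $dX_t = \mu_t\,dt + \sigma_t\,dB_t$ for a Brownian motion $B_t$, and $f$ is twice continuously differentiable, then Itô's Formula reads

$df(X_t) = f'(X_t)\,dX_t + \tfrac{1}{2} f''(X_t)\,\sigma_t^2\,dt.$

Heuristically, this is a second-order Taylor expansion in which $(dB_t)^2$ is replaced by $dt$ and terms of smaller order are dropped.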

Continuous Time Dynamic Programming

Discrete-time dynamic programming was covered in the post Dynamic Programming. We now consider the continuous-time analogue.

Optimal Stopping

An Optimal Stopping Problem is a Markov Decision Process in which there are two actions: $a=0$, meaning stop, and $a=1$, meaning continue. There are two types of cost:

$c(x,a)=\begin{cases} \kappa(x),& \text{for }a=0\quad\textit{ (the stopping cost)}\\ c(x),& \text{for }a=1\quad\textit{ (the continuation cost)}. \end{cases}$

This defines a stopping problem.
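
A stopping problem of this kind can be solved numerically by value iteration on $V(x) = \min\{\kappa(x),\; c(x) + \sum_y P(x,y)V(y)\}$. The sketch below assumes a finite state space, non-negative costs, no discounting and a known transition matrix `P`; the five-state chain and all names are hypothetical choices for illustration.

```python
def optimal_stopping(kappa, c, P, n_iter=1000, tol=1e-12):
    """Value iteration for a finite-state stopping problem (illustrative sketch).

    kappa[x]: stopping cost, c[x]: continuation cost,
    P[x][y]: probability of moving from x to y when continuing.
    Returns (V, policy) with policy[x] = 0 (stop) or 1 (continue).
    """
    n = len(kappa)

    def continuation(V):
        # Cost of continuing once, then following the value function V.
        return [c[x] + sum(P[x][y] * V[y] for y in range(n)) for x in range(n)]

    V = list(kappa)  # start from "stop immediately"
    for _ in range(n_iter):
        cont = continuation(V)
        V_new = [min(kappa[x], cont[x]) for x in range(n)]
        converged = max(abs(V_new[x] - V[x]) for x in range(n)) < tol
        V = V_new
        if converged:
            break
    cont = continuation(V)
    policy = [0 if kappa[x] <= cont[x] else 1 for x in range(n)]
    return V, policy

# Hypothetical chain 0 -> 1 -> 2 -> 3 -> 4, with state 4 absorbing.
n = 5
P = [[1.0 if y == min(x + 1, n - 1) else 0.0 for y in range(n)] for x in range(n)]
kappa = [2.0, 10.0, 10.0, 10.0, 0.0]   # stopping costs
c = [1.0] * n                          # continuation cost of 1 per step

V, policy = optimal_stopping(kappa, c, P)
print(V)       # [2.0, 3.0, 2.0, 1.0, 0.0]
print(policy)  # [0, 1, 1, 1, 0]
```

In this example it is optimal to stop at once in state 0, where stopping costs 2 but walking to the free stopping state would cost 4, and to continue from states 1 to 3 until state 4 is reached.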

Category: Stochastic Control 2019