Stochastic Control Notes Update

I’ve updated the stochastic control notes here:

Stochastic_Control_2020_May.pdf

These notes remain a work in progress.

(Reports of any typos/errors/mishaps are welcome.)

A quick summary of what is new:

  • I’ve updated a section on the ODE method for stochastic approximation.
  • I’ve improved the discussion around Temporal Difference methods and included some proofs.
  • I’ve added a proof of convergence for SARSA. 
  • I’ve added a section on Lyapunov functions in continuous time
    • LaSalle’s invariance principle, exponential convergence, and online convex optimization…
  • I’ve started a section on Policy Gradients, but there are more recent proofs to include.
  • I’ve started a section on Deep Learning for RL.

Kalman Filter

Kalman filtering (and filtering in general) considers the following setting: we have a sequence of states x_t, which evolves under random perturbations over time. Unfortunately, we cannot observe x_t directly; we can only observe some noisy function of x_t, namely y_t. Our task is to find the best estimate of x_t given our observations of y_t. Continue reading “Kalman Filter”
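To make the setting concrete, below is a minimal sketch of the standard predict/update recursion for a linear-Gaussian model x_{t+1} = A x_t + w_t, y_t = C x_t + v_t, with w_t ~ N(0, Q) and v_t ~ N(0, R). The function name kalman_filter and the specific matrices are illustrative assumptions, not notation taken from the notes.

```python
import numpy as np

def kalman_filter(ys, A, C, Q, R, x0, P0):
    """Filtered means E[x_t | y_1, ..., y_t] for the linear-Gaussian model
    x_{t+1} = A x_t + w_t (w_t ~ N(0, Q)), y_t = C x_t + v_t (v_t ~ N(0, R))."""
    x_hat, P = x0, P0
    means = []
    for y in ys:
        # Predict: push the current estimate through the dynamics.
        x_pred = A @ x_hat
        P_pred = A @ P @ A.T + Q
        # Update: correct the prediction using the new observation.
        S = C @ P_pred @ C.T + R                # innovation covariance
        K = P_pred @ C.T @ np.linalg.inv(S)     # Kalman gain
        x_hat = x_pred + K @ (y - C @ x_pred)
        P = (np.eye(P.shape[0]) - K @ C) @ P_pred
        means.append(x_hat)
    return np.array(means)
```

For instance, a noisily observed scalar random walk corresponds to A = C = [[1.0]], with Q and R the process and observation noise variances.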

Stochastic Linear Regression

We consider the formulation of Lai, Robbins and Wei (1979) and Lai and Wei (1982). Consider the regression problem

y_n = \beta_1 x_{n1} + \dots + \beta_p x_{np} + \epsilon_n

for n=1,2,..., where \epsilon_n are unobservable random errors and \beta_1,...,\beta_p are unknown parameters.

Typically for a regression problem, it is assumed that the inputs x_1,...,x_n are given and the errors are IID random variables. Here, however, we consider a setting where the inputs are chosen sequentially: we choose x_i, then observe y_i, and the errors \epsilon_i form a martingale difference sequence with respect to the filtration \mathcal F_i generated by \{ x_j, y_{j-1} : j \leq i \}.
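As a rough illustration of this sequential setting, here is a sketch of the least squares estimate of \beta_1,...,\beta_p maintained recursively via the Sherman–Morrison identity, so each new pair (x_i, y_i) is absorbed in O(p^2) time. The class name, and the small ridge term reg added to keep the design matrix invertible in the early rounds, are my assumptions rather than part of the formulation above.

```python
import numpy as np

class RecursiveLeastSquares:
    """Least squares estimate of beta in y_i = beta . x_i + eps_i,
    updated after each sequentially chosen input x_i and observed y_i."""

    def __init__(self, p, reg=1e-6):
        self.beta = np.zeros(p)      # current parameter estimate
        self.V = np.eye(p) / reg     # running inverse of (X^T X + reg * I)

    def update(self, x, y):
        # Sherman-Morrison rank-one update of the inverse design matrix.
        Vx = self.V @ x
        self.V -= np.outer(Vx, Vx) / (1.0 + x @ Vx)
        # Correct the estimate using the prediction error on the new pair.
        self.beta += self.V @ x * (y - x @ self.beta)
        return self.beta
```

Calling rls.update(x_i, y_i) after each observation then tracks the ordinary least squares solution up to the regularisation term.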

Continue reading “Stochastic Linear Regression”