Kalman Filter

Kalman filtering (and filtering in general) considers the following setting: we have a sequence of states  x_t, which evolves under random perturbations over time. Unfortunately we cannot observe x_t, we can only observe some noisy function of  x_t, namely,  y_t. Our task is to find the best estimate of x_t given our observations of y_t. Continue reading “Kalman Filter”

Stochastic Linear Regression

We consider the following formulation of Lai, Robbins and Wei (1979), and Lai and Wei (1982). Consider the following regression problem,

for n=1,2,... where \epsilon_n are unobservable random errors and \beta_1,...,\beta_p are unknown parameters.

Typically for a regression problem, it is assumed that inputs x_{1},...,x_{n} are given and errors are IID random variables. However, we now want to consider a setting where we sequentially choose inputs x_i and then get observations y_i, and errors \epsilon_i are a martingale difference sequence with respect to the filtration \mathcal F_i generated by \{ x_j, y_{j-1} : j\leq i \}.

Continue reading “Stochastic Linear Regression”

Q-learning

 

Q-learning is an algorithm, that contains many of the basic structures required for reinforcement learning and acts as the basis for many more sophisticated algorithms. The Q-learning algorithm can be seen as an (asynchronous) implementation of the Robbins-Monro procedure for finding fixed points. For this reason we will require results from Robbins-Monro when proving convergence.

Continue reading “Q-learning”