## Kalman Filter

Kalman filtering (and filtering in general) considers the following setting: we have a sequence of states $x_t$, which evolves under random perturbations over time. Unfortunately, we cannot observe $x_t$ directly; we can only observe some noisy function of $x_t$, namely $y_t$. Our task is to find the best estimate of $x_t$ given our observations of $y_t$. Continue reading “Kalman Filter”

## Temporal Difference Learning – Linear Function Approximation

For a Markov chain $\hat{x} = (\hat x_t : t\in\mathbb Z_+)$, consider the reward function associated with rewards given by $r = (r(x) : x\in\mathcal X)$. We approximate the reward function $R(x)$ with a linear approximation. Continue reading “Temporal Difference Learning – Linear Function Approximation”

## Stochastic Linear Regression

We consider the formulation of Lai, Robbins and Wei (1979) and Lai and Wei (1982): the regression problem

$$
y_n = \beta_1 x_{n1} + \dots + \beta_p x_{np} + \epsilon_n, \qquad n = 1, 2, \dots,
$$

where $\epsilon_n$ are unobservable random errors and $\beta_1,\dots,\beta_p$ are unknown parameters.

Typically in a regression problem, it is assumed that the inputs $x_1,\dots,x_n$ are given and the errors are IID random variables. However, we now want to consider a setting where we sequentially choose each input $x_i$ and then observe $y_i$, and the errors $\epsilon_i$ form a martingale difference sequence with respect to the filtration $\mathcal F_i$ generated by $\{ x_j, y_{j-1} : j\leq i \}$.
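A natural way to sketch this sequential setting is recursive least squares, which updates the estimate of $\beta$ one observation at a time without refitting from scratch (the function name `rls_step` and the test design below are illustrative, not from the references):

```python
import numpy as np

def rls_step(beta, P, x, y):
    """One recursive least-squares update for y_n = x_n^T beta + eps_n.

    P tracks the inverse of sum_i x_i x_i^T (plus a small prior term),
    updated via the Sherman-Morrison formula."""
    x = x.reshape(-1, 1)
    Px = P @ x
    k = Px / (1.0 + x.T @ Px)            # gain vector
    beta = beta + (k * (y - x.T @ beta)).ravel()
    P = P - k @ Px.T                     # rank-one Sherman-Morrison update
    return beta, P
```

Because each update only uses the current $(x_i, y_i)$, the inputs can depend on everything observed so far, which is exactly the adaptive-design situation the martingale difference assumption is meant to cover.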