## Lyapunov Functions

Lyapunov functions are an extremely convenient device for proving that a dynamical system converges. We cover:

- The Lyapunov argument
- La Salle’s Invariance Principle
- An Alternative argument for Convex Functions
- Exponential Convergence Rates
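In one breath, the Lyapunov argument runs as follows (an informal statement of the standard argument; the precise hypotheses are in the notes):

```latex
% The Lyapunov argument for a system \dot{x} = f(x) with equilibrium x^*:
\[
  \text{if } V(x^*) = 0,\quad V(x) > 0 \text{ for } x \neq x^*,
  \quad\text{and}\quad
  \tfrac{d}{dt} V(x_t) = \nabla V(x_t)^{\top} f(x_t) < 0 \text{ for } x_t \neq x^*,
\]
\[
  \text{then } x_t \to x^* \text{ as } t \to \infty,
\]
% since the trajectory cannot settle anywhere V is still strictly decreasing.
```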



## Optimal Stopping

Here are the slides from the lectures:

Please read Section 1.6 from the notes:

Please attempt Ex53, Ex54, Ex56, Ex57.

## Algorithms for MDPs

Here are the slides from the lectures:

Please read Section 1.5 from the notes:

Please attempt Ex39, Ex40 and Ex41 [if you can code], Ex42 and Ex43.

## Infinite Time Horizon MDP

Here are the slides from the lectures:

Please read Section 1.4 from the notes:

Please attempt Ex35, Ex36, Ex37.

## Markov Decision Processes

Here are the slides from the lectures:

3_Markov Decision Processes [pdf]

Please read Section 1.3 from the notes:

Please attempt exercises Ex22, Ex23, Ex24, Ex25.

## Markov Chains

Here are the slides from the lectures:

Please read Section 1.2 from the notes:

## Dynamic Programming

Slides from the lectures are here:

Please read Section 1.1 of the notes:

Stochastic Control Notes [pdf]

Please attempt exercises **Ex3**, **Ex5** and **Ex6** from the notes.

## Stochastic Control 2020

Another year of MATH69122, a.k.a. Stochastic Control!

This year, I will try to keep updating PDFs with slides and notes for each lecture. I’ll keep notes for the course in the “PDF” tab above. These are also here:

Here is a rough plan for each week of lectures:

## Kalman Filter

Kalman filtering (and filtering in general) considers the following setting: we have a sequence of states $x_0, x_1, x_2, \ldots$, which evolves under random perturbations over time. Unfortunately we cannot observe $x_t$; we can only observe some noisy function of $x_t$, namely $y_t$. Our task is to find the best estimate of $x_t$ given our observations $y_0, \ldots, y_t$. Continue reading “Kalman Filter”
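To make the setting concrete, here is a minimal scalar Kalman filter sketch. It is an illustrative toy, not the treatment in the notes: the model, parameter names and values below are my own choices, with hidden state $x_t = a x_{t-1} + w_t$, $w_t \sim N(0, q)$, and observation $y_t = c x_t + v_t$, $v_t \sim N(0, r)$.

```python
import numpy as np

def kalman_filter(ys, a=1.0, c=1.0, q=0.1, r=1.0, x0=0.0, p0=1.0):
    """Scalar Kalman filter sketch (illustrative parameter names, not from the notes)."""
    x, p = x0, p0                  # current state estimate and its variance
    estimates = []
    for y in ys:
        # Predict step: propagate the estimate and its variance through the dynamics.
        x_pred = a * x
        p_pred = a * a * p + q
        # Update step: blend prediction and observation using the Kalman gain.
        k = p_pred * c / (c * c * p_pred + r)
        x = x_pred + k * (y - c * x_pred)
        p = (1.0 - k * c) * p_pred
        estimates.append(x)
    return np.array(estimates)

rng = np.random.default_rng(0)
true_x = np.cumsum(rng.normal(0.0, 0.3, 200))   # hidden state: a random walk
ys = true_x + rng.normal(0.0, 1.0, 200)         # noisy observations of the state
est = kalman_filter(ys, q=0.09, r=1.0)          # filtered estimates of x_t
```

With the model correctly specified, the filtered estimates have noticeably smaller error than the raw observations.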

## Temporal Difference Learning – Linear Function Approximation

For a Markov chain $(X_t)_{t \geq 0}$, consider the reward function

$$
R(x) = \mathbb{E}\left[ \sum_{t=0}^{\infty} \beta^t r(X_t) \,\middle|\, X_0 = x \right],
$$

associated with rewards given by $r(x)$ and a discount factor $\beta \in (0,1)$. We approximate the reward function with a linear approximation,

$$
R(x) \approx \tilde{R}(x; w) = w^\top \phi(x),
$$

for a weight vector $w$ and features $\phi(x)$.

Continue reading “Temporal Difference Learning – Linear Function Approximation”
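As an illustration, here is a small TD(0) sketch with linear function approximation. The chain, rewards and features below are my own toy choices, not from the notes; with one-hot features the linear approximation $w^\top \phi(x)$ can represent the reward function exactly, so we can compare against the exact solution $R = (I - \beta P)^{-1} r$.

```python
import numpy as np

P = np.array([[0.5, 0.5, 0.0],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])    # transition matrix of the Markov chain
r = np.array([1.0, 0.0, -1.0])     # reward r(x) for each state
beta = 0.9                         # discount factor
phi = np.eye(3)                    # one-hot features (tabular special case)

rng = np.random.default_rng(1)
w = np.zeros(3)                    # weight vector for the linear approximation
x = 0
for t in range(100_000):
    x_next = rng.choice(3, p=P[x])
    # TD(0) update: w <- w + alpha * delta * phi(x), where the temporal
    # difference is delta = r(x) + beta * w^T phi(x') - w^T phi(x).
    alpha = 10.0 / (100.0 + t)     # decreasing step size
    delta = r[x] + beta * (w @ phi[x_next]) - w @ phi[x]
    w = w + alpha * delta * phi[x]
    x = x_next

# Exact reward function for comparison: R solves R = r + beta * P @ R.
R_exact = np.linalg.solve(np.eye(3) - beta * P, r)
```

After enough transitions the learned weights sit close to `R_exact`, up to stochastic-approximation noise.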