Markov chains and matrices — A Quick Summary

Probabilities for a Markov chain can be expressed in terms of the initial probability vector \boldsymbol{\lambda} and the transition matrix P.

For example, for x, x_0, \ldots, x_t \in \mathcal{X} and a function r : \mathcal{X} \rightarrow \mathbb{R},

\displaystyle \mathbb{P}(X_0 = x_0, \ldots, X_t = x_t) = \lambda_{x_0} P_{x_0 x_1} \cdots P_{x_{t-1} x_t}

\displaystyle \mathbb{P}(X_t = x) = [\lambda P^t]_x

\displaystyle \mathbb{E}_x[r(X_t)] = (P^t r)_x

So, in short: multiplying P on the left by a row vector gives probabilities, while multiplying P on the right by a column vector gives expectations.
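Both identities are easy to check numerically. The sketch below uses a small hypothetical 3-state chain (the matrix P, the vectors lam and r, and the horizon t are all made up for illustration) and computes \lambda P^t and P^t r directly:

```python
import numpy as np

# Hypothetical 3-state chain; states are indexed 0, 1, 2.
P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.0, 0.4, 0.6]])
lam = np.array([1.0, 0.0, 0.0])   # initial distribution: start in state 0
r = np.array([1.0, -1.0, 2.0])    # reward function r, viewed as a column vector

t = 4
Pt = np.linalg.matrix_power(P, t)  # P^t: t copies of P multiplied together

# Multiplication on the left gives probabilities:
probs = lam @ Pt          # probs[x] = P(X_t = x)

# Multiplication on the right gives expectations:
expects = Pt @ r          # expects[x] = E_x[r(X_t)]

print(probs)              # a probability vector (sums to 1)
print(expects)
```

Note that the overall expected reward \mathbb{E}[r(X_t)] can be read off either way, as (\lambda P^t) \boldsymbol{r} or as \lambda (P^t \boldsymbol{r}); matrix multiplication is associative, so both give the same number.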

Notation explanation. If the above is unclear, some notation may need explaining. Here P^t = P \times P \times \cdots \times P denotes the product of t copies of the matrix P.¹ Interpreting \boldsymbol{\lambda} as a row vector, \boldsymbol{\lambda} P^t is again a row vector, and [\boldsymbol{\lambda} P^t]_x is the x-th component of that row vector.

Similarly, we can think of the function r : \mathcal{X} \rightarrow \mathbb{R} as a column vector \boldsymbol{r} = (r(x) : x \in \mathcal{X}). Then P^t \boldsymbol{r} is a column vector under matrix multiplication, and (P^t \boldsymbol{r})_x denotes its x-th component.

Throughout these notes, \mathbb{P}_x(\cdot) = \mathbb{P}(\cdot \mid X_0 = x) and \mathbb{E}_x[\cdot] = \mathbb{E}[\cdot \mid X_0 = x].

¹ We use \top for transpose, not t.
