# Entropy and Boltzmann’s Distribution

Entropy and Relative Entropy occur sufficiently often in these notes to justify a (somewhat) self-contained section. We cover the discrete case, which is the most intuitive.

## Entropy – Discrete Case

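Def 1. [Entropy] Suppose that $X$ is a random variable with values in the countable set ${\mathcal X}$ and distribution $P=(p_i:i\in {\mathcal X})$. Then

$$H(X) = -\sum_{i\in {\mathcal X}} p_i \log p_i$$

is the entropy of $X$. Since it depends only on the distribution of $X$, we also write $H(P)$.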

Def 2. [Relative Entropy] For probability distributions $P=(p_i:i\in {\mathcal X})$ and $Q=(q_i:i\in {\mathcal X})$,

$$D(P\|Q) = \sum_{i\in {\mathcal X}} p_i \log \frac{p_i}{q_i}$$

is the Relative Entropy of $P$ with respect to $Q$.
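As a small numerical illustration of the two definitions above, here is a minimal Python sketch. It assumes natural logarithms and a finite support, and it adopts the usual conventions $0\log 0 = 0$ and $D(P\|Q)=\infty$ when some $p_i>0$ has $q_i=0$. These conventions and the function names `entropy` and `relative_entropy` are illustrative choices, not part of the notes.

```python
import math


def entropy(p):
    """H(P) = -sum_i p_i log p_i (natural log), with the convention 0 log 0 = 0."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def relative_entropy(p, q):
    """D(P||Q) = sum_i p_i log(p_i / q_i).

    Assumed conventions: terms with p_i = 0 contribute 0,
    and the result is infinite if some p_i > 0 has q_i = 0.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.7, 0.1, 0.1, 0.1]

print(entropy(uniform))                   # equals log 4 for the uniform distribution
print(entropy(skewed))                    # strictly smaller than log 4
print(relative_entropy(skewed, uniform))  # equals log 4 - H(skewed)
```

On four points the uniform distribution attains the largest entropy, $\log 4$, and the relative entropy of any $P$ with respect to the uniform distribution is $\log 4 - H(P)$, which the last line checks for one example.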

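Ex 1. For a vector ${\mathbf n}=(n_i : i=1,...,m)$ with $n=\sum_i n_i$, if $n_i/n \rightarrow p_i$ for each $i=1,...,m$ as $n\rightarrow\infty$, then

$$\lim_{n\rightarrow\infty} \frac{1}{n} \log \binom{n}{n_1,\dots,n_m} = -\sum_{i=1}^m p_i \log p_i = H(P),$$

where $\binom{n}{n_1,\dots,n_m} = n!/\prod_{i=1}^m n_i!$ is the multinomial coefficient.

Ans 1. Take logs and apply Stirling’s approximation, $\log(n!)=n\log n - n + o(n)$. So,

$$\frac{1}{n} \log \binom{n}{n_1,\dots,n_m} = \frac{1}{n}\Big[ n\log n - n - \sum_{i=1}^m \big( n_i \log n_i - n_i \big) \Big] + o(1) = -\sum_{i=1}^m \frac{n_i}{n} \log \frac{n_i}{n} + o(1) \xrightarrow[n\rightarrow\infty]{} -\sum_{i=1}^m p_i \log p_i .$$

Ex 2. Suppose that $X_k$, $k=1,...,n$, are IID random variables with distribution $P$ on ${\mathcal X}=\{1,...,m\}$. Let $\hat{P}^n=(\hat{P}_i^n : i \in {\mathcal X} )$ be the empirical distribution of $X_k$, $k=1,...,n$, that is

$$\hat{P}_i^n = \frac{1}{n} \sum_{k=1}^n {\mathbb I}[X_k = i].$$

If $n_i/n \rightarrow q_i$ for each $i=1,...,m$ then

$$\lim_{n\rightarrow\infty} \frac{1}{n} \log {\mathbb P}\big( \hat{P}^n = {\mathbf n}/n \big) = -\sum_{i=1}^m q_i \log \frac{q_i}{p_i} = -D(Q\|P).$$

Ans 2. The counts $(n\hat{P}_i^n : i=1,...,m)$ are multinomially distributed, so

$${\mathbb P}\big( \hat{P}^n = {\mathbf n}/n \big) = \binom{n}{n_1,\dots,n_m} \prod_{i=1}^m p_i^{n_i}.$$

Note that $p_i^{n_i} = e^{n_i \log p_i} = e^{nq_i \log p_i + o(n)}$, so combining with Ex 1 (applied with limit $q_i$ in place of $p_i$) gives

$$\frac{1}{n} \log {\mathbb P}\big( \hat{P}^n = {\mathbf n}/n \big) \xrightarrow[n\rightarrow\infty]{} -\sum_{i=1}^m q_i \log q_i + \sum_{i=1}^m q_i \log p_i = -D(Q\|P).$$

## Boltzmann’s distribution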

We now use entropy to derive Boltzmann’s distribution.

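Consider a large number of particles. Each particle can take energy levels $e_i$, $i\in {\mathbb N}$. Let $\bar{e}$ be the average energy of the particles. If we let $p_i$ be the proportion of particles of energy level $e_i$ then the constraints on the system are

$$\sum_{i=1}^\infty p_i e_i = \bar{e}, \qquad \sum_{i=1}^\infty p_i = 1, \qquad p_i \geq 0 .$$

As is often considered in physics, we assume that the equilibrium state of the particles is the state $p$ with the largest number of ways of occurring subject to the constraints on the system. To leading order, the number of ways of occurring is largest when the entropy is largest (Ex 1). In other words, we solve the optimization

$$\text{maximize} \quad -\sum_{i=1}^\infty p_i \log p_i \quad \text{subject to} \quad \sum_{i=1}^\infty p_i e_i = \bar{e}, \quad \sum_{i=1}^\infty p_i = 1 \quad \text{over} \quad p_i \geq 0 .$$

Ex 3. [Boltzmann’s distribution] Show that the solution to this optimization is given by the distribution

$$p^*_i = \frac{e^{-\lambda e_i}}{Z(\lambda)}, \qquad \text{where} \qquad Z(\lambda) = \sum_{i=1}^\infty e^{-\lambda e_i} .$$

This distribution is called Boltzmann’s distribution. The scaling constant $Z(\lambda)$ is called the Partition function and the constant $\lambda$ is chosen so that

$$\sum_{i=1}^\infty e_i \frac{e^{-\lambda e_i}}{Z(\lambda)} = \bar{e} .$$

Ans 3. The Lagrangian of this optimization problem is

$$L(p;\lambda,\mu) = -\sum_{i=1}^\infty p_i \log p_i + \lambda \Big( \bar{e} - \sum_{i=1}^\infty p_i e_i \Big) + \mu \Big( 1 - \sum_{i=1}^\infty p_i \Big) .$$

So, finding a stationary point,

$$\frac{\partial L}{\partial p_i} = -\log p_i - 1 - \lambda e_i - \mu = 0,$$

which implies

$$p_i = e^{-(1+\mu)} e^{-\lambda e_i} .$$

For our constraints to be satisfied, we require that

$$e^{1+\mu} = \sum_{i=1}^\infty e^{-\lambda e_i} \qquad \text{and} \qquad \sum_{i=1}^\infty e_i \frac{e^{-\lambda e_i}}{\sum_{j=1}^\infty e^{-\lambda e_j}} = \bar{e} .$$

Thinking of $Z$ as a function of $\lambda$, we call $Z(\lambda)=\sum_{i=1}^\infty e^{-\lambda e_i}$ the partition function and notice it is easily shown that

$$-\frac{d}{d\lambda} \log Z(\lambda) = \sum_{i=1}^\infty e_i \frac{e^{-\lambda e_i}}{Z(\lambda)} = \bar{e} .$$

The distribution we derived 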