# The Law of Large Numbers and Central Limit Theorem

Let’s explain why the normal distribution is so important.

(This is a section in the notes here.)

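Suppose that I throw a coin $100$ times and count the number of heads

$$ S_{100} = X_1 + X_2 + \cdots + X_{100}, $$

where $X_i = 1$ if the $i$th throw is a head and $X_i = 0$ otherwise.

The proportion of heads, $S_{100}/100$, should be close to its mean, and for $10,000$ throws it should be even closer. This can be shown mathematically (not just for coin throws but for quite general random variables).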
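The claim about coin throws can be checked with a small simulation. The following is only a sketch: it assumes a fair coin (probability of heads $1/2$), and the function name `proportion_of_heads` and the fixed seed are choices made here for illustration.

```python
import random


def proportion_of_heads(n, rng):
    """Throw a fair coin n times and return the fraction of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(n))
    return heads / n


# Fixed seed (an arbitrary choice) so that repeated runs give the same numbers.
rng = random.Random(0)

for n in (100, 10_000, 1_000_000):
    p = proportion_of_heads(n, rng)
    # The mean of the proportion is 1/2, so |p - 1/2| is the error.
    print(f"n = {n:>9}: proportion = {p:.4f}, |error| = {abs(p - 0.5):.4f}")
```

On a typical run the error shrinks as $n$ grows, although for any single run this is not guaranteed; the theorem below only says that large errors become increasingly unlikely.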

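Theorem [Weak Law of Large Numbers] For independent random variables $X_i$, $i=1,...,n$, with mean $\mu$ and variance bounded above by $\sigma^2$, if we define

$$ S_n = \sum_{i=1}^n X_i $$

then for all $\epsilon >0$

$$ \mathbb P\left( \left| \frac{S_n}{n} - \mu \right| \geq \epsilon \right) \leq \frac{\sigma^2}{n \epsilon^2} \xrightarrow[n \rightarrow \infty]{} 0. $$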

We will prove this result a little later. But, continuing the discussion, suppose $X_1,...,X_n$ are independent identically distributed random variables with mean $\mu$ and variance $\sigma^2$. We see from the above result that $S_n /n$ is getting close to $\mu$. Nonetheless, in general, there is going to be some error. So let’s define

$$ \epsilon_n = \frac{S_n}{n} - \mu. $$

So what does $\epsilon_n$ look like? We know that, in some sense, $\epsilon_n \rightarrow 0$ as $n \rightarrow \infty$ but how fast?

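For this we can analyze the variance of the random variable $\epsilon_n = S_n/n - \mu$:

$$ \mathrm{Var}(\epsilon_n) = \mathrm{Var}\left( \frac{S_n}{n} \right) = \frac{1}{n^2} \sum_{i=1}^n \mathrm{Var}(X_i) = \frac{\sigma^2}{n}. $$

Thus the standard deviation of $\epsilon_n$ decreases as $\sigma / \sqrt{n}$. Given this we can define

$$ Z_n = \frac{\sqrt{n}}{\sigma} \epsilon_n = \frac{S_n - n \mu}{\sigma \sqrt{n}}. $$

Notice that $\mathbb E [Z_n]=0$ and

$$ \mathrm{Var}(Z_n) = \frac{n}{\sigma^2} \mathrm{Var}(\epsilon_n) = 1. $$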

So $Z_n$ has mean zero and its variance is fixed. That is, the error as measured by $Z_n$ is not vanishing but is staying roughly constant. So it seems like there is something happening for this random variable $Z_n$, and the question is what happens to $Z_n$ as $n$ grows. The answer is that $Z_n$ converges in distribution to a normal distribution.
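The $\sigma/\sqrt{n}$ scaling of the error can be checked numerically. The sketch below assumes, purely for illustration, that the $X_i$ are uniform on $(0,1)$, so $\mu = 1/2$ and $\sigma = \sqrt{1/12}$; the helper names `sample_error` and `rescaled_error_std` are hypothetical. If the scaling is right, $\sqrt{n}$ times the standard deviation of the error $S_n/n - \mu$ should stay near $\sigma$ for every $n$.

```python
import math
import random
import statistics

MU = 0.5                     # mean of Uniform(0, 1)
SIGMA = math.sqrt(1 / 12)    # standard deviation of Uniform(0, 1)


def sample_error(n, rng):
    """Return one sample of the error S_n / n - mu for n Uniform(0, 1) draws."""
    return sum(rng.random() for _ in range(n)) / n - MU


def rescaled_error_std(n, trials, rng):
    """Estimate sqrt(n) times the standard deviation of the error from repeated samples."""
    errors = [sample_error(n, rng) for _ in range(trials)]
    return math.sqrt(n) * statistics.pstdev(errors)


rng = random.Random(0)  # fixed seed, an arbitrary choice for reproducibility

for n in (10, 100, 1000):
    est = rescaled_error_std(n, trials=2000, rng=rng)
    print(f"n = {n:>4}: sqrt(n) * sd(error) = {est:.3f}   (sigma = {SIGMA:.3f})")
```

The unscaled standard deviation shrinks by roughly a factor of $\sqrt{10}$ from one row to the next, while the rescaled value should stay roughly constant.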

This is a famous and fundamental result in probability and statistics called the central limit theorem.

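Theorem [Central Limit Theorem] For independent identically distributed random variables $X_i$ with mean $\mu$ and variance $\sigma^2$, for $S_n = \sum_{i=1}^n X_i$ and

$$ Z_n = \frac{S_n - n \mu}{\sigma \sqrt{n}} $$

then

$$ \mathbb P ( Z_n \leq z ) \xrightarrow[n \rightarrow \infty]{} \mathbb P ( Z \leq z ) \quad \text{for all } z \in \mathbb R, $$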

where $Z$ is a standard normal random variable.
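To see the theorem at work, one can compare the empirical distribution of the normalized sum $(S_n - n\mu)/(\sigma\sqrt{n})$ with the standard normal distribution function. This is a sketch under stated assumptions: the $X_i$ are taken to be exponential with rate $1$ (so $\mu = \sigma = 1$), a deliberately skewed choice, and the values $n = 200$ and $5000$ repetitions are arbitrary. The helper names are hypothetical.

```python
import math
import random


def standard_normal_cdf(z):
    """P(Z <= z) for a standard normal Z, written via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def sample_z(n, rng, mu=1.0, sigma=1.0):
    """One sample of (S_n - n*mu) / (sigma*sqrt(n)) for Exponential(1) summands."""
    s_n = sum(rng.expovariate(1.0) for _ in range(n))
    return (s_n - n * mu) / (sigma * math.sqrt(n))


def empirical_cdf(samples, z):
    """Fraction of samples that are at most z."""
    return sum(x <= z for x in samples) / len(samples)


rng = random.Random(0)  # fixed seed, an arbitrary choice for reproducibility
samples = [sample_z(200, rng) for _ in range(5000)]

for z in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"z = {z:+.1f}: empirical = {empirical_cdf(samples, z):.3f}, "
          f"normal = {standard_normal_cdf(z):.3f}")
```

The two columns should roughly agree even though a single exponential variable looks nothing like a normal one.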

Given the discussion above, the Central Limit Theorem roughly says that

$$ S_n \approx n \mu + \sigma \sqrt{n} Z, $$

where $Z$ is a standard normal random variable. So whenever we measure errors about some expected value we should start to consider normal random variables.
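As a concrete instance, return to the $100$ coin throws. Assuming a fair coin, each throw has mean $1/2$ and standard deviation $1/2$, so the number of heads has mean $50$ and standard deviation $5$, and the normal approximation suggests $\mathbb P(S_{100} \leq 55) \approx \mathbb P(Z \leq 1)$. The sketch below compares this with the exact binomial probability; the threshold $55$ and the helper names are illustrative choices.

```python
import math


def standard_normal_cdf(z):
    """P(Z <= z) for a standard normal Z, written via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def exact_binomial_cdf(k, n, p):
    """P(S_n <= k) when S_n counts successes in n independent trials."""
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))


n, p = 100, 0.5                          # 100 throws of a fair coin (assumption)
mu, sigma = p, math.sqrt(p * (1 - p))    # mean and sd of a single throw
k = 55

exact = exact_binomial_cdf(k, n, p)
# Normal approximation: standardize k using mean n*mu and sd sigma*sqrt(n).
approx = standard_normal_cdf((k - n * mu) / (sigma * math.sqrt(n)))

print(f"exact  P(S_100 <= {k}) = {exact:.4f}")
print(f"normal approximation   = {approx:.4f}")
```

The two values agree to within a few percent; the remaining gap reflects that $n = 100$ is finite and that the number of heads is a discrete quantity.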