Let’s explain why the normal distribution is so important.
(This is a section in the notes here.)
Suppose that I throw a coin $n$ times and count the number of heads. The proportion of heads should be close to its mean, and for larger $n$ it should be even closer. This can be shown mathematically (not just for coin throws but for quite general random variables).
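Before the formal statement, the coin-tossing claim can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `proportion_of_heads`, the fixed seed, and the choice of a fair coin (`p = 0.5`) are illustrative assumptions, not part of the notes.

```python
import random


def proportion_of_heads(n, p=0.5, rng=random):
    """Toss a coin with heads-probability p a total of n times
    and return the fraction of tosses that came up heads."""
    heads = sum(1 for _ in range(n) if rng.random() < p)
    return heads / n


# Assumption: a fair coin and a fixed seed, so the run is reproducible.
rng = random.Random(0)
for n in (100, 1_000, 10_000, 100_000):
    prop = proportion_of_heads(n, 0.5, rng)
    # The gap |prop - 0.5| should tend to shrink as n grows.
    print(f"n={n:7d}  proportion={prop:.4f}  gap={abs(prop - 0.5):.4f}")
```

Any single run is random, so the gap need not shrink at every step, but it should be small for the largest values of $n$.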
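Theorem [Weak Law of Large Numbers] For independent random variables $X_1, X_2, \dots$, with mean $\mu$ and variance bounded above by $\sigma^2$, if we define
$$\bar{X}_n = \frac{1}{n} \sum_{i=1}^n X_i$$
then for all $\epsilon > 0$
$$\mathbb{P}\left( |\bar{X}_n - \mu| \geq \epsilon \right) \to 0 \quad \text{as } n \to \infty.$$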
We will prove this result a little later. But, continuing the discussion, suppose $X_1, \dots, X_n$ are independent identically distributed random variables with mean $\mu$ and variance $\sigma^2$. We see from the above result that $\bar{X}_n = \frac{1}{n} \sum_{i=1}^n X_i$ is getting close to $\mu$. Nonetheless, in general, there is going to be some error. So let’s define
$$\Delta_n = \bar{X}_n - \mu.$$
So what does $\Delta_n$ look like? We know that, in some sense, $\Delta_n \to 0$ as $n \to \infty$, but how fast?
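For this we can analyze the variance of the random variable $\Delta_n$:
$$\mathrm{Var}(\Delta_n) = \mathrm{Var}\left( \frac{1}{n} \sum_{i=1}^n X_i \right) = \frac{1}{n^2} \sum_{i=1}^n \mathrm{Var}(X_i) = \frac{\sigma^2}{n}.$$
Thus the standard deviation of $\Delta_n$ decreases as $\sigma / \sqrt{n}$. Given this we can define
$$Z_n = \frac{\sqrt{n}}{\sigma} \Delta_n = \frac{\sqrt{n} (\bar{X}_n - \mu)}{\sigma}.$$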
Notice that $\mathbb{E}[Z_n] = 0$ and $\mathrm{Var}(Z_n) = 1$. So $Z_n$ has mean zero and its variance is fixed. I.e. the error as measured by $Z_n$ is not vanishing, but is staying roughly constant. So it seems like there is something happening for this random variable $Z_n$; a question is what happens to the distribution of $Z_n$ as $n \to \infty$. The answer is that $Z_n$ converges to a normal distribution.
This is a famous and fundamental result in probability and statistics called the central limit theorem.
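The claim can be explored by simulation before stating it formally. The sketch below repeatedly tosses a fair coin $n$ times, forms the standardized error (the square root of $n$ times the gap between the sample mean and the true mean, divided by the standard deviation of one toss), and compares its empirical mean, variance, and one cumulative probability with those of a standard normal. The function name `standardized_errors`, the number of trials, and the seed are assumptions chosen for illustration.

```python
import math
import random
import statistics


def standardized_errors(n, trials, p=0.5, seed=0):
    """Return `trials` samples of sqrt(n) * (sample_mean - mu) / sigma,
    where each sample mean averages n coin tosses with heads-probability p."""
    rng = random.Random(seed)
    mu = p                            # mean of one toss (heads = 1, tails = 0)
    sigma = math.sqrt(p * (1 - p))    # standard deviation of one toss
    samples = []
    for _ in range(trials):
        mean = sum(rng.random() < p for _ in range(n)) / n
        samples.append(math.sqrt(n) * (mean - mu) / sigma)
    return samples


std_normal = statistics.NormalDist()  # mean 0, standard deviation 1

for n in (10, 100, 1000):
    z = standardized_errors(n, trials=2000)
    frac_below_one = sum(v <= 1.0 for v in z) / len(z)
    print(f"n={n:5d}  mean={statistics.fmean(z):+.3f}  "
          f"var={statistics.pvariance(z):.3f}  "
          f"P(Z_n<=1)~{frac_below_one:.3f}  Phi(1)={std_normal.cdf(1.0):.3f}")
```

The empirical mean and variance should stay near 0 and 1 for every $n$, while the empirical probability should sit roughly near the normal value; because coin counts are discrete, the agreement is only approximate at any finite $n$.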
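Theorem [Central Limit Theorem] For independent identically distributed random variables $X_1, X_2, \dots$ with mean $\mu$ and variance $\sigma^2 > 0$, for
$$\bar{X}_n = \frac{1}{n} \sum_{i=1}^n X_i$$
and
$$Z_n = \frac{\sqrt{n} (\bar{X}_n - \mu)}{\sigma}$$
then
$$\mathbb{P}(Z_n \leq z) \to \mathbb{P}(Z \leq z) \quad \text{as } n \to \infty, \text{ for all } z \in \mathbb{R},$$
where $Z$ is a standard normal random variable.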
Given the discussion above, the Central Limit Theorem roughly says that
$$\bar{X}_n \approx \mu + \frac{\sigma}{\sqrt{n}} Z$$
where $Z$ is a standard normal random variable. So whenever we measure errors about some expected value we should start to consider normal random variables.
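As a rough illustration of using this approximation, the sketch below estimates the probability that the proportion of heads in $n$ coin tosses lies within a distance `delta` of its mean, once exactly from the binomial distribution and once by treating the error as a normal random variable with standard deviation equal to the single-toss standard deviation divided by the square root of $n$. The function names and the example values ($n = 1000$, `delta = 0.03`) are assumptions for illustration, and no continuity correction is applied.

```python
import math
from statistics import NormalDist


def exact_prob_within(n, p, delta):
    """Exact P(|H/n - p| <= delta) for H ~ Binomial(n, p), with 0 < p < 1.
    Terms are computed in log space to avoid overflow for large n."""
    total = 0.0
    for k in range(n + 1):
        # Small tolerance guards against floating-point error at the boundary.
        if abs(k - n * p) <= n * delta + 1e-9:
            log_term = (math.lgamma(n + 1) - math.lgamma(k + 1)
                        - math.lgamma(n - k + 1)
                        + k * math.log(p) + (n - k) * math.log(1 - p))
            total += math.exp(log_term)
    return total


def clt_prob_within(n, p, delta):
    """Normal approximation: the error in the proportion is treated as
    (sigma / sqrt(n)) * Z with Z standard normal."""
    sigma = math.sqrt(p * (1 - p))
    z = delta * math.sqrt(n) / sigma
    std_normal = NormalDist()
    return std_normal.cdf(z) - std_normal.cdf(-z)


n, p, delta = 1000, 0.5, 0.03
print(f"exact binomial : {exact_prob_within(n, p, delta):.4f}")
print(f"CLT approximate: {clt_prob_within(n, p, delta):.4f}")
```

The two numbers should be close but not identical, since the normal approximation ignores the discreteness of the head count.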