# Central Limit Theorem and Standardized Sums

caffeine

## Main Question or Discussion Point

I don't think I *really* understand the Central Limit Theorem.

Suppose we have a set of $n$ independent random variables $\{X_i\}$ with the same distribution function, same finite mean, and same finite variance. Suppose we form the sum $S_n = \sum_{i=1}^n X_i$. Suppose I want to know the probability that $S_n$ is between $a$ and $b$. In other words, I want to know $P(b > S_n > a)$.

The central limit theorem uses a standardized sum:

$$P\left(b > \frac{ \sum_{i=1}^n X_i - n\mu}{\sqrt{n}\sigma} > a\right) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-y^2/2} \, dy$$

What is the relationship between what I want:

$$P(b > S_n > a)$$

and what the central limit theorem tells me about:

$$P\left(b > \frac{ \sum_{i=1}^n X_i - n\mu}{\sqrt{n}\sigma} > a\right)$$

How can they possibly be equal? If they are equal, how is that possible? And if they're not equal, how would I get what I want?


HallsofIvy
Homework Helper
> I don't think I *really* understand the Central Limit Theorem.
>
> Suppose we have a set of $n$ independent random variables $\{X_i\}$ with the same distribution function, same finite mean, and same finite variance. Suppose we form the sum $S_n = \sum_{i=1}^n X_i$. Suppose I want to know the probability that $S_n$ is between $a$ and $b$. In other words, I want to know $P(b > S_n > a)$.
>
> The central limit theorem uses a standardized sum:
>
> $$P\left(b > \frac{ \sum_{i=1}^n X_i - n\mu}{\sqrt{n}\sigma} > a\right) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-y^2/2} \, dy$$

No, the Central Limit Theorem does not use a standardized sum. (At least it doesn't have to; I can't speak for whatever textbook you are using.)

Essentially, the Central Limit Theorem says that if we consider all possible samples of size $n$ from a distribution having finite mean $\mu$ and finite standard deviation $\sigma$, then their sum will be approximately normally distributed with mean $n\mu$ and standard deviation $\sqrt{n}\sigma$, and the larger $n$ is, the better that approximation will be.
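A quick Monte Carlo sketch makes this concrete. This is my own illustration, not from the thread: the summands are taken to be Uniform(0, 1), so $\mu = 1/2$ and $\sigma = \sqrt{1/12}$, and the empirical mean and standard deviation of many simulated sums $S_n$ should land near $n\mu$ and $\sqrt{n}\sigma$.

```python
# Sketch (illustrative assumption: Uniform(0, 1) summands):
# the mean and sd of simulated sums S_n should be close to
# n*mu and sqrt(n)*sigma, as the CLT statement above claims.
import math
import random

random.seed(0)
n = 30                      # number of summands per S_n
trials = 20000              # number of simulated sums
mu = 0.5                    # mean of Uniform(0, 1)
sigma = math.sqrt(1 / 12)   # sd of Uniform(0, 1)

sums = [sum(random.random() for _ in range(n)) for _ in range(trials)]
mean_hat = sum(sums) / trials
sd_hat = math.sqrt(sum((s - mean_hat) ** 2 for s in sums) / (trials - 1))

print(mean_hat, n * mu)               # empirical mean vs n*mu = 15
print(sd_hat, math.sqrt(n) * sigma)   # empirical sd vs sqrt(n)*sigma
```

Nothing here depends on the summands being uniform; any distribution with finite mean and variance gives the same picture for large $n$.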

> What is the relationship between what I want:
>
> $$P(b > S_n > a)$$
>
> and what the central limit theorem tells me about:
>
> $$P\left(b > \frac{ \sum_{i=1}^n X_i - n\mu}{\sqrt{n}\sigma} > a\right)$$
>
> How can they possibly be equal? If they are equal, how is that possible? And if they're not equal, how would I get what I want?

Well, they are not equal as written: with the same limits $a$ and $b$, those are two different probabilities. You get what you want by standardizing the limits along with the sum.

Assuming that your base distribution has mean $\mu$ and standard deviation $\sigma$, $S_n$ is approximately normally distributed with mean $n\mu$ and standard deviation $\sqrt{n}\sigma$. You then convert from that to the standard normal distribution (with mean 0 and standard deviation 1) as you have probably learned before:
$$z= \frac{x- \mu}{\sigma}$$
In particular, taking $x = S_n$ (whose mean is $n\mu$ and whose standard deviation is $\sqrt{n}\sigma$), $a < S_n < b$ becomes $a - n\mu < S_n - n\mu < b - n\mu$ and then
$$\frac{a-n\mu}{\sqrt{n}\sigma}< \frac{S_n- n\mu}{\sqrt{n}\sigma}< \frac{b- n\mu}{\sqrt{n}\sigma}$$

Look up those values in a standard normal distribution table.
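The recipe above can be sketched in a few lines of Python, using `math.erf` in place of a normal table. The numbers are my own illustrative choices (Uniform(0, 1) summands, $n = 30$, range $14 < S_n < 16.5$), not from the thread; the standardized-limit approximation is checked against a direct simulation of $S_n$.

```python
# Sketch of the conversion above: standardize the limits a and b,
# then evaluate the standard normal CDF Phi at each.
# Assumed setup: Uniform(0, 1) summands, so mu = 0.5, sigma = sqrt(1/12).
import math
import random

def phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n, mu, sigma = 30, 0.5, math.sqrt(1 / 12)
a, b = 14.0, 16.5            # desired range for S_n

# Standardize the limits exactly as in the inequality above.
z_a = (a - n * mu) / (math.sqrt(n) * sigma)
z_b = (b - n * mu) / (math.sqrt(n) * sigma)
clt_approx = phi(z_b) - phi(z_a)   # approximates P(a < S_n < b)

# Sanity check against direct simulation of S_n.
random.seed(1)
trials = 50000
hits = sum(1 for _ in range(trials)
           if a < sum(random.random() for _ in range(n)) < b)
print(clt_approx, hits / trials)   # the two values should be close
```

Note that $a$ and $b$ themselves are transformed; this is exactly why the two probabilities in the original question are not equal with the same limits.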

ssd
> Essentially, the Central Limit Theorem says that if we consider all possible samples of size n from a distribution having finite mean, $\mu$, and finite standard deviation, $\sigma$, then their sum will be approximately normally distributed with mean $n\mu$ and standard distribution $\sqrt{n}\sigma$- and the larger n is the better that approximation will be.

This is the famous Lindeberg–Lévy CLT. I am not sure about the phrase "all possible samples", though. Essentially, for a sequence of independent and identically distributed random variables with the same mean $\mu$ and the same finite standard deviation $\sigma$, the quantity $\left(\sum_{i=1}^n X_i - n\mu\right)/(\sqrt{n}\sigma)$ asymptotically follows a normal distribution with mean 0 and standard deviation 1 (I believe "distribution" in the quote is a typing mistake for "deviation").
