Central Limit Theorem and Standardized Sums

AI Thread Summary
The discussion centers on the Central Limit Theorem (CLT) and its application to sums of independent random variables. It clarifies that P(a < S_n < b) is not the same as the standardized-sum probability written with the same endpoints a and b; instead, the endpoints must be standardized as well. The CLT states that the sum of n i.i.d. random variables with finite mean and variance is approximately normally distributed as n increases. To find the desired probability, one converts the sum into standardized form and uses the standard normal distribution. The conversation emphasizes the relationship between these probabilities and the approximation provided by the CLT.
caffeine
I don't think I *really* understand the Central Limit Theorem.

Suppose we have a set of n independent random variables \{X_i\} with the same distribution function, same finite mean, and same finite variance. Suppose we form the sum S_n = \sum_{i=1}^n X_i. Suppose I want to know the probability that S_n is between a and b. In other words, I want to know P(b > S_n > a).

The central limit theorem uses a standardized sum:

P\left(b > \frac{ \sum_{i=1}^n X_i - n\mu}{\sqrt{n}\sigma} > a\right) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-y^2/2} \, dy

What is the relationship between what I want:

P(b > S_n > a)

and what the central limit theorem tells me about:

P\left(b > \frac{ \sum_{i=1}^n X_i - n\mu}{\sqrt{n}\sigma} > a\right)

How can they possibly be equal? If they are equal, how is that possible? And if they're not equal, how would I get what I want?
 
caffeine said:
I don't think I *really* understand the Central Limit Theorem.

Suppose we have a set of n independent random variables \{X_i\} with the same distribution function, same finite mean, and same finite variance. Suppose we form the sum S_n = \sum_{i=1}^n X_i. Suppose I want to know the probability that S_n is between a and b. In other words, I want to know P(b > S_n > a).

The central limit theorem uses a standardized sum:

P\left(b > \frac{ \sum_{i=1}^n X_i - n\mu}{\sqrt{n}\sigma} > a\right) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-y^2/2} \, dy
No, the Central Limit Theorem does not use a standardized sum. (At least it doesn't have to. I can't speak for whatever textbook you are using.)

Essentially, the Central Limit Theorem says that if we consider all possible samples of size n from a distribution having finite mean, \mu, and finite standard deviation, \sigma, then their sum will be approximately normally distributed with mean n\mu and standard distribution \sqrt{n}\sigma, and the larger n is, the better that approximation will be.
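As a sanity check on that statement, here is a minimal Monte Carlo sketch (not from the thread; the choice of Uniform(0, 1) variables and the sample sizes are illustrative assumptions). Sums of n i.i.d. Uniform(0, 1) variables, which have \mu = 1/2 and \sigma^2 = 1/12, should have sample mean near n\mu and sample variance near n\sigma^2:

```python
import random
import statistics

# Illustrative check: sums of n i.i.d. Uniform(0, 1) variables
# (mu = 1/2, sigma^2 = 1/12) should cluster around n*mu with
# variance near n*sigma^2, as the CLT statement above suggests.
random.seed(0)
n, trials = 30, 20000
sums = [sum(random.random() for _ in range(n)) for _ in range(trials)]

print(statistics.mean(sums))      # should be close to n/2 = 15
print(statistics.variance(sums))  # should be close to n/12 = 2.5
```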

What is the relationship between what I want:

P(b > S_n > a)

and what the central limit theorem tells me about:

P\left(b > \frac{ \sum_{i=1}^n X_i - n\mu}{\sqrt{n}\sigma} > a\right)


How can they possibly be equal? If they are equal, how is that possible? And if they're not equal, how would I get what I want?
Well, they are not equal as written. Once you standardize the endpoints as well, the two probabilities agree, and the CLT approximates them by a standard normal probability.

Assuming that your base distribution has mean \mu and standard deviation \sigma, then S_n is approximately normally distributed with mean n\mu and standard deviation \sqrt{n}\sigma. You then convert from that to the standard normal distribution (with mean 0 and standard deviation 1) as you have probably learned before:
z = \frac{x - \mu}{\sigma}
In particular, take x = S_n: then a < S_n < b becomes a - n\mu < S_n - n\mu < b - n\mu and then
\frac{a - n\mu}{\sqrt{n}\sigma} < \frac{S_n - n\mu}{\sqrt{n}\sigma} < \frac{b - n\mu}{\sqrt{n}\sigma}

Look up those values in a standardized normal distribution table.
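The standardization steps above can be sketched in code. This is a hedged example, not from the thread: the die-roll numbers are invented for illustration, the function names `phi` and `clt_prob` are hypothetical, and the identity \Phi(z) = (1 + \mathrm{erf}(z/\sqrt{2}))/2 stands in for the table lookup:

```python
import math

def phi(z):
    # standard normal CDF via the error function:
    # Phi(z) = (1 + erf(z / sqrt(2))) / 2
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def clt_prob(a, b, n, mu, sigma):
    # CLT approximation to P(a < S_n < b) for a sum S_n of n
    # i.i.d. variables with mean mu and standard deviation sigma:
    # standardize both endpoints, then take a difference of Phi values.
    scale = math.sqrt(n) * sigma
    return phi((b - n * mu) / scale) - phi((a - n * mu) / scale)

# Hypothetical example: S_100 = total of 100 fair-die rolls,
# with mu = 3.5 and sigma = sqrt(35/12).
print(clt_prob(330, 380, n=100, mu=3.5, sigma=math.sqrt(35 / 12)))
```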
 
HallsofIvy said:
Essentially, the Central Limit Theorem says that if we consider all possible samples of size n from a distribution having finite mean, \mu, and finite standard deviation, \sigma, then their sum will be approximately normally distributed with mean n\mu and standard distribution \sqrt{n}\sigma- and the larger n is the better that approximation will be.

This is the famous Lindeberg-Lévy CLT. I am not sure, but I have a doubt about the phrase "all possible samples"... essentially, for a sequence of independently and identically distributed random variables with the same mean \mu and the same finite standard deviation \sigma, the quantity \left(\sum_{i=1}^n X_i - n\mu\right)/\sqrt{n}\sigma should asymptotically follow a normal distribution with mean 0 and standard deviation 1 (I believe "distribution" is a typing mistake).
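A quick simulation consistent with that statement (the Exponential(1) choice, for which \mu = \sigma = 1, and the sample sizes are assumptions made here for illustration):

```python
import math
import random

# Illustrative check of the standardized form: for X_i ~ Exponential(1)
# (mu = sigma = 1), Z_n = (S_n - n*mu) / (sqrt(n)*sigma) should be
# approximately standard normal for large n.
random.seed(1)
n, trials = 200, 20000

def standardized_sum():
    s = sum(random.expovariate(1.0) for _ in range(n))
    return (s - n) / math.sqrt(n)

zs = [standardized_sum() for _ in range(trials)]
empirical = sum(z < 0.5 for z in zs) / trials
phi_half = 0.5 * (1.0 + math.erf(0.5 / math.sqrt(2.0)))  # Phi(0.5)

print(empirical, phi_half)  # the two numbers should be close
```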
 
Yes, I meant "all possible samples" from a specific distribution, hence "independently and identically distributed".

You are right that I mean "standard deviation", not "standard distribution"!
 