Central Limit Theorem and Standardized Sums

caffeine
I don't think I *really* understand the Central Limit Theorem.

Suppose we have a set of n independent random variables \{X_i\} with the same distribution function, same finite mean, and same finite variance. Suppose we form the sum S_n = \sum_{i=1}^n X_i. Suppose I want to know the probability that S_n is between a and b. In other words, I want to know P(b > S_n > a).

The central limit theorem uses a standardized sum:

P\left(b > \frac{\sum_{i=1}^n X_i - n\mu}{\sqrt{n}\,\sigma} > a\right) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-y^2/2} \, dy

What is the relationship between what I want:

P(b > S_n > a)

and what the central limit theorem tells me about:

P\left(b > \frac{\sum_{i=1}^n X_i - n\mu}{\sqrt{n}\,\sigma} > a\right)

How can they possibly be equal? If they are equal, how is that possible? And if they're not equal, how would I get what I want?
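To see concretely what that equation asserts, here is a minimal simulation sketch (assuming Python with NumPy and SciPy; the Exp(1) base distribution and the values of n, a, b are arbitrary illustrative choices, not anything from the thread):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, trials = 500, 20_000
mu, sigma = 1.0, 1.0          # mean and sd of the Exp(1) base distribution
a, b = -1.0, 1.0              # bounds for the *standardized* sum

# Draw `trials` independent realizations of S_n = sum of n iid Exp(1) draws.
S = rng.exponential(scale=1.0, size=(trials, n)).sum(axis=1)

# Standardized sum: Z_n = (S_n - n*mu) / (sqrt(n)*sigma)
Z = (S - n * mu) / (np.sqrt(n) * sigma)

# Left side: empirical P(b > Z_n > a).  Right side: the normal integral.
empirical = np.mean((a < Z) & (Z < b))
integral = stats.norm.cdf(b) - stats.norm.cdf(a)
print(f"empirical P(a < Z_n < b): {empirical:.4f}")
print(f"normal integral         : {integral:.4f}")
```

The two printed numbers should agree closely, and the agreement improves as n grows; that is all the displayed equation claims.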
 
caffeine said:
I don't think I *really* understand the Central Limit Theorem.

Suppose we have a set of n independent random variables \{X_i\} with the same distribution function, same finite mean, and same finite variance. Suppose we form the sum S_n = \sum_{i=1}^n X_i. Suppose I want to know the probability that S_n is between a and b. In other words, I want to know P(b > S_n > a).

The central limit theorem uses a standardized sum:

P\left(b > \frac{\sum_{i=1}^n X_i - n\mu}{\sqrt{n}\,\sigma} > a\right) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-y^2/2} \, dy
No, the Central Limit Theorem does not use a standardized sum. (At least it doesn't have to. I can't speak for whatever textbook you are using.)

Essentially, the Central Limit Theorem says that if we consider all possible samples of size n from a distribution having finite mean, \mu, and finite standard deviation, \sigma, then their sum will be approximately normally distributed with mean n\mu and standard distribution \sqrt{n}\sigma, and the larger n is, the better that approximation will be.
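As a quick empirical check of that claim, here is a sketch (again assuming Python with NumPy; the Uniform(0, 1) base distribution and the value of n are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials = 500, 20_000
mu, sigma = 0.5, np.sqrt(1.0 / 12.0)   # mean and sd of Uniform(0, 1)

# Sums of n iid Uniform(0, 1) draws.
S = rng.uniform(0.0, 1.0, size=(trials, n)).sum(axis=1)

# The sample mean and sd of S_n should be close to n*mu and sqrt(n)*sigma.
print(f"sample mean of S_n: {S.mean():7.2f}   theory: {n * mu:7.2f}")
print(f"sample sd   of S_n: {S.std():7.2f}   theory: {np.sqrt(n) * sigma:7.2f}")
```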

What is the relationship between what I want:

P(b > S_n > a)

and what the central limit theorem tells me about:

P\left(b > \frac{\sum_{i=1}^n X_i - n\mu}{\sqrt{n}\,\sigma} > a\right)


How can they possibly be equal? If they are equal, how is that possible? And if they're not equal, how would I get what I want?
Well, they are not equal; they are only approximately equal (exactly equal in the limit n \to \infty).

Assuming that your base distribution has mean \mu and standard deviation \sigma, then S_n is approximately normally distributed with mean n\mu and standard deviation \sqrt{n}\sigma. You then convert from that to the standard normal distribution (with mean 0 and standard deviation 1) as you have probably learned before:
z= \frac{x- \mu}{\sigma}
In particular, take x = S_n: then a < S_n < b becomes a - n\mu < S_n - n\mu < b - n\mu and then
\frac{a - n\mu}{\sqrt{n}\,\sigma} < \frac{S_n - n\mu}{\sqrt{n}\,\sigma} < \frac{b - n\mu}{\sqrt{n}\,\sigma}

Look up those values in a standard normal distribution table.
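In code, that whole recipe (standardize the bounds, then evaluate the standard normal CDF in place of a printed table) looks like the following sketch, assuming Python with NumPy and SciPy; the Exp(1) base distribution and the values of n, a, b are illustrative assumptions:

```python
import numpy as np
from scipy.stats import norm

# Illustrative assumptions: S_n is a sum of n iid Exp(1) variables,
# so mu = sigma = 1; we want P(a < S_n < b).
n, mu, sigma = 400, 1.0, 1.0
a, b = 380.0, 420.0

# Standardize the bounds exactly as in the inequality above.
z_lo = (a - n * mu) / (np.sqrt(n) * sigma)
z_hi = (b - n * mu) / (np.sqrt(n) * sigma)

# CLT approximation: P(a < S_n < b) is about Phi(z_hi) - Phi(z_lo).
approx = norm.cdf(z_hi) - norm.cdf(z_lo)
print(f"CLT approximation : {approx:.4f}")

# Optional Monte Carlo check of the approximation.
rng = np.random.default_rng(2)
S = rng.exponential(1.0, size=(20_000, n)).sum(axis=1)
print(f"Monte Carlo check : {np.mean((a < S) & (S < b)):.4f}")
```

With these numbers, z_lo = -1 and z_hi = +1, so the approximation is about 0.6827, the familiar one-sigma probability.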
 
HallsofIvy said:
Essentially, the Central Limit Theorem says that if we consider all possible samples of size n from a distribution having finite mean, \mu, and finite standard deviation, \sigma, then their sum will be approximately normally distributed with mean n\mu and standard distribution \sqrt{n}\sigma, and the larger n is, the better that approximation will be.

This is the famous Lindeberg–Lévy CLT. I am not sure, but I have a doubt about the phrase "all possible samples". Essentially, for a sequence of independent and identically distributed random variables with the same mean \mu and the same finite standard deviation \sigma, the quantity (S_n - n\mu)/(\sqrt{n}\,\sigma) should asymptotically follow a normal distribution with mean 0 and standard deviation 1 (I believe "distribution" is a typing mistake).
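For reference, the precise statement is a limit, which is why any finite-n computation is only an approximation:

\lim_{n \to \infty} P\!\left( \frac{S_n - n\mu}{\sqrt{n}\,\sigma} \le x \right) = \Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2} \, dt \qquad \text{for every real } x.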
 
Yes, I meant "all possible samples" from a specific distribution, therefore "independently and identically distributed".

You are right that I mean "standard deviation", not "standard distribution"!
 