Central Limit Theorem and Standardized Sums

  • Thread starter caffeine
caffeine

I don't think I *really* understand the Central Limit Theorem.

Suppose we have a set of [itex]n[/itex] independent random variables [itex]\{X_i\}[/itex] with the same distribution function, same finite mean, and same finite variance. Suppose we form the sum [itex]S_n = \sum_{i=1}^n X_i[/itex]. Suppose I want to know the probability that [itex]S_n[/itex] is between a and b. In other words, I want to know [itex]P(b > S_n > a)[/itex].

The central limit theorem uses a standardized sum:

[tex]
P\left(b > \frac{ \sum_{i=1}^n X_i - n\mu}{\sqrt{n}\sigma} > a\right)
= \frac{1}{\sqrt{2\pi}} \int_a^b e^{-y^2/2} \, dy
[/tex]

What is the relationship between what I want:

[tex]P(b > S_n > a)[/tex]

and what the central limit theorem tells me about:

[tex]P\left(b > \frac{ \sum_{i=1}^n X_i - n\mu}{\sqrt{n}\sigma} > a\right)[/tex]


How can they possibly be equal? If they are equal, how is that possible? And if they're not equal, how would I get what I want?
 

Answers and Replies

HallsofIvy
I don't think I *really* understand the Central Limit Theorem.

Suppose we have a set of [itex]n[/itex] independent random variables [itex]\{X_i\}[/itex] with the same distribution function, same finite mean, and same finite variance. Suppose we form the sum [itex]S_n = \sum_{i=1}^n X_i[/itex]. Suppose I want to know the probability that [itex]S_n[/itex] is between a and b. In other words, I want to know [itex]P(b > S_n > a)[/itex].

The central limit theorem uses a standardized sum:

[tex]
P\left(b > \frac{ \sum_{i=1}^n X_i - n\mu}{\sqrt{n}\sigma} > a\right)
= \frac{1}{\sqrt{2\pi}} \int_a^b e^{-y^2/2} \, dy
[/tex]
No, the Central Limit Theorem does not use a standardized sum. (At least it doesn't have to. I can't speak for whatever text book you are using.)

Essentially, the Central Limit Theorem says that if we consider all possible samples of size n from a distribution having finite mean, [itex]\mu[/itex], and finite standard deviation, [itex]\sigma[/itex], then their sum will be approximately normally distributed with mean [itex]n\mu[/itex] and standard distribution [itex]\sqrt{n}\sigma[/itex]- and the larger n is the better that approximation will be.
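A quick way to see this is to simulate it. The sketch below is only an illustration: the choice of Uniform(0, 1) draws, n = 30, the number of trials, and the helper name `simulate_sums` are all assumptions made here, not part of the theorem. It compares the simulated mean and spread of the sum with [itex]n\mu[/itex] and [itex]\sqrt{n}\sigma[/itex], and checks that roughly 68% of the sums fall within one such spread of [itex]n\mu[/itex], as a normal distribution would predict.

```python
import random
import statistics

def simulate_sums(n, trials, seed=0):
    """Draw `trials` independent copies of S_n = X_1 + ... + X_n,
    with X_i ~ Uniform(0, 1) (an arbitrary choice for illustration)."""
    rng = random.Random(seed)
    return [sum(rng.random() for _ in range(n)) for _ in range(trials)]

n = 30
mu = 0.5                   # mean of Uniform(0, 1)
sigma = (1 / 12) ** 0.5    # standard deviation of Uniform(0, 1)

sums = simulate_sums(n, trials=20000)
sample_mean = statistics.fmean(sums)
sample_sd = statistics.pstdev(sums)

# Fraction of simulated sums within one predicted spread of the predicted mean.
pred_mean = n * mu
pred_sd = n ** 0.5 * sigma
within_one = sum(abs(s - pred_mean) < pred_sd for s in sums) / len(sums)

print(f"mean of S_n: simulated {sample_mean:.3f}, predicted {pred_mean:.3f}")
print(f"spread of S_n: simulated {sample_sd:.3f}, predicted {pred_sd:.3f}")
print(f"fraction within one spread of the mean: {within_one:.3f} (normal: about 0.683)")
```

Replacing `rng.random()` with draws from any other distribution that has a finite mean and variance (and updating `mu` and `sigma` to match) should show the same behaviour for large enough n.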

What is the relationship between what I want:

[tex]P(b > S_n > a)[/tex]

and what the central limit theorem tells me about:

[tex]P\left(b > \frac{ \sum_{i=1}^n X_i - n\mu}{\sqrt{n}\sigma} > a\right)[/tex]


How can they possibly be equal? If they are equal, how is that possible? And if they're not equal, how would I get what I want?
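Well, they are not equal. The equality in the theorem is really an approximation that gets better as n grows, and the limits a and b in it apply to the standardized sum, not to [itex]S_n[/itex] itself. To get what you want, you have to standardize the limits as well.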

Assuming that your base distribution has mean [itex]\mu[/itex] and standard deviation [itex]\sigma[/itex], then [itex]S_n[/itex] is approximately normally distributed with mean [itex]n\mu[/itex] and standard deviation [itex]\sqrt{n}\sigma[/itex]. You then convert from that to the standard normal distribution (with mean 0 and standard deviation 1), as you have probably learned before:
[tex]z= \frac{x- \mu}{\sigma}[/tex]
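In particular, take [itex]x = S_n[/itex], with [itex]n\mu[/itex] and [itex]\sqrt{n}\sigma[/itex] playing the roles of the mean and standard deviation in that formula. Then [itex]a < S_n < b[/itex] becomes [itex]a - n\mu < S_n - n\mu < b - n\mu[/itex] and then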
[tex]\frac{a-n\mu}{\sqrt{n}\sigma}< \frac{S_n- n\mu}{\sqrt{n}\sigma}< \frac{b- n\mu}{\sqrt{n}\sigma}[/tex]

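Look up those two endpoint values in a standard normal distribution table. The probability you want is approximately the difference between the two table values, that is, the standard normal cumulative probability at the upper endpoint minus the one at the lower endpoint.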
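The recipe above can be sketched in a few lines of Python, with `math.erf` standing in for the table. Everything specific here is an assumption chosen for illustration: the function names, the Bernoulli(0.5) base distribution (picked because the exact answer is then a binomial sum that can be computed directly), and the endpoints. The endpoints are half-integers so that no possible value of the sum sits on a boundary, which also tends to make the normal approximation noticeably closer for a discrete sum.

```python
import math

def phi(z):
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def clt_interval_prob(a, b, n, mu, sigma):
    """Approximate P(a < S_n < b) by standardizing the endpoints a and b."""
    scale = math.sqrt(n) * sigma
    z_a = (a - n * mu) / scale
    z_b = (b - n * mu) / scale
    return phi(z_b) - phi(z_a)

def exact_bernoulli_sum_prob(a, b, n, p):
    """Exact P(a < S_n < b) when X_i ~ Bernoulli(p), i.e. S_n ~ Binomial(n, p)."""
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)
        if a < k < b
    )

n, p = 100, 0.5
mu = p                           # mean of one Bernoulli(p) variable
sigma = math.sqrt(p * (1 - p))   # its standard deviation
a, b = 44.5, 55.5                # illustrative endpoints for S_n

approx = clt_interval_prob(a, b, n, mu, sigma)
exact = exact_bernoulli_sum_prob(a, b, n, p)
print(f"normal approximation: {approx:.4f}")
print(f"exact binomial value: {exact:.4f}")
```

Here [itex]n\mu = 50[/itex] and [itex]\sqrt{n}\sigma = 5[/itex], so the standardized endpoints are -1.1 and 1.1. Those are the values one would look up in the table.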
 
ssd
Essentially, the Central Limit Theorem says that if we consider all possible samples of size n from a distribution having finite mean, [itex]\mu[/itex], and finite standard deviation, [itex]\sigma[/itex], then their sum will be approximately normally distributed with mean [itex]n\mu[/itex] and standard distribution [itex]\sqrt{n}\sigma[/itex]- and the larger n is the better that approximation will be.
This is known as the Lindeberg-Levy CLT. I am not sure, but I have a doubt about the term "all possible samples". Essentially, for a sequence of independent and identically distributed random variables with the same mean [itex]\mu[/itex] and the same finite sd [itex]\sigma[/itex], the quantity [itex](S_n - n\mu)/(\sqrt{n}\sigma)[/itex], where [itex]S_n[/itex] is the sum of the first n variables, asymptotically follows the normal distribution with mean 0 and standard deviation 1 (I believe "distribution" is a typing mistake).
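Written out as a limit, and using the same symbols as the opening post, the statement reads as follows. The extra condition [itex]\sigma > 0[/itex] is spelled out here only so that the division makes sense. For every fixed [itex]a < b[/itex],

[tex]
\lim_{n \to \infty} P\left(a < \frac{\sum_{i=1}^n X_i - n\mu}{\sqrt{n}\sigma} < b\right)
= \frac{1}{\sqrt{2\pi}} \int_a^b e^{-y^2/2} \, dy
[/tex]

For any finite n the two sides are only approximately equal, which is why the normal table gives an approximation rather than an exact probability.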
 
HallsofIvy
Yes, I meant "all possible samples" for a specific distribution, hence "independently and identically distributed".

You are right that I mean "standard deviation", not "standard distribution"!
 
