Central Limit Theorem and Standardized Sums

caffeine · Jan 22, 2007

I don't think I *really* understand the Central Limit Theorem.

Suppose we have a set of [itex]n[/itex] independent random variables [itex]\{X_i\}[/itex] with the same distribution function, same finite mean, and same finite variance. Suppose we form the sum [itex]S_n = \sum_{i=1}^n X_i[/itex]. Suppose I want to know the probability that [itex]S_n[/itex] is between a and b. In other words, I want to know [itex]P(b > S_n > a)[/itex].

The central limit theorem uses a standardized sum:

[tex] P\left(b > \frac{ \sum_{i=1}^n X_i - n\mu}{\sqrt{n}\sigma} > a\right) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-y^2/2} \, dy[/tex]

What is the relationship between what I want:

[tex]P(b > S_n > a)[/tex]

and what the central limit theorem tells me about:

[tex]P\left(b > \frac{ \sum_{i=1}^n X_i - n\mu}{\sqrt{n}\sigma} > a\right)[/tex]

How can they possibly be equal? If they are equal, how is that possible? And if they're not equal, how would I get what I want?

HallsofIvy · Jan 22, 2007

caffeine said:

I don't think I *really* understand the Central Limit Theorem.

Suppose we have a set of [itex]n[/itex] independent random variables [itex]\{X_i\}[/itex] with the same distribution function, same finite mean, and same finite variance. Suppose we form the sum [itex]S_n = \sum_{i=1}^n X_i[/itex]. Suppose I want to know the probability that [itex]S_n[/itex] is between a and b. In other words, I want to know [itex]P(b > S_n > a)[/itex].

The central limit theorem uses a standardized sum:

[tex] P\left(b > \frac{ \sum_{i=1}^n X_i - n\mu}{\sqrt{n}\sigma} > a\right) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-y^2/2} \, dy[/tex]

No, the Central Limit Theorem does not use a standardized sum. (At least it doesn't have to. I can't speak for whatever textbook you are using.)

Essentially, the Central Limit Theorem says that if we consider all possible samples of size n from a distribution having finite mean, [itex]\mu[/itex], and finite standard deviation, [itex]\sigma[/itex], then their sum will be approximately normally distributed with mean [itex]n\mu[/itex] and standard distribution [itex]\sqrt{n}\sigma[/itex]- and the larger n is the better that approximation will be.

What is the relationship between what I want:

[tex]P(b > S_n > a)[/tex]

and what the central limit theorem tells me about:

[tex]P\left(b > \frac{ \sum_{i=1}^n X_i - n\mu}{\sqrt{n}\sigma} > a\right)[/tex]

How can they possibly be equal? If they are equal, how is that possible? And if they're not equal, how would I get what I want?

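Well, they are not equal. What holds only approximately is the statement of the theorem itself: for large n, the probability for the standardized sum is close to the normal integral.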

Assuming that your base distribution has mean [itex]\mu[/itex] and standard deviation [itex]\sigma[/itex], then [itex]S_n[/itex] is approximately normally distributed with mean [itex]n\mu[/itex] and standard deviation [itex]\sqrt{n}\sigma[/itex]. You then convert from that to the standard normal distribution (with mean 0 and standard deviation 1) as you probably have learned before:
[tex]z= \frac{x- \mu}{\sigma}[/tex]
In particular, taking [itex]x= S_n[/itex] (with mean [itex]n\mu[/itex] and standard deviation [itex]\sqrt{n}\sigma[/itex] in place of [itex]\mu[/itex] and [itex]\sigma[/itex]), [itex]a< S_n< b[/itex] becomes [itex]a- n\mu< S_n-n\mu< b-n\mu[/itex] and then
[tex]\frac{a-n\mu}{\sqrt{n}\sigma}< \frac{S_n- n\mu}{\sqrt{n}\sigma}< \frac{b- n\mu}{\sqrt{n}\sigma}[/tex]

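Look up those two endpoint values in a standard normal distribution table; the difference between the two cumulative probabilities is approximately [itex]P(a< S_n< b)[/itex].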
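For anyone who would rather compute than use a table, here is a minimal Python sketch of the conversion described in this post. The function names `normal_cdf` and `clt_interval_probability` are made up for illustration, and the die-roll numbers are an assumed example, not something from the thread. The result is only the normal approximation, so it is reasonable for large n and can be poor for small n.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def clt_interval_probability(a, b, n, mu, sigma):
    """Normal approximation to P(a < S_n < b) for a sum of n i.i.d. variables.

    mu and sigma are the mean and standard deviation of a single X_i.
    """
    scale = math.sqrt(n) * sigma      # standard deviation of S_n
    z_a = (a - n * mu) / scale        # standardized lower endpoint
    z_b = (b - n * mu) / scale        # standardized upper endpoint
    return normal_cdf(z_b) - normal_cdf(z_a)


# Assumed example: the sum of 100 fair six-sided die rolls.
# A single roll has mean 3.5 and variance 35/12.
die_mu = 3.5
die_sigma = math.sqrt(35.0 / 12.0)
p = clt_interval_probability(330, 370, 100, die_mu, die_sigma)
print(f"P(330 < S_100 < 370) is approximately {p:.3f}")
```

Note that the endpoints a and b are the ones for the raw sum; the standardization happens inside the function, which is exactly the step that distinguishes the two probabilities in the original question.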

ssd · Jan 22, 2007

HallsofIvy said:

Essentially, the Central Limit Theorem says that if we consider all possible samples of size n from a distribution having finite mean, [itex]\mu[/itex], and finite standard deviation, [itex]\sigma[/itex], then their sum will be approximately normally distributed with mean [itex]n\mu[/itex] and standard distribution [itex]\sqrt{n}\sigma[/itex]- and the larger n is the better that approximation will be.

This is known as the Lindeberg-Lévy CLT. I am not sure about the phrase "all possible samples", though. Essentially, for a sequence of independent and identically distributed random variables with common mean [itex]\mu[/itex] and common finite standard deviation [itex]\sigma[/itex], the quantity [itex](S_n - n\mu)/(\sqrt{n}\sigma)[/itex], where [itex]S_n[/itex] is the sum of the first n variables, asymptotically follows the normal distribution with mean 0 and standard deviation 1. (I believe "standard distribution" is a typing mistake.)
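A quick simulation can make this statement concrete. The sketch below is only an illustration under assumed choices: Uniform(0, 1) variables, n = 30, 20000 repetitions, and a fixed seed, none of which come from the thread. It standardizes each simulated sum and checks that the results behave roughly like a standard normal sample.

```python
import math
import random
import statistics


def standardized_sum(n, rng):
    """Draw n Uniform(0, 1) variables and return (S_n - n*mu) / (sqrt(n)*sigma)."""
    mu = 0.5                       # mean of Uniform(0, 1)
    sigma = math.sqrt(1.0 / 12.0)  # standard deviation of Uniform(0, 1)
    s_n = sum(rng.random() for _ in range(n))
    return (s_n - n * mu) / (math.sqrt(n) * sigma)


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [standardized_sum(30, rng) for _ in range(20000)]

sample_mean = statistics.fmean(samples)
sample_sd = statistics.pstdev(samples)
# Fraction of standardized sums within one unit of zero;
# for a standard normal this is about 0.683.
frac_within_one = sum(-1.0 < z < 1.0 for z in samples) / len(samples)

print(f"mean = {sample_mean:.3f}, sd = {sample_sd:.3f}, "
      f"fraction in (-1, 1) = {frac_within_one:.3f}")
```

The mean and standard deviation come out near 0 and 1 for any n, since that is just what standardizing does; the part that reflects the theorem is the shape, which the fraction inside (-1, 1) probes.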

HallsofIvy · Jan 22, 2007

Yes, I meant "all possible samples" for a specific distribution, and therefore "independently and identically distributed".

You are right that I mean "standard deviation", not "standard distribution"!
