Central Limit Theorem and STEM

In summary, the central limit theorem states that the mean of a large independent random sample from a population has an approximately normal distribution. As the sample size increases, so does the probability that the sample mean lies close to the population mean. The theorem does not guarantee that any actual outcome will be close to the population mean; it speaks only about probabilities and approximations of probability distributions. It is important to understand the precise mathematical statement of the theorem rather than relying on popularized summaries.
  • #1
ampakine
I keep reading explanations that say things like "the mean is normally approximated" but I don't know what that means. Are they saying that if you take a load of samples and plot the means of every one of those samples on a graph, the mean of that graph will be approximately the population mean of the population that you took the samples from? For example, let's say I want to know the probability that I can catapult a gypsy 15 metres or more, so once a year I go around the world randomly picking out gypsies and seeing how far I can catapult them. The population is how far I can catapult every gypsy in the world, but on my yearly trip I only catapult about 20 gypsies. Does the central limit theorem say that if I take the average catapult distances obtained from each of these yearly gypsy catapulting expeditions and plot them on a graph, the graph will keep becoming more and more normal every year as I add a new mean to it? Is that the idea, or have I got it wrong?
 
  • #2
ampakine said:
I keep reading explanations that say things like "the mean is normally approximated" but I don't know what that means.

You should read and ask questions about the precise mathematical statement of the theorem, not about popularized summaries of it. Then people will offer you popularized summaries of it as explanations and you can demand to know what they are talking about.

Are they saying that if you take a load of samples then plot the means of every one of those samples on a graph that the mean of that graph will be approximately the population mean of the population that you took the samples from?

Theorems about probability rarely say anything definite about actual outcomes, approximate or otherwise. Instead they talk about the probabilities of outcomes and about approximations to probability distributions. The mean of an independent random sample of N things from a population has an approximately normal distribution. The standard deviation of this approximately normal distribution gets smaller as N increases. So it would be a correct conclusion that the mean of a large sample is "probably" close to the mean of the population. The Central Limit Theorem implies this, but it wouldn't be correct to say that the Central Limit Theorem "is" that statement.

For example, let's say I want to know the probability that I can catapult a gypsy 15 metres or more, so once a year I go around the world randomly picking out gypsies and seeing how far I can catapult them. The population is how far I can catapult every gypsy in the world, but on my yearly trip I only catapult about 20 gypsies. Does the central limit theorem say that if I take the average catapult distances obtained from each of these yearly gypsy catapulting expeditions and plot them on a graph, the graph will keep becoming more and more normal every year as I add a new mean to it? Is that the idea, or have I got it wrong?

Suppose you have a random variable whose distribution is not normal. Imagine one whose density is shaped like an isosceles triangle. If you take samples of size 1 and plot them as a histogram, then it is probable that your plot will begin to look like an isosceles triangle as you take more and more samples.

Suppose you take samples of size 10 and histogram the mean of those samples (not the value of each of the 10 individually, only the mean of all 10). You do this for many samples of size 10. Then it is probable that your plot won't look like an isosceles triangle. It will be smoother.

If you take samples of size 1000 and histogram their means, the graph will probably be even smoother and more "normal" looking.
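The progression described above is easy to check in a quick simulation. Below is a minimal sketch using only the Python standard library; the helper name `triangular_sample_mean` and the choice of a triangular density on [0, 2] are mine, purely for illustration:

```python
import random
import statistics

random.seed(0)

def triangular_sample_mean(n):
    """Mean of n draws from a symmetric triangular density on [0, 2] (peak at 1)."""
    return statistics.fmean(random.triangular(0, 2, 1) for _ in range(n))

# For each sample size n, draw many sample means and summarize their spread.
# The average of the means stays near the population mean (1), while their
# standard deviation shrinks roughly like 1/sqrt(n).
for n in (1, 10, 1000):
    means = [triangular_sample_mean(n) for _ in range(5000)]
    print(n, round(statistics.fmean(means), 3), round(statistics.stdev(means), 3))
```

Histogramming `means` for each n would show the smoothing described above: ragged and triangular-looking for n = 1, increasingly narrow and bell-shaped for n = 10 and n = 1000.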
 
  • #3
Stephen Tashi said:
The mean of an independent random sample of N things from a population has an approximately normal distribution.

I think it's the terminology that's confusing me. You say that the mean of a sample has an approximately normal distribution. When I think of a sample I think of 1 sample, which can have only 1 mean, so the idea of a distribution doesn't make sense. Do you mean that if I was to take multiple samples from a population, then get the mean of each of these samples and plot them on a graph, I'd have a distribution for the mean?
 
  • #4
If you only look at the mean of, say, 100 independently chosen values of a random variable, you increase the probability that extreme values in the sample will "cancel out". If you only plot the means of such "batched" samples, they have a smaller probability of taking on extreme values than if you plot the values of single samples. (For example, if X is a 0-or-1 random variable, the mean of a sample of 100 X's might be 100/100 = 1, but that is less likely than observing a single example of X = 1. Furthermore, a histogram of single observations can't look very normal, since it has only two bars on it, for X = 0 and X = 1, while the histogram of a sample mean has bars for values like 85/100, 10/100, etc.)
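The 0-or-1 example can be checked numerically. A minimal sketch, assuming a fair coin (p = 0.5); the helper name `bernoulli_mean` is hypothetical, not from the thread:

```python
import random

random.seed(1)

def bernoulli_mean(n, p=0.5):
    """Mean of n independent 0-or-1 draws, each equal to 1 with probability p."""
    return sum(random.random() < p for _ in range(n)) / n

single = [bernoulli_mean(1) for _ in range(10000)]     # each value is 0.0 or 1.0
batched = [bernoulli_mean(100) for _ in range(10000)]  # values like 0.47, 0.55, ...

# A single draw equals 1 about half the time; a mean of 100 draws equal to
# exactly 1.0 would require all 100 draws to be 1 (probability 0.5**100).
print(sum(x == 1.0 for x in single) / len(single))
print(sum(x == 1.0 for x in batched) / len(batched))

# The histogram of single draws can only have two bars; the batched means
# spread over many distinct values, which is what lets them look normal.
print(len(set(single)), len(set(batched)))
```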


If you had 10,000 independent random samples, you could group them into mutually exclusive batches of 10 or 100 or 1000 etc. The Central Limit Theorem doesn't say that you can "cheat the Devil" by doing this and gain greater and greater certainty about the mean value. (The number of batch means decreases as you group the observations into larger and larger batches.) If you take the mean of the 10,000 observations and plot it as a single point then, yes, the Central Limit Theorem says you have plotted 1 observation from an approximately normal distribution, and it tells you about the standard deviation of that distribution.
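The batching trade-off can be sketched as follows: grouping a fixed 10,000 observations into larger batches makes each batch mean less variable (standard deviation roughly sigma/sqrt(n)), but leaves you with fewer batch means to plot. Uniform(0, 1) draws are used purely for illustration:

```python
import math
import random
import statistics

random.seed(2)

# 10,000 independent draws from Uniform(0, 1); the population sd is 1/sqrt(12).
data = [random.random() for _ in range(10000)]
pop_sd = 1 / math.sqrt(12)

# Group the same observations into mutually exclusive batches of size n.
# Larger batches give less variable batch means (sd ~ pop_sd / sqrt(n)),
# but fewer of them -- there is no free extra certainty to be had.
for n in (10, 100, 1000):
    batch_means = [statistics.fmean(data[i:i + n]) for i in range(0, len(data), n)]
    predicted = pop_sd / math.sqrt(n)
    print(n, len(batch_means), round(statistics.stdev(batch_means), 4), round(predicted, 4))
```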
 
  • #5

The Central Limit Theorem is a fundamental result in statistics: the mean of a large number of independent and identically distributed random variables with finite variance is approximately normally distributed, regardless of the shape of the underlying distribution. In other words, if you take many samples from a population and calculate the mean of each sample, the distribution of those sample means will be approximately normal.

In your example, if you were to take many samples of catapult distances and calculate the mean of each sample, the distribution of those means would be approximately normal. This matters because it lets us make inferences about the population mean from a sample mean: the mean of a large sample is probably close to the population mean, and the normal approximation tells us how probable.

This concept is particularly important in STEM fields because it allows us to make predictions and draw conclusions based on limited data. For example, in scientific experiments we often take a sample of individuals and draw conclusions about the entire population from that sample. The Central Limit Theorem lets us quantify the uncertainty in those conclusions, provided the sample is representative of the population.

In summary, the Central Limit Theorem is a powerful tool in statistics that allows us to make inferences about a population based on a sample. It is a fundamental concept in STEM fields and is crucial for understanding and interpreting data.
 

1. What is the Central Limit Theorem?

The Central Limit Theorem is a fundamental result in statistics stating that, as the sample size increases, the distribution of sample means approaches a normal distribution, regardless of the shape of the original population distribution (provided it has finite variance). A related consequence is that the mean of a large enough sample is probably a good estimate of the mean of the entire population.
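One way to see the theorem in action is to standardize the sample mean and check that it behaves like a standard normal variable. A minimal sketch, using the (clearly non-normal, strongly skewed) exponential distribution as an example; the helper name `standardized_mean` is mine:

```python
import math
import random
import statistics

random.seed(3)

def standardized_mean(n):
    """(sample mean - mu) / (sigma / sqrt(n)) for n draws from Exponential(1),
    which has mean 1 and standard deviation 1 but is strongly skewed."""
    m = statistics.fmean(random.expovariate(1.0) for _ in range(n))
    return (m - 1.0) / (1.0 / math.sqrt(n))

# If the CLT approximation is good, roughly 95% of standardized sample
# means should fall within +/-1.96, as for a standard normal variable.
z = [standardized_mean(200) for _ in range(4000)]
coverage = sum(abs(v) <= 1.96 for v in z) / len(z)
print(round(coverage, 3))
```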

2. How is the Central Limit Theorem used in STEM?

In STEM (Science, Technology, Engineering, and Mathematics), the Central Limit Theorem is used to make statistical inferences and predictions based on sample data. It allows scientists to estimate the parameters of a population and test hypotheses using the mean of a sample.

3. Can the Central Limit Theorem be applied to any sample size?

A common rule of thumb is that the normal approximation is adequate for sample sizes larger than about 30, but the theorem itself specifies no such cutoff: how quickly the approximation becomes good depends on the shape of the underlying distribution. As the sample size decreases, the sample mean becomes a less accurate estimate of the population mean and the normal approximation becomes rougher.

4. How does the Central Limit Theorem relate to the Law of Large Numbers?

The Law of Large Numbers states that as the sample size increases, the sample mean converges to the population mean. The Central Limit Theorem is a finer statement: it describes the shape of the fluctuations of the sample mean around the population mean, showing that, suitably scaled, those fluctuations are approximately normally distributed.
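The difference can be demonstrated in a few lines: the raw sample mean settles toward the population mean (Law of Large Numbers), while the deviation scaled by sqrt(n) does not shrink; its spread stays stable and roughly normal (Central Limit Theorem). A sketch with Uniform(0, 1) draws, chosen purely for illustration:

```python
import math
import random
import statistics

random.seed(4)

def sample_mean(n):
    """Mean of n draws from Uniform(0, 1): population mean 0.5, sd 1/sqrt(12)."""
    return statistics.fmean(random.random() for _ in range(n))

# Law of Large Numbers: a single sample mean drifts toward 0.5 as n grows.
for n in (10, 1000, 100000):
    print(n, round(sample_mean(n), 4))

# Central Limit Theorem: the scaled deviation sqrt(n) * (mean - 0.5) does
# NOT shrink to zero; its spread stays near 1/sqrt(12), roughly 0.289.
for n in (10, 1000):
    devs = [math.sqrt(n) * (sample_mean(n) - 0.5) for _ in range(2000)]
    print(n, round(statistics.stdev(devs), 3))
```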

5. Why is the Central Limit Theorem important in science and research?

The Central Limit Theorem is important in science and research because it allows scientists to make reliable inferences and predictions based on sample data. It also quantifies the random variation in data, making it easier to compare and analyze results from different studies.
