# Why is so much well described by the normal distribution?

Science Advisor
Why are so many phenomena well described by the normal distribution?

For example: the height of 18 year old males in Sweden, the weight of apples on a particular tree, the volume of coke cans (supposed to be 33 cl), etc. etc. are all well described by the normal distribution.

How come?

A typical answer would be to refer to the Central Limit Theorem (CLT). In its standard formulation, the CLT says that the distribution of the (normalized) average of n independent, identically distributed stochastic variables approaches the standard normal distribution as n → ∞.

Although there are several versions of CLT for which the assumptions are weakened, I still don't see how it can be applied to the cases above. Since these don't deal with averages, how can CLT in any form be applied?

• DuckAmuck

## Answers and Replies

jfizzix
Science Advisor
Gold Member
> Why are so many phenomena well described by the normal distribution? [...] Since these don't deal with averages, how can CLT in any form be applied?

When the central limit theorem is not explicitly invoked, it's often just really easy to model certain quantities with a normal distribution. For example, there are lots of quantities (including some of the ones you list) that are Poisson-distributed (e.g., the number of giraffes in a square mile of the savannah, or the number of photons per second a detector sees from a laser). When the mean values are sufficiently far from zero, these Poisson distributions look almost exactly like normal distributions.
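As a quick numerical sketch of this point (the choice λ = 100 and the comparison range are mine, not from the thread), one can compare a Poisson pmf with the normal pdf having the same mean and variance:

```python
import math

lam = 100.0                 # arbitrary "large" mean for illustration
mu, sigma = lam, math.sqrt(lam)  # matching normal parameters

def poisson_pmf(k, lam):
    # exp(k*log(lam) - lam - log(k!)) avoids overflow for large k
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def normal_pdf(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# largest pointwise gap between the two over a wide range of counts
max_diff = max(abs(poisson_pmf(k, lam) - normal_pdf(k, mu, sigma))
               for k in range(0, 200))
print(f"max |Poisson - Normal| = {max_diff:.6f}")  # shrinks as lam grows
```

Rerunning with a larger λ makes the gap smaller still, which is the "sufficiently far from zero" part of the claim.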

Science Advisor
> When the central limit theorem is not explicitly invoked, it's often just really easy to model certain quantities with a normal distribution. [...] there are lots of quantities [...] that are Poisson-distributed [...] When the mean values are sufficiently far from zero, these Poisson distributions look almost exactly like normal distributions.
OK, I agree in these cases where the Poisson distribution seems natural. But what about the distributions in my three examples and others similar to those?

jfizzix
Science Advisor
Gold Member
Distributions of human heights (within one gender), apple weights, and can volumes are not necessarily exactly normal, even though they seem to fit a normal curve decently well. However, there is another principle that explains why the normal distribution is so popular: the principle of maximum entropy.

Basically, for a random process like growing apples or cutting blocks of metal for cans, if all you know is the average size and the standard deviation of the sizes, the most conservative estimate of the probability distribution is a normal distribution, because among all distributions with that mean and standard deviation it is the one with maximum uncertainty (as measured by entropy). This is why all sorts of random error and noise are modeled as normally distributed.
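The maximum-entropy claim can be checked in closed form: among distributions with a given variance, the normal has the largest differential entropy. A minimal sketch, using the standard entropy formulas for three distributions matched to the same variance (σ = 1 is an arbitrary choice; the ordering holds for any σ):

```python
import math

sigma = 1.0  # common standard deviation for all three distributions

# Differential entropies (in nats) for variance sigma^2:
h_normal = 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)
h_uniform = math.log(sigma * math.sqrt(12))        # uniform with matching variance
h_laplace = 1 + math.log(math.sqrt(2) * sigma)     # Laplace with matching variance

print(h_normal, h_laplace, h_uniform)  # normal comes out largest
```

The same inequality holds for any candidate density with that variance; uniform and Laplace are just two convenient comparisons with known closed forms.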

• Fooality and OrangeDog
FactChecker
Science Advisor
Gold Member
> Why are so many phenomena well described by the normal distribution? For example: the height of 18 year old males in Sweden, [...]
Notice that you have eliminated much of the non-normal variation by specifying males in Sweden. The more you limit the set to eliminate non-normal variation due to known causes, the more nearly normal the remaining randomness is. Suppose we divided a male Swede's body into a million parts and scored each one as 0 if it was small and 1 if it was large. If the parts were independent, his total height would follow a binomial distribution, and the limit of a binomial distribution is normal. There may be an argument that specific genes still make the part sizes dependent. Clearly, genetics makes elephants larger than humans, so there is great non-normal variation caused by genetics. But within the males of Sweden, the remaining randomness of body part sizes and proportions may have some degree of independence, caused by different independent genes.
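This many-small-parts argument is easy to simulate. A sketch under assumed toy numbers (1000 parts rather than a million, a fair 0/1 split, and a population of 2000 — all arbitrary choices for speed):

```python
import math
import random
import statistics

random.seed(0)
n_parts, p = 1000, 0.5  # each "body part" is independently large with prob. p

# Each simulated height is the number of "large" parts: a Binomial(n, p) draw.
heights = [sum(1 for _ in range(n_parts) if random.random() < p)
           for _ in range(2000)]

mu = n_parts * p                           # binomial mean
sigma = math.sqrt(n_parts * p * (1 - p))   # binomial standard deviation

# For a normal distribution, about 68% of values fall within one sigma.
within_one_sigma = sum(abs(h - mu) <= sigma for h in heights) / len(heights)
print(round(statistics.mean(heights)), round(within_one_sigma, 2))
```

The one-sigma fraction lands near the normal benchmark of 0.68, which is the de Moivre–Laplace limit at work.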

Stephen Tashi
Science Advisor
> Why are so many phenomena well described by the normal distribution?

Suppose the net outcome of some process (like a person's total height) is the sum (meaning the arithmetic sum) of a large number of independent random variables, each of which contributes a small amount to that sum. In that situation, a population of outcomes created by such a process will have an approximately normal distribution.

In the Central Limit Theorem, the approximation concerns the sample mean, not the raw sum of the sample values. Computing the sample mean involves "scaling" the sum of a large number of random variables so that each individual outcome contributes only a small amount.

A particular physical situation would have to be considered on its own merits, but one can form a vague mental picture of the outcome of many processes (such as "grow a person to his adult height") as being approximated by the sum of many small "factors" (in the common-language sense of that word; mathematically they are summands), each of which contributes a small amount to the final result.
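The point that the summands themselves need not be normal can be sketched with deliberately non-normal contributions: here each "factor" is uniform on [0, 1] (the counts 50 and 5000 are arbitrary choices of mine):

```python
import math
import random

random.seed(1)
n_summands, n_samples = 50, 5000

# Each outcome is a raw sum of many small, independent, non-normal terms.
sums = [sum(random.random() for _ in range(n_summands))
        for _ in range(n_samples)]

mu = n_summands / 2                  # each uniform summand has mean 1/2
sigma = math.sqrt(n_summands / 12)   # and variance 1/12

# A normal distribution puts about 68% of its mass within one sigma.
within = sum(abs(s - mu) <= sigma for s in sums) / n_samples
print(round(within, 2))
```

Even though no averaging is done explicitly, the distribution of the raw sums already matches the normal one-sigma benchmark closely.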

• Fooality
chiro
Science Advisor
Hey Erland.

Building on what has been said above, normal distributions show up for means (expectations), provided you have enough information in the sample.

Expectations find themselves in a number of common statistics and many statistical theorems are based on Normal distributions because of the Central Limit Theorem (and its variants).

If you are looking at a raw distribution (as opposed to a mean), there are some possible explanations for this.

You probably have to look at the process itself and consider some basic assumptions on why something would become Normally distributed.

For the case of something like height, that has more to do with things like genetics and environmental factors than anything else.

Being able to describe a process's distribution requires an understanding of the process itself, not just of some sampling distribution, and that is a different topic.

One thing to do is to try to capture the variation within a process and then see which variables best capture that variation. This can also involve looking at whether the variances in certain conditions are minimized (for example, tall people tend to have tall children, and so on).

It's an interesting question though and quite thought provoking.

Science Advisor
Thank you all for your replies, and my apologies for not thanking you until now.

jfizzix, I must look up this about entropy, it looks interesting.

FactChecker and Stephen Tashi both talk about a body being made up of many small parts which at least might be independent; if so, the CLT can be applied to their "sum", giving a normal distribution. This is probably the standard answer to my question, and it looks quite reasonable. But we often use the normal distribution without making these kinds of considerations. chiro talks about trying to capture the variation within a process, and again, we rarely do so. Maybe so many quantities have this kind of variation that, by chance, this kind of "summation" argument can often be made. Is that the point of the entropy argument?

chiro
Science Advisor
Entropy reflects information content, and this (often) is related to variance in a statistical setting.

Basically, the lower the uncertainty, the more information the sample carries about the invariants (parameters) being estimated. These two things in statistics, variance and information, are often inversely related to each other.

Most attributes can be captured as functions of expectations in one form or another, and in combination with the CLT this can explain why many things end up exhibiting normality in their parameters (not necessarily in the distribution itself), with the quantity of interest then being a function of those parameters.

The other thing is that you can represent a probability distribution as a combination of its moments (much as you can represent a function as a Taylor series in scaled whole powers of the variable), and a sum of normally distributed variables (even with covariance between them) is always normally distributed as well.
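The closing claim about sums of correlated normals is easy to verify numerically. A sketch under assumed parameters (unit-variance variables, correlation ρ = 0.6, 20000 draws — all arbitrary choices): for jointly normal X and Y, Var(X+Y) = Var(X) + Var(Y) + 2·Cov(X,Y).

```python
import math
import random
import statistics

random.seed(2)
rho, n = 0.6, 20000

sums = []
for _ in range(n):
    x = random.gauss(0, 1)
    # Standard construction giving Corr(x, y) = rho with unit variances
    y = rho * x + math.sqrt(1 - rho ** 2) * random.gauss(0, 1)
    sums.append(x + y)

sample_var = statistics.pvariance(sums)
theory_var = 1 + 1 + 2 * rho   # Var(X) + Var(Y) + 2*Cov(X, Y) = 3.2
print(round(sample_var, 2), theory_var)
```

The sample variance of the sums matches the theoretical value, and the sums themselves are exactly normal, not merely approximately so.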