"The distribution of heights is not Gaussian"


Discussion Overview

The discussion revolves around the distribution of human heights, specifically whether it can be accurately modeled as a Gaussian (normal) distribution. Participants explore the implications of this distribution in relation to statistical modeling, historical perspectives, and the nuances of height data across different populations.

Discussion Character

  • Debate/contested
  • Technical explanation
  • Conceptual clarification

Main Points Raised

  • Some participants assert that the distribution of human heights is typically represented as Gaussian in textbooks, particularly within a certain range close to the mean, such as between 5'5" and 6'3".
  • Others argue that the distribution is not Gaussian, citing that it is skewed and may resemble a Maxwell-Boltzmann distribution instead, particularly when considering the limits of human height.
  • A participant references historical work by Quetelet, suggesting that human traits, including height, were proposed to follow a normal distribution, influencing public health measures like the body mass index.
  • There is mention of a lognormal distribution as a potentially better model for human height data, with a reference to a study that discusses this approach.
  • Some participants express uncertainty about the validity of applying different statistical methods to parts of the data treated as separate distributions.
  • Concerns are raised about the significance of deviations from normality in large samples, suggesting that even small departures could be meaningful in a population of billions.

Areas of Agreement / Disagreement

Participants do not reach a consensus on whether the distribution of human heights can be considered Gaussian, with multiple competing views presented regarding the appropriateness of different statistical models.

Contextual Notes

Limitations include the potential for significant deviations from normality in large samples and the complexity of modeling height distributions across different demographics, such as age and sex.

bluemoonKY
I was browsing old threads at Physics Forums and came across a 2008 thread that caught my interest. The thread is titled "Child molester avoids prison because he is short." PF member stickythighs wrote the following: "Since the average American man is 5'10", there are about equal numbers of men in Florida 5'5" and shorter as there are men 6'3" and taller." In post #25 of that thread, PF member Gokul43201 responded: "Not true. The distribution of heights is not Gaussian. It's almost Boltzmannian, and 5 inches is way bigger than the standard deviation - so a Gaussian approximation could be quite off when you go that far away from the mean. And it is..."

The distribution of human heights is the classic example that statistics textbooks and other textbooks use to show a Gaussian Distribution. (A Gaussian distribution is the same thing as a normal distribution.) I admit that the distribution of human heights is not Gaussian at the tails. In reality, there are far more people at 5+ standard deviations both above and below the mean than a purely Gaussian distribution of human heights would predict.

However, in the example that stickythighs and gokul were discussing, the comparison was between male heights of 5'5" and 6'3". Human height distribution IS Gaussian when you are so close to the mean as 5'5" and 6'3". Therefore, why did Gokul deny that the human height distribution is Gaussian in the 5'5"-6'3" range?
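
To make the symmetry claim concrete, here is a minimal sketch of what a purely Gaussian model predicts for the two cutoffs. The mean of 70 in (5'10") comes from the quoted post; the standard deviation of 3 in is my own assumption for illustration and is not a figure from the thread.

```python
from statistics import NormalDist

# Assumed model: mean 70 in (5'10", from the quoted post), SD 3 in (assumption).
heights = NormalDist(mu=70.0, sigma=3.0)

p_short = heights.cdf(65.0)        # P(height <= 5'5") under the Gaussian model
p_tall = 1.0 - heights.cdf(75.0)   # P(height >= 6'3") under the Gaussian model

print(f"P(<= 65 in) = {p_short:.4f}")
print(f"P(>= 75 in) = {p_tall:.4f}")
print(f"ratio short/tall = {p_short / p_tall:.4f}")
```

Under any Gaussian centered at 5'10" the two tail fractions are equal by symmetry, whatever the standard deviation. Any real imbalance between the two groups would therefore have to come from skewness, which this model does not have.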

Here is a link to the thread that I am referencing: https://www.physicsforums.com/threads/child-molester-avoids-prison-because-he-is-short.249825/page-2

Why did Gokul say that the distribution of human heights is almost Boltzmannian? Clearly it's not.

Note to moderators: The thread that I am referencing is about a child molester avoiding prison because he is short. Its main topic is NOT whether the distribution of human height is Gaussian. The correct etiquette and protocol for a digression is to create a new thread on it, not to hijack the previous thread. There is no thread that I am aware of specifically about whether the distribution of human heights is Gaussian. Therefore, I should not be breaking any rules by creating this thread. It's a new topic.
 
Have a look at the graph here.
https://www.khanacademy.org/science.../a/what-is-the-maxwell-boltzmann-distribution

That graph shows the Maxwell-Boltzmann distribution: it has a lower limit at zero and a long tail extending out along the x axis.

So, with normal human adults there is a minimum height, and a larger maximum. I'm excluding dwarfism and gigantism due to abnormalities. And there is a skewness to the result; the graph is not symmetric across the mean, it is skewed.
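
As a rough way to quantify "skewed", the sketch below computes the skewness of a Maxwell-Boltzmann density numerically and compares it with the textbook closed form. The scale parameter `a` and the helper names are mine, chosen for illustration. A Gaussian has skewness exactly 0.

```python
import math

def mb_pdf(v, a=1.0):
    """Maxwell-Boltzmann density with scale parameter a (defined for v >= 0)."""
    return math.sqrt(2.0 / math.pi) * v * v * math.exp(-v * v / (2.0 * a * a)) / a**3

def moments_by_midpoint(pdf, lo, hi, n=20_000):
    """Mean, variance and skewness of a density via the midpoint rule."""
    h = (hi - lo) / n
    xs = [lo + (i + 0.5) * h for i in range(n)]
    ws = [pdf(x) * h for x in xs]  # probability mass of each small bin
    mean = sum(x * w for x, w in zip(xs, ws))
    var = sum((x - mean) ** 2 * w for x, w in zip(xs, ws))
    third = sum((x - mean) ** 3 * w for x, w in zip(xs, ws))
    return mean, var, third / var ** 1.5

# Integrate far enough out that the neglected tail is negligible.
mean, var, skew_numeric = moments_by_midpoint(mb_pdf, 0.0, 12.0)

# Closed-form skewness of the Maxwell-Boltzmann distribution.
skew_closed = 2.0 * math.sqrt(2.0) * (16.0 - 5.0 * math.pi) / (3.0 * math.pi - 8.0) ** 1.5

print(f"numeric skewness = {skew_numeric:.4f}")
print(f"closed form      = {skew_closed:.4f}")
```

The skewness does not depend on `a`, so it is a fixed property of the Maxwell-Boltzmann shape. That gives one number to compare against the sample skewness of real height data when judging how "almost Boltzmannian" it is.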

What Gokul43201 (no longer active on the forums) said was that it is a poor fit to a Gaussian curve, and close to (but not really) an M-B distribution.
His link to what he cites as a model of the distribution is broken.

How you find a model to fit an existing distribution is interesting. @Dale works with this kind of thing. Maybe he can help clarify what you do with 'almost-fits' situations.

There is also this: Limpert, E; Stahel, W; Abbt, M (2001). "Lognormal distributions across the sciences: keys and clues". BioScience. 51 (5): 341–352
which I cannot get to show in a link. It says that a good human height distribution model is lognormal.
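
To see how far a lognormal model would actually sit from a normal one for height-like data, here is a small sketch. It assumes a mean of 70 in and a standard deviation of 3 in; both numbers are my own assumptions for illustration and do not come from the paper.

```python
import math

# Assumed summary statistics (not from the Limpert et al. paper).
mean_in, sd_in = 70.0, 3.0
cv = sd_in / mean_in  # coefficient of variation

# Lognormal parameters (mu, sigma^2 of log-height) matching that mean and SD.
sigma2 = math.log(1.0 + cv * cv)
mu = math.log(mean_in) - sigma2 / 2.0

# Skewness of a lognormal distribution; a normal distribution has 0.
skew = (math.exp(sigma2) + 2.0) * math.sqrt(math.exp(sigma2) - 1.0)

print(f"CV = {cv:.4f}, sigma = {math.sqrt(sigma2):.4f}")
print(f"lognormal skewness = {skew:.4f}")
```

With a coefficient of variation this small, the lognormal is only mildly right-skewed, so the two models are hard to tell apart near the mean. They differ mainly in the tails.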

Also, it is not valid to claim that a part of the data is distribution A, and another part is distribution B. And then apply statistical methods on each part as if they were separate.
Since I cannot get all the facts, I cannot give you a good answer.
 
bluemoonKY said:
The distribution of human heights is the classic example that statistics textbooks and other textbooks use to show a Gaussian Distribution.

You can reference Lambert Adolphe Jacques Quetelet for bringing statistical studies into the humanities. His conclusion is that the traits of the average man follow a normal distribution. That has been followed ever since in many areas.

https://en.wikipedia.org/wiki/Adolphe_Quetelet
In his 1835 text on social physics, in which he presented his theory of human variance around the average, with human traits being distributed according to a normal curve, he proposed that normal variation provided a basis for the idea that populations produce sufficient variation for artificial or natural selection to operate.

In terms of influence over later public health agendas, one of Quetelet's lasting legacies was the establishment of a simple measure for classifying people's weight relative to an ideal for their height. His proposal, the body mass index (or Quetelet index), has endured with minor variations to the present day. Anthropometric data is used in modern applications and referenced in the development of every consumer-based product.

You may also want to read this.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2831262/
Our data are based on complete enumerations, not samples. The number of young men whose heights were tabulated in the Torre reports rose from about 250,000 individuals for the cohorts born before 1860 to over half a million for those born after 1905. Overall, the heights of over 21 million individuals were tabulated in these records.
See figure 3 for the tabulated distribution of 20-year-old males (1900): raw data, adjusted, and fitted normal distribution.

For other populations, the height curve could well not follow a normal distribution.
For example, age: take the whole population from age 0 to old age (60, 70, ...). What is that going to look like?
For example, sex: males and females follow two different curves. Bring them together and one gets a flat-ish top.
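
The flat-ish top can be checked with a simple two-component sketch. The means (64.5 in and 70 in) and the shared standard deviation (2.8 in) below are assumptions of mine for illustration, not values from the paper linked above.

```python
from statistics import NormalDist

# Assumed component distributions (illustrative numbers, not measured data).
female = NormalDist(mu=64.5, sigma=2.8)
male = NormalDist(mu=70.0, sigma=2.8)

def mixture_pdf(x):
    """Density of a 50/50 mix of the two assumed populations."""
    return 0.5 * female.pdf(x) + 0.5 * male.pdf(x)

# For an equal-weight, equal-variance mixture with half-separation d,
# the excess kurtosis is -2 d^4 / (sigma^2 + d^2)^2; a normal curve has 0.
d = (male.mean - female.mean) / 2.0
excess_kurtosis = -2.0 * d**4 / (male.variance + d**2) ** 2

# Such a mixture has a single peak only if the means differ by at most 2 sigma.
is_unimodal = (male.mean - female.mean) <= 2.0 * male.stdev

for x in (64.5, 67.25, 70.0):
    print(f"pdf({x}) = {mixture_pdf(x):.4f}")
print(f"excess kurtosis = {excess_kurtosis:.3f}, unimodal = {is_unimodal}")
```

With these assumed numbers the combined curve still has one peak, but its excess kurtosis is negative, which is the flattened top described above. Pushing the two means further apart would eventually give two peaks.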
 
bluemoonKY said:
Human height distribution IS Gaussian when you are so close to the mean as 5'5" and 6'3".
If you have a large enough sample then even the tiniest departures from normality become significant. With N on the order of a billion I am sure that it is not normal.

The question isn't really whether or not something is normal, just whether or not the approximation is close enough that you can use the nice simplifying assumption that normality provides.
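
A back-of-the-envelope sketch of the sample-size point: under normality the sample skewness has a standard error of roughly sqrt(6/N) for large N, so a fixed true skewness gives a test statistic that grows like sqrt(N). The true skewness of 0.05 used here is purely an assumed value, and the function names are mine.

```python
import math

def skewness_z_score(true_skew, n):
    """Approximate z-score of a skewness test, using SE ~ sqrt(6/n)."""
    return true_skew * math.sqrt(n / 6.0)

def n_for_significance(true_skew, z_crit=1.96):
    """Approximate sample size at which the skewness reaches z_crit."""
    return 6.0 * (z_crit / true_skew) ** 2

assumed_skew = 0.05  # hypothetical tiny departure from normality

for n in (100, 10_000, 1_000_000, 1_000_000_000):
    print(f"N = {n:>13,}: z = {skewness_z_score(assumed_skew, n):.2f}")

print(f"N needed for z = 1.96: {n_for_significance(assumed_skew):,.0f}")
```

At N = 100 the departure is invisible, while at N around a billion the same tiny skewness is overwhelmingly significant. The normal approximation is no worse for practical purposes at large N; the test has simply become able to detect the departure.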
 
Closed threads should not be re-opened without moderator approval. I think this has been answered.
 
