# Height probability

I have two questions.

Suppose I am a parent and want to predict how tall my child will be as an adult. I want to survey family members from both sides.

Should I survey my brothers and sisters, or should I survey my 20-year-old nephews and nieces? I think I should survey my brothers and sisters. And they should be about the same age.

Also, the distribution of those heights will always approximate a normal distribution. That's what I learned in biology yesterday. How do I use this fact to predict the adult

Also, the distribution of those heights will always approximate a normal distribution.

Perhaps, but not if there are only a few samples.

chiro

For small sample sizes, it is not wise to use classical statistical methods.

The classical methods are asymptotic and they rely on having a large enough sample size.

Apart from this, you need to understand a little about the underlying process. Your brothers' and sisters' data might represent the outcome better than your nieces' and nephews' data. If biology has results saying that your nieces and nephews are not a good representation of what you are looking for, using that data might be more damaging than not using it at all.

Also, when you are trying to predict outcomes, you would normally start with a simple linear model.

One thing you need to be aware of is that growth is a highly non-linear process. By this I mean that we don't grow at a constant rate: there are periods where our growth is sudden and there are periods where our growth is somewhat negligible.

If you want to fit some kind of model it would have to take this into account, and this kind of model would require a little bit of advanced mathematics to transform your data correctly so that you a) get a transformed linear model that makes sense and b) can transform it back to what it originally represents to get your predicted values.
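To make the transform-and-back-transform idea concrete, here is a minimal sketch. It assumes a power-law relationship between age and height, fits a straight line to the log-transformed data by least squares, and exponentiates to get back to centimetres. The ages and heights are made-up illustration values, and `fit_line` and `predict_height` are hypothetical helper names. This does not capture growth spurts; it only shows the mechanics of fitting on a transformed scale.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical childhood measurements (age in years, height in cm).
ages = [2, 4, 6, 8, 10]
heights_cm = [86.0, 102.0, 115.0, 128.0, 138.0]

# a) Transform: fit a linear model on the log-log scale.
log_ages = [math.log(a) for a in ages]
log_heights = [math.log(h) for h in heights_cm]
slope, intercept = fit_line(log_ages, log_heights)

def predict_height(age):
    """b) Back-transform: exponentiate the linear prediction."""
    return math.exp(intercept + slope * math.log(age))

print(round(predict_height(6), 1))
```

A model like this should not be extrapolated to adult ages, since the fitted curve keeps rising while real growth stops.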

If you have small sample sizes, you should probably use Bayesian statistics. It is built on conditional probability: you use a distribution to represent your prior information, and then update that distribution with the data you observe.
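As a rough illustration of what that looks like with a handful of relatives, here is a minimal sketch of the textbook normal prior with normal data update for a mean. It assumes the measurement spread (`noise_sd`) is known, and every number in it (the prior, the spread, the four heights) is a made-up value for illustration; `posterior_normal` is a hypothetical helper name.

```python
from statistics import mean

def posterior_normal(prior_mean, prior_sd, data, noise_sd):
    """Conjugate update for a normal mean when the noise SD is assumed known."""
    n = len(data)
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / noise_sd ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    # Posterior mean is a precision-weighted average of prior mean and sample mean.
    post_mean = post_var * (prior_precision * prior_mean + data_precision * mean(data))
    return post_mean, post_var ** 0.5

# Hypothetical adult heights of four relatives, in cm.
sibling_heights_cm = [172.0, 168.0, 181.0, 175.0]

# Assumed prior: population mean 170 cm with SD 10 cm; assumed noise SD 7 cm.
post_mean, post_sd = posterior_normal(170.0, 10.0, sibling_heights_cm, 7.0)
print(round(post_mean, 1), round(post_sd, 1))
```

With only four observations the posterior mean sits between the prior mean and the sample mean, which is the sense in which the prior compensates for a small sample.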

In conclusion, there are a lot of things to consider, even though the problem seems relatively simple. Non-linearity, small sample sizes, and a poor understanding of the process involved can each give you results that are not useful.

For small sample sizes, it is not wise to use classical statistical methods.

The classical methods are asymptotic and they rely on having a large enough sample size

If you have small sample sizes, you should probably use Bayesian statistics.

Ignoring the other factors such as any genetic disorders that may affect my children's height, can I not use the Student's t-distribution?
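For reference, the calculation this question has in mind might look like the sketch below: a 95% prediction interval for one new height based on Student's t. It assumes the heights are independent draws from a single normal distribution, which may not hold for relatives. The four heights are made-up values, and the critical value 3.182446 is the standard two-sided 95% t quantile for 3 degrees of freedom, hard-coded here so the example needs no external library.

```python
from statistics import mean, stdev

# Hypothetical adult heights of four relatives, in cm.
heights_cm = [172.0, 168.0, 181.0, 175.0]

n = len(heights_cm)
sample_mean = mean(heights_cm)
sample_sd = stdev(heights_cm)  # uses the n - 1 denominator

# Two-sided 95% critical value of Student's t with n - 1 = 3 degrees of freedom.
T_CRIT_DF3 = 3.182446

# Prediction interval for a single new observation, not for the mean:
# the extra "1 +" accounts for the new observation's own variability.
half_width = T_CRIT_DF3 * sample_sd * (1.0 + 1.0 / n) ** 0.5
lo, hi = sample_mean - half_width, sample_mean + half_width
print(round(lo, 1), round(hi, 1))
```

With so few observations the interval is very wide, so it says little more than "somewhere in the normal adult range".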

chiro

Since you want to make predictions, chances are you want some kind of model that has regression coefficients.

Since your model is not a simple model (because of things like growth spurts), you can't just use the data to get a simple linear or even non-linear model.

This creates a bit of difficulty if you want a model that is reasonable for this kind of problem.

Huh. I thought it always approximated the normal distribution. I guess empirical probability would be my best friend here. If I had like 30 or so siblings, the height distribution might come close enough to being normally distributed.

The fact that you're sampling from a distribution of siblings means it will be difficult to defend the independence assumption for a variable like height.