Height probability

  1. Nov 19, 2011 #1
    I have two questions.

    Suppose I am a parent and want to predict how tall my child will be as an adult. I want to survey family members from both sides.

    Should I survey my brothers and sisters, or should I survey my 20-year-old nephews and nieces? I think I should survey my brothers and sisters. And they should be about the same age.

    Also, the distribution of those heights will always approximate a normal distribution. That's what I learned in biology yesterday. How do I use this fact to predict the adult height of my child?
     
  2. jcsd
  3. Nov 23, 2011 #2
    Perhaps, but not if there are only a few samples.
     
  4. Nov 23, 2011 #3

    chiro

    Science Advisor

    For small sample sizes, it is not wise to use classical statistical methods.

    The classical methods are asymptotic and they rely on having a large enough sample size.
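
To illustrate the point about asymptotic methods, here is a minimal simulation sketch. It assumes heights are independent draws from a normal distribution with a hypothetical mean of 170 cm and standard deviation of 7 cm (neither number comes from this thread), and it checks how often the large-sample interval `mean ± 1.96 * s / sqrt(n)` actually contains the true mean.

```python
import random
import statistics


def z_interval_coverage(n, trials=4000, mu=170.0, sigma=7.0, seed=0):
    """Fraction of simulated samples whose large-sample 95% interval contains mu.

    mu and sigma are hypothetical values chosen only for illustration.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = statistics.fmean(sample)
        # Large-sample (normal) half-width using the sample standard deviation.
        half_width = 1.96 * statistics.stdev(sample) / n ** 0.5
        if abs(mean - mu) <= half_width:
            hits += 1
    return hits / trials


coverage_small = z_interval_coverage(n=3)
coverage_large = z_interval_coverage(n=30)
print(f"n = 3:  coverage {coverage_small:.3f}")
print(f"n = 30: coverage {coverage_large:.3f}")
```

With three observations the nominal 95% interval should cover the true mean noticeably less often than 95% of the time, while with thirty it comes much closer.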

    Apart from this, you need to understand a little bit about the underlying process. Your brothers' and sisters' data might better represent the outcome than your nieces' data. If biology has results that say that your nieces may not be a good representation of what you are looking for, using that data might be more damaging than not using it at all.

    Also, when trying to predict outcomes, you would typically use a simple linear model.

    One thing you need to be aware of is that growth is a highly non-linear process. By this I mean that we don't grow at a constant rate: there are periods where our growth is sudden and there are periods where our growth is somewhat negligible.

    If you want to fit some kind of model it would have to take this into account, and this kind of model would require a little bit of advanced mathematics to transform your data correctly so that you a) get a transformed linear model that makes sense and b) can transform it back to what it originally represents to get your predicted values.
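
The transform, fit, and back-transform idea can be sketched as follows. This is only an illustration: the log-log transform is an assumption, not something the post specifies, and the ages and heights are hypothetical numbers. A fit like this describes the childhood range it was fitted on and says nothing reliable about adult height.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical childhood measurements (age in years, height in cm).
ages = [2, 4, 6, 8, 10]
heights = [87, 102, 115, 128, 138]

# a) Transform both variables so the relationship is closer to linear.
log_ages = [math.log(a) for a in ages]
log_heights = [math.log(h) for h in heights]
slope, intercept = fit_line(log_ages, log_heights)


def predict_height(age):
    """b) Predict on the log scale, then transform back to centimetres."""
    return math.exp(intercept + slope * math.log(age))


print(f"Predicted height at age 7: {predict_height(7):.1f} cm")
```

Note that exponentiating a prediction made on the log scale gives an estimate of the median rather than the mean, which is one of the subtleties of transforming back.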

    If you have small sample sizes, you should probably use Bayesian statistics. This approach is based on conditional probability: you use a distribution to represent your prior information and then update it with the data you have.
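
As a minimal sketch of what a prior distribution does, consider the simplest textbook case: a normal prior on the mean height and normally distributed observations with a known spread. The function name `posterior_normal`, the prior values, the noise standard deviation, and the relatives' heights are all hypothetical choices for illustration.

```python
def posterior_normal(prior_mean, prior_sd, data, noise_sd):
    """Conjugate update for a normal mean when the observation sd is known.

    Returns the posterior mean and posterior standard deviation.
    """
    n = len(data)
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / noise_sd ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    sample_mean = sum(data) / n
    # Precision-weighted average of the prior mean and the sample mean.
    post_mean = post_var * (prior_precision * prior_mean + data_precision * sample_mean)
    return post_mean, post_var ** 0.5


# Hypothetical adult heights of three relatives, in cm.
relatives_cm = [178.0, 183.0, 175.0]

post_mean, post_sd = posterior_normal(
    prior_mean=170.0,  # assumed population-level prior mean
    prior_sd=10.0,     # assumed prior uncertainty
    data=relatives_cm,
    noise_sd=7.0,      # assumed spread of individual heights
)
print(f"Posterior mean {post_mean:.1f} cm, posterior sd {post_sd:.1f} cm")
```

The posterior mean lands between the prior mean and the sample mean, and the fewer observations there are, the more the prior dominates.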

    In conclusion, there are a lot of things to consider, even though the problem seems relatively simple. The non-linearity, combined with small sample sizes and a poor understanding of the process involved, can give you results that may not be useful.
     
  5. Dec 12, 2011 #4
    Ignoring the other factors such as any genetic disorders that may affect my children's height, can I not use the Student's t-distribution?
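
For reference, a t-based 95% prediction interval for one new observation can be sketched as below. It assumes the sampled heights are independent and normally distributed, which is exactly the assumption under discussion in this thread. The heights are hypothetical, and the helper `t_prediction_interval` is an illustrative name; the critical values are the standard two-sided 95% values of Student's t for 1 to 10 degrees of freedom.

```python
import statistics

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom.
T_CRIT_95 = {
    1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
    6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
}


def t_prediction_interval(sample):
    """95% prediction interval for a single new value from the same population."""
    n = len(sample)
    df = n - 1
    if df not in T_CRIT_95:
        raise ValueError("this sketch only tabulates 1 to 10 degrees of freedom")
    mean = statistics.fmean(sample)
    s = statistics.stdev(sample)
    # The extra "1 +" accounts for the variability of the new observation itself.
    half_width = T_CRIT_95[df] * s * (1 + 1 / n) ** 0.5
    return mean - half_width, mean + half_width


# Hypothetical adult heights of five relatives, in cm.
heights_cm = [178.0, 183.0, 175.0, 170.0, 180.0]
low, high = t_prediction_interval(heights_cm)
print(f"95% prediction interval: {low:.1f} cm to {high:.1f} cm")
```

With only five observations the interval is wide, which reflects how little a small sample pins down.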
     
  6. Dec 12, 2011 #5

    chiro

    Science Advisor

    Since you want to make predictions, chances are you want to have some kind of model that has regression coefficients.

    Since your model is not a simple model (because of things like growth spurts), you can't just use the data to get a simple linear or even non-linear model.

    This creates a bit of difficulty if you want a model that is reasonable for this kind of problem.
     
  7. Dec 15, 2011 #6
    Huh. I thought it always approximated the normal distribution. I guess empirical probability would be my best friend here. If I had like 30 or so siblings, the height distribution might come close enough to being normally distributed.
     
  8. Dec 15, 2011 #7
    The fact that you're sampling from a distribution of siblings means it will be difficult to defend the independence assumption for a variable like height.
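
One way to see why dependence matters is the effective sample size. If every pair of siblings has the same correlation `rho` in height, the variance of their mean height is `sigma^2 / n * (1 + (n - 1) * rho)`, so `n` correlated siblings carry as much information about the mean as `n / (1 + (n - 1) * rho)` independent people. The sketch below assumes that equal-correlation model; the value `rho = 0.5` is a hypothetical placeholder, not an estimate from this thread.

```python
def effective_sample_size(n, rho):
    """Number of independent observations equivalent to n equally correlated ones."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 <= rho <= 1.0:
        raise ValueError("this sketch assumes 0 <= rho <= 1")
    return n / (1 + (n - 1) * rho)


# Thirty siblings with an assumed pairwise height correlation of 0.5.
n_eff = effective_sample_size(30, 0.5)
print(f"Effective sample size: {n_eff:.2f}")
```

Under that assumption, even thirty siblings are worth only about two independent observations when estimating a mean.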
     