T distribution

  1. Nov 10, 2015 #1
    My understanding of the T distribution is that if you do not know the variance of a population, you estimate the distribution of the mean with the T distribution. But I am not sure about this, because if you know the variance of the population, the law of large numbers shrinks the variance significantly. How can it be that, just because the variance is a chi-square RV, the spread of the distribution becomes much bigger? I feel like I am missing something. Can someone please explain the T distribution?
     
  3. Nov 10, 2015 #2

    Stephen Tashi

    Science Advisor

    You need to make your vocabulary more precise so that you distinguish between things like "mean" and "sample mean".

    The T-distribution is the distribution for the T-statistic. A mean of a distribution would be a single number; the random variable that is the sample mean would have a distribution.
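
For reference, the T-statistic mentioned above is usually written as follows. The symbols are assumptions introduced here rather than taken from the thread: $X_1, \dots, X_N$ are independent draws from a normal population with mean $\mu$, $\bar{X}$ is the sample mean, and $S$ is the sample standard deviation.

```latex
T = \frac{\bar{X} - \mu}{S / \sqrt{N}},
\qquad
S^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left( X_i - \bar{X} \right)^2 ,
\qquad
T \sim t_{N-1}
```

The population standard deviation $\sigma$ does not appear: it is replaced by $S$, which is itself random, and that extra randomness is what gives the t-distribution heavier tails than the normal.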

    If you knew the variance, you'd know it. It wouldn't "shrink". It would be a single number.
     
  4. Nov 10, 2015 #3
    I thought the t distribution was the distribution of the sample mean if the variance is unknown.

    But how can you have a distribution of the sample mean if you do not know the population mean or the population variance?
     
  5. Nov 10, 2015 #4
    I just don't really get why you say the sample mean is the population mean, but then you can't do that for the population variance. Can you please explain the t distribution?
     
  6. Nov 10, 2015 #5
    But I don't understand: if you use the t distribution to find the probability of the sample mean taking on some value, why is the sample mean normally distributed?
     
  7. Nov 10, 2015 #6

    Stephen Tashi

    Science Advisor

    Think of it this way: Suppose you want to graph the distribution of the sample mean of a sample of size N from a population that is normally distributed. The way you draw your graph depends on the population mean (which defines the central peak) and the population standard deviation and the value of N - they determine the "spread" of the graph. So to draw the graph, you have to specify 3 values.

    Now suppose you want to graph the distribution of the t-statistic for the sample mean. The shape of this graph depends only on N.
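
The contrast between the two graphs can be checked numerically. The sketch below is only an illustration: the function names, the seeds, and the two parameter settings are invented here. It draws normal samples with Python's standard library and reuses the seed, so that both runs see the same underlying standard normal draws (this relies on how `random.gauss` is implemented in CPython).

```python
import math
import random
import statistics


def t_statistics(mu, sigma, n, reps, seed):
    """Simulate `reps` t-statistics from normal samples of size n."""
    rng = random.Random(seed)
    out = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        s = statistics.stdev(sample)  # sample standard deviation, n - 1 divisor
        out.append((xbar - mu) / (s / math.sqrt(n)))
    return out


def sample_means(mu, sigma, n, reps, seed):
    """Simulate `reps` sample means from normal samples of size n."""
    rng = random.Random(seed)
    out = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        out.append(statistics.fmean(sample))
    return out


if __name__ == "__main__":
    # Hypothetical settings: a standard normal population and a shifted, wider one.
    print(t_statistics(0.0, 1.0, n=5, reps=3, seed=1))
    print(t_statistics(50.0, 7.0, n=5, reps=3, seed=1))
    # The sample means, by contrast, move and spread with mu and sigma.
    print(statistics.stdev(sample_means(0.0, 1.0, n=5, reps=2000, seed=2)))
    print(statistics.stdev(sample_means(50.0, 7.0, n=5, reps=2000, seed=2)))
```

The two lists of t-statistics agree up to rounding even though the populations differ, while the spread of the sample means scales with sigma. Changing `n` is the only thing that changes the shape of the simulated t-statistics.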

    In a typical real-life problem, we take samples from an (assumed) normally distributed population without knowing the population mean or population standard deviation. If we use some procedure to estimate the population mean, it is very complicated to make a precise statement about the "quality" of the estimate. You can make a "confidence interval" type of statement about the sample mean using the normal distribution if you know (or assume you know) the population standard deviation. You can make a "confidence interval" type of statement about the sample mean using the t-distribution without assuming you know the population standard deviation, because the t-distribution uses a version of the sample standard deviation.
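
A minimal sketch of the two kinds of interval, under stated assumptions: the ten measurements are invented for illustration, the helper names are hypothetical, and the t critical value 2.262 (two-sided 95%, 9 degrees of freedom) is copied from a standard t-table because the Python standard library has no t quantile function.

```python
import math
import statistics

# Hypothetical measurements, invented for illustration only.
sample = [5.1, 4.9, 5.6, 5.3, 4.7, 5.0, 5.4, 5.2, 4.8, 5.5]
n = len(sample)
xbar = statistics.fmean(sample)
s = statistics.stdev(sample)  # sample standard deviation, n - 1 divisor

T_CRIT_DF9 = 2.262  # two-sided 95% critical value for 9 df, from a t-table


def z_interval(xbar, sigma, n, level=0.95):
    """Interval for the population mean when sigma is known (or assumed)."""
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    half = z * sigma / math.sqrt(n)
    return xbar - half, xbar + half


def t_interval(xbar, s, n, t_crit):
    """Interval for the population mean using the sample standard deviation."""
    half = t_crit * s / math.sqrt(n)
    return xbar - half, xbar + half


if __name__ == "__main__":
    print("z interval, pretending sigma equals s:", z_interval(xbar, s, n))
    print("t interval, sigma unknown:            ", t_interval(xbar, s, n, T_CRIT_DF9))
```

With the same spread plugged in, the t interval is wider than the z interval. That extra width is the price of estimating the standard deviation from only ten values.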

    For large sample sizes, theory says that the sample standard deviation is probably a very good estimator of the population standard deviation so, for large sample sizes, people treat the sample standard deviation as if it were the population standard deviation and do calculations with the normal distribution. For small sample sizes, the t-distribution is used because it is not safe to assume the sample standard deviation is probably near the population standard deviation.
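
One way to see why the shortcut is safe only for large samples is to simulate it. The sketch below uses hypothetical function names and arbitrary sample sizes and seed. It builds intervals with the normal critical value (about 1.96) but with the sample standard deviation plugged in, and counts how often they contain the true mean.

```python
import math
import random
import statistics

Z_CRIT = statistics.NormalDist().inv_cdf(0.975)  # about 1.96


def mean_and_sd(sample):
    """Return the sample mean and the sample standard deviation (n - 1 divisor)."""
    n = len(sample)
    xbar = math.fsum(sample) / n
    ss = math.fsum((x - xbar) ** 2 for x in sample)
    return xbar, math.sqrt(ss / (n - 1))


def coverage_with_z(n, reps=10_000, mu=0.0, sigma=1.0, seed=0):
    """Fraction of intervals xbar +/- Z_CRIT * s / sqrt(n) that contain mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar, s = mean_and_sd(sample)
        if abs(xbar - mu) <= Z_CRIT * s / math.sqrt(n):
            hits += 1
    return hits / reps


if __name__ == "__main__":
    for n in (5, 30, 100):
        print(n, coverage_with_z(n))
```

For small n the nominal 95% intervals cover the true mean noticeably less often than 95% of the time, and the gap closes as n grows. Using the t critical value for n - 1 degrees of freedom instead of `Z_CRIT` is what repairs the small-sample case.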

    Confidence intervals are a complicated topic and interpreting what they tell you is not simple.
     
  8. Nov 11, 2015 #7
    Ok, thank you very much.

    I just don't really understand the sample variance. I thought it was supposed to be more accurate than x bar, but you say they are both normal.

    And one more question: if sigma is known, the distribution of x bar approaches normal with no variation, but if you do not know sigma, the distribution of x bar approaches the normal distribution with variance equal to the sample variance?
     
  9. Nov 11, 2015 #8

    Stephen Tashi

    Science Advisor

    You have to be precise about what you mean by "more accurate".

    From a given population, the distribution of the sample mean of 100 independent random observations has a higher peak than the distribution of the sample mean of 10 independent random observations.

    I don't know what you mean by "approaches normal with no variation".

    There are many different ways to talk about whether one distribution approaches another distribution. These have to do with various definitions for how the limit of a sequence of functions approaches another function.

    I'll make this general suggestion. Try to be more precise in your statements. It won't be easy and it is particularly difficult to do when dealing with topics involving probability. Some people never master mathematical topics because they are too willing to think "I know what I'm talking about" when they utter statements about mathematical topics. Practice some self-doubt. Ask yourself "What exactly am I saying? Do I mean this...or that... ?".
     
  10. Nov 11, 2015 #9
    I just mean the variance approaches 0.

    I'm saying, shouldn't the sample variance be more closely approximated by the normal distribution, because you take sums of standard normal RVs, while for the mean it could be any distribution? And why is it dumb to say that if the sample size is very large, the distribution of the variance should get much, much smaller? I don't get the T distribution.

    There is no way that x bar and S^2 are independent; that is crazy.
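
For samples drawn independently from a normal population, it is a standard result that the sample mean and the sample variance are independent, surprising as that sounds. The simulation below is only a sketch with invented function names and arbitrary settings. It estimates the correlation between x bar and S^2 for normal samples and, for contrast, for exponential samples, where the two are not independent. A correlation near zero does not prove independence, but it is consistent with it.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = math.fsum(xs) / n
    my = math.fsum(ys) / n
    sxy = math.fsum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = math.fsum((x - mx) ** 2 for x in xs)
    syy = math.fsum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def mean_variance_correlation(draw, n=10, reps=5000, seed=0):
    """Correlation between sample mean and sample variance over many samples.

    `draw` is a function taking a random.Random instance and returning one value.
    """
    rng = random.Random(seed)
    means, variances = [], []
    for _ in range(reps):
        sample = [draw(rng) for _ in range(n)]
        xbar = math.fsum(sample) / n
        s2 = math.fsum((x - xbar) ** 2 for x in sample) / (n - 1)
        means.append(xbar)
        variances.append(s2)
    return pearson(means, variances)


normal_corr = mean_variance_correlation(lambda rng: rng.gauss(0.0, 1.0))
expo_corr = mean_variance_correlation(lambda rng: rng.expovariate(1.0))

if __name__ == "__main__":
    print("normal population:     ", round(normal_corr, 3))
    print("exponential population:", round(expo_corr, 3))
```

The independence of x bar and S^2 is special to the normal distribution, and it is the property that makes the T-statistic follow an exact t-distribution.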
     
  11. Nov 12, 2015 #10

    Stephen Tashi

    Science Advisor

    You still haven't stated a precise mathematical question.

    I suggest you look at a textbook problem of how the t-distribution is used to state a confidence interval. Then ask yourself how you would state a confidence interval using a normal distribution. Would you do this by assuming the population variance is exactly equal to the variance that you estimated from the sample?
     
  12. Nov 12, 2015 #11

    Dale

    Staff: Mentor

    No, the variance does not approach 0.

    Let's say you have a population of 1 million adults and their heights are normally distributed with mean 6' and standard deviation 2". The adults do not all become exactly 6' tall just because you take a large sample.
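
Two different spreads are in play in the height example, and a short sketch can keep them apart. The code assumes the figures quoted above (mean 6 feet, i.e. 72 inches, and standard deviation 2 inches); the function names and sample sizes are invented for illustration.

```python
import math
import random
import statistics

POP_MEAN_IN = 72.0  # 6 feet, expressed in inches
POP_SD_IN = 2.0     # population standard deviation in inches


def standard_error(sigma, n):
    """Standard deviation of the sample mean for a sample of size n."""
    return sigma / math.sqrt(n)


def sample_sd(n, seed=0):
    """Standard deviation of n simulated heights (spread of individuals)."""
    rng = random.Random(seed)
    heights = [rng.gauss(POP_MEAN_IN, POP_SD_IN) for _ in range(n)]
    return statistics.stdev(heights)


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        # Columns: sample size, spread of individual heights, spread of the sample mean.
        print(n, round(sample_sd(n), 3), round(standard_error(POP_SD_IN, n), 4))
```

As n grows, the spread of individual heights stays near 2 inches, while the standard error of the sample mean shrinks like 1/sqrt(n). Only the second quantity approaches 0.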
     
  13. Nov 2, 2016 #12
    I mean the variance of the sampling distribution of the mean, when the variance of the population is known, would shrink when you take samples of a large size.

    And my question remains: why would the distribution of the sample mean with an unknown population be normal, while the distribution of the sample mean with a known variance would be a single number when you take samples of, like, 10 billion?

    Is the answer that the distributions of the two are nearly the same, especially when you take large samples, and the variance is not really a parameter in the t-distribution?
     
  14. Nov 2, 2016 #13

    MarneMath

    Education Advisor

    1. "Why would the distribution of the sample mean with an unknown population be normal?" It doesn't have to be. The central limit theorem only applies when you take many samples. An individual sample mean doesn't have to be normal. The central limit theorem simply states that repeatedly sampling the sample means will tend to a normal distribution.

    2. "While the distribution of the sample mean with a known variance be a single number when you take samples of like 10 billion?" Huh? Even if you know the variance of the population (unlikely), you still have a standard error.
     
  15. Nov 2, 2016 #14
     
  16. Nov 2, 2016 #15

    MarneMath

    Education Advisor

    I think you really need to be more careful with how you use terms, because I'm sensing a fundamental misunderstanding of what things mean and what they do in how you're writing. The t distribution requires that the parameters mu and sigma are from a normal distribution, which is applicable because of the central limit theorem. If you know a particular sample comes from a normal distribution and are not relying on the central limit theorem, then why are you using the t-distribution? You can simply pivot off the standard normal distribution since it is location-scale.
     
  17. Nov 2, 2016 #16
    You use the t-distribution because it is more accurate than the normal distribution when the variance of the sample is random, which it always is. You would only use the normal distribution for x-bar if the variance was the same for all of your samples.

    You don't know what the heck you're talking about.
     
  18. Nov 2, 2016 #17

    Stephen Tashi

    Science Advisor

    It isn't clear what you mean when you say "the variance was the same for all your samples".

    Statistical terms such as "variance" are ambiguous. In a typical scenario, there are several different things that can be called "variance". We have a population with a distribution and that distribution has a parameter called "variance". We have samples from the population and the mean values of those samples have a distribution and that distribution has a parameter called "variance". For a sample, we can compute the variance of the sample values about the sample mean. That computation defines a random variable called the "sample variance" (and it has a distribution with a parameter called "variance" of the sample variance). And we can have the specific realization of one numerical value (e.g. 23.65) of the sample variance that comes from one particular sample.
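
The different meanings of "variance" listed above can be put side by side in a small simulation. This is a sketch under stated assumptions: a normal population with standard deviation 3, samples of size 10, and invented names throughout. The theoretical value 2 * sigma^4 / (N - 1) for the variance of the sample variance holds for normal populations.

```python
import math
import random
import statistics

SIGMA = 3.0  # population standard deviation (assumed for the demo)
N = 10       # sample size (assumed for the demo)


def one_sample_summary(rng):
    """Draw one sample of size N and return (sample mean, sample variance)."""
    sample = [rng.gauss(0.0, SIGMA) for _ in range(N)]
    xbar = math.fsum(sample) / N
    s2 = math.fsum((x - xbar) ** 2 for x in sample) / (N - 1)
    return xbar, s2


def simulate(reps=20_000, seed=0):
    """Collect the quantities that all get called 'variance'."""
    rng = random.Random(seed)
    pairs = [one_sample_summary(rng) for _ in range(reps)]
    means = [m for m, _ in pairs]
    variances = [v for _, v in pairs]
    return {
        "population variance": SIGMA ** 2,
        "variance of sample mean (theory)": SIGMA ** 2 / N,
        "variance of sample mean (simulated)": statistics.pvariance(means),
        "average sample variance (simulated)": statistics.fmean(variances),
        "variance of sample variance (theory)": 2 * SIGMA ** 4 / (N - 1),
        "variance of sample variance (simulated)": statistics.pvariance(variances),
        "one realized sample variance": variances[0],
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

The first entry is a fixed parameter, the next entries describe distributions of random variables, and the last entry is a single number computed from one sample. It is the last kind of number that gets plugged into the T-statistic.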
     
  19. Nov 2, 2016 #18

    MarneMath

    Education Advisor

    This goes back to my comment that you need to be more clear about what you're saying. You need to know that your mu and sigma are from a normal distribution. If I gave you a sample, a single sample, and you decided to use the T-distribution, that would be wrong. You have no idea if that single sample is exponential, gamma, beta, etc., so that would violate the need for the mean and variance to come from a normal distribution.

    Secondly, it's unclear what variance you keep jumping to. Do you want to know the variance of the population, the sample variance, the variance of the mean, or the variance of the variance? Then you talk about one sample, but then talk about the variance of samples. Part of mathematics is accurately writing what you want.
     
  20. Nov 2, 2016 #19
    I'm talking about the variance of your sample. Like you have a sample, some numbers, and you compute the variance of these numbers. There is no other way to make this clear.

    MarneMath, I'm not talking about the CLT, OK? Stop bringing up stuff about the central limit theorem.
     
  21. Nov 2, 2016 #20

    berkeman

    Staff: Mentor

    Thread closed for Moderation, obviously...

    Edit: a post has been deleted and the thread will remain closed
     
    Last edited by a moderator: Nov 2, 2016