Confidence Intervals: t-distribution or normal distribution?

Discussion Overview

The discussion centers on the appropriate use of t-distributions versus standard normal (z) distributions for calculating confidence intervals based on population samples, particularly in relation to sample size and the assumption of normality. Participants explore the implications of sample size, the knowledge of population standard deviation, and the effects of data skewness on the choice of distribution.

Discussion Character

  • Debate/contested
  • Technical explanation
  • Conceptual clarification

Main Points Raised

  • Some participants suggest using the t-distribution when the number of degrees of freedom (df=n-1) is less than 30, while others argue that this guideline is outdated and should not be the sole determining factor.
  • It is proposed that if the assumption of normality holds and the population standard deviation (\sigma) is known, the z-interval should be used; if \sigma is unknown, the t-interval is recommended.
  • Concerns are raised about the appropriateness of using the mean as a measure of central tendency in cases of badly skewed data.
  • One participant expresses hesitation about calculating confidence intervals with very small sample sizes (e.g., n=5), even if \sigma is known, but acknowledges that the z-interval could be appropriate under certain conditions.

Areas of Agreement / Disagreement

Participants express differing views on the relevance of sample size as a criterion for choosing between t and z distributions, with no consensus reached on the best approach. The discussion remains unresolved regarding the implications of skewness and sample size on the validity of confidence intervals.

Contextual Notes

Limitations include the dependence on the assumption of normality and the unknown status of population standard deviation in many practical scenarios. The discussion also highlights the potential inadequacy of confidence intervals in the presence of skewed data.

Richard_R
Hi all,

When working out confidence intervals based on population samples are you supposed to always use t-distributions, standard normal (z) distributions, or do you make a choice based on the sample size?

Up until now I've been lucky enough to have large sample sizes (for some work I'm doing) so have been using the z-distribution. However I now have some data sets which range from n=1 (lol) to n=29 so am not sure if I should now be using t-distributions to define confidence intervals, or how I'd make that decision (e.g. use t-distribution if n<30, for example?)

Thanks
-Rob
 
Richard_R said:
Hi all,

When working out confidence intervals based on population samples are you supposed to always use t-distributions, standard normal (z) distributions, or do you make a choice based on the sample size?
Thanks
-Rob

Assuming normality holds, the general rule is to use the t-distribution to calculate confidence intervals when the number of degrees of freedom (df=n-1) is less than 30. The Z and t scores are similar around this value. Skewed data, particularly in small samples, make CIs fairly useless. In larger samples, normalizing transformations can be useful for constructing CIs.
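To see how close the Z and t scores are near df=30, here is a minimal sketch that computes two-sided 95% critical values for a few degrees of freedom. It uses only the Python standard library: the helper names (`t_pdf`, `t_cdf`, `t_crit`, `z_crit`) are made up for this illustration, and the t quantile is found by numerically integrating the density and bisecting, which is an assumption of convenience rather than how statistical packages do it. In practice a call such as `scipy.stats.t.ppf` would be the usual route.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """CDF for x >= 0, using Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_crit(df, conf=0.95):
    """Two-sided critical value: the point with upper tail area (1 - conf) / 2."""
    p = 0.5 + conf / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:      # widen the bracket until it contains the quantile
        hi *= 2
    for _ in range(60):           # bisection
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def z_crit(conf=0.95):
    """Two-sided critical value of the standard normal."""
    return NormalDist().inv_cdf(0.5 + conf / 2)


z = z_crit()
for df in (1, 4, 9, 29, 30, 100):
    t = t_crit(df)
    print(f"df={df:>3}  t={t:.3f}  z={z:.3f}  ratio t/z={t / z:.3f}")
```

The ratio column shows how much wider the t-interval is than a z-interval built from the same standard error. The gap is large at very small df and shrinks steadily; it never reaches exactly zero at any finite df, which is part of why a hard cutoff at 30 is only a convention.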
 
Actually the notion of using the sample size as the determining factor is being (as it should be) tossed out. It is a remnant of the days before computing power was so readily available.

If the assumption of normality can be made: when you know \sigma (the population standard deviation), use the Z-interval. When you don't know \sigma (so you have only the sample standard deviation), use the t-interval.
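Written out, the two intervals at confidence level 1-\alpha are usually stated as follows. The symbols n (sample size), \bar{x} (sample mean) and s (sample standard deviation) are notation introduced here and are not used elsewhere in the thread:

```latex
\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \quad (\sigma \text{ known}),
\qquad
\bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} \quad (\sigma \text{ unknown})
```

The only differences are the critical value and which standard deviation goes in the standard error; the sample size enters through \sqrt{n} and, for the t-interval, through the n-1 degrees of freedom.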

If your data is badly skewed, it is debatable whether the mean is the appropriate parameter to measure central tendency.
 
statdad said:
Actually the notion of using the sample size as the determining factor is being (as it should be) tossed out. It is a remnant of the days before computing power was so readily available.

If the assumption of normality can be made: when you know \sigma (the population standard deviation), use the Z-interval. When you don't know \sigma (so you have only the sample standard deviation), use the t-interval.

If your data is badly skewed, it is debatable whether the mean is the appropriate parameter to measure central tendency.

Well I am retired and involved in other things, but I have researched the t distribution recently and I've not run across this. However, my research was mostly on the math and not the application.

What you say makes sense. Would you use the Z value for very small samples, say n=5, if you did know sigma?

EDIT: In most of my experience sigma is not known.
 
If the sample size is only 5, I would be hesitant to construct any confidence interval. But if pushed, if sigma were known, and if told that the data were known to be normally distributed, the Z-interval would be appropriate.
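As a rough illustration of the n=5 case, the sketch below computes a 95% Z-interval with a known sigma and, for contrast, the t-interval you would fall back on if sigma were not known. The five data values and sigma=0.5 are invented for the example, the function names are hypothetical, and 2.776 is the tabulated 95% two-sided t critical value for df=4.

```python
import math
from statistics import NormalDist, mean, stdev


def z_interval(data, sigma, conf=0.95):
    """Two-sided CI for the mean when the population sigma is known."""
    z = NormalDist().inv_cdf(0.5 + conf / 2)
    half = z * sigma / math.sqrt(len(data))
    m = mean(data)
    return m - half, m + half


def t_interval_95_n5(data):
    """95% CI for the mean of exactly five observations, sigma unknown.

    Uses the sample standard deviation and the tabulated critical
    value t = 2.776 for df = 4.
    """
    if len(data) != 5:
        raise ValueError("this sketch hard-codes the critical value for n = 5")
    half = 2.776 * stdev(data) / math.sqrt(5)
    m = mean(data)
    return m - half, m + half


# Hypothetical sample and hypothetical known sigma, for illustration only.
sample = [9.8, 10.4, 10.1, 9.6, 10.6]
known_sigma = 0.5

z_lo, z_hi = z_interval(sample, known_sigma)
t_lo, t_hi = t_interval_95_n5(sample)
print(f"Z-interval (sigma known):   ({z_lo:.3f}, {z_hi:.3f})")
print(f"t-interval (sigma unknown): ({t_lo:.3f}, {t_hi:.3f})")
```

Both calculations still lean entirely on the normality assumption, which five observations cannot really check, so the hesitation about n=5 applies to either interval.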
 
