Confidence integrals when n is small

  • Context: Undergrad 
  • Thread starter: TaliskerBA
  • Tags: Integrals

Discussion Overview

The discussion revolves around the interpretation and application of confidence intervals, particularly in the context of small sample sizes. Participants explore the implications of using confidence intervals in statistical analysis, especially when dealing with a limited number of observations, as illustrated by a poker results example.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested

Main Points Raised

  • One participant expresses confusion regarding the application of a 95% confidence interval based on their poker results, questioning the validity of the interval when n is small.
  • Another participant suggests that the original poster should consider prediction intervals instead of confidence intervals for their specific example.
  • A question is raised about whether the variable in question is approximately normally distributed, which is crucial for the validity of the confidence interval.
  • There is a clarification that confidence intervals assume the sample comes from a normally distributed population, and that for small samples, the t distribution should be used instead of the normal distribution.
  • One participant notes that the standard deviation estimate for the t distribution is dependent on the number of degrees of freedom, and provides a formula for calculating confidence intervals when n is at least 8.

Areas of Agreement / Disagreement

Participants do not reach a consensus on the correct application of confidence intervals versus prediction intervals, and there is ongoing debate about the assumptions regarding normality and sample size. The discussion remains unresolved regarding the implications of using confidence intervals with small sample sizes.

Contextual Notes

Limitations include the assumption of normality for the underlying distribution and the potential misapplication of confidence intervals when sample sizes are small. The discussion highlights the need for careful consideration of these factors in statistical analysis.

TaliskerBA
The title should say "Intervals". I'm tired...

I am probably going wrong somewhere but I am running into problems with understanding this. My understanding of a 95% confidence interval is that in a sample of n the sample mean is 95% likely to be within 1.96 standard errors of the actual mean. I have a problem because I think I have an example where this isn't true.

I play a bit of online poker and have been playing around with my results to help me grasp some of these concepts and it is with my poker results I get the contradiction.

In 425 games I have won 58 for $51.50 profit, come 2nd in 53 for $24.50 profit, come 3rd in 41 for $11 profit and not cashed in 273 for -$16. With these figures I work out:

sample mean = $0.87
sample variance = 614
standard deviation = $24.77
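
These summary figures can be reproduced from the four outcome counts quoted above. The following is a minimal sketch, assuming the variance was computed with the usual n - 1 divisor (the post does not say which divisor was used); the variable names are illustrative only.

```python
# Poker results from the post: (profit per game in dollars, number of games).
results = [(51.50, 58), (24.50, 53), (11.00, 41), (-16.00, 273)]

n = sum(count for _, count in results)  # total number of games
mean = sum(x * count for x, count in results) / n

# Sample variance with the n - 1 divisor (an assumption about the post's method).
variance = sum(count * (x - mean) ** 2 for x, count in results) / (n - 1)
sd = variance ** 0.5

print(f"n = {n}, mean = {mean:.2f}, variance = {variance:.0f}, sd = {sd:.2f}")
```

With these inputs the script agrees with the rounded values in the post (mean 0.87, variance 614, standard deviation 24.77).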

So, here is my problem. Say I play another tournament (ie. n=1), based on this sample there is a 58/425 = 13.6% chance that I win $51.50 but a 95% confidence interval states that:

Pr(0.87 - 1.96*24.77 < X < 0.87 + 1.96*24.77) = 0.95

Pr(-47.7<X<49.4) = 0.95

So it suggests that I am 95% likely to have a result that yields me less than $49.40 even though I already know I am 13.6% likely to win $51.50...

I presume it's all to do with n being small but my notes don't give any acknowledgment that confidence intervals aren't completely sound when n is small. Where am I going wrong?
 
TaliskerBA said:
Say I play another tournament (ie. n=1)

No no no. You need to talk about prediction intervals (http://en.wikipedia.org/wiki/Prediction_interval) for your example.
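
To make the distinction concrete, here is a rough sketch comparing the two kinds of interval using the summary statistics from the original post. The prediction-interval formula used (mean ± z·s·√(1 + 1/n)) is the textbook normal-theory one and is an assumption here; it is only meaningful if single-game results are roughly normal, which is questioned later in the thread.

```python
import math

# Summary statistics quoted in the original post.
n, mean, sd = 425, 0.87, 24.77
z = 1.96  # two-sided 95% critical value of the standard normal

# Confidence interval for the long-run mean: uses the standard error sd / sqrt(n).
se = sd / math.sqrt(n)
ci_mean = (mean - z * se, mean + z * se)

# Normal-theory prediction interval for ONE new game: uses sd * sqrt(1 + 1/n).
# Only trustworthy if single-game results are roughly normally distributed.
half = z * sd * math.sqrt(1 + 1 / n)
pi_single = (mean - half, mean + half)

print(f"95% CI for the mean:        ({ci_mean[0]:.2f}, {ci_mean[1]:.2f})")
print(f"95% PI for a single result: ({pi_single[0]:.2f}, {pi_single[1]:.2f})")
```

The interval for the mean comes out only a few dollars wide, while the interval for a single game is nearly as wide as the one computed in the original post, which is why the two should not be confused.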
 
Is X really an (approximately) normally distributed random variable?
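
One way to check this against the data in the original post is to count how many of the 425 recorded games actually fall inside the quoted interval. This is only an illustrative sketch; the interval endpoints are the rounded values from the post.

```python
# Poker results from the original post: (profit in dollars, number of games).
results = [(51.50, 58), (24.50, 53), (11.00, 41), (-16.00, 273)]
n = sum(count for _, count in results)

# Interval quoted in the post: mean +/- 1.96 standard deviations.
low, high = -47.7, 49.4

# Number of observed games whose profit falls strictly inside the interval.
inside = sum(count for x, count in results if low < x < high)
coverage = inside / n

print(f"{inside} of {n} games inside ({low}, {high}): {coverage:.1%}")
```

The single-game result takes only four distinct values, so it is far from normal, and the empirical coverage of the interval falls well short of 95%.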
 
pwsnafu said:
No no no. You need to talk about prediction intervals (http://en.wikipedia.org/wiki/Prediction_interval) for your example.

That is useful to know, but the notation in this example still claims that Pr(-47.7<X<49.4) = 0.95, which clearly doesn't work when n=1. My notes state that over many repetitions of sampling, 95% of intervals will include X, but what if all samples were of size n=1? Or are the sample sizes themselves random as well (in which case this would most likely work)..?

* Edit. I think I understand now, thanks for your help.

Hurkyl said:
Is X really an (approximately) normally distributed random variable?

I think I get your point now. It's not a normally distributed RV because I have dictated what the sample size should be. Right?

Thanks for your help.
 
TaliskerBA said:
I think I get your point now. It's not a normally distributed RV because I have dictated what the sample size should be. Right?

Thanks for your help.

No. Sample size is almost always stated a priori. To calculate confidence intervals this way, you must assume the random sample came from a normally distributed population. Confidence intervals for small samples (typically n less than 30) may be computed from the t score instead of the Z score. The standard deviation estimate is based on the number of degrees of freedom.

The standard deviation of the t distribution with \(\nu\) degrees of freedom is \(\sqrt{\frac{\nu}{\nu - 2}}\) for \(\nu > 2\) (here \(\nu = n - 1\)). You can use a t table to get the relation between the degrees of freedom (or just n) and the confidence limits. As with the Z score, confidence intervals extending about 2 standard errors either side of the sample mean are considered acceptable for most applications.
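
As a small numerical illustration, writing ν for the degrees of freedom (ν = n - 1 for a one-sample mean), the standard deviation of the t distribution is √(ν/(ν - 2)) for ν > 2. The helper name below is hypothetical, and the chosen ν values are just examples.

```python
import math

def t_standard_deviation(df: float) -> float:
    """Standard deviation of Student's t with df degrees of freedom (df > 2)."""
    if df <= 2:
        raise ValueError("the t distribution has finite variance only for df > 2")
    return math.sqrt(df / (df - 2))

# Example degrees of freedom, from very small samples up to n = 425.
for df in (3, 7, 29, 424):
    print(f"df = {df:>3}: sd = {t_standard_deviation(df):.3f}")
```

The value shrinks toward 1 as the degrees of freedom grow, which is why the t and Z scores give nearly the same limits for large samples.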

EDIT: More specifically to your question, the CI is \(\bar X \pm Z_{\alpha/2} \, \sigma / \sqrt{n}\) if n is at least 8.
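
A sketch of this formula, treating σ as known and taking the critical value from the standard normal via Python's `statistics.NormalDist`. The function name `z_interval` is made up for illustration; the mean and standard deviation are the ones from the original post, and the smaller values of n are hypothetical, included only to show how the width depends on sample size.

```python
import math
from statistics import NormalDist

def z_interval(mean: float, sigma: float, n: int, confidence: float = 0.95):
    """Two-sided interval mean +/- z_{alpha/2} * sigma / sqrt(n), sigma treated as known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 quantile of N(0, 1)
    half_width = z * sigma / math.sqrt(n)
    return mean - half_width, mean + half_width

# Mean and sd from the post; n = 8 and n = 30 are hypothetical sample sizes.
for n in (8, 30, 425):
    low, high = z_interval(0.87, 24.77, n)
    print(f"n = {n:>3}: ({low:.2f}, {high:.2f})")
```

The half-width scales with 1/√n, so the interval for the mean narrows as more games are recorded, whereas the spread of a single game's result does not.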
 