Confidence integrals when n is small

  • Thread starter: TaliskerBA
  • Tags: Integrals
AI Thread Summary
The discussion centers on the confusion surrounding the application of confidence intervals, particularly when the sample size (n) is small. A participant grapples with the interpretation of a 95% confidence interval based on their poker results, questioning the validity of the interval when predicting outcomes from a single game. It is clarified that confidence intervals assume a normally distributed population and that for small samples, the t-distribution should be used instead of the normal distribution. The importance of sample size in determining the accuracy of confidence intervals is emphasized, particularly noting that confidence intervals may not be reliable when n is less than 30. The conversation concludes with a better understanding of the relationship between sample size, distribution, and confidence intervals.
TaliskerBA
Should Say Intervals.. I'm tired...

I am probably going wrong somewhere but I am running into problems with understanding this. My understanding of a 95% confidence interval is that in a sample of n the sample mean is 95% likely to be within 1.96 standard errors of the actual mean. I have a problem because I think I have an example where this isn't true.

I play a bit of online poker and have been playing around with my results to help me grasp some of these concepts and it is with my poker results I get the contradiction.

In 425 games I have won 58 for $51.50 profit, come 2nd in 53 for $24.50 profit, come 3rd in 41 for $11 profit and not cashed in 273 for -$16. With these figures I work out:

sample mean = $0.87
sample variance = 614
standard deviation = $24.77
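
For anyone who wants to check these figures, here is a minimal Python sketch that recomputes them from the four payout categories listed above. The variable names are only illustrative, and it assumes the variance quoted is the usual unbiased sample variance (dividing by n - 1).

```python
from math import sqrt

# (profit per game in dollars, number of games), as listed in the post
results = [(51.50, 58), (24.50, 53), (11.00, 41), (-16.00, 273)]

n = sum(count for _, count in results)
mean = sum(x * count for x, count in results) / n

# Unbiased sample variance: squared deviations weighted by count, over n - 1
variance = sum(count * (x - mean) ** 2 for x, count in results) / (n - 1)
sd = sqrt(variance)

print(f"n = {n}, mean = {mean:.2f}, variance = {variance:.0f}, sd = {sd:.2f}")
```

The rounded results agree with the mean, variance, and standard deviation quoted above.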

So, here is my problem. Say I play another tournament (ie. n=1), based on this sample there is a 58/425 = 13.6% chance that I win $51.50 but a 95% confidence interval states that:

Pr(0.87 - 1.96*24.77 < X < 0.87 + 1.96*24.77) = 0.95

Pr(-47.7<X<49.4) = 0.95

So it suggests that I am 95% likely to have a result that yields me less than $49.40 even though I already know I am 13.6% likely to win $51.50...

I presume it's all to do with n being small but my notes don't give any acknowledgment that confidence intervals aren't completely sound when n is small. Where am I going wrong?
 
TaliskerBA said:
Say I play another tournament (ie. n=1)

No no no. You need to talk about a prediction interval (http://en.wikipedia.org/wiki/Prediction_interval) for your example.
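
To make the distinction concrete, here is a rough Python sketch using the summary figures from the first post. It compares the 95% confidence interval for the mean, which uses the standard error sd/sqrt(n), with a normal-theory prediction interval for one new game, which uses sd*sqrt(1 + 1/n). The prediction interval formula assumes the individual results are normally distributed, and that is an assumption rather than something established in this thread.

```python
from math import sqrt
from statistics import NormalDist

# Summary figures quoted in the first post
n, mean, sd = 425, 0.87, 24.77

# Two-sided 95% critical value of the standard normal (about 1.96)
z = NormalDist().inv_cdf(0.975)

# Confidence interval for the MEAN: uses the standard error sd / sqrt(n)
se = sd / sqrt(n)
ci_mean = (mean - z * se, mean + z * se)

# Prediction interval for ONE new game, assuming normally distributed results
pi_width = z * sd * sqrt(1 + 1 / n)
pi_single = (mean - pi_width, mean + pi_width)

print(f"standard error       = {se:.2f}")
print(f"95% CI for the mean  = ({ci_mean[0]:.2f}, {ci_mean[1]:.2f})")
print(f"95% PI for one game  = ({pi_single[0]:.2f}, {pi_single[1]:.2f})")
```

The interval for the mean is only a few dollars wide (roughly -1.5 to 3.2), while the interval for a single game is close to the -47.7 to 49.4 range computed in the first post. That range is about one game's result, not about the sample mean.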
 
Is X really an (approximately) normally distributed random variable?
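
One way to explore this question is to count how many of the 425 recorded games actually fall inside the mean ± 1.96 sd band. The Python sketch below does that with the payout counts from the first post. The names are illustrative, and it treats the historical frequencies as a stand-in for the true distribution.

```python
from statistics import NormalDist

# (profit per game in dollars, number of games), from the first post
results = [(51.50, 58), (24.50, 53), (11.00, 41), (-16.00, 273)]
n = sum(count for _, count in results)

mean, sd = 0.87, 24.77              # summary figures from the first post
z = NormalDist().inv_cdf(0.975)     # about 1.96

low, high = mean - z * sd, mean + z * sd

# Number of recorded games whose profit lies strictly inside the band
inside = sum(count for x, count in results if low < x < high)
coverage = inside / n

print(f"band = ({low:.1f}, {high:.1f}), observed coverage = {coverage:.1%}")
```

Only the $51.50 wins fall outside the band, so the observed coverage is 367/425, about 86.4% rather than 95%. That is what you would expect when X takes just four discrete values and is far from normal.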
 
pwsnafu said:
No no no. You need to talk about a prediction interval (http://en.wikipedia.org/wiki/Prediction_interval) for your example.

That is useful to know, but the notation in this example still claims that Pr(-47.7<X<49.4) = 0.95, which clearly doesn't work when n=1. My notes state that over many repetitions of sampling, 95% of intervals will include X, but what if all samples were of size n=1? Or are the sample sizes themselves random as well (in which case this would most likely work)..?

* Edit. I think I understand now, thanks for your help.

Hurkyl said:
Is X really an (approximately) normally distributed random variable?

I think I get your point now. It's not a normally distributed RV because I have dictated what the sample size should be. Right?

Thanks for your help.
 
TaliskerBA said:
I think I get your point now. It's not a normally distributed RV because I have dictated what the sample size should be. Right?

Thanks for your help.

No. Sample size is almost always stated a priori. To calculate confidence intervals, you must assume the random sample came from a normally distributed population. Confidence intervals for small samples (typically less than 30) may be estimated from the t score. The standard deviation estimate is based on the number of degrees of freedom.

The standard deviation of the t distribution with ##\nu## degrees of freedom is ##\sqrt{\frac{\nu}{\nu - 2}}## (defined for ##\nu > 2##), where ##\nu = n - 1## for a single-sample mean. You can use a t table to get the relation between the degrees of freedom (or just n) and the confidence limits. As with the Z score, confidence intervals extending about 2 sd from the mean are considered acceptable for most applications.

EDIT: More specifically to your question, the CI is ##\bar X \pm Z_{\alpha/2} \, \sigma/\sqrt{n}## if n is at least 8.
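
As a rough illustration of how much the choice between Z and t matters for small n, the Python sketch below compares the half-width of a 95% interval for the mean under each. The t critical values are copied from a standard t table for a few degrees of freedom. The helper `half_width` and the choice of sample sizes are illustrative assumptions, and the standard deviation is the one from the first post.

```python
from math import sqrt
from statistics import NormalDist

# Two-sided 95% critical values of Student's t, from a standard t table,
# keyed by degrees of freedom (df = n - 1)
T_975 = {4: 2.776, 9: 2.262, 29: 2.045}

z = NormalDist().inv_cdf(0.975)  # about 1.96


def half_width(sd: float, n: int, crit: float) -> float:
    """Half-width of an interval for the mean: crit * sd / sqrt(n)."""
    return crit * sd / sqrt(n)


sd = 24.77  # sample standard deviation quoted in the first post

for df, t_crit in T_975.items():
    n = df + 1
    print(
        f"n = {n:2d}: Z half-width = {half_width(sd, n, z):6.2f}, "
        f"t half-width = {half_width(sd, n, t_crit):6.2f}"
    )
```

The t interval is always the wider of the two, and the gap shrinks as n grows. This is why the normal approximation is usually considered adequate for larger samples.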
 