- #1

- 31

- 1

Thanks!

- Thread starter M_1
- Start date

- #2

BvU

Science Advisor

Homework Helper

- 14,396

- 3,708

Basically you are testing the hypothesis that p = 0.5 and find a 0.05 deviation.

(Variance ##np(1-p) = 25##, so sigma = 5; this is an estimate.)

So such a deviation is one sigma from the expected mean and as such not sensational.

You can claim that with 95% confidence you found a probability of ##(55 \pm 10)##%.

(Some rounding off is necessary; the relative error in ##\sigma## itself is of the order of ##1/\sqrt {100}##, so 10%.)
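
The arithmetic above can be checked with a few lines of Python. This is only a sketch: it assumes the experiment was 100 flips with 55 heads (the figures implied elsewhere in the thread), and the variable names are made up for illustration.

```python
import math

n = 100       # number of flips (assumed from the thread)
heads = 55    # observed heads (assumed from the thread)
p0 = 0.5      # hypothesis being tested: a fair coin

# Standard deviation of the head count if the coin is fair: sqrt(n p (1-p)).
sigma_count = math.sqrt(n * p0 * (1 - p0))

# Distance of the observation from the expected count, in units of sigma.
z_obs = (heads - n * p0) / sigma_count

# Normal-approximation 95% interval around the observed proportion.
p_hat = heads / n
half_width = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)

print(f"sigma = {sigma_count:.1f} heads, deviation = {z_obs:.1f} sigma")
print(f"95% interval: {p_hat:.2f} +/- {half_width:.3f}")
```

The half-width comes out just under 0.10, which is where the rounded "55 ± 10" figure comes from.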

- #3

Buzz Bloom

Gold Member

- 2,364

- 432

The equations you want can be found at the following link:

There are two aspects of an error range.

(1) Show the +/- value. This is generally the calculated standard deviation or a multiple of the standard deviation. The value of the multiple most commonly used is 1 or 2.

(2) There are two ways to describe the confidence interval regarding the +/- value.

(a) Say the number of standard deviations used.

(b) This is the preferred method. Say the percentage that is associated with the number of standard deviations. For the Gaussian distribution, the common multiples 1 and 2 correspond to confidence values of 68% and 95%. The following describes a method to calculate a confidence interval.

You may also want to read the following to get a good understanding of what a confidence interval means, since there are common misunderstandings about this.

The confidence interval generally assumes a Gaussian distribution. For the coin flip example, you can use the actual binomial distribution. If the sample is large enough, the Gaussian distribution would be a good approximation.
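
As a rough check of the 68% / 95% figures and of how well the Gaussian approximation works for coin flips, the sketch below compares the Gaussian coverage for 1 and 2 standard deviations with the exact binomial probability. It assumes 100 flips of a fair coin; the function names are illustrative only.

```python
import math

def gaussian_coverage(k):
    """Probability that a Gaussian variable lies within k sigma of its mean."""
    return math.erf(k / math.sqrt(2))

def binomial_coverage(n, p, k):
    """Exact probability that a Binomial(n, p) count lies within k sigma of n*p."""
    mean = n * p
    sigma = math.sqrt(n * p * (1 - p))
    lo, hi = mean - k * sigma, mean + k * sigma
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(n + 1) if lo <= i <= hi)

for k in (1, 2):
    print(k, round(gaussian_coverage(k), 4),
          round(binomial_coverage(100, 0.5, k), 4))
```

For this example the exact binomial values come out somewhat above the Gaussian ones, because the discrete end points (45 and 55, or 40 and 60) are counted in full.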

Hope this is helpful.

Regards,

Buzz

- #4

- 31

- 1

Using the formula for the Wilson score interval on the same web page, again with z = 1.96, we obtain the following interval for p:

94.6% to 99.8% (1), or

97.2% +/- 2.6% (2), or

99% +0.8% / -4.4% (3)

with 95% confidence.
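
For reference, the numbers above can be reproduced with the standard Wilson score formula. This is a minimal sketch assuming 99 heads in 100 flips; `wilson_interval` is a hypothetical helper name, not something from the linked page.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    # The interval is centred on a value pulled from p_hat towards 1/2.
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1 - p_hat) / n + z**2 / (4 * n**2))
    return center - half_width, center + half_width

low, high = wilson_interval(99, 100)
print(f"interval: {low:.4f} to {high:.4f}")
print(f"relative to 0.99: +{high - 0.99:.4f} / -{0.99 - low:.4f}")
```

Form (2) is the centre and half-width computed inside the function; form (3) is the same interval expressed as offsets from the observed 99%.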

This looks intuitively very good since the interval doesn’t reach above 100%. I still have two questions:

Q1: Is (1), (2), or (3) the most correct way of presenting the interval? I think it should be (3), since the most educated guess would seem to be p = 99%. Furthermore, I don't like (1) because I want to plot this in a bar chart, and then it's nice to have a centre point (99%) with error bars (+0.8% and -4.4%). But what do you think?

Q2: Is the approach correct? I can understand using z = 1.96 for the normal approximation, but for the binomial distribution I don't understand what z is. I definitely have the impression that 1.96 comes from the number of standard deviations for 95% confidence of the normal distribution, so basically I put z = 1.96 in the Wilson score interval without having a clue of what I'm actually doing. But the result looks good! So do you think the approach is correct?

Thanks again, a fantastic forum!

- #5

- 31

- 1

Thank you, Physics Forums!

- #6

Buzz Bloom

Gold Member

- 2,364

- 432

At my age, my math skills are not as good as when I was younger, so please take that into account when considering my comment.

I think there is a problem with using the value z = 1.96.

That value is based on a Gaussian assumption. When you try to evaluate the binomial distribution at an extreme mean, 99 h 1 t, rather than a more central 55 h 45 t, I believe the Gaussian assumption will give a large error in the result.

I am unsure what concept of "confidence interval" you want to use. My guess is you want this one:

The explanation of a confidence interval can amount to something like: "*The confidence interval represents values for the population parameter for which the difference between the parameter and the observed estimate is not statistically significant at the 10% level*". In fact, this relates to one particular way in which a confidence interval may be constructed.

(From https://en.wikipedia.org/wiki/Confidence_interval#Meaning_and_interpretation .)

I also guess that the Wilson score method is not the best choice if accuracy is the most important criterion. I suggest you may want to use the Clopper-Pearson interval instead, because

The Clopper-Pearson interval is an early and very common method for calculating binomial confidence intervals. This is often called an 'exact' method, but that is because it is based on the cumulative probabilities of the binomial distribution...

(From https://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval#Clopper-Pearson interval .)
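
A Clopper-Pearson interval can be sketched with nothing more than the binomial cumulative probabilities and a bisection search. The code below is an illustration under assumptions (99 heads in 100 flips, 95% confidence, helper names invented here), not a reference implementation.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k + 1))

def bisect_increasing(f, target, iters=60):
    """Find p in [0, 1] with f(p) = target, for f increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(k, n, alpha=0.05):
    """Two-sided 'exact' interval for k successes in n trials."""
    # Lower bound: the p for which P(X >= k) equals alpha/2.
    lower = 0.0 if k == 0 else bisect_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper bound: the p for which P(X <= k) equals alpha/2,
    # i.e. P(X > k) = 1 - alpha/2, which is increasing in p.
    upper = 1.0 if k == n else bisect_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper

lower, upper = clopper_pearson(99, 100)
print(f"Clopper-Pearson 95% interval: {lower:.4f} to {upper:.4f}")
```

No z value appears anywhere here: the 95% enters only through the two tail probabilities of 2.5%, which is why the method does not rely on the Gaussian assumption.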

I hope this is helpful.

Regards,

Buzz

- #7

Buzz Bloom

Gold Member

- 2,364

- 432

Hi M_1:

I have been thinking about your question some more, and I came up with an interpretation that feels right to me. It is based on the following:

"*Were this procedure to be repeated on multiple samples, the calculated confidence interval (which would differ for each sample) would encompass the true population parameter 90% of the time*"

With a Gaussian distribution, which is symmetrical about the mean, it is reasonable to find a +/- x, where x is a multiple of the standard deviation, such that a repeat of the experiment will give a new calculated mean m inside the range m_{0} +/- x (where m_{0} is the old mean) a specified percentage of the time. That is, for example, if x = 2 × standard deviation:

(1) Probability of m ∈ [m_{0} - x, m_{0} + x] > 95%.

So, we want to make a similar statement for the experiment that resulted in 99 h and 1 t. However, the result for this experiment is at the tail of the binomial distribution, which is far from Gaussian, and even from being symmetrical. So instead of (1) we want something like

(2) Probability of m ∈ [Min, 100] > 95%.

To find the value of Min, calculate the binomial probability distribution for the range of integers 100, 99, 98, . . ., Min so that

(3) P_{100} + P_{99} + ... + P_{Min} ≥ 95%.

To calculate these values:

(4) P_{100} = 0.99^{100}

(5) P_{k} = P_{k+1} (0.01/0.99) (k+1) / (100-k)

My guess is that Min will be about 94.
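
Steps (3) to (5) are easy to run numerically. The sketch below assumes the true probability of heads is taken to be 0.99 and uses the binomial ratio P_{k}/P_{k+1} = (0.01/0.99)(k+1)/(100-k) to step down from 100 heads; `find_min` is a name made up for this illustration.

```python
def find_min(n=100, p=0.99, coverage=0.95):
    """Smallest head count Min such that P_n + ... + P_Min >= coverage."""
    prob = p ** n      # P_n: probability of all heads
    total = prob
    k = n
    while total < coverage and k > 0:
        k -= 1
        # Binomial ratio: P_k / P_{k+1} = ((1 - p) / p) * (k + 1) / (n - k)
        prob *= ((1 - p) / p) * (k + 1) / (n - k)
        total += prob
    return k, total

min_heads, total = find_min()
print(min_heads, round(total, 4))
```

Run this way, the sum already passes 95% at Min = 97 (about 98% in total), so the range is a little tighter than the guess of 94.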

One more observation. This calculation is not exactly what the quoted interpretation above says. However, I think it will be a good approximation of what it says.

Regards,

Buzz
