# Margin of error, if all responses identical

1. Sep 2, 2014

### cesiumfrog

Hi,

If I poll 10 people (with a yes/no question), and all of them respond with 'yes', should I report that the rate of 'no' answers (in the greater population) is "zero plus or minus zero", or simply be confident that it is "less than one in five"?

I ask because using the "margin of error" (or "standard error of the proportion") formula sqrt[p(1-p)/n] it would appear, counter-intuitively, that the confidence interval narrows to zero (regardless of how few the samples) when the sample proportion is 0 or 1.
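To make the collapse concrete, here is a minimal sketch of the formula quoted above, with the sample proportion plugged in for p. The function name `wald_standard_error` is made up for illustration.

```python
import math

def wald_standard_error(successes: int, n: int) -> float:
    """Standard error sqrt(p(1-p)/n), using the sample proportion as p."""
    p_hat = successes / n
    return math.sqrt(p_hat * (1 - p_hat) / n)

# 10 of 10 'yes': the estimated standard error is exactly zero.
se_all_yes = wald_standard_error(10, 10)
# 5 of 10 'yes': the same formula gives its widest value for this n.
se_half = wald_standard_error(5, 10)
print(se_all_yes, se_half)
```

The zero comes entirely from substituting the sample proportion for the unknown p, not from anything learned about the population.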

2. Sep 2, 2014

### FactChecker

You certainly have a positive confidence interval for the probability of "no" = q = 1-p.
You cannot use that formula unless you know p. You are using the sample to estimate p, and your sample appears to be too small for such an extreme value of p. There are several formulas for the confidence interval (see http://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval). In the case you are asking about, consider the rule of three (see http://en.wikipedia.org/wiki/Rule_of_three_(statistics)). It gives the 95% confidence interval (0, 3/n) for the probability of the outcome that was never observed.
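One way to see where 3/n comes from (a sketch, assuming independent responses): if the true probability of "no" is q, the chance of seeing no "no" at all in n responses is $(1-q)^n$. Setting that equal to 0.05 and solving for q gives

$$(1-q)^n = 0.05 \;\Longrightarrow\; q = 1 - 0.05^{1/n} \approx \frac{-\ln 0.05}{n} \approx \frac{3}{n}$$

For n = 10 the exact bound is $1 - 0.05^{1/10} \approx 0.259$, against 0.3 from the rule of three, so the rule is slightly conservative at this sample size.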

3. Sep 2, 2014

### cesiumfrog

Ah, I see, so that formula involved an approximation that the ratio wasn't extreme.

Is there an exact Bayesian approach, given a prior distributed uniformly over [0,1] for the population's ratio, to correctly determine the credible interval for the estimate of that ratio from the sample?

4. Sep 8, 2014

### Murtuza Tipu

If I flip a coin with probability of heads = 0, then in n flips I will get 0 heads with a standard deviation of 0. That's OK. But the coin may instead have a non-zero probability of heads, and by luck I simply did not get any heads in the sample.

The formula you give is the large-n normal approximation (CLT) to the binomial. For small n we can use an exact binomial test instead. Let q = proportion of "yes" voters in the population. We want to see which values of q are plausible given that we saw 10 of 10 "yes" responses: q near 1 is very likely, while q near 0 is very unlikely. Formally, as a hypothesis test:

Suppose we wanted to test:

$H_0: q = 0.741$ versus $H_a: q > 0.741$

This is an upper-tailed test. If we want to find the p-value corresponding to the observed result of all 10 "yes", then we obtain $0.741^{10} \approx 0.05$ (which is why I chose 0.741). If we had used 0.795 or 0.631 we would obtain $0.795^{10} \approx 0.10$ and $0.631^{10} \approx 0.01$.

If we use the usual type I error α=0.05 then we are right on the border with the stated null hypothesis and will reject the null and conclude the alternative q>0.741 is a more plausible statement. So I would report the interval for the proportion of "yes" as (0.741,1) or the range for the proportion of "no" as (0,0.259). If you want to be even more conservative, we could report (0,0.369) for "no" using a 1% type I error.
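The boundary values in this post can be reproduced by solving $q_0^n = \alpha$ for $q_0$. A minimal sketch follows; the function name `lower_bound_yes` is made up for illustration.

```python
def lower_bound_yes(n_all_yes: int, alpha: float) -> float:
    """Boundary null value q0 when all n responses are 'yes'.

    Under H0: q = q0, the p-value of n 'yes' out of n is q0**n,
    so the boundary of the rejection region solves q0**n = alpha.
    """
    return alpha ** (1.0 / n_all_yes)

for alpha in (0.10, 0.05, 0.01):
    q0 = lower_bound_yes(10, alpha)
    print(f"alpha={alpha:.2f}: 'yes' in ({q0:.3f}, 1), 'no' in (0, {1 - q0:.3f})")
```

Any null value below the boundary gives a p-value smaller than alpha and is rejected, which is why the reported interval for "yes" starts at the boundary.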

5. Sep 10, 2014

### cesiumfrog

If I count n heads out of N tosses, based on my prior assumption that the bias (the true probability of heads) could equally be any value between zero and unity, i.e. $P(a \le r \le b) = \int_a^b dr$, and noting that the binomial distribution is the well-known likelihood of the observed result for a particular bias:

$P(n/N \mid r) = \binom{N}{n} r^n (1-r)^{N-n}$
∴ $P(n/N \mid a \le r \le b)\,P(a \le r \le b) = \binom{N}{n} \int_a^b r^n (1-r)^{N-n}\,dr = \binom{N}{n} (B_b - B_a)[n+1, N-n+1]$, where $B$ is the incomplete beta function.

So I can apply Bayes' theorem to find the plausibility of different biases given my observations:

$P(a \le r \le b \mid n/N) = \frac{P(n/N \mid a \le r \le b)\,P(a \le r \le b)}{P(n/N)} = \frac{B_b - B_a}{B_1 - B_0}[n+1, N-n+1] = (I_b - I_a)[n+1, N-n+1]$,
∴ $P(r \le b \mid n/N) = I_b[n+1, N-n+1]$, where $I$ is the regularised incomplete beta function.

Rather than just naively estimating that r≈n/N, I can solve this to impose on r an upper bound (b) with arbitrary credibility, say, set P(r≤b|n/N)=0.90.

∴ $b$ = InverseBetaRegularised[0.9, n+1, N-n+1].
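For the case in the original question (zero "no" answers, so n = 0 with r read as the rate of "no"), the bound has a closed form, because $I_b[1, N+1] = 1 - (1-b)^{N+1}$. A worked instance, writing c for the chosen credibility:

$$1 - (1-b)^{N+1} = c \;\Longrightarrow\; b = 1 - (1-c)^{1/(N+1)}$$

With N = 10 and c = 0.9 this gives $b = 1 - 0.1^{1/11} \approx 0.19$, which agrees with the "less than one in five" guess.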

My application of this is that I'm doing a permutation-test (performed N times) to determine whether a result (e.g. a classifier's specificity) is statistically significant (so we count the number n of times a just-as-good result occurs purely by chance). For this permutation-test, r represents the p-value. However, because the test is computationally expensive, N cannot be huge, and so n is often very small (possibly zero). Some code would just report the p-value as n/N, but this seems unacceptable (i.e. too-easily estimating low p-values, such as exactly zero).

For example, would it be reasonable to report p<0.04 if N=100 and n=1, and p<0.02 if N=500 and n=5, but not to report p=0.01 in either case?
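These two cases can be checked numerically. The sketch below uses only the Python standard library and relies on the identity that, for integer arguments, $I_b[n+1, N-n+1]$ equals the probability that a Binomial(N+1, b) variable is at least n+1. The function names and the bisection solver are illustrative choices, not something from the thread.

```python
from math import comb

def posterior_cdf(b: float, n: int, N: int) -> float:
    """P(r <= b | n of N) under a uniform prior, i.e. I_b[n+1, N-n+1].

    For integer arguments this equals P(X >= n+1), X ~ Binomial(N+1, b).
    """
    M = N + 1
    lower_tail = sum(comb(M, k) * b**k * (1 - b) ** (M - k) for k in range(n + 1))
    return 1.0 - lower_tail

def upper_bound(n: int, N: int, credibility: float = 0.9) -> float:
    """Solve posterior_cdf(b) = credibility for b by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if posterior_cdf(mid, n, N) < credibility:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

bound_100 = upper_bound(1, 100)
bound_500 = upper_bound(5, 500)
print(f"N=100, n=1: r <= {bound_100:.4f} with 90% credibility")
print(f"N=500, n=5: r <= {bound_500:.4f} with 90% credibility")
```

Both bounds come out above the point estimate n/N = 0.01, and the bound tightens as N grows even though n/N stays the same.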

6. Sep 10, 2014

### Stephen Tashi

Can you clarify what that means? When I think of a "permutation test", I visualize paired observations, e.g. "treatment" and "result". I can think of "treatment" as "actual classification" and "result" is "classifier's classification". But is the "treatment" a variable that has only two values?

7. Sep 10, 2014

### FactChecker

n/N is not naive. It is the maximum likelihood estimator. It is wrong to give more weight to the "no-knowledge" assumption of a uniform prior distribution than to the data-supported n/N estimate. "No-knowledge" Bayes techniques are not a good substitute for data. A Bayesian prior distribution should be based on something applicable to the experiment being done (prior knowledge, a conservative assumption, etc.). It is better to use the data directly with a maximum likelihood estimator than to influence the results with a "no-knowledge" Bayes prior. You might also consider the technique of "bootstrapping" if you are not happy using the MLE directly. I don't know if the result will be different.
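On the bootstrap suggestion, one thing can be checked directly for the all-identical sample from the original question: every resample of ten "yes" answers is again ten "yes" answers, so a plain percentile bootstrap returns a zero-width interval. A minimal sketch, with the function name and defaults chosen here purely for illustration:

```python
import random

def bootstrap_interval(sample, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the proportion of 1s in `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement and record the proportion each time.
    props = sorted(sum(rng.choices(sample, k=n)) / n for _ in range(n_resamples))
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return props[lo_idx], props[hi_idx]

# Code 'no' as 1 and 'yes' as 0: ten 'yes' answers, no 'no' answers.
all_yes = [0] * 10
low, high = bootstrap_interval(all_yes)
print(low, high)
```

So for n = 0 the bootstrap in this simple form behaves like the sqrt[p(1-p)/n] formula and cannot express any uncertainty about the rate of "no".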

Last edited: Sep 10, 2014

8. Sep 12, 2014

### Stephen Tashi

Applying statistics to real-world problems is a subjective matter. The original post asks about a "confidence interval". A Bayesian prior is needed to compute a "credible interval" (which is a different type of interval). Reporting the maximum likelihood estimate without stating any type of associated interval is a third alternative. These choices are choices about how to formulate a real-world problem as a mathematical problem.