# Central limit theorem and estimates of probability

1. Apr 9, 2015

### lotsofmoxie

1. The problem statement, all variables and given/known data
Assume five hundred people are each asked one question that can be answered with a yes or a no. Let p = the fraction of the population that answers yes. Give an estimate for the probability that the percentage of yes answers in the five-hundred-person sample is bigger than the percentage in the whole population (p) by more than 5%.
2. Relevant equations

central limit theorem

p is NOT provided.

3. The attempt at a solution

N=.55*500
n=500

Using the usual normal approximation to the binomial, we get

$P(X>N) = 1-\Phi\left(\frac{N-np}{\sqrt{np(1-p)}}\right)$, where $\Phi$ is the CDF of the standard normal distribution.

How can we go further than this if we don't have an explicit value for p?
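To see how much the result depends on p, here is a minimal Python sketch of the normal approximation above. It is not part of the original post: the function name `normal_tail_count` is made up for illustration, the threshold is assumed to be N = n(p + 0.05) (reading "more than 5%" as 5 percentage points above p), and no continuity correction is applied.

```python
from math import sqrt
from statistics import NormalDist


def normal_tail_count(N, n, p):
    """Approximate P(X > N) for X ~ Binomial(n, p) with a normal distribution."""
    mean = n * p                      # binomial mean
    sd = sqrt(n * p * (1 - p))        # binomial standard deviation
    return 1 - NormalDist().cdf((N - mean) / sd)


n = 500
for p in (0.3, 0.5, 0.7):
    # Assumed threshold: 5 percentage points above the true proportion p.
    N = n * (p + 0.05)
    print(p, N, normal_tail_count(N, n, p))
```

With p = 0.5 the threshold is N = 275, which matches the N = .55*500 used in the attempt.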

Last edited: Apr 9, 2015
2. Apr 10, 2015

### Simon Bridge

You have to use your understanding of how the distribution of the sample compares with the distribution in an entire population.
Be careful, you have used the variable "p" to mean at least two different things.

3. Apr 10, 2015

### lotsofmoxie

What can we say about the population distribution besides that it is binomial, and about the sample distribution besides that its average is approximately normal for large enough n?

Last edited: Apr 10, 2015
5. Apr 10, 2015

### lotsofmoxie

I'm really not seeing how it is possible to do this problem without knowing the probability that a particular person answers yes or no. You need this to calculate an exact Z value, and how large or small it is makes a huge difference to the answer - from what I can see, the central limit theorem explicitly requires it in order to use the normal distribution at all. My gut tells me that we are just supposed to assume that this is equal to .5 (equal chance of yes or no), but maybe I'm just stupid and missing something big conceptually.

6. Apr 10, 2015

### pcm

As Simon said, you are using p for two different things. You can use the fact that p(1-p) (where 0<p<1) has a maximum value (find it). You can use that maximum to get a bound for your answer.

7. Apr 10, 2015

### lotsofmoxie

Hmmm. So the max value of p(1-p) (if I'm using p = probability that a given person answers yes to the question) would be 1/2. So I can plug that into my z value?

8. Apr 10, 2015

### pcm

It's not 1/2. Check your calculation.

9. Apr 10, 2015

### lotsofmoxie

Err sorry, typo. It's 1/4.

So we get P(X>275) >= 1-cdfnorm[(275-n*p)/sqrt(n/4)]

What about the p in the numerator? Are we trying to get a lower or upper bound?
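One way to check the direction of the bound, as a sketch only: it assumes the threshold is read as $N = n(p+0.05)$ (5 percentage points above the true proportion, which the thread does not state in this form) and writes $\Phi$ for the standard normal CDF.

$$
z = \frac{N-np}{\sqrt{np(1-p)}} = \frac{0.05\sqrt{n}}{\sqrt{p(1-p)}} \ge \frac{0.05\sqrt{n}}{\sqrt{1/4}} = 0.1\sqrt{500} \approx 2.24
$$

$$
P(X>N) \approx 1-\Phi(z) \le 1-\Phi(2.24) \approx 0.013
$$

Under these assumptions p drops out of the numerator, and because a larger z means a smaller tail probability, using the maximum value 1/4 gives an upper bound rather than a lower one.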

10. Apr 10, 2015

### lotsofmoxie

Is the answer zero? I can post my reasoning if it's forum policy or something.

Last edited: Apr 10, 2015
11. Apr 10, 2015

### WWGD

Try a hypothesis test for population proportions, which relates the actual population proportion to the sample proportion. You first find the standard deviation $\sigma$ of the sampling distribution of the sample proportion, given the population proportion P:

$\sigma = \sqrt {\frac {P(1-P)}{n}}$ and then use the normally-distributed statistic z:

$z =\frac {p-P}{\sigma}$ , where p is the sample proportion. This measures how far a sample proportion p lies from the actual proportion P, in units of the standard deviation.

From this calculate the percentile value associated with the $z$-value you got. You want to find the probability that p-P>0.05.
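The steps above can be sketched in Python for an unknown P by scanning a grid of candidate values and keeping the largest probability. This is an illustration under assumptions, not a definitive method: the names `excess_prob`, `grid` and `worst_P` are hypothetical, the margin 0.05 is read as 5 percentage points, and the normal approximation is used without a continuity correction.

```python
from math import sqrt
from statistics import NormalDist


def excess_prob(P, n=500, margin=0.05):
    """Approximate P(p - P > margin) for sample proportion p, sample size n."""
    sigma = sqrt(P * (1 - P) / n)     # standard deviation of the sample proportion
    z = margin / sigma                # z-value for p - P = margin
    return 1 - NormalDist().cdf(z)    # upper-tail probability


# Candidate population proportions 0.01, 0.02, ..., 0.99 (an arbitrary grid).
grid = [k / 100 for k in range(1, 100)]
worst_P = max(grid, key=excess_prob)
print(worst_P, excess_prob(worst_P))
```

The value of P that maximizes the probability on the grid is the one that maximizes P(1-P), so the result at that point serves as an upper bound for every other P.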