Stats: Approximating a binomial with a normal distribution

Amcote

Homework Statement


A multiple choice test consists of a series of questions, each with four possible answers.

How many questions are needed in order to be 99% confident that a student who guesses blindly at each question scores no more than 35% on the test?

Homework Equations



I know that this is a binomial setting with p = 0.25, and n is what we are trying to solve for.
For a binomial, μ = n*p and σ = sqrt(n*p*(1-p)).
P(B(n,0.25) ≤ 0.35*n) = 0.99
Because the binomial is discrete, we must use a continuity correction, in this case '+0.5'.
Z = (x - μ)/σ

The Attempt at a Solution



First, I should say that I know the answer is supposed to be n≥92.

So how I start this problem is I use the standardizing formula

Z= (x - μ)/σ

which in this case would be

Z = (0.35*n + 0.5 - n*0.25)/sqrt(n*0.25*0.75)

This simplifies to

Z=(0.1*n + 0.5)/(sqrt(n)*sqrt(0.1875))

I think what I have done so far is correct, but where I get confused is finding a value for Z.

I thought what I have to do is something like:

P(B(n,0.25)≤0.35*n)=Φ((0.1*n + 0.5)/(sqrt(n)*sqrt(0.1875))) = 0.99

so, Φ((0.1*n + 0.5)/(sqrt(n)*sqrt(0.1875))) = 0.99

Looking up an upper-tail area of 0.01 in the table gives z = 2.33, so Φ(2.33)=Φ((0.1*n + 0.5)/(sqrt(n)*sqrt(0.1875)))

and so I should be able to set those equal and solve for n:

2.33 = (0.1*n + 0.5)/(sqrt(n)*sqrt(0.1875))

When I solve for n I get a quadratic equation, but neither of its roots is the correct answer.

Any help would be appreciated.

Thanks!
 
Amcote said:

P(B(n,0.25)≥0.35*n)=0.99

When you write P(B(n,0.25)≥0.35*n)=0.99 you are asking to be 99% sure that the student scores at least 35% (i.e., 35% or better). That was not what you were asked!
 
Ray Vickson said:
When you write P(B(n,0.25)≥0.35*n)=0.99 you are asking to be 99% sure that the student scores at least 35% (i.e., 35% or better). That was not what you were asked!

Sorry, I made a typo in just that sentence; I meant P(B(n,0.25)≤0.35*n)=0.99.
 
Amcote said:

And because of the binomial setting, we must use a correction factor, in this case '+0.5'

In the context of this particular problem, just blindly using the "1/2 correction" is a mistake. What would be correct would be to ask for the solution of
$$\Phi \left( \frac{.5 + \lfloor 0.35 n \rfloor -.25 n}{.433 \sqrt{n}} \right) = 0.99,$$
where "##\lfloor 0.35 n \rfloor##" is the largest integer ##\leq 0.35 n##.

Personally, I would not try to solve that exactly as written; instead (if I were doing the question) I would drop rounding-down and the 1/2-correction altogether and just solve the resulting very simple problem. Then, if I really wanted to be sure of the solution, I would check manually one or two values of ##n## surrounding the solution, either using the exact binomial or the normal approximation with the more involved form of 1/2-correction indicated above.
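The procedure described above can be sketched in a few lines. This is only an illustration, assuming Python's standard-library `statistics.NormalDist` for ##\Phi## and its inverse; the function name `p_normal` and the scan range 90 to 110 are choices made here, not part of the thread. The cutoff ##\lfloor 0.35 n \rfloor## is computed with integer arithmetic to avoid floating-point rounding at exact multiples.

```python
from math import sqrt
from statistics import NormalDist

def p_normal(n):
    """Normal approximation to P(X_n <= 0.35 n), with the 1/2-correction
    applied after rounding 0.35 n down to an integer."""
    cutoff = (35 * n) // 100                      # floor(0.35 n), exact
    z = (0.5 + cutoff - 0.25 * n) / sqrt(0.1875 * n)
    return NormalDist().cdf(z)

# Step 1: drop the rounding and the correction, solve 0.1 n / sqrt(0.1875 n) = z.
z99 = NormalDist().inv_cdf(0.99)
n_simple = (z99 * sqrt(0.1875) / 0.1) ** 2

# Step 2: check individual values of n around that solution.
passing = [n for n in range(90, 111) if p_normal(n) >= 0.99]

print(f"simple solution: n = {n_simple:.3f}")
print("n in 90..110 meeting the 99% target:", passing)
```

The list printed in step 2 is not a contiguous range, so reading off a single threshold from the simple solution needs the manual check that the post recommends.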
 
Ray Vickson said:
Personally, I would not try to solve that exactly as written; instead (if I were doing the question) I would drop rounding-down and the 1/2-correction altogether and just solve the resulting very simple problem. Then, if I really wanted to be sure of the solution, I would check manually one or two values of ##n## surrounding the solution, either using the exact binomial or the normal approximation with the more involved form of 1/2-correction indicated above.

It was made clear by my instructor that we should "apply a normal approximation with a continuity correction" for this problem. But even if I ignore all that and do what you suggest (as it should work that way) I get:
$$\Phi \left( \frac{ 0.1 n }{.433 \sqrt{n}} \right) = \Phi \left( 2.33 \right) = 0.99,$$

so

$$\frac{ 0.1 n }{.433 \sqrt{n}} = 2.33.$$

With all this I get ##n = 101.786##,

and if I actually include the correction I get ##n_1 = 101.539## and ##n_2 = 0.2453## from the quadratic.

None of these is the correct answer, so I'm wondering what I am doing wrong.

Thanks
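As a cross-check of the quadratic mentioned in this post, here is a minimal sketch that solves the continuity-corrected equation from the first post, (0.1*n + 0.5)/sqrt(0.1875*n) = 2.33, by squaring both sides. The helper `roots` and the variable names are introduced here for illustration only. The second calculation is an assumption about where the quoted roots might come from: it leaves out the cross term 2(0.1n)(0.5) = 0.1n when squaring.

```python
from math import ceil, sqrt

Z = 2.33            # table value used in the thread for Phi(z) = 0.99
VAR1 = 0.25 * 0.75  # p(1 - p) = 0.1875

def roots(a, b, c):
    """Return both real roots of a*x**2 + b*x + c = 0, smaller first."""
    d = sqrt(b * b - 4 * a * c)
    return (-b - d) / (2 * a), (-b + d) / (2 * a)

# (0.1 n + 0.5)^2 = Z^2 * VAR1 * n expands to
# 0.01 n^2 + (0.1 - Z^2 * VAR1) n + 0.25 = 0
small, large = roots(0.01, 0.1 - Z**2 * VAR1, 0.25)

# Same equation with the cross term 0.1 n omitted (hypothetical slip)
small_slip, large_slip = roots(0.01, -(Z**2) * VAR1, 0.25)

print(f"with cross term:    n = {small:.4f} or {large:.4f}")
print(f"without cross term: n = {small_slip:.4f} or {large_slip:.4f}")
print("rounded up:", ceil(large))
```

With the cross term kept, the larger root is about 91.5, which rounds up to the expected n ≥ 92. Without it, the roots are close to the 101.539 and 0.2453 quoted above, which is consistent with an algebra slip rather than a wrong setup.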
 
Amcote said:
Sorry, I made a typo in just that sentence; I meant P(B(n,0.25)≤0.35*n)=0.99.

Further to my response in #4: for ##X_n \sim \text{Binom}(n, .25)## the probabilities ##P(X_n \leq .35 n)## are not monotone in ##n## over short intervals of ##n##, so if you plot ##P(X_n \leq .35 n)## vs. ##n## you get a graph with "sawtooth" behavior, which rises over the long run but wiggles in the short run. The normal approximation with the modified 1/2-correction behaves that way too. The reason is that as ##n## increases, the integer values ##N_n = \lfloor .35 n \rfloor## in the events ##\{ X_n \leq .35 n \} = \{ X_n \leq N_n \}## are non-decreasing but sometimes remain constant for a few neighboring values of ##n##. If ##N_n = N_{n+1}##, the probability ##P(X_k \leq N_k)## can go down as ##k## increases from ##n## to ##n+1##. That happens because the distribution of ##X_{n+1}## is shifted to the right, but is narrower than that of ##X_n##, so the end result could be a decrease or an increase in the probability for "##\leq N_n##". For example, for ##n## going from 79 to 86 the values of ##N_n##, P_exact = ##P(\text{Binom}(n, .25) \leq N_n)## and P_normal = ##\Phi((.5 + \lfloor .35 n \rfloor -.25 n)/\sqrt{.1875n})## are
$$\begin{array}{cccc}
n & N_n & \text{P_exact} & \text{P_normal} \\
79 & 27 & 0.975007 & 0.977978 \\
80 & 28 & 0.983370 & 0.985907 \\
81 & 28 & 0.980154 & 0.982868 \\
82 & 28 & 0.976467 & 0.979337 \\
83 & 29 & 0.984286 & 0.986724 \\
84 & 29 & 0.981281 & 0.983895 \\
85 & 29 & 0.977840 & 0.980611 \\
86 & 30 & 0.985153 & 0.987495
\end{array}$$
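The table can be recomputed with a short script. This is a sketch using only the Python standard library (`math.comb` for the exact binomial sum and `statistics.NormalDist` for ##\Phi##); the function names `cutoff`, `p_exact` and `p_normal` are made up here and are not from the thread.

```python
from math import comb, sqrt
from statistics import NormalDist

P = 0.25  # chance of guessing one question correctly

def cutoff(n):
    """N_n = floor(0.35 n), computed with integers to avoid rounding error."""
    return (35 * n) // 100

def p_exact(n):
    """Exact P(Binom(n, P) <= N_n)."""
    return sum(comb(n, j) * P**j * (1 - P)**(n - j)
               for j in range(cutoff(n) + 1))

def p_normal(n):
    """Normal approximation with the 1/2-correction applied after rounding down."""
    z = (0.5 + cutoff(n) - P * n) / sqrt(P * (1 - P) * n)
    return NormalDist().cdf(z)

rows = [(n, cutoff(n), p_exact(n), p_normal(n)) for n in range(79, 87)]
for n, k, pe, pn in rows:
    print(f"{n:3d} {k:3d} {pe:.6f} {pn:.6f}")
```

Changing the `range(79, 87)` bounds extends the table to other values of ##n##, which makes the sawtooth pattern easy to inspect over a wider interval.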
 
Amcote said:
with all this I get n= 101.786

and if I actually include the correction I get n_1=101.539 and n_2=0.2453 from the quadratic.


Without the continuity correction: if you use z = 2.33 you get your n = 101.786, but if you use the more accurate value z = 2.326 you get n = 101.436. Ok, they are not that different, but one of them rounds to 102 while the other rounds to 101.
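The sensitivity to the table value of z is easy to reproduce. The sketch below assumes the same rounded standard deviation factor .433 used in the thread; the function name `n_without_correction` and its parameter names are illustrative only, and `statistics.NormalDist().inv_cdf` is used to get an unrounded z for comparison.

```python
from statistics import NormalDist

def n_without_correction(z, sigma1=0.433, margin=0.1):
    """Solve margin * n / (sigma1 * sqrt(n)) = z for n, with no 1/2-correction.

    sigma1 is sqrt(p(1-p)) rounded as in the thread; margin is 0.35 - 0.25.
    """
    return (z * sigma1 / margin) ** 2

for z in (2.33, 2.326, NormalDist().inv_cdf(0.99)):
    print(f"z = {z:.4f}  ->  n = {n_without_correction(z):.3f}")
```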

More serious, though, is the non-monotone behavior of the probability, as explained in post #6. That means that you can have several nearby solutions to the required inequality ##P(\text{Binom}(n,.25) \leq .35 n) = P(\text{Binom}(n,.25) \leq \lfloor .35 n \rfloor) \geq 0.99##. This happens in both the exact analysis and in the normal approximation (with the 1/2-correction included after the rounding-down operation).

By the way: your statement "First I should say I know the answer is supposed to be n≥92" is misleading: n around 92 is too small to achieve the 99% probability.
 