Stats: Approximating a binomial with a normal distribution

Homework Help Overview

The discussion revolves around a problem involving a multiple choice test with four possible answers per question. The objective is to determine the number of questions required to ensure that a student guessing answers has no more than a 35% score with 99% confidence. This scenario is framed within a binomial distribution context, where the probability of success is set at 0.25.

Discussion Character

  • Exploratory, Assumption checking, Problem interpretation

Approaches and Questions Raised

  • Participants discuss the use of the binomial distribution and the normal approximation with continuity correction. There are attempts to derive a formula for Z and to set up equations based on the cumulative distribution function. Some participants express confusion regarding the correct interpretation of the probability statement and the application of the continuity correction.

Discussion Status

Several participants have shared their attempts at solving the problem, noting specific values and equations they have derived. There is an acknowledgment of a typo in the probability statement, which has been clarified. The discussion includes differing opinions on the appropriateness of the continuity correction and the implications of rounding in the calculations. No consensus has been reached on the correct approach or solution.

Contextual Notes

Participants note that the problem involves a binomial distribution with a specific probability of success and that the continuity correction is a point of contention. There is mention of the non-monotonic behavior of the probabilities as the number of questions changes, which adds complexity to the analysis.

Amcote

Homework Statement


A multiple choice test consists of a series of questions, each with four possible answers.

How many questions are needed in order to be 99% confident that a student who guesses blindly at each question scores no more than 35% on the test?

Homework Equations



So I know that this is a binomial setting with p=0.25 and 'n' is what we are trying to solve for.
for binomial, μ=n*p, σ=sqrt(n*p(1-p))
P(B(n,0.25)≤0.35*n)=0.99
And because of the binomial setting, we must use a correction factor, in this case '+0.5'
Z= (x - μ)/σ

The Attempt at a Solution



First I should say I know the answer is supposed to be n≥92

So how I start this problem is I use the standardizing formula

Z= (x - μ)/σ

which in this case would be

Z= (0.35*n + 0.5 - n*0.25)/sqrt(n*0.25*0.75)

This simplifies to

Z=(0.1*n + 0.5)/(sqrt(n)*sqrt(0.1875))

I think what I have done so far is correct. But where I get confused is finding a value for Z,

I thought what I have to do is something like :

P(B(n,0.25)≤0.35*n)=Φ((0.1*n + 0.5)/(sqrt(n)*sqrt(0.1875))) = 0.99

so, Φ((0.1*n + 0.5)/(sqrt(n)*sqrt(0.1875))) = 0.99

look up 0.01 on the table which gives me Φ(2.33)=Φ((0.1*n + 0.5)/(sqrt(n)*sqrt(0.1875)))

and so I should be able to set those equal and solve for n:

2.33 = (0.1*n + 0.5)/(sqrt(n)*sqrt(0.1875))

When I solve for n I get a quadratic equation, but neither root I get is the correct answer.

Any help would be appreciated.

Thanks!
 
Amcote said:

P(B(n,0.25)≥0.35*n)=0.99

When you write P(B(n,0.25)≥0.35*n)=0.99 you are asking to be 99% sure that the student scores at least 35% (i.e., 35% or better). That was not what you were asked!
 
Ray Vickson said:
When you write P(B(n,0.25)≥0.35*n)=0.99 you are asking to be 99% sure that the student scores at least 35% (i.e., 35% or better). That was not what you were asked!

Sorry I made a typo just in that sentence, I meant P(B(n,0.25)≤0.35*n)=0.99.
 
Amcote said:

And because of the binomial setting, we must use a correction factor, in this case '+0.5'

In the context of this particular problem, just blindly using the "1/2 correction" is a mistake. What would be correct would be to ask for the solution of
$$\Phi \left( \frac{.5 + \lfloor 0.35 n \rfloor -.25 n}{.433 \sqrt{n}} \right) = 0.99,$$
where "##\lfloor 0.35 n \rfloor##" is the largest integer ##\leq 0.35 n##.

Personally, I would not try to solve that exactly as written; instead (if I were doing the question) I would drop rounding-down and the 1/2-correction altogether and just solve the resulting very simple problem. Then, if I really wanted to be sure of the solution, I would check manually one or two values of ##n## surrounding the solution, either using the exact binomial or the normal approximation with the more involved form of 1/2-correction indicated above.
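
The procedure described above (solve the uncorrected problem, then check a few neighboring values of n) could be sketched roughly as follows. This is an illustrative Python sketch, not code from the thread: `binom_cdf` and `rough_n` are hypothetical helper names, and the cutoff floor(0.35 n) is computed in integer arithmetic as `35 * n // 100` to avoid floating-point rounding surprises.

```python
import math
from statistics import NormalDist

P = 0.25          # chance of guessing one question correctly
CUTOFF_PCT = 35   # "no more than 35%" threshold, as a whole percent


def binom_cdf(k, n, p=P):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def rough_n(conf=0.99, p=P, cutoff=0.35):
    """Solve Phi((cutoff - p) * n / sqrt(n p (1 - p))) = conf for n (no rounding, no 1/2-correction)."""
    z = NormalDist().inv_cdf(conf)
    return (z * math.sqrt(p * (1 - p)) / (cutoff - p)) ** 2


n0 = rough_n()
print("rough n:", round(n0, 3))

# Check a few integers around the rough solution against the exact binomial.
for n in range(math.floor(n0) - 2, math.ceil(n0) + 3):
    k = CUTOFF_PCT * n // 100  # largest whole score that is <= 35% of n
    print(n, k, round(binom_cdf(k, n), 6))
```

Each printed probability can then be compared with 0.99 directly; the rough value only says where to start looking.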
 
Ray Vickson said:
Personally, I would not try to solve that exactly as written; instead (if I were doing the question) I would drop rounding-down and the 1/2-correction altogether and just solve the resulting very simple problem. Then, if I really wanted to be sure of the solution, I would check manually one or two values of ##n## surrounding the solution, either using the exact binomial or the normal approximation with the more involved form of 1/2-correction indicated above.

It was made clear by my instructor that we should "apply a normal approximation with a continuity correction" for this problem. But even if I ignore all that and do what you suggest (as it should work that way) I get:
$$\Phi \left( \frac{ 0.1 n }{.433 \sqrt{n}} \right) = \Phi \left( 2.33 \right) = 0.99,$$

so,

$$\frac{ 0.1 n }{.433 \sqrt{n}} = 2.33$$

with all this I get n= 101.786

and if I actually include the correction I get ##n_1=101.539## and ##n_2=0.2453## from the quadratic.

None of these are the correct answer so I'm wondering what it is I am doing wrong.

Thanks
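
For what it's worth, the continuity-corrected equation 2.33 = (0.1 n + 0.5)/(0.433 sqrt(n)) is a quadratic in sqrt(n), so it is easy to check numerically. The sketch below is not from the thread: the function names are made up for illustration, and it assumes the same rounded values 2.33 and 0.433 used above. Solved as written, the roots come out near n = 91.5 and n = 0.27, so the larger one rounds up to the 92 quoted in the first post. Squaring both sides but dropping the cross term 0.1 n gives values close to the 101.539 and 0.2453 reported above, which may be where the discrepancy arises.

```python
import math

Z = 2.33            # table value used in the thread for the 99th percentile
GAP = 0.35 - 0.25   # cutoff fraction minus p
SD = 0.433          # rounded sqrt(0.25 * 0.75), as used in the thread
SHIFT = 0.5         # the "+0.5" continuity correction


def roots_in_sqrt_n(z=Z, gap=GAP, sd=SD, shift=SHIFT):
    """Solve gap*x**2 - z*sd*x + shift = 0 for x = sqrt(n); return the n values, largest first."""
    b = z * sd
    root = math.sqrt(b * b - 4 * gap * shift)
    xs = [(b + root) / (2 * gap), (b - root) / (2 * gap)]
    return [x * x for x in xs]


def roots_without_cross_term(z=Z, gap=GAP, sd=SD, shift=SHIFT):
    """Square both sides but (incorrectly) omit the 2*gap*shift*n cross term."""
    a, b, c = gap**2, -((z * sd) ** 2), shift**2
    root = math.sqrt(b * b - 4 * a * c)
    return [(-b + root) / (2 * a), (-b - root) / (2 * a)]


n_large, n_small = roots_in_sqrt_n()
m_large, m_small = roots_without_cross_term()
print("solved as written:     ", round(n_large, 3), round(n_small, 4))
print("cross term left out:   ", round(m_large, 3), round(m_small, 4))
```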
 
Amcote said:
Sorry I made a typo just in that sentence, I meant P(B(n,0.25)≤0.35*n)=0.99.

Further to my response in #4: for ##X_n \sim \text{Binom}(n, .25)## the probabilities ##P(X_n \leq .35 n)## are not monotone in ##n## over short intervals of ##n##, so if you plot ##P(X_n \leq .35 n)## vs. ##n## you get a graph with a "sawtooth" behavior, which rises over the long run but wiggles in the short run. The normal approximation with modified 1/2-correction behaves that way too. The reason is that as ##n## increases the integer values ##N_n## in the events ##\{ X_n \leq .35 n \} = \{ X_n \leq N_n \}## are non-decreasing but sometimes remain constant for a few neighboring values of ##n##. If ##N_n = N_{n+1}## the probability ##P(X_k \leq N_k)## can go down as ##k## increases from ##n## to ##n+1##. That happens because the distribution of ##X_{n+1}## is shifted to the right, but is narrower than that of ##X_n##, so the end result could be a decrease or an increase in the probability for "##\leq N_n ##". For example, for ##n## going from 79 to 86 the values of ##N_n##, P_exact = ##P(\text{Binom}(n, .25) \leq N_n)## and P_normal = ##\Phi((.5 + \lfloor .35 n \rfloor -.25 n)/\sqrt{.1875n})## are
$$\begin{array}{cccc}
n & N_n & \text{P_exact} & \text{P_normal} \\
79 & 27 & 0.975007 & 0.977978 \\
80 & 28 & 0.983370 & 0.985907 \\
81 & 28 & 0.980154 & 0.982868 \\
82 & 28 & 0.976467 & 0.979337 \\
83 & 29 & 0.984286 & 0.986724 \\
84 & 29 & 0.981281 & 0.983895 \\
85 & 29 & 0.977840 & 0.980611 \\
86 & 30 & 0.985153 & 0.987495
\end{array}$$
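
A table like the one above can be regenerated with a few lines of standard-library Python. This is only a sketch under stated assumptions: `floor_cutoff`, `p_exact` and `p_normal` are hypothetical names, and ##N_n = \lfloor .35 n \rfloor## is computed with integer arithmetic (`35 * n // 100`) rather than floating point.

```python
import math
from statistics import NormalDist


def floor_cutoff(n, pct=35):
    """N_n: the largest integer <= 0.35 n, computed without floating point."""
    return pct * n // 100


def p_exact(n, p=0.25):
    """Exact binomial probability P(X_n <= N_n)."""
    k = floor_cutoff(n)
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def p_normal(n, p=0.25):
    """Normal approximation with the 1/2-correction applied after rounding down."""
    k = floor_cutoff(n)
    return NormalDist().cdf((k + 0.5 - n * p) / math.sqrt(n * p * (1 - p)))


for n in range(79, 87):
    print(n, floor_cutoff(n), f"{p_exact(n):.6f}", f"{p_normal(n):.6f}")
```

Whenever ##N_n## stays the same from one row to the next, both columns drop, which is the sawtooth described above.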
 
Amcote said:
with all this I get n= 101.786

and if I actually include the correction I get n_1=101.539 and n_2=0.2453 from the quadratic.


Without the continuity correction: if you use z = 2.33 you get your n = 101.786, but if you use the more accurate value z = 2.326 you get n = 101.436. Ok, they are not that different, but one of them rounds to 102 while the other rounds to 101.
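
The sensitivity to the z-value is easy to see numerically. The sketch below is illustrative only (the function name is hypothetical) and assumes the same rounded standard-deviation factor 0.433 used in the thread; it also prints the result for the unrounded 99th-percentile z from Python's `statistics.NormalDist`.

```python
from statistics import NormalDist


def n_without_correction(z, gap=0.1, sd=0.433):
    """Solve gap * n / (sd * sqrt(n)) = z for n, i.e. n = (z * sd / gap)**2."""
    return (z * sd / gap) ** 2


z_precise = NormalDist().inv_cdf(0.99)  # unrounded 99th percentile of the standard normal

for z in (2.33, 2.326, z_precise):
    print(f"z = {z:.4f}  ->  n = {n_without_correction(z):.3f}")
```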

More serious, though, is the non-monotone behavior of the probability, as explained in post #6. That means that you can have several nearby solutions to the required inequality ##P(\text{Binom}(n,.25) \leq .35 n) = P(\text{Binom}(n,.25) \leq \lfloor .35 n \rfloor) \geq 0.99##. This happens in both the exact analysis and in the normal approximation (with 1/2-correction included after the rounding down operation).

By the way: your statement "First I should say I know the answer is supposed to be n≥92" is misleading: n around 92 is too small to achieve the 99% probability.
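
One way to see the "several nearby solutions" effect is to scan a range of n with the exact binomial and list which values meet the 99% requirement. This is a sketch, not part of the original posts: the scan range 80 to 130 is an arbitrary assumption, and `p_exact` and `scan` are hypothetical names.

```python
import math


def p_exact(n, p=0.25, pct=35):
    """Exact P(Binom(n, p) <= floor(0.35 n)), with the floor taken in integer arithmetic."""
    k = pct * n // 100
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def scan(lo=80, hi=130, target=0.99):
    """Split lo..hi into the n that meet the target probability and those that do not."""
    ok = [n for n in range(lo, hi + 1) if p_exact(n) >= target]
    fails = [n for n in range(lo, hi + 1) if p_exact(n) < target]
    return ok, fails


ok, fails = scan()
print("n meeting the requirement:", ok)
print("largest failing n in the scanned range:", max(fails) if fails else None)
```

If the first list has gaps, the smallest n that works is not the same as the n beyond which every test length works, so the question has to say which of the two is wanted.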
 