Stats: Approximating a binomial with a normal distribution

What I don't understand is why, given P(B(n,0.25)≤0.35*n)=0.99, I get the wrong answer when I solve for n. I looked at the solution and it says n=92, but when I solve for n I get a quadratic equation, so I'm not sure where I'm going wrong. In summary, this problem involves finding the number of questions needed for a multiple choice test in order to be 99% confident that a student who guesses blindly at each question scores no more than 35% on the test. The problem can be solved using a normal approximation with a continuity correction. However, blindly using the 1/2 correction is
  • #1
Amcote

Homework Statement


A multiple choice test consists of a series of questions, each with four possible answers.

How many questions are needed in order to be 99% confident that a student who guesses blindly at each question scores no more than 35% on the test?

Homework Equations



So I know that this is a binomial setting with p=0.25, and 'n' is what we are trying to solve for.
For a binomial, μ=n*p, σ=sqrt(n*p(1-p))
P(B(n,0.25)≤0.35*n)=0.99
Because the binomial is discrete, we must use a continuity correction, in this case '+0.5'
Z= (x - μ)/σ

The Attempt at a Solution



First I should say I know the answer is supposed to be n≥92

So how I start this problem is I use the standardizing formula

Z= (x - μ)/σ

which in this case would be

Z= (0.35*n + 0.5 - n*0.25)/sqrt(n*0.25*0.75)

This simplifies to

Z=(0.1*n + 0.5)/(sqrt(n)*sqrt(0.1875))

I think what I have done so far is correct, but where I get confused is finding a value for Z.

I thought what I had to do was something like:

P(B(n,0.25)≤0.35*n)=Φ((0.1*n + 0.5)/(sqrt(n)*sqrt(0.1875))) = 0.99

so, Φ((0.1*n + 0.5)/(sqrt(n)*sqrt(0.1875))) = 0.99

Looking up 0.99 in the table (upper-tail area 0.01) gives z = 2.33, so Φ(2.33)=Φ((0.1*n + 0.5)/(sqrt(n)*sqrt(0.1875)))

and so I should be able to set those equal and solve for n:

2.33 = (0.1*n + 0.5)/(sqrt(n)*sqrt(0.1875))

When I solve for n I get a quadratic equation, but neither of the roots I get is the correct answer.

Any help would be appreciated.

Thanks!
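
As a check on the algebra, the last equation can be solved numerically. The sketch below assumes z = 2.33 and the plain "+0.5" correction exactly as written above, and treats the equation as a quadratic in s = sqrt(n); the variable names are illustrative only.

```python
import math

# Assumptions: z = 2.33 (table value for 0.99) and the plain "+0.5" correction.
# z = (0.1*n + 0.5) / (sqrt(n) * sqrt(0.1875)) is a quadratic in s = sqrt(n):
#   0.1*s**2 - z*sqrt(0.1875)*s + 0.5 = 0
z = 2.33
a = 0.1
b = -z * math.sqrt(0.25 * 0.75)
c = 0.5

disc = b * b - 4 * a * c
s_roots = [(-b + math.sqrt(disc)) / (2 * a), (-b - math.sqrt(disc)) / (2 * a)]
n_roots = [s * s for s in s_roots]  # convert back from sqrt(n) to n

for n in n_roots:
    # Substitute back to confirm each root satisfies the original equation.
    check = (0.1 * n + 0.5) / math.sqrt(0.1875 * n)
    print(f"n = {n:.3f}, z check = {check:.3f}")
```

Under these assumptions the larger root comes out near 91.5, so the smallest integer n meeting the approximate condition is 92. The small root also solves the quadratic but has no meaning for this problem.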
 
  • #2
Amcote said:

P(B(n,0.25)≥0.35*n)=0.99

When you write P(B(n,0.25)≥0.35*n)=0.99 you are asking to be 99% sure that the student scores at least 35% (i.e., 35% or better). That was not what you were asked!
 
  • #3
Ray Vickson said:
When you write P(B(n,0.25)≥0.35*n)=0.99 you are asking to be 99% sure that the student scores at least 35% (i.e., 35% or better). That was not what you were asked!

Sorry, I made a typo in just that sentence; I meant P(B(n,0.25)≤0.35*n)=0.99.
 
  • #4
In the context of this particular problem, just blindly using the "1/2 correction" is a mistake. What would be correct would be to ask for the solution of
[tex] \Phi \left( \frac{.5 + \lfloor 0.35 n \rfloor -.25 n}{.433 \sqrt{n}} \right) = 0.99, [/tex]
where "##\lfloor 0.35 n \rfloor##" is the largest integer ##\leq 0.35 n##.

Personally, I would not try to solve that exactly as written; instead (if I were doing the question) I would drop rounding-down and the 1/2-correction altogether and just solve the resulting very simple problem. Then, if I really wanted to be sure of the solution, I would check manually one or two values of ##n## surrounding the solution, either using the exact binomial or the normal approximation with the more involved form of 1/2-correction indicated above.
 
  • #5
It was made clear by my instructor that we should "apply a normal approximation with a continuity correction" for this problem. But even if I ignore all that and do what you suggest (as it should work that way) I get:
[tex] \Phi \left( \frac{ 0.1 n }{.433 \sqrt{n}} \right) = \Phi \left( 2.33 \right) = 0.99, [/tex]

so,

[tex]\frac{ 0.1 n }{.433 \sqrt{n}} = 2.33[/tex]

with all this I get [tex]n= 101.786[/tex]

and if I actually include the correction I get [tex]n_1=101.539[/tex] and [tex]n_2=0.2453[/tex] from the quadratic.

None of these are the correct answer so I'm wondering what it is I am doing wrong.

Thanks
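
For comparison, here is the corrected equation worked through by hand, assuming z = 2.33 and the plain "+0.5" correction (a sketch of the algebra only, not necessarily the textbook's route). Squaring both sides of 0.1 n + 0.5 = 2.33 sqrt(0.1875 n) gives

[tex] 0.01 n^2 + 0.1 n + 0.25 = 1.0179\, n \quad \Longrightarrow \quad 0.01 n^2 - 0.9179\, n + 0.25 = 0, [/tex]

whose roots are approximately [tex]n \approx 91.52[/tex] and [tex]n \approx 0.273[/tex], so rounding up gives 92. If the cross term [tex]0.1 n[/tex] from expanding [tex](0.1 n + 0.5)^2[/tex] is left out, the equation becomes [tex]0.01 n^2 - 1.0179\, n + 0.25 = 0[/tex], with roots near 101.5 and 0.25, which is close to the figures quoted above; a dropped cross term is one possible explanation for the discrepancy.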
 
  • #6
Amcote said:
Sorry I made a typo just in that sentence, I meant P(B(n,0.25)≤0.35*n)=0.99.

Further to my response in #4: for ##X_n \sim \text{Binom}(n, .25)## the probabilities ##P(X_n \leq .35 n)## are not monotone in ##n## over short intervals of ##n##, so if you plot ##P(X_n \leq .35 n)## vs. ##n## you get a graph with a "sawtooth" behavior, which rises over the long run but wiggles in the short run. The normal approximation with modified 1/2-correction behaves that way too. The reason is that as ##n## increases the integer values ##N_n = \lfloor .35 n \rfloor## in the events ##\{ X_n \leq .35 n \} = \{ X_n \leq N_n \}## are non-decreasing but sometimes remain constant for a few neighboring values of ##n##. If ##N_n = N_{n+1}## the probability ##P(X_k \leq N_k)## can go down as ##k## increases from ##n## to ##n+1##. That happens because the distribution of ##X_{n+1}## is shifted to the right, but is narrower than that of ##X_n##, so the end result could be a decrease or an increase in the probability for "##\leq N_n ##".

For example, for ##n## going from 79 to 86 the values of ##N_n##, P_exact = ##P(\text{Binom}(n, .25) \leq N_n)## and P_normal = ##\Phi((.5 + \lfloor .35 n \rfloor -.25 n)/\sqrt{.1875n})## are
[tex] \begin{array}{cccc}
n & N_n & \text{P_exact} & \text{ P_normal} \\
79 & 27 & 0.975007 & 0.977978 \\
80 & 28 & 0.983370 & 0.985907 \\
81 & 28 & 0.980154 & 0.982868 \\
82 & 28 & 0.976467 & 0.979337 \\
83 & 29 & 0.984286 & 0.986724 \\
84 & 29 & 0.981281 & 0.983895 \\
85 & 29 & 0.977840 & 0.980611 \\
86 & 30 & 0.985153 & 0.987495
\end{array}
[/tex]
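
A table like this can be reproduced with a few lines of standard-library Python. The following is a minimal sketch, not the method actually used for the table above; the helper names `cutoff`, `p_exact`, and `p_normal` are made up for illustration, and the floor is computed in integer arithmetic on the assumption that this is safer than `floor(0.35 * n)` in floating point.

```python
import math
from statistics import NormalDist

P = 0.25  # probability of guessing one four-option question correctly


def cutoff(n: int) -> int:
    """N_n = floor(0.35 * n), computed in integers to avoid float rounding."""
    return (35 * n) // 100


def p_exact(n: int) -> float:
    """Exact P(X_n <= N_n) for X_n ~ Binom(n, 0.25)."""
    return sum(math.comb(n, k) * P**k * (1 - P) ** (n - k)
               for k in range(cutoff(n) + 1))


def p_normal(n: int) -> float:
    """Normal approximation with the 1/2-correction applied after rounding down."""
    z = (0.5 + cutoff(n) - P * n) / math.sqrt(n * P * (1 - P))
    return NormalDist().cdf(z)


# One row per n: n, N_n, exact probability, normal approximation.
for n in range(79, 87):
    print(f"{n:3d} {cutoff(n):3d} {p_exact(n):.6f} {p_normal(n):.6f}")
```

Whenever `cutoff(n)` stays the same from one n to the next, the exact probability drops, which is the downward stroke of the sawtooth.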
 
  • #7
Without the continuity correction: if you use z = 2.33 you get your n = 101.786, but if you use the more accurate value z = 2.326 you get n = 101.436. Ok, they are not that different, but one of them rounds to 102 while the other rounds to 101.
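
The sensitivity to the z value is easy to check. The sketch below assumes the uncorrected equation 0.1 n / (σ₁ sqrt(n)) = z with σ₁ = sqrt(0.1875); the function name `n_without_correction` is hypothetical, and the rounded σ₁ = 0.433 is included only because the posts above use it.

```python
from statistics import NormalDist


def n_without_correction(z: float, sigma1: float = 0.1875 ** 0.5) -> float:
    """Solve 0.1*n / (sigma1*sqrt(n)) = z for n; sigma1 = sqrt(p(1-p))."""
    return (z * sigma1 / 0.1) ** 2


z_table = 2.33
z_finer = 2.326
z_lib = NormalDist().inv_cdf(0.99)  # library value of the 0.99 quantile

for z in (z_table, z_finer, z_lib):
    # 0.433 reproduces the rounded sigma used in the posts above.
    print(f"z = {z:.4f}: n = {n_without_correction(z, 0.433):.3f} "
          f"(rounded sigma), {n_without_correction(z):.3f} (unrounded sigma)")
```

Because n scales with z squared, a change in the third decimal of z is enough to move n across a rounding boundary.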

More serious, though, is the non-monotone behavior of the probability, as explained in post #6. That means that you can have several nearby solutions to the required inequality ##P(\text{Binom}(n,.25) \leq .35 n) = P(\text{Binom}(n,.25) \leq \lfloor .35 n \rfloor) \geq 0.99##. This happens in both the exact analysis and in the normal approximation (with 1/2-correction included after the rounding down operation).
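
One way to see the nearby solutions is to scan a range of n with the exact binomial and list where the inequality holds. This is a rough sketch under stated assumptions: the scan window 60 to 140 is arbitrary, and `p_exact` is an illustrative helper, not anything from the thread.

```python
import math


def p_exact(n: int) -> float:
    """Exact P(Binom(n, 0.25) <= floor(0.35 n)), with the floor done in integers."""
    cutoff = (35 * n) // 100
    return sum(math.comb(n, k) * 0.25**k * 0.75 ** (n - k)
               for k in range(cutoff + 1))


TARGET = 0.99
LO, HI = 60, 140  # arbitrary scan window chosen for illustration

meets = {n: p_exact(n) >= TARGET for n in range(LO, HI + 1)}
satisfying = [n for n, ok in meets.items() if ok]
# Places where n meets the target but n + 1 does not: the downward "teeth".
drops = [n for n in range(LO, HI) if meets[n] and not meets[n + 1]]

print("n meeting the target:", satisfying)
print("n where the next value fails again:", drops)
```

If `drops` is non-empty, the set of n meeting the target is not a single interval, so "the smallest n that works" and "the n beyond which it always works" are different questions.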

By the way: your statement "First I should say I know the answer is supposed to be n≥92" is misleading: n around 92 is too small to achieve the 99% probability.
 

1. What is the purpose of approximating a binomial distribution with a normal distribution?

Approximating a binomial distribution with a normal distribution replaces a sum of many binomial terms with a single evaluation of the normal CDF Φ, which is easy to look up in tables and easy to invert, for example when solving for n as in this thread.

2. How is a binomial distribution different from a normal distribution?

A binomial distribution is a discrete probability distribution that models the number of successes in a fixed number of independent trials, while a normal distribution is a continuous probability distribution that describes the distribution of a continuous random variable.

3. What are the assumptions made when approximating a binomial distribution with a normal distribution?

The usual conditions are that the trials are independent with a constant probability of success p, and that n is large enough for both np and n(1-p) to be reasonably large; a common rule of thumb is np ≥ 5 and n(1-p) ≥ 5 (some texts use 10). A fixed threshold on n alone, such as n ≥ 30, is not enough when p is close to 0 or 1.

4. How do you determine the parameters of the normal distribution when approximating a binomial distribution?

The mean of the normal distribution is equal to np, where n is the sample size and p is the probability of success. The standard deviation is equal to √(np(1-p)).
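
As a small illustration, the matching parameters can be computed directly. This sketch uses n = 92 and p = 0.25 from the thread above purely as example inputs; `normal_params` is a hypothetical helper name.

```python
import math


def normal_params(n, p):
    """Mean and standard deviation of the normal curve matched to Binom(n, p)."""
    return n * p, math.sqrt(n * p * (1 - p))


mu, sigma = normal_params(92, 0.25)
print(f"mu = {mu}, sigma = {sigma:.4f}")
```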

5. What are some common uses of approximating a binomial distribution with a normal distribution?

Some common uses of approximating a binomial distribution with a normal distribution include predicting the number of successes in a large number of trials, analyzing survey data, and making predictions in fields such as finance and psychology.
