Statistics probability questions

Rifscape · Feb 18, 2017

Homework Statement

Each week, Stéphane needs to prepare 4 exercises for the following week's homework assignment. The number of problems he creates in a week follows a Poisson distribution with mean 6.9.

a. What is the probability that Stéphane manages to create enough exercises for the following week's homework? Round your answer to 4 decimal places.

b. Unfortunately, each week there is a 55% chance that a visiting scholar from Switzerland arrives and burdens Stéphane with research questions all week. During these weeks he only writes an average of 3.45 exercises. If Stéphane fails to write 4 exercises one week, what is the probability that he received a visiting scholar that week? Round your answer to 4 decimal places.

c. The last week of the semester, Stéphane decides to "reward" the students by no longer limiting himself to 4 exercises, and instead assigning every exercise he writes. If a student with a 60% chance of correctly answering an exercise is expected to answer 3 questions correctly, what is the probability that Stéphane did not have a visitor that week? Round your answer to 4 decimal places.
Hint: First find the number of exercises in the last week of the semester from the chance and expected value of the correct answers.

Homework Equations

P(A and B) = P(A) * P(B) (for independent A and B)
P(A | B) = P(A and B) / P(B)
Poisson distribution equation:
P(x; μ) = (e-μ) (μx) / x!

The Attempt at a Solution

I was able to finish the first question and get the right answer but I'm having trouble on parts b and c.
For the first question:
P(x = 4) = 1 - P(x <= 3) = 1 - (P(0) + P(1) + P(2) + P(3))

Then I used the poisson distribution equation and was able to get the answer.

I think for b you need to use the conditional equation, but I'm not sure what the P(failing) would be.

I have no idea how to do c

Any help is appreciated, thanks for reading

StoneTemplePython · Feb 18, 2017

The formatting here is kind of hard to read... I think your Poisson distribution equation is wrong though. If you are using parameter ##\mu ## instead of the perhaps more typical ##\lambda##, the poisson pmf is given by:

##P(x, \mu) = \frac{\mu^x e^{-\mu}}{x!}##

I'd focus first on part b. Draw a tree with a 55% chance of having a situation with ##\mu := 3.45## and a 45% chance of ##\mu := 6.9##. Trees are visually powerful, and quite useful in computational settings -- so give it a shot and draw this tree. (Historical note: drawing a tree is also how Pascal solved the 'original' probability problem -- the problem of points.) For each of the two leaves of this simple tree, what is the probability of not having done at least 4 exercises? I.e. what is ##\big(P(x=0, \mu) + P(x=1, \mu) + P(x=2, \mu) + P(x=3, \mu)\big)## for both leaves of this tree? From here you just need to get a normalizing constant so that your 'posterior' sums to one. Are you familiar with this?

Part C flips the conditioning around -- à la Bayesian inference, but I'd focus on solving b first.
- - - -

btw for part A), your equation says "P(x = 4) " but it really should read ##P(x \geq 4)##

Rifscape · Feb 19, 2017

Alright I did that, and was able to get the P(failing) since P(failing) = P(failing|scholar) + P(failing|no scholar)

Alright I was able to get .88 for the second question, though I don't know how to do the last part.

Ray Vickson · Feb 19, 2017

Rifscape said:

Alright I did that, and was able to get the P(failing) since P(failing) = P(failing|scholar) + P(failing|no scholar)

Alright I was able to get .88 for the second question, though I don't know how to do the last part.

No, that is not correct; the correct result is
P(fail) = P(fail|scholar)*P(scholar) + P(fail|no scholar)*P(no scholar).

Rifscape · Feb 19, 2017

Wait yeah you're right, I did that, just forgot to write it down. You can only get 0.88 if you do P(fail) = P(fail|scholar)*P(scholar) + P(fail|no scholar)*P(no scholar).

Do you know how to start part c?

Thanks for the help

StoneTemplePython · Feb 19, 2017

Shouldn't (c) be something like ##0.6(p*E[X_1] + (1-p)*E[X_2]) = 3##

via law of total expectation and linearity of expectations? All you are doing is solving for p, there. Again, drawing a picture first -- i.e. a tree-- should get you there.

I really can't emphasize this enough -- if at all possible, try drawing a picture to help solve your problems in probability. You should pretty much always be able to do this when you are working with a countable number of states...

Ray Vickson · Feb 19, 2017

Rifscape said:

Wait yeah you're right, I did that, just forgot to write it down. You can only get 0.88 if you do P(fail) = P(fail|scholar)*P(scholar) + P(fail|no scholar)*P(no scholar).

Do you know how to start part c?

Thanks for the help

I don't get 0.88 for part (b).

Rifscape · Feb 19, 2017

Yeah, I drew a tree for part b and it made sense and I was able to get it, but I have been unable to do the same for this question. I attempted to draw one based on the formula you gave, but I don't think it's fully correct.

When I plugged in 6.9 for E[X1] and 3.45 for E[X2], I solved for p and got p= 0.4492, which is not correct. Though I think I am on the right track.

Rifscape · Feb 19, 2017

Ray Vickson said:

I don't get 0.88 for part (b).

I got that value after dividing P(scholar arriving and failing) by P(fail) = P(fail|scholar)*P(scholar) + P(fail|no scholar)*P(no scholar).

StoneTemplePython · Feb 19, 2017

Ray Vickson said:

I don't get 0.88 for part (b).

Rifscape said:

I got that value after dividing P(scholar arriving and failing) by P(fail) = P(fail|scholar)*P(scholar) + P(fail|no scholar)*P(no scholar).

Setting aside the nit that the answer wants 4 decimals, 0.88 checks out and is correct.
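
The tree for part (b) can be sketched numerically as follows. This is only an illustration under the problem's stated numbers (55% visitor chance, means 3.45 and 6.9, "fail" meaning fewer than 4 exercises); the function and variable names are made up for this example.

```python
import math

def poisson_cdf(k_max: int, mu: float) -> float:
    """P(X <= k_max) for X ~ Poisson(mu)."""
    return sum(math.exp(-mu) * mu**k / math.factorial(k) for k in range(k_max + 1))

p_visitor = 0.55

# Leaves of the tree: probability of writing fewer than 4 exercises.
p_fail_visitor = poisson_cdf(3, 3.45)
p_fail_no_visitor = poisson_cdf(3, 6.9)

# Law of total probability gives the normalizing constant P(fail).
p_fail = p_fail_visitor * p_visitor + p_fail_no_visitor * (1 - p_visitor)

# Bayes' rule: P(visitor | fail).
p_visitor_given_fail = p_fail_visitor * p_visitor / p_fail
print(round(p_visitor_given_fail, 4))
```

The printed value should agree with the 0.88 quoted above, to the two decimals given there.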

Rifscape · Feb 19, 2017

Alright cool that's good, just need to solve part c then. The p value I get from that equation doesn't seem to work.

Would X1=6.9 and X2=3.45? Or are the expected values different from the means of a poisson distribution

StoneTemplePython · Feb 19, 2017

Rifscape said:

Alright cool that's good, just need to solve part c then. The p value I get from that equation doesn't seem to work.

Would X1=6.9 and X2=3.45? Or are the expected values different from the means of a poisson distribution

To be clear, ##X_1## and ##X_2## are random variables. But at a high level, yes, you might interpret it as ##E[X_1] = \mu_1## and ##E[X_2] = \mu_2##. Unfortunately a lot of probability problems turn into linguistic ones where meanings easily get trampled on...

My issue with part (c) is that the question is poorly worded.

Your problem begins by saying

"Each week, Stéphane needs to prepare 4 exercises for the following week's homework assignment."

Then part c says:

The last week of the semester, Stéphane decides to "reward" the students by no longer limiting himself to 4 exercises

I have two interpretations here. One: the question is busted -- the idea of Stéphane preparing questions for the following week during the last week simply doesn't make any sense, as those questions would be issued after the semester ends. The alternative interpretation is, loosely speaking, that we have a renewal here -- so basically you have a Poisson with a cap at 4 from the prior week, and then another unbounded Poisson that happens during the last week for issuing more exercises. So now you'd draw a tree with a root and two levels in it. That's my current thinking at least -- the wording has room for improvement.

Rifscape · Feb 19, 2017

Yeah, my professor always has these wording issues. I think it's probably the second option, where there is a renewal and there is a new cap.

Ray Vickson · Feb 19, 2017

Rifscape said:

I got that value after dividing P(scholar arriving and failing) by P(fail) = P(fail|scholar)*P(scholar) + P(fail|no scholar)*P(no scholar).

You are correct; I was calculating the wrong thing.

Ray Vickson · Feb 19, 2017

Rifscape said:

Alright cool that's good, just need to solve part c then. The p value I get from that equation doesn't seem to work.

Would X1=6.9 and X2=3.45? Or are the expected values different from the means of a poisson distribution

The number of questions answered correctly by the student is neither ##EX_1= 6.9## nor ##EX_2 = 3.45##. These quantities are the numbers of questions on the quiz paper, (with or without a visiting scholar) but on average the student answers some of them incorrectly.

Rifscape · Feb 19, 2017

Ray Vickson said:

The number of questions answered correctly by the student is neither ##EX_1= 6.9## nor ##EX_2 = 3.45##. These quantities are the numbers of questions on the quiz paper, (with or without a visiting scholar) but on average the student answers some of them incorrectly.

How would you get the E[X1] and E[X2] then? I'm still not really sure how I would tackle this question, I drew a tree, but there seems to be many parts to this question.

StoneTemplePython · Feb 19, 2017

Ray Vickson said:

The number of questions answered correctly by the student is neither ##EX_1= 6.9## nor ##EX_2 = 3.45##. These quantities are the numbers of questions on the quiz paper, (with or without a visiting scholar) but on average the student answers some of them incorrectly.

@ Ray Vickson
OP was responding to my equation from earlier where I said: ##0.6(p*E[X_1] + (1-p)*E[X_2]) = 3##

Rifscape said:

How would you get the E[X1] and E[X2] then? I'm still not really sure how I would tackle this question, I drew a tree, but there seems to be many parts to this question.

@ Rifscape
I would still recommend using my above setup. I don't understand why Ray Vickson would make such a comment.

If you want to formalize this and abstract even further -- note: the following is unnecessary -- you can use the law of iterated expectations. Consider some random variable ##Y## which denotes the number of questions answered correctly by the student. Consider ##N##, a natural-number-valued r.v. that gives the number of questions actually offered.

Consider ##Q##, a Bernoulli r.v. that has probability 0.6 of a correct answer by said student for a given question and is independent of ##N##.

The question states ##E[Y] = 3##. Using law of iterated expectations we can rewrite this as

##E[Y] = 3##
## E[Y] = E[E[Y|N]] = 3 ##
## E[Y] = E[N E[Q]] = 3 ##
## E[Y] = E[N] E[Q] = 3##
## E[Y] = E[N] (0.6) = 3##
## E[Y] = 0.6 E[N] = 3##

When you consider that ##E[N] = (p*E[X_1] + (1-p)*E[X_2])##, you then get the equation I originally supplied. I am not supposed to supply the whole answer, so I leave finding ##E[X_1]## and ##E[X_2]## up to you, though maybe I can help if you have some follow-up questions where I won't have to give away the whole thing.

Rifscape · Feb 20, 2017

I'm still kind of stuck on how to find E[X1] and E[X2], would I use the poisson distribution equation and find the probability that x >=3 using 6.9 for X1 and 3.45 for X2?

Ray Vickson · Feb 20, 2017

Rifscape said:

I'm still kind of stuck on how to find E[X1] and E[X2], would I use the poisson distribution equation and find the probability that x >=3 using 6.9 for X1 and 3.45 for X2?

In the original problem statement it said "The number of problems he creates in a week follows a Poisson distribution with mean 6.9" before part (a) and said "During these weeks he only writes an average of 3.45 exercises" in part (b). What do you think those statements signify?

Rifscape · Feb 20, 2017

Ray Vickson said:

In the original problem statement it said "The number of problems he creates in a week follows a Poisson distribution with mean 6.9" before part (a) and said "During these weeks he only writes an average of 3.45 exercises" in part (b). What do you think those statements signify?

Wouldn't those be the expected values, for x1 and x2, since it's a poisson distribution? But I already tried those values and they didn't work.

Ray Vickson · Feb 20, 2017

Rifscape said:

Wouldn't those be the expected values, for x1 and x2, since it's a poisson distribution? But I already tried those values and they didn't work.

They should have worked. When I solved for ##p## I get a value different from your 0.4492.

Rifscape · Feb 20, 2017

StoneTemplePython said:

@ Ray Vickson
OP was responding to my equation from earlier where I said: ##0.6(p*E[X_1] + (1-p)*E[X_2]) = 3##

@ Rifscape
I would still recommend using my above setup. I don't understand why Ray Vickson would make such a comment.

If you want to formalize this and abstract even further -- note: the following is unnecessary -- you can use the law of iterated expectations. Consider some random variable ##Y## which denotes the number of questions answered correctly by the student. Consider ##N##, a natural-number-valued r.v. that gives the number of questions actually offered.

Consider ##Q##, a Bernoulli r.v. that has probability 0.6 of a correct answer by said student for a given question and is independent of ##N##.

The question states ##E[Y] = 3##. Using law of iterated expectations we can rewrite this as

##E[Y] = 3##
## E[Y] = E[E[Y|N]] = 3 ##
## E[Y] = E[N E[Q]] = 3 ##
## E[Y] = E[N] E[Q] = 3##
## E[Y] = E[N] (0.6) = 3##
## E[Y] = 0.6 E[N] = 3##

When you consider that ##E[N] = (p*E[X_1] + (1-p)*E[X_2])##, you then get the equation I originally supplied. I am not supposed to supply the whole answer, so I leave finding ##E[X_1]## and ##E[X_2]## up to you, though maybe I can help if you have some follow-up questions where I won't have to give away the whole thing.

I got x1 as 5 and x2 as 3, but when I plug them in I get?

Rifscape · Feb 20, 2017

Ray Vickson said:

They should have worked. When I solved for ##p## I get a value different from your 0.4492.

Hmm really what value ? I keep getting 0.4492, is it the same equation that python gave?

Rifscape · Feb 20, 2017

Ray Vickson said:

They should have worked. When I solved for ##p## I get a value different from your 0.4492.

Isn't it (3/0.6 - 3.45)/3.45, which equals 0.4492?
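
As a quick arithmetic check of that rearrangement, here is a small sketch that solves ##0.6(p \cdot 6.9 + (1-p) \cdot 3.45) = 3## for ##p##. It assumes, as in the setup quoted earlier, that ##p## multiplies the no-visitor mean 6.9; the variable names are illustrative only.

```python
# Solve 0.6 * (p * mu_no_visitor + (1 - p) * mu_visitor) = 3 for p.
mu_no_visitor = 6.9   # E[X_1], mean exercises without a visitor
mu_visitor = 3.45     # E[X_2], mean exercises with a visitor
p_correct = 0.6       # student's chance of answering one exercise correctly
expected_correct = 3.0

# E[N] implied by the student's expected score.
expected_n = expected_correct / p_correct

# Linear equation in p: p * (mu_no_visitor - mu_visitor) = E[N] - mu_visitor.
p = (expected_n - mu_visitor) / (mu_no_visitor - mu_visitor)
print(p)
```

Because 6.9 - 3.45 = 3.45, dividing by 3.45 as in the post above gives the same number, about 0.449.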

Ray Vickson · Feb 20, 2017

Rifscape said:

Hmm really what value ? I keep getting 0.4492, is it the same equation that python gave?

Yes, but with the 0.6 factor included, and using the correct choices of ##EX_1## and ##EX_2##.

Rifscape · Feb 20, 2017

Ray Vickson said:

Yes, but with the 0.6 factor included, and using the correct choices of ##EX_1## and ##EX_2##.

Huh, I'm using that too along with 6.9 and 3.45, but I can't get that answer

Rifscape · Feb 20, 2017

Is the number you got 0.4539?

Ray Vickson · Feb 20, 2017

Rifscape said:

Is the number you got 0.4539?

My p is (1-your p), so the answers agree when expressed in words.

Sorry for the confusion: I have been suffering from a cold that makes my head fuzzy, and so have been a bit mixed up.

StoneTemplePython · Feb 21, 2017

@Rifscape

To be honest, part (c) is probably one of the worst worded questions that I've seen. I hope you have a higher quality text to study from or are reviewing something like MIT's 6.041 (https://ocw.mit.edu/courses/electri...s-analysis-and-applied-probability-fall-2010/ ) or Harvard's intro to probability (see Joe Blitzstein on YouTube). Otherwise there's a risk of learning the opposite of clear thinking from this course.

First criticism: see prior page where I pointed out that the problem is illogical if questions are written during a given week for use in the following week, and we are talking about Stéphane coming up with questions during the last week of the course. Based on results, it seems that the renewal idea is out the door, so my criticism that the question is busted is sustained. The fix would be to say that Stéphane makes the decision to 'reward' students at the beginning of the second to last week of the course (so that the questions can be issued during the last week).

Second, and new, criticism: the answer being sought violates the law of iterated expectations. Put differently, the question asks one thing but wants an answer to a different problem. Let me simplify and consider a probabilistically identical problem: instead of a student with a 60% chance of correctly answering an exercise, consider a top student with a 100% chance of answering an exercise correctly who is expected to answer 5 questions correctly.

The equation your professor wants is: take a prior distribution ##\left[\begin{matrix}0.55\\0.45\end{matrix}\right]## and then use

##posterior \propto diag(likelihood) \left[\begin{matrix}0.55\\0.45\end{matrix}\right]##

and your prof wants you to assume that 5 questions were issued -- and use that for your likelihood function.

##posterior \propto \left[\begin{matrix}p(5, \mu= 3.45) & 0\\0 & p(5, \mu= 6.9)\end{matrix}\right] \left[\begin{matrix}0.55\\0.45\end{matrix}\right]##

##posterior \propto \left[\begin{matrix}0.12929992 & 0\\0 & 0.13135067\end{matrix}\right] \left[\begin{matrix}0.55\\0.45\end{matrix}\right]##

If you normalize the above (i.e. make sure the resulting vector sums to one), you get

##posterior = \left[\begin{matrix}0.546102377853564\\0.453897622146436\end{matrix}\right]##
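
The prior-times-likelihood step above can be reproduced with a short sketch. It assumes, as that calculation does, that exactly 5 questions were issued; the helper name `poisson_pmf` and the list layout (visitor first, no visitor second) are choices made for this example.

```python
import math

def poisson_pmf(k: int, mu: float) -> float:
    """P(X = k) for X ~ Poisson(mu)."""
    return math.exp(-mu) * mu**k / math.factorial(k)

prior = [0.55, 0.45]   # [visitor, no visitor]
means = [3.45, 6.9]    # Poisson mean under each hypothesis
n_questions = 5        # assumed observed number of questions

# Likelihood of seeing exactly 5 questions under each hypothesis.
likelihood = [poisson_pmf(n_questions, mu) for mu in means]

# Unnormalized posterior, then normalize so the entries sum to one.
unnormalized = [l * p for l, p in zip(likelihood, prior)]
total = sum(unnormalized)
posterior = [u / total for u in unnormalized]
print([round(x, 4) for x in posterior])
```

The second entry is the "no visitor" probability, which matches the 0.4539 that came up earlier in the thread.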

- - - -
The issue is this is not what the question says. If the above is supposed to be the correct answer, then it should say,

Stéphane decides to "reward" the students by no longer limiting himself to 4 exercises, and instead assigning every exercise he writes. If the assignment is given out and it has 5 questions on it, what is the probability that Stéphane did not have a visitor that week?

or, more cumbersomely, it could say:
Stéphane decides to "reward" the students by no longer limiting himself to 4 exercises, and instead assigning every exercise he writes. The assignment is given out, and it has a certain number of questions on it, that is not a random variable. If a student with a 100% chance of correctly answering an exercise is expected to answer 5 questions correctly, what is the probability that Stéphane did not have a visitor that week?

but what it actually says (using my top student setup) is:
Stéphane decides to "reward" the students by no longer limiting himself to 4 exercises, and instead assigning every exercise he writes. If a student with a 100% chance of correctly answering an exercise is expected to answer 5 questions correctly, what is the probability that Stéphane did not have a visitor that week?

In this setup, the random variable is actually the number of questions on the test, not the probability of having a visitor -- the probability of a visitor is just a parameter we are estimating. Despite what the hint suggests, you simply cannot make the jump to there being a deterministic 5 exercises. The question must specify that the student already received the x number of questions, or in some other manner make it explicit that the number of questions is now fixed. As it currently reads, there is no reason to believe that the student has seen the questions (or that Stéphane is done working on them). Instead, what we in effect learn is that a bookie tells us, based on all publicly known information, that a fair bet is that the top student will answer 5 questions correctly.

So the statement we actually get is far more general, and different, than being told that the top student has seen the homework assignment and that the top student expects to get 5 correct, and hence that there are only 5 questions being assigned (i.e. the number of questions being assigned is no longer a random variable).

The issue is that:

##posterior^T \left[\begin{matrix}3.45 \\ 6.9 \end{matrix}\right] = \left[\begin{matrix}0.546102377853564\\0.453897622146436\end{matrix}\right]^T \left[\begin{matrix}3.45 \\ 6.9 \end{matrix}\right] = 5.0159467964052062 \neq 5 ##

Hence the solution vector violates the law of iterated expectations, unless the student has seen the number of homework questions and determined it equals 5. The fundamental problem is that instead of telling you the number of questions assigned, the question writer (I believe, your prof) tried to get clever and give you an expected value statement that was not carefully worded. You may want to ask your prof about the law of iterated expectations, why the student's conditional expectation of questions being issued is not a random variable, and hence why said law of iterated expectations does not apply here. I would anticipate a fuzzy answer.
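
The mismatch is easy to confirm by taking the posterior-weighted mean directly. The sketch below hard-codes the posterior values quoted in this post; the variable names are illustrative.

```python
# Posterior over [visitor, no visitor], copied from the normalized vector above.
posterior = [0.546102377853564, 0.453897622146436]
means = [3.45, 6.9]  # expected number of exercises under each hypothesis

# Expected number of questions implied by this posterior.
implied_mean = sum(w * mu for w, mu in zip(posterior, means))
print(implied_mean)

# The top-student setup would need this to equal 5 exactly.
gap = implied_mean - 5.0
print(gap)
```

The gap is small (on the order of 0.016) but nonzero, which is the point being made about the expected-value statement.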

Ray Vickson · Feb 21, 2017

StoneTemplePython said:

@Rifscape

To be honest, part (c) is probably one of the worst worded questions that I've seen. I hope you have a higher quality text to study from or are reviewing something like MIT's 6.041 (https://ocw.mit.edu/courses/electri...s-analysis-and-applied-probability-fall-2010/ ) or Harvard's intro to probability (see Joe Blitzstein on YouTube). Otherwise there's a risk of learning the opposite of clear thinking from this course.

Let me simplify and consider a probabilistically identical problem: instead of a student with a 60% chance of correctly answering an exercise, consider a top student with a 100% chance of answering an exercise correctly who is expected to answer 5 questions correctly.

It makes perfectly good sense to ask for the posterior probability of a visiting scholar, given that a student answered 3 questions correctly and without making any assumption that the test has 5 questions. Admittedly the problem is more challenging than the "assume 5 questions" version, and may possibly be beyond the ability/knowledge of the OP, but I really do not know what the instructor intended, so having two possible versions cannot hurt. Below, let ##S## be the event "scholar week" and ##\bar{S}## the event "no-scholar week".

First of all, if 3 questions are answered correctly the test must contain ##N \geq 3## questions. For event ##\bar{S}## we have ##N \sim \text{Poisson}(\alpha)## with ##\alpha = 6.9,## while for event ##S## we have ##N \sim \text{Poisson}(\beta),## with ##\beta = 3.45##. Given an ##N = n \geq 3## the probability the student answers 3 questions correctly is the Binomial probability ##C(n,3) p^3 q^{n-3}, ## where ##p = 0.6, q = 0.4##. So, if ##Y## is the number of correctly-answered questions, then for event ##\bar{S}## we have ##P_{\bar{S}}(Y=3\: \&\: N=n) = C(n,3) p^3 q^{n-3} \alpha^n e^{-\alpha}/n!,## which simplifies to
$$P_{\bar{S}}(Y=3\: \&\: N=n) = \frac{p^3 \alpha^3}{3!} \frac{e^{-\alpha} (\alpha q)^{n-3}}{(n-3)!}.$$
Thus,
$$P(Y = 3|\bar{S}) = \sum_{n=3}^{\infty} P_{\bar{S}}(Y = 3\: \& \:N = n) = \frac{(\alpha p)^3}{3!} e^{-\alpha + \alpha q} = \frac{(\alpha p)^3 e^{-\alpha p}}{3!}.$$
Of course, this is Poisson probability with mean ##\alpha p = 0.6 \times 6.9 = 4.14.## Similarly, for event ##S##, ##Y## is Poisson with mean ##p \beta = 0.6 \times 3.45 = 2.07##. We have ##P(Y=3|\bar{S}) = 4.14^3 e^{-4.14}/3! \doteq 0.1883,## while ##P(Y=3|S) = 2.07^3 e^{-2.07}/3! \doteq 0.1865.##

The posterior probability of no visiting scholar, given that ##\{ Y = 3 \}##, is
$$P(\bar{S}|Y=3) = \frac{P(Y=3|\bar{S}) P(\bar{S})}{P(Y=3)} = \frac{(0.45)(0.1883)}{(0.45)(0.1883) + (0.55)(0.1865)}.$$
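
A numerical sketch of this version, for anyone who wants to check it: it evaluates the thinned-Poisson probabilities, cross-checks one of them against the explicit sum over ##n##, and then applies Bayes' rule. Truncating the sum at ##n = 60## is an assumption made here for convenience (the tail is negligible for these means), and the function names are illustrative.

```python
import math

def poisson_pmf(k: int, mu: float) -> float:
    """P(X = k) for X ~ Poisson(mu)."""
    return math.exp(-mu) * mu**k / math.factorial(k)

p, q = 0.6, 0.4            # chance of a correct / incorrect answer
alpha, beta = 6.9, 3.45    # mean number of questions: no scholar / scholar

# Closed form: Y is Poisson with the thinned mean p * mu.
p_y3_no_scholar = poisson_pmf(3, alpha * p)
p_y3_scholar = poisson_pmf(3, beta * p)

# Cross-check via the explicit sum over n of Binomial(n, p) times Poisson(alpha).
direct_sum = sum(
    math.comb(n, 3) * p**3 * q**(n - 3) * poisson_pmf(n, alpha)
    for n in range(3, 60)
)

# Bayes' rule with priors P(no scholar) = 0.45 and P(scholar) = 0.55.
numerator = 0.45 * p_y3_no_scholar
posterior_no_scholar = numerator / (numerator + 0.55 * p_y3_scholar)
print(round(p_y3_no_scholar, 4), round(p_y3_scholar, 4))
print(round(posterior_no_scholar, 4))
```

Under this reading the posterior comes out near 0.45, slightly different from the value obtained when 5 questions are assumed.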

StoneTemplePython · Feb 24, 2017

Ray Vickson said:

It makes perfectly good sense to ask for the posterior probability of a visiting scholar, given that a student answered 3 questions correctly and without making any assumption that the test has 5 questions. Admittedly the problem is more challenging than the "assume 5 questions" version, and may possibly be beyond the ability/knowledge of the OP, but I really do not know what the instructor intended, so having two possible versions cannot hurt...

First of all, if 3 questions are answered correctly the test must contain ##N \geq 3## questions.

I think your response is fair and you are definitely right that the wording is open for multiple interpretations. In my book, having an observation is a very big deal and the wording is silent whether or not the student has even received the assignment... all we have to go on is, that the student "is expected to answer 3 questions correctly." It seems to me that clarity is important generally in math and especially so in probability, but this question is not clear.

That said, maybe OP's prof meant to write a good homework question, but was interrupted by a visitor from Switzerland.
