Math: discrete probability distribution

In summary: Variance = npq*((N-n)/(N-1)) = 8(8/18)(10/18)*((18-8)/(18-1)) ≈ 1.1619. This is the same answer I got, which is different from the answer the teacher gave in class. The teacher said the answer for the first problem is different from the second, so I think the first problem is calculated differently.
  • #36
Ray Vickson said:
The number he selects is random, not known beforehand. Of course you do not know how many he selects; that is a random variable ranging from 0 to 18. Indeed, he CAN select 18 if he guesses that all of them are butter cookies; he would be wrong, but he does not know that. It is not very probable that he does select 18; the probability would be the same as that for getting 18 heads when you toss a coin 18 times: unlikely, but definitely possible. Similarly, it is not very likely that he selects 0, but it is possible: that has a nonzero probability.

You do not know the actual number he selects, but you DO KNOW its probability distribution. That is all you need in order to determine the number of successfully-detected milk cookies Y, which is also random and can range from 0 to 8. You need to use some pretty fundamental results and methods in probability in order to finish the computation, and I really am not permitted to say much more.
The reason I am having so much difficulty is that I have never dealt with a problem where n is unknown and varies over a range.

Because in my class we always use calculator functions for Binomial, Poisson, Hypergeometric, etc.

So if n ranges from 0 to 18 and k ranges from 0 to 8,
I am not sure how I can do this on my calculator:
HyperCDF(0 to 18, 8, 10, 0 to 8) would be the logic, I believe, but that doesn't work.

I really appreciate that you are being patient with me. I know it's been about 2 days and you must be tired :/
 
  • #37
Ray Vickson said:
The number he selects is random, not known beforehand. Of course you do not know how many he selects; that is a random variable ranging from 0 to 18. Indeed, he CAN select 18 if he guesses that all of them are butter cookies; he would be wrong, but he does not know that. It is not very probable that he does select 18; the probability would be the same as that for getting 18 heads when you toss a coin 18 times: unlikely, but definitely possible. Similarly, it is not very likely that he selects 0, but it is possible: that has a nonzero probability.

You do not know the actual number he selects, but you DO KNOW its probability distribution. That is all you need in order to determine the number of successfully-detected milk cookies Y, which is also random and can range from 0 to 8. You need to use some pretty fundamental results and methods in probability in order to finish the computation, and I really am not permitted to say much more.
Hey,
(and to identify 8 butter cookies): doesn't this part of the problem mean n = 8? He has to select 8 cookies he thinks are made with butter,
so the only variable that ranges is k = 0 to 8.

?
 
  • #38
With problems like this, the exact words used to specify the problem are important. Changing or leaving out a word can change the entire meaning of the problem. Can you post a photo of the problem specification? It would be even more helpful if you could post images of the specifications for both this one and the earlier one you refer to, in which the person knows there are eight butter cookies.
 
  • #39
I find something different : ##P(Y = k) = (\frac{3}{4})^{18} {n \choose k} 3^{-k} ##.

What is random in your experiment is the correctness of his guess.
The random experiment can be modeled by ## \Omega = ( \{0,1\} \times \{B,M\} ) ^ {18} ##
Meaning that among the 18 biscuits he eats, and not knowing anything about the number of butter biscuits, there are 4 possible answers at each try :
  • (0,M) : man said margarine incorrectly
  • (0,B) : man said butter incorrectly
  • (1,B) : man said butter correctly
  • (1,M) : man said margarine correctly

Therefore ##\{ Y = k \} = \{ M=(m_1,...,m_{18}) \in \Omega,\ \exists I\in {\cal P}(\{1...18\})\ |I| = k,\ \forall i \in I \ m_i = (1,B), \forall j \in \{1...18\}- I,\ m_j \neq (1,B) \} ##

The probability desired is ##|\{ Y =k \} | / |\Omega| ##

EDIT: Sorry, this is incorrect. I haven't removed from ##\Omega## the outcomes containing more than 8 occurrences of ##(1,B)## and ##(0,M)##, or more than 10 occurrences of ##(1,M)## and ##(0,B)##.
 
  • #40
andrewkirk said:
With problems like this, the exact words used to specify the problem are important. Changing or leaving out a word can change the entire meaning of the problem. Can you post a photo of the problem specification? It would be even more helpful if you could post images of the specifications for both this one and the earlier one you refer to, in which the person knows there are eight butter cookies.
It's in a different language.
 
  • #41
I have 2 hours left to get this problem done :/
 
  • #42
masterchiefo said:
I have 2 hours left to get this problem done :/
So we know he has to select 8 cookies that he thinks are butter.
So
n = 8, as he has to select 8 biscuits he thinks are made with butter
N = 18
I then did P(Y=k), k being 0 to 8,
with the P(y) hypergeometric formula.
p(y):=((nCr(n1,y)*nCr(n2,n-y))/(nCr(n1+n2,n)))
p(y):=((nCr(8,y)*nCr(10,8-y))/(nCr(18,8)))
So here I found P(0) to P(8).

Then I computed the variance using the hypergeometric variance formula with these P values.
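
For anyone following along without a calculator, here is a minimal Python sketch of the fixed-n calculation described in this post. It assumes the taster picks exactly n = 8 cookies out of 8 butter and 10 margarine; the function name `hypergeom_pmf` is only illustrative. Whether this is the right model for the problem is exactly what the later replies argue about.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(y, n1=8, n2=10, n=8):
    """P(Y = y) when n cookies are picked without replacement
    from n1 butter and n2 margarine cookies (assumed model)."""
    return Fraction(comb(n1, y) * comb(n2, n - y), comb(n1 + n2, n))

# P(0) ... P(8), kept as exact fractions
pmf = [hypergeom_pmf(y) for y in range(9)]

# Mean and variance computed directly from the probabilities
mean = sum(y * p for y, p in enumerate(pmf))
variance = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))

# Closed form n*p*q*(N-n)/(N-1) with N = 18, n = 8, p = 8/18, q = 10/18
closed_form = Fraction(8) * Fraction(8, 18) * Fraction(10, 18) * Fraction(18 - 8, 18 - 1)

print(float(mean), float(variance), float(closed_form))
```

The variance from the table of probabilities and the closed-form expression agree, so under the n = 8 assumption the two routes are the same calculation.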
 
  • #43
I am about to quit school if I can't do this simple problem. Holy moly, I have never wasted 3 days straight, non-stop, like this.

Someone please follow along with me here: what is wrong with my thinking?
 
  • #44
We can't give you the answer; there is no need to put pressure on us.
You failed this exercise partly because you did not make the effort to build a model. This is THE delicate part of a probability problem, but you came to PF asking: hypergeometric? Binomial?
So you spent 3 days on it without the slightest bit of certainty, because you wanted to do 10 steps at a time.
 
  • #45
masterchiefo said:
So we know he has to select 8 cookies that he thinks are butter.
So
n = 8, as he has to select 8 biscuits he thinks are made with butter
N = 18
I then did P(Y=k), k being 0 to 8,
with the P(y) hypergeometric formula.
p(y):=((nCr(n1,y)*nCr(n2,n-y))/(nCr(n1+n2,n)))
p(y):=((nCr(8,y)*nCr(10,8-y))/(nCr(18,8)))
So here I found P(0) to P(8).

Then I computed the variance using the hypergeometric variance formula with these P values.

No, he does NOT select 8 biscuits he thinks are made with butter. WE know about the (8,10) split in the cookies, but HE DOES NOT: he has no information about the numbers of butter vs. margarine cookies. You said, in your very first message, that "He does not know the exact number of butter cookies . As he sees no difference , he randomly selects those he claims to be butter". I just cut and pasted that directly from message #1, so that really is what you said.

To summarize: (1) He does not know how many butter/marg cookies there are, only that there are 18 cookies altogether.
(2) He selects cookies randomly to be declared as "butter" cookies. (3) Because his choices are random, the number of cookies he says are butter cookies can be any number from 0 to 18---no matter how far from reality that claim may be. Most of the time he will be wrong, but he does not know that---only WE know that.
 
  • #46
geoffrey159 said:
We can't give you the answer; there is no need to put pressure on us.
You failed this exercise partly because you did not make the effort to build a model. This is THE delicate part of a probability problem, but you came to PF asking: hypergeometric? Binomial?
So you spent 3 days on it without the slightest bit of certainty, because you wanted to do 10 steps at a time.
Well, I did 14 binomial, hypergeometric and Poisson problems this weekend and got them all correct quickly. This one is the only one I can't figure out for some reason, and it is frustrating me badly.
 
  • #47
Ray Vickson said:
No, he does NOT select 8 biscuits he thinks are made with butter. WE know about the (8,10) split in the cookies, but HE DOES NOT: he has no information about the numbers of butter vs. margarine cookies. You said, in your very first message, that "He does not know the exact number of butter cookies . As he sees no difference , he randomly selects those he claims to be butter". I just cut and pasted that directly from message #1, so that really is what you said.

To summarize: (1) He does not know how many butter/marg cookies there are, only that there are 18 cookies altogether.
(2) He selects cookies randomly to be declared as "butter" cookies. (3) Because his choices are random, the number of cookies he says are butter cookies can be any number from 0 to 18---no matter how far from reality that claim may be. Most of the time he will be wrong, but he does not know that---only WE know that.
But it says he has to select 8 cookies in total.
(We ask a person to taste 18 biscuits, 8 made with butter (the other 10 are made with margarine), and to identify 8 butter cookies)
Doesn't that mean n = 8?
 
  • #48
masterchiefo said:
But it says he has to select 8 cookies in total.
(We ask a person to taste 18 biscuits, 8 made with butter (the other 10 are made with margarine), and to identify 8 butter cookies)
Doesn't that mean n = 8?
 
  • #49
Ray Vickson said:
No, he does NOT select 8 biscuits he thinks are made with butter. WE know about the (8,10) split in the cookies, but HE DOES NOT: he has no information about the numbers of butter vs. margarine cookies. You said, in your very first message, that "He does not know the exact number of butter cookies . As he sees no difference , he randomly selects those he claims to be butter". I just cut and pasted that directly from message #1, so that really is what you said.

To summarize: (1) He does not know how many butter/marg cookies there are, only that there are 18 cookies altogether.
(2) He selects cookies randomly to be declared as "butter" cookies. (3) Because his choices are random, the number of cookies he says are butter cookies can be any number from 0 to 18---no matter how far from reality that claim may be. Most of the time he will be wrong, but he does not know that---only WE know that.
:OOOO I think I understand now wow
 
  • #50
masterchiefo said:
It's in a different language.
Ah, I thought that might be the case. I suggest you post it anyway. This is a very large community and there will not be many languages for which there won't be somebody here that understands both the language and the subject matter.

In addition, could you post your own translation, where you try to translate it as carefully and literally as possible?

Key issues that are unclear are
  1. For how many cookies does the guesser make a guess? Does he guess for all 18? If not, how does he decide how many cookies to make a guess for, and which ones to choose?
  2. What is the basis for his guess? Saying he does it 'randomly' doesn't tell us enough. We need to know what distribution he uses to make the guess. For instance, does he flip a coin each time and say 'butter' if the coin is heads? If so, then the probability of a correct guess on a particular cookie is 50%. But if he uses a different 'random' method (e.g. rolling a die and saying 'butter' if it shows a 6), the distribution will be different.
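
To make the second issue concrete, here is a small Python sketch. It rests on an assumption the problem statement has not confirmed: the guesser labels all 18 cookies, each one independently, and calls a cookie "butter" with some fixed probability p. Under that assumption the number of correctly identified butter cookies is binomial with parameters 8 and p. The helper name `correct_butter_pmf` is made up for illustration.

```python
from fractions import Fraction
from math import comb

def correct_butter_pmf(p, butter=8):
    """PMF of the number of butter cookies labelled "butter", assuming each
    cookie is independently called "butter" with probability p (assumption)."""
    return [comb(butter, k) * p**k * (1 - p)**(butter - k) for k in range(butter + 1)]

# Two hypothetical guessing rules: a fair coin, and a die where only a 6 means "butter"
for name, p in [("fair coin", Fraction(1, 2)), ("die shows 6", Fraction(1, 6))]:
    pmf = correct_butter_pmf(p)
    mean = sum(k * q for k, q in enumerate(pmf))
    var = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))
    print(name, float(mean), float(var))
```

The two rules give different means and variances, which is why the exact wording of the problem matters.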
Thank you
 
  • #51
Ray Vickson said:
This one is subtle and not entirely straightforward. It uses several results/methods in probability.

Actually, the final result is surprisingly easy (almost unbelievable, in fact), but getting to the solution is not simple.

When I first saw the form of the solution I could not believe it; I had to check several numerical examples before finally hitting on a simple proof and convincing myself that the result really is true.
This is the answer:
Variance = p*n*q
= 8*0.5*0.5
=2

I can't believe it was that simple and it took me 3 days...
I complicated my life for nothing, trying to do some complex probability with multiple formulas.
 
  • #52
@Ray Vickson, would you mind sharing what you have for ##P(Y = k)##?

Because I have tried it, and even though I still have errors in my expression (it sums to 98.02 % on ##[[0,8]]##), I find a complicated expression. What do you have?

EDIT: it sums to 100.000073 %
 
  • #53
geoffrey159 said:
@Ray Vickson, would you mind sharing what you have for ##P(Y = k)##?

Because I have tried it, and even though I still have errors in my expression (it sums to 98.02 % on ##[[0,8]]##), I find a complicated expression. What do you have?

EDIT: it sums to 100.000073 %

There are two ways to do it, and both are instructive in their own ways. First, let X = number of "butter" answers, and Y = number of "butter" answers that are truly "butter" = number of correctly-identified "butter" cookies. X has distribution Binomial(18,1/2), while (Y | X=n) has distribution Hypergeom(8,10,n). Below, denote the binomial coefficient ##{a \choose b}## as ##C(a,b)##.

Method (1) conditioning:
[tex] \begin{array}{rcl} P(Y = k) &=& \sum_{n=0}^{18} P(X = n) P(Y = k | X = n) \\
&=& \displaystyle \sum_{n=0}^{18} C(18,n)2^{-18} \frac{C(8,k) C(10,n-k)}{C(18,n)} \\
&=& C(8,k) 2^{-8} \sum_{n=0}^{18} 2^{-10} C(10,n-k)
\end{array} [/tex]
In the last summation, ##n## actually runs from ##n = k## to ##n = 10+k##, so changing ##n## to ## n = k+j, j=0,1, \ldots, 10##, the last line above becomes
[tex] C(8,k) 2^{-8} \left( 2^{-10} \sum_{j=0}^{10} C(10,j) \right) = C(8,k) 2^{-8} \left( 2^{-10} (1+1)^{10} \right) = C(8,k) 2^{-8}.[/tex]
In other words, ##Y \sim \text{binomial}(8,1/2)##.
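
As a numerical cross-check of Method (1), the following Python sketch evaluates the conditioning sum with exact fractions and compares it with ##C(8,k) 2^{-8}##. The function name `p_y_by_conditioning` is just a label for this sketch; the model (X binomial(18,1/2), Y given X = n hypergeometric) is the one stated above.

```python
from fractions import Fraction
from math import comb

def p_y_by_conditioning(k):
    """Sum over n of P(X = n) * P(Y = k | X = n), with X ~ Binomial(18, 1/2)."""
    total = Fraction(0)
    for n in range(19):
        if n < k or n - k > 10:
            continue  # hypergeometric term is zero outside k <= n <= 10 + k
        p_x = Fraction(comb(18, n), 2**18)
        p_y_given_x = Fraction(comb(8, k) * comb(10, n - k), comb(18, n))
        total += p_x * p_y_given_x
    return total

# Compare with the binomial(8, 1/2) probabilities for every k
for k in range(9):
    assert p_y_by_conditioning(k) == Fraction(comb(8, k), 2**8)

print([str(p_y_by_conditioning(k)) for k in range(9)])
```

Because everything is done in exact rational arithmetic, the equality holds exactly for each k and not just to within rounding.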

Method (2) directly:

Each of the 8 "butter" cookies gets labelled as "butter" or "margarine" independently at random, with probabilities of 1/2 for each. Thus, the number of correctly-labelled butter cookies is binomial with parameters 8 and 1/2.

This is a good illustration of how changing a point of view can alter the analysis. In the first way, we initially look at the "butter" labels and then ask how many of them are correct; in the second way we look at the "butter" cookies and ask how many of them are labelled correctly. The final event is the same in both views, but the methods of analysis are very different.
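
Method (2) also lends itself to a quick simulation. The sketch below is only a sanity check under the stated assumption that every one of the 18 cookies is labelled "butter" or "margarine" by an independent fair coin flip; the function name, trial count and seed are arbitrary choices for illustration.

```python
import random
from statistics import mean, pvariance

def simulate_correct_butter(trials=50_000, seed=0):
    """Simulate the number of correctly labelled butter cookies per tasting."""
    rng = random.Random(seed)
    is_butter = [True] * 8 + [False] * 10  # the true (8, 10) split, unknown to the taster
    results = []
    for _ in range(trials):
        # Assumption: each cookie is called "butter" with probability 1/2, independently
        says_butter = [rng.random() < 0.5 for _ in is_butter]
        # Count butter cookies that were also called "butter"
        results.append(sum(b and s for b, s in zip(is_butter, says_butter)))
    return results

ys = simulate_correct_butter()
print(mean(ys), pvariance(ys))
```

The sample mean and variance should land close to 4 and 2, the values for a binomial with parameters 8 and 1/2; the labels given to the 10 margarine cookies never enter the count.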
 
  • #54
I'm impressed, you have solved this so easily and convincingly by conditioning. I have a lot to learn :-)
Thank you for the demo !
 
  • #55
geoffrey159 said:
@Ray Vickson, would you mind sharing what you have for ##P(Y = k)##?

Because I have tried it, and even though I still have errors in my expression (it sums to 98.02 % on ##[[0,8]]##), I find a complicated expression. What do you have?

EDIT: it sums to 100.000073 %

Often, that type of discrepancy is just a result of using floating-point numbers instead of exact rationals, so you get inevitable roundoff errors. Perhaps that is what you are seeing?
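
A hedged illustration of the point about exact rationals, in Python: the same double sum is built once with `Fraction` and once with `float`. The terms used here are the ones from the conditioning argument earlier in the thread, not the other poster's expression (which was not shown), so this only demonstrates the general effect. The helper name `pmf_terms` is made up.

```python
from fractions import Fraction
from math import comb

def pmf_terms(number):
    """Terms P(X = n) * P(Y = k | X = n) over all valid (n, k),
    built with the given number type (Fraction or float)."""
    terms = []
    for k in range(9):
        for n in range(k, k + 11):  # the hypergeometric term vanishes elsewhere
            p_x = number(comb(18, n)) / number(2**18)
            p_y_given_x = number(comb(8, k) * comb(10, n - k)) / number(comb(18, n))
            terms.append(p_x * p_y_given_x)
    return terms

exact_total = sum(pmf_terms(Fraction))  # exact rational arithmetic
float_total = sum(pmf_terms(float))     # ordinary floating point

print(exact_total)
print(float_total - 1.0)  # any nonzero value here is roundoff
```

With exact fractions the probabilities sum to exactly one; with floats the total can differ from one by a tiny amount. A discrepancy much larger than that usually points to an error in the expression rather than to roundoff.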
 
  • #56
Ray Vickson said:
Often, that type of discrepancy is just a result of using floating-point numbers instead of exact rationals, so you get inevitable roundoff errors. Perhaps that is what you are seeing?
I do not really appreciate that when I proposed using the binomial with n, p, q, I was told this is not how we do this problem.
Also, I said multiple times that n = 8 and you told me this was incorrect and made me feel stupid, because in the end you are now saying it's a binomial with n = 8 and p = 0.5, which was my early thinking about the problem.

This is the true reason why it took me 3 days: all the things you said confused me and made me look for a complex solution involving multiple probability formulas.
 
  • #57
masterchiefo said:
I do not really appreciate that when I proposed using the binomial with n, p, q, I was told this is not how we do this problem.
Also, I said multiple times that n = 8 and you told me this was incorrect and made me feel stupid, because in the end you are now saying it's a binomial with n = 8 and p = 0.5, which was my early thinking about the problem.

This is the true reason why it took me 3 days: all the things you said confused me and made me look for a complex solution involving multiple probability formulas.

I still do not regard your solution as entirely satisfactory, because it ought to include a statement such as "The random variable has distribution Binomial(8,0.5) because ______________ fill in your reasons here ____________________________________". You never gave any reasons, and you did not convince me.
 
  • #58
Ray Vickson said:
I still do not regard your solution as entirely satisfactory, because it ought to include a statement such as "The random variable has distribution Binomial(8,0.5) because ______________ fill in your reasons here ____________________________________". You never gave any reasons, and you did not convince me.
Because the probability is always 0.5, and in the Binomial the probability does not change from trial to trial; it stays the same.
P is constant in the Binomial, while in the Hypergeometric it's not constant.
 
  • #59
Ray Vickson said:
Often, that type of discrepancy is just a result of using floating-point numbers instead of exact rationals, so you get inevitable roundoff errors. Perhaps that is what you are seeing?

The way I have done it is really overcomplicated compared to your solution.

At each try, the man's answer falls into one of
(0,B) for butter incorrect,
(1,B) for butter correct,
(0,M) for margarine incorrect,
(1,M) for margarine correct,

therefore the random experiment is a uniform choice among the ## (m_1,...,m_{18}) ## such that
if ## k_1 ## (resp. ##k_2##, ##k_3##, ##k_4##) denotes the number of (1,B) ( resp. (0,M), (1,M), (0,B) ),
we have ## k_1 + k_2 = 8 ## and ## k_3 + k_4 = 10 ##.
We call ##\Omega ## this set.

Then calling ## I = \{ (k_1,k_2,k_3,k_4):\ k_1 + k_2 = 8, \ k_3+k_4 = 10 \} ##, the cardinality of ##\Omega## is: ##|\Omega| = \sum_I { 8 \choose k_1 }{ 8-k_1 \choose k_2}{ \max ( 18 - k_1 -k_2 , 10 ) \choose k_3 }{ 10 - k_3 \choose k_4} = \sum_I { 8 \choose k_1 }{ 10 \choose k_3 } = \sum_{0 \le k_1 \le 8} { 8 \choose k_1} \sum_{0 \le k_3 \le 10} { 10 \choose k_3} = 2^8 \cdot 2^{10} = 2^{18} ##

For event ## \{ Y = k_1 \} ##, writing ##J_{k_1} = \{ (k_2,k_3,k_4): \ k_2 = 8 -k_1, \ k_3+k_4 = 10 \} ## we have

## |\{ Y = k_1 \}| = { 8 \choose k_1 } \sum_{J_{k_1}} { 8-k_1 \choose k_2}{ \max ( 18 - k_1 -k_2 , 10 ) \choose k_3 }{ 10 - k_3 \choose k_4} = 2^{10} \cdot { 8 \choose k_1 } ##

Giving ##P(Y= k_1 ) = \frac{2^{10} \cdot { 8 \choose k_1 }}{2^{18}} = 2^{-8} { 8 \choose k_1 } ##

and ##Y## is ##B(8,1/2)##.
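
The two counts above are small enough to check by brute force. This Python sketch enumerates all ##2^{18}## ways of labelling the cookies, under the assumption (for bookkeeping only) that cookies 0 to 7 are the butter ones, and tallies how many labellings give each value of Y.

```python
from collections import Counter
from math import comb

BUTTER, TOTAL = 8, 18
butter_mask = (1 << BUTTER) - 1  # assumption: cookies 0..7 are the butter ones

counts = Counter()
for labelling in range(1 << TOTAL):  # bit i set = cookie i called "butter"
    # Y = number of butter cookies that were called "butter"
    counts[bin(labelling & butter_mask).count("1")] += 1

# |Omega| = 2^18 and |{Y = k}| = 2^10 * C(8, k)
assert sum(counts.values()) == 2**18
for k in range(BUTTER + 1):
    assert counts[k] == 2**10 * comb(8, k)

print({k: counts[k] for k in range(BUTTER + 1)})
```

Dividing each count by ##2^{18}## reproduces ##2^{-8} {8 \choose k}##, in agreement with the formula above.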
 
  • #60
geoffrey159 said:
The way I have done it is really over complicated compared to your solution.

Man has to choose at each try between
(0,B) for butter incorrect,
(1,B) for butter correct,
(0,M) for margarine incorrect,
(1,M) for margarine correct,

therefore the random experiment is a uniform choice among the ## (m_1,...,m_{18}) ## such that
if ## k_1 ## (resp. ##k_2##, ##k_3##, ##k_4##) denotes the number of (1,B) ( resp. (0,M), (1,M), (0,B) ),
we have ## k_1 + k_2 = 8 ## and ## k_3 + k_4 = 10 ##.
We call ##\Omega ## this set.

Then calling ## I = \{ k_1,k_2,k_3,k_4:\ k_1 + k_2 = 8, \ k_3+k_4 = 10 \} ##, the cardinal of ##\Omega## is :##|\Omega| = \sum_I { 8 \choose k_1 }{ 8-k_1 \choose k_2}{ \max ( 18 - k_1 -k_2 , 10 ) \choose k_3 }{ 10 - k_3 \choose k_4} = \sum_I { 8 \choose k_1 }{ 10 \choose k_3 } = \sum_{0 \le k_1 \le 8} { 8 \choose k_1} \sum_{0 \le k_3 \le 10} { 10 \choose k_3} = 2^8.2^{10} = 2^{18} ##

For event ## \{ Y = k_1 \} ##, writing ##J_{k_1} = \{ k_2,k_3,k_4: \ k_2 = 8 -k_1, \ k_3+k_4 = 10 \} ## we have

## |Y = k_1 | = { 8 \choose k_1 } \sum_{J_{k_1}} { 8-k_1 \choose k_2}{ \max ( 18 - k_1 -k_2 , 10 ) \choose k_3 }{ 10 - k_3 \choose k_4} = 2^{10} . { 8 \choose k_1 } ##

Giving ##P(Y= k_1 ) = \frac{2^{10} . { 8 \choose k_1 }}{2^{18}} = 2^{-8} { 8 \choose k_1 } ##

and ##Y ## is a ##B(8,1/2)##

I like this solution, too, although I must admit I have not verified all the details.
 
