Math: discrete probability distribution

  • #51
Ray Vickson said:
This one is subtle and not entirely straightforward. It uses several results/methods in probability.

Actually, the final result is surprisingly easy (almost unbelievable, in fact), but getting to the solution is not simple.

When I first saw the form of the solution I could not believe it; I had to check several numerical examples before finally hitting on a simple proof and convincing myself that the result really is true.
This is the answer:
Variance = n*p*q
= 8*0.5*0.5
= 2

I can't believe it was that simple, and it took me 3 days...
I complicated my life for nothing by trying to solve a complex probability problem with multiple formulas.
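
As a sanity check on the arithmetic above, here is a minimal Python sketch that computes the variance directly from the probabilities instead of from the n*p*q shortcut. It assumes the distribution really is Binomial(8, 1/2), which the thread goes on to justify; the function name `binomial_variance` is only illustrative.

```python
from fractions import Fraction
from math import comb

def binomial_variance(n, p):
    """Variance of Binomial(n, p), computed from the pmf with exact rationals."""
    q = 1 - p
    pmf = [comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]
    mean = sum(k * pk for k, pk in enumerate(pmf))
    return sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

variance = binomial_variance(8, Fraction(1, 2))
print(variance)  # 2, matching n*p*q = 8 * 0.5 * 0.5
```

Using `Fraction` keeps every step exact, so the result can be compared with 2 without any tolerance.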
 
  • #52
@Ray Vickson, would you mind sharing what you have for ##P(Y = k)##?

Because I have tried it, and even though I still have errors in my expression (it sums to 98.02 % on ##[[0,8]]##), I find a complicated expression. What do you have?

EDIT: it sums to 100.000073 %
 
  • #53
geoffrey159 said:
@Ray Vickson, would you mind sharing what you have for ##P(Y = k)##?

Because I have tried it, and even though I still have errors in my expression (it sums to 98.02 % on ##[[0,8]]##), I find a complicated expression. What do you have?

EDIT: it sums to 100.000073 %

There are two ways to do it, and both are instructive in their own ways. First, let X = number of "butter" answers and Y = number of "butter" answers that are truly "butter" = number of correctly identified "butter" cookies. X has distribution Binomial(18,1/2), while (Y | X=n) has distribution Hypergeom(8,10,n). Below, denote the binomial coefficient ##{a \choose b}## as ##C(a,b)##.

Method (1) conditioning:
$$\begin{array}{rcl} P(Y = k) &=& \displaystyle \sum_{n=0}^{18} P(X = n) P(Y = k | X = n) \\
&=& \displaystyle \sum_{n=0}^{18} C(18,n)2^{-18} \frac{C(8,k) C(10,n-k)}{C(18,n)} \\
&=& \displaystyle C(8,k) 2^{-8} \sum_{n=0}^{18} 2^{-10} C(10,n-k)
\end{array}$$
In the last summation, ##n## actually runs from ##n = k## to ##n = 10+k##, so changing ##n## to ## n = k+j, j=0,1, \ldots, 10##, the last line above becomes
$$C(8,k) 2^{-8} \left( 2^{-10} \sum_{j=0}^{10} C(10,j) \right) = C(8,k) 2^{-8} \left( 2^{-10} (1+1)^{10} \right) = C(8,k) 2^{-8}.$$
In other words, ##Y \sim \text{binomial}(8,1/2)##.
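
The conditioning sum can also be checked numerically. The following is a minimal sketch in Python, using exact rationals, that evaluates the sum over n term by term and compares it with the Binomial(8, 1/2) probabilities. The function name `p_y_by_conditioning` is made up for this illustration.

```python
from fractions import Fraction
from math import comb

def p_y_by_conditioning(k):
    """P(Y = k) = sum over n of P(X = n) * P(Y = k | X = n), in exact arithmetic."""
    total = Fraction(0)
    for n in range(19):  # X ~ Binomial(18, 1/2), so n runs over 0..18
        j = n - k  # number of "butter" answers given to margarine cookies
        if j < 0 or j > 10:
            continue  # the hypergeometric probability is zero outside this range
        p_x = Fraction(comb(18, n), 2**18)
        p_y_given_x = Fraction(comb(8, k) * comb(10, j), comb(18, n))
        total += p_x * p_y_given_x
    return total

conditioned = [p_y_by_conditioning(k) for k in range(9)]
binomial = [Fraction(comb(8, k), 2**8) for k in range(9)]
print(conditioned == binomial)  # True
```

The `continue` branch mirrors the remark that ##n## effectively runs only from ##k## to ##10+k##.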

Method (2) directly:

Each of the 8 "butter" cookies gets labelled as "butter" or "margarine" independently at random, with probabilities of 1/2 for each. Thus, the number of correctly-labelled butter cookies is binomial with parameters 8 and 1/2.
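
A small simulation can illustrate the direct argument. This is only a sketch under stated assumptions: every one of the 18 cookies is labelled by an independent fair coin flip, and positions 0 to 7 are arbitrarily taken to be the butter cookies. The function name `simulate_correct_butter`, the seed, and the trial count are choices made for the example.

```python
import random

def simulate_correct_butter(trials, seed=0):
    """Simulate the number of butter cookies that get the label "butter"."""
    rng = random.Random(seed)
    counts = []
    for _ in range(trials):
        # Label all 18 cookies independently; True means the label says "butter".
        labels = [rng.random() < 0.5 for _ in range(18)]
        # Assumption for illustration: cookies 0..7 are the butter cookies.
        counts.append(sum(labels[:8]))
    return counts

counts = simulate_correct_butter(50_000)
mean = sum(counts) / len(counts)
variance = sum((c - mean) ** 2 for c in counts) / len(counts)
print(round(mean, 2), round(variance, 2))  # should land near 4 and 2
```

The labels on the 10 margarine cookies are generated but never used, which is the point of the direct method: they have no effect on the count of correctly labelled butter cookies.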

This is a good illustration of how changing a point of view can alter the analysis. In the first way, we initially look at the "butter" labels and then ask how many of them are correct; in the second way we look at the "butter" cookies and ask how many of them are labelled correctly. The final event is the same in both views, but the methods of analysis are very different.
 
  • #54
I'm impressed, you have solved this so easily and convincingly by conditioning. I have a lot to learn :-)
Thank you for the demo !
 
  • #55
geoffrey159 said:
@Ray Vickson, would you mind sharing what you have for ##P(Y = k)##?

Because I have tried it, and even though I still have errors in my expression (it sums to 98.02 % on ##[[0,8]]##), I find a complicated expression. What do you have?

EDIT: it sums to 100.000073 %

Often, that type of discrepancy is just a result of using floating-point numbers instead of exact rationals, so you get inevitable roundoff errors. Perhaps that is what you are seeing?
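
To see the difference between the two kinds of arithmetic, here is a hedged Python sketch that totals the same conditioning terms once with `Fraction` and once with `float`. The helper `term` and its `number` parameter are invented for this example; nothing here reproduces the actual expression that gave 100.000073 %, so it cannot settle whether roundoff was the cause in that case.

```python
from fractions import Fraction
from math import comb

def term(k, n, number):
    """One term P(X = n) * P(Y = k | X = n); `number` selects the arithmetic."""
    j = n - k
    if j < 0 or j > 10:
        return number(0)
    p_x = number(comb(18, n)) / number(2**18)
    p_y_given_x = number(comb(8, k) * comb(10, j)) / number(comb(18, n))
    return p_x * p_y_given_x

exact_total = sum(term(k, n, Fraction) for k in range(9) for n in range(19))
float_total = sum(term(k, n, float) for k in range(9) for n in range(19))
print(exact_total)        # 1
print(float_total - 1.0)  # zero or a tiny roundoff residue
```

With exact rationals the probabilities sum to exactly 1, so any visible gap in a floating-point total comes from rounding and not from the formula.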
 
  • #56
Ray Vickson said:
Often, that type of discrepancy is just a result of using floating-point numbers instead of exact rationals, so you get inevitable roundoff errors. Perhaps that is what you are seeing?
I do not really appreciate that I proposed using the binomial with n, p, q and was told that this is not how the problem is done.
I also said multiple times that n=8, and you told me this was incorrect and made me feel stupid, because in the end you are now saying it's a binomial with n=8 and p=0.5, which was my early thinking on the problem.

This is the true reason it took me 3 days: everything you said confused me and made me look for a complex solution involving multiple probability formulas.
 
  • #57
masterchiefo said:
I do not really appreciate that I proposed using the binomial with n, p, q and was told that this is not how the problem is done.
I also said multiple times that n=8, and you told me this was incorrect and made me feel stupid, because in the end you are now saying it's a binomial with n=8 and p=0.5, which was my early thinking on the problem.

This is the true reason it took me 3 days: everything you said confused me and made me look for a complex solution involving multiple probability formulas.

I still do not regard your solution as entirely satisfactory, because it ought to include a statement such as "The random variable has distribution Binomial(8,0.5) because ______________ fill in your reasons here ____________________________________". You never gave any reasons, and you did not convince me.
 
  • #58
Ray Vickson said:
I still do not regard your solution as entirely satisfactory, because it ought to include a statement such as "The random variable has distribution Binomial(8,0.5) because ______________ fill in your reasons here ____________________________________". You never gave any reasons, and you did not convince me.
Because the probability is always 0.5, and in the binomial the probability does not change from trial to trial; it stays the same.
p is constant in the binomial, while in the hypergeometric it is not constant.
 
  • #59
Ray Vickson said:
Often, that type of discrepancy is just a result of using floating-point numbers instead of exact rationals, so you get inevitable roundoff errors. Perhaps that is what you are seeing?

The way I have done it is really overcomplicated compared to your solution.

At each try, the man has to choose between
(0,B) for butter incorrect,
(1,B) for butter correct,
(0,M) for margarine incorrect,
(1,M) for margarine correct,

therefore the random experiment is a uniform choice among the ## (m_1,...,m_{18}) ## such that
if ## k_1 ## (resp. ##k_2##, ##k_3##, ##k_4##) denotes the number of (1,B) ( resp. (0,M), (1,M), (0,B) ),
we have ## k_1 + k_2 = 8 ## and ## k_3 + k_4 = 10 ##.
We call ##\Omega ## this set.

Then calling ## I = \{ (k_1,k_2,k_3,k_4):\ k_1 + k_2 = 8, \ k_3+k_4 = 10 \} ##, the cardinality of ##\Omega## is: ##|\Omega| = \sum_I { 8 \choose k_1 }{ 8-k_1 \choose k_2}{ \max ( 18 - k_1 -k_2 , 10 ) \choose k_3 }{ 10 - k_3 \choose k_4} = \sum_I { 8 \choose k_1 }{ 10 \choose k_3 } = \sum_{0 \le k_1 \le 8} { 8 \choose k_1} \sum_{0 \le k_3 \le 10} { 10 \choose k_3} = 2^8 \cdot 2^{10} = 2^{18} ##

For event ## \{ Y = k_1 \} ##, writing ##J_{k_1} = \{ k_2,k_3,k_4: \ k_2 = 8 -k_1, \ k_3+k_4 = 10 \} ## we have

## |\{ Y = k_1 \}| = { 8 \choose k_1 } \sum_{J_{k_1}} { 8-k_1 \choose k_2}{ \max ( 18 - k_1 -k_2 , 10 ) \choose k_3 }{ 10 - k_3 \choose k_4} = 2^{10} \cdot { 8 \choose k_1 } ##

Giving ##P(Y= k_1 ) = \frac{2^{10} \cdot { 8 \choose k_1 }}{2^{18}} = 2^{-8} { 8 \choose k_1 } ##

and so ##Y \sim B(8,1/2)##.
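
The counting argument can be checked by brute force, since there are only ##2^{18}## label vectors. The sketch below is a hypothetical Python illustration: each integer from 0 to ##2^{18}-1## encodes one way of labelling the 18 cookies, and bits 0 to 7 are assumed to stand for the 8 butter cookies. The names `BUTTER_MASK` and `count_outcomes` are made up for the example.

```python
from math import comb

BUTTER_MASK = (1 << 8) - 1  # assumption: bits 0..7 stand for the 8 butter cookies

def count_outcomes():
    """Tally Y over every one of the 2**18 possible label vectors."""
    tally = [0] * 9
    for labels in range(1 << 18):  # bit i set means cookie i was labelled "butter"
        y = bin(labels & BUTTER_MASK).count("1")  # correctly labelled butter cookies
        tally[y] += 1
    return tally

tally = count_outcomes()
print(sum(tally) == 2**18)  # True
print(tally == [2**10 * comb(8, k) for k in range(9)])  # True
```

Dividing each tally by ##2^{18}## gives ##2^{-8} { 8 \choose k }##, the same probabilities as above.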
 
  • #60
geoffrey159 said:
The way I have done it is really over complicated compared to your solution.


I like this solution, too, although I must admit I have not verified all the details.
 