Probability: Flawed Assumptions in Picking M&M's

  • Thread starter Rage Spectate
  • Start date
  • Tags
    Probability
In summary: The conversation is about a quiz question on which the teacher marked the student wrong for stating that the probability of picking a red M&M increases after drawing 10 non-red M&M's in a row. The student argued that this is true regardless of the percentage of red M&M's in the bag, while the teacher argued that the population size and the method of distribution can affect the probability. Two cases are contrasted: reds distributed to bags at random (case 1) and an exact count of reds placed in each bag (case 2). The second case is unrealistic, but the first is not, and the probabilities change dramatically between the two. Ultimately, the teacher is correct in this scenario.
  • #1
Rage Spectate
I am a student in an Intro. Stat. class at a local community college.

I recently got a quiz back in which the teacher marked me wrong and upon questioning with more explicit detail he still told me I was wrong.

The question:
1) The Masterfoods company manufactures bags of Peanut Butter M&M's. They report that they make 10% each brown and red candies, and 20% each yellow, blue, and orange candies. The rest of the candies are green.

--the last question on the quiz--

c. After picking 10 M&M's in a row, you still have not picked a red one. A friend says that you should have a better chance of getting a red candy on your next pick since you have yet to see one. Comment on your friend's statement.

I answered that the friend is correct: in ideal circumstances the probability has increased (ideal meaning, for example, that you didn't receive a bag with 0 red M&M's). This is covered by the friend's choice of the word "should" anyhow.

The teacher said the correct answer is: Your friend is speaking nonsense.


My rationale:

P(picking a red) = N(number of red M&M's) / D(total number of M&M's)

or simplified to P = N/D.

After you have drawn out 10 M&M's previously the equation can be understood as P = N / (D-10)

It will not matter that you don't know what N or D is because you know that N has not changed but that the denominator has, no matter how many M&M's are in the bag the probability will increase. Substitute any non-zero number for the variables and the probability will increase.

If that is true then the friend is correct in assuming that you have a better chance of picking a red M&M.
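The N/D argument above can be checked numerically. The following is a minimal Python sketch, assuming the bag's red count N and total D are fixed and known; the function name and the sample values 6 and 60 are illustrative only and do not come from the quiz.

```python
from fractions import Fraction

def p_red_known_bag(n_red: int, total: int, non_reds_drawn: int) -> Fraction:
    """Chance the next candy is red when the bag's makeup is known exactly.

    Assumption: n_red and total are known, and every candy drawn so far
    was non-red, so only the denominator shrinks.
    """
    remaining = total - non_reds_drawn
    if remaining <= 0 or n_red > remaining:
        raise ValueError("more non-reds drawn than the bag allows")
    return Fraction(n_red, remaining)

# Illustrative numbers (hypothetical): 6 reds in a bag of 60.
before = p_red_known_bag(6, 60, 0)    # N / D
after = p_red_known_bag(6, 60, 10)    # N / (D - 10)
print(before, after, after > before)
```

Under that assumption the ratio rises for any valid N and D. Whether a real bag satisfies the assumption is a separate question.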

This seems blatantly obvious to me. When I pressed my teacher he said "But you don't know the denominator" and I told him "that doesn't matter" and his response was "You are thinking too hard."

So, who is right? Me or the teacher?
 
  • #2
Your teacher.

Your interpretation of the drawing of 10 non-red candies in a row as meaning the probability of getting a red the next time is more than 10% implicitly assumes that those numbers from M&M represent the percentages in each bag. Those numbers from M&M represent overall percentages, the makeup of the millions of candies they sell each year. They are not a guarantee regarding the makeup of each individual bag.

A better interpretation of this observed fact is that your particular bag has less than the standard 10% red candies. In short, your friend is speaking nonsense.
 
  • #3
D H said:
Your teacher.

Your interpretation of the drawing of 10 non-red candies in a row as meaning the probability of getting a red the next time is more than 10% implicitly assumes that those numbers from M&M represent the percentages in each bag. Those numbers from M&M represent overall percentages, the makeup of the millions of candies they sell each year. They are not a guarantee regarding the makeup of each individual bag.

A better interpretation of this observed fact is that your particular bag has less than the standard 10% red candies. In short, your friend is speaking nonsense.

I'm only assuming that the percentages are reasonably accurate. In the end, the percentages are irrelevant. It wouldn't matter if my bag was largely off the production mean. You can substitute any non-zero number for N, and as long as D is greater than 10 the probability will ALWAYS increase.

Show me a case in which the probability will not increase.
 
  • #4
Something you've omitted could be relevant. Do they paint 10% of them at the factory? Do they randomly decide for each one?

Assuming the former, then you are correct, the probability increases. However, the population is so large that the increase is so incredibly tiny that one would never say it that way in common speech.

I think what your teacher means to say is correct, but he's not saying it well.
 
  • #5
Suppose you have a bag containing 60 samples. Before you draw the first candy your going-in position is that the first candy has a 10% chance of being red; you expect that 6 of the 60 candies in the bag are red. You do not know that as a fact, however. Because of vagaries of the manufacturing process, there is a non-zero chance that the bag will contain zero red candies, a non-zero chance it will contain only one, and so on up to all reds.

Note that you would need to have six or more reds in the bag for the probability that the next M&M drawn is red to be greater than 10% after drawing 10 non-reds in a row from the bag. You don't know M&M's assembly process, but right from the onset the probability that the unopened bag contains five or fewer reds is pretty good. After drawing 10 non-reds in a row, the probability that it did indeed contain five or fewer reds is not just pretty good: It is what you should be betting on.
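The "six or more reds" threshold follows from simple arithmetic on the 50 candies left in the bag. A small Python sketch, using the 60-candy bag and 10% nominal figure from this post (the function and constant names are illustrative):

```python
from fractions import Fraction

BAG_SIZE = 60              # candies in the bag, as supposed in the post
NON_REDS_DRAWN = 10        # ten non-reds drawn in a row
NOMINAL = Fraction(1, 10)  # manufacturer's overall share of reds

def min_reds_to_beat_nominal(bag_size, non_reds_drawn, nominal):
    """Smallest red count for which P(next is red) exceeds the nominal rate."""
    remaining = bag_size - non_reds_drawn
    for reds in range(remaining + 1):
        if Fraction(reds, remaining) > nominal:
            return reds
    return None  # the nominal rate can never be exceeded

threshold = min_reds_to_beat_nominal(BAG_SIZE, NON_REDS_DRAWN, NOMINAL)
print(threshold)
```

With five reds the chance is exactly 5/50 = 10%, so the bag needs at least six reds to do better than the nominal rate.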
 
  • #6
Hurkyl said:
Something you've omitted could be relevant. Do they paint 10% of them at the factory? Do they randomly decide for each one?

Assuming the former, then you are correct, the probability increases. However, the population is so large that the increase is so incredibly tiny that one would never say it that way in common speech.

I think what your teacher means to say is correct, but he's not saying it well.

We are talking about a single bag of M&M's

I'm not sure how the factory really matters. We are talking about a finite number of M&M's in a bag.
 
  • #7
Rage Spectate said:
We are talking about a single bag of M&M's

I'm not sure how the factory really matters. We are talking about a finite number of M&M's in a bag.
Two very different cases:
Case 1: 10% are painted red in the factory, and then all are randomly distributed to bags,
Case 2: They carefully count and place 6 red M&Ms in each bag of 60.
 
  • #8
D H said:
Suppose you have a bag containing 60 samples. Before you draw the first candy your going-in position is that the first candy has a 10% chance of being red; you expect that 6 of the 60 candies in the bag are red. You do not know that as a fact, however. Because of vagaries of the manufacturing process, there is a non-zero chance that the bag will contain zero red candies, a non-zero chance it will contain only one, and so on up to all reds.

Note that you would need to have six or more reds in the bag for the probability that the next M&M drawn is red to be greater than 10% after drawing 10 non-reds in a row from the bag. You don't know M&M's assembly process, but right from the onset the probability that the unopened bag contains five or fewer reds is pretty good. After drawing 10 non-reds in a row, the probability that it did indeed contain five or fewer reds is not just pretty good: It is what you should be betting on.

Ah I see what you are taking issue with.

OK, you are right that the bag could come with a 7% count of red. But that is understood. The question is not "will the probability be greater than 10%" but "will the probability be greater than it was before you drew any out of the bag, whatever that % was."

So in that case, if the bag truly had 7%, the friend would in effect be saying "you have a greater than 7% chance of drawing a red M&M."
 
  • #9
Hurkyl said:
Two very different cases:
Case 1: 10% are painted red in the factory, and then all are randomly distributed to bags,
Case 2: They carefully count and place 6 red M&Ms in each bag of 60.

It wouldn't matter if either were the case.

In the end, the %'s given at the start are irrelevant. P(picking red) = N (number of reds) / D (number of total M&M's). The friend isn't assuming that factory percentages are actually the case within the bag, but rather that the probability has changed between the initial bag probability and the current probability.
 
  • #10
Rage Spectate said:
Ah I see what you are taking issue with.

OK, you are right that the bag could come with a 7% count of red. But that is understood. The question is not "will the probability be greater than 10%" but "will the probability be greater than it was before you drew any out of the bag, whatever that % was."
No. You are thinking too much. Prior to opening the bag, all you can say about the first candy drawn from the bag are the odds based on the manufacturer's gross statistics. You have no choice but to say that there is a 1/10 chance that the first candy drawn is red. You can't say that the probability is 7% because 7% of the candies in the bag happen to be red. You do not know that. You have to base your initial guess on what you do know.
 
  • #11
D H said:
No. You are thinking too much. Prior to opening the bag, all you can say about the first candy drawn from the bag are the odds based on the manufacturer's gross statistics. You have no choice but to say that there is a 1/10 chance that the first candy drawn is red. You can't say that the probability is 7% because 7% of the candies in the bag happen to be red. You do not know that. You have to base your initial guess on what you do know.

Ok, but you would be foolish to assert that the probability of picking a red is certainly 10% prior to opening the bag.

Now if I have watched in desperation as you have tried and failed to pick a red M&M from the bag and I say "Don't worry, you have increased your chances of picking a red now! For each non-red you choose you increase your chances of picking a red on the next draw."
 
  • #12
And that reasoning is wrong.

You appear to have a "la la la I can't hear you" attitude. That is a bad attitude to have in school and in life. A much better attitude is "where did I go wrong?" An attitude of "my professor is wrong" is not going to help you learn.
 
  • #13
D H said:
And that reasoning is wrong.

You appear to have a "la la la I can't hear you" attitude. That is a bad attitude to have in school and in life. A much better attitude is "where did I go wrong?" An attitude of "my professor is wrong" is not going to help you learn.

I never said my teacher is wrong. I asked if he was.

I don't think this has come down to an issue of reasoning but rather interpretation of the wording of the problem.

Secondly, I fully understand where you are coming from. I just don't agree with your interpretation of what the student is saying.
 
  • #14
Rage Spectate said:
I don't think this has come down to an issue of reasoning but rather interpretation of the wording of the problem.

Not really, it seems pretty straightforward to me that the teacher is right.
 
  • #15
CRGreathouse said:
Not really, it seems pretty straightforward to me that the teacher is right.

If it is the case that the friend is stating "You have a better chance than you did from the initial bag opening" does your assertion still stand?
 
  • #16
Rage Spectate said:
If it is the case that the friend is stating "You have a better chance than you did from the initial bag opening" does your assertion still stand?

If the friend's statement was about prior probability, yes. If about posterior probability, no.
 
  • #17
CRGreathouse said:
If the friend's statement was about prior probability, yes. If about posterior probability, no.

Ok, fair enough, I agree with all of that. This is why I said it comes down to interpreting what the friend actually meant within the problem. I understood it as prior probability and that is why I took issue, obviously my interpretation could be wrong, though arguably the question was ambiguous as I have posed it to several people and gotten issues with the same thing.
 
  • #18
Rage Spectate said:
Ok, fair enough, I agree with all of that. This is why I said it comes down to interpreting what the friend actually meant within the problem. I understood it as prior probability and that is why I took issue, obviously my interpretation could be wrong, though arguably the question was ambiguous as I have posed it to several people and gotten issues with the same thing.

Yes, assuming you also misunderstood the teacher as attributing to the friend, "You have a better chance than you did from the initial bag opening".
 
  • #19
CRGreathouse said:
Yes, assuming you also misunderstood the teacher as attributing to the friend, "You have a better chance than you did from the initial bag opening".

I only had the quiz directions to go on during the quiz.

Also, the teacher's answer, "The student was speaking nonsense," is not necessarily true even given his interpretation, or so it would seem.

The student could in fact be correct if the true ratio of red to non-red was equal to or higher than 10%.

So then the best answer, given that interpretation, is that the student's comment was unjustified though not necessarily wrong.
 
  • #20
Rage Spectate said:
I never said my teacher is wrong. I asked if he was.

I don't think this has come down to an issue of reasoning but rather interpretation of the wording of the problem.

Secondly, I fully understand where you are coming from. I just don't agree with your interpretation of what the student is saying.

Also, it's good to question the assumptions - otherwise how are we supposed to make advances in science?

As many of you pointed out, the teacher left some ambiguity in their problem statement, so in a sense both answers are correct. One way around it though is via probabilistic reasoning (James Franklin's book "What is Science?" has a nice discussion of this idea) - for example, one could argue that having 10 machines each producing one colour at a constant rate into a single mixing machine, is more efficient than 10 machines each choosing a random colour, so it seems more likely that the company's claimed ratios are exact; then, whether we're sampling from the entire silo or from a random packet, it is sampling without replacement from a finite population so the probability of success does increase slightly with each failure (though as Hurkyl pointed out, not enough to notice the difference). On the other hand if we're sampling direct from a random M&M generator (which seems less likely) then this is effectively sampling without replacement from an infinite population, so the probability of success does not increase with each failure.

So if you agree with this argument it seems more likely (more plausible) that the student is correct.
 
  • #21
bpet said:
it is sampling without replacement from a finite population so the probability of success does increase slightly with each failure (though as Hurkyl pointed out, not enough to notice the difference).
No, it does not! The reason is because you do not know how many red candies were in the bag in the first place.

I'll give a concrete example. Suppose that, after lots of research, your class has found that bags of some selected size always contain 15 candies, and that
  • Some bags have no red candies (p=0.24),
  • some have one (p=0.33),
  • some have two (p=0.23),
  • some have three (p=0.11),
  • some have four (p=0.07), and
  • some have five (p=0.02).
Note that this averages to 1.5, or 10%, red candies per bag.

Now suppose you pick an unopened bag of this size. Prior to opening the bag all you know is that this bag contains 15 candies. You do not know if it contains no red candies or five. All you know is that, on average, a bag of this size contains 1.5 red candies. If asked what the probability that the first candy drawn will be red your best bet is to say 10%.

You don't get a red candy on the first draw. You keep drawing candies, never drawing a red in the first ten draws. Assuming that the previously examined bags are exemplary of this particular bag of candies, this drawing of ten non-reds in a row gives you a good guess regarding the red/non-red makeup of the remaining five candies in the bag. The probability that all five are red has shrunk to a tiny number; you almost certainly would have drawn one or more red candies somewhere in those first ten draws if the bag did contain five red candies. The same goes for the bag containing four, three, two, or even one red candy. On the other hand, the probability that you have one of those lousy bags with no red candies has risen from the nominal 0.24 to 0.64. The probability that the next candy will be red is 0.087: smaller than the default 0.10.
 
  • #22
D H said:
No, it does not! The reason is because you do not know how many red candies were in the bag in the first place.

Now suppose you pick an unopened bag of this size. Prior to opening the bag all you know is that this bag contains 15 candies. You do not know if it contains no red candies or five. All you know is that, on average, a bag of this size contains 1.5 red candies. If asked what the probability that the first candy drawn will be red your best bet is to say 10%.

You don't get a red candy on the first draw. You keep drawing candies, never drawing a red in the first ten draws. Assuming that the previously examined bags are exemplary of this particular bag of candies, this drawing of ten non-reds in a row gives you a good guess regarding the red/non-red makeup of the remaining five candies in the bag. The probability that all five are red has shrunk to a tiny number; you almost certainly would have drawn one or more red candies somewhere in those first ten draws if the bag did contain five red candies. The same goes for the bag containing four, three, two, or even one red candy. On the other hand, the probability that you have one of those lousy bags with no red candies has risen from the nominal 0.24 to 0.64. The probability that the next candy will be red is 0.087: smaller than the default 0.10.

Sorry could you explain how 0.087 was calculated? Calculating the probability a different way, if there are N candies in the world of which exactly 10% are red, after we draw 10 non-reds the probability of drawing a red next would be (0.1*N)/(N-10) which is larger than 1/10.
 
  • #23
You are not drawing, without replacement, from a sample space of N candies of which N/10 are red. You are drawing, without replacement, from a sample space of 15 candies of which an unknown number are red. Well, not quite unknown. We do know that the number of red candies in the bag was not six, for example. There would have been no way to draw ten non-reds in a row from a bag containing nine non-reds and six reds.

What I did was to apply Bayes' Theorem, one form of which is

[tex]P(H_i|E) = \frac {P(H_i)P(E|H_i)}{\sum_j P(H_j)P(E|H_j)}[/tex]

Here,
  • E is the evidence that has been gathered (or an event that is known to have occurred). In the problem at hand, E is the fact that the first ten candies drawn from a bag of fifteen M&Ms were not red candies.
  • The Hi are a set of mutually exclusive hypotheses that collectively encompass all of the possible explanations. In the problem at hand, we can hypothesize about how many of the fifteen M&Ms originally in the bag were red:
    • Hypothesis H0: The bag contained zero red candies.
    • Hypothesis H1: The bag contained one red candy.
    • Hypothesis H2: The bag contained two red candies.
    • Hypothesis H3: The bag contained three red candies.
    • Hypothesis H4: The bag contained four red candies.
    • Hypothesis H5: The bag contained five red candies.
  • Each P(E|Hi) is the conditional probability of the evidence given hypothesis Hi. In the problem at hand, these conditional probabilities can be calculated via the hypergeometric distribution:
    [tex]P(E|H_i) = \frac{(15-i)!5!}{(5-i)!15!}[/tex]
  • The P(Hi) are the prior probabilities of the hypotheses Hi. In general, these prior probabilities can range anywhere from educated guesses to values based on some deep understanding of the problem at hand. In this case, we have some meticulously collected empirical data regarding the number of red M&Ms in bags that contain fifteen M&Ms.
  • Finally, the value on the left-hand side, P(Hi|E), is the posterior probability of hypothesis Hi given the evidence collected.

Bayes' Theorem is a very powerful statistical tool. In this case it lets us get an updated estimate of the original contents of the bag even though we have only examined some of those contents. The following table presents the Bayesian analysis of the problem at hand:

[tex]
\begin{tabular}{clllll}
$ i $ & $ P(H_i) $ & $ P(E|H_i) $ & $ P(H_i)P(E|H_i) $ & $ P(H_i|E)\qquad $ & $ P(R|H_i,E) $ \\
\hline
0 & 0.24 & 1 & 0.24 & 0.64070834 & 0 \\
1 & 0.33 & 0.33333333 & 0.11 & 0.29365799 & 0.05873160 \\
2 & 0.23 & 0.09523810 & 0.02190476 & 0.05847735 & 0.02339094 \\
3 & 0.11 & 0.02197802 & 0.00241758 & 0.00645402 & 0.00387241 \\
4 & 0.07 & 0.00366300 & 0.00025641 & 0.00068452 & 0.00054761 \\
5 & 0.02 & 0.00033300 & 0.00000666 & 0.00001778 & 0.00001778 \\
\hline
Total &&& 0.37458541 & 1 & 0.08656034 \\
\end{tabular}[/tex]

Note that in addition to the quantities discussed above, the table contains one additional item in the rightmost column, P(R|Hi,E). This is the probability of drawing a red M&M on the eleventh (next) draw from the bag of candies given hypothesis Hi and given that the first ten draws were non-red candies.

The bottom right number, 0.08656 (or 0.087 for short) is the sum of these probabilities. This is the probability that the next candy drawn from the bag will be red given all of the available evidence, which is (a) the empirical probability distribution obtained by examining hundreds of bags of candies and (b) the fact that the first ten candies drawn from the bag being investigated were non-red candies.
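The table can be reproduced with a few lines of Python. This is a sketch under the post's own hypothetical prior; the variable names are illustrative, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

BAG_SIZE = 15   # candies per bag in the hypothetical example
DRAWN = 10      # non-red candies drawn so far

# Hypothetical empirical prior from the example: P(bag started with i reds).
PRIOR = {0: 0.24, 1: 0.33, 2: 0.23, 3: 0.11, 4: 0.07, 5: 0.02}

def likelihood(i):
    """P(E|H_i): ten non-reds in a row from a bag of 15 that holds i reds."""
    return comb(BAG_SIZE - i, DRAWN) / comb(BAG_SIZE, DRAWN)

# Numerators of Bayes' Theorem, P(H_i) * P(E|H_i), and their sum.
joint = {i: p * likelihood(i) for i, p in PRIOR.items()}
evidence = sum(joint.values())

# Posterior P(H_i|E) for each hypothesis.
posterior = {i: j / evidence for i, j in joint.items()}

# Under H_i all i reds are still among the 5 remaining candies,
# so the chance of red on the next draw is i/5, weighted by the posterior.
remaining = BAG_SIZE - DRAWN
p_next_red = sum(posterior[i] * i / remaining for i in posterior)

for i in PRIOR:
    print(i, round(likelihood(i), 8), round(joint[i], 8), round(posterior[i], 8))
print("P(next candy is red):", round(p_next_red, 5))
```

Changing the values in `PRIOR` shows how strongly the final number depends on what is assumed about the bags beforehand.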
 
  • #24
D H said:
...The P(Hi) are the prior probabilities of the hypotheses Hi...In this case, we have some meticulously collected empirical data regarding the number of red M&Ms in bags that contain fifteen M&Ms...
[tex]
\begin{tabular}{clllll}
$ i $ & $ P(H_i) $ & $ P(H_i|E) $ & $ P(H_i)P(H_i|E) $ & $ P(E|H_i)\qquad $ & $ P(R|H_i,E) $ \\
\hline
0 & 0.24 & 1 & 0.24 & 0.64070834 & 0 \\
1 & 0.33 & 0.33333333 & 0.11 & 0.29365799 & 0.05873160 \\
2 & 0.23 & 0.09523810 & 0.02190476 & 0.05847735 & 0.02339094 \\
3 & 0.11 & 0.02197802 & 0.00241758 & 0.00645402 & 0.00387241 \\
4 & 0.07 & 0.00366300 & 0.00025641 & 0.00068452 & 0.00054761 \\
5 & 0.02 & 0.00033300 & 0.00000666 & 0.00001778 & 0.00001778 \\
\hline
Total &&& 0.37458541 & 1 & 0.08656034 \\
\end{tabular}[/tex]

Note that in addition to the quantities discussed above, the table contains one additional item in the rightmost column, P(R|Hi,E). This is the probability of drawing a red M&M on the eleventh (next) draw from the bag of candies given hypothesis Hi and given that the first ten draws were non-red candies.

The bottom right number, 0.08656 (or 0.087 for short) is the sum of these probabilities. This is the probability that the next candy drawn from the bag will be red given all of the available evidence, which is (a) the empirical probability distribution obtained by examining hundreds of bags of candies and (b) the fact that the first ten candies drawn from the bag being investigated were non-red candies.

That's very interesting, only wouldn't the result change according to the choice of prior probabilities? If we take into account that the bags are themselves drawn from a finite population, then [tex]P(H_0)\approx0.21[/tex] etc when N is large and perhaps then the result would be closer to 0.1. Also, how were the [tex]P(R|E,H_i)[/tex] calculated - wouldn't they just be equal to i/5?
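One way to probe this question is to swap the empirical prior for a binomial one. The sketch below assumes each of the 15 candies is independently red with probability 0.10; that is a modeling assumption standing in for the large-N limit, not something established in the thread, and the names are illustrative.

```python
from math import comb

BAG_SIZE = 15
DRAWN = 10
P_RED = 0.10   # assumed per-candy chance of being red

def binomial_prior(i):
    """P(H_i) if every candy is independently red with probability P_RED."""
    return comb(BAG_SIZE, i) * P_RED**i * (1 - P_RED)**(BAG_SIZE - i)

def likelihood(i):
    """P(E|H_i): ten non-reds in a row; comb() gives 0 once i exceeds 5."""
    return comb(BAG_SIZE - i, DRAWN) / comb(BAG_SIZE, DRAWN)

joint = {i: binomial_prior(i) * likelihood(i) for i in range(BAG_SIZE + 1)}
evidence = sum(joint.values())
remaining = BAG_SIZE - DRAWN
p_next_red = sum(joint[i] / evidence * i / remaining for i in joint)

print("P(H_0):", round(binomial_prior(0), 4))    # about 0.21
print("P(next candy is red):", p_next_red)        # 0.1 up to rounding error
```

With this prior the ten misses carry no information about the next draw, so the result stays at 0.10 rather than moving up or down.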
 
  • #25
First off, I mislabeled three of the columns in the table in post #23. I have edited the post to correct that error. Sorry if that caused any confusion.

bpet said:
That's very interesting, only wouldn't the result change according to the choice of prior probabilities?
Of course. That is why in this hypothetical example the students collected lots of data involving hundreds, maybe even thousands of bags of candy.

A somewhat similar approach is to use maximum likelihood. This approach will yield a similar answer for P(R|E): less than 10% given the empirically-determined probability distribution.

Also, how were the [tex]P(R|E,H_i)[/tex] calculated - wouldn't they just be equal to i/5?
No. That is P(R|Hi), the probability of getting a red assuming hypothesis i. P(R|Hi,E) is the conditional probability of getting a red assuming hypothesis i and given the evidence and given the empirical priors:

[tex]P(R|H_i,E) = P(R|H_i)P(H_i|E)[/tex]
 
  • #26
D H said:
First off, I mislabeled three of the columns in the table in post #23. I have edited the post to correct that error. Sorry if that caused any confusion.


Of course. That is why in this hypothetical example the students collected lots of data involving hundreds, maybe even thousands of bags of candy.

A somewhat similar approach is to use maximum likelihood. This approach will yield a similar answer for P(R|E): less than 10% given the empirically-determined probability distribution.


No. That is P(R|Hi), the probability of getting a red assuming hypothesis i. P(R|Hi,E) is the conditional probability of getting a red assuming hypothesis i and given the evidence and given the empirical priors:

[tex]P(R|H_i,E) = P(R|H_i)P(H_i|E)[/tex]

Ok thanks, now I see where the discrepancy lies, you were using statistical estimates but we were able to assume that the exact proportion of red candies is known based on the production efficiency argument. I modified your table a bit to show how the probability increases as predicted using the simpler method (and put in what I think are the correct column labels).
 

Attachments

  • Untitled.png
  • M and M probabilities.xls
  • #27
Neither of those simpler methods corresponds to how one would sample candies, does it? You definitely are not drawing from a constantly-replenished or infinite source (binomial) nor are you drawing from a finite but very large source where you know the initial distribution (hypergeometric). Instead you are drawing from a small but somewhat unknown sample space that is selected in a somewhat unknown way from an even larger sample space with a known distribution.
 
  • #28
D H said:
Neither of those simpler methods corresponds to how one would sample candies, does it? You definitely are not drawing from a constantly-replenished or infinite source (binomial) nor are you drawing from a finite but very large source where you know the initial distribution (hypergeometric). Instead you are drawing from a small but somewhat unknown sample space that is selected in a somewhat unknown way from an even larger sample space with a known distribution.

Ah yes, I was implicitly assuming that the mixing machine involves a large bucket filled with fixed proportions; as you said, that isn't necessarily how the manufacturing process works. I wonder, if the mixing machine was a long funnel, what the prior distribution would look like; or, if the prior is estimated from data, what the uncertainty in the result would be.

I tried a few more models and found some examples with P(R|E) ranging from 0.00014 to 0.24 (see below). What a fascinating problem! Thank you, D H and Rage Spectate!
 

Attachments

  • Untitled.png
  • #29
I think it'd help to hear your professor's reasoning on the problem, but in the original post you make it sound like he just said "wrong", and then you explained your reasoning and he simply said "wrong" again.
 
  • #31
D H said:
No. that is P(R|Hi), the probability of getting a red assuming hypothesis i. P(R|Hi,E) is the conditional probability of getting a red assuming hypothesis i and given the evidence and given the empirical priors:

Agreed, and this is i/5 (the probability that 11th is red given first 10 not red and i reds altogether).

D H said:
[tex]P(R|H_i,E) = P(R|H_i)P(H_i|E)[/tex]

Not true (e.g. by a Venn diagram argument). Perhaps you meant to write

[tex]P(R,H_i|E) = P(R|H_i,E)P(H_i|E)[/tex]

so that

[tex]P(R|E) = \sum_i P(R,H_i|E)[/tex]
 
  • #32
D H said:
[tex]
\begin{tabular}{clllll}
$ i $ & $ P(H_i) $ & $ P(E|H_i) $ & $ P(H_i)P(E|H_i) $ & $ P(H_i|E)\qquad $ & $ P(R|H_i,E) $ \\
\hline
0 & 0.24 & 1 & 0.24 & 0.64070834 & 0 \\
1 & 0.33 & 0.33333333 & 0.11 & 0.29365799 & 0.05873160 \\
2 & 0.23 & 0.09523810 & 0.02190476 & 0.05847735 & 0.02339094 \\
3 & 0.11 & 0.02197802 & 0.00241758 & 0.00645402 & 0.00387241 \\
4 & 0.07 & 0.00366300 & 0.00025641 & 0.00068452 & 0.00054761 \\
5 & 0.02 & 0.00033300 & 0.00000666 & 0.00001778 & 0.00001778 \\
\hline
Total &&& 0.37458541 & 1 & 0.08656034 \\
\end{tabular}[/tex]

Do you have a mistake in calculating the final probability P(R|E)? It seems as though you've calculated

[tex]P(R|E) = \sum_i P(R|H_i,E)[/tex]

whereas you should really be calculating

[tex]P(R|E) = \sum_i P(R|H_i,E) P(H_i|E)[/tex]

It seems intuitively obvious (and I'm pretty sure I have a proof) that if we let p be the average proportion of red candies in each bag, Y be the number of red candies in the first n draws and R be the event "the next candy is red" then

P(R | Y=m) = p

i.e. the probability that the next candy is red is independent of what colors the first n candies were. Note that if p=10% and we've drawn 14 out of the 15 candies in the bag, this tells us that the hypothesis "all the candies are green" is 9 times more likely than the hypothesis "all but one of the candies are green, and we've chosen all the green ones so far", which seems reasonable.
 
  • #33
It would be interesting to know what questions a and b are. The questions and answers might give sufficient information as to whether the bag contained a fixed ratio of sweets to start with or just a random selection.
 
  • #34
I think things are very clear, actually.

Let's denote the number of candies in a bag by N, the number of red candies by m, and the probability that the first red will appear at the k-th draw by P(k).
Because we have here a sampling without replacement, the following holds:

P(k+1)/P(k)=(N+1-k-m)/(N-k)=1-(m-1)/(N-k)

So P(k+1)<P(k). The probability decreases in any case as long as m>1.
In the case of m=1 the probability remains the same.

There is no need for any reference to Bayes' Theorem or to the particulars of the production of M&M's, etc.
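The ratio formula can be checked with exact arithmetic. The sketch below assumes N and m are known and uses the counting form P(k) = C(N-k, m-1)/C(N, m), which is one standard way to write the chance that the first red lands on draw k; the function names are illustrative. Note that P(k) is judged before any candy is drawn, so it is not the same quantity as the chance of a red on the next draw given k misses, which for known m is m/(N-k).

```python
from fractions import Fraction
from math import comb

def p_first_red_at(k, total, reds):
    """P(first red appears on draw k), drawing without replacement.

    Draw k is red and the other reds - 1 reds sit among the
    total - k later positions; requires reds >= 1.
    """
    return Fraction(comb(total - k, reds - 1), comb(total, reds))

def ratio_matches(total, reds):
    """Check P(k+1)/P(k) == (N+1-k-m)/(N-k) for every k with N-k >= m."""
    for k in range(1, total - reds + 1):
        lhs = p_first_red_at(k + 1, total, reds) / p_first_red_at(k, total, reds)
        rhs = Fraction(total + 1 - k - reds, total - k)
        if lhs != rhs:
            return False
    return True

print(ratio_matches(15, 3), ratio_matches(60, 6), ratio_matches(15, 1))
```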
 
  • #35
bpet said:
D H said:
P(R|Hi,E) is the conditional probability of getting a red assuming hypothesis i and given the evidence and given the empirical priors:

[tex]P(R|H_i,E) = P(R|H_i)P(H_i|E)[/tex]
Not true (e.g. by a Venn diagram argument). Perhaps you meant to write

[tex]P(R,H_i|E) = P(R|H_i,E)P(H_i|E)[/tex]

so that

[tex]P(R|E) = \sum_i P(R,H_i|E)[/tex]

Cexy said:
Do you have a mistake in calculating the final probability P(R|E)? It seems as though you've calculated

[tex]P(R|E) = \sum_i P(R|H_i,E)[/tex]

whereas you should really be calculating

[tex]P(R|E) = \sum_i P(R|H_i,E) P(H_i|E)[/tex]
No and no.

What I should be calculating is exactly what is in that table.

Suppose you have partitioned the sample space Ω into subsets {Ai} such that
1. Each Ai is a subset of Ω,
2. The union of all Ai is Ω, and
3. The intersection of any two members of {Ai} is the empty set.
Then the probability of some event B is

[tex]P(B) = \sum_i P(B|A_i)P(A_i)[/tex]

This is the total probability theorem. See http://www.cs.cornell.edu/courses/cs280/2004sp/probability3.pdf (Note: This reference also discusses Bayes' Theorem.)

In the problem at hand, we are randomly drawing one M&M from a partially emptied bag of candies that contains five candies of which zero to five are red. There are six mutually exclusive possibilities: The contents of the bag now include exactly zero (A0), one (A1), two (A2), three (A3), four (A4), or five (A5) red candies. Because ∪Ai=Ω and Ai∩Aj=∅ for i≠j, this is a partition of the sample space. Thus the total probability that the next candy drawn from the bag is red is

[tex]P(R) = \sum_{i=0}^5 P(R|A_i)P(A_i)[/tex]

The conditional probabilities P(R|Ai) are easily calculated: They are i/5. The only remaining issue is to calculate the probabilities P(Ai). This has already been done: The conditional probabilities P(Hi|E) are the best guesses available regarding the probabilities P(Ai). Thus

[tex]
P(R) =
\sum_{i=0}^5 P(R|A_i)P(A_i) =
\sum_{i=0}^5 P(R|H_i\,\text{and}\,E)P(H_i|E)
\,\,\text{where}\,P(R|H_i\,\text{and}\,E)=i/5[/tex]

It is often handy to give those somewhat awkward P(R|Hi and E)P(Hi|E) a name. In many texts and articles this name is P(R|Hi,E).
 
