Let's say we have a very simple game of two-card poker, where the deck consists of only 4 aces and 4 kings. There are 28 possible combinations: six pairs of aces, six pairs of kings, and 16 ace-king hands. In play against a single opponent, both players are dealt two cards, face down. The probability of a player being dealt a particular pair (aces, say) is 6/28, while the probability of an ace-king combination is 16/28.

Let's say you are dealt a pair of kings and want to calculate the probability of winning. Before the deal, there was a 6/28 probability that your opponent would be dealt a pair of aces. However, you know you got two kings, so you know your opponent could have been dealt only 1 of the 6 possible pairs of kings and only 8 of the 16 possible ace-king combinations. Given your pair of kings, your opponent could have only one of 15 possible hands (6 pairs of aces, 1 pair of kings, and 8 ace-king combinations). To calculate the probability that they are holding a pair of aces, do you use 6/15 instead of the original 6/28?

My second question has to do with combining probabilities. What if, in the scenario above, there were three players instead of two? Each of your opponents has the same probability of having been dealt a pair of aces. If you are trying to determine the probability of beating both opponents, the probabilities should be added together, shouldn't they? 1 - (6/15 + 6/15) would make your probability of winning only 0.20, which seems awfully low, given that there are six hands that will beat you, one that will tie, and 8 that will lose to you. What am I missing?
Yes, you use 6/15. That's called the conditional probability: the probability that your opponent holds aces, given what you know about your own hand.

As for your second question: if you have two events A and B, it is not in general true that P(A or B) = P(A) + P(B). That is only true if the events are mutually exclusive (in other words, if it is impossible for both events to happen at the same time). Note that P(A or B) means the probability that one or both of the events {A, B} happen. The general formula is P(A or B) = P(A) + P(B) - P(A and B), where P(A and B) means the probability that both events A and B happen.

In this example, A is the event that your first opponent beats you, and B is the event that your second opponent beats you. Now P(A and B) is the probability that both opponents get a pair of aces, which is 1/15, so P(A or B) = 6/15 + 6/15 - 1/15 = 11/15. But P(A or B) is the probability that at least one opponent beats you, so the probability that you are not beaten (you win outright or tie against the other pair of kings) is 1 - P(A or B) = 1 - 11/15 = 4/15, or about 0.27. And sure, it's not that good, but it is 2 versus 1.
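If you want to check these numbers by brute force, here is a minimal Python sketch that lists every way the two opponents can be dealt from the six cards you cannot see. The card labels and the helper name `is_aces` are made up for illustration; the only assumption is that all deals are equally likely.

```python
from fractions import Fraction
from itertools import combinations

# Six cards remain once you hold two kings: four aces and two kings.
# The labels are arbitrary; only the rank letter matters.
remaining = ["A1", "A2", "A3", "A4", "K1", "K2"]

def is_aces(hand):
    """True if both cards in the hand are aces."""
    return all(card.startswith("A") for card in hand)

# Every deal: Opponent 1's hand, then Opponent 2's hand from what is left.
deals = []
for hand1 in combinations(remaining, 2):
    rest = [c for c in remaining if c not in hand1]
    for hand2 in combinations(rest, 2):
        deals.append((hand1, hand2))

total = len(deals)  # 15 * 6 = 90 equally likely deals
p_a = Fraction(sum(is_aces(h1) for h1, _ in deals), total)
p_b = Fraction(sum(is_aces(h2) for _, h2 in deals), total)
p_a_and_b = Fraction(sum(is_aces(h1) and is_aces(h2) for h1, h2 in deals), total)
p_a_or_b = p_a + p_b - p_a_and_b

# Fractions print in lowest terms, so 6/15 shows up as 2/5.
print(p_a, p_b, p_a_and_b, p_a_or_b, 1 - p_a_or_b)
# prints: 2/5 2/5 1/15 11/15 4/15
```

Counting the deals directly, rather than multiplying probabilities, is what keeps P(A and B) honest here: the second hand is always drawn from whatever the first hand left behind.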
Thanks for the clear explanation. If I expand the scenario out to a standard 52-card deck, there are 1,326 possible hands. All the pairs (AA, KK, etc.) still have 6 combinations and all of the unpaired hands (AJ, 24, etc.) still have 16 combinations. If I am dealt a hand of 98, there are 72 combinations of paired hands that will beat me (3 combinations of 88, 3 combinations of 99, and 6 combinations for each of the other 11 pairs). There are 50 unpaired hands (AK through T2) that will also beat me. Ten of those (A9, A8, K9, K8, Q9, Q8, J9, J8, T9, T8) have 8 combinations and the other 40 have 16 combinations. So the number of hands that can beat me is 3 + 3 + 11*6 + 10*8 + 40*16 = 792, and the number of possible combinations is 1326 - 3 - 3 - 10*8 = 1240. With only one opponent, the probability of winning is 1 - 792/1240 = 0.36.

Here is where I founder. With two opponents, the probability of winning becomes 1 - (792/1240 + 792/1240 - (792/1240 * 792/1240)) = 0.13. With three opponents, the probability of winning is 1 - (792/1240 + 792/1240 + 792/1240 - (792/1240 * 792/1240 * 792/1240)) = -0.66. With four opponents, the probability of winning is -1.39. What am I doing wrong?
There are two problems here, both in how the probabilities are combined. (A side note on the counts first: with one 9 and one 8 in your own hand, a hand like A9 has 4 x 3 = 12 combinations, not 8, and the 50 unseen cards make C(50,2) = 1225 possible hands, so the single-opponent figure should be 832/1225 rather than 792/1240. That shifts the numbers a little, but it is not what produces the negative probabilities.)

First, it is not true in general that P(A and B) = P(A)*P(B). That is only true in special cases, when we say the events are independent. Roughly speaking, events A and B are independent if knowing that event A happens has no effect on your assessment of the probability that event B happens. In this case, the event that Opponent 1 has a better hand than you is not independent of the event that Opponent 2 has a better hand than you. If you know that Opponent 1 has a better hand than you, then some cards are missing from the deck when you draw Opponent 2's hand, and these are disproportionately likely to be aces, kings, and other high cards. So this modifies the odds that Opponent 2 has a better hand than you.

This can be seen more clearly in the previous example of the ace-king deck. In that example, P(A) = 6/15 and P(B) = 6/15. But notice that for P(A and B), I did not write 36/225. Instead, I wrote 1/15. To get this value, I had to look at all the possible combinations of hands that Opponents 1 and 2 could have. There are 15 x 6 = 90 such pairs of hands, and in 6 of them both opponents have better hands than you. So P(A and B) = 6/90 = 1/15, not 36/225. The probability P(A and B) was lower than P(A)P(B), because if event A happened, there would be only 2 aces left in the deck, reducing the chances that event B would happen as well.

To find P(A and B) in the 52-card game, you'll have to look at all the pairs of hands for Opponent 1 and Opponent 2, and figure out in how many cases they both beat you. You'll end up finding something that I think will be slightly lower than P(A)P(B). So this will change the calculation slightly. Note that if it's too much work to do the exact computation, you could estimate P(A and B) by P(A)P(B). There will be an error, but the larger the deck is, the smaller the error is.
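To get a feel for how the error shrinks, here is a rough Python sketch that stays with the simple ace-king game (not the 52-card one) and mixes several copies of that 8-card deck together. The function name `aces_probabilities` and the `copies` parameter are made up for illustration; you are assumed to hold two kings, and A and B are the events that Opponent 1 and Opponent 2 each hold a pair of aces.

```python
from fractions import Fraction
from math import comb

def aces_probabilities(copies):
    """Return (P(A), P(A and B)) for `copies` ace-king decks mixed together.

    Assumes you hold two kings, so the unseen cards are 4*copies aces
    and 4*copies - 2 kings.
    """
    aces = 4 * copies
    cards = 8 * copies - 2
    p_a = Fraction(comb(aces, 2), comb(cards, 2))
    # Given Opponent 1 holds two aces, two fewer aces and two fewer cards remain.
    p_b_given_a = Fraction(comb(aces - 2, 2), comb(cards - 2, 2))
    return p_a, p_a * p_b_given_a

for copies in (1, 2, 10, 100):
    p_a, p_both = aces_probabilities(copies)
    # By symmetry P(B) = P(A), so the independence estimate is P(A) squared.
    print(copies, float(p_both), float(p_a * p_a))
```

With a single deck the exact value is 1/15 against an estimate of 36/225, and the two columns move closer together as `copies` grows.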
If you play the same game with 100 52-card decks all mixed together, then the fact that Opponent 1's winning hand strips the deck of two better-than-average cards is almost insignificant, and P(A and B) is very, very close to P(A)P(B), but still not quite the same.

Second, the formula I mentioned in the previous post was for 2 events only. The formula for any number of events is called the inclusion-exclusion principle. For 3 events, this says that P(A or B or C) = P(A) + P(B) + P(C) - P(A and B) - P(A and C) - P(B and C) + P(A and B and C). [Note: when we write something like P(A and B), this expresses no opinion on whether C occurs or not. It does NOT mean P(A and B and not C).]

In general, if you have n events that you care about, you have to look at every nonempty subset of those n events (there are 2^n - 1 of them) and figure out the probability that every event in that subset occurs. Then, if the subset has an odd number of events, you add the probability, and if it has an even number of events, you subtract it. You can see that the formulas for 2 and 3 events follow this pattern. If you do these things, you won't get negative probabilities anymore.
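The add-odd, subtract-even rule can be sketched in Python on the small ace-king deck, where every deal can be listed. This is only an illustration under the assumption that you hold two kings; the helper names (`deal_all`, `union_by_inclusion_exclusion`, `beats_you`) are invented for the example.

```python
from fractions import Fraction
from itertools import combinations

remaining = ["A1", "A2", "A3", "A4", "K1", "K2"]  # you hold the other two kings

def is_aces(hand):
    """True if both cards in the hand are aces."""
    return all(card.startswith("A") for card in hand)

def deal_all(cards, players):
    """Yield every way to deal two cards to each player in turn."""
    if players == 0:
        yield ()
        return
    for hand in combinations(cards, 2):
        rest = [c for c in cards if c not in hand]
        for tail in deal_all(rest, players - 1):
            yield (hand,) + tail

def beats_you(deal, i):
    """Event i: opponent i holds a pair of aces."""
    return is_aces(deal[i])

def union_by_inclusion_exclusion(outcomes, n_events, occurs):
    """P(at least one event), summing over every nonempty subset of events."""
    total = Fraction(0)
    for size in range(1, n_events + 1):
        sign = 1 if size % 2 == 1 else -1  # add odd-sized subsets, subtract even
        for subset in combinations(range(n_events), size):
            hits = sum(all(occurs(o, i) for i in subset) for o in outcomes)
            total += sign * Fraction(hits, len(outcomes))
    return total

for players in (2, 3):
    outcomes = list(deal_all(remaining, players))
    by_formula = union_by_inclusion_exclusion(outcomes, players, beats_you)
    by_counting = Fraction(sum(any(is_aces(h) for h in d) for d in outcomes),
                           len(outcomes))
    print(players, by_formula, by_counting)
# prints:
# 2 11/15 11/15
# 3 1 1
```

With three opponents all six unseen cards are dealt out, and four aces cannot be spread over three hands without someone holding two of them, so the probability of being beaten is exactly 1. The naive sum 6/15 + 6/15 + 6/15 = 18/15 overshoots, and subtracting the three pairwise terms of 1/15 each brings it back to 15/15.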