Question on how to calculate this probability.

1. Nov 30, 2013

steveryan

Been working on this one for a couple of hours and have not come up with an answer.

Two people are given 1144 of the same options. (Two identical sets of options where each option is equal).

Person 1 chooses 22 of the 1144 options.

Person 2 chooses 28 of the 1144 options.

What is the probability that 7 of their choices are identical?

2. Nov 30, 2013

tiny-tim

hi steveryan! welcome to pf!

let's rewrite it with only one person …

there are 1122 white balls and 22 red balls

you choose 28 balls

what is the probability that exactly 7 of them are red?

3. Nov 30, 2013

steveryan

Thanks for the response!

I like your simplified version of the question. Makes it a little easier for me. I can also now see that it is basically impossible for 7 of those balls to be red.

1.92% of the 1144 balls are red. So, after selecting 28 we can assume that approximately 1.92% of them will be red. That's well less than 1 ball.
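
The estimate above is the expected number of red balls in the sample. A minimal Python sketch of that arithmetic, using the numbers from the thread (the variable names are only illustrative):

```python
# Expected number of red balls when drawing 28 from 1144 balls, 22 of them red.
total_balls = 1144
red_balls = 22
draws = 28

red_fraction = red_balls / total_balls   # share of red balls in the urn
expected_red = draws * red_fraction      # mean number of reds among the 28 drawn

print(f"red fraction  = {red_fraction:.4f}")
print(f"expected reds = {expected_red:.3f}")
```

An expected value below 1 shows that 7 reds is unlikely, but not how unlikely; that needs the full distribution of the count.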

I still don't know how to get the actual probability. Seems like it should be easy from here though.

Can you lead me into the next step?

4. Nov 30, 2013

tiny-tim

hi steveryan!

how many ways are there of choosing 21 white balls and 7 red balls (out of 1122 white balls and 22 red balls)?

5. Nov 30, 2013

Ray Vickson

Why do you claim that it is "basically impossible for 7 of those balls to be red"? It is certainly very "improbable", but the whole point of the problem is for you to compute the actual value.

Try a simpler case first: suppose there are 1144 balls, of which 28 are red and the rest white. If you pick 3 balls without replacement, what are the probabilities of P(#red = 0), P(#red = 1), P(#red = 2) and P(#red = 3)? Write out in detail the events E1 = {#red = 0}, E2 = {#red = 1}, E3 = {#red = 2} and E4 = {#red = 3} in terms of outcomes like RRW, WRR, RWR, etc. Compute the probability of each separate outcome, then combine all the probabilities for outcomes in E1, etc. Yes, it involves a lot of (mostly simple) work, but you will, I hope, see what is happening and you will then be able to extend it to the larger case in this problem.
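
The simpler case suggested above is small enough to enumerate by machine. The following is a sketch only, assuming the 1144 balls with 28 red and 3 draws from that post; the function and variable names are hypothetical. It computes the probability of each ordered outcome (RRW, WRR, ...) and then groups them by the number of reds.

```python
from fractions import Fraction
from itertools import product

TOTAL, RED, DRAWS = 1144, 28, 3   # numbers from the simpler case above

def sequence_probability(seq):
    """Probability of one ordered outcome such as 'RWR', drawing without replacement."""
    red_left, white_left = RED, TOTAL - RED
    prob = Fraction(1)
    for colour in seq:
        balls_left = red_left + white_left
        if colour == "R":
            prob *= Fraction(red_left, balls_left)
            red_left -= 1
        else:
            prob *= Fraction(white_left, balls_left)
            white_left -= 1
    return prob

# Group the 2**3 ordered outcomes by how many reds they contain.
red_count_probability = {k: Fraction(0) for k in range(DRAWS + 1)}
for outcome in product("RW", repeat=DRAWS):
    red_count_probability[outcome.count("R")] += sequence_probability(outcome)

for k, p in red_count_probability.items():
    print(f"P(#red = {k}) = {float(p):.6g}")
```

Outcomes with the same number of reds (for example RRW and WRR) come out with the same probability, which is what lets the larger problem be reduced to counting.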

6. Dec 6, 2013

steveryan

I can't figure out this problem.

I'm trying to help a friend of mine who was involved in a contest. He suspects two of the people in the contest were cheating because out of the choices they made, they both had 7 of the same choices. There were 1144 choices. One guy selected 22 and the other guy selected 28.

I didn't realize there was much of a difference between "basically impossible" and "very improbable". Remember, out of 1144 choices, we are only selecting about 1.9% and 2.4% of them, yet somehow 25% of the 28 are supposed to be the same as 31% of the other guy's 22.

7. Dec 7, 2013

Ray Vickson

Well, in fields like reliability engineering and the like, you want to be able to assess small probabilities; it is not enough to simply say "the probability is very small". Different designs may yield different (small) probabilities, and business contracts, insurance policies and so forth may depend on the answers. In the present case the probability you seek is comparable to that of getting 21 heads in 21 coin tosses.

I did already give you a hint for how to figure it out. Have you tried? Where are you stuck? If you show what you have attempted we can be in a better position to assist you; but just giving you the answer is not an option.

8. Dec 7, 2013

PeroK

Were the choices random? Or, did they decide what to choose? There's a big difference. In the second case, you could very well get similar choices, and the probability theory used in this post is not applicable.

E.g. two friends ordering from a large menu might order the same dishes quite often - much more often than if they choose the dishes at random.

9. Dec 7, 2013

tiny-tim

hi steveryan!

ah, so this isn't a maths-homework problem?

(i assumed this was part of your maths course)

let's start by checking how much of the maths you know:

do you know the difference between a combination and a permutation?

do you know what nCr means (it's also written (n r), but as a column not a row)?

do you know what "independent" means? ie if A and B are independent events, and C is the event A and B, do you know what P(C) is in terms of P(A) and P(B)?

10. Dec 7, 2013

haruspex

Another thing to consider is how many people altogether were making these choices. If a few million, it might not be so surprising that some pair of them can be found that made very similar ones.

Anyway, how would it constitute cheating? Was there some benefit to having similar choices to someone else? Sounds like we need more background to assess this fairly.
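
The point about the number of participants can be made rough and quantitative. The sketch below rests on strong assumptions that are not in the thread: every pair of contestants is treated as one 22-choice and one 28-choice selection made independently and uniformly at random, and the contestant counts are invented for illustration. The function names are hypothetical.

```python
from math import comb

def overlap_probability(total, first, second, shared):
    """Chance that two independent, uniformly random selections share exactly `shared` items."""
    return comb(first, shared) * comb(total - first, second - shared) / comb(total, second)

def expected_matching_pairs(contestants, pair_probability):
    """Expected number of pairs showing the overlap (linearity of expectation)."""
    return comb(contestants, 2) * pair_probability

p = overlap_probability(1144, 22, 28, 7)

# Hypothetical contest sizes, purely for illustration.
for contestants in (2, 100, 10_000, 1_000_000):
    print(contestants, expected_matching_pairs(contestants, p))
```

With only two contestants the expected count is just the single-pair probability; with enough contestants it exceeds 1, at which point finding some matching pair is no longer surprising.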

11. Dec 7, 2013

Ray Vickson

Yes, that is true. In this case we might regard the observation as evidence to reject the hypothesis of independent, random choices. Basically, that is what statistical hypothesis tests do.

12. Dec 7, 2013

steveryan

I'm not looking for just the answer. This sort of thing does interest me to some extent. However, I would like to show my friend how to arrive at the answer (He is not my math teacher! I earned my bachelor's 2 years ago at UOP and am not going on to get my master's. Even if I was, I would not be taking any type of math course.)

So far, all I can come up with is the probability of choosing a red ball when there are 1144 balls and 22 of them are red.

1144/22 = 52

So, I have a 1:52 chance of pulling a red ball on the 1st pull. As for how to proceed from here, I'm not sure, because the chance of pulling a red ball on the next pull has 2 different outcomes depending upon what I pulled the 1st time.

Can I just pretend that I pulled red balls on the 1st 7 pulls and then all the rest were white?

1144/22 = 52
1143/21 = 54.42
1142/20 = 57.1
......
......
......
1123/15 = 74.86

Then what do I do from here? Or am I not doing it correctly?

They decided what to choose.

I do not know the difference between a combo and a perm.

I do not know what SUP means or (n,r).

I know what independent events are. P(C) means probability C which refers to P(A) and P(B).
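
On the idea in the post above of pretending the first 7 pulls are red and the rest white: that can be carried through, provided each pull uses the balls actually left and the result is multiplied by the number of orders in which 7 reds can fall among 28 pulls. A minimal sketch under those assumptions (names are illustrative, not from the thread):

```python
from fractions import Fraction
from math import comb

TOTAL, RED, DRAWS, HITS = 1144, 22, 28, 7

# Probability of one particular order: 7 reds first, then 21 whites.
prob_one_order = Fraction(1)
red_left, white_left = RED, TOTAL - RED
for pull in range(DRAWS):
    balls_left = red_left + white_left
    if pull < HITS:
        prob_one_order *= Fraction(red_left, balls_left)
        red_left -= 1
    else:
        prob_one_order *= Fraction(white_left, balls_left)
        white_left -= 1

# Every other order with 7 reds has the same probability, and there are
# C(28, 7) such orders.
orders = comb(DRAWS, HITS)
prob_exactly_seven = orders * prob_one_order

print(float(prob_one_order), orders, float(prob_exactly_seven))
```

The first factor is 22/1144, the 1-in-52 chance noted above; the later factors shrink the red count and the ball count together.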

13. Dec 7, 2013

haruspex

First, let's count the total number of ways you can select 28 from 1144, not worrying about the colours of the balls. You can think of this in another way: arrange the 1144 in a random order, then take the first 28. There are 1144! ways to arrange the 1144 in order, but many of these arrangements lead to the same selection of 28, so we have to figure out how many times we have counted each selection. It won't matter what order the first 28 are in, amongst themselves, and it won't matter what order the remaining 1144-28 are in. We can take any one of the 1144! orderings and shuffle the first 28 (28! ways) and the remaining 1144-28 (1116! ways) and still get the same selection of 28 balls. So we counted each selection 28! * 1116! times.
Result: the number of selections of 28 from 1144 is $\frac{1144!}{28!\,1116!}$. This is usually written 1144C28 ("1144 choose 28") or ${1144 \choose 28}$.
Now we have to compare that with the number of ways of selecting 7 from the 22 and 28-7 from the 1144-22. Can you work those out?
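
The over-counting argument above can be checked by brute force on a small case and then applied to the thread's numbers. This is a sketch; `selections_by_overcounting` is a hypothetical helper name, and the 5-items-choose-2 case is chosen only because it is small enough to enumerate.

```python
from itertools import permutations
from math import comb, factorial

def selections_by_overcounting(n, k):
    """n! orderings, each selection counted k! * (n - k)! times."""
    return factorial(n) // (factorial(k) * factorial(n - k))

# Brute-force check on a small case: order 5 items, keep the first 2.
n, k = 5, 2
distinct = {frozenset(order[:k]) for order in permutations(range(n))}
print(len(distinct), selections_by_overcounting(n, k))

# The same formula for the numbers in this thread.
total_selections = selections_by_overcounting(1144, 28)
print(total_selections == comb(1144, 28))
```

Python's `math.comb` computes the same quantity directly, so in practice the factorials never need to be written out.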

14. Dec 8, 2013

tiny-tim

ok! then …

N(C) = number of ways of choosing 21 white balls and 7 red balls (out of 1122 white balls and 22 red balls)

N(A) = number of ways of choosing 21 white balls out of 1122 white balls

N(B) = number of ways of choosing 7 red balls out of 22 red balls

N(A) = 1122!/(21!*1101!)

N(B) = 22!/(7!*15!)

so N(C) = N(A)N(B) = 1122!*22!/(7!*15!*21!*1101!)

(because A and B are independent)

now P(C) = N(C) divided by the total number of ways of choosing 28 balls out of 1144 (colour not mattering), which is 1144!/(28!*1116!)
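
These counts are easy to evaluate exactly with integer arithmetic. A short sketch, assuming Python 3.8 or later for `math.comb`; the variable names mirror N(A), N(B), N(C) above but are otherwise illustrative.

```python
from fractions import Fraction
from math import comb

n_a = comb(1122, 21)      # N(A): 21 white balls out of 1122
n_b = comb(22, 7)         # N(B): 7 red balls out of 22
n_c = n_a * n_b           # N(C): every white choice pairs with every red choice
n_total = comb(1144, 28)  # all ways to pick 28 of 1144, colour ignored

p_c = Fraction(n_c, n_total)
print(f"N(B) = {n_b}")
print(f"P(C) = {float(p_c):.4e}")
```

Keeping the ratio as a `Fraction` until the last step avoids any rounding in the very large intermediate counts.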

15. Dec 8, 2013

steveryan

Sorry to slow things down a bit, but what does the exclamation point mean?

16. Dec 8, 2013

Ray Vickson

Factorial. The notation n! means the product of all the integers from 1 to n (and 0! = 1 by definition). Thus, 2! = 2*1 = 2, 3! = 3*2*1 = 6, etc. Of course, (n+1)! = (n+1)*n!, so you can compute them recursively, one-by-one. These numbers get large in a hurry. For example, 1144! ≈ .8606355147e3004 (about $0.86 \times 10^{3004}$).
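
A small sketch of the recursive rule described above, (n+1)! = (n+1)*n! starting from 0! = 1, written as a loop. The function name is hypothetical; Python integers are arbitrary precision, so 1144! can be held exactly.

```python
import math

def factorial_iterative(n):
    """Build n! one step at a time using (k + 1)! = (k + 1) * k!, starting from 0! = 1."""
    value = 1
    for k in range(1, n + 1):
        value *= k
    return value

big = factorial_iterative(1144)
digits = str(big)
print(f"1144! has {len(digits)} digits and starts with {digits[:4]}...")
```

The digit count is the practical way to see "large in a hurry": the number itself is far beyond what a floating-point value can store.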

Last edited: Dec 8, 2013

17. Dec 13, 2013

steveryan

Wow! That's a really large number I suppose. So large, I really don't know what to do with it.

So the chance of picking 7 of the same choices is 1 in 1144!/(28!*1116!).

Correct? Or are we not done yet?

18. Dec 13, 2013

haruspex

No, 1144C28 is the total number of ways of choosing 28 from 1144. (See my previous post.) You have to compare that with choosing 7 from the 22 and the other 21 from the remaining 1122. tiny-tim called those N(A) and N(B), and multiplied them to get N(C). To get the probability you want, you then need to divide that by 1144C28.
To get an approximate answer, you can then use Stirling's formula (Google it) applied to each of the nine factorials. Or you may have a calculator or spreadsheet software that has the combinations function. (Answer = 3E-7.)
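
For anyone wanting to try the Stirling route, here is a sketch that works with logarithms of the nine factorials so nothing overflows. It assumes the basic form ln n! ≈ n ln n - n + ½ ln(2πn) without higher-order correction terms, and uses `math.lgamma` only as a reference value; the helper names are hypothetical.

```python
import math

def log_factorial_stirling(n):
    """Stirling's approximation: ln n! ~ n ln n - n + 0.5 ln(2 pi n)."""
    if n == 0:
        return 0.0
    return n * math.log(n) - n + 0.5 * math.log(2 * math.pi * n)

def log_choose(n, r, log_factorial):
    """ln C(n, r) from any function that returns ln k!."""
    return log_factorial(n) - log_factorial(r) - log_factorial(n - r)

def log_probability(log_factorial):
    """ln of C(22, 7) * C(1122, 21) / C(1144, 28)."""
    return (log_choose(22, 7, log_factorial)
            + log_choose(1122, 21, log_factorial)
            - log_choose(1144, 28, log_factorial))

p_stirling = math.exp(log_probability(log_factorial_stirling))
# math.lgamma(n + 1) equals ln n! to floating-point accuracy; used as a reference.
p_reference = math.exp(log_probability(lambda n: math.lgamma(n + 1)))

print(f"Stirling : {p_stirling:.3e}")
print(f"reference: {p_reference:.3e}")
```

The small factorials (7!, 15!, 21!) are where the basic formula is least accurate, so the Stirling figure should be read as an approximation to about two significant digits.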

Last edited: Dec 13, 2013

19. Dec 13, 2013

Ray Vickson

You have 1144 items, of which 22 are 'marked' and the remaining 1122 are 'unmarked'; the marked items are those selected by person 1. Now person 2 chooses 28 items (without replacement), and we want to know the probability that 7 of them are 'marked'. For what it's worth, the probability distribution that applies in this case is the so-called hypergeometric distribution: if we choose $n$ items from a population having $N_1$ items of type I and $N_2$ of type II, the probability of getting $k$ items of type I in the sample is
$$P\{ \text{no. of type I } = k \} = \frac{{N_1 \choose k} {N_2 \choose n-k}}{{N_1+N_2 \choose n}},$$
where
$${u \choose v} = u \text{ choose }v.$$
For this problem ($N_1 = 22$, $N_2 = 1122$, $n = 28$, $k = 7$):
$$\text{answer } = \frac{{22 \choose 7}{1122 \choose 21}}{{1144 \choose 28}} \approx 3.050936962 \times 10^{-7}.$$
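
The hypergeometric formula translates almost directly into code. The sketch below uses exact fractions and also sums the tail, since "at least 7 matches" may be the more natural quantity when judging whether an overlap is suspicious; that choice, and the names used, are assumptions rather than something stated in the thread.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(k, n1, n2, n):
    """P(k type-I items) when n items are drawn from n1 type-I and n2 type-II."""
    return Fraction(comb(n1, k) * comb(n2, n - k), comb(n1 + n2, n))

N1, N2, N = 22, 1122, 28
pmf = [hypergeom_pmf(k, N1, N2, N) for k in range(min(N1, N) + 1)]

for k in range(9):
    print(f"P(k = {k}) = {float(pmf[k]):.3e}")

p_at_least_7 = sum(pmf[7:])
print(f"P(k >= 7) = {float(p_at_least_7):.3e}")
```

The probabilities over all possible k add up to exactly 1, which is a useful sanity check on the formula.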

BTW: For large u and/or v we can use Stirling's formula to get good approximations to ${u \choose v}$. Alternatively, we can (and often do) get it recursively, through simple updating formulas such as
$${u+1 \choose v } = {u \choose v} + {u \choose v-1},$$
starting from
$${n \choose 0} = 1, \;\; {n \choose 1} = n.$$
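
The updating formula above is Pascal's rule, and it builds binomial coefficients using additions only. A minimal sketch (the function name `binomial_row` is hypothetical):

```python
from math import comb

def binomial_row(u):
    """Return [C(u, 0), ..., C(u, u)] built with C(u+1, v) = C(u, v) + C(u, v-1)."""
    row = [1]  # row for u = 0
    for _ in range(u):
        # Pad with a zero on each side so both neighbours exist at the row ends.
        row = [a + b for a, b in zip(row + [0], [0] + row)]
    return row

row_22 = binomial_row(22)
print(row_22[7])
```

The same loop reaches the row for 1144, although a direct call such as `math.comb(1144, 28)` is simpler when only one coefficient is needed.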

20. Dec 15, 2013

steveryan

22! / 7!15!

28! / 7!21!

1122! / 7!1115!

I don't know....maybe I'm just wasting my time with all of this now. It seemed like something that I should be able to do, but it's just far too complicated.