# Probability of co-occurrence

1. May 19, 2015

### Ahmed Abdullah

Of a total of N people, m people are good at mathematics and c people are good at computer science. What is the expected number of people good at both mathematics and computer science? Or, what is the probability that r people are good at both mathematics and computer science?

I have derived a formula, but it contains factorials such as N!, m!, c!, which are difficult to compute for large values (my real problem has large values for all of these). I am looking for a neat, workable formula. I am hoping one exists, since it is such a basic problem.

2. May 19, 2015

### micromass

Staff Emeritus
You don't have enough information to find this.

3. May 19, 2015

### mathman

It very much depends on the dependence of good at one thing with good at the other. For example if c < m, but all people good at c are also good at m, then the answer is P=0 for r > c and P=1 otherwise.

4. May 19, 2015

### Ahmed Abdullah

What if being good at math and being good at computer science are independent of each other?

5. May 20, 2015

That depends on whether people's performance at math and at computer science (CS) is dependent or not.

In your example, the probability of being good at math is on average m/N, and c/N for CS. Let's call the first one P(math) and the second one P(CS). Then the probability of being good at both math and computer science is P(math, CS) = P(math) P(CS|math), where P(CS|math) is the conditional probability of being good at CS given being good at math. Once P(math, CS) is calculated, the expected number of people good at both math and CS is N · P(math, CS).
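As a small illustration of the product rule above, here is a sketch in Python. The numbers (N = 10, m = 6, c = 3) and the function name `expected_both` are hypothetical, chosen only to show how dependence enters: it assumes the extreme case where everyone good at CS is also good at math, so P(CS|math) = c/m.

```python
from fractions import Fraction

def expected_both(N, p_math, p_cs_given_math):
    """Expected count good at both: N * P(math) * P(CS | math)."""
    return N * p_math * p_cs_given_math

# Hypothetical numbers, not from the thread: 10 people, 6 good at math,
# 3 good at CS, and (by assumption) all 3 CS people are also good at math.
N, m, c = 10, 6, 3
p_math = Fraction(m, N)           # P(math) = m/N
p_cs_given_math = Fraction(c, m)  # P(CS | math) = c/m under the assumption

print(expected_both(N, p_math, p_cs_given_math))  # 3
```

Under that assumption the expected overlap is simply c, which differs from what an independence assumption would give for the same N, m and c.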

Last edited: May 20, 2015

6. May 20, 2015

In the case of independence, the problem becomes straightforward: P(math, CS) = P(math) P(CS).
The expected number of people good at both math and CS becomes N [P(math) P(CS)].
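The independent case can be sketched in a few lines of Python. The function name `expected_both_independent` is made up for illustration, and the inputs (4 people, 2 good at math, 2 good at CS) are just a small example.

```python
from fractions import Fraction

def expected_both_independent(N, m, c):
    """Expected count good at both, assuming independence: N * (m/N) * (c/N)."""
    p_math = Fraction(m, N)
    p_cs = Fraction(c, N)
    return N * p_math * p_cs  # algebraically equal to m*c/N

print(expected_both_independent(4, 2, 2))  # 1
```

Exact fractions are used so that the result is not blurred by floating-point rounding.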

7. May 20, 2015

Once we obtain the probability of being good at both math and CS, P(math, CS), we can calculate the probability of having r people who are good at both. This is a binomial probability distribution, with P(math, CS) as the success rate; let's call it $u$. So $P(r) = C_{N,r}\, u^r (1-u)^{N-r}$, where $C_{N,r}$ is the binomial coefficient (the number of combinations of r out of N), equal to $\frac{N!}{r!\,(N-r)!}$.

8. May 20, 2015

### Ahmed Abdullah

Suppose we have 4 people, with 2 people good at math and 2 people good at computer science. According to your formula we get a nonzero probability for r = 3.
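This objection can be checked numerically. The sketch below evaluates the binomial formula for the 4-person example, taking the success rate as u = (2/4)(2/4) = 1/4 under the independence assumption; `binomial_pmf` is just an illustrative name.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(N, r, u):
    """Binomial probability of exactly r successes in N trials with rate u."""
    return comb(N, r) * u**r * (1 - u)**(N - r)

N, m, c = 4, 2, 2
u = Fraction(m, N) * Fraction(c, N)  # success rate under independence: 1/4

for r in range(N + 1):
    print(r, binomial_pmf(N, r, u))
# r = 3 gets probability 3/64 and r = 4 gets 1/256, even though at most
# 2 people can be good at both when m = c = 2.
```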

9. May 20, 2015

### Ahmed Abdullah

The formula I have derived is P(r) = C(N,r) * C(N-r, m-r) * C(N-m, c-r) / (C(N,m) * C(N,c))
N=Total people
m=number of people good at math
c=number of people good at computer
r= number of people good at both
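For small inputs the formula can be evaluated exactly. The following is a minimal sketch, taking the second factor of the denominator to be C(N, c), the number of ways to choose the c people good at computer science; the function name `overlap_pmf` is hypothetical.

```python
from fractions import Fraction
from math import comb

def overlap_pmf(N, m, c, r):
    """P(exactly r people are good at both), for fixed N, m and c."""
    # Impossible overlaps: negative, larger than either group, or so small
    # that the remaining c - r CS people do not fit among the N - m non-math people.
    if r < 0 or r > min(m, c) or c - r > N - m:
        return Fraction(0)
    numerator = comb(N, r) * comb(N - r, m - r) * comb(N - m, c - r)
    return Fraction(numerator, comb(N, m) * comb(N, c))

for r in range(5):
    print(r, overlap_pmf(4, 2, 2, r))
# For N = 4, m = 2, c = 2: P(0) = 1/6, P(1) = 2/3, P(2) = 1/6, P(3) = P(4) = 0
```

The probabilities sum to 1 and r = 3 gets probability zero, as required when only 2 people are good at each subject.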

I am assuming it is correct, but it contains large N, m and c. I guess I have to use Stirling's approximation. Hoping for some direction.
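One possible direction, sketched here as an alternative to applying Stirling's approximation by hand: work with logarithms of the binomial coefficients using the log-gamma function (`math.lgamma` in Python's standard library), so that no huge factorial is ever formed. The names `log_comb` and `overlap_pmf_log` are made up for this sketch, and the second denominator factor is taken to be C(N, c).

```python
from math import exp, lgamma

def log_comb(n, k):
    """Natural log of C(n, k), using lgamma(x + 1) = log(x!)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def overlap_pmf_log(N, m, c, r):
    """P(exactly r good at both), computed in log space as a float."""
    if r < 0 or r > min(m, c) or c - r > N - m:
        return 0.0
    log_p = (log_comb(N, r) + log_comb(N - r, m - r) + log_comb(N - m, c - r)
             - log_comb(N, m) - log_comb(N, c))
    return exp(log_p)

# Large, purely illustrative values: 100000 people, 300 math, 500 CS.
print(overlap_pmf_log(100000, 300, 500, 2))
```

The result is a floating-point approximation, so very small probabilities may underflow to 0.0; comparing `log_p` values directly avoids that if only relative sizes matter.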

10. May 20, 2015

The binomial formula assumes that the outcomes for people who are good at both math and CS form a complete probability space. In other words, if N' represents the number of people who are good at both, then N' should be the number that appears in my formula, not the original N, which is the total number of people. N' is then calculated from the formula in the last line of post #5.

11. May 20, 2015

This is not correct, because in this example we cannot have more than 2 people who are good at both math and CS.

12. May 20, 2015

### Ahmed Abdullah

Actually, I am interested in the cases where resources are limited (4 people, 2 math people and 2 computer people in the example).

13. May 20, 2015

### mathman

If independent, the probability of both is simply the product of the probabilities of each. The expected number is then the probability times the size of the population - result $\frac{mc}{N}$.

14. May 20, 2015

### Ahmed Abdullah

I am interested in the cases where the people are fixed. For example, if we have 4 people, of whom 2 are good at computer science and 2 are good at math, then it is impossible to have 3 people good at both. That is different from your approach.

The formula I have derived is P(r) = C(N,r) * C(N-r, m-r) * C(N-m, c-r) / (C(N,m) * C(N,c))
N=Total people
m=number of people good at math
c=number of people good at computer
r= number of people good at both

I am not sure whether the expected number matches between your case and mine, but the scenario is different. I am wondering whether there is a standard formula for this particular kind of problem.
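One observation that may help here, offered as a sketch rather than a definitive answer: since $\binom{N}{r}\binom{N-r}{m-r} = \binom{N}{m}\binom{m}{r}$, and reading the second denominator factor as $\binom{N}{c}$, the formula above appears to reduce to the standard hypergeometric distribution, whose mean is known in closed form:

$$P(r) = \frac{\binom{N}{r}\binom{N-r}{m-r}\binom{N-m}{c-r}}{\binom{N}{m}\binom{N}{c}} = \frac{\binom{m}{r}\binom{N-m}{c-r}}{\binom{N}{c}}, \qquad E[r] = \frac{mc}{N}.$$

For the 4-person example this gives $P(0) = \frac{1}{6}$, $P(1) = \frac{2}{3}$, $P(2) = \frac{1}{6}$ and $E[r] = 1$.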

15. May 21, 2015

### mathman

I don't understand what you are looking for, but you seem to be overthinking. The formula I gave has an answer of 1 for good at both. Half the people are good at math and half at computer science. Independence leads to 1/4 good at both. The mean number is (1/4)(4) = 1. If you are looking for the distribution function, that is another question.

16. May 22, 2015

### FactChecker

People being fixed does not matter. Their answer for expected value = c*m/N is correct. The expected value will always be within the range of possible results, so you do not have to worry about "3 people good at both computer and math". In your example, you would expect 2*2/4 = 1 person to be good at both math and computers. And you must admit that that is correct. However, the expected value may not be an integer, which is totally impossible to ever actually occur for an integer number of people.
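The agreement of the two expected values can be checked exactly for small cases. The sketch below sums r · P(r) over the fixed-population formula from earlier in the thread (with the second denominator factor taken as C(N, c)) and compares it with c*m/N; the function name `expected_overlap` is hypothetical.

```python
from fractions import Fraction
from math import comb

def expected_overlap(N, m, c):
    """Exact mean of r under the fixed-population formula from the thread."""
    denominator = comb(N, m) * comb(N, c)
    total = Fraction(0)
    for r in range(min(m, c) + 1):
        ways = comb(N, r) * comb(N - r, m - r) * comb(N - m, c - r)
        total += r * Fraction(ways, denominator)
    return total

print(expected_overlap(4, 2, 2))  # 1, matching 2*2/4
print(expected_overlap(5, 2, 3))  # 6/5, matching 2*3/5, a non-integer mean
```

The second call uses illustrative numbers (5 people, 2 good at math, 3 good at CS) to show a mean that no single outcome can equal.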