Neat Workable Formula for Probability of Co-Occurrence in a Fixed Group

  • Thread starter: Ahmed Abdullah
  • Tags: Probability
Ahmed Abdullah
Of N people in total, m are good at mathematics and c are good at computer science. What is the expected number of people good at both mathematics and computer science? Or, what is the probability that exactly r people are good at both?

I have derived the formula, but it contains n!, p!, c!, etc., which are difficult to compute for large values (my real problem has large values for all of these). I am looking for a neat, workable formula. I am hoping one exists, since it is such a basic problem.
 
You don't have enough information to find this.
 
It very much depends on how being good at one thing depends on being good at the other. For example, if c < m but everyone good at CS is also good at math, then exactly c people are good at both: the probability of at least r people being good at both is P=0 for r > c and P=1 otherwise.
 
What if being good at math and being good at computer science are independent of each other?
 
Ahmed Abdullah said:
Of N people in total, m are good at mathematics and c are good at computer science. What is the expected number of people good at both mathematics and computer science? Or, what is the probability that exactly r people are good at both?

I have derived the formula, but it contains n!, p!, c!, etc., which are difficult to compute for large values (my real problem has large values for all of these). I am looking for a neat, workable formula. I am hoping one exists, since it is such a basic problem.
That depends on whether people's performance in math and in computer science (CS) is dependent or not.

In your example, the probability of a person being good at math is m/N, and the probability of being good at CS is c/N. Let's call the first one P(math) and the second one P(CS). Then the probability of being good at both math and computer science is P(math, CS) = P(math) P(CS|math), where P(CS|math) is the conditional probability of being good at CS given that the person is good at math. Once P(math, CS) is calculated, the expected number of people good at both math and CS is N · P(math, CS).
 
Ahmed Abdullah said:
What if being good at math and being good at computer science are independent of each other?
In the case of independence, the problem becomes straightforward: P(math, CS) = P(math) P(CS).
The expected number of people good at both math and CS becomes N · P(math) · P(CS).
 
Ahmed Abdullah said:
Or, what is the probability that exactly r people are good at both?
Once we obtain the probability of being good at both math and CS, P(math, CS), we can calculate the probability of having r people who are good at both. This is a binomial probability distribution, with P(math, CS) regarded as the success rate; let's call it u. So P(r) = C(N, r) u^r (1 - u)^(N - r), where C(N, r) is the binomial coefficient (the number of combinations of r items out of N), equal to N!/(r!(N-r)!).
 
Adel Makram said:
Once we obtain the probability of being good at both math and CS, P(math, CS), we can calculate the probability of having r people who are good at both. This is a binomial probability distribution, with P(math, CS) regarded as the success rate; let's call it u. So P(r) = C(N, r) u^r (1 - u)^(N - r), where C(N, r) is the binomial coefficient (the number of combinations of r items out of N), equal to N!/(r!(N-r)!).

Suppose we have 4 people, with 2 good at math and 2 good at computer science. According to your formula, we get a nonzero probability for r = 3.
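The objection can be checked numerically. The following is a minimal sketch, assuming the success rate is taken as u = (m/N)(c/N) under independence; the function name `binomial_pmf` is made up for illustration.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, r, u):
    """P(exactly r successes in n independent trials with success rate u)."""
    return comb(n, r) * u**r * (1 - u)**(n - r)

N, m, c = 4, 2, 2
u = Fraction(m, N) * Fraction(c, N)  # assumed success rate under independence: 1/4
p3 = binomial_pmf(N, 3, u)
print(p3)  # 3/64
```

The binomial model assigns probability 3/64 to r = 3, even though at most 2 people can be good at both when only 2 are good at each subject.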
 
The formula I have derived is P(r) = C(N, r) * C(N-r, m-r) * C(N-m, c-r) / (C(N, m) * C(N, c))
N = total number of people
m = number of people good at math
c = number of people good at computer science
r = number of people good at both

I am assuming it is correct, but it involves large N, m and c. I guess I have to use Stirling's approximation. Hoping for some direction.
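One possible way to evaluate this formula for large values without forming any factorial is to work with logarithms of the binomial coefficients. The sketch below uses Python's `math.lgamma` (the log-gamma function), which plays the role Stirling's approximation would. It assumes the last factor in the denominator is C(N, c), that 0 <= m, c <= N, and that the two groups are chosen uniformly at random and independently. The names `log_comb` and `overlap_pmf` are made up for illustration.

```python
from math import exp, lgamma

def log_comb(n, k):
    """Natural log of C(n, k); -inf when the coefficient is zero."""
    if k < 0 or k > n:
        return float("-inf")
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def overlap_pmf(N, m, c, r):
    """P(exactly r people are in both groups), from the formula in the post.

    Assumption: the math group (size m) and the CS group (size c) are
    drawn uniformly at random and independently from the N people.
    """
    log_p = (log_comb(N, r) + log_comb(N - r, m - r) + log_comb(N - m, c - r)
             - log_comb(N, m) - log_comb(N, c))
    return exp(log_p)  # exp(-inf) is 0.0, so impossible values of r give 0

# Small case from the thread: 4 people, 2 good at each subject.
print([overlap_pmf(4, 2, 2, r) for r in range(4)])

# Large case: the probabilities should still sum to about 1.
total = sum(overlap_pmf(10**6, 1000, 2000, r) for r in range(1001))
print(total)
```

Only sums and differences of log-gamma values are formed, so nothing overflows even when N is in the millions.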
 
  • #10
Ahmed Abdullah said:
Suppose we have 4 people, with 2 good at math and 2 good at computer science. According to your formula, we get a nonzero probability for r = 3.
The binomial formula treats the outcomes for the people who are good at both math and CS as a complete probability space. In other words, if N′ is the number of people who are good at both, then N′, not the total number of people N, is the number that should appear in my formula. N′ is calculated from the formula in the last line of post #5.
 
  • #11
Ahmed Abdullah said:
Suppose we have 4 people, with 2 good at math and 2 good at computer science. According to your formula, we get a nonzero probability for r = 3.
This is not correct, because in this example we cannot have more than 2 people who are good at both math and CS.
 
  • #12
Actually, I am interested in the cases where resources are limited (4 people, 2 math people and 2 computer people in the example).
 
  • #13
Ahmed Abdullah said:
What if being good at math and being good at computer science are independent of each other?
If independent, the probability of both is simply the product of the probabilities of each. The expected number is then that probability times the size of the population: (m/N)(c/N)N = mc/N.
 
  • #14
mathman said:
If independent, the probability of both is simply the product of the probabilities of each. The expected number is then that probability times the size of the population: (m/N)(c/N)N = mc/N.
I am interested in the cases where the numbers of people are fixed. For example, if we have 4 people, 2 good at computer science and 2 good at math, then it is impossible to have 3 people good at both. This is different from your approach.

The formula I have derived is P(r) = C(N, r) * C(N-r, m-r) * C(N-m, c-r) / (C(N, m) * C(N, c))
N = total number of people
m = number of people good at math
c = number of people good at computer science
r = number of people good at both

I am not sure whether the expected number matches between your case and mine, but the scenario is different. I am wondering whether there is a standard formula for this particular kind of problem.
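On the question of a standard formula: taking the last factor of the denominator to be C(N, c), and assuming the two groups are picked uniformly at random and independently, the derived expression appears to reduce to the hypergeometric distribution. A sketch of the algebra:

```latex
\binom{N}{r}\binom{N-r}{m-r}
  = \frac{N!}{r!\,(m-r)!\,(N-m)!}
  = \binom{N}{m}\binom{m}{r}
\quad\Longrightarrow\quad
P(r) = \frac{\binom{m}{r}\binom{N-m}{c-r}}{\binom{N}{c}},
\qquad \max(0,\, m+c-N) \le r \le \min(m,\, c).
```

This is the hypergeometric probability of r "math" people among c people drawn without replacement from N, and its mean is mc/N.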
 
  • #15
I don't understand what you are looking for, but you seem to be overthinking. The formula I gave yields an answer of 1 for the number good at both. Half the people are good at math and half at computers. Independence leads to 1/4 being good at both, so the mean number is (1/4)(4) = 1. If you are looking for the distribution function, that is another question.
 
  • #16
Ahmed Abdullah said:
I am interested in the cases where the numbers of people are fixed. For example, if we have 4 people, 2 good at computer science and 2 good at math, then it is impossible to have 3 people good at both. This is different from your approach.
People being fixed does not matter. Their answer for the expected value, c*m/N, is correct. The expected value will always be within the range of possible results, so you do not have to worry about "3 people good at both computer and math". In your example, you would expect 2*2/4 = 1 person to be good at both math and computers, and you must admit that that is correct. However, the expected value may not be an integer, in which case it can never actually occur as a count of people.
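The claim that the expected value is c*m/N even with fixed group sizes can be checked exactly for small cases. This is a sketch that uses the fixed-group formula from earlier in the thread, with C(N, c) as the last factor of the denominator, and assumes the groups are chosen uniformly at random. The function names are made up for illustration.

```python
from fractions import Fraction
from math import comb

def overlap_pmf_exact(N, m, c, r):
    """Exact P(exactly r people in both groups) for fixed group sizes m and c."""
    if r < 0 or r > min(m, c) or c - r > N - m:
        return Fraction(0)  # impossible overlap
    return Fraction(comb(N, r) * comb(N - r, m - r) * comb(N - m, c - r),
                    comb(N, m) * comb(N, c))

def expected_overlap(N, m, c):
    """Mean of the overlap distribution, summed term by term."""
    return sum(r * overlap_pmf_exact(N, m, c, r) for r in range(min(m, c) + 1))

print(expected_overlap(4, 2, 2))  # 1
print(expected_overlap(5, 2, 2))  # 4/5
```

In the second case the mean 4/5 is not an integer, so it lies within the range of possible results but can never be observed as an actual count.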
 