Neat Workable Formula for Probability of Co-Occurrence in a Fixed Group

  • Context: Graduate 
  • Thread starter: Ahmed Abdullah
  • Tags: Probability

Discussion Overview

The discussion revolves around calculating the expected number of individuals proficient in both mathematics and computer science within a fixed group of people. Participants explore various probabilistic models, including considerations of independence and dependence between the two skills, as well as the challenges posed by large values in combinatorial calculations.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant presents a formula for the expected number of individuals proficient in both subjects but notes difficulties in computation due to factorial terms.
  • Another participant argues that the relationship between proficiency in mathematics and computer science affects the outcome, suggesting that if all individuals proficient in computer science are also proficient in mathematics, certain probabilities become zero or one based on the values of m and c.
  • Independence of the two skills is proposed as a simplifying assumption, leading to straightforward calculations of probabilities and expected values.
  • Several participants discuss the implications of fixed resources, questioning the validity of certain probabilities when the total number of individuals is limited.
  • A specific formula for calculating the probability of having r individuals proficient in both subjects is shared, but its correctness is debated in light of fixed group sizes.
  • Concerns are raised about the interpretation of expected values, particularly regarding non-integer results and their practical implications.

Areas of Agreement / Disagreement

Participants express differing views on the assumptions of independence versus dependence in skill proficiency, and whether fixed group sizes affect the validity of certain probabilistic models. There is no consensus on a single approach or formula that resolves these issues.

Contextual Notes

Limitations include the dependence on assumptions regarding independence, the complexity of factorial calculations for large values, and the implications of fixed group sizes on probability distributions.

Ahmed Abdullah
Of N people in total, m are good at mathematics and c are good at computer science. What is the expected number of people good at both mathematics and computer science? Or, what is the probability that exactly r people are good at both?

I have derived the formula, but it contains n!, p!, c!, etc., which are difficult to compute for large values (my real problem has large values for all of these). I am looking for a neat, workable formula; I am hoping one exists, since it is such a basic problem.
 
You don't have enough information to find this.
 
It very much depends on how being good at one subject is related to being good at the other. For example, if c < m and everyone who is good at computer science is also good at math, then the overlap is exactly c, so P = 1 for r = c and P = 0 otherwise.
 
What if being good at math and being good at computer science are independent of each other?
 
Ahmed Abdullah said:
Of total N people, m people are good at mathematics and c people are good at computer science. What is the expected number of people good at both mathematics and computer science? Or what is the probability that r people are good at both mathematics and computer science.

I have derived the formula. But it contains n!, p!, c! etc which is difficult to compute for large values (my real problem has large values for all of these). I am looking for a neat workable formula, I am hoping it exist since it is such a basic problem.
That depends on whether performance at math and performance at computer science (CS) are dependent or not.

In your example, the probability that a randomly chosen person is good at math is m/N, and the probability of being good at CS is c/N. Let's call the first P(math) and the second P(CS). Then the probability of being good at both math and computer science is P(math, CS) = P(math) P(CS|math), where P(CS|math) is the conditional probability of being good at CS given that the person is good at math. Once P(math, CS) is calculated, the expected number of people good at both math and CS is N · P(math, CS).
 
Ahmed Abdullah said:
What if being good at math and computer science is independent of each other.
In the case of independence, the problem becomes straightforward: P(math, CS) = P(math) P(CS).
The expected number of people good at both math and CS becomes N [P(math) P(CS)].
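A minimal sketch of the two cases just described, for anyone who wants to plug in numbers. The function name `expected_both` and the optional argument `p_cs_given_math` are made up for illustration; when the conditional probability is not supplied, the sketch assumes independence, so P(CS|math) = P(CS) = c/N.

```python
def expected_both(N, m, c, p_cs_given_math=None):
    """Expected number of people good at both: N * P(math) * P(CS | math)."""
    p_math = m / N
    # Assumption: with no conditional probability given, treat the skills as
    # independent, so P(CS | math) is simply P(CS) = c / N.
    if p_cs_given_math is None:
        p_cs_given_math = c / N
    return N * p_math * p_cs_given_math


print(expected_both(4, 2, 2))       # independence: 4 * (2/4) * (2/4)
print(expected_both(4, 2, 2, 1.0))  # every math person is also good at CS
```

Under independence this reduces to m*c/N; any other dependence structure has to be supplied through the conditional probability.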
 
Ahmed Abdullah said:
Or what is the probability that r people are good at both mathematics and computer science.
Now, once we obtain the probability of being good at both math and CS, P(math, CS), we can calculate the probability of having r people who are good at both. This is a binomial probability distribution, in which P(math, CS) is regarded as the success rate; let's call it u. So [itex]P(r) = C_{N,r}\, u^r (1-u)^{N-r}[/itex], where [itex]C_{N,r}[/itex] is the binomial coefficient (the number of combinations of r out of N), which is equal to [itex]\frac{N!}{r!\,(N-r)!}[/itex].
 
Adel Makram said:
Now, once we obtain the probability of being good at both math and CS, P(math, CS), we can calculate the probability of having r people who are good at both. This is a binomial probability distribution, in which P(math, CS) is regarded as the success rate; let's call it u. So [itex]P(r) = C_{N,r}\, u^r (1-u)^{N-r}[/itex], where [itex]C_{N,r}[/itex] is the binomial coefficient (the number of combinations of r out of N), which is equal to [itex]\frac{N!}{r!\,(N-r)!}[/itex].

Suppose we have 4 people, with 2 good at math and 2 good at computer science. According to your formula, we get a nonzero probability for r = 3.
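This can be checked numerically. The sketch below evaluates the binomial formula from the quoted post for N = 4, m = 2, c = 2, taking u = (m/N)(c/N) as the success rate (the independence assumption); `binomial_pmf` is just an illustrative name.

```python
from math import comb


def binomial_pmf(r, N, u):
    """Binomial probability of exactly r successes in N trials with rate u."""
    return comb(N, r) * u**r * (1 - u) ** (N - r)


N, m, c = 4, 2, 2
u = (m / N) * (c / N)  # success rate under the independence assumption
for r in range(N + 1):
    print(r, binomial_pmf(r, N, u))
```

The values for r = 3 and r = 4 come out positive, even though with only 2 people good at each subject no more than 2 can be good at both.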
 
The formula I have derived is P(r) = C(N, r) * C(N-r, m-r) * C(N-m, c-r) / (C(N, m) * C(N, c))
N=Total people
m=number of people good at math
c=number of people good at computer
r= number of people good at both

I am assuming it is correct, but it involves large N, m and c. I guess I have to use Stirling's approximation. Hoping for some direction.
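One possible direction, offered as a sketch rather than a definitive answer: instead of approximating the factorials by hand, work with logarithms of the binomial coefficients through the log-gamma function, which Python exposes as `math.lgamma`. The code assumes the last factor in the denominator is meant to be C(N, c), and the helper names `log_comb`, `log_p_overlap` and `p_overlap` are made up for illustration.

```python
from math import exp, lgamma


def log_comb(n, k):
    """log C(n, k) via log-gamma, so n! is never formed explicitly."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def log_p_overlap(r, N, m, c):
    """log P(r) for the formula above; -inf when r is not a feasible overlap."""
    # Feasible overlaps satisfy max(0, m + c - N) <= r <= min(m, c).
    if r < max(0, m + c - N) or r > min(m, c):
        return float("-inf")
    return (
        log_comb(N, r)
        + log_comb(N - r, m - r)
        + log_comb(N - m, c - r)
        - log_comb(N, m)
        - log_comb(N, c)  # assumption: C(N, n) in the formula is read as C(N, c)
    )


def p_overlap(r, N, m, c):
    """P(r) as a float; exp(-inf) evaluates to 0.0 for infeasible r."""
    return exp(log_p_overlap(r, N, m, c))


print(p_overlap(1, 4, 2, 2))
print(p_overlap(60, 100_000, 3000, 2000))
```

Keeping everything in log space until the final `exp` avoids overflow for large N, m and c; for very small probabilities it may be preferable to stay in log space throughout.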
 
  • #10
Ahmed Abdullah said:
If we have 4 people with 2 people good at math and 2 people good at computer. According to your formula we get a non zero probability for r=3 .
The binomial formula assumes that the possible numbers of people who are good at both math and CS form a complete probability space. In other words, if N′ represents the number of people who are good at both, then N′ is the number that should appear in my formula, not the original N, which is the total number of people. N′ is then calculated from the formula in the last line of post #5.
 
  • #11
Ahmed Abdullah said:
If we have 4 people with 2 people good at math and 2 people good at computer. According to your formula we get a non zero probability for r=3 .
This is not correct, because in this example we cannot have more than 2 people who are good at both math and CS.
 
  • #12
Actually, I am interested in the cases where resources are limited (4 people, 2 math guys and 2 computer guys in the example).
 
  • #13
Ahmed Abdullah said:
What if being good at math and computer science is independent of each other.
If independent, the probability of both is simply the product of the probabilities of each. The expected number is then the probability times the size of the population - result [itex]\frac{mc}{N}[/itex].
 
  • #14
mathman said:
If independent, the probability of both is simply the product of the probabilities of each. The expected number is then the probability times the size of the population - result [itex]\frac{mc}{N}[/itex].
I am interested in the cases where the people are fixed. For example, if we have 4 people, 2 of whom are good at computer science and 2 at math, then it is impossible to have 3 people good at both. That is different from your approach.

The formula I have derived is P(r) = C(N, r) * C(N-r, m-r) * C(N-m, c-r) / (C(N, m) * C(N, c))
N=Total people
m=number of people good at math
c=number of people good at computer
r= number of people good at both

I am not sure whether the expected number is the same in your case and mine, but the scenario is different. I am wondering whether there is a standard formula for this particular kind of problem.
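For what it is worth, the formula above appears to reduce to a standard one. This is a sketch under two assumptions: the second factor in the denominator is read as C(N, c), and the set of m math people and the set of c computer people are each chosen uniformly at random, independently of one another. The numerator simplifies through the identity

[itex]\binom{N}{r}\binom{N-r}{m-r} = \frac{N!}{r!\,(m-r)!\,(N-m)!} = \binom{N}{m}\binom{m}{r}[/itex]

so that

[itex]P(r) = \frac{\binom{N}{r}\binom{N-r}{m-r}\binom{N-m}{c-r}}{\binom{N}{m}\binom{N}{c}} = \frac{\binom{m}{r}\binom{N-m}{c-r}}{\binom{N}{c}}[/itex]

which is the hypergeometric distribution: drawing c people from a population of N that contains m math people, and counting how many math people are drawn. Its mean is

[itex]E[r] = \sum_r r\,P(r) = \frac{mc}{N}[/itex]

so under these assumptions the expected number agrees with the independence result, while P(r) is zero for every r outside max(0, m+c-N) ≤ r ≤ min(m, c).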
 
  • #15
I don't understand what you are looking for, but you seem to be overthinking. The formula I gave has an answer of 1 for good at both. Half the people are good at math and half at computer. Independence leads to 1/4 good at both. The mean number is (1/4)(4) = 1. If you are looking for the distribution function, that is another question.
 
  • #16
Ahmed Abdullah said:
I am interested in the cases where people are fixed. For example if we have 4 people , 2 are good at computer and 2 are math then it is impossible to have 3 people good at both computer and math. Which is different than your approach.
People being fixed does not matter. The answer given above for the expected value, c*m/N, is correct. The expected value will always lie within the range of possible results, so you do not have to worry about "3 people good at both computer and math". In your example, you would expect 2*2/4 = 1 person to be good at both math and computers, and you must admit that that is correct. However, the expected value need not be an integer, in which case it is a value that can never actually occur as a count of people.
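A small exact check of these claims, as a sketch. It assumes the fixed-group model in which the c computer people are a uniformly random subset of the N people, so the overlap follows the hypergeometric form C(m, r) C(N-m, c-r) / C(N, c), which is algebraically the same as the P(r) formula posted earlier once C(N, n) is read as C(N, c). The function names are illustrative only.

```python
from fractions import Fraction
from math import comb


def overlap_distribution(N, m, c):
    """Exact P(r) for every feasible overlap r, as Fractions."""
    lo, hi = max(0, m + c - N), min(m, c)
    return {
        r: Fraction(comb(m, r) * comb(N - m, c - r), comb(N, c))
        for r in range(lo, hi + 1)
    }


def mean_overlap(N, m, c):
    """Exact expected overlap, summed over the feasible values of r only."""
    return sum(r * p for r, p in overlap_distribution(N, m, c).items())


print(mean_overlap(4, 2, 2))  # compare with Fraction(2 * 2, 4)
print(mean_overlap(5, 2, 2))  # compare with Fraction(2 * 2, 5), not an integer
```

In both cases the mean equals m*c/N even though the distribution only puts mass on overlaps from 0 to 2.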
 
