Probabilities and random variables

AI Thread Summary
In a society where 15% have sickness "Sa" and 20% of those have sickness "Sb," the probabilities calculated are P(A) = 15%, P(B) = 7.25%, and P(A∩B) = 3%. When considering a sample of 10 people, the random variable X represents the number of individuals with both sicknesses, which can range from 0 to 10. Two methods were discussed for calculating the probabilities of X, with one using a probability tree and the other assuming a population size for hypergeometric distribution. The consensus is that while both methods yield similar results for large populations, the binomial distribution approach is simpler and more practical for this scenario.
Mohamed BOUCHAKOUR

Homework Statement


In a given society, 15% of people have the sickness "Sa", and 20% of them also have the sickness "Sb".
Among those who don't have the sickness "Sa", 5% have the sickness "Sb".
1- We randomly choose a person, and we define:
A:"the person having Sa"
B:"the person having Sb"

- Calculate: P(A), P(B) and P(A∩B)

2- We take 10 people from this society. We define X as the random variable equal to the number of people who have both Sa and Sb.
- Give the possible values of X and their probabilities.

The first one is pretty easy; I need some help with the second.

Homework Equations


Results of the first question:
P(A∩B)=3%
P(A)=15%
P(B)=725/10000
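
A quick check of these values, sketched with the multiplication rule and the law of total probability. The notation ##\bar A## for "does not have Sa" is an assumption introduced here, not part of the original statement:

$$P(A \cap B) = P(A)\,P(B \mid A) = 0.15 \times 0.20 = 0.03$$

$$P(B) = P(A)\,P(B \mid A) + P(\bar A)\,P(B \mid \bar A) = 0.15 \times 0.20 + 0.85 \times 0.05 = 0.03 + 0.0425 = 0.0725$$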

The Attempt at a Solution


X can take any integer value from 0 to 10.
I tried two things:
1- Using the probability tree:
- ##P(X=0) = (1-P(A \cap B))^{10} \approx 0.74##
- ##P(X=1) = 10 \cdot P(A \cap B) \cdot (1-P(A \cap B))^{9} \approx 0.23##
But then it gets a bit too complicated to work out how many times each branch is repeated.

2- Suppose this society consists of 1000 people, so 30 of them will have both Sa and Sb:
- ##P(X=0) = C^{970}_{10}/C^{1000}_{10} \approx 0.74##
- ##P(X=1) = C^{30}_{1}\, C^{970}_{9}/C^{1000}_{10} \approx 0.23##
...

I think both are correct, but the first gets way too complicated for the larger values.
Also, is there another method that avoids assuming the number of people in the society?
 

No, they are not both correct. In principle, the second way is likely more accurate, but the details depend on the exact size of the whole population. (Furthermore, if anything, it is more complicated to calculate than the first way.) However, we are saved by the fact that for large populations both ways give almost identical results. In other words, for large populations, the hypergeometric distribution (the second way) becomes essentially indistinguishable from the much simpler binomial distribution (the first way).
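
To see how close the two ways get, here is a minimal sketch that compares them for a few society sizes. The function names and the sizes 100, 1000 and 100000 are assumptions chosen for illustration; only ##n = 10## and ##p = 0.03## come from the problem.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) when each of n people independently has probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def hypergeometric_pmf(k, n, population, affected):
    """P(X = k) when n people are drawn without replacement from a
    population that contains `affected` people with both sicknesses."""
    return comb(affected, k) * comb(population - affected, n - k) / comb(population, n)

n, p = 10, 0.03
for population in (100, 1_000, 100_000):   # hypothetical society sizes
    affected = round(p * population)       # 3% of the society
    row = [hypergeometric_pmf(k, n, population, affected) for k in range(3)]
    print(population, [f"{v:.4f}" for v in row])

print("binomial", [f"{binomial_pmf(k, n, p):.4f}" for k in range(3)])
```

The hypergeometric rows should drift toward the binomial row as the assumed population grows, which is the sense in which the assumption about the society's size stops mattering.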

For the binomial case you can easily use a calculator (or a spreadsheet, or an on-line calculator) to give a complete table of probability values P(N=n) for n = 0,1,2, ... , 10. Furthermore, you can do it recursively: from P(N=0) you can do some simple multiplications to get P(N=1). From P(N=1) it is a simple matter of some multiplications to get to P(N=2), etc. It really is NOT that complicated, especially if you think it through first.
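
One possible reading of the recursive approach, as a sketch: the ratio of consecutive binomial probabilities is ##\frac{P(N=k+1)}{P(N=k)} = \frac{n-k}{k+1}\cdot\frac{p}{q}##, so each value follows from the previous one by a couple of multiplications. The variable names below are assumptions for illustration.

```python
n = 10          # sample size
p = 0.03        # P(A ∩ B), from the first question
q = 1 - p

probs = [q**n]  # start from P(N = 0)
for k in range(n):
    # P(N = k+1) = P(N = k) * (n - k)/(k + 1) * p/q
    probs.append(probs[-1] * (n - k) / (k + 1) * p / q)

for k, value in enumerate(probs):
    print(f"P(N={k:2d}) = {value:.6g}")
```

A useful sanity check on a table built this way is that the eleven values add up to 1.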
 
I agree with Ray, and:

I would expect that giving the full expression for the binomial distribution constitutes a valid answer for this exercise?
 
BvU said:
I agree with Ray, and:

I would expect that giving the full expression for the binomial distribution constitutes a valid answer for this exercise?

Probably not: since we didn't go over the Binomial Distribution Formula in class, it would be better to elaborate more.
 
Ray Vickson said:
No, they are not both correct. In principle, the second way is likely more accurate, but the details depend on the exact size of the whole population. (Furthermore, if anything, it is more complicated to calculate than the first way.) However, we are saved by the fact that for large populations both ways give almost identical results. In other words, for large populations, the hypergeometric distribution (the second way) becomes essentially indistinguishable from the much simpler binomial distribution (the first way).
I didn't think about it this way. I'm just starting with probabilities, so I didn't know most of this, but you led me to do some research on Google, thanks for that :wink:

Ray Vickson said:
For the binomial case you can easily use a calculator (or a spreadsheet, or an on-line calculator) to give a complete table of probability values P(N=n) for n = 0,1,2, ... , 10. Furthermore, you can do it recursively: from P(N=0) you can do some simple multiplications to get P(N=1). From P(N=1) it is a simple matter of some multiplications to get to P(N=2), etc. It really is NOT that complicated, especially if you think it through first.

Since I didn't know the Binomial Distribution Formula (we didn't go over it in class), the only thing I had was the probability tree.
And thanks to my stupidity, I couldn't think of a way to find how many ways there are to choose in each case (except guessing); this is why I said "complicated".
 

You were starting out writing down the first two probabilities, but then gave up. You say it became "too complicated", but then went on to write formulas like ##C^{30}_1 C^{970}_9 /C^{1000}_{10},## etc., and that is way more complicated than what you would get in the first way. You obviously know about binomial coefficients ##C^n_m,## so you know about the needed tools.

Anyway, you should get in the habit of shortening what you write, by using sensible notation. For example, you could say "let ##p = P(A \cap B) = 0.03## and ##q = 1-p = 0.97##". Then ##P(N=0) = q^{10}, P(N=1) = 10\, p\, q^9,## etc. Writing ##P(A \cap B)## over and over again really is a waste of time, and is also much harder to read.
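
On the question of "how many times it's repeated" in the probability tree, a brute-force sketch may help: list every path through the tree for 10 people and count the paths with exactly ##k## people having both sicknesses. The count should match the binomial coefficient ##C^{10}_k##. The names below are assumptions for illustration.

```python
from itertools import product
from math import comb

n = 10
p, q = 0.03, 0.97   # p = P(A ∩ B), q = 1 - p

# Each path through the tree is a tuple of 0/1 flags, one per person
# (1 = has both sicknesses). There are 2**10 = 1024 paths in total.
paths_with_k = [0] * (n + 1)
for path in product((0, 1), repeat=n):
    paths_with_k[sum(path)] += 1

for k in range(n + 1):
    assert paths_with_k[k] == comb(n, k)        # the count is C(10, k)
    # Every path with k flagged people has the same probability p^k q^(n-k).
    prob = paths_with_k[k] * p**k * q**(n - k)
    print(k, paths_with_k[k], f"{prob:.3g}")
```

Enumerating all paths only works for small trees; the point is just that the multiplicity in front of ##p^k q^{10-k}## is the number of ways to choose which ##k## of the 10 people are affected.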
 