Mean, variance and correlation - Multinomial distribution

AI Thread Summary
The discussion revolves around calculating the means, variances, and covariance of two random variables, X and Y, representing the counts of red and green balls drawn from an urn until a black ball is picked. The problem involves understanding the multinomial distribution and the geometric distribution of draws. It is clarified that given the number of draws before a black ball is drawn, the counts of red and green balls follow a binomial distribution. The participants discuss methods for calculating expected values and variances using conditioning on the number of draws. The conversation emphasizes the importance of correctly applying probability concepts to derive the required statistical measures.
AwesomeTrains
Hello everyone, I'm stuck on an elementary stochastics problem. I have to calculate the means, variances and covariance of two random variables.

Homework Statement


Let ##r,g,b \in \mathbb{N}##. An urn contains ##r## red, ##g## green and ##b## black balls.
The balls are then drawn one at a time with replacement, until a black ball is picked for the first time. (1)
##X## counts the number of red balls and ##Y## the number of green ones drawn before the black one is picked.
Find ##EX##, ##EY##, ##\text{Var}(X)##, ##\text{Var}(Y)## and ##\rho(X,Y)=\text{cov}(X,Y)/(\sigma_X\sigma_Y)##

Homework Equations


Expectation value: ##E(X)=\sum_{i \in I}x_iP(X=x_i)##
Multinomial distribution for 3 different balls: ##P(X=x, Y=y, Z=z)=\frac{n!}{x!\,y!\,z!}p_1^x p_2^y p_3^z##, with ##n=x+y+z##

The Attempt at a Solution


The probabilities of drawing a red, a green and a black ball are ##p_1=\frac{r}{r+g+b}##, ##p_2=\frac{g}{r+g+b}## and ##p_3=\frac{b}{r+g+b}##.
I thought ##X## and ##Y## were binomially distributed and would therefore have the means ##EX=np_1## and ##EY=np_2##, but then condition (1) isn't used.
I then thought about calculating ##E(X)=\sum_{i \in I}x_iP(X=x, Y=y, Z=1)## with ##P(X=x, Y=y, Z=1)=\frac{(x+y+1)!}{x!\,y!\,1!}p_1^x p_2^y p_3##, where I let ##Z## denote the number of black balls drawn.
But I got stuck again, not knowing which variable to sum over, and I think it's wrong to write ##E(X)## because the sum is not only over the random variable ##X##.

Any tips are much appreciated
Alex
 
Since you are drawing with replacement, the number of draws ##Z## before the first black has a (modified) geometric distribution with success probability ##p_b = b/(r+g+b)##; that is, ##P(Z = k) = (1-p_b)^k p_b, \; k = 0,1,2, \ldots##. Write ##p_r = r/(r+g+b)## and ##p_g = g/(r+g+b)## for the probabilities of red and green. Given ##Z = k##, ##k \in \{0,1,2, \ldots \}##, can you see why the pair ##(X,Y)## has the binomial distribution ##X \sim \text{Bin}(k,p_r/(p_r + p_g))##? That is, ##X## is binomial and ##Y = k-X##. Using this, you can easily get ##E(X|Z=k)##, ##E(Y|Z=k)##, ##\text{Var}(X|Z=k)## and ##\text{Var}(Y|Z=k)##. Then, by "unconditioning", you can get ##EX, EY, \text{Var}(X), \text{Var}(Y)##. I'll let you worry about how to get ##\text{Cov}(X,Y)## by a conditioning-unconditioning argument.

BTW: it is not wrong to write ##E(X)## or ##\text{Var}(X)##; these are just the mean and variance of the marginal distribution of ##X##.
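
A simulation is a cheap way to sanity-check whatever formulas come out of the conditioning argument. The sketch below is not from the thread: the function names `draw_until_black` and `estimate_moments`, the sample size and the seed are all illustrative assumptions. It simply replays the urn experiment many times and reports sample moments of ##X## and ##Y##.

```python
import random

def draw_until_black(r, g, b, rng):
    """Draw with replacement until the first black ball; return (reds, greens)."""
    total = r + g + b
    reds = greens = 0
    while True:
        ball = rng.randrange(total)      # uniform pick among the r+g+b balls
        if ball < r:
            reds += 1
        elif ball < r + g:
            greens += 1
        else:                            # black ball: the experiment stops
            return reds, greens

def estimate_moments(r, g, b, n_runs=200_000, seed=0):
    """Monte Carlo estimates of EX, EY, Var(X), Var(Y) and the correlation."""
    rng = random.Random(seed)
    samples = [draw_until_black(r, g, b, rng) for _ in range(n_runs)]
    mean_x = sum(x for x, _ in samples) / n_runs
    mean_y = sum(y for _, y in samples) / n_runs
    var_x = sum((x - mean_x) ** 2 for x, _ in samples) / n_runs
    var_y = sum((y - mean_y) ** 2 for _, y in samples) / n_runs
    cov = sum((x - mean_x) * (y - mean_y) for x, y in samples) / n_runs
    return {"EX": mean_x, "EY": mean_y, "VarX": var_x, "VarY": var_y,
            "rho": cov / (var_x * var_y) ** 0.5}

if __name__ == "__main__":
    # Example urn (an arbitrary choice): 2 red, 3 green, 5 black.
    print(estimate_moments(r=2, g=3, b=5))
```

The estimates are only as good as the sample size, so they are a check on a derivation rather than a replacement for it.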
 
Thanks for the fast reply and for the help :)
I would never have thought of that. Thought I had chosen the right distribution.

Ray Vickson said:
can you see why the pair ##(X,Y)## given ##Z=k## has the binomial distribution ##X \sim \text{Bin}(k,p_r/(p_r+p_g))##?

Is it because there's either success (drawing the black ball) or failure (drawing a red one)?

Ray Vickson said:
That is, ##X## is binomial and ##Y=k-X##

How do you get this?

Then ##E(X|Z=k)=\sum_x x\frac{P(Z=k, X=x)}{P(Z=k)}##?
What is meant by ##P(Z=k, X=x)## here?
 
AwesomeTrains said:
Is it because there's either success (drawing the black ball) or failure (drawing a red one)?
No. It's because, having fixed the number of draws before a black is drawn at ##k## (note that Ray calls this ##Z##, which is not how you used ##Z## in the OP), we know that only red and green balls are drawn in the first ##k## draws. So that's just a binary result for each of ##k## trials.
AwesomeTrains said:
What is meant by ##P(Z=k,X=x)## here?
It's the joint probability, ##P[Z=k \text{ and } X=x]##.
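
The "binary result for each of ##k## trials" claim can be checked exactly for small ##k## by brute force. The sketch below (hypothetical helper names, not from the thread) enumerates every red/green sequence of length ##k## followed by a black, accumulates the joint probability ##P(X=x, Z=k)##, divides by ##P(Z=k)##, and compares the result with the ##\text{Bin}(k, p_r/(p_r+p_g))## probabilities. It uses exact fractions so the comparison involves no rounding.

```python
from fractions import Fraction
from itertools import product
from math import comb

def conditional_pmf_by_enumeration(r, g, b, k):
    """Return [P(X = x | Z = k) for x = 0..k] by listing all colour sequences."""
    total = r + g + b
    prob = {"R": Fraction(r, total), "G": Fraction(g, total), "B": Fraction(b, total)}
    joint = [Fraction(0)] * (k + 1)          # joint[x] accumulates P(X = x, Z = k)
    for seq in product("RG", repeat=k):      # the first k draws are red or green only
        p = prob["B"]                        # draw number k+1 is the black ball
        for colour in seq:
            p *= prob[colour]
        joint[seq.count("R")] += p
    p_z = sum(joint)                         # P(Z = k)
    return [j / p_z for j in joint]

def binomial_pmf(k, p):
    """Return the Bin(k, p) probabilities for x = 0..k."""
    return [comb(k, x) * p ** x * (1 - p) ** (k - x) for x in range(k + 1)]

if __name__ == "__main__":
    r, g, b, k = 2, 3, 5, 4                  # arbitrary example values
    p = Fraction(r, r + g)                   # equals p_r / (p_r + p_g)
    print(conditional_pmf_by_enumeration(r, g, b, k) == binomial_pmf(k, p))
```

The enumeration grows like ##2^k##, so this is only meant for small ##k##.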
 
Hello
I've gotten this so far: ##E(X|Z=k)=\sum_{x=0}^\infty xP(X=x|Z=k)=\sum_{x=0}^\infty x\frac{P(X=x,Z=k)}{P(Z=k)}=P(Z=k)^{-1}\sum_{x=0}^\infty xP(X=x,Z=k)=P(Z=k)^{-1}\sum_{x=0}^\infty x \, \text{Bin}(k,p_r/(p_r + p_g))##
I'm pretty sure the last equality sign is wrong though.
I'm not sure how to handle the joint probability. Those events aren't independent, are they?

Btw, I've also gotten this tip from our tutor:
##EX=\sum_{i=1}^\infty E(X|Z=i)P(Z=i)##
Can I use that in the solution you've outlined for me?
 

The event ##\{ X=x, Z=k\}## occurs when there are ##x## 'reds' and ##k-x## 'greens', followed by a 'black', and that has probability
$$P(X=x,Z=k) = C(k,x)\, p_r^x \, p_g^{k-x} \, p_b$$
Thus, ##P(Z=k) = \sum_{x=0}^k P(X=x, Z=k) = (p_r + p_g)^k \, p_b##, by the binomial expansion of ##(p_r + p_g)^k##. Therefore, we have
$$P(X = x | Z = k) = \frac{P(X=x,Z=k)}{P(Z=k)} = C(k,x)\, \left( \frac{p_r}{p_r+p_g}\right)^x \, \left( \frac{p_g}{p_r+p_g}\right)^{k-x}.$$
That means that ##[X|Z=k] \sim \text{Bin}\,(k, p_r/(p_r+p_g))## as claimed before. Because ##[X|Z=k]## is binomial, its mean is known right away as ##E(X|Z=k) = kp_r/(p_g+p_r)##. Therefore, ##EX = \sum_{k=0}^{\infty} p_b (p_r+p_g)^k k p_r/(p_r+p_g)##. That last summation is doable.
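
Before doing the summation by hand, it can help to evaluate it numerically. The sketch below (the function name and the cutoff `k_max` are assumptions for illustration) truncates the series for ##EX## and compares it with ##r/b##, which is what the standard identity ##\sum_{k \ge 0} k q^k = q/(1-q)^2## with ##q = p_r + p_g## appears to give.

```python
def expected_reds_truncated(r, g, b, k_max=500):
    """Truncated version of EX = sum_k p_b (p_r+p_g)^k * k * p_r/(p_r+p_g)."""
    total = r + g + b
    p_r, p_g, p_b = r / total, g / total, b / total
    q = p_r + p_g                          # probability that a draw is not black
    return sum(p_b * q ** k * k * p_r / q for k in range(k_max + 1))

if __name__ == "__main__":
    r, g, b = 2, 3, 5                      # arbitrary example urn
    print(expected_reds_truncated(r, g, b), r / b)
```

The cutoff matters when ##p_b## is small, because the terms then decay slowly; raise `k_max` in that case.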

If I were trying to get ##\text{Var}(X)## I would use the fact that ##\text{Var}(X) = E(X^2) - (E X)^2##, then get ##E(X^2)## in a similar manner as ##EX##: ##E(X^2|Z=k)## is the mean of a squared binomial random variable, and that is related to its variance and mean as ##E(X^2|Z=k) = \text{Var}(X|Z=k) + [E(X|Z=k)]^2##.
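
The same recipe can be run numerically for the second moments. The sketch below is an illustration under stated assumptions, not the thread's worked solution: it truncates the sums over ##k##, uses ##E(X^2|Z=k) = \text{Var}(X|Z=k) + [E(X|Z=k)]^2## for the binomial ##[X|Z=k]##, and for the covariance uses ##Y = k - X## given ##Z=k##, so that ##E(XY|Z=k) = k\,E(X|Z=k) - E(X^2|Z=k)##. The name `moments_by_conditioning` and the cutoff `k_max` are hypothetical.

```python
def moments_by_conditioning(r, g, b, k_max=500):
    """Truncated-sum values of EX, Var(X) and Cov(X, Y), conditioning on Z."""
    total = r + g + b
    p_r, p_g, p_b = r / total, g / total, b / total
    q = p_r + p_g                 # probability that a single draw is not black
    p = p_r / q                   # P(red | not black): parameter of X given Z = k
    ex = ex2 = ey = exy = 0.0
    for k in range(k_max + 1):
        pz = p_b * q ** k                          # P(Z = k)
        mean_x = k * p                             # E(X | Z = k)
        second_x = k * p * (1 - p) + mean_x ** 2   # E(X^2 | Z = k)
        ex += pz * mean_x
        ex2 += pz * second_x
        ey += pz * (k - mean_x)                    # E(Y | Z = k) = k - E(X | Z = k)
        exy += pz * (k * mean_x - second_x)        # E(X (k - X) | Z = k)
    return {"EX": ex, "VarX": ex2 - ex ** 2, "Cov": exy - ex * ey}

if __name__ == "__main__":
    print(moments_by_conditioning(r=2, g=3, b=5))   # arbitrary example urn
```

Numbers like these are useful for checking a closed-form answer once the sums have been evaluated by hand.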

Note added in edit: Alternatively, you can try to compute the marginal distribution of ##X## directly as
$$P(X=x) = \sum_{k=x}^{\infty} P(Z=k) P(X=x|Z=k)$$

It also turns out that you can almost immediately write down the final expression for ##P(X = x)## without any complicated evaluations, provided that you reason very carefully (and quite subtly) about the nature of the event ##\{ X = x \}##.
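
One way to explore that hint numerically is to evaluate the sum for ##P(X=x)## and compare it with a candidate closed form. The candidate used below is an assumption on my part, not something stated in the thread: since green draws never change ##X##, one can ignore them and treat each remaining draw as red with probability ##r/(r+b)##, which suggests a geometric law for ##X##. The function names and the cutoff `k_max` are hypothetical.

```python
from math import comb

def marginal_by_sum(r, g, b, x, k_max=300):
    """Truncated sum_{k >= x} P(Z = k) P(X = x | Z = k)."""
    total = r + g + b
    p_r, p_g, p_b = r / total, g / total, b / total
    q = p_r + p_g
    p = p_r / q                                   # parameter of X given Z = k
    return sum(p_b * q ** k * comb(k, x) * p ** x * (1 - p) ** (k - x)
               for k in range(x, k_max + 1))

def geometric_guess(r, b, x):
    """Candidate marginal: x reds before the first black, greens ignored."""
    ratio = r / (r + b)
    return ratio ** x * (1 - ratio)

if __name__ == "__main__":
    r, g, b = 2, 3, 5                             # arbitrary example urn
    for x in range(4):
        print(x, marginal_by_sum(r, g, b, x), geometric_guess(r, b, x))
```

Agreement for a few urns is evidence, not proof; the careful reasoning about the event ##\{X = x\}## is still the actual argument.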
 