Mean, variance and correlation - Multinomial distribution

Homework Help Overview

The discussion revolves around calculating the means, variance, and covariance of two random variables, X and Y, which represent the counts of red and green balls drawn from an urn containing red, green, and black balls. The drawing continues until a black ball is drawn, introducing a stochastic process that involves both multinomial and geometric distributions.

Discussion Character

  • Exploratory, Conceptual clarification, Mathematical reasoning, Problem interpretation

Approaches and Questions Raised

  • Participants explore the relationship between the random variables X and Y given the condition of drawing until a black ball is picked. There is discussion about the use of binomial distribution for X and Y, and how to properly calculate expectations and variances using conditioning on the number of draws before a black ball is drawn.

Discussion Status

Some participants have provided insights into the joint distribution of X and Z, and how to derive the means and variances through conditioning. There is ongoing exploration of the relationships between the variables and the implications of the drawing process, with no explicit consensus reached yet.

Contextual Notes

The problem involves specific constraints related to the drawing process, including the requirement to consider the first occurrence of a black ball and the implications this has on the distributions of X and Y. Participants are also navigating the complexities of joint probabilities and conditioning in their calculations.

AwesomeTrains
Hello everyone, I'm stuck on an elementary stochastics problem. I have to calculate the means, variances and covariance of two random variables.

Homework Statement


Let r,g,b∈ℕ. r red, g green and b black balls are placed in an urn.
The balls are then drawn one at a time with replacement, until a black ball is picked for the first time. (1)
X counts the number of red balls and Y the number of the green ones, until a black one is picked.
Find EX, EY, Var(X), Var(Y) and ρ(X,Y)=cov(X,Y)/(σ_X σ_Y)

Homework Equations


Expectation value: E(X)=\sum_{i∈I}x_iP(X=x_i)
Multinomial distribution for 3 different balls: P(X=x, Y=y, Z=z)=\frac{n!}{x!y!z!}p_1^xp_2^yp_3^z, with n=x+y+z

The Attempt at a Solution


The probabilities for drawing a red ball, p_1=\frac{r}{r+g+b}, green p_2=\frac{g}{r+g+b} and black p_3=\frac{b}{r+g+b}
I thought X and Y were each binomially distributed and would therefore have the means EX=np_1 and EY=np_2, but then the condition (1) isn't used.
I then thought about calculating E(X)=\sum_{i∈I}x_iP(X=x, Y=y, Z=1) with P(X=x, Y=y, Z=1)=\frac{(x+y+1)!}{x!y!1!}p_1^xp_2^yp_3, where I let Z denote the number of black balls drawn.
But I got stuck again, not knowing which variable to sum over, and I think it's wrong to write E(X) because the sum is not only over the random variable X.

Any tips are much appreciated
Alex
 
Since you are drawing with replacement, the number of draws ##Z## before the first black has a (modified) geometric distribution with success probability ##p_b = b/(r+g+b)##; that is, ##P(Z = k) = (1-p_b)^k p_b, \; k = 0,1,2, \ldots \, .## (Here ##p_r, p_g, p_b## are what you called ##p_1, p_2, p_3##.) Given ##Z = k , k \in \{0,1,2, \ldots \}##, can you see why the pair ##(X,Y)## given ##Z =k## has the binomial distribution ##X \sim \text{Bin}(k,p_r/(p_r + p_g) ) \, ##? That is, ##X## is binomial and ##Y = k-X##. Using this, you can easily get ##E(X|Z=k)##, ##E(Y|Z=k)##, ##\text{Var}(X|Z=k)## and ##\text{Var}(Y|Z=k)##. Then, by "unconditioning" you can get ##EX, EY, \text{Var}(X), \text{Var}(Y)##. I'll let you worry about how to get ##\text{Cov}(X,Y)## by a conditioning-unconditioning argument.

BTW: it is not wrong to write ##E(X)## or ##\text{Var}(X)##; these are just the mean and variance of the marginal distribution of ##X##.
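
If it helps to have numbers to check a derivation against, here is a minimal Monte Carlo sketch of the drawing process in (1). The function names `draw_until_black` and `simulate`, the trial count and the seed are all illustrative choices, not part of the problem; the output is only an empirical estimate, not a proof of anything.

```python
import random
from statistics import fmean, pvariance


def draw_until_black(r, g, b, rng):
    """Draw with replacement until the first black ball; return (reds, greens)."""
    total = r + g + b
    reds = greens = 0
    while True:
        u = rng.randrange(total)  # uniform ball index in 0..total-1
        if u < r:
            reds += 1
        elif u < r + g:
            greens += 1
        else:
            return reds, greens   # first black ball ends the experiment


def simulate(r, g, b, trials=200_000, seed=0):
    """Estimate EX, EY, Var(X), Var(Y) and rho(X, Y) from repeated experiments."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(trials):
        x, y = draw_until_black(r, g, b, rng)
        xs.append(x)
        ys.append(y)
    mx, my = fmean(xs), fmean(ys)
    vx, vy = pvariance(xs, mx), pvariance(ys, my)
    cov = fmean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return {"EX": mx, "EY": my, "VarX": vx, "VarY": vy,
            "rho": cov / (vx * vy) ** 0.5}
```

Calling, say, `simulate(2, 3, 1)` gives sample values that any closed-form answer for ##r=2, g=3, b=1## should reproduce to within sampling error.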
 
Thanks for the fast reply and for the help :)
I would never have thought of that. Thought I had chosen the right distribution.

Ray Vickson said:
can you see why the pair (X,Y) given Z=k has the binomial distribution X∼Bin(k,pr/(pr+pg))?

Is it because there's either success (drawing the black ball) or failure (drawing a red one)?

Ray Vickson said:
That is, X is binomial and Y=k−X

How do you get this?

Then E(X|Z=k)=\sum_x x\frac{P(Z=k, X=x)}{P(Z=k)}?
What is meant by P(Z=k, X=x) here?
 
AwesomeTrains said:
Is it because there's either success (drawing the black ball) or failure (drawing a red one)?
No. It's because having fixed the number of draws before a black is drawn at k (note, Ray calls this Z, which is not how you used Z in the OP), we know that only red and green balls are drawn in the first k draws. So that's just a binary result for each of k trials.
AwesomeTrains said:
What is meant by P(Z=k,X=x) here?
It's the joint probability, P[Z=k&X=x].
 
Hello
I've gotten this so far: E(X|Z=k)=\sum_{x=0}^\infty xP(X=x|Z=k)=\sum_{x=0}^\infty x\frac{P(X=x,Z=k)}{P(Z=k)}=P(Z=k)^{-1}\sum_{x=0}^\infty xP(X=x,Z=k)=P(Z=k)^{-1}\sum_{x=0}^\infty x \text{Bin}(k,p_r/(p_r + p_g) )
I'm pretty sure the last equality sign is wrong though.
I'm not sure how to handle the joint probability. Those events aren't independent are they?

Btw, I've also gotten this tip from our tutor:
EX=\sum_{i=1}^\infty E(X|Z=i)P(Z=i)
Can I use that in the solution you've outlined for me?
 
The event ##\{ X=x, Z=k\}## occurs when there are ##x## 'reds' and ##k-x## 'greens', followed by a 'black', and that has probability
P(X=x,Z=k) = C(k,x)\, p_r^x \, p_g^{k-x} \, p_b
Thus, ##P(Z=k) = \sum_{x=0}^k P(X=x, Z=k) = (p_r + p_g)^k \, p_b##, by the binomial expansion of ##(p_r + p_g)^k##. Therefore, we have
P(X = x | Z = k) = \frac{P(X=x,Z=k)}{P(Z=k)} = C(k,x)\, \left( \frac{p_r}{p_r+p_g}\right)^x \, \left( \frac{p_g}{p_r+p_g}\right)^{k-x}.
That means that ##[X|Z=k] \sim \text{Bin}\,(k, p_r/(p_r+p_g))##, as claimed before. Because ##[X|Z=k]## is binomial, its mean is known right away as ##E(X|Z=k) = kp_r/(p_r+p_g)##. Therefore, ##EX = \sum_{k=0}^{\infty} p_b (p_r+p_g)^k k p_r/(p_r+p_g)##. That last summation is do-able.
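
One way to carry out that summation, as a sketch: write ##q = p_r + p_g = 1 - p_b## (the symbol ##q## is introduced here only for brevity) and use the standard series ##\sum_{k=0}^{\infty} k q^k = q/(1-q)^2##, valid for ##|q| < 1##:

$$EX = \frac{p_b\, p_r}{q} \sum_{k=0}^{\infty} k\, q^k = \frac{p_b\, p_r}{q} \cdot \frac{q}{(1-q)^2} = \frac{p_b\, p_r}{p_b^2} = \frac{p_r}{p_b} = \frac{r}{b}.$$

Swapping the roles of red and green in the same argument gives ##EY = p_g/p_b = g/b##.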

If I were trying to get ##\text{Var}(X)## I would use the fact that ##\text{Var}(X) = E(X^2) - (E X)^2##, then get ##E(X^2)## in a similar manner as ##EX##: ##E(X^2|Z=k)## is the mean of a squared binomial random variable, and that is related to its variance and mean as ##E(X^2|Z=k) = \text{Var}(X|Z=k) + [E(X|Z=k)]^2##.
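
The recipe in the previous paragraph can be checked numerically with a short sketch like the one below. It truncates the sum over ##k## at an arbitrary cutoff `kmax` (an assumption made for illustration, harmless because the geometric weights decay quickly), and the function name `moments_of_x` is hypothetical. It assumes ##r + g > 0##.

```python
def moments_of_x(r, g, b, kmax=2000):
    """Approximate EX, E(X^2) and Var(X) by conditioning on Z = k, k = 0..kmax."""
    n = r + g + b
    p_r, p_g, p_b = r / n, g / n, b / n
    q = p_r + p_g              # probability that a single draw is not black
    p = p_r / q                # chance that a non-black ball is red (needs r + g > 0)
    ex = ex2 = 0.0
    for k in range(kmax + 1):
        pz = p_b * q ** k                  # P(Z = k)
        mean_k = k * p                     # E(X | Z = k), binomial mean
        var_k = k * p * (1 - p)            # Var(X | Z = k), binomial variance
        ex += pz * mean_k
        ex2 += pz * (var_k + mean_k ** 2)  # E(X^2 | Z = k) = Var + mean^2
    return ex, ex2, ex2 - ex ** 2          # Var(X) = E(X^2) - (EX)^2
```

The third returned value is the variance obtained from ##\text{Var}(X) = E(X^2) - (EX)^2##; the same function with the arguments `r` and `g` swapped gives the moments of ##Y##.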

Note added in edit: Alternatively, you can try to compute the marginal distribution of ##X## directly as
P(X=x) = \sum_{k=x}^{\infty} P(Z=k) P(X=x|Z=k)

It also turns out that you can almost immediately write down the final expression for ##P(X = x)## without any complicated evaluations, provided that you reason very carefully (and quite subtly) about the nature of the event ##\{ X = x \}##.
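
One possible reading of that hint, offered as a sketch rather than as the intended argument: green draws never affect ##X##, so if they are ignored, each remaining draw is red with probability ##p_r/(p_r+p_b)## and black otherwise, which would make ##X## geometric with ##P(X=x) = \left(\frac{p_r}{p_r+p_b}\right)^x \frac{p_b}{p_r+p_b}##. The code below compares that guess with the conditioning sum ##\sum_{k \ge x} P(X=x, Z=k)##; the function names and the number of terms kept are illustrative assumptions.

```python
def marginal_by_conditioning(x, r, g, b, terms=5000):
    """Sum P(X = x, Z = k) = C(k, x) p_r^x p_g^(k-x) p_b over k = x, x+1, ..."""
    n = r + g + b
    p_r, p_g, p_b = r / n, g / n, b / n
    term = p_r ** x * p_b          # k = x term: C(x, x) p_r^x p_g^0 p_b
    total = 0.0
    k = x
    for _ in range(terms):
        total += term
        k += 1
        term *= p_g * k / (k - x)  # uses C(k, x) / C(k-1, x) = k / (k - x)
    return total


def marginal_geometric(x, r, g, b):
    """Candidate closed form: geometric with 'success' probability b / (r + b)."""
    s = b / (r + b)
    return (1 - s) ** x * s
```

For ##r=2, g=3, b=1## the two functions agree to floating-point accuracy for small ##x##, which supports the geometric form without replacing a proof.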
 