# Mean, variance and correlation - Multinomial distribution

• AwesomeTrains
In summary: Your expression is not quite right, and independence is not the relevant idea here. What you need is the conditional distribution of X given Z=k: work with the joint distribution of (X,Y) given Z=k to get the conditional distribution of X given Z=k, then work with that. The tutor's tip $EX=\sum_{i=1}^\infty E(X|Z=i)P(Z=i)$ is the standard method for computing ##E(X)## of a discrete random variable ##X##: (*) ##E(X) = \sum_x xP(X=x) = \sum_x \sum_i xP(X=x, Y=i)##.
AwesomeTrains
Hello everyone, I'm stuck on an elementary stochastics problem. I have to calculate the means, variances and covariance of two random variables.

## Homework Statement

Let r,g,b∈ℕ. r red, g green and b black balls are placed in an urn.
The balls are then drawn one at a time with replacement, until a black ball is picked for the first time. (1)
X counts the number of red balls and Y the number of the green ones, until a black one is picked.
Find EX, EY, Var(X), Var(Y) and $\rho(X,Y)=\text{cov}(X,Y)/(\sigma_X\sigma_Y)$.

## Homework Equations

Expectation value: $E(X)=\sum_{i∈I}x_iP(X=x_i)$
Multinomial distribution for 3 different balls: $P(X=x, Y=y, Z=z)=\frac{n!}{x!y!z!}p_1^xp_2^yp_3^z$, with $n=x+y+z$

## The Attempt at a Solution

The probabilities of drawing a red, green and black ball are $p_1=\frac{r}{r+g+b}$, $p_2=\frac{g}{r+g+b}$ and $p_3=\frac{b}{r+g+b}$ respectively.
I thought X and Y were binomially distributed and would therefore have the means $EX=np_1$ and $EY=np_2$, but then the condition (1) isn't used.
I then thought about calculating $E(X)=\sum_{i∈I}x_iP(X=x, Y=y, Z=1)$ with $P(X=x, Y=y, Z=1)=\frac{(x+y+1)!}{x!y!1!}p_1^xp_2^yp_3$, where I let Z denote the number of black balls drawn.
But I got stuck again, not knowing which variable to sum over, and I think it's wrong to write $E(X)$ because the sum is not only over the random variable X.

Any tips are much appreciated
Alex

Since you are drawing with replacement, the number of draws ##Z## before the first black has a (modified) geometric distribution with success probability ##p_b = b/(r+g+b)##; that is, ##P(Z = k) = (1-p_b)^k p_b, k = 0,1,2, \ldots \, .## Write ##p_r = r/(r+g+b)## and ##p_g = g/(r+g+b)## for the red and green probabilities. Given ##Z = k , k \in \{0,1,2, \ldots \}##, can you see why the pair ##(X,Y)## given ##Z =k## has the binomial distribution ##X \sim \text{Bin}(k,p_r/(p_r + p_g) ) \, ##? That is, ##X## is binomial and ##Y = k-X##. Using this, you can easily get ##E(X|Z=k)##, ##E(Y|Z=k)##, ##\text{Var}(X|Z=k)## and ##\text{Var}(Y|Z=k)##. Then, by "unconditioning" you can get ##EX, EY, \text{Var}(X), \text{Var}(Y)##. I'll let you worry about how to get ##\text{Cov}(X,Y)## by a conditioning-unconditioning argument.
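
The conditional claim above can be checked empirically. The following is a minimal simulation sketch, not part of the original replies: the function names, the seed and the example urn (3 red, 2 green, 5 black) are all illustrative assumptions. It keeps only the runs with exactly ##k## non-black draws and compares the average number of reds with the binomial mean ##k\,p_r/(p_r+p_g) = k\,r/(r+g)##.

```python
import random

def draw_until_black(r, g, b, rng):
    """Draw with replacement until the first black ball; return (X, Y)."""
    x = y = 0
    total = r + g + b
    while True:
        u = rng.randrange(total)   # uniform ball index in 0..total-1
        if u < r:
            x += 1                 # red ball
        elif u < r + g:
            y += 1                 # green ball
        else:
            return x, y            # black ball ends the experiment

def conditional_mean_x(r, g, b, k, trials=200_000, seed=0):
    """Estimate E(X | Z = k) from the runs with exactly k non-black draws."""
    rng = random.Random(seed)
    runs = (draw_until_black(r, g, b, rng) for _ in range(trials))
    xs = [x for x, y in runs if x + y == k]   # Z = X + Y
    return sum(xs) / len(xs)

if __name__ == "__main__":
    r, g, b = 3, 2, 5              # hypothetical urn contents
    for k in (1, 2, 4):
        # simulated conditional mean next to the binomial mean k*r/(r+g)
        print(k, conditional_mean_x(r, g, b, k), k * r / (r + g))
```

Filtering on `x + y == k` is just the definition of conditioning on ##Z = k##; it wastes most of the runs for large ##k##, so this is only practical for small ##k##.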

BTW: it is not wrong to write ##E(X)## or ##\text{Var}(X)##; these are just the mean and variance of the marginal distribution of ##X##.

Thanks for the fast reply and for the help :)
I would never have thought of that. Thought I had chosen the right distribution.

Ray Vickson said:
can you see why the pair ##(X,Y)## given ##Z=k## has the binomial distribution ##X \sim \text{Bin}(k,p_r/(p_r + p_g))##?
Is it because there's either success (drawing the black ball) or failure (drawing a red one)?

Ray Vickson said:
That is, ##X## is binomial and ##Y = k-X##
How do you get this?

Then $E(X|Z=k)=\sum_x x\frac{P(Z=k, X=x)}{P(Z=k)}$?
What is meant by $P(Z=k, X=x)$ here?

AwesomeTrains said:
Is it because there's either success (drawing the black ball) or failure (drawing a red one)?
No. It's because, having fixed the number of draws before a black is drawn at k (note that Ray calls this Z, which is not how you used Z in the OP), we know that only red and green balls are drawn in the first k draws. So that's just a binary result for each of k trials.
AwesomeTrains said:
What is meant by P(Z=k,X=x) here?
It's the joint probability, P[Z=k&X=x].

Hello
I've gotten this so far: $E(X|Z=k)=\sum_{x=0}^\infty xP(X=x|Z=k)=\sum_{x=0}^\infty x\frac{P(X=x,Z=k)}{P(Z=k)}=P(Z=k)^{-1}\sum_{x=0}^\infty xP(X=x,Z=k)= P(Z=k)^{-1}\sum_{x=0}^\infty x \text{Bin}(k,p_r/(p_r + p_g) ) \,$ I'm pretty sure the last equality sign is wrong though.
I'm not sure how to handle the joint probability. Those events aren't independent are they?

Btw, I've also gotten this tip from our tutor:
$EX=\sum_{i=1}^\infty E(X|Z=i)P(Z=i)$
Can I use that in the solution you've outlined for me?

The event ##\{ X=x, Z=k\}## occurs when there are ##x## 'reds' and ##k-x## 'greens', followed by a 'black', and that has probability
$$P(X=x,Z=k) = C(k,x)\, p_r^x \, p_g^{k-x} \, p_b$$
Thus, ##P(Z=k) = \sum_{x=0}^k P(X=x, Z=k) = (p_r + p_g)^k \, p_b##, by the binomial expansion of ##(p_r + p_g)^k##. Therefore, we have
$$P(X = x | Z = k) = \frac{P(X=x,Z=k)}{P(Z=k)} = C(k,x)\, \left( \frac{p_r}{p_r+p_g}\right)^x \, \left( \frac{p_g}{p_r+p_g}\right)^{k-x}.$$
That means that ##[X|Z=k] \sim \text{Bin}\,(k, p_r/(p_r+p_g))## as claimed before. Because ##[X|Z=k]## is binomial, its mean is known right away as ##E(X|Z=k) = kp_r/(p_r+p_g)##. Therefore, ##EX = \sum_{k=0}^{\infty} p_b (p_r+p_g)^k k p_r/(p_r+p_g)##. That last summation is do-able.
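
As a sketch of how that last summation can be evaluated: write ##q = p_r + p_g = 1 - p_b## (a shorthand introduced here, not used in the posts above) and use the standard identity ##\sum_{k \ge 0} k q^k = q/(1-q)^2## for ##|q| < 1##. Then

$$EX = \frac{p_b\, p_r}{q} \sum_{k=0}^{\infty} k\, q^k = \frac{p_b\, p_r}{q} \cdot \frac{q}{(1-q)^2} = \frac{p_r}{p_b} = \frac{r}{b}.$$

Swapping the roles of red and green gives ##EY = p_g/p_b = g/b## by the same argument.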

If I were trying to get ##\text{Var}(X)## I would use the fact that ##\text{Var}(X) = E(X^2) - (E X)^2##, then get ##E(X^2)## in a similar manner as ##EX##: ##E(X^2|Z=k)## is the mean of a squared binomial random variable, and that is related to its variance and mean as ##E(X^2|Z=k) = \text{Var}(X|Z=k) + [E(X|Z=k)]^2##.
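
The conditioning-unconditioning recipe can also be carried out numerically, which is a useful check on any closed form you derive by hand. The sketch below is only an illustration: the function name, the truncation point `kmax` and the example urn are assumptions, not anything from the thread. It sums ##P(Z=k)## times the conditional binomial moments, and uses ##Y = k - X## to get ##E(XY|Z=k) = k\,E(X|Z=k) - E(X^2|Z=k)##.

```python
from math import sqrt

def moments_by_conditioning(r, g, b, kmax=2000):
    """Approximate (EX, EY, Var X, Var Y, Cov(X,Y), rho) by summing over Z = k.

    Assumes r, g, b >= 1. The series is truncated at kmax, which is adequate
    as long as (r + g) / (r + g + b) is not too close to 1.
    """
    n = r + g + b
    p_r, p_g, p_b = r / n, g / n, b / n
    q = p_r + p_g                      # probability that one draw is not black
    a = p_r / q                        # P(red | not black)
    ex = ey = ex2 = ey2 = exy = 0.0
    for k in range(kmax + 1):
        pz = p_b * q**k                # P(Z = k)
        mx, my = k * a, k * (1 - a)    # E(X | Z = k), E(Y | Z = k)
        v = k * a * (1 - a)            # Var(X | Z = k) = Var(Y | Z = k)
        x2 = v + mx * mx               # E(X^2 | Z = k)
        y2 = v + my * my               # E(Y^2 | Z = k)
        ex += pz * mx
        ey += pz * my
        ex2 += pz * x2
        ey2 += pz * y2
        exy += pz * (k * mx - x2)      # E(X (k - X) | Z = k)
    var_x = ex2 - ex * ex
    var_y = ey2 - ey * ey
    cov = exy - ex * ey
    return ex, ey, var_x, var_y, cov, cov / sqrt(var_x * var_y)

if __name__ == "__main__":
    # hypothetical urn: 3 red, 2 green, 5 black
    print(moments_by_conditioning(3, 2, 5))
```

Note that the covariance comes out positive: although ##X## and ##Y## are perfectly negatively related once ##Z## is fixed, a long run before the first black ball tends to make both counts large.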

Note added in edit: Alternatively, you can try to compute the marginal distribution of ##X## directly as
$$P(X=x) = \sum_{k=x}^{\infty} P(Z=k) P(X=x|Z=k)$$

It also turns out that you can almost immediately write down the final expression for ##P(X = x)## without any complicated evaluations, provided that you reason very carefully (and quite subtly) about the nature of the event ##\{ X = x \}##.
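
One possible reading of that hint, offered as an assumption rather than as the poster's stated argument: green draws have no bearing on whether the next non-green ball is red or black, so ##X## counts reds before the first black in a red-versus-black sequence, and should be geometric with "success" probability ##p_b/(p_r+p_b)##. The sketch below compares that candidate with the truncated marginal sum above; the function names and `kmax` are illustrative choices.

```python
from math import comb

def marginal_by_sum(x, p_r, p_g, p_b, kmax=500):
    """P(X = x) = sum over k >= x of P(Z = k) P(X = x | Z = k), truncated at kmax."""
    q = p_r + p_g                      # probability that one draw is not black
    a = p_r / q                        # P(red | not black)
    return sum(
        p_b * q**k * comb(k, x) * a**x * (1 - a) ** (k - x)
        for k in range(x, kmax + 1)
    )

def marginal_geometric(x, p_r, p_b):
    """Candidate closed form: ignore green draws, count reds before the first black."""
    s = p_b / (p_r + p_b)              # P(black | red or black)
    return (1 - s) ** x * s

if __name__ == "__main__":
    p_r, p_g, p_b = 0.3, 0.2, 0.5      # hypothetical urn: 3 red, 2 green, 5 black
    for x in range(5):
        print(x, marginal_by_sum(x, p_r, p_g, p_b), marginal_geometric(x, p_r, p_b))
```

Agreement of the two columns for a few parameter choices is evidence for the closed form, not a proof; the proof is the careful reasoning about ##\{X = x\}## that the hint asks for.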


## What is the Multinomial Distribution?

The multinomial distribution describes the counts of each outcome in a fixed number ##n## of independent trials, where every trial results in one of several categories with fixed probabilities ##p_1, \ldots, p_k##. It generalizes the binomial distribution, which is the special case of two categories.

## What is the mean of a Multinomial Distribution?

For ##n## trials with category probabilities ##p_1, \ldots, p_k##, the expected count for category ##i## is ##E(X_i) = n p_i##: the number of trials multiplied by the probability of that outcome.

## How is the variance of a Multinomial Distribution calculated?

Each count has variance ##\text{Var}(X_i) = n p_i (1 - p_i)##, the same as a binomial count with ##n## trials and success probability ##p_i##.

## What is correlation in relation to Multinomial Distribution?

The counts of two different categories are negatively correlated, because all counts must add up to ##n##. For ##i \neq j##, ##\text{Cov}(X_i, X_j) = -n p_i p_j##, so the correlation coefficient is ##\rho(X_i, X_j) = -\sqrt{p_i p_j / ((1-p_i)(1-p_j))}##, which does not depend on ##n##.
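
For a fixed number of trials ##n##, the standard multinomial moments are ##E(X_i) = n p_i##, ##\text{Var}(X_i) = n p_i (1-p_i)## and ##\text{Cov}(X_i, X_j) = -n p_i p_j## for ##i \neq j##. The sketch below checks them by brute-force enumeration of the probability mass function for a small case; the function names and the example parameters are illustrative assumptions.

```python
from itertools import product
from math import factorial

def multinomial_pmf(counts, probs):
    """P(X_1 = c_1, ..., X_k = c_k) for n = sum(counts) independent trials."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)          # exact: each partial quotient is an integer
    p = float(coef)
    for c, pi in zip(counts, probs):
        p *= pi**c
    return p

def multinomial_moments(n, probs):
    """Return (means, variances, Cov(X_1, X_2)) by enumerating all outcomes."""
    k = len(probs)
    mean = [0.0] * k
    second = [0.0] * k
    cross = 0.0
    for head in product(range(n + 1), repeat=k - 1):
        last = n - sum(head)
        if last < 0:
            continue                   # counts must add up to n
        counts = head + (last,)
        w = multinomial_pmf(counts, probs)
        for i, c in enumerate(counts):
            mean[i] += w * c
            second[i] += w * c * c
        cross += w * counts[0] * counts[1]
    var = [s - m * m for s, m in zip(second, mean)]
    return mean, var, cross - mean[0] * mean[1]

if __name__ == "__main__":
    # hypothetical example: 5 trials, three categories
    print(multinomial_moments(5, (0.2, 0.3, 0.5)))
```

Enumeration grows quickly with ##n## and the number of categories, so this is only meant as a check on small cases.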

## How is Multinomial Distribution used in real life?

Multinomial Distribution is used in various fields such as economics, biology, and social sciences to model and analyze data with multiple categorical outcomes. It can be used to predict the outcomes of elections, market trends, and survey responses, among other things.
