# Coefficient of correlation problem

1. Jul 23, 2011

### ENgez

A vase contains one red ball, two white balls, and three black balls. n balls are taken out of the vase one at a time (each ball is returned to it afterwards). Let B denote the number of black balls taken out and R denote the number of red balls taken out. What is the coefficient of correlation between R and B?

Well, I know that the coefficient of correlation = $\frac{\mathrm{cov}(R,B)}{\sigma_R \sigma_B}$
and that R and B are simply Bernoulli trials with x and n-x trials respectively.

My problem is calculating the covariance = E[RB] - E[R]E[B] (E is the mean).
E[R] and E[B] are straightforward, but E[RB] is a bit trickier for me.

I thought of using the definition of the mean with a multinomial vector for the joint probability of R and B and summing over 0<x<n, but is there an easier way?
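
For reference, the brute-force sum described above can be sketched in a few lines. This is only an illustration, assuming the draws are independent with per-draw probabilities 1/6 (red), 2/6 (white), and 3/6 (black); the names `joint_pmf` and `covariance` are made up for this sketch.

```python
from fractions import Fraction
from math import factorial

# Per-draw probabilities from the problem: 1 red, 2 white, 3 black out of 6.
P_RED, P_WHITE, P_BLACK = Fraction(1, 6), Fraction(2, 6), Fraction(3, 6)


def joint_pmf(r, b, n):
    """P(R = r, B = b) for n draws with replacement (multinomial pmf)."""
    w = n - r - b  # the remaining draws must be white
    coeff = factorial(n) // (factorial(r) * factorial(b) * factorial(w))
    return coeff * P_RED**r * P_BLACK**b * P_WHITE**w


def covariance(n):
    """Cov(R, B) = E[RB] - E[R]E[B], summed over every (r, b) with r + b <= n."""
    e_rb = e_r = e_b = Fraction(0)
    for r in range(n + 1):
        for b in range(n - r + 1):
            p = joint_pmf(r, b, n)
            e_rb += r * b * p
            e_r += r * p
            e_b += b * p
    return e_rb - e_r * e_b


# Print the exact covariance for a few small n.
for n in (1, 2, 5, 10):
    print(n, covariance(n))
```

Exact fractions are used instead of floats so the result can be compared directly against any closed-form answer.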

2. Jul 23, 2011

### pmsrw3

It would be easy if it weren't for the white balls... As it is, I don't see any way of doing it short of writing out the sum.

3. Jul 23, 2011

### bpet

The number of red balls is $R = \sum_{i=1}^n I[\text{ball } i \text{ is red}]$, where I is the indicator function, so to calculate E[RB] you just need to expand the product.

4. Jul 23, 2011

### ENgez

bpet, I am sorry, but can you elaborate?

Also, I thought of using the smoothing theorem, since it is easier to think in terms of conditional probability in this problem.

$E[RB] = E[E[RB|B]] = E[B \cdot E[R|B]] = E[B(n-B)P(R)]$

$= P(R)(nE[B] - E[B^2])$

B is distributed binomially with n trials and probability P(B).

The problem here is that I get a different answer for E[E[RB|R]].
What am I doing wrong?

5. Jul 23, 2011

### ENgez

Sorry for the double post, but I believe I understood the part about expanding the product :).

Tell me if this is correct:

$R = R_1 + R_2 + R_3 + \dots + R_n$

where:
$R_i = 1 , p=1/6$
$R_i = 0 , p=5/6$

and likewise for B

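$E[RB] = E[(R_1+R_2+\dots+R_n)(B_1+B_2+\dots+B_n)] = (n^2-n)P(B)P(R)$
$R_i B_i$ is always 0 because one being 1 implies the other being 0, so only the $n^2-n$ terms $R_i B_j$ with $i \neq j$ remain, each with mean $P(R)P(B)$ since different draws are independent.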

$E[RB]-E[R]E[B] = (n^2-n)P(B)P(R) - n^2 P(B)P(R) = -nP(R)P(B) = -n/12$
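
To carry the covariance of $-n/12$ through to the original question, here is a sketch of the last step. It assumes R and B are each binomial with n trials and per-draw probabilities $P(R)=1/6$ and $P(B)=1/2$, so the variances come from the usual $np(1-p)$ formula:

$$\sigma_R^2 = n \cdot \tfrac{1}{6} \cdot \tfrac{5}{6} = \tfrac{5n}{36}, \qquad \sigma_B^2 = n \cdot \tfrac{1}{2} \cdot \tfrac{1}{2} = \tfrac{n}{4}$$

$$\rho_{R,B} = \frac{-n/12}{\sqrt{5n/36}\,\sqrt{n/4}} = \frac{-n/12}{n\sqrt{5}/12} = -\frac{1}{\sqrt{5}} \approx -0.447$$

Under these assumptions the n cancels, so the coefficient does not depend on how many balls are drawn.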

Last edited: Jul 23, 2011

6. Jul 23, 2011

### bpet

If you change the RHS to E[B*(n-B)*P(R|B)] does that work?
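
A sketch of how that substitution might play out, reading $P(R|B)$ as the chance that a single draw is red given that it is not black (an interpretation of the notation, not something stated in the thread), i.e. $\frac{1/6}{1/2} = \frac{1}{3}$, and using $B \sim \mathrm{Binomial}(n, 1/2)$ so that $E[B] = n/2$ and $E[B^2] = n/4 + n^2/4$:

$$E[RB] = \tfrac{1}{3}\left(nE[B] - E[B^2]\right) = \tfrac{1}{3}\left(\tfrac{n^2}{2} - \tfrac{n}{4} - \tfrac{n^2}{4}\right) = \tfrac{1}{3} \cdot \tfrac{n^2 - n}{4} = \tfrac{n^2-n}{12}$$

With $P(R) = 1/6$ and $P(B) = 1/2$, this matches the value $(n^2-n)P(B)P(R)$ obtained by expanding the product of indicator sums.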

7. Jul 24, 2011

### ENgez

I don't see how P(R|B) changes anything, because (n-B)*P(R) is the mean of R|B (where B=s, 0<s<n).
P(R) is the probability that a given "trial" (pulling a ball out) gives red, which is independent of the number of black balls you took out.
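
One way to test this claim is to compute the mean of R given B=s directly from the joint distribution and compare it with both candidates. The following is a rough sketch, assuming the multinomial model with per-draw probabilities 1/6, 2/6, and 3/6; the helper name `cond_mean_red` is hypothetical.

```python
from fractions import Fraction
from math import comb

# Per-draw probabilities from the problem: 1 red, 2 white, 3 black out of 6.
P_RED, P_WHITE, P_BLACK = Fraction(1, 6), Fraction(2, 6), Fraction(3, 6)


def cond_mean_red(s, n):
    """E[R | B = s], computed directly from the joint distribution of (R, B)."""
    num = den = Fraction(0)
    for r in range(n - s + 1):
        w = n - s - r  # draws that are neither black nor red are white
        # P(R = r, B = s): choose the black draws, then the red ones among the rest.
        p = comb(n, s) * comb(n - s, r) * P_BLACK**s * P_RED**r * P_WHITE**w
        num += r * p
        den += p
    return num / den


n = 6
for s in range(n + 1):
    unconditional = (n - s) * P_RED                # (n - s) * P(R)
    given_not_black = (n - s) * P_RED / (1 - P_BLACK)  # (n - s) * P(red | not black)
    print(s, cond_mean_red(s, n), unconditional, given_not_black)
```

In this sketch the directly computed conditional mean lines up with the second candidate, $(n-s) \cdot \frac{1}{3}$, rather than $(n-s) \cdot \frac{1}{6}$: once s draws are known to be black, each of the remaining n-s draws is known not to be black, which raises its chance of being red.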