# Homework Help: Find the Correlation Coefficient between X and Y

1. Mar 2, 2012

### rogo0034

1. The problem statement, all variables and given/known data

2. Relevant equations
SS (x - (mean x))(y - (mean y)) f(x,y) dxdy
(note: SS = integrals over x and y)

3. The attempt at a solution

Last edited: Mar 2, 2012
2. Mar 2, 2012

### jambaugh

Note the covariance:
$$\Sigma_{xy}=E[(X-\mu_x)(Y-\mu_y)]=\iint (x-\mu_x)(y-\mu_y) \cdot f(x,y)dx dy$$
is typically divided by the standard deviations of x and y to give correlation coefficients.
In short be sure to check the definition of correlation coefficient you're supposed to use.

That having been said you can take advantage of the linearity of expectation values to simplify the problem:

$$\Sigma_{xy}=E[ (X-\mu_x)(Y-\mu_y) ] = E[XY - \mu_x Y - X\mu_y +\mu_x\mu_y]$$
remembering that in this formula the "mu's" are constants:
$$\Sigma_{xy}=E[XY] -\mu_x E[Y]-\mu_yE[X] +\mu_x\mu_y E[1]$$
remember that $\mu_x \equiv E[X],\mu_y\equiv E[Y]$ so we get...
$$\Sigma_{xy}=E[XY] -\mu_x (\mu_y)-\mu_y(\mu_x) +\mu_x\mu_y = E[XY]-\mu_x\mu_y$$

You then only need to use integrals to work out: $E[XY], \mu_x = E[X], \mu_y=E[Y]$.
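
As a quick sanity check of the shortcut $\Sigma_{xy}=E[XY]-\mu_x\mu_y$, here is a minimal Python sketch comparing it with the definition on a small discrete joint distribution. The pmf values and the helper name `expect` are made up for illustration; they are not from the problem.

```python
from fractions import Fraction as F

# Hypothetical joint pmf on a few (x, y) points; the probabilities sum to 1.
pmf = {
    (0, 1): F(1, 4),
    (1, 1): F(1, 4),
    (1, 2): F(1, 6),
    (2, 3): F(1, 3),
}

def expect(g):
    """E[g(X, Y)] for the discrete pmf above."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())

mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)

# Covariance from the definition and from the shortcut formula.
cov_definition = expect(lambda x, y: (x - mu_x) * (y - mu_y))
cov_shortcut = expect(lambda x, y: x * y) - mu_x * mu_y

print(cov_definition == cov_shortcut)
```

Exact fractions are used so the two results can be compared with `==` instead of a floating point tolerance.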

With that in mind, you have to do the integration:
$$E[g(X,Y)] = \iint g(x,y) f(x,y) dxdy$$

Specifically:
$E[X] = \mu_x = \iint x\cdot f(x,y) dx dy$
$E[Y] = \mu_y = \iint y\cdot f(x,y) dx dy$
$E[XY] = \iint x\cdot y \cdot f(x,y) dx dy$ (edited to correct!)

You need to show some attempt at these integrals. The first step is to draw the region where f(x,y) is non-zero. You'll want to integrate over that region, and inside it f(x,y) = 2.

Note you can scan hand written work (or take a good quality pic) and attach it to your post. Try [Go Advanced] and look below the edit window for [Manage Attachments] when you reply.

[Edit: Oops, I didn't see you'd already attached a scan! Sorry. Let me look at it a moment...]

Last edited: Mar 2, 2012
3. Mar 2, 2012

### rogo0034

Sorry that image is HUUUGE, not sure how to shrink that.

4. Mar 2, 2012

### rogo0034

"E[X]=∬x⋅y⋅f(x,y)dxdy"

did you mean E[XY] ?

5. Mar 2, 2012

### jambaugh

Looking at your integrals for the means. Not looking good. ;)

Again try drawing the region where f is non-zero. Remember that 0 < x < y < 1 means:
0< x and x < y and y < 1. This region will be bounded by the cases where you change these to equalities, i.e. it is bounded by the lines 0 = x, x = y and y = 1. These cross to form a triangle but double check that a point inside satisfies all three conditions.

Now you set up a double integral by deciding how to slice up the region. Imagine you slice it into strips and then cut those strips into little squares. You will integrate in the reverse order in which you do this cutting.

Example: If you slice into horizontal strips then cut those, you will be adding up the squares which form a strip. That is first adding across the horizontal strip so it's integrating first w.r.t. x. What curves define the end of the strip, and thus are the boundaries of the x integration?

Once you have added across the strips (x direction) you'll add up the quantities you get from the strips (integrate in the y direction). Imagine that when you've done the first integral you collapse each strip along its length onto the corresponding axis (in this case squish each strip onto the y axis). Your 2nd integral will then have limits which span the range of strips.

Remember to Never use a variable outside the integral in which it is used as the variable of integration!!!!! (Once you integrate w.r.t. x and evaluate at limits you should have a function containing no x's in it!)
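
The strip picture can also be checked numerically. Below is a rough Python sketch, assuming the density is f(x,y) = 2 on 0 < x < y < 1 and using a plain midpoint sum; the function name `expect` and the grid size `n` are my own choices, not anything from the course.

```python
def expect(g, n=200):
    """Midpoint sum of g(x, y) * f(x, y) over 0 < x < y < 1, assuming f = 2."""
    h = 1.0 / n
    total = 0.0
    for j in range(n):              # outer sum: one horizontal strip per y value
        y = (j + 0.5) * h           # midpoint height of strip j
        dx = y / n                  # the strip runs from x = 0 to x = y
        strip = 0.0
        for i in range(n):          # inner sum: little pieces along the strip
            x = (i + 0.5) * dx
            strip += g(x, y) * 2.0 * dx
        total += strip * h          # by now the strip value depends on y only
    return total

# Total probability over the triangle; should come out very close to 1.
print(expect(lambda x, y: 1.0))
```

Passing `lambda x, y: x` (or `y`, or `x * y`) gives approximate values to compare against the hand integrals. Note that `x` only exists inside the inner loop, which mirrors the rule above about never using x outside its own integral.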

6. Mar 2, 2012

### jambaugh

Yea, sorry cut and pasted and forgot to change everything. I edited the earlier post.

7. Mar 2, 2012

### rogo0034

I just got (x^3) for mu(x) and -((2x^3)/3) for mu(y)

Is this correct?

I would post my work, but it seems to make the image massive.

8. Mar 2, 2012

### jambaugh

Another note:
If for correlation coefficient you do need to use:
$corr(X,Y) = \frac{\Sigma_{xy}}{\sigma_x \sigma_y}$
You'll also need the standard deviations and hence the variances:
$\sigma_x = \sqrt{\sigma^2_x},\sigma_y=\sqrt{\sigma^2_y}$ where:
$\sigma^2_x = \Sigma_{xx} \equiv E[ (X-\mu_x)^2] = E[X^2] -\mu_x^2$
$\sigma^2_y = \Sigma_{yy} \equiv E[ (Y-\mu_y)^2] = E[Y^2] -\mu_y^2$

You'll need to do the same integrals with x^2 and y^2 in them.

(Note the matrix formed by $\Sigma_{ij}$ is called the covariance matrix.)

9. Mar 2, 2012

### jambaugh

Try editing the image (rescale it in Paint or something) before posting. But if you can't, don't worry about it; go ahead and post the big pics.

Your answer cannot be right. It should be pure numbers and not depend on variables, especially since these are variables of integration!!!!! See my earlier warning. Remember after taking anti-derivatives you must evaluate at the boundaries.

You should get something like $\mu_x = 7/9$ or some other value.

10. Mar 2, 2012

### rogo0034

I understood half that... ha, i'm in an entry level stats course, you are blowing my mind, fyi.

11. Mar 2, 2012

### jambaugh

All right, let's focus on just getting those means calculated correctly. Tell me what integral you set up for each, i.e.

mu_x = the integral from y= bla to y= blabla of the integral from x = bleh to x = blehbleh of bla bleh bla dx dy
(or if you did it in reverse order so indicate).

12. Mar 2, 2012

### rogo0034

right, but the bounds; 0<x<y<1 force me to put in variables when i integrate, right?

13. Mar 2, 2012

### rogo0034

Sorry, pretty sloppy, wasn't planning on posting that. But i'm not sure what i'm supposed to put in the bounds if not those variables, zeros?

14. Mar 2, 2012

### jambaugh

Only in the inner limit. You MUST draw a picture. The inner integral is correct, x from 0 to y.
But once you've integrated w.r.t. x you should never never, ever ever ever... (and I really mean it!)... ever use x in the integral.

Again, visualize the triangle being integrated over. You slice into horizontal strips and then integrate along the strips (x from 0 to y). That's the inner integral. The outer integral is varying which strip based on its y position. This will range from y = 0 to y = 1.

In short since y is in the outer integral it is not bound by x, it binds x. You integrate it from its minimum possible value to its maximum one. So that last step should give:
$$\mu_x = \left[ \frac{y^3}{3}\right]_{y=0}^{y=1} = \frac{1}{3} - \frac{0}{3} = \frac{1}{3}$$

All your other integrals will use the same bounds so this should get you pretty far.
Note, since the pdf is a constant in the region, it acts like a constant density and the mean values are like the center of mass. You can just about figure out their values using geometry. See if the value you get for $\mu_y$ makes sense in that context.
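
A small Python sketch of that geometric check, assuming the region is the triangle with corners where the lines x = 0, x = y and y = 1 cross. It relies on the standard fact that the centroid of a uniform triangle is the average of its vertices; the variable names are just for illustration.

```python
from fractions import Fraction as F

# Corners of the region 0 < x < y < 1 (where x = 0, x = y and y = 1 intersect).
vertices = [(F(0), F(0)), (F(0), F(1)), (F(1), F(1))]

# Centroid of a uniform triangle = average of its three vertices.
centroid_x = sum(v[0] for v in vertices) / 3
centroid_y = sum(v[1] for v in vertices) / 3

print(centroid_x, centroid_y)  # 1/3 2/3
```

The x-coordinate matches the $\mu_x = 1/3$ found by integration, so the y-coordinate is a reasonable thing to compare your $\mu_y$ against.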

Then work out $E(X^2), E(XY), E(Y^2)$.
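
One way to organize these, offered as a worked sketch and not as something from your course notes: with the same bounds (x from 0 to y, then y from 0 to 1) and f(x,y) = 2, every moment of the form $E[X^aY^b]$ with non-negative integers a and b follows the same two steps:

$$E[X^aY^b] = \int_0^1\int_0^y 2\,x^a y^b\,dx\,dy = \int_0^1 \frac{2\,y^{a+b+1}}{a+1}\,dy = \frac{2}{(a+1)(a+b+2)}$$

As a check, a = 1 and b = 0 gives $2/(2\cdot 3) = 1/3$, which agrees with the value of $\mu_x$ above.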

15. Mar 2, 2012

### rogo0034

16. Mar 2, 2012

### rogo0034

is this all correct?

i'm applying it to your formula: E[g(X,Y)]=∬g(x,y)f(x,y)dxdy

So do i have to now: E[1/36]=∬(1/36)(2)dxdy ??

EDIT: Ah, nevermind, after going through that it still comes up 1/36, so i'm assuming this is finally the Correlation Coefficient (1/36) ??

17. Mar 2, 2012

### jambaugh

That is what I get for the covariance! Very good!

However again, check your definition. Typically the correlation coefficient is this number divided by sigma_x and sigma_y.

$$\rho_{x,y}=\frac{\mathop{covar}(X,Y)}{\sigma_x \sigma_y}$$
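
To finish the arithmetic under that definition, here is a short Python sketch. It assumes f(x,y) = 2 on 0 < x < y < 1 and takes the five moments as given values from the double integrals over the triangle (only $\mu_x$ and the covariance were confirmed in this thread; the rest are my own evaluation of the same integrals, so check them against your work).

```python
from fractions import Fraction as F
import math

# Moments for f(x, y) = 2 on 0 < x < y < 1 (assumed values, see note above).
e_x, e_y = F(1, 3), F(2, 3)
e_xy = F(1, 4)
e_x2, e_y2 = F(1, 6), F(1, 2)

cov = e_xy - e_x * e_y        # covariance: E[XY] - mu_x * mu_y
var_x = e_x2 - e_x**2         # variance of X: E[X^2] - mu_x^2
var_y = e_y2 - e_y**2         # variance of Y: E[Y^2] - mu_y^2

# Correlation coefficient: covariance divided by both standard deviations.
rho = float(cov) / math.sqrt(var_x * var_y)

print(cov, var_x, var_y, rho)
```

The covariance comes out as 1/36, matching the value above, and dividing by the two standard deviations gives a correlation of about 0.5.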

See: wikipedia:Pearson's product moment correlation coefficient