Problem with calculating the covariance matrix of X, Y

  • Thread starter: ChrisVer
  • Tags: Matrix
AI Thread Summary
The discussion centers on calculating the covariance matrix for two random variables, X and Y, defined by specific Gaussian distributions. The variance for both variables is derived, and the covariance is initially approached through a combined variable Z, leading to a proposed formula. However, confusion arises regarding the independence of the Gaussian components and how they contribute to the covariance. Ultimately, the correct covariance expression is clarified, emphasizing the need to account for shared components in the distributions and the importance of proper notation in defining random variables versus their distributions. The conversation highlights the complexities of joint distributions and the necessity of understanding the relationships between the variables involved.
ChrisVer
If I have two random variables X, Y that are given by the following formulas:
X= \mu_x \big(1 + G_1(0, \sigma_1) + G_2(0, \sigma_2) \big)
Y= \mu_y \big(1 + G_3(0, \sigma_1) + G_2(0, \sigma_2) \big)

where G_i(\mu, \sigma) denotes a Gaussian with mean \mu=0 here and standard deviation \sigma_i.

How can I find the covariance matrix of those two?

I guess the variance will be given by:
Var(X) = \mu_x^2 (\sigma_1^2+ \sigma_2^2) and similarly for Y. But I don't know how to work out the covariance.
Could I define another variable Z = X+Y and find the covariance from Var(Z)= Var(X)+Var(Y) + 2\,Cov(X,Y),
with Z given by Z= (\mu_x+\mu_y) (1+ G_1 + G_2)?

Then Var(Z)= (\mu_x+ \mu_y)^2 (\sigma_1^2+ \sigma_2^2)

And Cov(X,Y) = \dfrac{(\mu_x+\mu_y)^2(\sigma_1^2+ \sigma_2^2)- \mu_x^2 (\sigma_1^2+ \sigma_2^2) - \mu_y^2(\sigma_1^2+ \sigma_2^2) }{2}=\mu_x \mu_y(\sigma_1^2+ \sigma_2^2)

Is my logic correct? I'm not sure about the Z and whether it's given by that formula.
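Edit: as a quick numerical sanity check of this logic, here is a minimal sketch (assuming numpy; the values of \mu_x, \mu_y, \sigma_1, \sigma_2 below are placeholders) that estimates Cov(X,Y) from samples and compares it with the formula above:

```python
# Minimal sketch: estimate Cov(X, Y) by sampling and compare with the proposed formula.
# All numerical values are placeholders.
import numpy as np

rng = np.random.default_rng(0)
mu_x, mu_y = 2.0, 3.0        # placeholder means
sigma_1, sigma_2 = 0.1, 0.2  # placeholder widths
n = 1_000_000

g1 = rng.normal(0.0, sigma_1, n)   # individual part of X
g2 = rng.normal(0.0, sigma_2, n)   # part shared by X and Y
g3 = rng.normal(0.0, sigma_1, n)   # individual part of Y

x = mu_x * (1 + g1 + g2)
y = mu_y * (1 + g3 + g2)

cov_empirical = np.cov(x, y)[0, 1]
cov_proposed = mu_x * mu_y * (sigma_1**2 + sigma_2**2)
print(cov_empirical, cov_proposed)
```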
 
Last edited:
Why not simply apply the definition of covariance? Or the reduced form E(XY)-E(X)E(Y)?
 
I don't know E[XY]...?
 
But you know the distributions of ##X## and ##Y## and so you can compute it.
 
I don't know the joint distribution function h(x,y) needed for
E[XY]= \int dx\,dy~ h(x,y)\, xy.
And I can't write h(x,y)=f(x)g(y), since I don't know whether X, Y are independent... if they were independent they would have no covariance anyway: E[XY]=E[X]E[Y] and your reduced-form formula would make the covariance vanish.
 
ChrisVer said:
I don't know the joint distribution function...

You do. The only reason I can see to number the ##G##s is to underline that they are independent, so that ##X## and ##Y## have a common part ##G_2## and one individual part each, ##G_1##/##G_3##.
 
Yes, that's the reason for labeling them... It just happens that ##G_1,G_3## have the same arguments ##\mu=0, \sigma_1##, but they are not common to X, Y. However, ##G_2## is a common source of uncertainty in both X and Y... Still, I don't understand how to get the joint probability from this... does taking the intersection of X, Y lead to only the ##G_2##? That doesn't seem right either.
 
I suggest you start from the three-dimensional distribution for ##G_i##. From there you can integrate over regions of constant ##X## and ##Y## to obtain the joint pdf for ##X## and ##Y##. In reality, it does not even have to be that hard. Just consider ##X## and ##Y## as functions on the three-dimensional outcome space of the ##G_i## and use what you know about those, e.g., ##E(G_1G_2) = 0## etc.
 
ChrisVer said:
If I have two random variables X, Y that are given from the following formula:
X= \mu_x \big(1 + G_1(0, \sigma_1) + G_2(0, \sigma_2) \big)
Y= \mu_y \big(1 + G_3(0, \sigma_1) + G_2(0, \sigma_2) \big)

To keep the discussion clear, you should use correct notation. If "X" is a random variable then X has a distribution, but it is not "equal" to its distribution. I think what you mean is that:

X = \mu_x( X_1 + X_2)
Y = \mu_y( X_3 + X_2 )

where X_i is a random variable with Gaussian distribution G(0,\sigma_i).
 
  • #10
No, X is a random variable, drawn from a distribution with:
mean ##\mu_x## (that's the role of the 1),
a measurement uncertainty following a Gaussian ##G_1## (or ##G_3## for Y),
and a further common measurement uncertainty ##G_2##.

That means I could take 5 measurements of X: ##\{x_1,x_2,x_3,x_4,x_5 \}=\mu_x \big(1+ \{ + 0.02, + 0.001, 0, - 0.01, - 0.06\}\big)##.
 
  • #11
Ahhh OK, I understand what you meant to say... yes, it's fine. I know what I meant, but maybe I didn't write it correctly (confusing random variables with their distributions)...
 
  • #12
Apply the formula for \sigma(aX + bY, cW + dV) given in the Wikipedia article on Covariance http://en.wikipedia.org/wiki/Covariance

In your case, a = b = \mu_x,\ X=X_1,\ Y= X_2,\ c=d=\mu_y,\ W = X_3,\ V = X_2

Edit: You'll have to put the constant 1 in somewhere. You could use X = 1 + X_1.
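For reference, the identity from that article is the bilinearity of covariance:
\sigma(aX + bY,\ cW + dV) = ac\,\sigma(X,W) + ad\,\sigma(X,V) + bc\,\sigma(Y,W) + bd\,\sigma(Y,V)
with additive constants contributing nothing, since a constant has zero covariance with everything.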
 
  • #13
I also thought about this:
writing the covariance as \sigma_{XY} = \sigma_X \sigma_Y \rho with \rho the correlation coefficient.

And since X= \mu_x (1+ X_1 + X_2) and Y = \mu_y (1 + X_3 + X_2), I think these two are linearly correlated (due to X_2), so \rho>0. Would you find this a logical statement? I mean, if X_2 happens to take a larger value, both X and Y get a larger contribution.
For the value of \rho I guess it should (by the same logic) be given by some combination of \mu_x,\mu_y, since they set how differently X and Y respond to a change in X_2. I mean, if \mu_x > \mu_y then X gets a larger contribution from the same X_2 than Y does, and vice versa for \mu_x<\mu_y... So I guess it should be X(X_2)=\frac{\mu_x}{\mu_y} Y(X_2)?
 
Last edited:
  • #14
I still think you are overcomplicating things. Do you know how to compute ##E(X_iX_j)## when ##X_i## has a Gaussian distribution with mean zero?
 
  • #15
In general I would know how, but as I said I have difficulty finding the joint pdf...

Are you saying that \mu_i (1+G_2) in my notation is the joint pdf (after integrating out ##G_1## and ##G_3##)?
 
  • #16
You do not need to find the joint pdf. You can work directly with the Gaussians!
 
  • #17
Then in that case I don't know how to find ##E[X_i X_j]##...
The only formula I know defines the expectation value through an integral over the pdf...
 
  • #18
That one is simple: it is the expectation value of the product of two independent Gaussians with zero mean. Since they are independent, the pdf factorises ...

Edit: ... unless, of course, i = j ...
 
  • #19
So you suggest something like:
E[X_i X_j] = \begin{cases} E[X_i]E[X_j] = \mu_i \mu_j & i \ne j \\ E[X_i^2]= \sigma_i^2 + \mu_i^2 & i=j \end{cases}

where the X_i are Gaussian-distributed variables with means \mu_i and standard deviations \sigma_i.
 
  • #20
Indeed.

Edit: Of course, you now have ##E[(1+X_1+X_2)(1+X_3+X_2)]## so you will have to do some algebra, but not a big deal.
 
  • #21
E[XY] = \mu_x \mu_y E[ (1+G_1 + G_2) (1+G_3 + G_2) ] = \mu_x \mu_y \Big( 1 + 4E[G_i] + 3 E[G_i G_j] + E[G_2G_2] \Big)_{i\ne j} = \mu_x \mu_y (1+ \sigma_2^2 ) ?
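(Writing the expansion out term by term:
E[(1+G_1 + G_2)(1+G_3 + G_2)] = 1 + E[G_1] + E[G_3] + 2E[G_2] + E[G_1G_3] + E[G_1G_2] + E[G_2G_3] + E[G_2^2] = 1 + \sigma_2^2
since every E[G_i]=0 and every cross term with i \ne j factorises into E[G_i]E[G_j]=0.)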

So Cov(X,Y) = E[XY]- \mu_x \mu_y = \mu_x \mu_y \sigma_2^2.
That's different from the result I got from Z=X+Y... would you know the reason?
 
Last edited:
  • #22
I suspect it comes from this not being a correct expression for ##Z##.
ChrisVer said:
while Z will be given by Z= (\mu_x+\mu_y) (1+ G_1 + G_2)?
The correct expression is
##
Z = (\mu_x + \mu_y)(1+G_2) + \mu_x G_1 + \mu_y G_3.
##
Even though ##G_1## and ##G_3## have the same distribution, they are not the same random variable.
 
  • #23
Yup that's right... (switching to my own notation: means m_e, m_f, individual widths \sigma_e, \sigma_f, and a shared width \sigma_s)
\begin{align}
Var[Z] &= Var[M_e] + Var[M_f] + 2 Cov(M_e,M_f) \notag\\
\text{also} &= (m_e+m_f)^2 \sigma_s^2 + m_e^2 \sigma_e^2 + m_f^2 \sigma_f^2 \notag \\
&\Downarrow \notag \\
Cov(M_e,M_f) &= \dfrac{(m_e+m_f)^2 \sigma_s^2 + m_e^2 \sigma_e^2 + m_f^2 \sigma_f^2 - m_e^2 (\sigma_e^2+\sigma_s^2) - m_f^2 (\sigma_f^2 + \sigma_s^2)}{2} \notag \\
&= \dfrac{2 m_e m_f \sigma_s^2}{2} = m_e m_f \sigma_s^2
\end{align}

So the mistake was that, after summing, I treated the two G's as the same G.
 
  • #24
Cov( \mu_x( 1 + X_1 + X_2), \mu_y (1 + X_3 + X_2) ) = \mu_x \mu_y Cov(1 + X_1 + X_2, 1 + X_3 + X_2)
= \mu_x \mu_y Cov( X_1 + X_2, X_3 + X_2)
= \mu_x \mu_y ( Cov(X_1,X_3) + Cov(X_1,X_2) + Cov(X_2,X_3) + Cov(X_2,X_2))
Assuming the X_i are independent random variables
= \mu_x \mu_y ( 0 + 0 + 0 + Var(X_2) )
 
  • #25
So far the covariance matrix is:
C[X_1,X_2]= \begin{pmatrix} m_x^2 (\sigma_1^2+ \sigma_s^2) & m_x m_y \sigma_s^2 \\ m_x m_y \sigma_s^2 & m_y^2 (\sigma_2^2 +\sigma_s^2) \end{pmatrix}
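As a quick check, this matrix can be compared against the sample covariance of simulated data; here is a minimal sketch (assuming numpy; the values of m_x, m_y, \sigma_1, \sigma_2, \sigma_s are placeholders):

```python
# Minimal sketch: compare the analytic covariance matrix with the sample
# covariance of simulated X_1, X_2. All numerical values are placeholders.
import numpy as np

rng = np.random.default_rng(1)
m_x, m_y = 2.0, 3.0
s1, s2, ss = 0.1, 0.15, 0.2   # individual widths sigma_1, sigma_2 and shared width sigma_s
n = 1_000_000

d1 = rng.normal(0.0, s1, n)   # Delta_1, individual to X_1
d2 = rng.normal(0.0, s2, n)   # Delta_2, individual to X_2
ds = rng.normal(0.0, ss, n)   # Delta_s, shared part

x1 = m_x * (1 + d1 + ds)
x2 = m_y * (1 + d2 + ds)

analytic = np.array([[m_x**2 * (s1**2 + ss**2), m_x * m_y * ss**2],
                     [m_x * m_y * ss**2,        m_y**2 * (s2**2 + ss**2)]])
print(np.cov(x1, x2))   # sample estimate
print(analytic)         # analytic matrix above
```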

I am bringing up a conversation I had in class about this, since I am still unable to understand how it works out.
As a further step, suppose X_1,X_2 are uncorrelated, i.e. their covariance is Cov(X_1,X_2)=0. We then need to find \sigma_D=\sqrt{Var[D]}, where D= X_2-X_1.
Obviously Var[D]= Var[X_2] + Var[X_1] - 2Cov[X_1,X_2]
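Substituting the entries of the matrix above (before assuming anything about the correlation):
Var[D] = m_x^2(\sigma_1^2+\sigma_s^2) + m_y^2(\sigma_2^2+\sigma_s^2) - 2 m_x m_y \sigma_s^2 = m_x^2\sigma_1^2 + m_y^2\sigma_2^2 + (m_x-m_y)^2\sigma_s^2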

The main problem I had with this conversation is that I was told I should take Cov[X_1,X_2]=0 in the above formula. I argued instead that in order for the covariance to be 0, I would have to set \sigma_s =0, and that would also affect the variances of X_1, X_2: Var[X_1]= m_x^2 ( \sigma_1^2 + \sigma_s^2) = m_x^2 \sigma_1^2.

The thing is that in my case I'm dropping the \Delta_s from the initial expressions: X_1 = m_x (1+ \Delta_1 + \Delta_s), X_2 = m_y (1+\Delta_2 + \Delta_s), where \Delta_s \sim Gaus(0,\sigma_s) can be seen as a systematic error in the measurement of X_1, X_2 and the \Delta_{1,2} are only the statistical errors in the measurement.

The person I talked to about this told me it's wrong, since in the X_1 case \Delta_s= \Delta_{s1}+ \Delta_{s2} +... while in X_2: \Delta_s = \Delta_{s2}, so their correlation could only come from \Delta_{s2} (dropping the \Delta_s's was therefore wrong, because I was also dropping \Delta_{s1},\Delta_{s3},... from X_1). The \Delta_{si} are unknown individual systematic errors coming from the measurement.

What do you think? Sorry if it sounds a little bit complicated...
 
Last edited:
