# A Joint Used to Show Lack of Correlation?

#### WWGD

Gold Member
Hi All,
I think I have some idea of how to interpret covariance and correlation. But some doubts remain:
1) What joint distribution do we assume? An example of uncorrelated variables is that of points on a circle, i.e., the variables $X$ and $\sqrt{1 - X^2}$ are uncorrelated -- they have $Cov(X,Y)=0$.

$Cov(X,Y) = E(XY) - \mu_X \mu_Y$. Now, each of these terms assumes a distribution: marginals for $X$, $Y$, and a joint for $(X,Y)$. But I have never seen any mention of either after searching.

2) Is there a way of "going backwards" and deciding which joints/marginals would create uncorrelated variables, i.e., can we find all $f_{XY}(x,y)$ such that:

$\sum_i x_i y_i f_{XY}(x_i,y_i) - \sum_i x_i f_X(x_i) \sum_j y_j f_Y(y_j) = 0$

or, in the continuous case:

$\int xy\, f_{XY}(x,y)\,dx\,dy - \int x f_X(x)\,dx \int y f_Y(y)\,dy = 0$ ?
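For concreteness, the continuous condition can be checked numerically for the circle example, though only under an assumed marginal; here $X$ uniform on $[-1,1]$ (so $f_X(x)=1/2$) and $Y=\sqrt{1-X^2}$ are my choices for illustration, not something forced by the problem:

```python
# Hypothetical sanity check of the continuous condition above.
# Assumption (for illustration only): X uniform on [-1, 1], Y = sqrt(1 - X^2).
import math

def expectation(g, n=100_000):
    # Midpoint-rule approximation of the integral of g(x) * f_X(x) over [-1, 1].
    h = 2.0 / n
    return sum(g(-1.0 + (i + 0.5) * h) * 0.5 * h for i in range(n))

E_X  = expectation(lambda x: x)                           # 0 by symmetry
E_Y  = expectation(lambda x: math.sqrt(1.0 - x * x))      # pi/4
E_XY = expectation(lambda x: x * math.sqrt(1.0 - x * x))  # 0: odd integrand
cov = E_XY - E_X * E_Y
print(abs(cov) < 1e-9)  # True: uncorrelated under this assumed marginal
```

With a different marginal, say $X$ uniform on $[0,1]$, the same pair would be (negatively) correlated, which is exactly the point of the question.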

3) In what sense is correlation a measure of linear dependence? I don't see where/how this follows from the formulas.

Thanks.

Last edited:

#### BvU

Homework Helper
An example of uncorrelated variables is that of points on a circle, i.e., the variables $X$ and $\sqrt{1 - X^2}$ are uncorrelated
Hogwash. 'uncorrelated' means you know nothing about y when given x

after searching
Where? Behind the refrigerator?

#### WWGD

Gold Member
Hogwash. 'uncorrelated' means you know nothing about y when given x
I thought that was independence. AFAIK uncorrelated, implying $Cov(X,Y)=0$, means there is no "clear pattern of change of Y when X changes". EDIT: At any rate, $Cov(X,Y)= E(XY)-\mu_X \mu_Y$. In order to determine when this equals 0 we must know the joint for $(X,Y)$. From the joint we can find the marginals $f_X, f_Y$.

Besides, I have seen it claimed often enough that, e.g., points on a parabola, i.e., the pair $(X,X^2)$, are uncorrelated, though clearly we know _everything_ about $Y=X^2$ when we know $X$.

Last edited:

#### BvU

Homework Helper
Seems to me we are speaking different languages here. To me there is a very clear pattern of change of $y=x^2$ when $x$ changes.

#### WWGD

Gold Member
Seems to me we are speaking different languages here. To me there is a very clear pattern of change of $y=x^2$ when $x$ changes.
I think it has to do with the fact that $(X-\mu_X)(Y- \mu_Y)$ is alternatingly positive and negative, i.e., neither overwhelmingly positive nor negative. So there is no clear sense in which $y$ either increases or decreases with $x$, but I am still trying to get a better feel for the term. EDIT: In a more formal sense (which I don't fully get yet), $Cov(X,Y)$ is a quadratic form and we can show $Cov(X,Y)= -Cov(X,Y)$, forcing it to equal 0.

EDIT2: Here is the output of a covariance calculator for the pair $(X, Y=X^2)$, where the correlation is 0. Data points: X = (1,2,3,4,5,-1,-2,-3,-4,-5) and Y = (1,4,9,16,25,1,4,9,16,25).
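The calculator's result can be reproduced directly; this is just the population-covariance formula applied to the ten points above:

```python
# Recomputing the calculator result by hand for the symmetric data set.
xs = [1, 2, 3, 4, 5, -1, -2, -3, -4, -5]
ys = [x * x for x in xs]  # Y = X^2

n = len(xs)
mean_x = sum(xs) / n          # 0.0
mean_y = sum(ys) / n          # 11.0
cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
print(cov)  # 0.0
```

The positive and negative $x$ values pair off, so every product $x(y - \bar y)$ cancels against its mirror image.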


Last edited:

#### Stephen Tashi

AFAIK uncorrelated, implying Cov(X,Y)=0, means there is no "clear pattern of change of Y when X changes".
That is not a good intuition. Suppose (X,Y) jointly take on the following values, each with probability 1/5: (-2,-2), (-1,-1), (0,0), (1,1), (2, -3).

$\mu_X = 0$
$\mu_Y = -1$
$E(X,Y) = 0$
$COV(X,Y) = 0$

In this case, knowing $X$ completely determines $Y$.
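These numbers can be checked directly (a small sketch of the computation, with $E(XY)$ written out as the probability-weighted sum):

```python
# Direct check of the five-point example, each pair with probability 1/5.
pairs = [(-2, -2), (-1, -1), (0, 0), (1, 1), (2, -3)]
n = len(pairs)

mu_x = sum(x for x, _ in pairs) / n      # 0.0
mu_y = sum(y for _, y in pairs) / n      # -1.0
e_xy = sum(x * y for x, y in pairs) / n  # (4 + 1 + 0 + 1 - 6) / 5 = 0.0
cov = e_xy - mu_x * mu_y
print(mu_x, mu_y, cov)  # 0.0 -1.0 0.0
```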

In the above example, $X$ and $Y$ are uncorrelated but not independent: zero correlation does not rule out a deterministic relationship (a "clear pattern") relating two random variables. So "uncorrelated" is strictly weaker than "independent" -- independent variables with finite variances are always uncorrelated, but not conversely, as this example shows.

A better intuition about $COV(X,Y)$ is that it has to do with approximating the relationship between $X$ and $Y$ by a linear equation.

3) In what sense is correlation a measure of linear dependence? I don't see where/how this follows from the formulas.
Look at the formulas for doing linear regression. They indicate that correlation has something to do with least squares linear approximation.

2) Is there a way of "going backwards" and deciding which joints/marginals would create uncorrelated variables,
Going backwards from what starting point? It's an interesting line of thought, but the task "List all bivariate distributions whose variables are uncorrelated" is too general. More restrictive questions would be better - questions about particular families of distributions, or questions about how to take two bivariate distributions and use them to form a third bivariate distribution whose variables are uncorrelated.

#### WWGD

Gold Member
That is not a good intuition. Suppose (X,Y) jointly take on the following values, each with probability 1/5: (-2,-2), (-1,-1), (0,0), (1,1), (2, -3).

$\mu_X = 0$
$\mu_Y = -1$
$E(X,Y) = 0$
$COV(X,Y) = 0$

In this case, knowing $X$ completely determines $Y$.

In the above example, $X$ and $Y$ are uncorrelated but not independent: zero correlation does not rule out a deterministic relationship (a "clear pattern") relating two random variables. So "uncorrelated" is strictly weaker than "independent" -- independent variables with finite variances are always uncorrelated, but not conversely, as this example shows.

A better intuition about $COV(X,Y)$ is that it has to do with approximating the relationship between $X$ and $Y$ by a linear equation.

Look at the formulas for doing linear regression. They indicate that correlation has something to do with least squares linear approximation.

Going backwards from what starting point? It's an interesting line of thought, but the task "List all bivariate distributions whose variables are uncorrelated" is too general. More restrictive questions would be better - questions about particular families of distributions, or questions about how to take two bivariate distributions and use them to form a third bivariate distribution whose variables are uncorrelated.
Are you assuming E(X,Y)=E(XY)?
Thanks. Yes, I didn't make myself very clear. I meant that the expression $(X-\mu_X)(Y- \mu_Y)$ is neither "overwhelmingly" positive nor negative, so we can neither say with accuracy that Y increases with X nor that Y decreases with X. In this sense they do not have a clear pattern of changing together -- co-varying. Still trying to pin down the concept more clearly.

#### WWGD

Gold Member
Still, my initial question is: what joint are we assuming for a pair $(X,Y)$ when we say they are uncorrelated? It seems strange when I read these statements without seeing a mention of a joint.

#### Stephen Tashi

Are you assuming E(X,Y)=E(XY)?
Yes. It's a typo. It should be $E(XY)$.

I meant that the expression (X−μX)(Y−μY)
is neither " Overwhlmingly" positive nor negative so we can neither say with accuracy that Y increases with X nor that Y decreases with X. In this sense they do not have a clear pattern of changing together -- co -varying. Still trying to pin down the concept more clearly.
For the purpose of understanding the mathematics, it's best not to make this kind of qualitative interpretation of covariance. In the example of the previous post, a qualitative evaluation might say that $Y$ does tend to increase as $X$ increases. For the purpose of understanding a typical presentation where someone is presenting statistics, that kind of qualitative interpretation is often OK.

#### WWGD

Gold Member
Yes. It's a typo. It should be $E(XY)$.

For the purpose of understanding the mathematics, it's best not to make this kind of qualitative interpretation of covariance. In the example of the previous post, a qualitative evaluation might say that $Y$ does tend to increase as $X$ increases. For the purpose of understanding a typical presentation where someone is presenting statistics, that kind of qualitative interpretation is often OK.
Thank you Stephen. But the regression aspect makes it more confusing to me. We have two major cases: $(X,Y)$ both random variables, and $(X,Y)$ where $X$ is a mathematical variable and $Y$ is random. I guess we speak about correlation only in the first case?

#### Stephen Tashi

Still, my initial question is: what joint are we assuming for a pair $(X,Y)$ when we say they are uncorrelated? It seems strange when I read these statements without seeing a mention of a joint.
The fact that two random variables are uncorrelated does not imply that they have a particular joint distribution. The technique of linear least squares regression is applied to data, not to distributions, but imagine you have lots of data, so that the plot of your data resembles the joint probability density of two random variables. If the random variables are uncorrelated and you did a least squares linear regression between them, the slope of the regression line would be zero. There are many different "clouds" of data points that can produce a zero-slope regression line. For example, the shape of the data cloud does not have to be symmetrical about a horizontal line: many points above a horizontal line could be "canceled out" by a few points far below it. Correlation (in the mathematical sense of "correlation coefficient") is a quantitative concept.
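To illustrate, here is a sketch of the least-squares slope (sample covariance over sample variance) on the ten-point $(X, X^2)$ data set used earlier in the thread; the slope comes out zero even though $Y$ is a deterministic function of $X$:

```python
# Least-squares slope for the (X, X^2) data: uncorrelated => zero slope.
xs = [1, 2, 3, 4, 5, -1, -2, -3, -4, -5]
ys = [x * x for x in xs]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))  # ~ Cov(X, Y)
sxx = sum((x - mean_x) ** 2 for x in xs)                        # ~ Var(X)
slope = sxy / sxx
print(slope)  # 0.0
```

This "cloud" (a parabola) is one of the many shapes whose best horizontal fit has zero slope.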

#### Stephen Tashi

We have two major cases: $(X,Y)$ both random variables, and $(X,Y)$ where $X$ is a mathematical variable and $Y$ is random. I guess we speak about correlation only in the first case?
That's a good point. Yes, we should only speak of mathematical correlation in the case where $X$ and $Y$ are both random variables. However, the properties of random variables are estimated from data, so people say things like "The standard deviation was 25.3" when they mean that 25.3 is a number computed from some data that is used to estimate the standard deviation of a probability distribution. Likewise, we can talk about $COV(X,Y)$ as being a property of a joint probability distribution, or we can say things like $COV(X,Y) = 3.20$ when we are talking about estimates computed from data.

In the scenario for linear least squares regression $y = ax + b$, both $x$ and $y$ are mathematical variables. The data used has the form $(x_i,y_i)$, where $y_i$ is assumed to be a realization of a random variable $Y$ that has the form $Y = ax_i + b + E$, where $E$ is a random variable.

To relate linear least squares regression to a bivariate distribution, you have to imagine taking samples $(x_i,y_i)$ from that distribution and doing a regression on that data. So you wouldn't generate data by picking values of $x_i$ in some systematic manner such as taking an equal number of measurements of $y$ when $x = 1,2,3,...$.

Last edited:

#### StoneTemplePython

Gold Member
3) In what sense is correlation a measure of linear dependence? I don't see where/how this follows from the formulas.
1st: zero-mean random variables form a vector space. 2nd: changing the mean (by addition of a constant) doesn't change the computed covariance. So assume WLOG that you are dealing with zero-mean random variables.

Now, supposing your random variables have finite variance, apply Cauchy-Schwarz to
$E\big[XY\big]$
or look at the 2x2 covariance matrix for $(X,Y)$. This is a Gram matrix...
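Spelled out, the Cauchy-Schwarz step (using $\langle X, Y\rangle := E[XY]$ on zero-mean, finite-variance random variables) gives the familiar bound on the correlation coefficient, which is the precise sense in which correlation measures linear dependence:

```latex
% For zero-mean X, Y with finite variance, <X,Y> := E[XY] is an inner
% product (up to identifying variables that are 0 almost surely), so
\big(E[XY]\big)^2 \le E[X^2]\,E[Y^2] = \sigma_X^2 \sigma_Y^2,
\qquad\text{hence}\qquad
\rho_{XY} = \frac{\operatorname{Cov}(X,Y)}{\sigma_X \sigma_Y}
          = \frac{E[XY]}{\sigma_X \sigma_Y} \in [-1, 1],
% with equality exactly when Y is almost surely a scalar multiple of X,
% i.e., when the relationship is exactly linear.
```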

#### WWGD

Gold Member
1st: zero-mean random variables form a vector space. 2nd: changing the mean (by addition of a constant) doesn't change the computed covariance. So assume WLOG that you are dealing with zero-mean random variables.

Now, supposing your random variables have finite variance, apply Cauchy-Schwarz to
$E\big[XY\big]$
or look at the 2x2 covariance matrix for $(X,Y)$. This is a Gram matrix...
I know there was an approach using quadratic forms, possibly similar to this. So you mean we can obtain the result without knowing the actual joint? So I guess $E(XY)$ is an inner product? Ah, yes, I am remembering the probability subsection of the Cauchy-Schwarz article on Wikipedia.

#### StoneTemplePython

Gold Member
So I guess E(XY) is an inner-product? Ah, yes, I am remembering the probability subsection of the Cauchy-Schwarz section in Wiki.
run with this for a bit...

So you mean we can obtain the result without knowing the actual joint?
This seems like a vague question. One way or another, to directly compute $E\big[XY\big]$ you need a joint distribution.

But depending on what you want out of this, linear algebra is still something to consider -- you could have 2 independent random variables (consider it a random vector $\mathbf x$, zero-mean for convenience, i.e. $E\big[\mathbf x\big] = \mathbf 0$).

The covariance matrix then is diagonal, $E\big[\mathbf{xx}^T \big] = \Lambda$. But you could multiply by an orthogonal matrix $\mathbf U$ to get a random vector $\big(\mathbf {Ux}\big)$ with covariance matrix

$E\big[\mathbf U\mathbf{xx}^T \mathbf U^T \big] = \mathbf U E\big[\mathbf{xx}^T\big] \mathbf U^T = \mathbf U\Lambda\mathbf U^T = \Sigma$
which in general is not diagonal, and hence the random vector $\big(\mathbf {Ux}\big)$ has correlated random variables, though you never had to get into the weeds of the distributions.

Going through these manipulations is most productive and sharpest with the very important special case of a multivariate Gaussian $\mathbf x$, where zero covariance is actually the same thing as independence.
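A small numeric sketch of this mixing argument, using a 45-degree rotation and variances 1 and 4 (both arbitrary choices for illustration):

```python
# Sketch of the mixing argument: start from a diagonal covariance
# (two uncorrelated, zero-mean components) and rotate by an orthogonal U.
import math

theta = math.pi / 4
c, s = math.cos(theta), math.sin(theta)
U = [[c, -s], [s, c]]              # rotation matrix, U U^T = I
Lam = [[1.0, 0.0], [0.0, 4.0]]     # diagonal covariance (variances 1 and 4)

def matmul(A, B):
    # 2x2 matrix product.
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

Ut = [[U[j][i] for j in range(2)] for i in range(2)]  # transpose of U
Sigma = matmul(matmul(U, Lam), Ut)                    # covariance of Ux
print(round(Sigma[0][1], 6))  # -1.5: the rotated components are correlated
```

The off-diagonal entry is $-3\cos\theta\sin\theta = -1.5$ here, nonzero for any rotation that actually mixes the two components.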

#### WWGD

Gold Member
run with this for a bit...

This seems like a vague question. One way or another, to directly compute $E\big[XY\big]$ you need a joint distribution.
I read a comment to the effect that one can show that $E[XY] := \langle X,Y\rangle$, as an inner product or quadratic form, equals its own negative. I am trying to see why/how. But this is an area where I am rusty, so sorry if I am being dense with this.

Last edited:

#### StoneTemplePython

Gold Member
I read a comment to the effect that one can show that $E[XY] := \langle X,Y\rangle$, as an inner product or quadratic form, equals its own negative.
I don't know what this means. Since inner products are (bi)linear you should immediately question comments like this. From what I can tell you're saying

$E\big[XY\big] = E\big[-XY\big] = -E\big[XY\big]$
where the RHS follows by linearity of expectations. But this implies $E\big[XY\big]=0$ which of course isn't true in general.

Your statement also seems to contradict the fact that every $n \times n$ real symmetric positive (semi)definite matrix is a covariance matrix (for a multivariate Gaussian), and every covariance matrix (where 2nd moments exist) is an $n \times n$ real symmetric positive (semi)definite matrix.

#### FactChecker

Gold Member
2018 Award
Still, my initial question is: what joint are we assuming for a pair $(X,Y)$ when we say they are uncorrelated? It seems strange when I read these statements without seeing a mention of a joint.
This is a property that a joint distribution may or may not have. There is no need to specify a particular joint distribution. It is like saying that f(x) = f(-x) defines the property of an even function without specifying any particular function.

#### WWGD

Gold Member
This is a property that a joint distribution may or may not have. There is no need to specify a particular joint distribution. It is like saying that f(x) = f(-x) defines the property of an even function without specifying any particular function.
I am not sure I get your point. Do you mean being uncorrelated depends on the joint? Yes, of course. But I wonder, when I see the claim of uncorrelated, which choice is assumed?

#### FactChecker

Gold Member
2018 Award
There is no need to specify any specific distribution in the definition of "uncorrelated". Of course, when one talks about any particular pair of random variables, X and Y, there is a joint distribution for those variables. That will be the one that applies when one talks about the correlation between X and Y.

#### WWGD

Gold Member
There is no need to specify any specific distribution in the definition of "uncorrelated". Of course, when one talks about any particular pair of random variables, X and Y, there is a joint distribution for those variables. That will be the one that applies when one talks about the correlation between X and Y.
Yes, I understand that, but I am trying to test that, e.g., the pair $(X,X^2)$ is uncorrelated. How would I go about it? Same for points on the unit circle: $(X, \sqrt{1-X^2})$. How would I show it then?

Last edited:

#### FactChecker

Gold Member
2018 Award
Yes, I understand that, but I am trying to test that, e.g. the pair (X,X^2) is uncorrelated. How would I go about it?
Your question is not well defined (unless I have missed something). It is up to you to specify what distributions you are working with. If X is uniformly distributed on the interval [2,3], then X and X^2 are correlated. If it is uniformly distributed on the interval [-1,1], then X and X^2 are uncorrelated.

A simpler example of the two cases is:
X = 2 or 3 with equal probability 1/2 (correlated)
X = -1 or 1 with equal probability 1/2 (uncorrelated)
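Both two-point cases can be checked exactly with the population-covariance formula, $Cov(X,Y) = E(XY) - E(X)E(Y)$:

```python
# Checking both two-point cases exactly, with Y = X^2.
def cov_xy(xs):
    # Population covariance of X and X^2, each value equally likely.
    ys = [x * x for x in xs]
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) / n - \
           (sum(xs) / n) * (sum(ys) / n)

print(cov_xy([2, 3]))   # 1.25  (correlated)
print(cov_xy([-1, 1]))  # 0.0   (uncorrelated)
```

The second case is zero only because the distribution of $X$ is symmetric about 0, which makes $E(X) = E(X^3) = 0$.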

#### WWGD

Gold Member
Your question is not well defined (unless I have missed something). It is up to you to specify what distributions you are working with. If X is uniformly distributed on the interval [2,3], then X and X^2 are correlated. If it is uniformly distributed on the interval [-1,1], then X and X^2 are uncorrelated.

A simpler example of the two cases is:
X = 2 or 3 with equal probability 1/2 (correlated)
X = -1 or 1 with equal probability 1/2 (uncorrelated)
Yes, I understand; this is almost tautological. No pair $(X,Y)$ is "intrinsically" correlated or uncorrelated. But I am _given_ that they are, without any mention of a joint. This means a joint is used _implicitly_, and I am trying to make this assumption _explicit_.
I think we are not understanding each other. I am _given/told_ that the two are uncorrelated. This is stated as a fact, without mentioning any underlying joint. So a joint is assumed; I want to make this assumption explicit.

#### Stephen Tashi

Yes, I understand that, but I am trying to test that, e.g. the pair (X,X^2) is uncorrelated. How would I go about it?
In such a case, I see why defining a joint distribution presents a technical problem. The commonly encountered bivariate density is a function $j(x,y)$ that integrates to 1 over some region (finite or infinite) in 2D space. To define a joint density for a set of points of the form $(x,x^2)$ brings up the problem of defining a function $j(x,y)$ that integrates to 1 over a line or line segment in 2D space. Ordinary 2D Riemann integration gives an answer of zero when we integrate over a line segment in 2D.

I think we can appeal to a more advanced form of integration and solve that technical problem, but we can also sidestep the question of a joint density. To compute the expected value of a function $g(X)$ of a random variable, we only need the density $f(x)$ for $X$: $E(g(X)) = \int g(x) f(x)\, dx$. The question of whether $X$ is correlated with $X^2$ only requires computing $E(X)$, $E(X^2)$ and $E((X)(X^2)) = E(X^3)$. Those are expectations of functions of $X$, so they can be computed using only the 1D density function for $X$.
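Following this 1-D recipe for a uniform $X$ on $[a,b]$, where $E(X^n) = \frac{b^{n+1}-a^{n+1}}{(n+1)(b-a)}$, a small exact-arithmetic sketch recovers the two cases discussed above:

```python
# Cov(X, X^2) from 1-D moments alone, for X uniform on [a, b]:
# E[X^n] = (b^(n+1) - a^(n+1)) / ((n + 1) * (b - a)).
from fractions import Fraction

def cov_x_x2(a, b):
    a, b = Fraction(a), Fraction(b)
    m = lambda n: (b**(n + 1) - a**(n + 1)) / ((n + 1) * (b - a))
    return m(3) - m(1) * m(2)   # E[X^3] - E[X] E[X^2]

print(cov_x_x2(-1, 1))  # 0     (uncorrelated)
print(cov_x_x2(2, 3))   # 5/12  (correlated)
```

No 2-D joint density over the parabola is ever needed; the marginal of $X$ determines everything.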

It would be an interesting exercise in abstract mathematics to say the correct words for defining a joint density for $(X,X^2)$ in 2D, and to use that definition to show that computation using the joint density is equivalent to taking the 1D view of things. However, I don't know if that interests you - or whether I could do it.

#### WWGD

Gold Member
Essentially, I am trying to solve for $f_{XY}$:

$E[XY]-\mu_X\mu_Y = 0$, i.e., $\int xy\, f_{XY}(x,y)\,dx\,dy - \int x f_X(x)\,dx \int y f_Y(y)\,dy = 0.$

Last edited:

"Joint Used to Show lack of Correlation?"
