Bound correlation coefficient for three random variables

In summary, given three random variables X, Y, and Z whose pairwise correlations all equal r, the upper bound on r is 1 and the lower bound is -0.5. This can be proven using the determinant of the correlation matrix, which must be non-negative for real random variables. Picturing the variables as vectors, with each correlation the cosine of the angle between two of them, also helps explain both bounds.
Master1022
Homework Statement
Given three random variables ## X##, ##Y##, and ## Z ## such that corr(X, Y) = corr(Y, Z) = corr(Z, X) = r, provide an upper and lower bound on ##r##.
Relevant Equations
Correlation
Hi,

I just found this problem and was wondering how I might go about approaching the solution.

Question:
Given three random variables ## X##, ##Y##, and ## Z ## such that ##\text{corr}(X, Y) = \text{corr}(Y, Z) = \text{corr}(Z, X) = r ##, provide an upper and lower bound on ##r##

Attempt:
I don't quite know how to attempt this rigorously, but I started by thinking about the problem geometrically. I don't know if this is incorrect, but I thought of the variables as 'vectors' and the correlations as the cosine of the angle between them.

Then, let us fix two vectors, ##X## and ##Y##, with the angle between them ##\theta = \arccos(r)##, so that ##r = \cos(\theta)##. Now we need to find the vector ##Z## such that it makes the same angle with both ##X## and ##Y##. After doing some visualisation, I thought that the maximum angle possible between each pair of 'vectors' was 120 degrees (when the vectors are all coplanar), which gives a correlation of ##\cos(120^\circ) = -0.5##. Thus, my final answer would be ##-0.5 \leq r \leq 1##. Is that correct?

I suppose I could imagine this by imagining taking three pencils/pieces of spaghetti and standing them all up in a bunch. The angle between them all is 0 degrees, so correlation = 1. Then I could start increasing the angle between them all (as the pieces become less inclined to the table), so correlation is decreasing between them. The largest angle occurs (I think!) when they are all flat on the table, and the angle between them all must be equal to satisfy the constraints of the problem. Thus, that led me to the 120 degrees. Apologies for the silly explanation, but that is how I thought about the problem.
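
The pencil picture can be checked numerically. The following is a minimal sketch, assuming the three 'pencils' are unit vectors tilted by the same angle away from the vertical and spaced 120 degrees apart around it; the function names are made up for illustration. Under that assumption the common cosine works out to ##1 - 1.5\sin^2(\text{tilt})##, which runs from 1 (upright) down to -0.5 (flat on the table).

```python
import math

def pencil_vectors(tilt_deg):
    """Three unit vectors tilted tilt_deg from the vertical, 120 degrees apart around it."""
    tilt = math.radians(tilt_deg)
    s, c = math.sin(tilt), math.cos(tilt)
    return [
        (s * math.cos(2 * math.pi * k / 3), s * math.sin(2 * math.pi * k / 3), c)
        for k in range(3)
    ]

def dot(u, v):
    """Dot product; for unit vectors this is the cosine of the angle between them."""
    return sum(a * b for a, b in zip(u, v))

def pairwise_cosines(vectors):
    """Cosines for the three pairs (0,1), (0,2), (1,2)."""
    return [dot(vectors[i], vectors[j]) for i in range(3) for j in range(i + 1, 3)]

for tilt_deg in (0, 30, 60, 90):
    cosines = pairwise_cosines(pencil_vectors(tilt_deg))
    print(tilt_deg, [round(x, 3) for x in cosines])
```

All three cosines agree at every tilt, and the flat configuration gives the smallest value this symmetric construction can reach.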

Your guess is correct, and I like your geometric way of thinking about it - very intuitive!

To prove it formally, note that the determinant of the correlation matrix must be non-negative. The correlation matrix has entries of r everywhere except on the main diagonal, where it is 1. The determinant is ##1 - 3r^2+2r^3 = (1-r)^2(1+2r)##, which equals zero at ##r=1## and ##r =-0.5## and is positive between those points. It's negative for ##r<-0.5##, so -0.5 is the lower bound.
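
The determinant claim is easy to check by brute force. This is only a sketch in plain Python; `corr_matrix` and `det3` are helper names invented here, and the sample values of r are arbitrary.

```python
def corr_matrix(r):
    """3x3 matrix with 1 on the diagonal and r everywhere else."""
    return [[1.0 if i == j else r for j in range(3)] for i in range(3)]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

for r in (-0.6, -0.5, 0.0, 0.5, 1.0):
    closed_form = 1 - 3 * r**2 + 2 * r**3
    print(r, round(det3(corr_matrix(r)), 6), round(closed_form, 6))
```

The two printed columns should agree, with a negative value at r = -0.6 and zeros at r = -0.5 and r = 1.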

andrewkirk said:
Your guess is correct, and I like your geometric way of thinking about it - very intuitive!

To prove it formally, note that the determinant of the correlation matrix must be non-negative. The correlation matrix has entries of r everywhere except on the main diagonal, where it is 1. The determinant is ##1 - 3r^2+2r^3##, which equals zero at ##r=1## and ##r =-0.5## and is positive between those points. It's negative for ##r<-0.5##, so -0.5 is the lower bound.
Thanks @andrewkirk ! With the correlation matrix method, why must its determinant be non-negative? I might be missing something obvious... Otherwise, that arithmetic does make sense.

Master1022 said:
Thanks @andrewkirk ! With the correlation matrix method, why must its determinant be non-negative? I might be missing something obvious... Otherwise, that arithmetic does make sense.
It is a theorem of probability theory that a correlation matrix of real random variables must be positive semi-definite, which implies that the determinants of the upper-left 1x1, 2x2 and 3x3 submatrices must all be non-negative. See for instance here. There will be various ways of proving it, most involving linear algebra. One of them: for standardised variables with correlation matrix ##R##, the variance of a linear combination with coefficient vector ##a## is ##a^T R a##, and a variance cannot be negative. It may also relate to the fact that it must be possible to perform a Cholesky decomposition of a correlation matrix.
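
The Cholesky remark can be tried out directly. Below is a rough sketch of a textbook Cholesky factorisation in plain Python, written for this thread rather than taken from any library. It assumes a strictly positive pivot is required (with an arbitrary tolerance `tol`), so the boundary cases r = -0.5 and r = 1, where the matrix is only semi-definite, are reported as failures too.

```python
import math

def corr_matrix(r):
    """3x3 matrix with 1 on the diagonal and r everywhere else."""
    return [[1.0 if i == j else r for j in range(3)] for i in range(3)]

def cholesky(m, tol=1e-12):
    """Return lower-triangular L with L L^T = m, or None if a pivot is not positive."""
    n = len(m)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                pivot = m[i][i] - s
                if pivot <= tol:
                    return None  # not strictly positive definite
                L[i][j] = math.sqrt(pivot)
            else:
                L[i][j] = (m[i][j] - s) / L[j][j]
    return L

for r in (-0.6, -0.49, 0.0, 0.9):
    print(r, "factorises" if cholesky(corr_matrix(r)) else "no valid factorisation")
```

For r = -0.6 the last pivot comes out negative, which is the factorisation's way of saying that no three real random variables can have that correlation matrix.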

The requirement turns out to be equivalent to the more intuitive notion of disallowing impossible correlations (such as random variables A, B and C all having pairwise correlations close to -1).
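
For the equal-correlation case there is also a short direct argument for why pairwise correlations near -1 are impossible. As a sketch, assume ##X##, ##Y## and ##Z## have been standardised to unit variance (this does not change any correlation), so that each covariance equals ##r##. The variance of their sum cannot be negative:

$$\operatorname{Var}(X+Y+Z) = \operatorname{Var}(X)+\operatorname{Var}(Y)+\operatorname{Var}(Z) + 2\left[\operatorname{Cov}(X,Y)+\operatorname{Cov}(Y,Z)+\operatorname{Cov}(Z,X)\right] = 3 + 6r \geq 0$$

which gives ##r \geq -0.5##, the same lower bound as the determinant argument.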


What are the bounds on the correlation coefficient for three random variables?

A correlation coefficient measures the strength and direction of the linear relationship between two variables. It is a value between -1 and 1, where -1 indicates a perfect negative correlation, 0 indicates no linear correlation, and 1 indicates a perfect positive correlation. When three random variables all share the same pairwise correlation r, the range is narrower: r must lie between -0.5 and 1.

How are the bounds calculated?

The bounds follow from requiring the 3x3 correlation matrix, which has 1 on the diagonal and r everywhere else, to be positive semi-definite. Its determinant is 1 - 3r^2 + 2r^3, which is zero at r = 1 and r = -0.5, positive between those values, and negative for r < -0.5.

What does a high common correlation indicate?

A common correlation close to 1 indicates a strong positive linear relationship between every pair of the three variables: as one variable increases, the other two tend to increase as well. A strongly negative common correlation is not possible for three variables, because r cannot fall below -0.5.

Can the correlation coefficient be used to determine causation?

No, the correlation coefficient only measures the strength and direction of a linear relationship between variables. It does not indicate causation, as other factors may be influencing the variables. Determining causation requires further research and analysis.

Can the correlation coefficient be used for non-linear relationships?

The correlation coefficient discussed here only captures linear relationships. If the relationship between the variables is monotonic but non-linear, a rank-based measure such as Spearman's rank correlation coefficient may be more appropriate.
