Why is the correlation coefficient between -1 and +1?

Homework Help Overview

The discussion revolves around understanding why the correlation coefficient is constrained between -1 and +1. Participants are exploring the mathematical foundations and implications of this concept, particularly in the context of probability and statistics.

Discussion Character

  • Exploratory, Conceptual clarification, Mathematical reasoning

Approaches and Questions Raised

  • Participants are considering analytical proofs for the bounds of the correlation coefficient, questioning whether it can be demonstrated through induction or by using the expected value formula. There is also mention of the Cauchy-Schwarz inequality as a potential foundational concept.

Discussion Status

Multiple approaches are being discussed, with some participants suggesting the use of mathematical inequalities and properties of covariance. There is no explicit consensus yet, but the conversation is actively exploring various mathematical frameworks and interpretations.

Contextual Notes

Some participants have noted potential inaccuracies in the formulas presented, indicating a need for clarification on the definitions and properties of the correlation coefficient.

stunner5000pt

Homework Statement


Why is the correlation coefficient between -1 and +1?


Homework Equations


We know the correlation coefficient is
\rho = \frac{E[xy]-E[x]E[y]}{\sqrt{\sigma_{x} \sigma_{y}}}

OR

r = \frac{\sum ^n _{i=1}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum ^n _{i=1}(X_i - \bar{X})^2} \sqrt{\sum ^n _{i=1}(Y_i - \bar{Y})^2}}

The Attempt at a Solution


Is there a way to prove this analytically? Perhaps we can use the second formula and prove by induction the bottom is greater than the top? Or perhaps equal??

I tried using the expected value formula for the first version with rho, but I couldn't really use it properly. Can you please suggest an approach? Can this even be done analytically? Or would it just have to be explained?

Thanks for your help!
 
Your formula for ##\rho## is incorrect; it should be
\rho \equiv \frac{E[(X-EX)(Y-EY)]}{\sigma_X \sigma_Y} = \frac{E[XY] - EX \, EY}{\sigma_X \sigma_Y}
You should not have ##\sqrt{ \;\; }## in the denominator.
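A quick numerical check can make the point about the denominator concrete. The following is a minimal sketch, not part of the original reply: the helper names `moments`, `rho`, and `rho_with_sqrt` are hypothetical, the data are made up for illustration, and population (1/n) moments are assumed throughout.

```python
import math
import random

def moments(xs, ys):
    """Return (covariance, sigma_x, sigma_y) using population (1/n) moments."""
    n = len(xs)
    ex = sum(xs) / n
    ey = sum(ys) / n
    exy = sum(x * y for x, y in zip(xs, ys)) / n
    cov = exy - ex * ey  # E[XY] - E[X]E[Y]
    sigma_x = math.sqrt(sum((x - ex) ** 2 for x in xs) / n)
    sigma_y = math.sqrt(sum((y - ey) ** 2 for y in ys) / n)
    return cov, sigma_x, sigma_y

def rho(xs, ys):
    """Corrected formula: covariance divided by sigma_x * sigma_y."""
    cov, sx, sy = moments(xs, ys)
    return cov / (sx * sy)

def rho_with_sqrt(xs, ys):
    """Mistaken formula from the first post: divides by sqrt(sigma_x * sigma_y)."""
    cov, sx, sy = moments(xs, ys)
    return cov / math.sqrt(sx * sy)

# Perfectly linear toy data: y = 10 x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [10.0 * x for x in xs]
print(rho(xs, ys))            # close to 1, as expected for a perfect linear relation
print(rho_with_sqrt(xs, ys))  # roughly 3.54, well outside [-1, 1]

# The corrected formula stays within [-1, 1] (up to rounding) on random data.
random.seed(0)
for _ in range(1000):
    a = [random.gauss(0, 1) for _ in range(20)]
    b = [random.gauss(0, 5) for _ in range(20)]
    assert -1 - 1e-12 <= rho(a, b) <= 1 + 1e-12
```

The version with the square root is not dimensionless: rescaling ##Y## changes its value, which is one way to see that it cannot be the correlation coefficient.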
 
Since
0 \leq \text{Var}(aX+bY) = a^2 \sigma_X^2 + 2 a b\, \text{Cov}(X,Y) + b^2 \sigma_Y^2
for all ##a,b##, and since ##\text{Cov}(X,Y) = \sigma_X \sigma_Y \, \rho##, the matrix
M = \begin{pmatrix} \sigma_X^2 & \sigma_X \sigma_Y \, \rho \\ \sigma_X \sigma_Y \, \rho & \sigma_Y^2 \end{pmatrix}
must be positive semidefinite. Apply the standard tests for semidefiniteness.
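One way the semidefiniteness test might be carried out is sketched below. It assumes ##\sigma_X > 0## and ##\sigma_Y > 0## (otherwise ##\rho## is undefined) and uses the standard fact that a symmetric 2×2 matrix is positive semidefinite exactly when its diagonal entries and its determinant are nonnegative.

```latex
% Diagonal entries: \sigma_X^2 \ge 0 and \sigma_Y^2 \ge 0 hold automatically.
% Determinant condition:
\det M = \sigma_X^2 \sigma_Y^2 - (\sigma_X \sigma_Y \, \rho)^2
       = \sigma_X^2 \sigma_Y^2 \, (1 - \rho^2) \;\ge\; 0
% Dividing by \sigma_X^2 \sigma_Y^2 > 0:
1 - \rho^2 \ge 0
\quad\Longrightarrow\quad
-1 \le \rho \le 1
```

Equality ##\rho = \pm 1## corresponds to ##\det M = 0##, that is, to some nonzero ##(a, b)## with ##\text{Var}(aX + bY) = 0##, so ##X## and ##Y## are then linearly related with probability 1.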
 
stunner5000pt said:
Why is the correlation coefficient between -1 and +1?
This is a direct consequence of the Cauchy-Schwarz inequality, a very important result which shows up in many forms throughout mathematics. There is a proof on the Wiki page in the context of an abstract inner-product space. It's a good exercise to go through the proof as written, and then restate the result by translating the concepts of inner product and norm into the language of probability.
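The translation into probability language might look like the following sketch. It assumes finite, nonzero variances; the symbols ##U##, ##V##, ##u##, ##v## and ##\theta## are introduced here for illustration and do not appear in the thread.

```latex
% Inner product on centered random variables: <U, V> = E[UV],
% with U = X - EX and V = Y - EY.
% Cauchy-Schwarz, |<U, V>|^2 <= <U, U> <V, V>, becomes
\bigl( E[UV] \bigr)^2 \le E[U^2] \, E[V^2]
\quad\Longleftrightarrow\quad
\text{Cov}(X,Y)^2 \le \sigma_X^2 \, \sigma_Y^2
\quad\Longleftrightarrow\quad
|\rho| \le 1

% Sample version: take the vectors u_i = X_i - \bar{X}, v_i = Y_i - \bar{Y} in R^n.
r = \frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert} = \cos\theta ,
\qquad \text{so } -1 \le r \le 1
```

Read this way, the sample coefficient ##r## is the cosine of the angle between the two centered data vectors, which also answers the question in the first post about whether the denominator dominates the numerator: it does, with no induction needed.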
 
