Why is correlation coefficient -1 to +1

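In summary, the correlation coefficient lies between -1 and +1 as a consequence of the Cauchy-Schwarz inequality. Treating the centered variables as vectors, the correlation is the cosine of the angle between them: it equals +1 when the vectors point in the same direction, -1 when they point in opposite directions, and 0 when they are orthogonal.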
  • #1
stunner5000pt

Homework Statement


Why is the correlation coefficient between -1 and +1?


Homework Equations


We know the correlation coefficient is
[tex] \rho = \frac{E[xy]-E[x]E[y]}{\sqrt{\sigma_{x} \sigma_{y}}} [/tex]

OR

[tex] r = \frac{\sum ^n _{i=1}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum ^n _{i=1}(X_i - \bar{X})^2} \sqrt{\sum ^n _{i=1}(Y_i - \bar{Y})^2}} [/tex]

The Attempt at a Solution


Is there a way to prove this analytically? Perhaps we can use the second formula and prove by induction the bottom is greater than the top? Or perhaps equal??

I tried using the expected value formula for the first version with rho, but I couldn't really use it properly. Can you please suggest an approach? Can this even be done analytically? Or would it just have to be explained?

Thanks for your help!
 
  • #2
Your formula for ##\rho## is incorrect; it should be
[tex] \rho \equiv \frac{E[(X-EX)(Y-EY)]}{\sigma_X \sigma_Y} = \frac{E(XY) - EX \, EY}{\sigma_X \sigma_Y}. [/tex]
You should not have ##\sqrt{ \;\; }## in the denominator.
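To see why the extra square root matters, here is a minimal sketch in plain Python comparing the two denominators on a small made-up data set. The function names and the numbers are illustrative assumptions, not part of the original problem, and population (divide-by-n) moments are used throughout.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(x, y):
    # Population covariance: E[XY] - E[X]E[Y]
    return mean([a * b for a, b in zip(x, y)]) - mean(x) * mean(y)

def std(x):
    # Population standard deviation; max() guards against tiny negative rounding
    return math.sqrt(max(covariance(x, x), 0.0))

def rho_correct(x, y):
    # Covariance divided by the product of the standard deviations
    return covariance(x, y) / (std(x) * std(y))

def rho_with_sqrt(x, y):
    # The formula from post #1, with the extra square root in the denominator
    return covariance(x, y) / math.sqrt(std(x) * std(y))

# Hypothetical data, chosen only for illustration
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]
x_scaled = [10.0 * a for a in x]  # same data in different units

print(rho_correct(x, y), rho_with_sqrt(x, y))
print(rho_correct(x_scaled, y), rho_with_sqrt(x_scaled, y))
```

On this data the square-root version exceeds 1 and changes when x is rescaled, while the corrected version stays the same under rescaling. That is the practical reason the denominator has to be ##\sigma_X \sigma_Y## itself.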
 
  • #3
Since
[tex] 0 \leq \text{Var}(aX+bY) = a^2 \sigma_X^2 + 2 a b\, \text{Cov}(X,Y) + b^2 \sigma_Y^2,[/tex]
for all ##a,b##, and since ##\text{Cov}(X,Y) = \sigma_X \sigma_Y \, \rho##, the matrix
[tex] M = \pmatrix{\sigma_X^2 & \sigma_X \sigma_Y \, \rho\\
\sigma_X \sigma_Y \, \rho & \sigma_Y^2} [/tex]
must be positive semidefinite. Apply the standard tests for semidefiniteness.
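
For reference, one way the test can be carried out (a sketch, assuming ##\sigma_X > 0## and ##\sigma_Y > 0##): a symmetric 2×2 matrix is positive semidefinite exactly when its diagonal entries and its determinant are nonnegative. The diagonal entries are variances, so only the determinant carries information:
[tex] \det M = \sigma_X^2 \sigma_Y^2 - \sigma_X^2 \sigma_Y^2 \rho^2 = \sigma_X^2 \sigma_Y^2 (1 - \rho^2) \geq 0 \;\Longrightarrow\; \rho^2 \leq 1 \;\Longrightarrow\; -1 \leq \rho \leq 1. [/tex]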
 
  • #4
stunner5000pt said:
Why is the correlation coefficient between -1 and +1?
This is a direct consequence of the Cauchy-Schwarz inequality, a very important result which shows up in many forms throughout mathematics. There is a proof on the Wiki page in the context of an abstract inner-product space. It's a good exercise to go through the proof as written, and then restate the result by translating the concepts of inner product and norm into the language of probability.
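
For the sample version of the formula, the translation can be sketched numerically: the centered data vectors play the role of the two vectors, their dot product is the numerator of r, and the product of their norms is the denominator, so Cauchy-Schwarz gives |r| ≤ 1. The snippet below is an illustrative check in plain Python, not a proof; the function name, seed, and sample sizes are arbitrary assumptions.

```python
import math
import random

def sample_r(xs, ys):
    """Sample correlation r as a normalized dot product of the centered data vectors."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    u = [x - x_bar for x in xs]  # centered X
    v = [y - y_bar for y in ys]  # centered Y
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

rng = random.Random(0)  # fixed seed, chosen arbitrarily
trials = []
for _ in range(1000):
    xs = [rng.gauss(0.0, 1.0) for _ in range(10)]
    ys = [rng.gauss(0.0, 1.0) for _ in range(10)]
    trials.append(sample_r(xs, ys))

# Every value should fall inside [-1, 1], up to floating-point rounding
print(min(trials), max(trials))

# Equality in Cauchy-Schwarz: the centered vectors are parallel (exact linear relation)
xs = [1.0, 2.0, 3.0, 4.0]
r_pos = sample_r(xs, [2.0 * x + 1.0 for x in xs])
r_neg = sample_r(xs, [-3.0 * x + 5.0 for x in xs])
print(r_pos, r_neg)
```

The last two cases correspond to the equality condition of Cauchy-Schwarz: r reaches +1 or -1 only when one centered vector is a scalar multiple of the other, that is, when Y is an exact linear function of X.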
 

FAQ: Why is correlation coefficient -1 to +1

1. What is the correlation coefficient and why is it important?

The correlation coefficient is a statistical measure of the strength and direction of the linear relationship between two variables. It is important because it quantifies that relationship on a unit-free scale, which helps when making predictions and drawing conclusions in scientific research.

2. Why is the correlation coefficient always between -1 and +1?

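The correlation coefficient is always between -1 and +1 because it is the covariance divided by the product of the two standard deviations, and the Cauchy-Schwarz inequality guarantees that the absolute value of the covariance never exceeds that product. This normalization also removes the units, which allows us to compare the strength and direction of relationships across different data sets.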

3. What does a positive correlation coefficient mean?

A positive correlation coefficient means that as one variable increases, the other variable also tends to increase. In other words, there is a direct relationship between the two variables.

4. What does a negative correlation coefficient mean?

A negative correlation coefficient means that as one variable increases, the other variable tends to decrease. In other words, there is an inverse relationship between the two variables.

5. How do you interpret the correlation coefficient?

The correlation coefficient can be interpreted as follows:

  • Values close to +1 indicate a strong positive relationship, meaning the variables move in the same direction.
  • Values close to -1 indicate a strong negative relationship, meaning the variables move in opposite directions.
  • Values close to 0 indicate a weak or no linear relationship between the variables.
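
A small sketch of these three cases in plain Python, using made-up data (the function name and the numbers are illustrative assumptions). The last case is a caveat worth keeping in mind: r measures linear association only, so a value of 0 does not rule out a strong nonlinear relationship.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient from sums of centered products."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]

r_up = pearson_r(xs, [3.0 * x + 1.0 for x in xs])  # exact increasing line
r_down = pearson_r(xs, [-x for x in xs])           # exact decreasing line
r_parabola = pearson_r(xs, [x * x for x in xs])    # y determined by x, but not linearly

print(r_up, r_down, r_parabola)
```

Here the two straight-line cases give the extreme values, while the parabola gives zero correlation even though y is completely determined by x.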
