Can random weights create a correlation between two sequences?

  • Thread starter junglebeast
  • #1
junglebeast
I can generate two random sequences X and Y with no correlation. For example,

X = rand(0,1)
Y = 100 + rand(0,1)

Now, if I plot Y as a function of X, and scale it to a square, I get a random distribution of points showing no correlation (as expected):

http://img197.imageshack.us/img197/8318/corr1.gif
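
As a rough numerical check of the setup above, here is a minimal NumPy sketch. It assumes rand(0,1) means independent uniform draws on [0, 1]; the sample size `n` and the seed are arbitrary choices, not taken from the post.

```python
import numpy as np

rng = np.random.default_rng(0)   # seed is arbitrary (assumption)
n = 1000                         # sample size is an assumption

# Two independent sequences, as in the post.
X = rng.uniform(0.0, 1.0, n)
Y = 100.0 + rng.uniform(0.0, 1.0, n)

# Sample Pearson correlation between X and Y; it should be close to zero.
r = np.corrcoef(X, Y)[0, 1]
print(f"sample correlation of X and Y: {r:.3f}")
```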

Even though X and Y are not linearly related, I can still find the least squares relationship by solving for [m|b] in this matrix equation (where ~ means as close to equal as possible, in the least squares sense),

[X |1] [m|b]^T ~ [Y ]

Basically, that's a matrix equation of the form

A B ~ Y

where A has 2 columns; the first column is the elements of X, the second column is all 1's. B is just the slope and intercept of the line, and Y is the column vector of the components of Y.

Now, having solved for B, I can just multiply A B = Z, which is essentially the best possible "reconstruction" of Y as a linear combination of X and 1. We don't expect this to be a very good reconstruction because they are not linearly related...and if we plot Y as a function of Z, we still get basically a random noise image (like above).
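
The unweighted fit can be sketched as follows, using `np.linalg.lstsq` to solve A B ~ Y. The sample size and seed are assumptions; the names `A`, `B`, `Z` follow the notation above.

```python
import numpy as np

rng = np.random.default_rng(1)   # seed is arbitrary (assumption)
n = 1000                         # sample size is an assumption
X = rng.uniform(0.0, 1.0, n)
Y = 100.0 + rng.uniform(0.0, 1.0, n)

# A has two columns: the elements of X, and all 1's.
A = np.column_stack([X, np.ones(n)])

# Least-squares solution of A B ~ Y; B holds slope m and intercept b.
B, *_ = np.linalg.lstsq(A, Y, rcond=None)
m, b = B

# Z is the best linear "reconstruction" of Y from X and 1.
Z = A @ B
r_zy = np.corrcoef(Z, Y)[0, 1]
print(f"m = {m:.3f}, b = {b:.3f}, corr(Z, Y) = {r_zy:.3f}")
```

Since Y does not depend on X, the slope should come out near 0 and the intercept near the mean of Y (about 100.5), so Z carries almost no information about Y.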

Now for the kicker... say we want to do a weighted least squares estimate instead. We can form a diagonal matrix W whose i-th diagonal entry is the weight for the i-th linear equation (the i-th row of A and Y). This gives the matrix equation,

W A B ~ W Y

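Note that if you wanted to solve for B directly, you can write down the normal equations for the row-scaled system by multiplying both sides by (W A)^T...

W A B ~ W Y
A^T W^T W A B = A^T W^T W Y
B = ( A^T W^2 A )^{-1} A^T W^2 Y

(using W^T W = W^2, since W is diagonal).

Note: I only mention that because it puts it into the same form as the definition for weighted least squares on Wikipedia, http://en.wikipedia.org/wiki/Least_squares (for anyone who's confused by my notation). The weight matrix in that definition corresponds to W^2 here, because W multiplies the rows before the residuals are squared.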
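
To check the algebra numerically, the sketch below solves the row-scaled system W A B ~ W Y in two ways: by ordinary least squares on the scaled rows, and by the normal equations. Scaling the rows by W puts W^T W = W^2 in the middle of the normal equations, so that is what the second route uses. The sample size, seed, and variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)   # seed is arbitrary (assumption)
n = 1000                         # sample size is an assumption
X = rng.uniform(0.0, 1.0, n)
Y = 100.0 + rng.uniform(0.0, 1.0, n)
w = rng.uniform(0.0, 100.0, n)   # diagonal entries of W

A = np.column_stack([X, np.ones(n)])

# Route 1: scale each row by its weight, then ordinary least squares.
B_rows, *_ = np.linalg.lstsq(w[:, None] * A, w * Y, rcond=None)

# Route 2: normal equations of the scaled system,
# (A^T W^2 A) B = A^T W^2 Y, without forming the n-by-n matrix W.
w2 = w ** 2
lhs = A.T @ (w2[:, None] * A)
rhs = A.T @ (w2 * Y)
B_normal = np.linalg.solve(lhs, rhs)

print("row-scaled lstsq :", B_rows)
print("normal equations :", B_normal)
```

The two routes should agree to within floating-point error.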

Moving on...let's just choose a random set of weights for the example, i.e.,

W(i,i) = rand(0,100)

Now, if you plot (W A B) as a function of (W Y), we get this picture...

http://img197.imageshack.us/img197/2379/corr2.gif
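
The plot can be reproduced numerically with a sketch like the one below, which compares the correlation of the weighted quantities (W A B versus W Y) with the correlation of the same fit once the weights are left out (A B versus Y). The sample size, seed, and variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)   # seed is arbitrary (assumption)
n = 1000                         # sample size is an assumption
X = rng.uniform(0.0, 1.0, n)
Y = 100.0 + rng.uniform(0.0, 1.0, n)
w = rng.uniform(0.0, 100.0, n)   # W(i,i) = rand(0,100)

A = np.column_stack([X, np.ones(n)])
WA = w[:, None] * A              # same as W @ A for diagonal W
WY = w * Y                       # same as W @ Y

# Weighted least-squares fit: W A B ~ W Y.
B, *_ = np.linalg.lstsq(WA, WY, rcond=None)

# Correlation of the plotted (weighted) quantities.
r_weighted = np.corrcoef(WA @ B, WY)[0, 1]

# Correlation of the same fit with the weights removed.
r_unweighted = np.corrcoef(A @ B, Y)[0, 1]

print(f"corr(W A B, W Y) = {r_weighted:.4f}")
print(f"corr(A B, Y)     = {r_unweighted:.4f}")
```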

Whoa! All of a sudden it looks like there is a strong correlation, but that can't be...we already know there is no correlation by the design of the problem, and all we did was choose random weights.

What's going on?
 
  • #2
By multiplying each row of [X Y] with a weight, you have introduced a correlation between X and Y. Nice example.
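
A minimal sketch of this effect, under the same assumed setup (uniform X, Y, and weights; arbitrary seed and sample size): multiplying two independent columns by a shared random factor `w` makes the products correlated, because both now vary with `w`. In the plot from post #1 the effect is extreme, since Y is always close to 100, so W Y and W A B are both roughly 100 times the weight.

```python
import numpy as np

rng = np.random.default_rng(4)   # seed is arbitrary (assumption)
n = 1000                         # sample size is an assumption
X = rng.uniform(0.0, 1.0, n)
Y = 100.0 + rng.uniform(0.0, 1.0, n)
w = rng.uniform(0.0, 100.0, n)   # one shared weight per row

# Before weighting: X and Y are independent.
r_raw = np.corrcoef(X, Y)[0, 1]

# After weighting: both columns carry the common factor w.
r_weighted = np.corrcoef(w * X, w * Y)[0, 1]

# W Y is almost a pure multiple of w, because Y is about 100.5.
r_wy_w = np.corrcoef(w * Y, w)[0, 1]

print(f"corr(X, Y)     = {r_raw:.3f}")
print(f"corr(w*X, w*Y) = {r_weighted:.3f}")
print(f"corr(w*Y, w)   = {r_wy_w:.5f}")
```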
 
  • #3
EnumaElish said:
By multiplying each row of [X Y] with a weight, you have introduced a correlation between X and Y. Nice example.

Good point...I did not think of it from that perspective. Silly me.
 

1. What is correlation from nothingness?

Correlation from nothingness refers to a statistical phenomenon where a correlation between two variables appears to exist, but in reality, there is no true relationship between the two variables.

2. How does correlation from nothingness occur?

Correlation from nothingness can occur due to chance or coincidence. It can also be the result of confounding variables or biased data analysis.

3. What are the consequences of relying on correlation from nothingness?

Relying on correlation from nothingness can lead to false conclusions and incorrect predictions. It can also result in wasted time, resources, and efforts in pursuing relationships that do not truly exist.

4. How can one detect and avoid correlation from nothingness?

To detect and avoid correlation from nothingness, it is essential to carefully analyze and interpret data, consider alternative explanations, and use proper statistical methods. Additionally, replication of findings and conducting controlled experiments can help prevent correlation from nothingness.

5. Are there any situations where correlation from nothingness can be useful?

In some cases, correlation from nothingness can be useful as a starting point for further research and exploration. It can also highlight potential correlations that may warrant further investigation and can help identify confounding variables that need to be controlled for in future studies.
