Can random weights create a correlation between two sequences?

junglebeast · Jun 8, 2009

I can generate two random sequences X and Y with no correlation. For example,

X = rand(0,1)
Y = 100 + rand(0,1)

Now, if I plot Y as a function of X, and scale it to a square, I get a random distribution of points showing no correlation (as expected):

http://img197.imageshack.us/img197/8318/corr1.gif
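The "no correlation" claim can be checked numerically. This is only a minimal sketch: it assumes rand(0,1) means a uniform draw on [0, 1), and the sample size, seed, and the helper name pearson are choices made here for illustration, not part of the original post.

```python
import random

def pearson(a, b):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / (var_a * var_b) ** 0.5

rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
n = 10_000              # sample size chosen for illustration

x = [rng.random() for _ in range(n)]        # X = rand(0,1), assumed uniform
y = [100 + rng.random() for _ in range(n)]  # Y = 100 + rand(0,1)

r_xy = pearson(x, y)
print(f"corr(X, Y) = {r_xy:+.4f}")  # should be close to 0
```

With independent draws the coefficient should only differ from zero by sampling noise, which shrinks roughly like 1/sqrt(n).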

Even though X and Y are not linearly related, I can still find the least squares relationship by solving for [m|b] in this matrix equation (where ~ means as close to equal as possible, in the least squares sense),

[X |1] [m|b]^T ~ [Y ]

Basically, that's a matrix equation of the form

A B ~ Y

where A has 2 columns; the first column is the elements of X, the second column is all 1's. B is just the slope and intercept of the line, and Y is the column vector of the components of Y.

Now, having solved for B, I can just multiply A B = Z, which is essentially the best possible "reconstruction" of Y as a linear combination of X and 1. We don't expect this to be a very good reconstruction because they are not linearly related...and if we plot Y as a function of Z, we still get basically a random noise image (like above).
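The unweighted fit and its "reconstruction" Z can be sketched as follows. The closed-form slope and intercept used here are the standard solution of A B ~ Y for a two-column A; the function names, seed, and sample size are assumptions made for illustration.

```python
import random

def pearson(a, b):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / (var_a * var_b) ** 0.5

def fit_line(x, y):
    """Ordinary least squares fit y ~ m*x + b; returns (m, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b

rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
n = 10_000
x = [rng.random() for _ in range(n)]
y = [100 + rng.random() for _ in range(n)]

m, b = fit_line(x, y)            # B = [m, b]
z = [m * xi + b for xi in x]     # Z = A B, the reconstruction of Y

r_yz = pearson(y, z)
print(f"m = {m:+.4f}, b = {b:.4f}, corr(Y, Z) = {r_yz:+.4f}")
```

The slope should come out near 0 and the intercept near the mean of Y (about 100.5), so Z is almost constant. Because Z is just a rescaled copy of X, corr(Y, Z) has the same magnitude as corr(X, Y), which is why the plot of Y against Z still looks like noise.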

Now for the kicker... say we want to do a weighted least squares estimate instead. We can form a matrix W that is diagonal, and has the weights for each linear equation on each row. This gives the matrix equation,

W A B ~ W Y

Note that if you want to solve for B directly, you can multiply both sides by (W A)^T and rearrange. Since W is diagonal, W^T W = W^2, so

W A B ~ W Y
A^T W^2 A B = A^T W^2 Y
B = ( A^T W^2 A )^{-1} A^T W^2 Y

Note: I only mention that because it puts it into the same form as the definition for weighted least squares on Wikipedia, http://en.wikipedia.org/wiki/Least_squares (for anyone who's confused by my notation). The weight matrix in that definition corresponds to W^2 here, because scaling row i by W(i,i) scales its squared residual by W(i,i)^2.
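For a straight-line fit the weighted system is only 2x2, so it can be solved in closed form. The sketch below is one possible way to do it, not the poster's code: the function name weighted_fit_line and the small test data are made up for illustration. Note that multiplying row i by the weight w[i] multiplies its squared residual by w[i]**2, so the squared weights are what enter the normal equations.

```python
def weighted_fit_line(x, y, w):
    """Least squares solution of W A B ~ W Y for the line y ~ m*x + b.

    w holds the diagonal of W. Each row is scaled by w[i], so its squared
    residual is scaled by w[i]**2; the sums below therefore use v = w**2.
    Returns (m, b).
    """
    v = [wi * wi for wi in w]
    sv = sum(v)
    sx = sum(vi * xi for vi, xi in zip(v, x))
    sy = sum(vi * yi for vi, yi in zip(v, y))
    sxx = sum(vi * xi * xi for vi, xi in zip(v, x))
    sxy = sum(vi * xi * yi for vi, xi, yi in zip(v, x, y))
    det = sv * sxx - sx * sx          # determinant of the 2x2 normal matrix
    m = (sv * sxy - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return m, b

# Points lying exactly on y = 2x + 1: any positive weights recover the line.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2 * xi + 1 for xi in xs]
m_exact, b_exact = weighted_fit_line(xs, ys, [1.0, 5.0, 0.5, 10.0])

# Non-collinear points: the weights decide which points the line favors.
xs2 = [0.0, 1.0, 2.0]
ys2 = [0.0, 1.0, 0.0]
m_flat, b_flat = weighted_fit_line(xs2, ys2, [1.0, 1.0, 1.0])
m_tilt, b_tilt = weighted_fit_line(xs2, ys2, [1.0, 10.0, 0.1])

print(m_exact, b_exact)
print(m_flat, b_flat)
print(m_tilt, b_tilt)
```

With equal weights the three-point example gives a flat line; putting almost all the weight on the first two points tilts the fit toward the line through them.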

Moving on...let's just choose a random set of weights for the example, i.e.,

W(i,i) = rand(0,100)

Now, if you plot (W A B) as a function of (W Y), we get this picture...

http://img197.imageshack.us/img197/2379/corr2.gif

Whoa! All of a sudden it looks like there is a strong correlation, but that can't be...we already know there is no correlation by the design of the problem, and all we did was choose random weights.

What's going on?

EnumaElish · Jun 8, 2009

By multiplying each row of [X Y] with a weight, you have introduced a correlation between X and Y. Nice example.
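The effect can be reproduced with a short simulation. This is a sketch under the same assumptions as the original setup (uniform draws for rand, weights drawn from rand(0,100)); the function names, seed, and sample size are illustrative choices.

```python
import random

def pearson(a, b):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / (var_a * var_b) ** 0.5

def weighted_fit_line(x, y, w):
    """Least squares solution of W A B ~ W Y for y ~ m*x + b (uses w**2)."""
    v = [wi * wi for wi in w]
    sv = sum(v)
    sx = sum(vi * xi for vi, xi in zip(v, x))
    sy = sum(vi * yi for vi, yi in zip(v, y))
    sxx = sum(vi * xi * xi for vi, xi in zip(v, x))
    sxy = sum(vi * xi * yi for vi, xi, yi in zip(v, x, y))
    det = sv * sxx - sx * sx
    return (sv * sxy - sx * sy) / det, (sxx * sy - sx * sxy) / det

rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
n = 10_000
x = [rng.random() for _ in range(n)]            # X = rand(0,1)
y = [100 + rng.random() for _ in range(n)]      # Y = 100 + rand(0,1)
w = [rng.uniform(0, 100) for _ in range(n)]     # W(i,i) = rand(0,100)

m, b = weighted_fit_line(x, y, w)
z = [m * xi + b for xi in x]                    # A B
wy = [wi * yi for wi, yi in zip(w, y)]          # W Y
wz = [wi * zi for wi, zi in zip(w, z)]          # W A B

r_unweighted = pearson(y, z)    # Y against A B: still noise
r_weighted = pearson(wy, wz)    # W Y against W A B: looks strongly correlated
r_w_wy = pearson(w, wy)         # W Y against the weights themselves

print(f"corr(Y, AB)   = {r_unweighted:+.4f}")
print(f"corr(WY, WAB) = {r_weighted:+.4f}")
print(f"corr(w, WY)   = {r_w_wy:+.4f}")
```

Both Y and A B stay within about 1 of 100.5, so W Y and W A B are each roughly 100.5 times the weight on that row. The two plotted quantities share the weight as a common factor, and that shared factor, not any relationship between X and Y, produces the near-perfect line.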

junglebeast · Jun 8, 2009

EnumaElish said:

By multiplying each row of [X Y] with a weight, you have introduced a correlation between X and Y. Nice example.

Good point...I did not think of it from that perspective. Silly me
