Can random weights create a correlation between two sequences?

junglebeast · Jun 8, 2009

I can generate two random sequences X and Y with no correlation. For example,

X = rand(0,1)
Y = 100 + rand(0,1)

Now, if I plot Y as a function of X, and scale it to a square, I get a random distribution of points showing no correlation (as expected):

http://img197.imageshack.us/img197/8318/corr1.gif

Even though X and Y are not linearly related, I can still find the least squares relationship by solving for [m|b] in this matrix equation (where ~ means as close to equal as possible, in the least squares sense),

[X |1] [m|b]^T ~ [Y ]

Basically, that's a matrix equation of the form

A B ~ Y

where A has 2 columns; the first column is the elements of X, the second column is all 1's. B is just the slope and intercept of the line, and Y is the column vector of the components of Y.

Now, having solved for B, I can just multiply A B = Z, which is essentially the best possible "reconstruction" of Y as a linear combination of X and 1. We don't expect this to be a very good reconstruction because they are not linearly related...and if we plot Y as a function of Z, we still get basically a random noise image (like above).
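For anyone who wants to try this, here is a minimal sketch of the unweighted fit in plain Python. The sample size, the seed, and the helper names `pearson` and `fit_line` are assumptions made for illustration; they are not from the original experiment. It uses the closed-form solution for a line with intercept, which is what solving A B ~ Y reduces to when A has the two columns described above.

```python
import random

def pearson(u, v):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5

def fit_line(x, y):
    """Ordinary least-squares slope m and intercept b for y ~ m*x + b."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    m = sxy / sxx
    return m, mean_y - m * mean_x

rng = random.Random(0)   # fixed seed, chosen arbitrarily (assumption)
n = 2000                 # sample size is an assumption
X = [rng.random() for _ in range(n)]
Y = [100 + rng.random() for _ in range(n)]

m, b = fit_line(X, Y)
Z = [m * x + b for x in X]   # Z = A B, the reconstruction of Y

r_xy = pearson(X, Y)
r_zy = pearson(Z, Y)
print(f"m = {m:.4f}, b = {b:.4f}")
print(f"corr(X, Y) = {r_xy:.4f}, corr(Z, Y) = {r_zy:.4f}")
```

Since Z is just a straight-line function of X, corr(Z, Y) has the same magnitude as corr(X, Y), so the reconstruction cannot look any more correlated with Y than X itself does.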

Now for the kicker... say we want to do a weighted least squares estimate instead. We can form a diagonal matrix W whose i-th diagonal entry is the weight for the i-th linear equation. This gives the matrix equation,

W A B ~ W Y

Note that if you wanted to solve for B directly, you can rearrange this through a few simple steps. Multiply both sides by (W A)^T = A^T W (W is diagonal, so W^T = W)...

W A B ~ W Y
A^T W^2 A B = A^T W^2 Y
B = ( A^T W^2 A )^{-1} A^T W^2 Y

Note: I only mention that because it puts it into the same form as the definition for weighted least squares on Wikipedia, http://en.wikipedia.org/wiki/Least_squares (for anyone who's confused by my notation), with W^2 here playing the role of the weight matrix there.
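A rough sketch of the weighted solve in plain Python, for illustration only. It assumes the two-column A described earlier, so the normal equations are a 2x2 system that can be solved by Cramer's rule. Scaling row i by w_i multiplies its squared residual by w_i^2, which is why the squared weights appear in the sums. The function name `weighted_fit`, the seed, and the sample size are hypothetical choices.

```python
import random

def weighted_fit(x, y, w):
    """Least-squares solution B = (m, b) of the row-scaled system W A B ~ W Y.

    Row i is scaled by w[i], so its squared residual is scaled by w[i]**2.
    The normal equations are (A^T V A) B = A^T V Y with V = W^2.
    """
    v = [wi * wi for wi in w]
    # Entries of the 2x2 matrix A^T V A and of the 2-vector A^T V Y.
    s1 = sum(v)
    sx = sum(vi * xi for vi, xi in zip(v, x))
    sxx = sum(vi * xi * xi for vi, xi in zip(v, x))
    sy = sum(vi * yi for vi, yi in zip(v, y))
    sxy = sum(vi * xi * yi for vi, xi, yi in zip(v, x, y))
    # Cramer's rule for [[sxx, sx], [sx, s1]] [m, b]^T = [sxy, sy]^T.
    det = sxx * s1 - sx * sx
    m = (sxy * s1 - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return m, b

rng = random.Random(1)   # fixed seed, chosen arbitrarily (assumption)
n = 2000                 # sample size is an assumption
X = [rng.random() for _ in range(n)]
Y = [100 + rng.random() for _ in range(n)]
W = [rng.uniform(0, 100) for _ in range(n)]   # diagonal entries W(i,i)

m, b = weighted_fit(X, Y, W)

# At the solution the weighted residuals are orthogonal to both columns of A.
resid = [y - (m * x + b) for x, y in zip(X, Y)]
g_slope = sum(w * w * r * x for w, r, x in zip(W, resid, X))
g_icpt = sum(w * w * r for w, r in zip(W, resid))
total_v = sum(w * w for w in W)
print(f"m = {m:.4f}, b = {b:.4f}")
```

The two sums `g_slope` and `g_icpt` should come out close to zero relative to `total_v`, which is a quick sanity check that the normal equations are satisfied.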

Moving on...let's just choose a random set of weights for the example, i.e.,

W(i,i) = rand(0,100)

Now, if you plot (W A B) as a function of (W Y), we get this picture...

http://img197.imageshack.us/img197/2379/corr2.gif

Whoa! All of a sudden it looks like there is a strong correlation, but that can't be...we already know there is no correlation by the design of the problem, and all we did was choose random weights.

What's going on?

EnumaElish · Jun 8, 2009

By multiplying each row of [X Y] with a weight, you have introduced a correlation between X and Y. Nice example.
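A small numerical sketch of this point, in plain Python with an arbitrary seed and sample size (both assumptions). It compares the correlation of the raw columns with the correlation after each row is multiplied by a shared random weight. As a stand-in for the fitted values A B, it also scales the constant mean(Y). That substitution is an assumption of this sketch rather than the exact fit, justified only by the fitted line being nearly flat when X and Y are independent.

```python
import random

def pearson(u, v):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5

rng = random.Random(2)   # fixed seed, chosen arbitrarily (assumption)
n = 2000                 # sample size is an assumption
X = [rng.random() for _ in range(n)]
Y = [100 + rng.random() for _ in range(n)]
W = [rng.uniform(0, 100) for _ in range(n)]

# Multiply each row of [X Y] by its weight.
WX = [w * x for w, x in zip(W, X)]
WY = [w * y for w, y in zip(W, Y)]

# Stand-in for W A B: a flat prediction c = mean(Y), scaled by the weights.
c = sum(Y) / n
WC = [w * c for w in W]

r_raw = pearson(X, Y)        # unweighted columns
r_scaled = pearson(WX, WY)   # both columns share the factor w
r_flat = pearson(WC, WY)     # scaled constant versus scaled Y

print(f"corr(X, Y)     = {r_raw:.4f}")
print(f"corr(W X, W Y) = {r_scaled:.4f}")
print(f"corr(W c, W Y) = {r_flat:.4f}")
```

Because Y only varies between 100 and 101 while the weights range over 0 to 100, W Y is roughly 100 times the weight, and anything else multiplied by the same weight lines up with it. The plot of (W A B) against (W Y) is mostly a plot of the weights against themselves.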

junglebeast · Jun 8, 2009

EnumaElish said:

By multiplying each row of [X Y] with a weight, you have introduced a correlation between X and Y. Nice example.

Good point...I did not think of it from that perspective. Silly me.
