junglebeast
Jun8-09, 09:59 PM
I can generate two random sequences X[i] and Y[i] with no correlation. For example,
X[i] = rand(0,1)
Y[i] = 100 + rand(0,1)
Now, if I plot Y as a function of X, and scale it to a square, I get a random distribution of points showing no correlation (as expected):
http://img197.imageshack.us/img197/8318/corr1.gif
Even though X and Y are not linearly related, I can still find the least squares relationship by solving for [m|b] in this matrix equation (where ~ means as close to equal as possible, in the least squares sense),
[X[i] |1] [m|b]^T ~ [Y[i] ]
Basically, that's a matrix equation of the form
A B ~ Y
where A has 2 columns; the first column is the elements of X, the second column is all 1's. B is just the slope and intercept of the line, and Y is the column vector of the components of Y.
Now, having solved for B, I can compute Z = A B, which is the best possible "reconstruction" of Y (in the least squares sense) as a linear combination of X and 1. We don't expect this to be a very good reconstruction because X and Y are not linearly related, and if we plot Y as a function of Z, we still get basically a random noise image (like above).
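For anyone who wants to try the unweighted step, here is a minimal sketch in plain Python. The sample count n, the seed, and the helper names fit_line and pearson are my own assumptions for illustration; the post does not specify them. It fits Y ~ m X + b, builds Z = A B, and prints the correlation between Z and Y.

```python
import random

random.seed(0)   # fixed seed for repeatability (an assumption, not from the post)
n = 1000         # number of points (assumed; the post gives no count)

X = [random.uniform(0, 1) for _ in range(n)]
Y = [100 + random.uniform(0, 1) for _ in range(n)]


def fit_line(xs, ys):
    """Return (m, b) minimizing sum((y - (m*x + b))**2), i.e. A B ~ Y."""
    k = len(xs)
    mean_x = sum(xs) / k
    mean_y = sum(ys) / k
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, mean_y - m * mean_x


def pearson(us, vs):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    k = len(us)
    mean_u = sum(us) / k
    mean_v = sum(vs) / k
    suu = sum((u - mean_u) ** 2 for u in us)
    svv = sum((v - mean_v) ** 2 for v in vs)
    suv = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))
    return suv / (suu * svv) ** 0.5


m, b = fit_line(X, Y)
Z = [m * x + b for x in X]   # Z = A B, the reconstruction of Y

print("m =", m, " b =", b)
print("corr(Z, Y) =", pearson(Z, Y))
```

With independent X and Y, the slope should come out near 0 and the intercept near the mean of Y, so Z carries almost no information about Y.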
Now for the kicker... say we want to do a weighted least squares estimate instead. We can form a diagonal matrix W whose entry W(i,i) is the weight for the i-th linear equation (row i of A and Y). This gives the matrix equation,
W A B ~ W Y
Note that if you want to solve for B directly, you can form the normal equations by multiplying both sides by (W A)^T. Since W is diagonal, W^T W = W^2, so
W A B ~ W Y
A^T W^2 A B = A^T W^2 Y
B = ( A^T W^2 A )^{-1} A^T W^2 Y
Note: I only mention that because it puts it into the same form as the definition for weighted least squares on Wikipedia, http://en.wikipedia.org/wiki/Least_squares, where the weight matrix there corresponds to W^2 here (for anyone who's confused by my notation).
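One way to check the closed form is to compare it numerically with a direct least squares solve of the row-scaled system. The sketch below is plain Python with hypothetical helper names (solve_two_columns, weighted_fit) and an assumed n and seed. It rests on the observation that minimizing |W A B - W Y|^2 with diagonal W is the same as minimizing the sum of W(i,i)^2 times the squared residual, so the weights entering the Wikipedia-style formula are the squares of the row scales.

```python
import random

random.seed(1)   # fixed seed for repeatability (assumed)
n = 500          # number of points (assumed)

X = [random.uniform(0, 1) for _ in range(n)]
Y = [100 + random.uniform(0, 1) for _ in range(n)]
w = [random.uniform(0, 100) for _ in range(n)]   # diagonal of W


def solve_two_columns(c1, c2, rhs):
    """Least-squares (m, b) for m*c1 + b*c2 ~ rhs via the 2x2 normal equations."""
    s11 = sum(p * p for p in c1)
    s12 = sum(p * q for p, q in zip(c1, c2))
    s22 = sum(q * q for q in c2)
    t1 = sum(p * r for p, r in zip(c1, rhs))
    t2 = sum(q * r for q, r in zip(c2, rhs))
    det = s11 * s22 - s12 * s12
    return (s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det


def weighted_fit(xs, ys, omegas):
    """(A^T O A)^{-1} A^T O Y for A = [x | 1] and O = diag(omegas)."""
    s = sum(omegas)
    mean_x = sum(o * x for o, x in zip(omegas, xs)) / s
    mean_y = sum(o * y for o, y in zip(omegas, ys)) / s
    sxx = sum(o * (x - mean_x) ** 2 for o, x in zip(omegas, xs))
    sxy = sum(o * (x - mean_x) * (y - mean_y)
              for o, x, y in zip(omegas, xs, ys))
    m = sxy / sxx
    return m, mean_y - m * mean_x


# Row-scaled system W A B ~ W Y: columns of W A are (w*x, w), right side is w*y.
scaled = solve_two_columns([wi * x for wi, x in zip(w, X)], w,
                           [wi * y for wi, y in zip(w, Y)])
with_w_squared = weighted_fit(X, Y, [wi * wi for wi in w])
with_w = weighted_fit(X, Y, w)

print("row-scaled solve :", scaled)
print("weights W^2      :", with_w_squared)
print("weights W        :", with_w)
```

The first two results should agree to rounding error, while the third (weights W rather than W^2) is a slightly different fit.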
Moving on... let's just choose a random set of weights for the example, i.e.,
W(i,i) = rand(0,100)
Now, if you plot (W A B) as a function of (W Y), you get this picture...
http://img197.imageshack.us/img197/2379/corr2.gif
Whoa! All of a sudden it looks like there is a strong correlation, but that can't be... We already know there is no correlation by the design of the problem, and all we did was choose random weights.
What's going on?
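In case anyone wants to reproduce the two pictures as numbers rather than plots, here is a rough sketch in plain Python. The sample count, seed, and helper names (pearson, weighted_fit) are assumptions of mine, and B is taken as the least squares solution of W A B ~ W Y, which amounts to weighting each squared residual by W(i,i)^2.

```python
import random

random.seed(2)   # fixed seed for repeatability (assumed)
n = 1000         # number of points (assumed)

X = [random.uniform(0, 1) for _ in range(n)]
Y = [100 + random.uniform(0, 1) for _ in range(n)]
w = [random.uniform(0, 100) for _ in range(n)]   # W(i,i) = rand(0,100)


def pearson(us, vs):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    k = len(us)
    mean_u = sum(us) / k
    mean_v = sum(vs) / k
    suu = sum((u - mean_u) ** 2 for u in us)
    svv = sum((v - mean_v) ** 2 for v in vs)
    suv = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))
    return suv / (suu * svv) ** 0.5


def weighted_fit(xs, ys, omegas):
    """(m, b) minimizing sum(omega * (y - (m*x + b))**2)."""
    s = sum(omegas)
    mean_x = sum(o * x for o, x in zip(omegas, xs)) / s
    mean_y = sum(o * y for o, y in zip(omegas, ys)) / s
    sxx = sum(o * (x - mean_x) ** 2 for o, x in zip(omegas, xs))
    sxy = sum(o * (x - mean_x) * (y - mean_y)
              for o, x, y in zip(omegas, xs, ys))
    m = sxy / sxx
    return m, mean_y - m * mean_x


m, b = weighted_fit(X, Y, [wi * wi for wi in w])   # B for W A B ~ W Y
Z = [m * x + b for x in X]                         # A B
WZ = [wi * z for wi, z in zip(w, Z)]               # W A B
WY = [wi * y for wi, y in zip(w, Y)]               # W Y

print("corr(A B, Y)     =", pearson(Z, Y))
print("corr(W A B, W Y) =", pearson(WZ, WY))
```

Printing the unweighted correlation next to the weighted one makes it easier to see which step introduces the apparent relationship.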