Can random weights create a correlation between two sequences?

  • Thread starter: junglebeast
  • Tags: Correlation
junglebeast
I can generate two random sequences X and Y with no correlation. For example,

X = rand(0,1)
Y = 100 + rand(0,1)

Now, if I plot Y as a function of X, and scale it to a square, I get a random distribution of points showing no correlation (as expected):

http://img197.imageshack.us/img197/8318/corr1.gif

Even though X and Y are not linearly related, I can still find the least squares relationship by solving for [m|b] in this matrix equation (where ~ means as close to equal as possible, in the least squares sense),

[X |1] [m|b]^T ~ [Y ]

Basically, that's a matrix equation of the form

A B ~ Y

where A has 2 columns; the first column is the elements of X, the second column is all 1's. B is just the slope and intercept of the line, and Y is the column vector of the components of Y.

Now, having solved for B, I can just multiply A B = Z, which is essentially the best possible "reconstruction" of Y as a linear combination of X and 1. We don't expect this to be a very good reconstruction because they are not linearly related...and if we plot Y as a function of Z, we still get basically a random noise image (like above).
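
For anyone who wants to reproduce this step, here is a minimal sketch of the unweighted fit. It assumes NumPy, and the seed, the sample size n, and the variable names are my own choices rather than anything from the setup above. It builds A, solves A B ~ Y, forms Z = A B, and reports the correlations.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
n = 5000                        # sample size is an assumption

X = rng.uniform(0.0, 1.0, n)            # X = rand(0,1)
Y = 100.0 + rng.uniform(0.0, 1.0, n)    # Y = 100 + rand(0,1)

# A has the elements of X in its first column and all 1's in the second.
A = np.column_stack([X, np.ones(n)])

# Least-squares solution B = [m, b] of A B ~ Y.
B, *_ = np.linalg.lstsq(A, Y, rcond=None)
m, b = B

# Z is the best reconstruction of Y as a linear combination of X and 1.
Z = A @ B

r_xy = np.corrcoef(X, Y)[0, 1]
r_zy = np.corrcoef(Z, Y)[0, 1]
print(f"m = {m:.4f}, b = {b:.4f}")
print(f"corr(X, Y) = {r_xy:.4f}, corr(Z, Y) = {r_zy:.4f}")
```

Since Z is just a linear function of X, the correlation between Z and Y has the same magnitude as the correlation between X and Y, so both should come out close to zero.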

Now for the kicker... say we want to do a weighted least squares estimate instead. We can form a matrix W that is diagonal, and has the weights for each linear equation on each row. This gives the matrix equation,

W A B ~ W Y

Note that if you wanted to solve for B directly, you can rearrange this through a few simple steps (W is diagonal, so W^T W = W^2)...

W A B ~ W Y
A^T W^2 A B = A^T W^2 Y
B = ( A^T W^2 A )^{-1} A^T W^2 Y

Note: I only mention that because it puts it into the same form as the definition for weighted least squares on Wikipedia, http://en.wikipedia.org/wiki/Least_squares (for anyone who's confused by my notation), with W^2 playing the role of Wikipedia's weight matrix, since scaling each equation by W(i,i) weights its squared residual by W(i,i)^2.
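
A quick numerical check of the closed form, as a sketch assuming NumPy (the seed, n, and variable names are my own). It solves the row-scaled system W A B ~ W Y by least squares and compares the result with the normal equations. Because each row is multiplied by W(i,i), the squared residuals end up weighted by W(i,i)^2, so the matching closed form uses W^2.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed
n = 200                         # sample size is an assumption

X = rng.uniform(0.0, 1.0, n)
Y = 100.0 + rng.uniform(0.0, 1.0, n)
A = np.column_stack([X, np.ones(n)])

w = rng.uniform(0.0, 100.0, n)  # diagonal entries of W
WA = w[:, None] * A             # same result as np.diag(w) @ A
WY = w * Y

# Route 1: least squares on the row-scaled system W A B ~ W Y.
B_lstsq, *_ = np.linalg.lstsq(WA, WY, rcond=None)

# Route 2: normal equations of that system, A^T W^2 A B = A^T W^2 Y.
w2 = w ** 2
B_normal = np.linalg.solve(A.T @ (w2[:, None] * A), A.T @ (w2 * Y))

print("lstsq on scaled rows:", B_lstsq)
print("normal equations    :", B_normal)
```

The two routes should agree to within floating-point error.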

Moving on...let's just choose a random set of weights for the example, i.e.,

W(i,i) = rand(0,100)

Now, if you plot (W A B) as a function of (W Y), we get this picture...

http://img197.imageshack.us/img197/2379/corr2.gif

Whoa! All of a sudden it looks like there is a strong correlation, but that can't be...we already know there is no correlation by the design of the problem, and all we did was choose random weights.

What's going on?
 
By multiplying each row of [X Y] with a weight, you have introduced a correlation between X and Y. Nice example.
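
A rough sketch of that effect, assuming NumPy (the seed, n, and variable names are illustrative choices, not part of the original example). Y is always close to 100 and the fitted line is nearly flat at about 100.5, so both W Y and W A B are roughly 100 times the weight on each row. The two plotted quantities therefore both track the weights, not any relationship between X and Y.

```python
import numpy as np

rng = np.random.default_rng(2)  # arbitrary seed
n = 5000                        # sample size is an assumption

X = rng.uniform(0.0, 1.0, n)
Y = 100.0 + rng.uniform(0.0, 1.0, n)
w = rng.uniform(0.0, 100.0, n)  # W(i,i) = rand(0,100)

A = np.column_stack([X, np.ones(n)])

# Weighted fit: least squares on the row-scaled system W A B ~ W Y.
B, *_ = np.linalg.lstsq(w[:, None] * A, w * Y, rcond=None)

WZ = w * (A @ B)  # the plotted quantity W A B
WY = w * Y        # the plotted quantity W Y

r_raw = np.corrcoef(X, Y)[0, 1]           # original sequences
r_rows = np.corrcoef(w * X, w * Y)[0, 1]  # rows of [X Y] scaled by w
r_plot = np.corrcoef(WZ, WY)[0, 1]        # what the second picture shows
r_weights = np.corrcoef(w, WY)[0, 1]      # W Y against the weights alone

print(f"corr(X, Y)       = {r_raw:.4f}")
print(f"corr(wX, wY)     = {r_rows:.4f}")
print(f"corr(W A B, W Y) = {r_plot:.4f}")
print(f"corr(w, W Y)     = {r_weights:.4f}")
```

The last line is the telling one: W Y is almost perfectly correlated with the weights themselves, so anything else that scales with the same weights will look correlated with it.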
 
EnumaElish said:
By multiplying each row of [X Y] with a weight, you have introduced a correlation between X and Y. Nice example.

Good point...I did not think of it from that perspective. Silly me
 